Commit Graph

5026 Commits

Author SHA1 Message Date
Arnd Bergmann 835ea93e9d char/genrtc: remove powerpc support
PowerPC is the last architecture using the GEN_RTC driver on some
machines, but we can migrate them all to using the RTC_DRV_GENERIC
driver instead now.

This moves over the CONFIG_GEN_RTC option from drivers/char into
arch/powerpc/platforms/Kconfig and makes it just select the
replacement driver instead, for the only reason of not breaking
existing defconfig and .config files that users may have.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
2016-06-04 00:23:35 +02:00
Rafael J. Wysocki e788892ba3 cpufreq: governor: Get rid of governor events
The design of the cpufreq governor API is not very straightforward,
as struct cpufreq_governor provides only one callback to be invoked
from different code paths for different purposes.  The purpose it is
invoked for is determined by its second "event" argument, causing it
to act as a "callback multiplexer" of sorts.

Unfortunately, that leads to extra complexity in governors, some of
which implement the ->governor() callback as a switch statement
that simply checks the event argument and invokes a separate function
to handle that specific event.

That extra complexity can be eliminated by replacing the all-purpose
->governor() callback with a family of callbacks to carry out specific
governor operations: initialization and exit, start and stop and policy
limits updates.  That also turns out to reduce the code size too, so
do it.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-06-02 23:24:15 +02:00
Stephen Boyd 3fb9c41286 powerpc/512x: clk: Remove CLK_IS_ROOT
This flag is a no-op now (see commit 47b0eeb3dc "clk: Deprecate
CLK_IS_ROOT", 2016-02-02) so remove it.

Cc: Gerhard Sittig <gsi@denx.de>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
2016-06-01 14:51:41 -07:00
Russell Currey bd000b82e8 powerpc/pseries/eeh: Refactor the configure_bridge RTAS tokens
The RTAS calls "ibm,configure-pe" and "ibm,configure-bridge" perform the
same actions, however the former can skip configuration if unnecessary.
The existing code treats them as different tokens even though only one
will ever be called.  Refactor this by making a single token that is
assigned during init.

Signed-off-by: Russell Currey <ruscur@russell.cc>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-30 13:50:12 +10:00
Russell Currey 871e178e0f powerpc/pseries/eeh: Handle RTAS delay requests in configure_bridge
In the "ibm,configure-pe" and "ibm,configure-bridge" RTAS calls, the
spec states that values of 9900-9905 can be returned, indicating that
software should delay for 10^x (where x is the last digit, i.e. 990x)
milliseconds and attempt the call again. Currently, the kernel doesn't
know about this, and respecting it fixes some PCI failures when the
hypervisor is busy.

The delay is capped at 0.2 seconds.

Cc: <stable@vger.kernel.org> # 3.10+
Signed-off-by: Russell Currey <ruscur@russell.cc>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-30 13:50:04 +10:00
Linus Torvalds c04a588029 powerpc updates for 4.7
Highlights:
  - Support for Power ISA 3.0 (Power9) Radix Tree MMU from Aneesh Kumar K.V
  - Live patching support for ppc64le (also merged via livepatching.git)
 
 Various cleanups & minor fixes from:
  - Aaro Koskinen, Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V,
    Chris Smart, Daniel Axtens, Frederic Barrat, Gavin Shan, Ian Munsie, Lennart
    Sorensen, Madhavan Srinivasan, Mahesh Salgaonkar, Markus Elfring, Michael
    Ellerman, Oliver O'Halloran, Paul Gortmaker, Paul Mackerras, Rashmica Gupta,
    Russell Currey, Suraj Jitindar Singh, Thiago Jung Bauermann, Valentin
    Rothberg, Vipin K Parashar.
 
 General:
  - Update LMB associativity index during DLPAR add/remove from Nathan Fontenot
  - Fix branching to OOL handlers in relocatable kernel from Hari Bathini
  - Add support for userspace Power9 copy/paste from Chris Smart
  - Always use STRICT_MM_TYPECHECKS from Michael Ellerman
  - Add mask of possible MMU features from Michael Ellerman
 
 PCI:
  - Enable pass through of NVLink to guests from Alexey Kardashevskiy
  - Cleanups in preparation for powernv PCI hotplug from Gavin Shan
  - Don't report error in eeh_pe_reset_and_recover() from Gavin Shan
  - Restore initial state in eeh_pe_reset_and_recover() from Gavin Shan
  - Revert "powerpc/eeh: Fix crash in eeh_add_device_early() on Cell" from Guilherme G. Piccoli
  - Remove the dependency on EEH struct in DDW mechanism from Guilherme G. Piccoli
 
 selftests:
  - Test cp_abort during context switch from Chris Smart
  - Add several tests for transactional memory support from Rashmica Gupta
 
 perf:
  - Add support for sampling interrupt register state from Anju T
  - Add support for unwinding perf-stackdump from Chandan Kumar
 
 cxl:
  - Configure the PSL for two CAPI ports on POWER8NVL from Philippe Bergheaud
  - Allow initialization on timebase sync failures from Frederic Barrat
  - Increase timeout for detection of AFU mmio hang from Frederic Barrat
  - Handle num_of_processes larger than can fit in the SPA from Ian Munsie
  - Ensure PSL interrupt is configured for contexts with no AFU IRQs from Ian Munsie
  - Add kernel API to allow a context to operate with relocate disabled from Ian Munsie
  - Check periodically the coherent platform function's state from Christophe Lombard
 
 Freescale:
  - Updates from Scott: "Contains 86xx fixes, minor device tree fixes, an erratum
    workaround, and a kconfig dependency fix."
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXPsGzAAoJEFHr6jzI4aWAVoAP/iKdrDe0eYHlVAE9SqnbsiZs
 lgDxdsC8P3fsmP1G9o/HkKhC82zHl/La8Ztz8dtqa+LkSzbfliWP1ztJsI7GsBFo
 tyCKzWnX9Rwvd3meHu/o/SQ29TNLm/PbPyyRqpj5QPbJ8XCXkAXR7ZZZqjvcMsJW
 /AgIr7Cgf53tl9oZzzl/c7CnNHhMq+NBdA71vhWtUx+T97wfJEGyKW6HhZyHDbEU
 iAki7fu77ZpEqC/Fh9swf0dCGBJ+a132NoMVo0AdV7EQLznUYlQpQEqa+1PyHZOP
 /ArOzf2mDg6m3PfCo1eiB07v8PnVZ3llEUbVAJNg3GUxbE4SHrqq/kwm0iElm3p/
 DvFxerCwdX9vmskJX4wDs+pSZRabXYj9XVMptsgFzA4joWrqqb7mBHqaort88YcY
 YSljEt1bHyXmiJ+dBya40qARsWUkCVN7ZgEzdxckq0KI3w7g2tqpqIbO2lClWT6t
 B3GpqQ4jp34+d1M14FB91fIGK7tMvOhSInE0Mv9+tPvRsepXqiiU/SwdAtRlr3m2
 zs/K+4FYcVjJ3Rmpgc+tI38PbZxHe212I35YN6L1LP+4ZfAtzz0NyKdooTIBtkbO
 19pX4WbBjKq8zK+YutrySncBIrbnI6VjW51vtRhgVKZliPFO/6zKagyU6FbxM+E5
 udQES+t3F/9gvtxgxtDe
 =YvyQ
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:
 "Highlights:
   - Support for Power ISA 3.0 (Power9) Radix Tree MMU from Aneesh Kumar K.V
   - Live patching support for ppc64le (also merged via livepatching.git)

  Various cleanups & minor fixes from:
   - Aaro Koskinen, Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V,
     Chris Smart, Daniel Axtens, Frederic Barrat, Gavin Shan, Ian Munsie,
     Lennart Sorensen, Madhavan Srinivasan, Mahesh Salgaonkar, Markus Elfring,
     Michael Ellerman, Oliver O'Halloran, Paul Gortmaker, Paul Mackerras,
     Rashmica Gupta, Russell Currey, Suraj Jitindar Singh, Thiago Jung
     Bauermann, Valentin Rothberg, Vipin K Parashar.

  General:
   - Update LMB associativity index during DLPAR add/remove from Nathan
     Fontenot
   - Fix branching to OOL handlers in relocatable kernel from Hari Bathini
   - Add support for userspace Power9 copy/paste from Chris Smart
   - Always use STRICT_MM_TYPECHECKS from Michael Ellerman
   - Add mask of possible MMU features from Michael Ellerman

  PCI:
   - Enable pass through of NVLink to guests from Alexey Kardashevskiy
   - Cleanups in preparation for powernv PCI hotplug from Gavin Shan
   - Don't report error in eeh_pe_reset_and_recover() from Gavin Shan
   - Restore initial state in eeh_pe_reset_and_recover() from Gavin Shan
   - Revert "powerpc/eeh: Fix crash in eeh_add_device_early() on Cell"
     from Guilherme G Piccoli
   - Remove the dependency on EEH struct in DDW mechanism from Guilherme
     G Piccoli

  selftests:
   - Test cp_abort during context switch from Chris Smart
   - Add several tests for transactional memory support from Rashmica
     Gupta

  perf:
   - Add support for sampling interrupt register state from Anju T
   - Add support for unwinding perf-stackdump from Chandan Kumar

  cxl:
   - Configure the PSL for two CAPI ports on POWER8NVL from Philippe
     Bergheaud
   - Allow initialization on timebase sync failures from Frederic Barrat
   - Increase timeout for detection of AFU mmio hang from Frederic
     Barrat
   - Handle num_of_processes larger than can fit in the SPA from Ian
     Munsie
   - Ensure PSL interrupt is configured for contexts with no AFU IRQs
     from Ian Munsie
   - Add kernel API to allow a context to operate with relocate disabled
     from Ian Munsie
   - Check periodically the coherent platform function's state from
     Christophe Lombard

  Freescale:
   - Updates from Scott: "Contains 86xx fixes, minor device tree fixes,
     an erratum workaround, and a kconfig dependency fix."

* tag 'powerpc-4.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (192 commits)
  powerpc/86xx: Fix PCI interrupt map definition
  powerpc/86xx: Move pci1 definition to the include file
  powerpc/fsl: Fix build of the dtb embedded kernel images
  powerpc/fsl: Fix rcpm compatible string
  powerpc/fsl: Remove FSL_SOC dependency from FSL_LBC
  powerpc/fsl-pci: Add a workaround for PCI 5 errata
  powerpc/fsl: Fix SPI compatible on t208xrdb and t1040rdb
  powerpc/powernv/npu: Add PE to PHB's list
  powerpc/powernv: Fix insufficient memory allocation
  powerpc/iommu: Remove the dependency on EEH struct in DDW mechanism
  Revert "powerpc/eeh: Fix crash in eeh_add_device_early() on Cell"
  powerpc/eeh: Drop unnecessary label in eeh_pe_change_owner()
  powerpc/eeh: Ignore handlers in eeh_pe_reset_and_recover()
  powerpc/eeh: Restore initial state in eeh_pe_reset_and_recover()
  powerpc/eeh: Don't report error in eeh_pe_reset_and_recover()
  Revert "powerpc/powernv: Exclude root bus in pnv_pci_reset_secondary_bus()"
  powerpc/powernv/npu: Enable NVLink pass through
  powerpc/powernv/npu: Rework TCE Kill handling
  powerpc/powernv/npu: Add set/unset window helpers
  powerpc/powernv/ioda2: Export debug helper pe_level_printk()
  ...
2016-05-20 10:12:41 -07:00
Linus Torvalds 9e17632c0a Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull misc vfs cleanups from Al Viro:
 "Assorted cleanups and fixes all over the place"

* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  coredump: only charge written data against RLIMIT_CORE
  coredump: get rid of coredump_params->written
  ecryptfs_lookup(): try either only encrypted or plaintext name
  ecryptfs: avoid multiple aliases for directories
  bpf: reject invalid names right in ->lookup()
  __d_alloc(): treat NULL name as QSTR("/", 1)
  mtd: switch ubi_open_volume_path() to vfs_stat()
  mtd: switch open_mtd_by_chdev() to use of vfs_stat()
2016-05-18 11:51:59 -07:00
Linus Torvalds 1eccc6e152 This is the bulk of GPIO changes for kernel cycle v4.7:
Core infrastructural changes:
 
 - Support for natively single-ended GPIO driver stages. This
   means that if the hardware has registers to configure open
   drain or open source configuration, we use that rather than
   (as we did before) try to emulate it by switching the line
   to an input to get high impedance. This is also documented
   throughly in Documentation/gpio/driver.txt for those of you
   who did not understand one word of what I just wrote.
 
 - Start to do away with the unnecessarily complex and
   unitelligible ARCH_REQUIRE_GPIOLIB and
   ARCH_WANT_OPTIONAL_GPIOLIB, another evolutional artifact from
   the time when the GPIO subsystem was unmaintained. Archs can
   now just select GPIOLIB and be done with it, cleanups to
   arches will trickle in for the next kernel. Some minor archs
   ACKed the changes immediately so these are included in this
   pull request.
 
 - Advancing the use of the data pointer inside the GPIO device
   for storing driver data by switching the PowerPC, Super-H
   Unicore and a few other subarches or subsystem drivers in
   ALSA SoC, Input, serial, SSB, staging etc to use it.
 
 - The initialization now reads the input/output state of the
   GPIO lines, so that each GPIO descriptor knows - if this
   callback is implemented - whether the line is input or
   output. This also reflects nicely in userspace "lsgpio".
 
 - It is now possible to name GPIO producer names, line names,
   from the device tree. (Platform data has been supported for
   a while.) I bet we will get a similar mechanism for ACPI
   one of those days. This makes is possible to get sensible
   producer names for e.g. GPIO rails in "lsgpio" in userspace.
 
 New drivers:
 
 - New driver for the Loongson1.
 
 - The XLP driver now supports Broadcom Vulcan ARM64.
 
 - The IT87 driver now supports IT8620 and IT8628.
 
 - The PCA953X driver now supports Galileo Gen2.
 
 Driver improvements:
 
 - MCP23S08 was switched to use the gpiolib irqchip helpers and
   now also suppors level-triggered interrupts.
 
 - 74x164 and RCAR now supports the .set_multiple() callback
 
 - AMDPT was converted to use generic GPIO.
 
 - TC3589x, TPS65218, SX150X, F7188X, MENZ127, VX855, WM831X, WM8994
   support the new single ended callback for open drain
   and in some cases open source.
 
 - Implement the .get_direction() callback for a few more drivers
   like PL061, Xgene.
 
 Cleanups:
 
 - Paul Gortmaker combed through the drivers and de-modularized
   those who are not really modules.
 
 - Move the GPIO poweroff DT bindings to the power subdir where
   they belong.
 
 - Rename gpio-generic.c to gpio-mmio.c, which is much more to the
   point. That's what it is handling, nothing more, nothing less.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXOuJ5AAoJEEEQszewGV1zNXsQAII5wtkP69WRJ3goYBKg1dZN
 DkuLqZyVI4hCgRhptzUW10gDLHKKOCVubfetTJHSpyG/dWDJXPCyH6FHF+pW6lMX
 y+em8kAvWctKpaosy4EM7O55/IohW0/fNCTOfzfrUNivjydFuA2XwPUiPqC7111O
 DeKlC/t+W1JEvZTiKMi83pKq+9wqhiHmD0qxRHhV57S+MT8e7mdlSKOp7uUkKPkg
 LPlerXosnmeFjL2emuSnKl/tq8pOyruU6uaIGG/uwpbo2W86Dok9GY2GWkQ4pANT
 pDtprc4aJ/Clf6Q0CoKwQbmAozqTDeJo+Und9tRs2KuZRly2bWOcyVE0lyK+Y4s0
 544LcKw2q6cB9ARZ6JExEVRJejPISGKMqo9TaHkyNSIJoiiatKYvNS4WVeFtTgbI
 W+1WfM1svPymNRqVPO1PMLV+3m9dalDH2WjtaFF21uCAQ/G0AuPEHjEDbbx0HIpb
 qrvWmYzZ97Rm/LdYROFRO53nEdCp2jh6c3n4/2kGYM8H0suvGxXZsB1g4i+Dm+B+
 qKVTS282azlDuH9ohXeXizeb6atK6s8TC3Rmew97SmXDO00cUQzEQO/ZquRLHY9r
 n83afQ4OL2Z9yruAxAk7pCshVSyheOsHuFPuZ7bwPW31VMdoWNRkhnaTUXMjGfYg
 3y39IHrCKWNMCCVM1iNl
 =z4d6
 -----END PGP SIGNATURE-----

Merge tag 'gpio-v4.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio

Pull GPIO updates from Linus Walleij:
 "This is the bulk of GPIO changes for kernel cycle v4.7:

  Core infrastructural changes:

   - Support for natively single-ended GPIO driver stages.

     This means that if the hardware has registers to configure open
     drain or open source configuration, we use that rather than (as we
     did before) try to emulate it by switching the line to an input to
     get high impedance.

     This is also documented throughly in Documentation/gpio/driver.txt
     for those of you who did not understand one word of what I just
     wrote.

   - Start to do away with the unnecessarily complex and unitelligible
     ARCH_REQUIRE_GPIOLIB and ARCH_WANT_OPTIONAL_GPIOLIB, another
     evolutional artifact from the time when the GPIO subsystem was
     unmaintained.

     Archs can now just select GPIOLIB and be done with it, cleanups to
     arches will trickle in for the next kernel.  Some minor archs ACKed
     the changes immediately so these are included in this pull request.

   - Advancing the use of the data pointer inside the GPIO device for
     storing driver data by switching the PowerPC, Super-H Unicore and
     a few other subarches or subsystem drivers in ALSA SoC, Input,
     serial, SSB, staging etc to use it.

   - The initialization now reads the input/output state of the GPIO
     lines, so that each GPIO descriptor knows - if this callback is
     implemented - whether the line is input or output.  This also
     reflects nicely in userspace "lsgpio".

   - It is now possible to name GPIO producer names, line names, from
     the device tree.  (Platform data has been supported for a while).
     I bet we will get a similar mechanism for ACPI one of those days.
     This makes is possible to get sensible producer names for e.g.
     GPIO rails in "lsgpio" in userspace.

  New drivers:

   - New driver for the Loongson1.

   - The XLP driver now supports Broadcom Vulcan ARM64.

   - The IT87 driver now supports IT8620 and IT8628.

   - The PCA953X driver now supports Galileo Gen2.

  Driver improvements:

   - MCP23S08 was switched to use the gpiolib irqchip helpers and now
     also suppors level-triggered interrupts.

   - 74x164 and RCAR now supports the .set_multiple() callback

   - AMDPT was converted to use generic GPIO.

   - TC3589x, TPS65218, SX150X, F7188X, MENZ127, VX855, WM831X, WM8994
     support the new single ended callback for open drain and in some
     cases open source.

   - Implement the .get_direction() callback for a few more drivers like
     PL061, Xgene.

  Cleanups:

   - Paul Gortmaker combed through the drivers and de-modularized those
     who are not really modules.

   - Move the GPIO poweroff DT bindings to the power subdir where they
     belong.

   - Rename gpio-generic.c to gpio-mmio.c, which is much more to the
     point.  That's what it is handling, nothing more, nothing less"

* tag 'gpio-v4.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: (126 commits)
  MIPS: do away with ARCH_[WANT_OPTIONAL|REQUIRE]_GPIOLIB
  gpio: zevio: make it explicitly non-modular
  gpio: timberdale: make it explicitly non-modular
  gpio: stmpe: make it explicitly non-modular
  gpio: sodaville: make it explicitly non-modular
  pinctrl: sh-pfc: Let gpio_chip.to_irq() return zero on error
  gpio: dwapb: Add ACPI device ID for DWAPB GPIO controller on X-Gene platforms
  gpio: dt-bindings: add wd,mbl-gpio bindings
  gpio: of: make it possible to name GPIO lines
  gpio: make gpiod_to_irq() return negative for NO_IRQ
  gpio: xgene: implement .get_direction()
  gpio: xgene: Enable ACPI support for X-Gene GFC GPIO driver
  gpio: tegra: Implement gpio_get_direction callback
  gpio: set up initial state from .get_direction()
  gpio: rename gpio-generic.c into gpio-mmio.c
  gpio: generic: fix GPIO_GENERIC_PLATFORM is set to module case
  gpio: dwapb: add gpio-signaled acpi event support
  gpio: dwapb: convert device node to fwnode
  gpio: dwapb: remove name from dwapb_port_property
  gpio/qoriq: select IRQ_DOMAIN
  ...
2016-05-17 17:39:42 -07:00
Omar Sandoval a008393951 coredump: get rid of coredump_params->written
cprm->written is redundant with cprm->file->f_pos, so use that instead.

Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-12 16:55:50 -04:00
Alexey Kardashevskiy 1d4e89cf0a powerpc/powernv/npu: Add PE to PHB's list
Before commit 3e68dc57 "powerpc/powernv: Remove DMA32 PE list", NPU PEs
were linked to the NPU PHB via phb->ioda.pe_dma_list; after that fix,
the phb->ioda.pe_list is used.

During the pe_dma_list removal, list_add_tail(&phb->ioda.pe_dma_list)
was removed, however no list_add() was added so does this patch.

Fixes: 3e68dc57219a ("powerpc/powernv: Remove DMA32 PE list")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-12 19:56:25 +10:00
Alexey Kardashevskiy 92a8675690 powerpc/powernv: Fix insufficient memory allocation
The pnv_pci_init_ioda_phb() helper allocates a blob to store auxilary
data such PE and M32/M64 segment allocation maps; this single blob has
few partitions, size of each is derived from the PE number -
phb->ioda.total_pe_num.

It was assumed that the minimum PE number is 8, however it is 4 for NPU
so the pe_alloc part was missing in the allocated blob. It was invisible
till recently as we were not tracking used M64 segments and NPUs do not
use M32 segments so the phb->ioda.m32_segmap (which was pointing to the
same address as phb->ioda.pe_alloc) has never been written to leaving
the pe_alloc memory intact.

After commit 401203ac2d "powerpc/powernv: Track M64 segment consumption"
the pe_alloc gets corrupted and PE allocation cannot work. This fixes
the issue by enforcing the minimum PE number to 8.

Fixes: 401203ac2d15 ("powerpc/powernv: Track M64 segment consumption")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-12 19:55:42 +10:00
Guilherme G. Piccoli 8445a87f70 powerpc/iommu: Remove the dependency on EEH struct in DDW mechanism
Commit 39baadbf36 ("powerpc/eeh: Remove eeh information from pci_dn")
changed the pci_dn struct by removing its EEH-related members.
As part of this clean-up, DDW mechanism was modified to read the device
configuration address from eeh_dev struct.

As a consequence, now if we disable EEH mechanism on kernel command-line
for example, the DDW mechanism will fail, generating a kernel oops by
dereferencing a NULL pointer (which turns to be the eeh_dev pointer).

This patch just changes the configuration address calculation on DDW
functions to a manual calculation based on pci_dn members instead of
using eeh_dev-based address.

No functional changes were made. This was tested on pSeries, both
in PHyp and qemu guest.

Fixes: 39baadbf36 ("powerpc/eeh: Remove eeh information from pci_dn")
Cc: stable@vger.kernel.org # v3.4+
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-12 19:52:21 +10:00
Michael Ellerman 848912e547 Revert "powerpc/powernv: Exclude root bus in pnv_pci_reset_secondary_bus()"
This reverts commit c8ceacc22b.

Gavin says: I missed the fact that it affects the PCI passthrou path as
reported by Alexey: When passing GPU (0003:01:00.0) which seats behind
the root port, the reset request is routed to skiboot in original code.
In skiboot, the link bouncing events are masked during the reset. So we
don't see EEH (freeze all) error even link bouncing happens. With the
changes included, the reset is done by kernel and the link bouncing
events aren't masked by altering content of PHB3 (or P7IOC) specific
hardware registers which are invisible to kernel (skiboot hides the
hardware specific). It means the link bouncing is seen by the root port
and it causes a EEH (freeze all) error. The PCI passthrough on GPU
device cannot work.

Requested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Requested-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-12 19:43:37 +10:00
Alexey Kardashevskiy b5cb9ab1a0 powerpc/powernv/npu: Enable NVLink pass through
IBM POWER8 NVlink systems come with Tesla K40-ish GPUs each of which
also has a couple of fast speed links (NVLink). The interface to links
is exposed as an emulated PCI bridge which is included into the same
IOMMU group as the corresponding GPU.

In the kernel, NPUs get a separate PHB of the PNV_PHB_NPU type and a PE
which behave pretty much as the standard IODA2 PHB except NPU PHB has
just a single TVE in the hardware which means it can have either
32bit window or 64bit window or DMA bypass but never two of these.

In order to make these links work when GPU is passed to the guest,
these bridges need to be passed as well; otherwise performance will
degrade.

This implements and exports API to manage NPU state in regard to VFIO;
it replicates iommu_table_group_ops.

This defines a new pnv_pci_ioda2_npu_ops which is assigned to
the IODA2 bridge if there are NPUs for a GPU on the bridge.
The new callbacks call the default IODA2 callbacks plus new NPU API.
This adds a gpe_table_group_to_npe() helper to find NPU PE for the IODA2
table_group, it is not expected to fail as the helper is only called
from the pnv_pci_ioda2_npu_ops.

This does not define NPU-specific .release_ownership() so after
VFIO is finished, DMA on NPU is disabled which is ok as the nvidia
driver sets DMA mask when probing which enable 32 or 64bit DMA on NPU.

This adds a pnv_pci_npu_setup_iommu() helper which adds NPUs to
the GPU group if any found. The helper uses helpers to look for
the "ibm,gpu" property in the device tree which is a phandle of
the corresponding GPU.

This adds an additional loop over PEs in pnv_ioda_setup_dma() as the main
loop skips NPU PEs as they do not have 32bit DMA segments.

As pnv_npu_set_window() and pnv_npu_unset_window() are started being used
by the new IODA2-NPU IOMMU group, this makes the helpers public and
adds the DMA window number parameter.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-By: Alistair Popple <alistair@popple.id.au>
[mpe: Add pnv_pci_ioda_setup_iommu_api() to fix build with IOMMU_API=n]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:54:31 +10:00
Alexey Kardashevskiy 85674868ce powerpc/powernv/npu: Rework TCE Kill handling
The pnv_ioda_pe struct keeps an array of peers. At the moment it is only
used to link GPU and NPU for 2 purposes:

1. Access NPU quickly when configuring DMA for GPU - this was addressed
in the previos patch by removing use of it as DMA setup is not what
the kernel would constantly do.

2. Invalidate TCE cache for NPU when it is invalidated for GPU.
GPU and NPU are in different PE. There is already a mechanism to
attach multiple iommu_table_group to the same iommu_table (used for VFIO),
we can reuse it here so does this patch.

This gets rid of peers[] array and PNV_IODA_PE_PEER flag as they are
not needed anymore.

While we are here, add TCE cache invalidation after enabling bypass.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-By: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:54:31 +10:00
Alexey Kardashevskiy b575c731fe powerpc/powernv/npu: Add set/unset window helpers
The upcoming NVLink passthrough support will require NPU code to cope
with two DMA windows.

This adds a pnv_npu_set_window() helper which programs 32bit window to
the hardware. This also adds multilevel TCE support.

This adds a pnv_npu_unset_window() helper which removes the DMA window
from the hardware. This does not make difference now as the caller -
pnv_npu_dma_set_bypass() - enables bypass in the hardware but the next
patch will use it to manage TCE table lists for TCE Kill handling.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-By: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:54:30 +10:00
Alexey Kardashevskiy 7d623e4256 powerpc/powernv/ioda2: Export debug helper pe_level_printk()
This exports debugging helper pe_level_printk() and corresponding macroses
so they can be used in npu-dma.c.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-By: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:54:30 +10:00
Alexey Kardashevskiy f9f8345674 powerpc/powernv/npu: Simplify DMA setup
NPU devices are emulated in firmware and mainly used for NPU NVLink
training; one NPU device is per a hardware link. Their DMA/TCE setup
must match the GPU which is connected via PCIe and NVLink so any changes
to the DMA/TCE setup on the GPU PCIe device need to be propagated to
the NVLink device as this is what device drivers expect and it doesn't
make much sense to do anything else.

This makes NPU DMA setup explicit.
pnv_npu_ioda_controller_ops::pnv_npu_dma_set_mask is moved to pci-ioda,
made static and prints warning as dma_set_mask() should never be called
on this function as in any case it will not configure GPU; so we make
this explicit.

Instead of using PNV_IODA_PE_PEER and peers[] (which the next patch will
remove), we test every PCI device if there are corresponding NVLink
devices. If there are any, we propagate bypass mode to just found NPU
devices by calling the setup helper directly (which takes @bypass) and
avoid guessing (i.e. calculating from DMA mask) whether we need bypass
or not on NPU devices. Since DMA setup happens in very rare occasion,
this will not slow down booting or VFIO start/stop much.

This renames pnv_npu_disable_bypass to pnv_npu_dma_set_32 to make it
more clear what the function really does which is programming 32bit
table address to the TVT ("disabling bypass" means writing zeroes to
the TVT).

This removes pnv_npu_dma_set_bypass() from pnv_npu_ioda_fixup() as
the DMA configuration on NPU does not matter until dma_set_mask() is
called on GPU and that will do the NPU DMA configuration.

This removes phb->dma_dev_setup initialization for NPU as
pnv_pci_ioda_dma_dev_setup is no-op for it anyway.

This stops using npe->tce_bypass_base as it never changes and values
other than zero are not supported.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:54:29 +10:00
Alexey Kardashevskiy 6969af7352 powerpc/powernv/npu: Use the correct IOMMU page size
This uses the page size from iommu_table instead of hard-coded 4K.
This should cause no change in behavior.

While we are here, move bits around to prepare for further rework
which will define and use iommu_table_group_ops.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:54:29 +10:00
Alexey Kardashevskiy 0bbcdb437d powerpc/powernv/npu: TCE Kill helpers cleanup
NPU PHB TCE Kill register is exactly the same as in the rest of POWER8
so let's reuse the existing code for NPU. The only bit missing is
a helper to reset the entire TCE cache so this moves such a helper
from NPU code and renames it.

Since pnv_npu_tce_invalidate() does really invalidate the entire cache,
this uses pnv_pci_ioda2_tce_invalidate_entire() directly for NPU.
This adds an explicit comment for workaround for invalidating NPU TCE
cache.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:54:29 +10:00
Alexey Kardashevskiy bef9253f55 powerpc/powernv: Define TCE Kill flags
This replaces magic constants for TCE Kill IODA2 register with macros.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:54:28 +10:00
Alexey Kardashevskiy a7cf13caad powerpc/powernv: Rename pnv_pci_ioda2_tce_invalidate_entire
As in fact pnv_pci_ioda2_tce_invalidate_entire() invalidates TCEs for
the specific PE rather than the entire cache, rename it to
pnv_pci_ioda2_tce_invalidate_pe(). In later patches we will add
a proper pnv_pci_ioda2_tce_invalidate_entire().

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:54:28 +10:00
Gavin Shan c8ceacc22b powerpc/powernv: Exclude root bus in pnv_pci_reset_secondary_bus()
The function pnv_pci_reset_secondary_bus() is called like below.
It's impossible for call the function on root bus. So it's safe
to remove the root bus case in the function. No functional changes
introduced.

   pci_parent_bus_reset() / pci_bus_reset() / pci_try_reset_bus()
   pci_reset_bridge_secondary_bus()
   pcibios_reset_secondary_bus()
   pnv_pci_reset_secondary_bus()

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Daniel Axtens <dja@axtens.net>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:54:27 +10:00
Gavin Shan 4fad494321 powerpc/powernv: Simplify pnv_eeh_reset()
This drops unnecessary nested if statements in pnv_eeh_reset() to
improve the code readability. After the changes, the unused local
variable "ret" is dropped as well. No logical changes introduced.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:54:26 +10:00
Gavin Shan cdddc577d9 powerpc/pci: Export pci_traverse_device_nodes()
This renames traverse_pci_devices() to pci_traverse_device_nodes().
The function traverses all subordinate device nodes of the specified
one. Also, below cleanup applied to the function. No logical changes
introduced.

   * Rename "pre" to "fn".
   * Avoid assignment in if condition reported from checkpatch.pl.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:54:25 +10:00
Gavin Shan d8f66f411e powerpc/pci: Export pci_add_device_node_info()
This renames update_dn_pci_info() to pci_add_device_node_info()
with corresponding adjustment on the parameter type and exports it.
The function is used to create pdn (struct pci_dn) for the indicated
device node. Another function add_pdn(), almost wrapper of
pci_add_device_node_info(), to be used in traverse_pci_devices(). No
logical changes introduced.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:54:24 +10:00
Gavin Shan 6384d97780 powerpc/pci: Move pci_find_bus_by_node() around
This moves pci_find_bus_by_node() from arch/powerpc/platforms/
pseries/pci_dlpar.c to arch/powerpc/kernel/pci-hotplug.c so that
the function can be used by pSeries and PowerNV platform at the
same time. Also, below cleanup applied. No functional changes
introduced.

   * Remove variable "busdn" in find_bus_among_children()
   * Use PCI_DN() to convert device node to pci_dn

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:54:24 +10:00
Gavin Shan 3773dd258e powerpc/pci: Rename pcibios_find_pci_bus()
This renames pcibios_find_pci_bus() to pci_find_bus_by_node() to
avoid conflicts with those PCI subsystem weak function names, which
have prefix "pcibios". No logical changes introduced.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:54:24 +10:00
Gavin Shan 1e9167726c powerpc/powernv: Use PE instead of number during setup and release
In current implementation, the PEs that are allocated or picked
from the reserved list are identified by PE number. The PE instance
has to be picked according to the PE number eventually. We have
same issue when PE is released.

For pnv_ioda_pick_m64_pe() and pnv_ioda_alloc_pe(), this returns
PE instance so that pnv_ioda_setup_bus_PE() can use the allocated
or reserved PE instance directly. Also, pnv_ioda_setup_bus_PE()
returns the reserved/allocated PE instance to be used in subsequent
patches. On the other hand, pnv_ioda_free_pe() uses PE instance
(not number) as its argument. No logical changes introduced.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:54:23 +10:00
Gavin Shan 2b923ed1bd powerpc/powernv/ioda1: Improve DMA32 segment track
In current implementation, the DMA32 segments required by one specific
PE isn't calculated with the information hold in the PE independently.
It conflicts with the PCI hotplug design: PE centralized, meaning the
PE's DMA32 segments should be calculated from the information hold in
the PE independently.

This introduces an array (@dma32_segmap) for every PHB to track the
DMA32 segmeng usage. Besides, this moves the logic calculating PE's
consumed DMA32 segments to pnv_pci_ioda1_setup_dma_pe() so that PE's
DMA32 segments are calculated/allocated from the information hold in
the PE (DMA32 weight). Also the logic is improved: we try to allocate
as much DMA32 segments as we can. It's acceptable that number of DMA32
segments less than the expected number are allocated.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:54:22 +10:00
Gavin Shan 801846d1de powerpc/powernv: Remove DMA32 PE list
PEs are put into PHB DMA32 list (phb->ioda.pe_dma_list) according
to their DMA32 weight. The PEs on the list are iterated to setup
their TCE32 tables at system booting time. The list is used for
once at boot time and no need to keep it.

This moves the logic calculating DMA32 weight of PHB and PE to
pnv_ioda_setup_dma() to drop PHB's DMA32 list. Also, every PE
traces the consumed DMA32 segment by @tce32_seg and @tce32_segcount
are useless and they're removed.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:54:22 +10:00
Gavin Shan acce971c0e powerpc/powernv/ioda1: Introduce PNV_IODA1_DMA32_SEGSIZE
Currently, there is one macro (TCE32_TABLE_SIZE) representing the
TCE table size for one DMA32 segment. The constant representing
the DMA32 segment size (1 << 28) is still used in the code.

This defines PNV_IODA1_DMA32_SEGSIZE representing one DMA32
segment size. the TCE table size can be calcualted when the page
has fixed 4KB size. So all the related calculation depends on one
macro (PNV_IODA1_DMA32_SEGSIZE). No logical changes introduced.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-By: Alistair Popple <alistair@popple.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:54:21 +10:00
Gavin Shan b30d936f6f powerpc/powernv/ioda1: Rename pnv_pci_ioda_setup_dma_pe()
This renames pnv_pci_ioda_setup_dma_pe() to pnv_pci_ioda1_setup_dma_pe()
as it's the counter-part of IODA2's pnv_pci_ioda2_setup_dma_pe().
No logical changes introduced.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:54:21 +10:00
Gavin Shan 9945155143 powerpc/powernv/ioda1: M64 support on P7IOC
This enables M64 window on P7IOC, which has been enabled on PHB3.
Different from PHB3 where 16 M64 BARs are supported and each of
them can be owned by one particular PE# exclusively or divided
evenly to 256 segments, every P7IOC PHB has 16 M64 BARs and each
of them are divided to 8 segments. So every P7IOC PHB supports
128 M64 segments in total. P7IOC has M64DT, which helps mapping
one particular M64 segment# to arbitrary PE#. PHB3 doesn't have
M64DT, indicating that one M64 segment can only be pinned to the
fixed PE#.

In order to unified M64 support M64 on P7IOC and PHB3, we just
provide 128 M64 segments on every P7IOC PHB and each of them is
pinned to the fixed PE# by bypassing the function of M64DT. In
turn, we just need different phb->init_m64() for P7IOC and PHB3
and maps M64 segment in pnv_ioda_reserve_m64_pe() for P7IOC, most
of the code are shared by them.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Alistair Popple <alistair@popple.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:54:20 +10:00
Gavin Shan c430670ad1 powerpc/powernv: Rename M64 related functions
This renames those functions picking PE number based on consumed
M64 segments, mapping M64 segments to PEs as those functions are
going to be shared by IODA1/IODA2 in next patch. No logical changes
introduced.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:54:20 +10:00
Gavin Shan 93289d8c08 powerpc/powernv: Track M64 segment consumption
When unplugging PCI devices, their parent PEs might be offline.
The consumed M64 resource by the PEs should be released at that
time. As we track M32 segment consumption, this introduces an
array to the PHB to track the mapping between M64 segment and
PE number.

Note: M64 mapping isn't covered by pnv_ioda_setup_pe_seg() as
IODA2 doesn't support the mapping explicitly while it's supported
on IODA1. Until now, no M64 is supported on IODA1 in software.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:54:19 +10:00
Gavin Shan 69d733e72d powerpc/powernv: IO and M32 mapping based on PCI device resources
Currently, the IO and M32 segments are mapped to the corresponding
PE based on the windows of the parent bridge of PE's primary bus.
It's not going to work when the windows of root port or upstream
port of the PCIe switch behind root port are extended to PHB's
apertures in order to support hotplug in subsequent patch.

This fixes the issue by mapping IO and M32 segments based on the
resources of the PCI devices included in the PE, instead of the
windows of the parent bridge of the PE's primary bus.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:54:19 +10:00
Gavin Shan 23e79425fe powerpc/powernv: Simplify pnv_ioda_setup_pe_seg()
pnv_ioda_setup_pe_seg() associates the IO and M32 segments with the
owner PE. The code mapping segments should be fixed and immune from
logic changes introduced to pnv_ioda_setup_pe_seg().

This moves the code mapping segments to helper pnv_ioda_setup_pe_res().
The data type for @rc is changed to "int64_t". Also, argument @hose is
removed from pnv_ioda_setup_pe() as it can be got from @pe. No functional
changes introduced.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-By: Alistair Popple <alistair@popple.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:54:18 +10:00
Gavin Shan 3fa23ff8ff powerpc/powernv: Fix initial IO and M32 segmap
There are two arrays for IO and M32 segment maps on every PHB.
The index of the arrays are segment number and the value stored
in the corresponding element is PE number, indicating the segment
is assigned to the PE. Initially, all elements in those two arrays
are zeroes, meaning all segments are assigned to PE#0. It's wrong.

This fixes the initial values in the elements of those two arrays
to IODA_INVALID_PE, meaning all segments aren't assigned to any
PE.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:54:18 +10:00
Gavin Shan 689ee8c95f powerpc/powernv: Data type unsigned int for PE number
This changes the data type of PE number from "int" to "unsigned int"
in order to match the fact PE number is never negative:

   * The number of PE to which the specified PCI device is attached.
   * The PE number map for SRIOV VFs.
   * The returned PE number from pnv_ioda_alloc_pe().
   * The returned PE number from pnv_ioda2_pick_m64_pe().

Suggested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-By: Alistair Popple <alistair@popple.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:54:17 +10:00
Gavin Shan 92b8f137b3 powerpc/powernv: Rename PE# fields in struct pnv_phb
This renames the fields related to PE number in "struct pnv_phb"
for better reflecting of their usages as Alexey suggested. No
logical changes introduced.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:54:17 +10:00
Gavin Shan 13ce7598b6 powerpc/powernv: Reorder fields in struct pnv_phb
This moves those fields in struct pnv_phb that are related to PE
allocation around. No logical change.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:54:16 +10:00
Gavin Shan 475d92c27f powerpc/powernv: Drop phb->bdfn_to_pe()
The last usage of pnv_phb::bdfn_to_pe() was removed in
ff57b454dd ("powerpc/eeh: Do probe on pci_dn"), so drop it.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:54:16 +10:00
Gavin Shan cb4224c501 powerpc/powernv: Cleanup on pci_controller_ops instances
This cleans up on below data struct instances to use tab instead of
space indent of statement to avoid complains from scripts/checkpatch.pl.
No logical changes introduced.

  @pnv_pci_ioda_controller_ops
  @pnv_npu_ioda_controller_ops

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Daniel Axtens <dja@axtens.net>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:54:15 +10:00
Suraj Jitindar Singh 925e2d1ded powerpc: Update of_remove_property() call sites to remove null checking
After obtaining a property from of_find_property() and before calling
of_remove_property() most code checks to ensure that the property
returned from of_find_property() is not null. The previous patch moved
this check to the start of the function of_remove_property() in order to
avoid the case where this check isn't done and a null value is passed.
This ensures the check is always conducted before taking locks and
attempting to remove the property. Thus it is no longer necessary to
perform a check for null values before invoking of_remove_property().

Update of_remove_property() call sites in order to remove redundant
checking for null property value as check is now performed within the
of_remove_property function().

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
[mpe: Unbreak some lines which are just >80 chars for readability]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:54:04 +10:00
Suraj Jitindar Singh b2ed059642 powerpc/pseries: Add null property check to pseries_discover_pic()
The return value of of_get_property() isn't checked before it is passed
to the strstr() function, if it happens that the return value is null
then this will result in a null pointer being dereferenced.

Add a check to see if the return value of of_get_property() is null and
if it is continue straight on to the next node.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Reviewed-by: Chris Smart <chris@distroguy.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:54:03 +10:00
Alexey Kardashevskiy 9e44754755 powerpc/powernv/pci: Fix cfg_dbg() & replace with pr_devel()
When cfg_dbg() is enabled (i.e. mapped to printk()), gcc produces
errors as the __func__ parameter is missing (pnv_pci_cfg_read() has one);
this adds the missing parameter.

cfg_dbg() is just an inferior version of pr_devel() so use the latter
instead.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:54:02 +10:00
Aneesh Kumar K.V ab62476240 powerpc/mm/radix: Add THP support for 4K linux page size
This adds THP support for 4K Linux page size config with radix. We still
don't do THP with 4K Linux page size and hash page table. Hash page
table needs a 16MB hugepage and we can't do THP with 16MM hugepage and
4K Linux page size.

We add missing functions to 4K hash config to get it to build and
hash__has_transparent_hugepage() makes sure we don't enable THP for 4K
hash config. To catch wrong usage of THP related with 4K config, we add
BUG() in those dummy functions we added to get it compile.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:53:58 +10:00
Aneesh Kumar K.V d8c476eeb6 powerpc/mm/radix: Isolate hash table function from pseries guest code
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:53:46 +10:00
Al Viro 4e82901cd6 dcache_{readdir,dir_lseek}() users: switch to ->iterate_shared
no need to lock directory in dcache_dir_lseek(), while we are
at it - per-struct file exclusion is enough.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-02 19:49:32 -04:00
Aneesh Kumar K.V 2bfd65e45e powerpc/mm/radix: Add radix callbacks for early init routines
This adds routines for early setup for radix. We use device tree
property "ibm,processor-radix-AP-encodings" to find supported page
sizes. If we don't find the above we consider 64K and 4K as supported
page sizes.

We do map vmemap using 2M page size if we can. The linear mapping is
done such that we use required page size for that range. For example
memory of 3.5G is mapped such that we use 1G mapping till 3G range and
use 2M mapping for the rest.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-01 18:33:00 +10:00
Aneesh Kumar K.V 566ca99af0 powerpc/mm/radix: Add dummy radix_enabled()
In this patch we add the radix Kconfig and conditional check.
radix_enabled() is written to always return 0 here. Once we have all
needed radix changes added, we will update this to an mmu_feature check.

We need to add this early so that we can get it all build in the early
stage.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-01 18:32:54 +10:00
Aneesh Kumar K.V 50de596de8 powerpc/mm/hash: Add support for Power9 Hash
PowerISA 3.0 adds a parition table indexed by LPID. Parition table
allows us to specify the MMU model that will be used for guest and host
translation.

This patch adds support with SLB based hash model (UPRT = 0). What is
required with this model is to support the new hash page table entry
format and also setup partition table such that we use hash table for
address translation.

We don't have segment table support yet.

In order to make sure we don't load KVM module on Power9 (since we don't
have kvm support yet) this patch also disables KVM on Power9.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-01 18:32:40 +10:00
Aneesh Kumar K.V 30bda41aba powerpc/mm: Drop WIMG in favour of new constants
PowerISA 3.0 introduces two pte bits with the below meaning for radix:
  00 -> Normal Memory
  01 -> Strong Access Order (SAO)
  10 -> Non idempotent I/O (Cache inhibited and guarded)
  11 -> Tolerant I/O (Cache inhibited)

We drop the existing WIMG bits in the Linux page table in favour of the
above constants. We loose _PAGE_WRITETHRU with this conversion. We only
use writethru via pgprot_cached_wthru() which is used by
fbdev/controlfb.c which is Apple control display and also PPC32.

With respect to _PAGE_COHERENCE, we have been marking hpte always
coherent for some time now. htab_convert_pte_flags() always added
HPTE_R_M.

NOTE: KVM changes need closer review.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-01 18:32:33 +10:00
Aneesh Kumar K.V 72176dd0ad powerpc/mm: Use a helper for finding pte bits mapping I/O area
Use a helper instead of open coding with constants. A later patch will
drop the WIMG bits and use PowerISA 3.0 defines.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-01 18:32:32 +10:00
Aneesh Kumar K.V e58e87adc8 powerpc/mm: Update _PAGE_KERNEL_RO
PS3 had used a PPP bit hack to implement a read only mapping in the
kernel area. Since we are bolting the ioremap area, it used the pte
flags _PAGE_PRESENT | _PAGE_USER to get a PPP value of 0x3 there by
resulting in a read only mapping. This means the area can be accessed by
user space, but kernel will never return such an address to user space.

But we can do better by implementing a read only kernel mapping using
PPP bits 0b110.

This also allows us to do read only kernel mapping for radix in later
patches.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-01 18:32:30 +10:00
Aneesh Kumar K.V ac29c64089 powerpc/mm: Replace _PAGE_USER with _PAGE_PRIVILEGED
_PAGE_PRIVILEGED means the page can be accessed only by the kernel. This
is done to keep pte bits similar to PowerISA 3.0 Radix PTE format. User
pages are now marked by clearing _PAGE_PRIVILEGED bit.

Previously we allowed the kernel to have a privileged page in the lower
address range (USER_REGION). With this patch such access is denied.

We also prevent a kernel access to a non-privileged page in higher
address range (ie, REGION_ID != 0).

Both the above access scenarios should never happen.

Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Jeremy Kerr <jk@ozlabs.org>
Cc: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-01 18:32:26 +10:00
Aneesh Kumar K.V c7d54842de powerpc/mm: Use _PAGE_READ to indicate Read access
This splits the _PAGE_RW bit into _PAGE_READ and _PAGE_WRITE. It also
removes the dependency on _PAGE_USER for implying read only. Few things
to note here is that, we have read implied with write and execute
permission. Hence we should always find _PAGE_READ set on hash pte
fault.

We still can't switch PROT_NONE to !(_PAGE_RWX). Auto numa depends on
marking a prot none pte _PAGE_WRITE. (For more details look at
b191f9b106 "mm: numa: preserve PTE write permissions across a NUMA
hinting fault")

Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Jeremy Kerr <jk@ozlabs.org>
Cc: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-01 18:32:21 +10:00
Paul Gortmaker 8038665fb5 powerpc/cell: Make spu_base.c explicitly non-modular
The Kconfig currently controlling compilation of this code is:

arch/powerpc/platforms/cell/Kconfig:config SPU_BASE
arch/powerpc/platforms/cell/Kconfig:    bool

...meaning that it currently is not being built as a module by anyone.

Lets remove the modular code that is essentially orphaned, so that
when reading the driver there is no doubt it is builtin-only.

Since module_init translates to device_initcall in the non-modular
case, the init ordering remains unchanged with this commit.

We also delete the MODULE_LICENSE tag etc. since all that information
is already contained at the top of the file in the comments.

Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-04-11 20:30:42 +10:00
Russell Currey f8a25db47e powerpc/powernv: Use the "unknown" checkstop type as a fallback
The HMI code knows about three types of errors: CORE, NX and UNKNOWN.
If OPAL were to add a new type, it would not be handled at all since
there is no fallback case.  Instead of explicitly checking for UNKNOWN,
treat any checkstop type without a handler as unknown.

Signed-off-by: Russell Currey <ruscur@russell.cc>
Reviewed-by: Daniel Axtens <dja@axtens.net>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-04-11 20:30:40 +10:00
Nathan Fontenot bdf5fc6338 powerpc/pseries: Update LMB associativity index during DLPAR add/remove
The associativity array index specified for a LMB in the device tree,
/ibm,dynamic-reconfiguration-memory/ibm,dynamic-memory, needs to be updated
prior to DLPAR adding a LMB and after DLPAR removing a LMB.

Without doing this step in the DLPAR add process a LMB could be configured
with the incorrect affinity. For a LMB that was not present at boot the
affinity index is set to 0xffffffff, which defaults to adding the LMB to
the first online node since the index is not a valid value. Or, the
affinity index could contain a stale value if the LMB was present at boot
but later DLPAR removed and is being DLPAR added back to the system.

This patch adds a step in the DLPAR add flow to look up the associativity
index for a LMB prior to adding a LMB and setting the associativity to
0xffffffff when a LMB is removed.

This patch also modifies the DLPAR add/remove flow to no longer do a single
update of the device tree property after all of the requested DLPAR
operations are complete and now does a property update during the add
or remove of each LMB.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-04-11 11:23:39 +10:00
Nathan Fontenot 4a4bdfea7c powerpc/pseries: Refactor dlpar_add_lmb() code
Re-factor dlpar_lmb_add() routine by moving the validation of the lmb
flags and the acquireing of the DRC to a wrapper around the work to add
the memory to the system. This is done to make handling of errors
during the addition of the memory easier and to facilitate the upcoming
addition of updating the lmb's affinity prior to adding the memory.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-04-11 11:23:38 +10:00
Kirill A. Shutemov 09cbfeaf1a mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
ago with promise that one day it will be possible to implement page
cache with bigger chunks than PAGE_SIZE.

This promise never materialized.  And unlikely will.

We have many places where PAGE_CACHE_SIZE assumed to be equal to
PAGE_SIZE.  And it's constant source of confusion on whether
PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
especially on the border between fs and mm.

Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much
breakage to be doable.

Let's stop pretending that pages in page cache are special.  They are
not.

The changes are pretty straight-forward:

 - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;

 - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;

 - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};

 - page_cache_get() -> get_page();

 - page_cache_release() -> put_page();

This patch contains automated changes generated with coccinelle using
script below.  For some reason, coccinelle doesn't patch header files.
I've called spatch for them manually.

The only adjustment after coccinelle is revert of changes to
PAGE_CAHCE_ALIGN definition: we are going to drop it later.

There are few places in the code where coccinelle didn't reach.  I'll
fix them manually in a separate patch.  Comments and documentation also
will be addressed with the separate patch.

virtual patch

@@
expression E;
@@
- E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E

@@
expression E;
@@
- E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E

@@
@@
- PAGE_CACHE_SHIFT
+ PAGE_SHIFT

@@
@@
- PAGE_CACHE_SIZE
+ PAGE_SIZE

@@
@@
- PAGE_CACHE_MASK
+ PAGE_MASK

@@
expression E;
@@
- PAGE_CACHE_ALIGN(E)
+ PAGE_ALIGN(E)

@@
expression E;
@@
- page_cache_get(E)
+ get_page(E)

@@
expression E;
@@
- page_cache_release(E)
+ put_page(E)

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-04 10:41:08 -07:00
Linus Walleij da5f767ef6 powerpc: mpc8349emitx: use gpiochip data pointer
This makes the driver use the data pointer added to the gpio_chip
to store a pointer to the state container instead of relying on
container_of().

Cc: Anatolij Gustschin <agust@denx.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2016-03-30 10:51:54 +02:00
Linus Walleij e99190cea3 powerpc: mpc52xx_gpt: use gpiochip data pointer
This makes the driver use the data pointer added to the gpio_chip
to store a pointer to the state container instead of relying on
container_of().

Cc: Anatolij Gustschin <agust@denx.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2016-03-30 10:50:47 +02:00
Linus Torvalds d5e2d00898 powerpc updates for 4.6
Highlights:
  - Restructure Linux PTE on Book3S/64 to Radix format from Paul Mackerras
  - Book3s 64 MMU cleanup in preparation for Radix MMU from Aneesh Kumar K.V
  - Add POWER9 cputable entry from Michael Neuling
  - FPU/Altivec/VSX save/restore optimisations from Cyril Bur
  - Add support for new ftrace ABI on ppc64le from Torsten Duwe
 
 Various cleanups & minor fixes from:
  - Adam Buchbinder, Andrew Donnellan, Balbir Singh, Christophe Leroy, Cyril
    Bur, Luis Henriques, Madhavan Srinivasan, Pan Xinhui, Russell Currey,
    Sukadev Bhattiprolu, Suraj Jitindar Singh.
 
 General:
  - atomics: Allow architectures to define their own __atomic_op_* helpers from
    Boqun Feng
  - Implement atomic{, 64}_*_return_* variants and acquire/release/relaxed
    variants for (cmp)xchg from Boqun Feng
  - Add powernv_defconfig from Jeremy Kerr
  - Fix BUG_ON() reporting in real mode from Balbir Singh
  - Add xmon command to dump OPAL msglog from Andrew Donnellan
  - Add xmon command to dump process/task similar to ps(1) from Douglas Miller
  - Clean up memory hotplug failure paths from David Gibson
 
 pci/eeh:
  - Redesign SR-IOV on PowerNV to give absolute isolation between VFs from Wei
    Yang.
  - EEH Support for SRIOV VFs from Wei Yang and Gavin Shan.
  - PCI/IOV: Rename and export virtfn_{add, remove} from Wei Yang
  - PCI: Add pcibios_bus_add_device() weak function from Wei Yang
  - MAINTAINERS: Update EEH details and maintainership from Russell Currey
 
 cxl:
  - Support added to the CXL driver for running on both bare-metal and
    hypervisor systems, from Christophe Lombard and Frederic Barrat.
  - Ignore probes for virtual afu pci devices from Vaibhav Jain
 
 perf:
  - Export Power8 generic and cache events to sysfs from Sukadev Bhattiprolu
  - hv-24x7: Fix usage with chip events, display change in counter values,
    display domain indices in sysfs, eliminate domain suffix in event names,
    from Sukadev Bhattiprolu
 
 Freescale:
  - Updates from Scott: "Highlights include 8xx optimizations, 32-bit checksum
    optimizations, 86xx consolidation, e5500/e6500 cpu hotplug, more fman and
    other dt bits, and minor fixes/cleanup."
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJW69OrAAoJEFHr6jzI4aWAe5EQAJw/hE6WBQc6a7Tj70AnXOqR
 qk/m5pZjuTwQxfBteIvHR1pE5eXdlvtAjcD254LVkFkAbIn19W/h2k0VX/nlee7P
 n/VRHRifjtGmukqHrPYJJ7ua9mNlY7pxh3leGSixBFASnSWqMxNNNziNQtSTcuCs
 TjHiw6NkZ/kzeunA4bAfE4yHVUZjmL74oiS9JbLyaVHqoW4fqWLlh26AKo2yYMZI
 qPicBBG4HBi3FGvoexnKxlJNdcV4HO7LzDjJmCSfUKYCJi+Pw19T5qmhso0q0qVz
 vHg/A8HNeG4Hn83pNVmLeQSAIQRZ3DvTtcLgbjPo+TVwm/hzrRRBWipTeOVbkLW8
 2bcOXT4t7LWUq15EAJ1LYgYZGzcLrfRfUeOcuQ1TWd3+PcfY9pE7FmizsxAAfaVe
 E9j9mpz4XnIqBtWkFHneTIHkQ5OWptyKuZJEaYH0nut4VsP0k8NarkseafGqBPu7
 5eG83gbiQbCVixfOgblV9eocJ29JcwpjPAY4CZSGJimShg909FV7WRgZgJkKWrbK
 dBRco8Jcp4VglGfo2qymv7Uj4KwQoypBREOhiKUvrAsVlDxPfx+bcskhjGu9xGDC
 xs/+nme0/lKa/wg5K4C3mQ1GAlkMWHI0ojhJjsyODbetup5UbkEu03wjAaTdO9dT
 Y6ptGm0rYAJluPNlziFj
 =qkAt
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:
 "This was delayed a day or two by some build-breakage on old toolchains
  which we've now fixed.

  There's two PCI commits both acked by Bjorn.

  There's one commit to mm/hugepage.c which is (co)authored by Kirill.

  Highlights:
   - Restructure Linux PTE on Book3S/64 to Radix format from Paul
     Mackerras
   - Book3s 64 MMU cleanup in preparation for Radix MMU from Aneesh
     Kumar K.V
   - Add POWER9 cputable entry from Michael Neuling
   - FPU/Altivec/VSX save/restore optimisations from Cyril Bur
   - Add support for new ftrace ABI on ppc64le from Torsten Duwe

  Various cleanups & minor fixes from:
   - Adam Buchbinder, Andrew Donnellan, Balbir Singh, Christophe Leroy,
     Cyril Bur, Luis Henriques, Madhavan Srinivasan, Pan Xinhui, Russell
     Currey, Sukadev Bhattiprolu, Suraj Jitindar Singh.

  General:
   - atomics: Allow architectures to define their own __atomic_op_*
     helpers from Boqun Feng
   - Implement atomic{, 64}_*_return_* variants and acquire/release/
     relaxed variants for (cmp)xchg from Boqun Feng
   - Add powernv_defconfig from Jeremy Kerr
   - Fix BUG_ON() reporting in real mode from Balbir Singh
   - Add xmon command to dump OPAL msglog from Andrew Donnellan
   - Add xmon command to dump process/task similar to ps(1) from Douglas
     Miller
   - Clean up memory hotplug failure paths from David Gibson

  pci/eeh:
   - Redesign SR-IOV on PowerNV to give absolute isolation between VFs
     from Wei Yang.
   - EEH Support for SRIOV VFs from Wei Yang and Gavin Shan.
   - PCI/IOV: Rename and export virtfn_{add, remove} from Wei Yang
   - PCI: Add pcibios_bus_add_device() weak function from Wei Yang
   - MAINTAINERS: Update EEH details and maintainership from Russell
     Currey

  cxl:
   - Support added to the CXL driver for running on both bare-metal and
     hypervisor systems, from Christophe Lombard and Frederic Barrat.
   - Ignore probes for virtual afu pci devices from Vaibhav Jain

  perf:
   - Export Power8 generic and cache events to sysfs from Sukadev
     Bhattiprolu
   - hv-24x7: Fix usage with chip events, display change in counter
     values, display domain indices in sysfs, eliminate domain suffix in
     event names, from Sukadev Bhattiprolu

  Freescale:
   - Updates from Scott: "Highlights include 8xx optimizations, 32-bit
     checksum optimizations, 86xx consolidation, e5500/e6500 cpu
     hotplug, more fman and other dt bits, and minor fixes/cleanup"

* tag 'powerpc-4.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (179 commits)
  powerpc: Fix unrecoverable SLB miss during restore_math()
  powerpc/8xx: Fix do_mtspr_cpu6() build on older compilers
  powerpc/rcpm: Fix build break when SMP=n
  powerpc/book3e-64: Use hardcoded mttmr opcode
  powerpc/fsl/dts: Add "jedec,spi-nor" flash compatible
  powerpc/T104xRDB: add tdm riser card node to device tree
  powerpc32: PAGE_EXEC required for inittext
  powerpc/mpc85xx: Add pcsphy nodes to FManV3 device tree
  powerpc/mpc85xx: Add MDIO bus muxing support to the board device tree(s)
  powerpc/86xx: Introduce and use common dtsi
  powerpc/86xx: Update device tree
  powerpc/86xx: Move dts files to fsl directory
  powerpc/86xx: Switch to kconfig fragments approach
  powerpc/86xx: Update defconfigs
  powerpc/86xx: Consolidate common platform code
  powerpc32: Remove one insn in mulhdu
  powerpc32: small optimisation in flush_icache_range()
  powerpc: Simplify test in __dma_sync()
  powerpc32: move xxxxx_dcache_range() functions inline
  powerpc32: Remove clear_pages() and define clear_page() inline
  ...
2016-03-19 15:38:41 -07:00
Kees Cook 4cc7ecb7f2 param: convert some "on"/"off" users to strtobool
This changes several users of manual "on"/"off" parsing to use
strtobool.

Some side-effects:
- these uses will now parse y/n/1/0 meaningfully too
- the early_param uses will now bubble up parse errors

Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: Amitkumar Karwar <akarwar@marvell.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Joe Perches <joe@perches.com>
Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Nishant Sarmukadam <nishants@marvell.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Steve French <sfrench@samba.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-17 15:09:34 -07:00
Joonsoo Kim fe896d1878 mm: introduce page reference manipulation functions
The success of CMA allocation largely depends on the success of
migration and key factor of it is page reference count.  Until now, page
reference is manipulated by direct calling atomic functions so we cannot
follow up who and where manipulate it.  Then, it is hard to find actual
reason of CMA allocation failure.  CMA allocation should be guaranteed
to succeed so finding offending place is really important.

In this patch, call sites where page reference is manipulated are
converted to introduced wrapper function.  This is preparation step to
add tracepoint to each page reference manipulation function.  With this
facility, we can easily find reason of CMA allocation failure.  There is
no functional change in this patch.

In addition, this patch also converts reference read sites.  It will
help a second step that renames page._count to something else and
prevents later attempt to direct access to it (Suggested by Andrew).

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-17 15:09:34 -07:00
Michael Ellerman a1b5344620 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/scottwood/linux into next
Freescale updates from Scott:

"Highlights include 8xx optimizations, 32-bit checksum optimizations,
86xx consolidation, e5500/e6500 cpu hotplug, more fman and other dt
bits, and minor fixes/cleanup."
2016-03-14 20:05:14 +11:00
Alessio Igor Bogani 4f9d6e95bc powerpc/86xx: Consolidate common platform code
Signed-off-by: Alessio Igor Bogani <alessio.bogani@elettra.eu>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11 19:14:12 -06:00
Christophe Leroy e974cd4be0 powerpc32: remove ioremap_base
ioremap_base is not initialised and is nowhere used so remove it

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11 17:18:02 -06:00
Michael Ellerman d8c0282f4d Merge branch 'topic/mprofile-kernel' into next
Merge the ftrace changes to support -mprofile-kernel on ppc64le. This is
a prerequisite for live patching, the support for which will be merged
via the livepatch tree based on this topic branch.
2016-03-11 11:20:15 +11:00
Wei Yang 0dc2830e0a powerpc/powernv: Support PCI config restore for VFs
After PE reset, OPAL API opal_pci_reinit() is called on all devices
contained in the PE to reinitialize them. While skiboot is not aware of
VFs, we have to implement the function in kernel to reinitialize VFs after
reset on PE for VFs.

In this patch, two functions pnv_pci_fixup_vf_mps() and
pnv_eeh_restore_vf_config() both manipulate the MPS of the VF, since for a
VF it has three cases.

1. Normal creation for a VF
   In this case, pnv_pci_fixup_vf_mps() is called to make the MPS a proper
   value compared with its parent.
2. EEH recovery without VF removed
   In this case, MPS is stored in pci_dn and pnv_eeh_restore_vf_config() is
   called to restore it and reinitialize other part.
3. EEH recovery with VF removed
   In this case, VF will be removed then re-created. Both functions are
   called. First pnv_pci_fixup_vf_mps() is called to store the proper MPS
   to pci_dn and then pnv_eeh_restore_vf_config() is called to do proper
   thing.

This introduces two functions: pnv_pci_fixup_vf_mps() to fixup the VF's
MPS to make sure it is equal to parent's and store this value in pci_dn
for future use. pnv_eeh_restore_vf_config() to re-initialize on VF by
restoring MPS, disabling completion timeout, enabling SERR, etc.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 09:58:22 +11:00
Wei Yang 9312bc5bab powerpc/powernv: Support EEH reset for VF PE
PEs for VFs don't have primary bus. So they have to have their own reset
backend, which is used during EEH recovery. The patch implements the reset
backend for VF's PE by issuing FLR or AF FLR to the VFs, which are contained
in the PE.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 09:58:21 +11:00
Wei Yang c29fa27d26 powerpc/eeh: Create PE for VFs
This creates PEs for VFs in the weak function pcibios_bus_add_device().
Those PEs for VFs are identified with newly introduced flag EEH_PE_VF
so that we treat them differently during EEH recovery.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 09:58:19 +11:00
Torsten Duwe 9a7841ae8d powerpc/ftrace: Use $(CC_FLAGS_FTRACE) when disabling ftrace
Rather than open-coding -pg whereever we want to disable ftrace, use the
existing $(CC_FLAGS_FTRACE) variable.

This has the advantage that it will work in future when we use a
different set of flags to enable ftrace.

Signed-off-by: Torsten Duwe <duwe@suse.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-07 14:53:55 +11:00
chenhui zhao 6becef7ea0 powerpc/mpc85xx: Add CPU hotplug support for E6500
Support Freescale E6500 core-based platforms, like t4240.
Support disabling/enabling individual CPU thread dynamically.

Signed-off-by: Chenhui Zhao <chenhui.zhao@freescale.com>
2016-03-04 23:58:38 -06:00
chenhui zhao 2f4f1f815b powerpc/mpc85xx: Add hotplug support on E5500 and E500MC cores
Freescale E500MC and E5500 core-based platforms, like P4080, T1040,
support disabling/enabling CPU dynamically.
This patch adds this feature on those platforms.

Signed-off-by: Chenhui Zhao <chenhui.zhao@freescale.com>
Signed-off-by: Tang Yuantian <Yuantian.Tang@feescale.com>
[scottwood: removed unused pr_fmt]
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-04 23:56:31 -06:00
chenhui zhao 56f1ba2807 powerpc/mpc85xx: refactor the PM operations
Freescale CoreNet-based and Non-CoreNet-based platforms require
different PM operations. This patch extracted existing PM operations
on Non-CoreNet-based platforms to a new file which can accommodate
both platforms. In this way, PM operation codes are clearer structurally.

Signed-off-by: Chenhui Zhao <chenhui.zhao@freescale.com>
Signed-off-by: Tang Yuantian <Yuantian.Tang@feescale.com>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-04 23:53:11 -06:00
chenhui zhao d17799f9c1 powerpc/rcpm: add RCPM driver
There is a RCPM (Run Control/Power Management) in Freescale QorIQ
series processors. The device performs tasks associated with device
run control and power management.

The driver implements some features: mask/unmask irq, enter/exit low
power states, freeze time base, etc.

Signed-off-by: Chenhui Zhao <chenhui.zhao@freescale.com>
Signed-off-by: Tang Yuantian <Yuantian.Tang@freescale.com>
[scottwood: remove __KERNEL__ ifdef]
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-04 23:50:27 -06:00
chenhui zhao e7affb1dba powerpc/cache: add cache flush operation for various e500
Various e500 core have different cache architecture, so they
need different cache flush operations. Therefore, add a callback
function cpu_flush_caches to the struct cpu_spec. The cache flush
operation for the specific kind of e500 is selected at init time.
The callback function will flush all caches inside the current cpu.

Signed-off-by: Chenhui Zhao <chenhui.zhao@freescale.com>
Signed-off-by: Tang Yuantian <Yuantian.Tang@feescale.com>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-04 23:44:51 -06:00
David Gibson 27828f98a0 powerpc/mm: Handle removing maybe-present bolted HPTEs
At the moment the hpte_removebolted callback in ppc_md returns void and
will BUG_ON() if the hpte it's asked to remove doesn't exist in the first
place.  This is awkward for the case of cleaning up a mapping which was
partially made before failing.

So, we add a return value to hpte_removebolted, and have it return ENOENT
in the case that the HPTE to remove didn't exist in the first place.

In the (sole) caller, we propagate errors in hpte_removebolted to its
caller to handle.  However, we handle ENOENT specially, continuing to
complete the unmapping over the specified range before returning the error
to the caller.

This means that htab_remove_mapping() will work sanely on a partially
present mapping, removing any HPTEs which are present, while also returning
ENOENT to its caller in case it's important there.

There are two callers of htab_remove_mapping():
   - In remove_section_mapping() we already WARN_ON() any error return,
     which is reasonable - in this case the mapping should be fully
     present
   - In vmemmap_remove_mapping() we BUG_ON() any error.  We change that to
     just a WARN_ON() in the case of ENOENT, since failing to remove a
     mapping that wasn't there in the first place probably shouldn't be
     fatal.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-01 22:04:18 +11:00
Adam Buchbinder 446957ba51 powerpc: Fix misspellings in comments.
Signed-off-by: Adam Buchbinder <adam.buchbinder@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-01 19:27:20 +11:00
Luis Henriques 95442c64de powerpc/ps3: gelic_udbg: use struct udphdr from <linux/udp.h>
Instead of defining a local version of struct udphdr use the standard
definition from <linux/udp.h>.

The 'src' field is named 'source' in the <linux/udp.h> definition.

Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-01 19:27:20 +11:00
Luis Henriques 0336c8cd4d powerpc/ps3: gelic_udbg: use struct iphdr from <linux/ip.h>
Instead of defining a local version of struct iphdr use the standard
definition from <linux/ip.h>.

Several fields in the <linux/ip.h> definition have different names:
 - proto -> protocol
 - src -> saddr
 - dest -> daddr
 - total_length -> tot_len
 - checksum -> check

Also, 'ver_len' is composed by 'version' and 'ihl' in <linux/ip.h>.

Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-01 19:27:20 +11:00
Luis Henriques e9aaa6d1ab powerpc/ps3: gelic_udbg: use struct vlan_hdr from <linux/if_vlan.h>
Instead of defining the local struct vlantag use the standard definition
of vlan_hdr from <linux/if_vlan.h>.

The fields in the <linux/if_vlan.h> definition have different names:
 - vlan -> h_vlan_TCI
 - subtype -> h_vlan_encapsulated_proto

While there, use also the ETH_P_IP macro instead of an hard-coded 0x0800
value.

Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-01 19:27:20 +11:00
Luis Henriques 497abcf6af powerpc/ps3: gelic_udbg: use struct ethhdr from <linux/if_ether.h>
Instead of defining a local version of struct ethhdr use the standard
definition from <linux/if_ether.h>.

The fields in the <linux/if_ether.h> definition have different names:
 - dest -> h_dest
 - src -> h_source
 - type -> h_proto

While there, use a few other standard functions/macros:
 - eth_broadcast_addr (instead of a memset)
 - ETH_ALEN
 - ETH_P_8021Q

Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-01 19:27:20 +11:00
Michael Ellerman 2527083cb8 powerpc fixes for 4.5 #3
- eeh: Fix partial hotplug criterion from Gavin Shan
  - mm: Clear the invalid slot information correctly from Aneesh Kumar K.V
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWzXquAAoJEFHr6jzI4aWADHsP/2lbwqz/vS3Ep4zlySHNvStL
 /DrRN2TN35THZ59FPRxgEfeqPxTCXtbpD6zEXwD0gf6m39I2zArhaQMOHXMtVPvV
 p0nAtwR0PX/PxlQTJDpHlg074vVAD7s3iuvad6oNQObLcXhoZ7wYtbStZ9Ithm4R
 YfqZTelzsw+GfMuTYnvAQf5aoRYztUpy7OheaJbbDmSZgMFwF896ZPJnaG9rAOPE
 xcSsRaSfHiUR2NE2ua1K5yya+1ilZqrZhib7QxXgzGuxoVa2AAiPR7Hpx2kX1Wm+
 z0DqPXISzRbVf9zyLgWD3TpJ4OMHI/CYVW+t/Gx/yWCMfNcfavUrh0vPdHRVEPZu
 zxmIUoI6yv7jQ6bcfdzR5s0Mr5pYWlUj5MZg2r8aGqloYcLPk5DiENg+c0QmKI05
 kQPCBoQz2ezzJWAt1BYshkc+mlimv3ODaNWFP34Nc6kcDaSO6a0rhVOecvKuR6dv
 UBNpeh5np1rKq1wX0ri0yAmnm//yXqe+bK0I8Ctipi0++e73sVJGzfFdVvXwEhhW
 h+v1BkdgW8WK/xlH+JCPiXd5dfXrUeFI0D65Kgpb7IbFc9hcXDmp2Dv7+8zx/Wcl
 L2NpuucSDxi+LHkE10QiypgLWSKjn9OSi8PLocqABNXG8uHxIp54jRfyViBNALXF
 XlPveqTgpt7On3aa0qVh
 =bk3U
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.5-4' into next

Pull in our current fixes from 4.5, in particular the "Fix Multi hit
ERAT" bug is causing folks some grief when testing next.
2016-02-25 21:52:58 +11:00
Michael Neuling ce5732a28d powerpc/powernv: Create separate subcores CPU feature bit
Subcores isn't really part of the 2.07 architecture but currently we
turn it on using the 2.07 feature bit.  Subcores is really a POWER8
specific feature.

This adds a new CPU_FTR bit just for subcores and moves the subcore
init code over to use this.

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-22 20:47:46 +11:00
Andrew Donnellan 88409d0c86 powerpc/powernv: don't create OPAL msglog sysfs entry if memcons init fails
When initialising OPAL interfaces, there is a possibility that
opal_msglog_init() may fail to initialise the msglog/memory console.

Fix opal_msglog_sysfs_init() so it doesn't try to create sysfs entry for
the msglog if this occurs.

Suggested-by: Joel Stanley <joel@jms.id.au>
Fixes: 9b4fffa149 ("powerpc/powernv: new function to access OPAL msglog")
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-22 20:31:55 +11:00
Linus Torvalds e6a1c1e9dd powerpc fixes for 4.5 #2
- Fix build error on 32-bit with checkpoint restart from Aneesh Kumar
  - Fix dedotify for binutils >= 2.26 from Andreas Schwab
  - Don't trace hcalls on offline CPUs from Denis Kirjanov
  - eeh: Fix stale cached primary bus from Gavin Shan
  - eeh: Fix stale PE primary bus from Gavin Shan
  - mm: Fix Multi hit ERAT cause by recent THP update from Aneesh Kumar K.V
  - ioda: Set "read" permission when "write" is set from Alexey Kardashevskiy
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWx8l5AAoJEFHr6jzI4aWAeY0P/AomeQCRieoBMKJi36WX4+gU
 Cm1iBgM593VEM/KFsYtedm5+4QaCmPE+1tVm4/u0wbLeEQ8TqNLSZLniB9USE0hb
 9655gGQyFE95BZa8WfbqHOI7+BK+TkUOWGY0CfyqPVrknzSN2MCDHjUaNo1wge6l
 zmIYIkKhaQAinFSFovOdjQ63rYdk6CxsfgbP1Gl2aX0cmzWW1n07AvZLqNmLFJ+4
 L3uBXPcrEKY/nfkRi+FutoTb86ggt9J9dqCfJHHfWKn60qwhpKwiva84k3jI/BOu
 yBTFeNzlobXt0ceDSWx1ITXzKmJQokWGC5+Lo+0mDb4veAbhLgHlXdx7NUcZIB6+
 YGYGSOkeKCnbnInIOGLz45LlevJFviI94y0YY4tt++Csq/IjnBhDeTkGx7zcqRLG
 v5hl7AhykHd3Me5iRuyRRVoVyk6+318OZW450Oxxj7EtkzpSeLfHCMKxk5w1Nxuk
 tenWQeApdkTVr91m5VJNFOsrmtFDLJv51C8duiUFWzc195ejSMYDR86K+qBeaebs
 39obXHVYRnCrn9TzODIR9SnEd7dHImekQ4a3G3F54mLJ4qqUN089TDqBGY2GNuT8
 j8QVBttp3SWuZSvtvDJLxoFt2QTKxcuiMQ4FX/OAS4qWRjSR8v2WTCyBZt68l7er
 kpUnIelJSuIDVLdNuFlf
 =7Yzi
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.5-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
 - Fix build error on 32-bit with checkpoint restart from Aneesh Kumar
 - Fix dedotify for binutils >= 2.26 from Andreas Schwab
 - Don't trace hcalls on offline CPUs from Denis Kirjanov
 - eeh: Fix stale cached primary bus from Gavin Shan
 - eeh: Fix stale PE primary bus from Gavin Shan
 - mm: Fix Multi hit ERAT cause by recent THP update from Aneesh Kumar K.V
 - ioda: Set "read" permission when "write" is set from Alexey Kardashevskiy

* tag 'powerpc-4.5-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/ioda: Set "read" permission when "write" is set
  powerpc/mm: Fix Multi hit ERAT cause by recent THP update
  powerpc/powernv: Fix stale PE primary bus
  powerpc/eeh: Fix stale cached primary bus
  powerpc/pseries: Don't trace hcalls on offline CPUs
  powerpc: Fix dedotify for binutils >= 2.26
  powerpc/book3s_32: Fix build error with checkpoint restart
2016-02-20 09:22:11 -08:00
Alexey Kardashevskiy 6ecad912a0 powerpc/ioda: Set "read" permission when "write" is set
Quite often drivers set only "write" permission assuming that this
includes "read" permission as well and this works on plenty of
platforms. However IODA2 is strict about this and produces an EEH when
"read" permission is not set and reading happens.

This adds a workaround in the IODA code to always add the "read" bit
when the "write" bit is set.

Fixes: 10b35b2b74 ("powerpc/powernv: Do not set "read" flag if direction==DMA_NONE")
Cc: stable@vger.kernel.org # 4.2+
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Tested-by: Douglas Miller <dougmill@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-17 23:52:17 +11:00
Gavin Shan 1bc74f1ccd powerpc/powernv: Fix stale PE primary bus
When PCI bus is unplugged during full hotplug for EEH recovery,
the platform PE instance (struct pnv_ioda_pe) isn't released and
it dereferences the stale PCI bus that has been released. It leads
to kernel crash when referring to the stale PCI bus.

This fixes the issue by correcting the PE's primary bus when it's
oneline at plugging time, in pnv_pci_dma_bus_setup() which is to
be called by pcibios_fixup_bus().

Cc: stable@vger.kernel.org # v4.1+
Reported-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reported-by: Pradipta Ghosh <pradghos@in.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Tested-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-15 21:10:04 +11:00
Gavin Shan 05ba75f848 powerpc/eeh: Fix stale cached primary bus
When PE is created, its primary bus is cached to pe->bus. At later
point, the cached primary bus is returned from eeh_pe_bus_get().
However, we could get stale cached primary bus and run into kernel
crash in one case: full hotplug as part of fenced PHB error recovery
releases all PCI busses under the PHB at unplugging time and recreate
them at plugging time. pe->bus is still dereferencing the PCI bus
that was released.

This adds another PE flag (EEH_PE_PRI_BUS) to represent the validity
of pe->bus. pe->bus is updated when its first child EEH device is
online and the flag is set. Before unplugging in full hotplug for
error recovery, the flag is cleared.

Fixes: 8cdb2833 ("powerpc/eeh: Trace PCI bus from PE")
Cc: stable@vger.kernel.org #v3.11+
Reported-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reported-by: Pradipta Ghosh <pradghos@in.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Tested-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-15 21:10:04 +11:00
Wei Yang be283eeb7f powerpc/powernv: allocate sparse PE# when using M64 BAR in Single PE mode
When M64 BAR is set to Single PE mode, the PE# assigned to VF could be
sparse.

This patch restructures the code to allocate sparse PE# for VFs when M64
BAR is set to Single PE mode. Also it rename the offset to pe_num_map to
reflect the content is the PE number.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-10 12:04:58 +11:00
Wei Yang dfcc8d45c3 powerpc/powernv: boundary the total VF BAR size instead of the individual one
Each VF could have 6 BARs at most. When the total BAR size exceeds the
gate, after expanding it will also exhaust the M64 Window.

This patch limits the boundary by checking the total VF BAR size instead of
the individual BAR.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-10 12:04:57 +11:00
Wei Yang f2dd0afeea powerpc/powernv: replace the hard coded boundary with gate
At the moment 64bit-prefetchable window can be maximum 64GB, which is
currently got from device tree. This means that in shared mode the maximum
supported VF BAR size is 64GB/256=256MB. While this size could exhaust the
whole 64bit-prefetchable window. This is a design decision to set a
boundary to 64MB of the VF BAR size. Since VF BAR size with 64MB would
occupy a quarter of the 64bit-prefetchable window, this is affordable.

This patch replaces magic limit of 64MB with "gate", which is 1/4 of the
M64 Segment Size(m64_segsize >> 2) and adds comment to explain the reason
for it.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Reviewed-by: Gavin Shan <gwshan@linux.vent.ibm.com>
Acked-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-10 12:04:57 +11:00
Wei Yang ee8222fe95 powerpc/powernv: use one M64 BAR in Single PE mode for one VF BAR
In current implementation, when VF BAR is bigger than 64MB, it uses 4 M64
BARs in Single PE mode to cover the number of VFs required to be enabled.
By doing so, several VFs would be in one VF Group and leads to interference
between VFs in the same group.

And in this patch, m64_wins is renamed to m64_map, which means index number
of the M64 BAR used to map the VF BAR. Based on Gavin's comments. Also
makes sure the VF BAR size is bigger than 32MB when M64 BAR is used in
Single PE mode.

This patch changes the design by using one M64 BAR in Single PE mode for
one VF BAR. This gives absolute isolation for VFs.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-10 12:04:56 +11:00
Wei Yang 7fbe7a9374 powerpc/powernv: simplify the calculation of iov resource alignment
The alignment of IOV BAR on PowerNV platform is the total size of the IOV
BAR. No matter whether the IOV BAR is extended with number of
roundup_pow_of_two(total_vfs) or number of max PE number (256), the total
size could be calculated by (vfs_expanded * VF_BAR_size).

This patch simplifies the pnv_pci_iov_resource_alignment() by removing the
first case.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-10 12:04:55 +11:00
Wei Yang b033185419 powerpc/powernv: don't enable SRIOV when VF BAR has non 64bit-prefetchable BAR
On PHB3, we enable SRIOV devices by mapping IOV BAR with M64 BARs. If a
SRIOV device's IOV BAR is not 64bit-prefetchable, this is not assigned from
64bit prefetchable window, which means M64 BAR can't work on it.

The reason is PCI bridges support only 2 memory windows and the kernel code
programs bridges in the way that one window is 32bit-nonprefetchable and
the other one is 64bit-prefetchable. So if devices' IOV BAR is 64bit and
non-prefetchable, it will be mapped into 32bit space and therefore M64
cannot be used for it.

This patch makes this explicit and truncate IOV resource in this case to
save MMIO space.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-10 12:04:54 +11:00
Gavin Shan ccc9662da5 powerpc/powernv: Simplify definitions of EEH debugfs handlers
The EEH debugfs handlers have same prototype. This introduces
a macro to define them, then to simplify the code. No logical
changes.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-10 12:04:46 +11:00
Andrew Donnellan 9b4fffa149 powerpc/powernv: new function to access OPAL msglog
Currently, the OPAL msglog/console buffer is exposed as a sysfs file, with
the sysfs read handler responsible for retrieving the log from the OPAL
buffer. We'd like to be able to use it in xmon as well.

Refactor the OPAL msglog code to create a new function, opal_msglog_copy(),
that copies to an arbitrary buffer. Separate the initialisation code into
generic memcons init and sysfs file creation.

Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-09 20:27:04 +11:00
Russell Currey 2de50e9674 powerpc/powernv: Remove support for p5ioc2
"p5ioc2 is used by approximately 2 machines in the world, and has never
ever been a supported configuration."

The code for p5ioc2 is essentially unused and complicates what is already
a very complicated codebase.  Its removal is essentially a "free win" in
the effort to simplify the powernv PCI code.

In addition, support for p5ioc2 has been dropped from skiboot.  There's no
reason to keep it around in the kernel.

Signed-off-by: Russell Currey <ruscur@russell.cc>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-08 22:03:37 +11:00
Al Viro 5955102c99 wrappers for ->i_mutex access
parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested},
inode_foo(inode) being mutex_foo(&inode->i_mutex).

Please, use those for access to ->i_mutex; over the coming cycle
->i_mutex will become rwsem, with ->lookup() done with it held
only shared.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-01-22 18:04:28 -05:00
Linus Torvalds f689b742f2 powerpc updates for 4.5
- Ground work for the new Power9 MMU from Aneesh Kumar K.V
  - Optimise FP/VMX/VSX context switching from Anton Blanchard
 
  - Various cleanups from Krzysztof Kozlowski, John Ogness, Rashmica Gupta,
    Russell Currey, Gavin Shan, Daniel Axtens, Michael Neuling, Andrew Donnellan
  - Allow wrapper to work on non-english system from Laurent Vivier
  - Add rN aliases to the pt_regs_offset table from Rashmica Gupta
  - Fix module autoload for rackmeter & axonram drivers from Luis de Bethencourt
  - Include KVM guest test in all interrupt vectors from Paul Mackerras
  - Fix DSCR inheritance over fork() from Anton Blanchard
  - Make value-returning atomics & {cmp}xchg* & their atomic_ versions fully ordered from Boqun Feng
  - Print MSR TM bits in oops messages from Michael Neuling
  - Add TM signal return & invalid stack selftests from Michael Neuling
  - Limit EPOW reset event warnings from Vipin K Parashar
  - Remove the Cell QPACE code from Rashmica Gupta
  - Append linux_banner to exception information in xmon from Rashmica Gupta
  - Add selftest to check if VSRs are corrupted from Rashmica Gupta
  - Remove broken GregorianDay() from Daniel Axtens
  - Import Anton's context_switch2 benchmark into selftests from Michael Ellerman
  - Add selftest script to test HMI functionality from Daniel Axtens
  - Remove obsolete OPAL v2 support from Stewart Smith
  - Make enter_rtas() private from Michael Ellerman
  - PPR exception cleanups from Michael Ellerman
  - Add page soft dirty tracking from Laurent Dufour
  - Add support for Nvlink NPUs from Alistair Popple
  - Add support for kexec on 476fpe from Alistair Popple
  - Enable kernel CPU dlpar from sysfs from Nathan Fontenot
  - Copy only required pieces of the mm_context_t to the paca from Michael Neuling
  - Add a kmsg_dumper that flushes OPAL console output on panic from Russell Currey
  - Implement save_stack_trace_regs() to enable kprobe stack tracing from Steven Rostedt
  - Add HWCAP bits for Power9 from Michael Ellerman
  - Fix _PAGE_PTE breaking swapoff from Aneesh Kumar K.V
  - Fix _PAGE_SWP_SOFT_DIRTY breaking swapoff from Hugh Dickins
  - scripts/recordmcount.pl: support data in text section on powerpc from Ulrich Weigand
  - Handle R_PPC64_ENTRY relocations in modules from Ulrich Weigand
 
  - cxl: Fix possible idr warning when contexts are released from Vaibhav Jain
  - cxl: use correct operator when writing pcie config space values from Andrew Donnellan
  - cxl: Fix DSI misses when the context owning task exits from Vaibhav Jain
  - cxl: fix build for GCC 4.6.x from Brian Norris
  - cxl: use -Werror only with CONFIG_PPC_WERROR from Brian Norris
  - cxl: Enable PCI device ID for future IBM CXL adapter from Uma Krishnan
 
  - Freescale updates from Scott: Highlights include moving QE code out of
    arch/powerpc (to be shared with arm), device tree updates, and minor fixes.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWmIxeAAoJEFHr6jzI4aWAA+cQAIXAw4WfVWJ2V4ZK+1eKfB57
 fdXG71PuXG+WYIWy71ly8keLHdzzD1NQ2OUB64bUVRq202nRgVc15ZYKRJ/FE/sP
 SkxaQ2AG/2kI2EflWshOi0Lu9qaZ+LMHJnszIqE/9lnGSB2kUI/cwsSXgziiMKXR
 XNci9v14SdDd40YV/6BSZXoxApwyq9cUbZ7rnzFLmz4hrFuKmB/L3LABDF8QcpH7
 sGt/YaHGOtqP0UX7h5KQTFLGe1OPvK6NWixSXeZKQ71ED6cho1iKUEOtBA9EZeIN
 QM5JdHFWgX8MMRA0OHAgidkSiqO38BXjmjkVYWoIbYz7Zax3ThmrDHB4IpFwWnk3
 l7WBykEXY7KEqpZzbh0GFGehZWzVZvLnNgDdvpmpk/GkPzeYKomBj7ZZfm3H1yGD
 BTHPwuWCTX+/K75yEVNO8aJO12wBg7DRl4IEwBgqhwU8ga4FvUOCJkm+SCxA1Dnn
 qlpS7qPwTXNIEfKMJcxp5X0KiwDY1EoOotd4glTN0jbeY5GEYcxe+7RQ302GrYxP
 zcc8EGLn8h6BtQvV3ypNHF5l6QeTW/0ZlO9c236tIuUQ5gQU39SQci7jQKsYjSzv
 BB1XdLHkbtIvYDkmbnr1elbeJCDbrWL9rAXRUTRyfuCzaFWTfZmfVNe8c8qwDMLk
 TUxMR/38aI7bLcIQjwj9
 =R5bX
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:
 "Core:
   - Ground work for the new Power9 MMU from Aneesh Kumar K.V
   - Optimise FP/VMX/VSX context switching from Anton Blanchard

  Misc:
   - Various cleanups from Krzysztof Kozlowski, John Ogness, Rashmica
     Gupta, Russell Currey, Gavin Shan, Daniel Axtens, Michael Neuling,
     Andrew Donnellan
   - Allow wrapper to work on non-english system from Laurent Vivier
   - Add rN aliases to the pt_regs_offset table from Rashmica Gupta
   - Fix module autoload for rackmeter & axonram drivers from Luis de
     Bethencourt
   - Include KVM guest test in all interrupt vectors from Paul Mackerras
   - Fix DSCR inheritance over fork() from Anton Blanchard
   - Make value-returning atomics & {cmp}xchg* & their atomic_ versions
     fully ordered from Boqun Feng
   - Print MSR TM bits in oops messages from Michael Neuling
   - Add TM signal return & invalid stack selftests from Michael Neuling
   - Limit EPOW reset event warnings from Vipin K Parashar
   - Remove the Cell QPACE code from Rashmica Gupta
   - Append linux_banner to exception information in xmon from Rashmica
     Gupta
   - Add selftest to check if VSRs are corrupted from Rashmica Gupta
   - Remove broken GregorianDay() from Daniel Axtens
   - Import Anton's context_switch2 benchmark into selftests from
     Michael Ellerman
   - Add selftest script to test HMI functionality from Daniel Axtens
   - Remove obsolete OPAL v2 support from Stewart Smith
   - Make enter_rtas() private from Michael Ellerman
   - PPR exception cleanups from Michael Ellerman
   - Add page soft dirty tracking from Laurent Dufour
   - Add support for Nvlink NPUs from Alistair Popple
   - Add support for kexec on 476fpe from Alistair Popple
   - Enable kernel CPU dlpar from sysfs from Nathan Fontenot
   - Copy only required pieces of the mm_context_t to the paca from
     Michael Neuling
   - Add a kmsg_dumper that flushes OPAL console output on panic from
     Russell Currey
   - Implement save_stack_trace_regs() to enable kprobe stack tracing
     from Steven Rostedt
   - Add HWCAP bits for Power9 from Michael Ellerman
   - Fix _PAGE_PTE breaking swapoff from Aneesh Kumar K.V
   - Fix _PAGE_SWP_SOFT_DIRTY breaking swapoff from Hugh Dickins
   - scripts/recordmcount.pl: support data in text section on powerpc
     from Ulrich Weigand
   - Handle R_PPC64_ENTRY relocations in modules from Ulrich Weigand

  cxl:
   - cxl: Fix possible idr warning when contexts are released from
     Vaibhav Jain
   - cxl: use correct operator when writing pcie config space values
     from Andrew Donnellan
   - cxl: Fix DSI misses when the context owning task exits from Vaibhav
     Jain
   - cxl: fix build for GCC 4.6.x from Brian Norris
   - cxl: use -Werror only with CONFIG_PPC_WERROR from Brian Norris
   - cxl: Enable PCI device ID for future IBM CXL adapter from Uma
     Krishnan

  Freescale:
   - Freescale updates from Scott: Highlights include moving QE code out
     of arch/powerpc (to be shared with arm), device tree updates, and
     minor fixes"

* tag 'powerpc-4.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (149 commits)
  powerpc/module: Handle R_PPC64_ENTRY relocations
  scripts/recordmcount.pl: support data in text section on powerpc
  powerpc/powernv: Fix OPAL_CONSOLE_FLUSH prototype and usages
  powerpc/mm: fix _PAGE_SWP_SOFT_DIRTY breaking swapoff
  powerpc/mm: Fix _PAGE_PTE breaking swapoff
  cxl: Enable PCI device ID for future IBM CXL adapter
  cxl: use -Werror only with CONFIG_PPC_WERROR
  cxl: fix build for GCC 4.6.x
  powerpc: Add HWCAP bits for Power9
  powerpc/powernv: Reserve PE#0 on NPU
  powerpc/powernv: Change NPU PE# assignment
  powerpc/powernv: Fix update of NVLink DMA mask
  powerpc/powernv: Remove misleading comment in pci.c
  powerpc: Implement save_stack_trace_regs() to enable kprobe stack tracing
  powerpc: Fix build break due to paca mm_context_t changes
  cxl: Fix DSI misses when the context owning task exits
  MAINTAINERS: Update Scott Wood's e-mail address
  powerpc/powernv: Fix minor off-by-one error in opal_mce_check_early_recovery()
  powerpc: Fix style of self-test config prompts
  powerpc/powernv: Only delay opal_rtc_read() retry when necessary
  ...
2016-01-15 13:18:47 -08:00
Vladimir Davydov 5d097056c9 kmemcg: account certain kmem allocations to memcg
Mark those kmem allocations that are known to be easily triggered from
userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them accounted to
memcg.  For the list, see below:

 - threadinfo
 - task_struct
 - task_delay_info
 - pid
 - cred
 - mm_struct
 - vm_area_struct and vm_region (nommu)
 - anon_vma and anon_vma_chain
 - signal_struct
 - sighand_struct
 - fs_struct
 - files_struct
 - fdtable and fdtable->full_fds_bits
 - dentry and external_name
 - inode for all filesystems. This is the most tedious part, because
   most filesystems overwrite the alloc_inode method.

The list is far from complete, so feel free to add more objects.
Nevertheless, it should be close to "account everything" approach and
keep most workloads within bounds.  Malevolent users will be able to
breach the limit, but this was possible even with the former "account
everything" approach (simply because it did not account everything in
fact).

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Greg Thelen <gthelen@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-14 16:00:49 -08:00
Michael Ellerman be6bfc29bc Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/scottwood/linux into next
Freescale updates from Scott:

"Highlights include moving QE code out of arch/powerpc (to be shared with
arm), device tree updates, and minor fixes."
2016-01-14 09:55:01 +11:00
Russell Currey c88c5d4373 powerpc/powernv: Fix OPAL_CONSOLE_FLUSH prototype and usages
The recently added OPAL API call, OPAL_CONSOLE_FLUSH, originally took no
parameters and returned nothing.  The call was updated to accept the
terminal number to flush, and returned various values depending on the
state of the output buffer.

The prototype has been updated and its usage in the OPAL kmsg dumper has
been modified to support its new behaviour as an incremental flush.

Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-01-13 12:35:17 +11:00
Alistair Popple 08f48f3234 powerpc/powernv: Reserve PE#0 on NPU
P8+ hardware reports all errors on PE#0. This patch ensures PE#0 is
not assigned to NPU devices so that it can be used for EEH.

Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-01-11 20:30:52 +11:00
Alistair Popple b521549a09 powerpc/powernv: Change NPU PE# assignment
The P8+ hardware supports four partitionable endpoints (PEs) however
the hardware reports all errors as occurring on PE#0. This means we
need to reserve this PE for error handling (EEH) and not assign it to
a NPU device, implying that some devices will need to share PEs.

This patch changes the PE assignment for NPU devices such that NPU
devices which connect to the same GPU are assigned to the same
PE#.

Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-01-11 20:30:51 +11:00
Alistair Popple 419dbd5e1f powerpc/powernv: Fix update of NVLink DMA mask
The emulated NVLink PCI devices share the same IODA2 TCE tables but only
support a single TVT (instead of the normal two for PCI devices). This
requires the kernel to manually replace windows with either the bypass
or non-bypass window depending on what the driver has requested.

Unfortunately an incorrect optimisation was made in
pnv_pci_ioda_dma_set_mask() which caused updating of some NPU device PEs
to be skipped in certain configurations due to an incorrect assumption
that a NULL peer PE in the array indicated there were no more peers
present. This patch fixes the problem by ensuring all peer PEs are
updated.

Fixes: 5d2aa710e6 ("powerpc/powernv: Add support for Nvlink NPUs")
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-01-11 20:30:48 +11:00
Russell Currey b0eab5b29a powerpc/powernv: Remove misleading comment in pci.c
PCI in powernv now supports quite a bit more than p5ioc2, so remove the
outdated comment.

Signed-off-by: Russell Currey <ruscur@russell.cc>
Acked-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-01-11 16:56:01 +11:00
Andrew Lunn e7f4dc3536 mdio: Move allocation of interrupts into core
Have mdio_alloc() create the array of interrupt numbers, and
initialize it to POLLING. This is what most MDIO drivers want, so
allowing code to be removed from the drivers.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-07 14:31:26 -05:00
Andrew Donnellan dc3799bb9a powerpc/powernv: Fix minor off-by-one error in opal_mce_check_early_recovery()
Fix off-by-one error in opal_mce_check_early_recovery() when checking
whether the NIP falls within OPAL space.

Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-27 19:12:41 +11:00
Michael Neuling 57a9039052 powerpc/powernv: Only delay opal_rtc_read() retry when necessary
Only delay opal_rtc_read() when busy and are going to retry.

This has the advantage of possibly saving a massive 10ms off booting!

Kudos to Stewart for noticing.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Reviewed-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-27 19:12:40 +11:00
Russell Currey affddff69c powerpc/powernv: Add a kmsg_dumper that flushes console output on panic
On BMC machines, console output is controlled by the OPAL firmware and is
only flushed when its pollers are called.  When the kernel is in a panic
state, it no longer calls these pollers and thus console output does not
completely flush, causing some output from the panic to be lost.

Output is only actually lost when the kernel is configured to not power off
or reboot after panic (i.e. CONFIG_PANIC_TIMEOUT is set to 0) since OPAL
flushes the console buffer as part of its power down routines.  Before this
patch, however, only partial output would be printed during the timeout wait.

This patch adds a new kmsg_dumper which gets called at panic time to ensure
panic output is not lost.  It accomplishes this by calling OPAL_CONSOLE_FLUSH
in the OPAL API, and if that is not available, the pollers are called enough
times to (hopefully) completely flush the buffer.

The flushing mechanism will only affect output printed at and before the
kmsg_dump call in kernel/panic.c:panic().  As such, the "end Kernel panic"
message may still be truncated as follows:

>Call Trace:
>[c000000f1f603b00] [c0000000008e9458] dump_stack+0x90/0xbc (unreliable)
>[c000000f1f603b30] [c0000000008e7e78] panic+0xf8/0x2c4
>[c000000f1f603bc0] [c000000000be4860] mount_block_root+0x288/0x33c
>[c000000f1f603c80] [c000000000be4d14] prepare_namespace+0x1f4/0x254
>[c000000f1f603d00] [c000000000be43e8] kernel_init_freeable+0x318/0x350
>[c000000f1f603dc0] [c00000000000bd74] kernel_init+0x24/0x130
>[c000000f1f603e30] [c0000000000095b0] ret_from_kernel_thread+0x5c/0xac
>---[ end Kernel panic - not

This functionality is implemented as a kmsg_dumper as it seems to be the
most sensible way to introduce platform-specific functionality to the
panic function.

Signed-off-by: Russell Currey <ruscur@russell.cc>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-27 19:12:40 +11:00
Daniel Walker 433c858a61 powerpc/85xx: mpc85xx ADS: remove pci exclude
This code was reworked in commit,

905e75c46d

This change removed the fsl_add_bridge() which originally was above
the addition of the pci_exclude_device function. I think the assumption was that
the pci_exclude_device would prevent changes to the bridge PCI config after
it's been added. It seems it wasn't fully tested on MPC85xx ADS because
if you move the fsl_add_bridge() the pci_exclude_device is set in the machine
description then you can never update the PCI Config since the exclude
prevents it. This disrupts things like DMA.

This issue was extensively debugged by David Beazley.

Cc: xe-kernel@external.cisco.com
Cc: dbeazley@cisco.com
Cc: dwalker@fifo99.com
Signed-off-by: Daniel Walker <danielwa@cisco.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2015-12-22 18:23:21 -06:00
Harninder Rai 720d7aebcd powerpc/85xx: Add PCIe controller support for bsc9132qds
1. Use machine_arch_initcall to hook mpc85xx_common_publish_devices
This can ensure before pcibios_init() is called, pci controllers have
been probed and added to the hose_list.
2. Add a workaround for errata A-005434
For the BSC9132, PEX_PEXIWARn[TRGT] for all windows defaults to 0xF,
which is mapped to CCSRBAR. However, for other products, 0xF is
mapped to the local memory. Therefore, for the BSC9132, any default
PCI Express access to the local memory (DDR) will now access the
CCSRBAR. This patch changes the mapping of targets of inbound windows
PEX_PEXIWARn[TRGT] to the Local address space – 0x0 (from 0xF).

Signed-off-by: Harninder Rai <harninder.rai@freescale.com>
Signed-off-by: Minghuan Lian <Minghuan.Lian@freescale.com>
Signed-off-by: Hou Zhiqiang <B48286@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2015-12-22 18:17:15 -06:00
Zhao Qiang 7aa1aa6ece QE: Move QE from arch/powerpc to drivers/soc
ls1 has qe and ls1 has arm cpu.
move qe from arch/powerpc to drivers/soc/fsl
to adapt to powerpc and arm

Signed-off-by: Zhao Qiang <qiang.zhao@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2015-12-22 17:12:56 -06:00
Zhao Qiang 302c059f2e QE: use subsys_initcall to init qe
Use subsys_initcall to init qe to adapt ARM architecture.
Remove qe_reset from PowerPC platform file.

Signed-off-by: Zhao Qiang <qiang.zhao@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2015-12-22 17:10:18 -06:00
Zhao Qiang 0e6e01ff69 CPM/QE: use genalloc to manage CPM/QE muram
Use genalloc to manage CPM/QE muram instead of rheap.

Signed-off-by: Zhao Qiang <qiang.zhao@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2015-12-22 17:10:18 -06:00
Alistair Popple 036592fbbe powerpc/opal-irqchip: Fix deadlock introduced by "Fix double endian conversion"
Commit 25642e1459 ("powerpc/opal-irqchip: Fix double endian
conversion") fixed an endian bug by calling opal_handle_events() in
opal_event_unmask().

However this introduced a deadlock if we find an event is active
during unmasking and call opal_handle_events() again. The bad call
sequence is:

  opal_interrupt()
  -> opal_handle_events()
     -> generic_handle_irq()
        -> handle_level_irq()
           -> raw_spin_lock(&desc->lock)
              handle_irq_event(desc)
              unmask_irq(desc)
              -> opal_event_unmask()
                 -> opal_handle_events()
                    -> generic_handle_irq()
                       -> handle_level_irq()
                          -> raw_spin_lock(&desc->lock)	(BOOM)

When generating multiple opal events in quick succession this would lead
to the following stall warnings:

EEH: Fenced PHB#0 detected, location: U78C9.001.WZS09XA-P1-C32
INFO: rcu_sched detected stalls on CPUs/tasks:

         12-...: (1 GPs behind) idle=68f/140000000000001/0 softirq=860/861 fqs=2065
         15-...: (1 GPs behind) idle=be5/140000000000001/0 softirq=1142/1143 fqs=2065
         (detected by 13, t=2102 jiffies, g=1325, c=1324, q=602)
NMI watchdog: BUG: soft lockup - CPU#18 stuck for 22s! [irqbalance:2696]
INFO: rcu_sched detected stalls on CPUs/tasks:
         12-...: (1 GPs behind) idle=68f/140000000000001/0 softirq=860/861 fqs=8371
         15-...: (1 GPs behind) idle=be5/140000000000001/0 softirq=1142/1143 fqs=8371
         (detected by 20, t=8407 jiffies, g=1325, c=1324, q=1290)

This patch corrects the problem by queuing the work if an event is
active during unmasking, which is similar to the pre-endian fix
behaviour.

Fixes: 25642e1459 ("powerpc/opal-irqchip: Fix double endian conversion")
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Reported-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-18 22:24:15 +11:00
Daniel Axtens 1b855e167b powerpc: Add missing calls to va_end()
cppcheck picked up that there were a couple of missing va_end()
calls in functions using va_start().

Signed-off-by: Daniel Axtens <dja@axtens.net>
Reviewed-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 23:23:22 +11:00
Nathan Fontenot e9d764f803 powerpc/pseries: Enable kernel CPU dlpar from sysfs
Enable new kernel cpu hotplug functionality by allowing cpu dlpar requests
to be initiated from sysfs.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:41:03 +11:00
Nathan Fontenot 90edf184b9 powerpc/pseries: Add CPU dlpar add functionality
Add the ability to hotplug add cpus via rtas hotplug events by either
specifying the drc index of the CPU to add, or providing a count of the
number of CPUs to add.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:41:02 +11:00
Nathan Fontenot ac71380071 powerpc/pseries: Add CPU dlpar remove functionality
Add the ability to dlpar remove CPUs via hotplug rtas events, either by
specifying the drc-index of the CPU to remove or providing a count of cpus
to remove.

To remove multiple cpus in a single request we create a list of possible
DR (Dynamic Reconfiguration) cpus and their drc indexes that can be
removed.  We can then traverse the list remove each cpu and easily clean
up in any cases of failure.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:41:02 +11:00
Nathan Fontenot e666ae0b10 powerpc/pseries: Update CPU hotplug error recovery
Update the cpu dlpar add/remove paths to do better error recovery when
a failure occurs during the add/remove operation.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:41:02 +11:00
Nathan Fontenot d98389f375 powerpc/pseries: Factor out common cpu hotplug code
Re-factor the cpu hotplug code to support doing cpu hotplug completely in
the kernel and using the existing sysfs probe/release interfaces. This
patch pulls out pieces of existing cpu hotplug code into common routines,
dlpar_cpu_add() and dlpar_cpu_remove(), to be used by both interfaces.
There are no functional changes introduced.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:41:01 +11:00
Nathan Fontenot 183deeea58 powerpc/pseries: Consolidate CPU hotplug code to hotplug-cpu.c
No functional changes, this patch is simply a move of the cpu hotplug
code from pseries/dlpar.c to pseries/hotplug-cpu.c. This is in an effort
to consolidate all of the cpu hotplug code in a common place.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:41:01 +11:00
Nathan Fontenot 1f859adb92 powerpc/pseries: Verify CPU doesn't exist before adding
When DLPAR adding a CPU we should verify that the CPU does not already
exist. Failure to do so can generate a kernel oops;

[    9.465585] kernel BUG at arch/powerpc/platforms/pseries/dlpar.c:382!
[    9.465796] Oops: Exception in kernel mode, sig: 5 [#1]

This oops can be generated by causing a probe to be performed on a cpu
by writing to the sysfs cpu probe file (/sys/devices/system/cpu/probe).
This patch adds a check for the existence of cpu prior to probing the cpu
so userspace doing the wrong thing won't trigger a BUG_ON().

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:41:01 +11:00
Alistair Popple 5d2aa710e6 powerpc/powernv: Add support for Nvlink NPUs
NVLink is a high speed interconnect that is used in conjunction with a
PCI-E connection to create an interface between CPU and GPU that
provides very high data bandwidth. A PCI-E connection to a GPU is used
as the control path to initiate and report status of large data
transfers sent via the NVLink.

On IBM Power systems the NVLink processing unit (NPU) is similar to
the existing PHB3. This patch adds support for a new NPU PHB type. DMA
operations on the NPU are not supported as this patch sets the TCE
translation tables to be the same as the related GPU PCIe device for
each NVLink. Therefore all DMA operations are setup and controlled via
the PCIe device.

EEH is not presently supported for the NPU devices, although it may be
added in future.

Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:41:00 +11:00
Alistair Popple a84bf32140 powerpc: Add __raw_rm_writeq() function
Move __raw_rm_writeq() from platforms/powernv/pci-ioda.c to
include/asm/io.h so that it can be used by other code.

Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:40:59 +11:00
Alistair Popple 94973b24d6 Revert "powerpc/pci: Remove unused struct pci_dn.pcidev field"
This commit removed the pcidev field from struct pci_dn as it was no
longer in use by the kernel. However to support finding the
association of Nvlink devices to GPU devices from the device-tree this
field is required.

This reverts commit 250c7b277c ("powerpc/pci: Remove unused struct
pci_dn.pcidev field").

Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:40:59 +11:00
Gavin Shan e80c4e7ca5 powerpc/powernv: Fix M64 resource name in /proc/iomem
The name of PCI root bus's M64 resource isn't initialized properly.
When dumping "/proc/iomem", "<BAD>" is seen for those M64 resources
on PCI root buses.

   ~# cat /proc/iomem | grep -e "BAD"
   3b0000000000-3b0fefffffff : <BAD>
   3b1000000000-3b1fefffffff : <BAD>
   3c0000000000-3c0fefffffff : <BAD>
   3c1000000000-3c1fefffffff : <BAD>
   3c2000000000-3c2fefffffff : <BAD>

This fixes the issue by setting the name of PCI root bus's M64
resource to that of PHB's device node full name. With the patch,
no "<BAD>" is seen from "/proc/iomem".

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:40:59 +11:00
Michael Ellerman b2e8590fa1 powerpc/pseries: Use rtas_call_unlocked() in pseries hotplug
Avoid open coding the logic by using rtas_call_unlocked().

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:40:55 +11:00
Stewart Smith e4d54f71d2 powerpc/powernv: remove FW_FEATURE_OPALv3 and just use FW_FEATURE_OPAL
Long ago, only in the lab, there was OPALv1 and OPALv2. Now there is
just OPALv3, with nobody ever expecting anything on pre-OPALv3 to
be cared about or supported by mainline kernels.

So, let's remove FW_FEATURE_OPALv3 and instead use FW_FEATURE_OPAL
exclusively.

Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:40:54 +11:00
Stewart Smith 7261aafc09 powerpc/powernv: Remove OPALv2 firmware define and references
OPALv2 only ever existed in the lab and didn't escape to the world.
All OPAL systems in the wild are OPALv3.

The probability of there being an OPALv2 system still powered on
anywhere inside IBM is approximately zero, let alone anyone
expecting to run mainline kernels.

So, start to remove references to OPALv2.

Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:40:54 +11:00
Stewart Smith 786842b62f powerpc/powernv: panic() on OPAL < V3
The OpenPower Abstraction Layer firmware went through a couple
of iterations in the lab before being released. What we now know
as OPAL advertises itself as OPALv3.

OPALv2 and OPALv1 never made it outside the lab, and the possibility
of anyone at all ever building a mainline kernel today and expecting
it to boot on such hardware is zero.

Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:40:53 +11:00
Stewart Smith 98da62b716 powerpc/powernv: pr_warn_once on unsupported OPAL_MSG type
When running on newer OPAL firmware that supports sending extra
OPAL_MSG types, we would print a warning on *every* message received.

This could be a problem for kernels that don't support OPAL_MSG_OCC
on machines that are running real close to thermal limits and the
OCC is throttling the chip. For a kernel that is paying attention to
the message queue, we could get these notifications quite often.

Conceivably, future message types could also come fairly often,
and printing that we didn't understand them 10,000 times provides
no further information than printing them once.

Cc: stable@vger.kernel.org
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 20:42:13 +11:00
Daniel Axtens 00b912b0c8 powerpc: Remove broken GregorianDay()
GregorianDay() is supposed to calculate the day of the week
(tm->tm_wday) for a given day/month/year. In that calcuation it
indexed into an array called MonthOffset using tm->tm_mon-1. However
tm_mon is zero-based, not one-based, so this is off-by-one. It also
means that every January, GregoiranDay() will access element -1 of
the MonthOffset array.

It also doesn't appear to be a correct algorithm either: see in
contrast kernel/time/timeconv.c's time_to_tm function.

It's been broken forever, which suggests no-one in userland uses
this. It looks like no-one in the kernel uses tm->tm_wday either
(see e.g. drivers/rtc/rtc-ds1305.c:319).

tm->tm_wday is conventionally set to -1 when not available in
hardware so we can simply set it to -1 and drop the function.
(There are over a dozen other drivers in drivers/rtc that do
this.)

Found using UBSAN.

Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andrew Morton <akpm@linux-foundation.org> # as an example of what UBSan finds.
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Cc: rtc-linux@googlegroups.com
Signed-off-by: Daniel Axtens <dja@axtens.net>
Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-16 12:54:04 +11:00
Rashmica Gupta 24ad1648ed powerpc/cell: Remove the Cell QPACE code
All users of QPACE have upgraded to QPACE2 so remove the Cell QPACE code.

Signed-off-by: Rashmica Gupta <rashmicy@gmail.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-14 20:41:50 +11:00
Vipin K Parashar b4af279a7c powerpc/pseries: Limit EPOW reset event warnings
Kernel prints respective warnings about various EPOW events for
user information/action after parsing EPOW interrupts. At times
below EPOW reset event warning is seen to be flooding kernel log
over a period of time.

May 25 03:46:34 alp kernel: Non critical power or cooling issue cleared
May 25 03:46:52 alp kernel: Non critical power or cooling issue cleared
May 25 03:53:48 alp kernel: Non critical power or cooling issue cleared
May 25 03:55:46 alp kernel: Non critical power or cooling issue cleared
May 25 03:56:34 alp kernel: Non critical power or cooling issue cleared
May 25 03:59:04 alp kernel: Non critical power or cooling issue cleared
May 25 04:02:01 alp kernel: Non critical power or cooling issue cleared

These EPOW reset events are spurious in nature and are triggered by
firmware without an actual EPOW event being reset. This patch avoids these
multiple EPOW reset warnings by using a counter variable. This variable
is incremented every time an EPOW event is reported. Upon receiving a EPOW
reset event the same variable is checked to filter out spurious events and
decremented accordingly.

This patch also improves log messages to better describe EPOW event being
reported. Merged adjacent log messages into single one to reduce number of
lines printed per event.

Signed-off-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Signed-off-by: Vipin K Parashar <vipin@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-14 20:41:49 +11:00
Aneesh Kumar K.V 4ad90c8649 powerpc/mm: Use H_READ with H_READ_4
This will bulk read 4 hash pte slot entries and should reduce the loop

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-14 15:19:17 +11:00
Aneesh Kumar K.V e34aa03ca4 powerpc/mm: Move THP headers around
We support THP only with book3s_64 and 64K page size. Move
THP details to hash64-64k.h to clarify the same.

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-14 15:19:14 +11:00
Alistair Popple 25642e1459 powerpc/opal-irqchip: Fix double endian conversion
The OPAL event calls return a mask of events that are active in big
endian format. This is checked when unmasking the events in the
irqchip by comparison with a cached value. The cached value was stored
in big endian format but should've been converted to CPU endian
first.

This bug leads to OPAL event delivery being delayed or dropped on some
systems. Symptoms may include a non-functional console.

The bug is fixed by calling opal_handle_events(...) instead of
duplicating code in opal_event_unmask(...).

Fixes: 9f0fd0499d ("powerpc/powernv: Add a virtual irqchip for opal events")
Cc: stable@vger.kernel.org # v4.2+
Reported-by: Douglas L Lehr <dllehr@us.ibm.com>
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-08 16:53:31 +11:00
Rashmica Gupta f43194e458 powerpc: Standardise on NR_syscalls rather than __NR_syscalls.
Most architectures use NR_syscalls as the #define for the number of syscalls.

We use __NR_syscalls, and then define NR_syscalls as __NR_syscalls.

__NR_syscalls is not used outside arch code, whereas NR_syscalls is. So as
NR_syscalls must be defined and __NR_syscalls does not, replace __NR_syscalls
with NR_syscalls.

Signed-off-by: Rashmica Gupta <rashmicy@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-11-26 22:11:17 +11:00
John Ogness 57f889471c powerpc/powermac: set IRQF_NO_THREAD for xmon/cascade handlers
The xmon and cascade irq handlers must not run as threads.
pmac_pic_lock is already a raw_spinlock, but the irq flag
IRQF_NO_THREAD needs to be set as well.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-11-26 22:11:05 +11:00
Krzysztof Kozlowski 87630eb1d5 powerpc/powernv: Drop owner assignment from platform_driver
platform_driver does not need to set an owner because
platform_driver_register() will set it.

Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-11-24 14:21:28 +11:00
Linus Torvalds 3c87b79188 PCI changes for the v4.4 merge window:
Resource management
     Add support for Enhanced Allocation devices (Sean O. Stalley)
     Add Enhanced Allocation register entries (Sean O. Stalley)
     Handle IORESOURCE_PCI_FIXED when sizing resources (David Daney)
     Handle IORESOURCE_PCI_FIXED when assigning resources (David Daney)
     Handle Enhanced Allocation capability for SR-IOV devices (David Daney)
     Clear IORESOURCE_UNSET when reverting to firmware-assigned address (Bjorn Helgaas)
     Make Enhanced Allocation bitmasks more obvious (Bjorn Helgaas)
     Expand Enhanced Allocation BAR output (Bjorn Helgaas)
     Add of_pci_check_probe_only to parse "linux,pci-probe-only" (Marc Zyngier)
     Fix lookup of linux,pci-probe-only property (Marc Zyngier)
     Add sparc mem64 resource parsing for root bus (Yinghai Lu)
 
   PCI device hotplug
     pciehp: Queue power work requests in dedicated function (Guenter Roeck)
 
   Driver binding
     Add builtin_pci_driver() to avoid registration boilerplate (Paul Gortmaker)
 
   Virtualization
     Set SR-IOV NumVFs to zero after enumeration (Alexander Duyck)
     Remove redundant validation of SR-IOV offset/stride registers (Alexander Duyck)
     Remove VFs in reverse order if virtfn_add() fails (Alexander Duyck)
     Reorder pcibios_sriov_disable() (Alexander Duyck)
     Wait 1 second between disabling VFs and clearing NumVFs (Alexander Duyck)
     Fix sriov_enable() error path for pcibios_enable_sriov() failures (Alexander Duyck)
     Enable SR-IOV ARI Capable Hierarchy before reading TotalVFs (Ben Shelton)
     Don't try to restore VF BARs (Wei Yang)
 
   MSI
     Don't alloc pcibios-irq when MSI is enabled (Joerg Roedel)
     Add msi_controller setup_irqs() method for special multivector setup (Lucas Stach)
     Export all remapped MSIs to sysfs attributes (Romain Bezut)
     Disable MSI on SiS 761 (Ondrej Zary)
 
   AER
     Clear error status registers during enumeration and restore (Taku Izumi)
 
   Generic host bridge driver
     Fix lookup of linux,pci-probe-only property (Marc Zyngier)
     Allow multiple hosts with different map_bus() methods (David Daney)
     Pass starting bus number to pci_scan_root_bus() (David Daney)
     Fix address window calculation for non-zero starting bus (David Daney)
 
   Altera host bridge driver
     Add msi.h to ARM Kbuild (Ley Foon Tan)
     Add Altera PCIe host controller driver (Ley Foon Tan)
     Add Altera PCIe MSI driver (Ley Foon Tan)
 
   APM X-Gene host bridge driver
     Remove msi_controller assignment (Duc Dang)
 
   Broadcom iProc host bridge driver
     Fix header comment "Corporation" misspelling (Florian Fainelli)
     Fix code comment to match code (Ray Jui)
     Remove unused struct iproc_pcie.irqs[] (Ray Jui)
     Call pci_fixup_irqs() for ARM64 as well as ARM (Ray Jui)
     Fix PCIe reset logic (Ray Jui)
     Improve link detection logic (Ray Jui)
     Update PCIe device tree bindings (Ray Jui)
     Add outbound mapping support (Ray Jui)
 
   Freescale i.MX6 host bridge driver
     Return real error code from imx6_add_pcie_port() (Fabio Estevam)
     Add PCIE_PHY_RX_ASIC_OUT_VALID definition (Fabio Estevam)
 
   Freescale Layerscape host bridge driver
     Remove ls_pcie_establish_link() (Minghuan Lian)
     Ignore PCIe controllers in Endpoint mode (Minghuan Lian)
     Factor out SCFG related function (Minghuan Lian)
     Update ls_add_pcie_port() (Minghuan Lian)
     Remove unused fields from struct ls_pcie (Minghuan Lian)
     Add support for LS1043a and LS2080a (Minghuan Lian)
     Add ls_pcie_msi_host_init() (Minghuan Lian)
 
   HiSilicon host bridge driver
     Add HiSilicon SoC Hip05 PCIe driver (Zhou Wang)
 
   Marvell MVEBU host bridge driver
     Return zero for reserved or unimplemented config space (Russell King)
     Use exact config access size; don't read/modify/write (Russell King)
     Use of_get_available_child_count() (Russell King)
     Use for_each_available_child_of_node() to walk child nodes (Russell King)
     Report full node name when reporting a DT error (Russell King)
     Use port->name rather than "PCIe%d.%d" (Russell King)
     Move port parsing and resource claiming to  separate function (Russell King)
     Fix memory leaks and refcount leaks (Russell King)
     Split port parsing and resource claiming from  port setup (Russell King)
     Use gpio_set_value_cansleep() (Russell King)
     Use devm_kcalloc() to allocate an array (Russell King)
     Use gpio_desc to carry around gpio (Russell King)
     Improve clock/reset handling (Russell King)
     Add PCI Express root complex capability block (Russell King)
     Remove code restricting accesses to slot 0 (Russell King)
 
   NVIDIA Tegra host bridge driver
     Wrap static pgprot_t initializer with __pgprot() (Ard Biesheuvel)
 
   Renesas R-Car host bridge driver
     Build pci-rcar-gen2.c only on ARM (Geert Uytterhoeven)
     Build pcie-rcar.c only on ARM (Geert Uytterhoeven)
     Make PCI aware of the I/O resources (Phil Edworthy)
     Remove dependency on ARM-specific struct hw_pci (Phil Edworthy)
     Set root bus nr to that provided in DT (Phil Edworthy)
     Fix I/O offset for multiple host bridges (Phil Edworthy)
 
   ST Microelectronics SPEAr13xx host bridge driver
     Fix dw_pcie_cfg_read/write() usage (Gabriele Paoloni)
 
   Synopsys DesignWare host bridge driver
     Make "clocks" and "clock-names" optional DT properties (Bhupesh Sharma)
     Use exact access size in dw_pcie_cfg_read() (Gabriele Paoloni)
     Simplify dw_pcie_cfg_read/write() interfaces (Gabriele Paoloni)
     Require config accesses to be naturally aligned (Gabriele Paoloni)
     Make "num-lanes" an optional DT property (Gabriele Paoloni)
     Move calculation of bus addresses to DRA7xx (Gabriele Paoloni)
     Replace ARM pci_sys_data->align_resource with global function pointer (Gabriele Paoloni)
     Factor out MSI msg setup (Lucas Stach)
     Implement multivector MSI IRQ setup (Lucas Stach)
     Make get_msi_addr() return phys_addr_t, not u32 (Lucas Stach)
     Set up high part of MSI target address (Lucas Stach)
     Fix PORT_LOGIC_LINK_WIDTH_MASK (Zhou Wang)
     Revert "PCI: designware: Program ATU with untranslated address" (Zhou Wang)
     Use of_pci_get_host_bridge_resources() to parse DT (Zhou Wang)
     Make driver arch-agnostic (Zhou Wang)
 
   Miscellaneous
     Make x86 pci_subsys_init() static (Alexander Kuleshov)
     Turn off Request Attributes to avoid Chelsio T5 Completion erratum (Hariprasad Shenai)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWPM9aAAoJEFmIoMA60/r89f8QALFMHegqKk5M08ZcCaG7unLF
 5U5t88y6KF/kNYF6HOqLQiHh79U3ToU5HdrNaAtutr0UnvgbFC2WulLqKYgLiq6Y
 YJnwR3EfgGmdG7DKAVAXq19+nc2hgTPAEe8sciU7HKlTbqQmGj6//y3sQULGNLx3
 zur0C33DCtrDgKDP7to273lkHO8Vl0YuLzyqwt0ePCMNcXR0h1dK8QxSTjuXBxaR
 c+T1V1Hw64MTxLz3xJd1/1ipy32u+9LnqqcdRz0zRq6qi48G9ch/i4Z6DHa8kTbj
 DUZrrTYKILQ2TcjcZSBmTueX11Z1Xa4/I/45sehIi6gVWL9qQbmGpt2E5YtFED+4
 GdcmBSbWG/qsNsabXk38uiM3ww7+ltXEOhTXbcK+EgjvIhE6gSK/plYG0fU9pybs
 AKViEXVdHoT1X0N1dLK12mq7kvDCQvShHn08lz97Q9YrZ32wv1Fnij6WVSbJvfWt
 DubtPtisVM+rVy+VTpOImNR9wO54lTmG5jK53yNqH7I20K89y1kqARlN9nMXMB1a
 2nQnwe9yWlsGj9gVNCn1KmyQSPOWjg+3Z+ekfwbxpca14s1AaN3Jm0N9Z61dXFoF
 y2ygoQtZ/z9BHr3quBpxXGt+aVUg2kcNw5GYeDYiALxXdJSObyzRrZ6HDb/zicU2
 ZH9hBj0ctXvucmy6I2mt
 =uZrt
 -----END PGP SIGNATURE-----

Merge tag 'pci-v4.4-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI updates from Bjorn Helgaas:
 "Resource management:
   - Add support for Enhanced Allocation devices (Sean O. Stalley)
   - Add Enhanced Allocation register entries (Sean O. Stalley)
   - Handle IORESOURCE_PCI_FIXED when sizing resources (David Daney)
   - Handle IORESOURCE_PCI_FIXED when assigning resources (David Daney)
   - Handle Enhanced Allocation capability for SR-IOV devices (David Daney)
   - Clear IORESOURCE_UNSET when reverting to firmware-assigned address (Bjorn Helgaas)
   - Make Enhanced Allocation bitmasks more obvious (Bjorn Helgaas)
   - Expand Enhanced Allocation BAR output (Bjorn Helgaas)
   - Add of_pci_check_probe_only to parse "linux,pci-probe-only" (Marc Zyngier)
   - Fix lookup of linux,pci-probe-only property (Marc Zyngier)
   - Add sparc mem64 resource parsing for root bus (Yinghai Lu)

  PCI device hotplug:
   - pciehp: Queue power work requests in dedicated function (Guenter Roeck)

  Driver binding:
   - Add builtin_pci_driver() to avoid registration boilerplate (Paul Gortmaker)

  Virtualization:
   - Set SR-IOV NumVFs to zero after enumeration (Alexander Duyck)
   - Remove redundant validation of SR-IOV offset/stride registers (Alexander Duyck)
   - Remove VFs in reverse order if virtfn_add() fails (Alexander Duyck)
   - Reorder pcibios_sriov_disable() (Alexander Duyck)
   - Wait 1 second between disabling VFs and clearing NumVFs (Alexander Duyck)
   - Fix sriov_enable() error path for pcibios_enable_sriov() failures (Alexander Duyck)
   - Enable SR-IOV ARI Capable Hierarchy before reading TotalVFs (Ben Shelton)
   - Don't try to restore VF BARs (Wei Yang)

  MSI:
   - Don't alloc pcibios-irq when MSI is enabled (Joerg Roedel)
   - Add msi_controller setup_irqs() method for special multivector setup (Lucas Stach)
   - Export all remapped MSIs to sysfs attributes (Romain Bezut)
   - Disable MSI on SiS 761 (Ondrej Zary)

  AER:
   - Clear error status registers during enumeration and restore (Taku Izumi)

  Generic host bridge driver:
   - Fix lookup of linux,pci-probe-only property (Marc Zyngier)
   - Allow multiple hosts with different map_bus() methods (David Daney)
   - Pass starting bus number to pci_scan_root_bus() (David Daney)
   - Fix address window calculation for non-zero starting bus (David Daney)

  Altera host bridge driver:
   - Add msi.h to ARM Kbuild (Ley Foon Tan)
   - Add Altera PCIe host controller driver (Ley Foon Tan)
   - Add Altera PCIe MSI driver (Ley Foon Tan)

  APM X-Gene host bridge driver:
   - Remove msi_controller assignment (Duc Dang)

  Broadcom iProc host bridge driver:
   - Fix header comment "Corporation" misspelling (Florian Fainelli)
   - Fix code comment to match code (Ray Jui)
   - Remove unused struct iproc_pcie.irqs[] (Ray Jui)
   - Call pci_fixup_irqs() for ARM64 as well as ARM (Ray Jui)
   - Fix PCIe reset logic (Ray Jui)
   - Improve link detection logic (Ray Jui)
   - Update PCIe device tree bindings (Ray Jui)
   - Add outbound mapping support (Ray Jui)

  Freescale i.MX6 host bridge driver:
   - Return real error code from imx6_add_pcie_port() (Fabio Estevam)
   - Add PCIE_PHY_RX_ASIC_OUT_VALID definition (Fabio Estevam)

  Freescale Layerscape host bridge driver:
   - Remove ls_pcie_establish_link() (Minghuan Lian)
   - Ignore PCIe controllers in Endpoint mode (Minghuan Lian)
   - Factor out SCFG related function (Minghuan Lian)
   - Update ls_add_pcie_port() (Minghuan Lian)
   - Remove unused fields from struct ls_pcie (Minghuan Lian)
   - Add support for LS1043a and LS2080a (Minghuan Lian)
   - Add ls_pcie_msi_host_init() (Minghuan Lian)

  HiSilicon host bridge driver:
   - Add HiSilicon SoC Hip05 PCIe driver (Zhou Wang)

  Marvell MVEBU host bridge driver:
   - Return zero for reserved or unimplemented config space (Russell King)
   - Use exact config access size; don't read/modify/write (Russell King)
   - Use of_get_available_child_count() (Russell King)
   - Use for_each_available_child_of_node() to walk child nodes (Russell King)
   - Report full node name when reporting a DT error (Russell King)
   - Use port->name rather than "PCIe%d.%d" (Russell King)
   - Move port parsing and resource claiming to  separate function (Russell King)
   - Fix memory leaks and refcount leaks (Russell King)
   - Split port parsing and resource claiming from  port setup (Russell King)
   - Use gpio_set_value_cansleep() (Russell King)
   - Use devm_kcalloc() to allocate an array (Russell King)
   - Use gpio_desc to carry around gpio (Russell King)
   - Improve clock/reset handling (Russell King)
   - Add PCI Express root complex capability block (Russell King)
   - Remove code restricting accesses to slot 0 (Russell King)

  NVIDIA Tegra host bridge driver:
   - Wrap static pgprot_t initializer with __pgprot() (Ard Biesheuvel)

  Renesas R-Car host bridge driver:
   - Build pci-rcar-gen2.c only on ARM (Geert Uytterhoeven)
   - Build pcie-rcar.c only on ARM (Geert Uytterhoeven)
   - Make PCI aware of the I/O resources (Phil Edworthy)
   - Remove dependency on ARM-specific struct hw_pci (Phil Edworthy)
   - Set root bus nr to that provided in DT (Phil Edworthy)
   - Fix I/O offset for multiple host bridges (Phil Edworthy)

  ST Microelectronics SPEAr13xx host bridge driver:
   - Fix dw_pcie_cfg_read/write() usage (Gabriele Paoloni)

  Synopsys DesignWare host bridge driver:
   - Make "clocks" and "clock-names" optional DT properties (Bhupesh Sharma)
   - Use exact access size in dw_pcie_cfg_read() (Gabriele Paoloni)
   - Simplify dw_pcie_cfg_read/write() interfaces (Gabriele Paoloni)
   - Require config accesses to be naturally aligned (Gabriele Paoloni)
   - Make "num-lanes" an optional DT property (Gabriele Paoloni)
   - Move calculation of bus addresses to DRA7xx (Gabriele Paoloni)
   - Replace ARM pci_sys_data->align_resource with global function pointer (Gabriele Paoloni)
   - Factor out MSI msg setup (Lucas Stach)
   - Implement multivector MSI IRQ setup (Lucas Stach)
   - Make get_msi_addr() return phys_addr_t, not u32 (Lucas Stach)
   - Set up high part of MSI target address (Lucas Stach)
   - Fix PORT_LOGIC_LINK_WIDTH_MASK (Zhou Wang)
   - Revert "PCI: designware: Program ATU with untranslated address" (Zhou Wang)
   - Use of_pci_get_host_bridge_resources() to parse DT (Zhou Wang)
   - Make driver arch-agnostic (Zhou Wang)

  Miscellaneous:
   - Make x86 pci_subsys_init() static (Alexander Kuleshov)
   - Turn off Request Attributes to avoid Chelsio T5 Completion erratum (Hariprasad Shenai)"

* tag 'pci-v4.4-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (94 commits)
  PCI: altera: Add Altera PCIe MSI driver
  PCI: hisi: Add HiSilicon SoC Hip05 PCIe driver
  PCI: layerscape: Add ls_pcie_msi_host_init()
  PCI: layerscape: Add support for LS1043a and LS2080a
  PCI: layerscape: Remove unused fields from struct ls_pcie
  PCI: layerscape: Update ls_add_pcie_port()
  PCI: layerscape: Factor out SCFG related function
  PCI: layerscape: Ignore PCIe controllers in Endpoint mode
  PCI: layerscape: Remove ls_pcie_establish_link()
  PCI: designware: Make "clocks" and "clock-names" optional DT properties
  PCI: designware: Make driver arch-agnostic
  ARM/PCI: Replace pci_sys_data->align_resource with global function pointer
  PCI: designware: Use of_pci_get_host_bridge_resources() to parse DT
  Revert "PCI: designware: Program ATU with untranslated address"
  PCI: designware: Move calculation of bus addresses to DRA7xx
  PCI: designware: Make "num-lanes" an optional DT property
  PCI: designware: Require config accesses to be naturally aligned
  PCI: designware: Simplify dw_pcie_cfg_read/write() interfaces
  PCI: designware: Use exact access size in dw_pcie_cfg_read()
  PCI: spear: Fix dw_pcie_cfg_read/write() usage
  ...
2015-11-06 11:29:53 -08:00
Linus Torvalds 2f4bf528ec powerpc updates for 4.4
- Kconfig: remove BE-only platforms from LE kernel build from Boqun Feng
  - Refresh ps3_defconfig from Geoff Levand
  - Emit GNU & SysV hashes for the vdso from Michael Ellerman
  - Define an enum for the bolted SLB indexes from Anshuman Khandual
  - Use a local to avoid multiple calls to get_slb_shadow() from Michael Ellerman
  - Add gettimeofday() benchmark from Michael Neuling
  - Avoid link stack corruption in __get_datapage() from Michael Neuling
  - Add virt_to_pfn and use this instead of opencoding from Aneesh Kumar K.V
  - Add ppc64le_defconfig from Michael Ellerman
  - pseries: extract of_helpers module from Andy Shevchenko
  - Correct string length in pseries_of_derive_parent() from Nathan Fontenot
  - Free the MSI bitmap if it was slab allocated from Denis Kirjanov
  - Shorten irq_chip name for the SIU from Christophe Leroy
  - Wait 1s for secondaries to enter OPAL during kexec from Samuel Mendoza-Jonas
  - Fix _ALIGN_* errors due to type difference. from Aneesh Kumar K.V
  - powerpc/pseries/hvcserver: don't memset pi_buff if it is null from Colin Ian King
  - Disable hugepd for 64K page size. from Aneesh Kumar K.V
  - Differentiate between hugetlb and THP during page walk from Aneesh Kumar K.V
  - Make PCI non-optional for pseries from Michael Ellerman
  - Individual System V IPC system calls from Sam bobroff
  - Add selftest of unmuxed IPC calls from Michael Ellerman
  - discard .exit.data at runtime from Stephen Rothwell
  - Delete old orphaned PrPMC 280/2800 DTS and boot file. from Paul Gortmaker
  - Use of_get_next_parent to simplify code from Christophe Jaillet
  - Paginate some xmon output from Sam bobroff
  - Add some more elements to the xmon PACA dump from Michael Ellerman
  - Allow the tm-syscall selftest to build with old headers from Michael Ellerman
  - Run EBB selftests only on POWER8 from Denis Kirjanov
  - Drop CONFIG_TUNE_CELL in favour of CONFIG_CELL_CPU from Michael Ellerman
  - Avoid reference to potentially freed memory in prom.c from Christophe Jaillet
  - Quieten boot wrapper output with run_cmd from Geoff Levand
  - EEH fixes and cleanups from Gavin Shan
  - Fix recursive fenced PHB on Broadcom shiner adapter from Gavin Shan
  - Use of_get_next_parent() in of_get_ibm_chip_id() from Michael Ellerman
  - Fix section mismatch warning in msi_bitmap_alloc() from Denis Kirjanov
  - Fix ps3-lpm white space from Rudhresh Kumar J
  - Fix ps3-vuart null dereference from Colin King
  - nvram: Add missing kfree in error path from Christophe Jaillet
  - nvram: Fix function name in some errors messages. from Christophe Jaillet
  - drivers/macintosh: adb: fix misleading Kconfig help text from Aaro Koskinen
  - agp/uninorth: fix a memleak in create_gatt_table from Denis Kirjanov
  - cxl: Free virtual PHB when removing from Andrew Donnellan
  - scripts/kconfig/Makefile: Allow KBUILD_DEFCONFIG to be a target from Michael Ellerman
  - scripts/kconfig/Makefile: Fix KBUILD_DEFCONFIG check when building with O= from Michael Ellerman
 
  - Freescale updates from Scott: Highlights include 64-bit book3e kexec/kdump
    support, a rework of the qoriq clock driver, device tree changes including
    qoriq fman nodes, support for a new 85xx board, and some fixes.
 
  - MPC5xxx updates from Anatolij: Highlights include a driver for MPC512x
    LocalPlus Bus FIFO with its device tree binding documentation, mpc512x
    device tree updates and some minor fixes.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWPEZgAAoJEFHr6jzI4aWANjYQAKX2Q/95hqKfCuF5FBcUmtMC
 Pu/Nff027MVzxZ2ApDcvvLGps5Nz2bn3nIhc9zjkXc5E8DuL6X3Yl8ce7qyNcc3g
 cJJ8RvtUo6J1OMWetXFehtPYniAAwKMhZYKnj0+WnLr2SyH/Vhl3ehDkFbGyPtuH
 r+2E7krFjfVgU+bzciIFnOaDekFuFN/pXWMb6e6zQyBJe9N8ZIp96uouGCebKVd0
 VDLItzdaKErT8JFfbymMPvZm3V0rMVx4WWu3kAbQX8LrD5a18NF1zrjAOHRXc61n
 kkk8/DPuNOon1PbXXyiS5BcFyZRe+KE3VBnoW5sOMqMIRg5WdO1oU3e2pEfXMO8+
 leXYwFLXiKzUZuOgQG2QiUhrzD2yC1o6/TJWATv0dSl9AwrecgPX+Vj6X357slAf
 A9E3eMy5tgnpndBWZmvZS3W7YDKH+NkeZ+Q40+NErAlqr++ErrTcKVndk5vWlYTT
 7mMZeTXagX66al/k5ATKqwB7iUSpnYHSAa9fcUYPSM2FnXsDxPyeJGkBbcoOmkGj
 QrpgNYOvJaUJd076goZCV39v0c1xpfV9/9kyVch8HUadf6JcjpVZwYnbGw2qlJjh
 ZanuBG2VOeSwaKQqXiRBSBetnpAg8CVpFjDmX9wOBfSek2wxEJqDX/vQExdbIDQQ
 pUs7vnUxLzhmW/x+ygOI
 =YwcM
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:

 - Kconfig: remove BE-only platforms from LE kernel build from Boqun
   Feng
 - Refresh ps3_defconfig from Geoff Levand
 - Emit GNU & SysV hashes for the vdso from Michael Ellerman
 - Define an enum for the bolted SLB indexes from Anshuman Khandual
 - Use a local to avoid multiple calls to get_slb_shadow() from Michael
   Ellerman
 - Add gettimeofday() benchmark from Michael Neuling
 - Avoid link stack corruption in __get_datapage() from Michael Neuling
 - Add virt_to_pfn and use this instead of opencoding from Aneesh Kumar
   K.V
 - Add ppc64le_defconfig from Michael Ellerman
 - pseries: extract of_helpers module from Andy Shevchenko
 - Correct string length in pseries_of_derive_parent() from Nathan
   Fontenot
 - Free the MSI bitmap if it was slab allocated from Denis Kirjanov
 - Shorten irq_chip name for the SIU from Christophe Leroy
 - Wait 1s for secondaries to enter OPAL during kexec from Samuel
   Mendoza-Jonas
 - Fix _ALIGN_* errors due to type difference, from Aneesh Kumar K.V
 - powerpc/pseries/hvcserver: don't memset pi_buff if it is null from
   Colin Ian King
 - Disable hugepd for 64K page size, from Aneesh Kumar K.V
 - Differentiate between hugetlb and THP during page walk from Aneesh
   Kumar K.V
 - Make PCI non-optional for pseries from Michael Ellerman
 - Individual System V IPC system calls from Sam bobroff
 - Add selftest of unmuxed IPC calls from Michael Ellerman
 - discard .exit.data at runtime from Stephen Rothwell
 - Delete old orphaned PrPMC 280/2800 DTS and boot file, from Paul
   Gortmaker
 - Use of_get_next_parent to simplify code from Christophe Jaillet
 - Paginate some xmon output from Sam bobroff
 - Add some more elements to the xmon PACA dump from Michael Ellerman
 - Allow the tm-syscall selftest to build with old headers from Michael
   Ellerman
 - Run EBB selftests only on POWER8 from Denis Kirjanov
 - Drop CONFIG_TUNE_CELL in favour of CONFIG_CELL_CPU from Michael
   Ellerman
 - Avoid reference to potentially freed memory in prom.c from Christophe
   Jaillet
 - Quieten boot wrapper output with run_cmd from Geoff Levand
 - EEH fixes and cleanups from Gavin Shan
 - Fix recursive fenced PHB on Broadcom shiner adapter from Gavin Shan
 - Use of_get_next_parent() in of_get_ibm_chip_id() from Michael
   Ellerman
 - Fix section mismatch warning in msi_bitmap_alloc() from Denis
   Kirjanov
 - Fix ps3-lpm white space from Rudhresh Kumar J
 - Fix ps3-vuart null dereference from Colin King
 - nvram: Add missing kfree in error path from Christophe Jaillet
 - nvram: Fix function name in some errors messages, from Christophe
   Jaillet
 - drivers/macintosh: adb: fix misleading Kconfig help text from Aaro
   Koskinen
 - agp/uninorth: fix a memleak in create_gatt_table from Denis Kirjanov
 - cxl: Free virtual PHB when removing from Andrew Donnellan
 - scripts/kconfig/Makefile: Allow KBUILD_DEFCONFIG to be a target from
   Michael Ellerman
 - scripts/kconfig/Makefile: Fix KBUILD_DEFCONFIG check when building
   with O= from Michael Ellerman
 - Freescale updates from Scott: Highlights include 64-bit book3e
   kexec/kdump support, a rework of the qoriq clock driver, device tree
   changes including qoriq fman nodes, support for a new 85xx board, and
   some fixes.
 - MPC5xxx updates from Anatolij: Highlights include a driver for
   MPC512x LocalPlus Bus FIFO with its device tree binding
   documentation, mpc512x device tree updates and some minor fixes.

* tag 'powerpc-4.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (106 commits)
  powerpc/msi: Fix section mismatch warning in msi_bitmap_alloc()
  powerpc/prom: Use of_get_next_parent() in of_get_ibm_chip_id()
  powerpc/pseries: Correct string length in pseries_of_derive_parent()
  powerpc/e6500: hw tablewalk: make sure we invalidate and write to the same tlb entry
  powerpc/mpc85xx: Add FSL QorIQ DPAA FMan support to the SoC device tree(s)
  powerpc/mpc85xx: Create dts components for the FSL QorIQ DPAA FMan
  powerpc/fsl: Add #clock-cells and clockgen label to clockgen nodes
  powerpc: handle error case in cpm_muram_alloc()
  powerpc: mpic: use IRQCHIP_SKIP_SET_WAKE instead of redundant mpic_irq_set_wake
  powerpc/book3e-64: Enable kexec
  powerpc/book3e-64/kexec: Set "r4 = 0" when entering spinloop
  powerpc/booke: Only use VIRT_PHYS_OFFSET on booke32
  powerpc/book3e-64/kexec: Enable SMP release
  powerpc/book3e-64/kexec: create an identity TLB mapping
  powerpc/book3e-64: Don't limit paca to 256 MiB
  powerpc/book3e/kdump: Enable crash_kexec_wait_realmode
  powerpc/book3e: support CONFIG_RELOCATABLE
  powerpc/booke64: Fix args to copy_and_flush
  powerpc/book3e-64: rename interrupt_end_book3e with __end_interrupts
  powerpc/e6500: kexec: Handle hardware threads
  ...
2015-11-05 23:38:43 -08:00
Michael Ellerman 8bdf2023e2 Merge branch 'next' of git://git.denx.de/linux-denx-agust into next
MPC5xxx updates from Anatolij:

"Highlights include a driver for MPC512x LocalPlus Bus FIFO with its
device tree binding documentation, mpc512x device tree updates and some
minor fixes."
2015-11-05 19:35:12 +11:00
Linus Torvalds 6aa2fdb87c Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
 "The irq departement delivers:

   - Rework the irqdomain core infrastructure to accomodate ACPI based
     systems.  This is required to support ARM64 without creating
     artificial device tree nodes.

   - Sanitize the ACPI based ARM GIC initialization by making use of the
     new firmware independent irqdomain core

   - Further improvements to the generic MSI management

   - Generalize the irq migration on CPU hotplug

   - Improvements to the threaded interrupt infrastructure

   - Allow the migration of "chained" low level interrupt handlers

   - Allow optional force masking of interrupts in disable_irq[_nosysnc]

   - Support for two new interrupt chips - Sigh!

   - A larger set of errata fixes for ARM gicv3

   - The usual pile of fixes, updates, improvements and cleanups all
     over the place"

* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (71 commits)
  Document that IRQ_NONE should be returned when IRQ not actually handled
  PCI/MSI: Allow the MSI domain to be device-specific
  PCI: Add per-device MSI domain hook
  of/irq: Use the msi-map property to provide device-specific MSI domain
  of/irq: Split of_msi_map_rid to reuse msi-map lookup
  irqchip/gic-v3-its: Parse new version of msi-parent property
  PCI/MSI: Use of_msi_get_domain instead of open-coded "msi-parent" parsing
  of/irq: Use of_msi_get_domain instead of open-coded "msi-parent" parsing
  of/irq: Add support code for multi-parent version of "msi-parent"
  irqchip/gic-v3-its: Add handling of PCI requester id.
  PCI/MSI: Add helper function pci_msi_domain_get_msi_rid().
  of/irq: Add new function of_msi_map_rid()
  Docs: dt: Add PCI MSI map bindings
  irqchip/gic-v2m: Add support for multiple MSI frames
  irqchip/gic-v3: Fix translation of LPIs after conversion to irq_fwspec
  irqchip/mxs: Add Alphascale ASM9260 support
  irqchip/mxs: Prepare driver for hardware with different offsets
  irqchip/mxs: Panic if ioremap or domain creation fails
  irqdomain: Documentation updates
  irqdomain/msi: Use fwnode instead of of_node
  ...
2015-11-03 14:40:01 -08:00
Michael Ellerman 3b0e21ec3b Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/scottwood/linux into next
Freescale updates from Scott:

"Highlights include 64-bit book3e kexec/kdump support, a rework of the
qoriq clock driver, device tree changes including qoriq fman nodes,
support for a new 85xx board, and some fixes.

Note that there is a trivial merge conflict with the clock tree's next
branch, in the clock Makefile."
2015-11-02 13:59:48 +11:00
Nathan Fontenot f755ecfb8c powerpc/pseries: Correct string length in pseries_of_derive_parent()
Commit a030e1e4bb make a change to use
kstrndup() instead of kmalloc() + strlcpy() in the pseries_of_derive_parent()
routine that introduces a subtle change in the parent path name generated.
The kstrndup() routine will copy n characters followed by a terminating null,
whereas strlcpy() will copy n-1 characters and add a terminating null.

This slight difference results in having a parent path that includes the
tailing '/' character, "/cpus/" vs. "/cpus". This then causes the subsequent
call to of_find_node_by_path() to fail, and in the case of DLPAR add
operations the DLPAR request fails.

This patch decrements the pointer returned from kbasename() to point to the
'/' character before the base name instead of the base name. This then
adjusts the string length calculations to not include the trailing '/'
in the parent path name.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-28 12:08:18 +09:00
Scott Wood 43f2cfcce2 Merge branch 'clock' into HEAD
This is a major overhaul of the clk-qoriq driver, which I'm merging
via PPC with Stephen Boyd's ack in order to apply subsequent PPC patches
that depend on it.
2015-10-27 18:14:16 -05:00
Scott Wood f34b3e19fd powerpc/e6500: kexec: Handle hardware threads
The new kernel will be expecting secondary threads to be disabled,
not spinning.

Signed-off-by: Scott Wood <scottwood@freescale.com>
2015-10-27 18:13:25 -05:00
Tiejun Chen 939fbf0080 powerpc/85xx: Implement 64-bit kexec support
Unlike 32-bit 85xx kexec, we don't do a core reset.

Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com>
[scottwood: edit changelog, and cleanup]
Signed-off-by: Scott Wood <scottwood@freescale.com>
2015-10-27 18:13:25 -05:00
Himangi Saraogi 39e69f55f8 powerpc: Introduce the use of the managed version of kzalloc
This patch moves data allocated using kzalloc to managed data allocated
using devm_kzalloc and cleans now unnecessary kfree in probe function.

The following Coccinelle semantic patch was used for making the change:

@platform@
identifier p, probefn, removefn;
@@
struct platform_driver p = {
  .probe = probefn,
  .remove = removefn,
};

@prb@
identifier platform.probefn, pdev;
expression e, e1, e2;
@@
probefn(struct platform_device *pdev, ...) {
  <+...
- e = kzalloc(e1, e2)
+ e = devm_kzalloc(&pdev->dev, e1, e2)
  ...
?-kfree(e);
  ...+>
}

Signed-off-by: Himangi Saraogi <himangi774@gmail.com>
Acked-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Anatolij Gustschin <agust@denx.de>
2015-10-22 16:13:23 +02:00
Alexander Popov 1a4bb93f79 powerpc/512x: add LocalPlus Bus FIFO device driver
This driver for Freescale MPC512x LocalPlus Bus FIFO (called SCLPC
in the Reference Manual) allows Direct Memory Access transfers
between RAM and peripheral devices on LocalPlus Bus.

Signed-off-by: Alexander Popov <alex.popov@linux.com>
Signed-off-by: Anatolij Gustschin <agust@denx.de>
2015-10-22 15:19:40 +02:00
Luis de Bethencourt 7e610890b5 powerpc: platforms: mpc52xx_lpbfifo: Fix module autoload for OF platform driver
This platform driver has a OF device ID table but the OF module
alias information is not created so module autoloading won't work.

Signed-off-by: Luis de Bethencourt <luisbg@osg.samsung.com>
Signed-off-by: Anatolij Gustschin <agust@denx.de>
2015-10-22 14:36:53 +02:00
Scott Wood 9484865447 powerpc/fsl: Move fsl_guts.h out of arch/powerpc
Freescale's Layerscape ARM chips use the same structure.

Signed-off-by: Scott Wood <scottwood@freescale.com>
2015-10-21 18:05:50 -05:00
Paul Mackerras 53c656c413 powerpc/powernv: Handle irq_happened flag correctly in off-line loop
This fixes a bug where it is possible for an off-line CPU to fail to go
into a low-power state (nap/sleep/winkle), and to become unresponsive to
requests from the KVM subsystem to wake up and run a VCPU. What can
happen is that a maskable interrupt of some kind (external, decrementer,
hypervisor doorbell, or HMI) after we have called local_irq_disable() at
the beginning of pnv_smp_cpu_kill_self() and before interrupts are
hard-disabled inside power7_nap/sleep/winkle(). In this situation, the
pending event is marked in the irq_happened flag in the PACA. This
pending event prevents power7_nap/sleep/winkle from going to the
requested low-power state; instead they return immediately. We don't
deal with any of these pending event flags in the off-line loop in
pnv_smp_cpu_kill_self() because power7_nap et al. return 0 in this case,
so we will have srr1 == 0, and none of the processing to clear
interrupts or doorbells will be done.

Usually, the most obvious symptom of this is that a KVM guest will fail
with a console message saying "KVM: couldn't grab cpu N".

This fixes the problem by making sure we handle the irq_happened flags
properly. First, we hard-disable before the off-line loop. Once we have
hard-disabled, the irq_happened flags can't change underneath us. We
unconditionally clear the DEC and HMI flags: there is no processing of
timer interrupts while off-line, and the necessary HMI processing is all
done in lower-level code. We leave the EE and DBELL flags alone for the
first iteration of the loop, so that we won't fail to respond to a
split-core request that came in just before hard-disabling. Within the
loop, we handle external interrupts if the EE bit is set in irq_happened
as well as if the low-power state was interrupted by an external
interrupt. (We don't need to do the msgclr for a pending doorbell in
irq_happened, because doorbells are edge-triggered and don't remain
pending in hardware.) Then we clear both the EE and DBELL flags, and
once clear, they cannot be set again (until this CPU comes online again,
that is).

This also fixes the debug check to not be done when we just ran a KVM
guest or when the sleep didn't happen because of a pending event in
irq_happened.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-21 20:52:49 +11:00
Gavin Shan 353169acf1 powerpc/eeh: Fix recursive fenced PHB on Broadcom shiner adapter
Similar to commit b6541db ("powerpc/eeh: Block PCI config access
upon frozen PE"), this blocks the PCI config space of Broadcom
Shiner adapter until PE reset is completed, to avoid recursive
fenced PHB when dumping PCI config registers during the period
of error recovery.

   ~# lspci -ns 0003:03:00.0
   0003:03:00.0 0200: 14e4:168a (rev 10)
   ~# lspci -s 0003:03:00.0
   0003:03:00.0 Ethernet controller: Broadcom Corporation \
                NetXtreme II BCM57800 1/10 Gigabit Ethernet (rev 10)

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-21 20:42:16 +11:00
Gavin Shan f9433718d6 powerpc/powernv: Simplify pnv_eeh_set_option()
This simplifies pnv_eeh_set_option() to avoid unnecessary nested
if statements, to improve readability. No functional changes.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-21 20:42:15 +11:00
Gavin Shan 4d6186ca6f powerpc/powernv: Remove pnv_eeh_cap_start()
This moves the logic of pnv_eeh_cap_start() to pnv_eeh_find_cap()
as the function is only called by pnv_eeh_find_cap(). The logic
of both functions are pretty simple. No need to have separate
functions.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-21 20:42:15 +11:00
Gavin Shan 608fb9c296 powerpc/powernv: Cleanup on EEH comments
This applies cleanup on eeh-powernv.c, no functional changes:

   * Remove unnecessary comments and empty line.
   * Correct inaccurate comments.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-21 20:42:15 +11:00
Gavin Shan 00ba05a12b powerpc/pseries: Cleanup on pseries_eeh_get_state()
This cleans up pseries_eeh_get_state(), no functional changes:

   * Return EEH_STATE_NOT_SUPPORT early when the 2nd RTAS output
     argument is zero to avoid nested if statements.
   * Skip clearing bits in the PE state represented by variable
     "result" to simplify the code.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-21 20:42:14 +11:00
Michael Ellerman bed08b7e1f powerpc/cell: Drop CONFIG_TUNE_CELL in favour of CONFIG_CELL_CPU
The TUNE_CELL option allows you to build a kernel that runs on multiple
CPUs but is tuned (ie. optimised) to run on Cell CPUs. Now days no one
is building a distro in that fashion, and any users who are building
custom kernels for their Cell machines are better off building with
CONFIG_CELL_CPU, which builds a kernel that only runs on Cell and
therefore can be optimised even more aggresively.

Dropping the option also avoids confusing other users, who are presented
with an option to tune for Cell when they are not building for a Cell
CPU at all.

Suggested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-19 19:51:18 +11:00
Scott Wood 1112450a18 powerpc/85xx: Don't use generic timebase sync on 64-bit
85xx currently uses the generic timebase sync mechanism when
CONFIG_KEXEC is enabled, because 32-bit 85xx kexec support does a hard
reset of each core.  64-bit 85xx kexec does not do this, so we neither
need nor want this (nor is the generic timebase sync code built on
ppc64).

FWIW, I don't like the fact that the hard reset is done on 32-bit
kexec, and I especially don't like the timebase sync being triggered
only on the presence of CONFIG_KEXEC rather than actually booting in
that environment, but that's beyond the scope of this patch...

Signed-off-by: Scott Wood <scottwood@freescale.com>
2015-10-17 00:36:36 -05:00
Scott Wood 9f640bf532 powerpc/fsl-corenet: Disable coreint if kexec is enabled
Problems have been observed in coreint (EPR) mode if interrupts are
left pending (due to the lack of device quiescence with kdump) after
having tried to deliver to a CPU but unable to deliver due to MSR[EE]
-- interrupts no longer get reliably delivered in the new kernel.  I
tried various ways of fixing it up inside the crash kernel itself, and
none worked (including resetting the entire mpic).  Masking all
interrupts and issuing EOIs in the crashing kernel did help a lot of
the time, but the behavior was not consistent.

Thus, stick to standard IACK mode when kdump is a possibility.

Signed-off-by: Scott Wood <scottwood@freescale.com>
2015-10-17 00:36:36 -05:00
Scott Wood 01c593d749 powerpc/fsl-booke-64: Allow booting from the secondary thread
This allows SMP kernels to work as kdump crash kernels.  While crash
kernels don't really need to be SMP, this prevents things from breaking
if a user does it anyway (which is not something you want to only find
out once the main kernel has crashed in the field, especially if
whether it works or not depends on which cpu crashed).

Signed-off-by: Scott Wood <scottwood@freescale.com>
2015-10-17 00:36:35 -05:00
Andy Fleming c383ee84e1 powerpc/85xx: Add support for Varisys Cyrus board
This board uses a P5020 chip, and boots just fine using
the corenet_generic code. The device tree is very similar to the
P5020DS, except that there is no Flash memory. The environment is,
instead, stored on an MMC card on the motherboard.

Signed-off-by: Andy Fleming <afleming@gmail.com>
[scottwood: fixed trailing whitespace]
Signed-off-by: Scott Wood <scottwood@freescale.com>
2015-10-17 00:36:33 -05:00
chenhui zhao 881ea7d3f5 powerpc/corenet: use the mixed mode of MPIC when enabling CPU hotplug
Core reset may cause issue if using the proxy mode of MPIC.
Use the mixed mode of MPIC if enabling CPU hotplug.

Signed-off-by: Chenhui Zhao <chenhui.zhao@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2015-10-16 18:44:27 -05:00
Michael Ellerman f7688056cb powerpc/pseries: Drop always true CONFIG_PSERIES_MSI
Now that pseries selects PCI_MSI && PCI, EEH will always be true, and
therefore CONFIG_PSERIES_MSI will always be true. So drop it, and move
msi.o to obj-y.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-15 20:31:56 +11:00
Michael Ellerman 44f2aecfd0 powerpc/pseries: Move PCI objects to obj-y
Make it entirely clear in the Makefile that we always build the pci
related files by moving them to obj-y.

Note that CONFIG_EEH is now always enabled on pseries, because it
depends on PSERIES && PCI.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-15 20:31:56 +11:00
Michael Ellerman 84eb9e612b powerpc/pseries: Remove use of CONFIG_PCI
Now that we always have CONFIG_PCI=y for pseries, we can stop guarding
code with CONFIG_PCI ifdefs.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-15 20:31:55 +11:00
Michael Ellerman 4c9cd468b3 powerpc/pseries: Make PCI non-optional
The pseries build with PCI=n looks to have been broken for at least 5
years, and no one's noticed or cared.

Following the obvious breakages backward, the first commit I can find
that builds is the parent of 2eb4afb69f ("powerpc/pci: Move pseries
code into pseries platform specific area") from April 2009.

A distro would never ship a PCI=n kernel, so it is only useful for folks
building custom kernels. Also on KVM the virtio devices appear on PCI,
so it would only be useful if you were building kernels specifically to
run on PowerVM and with no PCI devices.

The added code complexity, and testing load (which we've clearly not
been doing), is not justified by the small reduction in kernel size for
such a niche use case.

So just make PCI non-optional on pseries.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-15 20:31:54 +11:00
Marc Zyngier 5d4c9bc776 irqdomain: Use irq_domain_get_of_node() instead of direct field access
The struct irq_domain contains a "struct device_node *" field
(of_node) that is almost the only link between the irqdomain
and the device tree infrastructure.

In order to prepare for the removal of that field, convert all
users to use irq_domain_get_of_node() instead.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-and-tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: <linux-arm-kernel@lists.infradead.org>
Cc: Tomasz Nowicki <tomasz.nowicki@linaro.org>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Graeme Gregory <graeme@xora.org.uk>
Cc: Jake Oshins <jakeo@microsoft.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Link: http://lkml.kernel.org/r/1444737105-31573-2-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-10-13 19:01:23 +02:00
Daniel Axtens f2dd80ecca powerpc/powernv: Panic on unhandled Machine Check
All unrecovered machine check errors on PowerNV should cause an
immediate panic. There are 2 reasons that this is the right policy:
it's not safe to continue, and we're already trying to reboot.

Firstly, if we go through the recovery process and do not successfully
recover, we can't be sure about the state of the machine, and it is
not safe to recover and proceed.

Linux knows about the following sources of Machine Check Errors:
- Uncorrectable Errors (UE)
- Effective - Real Address Translation (ERAT)
- Segment Lookaside Buffer (SLB)
- Translation Lookaside Buffer (TLB)
- Unknown/Unrecognised

In the SLB, TLB and ERAT cases, we can further categorise these as
parity errors, multihit errors or unknown/unrecognised.

We can handle SLB errors by flushing and reloading the SLB. We can
handle TLB and ERAT multihit errors by flushing the TLB. (It appears
we may not handle TLB and ERAT parity errors: I will investigate
further and send a followup patch if appropriate.)

This leaves us with uncorrectable errors. Uncorrectable errors are
usually the result of ECC memory detecting an error that it cannot
correct, but they also crop up in the context of PCI cards failing
during DMA writes, and during CAPI error events.

There are several types of UE, and there are 3 places a UE can occur:
Skiboot, the kernel, and userspace. For Skiboot errors, we have the
facility to make some recoverable. For userspace, we can simply kill
(SIGBUS) the affected process. We have no meaningful way to deal with
UEs in kernel space or in unrecoverable sections of Skiboot.

Currently, these unrecovered UEs fall through to
machine_check_expection() in traps.c, which calls die(), which OOPSes
and sends SIGBUS to the process. This sometimes allows us to stumble
onwards. For example we've seen UEs kill the kernel eehd and
khugepaged. However, the process killed could have held a lock, or it
could have been a more important process, etc: we can no longer make
any assertions about the state of the machine. Similarly if we see a
UE in skiboot (and again we've seen this happen), we're not in a
position where we can make any assertions about the state of the
machine.

Likewise, for unknown or unrecognised errors, we're not able to say
anything about the state of the machine.

Therefore, if we have an unrecovered MCE, the most appropriate thing
to do is to panic.

The second reason is that since e784b6499d ("powerpc/powernv: Invoke
opal_cec_reboot2() on unrecoverable machine check errors."), we
attempt a special OPAL reboot on an unhandled MCE. This is so the
hardware can record error data for later debugging.

The comments in that commit assert that we are heading down the panic
path anyway. At the moment this is not always true. With UEs in kernel
space, for instance, they are marked as recoverable by the hardware,
so if the attempt to reboot failed (e.g. old Skiboot), we wouldn't
panic() but would simply die() and OOPS. It doesn't make sense to be
staggering on if we've just tried to reboot: we should panic().

Explicitly panic() on unrecovered MCEs on PowerNV.
Update the comments appropriately.

This fixes some hangs following EEH events on cxlflash setups.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-09 08:07:19 +11:00
Colin Ian King c13e1c05b2 powerpc/pseries/hvcserver: don't memset pi_buff if it is null
pi_buff is being memset before it is sanity checked. Move the
memset after the null pi_buff sanity check to avoid an oops.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-09 08:03:03 +11:00
Samuel Mendoza-Jonas 1b70386c99 powerpc/kexec: Wait 1s for secondaries to enter OPAL
Always include a timeout when waiting for secondary cpus to enter OPAL
in the kexec path, rather than only when crashing.

Signed-off-by: Samuel Mendoza-Jonas <sam.mj@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-06 21:57:42 +11:00
Andy Shevchenko 06bacefcbd powerpc/pseries: re-use code from of_helpers module
The derive_parent() has similar semantics to what we have in newly introduced
of_helpers module. The replacement reduces code base and propagates the actual
error code to the caller.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-05 21:11:28 +11:00
Andy Shevchenko a46d988409 powerpc/pseries: handle nodes without '/'
In case we have node without '/' strrchr() returns NULL which might lead to
crash. Replace strrchr() by kbasename() and modify condition to avoid such
behaviour.

Suggested-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-05 21:11:27 +11:00
Andy Shevchenko a030e1e4bb powerpc/pseries: replace kmalloc + strlcpy
The helper kstrndup() will do the same in one line.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-05 21:11:26 +11:00
Andy Shevchenko dc85aaed64 powerpc/pseries: fix a potential memory leak
In case we have a full node name like /foo/bar and /foo is not found the
parent_path left unfreed. So, free a memory before return to a caller.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-05 21:11:25 +11:00
Andy Shevchenko 948ad1acaf powerpc/pseries: extract of_helpers module
Extract a new module to share the code between other modules.

There is no functional change.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-05 21:11:24 +11:00
Michael Ellerman 4fa9a3f6b6 powerpc/ps3: Remove unused os_area_db_id_video_mode
This struct is unused, which is now a build error with gcc 6:

  error: 'os_area_db_id_video_mode' defined but not used

There doesn't seem to be any good reason to keep it around so remove it,
it's in the history if anyone needs it.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-09-30 10:33:02 +10:00
Boqun Feng e5e16d8f3e powerpc: Kconfig: remove BE-only platforms from LE kernel build
Currently, little endian is only supported on powernv and pseries,
however, Kconfigs still allow us to include other platforms in a LE
kernel, this may result in space wasting or even build error if some
BE-only platforms always assume they are built for a BE kernel. So just
modify the Kconfigs of BE-only platforms to remove them from being built
for a LE kernel.

For 32bit only platforms, nothing needs to be done, because
CPU_LITTLE_ENDIAN depends on PPC64. For 64bit supported platforms, add
CPU_BIG_ENDIAN to dependencies explicitly, so that these platforms will
be disabled for LE [Suggested-by: Cédric Le Goater <clg@fr.ibm.com>].

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Acked-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-09-29 22:57:00 +10:00
Linus Torvalds fadb97b089 Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
 "This is a rather large update post rc1 due to the final steps of
  cleanups and API changes which had to wait for the preparatory patches
  to hit your tree.

   - Regression fixes for ARM GIC irqchips

   - Regression fixes and lockdep anotations for renesas irq chips

   - The leftovers of the cleanup and preparatory patches which have
     been ignored by maintainers

   - Final conversions of the newly merged users of obsolete APIs

   - Final removal of obsolete APIs

   - Final removal of ARM artifacts which had been introduced during the
     conversion of ARM to the generic interrupt code.

   - Final split of the irq_data into chip specific and common data to
     reflect the needs of hierarchical irq domains.

   - Treewide removal of the first argument of interrupt flow handlers,
     i.e. the irq number, which is not used by the majority of handlers
     and simple to retrieve from the other argument the irq descriptor.

   - A few comment updates and build warning fixes"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits)
  arm64: Remove ununsed set_irq_flags
  ARM: Remove ununsed set_irq_flags
  sh: Kill off set_irq_flags usage
  irqchip: Kill off set_irq_flags usage
  gpu/drm: Kill off set_irq_flags usage
  genirq: Remove irq argument from irq flow handlers
  genirq: Move field 'msi_desc' from irq_data into irq_common_data
  genirq: Move field 'affinity' from irq_data into irq_common_data
  genirq: Move field 'handler_data' from irq_data into irq_common_data
  genirq: Move field 'node' from irq_data into irq_common_data
  irqchip/gic-v3: Use IRQD_FORWARDED_TO_VCPU flag
  irqchip/gic: Use IRQD_FORWARDED_TO_VCPU flag
  genirq: Provide IRQD_FORWARDED_TO_VCPU status flag
  genirq: Simplify irq_data_to_desc()
  genirq: Remove __irq_set_handler_locked()
  pinctrl/pistachio: Use irq_set_handler_locked
  gpio: vf610: Use irq_set_handler_locked
  powerpc/mpc8xx: Use irq_set_handler_locked()
  powerpc/ipic: Use irq_set_handler_locked()
  powerpc/cpm2: Use irq_set_handler_locked()
  ...
2015-09-18 08:11:42 -07:00
Linus Torvalds f240bdd2a5 powerpc fixes for 4.3
- Fix 32-bit TCE table init in kdump kernel from Nish
  - Fix kdump with non-power-of-2 crashkernel= from Nish
  - Abort cxl_pci_enable_device_hook() if PCI channel is offline from Andrew
  - Fix to release DRC when configure_connector() fails from Bharata
  - Wire up sys_userfaultfd()
  - Fix race condition in tearing down MSI interrupts from Paul
  - Fix unbalanced pci_dev_get() in cxl_probe() from Daniel
  - Fix cxl build failure due to -Wunused-variable gcc behaviour change from Ian
  - Tell the toolchain to use ABI v2 when building an LE boot wrapper from Benh
  - Fix THP to recompute hash value after a failed update from Aneesh
  - 32-bit memcpy/memset: only use dcbz once cache is enabled from Christophe
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJV+h5EAAoJEFHr6jzI4aWADboP/3jc6vUtOmFZ1GBGwXrftOc0
 PGtkTtYnGbaNsp+ZBl39Y+CFsSfhFVaDgylm/8G5NMKZaSBVdcFJwfU1w6Ymn49R
 nZkAT5PC9KgT5RTuRTZ3DO/Y2RC9vg2T9pXjEn8NGYcV8GgUkc3dZAn48S3AFgnV
 4jQI5sbxvwU12XkCUn+DkETh13g3gLYtRxwehBu/S/ovED5iNHKJwnXRzxyAA969
 dARNriSeyLBVMLamJ+rJB1S5hVTZTMbughFVVFbgriyIGuC/C1g9b9GN86dCGS6w
 T6VrKveK/iVCLUB16KV8+inbfvUrXItOxhGJWPHw9uAJGLZTz5G+yLHRRPX8onyC
 pgDesJDDpP/P7sAnKto3tF1Vzi7lwVtVPC1dT1Fc9VAWJGPYC/d16EKGIpNqqlnc
 mAIJ7wcI5c/HxvqXR2rdRV6fMer+aY7utwMsh4o/gDs7ArQUcuCrOKSW0jvHGmyr
 MumARXnUGDwPnGD8IfYI2vDOvwisv9g6XACwsM+pi499SfowaiLuK3utsMccagGZ
 INFtaqS7gpcj8TTu3kymw1TbKW/tqG9T81RRZ0rFH1q3aSfvwQ9QvsXctSeqOl/n
 lkxR3Mk0CT0oupXNKV6pjqsRwLroA0AF5tGxuw4ap1Rt4i8G7WHaPgRp7SPZxtkR
 qVBryfK5bKqYtWp4ztVK
 =Dgn5
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:

 - Fix 32-bit TCE table init in kdump kernel from Nish

 - Fix kdump with non-power-of-2 crashkernel= from Nish

 - Abort cxl_pci_enable_device_hook() if PCI channel is offline from
   Andrew

 - Fix to release DRC when configure_connector() fails from Bharata

 - Wire up sys_userfaultfd()

 - Fix race condition in tearing down MSI interrupts from Paul

 - Fix unbalanced pci_dev_get() in cxl_probe() from Daniel

 - Fix cxl build failure due to -Wunused-variable gcc behaviour change
   from Ian

 - Tell the toolchain to use ABI v2 when building an LE boot wrapper
   from Benh

 - Fix THP to recompute hash value after a failed update from Aneesh

 - 32-bit memcpy/memset: only use dcbz once cache is enabled from
   Christophe

* tag 'powerpc-4.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc32: memset: only use dcbz once cache is enabled
  powerpc32: memcpy: only use dcbz once cache is enabled
  powerpc/mm: Recompute hash value after a failed update
  powerpc/boot: Specify ABI v2 when building an LE boot wrapper
  cxl: Fix build failure due to -Wunused-variable behaviour change
  cxl: Fix unbalanced pci_dev_get in cxl_probe
  powerpc/MSI: Fix race condition in tearing down MSI interrupts
  powerpc: Wire up sys_userfaultfd()
  powerpc/pseries: Release DRC when configure_connector fails
  cxl: abort cxl_pci_enable_device_hook() if PCI channel is offline
  powerpc/powernv/pci-ioda: fix kdump with non-power-of-2 crashkernel=
  powerpc/powernv/pci-ioda: fix 32-bit TCE table init in kdump kernel
2015-09-18 08:01:06 -07:00
Marc Zyngier 705a7b474e powerpc/PCI: Fix lookup of linux,pci-probe-only property
When find_and_init_phbs() looks for the probe-only property, it seems to
trust the firmware to be correctly written, and assumes that there is a
parameter to the property.

It is conceivable that the firmware could not be that perfect, and it could
expose this property naked (at least one arm64 platform seems to exhibit
this exact behaviour).  The setup code the ends up making a decision based
on whatever the property pointer points to, which is likely to be junk.

Instead, switch to the common of_pci.c implementation that doesn't suffer
from this problem and ignore the property if the firmware couldn't make up
its mind.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
2015-09-17 12:21:02 -05:00
Thomas Gleixner bd0b9ac405 genirq: Remove irq argument from irq flow handlers
Most interrupt flow handlers do not use the irq argument. Those few
which use it can retrieve the irq number from the irq descriptor.

Remove the argument.

Search and replace was done with coccinelle and some extra helper
scripts around it. Thanks to Julia for her help!

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Julia Lawall <Julia.Lawall@lip6.fr>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
2015-09-16 15:47:51 +02:00
Thomas Gleixner 6b83bd9414 powerpc/mpc52xx: Use irq_set_handler_locked()
Use irq_set_handler_locked() as it avoids a redundant lookup of the
irq descriptor.

Search and replacement was done with coccinelle:

@@
struct irq_data *d;
expression E1;
@@

-__irq_set_handler_locked(d->irq, E1);
+irq_set_handler_locked(d, E1);

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Julia Lawall <julia.lawall@lip6.fr>
Cc: Anatolij Gustschin <agust@denx.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: linuxppc-dev@lists.ozlabs.org
2015-09-16 15:43:10 +02:00
Thomas Gleixner 391de7f9ef powerpc/cell: Prepare irq handler for irq argument removal
The irq argument of most interrupt flow handlers is unused or merily
used instead of a local variable. The handlers which need the irq
argument can retrieve the irq number from the irq descriptor.
    
Search and update was done with coccinelle and the invaluable help of
Julia Lawall.
    
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Julia Lawall <Julia.Lawall@lip6.fr>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: linuxppc-dev@lists.ozlabs.org
2015-09-14 10:30:05 +02:00
Thomas Gleixner 0a0dbd9258 powerpc/85xx: Prepare irq handlers for irq argument removal
The irq argument of most interrupt flow handlers is unused or merily
used instead of a local variable. The handlers which need the irq
argument can retrieve the irq number from the irq descriptor.
    
Search and update was done with coccinelle and the invaluable help of
Julia Lawall.
    
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Julia Lawall <Julia.Lawall@lip6.fr>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Scott Wood <scottwood@freescale.com>
Cc: linuxppc-dev@lists.ozlabs.org
2015-09-14 10:30:05 +02:00
Thomas Gleixner 5aac2d3368 powerpc/mpc5121_ads_cpld: Prepare irq handler for irq argument removal
The irq argument of most interrupt flow handlers is unused or merily
used instead of a local variable. The handlers which need the irq
argument can retrieve the irq number from the irq descriptor.
    
Search and update was done with coccinelle and the invaluable help of
Julia Lawall.
    
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Julia Lawall <Julia.Lawall@lip6.fr>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Anatolij Gustschin <agust@denx.de>
Cc: linuxppc-dev@lists.ozlabs.org
2015-09-14 10:30:04 +02:00
Paul Mackerras e297c939b7 powerpc/MSI: Fix race condition in tearing down MSI interrupts
This fixes a race which can result in the same virtual IRQ number
being assigned to two different MSI interrupts.  The most visible
consequence of that is usually a warning and stack trace from the
sysfs code about an attempt to create a duplicate entry in sysfs.

The race happens when one CPU (say CPU 0) is disposing of an MSI
while another CPU (say CPU 1) is setting up an MSI.  CPU 0 calls
(for example) pnv_teardown_msi_irqs(), which calls
msi_bitmap_free_hwirqs() to indicate that the MSI (i.e. its
hardware IRQ number) is no longer in use.  Then, before CPU 0 gets
to calling irq_dispose_mapping() to free up the virtal IRQ number,
CPU 1 comes in and calls msi_bitmap_alloc_hwirqs() to allocate an
MSI, and gets the same hardware IRQ number that CPU 0 just freed.
CPU 1 then calls irq_create_mapping() to get a virtual IRQ number,
which sees that there is currently a mapping for that hardware IRQ
number and returns the corresponding virtual IRQ number (which is
the same virtual IRQ number that CPU 0 was using).  CPU 0 then
calls irq_dispose_mapping() and frees that virtual IRQ number.
Now, if another CPU comes along and calls irq_create_mapping(), it
is likely to get the virtual IRQ number that was just freed,
resulting in the same virtual IRQ number apparently being used for
two different hardware interrupts.

To fix this race, we just move the call to msi_bitmap_free_hwirqs()
to after the call to irq_dispose_mapping().  Since virq_to_hw()
doesn't work for the virtual IRQ number after irq_dispose_mapping()
has been called, we need to call it before irq_dispose_mapping() and
remember the result for the msi_bitmap_free_hwirqs() call.

The pattern of calling msi_bitmap_free_hwirqs() before
irq_dispose_mapping() appears in 5 places under arch/powerpc, and
appears to have originated in commit 05af7bd2d7 ("[POWERPC] MPIC
U3/U4 MSI backend") from 2007.

Fixes: 05af7bd2d7 ("[POWERPC] MPIC U3/U4 MSI backend")
Cc: stable@vger.kernel.org # v2.6.22+
Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-09-10 17:27:08 +10:00
Vlastimil Babka 96db800f5d mm: rename alloc_pages_exact_node() to __alloc_pages_node()
alloc_pages_exact_node() was introduced in commit 6484eb3e2a ("page
allocator: do not check NUMA node ID when the caller knows the node is
valid") as an optimized variant of alloc_pages_node(), that doesn't
fallback to current node for nid == NUMA_NO_NODE.  Unfortunately the
name of the function can easily suggest that the allocation is
restricted to the given node and fails otherwise.  In truth, the node is
only preferred, unless __GFP_THISNODE is passed among the gfp flags.

The misleading name has lead to mistakes in the past, see for example
commits 5265047ac3 ("mm, thp: really limit transparent hugepage
allocation to local node") and b360edb43f ("mm, mempolicy:
migrate_to_node should only migrate to node").

Another issue with the name is that there's a family of
alloc_pages_exact*() functions where 'exact' means exact size (instead
of page order), which leads to more confusion.

To prevent further mistakes, this patch effectively renames
alloc_pages_exact_node() to __alloc_pages_node() to better convey that
it's an optimized variant of alloc_pages_node() not intended for general
usage.  Both functions get described in comments.

It has been also considered to really provide a convenience function for
allocations restricted to a node, but the major opinion seems to be that
__GFP_THISNODE already provides that functionality and we shouldn't
duplicate the API needlessly.  The number of users would be small
anyway.

Existing callers of alloc_pages_exact_node() are simply converted to
call __alloc_pages_node(), with the exception of sba_alloc_coherent()
which open-codes the check for NUMA_NO_NODE, so it is converted to use
alloc_pages_node() instead.  This means it no longer performs some
VM_BUG_ON checks, and since the current check for nid in
alloc_pages_node() uses a 'nid < 0' comparison (which includes
NUMA_NO_NODE), it may hide wrong values which would be previously
exposed.

Both differences will be rectified by the next patch.

To sum up, this patch makes no functional changes, except temporarily
hiding potentially buggy callers.  Restricting the checks in
alloc_pages_node() is left for the next patch which can in turn expose
more existing buggy callers.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Robin Holt <robinmholt@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Christoph Lameter <cl@linux.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: Mel Gorman <mgorman@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Cliff Whickman <cpw@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
Bharata B Rao daebaabb5c powerpc/pseries: Release DRC when configure_connector fails
Commit f32393c943 ("powerpc/pseries: Correct cpu affinity for
dlpar added cpus") moved dlpar_acquire_drc() call to before
dlpar_configure_connector() call in dlpar_cpu_probe(), but missed
to release the DRC if dlpar_configure_connector() failed.
During CPU hotplug, if configure-connector fails for any reason,
then this will result in subsequent CPU hotplug attempts to fail.

Release the acquired DRC if dlpar_configure_connector() call fails
so that the DRC is left in right isolation and allocation state
for the subsequent hotplug operation to succeed.

Fixes: f32393c943 ("powerpc/pseries: Correct cpu affinity for dlpar added cpus")
Cc: stable@vger.kernel.org # 4.1+
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-09-08 14:07:58 +10:00
Nishanth Aravamudan fa14486979 powerpc/powernv/pci-ioda: fix kdump with non-power-of-2 crashkernel=
The 32-bit TCE table initialization relies on the DMA window having a
size equal to a power of 2 (and checks for it explicitly). But
crashkernel= has no constraint that requires a power-of-2 be specified.
This causes the kdump kernel to fail to boot as none of the PCI devices
(including the disk controller) are successfully initialized.

After this change, the PCI devices successfully set up the 32-bit TCE
table and kdump succeeds.

Fixes: aca6913f55 ("powerpc/powernv/ioda2: Introduce helpers to allocate TCE pages")
Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org # 4.2
Tested-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-09-07 20:14:18 +10:00