Commit Graph

3311 Commits

Author SHA1 Message Date
Aneesh Kumar K.V 34fbadd8e9 powerpc/mm: Move pte related functions together
Only code movement. No functionality change.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-01 18:32:51 +10:00
Aneesh Kumar K.V aba480e137 powerpc/mm: Move page table index and and vaddr to pgtable.h
Now that the page table size is a variable, we can move these to
generic pgtable.h.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-01 18:32:50 +10:00
Aneesh Kumar K.V dd1842a2a4 powerpc/mm: Make page table size a variable
Radix and hash MMU models support different page table sizes. Make
the #defines a variable so that existing code can work with variable
sizes.

Slice related code is only used by hash, so use hash constants there. We
will replicate some of the boundary conditions with resepct to TASK_SIZE
using radix values too. Right now we do boundary condition check using
hash constants.

Swapper pgdir size is initialized in asm code. We select the max pgd
size to keep it simple. For now we select hash pgdir. When adding radix
we will switch that to radix pgdir which is 64K.

BUILD_BUG_ON check which is removed is already done in hugepage_init()
using MAYBE_BUILD_BUG_ON().

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-01 18:32:48 +10:00
Aneesh Kumar K.V 13f829a5a1 powerpc/mm: Move pte accessors that operate on common pte bits to pgtable.h
These pte functions will remain the same between radix and hash. Move
them to pgtable.h.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-01 18:32:47 +10:00
Aneesh Kumar K.V 2e8735198a powerpc/mm: Move common pte bits and accessors to book3s/64/pgtable.h
Now that we have moved book3s hash64 Linux pte bits to match Power ISA
3.0 radix pte bit positions, we move the matching pte bits to a common
header.

Only code movement in this patch. No functionality change.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-01 18:32:46 +10:00
Aneesh Kumar K.V d2cf005038 powerpc/mm: Handle _PTE_NONE_MASK
I am splitting this as a separate patch to get better review. If ok
we should merge this with previous patch.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-01 18:32:45 +10:00
Aneesh Kumar K.V 945537df7a powerpc/mm/book3s: Rename hash specific PTE bits to carry H_ prefix
This helps to make following hash only pte bits easier.

We have kept _PAGE_CHG_MASK, _HPAGE_CHG_MASK and _PAGE_PROT_BITS as it
is in this patch eventhough they use hash specific bits. Using them in
radix as it is should be ok, because with radix we expect those bit
positions to be zero.

Only renames in this patch, no change in functionality.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-01 18:32:43 +10:00
Aneesh Kumar K.V 50de596de8 powerpc/mm/hash: Add support for Power9 Hash
PowerISA 3.0 adds a parition table indexed by LPID. Parition table
allows us to specify the MMU model that will be used for guest and host
translation.

This patch adds support with SLB based hash model (UPRT = 0). What is
required with this model is to support the new hash page table entry
format and also setup partition table such that we use hash table for
address translation.

We don't have segment table support yet.

In order to make sure we don't load KVM module on Power9 (since we don't
have kvm support yet) this patch also disables KVM on Power9.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-01 18:32:40 +10:00
Aneesh Kumar K.V e99833448c powerpc/mm/radix: Add partition table format & callback
Add structs and #defines related to the radix MMU partition table
format. We also add a ppc_md callback for updating a partition table
entry.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-01 18:32:39 +10:00
Aneesh Kumar K.V 11a6f6abd7 powerpc/mm: Move radix/hash common data structures to book3s64 headers
Start moving code that is generic between radix and hash to book3s64
specific headers from the book3s64 hash specific one.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-01 18:32:37 +10:00
Aneesh Kumar K.V 33d336d986 powerpc/mm: Use generic version of ptep_clear_flush_young()
The radix variant is going to require a flush_tlb_range(). With
flush_tlb_range() added, ptep_clear_flush_young() is the same as the
generic version. So drop the powerpc specific variant.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-01 18:32:36 +10:00
Aneesh Kumar K.V ff844b741e powerpc/mm: Use generic version of pmdp_clear_flush_young()
The radix variant is going to require a flush_pmd_tlb_range(). With
flush_pmd_tlb_range() added, pmdp_clear_flush_young() is the same as the
generic version. So drop the powerpc specific variant.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-01 18:32:35 +10:00
Aneesh Kumar K.V 30bda41aba powerpc/mm: Drop WIMG in favour of new constants
PowerISA 3.0 introduces two pte bits with the below meaning for radix:
  00 -> Normal Memory
  01 -> Strong Access Order (SAO)
  10 -> Non idempotent I/O (Cache inhibited and guarded)
  11 -> Tolerant I/O (Cache inhibited)

We drop the existing WIMG bits in the Linux page table in favour of the
above constants. We loose _PAGE_WRITETHRU with this conversion. We only
use writethru via pgprot_cached_wthru() which is used by
fbdev/controlfb.c which is Apple control display and also PPC32.

With respect to _PAGE_COHERENCE, we have been marking hpte always
coherent for some time now. htab_convert_pte_flags() always added
HPTE_R_M.

NOTE: KVM changes need closer review.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-01 18:32:33 +10:00
Aneesh Kumar K.V e58e87adc8 powerpc/mm: Update _PAGE_KERNEL_RO
PS3 had used a PPP bit hack to implement a read only mapping in the
kernel area. Since we are bolting the ioremap area, it used the pte
flags _PAGE_PRESENT | _PAGE_USER to get a PPP value of 0x3 there by
resulting in a read only mapping. This means the area can be accessed by
user space, but kernel will never return such an address to user space.

But we can do better by implementing a read only kernel mapping using
PPP bits 0b110.

This also allows us to do read only kernel mapping for radix in later
patches.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-01 18:32:30 +10:00
Aneesh Kumar K.V 96270b1fc2 powerpc/mm: Remove RPN_SHIFT and RPN_SIZE
PTE_RPN_SHIFT is actually page size dependent. Even though PowerISA 3.0
expects only the lower 12 bits to be zero, we will always find the pages
to be PAGE_SHIFT aligned. In case of hash config, this also allows us to
use the additional 3 bits to track pte specific information. We need
to make sure we use these bits only for hash specific pte flags.

For both 4K and 64K config, pte now can hold 57 bits address.

Inorder to keep things simpler, drop PTE_RPN_SHIFT and PTE_RPN_SIZE and
specify the 57 bit detail explicitly.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-01 18:32:29 +10:00
Aneesh Kumar K.V ac29c64089 powerpc/mm: Replace _PAGE_USER with _PAGE_PRIVILEGED
_PAGE_PRIVILEGED means the page can be accessed only by the kernel. This
is done to keep pte bits similar to PowerISA 3.0 Radix PTE format. User
pages are now marked by clearing _PAGE_PRIVILEGED bit.

Previously we allowed the kernel to have a privileged page in the lower
address range (USER_REGION). With this patch such access is denied.

We also prevent a kernel access to a non-privileged page in higher
address range (ie, REGION_ID != 0).

Both the above access scenarios should never happen.

Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Jeremy Kerr <jk@ozlabs.org>
Cc: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-01 18:32:26 +10:00
Aneesh Kumar K.V e7bfc462d3 powerpc/mm: Use pte_user() instead of open coding
We have a common declaration in pte-common.h Add a book3s specific one
and switch to pte_user() in callchain.c. In a subsequent patch we will
switch _PAGE_USER to _PAGE_PRIVILEGED in the book3s version only.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-01 18:32:25 +10:00
Michael Ellerman 7e1e63c5e9 powerpc/mm: Convert pte_user() to static inline
In a subsequent patch we want to add a second definition of pte_user().
Before we do that, make the signature clear, ie. it takes a pte_t and
returns bool.

We move it up inside the existing #ifndef __ASSEMBLY__ block, but
otherwise it's a straight conversion.

Convert the call in settlbcam(), which passes an unsigned long, to pass
a pte_t.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-01 18:32:24 +10:00
Aneesh Kumar K.V c7d54842de powerpc/mm: Use _PAGE_READ to indicate Read access
This splits the _PAGE_RW bit into _PAGE_READ and _PAGE_WRITE. It also
removes the dependency on _PAGE_USER for implying read only. Few things
to note here is that, we have read implied with write and execute
permission. Hence we should always find _PAGE_READ set on hash pte
fault.

We still can't switch PROT_NONE to !(_PAGE_RWX). Auto numa depends on
marking a prot none pte _PAGE_WRITE. (For more details look at
b191f9b106 "mm: numa: preserve PTE write permissions across a NUMA
hinting fault")

Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Jeremy Kerr <jk@ozlabs.org>
Cc: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-01 18:32:21 +10:00
Michael Ellerman ee3caed37d powerpc/mm: Use pte_raw() in pte_same()/pmd_same()
We can avoid doing endian conversions by using pte_raw() in pxx_same().
The swap of the constant (_PAGE_HPTEFLAGS) should be done at compile
time by the compiler.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-01 18:32:19 +10:00
Aneesh Kumar K.V 5dc1ef858c powerpc/mm: Use big endian Linux page tables for book3s 64
Traditionally Power server machines have used the Hashed Page Table MMU
mode. In this mode Linux manages its own tree of nested page tables,
aka. "the Linux page tables", which are not used by the hardware
directly, and software loads translations into the hash page table for
use by the hardware.

Power ISA 3.0 defines a new MMU mode, known as Radix Tree Translation,
where the hardware can directly operate on the Linux page tables.
However the hardware requires that the page tables be in big endian
format.

To accommodate this, switch the pgtable types to __be64 and add
appropriate endian conversions.

Because we will be supporting a single kernel binary that boots using
either radix or hash mode, we always store the Linux page tables big
endian, even in hash mode where they are not actually used by the
hardware.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
[mpe: Fix sparse errors, flesh out change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-01 18:32:18 +10:00
Michael Ellerman 3910a7f485 powerpc/mm: Add pte_xchg() helper
We have five locations in 64-bit hash MMU code that do a cmpxchg() of a
PTE. Currently doing it inline OK, but in a future patch we will be
converting the PTEs to __be64 in some configs. In that case we will need
casts at every cmpxchg() site in order to keep sparse happy.

So move the logic into a helper, this is a reasonably nice cleanup on
its own.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-01 18:32:16 +10:00
Aneesh Kumar K.V 4bece39b50 powerpc/mm: Drop PTE_ATOMIC_UPDATES from pmd_hugepage_update()
pmd_hugepage_update() is inside #ifdef CONFIG_TRANSPARENT_HUGEPAGE. THP
can only be enabled if PPC_BOOK3S_64=y && PPC_64K_PAGES=y, aka. hash64.

On hash64 we always define PTE_ATOMIC_UPDATES to 1, meaning the #ifdef
in pmd_hugepage_update() is unnecessary, so drop it.

That is also the only use of PTE_ATOMIC_UPDATES in any of the hash code,
meaning we no longer need to #define it at all in the hash headers.

Note it's still #defined and used in the nohash code.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-01 18:32:15 +10:00
Michael Ellerman 670eea9241 powerpc/mm: Always use STRICT_MM_TYPECHECKS
Testing done by Paul Mackerras has shown that with a modern compiler
there is no negative effect on code generation from enabling
STRICT_MM_TYPECHECKS.

So remove the option, and always use the strict type definitions.

Acked-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-01 18:32:14 +10:00
Rui Salvaterra d701cca674 powerpc: wire up preadv2 and pwritev2 syscalls
Wire up preadv2/pwritev2 in the same way as preadv/pwritev. Fixes two
build warnings on ppc64.

mpe: Lightly tested with fio (slightly hacked to add the syscall
wrappers):

  fio-4217  [009] ....  1304.635300: sys_preadv2(fd: 3, vec:
  10025821de0, vlen: 1, pos_l: 6253000, pos_h: 0, flags: 1)
  fio-4217  [009] ....  1304.635474: sys_preadv2 -> 0x1000

Signed-off-by: Rui Salvaterra <rsalvaterra@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-04-27 16:47:55 +10:00
Chris Smart 8a649045e7 powerpc: Add support for userspace P9 copy paste
The copy paste facility introduced in POWER9 provides an optimised
mechanism for a userspace application to copy a cacheline. This is
provided by a pair of instructions, copy and paste, while a third,
cp_abort (copy paste abort), provides a clean up of the state in case of
a failure.

The copy instruction will read a 128 byte cacheline and store it in an
internal buffer. The subsequent paste instruction will store this
internal buffer to memory and set a CR field if the paste succeeds.

Since the state of the copy paste buffer is internal (and not
architecturally visible), in the unlikely event of a context switch, the
state cannot be stored and the paste should therefore fail.

The cp_abort instruction exists to fail and clean up any such
interrupted copy paste sequence and is to be called by the kernel as
part of the context switch. Doing so prevents data from a preceding copy
in one process leaking into the paste of another.

This code enables use of the cp_abort instruction if a supported
processor is detected.

NOTE: this is for userspace only, not in kernel, and does not deal
with KVM guests.

Patch created with much assistance from Michael Neuling
<mikey@neuling.org>

Signed-off-by: Chris Smart <chris@distroguy.com>
Reviewed-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-04-27 09:28:07 +10:00
Anju T 1bfadabfeb powerpc/perf: Assign an id to each powerpc register
The enum definition assigns an 'id' to each register in "struct pt_regs"
of arch/powerpc. The order of these values in the enum definition are
based on the order of members in pt_regs.

Signed-off-by: Anju T <anju@linux.vnet.ibm.com>
[mpe: Rename LNK to LINK, use _UAPI_ASM for include guards]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-04-21 23:32:59 +10:00
Michael Ellerman 8404410b29 Merge branch 'topic/livepatch' into next
Merge the support for live patching on ppc64le using mprofile-kernel.
This branch has also been merged into the livepatching tree for v4.7.
2016-04-18 20:45:32 +10:00
Anton Blanchard 6997e57d69 powerpc: scan_features() updates incorrect bits for REAL_LE
The REAL_LE feature entry in the ibm_pa_feature struct is missing an MMU
feature value, meaning all the remaining elements initialise the wrong
values.

This means instead of checking for byte 5, bit 0, we check for byte 0,
bit 0, and then we incorrectly set the CPU feature bit as well as MMU
feature bit 1 and CPU user feature bits 0 and 2 (5).

Checking byte 0 bit 0 (IBM numbering), means we're looking at the
"Memory Management Unit (MMU)" feature - ie. does the CPU have an MMU.
In practice that bit is set on all platforms which have the property.

This means we set CPU_FTR_REAL_LE always. In practice that seems not to
matter because all the modern cpus which have this property also
implement REAL_LE, and we've never needed to disable it.

We're also incorrectly setting MMU feature bit 1, which is:

  #define MMU_FTR_TYPE_8xx		0x00000002

Luckily the only place that looks for MMU_FTR_TYPE_8xx is in Book3E
code, which can't run on the same cpus as scan_features(). So this also
doesn't matter in practice.

Finally in the CPU user feature mask, we're setting bits 0 and 2. Bit 2
is not currently used, and bit 0 is:

  #define PPC_FEATURE_PPC_LE		0x00000001

Which says the CPU supports the old style "PPC Little Endian" mode.
Again this should be harmless in practice as no 64-bit CPUs implement
that mode.

Fix the code by adding the missing initialisation of the MMU feature.

Also add a comment marking CPU user feature bit 2 (0x4) as reserved. It
would be unsafe to start using it as old kernels incorrectly set it.

Fixes: 44ae3ab335 ("powerpc: Free up some CPU feature bits by moving out MMU-related features")
Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: stable@vger.kernel.org
[mpe: Flesh out changelog, add comment reserving 0x4]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-04-18 20:08:38 +10:00
Jiri Kosina 4d4fb97a62 Merge branch 'topic/livepatch' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux into for-4.7/livepatching-ppc64le
Pull livepatching support for ppc64 architecture from Michael Ellerman.

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2016-04-15 11:42:51 +02:00
Michael Ellerman 5d31a96e6c powerpc/livepatch: Add livepatch stack to struct thread_info
In order to support live patching we need to maintain an alternate
stack of TOC & LR values. We use the base of the stack for this, and
store the "live patch stack pointer" in struct thread_info.

Unlike the other fields of thread_info, we can not statically initialise
that value, so it must be done at run time.

This patch just adds the code to support that, it is not enabled until
the next patch which actually adds live patch support.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Balbir Singh <bsingharora@gmail.com>
2016-04-14 15:47:06 +10:00
Michael Ellerman f63e6d8987 powerpc/livepatch: Add livepatch header
Add the powerpc specific livepatch definitions. In particular we provide
a non-default implementation of klp_get_ftrace_location().

This is required because the location of the mcount call is not constant
when using -mprofile-kernel (which we always do for live patching).

Signed-off-by: Torsten Duwe <duwe@suse.de>
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-04-14 15:47:06 +10:00
Philippe Bergheaud 86c9ffcc1e powerpc: Define PVR value for POWER8NVL processor
Signed-off-by: Philippe Bergheaud <felix@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-04-11 20:30:46 +10:00
Vipin K Parashar b3d79eaa6c powerpc/opal: Assign numbers to OPAL_MSG macros of enum opal_msg_type
This patch assigns numbers to OPAL_MSG macros of enum opal_msg_type
to prevent accidental insertion of any new value in between and thus
break OPAL API. This is also helpful while backporting mainline kernel
changes to distros which run downlevel kernel and thus don't have all
OPAL messages defined, avoiding unnecessary bugs due to enum values
order mismatch.

Signed-off-by: Vipin K Parashar <vipin@linux.vnet.ibm.com>
Acked-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-04-11 20:30:45 +10:00
Russell Currey e3824e4281 powerpc/swsusp: Only use tlbie in POWER4 mode
If CONFIG_HIBERNATION and CONFIG_PPC_BOOK3S_64 are set, code in
arch/powerpc/kernel/swsusp_amd64.S which uses the tlbia macro is enabled.
tlbia in turn uses tlbie, an instruction which takes more than one
operand in newer versions of POWER.  As such, the kernel fails to build
due to the assembler complaining about missing operands.

This can be worked around by assembling the instruction as in POWER4.
This fixes the build breakage caused by enabling CONFIG_HIBERNATION.
Hibernation is currently only tested on G5 PowerMacs, which should be
unaffected by this change.  For other platforms it may now build,
whether or not it works is a different story.

Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-04-11 20:30:44 +10:00
Simon Guo 71528d8bd7 powerpc: Correct used_vsr comment
The used_vsr flag is set if process has used VSX registers, not Altivec
registers. But the comment says otherwise, correct the comment.

Signed-off-by: Simon Guo <wei.guo.simon@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-29 12:08:08 +11:00
Linus Torvalds 643ad15d47 Merge branch 'mm-pkeys-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 protection key support from Ingo Molnar:
 "This tree adds support for a new memory protection hardware feature
  that is available in upcoming Intel CPUs: 'protection keys' (pkeys).

  There's a background article at LWN.net:

      https://lwn.net/Articles/643797/

  The gist is that protection keys allow the encoding of
  user-controllable permission masks in the pte.  So instead of having a
  fixed protection mask in the pte (which needs a system call to change
  and works on a per page basis), the user can map a (handful of)
  protection mask variants and can change the masks runtime relatively
  cheaply, without having to change every single page in the affected
  virtual memory range.

  This allows the dynamic switching of the protection bits of large
  amounts of virtual memory, via user-space instructions.  It also
  allows more precise control of MMU permission bits: for example the
  executable bit is separate from the read bit (see more about that
  below).

  This tree adds the MM infrastructure and low level x86 glue needed for
  that, plus it adds a high level API to make use of protection keys -
  if a user-space application calls:

        mmap(..., PROT_EXEC);

  or

        mprotect(ptr, sz, PROT_EXEC);

  (note PROT_EXEC-only, without PROT_READ/WRITE), the kernel will notice
  this special case, and will set a special protection key on this
  memory range.  It also sets the appropriate bits in the Protection
  Keys User Rights (PKRU) register so that the memory becomes unreadable
  and unwritable.

  So using protection keys the kernel is able to implement 'true'
  PROT_EXEC on x86 CPUs: without protection keys PROT_EXEC implies
  PROT_READ as well.  Unreadable executable mappings have security
  advantages: they cannot be read via information leaks to figure out
  ASLR details, nor can they be scanned for ROP gadgets - and they
  cannot be used by exploits for data purposes either.

  We know about no user-space code that relies on pure PROT_EXEC
  mappings today, but binary loaders could start making use of this new
  feature to map binaries and libraries in a more secure fashion.

  There is other pending pkeys work that offers more high level system
  call APIs to manage protection keys - but those are not part of this
  pull request.

  Right now there's a Kconfig that controls this feature
  (CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS) that is default enabled
  (like most x86 CPU feature enablement code that has no runtime
  overhead), but it's not user-configurable at the moment.  If there's
  any serious problem with this then we can make it configurable and/or
  flip the default"

* 'mm-pkeys-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (38 commits)
  x86/mm/pkeys: Fix mismerge of protection keys CPUID bits
  mm/pkeys: Fix siginfo ABI breakage caused by new u64 field
  x86/mm/pkeys: Fix access_error() denial of writes to write-only VMA
  mm/core, x86/mm/pkeys: Add execute-only protection keys support
  x86/mm/pkeys: Create an x86 arch_calc_vm_prot_bits() for VMA flags
  x86/mm/pkeys: Allow kernel to modify user pkey rights register
  x86/fpu: Allow setting of XSAVE state
  x86/mm: Factor out LDT init from context init
  mm/core, x86/mm/pkeys: Add arch_validate_pkey()
  mm/core, arch, powerpc: Pass a protection key in to calc_vm_flag_bits()
  x86/mm/pkeys: Actually enable Memory Protection Keys in the CPU
  x86/mm/pkeys: Add Kconfig prompt to existing config option
  x86/mm/pkeys: Dump pkey from VMA in /proc/pid/smaps
  x86/mm/pkeys: Dump PKRU with other kernel registers
  mm/core, x86/mm/pkeys: Differentiate instruction fetches
  x86/mm/pkeys: Optimize fault handling in access_error()
  mm/core: Do not enforce PKEY permissions on remote mm access
  um, pkeys: Add UML arch_*_access_permitted() methods
  mm/gup, x86/mm/pkeys: Check VMAs and PTEs for protection keys
  x86/mm/gup: Simplify get_user_pages() PTE bit handling
  ...
2016-03-20 19:08:56 -07:00
Linus Torvalds d5e2d00898 powerpc updates for 4.6
Highlights:
  - Restructure Linux PTE on Book3S/64 to Radix format from Paul Mackerras
  - Book3s 64 MMU cleanup in preparation for Radix MMU from Aneesh Kumar K.V
  - Add POWER9 cputable entry from Michael Neuling
  - FPU/Altivec/VSX save/restore optimisations from Cyril Bur
  - Add support for new ftrace ABI on ppc64le from Torsten Duwe
 
 Various cleanups & minor fixes from:
  - Adam Buchbinder, Andrew Donnellan, Balbir Singh, Christophe Leroy, Cyril
    Bur, Luis Henriques, Madhavan Srinivasan, Pan Xinhui, Russell Currey,
    Sukadev Bhattiprolu, Suraj Jitindar Singh.
 
 General:
  - atomics: Allow architectures to define their own __atomic_op_* helpers from
    Boqun Feng
  - Implement atomic{, 64}_*_return_* variants and acquire/release/relaxed
    variants for (cmp)xchg from Boqun Feng
  - Add powernv_defconfig from Jeremy Kerr
  - Fix BUG_ON() reporting in real mode from Balbir Singh
  - Add xmon command to dump OPAL msglog from Andrew Donnellan
  - Add xmon command to dump process/task similar to ps(1) from Douglas Miller
  - Clean up memory hotplug failure paths from David Gibson
 
 pci/eeh:
  - Redesign SR-IOV on PowerNV to give absolute isolation between VFs from Wei
    Yang.
  - EEH Support for SRIOV VFs from Wei Yang and Gavin Shan.
  - PCI/IOV: Rename and export virtfn_{add, remove} from Wei Yang
  - PCI: Add pcibios_bus_add_device() weak function from Wei Yang
  - MAINTAINERS: Update EEH details and maintainership from Russell Currey
 
 cxl:
  - Support added to the CXL driver for running on both bare-metal and
    hypervisor systems, from Christophe Lombard and Frederic Barrat.
  - Ignore probes for virtual afu pci devices from Vaibhav Jain
 
 perf:
  - Export Power8 generic and cache events to sysfs from Sukadev Bhattiprolu
  - hv-24x7: Fix usage with chip events, display change in counter values,
    display domain indices in sysfs, eliminate domain suffix in event names,
    from Sukadev Bhattiprolu
 
 Freescale:
  - Updates from Scott: "Highlights include 8xx optimizations, 32-bit checksum
    optimizations, 86xx consolidation, e5500/e6500 cpu hotplug, more fman and
    other dt bits, and minor fixes/cleanup."
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJW69OrAAoJEFHr6jzI4aWAe5EQAJw/hE6WBQc6a7Tj70AnXOqR
 qk/m5pZjuTwQxfBteIvHR1pE5eXdlvtAjcD254LVkFkAbIn19W/h2k0VX/nlee7P
 n/VRHRifjtGmukqHrPYJJ7ua9mNlY7pxh3leGSixBFASnSWqMxNNNziNQtSTcuCs
 TjHiw6NkZ/kzeunA4bAfE4yHVUZjmL74oiS9JbLyaVHqoW4fqWLlh26AKo2yYMZI
 qPicBBG4HBi3FGvoexnKxlJNdcV4HO7LzDjJmCSfUKYCJi+Pw19T5qmhso0q0qVz
 vHg/A8HNeG4Hn83pNVmLeQSAIQRZ3DvTtcLgbjPo+TVwm/hzrRRBWipTeOVbkLW8
 2bcOXT4t7LWUq15EAJ1LYgYZGzcLrfRfUeOcuQ1TWd3+PcfY9pE7FmizsxAAfaVe
 E9j9mpz4XnIqBtWkFHneTIHkQ5OWptyKuZJEaYH0nut4VsP0k8NarkseafGqBPu7
 5eG83gbiQbCVixfOgblV9eocJ29JcwpjPAY4CZSGJimShg909FV7WRgZgJkKWrbK
 dBRco8Jcp4VglGfo2qymv7Uj4KwQoypBREOhiKUvrAsVlDxPfx+bcskhjGu9xGDC
 xs/+nme0/lKa/wg5K4C3mQ1GAlkMWHI0ojhJjsyODbetup5UbkEu03wjAaTdO9dT
 Y6ptGm0rYAJluPNlziFj
 =qkAt
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:
 "This was delayed a day or two by some build-breakage on old toolchains
  which we've now fixed.

  There's two PCI commits both acked by Bjorn.

  There's one commit to mm/hugepage.c which is (co)authored by Kirill.

  Highlights:
   - Restructure Linux PTE on Book3S/64 to Radix format from Paul
     Mackerras
   - Book3s 64 MMU cleanup in preparation for Radix MMU from Aneesh
     Kumar K.V
   - Add POWER9 cputable entry from Michael Neuling
   - FPU/Altivec/VSX save/restore optimisations from Cyril Bur
   - Add support for new ftrace ABI on ppc64le from Torsten Duwe

  Various cleanups & minor fixes from:
   - Adam Buchbinder, Andrew Donnellan, Balbir Singh, Christophe Leroy,
     Cyril Bur, Luis Henriques, Madhavan Srinivasan, Pan Xinhui, Russell
     Currey, Sukadev Bhattiprolu, Suraj Jitindar Singh.

  General:
   - atomics: Allow architectures to define their own __atomic_op_*
     helpers from Boqun Feng
   - Implement atomic{, 64}_*_return_* variants and acquire/release/
     relaxed variants for (cmp)xchg from Boqun Feng
   - Add powernv_defconfig from Jeremy Kerr
   - Fix BUG_ON() reporting in real mode from Balbir Singh
   - Add xmon command to dump OPAL msglog from Andrew Donnellan
   - Add xmon command to dump process/task similar to ps(1) from Douglas
     Miller
   - Clean up memory hotplug failure paths from David Gibson

  pci/eeh:
   - Redesign SR-IOV on PowerNV to give absolute isolation between VFs
     from Wei Yang.
   - EEH Support for SRIOV VFs from Wei Yang and Gavin Shan.
   - PCI/IOV: Rename and export virtfn_{add, remove} from Wei Yang
   - PCI: Add pcibios_bus_add_device() weak function from Wei Yang
   - MAINTAINERS: Update EEH details and maintainership from Russell
     Currey

  cxl:
   - Support added to the CXL driver for running on both bare-metal and
     hypervisor systems, from Christophe Lombard and Frederic Barrat.
   - Ignore probes for virtual afu pci devices from Vaibhav Jain

  perf:
   - Export Power8 generic and cache events to sysfs from Sukadev
     Bhattiprolu
   - hv-24x7: Fix usage with chip events, display change in counter
     values, display domain indices in sysfs, eliminate domain suffix in
     event names, from Sukadev Bhattiprolu

  Freescale:
   - Updates from Scott: "Highlights include 8xx optimizations, 32-bit
     checksum optimizations, 86xx consolidation, e5500/e6500 cpu
     hotplug, more fman and other dt bits, and minor fixes/cleanup"

* tag 'powerpc-4.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (179 commits)
  powerpc: Fix unrecoverable SLB miss during restore_math()
  powerpc/8xx: Fix do_mtspr_cpu6() build on older compilers
  powerpc/rcpm: Fix build break when SMP=n
  powerpc/book3e-64: Use hardcoded mttmr opcode
  powerpc/fsl/dts: Add "jedec,spi-nor" flash compatible
  powerpc/T104xRDB: add tdm riser card node to device tree
  powerpc32: PAGE_EXEC required for inittext
  powerpc/mpc85xx: Add pcsphy nodes to FManV3 device tree
  powerpc/mpc85xx: Add MDIO bus muxing support to the board device tree(s)
  powerpc/86xx: Introduce and use common dtsi
  powerpc/86xx: Update device tree
  powerpc/86xx: Move dts files to fsl directory
  powerpc/86xx: Switch to kconfig fragments approach
  powerpc/86xx: Update defconfigs
  powerpc/86xx: Consolidate common platform code
  powerpc32: Remove one insn in mulhdu
  powerpc32: small optimisation in flush_icache_range()
  powerpc: Simplify test in __dma_sync()
  powerpc32: move xxxxx_dcache_range() functions inline
  powerpc32: Remove clear_pages() and define clear_page() inline
  ...
2016-03-19 15:38:41 -07:00
Linus Torvalds 1200b6809d Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
 "Highlights:

   1) Support more Realtek wireless chips, from Jes Sorenson.

   2) New BPF types for per-cpu hash and arrap maps, from Alexei
      Starovoitov.

   3) Make several TCP sysctls per-namespace, from Nikolay Borisov.

   4) Allow the use of SO_REUSEPORT in order to do per-thread processing
   of incoming TCP/UDP connections.  The muxing can be done using a
   BPF program which hashes the incoming packet.  From Craig Gallek.

   5) Add a multiplexer for TCP streams, to provide a messaged based
      interface.  BPF programs can be used to determine the message
      boundaries.  From Tom Herbert.

   6) Add 802.1AE MACSEC support, from Sabrina Dubroca.

   7) Avoid factorial complexity when taking down an inetdev interface
      with lots of configured addresses.  We were doing things like
      traversing the entire address less for each address removed, and
      flushing the entire netfilter conntrack table for every address as
      well.

   8) Add and use SKB bulk free infrastructure, from Jesper Brouer.

   9) Allow offloading u32 classifiers to hardware, and implement for
      ixgbe, from John Fastabend.

  10) Allow configuring IRQ coalescing parameters on a per-queue basis,
      from Kan Liang.

  11) Extend ethtool so that larger link mode masks can be supported.
      From David Decotigny.

  12) Introduce devlink, which can be used to configure port link types
      (ethernet vs Infiniband, etc.), port splitting, and switch device
      level attributes as a whole.  From Jiri Pirko.

  13) Hardware offload support for flower classifiers, from Amir Vadai.

  14) Add "Local Checksum Offload".  Basically, for a tunneled packet
      the checksum of the outer header is 'constant' (because with the
      checksum field filled into the inner protocol header, the payload
      of the outer frame checksums to 'zero'), and we can take advantage
      of that in various ways.  From Edward Cree"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1548 commits)
  bonding: fix bond_get_stats()
  net: bcmgenet: fix dma api length mismatch
  net/mlx4_core: Fix backward compatibility on VFs
  phy: mdio-thunder: Fix some Kconfig typos
  lan78xx: add ndo_get_stats64
  lan78xx: handle statistics counter rollover
  RDS: TCP: Remove unused constant
  RDS: TCP: Add sysctl tunables for sndbuf/rcvbuf on rds-tcp socket
  net: smc911x: convert pxa dma to dmaengine
  team: remove duplicate set of flag IFF_MULTICAST
  bonding: remove duplicate set of flag IFF_MULTICAST
  net: fix a comment typo
  ethernet: micrel: fix some error codes
  ip_tunnels, bpf: define IP_TUNNEL_OPTS_MAX and use it
  bpf, dst: add and use dst_tclassid helper
  bpf: make skb->tc_classid also readable
  net: mvneta: bm: clarify dependencies
  cls_bpf: reset class and reuse major in da
  ldmvsw: Checkpatch sunvnet.c and sunvnet_common.c
  ldmvsw: Add ldmvsw.c driver code
  ...
2016-03-19 10:05:34 -07:00
Linus Torvalds 1a46712aa9 This is the bulk of GPIO changes for kernel v4.6:
Core changes:
 
 - The gpio_chip is now a *real device*. Until now the gpio chips
   were just piggybacking the parent device or (gasp) floating in
   space outside of the device model. We now finally make GPIO chips
   devices. The gpio_chip will create a gpio_device which contains
   a struct device, and this gpio_device struct is kept private.
   Anything that needs to be kept private from the rest of the kernel
   will gradually be moved over to the gpio_device.
 
 - As a result of making the gpio_device a real device, we have added
   resource management, so devm_gpiochip_add_data() will cut down on
   overhead and reduce code lines. A huge slew of patches convert
   almost all drivers in the subsystem to use this.
 
 - Building on making the GPIO a real device, we add the first step
   of a new userspace ABI: the GPIO character device. We take small
   steps here, so we first add a pure *information* ABI and the tool
   "lsgpio" that will list all GPIO devices on the system and all
   lines on these devices. We can now discover GPIOs properly from
   userspace. We still have not come up with a way to actually *use*
   GPIOs from userspace.
 
 - To encourage people to use the character device for the future,
   we have it always-enabled when using GPIO. The old sysfs ABI is
   still opt-in (and can be used in parallel), but is marked as
   deprecated. We will keep it around for the foreseeable future,
   but it will not be extended to cover ever more use cases.
 
 Cleanup:
 
 - Bjorn Helgaas removed a whole slew of per-architecture <asm/gpio.h>
   includes. This dates back to when GPIO was an opt-in feature and
   no shared library even existed: just a header file with proper
   prototypes was provided and all semantics were up to the arch to
   implement. These patches make the GPIO chip even more a proper
   device and cleans out leftovers of the old in-kernel API here
   and there. Still some cruft is left but it's very little now.
 
 - There is still some clamping of return values for .get() going
   on, but we now return sane values in the vast majority of drivers
   and the errorpath is sanitized. Some patches for powerpc, blackfin
   and unicore still drop in.
 
 - We continue to switch the ARM, MIPS, blackfin, m68k local GPIO
   implementations to use gpiochip_add_data() and cut down on code
   lines.
 
 - MPC8xxx is converted to use the generic GPIO helpers.
 
 - ATH79 is converted to use the generic GPIO helpers.
 
 New drivers:
 
 - WinSystems WS16C48
 
 - Acces 104-DIO-48E
 
 - F81866 (a F7188x variant)
 
 - Qoric (a MPC8xxx variant)
 
 - TS-4800
 
 - SPI serializers (pisosr): simple 74xx shift registers connected
   to SPI to obtain a dirt-cheap output-only GPIO expander.
 
 - Texas Instruments TPIC2810
 
 - Texas Instruments TPS65218
 
 - Texas Instruments TPS65912
 
 - X-Gene (ARM64) standby GPIO controller
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJW6m24AAoJEEEQszewGV1zUasP/RpTrjRcNI5QFHjudd2oioDx
 R/IljC06Q072ZqVy/MR7QxwhoU8jUnCgKgv4rgMa1OcfHblxC2R1+YBKOUSij831
 E+SYmYDYmoMhN7j5Aslr66MXg1rLdFSdCZWemuyNruAK8bx6cTE1AWS8AELQzzTn
 Re/CPpCDbujLy0ZK2wJHgr9ZkdcBGICtDRCrOR3Kyjpwk/DSZcruK1PDN+VQMI3k
 bJlwgtGenOHINgCq/16edpwj/hzmoJXhTOZXJHI5XVR6czTwb3SvCYACvCkauI/a
 /N7b3quG88b5y0OPQPVxp5+VVl9GyVcv5oGzIfTNat/g5QinShZIT4kVV9r0xu6/
 TQHh1HlXleh+QI3yX0oRv9ztHreMf+vdpw1dhIwLqHqfJ7AWdOGk7BbKjwCrsOoq
 t/qUVFnyvooLpyr53Z5JY8+LqyynHF68G+jUQyHLgTZ0GCE+z+1jqNl1T501n3kv
 3CSlNYxSN/YUBN3cnroAIU/ZWcV4YRdxmOtEWP+7xgcdzTE6s/JHb2fuEfVHzWPf
 mHWtJGy8U0IR4VSSEln5RtjhRr0PAjTHeTOGAmivUnaIGDziTowyUVF+X5hwC77E
 DGTuLVx/Kniv173DK7xNAsUZNAETBa3fQZTgu+RfOpMiM1FZc7tI1rd7K7PjbyCc
 d2M0gcq+d11ITJTxC7OM
 =9AJ4
 -----END PGP SIGNATURE-----

Merge tag 'gpio-v4.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio

Pull GPIO updates from Linus Walleij:
 "This is the bulk of GPIO changes for kernel v4.6.  There is quite a
  lot of interesting stuff going on.

  The patches to other subsystems and arch-wide are ACKed as far as
  possible, though I consider things like per-arch <asm/gpio.h> as
  essentially a part of the GPIO subsystem so it should not be needed.

  Core changes:

   - The gpio_chip is now a *real device*.  Until now the gpio chips
     were just piggybacking the parent device or (gasp) floating in
     space outside of the device model.

     We now finally make GPIO chips devices.  The gpio_chip will create
     a gpio_device which contains a struct device, and this gpio_device
     struct is kept private.  Anything that needs to be kept private
     from the rest of the kernel will gradually be moved over to the
     gpio_device.

   - As a result of making the gpio_device a real device, we have added
     resource management, so devm_gpiochip_add_data() will cut down on
     overhead and reduce code lines.  A huge slew of patches convert
     almost all drivers in the subsystem to use this.

   - Building on making the GPIO a real device, we add the first step of
     a new userspace ABI: the GPIO character device.  We take small
     steps here, so we first add a pure *information* ABI and the tool
     "lsgpio" that will list all GPIO devices on the system and all
     lines on these devices.

     We can now discover GPIOs properly from userspace.  We still have
     not come up with a way to actually *use* GPIOs from userspace.

   - To encourage people to use the character device for the future, we
     have it always-enabled when using GPIO.  The old sysfs ABI is still
     opt-in (and can be used in parallel), but is marked as deprecated.

     We will keep it around for the foreseeable future, but it will not
     be extended to cover ever more use cases.

  Cleanup:

   - Bjorn Helgaas removed a whole slew of per-architecture <asm/gpio.h>
     includes.

     This dates back to when GPIO was an opt-in feature and no shared
     library even existed: just a header file with proper prototypes was
     provided and all semantics were up to the arch to implement.  These
     patches make the GPIO chip even more a proper device and cleans out
     leftovers of the old in-kernel API here and there.

     Still some cruft is left but it's very little now.

   - There is still some clamping of return values for .get() going on,
     but we now return sane values in the vast majority of drivers and
     the errorpath is sanitized.  Some patches for powerpc, blackfin and
     unicore still drop in.

   - We continue to switch the ARM, MIPS, blackfin, m68k local GPIO
     implementations to use gpiochip_add_data() and cut down on code
     lines.

   - MPC8xxx is converted to use the generic GPIO helpers.

   - ATH79 is converted to use the generic GPIO helpers.

  New drivers:

   - WinSystems WS16C48

   - Acces 104-DIO-48E

   - F81866 (a F7188x variant)

   - Qoric (a MPC8xxx variant)

   - TS-4800

   - SPI serializers (pisosr): simple 74xx shift registers connected to
     SPI to obtain a dirt-cheap output-only GPIO expander.

   - Texas Instruments TPIC2810

   - Texas Instruments TPS65218

   - Texas Instruments TPS65912

   - X-Gene (ARM64) standby GPIO controller"

* tag 'gpio-v4.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: (194 commits)
  Revert "Share upstreaming patches"
  gpio: mcp23s08: Fix clearing of interrupt.
  gpiolib: Fix comment referring to gpio_*() in gpiod_*()
  gpio: pca953x: Fix pca953x_gpio_set_multiple() on 64-bit
  gpio: xgene: Fix kconfig for standby GIPO contoller
  gpio: Add generic serializer DT binding
  gpio: uapi: use 0xB4 as ioctl() major
  gpio: tps65912: fix bad merge
  Revert "gpio: lp3943: Drop pin_used and lp3943_gpio_request/lp3943_gpio_free"
  gpio: omap: drop dev field from gpio_bank structure
  gpio: mpc8xxx: Slightly update the code for better readability
  gpio: mpc8xxx: Remove *read_reg and *write_reg from struct mpc8xxx_gpio_chip
  gpio: mpc8xxx: Fixup setting gpio direction output
  gpio: mcp23s08: Add support for mcp23s18
  dt-bindings: gpio: altera: Fix altr,interrupt-type property
  gpio: add driver for MEN 16Z127 GPIO controller
  gpio: lp3943: Drop pin_used and lp3943_gpio_request/lp3943_gpio_free
  gpio: timberdale: Switch to devm_ioremap_resource()
  gpio: ts4800: Add IMX51 dependency
  gpiolib: rewrite gpiodev_add_to_list
  ...
2016-03-17 21:05:32 -07:00
Linus Torvalds 63e30271b0 PCI changes for the v4.6 merge window:
Enumeration
     Disable IO/MEM decoding for devices with non-compliant BARs (Bjorn Helgaas)
     Mark Broadwell-EP Home Agent & PCU as having non-compliant BARs (Bjorn Helgaas
 
   Resource management
     Mark shadow copy of VGA ROM as IORESOURCE_PCI_FIXED (Bjorn Helgaas)
     Don't assign or reassign immutable resources (Bjorn Helgaas)
     Don't enable/disable ROM BAR if we're using a RAM shadow copy (Bjorn Helgaas)
     Set ROM shadow location in arch code, not in PCI core (Bjorn Helgaas)
     Remove arch-specific IORESOURCE_ROM_SHADOW size from sysfs (Bjorn Helgaas)
     ia64: Use ioremap() instead of open-coded equivalent (Bjorn Helgaas)
     ia64: Keep CPU physical (not virtual) addresses in shadow ROM resource (Bjorn Helgaas)
     MIPS: Keep CPU physical (not virtual) addresses in shadow ROM resource (Bjorn Helgaas)
     Remove unused IORESOURCE_ROM_COPY and IORESOURCE_ROM_BIOS_COPY (Bjorn Helgaas)
     Don't leak memory if sysfs_create_bin_file() fails (Bjorn Helgaas)
     rcar: Remove PCI_PROBE_ONLY handling (Lorenzo Pieralisi)
     designware: Remove PCI_PROBE_ONLY handling (Lorenzo Pieralisi)
 
   Virtualization
     Wait for up to 1000ms after FLR reset (Alex Williamson)
     Support SR-IOV on any function type (Kelly Zytaruk)
     Add ACS quirk for all Cavium devices (Manish Jaggi)
 
   AER
     Rename pci_ops_aer to aer_inj_pci_ops (Bjorn Helgaas)
     Restore pci_ops pointer while calling original pci_ops (David Daney)
     Fix aer_inject error codes (Jean Delvare)
     Use dev_warn() in aer_inject (Jean Delvare)
     Log actual error causes in aer_inject (Jean Delvare)
     Log aer_inject error injections (Jean Delvare)
 
   VPD
     Prevent VPD access for buggy devices (Babu Moger)
     Move pci_read_vpd() and pci_write_vpd() close to other VPD code (Bjorn Helgaas)
     Move pci_vpd_release() from header file to pci/access.c (Bjorn Helgaas)
     Remove struct pci_vpd_ops.release function pointer (Bjorn Helgaas)
     Rename VPD symbols to remove unnecessary "pci22" (Bjorn Helgaas)
     Fold struct pci_vpd_pci22 into struct pci_vpd (Bjorn Helgaas)
     Sleep rather than busy-wait for VPD access completion (Bjorn Helgaas)
     Update VPD definitions (Hannes Reinecke)
     Allow access to VPD attributes with size 0 (Hannes Reinecke)
     Determine actual VPD size on first access (Hannes Reinecke)
 
   Generic host bridge driver
     Move structure definitions to separate header file (David Daney)
     Add pci_host_common_probe(), based on gen_pci_probe() (David Daney)
     Expose pci_host_common_probe() for use by other drivers (David Daney)
 
   Altera host bridge driver
     Fix altera_pcie_link_is_up() (Ley Foon Tan)
 
   Cavium ThunderX host bridge driver
     Add PCIe host driver for ThunderX processors (David Daney)
     Add driver for ThunderX-pass{1,2} on-chip devices (David Daney)
 
   Freescale i.MX6 host bridge driver
     Add DT bindings to configure PHY Tx driver settings (Justin Waters)
     Move imx6_pcie_reset_phy() near other PHY handling functions (Lucas Stach)
     Move PHY reset into imx6_pcie_establish_link() (Lucas Stach)
     Remove broken Gen2 workaround (Lucas Stach)
     Move link up check into imx6_pcie_wait_for_link() (Lucas Stach)
 
   Freescale Layerscape host bridge driver
     Add "fsl,ls2085a-pcie" compatible ID (Yang Shi)
 
   Intel VMD host bridge driver
     Attach VMD resources to parent domain's resource tree (Jon Derrick)
     Set bus resource start to 0 (Keith Busch)
 
   Microsoft Hyper-V host bridge driver
     Add fwnode_handle to x86 pci_sysdata (Jake Oshins)
     Look up IRQ domain by fwnode_handle (Jake Oshins)
     Add paravirtual PCI front-end for Microsoft Hyper-V VMs (Jake Oshins)
 
   NVIDIA Tegra host bridge driver
     Add pci_ops.{add,remove}_bus() callbacks (Thierry Reding)
     Implement ->{add,remove}_bus() callbacks (Thierry Reding)
     Remove unused struct tegra_pcie.num_ports field (Thierry Reding)
     Track bus -> CPU mapping (Thierry Reding)
     Remove misleading PHYS_OFFSET (Thierry Reding)
 
   Renesas R-Car host bridge driver
     Depend on ARCH_RENESAS, not ARCH_SHMOBILE (Simon Horman)
 
   Synopsys DesignWare host bridge driver
     ARC: Add PCI support (Joao Pinto)
     Add generic dw_pcie_wait_for_link() (Joao Pinto)
     Add default link up check if sub-driver doesn't override (Joao Pinto)
     Add driver for prototyping kits based on ARC SDP (Joao Pinto)
 
   TI Keystone host bridge driver
     Defer probing if devm_phy_get() returns -EPROBE_DEFER (Shawn Lin)
 
   Xilinx AXI host bridge driver
     Use of_pci_get_host_bridge_resources() to parse DT (Bharat Kumar Gogada)
     Remove dependency on ARM-specific struct hw_pci (Bharat Kumar Gogada)
     Don't call pci_fixup_irqs() on Microblaze (Bharat Kumar Gogada)
     Update Zynq binding with Microblaze node (Bharat Kumar Gogada)
     microblaze: Support generic Xilinx AXI PCIe Host Bridge IP driver (Bharat Kumar Gogada)
 
   Xilinx NWL host bridge driver
     Add support for Xilinx NWL PCIe Host Controller (Bharat Kumar Gogada)
 
   Miscellaneous
     Check device_attach() return value always (Bjorn Helgaas)
     Move pci_set_flags() from asm-generic/pci-bridge.h to linux/pci.h (Bjorn Helgaas)
     Remove includes of empty asm-generic/pci-bridge.h (Bjorn Helgaas)
     ARM64: Remove generated include of asm-generic/pci-bridge.h (Bjorn Helgaas)
     Remove empty asm-generic/pci-bridge.h (Bjorn Helgaas)
     Remove includes of asm/pci-bridge.h (Bjorn Helgaas)
     Consolidate PCI DMA constants and interfaces in linux/pci-dma-compat.h (Bjorn Helgaas)
     unicore32: Remove unused HAVE_ARCH_PCI_SET_DMA_MASK definition (Bjorn Helgaas)
     Cleanup pci/pcie/Kconfig whitespace (Andreas Ziegler)
     Include pci/hotplug Kconfig directly from pci/Kconfig (Bjorn Helgaas)
     Include pci/pcie/Kconfig directly from pci/Kconfig (Bogicevic Sasa)
     frv: Remove stray pci_{alloc,free}_consistent() declaration (Christoph Hellwig)
     Move pci_dma_* helpers to common code (Christoph Hellwig)
     Add PCI_CLASS_SERIAL_USB_DEVICE definition (Heikki Krogerus)
     Add QEMU top-level IDs for (sub)vendor & device (Robin H. Johnson)
     Fix broken URL for Dell biosdevname (Naga Venkata Sai Indubhaskar Jupudi)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJW6XgMAAoJEFmIoMA60/r8Yq4P/1nNwwZPikU+9Z8k0HyGPll6
 vqXBOYj/wlbAxJTzH2weaoyUamFrwvsKaO3Vap3xHkAeTFPD/Dp0TipCCNMrZ82Z
 j1y83JJpenkRyX6ifLARCNYpOtvnvgzSrO9x7Sb2Xfqb64dPb7+jGAfOpGNzhKsO
 n1nj/L7RGx8Q6fNFGf8ANMXKTsdkdL+1pdwegjUXmD5WdOT+oW8DmqVbhyfSKwl0
 E8r4Ml2lIg7Qd5Wu5iKMIBsR0+5HEyrwV7ch92wXChwKfoRwG70qnn7FGdc0y5ZB
 XvJuj8UD5UeMxEUeoRa9SwU6wWQT3Q9e6BzMS+P+43z36SPYjMfy/Xffv054z/bY
 rQomLjuGxNLESpmfNK5JfKxWoe2YNXjHQIDWMrAHyNlwdKJbYiwPcxnZJhvOa/eB
 p0QYcGS7O43STjibG9PZhzeq8tuSJRshxi0W6iB9QlqO8qs8nJQxIO+sZj/vl4yz
 lSnswWcV9062KITl8Fe9xDw244/RTz1xSVCdldlSoDhJyeMOjRvzS8raUMyyVmbA
 YULsI3l2iCl+fwDm/T21o7hJG966oYdAmgEv7lc7BWfgEAMg//LZXvMzVvrPFB2D
 R77u/0idtOciVJrmnO/x9DnQO2hzro9SLmVH6m0+0YU4wSSpZfGn98PCrtkatOAU
 c8zT9dJgyJVE3Z7cnPJ4
 =otsF
 -----END PGP SIGNATURE-----

Merge tag 'pci-v4.6-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI updates from Bjorn Helgaas:
 "PCI changes for v4.6:

  Enumeration:
   - Disable IO/MEM decoding for devices with non-compliant BARs (Bjorn Helgaas)
   - Mark Broadwell-EP Home Agent & PCU as having non-compliant BARs (Bjorn Helgaas

  Resource management:
   - Mark shadow copy of VGA ROM as IORESOURCE_PCI_FIXED (Bjorn Helgaas)
   - Don't assign or reassign immutable resources (Bjorn Helgaas)
   - Don't enable/disable ROM BAR if we're using a RAM shadow copy (Bjorn Helgaas)
   - Set ROM shadow location in arch code, not in PCI core (Bjorn Helgaas)
   - Remove arch-specific IORESOURCE_ROM_SHADOW size from sysfs (Bjorn Helgaas)
   - ia64: Use ioremap() instead of open-coded equivalent (Bjorn Helgaas)
   - ia64: Keep CPU physical (not virtual) addresses in shadow ROM resource (Bjorn Helgaas)
   - MIPS: Keep CPU physical (not virtual) addresses in shadow ROM resource (Bjorn Helgaas)
   - Remove unused IORESOURCE_ROM_COPY and IORESOURCE_ROM_BIOS_COPY (Bjorn Helgaas)
   - Don't leak memory if sysfs_create_bin_file() fails (Bjorn Helgaas)
   - rcar: Remove PCI_PROBE_ONLY handling (Lorenzo Pieralisi)
   - designware: Remove PCI_PROBE_ONLY handling (Lorenzo Pieralisi)

  Virtualization:
   - Wait for up to 1000ms after FLR reset (Alex Williamson)
   - Support SR-IOV on any function type (Kelly Zytaruk)
   - Add ACS quirk for all Cavium devices (Manish Jaggi)

  AER:
   - Rename pci_ops_aer to aer_inj_pci_ops (Bjorn Helgaas)
   - Restore pci_ops pointer while calling original pci_ops (David Daney)
   - Fix aer_inject error codes (Jean Delvare)
   - Use dev_warn() in aer_inject (Jean Delvare)
   - Log actual error causes in aer_inject (Jean Delvare)
   - Log aer_inject error injections (Jean Delvare)

  VPD:
   - Prevent VPD access for buggy devices (Babu Moger)
   - Move pci_read_vpd() and pci_write_vpd() close to other VPD code (Bjorn Helgaas)
   - Move pci_vpd_release() from header file to pci/access.c (Bjorn Helgaas)
   - Remove struct pci_vpd_ops.release function pointer (Bjorn Helgaas)
   - Rename VPD symbols to remove unnecessary "pci22" (Bjorn Helgaas)
   - Fold struct pci_vpd_pci22 into struct pci_vpd (Bjorn Helgaas)
   - Sleep rather than busy-wait for VPD access completion (Bjorn Helgaas)
   - Update VPD definitions (Hannes Reinecke)
   - Allow access to VPD attributes with size 0 (Hannes Reinecke)
   - Determine actual VPD size on first access (Hannes Reinecke)

  Generic host bridge driver:
   - Move structure definitions to separate header file (David Daney)
   - Add pci_host_common_probe(), based on gen_pci_probe() (David Daney)
   - Expose pci_host_common_probe() for use by other drivers (David Daney)

  Altera host bridge driver:
   - Fix altera_pcie_link_is_up() (Ley Foon Tan)

  Cavium ThunderX host bridge driver:
   - Add PCIe host driver for ThunderX processors (David Daney)
   - Add driver for ThunderX-pass{1,2} on-chip devices (David Daney)

  Freescale i.MX6 host bridge driver:
   - Add DT bindings to configure PHY Tx driver settings (Justin Waters)
   - Move imx6_pcie_reset_phy() near other PHY handling functions (Lucas Stach)
   - Move PHY reset into imx6_pcie_establish_link() (Lucas Stach)
   - Remove broken Gen2 workaround (Lucas Stach)
   - Move link up check into imx6_pcie_wait_for_link() (Lucas Stach)

  Freescale Layerscape host bridge driver:
   - Add "fsl,ls2085a-pcie" compatible ID (Yang Shi)

  Intel VMD host bridge driver:
   - Attach VMD resources to parent domain's resource tree (Jon Derrick)
   - Set bus resource start to 0 (Keith Busch)

  Microsoft Hyper-V host bridge driver:
   - Add fwnode_handle to x86 pci_sysdata (Jake Oshins)
   - Look up IRQ domain by fwnode_handle (Jake Oshins)
   - Add paravirtual PCI front-end for Microsoft Hyper-V VMs (Jake Oshins)

  NVIDIA Tegra host bridge driver:
   - Add pci_ops.{add,remove}_bus() callbacks (Thierry Reding)
   - Implement ->{add,remove}_bus() callbacks (Thierry Reding)
   - Remove unused struct tegra_pcie.num_ports field (Thierry Reding)
   - Track bus -> CPU mapping (Thierry Reding)
   - Remove misleading PHYS_OFFSET (Thierry Reding)

  Renesas R-Car host bridge driver:
   - Depend on ARCH_RENESAS, not ARCH_SHMOBILE (Simon Horman)

  Synopsys DesignWare host bridge driver:
   - ARC: Add PCI support (Joao Pinto)
   - Add generic dw_pcie_wait_for_link() (Joao Pinto)
   - Add default link up check if sub-driver doesn't override (Joao Pinto)
   - Add driver for prototyping kits based on ARC SDP (Joao Pinto)

  TI Keystone host bridge driver:
   - Defer probing if devm_phy_get() returns -EPROBE_DEFER (Shawn Lin)

  Xilinx AXI host bridge driver:
   - Use of_pci_get_host_bridge_resources() to parse DT (Bharat Kumar Gogada)
   - Remove dependency on ARM-specific struct hw_pci (Bharat Kumar Gogada)
   - Don't call pci_fixup_irqs() on Microblaze (Bharat Kumar Gogada)
   - Update Zynq binding with Microblaze node (Bharat Kumar Gogada)
   - microblaze: Support generic Xilinx AXI PCIe Host Bridge IP driver (Bharat Kumar Gogada)

  Xilinx NWL host bridge driver:
   - Add support for Xilinx NWL PCIe Host Controller (Bharat Kumar Gogada)

  Miscellaneous:
   - Check device_attach() return value always (Bjorn Helgaas)
   - Move pci_set_flags() from asm-generic/pci-bridge.h to linux/pci.h (Bjorn Helgaas)
   - Remove includes of empty asm-generic/pci-bridge.h (Bjorn Helgaas)
   - ARM64: Remove generated include of asm-generic/pci-bridge.h (Bjorn Helgaas)
   - Remove empty asm-generic/pci-bridge.h (Bjorn Helgaas)
   - Remove includes of asm/pci-bridge.h (Bjorn Helgaas)
   - Consolidate PCI DMA constants and interfaces in linux/pci-dma-compat.h (Bjorn Helgaas)
   - unicore32: Remove unused HAVE_ARCH_PCI_SET_DMA_MASK definition (Bjorn Helgaas)
   - Cleanup pci/pcie/Kconfig whitespace (Andreas Ziegler)
   - Include pci/hotplug Kconfig directly from pci/Kconfig (Bjorn Helgaas)
   - Include pci/pcie/Kconfig directly from pci/Kconfig (Bogicevic Sasa)
   - frv: Remove stray pci_{alloc,free}_consistent() declaration (Christoph Hellwig)
   - Move pci_dma_* helpers to common code (Christoph Hellwig)
   - Add PCI_CLASS_SERIAL_USB_DEVICE definition (Heikki Krogerus)
   - Add QEMU top-level IDs for (sub)vendor & device (Robin H. Johnson)
   - Fix broken URL for Dell biosdevname (Naga Venkata Sai Indubhaskar Jupudi)"

* tag 'pci-v4.6-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (94 commits)
  PCI: Add PCI_CLASS_SERIAL_USB_DEVICE definition
  PCI: designware: Add driver for prototyping kits based on ARC SDP
  PCI: designware: Add default link up check if sub-driver doesn't override
  PCI: designware: Add generic dw_pcie_wait_for_link()
  PCI: Cleanup pci/pcie/Kconfig whitespace
  PCI: Simplify pci_create_attr() control flow
  PCI: Don't leak memory if sysfs_create_bin_file() fails
  PCI: Simplify sysfs ROM cleanup
  PCI: Remove unused IORESOURCE_ROM_COPY and IORESOURCE_ROM_BIOS_COPY
  MIPS: Loongson 3: Keep CPU physical (not virtual) addresses in shadow ROM resource
  MIPS: Loongson 3: Use temporary struct resource * to avoid repetition
  ia64/PCI: Keep CPU physical (not virtual) addresses in shadow ROM resource
  ia64/PCI: Use ioremap() instead of open-coded equivalent
  ia64/PCI: Use temporary struct resource * to avoid repetition
  PCI: Clean up pci_map_rom() whitespace
  PCI: Remove arch-specific IORESOURCE_ROM_SHADOW size from sysfs
  PCI: thunder: Add driver for ThunderX-pass{1,2} on-chip devices
  PCI: thunder: Add PCIe host driver for ThunderX processors
  PCI: generic: Expose pci_host_common_probe() for use by other drivers
  PCI: generic: Add pci_host_common_probe(), based on gen_pci_probe()
  ...
2016-03-16 14:45:55 -07:00
Linus Torvalds 10dc374766 One of the largest releases for KVM... Hardly any generic improvement,
but lots of architecture-specific changes.
 
 * ARM:
 - VHE support so that we can run the kernel at EL2 on ARMv8.1 systems
 - PMU support for guests
 - 32bit world switch rewritten in C
 - various optimizations to the vgic save/restore code.
 
 * PPC:
 - enabled KVM-VFIO integration ("VFIO device")
 - optimizations to speed up IPIs between vcpus
 - in-kernel handling of IOMMU hypercalls
 - support for dynamic DMA windows (DDW).
 
 * s390:
 - provide the floating point registers via sync regs;
 - separated instruction vs. data accesses
 - dirty log improvements for huge guests
 - bugfixes and documentation improvements.
 
 * x86:
 - Hyper-V VMBus hypercall userspace exit
 - alternative implementation of lowest-priority interrupts using vector
 hashing (for better VT-d posted interrupt support)
 - fixed guest debugging with nested virtualizations
 - improved interrupt tracking in the in-kernel IOAPIC
 - generic infrastructure for tracking writes to guest memory---currently
 its only use is to speedup the legacy shadow paging (pre-EPT) case, but
 in the future it will be used for virtual GPUs as well
 - much cleanup (LAPIC, kvmclock, MMU, PIT), including ubsan fixes.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJW5r3BAAoJEL/70l94x66D2pMH/jTSWWwdTUJMctrDjPVzKzG0
 yOzHW5vSLFoFlwEOY2VpslnXzn5TUVmCAfrdmFNmQcSw6hGb3K/xA/ZX/KLwWhyb
 oZpr123ycahga+3q/ht/dFUBCCyWeIVMdsLSFwpobEBzPL0pMgc9joLgdUC6UpWX
 tmN0LoCAeS7spC4TTiTTpw3gZ/L+aB0B6CXhOMjldb9q/2CsgaGyoVvKA199nk9o
 Ngu7ImDt7l/x1VJX4/6E/17VHuwqAdUrrnbqerB/2oJ5ixsZsHMGzxQ3sHCmvyJx
 WG5L00ubB1oAJAs9fBg58Y/MdiWX99XqFhdEfxq4foZEiQuCyxygVvq3JwZTxII=
 =OUZZ
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:
 "One of the largest releases for KVM...  Hardly any generic
  changes, but lots of architecture-specific updates.

  ARM:
   - VHE support so that we can run the kernel at EL2 on ARMv8.1 systems
   - PMU support for guests
   - 32bit world switch rewritten in C
   - various optimizations to the vgic save/restore code.

  PPC:
   - enabled KVM-VFIO integration ("VFIO device")
   - optimizations to speed up IPIs between vcpus
   - in-kernel handling of IOMMU hypercalls
   - support for dynamic DMA windows (DDW).

  s390:
   - provide the floating point registers via sync regs;
   - separated instruction vs.  data accesses
   - dirty log improvements for huge guests
   - bugfixes and documentation improvements.

  x86:
   - Hyper-V VMBus hypercall userspace exit
   - alternative implementation of lowest-priority interrupts using
     vector hashing (for better VT-d posted interrupt support)
   - fixed guest debugging with nested virtualizations
   - improved interrupt tracking in the in-kernel IOAPIC
   - generic infrastructure for tracking writes to guest
     memory - currently its only use is to speedup the legacy shadow
     paging (pre-EPT) case, but in the future it will be used for
     virtual GPUs as well
   - much cleanup (LAPIC, kvmclock, MMU, PIT), including ubsan fixes"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (217 commits)
  KVM: x86: remove eager_fpu field of struct kvm_vcpu_arch
  KVM: x86: disable MPX if host did not enable MPX XSAVE features
  arm64: KVM: vgic-v3: Only wipe LRs on vcpu exit
  arm64: KVM: vgic-v3: Reset LRs at boot time
  arm64: KVM: vgic-v3: Do not save an LR known to be empty
  arm64: KVM: vgic-v3: Save maintenance interrupt state only if required
  arm64: KVM: vgic-v3: Avoid accessing ICH registers
  KVM: arm/arm64: vgic-v2: Make GICD_SGIR quicker to hit
  KVM: arm/arm64: vgic-v2: Only wipe LRs on vcpu exit
  KVM: arm/arm64: vgic-v2: Reset LRs at boot time
  KVM: arm/arm64: vgic-v2: Do not save an LR known to be empty
  KVM: arm/arm64: vgic-v2: Move GICH_ELRSR saving to its own function
  KVM: arm/arm64: vgic-v2: Save maintenance interrupt state only if required
  KVM: arm/arm64: vgic-v2: Avoid accessing GICH registers
  KVM: s390: allocate only one DMA page per VM
  KVM: s390: enable STFLE interpretation only if enabled for the guest
  KVM: s390: wake up when the VCPU cpu timer expires
  KVM: s390: step the VCPU timer while in enabled wait
  KVM: s390: protect VCPU cpu timer with a seqcount
  KVM: s390: step VCPU cpu timer during kvm_run ioctl
  ...
2016-03-16 09:55:35 -07:00
Christophe Leroy 2e098dcee6 powerpc/8xx: Fix do_mtspr_cpu6() build on older compilers
GCC < 4.9 is unable to build this, saying:

  arch/powerpc/mm/8xx_mmu.c:139:2: error: memory input 1 is not directly addressable

Change the one-element array into a simple variable to avoid this.

Fixes: 1458dd951f ("powerpc/8xx: Handle CPU6 ERRATA directly in mtspr() macro")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Scott Wood <oss@buserror.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-16 15:22:40 +11:00
Michael Ellerman a1b5344620 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/scottwood/linux into next
Freescale updates from Scott:

"Highlights include 8xx optimizations, 32-bit checksum optimizations,
86xx consolidation, e5500/e6500 cpu hotplug, more fman and other dt
bits, and minor fixes/cleanup."
2016-03-14 20:05:14 +11:00
Christophe Leroy affe587bac powerpc32: move xxxxx_dcache_range() functions inline
flush/clean/invalidate _dcache_range() functions are all very
similar and are quite short. They are mainly used in __dma_sync()
perf_event locate them in the top 3 consumming functions during
heavy ethernet activity

They are good candidate for inlining, as __dma_sync() does
almost nothing but calling them

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11 17:20:12 -06:00
Christophe Leroy 5736f96d12 powerpc32: Remove clear_pages() and define clear_page() inline
clear_pages() is never used expect by clear_page, and PPC32 is the
only architecture (still) having this function. Neither PPC64 nor
any other architecture has it.

This patch removes clear_pages() and moves clear_page() function
inline (same as PPC64) as it only is a few isns

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11 17:20:11 -06:00
Christophe Leroy d6bfa02fcc powerpc: add inline functions for cache related instructions
This patch adds inline functions to use dcbz, dcbi, dcbf, dcbst
from C functions

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11 17:20:11 -06:00
Christophe Leroy 63e9e1c28f powerpc/8xx: remove special handling of CPU6 errata in set_dec()
CPU6 ERRATA is now handled directly in mtspr(), so we can use the
standard set_dec() fonction in all cases.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11 17:18:03 -06:00
Christophe Leroy 1458dd951f powerpc/8xx: Handle CPU6 ERRATA directly in mtspr() macro
MPC8xx has an ERRATA on the use of mtspr() for some registers
This patch includes the ERRATA handling directly into mtspr() macro
so that mtspr() users don't need to bother about that errata

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11 17:18:02 -06:00
Christophe Leroy 7ee5cf6bfa powerpc/8xx: Add missing SPRN defines into reg_8xx.h
Add missing SPRN defines into reg_8xx.h
Some of them are defined in mmu-8xx.h, so we include mmu-8xx.h in
reg_8xx.h, for that we remove references to PAGE_SHIFT in mmu-8xx.h
to have it self sufficient, as includers of reg_8xx.h don't all
include asm/page.h

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11 17:18:02 -06:00
Christophe Leroy e974cd4be0 powerpc32: remove ioremap_base
ioremap_base is not initialised and is nowhere used so remove it

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11 17:18:02 -06:00
Christophe Leroy be00ed728c powerpc32: Fix pte_offset_kernel() to return NULL for bad pages
The fixmap related functions try to map kernel pages that are
already mapped through Large TLBs. pte_offset_kernel() has to
return NULL for LTLBs, otherwise the caller will try to access
level 2 table which doesn't exist

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11 17:18:02 -06:00
Michael Ellerman d8c0282f4d Merge branch 'topic/mprofile-kernel' into next
Merge the ftrace changes to support -mprofile-kernel on ppc64le. This is
a prerequisite for live patching, the support for which will be merged
via the livepatch tree based on this topic branch.
2016-03-11 11:20:15 +11:00
Sukadev Bhattiprolu e0728b50d4 powerpc/perf: Export Power8 generic and cache events to sysfs
Power8 supports a large number of events in each susbystem so when a
user runs:

	perf stat -e branch-instructions sleep 1
	perf stat -e L1-dcache-loads sleep 1

it is not clear as to which PMU events were monitored.

Export the generic hardware and cache perf events for Power8 to sysfs,
so users can precisely determine the PMU event monitored by the generic
event.

Eg:
	cat /sys/bus/event_source/devices/cpu/events/branch-instructions
	event=0x10068

	$ cat /sys/bus/event_source/devices/cpu/events/L1-dcache-loads
	event=0x100ee

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-10 22:56:05 +11:00
Sukadev Bhattiprolu d4969e2459 powerpc/perf: Remove PME_ prefix for power7 events
We used the PME_ prefix earlier to avoid some macro/variable name
collisions.  We have since changed the way we define/use the event
macros so we no longer need the prefix.

By dropping the prefix, we keep the the event macros consistent with
their official names.

Reported-by: Michael Ellerman <ellerman@au1.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-10 22:56:04 +11:00
Christophe Leroy 7e393220b6 powerpc: optimise csum_partial() call when len is constant
csum_partial is often called for small fixed length packets
for which it is suboptimal to use the generic csum_partial()
function.

For instance, in my configuration, I got:
* One place calling it with constant len 4
* Seven places calling it with constant len 8
* Three places calling it with constant len 14
* One place calling it with constant len 20
* One place calling it with constant len 24
* One place calling it with constant len 32

This patch renames csum_partial() to __csum_partial() and
implements csum_partial() as a wrapper inline function which
* uses csum_add() for small 16bits multiple constant length
* uses ip_fast_csum() for other 32bits multiple constant
* uses __csum_partial() in all other cases

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-09 10:44:18 -06:00
Paolo Bonzini ab92f30875 KVM/ARM updates for 4.6
- VHE support so that we can run the kernel at EL2 on ARMv8.1 systems
 - PMU support for guests
 - 32bit world switch rewritten in C
 - Various optimizations to the vgic save/restore code
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJW36xjAAoJECPQ0LrRPXpDGQkQAMDppzcTOixT3e8VPdHAX09a
 Z5PO0gyTMVV7Jyz5Ul3pedPJA2GSK9mxOCwqvIFbdxLAR6ZB00juO5FrTHkSdI91
 1XLPj4bKoMWcVvhL/g5A4Glp/pVMW1k/9Yq8zZAtYlsLRlqG5rLOutSadcqHcYaJ
 cTD/pFf7b2oPtkTPyoFml75KgHBT/8uvAvFDOWA66Id2z6T11+PsBT/6XnGDiwKg
 tpGTNzx3kPIKIzOAOHqVW6UBxFOeabebXLT8wUz3VwNn/UbG6gkumMNApMAyF2q1
 zU0nAh8+7Ek6Dr4OFWE6BfW6sgg/l7i1lA8XoAmqG7ZTrSptCc59fvaZJxPruG+Q
 dMsU6QgR77JJjbZTinf9a1jReZ/liZrx2gZXedVKdILrjmDSq0UnGcxjUOEDZOGy
 2/dbrlJhv+LhpcJtuPpxPCfoqbW5L0ynzmuYuXRdRz3lTHiOWIRx5gugrhO+wH4D
 4gvZhbw3XCiYfpYHYhl8A1EH5kanKgdXDocz9yIm7mZm89gngufF/HkeXS3ZU25T
 yThyBGulGjqN4FCdgf1HolkTfFjnfSx4qJovJ58eHga+HNLXRkTecZZcbFy2OOHv
 8Bx0PIlwj4RgSaRLWQUudAhdhKS2g22DKDDljxFwhkMPNghvqkYMJCRDKLu6GBXQ
 4YsLKM+TaShHFjSpx+ao
 =rpvb
 -----END PGP SIGNATURE-----

Merge tag 'kvm-arm-for-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/ARM updates for 4.6

- VHE support so that we can run the kernel at EL2 on ARMv8.1 systems
- PMU support for guests
- 32bit world switch rewritten in C
- Various optimizations to the vgic save/restore code

Conflicts:
	include/uapi/linux/kvm.h
2016-03-09 11:50:42 +01:00
Linus Walleij 0bae2f1732 Merge branch 'ib-mfd-regulator-gpio-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd into devel 2016-03-09 17:40:37 +07:00
Christophe Lombard c0efa9aee8 powerpc: New possible return value from hcall
The hcalls introduced for cxl use a possible new value:
H_STATE (invalid state).

Co-authored-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 13:05:51 +11:00
Wei Yang 67086e32b5 powerpc/eeh: powerpc/eeh: Support error recovery for VF PE
PFs are enumerated on PCI bus, while VFs are created by PF's driver.

In EEH recovery, it has two cases:
1. Device and driver is EEH aware, error handlers are called.
2. Device and driver is not EEH aware, un-plug the device and plug it again
by enumerating it.

The special thing happens on the second case. For a PF, we could use the
original pci core to enumerate the bus, while for VF we need to record the
VFs which aer un-plugged then plug it again.

Also The patch caches the VF index in pci_dn, which can be used to
calculate VF's bus, device and function number. Those information helps to
locate the VF's PCI device instance when doing hotplug during EEH recovery
if necessary.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 09:58:23 +11:00
Wei Yang 0dc2830e0a powerpc/powernv: Support PCI config restore for VFs
After PE reset, OPAL API opal_pci_reinit() is called on all devices
contained in the PE to reinitialize them. While skiboot is not aware of
VFs, we have to implement the function in kernel to reinitialize VFs after
reset on PE for VFs.

In this patch, two functions pnv_pci_fixup_vf_mps() and
pnv_eeh_restore_vf_config() both manipulate the MPS of the VF, since for a
VF it has three cases.

1. Normal creation for a VF
   In this case, pnv_pci_fixup_vf_mps() is called to make the MPS a proper
   value compared with its parent.
2. EEH recovery without VF removed
   In this case, MPS is stored in pci_dn and pnv_eeh_restore_vf_config() is
   called to restore it and reinitialize other part.
3. EEH recovery with VF removed
   In this case, VF will be removed then re-created. Both functions are
   called. First pnv_pci_fixup_vf_mps() is called to store the proper MPS
   to pci_dn and then pnv_eeh_restore_vf_config() is called to do proper
   thing.

This introduces two functions: pnv_pci_fixup_vf_mps() to fixup the VF's
MPS to make sure it is equal to parent's and store this value in pci_dn
for future use. pnv_eeh_restore_vf_config() to re-initialize on VF by
restoring MPS, disabling completion timeout, enabling SERR, etc.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 09:58:22 +11:00
Wei Yang 9312bc5bab powerpc/powernv: Support EEH reset for VF PE
PEs for VFs don't have primary bus. So they have to have their own reset
backend, which is used during EEH recovery. The patch implements the reset
backend for VF's PE by issuing FLR or AF FLR to the VFs, which are contained
in the PE.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 09:58:21 +11:00
Wei Yang c29fa27d26 powerpc/eeh: Create PE for VFs
This creates PEs for VFs in the weak function pcibios_bus_add_device().
Those PEs for VFs are identified with newly introduced flag EEH_PE_VF
so that we treat them differently during EEH recovery.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 09:58:19 +11:00
Wei Yang 39218cd00e powerpc/eeh: EEH device for VF
VFs and their corresponding pdn are created and released dynamically
when their PF's SRIOV capability is enabled and disabled. This creates
and releases EEH devices for VFs when creating and releasing their pdn
instances, which means EEH devices and pdn instances have same life
cycle. Also, VF's EEH device is identified by (struct eeh_dev::physfn).

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 09:58:18 +11:00
Christoph Hellwig bc4b024a8b PCI: Move pci_dma_* helpers to common code
For a long time all architectures implement the pci_dma_* functions using
the generic DMA API, and they all use the same header to do so.

Move this header, pci-dma-compat.h, to include/linux and include it from
the generic pci.h instead of having each arch duplicate this include.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-03-07 10:40:02 -06:00
Torsten Duwe 153086644f powerpc/ftrace: Add support for -mprofile-kernel ftrace ABI
The gcc switch -mprofile-kernel defines a new ABI for calling _mcount()
very early in the function with minimal overhead.

Although mprofile-kernel has been available since GCC 3.4, there were
bugs which were only fixed recently. Currently it is known to work in
GCC 4.9, 5 and 6.

Additionally there are two possible code sequences generated by the
flag, the first uses mflr/std/bl and the second is optimised to omit the
std. Currently only gcc 6 has the optimised sequence. This patch
supports both sequences.

Initial work started by Vojtech Pavlik, used with permission.

Key changes:
 - rework _mcount() to work for both the old and new ABIs.
 - implement new versions of ftrace_caller() and ftrace_graph_caller()
   which deal with the new ABI.
 - updates to __ftrace_make_nop() to recognise the new mcount calling
   sequence.
 - updates to __ftrace_make_call() to recognise the nop'ed sequence.
 - implement ftrace_modify_call().
 - updates to the module loader to surpress the toc save in the module
   stub when calling mcount with the new ABI.

Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Torsten Duwe <duwe@suse.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-07 14:53:55 +11:00
Michael Ellerman f17c4e01e9 powerpc/module: Mark module stubs with a magic value
When a module is loaded, calls out to the kernel go via a stub which is
generated at runtime. One of these stubs is used to call _mcount(),
which is the default target of tracing calls generated by the compiler
with -pg.

If dynamic ftrace is enabled (which it typically is), another stub is
used to call ftrace_caller(), which is the target of tracing calls when
ftrace is actually active.

ftrace then wants to disable the calls to _mcount() at module startup,
and enable/disable the calls to ftrace_caller() when enabling/disabling
tracing - all of these it does by patching the code.

As part of that code patching, the ftrace code wants to confirm that the
branch it is about to modify, is in fact a call to a module stub which
calls _mcount() or ftrace_caller().

Currently it does that by inspecting the instructions and confirming
they are what it expects. Although that works, the code to do it is
pretty intricate because it requires lots of knowledge about the exact
format of the stub.

We can make that process easier by marking the generated stubs with a
magic value, and then looking for that magic value. Altough this is not
as rigorous as the current method, I believe it is sufficient in
practice.

Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Reviewed-by: Torsten Duwe <duwe@suse.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-07 14:53:53 +11:00
Michael Ellerman 136cd3450a powerpc/module: Only try to generate the ftrace_caller() stub once
Currently we generate the module stub for ftrace_caller() at the bottom
of apply_relocate_add(). However apply_relocate_add() is potentially
called more than once per module, which means we will try to generate
the ftrace_caller() stub multiple times.

Although the current code deals with that correctly, ie. it only
generates a stub the first time, it would be clearer to only try to
generate the stub once.

Note also on first reading it may appear that we generate a different
stub for each section that requires relocation, but that is not the
case. The code in stub_for_addr() that searches for an existing stub
uses sechdrs[me->arch.stubs_section], ie. the single stub section for
this module.

A cleaner approach is to only generate the ftrace_caller() stub once,
from module_finalize(). Although the original code didn't check to see
if the stub was actually generated correctly, it seems prudent to add a
check, so do that. And an additional benefit is we can clean the ifdefs
up a little.

Finally we must propagate the const'ness of some of the pointers passed
to module_finalize(), but that is also an improvement.

Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Reviewed-by: Torsten Duwe <duwe@suse.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-07 14:53:53 +11:00
Michael Ellerman a5cab83cd3 powerpc: Create a helper for getting the kernel toc value
Move the logic to work out the kernel toc pointer into a header. This is
a good cleanup, and also means we can use it elsewhere in future.

Reviewed-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Reviewed-by: Torsten Duwe <duwe@suse.de>
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Tested-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
2016-03-07 14:53:52 +11:00
chenhui zhao 6becef7ea0 powerpc/mpc85xx: Add CPU hotplug support for E6500
Support Freescale E6500 core-based platforms, like t4240.
Support disabling/enabling individual CPU thread dynamically.

Signed-off-by: Chenhui Zhao <chenhui.zhao@freescale.com>
2016-03-04 23:58:38 -06:00
chenhui zhao 2f4f1f815b powerpc/mpc85xx: Add hotplug support on E5500 and E500MC cores
Freescale E500MC and E5500 core-based platforms, like P4080, T1040,
support disabling/enabling CPU dynamically.
This patch adds this feature on those platforms.

Signed-off-by: Chenhui Zhao <chenhui.zhao@freescale.com>
Signed-off-by: Tang Yuantian <Yuantian.Tang@feescale.com>
[scottwood: removed unused pr_fmt]
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-04 23:56:31 -06:00
chenhui zhao d17799f9c1 powerpc/rcpm: add RCPM driver
There is a RCPM (Run Control/Power Management) in Freescale QorIQ
series processors. The device performs tasks associated with device
run control and power management.

The driver implements some features: mask/unmask irq, enter/exit low
power states, freeze time base, etc.

Signed-off-by: Chenhui Zhao <chenhui.zhao@freescale.com>
Signed-off-by: Tang Yuantian <Yuantian.Tang@freescale.com>
[scottwood: remove __KERNEL__ ifdef]
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-04 23:50:27 -06:00
chenhui zhao e7affb1dba powerpc/cache: add cache flush operation for various e500
Various e500 core have different cache architecture, so they
need different cache flush operations. Therefore, add a callback
function cpu_flush_caches to the struct cpu_spec. The cache flush
operation for the specific kind of e500 is selected at init time.
The callback function will flush all caches inside the current cpu.

Signed-off-by: Chenhui Zhao <chenhui.zhao@freescale.com>
Signed-off-by: Tang Yuantian <Yuantian.Tang@feescale.com>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-04 23:44:51 -06:00
chenhui zhao ebb9d30a6a powerpc/mm: any thread in one core can be the first to setup TLB1
On e6500, in the case of cpu hotplug, either thread in one core
may be the first thread initilzing the TLB1. The subsequent threads
must not setup it again.

The code is derived from the comment of Scott Wood.

Signed-off-by: Chenhui Zhao <chenhui.zhao@freescale.com>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-04 23:44:02 -06:00
Christophe Leroy 5a8847c83c powerpc: simplify csum_add(a, b) in case a or b is constant 0
Simplify csum_add(a, b) in case a or b is constant 0

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-04 23:04:00 -06:00
Christophe Leroy 37e08cad8f powerpc: inline ip_fast_csum()
In several architectures, ip_fast_csum() is inlined
There are functions like ip_send_check() which do nothing
much more than calling ip_fast_csum().
Inlining ip_fast_csum() allows the compiler to optimise better

Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
[scottwood: whitespace and cast fixes]
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-04 21:49:49 -06:00
Christophe Leroy 03bc8b0fc8 powerpc32: checksum_wrappers_64 becomes checksum_wrappers
The powerpc64 checksum wrapper functions adds csum_and_copy_to_user()
which otherwise is implemented in include/net/checksum.h by using
csum_partial() then copy_to_user()

Those two wrapper fonctions are also applicable to powerpc32 as it is
based on the use of csum_partial_copy_generic() which also
exists on powerpc32

This patch renames arch/powerpc/lib/checksum_wrappers_64.c to
arch/powerpc/lib/checksum_wrappers.c and
makes it non-conditional to CONFIG_WORD_SIZE

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-04 21:47:47 -06:00
Christophe Leroy 11dfbf588a powerpc: mark xer clobbered in csum_add()
addc uses carry so xer is clobbered in csum_add()

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-04 21:47:27 -06:00
Aneesh Kumar K.V ee3b93ebfb powerpc/mm: Move hash64 tlbflush code into a new header
No code changes.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-03 21:19:39 +11:00
Aneesh Kumar K.V f64e8084c9 powerpc/mm: Move hash related mmu-*.h headers to book3s/
No code changes.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-03 21:19:21 +11:00
Aneesh Kumar K.V 368ced78e6 powerpc/mm: Switch book3s 64 with 64K page size to 4 level page table
This is needed so that we can support both hash and radix page table
using single kernel. Radix kernel uses a 4 level table.

We now use physical address in upper page table tree levels. Even though
they are aligned to their size, for the masked bits we use the
bit positions as per PowerISA 3.0.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-03 21:18:28 +11:00
Aneesh Kumar K.V ae9a71afa4 powerpc/mm: Don't have conditional defines for real_pte_t
We remove real_pte_t out of STRICT_MM_TYPESCHECK.

Reviewed-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-03 16:47:02 +11:00
Aneesh Kumar K.V 2bf59916ef powerpc/mm: Split pgtable types to separate header
We move the page table accessors into a separate header. We will
later add a big endian variant of the table which is needed for radix.
No functionality change only code movement.

Reviewed-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-03 16:47:01 +11:00
Cyril Bur bf6a4d5b75 powerpc: Add the ability to save VSX without giving it up
This patch adds the ability to be able to save the VSX registers to the
thread struct without giving up (disabling the facility) next time the
process returns to userspace.

This patch builds on a previous optimisation for the FPU and VEC registers
in the thread copy path to avoid a possibly pointless reload of VSX state.

Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-02 23:34:50 +11:00
Cyril Bur 6f515d842e powerpc: Add the ability to save Altivec without giving it up
This patch adds the ability to be able to save the VEC registers to the
thread struct without giving up (disabling the facility) next time the
process returns to userspace.

This patch builds on a previous optimisation for the FPU registers in the
thread copy path to avoid a possibly pointless reload of VEC state.

Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-02 23:34:49 +11:00
Cyril Bur 8792468da5 powerpc: Add the ability to save FPU without giving it up
This patch adds the ability to be able to save the FPU registers to the
thread struct without giving up (disabling the facility) next time the
process returns to userspace.

This patch optimises the thread copy path (as a result of a fork() or
clone()) so that the parent thread can return to userspace with hot
registers avoiding a possibly pointless reload of FPU register state.

Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-02 23:34:49 +11:00
Cyril Bur de2a20aa72 powerpc: Prepare for splitting giveup_{fpu, altivec, vsx} in two
This prepares for the decoupling of saving {fpu,altivec,vsx} registers and
marking {fpu,altivec,vsx} as being unused by a thread.

Currently giveup_{fpu,altivec,vsx}() does both however optimisations to
task switching can be made if these two operations are decoupled.
save_all() will permit the saving of registers to thread structs and leave
threads MSR with bits enabled.

This patch introduces no functional change.

Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-02 23:34:48 +11:00
Cyril Bur 70fe3d980f powerpc: Restore FPU/VEC/VSX if previously used
Currently the FPU, VEC and VSX facilities are lazily loaded. This is not
a problem unless a process is using these facilities.

Modern versions of GCC are very good at automatically vectorising code,
new and modernised workloads make use of floating point and vector
facilities, even the kernel makes use of vectorised memcpy.

All this combined greatly increases the cost of a syscall since the
kernel uses the facilities sometimes even in syscall fast-path making it
increasingly common for a thread to take an *_unavailable exception soon
after a syscall, not to mention potentially taking all three.

The obvious overcompensation to this problem is to simply always load
all the facilities on every exit to userspace. Loading up all FPU, VEC
and VSX registers every time can be expensive and if a workload does
avoid using them, it should not be forced to incur this penalty.

An 8bit counter is used to detect if the registers have been used in the
past and the registers are always loaded until the value wraps to back
to zero.

Several versions of the assembly in entry_64.S were tested:

  1. Always calling C.
  2. Performing a common case check and then calling C.
  3. A complex check in asm.

After some benchmarking it was determined that avoiding C in the common
case is a performance benefit (option 2). The full check in asm (option
3) greatly complicated that codepath for a negligible performance gain
and the trade-off was deemed not worth it.

Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
[mpe: Move load_vec in the struct to fill an existing hole, reword change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

fixup
2016-03-02 23:34:48 +11:00
Alexey Kardashevskiy 58ded4201f KVM: PPC: Add support for 64bit TCE windows
The existing KVM_CREATE_SPAPR_TCE only supports 32bit windows which is not
enough for directly mapped windows as the guest can get more than 4GB.

This adds KVM_CREATE_SPAPR_TCE_64 ioctl and advertises it
via KVM_CAP_SPAPR_TCE_64 capability. The table size is checked against
the locked memory limit.

Since 64bit windows are to support Dynamic DMA windows (DDW), let's add
@bus_offset and @page_shift which are also required by DDW.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2016-03-02 09:56:50 +11:00
Alexey Kardashevskiy 14f853f1b2 KVM: PPC: Add @offset to kvmppc_spapr_tce_table
This enables userspace view of TCE tables to start from non-zero offset
on a bus. This will be used for huge DMA windows.

This only changes the internal structure, the user interface needs to
change in order to use an offset.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2016-03-02 09:56:50 +11:00
Alexey Kardashevskiy fe26e52712 KVM: PPC: Add @page_shift to kvmppc_spapr_tce_table
At the moment the kvmppc_spapr_tce_table struct can only describe
4GB windows and handle fixed size (4K) pages. Dynamic DMA windows
support more so these limits need to be extended.

This replaces window_size (in bytes, 4GB max) with page_shift (32bit)
and size (64bit, in pages).

This should cause no behavioural change as this is changing
the internal structures only - the user interface still only
allows one to create a 32-bit table with 4KiB pages at this stage.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2016-03-02 09:56:50 +11:00
David Gibson 5c3c7ede2b powerpc/mm: Split hash page table sizing heuristic into a helper
htab_get_table_size() either retrieve the size of the hash page table (HPT)
from the device tree - if the HPT size is determined by firmware - or
uses a heuristic to determine a good size based on RAM size if the kernel
is responsible for allocating the HPT.

To support a PAPR extension allowing resizing of the HPT, we're going to
want the memory size -> HPT size logic elsewhere, so split it out into a
helper function.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-02 09:06:16 +11:00
David Gibson 27828f98a0 powerpc/mm: Handle removing maybe-present bolted HPTEs
At the moment the hpte_removebolted callback in ppc_md returns void and
will BUG_ON() if the hpte it's asked to remove doesn't exist in the first
place.  This is awkward for the case of cleaning up a mapping which was
partially made before failing.

So, we add a return value to hpte_removebolted, and have it return ENOENT
in the case that the HPTE to remove didn't exist in the first place.

In the (sole) caller, we propagate errors in hpte_removebolted to its
caller to handle.  However, we handle ENOENT specially, continuing to
complete the unmapping over the specified range before returning the error
to the caller.

This means that htab_remove_mapping() will work sanely on a partially
present mapping, removing any HPTEs which are present, while also returning
ENOENT to its caller in case it's important there.

There are two callers of htab_remove_mapping():
   - In remove_section_mapping() we already WARN_ON() any error return,
     which is reasonable - in this case the mapping should be fully
     present
   - In vmemmap_remove_mapping() we BUG_ON() any error.  We change that to
     just a WARN_ON() in the case of ENOENT, since failing to remove a
     mapping that wasn't there in the first place probably shouldn't be
     fatal.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-01 22:04:18 +11:00
Adam Buchbinder 446957ba51 powerpc: Fix misspellings in comments.
Signed-off-by: Adam Buchbinder <adam.buchbinder@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-01 19:27:20 +11:00
Paul Mackerras 8daf51f55f powerpc/mm/book3s-64: Expand the real page number field of the Linux PTE
Now that other PTE fields have been moved out of the way, we can
expand the RPN field of the PTE on 64-bit Book 3S systems and align
it with the RPN field in the radix PTE format used by PowerISA v3.0
CPUs in radix mode.  For 64k page size, this means we need to move
the _PAGE_COMBO and _PAGE_4K_PFN bits.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-29 20:34:41 +11:00
Paul Mackerras e726202f06 powerpc/mm/book3s-64: Move software-used bits in PTE
This moves the _PAGE_SPECIAL and _PAGE_SOFT_DIRTY bits in the Linux
PTE on 64-bit Book 3S systems to bit positions which are designated
for software use in the radix PTE format used by PowerISA v3.0 CPUs
in radix mode.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-29 20:34:40 +11:00
Paul Mackerras c915df162d powerpc/mm/book3s-64: Shuffle read, write, execute and user bits in PTE
This moves the _PAGE_EXEC, _PAGE_RW and _PAGE_USER bits around in
the Linux PTE on 64-bit Book 3S systems to correspond with the bit
positions used in radix mode by PowerISA v3.0 CPUs.  This also adds
a _PAGE_READ bit corresponding to the read permission bit in the
radix PTE.  _PAGE_READ is currently unused but could possibly be used
in future to improve pte_protnone().

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-29 20:34:40 +11:00
Paul Mackerras a9d4996df1 powerpc/mm/book3s-64: Move HPTE-related bits in PTE to upper end
This moves the _PAGE_HASHPTE, _PAGE_F_GIX and _PAGE_F_SECOND fields in
the Linux PTE on 64-bit Book 3S systems to the most significant byte.
Of the 5 bits, one is a software-use bit and the other four are
reserved bit positions in the PowerISA v3.0 radix PTE format.
Using these bits is OK because these bits are all to do with tracking
the HPTE(s) associated with the Linux PTE, and therefore won't be
needed in radix mode.  This frees up bit positions in the lower two
bytes.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-29 20:34:39 +11:00
Paul Mackerras 84c957560a powerpc/mm/book3s-64: Move _PAGE_PTE to 2nd most significant bit
This changes _PAGE_PTE for 64-bit Book 3S processors from 0x1 to
0x4000_0000_0000_0000, because that bit is used as the L (leaf)
bit by PowerISA v3.0 CPUs in radix mode.  The "leaf" bit indicates
that the PTE points to a page directly rather than another radix
level, which is what the _PAGE_PTE bit means.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-29 20:34:39 +11:00
Paul Mackerras 849f86a630 powerpc/mm/book3s-64: Move _PAGE_PRESENT to the most significant bit
This changes _PAGE_PRESENT for 64-bit Book 3S processors from 0x2 to
0x8000_0000_0000_0000, because that is where PowerISA v3.0 CPUs in
radix mode will expect to find it.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-29 20:34:39 +11:00