Commit Graph

5075 Commits

Author SHA1 Message Date
Harald Freudenberger 13b251bdc8 s390/zcrypt: tracepoint definitions for zcrypt device driver.
This patch introduces tracepoint definitions and tracepoint
event invocations for the s390 zcrypt device.

Currently there are just two tracepoint events defined.
An s390_zcrypt_req request event occurs as soon as the
request is recognized by the zcrypt ioctl function. This
event may act as some kind of request-processing-starts-now
indication.
As late as possible within the zcrypt ioctl function there
occurs the s390_zcrypt_rep event which may act as the point
in time where the request has been processed by the kernel
and the result is about to be transferred back to userspace.
The glue which binds together request and reply event is the
ptr parameter, which is the local buffer address where the
request from userspace has been stored by the ioctl function.

The main purpose of this zcrypt tracepoint patch is to get
some data for performance measurements together with
information about the kind of request and on which card and
queue the request has been processed. It is not an ffdc
interface as there is already code in the zcrypt device
driver to serve the s390 debug feature interface.

Signed-off-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-14 16:33:40 +01:00
Ingo Tuchscherer b886a9d156 s390/zcrypt: Introduce new zcrypt device status API
Introduce new ioctl (ZDEVICESTATUS) to provide detailed
information, like hardware type, domains, status and functionality
of available crypto devices.

Signed-off-by: Ingo Tuchscherer <ingo.tuchscherer@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-14 16:33:38 +01:00
Paul Bolle 846221cfb8 Remove references to dead make variable LINUX_INCLUDE
Commit 4fd06960f1 ("Use the new x86 setup code for i386") introduced a
reference to the make variable LINUX_INCLUDE. That reference got moved
around a bit and copied twice and now there are three references to it.

There has never been a definition of that variable. (Presumably that is
because it started out as a mistyped reference to LINUXINCLUDE.) So this
reference has always been an empty string. Let's remove it before it
spreads any further.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2016-12-14 10:54:28 +01:00
Linus Torvalds 2ec4584eb8 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Martin Schwidefsky:
 "The main bulk of the s390 patches for the 4.10 merge window:

   - Add support for the contiguous memory allocator.

   - The recovery for I/O errors in the dasd device driver is improved,
     the driver will now remove channel paths that are not working
     properly.

   - Additional fields are added to /proc/sysinfo, the extended
     partition name and the partition UUID.

   - New naming for PCI devices with system defined UIDs.

   - The last few remaining alloc_bootmem calls are converted to
     memblock.

   - The thread_info structure is stripped down and moved to the
     task_struct. The only field left in thread_info is the flags field.

   - Rework of the arch topology code to fix a fake numa issue.

   - Refactoring of the atomic primitives and add a new preempt_count
     implementation.

   - Clocksource steering for the STP sync check offsets.

   - The s390 specific headers are changed to make them usable with
     CLANG.

   - Bug fixes and cleanup"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (70 commits)
  s390/cpumf: Use configuration level indication for sampling data
  s390: provide memmove implementation
  s390: cleanup arch/s390/kernel Makefile
  s390: fix initrd corruptions with gcov/kcov instrumented kernels
  s390: exclude early C code from gcov profiling
  s390/dasd: channel path aware error recovery
  s390/dasd: extend dasd path handling
  s390: remove unused labels from entry.S
  s390/vmlogrdr: fix IUCV buffer allocation
  s390/crypto: unlock on error in prng_tdes_read()
  s390/sysinfo: show partition extended name and UUID if available
  s390/numa: pin all possible cpus to nodes early
  s390/numa: establish cpu to node mapping early
  s390/topology: use cpu_topology array instead of per cpu variable
  s390/smp: initialize cpu_present_mask in setup_arch
  s390/topology: always use s390 specific sched_domain_topology_level
  s390/smp: use smp_get_base_cpu() helper function
  s390/numa: always use logical cpu and core ids
  s390: Remove VLAIS in ptff() and clear_table()
  s390: fix machine check panic stack switch
  ...
2016-12-13 16:33:33 -08:00
Linus Torvalds 93173b5bf2 Small release, the most interesting stuff is x86 nested virt improvements.
x86: userspace can now hide nested VMX features from guests; nested
 VMX can now run Hyper-V in a guest; support for AVX512_4VNNIW and
 AVX512_FMAPS in KVM; infrastructure support for virtual Intel GPUs.
 
 PPC: support for KVM guests on POWER9; improved support for interrupt
 polling; optimizations and cleanups.
 
 s390: two small optimizations, more stuff is in flight and will be
 in 4.11.
 
 ARM: support for the GICv3 ITS on 32bit platforms.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQExBAABCAAbBQJYTkP0FBxwYm9uemluaUByZWRoYXQuY29tAAoJEL/70l94x66D
 lZIH/iT1n9OQXcuTpYYnQhuCenzI3GZZOIMTbCvK2i5bo0FIJKxVn0EiAAqZSXvO
 nO185FqjOgLuJ1AD1kJuxzye5suuQp4HIPWWgNHcexLuy43WXWKZe0IQlJ4zM2Xf
 u31HakpFmVDD+Cd1qN3yDXtDrRQ79/xQn2kw7CWb8olp+pVqwbceN3IVie9QYU+3
 gCz0qU6As0aQIwq2PyalOe03sO10PZlm4XhsoXgWPG7P18BMRhNLTDqhLhu7A/ry
 qElVMANT7LSNLzlwNdpzdK8rVuKxETwjlc1UP8vSuhrwad4zM2JJ1Exk26nC2NaG
 D0j4tRSyGFIdx6lukZm7HmiSHZ0=
 =mkoB
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:
 "Small release, the most interesting stuff is x86 nested virt
  improvements.

  x86:
   - userspace can now hide nested VMX features from guests
   - nested VMX can now run Hyper-V in a guest
   - support for AVX512_4VNNIW and AVX512_FMAPS in KVM
   - infrastructure support for virtual Intel GPUs.

  PPC:
   - support for KVM guests on POWER9
   - improved support for interrupt polling
   - optimizations and cleanups.

  s390:
   - two small optimizations, more stuff is in flight and will be in
     4.11.

  ARM:
   - support for the GICv3 ITS on 32bit platforms"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (94 commits)
  arm64: KVM: pmu: Reset PMSELR_EL0.SEL to a sane value before entering the guest
  KVM: arm/arm64: timer: Check for properly initialized timer on init
  KVM: arm/arm64: vgic-v2: Limit ITARGETSR bits to number of VCPUs
  KVM: x86: Handle the kthread worker using the new API
  KVM: nVMX: invvpid handling improvements
  KVM: nVMX: check host CR3 on vmentry and vmexit
  KVM: nVMX: introduce nested_vmx_load_cr3 and call it on vmentry
  KVM: nVMX: propagate errors from prepare_vmcs02
  KVM: nVMX: fix CR3 load if L2 uses PAE paging and EPT
  KVM: nVMX: load GUEST_EFER after GUEST_CR0 during emulated VM-entry
  KVM: nVMX: generate MSR_IA32_CR{0,4}_FIXED1 from guest CPUID
  KVM: nVMX: fix checks on CR{0,4} during virtual VMX operation
  KVM: nVMX: support restore of VMX capability MSRs
  KVM: nVMX: generate non-true VMX MSRs based on true versions
  KVM: x86: Do not clear RFLAGS.TF when a singlestep trap occurs.
  KVM: x86: Add kvm_skip_emulated_instruction and use it.
  KVM: VMX: Move skip_emulated_instruction out of nested_vmx_check_vmcs12
  KVM: VMX: Reorder some skip_emulated_instruction calls
  KVM: x86: Add a return value to kvm_emulate_cpuid
  KVM: PPC: Book3S: Move prototypes for KVM functions into kvm_ppc.h
  ...
2016-12-13 15:47:02 -08:00
Linus Torvalds e34bac726d Merge branch 'akpm' (patches from Andrew)
Merge updates from Andrew Morton:

 - various misc bits

 - most of MM (quite a lot of MM material is awaiting the merge of
   linux-next dependencies)

 - kasan

 - printk updates

 - procfs updates

 - MAINTAINERS

 - /lib updates

 - checkpatch updates

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (123 commits)
  init: reduce rootwait polling interval time to 5ms
  binfmt_elf: use vmalloc() for allocation of vma_filesz
  checkpatch: don't emit unified-diff error for rename-only patches
  checkpatch: don't check c99 types like uint8_t under tools
  checkpatch: avoid multiple line dereferences
  checkpatch: don't check .pl files, improve absolute path commit log test
  scripts/checkpatch.pl: fix spelling
  checkpatch: don't try to get maintained status when --no-tree is given
  lib/ida: document locking requirements a bit better
  lib/rbtree.c: fix typo in comment of ____rb_erase_color
  lib/Kconfig.debug: make CONFIG_STRICT_DEVMEM depend on CONFIG_DEVMEM
  MAINTAINERS: add drm and drm/i915 irc channels
  MAINTAINERS: add "C:" for URI for chat where developers hang out
  MAINTAINERS: add drm and drm/i915 bug filing info
  MAINTAINERS: add "B:" for URI where to file bugs
  get_maintainer: look for arbitrary letter prefixes in sections
  printk: add Kconfig option to set default console loglevel
  printk/sound: handle more message headers
  printk/btrfs: handle more message headers
  printk/kdb: handle more message headers
  ...
2016-12-12 20:50:02 -08:00
Linus Torvalds e71c3978d6 Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull smp hotplug updates from Thomas Gleixner:
 "This is the final round of converting the notifier mess to the state
  machine. The removal of the notifiers and the related infrastructure
  will happen around rc1, as there are conversions outstanding in other
  trees.

  The whole exercise removed about 2000 lines of code in total and in
  course of the conversion several dozen bugs got fixed. The new
  mechanism allows to test almost every hotplug step standalone, so
  usage sites can exercise all transitions extensively.

  There is more room for improvement, like integrating all the
  pointlessly different architecture mechanisms of synchronizing,
  setting cpus online etc into the core code"

* 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits)
  tracing/rb: Init the CPU mask on allocation
  soc/fsl/qbman: Convert to hotplug state machine
  soc/fsl/qbman: Convert to hotplug state machine
  zram: Convert to hotplug state machine
  KVM/PPC/Book3S HV: Convert to hotplug state machine
  arm64/cpuinfo: Convert to hotplug state machine
  arm64/cpuinfo: Make hotplug notifier symmetric
  mm/compaction: Convert to hotplug state machine
  iommu/vt-d: Convert to hotplug state machine
  mm/zswap: Convert pool to hotplug state machine
  mm/zswap: Convert dst-mem to hotplug state machine
  mm/zsmalloc: Convert to hotplug state machine
  mm/vmstat: Convert to hotplug state machine
  mm/vmstat: Avoid on each online CPU loops
  mm/vmstat: Drop get_online_cpus() from init_cpu_node_state/vmstat_cpu_dead()
  tracing/rb: Convert to hotplug state machine
  oprofile/nmi timer: Convert to hotplug state machine
  net/iucv: Use explicit clean up labels in iucv_init()
  x86/pci/amd-bus: Convert to hotplug state machine
  x86/oprofile/nmi: Convert to hotplug state machine
  ...
2016-12-12 19:25:04 -08:00
Johannes Weiner 6d75f366b9 lib: radix-tree: check accounting of existing slot replacement users
The bug in khugepaged fixed earlier in this series shows that radix tree
slot replacement is fragile; and it will become more so when not only
NULL<->!NULL transitions need to be caught but transitions from and to
exceptional entries as well.  We need checks.

Re-implement radix_tree_replace_slot() on top of the sanity-checked
__radix_tree_replace().  This requires existing callers to also pass the
radix tree root, but it'll warn us when somebody replaces slots with
contents that need proper accounting (transitions between NULL entries,
real entries, exceptional entries) and where a replacement through the
slot pointer would corrupt the radix tree node counts.

Link: http://lkml.kernel.org/r/20161117193021.GB23430@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Suggested-by: Jan Kara <jack@suse.cz>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Matthew Wilcox <mawilcox@linuxonhyperv.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12 18:55:08 -08:00
Aneesh Kumar K.V 692a68c154 mm: remove the page size change check in tlb_remove_page
Now that we check for page size change early in the loop, we can
partially revert e9d55e1570 ("mm: change the interface for
__tlb_remove_page").

This simplies the code much, by removing the need to track the last
address with which we adjusted the range.  We also go back to the older
way of filling the mmu_gather array, ie, we add an entry and then check
whether the gather batch is full.

Link: http://lkml.kernel.org/r/20161026084839.27299-6-aneesh.kumar@linux.vnet.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12 18:55:07 -08:00
Aneesh Kumar K.V 07e326610e mm: add tlb_remove_check_page_size_change to track page size change
With commit e77b0852b5 ("mm/mmu_gather: track page size with mmu
gather and force flush if page size change") we added the ability to
force a tlb flush when the page size change in a mmu_gather loop.  We
did that by checking for a page size change every time we added a page
to mmu_gather for lazy flush/remove.  We can improve that by moving the
page size change check early and not doing it every time we add a page.

This also helps us to do tlb flush when invalidating a range covering
dax mapping.  Wrt dax mapping we don't have a backing struct page and
hence we don't call tlb_remove_page, which earlier forced the tlb flush
on page size change.  Moving the page size change check earlier means we
will do the same even for dax mapping.

We also avoid doing this check on architecture other than powerpc.

In a later patch we will remove page size check from tlb_remove_page().

Link: http://lkml.kernel.org/r/20161026084839.27299-5-aneesh.kumar@linux.vnet.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12 18:55:07 -08:00
Aneesh Kumar K.V b528e4b640 mm/hugetlb: add tlb_remove_hugetlb_entry for handling hugetlb pages
This add tlb_remove_hugetlb_entry similar to tlb_remove_pmd_tlb_entry.

Link: http://lkml.kernel.org/r/20161026084839.27299-4-aneesh.kumar@linux.vnet.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12 18:55:07 -08:00
Linus Torvalds 92c020d08d Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
 "The main scheduler changes in this cycle were:

   - support Intel Turbo Boost Max Technology 3.0 (TBM3) by introducig a
     notion of 'better cores', which the scheduler will prefer to
     schedule single threaded workloads on. (Tim Chen, Srinivas
     Pandruvada)

   - enhance the handling of asymmetric capacity CPUs further (Morten
     Rasmussen)

   - improve/fix load handling when moving tasks between task groups
     (Vincent Guittot)

   - simplify and clean up the cputime code (Stanislaw Gruszka)

   - improve mass fork()ed task spread a.k.a. hackbench speedup (Vincent
     Guittot)

   - make struct kthread kmalloc()ed and related fixes (Oleg Nesterov)

   - add uaccess atomicity debugging (when using access_ok() in the
     wrong context), under CONFIG_DEBUG_ATOMIC_SLEEP=y (Peter Zijlstra)

   - implement various fixes, cleanups and other enhancements (Daniel
     Bristot de Oliveira, Martin Schwidefsky, Rafael J. Wysocki)"

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits)
  sched/core: Use load_avg for selecting idlest group
  sched/core: Fix find_idlest_group() for fork
  kthread: Don't abuse kthread_create_on_cpu() in __kthread_create_worker()
  kthread: Don't use to_live_kthread() in kthread_[un]park()
  kthread: Don't use to_live_kthread() in kthread_stop()
  Revert "kthread: Pin the stack via try_get_task_stack()/put_task_stack() in to_live_kthread() function"
  kthread: Make struct kthread kmalloc'ed
  x86/uaccess, sched/preempt: Verify access_ok() context
  sched/x86: Make CONFIG_SCHED_MC_PRIO=y easier to enable
  sched/x86: Change CONFIG_SCHED_ITMT to CONFIG_SCHED_MC_PRIO
  x86/sched: Use #include <linux/mutex.h> instead of #include <asm/mutex.h>
  cpufreq/intel_pstate: Use CPPC to get max performance
  acpi/bus: Set _OSC for diverse core support
  acpi/bus: Enable HWP CPPC objects
  x86/sched: Add SD_ASYM_PACKING flags to x86 ITMT CPU
  x86/sysctl: Add sysctl for ITMT scheduling feature
  x86: Enable Intel Turbo Boost Max Technology 3.0
  x86/topology: Define x86's arch_update_cpu_topology
  sched: Extend scheduler's asym packing
  sched/fair: Clean up the tunable parameter definitions
  ...
2016-12-12 12:15:10 -08:00
Linus Torvalds 6cdf89b1ca Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
 "The tree got pretty big in this development cycle, but the net effect
  is pretty good:

    115 files changed, 673 insertions(+), 1522 deletions(-)

  The main changes were:

   - Rework and generalize the mutex code to remove per arch mutex
     primitives. (Peter Zijlstra)

   - Add vCPU preemption support: add an interface to query the
     preemption status of vCPUs and use it in locking primitives - this
     optimizes paravirt performance. (Pan Xinhui, Juergen Gross,
     Christian Borntraeger)

   - Introduce cpu_relax_yield() and remov cpu_relax_lowlatency() to
     clean up and improve the s390 lock yielding machinery and its core
     kernel impact. (Christian Borntraeger)

   - Micro-optimize mutexes some more. (Waiman Long)

   - Reluctantly add the to-be-deprecated mutex_trylock_recursive()
     interface on a temporary basis, to give the DRM code more time to
     get rid of its locking hacks. Any other users will be NAK-ed on
     sight. (We turned off the deprecation warning for the time being to
     not pollute the build log.) (Peter Zijlstra)

   - Improve the rtmutex code a bit, in light of recent long lived
     bugs/races. (Thomas Gleixner)

   - Misc fixes, cleanups"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
  x86/paravirt: Fix bool return type for PVOP_CALL()
  x86/paravirt: Fix native_patch()
  locking/ww_mutex: Use relaxed atomics
  locking/rtmutex: Explain locking rules for rt_mutex_proxy_unlock()/init_proxy_locked()
  locking/rtmutex: Get rid of RT_MUTEX_OWNER_MASKALL
  x86/paravirt: Optimize native pv_lock_ops.vcpu_is_preempted()
  locking/mutex: Break out of expensive busy-loop on {mutex,rwsem}_spin_on_owner() when owner vCPU is preempted
  locking/osq: Break out of spin-wait busy waiting loop for a preempted vCPU in osq_lock()
  Documentation/virtual/kvm: Support the vCPU preemption check
  x86/xen: Support the vCPU preemption check
  x86/kvm: Support the vCPU preemption check
  x86/kvm: Support the vCPU preemption check
  kvm: Introduce kvm_write_guest_offset_cached()
  locking/core, x86/paravirt: Implement vcpu_is_preempted(cpu) for KVM and Xen guests
  locking/spinlocks, s390: Implement vcpu_is_preempted(cpu)
  locking/core, powerpc: Implement vcpu_is_preempted(cpu)
  sched/core: Introduce the vcpu_is_preempted(cpu) interface
  sched/wake_q: Rename WAKE_Q to DEFINE_WAKE_Q
  locking/core: Provide common cpu_relax_yield() definition
  locking/mutex: Don't mark mutex_trylock_recursive() as deprecated, temporarily
  ...
2016-12-12 10:48:02 -08:00
Christian Borntraeger c19805f870 s390/cpumf: Use configuration level indication for sampling data
Newer hardware provides the level of virtualization that a particular
sample belongs to. Use that information and fall back to the old
heuristics if the sample does not contain that information.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-12 12:12:18 +01:00
Heiko Carstens b4623d4e5b s390: provide memmove implementation
Provide an s390 specific memmove implementation which is faster than
the generic implementation which copies byte-wise.

For non-destructive (as defined by the mvc instruction) memmove
operations the following table compares the old default implementation
versus the new s390 specific implementation:

size     old   new
   1     1ns   8ns
   2     2ns   8ns
   4     4ns   8ns
   8     7ns   8ns
  16    17ns   8ns
  32    35ns   8ns
  64    65ns   9ns
 128   146ns  10ns
 256   298ns  11ns
 512   537ns  11ns
1024  1193ns  19ns
2048  2405ns  36ns

So only for very small sizes the old implementation is faster. For
overlapping memmoves, where the mvc instruction can't be used, the new
implementation is as slow as the old one.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-12 12:11:32 +01:00
Heiko Carstens 82897ede92 s390: cleanup arch/s390/kernel Makefile
Group all compiler flag modification lines together and sort them
alphabetically. This should hopefully prevent future bugs due to
missing flag modifications.
Also fix indentation at some places.

Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-12 12:11:26 +01:00
Heiko Carstens d543a106f9 s390: fix initrd corruptions with gcov/kcov instrumented kernels
The early C code within arch/s390/kernel/early.c saves ipl parameters
before the bss section is cleared. When doing that it jumps to code
that is potentially gcov/kcov instrumented. That code in turn will
corrupt an initrd that potentially may reside in the not yet ready to
be used bss section.

Instead of excluding more and more code from gcov/kcov instrumentation
provide an early memmove function which will be used to save ipl
parameters. The verification if these parameters are actually valid
will be done later.

Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-12 12:11:20 +01:00
Heiko Carstens b5cb9bf8dd s390: exclude early C code from gcov profiling
Early C code must be excluded from gcov profiling since it may write
to the bss section before
- a potential initrd that resides there is rescued
- the bss section is initialized (zeroed)

This patch only addresses the problem that early code is instrumented
for profiling, but not the problem that it jumps into other code that
is still instrumented. That problem will be fixed with a follow-on
patch.

Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-12 12:10:15 +01:00
Stefan Haberland a521b048bc s390/dasd: channel path aware error recovery
With this feature, the DASD device driver more robustly handles DASDs
that are attached via multiple channel paths and are subject to
constant Interface-Control-Checks (IFCCs) and Channel-Control-Checks
(CCCs) or loss of High-Performance-FICON (HPF) functionality on one or
more of these paths.

If a channel path does not work correctly, it is removed from normal
operation as long as other channel paths are available. All extended
error recovery states can be queried and reset via user space
interfaces.

Signed-off-by: Stefan Haberland <sth@linux.vnet.ibm.com>
Reviewed-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Jan Hoeppner <hoeppner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-12 12:05:03 +01:00
Heiko Carstens 7df1160459 s390: remove unused labels from entry.S
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-12 12:04:26 +01:00
Dan Carpenter 9e6e7c7431 s390/crypto: unlock on error in prng_tdes_read()
We added some new locking but forgot to unlock on error.

Fixes: 57127645d7 ("s390/zcrypt: Introduce new SHA-512 based Pseudo Random Generator.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-12 09:01:10 +01:00
Martin KaFai Lau 17bedab272 bpf: xdp: Allow head adjustment in XDP prog
This patch allows XDP prog to extend/remove the packet
data at the head (like adding or removing header).  It is
done by adding a new XDP helper bpf_xdp_adjust_head().

It also renames bpf_helper_changes_skb_data() to
bpf_helper_changes_pkt_data() to better reflect
that XDP prog does not work on skb.

This patch adds one "xdp_adjust_head" bit to bpf_prog for the
XDP-capable driver to check if the XDP prog requires
bpf_xdp_adjust_head() support.  The driver can then decide
to error out during XDP_SETUP_PROG.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-08 14:25:13 -05:00
Viktor Mihajlovski e32eae10e5 s390/sysinfo: show partition extended name and UUID if available
Extract extended name and UUID from SYSIB 2.2.2 data.
As the code to convert the raw extended name into printable format
can be reused by stsi_2_2_2 we're moving the conversion code into a
separate function convert_ext_name.

Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-07 12:29:47 +01:00
Heiko Carstens e6d4a636ac s390/numa: pin all possible cpus to nodes early
It is required to have an early static cpu to node mapping. This patch
pins all possible cpus to nodes for which no topology information is
present. Since there is no interface available which would allow to
tell where a non-present cpu would appear topology-wise, simply use a
round robin algorithm.
Right now this makes sure that the cpu_to_node() function will return
the same value for a cpu during the life time of the system.

Acked-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-07 07:23:33 +01:00
Heiko Carstens 8c91058022 s390/numa: establish cpu to node mapping early
Initialize the cpu topology and therefore also the cpu to node mapping
much earlier. Fixes this warning and subsequent crashes when using the
fake numa emulation mode on s390:

WARNING: CPU: 0 PID: 1 at include/linux/cpumask.h:121 select_task_rq+0xe6/0x1a8
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.6.0-rc6-00001-ge9d867a67fd0-dirty #28
task: 00000001dd270008 ti: 00000001eccb4000 task.ti: 00000001eccb4000
Krnl PSW : 0404c00180000000 0000000000176c56 (select_task_rq+0xe6/0x1a8)
           R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
Call Trace:
([<0000000000176c30>] select_task_rq+0xc0/0x1a8)
([<0000000000177d64>] try_to_wake_up+0x2e4/0x478)
([<000000000015d46c>] create_worker+0x174/0x1c0)
([<0000000000161a98>] alloc_unbound_pwq+0x360/0x438)
([<0000000000162550>] apply_wqattrs_prepare+0x200/0x2a0)
([<000000000016266a>] apply_workqueue_attrs_locked+0x7a/0xb0)
([<0000000000162af0>] apply_workqueue_attrs+0x50/0x78)
([<000000000016441c>] __alloc_workqueue_key+0x304/0x520)
([<0000000000ee3706>] default_bdi_init+0x3e/0x70)
([<0000000000100270>] do_one_initcall+0x140/0x1d8)
([<0000000000ec9da8>] kernel_init_freeable+0x220/0x2d8)
([<0000000000984a7a>] kernel_init+0x2a/0x150)
([<00000000009913fa>] kernel_thread_starter+0x6/0xc)
([<00000000009913f4>] kernel_thread_starter+0x0/0xc)

Reviewed-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-07 07:23:25 +01:00
Heiko Carstens 30fc4ca2a8 s390/topology: use cpu_topology array instead of per cpu variable
CPU topology information like cpu to node mapping must be setup in
setup_arch already. Topology information is currently made available
with a per cpu variable; this however will not work when the
initialization will be moved to setup_arch, since the generic percpu
setup will be done much later.

Therefore convert back to a cpu_topology array.

Reviewed-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-07 07:23:16 +01:00
Heiko Carstens af51160ebd s390/smp: initialize cpu_present_mask in setup_arch
In order to be able to setup the cpu to node mappings early it is a
prerequisite to know which cpus are present. Therefore cpus must be
detected much earlier than before.

For sclp based cpu detection this requires yet another early sclp
call, since the system is not ready to use the regular interrupt and
memory allocations.

Reviewed-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-07 07:23:07 +01:00
Heiko Carstens ebb299a510 s390/topology: always use s390 specific sched_domain_topology_level
The s390 specific sched_domain_topology_level should always be used,
not only if the machine provides topology information. Luckily this
odd behaviour, that was by accident introduced with git commit
d05d15da18 ("s390/topology: delay initialization of topology cpu
masks") has currently no side effect.

Fixes: d05d15da18 ("s390/topology: delay initialization of topology cpumasks")
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-07 07:22:59 +01:00
Heiko Carstens 5423145f8c s390/smp: use smp_get_base_cpu() helper function
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-07 07:22:50 +01:00
Heiko Carstens 307b3114ef s390/numa: always use logical cpu and core ids
The toptree algorithm uses the physical core ids to create a mapping
between cores and nodes (to_node_id array within emu_cores structure).
The core ids are used as an index into an array which size depends on
CONFIG_NR_CPUS. If the physical core ids are larger, this will result
in out-of-bounds write accesses.

Generate logical core ids instead to avoid this.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-07 07:22:41 +01:00
Michael Holzheu 11a247e376 s390: Remove VLAIS in ptff() and clear_table()
The ptff() and clear_table() functions use the gcc extension "variable
length arrays in structures" (VLAIS) to define in the inline assembler
constraints the area of the clobbered memory. This extension will most
likely never be supported by LLVM/Clang.

Since currently BPF programs are compiled with LLVM, this leads to the
following compile errors:

 $ cd samples/bpf
 $ make

 In file included from /root/linux-master/samples/bpf/tracex1_kern.c:8:
 In file included from ./include/linux/netdevice.h:44:
 ...
 In file included from ./arch/s390/include/asm/mmu_context.h:10:
  ./arch/s390/include/asm/pgalloc.h:30:24: error: fields must have a
  constant size: 'variable length array in structure' extension will never
  be supported
         typedef struct { char _[n]; } addrtype;

 In file included from /root/linux-master/samples/bpf/tracex1_kern.c:7:
 In file included from ./include/linux/skbuff.h:18:
 ...
 In file included from ./include/linux/jiffies.h:8:
 In file included from ./include/linux/timex.h:65:
  ./arch/s390/include/asm/timex.h:105:24: error: fields must have a
  constant size: 'variable length array in structure' extension will never
  be supported
        typedef struct { char _[len]; } addrtype;

To fix this do the following:

 - Convert ptff() into a macro that then uses a fixed size array
   when expanded.
 - Convert the clear_table() function and use an inline assembly
   with fixed size array in a loop.
   The runtime performance of the new version is even better than
   the old version (tested with EC12/z13 and gcc 4.8.5/6.2.1 with
   "-march=z196 -O2").

Reported-by: Zvonko Kosic <zvonko.kosic@de.ibm.com>
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-07 07:22:30 +01:00
Martin Schwidefsky ce4dda3f02 s390: fix machine check panic stack switch
For system damage machine checks or machine checks due to invalid PSW
fields the system will be stopped. In order to get an oops message out
before killing the system the machine check handler branches to
.Lmcck_panic, switches to the panic stack and then does the usual
machine check handling.

The switch to the panic stack is incomplete, the stack pointer in %r15
is replaced, but the pt_regs pointer in %r11 is not. The result is
a program check which will kill the system in a slightly different way.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-07 07:22:13 +01:00
Heiko Carstens db7ad63624 s390/setup: fix memblock usage
When converting from bootmem to memblock I missed a subtle difference:
the memblock_alloc() functions return uninitialized memory, while the
memblock_virt_alloc() functions return zeroed memory.

This led to quite random early boot crashes.

Therefore use the correct version everywhere now.
Hopefully.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-02 07:36:24 +01:00
Heiko Carstens 9f88eb4df7 s390/kexec: use node 0 when re-adding crash kernel memory
When re-adding crash kernel memory within setup_resources() the
function memblock_add() is used. That function will add memory by
default to node "MAX_NUMNODES" instead of node 0, like the memory
detection code does. In case of !NUMA this will trigger this warning
when the kernel generates the vmemmap:

Usage of MAX_NUMNODES is deprecated. Use NUMA_NO_NODE instead
WARNING: CPU: 0 PID: 0 at mm/memblock.c:1261 memblock_virt_alloc_internal+0x76/0x220
CPU: 0 PID: 0 Comm: swapper Not tainted 4.9.0-rc6 #16
Call Trace:
 [<0000000000d0b2e8>] memblock_virt_alloc_try_nid+0x88/0xc8
 [<000000000083c8ea>] __earlyonly_bootmem_alloc.constprop.1+0x42/0x50
 [<000000000083e7f4>] vmemmap_populate+0x1ac/0x1e0
 [<0000000000840136>] sparse_mem_map_populate+0x46/0x68
 [<0000000000d0c59c>] sparse_init+0x184/0x238
 [<0000000000cf45f6>] paging_init+0xbe/0xf8
 [<0000000000cf1d4a>] setup_arch+0xa02/0xae0
 [<0000000000ced75a>] start_kernel+0x72/0x450
 [<0000000000100020>] _stext+0x20/0x80

If NUMA is selected numa_setup_memory() will fix the node assignments
before the vmemmap will be populated; so this warning will only appear
if NUMA is not selected.

To fix this simply use memblock_add_node() and re-add crash kernel
memory explicitly to node 0.

Reported-and-tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Fixes: 4e042af463 ("s390/kexec: fix crash on resize of reserved memory")
Cc: <stable@vger.kernel.org> # v4.8+
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-02 07:36:17 +01:00
Francis Yan 1c885808e4 tcp: SOF_TIMESTAMPING_OPT_STATS option for SO_TIMESTAMPING
This patch exports the sender chronograph stats via the socket
SO_TIMESTAMPING channel. Currently we can instrument how long a
particular application unit of data was queued in TCP by tracking
SOF_TIMESTAMPING_TX_SOFTWARE and SOF_TIMESTAMPING_TX_SCHED. Having
these sender chronograph stats exported simultaneously along with
these timestamps allow further breaking down the various sender
limitation.  For example, a video server can tell if a particular
chunk of video on a connection takes a long time to deliver because
TCP was experiencing small receive window. It is not possible to
tell before this patch without packet traces.

To prepare these stats, the user needs to set
SOF_TIMESTAMPING_OPT_STATS and SOF_TIMESTAMPING_OPT_TSONLY flags
while requesting other SOF_TIMESTAMPING TX timestamps. When the
timestamps are available in the error queue, the stats are returned
in a separate control message of type SCM_TIMESTAMPING_OPT_STATS,
in a list of TLVs (struct nlattr) of types: TCP_NLA_BUSY_TIME,
TCP_NLA_RWND_LIMITED, TCP_NLA_SNDBUF_LIMITED. Unit is microsecond.

Signed-off-by: Francis Yan <francisyyan@gmail.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 10:04:25 -05:00
Heiko Carstens dd5224986e s390/uapi: sort header export list
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-29 07:52:59 +01:00
Heiko Carstens aa9725ff9c s390/hypfs: add hypfs header file to uapi header export list
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-29 07:52:58 +01:00
Heiko Carstens 6bc32bf0a0 s390: use generic asm-offsets.h
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-29 07:52:57 +01:00
Heiko Carstens 9e427365af s390: convert remaining bootmem allocations to memblock
Get rid of all remaining alloc_bootmem calls and use memblock_alloc
instead everywhere.  This way we get rid of the inconsistent mixture
of alloc_bootmem and memblock_alloc usages.

Two of the alloc_bootmem_low calls within arch/s390/kernel/setup.c are
replaced with memblock_alloc calls that don't enforce that the
allocated memory is below 2GB. This restriction was never necessary.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-29 07:52:55 +01:00
Martin Schwidefsky 61aaef51cc s390: fix kernel oops for CONFIG_MARCH_Z900=y builds
The LAST_BREAK macro in entry.S uses a different instruction sequence
for CONFIG_MARCH_Z900 builds. The branch target offset to skip the
store of the last breaking event address needs to take the different
length of the code block into account.

Fixes: f8fc82b471 ("s390: move sys_call_table and last_break from thread_info to thread_struct")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-25 10:07:55 +01:00
Sebastian Ott 5c5afd0201 s390/pci: use unique UIDs for domain enumeration
Use UIDs as domain numbers if the UID checking rules apply (in this
case the FW guarantees uniqueness of these values).

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-23 16:02:26 +01:00
Heiko Carstens 80abb39b50 s390: update defconfig
Enable the contiguous memory allocator but set the default size to
zero. If somebody wants to use the cma allocator the "cma=" kernel
parameter has to be used.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-23 16:02:24 +01:00
Heiko Carstens e1231b0e48 s390: add cma support
In order to make the cma infrastructure usable we need to add a small
architecture backend which calls dma_contiguous_reserve.
Otherwise we would end up with the cma allocator enabled, but no pool
where memory can be allocated from.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-23 16:02:23 +01:00
Heiko Carstens 1e16b09666 s390/cpumf: simplify psw generation
Use the psw_bits macro and simplify the code. The generated code is
also better since it doesn't contain any conditional branches anymore.

Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-23 16:02:22 +01:00
Heiko Carstens 3a890380e4 s390/thread_info: get rid of THREAD_ORDER define
We have the s390 specific THREAD_ORDER define and the THREAD_SIZE_ORDER
define which is also used in common code. Both have exactly the same
semantics. Therefore get rid of THREAD_ORDER and always use
THREAD_SIZE_ORDER instead.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-23 16:02:21 +01:00
Heiko Carstens 56e9219a82 s390/uaccess: make setfs macro return void
For an unknown (historic) reason the s390 specific implementation of
set_fs returns whatever the __ctl_load would return. The set_fs macro
however is supposed to return void.
Change the macro to do that.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-23 16:02:19 +01:00
Christian Borntraeger e1788bb995 KVM: s390: handle floating point registers in the run ioctl not in vcpu_put/load
Right now we switch the host fprs/vrs in kvm_arch_vcpu_load and switch
back in kvm_arch_vcpu_put. This process is already optimized
since commit 9977e886cb ("s390/kernel: lazy restore fpu registers")
avoiding double save/restores on schedule. We still reload the pointers
and test the guest fpc on each context switch, though.

We can minimize the cost of vcpu_load/put by doing the test in the
VCPU_RUN ioctl itself. As most VCPU threads almost never exit to
userspace in the common fast path, this allows to avoid this overhead
for the common case (eventfd driven I/O, all exits including sleep
handled in the kernel) - making kvm_arch_vcpu_load/put basically
disappear in perf top.

Also adapt the fpu get/set ioctls.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-11-22 19:32:35 +01:00
Christian Borntraeger 31d8b8d41a KVM: s390: handle access registers in the run ioctl not in vcpu_put/load
Right now we save the host access registers in kvm_arch_vcpu_load
and load them in kvm_arch_vcpu_put. Vice versa for the guest access
registers. On schedule this means, that we load/save access registers
multiple times.

e.g. VCPU_RUN with just one reschedule and then return does

[from user space via VCPU_RUN]
- save the host registers in kvm_arch_vcpu_load (via ioctl)
- load the guest registers in kvm_arch_vcpu_load (via ioctl)
- do guest stuff
- decide to schedule/sleep
- save the guest registers in kvm_arch_vcpu_put (via sched)
- load the host registers in kvm_arch_vcpu_put (via sched)
- save the host registers in switch_to (via sched)
- schedule
- return
- load the host registers in switch_to (via sched)
- save the host registers in kvm_arch_vcpu_load (via sched)
- load the guest registers in kvm_arch_vcpu_load (via sched)
- do guest stuff
- decide to go to userspace
- save the guest registers in kvm_arch_vcpu_put (via ioctl)
- load the host registers in kvm_arch_vcpu_put (via ioctl)
[back to user space]

As the kernel does not use access registers, we can avoid
this reloading and simply piggy back on switch_to (let it save
the guest values instead of host values in thread.acrs) by
moving the host/guest switch into the VCPU_RUN ioctl function.
We now do

[from user space via VCPU_RUN]
- save the host registers in kvm_arch_vcpu_ioctl_run
- load the guest registers in kvm_arch_vcpu_ioctl_run
- do guest stuff
- decide to schedule/sleep
- save the guest registers in switch_to
- schedule
- return
- load the guest registers in switch_to (via sched)
- do guest stuff
- decide to go to userspace
- save the guest registers in kvm_arch_vcpu_ioctl_run
- load the host registers in kvm_arch_vcpu_ioctl_run

This seems to save about 10% of the vcpu_put/load functions
according to perf.

As vcpu_load no longer switches the acrs, We can also loading
the acrs in kvm_arch_vcpu_ioctl_set_sregs.

Suggested-by: Fan Zhang <zhangfan@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-11-22 19:32:35 +01:00
Christian Borntraeger 760928c0da locking/spinlocks, s390: Implement vcpu_is_preempted(cpu)
This implements the s390 version for vcpu_is_preempted(cpu),
by reworking the existing smp_vcpu_scheduled() function into
arch_vcpu_is_preempted().

We can then also get rid of the local cpu_is_preempted()
function by moving the CIF_ENABLED_WAIT test into
arch_vcpu_is_preempted().

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: David.Laight@ACULAB.COM
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: benh@kernel.crashing.org
Cc: boqun.feng@gmail.com
Cc: bsingharora@gmail.com
Cc: dave@stgolabs.net
Cc: jgross@suse.com
Cc: kernellwp@gmail.com
Cc: konrad.wilk@oracle.com
Cc: linuxppc-dev@lists.ozlabs.org
Cc: mpe@ellerman.id.au
Cc: paulmck@linux.vnet.ibm.com
Cc: paulus@samba.org
Cc: pbonzini@redhat.com
Cc: rkrcmar@redhat.com
Cc: virtualization@lists.linux-foundation.org
Cc: will.deacon@arm.com
Cc: xen-devel-request@lists.xenproject.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1478077718-37424-6-git-send-email-xinhui.pan@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-22 12:48:06 +01:00
Ingo Molnar 02cb689b2c Merge branch 'linus' into locking/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-22 12:37:38 +01:00
Christian Borntraeger 6d0d287891 locking/core: Provide common cpu_relax_yield() definition
No need to duplicate the same define everywhere. Since
the only user is stop-machine and the only provider is
s390, we can use a default implementation of cpu_relax_yield()
in sched.h.

Suggested-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Noam Camus <noamc@ezchip.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: kvm@vger.kernel.org
Cc: linux-arch@vger.kernel.org
Cc: linux-s390 <linux-s390@vger.kernel.org>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: sparclinux@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1479298985-191589-1-git-send-email-borntraeger@de.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-17 08:17:36 +01:00
Sebastian Ott a95aeb2ff4 s390/pci_dma: remove memset from dma_alloc
Get rid of a useless memset from dma_alloc. Users of dma_alloc who want
zero initialized memory can get it by specifying __GFP_ZERO or use one
of the zalloc variants.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-17 07:10:14 +01:00
Sebastian Ott 4f5359e94b s390/pci_dma: make lazy flush independent from the tlb_refresh bit
We have 2 strategies to reduce the number of RPCIT instructions:
* A HW feature indicated via the tlb_refresh bit allows us to omit RPCIT for
  invalid -> valid translation-table entry updates.
* With "lazy flush" we omit RPCIT for valid -> invalid updates until we run
  out of dma addresses. When we have to reuse dma addresses we issue a global
  tlb flush using only one RPCIT instruction.

Currently lazy flushing depends on tlb_refresh. Since there is no technical
reason for this remove this dependency.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-17 07:10:08 +01:00
Sebastian Ott 6b7df3ce92 s390/pci: fix dma address calculation in map_sg
__s390_dma_map_sg maps a dma-contiguous area. Although we only map
whole pages we have to take into account that the area doesn't start
or stop at a page boundary because we use the dma address to loop
over the individual sg entries. Failing to do that might lead to an
access of the wrong sg entry.

Fixes: ee877b81c6 ("s390/pci_dma: improve map_sg")
Reported-and-tested-by: Christoph Raisch <raisch@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-17 07:09:53 +01:00
Martin Schwidefsky 191ce9d1fd s390/time: fix clocksource steering for negative clock offsets
The TOD clock offset injected by an STP sync check can be negative.
If the resulting total tod_steering_delta gets negative the kernel
will panic.

Change the type of tod_steering_delta to a signed type.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: 75c7b6f3f6 ("s390/time: steer clocksource on STP sync events")
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-17 06:56:38 +01:00
Ingo Molnar 0acbc7aa47 Merge branch 'linus' into sched/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-16 10:16:28 +01:00
Christian Borntraeger 5bd0b85ba8 locking/core, arch: Remove cpu_relax_lowlatency()
As there are no users left, we can remove cpu_relax_lowlatency()
implementations from every architecture.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Noam Camus <noamc@ezchip.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Cc: <linux-arch@vger.kernel.org>
Link: http://lkml.kernel.org/r/1477386195-32736-6-git-send-email-borntraeger@de.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-16 10:15:11 +01:00
Christian Borntraeger 22b6430d36 locking/core, s390: Make cpu_relax() a barrier again
stop_machine() seemed to be the only important place for yielding during
cpu_relax(). This was fixed by using cpu_relax_yield().

Therefore, we can now redefine cpu_relax() to be a barrier instead on s390,
making s390 identical to all other architectures.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Noam Camus <noamc@ezchip.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1477386195-32736-4-git-send-email-borntraeger@de.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-16 10:15:10 +01:00
Christian Borntraeger 79ab11cdb9 locking/core: Introduce cpu_relax_yield()
For spinning loops people do often use barrier() or cpu_relax().
For most architectures cpu_relax and barrier are the same, but on
some architectures cpu_relax can add some latency.
For example on power,sparc64 and arc, cpu_relax can shift the CPU
towards other hardware threads in an SMT environment.
On s390 cpu_relax does even more, it uses an hypercall to the
hypervisor to give up the timeslice.
In contrast to the SMT yielding this can result in larger latencies.
In some places this latency is unwanted, so another variant
"cpu_relax_lowlatency" was introduced. Before this is used in more
and more places, lets revert the logic and provide a cpu_relax_yield
that can be called in places where yielding is more important than
latency. By default this is the same as cpu_relax on all architectures.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Noam Camus <noamc@ezchip.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1477386195-32736-2-git-send-email-borntraeger@de.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-16 10:15:09 +01:00
Martin Schwidefsky ef280c859f s390: move sys_call_table and last_break from thread_info to thread_struct
Move the last two architecture specific fields from the thread_info
structure to the thread_struct. All that is left in thread_info is
the flags field.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-15 16:48:20 +01:00
Stanislaw Gruszka 40565b5aed sched/cputime, powerpc, s390: Make scaled cputime arch specific
Only s390 and powerpc have hardware facilities allowing to measure
cputimes scaled by frequency. On all other architectures
utimescaled/stimescaled are equal to utime/stime (however they are
accounted separately).

Remove {u,s}timescaled accounting on all architectures except
powerpc and s390, where those values are explicitly accounted
in the proper places.

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20161031162143.GB12646@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-15 09:51:05 +01:00
Linus Torvalds 015ed9433b Merge branch 'maybe-uninitialized' (patches from Arnd)
Merge fixes for -Wmaybe-uninitialized from Arnd Bergmann:
 "It took a while for some patches to make it into mainline through
  maintainer trees, but the 28-patch series is now reduced to 10, with
  one tiny patch added at the end.

  Aside from patches that are no longer required, I did these changes
  compared to version 1:

   - Dropped "iio: maxim_thermocouple: detect invalid storage size in
     read()", which is currently in linux-next as commit 32cb7d27e6.
     This is the only remaining warning I see for a couple of corner
     cases (kbuild bot reports it on blackfin, kernelci bot and arm-soc
     bot both report it on arm64)

   - Dropped "brcmfmac: avoid maybe-uninitialized warning in
     brcmf_cfg80211_start_ap", which is currently in net/master merge
     pending.

   - Dropped two x86 patches, "x86: math-emu: possible uninitialized
     variable use" and "x86: mark target address as output in 'insb'
     asm" as they do not seem to trigger for a default build, and I got
     no feedback on them. Both of these are ancient issues and seem
     harmless, I will send them again to the x86 maintainers once the
     rest is merged.

   - Dropped "rbd: false-postive gcc-4.9 -Wmaybe-uninitialized" based on
     feedback from Ilya Dryomov, who already has a different fix queued
     up for v4.10. The kbuild bot reports this as a warning for xtensa.

   - Replaced "crypto: aesni: avoid -Wmaybe-uninitialized warning" with
     a simpler patch, this one always triggers but my first solution
     would not be safe for linux-4.9 any more at this point. I'll follow
     up with the larger patch as a cleanup for 4.10.

   - Replaced "dib0700: fix nec repeat handling" with a better one,
     contributed by Sean Young"

* -Wmaybe-uninitialized fixes:
  Kbuild: enable -Wmaybe-uninitialized warnings by default
  pcmcia: fix return value of soc_pcmcia_regulator_set
  infiniband: shut up a maybe-uninitialized warning
  crypto: aesni: shut up -Wmaybe-uninitialized warning
  rc: print correct variable for z8f0811
  dib0700: fix nec repeat handling
  s390: pci: don't print uninitialized data for debugging
  nios2: fix timer initcall return value
  x86: apm: avoid uninitialized data
  NFSv4.1: work around -Wmaybe-uninitialized warning
  Kbuild: enable -Wmaybe-uninitialized warning for "make W=1"
2016-11-11 10:03:01 -08:00
Arnd Bergmann 92dfffee97 s390: pci: don't print uninitialized data for debugging
gcc correctly warns about an incorrect use of the 'pa' variable in case
we pass an empty scatterlist to __s390_dma_map_sg:

  arch/s390/pci/pci_dma.c: In function '__s390_dma_map_sg':
  arch/s390/pci/pci_dma.c:309:13: warning: 'pa' may be used uninitialized in this function [-Wmaybe-uninitialized]

This adds a bogus initialization to the function to sanitize the debug
output.  I would have preferred a solution without the initialization,
but I only got the report from the kbuild bot after turning on the
warning again, and didn't manage to reproduce it myself.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-11 08:45:08 -08:00
Jakub Kicinski d7c19b066d mm: kmemleak: scan .data.ro_after_init
Limit the number of kmemleak false positives by including
.data.ro_after_init in memory scanning.  To achieve this we need to add
symbols for start and end of the section to the linker scripts.

The problem was been uncovered by commit 56989f6d85 ("genetlink: mark
families as __ro_after_init").

Link: http://lkml.kernel.org/r/1478274173-15218-1-git-send-email-jakub.kicinski@netronome.com
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-11 08:12:37 -08:00
Martin Schwidefsky 90c53e6580 s390: move cputime accounting fields from thread_info to thread_struct
The user_timer and system_timer fields are used for the per-thread
cputime accounting code. The access to these values is simpler if
they are moved to the thread_struct as the task_thread_info(tsk)
indirection is not needed anymore.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-11 16:37:43 +01:00
Martin Schwidefsky f8fc82b471 s390: move system_call field from thread_info to thread_struct
The system_call field in thread_info structure is used by the signal
code to store the number of the current system call while the debugger
interacts with its inferior. A better location for the system_call
field is with the other debugger related information in the
thread_struct.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-11 16:37:43 +01:00
Heiko Carstens d5c352cdd0 s390: move thread_info into task_struct
This is the s390 variant of commit 15f4eae70d ("x86: Move
thread_info into task_struct").

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-11 16:37:41 +01:00
Martin Schwidefsky c360192bf4 s390/preempt: move preempt_count to the lowcore
Convert s390 to use a field in the struct lowcore for the CPU
preemption count. It is a bit cheaper to access a lowcore field
compared to a thread_info variable and it removes the depencency
on a task related structure.

bloat-o-meter on the vmlinux image for the default configuration
(CONFIG_PREEMPT_NONE=y) reports a small reduction in text size:

add/remove: 0/0 grow/shrink: 18/578 up/down: 228/-5448 (-5220)

A larger improvement is achieved with the default configuration
but with CONFIG_PREEMPT=y and CONFIG_DEBUG_PREEMPT=n:

add/remove: 2/6 grow/shrink: 59/4477 up/down: 1618/-228762 (-227144)

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-11 16:37:40 +01:00
Martin Schwidefsky 1993dbc7e0 s390/bitops: use atomic primitives for bitops
Replace the bitops specific atomic update code by the functions
from atomic_ops.h. This saves a few lines of non-trivial code.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-11 16:37:38 +01:00
Martin Schwidefsky 126b30c3cb s390/atomic: refactor atomic primitives
Rework atomic.h to make the low level functions avaible for use
in other headers without using atomic_t, e.g. in bitops.h.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-11 16:37:33 +01:00
Ingo Molnar 4c8ee71620 Merge branch 'linus' into locking/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-11 08:25:07 +01:00
Sebastian Andrzej Siewior dfbbd86a0f s390/smp: Convert to hotplug state machine
cpuhp_setup_state() invokes the startup callback on all online cpus with
the proper protection, so we can remove the cpu hotplug protection from the
init function and the creation of the per cpu files for online cpus in
smp_add_present_cpu(). smp_add_present_cpu() is called also called from
__smp_rescan_cpus(), but this callpath never adds an online cpu, it merily
adds newly present cpus, so the creation of the cpu files is not required.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: linux-s390@vger.kernel.org
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20161104144502.7kd4bxz2rxqvtack@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-11-09 23:45:28 +01:00
Thomas Gleixner ef65d45cbf s390/smp: Make cpu notifier symetric
There is no reason to remove the sysfs cpu files when the CPU is dead, they
can be removed when the cpu is prepared to go down. Doing it at
DOWN_PREPARE allows us to convert it to a symetric hotplug state in the
next step.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: linux-s390@vger.kernel.org
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20161104144140.lcee6kwmwlx37m7g@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-11-09 23:45:28 +01:00
Linus Torvalds ae67e87f40 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Martin Schwidefsky:
 "Two bug fixes

   - a memory alignment fix in the s390 only hypfs code

   - a fix for the generic percpu code that caused ftrace to break on
     s390. This is not relevant for x86 but for all architectures that
     use the generic percpu code"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  percpu: use notrace variant of preempt_disable/preempt_enable
  s390/hypfs: Use get_free_page() instead of kmalloc to ensure page alignment
2016-11-09 11:09:40 -08:00
Masahiro Yamada 847e070012 s390: remove unneeded dependency for gen_facilities
The dependency between the object and the source is handled by
scripts/Makefile.host, so only "hostprogs-y += gen_facilities"
is fine.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-08 10:30:28 +01:00
Masahiro Yamada d1f7e8f85b s390: squash facilities_src.h into gen_facilities.c
We generally expect headers in arch/$(ARCH)/include/asm directory
are included from kernel sources, but facilities_src.h is not;
it is included from the arch/s390/tools/gen_facilities.c tool.

There is no reason to expose this header to the public include path.
Furthermore, facilities_src.h makes sure to be included only from
gen_facilities.c by the following:

  #ifndef S390_GEN_FACILITIES_C
  #error "This file can only be included by gen_facilities.c"
  #endif

This check can be removed by merging the two files.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-07 11:06:57 +01:00
Masahiro Yamada 72cc9918ef s390: delete unneeded #include <linux/kconfig.h> from facilities_src.h
The header facilities_src.h is only included from gen_facilities.c
and the tool is compiled with the following extra options:

    HOSTCFLAGS_gen_facilities.o += -Wall $(LINUXINCLUDE)

Please note $(LINUXINCLUDE) is expanded into build options including:

    -include $(srctree)/include/linux/kconfig.h

So, the Makefile always forces the tool to include kconfig.h, i.e.,
the #include <linux/kconfig.h> directive in the header is redundant.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-07 11:06:56 +01:00
Linus Torvalds 66cecb6789 One NULL pointer dereference, and two fixes for regressions introduced
during the merge window.  The rest are fixes for MIPS, s390 and nested VMX.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJYG2H5AAoJEL/70l94x66DK/cH/0jEQ3ynuLAd5CKux7JxI/EP
 msSJh1Xqr4+XhXZnuDpGQWrdsBlxoiqA6PsJrUTtyi4nQCDXlT8g+2MDuvqhWIHz
 7vw58j/EMJDCVQzYAbN5VDUfk13uB5aSWTo3M9Rf09v0hU1Ql7z8u4CtKEdLpN5Y
 LY9bT9fxUmXO7REKP7bdW6ZrDX/hUShYHgMqzXGFMyGBG3ym3a9bggXEzTCD6eNQ
 ioogQIWqg+icdhta0iLNAwFClPlcKB2/xo4IUuNgrPwGoHFGJN/8+qxT4+sVbp2B
 v8u1zOXlCFXBcskWE+yRRsGe72+mIzz6QScCyO+5HbhKYVfbE9H7KBlFX9rZZ2c=
 =IbKx
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:
 "One NULL pointer dereference, and two fixes for regressions introduced
  during the merge window.

  The rest are fixes for MIPS, s390 and nested VMX"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  kvm: x86: Check memopp before dereference (CVE-2016-8630)
  kvm: nVMX: VMCLEAR an active shadow VMCS after last use
  KVM: x86: drop TSC offsetting kvm_x86_ops to fix KVM_GET/SET_CLOCK
  KVM: x86: fix wbinvd_dirty_mask use-after-free
  kvm/x86: Show WRMSR data is in hex
  kvm: nVMX: Fix kernel panics induced by illegal INVEPT/INVVPID types
  KVM: document lock orders
  KVM: fix OOPS on flush_work
  KVM: s390: Fix STHYI buffer alignment for diag224
  KVM: MIPS: Precalculate MMIO load resume PC
  KVM: MIPS: Make ERET handle ERL before EXL
  KVM: MIPS: Fix lazy user ASID regenerate for SMP
2016-11-04 13:08:05 -07:00
Paul Gortmaker 8ba8b05f57 s390: kernel: make lgr explicitly non-modular
The Makefile currently controlling compilation of this code is obj-y
meaning that it currently is not being built as a module by anyone.

Lets remove the couple traces of modular infrastructure use, so that
when reading the driver there is no doubt it is builtin-only.

Since module_init translates to device_initcall in the non-modular
case, the init ordering remains unchanged with this commit.

We replace module.h with init.h and export.h since the file does
export some symbols.

Cc: linux-s390@vger.kernel.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-10-31 17:55:42 +01:00
Paul Gortmaker cee672e180 s390: hypfs: make inode explicitly non-modular
The Kconfig currently controlling compilation of this code is:

arch/s390/Kconfig:config S390_HYPFS_FS
arch/s390/Kconfig:      def_bool y

...meaning that it currently is not being built as a module by anyone.

Lets remove the couple traces of modular infrastructure use, so that
when reading the driver there is no doubt it is builtin-only.

Since module_init translates to device_initcall in the non-modular
case, the init ordering remains unchanged with this commit.

Also note that MODULE_ALIAS is a no-op for non-modular code.

We also delete the MODULE_LICENSE tag etc. since all that information
was (or is now) contained at the top of the file in the comments.

Build testing indicated the presence of module.h was masking an
implicit include of kobject.h, hence the addition of that.

Cc: linux-s390@vger.kernel.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-10-31 17:55:40 +01:00
Martin Schwidefsky 75c7b6f3f6 s390/time: steer clocksource on STP sync events
On STP sync events the TOD clock will jump in time, either forward or
backward. The TOD clocksource claims to be continuous but in case of
an STP sync with a negative offset it is not.

Subtract the offset injected by the STP sync check from the result of
the TOD clocksource to make it continuous again. Add code to drift the
offset towards zero with a fixed rate, steering 1 second in ~9 hours.

Suggested-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-10-28 10:09:02 +02:00
Martin Schwidefsky 2ace06ec0d s390/time: adjust last_update_clock at clock synchronization
The last_update_clock time stamp in the lowcore should be adjusted by
the TOD clock delta that is created by the clock synchronization.
Otherwise the calculation of the steal time will be incorrect.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-10-28 10:09:02 +02:00
Martin Schwidefsky b1c0854d16 s390/time: refactor clock sync
Merge clock_sync_cpu into stp_sync_clock and split out the update
of the global and per-CPU clock fields into clock_sync_global
and clock_sync_local.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-10-28 10:09:02 +02:00
Michael Holzheu 237d6e6884 s390/hypfs: Use get_free_page() instead of kmalloc to ensure page alignment
Since commit d86bd1bece ("mm/slub: support left redzone") it is no longer
guaranteed that kmalloc(PAGE_SIZE) returns page aligned memory.

After the above commit we get an error for diag224 because aligned
memory is required. This leads to the following user visible error:

 # mount none -t s390_hypfs /sys/hypervisor/
 mount: unknown filesystem type 's390_hypfs'

 # dmesg | grep hypfs
 hypfs.cccfb8: The hardware system does not provide all functions
               required by hypfs
 hypfs.7a79f0: Initialization of hypfs failed with rc=-61

Fix this problem and use get_free_page() instead of kmalloc() to get
correctly aligned memory.

Cc: stable@vger.kernel.org # v3.6+
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-10-28 10:08:58 +02:00
Linus Torvalds 55bea71ed5 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Martin Schwidefsky:
 "A few more s390 patches for 4.9:
   - a fix for an overflow in the dasd driver reported by UBSAN
   - fix a regression and add hotplug memory to the zone movable again
   - add ignore defines for the pkey system calls
   - fix the ouput of the merged stack tracer
   - replace printk with pr_cont in arch/s390 where appropriate
   - remove the arch specific return_address function again
   - ignore reserved channel paths at boot time
   - add a missing hugetlb_bad_size call to the arch backend"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/mm: fix zone calculation in arch_add_memory()
  s390/dumpstack: use pr_cont within show_stack and die
  s390/dumpstack: get rid of return_address again
  s390/disassambler: use pr_cont where appropriate
  s390/dumpstack: use pr_cont where appropriate
  s390/dumpstack: restore reliable indicator for call traces
  s390/mm: use hugetlb_bad_size()
  s390/cio: don't register chpids in reserved state
  s390: ignore pkey system calls
  s390/dasd: avoid undefined behaviour
2016-10-27 14:16:30 -07:00
Paolo Bonzini b5149a5fd1 KVM: s390: Fix wrong memory allocation
With commit d86bd1bece ("mm/slub: support left redzone") or
 with slab debugging the allocation of our diag224 buffer is not
 aligned properly. Let's fix this.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.14 (GNU/Linux)
 
 iQIcBAABAgAGBQJYELGIAAoJEBF7vIC1phx8eY0P/jGHsq8vvtvJFQxwMLx0MPkS
 HOd/kw9nupyNeJMrvLGgcpyxfLTbCJsdA5P+ZpgZEUygwAkjs/EFbzGfgWL6kY4t
 pT9l4uo0LfdGdNITNVWNgUvpFC4VIIlyQAeTdMG2+hUlaIF0I1zpdQDYa9umacKN
 lGuKRObTH7kVX6yj02kTTallBKDdtl45fSAPCoMSFImi+Hlyn8BBVTlbNPujoOk0
 i50vWzjn711Fm/qnc0p/8XpLTcZLdYPm8vToxD6WR6C8mjepb3+JyHgTx4u9WRVY
 cvVVTrqDvrUaqOss4KTssbgzSKZ74aPPlQm27n2l8scpAcHbr+1ZY59PnkEdoF2Y
 jA5JHeOfqijneOf0F7vwGAG2c4MU5rIVq7bFbxVGVglb3S5TBTRnd63Q0kUsJc1g
 amrdFOjXtkyqpusEJMIsLMkhw2ICjQ7D2oXIRq8GBqxnEoMh99KUYdFu3Bd21/Bv
 uP5vBvvuKx0jtD7WBkIOL9Df0YEhKYwqmonubPyUwr/8ok04g/dKTPgelXzCVFts
 e2uXkQ0uSxT2l04E2iSiq7qMM4CMNVqhkJHHHyBAxisHpM3Wdbyq2y0evOqATwzN
 F26a4JAEhOxRbytnaoRH/XaGgGLfKFegSHZoJ/dtrxuBIBV2t4aOHHhR0N0FTyK0
 bPCH857jWJV2yB1ym4OY
 =YoA+
 -----END PGP SIGNATURE-----

Merge tag 'kvm-s390-master-4.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

KVM: s390: Fix wrong memory allocation

With commit d86bd1bece ("mm/slub: support left redzone") or
with slab debugging the allocation of our diag224 buffer is not
aligned properly. Let's fix this.
2016-10-27 13:22:54 +02:00
Janosch Frank 45c7ee43a5 KVM: s390: Fix STHYI buffer alignment for diag224
Diag224 requires a page-aligned 4k buffer to store the name table
into. kmalloc does not guarantee page alignment, hence we replace it
with __get_free_page for the buffer allocation.

Cc: stable@vger.kernel.org # v4.8+
Reported-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-10-26 13:46:44 +02:00
Peter Zijlstra 890658b7ab locking/mutex: Kill arch specific code
Its all generic atomic_long_t stuff now.

Tested-by: Jason Low <jason.low2@hpe.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arch@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-10-25 11:31:51 +02:00
Gerald Schaefer 4a65429457 s390/mm: fix zone calculation in arch_add_memory()
Standby (hotplug) memory should be added to ZONE_MOVABLE on s390. After
commit 199071f1 "s390/mm: make arch_add_memory() NUMA aware",
arch_add_memory() used memblock_end_of_DRAM() to find out the end of
ZONE_NORMAL and the beginning of ZONE_MOVABLE. However, commit 7f36e3e5
"memory-hotplug: add hot-added memory ranges to memblock before allocate
node_data for a node." moved the call of memblock_add_node() before
the call of arch_add_memory() in add_memory_resource(), and thus changed
the return value of memblock_end_of_DRAM() when called in
arch_add_memory(). As a result, arch_add_memory() will think that all
memory blocks should be added to ZONE_NORMAL.

Fix this by changing the logic in arch_add_memory() so that it will
manually iterate over all zones of a given node to find out which zone
a memory block should be added to.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-10-24 10:26:17 +02:00
Heiko Carstens 47ece7fef4 s390/dumpstack: use pr_cont within show_stack and die
Use pr_cont instead of printk calls also within show_stack and
die in order to avoid extra line breaks.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-10-24 10:26:14 +02:00
Linus Torvalds a23b27ae12 KVM fixes for v4.9-rc2
ARM:
  - avoid livelock when walking guest page tables
  - fix HYP mode static keys without CC_HAVE_ASM_GOTO
 
 MIPS:
  - fix a build error without TRACEPOINTS_ENABLED
 
 s390:
  - reject a malformed userspace configuration
 
 x86:
  - suppress a warning without CONFIG_CPU_FREQ
  - initialize whole irq_eoi array
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABCAAGBQJYCl1iAAoJEED/6hsPKofo7pUH/R/sL417YLTkY6UVhtrCXQq1
 cUPWLLp96/Ijkmb+PoByLn5msKxhUa9A06QfphKCbmvpInubXPTxaWDCpoXxHmCO
 ywHmwuNk7Zgc8MnvcqBKte1jo8/JxQTM1NYZEys7va+J/fC4Nqb9gjZnECSTfUK5
 JE8bPs+yxVSavsh0KOZcTdTHtuZQ6SQijgDkE4pSDBYhCKxIpYAXaKVUOC+VSTDH
 ACUMLvUrFlFbAev0z4oF4CSKotAq6VEkJQhequghKPUHSeWabZB4wAHTkfUbJ+Bb
 Ar57zrz5YCGbojywuHi1954eHWv6AfWyD8bnYSCtD4gsIRws+dH/MIiPgEMjLOQ=
 =9U78
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Radim Krčmář:
 "ARM:
   - avoid livelock when walking guest page tables
   - fix HYP mode static keys without CC_HAVE_ASM_GOTO

  MIPS:
   - fix a build error without TRACEPOINTS_ENABLED

  s390:
   - reject a malformed userspace configuration

  x86:
   - suppress a warning without CONFIG_CPU_FREQ
   - initialize whole irq_eoi array"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  arm/arm64: KVM: Map the BSS at HYP
  arm64: KVM: Take S1 walks into account when determining S2 write faults
  KVM: s390: reject invalid modes for runtime instrumentation
  kvm: x86: memset whole irq_eoi
  kvm/x86: Fix unused variable warning in kvm_timer_init()
  KVM: MIPS: Add missing uaccess.h include
2016-10-21 19:09:29 -07:00
Christian Borntraeger a5efb6b6c9 KVM: s390: reject invalid modes for runtime instrumentation
Usually a validity intercept is a programming error of the host
because of invalid entries in the state description.
We can get a validity intercept if the mode of the runtime
instrumentation control block is wrong. As the host does not know
which modes are valid, this can be used by userspace to trigger
a WARN.
Instead of printing a WARN let's return an error to userspace as
this can only happen if userspace provides a malformed initial
value (e.g. on migration). The kernel should never warn on bogus
input. Instead let's log it into the s390 debug feature.

While at it, let's return -EINVAL for all validity intercepts as
this will trigger an error in QEMU like

error: kvm run failed Invalid argument
PSW=mask 0404c00180000000 addr 000000000063c226 cc 00
R00=000000000000004f R01=0000000000000004 R02=0000000000760005 R03=000000007fe0a000
R04=000000000064ba2a R05=000000049db73dd0 R06=000000000082c4b0 R07=0000000000000041
R08=0000000000000002 R09=000003e0804042a8 R10=0000000496152c42 R11=000000007fe0afb0
[...]

This will avoid an endless loop of validity intercepts.

Cc: stable@vger.kernel.org # v4.5+
Fixes: c6e5f16637 ("KVM: s390: implement the RI support of guest")
Acked-by: Fan Zhang <zhangfan@linux.vnet.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-10-20 20:06:12 +02:00
Linus Torvalds 63ae602cea Merge branch 'gup_flag-cleanups'
Merge the gup_flags cleanups from Lorenzo Stoakes:
 "This patch series adjusts functions in the get_user_pages* family such
  that desired FOLL_* flags are passed as an argument rather than
  implied by flags.

  The purpose of this change is to make the use of FOLL_FORCE explicit
  so it is easier to grep for and clearer to callers that this flag is
  being used.  The use of FOLL_FORCE is an issue as it overrides missing
  VM_READ/VM_WRITE flags for the VMA whose pages we are reading
  from/writing to, which can result in surprising behaviour.

  The patch series came out of the discussion around commit 38e0885465
  ("mm: check VMA flags to avoid invalid PROT_NONE NUMA balancing"),
  which addressed a BUG_ON() being triggered when a page was faulted in
  with PROT_NONE set but having been overridden by FOLL_FORCE.
  do_numa_page() was run on the assumption the page _must_ be one marked
  for NUMA node migration as an actual PROT_NONE page would have been
  dealt with prior to this code path, however FOLL_FORCE introduced a
  situation where this assumption did not hold.

  See

      https://marc.info/?l=linux-mm&m=147585445805166

  for the patch proposal"

Additionally, there's a fix for an ancient bug related to FOLL_FORCE and
FOLL_WRITE by me.

[ This branch was rebased recently to add a few more acked-by's and
  reviewed-by's ]

* gup_flag-cleanups:
  mm: replace access_process_vm() write parameter with gup_flags
  mm: replace access_remote_vm() write parameter with gup_flags
  mm: replace __access_remote_vm() write parameter with gup_flags
  mm: replace get_user_pages_remote() write/force parameters with gup_flags
  mm: replace get_user_pages() write/force parameters with gup_flags
  mm: replace get_vaddr_frames() write/force parameters with gup_flags
  mm: replace get_user_pages_locked() write/force parameters with gup_flags
  mm: replace get_user_pages_unlocked() write/force parameters with gup_flags
  mm: remove write/force parameters from __get_user_pages_unlocked()
  mm: remove write/force parameters from __get_user_pages_locked()
  mm: remove gup_flags FOLL_WRITE games from __get_user_pages()
2016-10-19 08:39:47 -07:00
Lorenzo Stoakes c164154f66 mm: replace get_user_pages_unlocked() write/force parameters with gup_flags
This removes the 'write' and 'force' use from get_user_pages_unlocked()
and replaces them with 'gup_flags' to make the use of FOLL_FORCE
explicit in callers as use of this flag can result in surprising
behaviour (and hence bugs) within the mm subsystem.

Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-18 14:13:37 -07:00
Heiko Carstens dcddba96cd s390/dumpstack: get rid of return_address again
With commit ef6000b4c6 ("Disable the __builtin_return_address()
warning globally after all)" the kernel does not warn at all again if
__builtin_return_address(n) is called with n > 0.

Besides the fact that this was a false warning on s390 anyway, due to
the always present backchain, we can now revert commit 5606330627
("s390/dumpstack: implement and use return_address()") again, to
simplify the code again.

After all I shouldn't have had return_address() implememted at all to
workaround this issue. So get rid of this again.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-10-17 14:44:33 +02:00
Heiko Carstens 4d062487f3 s390/disassambler: use pr_cont where appropriate
Just like for dumpstack use pr_cont instead of simple printk calls to
fix the output when disassembling a piece of code.

Before:
[    0.840627] Krnl Code: 000000000017d1c6: a77400f7            brc     7,17d3b4
[    0.840630]
                          000000000017d1ca: 92015000            mvi     0(%r5),1
[    0.840634]
                         #000000000017d1ce: a7f40001            brc     15,17d1d0

After:
[    0.831792] Krnl Code: 000000000017d13e: a77400f7            brc     7,17d32c
                          000000000017d142: 92015000            mvi     0(%r5),1
                         #000000000017d146: a7f40001            brc     15,17d148

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-10-17 14:44:32 +02:00
Heiko Carstens a790634544 s390/dumpstack: use pr_cont where appropriate
Use pr_cont instead of simple printk calls when lines will be
continued. This fixes the kernel output of various lines printed on
e.g. a warning:

Before:
[    0.840604] Krnl PSW : 0404c00180000000 000000000017d1d2
[    0.840606]  (try_to_wake_up+0x382/0x5e0)

[    0.840610]            R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0
[    0.840611]  RI:0 EA:3

After:
[    0.831772] Krnl PSW : 0404c00180000000 000000000017d14a (try_to_wake_up+0x382/0x5e0)
[    0.831776]            R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-10-17 14:44:31 +02:00
Heiko Carstens d0208639db s390/dumpstack: restore reliable indicator for call traces
Before merging all different stack tracers the call traces printed had
an indicator if an entry can be considered reliable or not.
Unreliable entries were put in braces, reliable not. Currently all
lines contain these extra braces.

This patch restores the old behaviour by adding an extra "reliable"
parameter to the callback functions. Only show_trace makes currently
use of it.

Before:
[    0.804751] Call Trace:
[    0.804753] ([<000000000017d0e0>] try_to_wake_up+0x318/0x5e0)
[    0.804756] ([<0000000000161d64>] create_worker+0x174/0x1c0)

After:
[    0.804751] Call Trace:
[    0.804753] ([<000000000017d0e0>] try_to_wake_up+0x318/0x5e0)
[    0.804756]  [<0000000000161d64>] create_worker+0x174/0x1c0

Fixes: 758d39ebd3 ("s390/dumpstack: merge all four stack tracers")
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-10-17 14:44:30 +02:00
Shyam Saini b5003b5f0a s390/mm: use hugetlb_bad_size()
Update setup_hugepagesz() to call hugetlb_bad_size() when unsupported
hugepage size is found.

Signed-off-by: Shyam Saini <mayhs11saini@gmail.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-10-17 11:25:26 +02:00
Heiko Carstens 12e721964e s390: ignore pkey system calls
Ignore the pkey systems calls since they don't make any sense on s390.
In addition any user could trigger a warning if issueing the pkey_free
system call, if it would be wired up on a system without pkey support.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-10-17 11:25:25 +02:00
Linus Torvalds 84d69848c9 Merge branch 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild
Pull kbuild updates from Michal Marek:

 - EXPORT_SYMBOL for asm source by Al Viro.

   This does bring a regression, because genksyms no longer generates
   checksums for these symbols (CONFIG_MODVERSIONS). Nick Piggin is
   working on a patch to fix this.

   Plus, we are talking about functions like strcpy(), which rarely
   change prototypes.

 - Fixes for PPC fallout of the above by Stephen Rothwell and Nick
   Piggin

 - fixdep speedup by Alexey Dobriyan.

 - preparatory work by Nick Piggin to allow architectures to build with
   -ffunction-sections, -fdata-sections and --gc-sections

 - CONFIG_THIN_ARCHIVES support by Stephen Rothwell

 - fix for filenames with colons in the initramfs source by me.

* 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild: (22 commits)
  initramfs: Escape colons in depfile
  ppc: there is no clear_pages to export
  powerpc/64: whitelist unresolved modversions CRCs
  kbuild: -ffunction-sections fix for archs with conflicting sections
  kbuild: add arch specific post-link Makefile
  kbuild: allow archs to select link dead code/data elimination
  kbuild: allow architectures to use thin archives instead of ld -r
  kbuild: Regenerate genksyms lexer
  kbuild: genksyms fix for typeof handling
  fixdep: faster CONFIG_ search
  ia64: move exports to definitions
  sparc32: debride memcpy.S a bit
  [sparc] unify 32bit and 64bit string.h
  sparc: move exports to definitions
  ppc: move exports to definitions
  arm: move exports to definitions
  s390: move exports to definitions
  m68k: move exports to definitions
  alpha: move exports to actual definitions
  x86: move exports to actual definitions
  ...
2016-10-14 14:26:58 -07:00
Linus Torvalds 101105b171 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull more vfs updates from Al Viro:
 ">rename2() work from Miklos + current_time() from Deepa"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fs: Replace current_fs_time() with current_time()
  fs: Replace CURRENT_TIME_SEC with current_time() for inode timestamps
  fs: Replace CURRENT_TIME with current_time() for inode timestamps
  fs: proc: Delete inode time initializations in proc_alloc_inode()
  vfs: Add current_time() api
  vfs: add note about i_op->rename changes to porting
  fs: rename "rename2" i_op to "rename"
  vfs: remove unused i_op->rename
  fs: make remaining filesystems use .rename2
  libfs: support RENAME_NOREPLACE in simple_rename()
  fs: support RENAME_NOREPLACE for local filesystems
  ncpfs: fix unused variable warning
2016-10-10 20:16:43 -07:00
Al Viro 3873691e5a Merge remote-tracking branch 'ovl/rename2' into for-linus 2016-10-10 23:02:51 -04:00
Alexey Dobriyan 81243eacfa cred: simpler, 1D supplementary groups
Current supplementary groups code can massively overallocate memory and
is implemented in a way so that access to individual gid is done via 2D
array.

If number of gids is <= 32, memory allocation is more or less tolerable
(140/148 bytes).  But if it is not, code allocates full page (!)
regardless and, what's even more fun, doesn't reuse small 32-entry
array.

2D array means dependent shifts, loads and LEAs without possibility to
optimize them (gid is never known at compile time).

All of the above is unnecessary.  Switch to the usual
trailing-zero-len-array scheme.  Memory is allocated with
kmalloc/vmalloc() and only as much as needed.  Accesses become simpler
(LEA 8(gi,idx,4) or even without displacement).

Maximum number of gids is 65536 which translates to 256KB+8 bytes.  I
think kernel can handle such allocation.

On my usual desktop system with whole 9 (nine) aux groups, struct
group_info shrinks from 148 bytes to 44 bytes, yay!

Nice side effects:

 - "gi->gid[i]" is shorter than "GROUP_AT(gi, i)", less typing,

 - fix little mess in net/ipv4/ping.c
   should have been using GROUP_AT macro but this point becomes moot,

 - aux group allocation is persistent and should be accounted as such.

Link: http://lkml.kernel.org/r/20160817201927.GA2096@p183.telecom.by
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Vasily Kulikov <segoon@openwall.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-07 18:46:30 -07:00
Marcin Nowakowski ea036230f7 uprobes: remove function declarations from arch/{mips,s390}
The declarations of arch-specific functions have been moved to a common
header in commit 3820b4d278 ('uprobes: Move function declarations out
of arch'), but MIPS and S390 has added them to their own trees later.
Remove the unnecessary duplicates.

Link: http://lkml.kernel.org/r/1472804384-17830-1-git-send-email-marcin.nowakowski@imgtec.com
Signed-off-by: Marcin Nowakowski <marcin.nowakowski@imgtec.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-07 18:46:30 -07:00
Chris Metcalf 6727ad9e20 nmi_backtrace: generate one-line reports for idle cpus
When doing an nmi backtrace of many cores, most of which are idle, the
output is a little overwhelming and very uninformative.  Suppress
messages for cpus that are idling when they are interrupted and just
emit one line, "NMI backtrace for N skipped: idling at pc 0xNNN".

We do this by grouping all the cpuidle code together into a new
.cpuidle.text section, and then checking the address of the interrupted
PC to see if it lies within that section.

This commit suitably tags x86 and tile idle routines, and only adds in
the minimal framework for other architectures.

Link: http://lkml.kernel.org/r/1472487169-14923-5-git-send-email-cmetcalf@mellanox.com
Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Daniel Thompson <daniel.thompson@linaro.org> [arm]
Tested-by: Petr Mladek <pmladek@suse.com>
Cc: Aaron Tomlin <atomlin@redhat.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-07 18:46:30 -07:00
Vineet Gupta 51a021244b atomic64: no need for CONFIG_ARCH_HAS_ATOMIC64_DEC_IF_POSITIVE
This came to light when implementing native 64-bit atomics for ARCv2.

The atomic64 self-test code uses CONFIG_ARCH_HAS_ATOMIC64_DEC_IF_POSITIVE
to check whether atomic64_dec_if_positive() is available.  It seems it
was needed when not every arch defined it.  However as of current code
the Kconfig option seems needless

 - for CONFIG_GENERIC_ATOMIC64 it is auto-enabled in lib/Kconfig and a
   generic definition of API is present lib/atomic64.c
 - arches with native 64-bit atomics select it in arch/*/Kconfig and
   define the API in their headers

So I see no point in keeping the Kconfig option

Compile tested for:
 - blackfin (CONFIG_GENERIC_ATOMIC64)
 - x86 (!CONFIG_GENERIC_ATOMIC64)
 - ia64

Link: http://lkml.kernel.org/r/1473703083-8625-3-git-send-email-vgupta@synopsys.com
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Zhaoxiu Zeng <zhaoxiu.zeng@gmail.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Ming Lin <ming.l@ssi.samsung.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-07 18:46:30 -07:00
Yisheng Xie 461a718432 mm/hugetlb: introduce ARCH_HAS_GIGANTIC_PAGE
Avoid making ifdef get pretty unwieldy if many ARCHs support gigantic
page.  No functional change with this patch.

Link: http://lkml.kernel.org/r/1475227569-63446-2-git-send-email-xieyisheng1@huawei.com
Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
Suggested-by: Michal Hocko <mhocko@suse.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-07 18:46:29 -07:00
Linus Torvalds 6218590bcb KVM updates for v4.9-rc1
All architectures:
   Move `make kvmconfig` stubs from x86;  use 64 bits for debugfs stats.
 
 ARM:
   Important fixes for not using an in-kernel irqchip; handle SError
   exceptions and present them to guests if appropriate; proxying of GICV
   access at EL2 if guest mappings are unsafe; GICv3 on AArch32 on ARMv8;
   preparations for GICv3 save/restore, including ABI docs; cleanups and
   a bit of optimizations.
 
 MIPS:
   A couple of fixes in preparation for supporting MIPS EVA host kernels;
   MIPS SMP host & TLB invalidation fixes.
 
 PPC:
   Fix the bug which caused guests to falsely report lockups; other minor
   fixes; a small optimization.
 
 s390:
   Lazy enablement of runtime instrumentation; up to 255 CPUs for nested
   guests; rework of machine check deliver; cleanups and fixes.
 
 x86:
   IOMMU part of AMD's AVIC for vmexit-less interrupt delivery; Hyper-V
   TSC page; per-vcpu tsc_offset in debugfs; accelerated INS/OUTS in
   nVMX; cleanups and fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABCAAGBQJX9iDrAAoJEED/6hsPKofoOPoIAIUlgojkb9l2l1XVDgsXdgQL
 sRVhYSVv7/c8sk9vFImrD5ElOPZd+CEAIqFOu45+NM3cNi7gxip9yftUVs7wI5aC
 eDZRWm1E4trDZLe54ZM9ThcqZzZZiELVGMfR1+ZndUycybwyWzafpXYsYyaXp3BW
 hyHM3qVkoWO3dxBWFwHIoO/AUJrWYkRHEByKyvlC6KPxSdBPSa5c1AQwMCoE0Mo4
 K/xUj4gBn9eMelNhg4Oqu/uh49/q+dtdoP2C+sVM8bSdquD+PmIeOhPFIcuGbGFI
 B+oRpUhIuntN39gz8wInJ4/GRSeTuR2faNPxMn4E1i1u4LiuJvipcsOjPfe0a18=
 =fZRB
 -----END PGP SIGNATURE-----

Merge tag 'kvm-4.9-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Radim Krčmář:
 "All architectures:
   - move `make kvmconfig` stubs from x86
   - use 64 bits for debugfs stats

  ARM:
   - Important fixes for not using an in-kernel irqchip
   - handle SError exceptions and present them to guests if appropriate
   - proxying of GICV access at EL2 if guest mappings are unsafe
   - GICv3 on AArch32 on ARMv8
   - preparations for GICv3 save/restore, including ABI docs
   - cleanups and a bit of optimizations

  MIPS:
   - A couple of fixes in preparation for supporting MIPS EVA host
     kernels
   - MIPS SMP host & TLB invalidation fixes

  PPC:
   - Fix the bug which caused guests to falsely report lockups
   - other minor fixes
   - a small optimization

  s390:
   - Lazy enablement of runtime instrumentation
   - up to 255 CPUs for nested guests
   - rework of machine check deliver
   - cleanups and fixes

  x86:
   - IOMMU part of AMD's AVIC for vmexit-less interrupt delivery
   - Hyper-V TSC page
   - per-vcpu tsc_offset in debugfs
   - accelerated INS/OUTS in nVMX
   - cleanups and fixes"

* tag 'kvm-4.9-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (140 commits)
  KVM: MIPS: Drop dubious EntryHi optimisation
  KVM: MIPS: Invalidate TLB by regenerating ASIDs
  KVM: MIPS: Split kernel/user ASID regeneration
  KVM: MIPS: Drop other CPU ASIDs on guest MMU changes
  KVM: arm/arm64: vgic: Don't flush/sync without a working vgic
  KVM: arm64: Require in-kernel irqchip for PMU support
  KVM: PPC: Book3s PR: Allow access to unprivileged MMCR2 register
  KVM: PPC: Book3S PR: Support 64kB page size on POWER8E and POWER8NVL
  KVM: PPC: Book3S: Remove duplicate setting of the B field in tlbie
  KVM: PPC: BookE: Fix a sanity check
  KVM: PPC: Book3S HV: Take out virtual core piggybacking code
  KVM: PPC: Book3S: Treat VTB as a per-subcore register, not per-thread
  ARM: gic-v3: Work around definition of gic_write_bpr1
  KVM: nVMX: Fix the NMI IDT-vectoring handling
  KVM: VMX: Enable MSR-BASED TPR shadow even if APICv is inactive
  KVM: nVMX: Fix reload apic access page warning
  kvmconfig: add virtio-gpu to config fragment
  config: move x86 kvm_guest.config to a common location
  arm64: KVM: Remove duplicating init code for setting VMID
  ARM: KVM: Support vgic-v3
  ...
2016-10-06 10:49:01 -07:00
Linus Torvalds e46cae4418 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Martin Schwidefsky:
 "The new features and main improvements in this merge for v4.9

   - Support for the UBSAN sanitizer

   - Set HAVE_EFFICIENT_UNALIGNED_ACCESS, it improves the code in some
     places

   - Improvements for the in-kernel fpu code, in particular the overhead
     for multiple consecutive in kernel fpu users is recuded

   - Add a SIMD implementation for the RAID6 gen and xor operations

   - Add RAID6 recovery based on the XC instruction

   - The PCI DMA flush logic has been improved to increase the speed of
     the map / unmap operations

   - The time synchronization code has seen some updates

  And bug fixes all over the place"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (48 commits)
  s390/con3270: fix insufficient space padding
  s390/con3270: fix use of uninitialised data
  MAINTAINERS: update DASD maintainer
  s390/cio: fix accidental interrupt enabling during resume
  s390/dasd: add missing \n to end of dev_err messages
  s390/config: Enable config options for Docker
  s390/dasd: make query host access interruptible
  s390/dasd: fix panic during offline processing
  s390/dasd: fix hanging offline processing
  s390/pci_dma: improve lazy flush for unmap
  s390/pci_dma: split dma_update_trans
  s390/pci_dma: improve map_sg
  s390/pci_dma: simplify dma address calculation
  s390/pci_dma: remove dma address range check
  iommu/s390: simplify registration of I/O address translation parameters
  s390: migrate exception table users off module.h and onto extable.h
  s390: export header for CLP ioctl
  s390/vmur: fix irq pointer dereference in int handler
  s390/dasd: add missing KOBJ_CHANGE event for unformatted devices
  s390: enable UBSAN
  ...
2016-10-04 14:05:52 -07:00
Linus Torvalds 597f03f9d1 Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull CPU hotplug updates from Thomas Gleixner:
 "Yet another batch of cpu hotplug core updates and conversions:

   - Provide core infrastructure for multi instance drivers so the
     drivers do not have to keep custom lists.

   - Convert custom lists to the new infrastructure. The block-mq custom
     list conversion comes through the block tree and makes the diffstat
     tip over to more lines removed than added.

   - Handle unbalanced hotplug enable/disable calls more gracefully.

   - Remove the obsolete CPU_STARTING/DYING notifier support.

   - Convert another batch of notifier users.

   The relayfs changes which conflicted with the conversion have been
   shipped to me by Andrew.

   The remaining lot is targeted for 4.10 so that we finally can remove
   the rest of the notifiers"

* 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (46 commits)
  cpufreq: Fix up conversion to hotplug state machine
  blk/mq: Reserve hotplug states for block multiqueue
  x86/apic/uv: Convert to hotplug state machine
  s390/mm/pfault: Convert to hotplug state machine
  mips/loongson/smp: Convert to hotplug state machine
  mips/octeon/smp: Convert to hotplug state machine
  fault-injection/cpu: Convert to hotplug state machine
  padata: Convert to hotplug state machine
  cpufreq: Convert to hotplug state machine
  ACPI/processor: Convert to hotplug state machine
  virtio scsi: Convert to hotplug state machine
  oprofile/timer: Convert to hotplug state machine
  block/softirq: Convert to hotplug state machine
  lib/irq_poll: Convert to hotplug state machine
  x86/microcode: Convert to hotplug state machine
  sh/SH-X3 SMP: Convert to hotplug state machine
  ia64/mca: Convert to hotplug state machine
  ARM/OMAP/wakeupgen: Convert to hotplug state machine
  ARM/shmobile: Convert to hotplug state machine
  arm64/FP/SIMD: Convert to hotplug state machine
  ...
2016-10-03 19:43:08 -07:00
Deepa Dinamani 078cd8279e fs: Replace CURRENT_TIME with current_time() for inode timestamps
CURRENT_TIME macro is not appropriate for filesystems as it
doesn't use the right granularity for filesystem timestamps.
Use current_time() instead.

CURRENT_TIME is also not y2038 safe.

This is also in preparation for the patch that transitions
vfs timestamps to use 64 bit time and hence make them
y2038 safe. As part of the effort current_time() will be
extended to do range checks. Hence, it is necessary for all
file system timestamps to use current_time(). Also,
current_time() will be transitioned along with vfs to be
y2038 safe.

Note that whenever a single call to current_time() is used
to change timestamps in different inodes, it is because they
share the same time granularity.

Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Felipe Balbi <balbi@kernel.org>
Acked-by: Steven Whitehouse <swhiteho@redhat.com>
Acked-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Acked-by: David Sterba <dsterba@suse.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-09-27 21:06:21 -04:00
Michael Holzheu fdcebf6f18 s390/config: Enable config options for Docker
The following config options are required/recommended for running Docker:

 Networking:

 - CONFIG_NF_NAT_MASQUERADE_IPV4=m
 - CONFIG_NF_NAT_MASQUERADE_IPV6=m
 - CONFIG_IPVLAN=m
 - CGROUP_NET_PRIO=y

 Storage drivers:

 - CONFIG_DM_THIN_PROVISIONING=m
 - CONFIG_OVERLAY_FS=m

 Scheduling:

 - CONFIG_FAIR_GROUP_SCHED=y
 - CONFIG_CFS_BANDWIDTH=y

Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-09-27 08:54:41 +02:00
Sebastian Ott 13954fd691 s390/pci_dma: improve lazy flush for unmap
Lazy unmap (defer tlb flush after unmap until dma address reuse) can
greatly reduce the number of RPCIT instructions in the best case. In
reality we are often far away from the best case scenario because our
implementation suffers from the following problem:

To create dma addresses we maintain an iommu bitmap and a pointer into
that bitmap to mark the start of the next search. That pointer moves from
the start to the end of that bitmap and we issue a global tlb flush
once that pointer wraps around. To prevent address reuse before we issue
the tlb flush we even have to move the next pointer during unmaps - when
clearing a bit > next. This could lead to a situation where we only use
the rear part of that bitmap and issue more tlb flushes than expected.

To fix this we no longer clear bits during unmap but maintain a 2nd
bitmap which we use to mark addresses that can't be reused until we issue
the global tlb flush after wrap around.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-09-22 13:42:33 +02:00
Sebastian Ott 1f166e9e5c s390/pci_dma: split dma_update_trans
Split dma_update_trans into __dma_update_trans which handles updating
the dma translation tables and __dma_purge_tlb which takes care of
purging associated entries in the dma translation lookaside buffer.

The map_sg API makes use of this split approach by calling
__dma_update_trans once per physically contiguous address range but
__dma_purge_tlb only once per dma contiguous address range.

This results in less invocations of the expensive RPCIT instruction
when using map_sg.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-09-22 13:42:33 +02:00
Sebastian Ott ee877b81c6 s390/pci_dma: improve map_sg
Our map_sg implementation mapped sg entries independently of each other.
For ease of use and possible performance improvements this patch changes
the implementation to try to map as many (likely physically non-contiguous)
sglist entries as possible into a contiguous DMA segment.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-09-22 13:42:32 +02:00
Sebastian Ott 8cb63b7879 s390/pci_dma: simplify dma address calculation
Simplify the code we use to calculate dma addresses by putting
everything related in a dma_alloc_address function. Also provide
a dma_free_address counterpart.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-09-22 13:42:31 +02:00
Sebastian Ott 3b13f1fea1 s390/pci_dma: remove dma address range check
We calculate dma addresses using an iommu bitmap. Since commit
69eea95c ("s390/pci_dma: fix DMA table corruption with > 4 TB main memory")
we've made sure that addresses created using that bitmap are below
the maximum reported by firmware. Thus the additional check for
that address to be within range can be removed.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-09-22 13:42:30 +02:00
Paul Gortmaker dcc096c540 s390: migrate exception table users off module.h and onto extable.h
These files were only including module.h for exception table
related functions.  We've now separated that content out into its
own file "extable.h" so now move over to that and avoid all the
extra header content in module.h that we don't really need to compile
these files.

The additions of uaccess.h are to deal with implict includes like:

arch/s390/kernel/traps.c: In function 'do_report_trap':
arch/s390/kernel/traps.c:56:4: error: implicit declaration of function 'extable_fixup' [-Werror=implicit-function-declaration]
arch/s390/kernel/traps.c: In function 'illegal_op':
arch/s390/kernel/traps.c:173:3: error: implicit declaration of function 'get_user' [-Werror=implicit-function-declaration]

Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: linux-s390@vger.kernel.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-09-20 14:26:38 +02:00
Sebastian Ott ecc6410abf s390: export header for CLP ioctl
Export clp.h for usage by userspace.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-09-20 14:26:35 +02:00
Christian Borntraeger c42d8c7dbe s390: enable UBSAN
This enables UBSAN for s390. We have to disable the null sanitizer
as s390 code does access memory via a null pointer (the prefix page).

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-09-20 14:26:23 +02:00
Masahiro Yamada f296190e41 s390/crashdump: use list_first_entry_or_null
The combo of list_empty() check and return list_first_entry()
can be replaced with list_first_entry_or_null().

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-09-20 14:26:04 +02:00
Christian Borntraeger 9078a54996 s390: claim efficient unaligned access
most unaligned accesses are reasonable efficient (no kernel emulation)
on s390, let's announce it

This also
- removes the ubsan false positives for unaligned accesses on s390 with
  default config
- uses simpler arithmetic in several functions in several other areas
  of the kernel like ethernet frame classification

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-09-20 14:26:01 +02:00
Sebastian Andrzej Siewior 84c9ceefec s390/mm/pfault: Convert to hotplug state machine
Install the callbacks via the state machine.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: linux-s390@vger.kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: rt@linutronix.de
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20160906170457.32393-18-bigeasy@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-19 21:44:32 +02:00
Luiz Capitulino 235539b48a kvm: add stubs for arch specific debugfs support
Two stubs are added:

 o kvm_arch_has_vcpu_debugfs(): must return true if the arch
   supports creating debugfs entries in the vcpu debugfs dir
   (which will be implemented by the next commit)

 o kvm_arch_create_vcpu_debugfs(): code that creates debugfs
   entries in the vcpu debugfs dir

For x86, this commit introduces a new file to avoid growing
arch/x86/kvm/x86.c even more.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-16 16:57:47 +02:00
Ingo Molnar d4b80afbba Merge branch 'linus' into x86/asm, to pick up recent fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-15 08:24:53 +02:00
Linus Torvalds 77e5bdf9f7 Merge branch 'uaccess-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull uaccess fixes from Al Viro:
 "Fixes for broken uaccess primitives - mostly lack of proper zeroing
  in copy_from_user()/get_user()/__get_user(), but for several
  architectures there's more (broken clear_user() on frv and
  strncpy_from_user() on hexagon)"

* 'uaccess-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (28 commits)
  avr32: fix copy_from_user()
  microblaze: fix __get_user()
  microblaze: fix copy_from_user()
  m32r: fix __get_user()
  blackfin: fix copy_from_user()
  sparc32: fix copy_from_user()
  sh: fix copy_from_user()
  sh64: failing __get_user() should zero
  score: fix copy_from_user() and friends
  score: fix __get_user/get_user
  s390: get_user() should zero on failure
  ppc32: fix copy_from_user()
  parisc: fix copy_from_user()
  openrisc: fix copy_from_user()
  nios2: fix __get_user()
  nios2: copy_from_user() should zero the tail of destination
  mn10300: copy_from_user() should zero on access_ok() failure...
  mn10300: failing __get_user() and get_user() should zero
  mips: copy_from_user() must zero the destination on access_ok() failure
  ARC: uaccess: get_user to zero out dest in cause of fault
  ...
2016-09-14 09:35:05 -07:00
Al Viro fd2d2b191f s390: get_user() should zero on failure
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-09-13 17:50:13 -04:00
Paolo Bonzini ad53e35ae5 Merge branch 'kvm-ppc-next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into HEAD
Paul Mackerras writes:

    The highlights are:

    * Reduced latency for interrupts from PCI pass-through devices, from
      Suresh Warrier and me.
    * Halt-polling implementation from Suraj Jitindar Singh.
    * 64-bit VCPU statistics, also from Suraj.
    * Various other minor fixes and improvements.
2016-09-13 15:20:55 +02:00
Linus Torvalds ac059c4fa7 * s390: nested virt fixes (new 4.8 feature)
* x86: fixes for 4.8 regressions
 * ARM: two small bugfixes
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJX1xaBAAoJEL/70l94x66Db/8H/AuKWs6QbvUNnA+tWITdmFbi
 ah7R2BoRVak4h6UGYC4OsY9BTuWVeaCx2+8yLIqSdfWoYUJzwiGFPwkuUK/hVSkK
 /9gZO8SsREAm/1gVck2X2vzATYbfCxKcRsSq0/ocmIQEpRYWKGoJWS6zeiwLZ9kq
 H/KsAvsGc83VdlPXrhLVQa7js/Tl/M5JlBEbm6i8nIsvhoqZkES4rgwHKBtkxTyU
 8oepfNQnYTb2hZhJE0aW+9z3V0O+NKbGW75bKOzrbJBoEUGeHuPpLnP0mZIyEGPU
 grQkfuWMmfEaWrGahNA9ARpsNcy4Zp0BF5mgJ03OAmgfY+KmJroVk95IvcP9jF4=
 =+uuA
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
 - s390: nested virt fixes (new 4.8 feature)
 - x86: fixes for 4.8 regressions
 - ARM: two small bugfixes

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  kvm-arm: Unmap shadow pagetables properly
  x86, clock: Fix kvm guest tsc initialization
  arm: KVM: Fix idmap overlap detection when the kernel is idmap'ed
  KVM: lapic: adjust preemption timer correctly when goes TSC backward
  KVM: s390: vsie: fix riccbd
  KVM: s390: don't use current->thread.fpu.* when accessing registers
2016-09-12 14:30:14 -07:00
Christian Borntraeger b0eb91ae63 Merge remote-tracking branch 'kvms390/s390forkvm' into kvms390next 2016-09-08 13:41:08 +02:00
Markus Elfring 0624a8eb82 KVM: s390: Use memdup_user() rather than duplicating code
* Reuse existing functionality from memdup_user() instead of keeping
  duplicate source code.

  This issue was detected by using the Coccinelle software.

* Return directly if this copy operation failed.

Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Message-Id: <c86f7520-885e-2829-ae9c-b81caa898e84@users.sourceforge.net>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-09-08 13:40:55 +02:00
Markus Elfring a1708a2ead KVM: s390: Improve determination of sizes in kvm_s390_import_bp_data()
* A multiplication for the size determination of a memory allocation
  indicated that an array data structure should be processed.
  Thus reuse the corresponding function "kmalloc_array".

  Suggested-by: Paolo Bonzini <pbonzini@redhat.com>

  This issue was detected also by using the Coccinelle software.

* Replace the specification of data structures by pointer dereferences
  to make the corresponding size determination a bit safer according to
  the Linux coding style convention.

* Delete the local variable "size" which became unnecessary with
  this refactoring.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Message-Id: <c3323f6b-4af2-0bfb-9399-e529952e378e@users.sourceforge.net>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-09-08 13:40:54 +02:00
David Hildenbrand a6940674c3 KVM: s390: allow 255 VCPUs when sca entries aren't used
If the SCA entries aren't used by the hardware (no SIGPIF), we
can simply not set the entries, stick to the basic sca and allow more
than 64 VCPUs.

To hinder any other facility from using these entries, let's properly
provoke intercepts by not setting the MCN and keeping the entries
unset.

This effectively allows when running KVM under KVM (vSIE) or under z/VM to
provide more than 64 VCPUs to a guest. Let's limit it to 255 for now, to
not run into problems if the CPU numbers are limited somewhere else.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-09-08 13:40:53 +02:00
Fan Zhang 80cd876338 KVM: s390: lazy enable RI
Only enable runtime instrumentation if the guest issues an RI related
instruction or if userspace changes the riccb to a valid state.
This makes entry/exit a tiny bit faster.

Initial patch by Christian Borntraeger
Signed-off-by: Fan Zhang <zhangfan@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-09-08 13:40:39 +02:00
Janosch Frank c14b88d766 KVM: s390: gaccess: simplify translation exception handling
The payload data for protection exceptions is a superset of the
payload of other translation exceptions. Let's set the additional
flags and use a fall through to minimize code duplication.

Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-09-08 09:07:53 +02:00
David Hildenbrand b1ffffbd0f KVM: s390: guestdbg: separate defines for per code
Let's avoid working with the PER_EVENT* defines, used for control register
manipulation, when checking the u8 PER code. Introduce separate defines
based on the existing defines.

Reviewed-by: Eric Farman <farman@linux.vnet.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-09-08 09:07:52 +02:00
David Hildenbrand 8953fb08ab KVM: s390: write external damage code on machine checks
Let's also write the external damage code already provided by
struct kvm_s390_mchk_info.

Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-09-08 09:07:52 +02:00
David Hildenbrand ff5dc1492a KVM: s390: fix delivery of vector regs during machine checks
Vector registers are only to be stored if the facility is available
and if the guest has set up the machine check extended save area.

If anything goes wrong while writing the vector registers, the vector
registers are to be marked as invalid. Please note that we are allowed
to write the registers although they are marked as invalid.

Machine checks and "store status" SIGP orders are two different concepts,
let's correctly separate these. As the SIGP part is completely handled in
user space, we can drop it.

This patch is based on a patch from Cornelia Huck.

Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-09-08 09:07:52 +02:00
David Hildenbrand 0319dae677 KVM: s390: split store status and machine check handling
Store status writes the prefix which is not to be done by a machine check.
Also, the psw is stored and later on overwritten by the failing-storage
address, which looks strange at first sight.

Store status and machine check handling look similar, but they are actually
two different things.

Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-09-08 09:07:51 +02:00
David Hildenbrand d6404ded30 KVM: s390: factor out actual delivery of machine checks
Let's factor this out to prepare for bigger changes. Reorder to calls to
match the logical order given in the PoP.

Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-09-08 09:07:51 +02:00
Suraj Jitindar Singh 8a7e75d47b KVM: Add provisioning for ulong vm stats and u64 vcpu stats
vms and vcpus have statistics associated with them which can be viewed
within the debugfs. Currently it is assumed within the vcpu_stat_get() and
vm_stat_get() functions that all of these statistics are represented as
u32s, however the next patch adds some u64 vcpu statistics.

Change all vcpu statistics to u64 and modify vcpu_stat_get() accordingly.
Since vcpu statistics are per vcpu, they will only be updated by a single
vcpu at a time so this shouldn't present a problem on 32-bit machines
which can't atomically increment 64-bit numbers. However vm statistics
could potentially be updated by multiple vcpus from that vm at a time.
To avoid the overhead of atomics make all vm statistics ulong such that
they are 64-bit on 64-bit systems where they can be atomically incremented
and are 32-bit on 32-bit systems which may not be able to atomically
increment 64-bit numbers. Modify vm_stat_get() to expect ulongs.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Reviewed-by: David Matlack <dmatlack@google.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2016-09-08 12:25:37 +10:00
Martin Schwidefsky c783b91ebd s390: add assembler include path for vx-insn.h
With git commit 0eab11c7e0
"s390/vx: allow to include vx-insn.h with .include"
and an older gcc we get errors like this:

{standard input}:6: Error: can't open asm/vx-insn.h for reading:
No such file or directory
arch/s390/kernel/fpu.c:57: Error: Unrecognized opcode: `vstm'

To solve this issue simply add the path to arch/s390/include to
all assembler runs.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-09-06 11:00:04 +02:00
Colin Ian King 6512391a30 s390/crypto: avoid returning garbage value
Static analysis with cppcheck detected that ret is not initialized
and hence garbage is potentially being returned in the case where
prng_data->ppnows.reseed_counter <= prng_reseed_limit.

Thanks to Martin Schwidefsky for spotting a mistake in my original
fix.

Fixes: 0177db01ad ("s390/crypto: simplify return code handling")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-09-06 11:00:00 +02:00
Bhaktipriya Shridhar e68f1d4ca9 s390: Remove deprecated create_singlethread_workqueue
The workqueue "appldata_wq" has been replaced with an ordered dedicated
workqueue.

WQ_MEM_RECLAIM has not been set since the workqueue is not being used on
a memory reclaim path.

The adapter->work_queue queues multiple work items viz
&adapter->scan_work, &port->rport_work, &adapter->ns_up_work,
&adapter->stat_work, adapter->work_queue, &adapter->events.work,
&port->gid_pn_work, &port->test_link_work. Hence, an ordered
dedicated workqueue has been used.

WQ_MEM_RECLAIM has been set to ensure forward progress under memory
pressure.

Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-09-06 10:59:57 +02:00
David Hildenbrand 4d21cef3ea KVM: s390: vsie: fix riccbd
We store the address of riccbd at the wrong location, overwriting
gvrd. This means that our nested guest will not be able to use runtime
instrumentation. Also, a memory leak, if our KVM guest actually sets gvrd.

Not noticed until now, as KVM guests never make use of gvrd and runtime
instrumentation wasn't completely tested yet.

Reported-by: Fan Zhang <zhangfan@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-09-05 13:48:50 +02:00
Josh Poimboeuf 0d025d271e mm/usercopy: get rid of CONFIG_DEBUG_STRICT_USER_COPY_CHECKS
There are three usercopy warnings which are currently being silenced for
gcc 4.6 and newer:

1) "copy_from_user() buffer size is too small" compile warning/error

   This is a static warning which happens when object size and copy size
   are both const, and copy size > object size.  I didn't see any false
   positives for this one.  So the function warning attribute seems to
   be working fine here.

   Note this scenario is always a bug and so I think it should be
   changed to *always* be an error, regardless of
   CONFIG_DEBUG_STRICT_USER_COPY_CHECKS.

2) "copy_from_user() buffer size is not provably correct" compile warning

   This is another static warning which happens when I enable
   __compiletime_object_size() for new compilers (and
   CONFIG_DEBUG_STRICT_USER_COPY_CHECKS).  It happens when object size
   is const, but copy size is *not*.  In this case there's no way to
   compare the two at build time, so it gives the warning.  (Note the
   warning is a byproduct of the fact that gcc has no way of knowing
   whether the overflow function will be called, so the call isn't dead
   code and the warning attribute is activated.)

   So this warning seems to only indicate "this is an unusual pattern,
   maybe you should check it out" rather than "this is a bug".

   I get 102(!) of these warnings with allyesconfig and the
   __compiletime_object_size() gcc check removed.  I don't know if there
   are any real bugs hiding in there, but from looking at a small
   sample, I didn't see any.  According to Kees, it does sometimes find
   real bugs.  But the false positive rate seems high.

3) "Buffer overflow detected" runtime warning

   This is a runtime warning where object size is const, and copy size >
   object size.

All three warnings (both static and runtime) were completely disabled
for gcc 4.6 with the following commit:

  2fb0815c9e ("gcc4: disable __compiletime_object_size for GCC 4.6+")

That commit mistakenly assumed that the false positives were caused by a
gcc bug in __compiletime_object_size().  But in fact,
__compiletime_object_size() seems to be working fine.  The false
positives were instead triggered by #2 above.  (Though I don't have an
explanation for why the warnings supposedly only started showing up in
gcc 4.6.)

So remove warning #2 to get rid of all the false positives, and re-enable
warnings #1 and #3 by reverting the above commit.

Furthermore, since #1 is a real bug which is detected at compile time,
upgrade it to always be an error.

Having done all that, CONFIG_DEBUG_STRICT_USER_COPY_CHECKS is no longer
needed.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-30 10:10:21 -07:00
Paolo Bonzini 20b8f9e2dd KVM: s390: Fix for fpu register errors since 4.7
This fixes a regression that was introduced by a semantic
 change in commit 3f6813b9a5 ("s390/fpu: allocate 'struct
 fpu' with the task_struct"). Symptoms are broken host userspace
 fpu registers if the old FPU set/get ioctls are used.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.14 (GNU/Linux)
 
 iQIcBAABAgAGBQJXvxNAAAoJEBF7vIC1phx8LdsQAIrzq8fRarrNhOX595ur1Js/
 ygB4TkwismmCtSIuqqBOkxlTS2HyED4o3C3i38x4wYiQQtGtGaS0nZFZ8VF064Po
 c9//lGMtJXTp+xJDodWTQPgRC/wdaoEOEdKwfm0svd8DmqivRmgAFZ3o+olZVBDL
 QlE9JBDk04+3z1eOLD4vdO0F7z4R/U0DJNdDQnPKGakJETO6Y6kAublpA1p27UHC
 dktpQOle2dEYt9EpBiTmMFIOXkMt0ilaS5SwTjydRlXvrpCV3edtJSlvvJm7/KRc
 SMdLUqQjeR7Pcaxg13p6wm+lj6Isq6bsB+fylnMd94rNia+OgVgETzGPMXCqE5HY
 8DeWL5m4PN73Gx2HO+MDIYmYFtlwkoTMOLcVyk6yXeoM1cUHDBhNmLw2RzZTx/rZ
 ArqO8qk20zbggdvEhEAL7npsWVBbX2BsLRfw/ALxVh+EP4IjMsj609Ll78oieay5
 fE6idPHySlhZPqkTo6qu5spYGwsUanTzNGrAxYGEBJwt8lET80vqnqM6Oq2CSvHL
 8zqsORBWzOJyJUyq174z63s7Pi8qGQEMmYE0qN8hIqCNK0YMMYPKCenL/SHegcr0
 ohWyvDxbkgeW0MwsF75wuJQ6mV/nnYcZHonrkHSWcdXn1QWuJiK5lffYgJmcO/JS
 7ppOE1YBoHVbN131h0R2
 =S2yL
 -----END PGP SIGNATURE-----

Merge tag 'kvm-s390-master-4.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-master

KVM: s390: Fix for fpu register errors since 4.7

This fixes a regression that was introduced by a semantic
change in commit 3f6813b9a5 ("s390/fpu: allocate 'struct
fpu' with the task_struct"). Symptoms are broken host userspace
fpu registers if the old FPU set/get ioctls are used.
2016-08-30 14:11:33 +02:00
Martin Schwidefsky 7bac4f5b8e s390/crypto: simplify CPACF encryption / decryption functions
The double while loops of the CTR mode encryption / decryption functions
are overly complex for little gain. Simplify the functions to a single
while loop at the cost of an additional memcpy of a few bytes for every
4K page worth of data.
Adapt the other crypto functions to make them all look alike.

Reviewed-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-08-29 11:05:11 +02:00
Martin Schwidefsky 69c0e360f9 s390/crypto: cpacf function detection
The CPACF code makes some assumptions about the availablity of hardware
support. E.g. if the machine supports KM(AES-256) without chaining it is
assumed that KMC(AES-256) with chaining is available as well. For the
existing CPUs this is true but the architecturally correct way is to
check each CPACF functions on its own. This is what the query function
of each instructions is all about.

Reviewed-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-08-29 11:05:09 +02:00
Martin Schwidefsky d863d5945f s390/crypto: simplify init / exit functions
The aes and the des module register multiple crypto algorithms
dependent on the availability of specific CPACF instructions.
To simplify the deregistration with crypto_unregister_alg add
an array with pointers to the successfully registered algorithms
and use it for the error handling in the init function and in
the module exit function.

Reviewed-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-08-29 11:05:08 +02:00
Martin Schwidefsky 0177db01ad s390/crypto: simplify return code handling
The CPACF instructions can complete with three different condition codes:
CC=0 for successful completion, CC=1 if the protected key verification
failed, and CC=3 for partial completion.

The inline functions will restart the CPACF instruction for partial
completion, this removes the CC=3 case. The CC=1 case is only relevant
for the protected key functions of the KM, KMC, KMAC and KMCTR
instructions. As the protected key functions are not used by the
current code, there is no need for any kind of return code handling.

Reviewed-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-08-29 11:05:07 +02:00
Martin Schwidefsky edc63a3785 s390/crypto: cleanup cpacf function codes
Use a separate define for the decryption modifier bit instead of
duplicating the function codes for encryption / decrypton.
In addition use an unsigned type for the function code.

Reviewed-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-08-29 11:05:05 +02:00
Martin Schwidefsky 474fd6e80f RAID/s390: add SIMD implementation for raid6 gen/xor
Using vector registers is slightly faster:

raid6: vx128x8  gen() 19705 MB/s
raid6: vx128x8  xor() 11886 MB/s
raid6: using algorithm vx128x8 gen() 19705 MB/s
raid6: .... xor() 11886 MB/s, rmw enabled

vs the software algorithms:

raid6: int64x1  gen()  3018 MB/s
raid6: int64x1  xor()  1429 MB/s
raid6: int64x2  gen()  4661 MB/s
raid6: int64x2  xor()  3143 MB/s
raid6: int64x4  gen()  5392 MB/s
raid6: int64x4  xor()  3509 MB/s
raid6: int64x8  gen()  4441 MB/s
raid6: int64x8  xor()  3207 MB/s
raid6: using algorithm int64x4 gen() 5392 MB/s
raid6: .... xor() 3509 MB/s, rmw enabled

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-08-29 11:05:04 +02:00
Martin Schwidefsky 8f149ea6e9 s390/nmi: improve revalidation of fpu / vector registers
The machine check handler will do one of two things if the floating-point
control, a floating point register or a vector register can not be
revalidated:
1) if the PSW indicates user mode the process is terminated
2) if the PSW indicates kernel mode the system is stopped

To unconditionally stop the system for 2) is incorrect.

There are three possible outcomes if the floating-point control, a
floating point register or a vector registers can not be revalidated:
1) The kernel is inside a kernel_fpu_begin/kernel_fpu_end block and
   needs the register. The system is stopped.
2) No active kernel_fpu_begin/kernel_fpu_end block and the CIF_CPU bit
   is not set. The user space process needs the register and is killed.
3) No active kernel_fpu_begin/kernel_fpu_end block and the CIF_FPU bit
   is set. Neither the kernel nor the user space process needs the
   lost register. Just revalidate it and continue.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-08-29 11:05:03 +02:00
Martin Schwidefsky 7f79695cc1 s390/fpu: improve kernel_fpu_[begin|end]
In case of nested user of the FPU or vector registers in the kernel
the current code uses the mask of the FPU/vector registers of the
previous contexts to decide which registers to save and restore.
E.g. if the previous context used KERNEL_VXR_V0V7 and the next
context wants to use KERNEL_VXR_V24V31 the first 8 vector registers
are stored to the FPU state structure. But this is not necessary
as the next context does not use these registers.

Rework the FPU/vector register save and restore code. The new code
does a few things differently:
1) A lowcore field is used instead of a per-cpu variable.
2) The kernel_fpu_end function now has two parameters just like
   kernel_fpu_begin. The register flags are required by both
   functions to save / restore the minimal register set.
3) The inline functions kernel_fpu_begin/kernel_fpu_end now do the
   update of the register masks. If the user space FPU registers
   have already been stored neither save_fpu_regs nor the
   __kernel_fpu_begin/__kernel_fpu_end functions have to be called
   for the first context. In this case kernel_fpu_begin adds 7
   instructions and kernel_fpu_end adds 4 instructions.
3) The inline assemblies in __kernel_fpu_begin / __kernel_fpu_end
   to save / restore the vector registers are simplified a bit.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-08-29 11:05:01 +02:00
Martin Schwidefsky 0eab11c7e0 s390/vx: allow to include vx-insn.h with .include
To make the vx-insn.h more versatile avoid cpp preprocessor macros
and allow to use plain numbers for vector and general purpose register
operands. With that you can emit an .include from a C file into the
assembler text and then use the vx-insn macros in inline assemblies.

For example:

asm (".include \"asm/vx-insn.h\"");

static inline void xor_vec(int x, int y, int z)
{
	asm volatile("VX %0,%1,%2"
		     : : "i" (x), "i" (y), "i" (z));
}

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-08-29 11:04:59 +02:00
David Hildenbrand 67f03de5f0 s390/time: avoid races when updating tb_update_count
The increment might not be atomic and we're not holding the
timekeeper_lock. Therefore we might lose an update to count, resulting in
VDSO being trapped in a loop. As other archs also simply update the
values and count doesn't seem to have an impact on reloading of these
values in VDSO code, let's just remove the update of tb_update_count.

Suggested-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-08-29 11:04:58 +02:00
David Hildenbrand 0c00b1e00b s390/time: fixup the clock comparator on all cpus
By leaving fixup_cc unset, only the clock comparator of the cpu actually
doing the sync is fixed up until now.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-08-29 11:04:56 +02:00
David Hildenbrand ca64f63901 s390/time: cleanup etr leftovers
There are still some etr leftovers and wrong comments, let's clean that up.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-08-29 11:04:54 +02:00
David Hildenbrand 41ad022039 s390/time: simplify stp time syncs
The way we call do_adjtimex() today is broken. It has 0 effect, as
ADJ_OFFSET_SINGLESHOT (0x0001) in the kernel maps to !ADJ_ADJTIME
(in contrast to user space where it maps to  ADJ_OFFSET_SINGLESHOT |
ADJ_ADJTIME - 0x8001). !ADJ_ADJTIME will silently ignore all adjustments
without STA_PLL being active. We could switch to ADJ_ADJTIME or turn
STA_PLL on, but still we would run into some problems:

- Even when switching to nanoseconds, we lose accuracy.
- Successive calls to do_adjtimex() will simply overwrite any leftovers
  from the previous call (if not fully handled)
- Anything that NTP does using the sysctl heavily interferes with our
  use.
- !ADJ_ADJTIME will silently round stuff > or < than 0.5 seconds

Reusing do_adjtimex() here just feels wrong. The whole STP synchronization
works right now *somehow* only, as do_adjtimex() does nothing and our
TOD clock jumps in time, although it shouldn't. This is especially bad
as the clock could jump backwards in time. We will have to find another
way to fix this up.

As leap seconds are also not properly handled yet, let's just get rid of
all this complex logic altogether and use the correct clock_delta for
fixing up the clock comparator and keeping the sched_clock monotonic.

This change should have 0 effect on the current STP mechanism. Once we
know how to best handle sync events and leap second updates, we'll start
with a fresh implementation.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-08-29 11:04:52 +02:00
Masahiro Yamada a5ff1b34e1 treewide: replace config_enabled() with IS_ENABLED() (2nd round)
Commit 97f2645f35 ("tree-wide: replace config_enabled() with
IS_ENABLED()") mostly killed config_enabled(), but some new users have
appeared for v4.8-rc1.  They are all used for a boolean option, so can
be replaced with IS_ENABLED() safely.

Link: http://lkml.kernel.org/r/1471970749-24867-1-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-26 17:39:35 -07:00
Martin Schwidefsky 88bf46bfde Merge branch 's390forkvm' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux
Pull facility mask patch from the KVM tree.

* tag 's390forkvm' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux
  KVM: s390: generate facility mask from readable list
2016-08-26 16:34:19 +02:00
Heiko Carstens f6c1d359be KVM: s390: generate facility mask from readable list
Automatically generate the KVM facility mask out of a readable list.
Manually changing the masks is very error prone, especially if the
special IBM bit numbering has to be considered.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-08-25 22:47:03 +02:00
David Hildenbrand a7d4b8f256 KVM: s390: don't use current->thread.fpu.* when accessing registers
As the meaning of these variables and pointers seems to change more
frequently, let's directly access our save area, instead of going via
current->thread.

Right now, this is broken for set/get_fpu. They simply overwrite the
host registers, as the pointers to the current save area were turned
into the static host save area.

Cc: stable@vger.kernel.org # 4.7
Fixes: 3f6813b9a5 ("s390/fpu: allocate 'struct fpu' with the task_struct")
Reported-by: Hao QingFeng <haoqf@linux.vnet.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-08-25 17:33:24 +02:00
Josh Poimboeuf 9a7c348ba6 ftrace: Add return address pointer to ftrace_ret_stack
Storing this value will help prevent unwinders from getting out of sync
with the function graph tracer ret_stack.  Now instead of needing a
stateful iterator, they can compare the return address pointer to find
the right ret_stack entry.

Note that an array of 50 ftrace_ret_stack structs is allocated for every
task.  So when an arch implements this, it will add either 200 or 400
bytes of memory usage per task (depending on whether it's a 32-bit or
64-bit platform).

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/a95cfcc39e8f26b89a430c56926af0bb217bc0a1.1471607358.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-24 12:15:14 +02:00
Martin Schwidefsky bd3a172557 s390/pci: add zpci_report_error interface
The 'report_error' interface for PCI devices found on s390 can be
used by a user space program to inject an adapter error notification.
Add a new kernel interface zpci_report_error to allow a PCI device
driver to inject these error notifications without a detour over
user space.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-08-24 09:23:56 +02:00
Martin Schwidefsky 47e4d851c5 s390/mm: merge local / non-local IDTE helper
Merge the __p[m|u]xdp_idte and __p[m|u]dp_idte_local functions into a
single __p[m|u]dp_idte function with an additional parameter.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-08-24 09:23:56 +02:00
Martin Schwidefsky 34eeaf376d s390/mm: merge local / non-local IPTE helper
Merge the __ptep_ipte and __ptep_ipte_local functions into a single
__ptep_ipte function with an additional parameter. The __pte_ipte_range
function is still extra as the while loops makes it hard to merge.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-08-24 09:23:55 +02:00
Martin Schwidefsky 44b6cc8130 s390/mm,kvm: flush gmap address space with IDTE
The __tlb_flush_mm() helper uses a global flush if the mm struct
has a gmap structure attached to it. Replace the global flush with
two individual flushes by means of the IDTE instruction if only a
single gmap is attached the the mm.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-08-24 09:23:55 +02:00
Martin Schwidefsky d5dcafee5f s390/mm: no local TLB flush for clearing-by-ASCE IDTE
The local-clearing control of the IDTE instruction does not have any effect
for the clearing-by-ASCE operation. Only the invalidation-and-clearing
operation respects the local-clearing bit.

Remove __tlb_flush_idte_local and simplify the batched TLB flushing code.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-08-24 09:23:54 +02:00
Linus Torvalds 45b6ae761e Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Martin Schwidefsky:
 "A couple of bug fixes, minor cleanup and a change to the default
  config"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/dasd: fix failing CUIR assignment under LPAR
  s390/pageattr: handle numpages parameter correctly
  s390/dasd: fix hanging device after clear subchannel
  s390/qdio: avoid reschedule of outbound tasklet once killed
  s390/qdio: remove checks for ccw device internal state
  s390/qdio: fix double return code evaluation
  s390/qdio: get rid of spin_lock_irqsave usage
  s390/cio: remove subchannel_id from ccw_device_private
  s390/qdio: obtain subchannel_id via ccw_device_get_schid()
  s390/cio: stop using subchannel_id from ccw_device_private
  s390/config: make the vector optimized crc function builtin
  s390/lib: fix memcmp and strstr
  s390/crc32-vx: Fix checksum calculation for small sizes
  s390: clarify compressed image code path
2016-08-16 15:50:22 -07:00
Linus Torvalds 329f415291 KVM locks kvm_device list to prevent corruption on device creation.
PPC splits debugfs initialization from creation of the xics device to
 unlock the newly taken kvm lock earlier.
 
 s390 prevents userspace from triggering two WARN_ON_ONCE.
 
 MIPS fixes several issues in the management of TLB faults (Cc: stable).
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABCAAGBQJXrx2ZAAoJEED/6hsPKofoo/4H/jra5NNxvpo09LWlXTwGXxBH
 cwcfDZSiOFxgvWztKJOIjPI4ETL3mnZvb9SFWBZZh1U0kfZ/TGiWouwaDNlBkPYj
 I3YHuPI7if+yUOmJlI3N2hWa0Wo0qiMqIjKT0pQVSLLdK/CVE+xGyS+qtXTNXHQn
 pFdKlYr//7OwQEY0ow1yj5VnsFrXB1JWFyB/+N5zaCfbCaQVyZAL7rj8SUbC/32W
 CiNhrvatzierKIfPerWw8DvvBKhCgWaRuLl0W+uMncrC9Qepcx9moM2beD1txK2I
 iHor1TDxUPifGQONfWMAlw87FluzHF4vQ5nN2jyTi8TT+CEfZpZ43Q+DY7okD4w=
 =NQP9
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Radim Krčmář:
 "KVM:
   - lock kvm_device list to prevent corruption on device creation.

  PPC:
   - split debugfs initialization from creation of the xics device to
     unlock the newly taken kvm lock earlier.

  s390:
   - prevent userspace from triggering two WARN_ON_ONCE.

  MIPS:
   - fix several issues in the management of TLB faults (Cc: stable)"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  MIPS: KVM: Propagate kseg0/mapped tlb fault errors
  MIPS: KVM: Fix gfn range check in kseg0 tlb faults
  MIPS: KVM: Add missing gfn range check
  MIPS: KVM: Fix mapped fault broken commpage handling
  KVM: Protect device ops->create and list_add with kvm->lock
  KVM: PPC: Move xics_debugfs_init out of create
  KVM: s390: reset KVM_REQ_MMU_RELOAD if mapping the prefix failed
  KVM: s390: set the prefix initially properly
2016-08-13 10:11:14 -07:00
Julius Niedworok aca411a4b1 KVM: s390: reset KVM_REQ_MMU_RELOAD if mapping the prefix failed
When triggering KVM_RUN without a user memory region being mapped
(KVM_SET_USER_MEMORY_REGION) a validity intercept occurs. This could
happen, if the user memory region was not mapped initially or if it
was unmapped after the vcpu is initialized. The function
kvm_s390_handle_requests checks for the KVM_REQ_MMU_RELOAD bit. The
check function always clears this bit. If gmap_mprotect_notify
returns an error code, the mapping failed, but the KVM_REQ_MMU_RELOAD
was not set anymore. So the next time kvm_s390_handle_requests is
called, the execution would fall trough the check for
KVM_REQ_MMU_RELOAD. The bit needs to be resetted, if
gmap_mprotect_notify returns an error code. Resetting the bit with
kvm_make_request(KVM_REQ_MMU_RELOAD, vcpu) fixes the bug.

Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Julius Niedworok <jniedwor@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-08-12 09:11:08 +02:00
Julius Niedworok 75a4615c95 KVM: s390: set the prefix initially properly
When KVM_RUN is triggered on a VCPU without an initial reset, a
validity intercept occurs.
Setting the prefix will set the KVM_REQ_MMU_RELOAD bit initially,
thus preventing the bug.

Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Julius Niedworok <jniedwor@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-08-12 09:10:17 +02:00
Linus Torvalds 6da7e95326 virtio/vhost: fixes and cleanups for 4.8
- Misc fixes and cleanups all over the place.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJXq0ruAAoJECgfDbjSjVRp5P8H/2OlDJdSS1l+TwOXbY95ntQ1
 vxUX4vGCX5IujC+Rbt7sQV2prE3b6IktFNagpbRoWn21JkpoDMvPtYJrn5BhLtoh
 fvDkZE6Wo3QztFSjaUBZWEABBt03KPX0yrAIZplu8ne/Z8KAT3zK57BPnKfmxwv+
 dpxt+1wlnqAvYsoUUQZBFT4Gmk2oDiTofiIbQq7W9W/fooznLtLB+ArYtdfNJizC
 JnI/vJuWceEXfjT26HexCRhA2OZskrA4ZadDhOjAqkTPN5DHfweLDuHh7IsVfDd1
 wXqjc4ks3cYG0CloJ2qY2K7RpDOFIxIizixeDIuAbn9aX4sPOYYfqRm+4iRwmqQ=
 =9aUO
 -----END PGP SIGNATURE-----

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio/vhost fixes and cleanups from Michael Tsirkin:
 "Misc fixes and cleanups all over the place"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  virtio/s390: deprecate old transport
  virtio/s390: keep early_put_chars
  virtio_blk: Fix a slient kernel panic
  virtio-vsock: fix include guard typo
  vhost/vsock: fix vhost virtio_vsock_pkt use-after-free
  9p/trans_virtio: use kvfree() for iov_iter_get_pages_alloc()
  virtio: fix error handling for debug builds
  virtio: fix memory leak in virtqueue_add()
2016-08-11 14:10:23 -07:00
Heiko Carstens 4d81aaa53c s390/pageattr: handle numpages parameter correctly
Both set_memory_ro() and set_memory_rw() will modify the page
attributes of at least one page, even if the numpages parameter is
zero.

The author expected that calling these functions with numpages == zero
would never happen. However with the new 444d13ff10 ("modules: add
ro_after_init support") feature this happens frequently.

Therefore do the right thing and make these two functions return
gracefully if nothing should be done.

Fixes crashes on module load like this one:

Unable to handle kernel pointer dereference in virtual kernel address space
Failing address: 000003ff80008000 TEID: 000003ff80008407
Fault in home space mode while using kernel ASCE.
AS:0000000000d18007 R3:00000001e6aa4007 S:00000001e6a10800 P:00000001e34ee21d
Oops: 0004 ilc:3 [#1] SMP
Modules linked in: x_tables
CPU: 10 PID: 1 Comm: systemd Not tainted 4.7.0-11895-g3fa9045 #4
Hardware name: IBM              2964 N96              703              (LPAR)
task: 00000001e9118000 task.stack: 00000001e9120000
Krnl PSW : 0704e00180000000 00000000005677f8 (rb_erase+0xf0/0x4d0)
           R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
Krnl GPRS: 000003ff80008b20 000003ff80008b20 000003ff80008b70 0000000000b9d608
           000003ff80008b20 0000000000000000 00000001e9123e88 000003ff80008950
           00000001e485ab40 000003ff00000000 000003ff80008b00 00000001e4858480
           0000000100000000 000003ff80008b68 00000000001d5998 00000001e9123c28
Krnl Code: 00000000005677e8: ec1801c3007c        cgij    %r1,0,8,567b6e
           00000000005677ee: e32010100020        cg      %r2,16(%r1)
          #00000000005677f4: a78401c2            brc     8,567b78
          >00000000005677f8: e35010080024        stg     %r5,8(%r1)
           00000000005677fe: ec5801af007c        cgij    %r5,0,8,567b5c
           0000000000567804: e30050000024        stg     %r0,0(%r5)
           000000000056780a: ebacf0680004        lmg     %r10,%r12,104(%r15)
           0000000000567810: 07fe                bcr     15,%r14
Call Trace:
([<000003ff80008900>] __this_module+0x0/0xffffffffffffd700 [x_tables])
([<0000000000264fd4>] do_init_module+0x12c/0x220)
([<00000000001da14a>] load_module+0x24e2/0x2b10)
([<00000000001da976>] SyS_finit_module+0xbe/0xd8)
([<0000000000803b26>] system_call+0xd6/0x264)
Last Breaking-Event-Address:
 [<000000000056771a>] rb_erase+0x12/0x4d0
 Kernel panic - not syncing: Fatal exception: panic_on_oops

Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reported-and-tested-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Fixes: e8a97e42dc ("s390/pageattr: allow kernel page table splitting")
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-08-10 10:12:19 +02:00
Cornelia Huck 3b2fbb3f06 virtio/s390: deprecate old transport
There only ever have been two host implementations of the old
s390-virtio (pre-ccw) transport: the experimental kuli userspace,
and qemu. As qemu switched its default to ccw with 2.4 (with most
users having used ccw well before that) and removed the old transport
entirely in 2.6, s390-virtio probably hasn't been in active use for
quite some time and is therefore likely to bitrot.

Let's start the slow march towards removing the code by deprecating
it.

Note that this also deprecates the early virtio console code, which
has been causing trouble in the guest without being wired up in any
relevant hypervisor code.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Reviewed-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-08-09 13:42:41 +03:00
Linus Torvalds 1eccfa090e Implements HARDENED_USERCOPY verification of copy_to_user/copy_from_user
bounds checking for most architectures on SLAB and SLUB.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 Comment: Kees Cook <kees@outflux.net>
 
 iQIcBAABCgAGBQJXl9tlAAoJEIly9N/cbcAm5BoP/ikTtDp2bFw1sn92yHTnIWzl
 O+dcKVAeRgjfnSvPfb1JITpaM58exQSaDsPBeR0DbVzU1zDdhLcwHHiQupFh98Ka
 vBZthbrlL/u4NB26enEEW0iyA32BsxYBMnIu0z5ux9RbZflmQwGQ0c0rvy3dJ7/b
 FzB5ayVST5y/a0m6/sImeeExh78GU9rsMb1XmJRMwlJAy6miDz/F9TP0LnuW6PhG
 J5XC99ygNJS1pQBLACRsrZw6ImgBxXnWCok6tWPMxFfD+rJBU2//wqS+HozyMWHL
 iYP7+ytVo/ZVok4114X/V4Oof3a6wqgpBuYrivJ228QO+UsLYbYLo6sZ8kRK7VFm
 9GgHo/8rWB1T9lBbSaa7UL5r0dVNNLjFGS42vwV+YlgUMQ1A35VRojO0jUnJSIQU
 Ug1IxKmylLd0nEcwD8/l3DXeQABsfL8GsoKW0OtdTZtW4RND4gzq34LK6t7hvayF
 kUkLg1OLNdUJwOi16M/rhugwYFZIMfoxQtjkRXKWN4RZ2QgSHnx2lhqNmRGPAXBG
 uy21wlzUTfLTqTpoeOyHzJwyF2qf2y4nsziBMhvmlrUvIzW1LIrYUKCNT4HR8Sh5
 lC2WMGYuIqaiu+NOF3v6CgvKd9UW+mxMRyPEybH8mEgfm+FLZlWABiBjIUpSEZuB
 JFfuMv1zlljj/okIQRg8
 =USIR
 -----END PGP SIGNATURE-----

Merge tag 'usercopy-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull usercopy protection from Kees Cook:
 "Tbhis implements HARDENED_USERCOPY verification of copy_to_user and
  copy_from_user bounds checking for most architectures on SLAB and
  SLUB"

* tag 'usercopy-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  mm: SLUB hardened usercopy support
  mm: SLAB hardened usercopy support
  s390/uaccess: Enable hardened usercopy
  sparc/uaccess: Enable hardened usercopy
  powerpc/uaccess: Enable hardened usercopy
  ia64/uaccess: Enable hardened usercopy
  arm64/uaccess: Enable hardened usercopy
  ARM: uaccess: Enable hardened usercopy
  x86/uaccess: Enable hardened usercopy
  mm: Hardened usercopy
  mm: Implement stack frame object validation
  mm: Add is_migrate_cma_page
2016-08-08 14:48:14 -07:00
Christian Borntraeger 6e127efeea s390/config: make the vector optimized crc function builtin
For all configs with CONFIG_BTRFS_FS = y we should also make the
optimized crc module builtin. Otherwise early mounts will fall
back to the software variant.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-08-08 15:41:32 +02:00
Christian Borntraeger e2efc42454 s390/lib: fix memcmp and strstr
if two string compare equal the clcle instruction will update the
string addresses to point _after_ the string. This might already
be on a different page, so we should not use these pointer to
calculate the difference as in that case the calculation of the
difference can cause oopses.

The return value of memcmp does not need the difference, we
can just reuse the condition code and return for CC=1 (All bytes
compared, first operand low) -1 and for CC=2 (All bytes compared,
first operand high) +1
strstr also does not need the diff.
While fixing this, make the common function clcle "correct on its
own" by using l1 instead of l2 for the first length. strstr will
call this with l2 for both strings.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Fixes: db7f5eef3d ("s390/lib: use basic blocks for inline assemblies")
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-08-08 15:41:31 +02:00
Christian Borntraeger 134a24cd89 s390/crc32-vx: Fix checksum calculation for small sizes
The current prealign logic will fail for sizes < alignment,
as the new datalen passed to the vector function is smaller
than zero. Being a size_t this gets wrapped to a huge
number causing memory overruns and wrong data.

Let's add an early exit if the size is smaller than the minimal
size with alignment. This will also avoid calling the software
fallback twice for all sizes smaller than the minimum size
(prealign + remaining)

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Fixes: f848dbd3bc ("s390/crc32-vx: add crypto API module for optimized CRC-32 algorithms")
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-08-08 15:41:31 +02:00
Sascha Silbe bf47dc572a s390: clarify compressed image code path
The way the decompressor is hooked into the start-up code is rather
subtle, with a mix of multiply-defined symbols and hardcoded address
literals. Add some comments at the junction points to clarify how it
works.

Signed-off-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-08-08 15:41:31 +02:00
Al Viro 711f5df7bf s390: move exports to definitions
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-08-07 23:47:20 -04:00
Krzysztof Kozlowski 00085f1efa dma-mapping: use unsigned long for dma_attrs
The dma-mapping core and the implementations do not change the DMA
attributes passed by pointer.  Thus the pointer can point to const data.
However the attributes do not have to be a bitfield.  Instead unsigned
long will do fine:

1. This is just simpler.  Both in terms of reading the code and setting
   attributes.  Instead of initializing local attributes on the stack
   and passing pointer to it to dma_set_attr(), just set the bits.

2. It brings safeness and checking for const correctness because the
   attributes are passed by value.

Semantic patches for this change (at least most of them):

    virtual patch
    virtual context

    @r@
    identifier f, attrs;

    @@
    f(...,
    - struct dma_attrs *attrs
    + unsigned long attrs
    , ...)
    {
    ...
    }

    @@
    identifier r.f;
    @@
    f(...,
    - NULL
    + 0
     )

and

    // Options: --all-includes
    virtual patch
    virtual context

    @r@
    identifier f, attrs;
    type t;

    @@
    t f(..., struct dma_attrs *attrs);

    @@
    identifier r.f;
    @@
    f(...,
    - NULL
    + 0
     )

Link: http://lkml.kernel.org/r/1468399300-5399-2-git-send-email-k.kozlowski@samsung.com
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no>
Acked-by: Mark Salter <msalter@redhat.com> [c6x]
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> [cris]
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> [drm]
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
Acked-by: Fabien Dessenne <fabien.dessenne@st.com> [bdisp]
Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com> [vb2-core]
Acked-by: David Vrabel <david.vrabel@citrix.com> [xen]
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [xen swiotlb]
Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon]
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k]
Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [s390]
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> [avr32]
Acked-by: Vineet Gupta <vgupta@synopsys.com> [arc]
Acked-by: Robin Murphy <robin.murphy@arm.com> [arm64 and dma-iommu]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-04 08:50:07 -04:00
Linus Torvalds f716a85cd6 Merge branch 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild
Pull kbuild updates from Michal Marek:

 - GCC plugin support by Emese Revfy from grsecurity, with a fixup from
   Kees Cook.  The plugins are meant to be used for static analysis of
   the kernel code.  Two plugins are provided already.

 - reduction of the gcc commandline by Arnd Bergmann.

 - IS_ENABLED / IS_REACHABLE macro enhancements by Masahiro Yamada

 - bin2c fix by Michael Tautschnig

 - setlocalversion fix by Wolfram Sang

* 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
  gcc-plugins: disable under COMPILE_TEST
  kbuild: Abort build on bad stack protector flag
  scripts: Fix size mismatch of kexec_purgatory_size
  kbuild: make samples depend on headers_install
  Kbuild: don't add obj tree in additional includes
  Kbuild: arch: look for generated headers in obtree
  Kbuild: always prefix objtree in LINUXINCLUDE
  Kbuild: avoid duplicate include path
  Kbuild: don't add ../../ to include path
  vmlinux.lds.h: replace config_enabled() with IS_ENABLED()
  kconfig.h: allow to use IS_{ENABLE,REACHABLE} in macro expansion
  kconfig.h: use already defined macros for IS_REACHABLE() define
  export.h: use __is_defined() to check if __KSYM_* is defined
  kconfig.h: use __is_defined() to check if MODULE is defined
  kbuild: setlocalversion: print error to STDERR
  Add sancov plugin
  Add Cyclomatic complexity GCC plugin
  GCC plugin infrastructure
  Shared library support
2016-08-02 16:37:12 -04:00
Linus Torvalds 221bb8a46e - ARM: GICv3 ITS emulation and various fixes. Removal of the old
VGIC implementation.
 
 - s390: support for trapping software breakpoints, nested virtualization
 (vSIE), the STHYI opcode, initial extensions for CPU model support.
 
 - MIPS: support for MIPS64 hosts (32-bit guests only) and lots of cleanups,
 preliminary to this and the upcoming support for hardware virtualization
 extensions.
 
 - x86: support for execute-only mappings in nested EPT; reduced vmexit
 latency for TSC deadline timer (by about 30%) on Intel hosts; support for
 more than 255 vCPUs.
 
 - PPC: bugfixes.
 
 The ugly bit is the conflicts.  A couple of them are simple conflicts due
 to 4.7 fixes, but most of them are with other trees. There was definitely
 too much reliance on Acked-by here.  Some conflicts are for KVM patches
 where _I_ gave my Acked-by, but the worst are for this pull request's
 patches that touch files outside arch/*/kvm.  KVM submaintainers should
 probably learn to synchronize better with arch maintainers, with the
 latter providing topic branches whenever possible instead of Acked-by.
 This is what we do with arch/x86.  And I should learn to refuse pull
 requests when linux-next sends scary signals, even if that means that
 submaintainers have to rebase their branches.
 
 Anyhow, here's the list:
 
 - arch/x86/kvm/vmx.c: handle_pcommit and EXIT_REASON_PCOMMIT was removed
 by the nvdimm tree.  This tree adds handle_preemption_timer and
 EXIT_REASON_PREEMPTION_TIMER at the same place.  In general all mentions
 of pcommit have to go.
 
 There is also a conflict between a stable fix and this patch, where the
 stable fix removed the vmx_create_pml_buffer function and its call.
 
 - virt/kvm/kvm_main.c: kvm_cpu_notifier was removed by the hotplug tree.
 This tree adds kvm_io_bus_get_dev at the same place.
 
 - virt/kvm/arm/vgic.c: a few final bugfixes went into 4.7 before the
 file was completely removed for 4.8.
 
 - include/linux/irqchip/arm-gic-v3.h: this one is entirely our fault;
 this is a change that should have gone in through the irqchip tree and
 pulled by kvm-arm.  I think I would have rejected this kvm-arm pull
 request.  The KVM version is the right one, except that it lacks
 GITS_BASER_PAGES_SHIFT.
 
 - arch/powerpc: what a mess.  For the idle_book3s.S conflict, the KVM
 tree is the right one; everything else is trivial.  In this case I am
 not quite sure what went wrong.  The commit that is causing the mess
 (fd7bacbca4, "KVM: PPC: Book3S HV: Fix TB corruption in guest exit
 path on HMI interrupt", 2016-05-15) touches both arch/powerpc/kernel/
 and arch/powerpc/kvm/.  It's large, but at 396 insertions/5 deletions
 I guessed that it wasn't really possible to split it and that the 5
 deletions wouldn't conflict.  That wasn't the case.
 
 - arch/s390: also messy.  First is hypfs_diag.c where the KVM tree
 moved some code and the s390 tree patched it.  You have to reapply the
 relevant part of commits 6c22c98637, plus all of e030c1125e, to
 arch/s390/kernel/diag.c.  Or pick the linux-next conflict
 resolution from http://marc.info/?l=kvm&m=146717549531603&w=2.
 Second, there is a conflict in gmap.c between a stable fix and 4.8.
 The KVM version here is the correct one.
 
 I have pushed my resolution at refs/heads/merge-20160802 (commit
 3d1f53419842) at git://git.kernel.org/pub/scm/virt/kvm/kvm.git.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJXoGm7AAoJEL/70l94x66DugQIAIj703ePAFepB/fCrKHkZZia
 SGrsBdvAtNsOhr7FQ5qvvjLxiv/cv7CymeuJivX8H+4kuUHUllDzey+RPHYHD9X7
 U6n1PdCH9F15a3IXc8tDjlDdOMNIKJixYuq1UyNZMU6NFwl00+TZf9JF8A2US65b
 x/41W98ilL6nNBAsoDVmCLtPNWAqQ3lajaZELGfcqRQ9ZGKcAYOaLFXHv2YHf2XC
 qIDMf+slBGSQ66UoATnYV2gAopNlWbZ7n0vO6tE2KyvhHZ1m399aBX1+k8la/0JI
 69r+Tz7ZHUSFtmlmyByi5IAB87myy2WQHyAPwj+4vwJkDGPcl0TrupzbG7+T05Y=
 =42ti
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:

 - ARM: GICv3 ITS emulation and various fixes.  Removal of the
   old VGIC implementation.

 - s390: support for trapping software breakpoints, nested
   virtualization (vSIE), the STHYI opcode, initial extensions
   for CPU model support.

 - MIPS: support for MIPS64 hosts (32-bit guests only) and lots
   of cleanups, preliminary to this and the upcoming support for
   hardware virtualization extensions.

 - x86: support for execute-only mappings in nested EPT; reduced
   vmexit latency for TSC deadline timer (by about 30%) on Intel
   hosts; support for more than 255 vCPUs.

 - PPC: bugfixes.

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (302 commits)
  KVM: PPC: Introduce KVM_CAP_PPC_HTM
  MIPS: Select HAVE_KVM for MIPS64_R{2,6}
  MIPS: KVM: Reset CP0_PageMask during host TLB flush
  MIPS: KVM: Fix ptr->int cast via KVM_GUEST_KSEGX()
  MIPS: KVM: Sign extend MFC0/RDHWR results
  MIPS: KVM: Fix 64-bit big endian dynamic translation
  MIPS: KVM: Fail if ebase doesn't fit in CP0_EBase
  MIPS: KVM: Use 64-bit CP0_EBase when appropriate
  MIPS: KVM: Set CP0_Status.KX on MIPS64
  MIPS: KVM: Make entry code MIPS64 friendly
  MIPS: KVM: Use kmap instead of CKSEG0ADDR()
  MIPS: KVM: Use virt_to_phys() to get commpage PFN
  MIPS: Fix definition of KSEGX() for 64-bit
  KVM: VMX: Add VMCS to CPU's loaded VMCSs before VMPTRLD
  kvm: x86: nVMX: maintain internal copy of current VMCS
  KVM: PPC: Book3S HV: Save/restore TM state in H_CEDE
  KVM: PPC: Book3S HV: Pull out TM state save/restore into separate procedures
  KVM: arm64: vgic-its: Simplify MAPI error handling
  KVM: arm64: vgic-its: Make vgic_its_cmd_handle_mapi similar to other handlers
  KVM: arm64: vgic-its: Turn device_id validation into generic ID validation
  ...
2016-08-02 16:11:27 -04:00
Jiri Olsa e64a5470dc s390/ftrace/jprobes: Fix conflict between jprobes and function graph tracing
This fixes the same issue Steven already fixed for x86
in following commit:

  237d28db03 ftrace/jprobes/x86: Fix conflict between jprobes and function graph tracing

It fixes the crash, that happens when function graph tracing
and jprobes are used simultaneously. Please refer to above
commit for details.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
2016-07-31 09:28:12 -04:00
James Hogan 68c5cf5a60 s390: Define AT_VECTOR_SIZE_ARCH for ARCH_DLINFO
AT_VECTOR_SIZE_ARCH should be defined with the maximum number of
NEW_AUX_ENT entries that ARCH_DLINFO can contain, but it wasn't defined
for s390 at all even though ARCH_DLINFO can contain one NEW_AUX_ENT when
VDSO is enabled.

This shouldn't be a problem as AT_VECTOR_SIZE_BASE includes space for
AT_BASE_PLATFORM which s390 doesn't use, but lets define it now and add
the comment above ARCH_DLINFO as found in several other architectures to
remind future modifiers of ARCH_DLINFO to keep AT_VECTOR_SIZE_ARCH up to
date.

Fixes: b020632e40 ("[S390] introduce vdso on s390")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: linux-s390@vger.kernel.org
2016-07-31 09:28:09 -04:00
Heiko Carstens ef4423ce70 s390/numa: only set possible nodes within node_possible_map
Make sure that only those nodes appear in the node_possible_map that
may actually be used. Usually that means that the node online and
possible maps are identical. For mode "plain" we only have one node,
for mode "emu" we have "emu_nodes" nodes.

Before this the possible map included (with default config) 16 nodes
while usually only one was used. That made a couple of loops that
iterated over all possible nodes do more work than necessary.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-07-31 05:28:00 -04:00
Heiko Carstens 8814309163 s390/als: fix compile with gcov enabled
Fix this one when gcov is enabled:

arch/s390/kernel/als.o:(.data+0x118): undefined reference to `__gcov_merge_add'
arch/s390/kernel/als.o: In function `_GLOBAL__sub_I_65535_0_verify_facilities':
(.text.startup+0x8): undefined reference to `__gcov_init'

Please merge with "s390/als: convert architecture level set code to C".

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-07-31 05:28:00 -04:00
Heiko Carstens fbd6534ce0 s390/facilities: do not generate DWORDS define anymore
The architecture level set code has been converted to C and doesn't
need a define to figure out array sizes. Since the old code was the
only user of the DWORDS define, we can get rid of it again.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-07-31 05:27:59 -04:00
Heiko Carstens a02d1988b5 s390/als: print missing facilities on facility mismatch
If the kernel needs more facilities to run than the machine provides
it is running on, print the facility bit numbers which are missing.

This allows to easily tell what went wrong and if simply the machine
does not provide a required facility or if either the kernel or the
hypervisor may have a bug.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-07-31 05:27:59 -04:00
Heiko Carstens 06ed5512a2 s390/als: print machine type on facility mismatch
If we have a facility mismatch the kernel only emits a warning that
the processor is not recent enough and stops operating. This doesn't
give us a lot of an idea of what actually went wrong.

As a first step print the machine type in addition.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-07-31 05:27:59 -04:00
Heiko Carstens be2412c247 s390/als: convert architecture level set code to C
There is no reason to have this code in assembly language. Therefore
convert it to C.

Note that this code needs special treatment: it is called very early
and one of the side effects is that e.g. the bss section is not
cleared. Therefore the preferred way for static variables is to put
them on the stack which has a size of 16KB.

There is no functional change with this patch.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-07-31 05:27:58 -04:00
Heiko Carstens cf9fdfea5f s390/sclp: move uninitialized data to data section
The early sclp code may be called before the bss section is
cleared. Therefore move all variables to the data section.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-07-31 05:27:58 -04:00
Gerald Schaefer bc29b7ac1d s390/mm: clean up pte/pmd encoding
The hugetlbfs pte<->pmd conversion functions currently assume that the pmd
bit layout is consistent with the pte layout, which is not really true.

The SW read and write bits are encoded as the sequence "wr" in a pte, but
in a pmd it is "rw". The hugetlbfs conversion assumes that the sequence
is identical in both cases, which results in swapped read and write bits
in the pmd. In practice this is not a problem, because those pmd bits are
only relevant for THP pmds and not for hugetlbfs pmds. The hugetlbfs code
works on (fake) ptes, and the converted pte bits are correct.

There is another variation in pte/pmd encoding which affects dirty
prot-none ptes/pmds. In this case, a pmd has both its HW read-only and
invalid bit set, while it is only the invalid bit for a pte. This also has
no effect in practice, but it should better be consistent.

This patch fixes both inconsistencies by changing the SW read/write bit
layout for pmds as well as the PAGE_NONE encoding for ptes. It also makes
the hugetlbfs conversion functions more robust by introducing a
move_set_bit() macro that uses the pte/pmd bit #defines instead of
constant shifts.

Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-07-31 05:27:57 -04:00
Linus Torvalds 797cee982e Merge branch 'stable-4.8' of git://git.infradead.org/users/pcmoore/audit
Pull audit updates from Paul Moore:
 "Six audit patches for 4.8.

  There are a couple of style and minor whitespace tweaks for the logs,
  as well as a minor fixup to catch errors on user filter rules, however
  the major improvements are a fix to the s390 syscall argument masking
  code (reviewed by the nice s390 folks), some consolidation around the
  exclude filtering (less code, always a win), and a double-fetch fix
  for recording the execve arguments"

* 'stable-4.8' of git://git.infradead.org/users/pcmoore/audit:
  audit: fix a double fetch in audit_log_single_execve_arg()
  audit: fix whitespace in CWD record
  audit: add fields to exclude filter by reusing user filter
  s390: ensure that syscall arguments are properly masked on s390
  audit: fix some horrible switch statement style crimes
  audit: fixup: log on errors from filter user rules
2016-07-29 17:54:17 -07:00
Linus Torvalds 7a1e8b80fb Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:
 "Highlights:

   - TPM core and driver updates/fixes
   - IPv6 security labeling (CALIPSO)
   - Lots of Apparmor fixes
   - Seccomp: remove 2-phase API, close hole where ptrace can change
     syscall #"

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (156 commits)
  apparmor: fix SECURITY_APPARMOR_HASH_DEFAULT parameter handling
  tpm: Add TPM 2.0 support to the Nuvoton i2c driver (NPCT6xx family)
  tpm: Factor out common startup code
  tpm: use devm_add_action_or_reset
  tpm2_i2c_nuvoton: add irq validity check
  tpm: read burstcount from TPM_STS in one 32-bit transaction
  tpm: fix byte-order for the value read by tpm2_get_tpm_pt
  tpm_tis_core: convert max timeouts from msec to jiffies
  apparmor: fix arg_size computation for when setprocattr is null terminated
  apparmor: fix oops, validate buffer size in apparmor_setprocattr()
  apparmor: do not expose kernel stack
  apparmor: fix module parameters can be changed after policy is locked
  apparmor: fix oops in profile_unpack() when policy_db is not present
  apparmor: don't check for vmalloc_addr if kvzalloc() failed
  apparmor: add missing id bounds check on dfa verification
  apparmor: allow SYS_CAP_RESOURCE to be sufficient to prlimit another task
  apparmor: use list_next_entry instead of list_entry_next
  apparmor: fix refcount race when finding a child profile
  apparmor: fix ref count leak when profile sha1 hash is read
  apparmor: check that xindex is in trans_table bounds
  ...
2016-07-29 17:38:46 -07:00
Linus Torvalds a6408f6cb6 Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull smp hotplug updates from Thomas Gleixner:
 "This is the next part of the hotplug rework.

   - Convert all notifiers with a priority assigned

   - Convert all CPU_STARTING/DYING notifiers

     The final removal of the STARTING/DYING infrastructure will happen
     when the merge window closes.

  Another 700 hundred line of unpenetrable maze gone :)"

* 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (70 commits)
  timers/core: Correct callback order during CPU hot plug
  leds/trigger/cpu: Move from CPU_STARTING to ONLINE level
  powerpc/numa: Convert to hotplug state machine
  arm/perf: Fix hotplug state machine conversion
  irqchip/armada: Avoid unused function warnings
  ARC/time: Convert to hotplug state machine
  clocksource/atlas7: Convert to hotplug state machine
  clocksource/armada-370-xp: Convert to hotplug state machine
  clocksource/exynos_mct: Convert to hotplug state machine
  clocksource/arm_global_timer: Convert to hotplug state machine
  rcu: Convert rcutree to hotplug state machine
  KVM/arm/arm64/vgic-new: Convert to hotplug state machine
  smp/cfd: Convert core to hotplug state machine
  x86/x2apic: Convert to CPU hotplug state machine
  profile: Convert to hotplug state machine
  timers/core: Convert to hotplug state machine
  hrtimer: Convert to hotplug state machine
  x86/tboot: Convert to hotplug state machine
  arm64/armv8 deprecated: Convert to hotplug state machine
  hwtracing/coresight-etm4x: Convert to hotplug state machine
  ...
2016-07-29 13:55:30 -07:00