Provide a hook in hypervisor_x86 called after setting up initial
memory mapping.
This is needed e.g. by Xen HVM guests to map the hypervisor shared
info page.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
Mark a couple of structures and functions as 'static', pointed out by Sparse:
warning: symbol 'bts_pmu' was not declared. Should it be static?
warning: symbol 'p4_event_aliases' was not declared. Should it be static?
warning: symbol 'rapl_attr_groups' was not declared. Should it be static?
symbol 'process_uv2_message' was not declared. Should it be static?
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Andrew Banman <abanman@hpe.com> # for the UV change
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: kernel-janitors@vger.kernel.org
Link: http://lkml.kernel.org/r/20170810155709.7094-1-colin.king@canonical.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
"virtual_vmload_vmsave" is what is going to land in /proc/cpuinfo now
as per v4.13-rc4, for a single feature bit which is clearly too long.
So rename it to what it is called in the processor manual.
"v_vmsave_vmload" is a bit shorter, after all.
We could go more aggressively here but having it the same as in the
processor manual is advantageous.
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Radim Krčmář <rkrcmar@redhat.com>
Cc: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
Cc: Jörg Rödel <joro@8bytes.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm-ML <kvm@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170801185552.GA3743@nazgul.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
ARC cores on reset have all interrupt lines of built-in INTC enabled.
Which means once we globally enable interrupts (very early on boot)
faulty hardware blocks may trigger an interrupt that Linux kernel
cannot handle yet as corresponding handler is not yet installed.
In that case system falls in "interrupt storm" and basically never
does anything useful except entering and exiting generic IRQ handling
code.
One real example of that kind of problematic hardware is DW GMAC which
also has interrupts enabled on reset and if Ethernet PHY informs GMAC
about link state, GMAC immediately reports that upstream to ARC core
and here we are.
Now with that change we mask all individual IRQ lines making entire
system more fool-proof.
[This patch was motivated by Adaptrum platform support]
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Cc: Eugeniy Paltsev <paltsev@synopsys.com>
Tested-by: Alexandru Gagniuc <alex.g@adaptrum.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
According to Intel 64 and IA-32 Architectures SDM, Volume 3,
Chapter 14.2, "Software needs to exercise care to avoid delays
between the two RDMSRs (for example interrupts)".
So, disable interrupts during reading MSRs IA32_APERF and IA32_MPERF.
See also: commit 4ab60c3f32 (cpufreq: intel_pstate: Disable
interrupts during MSRs reading).
Signed-off-by: Doug Smythies <dsmythies@telus.net>
Reviewed-by: Len Brown <len.brown@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Nadav reported parallel MADV_DONTNEED on same range has a stale TLB
problem and Mel fixed it[1] and found same problem on MADV_FREE[2].
Quote from Mel Gorman:
"The race in question is CPU 0 running madv_free and updating some PTEs
while CPU 1 is also running madv_free and looking at the same PTEs.
CPU 1 may have writable TLB entries for a page but fail the pte_dirty
check (because CPU 0 has updated it already) and potentially fail to
flush.
Hence, when madv_free on CPU 1 returns, there are still potentially
writable TLB entries and the underlying PTE is still present so that a
subsequent write does not necessarily propagate the dirty bit to the
underlying PTE any more. Reclaim at some unknown time at the future
may then see that the PTE is still clean and discard the page even
though a write has happened in the meantime. I think this is possible
but I could have missed some protection in madv_free that prevents it
happening."
This patch aims for solving both problems all at once and is ready for
other problem with KSM, MADV_FREE and soft-dirty story[3].
TLB batch API(tlb_[gather|finish]_mmu] uses [inc|dec]_tlb_flush_pending
and mmu_tlb_flush_pending so that when tlb_finish_mmu is called, we can
catch there are parallel threads going on. In that case, forcefully,
flush TLB to prevent for user to access memory via stale TLB entry
although it fail to gather page table entry.
I confirmed this patch works with [4] test program Nadav gave so this
patch supersedes "mm: Always flush VMA ranges affected by zap_page_range
v2" in current mmotm.
NOTE:
This patch modifies arch-specific TLB gathering interface(x86, ia64,
s390, sh, um). It seems most of architecture are straightforward but
s390 need to be careful because tlb_flush_mmu works only if
mm->context.flush_mm is set to non-zero which happens only a pte entry
really is cleared by ptep_get_and_clear and friends. However, this
problem never changes the pte entries but need to flush to prevent
memory access from stale tlb.
[1] http://lkml.kernel.org/r/20170725101230.5v7gvnjmcnkzzql3@techsingularity.net
[2] http://lkml.kernel.org/r/20170725100722.2dxnmgypmwnrfawp@suse.de
[3] http://lkml.kernel.org/r/BD3A0EBE-ECF4-41D4-87FA-C755EA9AB6BD@gmail.com
[4] https://patchwork.kernel.org/patch/9861621/
[minchan@kernel.org: decrease tlb flush pending count in tlb_finish_mmu]
Link: http://lkml.kernel.org/r/20170808080821.GA31730@bbox
Link: http://lkml.kernel.org/r/20170802000818.4760-7-namit@vmware.com
Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Nadav Amit <namit@vmware.com>
Reported-by: Nadav Amit <namit@vmware.com>
Reported-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch is a preparatory patch for solving race problems caused by
TLB batch. For that, we will increase/decrease TLB flush pending count
of mm_struct whenever tlb_[gather|finish]_mmu is called.
Before making it simple, this patch separates architecture specific part
and rename it to arch_tlb_[gather|finish]_mmu and generic part just
calls it.
It shouldn't change any behavior.
Link: http://lkml.kernel.org/r/20170802000818.4760-5-namit@vmware.com
Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Nadav Amit <namit@vmware.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
THP migration is added but only supports x86_64 at the moment. For all
other architectures, swp_entry_to_pmd() only returns a zero pmd_t.
Due to a GCC zero initializer bug #53119, the standard (pmd_t){0}
initializer is not accepted by all GCC versions. __pmd() is a feasible
workaround. In addition, sparc32's pmd_t is an array instead of a single
value, so we need (pmd_t){ {0}, } instead of (pmd_t){0}. Thus,
a different __pmd() definition is needed in sparc32.
Signed-off-by: Zi Yan <zi.yan@cs.rutgers.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull sparc updates from David Miller:
1) Recognize M8 cpus, just basic chip ID matching, from Allen Pais.
2) Prevent crashes when bringing up sunvdc virtual block devices in
some environments. From Jim Quigley.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sunvdc: prevent sunvdc panic when mpgroup disk added to guest domain
sparc64: Increase max_phys_bits to 51 and VA bits to 53 for M8.
sparc64: recognize and support sparc M8 cpu type
sparc64: properly name the cpu constants
A hang on CPU0 onlining after a preceding offlining is observed. Trace
shows that CPU0 is stuck in check_tsc_sync_target() waiting for source
CPU to run check_tsc_sync_source() but this never happens. Source CPU,
in its turn, is stuck on synchronize_sched() which is called from
native_cpu_up() -> do_boot_cpu() -> unregister_nmi_handler().
So it's a classic ABBA deadlock, due to the use of synchronize_sched() in
unregister_nmi_handler().
Fix the bug by moving unregister_nmi_handler() from do_boot_cpu() to
native_cpu_up() after cpu onlining is done.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170803105818.9934-1-vkuznets@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Since the kernel segment registers are not prepared at the
entry of irq-entry code, if a kprobe on such code is
jump-optimized, accessing per-CPU variables may cause a
kernel panic.
However, if the kprobe is not optimized, it triggers an int3
exception and sets segment registers correctly.
With this patch we check the probe-address and if it is in the
irq-entry code, it prohibits optimizing such kprobes.
This means we can continue probing such interrupt handlers by kprobes
but it is not optimized anymore.
Reported-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Tested-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: David S . Miller <davem@davemloft.net>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: linux-arch@vger.kernel.org
Cc: linux-cris-kernel@axis.com
Cc: mathieu.desnoyers@efficios.com
Link: http://lkml.kernel.org/r/150172795654.27216.9824039077047777477.stgit@devbox
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Generate irqentry and softirqentry text sections without
any Kconfig dependencies. This will add extra sections, but
there should be no performace impact.
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: David S . Miller <davem@davemloft.net>
Cc: Francis Deslauriers <francis.deslauriers@efficios.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: linux-arch@vger.kernel.org
Cc: linux-cris-kernel@axis.com
Cc: mathieu.desnoyers@efficios.com
Link: http://lkml.kernel.org/r/150172789110.27216.3955739126693102122.stgit@devbox
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Mark _stext and _end as character arrays instead of single
character variable, like include/asm-generic/sections.h does.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: David S . Miller <davem@davemloft.net>
Cc: Francis Deslauriers <francis.deslauriers@efficios.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: linux-arch@vger.kernel.org
Cc: linux-cris-kernel@axis.com
Cc: mathieu.desnoyers@efficios.com
Link: http://lkml.kernel.org/r/150172782555.27216.2805751327900543374.stgit@devbox
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Mark _stext and _end as character arrays instead of single
character variables, like include/asm-generic/sections.h does.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: David S . Miller <davem@davemloft.net>
Cc: Francis Deslauriers <francis.deslauriers@efficios.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: linux-arch@vger.kernel.org
Cc: linux-cris-kernel@axis.com
Cc: mathieu.desnoyers@efficios.com
Link: http://lkml.kernel.org/r/150172775958.27216.12951305461398200544.stgit@devbox
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Mark _stext and _etext as character arrays instead of
single character variables, like include/asm-generic/sections.h
does.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: David S . Miller <davem@davemloft.net>
Cc: Francis Deslauriers <francis.deslauriers@efficios.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: linux-arch@vger.kernel.org
Cc: linux-cris-kernel@axis.com
Cc: mathieu.desnoyers@efficios.com
Link: http://lkml.kernel.org/r/150172769415.27216.12021110228384155707.stgit@devbox
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This closes a hole in our SMAP implementation.
This patch comes from grsecurity. Good catch!
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/314cc9f294e8f14ed85485727556ad4f15bb1659.1502159503.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
In Family 17h, the number of cores sharing a cache level is obtained
from the Cache Properties CPUID leaf (0x8000001d) by passing in the
cache level in ECX. In prior families, a cache level of 2 was used to
determine this information.
To get the right information, irrespective of Family, iterate over
the cache levels using CPUID 0x8000001d. The last level cache is the
last value to return a non-zero value in EAX.
Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/5ab569025b39cdfaeca55b571d78c0fc800bdb69.1497452002.git.Janakarajan.Natarajan@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
In Family 17h, L3 is the last level cache as opposed to L2 in previous
families. Avoid this name confusion and rename X86_FEATURE_PERFCTR_L2 to
X86_FEATURE_PERFCTR_LLC to indicate the performance counter on the last
level of cache.
Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/016311029fdecdc3fdc13b7ed865c6cbf48b2f15.1497452002.git.Janakarajan.Natarajan@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Vince reported the following rdpmc() testcase failure:
> Failing test case:
>
> fd=perf_event_open();
> addr=mmap(fd);
> exec() // without closing or unmapping the event
> fd=perf_event_open();
> addr=mmap(fd);
> rdpmc() // GPFs due to rdpmc being disabled
The problem is of course that exec() plays tricks with what is
current->mm, only destroying the old mappings after having
installed the new mm.
Fix this confusion by passing along vma->vm_mm instead of relying on
current->mm.
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Tested-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Fixes: 1e0fb9ec67 ("perf: Add pmu callbacks to track event mapping and unmapping")
Link: http://lkml.kernel.org/r/20170802173930.cstykcqefmqt7jau@hirez.programming.kicks-ass.net
[ Minor cleanups. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The EMAC Ethernet controller was enabled, but an accompanying alias
was not added. This results in unstable numbering if other Ethernet
devices, such as a USB dongle, are present. Also, the bootloader uses
the alias to assign a generated stable MAC address to the device node.
Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Fixes: 96219b0048 ("arm64: allwinner: a64: add device tree for SoPine
with baseboard")
[wens@csie.org: Rewrite commit log as fixing a previous patch with Fixes]
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
The EMAC Ethernet controller was enabled, but an accompanying alias
was not added. This results in unstable numbering if other Ethernet
devices, such as a USB dongle, are present. Also, the bootloader uses
the alias to assign a generated stable MAC address to the device node.
Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Fixes: 9702394374 ("arm64: allwinner: pine64: Enable dwmac-sun8i")
[wens@csie.org: Rewrite commit log as fixing a previous patch with Fixes]
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
The EMAC Ethernet controller was enabled, but an accompanying alias
was not added. This results in unstable numbering if other Ethernet
devices, such as a USB dongle, are present. Also, the bootloader uses
the alias to assign a generated stable MAC address to the device node.
Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Fixes: e729549990 ("arm64: allwinner: bananapi-m64: Enable dwmac-sun8i")
[wens@csie.org: Rewrite commit log as fixing a previous patch with Fixes]
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Pull networking fixes from David Miller:
"The pull requests are getting smaller, that's progress I suppose :-)
1) Fix infinite loop in CIPSO option parsing, from Yujuan Qi.
2) Fix remote checksum handling in VXLAN and GUE tunneling drivers,
from Koichiro Den.
3) Missing u64_stats_init() calls in several drivers, from Florian
Fainelli.
4) TCP can set the congestion window to an invalid ssthresh value
after congestion window reductions, from Yuchung Cheng.
5) Fix BPF jit branch generation on s390, from Daniel Borkmann.
6) Correct MIPS ebpf JIT merge, from David Daney.
7) Correct byte order test in BPF test_verifier.c, from Daniel
Borkmann.
8) Fix various crashes and leaks in ASIX driver, from Dean Jenkins.
9) Handle SCTP checksums properly in mlx4 driver, from Davide
Caratti.
10) We can potentially enter tcp_connect() with a cached route
already, due to fastopen, so we have to explicitly invalidate it.
11) skb_warn_bad_offload() can bark in legitimate situations, fix from
Willem de Bruijn"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (52 commits)
net: avoid skb_warn_bad_offload false positives on UFO
qmi_wwan: fix NULL deref on disconnect
ppp: fix xmit recursion detection on ppp channels
rds: Reintroduce statistics counting
tcp: fastopen: tcp_connect() must refresh the route
net: sched: set xt_tgchk_param par.net properly in ipt_init_target
net: dsa: mediatek: add adjust link support for user ports
net/mlx4_en: don't set CHECKSUM_COMPLETE on SCTP packets
qed: Fix a memory allocation failure test in 'qed_mcp_cmd_init()'
hysdn: fix to a race condition in put_log_buffer
s390/qeth: fix L3 next-hop in xmit qeth hdr
asix: Fix small memory leak in ax88772_unbind()
asix: Ensure asix_rx_fixup_info members are all reset
asix: Add rx->ax_skb = NULL after usbnet_skb_return()
bpf: fix selftest/bpf/test_pkt_md_access on s390x
netvsc: fix race on sub channel creation
bpf: fix byte order test in test_verifier
xgene: Always get clk source, but ignore if it's missing for SGMII ports
MIPS: Add missing file for eBPF JIT.
bpf, s390: fix build for libbpf and selftest suite
...
When CPUs start and stop the watchdog, they manipulate shared data
that is normally protected by the lock. Other CPUs can be running
concurrently at this time, so it's a good idea to use locking here
to be on the safe side.
Remove the barrier which is undocumented and didn't do anything.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
When the SMP detector finds other CPUs stuck, it iterates over
them and marks them as stuck. This pulls them out of the pending
mask and allows the detector to continue with remaining good
CPUs (if nmi_watchdog=panic is not enabled).
The code to dothat was buggy because when setting a CPU stuck,
if the pending mask became empty, it resets it to keep the
watchdog running. However the iterator will continue to run
over the new pending mask and mark remaining good CPUs sas stuck.
Fix this by doing it with cpumask bitwise operations.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
When the watchdog decides to panic, it takes the lock and double
checks everything (to avoid races with the CPU being unstuck or
panic()ed by something else).
The exit label was misplaced and would result in all-CPUs backtrace
and watchdog panic even in the case that the condition was found to be
resolved.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Some code can go into a tight loop calling touch_nmi_watchdog (e.g.,
stop_machine CPU hotplug code). This can cause contention on watchdog
locks particularly if all CPUs with watchdog enabled are spinning in
the loops.
Avoid this storm of activity by running the watchdog timer callback
from this path if we have exceeded the timer period since it was last
run.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
- Hard-disable interrupts before taking the lock, which prevents
soft-NMI re-entrancy and therefore can prevent deadlocks.
- Use raw_ variants of local_irq_disable to avoid irq debugging.
- When the lock is contended, spin at low SMT priority, using
loads only, and with interrupts enabled (where possible).
Some stalls have been noticed at high loads that go away with improved
locking. There should not be so much locking contention in the first
place (which is addressed in a subsequent patch), but locking should
still be improved.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
When the NMI IPI lock is contended, spin at low SMT priority, using
loads only, and with interrupts enabled (where possible). This
improves behaviour under high contention (e.g., a system crash when
a number of CPUs are trying to enter the debugger).
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
In commit 05a4a95279 ("kernel/watchdog: split up config options"),
CONFIG_LOCKUP_DETECTOR was split into two separate config options,
CONFIG_HARDLOCKUP_DETECTOR and CONFIG_SOFTLOCKUP_DETECTOR.
Our defconfigs still have CONFIG_LOCKUP_DETECTOR=y, but that is no longer
user selectable, and we don't mention the new options, so we end up with
none of them enabled.
So update the defconfigs to turn on the new SOFT and HARD options, the
end result being the same as what we had previously.
Fixes: 05a4a95279 ("kernel/watchdog: split up config options")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
It was reported that the sha1 AVX2 function(sha1_transform_avx2) is
reading ahead beyond its intended data, and causing a crash if the next
block is beyond page boundary:
http://marc.info/?l=linux-crypto-vger&m=149373371023377
This patch makes sure that there is no overflow for any buffer length.
It passes the tests written by Jan Stancek that revealed this problem:
https://github.com/jstancek/sha1-avx2-crash
I have re-enabled sha1-avx2 by reverting commit
b82ce24426
Cc: <stable@vger.kernel.org>
Fixes: b82ce24426 ("crypto: sha1-ssse3 - Disable avx2")
Originally-by: Ilya Albrekht <ilya.albrekht@intel.com>
Tested-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: Megha Dey <megha.dey@linux.intel.com>
Reported-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When building a kernel for the microMIPS ISA, ensure that the ISA bit
(ie. bit 0) in the entry address is set. Otherwise we may include an
entry address in images which bootloaders will jump to as MIPS32 code.
I originally tried using "objdump -f" to obtain the entry address, which
works for microMIPS but it always outputs a 32 bit address for a 32 bit
ELF whilst nm will sign extend to 64 bit. That matters for systems where
we might want to run a MIPS32 kernel on a MIPS64 CPU & load it with a
MIPS64 bootloader, which would then jump to a non-canonical
(non-sign-extended) address.
This works in all cases as it only changes the behaviour for microMIPS
kernels, but isn't the prettiest solution. A possible alternative would
be to write a custom tool to just extract, sign extend & print the entry
point of an ELF executable. I'm open to feedback if that would be
preferred.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/16950/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
We don't currently support the MT ASE for microMIPS kernels, and there
are no CPUs currently in existence that use both. They can however both
be enabled in Kconfig, resulting in build failures such as:
AS arch/mips/kernel/cps-vec.o
arch/mips/kernel/cps-vec.S: Assembler messages:
arch/mips/kernel/cps-vec.S:242: Warning: the 32-bit microMIPS architecture does not support the `mt' extension
arch/mips/kernel/cps-vec.S:276: Error: unrecognized opcode `mttc0 $13,$2,2'
arch/mips/kernel/cps-vec.S:282: Error: unrecognized opcode `mttc0 $8,$1,2'
arch/mips/kernel/cps-vec.S:285: Error: unrecognized opcode `mttc0 $0,$2,1'
...
Fix this by preventing MT from being enabled when targeting microMIPS.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/16951/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Currently, we use the opal call opal_slw_set_reg() to inform the
Sleep-Winkle Engine (SLW) to restore the contents of some of the
Hypervisor state on wakeup from deep idle states that lose full
hypervisor context (characterized by the flag
OPAL_PM_LOSE_FULL_CONTEXT).
However, the current code has a bug in that if opal_slw_set_reg()
fails, we don't disable the use of these deep states (winkle on
POWER8, stop4 onwards on POWER9).
This patch fixes this bug by ensuring that if programing the
sleep-winkle engine to restore the hypervisor states in
pnv_save_sprs_for_deep_states() fails, then we exclude such states by
clearing the OPAL_PM_LOSE_FULL_CONTEXT flag from
supported_cpuidle_states. As a result POWER8 will be prevented from
using winkle for CPU-Hotplug, and POWER9 will put the offlined CPUs to
the default stop state when available.
Further, we ensure in the initialization of the cpuidle-powernv driver
to only include those states whose flags are present in
supported_cpuidle_states, thereby skipping OPAL_PM_LOSE_FULL_CONTEXT
states when they have been disabled due to stop-api failure.
Fixes: 1e1601b38e ("powerpc/powernv/idle: Restore SPRs for deep idle
states via stop API.")
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Commit 1c3c5eab17 ("sched/core: Enable might_sleep() and
smp_processor_id() checks early") enables checks for might_sleep() and
smp_processor_id() being used in preemptible code earlier in the boot
than before. This results in a new BUG from
pcibios_set_cache_line_size().
BUG: using smp_processor_id() in preemptible [00000000] code:
swapper/0/1 caller is pcibios_set_cache_line_size+0x10/0x70
CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.13.0-rc1-00007-g3ce3e4ba4275 #615
Stack: 0000000000000000 ffffffff81189694 0000000000000000 ffffffff81822318
000000000000004e 0000000000000001 800000000e20bd08 20c49ba5e3540000
0000000000000000 0000000000000000 ffffffff818d0000 0000000000000000
0000000000000000 ffffffff81189328 ffffffff818ce692 0000000000000000
0000000000000000 ffffffff81189bc8 ffffffff818d0000 0000000000000000
ffffffff81828907 ffffffff81769970 800000020ec78d80 ffffffff818c7b48
0000000000000001 0000000000000001 ffffffff818652b0 ffffffff81896268
ffffffff818c0000 800000020ec7fb40 800000020ec7fc58 ffffffff81684cac
0000000000000000 ffffffff8118ab50 0000000000000030 ffffffff81769970
0000000000000001 ffffffff81122a58 0000000000000000 0000000000000000 ...
Call Trace:
[<ffffffff81122a58>] show_stack+0x90/0xb0
[<ffffffff81684cac>] dump_stack+0xac/0xf0
[<ffffffff813f7050>] check_preemption_disabled+0x120/0x128
[<ffffffff818855e8>] pcibios_set_cache_line_size+0x10/0x70
[<ffffffff81100578>] do_one_initcall+0x48/0x140
[<ffffffff81865dc4>] kernel_init_freeable+0x194/0x24c
[<ffffffff8169c534>] kernel_init+0x14/0x118
[<ffffffff8111ca84>] ret_from_kernel_thread+0x14/0x1c
Fix this by using the cpu_*cache_line_size() macros instead. These
macros are the "proper" way to determine the CPU cache sizes.
This makes use of the newly added cpu_tcache_line_size.
Fixes: 1c3c5eab17 ("sched/core: Enable might_sleep() and smp_processor_id() checks early")
Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Suggested-by: James Hogan <james.hogan@imgtec.com>
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
There exist macros to return the cache line size of the L1 dcache and L2
scache but there is currently no macro for the L3 tcache. Add this macro
which will be used by the following patch "MIPS: PCI: Fix
smp_processor_id() in preemptible"
Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Cc: Maciej W. Rozycki <macro@imgtec.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/16871/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Fix a commit 3021773c7c ("MIPS: DEC: Avoid la pseudo-instruction in
delay slots") regression and remove assembly errors:
arch/mips/dec/int-handler.S: Assembler messages:
arch/mips/dec/int-handler.S:162: Error: Macro used $at after ".set noat"
arch/mips/dec/int-handler.S:163: Error: Macro used $at after ".set noat"
arch/mips/dec/int-handler.S:229: Error: Macro used $at after ".set noat"
arch/mips/dec/int-handler.S:230: Error: Macro used $at after ".set noat"
triggering with with the CPU_DADDI_WORKAROUNDS option set and the DADDIU
instruction. This is because with that option in place the instruction
becomes a macro, which expands to an LI/DADDU (or actually ADDIU/DADDU)
sequence that uses $at as a temporary register.
With CPU_DADDI_WORKAROUNDS we only support `-msym32' compilation though,
and this is already enforced in arch/mips/Makefile, so choose the 32-bit
expansion variant for the supported configurations and then replace the
64-bit variant with #error just in case.
Fixes: 3021773c7c ("MIPS: DEC: Avoid la pseudo-instruction in delay slots")
Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: stable@vger.kernel.org # 4.8+
Patchwork: https://patchwork.linux-mips.org/patch/16893/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Extend clobber lists to include all GP registers.
Fixes: 0b523a85e1 ("MIPS: VDSO: Add implementation of gettimeofday() fallback")
Signed-off-by: Miodrag Dinic <miodrag.dinic@imgtec.com>
Signed-off-by: Goran Ferenc <goran.ferenc@imgtec.com>
Signed-off-by: Aleksandar Markovic <aleksandar.markovic@imgtec.com>
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Cc: Bo Hu <bohu@google.com>
Cc: Douglas Leung <douglas.leung@imgtec.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Jin Qian <jinqian@google.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Petar Jovanovic <petar.jovanovic@imgtec.com>
Cc: Raghu Gandham <raghu.gandham@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/16879/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Commit 296e46db00 ("MIPS: Don't unnecessarily include kmalloc.h into
<asm/cache.h>.") claimed that the inclusion of the machine's kmalloc.h
from asm/cache.h is unnecessary, but this is not true.
Without including kmalloc.h we don't get a definition for
ARCH_DMA_MINALIGN, which means we no longer suitably align DMA. Further
to this the definition of ARCH_KMALLOC_MINALIGN provided by linux/slab.h
ends up being set to the alignment of an unsigned long long value rather
than to ARCH_DMA_MINALIGN, which means that buffers allocated using
kmalloc may no longer be safely aligned for use with DMA.
Fix this by re-adding the include of kmalloc.h in asm/cache.h. This
reverts commit 296e46db00 ("MIPS: Don't unnecessarily include
kmalloc.h into <asm/cache.h>.")
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Fixes: 296e46db00 ("MIPS: Don't unnecessarily include kmalloc.h into <asm/cache.h>.")
Cc: linux-mips@linux-mips.org
Cc: stable <stable@vger.kernel.org> # v4.12+
Patchwork: https://patchwork.linux-mips.org/patch/16895/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Fix build error when CONFIG_SMP is turned off:
CC [M] arch/mips/cavium-octeon/octeon-usb.o
arch/mips/cavium-octeon/octeon-usb.c: In function
‘dwc3_octeon_device_init’:
arch/mips/cavium-octeon/octeon-usb.c:540:4: error: implicit declaration
of function ‘devm_iounmap’ [-Werror=implicit-function-declaration]
devm_iounmap(&pdev->dev, base);
Signed-off-by: Steven J. Hill <steven.hill@cavium.com>
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Tested-by: Matt Redfearn <matt.redfearn@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/16907/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Commit "MIPS: Octeon: Remove unused L2C types and macros." broke the
the EDAC driver. Bring back 'cvmx-l2d-defs.h' file and the missing
types for L2C. Fixes: 15f6847923 ("MIPS: Octeon: Remove unused L2C
types and macros.")
Fixes: 15f6847923 ("MIPS: Octeon: Remove unused L2C types and macros.")
Signed-off-by: Steven J. Hill <steven.hill@cavium.com>
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 4.12+
Patchwork: https://patchwork.linux-mips.org/patch/16906/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
While testing cpu hoptlug (cpu down and up in loops) on kernel 4.4, it was
observed that occasionally check for cpu online will fail in kernel/cpu.c,
_cpu_up:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/tree/kernel/cpu.c?h=v4.4.79#n485
518 /* Arch-specific enabling code. */
519 ret = __cpu_up(cpu, idle);
520
521 if (ret != 0)
522 goto out_notify;
523 BUG_ON(!cpu_online(cpu));
Reason is race between start_secondary and _cpu_up. cpu_callin_map is set
before cpu_online_mask. In __cpu_up, cpu_callin_map is waited for, but cpu
online mask is not, resulting in race in which secondary processor started
and set cpu_callin_map, but not yet set the online mask,resulting in above
BUG being hit.
Upstream differs in the area. cpu_online check is in bringup_wait_for_ap,
which is after cpu reached AP_ONLINE_IDLE,where secondary passed its start
function. Nonetheless, fix makes start_secondary safe and not depending on
other locks throughout the code. It protects as well against cpu_online
checks put in between sometimes in the future.
Fix this by moving completion after all flags are set.
Signed-off-by: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nokia.com>
Cc: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/16925/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
clock name of "audio_clkout" is used by Renesas sound driver.
This duplicated naming breaks its clock registering/unregistering.
Especially, when unbind/bind it can't handle clkout correctly.
This patch renames "audio_clkout" to "audio-clkout" to avoid
naming conflict.
Fixes: 8a8f181d2c ("arm64: renesas: salvator-x: use CS2000 as AUDIO_CLK_B")
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Pull MIPS fixes from Ralf Baechle:
"This fixes two build issues for ralink platforms, both due to missing
#includes which used to be included indirectly via other headers"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: ralink: mt7620: Add missing header
MIPS: ralink: Fix build error due to missing header
Add a ranges; line to the tscadc node. This creates a 1:1 mapping between
the addresses used by tscadc and those in its child nodes (adc, tsc).
Without such a mapping, the reg = ... lines in the tsc and adc nodes do
not create a resource. Probing the fsl-imx25-tcq and fsl-imx25-tsadc
drivers will then fail since there's no IORESOURCE_MEM.
Signed-off-by: Martin Kaiser <martin@kaiser.cx>
Fixes: 92f651f39b ("ARM: dts: imx25: Add TSC and ADC support")
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
ARM:
- Yet another race with VM destruction plugged
- A set of small vgic fixes
x86:
- Preserve pending INIT
- RCU fixes in paravirtual async pf, VM teardown, and VMXOFF emulation
- nVMX interrupt injection and dirty tracking fixes
- initialize to make UBSAN happy
-----BEGIN PGP SIGNATURE-----
iQEcBAABCAAGBQJZhNZ2AAoJEED/6hsPKofoKDEH/iIw1pcgdEW2NP/kFtKXSMCK
josdFwGPQMjBGzx6No4tfMCNDOjW2FKYXapN6CASAqMJo5H2krj8VHMVwm0h3lUl
4RdbbkFTdfl/Znp8M39efFheWrjX+L37AltKV7xAgA7n8cO39KV4RReimzSc7aVq
5dDt4k0dbF9/zXHxkGiKEhwaSSbZEEznQQ/09annSoOVe6om5esUrUtnUF5P99uz
IhAsmJbZxE5VmowjT5MjaR1mXSLLNL55HWKvkf3B3ZGnyxQU+3Vz7IGf2Ma2j+jV
IrdXA11NHDY1anDYgDhFlr3rTCPu9CBmTv4O8zsDRlX9TGpr8bBX2dvjRKl7uOo=
=KM80
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Radim Krčmář:
"ARM:
- Yet another race with VM destruction plugged
- A set of small vgic fixes
x86:
- Preserve pending INIT
- RCU fixes in paravirtual async pf, VM teardown, and VMXOFF
emulation
- nVMX interrupt injection and dirty tracking fixes
- initialize to make UBSAN happy"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: arm/arm64: vgic: Use READ_ONCE fo cmpxchg
KVM: nVMX: Fix interrupt window request with "Acknowledge interrupt on exit"
KVM: nVMX: mark vmcs12 pages dirty on L2 exit
kvm: nVMX: don't flush VMCS12 during VMXOFF or VCPU teardown
KVM: nVMX: do not pin the VMCS12
KVM: avoid using rcu_dereference_protected
KVM: X86: init irq->level in kvm_pv_kick_cpu_op
KVM: X86: Fix loss of pending INIT due to race
KVM: async_pf: make rcu irq exit if not triggered from idle task
KVM: nVMX: fixes to nested virt interrupt injection
KVM: nVMX: do not fill vm_exit_intr_error_code in prepare_vmcs12
KVM: arm/arm64: Handle hva aging while destroying the vm
KVM: arm/arm64: PMU: Fix overflow interrupt injection
KVM: arm/arm64: Fix bug in advertising KVM_CAP_MSI_DEVID capability
Pull x86 fix from Thomas Gleixner:
"The recent irq core changes unearthed API abuse in the HPET code,
which manifested itself in a suspend/resume regression.
The fix replaces the cruft with the proper function calls and cures
the regression"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/hpet: Cure interface abuse in the resume path
This comes a bit later than I planned, and as a consequence is a larger
than it should be.
Most of the changes are devicetree fixes, across lots of platforms:
Renesas, Samsung Exynos, Marvell EBU, TI OMAP, Rockchips, Amlogic Meson,
Sigma Desings Tango, Allwinner SUNxi and TI Davinci.
Also across many platforms, I applied an older series of simple randconfig
build fixes. This includes making the CONFIG_MTD_XIP option compile again,
which had been broken for many years and probably has not been missed, but
it felt wrong to just remove it completely.
The only other changes are:
- We enable HWSPINLOCK in defconfig to get some Qualcomm boards
to work out of the box.
- A few regression fixes for Texas Instruments OMAP2+.
- A boot regression fix for the Renesas regulator quirk.
- A suspend/resume fix for Uniphier SoCs, fixing the resume of the
system bus.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIVAwUAWYTonGCrR//JCVInAQJTrw/+Kv+Y1EybtFFJuMgVqH0FserTKAonNv25
OJxby86GQQTKkju9U7jYM4MmpufBEMKnY9kn83rviUKfiuVDsJcRWp2jMynJtE5m
W1aIFCu1L8TsKAQJpmBPe8G/cSG7SRH2OoX9Ee5ozii0cBUTOCy6UGDYJqsY0MKy
8KPUWcHqYhLCT+0F0raf1+F5LqJlDh44q1+UVDPphqO4pbguio7PfXA+ut8VMHS7
YSDgiXb3UoXy8tSXRX4JcYJfFG+jIqg0vWpkOW9mlLc5e/+Ndmj5CWlDZPt4hWaR
3Us5VVkI8SXq7VGcJV/mA6JO7M4L4oP1caTATxzDdK1C2u/SuhdnMcenOy8LtRLi
gImdeZLK7+RFPS+Sp/mhQQ0UtULw0kQ5BEzpN0jVI1CKybK4TN2+GrJdZLR9Usas
TkavAUl0X5E7YXws7yb8qTJqDLXxTxyN8H4gMaTh+rCssh9+I8bvChoJ0gpGRIMl
iAHodqGFBvlVQWtUefc4BfDCqj7sVmtfBcAeOK7LVQPjb9D+hPvHJ50yfN3qMpMa
eVvEz4l91W4k/vxI5HsA/r7RybDujTLvgR/BGU0kLZAsEP/FpkrIKgmi5yuzyf1u
Sd3OiK8V7FXB3dBPf0i0wagaBjuEUcEuE0GQU+bM7qX5aW8flLxbEhj4SEv0NqvL
bHiaTEi9uAU=
=a2dO
-----END PGP SIGNATURE-----
Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Arnd Bergmann:
"This comes a bit later than I planned, and as a consequence is a
larger than it should be.
Most of the changes are devicetree fixes, across lots of platforms:
Renesas, Samsung Exynos, Marvell EBU, TI OMAP, Rockchips, Amlogic
Meson, Sigma Desings Tango, Allwinner SUNxi and TI Davinci.
Also across many platforms, I applied an older series of simple
randconfig build fixes. This includes making the CONFIG_MTD_XIP option
compile again, which had been broken for many years and probably has
not been missed, but it felt wrong to just remove it completely.
The only other changes are:
- We enable HWSPINLOCK in defconfig to get some Qualcomm boards to
work out of the box.
- A few regression fixes for Texas Instruments OMAP2+.
- A boot regression fix for the Renesas regulator quirk.
- A suspend/resume fix for Uniphier SoCs, fixing the resume of the
system bus"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (43 commits)
ARM: dts: tango4: Request RGMII RX and TX clock delays
bus: uniphier-system-bus: set up registers when resuming
ARM64: dts: marvell: armada-37xx: Fix the number of GPIO on south bridge
ARM: shmobile: rcar-gen2: Fix deadlock in regulator quirk
arm64: defconfig: enable missing HWSPINLOCK
ARM: pxa: select both FB and FB_W100 for eseries
ARM: ixp4xx: fix ioport_unmap definition
ARM: ep93xx: use ARM_PATCH_PHYS_VIRT correctly
ARM: mmp: mark usb_dma_mask as __maybe_unused
ARM: omap2: mark unused functions as __maybe_unused
ARM: omap1: avoid unused variable warning
ARM: sirf: mark sirfsoc_init_late as __maybe_unused
ARM: ixp4xx: use normal prototype for {read,write}s{b,w,l}
ARM: omap1/ams-delta: warn about failed regulator enable
ARM: rpc: rename RAM_SIZE macro
ARM: w90x900: normalize clk API
ARM: ep93xx: normalize clk API
ARM: dts: sun8i: a83t: Switch to CCU device tree binding macros
arm64: allwinner: sun50i-a64: Correct emac register size
ARM: dts: sunxi: h3/h5: Correct emac register size
...
- Report correct timer frequency to userspace when trapping CNTFRQ_EL0
- Fix race with hardware page table updates when updating access flags
- Silence clang overflow warning in VA_START and PAGE_OFFSET calculations
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABCgAGBQJZhIc7AAoJELescNyEwWM0EaUH/0f1/MnKwZHAUoSOWrZynKna
0y59hNvlr47FCGxaMVSVWq/+mtSqxsQTPyLkLSULh8PR7OEwjq5qLxLMhFWF/7B4
/nmAGhFOYHDR95/0Mo0cMOMyamIvGPIqd304yld0P13S5z26UdRCoTrWgeKFJ2sm
wcKCJeAK+U4NZbASNthDKH7Mo0lM25oLIJ8gqj4o0NrxyZr86mXVTypV/nM82qdm
WRMFqGh5iRrmnGSCajgcTjqnYkFOYwHirKvdrV68+GFOYqVIS+9+RdwSnRNOZ9hl
MsIOUDRn3doHDe4tK9NcQlLoaRIqA9fM4+94XGJu+8uUNfDRYMueO6XUvGmnmOQ=
=FER5
-----END PGP SIGNATURE-----
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"Here are some more arm64 fixes for 4.13. The main one is the PTE race
with the hardware walker, but there are a couple of other things too.
- Report correct timer frequency to userspace when trapping
CNTFRQ_EL0
- Fix race with hardware page table updates when updating access
flags
- Silence clang overflow warning in VA_START and PAGE_OFFSET
calculations"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: avoid overflow in VA_START and PAGE_OFFSET
arm64: Fix potential race with hardware DBM in ptep_set_access_flags()
arm64: Use arch_timer_get_rate when trapping CNTFRQ_EL0
Inexplicably, commit f381bf6d82 ("MIPS: Add support for eBPF JIT.")
lost a file somewhere on its path to Linus' tree. Add back the
missing ebpf_jit.c so that we can build with CONFIG_BPF_JIT selected.
This version of ebpf_jit.c is identical to the original except for two
minor change need to resolve conflicts with changes merged from the
BPF branch:
A) Set prog->jited_len = image_size;
B) Use BPF_TAIL_CALL instead of BPF_CALL | BPF_X
Fixes: f381bf6d82 ("MIPS: Add support for eBPF JIT.")
Signed-off-by: David Daney <david.daney@cavium.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
While testing some other work that required JIT modifications, I
run into test_bpf causing a hang when JIT enabled on s390. The
problematic test case was the one from ddc665a4bb (bpf, arm64:
fix jit branch offset related to ldimm64), and turns out that we
do have a similar issue on s390 as well. In bpf_jit_prog() we
update next instruction address after returning from bpf_jit_insn()
with an insn_count. bpf_jit_insn() returns either -1 in case of
error (e.g. unsupported insn), 1 or 2. The latter is only the
case for ldimm64 due to spanning 2 insns, however, next address
is only set to i + 1 not taking actual insn_count into account,
thus fix is to use insn_count instead of 1. bpf_jit_enable in
mode 2 provides also disasm on s390:
Before fix:
000003ff800349b6: a7f40003 brc 15,3ff800349bc ; target
000003ff800349ba: 0000 unknown
000003ff800349bc: e3b0f0700024 stg %r11,112(%r15)
000003ff800349c2: e3e0f0880024 stg %r14,136(%r15)
000003ff800349c8: 0db0 basr %r11,%r0
000003ff800349ca: c0ef00000000 llilf %r14,0
000003ff800349d0: e320b0360004 lg %r2,54(%r11)
000003ff800349d6: e330b03e0004 lg %r3,62(%r11)
000003ff800349dc: ec23ffeda065 clgrj %r2,%r3,10,3ff800349b6 ; jmp
000003ff800349e2: e3e0b0460004 lg %r14,70(%r11)
000003ff800349e8: e3e0b04e0004 lg %r14,78(%r11)
000003ff800349ee: b904002e lgr %r2,%r14
000003ff800349f2: e3b0f0700004 lg %r11,112(%r15)
000003ff800349f8: e3e0f0880004 lg %r14,136(%r15)
000003ff800349fe: 07fe bcr 15,%r14
After fix:
000003ff80ef3db4: a7f40003 brc 15,3ff80ef3dba
000003ff80ef3db8: 0000 unknown
000003ff80ef3dba: e3b0f0700024 stg %r11,112(%r15)
000003ff80ef3dc0: e3e0f0880024 stg %r14,136(%r15)
000003ff80ef3dc6: 0db0 basr %r11,%r0
000003ff80ef3dc8: c0ef00000000 llilf %r14,0
000003ff80ef3dce: e320b0360004 lg %r2,54(%r11)
000003ff80ef3dd4: e330b03e0004 lg %r3,62(%r11)
000003ff80ef3dda: ec230006a065 clgrj %r2,%r3,10,3ff80ef3de6 ; jmp
000003ff80ef3de0: e3e0b0460004 lg %r14,70(%r11)
000003ff80ef3de6: e3e0b04e0004 lg %r14,78(%r11) ; target
000003ff80ef3dec: b904002e lgr %r2,%r14
000003ff80ef3df0: e3b0f0700004 lg %r11,112(%r15)
000003ff80ef3df6: e3e0f0880004 lg %r14,136(%r15)
000003ff80ef3dfc: 07fe bcr 15,%r14
test_bpf.ko suite runs fine after the fix.
Fixes: 0546231057 ("s390/bpf: Add s390x eBPF JIT compiler backend")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On M8 chips, use a max_phys_bits value of 51.
Also, M8 supports VA bits up to 54 bits. However, for now
restrict VA bits to 53 due to 4-level pagetable limitation.
Signed-off-by: Vijay Kumar <vijay.ac.kumar@oracle.com>
Reviewed-by: Bob Picco <bob.picco@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Recognize SPARC-M8 cpu type, hardware caps and cpu
distribution map.
Signed-off-by: Allen Pais <allen.pais@oracle.com>
Signed-off-by: David Aldridge <david.j.aldridge@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Allen Pais <allen.pais@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull sparc fixes from David Miller:
- block interrupts properly across the entire MMU context change (both
the hw MMU context change and the TSB table change) so that we don't
get a perf event interrupt in the middle. From Rob Gardner.
- be sure to register hugepages early enough, from Nitin Gupta.
- UltraSPARC-III user copy exception handling would return garbage for
the copied length in some circumstances.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc64: Fix exception handling in UltraSPARC-III memcpy.
sbus: Convert to using %pOF instead of full_name
sparc: defconfig: Cleanup from old Kconfig options
sparc64: Register hugepages during arch init
sparc64: Prevent perf from running during super critical sections
Fixes for recently merged code:
- a fix for the _PAGE_DEVMAP support, which was breaking KVM on Power9 radix
- avoid a (harmless) lockdep warning in the early SMP code
- return failure for some uses of dma_set_mask() rather than falling back to 32-bits
- fix stack setup in watchdog soft_nmi_common() to use emergency stack
- fix of_irq_to_resource() error check in of_fsl_spi_probe()
Two fixes going to stable:
- fix saving of Transactional Memory SPRs in core dump
- fix __check_irq_replay missing decrementer interrupt
And two misc:
- fix 64-bit boot wrapper build with non-biarch compiler
- work around a POWER9 PMU hang after state-loss idle
Thanks to:
Alistair Popple, Aneesh Kumar K.V, Cyril Bur, Gustavo Romero, Jose Ricardo
Ziviani, Laurent Vivier, Nicholas Piggin, Oliver O'Halloran, Sergei Shtylyov,
Suraj Jitindar Singh, Thomas Gleixner.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJZhEH/AAoJEFHr6jzI4aWA8GoP+wdtaBYZIElFMnPay/BY8lSQ
MODF6FlaexdNuVFo5In2BnMAs12bIXDO69wMziqc+uZIA+vI3K7lqsi+8p1R38Ts
FG5OFLzX/gx1ykoeDe0mC8GPUHGVBdMr6+rHzt/9M7PrQWApCoBJLakYIElOrN5B
EiAxunwJQISKNQYYmyVhjMVFq8ydMHSgtR7JVDHTh1AoStaO9M5hoJegD4Dnytaf
F0/0Y5U+NGGWwPIyDrjB4wE9L8NCk0JfvpBhjORMkFSpAOe4VhxZrAGDyaCzVKVh
YkxSirVKuVwEmvV2prmq4SAJi61EMUEPx2WL7hRN3ol/dF+2lU9nl2lyL3yRnGPY
9mtNHur07qoMqyXI8750ayTzbMG75aakJF2be9G+zwNc3Sw8dqJvs2PwpSVdnima
jWmWnRikws4qmEOEBV6nxtWf8qYACHevyK7YzIHDU8SVXLjAkX/u6LZGHAfTQ8Xo
jYcISpiA2ko8YnMR7yy9FZqs4KGqeOotMamgJRGUcSY/9Xn8WDIrtQ4tIuJtGjzW
XFOFafZOzyt6s1dNz01C4hzP60J+/CxPNRxx6zWnbijY6puADu/Mv1g+ai3nYdKA
e/CYt6O8tzooFHCHtVLUPgWA82heDwNtjmlnSikWTdT07KLRhP+0w9Qq8gBzztKT
7B/vPMjCMPQaJrrnj6GX
=Rdy8
-----END PGP SIGNATURE-----
Merge tag 'powerpc-4.13-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"Fixes for recently merged code:
- a fix for the _PAGE_DEVMAP support, which was breaking KVM on
Power9 radix
- avoid a (harmless) lockdep warning in the early SMP code
- return failure for some uses of dma_set_mask() rather than falling
back to 32-bits
- fix stack setup in watchdog soft_nmi_common() to use emergency
stack
- fix of_irq_to_resource() error check in of_fsl_spi_probe()
Two fixes going to stable:
- fix saving of Transactional Memory SPRs in core dump
- fix __check_irq_replay missing decrementer interrupt
And two misc:
- fix 64-bit boot wrapper build with non-biarch compiler
- work around a POWER9 PMU hang after state-loss idle
Thanks to: Alistair Popple, Aneesh Kumar K.V, Cyril Bur, Gustavo
Romero, Jose Ricardo Ziviani, Laurent Vivier, Nicholas Piggin, Oliver
O'Halloran, Sergei Shtylyov, Suraj Jitindar Singh, Thomas Gleixner"
* tag 'powerpc-4.13-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/64: Fix __check_irq_replay missing decrementer interrupt
powerpc/perf: POWER9 PMU stops after idle workaround
powerpc/83xx/mpc832x_rdb: fix of_irq_to_resource() error check
powerpc/64s: Fix stack setup in watchdog soft_nmi_common()
powerpc/powernv/pci: Return failure for some uses of dma_set_mask()
powerpc/boot: Fix 64-bit boot wrapper build with non-biarch compiler
powerpc/smp: Call smp_ops->setup_cpu() directly on the boot CPU
powerpc/tm: Fix saving of TM SPRs in core dump
powerpc/mm: Fix pmd/pte_devmap() on non-leaf entries
Mikael Pettersson reported that some test programs in the strace-4.18
testsuite cause an OOPS.
After some debugging it turns out that garbage values are returned
when an exception occurs, causing the fixup memset() to be run with
bogus arguments.
The problem is that two of the exception handler stubs write the
successfully copied length into the wrong register.
Fixes: ee841d0aff ("sparc64: Convert U3copy_{from,to}_user to accurate exception reporting.")
Reported-by: Mikael Pettersson <mikpelinux@gmail.com>
Tested-by: Mikael Pettersson <mikpelinux@gmail.com>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The bitmask used to define these values produces overflow, as seen by
this compiler warning:
arch/arm64/kernel/head.S:47:8: warning:
integer overflow in preprocessor expression
#elif (PAGE_OFFSET & 0x1fffff) != 0
^~~~~~~~~~~
arch/arm64/include/asm/memory.h:52:46: note:
expanded from macro 'PAGE_OFFSET'
#define PAGE_OFFSET (UL(0xffffffffffffffff) << (VA_BITS -
1))
~~~~~~~~~~~~~~~~~~ ^
It would be preferrable to use GENMASK_ULL() instead, but it's not set
up to be used from assembly (the UL() macro token pastes UL suffixes
when not included in assembly sources).
Suggested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Suggested-by: Yury Norov <ynorov@caviumnetworks.com>
Suggested-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
In a system with DBM (dirty bit management) capable agents there is a
possible race between a CPU executing ptep_set_access_flags() (maybe
non-DBM capable) and a hardware update of the dirty state (clearing of
PTE_RDONLY). The scenario:
a) the pte is writable (PTE_WRITE set), clean (PTE_RDONLY set) and old
(PTE_AF clear)
b) ptep_set_access_flags() is called as a result of a read access and it
needs to set the pte to writable, clean and young (PTE_AF set)
c) a DBM-capable agent, as a result of a different write access, is
marking the entry as young (setting PTE_AF) and dirty (clearing
PTE_RDONLY)
The current ptep_set_access_flags() implementation would set the
PTE_RDONLY bit in the resulting value overriding the DBM update and
losing the dirty state.
This patch fixes such race by setting PTE_RDONLY to the most permissive
(lowest value) of the current entry and the new one.
Fixes: 66dbd6e61a ("arm64: Implement ptep_set_access_flags() for hardware AF/DBM")
Cc: Will Deacon <will.deacon@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
They should be used only when an actual
remote-endpoint is connected.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJZectgAAoJEGFBu2jqvgRNZdkP/27ovOMGuzleyl18PiLotyXn
98qxZvnFYyEw258vkUJj0N7Rmf3gMnrfbrRIOGXyOIBpZNBkgEqjGlagXn9Z+o8F
QOERxzDT5JiM6BbR+tRPOmKc07uuccVskOCoYOZ1mk4c336b/m9dNppdWGGY83LC
WJ0OeoFP8k8gUMYqSyKscXyfryP82KbryZH5s5tmSVAnEHf6psUzAUUdQf7An5v2
Up5nD4ljF+leGX6PG8rAFRbMGy/jTHOQi7QafVx3rC+p8SydoUnEWhPTrtkDzRTb
bXtpjSUId7n80qKtyI/vD3ABDBajr4RUzYjuXGfSf32A5uvk7xjzUk8QfwWogxv5
fdOga6ogEOUjgB1L+u2uegHQl3CGSZK5sXufoQivcndBgXqA5S9VkQk/FNjAM9vJ
twyC+s3pgC7cBrHGmM4QFToubYQthmXOevKmKlbRvXyNFHW6BcIPcbq+6t4IDakd
DU7iYRzR+AxVK5fezWOhBxxL2aRK533r/gc1ZOBklRN1gD+5AWLfaQJMNDDdVePJ
8AH+PAZZz2ETXLvMH+D4XSPF6t7gDyZ6Jroh+YJ3i0Y0vOO16Jr2LvJlAClFfQNM
+TRjhaGd4WFrUVnpiyPU9l2MrTOmSu4VgEnWvf28KqyHq8fEA3ew9u+vRZhcV/2X
SwJX0rE5626UDsieIVGP
=rgNN
-----END PGP SIGNATURE-----
Merge tag 'davinci-fixes-for-v4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/nsekhar/linux-davinci into fixes
Pull "DaVinci fixes for v4.13" from Sekhar Nori:
Drop unused VPIF endpoints from device-tree.
They should be used only when an actual
remote-endpoint is connected.
* tag 'davinci-fixes-for-v4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/nsekhar/linux-davinci:
ARM: dts: da850-lcdk: drop unused VPIF endpoints
ARM: dts: da850-evm: drop unused VPIF endpoints
Two fixes to correct the EMAC blocks memory region size to match the
datasheet. One that converts raw A83T clock indices to macros from the
clk dt-binding header, completing the A83T sunxi-ng clk driver.
-----BEGIN PGP SIGNATURE-----
iQJCBAABCgAsFiEE2nN1m/hhnkhOWjtHOJpUIZwPJDAFAll566wOHHdlbnNAY3Np
ZS5vcmcACgkQOJpUIZwPJDBBhw//RK8UpKhviq1kdn6yDNKe3dVYxRTGtISvpjJL
oPXpstt1b28DdbVjeIzh6jqF2HjjNmfFlLphOoOj1H7qKqhobr9+Cll05BNWbCmH
CIL6qL8FT7QyK8yw5sPLcdlNLkNil+KfgnrBf76Crdw0R1HXGENsw+gVE2lVX+tJ
yRqhguaQjWNLA5v+nfK4la+Ip1ddgKgiBmkpkq9tuR8hUDe33LHFZxur28EChHb/
P1HXL1DHUn2uO/tyk1TDhpYkNbOGiER4neOeQt21uaKLDaRMmkRlaNHgGg4CQ/fR
OYs24Sm0XB9oFVKzz2GttkIkdWTOqRI99VlXHEd3bU3Rc2PdClrdky615qkDn/90
CCjz7ibhnkri5+1/MWVC1iF/sotYnlCKjptXMZNRmqoyC2ol+tkNy84MoHvuWBCU
/IzCNuU8sGqC5/PjGNo2LwQNmzkFsk78s0MQAUyxG0isJwMrGQaSXGtOpSLJzUPv
MRBqYyK2CQk7gZtDyeZPepUUeQta/bAFIUROeozgTstvib86ywYrflrjkBTEyPar
Q+ptcoEgw1FVZZ4cQCGQWkg16dtFPSBMbGAbjlh/9DJXbZU/DVF5XRpTj8FthWm7
Vw/QQx4A18O+oS5/CkBjdltAb4RZIcF2ijAupRbwPIG+pzNVJVJ7JN7BzdVAFAyF
cf/Btkw=
=aFvT
-----END PGP SIGNATURE-----
Merge tag 'sunxi-fixes-for-4.13' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into fixes
Pull "Allwinner fixes for 4.13" from Chen-Yu Tsai:
Two fixes to correct the EMAC blocks memory region size to match the
datasheet. One that converts raw A83T clock indices to macros from the
clk dt-binding header, completing the A83T sunxi-ng clk driver.
* tag 'sunxi-fixes-for-4.13' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux:
ARM: dts: sun8i: a83t: Switch to CCU device tree binding macros
arm64: allwinner: sun50i-a64: Correct emac register size
ARM: dts: sunxi: h3/h5: Correct emac register size
Fix deadlock in regulator quirk for R-Car Gen 2 SoCs
The da9063/da9210 regulator quirk for R-Car Gen2 boards uses a bus
notifier, and unregisters the notifier when it is no longer needed.
However, a notifier must not be unregistered from within the call chain.
This bug went unnoticed, as blocking_notifier_chain_unregister() didn't
take the semaphore during early boot. This is no longer the case as of
upstream commit 1c3c5eab17 ("sched/core: Enable might_sleep() and
smp_processor_id() checks early") and a deadlock occurs.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJZfw75AAoJENfPZGlqN0++i24P/j38Z5LKt529CBT2uPgdiIr7
lWEzU3Dgij/vPBUevz1s+hPa2i4jH1sfeKjSd2Ild3Ac+qAy6AwnzWlVYBCaBhlJ
qFAsr3nZ6ybftdA4rIJb1pWF+ugG7uNcQ3S0J5vLeiRU5HC/ZR+Ar+hjeuDwwIGa
889tI5UpKDFxYzkLIDVyiW0gX7WFhmxML0jptuV6c7bSl4Td/RxaP25Hcq8MsApv
oVre7mWPBKrIrjH26lPIUmdN4SGIHgUS3k2j9fFoWit9p2gYX6WDVbHjgALzPL1r
2Su8mSuP6ijutrrOuoB3OaNRlTaSr1Jpzfb6m6xRY+y1Anks0FxRVd712HrxinZK
4xW0ebUpfxpFsg8lJGoe1d1FGOSuj6A3/5GCrA+fQS0LSKthft+ovHmKBAogbSm3
7uCCU/h05LIkW0kDZYNjUqPWi+FwHZ4nEZEJiXMBt5DOyWpF5oIjCFeHmyp8VruI
xHZnq0pcck0Y+OSln2tiV43kw4QzK3BjTZZRc/IkVf19m24y+WJu9KfXTQhCDeLt
g4/zjAZrZddVQPtUz0f1YC07gPTmxLQpQJqLZBNeaed2W8p+tmUWUuoILpjPjEqR
Uu0EQE+t205YaxEz/ctYATdrpNOM5TWRhD+U25VTfhgxar3p9aC4GNVskL/fMGUy
/wTWYp1KVlxYjaxi3aOP
=N5ig
-----END PGP SIGNATURE-----
Merge tag 'renesas-fixes3-for-v4.13' of https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into fixes
Pull "Third Round of Renesas ARM Based SoC Fixes for v4.13" from Simon Horman:
Fix deadlock in regulator quirk for R-Car Gen 2 SoCs
The da9063/da9210 regulator quirk for R-Car Gen2 boards uses a bus
notifier, and unregisters the notifier when it is no longer needed.
However, a notifier must not be unregistered from within the call chain.
This bug went unnoticed, as blocking_notifier_chain_unregister() didn't
take the semaphore during early boot. This is no longer the case as of
upstream commit 1c3c5eab17 ("sched/core: Enable might_sleep() and
smp_processor_id() checks early") and a deadlock occurs.
* tag 'renesas-fixes3-for-v4.13' of https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
ARM: shmobile: rcar-gen2: Fix deadlock in regulator quirk
All the fixes are for ARM64 mvebu:
- Fix the RTC interrupt on A7K/A8K which was missed when switching
from GIC to ICU
- Mark the A7K/A8K crypto engine as dma coherent
- Fix the number of GPIO on south bridge on Armada 3700
-----BEGIN PGP SIGNATURE-----
iIEEABECAEEWIQQYqXDMF3cvSLY+g9cLBhiOFHI71QUCWYHbmCMcZ3JlZ29yeS5j
bGVtZW50QGZyZWUtZWxlY3Ryb25zLmNvbQAKCRALBhiOFHI71ZhfAJ9Z9aMBG1xd
vbRYBDdqPdntruTudwCcDtM9jS2nL+207JfeMD0WMs7sDdU=
=XwFM
-----END PGP SIGNATURE-----
Merge tag 'mvebu-fixes-4.13-2' of git://git.infradead.org/linux-mvebu into fixes
Pull "mvebu fixes for 4.13 (part 2)" from Gregory CLEMENT:
All the fixes are for ARM64 mvebu:
- Fix the RTC interrupt on A7K/A8K which was missed when switching
from GIC to ICU
- Mark the A7K/A8K crypto engine as dma coherent
- Fix the number of GPIO on south bridge on Armada 3700
* tag 'mvebu-fixes-4.13-2' of git://git.infradead.org/linux-mvebu:
ARM64: dts: marvell: armada-37xx: Fix the number of GPIO on south bridge
arm64: dts: marvell: mark the cp110 crypto engine as dma coherent
arm64: dts: marvell: use ICU for the CP110 slave RTC
- 2 minor DT fixes
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJZf0PMAAoJEFk3GJrT+8ZlJskP/0JXcJt1I4myJv9A9z/+t0nh
7vT8H/ZwSyvK+/Iq1ccrf0Abe/c3445yHCwnF0T87R14A+oVVML5+uDolHIRZjlf
Wka34OoZFR9qTjphsFfQdZ0YcId7+a3BXNyvdkIWytWB+tzy/gpvlExBd54wJy0c
rceTEJsdmyObBFTmMTdlTzKQYLvurWoZ7d/N+T81UD3g0Xmq0EIOk0pEHSThK4SS
yQQWfRu7PCWDfXVu7qImalniaSa6wriHFC8hnrmaLfm2a9axfrTpvVknYuZUROlj
3T0Hgq3YHYe+BpS1P1Wryh60Yc8umbGy5JYg4GFOlNEGT9lBflk7nyG25aAbtwXY
BUU/r1cA1K+wR3XbUK9dpxW0GeCDTsQkHY9SA6+Aj5utNcMVhlnmGAHxJX9j1a8n
OhZtk339oCuBaf2Eg67OLvNGCPN21LAr2PL22LEGWkKSEUI61/1ZP1jjuCjN8Yk0
spXdFshdkYJyMqA4VuR05R5sBgqO25hXVDB2RYE4OuKINvAOfckgY9XpUhndxuYx
VVfX1up4bwoAUzVR+7XIsVrHaDR8pNS/Uv6kbhfB3X2c/jyPD1/Rc3T8p5DYirt7
DUPv8MJm/cwEC6uinQyqHTgB0tYLTWEY2+sv8NI2u8geYoJZO+Kt/12f3gcx296e
82LIlHnYKfYqUaqFwRRJ
=Z5d3
-----END PGP SIGNATURE-----
Merge tag 'amlogic-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-amlogic into fixes
Pull "Amlogic fixes for v4.13-rc" from Kevin Hilman:
- 2 minor DT fixes
* tag 'amlogic-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-amlogic:
ARM64: dts: meson-gxl-s905x-libretech-cc: fixup board definition
ARM64: dts: meson-gx: use specific compatible for the AO pwms
showed a wrong value, so fix it before it gets copy-pasted
to much.
-----BEGIN PGP SIGNATURE-----
iQFEBAABCAAuFiEE7v+35S2Q1vLNA3Lx86Z5yZzRHYEFAlmEM5wQHGhlaWtvQHNu
dGVjaC5kZQAKCRDzpnnJnNEdgVn+B/9Xy9km3t8cFVQ//SjNm1MEfiyVrgCnH6TH
trLit6AKq19LTIfeyI2RQucEXB0Csw2+YNh4q9Td6utnc0S3OpzueFsfKfAD9xif
WpRR6MJjCSlIQfhd09XuKyWoDwwNmnidv+Xd/H3M83dTOzoPmy0aDkn/shc1I+Tw
gL0CbLPz1HuHuG55S5EkxcBNi/Mj34qI5e3O8hrqzpfzmc1CdQ38zblfkiyu94v7
f/sVUv8p2YHyfvIJ1n1DxQx7lvAxGzo3Q7fXqS9D1jJx4iDWWFaraLMPu3NztLcv
1XrawJTdGZ9vTJSmzcEwH7JeAUHX41wIWjf/jEx9iE2R9EY0lyLo
=B0nU
-----END PGP SIGNATURE-----
Merge tag 'v4.13-rockchip-dts32fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into fixes
Pull "Rockchip dts32 fixes for 4.13" from Heiko Stübner:
Fix for the recently added mali dt support. The example
showed a wrong value, so fix it before it gets copy-pasted
to much.
* tag 'v4.13-rockchip-dts32fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip:
ARM: dts: rockchip: fix mali gpu node on rk3288
dt-bindings: gpu: drop wrong compatible from midgard binding example
PAE40 confiuration in hardware extends some of the address registers
for TLB/cache ops to 2 words.
So far kernel was NOT setting the higher word if feature was not enabled
in software which is wrong. Those need to be set to 0 in such case.
Normally this would be done in the cache flush / tlb ops, however since
these registers only exist conditionally, this would have to be
conditional to a flag being set on boot which is expensive/ugly -
specially for the more common case of PAE exists but not in use.
Optimize that by zero'ing them once at boot - nobody will write to
them afterwards
Cc: stable@vger.kernel.org #4.4+
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
It is necessary to explicitly set both SLC_AUX_RGN_START1 and SLC_AUX_RGN_END1
which hold MSB bits of the physical address correspondingly of region start
and end otherwise SLC region operation is executed in unpredictable manner
Without this patch, SLC flushes on HSDK (IOC disabled) were taking
seconds.
Cc: stable@vger.kernel.org #4.4+
Reported-by: Vladimir Kondratiev <vladimir.kondratiev@intel.com>
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
[vgupta: PAR40 regs only written if PAE40 exist]
c70c473396 "ARCv2: SLC: Make sure busy bit is set properly on SLC flushing"
fixes problem for entire SLC operation where the problem was initially
caught. But given a nature of the issue it is perfectly possible for
busy bit to be read incorrectly even when region operation was started.
So extending initial fix for regional operation as well.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Cc: stable@vger.kernel.org #4.10
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Essentially remove CONFIG_ARC_PLAT_SIM
There is no need for any platform specific code, just the board DTS
match strings which we can include unconditionally
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Enable 64bit adressing, where it needed, to make possible
enabling PAE40 on axs103.
This patch doesn't affect on any functionality.
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev-HKixBCOQz3hWk0Htik3J/w@public.gmane.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
If the decrementer wraps again and de-asserts the decrementer
exception while hard-disabled, __check_irq_replay() has a test to
notice the wrap when interrupts are re-enabled.
The decrementer check must be done when clearing the PACA_IRQ_HARD_DIS
flag, not when the PACA_IRQ_DEC flag is tested. Previously this worked
because the decrementer interrupt was always the first one checked
after clearing the hard disable flag, but HMI check was moved ahead of
that, which introduced this bug.
This can cause a missed decrementer interrupt if we soft-disable
interrupts then take an HMI which is recorded in irq_happened, then
hard-disable interrupts for > 4s to wrap the decrementer.
Fixes: e0e0d6b739 ("powerpc/64: Replay hypervisor maintenance interrupt first")
Cc: stable@vger.kernel.org # v4.9+
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
POWER9 DD2 PMU can stop after a state-loss idle in some conditions.
A solution is to set then clear MMCRA[60] after wake from state-loss
idle. MMCRA[60] is a non-architected bit, see the user manual for
details.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Reviewed-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Acked-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
- Fix the handling of the scaling_cur_freq cpufreq policy attribute
on x86 systems with the MPERF/APERF registers present to make it
behave more as expected after recent changes (Rafael Wysocki).
- Drop a leftover callback from the intel_pstate driver which also
prevents the cpuinfo_cur_freq cpufreq policy attribute from being
incorrectly exposed when intel_pstate works in the active mode
(Rafael Wysocki).
- Add a missing piece describing the cpuinfo_cur_freq policy
attribute to cpufreq documentation (Rafael Wysocki).
- Fix up a recently added part of the Thunderbolt driver to avoid
aborting system suspends if its mailbox commands time out (Rafael
Wysocki).
- Update device runtime PM framework documentation to reflect the
current behavior of the code (Johan Hovold).
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJZg3DiAAoJEILEb/54YlRxvgkP/RV5djXAUeUpoMnzJCnP7jpB
bYvzqIXN2olcZPNP3hNLM7W1yRAWiwL9F1aFF2apJPyMQ8DW7iBqltAxQY+h2Ggc
lLvQcT5NRCFbIlsRadBL3TMZMFmtYqBD9Q8C+LLgQDWmm3mgG8o0kFJdGWkOMHg7
4FaqTuF1np8ARd1jX3E4QGCVaSqIEKN9xMo7jzsWIHXruag3tJb92ryROVcq3uO7
5WspJSbVMKS0E6jbK3RQkJ9T/3U453p8JMDJNQfOUFr1KQNfAikyV3U0ZoQA6Hjk
1tmJE9v7jq75BDj0Dtfskl8EBFA4Sk+Pegyb3/hB8MR/oRzCWRAe9rVW5LweckcC
Q1Q+SDKcVKVIEQa3GZOH1nBfHP/pIMQb0+WGMIq614rr1Ug9IuF4cCn2AnzkIFLE
lpkVch3mCfrCwAKmsvYug5isHdazlS93MY9dOZr4D1lewgFp1piShsBSK3RFBT8F
Cu3lYTzBi/MJ+8T0vZysHVEkCNu3emrXwccjzubW1rSNINGgRF5AV9F2HwGYzDHe
KD3Kw1KH2LZEWZAlR6W711pBJny4INENPH0XE85hZ4Ryi+2SN7zvsA/PM0rGjqBq
TcuFc3jN+DMwWoOJryY8701gE8SMz+8Vam/B191XFUkdZSHFk1WWILAu/OfTZCRo
bLYhORgyemWPluAzzcsx
=j2UJ
-----END PGP SIGNATURE-----
Merge tag 'pm-4.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These fix two cpufreq issues, one introduced recently and one related
to recent changes, fix cpufreq documentation, fix up recently added
code in the Thunderbolt driver and update runtime PM framework
documentation.
Specifics:
- Fix the handling of the scaling_cur_freq cpufreq policy attribute
on x86 systems with the MPERF/APERF registers present to make it
behave more as expected after recent changes (Rafael Wysocki).
- Drop a leftover callback from the intel_pstate driver which also
prevents the cpuinfo_cur_freq cpufreq policy attribute from being
incorrectly exposed when intel_pstate works in the active mode
(Rafael Wysocki).
- Add a missing piece describing the cpuinfo_cur_freq policy
attribute to cpufreq documentation (Rafael Wysocki).
- Fix up a recently added part of the Thunderbolt driver to avoid
aborting system suspends if its mailbox commands time out (Rafael
Wysocki).
- Update device runtime PM framework documentation to reflect the
current behavior of the code (Johan Hovold)"
* tag 'pm-4.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
thunderbolt: icm: Ignore mailbox errors in icm_suspend()
cpufreq: x86: Make scaling_cur_freq behave more as expected
PM / runtime: Document new pm_runtime_set_suspended() constraint
cpufreq: docs: Add missing cpuinfo_cur_freq description
cpufreq: intel_pstate: Drop ->get from intel_pstate structure
- Yet another race with VM destruction plugged
- A set of small vgic fixes
-----BEGIN PGP SIGNATURE-----
iQJJBAABCAAzFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAlmDOZ4VHG1hcmMuenlu
Z2llckBhcm0uY29tAAoJECPQ0LrRPXpDnwQP/ji7d7bbWybmeuWKXm8TYt6CNPBr
JC57OQHytU1+ccagjeUPaAgUAqWSQWOwfbqPuS1afshJu1ZZsN851QNmcr5W6bXQ
EwjICGWcAdKSMLnhRDdf+uXQbFSkkcaxsXnJWjmD0EE8ylaXCCkdV278xhTx0hVO
yiov/xWNzLMa3O3W/l4SIKQ4UNyeQ7f+Od1Vkf7iUpaaFcz6s3RNsPAOwc7Thq7L
eLk4SgXMLDNHz5HwdLoOp3RwQsNe9A7uh05z9VOav0YAyLG5vf29JhMroCO/4P/x
1HgWvpGwzxAUqUcHbW4FSOsbydEltlWMdfSoAF7BtPCLudmmsxfMJJlK3FKLvV+P
MsO2FdXiz6/ZLI9/ds7YJDOfc1cGSVK07Efx+SLXD5FAnkFYq4SElmI4sK8BWc5U
Dugw3j8u9M6jTiVKBioFo7yRT5iHG9O0A+6DtsMQ0iREfLolC7EEzpfGB/vWpKk5
BbdwDUiu6JxXEBJJqyP948iGfuIrikAx2mcf3dmEHVKhEnKi+MN/H5c9BKGAJ3sf
Fm1xF99IG+m3L3zXiukQ11OqsCx32kqLiXaejVttscFg/C33OYmGOhsk066hHCzx
LtoxrHUl7rwF+ssnKHq4Az2mYzZtquEq13O5tgleQcVwH/2+6EgWDf3mDSy1em0h
5IXgqk+PsaQ5n0jL
=TMR9
-----END PGP SIGNATURE-----
Merge tag 'kvm-arm-for-v4.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm
KVM/ARM Fixes for v4.13-rc4
- Yet another race with VM destruction plugged
- A set of small vgic fixes
------------[ cut here ]------------
WARNING: CPU: 5 PID: 2288 at arch/x86/kvm/vmx.c:11124 nested_vmx_vmexit+0xd64/0xd70 [kvm_intel]
CPU: 5 PID: 2288 Comm: qemu-system-x86 Not tainted 4.13.0-rc2+ #7
RIP: 0010:nested_vmx_vmexit+0xd64/0xd70 [kvm_intel]
Call Trace:
vmx_check_nested_events+0x131/0x1f0 [kvm_intel]
? vmx_check_nested_events+0x131/0x1f0 [kvm_intel]
kvm_arch_vcpu_ioctl_run+0x5dd/0x1be0 [kvm]
? vmx_vcpu_load+0x1be/0x220 [kvm_intel]
? kvm_arch_vcpu_load+0x62/0x230 [kvm]
kvm_vcpu_ioctl+0x340/0x700 [kvm]
? kvm_vcpu_ioctl+0x340/0x700 [kvm]
? __fget+0xfc/0x210
do_vfs_ioctl+0xa4/0x6a0
? __fget+0x11d/0x210
SyS_ioctl+0x79/0x90
do_syscall_64+0x8f/0x750
? trace_hardirqs_on_thunk+0x1a/0x1c
entry_SYSCALL64_slow_path+0x25/0x25
This can be reproduced by booting L1 guest w/ 'noapic' grub parameter, which
means that tells the kernel to not make use of any IOAPICs that may be present
in the system.
Actually external_intr variable in nested_vmx_vmexit() is the req_int_win
variable passed from vcpu_enter_guest() which means that the L0's userspace
requests an irq window. I observed the scenario (!kvm_cpu_has_interrupt(vcpu) &&
L0's userspace reqeusts an irq window) is true, so there is no interrupt which
L1 requires to inject to L2, we should not attempt to emualte "Acknowledge
interrupt on exit" for the irq window requirement in this scenario.
This patch fixes it by not attempt to emulate "Acknowledge interrupt on exit"
if there is no L1 requirement to inject an interrupt to L2.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
[Added code comment to make it obvious that the behavior is not correct.
We should do a userspace exit with open interrupt window instead of the
nested VM exit. This patch still improves the behavior, so it was
accepted as a (temporary) workaround.]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
The host physical addresses of L1's Virtual APIC Page and Posted
Interrupt descriptor are loaded into the VMCS02. The CPU may write
to these pages via their host physical address while L2 is running,
bypassing address-translation-based dirty tracking (e.g. EPT write
protection). Mark them dirty on every exit from L2 to prevent them
from getting out of sync with dirty tracking.
Also mark the virtual APIC page and the posted interrupt descriptor
dirty when KVM is virtualizing posted interrupt processing.
Signed-off-by: David Matlack <dmatlack@google.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
According to the Intel SDM, software cannot rely on the current VMCS to be
coherent after a VMXOFF or shutdown. So this is a valid way to handle VMCS12
flushes.
24.11.1 Software Use of Virtual-Machine Control Structures
...
If a logical processor leaves VMX operation, any VMCSs active on
that logical processor may be corrupted (see below). To prevent
such corruption of a VMCS that may be used either after a return
to VMX operation or on another logical processor, software should
execute VMCLEAR for that VMCS before executing the VMXOFF instruction
or removing power from the processor (e.g., as part of a transition
to the S3 and S4 power states).
...
This fixes a "suspicious rcu_dereference_check() usage!" warning during
kvm_vm_release() because nested_release_vmcs12() calls
kvm_vcpu_write_guest_page() without holding kvm->srcu.
Signed-off-by: David Matlack <dmatlack@google.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Since the current implementation of VMCS12 does a memcpy in and out
of guest memory, we do not need current_vmcs12 and current_vmcs12_page
anymore. current_vmptr is enough to read and write the VMCS12.
And David Matlack noted:
This patch also fixes dirty tracking (memslot->dirty_bitmap) of the
VMCS12 page by using kvm_write_guest. nested_release_page() only marks
the struct page dirty.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
[Added David Matlack's note and nested_release_page_clean() fix.]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
'lapic_irq' is a local variable and its 'level' field isn't
initialized, so 'level' is random, it doesn't matter but
makes UBSAN unhappy:
UBSAN: Undefined behaviour in .../lapic.c:...
load of value 10 is not a valid value for type '_Bool'
...
Call Trace:
[<ffffffff81f030b6>] dump_stack+0x1e/0x20
[<ffffffff81f03173>] ubsan_epilogue+0x12/0x55
[<ffffffff81f03b96>] __ubsan_handle_load_invalid_value+0x118/0x162
[<ffffffffa1575173>] kvm_apic_set_irq+0xc3/0xf0 [kvm]
[<ffffffffa1575b20>] kvm_irq_delivery_to_apic_fast+0x450/0x910 [kvm]
[<ffffffffa15858ea>] kvm_irq_delivery_to_apic+0xfa/0x7a0 [kvm]
[<ffffffffa1517f4e>] kvm_emulate_hypercall+0x62e/0x760 [kvm]
[<ffffffffa113141a>] handle_vmcall+0x1a/0x30 [kvm_intel]
[<ffffffffa114e592>] vmx_handle_exit+0x7a2/0x1fa0 [kvm_intel]
...
Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
When SMP VM start, AP may lost INIT because of receiving INIT between
kvm_vcpu_ioctl_x86_get/set_vcpu_events.
vcpu 0 vcpu 1
kvm_vcpu_ioctl_x86_get_vcpu_events
events->smi.latched_init = 0
send INIT to vcpu1
set vcpu1's pending_events
kvm_vcpu_ioctl_x86_set_vcpu_events
if (events->smi.latched_init == 0)
clear INIT in pending_events
This patch fixes it by just update SMM related flags if we are in SMM.
Thanks Peng Hao for the report and original commit message.
Reported-by: Peng Hao <peng.hao2@zte.com.cn>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
The number of pins in South Bridge is 30 and not 29. There is a fix for
the driver for the pinctrl, but a fix is also need at device tree level
for the GPIO.
Fixes: afda007fed ("ARM64: dts: marvell: Add pinctrl nodes for Armada
3700")
Cc: <stable@vger.kernel.org>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
of_irq_to_resource() has recently been fixed to return negative error #'s
along with 0 in case of failure, however the Freescale MPC832x RDB board
code still only regards 0 as a failure indication -- fix it up.
Fixes: 7a4228bbff ("of: irq: use of_irq_get() in of_irq_to_resource()")
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Acked-by: Scott Wood <oss@buserror.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
WARNING: CPU: 5 PID: 1242 at kernel/rcu/tree_plugin.h:323 rcu_note_context_switch+0x207/0x6b0
CPU: 5 PID: 1242 Comm: unity-settings- Not tainted 4.13.0-rc2+ #1
RIP: 0010:rcu_note_context_switch+0x207/0x6b0
Call Trace:
__schedule+0xda/0xba0
? kvm_async_pf_task_wait+0x1b2/0x270
schedule+0x40/0x90
kvm_async_pf_task_wait+0x1cc/0x270
? prepare_to_swait+0x22/0x70
do_async_page_fault+0x77/0xb0
? do_async_page_fault+0x77/0xb0
async_page_fault+0x28/0x30
RIP: 0010:__d_lookup_rcu+0x90/0x1e0
I encounter this when trying to stress the async page fault in L1 guest w/
L2 guests running.
Commit 9b132fbe54 (Add rcu user eqs exception hooks for async page
fault) adds rcu_irq_enter/exit() to kvm_async_pf_task_wait() to exit cpu
idle eqs when needed, to protect the code that needs use rcu. However,
we need to call the pair even if the function calls schedule(), as seen
from the above backtrace.
This patch fixes it by informing the RCU subsystem exit/enter the irq
towards/away from idle for both n.halted and !n.halted.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
There are three issues in nested_vmx_check_exception:
1) it is not taking PFEC_MATCH/PFEC_MASK into account, as reported
by Wanpeng Li;
2) it should rebuild the interruption info and exit qualification fields
from scratch, as reported by Jim Mattson, because the values from the
L2->L0 vmexit may be invalid (e.g. if an emulated instruction causes
a page fault, the EPT misconfig's exit qualification is incorrect).
3) CR2 and DR6 should not be written for exception intercept vmexits
(CR2 only for AMD).
This patch fixes the first two and adds a comment about the last,
outlining the fix.
Cc: Jim Mattson <jmattson@google.com>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Do this in the caller of nested_vmx_vmexit instead.
nested_vmx_check_exception was doing a vmwrite to the vmcs02's
VM_EXIT_INTR_ERROR_CODE field, so that prepare_vmcs12 would move
the field to vmcs12->vm_exit_intr_error_code. However that isn't
possible on pre-Haswell machines. Moving the vmcs12 write to the
callers fixes it.
Reported-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[Changed nested_vmx_reflect_vmexit() return type to (int)1 from (bool)1,
thanks to fengguang.wu@intel.com]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Functions clear_user_highpage, copy_user_highpage, flush_dcache_page,
local_flush_cache_range and local_flush_cache_page may be used from
modules. Export them.
Cc: stable@vger.kernel.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
csum_partial and csum_partial_copy_generic are defined unconditionally
and are available even when CONFIG_NET is disabled. They are used not
only by the network drivers, but also by scsi and media.
Don't limit these functions export by CONFIG_NET.
Cc: stable@vger.kernel.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
In an ideal world, CNTFRQ_EL0 always contains the timer frequency
for the kernel to use. Sadly, we get quite a few broken systems
where the firmware authors cannot be bothered to program that
register on all CPUs, and rely on DT to provide that frequency.
So when trapping CNTFRQ_EL0, make sure to return the actual rate
(as known by the kernel), and not CNTFRQ_EL0.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The HPET resume path abuses irq_domain_[de]activate_irq() to restore the
MSI message in the HPET chip for the boot CPU on resume and it relies on an
implementation detail of the interrupt core code, which magically makes the
HPET unmask call invoked via a irq_disable/enable pair. This worked as long
as the irq code did unconditionally invoke the unmask() callback. With the
recent changes which keep track of the masked state to avoid expensive
hardware access, this does not longer work. As a consequence the HPET timer
interrupts are not unmasked which breaks resume as the boot CPU waits
forever that a timer interrupt arrives.
Make the restore of the MSI message explicit and invoke the unmask()
function directly. While at it get rid of the pointless affinity setting as
nothing can change the affinity of the interrupt and the vector across
suspend/resume. The restore of the MSI message reestablishes the previous
affinity setting which is the correct one.
Fixes: bf22ff45be ("genirq: Avoid unnecessary low level irq function calls")
Reported-and-tested-by: Tomi Sarvela <tomi.p.sarvela@intel.com>
Reported-by: Martin Peres <martin.peres@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: jeffy.chen@rock-chips.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1707312158590.2287@nanos
While working on enabling queued rwlock on SPARC, found this following
code in include/asm-generic/qrwlock.h which uses CONFIG_CPU_BIG_ENDIAN
to clear a byte.
static inline u8 *__qrwlock_write_byte(struct qrwlock *lock)
{
return (u8 *)lock + 3 * IS_BUILTIN(CONFIG_CPU_BIG_ENDIAN);
}
Problem is many of the fixed big endian architectures don't define
CPU_BIG_ENDIAN and clears the wrong byte.
Define CPU_BIG_ENDIAN for parisc architecture to fix it.
Signed-off-by: Babu Moger <babu.moger@oracle.com>
Signed-off-by: Helge Deller <deller@gmx.de>
The watchdog soft-NMI exception stack setup loads a stack pointer
twice, which is an obvious error. It ends up using the system reset
interrupt (true-NMI) stack, which is also a bug because the watchdog
could be preempted by a system reset interrupt that overwrites the
NMI stack.
Change the soft-NMI to use the "emergency stack". The current kernel
stack is not used, because of the longer-term goal to prevent
asynchronous stack access using soft-disable.
Fixes: 2104180a53 ("powerpc/64s: implement arch-specific hardlockup watchdog")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>