Fengguang Wu, Oleg Nesterov and Peter Zijlstra tracked down
a kernel crash to a GCC bug: GCC miscompiles certain 'asm goto'
constructs, as outlined here:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58670
Implement a workaround suggested by Jakub Jelinek.
Reported-and-tested-by: Fengguang Wu <fengguang.wu@intel.com>
Reported-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Suggested-by: Jakub Jelinek <jakub@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
iQEcBAABAgAGBQJSUc9zAAoJEHm+PkMAQRiG9DMH/AtpuAF6LlMRPjrCeuJQ1pyh
T0IUO+CsLKO6qtM5IyweP8V6zaasNjIuW1+B6IwVIl8aOrM+M7CwRiKvpey26ldM
I8G2ron7hqSOSQqSQs20jN2yGAqQGpYIbTmpdGLAjQ350NNNvEKthbP5SZR5PAmE
UuIx5OGEkaOyZXvCZJXU9AZkCxbihlMSt2zFVxybq2pwnGezRUYgCigE81aeyE0I
QLwzzMVdkCxtZEpkdJMpLILAz22jN4RoVDbXRa2XC7dA9I2PEEXI9CcLzqCsx2Ii
8eYS+no2K5N2rrpER7JFUB2B/2X8FaVDE+aJBCkfbtwaYTV9UYLq3a/sKVpo1Cs=
=xSFJ
-----END PGP SIGNATURE-----
Merge tag 'v3.12-rc4' into sched/core
Merge Linux v3.12-rc4 to fix a conflict and also to refresh the tree
before applying more scheduler patches.
Conflicts:
arch/avr32/include/asm/Kbuild
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Conflicts:
drivers/net/ethernet/emulex/benet/be.h
drivers/net/usb/qmi_wwan.c
drivers/net/wireless/brcm80211/brcmfmac/dhd_bus.h
include/net/netfilter/nf_conntrack_synproxy.h
include/net/secure_seq.h
The conflicts are of two varieties:
1) Conflicts with Joe Perches's 'extern' removal from header file
function declarations. Usually it's an argument signature change
or a function being added/removed. The resolutions are trivial.
2) Some overlapping changes in qmi_wwan.c and be.h, one commit adds
a new value, another changes an existing value. That sort of
thing.
Signed-off-by: David S. Miller <davem@davemloft.net>
As mentioned in commit afe4fd0624 ("pkt_sched: fq: Fair Queue packet
scheduler"), this patch adds a new socket option.
SO_MAX_PACING_RATE offers the application the ability to cap the
rate computed by transport layer. Value is in bytes per second.
u32 val = 1000000;
setsockopt(sockfd, SOL_SOCKET, SO_MAX_PACING_RATE, &val, sizeof(val));
To be effectively paced, a flow must use FQ packet scheduler.
Note that a packet scheduler takes into account the headers for its
computations. The effective payload rate depends on MSS and retransmits
if any.
I chose to make this pacing rate a SOL_SOCKET option instead of a
TCP one because this can be used by other protocols.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Enable ARCH_USE_CMPXCHG_LOCKREF since it shows performance improvements
with Linus' simple stat() test case of up to 50% on a 30 cpu system.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Linus suggested to replace
#ifndef CONFIG_HAVE_ARCH_MUTEX_CPU_RELAX
#define arch_mutex_cpu_relax() cpu_relax()
#endif
with just a simple
#ifndef arch_mutex_cpu_relax
# define arch_mutex_cpu_relax() cpu_relax()
#endif
to get rid of CONFIG_HAVE_CPU_RELAX_SIMPLE. So architectures can
simply define arch_mutex_cpu_relax if they want an architecture
specific function instead of having to add a select statement in
their Kconfig in addition.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
In order to prepare to per-arch implementations of preempt_count move
the required bits into an asm-generic header and use this for all
archs.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-h5j0c1r3e3fk015m30h8f1zx@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The need for SIE_INTERCEPT_RERUNVCPU has been removed long ago already,
with the following commit:
f7850c9288
[S390] remove kvm mmu reload on s390
Since the remainders are dead code, they are now removed by this patch.
Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Modify the s390 copy_oldmem_page() and remap_oldmem_pfn_range() function
for zfcpdump to read from the HSA memory if memory below HSA_SIZE bytes is
requested. Otherwise real memory is used.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Cc: Jan Willeke <willeke@de.ibm.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
With the general-instruction extension facility (z10) a couple of
instructions with a pc-relative long displacement were introduced. The
kprobes support for these instructions however was never implemented.
In result, if anybody ever put a probe on any of these instructions the
result would have been random behaviour after the instruction got executed
within the insn slot.
So lets add the missing handling for these instructions. Since all of the
new instructions have 32 bit signed displacement the easiest solution is
to allocate an insn slot that is within the same 2GB area like the
original instruction and patch the displacement field.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull more s390 updates from Heiko Carstens:
"This includes one bpf/jit bug fix where the jit compiler could
sometimes write generated code out of bounds of the allocated memory
area.
The rest of the patches are only cleanups and minor improvements"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/irq: reduce size of external interrupt handler hash array
s390/compat,uid16: use current_cred()
s390/ap_bus: use and-mask instead of a cast
s390/ftrace: avoid pointer arithmetics with function pointers
s390: make various functions static, add declarations to header files
s390/compat signal: add couple of __force annotations
s390/mm: add __releases()/__acquires() annotations to gmap_alloc_table()
s390: keep Kconfig sorted
s390/irq: rework irq subclass handling
s390/irq: use hlists for external interrupt handler array
s390/dumpstack: convert print_symbol to %pSR
s390/perf: Remove print_hex_dump_bytes() debug output
s390: update defconfig
s390/bpf,jit: fix address randomization
This branch contains mostly additions and changes to platform enablement
and SoC-level drivers. Since there's sometimes a dependency on device-tree
changes, there's also a fair amount of those in this branch.
Pieces worth mentioning are:
- Mbus driver for Marvell platforms, allowing kernel configuration
and resource allocation of on-chip peripherals.
- Enablement of the mbus infrastructure from Marvell PCI-e drivers.
- Preparation of MSI support for Marvell platforms.
- Addition of new PCI-e host controller driver for Tegra platforms
- Some churn caused by sharing of macro names between i.MX 6Q and 6DL
platforms in the device tree sources and header files.
- Various suspend/PM updates for Tegra, including LP1 support.
- Versatile Express support for MCPM, part of big little support.
- Allwinner platform support for A20 and A31 SoCs (dual and quad Cortex-A7)
- OMAP2+ support for DRA7, a new Cortex-A15-based SoC.
The code that touches other architectures are patches moving
MSI arch-specific functions over to weak symbols and removal of
ARCH_SUPPORTS_MSI, acked by PCI maintainers.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJSKhYmAAoJEIwa5zzehBx322AP/1ONYs8o8f7/Gzq6lZvTN6T3
0pBTApg6Jfioi3lwKvUAEIcsW82YKQ+UZkbW66GQH6+Ri4aZJKZHuz0+JPU67OJ4
LtSLuzVWrymy2VOOUvAnS/SXkOZw/pHhU4cLNHn1dMndhUL1Uqp9/XwuiHEQyFsP
uOkpcBtIu0EWElov0PKKZ5SWBg8JJs2vy5ydiViGelWHCrZvDDZkWzIsDcBQxJLQ
juzT4+JE+KOu7vKmfw78o6iHoCS2TBRAN9YUCajRb8Wl+out1hrTahHnDWaZ5Mce
EskcQNkJROqFbjD4k3ABN4XGTv2VDmrztIwFe0SEQ7Dz/9ypCrBGT69uI9xIqTXr
GwVRIwAUFTpMupK0gy93z1ajV3N0CXV79out9+jQNUQybYE+czp8QOyhmuc1tZx0
8fn9jlBQe9Vy6yrs39gEcE7nUwrayeyQ+6UvqqwsE2pWZabNAnCMSPX5+QIu+T/3
tQ7+jYmfFeserp1sIDOHOnxfhtW9EI6U9d1h/DUCwrsuFdkL9ha4M/vh9Pwgye98
tBdz0T4yE39AJQwwFWRkv1jcQKcGu6WqJanmvS4KRBksGwuLWxy+ewOnkz2ifS25
ZYSyxAryZRBvQRqlOK11rXPfRcbGcY0MG9lkKX96rGcyWEizgE1DdjxXD8HoIleN
R8heV6GX5OzlFLGX2tKK
=fJ5x
-----END PGP SIGNATURE-----
Merge tag 'soc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC platform changes from Olof Johansson:
"This branch contains mostly additions and changes to platform
enablement and SoC-level drivers. Since there's sometimes a
dependency on device-tree changes, there's also a fair amount of
those in this branch.
Pieces worth mentioning are:
- Mbus driver for Marvell platforms, allowing kernel configuration
and resource allocation of on-chip peripherals.
- Enablement of the mbus infrastructure from Marvell PCI-e drivers.
- Preparation of MSI support for Marvell platforms.
- Addition of new PCI-e host controller driver for Tegra platforms
- Some churn caused by sharing of macro names between i.MX 6Q and 6DL
platforms in the device tree sources and header files.
- Various suspend/PM updates for Tegra, including LP1 support.
- Versatile Express support for MCPM, part of big little support.
- Allwinner platform support for A20 and A31 SoCs (dual and quad
Cortex-A7)
- OMAP2+ support for DRA7, a new Cortex-A15-based SoC.
The code that touches other architectures are patches moving MSI
arch-specific functions over to weak symbols and removal of
ARCH_SUPPORTS_MSI, acked by PCI maintainers"
* tag 'soc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (266 commits)
tegra-cpuidle: provide stub when !CONFIG_CPU_IDLE
PCI: tegra: replace devm_request_and_ioremap by devm_ioremap_resource
ARM: tegra: Drop ARCH_SUPPORTS_MSI and sort list
ARM: dts: vf610-twr: enable i2c0 device
ARM: dts: i.MX51: Add one more I2C2 pinmux entry
ARM: dts: i.MX51: Move pins configuration under "iomuxc" label
ARM: dtsi: imx6qdl-sabresd: Add USB OTG vbus pin to pinctrl_hog
ARM: dtsi: imx6qdl-sabresd: Add USB host 1 VBUS regulator
ARM: dts: imx27-phytec-phycore-som: Enable AUDMUX
ARM: dts: i.MX27: Disable AUDMUX in the template
ARM: dts: wandboard: Add support for SDIO bcm4329
ARM: i.MX5 clocks: Remove optional clock setup (CKIH1) from i.MX51 template
ARM: dts: imx53-qsb: Make USBH1 functional
ARM i.MX6Q: dts: Enable I2C1 with EEPROM and PMIC on Phytec phyFLEX-i.MX6 Ouad module
ARM i.MX6Q: dts: Enable SPI NOR flash on Phytec phyFLEX-i.MX6 Ouad module
ARM: dts: imx6qdl-sabresd: Add touchscreen support
ARM: imx: add ocram clock for imx53
ARM: dts: imx: ocram size is different between imx6q and imx6dl
ARM: dts: imx27-phytec-phycore-som: Fix regulator settings
ARM: dts: i.MX27: Remove clock name from CPU node
...
Pull KVM updates from Gleb Natapov:
"The highlights of the release are nested EPT and pv-ticketlocks
support (hypervisor part, guest part, which is most of the code, goes
through tip tree). Apart of that there are many fixes for all arches"
Fix up semantic conflicts as discussed in the pull request thread..
* 'next' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (88 commits)
ARM: KVM: Add newlines to panic strings
ARM: KVM: Work around older compiler bug
ARM: KVM: Simplify tracepoint text
ARM: KVM: Fix kvm_set_pte assignment
ARM: KVM: vgic: Bump VGIC_NR_IRQS to 256
ARM: KVM: Bugfix: vgic_bytemap_get_reg per cpu regs
ARM: KVM: vgic: fix GICD_ICFGRn access
ARM: KVM: vgic: simplify vgic_get_target_reg
KVM: MMU: remove unused parameter
KVM: PPC: Book3S PR: Rework kvmppc_mmu_book3s_64_xlate()
KVM: PPC: Book3S PR: Make instruction fetch fallback work for system calls
KVM: PPC: Book3S PR: Don't corrupt guest state when kernel uses VMX
KVM: x86: update masterclock when kvmclock_offset is calculated (v2)
KVM: PPC: Book3S: Fix compile error in XICS emulation
KVM: PPC: Book3S PR: return appropriate error when allocation fails
arch: powerpc: kvm: add signed type cast for comparation
KVM: x86: add comments where MMIO does not return to the emulator
KVM: vmx: count exits to userspace during invalid guest emulation
KVM: rename __kvm_io_bus_sort_cmp to kvm_io_bus_cmp
kvm: optimize away THP checks in kvm_is_mmio_pfn()
...
Pull timers/nohz changes from Ingo Molnar:
"It mostly contains fixes and full dynticks off-case optimizations, by
Frederic Weisbecker"
* 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
nohz: Include local CPU in full dynticks global kick
nohz: Optimize full dynticks's sched hooks with static keys
nohz: Optimize full dynticks state checks with static keys
nohz: Rename a few state variables
vtime: Always debug check snapshot source _before_ updating it
vtime: Always scale generic vtime accounting results
vtime: Optimize full dynticks accounting off case with static keys
vtime: Describe overriden functions in dedicated arch headers
m68k: hardirq_count() only need preempt_mask.h
hardirq: Split preempt count mask definitions
context_tracking: Split low level state headers
vtime: Fix racy cputime delta update
vtime: Remove a few unneeded generic vtime state checks
context_tracking: User/kernel broundary cross trace events
context_tracking: Optimize context switch off case with static keys
context_tracking: Optimize guest APIs off case with static key
context_tracking: Optimize main APIs off case with static key
context_tracking: Ground setup for static key use
context_tracking: Remove full dynticks' hacky dependency on wide context tracking
nohz: Only enable context tracking on full dynticks CPUs
...
Let's not add a function for every external interrupt subclass for
which we need reference counting. Just have two register/unregister
functions which have a subclass parameter:
void irq_subclass_register(enum irq_subclass subclass);
void irq_subclass_unregister(enum irq_subclass subclass);
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Function handles may change while the system was in hibernation
use list pci functions and update the function handles.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
List pci functions is used to query and iterate over pci functions.
This function currently has 2 users - initial device discovery and
rescan after a machine check. Instead of having a multipurpose
function pass a callback which gets called for each pci function.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Some functions that do arch specific resume actions are called
directly from swsusp_asm64.S . Before we add another function call
provide a generic s390_early_resume function which can be used
for this purpose.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Convert s390' pci hotplug to be builtin only, with no module option.
Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The last remaining use for the storage key of the s390 architecture
is reference counting. The alternative is to make page table entries
invalid while they are old. On access the fault handler marks the
pte/pmd as young which makes the pte/pmd valid if the access rights
allow read access. The pte/pmd invalidations required for software
managed reference bits cost a bit of performance, on the other hand
the RRBE/RRBM instructions to read and reset the referenced bits are
quite expensive as well.
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
For a single-threaded KVM guest ptep_modify_prot_start will not use
IPTE, the invalid bit will therefore not be set. If DEBUG_VM is set
pgste_set_key called by ptep_modify_prot_commit will complain about
the missing invalid bit. ptep_modify_prot_start should set the
invalid bit in all cases.
Cc: stable@vger.kernel.org # 3.10+
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Fix broken contraints for both save_access_regs() and restore_access_regs().
The constraints are incorrect since they tell the compiler that the inline
assemblies only access the first element of an array of 16 elements.
Therefore the compiler could generate incorrect code.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Fix inline assembly contraints for non atomic bitops functions.
This is broken since 2.6.34 987bcdac "[S390] use inline assembly
contraints available with gcc 3.3.3".
Reported-by: Andreas Krebbel <krebbel@linux.vnet.ibm.com>
Reported-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Isolate the logic of IDTE vs. IPTE flushing of ptes in two functions,
ptep_flush_lazy and __tlb_flush_mm_lazy.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
ptep_modify_prot_start uses the pgste_set helper to store the pgste,
while ptep_modify_prot_commit uses its own pointer magic to retrieve
the value again. Add the pgste_get help function to keep things
symmetrical and improve readability.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
On process exit there is no more need for the pgste information.
Skip the expensive storage key operations which should speed up
termination of KVM processes.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Improve the encoding of the different pte types and the naming of the
page, segment table and region table bits. Due to the different pte
encoding the hugetlbfs primitives need to be adapted as well. To improve
compatability with common code make the huge ptes use the encoding of
normal ptes. The conversion between the pte and pmd encoding for a huge
pte is done with set_huge_pte_at and huge_ptep_get.
Overall the code is now easier to understand.
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
With the introduction of PCI it became apparent that s390 should
convert to generic hardirqs as too many drivers do not have the
correct dependency for GENERIC_HARDIRQS. On the architecture
level s390 does not have irq lines. It has external interrupts,
I/O interrupts and adapter interrupts. This patch hard-codes all
external interrupts as irq #1, all I/O interrupts as irq #2 and
all adapter interrupts as irq #3. The additional information from
the lowcore associated with the interrupt is stored in the
pt_regs of the interrupt frame, where the interrupt handler can
pick it up. For PCI/MSI interrupts the adapter interrupt handler
scans the relevant bit fields and calls generic_handle_irq with
the virtual irq number for the MSI interrupt.
Reviewed-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Make use of the adapter interrupt helpers in the PCI code. This is
the first step to convert the MSI interrupt code to PCI domains.
The patch removes the limitation of 64 adapter interrupts per
PCI function.
Reviewed-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The PCI code is the first user of adapter interrupts vectors.
Add a set of helpers to airq.c to separate the adatper interrupt
code from the PCI bits. The helpers allow for adapter interrupt
vectors of any size.
Reviewed-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
This branch includes a number of enhancements to core SoC support for
Tegra devices. The major new features are:
* Adds a new CPU-power-gated cpuidle state for Tegra114.
* Adds initial system suspend support for Tegra114, initially supporting
just CPU-power-gating during suspend.
* Adds "LP1" suspend mode support for all of Tegra20/30/114. This mode
both gates CPU power, and places the DRAM into self-refresh mode.
* A new DT-driven PCIe driver to Tegra20/30. The driver is also moved
from arch/arm/mach-tegra/ to drivers/pci/host/.
The PCIe driver work depends on the following tag from Thomas Petazzoni:
git://git.infradead.org/linux-mvebu.git mis-3.12.2
... which is merged into the middle of this pull request.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJSDlwwAAoJEMzrak5tbycxR68QAJZ/Izc9Izj0JH8hmCEvMNfi
ub1DQfWAy3oXk0ttkk+BMvuyD8JTvBr8LSK8GqjZs//rFGlW81A4NHTvCwoKZjKe
hgrRgI2B1wj3Um1sp8le9D0klKrTcfmpXrOxH8ALgz0BIpMge8AGZHkV0SrfQa1z
bKiISFVAw12WJCVrQ2nbzpZGU51lbyJ/+RghttM1a8LuS2P03CZgt2kqiytk3UVK
uiGEy3sCkjXLFO3EsUvM6ha623S6BumCAYjNfgDowTVKaoEe1r2TD4bFeU6lGcXJ
mlVTv0Kywazf4Q2gKzkbDz8UQMArW4hok2iILHzz+sf/Rn0hie5XVqhFlbBlcae8
vyWsHmqvmE9BJAK2G2RLs9cJCTzEpEyAjUWfE3sIIa3ztSguT5+PHndDLR/d76aS
j8L3FYReICZ1NuNw1JSQPFs9g2EWJbNRiy+8o9O2elsJMpLDBj/FcV6TVpudbBTI
z7hvN+XSVYUaCVD4e8ma9YoC3VGseiAZvd+Y8hPd2MFBECVPNpy2bOacieU6Bgxh
zjSBXZ/URxN3rTkv9+F3BLWAOfVmJYN0rKV9YfM/rqpWjc9iQx30m1fRZDnXWhvd
ps8eFIYsKqc6v9AAugl/RexFy4Laav9eREjb0k2LA8ClLhK/qLLuiisVmKWS/grh
lX9tzPEG2nZcjxSYaEjz
=ve9i
-----END PGP SIGNATURE-----
Merge tag 'tegra-for-3.12-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/swarren/linux-tegra into next/soc
From: Stephen Warren:
ARM: tegra: core SoC enhancements for 3.12
This branch includes a number of enhancements to core SoC support for
Tegra devices. The major new features are:
* Adds a new CPU-power-gated cpuidle state for Tegra114.
* Adds initial system suspend support for Tegra114, initially supporting
just CPU-power-gating during suspend.
* Adds "LP1" suspend mode support for all of Tegra20/30/114. This mode
both gates CPU power, and places the DRAM into self-refresh mode.
* A new DT-driven PCIe driver to Tegra20/30. The driver is also moved
from arch/arm/mach-tegra/ to drivers/pci/host/.
The PCIe driver work depends on the following tag from Thomas Petazzoni:
git://git.infradead.org/linux-mvebu.git mis-3.12.2
... which is merged into the middle of this pull request.
* tag 'tegra-for-3.12-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/swarren/linux-tegra: (33 commits)
ARM: tegra: disable LP2 cpuidle state if PCIe is enabled
MAINTAINERS: Add myself as Tegra PCIe maintainer
PCI: tegra: set up PADS_REFCLK_CFG1
PCI: tegra: Add Tegra 30 PCIe support
PCI: tegra: Move PCIe driver to drivers/pci/host
PCI: msi: add default MSI operations for !HAVE_GENERIC_HARDIRQS platforms
ARM: tegra: add LP1 suspend support for Tegra114
ARM: tegra: add LP1 suspend support for Tegra20
ARM: tegra: add LP1 suspend support for Tegra30
ARM: tegra: add common LP1 suspend support
clk: tegra114: add LP1 suspend/resume support
ARM: tegra: config the polarity of the request of sys clock
ARM: tegra: add common resume handling code for LP1 resuming
ARM: pci: add ->add_bus() and ->remove_bus() hooks to hw_pci
of: pci: add registry of MSI chips
PCI: Introduce new MSI chip infrastructure
PCI: remove ARCH_SUPPORTS_MSI kconfig option
PCI: use weak functions for MSI arch-specific functions
ARM: tegra: unify Tegra's Kconfig a bit more
ARM: tegra: remove the limitation that Tegra114 can't support suspend
...
Signed-off-by: Kevin Hilman <khilman@linaro.org>
Fix this build error:
In file included from fs/exec.c:61:0:
arch/s390/include/asm/tlb.h:35:23: error: expected identifier or '(' before 'unsigned'
arch/s390/include/asm/tlb.h:36:1: warning: no semicolon at end of struct or union [enabled by default]
arch/s390/include/asm/tlb.h: In function 'tlb_gather_mmu':
arch/s390/include/asm/tlb.h:57:5: error: 'struct mmu_gather' has no member named 'end'
Broken due to commit 2b047252d0 ("Fix TLB gather virtual address range
invalidation corner cases").
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: stable@vger.kernel.org
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
[ Oh well. We had build testing for ppc amd um, but no s390 - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ben Tebulin reported:
"Since v3.7.2 on two independent machines a very specific Git
repository fails in 9/10 cases on git-fsck due to an SHA1/memory
failures. This only occurs on a very specific repository and can be
reproduced stably on two independent laptops. Git mailing list ran
out of ideas and for me this looks like some very exotic kernel issue"
and bisected the failure to the backport of commit 53a59fc67f ("mm:
limit mmu_gather batching to fix soft lockups on !CONFIG_PREEMPT").
That commit itself is not actually buggy, but what it does is to make it
much more likely to hit the partial TLB invalidation case, since it
introduces a new case in tlb_next_batch() that previously only ever
happened when running out of memory.
The real bug is that the TLB gather virtual memory range setup is subtly
buggered. It was introduced in commit 597e1c3580 ("mm/mmu_gather:
enable tlb flush range in generic mmu_gather"), and the range handling
was already fixed at least once in commit e6c495a96c ("mm: fix the TLB
range flushed when __tlb_remove_page() runs out of slots"), but that fix
was not complete.
The problem with the TLB gather virtual address range is that it isn't
set up by the initial tlb_gather_mmu() initialization (which didn't get
the TLB range information), but it is set up ad-hoc later by the
functions that actually flush the TLB. And so any such case that forgot
to update the TLB range entries would potentially miss TLB invalidates.
Rather than try to figure out exactly which particular ad-hoc range
setup was missing (I personally suspect it's the hugetlb case in
zap_huge_pmd(), which didn't have the same logic as zap_pte_range()
did), this patch just gets rid of the problem at the source: make the
TLB range information available to tlb_gather_mmu(), and initialize it
when initializing all the other tlb gather fields.
This makes the patch larger, but conceptually much simpler. And the end
result is much more understandable; even if you want to play games with
partial ranges when invalidating the TLB contents in chunks, now the
range information is always there, and anybody who doesn't want to
bother with it won't introduce subtle bugs.
Ben verified that this fixes his problem.
Reported-bisected-and-tested-by: Ben Tebulin <tebulin@googlemail.com>
Build-testing-by: Stephen Rothwell <sfr@canb.auug.org.au>
Build-testing-by: Richard Weinberger <richard.weinberger@gmail.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull nohz improvements from Frederic Weisbecker:
" It mostly contains fixes and full dynticks off-case optimizations. I believe that
distros want to enable this feature so it seems important to optimize the case
where the "nohz_full=" parameter is empty. ie: I'm trying to remove any performance
regression that comes with NO_HZ_FULL=y when the feature is not used.
This patchset improves the current situation a lot (off-case appears to be around 11% faster
with hackbench, although I guess it may vary depending on the configuration but it should be
significantly faster in any case) now there is still some work to do: I can still observe a
remaining loss of 1.6% throughput seen with hackbench compared to CONFIG_NO_HZ_FULL=n. "
Signed-off-by: Ingo Molnar <mingo@kernel.org>
If the arch overrides some generic vtime APIs, let it describe
these on a dedicated and standalone header. This way it becomes
convenient to include it in vtime generic headers without irrelevant
stuff in such a low level header.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Until now, the MSI architecture-specific functions could be overloaded
using a fairly complex set of #define and compile-time
conditionals. In order to prepare for the introduction of the msi_chip
infrastructure, it is desirable to switch all those functions to use
the 'weak' mechanism. This commit converts all the architectures that
were overidding those MSI functions to use the new strategy.
Note that we keep two separate, non-weak, functions
default_teardown_msi_irqs() and default_restore_msi_irqs() for the
default behavior of the arch_teardown_msi_irqs() and
arch_restore_msi_irqs(), as the default behavior is needed by x86 PCI
code.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Tested-by: Daniel Price <daniel.price@gmail.com>
Tested-by: Thierry Reding <thierry.reding@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: linux390@de.ibm.com
Cc: linux-s390@vger.kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: linux-ia64@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: David S. Miller <davem@davemloft.net>
Cc: sparclinux@vger.kernel.org
Cc: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
Current common code uses PAGE_OFFSET to indicate a bad host virtual address.
As this check won't work on architectures that don't map kernel and user memory
into the same address space (e.g. s390), such architectures can now provide
their own KVM_HVA_ERR_BAD defines.
Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The gmap_map_segment function uses PGDIR_SIZE in the check for the
maximum address in the tasks address space. This incorrectly limits
the amount of memory usable for a kvm guest to 4TB. The correct limit
is (1UL << 53). As the TASK_SIZE has different values (4TB vs 8PB)
dependent on the existance of the fourth page table level, create
a new define 'TASK_MAX_SIZE' for (1UL << 53).
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Improve the code to upgrade the standard 2K page tables to 4K page tables
with PGSTEs to allow the operation to happen when the program is already
multi-threaded.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The find_next_bit_left function is broken if used with an offset which
is not a multiple of 64. The shift to mask the bits of a 64-bit word
not to search is in the wrong direction, the result can be either a
bit found smaller than the offset or failure to find a set bit.
Cc: <stable@vger.kernel.org> # v3.8+
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The patch implements a s390 specific ptrace request
PTRACE_TE_ABORT_RAND to modify the randomness of spontaneous
aborts of memory transactions of the transaction execution
facility. The data argument of the ptrace request is used to
specify the levels of randomness, 0 for normal operation, 1 to
abort every transaction at a random instruction, and 2 to abort
a random transaction at a random instruction. The default is 0
for normal operation.
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Michael Mueller <mimu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Rename LL_SO to BUSY_POLL_SO
Rename sysctl_net_ll_{read,poll} to sysctl_busy_{read,poll}
Fix up users of these variables.
Fix documentation for sysctl.
a patch for the socket.7 man page will follow separately,
because of limitations of my mail setup.
Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking updates from David Miller:
"This is a re-do of the net-next pull request for the current merge
window. The only difference from the one I made the other day is that
this has Eliezer's interface renames and the timeout handling changes
made based upon your feedback, as well as a few bug fixes that have
trickeled in.
Highlights:
1) Low latency device polling, eliminating the cost of interrupt
handling and context switches. Allows direct polling of a network
device from socket operations, such as recvmsg() and poll().
Currently ixgbe, mlx4, and bnx2x support this feature.
Full high level description, performance numbers, and design in
commit 0a4db187a9 ("Merge branch 'll_poll'")
From Eliezer Tamir.
2) With the routing cache removed, ip_check_mc_rcu() gets exercised
more than ever before in the case where we have lots of multicast
addresses. Use a hash table instead of a simple linked list, from
Eric Dumazet.
3) Add driver for Atheros CQA98xx 802.11ac wireless devices, from
Bartosz Markowski, Janusz Dziedzic, Kalle Valo, Marek Kwaczynski,
Marek Puzyniak, Michal Kazior, and Sujith Manoharan.
4) Support reporting the TUN device persist flag to userspace, from
Pavel Emelyanov.
5) Allow controlling network device VF link state using netlink, from
Rony Efraim.
6) Support GRE tunneling in openvswitch, from Pravin B Shelar.
7) Adjust SOCK_MIN_RCVBUF and SOCK_MIN_SNDBUF for modern times, from
Daniel Borkmann and Eric Dumazet.
8) Allow controlling of TCP quickack behavior on a per-route basis,
from Cong Wang.
9) Several bug fixes and improvements to vxlan from Stephen
Hemminger, Pravin B Shelar, and Mike Rapoport. In particular,
support receiving on multiple UDP ports.
10) Major cleanups, particular in the area of debugging and cookie
lifetime handline, to the SCTP protocol code. From Daniel
Borkmann.
11) Allow packets to cross network namespaces when traversing tunnel
devices. From Nicolas Dichtel.
12) Allow monitoring netlink traffic via AF_PACKET sockets, in a
manner akin to how we monitor real network traffic via ptype_all.
From Daniel Borkmann.
13) Several bug fixes and improvements for the new alx device driver,
from Johannes Berg.
14) Fix scalability issues in the netem packet scheduler's time queue,
by using an rbtree. From Eric Dumazet.
15) Several bug fixes in TCP loss recovery handling, from Yuchung
Cheng.
16) Add support for GSO segmentation of MPLS packets, from Simon
Horman.
17) Make network notifiers have a real data type for the opaque
pointer that's passed into them. Use this to properly handle
network device flag changes in arp_netdev_event(). From Jiri
Pirko and Timo Teräs.
18) Convert several drivers over to module_pci_driver(), from Peter
Huewe.
19) tcp_fixup_rcvbuf() can loop 500 times over loopback, just use a
O(1) calculation instead. From Eric Dumazet.
20) Support setting of explicit tunnel peer addresses in ipv6, just
like ipv4. From Nicolas Dichtel.
21) Protect x86 BPF JIT against spraying attacks, from Eric Dumazet.
22) Prevent a single high rate flow from overruning an individual cpu
during RX packet processing via selective flow shedding. From
Willem de Bruijn.
23) Don't use spinlocks in TCP md5 signing fast paths, from Eric
Dumazet.
24) Don't just drop GSO packets which are above the TBF scheduler's
burst limit, chop them up so they are in-bounds instead. Also
from Eric Dumazet.
25) VLAN offloads are missed when configured on top of a bridge, fix
from Vlad Yasevich.
26) Support IPV6 in ping sockets. From Lorenzo Colitti.
27) Receive flow steering targets should be updated at poll() time
too, from David Majnemer.
28) Fix several corner case regressions in PMTU/redirect handling due
to the routing cache removal, from Timo Teräs.
29) We have to be mindful of ipv4 mapped ipv6 sockets in
upd_v6_push_pending_frames(). From Hannes Frederic Sowa.
30) Fix L2TP sequence number handling bugs, from James Chapman."
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1214 commits)
drivers/net: caif: fix wrong rtnl_is_locked() usage
drivers/net: enic: release rtnl_lock on error-path
vhost-net: fix use-after-free in vhost_net_flush
net: mv643xx_eth: do not use port number as platform device id
net: sctp: confirm route during forward progress
virtio_net: fix race in RX VQ processing
virtio: support unlocked queue poll
net/cadence/macb: fix bug/typo in extracting gem_irq_read_clear bit
Documentation: Fix references to defunct linux-net@vger.kernel.org
net/fs: change busy poll time accounting
net: rename low latency sockets functions to busy poll
bridge: fix some kernel warning in multicast timer
sfc: Fix memory leak when discarding scattered packets
sit: fix tunnel update via netlink
dt:net:stmmac: Add dt specific phy reset callback support.
dt:net:stmmac: Add support to dwmac version 3.610 and 3.710
dt:net:stmmac: Allocate platform data only if its NULL.
net:stmmac: fix memleak in the open method
ipv6: rt6_check_neigh should successfully verify neigh if no NUD information are available
net: ipv6: fix wrong ping_v6_sendmsg return value
...
Pull powerpc updates from Ben Herrenschmidt:
"This is the powerpc changes for the 3.11 merge window. In addition to
the usual bug fixes and small updates, the main highlights are:
- Support for transparent huge pages by Aneesh Kumar for 64-bit
server processors. This allows the use of 16M pages as transparent
huge pages on kernels compiled with a 64K base page size.
- Base VFIO support for KVM on power by Alexey Kardashevskiy
- Wiring up of our nvram to the pstore infrastructure, including
putting compressed oopses in there by Aruna Balakrishnaiah
- Move, rework and improve our "EEH" (basically PCI error handling
and recovery) infrastructure. It is no longer specific to pseries
but is now usable by the new "powernv" platform as well (no
hypervisor) by Gavin Shan.
- I fixed some bugs in our math-emu instruction decoding and made it
usable to emulate some optional FP instructions on processors with
hard FP that lack them (such as fsqrt on Freescale embedded
processors).
- Support for Power8 "Event Based Branch" facility by Michael
Ellerman. This facility allows what is basically "userspace
interrupts" for performance monitor events.
- A bunch of Transactional Memory vs. Signals bug fixes and HW
breakpoint/watchpoint fixes by Michael Neuling.
And more ... I appologize in advance if I've failed to highlight
something that somebody deemed worth it."
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (156 commits)
pstore: Add hsize argument in write_buf call of pstore_ftrace_call
powerpc/fsl: add MPIC timer wakeup support
powerpc/mpic: create mpic subsystem object
powerpc/mpic: add global timer support
powerpc/mpic: add irq_set_wake support
powerpc/85xx: enable coreint for all the 64bit boards
powerpc/8xx: Erroneous double irq_eoi() on CPM IRQ in MPC8xx
powerpc/fsl: Enable CONFIG_E1000E in mpc85xx_smp_defconfig
powerpc/mpic: Add get_version API both for internal and external use
powerpc: Handle both new style and old style reserve maps
powerpc/hw_brk: Fix off by one error when validating DAWR region end
powerpc/pseries: Support compression of oops text via pstore
powerpc/pseries: Re-organise the oops compression code
pstore: Pass header size in the pstore write callback
powerpc/powernv: Fix iommu initialization again
powerpc/pseries: Inform the hypervisor we are using EBB regs
powerpc/perf: Add power8 EBB support
powerpc/perf: Core EBB support for 64-bit book3s
powerpc/perf: Drop MMCRA from thread_struct
powerpc/perf: Don't enable if we have zero events
...
Conflicts:
drivers/net/ethernet/freescale/fec_main.c
drivers/net/ethernet/renesas/sh_eth.c
net/ipv4/gre.c
The GRE conflict is between a bug fix (kfree_skb --> kfree_skb_list)
and the splitting of the gre.c code into seperate files.
The FEC conflict was two sets of changes adding ethtool support code
in an "!CONFIG_M5272" CPP protected block.
Finally the sh_eth.c conflict was between one commit add bits set
in the .eesr_err_check mask whilst another commit removed the
.tx_error_check member and assignments.
Signed-off-by: David S. Miller <davem@davemloft.net>
On the x86 side, there are some optimizations and documentation updates.
The big ARM/KVM change for 3.11, support for AArch64, will come through
Catalin Marinas's tree. s390 and PPC have misc cleanups and bugfixes.
There is a conflict due to "s390/pgtable: fix ipte notify bit" having
entered 3.10 through Martin Schwidefsky's s390 tree. This pull request
has additional changes on top, so this tree's version is the correct one.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.13 (GNU/Linux)
iQIcBAABAgAGBQJR0oU6AAoJEBvWZb6bTYbynnsP/RSUrrHrA8Wu1tqVfAKu+1y5
6OIihqZ9x11/YMaNofAfv86jqxFu0/j7CzMGphNdjzujqKI+Q1tGe7oiVCmKzoG+
UvSctWsz0lpllgBtnnrm5tcfmG6rrddhLtpA7m320+xCVx8KV5P4VfyHZEU+Ho8h
ziPmb2mAQ65gBNX6nLHEJ3ITTgad6gt4NNbrKIYpyXuWZQJypzaRqT/vpc4md+Ed
dCebMXsL1xgyb98EcnOdrWH1wV30MfucR7IpObOhXnnMKeeltqAQPvaOlKzZh4dK
+QfxJfdRZVS0cepcxzx1Q2X3dgjoKQsHq1nlIyz3qu1vhtfaqBlixLZk0SguZ/R9
1S1YqucZiLRO57RD4q0Ak5oxwobu18ZoqJZ6nledNdWwDe8bz/W2wGAeVty19ky0
qstBdM9jnwXrc0qrVgZp3+s5dsx3NAm/KKZBoq4sXiDLd/yBzdEdWIVkIrU3X9wU
3X26wOmBxtsB7so/JR7ciTsQHelmLicnVeXohAEP9CjIJffB81xVXnXs0P0SYuiQ
RzbSCwjPzET4JBOaHWT0Dhv0DTS/EaI97KzlN32US3Bn3WiLlS1oDCoPFoaLqd2K
LxQMsXS8anAWxFvexfSuUpbJGPnKSidSQoQmJeMGBa9QhmZCht3IL16/Fb641ToN
xBohzi49L9FDbpOnTYfz
=1zpG
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
"On the x86 side, there are some optimizations and documentation
updates. The big ARM/KVM change for 3.11, support for AArch64, will
come through Catalin Marinas's tree. s390 and PPC have misc cleanups
and bugfixes"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (87 commits)
KVM: PPC: Ignore PIR writes
KVM: PPC: Book3S PR: Invalidate SLB entries properly
KVM: PPC: Book3S PR: Allow guest to use 1TB segments
KVM: PPC: Book3S PR: Don't keep scanning HPTEG after we find a match
KVM: PPC: Book3S PR: Fix invalidation of SLB entry 0 on guest entry
KVM: PPC: Book3S PR: Fix proto-VSID calculations
KVM: PPC: Guard doorbell exception with CONFIG_PPC_DOORBELL
KVM: Fix RTC interrupt coalescing tracking
kvm: Add a tracepoint write_tsc_offset
KVM: MMU: Inform users of mmio generation wraparound
KVM: MMU: document fast invalidate all mmio sptes
KVM: MMU: document fast invalidate all pages
KVM: MMU: document fast page fault
KVM: MMU: document mmio page fault
KVM: MMU: document write_flooding_count
KVM: MMU: document clear_spte_count
KVM: MMU: drop kvm_mmu_zap_mmio_sptes
KVM: MMU: init kvm generation close to mmio wrap-around value
KVM: MMU: add tracepoint for check_mmio_spte
KVM: MMU: fast invalidate all mmio sptes
...
Pull s390 updates from Martin Schwidefsky:
"This is the bulk of the s390 patches for the 3.11 merge window.
Notable enhancements are: the block timeout patches for dasd from
Hannes, and more work on the PCI support front. In addition some
cleanup and the usual bug fixing."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (42 commits)
s390/dasd: Fail all requests when DASD_FLAG_ABORTIO is set
s390/dasd: Add 'timeout' attribute
block: check for timeout function in blk_rq_timed_out()
block/dasd: detailed I/O errors
s390/dasd: Reduce amount of messages for specific errors
s390/dasd: Implement block timeout handling
s390/dasd: process all requests in the device tasklet
s390/dasd: make number of retries configurable
s390/dasd: Clarify comment
s390/hwsampler: Updated misleading member names in hws_data_entry
s390/appldata_net_sum: do not use static data
s390/appldata_mem: do not use static data
s390/vmwatchdog: do not use static data
s390/airq: simplify adapter interrupt code
s390/pci: remove per device debug attribute
s390/dma: remove gratuitous brackets
s390/facility: decompose test_facility()
s390/sclp: remove duplicated include from sclp_ctl.c
s390/irq: store interrupt information in pt_regs
s390/drivers: Cocci spatch "ptr_ret.spatch"
...
Pull VFS patches (part 1) from Al Viro:
"The major change in this pile is ->readdir() replacement with
->iterate(), dealing with ->f_pos races in ->readdir() instances for
good.
There's a lot more, but I'd prefer to split the pull request into
several stages and this is the first obvious cutoff point."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (67 commits)
[readdir] constify ->actor
[readdir] ->readdir() is gone
[readdir] convert ecryptfs
[readdir] convert coda
[readdir] convert ocfs2
[readdir] convert fatfs
[readdir] convert xfs
[readdir] convert btrfs
[readdir] convert hostfs
[readdir] convert afs
[readdir] convert ncpfs
[readdir] convert hfsplus
[readdir] convert hfs
[readdir] convert befs
[readdir] convert cifs
[readdir] convert freevxfs
[readdir] convert fuse
[readdir] convert hpfs
reiserfs: switch reiserfs_readdir_dentry to inode
reiserfs: is_privroot_deh() needs only directory inode, actually
...
Whenever a DASD request encounters a timeout we might
need to abort all outstanding requests on this or
even other devices.
This is especially useful if one wants to fail all
devices on one side of a RAID10 configuration, even
though only one device exhibited an error.
To handle this I've introduced a new device flag
DASD_FLAG_ABORTIO.
This flag is evaluated in __dasd_process_request_queue()
and will invoke blk_abort_request() for all
outstanding requests with DASD_CQR_FLAGS_FAILFAST set.
This will cause any of these requests to be aborted
immediately if the blk_timeout function is activated.
The DASD_FLAG_ABORTIO is also evaluated in
__dasd_process_request_queue to abort all
new request which would have the
DASD_CQR_FLAGS_FAILFAST bit set.
The flag can be set with the new ioctls 'BIODASDABORTIO'
and removed with 'BIODASDALLOWIO'.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Stefan Weinhuber <wein@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iQEcBAABAgAGBQJR0K2gAAoJEHm+PkMAQRiGWsEH+gMZSN1qRm34hZ82q1Tx7HvL
Eb/Gsl3Qw/7G2TlTqgjBUs36IdqV9O2cui/aa3/TfXvdvrx+0GlhRkEwQPc+ygcO
Mvoyoke4tT4+4jVFdCg1J8avREsa28/6oaHs0ZZxuVmJBBLTJH7aXaNsGn6eU1q9
9+p798MQis6naIiPC63somlZcCIiBhsuWCPWpEfLMn8G1HWAFTM3xXIbNBqe/brS
bmIOfhomlIZ5dcdaXGvjtP3+KJhkNDwhkPC4tVYu8JqqgSlrE+a+EGyEuuGqKk10
U+swiqyuD31uBI9ga54u/2FzSqDiAu6YOcMXevjo/m3g9XLdYbYLvN+nvN8alCQ=
=Ob6Z
-----END PGP SIGNATURE-----
Merge tag 'v3.10' into next
Merge 3.10 in order to get some of the last minute powerpc
changes, resolve conflicts and add additional fixes on top
of them.
There are three users of adapter interrupts: AP, QDIO and PCI. Each
registers a single adapter interrupt with independent ISCs. Define
a "struct airq" with the interrupt handler, a pointer and a mask for
the local summary indicator and the ISC for the adapter interrupt
source. Convert the indicator array with its fixed number of adapter
interrupt sources per ISE to an array of hlists. This removes the
limitation to 32 adapter interrupts per ISC and allows for arbitrary
memory locations for the local summary indicator.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The per-pci-device 'debug' attribute is ill defined. For each device
it prints the same information, the adapter interrupt bit vector for
irq numbers 0 & 1, the start of the global interrupt summary vector
and the global irq retries counter. Just remove the attribute and
the associated code.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Remove gratuitous brackets in dma_mapping_error.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The patch decomposes the function test_facility() into its API
test_facility() and its implementation __test_facility(). This
allows to reuse the implementation with a different API.
Patch is used to prepare checkin of SIE satellite code.
Signed-off-by: Michael Mueller <mimu@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Copy the interrupt parameters from the lowcore to the pt_regs structure
in entry[64].S and reduce the arguments of the low level interrupt handler
to the pt_regs pointer only. In addition move the test-pending-interrupt
loop from do_IRQ to entry[64].S to make sure that interrupt information
is always delivered via pt_regs.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Add a character misc device "sclp_ctl" that allows to run SCCBs
from user space using the SCLP_CTL_SCCB ioctl.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Introduce two new ioctls CHSC_ON_CLOSE_SET and CHSC_ON_CLOSE_REMOVE
that allow to add and remove one CHSC that is unconditionally executed
when the CHSC device node is closed.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
This patch adds a new ioctl CHSC_START_SYNC that allows to
execute any synchronous CHSC that is provided by user space.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Provide wrappers for the [de]configure operations, add some error
handling, and use pci_scan_slot instead of pci_scan_single_device.
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
From time to time we need to set the guest storage key. Lets
provide a helper function that handles the changes with all the
right locking and checking.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
virt_to_phys on s390 currently uses the LRA instruction to translate
virtual to physical addresses. This creates an unnecessary overhead
and caused trouble with dma debugging code (when called with an
address pointing to a already unmapped page).
Just get rid of s390's implementation and use the one from
asm-generic/io.h .
Note: with this change virt_to_phys will no longer work on vmalloc'ed
addresses.
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Renamed the PGM_PRIVILEGED_OPERATION define to PGM_PRIVILEGED_OP since this
define was way longer than the other PGM_* defines and caused the code often
to exceed the 80 columns limit when not split to multiple lines.
Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This will be later used by powerpc THP support. In powerpc we want to use
pgtable for storing the hash index values. So instead of adding them to
mm_context list, we would like to store them in the second half of pmd
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Conflicts:
drivers/net/wireless/ath/ath9k/Kconfig
drivers/net/xen-netback/netback.c
net/batman-adv/bat_iv_ogm.c
net/wireless/nl80211.c
The ath9k Kconfig conflict was a change of a Kconfig option name right
next to the deletion of another option.
The xen-netback conflict was overlapping changes involving the
handling of the notify list in xen_netbk_rx_action().
Batman conflict resolution provided by Antonio Quartulli, basically
keep everything in both conflict hunks.
The nl80211 conflict is a little more involved. In 'net' we added a
dynamic memory allocation to nl80211_dump_wiphy() to fix a race that
Linus reported. Meanwhile in 'net-next' the handlers were converted
to use pre and post doit handlers which use a flag to determine
whether to hold the RTNL mutex around the operation.
However, the dump handlers to not use this logic. Instead they have
to explicitly do the locking. There were apparent bugs in the
conversion of nl80211_dump_wiphy() in that we were not dropping the
RTNL mutex in all the return paths, and it seems we very much should
be doing so. So I fixed that whilst handling the overlapping changes.
To simplify the initial returns, I take the RTNL mutex after we try
to allocate 'tb'.
Signed-off-by: David S. Miller <davem@davemloft.net>
Without this patch drivers will get blamed (CONFIG_DMA_API_DEBUG=y)
for not calling dma_mapping_error (even if they do).
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The map_page implementation of s390 returns DMA_ERROR_CODE in an error
situation. Correctly test if a mapping was erroneous (DMA_ERROR_CODE is
defined as ~0).
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
adds a socket option for low latency polling.
This allows overriding the global sysctl value with a per-socket one.
Unexport sysctl_net_ll_poll since for now it's not needed in modules.
Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is based on an original patch of David Hildenbrand.
The perf core implementation calls architecture specific code in order
to ask for specific information for a particular sample:
perf_instruction_pointer()
When perf core code asks for the instruction pointer, architecture
specific code must detect if a KVM guest was running when the sample
was taken. A sample can be associated with a KVM guest when the PSW
supervisor state bit is set and the PSW instruction pointer part
contains the address of 'sie_exit'.
A KVM guest's instruction pointer information is then retrieved via
gpsw entry pointed to by the sie control-block.
perf_misc_flags()
perf code code calls this function in order to associate the kernel
vs. user state infomation with a particular sample. Architecture
specific code must also first detectif a KVM guest was running
at the time the sample was taken.
Signed-off-by: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Lets use the common waitqueue for kvm cpus on s390. By itself it is
just a cleanup, but it should also improve the accuracy of diag 0x44
which is implemented via kvm_vcpu_on_spin. kvm_vcpu_on_spin has
an explicit check for waiting on the waitqueue to optimize the
yielding.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch enables kvm to give large pages to the guest. The heavy
lifting is done by the hardware, the host only has to take care
of the PFMF instruction, which is also part of EDAT-1.
We also support the non-quiescing key setting facility if the host
supports it, to behave similar to the interpretation of sske.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
From time to time we need to set the guest storage key. Lets
provide a helper function that handles the changes with all the
right locking and checking.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Getting and Releasing the pgste lock has lock semantics. Make the
code an explicit barrier.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
In modify_prot_start we update the pgste value but never store it back
into the original location. Lets save the calculated result, since
modify_prot_commit will use the value of the pgste.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
When doing the transition invalid->valid in the host page table for
a guest, then the guest view of C/R is in the pgste. After validation
the view is pgste OR real key. We must zero out the real key C/R to
avoid guest over-indication for change (and reference).
Touching the real key is ok also for the host: The change bit is
tracked via write protection and the reference bit is also ok
because set_pte_at was called and the page will be touched anyway
soon. Furthermore architecture defines reference as "substantially
accurate", over- and underindication are ok.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
pte_present might return true on PAGE_TYPE_NONE, even if
the invalid bit is on. Modify the existing check of the
pgste functions to avoid crashes.
[ Martin Schwidefsky: added ptep_modify_prot_[start|commit] bits ]
Reported-by: Martin Schwidefky <schwidefsky@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
CC: stable@vger.kernel.org
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
On s390 the prefix page and absolute zero pages are not correctly
returned when reading /dev/mem. The reason is that the s390 asm/io.h
file includes the asm-generic/io.h file which then defines
xlate_dev_mem_ptr() and therefore overwrites the s390 specific
version that does the correct swap operation for prefix and absolute
zero pages. The problem is a regression that was introduced with git
commit cd248341 (s390/pci: base support).
To fix the problem add "#ifndef xlate_dev_mem_ptr" in asm-generic/io.h
and "#define xlate_dev_mem_ptr" in asm/io.h. This ensures that the
s390 version is used. For completeness also add the "#ifndef"
construct for xlate_dev_kmem_ptr().
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
In dma_free_coherent call debug_dma_free_coherent before deallocating
the memory to avoid a possible use after free.
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The guest prefix pages must be mapped writeable all the time
while SIE is running, otherwise the guest might see random
behaviour. (pinned at the pte level) Turns out that mlocking is
not enough, the page table entry (not the page) might change or
become r/o. This patch uses the gmap notifiers to kick guest
cpus out of SIE.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Lets provide functions to prevent KVM from reentering SIE and
to kick cpus out of SIE. We cannot use the common kvm_vcpu_kick code,
since we need to kick out guests in places that hold architecture
specific locks (e.g. pgste lock) which might be necessary on the
other cpus - so no waiting possible.
So lets provide a bit in a private field of the sie control block
that acts as a gate keeper, after we claimed we are in SIE.
Please note that we do not reuse prog0c, since we want to access
that bit without atomic ops.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Lets track in a private bit if the sie control block is active.
We want to track this as closely as possible, so we also have to
instrument the interrupt and program check handler. Lets use the
existing HANDLE_SIE_INTERCEPT macro.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
The RCP byte is a part of the PGSTE value, the existing RCP_xxx names
are inaccurate. As the defines describe bits and pieces of the PGSTE,
the names should start with PGSTE_. The KVM_UR_BIT and KVM_UC_BIT are
part of the PGSTE as well, give them better names as well.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Dont use the same bit as user referenced.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Tony Jones reported that the ftrace self tests on s390 do not work:
<6>Testing dynamic ftrace ops #1: (0 0 0 0 0) FAILED!
<6>Testing tracer irqsoff:
<3>failed to start irqsoff tracer
<4>.. no entries found ..FAILED!
<6>Testing tracer wakeup:
<3>failed to start wakeup tracer
<4>.. no entries found ..FAILED!
<6>Testing tracer function_graph:
<4>Failed to init function_graph tracer, init returned -19
<4>FAILED!
This happens because we forgot to adjust the instruction pointer that gets
passed to the ftrace trace function by MCOUNT_INSN_SIZE.
In addition change MCOUNT_INSN_SIZE to the correct value on 31 bit.
It only worked so far because the to be patched instruction was identical.
Reported-by: Tony Jones <tonyj@suse.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Wit the introduction of large pages Linux also used pfmf for page
clearing. The current implementation is not ideal, though:
- currently we set usage intent=0, but cleared pages are often used
directly after the clearing
- z/VM does not yet provide EDAT
- KVM does have to intercept PFMF even for resident pages
Lets just the mvcl loop in all cases until we have a well defined
pattern were pfmf is besser.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Pull more s390 updates from Martin Schwidefsky:
"This is the second batch of s390 patches for the 3.10 merge window.
Heiko improved the memory detection, this fixes kdump for large memory
sizes. Some kvm related memory management work, new ipldev/condev
keywords in cio and bug fixes."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/mem_detect: remove artificial kdump memory types
s390/mm: add pte invalidation notifier for kvm
s390/zcrypt: ap bus rescan problem when toggle crypto adapters on/off
s390/memory hotplug,sclp: get rid of per memory increment usecount
s390/memory hotplug: provide memory_block_size_bytes() function
s390/mem_detect: limit memory detection loop to "mem=" parameter
s390/kdump,bootmem: fix bootmem allocator bitmap size
s390: get rid of odd global real_memory_size
s390/kvm: Change the virtual memory mapping location for Virtio devices
s390/zcore: calculate real memory size using own get_mem_size function
s390/mem_detect: add DAT sanity check
s390/mem_detect: fix lockdep irq tracing
s390/mem_detect: move memory detection code to mm folder
s390/zfcpdump: exploit new cio_ignore keywords
s390/cio: add condev keyword to cio_ignore
s390/cio: add ipldev keyword to cio_ignore
s390/uaccess: add "fallthrough" comments
Pull kvm updates from Gleb Natapov:
"Highlights of the updates are:
general:
- new emulated device API
- legacy device assignment is now optional
- irqfd interface is more generic and can be shared between arches
x86:
- VMCS shadow support and other nested VMX improvements
- APIC virtualization and Posted Interrupt hardware support
- Optimize mmio spte zapping
ppc:
- BookE: in-kernel MPIC emulation with irqfd support
- Book3S: in-kernel XICS emulation (incomplete)
- Book3S: HV: migration fixes
- BookE: more debug support preparation
- BookE: e6500 support
ARM:
- reworking of Hyp idmaps
s390:
- ioeventfd for virtio-ccw
And many other bug fixes, cleanups and improvements"
* tag 'kvm-3.10-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (204 commits)
kvm: Add compat_ioctl for device control API
KVM: x86: Account for failing enable_irq_window for NMI window request
KVM: PPC: Book3S: Add API for in-kernel XICS emulation
kvm/ppc/mpic: fix missing unlock in set_base_addr()
kvm/ppc: Hold srcu lock when calling kvm_io_bus_read/write
kvm/ppc/mpic: remove users
kvm/ppc/mpic: fix mmio region lists when multiple guests used
kvm/ppc/mpic: remove default routes from documentation
kvm: KVM_CAP_IOMMU only available with device assignment
ARM: KVM: iterate over all CPUs for CPU compatibility check
KVM: ARM: Fix spelling in error message
ARM: KVM: define KVM_ARM_MAX_VCPUS unconditionally
KVM: ARM: Fix API documentation for ONE_REG encoding
ARM: KVM: promote vfp_host pointer to generic host cpu context
ARM: KVM: add architecture specific hook for capabilities
ARM: KVM: perform HYP initilization for hotplugged CPUs
ARM: KVM: switch to a dual-step HYP init code
ARM: KVM: rework HYP page table freeing
ARM: KVM: enforce maximum size for identity mapped code
ARM: KVM: move to a KVM provided HYP idmap
...
Simplify the memory detection code a bit by removing the CHUNK_OLDMEM
and CHUNK_CRASHK memory types.
They are not needed. Everything that is needed is a mechanism to
insert holes into the detected memory.
Reviewed-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Add a notifier for kvm to get control before a page table entry is
invalidated. The notifier is only called for ptes of an address space
with pgstes that have been explicitly marked to require notification.
Kvm will use this to get control before prefix pages of virtual CPU
are unmapped.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The current memory detection loop will detect all present memory of
a machine. This is true even if the user specified the "mem=" parameter
on the kernel command line.
This can be a problem since the memory detection may cause a fully
populated host page table for the guest, even for those parts of the
memory that the guest will never use afterwards.
So fix this and only detect memory up to a user supplied "mem=" limit
if specified.
Reported-by: Michael Johanssen <johanssn@de.ibm.com>
Reviewed-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The variable real_memory_size has odd semantics and has been used in
a broken way by e.g. the old kvm code.
Therefore get rid of it before anybody else makes use of it.
Reviewed-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Pull networking updates from David Miller:
"Highlights (1721 non-merge commits, this has to be a record of some
sort):
1) Add 'random' mode to team driver, from Jiri Pirko and Eric
Dumazet.
2) Make it so that any driver that supports configuration of multiple
MAC addresses can provide the forwarding database add and del
calls by providing a default implementation and hooking that up if
the driver doesn't have an explicit set of handlers. From Vlad
Yasevich.
3) Support GSO segmentation over tunnels and other encapsulating
devices such as VXLAN, from Pravin B Shelar.
4) Support L2 GRE tunnels in the flow dissector, from Michael Dalton.
5) Implement Tail Loss Probe (TLP) detection in TCP, from Nandita
Dukkipati.
6) In the PHY layer, allow supporting wake-on-lan in situations where
the PHY registers have to be written for it to be configured.
Use it to support wake-on-lan in mv643xx_eth.
From Michael Stapelberg.
7) Significantly improve firewire IPV6 support, from YOSHIFUJI
Hideaki.
8) Allow multiple packets to be sent in a single transmission using
network coding in batman-adv, from Martin Hundebøll.
9) Add support for T5 cxgb4 chips, from Santosh Rastapur.
10) Generalize the VXLAN forwarding tables so that there is more
flexibility in configurating various aspects of the endpoints.
From David Stevens.
11) Support RSS and TSO in hardware over GRE tunnels in bxn2x driver,
from Dmitry Kravkov.
12) Zero copy support in nfnelink_queue, from Eric Dumazet and Pablo
Neira Ayuso.
13) Start adding networking selftests.
14) In situations of overload on the same AF_PACKET fanout socket, or
per-cpu packet receive queue, minimize drop by distributing the
load to other cpus/fanouts. From Willem de Bruijn and Eric
Dumazet.
15) Add support for new payload offset BPF instruction, from Daniel
Borkmann.
16) Convert several drivers over to mdoule_platform_driver(), from
Sachin Kamat.
17) Provide a minimal BPF JIT image disassembler userspace tool, from
Daniel Borkmann.
18) Rewrite F-RTO implementation in TCP to match the final
specification of it in RFC4138 and RFC5682. From Yuchung Cheng.
19) Provide netlink socket diag of netlink sockets ("Yo dawg, I hear
you like netlink, so I implemented netlink dumping of netlink
sockets.") From Andrey Vagin.
20) Remove ugly passing of rtnetlink attributes into rtnl_doit
functions, from Thomas Graf.
21) Allow userspace to be able to see if a configuration change occurs
in the middle of an address or device list dump, from Nicolas
Dichtel.
22) Support RFC3168 ECN protection for ipv6 fragments, from Hannes
Frederic Sowa.
23) Increase accuracy of packet length used by packet scheduler, from
Jason Wang.
24) Beginning set of changes to make ipv4/ipv6 fragment handling more
scalable and less susceptible to overload and locking contention,
from Jesper Dangaard Brouer.
25) Get rid of using non-type-safe NLMSG_* macros and use nlmsg_*()
instead. From Hong Zhiguo.
26) Optimize route usage in IPVS by avoiding reference counting where
possible, from Julian Anastasov.
27) Convert IPVS schedulers to RCU, also from Julian Anastasov.
28) Support cpu fanouts in xt_NFQUEUE netfilter target, from Holger
Eitzenberger.
29) Network namespace support for nf_log, ebt_log, xt_LOG, ipt_ULOG,
nfnetlink_log, and nfnetlink_queue. From Gao feng.
30) Implement RFC3168 ECN protection, from Hannes Frederic Sowa.
31) Support several new r8169 chips, from Hayes Wang.
32) Support tokenized interface identifiers in ipv6, from Daniel
Borkmann.
33) Use usbnet_link_change() helper in USB net driver, from Ming Lei.
34) Add 802.1ad vlan offload support, from Patrick McHardy.
35) Support mmap() based netlink communication, also from Patrick
McHardy.
36) Support HW timestamping in mlx4 driver, from Amir Vadai.
37) Rationalize AF_PACKET packet timestamping when transmitting, from
Willem de Bruijn and Daniel Borkmann.
38) Bring parity to what's provided by /proc/net/packet socket dumping
and the info provided by netlink socket dumping of AF_PACKET
sockets. From Nicolas Dichtel.
39) Fix peeking beyond zero sized SKBs in AF_UNIX, from Benjamin
Poirier"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1722 commits)
filter: fix va_list build error
af_unix: fix a fatal race with bit fields
bnx2x: Prevent memory leak when cnic is absent
bnx2x: correct reading of speed capabilities
net: sctp: attribute printl with __printf for gcc fmt checks
netlink: kconfig: move mmap i/o into netlink kconfig
netpoll: convert mutex into a semaphore
netlink: Fix skb ref counting.
net_sched: act_ipt forward compat with xtables
mlx4_en: fix a build error on 32bit arches
Revert "bnx2x: allow nvram test to run when device is down"
bridge: avoid OOPS if root port not found
drivers: net: cpsw: fix kernel warn on cpsw irq enable
sh_eth: use random MAC address if no valid one supplied
3c509.c: call SET_NETDEV_DEV for all device types (ISA/ISAPnP/EISA)
tg3: fix to append hardware time stamping flags
unix/stream: fix peeking with an offset larger than data in queue
unix/dgram: fix peeking with an offset larger than data in queue
unix/dgram: peek beyond 0-sized skbs
openvswitch: Remove unneeded ovs_netdev_get_ifindex()
...
Pull compat cleanup from Al Viro:
"Mostly about syscall wrappers this time; there will be another pile
with patches in the same general area from various people, but I'd
rather push those after both that and vfs.git pile are in."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
syscalls.h: slightly reduce the jungles of macros
get rid of union semop in sys_semctl(2) arguments
make do_mremap() static
sparc: no need to sign-extend in sync_file_range() wrapper
ppc compat wrappers for add_key(2) and request_key(2) are pointless
x86: trim sys_ia32.h
x86: sys32_kill and sys32_mprotect are pointless
get rid of compat_sys_semctl() and friends in case of ARCH_WANT_OLD_COMPAT_IPC
merge compat sys_ipc instances
consolidate compat lookup_dcookie()
convert vmsplice to COMPAT_SYSCALL_DEFINE
switch getrusage() to COMPAT_SYSCALL_DEFINE
switch epoll_pwait to COMPAT_SYSCALL_DEFINE
convert sendfile{,64} to COMPAT_SYSCALL_DEFINE
switch signalfd{,4}() to COMPAT_SYSCALL_DEFINE
make SYSCALL_DEFINE<n>-generated wrappers do asmlinkage_protect
make HAVE_SYSCALL_WRAPPERS unconditional
consolidate cond_syscall and SYSCALL_ALIAS declarations
teach SYSCALL_DEFINE<n> how to deal with long long/unsigned long long
get rid of duplicate logics in __SC_....[1-6] definitions
Commit abf09bed3c ("s390/mm: implement software dirty bits")
introduced another difference in the pte layout vs. the pmd layout on
s390, thoroughly breaking the s390 support for hugetlbfs. This requires
replacing some more pte_xxx functions in mm/hugetlbfs.c with a
huge_pte_xxx version.
This patch introduces those huge_pte_xxx functions and their generic
implementation in asm-generic/hugetlb.h, which will now be included on
all architectures supporting hugetlbfs apart from s390. This change
will be a no-op for those architectures.
[akpm@linux-foundation.org: fix warning]
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Hillf Danton <dhillf@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.cz> [for !s390 parts]
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull s390 update from Martin Schwidefsky:
"This is the first batch of s390 patches for the 3.10 merge window.
Included are some performance enhancements: storage key
initialization, zero page cache synonyms, system call micro
optimization and the speedup patches for dasdfmt. Sebastian managed
to get rid of the special casing for the console device in the cio
layer. And the usual bunch of bug fixes."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (59 commits)
s390/pci: use pci_scan_root_bus
s390/scm_blk: fix memleak in init function
s390/scm_blk: allow more cluster size values
s390/cio: fix irq statistics
s390/memory hotplug: prevent offline of active memory increments
s390: remove small stack config option
s390: system call path micro optimization
s390: lowcore stack pointer offsets
s390/uapi: change struct statfs[64] member types to unsigned values
s390/pci: return correct dma address for offset > PAGE_SIZE
s390/ptrace: remove empty ifdefs
s390/compat: remove ptrace compat definitions from uapi header file
s390/compat: fix compile error for !COMPAT
s390/compat: fix compat_sys_statfs() memory corruption
s390/zcore: Fix HSA copy length for last block
s390/mm,gmap: segment mapping race
s390/mm,gmap: implement gmap_translate()
s390/pci: remove disable_device implementation
s390/pci: disable per default
s390/pci: return error after failed pci ops
...
We've seen repeatedly that 8KB stack size on 64 bit kernels
is not sufficient.
So simply remove the config option.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>