Commit Graph

120372 Commits

Author SHA1 Message Date
Linus Torvalds 047486d8e7 EDAC queue for 4.6
* Altera: L2 cache and On-Chip RAM support (Thor Thayer).
 
 * EDAC: Workqueue handling cleanups (Borislav Petkov).
 
 * Xgene: Register bus error handling (Loc Ho).
 
 * Misc small fixes.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJW5sS0AAoJEBLB8Bhh3lVKxPEP/j8pleG17s1HI9wfhukzuT9z
 epjyaYjEv1BR6+3drmFjbAYGgk7hhL8khVePEhouS/P5WQSzRHMdgimL5NMpnwSZ
 6XXyR2szhz86eYMC1lQxdRnESZarbnVtqyiRG0Mv1hFbObyzM0ewiHt2lTtCa1Uj
 DeprrSE6QCQLwq0WSIF8MJ9LBcmGh5dXJTy0sTeATfkmgBaBQeJhiZOmJer25Jqf
 tEyvxkkFw0OLAy3BoI7eeI7ALmgzIXPLIWVOo0t1qTeKsURwdfap8xjteT9/5iiJ
 HHB+4A+8iUMSPYMoIiSr0qsyZT1CI2ncVV4VOAhm11if4gB8sCnOkioMo6bHhtbq
 NZrug6BnwizBBh4FFGHmTTpN8MlhLgYA8YfSkUD5jal3jM/l5VvlM9DDM72JuzLy
 0hd5zaAqlqEJj4plvSx4ieuDHAkA3Qzd58U12LuN+R+nbN2EO8Svr9CvEMSLE+1r
 exxOYRP5xO9SPeK7pTnXJFsI09MrRmXGinRQu/LChWAm1j7NxaHkeUl7ulzxjV+5
 E6Bx+mOPvYTUF32CQi4i/oIePK2kVGoExaUSkRNKeRgwF/ObQaVxjM+cYvM+VJEh
 2OknzHNG/F87UvkcOdGMEy6G4E5xkHSuEYtq7bHGWI2prKn+f0nbPpWgx9d4DiE8
 jo0vi58v+Irwk9H465lH
 =ssy9
 -----END PGP SIGNATURE-----

Merge tag 'edac_for_4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp

Pull EDAC updates from Borislav Petkov:

 - Altera: L2 cache and On-Chip RAM support (Thor Thayer).

 - EDAC: Workqueue handling cleanups (Borislav Petkov).

 - Xgene: Register bus error handling (Loc Ho).

 - Misc small fixes.

* tag 'edac_for_4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
  ARM: socfpga: Enable OCRAM ECC on startup
  ARM: socfpga: Enable L2 cache ECC on startup
  ARM: dts: Add Altera L2 Cache and OCRAM EDAC entries
  EDAC, altera: Add Altera L2 cache and OCRAM support
  EDAC: Use edac_debugfs_remove_recursive() in edac_debugfs_exit()
  EDAC, mpc85xx: Silence unused variable warning
  EDAC: Cleanup/sync workqueue functions
  EDAC: Kill workqueue setup/teardown functions
  EDAC: Balance workqueue setup and teardown
  arm64: Update the APM X-Gene EDAC node with the RB register resource
  EDAC, xgene: Add missing SoC register bus error handling
  Documentation, EDAC: Update xgene binding for missing register bus
  EDAC, amd64_edac: Shift wrapping issue in f1x_get_norm_dct_addr()
2016-03-16 08:36:55 -07:00
Tony Luck cbf8b5a2b6 x86/mm, x86/mce: Fix return type/value for memcpy_mcsafe()
Returning a 'bool' was very unpopular. Doubly so because the
code was just wrong (returning zero for true, one for false;
great for shell programming, not so good for C).

Change return type to "int". Keep zero as the success indicator
because it matches other similar code and people may be more
comfortable writing:

	if (memcpy_mcsafe(to, from, count)) {
		printk("Sad panda, copy failed\n");
		...
	}

Make the failure return value -EFAULT for now.

Reported by: Mika Penttilä <mika.penttila@nextfour.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: mika.penttila@nextfour.com
Fixes: 92b0729c34 ("x86/mm, x86/mce: Add memcpy_mcsafe()")
Link: http://lkml.kernel.org/r/695f14233fa7a54fcac4406c706d7fec228e3f4c.1457993040.git.tony.luck@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-16 09:02:18 +01:00
Ingo Molnar ba4e06d68e Merge branch 'linus' into x86/urgent, to pick up dependencies for a fix
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-16 09:01:55 +01:00
Matt Fleming d367cef0a7 x86/mm/pat: Fix boot crash when 1GB pages are not supported by the CPU
Scott reports that with the new separate EFI page tables he's seeing
the following error on boot, caused by setting reserved bits in the
page table structures (fault code is PF_RSVD | PF_PROT),

  swapper/0: Corrupted page table at address 17b102020
  PGD 17b0e5063 PUD 1400000e3
  Bad pagetable: 0009 [#1] SMP

On first inspection the PUD is using a 1GB page size (_PAGE_PSE) and
looks fine but that's only true if support for 1GB PUD pages
("pdpe1gb") is present in the CPU.

Scott's Intel Celeron N2820 does not have that feature and so the
_PAGE_PSE bit is reserved. Fix this issue by making the 1GB mapping
code in conditional on "cpu_has_gbpages".

This issue didn't come up in the past because the required mapping for
the faulting address (0x17b102020) will already have been setup by the
kernel in early boot before we got to efi_map_regions(), but we no
longer use the standard kernel page tables during EFI calls.

Reported-by: Scott Ashcroft <scott.ashcroft@talk21.com>
Tested-by: Scott Ashcroft <scott.ashcroft@talk21.com>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Acked-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Raphael Hertzog <hertzog@debian.org>
Cc: Roger Shimizu <rogershimizu@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/1457951581-27353-2-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-16 09:00:49 +01:00
Linus Torvalds f0718cea47 hwmon updates for v4.5:
- New drivers for NSA320 and LTC2990
 - Added support for ADM1278 to adm1275 driver
 - Added support for ncpXXxh103 to ntc_thermistor driver
 - Renamed vexpress hwmon implementation
 - Minor cleanups and improvements
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJW5sANAAoJEMsfJm/On5mBKTEP/2SIXOFN+ew3Lhmz36Z8gDJY
 LpVcFjb8S/nRDibvaYjaO5r7p6x2d3jW6wkdpt1V44l5khvODjnaZhH7qZ1xaVuI
 qt9xRX0PvHAlwIpXHRNgueNAkAwTuCJqmbbYGrPQJCe4zu0sxiDrpRa117Ym1AJY
 NxOts2n/VAjVn0O+F4QiJELDmSpajsVt7QPjUGYfiEHPaGDgPVq+5gqRPeL9hfOu
 tjFVWn3YVJTI/Q3N/aB0+XQpwoZdu74RFnr/x7rkjzVcqiNdq0+WtFAIAS9Aw3s1
 gZbwRPX7Pu/sNmcVRaj+9rSpzl9uzHbqD4+/+hCuv3LjnyAbLm3NmV2EB1u42SwE
 bmqjnCTtC5InlkcKGi85Sv72YcGj5MgddSTQ9TDBKQIW8Z70rnFK/7xPbq776W/L
 vFXQ62YZ14lX3sVTMVfntf6BrYeteVbJmP7d9SXeTHfHVSSaxwGrq6kim8m2zi/J
 R8YMsVOok0P3ldyua73TIdujYtpz2AZ4vdddDFSlu/m1o/bp642le5HLVuKBpuYy
 Szjc26JGL5ymGLAUmeEUt8jOWUqQo9MLw9Gc8kgc/Zui3q+lMJKXZ6dvkgJpnFCL
 xw/fB/E6x4pDEhJPe3LeFcKBk/vF6bktEWrrA0rGHrVGJIaOhgPkxyeGV7BdNqFS
 DtMy4EcjmlAKOn5oxOL5
 =n07O
 -----END PGP SIGNATURE-----

Merge tag 'hwmon-for-linus-v4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging

Pull hwmon updates from Guenter Roeck:
 - New drivers for NSA320 and LTC2990
 - Added support for ADM1278 to adm1275 driver
 - Added support for ncpXXxh103 to ntc_thermistor driver
 - Renamed vexpress hwmon implementation
 - Minor cleanups and improvements

* tag 'hwmon-for-linus-v4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: Create an NSA320 hardware monitoring driver
  hwmon: Define binding for the nsa320-hwmon driver
  hwmon: (adm1275) Add support for ADM1278
  hwmon: (ntc_thermistor) Add support for ncpXXxh103
  Doc: hwmon: Fix typo "montoring" in hwmon
  ARM: dts: vfxxx: Add iio_hwmon node for ADC temperature channel
  ARM: dts: Change iio_hwmon nodes to use hypen in node names
  hwmon: (iio_hwmon) Allow the driver to accept hypen in device tree node names
  hwmon: Add LTC2990 sensor driver
  hwmon: (vexpress) rename vexpress hwmon implementation
2016-03-15 21:44:47 -07:00
Cyril Bur 6e669f085d powerpc: Fix unrecoverable SLB miss during restore_math()
Commit 70fe3d9 "powerpc: Restore FPU/VEC/VSX if previously used" introduces a
call to restore_math() late in the syscall return path, after MSR_RI has been
cleared. The MSR_RI flag is used to indicate whether the kernel can take
another exception or not. A cleared MSR_RI flag indicates that the kernel
cannot.

Unfortunately when a machine is under SLB pressure an SLB miss can occur
in restore_math() which (with MSR_RI cleared) leads to an unrecoverable
exception.

  Unrecoverable exception 4100 at c0000000000088d8
  cpu 0x0: Vector: 4100  at [c0000003fa473b20]
      pc: c0000000000088d8: .load_vr_state+0x70/0x110
      lr: c00000000000f710: .restore_math+0x130/0x188
      sp: c0000003fa473da0
     msr: 9000000002003030
    current = 0xc0000007f876f180
    paca    = 0xc00000000fff0000	 softe: 0	 irq_happened: 0x01
      pid   = 1944, comm = K08umountfs
  [link register   ] c00000000000f710 .restore_math+0x130/0x188
  [c0000003fa473da0] c0000003fa473e30 (unreliable)
  [c0000003fa473e30] c000000000007b6c system_call+0x84/0xfc

The clearing of MSR_RI is actually an optimisation to avoid multiple MSR
writes, what must be disabled are interrupts. See comment in entry_64.S:

  /*
   * For performance reasons we clear RI the same time that we
   * clear EE. We only need to clear RI just before we restore r13
   * below, but batching it with EE saves us one expensive mtmsrd call.
   * We have to be careful to restore RI if we branch anywhere from
   * here (eg syscall_exit_work).
   */

At the point of calling restore_math() r13 has not been restored, as such, the
quick fix of turning MSR_RI back on for the call to restore_math() will
eliminate the occurrence of an unrecoverable exception.

We'd like to do a better fix in future.

Fixes: 70fe3d980f ("powerpc: Restore FPU/VEC/VSX if previously used")
Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-16 15:23:02 +11:00
Christophe Leroy 2e098dcee6 powerpc/8xx: Fix do_mtspr_cpu6() build on older compilers
GCC < 4.9 is unable to build this, saying:

  arch/powerpc/mm/8xx_mmu.c:139:2: error: memory input 1 is not directly addressable

Change the one-element array into a simple variable to avoid this.

Fixes: 1458dd951f ("powerpc/8xx: Handle CPU6 ERRATA directly in mtspr() macro")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Scott Wood <oss@buserror.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-16 15:22:40 +11:00
Michael Ellerman b081251ef6 powerpc/rcpm: Fix build break when SMP=n
Add an include of asm/smp.h to fix a build break when SMP=n:

  arch/powerpc/sysdev/fsl_rcpm.c:32:2: error: implicit declaration of
  function 'get_hard_smp_processor_id'

Fixes: d17799f9c1 ("powerpc/rcpm: add RCPM driver")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-16 15:22:32 +11:00
Linus Torvalds b7aae4a9d0 regmap: Updates for v4.6
This has been a very busy release for regmap, not just in cleaning up
 the mess we got ourselves into with the endianness handling but also in
 other areas too:
 
  - Fixes for the endianness handling so that we now explicltly default
    to little endian (the code used to do this by accident).  This
    fixes handling of explictly specified endianness on big endian
    systems.
  - Optimisation of the implementation of register striding.
  - A refectoring of the _update_bits() code to reduce duplication.
  - Fixes and enhancements for the interrupt implementation which
    make it easier to use in a wider range of systems.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJW5v4RAAoJECTWi3JdVIfQcP0H/R+22ZgPNo76YUHtnMi5zCiW
 /TtUzcKg3QXMXpCxQx2p8shRg1eckjb2zeImDJKPteIsO9y04lrlO2ljVnio/Ut9
 6uBGwmmcCa9/haL7waDrw5kQKmCjNAcXhAn7etsRcvpMaPdEwL71ZWfjCY3HtwyJ
 zGoArFNveHzTeZKRNzGZemSA7r1TOLIkHNvS4yRD4H9K7bGfn1HWwumCgurTvpiB
 nxuhpwO7GQy28fPQRfBlfg2TGI+B8GD+VV4qwWnGYR2pijyIE8Kxqawizxueq70f
 YsjiRNepxgPSdWHGMdjcUwQVB2iOSh8LvObxuyW8oGZEFeUxC3JHUiJ79zkKGsg=
 =PBtQ
 -----END PGP SIGNATURE-----

Merge tag 'regmap-v4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap

Pull regmap updates from Mark Brown:
 "This has been a very busy release for regmap, not just in cleaning up
  the mess we got ourselves into with the endianness handling but also
  in other areas too:

   - Fixes for the endianness handling so that we now explicitly default
     to little endian (the code used to do this by accident).  This
     fixes handling of explictly specified endianness on big endian
     systems.

   - Optimisation of the implementation of register striding.

   - A refectoring of the _update_bits() code to reduce duplication.

   - Fixes and enhancements for the interrupt implementation which make
     it easier to use in a wider range of systems"

* tag 'regmap-v4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap: (28 commits)
  regmap: irq: add devm apis for regmap_{add,del}_irq_chip
  regmap: replace regmap_write_bits()
  regmap: irq: Enable irq retriggering for nested irqs
  regmap: add regmap_fields_force_xxx() macros
  regmap: add regmap_field_force_xxx() macros
  regmap: merge regmap_fields_update_bits() into macro
  regmap: merge regmap_fields_write() into macro
  regmap: add regmap_fields_update_bits_base()
  regmap: merge regmap_field_update_bits() into macro
  regmap: merge regmap_field_write() into macro
  regmap: add regmap_field_update_bits_base()
  regmap: merge regmap_update_bits_check_async() into macro
  regmap: merge regmap_update_bits_check() into macro
  regmap: merge regmap_update_bits_async() into macro
  regmap: merge regmap_update_bits() into macro
  regmap: add regmap_update_bits_base()
  regcache: flat: Introduce register strider order
  regcache: Introduce the index parsing API by stride order
  regmap: core: Introduce register stride order
  regmap: irq: add devm apis for regmap_{add,del}_irq_chip
  ...
2016-03-15 21:22:26 -07:00
Scott Wood 7a25d91214 powerpc/book3e-64: Use hardcoded mttmr opcode
This preserves the ability to build using older binutils (reportedly <=
2.22).

Fixes: 6becef7ea0 ("powerpc/mpc85xx: Add CPU hotplug support for E6500")
Signed-off-by: Scott Wood <oss@buserror.net>
Cc: chenhui.zhao@freescale.com
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-16 15:22:16 +11:00
Linus Torvalds 5ca5446ec5 Pin control changes for kernel v4.6:
An almost purely driver related set of changes with no
 major changes to the framework, only one patch adding
 an unlocked version of the pinctrl_find_gpio_range_from_pin()
 library call.
 
 New drivers:
 - ST Microelectronics STM32 MCU support: this is a non-MMU
   low-end platform for IoT things (etc).
 - Microchip PIC32 MCU support: same story as for STM32.
 
 New subdrivers:
 - Allwinner SunXi H3 R_PIO controller support.
 - Qualcomm IPQ4019 support.
 - MediaTek MT2701 and MT7623.
 - Allwinner A64
 
 Non-critical fixes:
 - gpio_disable_free() for the Vybrid.
 - pinctrl single: use a separate lockdep class.
 
 Misc:
 - Substantial cleanups and rewrites for the Super-H PFC
   driver and subdrivers.
 - Various fixes and cleanups, especially Paul Gortmakers
   work to make nonmodular drivers nonmodular.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJW5rS1AAoJEEEQszewGV1zskIQALNriHdVmPGNIuSZOUk2gqAv
 hFNIzFUYM6BK6AL2wD+ZVqQe/EbZtWUF2RqhI2juP8j2WBVMxo8B9ypGjm8qDZ7q
 Xtdnd8l22VP0fEmYKTwDgrjSCcMxbXFiPurZBlqCCyb/9raFqLKwx2kcxWqD4PR2
 dcC8i/t2JSjGYRRCMcGn+zcKW2zja36xci/ZOExdioHYgFmorCZb9+4NYz+coijv
 VEjEy98CG6itBFJ0MS3jHT949TyxpDzWp7hO8LiOAiLR50xCxTlZRT1dObOUQJqk
 qhiLK2sTUFJFlTNBUGN0gfMoJo2MpfJIkVre4EV5QNkfy8vjtXeLrbjnSnTXf0r7
 5OLMEJPK7mfHd7jsw/nwJMYUoqgeRO9VBXtL2JyGWpsNFBFlJv1YRXaT0AZXkgUq
 63BfhTrbtge2xECJ9iqWVdGmmNv2x7lMK6RWqDr72fjONdtmNLDusfNdgYaBWc50
 K910IijMX2t2HGQFzuqwQC5HgADPhqBRb2eMcilwwUs5rxzLXX/wer1rhc8guUtK
 4DvGZ+wPZd2znQvNAWxsG5azoSfO8J3ibVIkMmaW3NTqyLvnbT9KiNvYuUVg/9NQ
 Vcb1d9UnrhAIWzpfpeo4rzr+QIq4j46YHBqreiW/l952Apxpp2lJmV/btK0noPmA
 MDTknHc2772QYIBtn01D
 =TsdU
 -----END PGP SIGNATURE-----

Merge tag 'pinctrl-v4.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl

Pull pin control updates from Linus Walleij:
 "An almost purely driver related set of changes with no major changes
  to the framework, only one patch adding an unlocked version of the
  pinctrl_find_gpio_range_from_pin() library call.

  New drivers:
   - ST Microelectronics STM32 MCU support: this is a non-MMU low-end
     platform for IoT things (etc).
   - Microchip PIC32 MCU support: same story as for STM32.

  New subdrivers:
   - Allwinner SunXi H3 R_PIO controller support.
   - Qualcomm IPQ4019 support.
   - MediaTek MT2701 and MT7623.
   - Allwinner A64

  Non-critical fixes:
   - gpio_disable_free() for the Vybrid.
   - pinctrl single: use a separate lockdep class.

  Misc:
   - Substantial cleanups and rewrites for the Super-H PFC driver and
     subdrivers.
   - Various fixes and cleanups, especially Paul Gortmakers work to make
     nonmodular drivers nonmodular"

* tag 'pinctrl-v4.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (75 commits)
  pinctrl: single: Use a separate lockdep class
  drivers: pinctrl: add driver for Allwinner A64 SoC
  pinctrl: Broadcom Northstar2 pinctrl device tree bindings
  pinctrl: amlogic: Make driver independent from two-domain configuration
  pinctrl: amlogic: Separate some pin functions for Meson8 / Meson8b
  pinctrl: at91: use __maybe_unused to hide pm functions
  pinctrl: sh-pfc: core: don't open code of_device_get_match_data()
  pinctrl: uniphier: rename CONFIG options and file names
  pinctrl: sunxi: make A80 explicitly non-modular
  pinctrl: stm32: make explicitly non-modular
  pinctrl: sh-pfc: make explicitly non-modular
  pinctrl: meson: make explicitly non-modular
  pinctrl: pinctrl-mt6397 driver explicitly non-modular
  pinctrl: sunxi: does not need module.h
  pinctrl: pxa2xx: export symbols
  pinctrl: sunxi: Change mux setting on PI irq pins
  pinctrl: sunxi: Remove non existing irq's
  pinctrl: imx: attach iomuxc device to gpr syscon
  pinctrl-bcm2835: Fix cut-and-paste error in "pull" parsing
  pinctrl: lpc1850-scu: document nxp,gpio-pin-interrupt
  ...
2016-03-15 20:23:13 -07:00
Simon Horman 09c32427c9 clk: renesas: Rename header file renesas.h
This is part of an ongoing process to migrate from ARCH_SHMOBILE to
ARCH_RENESAS the motivation for which being that RENESAS seems to be a more
appropriate name than SHMOBILE for the majority of Renesas ARM based SoCs.

Along with the above mentioned Kconfig changes it seems appropriate
to also rename files.

Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Acked-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
2016-03-15 18:12:14 -07:00
Arnd Bergmann 4d2508a559 ARM: pxa/raumfeld: use PROPERTY_ENTRY_INTEGER to define props
gcc-6.0 notices that the use of the property_entry in this file that
was recently introduced cannot work right, as we initialize the wrong
field:

raumfeld.c:387:3: error: the address of 'raumfeld_rotary_encoder_steps' will always evaluate as 'true' [-Werror=address]
   DEV_PROP_U32, 1, &raumfeld_rotary_encoder_steps, },
   ^~~~~~~~~~~~
raumfeld.c:389:3: error: the address of 'raumfeld_rotary_encoder_axis' will always evaluate as 'true' [-Werror=address]
   DEV_PROP_U32, 1, &raumfeld_rotary_encoder_axis, },
   ^~~~~~~~~~~~
raumfeld.c:391:3: error: the address of 'raumfeld_rotary_encoder_relative_axis' will always evaluate as 'true' [-Werror=address]
   DEV_PROP_U32, 1, &raumfeld_rotary_encoder_relative_axis, },
   ^~~~~~~~~~~~

The problem appears to stem from relying on an old definition of
'struct property', but it has changed several times since the code
could have last been correct.

This changes the code to use the PROPERTY_ENTRY_INTEGER() macro instead,
which works fine for the current definition and is a safer way of doing
the initialization.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: a9e340dce3 ("Input: rotary_encoder - move away from platform data structure")
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2016-03-15 16:56:46 -07:00
Christian Borntraeger a75e1f637c x86: also use debug_pagealloc_enabled() for free_init_pages
we want to couple all debugging features with debug_pagealloc_enabled()
and not with the config option CONFIG_DEBUG_PAGEALLOC.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Suggested-by: David Rientjes <rientjes@google.com>
Acked-by: David Rientjes <rientjes@google.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Laura Abbott <labbott@fedoraproject.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15 16:55:16 -07:00
Christian Borntraeger 10917b8393 s390: query dynamic DEBUG_PAGEALLOC setting
We can use debug_pagealloc_enabled() to check if we can map the identity
mapping with 1MB/2GB pages as well as to print the current setting in
dump_stack.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Laura Abbott <labbott@fedoraproject.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15 16:55:16 -07:00
Christian Borntraeger 288cf3c64e x86: query dynamic DEBUG_PAGEALLOC setting
We can use debug_pagealloc_enabled() to check if we can map the identity
mapping with 2MB pages.  We can also add the state into the dump_stack
output.

The patch does not touch the code for the 1GB pages, which ignored
CONFIG_DEBUG_PAGEALLOC.  Do we need to fence this as well?

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Laura Abbott <labbott@fedoraproject.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15 16:55:16 -07:00
Sudip Mukherjee e928f35061 blackfin: define dummy pgprot_writecombine for !MMU
blackfin allmodconfig build fails with the error:

  ../sound/core/pcm_native.c: In function 'snd_pcm_lib_default_mmap':
  ../sound/core/pcm_native.c:3386:24: error: implicit declaration of function 'pgprot_writecombine' [-Werror=implicit-function-declaration]
     area->vm_page_prot = pgprot_writecombine(area->vm_page_prot);
                          ^
  ../sound/core/pcm_native.c:3386:22: error: incompatible types when assigning to type 'pgprot_t {aka struct <anonymous>}' from type 'int'
     area->vm_page_prot = pgprot_writecombine(area->vm_page_prot);
                        ^

When !MMU, asm-generic will not define default pgprot_writecombine, so
blackfin needs to define it by itself.

The patch idea is from commit 65b9ab888c ("arch/c6x/include/asm/pgtable.h:
define dummy pgprot_writecombine for !MMU")

Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Cc: Steven Miao <realmz6@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15 16:55:16 -07:00
Sudip Mukherjee 3701dc8151 m32r: mm: fix build warning
While building we are getting warnings:

  arch/m32r/mm/init.c:63:17: warning: unused variable 'low'
  arch/m32r/mm/init.c:62:17: warning: unused variable 'max_dma'

max_dma and low are only used if CONFIG_MMU is defined.  Lets declare
the variables inside the #ifdef.

Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15 16:55:16 -07:00
Dmitry Torokhov 245f0db0de Linux 4.5
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJW5j4RAAoJEHm+PkMAQRiGhVEH/0qZbM1J+WnCK92bm9+inCnB
 JO2JViGIuCQB5BxljVMil2dzrw85D+dC7+fryr0wVBhhBlr0lXPJGSYCYYTEaI20
 Wco5YlTmjRirUwmxWzBXvB5kvTdIaNfNYDcFch6lbsaLUNgqydNKtk08ckO/4k0D
 AmaShW8swBiXE/RmHuj8H41ksHsnY8W62dlczEaAIfr4kluPX/kKnyXpmpvmZm1j
 sM4fskPlq+Jz5pOXXFsFfrhiBgpSUnwSj1tNwK5+DkmaVnWOkPuwkqLBWqpy4pzm
 GTeDBdf5/ixGxgNsZ2VWtbPnc2wEP7SIcu45MU7QFw5kqwDN2nN63BRVXI5Z5qY=
 =RFx2
 -----END PGP SIGNATURE-----

Merge tag 'v4.5' into next

Merge with Linux 4.5 to get PROPERTY_ENTRY_INTEGER() that is needed to
fix pxa/raumfeld rotary encoder properties.
2016-03-15 16:54:45 -07:00
Linus Torvalds 710d60cbf1 Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull cpu hotplug updates from Thomas Gleixner:
 "This is the first part of the ongoing cpu hotplug rework:

   - Initial implementation of the state machine

   - Runs all online and prepare down callbacks on the plugged cpu and
     not on some random processor

   - Replaces busy loop waiting with completions

   - Adds tracepoints so the states can be followed"

More detailed commentary on this work from an earlier email:
 "What's wrong with the current cpu hotplug infrastructure?

   - Asymmetry

     The hotplug notifier mechanism is asymmetric versus the bringup and
     teardown.  This is mostly caused by the notifier mechanism.

   - Largely undocumented dependencies

     While some notifiers use explicitely defined notifier priorities,
     we have quite some notifiers which use numerical priorities to
     express dependencies without any documentation why.

   - Control processor driven

     Most of the bringup/teardown of a cpu is driven by a control
     processor.  While it is understandable, that preperatory steps,
     like idle thread creation, memory allocation for and initialization
     of essential facilities needs to be done before a cpu can boot,
     there is no reason why everything else must run on a control
     processor.  Before this patch series, bringup looks like this:

       Control CPU                     Booting CPU

       do preparatory steps
       kick cpu into life

                                       do low level init

       sync with booting cpu           sync with control cpu

       bring the rest up

   - All or nothing approach

     There is no way to do partial bringups.  That's something which is
     really desired because we waste e.g.  at boot substantial amount of
     time just busy waiting that the cpu comes to life.  That's stupid
     as we could very well do preparatory steps and the initial IPI for
     other cpus and then go back and do the necessary low level
     synchronization with the freshly booted cpu.

   - Minimal debuggability

     Due to the notifier based design, it's impossible to switch between
     two stages of the bringup/teardown back and forth in order to test
     the correctness.  So in many hotplug notifiers the cancel
     mechanisms are either not existant or completely untested.

   - Notifier [un]registering is tedious

     To [un]register notifiers we need to protect against hotplug at
     every callsite.  There is no mechanism that bringup/teardown
     callbacks are issued on the online cpus, so every caller needs to
     do it itself.  That also includes error rollback.

  What's the new design?

     The base of the new design is a symmetric state machine, where both
     the control processor and the booting/dying cpu execute a well
     defined set of states.  Each state is symmetric in the end, except
     for some well defined exceptions, and the bringup/teardown can be
     stopped and reversed at almost all states.

     So the bringup of a cpu will look like this in the future:

       Control CPU                     Booting CPU

       do preparatory steps
       kick cpu into life

                                       do low level init

       sync with booting cpu           sync with control cpu

                                       bring itself up

     The synchronization step does not require the control cpu to wait.
     That mechanism can be done asynchronously via a worker or some
     other mechanism.

     The teardown can be made very similar, so that the dying cpu cleans
     up and brings itself down.  Cleanups which need to be done after
     the cpu is gone, can be scheduled asynchronously as well.

  There is a long way to this, as we need to refactor the notion when a
  cpu is available.  Today we set the cpu online right after it comes
  out of the low level bringup, which is not really correct.

  The proper mechanism is to set it to available, i.e. cpu local
  threads, like softirqd, hotplug thread etc. can be scheduled on that
  cpu, and once it finished all booting steps, it's set to online, so
  general workloads can be scheduled on it.  The reverse happens on
  teardown.  First thing to do is to forbid scheduling of general
  workloads, then teardown all the per cpu resources and finally shut it
  off completely.

  This patch series implements the basic infrastructure for this at the
  core level.  This includes the following:

   - Basic state machine implementation with well defined states, so
     ordering and prioritization can be expressed.

   - Interfaces to [un]register state callbacks

     This invokes the bringup/teardown callback on all online cpus with
     the proper protection in place and [un]installs the callbacks in
     the state machine array.

     For callbacks which have no particular ordering requirement we have
     a dynamic state space, so that drivers don't have to register an
     explicit hotplug state.

     If a callback fails, the code automatically does a rollback to the
     previous state.

   - Sysfs interface to drive the state machine to a particular step.

     This is only partially functional today.  Full functionality and
     therefor testability will be achieved once we converted all
     existing hotplug notifiers over to the new scheme.

   - Run all CPU_ONLINE/DOWN_PREPARE notifiers on the booting/dying
     processor:

       Control CPU                     Booting CPU

       do preparatory steps
       kick cpu into life

                                       do low level init

       sync with booting cpu           sync with control cpu
       wait for boot
                                       bring itself up

                                       Signal completion to control cpu

     In a previous step of this work we've done a full tree mechanical
     conversion of all hotplug notifiers to the new scheme.  The balance
     is a net removal of about 4000 lines of code.

     This is not included in this series, as we decided to take a
     different approach.  Instead of mechanically converting everything
     over, we will do a proper overhaul of the usage sites one by one so
     they nicely fit into the symmetric callback scheme.

     I decided to do that after I looked at the ugliness of some of the
     converted sites and figured out that their hotplug mechanism is
     completely buggered anyway.  So there is no point to do a
     mechanical conversion first as we need to go through the usage
     sites one by one again in order to achieve a full symmetric and
     testable behaviour"

* 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
  cpu/hotplug: Document states better
  cpu/hotplug: Fix smpboot thread ordering
  cpu/hotplug: Remove redundant state check
  cpu/hotplug: Plug death reporting race
  rcu: Make CPU_DYING_IDLE an explicit call
  cpu/hotplug: Make wait for dead cpu completion based
  cpu/hotplug: Let upcoming cpu bring itself fully up
  arch/hotplug: Call into idle with a proper state
  cpu/hotplug: Move online calls to hotplugged cpu
  cpu/hotplug: Create hotplug threads
  cpu/hotplug: Split out the state walk into functions
  cpu/hotplug: Unpark smpboot threads from the state machine
  cpu/hotplug: Move scheduler cpu_online notifier to hotplug core
  cpu/hotplug: Implement setup/removal interface
  cpu/hotplug: Make target state writeable
  cpu/hotplug: Add sysfs state interface
  cpu/hotplug: Hand in target state to _cpu_up/down
  cpu/hotplug: Convert the hotplugged cpu work to a state machine
  cpu/hotplug: Convert to a state machine for the control processor
  cpu/hotplug: Add tracepoints
  ...
2016-03-15 13:50:29 -07:00
Linus Torvalds df2e37c814 Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
 "The 4.6 pile of irq updates contains:

   - Support for IPI irqdomains to support proper integration of IPIs to
     and from coprocessors.  The first user of this new facility is
     MIPS.  The relevant MIPS patches come with the core to avoid merge
     ordering issues and have been acked by Ralf.

   - A new command line option to set the default interrupt affinity
     mask at boot time.

   - Support for some more new ARM and MIPS interrupt controllers:
     tango, alpine-msix and bcm6345-l1

   - Two small cleanups for x86/apic which we merged into irq/core to
     avoid yet another branch in x86 with two tiny commits.

   - The usual set of updates, cleanups in drivers/irqchip.  Mostly in
     the area of ARM-GIC, arada-37-xp and atmel chips.  Nothing
     outstanding here"

* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (56 commits)
  irqchip/irq-alpine-msi: Release the correct domain on error
  irqchip/mxs: Fix error check of of_io_request_and_map()
  irqchip/sunxi-nmi: Fix error check of of_io_request_and_map()
  genirq: Export IRQ functions for module use
  irqchip/gic/realview: Support more RealView DCC variants
  Documentation/bindings: Document the Alpine MSIX driver
  irqchip: Add the Alpine MSIX interrupt controller
  irqchip/gic-v3: Always return IRQ_SET_MASK_OK_DONE in gic_set_affinity
  irqchip/gic-v3-its: Mark its_init() and its children as __init
  irqchip/gic-v3: Remove gic_root_node variable from the ITS code
  irqchip/gic-v3: ACPI: Add redistributor support via GICC structures
  irqchip/gic-v3: Add ACPI support for GICv3/4 initialization
  irqchip/gic-v3: Refactor gic_of_init() for GICv3 driver
  x86/apic: Deinline _flat_send_IPI_mask, save ~150 bytes
  x86/apic: Deinline __default_send_IPI_*, save ~200 bytes
  dt-bindings: interrupt-controller: Add SoC-specific compatible string to Marvell ODMI
  irqchip/mips-gic: Add new DT property to reserve IPIs
  MIPS: Delete smp-gic.c
  MIPS: Make smp CMP, CPS and MT use the new generic IPI functions
  MIPS: Add generic SMP IPI support
  ...
2016-03-15 12:48:48 -07:00
Linus Torvalds 8a284c062e Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer updates from Thomas Gleixner:
 "The timer department delivers this time:

   - Support for cross clock domain timestamps in the core code plus a
     first user.  That allows more precise timestamping for PTP and
     later for audio and other peripherals.

     The ptp/e1000e patches have been acked by the relevant maintainers
     and are carried in the timer tree to avoid merge ordering issues.

   - Support for unregistering the current clocksource watchdog.  That
     lifts a limitation for switching clocksources which has been there
     from day 1

   - The usual pile of fixes and updates to the core and the drivers.
     Nothing outstanding and exciting"

* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (26 commits)
  time/timekeeping: Work around false positive GCC warning
  e1000e: Adds hardware supported cross timestamp on e1000e nic
  ptp: Add PTP_SYS_OFFSET_PRECISE for driver crosstimestamping
  x86/tsc: Always Running Timer (ART) correlated clocksource
  hrtimer: Revert CLOCK_MONOTONIC_RAW support
  time: Add history to cross timestamp interface supporting slower devices
  time: Add driver cross timestamp interface for higher precision time synchronization
  time: Remove duplicated code in ktime_get_raw_and_real()
  time: Add timekeeping snapshot code capturing system time and counter
  time: Add cycles to nanoseconds translation
  jiffies: Use CLOCKSOURCE_MASK instead of constant
  clocksource: Introduce clocksource_freq2mult()
  clockevents/drivers/exynos_mct: Implement ->set_state_oneshot_stopped()
  clockevents/drivers/arm_global_timer: Implement ->set_state_oneshot_stopped()
  clockevents/drivers/arm_arch_timer: Implement ->set_state_oneshot_stopped()
  clocksource/drivers/arm_global_timer: Register delay timer
  clocksource/drivers/lpc32xx: Support timer-based ARM delay
  clocksource/drivers/lpc32xx: Support periodic mode
  clocksource/drivers/lpc32xx: Don't use the prescaler counter for clockevents
  clocksource/drivers/rockchip: Add err handle for rk_timer_init
  ...
2016-03-15 12:13:56 -07:00
Linus Torvalds ae465beeff Merge branch 'x86-timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 timer update from Ingo Molnar:
 "A single simplification of the x86 TSC code"

* 'x86-timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/tsc: Use topology functions
2016-03-15 11:29:24 -07:00
Linus Torvalds 8ab84ef699 Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 core platform updates from Ingo Molnar:
 "Intel Quark and Geode SoC platform updates"

* 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/platform/intel/quark: Drop IMR lock bit support
  x86/platform/intel/mid: Remove dead code
  x86/platform: Make platform/geode/net5501.c explicitly non-modular
  x86/platform: Make platform/geode/alix.c explicitly non-modular
  x86/platform: Make platform/geode/geos.c explicitly non-modular
  x86/platform: Make platform/intel-quark/imr_selftest.c explicitly non-modular
  x86/platform: Make platform/intel-quark/imr.c explicitly non-modular
2016-03-15 11:20:44 -07:00
Linus Torvalds 13c76ad872 Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 mm updates from Ingo Molnar:
 "The main changes in this cycle were:

   - Enable full ASLR randomization for 32-bit programs (Hector
     Marco-Gisbert)

   - Add initial minimal INVPCI support, to flush global mappings (Andy
     Lutomirski)

   - Add KASAN enhancements (Andrey Ryabinin)

   - Fix mmiotrace for huge pages (Karol Herbst)

   - ... misc cleanups and small enhancements"

* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mm/32: Enable full randomization on i386 and X86_32
  x86/mm/kmmio: Fix mmiotrace for hugepages
  x86/mm: Avoid premature success when changing page attributes
  x86/mm/ptdump: Remove paravirt_enabled()
  x86/mm: Fix INVPCID asm constraint
  x86/dmi: Switch dmi_remap() from ioremap() [uncached] to ioremap_cache()
  x86/mm: If INVPCID is available, use it to flush global mappings
  x86/mm: Add a 'noinvpcid' boot option to turn off INVPCID
  x86/mm: Add INVPCID helpers
  x86/kasan: Write protect kasan zero shadow
  x86/kasan: Clear kasan_zero_page after TLB flush
  x86/mm/numa: Check for failures in numa_clear_kernel_node_hotplug()
  x86/mm/numa: Clean up numa_clear_kernel_node_hotplug()
  x86/mm: Make kmap_prot into a #define
  x86/mm/32: Set NX in __supported_pte_mask before enabling paging
  x86/mm: Streamline and restore probe_memory_block_size()
2016-03-15 10:45:39 -07:00
Linus Torvalds 9cf8d6360c Merge branch 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 microcode updates from Ingo Molnar:
 "The biggest change in this cycle was the separation of the microcode
  loading mechanism from the initrd code plus the support of built-in
  microcode images.

  There were also lots cleanups and general restructuring (by Borislav
  Petkov)"

* 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
  x86/microcode/intel: Drop orig_sum from ext signature checksum
  x86/microcode/intel: Improve microcode sanity-checking error messages
  x86/microcode/intel: Merge two consecutive if-statements
  x86/microcode/intel: Get rid of DWSIZE
  x86/microcode/intel: Change checksum variables to u32
  x86/microcode: Use kmemdup() rather than duplicating its implementation
  x86/microcode: Remove unnecessary paravirt_enabled check
  x86/microcode: Document builtin microcode loading method
  x86/microcode/AMD: Issue microcode updated message later
  x86/microcode/intel: Cleanup get_matching_model_microcode()
  x86/microcode/intel: Remove unused arg of get_matching_model_microcode()
  x86/microcode/intel: Rename mc_saved_in_initrd
  x86/microcode/intel: Use *wrmsrl variants
  x86/microcode/intel: Cleanup apply_microcode_intel()
  x86/microcode/intel: Move the BUG_ON up and turn it into WARN_ON
  x86/microcode/intel: Rename mc_intel variable to mc
  x86/microcode/intel: Rename mc_saved_count to num_saved
  x86/microcode/intel: Rename local variables of type struct mc_saved_data
  x86/microcode/AMD: Drop redundant printk prefix
  x86/microcode: Issue update message only once
  ...
2016-03-15 10:39:22 -07:00
Linus Torvalds ecc026bff6 Merge branch 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fpu updates from Ingo Molnar:
 "The biggest change in terms of impact is the changing of the FPU
  context switch model to 'eagerfpu' for all CPU types, via: commit
  58122bf1d856: "x86/fpu: Default eagerfpu=on on all CPUs"

  This makes all FPU saves and restores synchronous and makes the FPU
  code a lot more obvious to read.  In the next cycle, if this change is
  problem free, we'll remove the old lazy FPU restore code altogether.

  This change flushed out some old bugs, which should all be fixed by
  now, BYMMV"

* 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/fpu: Default eagerfpu=on on all CPUs
  x86/fpu: Speed up lazy FPU restores slightly
  x86/fpu: Fold fpu_copy() into fpu__copy()
  x86/fpu: Fix FNSAVE usage in eagerfpu mode
  x86/fpu: Fix math emulation in eager fpu mode
2016-03-15 10:23:56 -07:00
Linus Torvalds fa53c48939 Merge branch 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 build update from Ingo Molnar:
 "A single adjustment of a defconfig value"

* 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/defconfigs/32: Set CONFIG_FRAME_WARN to the Kconfig default
2016-03-15 10:16:48 -07:00
Linus Torvalds 42576bee6e Merge branch 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 boot updates from Ingo Molnar:
 "Early command line options parsing enhancements from Dave Hansen, plus
  minor cleanups and enhancements"

* 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/boot: Remove unused 'is_big_kernel' variable
  x86/boot: Use proper array element type in memset() size calculation
  x86/boot: Pass in size to early cmdline parsing
  x86/boot: Simplify early command line parsing
  x86/boot: Fix early command-line parsing when partial word matches
  x86/boot: Fix early command-line parsing when matching at end
  x86/boot: Simplify kernel load address alignment check
  x86/boot: Micro-optimize reset_early_page_tables()
2016-03-15 10:02:25 -07:00
Linus Torvalds ba33ea811e Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 asm updates from Ingo Molnar:
 "This is another big update. Main changes are:

   - lots of x86 system call (and other traps/exceptions) entry code
     enhancements.  In particular the complex parts of the 64-bit entry
     code have been migrated to C code as well, and a number of dusty
     corners have been refreshed.  (Andy Lutomirski)

   - vDSO special mapping robustification and general cleanups (Andy
     Lutomirski)

   - cpufeature refactoring, cleanups and speedups (Borislav Petkov)

   - lots of other changes ..."

* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (64 commits)
  x86/cpufeature: Enable new AVX-512 features
  x86/entry/traps: Show unhandled signal for i386 in do_trap()
  x86/entry: Call enter_from_user_mode() with IRQs off
  x86/entry/32: Change INT80 to be an interrupt gate
  x86/entry: Improve system call entry comments
  x86/entry: Remove TIF_SINGLESTEP entry work
  x86/entry/32: Add and check a stack canary for the SYSENTER stack
  x86/entry/32: Simplify and fix up the SYSENTER stack #DB/NMI fixup
  x86/entry: Only allocate space for tss_struct::SYSENTER_stack if needed
  x86/entry: Vastly simplify SYSENTER TF (single-step) handling
  x86/entry/traps: Clear DR6 early in do_debug() and improve the comment
  x86/entry/traps: Clear TIF_BLOCKSTEP on all debug exceptions
  x86/entry/32: Restore FLAGS on SYSEXIT
  x86/entry/32: Filter NT and speed up AC filtering in SYSENTER
  x86/entry/compat: In SYSENTER, sink AC clearing below the existing FLAGS test
  selftests/x86: In syscall_nt, test NT|TF as well
  x86/asm-offsets: Remove PARAVIRT_enabled
  x86/entry/32: Introduce and use X86_BUG_ESPFIX instead of paravirt_enabled
  uprobes: __create_xol_area() must nullify xol_mapping.fault
  x86/cpufeature: Create a new synthetic cpu capability for machine check recovery
  ...
2016-03-15 09:32:27 -07:00
Bjorn Helgaas 6e6f498b03 Merge branch 'pci/resource' into next
* pci/resource:
  PCI: Simplify pci_create_attr() control flow
  PCI: Don't leak memory if sysfs_create_bin_file() fails
  PCI: Simplify sysfs ROM cleanup
  PCI: Remove unused IORESOURCE_ROM_COPY and IORESOURCE_ROM_BIOS_COPY
  MIPS: Loongson 3: Keep CPU physical (not virtual) addresses in shadow ROM resource
  MIPS: Loongson 3: Use temporary struct resource * to avoid repetition
  ia64/PCI: Keep CPU physical (not virtual) addresses in shadow ROM resource
  ia64/PCI: Use ioremap() instead of open-coded equivalent
  ia64/PCI: Use temporary struct resource * to avoid repetition
  PCI: Clean up pci_map_rom() whitespace
  PCI: Remove arch-specific IORESOURCE_ROM_SHADOW size from sysfs
  PCI: Set ROM shadow location in arch code, not in PCI core
  PCI: Don't enable/disable ROM BAR if we're using a RAM shadow copy
  PCI: Don't assign or reassign immutable resources
  PCI: Mark shadow copy of VGA ROM as IORESOURCE_PCI_FIXED
  x86/PCI: Mark Broadwell-EP Home Agent & PCU as having non-compliant BARs
  PCI: Disable IO/MEM decoding for devices with non-compliant BARs
2016-03-15 08:56:28 -05:00
Bjorn Helgaas cfeb8139a1 Merge branch 'pci/host-hv' into next
* pci/host-hv:
  PCI: hv: Add paravirtual PCI front-end for Microsoft Hyper-V VMs
  PCI: Look up IRQ domain by fwnode_handle
  PCI: Add fwnode_handle to x86 pci_sysdata
2016-03-15 08:56:16 -05:00
Bjorn Helgaas 562df5c852 Merge branch 'pci/host-designware' into next
* pci/host-designware:
  PCI: designware: Add driver for prototyping kits based on ARC SDP
  PCI: designware: Add default link up check if sub-driver doesn't override
  PCI: designware: Add generic dw_pcie_wait_for_link()
  ARC: Add PCI support
2016-03-15 08:55:52 -05:00
Bjorn Helgaas c334f9c89e Merge branches 'pci/host-altera', 'pci/host-imx6', 'pci/host-keystone', 'pci/host-rcar', 'pci/host-tegra', 'pci/host-thunder', 'pci/host-vmd', 'pci/host-xilinx' and 'pci/host-xilinx-nwl' into next
* pci/host-altera:
  PCI: altera: Fix altera_pcie_link_is_up()

* pci/host-imx6:
  PCI: imx6: Add DT bindings to configure PHY Tx driver settings

* pci/host-keystone:
  PCI: keystone: Defer probing if devm_phy_get() returns -EPROBE_DEFER

* pci/host-rcar:
  PCI: rcar: Depend on ARCH_RENESAS, not ARCH_SHMOBILE

* pci/host-tegra:
  PCI: tegra: Remove misleading PHYS_OFFSET
  PCI: tegra: Track bus -> CPU mapping
  PCI: tegra: Remove unused struct tegra_pcie.num_ports field
  PCI: tegra: Implement ->{add,remove}_bus() callbacks
  PCI: Add pci_ops.{add,remove}_bus() callbacks

* pci/host-thunder:
  PCI: thunder: Add driver for ThunderX-pass{1,2} on-chip devices
  PCI: thunder: Add PCIe host driver for ThunderX processors
  PCI: generic: Expose pci_host_common_probe() for use by other drivers
  PCI: generic: Add pci_host_common_probe(), based on gen_pci_probe()
  PCI: generic: Move structure definitions to separate header file

* pci/host-vmd:
  x86/PCI: VMD: Attach VMD resources to parent domain's resource tree
  x86/PCI: VMD: Set bus resource start to 0
  x86/PCI: VMD: Document code for maintainability

* pci/host-xilinx:
  microblaze/PCI: Support generic Xilinx AXI PCIe Host Bridge IP driver
  PCI: xilinx: Update Zynq binding with Microblaze node
  PCI: xilinx: Don't call pci_fixup_irqs() on Microblaze
  PCI: xilinx: Remove dependency on ARM-specific struct hw_pci
  PCI: xilinx: Use of_pci_get_host_bridge_resources() to parse DT

* pci/host-xilinx-nwl:
  PCI: xilinx-nwl: Add support for Xilinx NWL PCIe Host Controller
2016-03-15 08:55:19 -05:00
Bjorn Helgaas 18e5e6913b Merge branches 'pci/aer', 'pci/enumeration', 'pci/kconfig', 'pci/misc', 'pci/virtualization' and 'pci/vpd' into next
* pci/aer:
  PCI/AER: Log aer_inject error injections
  PCI/AER: Log actual error causes in aer_inject
  PCI/AER: Use dev_warn() in aer_inject
  PCI/AER: Fix aer_inject error codes

* pci/enumeration:
  PCI: Fix broken URL for Dell biosdevname

* pci/kconfig:
  PCI: Cleanup pci/pcie/Kconfig whitespace
  PCI: Include pci/hotplug Kconfig directly from pci/Kconfig
  PCI: Include pci/pcie/Kconfig directly from pci/Kconfig

* pci/misc:
  PCI: Add PCI_CLASS_SERIAL_USB_DEVICE definition
  PCI: Add QEMU top-level IDs for (sub)vendor & device
  unicore32: Remove unused HAVE_ARCH_PCI_SET_DMA_MASK definition
  PCI: Consolidate PCI DMA constants and interfaces in linux/pci-dma-compat.h
  PCI: Move pci_dma_* helpers to common code
  frv/PCI: Remove stray pci_{alloc,free}_consistent() declaration

* pci/virtualization:
  PCI: Wait for up to 1000ms after FLR reset
  PCI: Support SR-IOV on any function type

* pci/vpd:
  PCI: Prevent VPD access for buggy devices
  PCI: Sleep rather than busy-wait for VPD access completion
  PCI: Fold struct pci_vpd_pci22 into struct pci_vpd
  PCI: Rename VPD symbols to remove unnecessary "pci22"
  PCI: Remove struct pci_vpd_ops.release function pointer
  PCI: Move pci_vpd_release() from header file to pci/access.c
  PCI: Move pci_read_vpd() and pci_write_vpd() close to other VPD code
  PCI: Determine actual VPD size on first access
  PCI: Use bitfield instead of bool for struct pci_vpd_pci22.busy
  PCI: Allow access to VPD attributes with size 0
  PCI: Update VPD definitions
2016-03-15 08:55:02 -05:00
Lada Trimasova df420fd688 arc: [plat-nsimosci*] use ezchip network driver
Since ezchip network driver was adapted to little endian architecture
this patch provides the corresponding arch/arc/{boot/dts,configs}/ updates so
we can switch over to this device-model/driver for OSCI platform.

Cc: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Lada Trimasova <ltrimas@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-03-15 16:05:59 +05:30
Vineet Gupta b31ac42697 ARCv2: LLSC: software backoff is NOT needed starting HS2.1c
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-03-15 16:04:44 +05:30
Vitaly Kuznetsov 743146db07 x86/video: Don't assume all FB devices are PCI devices
When booting Hyper-V Generation 2 guests KASAN reports the following
out-of-bounds access:

  BUG: KASAN: slab-out-of-bounds in fb_is_primary_device+0x58/0x70 at addr ffff880079cf0eb0
  Read of size 8 by task swapper/0/1
  ...
   [<ffffffff81581308>] dump_stack+0x63/0x8b
   [<ffffffff812e1f99>] print_trailer+0xf9/0x150
   [<ffffffff812e7344>] object_err+0x34/0x40
   [<ffffffff812e9630>] kasan_report_error+0x230/0x550
   [<ffffffff812e9ee8>] kasan_report+0x58/0x60
   [<ffffffff812e4500>] ? ___slab_alloc+0x80/0x490
   [<ffffffff81878a28>] ? fb_is_primary_device+0x58/0x70
   [<ffffffff812e87cd>] __asan_load8+0x5d/0x70
   [<ffffffff81878a28>] fb_is_primary_device+0x58/0x70
   [<ffffffff8162357a>] register_framebuffer+0xda/0x5b0
   [<ffffffff816234a0>] ? remove_conflicting_framebuffers+0x50/0x50
  ...

The issue is caused by the to_pci_dev() call with no check that the given
info->device is in fact a PCI device and some FB devices (Hyper-V FB, EFI
FB,...) are not.

While on it, clean up the function.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Bjorn Helgaas <helgaas@kernel.org>
Cc: Cathy Avery <cavery@redhat.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1458030033-10122-1-git-send-email-vkuznets@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-15 11:08:26 +01:00
Linus Torvalds d4e796152a Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
 "The main changes in this cycle are:

   - Make schedstats a runtime tunable (disabled by default) and
     optimize it via static keys.

     As most distributions enable CONFIG_SCHEDSTATS=y due to its
     instrumentation value, this is a nice performance enhancement.
     (Mel Gorman)

   - Implement 'simple waitqueues' (swait): these are just pure
     waitqueues without any of the more complex features of full-blown
     waitqueues (callbacks, wake flags, wake keys, etc.).  Simple
     waitqueues have less memory overhead and are faster.

     Use simple waitqueues in the RCU code (in 4 different places) and
     for handling KVM vCPU wakeups.

     (Peter Zijlstra, Daniel Wagner, Thomas Gleixner, Paul Gortmaker,
     Marcelo Tosatti)

   - sched/numa enhancements (Rik van Riel)

   - NOHZ performance enhancements (Rik van Riel)

   - Various sched/deadline enhancements (Steven Rostedt)

   - Various fixes (Peter Zijlstra)

   - ... and a number of other fixes, cleanups and smaller enhancements"

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (29 commits)
  sched/cputime: Fix steal_account_process_tick() to always return jiffies
  sched/deadline: Remove dl_new from struct sched_dl_entity
  Revert "kbuild: Add option to turn incompatible pointer check into error"
  sched/deadline: Remove superfluous call to switched_to_dl()
  sched/debug: Fix preempt_disable_ip recording for preempt_disable()
  sched, time: Switch VIRT_CPU_ACCOUNTING_GEN to jiffy granularity
  time, acct: Drop irq save & restore from __acct_update_integrals()
  acct, time: Change indentation in __acct_update_integrals()
  sched, time: Remove non-power-of-two divides from __acct_update_integrals()
  sched/rt: Kick RT bandwidth timer immediately on start up
  sched/debug: Add deadline scheduler bandwidth ratio to /proc/sched_debug
  sched/debug: Move sched_domain_sysctl to debug.c
  sched/debug: Move the /sys/kernel/debug/sched_features file setup into debug.c
  sched/rt: Fix PI handling vs. sched_setscheduler()
  sched/core: Remove duplicated sched_group_set_shares() prototype
  sched/fair: Consolidate nohz CPU load update code
  sched/fair: Avoid using decay_load_missed() with a negative value
  sched/deadline: Always calculate end of period on sched_yield()
  sched/cgroup: Fix cgroup entity load tracking tear-down
  rcu: Use simple wait queues where possible in rcutree
  ...
2016-03-14 19:14:06 -07:00
Linus Torvalds d88bfe1d68 Merge branch 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RAS updates from Ingo Molnar:
 "Various RAS updates:

   - AMD MCE support updates for future CPUs, fixes and 'SMCA' (Scalable
     MCA) error decoding support (Aravind Gopalakrishnan)

   - x86 memcpy_mcsafe() support, to enable smart(er) hardware error
     recovery in NVDIMM drivers, based on an extension of the x86
     exception handling code.  (Tony Luck)"

* 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  EDAC/sb_edac: Fix computation of channel address
  x86/mm, x86/mce: Add memcpy_mcsafe()
  x86/mce/AMD: Document some functionality
  x86/mce: Clarify comments regarding deferred error
  x86/mce/AMD: Fix logic to obtain block address
  x86/mce/AMD, EDAC: Enable error decoding of Scalable MCA errors
  x86/mce: Move MCx_CONFIG MSR definitions
  x86/mce: Check for faults tagged in EXTABLE_CLASS_FAULT exception table entries
  x86/mm: Expand the exception table logic to allow new handling options
  x86/mce/AMD: Set MCAX Enable bit
  x86/mce/AMD: Carve out threshold block preparation
  x86/mce/AMD: Fix LVT offset configuration for thresholding
  x86/mce/AMD: Reduce number of blocks scanned per bank
  x86/mce/AMD: Do not perform shared bank check for future processors
  x86/mce: Fix order of AMD MCE init function call
2016-03-14 18:43:51 -07:00
Linus Torvalds e71c2c1eeb Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
 "Main kernel side changes:

   - Big reorganization of the x86 perf support code.  The old code grew
     organically deep inside arch/x86/kernel/cpu/perf* and its naming
     became somewhat messy.

     The new location is under arch/x86/events/, using the following
     cleaner hierarchy of source code files:

       perf/x86: Move perf_event.c .................. => x86/events/core.c
       perf/x86: Move perf_event_amd.c .............. => x86/events/amd/core.c
       perf/x86: Move perf_event_amd_ibs.c .......... => x86/events/amd/ibs.c
       perf/x86: Move perf_event_amd_iommu.[ch] ..... => x86/events/amd/iommu.[ch]
       perf/x86: Move perf_event_amd_uncore.c ....... => x86/events/amd/uncore.c
       perf/x86: Move perf_event_intel_bts.c ........ => x86/events/intel/bts.c
       perf/x86: Move perf_event_intel.c ............ => x86/events/intel/core.c
       perf/x86: Move perf_event_intel_cqm.c ........ => x86/events/intel/cqm.c
       perf/x86: Move perf_event_intel_cstate.c ..... => x86/events/intel/cstate.c
       perf/x86: Move perf_event_intel_ds.c ......... => x86/events/intel/ds.c
       perf/x86: Move perf_event_intel_lbr.c ........ => x86/events/intel/lbr.c
       perf/x86: Move perf_event_intel_pt.[ch] ...... => x86/events/intel/pt.[ch]
       perf/x86: Move perf_event_intel_rapl.c ....... => x86/events/intel/rapl.c
       perf/x86: Move perf_event_intel_uncore.[ch] .. => x86/events/intel/uncore.[ch]
       perf/x86: Move perf_event_intel_uncore_nhmex.c => x86/events/intel/uncore_nmhex.c
       perf/x86: Move perf_event_intel_uncore_snb.c   => x86/events/intel/uncore_snb.c
       perf/x86: Move perf_event_intel_uncore_snbep.c => x86/events/intel/uncore_snbep.c
       perf/x86: Move perf_event_knc.c .............. => x86/events/intel/knc.c
       perf/x86: Move perf_event_p4.c ............... => x86/events/intel/p4.c
       perf/x86: Move perf_event_p6.c ............... => x86/events/intel/p6.c
       perf/x86: Move perf_event_msr.c .............. => x86/events/msr.c

     (Borislav Petkov)

   - Update various x86 PMU constraint and hw support details (Stephane
     Eranian)

   - Optimize kprobes for BPF execution (Martin KaFai Lau)

   - Rewrite, refactor and fix the Intel uncore PMU driver code (Thomas
     Gleixner)

   - Rewrite, refactor and fix the Intel RAPL PMU code (Thomas Gleixner)

   - Various fixes and smaller cleanups.

  There are lots of perf tooling updates as well.  A few highlights:

  perf report/top:

     - Hierarchy histogram mode for 'perf top' and 'perf report',
       showing multiple levels, one per --sort entry: (Namhyung Kim)

       On a mostly idle system:

         # perf top --hierarchy -s comm,dso

       Then expand some levels and use 'P' to take a snapshot:

         # cat perf.hist.0
         -  92.32%         perf
               58.20%         perf
               22.29%         libc-2.22.so
                5.97%         [kernel]
                4.18%         libelf-0.165.so
                1.69%         [unknown]
         -   4.71%         qemu-system-x86
                3.10%         [kernel]
                1.60%         qemu-system-x86_64 (deleted)
         +   2.97%         swapper
         #

     - Add 'L' hotkey to dynamicly set the percent threshold for
       histogram entries and callchains, i.e.  dynamicly do what the
       --percent-limit command line option to 'top' and 'report' does.
       (Namhyung Kim)

  perf mem:

     - Allow specifying events via -e in 'perf mem record', also listing
       what events can be specified via 'perf mem record -e list' (Jiri
       Olsa)

  perf record:

     - Add 'perf record' --all-user/--all-kernel options, so that one
       can tell that all the events in the command line should be
       restricted to the user or kernel levels (Jiri Olsa), i.e.:

         perf record -e cycles:u,instructions:u

       is equivalent to:

         perf record --all-user -e cycles,instructions

     - Make 'perf record' collect CPU cache info in the perf.data file header:

         $ perf record usleep 1
         [ perf record: Woken up 1 times to write data ]
         [ perf record: Captured and wrote 0.017 MB perf.data (7 samples) ]
         $ perf report --header-only -I | tail -10 | head -8
         # CPU cache info:
         #  L1 Data                 32K [0-1]
         #  L1 Instruction          32K [0-1]
         #  L1 Data                 32K [2-3]
         #  L1 Instruction          32K [2-3]
         #  L2 Unified             256K [0-1]
         #  L2 Unified             256K [2-3]
         #  L3 Unified            4096K [0-3]

       Will be used in 'perf c2c' and eventually in 'perf diff' to
       allow, for instance running the same workload in multiple
       machines and then when using 'diff' show the hardware difference.
       (Jiri Olsa)

     - Improved support for Java, using the JVMTI agent library to do
       jitdumps that then will be inserted in synthesized
       PERF_RECORD_MMAP2 events via 'perf inject' pointed to synthesized
       ELF files stored in ~/.debug and keyed with build-ids, to allow
       symbol resolution and even annotation with source line info, see
       the changeset comments to see how to use it (Stephane Eranian)

  perf script/trace:

     - Decode data_src values (e.g.  perf.data files generated by 'perf
       mem record') in 'perf script': (Jiri Olsa)

         # perf script
           perf 693 [1] 4.088652: 1 cpu/mem-loads,ldlat=30/P: ffff88007d0b0f40 68100142 L1 hit|SNP None|TLB L1 or L2 hit|LCK No <SNIP>
                                                                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     - Improve support to 'data_src', 'weight' and 'addr' fields in
       'perf script' (Jiri Olsa)

     - Handle empty print fmts in 'perf script -s' i.e. when running
       python or perl scripts (Taeung Song)

  perf stat:

     - 'perf stat' now shows shadow metrics (insn per cycle, etc) in
       interval mode too.  E.g:

         # perf stat -I 1000 -e instructions,cycles sleep 1
         #         time   counts unit events
            1.000215928  519,620      instructions     #  0.69 insn per cycle
            1.000215928  752,003      cycles
         <SNIP>

     - Port 'perf kvm stat' to PowerPC (Hemant Kumar)

     - Implement CSV metrics output in 'perf stat' (Andi Kleen)

  perf BPF support:

     - Support converting data from bpf events in 'perf data' (Wang Nan)

     - Print bpf-output events in 'perf script': (Wang Nan).

         # perf record -e bpf-output/no-inherit,name=evt/ -e ./test_bpf_output_3.c/map:channel.event=evt/ usleep 1000
         # perf script
            usleep  4882 21384.532523:   evt:  ffffffff810e97d1 sys_nanosleep ([kernel.kallsyms])
             BPF output: 0000: 52 61 69 73 65 20 61 20  Raise a
                         0008: 42 50 46 20 65 76 65 6e  BPF even
                         0010: 74 21 00 00              t!..
             BPF string: "Raise a BPF event!"
         #

     - Add API to set values of map entries in a BPF object, be it
       individual map slots or ranges (Wang Nan)

     - Introduce support for the 'bpf-output' event (Wang Nan)

     - Add glue to read perf events in a BPF program (Wang Nan)

     - Improve support for bpf-output events in 'perf trace' (Wang Nan)

  ... and tons of other changes as well - see the shortlog and git log
  for details!"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (342 commits)
  perf stat: Add --metric-only support for -A
  perf stat: Implement --metric-only mode
  perf stat: Document CSV format in manpage
  perf hists browser: Check sort keys before hot key actions
  perf hists browser: Allow thread filtering for comm sort key
  perf tools: Add sort__has_comm variable
  perf tools: Recalc total periods using top-level entries in hierarchy
  perf tools: Remove nr_sort_keys field
  perf hists browser: Cleanup hist_browser__fprintf_hierarchy_entry()
  perf tools: Remove hist_entry->fmt field
  perf tools: Fix command line filters in hierarchy mode
  perf tools: Add more sort entry check functions
  perf tools: Fix hist_entry__filter() for hierarchy
  perf jitdump: Build only on supported archs
  tools lib traceevent: Add '~' operation within arg_num_eval()
  perf tools: Omit unnecessary cast in perf_pmu__parse_scale
  perf tools: Pass perf_hpp_list all the way through setup_sort_list
  perf tools: Fix perf script python database export crash
  perf jitdump: DWARF is also needed
  perf bench mem: Prepare the x86-64 build for upstream memcpy_mcsafe() changes
  ...
2016-03-14 17:58:53 -07:00
Linus Torvalds d09e356ad0 Merge branch 'mm-readonly-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull read-only kernel memory updates from Ingo Molnar:
 "This tree adds two (security related) enhancements to the kernel's
  handling of read-only kernel memory:

   - extend read-only kernel memory to a new class of formerly writable
     kernel data: 'post-init read-only memory' via the __ro_after_init
     attribute, and mark the ARM and x86 vDSO as such read-only memory.

     This kind of attribute can be used for data that requires a once
     per bootup initialization sequence, but is otherwise never modified
     after that point.

     This feature was based on the work by PaX Team and Brad Spengler.

     (by Kees Cook, the ARM vDSO bits by David Brown.)

   - make CONFIG_DEBUG_RODATA always enabled on x86 and remove the
     Kconfig option.  This simplifies the kernel and also signals that
     read-only memory is the default model and a first-class citizen.
     (Kees Cook)"

* 'mm-readonly-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  ARM/vdso: Mark the vDSO code read-only after init
  x86/vdso: Mark the vDSO code read-only after init
  lkdtm: Verify that '__ro_after_init' works correctly
  arch: Introduce post-init read-only memory
  x86/mm: Always enable CONFIG_DEBUG_RODATA and remove the Kconfig option
  mm/init: Add 'rodata=off' boot cmdline parameter to disable read-only kernel mappings
  asm-generic: Consolidate mark_rodata_ro()
2016-03-14 16:58:50 -07:00
Linus Torvalds 5ec942463b Merge branch 'mm-pat-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull dma_*_writecombine rename from Ingo Molnar:
 "Rename dma_*_writecombine() to dma_*_wc()

  This is a tree-wide API rename, to move the dma_*() write-combining
  APIs closer in name to their usual API families.  (The old API names
  are kept as compatibility wrappers to not introduce extra breakage.)

  The patch was Coccinelle generated"

* 'mm-pat-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  dma, mm/pat: Rename dma_*_writecombine() to dma_*_wc()
2016-03-14 16:31:41 -07:00
Linus Torvalds fbed0bc091 Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking changes from Ingo Molnar:
 "Various updates:

   - Futex scalability improvements: remove page lock use for shared
     futex get_futex_key(), which speeds up 'perf bench futex hash'
     benchmarks by over 40% on a 60-core Westmere.  This makes anon-mem
     shared futexes perform close to private futexes.  (Mel Gorman)

   - lockdep hash collision detection and fix (Alfredo Alvarez
     Fernandez)

   - lockdep testing enhancements (Alfredo Alvarez Fernandez)

   - robustify lockdep init by using hlists (Andrew Morton, Andrey
     Ryabinin)

   - mutex and csd_lock micro-optimizations (Davidlohr Bueso)

   - small x86 barriers tweaks (Michael S Tsirkin)

   - qspinlock updates (Waiman Long)"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits)
  locking/csd_lock: Use smp_cond_acquire() in csd_lock_wait()
  locking/csd_lock: Explicitly inline csd_lock*() helpers
  futex: Replace barrier() in unqueue_me() with READ_ONCE()
  locking/lockdep: Detect chain_key collisions
  locking/lockdep: Prevent chain_key collisions
  tools/lib/lockdep: Fix link creation warning
  tools/lib/lockdep: Add tests for AA and ABBA locking
  tools/lib/lockdep: Add userspace version of READ_ONCE()
  tools/lib/lockdep: Fix the build on recent kernels
  locking/qspinlock: Move __ARCH_SPIN_LOCK_UNLOCKED to qspinlock_types.h
  locking/mutex: Allow next waiter lockless wakeup
  locking/pvqspinlock: Enable slowpath locking count tracking
  locking/qspinlock: Use smp_cond_acquire() in pending code
  locking/pvqspinlock: Move lock stealing count tracking code into pv_queued_spin_steal_lock()
  locking/mcs: Fix mcs_spin_lock() ordering
  futex: Remove requirement for lock_page() in get_futex_key()
  futex: Rename barrier references in ordering guarantees
  locking/atomics: Update comment about READ_ONCE() and structures
  locking/lockdep: Eliminate lockdep_init()
  locking/lockdep: Convert hash tables to hlists
  ...
2016-03-14 15:50:44 -07:00
Linus Torvalds d37a14bb5f Merge branch 'core-resources-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull ram resource handling changes from Ingo Molnar:
 "Core kernel resource handling changes to support NVDIMM error
  injection.

  This tree introduces a new I/O resource type, IORESOURCE_SYSTEM_RAM,
  for System RAM while keeping the current IORESOURCE_MEM type bit set
  for all memory-mapped ranges (including System RAM) for backward
  compatibility.

  With this resource flag it no longer takes a strcmp() loop through the
  resource tree to find "System RAM" resources.

  The new resource type is then used to extend ACPI/APEI error injection
  facility to also support NVDIMM"

* 'core-resources-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  ACPI/EINJ: Allow memory error injection to NVDIMM
  resource: Kill walk_iomem_res()
  x86/kexec: Remove walk_iomem_res() call with GART type
  x86, kexec, nvdimm: Use walk_iomem_res_desc() for iomem search
  resource: Add walk_iomem_res_desc()
  memremap: Change region_intersects() to take @flags and @desc
  arm/samsung: Change s3c_pm_run_res() to use System RAM type
  resource: Change walk_system_ram() to use System RAM type
  drivers: Initialize resource entry to zero
  xen, mm: Set IORESOURCE_SYSTEM_RAM to System RAM
  kexec: Set IORESOURCE_SYSTEM_RAM for System RAM
  arch: Set IORESOURCE_SYSTEM_RAM flag for System RAM
  ia64: Set System RAM type and descriptor
  x86/e820: Set System RAM type and descriptor
  resource: Add I/O resource descriptor
  resource: Handle resource flags properly
  resource: Add System RAM resource type
2016-03-14 15:15:51 -07:00
Andrew Lunn ca3dfa51e6 dsa: Rename mv88e6123_61_65 to mv88e6123 to be consistent
All the drivers support multiple chips, but mv88e6123_61_65 is the
only one that reflects this in its naming. Change it to be consistent
with the other drivers.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-14 15:43:10 -04:00
Gregory CLEMENT 293fdc24fc ARM: dts: armada-xp-openblocks-ax3-4: Add BM support
Allow Openblock AX3 using hardware buffer management with mvneta.

Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-14 12:19:46 -04:00
Marcin Wojtas 9dd7a57e2c ARM: dts: armada-xp: enable buffer manager support on Armada XP boards
Since mvneta driver supports using hardware buffer management (BM), in
order to use it, board files have to be adjusted accordingly. This commit
enables BM on AXP-DB and AXP-GP in same manner - because number of ports
on those boards is the same as number of possible pools, each port is
supposed to use single pool for all kind of packets.

Moreover appropriate entry is added to 'soc' node ranges, as well as "okay"
status for 'bm' and 'bm-bppi' (internal SRAM) nodes.

Signed-off-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-14 12:19:45 -04:00
Marcin Wojtas ebae1376fd ARM: dts: armada-xp: add buffer manager nodes
Armada XP network controller supports hardware buffer management (BM).
Since it is now enabled in mvneta driver, appropriate nodes can be added
to armada-xp.dtsi - for the actual common BM unit (bm@c0000) and its
internal SRAM (bm-bppi), which is used for indirect access to buffer
pointer ring residing in DRAM.

Pools - ports mapping, bm-bppi entry in 'soc' node's ranges and optional
parameters are supposed to be set in board files.

Signed-off-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-14 12:19:45 -04:00
Marcin Wojtas c49e99c2b2 ARM: dts: armada-38x: enable buffer manager support on Armada 38x boards
Since mvneta driver supports using hardware buffer management (BM), in
order to use it, board files have to be adjusted accordingly. This commit
enables BM on:
* A385-DB-AP - each port has its own pool for long and common pool for
short packets,
* A388-ClearFog - same as above,
* A388-DB - to each port unique 'short' and 'long' pools are mapped,
* A388-GP - same as above.

Moreover appropriate entry is added to 'soc' node ranges, as well as "okay"
status for 'bm' and 'bm-bppi' (internal SRAM) nodes.

[gregory.clement@free-electrons.com: add suppport for the ClearFog board]

Signed-off-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-14 12:19:45 -04:00
Marcin Wojtas 4a547a5a46 ARM: dts: armada-38x: add buffer manager nodes
Armada 38x network controller supports hardware buffer management (BM).
Since it is now enabled in mvneta driver, appropriate nodes can be added
to armada-38x.dtsi - for the actual common BM unit (bm@c8000) and its
internal SRAM (bm-bppi), which is used for indirect access to buffer
pointer ring residing in DRAM.

Pools - ports mapping, bm-bppi entry in 'soc' node's ranges and optional
parameters are supposed to be set in board files.

Signed-off-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-14 12:19:45 -04:00
Sebastian Ott 80c544ded2 s390/pci: enforce fmb page boundary rule
The function measurement block must not cross a page boundary. Ensure
that by raising the alignment requirement to the smallest power of 2
larger than the size of the fmb.

Fixes: d0b088531 ("s390/pci: performance statistics and debug infrastructure")
Cc: stable@vger.kernel.org # v3.8+
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-03-14 16:19:09 +01:00
Rafael J. Wysocki 0d571b62dd Merge branch 'pm-tools'
* pm-tools:
  tools/power turbostat: bugfix: TDP MSRs print bits fixing
  tools/power turbostat: correct output for MSR_NHM_SNB_PKG_CST_CFG_CTL dump
  tools/power turbostat: call __cpuid() instead of __get_cpuid()
  tools/power turbostat: indicate SMX and SGX support
  tools/power turbostat: detect and work around syscall jitter
  tools/power turbostat: show GFX%rc6
  tools/power turbostat: show GFXMHz
  tools/power turbostat: show IRQs per CPU
  tools/power turbostat: make fewer systems calls
  tools/power turbostat: fix compiler warnings
  tools/power turbostat: add --out option for saving output in a file
  tools/power turbostat: re-name "%Busy" field to "Busy%"
  tools/power turbostat: Intel Xeon x200: fix turbo-ratio decoding
  tools/power turbostat: Intel Xeon x200: fix erroneous bclk value
  tools/power turbostat: allow sub-sec intervals
  tools/power turbostat: Decode MSR_MISC_PWR_MGMT
  tools/power turbostat: decode HWP registers
  x86 msr-index: Simplify syntax for HWP fields
  tools/power turbostat: CPUID(0x16) leaf shows base, max, and bus frequency
  tools/power turbostat: decode more CPUID fields
2016-03-14 14:22:34 +01:00
Rafael J. Wysocki 93dffd03b3 Merge branches 'pm-cpuidle', 'pm-sleep' and 'pm-domains'
* pm-cpuidle:
  cpuidle: menu: help gcc generate slightly better code
  cpuidle: menu: avoid expensive square root computation

* pm-sleep:
  PM / suspend: replacing printk
  PM/freezer: y2038, use boottime to compare tstamps
  PM / sleep: declare __tracedata symbols as char[] rather than char

* pm-domains:
  PM / Domains: Fix potential NULL pointer dereference
  PM / Domains: Fix removal of a subdomain
  PM / Domains: Propagate start and restore errors during runtime resume
  PM / Domains: Join state name and index in debugfs output
  PM / Domains: Restore alignment of slaves in debugfs output
  PM / Domains: remove old power on/off latencies
  ARM: imx6: pm: declare pm domain latency on power_state struct
  PM / Domains: Support for multiple states
2016-03-14 14:22:22 +01:00
Mans Rullgard 392c517449 avr32: fix asm operand constraint in cmpxchg()
If the 'old' operand to cmpxchg() is a constant wider than 21 bits,
linking fails with a "relocation truncated to fit: R_AVR32_21S" error.

Fix this by replacing the "i" constraint with "Ks21" which makes the
compiler use a temporary register for out of range constants.

Signed-off-by: Mans Rullgard <mans@mansr.com>
Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no>
Tested-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Tested-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
2016-03-14 11:08:29 +01:00
Hans-Christian Egtvedt b837e97fd3 avr32: wire up copy_file_range syscall
This patch wires up the new copy_file_range syscall on AVR32.

On AVR32, all parameters beyond the 5th are passed on the stack. System
calls don't use the stack -- they borrow a callee-saved register
instead. This means that syscalls that take 6 parameters must be called
through a stub that pushes the last parameter on the stack.

Signed-off-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no>
2016-03-14 11:08:29 +01:00
Michael Ellerman a1b5344620 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/scottwood/linux into next
Freescale updates from Scott:

"Highlights include 8xx optimizations, 32-bit checksum optimizations,
86xx consolidation, e5500/e6500 cpu hotplug, more fman and other dt
bits, and minor fixes/cleanup."
2016-03-14 20:05:14 +11:00
Alexander Duyck 1e94082963 ipv6: Pass proto to csum_ipv6_magic as __u8 instead of unsigned short
This patch updates csum_ipv6_magic so that it correctly recognizes that
protocol is a unsigned 8 bit value.

This will allow us to better understand what limitations may or may not be
present in how we handle the data.  For example there are a number of
places that call htonl on the protocol value.  This is likely not necessary
and can be replaced with a multiplication by ntohl(1) which will be
converted to a shift by the compiler.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-13 23:55:13 -04:00
Alexander Duyck 01cfbad79a ipv4: Update parameters for csum_tcpudp_magic to their original types
This patch updates all instances of csum_tcpudp_magic and
csum_tcpudp_nofold to reflect the types that are usually used as the source
inputs.  For example the protocol field is populated based on nexthdr which
is actually an unsigned 8 bit value.  The length is usually populated based
on skb->len which is an unsigned integer.

This addresses an issue in which the IPv6 function csum_ipv6_magic was
generating a checksum using the full 32b of skb->len while
csum_tcpudp_magic was only using the lower 16 bits.  As a result we could
run into issues when attempting to adjust the checksum as there was no
protocol agnostic way to update it.

With this change the value is still truncated as many architectures use
"(len + proto) << 8", however this truncation only occurs for values
greater than 16776960 in length and as such is unlikely to occur as we stop
the inner headers at ~64K in size.

I did have to make a few minor changes in the arm, mn10300, nios2, and
score versions of the function in order to support these changes as they
were either using things such as an OR to combine the protocol and length,
or were using ntohs to convert the length which would have truncated the
value.

I also updated a few spots in terms of whitespace and type differences for
the addresses.  Most of this was just to make sure all of the definitions
were in sync going forward.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-13 23:55:13 -04:00
Rafael J. Wysocki 3fdb74649b Merge branch 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux into pm-tools
Pull turbostat updates for 4.6 from Len Brown.

* 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
  tools/power turbostat: bugfix: TDP MSRs print bits fixing
  tools/power turbostat: correct output for MSR_NHM_SNB_PKG_CST_CFG_CTL dump
  tools/power turbostat: call __cpuid() instead of __get_cpuid()
  tools/power turbostat: indicate SMX and SGX support
  tools/power turbostat: detect and work around syscall jitter
  tools/power turbostat: show GFX%rc6
  tools/power turbostat: show GFXMHz
  tools/power turbostat: show IRQs per CPU
  tools/power turbostat: make fewer systems calls
  tools/power turbostat: fix compiler warnings
  tools/power turbostat: add --out option for saving output in a file
  tools/power turbostat: re-name "%Busy" field to "Busy%"
  tools/power turbostat: Intel Xeon x200: fix turbo-ratio decoding
  tools/power turbostat: Intel Xeon x200: fix erroneous bclk value
  tools/power turbostat: allow sub-sec intervals
  tools/power turbostat: Decode MSR_MISC_PWR_MGMT
  tools/power turbostat: decode HWP registers
  x86 msr-index: Simplify syntax for HWP fields
  tools/power turbostat: CPUID(0x16) leaf shows base, max, and bus frequency
  tools/power turbostat: decode more CPUID fields
2016-03-14 02:13:05 +01:00
Linus Torvalds a265554988 Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
Pull MIPS fixes from Ralf Baechle:
 "Another round of MIPS fixes for 4.5:

   - Fix JZ4780 build with DEBUG_ZBOOT and MACH_JZ4780
   - Fix build with DEBUG_ZBOOT and MACH_JZ4780
   - Fix issue with uninitialised temp_foreign_map
   - Fix awk regex compile failure with certain versions of awk.  At
     this time, the sole user, ld-ifversion, is only used on MIPS"

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
  MIPS: smp.c: Fix uninitialised temp_foreign_map
  MIPS: Fix build error when SMP is used without GIC
  ld-version: Fix awk regex compile failure
  MIPS: Fix build with DEBUG_ZBOOT and MACH_JZ4780
2016-03-13 13:04:46 -07:00
James Hogan d825c06bfe MIPS: smp.c: Fix uninitialised temp_foreign_map
When calculate_cpu_foreign_map() recalculates the cpu_foreign_map
cpumask it uses the local variable temp_foreign_map without initialising
it to zero. Since the calculation only ever sets bits in this cpumask
any existing bits at that memory location will remain set and find their
way into cpu_foreign_map too. This could potentially lead to cache
operations suboptimally doing smp calls to multiple VPEs in the same
core, even though the VPEs share primary caches.

Therefore initialise temp_foreign_map using cpumask_clear() before use.

Fixes: cccf34e941 ("MIPS: c-r4k: Fix cache flushing for MT cores")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/12759/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-03-13 10:59:19 +01:00
Hauke Mehrtens 7a50e4688d MIPS: Fix build error when SMP is used without GIC
The MIPS_GIC_IPI should only be selected when MIPS_GIC is also
selected, otherwise it results in a compile error. smp-gic.c uses some
functions from include/linux/irqchip/mips-gic.h like
plat_ipi_call_int_xlate() which are only added to the header file when
MIPS_GIC is set. The Lantiq SoC does not use the GIC, but supports SMP.
The calls top the functions from smp-gic.c are already protected by
some #ifdefs

The first part of this was introduced in commit 72e20142b2 ("MIPS:
Move GIC IPI functions out of smp-cmp.c")

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: stable@vger.kernel.org # v3.15+
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/12774/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-03-13 10:56:48 +01:00
Aaro Koskinen ba9e72c229 MIPS: Fix build with DEBUG_ZBOOT and MACH_JZ4780
Ingenic SoC declares ZBOOT support, but debug definitions are missing
for MACH_JZ4780 resulting in a build failure when DEBUG_ZBOOT is set.
The UART addresses are same as with JZ4740, so fix by covering JZ4780
with those as well.

Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/12830/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-03-13 10:50:46 +01:00
Linus Torvalds 2f51c8204a Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
 "This fixes 3 FPU handling related bugs, an EFI boot crash and a
  runtime warning.

  The EFI fix arrived late but I didn't want to delay it to after v4.5
  because the effects are pretty bad for the systems that are affected
  by it"

[ Actually, I don't think the EFI fix really matters yet, because we
  haven't switched to the separate EFI page tables in mainline yet ]

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/efi: Fix boot crash by always mapping boot service regions into new EFI page tables
  x86/fpu: Fix eager-FPU handling on legacy FPU machines
  x86/delay: Avoid preemptible context checks in delay_mwaitx()
  x86/fpu: Revert ("x86/fpu: Disable AVX when eagerfpu is off")
  x86/fpu: Fix 'no387' regression
2016-03-12 20:09:25 -08:00
Olof Johansson a86e56cb39 Provide the ARCH_MESON Kconfig symbol for the Amlogic S905 SoCs.
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJW3W14AAoJELyGR0S84or4rXEQAIGX749uogrEgM1Olmv6xsIT
 k34IWRXW+8mOoDvJ3dE4JxaXOglbrLmBsDcYkHTQTN/0IsBpd7ILf+EM3/ggtycr
 bjv49swlOXGSBKttDSbiMIGteuuwjSwc6tF+MgFL4RMAJQpVVT8roFshLCl3MZZu
 vNKhqJhlOhT63Qt60wsNbI+ejL5Q8y3CLZqAMsuBz8ZeLOMM2voc/n2+RbkcJ+Hz
 Ihvjaj4wlEl+DOdLVuYYSCZT6VpHfTsO38haioCwWFALBsFur6dSLJTDimpCt88Z
 gkbY/tBaQnwRhefJKd97m+fWncme0ZoyeacZupS5wwj0SVdvxyWfBjHhBFsMuynA
 UGI1WnfwDWPQHwLByMV/GWysF/NlygyHEXufh2yGHalYCOO5n5OhKVpPmcWAbhks
 lOskUfebdukbUIYZP45qnaoU2z+a2TWhBqlejbuQ2wr26d7rPZqrg4s9ImVoOKMW
 aKefnMEuqlzNBo3kgY1Lg2H9kX1A6zP/LO2tgDSWh+tjJQcPIEL8cS4KnQnh04lG
 0X7rkGYnJIDtpYLRJmZArAfJ715D7ra9rXfXU4yY2uvZpAGlkqXZM8uTyrSR17Pj
 wzI3k2LQl2gWAgBEJPkHGaLxtlhcAAjGAEN8SuOJDU291tZm7HMSgDQn0wW5LTM0
 iy3TKRIf/WHoClgCy45C
 =Zm1/
 -----END PGP SIGNATURE-----

Merge tag 'for-v4.6/gxbb-arm64' of https://github.com/carlocaione/linux-meson into next/arm64

Provide the ARCH_MESON Kconfig symbol for the Amlogic S905 SoCs.

* tag 'for-v4.6/gxbb-arm64' of https://github.com/carlocaione/linux-meson:
  ARM64: Enable Amlogic Meson GXBaby platform

Signed-off-by: Olof Johansson <olof@lixom.net>
2016-03-12 17:44:51 -08:00
Olof Johansson 8061a17ee3 This series adds initial support for the Amlogic S905 based
Tronsmart Vega S95 Pro, Meta and Telos TV boxes.
 
 - Add new DTS to enable support for the boards
 - Add documentation for compatibles and vendor prefix
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJW3W22AAoJELyGR0S84or4ijgQAMDR2a/sg9b9fzxFD8nH5BMn
 HFGgIWc+JDhzUYdslaeBPadDiJGeKysekcs4dvOvtuOpL2bcDIjJSU07XsZavY/0
 kUirDCMCDc7S0VSA2r7qgOBspaGWrYgYuc/P6ilwsiD+H7vOQIQkQtEn5wJU+pD5
 rpFfg/0OjE27YFDmq+9WfZMf1AzNvkH17XyeozP/gZZa3PWqbdP0nkj0o9JxdTcG
 WHwoq43OBOClGzufFWQQEnGzxR0n3hzqfCSU+spSBmoKFhhYqw8xtyIow9FFbMMR
 oFQNFTfpVaJm/FqFsqcdplpMYHMUlLjFjWyVJWk48gyvl7mykdWTyHWgN7bETObJ
 QKUQVYSzkgeQ4KTYWqfqLsa6PDlzt4zjzLqRzEb/3MvACX+zpFqSBH8rjobG/Her
 EcHOE068lHj0bzi8Tohm0YzIvvnJLKeb8n9fJmPSnQzlBdTbKqh3Aps917ZTMbqt
 /E2UWT0iSnNG4Ln3UG7rAnbg1xDHXL5YYX/D0r5h5ckD5v+ItsjlG4acehOuNkzb
 rFMIoWwzfLFaYqJIjCvi7yNK5IfsppTqWIc/fhmOeXHnS5NtGIcmiQhReh8xac7u
 N4Mn9ZP4hT/5xBk3MV/McX/qNdQkUmv9C+Kg7xb4GiqmA3q8TsjOnrYHj2tQxHSl
 L1Fz8eeFQSRqWIBC+Y6a
 =KmAy
 -----END PGP SIGNATURE-----

Merge tag 'for-v4.6/gxbb-dt' of https://github.com/carlocaione/linux-meson into next/dt64

This series adds initial support for the Amlogic S905 based
Tronsmart Vega S95 Pro, Meta and Telos TV boxes.

- Add new DTS to enable support for the boards
- Add documentation for compatibles and vendor prefix

* tag 'for-v4.6/gxbb-dt' of https://github.com/carlocaione/linux-meson:
  ARM64: dts: amlogic: Add Tronsmart Vega S95 configs
  Documentation: devicetree: amlogic: Document Tronsmart Vega S95 boards
  ARM64: dts: Prepare configs for Amlogic Meson GXBaby
  Documentation: devicetree: amlogic: Document Meson GXBaby
  devicetree: bindings: Add vendor prefix for Tronsmart

Signed-off-by: Olof Johansson <olof@lixom.net>
2016-03-12 17:43:08 -08:00
Masahiro Yamada 2ef7d5f342 ARM, ARM64: dts: drop "arm,amba-bus" in favor of "simple-bus"
The compatible string "simple-bus" is well defined in ePAPR, while
I see no documentation for the "arm,amba-bus" arnywhere in ePAPR or
Documentation/devicetree/.

DT is also used by other projects than Linux kernel.  It is not a
good idea to rely on such an unofficial binding.

This commit
  - replaces "arm,amba-bus" with "simple-bus"
  - drops "arm,amba-bus" where it is used along with "simple-bus"

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
2016-03-12 17:40:34 -08:00
Olof Johansson 301c6e0b16 Additional updates for ARM VExpress/Juno platforms
1. Add support for SBSA Generic Watchdog on foundation models
 
 2. Fix node name unit-address presence/absence mismatch warnings in
    all the device trees
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABCAAGBQJW3vnXAAoJEABBurwxfuKY1CcP/iGb7aS31UUOvmQzzFZg8hKq
 Jgms1P1KGohD4cjN9DKWR4HOa/1s5LfEqifY92Y4BOmninWZdQmhUYJnUbFF6qT+
 I4Z3Bs05RZGnMcaGM8zseWjk5C9cvtTAU/mAVpWtxqJnodVte0FtbFA/2E7N0hZU
 Dh4sbK12CTYYnafZf9ilWo75TlE86OwSGHCprAFRIcypqDD6xsWU0c3gpFeLixxH
 CHWe7m66IPO8C9wYBYzUHHumT275K2IZS05mV0yBpQWAZC4upQjt7GION5wSMvxT
 DTqHSwRge0SL7zaj1YOP6MpRApYXhFpRr8nSsdU84H4TlvxgyMzzSqet4Ou1W4He
 lE9RaHk9ISreeqb+7cLv4sDz1R4h+PrxE5GknC/+WR9bvXhdN8DS75y9zT3HACiE
 W1glLb+5B7cR/hr0Otbm7iEB0WickLtCLdQ6usAkqBOmFI0VaOGJeod1oRWrFh8Z
 TQ2csrhE073akutdOEnVCTzHO94BR3ElYHAVYmdnvNMgeinrd1C+2I5Ce0rs4BIT
 OoMPA8ed11dASj4p496qIL7Stkzd3sYV72vQKV+EOk8m8i2YjkSbBj9aEWdDzzI5
 o9pvNHA1Bts84nc0SVa2a7XDI/RFxREUBwUVhFieSA2PRJSL+YVvpPSwVBNnuFWq
 cE681DOj4e9RBsW74WgU
 =uO7I
 -----END PGP SIGNATURE-----

Merge tag 'vexpress-for-v4.6/dt-updates-2' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into next/dt

Additional updates for ARM VExpress/Juno platforms

1. Add support for SBSA Generic Watchdog on foundation models

2. Fix node name unit-address presence/absence mismatch warnings in
   all the device trees

* tag 'vexpress-for-v4.6/dt-updates-2' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux:
  arm64: dts: juno/vexpress: fix node name unit-address presence warnings
  arm64: dts: foundation-v8: add SBSA Generic Watchdog device node

Signed-off-by: Olof Johansson <olof@lixom.net>
2016-03-12 17:26:37 -08:00
Olof Johansson c7e1d89b34 Samsung Exynos ARM64 improvements for v4.6:
1. Remove separate ARCH_EXYNOS7 symbol and consolidate it into
    one ARCH_EXYNOS.
    This depends on clk tree: removal of last presence of ARCH_EXYNOS7.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWz5PsAAoJEME3ZuaGi4PX49cP/jhU8jtgdXvzysVITUPi5Ukv
 Rl82rmCrrX//z6n2LpHaqG/bg1ynOeaRa8TLYU/4gQ53fORmk4UgtqNvC54/2+KX
 2CNdNko85oX7DIc2nr5RwTNEBspYh0lTvChZtzQqoKVRdMJSiv2Ms5FcnMNYDt5q
 xGlQoriTCzlZjt2q6hjlmSZgy8RIn1EeiwsnP3DzmdiX+POgpItfwKln9Ci7E+oz
 xGa4N/rdQOA0MeV/wizDyWcaCi8HJSDQfMBq6rvx/NtYSGmJORUggFjmmhfp6wAr
 tuVMflDX20ZoDrP5XptbHapoTtOpQny/D7TiFhWvNZc59zpLei1GJOUOhWdq1PVH
 om2D59GZ+cNEzDSM/frJiT7VzB32/d479T8DfrwzUahDC2bj+yDPfSK73T72cERs
 /eBORjYrjM3h9IEwB4DbE6QrxHondXw3B2PQLq4+40w2sE4efupkiBn6vVKUIT9S
 Za2n6l1W1D2XQawz37Ya2/rAybkMaTIURgsRXE/Oe7GKz11pS3oRmYSeDWcOnMsw
 PnVL6uTz5zAtoVRygkRVmPv0JUzZgxlNMKfHy7O1zzhSLBtNwXtzGRNyFO+k1yIk
 FoWly5/o0mYvW+yU1i5lgBi9Zh5aB5Guq7qsqMoGgDC18u7v+GGAcFhNtwKAXUPy
 UDf3Ut2RSTitQnyNyl5m
 =XcED
 -----END PGP SIGNATURE-----

Merge tag 'samsung-soc64-4.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux into next/arm64

Samsung Exynos ARM64 improvements for v4.6:
1. Remove separate ARCH_EXYNOS7 symbol and consolidate it into
   one ARCH_EXYNOS.

This depends on clk tree: removal of last presence of ARCH_EXYNOS7.

* tag 'samsung-soc64-4.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux:
  arm64: EXYNOS: Consolidate ARCH_EXYNOS7 symbol into ARCH_EXYNOS
  clk: samsung: Don't build ARMv8 clock drivers on ARMv7
  clk: samsung: Enable COMPILE_TEST for Samsung clocks
  clk: Move vendor's Kconfig into CCF menu section
  clk: mediatek: Fix memory leak on clock init fail
  clk: move the common clock's to_clk_*(_hw) macros to clk-provider.h
  clk: xgene: Remove return from void function
  clk: xgene: Add SoC and PMD PLL clocks with v2 hardware
  Documentation: Update APM X-Gene clock binding for v2 hardware
  clk: s2mps11: remove redundant code
  clk: s2mps11: remove redundant static variables declaration
  clk: s2mps11: allocate only one structure for clock init
  clk: s2mps11: merge two for loops in one
  clk-divider: make sure read-only dividers do not write to their register
  clk: tango4: rename ARCH_TANGOX to ARCH_TANGO
  clk: scpi: Fix checking return value of platform_device_register_simple()
  clk: mvebu: Mark ioremapped memory as __iomem

Signed-off-by: Olof Johansson <olof@lixom.net>
2016-03-12 17:13:34 -08:00
Olof Johansson 2a993a587e Samsung Exynos (and older platforms) improvements for v4.6:
1. Split out Exynos PMU driver implementation from arm/mach-exynos
    to the drivers/soc/samsung which will allow re-use of it on ARM64.
 2. Use generic DT cpufreq driver on Exynos542x/5800.
 3. Minor cleanups.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWz5R8AAoJEME3ZuaGi4PX/dUP/i55BQ3HREph4flgzGZ0mTc2
 f9/v5tZ3jkx8uyVCqcHA94faGRPex0q5KfDThQGTf4GpJWvrOSXY7e9ILdz+VNPd
 1S1pWluiR+V9TJHRRCz0l7vo81n96EyhHAiCUq0kS9mPbi1aXu0T15kGAWyFa0cK
 AF/DAlLrJn7F+0Fspj4Y0YsZLJCt+XxtCxPkGG87aGWPzBBvSgh3PuNHN5sczq3S
 l5mQiqulX+Z1KYzN/Qlng2fXmh0llp0qcBdRmQsChhQyYvDCYe3++5efKqyaKXVN
 MHAU4sRhgt9CGV6/9J4trJHq/SLqmVQjIoOfZyfvnGIthyIc0hMlT9H5UyWT9cXK
 FD5OIWY110XnfUFsViQfEKHeEFjBL6ABSXXITjltDAwc9o2bHKSVtSyq3NLBz4qO
 2WzvGSXIhDHtfjBB+wzagrnpOPAwVdVKQ4ZJvuWR7V2aM4Jbrwz9hVCdWKK/iEyU
 Oys/SzVzikkvRRolcXsIQphqK6p6tj3H8NY5nDcZZb53x/MxEEk5Pg7oqZNJD5xN
 eKsjaSwIgYZ4xaT/mwcLWOc6/XXt7a9SLmQqaYSf4ABWCfZdQtu1an3AzFPUIog+
 rJ4IWLJoPkixAPkPZebkyWaIBIktfQjKPWatX+SAm6Y9A9QmC4dgDgoKIYhnTXze
 nAUOLxHxx75k9/XQPjFX
 =/LNl
 -----END PGP SIGNATURE-----

Merge tag 'samsung-soc-4.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux into next/soc

Samsung Exynos (and older platforms) improvements for v4.6:
1. Split out Exynos PMU driver implementation from arm/mach-exynos
   to the drivers/soc/samsung which will allow re-use of it on ARM64.
2. Use generic DT cpufreq driver on Exynos542x/5800.
3. Minor cleanups.

* tag 'samsung-soc-4.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux:
  ARM: s3c24xx: Avoid warning for inb/outb
  ARM: SAMSUNG: Remove unused register offset definition
  ARM: EXYNOS: Cleanup header files inclusion
  drivers: soc: samsung: Enable COMPILE_TEST
  MAINTAINERS: Add maintainers entry for drivers/soc/samsung
  drivers: soc: Add support for Exynos PMU driver
  ARM: EXYNOS: Split up exynos5420 SoC specific PMU data
  ARM: EXYNOS: Split up exynos5250 SoC specific PMU data
  ARM: EXYNOS: Split up exynos4 SoC specific PMU data
  ARM: EXYNOS: Split up exynos3250 SoC specific PMU data
  ARM: EXYNOS: Move pmu specific headers under "linux/soc/samsung"
  ARM: EXYNOS: Correct header comment in Kconfig file
  ARM: EXYNOS: Use generic cpufreq driver for Exynos5422/5800
  ARM: EXYNOS: Use generic cpufreq driver for Exynos5420
  ARM: s3c64xx: use "depends on" instead of "if" after prompt
  ARM: plat-samsung: use to_platform_device()
  ARM: EXYNOS: Code cleanup in map.h
  ARM: EXYNOS: Remove unused static mapping of CMU for exynos5

Signed-off-by: Olof Johansson <olof@lixom.net>
2016-03-12 17:10:20 -08:00
Olof Johansson c8c904471a mvebu dt64 for 4.6 (part 2)
Add support for the Armada 7K and 8K SoCs and the Armada 8040 DB board
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iEYEABECAAYFAlbQX0wACgkQCwYYjhRyO9V9nwCgqk3LOUVjtyjGxd4YU8iHzEBe
 B+oAoKMMiTTb/yqcwMdN+l1B5RMOxW4o
 =fs9N
 -----END PGP SIGNATURE-----

Merge tag 'mvebu-dt64-4.6-2' of git://git.infradead.org/linux-mvebu into next/dt64

mvebu dt64 for 4.6 (part 2)

Add support for the Armada 7K and 8K SoCs and the Armada 8040 DB board

* tag 'mvebu-dt64-4.6-2' of git://git.infradead.org/linux-mvebu:
  arm64: dts: marvell: re-order Device Tree nodes for Armada AP806
  arm64: dts: marvell: update Armada AP806 clock description
  arm64: dts: marvell: add Device Tree files for Armada 7K/8K

Signed-off-by: Olof Johansson <olof@lixom.net>
2016-03-12 17:05:27 -08:00
Krzysztof Halasa 88e9da9a2a CNS3xxx: Fix PCI cns3xxx_write_config()
The "where" offset was added twice, fix it.

Signed-off-by: Krzysztof Hałasa <khalasa@piap.pl>
Fixes: 498a92d425 ("ARM: cns3xxx: pci: avoid potential stack overflow")
Signed-off-by: Olof Johansson <olof@lixom.net>
2016-03-12 17:02:06 -08:00
Georgi Djakov 2f85bb09d9 arm64: defconfig: Increase MMC_BLOCK_MINORS to 16
Increase the block minors from the default 8 to 16. The db410c board
by default has eMMC rootfs on the 10th partition.

Signed-off-by: Georgi Djakov <georgi.djakov@linaro.org>
Signed-off-by: Olof Johansson <olof@lixom.net>
2016-03-12 16:16:27 -08:00
Georgi Djakov d1be05ab23 arm64: defconfig: Add Qualcomm sdhci and restart functionality
Enable sdhci and restart functionality for devices based on msm8916 platform.

Signed-off-by: Georgi Djakov <georgi.djakov@linaro.org>
Signed-off-by: Olof Johansson <olof@lixom.net>
2016-03-12 16:16:15 -08:00
Lars Persson b2af8e58a0 ARM: dts: artpec: dual-license on artpec6.dtsi
Relaxed the license on the dtsi to permit use in other projects.

Signed-off-by: Lars Persson <larper@axis.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
2016-03-12 16:15:47 -08:00
Linus Walleij 7ad86d612b ARM: dts: ux500: add synaptics RMI4 for Ux500 TVK DT
This adds the Synaptics RMI4 touchscreen to the Ux500 TVK
user interface board. Tested on the U8500 HREFv60plus with
the TVK UIB.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Olof Johansson <olof@lixom.net>
2016-03-12 16:15:46 -08:00
Fenghua Yu d050049442 x86/cpufeature: Enable new AVX-512 features
A few new AVX-512 instruction groups/features are added in cpufeatures.h
for enuermation: AVX512DQ, AVX512BW, and AVX512VL.

Clear the flags in fpu__xstate_clear_all_cpu_caps().

The specification for latest AVX-512 including the features can be found at:

  https://software.intel.com/sites/default/files/managed/07/b7/319433-023.pdf

Note, I didn't enable the flags in KVM. Hopefully the KVM guys can pick up
the flags and enable them in KVM.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Ravi V Shankar <ravi.v.shankar@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm@vger.kernel.org
Link: http://lkml.kernel.org/r/1457667498-37357-1-git-send-email-fenghua.yu@intel.com
[ Added more detailed feature descriptions. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-12 17:30:53 +01:00
Matt Fleming 452308de61 x86/efi: Fix boot crash by always mapping boot service regions into new EFI page tables
Some machines have EFI regions in page zero (physical address
0x00000000) and historically that region has been added to the e820
map via trim_bios_range(), and ultimately mapped into the kernel page
tables. It was not mapped via efi_map_regions() as one would expect.

Alexis reports that with the new separate EFI page tables some boot
services regions, such as page zero, are not mapped. This triggers an
oops during the SetVirtualAddressMap() runtime call.

For the EFI boot services quirk on x86 we need to memblock_reserve()
boot services regions until after SetVirtualAddressMap(). Doing that
while respecting the ownership of regions that may have already been
reserved by the kernel was the motivation behind this commit:

  7d68dc3f10 ("x86, efi: Do not reserve boot services regions within reserved areas")

That patch was merged at a time when the EFI runtime virtual mappings
were inserted into the kernel page tables as described above, and the
trick of setting ->numpages (and hence the region size) to zero to
track regions that should not be freed in efi_free_boot_services()
meant that we never mapped those regions in efi_map_regions(). Instead
we were relying solely on the existing kernel mappings.

Now that we have separate page tables we need to make sure the EFI
boot services regions are mapped correctly, even if someone else has
already called memblock_reserve(). Instead of stashing a tag in
->numpages, set the EFI_MEMORY_RUNTIME bit of ->attribute. Since it
generally makes no sense to mark a boot services region as required at
runtime, it's pretty much guaranteed the firmware will not have
already set this bit.

For the record, the specific circumstances under which Alexis
triggered this bug was that an EFI runtime driver on his machine was
responding to the EVT_SIGNAL_VIRTUAL_ADDRESS_CHANGE event during
SetVirtualAddressMap().

The event handler for this driver looks like this,

  sub rsp,0x28
  lea rdx,[rip+0x2445] # 0xaa948720
  mov ecx,0x4
  call func_aa9447c0  ; call to ConvertPointer(4, & 0xaa948720)
  mov r11,QWORD PTR [rip+0x2434] # 0xaa948720
  xor eax,eax
  mov BYTE PTR [r11+0x1],0x1
  add rsp,0x28
  ret

Which is pretty typical code for an EVT_SIGNAL_VIRTUAL_ADDRESS_CHANGE
handler. The "mov r11, QWORD PTR [rip+0x2424]" was the faulting
instruction because ConvertPointer() was being called to convert the
address 0x0000000000000000, which when converted is left unchanged and
remains 0x0000000000000000.

The output of the oops trace gave the impression of a standard NULL
pointer dereference bug, but because we're accessing physical
addresses during ConvertPointer(), it wasn't. EFI boot services code
is stored at that address on Alexis' machine.

Reported-by: Alexis Murzeau <amurzeau@gmail.com>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Raphael Hertzog <hertzog@debian.org>
Cc: Roger Shimizu <rogershimizu@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/1457695163-29632-2-git-send-email-matt@codeblueprint.co.uk
Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=815125
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-12 16:57:45 +01:00
Borislav Petkov 6e6867093d x86/fpu: Fix eager-FPU handling on legacy FPU machines
i486 derived cores like Intel Quark support only the very old,
legacy x87 FPU (FSAVE/FRSTOR, CPUID bit FXSR is not set), and
our FPU code wasn't handling the saving and restoring there
properly in the 'eagerfpu' case.

So after we made eagerfpu the default for all CPU types:

  58122bf1d8 x86/fpu: Default eagerfpu=on on all CPUs

these old FPU designs broke. First, Andy Shevchenko reported a splat:

  WARNING: CPU: 0 PID: 823 at arch/x86/include/asm/fpu/internal.h:163 fpu__clear+0x8c/0x160

which was us trying to execute FXRSTOR on those machines even though
they don't support it.

After taking care of that, Bryan O'Donoghue reported that a simple FPU
test still failed because we weren't initializing the FPU state properly
on those machines.

Take care of all that.

Reported-and-tested-by: Bryan O'Donoghue <pure.logic@nexus-software.ie>
Reported-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yu-cheng <yu-cheng.yu@intel.com>
Link: http://lkml.kernel.org/r/20160311113206.GD4312@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-12 16:13:55 +01:00
Bjorn Helgaas 97f47e73c4 MIPS: Loongson 3: Keep CPU physical (not virtual) addresses in shadow ROM resource
Loongson 3 used the IORESOURCE_ROM_COPY flag for its ROM resource.  There
are two problems with this:

  - When IORESOURCE_ROM_COPY is set, pci_map_rom() assumes the resource
    contains virtual addresses, so it doesn't ioremap the resource.  This
    implies loongson_sysconf.vgabios_addr is a virtual address.  That's a
    problem because resources should contain CPU *physical* addresses not
    virtual addresses.

  - When IORESOURCE_ROM_COPY is set, pci_cleanup_rom() calls kfree() on the
    resource.  We did not kmalloc() the loongson_sysconf.vgabios_addr area,
    so it is incorrect to kfree() it.

If we're using a shadow copy in RAM for the Loongson 3 VGA BIOS area,
disable the ROM BAR and release the address space it was consuming.

Use IORESOURCE_ROM_SHADOW instead of IORESOURCE_ROM_COPY.  This means the
struct resource contains CPU physical addresses, and pci_map_rom() will
ioremap() it as needed.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-03-12 06:00:29 -06:00
Bjorn Helgaas 53f0a50977 MIPS: Loongson 3: Use temporary struct resource * to avoid repetition
Use a temporary struct resource pointer to avoid needless repetition of
"pdev->resource[PCI_ROM_RESOURCE]".  No functional change intended.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-03-12 06:00:29 -06:00
Bjorn Helgaas 240504adaf ia64/PCI: Keep CPU physical (not virtual) addresses in shadow ROM resource
A struct resource contains CPU physical addresses, not virtual addresses.
But sn_acpi_slot_fixup() and sn_io_slot_fixup() stored the virtual address
of a shadow ROM copy in the resource.  To compensate, pci_map_rom() had a
special case that returned the resource address directly rather than
calling ioremap() on it.

When we're using a shadow copy in RAM or PROM, disable the ROM BAR and
release the address space it was consuming.

Store the CPU physical (not virtual) address in the shadow ROM resource,
and mark the resource as IORESOURCE_ROM_SHADOW so we use the normal
pci_map_rom() path that ioremaps the copy.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-03-12 06:00:29 -06:00
Bjorn Helgaas f976721e82 ia64/PCI: Use ioremap() instead of open-coded equivalent
Depositing __IA64_UNCACHED_OFFSET in the upper address bits is essentially
equivalent to ioremap(): it converts a CPU physical address to a virtual
address using the ia64 uncacheable identity map.

Call ioremap() instead of doing the phys-to-virt conversion manually with
__IA64_UNCACHED_OFFSET.

Note that this makes it obvious that (a) we're putting a virtual address in
a struct resource, and (b) we're passing a virtual address to ioremap()
below in the PCI_ROM_RESOURCE case.  These are both pre-existing problems
that I'll resolve next.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-03-12 06:00:29 -06:00
Bjorn Helgaas ab97b8cc56 ia64/PCI: Use temporary struct resource * to avoid repetition
Use a temporary struct resource pointer to avoid needless repetition of
"dev->resource[idx]".  No functional change intended.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-03-12 06:00:29 -06:00
Vineet Gupta c2ff5cf273 ARC: mm: Use virt_to_pfn() for addr >> PAGE_SHIFT pattern
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-03-12 11:59:47 +05:30
Vineet Gupta 0b291635fc ARC: [plat-nsim] document ranges
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-03-12 11:58:44 +05:30
Vineet Gupta 20d780374c ARC: build: Better way to detect ISA compatible toolchain
ARC architecture has 2 instruction sets: ARCompact/ARCv2.
While same gcc supports compiling for either (using appropriate toggles),
we can't use the same toolchain to build kernel because libgcc needs
to be unique and the toolchian (uClibc based) is not multilibed.

uClibc toolchain is convenient since it allows all userspace and
kernel to be built with a single install for an ISA.

This however means 2 gnu installs (with same triplet prefix) are needed
for building for 2 ISA and need to be in PATH.
As developers we keep switching the builds, but would occassionally fail
to update the PATH leading to usage of wrong tools. And this would only
show up at the end of kernel build when linking incompatible libgcc.

So the initial solution was to have gcc define a special preprocessor macro
DEFAULT_CPU_xxx which is unique for default toolchain configuration.
Claudiu proposed using grep for an existing preprocessor macro which is
again uniquely defined per ISA.

Cc: Michal Marek <mmarek@suse.cz>
Suggested-by: Claudiu Zissulescu <claziss@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-03-12 11:58:30 +05:30
Vineet Gupta 7cab91b87d ARCv2: Allow enabling PAE40 w/o HIGHMEM
This allows for regression testing in PAE specific code as we lack
a 32+ bit physical memory platform other than nSIM.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-03-12 10:48:14 +05:30
Lada Trimasova f778cc6571 ARC: [BE] readl()/writel() to work in Big Endian CPU configuration
read{l,w}() write{l,w}() primitives should use le{16,32}_to_cpu() and
cpu_to_le{16,32}() respectively to ensure device registers are read
correctly in Big Endian CPU configuration.

Per Arnd Bergmann
| Most drivers using readl() or readl_relaxed() expect those to perform byte
| swaps on big-endian architectures, as the registers tend to be fixed endian

This was needed for getting UART to work correctly on a Big Endian ARC.

The ARC accessors originally were fine, and the bug got introduced
inadventently by commit b8a0330239 ("ARCv2: barriers")

Fixes: b8a0330239 ("ARCv2: barriers")
Link: http://lkml.kernel.org/r/201603100845.30602.arnd@arndb.de
Cc: Alexey Brodkin <abrodkin@synopsys.com>
Cc: stable@vger.kernel.org  [4.2+]
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Lada Trimasova <ltrimas@synopsys.com>
[vgupta: beefed up changelog, added Fixes/stable tags]
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-03-12 09:46:45 +05:30
Hou Zhiqiang fba4e9f989 powerpc/fsl/dts: Add "jedec,spi-nor" flash compatible
Starting with commit <8947e396a829> ("Documentation: dt: mtd:
replace "nor-jedec" binding with "jedec, spi-nor"") we have
"jedec,spi-nor" binding indicating support for JEDEC identification.

Use it for all flashes that are supposed to support READ ID op
according to the datasheets.

Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@freescale.com>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11 20:06:41 -06:00
Zhao Qiang fd5475a75b powerpc/T104xRDB: add tdm riser card node to device tree
Signed-off-by: Zhao Qiang <qiang.zhao@nxp.com>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11 20:05:39 -06:00
Christophe Leroy 060ef9d89d powerpc32: PAGE_EXEC required for inittext
PAGE_EXEC is required for inittext, otherwise CONFIG_DEBUG_PAGEALLOC
ends up with an Oops

[    0.000000] Inode-cache hash table entries: 8192 (order: 1, 32768 bytes)
[    0.000000] Sorting __ex_table...
[    0.000000] bootmem::free_all_bootmem_core nid=0 start=0 end=2000
[    0.000000] Unable to handle kernel paging request for instruction fetch
[    0.000000] Faulting instruction address: 0xc045b970
[    0.000000] Oops: Kernel access of bad area, sig: 11 [#1]
[    0.000000] PREEMPT DEBUG_PAGEALLOC CMPC885
[    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 3.18.25-local-dirty #1673
[    0.000000] task: c04d83d0 ti: c04f8000 task.ti: c04f8000
[    0.000000] NIP: c045b970 LR: c045b970 CTR: 0000000a
[    0.000000] REGS: c04f9ea0 TRAP: 0400   Not tainted  (3.18.25-local-dirty)
[    0.000000] MSR: 08001032 <ME,IR,DR,RI>  CR: 39955d35  XER: a000ff40
[    0.000000]
GPR00: c045b970 c04f9f50 c04d83d0 00000000 ffffffff c04dcdf4 00000048 c04f6b10
GPR08: c04f6ab0 00000001 c0563488 c04f6ab0 c04f8000 00000000 00000000 b6db6db7
GPR16: 00003474 00000180 00002000 c7fec000 00000000 000003ff 00000176 c0415014
GPR24: c0471018 c0414ee8 c05304e8 c03aeaac c0510000 c0471018 c0471010 00000000
[    0.000000] NIP [c045b970] free_all_bootmem+0x164/0x228
[    0.000000] LR [c045b970] free_all_bootmem+0x164/0x228
[    0.000000] Call Trace:
[    0.000000] [c04f9f50] [c045b970] free_all_bootmem+0x164/0x228 (unreliable)
[    0.000000] [c04f9fa0] [c0454044] mem_init+0x3c/0xd0
[    0.000000] [c04f9fb0] [c045080c] start_kernel+0x1f4/0x390
[    0.000000] [c04f9ff0] [c0002214] start_here+0x38/0x98
[    0.000000] Instruction dump:
[    0.000000] 2f150000 7f968840 72a90001 3ad60001 56b5f87e 419a0028 419e0024 41a20018
[    0.000000] 807cc20c 38800000 7c638214 4bffd2f5 <3a940001> 3a100024 4bffffc8 7e368b78
[    0.000000] ---[ end trace dc8fa200cb88537f ]---

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11 20:04:32 -06:00
Igal Liberman 84d3e24805 powerpc/mpc85xx: Add pcsphy nodes to FManV3 device tree
This patch adds pcsphy node to FManV3 device tree.

Signed-off-by: Igal Liberman <igal.liberman@freescale.com>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11 20:01:52 -06:00
Igal Liberman 84e0f1c138 powerpc/mpc85xx: Add MDIO bus muxing support to the board device tree(s)
Describe the PHY topology for all configurations supported by each board

Based on prior work by Andy Fleming <afleming@freescale.com>

Signed-off-by: Shruti Kanetkar <Shruti@freescale.com>
Signed-off-by: Emil Medve <Emilian.Medve@Freescale.com>
Signed-off-by: Igal Liberman <Igal.Liberman@freescale.com>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11 20:01:38 -06:00
Alessio Igor Bogani 334479d1cc powerpc/86xx: Introduce and use common dtsi
Signed-off-by: Alessio Igor Bogani <alessio.bogani@elettra.eu>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11 19:19:13 -06:00
Alessio Igor Bogani 595207b93f powerpc/86xx: Update device tree
Avoid duplication of the interrupt-parent, migrate to 4 interrupt-cells
and set the right clock-frequency for pcie (100 Mhz).

Signed-off-by: Alessio Igor Bogani <alessio.bogani@elettra.eu>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11 19:19:13 -06:00
Alessio Igor Bogani 46f26ec7f6 powerpc/86xx: Move dts files to fsl directory
Signed-off-by: Alessio Igor Bogani <alessio.bogani@elettra.eu>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11 19:19:12 -06:00
Alessio Igor Bogani 43de32c563 powerpc/86xx: Switch to kconfig fragments approach
Signed-off-by: Alessio Igor Bogani <alessio.bogani@elettra.eu>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11 19:19:12 -06:00
Alessio Igor Bogani 3d363285a1 powerpc/86xx: Update defconfigs
This patch show how defconfigs appear if the kconfig fragment approach is
used.

Signed-off-by: Alessio Igor Bogani <alessio.bogani@elettra.eu>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11 19:14:59 -06:00
Alessio Igor Bogani 4f9d6e95bc powerpc/86xx: Consolidate common platform code
Signed-off-by: Alessio Igor Bogani <alessio.bogani@elettra.eu>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11 19:14:12 -06:00
Christophe Leroy 737b01fca3 powerpc32: Remove one insn in mulhdu
Remove one instruction in mulhdu

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11 17:20:12 -06:00
Christophe Leroy 716fa91d19 powerpc32: small optimisation in flush_icache_range()
Inlining of _dcache_range() functions has shown that the compiler
does the same thing a bit better with one insn less

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11 17:20:12 -06:00
Christophe Leroy 8478d7f091 powerpc: Simplify test in __dma_sync()
This simplification helps the compiler. We now have only one test
instead of two, so it reduces the number of branches.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11 17:20:12 -06:00
Christophe Leroy affe587bac powerpc32: move xxxxx_dcache_range() functions inline
flush/clean/invalidate _dcache_range() functions are all very
similar and are quite short. They are mainly used in __dma_sync()
perf_event locate them in the top 3 consumming functions during
heavy ethernet activity

They are good candidate for inlining, as __dma_sync() does
almost nothing but calling them

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11 17:20:12 -06:00
Christophe Leroy 5736f96d12 powerpc32: Remove clear_pages() and define clear_page() inline
clear_pages() is never used expect by clear_page, and PPC32 is the
only architecture (still) having this function. Neither PPC64 nor
any other architecture has it.

This patch removes clear_pages() and moves clear_page() function
inline (same as PPC64) as it only is a few isns

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11 17:20:11 -06:00
Christophe Leroy d6bfa02fcc powerpc: add inline functions for cache related instructions
This patch adds inline functions to use dcbz, dcbi, dcbf, dcbst
from C functions

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11 17:20:11 -06:00
Christophe Leroy 766d45cbee powerpc/8xx: rewrite flush_instruction_cache() in C
On PPC8xx, flushing instruction cache is performed by writing
in register SPRN_IC_CST. This registers suffers CPU6 ERRATA.
The patch rewrites the fonction in C so that CPU6 ERRATA will
be handled transparently

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11 17:20:11 -06:00
Christophe Leroy a7761fe489 powerpc/8xx: rewrite set_context() in C
There is no real need to have set_context() in assembly.
Now that we have mtspr() handling CPU6 ERRATA directly, we
can rewrite set_context() in C language for easier maintenance.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11 17:20:11 -06:00
Christophe Leroy 63e9e1c28f powerpc/8xx: remove special handling of CPU6 errata in set_dec()
CPU6 ERRATA is now handled directly in mtspr(), so we can use the
standard set_dec() fonction in all cases.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11 17:18:03 -06:00
Christophe Leroy 1458dd951f powerpc/8xx: Handle CPU6 ERRATA directly in mtspr() macro
MPC8xx has an ERRATA on the use of mtspr() for some registers
This patch includes the ERRATA handling directly into mtspr() macro
so that mtspr() users don't need to bother about that errata

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11 17:18:02 -06:00
Christophe Leroy 7ee5cf6bfa powerpc/8xx: Add missing SPRN defines into reg_8xx.h
Add missing SPRN defines into reg_8xx.h
Some of them are defined in mmu-8xx.h, so we include mmu-8xx.h in
reg_8xx.h, for that we remove references to PAGE_SHIFT in mmu-8xx.h
to have it self sufficient, as includers of reg_8xx.h don't all
include asm/page.h

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11 17:18:02 -06:00
Christophe Leroy e974cd4be0 powerpc32: remove ioremap_base
ioremap_base is not initialised and is nowhere used so remove it

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11 17:18:02 -06:00
Christophe Leroy c562eb06d5 powerpc32: Remove useless/wrong MMU:setio progress message
Commit 7711684947 ("[POWERPC] Remove unused machine call outs")
removed the call to setup_io_mappings(), so remove the associated
progress line message

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11 17:18:02 -06:00
Christophe Leroy 3084cdb7cd powerpc32: refactor x_mapped_by_bats() and x_mapped_by_tlbcam() together
x_mapped_by_bats() and x_mapped_by_tlbcam() serve the same kind of
purpose, and are never defined at the same time.
So rename them x_block_mapped() and define them in the relevant
places

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11 17:18:02 -06:00
Christophe Leroy be00ed728c powerpc32: Fix pte_offset_kernel() to return NULL for bad pages
The fixmap related functions try to map kernel pages that are
already mapped through Large TLBs. pte_offset_kernel() has to
return NULL for LTLBs, otherwise the caller will try to access
level 2 table which doesn't exist

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11 17:18:02 -06:00
Christophe Leroy 516d91893b powerpc/8xx: move setup_initial_memory_limit() into 8xx_mmu.c
Now we have a 8xx specific .c file for that so put it in there
as other powerpc variants do

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11 17:18:01 -06:00
Christophe Leroy a372acfac5 powerpc/8xx: Map linear kernel RAM with 8M pages
On a live running system (VoIP gateway for Air Trafic Control), over
a 10 minutes period (with 277s idle), we get 87 millions DTLB misses
and approximatly 35 secondes are spent in DTLB handler.
This represents 5.8% of the overall time and even 10.8% of the
non-idle time.
Among those 87 millions DTLB misses, 15% are on user addresses and
85% are on kernel addresses. And within the kernel addresses, 93%
are on addresses from the linear address space and only 7% are on
addresses from the virtual address space.

MPC8xx has no BATs but it has 8Mb page size. This patch implements
mapping of kernel RAM using 8Mb pages, on the same model as what is
done on the 40x.

In 4k pages mode, each PGD entry maps a 4Mb area: we map every two
entries to the same 8Mb physical page. In each second entry, we add
4Mb to the page physical address to ease life of the FixupDAR
routine. This is just ignored by HW.

In 16k pages mode, each PGD entry maps a 64Mb area: each PGD entry
will point to the first page of the area. The DTLB handler adds
the 3 bits from EPN to map the correct page.

With this patch applied, we now get only 13 millions TLB misses
during the 10 minutes period. The idle time has increased to 313s
and the overall time spent in DTLB miss handler is 6.3s, which
represents 1% of the overall time and 2.2% of non-idle time.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11 17:18:01 -06:00
Christophe Leroy 913a6b3d10 powerpc/8xx: Save r3 all the time in DTLB miss handler
We are spending between 40 and 160 cycles with a mean of 65 cycles in
the DTLB handling routine (measured with mftbl) so make it more
simple althought it adds one instruction.
With this modification, we get three registers available at all time,
which will help with following patch.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11 17:18:01 -06:00
Linus Torvalds 2a4fb270da ARM: SoC fixes
Two more fixes for 4.5:
 
  - One is a fix for OMAP that is urgently needed to avoid DRA7xx chips from
    premature aging, by always keeping the Ethernet clock enabled.
 
  - The other solves a I/O memory layout issue on Armada, where SROM and PCI
    memory windows were conflicting in some configurations.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJW4yITAAoJEIwa5zzehBx38BsQAJRjZOeAec3/F+T8+3pnV0Jl
 URcyIFBgXQm6AVW9bwrn7bg9GOcWm0hNk4lgQ/E6KgaZpRVJQ+bhqb79Rz45LhCG
 7YmxEXtM8zhVY80/AJsEF0vzogfZsPPI3SiGF9OeIwiMEO91hpRMyvFbOqJC2H40
 YX17ARv2BTozLJ2PaW9BKoFAJX2uJJqIB6QOi307m3TVFRPQ5qPpVvh43L1+7flF
 ntugOzbEhIg1ZENeb0sNMtrhWlsNlQvulJl2xcp3sbXqkj3sPNIHzyvrPXhxOYQI
 VFJKHDC1Op6c2PFK8H0iOQMKq+WWuOidjCGwyg5/PNAoQ4cP+AoD0EpEuXXNjh7e
 8DlVhCiYNSJl7M88jahHj1pq3X+CxwQraGANHIa0nijKYp4pqOqv+CZA0sgAX5cq
 Ro6U5v5XZxgSR6QGwNBtjCxmXC4z9YaYIP/nkCW2zbPQkaeocKYNykOifp1fOI59
 VWufA0OTqk1XjVGcYorpgDaLFUAhgc14JEz1VLQGlw1/M+nVVcfr598FtTWrEoNI
 C1L2H7ahqKpVRSYCCtUlXg4TipyurjBk3A91mVBVcrSj/A4ztGkqjwMx995KzP+w
 HXI7PSulXK/HDupXslUcUCmVwkI5nxhcH7kuk978zwFFyQvDwB+A1mPysR+Naenz
 sI0wqCBHKZj70kyFCflm
 =/uWT
 -----END PGP SIGNATURE-----

Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC fixes from Olof Johansson:
 "Two more fixes for 4.5:

   - One is a fix for OMAP that is urgently needed to avoid DRA7xx chips
     from premature aging, by always keeping the Ethernet clock enabled.

   - The other solves a I/O memory layout issue on Armada, where SROM
     and PCI memory windows were conflicting in some configurations"

* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  ARM: mvebu: fix overlap of Crypto SRAM with PCIe memory window
  ARM: dts: dra7: do not gate cpsw clock due to errata i877
  ARM: OMAP2+: hwmod: Introduce ti,no-idle dt property
2016-03-11 12:35:54 -08:00
Thomas Petazzoni d7d5a43c0d ARM: mvebu: fix overlap of Crypto SRAM with PCIe memory window
When the Crypto SRAM mappings were added to the Device Tree files
describing the Armada XP boards in commit c466d997bb ("ARM: mvebu:
define crypto SRAM ranges for all armada-xp boards"), the fact that
those mappings were overlaping with the PCIe memory aperture was
overlooked. Due to this, we currently have for all Armada XP platforms
a situation that looks like this:

Memory mapping on Armada XP boards with internal registers at
0xf1000000:

 - 0x00000000 -> 0xf0000000	3.75G 	RAM
 - 0xf0000000 -> 0xf1000000	16M	NOR flashes (AXP GP / AXP DB)
 - 0xf1000000 -> 0xf1100000	1M	internal registers
 - 0xf8000000 -> 0xffe0000	126M	PCIe memory aperture
 - 0xf8100000 -> 0xf8110000	64KB	Crypto SRAM #0	=> OVERLAPS WITH PCIE !
 - 0xf8110000 -> 0xf8120000	64KB	Crypto SRAM #1	=> OVERLAPS WITH PCIE !
 - 0xffe00000 -> 0xfff00000	1M	PCIe I/O aperture
 - 0xfff0000  -> 0xffffffff	1M	BootROM

The overlap means that when PCIe devices are added, depending on their
memory window needs, they might or might not be mapped into the
physical address space. Indeed, they will not be mapped if the area
allocated in the PCIe memory aperture by the PCI core overlaps with
one of the Crypto SRAM. Typically, a Intel IGB PCIe NIC that needs 8MB
of PCIe memory will see its PCIe memory window allocated from
0xf80000000 for 8MB, which overlaps with the Crypto SRAM windows. Due
to this, the PCIe window is not created, and any attempt to access the
PCIe window makes the kernel explode:

[    3.302213] igb: Copyright (c) 2007-2014 Intel Corporation.
[    3.307841] pci 0000:00:09.0: enabling device (0140 -> 0143)
[    3.313539] mvebu_mbus: cannot add window '4:f8', conflicts with another window
[    3.320870] mvebu-pcie soc:pcie-controller: Could not create MBus window at [mem 0xf8000000-0xf87fffff]: -22
[    3.330811] Unhandled fault: external abort on non-linefetch (0x1008) at 0xf08c0018

This problem does not occur on Armada 370 boards, because we use the
following memory mapping (for boards that have internal registers at
0xf1000000):

 - 0x00000000 -> 0xf0000000	3.75G 	RAM
 - 0xf0000000 -> 0xf1000000	16M	NOR flashes (AXP GP / AXP DB)
 - 0xf1000000 -> 0xf1100000	1M	internal registers
 - 0xf1100000 -> 0xf1110000	64KB	Crypto SRAM #0 => OK !
 - 0xf8000000 -> 0xffe0000	126M	PCIe memory
 - 0xffe00000 -> 0xfff00000	1M	PCIe I/O
 - 0xfff0000  -> 0xffffffff	1M	BootROM

Obviously, the solution is to align the location of the Crypto SRAM
mappings of Armada XP to be similar with the ones on Armada 370, i.e
have them between the "internal registers" area and the beginning of
the PCIe aperture.

However, we have a special case with the OpenBlocks AX3-4 platform,
which has a 128 MB NOR flash. Currently, this NOR flash is mapped from
0xf0000000 to 0xf8000000. This is possible because on OpenBlocks
AX3-4, the internal registers are not at 0xf1000000. And this explains
why the Crypto SRAM mappings were not configured at the same place on
Armada XP.

Hence, the solution is two-fold:

 (1) Move the NOR flash mapping on Armada XP OpenBlocks AX3-4 from
     0xe8000000 to 0xf0000000. This frees the 0xf0000000 ->
     0xf80000000 space.

 (2) Move the Crypto SRAM mappings on Armada XP to be similar to
     Armada 370 (except of course that Armada XP has two Crypto SRAM
     and not one).

After this patch, the memory mapping on Armada XP boards with
registers at 0xf1 is:

 - 0x00000000 -> 0xf0000000	3.75G 	RAM
 - 0xf0000000 -> 0xf1000000	16M	NOR flashes (AXP GP / AXP DB)
 - 0xf1000000 -> 0xf1100000	1M	internal registers
 - 0xf1100000 -> 0xf1110000	64KB	Crypto SRAM #0
 - 0xf1110000 -> 0xf1120000	64KB	Crypto SRAM #1
 - 0xf8000000 -> 0xffe0000	126M	PCIe memory
 - 0xffe00000 -> 0xfff00000	1M	PCIe I/O
 - 0xfff0000  -> 0xffffffff	1M	BootROM

And the memory mapping for the special case of the OpenBlocks AX3-4
(internal registers at 0xd0000000, NOR of 128 MB):

 - 0x00000000 -> 0xc0000000	3G 	RAM
 - 0xd0000000 -> 0xd1000000	1M	internal registers
 - 0xe800000  -> 0xf0000000	128M	NOR flash
 - 0xf1100000 -> 0xf1110000	64KB	Crypto SRAM #0
 - 0xf1110000 -> 0xf1120000	64KB	Crypto SRAM #1
 - 0xf8000000 -> 0xffe0000	126M	PCIe memory
 - 0xffe00000 -> 0xfff00000	1M	PCIe I/O
 - 0xfff0000  -> 0xffffffff	1M	BootROM

Fixes: c466d997bb ("ARM: mvebu: define crypto SRAM ranges for all armada-xp boards")
Reported-by: Phil Sutter <phil@nwl.cc>
Cc: Phil Sutter <phil@nwl.cc>
Cc: <stable@vger.kernel.org>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Acked-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
2016-03-11 11:49:55 -08:00
Catalin Marinas 2776e0e8ef arm64: kasan: Fix zero shadow mapping overriding kernel image shadow
With the 16KB and 64KB page size configurations, SWAPPER_BLOCK_SIZE is
PAGE_SIZE and ARM64_SWAPPER_USES_SECTION_MAPS is 0. Since
kimg_shadow_end is not page aligned (_end shifted by
KASAN_SHADOW_SCALE_SHIFT), the edges of previously mapped kernel image
shadow via vmemmap_populate() may be overridden by subsequent calls to
kasan_populate_zero_shadow(), leading to kernel panics like below:

------------------------------------------------------------------------------
Unable to handle kernel paging request at virtual address fffffc100135068c
pgd = fffffc8009ac0000
[fffffc100135068c] *pgd=00000009ffee0003, *pud=00000009ffee0003, *pmd=00000009ffee0003, *pte=00e0000081a00793
Internal error: Oops: 9600004f [#1] PREEMPT SMP
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.5.0-rc4+ #1984
Hardware name: Juno (DT)
task: fffffe09001a0000 ti: fffffe0900200000 task.ti: fffffe0900200000
PC is at __memset+0x4c/0x200
LR is at kasan_unpoison_shadow+0x34/0x50
pc : [<fffffc800846f1cc>] lr : [<fffffc800821ff54>] pstate: 00000245
sp : fffffe0900203db0
x29: fffffe0900203db0 x28: 0000000000000000
x27: 0000000000000000 x26: 0000000000000000
x25: fffffc80099b69d0 x24: 0000000000000001
x23: 0000000000000000 x22: 0000000000002000
x21: dffffc8000000000 x20: 1fffff9001350a8c
x19: 0000000000002000 x18: 0000000000000008
x17: 0000000000000147 x16: ffffffffffffffff
x15: 79746972100e041d x14: ffffff0000000000
x13: ffff000000000000 x12: 0000000000000000
x11: 0101010101010101 x10: 1fffffc11c000000
x9 : 0000000000000000 x8 : fffffc100135068c
x7 : 0000000000000000 x6 : 000000000000003f
x5 : 0000000000000040 x4 : 0000000000000004
x3 : fffffc100134f651 x2 : 0000000000000400
x1 : 0000000000000000 x0 : fffffc100135068c

Process swapper/0 (pid: 1, stack limit = 0xfffffe0900200020)
Call trace:
[<fffffc800846f1cc>] __memset+0x4c/0x200
[<fffffc8008220044>] __asan_register_globals+0x5c/0xb0
[<fffffc8008a09d34>] _GLOBAL__sub_I_65535_1_sunrpc_cache_lookup+0x1c/0x28
[<fffffc8008f20d28>] kernel_init_freeable+0x104/0x274
[<fffffc80089e1948>] kernel_init+0x10/0xf8
[<fffffc8008093a00>] ret_from_fork+0x10/0x50
------------------------------------------------------------------------------

This patch aligns kimg_shadow_start and kimg_shadow_end to
SWAPPER_BLOCK_SIZE in all configurations.

Fixes: f9040773b7 ("arm64: move kernel image to base of vmalloc area")
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
2016-03-11 11:03:35 +00:00
Catalin Marinas 2f76969f2e arm64: kasan: Use actual memory node when populating the kernel image shadow
With the 16KB or 64KB page configurations, the generic
vmemmap_populate() implementation warns on potential offnode
page_structs via vmemmap_verify() because the arm64 kasan_init() passes
NUMA_NO_NODE instead of the actual node for the kernel image memory.

Fixes: f9040773b7 ("arm64: move kernel image to base of vmalloc area")
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: James Morse <james.morse@arm.com>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
2016-03-11 11:03:34 +00:00
Catalin Marinas fdc69e7df3 arm64: Update PTE_RDONLY in set_pte_at() for PROT_NONE permission
The set_pte_at() function must update the hardware PTE_RDONLY bit
depending on the state of the PTE_WRITE and PTE_DIRTY bits of the given
entry value. However, it currently only performs this for pte_valid()
entries, ignoring PTE_PROT_NONE. The side-effect is that PROT_NONE
mappings would not have the PTE_RDONLY bit set. Without
CONFIG_ARM64_HW_AFDBM, this is not an issue since such PROT_NONE pages
are not accessible anyway.

With commit 2f4b829c62 ("arm64: Add support for hardware updates of
the access and dirty pte bits"), the ptep_set_wrprotect() function was
re-written to cope with automatic hardware updates of the dirty state.
As an optimisation, only PTE_RDONLY is checked to assess the "dirty"
status. Since set_pte_at() does not set this bit for PROT_NONE mappings,
such pages may be considered "dirty" as a result of
ptep_set_wrprotect().

This patch updates the pte_valid() check to pte_present() in
set_pte_at(). It also adds PTE_PROT_NONE to the swap entry bits comment.

Fixes: 2f4b829c62 ("arm64: Add support for hardware updates of the access and dirty pte bits")
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Ganapatrao Kulkarni <gkulkarni@caviumnetworks.com>
Tested-by: Ganapatrao Kulkarni <gkulkarni@cavium.com>
Cc: <stable@vger.kernel.org>
2016-03-11 11:03:34 +00:00
Vineet Gupta 2d746eeb4f ARC: [*defconfig] No need to specify CONFIG_CROSS_COMPILE
The problem is with CONFIG_CPU_BIG_ENDIAN=y we still needed .config
fixup to override the the defconfig prefix to arceb-linux-

So remove these from defconfig and let user pass this via CROSS_COMPILE
environment var or use the default for ENDIAN (per previous patch)

No other arch carries them in defconfigs anyways !

Cc: Noam Camus <noamc@ezchip.com>
Cc: Alexey Brodkin <abrodkin@synopsys.com>
Cc: Anton Kolesov <akolesov@synosys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-03-11 14:59:55 +05:30
Vineet Gupta 89a269285f ARC: [BE] Select correct CROSS_COMPILE prefix
This allows CONFIG_CPU_BIG_ENDIAN=y to build correctly out of the box,
w/o any other tweaks.

Cc: Noam Camus <noamc@ezchip.com>
Cc: Alexey Brodkin <abrodkin@synopsys.com>
Cc: Anton Kolesov <akolesov@synosys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-03-11 14:59:54 +05:30
Vineet Gupta 2a41b6dc28 ARC: bitops: Remove non relevant comments
commit 80f420842f removed the ARC bitops microoptimization but failed
to prune the comments to same effect

Fixes: 80f420842f ("ARC: Make ARC bitops "safer" (add anti-optimization)")
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-03-11 14:59:54 +05:30
Adam Buchbinder 7423cc0cae ARC: Fix misspellings in comments.
Signed-off-by: Adam Buchbinder <adam.buchbinder@gmail.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-03-11 14:59:53 +05:30
Dave Hansen 0d47638f80 x86/mm/pkeys: Fix mismerge of protection keys CPUID bits
Kirill Shutemov pointed this out to me.

The tip tree currently has commit:

	dfb4a70f2 [x86/cpufeature, x86/mm/pkeys: Add protection keys related CPUID definitions]

whioch added support for two new CPUID bits: X86_FEATURE_PKU and
X86_FEATURE_OSPKE.  But, those bits were mis-merged and put in
cpufeature.h instead of cpufeatures.h.

This didn't cause any breakage *except* it keeps the "ospke" and
"pku" bits from showing up in cpuinfo.

Now cpuinfo has the two new flags:

	flags	: ...  pku ospke

BTW, is it really wise to have cpufeature.h and cpufeatures.h?
It seems like they can only cause confusion and mahem with tab
completion.

Reported-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave@sr71.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20160310221213.06F9DB53@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-11 09:55:57 +01:00
Max Filippov c91e02bd97 xtensa: support hardware breakpoints/watchpoints
Use perf framework to manage hardware instruction and data breakpoints.
Add two new ptrace calls: PTRACE_GETHBPREGS and PTRACE_SETHBPREGS to
query and set instruction and data breakpoints.
Address bit 0 choose instruction (0) or data (1) break register, bits
31..1 are the register number.
Both calls transfer two 32-bit words: address (0) and control (1).
Instruction breakpoint contorl word is 0 to clear breakpoint, 1 to set.
Data breakpoint control word bit 31 is 'trigger on store', bit 30 is
'trigger on load, bits 29..0 are length. Length 0 is used to clear a
breakpoint. To set a breakpoint length must be a power of 2 in the range
1..64 and the address must be length-aligned.

Introduce new thread_info flag: TIF_DB_DISABLED. Set it if debug
exception is raised by the kernel code accessing watched userspace
address and disable corresponding data breakpoint. On exit to userspace
check that flag and, if set, restore all data breakpoints.

Handle debug exceptions raised with PS.EXCM set. This may happen when
window overflow/underflow handler or fast exception handler hits data
breakpoint, in which case save and disable all data breakpoints,
single-step faulting instruction and restore data breakpoints.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2016-03-11 08:53:32 +00:00
Max Filippov 6ec7026ac0 xtensa: use context structure for debug exceptions
With implementation of data breakpoints debug exceptions raised when
PS.EXCM is set need to be handled, e.g. window overflow code can write
to watched userspace address. Currently debug exception handler uses
EXCSAVE and DEPC SRs to save temporary registers, but DEPC may not be
available when PS.EXCM is set and more space will be needed to save
additional state.
Reorganize debug context: create per-CPU structure debug_table instance
and store its address in the EXCSAVE<debug level> instead of
debug_exception function address. Expand this structure when more save
space is needed.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2016-03-11 08:53:32 +00:00
Max Filippov 816aa58895 xtensa: remove remaining non-functional KGDB bits
KGDB is not supported on xtensa, but there are bits of related code
under arch/xtensa/kernel. Remove these bits.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2016-03-11 08:53:32 +00:00
Max Filippov 7de7ac785a xtensa: clear all DBREAKC registers on start
There are XCHAL_NUM_DBREAK registers, clear them all.
This also fixes cryptic assembler error message with binutils 2.25 when
XCHAL_NUM_DBREAK is 0:

  as: out of memory allocating 18446744073709551575 bytes after a total
  of 495616 bytes

Cc: stable@vger.kernel.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2016-03-11 08:53:32 +00:00
Max Filippov 56b9f9d672 xtensa: xtfpga: fix earlycon endianness
Serial port is attached to XTFPGA boards as native endian device, now
that earlycon parameter parser understands mmio32native put it into
earlycon kernel parameter. This makes early console functional on both
little- and big-endian CPUs with identical kernel command lines.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2016-03-11 08:53:32 +00:00
Max Filippov bce299ca54 xtensa: xtfpga: fix i2c controller register width and endianness
I2C controller is attached to XTFPGA boards as native endian device, mark
it as such in DTS.
Set register width in DTS to 4, this way it works both for little- and
big-endian CPUs.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2016-03-11 08:53:32 +00:00
Max Filippov d99434e176 xtensa: xtfpga: fix ethernet controller endianness
Ethernet controller is attached to XTFPGA boards as native endian device,
mark it as such in DTS and pass correct endianness in platform data.
This makes network functional on big-endian CPUs.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2016-03-11 08:53:31 +00:00
Max Filippov abfbd89595 xtensa: xtfpga: fix serial port register width and endianness
Serial port is attached to XTFPGA boards as native endian device, mark
it as such in DTS and pass correct endianness in platform data.
Set register width in DTS to 4, this way it matches the platform data
and works correctly on big-endian CPUs.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2016-03-11 08:53:31 +00:00
Max Filippov 4611bf7eb5 xtensa: define CONFIG_CPU_{BIG,LITTLE}_ENDIAN
Query compiler for the CPU endianness and add corresponding definition
to KBUILD_CPPFLAGS. This allows using 'native-endian' property in DTS.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2016-03-11 08:53:31 +00:00
Max Filippov a67cc9aa2d xtensa: fix preemption in {clear,copy}_user_highpage
Disabling pagefault makes little sense there, preemption disabling is
what was meant.

Cc: stable@vger.kernel.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2016-03-11 08:53:31 +00:00
Max Filippov 362014c8d9 xtensa: ISS: don't hang if stdin EOF is reached
Simulator stdin may be connected to a file, when its end is reached
kernel hangs in infinite loop inside rs_poll, because simc_poll always
signals that descriptor 0 is readable and simc_read always returns 0.
Check simc_read return value and exit loop if it's not positive. Also
don't rewind polling timer if it's zero.

Cc: stable@vger.kernel.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2016-03-11 08:53:31 +00:00
Hector Marco-Gisbert 8b8addf891 x86/mm/32: Enable full randomization on i386 and X86_32
Currently on i386 and on X86_64 when emulating X86_32 in legacy mode, only
the stack and the executable are randomized but not other mmapped files
(libraries, vDSO, etc.). This patch enables randomization for the
libraries, vDSO and mmap requests on i386 and in X86_32 in legacy mode.

By default on i386 there are 8 bits for the randomization of the libraries,
vDSO and mmaps which only uses 1MB of VA.

This patch preserves the original randomness, using 1MB of VA out of 3GB or
4GB. We think that 1MB out of 3GB is not a big cost for having the ASLR.

The first obvious security benefit is that all objects are randomized (not
only the stack and the executable) in legacy mode which highly increases
the ASLR effectiveness, otherwise the attackers may use these
non-randomized areas. But also sensitive setuid/setgid applications are
more secure because currently, attackers can disable the randomization of
these applications by setting the ulimit stack to "unlimited". This is a
very old and widely known trick to disable the ASLR in i386 which has been
allowed for too long.

Another trick used to disable the ASLR was to set the ADDR_NO_RANDOMIZE
personality flag, but fortunately this doesn't work on setuid/setgid
applications because there is security checks which clear Security-relevant
flags.

This patch always randomizes the mmap_legacy_base address, removing the
possibility to disable the ASLR by setting the stack to "unlimited".

Signed-off-by: Hector Marco-Gisbert <hecmargi@upv.es>
Acked-by: Ismael Ripoll Ripoll <iripoll@upv.es>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: akpm@linux-foundation.org
Cc: kees Cook <keescook@chromium.org>
Link: http://lkml.kernel.org/r/1457639460-5242-1-git-send-email-hecmargi@upv.es
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-11 09:53:19 +01:00
Michael Ellerman d8c0282f4d Merge branch 'topic/mprofile-kernel' into next
Merge the ftrace changes to support -mprofile-kernel on ppc64le. This is
a prerequisite for live patching, the support for which will be merged
via the livepatch tree based on this topic branch.
2016-03-11 11:20:15 +11:00
Jon Derrick 2c2c5c5cd2 x86/PCI: VMD: Attach VMD resources to parent domain's resource tree
Attach the new VMD domain's resources to the VMD device's resources.  This
allows /proc/iomem to display a more complete picture.

Before:
  c0000000-c1ffffff : 0000:5d:05.5
  c2000000-c3ffffff : 0000:5d:05.5
    c2010000-c2013fff : nvme
  c4000000-c40fffff : 0000:5d:05.5

After:
  c0000000-c1ffffff : 0000:5d:05.5
  c2000000-c3ffffff : 0000:5d:05.5
    c2000000-c3ffffff : VMD MEMBAR1
      c2000000-c22fffff : PCI Bus 10000:01
        c2000000-c200ffff : 10000:01:00.0
        c2010000-c2013fff : 10000:01:00.0
          c2010000-c2013fff : nvme
      c2300000-c24fffff : PCI Bus 10000:01
  c4000000-c40fffff : 0000:5d:05.5
    c4002000-c40fffff : VMD MEMBAR2

Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
2016-03-10 14:57:38 -06:00
Keith Busch d068c350c0 x86/PCI: VMD: Set bus resource start to 0
The bus always starts at 0.  Due to alignment and down-casting, this
happened to work before, but looked alarmingly incorrect in kernel logs.

Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-03-10 14:53:42 -06:00
Keith Busch 83cc54a608 x86/PCI: VMD: Document code for maintainability
Comment the less obvious portion of the code for setting up memory windows,
and the platform dependency for initializing the h/w with appropriate
resources.

Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-03-10 14:52:13 -06:00
Joao Pinto c1678ffcde ARC: Add PCI support
Add PCI support to ARC and update drivers/pci Makefile enabling the ARC
arch to use the generic PCI setup functions.

[bhelgaas: fold in Joao's pci-dma-compat.h & pci-bridge.h build fix (I
should have caught this myself, sorry]
Signed-off-by: Joao Pinto <jpinto@synopsys.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
2016-03-10 14:44:13 -06:00
Linus Torvalds f2c1242194 A few simple fixes for ARM, x86, PPC and generic code. The x86 MMU fix
is a bit larger because the surrounding code needed a cleanup, but
 nothing worrisome.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJW4UwZAAoJEL/70l94x66DG3YH/0PfUr4sW0jnWRVXmYlPVka4
 sNFYrdtYnx08PwXu2sWMm1F+OBXlF/t0ZSJXJ9OBF8WdKIu8TU4yBOINRAvGO/oE
 slrivjktLTKgicTtIXP5BpRR14ohwHIGcuiIlppxvnhmQz1/rMtig7fvhZxYI545
 lJyIbyquNR86tiVdUSG9/T9+ulXXXCvOspYv8jPXZx7VKBXKTvp5P5qavSqciRb+
 O9RqY+GDCR/5vrw+MV0J7H9ZydeEJeD02LcWguTGMATTm0RCrhydvSbou42UcKfY
 osWii0kwt2LhcM/sTOz+cWnLJ6gwU9T+ZtJTTbLvYWXWDLP/+icp9ACMkwNciNo=
 =/y4V
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
 "A few simple fixes for ARM, x86, PPC and generic code.

  The x86 MMU fix is a bit larger because the surrounding code needed a
  cleanup, but nothing worrisome"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: MMU: fix reserved bit check for ept=0/CR0.WP=0/CR4.SMEP=1/EFER.NX=0
  KVM: MMU: fix ept=0/pte.u=1/pte.w=0/CR0.WP=0/CR4.SMEP=1/EFER.NX=0 combo
  kvm: cap halt polling at exactly halt_poll_ns
  KVM: s390: correct fprs on SIGP (STOP AND) STORE STATUS
  KVM: VMX: disable PEBS before a guest entry
  KVM: PPC: Book3S HV: Sanitize special-purpose register values on guest exit
2016-03-10 10:42:15 -08:00
Linus Torvalds c32c2cb272 arm64 fixes:
- Temporarily disable huge pages built using contiguous ptes
 - Ensure vmemmap region is sufficiently aligned for sparsemem sections
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABCgAGBQJW4FSYAAoJELescNyEwWM0cp4H/0+9iMTqb3KowIW1vQPNnOG2
 BG/RVsGlCPAwBu0V4FdcY7eK4fQ9J+/UCmVd5/SrlpjNpblinxRPihbIzs4bToqa
 It02wNk7ISPm0oYtyGrRu1TpC5AvMykcluZkU/CUk0sjZlBAi8WSaJiqftFuZSGH
 lhyhARO4KscbAUUhwDDYNKuWmLbmyOpt9RM2fziNQdjSp+8czCoCR9G+JXiPQFsJ
 ORU10BqBDCyFpp8/NhM55qA76FJo6RCBUWx/6L1oJJxjvahkmPba/hhnfI7+Xj1u
 3FKAntJ6wVZeKqRsIkOlECoU/mrgjlTByTFN+o3KhOky8ZYBHoveIQWtsqNMFd4=
 =7d1o
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:
 "I thought we were done for 4.5, but then the 64k-page chaps came
  crawling out of the woodwork.  *sigh*

  The vmemmap fix I sent for -rc7 caused a regression with 64k pages and
  sparsemem and at some point during the release cycle the new hugetlb
  code using contiguous ptes started failing the libhugetlbfs tests with
  64k pages enabled.

  So here are a couple of patches that fix the vmemmap alignment and
  disable the new hugetlb page sizes whilst a proper fix is being
  developed:

   - Temporarily disable huge pages built using contiguous ptes

   - Ensure vmemmap region is sufficiently aligned for sparsemem
     sections"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: hugetlb: partial revert of 66b3923a1a
  arm64: account for sparsemem section alignment when choosing vmemmap offset
2016-03-10 10:39:04 -08:00
Linus Torvalds 2da33f9f96 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Martin Schwidefsky:
 "Three bug fixes:
   - The fix for the page table corruption (CVE-2016-2143)
   - The diagnose statistics introduced a regression for the dasd diag
     driver
   - Boot crash on systems without the set-program-parameters facility"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/mm: four page table levels vs. fork
  s390/cpumf: Fix lpp detection
  s390/dasd: fix diag 0x250 inline assembly
2016-03-10 10:36:07 -08:00
Jianyu Zhan 10ee73865e x86/entry/traps: Show unhandled signal for i386 in do_trap()
Commit abd4f7505b ("x86: i386-show-unhandled-signals-v3") did turn on
the showing-unhandled-signal behaviour for i386 for some exception handlers,
but for no reason do_trap() is left out (my naive guess is because turning it on
for do_trap() would be too noisy since do_trap() is shared by several exceptions).

And since the same commit make "show_unhandled_signals" a debug tunable(in
/proc/sys/debug/exception-trace), and x86 by default turning it on.

So it would be strange for i386 users who turing it on manually and expect
seeing the unhandled signal output in log, but nothing.

This patch turns it on for i386 in do_trap() as well.

Signed-off-by: Jianyu Zhan <nasa4836@gmail.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: bp@suse.de
Cc: dave.hansen@linux.intel.com
Cc: heukelum@fastmail.fm
Cc: jbeulich@novell.com
Cc: jdike@addtoit.com
Cc: joe@perches.com
Cc: luto@kernel.org
Link: http://lkml.kernel.org/r/1457612398-4568-1-git-send-email-nasa4836@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-10 18:37:25 +01:00
Paolo Bonzini f958ee745f KVM: s390: Fixes and features for kvm/next (4.6) part 2
- add watchdog diagnose to trace event decoder
 - better handle the cpu timer when not inside the guest
 - only provide STFLE if the CPU model has STFLE
 - reduce DMA page usage
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.14 (GNU/Linux)
 
 iQIcBAABAgAGBQJW3/XrAAoJEBF7vIC1phx8D7cP/RGZqjVQF8S0tOpzGl2EHRYd
 8MyTA2b53wUQx1cR35U1Tx1mKoTBoAcwADVJljCsCzCuGp/Bae9kT+T1kz4/aWEv
 vlQqE7+63VkPsgQxFCcKA6CKagFLz+2QBmNFMdYStjHjN/MNBDwfo58I9Y9g2DdB
 r7cjdExKC1WhCUguDv8eREtNWGRGC8IDV8ID02xTABTSk7LYcIGok0YIRjsZaV5b
 W1gchW5GicOTp/mktkCsNPhnFg28owsG2LVcb5zeGQuq0TowvrSvRzxCOrHM1lKS
 WU28FDvFn64bPvSAu1w894YKMsMxvm9SX5SWQ7nA5JkDko5lQ49H77M2YipMLUXY
 F+KN9TlS352fzGr5i7TaPUOLAtFDhdtM0NAicUf5+ydR73DLetQOvMqqenmmP1Mw
 K9hgAviFtKfnScIomCPrmeel5ddIl0OTshiZMGIbS6/lAGRZ5aJuXPxUgMhFiz95
 YgjH2tb16eR0eYAs2PIt2bsJVis4x2MfFi+PExYzWP5vERxzMthE879Ra42NBfT1
 tswxtvvQtXgXrkakwYkCTXqnja0JLNOhfKvIL73/am9DbwUmikWfVYOoq/yEe/w9
 Gf5ZHzACrokHDqeKbMT6xzGEk4C1QEn7VRc4H4MX47xseVVXb2GxPYFNASgL2SSU
 Qe6+IILH9uDhDkXVHIJD
 =n2MS
 -----END PGP SIGNATURE-----

Merge tag 'kvm-s390-next-4.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

KVM: s390: Fixes and features for kvm/next (4.6) part 2

- add watchdog diagnose to trace event decoder
- better handle the cpu timer when not inside the guest
- only provide STFLE if the CPU model has STFLE
- reduce DMA page usage
2016-03-10 17:06:07 +01:00
Martin Schwidefsky e370e47694 s390: fix floating pointer register corruption (again)
There is a tricky interaction between the machine check handler
and the critical sections of load_fpu_regs and save_fpu_regs
functions. If the machine check interrupts one of the two
functions the critical section cleanup will complete the function
before the machine check handler s390_do_machine_check is called.
Trouble is that the machine check handler needs to validate the
floating point registers *before* and not *after* the completion
of load_fpu_regs/save_fpu_regs.

The simplest solution is to rewind the PSW to the start of the
load_fpu_regs/save_fpu_regs and retry the function after the
return from the machine check handler.

Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: <stable@vger.kernel.org> # 4.3+
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-03-10 14:35:42 +01:00
Heiko Carstens 8f100bb1ff s390/cpumf: add missing lpp magic initialization
Add the missing lpp magic initialization for cpu 0. Without this all
samples on cpu 0 do not have the most significant bit set in the
program parameter field, which we use to distinguish between guest and
host samples if the pid is also 0.

We did initialize the lpp magic in the absolute zero lowcore but
forgot that when switching to the allocated lowcore on cpu 0 only.

Reported-by: Shu Juan Zhang <zhshuj@cn.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: stable@vger.kernel.org # v4.4+
Fixes: e22cf8ca6f ("s390/cpumf: rework program parameter setting to detect guest samples")
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-03-10 14:35:35 +01:00
Madhavan Srinivasan 58bffb5bbb powerpc/perf: Fix misleading comment in pmao_restore_workaround()
The current comment in pmao_restore_workaround() regarding
hard_irq_disable() is wrong. It should say to hard *disable* interrupts
instead of *enable*. Fix it.

Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-10 23:00:23 +11:00
Sukadev Bhattiprolu 8f69dc701a powerpc/perf/24x7: Eliminate domain suffix in event names
The Physical Core events of the 24x7 PMU can be monitored across various
domains (physical core, vcpu home core, vcpu home node etc). For each of
these core events, we currently create multiple events in sysfs, one for
each domain the event can be monitored in. These events are distinguished
by their suffixes like __PHYS_CORE, __VCPU_HOME_CORE etc.

Rather than creating multiple such entries, we could let the user specify
make 'domain' index a required parameter and let the user specify a value
for it (like they currently specify the core index).

	$ cat /sys/bus/event_source/devices/hv_24x7/events/HPM_CCYC
	domain=?,offset=0x98,core=?,lpar=0x0

	$ perf stat -C 0 -e hv_24x7/HPM_CCYC,domain=2,core=1/ true

(the 'domain=?' and 'core=?' in sysfs tell perf tool to enforce them as
required parameters).

This simplifies the interface and allows users to identify events by the
name specified in the catalog (User can determine the domain index by
referring to '/sys/bus/event_source/devices/hv_24x7/interface/domains').

Eliminating the event suffix eliminates several functions and simplifies
code.

Note that Physical Chip events can only be monitored in the chip domain
so those events have the domain set to 1 (rather than =?) and users don't
need to specify the domain index for the Chip events.

	$ cat /sys/bus/event_source/devices/hv_24x7/events/PM_XLINK_CYCLES
	domain=1,offset=0x230,chip=?,lpar=0x0

	$ perf stat -C 0 -e hv_24x7/PM_XLINK_CYCLES,chip=1/ true

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-10 22:57:23 +11:00
Sukadev Bhattiprolu d34171e88a powerpc/perf/hv-24x7: Display domain indices in sysfs
To help users determine domains, display the domain indices used by the
kernel in sysfs.

	$ cat /sys/bus/event_source/devices/hv_24x7/interface/domains
	1: Physical Chip
	2: Physical Core
	3: VCPU Home Core
	4: VCPU Home Chip
	5: VCPU Home Node
	6: VCPU Remote Node

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-10 22:57:22 +11:00
Sukadev Bhattiprolu 2b206ee6b0 powerpc/perf/hv-24x7: Display change in counter values
For 24x7 counters, perf displays the raw value of the 24x7 counter, which
is a monotonically increasing value.

	perf stat -C 0 -e \
		'hv_24x7/HPM_0THRD_NON_IDLE_CCYC__PHYS_CORE,core=1/' \
		sleep 1

 Performance counter stats for 'CPU(s) 0':

     9,105,403,170      hv_24x7/HPM_0THRD_NON_IDLE_CCYC__PHYS_CORE,core=1/

       0.000425751 seconds time elapsed

In the typical usage of 'perf stat' this counter value is not as useful
as the _change_ in the counter value over the duration of the application.

Have h_24x7_event_init() set the event's prev_count to the raw value of
the 24x7 counter at the time of initialization. When the application
terminates, hv_24x7_event_read() will compute the change in value and
report to the perf tool. Similarly, for the transaction interface, clear
the event count to 0 at the beginning of the transaction.

	perf stat -C 0 -e \
		'hv_24x7/HPM_0THRD_NON_IDLE_CCYC__PHYS_CORE,core=1/' \
		sleep 1

 Performance counter stats for 'CPU(s) 0':

           245,758      hv_24x7/HPM_0THRD_NON_IDLE_CCYC__PHYS_CORE,core=1/

       1.006366383 seconds time elapsed

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-10 22:56:55 +11:00
Sukadev Bhattiprolu e5a5886d7a powerpc/perf/hv-24x7: Fix usage with chip events.
24x7 counters can belong to different domains (core, chip, virtual CPU
etc). For events in the 'chip' domain, sysfs entry currently looks like:

	$ cd /sys/bus/event_source/devices/hv_24x7/events
	$ cat PM_XLINK_CYCLES__PHYS_CHIP
	domain=0x1,offset=0x230,core=?,lpar=0x0

where the required parameter, 'core=?' is specified with perf as:

	perf stat -C 0 -e hv_24x7/PM_XLINK_CYCLES__PHYS_CHIP,core=1/ \
		/bin/true

This is inconsistent in that 'core' is a required parameter for a chip
event.  Instead, have the the sysfs entry display 'chip=?' for chip
events:

	$ cd /sys/bus/event_source/devices/hv_24x7/events
	$ cat PM_XLINK_CYCLES__PHYS_CHIP
	domain=0x1,offset=0x230,chip=?,lpar=0x0

We also need to add a 'chip' entry in the sysfs format directory:

	$ ls /sys/bus/event_source/devices/hv_24x7/format
	chip  core  domain  lpar  offset  vcpu
	^^^^
	(new)

so the perf tool can automatically check usage and format the chip
parameter correctly:

	$ perf stat -C 0 -v -e hv_24x7/PM_XLINK_CYCLES__PHYS_CHIP/ \
		/bin/true
	Required parameter 'chip' not specified
	invalid or unsupported event: 'hv_24x7/PM_XLINK_CYCLES__PHYS_CHIP/'

	$ perf stat -C 0 -v -e hv_24x7/PM_XLINK_CYCLES__PHYS_CHIP,chip=1/ \
		/bin/true
	hv_24x7/PM_XLINK_CYCLES__PHYS_CHIP,chip=1/: 0 6628908 6628908

	 Performance counter stats for 'CPU(s) 0':

	         0      hv_24x7/PM_XLINK_CYCLES__PHYS_CHIP,chip=1/

	    0.006606970 seconds time elapsed

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-10 22:56:54 +11:00
Sukadev Bhattiprolu e0728b50d4 powerpc/perf: Export Power8 generic and cache events to sysfs
Power8 supports a large number of events in each susbystem so when a
user runs:

	perf stat -e branch-instructions sleep 1
	perf stat -e L1-dcache-loads sleep 1

it is not clear as to which PMU events were monitored.

Export the generic hardware and cache perf events for Power8 to sysfs,
so users can precisely determine the PMU event monitored by the generic
event.

Eg:
	cat /sys/bus/event_source/devices/cpu/events/branch-instructions
	event=0x10068

	$ cat /sys/bus/event_source/devices/cpu/events/L1-dcache-loads
	event=0x100ee

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-10 22:56:05 +11:00
Sukadev Bhattiprolu d4969e2459 powerpc/perf: Remove PME_ prefix for power7 events
We used the PME_ prefix earlier to avoid some macro/variable name
collisions.  We have since changed the way we define/use the event
macros so we no longer need the prefix.

By dropping the prefix, we keep the the event macros consistent with
their official names.

Reported-by: Michael Ellerman <ellerman@au1.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-10 22:56:04 +11:00
Borislav Petkov 84477336ec x86/delay: Avoid preemptible context checks in delay_mwaitx()
We do use this_cpu_ptr(&cpu_tss) as a cacheline-aligned, seldomly
accessed per-cpu var as the MONITORX target in delay_mwaitx(). However,
when called in preemptible context, this_cpu_ptr -> smp_processor_id() ->
debug_smp_processor_id() fires:

  BUG: using smp_processor_id() in preemptible [00000000] code: udevd/312
  caller is delay_mwaitx+0x40/0xa0

But we don't care about that check - we only need cpu_tss as a MONITORX
target and it doesn't really matter which CPU's var we're touching as
we're going idle anyway. Fix that.

Suggested-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: spg_linux_kernel@amd.com
Link: http://lkml.kernel.org/r/20160309205622.GG6564@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-10 11:27:12 +01:00
Paolo Bonzini 5f0b819995 KVM: MMU: fix reserved bit check for ept=0/CR0.WP=0/CR4.SMEP=1/EFER.NX=0
KVM has special logic to handle pages with pte.u=1 and pte.w=0 when
CR0.WP=1.  These pages' SPTEs flip continuously between two states:
U=1/W=0 (user and supervisor reads allowed, supervisor writes not allowed)
and U=0/W=1 (supervisor reads and writes allowed, user writes not allowed).

When SMEP is in effect, however, U=0 will enable kernel execution of
this page.  To avoid this, KVM also sets NX=1 in the shadow PTE together
with U=0, making the two states U=1/W=0/NX=gpte.NX and U=0/W=1/NX=1.
When guest EFER has the NX bit cleared, the reserved bit check thinks
that the latter state is invalid; teach it that the smep_andnot_wp case
will also use the NX bit of SPTEs.

Cc: stable@vger.kernel.org
Reviewed-by: Xiao Guangrong <guangrong.xiao@linux.inel.com>
Fixes: c258b62b26
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-10 11:26:10 +01:00
Paolo Bonzini 844a5fe219 KVM: MMU: fix ept=0/pte.u=1/pte.w=0/CR0.WP=0/CR4.SMEP=1/EFER.NX=0 combo
Yes, all of these are needed. :) This is admittedly a bit odd, but
kvm-unit-tests access.flat tests this if you run it with "-cpu host"
and of course ept=0.

KVM runs the guest with CR0.WP=1, so it must handle supervisor writes
specially when pte.u=1/pte.w=0/CR0.WP=0.  Such writes cause a fault
when U=1 and W=0 in the SPTE, but they must succeed because CR0.WP=0.
When KVM gets the fault, it sets U=0 and W=1 in the shadow PTE and
restarts execution.  This will still cause a user write to fault, while
supervisor writes will succeed.  User reads will fault spuriously now,
and KVM will then flip U and W again in the SPTE (U=1, W=0).  User reads
will be enabled and supervisor writes disabled, going back to the
originary situation where supervisor writes fault spuriously.

When SMEP is in effect, however, U=0 will enable kernel execution of
this page.  To avoid this, KVM also sets NX=1 in the shadow PTE together
with U=0.  If the guest has not enabled NX, the result is a continuous
stream of page faults due to the NX bit being reserved.

The fix is to force EFER.NX=1 even if the CPU is taking care of the EFER
switch.  (All machines with SMEP have the CPU_LOAD_IA32_EFER vm-entry
control, so they do not use user-return notifiers for EFER---if they did,
EFER.NX would be forced to the same value as the host).

There is another bug in the reserved bit check, which I've split to a
separate patch for easier application to stable kernels.

Cc: stable@vger.kernel.org
Cc: Andy Lutomirski <luto@amacapital.net>
Reviewed-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Fixes: f6577a5fa1
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-10 11:26:07 +01:00
Andy Lutomirski 9999c8c01f x86/entry: Call enter_from_user_mode() with IRQs off
Now that slow-path syscalls always enter C before enabling
interrupts, it's straightforward to call enter_from_user_mode() before
enabling interrupts rather than doing it as part of entry tracing.

With this change, we should finally be able to retire exception_enter().

This will also enable optimizations based on knowing that we never
change context tracking state with interrupts on.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/bc376ecf87921a495e874ff98139b1ca2f5c5dd7.1457558566.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-10 10:53:26 +01:00
Andy Lutomirski a798f09111 x86/entry/32: Change INT80 to be an interrupt gate
We want all of the syscall entries to run with interrupts off so that
we can efficiently run context tracking before enabling interrupts.

This will regress int $0x80 performance on 32-bit kernels by a
couple of cycles.  This shouldn't matter much -- int $0x80 is not a
fast path.

This effectively reverts:

  657c1eea00 ("x86/entry/32: Fix entry_INT80_32() to expect interrupts to be on")

... and fixes the same issue differently.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/59b4f90c9ebfccd8c937305dbbbca680bc74b905.1457558566.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-10 10:53:26 +01:00
Ingo Molnar 6cbe9e4a22 Merge branch 'linus' into locking/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-10 10:28:27 +01:00
Yu-cheng Yu a65050c6f1 x86/fpu: Revert ("x86/fpu: Disable AVX when eagerfpu is off")
Leonid Shatz noticed that the SDM interpretation of the following
recent commit:

  394db20ca2 ("x86/fpu: Disable AVX when eagerfpu is off")

... is incorrect and that the original behavior of the FPU code was correct.

Because AVX is not stated in CR0 TS bit description, it was mistakenly
believed to be not supported for lazy context switch. This turns out
to be false:

  Intel Software Developer's Manual Vol. 3A, Sec. 2.5 Control Registers:

   'TS Task Switched bit (bit 3 of CR0) -- Allows the saving of the x87 FPU/
    MMX/SSE/SSE2/SSE3/SSSE3/SSE4 context on a task switch to be delayed until
    an x87 FPU/MMX/SSE/SSE2/SSE3/SSSE3/SSE4 instruction is actually executed
    by the new task.'

  Intel Software Developer's Manual Vol. 2A, Sec. 2.4 Instruction Exception
  Specification:

   'AVX instructions refer to exceptions by classes that include #NM
    "Device Not Available" exception for lazy context switch.'

So revert the commit.

Reported-by: Leonid Shatz <leonid.shatz@ravellosystems.com>
Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1457569734-3785-1-git-send-email-yu-cheng.yu@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-10 10:15:58 +01:00
Andy Lutomirski fda57b2267 x86/entry: Improve system call entry comments
Ingo suggested that the comments should explain when the various
entries are used.  This adds these explanations and improves other
parts of the comments.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/9524ecef7a295347294300045d08354d6a57c6e7.1457578375.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-10 09:48:15 +01:00
Andy Lutomirski 392a62549f x86/entry: Remove TIF_SINGLESTEP entry work
Now that SYSENTER with TF set puts X86_EFLAGS_TF directly into
regs->flags, we don't need a TIF_SINGLESTEP fixup in the syscall
entry code.  Remove it.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/2d15f24da52dafc9d2f0b8d76f55544f4779c517.1457578375.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-10 09:48:14 +01:00
Andy Lutomirski 2a41aa4feb x86/entry/32: Add and check a stack canary for the SYSENTER stack
The first instruction of the SYSENTER entry runs on its own tiny
stack.  That stack can be used if a #DB or NMI is delivered before
the SYSENTER prologue switches to a real stack.

We have code in place to prevent us from overflowing the tiny stack.
For added paranoia, add a canary to the stack and check it in
do_debug() -- that way, if something goes wrong with the #DB logic,
we'll eventually notice.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/6ff9a806f39098b166dc2c41c1db744df5272f29.1457578375.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-10 09:48:14 +01:00
Andy Lutomirski 7536656f08 x86/entry/32: Simplify and fix up the SYSENTER stack #DB/NMI fixup
Right after SYSENTER, we can get a #DB or NMI.  On x86_32, there's no IST,
so the exception handler is invoked on the temporary SYSENTER stack.

Because the SYSENTER stack is very small, we have a fixup to switch
off the stack quickly when this happens.  The old fixup had several issues:

 1. It checked the interrupt frame's CS and EIP.  This wasn't
    obviously correct on Xen or if vm86 mode was in use [1].

 2. In the NMI handler, it did some frightening digging into the
    stack frame.  I'm not convinced this digging was correct.

 3. The fixup didn't switch stacks and then switch back.  Instead, it
    synthesized a brand new stack frame that would redirect the IRET
    back to the SYSENTER code.  That frame was highly questionable.
    For one thing, if NMI nested inside #DB, we would effectively
    abort the #DB prologue, which was probably safe but was
    frightening.  For another, the code used PUSHFL to write the
    FLAGS portion of the frame, which was simply bogus -- by the time
    PUSHFL was called, at least TF, NT, VM, and all of the arithmetic
    flags were clobbered.

Simplify this considerably.  Instead of looking at the saved frame
to see where we came from, check the hardware ESP register against
the SYSENTER stack directly.  Malicious user code cannot spoof the
kernel ESP register, and by moving the check after SAVE_ALL, we can
use normal PER_CPU accesses to find all the relevant addresses.

With this patch applied, the improved syscall_nt_32 test finally
passes on 32-bit kernels.

[1] It isn't obviously correct, but it is nonetheless safe from vm86
    shenanigans as far as I can tell.  A user can't point EIP at
    entry_SYSENTER_32 while in vm86 mode because entry_SYSENTER_32,
    like all kernel addresses, is greater than 0xffff and would thus
    violate the CS segment limit.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/b2cdbc037031c07ecf2c40a96069318aec0e7971.1457578375.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-10 09:48:14 +01:00
Andy Lutomirski 6dcc94149d x86/entry: Only allocate space for tss_struct::SYSENTER_stack if needed
The SYSENTER stack is only used on 32-bit kernels.  Remove it on 64-bit kernels.

( We may end up using it down the road on 64-bit kernels. If so,
  we'll re-enable it for CONFIG_IA32_EMULATION. )

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/9dbd18429f9ff61a76b6eda97a9ea20510b9f6ba.1457578375.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-10 09:48:14 +01:00
Andy Lutomirski f2b375756c x86/entry: Vastly simplify SYSENTER TF (single-step) handling
Due to a blatant design error, SYSENTER doesn't clear TF (single-step).

As a result, if a user does SYSENTER with TF set, we will single-step
through the kernel until something clears TF.  There is absolutely
nothing we can do to prevent this short of turning off SYSENTER [1].

Simplify the handling considerably with two changes:

  1. We already sanitize EFLAGS in SYSENTER to clear NT and AC.  We can
     add TF to that list of flags to sanitize with no overhead whatsoever.

  2. Teach do_debug() to ignore single-step traps in the SYSENTER prologue.

That's all we need to do.

Don't get too excited -- our handling is still buggy on 32-bit
kernels.  There's nothing wrong with the SYSENTER code itself, but
the #DB prologue has a clever fixup for traps on the very first
instruction of entry_SYSENTER_32, and the fixup doesn't work quite
correctly.  The next two patches will fix that.

[1] We could probably prevent it by forcing BTF on at all times and
    making sure we clear TF before any branches in the SYSENTER
    code.  Needless to say, this is a bad idea.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/a30d2ea06fe4b621fe6a9ef911b02c0f38feb6f2.1457578375.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-10 09:48:13 +01:00
Andy Lutomirski 8bb5643686 x86/entry/traps: Clear DR6 early in do_debug() and improve the comment
Leaving any bits set in DR6 on return from a debug exception is
asking for trouble.  Prevent it by writing zero right away and
clarify the comment.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/3857676e1be8fb27db4b89bbb1e2052b7f435ff4.1457578375.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-10 09:48:13 +01:00
Andy Lutomirski 81edd9f69a x86/entry/traps: Clear TIF_BLOCKSTEP on all debug exceptions
The SDM says that debug exceptions clear BTF, and we need to keep
TIF_BLOCKSTEP in sync with BTF.  Clear it unconditionally and improve
the comment.

I suspect that the fact that kmemcheck could cause TIF_BLOCKSTEP not
to be cleared was just an oversight.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/fa86e55d196e6dde5b38839595bde2a292c52fdc.1457578375.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-10 09:48:13 +01:00
Andy Lutomirski c2c9b52fab x86/entry/32: Restore FLAGS on SYSEXIT
We weren't restoring FLAGS at all on SYSEXIT.  Apparently no one cared.

With this patch applied, native kernels should always honor
task_pt_regs()->flags, which opens the door for some sys_iopl()
cleanups.  I'll do those as a separate series, though, since getting
it right will involve tweaking some paravirt ops.

( The short version is that, before this patch, sys_iopl(), invoked via
  SYSENTER, wasn't guaranteed to ever transfer the updated
  regs->flags, so sys_iopl() had to change the hardware flags register
  as well. )

Reported-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/3f98b207472dc9784838eb5ca2b89dcc845ce269.1457578375.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-10 09:48:12 +01:00
Andy Lutomirski 67f590e8d4 x86/entry/32: Filter NT and speed up AC filtering in SYSENTER
This makes the 32-bit code work just like the 64-bit code.  It should
speed up syscalls on 32-bit kernels on Skylake by something like 20
cycles (by analogy to the 64-bit compat case).

It also cleans up NT just like we do for the 64-bit case.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/07daef3d44bd1ed62a2c866e143e8df64edb40ee.1457578375.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-10 09:48:12 +01:00
Andy Lutomirski e786041153 x86/entry/compat: In SYSENTER, sink AC clearing below the existing FLAGS test
CLAC is slow, and the SYSENTER code already has an unlikely path
that runs if unusual flags are set.  Drop the CLAC and instead rely
on the unlikely path to clear AC.

This seems to save ~24 cycles on my Skylake laptop.  (Hey, Intel,
make this faster please!)

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/90d6db2189f9add83bc7bddd75a0c19ebbd676b2.1457578375.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-10 09:48:12 +01:00
Martin Schwidefsky 3446c13b26 s390/mm: four page table levels vs. fork
The fork of a process with four page table levels is broken since
git commit 6252d702c5 "[S390] dynamic page tables."

All new mm contexts are created with three page table levels and
an asce limit of 4TB. If the parent has four levels dup_mmap will
add vmas to the new context which are outside of the asce limit.
The subsequent call to copy_page_range will walk the three level
page table structure of the new process with non-zero pgd and pud
indexes. This leads to memory clobbers as the pgd_index *and* the
pud_index is added to the mm->pgd pointer without a pgd_deref
in between.

The init_new_context() function is selecting the number of page
table levels for a new context. The function is used by mm_init()
which in turn is called by dup_mm() and mm_alloc(). These two are
used by fork() and exec(). The init_new_context() function can
distinguish the two cases by looking at mm->context.asce_limit,
for fork() the mm struct has been copied and the number of page
table levels may not change. For exec() the mm_alloc() function
set the new mm structure to zero, in this case a three-level page
table is created as the temporary stack space is located at
STACK_TOP_MAX = 4TB.

This fixes CVE-2016-2143.

Reported-by: Marcin Kościelnicki <koriakin@0x04.net>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-03-10 09:21:24 +01:00
Linus Walleij cc998d8bc7 Linux 4.5-rc5
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJWyN0eAAoJEHm+PkMAQRiGqIAIAKKodaqX5ACJhTRozj3GN5iV
 dDHU/SJQj4nIyJecaCVAJIBa3gvAX6GyY+Jg4JKJ4TKAdR0Hd/3EwOWIR+0+BQIM
 0MqmB0CRLzq42AOQtpDUdwB+OTE8jFQFQd2gFKuQYJJ61ppykCC36OWV0bTfQLSV
 b2esO4Ry6eoQnDMw8oT52ncUIZEvQ2DZE3L6tNDEPD/0je14GWkV1Fx1+X2jb9cB
 diFA2TmaEEXMHNT1NCLSQ+D7QefXV3mFl85leNlFi5QQNy7ZdSh7kvvOodMQ2uAS
 qa9V8Uk6LZYv5O71+Jr5Rmlqh3GxNRCMXu2tlMd2gtw8ApEvBw6XoL5YZYE13Lk=
 =3HMg
 -----END PGP SIGNATURE-----

Merge tag 'v4.5-rc5' into devel

Linux 4.5-rc5
2016-03-10 09:29:25 +07:00
Dan Williams 489011652a Merge branch 'for-4.6/pfn' into libnvdimm-for-next 2016-03-09 17:15:43 -08:00
Mark Rutland 0d97e6d802 arm64: kasan: clear stale stack poison
Functions which the compiler has instrumented for KASAN place poison on
the stack shadow upon entry and remove this poison prior to returning.

In the case of cpuidle, CPUs exit the kernel a number of levels deep in
C code.  Any instrumented functions on this critical path will leave
portions of the stack shadow poisoned.

If CPUs lose context and return to the kernel via a cold path, we
restore a prior context saved in __cpu_suspend_enter are forgotten, and
we never remove the poison they placed in the stack shadow area by
functions calls between this and the actual exit of the kernel.

Thus, (depending on stackframe layout) subsequent calls to instrumented
functions may hit this stale poison, resulting in (spurious) KASAN
splats to the console.

To avoid this, clear any stale poison from the idle thread for a CPU
prior to bringing a CPU online.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-09 15:43:42 -08:00
Dan Williams 59e6473980 libnvdimm, pmem: clear poison on write
If a write is directed at a known bad block perform the following:

1/ write the data

2/ send a clear poison command

3/ invalidate the poison out of the cache hierarchy

Cc: <x86@kernel.org>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-03-09 15:15:32 -08:00
Olof Johansson 1dea581f86 ARM: OMAP2+: critical DRA7xx fix for v4.5-rc
Force the DRA7xx Ethernet internal clock source to stay enabled
 per TI erratum i877:
 
 http://www.ti.com/lit/er/sprz429h/sprz429h.pdf
 
 Otherwise, if the Ethernet internal clock source is disabled, the
 chip will age prematurely, and the RGMII I/O timing will soon
 fail to meet the delay time and skew specifications for 1000Mbps
 Ethernet.
 
 This fix should go in as soon as possible.
 
 Basic build, boot, and PM test results are available here:
 
 http://www.pwsan.com/omap/testlogs/omap-critical-fixes-for-v4.5-rc/20160307014209/
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJW3UuoAAoJEMePsQ0LvSpL4JIP/j9A1ax1c6kGfNujSzBMrVL3
 I68l27ohfbo/MKMc/KsqkdahzGimIUmqkJGxrnA19UMhfyMb6l3pzlVswxfUUd10
 EXl/aKlPDa+Xl+A+TCwK78C69ZXHk4nqsNDSixuoIVfxM2uTZZZmNK3FOR+/EaQ8
 kUq3zwkg31HgsYxIyvqcCwpsxmDwKXx6fQ3sX5A6tQGvtsdeNofWJOVoGpZe0Ott
 tmt09VEvSGvXVEL1Um6U5A+8Mw6GPWa9/wih8nYaE70BswuOmIMUxeCkrShDadpn
 4Z8rqZg1Q8sdnI7ZCARS2WZ63+wrcjq04Yycf7m8feUc7cIDqlahWnrIWKuvpPAz
 P20LgrwRQDgifM2TzJupPRUKX+7BoACOXTIt2A3HuOIsZRfqysFx4qoOEdQNBlVq
 mOOwA/o8ly8hJI7uym8elrPY4MjZ4f6l2h/mFom0XrlS/1NcxXwuGqi9SJNneFSm
 ALyCIW7YnemoOex0tUcHUg2fiGaRceWlSmzHxI0WgVyOj86hrXc8OnpjnPmuhMQV
 i4pkL4Y1/UxZepd0b6QOTUC+LQvsWL008XLUr0SPm1d2Co9sxyzN8i0pXh07bsm0
 sOflS6DtwWSNenX/OVVQWk0r5amNwiFFpiw7tBWIeXYi4glhyizqdGjbzxRjxJUf
 QfFex23RAWtf/1ZrvqQO
 =kJw8
 -----END PGP SIGNATURE-----

Merge tag 'for-v4.5-rc/omap-critical-fixes-a' of git://git.kernel.org/pub/scm/linux/kernel/git/pjw/omap-pending into fixes

ARM: OMAP2+: critical DRA7xx fix for v4.5-rc

Force the DRA7xx Ethernet internal clock source to stay enabled
per TI erratum i877:

http://www.ti.com/lit/er/sprz429h/sprz429h.pdf

Otherwise, if the Ethernet internal clock source is disabled, the
chip will age prematurely, and the RGMII I/O timing will soon
fail to meet the delay time and skew specifications for 1000Mbps
Ethernet.

This fix should go in as soon as possible.

Basic build, boot, and PM test results are available here:

http://www.pwsan.com/omap/testlogs/omap-critical-fixes-for-v4.5-rc/20160307014209/

* tag 'for-v4.5-rc/omap-critical-fixes-a' of git://git.kernel.org/pub/scm/linux/kernel/git/pjw/omap-pending:
  ARM: dts: dra7: do not gate cpsw clock due to errata i877
  ARM: OMAP2+: hwmod: Introduce ti,no-idle dt property

Signed-off-by: Olof Johansson <olof@lixom.net>
2016-03-09 14:15:31 -08:00
Xuelin Shi 3b5eb41b8c powerpc/p5040: Add device node for RAID Engine
add the missing RAID Engine device node for p5040.
otherwise, the device can not be detected.

Signed-off-by: Xuelin Shi <xuelin.shi@nxp.com>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-09 10:44:18 -06:00
Christophe Leroy 7e393220b6 powerpc: optimise csum_partial() call when len is constant
csum_partial is often called for small fixed length packets
for which it is suboptimal to use the generic csum_partial()
function.

For instance, in my configuration, I got:
* One place calling it with constant len 4
* Seven places calling it with constant len 8
* Three places calling it with constant len 14
* One place calling it with constant len 20
* One place calling it with constant len 24
* One place calling it with constant len 32

This patch renames csum_partial() to __csum_partial() and
implements csum_partial() as a wrapper inline function which
* uses csum_add() for small 16bits multiple constant length
* uses ip_fast_csum() for other 32bits multiple constant
* uses __csum_partial() in all other cases

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-09 10:44:18 -06:00
Raghav Dogra ac6082dd32 powerpc/fsl-lbc: Modify suspend/resume entry sequence
Modify platform driver suspend/resume to syscore
suspend/resume. This is because p1022ds needs to use
localbus when entering the PCIE resume.

Signed-off-by: Raghav Dogra <raghav.dogra@nxp.com>
[scottwood: dropped makefile churn]
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-09 10:44:17 -06:00
Christophe Leroy 921fff351c powerpc/8xx: CONFIG_DEBUG_PAGEALLOC requires ITLBmiss for kernel addresses
When CONFIG_DEBUG_PAGEALLOC is activated, the initial TLB mapping gets
flushed to track accesses to wrong areas. Therefore, kernel addresses
will also generate ITLB misses.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-09 10:44:17 -06:00
Christophe Leroy 501ea76687 powerpc/885: set SDCR to 0x40
The MPC885 reference manual says that SDCR shall have value 0x40, but
most exemples set SDCR to 0x1
With 0x1 in SDCR, we observe TX underruns on SCC when using it in
QMC mode.
According the NXP technical support, this is a copy/paste error from
MPC860 reference manual, 0x40 being the only value supported
by the MPC885 HW.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-09 10:44:16 -06:00
Bartlomiej Zolnierkiewicz b278268b63 powerpc/86xx: disable IDE subsystem in mpc8610_hpcd_defconfig
This patch disables deprecated IDE subsystem in mpc8610_hpcd_defconfig
(no IDE host drivers are selected in this config so there is no valid
reason to enable IDE subsystem itself).

Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-09 10:44:15 -06:00
Bartlomiej Zolnierkiewicz 2fa1d23071 powerpc/85xx: disable IDE subsystem in stx_gp3_defconfig
This patch disables deprecated IDE subsystem in stx_gp3_defconfig
(no IDE host drivers are selected in this config so there is no valid
reason to enable IDE subsystem itself).

Cc: Scott Wood <oss@buserror.net>
Cc: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-09 10:44:15 -06:00
Bartlomiej Zolnierkiewicz 451bc2e9e3 powerpc/85xx: disable IDE subsystem in ksi8560_defconfig
This patch disables deprecated IDE subsystem in ksi8560_defconfig
(no IDE host drivers are selected in this config so there is no valid
reason to enable IDE subsystem itself).

Cc: Scott Wood <oss@buserror.net>
Cc: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-09 10:44:14 -06:00
Bartlomiej Zolnierkiewicz ba1353eee0 powerpc/83xx: disable IDE subsystem in mpc834x_itx_defconfig
This patch disables deprecated IDE subsystem in mpc834x_itx_defconfig
(no IDE host drivers are selected in this config so there is no valid
reason to enable IDE subsystem itself).

Cc: Scott Wood <oss@buserror.net>
Cc: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-09 10:44:14 -06:00
Will Deacon ff7925848b arm64: hugetlb: partial revert of 66b3923a1a
Commit 66b3923a1a ("arm64: hugetlb: add support for PTE contiguous bit")
introduced support for huge pages using the contiguous bit in the PTE
as opposed to block mappings, which may be slightly unwieldy (512M) in
64k page configurations.

Unfortunately, this support has resulted in some late regressions when
running the libhugetlbfs test suite with 64k pages and CONFIG_DEBUG_VM
as a result of a BUG:

 | readback (2M: 64):	------------[ cut here ]------------
 | kernel BUG at fs/hugetlbfs/inode.c:446!
 | Internal error: Oops - BUG: 0 [#1] SMP
 | Modules linked in:
 | CPU: 7 PID: 1448 Comm: readback Not tainted 4.5.0-rc7 #148
 | Hardware name: linux,dummy-virt (DT)
 | task: fffffe0040964b00 ti: fffffe00c2668000 task.ti: fffffe00c2668000
 | PC is at remove_inode_hugepages+0x44c/0x480
 | LR is at remove_inode_hugepages+0x264/0x480

Rather than revert the entire patch, simply avoid advertising the
contiguous huge page sizes for now while people are actively working on
a fix. This patch can then be reverted once things have been sorted out.

Cc: David Woods <dwoods@ezchip.com>
Reported-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-03-09 15:29:29 +00:00
Ard Biesheuvel 36e5cd6b89 arm64: account for sparsemem section alignment when choosing vmemmap offset
Commit dfd55ad85e ("arm64: vmemmap: use virtual projection of linear
region") fixed an issue where the struct page array would overflow into the
adjacent virtual memory region if system RAM was placed so high up in
physical memory that its addresses were not representable in the build time
configured virtual address size.

However, the fix failed to take into account that the vmemmap region needs
to be relatively aligned with respect to the sparsemem section size, so that
a sequence of page structs corresponding with a sparsemem section in the
linear region appears naturally aligned in the vmemmap region.

So round up vmemmap to sparsemem section size. Since this essentially moves
the projection of the linear region up in memory, also revert the reduction
of the size of the vmemmap region.

Cc: <stable@vger.kernel.org>
Fixes: dfd55ad85e ("arm64: vmemmap: use virtual projection of linear region")
Tested-by: Mark Langsdorf <mlangsdo@redhat.com>
Tested-by: David Daney <david.daney@cavium.com>
Tested-by: Robert Richter <rrichter@cavium.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-03-09 14:57:08 +00:00
Luis R. Rodriguez f6e45661f9 dma, mm/pat: Rename dma_*_writecombine() to dma_*_wc()
Rename dma_*_writecombine() to dma_*_wc(), so that the naming
is coherent across the various write-combining APIs. Keep the
old names for compatibility for a while, these can be removed
at a later time. A guard is left to enable backporting of the
rename, and later remove of the old mapping defines seemlessly.

Build tested successfully with allmodconfig.

The following Coccinelle SmPL patch was used for this simple
transformation:

@ rename_dma_alloc_writecombine @
expression dev, size, dma_addr, gfp;
@@

-dma_alloc_writecombine(dev, size, dma_addr, gfp)
+dma_alloc_wc(dev, size, dma_addr, gfp)

@ rename_dma_free_writecombine @
expression dev, size, cpu_addr, dma_addr;
@@

-dma_free_writecombine(dev, size, cpu_addr, dma_addr)
+dma_free_wc(dev, size, cpu_addr, dma_addr)

@ rename_dma_mmap_writecombine @
expression dev, vma, cpu_addr, dma_addr, size;
@@

-dma_mmap_writecombine(dev, vma, cpu_addr, dma_addr, size)
+dma_mmap_wc(dev, vma, cpu_addr, dma_addr, size)

We also keep the old names as compatibility helpers, and
guard against their definition to make backporting easier.

Generated-by: Coccinelle SmPL
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: airlied@linux.ie
Cc: akpm@linux-foundation.org
Cc: benh@kernel.crashing.org
Cc: bhelgaas@google.com
Cc: bp@suse.de
Cc: dan.j.williams@intel.com
Cc: daniel.vetter@ffwll.ch
Cc: dhowells@redhat.com
Cc: julia.lawall@lip6.fr
Cc: konrad.wilk@oracle.com
Cc: linux-fbdev@vger.kernel.org
Cc: linux-pci@vger.kernel.org
Cc: luto@amacapital.net
Cc: mst@redhat.com
Cc: tomi.valkeinen@ti.com
Cc: toshi.kani@hp.com
Cc: vinod.koul@intel.com
Cc: xen-devel@lists.xensource.com
Link: http://lkml.kernel.org/r/1453516462-4844-1-git-send-email-mcgrof@do-not-panic.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-09 14:57:51 +01:00
Borislav Petkov 8b30a8b3c6 x86/defconfigs/32: Set CONFIG_FRAME_WARN to the Kconfig default
Sync it to the Kconfig default for 32-bit.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: tim.gardner@canonical.com
Link: http://lkml.kernel.org/r/20160309134821.GD6564@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-09 14:53:41 +01:00
Paolo Bonzini 5a5fbdc0e3 KVM: x86: remove eager_fpu field of struct kvm_vcpu_arch
It is now equal to use_eager_fpu(), which simply tests a cpufeature bit.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-09 14:04:36 +01:00
Paolo Bonzini a87036add0 KVM: x86: disable MPX if host did not enable MPX XSAVE features
When eager FPU is disabled, KVM will still see the MPX bit in CPUID and
presumably the MPX vmentry and vmexit controls.  However, it will not
be able to expose the MPX XSAVE features to the guest, because the guest's
accessible XSAVE features are always a subset of host_xcr0.

In this case, we should disable the MPX CPUID bit, the BNDCFGS MSR,
and the MPX vmentry and vmexit controls for nested virtualization.
It is then unnecessary to enable guest eager FPU if the guest has the
MPX CPUID bit set.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-09 14:04:36 +01:00
Andy Lutomirski f363938c70 x86/fpu: Fix 'no387' regression
After fixing FPU option parsing, we now parse the 'no387' boot option
too early: no387 clears X86_FEATURE_FPU before it's even probed, so
the boot CPU promptly re-enables it.

I suspect it gets even more confused on SMP.

Fix the probing code to leave X86_FEATURE_FPU off if it's been
disabled by setup_clear_cpu_cap().

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: yu-cheng yu <yu-cheng.yu@intel.com>
Fixes: 4f81cbafcc ("x86/fpu: Fix early FPU command-line parsing")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-09 13:54:40 +01:00
Paolo Bonzini ab92f30875 KVM/ARM updates for 4.6
- VHE support so that we can run the kernel at EL2 on ARMv8.1 systems
 - PMU support for guests
 - 32bit world switch rewritten in C
 - Various optimizations to the vgic save/restore code
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJW36xjAAoJECPQ0LrRPXpDGQkQAMDppzcTOixT3e8VPdHAX09a
 Z5PO0gyTMVV7Jyz5Ul3pedPJA2GSK9mxOCwqvIFbdxLAR6ZB00juO5FrTHkSdI91
 1XLPj4bKoMWcVvhL/g5A4Glp/pVMW1k/9Yq8zZAtYlsLRlqG5rLOutSadcqHcYaJ
 cTD/pFf7b2oPtkTPyoFml75KgHBT/8uvAvFDOWA66Id2z6T11+PsBT/6XnGDiwKg
 tpGTNzx3kPIKIzOAOHqVW6UBxFOeabebXLT8wUz3VwNn/UbG6gkumMNApMAyF2q1
 zU0nAh8+7Ek6Dr4OFWE6BfW6sgg/l7i1lA8XoAmqG7ZTrSptCc59fvaZJxPruG+Q
 dMsU6QgR77JJjbZTinf9a1jReZ/liZrx2gZXedVKdILrjmDSq0UnGcxjUOEDZOGy
 2/dbrlJhv+LhpcJtuPpxPCfoqbW5L0ynzmuYuXRdRz3lTHiOWIRx5gugrhO+wH4D
 4gvZhbw3XCiYfpYHYhl8A1EH5kanKgdXDocz9yIm7mZm89gngufF/HkeXS3ZU25T
 yThyBGulGjqN4FCdgf1HolkTfFjnfSx4qJovJ58eHga+HNLXRkTecZZcbFy2OOHv
 8Bx0PIlwj4RgSaRLWQUudAhdhKS2g22DKDDljxFwhkMPNghvqkYMJCRDKLu6GBXQ
 4YsLKM+TaShHFjSpx+ao
 =rpvb
 -----END PGP SIGNATURE-----

Merge tag 'kvm-arm-for-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/ARM updates for 4.6

- VHE support so that we can run the kernel at EL2 on ARMv8.1 systems
- PMU support for guests
- 32bit world switch rewritten in C
- Various optimizations to the vgic save/restore code

Conflicts:
	include/uapi/linux/kvm.h
2016-03-09 11:50:42 +01:00
Linus Walleij 0bae2f1732 Merge branch 'ib-mfd-regulator-gpio-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd into devel 2016-03-09 17:40:37 +07:00
Marc Zyngier b40c4892d1 arm64: KVM: vgic-v3: Only wipe LRs on vcpu exit
So far, we're always writing all possible LRs, setting the empty
ones with a zero value. This is obvious doing a low of work for
nothing, and we're better off clearing those we've actually
dirtied on the exit path (it is very rare to inject more than one
interrupt at a time anyway).

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-03-09 04:24:16 +00:00
Marc Zyngier 0d98d00b8d arm64: KVM: vgic-v3: Reset LRs at boot time
In order to let the GICv3 code be more lazy in the way it
accesses the LRs, it is necessary to start with a clean slate.

Let's reset the LRs on each CPU when the vgic is probed (which
includes a round trip to EL2...).

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-03-09 04:24:09 +00:00
Marc Zyngier 84e8b9c88d arm64: KVM: vgic-v3: Do not save an LR known to be empty
On exit, any empty LR will be signaled in ICH_ELRSR_EL2. Which
means that we do not have to save it, and we can just clear
its state in the in-memory copy.

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-03-09 04:24:07 +00:00
Marc Zyngier b4344545cf arm64: KVM: vgic-v3: Save maintenance interrupt state only if required
Next on our list of useless accesses is the maintenance interrupt
status registers (ICH_MISR_EL2, ICH_EISR_EL2).

It is pointless to save them if we haven't asked for a maintenance
interrupt the first place, which can only happen for two reasons:
- Underflow: ICH_HCR_UIE will be set,
- EOI: ICH_LR_EOI will be set.

These conditions can be checked on the in-memory copies of the regs.
Should any of these two condition be valid, we must read GICH_MISR.
We can then check for ICH_MISR_EOI, and only when set read
ICH_EISR_EL2.

This means that in most case, we don't have to save them at all.

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-03-09 04:24:06 +00:00
Marc Zyngier 1b8e83c04e arm64: KVM: vgic-v3: Avoid accessing ICH registers
Just like on GICv2, we're a bit hammer-happy with GICv3, and access
them more often than we should.

Adopt a policy similar to what we do for GICv2, only save/restoring
the minimal set of registers. As we don't access the registers
linearly anymore (we may skip some), the convoluted accessors become
slightly simpler, and we can drop the ugly indexing macro that
tended to confuse the reviewers.

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-03-09 04:24:04 +00:00
Christophe Lombard c0efa9aee8 powerpc: New possible return value from hcall
The hcalls introduced for cxl use a possible new value:
H_STATE (invalid state).

Co-authored-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 13:05:51 +11:00
Andrew Donnellan 949e9b827e powerpc/eeh: eeh_pci_enable(): fix checking of post-request state
In eeh_pci_enable(), after making the request to set the new options, we
call eeh_ops->wait_state() to check that the request finished successfully.

At the moment, if eeh_ops->wait_state() returns 0, we return 0 without
checking that it reflects the expected outcome. This can lead to callers
further up the chain incorrectly assuming the slot has been successfully
unfrozen and continuing to attempt recovery.

On powernv, this will occur if pnv_eeh_get_pe_state() or
pnv_eeh_get_phb_state() return 0, which in turn occurs if the relevant OPAL
call returns OPAL_EEH_STOPPED_MMIO_DMA_FREEZE or
OPAL_EEH_PHB_ERROR respectively.

On pseries, this will occur if pseries_eeh_get_state() returns 0, which in
turn occurs if RTAS reports that the PE is in the MMIO Stopped and DMA
Stopped states.

Obviously, none of these cases represent a successful completion of a
request to thaw MMIO or DMA.

Fix the check so that a wait_state() return value of 0 won't be considered
successful for the EEH_OPT_THAW_MMIO or EEH_OPT_THAW_DMA cases.

Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 11:33:30 +11:00
Gavin Shan b6c7347f2f powerpc/eeh: Remove duplicated check in eeh_dump_pe_log()
When eeh_dump_pe_log() is only called by eeh_slot_error_detail(),
we already have the check that the PE isn't in PCI config blocked
state in eeh_slot_error_detail(). So we needn't the duplicated
check in eeh_dump_pe_log().

This removes the duplicated check in eeh_dump_pe_log(). No logical
changes introduced.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 10:25:35 +11:00
Gavin Shan eca036ee1b powerpc/eeh: Synchronize recovery in host/guest
When passing through SRIOV VFs to guest, we possibly encounter EEH
error on PF. In this case, the VF PEs are put into frozen state.
The error could be reported to guest before it's captured by the
host. That means the guest could attempt to recover errors on VFs
before host gets chance to recover errors on PFs. The VFs won't be
recovered successfully.

This enforces the recovery order for above case: the recovery on
child PE in guest is hold until the recovery on parent PE in host
is completed.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 09:58:28 +11:00
Gavin Shan 3fa7bf7229 powerpc/eeh: Don't remove passed VFs
When we have partial hotplug as part of the error recovery on PF,
the VFs that are bound with vfio-pci driver will experience hotplug.
That's not allowed.

This checks if the VF PE is passed or not. If it does, we leave
the VF without removing it.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 09:58:27 +11:00
Gavin Shan 2311cca555 powerpc/eeh: Don't propagate error to guest
When EEH error happened to the parent PE of those PEs that have
been passed through to guest, the error is propagated to guest
domain and the VFIO driver's error handlers are called. It's not
correct as the error in the host domain shouldn't be propagated
to guests and affect them.

This adds one more limitation when calling EEH error handlers.
If the PE has been passed through to guest, the error handlers
won't be called.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 09:58:25 +11:00
Wei Yang 67086e32b5 powerpc/eeh: powerpc/eeh: Support error recovery for VF PE
PFs are enumerated on PCI bus, while VFs are created by PF's driver.

In EEH recovery, it has two cases:
1. Device and driver is EEH aware, error handlers are called.
2. Device and driver is not EEH aware, un-plug the device and plug it again
by enumerating it.

The special thing happens on the second case. For a PF, we could use the
original pci core to enumerate the bus, while for VF we need to record the
VFs which aer un-plugged then plug it again.

Also The patch caches the VF index in pci_dn, which can be used to
calculate VF's bus, device and function number. Those information helps to
locate the VF's PCI device instance when doing hotplug during EEH recovery
if necessary.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 09:58:23 +11:00
Wei Yang 0dc2830e0a powerpc/powernv: Support PCI config restore for VFs
After PE reset, OPAL API opal_pci_reinit() is called on all devices
contained in the PE to reinitialize them. While skiboot is not aware of
VFs, we have to implement the function in kernel to reinitialize VFs after
reset on PE for VFs.

In this patch, two functions pnv_pci_fixup_vf_mps() and
pnv_eeh_restore_vf_config() both manipulate the MPS of the VF, since for a
VF it has three cases.

1. Normal creation for a VF
   In this case, pnv_pci_fixup_vf_mps() is called to make the MPS a proper
   value compared with its parent.
2. EEH recovery without VF removed
   In this case, MPS is stored in pci_dn and pnv_eeh_restore_vf_config() is
   called to restore it and reinitialize other part.
3. EEH recovery with VF removed
   In this case, VF will be removed then re-created. Both functions are
   called. First pnv_pci_fixup_vf_mps() is called to store the proper MPS
   to pci_dn and then pnv_eeh_restore_vf_config() is called to do proper
   thing.

This introduces two functions: pnv_pci_fixup_vf_mps() to fixup the VF's
MPS to make sure it is equal to parent's and store this value in pci_dn
for future use. pnv_eeh_restore_vf_config() to re-initialize on VF by
restoring MPS, disabling completion timeout, enabling SERR, etc.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 09:58:22 +11:00
Wei Yang 9312bc5bab powerpc/powernv: Support EEH reset for VF PE
PEs for VFs don't have primary bus. So they have to have their own reset
backend, which is used during EEH recovery. The patch implements the reset
backend for VF's PE by issuing FLR or AF FLR to the VFs, which are contained
in the PE.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 09:58:21 +11:00
Wei Yang c29fa27d26 powerpc/eeh: Create PE for VFs
This creates PEs for VFs in the weak function pcibios_bus_add_device().
Those PEs for VFs are identified with newly introduced flag EEH_PE_VF
so that we treat them differently during EEH recovery.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 09:58:19 +11:00
Wei Yang 39218cd00e powerpc/eeh: EEH device for VF
VFs and their corresponding pdn are created and released dynamically
when their PF's SRIOV capability is enabled and disabled. This creates
and releases EEH devices for VFs when creating and releasing their pdn
instances, which means EEH devices and pdn instances have same life
cycle. Also, VF's EEH device is identified by (struct eeh_dev::physfn).

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 09:58:18 +11:00
Wei Yang 51c0e87e9a powerpc/eeh: Cache normal BARs, not windows or IOV BARs
This restricts the EEH address cache to use only the first 7 BARs. This
makes __eeh_addr_cache_insert_dev() ignore PCI bridge window and IOV BARs.
As the result of this change, eeh_addr_cache_get_dev() will return VFs from
VF's resource addresses instead of parent PFs.

This also removes PCI bridge check as we limit __eeh_addr_cache_insert_dev()
to 7 BARs and this effectively excludes PCI bridges from being cached.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 09:58:17 +11:00
Wei Yang 971427f582 powerpc/pci: Remove VFs prior to PF
As commit ac205b7bb7 ("PCI: make sriov work with hotplug remove")
indicates, VFs which is on the same PCI bus as their PF, should be
removed before the PF. Otherwise, we might run into kernel crash
at PCI unplugging time.

This applies the above pattern to powerpc PCI hotplug path.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 09:58:15 +11:00
Gavin Shan 4eb0799ff9 powerpc/eeh: Reworked eeh_pe_bus_get()
The original implementation is ugly: unnecessary if statements and
"out" tag. This reworks the function to avoid above weaknesses. No
functional changes introduced.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 09:57:46 +11:00
Bjorn Helgaas e7e127e3c7 PCI: Include pci/hotplug Kconfig directly from pci/Kconfig
Include pci/hotplug/Kconfig directly from pci/Kconfig, so arches don't
have to source both pci/Kconfig and pci/hotplug/Kconfig.

Note that this effectively adds pci/hotplug/Kconfig to the following
arches, because they already sourced drivers/pci/Kconfig but they
previously did not source drivers/pci/hotplug/Kconfig:

  alpha
  arm
  avr32
  frv
  m68k
  microblaze
  mn10300
  sparc
  unicore32

Inspired-by-patch-from: Bogicevic Sasa <brutallesale@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-03-08 15:10:48 -06:00
Bogicevic Sasa 5f8fc43217 PCI: Include pci/pcie/Kconfig directly from pci/Kconfig
Include pci/pcie/Kconfig directly from pci/Kconfig, so arches don't
have to source both pci/Kconfig and pci/pcie/Kconfig.

Note that this effectively adds pci/pcie/Kconfig to the following
arches, because they already sourced drivers/pci/Kconfig but they
previously did not source drivers/pci/pcie/Kconfig:

  alpha
  avr32
  blackfin
  frv
  m32r
  m68k
  microblaze
  mn10300
  parisc
  sparc
  unicore32
  xtensa

[bhelgaas: changelog, source pci/pcie/Kconfig at top of pci/Kconfig, whitespace]
Signed-off-by: Sasa Bogicevic <brutallesale@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-03-08 14:36:48 -06:00
Bharat Kumar Gogada 01cf9d524f microblaze/PCI: Support generic Xilinx AXI PCIe Host Bridge IP driver
Modify the Microblaze PCI subsystem to work with the generic
drivers/pci/host/pcie-xilinx.c driver on Microblaze and Zynq.

[bhelgaas: changelog]
Signed-off-by: Bharat Kumar Gogada <bharatku@xilinx.com>
Signed-off-by: Ravi Kiran Gummaluri <rgummal@xilinx.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Michal Simek <michal.simek@xilinx.com>
2016-03-08 14:25:57 -06:00
Stefan Agner 4774f40093 Input: ad7879 - move header to platform_data directory
The header file is used by the SPI and I2C variant of the driver.
Therefore, move it to a more generic place under platform_data.

Signed-off-by: Stefan Agner <stefan@agner.ch>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2016-03-08 10:51:10 -08:00
Bjorn Helgaas 0c0e0736ac PCI: Set ROM shadow location in arch code, not in PCI core
IORESOURCE_ROM_SHADOW means there is a copy of a device's option ROM in
RAM.  The existence of such a copy and its location are arch-specific.
Previously the IORESOURCE_ROM_SHADOW flag was set in arch code, but the
0xC0000-0xDFFFF location was hard-coded into the PCI core.

If we're using a shadow copy in RAM, disable the ROM BAR and release the
address space it was consuming.  Move the location information from the PCI
core to the arch code that sets IORESOURCE_ROM_SHADOW.  Save the location
of the RAM copy in the struct resource for PCI_ROM_RESOURCE.

After this change, pci_map_rom() will call pci_assign_resource() and
pci_enable_rom() for these IORESOURCE_ROM_SHADOW resources, which we did
not do before.  This is safe because:

  - pci_assign_resource() will do nothing because the resource is marked
    IORESOURCE_PCI_FIXED, which means we can't move it, and

  - pci_enable_rom() will not turn on the ROM BAR's enable bit because the
    resource is marked IORESOURCE_ROM_SHADOW, which means it is in RAM
    rather than in PCI memory space.

Storing the location in the struct resource means "lspci" will show the
shadow location, not the value from the ROM BAR.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-03-08 12:14:31 -06:00
Bjorn Helgaas 63e22924f5 PCI: Mark shadow copy of VGA ROM as IORESOURCE_PCI_FIXED
A shadow copy of an option ROM is placed by the BIOS as a fixed address.
Set IORESOURCE_PCI_FIXED to indicate that we can't move the shadow copy.
This prevents warnings like the following when we assign resources:

  BAR 6: [??? 0x00000000 flags 0x2] has bogus alignment

This warning is emitted by pdev_sort_resources(), which already ignores
IORESOURCE_PCI_FIXED resources.

Link: http://lkml.kernel.org/r/CA+55aFyVMfTBB0oz_yx8+eQOEJnzGtCsYSj9QuhEpdZ9BHdq5A@mail.gmail.com
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-03-08 12:14:31 -06:00
Bjorn Helgaas b894157145 x86/PCI: Mark Broadwell-EP Home Agent & PCU as having non-compliant BARs
The Home Agent and PCU PCI devices in Broadwell-EP have a non-BAR register
where a BAR should be.  We don't know what the side effects of sizing the
"BAR" would be, and we don't know what address space the "BAR" might appear
to describe.

Mark these devices as having non-compliant BARs so the PCI core doesn't
touch them.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Andi Kleen <ak@linux.intel.com>
CC: stable@vger.kernel.org
2016-03-08 12:13:57 -06:00
Bjorn Helgaas 9289b9d3aa unicore32: Remove unused HAVE_ARCH_PCI_SET_DMA_MASK definition
HAVE_ARCH_PCI_SET_DMA_MASK is unused, so remove the #define for it.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: GUAN Xuetao <gxt@mprc.pku.edu.cn>
2016-03-08 11:39:36 -06:00
David S. Miller 810813c47a Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Several cases of overlapping changes, as well as one instance
(vxlan) of a bug fix in 'net' overlapping with code movement
in 'net-next'.

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-08 12:34:12 -05:00
Tony Luck 92b0729c34 x86/mm, x86/mce: Add memcpy_mcsafe()
Make use of the EXTABLE_FAULT exception table entries to write
a kernel copy routine that doesn't crash the system if it
encounters a machine check. Prime use case for this is to copy
from large arrays of non-volatile memory used as storage.

We have to use an unrolled copy loop for now because current
hardware implementations treat a machine check in "rep mov"
as fatal. When that is fixed we can simplify.

Return type is a "bool". True means that we copied OK, false means
that it didn't.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@gmail.com>
Link: http://lkml.kernel.org/r/a44e1055efc2d2a9473307b22c91caa437aa3f8b.1456439214.git.tony.luck@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-08 17:54:38 +01:00
Borislav Petkov 7a8698058a perf/x86/intel/rapl: Simplify quirk handling even more
Drop the quirk() function pointer in favor of a simple boolean which
says whether the quirk should be applied or not. Update comment while at
it.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Harish Chegondi <harish.chegondi@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: linux-tip-commits@vger.kernel.org
Link: http://lkml.kernel.org/r/20160308164041.GF16568@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-08 17:49:52 +01:00
Adam Buchbinder 7eb792bf7c s390: Fix misspellings in comments
Signed-off-by: Adam Buchbinder <adam.buchbinder@gmail.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-03-08 15:00:17 +01:00
Martin Schwidefsky 1e133ab296 s390/mm: split arch/s390/mm/pgtable.c
The pgtable.c file is quite big, before it grows any larger split it
into pgtable.c, pgalloc.c and gmap.c. In addition move the gmap related
header definitions into the new gmap.h header and all of the pgste
helpers from pgtable.h to pgtable.c.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-03-08 15:00:15 +01:00
Martin Schwidefsky 227be799c3 s390/mm: uninline pmdp_xxx functions from pgtable.h
The pmdp_xxx function are smaller than their ptep_xxx counterparts
but to keep things symmetrical unline them as well.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-03-08 15:00:14 +01:00
Martin Schwidefsky ebde765c0e s390/mm: uninline ptep_xxx functions from pgtable.h
The code in the various ptep_xxx functions has grown quite large,
consolidate them to four out-of-line functions:
  ptep_xchg_direct to exchange a pte with another with immediate flushing
  ptep_xchg_lazy to exchange a pte with another in a batched update
  ptep_modify_prot_start to begin a protection flags update
  ptep_modify_prot_commit to commit a protection flags update

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-03-08 15:00:12 +01:00
Sudeep Holla 6d6acd140a arm64: dts: juno/vexpress: fix node name unit-address presence warnings
Commit fa38a82096a1 ("scripts/dtc: Update to upstream version
53bf130b1cdd") added warnings on node name unit-address presence/absence
mismatch in device trees.

This patch fixes those warning on all the juno/vexpress platforms where
unit-address is present in node name while the reg/ranges property is
not present. It also adds unit-address to all smb bus node.

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2016-03-08 13:54:27 +00:00
Ingo Molnar 14ddde78c7 Merge branch 'linus' into x86/fpu, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-08 14:25:45 +01:00
Andy Lutomirski 0dd0036f6e x86/asm-offsets: Remove PARAVIRT_enabled
It no longer has any users.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luis R. Rodriguez <mcgrof@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: boris.ostrovsky@oracle.com
Cc: david.vrabel@citrix.com
Cc: konrad.wilk@oracle.com
Cc: lguest@lists.ozlabs.org
Cc: xen-devel@lists.xensource.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-08 14:16:44 +01:00
Andy Lutomirski 58a5aac533 x86/entry/32: Introduce and use X86_BUG_ESPFIX instead of paravirt_enabled
x86_64 has very clean espfix handling on paravirt: espfix64 is set
up in native_iret, so paravirt systems that override iret bypass
espfix64 automatically.  This is robust and straightforward.

x86_32 is messier.  espfix is set up before the IRET paravirt patch
point, so it can't be directly conditionalized on whether we use
native_iret.  We also can't easily move it into native_iret without
regressing performance due to a bizarre consideration.  Specifically,
on 64-bit kernels, the logic is:

  if (regs->ss & 0x4)
          setup_espfix;

On 32-bit kernels, the logic is:

  if ((regs->ss & 0x4) && (regs->cs & 0x3) == 3 &&
      (regs->flags & X86_EFLAGS_VM) == 0)
          setup_espfix;

The performance of setup_espfix itself is essentially irrelevant, but
the comparison happens on every IRET so its performance matters.  On
x86_64, there's no need for any registers except flags to implement
the comparison, so we fold the whole thing into native_iret.  On
x86_32, we don't do that because we need a free register to
implement the comparison efficiently.  We therefore do espfix setup
before restoring registers on x86_32.

This patch gets rid of the explicit paravirt_enabled check by
introducing X86_BUG_ESPFIX on 32-bit systems and using an ALTERNATIVE
to skip espfix on paravirt systems where iret != native_iret.  This is
also messy, but it's at least in line with other things we do.

This improves espfix performance by removing a branch, but no one
cares.  More importantly, it removes a paravirt_enabled user, which is
good because paravirt_enabled is ill-defined and is going away.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luis R. Rodriguez <mcgrof@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: boris.ostrovsky@oracle.com
Cc: david.vrabel@citrix.com
Cc: konrad.wilk@oracle.com
Cc: lguest@lists.ozlabs.org
Cc: xen-devel@lists.xensource.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-08 14:16:44 +01:00
David Hildenbrand c54f0d6ae0 KVM: s390: allocate only one DMA page per VM
We can fit the 2k for the STFLE interpretation and the crypto
control block into one DMA page. As we now only have to allocate
one DMA page, we can clean up the code a bit.

As a nice side effect, this also fixes a problem with crycbd alignment in
case special allocation debug options are enabled, debugged by Sascha
Silbe.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-03-08 13:57:54 +01:00
David Hildenbrand 80bc79dc0b KVM: s390: enable STFLE interpretation only if enabled for the guest
Not setting the facility list designation disables STFLE interpretation,
this is what we want if the guest was told to not have it.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-03-08 13:57:54 +01:00
David Hildenbrand b3c17f10fa KVM: s390: wake up when the VCPU cpu timer expires
When the VCPU cpu timer expires, we have to wake up just like when the ckc
triggers. For now, setting up a cpu timer in the guest and going into
enabled wait will never lead to a wakeup. This patch fixes this problem.
Just as for the ckc, we have to take care of waking up too early. We
have to recalculate the sleep time and go back to sleep.

Please note that the timer callback calls kvm_s390_get_cpu_timer() from
interrupt context. As the timer is canceled when leaving handle_wait(),
and we don't do any VCPU cpu timer writes/updates in that function, we can
be sure that we will never try to read the VCPU cpu timer from the same cpu
that is currentyl updating the timer (deadlock).

Reported-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Tested-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-03-08 13:57:53 +01:00
David Hildenbrand 5ebda31686 KVM: s390: step the VCPU timer while in enabled wait
The cpu timer is a mean to measure task execution time. We want
to account everything for a VCPU for which it is responsible. Therefore,
if the VCPU wants to sleep, it shall be accounted for it.

We can easily get this done by not disabling cpu timer accounting when
scheduled out while sleeping because of enabled wait.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-03-08 13:57:53 +01:00
David Hildenbrand 9c23a1318e KVM: s390: protect VCPU cpu timer with a seqcount
For now, only the owning VCPU thread (that has loaded the VCPU) can get a
consistent cpu timer value when calculating the delta. However, other
threads might also be interested in a more recent, consistent value. Of
special interest will be the timer callback of a VCPU that executes without
having the VCPU loaded and could run in parallel with the VCPU thread.

The cpu timer has a nice property: it is only updated by the owning VCPU
thread. And speaking about accounting, a consistent value can only be
calculated by looking at cputm_start and the cpu timer itself in
one shot, otherwise the result might be wrong.

As we only have one writing thread at a time (owning VCPU thread), we can
use a seqcount instead of a seqlock and retry if the VCPU refreshed its
cpu timer. This avoids any heavy locking and only introduces a counter
update/check plus a handful of smp_wmb().

The owning VCPU thread should never have to retry on reads, and also for
other threads this might be a very rare scenario.

Please note that we have to use the raw_* variants for locking the seqcount
as lockdep will produce false warnings otherwise. The rq->lock held during
vcpu_load/put is also acquired from hardirq context. Lockdep cannot know
that we avoid potential deadlocks by disabling preemption and thereby
disable concurrent write locking attempts (via vcpu_put/load).

Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-03-08 13:57:53 +01:00
David Hildenbrand db0758b297 KVM: s390: step VCPU cpu timer during kvm_run ioctl
Architecturally we should only provide steal time if we are scheduled
away, and not if the host interprets a guest exit. We have to step
the guest CPU timer in these cases.

In the first shot, we will step the VCPU timer only during the kvm_run
ioctl. Therefore all time spent e.g. in interception handlers or on irq
delivery will be accounted for that VCPU.

We have to take care of a few special cases:
- Other VCPUs can test for pending irqs. We can only report a consistent
  value for the VCPU thread itself when adding the delta.
- We have to take care of STP sync, therefore we have to extend
  kvm_clock_sync() and disable preemption accordingly
- During any call to disable/enable/start/stop we could get premeempted
  and therefore get start/stop calls. Therefore we have to make sure we
  don't get into an inconsistent state.

Whenever a VCPU is scheduled out, sleeping, in user space or just about
to enter the SIE, the guest cpu timer isn't stepped.

Please note that all primitives are prepared to be called from both
environments (cpu timer accounting enabled or not), although not completely
used in this patch yet (e.g. kvm_s390_set_cpu_timer() will never be called
while cpu timer accounting is enabled).

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-03-08 13:57:52 +01:00
David Hildenbrand 4287f247f6 KVM: s390: abstract access to the VCPU cpu timer
We want to manually step the cpu timer in certain scenarios in the future.
Let's abstract any access to the cpu timer, so we can hide the complexity
internally.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-03-08 13:57:52 +01:00
David Hildenbrand 01a745ac8b KVM: s390: store cpu id in vcpu->cpu when scheduled in
By storing the cpu id, we have a way to verify if the current cpu is
owning a VCPU.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-03-08 13:57:51 +01:00
Alexander Yarygin dce382b6c7 KVM: s390: Add diag "watchdog functions" to trace event decoding
DIAG 0x288 may occur now. Let's add its code to the diag table in
sie.h.

Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-03-08 13:57:51 +01:00
Kostenzer Felix 8e2a7f5b9a x86/nmi: Mark 'ignore_nmis' as __read_mostly
ignore_nmis is used in two distinct places:

 1. modified through {stop,restart}_nmi by alternative_instructions
 2. read by do_nmi to determine if default_do_nmi should be called or not

thus the access pattern conforms to __read_mostly and do_nmi() is a fastpath.

Signed-off-by: Kostenzer Felix <fkostenzer@live.at>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-08 12:48:19 +01:00