Commit Graph

48588 Commits

Author SHA1 Message Date
majd@mellanox.com e2013b212f net/mlx5_core: Add RQ and SQ event handling
RQ/SQ will be used to implement IB verbs QPs, so the IB QP affiliated
events are affiliated also with SQs and RQs.

Since SQ, RQ and QP resource numbers do not share the same name
space, a queue type field was added to the event data to specify
the SW object that the event is affiliated with.

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-21 12:01:09 -05:00
majd@mellanox.com 8d7f9ecb37 net/mlx5_core: Export transport objects
To be used by mlx5_ib in the following patches for implementing
RAW PACKET QP.

Add mlx5_core_ prefix to alloc and delloc transport_domain since
they are exposed now.

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-21 12:01:08 -05:00
Linus Torvalds 30f05309bd More power management and ACPI updates for v4.5-rc1
- Modify the driver core and the USB subsystem to allow USB devices
    to stay suspended over system suspend/resume cycles if they have
    been runtime-suspended already beforehand and fix some bugs on
    top of these changes (Tomeu Vizoso, Rafael Wysocki).
 
  - Update ACPICA to upstream revision 20160108, including updates
    of the ACPICA's copyright notices, a code fixup resulting from
    a regression fix that was necessary in the upstream code only
    (the regression fixed by it has never been present in Linux)
    and a compiler warning fix (Bob Moore, Lv Zheng).
 
  - Fix a recent regression in the cpuidle menu governor that broke
    it on practically all architectures other than x86 and make a
    couple of optimizations on top of that fix (Rafael Wysocki).
 
  - Clean up the selection of cpuidle governors depending on whether
    or not the kernel is configured for tickless systems (Jean Delvare).
 
  - Revert a recent commit that introduced a regression in the ACPI
    backlight driver, address the problem it attempted to fix in a
    different way and revert one more cosmetic change depending on
    the problematic commit (Hans de Goede).
 
  - Add two more ACPI backlight quirks (Hans de Goede).
 
  - Fix a few minor problems in the core devfreq code, clean it up
    a bit and update the MAINTAINERS information related to it
    (Chanwoo Choi, MyungJoo Ham).
 
  - Improve an error message in the ACPI fan driver (Andy Lutomirski).
 
  - Fix a recent build regression in the cpupower tool (Shreyas Prabhu).
 
 /
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABCAAGBQJWoCQ4AAoJEILEb/54YlRxLscQALEFVKSRnNaco72OqqRZs9Bu
 1RI6TgHTpZxR+Ef0+QWqE1QMnDwfImGhKDbSRm/t3S2sMYYZbAOL8cu4y6GmkBv4
 bOon/f9WEoPlQCFoo/6U4u8H45rNT5W9zX5+Bva8x+4Wu3n2J1QdvirnS5JHeHe1
 o6tGLaHuZXSwX8SLnCk8gJYK1VhATxbubJtpcVtvlnAhO11qUAwsscCrkUmB60i7
 5hLyrZb06hoa/hZVcIefGFuSd9qPhzDMQE2M20EohQ7UVkNJQdY9QNHMqCk2P42T
 nMWCNSwGnwfiO1p9ByXqunOFBCmyL7P+KV/DHsz6TFCVjz+jeG53Kqey9SkSJ/2W
 iaAE80K9MfOMvg8j7rib6fTn5uXBwRfqdeUDF/Hr64QqJoRn3R2LX4HmZe4L8ufb
 zA1rece67o8FD+7p7GkNbT3rPV/kA62tn/moFk446X5N+b261Kz90t1DVci8kRVf
 k+1gcvEdqO0GPpEHoirfXrBvQFixqkXakKj4r2aAob/DldQeLX7CkOUuRRJ1ykec
 bxwI9R0v8MlVe5rDxg+rPB0I9EFxRDmxqxpU5j0MRWxKnMRzLvBtHuk8YNVS/eU1
 xwyJOGcwF6yI0PaCFggPqmhebSrWLE7wJxaK+3bC+yiDTvHYPjB+4MfQrmkRAwwM
 azgb+ZgXDYx5wXeb8EjB
 =bKJ9
 -----END PGP SIGNATURE-----

Merge tag 'pm+acpi-4.5-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull more power management and ACPI updates from Rafael Wysocki:
 "This includes fixes on top of the previous batch of PM+ACPI updates
  and some new material as well.

  From the new material perspective the most significant are the driver
  core changes that should allow USB devices to stay suspended over
  system suspend/resume cycles if they have been runtime-suspended
  already beforehand.  Apart from that, ACPICA is updated to upstream
  revision 20160108 (cosmetic mostly, but including one fixup on top of
  the previous ACPICA update) and there are some devfreq updates the
  didn't make it before (due to timing).

  A few recent regressions are fixed, most importantly in the cpuidle
  menu governor and in the ACPI backlight driver and some x86 platform
  drivers depending on it.

  Some more bugs are fixed and cleanups are made on top of that.

  Specifics:

   - Modify the driver core and the USB subsystem to allow USB devices
     to stay suspended over system suspend/resume cycles if they have
     been runtime-suspended already beforehand and fix some bugs on top
     of these changes (Tomeu Vizoso, Rafael Wysocki).

   - Update ACPICA to upstream revision 20160108, including updates of
     the ACPICA's copyright notices, a code fixup resulting from a
     regression fix that was necessary in the upstream code only (the
     regression fixed by it has never been present in Linux) and a
     compiler warning fix (Bob Moore, Lv Zheng).

   - Fix a recent regression in the cpuidle menu governor that broke it
     on practically all architectures other than x86 and make a couple
     of optimizations on top of that fix (Rafael Wysocki).

   - Clean up the selection of cpuidle governors depending on whether or
     not the kernel is configured for tickless systems (Jean Delvare).

   - Revert a recent commit that introduced a regression in the ACPI
     backlight driver, address the problem it attempted to fix in a
     different way and revert one more cosmetic change depending on the
     problematic commit (Hans de Goede).

   - Add two more ACPI backlight quirks (Hans de Goede).

   - Fix a few minor problems in the core devfreq code, clean it up a
     bit and update the MAINTAINERS information related to it (Chanwoo
     Choi, MyungJoo Ham).

   - Improve an error message in the ACPI fan driver (Andy Lutomirski).

   - Fix a recent build regression in the cpupower tool (Shreyas
     Prabhu)"

* tag 'pm+acpi-4.5-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (32 commits)
  cpuidle: menu: Avoid pointless checks in menu_select()
  sched / idle: Drop default_idle_call() fallback from call_cpuidle()
  cpupower: Fix build error in cpufreq-info
  cpuidle: Don't enable all governors by default
  cpuidle: Default to ladder governor on ticking systems
  time: nohz: Expose tick_nohz_enabled
  ACPICA: Update version to 20160108
  ACPICA: Silence a -Wbad-function-cast warning when acpi_uintptr_t is 'uintptr_t'
  ACPICA: Additional 2016 copyright changes
  ACPICA: Reduce regression fix divergence from upstream ACPICA
  ACPI / video: Add disable_backlight_sysfs_if quirk for the Toshiba Satellite R830
  ACPI / video: Revert "thinkpad_acpi: Use acpi_video_handles_brightness_key_presses()"
  ACPI / video: Document acpi_video_handles_brightness_key_presses() a bit
  ACPI / video: Fix using an uninitialized mutex / list_head in acpi_video_handles_brightness_key_presses()
  ACPI / video: Revert "ACPI / video: driver must be registered before checking for keypresses"
  ACPI / fan: Improve acpi_device_update_power error message
  ACPI / video: Add disable_backlight_sysfs_if quirk for the Toshiba Portege R700
  cpuidle: menu: Fix menu_select() for CPUIDLE_DRIVER_STATE_START == 0
  MAINTAINERS: Add devfreq-event entry
  MAINTAINERS: Add missing git repository and directory for devfreq
  ...
2016-01-20 19:06:49 -08:00
Linus Torvalds 3549d82279 SH Drivers Updates for v4.5
Clean up the clock API on legacy SH to make it more similar to the Common
 Clock Framework.  This will avoid different behaviour in drivers shared
 between legacy and CCF-based platforms (e.g. SCIF).
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWoEPRAAoJENfPZGlqN0++WesP/1/7IwU5qMG/qWrtUwr7jpov
 UZryYyLElvVg9nUG+alFPPB0nc4c7xv+iUjB2R49rZlDNjm8Ehx3bgG6WF4tRzSa
 5Hh1GpTWPGptFaobaExC3x6ACgwd06bJ0TpezgNGtX8YyKk6g1ttUy91jVWpKiD2
 WL80rbN0wR04NG4enzVEUZWxmkDW5Rv9z0s9a8bm/tNDBiMxLLD7XR0Bh+0qws0t
 A3pU+IJxVB1pKkpq0XgZbj0/al1kVDQ294ovJds0vwAGymocbFfOWZHoAltlwxpc
 93xFuF+MVxflmnYE1BN2oHjRvXukBGkWmXBNyRWGWxNqqLJsTm8xHl8u/J8gaSrC
 YHfmq5tnFUgVzbPW2HGPwLjn2Ch2GBgipS1QY2UvcZZ1GI3XcaKvYrx3VSr35ldX
 d8RBH1yRZo3g6UU+fbDUJbJYiGN37yYmvPZ7k8W1pRbZ8z8u3yB/bB8DlEGjndhu
 E4Hs4UfhyqOvJuoqdWQdWitHkx2Rga7p/SwRGWYYQLN8hfVqqW0bCkGSt7muQGI9
 96UrAnuZmcCR3b/YkUQnje7J2K/roYBsml9Yl5Qu9UEKIr8hINrqHWR7WpPvufZC
 Z8a1uIQLGhmy7/PginvrGhIryDZi8G8N8oNoIbHZxAX44sElMgDihOtRRWEazfwV
 iGtJXNfTKL4s/XhJN/Ck
 =aezU
 -----END PGP SIGNATURE-----

Merge tag 'renesas-sh-drivers-for-v4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas

Pull SH driver updates from Simon Horman:
 "Clean up the clock API on legacy SH to make it more similar to the
  Common Clock Framework.  This will avoid different behaviour in
  drivers shared between legacy and CCF-based platforms (e.g.  SCIF)"

* tag 'renesas-sh-drivers-for-v4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
  drivers: sh: clk: Avoid crashes when passing NULL clocks
  drivers: sh: clk: Remove obsolete and unused clk_round_parent()
2016-01-20 19:00:46 -08:00
Linus Torvalds 9638685e32 ARM: SoC driver updates for v4.5
Driver updates for ARM SoCs. Some for SoC-family code under drivers/soc,
 but also some other driver updates that don't belong anywhere else. We also
 bring in the drivers/reset code through arm-soc.
 
 Some of the larger updates:
 
 - Qualcomm support for SMEM, SMSM, SMP2P. All used to communicate with other
   parts of the chip/board on these platforms, all proprietary protocols that
   don't fit into other subsystems and live in drivers/soc for now.
 
 - System bus driver for UniPhier
 - Driver for the TI Wakeup M3 IPC device
 
 - Power management for Raspberry PI
 
 + Again a bunch of other smaller updates and patches.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWnsLfAAoJEIwa5zzehBx3KVcQAJS90XyeheS3LX+SUMsn5qAJ
 Q+NLzfYAAHZPafruOTcBATs9xUvuJM4KsdqXIWiA4gJig/ZFpF5dpDbrnhpdJFEK
 0ULSfZd9J4B4m7aBo6L8lVBEbd4i++updHpIYnmPx3tkLL0Win/VM8KlMpdOzMQN
 DncNFVIRJ80OxepdHgBu6yY7gv3Z8XhRrNukeG8EZTvHw5bJQHfjMOeAQNQWUmWa
 HLyO6uqYkca38XMJOhPhqt0mOpwh/X6ByUueIrIMpOksJcRnklrkjDnBuvp9dQ+8
 fU1sOjjYzQPiA6Oa3o/2ruBKj2FymLJ0OzONJb3xPiBO5bJbc1LJwK00tuIRC3hv
 r9Jqoqwae/AolqYephfao/LvUcoyHBaZ6aSbGQYms0Wy2FCkkkocfY2i7QP6S6Mz
 qtOCHxX+yF921AEZ1xTOKgBvBU931WM1MQOe4s1QVllRlLFJhrGqDDwsNlPwRa+6
 OkXmPJSEhsHGFaCdA3iOEy+/W2dvMbVOaGa9csNDpPcSJhhpxc5A4CYRlFSwGyM1
 oL20g1T3ybDP+EkNcdpavWdQ55toWGmm48PunIZImVSdgNRZZQJw107yA/UzQTNz
 kOg27kqtjpM4xcpxmwGlIj4Zoy1M390ACm+n79Awa9agvjOvfV4QE1tLB4tiXyuy
 0Pxy8WWL/G5Si8bKWsfc
 =khZ5
 -----END PGP SIGNATURE-----

Merge tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC driver updates from Olof Johansson:
 "Driver updates for ARM SoCs.  Some for SoC-family code under
  drivers/soc, but also some other driver updates that don't belong
  anywhere else.  We also bring in the drivers/reset code through
  arm-soc.

  Some of the larger updates:

   - Qualcomm support for SMEM, SMSM, SMP2P.  All used to communicate
     with other parts of the chip/board on these platforms, all
     proprietary protocols that don't fit into other subsystems and live
     in drivers/soc for now.

   - System bus driver for UniPhier

   - Driver for the TI Wakeup M3 IPC device

   - Power management for Raspberry PI

  + Again a bunch of other smaller updates and patches"

* tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (38 commits)
  bus: uniphier: allow only built-in driver
  ARM: bcm2835: clarify RASPBERRYPI_FIRMWARE dependency
  MAINTAINERS: Drop Kumar Gala from QCOM
  bus: uniphier-system-bus: add UniPhier System Bus driver
  ARM: bcm2835: add rpi power domain driver
  dt-bindings: add rpi power domain driver bindings
  ARM: bcm2835: Define two new packets from the latest firmware.
  drivers/soc: make mediatek/mtk-scpsys.c explicitly non-modular
  soc: mediatek: SCPSYS: Add regulator support
  MAINTAINERS: Change QCOM entries
  soc: qcom: smd-rpm: Add existing platform support
  memory/tegra: Add number of TLB lines for Tegra124
  reset: hi6220: fix modular build
  soc: qcom: Introduce WCNSS_CTRL SMD client
  ARM: qcom: select ARM_CPU_SUSPEND for power management
  MAINTAINERS: Add rules for Qualcomm dts files
  soc: qcom: enable smsm/smp2p modular build
  serial: msm_serial: Make config tristate
  soc: qcom: smp2p: Qualcomm Shared Memory Point to Point
  soc: qcom: smsm: Add driver for Qualcomm SMSM
  ...
2016-01-20 18:42:30 -08:00
Linus Torvalds 1305eda751 ARM: SoC platform updates for v4.5
Updates for new platform support:
 
 - New platform: Tango4 from Sigma Designs.
 - Broadcom BCM2836 (Raspberry Pi 2 SoC)
 - Enable cpufreq on Freescale i.MX7D
 - Rockchip: SMP support for rk3036, general support for rk3228
 - SMP support on Broadcom Kona and NSP
 - Cleanups for OMAP removing legacy IOMMU data
 
 + a bunch of misc fixes and tweaks for various platforms.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWnrvIAAoJEIwa5zzehBx3j0oP/i1WKGraSHS/w29ms4vfisFW
 bteUAsgh9aG8GCp/kigjmdMbWMdCRQfXp4D+o+SSr2DcM0q2iZzXV2WTeQ6YPAJM
 Fn431nuPn37ULd9PXdEhgiTDfDmVpvZrHeIQpDxC/TccRj4hkbxkMn6+s5WGkJLR
 DAr3h2mjKgulKZ4Wlj30rOxTDO9kVqTTIxjZXhbmMxepgMihSpvRGMXbKjxVHDSe
 NdCoIlfjhu6trr9d39b3POQVDCtrr82WvgrlfrhZZW6YXq7dIb3hhM1RldHCM+a7
 mnNDMoVDCEjLM7WLkgoOzsX4j+MyCbrsBY9DV9Zs7MtntWGcD+oH0M+ZLjrtlPef
 vebi+yBh/JH9fyrOiusSzbRaimy5o/NO/tYFgEAh/zk6GhwJwBYBvM1nexnfSHhv
 LMY+nSrgZYdN9qHurwjaBmAAmp8u0XZIz/MLQA4WpoqqoAcV2X3gtMcQMdL+nio5
 sUlE60/BcXjHqp/gt3MblxZy2WVzklCuqprOSYxhPwFDx7i9SKb9erSyzEkmzQDP
 UU+8Bory8egR537GaxM+izq9LQ3HLpWnyGPXSGx1s0vzNpZTsYC49eFK44vJRtLK
 QOlFG1VAupFP5S7kDneXJ8KoWcOwG3OccPfPTvyaC8SIezjXrKDgISV2mHOghYlk
 uNYO2fcyBoUX9jG3dUy8
 =FP4V
 -----END PGP SIGNATURE-----

Merge tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC platform updates from Olof Johansson:
 "Updates for new platform support:

   - New platform: Tango4 from Sigma Designs.
   - Broadcom BCM2836 (Raspberry Pi 2 SoC)
   - Enable cpufreq on Freescale i.MX7D
   - Rockchip: SMP support for rk3036, general support for rk3228
   - SMP support on Broadcom Kona and NSP
   - Cleanups for OMAP removing legacy IOMMU data

  + a bunch of misc fixes and tweaks for various platforms"

* tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (46 commits)
  ARM: tango: Fix UP build issues
  ARM: tango: pass ARM arch level for smc.S
  ARM: bcm2835: Add Kconfig support for bcm2836
  ARM: OMAP2+: Add support for dm814x and dra62x usb
  ARM: OMAP2+: Add mmc hwmod entries for dm814x
  ARM: OMAP2+: Update 81xx clock and power domains for default, active and sgx
  ARM: OMAP2+: Fix SoC detection for dra62x j5-eco
  ARM: tango4: Initial platform support
  ARM: bcm2835: Add a compat string for bcm2836 machine probe
  dt-bindings: Add root properties for Raspberry Pi 2
  ARM: imx: select SRC for i.MX7
  ARM: uniphier: select PINCTRL
  ARM: OMAP2+: Remove device creation for omap-pcm-audio
  ARM: OMAP1: Remove device creation for omap-pcm-audio
  ARM: rockchip: enable support for RK3228 SoCs
  ARM: rockchip: use const and __initconst for rk3036 smp_operations
  ARM: zynq: Select ARCH_HAS_RESET_CONTROLLER
  ARM: BCM: Add SMP support for Broadcom 4708
  ARM: BCM: Add SMP support for Broadcom NSP
  ARM: BCM: Clean up SMP support for Broadcom Kona
  ...
2016-01-20 18:10:05 -08:00
Linus Torvalds 6b5a12dbca ARM: SoC multiplatform code changes for v4.5
This branch is the culmination of 5 years of effort to bring the ARMv6
 and ARMv7 platforms together such that they can all be enabled and
 boot the same kernel. It has been a tremendous amount of cleanup and
 refactoring by a huge number of people, and creation of several new
 (and major) subsystems to better abstract out all the platform details
 in an appropriate manner.
 
 The bulk of this branch is a large patchset from Arnd that brings several
 of the more minor and older platforms we have closer to multiplatform
 support.  Among these are MMP, S3C64xx, Orion5x, mv78xx0 and realview
 Much of this is moving around header files from old mach directories,
 but there are also some cleanup patches of debug_ll (lowlevel debug
 per-platform options) and other parts.
 
 Linus Walleij also has some patchs to clean up the older ARM Realview
 platforms by finally introducing DT support, and Rob Herring has some
 for ARM Versatile which is now DT-only. Both of these platforms are
 now multiplatform.
 
 Finally, a couple of patches from Russell for Dove PMU, and a fix from
 Valentin Rothberg for Exynos ADC, which were rebased on top of the
 series to avoid conflicts.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIUAwUAVqAGcmCrR//JCVInAQLDog/4x9F0PHGmZhexGfFOpi2Od63Jjx55izRU
 zRXqRjjFjambOrZuOx8lEGDy/qzqKbsDU8D1P4IUugkDr2bLSXv+NTLZL1kNBIdm
 YOlJhw/BmzLYqauOHmBzGhtv1FDUk3rqbgTsP5tTWj5LpSkwjmqui3HBZpi+f3Rr
 YOn+NeQSARiw+51D0b106a9RFshQXRGgn5m3xFjLWhJqshb2z2Ew5cogX/zdwrrM
 ss1BFomxsvgk6S+snN6v7cEX2iXe3r89qNR5jEW5BgNpQGFsAUeXPr9zzH07L/Qq
 O7XLw9jt5MX/X5372zVHPb57WoflLbF9cFaaDUZV3eTqt3lC67BTxOtYIdC2i90k
 E5GYlsy88CRwT2EO+ok/6UTryph+hVv7JqHfbKfnISrbraMCK36DtDTpBIpZ9uYF
 rRB7ncJZUWBcyoe+qvitSl+2KV54iB1ez2RXsketxM98dDZsfB2M2ImFou1F/Pgg
 ALvpifPubi/uDe7xNUsSuaT6/3jAomBuNsxnkYJ3NeiH/+duZbOYGkzK/LlcjZyc
 UrA0IpLfwIFsBNzwfpZPZ1lkEu8Y1YZZ+Hv9k65q1wMuBDgrFI5zUeYrPZi4pN9T
 Yo1xP9FstVLDouJrpGZo12VIIxR1UBeGqfRI/BZ58LEF3PRq/g2OVFsdQia5gZKr
 ddiJKSL1Vw==
 =z1AW
 -----END PGP SIGNATURE-----

Merge tag 'armsoc-multiplatform' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC multiplatform code updates from Arnd Bergmann:
 "This branch is the culmination of 5 years of effort to bring the ARMv6
  and ARMv7 platforms together such that they can all be enabled and
  boot the same kernel.  It has been a tremendous amount of cleanup and
  refactoring by a huge number of people, and creation of several new
  (and major) subsystems to better abstract out all the platform details
  in an appropriate manner.

  The bulk of this branch is a large patchset from Arnd that brings
  several of the more minor and older platforms we have closer to
  multiplatform support.  Among these are MMP, S3C64xx, Orion5x, mv78xx0
  and realview Much of this is moving around header files from old mach
  directories, but there are also some cleanup patches of debug_ll
  (lowlevel debug per-platform options) and other parts.

  Linus Walleij also has some patchs to clean up the older ARM Realview
  platforms by finally introducing DT support, and Rob Herring has some
  for ARM Versatile which is now DT-only.  Both of these platforms are
  now multiplatform.

  Finally, a couple of patches from Russell for Dove PMU, and a fix from
  Valentin Rothberg for Exynos ADC, which were rebased on top of the
  series to avoid conflicts"

* tag 'armsoc-multiplatform' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (75 commits)
  ARM: realview: don't select SMP_ON_UP for UP builds
  ARM: s3c: simplify s3c_irqwake_{e,}intallow definition
  ARM: s3c64xx: fix pm-debug compilation
  iio: exynos-adc: fix irqf_oneshot.cocci warnings
  ARM: realview: build realview-dt SMP support only when used
  ARM: realview: select apropriate targets
  ARM: realview: clean up header files
  ARM: realview: make all header files local
  ARM: no longer make CPU targets visible separately
  ARM: integrator: use explicit core module options
  ARM: realview: enable multiplatform
  ARM: make default platform work for NOMMU
  ARM: debug-ll: move DEBUG_LL_UART_EFM32 to correct Kconfig location
  ARM: defconfig: use correct debug_ll settings
  ARM: versatile: convert to multi-platform
  ARM: versatile: merge mach code into a single file
  ARM: versatile: switch to DT only booting and remove legacy code
  ARM: versatile: add DT based PCI detection
  ARM: pxa: mark ezx structures as __maybe_unused
  ARM: pxa: mark raumfeld init functions as __maybe_unused
  ...
2016-01-20 18:03:56 -08:00
Linus Torvalds 5083c54264 ARM: SoC cleanups for v4.5
A smallish number of general cleanup commits this release cycle. Some
 of these are minor tweaks:
 
 - shmobile change of binding for their GIC (using arm,pl390 now)
 - ARCH_RENESAS introduction
 - Misc other renesas updates
 
 There's also a couple of treewide commits from Masahiro Yamada cleaning up
 const/__initconst for SMP operation structs and a switch to using "depends
 on" instead of if-constructs on most of the Kconfig platform targets.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWnrTOAAoJEIwa5zzehBx3khsQAKCH1YZfI6AcY0+4z2Kmn/vK
 7M86Fnmfa13ju+Iub5YuVsVFVAZ/TGTEVWoiUBMEb73IO0D5Jdl97BhJqV8Kv9Uy
 jz6PZGXDMJjjkts6N0ehYyu+8WbRvxtVbMNqVD/CO6CH1096UVnxgRz9uTmgJ9Z7
 81EDJH4QPPg/zZR/GNV/STf4FKjlcAAN7Vo+5+m12RIptZXXkbGSL3Y6NZAyFlVB
 JNK5jFcabhD08DsCKa4YzbuubiQO5qiXdoxX+u/OyQWjupxM9YE5gAcna9o4V3FY
 Y6KnCPcy0XHCkIYk26MITXghr7UFLq9LdD2+s5Ab4HP1XZukw4TUUKd3gwCjCY2h
 8RPIfvM7cJmiU3flY56A076Pg+Y35gfMQr+VDe2gMzWtrgCONWma+tHj2JSnNBkv
 4I615hysQ46rzgsbpnI/yOQoXTlQH0qsNPjOlsXuRIlC4feNaw2FPTtT4dqEIXjE
 l7/LeHuu3217/yp2w37OrtMue4C9UZCHVSnHiV6hJgjdS+9UNRWAXMUAqWApSOam
 5MPdZ/93+66gSrCdJG1KUhcw4F9MGawLAe4A41Eq7gWDbiJVDcZhRczK+Q79MNKo
 KvoLWAED+85qS5Z8k/1Ko9NNnl4c4kNR8fAKqD5qcEes7WGLIO1F2/RfC1zMmJfk
 kHYcwx4sBVPsBHDsAiPN
 =pQaP
 -----END PGP SIGNATURE-----

Merge tag 'armsoc-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC cleanups from Olof Johansson:
 "A smallish number of general cleanup commits this release cycle.  Some
  of these are minor tweaks:

   - shmobile change of binding for their GIC (using arm,pl390 now)
   - ARCH_RENESAS introduction
   - Misc other renesas updates

  There's also a couple of treewide commits from Masahiro Yamada
  cleaning up const/__initconst for SMP operation structs and a switch
  to using "depends on" instead of if-constructs on most of the Kconfig
  platform targets"

* tag 'armsoc-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  staging: board: armadillo800eva: Use "arm,pl390"
  staging: board: kzm9d: Use "arm,pl390"
  ARM: shmobile: r8a7778 dtsi: Use "arm,pl390" for GIC
  ARM: shmobile: emev2 dtsi: Use "arm,pl390" for GIC
  ARM: shmobile: r8a7740 dtsi: Use "arm,pl390" for GIC
  ARM: shmobile: r7s72100 dtsi: Use "arm,pl390" for GIC
  ARM: use "depends on" for SoC configs instead of "if" after prompt
  ARM/clocksource: use automatic DT probing for ux500 PRCMU
  ARM: use const and __initconst for smp_operations
  ARM: hisi: do not export smp_operations structures
  ARM: mvebu: remove unused mach/gpio.h
  ARM: shmobile: Remove legacy mach/irqs.h
  ARM: shmobile: Introduce ARCH_RENESAS
  MAINTAINERS: Remove link to oss.renesas.com which is closed
2016-01-20 17:55:20 -08:00
Linus Torvalds 0c582826a1 ARM: SoC non-urgent fixes for v4.5
As usual, we queue up a few fixes that don't seem urgent enough to go in
 through -rc.
 
  - MAINTAINERS updates to add a list for brcmstb and fix a typo
  - A handful of fixes for OMAP 81xx, a recently resurrected platform so these
    can't be considered real regressions and thus got queued.
  - A couple of other small fixes for scoop, sa1100 and davinci
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWnrNxAAoJEIwa5zzehBx3UwUQAJ0sX5ScDfOjUJFn8ridx4OY
 8mcOuR+ij/f2srNUlvBFV4h/j+vOauZ2zlNAjhQtgAJH0PV1pQyTkyLRGoDSwIzI
 BDl79IhlMKte3iZ460q3fsVovgF2OwmDVfx0WXC72X6oOv3xt+FPlE4B543Q/v/r
 WmL+PvLitaUA44/aK7QwlXg+IVQn2jDP64Uqkal5oVuQCjpeu4tcL99AbCLi4FiZ
 XA5wcofKCo/wDeRK0uLWgcrHklVF4QIcvOPRIqvPvFc8V6OldAb22hTOozNa5gSp
 QG3S9IFO4OHrcEH6M7XqLaTQv8KXEwzAqFlrciJUBX7rm0cUlzRKwctr1MmLsnKi
 UpbHAqUHTLeJaNjKQGEX+vVKBa8PjLN4E05AuSa6TOgDbNgxRu0XUDbLR/9/bgbL
 towtcUGTLBYc8itWx+0jhy4ABU0/kZiUbVOdWi5ex76BHI8lnXSvV8dD02iLHmfU
 yLskruj/RMubvZxdj/kb7f/Tqn0eyi9TUGS2lqGE3Twj2MUwUaZaPskMEYMQgSZf
 U3NWTPCDWvXsBnaO3yENBWUBGA54fy7YfB0gZc7W9Lg+vbnw6j+I4/GX1Eb2toA+
 EU9qn6nZ3kI6NAo/2snVsaGLFnAAnsslX6evnFea9mqPMEbrzd9Yg5rUMK7nfsvv
 5DpvnsMKnlL80YykQ4yC
 =W+es
 -----END PGP SIGNATURE-----

Merge tag 'armsoc-fixes-nc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull non-urgent ARM SoC fixes from Olof Johansson:
 "As usual, we queue up a few fixes that don't seem urgent enough to go
  in through -rc.

   - MAINTAINERS updates to add a list for brcmstb and fix a typo
   - A handful of fixes for OMAP 81xx, a recently resurrected platform
     so these can't be considered real regressions and thus got queued.
   - A couple of other small fixes for scoop, sa1100 and davinci"

* tag 'armsoc-fixes-nc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  ARM: OMAP2+: Fix randconfig build warning for dm814_pllss_data
  ARM: sa1100/simpad: Be sure to clamp return value
  ARM: scoop: Be sure to clamp return value
  ARM: davinci: fix a problematic usage of WARN()
  ARM: davinci: only select WT cache if cache is enabled
  ARM: OMAP2+: Remove useless check for legacy booting for dm814x
  ARM: OMAP2+: Enable GPIO for dm814x
  ARM: dts: Fix dm814x pinctrl address and mask
  ARM: dts: Fix dm8148 control modules ranges
  ARM: OMAP2+: Fix timer entries for dm814x
  ARM: dts: Fix some mux and divider clocks to get dm814x-evm booting
  ARM: OMAP2+: Add DPPLS clock manager for dm814x
  clk: ti: Add few dm814x clock aliases
  ARM: dts: Fix dm814x entries for pllss and prcm
  MAINTAINERS: gpio-brcmstb: Remove stray '>'
  MAINTAINERS: brcmstb: Include Broadcom internal mailing-list
2016-01-20 17:44:16 -08:00
Linus Torvalds 71e4634e00 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull SCSI target updates from Nicholas Bellinger:
 "The highlights this round include:

   - Introduce configfs support for unlocked configfs_depend_item()
     (krzysztof + andrezej)
   - Conversion of usb-gadget target driver to new function registration
     interface (andrzej + sebastian)
   - Enable qla2xxx FC target mode support for Extended Logins (himansu +
     giridhar)
   - Enable qla2xxx FC target mode support for Exchange Offload (himansu +
     giridhar)
   - Add qla2xxx FC target mode irq affinity notification + selective
     command queuing.  (quinn + himanshu)
   - Fix iscsi-target deadlock in se_node_acl configfs deletion (sagi +
     nab)
   - Convert se_node_acl configfs deletion + se_node_acl->queue_depth to
     proper se_session->sess_kref + target_get_session() usage.  (hch +
     sagi + nab)
   - Fix long-standing race between se_node_acl->acl_kref get and
     get_initiator_node_acl() lookup.  (hch + nab)
   - Fix target/user block-size handling, and make sure netlink reaches
     all network namespaces (sheng + andy)

  Note there is an outstanding bug-fix series for remote I_T nexus port
  TMR LUN_RESET has been posted and still being tested, and will likely
  become post -rc1 material at this point"

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (56 commits)
  scsi: qla2xxxx: avoid type mismatch in comparison
  target/user: Make sure netlink would reach all network namespaces
  target: Obtain se_node_acl->acl_kref during get_initiator_node_acl
  target: Convert ACL change queue_depth se_session reference usage
  iscsi-target: Fix potential dead-lock during node acl delete
  ib_srpt: Convert acl lookup to modern get_initiator_node_acl usage
  tcm_fc: Convert acl lookup to modern get_initiator_node_acl usage
  tcm_fc: Wait for command completion before freeing a session
  target: Fix a memory leak in target_dev_lba_map_store()
  target: Support aborting tasks with a 64-bit tag
  usb/gadget: Remove set-but-not-used variables
  target: Remove an unused variable
  target: Fix indentation in target_core_configfs.c
  target/user: Allow user to set block size before enabling device
  iser-target: Fix non negative ERR_PTR isert_device_get usage
  target/fcoe: Add tag support to tcm_fc
  qla2xxx: Check for online flag instead of active reset when transmitting responses
  qla2xxx: Set all queues to 4k
  qla2xxx: Disable ZIO at start time.
  qla2xxx: Move atioq to a different lock to reduce lock contention
  ...
2016-01-20 17:20:53 -08:00
Johannes Weiner b2807f07f4 mm: memcontrol: add "sock" to cgroup2 memory.stat
Provide statistics on how much of a cgroup's memory footprint is made up
of socket buffers from network connections owned by the group.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Vladimir Davydov <vdavydov@virtuozzo.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Vladimir Davydov 5ccc5abaaf mm: free swap cache aggressively if memcg swap is full
Swap cache pages are freed aggressively if swap is nearly full (>50%
currently), because otherwise we are likely to stop scanning anonymous
when we near the swap limit even if there is plenty of freeable swap cache
pages.  We should follow the same trend in case of memory cgroup, which
has its own swap limit.

Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Vladimir Davydov d8b38438a0 mm: vmscan: do not scan anon pages if memcg swap limit is hit
We don't scan anonymous memory if we ran out of swap, neither should we do
it in case memcg swap limit is hit, because swap out is impossible anyway.

Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Vladimir Davydov 6f2cb2f177 swap.h: move memcg related stuff to the end of the file
The following patches will add more functions to the memcg section of
include/linux/swap.h.  Some of them will need values defined below the
current location of the section.  So let's move the section to the end of
the file.  No functional changes intended.

Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Vladimir Davydov eb01aaab43 mm: memcontrol: replace mem_cgroup_lruvec_online with mem_cgroup_online
mem_cgroup_lruvec_online() takes lruvec, but it only needs memcg.  Since
get_scan_count(), which is the only user of this function, now possesses
pointer to memcg, let's pass memcg directly to mem_cgroup_online() instead
of picking it out of lruvec and rename the function accordingly.

Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Vladimir Davydov 37e8435119 mm: memcontrol: charge swap to cgroup2
This patchset introduces swap accounting to cgroup2.

This patch (of 7):

In the legacy hierarchy we charge memsw, which is dubious, because:

 - memsw.limit must be >= memory.limit, so it is impossible to limit
   swap usage less than memory usage. Taking into account the fact that
   the primary limiting mechanism in the unified hierarchy is
   memory.high while memory.limit is either left unset or set to a very
   large value, moving memsw.limit knob to the unified hierarchy would
   effectively make it impossible to limit swap usage according to the
   user preference.

 - memsw.usage != memory.usage + swap.usage, because a page occupying
   both swap entry and a swap cache page is charged only once to memsw
   counter. As a result, it is possible to effectively eat up to
   memory.limit of memory pages *and* memsw.limit of swap entries, which
   looks unexpected.

That said, we should provide a different swap limiting mechanism for
cgroup2.

This patch adds mem_cgroup->swap counter, which charges the actual number
of swap entries used by a cgroup.  It is only charged in the unified
hierarchy, while the legacy hierarchy memsw logic is left intact.

The swap usage can be monitored using new memory.swap.current file and
limited using memory.swap.max.

Note, to charge swap resource properly in the unified hierarchy, we have
to make swap_entry_free uncharge swap only when ->usage reaches zero, not
just ->count, i.e.  when all references to a swap entry, including the one
taken by swap cache, are gone.  This is necessary, because otherwise
swap-in could result in uncharging swap even if the page is still in swap
cache and hence still occupies a swap entry.  At the same time, this
shouldn't break memsw counter logic, where a page is never charged twice
for using both memory and swap, because in case of legacy hierarchy we
uncharge swap on commit (see mem_cgroup_commit_charge).

Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Johannes Weiner 0b8f73e104 mm: memcontrol: clean up alloc, online, offline, free functions
The creation and teardown of struct mem_cgroup is fairly messy and
that has attracted mistakes and subtle bugs before.

The main cause for this is that there is no clear model about what
needs to happen when, and that attracts more chaos. So create one:

1. mem_cgroup_alloc() should allocate struct mem_cgroup and its
   auxiliary members and initialize work items, locks etc. so that the
   object it returns is fully initialized and in a neutral state.

2. mem_cgroup_css_alloc() will use mem_cgroup_alloc() to obtain a new
   memcg object and configure it and the system according to the role
   of the new memory-controlled cgroup in the hierarchy.

3. mem_cgroup_css_online() is no longer needed to synchronize with
   iterators, but it verifies css->id which isn't available earlier.

4. mem_cgroup_css_offline() implements stuff that needs to happen upon
   the user-visible destruction of a cgroup, which includes stopping
   all user interfacing as well as releasing certain structures when
   continued memory consumption would be unexpected at that point.

5. mem_cgroup_css_free() prepares the system and the memcg object for
   the object's disappearance, neutralizes its state, and then gives
   it back to mem_cgroup_free().

6. mem_cgroup_free() releases struct mem_cgroup and auxiliary memory.

[arnd@arndb.de: fix SLOB build regression]
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Johannes Weiner 0db1529817 mm: memcontrol: flatten struct cg_proto
There are no more external users of struct cg_proto, flatten the
structure into struct mem_cgroup.

Since using those struct members doesn't stand out as much anymore,
add cgroup2 static branches to make it clearer which code is legacy.

Suggested-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Johannes Weiner d886f4e483 mm: memcontrol: rein in the CONFIG space madness
What CONFIG_INET and CONFIG_LEGACY_KMEM guard inside the memory
controller code is insignificant, having these conditionals is not
worth the complication and fragility that comes with them.

[akpm@linux-foundation.org: rework mem_cgroup_css_free() statement ordering]
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Acked-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Johannes Weiner 489c2a20a4 mm: memcontrol: introduce CONFIG_MEMCG_LEGACY_KMEM
Let the user know that CONFIG_MEMCG_KMEM does not apply to the cgroup2
interface. This also makes legacy-only code sections stand out better.

[arnd@arndb.de: mm: memcontrol: only manage socket pressure for CONFIG_INET]
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Tejun Heo <tj@kernel.org>
Acked-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Johannes Weiner 127424c86b mm: memcontrol: move kmem accounting code to CONFIG_MEMCG
The cgroup2 memory controller will account important in-kernel memory
consumers per default.  Move all necessary components to CONFIG_MEMCG.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Johannes Weiner 567e9ab2e6 mm: memcontrol: give the kmem states more descriptive names
On any given memcg, the kmem accounting feature has three separate
states: not initialized, structures allocated, and actively accounting
slab memory.  These are represented through a combination of the
kmem_acct_activated and kmem_acct_active flags, which is confusing.

Convert to a kmem_state enum with the states NONE, ALLOCATED, and
ONLINE.  Then rename the functions to modify the state accordingly.
This follows the nomenclature of css object states more closely.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Tejun Heo <tj@kernel.org>
Acked-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Geliang Tang 8e99469ab0 dma-mapping: use offset_in_page macro
Use offset_in_page macro instead of (addr & ~PAGE_MASK).

Signed-off-by: Geliang Tang <geliangtang@163.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Christoph Hellwig 20d666e411 dma-mapping: remove <asm-generic/dma-coherent.h>
This wasn't an asm-generic header to start with, and can be merged into
dma-mapping.h trivially.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Helge Deller <deller@gmx.de>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Steven Miao <realmz6@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Christoph Hellwig e1c7e32453 dma-mapping: always provide the dma_map_ops based implementation
Move the generic implementation to <linux/dma-mapping.h> now that all
architectures support it and remove the HAVE_DMA_ATTR Kconfig symbol now
that everyone supports them.

[valentinrothberg@gmail.com: remove leftovers in Kconfig]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Helge Deller <deller@gmx.de>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Steven Miao <realmz6@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Valentin Rothberg <valentinrothberg@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Yaowei Bai 2954e440be ipc/shm.c: is_file_shm_hugepages() can be boolean
Make is_file_shm_hugepages() return bool to improve readability due to
this particular function only using either one or zero as its return
value.

No functional change.

Signed-off-by: Yaowei Bai <baiyaowei@cmss.chinamobile.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Bongkyu Kim 06af1c52c9 lz4: fix wrong compress buffer size for 64-bits
The current lz4 compress buffer is 16kb on 32-bits, 32kb on 64-bits
system.  But, lz4 needs only 16kb on both.  On 64-bits, this causes
wasted cpu cycles for additional memset during every compression.

In case of lz4hc, the current buffer size is (256kb + 8) on 32-bits,
(512kb + 16) on 64-bits.  But, lz4hc needs only (256kb + 2 * pointer) on
both.

This patch fixes these wrong compress buffer sizes for 64-bits.

Signed-off-by: Bongkyu Kim <bongkyu.kim@lge.com>
Cc: Chanho Min <chanho.min@lge.com>
Cc: Yann Collet <yann.collet.73@gmail.com>
Cc: Kyungsik Lee <kyungsik.lee@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Andrey Ryabinin c6d308534a UBSAN: run-time undefined behavior sanity checker
UBSAN uses compile-time instrumentation to catch undefined behavior
(UB).  Compiler inserts code that perform certain kinds of checks before
operations that could cause UB.  If check fails (i.e.  UB detected)
__ubsan_handle_* function called to print error message.

So the most of the work is done by compiler.  This patch just implements
ubsan handlers printing errors.

GCC has this capability since 4.9.x [1] (see -fsanitize=undefined
option and its suboptions).
However GCC 5.x has more checkers implemented [2].
Article [3] has a bit more details about UBSAN in the GCC.

[1] - https://gcc.gnu.org/onlinedocs/gcc-4.9.0/gcc/Debugging-Options.html
[2] - https://gcc.gnu.org/onlinedocs/gcc/Debugging-Options.html
[3] - http://developerblog.redhat.com/2014/10/16/gcc-undefined-behavior-sanitizer-ubsan/

Issues which UBSAN has found thus far are:

Found bugs:

 * out-of-bounds access - 97840cb67f ("netfilter: nfnetlink: fix
   insufficient validation in nfnetlink_bind")

undefined shifts:

 * d48458d4a7 ("jbd2: use a better hash function for the revoke
   table")

 * 10632008b9 ("clockevents: Prevent shift out of bounds")

 * 'x << -1' shift in ext4 -
   http://lkml.kernel.org/r/<5444EF21.8020501@samsung.com>

 * undefined rol32(0) -
   http://lkml.kernel.org/r/<1449198241-20654-1-git-send-email-sasha.levin@oracle.com>

 * undefined dirty_ratelimit calculation -
   http://lkml.kernel.org/r/<566594E2.3050306@odin.com>

 * undefined roundown_pow_of_two(0) -
   http://lkml.kernel.org/r/<1449156616-11474-1-git-send-email-sasha.levin@oracle.com>

 * [WONTFIX] undefined shift in __bpf_prog_run -
   http://lkml.kernel.org/r/<CACT4Y+ZxoR3UjLgcNdUm4fECLMx2VdtfrENMtRRCdgHB2n0bJA@mail.gmail.com>

   WONTFIX here because it should be fixed in bpf program, not in kernel.

signed overflows:

 * 32a8df4e0b ("sched: Fix odd values in effective_load()
   calculations")

 * mul overflow in ntp -
   http://lkml.kernel.org/r/<1449175608-1146-1-git-send-email-sasha.levin@oracle.com>

 * incorrect conversion into rtc_time in rtc_time64_to_tm() -
   http://lkml.kernel.org/r/<1449187944-11730-1-git-send-email-sasha.levin@oracle.com>

 * unvalidated timespec in io_getevents() -
   http://lkml.kernel.org/r/<CACT4Y+bBxVYLQ6LtOKrKtnLthqLHcw-BMp3aqP3mjdAvr9FULQ@mail.gmail.com>

 * [NOTABUG] signed overflow in ktime_add_safe() -
   http://lkml.kernel.org/r/<CACT4Y+aJ4muRnWxsUe1CMnA6P8nooO33kwG-c8YZg=0Xc8rJqw@mail.gmail.com>

[akpm@linux-foundation.org: fix unused local warning]
[akpm@linux-foundation.org: fix __int128 build woes]
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Yury Gribov <y.gribov@samsung.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Kostya Serebryany <kcc@google.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Davidlohr Bueso a460bece02 rbtree: use READ_ONCE in RB_EMPTY_ROOT
With commit d72da4a4d9 ("rbtree: Make lockless searches non-fatal") our
rbtrees provide weak guarantees that allows us to do lockless (and very
speculative) reads of the tree.  Such readers cannot see partial stores
on nodes, ie left/right as well as root.  As such, similar to the
WRITE_ONCE semantics when doing rotations, use READ_ONCE when checking
the root node in RB_EMPTY_ROOT.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Michel Lespinasse <walken@google.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Xunlei Pang 978e30c9b4 kexec: move some memembers and definitions within the scope of CONFIG_KEXEC_FILE
Move the stuff currently only used by the kexec file code within
CONFIG_KEXEC_FILE (and CONFIG_KEXEC_VERIFY_SIG).

Also move internal "struct kexec_sha_region" and "struct kexec_buf" into
"kexec_internal.h".

Signed-off-by: Xunlei Pang <xlpang@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Dave Young <dyoung@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Rasmus Villemoes 9425676a36 kernel/cpu.c: make set_cpu_* static inlines
Almost all callers of the set_cpu_* functions pass an explicit true or
false.  Making them static inline thus replaces the function calls with a
simple set_bit/clear_bit, saving some .text.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Rasmus Villemoes 5aec01b834 kernel/cpu.c: eliminate cpu_*_mask
Replace the variables cpu_possible_mask, cpu_online_mask, cpu_present_mask
and cpu_active_mask with macros expanding to expressions of the same type
and value, eliminating some indirection.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Rasmus Villemoes 4b804c85dc kernel/cpu.c: export __cpu_*_mask
Exporting the cpumasks __cpu_possible_mask and friends will allow us to
remove the extra indirection through the cpu_*_mask variables.  It will
also allow the set_cpu_* functions to become static inlines, which will
give a .text reduction.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Jann Horn caaee6234d ptrace: use fsuid, fsgid, effective creds for fs access checks
By checking the effective credentials instead of the real UID / permitted
capabilities, ensure that the calling process actually intended to use its
credentials.

To ensure that all ptrace checks use the correct caller credentials (e.g.
in case out-of-tree code or newly added code omits the PTRACE_MODE_*CREDS
flag), use two new flags and require one of them to be set.

The problem was that when a privileged task had temporarily dropped its
privileges, e.g.  by calling setreuid(0, user_uid), with the intent to
perform following syscalls with the credentials of a user, it still passed
ptrace access checks that the user would not be able to pass.

While an attacker should not be able to convince the privileged task to
perform a ptrace() syscall, this is a problem because the ptrace access
check is reused for things in procfs.

In particular, the following somewhat interesting procfs entries only rely
on ptrace access checks:

 /proc/$pid/stat - uses the check for determining whether pointers
     should be visible, useful for bypassing ASLR
 /proc/$pid/maps - also useful for bypassing ASLR
 /proc/$pid/cwd - useful for gaining access to restricted
     directories that contain files with lax permissions, e.g. in
     this scenario:
     lrwxrwxrwx root root /proc/13020/cwd -> /root/foobar
     drwx------ root root /root
     drwxr-xr-x root root /root/foobar
     -rw-r--r-- root root /root/foobar/secret

Therefore, on a system where a root-owned mode 6755 binary changes its
effective credentials as described and then dumps a user-specified file,
this could be used by an attacker to reveal the memory layout of root's
processes or reveal the contents of files he is not allowed to access
(through /proc/$pid/cwd).

[akpm@linux-foundation.org: fix warning]
Signed-off-by: Jann Horn <jann@thejh.net>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Morris <james.l.morris@oracle.com>
Cc: "Serge E. Hallyn" <serge.hallyn@ubuntu.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Adam Barth 243c2137cd include/linux/radix-tree.h: fix error in docs about locks
This text refers to the "first 7 functions", which was correct when
written but became incorrect when Johannes Weiner added another function
to the list in 139e561660 ("lib: radix_tree: tree node interface").

Change the text to correctly refer to the first 8 functions.

Signed-off-by: Adam Barth <aurorean@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Stephen Boyd a9aec5881b lib/iomap_copy.c: add __ioread32_copy()
Some drivers need to read data out of iomem areas 32-bits at a time.
Add an API to do this.

Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Cc: Bjorn Andersson <bjorn.andersson@sonymobile.com>
Cc: <zajec5@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Hauke Mehrtens <hauke@hauke-m.de>
Cc: Paul Walmsley <paul@pwsan.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Rafael J. Wysocki f11aef69b2 Merge branch 'pm-cpuidle'
* pm-cpuidle:
  cpuidle: menu: Avoid pointless checks in menu_select()
  sched / idle: Drop default_idle_call() fallback from call_cpuidle()
  cpuidle: Don't enable all governors by default
  cpuidle: Default to ladder governor on ticking systems
  time: nohz: Expose tick_nohz_enabled
  cpuidle: menu: Fix menu_select() for CPUIDLE_DRIVER_STATE_START == 0
2016-01-21 00:43:21 +01:00
Rafael J. Wysocki fa8bb45187 Merge branch 'pm-devfreq'
* pm-devfreq:
  MAINTAINERS: Add devfreq-event entry
  MAINTAINERS: Add missing git repository and directory for devfreq
  PM / devfreq: Do not show statistics if it's not ready.
  PM / devfreq: Modify the indentation of trans_stat sysfs for readability
  PM / devfreq: Set the freq_table of devfreq device
  PM / devfreq: Add show_one macro to delete the duplicate code
  PM / devfreq: event: Fix the error and warning from script/checkpatch.pl
  PM / devfreq: event: Remove the error log of devfreq_event_get_edev_by_phandle()
2016-01-21 00:43:13 +01:00
Rafael J. Wysocki 6efd3f8cde Merge branch 'pm-core'
* pm-core:
  driver core: Avoid NULL pointer dereferences in device_is_bound()
  platform: Do not detach from PM domains on shutdown
  USB / PM: Allow USB devices to remain runtime-suspended when sleeping
  PM / sleep: Go direct_complete if driver has no callbacks
  PM / Domains: add setter for dev.pm_domain
  device core: add device_is_bound()
2016-01-21 00:42:59 +01:00
Thierry Reding 386744425e swiotlb: Make linux/swiotlb.h standalone includible
This header file uses the enum dma_data_direction and struct page types
without explicitly including the corresponding header files. This makes
it rely on the includer to have included the proper headers before.

To fix this, include linux/dma-direction.h and forward-declare struct
page. The swiotlb_free() function is also annotated __init, therefore
requires linux/init.h to be included as well.

Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2016-01-20 17:29:52 -05:00
Bjorn Helgaas 9662e32c81 Merge branch 'pci/trivial' into next
* pci/trivial:
  PCI: shpchp: Constify hpc_ops structure
  PCI: Use kobj_to_dev() instead of open-coding it
  PCI: Use to_pci_dev() instead of open-coding it
  PCI: Fix all whitespace issues
  PCI/MSI: Fix typos in <linux/msi.h>
2016-01-20 11:48:25 -06:00
Bjorn Helgaas 904f664b58 Merge branches 'pci/iommu' and 'pci/misc' into next
* pci/iommu:
  PCI: Add function 1 DMA alias quirk for Lite-On/Plextor M6e/Marvell 88SS9183

* pci/misc:
  PCI: Limit config space size for Netronome NFP4000
  PCI: Add Netronome NFP4000 PF device ID
2016-01-20 11:47:54 -06:00
Willy Tarreau 759c01142a pipe: limit the per-user amount of pages allocated in pipes
On no-so-small systems, it is possible for a single process to cause an
OOM condition by filling large pipes with data that are never read. A
typical process filling 4000 pipes with 1 MB of data will use 4 GB of
memory. On small systems it may be tricky to set the pipe max size to
prevent this from happening.

This patch makes it possible to enforce a per-user soft limit above
which new pipes will be limited to a single page, effectively limiting
them to 4 kB each, as well as a hard limit above which no new pipes may
be created for this user. This has the effect of protecting the system
against memory abuse without hurting other users, and still allowing
pipes to work correctly though with less data at once.

The limit are controlled by two new sysctls : pipe-user-pages-soft, and
pipe-user-pages-hard. Both may be disabled by setting them to zero. The
default soft limit allows the default number of FDs per process (1024)
to create pipes of the default size (64kB), thus reaching a limit of 64MB
before starting to create only smaller pipes. With 256 processes limited
to 1024 FDs each, this results in 1024*64kB + (256*1024 - 1024) * 4kB =
1084 MB of memory allocated for a user. The hard limit is disabled by
default to avoid breaking existing applications that make intensive use
of pipes (eg: for splicing).

Reported-by: socketpair@gmail.com
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Mitigates: CVE-2013-4312 (Linux 2.0+)
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-01-19 19:25:21 -05:00
Linus Torvalds 7c24d9f3b2 Merge branch 'for-4.5/core' of git://git.kernel.dk/linux-block
Pull core block updates from Jens Axboe:
 "We don't have a lot of core changes this time around, it's mostly in
  drivers, which will come in a subsequent pull.

  The cores changes include:

   - blk-mq
        - Prep patch from Christoph, changing blk_mq_alloc_request() to
          take flags instead of just using gfp_t for sleep/nosleep.
        - Doc patch from me, clarifying the difference between legacy
          and blk-mq for timer usage.
        - Fixes from Raghavendra for memory-less numa nodes, and a reuse
          of CPU masks.

   - Cleanup from Geliang Tang, using offset_in_page() instead of open
     coding it.

   - From Ilya, rename request_queue slab to it reflects what it holds,
     and a fix for proper use of bdgrab/put.

   - A real fix for the split across stripe boundaries from Keith.  We
     yanked a broken version of this from 4.4-rc final, this one works.

   - From Mike Krinkin, emit a trace message when we split.

   - From Wei Tang, two small cleanups, not explicitly clearing memory
     that is already cleared"

* 'for-4.5/core' of git://git.kernel.dk/linux-block:
  block: use bd{grab,put}() instead of open-coding
  block: split bios to max possible length
  block: add call to split trace point
  blk-mq: Avoid memoryless numa node encoded in hctx numa_node
  blk-mq: Reuse hardware context cpumask for tags
  blk-mq: add a flags parameter to blk_mq_alloc_request
  Revert "blk-flush: Queue through IO scheduler when flush not required"
  block: clarify blk_add_timer() use case for blk-mq
  bio: use offset_in_page macro
  block: do not initialise statics to 0 or NULL
  block: do not initialise globals to 0 or NULL
  block: rename request_queue slab cache
2016-01-19 15:03:34 -08:00
Moni Shoua 3b5daf28ac IB/mlx4: Support modify_qp for RoCE v2
In order to support modify_qp for RoCE v2, we need to set
the gid_type in the QP context.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-19 15:35:01 -05:00
Moni Shoua 3f723f42d9 net/mlx4_core: Add support for RoCE v2 entropy
In RoCE v2 we need to choose a source UDP port, we do so by using
entropy over the source and dest QPNs.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-19 15:35:00 -05:00
Moni Shoua fca8300629 net/mlx4_core: Add support for configuring RoCE v2 UDP port
In order to support RoCE v2, the hardware needs to be configured
to classify certain UDP packets as RoCE v2 packets and pass it
through its RoCE pipeline. This patch enables configuring this
UDP port.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-19 15:35:00 -05:00
Moni Shoua 7e57b85c44 IB/mlx4: Add support for setting RoCEv2 gids in hardware
To tell hardware about a gid with type RoCEv2, software needs a new
modifier to the SET_PORT command: MLX4_SET_PORT_ROCE_ADDR. This can
replace the old method, MLX4_SET_PORT_GID_TABLE, for  RoCEv1 gids.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-19 15:35:00 -05:00
Moni Shoua d8ae914196 net/mlx4: Query RoCE support
Query the RoCE support from firmware using the appropriate firmware
commands. Downstream patches will read these capabilities and act
accordingly.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-19 15:35:00 -05:00
Christoph Hellwig 5fe1043da8 svc_rdma: use local_dma_lkey
We now alwasy have a per-PD local_dma_lkey available.  Make use of that
fact in svc_rdma and stop registering our own MR.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Acked-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-19 15:30:48 -05:00
Chuck Lever 5d252f90a8 svcrdma: Add class for RDMA backwards direction transport
To support the server-side of an NFSv4.1 backchannel on RDMA
connections, add a transport class that enables backward
direction messages on an existing forward channel connection.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Acked-by: Bruce Fields <bfields@fieldses.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-19 15:30:48 -05:00
Chuck Lever 03fe993153 svcrdma: Define maximum number of backchannel requests
Extra resources for handling backchannel requests have to be
pre-allocated when a transport instance is created. Set up
additional fields in svcxprt_rdma to track these resources.

The max_requests fields are elements of the RPC-over-RDMA
protocol, so they should be u32. To ensure that unsigned
arithmetic is used everywhere, some other fields in the
svcxprt_rdma struct are updated.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Acked-by: Bruce Fields <bfields@fieldses.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-19 15:30:48 -05:00
Chuck Lever ba986c96f9 svcrdma: Make map_xdr non-static
Pre-requisite to use map_xdr in the backchannel code.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Acked-by: Bruce Fields <bfields@fieldses.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-19 15:30:48 -05:00
Chuck Lever 39b09a1a12 svcrdma: Add gfp flags to svc_rdma_post_recv()
svc_rdma_post_recv() allocates pages for receive buffers on-demand.
It uses GFP_KERNEL so the allocator tries hard, and may sleep. But
I'm about to add a call to svc_rdma_post_recv() from a function
that may not sleep.

Since all svc_rdma_post_recv() call sites can tolerate its failure,
allow it to fail if the page allocator returns nothing. Longer term,
receive buffers, being a finite resource per-connection, should be
pre-allocated and re-used.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Acked-by: Bruce Fields <bfields@fieldses.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-19 15:30:48 -05:00
Chuck Lever 71810ef327 svcrdma: Remove unused req_map and ctxt kmem_caches
Clean up.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Acked-by: Bruce Fields <bfields@fieldses.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-19 15:30:48 -05:00
Chuck Lever 2fe81b239d svcrdma: Improve allocation of struct svc_rdma_req_map
To ensure this allocation cannot fail and will not sleep,
pre-allocate the req_map structures per-connection.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Acked-by: Bruce Fields <bfields@fieldses.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-19 15:30:48 -05:00
Chuck Lever cc886c9ff1 svcrdma: Improve allocation of struct svc_rdma_op_ctxt
When the maximum payload size of NFS READ and WRITE was increased
by commit cc9a903d91 ("svcrdma: Change maximum server payload back
to RPCSVC_MAXPAYLOAD"), the size of struct svc_rdma_op_ctxt
increased to over 6KB (on x86_64). That makes allocating one of
these from a kmem_cache more likely to fail in situations when
system memory is exhausted.

Since I'm about to add a caller where this allocation must always
work _and_ it cannot sleep, pre-allocate ctxts for each connection.

Another motivation for this change is that NFSv4.x servers are
required by specification not to drop NFS requests. Pre-allocating
memory resources reduces the likelihood of a drop.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Acked-by: Bruce Fields <bfields@fieldses.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-19 15:30:48 -05:00
Moni Shoua f25bf1977f net/mlx4: Remove unused macro
The macro mlx4_foreach_non_ib_transport_port() is not used anywhere. Remove it.

Fixes: aa9a2d51a3 ("mlx4: Activate RoCE/SRIOV")
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-19 15:24:53 -05:00
Linus Torvalds a200dcb346 virtio: barrier rework+fixes
This adds a new kind of barrier, and reworks virtio and xen
 to use it.
 Plus some fixes here and there.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJWlU2kAAoJECgfDbjSjVRpZ6IH/Ra19ecG8sCQo9zskr4zo22Z
 DZXC3u0sJDBYjjBAiw3IY1FKh7wx2Fr1RhUOj1bteBgcFCMCV1zInP5ITiCyzd1H
 YYh1w9C2tZaj2T4t9L4hIrAdtIF8fGS+oI2IojXPjOuDLEt6pfFBEjHp/sfl3UJq
 ZmZvw4OXviSNej7jBw8Xni3Uv18yfmLGXvMdkvMSPC1/XL29voGDqTVwhqJwxLVz
 k/ZLcKFOzIs9N7Nja0Jl1EiZtC2Y9cpItqweicNAzszlpkSL44vQxmCSefB+WyQ4
 gt0O3+AxYkLfrxzCBhUA4IpRex3/XPW1b+1e/V1XjfR2n/FlyLe+AIa8uPJElFc=
 =ukaV
 -----END PGP SIGNATURE-----

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio barrier rework+fixes from Michael Tsirkin:
 "This adds a new kind of barrier, and reworks virtio and xen to use it.

  Plus some fixes here and there"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (44 commits)
  checkpatch: add virt barriers
  checkpatch: check for __smp outside barrier.h
  checkpatch.pl: add missing memory barriers
  virtio: make find_vqs() checkpatch.pl-friendly
  virtio_balloon: fix race between migration and ballooning
  virtio_balloon: fix race by fill and leak
  s390: more efficient smp barriers
  s390: use generic memory barriers
  xen/events: use virt_xxx barriers
  xen/io: use virt_xxx barriers
  xenbus: use virt_xxx barriers
  virtio_ring: use virt_store_mb
  sh: move xchg_cmpxchg to a header by itself
  sh: support 1 and 2 byte xchg
  virtio_ring: update weak barriers to use virt_xxx
  Revert "virtio_ring: Update weak barriers to use dma_wmb/rmb"
  asm-generic: implement virt_xxx memory barriers
  x86: define __smp_xxx
  xtensa: define __smp_xxx
  tile: define __smp_xxx
  ...
2016-01-18 16:44:24 -08:00
Linus Torvalds d05d82f711 Merge git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile
Pull arch/tile updates from Chris Metcalf:
 "This is a grab bag of changes that includes some NOHZ and
  context-tracking related changes, some debugging improvements,
  JUMP_LABEL support, and some fixes for tilepro allmodconfig support.

  We also remove the now-unused node_has_online_mem() definitions both
  for tile's asm/topology.h as well as in linux/topology.h itself"

* git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
  numa: remove stale node_has_online_mem() define
  arch/tile: move user_exit() to early kernel entry sequence
  tile: fix bug in setting PT_FLAGS_DISABLE_IRQ on kernel entry
  tile: fix tilepro casts for readl, writel, etc
  tile: fix a -Wframe-larger-than warning
  tile: include the syscall number in the backtrace
  MAINTAINERS: add git URL for tile
  arch/tile: adopt prepare_exit_to_usermode() model from x86
  tile/jump_label: add jump label support for TILE-Gx
  tile: define a macro ktext_writable_addr to get writable kernel text address
2016-01-18 12:57:18 -08:00
Linus Torvalds d90f351a9b Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/egtvedt/linux-avr32
Pull AVR32 updates from Hans-Christian Noren Egtvedt.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/egtvedt/linux-avr32:
  mmc: atmel: get rid of struct mci_dma_data
  mmc: atmel-mci: restore dma on AVR32
  avr32: wire up missing syscalls
  avr32: wire up accept4 syscall
2016-01-18 12:50:55 -08:00
Linus Torvalds 48f58ba9cb Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull more networking fixes from David Miller:

 1) Fix brcmfmac build with older gcc, from Arend van Spriel.

 2) IRQ values unintentionally truncated to u8 in mlx5 driver, from
    Doron Tsur.

 3) Fix build warnings wrt tcp cgroup changes, from Geert Uytterhoeven.

 4) Limit deep recursion in ovs stack, from Hannes Frederic Sowa.

 5) at803x phy driver bug fixes from, Martin Blumenstingl.

 6) Fix TSO handling in hns driver, from Daode Huang

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (22 commits)
  ovs: limit ovs recursions in ovs_execute_actions to not corrupt stack
  team: Replace rcu_read_lock with a mutex in team_vlan_rx_kill_vid
  net: hns: bug fix about hisilicon TSO BD mode
  brcmfmac: fix BRCMF_FW_NVRAM_DEF macro for older gcc compilers
  net: phy: at803x: Add the interrupt register bit definitions
  net: phy: at803x: Clean up duplicate register definitions
  net: phy: at803x: Allow specifying the RGMII RX clock delay via phy mode
  net: phy: at803x: Don't set gbit features for the AR8030 phy
  arm64: bpf: add extra pass to handle faulty codegen
  arm64: insn: remove BUG_ON from codegen
  sctp: the temp asoc's transports should not be hashed/unhashed
  net/mlx5_core: Fix trimming down IRQ number
  tcp_memcontrol: Forward declare cgroup_subsys and mem_cgroup stucts
  batman-adv: Drop immediate orig_node free function
  batman-adv: Drop immediate batadv_hard_iface free function
  batman-adv: Drop immediate neigh_ifinfo free function
  batman-adv: Drop immediate batadv_hardif_neigh_node free function
  batman-adv: Drop immediate batadv_neigh_node free function
  batman-adv: Drop immediate batadv_orig_ifinfo free function
  batman-adv: Avoid recursive call_rcu for batadv_nc_node
  ...
2016-01-18 12:35:14 -08:00
Linus Torvalds c38dec7166 RTC for 4.5
Core:
  - fix module reference count in rtc-proc
  - Replace simple_strtoul by kstrtoul
 
 New driver:
  - Epson RX8010SJ
 
 Subsystem wide cleanups:
  - use %ph for short hex dumps
  - constify *_chip_ops structures
 
 Drivers:
  - abx80x: Microcrystal rv1805 support, alarm support
  - cmos: prevent kernel warning on IRQ flags mismatch
  - s5m: various cleanups
  - rv8803: rx8900 compatibility, small error path fix
  - sunxi: various cleanups
  - lpc32xx: remove irq > NR_IRQS check from probe()
  - imxdi: fix spelling mistake in warning message
  - ds1685: don't try to micromanage sysfs output size
  - da9063: avoid writing undefined data to rtc
  - gemini: Remove unnecessary platform_set_drvdata()
  - efi: add efi_procfs in efi_rtc_ops
  - pcf8523: refuse to write dates later than 2099
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABCAAGBQJWnRQIAAoJEKbNnwlvZCyzb8kP/26N+iQNxQc1SjR4n/ZpJlBp
 aGcqyloPWcrpHPaAlQ7slo/TD0LLYdpYjCshiPnJjm4OLgA5+mba/ca5BgMAKOJM
 t0oXtLRVM7//TB+wLkpdHgM/OrwsTmYFMkB15G6YymsSB56Dc5YUxg4fIqCGcpIX
 j78AGyfwFinx7WzxMfGQSQIIz3Itym/s4Mc5HAjFikXuOPBQbd6RRZfj80gB/u4f
 k3XUsU8bOvQxqUa0SzgQTww90K11h4L5JYIjNh8dBvMbYcVX+xv6OFWfQYcVqrMa
 YYiAlznlOJ7O4puy7LSAFzc2Vm42NxqM9WHQ4VNcpRTiQ4nYgcreyPc5DhpY0Tc5
 RdA2LmILEG6k+6b2d8yyGBEWOoXYVZyizelyCMkqJRixDXN4jow4Xc5qXDwP/bsX
 qQSQK7XFuvDTFFZ4hUqF2W/OWMWo2wIiaYaWY6EAQmh4iq3eFBQ+2yolSJn+1PEY
 mJqUyw7E7PNgPagbmfIVJ6g6Q6ZQ/J4G67feZTOFaEagbdFM5eP79HL6ZbtNYZce
 4eqcmILvlUtmF0pzm0Ec8fhTIWL55GpPVuge6QkaIPFYq0lId8xiAuOX2L6G4oRf
 iAI6URijGwPWIHrCUx5dLGQgkxgauPdW5MpnuvSnxGWezGT2tCzLzV4ZHLyzayiX
 L8txbwQN0P1ykTJu1R0/
 =xQhZ
 -----END PGP SIGNATURE-----

Merge tag 'rtc-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux

Pull RTC updates from Alexandre Belloni:
 "Core:
   - fix module reference count in rtc-proc
   - Replace simple_strtoul by kstrtoul

  New driver:
   - Epson RX8010SJ

  Subsystem wide cleanups:
   - use %ph for short hex dumps
   - constify *_chip_ops structures

  Drivers:
   - abx80x: Microcrystal rv1805 support, alarm support
   - cmos: prevent kernel warning on IRQ flags mismatch
   - s5m: various cleanups
   - rv8803: rx8900 compatibility, small error path fix
   - sunxi: various cleanups
   - lpc32xx: remove irq > NR_IRQS check from probe()
   - imxdi: fix spelling mistake in warning message
   - ds1685: don't try to micromanage sysfs output size
   - da9063: avoid writing undefined data to rtc
   - gemini: Remove unnecessary platform_set_drvdata()
   - efi: add efi_procfs in efi_rtc_ops
   - pcf8523: refuse to write dates later than 2099"

* tag 'rtc-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: (24 commits)
  rtc: cmos: prevent kernel warning on IRQ flags mismatch
  rtc: rtc-ds2404: constify ds2404_chip_ops structures
  rtc: s5m: Make register configuration per S2MPS device to remove exceptions
  rtc: s5m: Add separate field for storing auto-cleared mask in register config
  rtc: s5m: Cleanup by removing useless 'rtc' prefix from fields
  rtc: Replace simple_strtoul by kstrtoul
  rtc: abx80x: add alarm support
  rtc: abx80x: Add Microcrystal rv1805 support
  rtc: v3020: constify v3020_chip_ops structures
  rtc: rv8803: Extend compatibility with the rx8900
  rtc: rv8803: fix handling return value of i2c_smbus_read_byte_data
  rtc: Add Epson RX8010SJ RTC driver
  rtc: lpc32xx: remove irq > NR_IRQS check from probe()
  rtc: imxdi: fix spelling mistake in warning message
  rtc: ds1685: don't try to micromanage sysfs output size
  rtc: use %ph for short hex dumps
  rtc: da9063: avoid writing undefined data to rtc
  rtc: sunxi: use of_device_get_match_data
  rtc: sunxi: constify the data_year_param structure
  rtc: sunxi: fix signedness issues
  ...
2016-01-18 12:10:45 -08:00
Linus Torvalds d43fb9f3c5 fbdev changes for 4.5
* pxafb: device-tree support
 * An unsafe kernel parameter 'lockless_register_fb' for debugging problems
   happening while inside the console lock
 * Small miscellaneous fixes & cleanups
 * omapdss: add writeback support functions
 * Separation of omapfb and omapdrm (see below)
 
 About the separation of omapfb and omapdrm, see
 http://permalink.gmane.org/gmane.comp.video.dri.devel/143151 for longer story.
 
 The short version:
 
 omapfb and omapdrm have shared low level drivers (omapdss and panel drivers),
 making further development of omapdrm difficult. After these patches omapfb and
 omapdrm have their own versions of the drivers, which are more or less
 direct copies for now but will diverge soon.
 
 This also means that omapfb (everything under drivers/video/fbdev/omap2/) is
 now in maintenance mode, and all new development will be done for omapdrm
 (drivers/gpu/drm/omapdrm/).
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWnKg1AAoJEPo9qoy8lh71lrsP/RwwG8FDMl2tgwcsKVa/VlbF
 oez/CaNeZ9Jz0qd5RbIzIS9QdbL0AZugg4twwl76UbHWT477Z3EbUmpw++78kasr
 RFKDWYqMbxw3kshRDyALinGQmxPOPjNnc5mt9CYKzK4x0pJSLBmZc8qaNK3L4a5a
 eLJ3h6UhQDY61D04qr+LuTCETAxNR78x+NNIG7vYa9oS0ZDDrhlDyVPw4akPDMS6
 6Y5NgtRL1h2mq2hLBgTDCrwx3p7yZbnkSRKbpFnw/yddiXilND1d75JoW+0F6vKW
 U8DiRKxYtHNBdry4HlpRwufT52wkmtA/2puCW5Smw8araQ7R+s+wOt/1HAYQM72g
 8UCmNFMbhBpk8x8pT24ja4wyTLM9gaZqG9MWHLPEPbE6WicxSbqEAvIX9sakXLv6
 dDaf1SHZ+DFpq0jOwC8Rcnx1JFeeNNDf5cJb2pZI2Zka5jayQRTdbxeZGGnpFzu1
 1ZMiNQ24U+n9hgjV9QMiCW24TEBXFhFTf0Nlne3VP7qUbmvLqMUdGxGwM+b25/El
 SW/peryWglxsn5EBA7XybK+RTYxbjDtD5a8SOjD2YTNqVVVFHgf7z05SfSmYO5yi
 H67eDqdt0YsEGG87I8hv3eKM7FSRlYAywTC2mPfSOJ3+/G+18OU/voepcJHZ15x7
 SO3e/TFTrtglJzjVzX8j
 =Nrji
 -----END PGP SIGNATURE-----

Merge tag 'fbdev-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux

Pull fbdev updates from Tomi Valkeinen:
 "Summary:

   - pxafb: device-tree support
   - An unsafe kernel parameter 'lockless_register_fb' for debugging
     problems happening while inside the console lock
   - Small miscellaneous fixes & cleanups
   - omapdss: add writeback support functions
   - Separation of omapfb and omapdrm (see below)

  About the separation of omapfb and omapdrm, see

    http://permalink.gmane.org/gmane.comp.video.dri.devel/143151

  for longer story.  The short version:

  omapfb and omapdrm have shared low level drivers (omapdss and panel
  drivers), making further development of omapdrm difficult.  After
  these patches omapfb and omapdrm have their own versions of the
  drivers, which are more or less direct copies for now but will diverge
  soon.

  This also means that omapfb (everything under drivers/video/fbdev/omap2/)
  is now in maintenance mode, and all new development will be done for
  omapdrm (drivers/gpu/drm/omapdrm/)"

* tag 'fbdev-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux: (49 commits)
  video: fbdev: pxafb: fix out of memory error path
  drm/omap: make omapdrm select OMAP2_DSS
  drm/omap: move omapdss & displays under omapdrm
  omapfb: move vrfb into omapfb
  omapfb: take omapfb's private omapdss into use
  omapfb/displays: change CONFIG_DISPLAY_* to CONFIG_FB_OMAP2_*
  omapfb/dss: change CONFIG_OMAP* to CONFIG_FB_OMAP*
  omapdss: remove CONFIG_OMAP2_DSS_VENC from omapdss.h
  omapfb: copy omapdss & displays for omapfb
  omapfb: allow compilation only if DRM_OMAP is disabled
  fbdev: omap2: panel-dpi: simplify gpio setting
  fbdev: omap2: panel-dpi: in .disable first disable backlight then display
  OMAPDSS: DSS: fix a warning message
  video: omapdss: delete unneeded of_node_put
  OMAPDSS: DISPC: Remove boolean comparisons
  OMAPDSS: DSI: cleanup DSI_IRQ_ERROR_MASK define
  OMAPDSS: remove extra out == NULL checks
  OMAPDSS: change internal dispc functions to static
  OMAPDSS: make a two dss feat funcs internal to omapdss
  OMAPDSS: remove extra EXPORT_SYMBOLs
  ...
2016-01-18 11:58:31 -08:00
Chris Metcalf 00d27c6336 numa: remove stale node_has_online_mem() define
This isn't used anywhere, so delete it.

Looks like the last usage (in x86-specific code) was removed by Tejun
in 2011 in commit bd6709a91a ("x86, NUMA: Make 32bit use common NUMA
init path").

Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
2016-01-18 14:49:33 -05:00
Linus Torvalds 5807fcaa9b Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:

 - EVM gains support for loading an x509 cert from the kernel
   (EVM_LOAD_X509), into the EVM trusted kernel keyring.

 - Smack implements 'file receive' process-based permission checking for
   sockets, rather than just depending on inode checks.

 - Misc enhancments for TPM & TPM2.

 - Cleanups and bugfixes for SELinux, Keys, and IMA.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (41 commits)
  selinux: Inode label revalidation performance fix
  KEYS: refcount bug fix
  ima: ima_write_policy() limit locking
  IMA: policy can be updated zero times
  selinux: rate-limit netlink message warnings in selinux_nlmsg_perm()
  selinux: export validatetrans decisions
  gfs2: Invalid security labels of inodes when they go invalid
  selinux: Revalidate invalid inode security labels
  security: Add hook to invalidate inode security labels
  selinux: Add accessor functions for inode->i_security
  security: Make inode argument of inode_getsecid non-const
  security: Make inode argument of inode_getsecurity non-const
  selinux: Remove unused variable in selinux_inode_init_security
  keys, trusted: seal with a TPM2 authorization policy
  keys, trusted: select hash algorithm for TPM2 chips
  keys, trusted: fix: *do not* allow duplicate key options
  tpm_ibmvtpm: properly handle interrupted packet receptions
  tpm_tis: Tighten IRQ auto-probing
  tpm_tis: Refactor the interrupt setup
  tpm_tis: Get rid of the duplicate IRQ probing code
  ...
2016-01-17 19:13:15 -08:00
Linus Torvalds 2d663b5581 Merge branch 'upstream' of git://git.infradead.org/users/pcmoore/audit
Pull audit updates from Paul Moore:
 "Seven audit patches for 4.5, all very minor despite the diffstat.

  The diffstat churn for linux/audit.h can be attributed to needing to
  reshuffle the linux/audit.h header to fix the seccomp auditing issue
  (see the commit description for details).

  Besides the seccomp/audit fix, most of the fixes are around trying to
  improve the connection with the audit daemon and a Kconfig
  simplification.  Nothing crazy, and everything passes our little
  audit-testsuite"

* 'upstream' of git://git.infradead.org/users/pcmoore/audit:
  audit: always enable syscall auditing when supported and audit is enabled
  audit: force seccomp event logging to honor the audit_enabled flag
  audit: Delete unnecessary checks before two function calls
  audit: wake up threads if queue switched from limited to unlimited
  audit: include auditd's threads in audit_log_start() wait exception
  audit: remove audit_backlog_wait_overflow
  audit: don't needlessly reset valid wait time
2016-01-17 18:48:49 -08:00
Linus Torvalds 0cbeafb245 Merge branch 'akpm' (patches from Andrew)
Merge second patch-bomb from Andrew Morton:

 - more MM stuff:

    - Kirill's page-flags rework

    - Kirill's now-allegedly-fixed THP rework

    - MADV_FREE implementation

    - DAX feature work (msync/fsync).  This isn't quite complete but DAX
      is new and it's good enough and the guys have a handle on what
      needs to be done - I expect this to be wrapped in the next week or
      two.

  - some vsprintf maintenance work

  - various other misc bits

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (145 commits)
  printk: change recursion_bug type to bool
  lib/vsprintf: factor out %pN[F] handler as netdev_bits()
  lib/vsprintf: refactor duplicate code to special_hex_number()
  printk-formats.txt: remove unimplemented %pT
  printk: help pr_debug and pr_devel to optimize out arguments
  lib/test_printf.c: test dentry printing
  lib/test_printf.c: add test for large bitmaps
  lib/test_printf.c: account for kvasprintf tests
  lib/test_printf.c: add a few number() tests
  lib/test_printf.c: test precision quirks
  lib/test_printf.c: check for out-of-bound writes
  lib/test_printf.c: don't BUG
  lib/kasprintf.c: add sanity check to kvasprintf
  lib/vsprintf.c: warn about too large precisions and field widths
  lib/vsprintf.c: help gcc make number() smaller
  lib/vsprintf.c: expand field_width to 24 bits
  lib/vsprintf.c: eliminate potential race in string()
  lib/vsprintf.c: move string() below widen_string()
  lib/vsprintf.c: pull out padding code from dentry_name()
  printk: do cond_resched() between lines while outputting to consoles
  ...
2016-01-17 12:58:52 -08:00
Linus Torvalds 58cf279aca GPIO bulk updates for the v4.5 kernel cycle:
Infrastructural changes:
 
 - In struct gpio_chip, rename the .dev node to .parent to better reflect
   the fact that this is not the GPIO struct device abstraction. We will
   add that soon so this would be totallt confusing.
 
 - It was noted that the driver .get_value() callbacks was
   sometimes reporting negative -ERR values to the gpiolib core, expecting
   them to be propagated to consumer gpiod_get_value() and gpio_get_value()
   calls. This was not happening, so as there was a mess of drivers
   returning negative errors and some returning "anything else than zero"
   to indicate that a line was active. As some would have bit 31 set to
   indicate "line active" it clashed with negative error codes. This is
   fixed by the largeish series clamping values in all drivers with
   !!value to [0,1] and then augmenting the code to propagate error codes
   to consumers. (Includes some ACKed patches in other subsystems.)
 
 - Add a void *data pointer to struct gpio_chip. The container_of() design
   pattern is indeed very nice, but we want to reform the struct gpio_chip
   to be a non-volative, stateless business, and keep states internal to
   the gpiolib to be able to hold on to the state when adding a proper
   userspace ABI (character device) further down the road. To achieve this,
   drivers need a handle at the internal state that is not dependent on
   their struct gpio_chip() so we add gpiochip_add_data() and
   gpiochip_get_data() following the pattern of many other subsystems.
   All the "use gpiochip data pointer" patches transforms drivers to this
   scheme.
 
 - The Generic GPIO chip header has been merged into the general
   <linux/gpio/driver.h> header, and the custom header for that removed.
   Instead of having a separate mm_gpio_chip struct for these generic
   drivers, merge that into struct gpio_chip, simplifying the code and
   removing the need for separate and confusing includes.
 
 Misc improvements:
 
 - Stabilize the way GPIOs are looked up from the ACPI legacy
   specification.
 
 - Incremental driver features for PXA, PCA953X, Lantiq (patches from the
   OpenWRT community), RCAR, Zynq, PL061, 104-idi-48
 
 New drivers:
 
 - Add a GPIO chip to the ALSA SoC AC97 driver.
 
 - Add a new Broadcom NSP SoC driver (this lands in the pinctrl dir, but
   the branch is merged here too to account for infrastructural changes).
 
 - The sx150x driver now supports the sx1502.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWmsZhAAoJEEEQszewGV1ztq0QAJ1KbNOpmf/s3INkOH4r771Z
 WIrNEsmwwLIAryo8gKNOM0H1zCwhRUV7hIE5jYWgD6JvjuAN6vobMlZAq21j6YpB
 pKgqnI5DuoND450xjb8wSwGQ5NTYp1rFXNmwCrtyTjOle6AAW+Kp2cvVWxVr77Av
 uJinRuuBr9GOKW/yYM1Fw/6EPjkvvhVOb+LBguRyVvq0s5Peyw7ZVeY1tjgPHJLn
 oSZ9dmPUjHEn91oZQbtfro3plOObcxdgJ8vo//pgEmyhMeR8XjXES+aUfErxqWOU
 PimrZuMMy4cxnsqWwh3Dyxo7KSWfJKfSPRwnGwc/HgbHZEoWxOZI1ezRtGKrRQtj
 vubxp5dUBA5z66TMsOCeJtzKVSofkvgX2Wr/Y9jKp5oy9cHdAZv9+jEHV1pr6asz
 Tas97MmmO77XuRI/GPDqVHx8dfa15OIz9s92+Gu64KxNzVxTo4+NdoPSNxkbCILO
 FKn7EmU3D0OjmN2NJ9GAURoFaj3BBUgNhaxacG9j2bieyh+euuUHRtyh2k8zXR9y
 8OnY1UOrTUYF8YIq9pXZxMQRD/lqwCNHvEjtI6BqMcNx4MptfTL+FKYUkn/SgCYk
 QTNV6Ui+ety5D5aEpp5q0ItGsrDJ2LYSItsS+cOtMy2ieOxbQav9NWwu7eI3l5ly
 gwYTZjG9p9joPXLW0E3g
 =63rR
 -----END PGP SIGNATURE-----

Merge tag 'gpio-v4.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio

Pull GPIO updates from Linus Walleij:
 "Here is the bulk of GPIO changes for v4.5.

  Notably there are big refactorings mostly by myself, aimed at getting
  the gpio_chip into a shape that makes me believe I can proceed to
  preserve state for a proper userspace ABI (character device) that has
  already been proposed once, but resulted in the feedback that I need
  to go back and restructure stuff.  So I've been restructuring stuff.
  On the way I ran into brokenness (return code from the get_value()
  callback) and had to fix it.  Also, refactored generic GPIO to be
  simpler.

  Some of that is still waiting to trickle down from the subsystems all
  over the kernel that provide random gpio_chips, I've touched every
  single GPIO driver in the kernel now, oh man I didn't know I was
  responsible for so much...

  Apart from that we're churning along as usual.

  I took some effort to test and retest so it should merge nicely and we
  shook out a couple of bugs in -next.

  Infrastructural changes:

   - In struct gpio_chip, rename the .dev node to .parent to better
     reflect the fact that this is not the GPIO struct device
     abstraction.  We will add that soon so this would be totallt
     confusing.

   - It was noted that the driver .get_value() callbacks was sometimes
     reporting negative -ERR values to the gpiolib core, expecting them
     to be propagated to consumer gpiod_get_value() and gpio_get_value()
     calls.  This was not happening, so as there was a mess of drivers
     returning negative errors and some returning "anything else than
     zero" to indicate that a line was active.  As some would have bit
     31 set to indicate "line active" it clashed with negative error
     codes.  This is fixed by the largeish series clamping values in all
     drivers with !!value to [0,1] and then augmenting the code to
     propagate error codes to consumers.  (Includes some ACKed patches
     in other subsystems.)

   - Add a void *data pointer to struct gpio_chip.  The container_of()
     design pattern is indeed very nice, but we want to reform the
     struct gpio_chip to be a non-volative, stateless business, and keep
     states internal to the gpiolib to be able to hold on to the state
     when adding a proper userspace ABI (character device) further down
     the road.  To achieve this, drivers need a handle at the internal
     state that is not dependent on their struct gpio_chip() so we add
     gpiochip_add_data() and gpiochip_get_data() following the pattern
     of many other subsystems.  All the "use gpiochip data pointer"
     patches transforms drivers to this scheme.

   - The Generic GPIO chip header has been merged into the general
     <linux/gpio/driver.h> header, and the custom header for that
     removed.  Instead of having a separate mm_gpio_chip struct for
     these generic drivers, merge that into struct gpio_chip,
     simplifying the code and removing the need for separate and
     confusing includes.

  Misc improvements:

   - Stabilize the way GPIOs are looked up from the ACPI legacy
     specification.

   - Incremental driver features for PXA, PCA953X, Lantiq (patches from
     the OpenWRT community), RCAR, Zynq, PL061, 104-idi-48

  New drivers:

   - Add a GPIO chip to the ALSA SoC AC97 driver.

   - Add a new Broadcom NSP SoC driver (this lands in the pinctrl dir,
     but the branch is merged here too to account for infrastructural
     changes).

   - The sx150x driver now supports the sx1502"

* tag 'gpio-v4.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: (220 commits)
  gpio: generic: make bgpio_pdata always visible
  gpiolib: fix chip order in gpio list
  gpio: mpc8xxx: Do not use gpiochip_get_data() in mpc8xxx_gpio_save_regs()
  gpio: mm-lantiq: Do not use gpiochip_get_data() in ltq_mm_save_regs()
  gpio: brcmstb: Allow building driver for BMIPS_GENERIC
  gpio: brcmstb: Set endian flags for big-endian MIPS
  gpio: moxart: fix build regression
  gpio: xilinx: Do not use gpiochip_get_data() in xgpio_save_regs()
  leds: pca9532: use gpiochip data pointer
  leds: tca6507: use gpiochip data pointer
  hid: cp2112: use gpiochip data pointer
  bcma: gpio: use gpiochip data pointer
  avr32: gpio: use gpiochip data pointer
  video: fbdev: via: use gpiochip data pointer
  gpio: pch: Optimize pch_gpio_get()
  Revert "pinctrl: lantiq: Implement gpio_chip.to_irq"
  pinctrl: nsp-gpio: use gpiochip data pointer
  pinctrl: vt8500-wmt: use gpiochip data pointer
  pinctrl: exynos5440: use gpiochip data pointer
  pinctrl: at91-pio4: use gpiochip data pointer
  ...
2016-01-17 12:32:01 -08:00
Linus Torvalds 6606b342fe Merge git://www.linux-watchdog.org/linux-watchdog
Pull watchdog updates from Wim Van Sebroeck:
 "This adds following items:

   - watchdog restart handler support
   - watchdog reboot notifier support
   - watchdog sysfs attributes
   - support for the following new devices: AMD Mullins platform, AMD
     Carrizo platform, meson8b SoC, CSRatlas7, TS-4800, Alphascale
     asm9260-wdt, Zodiac, Sigma Designs SMP86xx/SMP87xx
   - Changes in refcounting for the watchdog core
   - watchdog core improvements
   - and small fixes"

* git://www.linux-watchdog.org/linux-watchdog: (60 commits)
  watchdog: asm9260: remove __init and __exit annotations
  watchdog: Drop pointer to watchdog device from struct watchdog_device
  watchdog: ziirave: Use watchdog infrastructure to create sysfs attributes
  watchdog: Add support for creating driver specific sysfs attributes
  watchdog: kill unref/ref ops
  watchdog: stmp3xxx: Remove unused variables
  watchdog: add MT7621 watchdog support
  hwmon: (sch56xx) Drop watchdog driver data reference count callbacks
  watchdog: da9055_wdt: Drop reference counting
  watchdog: da9052_wdt: Drop reference counting
  watchdog: Separate and maintain variables based on variable lifetime
  watchdog: diag288: Stop re-using watchdog core internal flags
  watchdog: Create watchdog device in watchdog_dev.c
  watchdog: qcom-wdt: Do not set 'dev' in struct watchdog_device
  watchdog: mena21: Do not use device pointer from struct watchdog_device
  watchdog: gpio: Do not use device pointer from struct watchdog_device
  watchdog: tangox: Print info message using pointer to platform device
  watchdog: bcm2835_wdt: Drop log message if watchdog is stopped
  devicetree: watchdog: add binding for Sigma Designs SMP8642 watchdog
  watchdog: add support for Sigma Designs SMP86xx/SMP87xx
  ...
2016-01-17 12:15:38 -08:00
Linus Torvalds a016af2e70 sound updates for 4.5-rc1
We've had quite busy weeks in this cycle.  Looking at ALSA core, the
 significant changes are a few fixes wrt timer and sequencer ioctls
 that have been revealed by fuzzer recently.  Other than that, ASoC
 core got a few updates about DAI link handling, but these are rather
 straightforward refactoring.
 
 In drivers scene, ASoC received quite lots of new drivers in addition
 to bunch of updates for still ongoing Intel Skylake support and
 topology API.  HD-audio gained a new HDMI/DP hotplug notification via
 component.  FireWire got a pile of code refactoring/updates with
 SCS.1x driver integration.
 
 More highlights are shown below.
 
 [NOTE: this contains also many commits for DRM.  This is due to the
  pull of drm stable branch into sound tree, as the base of i915 audio
  component work for HD-audio.  The highlights below don't contain
  these DRM changes, as these are supposed to be pulled via drm tree in
  anyway sooner or later.]
 
 Core
  - Handful fixes to harden ALSA timer and sequencer ioctls against
    races reported by syzkaller fuzzer
  - Irq description string can be unique to each card; only for
    HD-audio for now
 
 ASoC
  - Conversion of the array of DAI links to a list for supporting
    dynamically adding and removing DAI links
  - Topology API enhancements to make everything more component based
    and being able to specify PCM links via topology
  - Some more fixes for the topology code, though it is still not final
    and ready for enabling in production; we really need to get to the
    point where that can be done
  - A pile of changes for Intel SkyLake drivers which hopefully deliver
    some useful initial functionality for systems with this chipset,
    though there is more work still to come
  - Lots of new features and cleanups for the Renesas drivers
  - ANC support for WM5110
  - New drivers: Imagination Technologies IPs, Atmel class D speaker,
    Cirrus CS47L24 and WM1831, Dialog DA7128, Realtek RT5659 and
    RT56156, Rockchip RK3036, TI PC3168A, and AMD ACP
  - Rename PCM1792a driver to be generic pcm179x
 
 HD-Audio
  - Use audio component for i915 HDMI/DP hotplug handling
  - On-demand binding with i915 driver
  - bdl_pos_adj parameter adjustment for Baytrail controllers
  - Enable power_save_node for CX20722; this shouldn't lead to
    regression, hopefully
  - Kabylake HDMI/DP codec support
  - Quirks for Lenovo E50-80, Dell Latitude E-series, and other Dell
    machines
  - A few code refactoring
 
 FireWire
  - Lots of code cleanup and refactoring
  - Integrate the support of SCS.1x devices into snd-oxfw driver;
    snd-scs1x driver is obsoleted
 
 USB-audio
  - Fix possible NULL dereference at disconnection
  - A regression fix for Native Instruments devices
 
 Misc
  - A few code cleanups of fm801 driver
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJWmmhNAAoJEGwxgFQ9KSmk/wsP/3eO+giAT9VRPa6qxR6VdT6I
 dZwTxcp4ZzUrgLxk9k5VYjqey6QL+1xWfl3Abrd+NzXDj1wo4KsDh2XCKG1btO9K
 UpIZf76Nzt7o91pzHbsU6mrjDeoVNqloZoGbg1utAmmegaXH3owd18p/ZHfE3sz2
 BbaHmYW/R8lnaBgBhzqJB97+zRaLJmMWpWHfpHaIPjdfw8/V4j76jtPnpmv2hDZl
 BHXVHcQXjVGunFRzxdzBLuTC+FmhzUeTAbbAdOT4fEoOCv5MtZqYppNxdhj+b9l5
 mrsXe5FBTNmrt9Z5TtfCuzgJPkzoDperFb0aKd7wI1jVMtLzkNCMlanHr9U6B6fr
 jSrs6l25xrpF1BBfRMfHjNudA5vng/XC5dtW00JofXSrIxtwPNUoDDiqJgw7xVm5
 aVWK7KkQIjRbHdCQaeTymv70oHHKei92hbCrXUobXZ7wLeJMXNVPT25ttChWrgAI
 7cu5h+K5PjReI/sJFTMPL4aHZ+jAn9quQl7vK8EXiL9E6G8lLiuBiVW6hjGd9At+
 Z6UyGV+nCM6O3qZcyParMuLkNtWx9uT7Pcn8oTZAdKPngNhsf8+yl9qmsFkNLDC4
 LKPx0+rdCjtMKn2du3krsHhG3EN9pLDrE6g5U3d6Cz83e69Y7fCuSjl31SjD91H0
 bZDcM/ejYSbid3yKN4TL
 =Gvgb
 -----END PGP SIGNATURE-----

Merge tag 'sound-4.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound updates from Takashi Iwai:
 "We've had quite busy weeks in this cycle.  Looking at ALSA core, the
  significant changes are a few fixes wrt timer and sequencer ioctls
  that have been revealed by fuzzer recently.  Other than that, ASoC
  core got a few updates about DAI link handling, but these are rather
  straightforward refactoring.

  In drivers scene, ASoC received quite lots of new drivers in addition
  to bunch of updates for still ongoing Intel Skylake support and
  topology API.  HD-audio gained a new HDMI/DP hotplug notification via
  component.  FireWire got a pile of code refactoring/updates with
  SCS.1x driver integration.

  More highlights are shown below.

  [ NOTE: this contains also many commits for DRM.  This is due to the
    pull of drm stable branch into sound tree, as the base of i915 audio
    component work for HD-audio.  The highlights below don't contain
    these DRM changes, as these are supposed to be pulled via drm tree
    in anyway sooner or later.  ]

  Core:
   - Handful fixes to harden ALSA timer and sequencer ioctls against
     races reported by syzkaller fuzzer
   - Irq description string can be unique to each card; only for
     HD-audio for now

  ASoC:
   - Conversion of the array of DAI links to a list for supporting
     dynamically adding and removing DAI links
   - Topology API enhancements to make everything more component based
     and being able to specify PCM links via topology
   - Some more fixes for the topology code, though it is still not final
     and ready for enabling in production; we really need to get to the
     point where that can be done
   - A pile of changes for Intel SkyLake drivers which hopefully deliver
     some useful initial functionality for systems with this chipset,
     though there is more work still to come
   - Lots of new features and cleanups for the Renesas drivers
   - ANC support for WM5110
   - New drivers: Imagination Technologies IPs, Atmel class D speaker,
     Cirrus CS47L24 and WM1831, Dialog DA7128, Realtek RT5659 and
     RT56156, Rockchip RK3036, TI PC3168A, and AMD ACP
   - Rename PCM1792a driver to be generic pcm179x

  HD-Audio:
   - Use audio component for i915 HDMI/DP hotplug handling
   - On-demand binding with i915 driver
   - bdl_pos_adj parameter adjustment for Baytrail controllers
   - Enable power_save_node for CX20722; this shouldn't lead to
     regression, hopefully
   - Kabylake HDMI/DP codec support
   - Quirks for Lenovo E50-80, Dell Latitude E-series, and other Dell
     machines
   - A few code refactoring

  FireWire:
   - Lots of code cleanup and refactoring
   - Integrate the support of SCS.1x devices into snd-oxfw driver;
     snd-scs1x driver is obsoleted

  USB-audio:
   - Fix possible NULL dereference at disconnection
   - A regression fix for Native Instruments devices

  Misc:
   - A few code cleanups of fm801 driver"

* tag 'sound-4.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (722 commits)
  ALSA: timer: Code cleanup
  ALSA: timer: Harden slave timer list handling
  ALSA: hda - Add fixup for Dell Latitidue E6540
  ALSA: timer: Fix race among timer ioctls
  ALSA: hda - add codec support for Kabylake display audio codec
  ALSA: timer: Fix double unlink of active_list
  ALSA: usb-audio: Fix mixer ctl regression of Native Instrument devices
  ALSA: hda - fix the headset mic detection problem for a Dell laptop
  ALSA: hda - Fix white noise on Dell Latitude E5550
  ALSA: hda_intel: add card number to irq description
  ALSA: seq: Fix race at timer setup and close
  ALSA: seq: Fix missing NULL check at remove_events ioctl
  ALSA: usb-audio: Avoid calling usb_autopm_put_interface() at disconnect
  ASoC: hdac_hdmi: remove unused hdac_hdmi_query_pin_connlist
  ASoC: AMD: Add missing include file
  ALSA: hda - Fixup inverted internal mic for Lenovo E50-80
  ALSA: usb: Add native DSD support for Oppo HA-1
  ASoC: Make aux_dev more like a generic component
  ASoC: bcm2835: cleanup includes by ordering them alphabetically
  ASoC: AMD: Manage ACP 2.x SRAM banks power
  ...
2016-01-17 12:05:31 -08:00
Doron Tsur 0b6e26ce89 net/mlx5_core: Fix trimming down IRQ number
With several ConnectX-4 cards installed on a server, one may receive
irqn > 255 from the kernel API, which we mistakenly trim to 8bit.

This causes EQ creation failure with the following stack trace:
[<ffffffff812a11f4>] dump_stack+0x48/0x64
[<ffffffff810ace21>] __setup_irq+0x3a1/0x4f0
[<ffffffff810ad7e0>] request_threaded_irq+0x120/0x180
[<ffffffffa0923660>] ? mlx5_eq_int+0x450/0x450 [mlx5_core]
[<ffffffffa0922f64>] mlx5_create_map_eq+0x1e4/0x2b0 [mlx5_core]
[<ffffffffa091de01>] alloc_comp_eqs+0xb1/0x180 [mlx5_core]
[<ffffffffa091ea99>] mlx5_dev_init+0x5e9/0x6e0 [mlx5_core]
[<ffffffffa091ec29>] init_one+0x99/0x1c0 [mlx5_core]
[<ffffffff812e2afc>] local_pci_probe+0x4c/0xa0

Fixing it by changing of the irqn type from u8 to unsigned int to
support values > 255

Fixes: 61d0e73e0a ('net/mlx5_core: Use the the real irqn in eq->irqn')
Reported-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Doron Tsur <doront@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-17 12:08:04 -05:00
Thomas Gleixner 203cbf77de hrtimer: Handle remaining time proper for TIME_LOW_RES
If CONFIG_TIME_LOW_RES is enabled we add a jiffie to the relative timeout to
prevent short sleeps, but we do not account for that in interfaces which
retrieve the remaining time.

Helge observed that timerfd can return a remaining time larger than the
relative timeout. That's not expected and breaks userland test programs.

Store the information that the timer was armed relative and provide functions
to adjust the remaining time. To avoid bloating the hrtimer struct make state
a u8, which as a bonus results in better code on x86 at least.

Reported-and-tested-by: Helge Deller <deller@gmx.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: John Stultz <john.stultz@linaro.org>
Cc: linux-m68k@lists.linux-m68k.org
Cc: dhowells@redhat.com
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20160114164159.273328486@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-01-17 11:13:55 +01:00
Aaron Conole fe22cd9b7c printk: help pr_debug and pr_devel to optimize out arguments
Currently, pr_debug and pr_devel will not elide function call arguments
appearing in calls to the no_printk for these macros.  This is because
all side effects must be honored before proceeding to the 0-value
assignment in no_printk.

The behavior is contrary to documentation found in the CodingStyle and
the header file where these functions are declared.

This patch corrects that behavior by shunting out the call to no_printk
completely.  The format string is still checked by gcc for correctness,
but no code seems to be emitted in common cases.

[akpm@linux-foundation.org: remove braces, per Joe]
Fixes: 5264f2f75d ("include/linux/printk.h: use and neaten no_printk")
Signed-off-by: Aaron Conole <aconole@redhat.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Joe Perches <joe@perches.com>
Cc: Jason Baron <jbaron@akamai.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-16 11:17:29 -08:00
Tejun Heo 8d91f8b153 printk: do cond_resched() between lines while outputting to consoles
@console_may_schedule tracks whether console_sem was acquired through
lock or trylock.  If the former, we're inside a sleepable context and
console_conditional_schedule() performs cond_resched().  This allows
console drivers which use console_lock for synchronization to yield
while performing time-consuming operations such as scrolling.

However, the actual console outputting is performed while holding
irq-safe logbuf_lock, so console_unlock() clears @console_may_schedule
before starting outputting lines.  Also, only a few drivers call
console_conditional_schedule() to begin with.  This means that when a
lot of lines need to be output by console_unlock(), for example on a
console registration, the task doing console_unlock() may not yield for
a long time on a non-preemptible kernel.

If this happens with a slow console devices, for example a serial
console, the outputting task may occupy the cpu for a very long time.
Long enough to trigger softlockup and/or RCU stall warnings, which in
turn pile more messages, sometimes enough to trigger the next cycle of
warnings incapacitating the system.

Fix it by making console_unlock() insert cond_resched() between lines if
@console_may_schedule.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Calvin Owens <calvinowens@fb.com>
Acked-by: Jan Kara <jack@suse.com>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Kyle McMartin <kyle@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-16 11:17:25 -08:00
Viresh Kumar dfffa587a6 err.h: add (missing) unlikely() to IS_ERR_OR_NULL()
IS_ERR_VALUE() already contains it and so we need to add this only to
the !ptr check.  That will allow users of IS_ERR_OR_NULL(), to not add
this compiler flag.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-16 11:17:24 -08:00
Yaowei Bai 3f1bfd9413 include/linux/kdev_t.h: remove new_valid_dev()
As all new_valid_dev() checks have been removed it's time to drop
new_valid_dev() itself.

No functional change.

Signed-off-by: Yaowei Bai <baiyaowei@cmss.chinamobile.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-16 11:17:23 -08:00
Michal Nazarewicz 8f57e4d930 include/linux/kernel.h: change abs() macro so it uses consistent return type
Rewrite abs() so that its return type does not depend on the
architecture and no unexpected type conversion happen inside of it.  The
only conversion is from unsigned to signed type.  char is left as a
return type but treated as a signed type regradless of it's actual
signedness.

With the old version, int arguments were promoted to long and depending
on architecture a long argument might result in s64 or long return type
(which may or may not be the same).

This came after some back and forth with Nicolas.  The current macro has
different return type (for the same input type) depending on
architecture which might be midly iritating.

An alternative version would promote to int like so:

	#define abs(x)	__abs_choose_expr(x, long long,			\
			__abs_choose_expr(x, long,			\
			__builtin_choose_expr(				\
				sizeof(x) <= sizeof(int),		\
				({ int __x = (x); __x<0?-__x:__x; }),	\
				((void)0))))

I have no preference but imagine Linus might.  :] Nicolas argument against
is that promoting to int causes iconsistent behaviour:

	int main(void) {
		unsigned short a = 0, b = 1, c = a - b;
		unsigned short d = abs(a - b);
		unsigned short e = abs(c);
		printf("%u %u\n", d, e);  // prints: 1 65535
	}

Then again, no sane person expects consistent behaviour from C integer
arithmetic.  ;)

Note:

  __builtin_types_compatible_p(unsigned char, char) is always false, and
  __builtin_types_compatible_p(signed char, char) is also always false.

Signed-off-by: Michal Nazarewicz <mina86@mina86.com>
Reviewed-by: Nicolas Pitre <nico@linaro.org>
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: Wey-Yi Guy <wey-yi.w.guy@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-16 11:17:22 -08:00
Vasily Kulikov b8a0255db9 include/linux/poison.h: use POISON_POINTER_DELTA for poison pointers
TIMER_ENTRY_STATIC and TAIL_MAPPING are defined as poison pointers which
should point to nowhere.  Redefine them using POISON_POINTER_DELTA
arithmetics to make sure they really point to non-mappable area declared
by the target architecture.

Signed-off-by: Vasily Kulikov <segoon@openwall.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Solar Designer <solar@openwall.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-16 11:17:22 -08:00
Linus Torvalds ece6267878 The clk framework and driver changes for 4.5 look pretty typical. The
bulk of the changes are to clk controller drivers, though some
 improvements to the core and some re-usable blocks/templates also
 received some love. In this past cycle the clk maintainers developed a
 good workflow for handling the common case of patch submissions
 containing a new drivers, new shared Device Tree header and a new Device
 Tree binding description. This requires coordination with the Device
 Tree maintainers and with the architecture maintainers (typically the
 arm-soc tree in our case). This explains the increase in changes to
 include/dt-bindings/... and to
 Documentation/devicetree/bindings/clock/... coming from the clk tree.
 The same commits can be expected to come through those trees on
 occasion, through the use of shared, immutable branches.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJWmVD1AAoJEKI6nJvDJaTUWqAQAJRqH+iKSnLyuVek6USYPXdp
 ZOW1JHIySCUF/ci3fvv9vbIuW/w4Uc9FdRdAfIIaWO32bpk4ljtEogVasvavL/54
 geN+0YERZpth+PL/wk+g+Dons/8BgN9AL+0ToaVVh2MsRE903jWpe+l0qWCl+SUn
 DQF+yB9F4QcVT7gb+KI0B6hr6lv5Mu/t9Ffq3DrK2UG71IEbrv953pVo19foZqbf
 FgMgduERMHvaM/R6p5xfXxIESbjE+QNMnykEo6bWcHF2wfQrIiYlW1khtnsigNus
 kze2mYWjG77KGkrOex5kwjuBDPfiaGAstk3jcRCCMB7nRmuFcJgryF8003CD3QvW
 +cY+ZBSkyXXtL/nrkefebAplQjsvum7dJ+W6hxA32B772jFL8s8ee5rGda0ibw8N
 nFQRopVcNPoR52tjmSEOxm7Z9vqoDj5qManDdZe62kr6bCFId97E6SzeGgeyDuBQ
 V7tkss9klHMhx29V37wkyl+A/pWXO+CsGzHLEivH8/L1CysMK1WGhyHJJ2/JTDTR
 n3z1EYdg67PbHpfhboscuLP+sHcrAefOCe2la4wKxwqd4DGw6J8RhhVZml6XnfQ3
 I4yLri8Pguol49G1ac1U87ylebKsXhWR5CRHVas1BFHrt0FGULHnl0ZOLzOu4GRZ
 h8/nd9ZODLFhB1w7fL1B
 =nkGr
 -----END PGP SIGNATURE-----

Merge tag 'clk-for-linus-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux

Pull clk framework updates from Michael Turquette:
 "The clk framework and driver changes for 4.5 look pretty typical.  The
  bulk of the changes are to clk controller drivers, though some
  improvements to the core and some re-usable blocks/templates also
  received some love.

  In this past cycle the clk maintainers developed a good workflow for
  handling the common case of patch submissions containing a new
  drivers, new shared Device Tree header and a new Device Tree binding
  description.  This requires coordination with the Device Tree
  maintainers and with the architecture maintainers (typically the
  arm-soc tree in our case).

  This explains the increase in changes to include/dt-bindings/... and
  to Documentation/devicetree/bindings/clock/... coming from the clk
  tree.  The same commits can be expected to come through those trees on
  occasion, through the use of shared, immutable branches"

* tag 'clk-for-linus-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (125 commits)
  clk: remove duplicated COMMON_CLK_NXP record from clk/Kconfig
  clk: fix clk-gpio.c with optional clock= DT property
  clk: rockchip: fix section mismatches with new child-clocks
  clk: gpio: handle error codes for of_clk_get_parent_count()
  clk: gpio: fix memory leak
  clk: shmobile: r8a7795: Add SATA0 clock
  clk: bcm2835: Add PWM clock support
  clk: bcm2835: Support for clock parent selection
  clk: bcm2835: add a round up ability to the clock divisor
  clk: lpc32xx: add common clock framework driver
  clk: lpc18xx: add NXP specific COMMON_CLK_NXP configuration symbol
  dt-bindings: clock: add NXP LPC32xx clock list for consumers
  dt-bindings: clock: add description of LPC32xx USB clock controller
  dt-bindings: clock: add description of LPC32xx clock controller
  clk: rockchip: rk3036: include downstream muxes into fractional dividers
  clk: add flag for clocks that need to be enabled on rate changes
  clk: rockchip: Allow the RK3288 SPDIF clocks to change their parent
  clk: rockchip: include downstream muxes into fractional dividers
  clk: rockchip: handle mux dependency of fractional dividers
  clk: bcm2835: Add a driver for the auxiliary peripheral clock gates.
  ...
2016-01-15 18:21:28 -08:00
Linus Torvalds d45187aaf0 Merge branch 'dmi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging
Pull dmi updates from Jean Delvare.

* 'dmi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
  firmware: dmi_scan: Save SMBIOS Type 9 System Slots
  firmware: dmi_scan: Fix dmi_find_device description
  firmware: dmi_scan: Clarify dmi_save_extended_devices
  firmware: dmi_scan: Optimize dmi_save_extended_devices
2016-01-15 18:12:18 -08:00
Wang Xiaoqiang 7f43add451 mm/mlock.c: change can_do_mlock return value type to boolean
Since can_do_mlock only return 1 or 0, so make it boolean.

No functional change.

[akpm@linux-foundation.org: update declaration in mm.h]
Signed-off-by: Wang Xiaoqiang <wangxq10@lzu.edu.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15 17:56:32 -08:00
Kirill A. Shutemov 036fbb21de memblock: fix section mismatch
allmodconfig produces following warning for me:

  WARNING: vmlinux.o(.text.unlikely+0x10314): Section mismatch in reference from the function movable_node_is_enabled() to the variable .meminit.data:movable_node_enabled
  The function movable_node_is_enabled() references
  the variable __meminitdata movable_node_enabled.
  This is often because movable_node_is_enabled lacks a __meminitdata
  annotation or the annotation of movable_node_enabled is wrong.

Let's mark the function with __meminit.  It fixes the warning.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15 17:56:32 -08:00
Dominik Dingel 4a9e1cda27 mm: bring in additional flag for fixup_user_fault to signal unlock
During Jason's work with postcopy migration support for s390 a problem
regarding gmap faults was discovered.

The gmap code will call fixup_user_fault which will end up always in
handle_mm_fault.  Till now we never cared about retries, but as the
userfaultfd code kind of relies on it.  this needs some fix.

This patchset does not take care of the futex code.  I will now look
closer at this.

This patch (of 2):

With the introduction of userfaultfd, kvm on s390 needs fixup_user_fault
to pass in FAULT_FLAG_ALLOW_RETRY and give feedback if during the
faulting we ever unlocked mmap_sem.

This patch brings in the logic to handle retries as well as it cleans up
the current documentation.  fixup_user_fault was not having the same
semantics as filemap_fault.  It never indicated if a retry happened and
so a caller wasn't able to handle that case.  So we now changed the
behaviour to always retry a locked mmap_sem.

Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: "Jason J. Herne" <jjherne@linux.vnet.ibm.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Eric B Munson <emunson@akamai.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Dominik Dingel <dingel@linux.vnet.ibm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15 17:56:32 -08:00
Dan Williams 3565fce3a6 mm, x86: get_user_pages() for dax mappings
A dax mapping establishes a pte with _PAGE_DEVMAP set when the driver
has established a devm_memremap_pages() mapping, i.e.  when the pfn_t
return from ->direct_access() has PFN_DEV and PFN_MAP set.  Later, when
encountering _PAGE_DEVMAP during a page table walk we lookup and pin a
struct dev_pagemap instance to keep the result of pfn_to_page() valid
until put_page().

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Tested-by: Logan Gunthorpe <logang@deltatee.com>
Cc: Dave Hansen <dave@sr71.net>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15 17:56:32 -08:00
Dan Williams 5c7fb56e5e mm, dax: dax-pmd vs thp-pmd vs hugetlbfs-pmd
A dax-huge-page mapping while it uses some thp helpers is ultimately not
a transparent huge page.  The distinction is especially important in the
get_user_pages() path.  pmd_devmap() is used to distinguish dax-pmds
from pmd_huge() and pmd_trans_huge() which have slightly different
semantics.

Explicitly mark the pmd_trans_huge() helpers that dax needs by adding
pmd_devmap() checks.

[kirill.shutemov@linux.intel.com: fix regression in handling mlocked pages in  __split_huge_pmd()]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave@sr71.net>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15 17:56:32 -08:00
Dan Williams 5c2c2587b1 mm, dax, pmem: introduce {get|put}_dev_pagemap() for dax-gup
get_dev_page() enables paths like get_user_pages() to pin a dynamically
mapped pfn-range (devm_memremap_pages()) while the resulting struct page
objects are in use.  Unlike get_page() it may fail if the device is, or
is in the process of being, disabled.  While the initial lookup of the
range may be an expensive list walk, the result is cached to speed up
subsequent lookups which are likely to be in the same mapped range.

devm_memremap_pages() now requires a reference counter to be specified
at init time.  For pmem this means moving request_queue allocation into
pmem_alloc() so the existing queue usage counter can track "device
pages".

ZONE_DEVICE pages always have an elevated count and will never be on an
lru reclaim list.  That space in 'struct page' can be redirected for
other uses, but for safety introduce a poison value that will always
trip __list_add() to assert.  This allows half of the struct list_head
storage to be reclaimed with some assurance to back up the assumption
that the page count never goes to zero and a list_add() is never
attempted.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Tested-by: Logan Gunthorpe <logang@deltatee.com>
Cc: Dave Hansen <dave@sr71.net>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15 17:56:32 -08:00
Dan Williams f25748e3c3 mm, dax: convert vmf_insert_pfn_pmd() to pfn_t
Similar to the conversion of vm_insert_mixed() use pfn_t in the
vmf_insert_pfn_pmd() to tag the resulting pte with _PAGE_DEVICE when the
pfn is backed by a devm_memremap_pages() mapping.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave@sr71.net>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15 17:56:32 -08:00
Dan Williams 01c8f1c44b mm, dax, gpu: convert vm_insert_mixed to pfn_t
Convert the raw unsigned long 'pfn' argument to pfn_t for the purpose of
evaluating the PFN_MAP and PFN_DEV flags.  When both are set it triggers
_PAGE_DEVMAP to be set in the resulting pte.

There are no functional changes to the gpu drivers as a result of this
conversion.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave@sr71.net>
Cc: David Airlie <airlied@linux.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15 17:56:32 -08:00
Dan Williams 888cdbc2c9 hugetlb: fix compile error on tile
Inlude asm/pgtable.h to get the definition for pud_t to fix:

  include/linux/hugetlb.h:203:29: error: unknown type name 'pud_t'

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Cc: Liviu Dudau <liviu.dudau@arm.com>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15 17:56:32 -08:00
Dan Williams 4b94ffdc41 x86, mm: introduce vmem_altmap to augment vmemmap_populate()
In support of providing struct page for large persistent memory
capacities, use struct vmem_altmap to change the default policy for
allocating memory for the memmap array.  The default vmemmap_populate()
allocates page table storage area from the page allocator.  Given
persistent memory capacities relative to DRAM it may not be feasible to
store the memmap in 'System Memory'.  Instead vmem_altmap represents
pre-allocated "device pages" to satisfy vmemmap_alloc_block_buf()
requests.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reported-by: kbuild test robot <lkp@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15 17:56:32 -08:00
Dan Williams 9476df7d80 mm: introduce find_dev_pagemap()
There are several scenarios where we need to retrieve and update
metadata associated with a given devm_memremap_pages() mapping, and the
only lookup key available is a pfn in the range:

1/ We want to augment vmemmap_populate() (called via arch_add_memory())
   to allocate memmap storage from pre-allocated pages reserved by the
   device driver.  At vmemmap_alloc_block_buf() time it grabs device pages
   rather than page allocator pages.  This is in support of
   devm_memremap_pages() mappings where the memmap is too large to fit in
   main memory (i.e. large persistent memory devices).

2/ Taking a reference against the mapping when inserting device pages
   into the address_space radix of a given inode.  This facilitates
   unmap_mapping_range() and truncate_inode_pages() operations when the
   driver is tearing down the mapping.

3/ get_user_pages() operations on ZONE_DEVICE memory require taking a
   reference against the mapping so that the driver teardown path can
   revoke and drain usage of device pages.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Tested-by: Logan Gunthorpe <logang@deltatee.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15 17:56:32 -08:00
Dan Williams 260ae3f7db mm: skip memory block registration for ZONE_DEVICE
Prevent userspace from trying and failing to online ZONE_DEVICE pages
which are meant to never be onlined.

For example on platforms with a udev rule like the following:

  SUBSYSTEM=="memory", ACTION=="add", ATTR{state}=="offline", ATTR{state}="online"

...will generate futile attempts to online the ZONE_DEVICE sections.
Example kernel messages:

    Built 1 zonelists in Node order, mobility grouping on.  Total pages: 1004747
    Policy zone: Normal
    online_pages [mem 0x248000000-0x24fffffff] failed

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15 17:56:32 -08:00
Dan Williams 34c0fd540e mm, dax, pmem: introduce pfn_t
For the purpose of communicating the optional presence of a 'struct
page' for the pfn returned from ->direct_access(), introduce a type that
encapsulates a page-frame-number plus flags.  These flags contain the
historical "page_link" encoding for a scatterlist entry, but can also
denote "device memory".  Where "device memory" is a set of pfns that are
not part of the kernel's linear mapping by default, but are accessed via
the same memory controller as ram.

The motivation for this new type is large capacity persistent memory
that needs struct page entries in the 'memmap' to support 3rd party DMA
(i.e.  O_DIRECT I/O with a persistent memory source/target).  However,
we also need it in support of maintaining a list of mapped inodes which
need to be unmapped at driver teardown or freeze_bdev() time.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Hansen <dave@sr71.net>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15 17:56:32 -08:00
Dan Williams ba049e93ae kvm: rename pfn_t to kvm_pfn_t
To date, we have implemented two I/O usage models for persistent memory,
PMEM (a persistent "ram disk") and DAX (mmap persistent memory into
userspace).  This series adds a third, DAX-GUP, that allows DAX mappings
to be the target of direct-i/o.  It allows userspace to coordinate
DMA/RDMA from/to persistent memory.

The implementation leverages the ZONE_DEVICE mm-zone that went into
4.3-rc1 (also discussed at kernel summit) to flag pages that are owned
and dynamically mapped by a device driver.  The pmem driver, after
mapping a persistent memory range into the system memmap via
devm_memremap_pages(), arranges for DAX to distinguish pfn-only versus
page-backed pmem-pfns via flags in the new pfn_t type.

The DAX code, upon seeing a PFN_DEV+PFN_MAP flagged pfn, flags the
resulting pte(s) inserted into the process page tables with a new
_PAGE_DEVMAP flag.  Later, when get_user_pages() is walking ptes it keys
off _PAGE_DEVMAP to pin the device hosting the page range active.
Finally, get_page() and put_page() are modified to take references
against the device driver established page mapping.

Finally, this need for "struct page" for persistent memory requires
memory capacity to store the memmap array.  Given the memmap array for a
large pool of persistent may exhaust available DRAM introduce a
mechanism to allocate the memmap from persistent memory.  The new
"struct vmem_altmap *" parameter to devm_memremap_pages() enables
arch_add_memory() to use reserved pmem capacity rather than the page
allocator.

This patch (of 18):

The core has developed a need for a "pfn_t" type [1].  Move the existing
pfn_t in KVM to kvm_pfn_t [2].

[1]: https://lists.01.org/pipermail/linux-nvdimm/2015-September/002199.html
[2]: https://lists.01.org/pipermail/linux-nvdimm/2015-September/002218.html

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15 17:56:32 -08:00
Dan Williams b2e0d1625e dax: fix lifetime of in-kernel dax mappings with dax_map_atomic()
The DAX implementation needs to protect new calls to ->direct_access()
and usage of its return value against the driver for the underlying
block device being disabled.  Use blk_queue_enter()/blk_queue_exit() to
hold off blk_cleanup_queue() from proceeding, or otherwise fail new
mapping requests if the request_queue is being torn down.

This also introduces blk_dax_ctl to simplify the interface from fs/dax.c
through dax_map_atomic() to bdev_direct_access().

[willy@linux.intel.com: fix read() of a hole]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Cc: Jan Kara <jack@suse.com>
Cc: Jens Axboe <axboe@fb.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15 17:56:32 -08:00
Minchan Kim b8d3c4c300 mm/huge_memory.c: don't split THP page when MADV_FREE syscall is called
We don't need to split THP page when MADV_FREE syscall is called if
[start, len] is aligned with THP size.  The split could be done when VM
decide to free it in reclaim path if memory pressure is heavy.  With
that, we could avoid unnecessary THP split.

For the feature, this patch changes pte dirtness marking logic of THP.
Now, it marks every ptes of pages dirty unconditionally in splitting,
which makes MADV_FREE void.  So, instead, this patch propagates pmd
dirtiness to all pages via PG_dirty and restores pte dirtiness from
PG_dirty.  With this, if pmd is clean(ie, MADV_FREEed) when split
happens(e,g, shrink_page_list), all of pages are clean too so we could
discard them.

Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Hugh Dickins <hughd@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Shaohua Li <shli@kernel.org>
Cc: <yalin.wang2010@gmail.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chen Gang <gang.chen.5i5j@gmail.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Daniel Micay <danielmicay@gmail.com>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Helge Deller <deller@gmx.de>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Jason Evans <je@fb.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mika Penttil <mika.penttila@nextfour.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Rik van Riel <riel@redhat.com>
Cc: Roland Dreier <roland@kernel.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Shaohua Li <shli@kernel.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15 17:56:32 -08:00
Minchan Kim 10853a0392 mm: move lazily freed pages to inactive list
MADV_FREE is a hint that it's okay to discard pages if there is memory
pressure and we use reclaimers(ie, kswapd and direct reclaim) to free
them so there is no value keeping them in the active anonymous LRU so
this patch moves them to inactive LRU list's head.

This means that MADV_FREE-ed pages which were living on the inactive
list are reclaimed first because they are more likely to be cold rather
than recently active pages.

An arguable issue for the approach would be whether we should put the
page to the head or tail of the inactive list.  I chose head because the
kernel cannot make sure it's really cold or warm for every MADV_FREE
usecase but at least we know it's not *hot*, so landing of inactive head
would be a comprimise for various usecases.

This fixes suboptimal behavior of MADV_FREE when pages living on the
active list will sit there for a long time even under memory pressure
while the inactive list is reclaimed heavily.  This basically breaks the
whole purpose of using MADV_FREE to help the system to free memory which
is might not be used.

Signed-off-by: Minchan Kim <minchan@kernel.org>
Acked-by: Hugh Dickins <hughd@google.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Shaohua Li <shli@kernel.org>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: <yalin.wang2010@gmail.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chen Gang <gang.chen.5i5j@gmail.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Daniel Micay <danielmicay@gmail.com>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Helge Deller <deller@gmx.de>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Jason Evans <je@fb.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Mika Penttil <mika.penttila@nextfour.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Roland Dreier <roland@kernel.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15 17:56:32 -08:00
Minchan Kim 854e9ed09d mm: support madvise(MADV_FREE)
Linux doesn't have an ability to free pages lazy while other OS already
have been supported that named by madvise(MADV_FREE).

The gain is clear that kernel can discard freed pages rather than
swapping out or OOM if memory pressure happens.

Without memory pressure, freed pages would be reused by userspace
without another additional overhead(ex, page fault + allocation +
zeroing).

Jason Evans said:

: Facebook has been using MAP_UNINITIALIZED
: (https://lkml.org/lkml/2012/1/18/308) in some of its applications for
: several years, but there are operational costs to maintaining this
: out-of-tree in our kernel and in jemalloc, and we are anxious to retire it
: in favor of MADV_FREE.  When we first enabled MAP_UNINITIALIZED it
: increased throughput for much of our workload by ~5%, and although the
: benefit has decreased using newer hardware and kernels, there is still
: enough benefit that we cannot reasonably retire it without a replacement.
:
: Aside from Facebook operations, there are numerous broadly used
: applications that would benefit from MADV_FREE.  The ones that immediately
: come to mind are redis, varnish, and MariaDB.  I don't have much insight
: into Android internals and development process, but I would hope to see
: MADV_FREE support eventually end up there as well to benefit applications
: linked with the integrated jemalloc.
:
: jemalloc will use MADV_FREE once it becomes available in the Linux kernel.
: In fact, jemalloc already uses MADV_FREE or equivalent everywhere it's
: available: *BSD, OS X, Windows, and Solaris -- every platform except Linux
: (and AIX, but I'm not sure it even compiles on AIX).  The lack of
: MADV_FREE on Linux forced me down a long series of increasingly
: sophisticated heuristics for madvise() volume reduction, and even so this
: remains a common performance issue for people using jemalloc on Linux.
: Please integrate MADV_FREE; many people will benefit substantially.

How it works:

When madvise syscall is called, VM clears dirty bit of ptes of the
range.  If memory pressure happens, VM checks dirty bit of page table
and if it found still "clean", it means it's a "lazyfree pages" so VM
could discard the page instead of swapping out.  Once there was store
operation for the page before VM peek a page to reclaim, dirty bit is
set so VM can swap out the page instead of discarding.

One thing we should notice is that basically, MADV_FREE relies on dirty
bit in page table entry to decide whether VM allows to discard the page
or not.  IOW, if page table entry includes marked dirty bit, VM
shouldn't discard the page.

However, as a example, if swap-in by read fault happens, page table
entry doesn't have dirty bit so MADV_FREE could discard the page
wrongly.

For avoiding the problem, MADV_FREE did more checks with PageDirty and
PageSwapCache.  It worked out because swapped-in page lives on swap
cache and since it is evicted from the swap cache, the page has PG_dirty
flag.  So both page flags check effectively prevent wrong discarding by
MADV_FREE.

However, a problem in above logic is that swapped-in page has PG_dirty
still after they are removed from swap cache so VM cannot consider the
page as freeable any more even if madvise_free is called in future.

Look at below example for detail.

    ptr = malloc();
    memset(ptr);
    ..
    ..
    .. heavy memory pressure so all of pages are swapped out
    ..
    ..
    var = *ptr; -> a page swapped-in and could be removed from
                   swapcache. Then, page table doesn't mark
                   dirty bit and page descriptor includes PG_dirty
    ..
    ..
    madvise_free(ptr); -> It doesn't clear PG_dirty of the page.
    ..
    ..
    ..
    .. heavy memory pressure again.
    .. In this time, VM cannot discard the page because the page
    .. has *PG_dirty*

To solve the problem, this patch clears PG_dirty if only the page is
owned exclusively by current process when madvise is called because
PG_dirty represents ptes's dirtiness in several processes so we could
clear it only if we own it exclusively.

Firstly, heavy users would be general allocators(ex, jemalloc, tcmalloc
and hope glibc supports it) and jemalloc/tcmalloc already have supported
the feature for other OS(ex, FreeBSD)

  barrios@blaptop:~/benchmark/ebizzy$ lscpu
  Architecture:          x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Byte Order:            Little Endian
  CPU(s):                12
  On-line CPU(s) list:   0-11
  Thread(s) per core:    1
  Core(s) per socket:    1
  Socket(s):             12
  NUMA node(s):          1
  Vendor ID:             GenuineIntel
  CPU family:            6
  Model:                 2
  Stepping:              3
  CPU MHz:               3200.185
  BogoMIPS:              6400.53
  Virtualization:        VT-x
  Hypervisor vendor:     KVM
  Virtualization type:   full
  L1d cache:             32K
  L1i cache:             32K
  L2 cache:              4096K
  NUMA node0 CPU(s):     0-11
  ebizzy benchmark(./ebizzy -S 10 -n 512)

  Higher avg is better.

   vanilla-jemalloc             MADV_free-jemalloc

  1 thread
  records: 10                   records: 10
  avg:   2961.90                avg:  12069.70
  std:     71.96(2.43%)         std:    186.68(1.55%)
  max:   3070.00                max:  12385.00
  min:   2796.00                min:  11746.00

  2 thread
  records: 10                   records: 10
  avg:   5020.00                avg:  17827.00
  std:    264.87(5.28%)         std:    358.52(2.01%)
  max:   5244.00                max:  18760.00
  min:   4251.00                min:  17382.00

  4 thread
  records: 10                   records: 10
  avg:   8988.80                avg:  27930.80
  std:   1175.33(13.08%)        std:   3317.33(11.88%)
  max:   9508.00                max:  30879.00
  min:   5477.00                min:  21024.00

  8 thread
  records: 10                   records: 10
  avg:  13036.50                avg:  33739.40
  std:    170.67(1.31%)         std:   5146.22(15.25%)
  max:  13371.00                max:  40572.00
  min:  12785.00                min:  24088.00

  16 thread
  records: 10                   records: 10
  avg:  11092.40                avg:  31424.20
  std:    710.60(6.41%)         std:   3763.89(11.98%)
  max:  12446.00                max:  36635.00
  min:   9949.00                min:  25669.00

  32 thread
  records: 10                   records: 10
  avg:  11067.00                avg:  34495.80
  std:    971.06(8.77%)         std:   2721.36(7.89%)
  max:  12010.00                max:  38598.00
  min:   9002.00                min:  30636.00

In summary, MADV_FREE is about much faster than MADV_DONTNEED.

This patch (of 12):

Add core MADV_FREE implementation.

[akpm@linux-foundation.org: small cleanups]
Signed-off-by: Minchan Kim <minchan@kernel.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Mika Penttil <mika.penttila@nextfour.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Jason Evans <je@fb.com>
Cc: Daniel Micay <danielmicay@gmail.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Shaohua Li <shli@kernel.org>
Cc: <yalin.wang2010@gmail.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: "Shaohua Li" <shli@kernel.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chen Gang <gang.chen.5i5j@gmail.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Helge Deller <deller@gmx.de>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Roland Dreier <roland@kernel.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Shaohua Li <shli@kernel.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15 17:56:32 -08:00
Vladimir Davydov 8749cfea11 mm: add page_check_address_transhuge() helper
page_referenced_one() and page_idle_clear_pte_refs_one() duplicate the
code for looking up pte of a (possibly transhuge) page.  Move this code
to a new helper function, page_check_address_transhuge(), and make the
above mentioned functions use it.

This is just a cleanup, no functional changes are intended.

Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15 17:56:32 -08:00