For smaller block sizes (512B, 1K, 2K), some quotas straddle block
boundaries such that the usage value is on one block and the rest
of the quota is on the previous block. In such cases, the value
does not get updated correctly. This patch fixes that by addressing
the boundary conditions correctly.
This patch also adds a (s64) cast that was missing in a call to
gfs2_quota_change() in inode.c
Signed-off-by: Abhi Das <adas@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
bi was already declared and initialized globally in gfs2_rbm_find()
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Fixes the following kernel-doc warnings:
Warning(fs/gfs2/aops.c:180): No description found for parameter 'wbc'
Warning(fs/gfs2/aops.c:236): No description found for parameter 'end'
Warning(fs/gfs2/aops.c:236): No description found for parameter 'done_index'
Warning(fs/gfs2/aops.c:236): Excess function parameter 'writepage' description in 'gfs2_write_jdata_pagevec'
Warning(fs/gfs2/aops.c:346): Excess function parameter 'writepage' description in 'gfs2_write_cache_jdata'
Warning(fs/gfs2/aops.c:346): Excess function parameter 'data' description in 'gfs2_write_cache_jdata'
Warning(fs/gfs2/aops.c:605): No description found for parameter 'file'
Warning(fs/gfs2/aops.c:605): No description found for parameter 'mapping'
Warning(fs/gfs2/aops.c:605): No description found for parameter 'pages'
Warning(fs/gfs2/aops.c:605): No description found for parameter 'nr_pages'
Warning(fs/gfs2/aops.c:870): No description found for parameter 'copied'
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
-Remove obsolete simple_str functions.
-Return error code when kstr failed.
-This patch also calls functions corresponding to destination type.
Thanks to Alexey Dobriyan for suggesting improvements in
block_store() and wdack_store()
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
At the end of gfs2_set_inode_flags inode->i_flags is set to flags, so
we should be modifying flags instead of inode->i_flags, so it isn't
overwritten.
Signed-off-by: Benjamin Marzinski <bmarzins redhat com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
gfs2 now uses the rename2 directory iop, and supports the
RENAME_EXCHANGE flag (as well as RENAME_NOREPLACE, which the vfs
takes care of).
Signed-off-by: Benjamin Marzinski <bmarzins redhat com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
The function set_rgrp_preferences() does not handle the (rarely
returned) NULL value from gfs2_rgrpd_get_next() and this patch
fixes that.
The fs image in question is only 150MB in size which allows for
only 1 rgrp to be created. The in-memory rb tree has only 1 node
and when gfs2_rgrpd_get_next() is called on this sole rgrp, it
returns NULL. (Default behavior is to wrap around the rb tree and
return the first node to give the illusion of a circular linked
list. In the case of only 1 rgrp, we can't have
gfs2_rgrpd_get_next() return the same rgrp (first, last, next all
point to the same rgrp)... that would cause unintended consequences
and infinite loops.)
Signed-off-by: Abhi Das <adas@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Follow the same style used for the other functions in the same file.
Signed-off-by: Antonio Ospite <ao2@ao2.it>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
When gfs2 was mounted read-only and then unmounted, it was writing a
header block to the journal in the syncing gfs2_log_flush() call from
kill_sb(). This is because the journal was not being marked as idle
until the first log header was written out, and on a read-only mount
there never was a log header written out. Since the journal was not
marked idle, gfs2_log_flush() was writing out a header lock to make
sure it was empty during the sync. Not only did this cause IO to a
read-only filesystem, but the journalling isn't completely initialized
on read-only mounts, and so gfs2 was writing out the wrong sequence
number in the log header.
Now, the journal is marked idle on mount, and gfs2_log_flush() won't
write out anything until there starts being transactions to flush.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Acked-by: Steven Whitehouse <swhiteho@redhat.com>
This patch changes function gfs2_rgrp_congested so that it only factors
in non-zero values into its average round trip time. If the round-trip
time is zero for a particular cpu, that cpu has obviously never dealt
with bouncing the resource group in question, so factoring in a zero
value will only skew the numbers. It also fixes a compile error on
some arches related to division.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Acked-by: Steven Whitehouse <swhiteho@redhat.com>
This patch changes function gfs2_rgrp_congested so that it uses an
average srttb (smoothed round trip time for blocking rgrp glocks)
rather than the CPU-specific value. If we use the CPU-specific value
it can incorrectly report no contention when there really is contention
due to the glock processing occurring on a different CPU.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Acked-by: Steven Whitehouse <swhiteho@redhat.com>
Pull ARM updates from Russell King:
"Included in this update are both some long term fixes and some new
features.
Fixes:
- An integer overflow in the calculation of ELF_ET_DYN_BASE.
- Avoiding OOMs for high-order IOMMU allocations
- SMP requires the data cache to be enabled for synchronisation
primitives to work, so prevent the CPU_DCACHE_DISABLE option being
visible on SMP builds.
- A bug going back 10+ years in the noMMU ARM94* CPU support code,
where it corrupts registers. Found by folk getting Linux running
on their cameras.
- Versatile Express needs an errata workaround enabled for CPU
hot-unplug to work.
Features:
- Clean up module linker by handling out of range relocations
separately from relocation cases we don't handle.
- Fix a long term bug in the pci_mmap_page_range() code, which we
hope won't impact userspace (we hope there's no users of the
existing broken interface.)
- Don't map DMA coherent allocations when we don't have a MMU.
- Drop experimental status for SMP_ON_UP.
- Warn when DT doesn't specify ePAPR mandatory cache properties.
- Add documentation concerning how we find the start of physical
memory for AUTO_ZRELADDR kernels, detailing why we have chosen the
mask and the implications of changing it.
- Updates from Ard Biesheuvel to address some issues with large
kernels (such as allyesconfig) failing to link.
- Allow hibernation to work on modern (ARMv7) CPUs - this appears to
have never worked in the past on these CPUs.
- Enable IRQ_SHOW_LEVEL, which changes the /proc/interrupts output
format (hopefully without userspace breaking... let's hope that if
it causes someone a problem, they tell us.)
- Fix tegra-ahb DT offsets.
- Rework ARM errata 643719 code (and ARMv7 flush_cache_louis()/
flush_dcache_all()) code to be more efficient, and enable this
errata workaround by default for ARMv7+SMP CPUs. This complements
the Versatile Express fix above.
- Rework ARMv7 context code for errata 430973, so that only Cortex A8
CPUs are impacted by the branch target buffer flush when this
errata is enabled. Also update the help text to indicate that all
r1p* A8 CPUs are impacted.
- Switch ARM to the generic show_mem() implementation, it conveys all
the information which we were already reporting.
- Prevent slow timer sources being used for udelay() - timers running
at less than 1MHz are not useful for this, and can cause udelay()
to return immediately, without any wait. Using such a slow timer
is silly.
- VDSO support for 32-bit ARM, mainly for gettimeofday() using the
ARM architected timer.
- Perf support for Scorpion performance monitoring units"
vdso semantic conflict fixed up as per linux-next.
* 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: (52 commits)
ARM: update errata 430973 documentation to cover Cortex A8 r1p*
ARM: ensure delay timer has sufficient accuracy for delays
ARM: switch to use the generic show_mem() implementation
ARM: proc-v7: avoid errata 430973 workaround for non-Cortex A8 CPUs
ARM: enable ARM errata 643719 workaround by default
ARM: cache-v7: optimise test for Cortex A9 r0pX devices
ARM: cache-v7: optimise branches in v7_flush_cache_louis
ARM: cache-v7: consolidate initialisation of cache level index
ARM: cache-v7: shift CLIDR to extract appropriate field before masking
ARM: cache-v7: use movw/movt instructions
ARM: allow 16-bit instructions in ALT_UP()
ARM: proc-arm94*.S: fix setup function
ARM: vexpress: fix CPU hotplug with CT9x4 tile.
ARM: 8276/1: Make CPU_DCACHE_DISABLE depend on !SMP
ARM: 8335/1: Documentation: DT bindings: Tegra AHB: document the legacy base address
ARM: 8334/1: amba: tegra-ahb: detect and correct bogus base address
ARM: 8333/1: amba: tegra-ahb: fix register offsets in the macros
ARM: 8339/1: Enable CONFIG_GENERIC_IRQ_SHOW_LEVEL
ARM: 8338/1: kexec: Relax SMP validation to improve DT compatibility
ARM: 8337/1: mm: Do not invoke OOM for higher order IOMMU DMA allocations
...
Pull s390 updates from Martin Schwidefsky:
"The major change in this merge is the removal of the support for
31-bit kernels. Naturally 31-bit user space will continue to work via
the compat layer.
And then some cleanup, some improvements and bug fixes"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (23 commits)
s390/smp: wait until secondaries are active & online
s390/hibernate: fix save and restore of kernel text section
s390/cacheinfo: add missing facility check
s390/syscalls: simplify syscall_get_arch()
s390/irq: enforce correct irqclass_sub_desc array size
s390: remove "64" suffix from mem64.S and swsusp_asm64.S
s390/ipl: cleanup macro usage
s390/ipl: cleanup shutdown_action attributes
s390/ipl: cleanup bin attr usage
s390/uprobes: fix address space annotation
s390: add missing arch_release_task_struct() declaration
s390: make couple of functions and variables static
s390/maccess: improve s390_kernel_write()
s390/maccess: remove potentially broken probe_kernel_write()
s390/watchdog: support for KVM hypervisors and delete pr_info messages
s390/watchdog: enable KEEPALIVE for /dev/watchdog
s390/dasd: remove setting of scheduler from driver
s390/traps: panic() instead of die() on translation exception
s390: remove test_facility(2) (== z/Architecture mode active) checks
s390/cmpxchg: simplify cmpxchg_double
...
- Generic PM domains support update including new PM domain
callbacks to handle device initialization better (Russell King,
Rafael J Wysocki, Kevin Hilman).
- Unified device properties API update including a new mechanism
for accessing data provided by platform initialization code
(Rafael J Wysocki, Adrian Hunter).
- ARM cpuidle update including ARM32/ARM64 handling consolidation
(Daniel Lezcano).
- intel_idle update including support for the Silvermont Core in
the Baytrail SOC and for the Airmont Core in the Cherrytrail and
Braswell SOCs (Len Brown, Mathias Krause).
- New cpufreq driver for Hisilicon ACPU (Leo Yan).
- intel_pstate update including support for the Knights Landing
chip (Dasaratharaman Chandramouli, Kristen Carlson Accardi).
- QorIQ cpufreq driver update (Tang Yuantian, Arnd Bergmann).
- powernv cpufreq driver update (Shilpasri G Bhat).
- devfreq update including Tegra support changes (Tomeu Vizoso,
MyungJoo Ham, Chanwoo Choi).
- powercap RAPL (Running-Average Power Limit) driver update
including support for Intel Broadwell server chips (Jacob Pan,
Mathias Krause).
- ACPI device enumeration update related to the handling of the
special PRP0001 device ID allowing DT-style 'compatible' property
to be used for ACPI device identification (Rafael J Wysocki).
- ACPI EC driver update including limited _DEP support (Lan Tianyu,
Lv Zheng).
- ACPI backlight driver update including a new mechanism to allow
native backlight handling to be forced on non-Windows 8 systems
and a new quirk for Lenovo Ideapad Z570 (Aaron Lu, Hans de Goede).
- New Windows Vista compatibility quirk for Sony VGN-SR19XN (Chen Yu).
- Assorted ACPI fixes and cleanups (Aaron Lu, Martin Kepplinger,
Masanari Iida, Mika Westerberg, Nan Li, Rafael J Wysocki).
- Fixes related to suspend-to-idle for the iTCO watchdog driver and
the ACPI core system suspend/resume code (Rafael J Wysocki, Chen Yu).
- PM tracing support for the suspend phase of system suspend/resume
transitions (Zhonghui Fu).
- Configurable delay for the system suspend/resume testing facility
(Brian Norris).
- PNP subsystem cleanups (Peter Huewe, Rafael J Wysocki).
/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJVLbO+AAoJEILEb/54YlRx5N4QAJXsmEW1FL2l6mMAyTQkEsVj
nbqjF9I6aJgYM9+i8GKaZJxpN17SAZ7Ii7aCAXjPwX8AvjT70+gcZr+KDWtPir61
B75VNVEcUYOR4vOF5Z6rQcQMlhGPkfMOJYXFMahpOG6DdPbVh1x2/tuawfc6IC0V
a6S/fln6WqHrXQ+8swDSv1KuZsav6+8AQaTlNUQkkuXdY9b3k/3xiy5C2K26APP8
x1B39iAF810qX6ipnK0gEOC3Vs29dl7hvNmgOVmmkBGVS7+pqTuy5n1/9M12cDRz
78IQ7DXB0NcSwr5tdrmGVUyH0Q6H9lnD3vO7MJkYwKDh5a/2MiBr2GZc4KHDKDWn
E1sS27f1Pdn9qnpWLzTcY+yYNV3EEyre56L2fc+sh+Xq9sNOjUah+Y/eAej/IxYD
XYRf+GAj768yCJgNP+Y3PJES/PRh+0IZ/dn5k0Qq2iYvc8mcObyG6zdQIvCucv/i
70uV1Z2GWEb31cI9TUV8o5GrMW3D0KI9EsCEEpiFFUnhjNog3AWcerGgFQMHxu7X
ZnNSzudvek+XJ3NtpbPgTiJAmnMz8bDvBQm3G1LUO2TQdjYTU6YMUHsfzXs8DL6c
aIMWO4stkVuDtWrlT/hfzIXepliccyXmSP6sbH+zNNCepulXe5C4M2SftaDi4l/B
uIctXWznvHoGys+EFL+v
=erd3
-----END PGP SIGNATURE-----
Merge tag 'pm+acpi-4.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management and ACPI updates from Rafael Wysocki:
"These are mostly fixes and cleanups all over, although there are a few
items that sort of fall into the new feature category.
First off, we have new callbacks for PM domains that should help us to
handle some issues related to device initialization in a better way.
There also is some consolidation in the unified device properties API
area allowing us to use that inferface for accessing data coming from
platform initialization code in addition to firmware-provided data.
We have some new device/CPU IDs in a few drivers, support for new
chips and a new cpufreq driver too.
Specifics:
- Generic PM domains support update including new PM domain callbacks
to handle device initialization better (Russell King, Rafael J
Wysocki, Kevin Hilman)
- Unified device properties API update including a new mechanism for
accessing data provided by platform initialization code (Rafael J
Wysocki, Adrian Hunter)
- ARM cpuidle update including ARM32/ARM64 handling consolidation
(Daniel Lezcano)
- intel_idle update including support for the Silvermont Core in the
Baytrail SOC and for the Airmont Core in the Cherrytrail and
Braswell SOCs (Len Brown, Mathias Krause)
- New cpufreq driver for Hisilicon ACPU (Leo Yan)
- intel_pstate update including support for the Knights Landing chip
(Dasaratharaman Chandramouli, Kristen Carlson Accardi)
- QorIQ cpufreq driver update (Tang Yuantian, Arnd Bergmann)
- powernv cpufreq driver update (Shilpasri G Bhat)
- devfreq update including Tegra support changes (Tomeu Vizoso,
MyungJoo Ham, Chanwoo Choi)
- powercap RAPL (Running-Average Power Limit) driver update including
support for Intel Broadwell server chips (Jacob Pan, Mathias Krause)
- ACPI device enumeration update related to the handling of the
special PRP0001 device ID allowing DT-style 'compatible' property
to be used for ACPI device identification (Rafael J Wysocki)
- ACPI EC driver update including limited _DEP support (Lan Tianyu,
Lv Zheng)
- ACPI backlight driver update including a new mechanism to allow
native backlight handling to be forced on non-Windows 8 systems and
a new quirk for Lenovo Ideapad Z570 (Aaron Lu, Hans de Goede)
- New Windows Vista compatibility quirk for Sony VGN-SR19XN (Chen Yu)
- Assorted ACPI fixes and cleanups (Aaron Lu, Martin Kepplinger,
Masanari Iida, Mika Westerberg, Nan Li, Rafael J Wysocki)
- Fixes related to suspend-to-idle for the iTCO watchdog driver and
the ACPI core system suspend/resume code (Rafael J Wysocki, Chen Yu)
- PM tracing support for the suspend phase of system suspend/resume
transitions (Zhonghui Fu)
- Configurable delay for the system suspend/resume testing facility
(Brian Norris)
- PNP subsystem cleanups (Peter Huewe, Rafael J Wysocki)"
* tag 'pm+acpi-4.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (74 commits)
ACPI / scan: Fix NULL pointer dereference in acpi_companion_match()
ACPI / scan: Rework modalias creation when "compatible" is present
intel_idle: mark cpu id array as __initconst
powercap / RAPL: mark rapl_ids array as __initconst
powercap / RAPL: add ID for Broadwell server
intel_pstate: Knights Landing support
intel_pstate: remove MSR test
cpufreq: fix qoriq uniprocessor build
ACPI / scan: Take the PRP0001 position in the list of IDs into account
ACPI / scan: Simplify acpi_match_device()
ACPI / scan: Generalize of_compatible matching
device property: Introduce firmware node type for platform data
device property: Make it possible to use secondary firmware nodes
PM / watchdog: iTCO: stop watchdog during system suspend
cpufreq: hisilicon: add acpu driver
ACPI / EC: Call acpi_walk_dep_device_list() after installing EC opregion handler
cpufreq: powernv: Report cpu frequency throttling
intel_idle: Add support for the Airmont Core in the Cherrytrail and Braswell SOCs
intel_idle: Update support for Silvermont Core in Baytrail SOC
PM / devfreq: tegra: Register governor on module init
...
Pull input subsystem updates from Dmitry Torokhov:
"You will get the following new drivers:
- Qualcomm PM8941 power key drver
- ChipOne icn8318 touchscreen controller driver
- Broadcom iProc touchscreen and keypad drivers
- Semtech SX8654 I2C touchscreen controller driver
ALPS driver now supports newer SS4 devices; Elantech got a fix that
should make it work on some ASUS laptops; and a slew of other
enhancements and random fixes"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (51 commits)
Input: alps - non interleaved V2 dualpoint has separate stick button bits
Input: alps - fix touchpad buttons getting stuck when used with trackpoint
Input: atkbd - document "no new force-release quirks" policy
Input: ALPS - make alps_get_pkt_id_ss4_v2() and others static
Input: ALPS - V7 devices can report 5-finger taps
Input: ALPS - add support for SS4 touchpad devices
Input: ALPS - refactor alps_set_abs_params_mt()
Input: elantech - fix absolute mode setting on some ASUS laptops
Input: atmel_mxt_ts - split out touchpad initialisation logic
Input: atmel_mxt_ts - implement support for T100 touch object
Input: cros_ec_keyb - fix clearing keyboard state on wakeup
Input: gscps2 - drop pci_ids dependency
Input: synaptics - allocate 3 slots to keep stability in image sensors
Input: Revert "Revert "synaptics - use dmax in input_mt_assign_slots""
Input: MT - make slot assignment work for overcovered solutions
mfd: tc3589x: enforce device-tree only mode
Input: tc3589x - localize platform data
Input: tsc2007 - Convert msecs to jiffies only once
Input: edt-ft5x06 - remove EV_SYN event report
Input: edt-ft5x06 - allow to setting the maximum axes value through the DT
...
Pull i2c updates from Wolfram Sang:
"Most notable:
- introducing the i2c_quirk infrastructure. Now, flaws of I2C
controllers can be described and the core will check if the flaws
collide with the messages to be sent
- wait_for_completion return type cleanup series
- new drivers for Digicolor, Netlogic XLP, Ingenic JZ4780
- updates to the I2C slave framework which include API changes. Its
only user was updated, too. Documentation was finally added
- changed dynamic bus numbering for the DT case. This could change
bus numbers for users. However, it fixes a collision where dynamic
and static busses request the same id.
- driver bugfixes, cleanups"
* 'i2c/for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (52 commits)
i2c: xlp9xx: Driver for Netlogic XLP9XX/5XX I2C controller
of: Add vendor prefix 'netlogic'
i2c: davinci: use ICPFUNC to toggle I2C as gpio for bus recovery
i2c: davinci: use bus recovery infrastructure
i2c: change input parameter to i2c_adapter for prepare/unprepare_recovery
i2c: i2c-mux-gpio: remove error messages for probe deferrals
i2c: jz4780: Add i2c bus controller driver for Ingenic JZ4780
i2c: dln2: set the device tree node of the adapter
i2c: davinci: fixup wait_for_completion_timeout handling
i2c: mpc: Fix ISR return value
i2c: slave-eeprom: add more info when to increase the pointer
i2c: slave: add documentation for i2c-slave-eeprom
Documentation: i2c: describe the new slave mode
i2c: slave: rework the slave API
i2c: add support for the Digicolor I2C controller
i2c: busses with dynamic ids should start after fixed ids for DT
of: base: add function to get highest id of an alias stem
i2c: designware: Suppress error message if platform_get_irq() < 0
i2c: mpc: assign the correct prescaler from SVR
i2c: img-scb: fixup of wait_for_completion_timeout return handling
...
- VFIO platform bus driver support (Baptiste Reynal, Antonios Motakis, testing and review by Eric Auger)
- Split VFIO irqfd support to separate module (Alex Williamson)
- vfio-pci VGA arbiter client (Alex Williamson)
- New vfio-pci.ids= module option (Alex Williamson)
- vfio-pci D3 power state support for idle devices (Alex Williamson)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJVLXApAAoJECObm247sIsiw4cP/AzBLqBXlxeCLgWNRJ4F6Dwz
PrjSK4xBRnPV/eMbIksWW6Wc6D4EaSUASMUUb1JLnF8roKWbRg2a5wdI4clrSYL0
mV8j4Uaw+9XkV0nWrVeiW0Yw8tu4pIQRvC6FiEJ6/pE9rHsP8wr8uzKzC7d4AvyJ
tb/awTF4PN9rZVIs68sIJi0JJjFXHuUTCHx6ns/723F1FMRS8ONAxEvIyFJOqaYh
DNDZs3UalstXYFHlV/0Zs/TqMXQJoSSOPkWx41P+kMsn00tKge88WAmybyQRjjoQ
paFjz7ox3PETUolojCFS72kMPApBCRT2q2HXXrPV74/GY2Or2UV5F1qo4gUsclyh
HqdE4OF9LcexCAOopc907Tped2SrHoiHpZu2aJWKJz+qEsejxgsAgnf1pxJikRQX
Eu7LJxJcYlddREN60ONgCUvsRq9ayNopuIDqD47Zhic6e5y8ujPjjz1e4yin+oxI
5WxMzcEdeVKC72vg2abUTHBri+l3GEWcdHk6YeK4fe95g11+gSBga8XmufgmOcGJ
VUl/umBQWzfxk0wHJtBLVIgleifs4sq+b5jxuXOboko8Z80q12zLlpcBXs8IX5Wa
wgzFvPPg8Ecsw6goCEW1xkeHfMEWcmWhI8QEWCtoZfSYLKK5I/hpuCUZBSqWgPVE
Wm2rcOkNBmqu2HpHBPPu
=8g7y
-----END PGP SIGNATURE-----
Merge tag 'vfio-v4.1-rc1' of git://github.com/awilliam/linux-vfio
Pull VFIO updates from Alex Williamson:
- VFIO platform bus driver support (Baptiste Reynal, Antonios Motakis,
testing and review by Eric Auger)
- Split VFIO irqfd support to separate module (Alex Williamson)
- vfio-pci VGA arbiter client (Alex Williamson)
- New vfio-pci.ids= module option (Alex Williamson)
- vfio-pci D3 power state support for idle devices (Alex Williamson)
* tag 'vfio-v4.1-rc1' of git://github.com/awilliam/linux-vfio: (30 commits)
vfio-pci: Fix use after free
vfio-pci: Move idle devices to D3hot power state
vfio-pci: Remove warning if try-reset fails
vfio-pci: Allow PCI IDs to be specified as module options
vfio-pci: Add VGA arbiter client
vfio-pci: Add module option to disable VGA region access
vgaarb: Stub vga_set_legacy_decoding()
vfio: Split virqfd into a separate module for vfio bus drivers
vfio: virqfd_lock can be static
vfio: put off the allocation of "minor" in vfio_create_group
vfio/platform: implement IRQ masking/unmasking via an eventfd
vfio: initialize the virqfd workqueue in VFIO generic code
vfio: move eventfd support code for VFIO_PCI to a separate file
vfio: pass an opaque pointer on virqfd initialization
vfio: add local lock for virqfd instead of depending on VFIO PCI
vfio: virqfd: rename vfio_pci_virqfd_init and vfio_pci_virqfd_exit
vfio: add a vfio_ prefix to virqfd_enable and virqfd_disable and export
vfio/platform: support for level sensitive interrupts
vfio/platform: trigger an interrupt via eventfd
vfio/platform: initial interrupts support code
...
cycle:
New drivers:
- Intel Sunrisepoint
- AMD KERNCZ GPIO
- Broadcom Cygnus IOMUX
New subdrivers:
- Marvell MVEBU Armada 39x SoCs
- Samsung Exynos 5433
- nVidia Tegra 210
- Mediatek MT8135
- Mediatek MT8173
- AMLogic Meson8b
- Qualcomm PM8916
On top of this cleanups and development history for the above
drivers as issues were fixed after merging.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJVLSiSAAoJEEEQszewGV1z/nMP/jXyxCb8THuAyCFkJoY+Hgzz
dj3eMt+TPPyqt6lG3WODq/lQFxEdE28TqHYf2eGVdpriCpuyecFxyf9MPzY3P60E
v6XfzIeUOIGovw781qsg9TxJAZ0C+DLmNgpS8kWhnK7Igs3EJrRcz5Zz00F32olv
1dojVIQF6Nsn3M2Cc6bzF2wkJre3tLsUT7KXGZw4e3yA0K1XMZI3jYYXZ25GgLW0
CpZ1vKdKgwuBsgV5waJ8XZuFMqo3FRhjcD/f8ubk5hhMK6Wx6gszBcrLKVlkbz+2
XJzhn5tZSupSKmBtl0gCzP7pgVc0JeV8P12hHv6imT82rCU0YGBidNX2s3GrscnF
Nfwpw/BsaOXcu9CttI5LndvDlvCH2hUAal6i4IghiL5sRzlcW4jUoWHkIa8e6dHe
e/zjJSo7tXrWio30Dl6++qklcDimP3sbaaFseENhLUSl7hDGJ09Tx8yxEOFN3PX6
29i7ZC+ifZFzS30E3E+MOlFrxp3MB7j/z/ig3HL7XYr/TTiCKMNbKJNOlmGEfiTV
VI3GvTAJgt/1U+AOJI7a5xrxivaGL5GWXuoonQHY1gPrvjqkL54w4j+XBpj4tXIV
LBh6AS6G6AJDUOEU00EroCxOt1FPMW24zQvTKP18sgXsKWqHTIKT9B5RZbDXM9D7
pLfOIy0H1RWdQGvGgK7X
=MGqe
-----END PGP SIGNATURE-----
Merge tag 'pinctrl-v4.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pincontrol updates from Linus Walleij:
"This is the bulk of pin control changes for the v4.1 development
cycle. Nothing really exciting this time: we basically added a few
new drivers and subdrivers and stabilized them in linux-next. Some
cleanups too. With sunrisepoint Intel has a real fine fully featured
pin control driver for contemporary hardware, and the AMD driver is
also for large deployments. Most of the others are ARM devices.
New drivers:
- Intel Sunrisepoint
- AMD KERNCZ GPIO
- Broadcom Cygnus IOMUX
New subdrivers:
- Marvell MVEBU Armada 39x SoCs
- Samsung Exynos 5433
- nVidia Tegra 210
- Mediatek MT8135
- Mediatek MT8173
- AMLogic Meson8b
- Qualcomm PM8916
On top of this cleanups and development history for the above drivers
as issues were fixed after merging"
* tag 'pinctrl-v4.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (71 commits)
pinctrl: sirf: move sgpio lock into state container
pinctrl: Add support for PM8916 GPIO's and MPP's
pinctrl: bcm2835: Fix support for threaded level triggered IRQs
sh-pfc: r8a7790: add EtherAVB pin groups
pinctrl: Document "function" + "pins" pinmux binding
pinctrl: intel: Add Intel Sunrisepoint pin controller and GPIO support
pinctrl: fsl: imx: Check for 0 config register
pinctrl: Add support for Meson8b
documentation: Extend pinctrl docs for Meson8b
pinctrl: Cleanup Meson8 driver
Fix inconsistent spinlock of AMD GPIO driver which can be recognized by static analysis tool smatch. Declare constant Variables with Sparse's suggestion.
pinctrl: at91: convert __raw to endian agnostic IO
pinctrl: constify of_device_id array
pinctrl: pinconf-generic: add dt node names to error messages
pinctrl: pinconf-generic: scan also referenced phandle node
pinctrl: mvebu: add suspend/resume support to Armada XP pinctrl driver
pinctrl: st: Display pin's function when printing pinctrl debug information
pinctrl: st: Show correct pin direction also in GPIO mode
pinctrl: st: Supply a GPIO get_direction() call-back
pinctrl: st: Move st_get_pio_control() further up the source file
...
Merge first patchbomb from Andrew Morton:
- arch/sh updates
- ocfs2 updates
- kernel/watchdog feature
- about half of mm/
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (122 commits)
Documentation: update arch list in the 'memtest' entry
Kconfig: memtest: update number of test patterns up to 17
arm: add support for memtest
arm64: add support for memtest
memtest: use phys_addr_t for physical addresses
mm: move memtest under mm
mm, hugetlb: abort __get_user_pages if current has been oom killed
mm, mempool: do not allow atomic resizing
memcg: print cgroup information when system panics due to panic_on_oom
mm: numa: remove migrate_ratelimited
mm: fold arch_randomize_brk into ARCH_HAS_ELF_RANDOMIZE
mm: split ET_DYN ASLR from mmap ASLR
s390: redefine randomize_et_dyn for ELF_ET_DYN_BASE
mm: expose arch_mmap_rnd when available
s390: standardize mmap_rnd() usage
powerpc: standardize mmap_rnd() usage
mips: extract logic for mmap_rnd()
arm64: standardize mmap_rnd() usage
x86: standardize mmap_rnd() usage
arm: factor out mmap ASLR into mmap_rnd
...
Since arm64/arm support memtest command line option update the "memtest"
entry.
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Additional test patterns for memtest were introduced since commit
63823126c2 ("x86: memtest: add additional (regular) test patterns"),
but looks like Kconfig was not updated that time.
Update Kconfig entry with the actual number of maximum test patterns.
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add support for memtest command line option.
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add support for memtest command line option.
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Since memtest might be used by other architectures pass input parameters
as phys_addr_t instead of long to prevent overflow.
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Memtest is a simple feature which fills the memory with a given set of
patterns and validates memory contents, if bad memory regions is detected
it reserves them via memblock API. Since memblock API is widely used by
other architectures this feature can be enabled outside of x86 world.
This patch set promotes memtest to live under generic mm umbrella and
enables memtest feature for arm/arm64.
It was reported that this patch set was useful for tracking down an issue
with some errant DMA on an arm64 platform.
This patch (of 6):
There is nothing platform dependent in the core memtest code, so other
platforms might benefit from this feature too.
[linux@roeck-us.net: MEMTEST depends on MEMBLOCK]
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If __get_user_pages() is faulting a significant number of hugetlb pages,
usually as the result of mmap(MAP_LOCKED), it can potentially allocate a
very large amount of memory.
If the process has been oom killed, this will cause a lot of memory to
potentially deplete memory reserves.
In the same way that commit 4779280d1e ("mm: make get_user_pages()
interruptible") aborted for pending SIGKILLs when faulting non-hugetlb
memory, based on the premise of commit 462e00cc71 ("oom: stop
allocating user memory if TIF_MEMDIE is set"), hugetlb page faults now
terminate when the process has been oom killed.
Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Greg Thelen <gthelen@google.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Acked-by: "Kirill A. Shutemov" <kirill@shutemov.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Allocating a large number of elements in atomic context could quickly
deplete memory reserves, so just disallow atomic resizing entirely.
Nothing currently uses mempool_resize() with anything other than
GFP_KERNEL, so convert existing callers to drop the gfp_mask.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: Steffen Maier <maier@linux.vnet.ibm.com> [zfcp]
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Steve French <sfrench@samba.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If kernel panics due to oom, caused by a cgroup reaching its limit, when
'compulsory panic_on_oom' is enabled, then we will only see that the OOM
happened because of "compulsory panic_on_oom is enabled" but this doesn't
tell the difference between mempolicy and memcg. And dumping system wide
information is plain wrong and more confusing. This patch provides the
information of the cgroup whose limit triggerred panic
Signed-off-by: Balasubramani Vivekanandan <balasubramani_vivekanandan@mentor.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This code is dead since commit 9e645ab6d0 ("sched/numa: Continue PTE
scanning even if migrate rate limited") so remove it.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The arch_randomize_brk() function is used on several architectures,
even those that don't support ET_DYN ASLR. To avoid bulky extern/#define
tricks, consolidate the support under CONFIG_ARCH_HAS_ELF_RANDOMIZE for
the architectures that support it, while still handling CONFIG_COMPAT_BRK.
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Hector Marco-Gisbert <hecmargi@upv.es>
Cc: Russell King <linux@arm.linux.org.uk>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: "David A. Long" <dave.long@linaro.org>
Cc: Andrey Ryabinin <a.ryabinin@samsung.com>
Cc: Arun Chandran <achandran@mvista.com>
Cc: Yann Droneaud <ydroneaud@opteya.com>
Cc: Min-Hua Chen <orca.chen@gmail.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Alex Smith <alex@alex-smith.me.uk>
Cc: Markos Chandras <markos.chandras@imgtec.com>
Cc: Vineeth Vijayan <vvijayan@mvista.com>
Cc: Jeff Bailey <jeffbailey@google.com>
Cc: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Behan Webster <behanw@converseincode.com>
Cc: Ismael Ripoll <iripoll@upv.es>
Cc: Jan-Simon Mller <dl9pf@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This fixes the "offset2lib" weakness in ASLR for arm, arm64, mips,
powerpc, and x86. The problem is that if there is a leak of ASLR from
the executable (ET_DYN), it means a leak of shared library offset as
well (mmap), and vice versa. Further details and a PoC of this attack
is available here:
http://cybersecurity.upv.es/attacks/offset2lib/offset2lib.html
With this patch, a PIE linked executable (ET_DYN) has its own ASLR
region:
$ ./show_mmaps_pie
54859ccd6000-54859ccd7000 r-xp ... /tmp/show_mmaps_pie
54859ced6000-54859ced7000 r--p ... /tmp/show_mmaps_pie
54859ced7000-54859ced8000 rw-p ... /tmp/show_mmaps_pie
7f75be764000-7f75be91f000 r-xp ... /lib/x86_64-linux-gnu/libc.so.6
7f75be91f000-7f75beb1f000 ---p ... /lib/x86_64-linux-gnu/libc.so.6
7f75beb1f000-7f75beb23000 r--p ... /lib/x86_64-linux-gnu/libc.so.6
7f75beb23000-7f75beb25000 rw-p ... /lib/x86_64-linux-gnu/libc.so.6
7f75beb25000-7f75beb2a000 rw-p ...
7f75beb2a000-7f75beb4d000 r-xp ... /lib64/ld-linux-x86-64.so.2
7f75bed45000-7f75bed46000 rw-p ...
7f75bed46000-7f75bed47000 r-xp ...
7f75bed47000-7f75bed4c000 rw-p ...
7f75bed4c000-7f75bed4d000 r--p ... /lib64/ld-linux-x86-64.so.2
7f75bed4d000-7f75bed4e000 rw-p ... /lib64/ld-linux-x86-64.so.2
7f75bed4e000-7f75bed4f000 rw-p ...
7fffb3741000-7fffb3762000 rw-p ... [stack]
7fffb377b000-7fffb377d000 r--p ... [vvar]
7fffb377d000-7fffb377f000 r-xp ... [vdso]
The change is to add a call the newly created arch_mmap_rnd() into the
ELF loader for handling ET_DYN ASLR in a separate region from mmap ASLR,
as was already done on s390. Removes CONFIG_BINFMT_ELF_RANDOMIZE_PIE,
which is no longer needed.
Signed-off-by: Kees Cook <keescook@chromium.org>
Reported-by: Hector Marco-Gisbert <hecmargi@upv.es>
Cc: Russell King <linux@arm.linux.org.uk>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: "David A. Long" <dave.long@linaro.org>
Cc: Andrey Ryabinin <a.ryabinin@samsung.com>
Cc: Arun Chandran <achandran@mvista.com>
Cc: Yann Droneaud <ydroneaud@opteya.com>
Cc: Min-Hua Chen <orca.chen@gmail.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Alex Smith <alex@alex-smith.me.uk>
Cc: Markos Chandras <markos.chandras@imgtec.com>
Cc: Vineeth Vijayan <vvijayan@mvista.com>
Cc: Jeff Bailey <jeffbailey@google.com>
Cc: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Behan Webster <behanw@converseincode.com>
Cc: Ismael Ripoll <iripoll@upv.es>
Cc: Jan-Simon Mller <dl9pf@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In preparation for moving ET_DYN randomization into the ELF loader (which
requires a static ELF_ET_DYN_BASE), this redefines s390's existing ET_DYN
randomization in a call to arch_mmap_rnd(). This refactoring results in
the same ET_DYN randomization on s390.
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When an architecture fully supports randomizing the ELF load location,
a per-arch mmap_rnd() function is used to find a randomized mmap base.
In preparation for randomizing the location of ET_DYN binaries
separately from mmap, this renames and exports these functions as
arch_mmap_rnd(). Additionally introduces CONFIG_ARCH_HAS_ELF_RANDOMIZE
for describing this feature on architectures that support it
(which is a superset of ARCH_BINFMT_ELF_RANDOMIZE_PIE, since s390
already supports a separated ET_DYN ASLR from mmap ASLR without the
ARCH_BINFMT_ELF_RANDOMIZE_PIE logic).
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Hector Marco-Gisbert <hecmargi@upv.es>
Cc: Russell King <linux@arm.linux.org.uk>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: "David A. Long" <dave.long@linaro.org>
Cc: Andrey Ryabinin <a.ryabinin@samsung.com>
Cc: Arun Chandran <achandran@mvista.com>
Cc: Yann Droneaud <ydroneaud@opteya.com>
Cc: Min-Hua Chen <orca.chen@gmail.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Alex Smith <alex@alex-smith.me.uk>
Cc: Markos Chandras <markos.chandras@imgtec.com>
Cc: Vineeth Vijayan <vvijayan@mvista.com>
Cc: Jeff Bailey <jeffbailey@google.com>
Cc: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Behan Webster <behanw@converseincode.com>
Cc: Ismael Ripoll <iripoll@upv.es>
Cc: Jan-Simon Mller <dl9pf@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In preparation for splitting out ET_DYN ASLR, this refactors the use of
mmap_rnd() to be used similarly to arm and x86, and extracts the
checking of PF_RANDOMIZE.
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In preparation for splitting out ET_DYN ASLR, this refactors the use of
mmap_rnd() to be used similarly to arm and x86.
(Can mmap ASLR be safely enabled in the legacy mmap case here? Other
archs use "mm->mmap_base = TASK_UNMAPPED_BASE + random_factor".)
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In preparation for splitting out ET_DYN ASLR, extract the mmap ASLR
selection into a separate function.
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In preparation for splitting out ET_DYN ASLR, this refactors the use of
mmap_rnd() to be used similarly to arm and x86. This additionally
enables mmap ASLR on legacy mmap layouts, which appeared to be missing
on arm64, and was already supported on arm. Additionally removes a
copy/pasted declaration of an unused function.
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In preparation for splitting out ET_DYN ASLR, this refactors the use of
mmap_rnd() to be used similarly to arm, and extracts the checking of
PF_RANDOMIZE.
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
To address the "offset2lib" ASLR weakness[1], this separates ET_DYN ASLR
from mmap ASLR, as already done on s390. The architectures that are
already randomizing mmap (arm, arm64, mips, powerpc, s390, and x86), have
their various forms of arch_mmap_rnd() made available via the new
CONFIG_ARCH_HAS_ELF_RANDOMIZE. For these architectures,
arch_randomize_brk() is collapsed as well.
This is an alternative to the solutions in:
https://lkml.org/lkml/2015/2/23/442
I've been able to test x86 and arm, and the buildbot (so far) seems happy
with building the rest.
[1] http://cybersecurity.upv.es/attacks/offset2lib/offset2lib.html
This patch (of 10):
In preparation for splitting out ET_DYN ASLR, this moves the ASLR
calculations for mmap on ARM into a separate routine, similar to x86.
This also removes the redundant check of personality (PF_RANDOMIZE is
already set before calling arch_pick_mmap_layout).
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Hector Marco-Gisbert <hecmargi@upv.es>
Cc: Russell King <linux@arm.linux.org.uk>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: "David A. Long" <dave.long@linaro.org>
Cc: Andrey Ryabinin <a.ryabinin@samsung.com>
Cc: Arun Chandran <achandran@mvista.com>
Cc: Yann Droneaud <ydroneaud@opteya.com>
Cc: Min-Hua Chen <orca.chen@gmail.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Alex Smith <alex@alex-smith.me.uk>
Cc: Markos Chandras <markos.chandras@imgtec.com>
Cc: Vineeth Vijayan <vvijayan@mvista.com>
Cc: Jeff Bailey <jeffbailey@google.com>
Cc: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Behan Webster <behanw@converseincode.com>
Cc: Ismael Ripoll <iripoll@upv.es>
Cc: Jan-Simon Mller <dl9pf@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
With CONFIG_ARCH_BINFMT_ELF_RANDOMIZE_PIE enabled, and a normal top-down
address allocation strategy, load_elf_binary() will attempt to map a PIE
binary into an address range immediately below mm->mmap_base.
Unfortunately, load_elf_ binary() does not take account of the need to
allocate sufficient space for the entire binary which means that, while
the first PT_LOAD segment is mapped below mm->mmap_base, the subsequent
PT_LOAD segment(s) end up being mapped above mm->mmap_base into the are
that is supposed to be the "gap" between the stack and the binary.
Since the size of the "gap" on x86_64 is only guaranteed to be 128MB this
means that binaries with large data segments > 128MB can end up mapping
part of their data segment over their stack resulting in corruption of the
stack (and the data segment once the binary starts to run).
Any PIE binary with a data segment > 128MB is vulnerable to this although
address randomization means that the actual gap between the stack and the
end of the binary is normally greater than 128MB. The larger the data
segment of the binary the higher the probability of failure.
Fix this by calculating the total size of the binary in the same way as
load_elf_interp().
Signed-off-by: Michael Davidson <md@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Kees Cook <keescook@chromium.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When !MMU, it will report warning. The related warning with allmodconfig
under c6x:
CC mm/memcontrol.o
mm/memcontrol.c:2802:12: warning: 'mem_cgroup_move_account' defined but not used [-Wunused-function]
static int mem_cgroup_move_account(struct page *page,
^
Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Implement huge KVA mapping interfaces on x86.
On x86, MTRRs can override PAT memory types with a 4KB granularity. When
using a huge page, MTRRs can override the memory type of the huge page,
which may lead a performance penalty. The processor can also behave in an
undefined manner if a huge page is mapped to a memory range that MTRRs
have mapped with multiple different memory types. Therefore, the mapping
code falls back to use a smaller page size toward 4KB when a mapping range
is covered by non-WB type of MTRRs. The WB type of MTRRs has no affect on
the PAT memory types.
pud_set_huge() and pmd_set_huge() call mtrr_type_lookup() to see if a
given range is covered by MTRRs. MTRR_TYPE_WRBACK indicates that the
range is either covered by WB or not covered and the MTRR default value is
set to WB. 0xFF indicates that MTRRs are disabled.
HAVE_ARCH_HUGE_VMAP is selected when X86_64 or X86_32 with X86_PAE is set.
X86_32 without X86_PAE is not supported since such config can unlikey be
benefited from this feature, and there was an issue found in testing.
[fengguang.wu@intel.com: ioremap_pud_capable can be static]
Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Robert Elliott <Elliott@hp.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Implement huge I/O mapping capability interfaces for ioremap() on x86.
IOREMAP_MAX_ORDER is defined to PUD_SHIFT on x86/64 and PMD_SHIFT on
x86/32, which overrides the default value defined in <linux/vmalloc.h>.
Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Robert Elliott <Elliott@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change vunmap_pmd_range() and vunmap_pud_range() to tear down huge KVA
mappings when they are set. pud_clear_huge() and pmd_clear_huge() return
zero when no-operation is performed, i.e. huge page mapping was not used.
These changes are only enabled when CONFIG_HAVE_ARCH_HUGE_VMAP is defined
on the architecture.
[akpm@linux-foundation.org: use consistent code layout]
Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Robert Elliott <Elliott@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ioremap_pud_range() and ioremap_pmd_range() are changed to create huge I/O
mappings when their capability is enabled, and a request meets required
conditions -- both virtual & physical addresses are aligned by their huge
page size, and a requested range fufills their huge page size. When
pud_set_huge() or pmd_set_huge() returns zero, i.e. no-operation is
performed, the code simply falls back to the next level.
The changes are only enabled when CONFIG_HAVE_ARCH_HUGE_VMAP is defined on
the architecture.
Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Robert Elliott <Elliott@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add ioremap_pud_enabled() and ioremap_pmd_enabled(), which return 1 when
I/O mappings with pud/pmd are enabled on the kernel.
ioremap_huge_init() calls arch_ioremap_pud_supported() and
arch_ioremap_pmd_supported() to initialize the capabilities at boot-time.
A new kernel option "nohugeiomap" is also added, so that user can disable
the huge I/O map capabilities when necessary.
Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Robert Elliott <Elliott@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ioremap() and its related interfaces are used to create I/O mappings to
memory-mapped I/O devices. The mapping sizes of the traditional I/O
devices are relatively small. Non-volatile memory (NVM), however, has
many GB and is going to have TB soon. It is not very efficient to create
large I/O mappings with 4KB.
This patchset extends the ioremap() interfaces to transparently create I/O
mappings with huge pages whenever possible. ioremap() continues to use
4KB mappings when a huge page does not fit into a requested range. There
is no change necessary to the drivers using ioremap(). A requested
physical address must be aligned by a huge page size (1GB or 2MB on x86)
for using huge page mapping, though. The kernel huge I/O mapping will
improve performance of NVM and other devices with large memory, and reduce
the time to create their mappings as well.
On x86, MTRRs can override PAT memory types with a 4KB granularity. When
using a huge page, MTRRs can override the memory type of the huge page,
which may lead a performance penalty. The processor can also behave in an
undefined manner if a huge page is mapped to a memory range that MTRRs
have mapped with multiple different memory types. Therefore, the mapping
code falls back to use a smaller page size toward 4KB when a mapping range
is covered by non-WB type of MTRRs. The WB type of MTRRs has no affect on
the PAT memory types.
The patchset introduces HAVE_ARCH_HUGE_VMAP, which indicates that the arch
supports huge KVA mappings for ioremap(). User may specify a new kernel
option "nohugeiomap" to disable the huge I/O mapping capability of
ioremap() when necessary.
Patch 1-4 change common files to support huge I/O mappings. There is no
change in the functinalities unless HAVE_ARCH_HUGE_VMAP is defined on the
architecture of the system.
Patch 5-6 implement the HAVE_ARCH_HUGE_VMAP funcs on x86, and set
HAVE_ARCH_HUGE_VMAP on x86.
This patch (of 6):
__get_vm_area_node() takes unsigned long size, which is a 64-bit value on
a 64-bit kernel. However, fls(size) simply ignores the upper 32-bit.
Change to use fls_long() to handle the size properly.
Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Robert Elliott <Elliott@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>