Commit Graph

51552 Commits

Author SHA1 Message Date
Linus Torvalds 72b5ac54d6 The highlights are:
* RADOS namespace support in libceph and CephFS (Zheng Yan and myself).
    The stopgaps added in 4.5 to deny access to inodes in namespaces are
    removed and CEPH_FEATURE_FS_FILE_LAYOUT_V2 feature bit is now fully
    supported.
 
  * A large rework of the MDS cap flushing code (Zheng Yan).
 
  * Handle some of ->d_revalidate() in RCU mode (Jeff Layton).  We were
    overly pessimistic before, bailing at the first sight of LOOKUP_RCU.
 
 On top of that we've got a few CephFS bug fixes, a couple of cleanups
 and Arnd's workaround for a weird genksyms issue.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABCAAGBQJXoKLJAAoJEEp/3jgCEfOLDTUIAIcctpKUiNBokc95mQaXYl34
 j7lPIaD0/Ur7JPt4nMdtlywYJYSVV2c+SglHztj/+fv0G4bWbLVEFRruh9SwKIci
 PzttcmycIAqSn1f5gBZwyQbGuffd/F0EnBj7fFjcukt01i3s1ZQ7t4XtLGtAV0Ts
 aIfFtx9SqWig57Z1OZqNgnhnOoh6IqNbic3FL5Hvdl5N5pFbBcQho6Vzoa5O1osH
 URG6RmCcO4nykfSoxiivE7UZ+CImsXHkRD7rupBuIjqjZ8wvmZqQF5qxnkb9Dw2F
 IkNhrHkTSIiv4EsNPLAETTnFSozrL1nEykKr2FBW+ti8nxNcav+8FgVapqLvFIw=
 =gQ0/
 -----END PGP SIGNATURE-----

Merge tag 'ceph-for-4.8-rc1' of git://github.com/ceph/ceph-client

Pull Ceph updates from Ilya Dryomov:
 "The highlights are:

   - RADOS namespace support in libceph and CephFS (Zheng Yan and
     myself).  The stopgaps added in 4.5 to deny access to inodes in
     namespaces are removed and CEPH_FEATURE_FS_FILE_LAYOUT_V2 feature
     bit is now fully supported

   - A large rework of the MDS cap flushing code (Zheng Yan)

   - Handle some of ->d_revalidate() in RCU mode (Jeff Layton).  We were
     overly pessimistic before, bailing at the first sight of LOOKUP_RCU

  On top of that we've got a few CephFS bug fixes, a couple of cleanups
  and Arnd's workaround for a weird genksyms issue"

* tag 'ceph-for-4.8-rc1' of git://github.com/ceph/ceph-client: (34 commits)
  ceph: fix symbol versioning for ceph_monc_do_statfs
  ceph: Correctly return NXIO errors from ceph_llseek
  ceph: Mark the file cache as unreclaimable
  ceph: optimize cap flush waiting
  ceph: cleanup ceph_flush_snaps()
  ceph: kick cap flushes before sending other cap message
  ceph: introduce an inode flag to indicates if snapflush is needed
  ceph: avoid sending duplicated cap flush message
  ceph: unify cap flush and snapcap flush
  ceph: use list instead of rbtree to track cap flushes
  ceph: update types of some local varibles
  ceph: include 'follows' of pending snapflush in cap reconnect message
  ceph: update cap reconnect message to version 3
  ceph: mount non-default filesystem by name
  libceph: fsmap.user subscription support
  ceph: handle LOOKUP_RCU in ceph_d_revalidate
  ceph: allow dentry_lease_is_valid to work under RCU walk
  ceph: clear d_fsinfo pointer under d_lock
  ceph: remove ceph_mdsc_lease_release
  ceph: don't use ->d_time
  ...
2016-08-02 19:39:09 -04:00
Alexey Dobriyan 3bd080e4d8 ipc: delete "nr_ipc_ns"
Write-only variable.

Link: http://lkml.kernel.org/r/20160708214356.GA6785@p183.telecom.by
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-02 19:35:44 -04:00
Alexandre Bounine 0b9364b5cf rapidio/switches: add driver for IDT gen3 switches
Add RapidIO switch driver for IDT Gen3 switch devices: RXS1632 and
RXS2448.

[alexandre.bounine@idt.com: fixup for original driver patch]
  Link: http://lkml.kernel.org/r/1469137596-18241-1-git-send-email-alexandre.bounine@idt.com
Link: http://lkml.kernel.org/r/1469125134-16523-14-git-send-email-alexandre.bounine@idt.com
Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Tested-by: Barry Wood <barry.wood@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
Cc: Barry Wood <barry.wood@idt.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-02 19:35:38 -04:00
Alexandre Bounine 1ae842de1d rapidio: modify for rev.3 specification changes
Implement changes made in RapidIO specification rev.3 to LP-Serial Physical
Layer register definitions:

 - use per-port register offset calculations based on LP-Serial Extended
   Features Block (EFB) Register Map type (I or II) with different
   per-port offset step (0x20 vs 0x40 respectfully).

 - remove deprecated Parallel Physical layer definitions and related
   code.

[alexandre.bounine@idt.com: fix DocBook warning for gen3 update]
  Link: http://lkml.kernel.org/r/1469191173-19338-1-git-send-email-alexandre.bounine@idt.com
Link: http://lkml.kernel.org/r/1469125134-16523-12-git-send-email-alexandre.bounine@idt.com
Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Tested-by: Barry Wood <barry.wood@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
Cc: Barry Wood <barry.wood@idt.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-02 19:35:37 -04:00
Alexandre Bounine a057a52e94 rapidio: change inbound window size type to u64
Current definition of map_inb() mport operations callback uses u32 type
to specify required inbound window (IBW) size.  This is limiting factor
because existing hardware - tsi721 and fsl_rio, both support IBW size up
to 16GB.

Changing type of size parameter to u64 to allow IBW size configurations
larger than 4GB.

[alexandre.bounine@idt.com: remove compiler warning about size of constant]
  Link: http://lkml.kernel.org/r/20160802184856.2566-1-alexandre.bounine@idt.com
Link: http://lkml.kernel.org/r/1469125134-16523-11-git-send-email-alexandre.bounine@idt.com
Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
Cc: Barry Wood <barry.wood@idt.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-02 19:35:37 -04:00
Petr Tesarik 21db79e8bb kexec: add a kexec_crash_loaded() function
Provide a wrapper function to be used by kernel code to check whether a
crash kernel is loaded.  It returns the same value that can be seen in
/sys/kernel/kexec_crash_loaded by userspace programs.

I'm exporting the function, because it will be used by Xen, and it is
possible to compile Xen modules separately to enable the use of PV
drivers with unmodified bare-metal kernels.

Link: http://lkml.kernel.org/r/20160713121955.14969.69080.stgit@hananiah.suse.cz
Signed-off-by: Petr Tesarik <ptesarik@suse.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-02 19:35:30 -04:00
Russell King 43546d8669 kexec: allow architectures to override boot mapping
kexec physical addresses are the boot-time view of the system.  For
certain ARM systems (such as Keystone 2), the boot view of the system
does not match the kernel's view of the system: the boot view uses a
special alias in the lower 4GB of the physical address space.

To cater for these kinds of setups, we need to translate between the
boot view physical addresses and the normal kernel view physical
addresses.  This patch extracts the current transation points into
linux/kexec.h, and allows an architecture to override the functions.

Due to the translations required, we unfortunately end up with six
translation functions, which are reduced down to four that the
architecture can override.

[akpm@linux-foundation.org: kexec.h needs asm/io.h for phys_to_virt()]
Link: http://lkml.kernel.org/r/E1b8koP-0004HZ-Vf@rmk-PC.armlinux.org.uk
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Keerthy <j-keerthy@ti.com>
Cc: Pratyush Anand <panand@redhat.com>
Cc: Vitaly Andrianov <vitalya@ti.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Simon Horman <horms@verge.net.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-02 19:35:27 -04:00
Russell King dae28018f5 kdump: arrange for paddr_vmcoreinfo_note() to return phys_addr_t
On PAE systems (eg, ARM LPAE) the vmcore note may be located above 4GB
physical on 32-bit architectures, so we need a wider type than "unsigned
long" here.  Arrange for paddr_vmcoreinfo_note() to return a
phys_addr_t, thereby allowing it to be located above 4GB.

This makes no difference for kexec-tools, as they already assume a
64-bit type when reading from this file.

Link: http://lkml.kernel.org/r/E1b8koK-0004HS-K9@rmk-PC.armlinux.org.uk
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Reviewed-by: Pratyush Anand <panand@redhat.com>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: Keerthy <j-keerthy@ti.com>
Cc: Vitaly Andrianov <vitalya@ti.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Simon Horman <horms@verge.net.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-02 19:35:27 -04:00
Russell King dc5cccacf4 kexec: don't invoke OOM-killer for control page allocation
If we are unable to find a suitable page when allocating the control
page, do not invoke the OOM-killer: killing processes probably isn't
going to help.

Link: http://lkml.kernel.org/r/E1b8ko9-0004HG-R5@rmk-PC.armlinux.org.uk
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Reviewed-by: Pratyush Anand <panand@redhat.com>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: Keerthy <j-keerthy@ti.com>
Cc: Vitaly Andrianov <vitalya@ti.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Simon Horman <horms@verge.net.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-02 19:35:26 -04:00
Geliang Tang b06fb41533 cpumask: fix code comment
Fix code comment for cpumask_parse().

Link: http://lkml.kernel.org/r/71aae2c60ae5dae0cf554199ce6aea8f88c69347.1465380581.git.geliangtang@gmail.com
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-02 19:35:24 -04:00
Andy Lutomirski 7e7814180b signal: consolidate {TS,TLF}_RESTORE_SIGMASK code
In general, there's no need for the "restore sigmask" flag to live in
ti->flags.  alpha, ia64, microblaze, powerpc, sh, sparc (64-bit only),
tile, and x86 use essentially identical alternative implementations,
placing the flag in ti->status.

Replace those optimized implementations with an equally good common
implementation that stores it in a bitfield in struct task_struct and
drop the custom implementations.

Additional architectures can opt in by removing their
TIF_RESTORE_SIGMASK defines.

Link: http://lkml.kernel.org/r/8a14321d64a28e40adfddc90e18a96c086a6d6f9.1468522723.git.luto@kernel.org
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Tested-by: Michael Ellerman <mpe@ellerman.id.au>	[powerpc]
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dmitry Safonov <dsafonov@virtuozzo.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-02 19:35:23 -04:00
Ryusuke Konishi e63e88bc53 nilfs2: move ioctl interface and disk layout to uapi separately
The header file "include/linux/nilfs2_fs.h" is composed of parts for
ioctl and disk format, and both are intended to be shared with user
space programs.

This moves them to the uapi directory "include/uapi/linux" splitting the
file to "nilfs2_api.h" and "nilfs2_ondisk.h".  The following minor
changes are accompanied by this migration:

 - nilfs_direct_node struct in nilfs2/direct.h is converged to
   nilfs2_ondisk.h because it's an on-disk structure.
 - inline functions nilfs_rec_len_from_disk() and
   nilfs_rec_len_to_disk() are moved to nilfs2/dir.c.

Link: http://lkml.kernel.org/r/1465825507-3407-4-git-send-email-konishi.ryusuke@lab.ntt.co.jp
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-02 19:35:21 -04:00
Stephen Boyd a098ecd2fa firmware: support loading into a pre-allocated buffer
Some systems are memory constrained but they need to load very large
firmwares.  The firmware subsystem allows drivers to request this
firmware be loaded from the filesystem, but this requires that the
entire firmware be loaded into kernel memory first before it's provided
to the driver.  This can lead to a situation where we map the firmware
twice, once to load the firmware into kernel memory and once to copy the
firmware into the final resting place.

This creates needless memory pressure and delays loading because we have
to copy from kernel memory to somewhere else.  Let's add a
request_firmware_into_buf() API that allows drivers to request firmware
be loaded directly into a pre-allocated buffer.  This skips the
intermediate step of allocating a buffer in kernel memory to hold the
firmware image while it's read from the filesystem.  It also requires
that drivers know how much memory they'll require before requesting the
firmware and negates any benefits of firmware caching because the
firmware layer doesn't manage the buffer lifetime.

For a 16MB buffer, about half the time is spent performing a memcpy from
the buffer to the final resting place.  I see loading times go from
0.081171 seconds to 0.047696 seconds after applying this patch.  Plus
the vmalloc pressure is reduced.

This is based on a patch from Vikram Mulukutla on codeaurora.org:
  https://www.codeaurora.org/cgit/quic/la/kernel/msm-3.18/commit/drivers/base/firmware_class.c?h=rel/msm-3.18&id=0a328c5f6cd999f5c591f172216835636f39bcb5

Link: http://lkml.kernel.org/r/20160607164741.31849-4-stephen.boyd@linaro.org
Signed-off-by: Stephen Boyd <stephen.boyd@linaro.org>
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Vikram Mulukutla <markivx@codeaurora.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Ming Lei <ming.lei@canonical.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-02 19:35:10 -04:00
Ross Zwisler a23216a2f1 radix-tree: fix comment about "exceptional" bits
The bottom two bits of radix tree entries are reserved for special use
by the radix tree code itself.  A comment detailing their usage was
added by commit 3bcadd6fa6 ("radix-tree: free up the bottom bit of
exceptional entries for reuse")

This comment states that if the bottom two bits are '11', this means
that this is a locked exceptional entry.

It turns out that this bit combination was never actually used.  Radix
tree locking for DAX was indeed implemented, but it actually used the
third LSB:

  /* We use lowest available exceptional entry bit for locking */
  #define RADIX_DAX_ENTRY_LOCK (1 << RADIX_TREE_EXCEPTIONAL_SHIFT)

This locking code was also made specific to the DAX code instead of
being generally implemented in radix-tree.h.

So, fix the comment.

Link: http://lkml.kernel.org/r/1468997731-2155-1-git-send-email-ross.zwisler@linux.intel.com
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jan Kara <jack@suse.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-02 19:35:08 -04:00
Borislav Petkov 750afe7bab printk: add kernel parameter to control writes to /dev/kmsg
Add a "printk.devkmsg" kernel command line parameter which controls how
userspace writes into /dev/kmsg.  It has three options:

 * ratelimit - ratelimit logging from userspace.
 * on  - unlimited logging from userspace
 * off - logging from userspace gets ignored

The default setting is to ratelimit the messages written to it.

This changes the kernel default setting of "on" to "ratelimit" and we do
that because we want to keep userspace spamming /dev/kmsg to sane
levels.  This is especially moot when a small kernel log buffer wraps
around and messages get lost.  So the ratelimiting setting should be a
sane setting where kernel messages should have a bit higher chance of
survival from all the spamming.

It additionally does not limit logging to /dev/kmsg while the system is
booting if we haven't disabled it on the command line.

Furthermore, we can control the logging from a lower priority sysctl
interface - kernel.printk_devkmsg.

That interface will succeed only if printk.devkmsg *hasn't* been
supplied on the command line.  If it has, then printk.devkmsg is a
one-time setting which remains for the duration of the system lifetime.
This "locking" of the setting is to prevent userspace from changing the
logging on us through sysctl(2).

This patch is based on previous patches from Linus and Steven.

[bp@suse.de: fixes]
  Link: http://lkml.kernel.org/r/20160719072344.GC25563@nazgul.tnic
Link: http://lkml.kernel.org/r/20160716061745.15795-3-bp@alien8.de
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Dave Young <dyoung@redhat.com>
Cc: Franck Bui <fbui@suse.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-02 19:35:06 -04:00
Borislav Petkov 6b1d174b0c ratelimit: extend to print suppressed messages on release
Extend the ratelimiting facility to print the amount of suppressed lines
when it is being released.

This use case is aimed at short-termed, burst-like users for which we
want to output the suppressed lines stats only once, after it has been
disposed of.  For an example, see /dev/kmsg usage in a follow-on patch.

Also, change the printk() line we issue on release to not use
"callbacks" as it is misleading: we're not suppressing callbacks but
printk() calls.

This has been separated from a previous patch by Linus.

Link: http://lkml.kernel.org/r/20160716061745.15795-2-bp@alien8.de
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Dave Young <dyoung@redhat.com>
Cc: Franck Bui <fbui@suse.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-02 19:35:06 -04:00
Joe Perches 874f9c7da9 printk: create pr_<level> functions
Using functions instead of macros can reduce overall code size by
eliminating unnecessary "KERN_SOH<digit>" prefixes from format strings.

  defconfig x86-64:

  $ size vmlinux*
     text    data     bss      dec     hex  filename
  10193570 4331464 1105920 15630954  ee826a vmlinux.new
  10192623 4335560 1105920 15634103  ee8eb7 vmlinux.old

As the return value are unimportant and unused in the kernel tree, these
new functions return void.

Miscellanea:

 - change pr_<level> macros to call new __pr_<level> functions
 - change vprintk_nmi and vprintk_default to add LOGLEVEL_<level> argument

[akpm@linux-foundation.org: fix LOGLEVEL_INFO, per Joe]
Link: http://lkml.kernel.org/r/e16cc34479dfefcae37c98b481e6646f0f69efc3.1466718827.git.joe@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-02 19:35:04 -04:00
Luis de Bethencourt 9d5059c959 dynamic_debug: only add header when used
kernel.h header doesn't directly use dynamic debug, instead we can
include it in module.c (which used it via kernel.h).  printk.h only uses
it if CONFIG_DYNAMIC_DEBUG is on, changing the inclusion to only happen
in that case.

Link: http://lkml.kernel.org/r/1468429793-16917-1-git-send-email-luisbg@osg.samsung.com
[luisbg@osg.samsung.com: include dynamic_debug.h in drb_int.h]
  Link: http://lkml.kernel.org/r/1468447828-18558-2-git-send-email-luisbg@osg.samsung.com
Signed-off-by: Luis de Bethencourt <luisbg@osg.samsung.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-02 19:35:03 -04:00
Chen Gang 949bed2f57 include: mman: use bool instead of int for the return value of arch_validate_prot
For pure bool function's return value, bool is a little better more or
less than int.

Link: http://lkml.kernel.org/r/1469331815-2026-1-git-send-email-chengang@emindsoft.com.cn
Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-02 19:35:02 -04:00
Alexey Dobriyan db3f600124 uapi: move forward declarations of internal structures
Don't user forward declarations of internal kernel structures in headers
exported to userspace.

Move "struct completion;".
Move "struct task_struct;".

Link: http://lkml.kernel.org/r/20160713215808.GA22486@p183.telecom.by
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-02 17:31:41 -04:00
Fabian Frederick bd721ea73e treewide: replace obsolete _refok by __ref
There was only one use of __initdata_refok and __exit_refok

__init_refok was used 46 times against 82 for __ref.

Those definitions are obsolete since commit 312b1485fb ("Introduce new
section reference annotations tags: __ref, __refdata, __refconst")

This patch removes the following compatibility definitions and replaces
them treewide.

/* compatibility defines */
#define __init_refok     __ref
#define __initdata_refok __refdata
#define __exit_refok     __ref

I can also provide separate patches if necessary.
(One patch per tree and check in 1 month or 2 to remove old definitions)

[akpm@linux-foundation.org: coding-style fixes]
Link: http://lkml.kernel.org/r/1466796271-3043-1-git-send-email-fabf@skynet.be
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-02 17:31:41 -04:00
Andrey Ryabinin b3cbd9bf77 mm/kasan: get rid of ->state in struct kasan_alloc_meta
The state of object currently tracked in two places - shadow memory, and
the ->state field in struct kasan_alloc_meta.  We can get rid of the
latter.  The will save us a little bit of memory.  Also, this allow us
to move free stack into struct kasan_alloc_meta, without increasing
memory consumption.  So now we should always know when the last time the
object was freed.  This may be useful for long delayed use-after-free
bugs.

As a side effect this fixes following UBSAN warning:
	UBSAN: Undefined behaviour in mm/kasan/quarantine.c:102:13
	member access within misaligned address ffff88000d1efebc for type 'struct qlist_node'
	which requires 8 byte alignment

Link: http://lkml.kernel.org/r/1470062715-14077-5-git-send-email-aryabinin@virtuozzo.com
Reported-by: kernel test robot <xiaolong.ye@intel.com>
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-02 17:31:41 -04:00
Linus Torvalds c8d0267efd PCI changes for the v4.8 merge window:
Enumeration
     Move ecam.h to linux/include/pci-ecam.h (Jayachandran C)
     Add parent device field to ECAM struct pci_config_window (Jayachandran C)
     Add generic MCFG table handling (Tomasz Nowicki)
     Refactor pci_bus_assign_domain_nr() for CONFIG_PCI_DOMAINS_GENERIC (Tomasz Nowicki)
     Factor DT-specific pci_bus_find_domain_nr() code out (Tomasz Nowicki)
 
   Resource management
     Add devm_request_pci_bus_resources() (Bjorn Helgaas)
     Unify pci_resource_to_user() declarations (Bjorn Helgaas)
     Implement pci_resource_to_user() with pcibios_resource_to_bus() (microblaze, powerpc, sparc) (Bjorn Helgaas)
     Request host bridge window resources (designware, iproc, rcar, xgene, xilinx, xilinx-nwl) (Bjorn Helgaas)
     Make PCI I/O space optional on ARM32 (Bjorn Helgaas)
     Ignore write combining when mapping I/O port space (Bjorn Helgaas)
     Claim bus resources on MIPS PCI_PROBE_ONLY set-ups (Bjorn Helgaas)
     Remove unicore32 pci=firmware command line parameter handling (Bjorn Helgaas)
     Support I/O resources when parsing host bridge resources (Jayachandran C)
     Add helpers to request/release memory and I/O regions (Johannes Thumshirn)
     Use pci_(request|release)_mem_regions (NVMe, lpfc, GenWQE, ethernet/intel, alx) (Johannes Thumshirn)
     Extend pci=resource_alignment to specify device/vendor IDs (Koehrer Mathias (ETAS/ESW5))
     Add generic pci_bus_claim_resources() (Lorenzo Pieralisi)
     Claim bus resources on ARM32 PCI_PROBE_ONLY set-ups (Lorenzo Pieralisi)
     Remove ARM32 and ARM64 arch-specific pcibios_enable_device() (Lorenzo Pieralisi)
     Add pci_unmap_iospace() to unmap I/O resources (Sinan Kaya)
     Remove powerpc __pci_mmap_set_pgprot() (Yinghai Lu)
 
   PCI device hotplug
     Allow additional bus numbers for hotplug bridges (Keith Busch)
     Ignore interrupts during D3cold (Lukas Wunner)
 
   Power management
     Enforce type casting for pci_power_t (Andy Shevchenko)
     Don't clear d3cold_allowed for PCIe ports (Mika Westerberg)
     Put PCIe ports into D3 during suspend (Mika Westerberg)
     Power on bridges before scanning new devices (Mika Westerberg)
     Runtime resume bridge before rescan (Mika Westerberg)
     Add runtime PM support for PCIe ports (Mika Westerberg)
     Remove redundant check of pcie_set_clkpm (Shawn Lin)
 
   Virtualization
     Add function 1 DMA alias quirk for Marvell 88SE9182 (Aaron Sierra)
     Add DMA alias quirk for Adaptec 3805 (Alex Williamson)
     Mark Atheros AR9485 and QCA9882 to avoid bus reset (Chris Blake)
     Add ACS quirk for Solarflare SFC9220 (Edward Cree)
 
   MSI
     Fix PCI_MSI dependencies (Arnd Bergmann)
     Add pci_msix_desc_addr() helper (Christoph Hellwig)
     Switch msix_program_entries() to use pci_msix_desc_addr() (Christoph Hellwig)
     Make the "entries" argument to pci_enable_msix() optional (Christoph Hellwig)
     Provide sensible IRQ vector alloc/free routines (Christoph Hellwig)
     Spread interrupt vectors in pci_alloc_irq_vectors() (Christoph Hellwig)
 
   Error Handling
     Bind DPC to Root Ports as well as Downstream Ports (Keith Busch)
     Remove DPC tristate module option (Keith Busch)
     Convert Downstream Port Containment driver to use devm_* functions (Mika Westerberg)
 
   Generic host bridge driver
     Select IRQ_DOMAIN (Arnd Bergmann)
     Claim bus resources on PCI_PROBE_ONLY set-ups (Lorenzo Pieralisi)
 
   ACPI host bridge driver
     Add ARM64 acpi_pci_bus_find_domain_nr() (Tomasz Nowicki)
     Add ARM64 ACPI support for legacy IRQs parsing and consolidation with DT code (Tomasz Nowicki)
     Implement ARM64 AML accessors for PCI_Config region (Tomasz Nowicki)
     Support ARM64 ACPI-based PCI host controller (Tomasz Nowicki)
 
   Altera host bridge driver
     Check link status before retrain link (Ley Foon Tan)
     Poll for link up status after retraining the link (Ley Foon Tan)
 
   Axis ARTPEC-6 host bridge driver
     Add PCI_MSI_IRQ_DOMAIN dependency (Arnd Bergmann)
     Add DT binding for Axis ARTPEC-6 PCIe controller (Niklas Cassel)
     Add Axis ARTPEC-6 PCIe controller driver (Niklas Cassel)
 
   Intel VMD host bridge driver
     Use lock save/restore in interrupt enable path (Jon Derrick)
     Select device dma ops to override (Keith Busch)
     Initialize list item in IRQ disable (Keith Busch)
     Use x86_vector_domain as parent domain (Keith Busch)
     Separate MSI and MSI-X vector sharing (Keith Busch)
 
   Marvell Aardvark host bridge driver
     Add DT binding for the Aardvark PCIe controller (Thomas Petazzoni)
     Add Aardvark PCI host controller driver (Thomas Petazzoni)
     Add Aardvark PCIe support for Armada 3700 (Thomas Petazzoni)
 
   Microsoft Hyper-V host bridge driver
     Fix interrupt cleanup path (Cathy Avery)
     Don't leak buffer in hv_pci_onchannelcallback() (Vitaly Kuznetsov)
     Handle all pending messages in hv_pci_onchannelcallback() (Vitaly Kuznetsov)
 
   NVIDIA Tegra host bridge driver
     Program PADS_REFCLK_CFG* always, not just on legacy SoCs (Stephen Warren)
     Program PADS_REFCLK_CFG* registers with per-SoC values (Stephen Warren)
     Use lower-case hex consistently for register definitions (Thierry Reding)
     Use generic pci_remap_iospace() rather than ARM32-specific one (Thierry Reding)
     Stop setting pcibios_min_mem (Thierry Reding)
 
   Renesas R-Car host bridge driver
     Drop gen2 dummy I/O port region (Bjorn Helgaas)
 
   TI DRA7xx host bridge driver
     Fix return value in case of error (Christophe JAILLET)
 
   Xilinx AXI host bridge driver
     Fix return value in case of error (Christophe JAILLET)
 
   Miscellaneous
     Make bus_attr_resource_alignment static (Ben Dooks)
     Include <asm/dma.h> for isa_dma_bridge_buggy (Ben Dooks)
     MAINTAINERS: Add file patterns for PCI device tree bindings (Geert Uytterhoeven)
     Make host bridge drivers explicitly non-modular (Paul Gortmaker)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXoNRtAAoJEFmIoMA60/r8LMkP/3kiNh21QFS6RZGOaDft5/Py
 n14Zo0w51avspxoI3iyDlBd5q/SssMqi+2c6Ko/fh2D2xMxJgmQOjdMDrIGARxGA
 qEHk/5IoXquY2/GcptmCk3ap66cJ6kTovS4OPrb73m3fPuknFwFwdzExq22XHbnI
 crPya6xwQxPLc54VpY/TsgW8E+EKZd/3FW9wuzzNHXrXmTILyhBQzQAA0K470GMx
 wEXU6kc3M/XhRuF1zjV9/O+H/xguwfnbTpZLvd2NAF6uXKZoRytEHHtNnVqu1hoe
 UPpDS2xq32pMNbGxGqBetCdIbkY/hWOufmckHI7Yu2OfXBYyHBYMG2je1+nMPkOV
 WiFhhrchGt5KnEMUwXPS4ROqnSZVpZBl1Fd4s10GhUYkoE2HNKJXta398H9FR1jj
 4NEVSi4mSX/+CkaoIN3lXYiaf9P0wv4Wppve4Scr30+VnLjJhm7Vw5La7v12oo6x
 otrJ/g98AkmnbuUdLeWBUS/+TOcdPjZYbw52rqBsbOOjFm51Zcj6D7kf5WcTypQy
 HzbvygSVabcioWehUG1uudC8pdJmQlUGx1aES/iu+mZEae4cuUFALu6hDBD9IYnZ
 5JdwjVzI0UItEwT3rQt3t4xiAqHADQ0NAVNJVCeREdoy/YQpSoTWGXIpyqCZ1yCm
 aBykjRsxbKQXlhVeIxuc
 =NVxu
 -----END PGP SIGNATURE-----

Merge tag 'pci-v4.8-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI updates from Bjorn Helgaas:
 "Highlights:

   - ARM64 support for ACPI host bridges

   - new drivers for Axis ARTPEC-6 and Marvell Aardvark

   - new pci_alloc_irq_vectors() interface for MSI-X, MSI, legacy INTx

   - pci_resource_to_user() cleanup (more to come)

  Detailed summary:

  Enumeration:
   - Move ecam.h to linux/include/pci-ecam.h (Jayachandran C)
   - Add parent device field to ECAM struct pci_config_window (Jayachandran C)
   - Add generic MCFG table handling (Tomasz Nowicki)
   - Refactor pci_bus_assign_domain_nr() for CONFIG_PCI_DOMAINS_GENERIC (Tomasz Nowicki)
   - Factor DT-specific pci_bus_find_domain_nr() code out (Tomasz Nowicki)

  Resource management:
   - Add devm_request_pci_bus_resources() (Bjorn Helgaas)
   - Unify pci_resource_to_user() declarations (Bjorn Helgaas)
   - Implement pci_resource_to_user() with pcibios_resource_to_bus() (microblaze, powerpc, sparc) (Bjorn Helgaas)
   - Request host bridge window resources (designware, iproc, rcar, xgene, xilinx, xilinx-nwl) (Bjorn Helgaas)
   - Make PCI I/O space optional on ARM32 (Bjorn Helgaas)
   - Ignore write combining when mapping I/O port space (Bjorn Helgaas)
   - Claim bus resources on MIPS PCI_PROBE_ONLY set-ups (Bjorn Helgaas)
   - Remove unicore32 pci=firmware command line parameter handling (Bjorn Helgaas)
   - Support I/O resources when parsing host bridge resources (Jayachandran C)
   - Add helpers to request/release memory and I/O regions (Johannes Thumshirn)
   - Use pci_(request|release)_mem_regions (NVMe, lpfc, GenWQE, ethernet/intel, alx) (Johannes Thumshirn)
   - Extend pci=resource_alignment to specify device/vendor IDs (Koehrer Mathias (ETAS/ESW5))
   - Add generic pci_bus_claim_resources() (Lorenzo Pieralisi)
   - Claim bus resources on ARM32 PCI_PROBE_ONLY set-ups (Lorenzo Pieralisi)
   - Remove ARM32 and ARM64 arch-specific pcibios_enable_device() (Lorenzo Pieralisi)
   - Add pci_unmap_iospace() to unmap I/O resources (Sinan Kaya)
   - Remove powerpc __pci_mmap_set_pgprot() (Yinghai Lu)

  PCI device hotplug:
   - Allow additional bus numbers for hotplug bridges (Keith Busch)
   - Ignore interrupts during D3cold (Lukas Wunner)

  Power management:
   - Enforce type casting for pci_power_t (Andy Shevchenko)
   - Don't clear d3cold_allowed for PCIe ports (Mika Westerberg)
   - Put PCIe ports into D3 during suspend (Mika Westerberg)
   - Power on bridges before scanning new devices (Mika Westerberg)
   - Runtime resume bridge before rescan (Mika Westerberg)
   - Add runtime PM support for PCIe ports (Mika Westerberg)
   - Remove redundant check of pcie_set_clkpm (Shawn Lin)

  Virtualization:
   - Add function 1 DMA alias quirk for Marvell 88SE9182 (Aaron Sierra)
   - Add DMA alias quirk for Adaptec 3805 (Alex Williamson)
   - Mark Atheros AR9485 and QCA9882 to avoid bus reset (Chris Blake)
   - Add ACS quirk for Solarflare SFC9220 (Edward Cree)

  MSI:
   - Fix PCI_MSI dependencies (Arnd Bergmann)
   - Add pci_msix_desc_addr() helper (Christoph Hellwig)
   - Switch msix_program_entries() to use pci_msix_desc_addr() (Christoph Hellwig)
   - Make the "entries" argument to pci_enable_msix() optional (Christoph Hellwig)
   - Provide sensible IRQ vector alloc/free routines (Christoph Hellwig)
   - Spread interrupt vectors in pci_alloc_irq_vectors() (Christoph Hellwig)

  Error Handling:
   - Bind DPC to Root Ports as well as Downstream Ports (Keith Busch)
   - Remove DPC tristate module option (Keith Busch)
   - Convert Downstream Port Containment driver to use devm_* functions (Mika Westerberg)

  Generic host bridge driver:
   - Select IRQ_DOMAIN (Arnd Bergmann)
   - Claim bus resources on PCI_PROBE_ONLY set-ups (Lorenzo Pieralisi)

  ACPI host bridge driver:
   - Add ARM64 acpi_pci_bus_find_domain_nr() (Tomasz Nowicki)
   - Add ARM64 ACPI support for legacy IRQs parsing and consolidation with DT code (Tomasz Nowicki)
   - Implement ARM64 AML accessors for PCI_Config region (Tomasz Nowicki)
   - Support ARM64 ACPI-based PCI host controller (Tomasz Nowicki)

  Altera host bridge driver:
   - Check link status before retrain link (Ley Foon Tan)
   - Poll for link up status after retraining the link (Ley Foon Tan)

  Axis ARTPEC-6 host bridge driver:
   - Add PCI_MSI_IRQ_DOMAIN dependency (Arnd Bergmann)
   - Add DT binding for Axis ARTPEC-6 PCIe controller (Niklas Cassel)
   - Add Axis ARTPEC-6 PCIe controller driver (Niklas Cassel)

  Intel VMD host bridge driver:
   - Use lock save/restore in interrupt enable path (Jon Derrick)
   - Select device dma ops to override (Keith Busch)
   - Initialize list item in IRQ disable (Keith Busch)
   - Use x86_vector_domain as parent domain (Keith Busch)
   - Separate MSI and MSI-X vector sharing (Keith Busch)

  Marvell Aardvark host bridge driver:
   - Add DT binding for the Aardvark PCIe controller (Thomas Petazzoni)
   - Add Aardvark PCI host controller driver (Thomas Petazzoni)
   - Add Aardvark PCIe support for Armada 3700 (Thomas Petazzoni)

  Microsoft Hyper-V host bridge driver:
   - Fix interrupt cleanup path (Cathy Avery)
   - Don't leak buffer in hv_pci_onchannelcallback() (Vitaly Kuznetsov)
   - Handle all pending messages in hv_pci_onchannelcallback() (Vitaly Kuznetsov)

  NVIDIA Tegra host bridge driver:
   - Program PADS_REFCLK_CFG* always, not just on legacy SoCs (Stephen Warren)
   - Program PADS_REFCLK_CFG* registers with per-SoC values (Stephen Warren)
   - Use lower-case hex consistently for register definitions (Thierry Reding)
   - Use generic pci_remap_iospace() rather than ARM32-specific one (Thierry Reding)
   - Stop setting pcibios_min_mem (Thierry Reding)

  Renesas R-Car host bridge driver:
   - Drop gen2 dummy I/O port region (Bjorn Helgaas)

  TI DRA7xx host bridge driver:
   - Fix return value in case of error (Christophe JAILLET)

  Xilinx AXI host bridge driver:
   - Fix return value in case of error (Christophe JAILLET)

  Miscellaneous:
   - Make bus_attr_resource_alignment static (Ben Dooks)
   - Include <asm/dma.h> for isa_dma_bridge_buggy (Ben Dooks)
   - MAINTAINERS: Add file patterns for PCI device tree bindings (Geert Uytterhoeven)
   - Make host bridge drivers explicitly non-modular (Paul Gortmaker)"

* tag 'pci-v4.8-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (125 commits)
  PCI: xgene: Make explicitly non-modular
  PCI: thunder-pem: Make explicitly non-modular
  PCI: thunder-ecam: Make explicitly non-modular
  PCI: tegra: Make explicitly non-modular
  PCI: rcar-gen2: Make explicitly non-modular
  PCI: rcar: Make explicitly non-modular
  PCI: mvebu: Make explicitly non-modular
  PCI: layerscape: Make explicitly non-modular
  PCI: keystone: Make explicitly non-modular
  PCI: hisi: Make explicitly non-modular
  PCI: generic: Make explicitly non-modular
  PCI: designware-plat: Make it explicitly non-modular
  PCI: artpec6: Make explicitly non-modular
  PCI: armada8k: Make explicitly non-modular
  PCI: artpec: Add PCI_MSI_IRQ_DOMAIN dependency
  PCI: Add ACS quirk for Solarflare SFC9220
  arm64: dts: marvell: Add Aardvark PCIe support for Armada 3700
  PCI: aardvark: Add Aardvark PCI host controller driver
  dt-bindings: add DT binding for the Aardvark PCIe controller
  PCI: tegra: Program PADS_REFCLK_CFG* registers with per-SoC values
  ...
2016-08-02 17:12:29 -04:00
Linus Torvalds affe8a2abd MTD updates for v4.8:
NAND:
 
   Updates from Boris:
 
     """
     This pull request contains only one notable change:
     * Addition of the MTK NAND controller driver
 
     And a bunch of specific NAND driver improvements/fixes. Here are the
     changes that are worth mentioning:
     * A few fixes/improvements for the xway NAND controller driver
     * A few fixes for the sunxi NAND controller driver
     * Support for DMA in the sunxi NAND driver
     * Support for the sunxi NAND controller IP embedded in A23/A33 SoCs
     * Addition for bitflips detection in erased pages to the brcmnand driver
     * Support for new brcmnand IPs
     * Update of the OMAP-GPMC binding to support DMA channel description
     """
 
   In addition, some small fixes around error handling, etc., as well as one
   long-standing corner case issue (2.6.20, I think?) with writing 1 byte less
   than a page.
 
 NOR:
 
  * Rework some error handling on reads and writes, so we can better handle (for
    instance) SPI controllers which have limitations on their maximum transfer size
 
  * Add new Cadence Quad SPI flash controller driver
 
  * Add new Atmel QSPI flash controller driver
 
  * Add new Hisilicon SPI flash controller driver
 
  * Support a few new flash, and update supported features on others
 
  * Fix the logic used for detecting a fully-unlocked flash
 
 And other miscellaneous small fixes.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXn/j0AAoJEFySrpd9RFgto58P/j1huB0d21zFen3teo8YKKr1
 dLi65mFbqtpU1BLqD07uc2gsH17kvezFJCtx8H8Jp/yCjLF2kYIKL7wDTf6OJPtn
 aYGS5dG5jhMIq+6CD2olKqy+IVLfL9GvCf44Z3fpVta5lOn09y7Jm0AkBjmJcH45
 SdJi+iUkSKhqRY3O2udyauGyL4JWG7eHUonrG9g8ROrO0GWQkT4Ijm1ZwIBlFDkJ
 e1N960OqPg5ISOzuTeM14Ok9IUyeb7wiXhqRfOJgzNk6iBcN1YC2ono3C7RH2sN7
 wiyCqqUpDDCDBDiFdGOdpc9cjzNysrt02ypWRsZIpQVCm89nPLSutqQEWLuo0qzq
 /eIzdwbk1AxX96CeQohOezqL+n6+RHP9AIvwzL9GeWjipD1LBvfM1l3CmuSKK5jb
 bQ4CA/FVz1tO/25q8tuLJfpFzhFE2PC3pphVf8tREL/U6OR/97NgDMuQIuqiQpRc
 4nJtu79yacAzEiztZh0bsx+t94QFE+kfs/6d8m+llLEyx2sI8HKZeDvNRwEB0OsD
 wQ5bjyd54m7+i4H8njrnOTP+K2YrwNjGlbTo7qrRSpFMDr4mD0VQLap03Srvo5xV
 OYB6uGZhGV2L3k6nG5wywM2z6Hw0QfHxjrpcLyG51xABmni2QF3lOdYSllRtRTXp
 aYLfPUqgIgSUI2GTDHyW
 =HQr0
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-20160801' of git://git.infradead.org/linux-mtd

Pull MTD updates from Brian Norris:
 "NAND:

    Quoting Boris:
     'This pull request contains only one notable change:
       - Addition of the MTK NAND controller driver

      And a bunch of specific NAND driver improvements/fixes. Here are the
      changes that are worth mentioning:
       - A few fixes/improvements for the xway NAND controller driver
       - A few fixes for the sunxi NAND controller driver
       - Support for DMA in the sunxi NAND driver
       - Support for the sunxi NAND controller IP embedded in A23/A33 SoCs
       - Addition for bitflips detection in erased pages to the brcmnand driver
       - Support for new brcmnand IPs
       - Update of the OMAP-GPMC binding to support DMA channel description'

    In addition, some small fixes around error handling, etc., as well
    as one long-standing corner case issue (2.6.20, I think?) with
    writing 1 byte less than a page.

  NOR:

   - rework some error handling on reads and writes, so we can better
     handle (for instance) SPI controllers which have limitations on
     their maximum transfer size

   - add new Cadence Quad SPI flash controller driver

   - add new Atmel QSPI flash controller driver

   - add new Hisilicon SPI flash controller driver

   - support a few new flash, and update supported features on others

   - fix the logic used for detecting a fully-unlocked flash

  And other miscellaneous small fixes"

* tag 'for-linus-20160801' of git://git.infradead.org/linux-mtd: (60 commits)
  mtd: spi-nor: don't build Cadence QuadSPI on non-ARM
  mtd: mtk-nor: remove duplicated include from mtk-quadspi.c
  mtd: nand: fix bug writing 1 byte less than page size
  mtd: update description of MTD_BCM47XXSFLASH symbol
  mtd: spi-nor: Add driver for Cadence Quad SPI Flash Controller
  mtd: spi-nor: Bindings for Cadence Quad SPI Flash Controller driver
  mtd: nand: brcmnand: Change BUG_ON in brcmnand_send_cmd
  mtd: pmcmsp-flash: Allocating too much in init_msp_flash()
  mtd: maps: sa1100-flash: potential NULL dereference
  mtd: atmel-quadspi: add driver for Atmel QSPI controller
  mtd: nand: omap2: fix return value check in omap_nand_probe()
  Documentation: atmel-quadspi: add binding file for Atmel QSPI driver
  mtd: spi-nor: add hisilicon spi-nor flash controller driver
  mtd: spi-nor: support dual, quad, and WP for Gigadevice
  mtd: spi-nor: Added support for n25q00a.
  memory: Update dependency of IFC for Layerscape
  mtd: nand: jz4780: Update MODULE_AUTHOR email address
  mtd: nand: sunxi: prevent a small memory leak
  mtd: nand: sunxi: add reset line support
  mtd: nand: sunxi: update DT bindings
  ...
2016-08-02 17:05:11 -04:00
Linus Torvalds f716a85cd6 Merge branch 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild
Pull kbuild updates from Michal Marek:

 - GCC plugin support by Emese Revfy from grsecurity, with a fixup from
   Kees Cook.  The plugins are meant to be used for static analysis of
   the kernel code.  Two plugins are provided already.

 - reduction of the gcc commandline by Arnd Bergmann.

 - IS_ENABLED / IS_REACHABLE macro enhancements by Masahiro Yamada

 - bin2c fix by Michael Tautschnig

 - setlocalversion fix by Wolfram Sang

* 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
  gcc-plugins: disable under COMPILE_TEST
  kbuild: Abort build on bad stack protector flag
  scripts: Fix size mismatch of kexec_purgatory_size
  kbuild: make samples depend on headers_install
  Kbuild: don't add obj tree in additional includes
  Kbuild: arch: look for generated headers in obtree
  Kbuild: always prefix objtree in LINUXINCLUDE
  Kbuild: avoid duplicate include path
  Kbuild: don't add ../../ to include path
  vmlinux.lds.h: replace config_enabled() with IS_ENABLED()
  kconfig.h: allow to use IS_{ENABLE,REACHABLE} in macro expansion
  kconfig.h: use already defined macros for IS_REACHABLE() define
  export.h: use __is_defined() to check if __KSYM_* is defined
  kconfig.h: use __is_defined() to check if MODULE is defined
  kbuild: setlocalversion: print error to STDERR
  Add sancov plugin
  Add Cyclomatic complexity GCC plugin
  GCC plugin infrastructure
  Shared library support
2016-08-02 16:37:12 -04:00
Linus Torvalds 221bb8a46e - ARM: GICv3 ITS emulation and various fixes. Removal of the old
VGIC implementation.
 
 - s390: support for trapping software breakpoints, nested virtualization
 (vSIE), the STHYI opcode, initial extensions for CPU model support.
 
 - MIPS: support for MIPS64 hosts (32-bit guests only) and lots of cleanups,
 preliminary to this and the upcoming support for hardware virtualization
 extensions.
 
 - x86: support for execute-only mappings in nested EPT; reduced vmexit
 latency for TSC deadline timer (by about 30%) on Intel hosts; support for
 more than 255 vCPUs.
 
 - PPC: bugfixes.
 
 The ugly bit is the conflicts.  A couple of them are simple conflicts due
 to 4.7 fixes, but most of them are with other trees. There was definitely
 too much reliance on Acked-by here.  Some conflicts are for KVM patches
 where _I_ gave my Acked-by, but the worst are for this pull request's
 patches that touch files outside arch/*/kvm.  KVM submaintainers should
 probably learn to synchronize better with arch maintainers, with the
 latter providing topic branches whenever possible instead of Acked-by.
 This is what we do with arch/x86.  And I should learn to refuse pull
 requests when linux-next sends scary signals, even if that means that
 submaintainers have to rebase their branches.
 
 Anyhow, here's the list:
 
 - arch/x86/kvm/vmx.c: handle_pcommit and EXIT_REASON_PCOMMIT was removed
 by the nvdimm tree.  This tree adds handle_preemption_timer and
 EXIT_REASON_PREEMPTION_TIMER at the same place.  In general all mentions
 of pcommit have to go.
 
 There is also a conflict between a stable fix and this patch, where the
 stable fix removed the vmx_create_pml_buffer function and its call.
 
 - virt/kvm/kvm_main.c: kvm_cpu_notifier was removed by the hotplug tree.
 This tree adds kvm_io_bus_get_dev at the same place.
 
 - virt/kvm/arm/vgic.c: a few final bugfixes went into 4.7 before the
 file was completely removed for 4.8.
 
 - include/linux/irqchip/arm-gic-v3.h: this one is entirely our fault;
 this is a change that should have gone in through the irqchip tree and
 pulled by kvm-arm.  I think I would have rejected this kvm-arm pull
 request.  The KVM version is the right one, except that it lacks
 GITS_BASER_PAGES_SHIFT.
 
 - arch/powerpc: what a mess.  For the idle_book3s.S conflict, the KVM
 tree is the right one; everything else is trivial.  In this case I am
 not quite sure what went wrong.  The commit that is causing the mess
 (fd7bacbca4, "KVM: PPC: Book3S HV: Fix TB corruption in guest exit
 path on HMI interrupt", 2016-05-15) touches both arch/powerpc/kernel/
 and arch/powerpc/kvm/.  It's large, but at 396 insertions/5 deletions
 I guessed that it wasn't really possible to split it and that the 5
 deletions wouldn't conflict.  That wasn't the case.
 
 - arch/s390: also messy.  First is hypfs_diag.c where the KVM tree
 moved some code and the s390 tree patched it.  You have to reapply the
 relevant part of commits 6c22c98637, plus all of e030c1125e, to
 arch/s390/kernel/diag.c.  Or pick the linux-next conflict
 resolution from http://marc.info/?l=kvm&m=146717549531603&w=2.
 Second, there is a conflict in gmap.c between a stable fix and 4.8.
 The KVM version here is the correct one.
 
 I have pushed my resolution at refs/heads/merge-20160802 (commit
 3d1f53419842) at git://git.kernel.org/pub/scm/virt/kvm/kvm.git.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJXoGm7AAoJEL/70l94x66DugQIAIj703ePAFepB/fCrKHkZZia
 SGrsBdvAtNsOhr7FQ5qvvjLxiv/cv7CymeuJivX8H+4kuUHUllDzey+RPHYHD9X7
 U6n1PdCH9F15a3IXc8tDjlDdOMNIKJixYuq1UyNZMU6NFwl00+TZf9JF8A2US65b
 x/41W98ilL6nNBAsoDVmCLtPNWAqQ3lajaZELGfcqRQ9ZGKcAYOaLFXHv2YHf2XC
 qIDMf+slBGSQ66UoATnYV2gAopNlWbZ7n0vO6tE2KyvhHZ1m399aBX1+k8la/0JI
 69r+Tz7ZHUSFtmlmyByi5IAB87myy2WQHyAPwj+4vwJkDGPcl0TrupzbG7+T05Y=
 =42ti
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:

 - ARM: GICv3 ITS emulation and various fixes.  Removal of the
   old VGIC implementation.

 - s390: support for trapping software breakpoints, nested
   virtualization (vSIE), the STHYI opcode, initial extensions
   for CPU model support.

 - MIPS: support for MIPS64 hosts (32-bit guests only) and lots
   of cleanups, preliminary to this and the upcoming support for
   hardware virtualization extensions.

 - x86: support for execute-only mappings in nested EPT; reduced
   vmexit latency for TSC deadline timer (by about 30%) on Intel
   hosts; support for more than 255 vCPUs.

 - PPC: bugfixes.

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (302 commits)
  KVM: PPC: Introduce KVM_CAP_PPC_HTM
  MIPS: Select HAVE_KVM for MIPS64_R{2,6}
  MIPS: KVM: Reset CP0_PageMask during host TLB flush
  MIPS: KVM: Fix ptr->int cast via KVM_GUEST_KSEGX()
  MIPS: KVM: Sign extend MFC0/RDHWR results
  MIPS: KVM: Fix 64-bit big endian dynamic translation
  MIPS: KVM: Fail if ebase doesn't fit in CP0_EBase
  MIPS: KVM: Use 64-bit CP0_EBase when appropriate
  MIPS: KVM: Set CP0_Status.KX on MIPS64
  MIPS: KVM: Make entry code MIPS64 friendly
  MIPS: KVM: Use kmap instead of CKSEG0ADDR()
  MIPS: KVM: Use virt_to_phys() to get commpage PFN
  MIPS: Fix definition of KSEGX() for 64-bit
  KVM: VMX: Add VMCS to CPU's loaded VMCSs before VMPTRLD
  kvm: x86: nVMX: maintain internal copy of current VMCS
  KVM: PPC: Book3S HV: Save/restore TM state in H_CEDE
  KVM: PPC: Book3S HV: Pull out TM state save/restore into separate procedures
  KVM: arm64: vgic-its: Simplify MAPI error handling
  KVM: arm64: vgic-its: Make vgic_its_cmd_handle_mapi similar to other handlers
  KVM: arm64: vgic-its: Turn device_id validation into generic ID validation
  ...
2016-08-02 16:11:27 -04:00
Linus Torvalds 731c7d3a20 Merge tag 'drm-for-v4.8' of git://people.freedesktop.org/~airlied/linux
Merge drm updates from Dave Airlie:
 "This is the main drm pull request for 4.8.

  I'm down with a cold at the moment so hopefully this isn't in too bad
  a state, I finished pulling stuff last week mostly (nouveau fixes just
  went in today), so only this message should be influenced by illness.
  Apologies to anyone who's major feature I missed :-)

  Core:
        Lockless GEM BO freeing
        Non-blocking atomic work
        Documentation changes (rst/sphinx)
        Prep for new fencing changes
        Simple display helpers
        Master/auth changes
        Register/unregister rework
        Loads of trivial patches/fixes.

  New stuff:
        ARM Mali display driver (not the 3D chip)
        sii902x RGB->HDMI bridge

  Panel:
        Support for new panels
        Improved backlight support

  Bridge:
        Convert ADV7511 to bridge driver
        ADV7533 support
        TC358767 (DSI/DPI to eDP) encoder chip support

  i915:
        BXT support enabled by default
        GVT-g infrastructure
        GuC command submission and fixes
        BXT workarounds
        SKL/BKL workarounds
        Demidlayering device registration
        Thundering herd fixes
        Missing pci ids
        Atomic updates

  amdgpu/radeon:
        ATPX improvements for better dGPU power control on PX systems
        New power features for CZ/BR/ST
        Pipelined BO moves and evictions in TTM
        GPU scheduler improvements
        GPU reset improvements
        Overclocking on dGPUs with amdgpu
        Polaris powermanagement enabled

  nouveau:
        GK20A/GM20B volt and clock improvements.
        Initial support for GP100/GP104 GPUs, GP104 will not yet support
        acceleration due to NVIDIA having not released firmware for them as of yet.

  exynos:
        Exynos5433 SoC with IOMMU support.

  vc4:
        Shader validation for branching

  imx-drm:
        Atomic mode setting conversion
        Reworked DMFC FIFO allocation
        External bridge support

  analogix-dp:
        RK3399 eDP support
        Lots of fixes.

  rockchip:
        Lots of small fixes.

  msm:
        DT bindings cleanups
        Shrinker and madvise support
        ASoC HDMI codec support

  tegra:
        Host1x driver cleanups
        SOR reworking for DP support
        Runtime PM support

  omapdrm:
        PLL enhancements
        Header refactoring
        Gamma table support

  arcgpu:
        Simulator support

  virtio-gpu:
        Atomic modesetting fixes.

  rcar-du:
        Misc fixes.

  mediatek:
        MT8173 HDMI support

  sti:
        ASOC HDMI codec support
        Minor fixes

  fsl-dcu:
        Suspend/resume support
        Bridge support

  amdkfd:
        Minor fixes.

  etnaviv:
        Enable GPU clock gating

  hisilicon:
        Vblank and other fixes"

* tag 'drm-for-v4.8' of git://people.freedesktop.org/~airlied/linux: (1575 commits)
  drm/nouveau/gr/nv3x: fix instobj write offsets in gr setup
  drm/nouveau/acpi: fix lockup with PCIe runtime PM
  drm/nouveau/acpi: check for function 0x1B before using it
  drm/nouveau/acpi: return supported DSM functions
  drm/nouveau/acpi: ensure matching ACPI handle and supported functions
  drm/nouveau/fbcon: fix font width not divisible by 8
  drm/amd/powerplay: remove enable_clock_power_gatings_tasks from initialize and resume events
  drm/amd/powerplay: move clockgating to after ungating power in pp for uvd/vce
  drm/amdgpu: add query device id and revision id into system info entry at CGS
  drm/amdgpu: add new definition in bif header
  drm/amd/powerplay: rename smum header guards
  drm/amdgpu: enable UVD context buffer for older HW
  drm/amdgpu: fix default UVD context size
  drm/amdgpu: fix incorrect type of info_id
  drm/amdgpu: make amdgpu_cgs_call_acpi_method as static
  drm/amdgpu: comment out unused defaults_staturn_pro static const structure to fix the build
  drm/amdgpu: enable UVD VM only on polaris
  drm/amdgpu: increase timeout of IB test
  drm/amdgpu: add destroy session when generate VCE destroy msg.
  drm/amd: fix deadlock of job_list_lock V2
  ...
2016-08-01 21:44:08 -04:00
Asias He 06a8fc7836 VSOCK: Introduce virtio_vsock_common.ko
This module contains the common code and header files for the following
virtio_transporto and vhost_vsock kernel modules.

Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-08-02 02:57:29 +03:00
Linus Torvalds e48af7aaf1 Miscellaneous ia64 cleanups
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXn8QHAAoJEKurIx+X31iBA8MQAINFPoU1ZbRiEqfRvCzU/evR
 snyu/sh4uBlFm7o2cOg5atdPYV+28Gz+yW7Y/aWAF+ftrcPJnFEfQaBZG4P3pGYv
 g5hiAcL/0ITLsBL0WoOGjE0FxUdF9MpajIwocHmfghlUycmLk1AJTD1iUBzpLcnu
 yiO2IL1AvfNIeDJangsAlCe/xhdfdCGmODj5gDnOTmGKjH2Pd9jLHd12L/bt0J9s
 PY6iGFgmhD373VJGeGeakHZA+eLcGXzyBBcoNg1L63vGVIIgM62AQwr1IFfwNGEr
 4XwcQvetTTdvdjOnco5LF6IPjKW4IQ3pIy7SSP2b9LQheaKSnaLDuUt7kXh5jSC/
 ICXW/+TvVKA1oU0Audp+RaXdwXjI7Jc+HvEwSLsiR4/61A8iUqoF1qlEBv5MmYU5
 WZLoWQKTVuqg5HmTWfvF9LwBFuTHti8yjHt4LJ/lNryvmbYirZllXkB8CIY9fYsN
 u8aex71WiMV6qAVTb1zwHbeRzGRILK2/ngPfJ3QN4uS99rVcqTKO+gs2B0vl0VZj
 F399JSidnVFxeSwhpHwprgvOrBpGuQ5e6NpHCbYcYcdYRfvS7cMtOivfIJ5ENMp1
 bZcCF/NyBerBAckznEyISFkAR/9l1gNl4bGp00cX2Dd8CAkgLnAtlULbgB8461ae
 VioZS6lltaXPTTewUNo5
 =Jpj7
 -----END PGP SIGNATURE-----

Merge tag 'please-pull-misc-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux

Pull ia64 updates from Tony Luck:
 "Miscellaneous ia64 cleanups"

* tag 'please-pull-misc-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
  ia64: salinfo: use a waitqueue instead a sema down/up combo
  ia64: efi: use timespec64 for persistent clock
2016-08-01 18:55:31 -04:00
Linus Torvalds 43a0a98aa8 ARM: SoC driver updates for v4.8
Driver updates for ARM SoCs.
 
 A slew of changes this release cycle. The reset driver tree, that we merge
 through arm-soc for historical reasons, is also sizable this time around.
 
 Among the changes:
 
  - clps711x: Treewide changes to compatible strings, merged here for simplicity.
  - Qualcomm: SCM firmware driver cleanups, move to platform driver
  - ux500: Major cleanups, removal of old mach-specific infrastructure.
  - Atmel external bus memory driver
  - Move of brcmstb platform to the rest of bcm
  - PMC driver updates for tegra, various fixes and improvements
  - Samsung platform driver updates to support 64-bit Exynos platforms
  - Reset controller cleanups moving to devm_reset_controller_register() APIs
  - Reset controller driver for Amlogic Meson
  - Reset controller driver for Hisilicon hi6220
  - ARM SCPI power domain support
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXnm1XAAoJEIwa5zzehBx35lcP/ApuQarIXeZCQZtjlUBV9McW
 o3o7FhKFHePmEPeoYCvVeK5D8NykTkQv3WpnCknoxPJzxGJF7jbPWQJcVnXfKOXD
 kTcyIK15WL2HHtSE3lYyLfyUPz8AbJyRt0l0cxgcg6jvo+uzlWooNz1y78rLIYzg
 UwRssj7OiHv4dsyYRHZIsjnB8gMWw8rYMk154gP2xy6MnNXXzzOVVnOiVqxSZBm+
 EgIIcROMOqkkHuFlClMYKluIgrmgz1Ypjf+FuAg7dqXZd+TGRrmGXeI7SkGThfLu
 nyvY3N18NViNu7xOUkI9zg7+ifyYM8Si9ylalSICSJdIAxZfiwFqFaLJvVWKU1rY
 rBOBjKckQI0/X9WYusFNFHcijhIFV8/FgGAnVRRMPdvlCss7Zp03C9mR4AEhmKMX
 rLG49x81hU1C+LftC59ml3iB8dhZrrRkbxNHjLFHVGWNrKMrmJKa8JhXGRAoNM+u
 LRauiuJZatqvLfISNvpfcoW2EashVoU3f+uC8ymT3QCyME3wZm0t7T4tllxhMfBl
 sOgJqNkTKDmPLofwm/dASiLML7ZF1WePScrFyOACnj9K4mUD+OaCnowtWoQPu0eI
 aNmT84oosJ2S9F/iUDPtFHXdzQ+1QPPfSiQ9FXMoauciVq/2F+pqq68yYgqoxFOG
 vmkmG2YM4Wyq43u0BONR
 =O8+y
 -----END PGP SIGNATURE-----

Merge tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC driver updates from Olof Johansson:
 "Driver updates for ARM SoCs.

  A slew of changes this release cycle.  The reset driver tree, that we
  merge through arm-soc for historical reasons, is also sizable this
  time around.

  Among the changes:

   - clps711x: Treewide changes to compatible strings, merged here for simplicity.
   - Qualcomm: SCM firmware driver cleanups, move to platform driver
   - ux500: Major cleanups, removal of old mach-specific infrastructure.
   - Atmel external bus memory driver
   - Move of brcmstb platform to the rest of bcm
   - PMC driver updates for tegra, various fixes and improvements
   - Samsung platform driver updates to support 64-bit Exynos platforms
   - Reset controller cleanups moving to devm_reset_controller_register() APIs
   - Reset controller driver for Amlogic Meson
   - Reset controller driver for Hisilicon hi6220
   - ARM SCPI power domain support"

* tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (100 commits)
  ARM: ux500: consolidate base platform files
  ARM: ux500: move soc_id driver to drivers/soc
  ARM: ux500: call ux500_setup_id later
  ARM: ux500: consolidate soc_device code in id.c
  ARM: ux500: remove cpu_is_u* helpers
  ARM: ux500: use CLK_OF_DECLARE()
  ARM: ux500: move l2x0 init to .init_irq
  mfd: db8500 stop passing around platform data
  ASoC: ab8500-codec: remove platform data based probe
  ARM: ux500: move ab8500_regulator_plat_data into driver
  ARM: ux500: remove unused regulator data
  soc: raspberrypi-power: add CONFIG_OF dependency
  firmware: scpi: add CONFIG_OF dependency
  video: clps711x-fb: Changing the compatibility string to match with the smallest supported chip
  input: clps711x-keypad: Changing the compatibility string to match with the smallest supported chip
  pwm: clps711x: Changing the compatibility string to match with the smallest supported chip
  serial: clps711x: Changing the compatibility string to match with the smallest supported chip
  irqchip: clps711x: Changing the compatibility string to match with the smallest supported chip
  clocksource: clps711x: Changing the compatibility string to match with the smallest supported chip
  clk: clps711x: Changing the compatibility string to match with the smallest supported chip
  ...
2016-08-01 18:36:01 -04:00
Linus Torvalds fbae5cbb43 ARM: SoC platform updates for v4.8
Improved and new platform support for various SoCs:
 
  - New SoC support:
    - Broadcom BCM23550
    - Freescale i.MX7Solo
    - Qualcomm MDM9615
    - Renesas r8a7792
  - Conversion of clps711x to multiplatform
  - debug uart improvements for Atmel platforms
  - Tango platform improvements: HOTPLUG_CPU, Suspend-to-ram
  - OMAP tweaks and improvements to hwmod
  - OMAP support for kexec on SMP
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXnaibAAoJEIwa5zzehBx3h6AP/0TBATiDuYXTcX3V8zZ/ia9y
 7dWbP7gVX7DN39b5qdjLTa+DUx3Y3msxW9qsuUQR8RWijbqjCH7b/fyPwGA0fmpP
 3uZpFpyzs+6/3TiMDN1yw1T+/2YbVyM+4rOeNsCwncdXjGSx0FaMJAqLBrppiWLH
 1S9HhD/314znibl8skOy8QIDWwlW011sS2mNUIN+JelvnS/VDjtCDfpphpNrAQF9
 MZB6LhT9itvf6mIEGIsaDq/Ii7fgIAnA9WCtwv9tJkAZHzbS0cWkiJzb7hF1GzFO
 Q5HBAyzn+CkeTQ3+9NQU0G0vhfa3Ea0g1gfw6qRmAw+z8Qdiamjh8SSve6zm1fE8
 GmIewsMAWWIUYykEIi9hbWCTYq06Pw/Nn6KWRAuQ/lpt++jzMQ82qk6cxELLW15e
 uAC1JjFOCIFNBZhkrdQDU0qx6Ew/AUH4wCYqu4Xh7pW0MHu0V9NgsmooeoTmCkpd
 WtgKp8Wh5dsK3SdsbTjdR/IeHSQkeSdgNY/6TBTjpRwCIlEMwHlKbvwvRExk1xzi
 nLQJsR49MsjeSdPflzO6WUzOjJhQfuw2jCtAQjlom15EgkEZ569MT4RsAQIgvNCI
 PeUWkvIW1uCtW7Y6ADPRBKMIrajPs8YW4E/xTItuhrqLHp8z6efvRmVNdpzqBTVj
 tT2t2bRXF0cGiUvOeU7U
 =Kh9P
 -----END PGP SIGNATURE-----

Merge tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC platform updates from Olof Johansson:
 "Improved and new platform support for various SoCs:

  New SoC support:
   - Broadcom BCM23550
   - Freescale i.MX7Solo
   - Qualcomm MDM9615
   - Renesas r8a7792

  Improvements:
   - convert clps711x to multiplatform
   - debug uart improvements for Atmel platforms
   - Tango platform improvements: HOTPLUG_CPU, Suspend-to-ram
   - OMAP tweaks and improvements to hwmod
   - OMAP support for kexec on SMP"

* tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (109 commits)
  ARM: davinci: fix build break because of undeclared dm365_evm_snd_data
  ARM: s3c64xx: smartq: Avoid sparse warnings
  ARM: sti: Implement dummy L2 cache's write_sec
  ARM: STi: Update machine _namestr to be more generic.
  arm: meson: explicitly select clk drivers
  ARM: tango: add Suspend-to-RAM support
  ARM: hisi: consolidate the hisilicon machine entries
  ARM: tango: fix CONFIG_HOTPLUG_CPU=n build
  MAINTAINERS: Update BCM281XX/BCM11XXX/BCM216XX entry
  MAINTAINERS: Update BCM63XX entry
  MAINTAINERS: Add NS2 entry
  MAINTAINERS: Fix nsp false-positives
  MAINTAINERS: Change L to M for Broadcom ARM/ARM64 SoC entries
  ARM: debug: Enable DEBUG_BCM_5301X for Northstar Plus SoCs
  ARM: clps711x: Switch to MULTIPLATFORM
  ARM: clps711x: Remove boards support
  ARM: clps711x: Add basic DT support
  ARM: clps711x: Reduce static map size
  ARM: SAMSUNG: Constify iomem address passed to s5p_init_cpu
  ARM: oxnas: Change OX810SE default driver config
  ...
2016-08-01 18:27:08 -04:00
Linus Torvalds 6f888fe31d ARM: SoC cleanups for v4.8
The cleanup branch keeps going down in size as we've completed a lot of
 the major legacy platform removals and conversions.
 
 A handful of changes this time around, some of the themes or larger sets are:
 
  - A bunch of i.MX cleanups around platform detection, init call cleanups
  - Misc fixes of missing/implicit includes
  - Removal of ARCH_[WANT_OPTIONAL|REQUIRE]_GPIOLIB
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXnnwFAAoJEIwa5zzehBx3zXQP/2a/+XkiseeGkEoiX/6FOfhH
 XTzipye0OYdEe3kVWFL1sVVXRH6a5sbDJRNtfsQc+KdSG5i7LMHWARRJmIx9CTMB
 oQ9pEbYKSyBQDHBSOZYT6W+qYOI2SdTYqesjd3yn+FY4SIFBpQ/V3axHnMICIRm9
 PmHF1QUQEdtQ2Y9+E1vA1mHcPN9enjlesD3VdRbxVPX/PZw63kx9y8ICVq5I/PX9
 DfJRcA+PKIYQghhEZ0cx2bEoKozv7W088C7DD1Umw1NN18pMuvvNQGhid80xUqKY
 4bmLSGWqwmSzv1WZ/u1pUnBGGQE9YY1U2b8kZy8hSVg9rupxS8Ang0ztZRRE6nk2
 4t8GmWuLDH+7PxFv/skzi1AMAx+4KxSfp3N5qyKr8ddmnYrFWmBPj2AeBqrlziw6
 8Z41LQULmf/Gs6McikGUP7ryqd15gNtTJO1wlavqFrPe0fyzcHsgqpIy3YCqZiSE
 wQ4Hc036xqGknmg6GjHWp+W1rHZVGsnXmvnp1IVRoAGqwBqxNi4ItIXE7an8144H
 NnWFmPRSsGg26MfEJVsbtPQWNtEGqM2lgr6zn9xirC0cVbQ4ZDtWp2q0bJ1v/cLQ
 sW6Gu6jgVN8YUPp56lBaXJ5RxHE9V1Sqi4/+KghBKWW0X/BIo99b6PVr2bJRrkaq
 ZvpvsgzbCHdGqTptF9Dw
 =SfR/
 -----END PGP SIGNATURE-----

Merge tag 'armsoc-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC cleanups from Olof Johansson:
 "The cleanup branch keeps going down in size as we've completed a lot
  of the major legacy platform removals and conversions.

  A handful of changes this time around, some of the themes or larger
  sets are:

   - A bunch of i.MX cleanups around platform detection, init call cleanups
   - Misc fixes of missing/implicit includes
   - Removal of ARCH_[WANT_OPTIONAL|REQUIRE]_GPIOLIB"

* tag 'armsoc-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (40 commits)
  ARM: mps2: fix typo
  ARM: s3c64xx: avoid warning about 'struct device_node'
  bus: mvebu-mbus: make mvebu_mbus_syscore_ops static
  bus: mvebu-mbus: fix __iomem on register pointers
  ARM: tegra: Remove board_init_funcs array
  ARM: iop: Fix indentation
  ARM: imx: remove cpu_is_mx*()
  ARM: imx: remove last call to cpu_is_mx5*
  ARM: imx: rework mx27_pm_init() call
  ARM: imx: deconstruct mx3_idle
  ARM: imx: deconstruct mxc_rnga initialization
  ARM: imx: remove cpu_is_mx1 check
  ARM: i.MX: Do not explicitly call l2x0_of_init()
  ARM: i.MX: system.c: Tweak prefetch settings for performance
  ARM: i.MX: system.c: Replace magic numbers
  ARM: i.MX: system.c: Remove redundant errata 752271 code
  ARM: i.MX: system.c: Convert goto to if statement
  ARM: Kirkwood: fix kirkwood_pm_init() declaration/type
  ARM: Kirkwood: make kirkwood_disable_mbus_error_propagation() static
  ARM: orion5x: make orion5x_legacy_handle_irq static
  ...
2016-08-01 18:21:13 -04:00
Michael S. Tsirkin 1a93769399 virtio: new feature to detect IOMMU device quirk
The interaction between virtio and IOMMUs is messy.

On most systems with virtio, physical addresses match bus addresses,
and it doesn't particularly matter which one we use to program
the device.

On some systems, including Xen and any system with a physical device
that speaks virtio behind a physical IOMMU, we must program the IOMMU
for virtio DMA to work at all.

On other systems, including SPARC and PPC64, virtio-pci devices are
enumerated as though they are behind an IOMMU, but the virtio host
ignores the IOMMU, so we must either pretend that the IOMMU isn't
there or somehow map everything as the identity.

Add a feature bit to detect that quirk: VIRTIO_F_IOMMU_PLATFORM.

Any device with this feature bit set to 0 needs a quirk and has to be
passed physical addresses (as opposed to bus addresses) even though
the device is behind an IOMMU.

Note: it has to be a per-device quirk because for example, there could
be a mix of passed-through and virtual virtio devices. As another
example, some devices could be implemented by an out of process
hypervisor backend (in case of qemu vhost, or vhost-user) and so support
for an IOMMU needs to be coded up separately.

It would be cleanest to handle this in IOMMU core code, but that needs
per-device DMA ops. While we are waiting for that to be implemented, use
a work-around in virtio core.

Note: a "noiommu" feature is a quirk - add a wrapper to make
that clear.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-08-01 21:44:52 +03:00
Bjorn Helgaas 9454c23852 Merge branch 'pci/msi-affinity' into next
Conflicts:
	drivers/nvme/host/pci.c
2016-08-01 12:34:01 -05:00
Bjorn Helgaas 79dd993461 Merge branches 'pci/demodularize-hosts' and 'pci/host-request-windows' into next
* pci/demodularize-hosts:
  PCI: xgene: Make explicitly non-modular
  PCI: thunder-pem: Make explicitly non-modular
  PCI: thunder-ecam: Make explicitly non-modular
  PCI: tegra: Make explicitly non-modular
  PCI: rcar-gen2: Make explicitly non-modular
  PCI: rcar: Make explicitly non-modular
  PCI: mvebu: Make explicitly non-modular
  PCI: layerscape: Make explicitly non-modular
  PCI: keystone: Make explicitly non-modular
  PCI: hisi: Make explicitly non-modular
  PCI: generic: Make explicitly non-modular
  PCI: designware-plat: Make it explicitly non-modular
  PCI: artpec6: Make explicitly non-modular
  PCI: armada8k: Make explicitly non-modular
  PCI: artpec: Add PCI_MSI_IRQ_DOMAIN dependency
  PCI: artpec: Add Axis ARTPEC-6 PCIe controller driver
  PCI: Add DT binding for Axis ARTPEC-6 PCIe controller
  PCI: generic: Select IRQ_DOMAIN

* pci/host-request-windows:
  PCI: versatile: Simplify host bridge window iteration
  PCI: versatile: Request host bridge window resources with core function
  PCI: tegra: Request host bridge window resources with core function
  PCI: tegra: Remove top-level resource from hierarchy
  PCI: rcar: Simplify host bridge window iteration
  PCI: rcar: Request host bridge window resources with core function
  PCI: rcar Gen2: Request host bridge window resources
  PCI: rcar: Drop gen2 dummy I/O port region
  ARM: Make PCI I/O space optional
  PCI: mvebu: Request host bridge window resources with core function
  PCI: generic: Simplify host bridge window iteration
  PCI: generic: Request host bridge window resources with core function
  PCI: altera: Simplify host bridge window iteration
  PCI: altera: Request host bridge window resources with core function
  PCI: xilinx-nwl: Use dev_printk() when possible
  PCI: xilinx-nwl: Request host bridge window resources
  PCI: xilinx-nwl: Free bridge resource list on failure
  PCI: xilinx: Request host bridge window resources
  PCI: xilinx: Free bridge resource list on failure
  PCI: xgene: Request host bridge window resources
  PCI: xgene: Free bridge resource list on failure
  PCI: iproc: Request host bridge window resources
  PCI: designware: Simplify host bridge window iteration
  PCI: designware: Request host bridge window resources
  PCI: designware: Free bridge resource list on failure
  PCI: Add devm_request_pci_bus_resources()
2016-08-01 12:23:57 -05:00
Bjorn Helgaas 3efc702378 Merge branch 'pci/resource' into next
* pci/resource:
  unicore32/PCI: Remove pci=firmware command line parameter handling
  ARM/PCI: Remove arch-specific pcibios_enable_device()
  ARM64/PCI: Remove arch-specific pcibios_enable_device()
  MIPS/PCI: Claim bus resources on PCI_PROBE_ONLY set-ups
  ARM/PCI: Claim bus resources on PCI_PROBE_ONLY set-ups
  PCI: generic: Claim bus resources on PCI_PROBE_ONLY set-ups
  PCI: Add generic pci_bus_claim_resources()
  alx: Use pci_(request|release)_mem_regions
  ethernet/intel: Use pci_(request|release)_mem_regions
  GenWQE: Use pci_(request|release)_mem_regions
  lpfc: Use pci_(request|release)_mem_regions
  NVMe: Use pci_(request|release)_mem_regions
  PCI: Add helpers to request/release memory and I/O regions
  PCI: Extending pci=resource_alignment to specify device/vendor IDs
  sparc/PCI: Implement pci_resource_to_user() with pcibios_resource_to_bus()
  powerpc/pci: Implement pci_resource_to_user() with pcibios_resource_to_bus()
  microblaze/PCI: Implement pci_resource_to_user() with pcibios_resource_to_bus()
  PCI: Unify pci_resource_to_user() declarations
  microblaze/PCI: Remove useless __pci_mmap_set_pgprot()
  powerpc/pci: Remove __pci_mmap_set_pgprot()
  PCI: Ignore write combining when mapping I/O port space
2016-08-01 12:23:44 -05:00
Bjorn Helgaas a00c74c166 Merge branches 'pci/aspm', 'pci/dpc', 'pci/hotplug', 'pci/misc', 'pci/msi', 'pci/pm' and 'pci/virtualization' into next
* pci/aspm:
  PCI/ASPM: Remove redundant check of pcie_set_clkpm

* pci/dpc:
  PCI: Remove DPC tristate module option
  PCI: Bind DPC to Root Ports as well as Downstream Ports
  PCI: Fix whitespace in struct dpc_dev
  PCI: Convert Downstream Port Containment driver to use devm_* functions

* pci/hotplug:
  PCI: Allow additional bus numbers for hotplug bridges

* pci/misc:
  PCI: Include <asm/dma.h> for isa_dma_bridge_buggy
  PCI: Make bus_attr_resource_alignment static
  MAINTAINERS: Add file patterns for PCI device tree bindings
  PCI: Fix comment typo

* pci/msi:
  PCI/MSI: irqchip: Fix PCI_MSI dependencies

* pci/pm:
  PCI: pciehp: Ignore interrupts during D3cold
  PCI: Document connection between pci_power_t and hardware PM capability
  PCI: Add runtime PM support for PCIe ports
  ACPI / hotplug / PCI: Runtime resume bridge before rescan
  PCI: Power on bridges before scanning new devices
  PCI: Put PCIe ports into D3 during suspend
  PCI: Don't clear d3cold_allowed for PCIe ports
  PCI / PM: Enforce type casting for pci_power_t

* pci/virtualization:
  PCI: Add ACS quirk for Solarflare SFC9220
  PCI: Add DMA alias quirk for Adaptec 3805
  PCI: Mark Atheros AR9485 and QCA9882 to avoid bus reset
  PCI: Add function 1 DMA alias quirk for Marvell 88SE9182
2016-08-01 12:23:31 -05:00
Linus Torvalds 06e23d5115 - Core Frameworks
- New API to call bespoke pre/post IRQ handlers; Regmap
  - New Device Support
    - Add support for RN5T567 to rn5t618
    - Add support for COMe-cSL6 and COMe-mAL10 to kempld-core
  - New Functionality
    - Add support for USB Power Supply to axp20x
    - Add support for Power Key to hi655x-pmic
  - Fix-ups
    - Update MAINTAINERS; Dialog, Altera
    - Remove module support; max77843, max77620, max8998, max8997, max8925-i2c
    - Add module support; max14577
    - Constifying; max77620
    - Allow bespoke IRQ masking/unmasking; max77620
    - Remove superfluous code; arizona, qcom_rpm, smsc-ece1099
    - Power Management fixups; arizona-core
    - Error-path improvement; twl-core, dm355evm_msp, smsc-ece1099, hi655x
    - Clocking fixups; twl6040
    - Trivial (spelling, headers, coding-style, whitespace, (re)naming);
        si476x-i2c, omap-usb-tll, ti_am335x_tscadc, tps6507, hi655x-pmic
 - Bug Fixes
    - Fix offset error for MSM8660; qcom_rpm
    - Fix possible spurious IRQs; arizona, hi655x-pmic
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJXnxUiAAoJEFGvii+H/HdhjY4P/i9EeGo4rSKyMNYvSIV+FwpB
 W63eOeEf08EzLfaTyDTnJ9FO53UvcJEXvVE7jO5oYWI8g5y5zBinbwk041uUIeuN
 5gv/3quW0kiOozd+6VBghaLYbj37ZZLqBFXc0sYpHNwcIrfVboTaiUrFkl9wUXKQ
 Gbi7mAw0JhgzigQX2oPmylsdJ2u+MuwxLOcL43HoYLQ6E9b5ZmMqHqz0X7zwjh2c
 og+WYSl2VOe001fRoLZIPPkE6r61IWZ85oojuZKeQnIiadZWXGwY3XMcYDcheVuz
 6BC/f5Vt9PacDPjuInfI4Mv8ROmhnvoNgB9xgoUDROAn5m5FgqdHGZPK40hE+uVy
 cWRQjdI9CiTTcogGepJGY57OlB7T7q+Peue00knjQiiYdtS2jU4E0xNOWPO/8ILD
 EV8yeYbD9yghhA1vFc5pzOEQnBD9q9zX40N36vJp4cgl6jO25R/PkLeTg5KufFoi
 6+H9VAgqevg4/D8R63Yj8ANdEAvTVupLcpzwORgz89jKF4x5fDU782L5nGifx6dt
 hNlan00bLrSL6sWtAqCvezSNlF2Aqb6qHhdnvIV93VGeteVMU6N1wRB3JH08HFQI
 C5t+qFqlxJn3R5E8ur/6ZhtkiPnC4qCVR+r9uyH6Aix9bR1W18o0jgypuZMu7E+h
 czIJMLv97drOVtzYsosz
 =KB2d
 -----END PGP SIGNATURE-----

Merge tag 'mfd-for-linus-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd

Pull MFD updates from Lee Jones:
 "Core Framework:
   - New API to call bespoke pre/post IRQ handlers; Regmap

  New Device Support:
   - Add support for RN5T567 to rn5t618
   - Add support for COMe-cSL6 and COMe-mAL10 to kempld-core

  New Functionality:
   - Add support for USB Power Supply to axp20x
   - Add support for Power Key to hi655x-pmic

  Fix-ups:
   - Update MAINTAINERS; Dialog, Altera
   - Remove module support; max77843, max77620, max8998, max8997, max8925-i2c
   - Add module support; max14577
   - Constifying; max77620
   - Allow bespoke IRQ masking/unmasking; max77620
   - Remove superfluous code; arizona, qcom_rpm, smsc-ece1099
   - Power Management fixups; arizona-core
   - Error-path improvement; twl-core, dm355evm_msp, smsc-ece1099, hi655x
   - Clocking fixups; twl6040
   - Trivial (spelling, headers, coding-style, whitespace, (re)naming);
       si476x-i2c, omap-usb-tll, ti_am335x_tscadc, tps6507, hi655x-pmic

  Bug Fixes:
   - Fix offset error for MSM8660; qcom_rpm
   - Fix possible spurious IRQs; arizona, hi655x-pmic"

* tag 'mfd-for-linus-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: (42 commits)
  mfd: qcom_rpm: Parametrize also ack selector size
  mfd: twl6040: Handle mclk used for HPPLL and optional internal clock source
  mfd: Add support for COMe-cSL6 and COMe-mAL10 to Kontron PLD driver
  mfd: hi655x: Fix return value check in hi655x_pmic_probe()
  mfd: smsc-ece1099: Return directly after a function failure in smsc_i2c_probe()
  mfd: smsc-ece1099: Delete an unnecessary variable initialisation in smsc_i2c_probe()
  mfd: dm355evm_msp: Return directly after a failed platform_device_alloc() in add_child()
  mfd: twl-core: Refactoring for add_numbered_child()
  mfd: twl-core: Return directly after a failed platform_device_alloc() in add_numbered_child()
  mfd: arizona: Add missing disable of PM runtime on probe error path
  mfd: stmpe: Move platform data into MFD driver
  mfd: max14577: Allow driver to be built as a module
  mfd: max14577: Use module_init() instead of subsys_initcall()
  mfd: arizona: Remove some duplicate defines
  mfd: qcom_rpm: Remove unused define
  mfd: hi655x-pmic: Add powerkey device to hi655x PMIC driver
  mfd: hi655x-pmic: Rename some interrupt macro names
  mfd: hi655x-pmic: Fixup issue with un-acked interrupts
  mfd: arizona: Check if AOD interrupts are pending before dispatching
  mfd: qcom_rpm: Fix offset error for msm8660
  ...
2016-08-01 07:28:14 -04:00
Linus Torvalds dd9671172a IOMMU Updates for Linux v4.8
In the updates:
 
 	* Big endian support and preparation for defered probing for the
 	  Exynos IOMMU driver
 
 	* Simplifications in iommu-group id handling
 
 	* Support for Mediatek generation one IOMMU hardware
 
 	* Conversion of the AMD IOMMU driver to use the generic IOVA
 	  allocator. This driver now also benefits from the recent
 	  scalability improvements in the IOVA code.
 
 	* Preparations to use generic DMA mapping code in the Rockchip
 	  IOMMU driver
 
 	* Device tree adaption and conversion to use generic page-table
 	  code for the MSM IOMMU driver
 
 	* An iova_to_phys optimization in the ARM-SMMU driver to greatly
 	  improve page-table teardown performance with VFIO
 
 	* Various other small fixes and conversions
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJXl3e+AAoJECvwRC2XARrjMIgP/1Mm9qIfcaAxKY4ByqbVfrH8
 313PO6rpwUhhywUmnf/1F/x+JbuLv8MmRXfSc106mdB1rq9NXpkORYKrqVxs0cSq
 6u6TzZWbF6WN1ipqXxDITNFBSy7u97K1VuFaKyYFfLbg8xrkcdkMZJ7BqM2xIEdk
 rnRKcfHo6wsmCXJ6InsUPmKAqU6AfMewZTGjO+v77Gce0rZEbsJ8n7BRKC9vO2bc
 akvN2W+zzEUSyhbuyYQBG+agpmC5GJvz4u+6QvAP5sxTWfAsnwAoPpP4xxR+/KjT
 eicHlja4v0YK6Hr4AJaMxoKfKIrCdqpWm0D2tg/edyWZCeg98AW/w7/s0I8OD3ao
 Otj6IqC8nPk0pYciOeEPQ7aqPbvKAqU2FYWt7lWamrdr98u2R3p2nXGl0KthoAj6
 JqzrCZXvBS7sj1IPLlGpj939yvbKbjpE0p7y1qhI1VEBXoBWFNvlKydkYx76BTGK
 F6paGVqn2Zwy00AqAsylTEkvIK063zwShZ6nPqz4bMdVlgzjrjCzdDecjfbHr8Ic
 6D2oCwyF+RJ8qw+Ecm9EmWFik80sgb+iUTeeYEXNf+YzLYt5McIj7fi3N+sUPel3
 YJ4S4x0sIpgUZZ1i+rOo8ZPAFHRU6SRPYV+ewaeYKrMt+Un5dTn9SddpqrJdbiUu
 YrF36BaQjc123IRGKrSd
 =xiS2
 -----END PGP SIGNATURE-----

Merge tag 'iommu-updates-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull IOMMU updates from Joerg Roedel:

 - big-endian support and preparation for defered probing for the Exynos
   IOMMU driver

 - simplifications in iommu-group id handling

 - support for Mediatek generation one IOMMU hardware

 - conversion of the AMD IOMMU driver to use the generic IOVA allocator.
   This driver now also benefits from the recent scalability
   improvements in the IOVA code.

 - preparations to use generic DMA mapping code in the Rockchip IOMMU
   driver

 - device tree adaption and conversion to use generic page-table code
   for the MSM IOMMU driver

 - an iova_to_phys optimization in the ARM-SMMU driver to greatly
   improve page-table teardown performance with VFIO

 - various other small fixes and conversions

* tag 'iommu-updates-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (59 commits)
  iommu/amd: Initialize dma-ops domains with 3-level page-table
  iommu/amd: Update Alias-DTE in update_device_table()
  iommu/vt-d: Return error code in domain_context_mapping_one()
  iommu/amd: Use container_of to get dma_ops_domain
  iommu/amd: Flush iova queue before releasing dma_ops_domain
  iommu/amd: Handle IOMMU_DOMAIN_DMA in ops->domain_free call-back
  iommu/amd: Use dev_data->domain in get_domain()
  iommu/amd: Optimize map_sg and unmap_sg
  iommu/amd: Introduce dir2prot() helper
  iommu/amd: Implement timeout to flush unmap queues
  iommu/amd: Implement flush queue
  iommu/amd: Allow NULL pointer parameter for domain_flush_complete()
  iommu/amd: Set up data structures for flush queue
  iommu/amd: Remove align-parameter from __map_single()
  iommu/amd: Remove other remains of old address allocator
  iommu/amd: Make use of the generic IOVA allocator
  iommu/amd: Remove special mapping code for dma_ops path
  iommu/amd: Pass gfp-flags to iommu_map_page()
  iommu/amd: Implement apply_dm_region call-back
  iommu/amd: Create a list of reserved iova addresses
  ...
2016-08-01 07:25:10 -04:00
Linus Torvalds 77d9ada23f Merge branch 'mailbox-for-next' of git://git.linaro.org/landing-teams/working/fujitsu/integration
Pull mailbox updates from Jussi Brar:
 "Broadcom:
   - New PDC controller driver and bindings

  Misc:
   - PL320 - Convert from 'raw' IO to 'relaxed' version
   - Test - fix dangling pointer"

* 'mailbox-for-next' of git://git.linaro.org/landing-teams/working/fujitsu/integration:
  mailbox: Fix format and type mismatches in Broadcom PDC driver
  mailbox: Add Broadcom PDC mailbox driver
  dt-bindings: add bindings documentation for PDC driver.
  mailbox: pl320: remove __raw IO
  mailbox: mailbox-test: set tdev->signal to NULL after freeing
2016-08-01 07:23:29 -04:00
Linus Torvalds 07f00f06ba MMC core:
- A couple of changes to improve the support for erase/discard/trim cmds
  - Add eMMC HS400 enhanced strobe support
  - Show OCR and DSR registers in SYSFS for MMC/SD cards
  - Correct and improve busy detection logic for MMC switch (CMD6) cmds
  - Disable HPI cmds for certain broken Hynix eMMC cards
  - Allow MMC hosts to specify non-support for SD and MMC cmds
  - Some minor additional fixes
 
 MMC host:
  - sdhci: Re-works, fixes and clean-ups
  - sdhci: Add HW auto re-tuning support
  - sdhci: Re-factor code to prepare for adding support for eMMC CMDQ
  - sdhci-esdhc-imx: Fixes and clean-ups
  - sdhci-esdhc-imx: Update system PM support
  - sdhci-esdhc-imx: Enable HW auto re-tuning
  - sdhci-bcm2835: Remove driver as sdhci-iproc is used instead
  - sdhci-brcmstb: Add new driver for Broadcom BRCMSTB SoCs
  - sdhci-msm: Add support for UHS cards
  - sdhci-tegra: Improve support for UHS cards
  - sdhci-of-arasan: Update phy support for Rockchip SoCs
  - sdhci-of-arasan: Deploy enhanced strobe support
  - dw_mmc: Some fixes and clean-ups
  - dw_mmc: Enable support for erase/discard/trim cmds
  - dw_mmc: Enable CMD23 support
  - mediatek: Some fixes related to the eMMC HS400 support
  - sh_mmcif: Improve support for HW busy detection
  - rtsx_pci: Enable support for erase/discard/trim cmds
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXnibNAAoJEP4mhCVzWIwpKIQQAM2iTx5ROuOe3/CmWEAv5phg
 XpgmuWz/R/KQH1wyDor4zEJUE5mUYpvSg9vSCByYnVwHpoNKg4R0t0OJZJSN05Z0
 njdKCtkVXlRaQitX3tiFQazkMLaQER0tRrqrHjYZYkBgr8DnSEf00aueSGshrr/p
 g/JE7Z+eGVJx6d3Lvd24NMVZYSGbVq5lY/JTbURkKR5C9OhEe5GMfUsw1M+aVhUo
 pvxeGVYuIr4WsP+c+roa6czTO8VMrsd71syLDCweWMk+GOHvaSjpzvdC7iFGAl4B
 CoN6f3LTpJGpDCBWnqp/yQMvrMsR1SKlrwOUph4XEAXZCaz5WtF8+vhQaK7gwYGO
 RiyIIh04Bup+gzY6w0Mwgsw0jAkw5ahXl6xknD58q/sPEdlKe0ATyxl0oUwDntc0
 957BtmOvChYG13Q5pW6VNef0jKW6LOU+brJKEGxtr8x9OPeCiQOBO/GCUTg3IwA5
 ohylmjS+/6G3G4bgGvfXyGvObOsdAh3RQ8g7/1pVb0hQobDhxO5PjtTqVtelfFJk
 PRdKKIWhROiNMrpVLZnL7+OLkusEQ4s0Lmq08T1dVXvV+hLITwyTyWYmBr6Kb51B
 2x3DO7PcfEyytSn3TjPjdp7Wax+19YL+tMmu2QAlXv3iHSKFAH17nLk0bPO6VOWZ
 FHzKKutm4TwZJ/fyyiqP
 =wzSC
 -----END PGP SIGNATURE-----

Merge tag 'mmc-v4.8' of git://git.linaro.org/people/ulf.hansson/mmc

Pull MMC updates from Ulf Hansson:
 "MMC core:
   - A couple of changes to improve the support for erase/discard/trim cmds
   - Add eMMC HS400 enhanced strobe support
   - Show OCR and DSR registers in SYSFS for MMC/SD cards
   - Correct and improve busy detection logic for MMC switch (CMD6) cmds
   - Disable HPI cmds for certain broken Hynix eMMC cards
   - Allow MMC hosts to specify non-support for SD and MMC cmds
   - Some minor additional fixes

  MMC host:
   - sdhci: Re-works, fixes and clean-ups
   - sdhci: Add HW auto re-tuning support
   - sdhci: Re-factor code to prepare for adding support for eMMC CMDQ
   - sdhci-esdhc-imx: Fixes and clean-ups
   - sdhci-esdhc-imx: Update system PM support
   - sdhci-esdhc-imx: Enable HW auto re-tuning
   - sdhci-bcm2835: Remove driver as sdhci-iproc is used instead
   - sdhci-brcmstb: Add new driver for Broadcom BRCMSTB SoCs
   - sdhci-msm: Add support for UHS cards
   - sdhci-tegra: Improve support for UHS cards
   - sdhci-of-arasan: Update phy support for Rockchip SoCs
   - sdhci-of-arasan: Deploy enhanced strobe support
   - dw_mmc: Some fixes and clean-ups
   - dw_mmc: Enable support for erase/discard/trim cmds
   - dw_mmc: Enable CMD23 support
   - mediatek: Some fixes related to the eMMC HS400 support
   - sh_mmcif: Improve support for HW busy detection
   - rtsx_pci: Enable support for erase/discard/trim cmds"

* tag 'mmc-v4.8' of git://git.linaro.org/people/ulf.hansson/mmc: (135 commits)
  mmc: rtsx_pci: Remove deprecated create_singlethread_workqueue
  mmc: rtsx_pci: Enable MMC_CAP_ERASE to allow erase/discard/trim requests
  mmc: rtsx_pci: Use the provided busy timeout from the mmc core
  mmc: sdhci-pltfm: Drop define for SDHCI_PLTFM_PMOPS
  mmc: sdhci-pltfm: Convert to use the SET_SYSTEM_SLEEP_PM_OPS
  mmc: sdhci-pltfm: Make sdhci_pltfm_suspend|resume() static
  mmc: sdhci-esdhc-imx: Use common sdhci_suspend|resume_host()
  mmc: sdhci-esdhc-imx: Assign system PM ops within #ifdef CONFIG_PM_SLEEP
  mmc: sdhci-sirf: Remove non needed #ifdef CONFIG_PM* for dev_pm_ops
  mmc: sdhci-s3c: Remove non needed #ifdef CONFIG_PM for dev_pm_ops
  mmc: sdhci-pxav3: Remove non needed #ifdef CONFIG_PM for dev_pm_ops
  mmc: sdhci-of-esdhc: Simplify code by using SIMPLE_DEV_PM_OPS
  mmc: sdhci-acpi: Simplify code by using SET_SYSTEM_SLEEP_PM_OPS
  mmc: sdhci-pci-core: Simplify code by using SET_SYSTEM_SLEEP_PM_OPS
  mmc: Change the max discard sectors and erase response when HW busy detect
  phy: rockchip-emmc: Wait even longer for the DLL to lock
  phy: rockchip-emmc: Be tolerant to card clock of 0 in power on
  mmc: sdhci-of-arasan: Revert: Always power the PHY off/on when clock changes
  mmc: sdhci-msm: Add support for UHS cards
  mmc: sdhci-msm: Add set_uhs_signaling() implementation
  ...
2016-07-31 21:36:58 -04:00
Linus Torvalds 27acbec338 Merge git://www.linux-watchdog.org/linux-watchdog
Pull watchdog updates from Wim Van Sebroeck:
 "Core:
   - min and max timeout improvements, WDOG_HW_RUNNING improvements,
     status funtionality
   - Add a device managed API for watchdog_register_device()

  New watchdog drivers:
   -  Aspeed SoCs
   -  Maxim PMIC MAX77620
   -  Amlogic Meson GXBB SoC

  Enhancements:
   - support for the r8a7796 watchdog device
   - support for F81866 watchdog device
   - support for 5th variation of Apollo Lake
   - support for MCP78S chipset
   - clean-up of softdog.c watchdog device driver
   - pic32-wdt and pic32-dmt fixes
   - Documentation/watchdog: watchdog-test improvements
   - several other fixes and improvements"

* git://www.linux-watchdog.org/linux-watchdog: (50 commits)
  watchdog: gpio_wdt: Fix missing platform_set_drvdata() in gpio_wdt_probe()
  watchdog: core: Clear WDOG_HW_RUNNING before calling the stop function
  watchdog: core: Fix error handling of watchdog_dev_init()
  watchdog: pic32-wdt: Fix return value check in pic32_wdt_drv_probe()
  watchdog: pic32-dmt: Remove .owner field for driver
  watchdog: pic32-wdt: Remove .owner field for driver
  watchdog: renesas-wdt: Add support for the r8a7796 wdt
  Documentation/watchdog: check return value for magic close
  watchdog: sbsa: Drop status function
  watchdog: Implement status function in watchdog core
  watchdog: tangox: Set max_hw_heartbeat_ms instead of max_timeout
  watchdog: change watchdog_need_worker logic
  watchdog: add support for MCP78S chipset in nv_tco
  watchdog: bcm2835_wdt: remove redundant ->set_timeout callback
  watchdog: bcm2835_wdt: constify _ops and _info structures
  dt-bindings: watchdog: Add Meson GXBB Watchdog bindings
  watchdog: Add Meson GXBB Watchdog Driver
  watchdog: qcom: configure BARK time in addition to BITE time
  watchdog: qcom: add option for standalone watchdog not in timer block
  watchdog: qcom: update device tree bindings
  ...
2016-07-31 21:32:22 -04:00
Al Viro 6fa67e7075 get rid of 'parent' argument of ->d_compare()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-07-31 16:37:25 -04:00
Linus Torvalds c9b95e5961 sound updates for 4.8
Majority of this update is about ASoC, including a few new drivers,
 and the rest are mostly minor changes.  The only substantial change in
 ALSA core is about the additional error handling in the
 compress-offload API.  Below are highlights:
 
 - Add the error propagating support in compress-offload API
 
 - HD-audio: a usual Dell headset fixup, an Intel HDMI/DP fix, and the
   default mixer setup change ot turn off the loopback
 
 - Lots of updates for ASoC Intel drivers, mostly board support and bug
   fixing, and to the NAU8825 driver
 
 - Work on generalizing bits of simple-card to allow more code sharing
   with the Renesas rsrc-card (which can't use simple-card due to DPCM)
 
 - Removal of the Odroid X2 driver due to replacement with simple-card
 
 - Support for several new Mediatek platforms and associated boards
 
 - New ASoC drivers for Allwinner A10, Analog Devices ADAU7002, Broadcom
   Cygnus, Cirrus Logic CS35L33 and CS53L30, Maxim MAX8960 and MAX98504,
   Realtek RT5514 and Wolfson WM8758
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIrBAABCAAVBQJXnaohDhx0aXdhaUBzdXNlLmRlAAoJEGwxgFQ9KSmkdh0P+wde
 9YXRP7lugnai2keRR5ibev118qHvwcoLycdNSSutjydK1gbZtRNcVOiAO9KRKqxs
 nv4lM7TduFHDjsBrD26iMxFHqdnyuI2Xd5nddpPfTteCLtLLu72RYrRrubZB/SC9
 Pi2PwMg+FkwaAG+/HXcWFIDNIxqhUXg2jHIhWlvWyvEvIZgJntvZaD1vw3NQqmE/
 EAbafpdNXewY2IPxsQ+4zVD6JUTh4RTY790R4vZptO7E0oRksmffzAbX+XiYpiam
 pVwZAhyj05jy4DGsio89ufA51EC58oV8xfo6jxQh6wiXjr7ArhBh/60EoV7VMYh7
 aouFQ+EOox3p/rcoyaaHbGROnCk0WIc2milQHM41F3Q6iC0fxOep14kSjxOXXqYg
 tgIjduptExh9XsEdpTCHfnmOyXEq+7PFbqmfluoKvG1q//k4u3d8SEicBfJMuWO+
 Fb/v2ngii0r3D7rwsl4ONEShJdVd+mbpg2d6DoOu1/UYMfTGaxR/vF3DD7gCmn0A
 qikZkOnmmVttDVc8YlsmMZyo2ATU3AWsnNhdZGDGGT5IaAppZ+h3H5JZKNhxo5my
 cK6JbSMCljkmcc8GM990P6BNngya3doiu5aox8hM2uWugHcWYvjyoigGUhGVJdlT
 DgGtxlZ+5dQptdP0gAe5qO/Fzn9gfwvZQZxgSdaS
 =zx3K
 -----END PGP SIGNATURE-----

Merge tag 'sound-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound updates from Takashi Iwai:
 "The majority of this update is about ASoC, including a few new
  drivers, and the rest are mostly minor changes.  The only substantial
  change in ALSA core is about the additional error handling in the
  compress-offload API.  Below are highlights:

   - Add the error propagating support in compress-offload API

   - HD-audio: a usual Dell headset fixup, an Intel HDMI/DP fix, and the
     default mixer setup change ot turn off the loopback

   - Lots of updates for ASoC Intel drivers, mostly board support and
     bug fixing, and to the NAU8825 driver

   - Work on generalizing bits of simple-card to allow more code sharing
     with the Renesas rsrc-card (which can't use simple-card due to DPCM)

   - Removal of the Odroid X2 driver due to replacement with simple-card

   - Support for several new Mediatek platforms and associated boards

   - New ASoC drivers for Allwinner A10, Analog Devices ADAU7002,
     Broadcom Cygnus, Cirrus Logic CS35L33 and CS53L30, Maxim MAX8960
     and MAX98504, Realtek RT5514 and Wolfson WM8758"

* tag 'sound-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (278 commits)
  sound: oss: Use kernel_read_file_from_path() for mod_firmware_load()
  ASoC: Intel: Skylake: Delete an unnecessary check before the function call "release_firmware"
  ASoC: Intel: Skylake: Fix NULL Pointer exception in dynamic_debug.
  ASoC: samsung: Specify DMA channels through struct snd_dmaengine_pcm_config
  ASoC: samsung: Fix error paths in the I2S driver's probe()
  ASoC: cs53l30: Fix bit shift issue of TDM mode
  ASoC: cs53l30: Fix a bug for TDM slot location validation
  ASoC: rockchip: correct the spdif clk
  ALSA: echoaudio: purge contradictions between dimension matrix members and total number of members
  ASoC: rsrc-card: use asoc_simple_card_parse_card_name()
  ASoC: rsrc-card: use asoc_simple_dai instead of rsrc_card_dai
  ASoC: rsrc-card: use asoc_simple_card_parse_dailink_name()
  ASoC: simple-card: use asoc_simple_card_parse_card_name()
  ASoC: simple-card-utils: add asoc_simple_card_parse_card_name()
  ASoC: simple-card: use asoc_simple_card_parse_dailink_name()
  ASoC: simple-card-utils: add asoc_simple_card_set_dailink_name()
  ASoC: nau8825: drop redundant idiom when converting integer to boolean
  ASoC: nau8825: jack connection decision with different insertion logic
  ASoC: mediatek: Add HDMI dai-links to the mt8173-rt5650 machine driver
  ASoC: mediatek: mt2701: fix non static symbol warning
  ...
2016-07-31 02:25:02 -07:00
Linus Torvalds bad60e6f25 powerpc updates for 4.8 # 1
Highlights:
  - PowerNV PCI hotplug support.
  - Lots more Power9 support.
  - eBPF JIT support on ppc64le.
  - Lots of cxl updates.
  - Boot code consolidation.
 
 Bug fixes:
  - Fix spin_unlock_wait() from Boqun Feng
  - Fix stack pointer corruption in __tm_recheckpoint() from Michael Neuling
  - Fix multiple bugs in memory_hotplug_max() from Bharata B Rao
  - mm: Ensure "special" zones are empty from Oliver O'Halloran
  - ftrace: Separate the heuristics for checking call sites from Michael Ellerman
  - modules: Never restore r2 for a mprofile-kernel style mcount() call from Michael Ellerman
  - Fix endianness when reading TCEs from Alexey Kardashevskiy
  - start rtasd before PCI probing from Greg Kurz
  - PCI: rpaphp: Fix slot registration for multiple slots under a PHB from Tyrel Datwyler
  - powerpc/mm: Add memory barrier in __hugepte_alloc() from Sukadev Bhattiprolu
 
 Cleanups & fixes:
  - Drop support for MPIC in pseries from Rashmica Gupta
  - Define and use PPC64_ELF_ABI_v2/v1 from Michael Ellerman
  - Remove unused symbols in asm-offsets.c from Rashmica Gupta
  - Fix SRIOV not building without EEH enabled from Russell Currey
  - Remove kretprobe_trampoline_holder. from Thiago Jung Bauermann
  - Reduce log level of PCI I/O space warning from Benjamin Herrenschmidt
  - Add array bounds checking to crash_shutdown_handlers from Suraj Jitindar Singh
  - Avoid -maltivec when using clang integrated assembler from Anton Blanchard
  - Fix array overrun in ppc_rtas() syscall from Andrew Donnellan
  - Fix error return value in cmm_mem_going_offline() from Rasmus Villemoes
  - export cpu_to_core_id() from Mauricio Faria de Oliveira
  - Remove old symbols from defconfigs from Andrew Donnellan
  - Update obsolete comments in setup_32.c about entry conditions from Benjamin Herrenschmidt
  - Add comment explaining the purpose of setup_kdump_trampoline() from Benjamin Herrenschmidt
  - Merge the RELOCATABLE config entries for ppc32 and ppc64 from Kevin Hao
  - Remove RELOCATABLE_PPC32 from Kevin Hao
  - Fix .long's in tlb-radix.c to more meaningful from Balbir Singh
 
 Minor cleanups & fixes:
  - Andrew Donnellan, Anna-Maria Gleixner, Anton Blanchard, Benjamin
    Herrenschmidt, Bharata B Rao, Christophe Leroy, Colin Ian King, Geliang
    Tang, Greg Kurz, Madhavan Srinivasan, Michael Ellerman, Michael Ellerman,
    Stephen Rothwell, Stewart Smith.
 
 Freescale updates from Scott:
  - "Highlights include more 8xx optimizations, device tree updates,
    and MVME7100 support."
 
 PowerNV PCI hotplug from Gavin Shan:
  - PCI: Add pcibios_setup_bridge()
  - Override pcibios_setup_bridge()
  - Remove PCI_RESET_DELAY_US
  - Move pnv_pci_ioda_setup_opal_tce_kill() around
  - Increase PE# capacity
  - Allocate PE# in reverse order
  - Create PEs in pcibios_setup_bridge()
  - Setup PE for root bus
  - Extend PCI bridge resources
  - Make pnv_ioda_deconfigure_pe() visible
  - Dynamically release PE
  - Update bridge windows on PCI plug
  - Delay populating pdn
  - Support PCI slot ID
  - Use PCI slot reset infrastructure
  - Introduce pnv_pci_get_slot_id()
  - Functions to get/set PCI slot state
  - PCI/hotplug: PowerPC PowerNV PCI hotplug driver
  - Print correct PHB type names
 
 Power9 idle support from Shreyas B. Prabhu:
  - set power_save func after the idle states are initialized
  - Use PNV_THREAD_WINKLE macro while requesting for winkle
  - make hypervisor state restore a function
  - Rename idle_power7.S to idle_book3s.S
  - Rename reusable idle functions to hardware agnostic names
  - Make pnv_powersave_common more generic
  - abstraction for saving SPRs before entering deep idle states
  - Add platform support for stop instruction
  - cpuidle/powernv: Use CPUIDLE_STATE_MAX instead of MAX_POWERNV_IDLE_STATES
  - cpuidle/powernv: cleanup cpuidle-powernv.c
  - cpuidle/powernv: Add support for POWER ISA v3 idle states
  - Use deepest stop state when cpu is offlined
 
 Power9 PMU from Madhavan Srinivasan:
  - factor out power8 pmu macros and defines
  - factor out power8 pmu functions
  - factor out power8 __init_pmu code
  - Add power9 event list macros for generic and cache events
  - Power9 PMU support
  - Export Power9 generic and cache events to sysfs
 
 Power9 preliminary interrupt & PCI support from Benjamin Herrenschmidt:
  - Add XICS emulation APIs
  - Move a few exception common handlers to make room
  - Add support for HV virtualization interrupts
  - Add mechanism to force a replay of interrupts
  - Add ICP OPAL backend
  - Discover IODA3 PHBs
  - pci: Remove obsolete SW invalidate
  - opal: Add real mode call wrappers
  - Rename TCE invalidation calls
  - Remove SWINV constants and obsolete TCE code
  - Rework accessing the TCE invalidate register
  - Fallback to OPAL for TCE invalidations
  - Use the device-tree to get available range of M64's
  - Check status of a PHB before using it
  - pci: Don't try to allocate resources that will be reassigned
 
 Other Power9:
  - Send SIGBUS on unaligned copy and paste from Chris Smart
  - Large Decrementer support from Oliver O'Halloran
  - Load Monitor Register Support from Jack Miller
 
 Performance improvements from Anton Blanchard:
  - Avoid load hit store in __giveup_fpu() and __giveup_altivec()
  - Avoid load hit store in setup_sigcontext()
  - Remove assembly versions of strcpy, strcat, strlen and strcmp
  - Align hot loops of some string functions
 
 eBPF JIT from Naveen N. Rao:
  - Fix/enhance 32-bit Load Immediate implementation
  - Optimize 64-bit Immediate loads
  - Introduce rotate immediate instructions
  - A few cleanups
  - Isolate classic BPF JIT specifics into a separate header
  - Implement JIT compiler for extended BPF
 
 Operator Panel driver from Suraj Jitindar Singh:
  - devicetree/bindings: Add binding for operator panel on FSP machines
  - Add inline function to get rc from an ASYNC_COMP opal_msg
  - Add driver for operator panel on FSP machines
 
 Sparse fixes from Daniel Axtens:
  - make some things static
  - Introduce asm-prototypes.h
  - Include headers containing prototypes
  - Use #ifdef __BIG_ENDIAN__ #else for REG_BYTE
  - kvm: Clarify __user annotations
  - Pass endianness to sparse
  - Make ppc_md.{halt, restart} __noreturn
 
 MM fixes & cleanups from Aneesh Kumar K.V:
  - radix: Update LPCR HR bit as per ISA
  - use _raw variant of page table accessors
  - Compile out radix related functions if RADIX_MMU is disabled
  - Clear top 16 bits of va only on older cpus
  - Print formation regarding the the MMU mode
  - hash: Update SDR1 size encoding as documented in ISA 3.0
  - radix: Update PID switch sequence
  - radix: Update machine call back to support new HCALL.
  - radix: Add LPID based tlb flush helpers
  - radix: Add a kernel command line to disable radix
  - Cleanup LPCR defines
 
 Boot code consolidation from Benjamin Herrenschmidt:
  - Move epapr_paravirt_early_init() to early_init_devtree()
  - cell: Don't use flat device-tree after boot
  - ge_imp3a: Don't use the flat device-tree after boot
  - mpc85xx_ds: Don't use the flat device-tree after boot
  - mpc85xx_rdb: Don't use the flat device-tree after boot
  - Don't test for machine type in rtas_initialize()
  - Don't test for machine type in smp_setup_cpu_maps()
  - dt: Add of_device_compatible_match()
  - Factor do_feature_fixup calls
  - Move 64-bit feature fixup earlier
  - Move 64-bit memory reserves to setup_arch()
  - Use a cachable DART
  - Move FW feature probing out of pseries probe()
  - Put exception configuration in a common place
  - Remove early allocation of the SMU command buffer
  - Move MMU backend selection out of platform code
  - pasemi: Remove IOBMAP allocation from platform probe()
  - mm/hash: Don't use machine_is() early during boot
  - Don't test for machine type to detect HEA special case
  - pmac: Remove spurrious machine type test
  - Move hash table ops to a separate structure
  - Ensure that ppc_md is empty before probing for machine type
  - Move 64-bit probe_machine() to later in the boot process
  - Move 32-bit probe() machine to later in the boot process
  - Get rid of ppc_md.init_early()
  - Move the boot time info banner to a separate function
  - Move setting of {i,d}cache_bsize to initialize_cache_info()
  - Move the content of setup_system() to setup_arch()
  - Move cache info inits to a separate function
  - Re-order the call to smp_setup_cpu_maps()
  - Re-order setup_panic()
  - Make a few boot functions __init
  - Merge 32-bit and 64-bit setup_arch()
 
 Other new features:
  - tty/hvc: Use IRQF_SHARED for OPAL hvc consoles from Sam Mendoza-Jonas
  - tty/hvc: Use opal irqchip interface if available from Sam Mendoza-Jonas
  - powerpc: Add module autoloading based on CPU features from Alastair D'Silva
  - crypto: vmx - Convert to CPU feature based module autoloading from Alastair D'Silva
  - Wake up kopald polling thread before waiting for events from Benjamin Herrenschmidt
  - xmon: Dump ISA 2.06 SPRs from Michael Ellerman
  - xmon: Dump ISA 2.07 SPRs from Michael Ellerman
  - Add a parameter to disable 1TB segs from Oliver O'Halloran
  - powerpc/boot: Add OPAL console to epapr wrappers from Oliver O'Halloran
  - Assign fixed PHB number based on device-tree properties from Guilherme G. Piccoli
  - pseries: Add pseries hotplug workqueue from John Allen
  - pseries: Add support for hotplug interrupt source from John Allen
  - pseries: Use kernel hotplug queue for PowerVM hotplug events from John Allen
  - pseries: Move property cloning into its own routine from Nathan Fontenot
  - pseries: Dynamic add entires to associativity lookup array from Nathan Fontenot
  - pseries: Auto-online hotplugged memory from Nathan Fontenot
  - pseries: Remove call to memblock_add() from Nathan Fontenot
 
 cxl:
  - Add set and get private data to context struct from Michael Neuling
  - make base more explicitly non-modular from Paul Gortmaker
  - Use for_each_compatible_node() macro from Wei Yongjun
  - Frederic Barrat
    - Abstract the differences between the PSL and XSL
    - Make vPHB device node match adapter's
  - Philippe Bergheaud
    - Add mechanism for delivering AFU driver specific events
    - Ignore CAPI adapters misplaced in switched slots
    - Refine slice error debug messages
  - Andrew Donnellan
    - static-ify variables to fix sparse warnings
    - PCI/hotplug: pnv_php: export symbols and move struct types needed by cxl
    - PCI/hotplug: pnv_php: handle OPAL_PCI_SLOT_OFFLINE power state
    - Add cxl_check_and_switch_mode() API to switch bi-modal cards
    - remove dead Kconfig options
    - fix potential NULL dereference in free_adapter()
  - Ian Munsie
    - Update process element after allocating interrupts
    - Add support for CAPP DMA mode
    - Fix allowing bogus AFU descriptors with 0 maximum processes
    - Fix allocating a minimum of 2 pages for the SPA
    - Fix bug where AFU disable operation had no effect
    - Workaround XSL bug that does not clear the RA bit after a reset
    - Fix NULL pointer dereference on kernel contexts with no AFU interrupts
    - powerpc/powernv: Split cxl code out into a separate file
    - Add cxl_slot_is_supported API
    - Enable bus mastering for devices using CAPP DMA mode
    - Move cxl_afu_get / cxl_afu_put to base
    - Allow a default context to be associated with an external pci_dev
    - Do not create vPHB if there are no AFU configuration records
    - powerpc/powernv: Add support for the cxl kernel api on the real phb
    - Add support for using the kernel API with a real PHB
    - Add kernel APIs to get & set the max irqs per context
    - Add preliminary workaround for CX4 interrupt limitation
    - Add support for interrupts on the Mellanox CX4
    - Workaround PE=0 hardware limitation in Mellanox CX4
    - powerpc/powernv: Fix pci-cxl.c build when CONFIG_MODULES=n
 
 selftests:
  - Test unaligned copy and paste from Chris Smart
  - Load Monitor Register Tests from Jack Miller
  - Cyril Bur
    - exec() with suspended transaction
    - Use signed long to read perf_event_paranoid
    - Fix usage message in context_switch
    - Fix generation of vector instructions/types in context_switch
  - Michael Ellerman
    - Use "Delta" rather than "Error" in normal output
    - Import Anton's mmap & futex micro benchmarks
    - Add a test for PROT_SAO
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXnWchAAoJEFHr6jzI4aWAe64P/36Vd9yJLptjkoyZp8/IQtu1
 Cv8buQwGdKuSMzdkcUAOXcC3fe2u70ZWXMKKLfY3koIV1IAiqdWk5/XWRKMP2XmE
 dG0LhSf0uu7uh+mE0WvQnRu46ImeKtQ+mPp4Hbs/s9SxMSeYjruv3vdWWmgUq0cl
 Gac2qJSRtAMmgLuHWMjf7N5mxOTOnKejU4o2i9cJ+YHmWKOdCigv2Ge1UadOQFlC
 E7tRPiUR3asfDfj+e+LVTTdToH6p8pk+mOUzIoZ8jIkQ+IXzi62UDl5+Rw9mqiuX
 1CtqEMUXxo2qwX+d4TcV/QUOp0YKPuIcUZ9NMMS+S3lOyJ4NFt+j2Izk7QJp5kNP
 gKVqB68TjDQsBuDr3P9ynlHbduxTIhZAqopbTrLe0FIg48nUe4n1yHJBVzqaVajX
 rFBJSsSUffBLAARNPSXJJhIgc2C1/qOC8dgMeDMcR2kPirDHaQZ/lY1yEpq1yiqR
 q6e3v5hvIAm4IjbYk0mF7TUxBrPGVE/ExyBINyASRoYxAJ1PyeD/iljZ9vI3asRA
 s+hhxT8H3f7lnqTrmJqMjHgAdGkmag07EdmvFNX4xK4aADSy7Y6g4dw25ffRopo9
 p9Jf9HX+dZv65Y3UjbV/6HuXcaSEBJJLSVWvii65PebqSN0LuHEFvNeIJ6Iblx0B
 AWh/hd0Iin2gdkcG39Mr
 =Z5kM
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:
 "Highlights:
   - PowerNV PCI hotplug support.
   - Lots more Power9 support.
   - eBPF JIT support on ppc64le.
   - Lots of cxl updates.
   - Boot code consolidation.

  Bug fixes:
   - Fix spin_unlock_wait() from Boqun Feng
   - Fix stack pointer corruption in __tm_recheckpoint() from Michael
     Neuling
   - Fix multiple bugs in memory_hotplug_max() from Bharata B Rao
   - mm: Ensure "special" zones are empty from Oliver O'Halloran
   - ftrace: Separate the heuristics for checking call sites from
     Michael Ellerman
   - modules: Never restore r2 for a mprofile-kernel style mcount() call
     from Michael Ellerman
   - Fix endianness when reading TCEs from Alexey Kardashevskiy
   - start rtasd before PCI probing from Greg Kurz
   - PCI: rpaphp: Fix slot registration for multiple slots under a PHB
     from Tyrel Datwyler
   - powerpc/mm: Add memory barrier in __hugepte_alloc() from Sukadev
     Bhattiprolu

  Cleanups & fixes:
   - Drop support for MPIC in pseries from Rashmica Gupta
   - Define and use PPC64_ELF_ABI_v2/v1 from Michael Ellerman
   - Remove unused symbols in asm-offsets.c from Rashmica Gupta
   - Fix SRIOV not building without EEH enabled from Russell Currey
   - Remove kretprobe_trampoline_holder from Thiago Jung Bauermann
   - Reduce log level of PCI I/O space warning from Benjamin
     Herrenschmidt
   - Add array bounds checking to crash_shutdown_handlers from Suraj
     Jitindar Singh
   - Avoid -maltivec when using clang integrated assembler from Anton
     Blanchard
   - Fix array overrun in ppc_rtas() syscall from Andrew Donnellan
   - Fix error return value in cmm_mem_going_offline() from Rasmus
     Villemoes
   - export cpu_to_core_id() from Mauricio Faria de Oliveira
   - Remove old symbols from defconfigs from Andrew Donnellan
   - Update obsolete comments in setup_32.c about entry conditions from
     Benjamin Herrenschmidt
   - Add comment explaining the purpose of setup_kdump_trampoline() from
     Benjamin Herrenschmidt
   - Merge the RELOCATABLE config entries for ppc32 and ppc64 from Kevin
     Hao
   - Remove RELOCATABLE_PPC32 from Kevin Hao
   - Fix .long's in tlb-radix.c to more meaningful from Balbir Singh

  Minor cleanups & fixes:
   - Andrew Donnellan, Anna-Maria Gleixner, Anton Blanchard, Benjamin
     Herrenschmidt, Bharata B Rao, Christophe Leroy, Colin Ian King,
     Geliang Tang, Greg Kurz, Madhavan Srinivasan, Michael Ellerman,
     Michael Ellerman, Stephen Rothwell, Stewart Smith.

  Freescale updates from Scott:
   - "Highlights include more 8xx optimizations, device tree updates,
     and MVME7100 support."

  PowerNV PCI hotplug from Gavin Shan:
   - PCI: Add pcibios_setup_bridge()
   - Override pcibios_setup_bridge()
   - Remove PCI_RESET_DELAY_US
   - Move pnv_pci_ioda_setup_opal_tce_kill() around
   - Increase PE# capacity
   - Allocate PE# in reverse order
   - Create PEs in pcibios_setup_bridge()
   - Setup PE for root bus
   - Extend PCI bridge resources
   - Make pnv_ioda_deconfigure_pe() visible
   - Dynamically release PE
   - Update bridge windows on PCI plug
   - Delay populating pdn
   - Support PCI slot ID
   - Use PCI slot reset infrastructure
   - Introduce pnv_pci_get_slot_id()
   - Functions to get/set PCI slot state
   - PCI/hotplug: PowerPC PowerNV PCI hotplug driver
   - Print correct PHB type names

  Power9 idle support from Shreyas B. Prabhu:
   - set power_save func after the idle states are initialized
   - Use PNV_THREAD_WINKLE macro while requesting for winkle
   - make hypervisor state restore a function
   - Rename idle_power7.S to idle_book3s.S
   - Rename reusable idle functions to hardware agnostic names
   - Make pnv_powersave_common more generic
   - abstraction for saving SPRs before entering deep idle states
   - Add platform support for stop instruction
   - cpuidle/powernv: Use CPUIDLE_STATE_MAX instead of MAX_POWERNV_IDLE_STATES
   - cpuidle/powernv: cleanup cpuidle-powernv.c
   - cpuidle/powernv: Add support for POWER ISA v3 idle states
   - Use deepest stop state when cpu is offlined

  Power9 PMU from Madhavan Srinivasan:
   - factor out power8 pmu macros and defines
   - factor out power8 pmu functions
   - factor out power8 __init_pmu code
   - Add power9 event list macros for generic and cache events
   - Power9 PMU support
   - Export Power9 generic and cache events to sysfs

  Power9 preliminary interrupt & PCI support from Benjamin Herrenschmidt:
   - Add XICS emulation APIs
   - Move a few exception common handlers to make room
   - Add support for HV virtualization interrupts
   - Add mechanism to force a replay of interrupts
   - Add ICP OPAL backend
   - Discover IODA3 PHBs
   - pci: Remove obsolete SW invalidate
   - opal: Add real mode call wrappers
   - Rename TCE invalidation calls
   - Remove SWINV constants and obsolete TCE code
   - Rework accessing the TCE invalidate register
   - Fallback to OPAL for TCE invalidations
   - Use the device-tree to get available range of M64's
   - Check status of a PHB before using it
   - pci: Don't try to allocate resources that will be reassigned

  Other Power9:
   - Send SIGBUS on unaligned copy and paste from Chris Smart
   - Large Decrementer support from Oliver O'Halloran
   - Load Monitor Register Support from Jack Miller

  Performance improvements from Anton Blanchard:
   - Avoid load hit store in __giveup_fpu() and __giveup_altivec()
   - Avoid load hit store in setup_sigcontext()
   - Remove assembly versions of strcpy, strcat, strlen and strcmp
   - Align hot loops of some string functions

  eBPF JIT from Naveen N. Rao:
   - Fix/enhance 32-bit Load Immediate implementation
   - Optimize 64-bit Immediate loads
   - Introduce rotate immediate instructions
   - A few cleanups
   - Isolate classic BPF JIT specifics into a separate header
   - Implement JIT compiler for extended BPF

  Operator Panel driver from Suraj Jitindar Singh:
   - devicetree/bindings: Add binding for operator panel on FSP machines
   - Add inline function to get rc from an ASYNC_COMP opal_msg
   - Add driver for operator panel on FSP machines

  Sparse fixes from Daniel Axtens:
   - make some things static
   - Introduce asm-prototypes.h
   - Include headers containing prototypes
   - Use #ifdef __BIG_ENDIAN__ #else for REG_BYTE
   - kvm: Clarify __user annotations
   - Pass endianness to sparse
   - Make ppc_md.{halt, restart} __noreturn

  MM fixes & cleanups from Aneesh Kumar K.V:
   - radix: Update LPCR HR bit as per ISA
   - use _raw variant of page table accessors
   - Compile out radix related functions if RADIX_MMU is disabled
   - Clear top 16 bits of va only on older cpus
   - Print formation regarding the the MMU mode
   - hash: Update SDR1 size encoding as documented in ISA 3.0
   - radix: Update PID switch sequence
   - radix: Update machine call back to support new HCALL.
   - radix: Add LPID based tlb flush helpers
   - radix: Add a kernel command line to disable radix
   - Cleanup LPCR defines

  Boot code consolidation from Benjamin Herrenschmidt:
   - Move epapr_paravirt_early_init() to early_init_devtree()
   - cell: Don't use flat device-tree after boot
   - ge_imp3a: Don't use the flat device-tree after boot
   - mpc85xx_ds: Don't use the flat device-tree after boot
   - mpc85xx_rdb: Don't use the flat device-tree after boot
   - Don't test for machine type in rtas_initialize()
   - Don't test for machine type in smp_setup_cpu_maps()
   - dt: Add of_device_compatible_match()
   - Factor do_feature_fixup calls
   - Move 64-bit feature fixup earlier
   - Move 64-bit memory reserves to setup_arch()
   - Use a cachable DART
   - Move FW feature probing out of pseries probe()
   - Put exception configuration in a common place
   - Remove early allocation of the SMU command buffer
   - Move MMU backend selection out of platform code
   - pasemi: Remove IOBMAP allocation from platform probe()
   - mm/hash: Don't use machine_is() early during boot
   - Don't test for machine type to detect HEA special case
   - pmac: Remove spurrious machine type test
   - Move hash table ops to a separate structure
   - Ensure that ppc_md is empty before probing for machine type
   - Move 64-bit probe_machine() to later in the boot process
   - Move 32-bit probe() machine to later in the boot process
   - Get rid of ppc_md.init_early()
   - Move the boot time info banner to a separate function
   - Move setting of {i,d}cache_bsize to initialize_cache_info()
   - Move the content of setup_system() to setup_arch()
   - Move cache info inits to a separate function
   - Re-order the call to smp_setup_cpu_maps()
   - Re-order setup_panic()
   - Make a few boot functions __init
   - Merge 32-bit and 64-bit setup_arch()

  Other new features:
   - tty/hvc: Use IRQF_SHARED for OPAL hvc consoles from Sam Mendoza-Jonas
   - tty/hvc: Use opal irqchip interface if available from Sam Mendoza-Jonas
   - powerpc: Add module autoloading based on CPU features from Alastair D'Silva
   - crypto: vmx - Convert to CPU feature based module autoloading from Alastair D'Silva
   - Wake up kopald polling thread before waiting for events from Benjamin Herrenschmidt
   - xmon: Dump ISA 2.06 SPRs from Michael Ellerman
   - xmon: Dump ISA 2.07 SPRs from Michael Ellerman
   - Add a parameter to disable 1TB segs from Oliver O'Halloran
   - powerpc/boot: Add OPAL console to epapr wrappers from Oliver O'Halloran
   - Assign fixed PHB number based on device-tree properties from Guilherme G. Piccoli
   - pseries: Add pseries hotplug workqueue from John Allen
   - pseries: Add support for hotplug interrupt source from John Allen
   - pseries: Use kernel hotplug queue for PowerVM hotplug events from John Allen
   - pseries: Move property cloning into its own routine from Nathan Fontenot
   - pseries: Dynamic add entires to associativity lookup array from Nathan Fontenot
   - pseries: Auto-online hotplugged memory from Nathan Fontenot
   - pseries: Remove call to memblock_add() from Nathan Fontenot

  cxl:
   - Add set and get private data to context struct from Michael Neuling
   - make base more explicitly non-modular from Paul Gortmaker
   - Use for_each_compatible_node() macro from Wei Yongjun
   - Frederic Barrat
   - Abstract the differences between the PSL and XSL
   - Make vPHB device node match adapter's
   - Philippe Bergheaud
   - Add mechanism for delivering AFU driver specific events
   - Ignore CAPI adapters misplaced in switched slots
   - Refine slice error debug messages
   - Andrew Donnellan
   - static-ify variables to fix sparse warnings
   - PCI/hotplug: pnv_php: export symbols and move struct types needed by cxl
   - PCI/hotplug: pnv_php: handle OPAL_PCI_SLOT_OFFLINE power state
   - Add cxl_check_and_switch_mode() API to switch bi-modal cards
   - remove dead Kconfig options
   - fix potential NULL dereference in free_adapter()
   - Ian Munsie
   - Update process element after allocating interrupts
   - Add support for CAPP DMA mode
   - Fix allowing bogus AFU descriptors with 0 maximum processes
   - Fix allocating a minimum of 2 pages for the SPA
   - Fix bug where AFU disable operation had no effect
   - Workaround XSL bug that does not clear the RA bit after a reset
   - Fix NULL pointer dereference on kernel contexts with no AFU interrupts
   - powerpc/powernv: Split cxl code out into a separate file
   - Add cxl_slot_is_supported API
   - Enable bus mastering for devices using CAPP DMA mode
   - Move cxl_afu_get / cxl_afu_put to base
   - Allow a default context to be associated with an external pci_dev
   - Do not create vPHB if there are no AFU configuration records
   - powerpc/powernv: Add support for the cxl kernel api on the real phb
   - Add support for using the kernel API with a real PHB
   - Add kernel APIs to get & set the max irqs per context
   - Add preliminary workaround for CX4 interrupt limitation
   - Add support for interrupts on the Mellanox CX4
   - Workaround PE=0 hardware limitation in Mellanox CX4
   - powerpc/powernv: Fix pci-cxl.c build when CONFIG_MODULES=n

  selftests:
   - Test unaligned copy and paste from Chris Smart
   - Load Monitor Register Tests from Jack Miller
   - Cyril Bur
   - exec() with suspended transaction
   - Use signed long to read perf_event_paranoid
   - Fix usage message in context_switch
   - Fix generation of vector instructions/types in context_switch
   - Michael Ellerman
   - Use "Delta" rather than "Error" in normal output
   - Import Anton's mmap & futex micro benchmarks
   - Add a test for PROT_SAO"

* tag 'powerpc-4.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (263 commits)
  powerpc/mm: Parenthesise IS_ENABLED() in if condition
  tty/hvc: Use opal irqchip interface if available
  tty/hvc: Use IRQF_SHARED for OPAL hvc consoles
  selftests/powerpc: exec() with suspended transaction
  powerpc: Improve comment explaining why we modify VRSAVE
  powerpc/mm: Drop unused externs for hpte_init_beat[_v3]()
  powerpc/mm: Rename hpte_init_lpar() and move the fallback to a header
  powerpc/mm: Fix build break when PPC_NATIVE=n
  crypto: vmx - Convert to CPU feature based module autoloading
  powerpc: Add module autoloading based on CPU features
  powerpc/powernv/ioda: Fix endianness when reading TCEs
  powerpc/mm: Add memory barrier in __hugepte_alloc()
  powerpc/modules: Never restore r2 for a mprofile-kernel style mcount() call
  powerpc/ftrace: Separate the heuristics for checking call sites
  powerpc: Merge 32-bit and 64-bit setup_arch()
  powerpc/64: Make a few boot functions __init
  powerpc: Re-order setup_panic()
  powerpc: Re-order the call to smp_setup_cpu_maps()
  powerpc/32: Move cache info inits to a separate function
  powerpc/64: Move the content of setup_system() to setup_arch()
  ...
2016-07-30 21:01:36 -07:00
Linus Torvalds 7f155c7026 NFS client updates for Linux 4.8
Highlights include:
 
 Stable bugfixes:
  - nfs: don't create zero-length requests
  - Several LAYOUTGET bugfixes
 
 Features:
  - Several performance related features
    - More aggressive caching when we can rely on close-to-open cache
      consistency
    - Remove serialisation of O_DIRECT reads and writes
    - Optimise several code paths to not flush to disk unnecessarily. However
      allow for the idiosyncracies of pNFS for those layout types that need
      to issue a LAYOUTCOMMIT before the metadata can be updated on the server.
    - SUNRPC updates to the client data receive path
  - pNFS/SCSI support RH/Fedora dm-mpath device nodes
  - pNFS files/flexfiles can now use unprivileged ports when the generic NFS
    mount options allow it.
 
 Bugfixes:
  - Don't use RDMA direct data placement together with data integrity or
    privacy security flavours
  - Remove the RDMA ALLPHYSICAL memory registration mode as it has potential
    security holes.
  - Several layout recall fixes to improve NFSv4.1 protocol compliance.
  - Fix an Oops in the pNFS files and flexfiles connection setup to the DS
  - Allow retry of operations that used a returned delegation stateid
  - Don't mark the inode as revalidated if a LAYOUTCOMMIT is outstanding
  - Fix writeback races in nfs4_copy_range() and nfs42_proc_deallocate()
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXnSq8AAoJEGcL54qWCgDyn8cP/RCHLekUCq7Klh+NAnEsvuBi
 C7w9YpVHaC83/8Q0tR6LyFShSBJBWi/clWwO0IEomkNK/MuO77v4iyPujtEyqowK
 0+eWFh/e8CsTf7mNGoi0avrHAZDB3deSuOQeYbwnNWHmd7qKVkB6tHus8LQjk852
 eqwYmZ4kVr+eaCO6MttCCxJHf6datPnsbe0stiC9MpxmCzsdpZmFptfauidsFX+p
 0U1IHi/ABN6zIFoc4R0iXXbaDb8ErxGw32SWIb8cnnWwdlSD8I0+Jqxs4opp23LY
 lAm9E0vtDJ49bJBllYl0dUmizdhJ3+NefK4aqPh5H5h3Csub+MLIsuQv/+r2AOhH
 qLBi5kThpspPhGHZ40VDmfV825+csUPTc8WkDaNLvb4f4UGIPakK/KBrBtxiqn+P
 0etvYiWBuoBaqRVQpstawnyDdnBK0IMF/3LAULo+ozo7iTkpaZmOALYgPcBUYw2f
 d6pxZGeNN0GwWfjDmoUDGC07OpO/CSN5WouArgKsp5+VhjzPxjyaZLCnUhzHzXiM
 RV1oBytEs/iw2BLXX809noM9mqHYkdgSVmrZ9OvvDMslcLHaslpq6eaJKZSWqV2J
 fAws6rbcZdTFSnbAWr0OSxct6w6BijEjc3/uk+wWRtw9nkOhFqtlxI3y7k4odpW9
 wVcEmRNkxfA0LlYNXWuL
 =WNyE
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-4.8-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client updates from Trond Myklebust:
 "Highlights include:

  Stable bugfixes:
   - nfs: don't create zero-length requests

   - several LAYOUTGET bugfixes

  Features:
   - several performance related features

   - more aggressive caching when we can rely on close-to-open
     cache consistency

   - remove serialisation of O_DIRECT reads and writes

   - optimise several code paths to not flush to disk unnecessarily.

     However allow for the idiosyncracies of pNFS for those layout
     types that need to issue a LAYOUTCOMMIT before the metadata can
     be updated on the server.

   - SUNRPC updates to the client data receive path

   - pNFS/SCSI support RH/Fedora dm-mpath device nodes

   - pNFS files/flexfiles can now use unprivileged ports when
     the generic NFS mount options allow it.

  Bugfixes:
   - Don't use RDMA direct data placement together with data
     integrity or privacy security flavours

   - Remove the RDMA ALLPHYSICAL memory registration mode as
     it has potential security holes.

   - Several layout recall fixes to improve NFSv4.1 protocol
     compliance.

   - Fix an Oops in the pNFS files and flexfiles connection
     setup to the DS

   - Allow retry of operations that used a returned delegation
      stateid

   - Don't mark the inode as revalidated if a LAYOUTCOMMIT is
     outstanding

   - Fix writeback races in nfs4_copy_range() and
     nfs42_proc_deallocate()"

* tag 'nfs-for-4.8-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (104 commits)
  pNFS: Actively set attributes as invalid if LAYOUTCOMMIT is outstanding
  NFSv4: Clean up lookup of SECINFO_NO_NAME
  NFSv4.2: Fix warning "variable ‘stateids’ set but not used"
  NFSv4: Fix warning "no previous prototype for ‘nfs4_listxattr’"
  SUNRPC: Fix a compiler warning in fs/nfs/clnt.c
  pNFS: Remove redundant smp_mb() from pnfs_init_lseg()
  pNFS: Cleanup - do layout segment initialisation in one place
  pNFS: Remove redundant stateid invalidation
  pNFS: Remove redundant pnfs_mark_layout_returned_if_empty()
  pNFS: Clear the layout metadata if the server changed the layout stateid
  pNFS: Cleanup - don't open code pnfs_mark_layout_stateid_invalid()
  NFS: pnfs_mark_matching_lsegs_return() should match the layout sequence id
  pNFS: Do not set plh_return_seq for non-callback related layoutreturns
  pNFS: Ensure layoutreturn acts as a completion for layout callbacks
  pNFS: Fix CB_LAYOUTRECALL stateid verification
  pNFS: Always update the layout barrier seqid on LAYOUTGET
  pNFS: Always update the layout stateid if NFS_LAYOUT_INVALID_STID is set
  pNFS: Clear the layout return tracking on layout reinitialisation
  pNFS: LAYOUTRETURN should only update the stateid if the layout is valid
  nfs: don't create zero-length requests
  ...
2016-07-30 16:33:25 -07:00
Linus Torvalds f64d6e2aaa DeviceTree update for 4.8:
- Removal of most of_platform_populate() calls in arch code. Now the DT
 core code calls it in the default case and platforms only need to call
 it if they have special needs.
 
 - Use pr_fmt on all the DT core print statements.
 
 - CoreSight binding doc improvements to block name descriptions.
 
 - Add dt_to_config script which can parse dts files and list
 corresponding kernel config options.
 
 - Fix memory leak hit with a PowerMac DT.
 
 - Correct a bunch of STMicro compatible strings to use the correct
 vendor prefix.
 
 - Fix DA9052 PMIC binding doc to match what is actually used in dts
 files.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXm9KcAAoJEPr7XbWNvGHDRT4QAIIIOSB4AWHardnMLROgGge9
 aOQKZ/05O9feOcxYKe8FkQbcH+IujJjrUL+yrRD36yGQPAyBP21gtcmmfrkCcwFM
 kH915f/JbGvXpfwEf8dcarHhzYH6FFJiQGduPpWfwSSWynx+xq5EKPwCqYzMg8bN
 SExxt7vUx1MKFOExZ0K8BNCo8VMVLUWQoJ1DNeJDuL25Op4EU3i2l1HQNYV/3XDk
 BSA3x7Lw3GjrWEH20VWYn2Azq1OFLY+E2FC2lnG4nbkk5X8dZbUH9PR1Sk7uTQDj
 uxTjWe59NBpliCxKSAbMbTAU/WwSB1pJ0I+zDJBiQsdFT+nb5F4zOrs3qSKHa/A9
 Rv6AC8k5gdSMrDB1dOspfF2vWvOOInXgNV4/Kza0D92mbCpwyUuF+vhE6rfcMrZU
 OiD7rj2/fvO7Y9fUAhrp6zrfrOfH9B1Z9vS+940AlK96YwPE2+J0SA2vBxR/wg8H
 7fj4Ud5X+SFisXWQhh5Wlv0W9o6e7C7fsi8vpkQ7gufmezLFWVnJKsUfQaxGEwhG
 Hkhm9kuSHHMd+6dEnn2756DnNfJAtQv6rSR0/QR4Lf9y5L4dvR3kAQIci8X/nx4P
 sIk+IJWGZG6wziZq59hh+SO6HEqdSNuvh+5sbR0iUimdE/1HsDBdPiocXf/r8iwK
 NY9nGeZPRrXmFgdpoZfm
 =wLMr
 -----END PGP SIGNATURE-----

Merge tag 'devicetree-for-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux

Pull DeviceTree updates from Rob Herring:

 - remove most of_platform_populate() calls in arch code.  Now the DT
   core code calls it in the default case and platforms only need to
   call it if they have special needs

 - use pr_fmt on all the DT core print statements

 - CoreSight binding doc improvements to block name descriptions

 - add dt_to_config script which can parse dts files and list
   corresponding kernel config options

 - fix memory leak hit with a PowerMac DT

 - correct a bunch of STMicro compatible strings to use the correct
   vendor prefix

 - fix DA9052 PMIC binding doc to match what is actually used in dts
   files

* tag 'devicetree-for-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (35 commits)
  documentation: da9052: Update regulator bindings names to match DA9052/53 DTS expectations
  xtensa: Partially Revert "xtensa: Remove unnecessary of_platform_populate with default match table"
  xtensa: Fix build error due to missing include file
  MIPS: ath79: Add missing include file
  Fix spelling errors in Documentation/devicetree
  ARM: dts: fix STMicroelectronics compatible strings
  powerpc/dts: fix STMicroelectronics compatible strings
  Documentation: dt: i2c: use correct STMicroelectronics vendor prefix
  scripts/dtc: dt_to_config - kernel config options for a devicetree
  of: fdt: mark unflattened tree as detached
  of: overlay: add resolver error prints
  coresight: document binding acronyms
  Documentation/devicetree: document cavium-pip rx-delay/tx-delay properties
  of: use pr_fmt prefix for all console printing
  of/irq: Mark initialised interrupt controllers as populated
  of: fix memory leak related to safe_name()
  Revert "of/platform: export of_default_bus_match_table"
  of: unittest: use of_platform_default_populate() to populate default bus
  memory: omap-gpmc: use of_platform_default_populate() to populate default bus
  bus: uniphier-system-bus: use of_platform_default_populate() to populate default bus
  ...
2016-07-30 11:32:01 -07:00
Linus Torvalds 1056c9bd27 The bulk of the changes are updates and fixes to existing clk provider
drivers, along with a pretty standard number of new drivers. The core
 recieved a small number of updates as well.
 
 Core changes of note:
 * Removed CLK_IS_ROOT flag
 
 New clk provider drivers:
 * Renesas r8a7796 Clock Pulse Generator / Module Standby and Software
   Reset
 * Allwinner sun8i H3 Clock Controller Unit
 * AmLogic meson8b clock controller (rewritten)
 * AmLogic gxbb clock controller
 * Support for some new ICs was added by simple changes to static data
   tables for chips sharing the same family
 
 Driver updates of note:
 * the Allwinner sunxi clock driver infrastucture was rewritten to
   comform to the state of the art at drivers/clk/sunxi-ng. The old
   implementation is still supported for backwards compatibility with the
   DT ABI
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJXm9LNAAoJEKI6nJvDJaTUHYcP/1oKTHA4uSPux2+5l9vApEJc
 eSV0oSEP/zDDwYfV4xRt1byAOtzHe3R4p5zDGShnk3+CCwXbQSLGmFJFmZDoSs5E
 ulftXA8uRV4ac0SEh86BOlstdJHGOVo7Q38XtdPvgKRTv58+ZqHuFcvB726bWY/I
 GMzVTkpugbn5U7e3MLM58gkBoqa8BS9uIdsf5q/JIxdpe+VgUtjmioH9Rz6G5U5Y
 T6ObeU6HNusDaz6kUJIiAEkl4UNHk+aebnY7FTbwd9JGGySp5nEknVY7tSMeC6iU
 hG/knzdMiasa5yv0xdW3/OxnKCQtEeuPPHnJDSXWc9yb20vNs2QhRtd8uckMGHOH
 X7fXq9xoLxDH81AggW188g7+xfhx7J+ASpjHmHs74itUIoqhhy61PM3I95yQ8m6s
 QDkInqSszFd2C7OiYsy6m5XS0K8g0jyA2tzWhxlIZNVAl7yP71+HVLJ45NNKCPkb
 oYbMx4ho538Tl6+c6HHJG0vC3XWSdHbvwMOKyE9k/GyrIJ7FxLZS3BLOYPm6u9t4
 9n041MbtBb/UrWVi8kMSdmBjIIs6VSGjq5gwRFAqoPQt3VkPuy0IfnwIpjdE1aFz
 PbDJQF4ze7D3JgH2rWtivXbCTRthTANmoar299xjSwwb99TSlP2IABY3cNnJrhPr
 9/PGShM3hqRpBEUnEmt+
 =1dpY
 -----END PGP SIGNATURE-----

Merge tag 'clk-for-linus-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux

Pull clk updates from Michael Turquette:
 "The bulk of the changes are updates and fixes to existing clk provider
  drivers, along with a pretty standard number of new drivers.  The core
  recieved a small number of updates as well.

  Core changes of note:
   - removed CLK_IS_ROOT flag

  New clk provider drivers:
   - Renesas r8a7796 clock pulse generator / module standby and
     software reset
   - Allwinner sun8i H3 clock controller unit
   - AmLogic meson8b clock controller (rewritten)
   - AmLogic gxbb clock controller
   - support for some new ICs was added by simple changes to static
     data tables for chips sharing the same family

  Driver updates of note:
   - the Allwinner sunxi clock driver infrastucture was rewritten to
     comform to the state of the art at drivers/clk/sunxi-ng.  The old
     implementation is still supported for backwards compatibility with
     the DT ABI"

* tag 'clk-for-linus-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (162 commits)
  clk: Makefile: re-sort and clean up
  Revert "clk: gxbb: expose CLKID_MMC_PCLK"
  clk: samsung: Allow modular build of the Audio Subsystem CLKCON driver
  clk: samsung: make clk-s5pv210-audss explicitly non-modular
  clk: exynos5433: remove CLK_IGNORE_UNUSED flag from SPI clocks
  clk: oxnas: Add hardware dependencies
  clk: imx7d: do not set parent of ethernet time/ref clocks
  ARM: dt: sun8i: switch the H3 to the new CCU driver
  clk: sunxi-ng: h3: Fix Kconfig symbol typo
  clk: sunxi-ng: h3: Fix audio clock divider offset
  clk: sunxi-ng: Add H3 clocks
  clk: sunxi-ng: Add N-K-M-P factor clock
  clk: sunxi-ng: Add N-K-M Factor clock
  clk: sunxi-ng: Add N-M-factor clock support
  clk: sunxi-ng: Add N-K-factor clock support
  clk: sunxi-ng: Add M-P factor clock support
  clk: sunxi-ng: Add divider
  clk: sunxi-ng: Add phase clock support
  clk: sunxi-ng: Add mux clock support
  clk: sunxi-ng: Add gate clock support
  ...
2016-07-30 11:20:02 -07:00
Linus Torvalds 797cee982e Merge branch 'stable-4.8' of git://git.infradead.org/users/pcmoore/audit
Pull audit updates from Paul Moore:
 "Six audit patches for 4.8.

  There are a couple of style and minor whitespace tweaks for the logs,
  as well as a minor fixup to catch errors on user filter rules, however
  the major improvements are a fix to the s390 syscall argument masking
  code (reviewed by the nice s390 folks), some consolidation around the
  exclude filtering (less code, always a win), and a double-fetch fix
  for recording the execve arguments"

* 'stable-4.8' of git://git.infradead.org/users/pcmoore/audit:
  audit: fix a double fetch in audit_log_single_execve_arg()
  audit: fix whitespace in CWD record
  audit: add fields to exclude filter by reusing user filter
  s390: ensure that syscall arguments are properly masked on s390
  audit: fix some horrible switch statement style crimes
  audit: fixup: log on errors from filter user rules
2016-07-29 17:54:17 -07:00
Linus Torvalds 7a1e8b80fb Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:
 "Highlights:

   - TPM core and driver updates/fixes
   - IPv6 security labeling (CALIPSO)
   - Lots of Apparmor fixes
   - Seccomp: remove 2-phase API, close hole where ptrace can change
     syscall #"

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (156 commits)
  apparmor: fix SECURITY_APPARMOR_HASH_DEFAULT parameter handling
  tpm: Add TPM 2.0 support to the Nuvoton i2c driver (NPCT6xx family)
  tpm: Factor out common startup code
  tpm: use devm_add_action_or_reset
  tpm2_i2c_nuvoton: add irq validity check
  tpm: read burstcount from TPM_STS in one 32-bit transaction
  tpm: fix byte-order for the value read by tpm2_get_tpm_pt
  tpm_tis_core: convert max timeouts from msec to jiffies
  apparmor: fix arg_size computation for when setprocattr is null terminated
  apparmor: fix oops, validate buffer size in apparmor_setprocattr()
  apparmor: do not expose kernel stack
  apparmor: fix module parameters can be changed after policy is locked
  apparmor: fix oops in profile_unpack() when policy_db is not present
  apparmor: don't check for vmalloc_addr if kvzalloc() failed
  apparmor: add missing id bounds check on dfa verification
  apparmor: allow SYS_CAP_RESOURCE to be sufficient to prlimit another task
  apparmor: use list_next_entry instead of list_entry_next
  apparmor: fix refcount race when finding a child profile
  apparmor: fix ref count leak when profile sha1 hash is read
  apparmor: check that xindex is in trans_table bounds
  ...
2016-07-29 17:38:46 -07:00
Linus Torvalds a867d7349e Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull userns vfs updates from Eric Biederman:
 "This tree contains some very long awaited work on generalizing the
  user namespace support for mounting filesystems to include filesystems
  with a backing store.  The real world target is fuse but the goal is
  to update the vfs to allow any filesystem to be supported.  This
  patchset is based on a lot of code review and testing to approach that
  goal.

  While looking at what is needed to support the fuse filesystem it
  became clear that there were things like xattrs for security modules
  that needed special treatment.  That the resolution of those concerns
  would not be fuse specific.  That sorting out these general issues
  made most sense at the generic level, where the right people could be
  drawn into the conversation, and the issues could be solved for
  everyone.

  At a high level what this patchset does a couple of simple things:

   - Add a user namespace owner (s_user_ns) to struct super_block.

   - Teach the vfs to handle filesystem uids and gids not mapping into
     to kuids and kgids and being reported as INVALID_UID and
     INVALID_GID in vfs data structures.

  By assigning a user namespace owner filesystems that are mounted with
  only user namespace privilege can be detected.  This allows security
  modules and the like to know which mounts may not be trusted.  This
  also allows the set of uids and gids that are communicated to the
  filesystem to be capped at the set of kuids and kgids that are in the
  owning user namespace of the filesystem.

  One of the crazier corner casees this handles is the case of inodes
  whose i_uid or i_gid are not mapped into the vfs.  Most of the code
  simply doesn't care but it is easy to confuse the inode writeback path
  so no operation that could cause an inode write-back is permitted for
  such inodes (aka only reads are allowed).

  This set of changes starts out by cleaning up the code paths involved
  in user namespace permirted mounts.  Then when things are clean enough
  adds code that cleanly sets s_user_ns.  Then additional restrictions
  are added that are possible now that the filesystem superblock
  contains owner information.

  These changes should not affect anyone in practice, but there are some
  parts of these restrictions that are changes in behavior.

   - Andy's restriction on suid executables that does not honor the
     suid bit when the path is from another mount namespace (think
     /proc/[pid]/fd/) or when the filesystem was mounted by a less
     privileged user.

   - The replacement of the user namespace implicit setting of MNT_NODEV
     with implicitly setting SB_I_NODEV on the filesystem superblock
     instead.

     Using SB_I_NODEV is a stronger form that happens to make this state
     user invisible.  The user visibility can be managed but it caused
     problems when it was introduced from applications reasonably
     expecting mount flags to be what they were set to.

  There is a little bit of work remaining before it is safe to support
  mounting filesystems with backing store in user namespaces, beyond
  what is in this set of changes.

   - Verifying the mounter has permission to read/write the block device
     during mount.

   - Teaching the integrity modules IMA and EVM to handle filesystems
     mounted with only user namespace root and to reduce trust in their
     security xattrs accordingly.

   - Capturing the mounters credentials and using that for permission
     checks in d_automount and the like.  (Given that overlayfs already
     does this, and we need the work in d_automount it make sense to
     generalize this case).

  Furthermore there are a few changes that are on the wishlist:

   - Get all filesystems supporting posix acls using the generic posix
     acls so that posix_acl_fix_xattr_from_user and
     posix_acl_fix_xattr_to_user may be removed.  [Maintainability]

   - Reducing the permission checks in places such as remount to allow
     the superblock owner to perform them.

   - Allowing the superblock owner to chown files with unmapped uids and
     gids to something that is mapped so the files may be treated
     normally.

  I am not considering even obvious relaxations of permission checks
  until it is clear there are no more corner cases that need to be
  locked down and handled generically.

  Many thanks to Seth Forshee who kept this code alive, and putting up
  with me rewriting substantial portions of what he did to handle more
  corner cases, and for his diligent testing and reviewing of my
  changes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (30 commits)
  fs: Call d_automount with the filesystems creds
  fs: Update i_[ug]id_(read|write) to translate relative to s_user_ns
  evm: Translate user/group ids relative to s_user_ns when computing HMAC
  dquot: For now explicitly don't support filesystems outside of init_user_ns
  quota: Handle quota data stored in s_user_ns in quota_setxquota
  quota: Ensure qids map to the filesystem
  vfs: Don't create inodes with a uid or gid unknown to the vfs
  vfs: Don't modify inodes with a uid or gid unknown to the vfs
  cred: Reject inodes with invalid ids in set_create_file_as()
  fs: Check for invalid i_uid in may_follow_link()
  vfs: Verify acls are valid within superblock's s_user_ns.
  userns: Handle -1 in k[ug]id_has_mapping when !CONFIG_USER_NS
  fs: Refuse uid/gid changes which don't map into s_user_ns
  selinux: Add support for unprivileged mounts from user namespaces
  Smack: Handle labels consistently in untrusted mounts
  Smack: Add support for unprivileged mounts from user namespaces
  fs: Treat foreign mounts as nosuid
  fs: Limit file caps to the user namespace of the super block
  userns: Remove the now unnecessary FS_USERNS_DEV_MOUNT flag
  userns: Remove implicit MNT_NODEV fragility.
  ...
2016-07-29 15:54:19 -07:00
Linus Torvalds a6408f6cb6 Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull smp hotplug updates from Thomas Gleixner:
 "This is the next part of the hotplug rework.

   - Convert all notifiers with a priority assigned

   - Convert all CPU_STARTING/DYING notifiers

     The final removal of the STARTING/DYING infrastructure will happen
     when the merge window closes.

  Another 700 hundred line of unpenetrable maze gone :)"

* 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (70 commits)
  timers/core: Correct callback order during CPU hot plug
  leds/trigger/cpu: Move from CPU_STARTING to ONLINE level
  powerpc/numa: Convert to hotplug state machine
  arm/perf: Fix hotplug state machine conversion
  irqchip/armada: Avoid unused function warnings
  ARC/time: Convert to hotplug state machine
  clocksource/atlas7: Convert to hotplug state machine
  clocksource/armada-370-xp: Convert to hotplug state machine
  clocksource/exynos_mct: Convert to hotplug state machine
  clocksource/arm_global_timer: Convert to hotplug state machine
  rcu: Convert rcutree to hotplug state machine
  KVM/arm/arm64/vgic-new: Convert to hotplug state machine
  smp/cfd: Convert core to hotplug state machine
  x86/x2apic: Convert to CPU hotplug state machine
  profile: Convert to hotplug state machine
  timers/core: Convert to hotplug state machine
  hrtimer: Convert to hotplug state machine
  x86/tboot: Convert to hotplug state machine
  arm64/armv8 deprecated: Convert to hotplug state machine
  hwtracing/coresight-etm4x: Convert to hotplug state machine
  ...
2016-07-29 13:55:30 -07:00
Linus Torvalds 27ae0c41ed Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
Pull fuse updates from Miklos Szeredi:
 "This fixes error propagation from writeback to fsync/close for
  writeback cache mode as well as adding a missing capability flag to
  the INIT message.  The rest are cleanups.

  (The commits are recent but all the code actually sat in -next for a
  while now.  The recommits are due to conflict avoidance and the
  addition of Cc: stable@...)"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  fuse: use filemap_check_errors()
  mm: export filemap_check_errors() to modules
  fuse: fix wrong assignment of ->flags in fuse_send_init()
  fuse: fuse_flush must check mapping->flags for errors
  fuse: fsync() did not return IO errors
  fuse: don't mess with blocking signals
  new helper: wait_event_killable_exclusive()
  fuse: improve aio directIO write performance for size extending writes
2016-07-29 12:29:15 -07:00
Linus Torvalds 20d00ee829 Revert "vfs: add lookup_hash() helper"
This reverts commit 3c9fe8cdff.

As Miklos points out in commit c1b2cc1a76, the "lookup_hash()" helper
is now unused, and in fact, with the hash salting changes, since the
hash of a dentry name now depends on the directory dentry it is in, the
helper function isn't even really likely to be useful.

So rather than keep it around in case somebody else might end up finding
a use for it, let's just remove the helper and not trick people into
thinking it might be a useful thing.

For example, I had obviously completely missed how the helper didn't
follow the normal dentry hashing patterns, and how the hash salting
patch broke overlayfs.  Things would quietly build and look sane, but
not work.

Suggested-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-29 12:17:52 -07:00
Miklos Szeredi d72d9e2a5d mm: export filemap_check_errors() to modules
Can be used by fuse, btrfs and f2fs to replace opencoded variants.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29 14:10:57 +02:00
Linus Torvalds c624c86615 This is mostly clean ups and small fixes. Some of the more visible
changes are:
 
  . The function pid code uses the event pid filtering logic
  . [ku]probe events have access to current->comm
  . trace_printk now has sample code
  . PCI devices now trace physical addresses
  . stack tracing has less unnessary functions traced
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJXl+d2AAoJEKKk/i67LK/83QEH/RDJ0mcfFVsuEeOnZZrZXABm
 4Rxk4FE5UAD+TSrVycwwzcbQab1iPK63mMdYvIBvaOiIC6/OJaEVM7jzZxnNGqmr
 pj0H8bxwOr58pe5pfnP92ow5qTLLzsXraWNl5sRXhSSHON7CXpGVzkErB58GmMYd
 8p6d9ziifQjo8X2O6XC9rGAvYLY5kEkVvyfuE1hI7muNTeOjyOT4EqpkNzxdBk+I
 QkGZGsk3Xhc8II9nu8FPWkaD26TatGJoZtZmVWHOzfsb3HNzG4RXla+WVOQ5u1HV
 noVyB1CJHhkO5CEBPdYIqwBWPQU4B9HfG4gVcUpDDVRxfzMpnEcKi1uwe+uDjfs=
 =XFcv
 -----END PGP SIGNATURE-----

Merge tag 'trace-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing updates from Steven Rostedt:
 "This is mostly clean ups and small fixes.  Some of the more visible
  changes are:

   - The function pid code uses the event pid filtering logic
   - [ku]probe events have access to current->comm
   - trace_printk now has sample code
   - PCI devices now trace physical addresses
   - stack tracing has less unnessary functions traced"

* tag 'trace-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  printk, tracing: Avoiding unneeded blank lines
  tracing: Use __get_str() when manipulating strings
  tracing, RAS: Cleanup on __get_str() usage
  tracing: Use outer () on __get_str() definition
  ftrace: Reduce size of function graph entries
  tracing: Have HIST_TRIGGERS select TRACING
  tracing: Using for_each_set_bit() to simplify trace_pid_write()
  ftrace: Move toplevel init out of ftrace_init_tracefs()
  tracing/function_graph: Fix filters for function_graph threshold
  tracing: Skip more functions when doing stack tracing of events
  tracing: Expose CPU physical addresses (resource values) for PCI devices
  tracing: Show the preempt count of when the event was called
  tracing: Add trace_printk sample code
  tracing: Choose static tp_printk buffer by explicit nesting count
  tracing: expose current->comm to [ku]probe events
  ftrace: Have set_ftrace_pid use the bitmap like events do
  tracing: Move pid_list write processing into its own function
  tracing: Move the pid_list seq_file functions to be global
  tracing: Move filtered_pid helper functions into trace.c
  tracing: Make the pid filtering helper functions global
2016-07-28 18:20:09 -07:00
Linus Torvalds f0c98ebc57 libnvdimm for 4.8
1/ Replace pcommit with ADR / directed-flushing:
    The pcommit instruction, which has not shipped on any product, is
    deprecated. Instead, the requirement is that platforms implement either
    ADR, or provide one or more flush addresses per nvdimm. ADR
    (Asynchronous DRAM Refresh) flushes data in posted write buffers to the
    memory controller on a power-fail event. Flush addresses are defined in
    ACPI 6.x as an NVDIMM Firmware Interface Table (NFIT) sub-structure:
    "Flush Hint Address Structure". A flush hint is an mmio address that
    when written and fenced assures that all previous posted writes
    targeting a given dimm have been flushed to media.
 
 2/ On-demand ARS (address range scrub):
    Linux uses the results of the ACPI ARS commands to track bad blocks
    in pmem devices.  When latent errors are detected we re-scrub the media
    to refresh the bad block list, userspace can also request a re-scrub at
    any time.
 
 3/ Support for the Microsoft DSM (device specific method) command format.
 
 4/ Support for EDK2/OVMF virtual disk device memory ranges.
 
 5/ Various fixes and cleanups across the subsystem.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXmXBsAAoJEB7SkWpmfYgCEwwP/1IOt9ocP+iHLMDH9KE7VaTZ
 NmUDR+Zy6g5cRQM7SgcuU5BXUcx+OsSrSrUTVF1cW994o9Gbz1mFotkv0ZAsPcYY
 ZVRQxo2oqHrssyOcg+PsgKWiXn68rJOCgmpEyzaJywl5qTMst7pzsT1s1f7rSh6h
 trCf4VaJJwxZR8fARGtlHUnnhPe2Orp99EZRKEWprAsIv2kPuWpPHSjRjuEgN1JG
 KW8AYwWqFTtiLRUk86I4KBB0wcDrfctsjgN9Ogd6+aHyQBRnVSr2U+vDCFkC8KLu
 qiDCpYp+yyxBjclnljz7tRRT3GtzfCUWd4v2KVWqgg2IaobUc0Lbukp/rmikUXQP
 WLikT2OCQ994eFK5OX3Q3cIU/4j459TQnof8q14yVSpjAKrNUXVSR5puN7Hxa+V7
 41wKrAsnsyY1oq+Yd/rMR8VfH7PHx3bFkrmRCGZCufLX1UQm4aYj+sWagDKiV3yA
 DiudghbOnhfurfGsnXUVw7y7GKs+gNWNBmB6ndAD6ZEHmKoGUhAEbJDLCc3DnANl
 b/2mv1MIdIcC1DlCmnbbcn6fv6bICe/r8poK3VrCK3UgOq/EOvKIWl7giP+k1JuC
 6DdVYhlNYIVFXUNSLFAwz8OkLu8byx7WDm36iEqrKHtPw+8qa/2bWVgOU6OBgpjV
 cN3edFVIdxvZeMgM5Ubq
 =xCBG
 -----END PGP SIGNATURE-----

Merge tag 'libnvdimm-for-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

Pull libnvdimm updates from Dan Williams:

 - Replace pcommit with ADR / directed-flushing.

   The pcommit instruction, which has not shipped on any product, is
   deprecated.  Instead, the requirement is that platforms implement
   either ADR, or provide one or more flush addresses per nvdimm.

   ADR (Asynchronous DRAM Refresh) flushes data in posted write buffers
   to the memory controller on a power-fail event.

   Flush addresses are defined in ACPI 6.x as an NVDIMM Firmware
   Interface Table (NFIT) sub-structure: "Flush Hint Address Structure".
   A flush hint is an mmio address that when written and fenced assures
   that all previous posted writes targeting a given dimm have been
   flushed to media.

 - On-demand ARS (address range scrub).

   Linux uses the results of the ACPI ARS commands to track bad blocks
   in pmem devices.  When latent errors are detected we re-scrub the
   media to refresh the bad block list, userspace can also request a
   re-scrub at any time.

 - Support for the Microsoft DSM (device specific method) command
   format.

 - Support for EDK2/OVMF virtual disk device memory ranges.

 - Various fixes and cleanups across the subsystem.

* tag 'libnvdimm-for-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (41 commits)
  libnvdimm-btt: Delete an unnecessary check before the function call "__nd_device_register"
  nfit: do an ARS scrub on hitting a latent media error
  nfit: move to nfit/ sub-directory
  nfit, libnvdimm: allow an ARS scrub to be triggered on demand
  libnvdimm: register nvdimm_bus devices with an nd_bus driver
  pmem: clarify a debug print in pmem_clear_poison
  x86/insn: remove pcommit
  Revert "KVM: x86: add pcommit support"
  nfit, tools/testing/nvdimm/: unify shutdown paths
  libnvdimm: move ->module to struct nvdimm_bus_descriptor
  nfit: cleanup acpi_nfit_init calling convention
  nfit: fix _FIT evaluation memory leak + use after free
  tools/testing/nvdimm: add manufacturing_{date|location} dimm properties
  tools/testing/nvdimm: add virtual ramdisk range
  acpi, nfit: treat virtual ramdisk SPA as pmem region
  pmem: kill __pmem address space
  pmem: kill wmb_pmem()
  libnvdimm, pmem: use nvdimm_flush() for namespace I/O writes
  fs/dax: remove wmb_pmem()
  libnvdimm, pmem: flush posted-write queues on shutdown
  ...
2016-07-28 17:38:16 -07:00
Linus Torvalds d94ba9e7d8 This is the bulk of pin control changes for the v4.8 kernel cycle.
New drivers:
 
 - New driver for Oxnas pin control and GPIO. This ARM-based chipset
   is used in a few storage (NAS) type devices.
 
 - New driver for the MAX77620/MAX20024 pin controller portions.
 
 - New driver for the Intel Merrifield pin controller.
 
 New subdrivers:
 
 - New subdriver for the Qualcomm MDM9615
 
 - New subdriver for the STM32F746 MCU
 
 - New subdriver for the Broadcom NSP SoC.
 
 Cleanups:
 
 - Demodularization of bool compiled-in drivers.
 
 Apart from this there is just regular incremental improvements to
 a lot of drivers, especially Uniphier and PFC.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXmcthAAoJEEEQszewGV1z2ukP/1T34tgMllEzBcnyTM28pnPl
 80anOiCi/jkLuW1hYTLc4Rx0tZT2oWw9hrZdKGNMuzNCaJSmFaRhUrbzxhZ8E+6O
 3AHYSopAYUTKVYJsuY+fs3HbNajKBsSWYdmxin4e953BPudLjhZ2WliDXxupsbwZ
 /KI6s8J2pDZcPurrlozT5Avp3BTwwCq+fo12NIMkkmuhURb69OsDAVjPlwjocq73
 BKcdH8AdgO7w5Ss5/IQbvrhyuFc2kCQ/wH1tiuE2a4iYWhp+QkMOEqWUSdYs33bx
 Sbn3KTK6IYYS1eb4xharh7H/zoBs20aCQ2kS8qbYmG+Fv7rB7qboI0qM7m7+25O8
 7F6Tf4F0HUg6IjcABcI5lFuTxACBG8p5ZlmQAp/36EaeIblLALBCvd1ArmZ6fdG+
 Pzu5vLaZBAmxQn6EseHLAJkH4FEzV7II/Sk7U23TINHUpl/L2GJO+6irz7eelKAk
 XED6mrNU8rRnZMka02ZnIgIbABG7ELNJsRxEnf8k9CX7cfi4p568eZeR1nfKcy4+
 uldJLipNv3NfwuRY5JEEa7pFW4azYmnzS1GcoVYPy7snYc4Rr8cbBD6YvcxyMhVz
 RXHc21mj45JnboldAYU59t5BbVZNwZqF8hmg935ngoaZjYfhhnfGoNQF7hy6/1fS
 bwojoQtBqGcriKZYGs0o
 =po0g
 -----END PGP SIGNATURE-----

Merge tag 'pinctrl-v4.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl

Pull pin control updates from Linus Walleij:
 "This is the bulk of pin control changes for the v4.8 kernel cycle.

  Nothing stands out as especially exiting: new drivers, new subdrivers,
  lots of cleanups and incremental features.

  Business as usual.

  New drivers:

   - New driver for Oxnas pin control and GPIO.  This ARM-based chipset
     is used in a few storage (NAS) type devices.

   - New driver for the MAX77620/MAX20024 pin controller portions.

   - New driver for the Intel Merrifield pin controller.

  New subdrivers:

   - New subdriver for the Qualcomm MDM9615

   - New subdriver for the STM32F746 MCU

   - New subdriver for the Broadcom NSP SoC.

  Cleanups:

   - Demodularization of bool compiled-in drivers.

  Apart from this there is just regular incremental improvements to a
  lot of drivers, especially Uniphier and PFC"

* tag 'pinctrl-v4.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (131 commits)
  pinctrl: fix pincontrol definition for marvell
  pinctrl: xway: fix typo
  Revert "pinctrl: amd: make it explicitly non-modular"
  pinctrl: iproc: Add NSP and Stingray GPIO support
  pinctrl: Update iProc GPIO DT bindings
  pinctrl: bcm: add OF dependencies
  pinctrl: ns2: remove redundant dev_err call in ns2_pinmux_probe()
  pinctrl: Add STM32F746 MCU support
  pinctrl: intel: Protect set wake flow by spin lock
  pinctrl: nsp: remove redundant dev_err call in nsp_pinmux_probe()
  pinctrl: uniphier: add Ethernet pin-mux settings
  sh-pfc: Use PTR_ERR_OR_ZERO() to simplify the code
  pinctrl: ns2: fix return value check in ns2_pinmux_probe()
  pinctrl: qcom: update DT bindings with ebi2 groups
  pinctrl: qcom: establish proper EBI2 pin groups
  pinctrl: imx21: Remove the MODULE_DEVICE_TABLE() macro
  Documentation: dt: Add new compatible to STM32 pinctrl driver bindings
  includes: dt-bindings: Add STM32F746 pinctrl DT bindings
  pinctrl: sunxi: fix nand0 function name for sun8i
  pinctrl: uniphier: remove pointless pin-mux settings for PH1-LD11
  ...
2016-07-28 17:06:51 -07:00
Linus Torvalds 1c88e19b0f Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:
 "The rest of MM"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (101 commits)
  mm, compaction: simplify contended compaction handling
  mm, compaction: introduce direct compaction priority
  mm, thp: remove __GFP_NORETRY from khugepaged and madvised allocations
  mm, page_alloc: make THP-specific decisions more generic
  mm, page_alloc: restructure direct compaction handling in slowpath
  mm, page_alloc: don't retry initial attempt in slowpath
  mm, page_alloc: set alloc_flags only once in slowpath
  lib/stackdepot.c: use __GFP_NOWARN for stack allocations
  mm, kasan: switch SLUB to stackdepot, enable memory quarantine for SLUB
  mm, kasan: account for object redzone in SLUB's nearest_obj()
  mm: fix use-after-free if memory allocation failed in vma_adjust()
  zsmalloc: Delete an unnecessary check before the function call "iput"
  mm/memblock.c: fix index adjustment error in __next_mem_range_rev()
  mem-hotplug: alloc new page from a nearest neighbor node when mem-offline
  mm: optimize copy_page_to/from_iter_iovec
  mm: add cond_resched() to generic_swapfile_activate()
  Revert "mm, mempool: only set __GFP_NOMEMALLOC if there are free elements"
  mm, compaction: don't isolate PageWriteback pages in MIGRATE_SYNC_LIGHT mode
  mm: hwpoison: remove incorrect comments
  make __section_nr() more efficient
  ...
2016-07-28 16:36:48 -07:00
Vlastimil Babka c3486f5376 mm, compaction: simplify contended compaction handling
Async compaction detects contention either due to failing trylock on
zone->lock or lru_lock, or by need_resched().  Since 1f9efdef4f ("mm,
compaction: khugepaged should not give up due to need_resched()") the
code got quite complicated to distinguish these two up to the
__alloc_pages_slowpath() level, so different decisions could be taken
for khugepaged allocations.

After the recent changes, khugepaged allocations don't check for
contended compaction anymore, so we again don't need to distinguish lock
and sched contention, and simplify the current convoluted code a lot.

However, I believe it's also possible to simplify even more and
completely remove the check for contended compaction after the initial
async compaction for costly orders, which was originally aimed at THP
page fault allocations.  There are several reasons why this can be done
now:

- with the new defaults, THP page faults no longer do reclaim/compaction at
  all, unless the system admin has overridden the default, or application has
  indicated via madvise that it can benefit from THP's. In both cases, it
  means that the potential extra latency is expected and worth the benefits.
- even if reclaim/compaction proceeds after this patch where it previously
  wouldn't, the second compaction attempt is still async and will detect the
  contention and back off, if the contention persists
- there are still heuristics like deferred compaction and pageblock skip bits
  in place that prevent excessive THP page fault latencies

Link: http://lkml.kernel.org/r/20160721073614.24395-9-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Vlastimil Babka a5508cd83f mm, compaction: introduce direct compaction priority
In the context of direct compaction, for some types of allocations we
would like the compaction to either succeed or definitely fail while
trying as hard as possible.  Current async/sync_light migration mode is
insufficient, as there are heuristics such as caching scanner positions,
marking pageblocks as unsuitable or deferring compaction for a zone.  At
least the final compaction attempt should be able to override these
heuristics.

To communicate how hard compaction should try, we replace migration mode
with a new enum compact_priority and change the relevant function
signatures.  In compact_zone_order() where struct compact_control is
constructed, the priority is mapped to suitable control flags.  This
patch itself has no functional change, as the current priority levels
are mapped back to the same migration modes as before.  Expanding them
will be done next.

Note that !CONFIG_COMPACTION variant of try_to_compact_pages() is
removed, as the only caller exists under CONFIG_COMPACTION.

Link: http://lkml.kernel.org/r/20160721073614.24395-8-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Vlastimil Babka 2516035499 mm, thp: remove __GFP_NORETRY from khugepaged and madvised allocations
After the previous patch, we can distinguish costly allocations that
should be really lightweight, such as THP page faults, with
__GFP_NORETRY.  This means we don't need to recognize khugepaged
allocations via PF_KTHREAD anymore.  We can also change THP page faults
in areas where madvise(MADV_HUGEPAGE) was used to try as hard as
khugepaged, as the process has indicated that it benefits from THP's and
is willing to pay some initial latency costs.

We can also make the flags handling less cryptic by distinguishing
GFP_TRANSHUGE_LIGHT (no reclaim at all, default mode in page fault) from
GFP_TRANSHUGE (only direct reclaim, khugepaged default).  Adding
__GFP_NORETRY or __GFP_KSWAPD_RECLAIM is done where needed.

The patch effectively changes the current GFP_TRANSHUGE users as
follows:

* get_huge_zero_page() - the zero page lifetime should be relatively
  long and it's shared by multiple users, so it's worth spending some
  effort on it.  We use GFP_TRANSHUGE, and __GFP_NORETRY is not added.
  This also restores direct reclaim to this allocation, which was
  unintentionally removed by commit e4a49efe4e7e ("mm: thp: set THP defrag
  by default to madvise and add a stall-free defrag option")

* alloc_hugepage_khugepaged_gfpmask() - this is khugepaged, so latency
  is not an issue.  So if khugepaged "defrag" is enabled (the default), do
  reclaim via GFP_TRANSHUGE without __GFP_NORETRY.  We can remove the
  PF_KTHREAD check from page alloc.

  As a side-effect, khugepaged will now no longer check if the initial
  compaction was deferred or contended.  This is OK, as khugepaged sleep
  times between collapsion attempts are long enough to prevent noticeable
  disruption, so we should allow it to spend some effort.

* migrate_misplaced_transhuge_page() - already was masking out
  __GFP_RECLAIM, so just convert to GFP_TRANSHUGE_LIGHT which is
  equivalent.

* alloc_hugepage_direct_gfpmask() - vma's with VM_HUGEPAGE (via madvise)
  are now allocating without __GFP_NORETRY.  Other vma's keep using
  __GFP_NORETRY if direct reclaim/compaction is at all allowed (by default
  it's allowed only for madvised vma's).  The rest is conversion to
  GFP_TRANSHUGE(_LIGHT).

[mhocko@suse.com: suggested GFP_TRANSHUGE_LIGHT]
Link: http://lkml.kernel.org/r/20160721073614.24395-7-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Alexander Potapenko 80a9201a59 mm, kasan: switch SLUB to stackdepot, enable memory quarantine for SLUB
For KASAN builds:
 - switch SLUB allocator to using stackdepot instead of storing the
   allocation/deallocation stacks in the objects;
 - change the freelist hook so that parts of the freelist can be put
   into the quarantine.

[aryabinin@virtuozzo.com: fixes]
  Link: http://lkml.kernel.org/r/1468601423-28676-1-git-send-email-aryabinin@virtuozzo.com
Link: http://lkml.kernel.org/r/1468347165-41906-3-git-send-email-glider@google.com
Signed-off-by: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <adech.fo@gmail.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Steven Rostedt (Red Hat) <rostedt@goodmis.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Kostya Serebryany <kcc@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Kuthonuzo Luruo <kuthonuzo.luruo@hpe.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Alexander Potapenko c146a2b98e mm, kasan: account for object redzone in SLUB's nearest_obj()
When looking up the nearest SLUB object for a given address, correctly
calculate its offset if SLAB_RED_ZONE is enabled for that cache.

Previously, when KASAN had detected an error on an object from a cache
with SLAB_RED_ZONE set, the actual start address of the object was
miscalculated, which led to random stacks having been reported.

When looking up the nearest SLUB object for a given address, correctly
calculate its offset if SLAB_RED_ZONE is enabled for that cache.

Fixes: 7ed2f9e663 ("mm, kasan: SLAB support")
Link: http://lkml.kernel.org/r/1468347165-41906-2-git-send-email-glider@google.com
Signed-off-by: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <adech.fo@gmail.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Steven Rostedt (Red Hat) <rostedt@goodmis.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Kostya Serebryany <kcc@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Kuthonuzo Luruo <kuthonuzo.luruo@hpe.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Dennis Chen a571d4eb55 mm/memblock.c: add new infrastructure to address the mem limit issue
In some cases, memblock is queried by kernel to determine whether a
specified address is RAM or not.  For example, the ACPI core needs this
information to determine which attributes to use when mapping ACPI
regions(acpi_os_ioremap).  Use of incorrect memory types can result in
faults, data corruption, or other issues.

Removing memory with memblock_enforce_memory_limit() throws away this
information, and so a kernel booted with 'mem=' may suffer from the
issues described above.  To avoid this, we need to keep those NOMAP
regions instead of removing all above the limit, which preserves the
information we need while preventing other use of those regions.

This patch adds new infrastructure to retain all NOMAP memblock regions
while removing others, to cater for this.

Link: http://lkml.kernel.org/r/1468475036-5852-2-git-send-email-dennis.chen@arm.com
Signed-off-by: Dennis Chen <dennis.chen@arm.com>
Acked-by: Steve Capper <steve.capper@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Kaly Xin <kaly.xin@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Andy Lutomirski e558af65be kdb: use task_cpu() instead of task_thread_info()->cpu
We'll need this cleanup to make the cpu field in thread_info be
optional.

Link: http://lkml.kernel.org/r/da298328dc77ea494576c2f20a934218e758a6fa.1468523549.git.luto@kernel.org
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Andy Lutomirski efdc949079 mm: fix memcg stack accounting for sub-page stacks
We should account for stacks regardless of stack size, and we need to
account in sub-page units if THREAD_SIZE < PAGE_SIZE.  Change the units
to kilobytes and Move it into account_kernel_stack().

Fixes: 12580e4b54 ("mm: memcontrol: report kernel stack usage in cgroup2 memory.stat")
Link: http://lkml.kernel.org/r/9b5314e3ee5eda61b0317ec1563768602c1ef438.1468523549.git.luto@kernel.org
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Vladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Andy Lutomirski d30dd8be06 mm: track NR_KERNEL_STACK in KiB instead of number of stacks
Currently, NR_KERNEL_STACK tracks the number of kernel stacks in a zone.
This only makes sense if each kernel stack exists entirely in one zone,
and allowing vmapped stacks could break this assumption.

Since frv has THREAD_SIZE < PAGE_SIZE, we need to track kernel stack
allocations in a unit that divides both THREAD_SIZE and PAGE_SIZE on all
architectures.  Keep it simple and use KiB.

Link: http://lkml.kernel.org/r/083c71e642c5fa5f1b6898902e1b2db7b48940d4.1468523549.git.luto@kernel.org
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Vladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Dan Williams 11db048643 mm: cleanup ifdef guards for vmem_altmap
Now that ZONE_DEVICE depends on SPARSEMEM_VMEMMAP we can simplify some
ifdef guards to just ZONE_DEVICE.

Link: http://lkml.kernel.org/r/146687646788.39261.8020536391978771940.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reported-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Eric Sandeen <sandeen@redhat.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Huang Ying 319904ad40 mm, THP: clean up return value of madvise_free_huge_pmd
The definition of return value of madvise_free_huge_pmd is not clear
before.  According to the suggestion of Minchan Kim, change the type of
return value to bool and return true if we do MADV_FREE successfully on
entire pmd page, otherwise, return false.  Comments are added too.

Link: http://lkml.kernel.org/r/1467135452-16688-2-git-send-email-ying.huang@intel.com
Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Ebru Akagunduz <ebru.akagunduz@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Mel Gorman 5a1c84b404 mm: remove reclaim and compaction retry approximations
If per-zone LRU accounting is available then there is no point
approximating whether reclaim and compaction should retry based on pgdat
statistics.  This is effectively a revert of "mm, vmstat: remove zone
and node double accounting by approximating retries" with the difference
that inactive/active stats are still available.  This preserves the
history of why the approximation was retried and why it had to be
reverted to handle OOM kills on 32-bit systems.

Link: http://lkml.kernel.org/r/1469110261-7365-4-git-send-email-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Mel Gorman bb4cc2bea6 mm, vmscan: remove highmem_file_pages
With the reintroduction of per-zone LRU stats, highmem_file_pages is
redundant so remove it.

[mgorman@techsingularity.net: wrong stat is being accumulated in highmem_dirtyable_memory]
  Link: http://lkml.kernel.org/r/20160725092324.GM10438@techsingularity.netLink: http://lkml.kernel.org/r/1469110261-7365-3-git-send-email-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Minchan Kim 71c799f498 mm: add per-zone lru list stat
When I did stress test with hackbench, I got OOM message frequently
which didn't ever happen in zone-lru.

  gfp_mask=0x26004c0(GFP_KERNEL|__GFP_REPEAT|__GFP_NOTRACK), order=0
  ..
  ..
   __alloc_pages_nodemask+0xe52/0xe60
   ? new_slab+0x39c/0x3b0
   new_slab+0x39c/0x3b0
   ___slab_alloc.constprop.87+0x6da/0x840
   ? __alloc_skb+0x3c/0x260
   ? _raw_spin_unlock_irq+0x27/0x60
   ? trace_hardirqs_on_caller+0xec/0x1b0
   ? finish_task_switch+0xa6/0x220
   ? poll_select_copy_remaining+0x140/0x140
   __slab_alloc.isra.81.constprop.86+0x40/0x6d
   ? __alloc_skb+0x3c/0x260
   kmem_cache_alloc+0x22c/0x260
   ? __alloc_skb+0x3c/0x260
   __alloc_skb+0x3c/0x260
   alloc_skb_with_frags+0x4e/0x1a0
   sock_alloc_send_pskb+0x16a/0x1b0
   ? wait_for_unix_gc+0x31/0x90
   ? alloc_set_pte+0x2ad/0x310
   unix_stream_sendmsg+0x28d/0x340
   sock_sendmsg+0x2d/0x40
   sock_write_iter+0x6c/0xc0
   __vfs_write+0xc0/0x120
   vfs_write+0x9b/0x1a0
   ? __might_fault+0x49/0xa0
   SyS_write+0x44/0x90
   do_fast_syscall_32+0xa6/0x1e0
   sysenter_past_esp+0x45/0x74

  Mem-Info:
  active_anon:104698 inactive_anon:105791 isolated_anon:192
   active_file:433 inactive_file:283 isolated_file:22
   unevictable:0 dirty:0 writeback:296 unstable:0
   slab_reclaimable:6389 slab_unreclaimable:78927
   mapped:474 shmem:0 pagetables:101426 bounce:0
   free:10518 free_pcp:334 free_cma:0
  Node 0 active_anon:418792kB inactive_anon:423164kB active_file:1732kB inactive_file:1132kB unevictable:0kB isolated(anon):768kB isolated(file):88kB mapped:1896kB dirty:0kB writeback:1184kB shmem:0kB writeback_tmp:0kB unstable:0kB pages_scanned:1478632 all_unreclaimable? yes
  DMA free:3304kB min:68kB low:84kB high:100kB present:15992kB managed:15916kB mlocked:0kB slab_reclaimable:0kB slab_unreclaimable:4088kB kernel_stack:0kB pagetables:2480kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
  lowmem_reserve[]: 0 809 1965 1965
  Normal free:3436kB min:3604kB low:4504kB high:5404kB present:897016kB managed:858460kB mlocked:0kB slab_reclaimable:25556kB slab_unreclaimable:311712kB kernel_stack:164608kB pagetables:30844kB bounce:0kB free_pcp:620kB local_pcp:104kB free_cma:0kB
  lowmem_reserve[]: 0 0 9247 9247
  HighMem free:33808kB min:512kB low:1796kB high:3080kB present:1183736kB managed:1183736kB mlocked:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:372252kB bounce:0kB free_pcp:428kB local_pcp:72kB free_cma:0kB
  lowmem_reserve[]: 0 0 0 0
  DMA: 2*4kB (UM) 2*8kB (UM) 0*16kB 1*32kB (U) 1*64kB (U) 2*128kB (UM) 1*256kB (U) 1*512kB (M) 0*1024kB 1*2048kB (U) 0*4096kB = 3192kB
  Normal: 33*4kB (MH) 79*8kB (ME) 11*16kB (M) 4*32kB (M) 2*64kB (ME) 2*128kB (EH) 7*256kB (EH) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 3244kB
  HighMem: 2590*4kB (UM) 1568*8kB (UM) 491*16kB (UM) 60*32kB (UM) 6*64kB (M) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 33064kB
  Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
  25121 total pagecache pages
  24160 pages in swap cache
  Swap cache stats: add 86371, delete 62211, find 42865/60187
  Free swap  = 4015560kB
  Total swap = 4192252kB
  524186 pages RAM
  295934 pages HighMem/MovableOnly
  9658 pages reserved
  0 pages cma reserved

The order-0 allocation for normal zone failed while there are a lot of
reclaimable memory(i.e., anonymous memory with free swap).  I wanted to
analyze the problem but it was hard because we removed per-zone lru stat
so I couldn't know how many of anonymous memory there are in normal/dma
zone.

When we investigate OOM problem, reclaimable memory count is crucial
stat to find a problem.  Without it, it's hard to parse the OOM message
so I believe we should keep it.

With per-zone lru stat,

  gfp_mask=0x26004c0(GFP_KERNEL|__GFP_REPEAT|__GFP_NOTRACK), order=0
  Mem-Info:
  active_anon:101103 inactive_anon:102219 isolated_anon:0
   active_file:503 inactive_file:544 isolated_file:0
   unevictable:0 dirty:0 writeback:34 unstable:0
   slab_reclaimable:6298 slab_unreclaimable:74669
   mapped:863 shmem:0 pagetables:100998 bounce:0
   free:23573 free_pcp:1861 free_cma:0
  Node 0 active_anon:404412kB inactive_anon:409040kB active_file:2012kB inactive_file:2176kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:3452kB dirty:0kB writeback:136kB shmem:0kB writeback_tmp:0kB unstable:0kB pages_scanned:1320845 all_unreclaimable? yes
  DMA free:3296kB min:68kB low:84kB high:100kB active_anon:5540kB inactive_anon:0kB active_file:0kB inactive_file:0kB present:15992kB managed:15916kB mlocked:0kB slab_reclaimable:248kB slab_unreclaimable:2628kB kernel_stack:792kB pagetables:2316kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
  lowmem_reserve[]: 0 809 1965 1965
  Normal free:3600kB min:3604kB low:4504kB high:5404kB active_anon:86304kB inactive_anon:0kB active_file:160kB inactive_file:376kB present:897016kB managed:858524kB mlocked:0kB slab_reclaimable:24944kB slab_unreclaimable:296048kB kernel_stack:163832kB pagetables:35892kB bounce:0kB free_pcp:3076kB local_pcp:656kB free_cma:0kB
  lowmem_reserve[]: 0 0 9247 9247
  HighMem free:86156kB min:512kB low:1796kB high:3080kB active_anon:312852kB inactive_anon:410024kB active_file:1924kB inactive_file:2012kB present:1183736kB managed:1183736kB mlocked:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:365784kB bounce:0kB free_pcp:3868kB local_pcp:720kB free_cma:0kB
  lowmem_reserve[]: 0 0 0 0
  DMA: 8*4kB (UM) 8*8kB (UM) 4*16kB (M) 2*32kB (UM) 2*64kB (UM) 1*128kB (M) 3*256kB (UME) 2*512kB (UE) 1*1024kB (E) 0*2048kB 0*4096kB = 3296kB
  Normal: 240*4kB (UME) 160*8kB (UME) 23*16kB (ME) 3*32kB (UE) 3*64kB (UME) 2*128kB (ME) 1*256kB (U) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 3408kB
  HighMem: 10942*4kB (UM) 3102*8kB (UM) 866*16kB (UM) 76*32kB (UM) 11*64kB (UM) 4*128kB (UM) 1*256kB (M) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 86344kB
  Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
  54409 total pagecache pages
  53215 pages in swap cache
  Swap cache stats: add 300982, delete 247765, find 157978/226539
  Free swap  = 3803244kB
  Total swap = 4192252kB
  524186 pages RAM
  295934 pages HighMem/MovableOnly
  9642 pages reserved
  0 pages cma reserved

With that, we can see normal zone has a 86M reclaimable memory so we can
know something goes wrong(I will fix the problem in next patch) in
reclaim.

[mgorman@techsingularity.net: rename zone LRU stats in /proc/vmstat]
 Link: http://lkml.kernel.org/r/20160725072300.GK10438@techsingularity.net
Link: http://lkml.kernel.org/r/1469110261-7365-2-git-send-email-mgorman@techsingularity.net
Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Mel Gorman 7ee36a14f0 mm, vmscan: Update all zone LRU sizes before updating memcg
Minchan Kim reported setting the following warning on a 32-bit system
although it can affect 64-bit systems.

  WARNING: CPU: 4 PID: 1322 at mm/memcontrol.c:998 mem_cgroup_update_lru_size+0x103/0x110
  mem_cgroup_update_lru_size(f44b4000, 1, -7): zid 1 lru_size 1 but empty
  Modules linked in:
  CPU: 4 PID: 1322 Comm: cp Not tainted 4.7.0-rc4-mm1+ #143
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
  Call Trace:
    dump_stack+0x76/0xaf
    __warn+0xea/0x110
    ? mem_cgroup_update_lru_size+0x103/0x110
    warn_slowpath_fmt+0x3b/0x40
    mem_cgroup_update_lru_size+0x103/0x110
    isolate_lru_pages.isra.61+0x2e2/0x360
    shrink_active_list+0xac/0x2a0
    ? __delay+0xe/0x10
    shrink_node_memcg+0x53c/0x7a0
    shrink_node+0xab/0x2a0
    do_try_to_free_pages+0xc6/0x390
    try_to_free_pages+0x245/0x590

LRU list contents and counts are updated separately.  Counts are updated
before pages are added to the LRU and updated after pages are removed.
The warning above is from a check in mem_cgroup_update_lru_size that
ensures that list sizes of zero are empty.

The problem is that node-lru needs to account for highmem pages if
CONFIG_HIGHMEM is set.  One impact of the implementation is that the
sizes are updated in multiple passes when pages from multiple zones were
isolated.  This happens whether HIGHMEM is set or not.  When multiple
zones are isolated, it's possible for a debugging check in memcg to be
tripped.

This patch forces all the zone counts to be updated before the memcg
function is called.

Link: http://lkml.kernel.org/r/1468588165-12461-6-git-send-email-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Tested-by: Minchan Kim <minchan@kernel.org>
Reported-by: Minchan Kim <minchan@kernel.org>
Acked-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Mel Gorman bca6759258 mm, vmstat: remove zone and node double accounting by approximating retries
The number of LRU pages, dirty pages and writeback pages must be
accounted for on both zones and nodes because of the reclaim retry
logic, compaction retry logic and highmem calculations all depending on
per-zone stats.

Many lowmem allocations are immune from OOM kill due to a check in
__alloc_pages_may_oom for (ac->high_zoneidx < ZONE_NORMAL) since commit
03668b3ceb ("oom: avoid oom killer for lowmem allocations").  The
exception is costly high-order allocations or allocations that cannot
fail.  If the __alloc_pages_may_oom avoids OOM-kill for low-order lowmem
allocations then it would fall through to __alloc_pages_direct_compact.

This patch will blindly retry reclaim for zone-constrained allocations
in should_reclaim_retry up to MAX_RECLAIM_RETRIES.  This is not ideal
but without per-zone stats there are not many alternatives.  The impact
it that zone-constrained allocations may delay before considering the
OOM killer.

As there is no guarantee enough memory can ever be freed to satisfy
compaction, this patch avoids retrying compaction for zone-contrained
allocations.

In combination, that means that the per-node stats can be used when
deciding whether to continue reclaim using a rough approximation.  While
it is possible this will make the wrong decision on occasion, it will
not infinite loop as the number of reclaim attempts is capped by
MAX_RECLAIM_RETRIES.

The final step is calculating the number of dirtyable highmem pages.  As
those calculations only care about the global count of file pages in
highmem.  This patch uses a global counter used instead of per-zone
stats as it is sufficient.

In combination, this allows the per-zone LRU and dirty state counters to
be removed.

[mgorman@techsingularity.net: fix acct_highmem_file_pages()]
  Link: http://lkml.kernel.org/r/1468853426-12858-4-git-send-email-mgorman@techsingularity.netLink: http://lkml.kernel.org/r/1467970510-21195-35-git-send-email-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Suggested by: Michal Hocko <mhocko@kernel.org>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Mel Gorman 7cc30fcfd2 mm: vmstat: account per-zone stalls and pages skipped during reclaim
The vmstat allocstall was fairly useful in the general sense but
node-based LRUs change that.  It's important to know if a stall was for
an address-limited allocation request as this will require skipping
pages from other zones.  This patch adds pgstall_* counters to replace
allocstall.  The sum of the counters will equal the old allocstall so it
can be trivially recalculated.  A high number of address-limited
allocation requests may result in a lot of useless LRU scanning for
suitable pages.

As address-limited allocations require pages to be skipped, it's
important to know how much useless LRU scanning took place so this patch
adds pgskip* counters.  This yields the following model

1. The number of address-space limited stalls can be accounted for (pgstall)
2. The amount of useless work required to reclaim the data is accounted (pgskip)
3. The total number of scans is available from pgscan_kswapd and pgscan_direct
   so from that the ratio of useful to useless scans can be calculated.

[mgorman@techsingularity.net: s/pgstall/allocstall/]
  Link: http://lkml.kernel.org/r/1468404004-5085-3-git-send-email-mgorman@techsingularity.netLink: http://lkml.kernel.org/r/1467970510-21195-33-git-send-email-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Mel Gorman 16709d1de1 mm: vmstat: replace __count_zone_vm_events with a zone id equivalent
This is partially a preparation patch for more vmstat work but it also
has the slight advantage that __count_zid_vm_events is cheaper to
calculate than __count_zone_vm_events().

Link: http://lkml.kernel.org/r/1467970510-21195-32-git-send-email-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Mel Gorman e6cbd7f2ef mm, page_alloc: remove fair zone allocation policy
The fair zone allocation policy interleaves allocation requests between
zones to avoid an age inversion problem whereby new pages are reclaimed
to balance a zone.  Reclaim is now node-based so this should no longer
be an issue and the fair zone allocation policy is not free.  This patch
removes it.

Link: http://lkml.kernel.org/r/1467970510-21195-30-git-send-email-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Mel Gorman a5f5f91da6 mm: convert zone_reclaim to node_reclaim
As reclaim is now per-node based, convert zone_reclaim to be
node_reclaim.  It is possible that a node will be reclaimed multiple
times if it has multiple zones but this is unavoidable without caching
all nodes traversed so far.  The documentation and interface to
userspace is the same from a configuration perspective and will will be
similar in behaviour unless the node-local allocation requests were also
limited to lower zones.

Link: http://lkml.kernel.org/r/1467970510-21195-24-git-send-email-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Mel Gorman c4a25635b6 mm: move vmscan writes and file write accounting to the node
As reclaim is now node-based, it follows that page write activity due to
page reclaim should also be accounted for on the node.  For consistency,
also account page writes and page dirtying on a per-node basis.

After this patch, there are a few remaining zone counters that may appear
strange but are fine.  NUMA stats are still per-zone as this is a
user-space interface that tools consume.  NR_MLOCK, NR_SLAB_*,
NR_PAGETABLE, NR_KERNEL_STACK and NR_BOUNCE are all allocations that
potentially pin low memory and cannot trivially be reclaimed on demand.
This information is still useful for debugging a page allocation failure
warning.

Link: http://lkml.kernel.org/r/1467970510-21195-21-git-send-email-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Mel Gorman 11fb998986 mm: move most file-based accounting to the node
There are now a number of accounting oddities such as mapped file pages
being accounted for on the node while the total number of file pages are
accounted on the zone.  This can be coped with to some extent but it's
confusing so this patch moves the relevant file-based accounted.  Due to
throttling logic in the page allocator for reliable OOM detection, it is
still necessary to track dirty and writeback pages on a per-zone basis.

[mgorman@techsingularity.net: fix NR_ZONE_WRITE_PENDING accounting]
  Link: http://lkml.kernel.org/r/1468404004-5085-5-git-send-email-mgorman@techsingularity.net
Link: http://lkml.kernel.org/r/1467970510-21195-20-git-send-email-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Mel Gorman 4b9d0fab71 mm: rename NR_ANON_PAGES to NR_ANON_MAPPED
NR_FILE_PAGES  is the number of        file pages.
NR_FILE_MAPPED is the number of mapped file pages.
NR_ANON_PAGES  is the number of mapped anon pages.

This is unhelpful naming as it's easy to confuse NR_FILE_MAPPED and
NR_ANON_PAGES for mapped pages.  This patch renames NR_ANON_PAGES so we
have

NR_FILE_PAGES  is the number of        file pages.
NR_FILE_MAPPED is the number of mapped file pages.
NR_ANON_MAPPED is the number of mapped anon pages.

Link: http://lkml.kernel.org/r/1467970510-21195-19-git-send-email-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Mel Gorman 50658e2e04 mm: move page mapped accounting to the node
Reclaim makes decisions based on the number of pages that are mapped but
it's mixing node and zone information.  Account NR_FILE_MAPPED and
NR_ANON_PAGES pages on the node.

Link: http://lkml.kernel.org/r/1467970510-21195-18-git-send-email-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Mel Gorman 281e37265f mm, page_alloc: consider dirtyable memory in terms of nodes
Historically dirty pages were spread among zones but now that LRUs are
per-node it is more appropriate to consider dirty pages in a node.

Link: http://lkml.kernel.org/r/1467970510-21195-17-git-send-email-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Mel Gorman 1e6b10857f mm, workingset: make working set detection node-aware
Working set and refault detection is still zone-based, fix it.

Link: http://lkml.kernel.org/r/1467970510-21195-16-git-send-email-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Mel Gorman ef8f232799 mm, memcg: move memcg limit enforcement from zones to nodes
Memcg needs adjustment after moving LRUs to the node.  Limits are
tracked per memcg but the soft-limit excess is tracked per zone.  As
global page reclaim is based on the node, it is easy to imagine a
situation where a zone soft limit is exceeded even though the memcg
limit is fine.

This patch moves the soft limit tree the node.  Technically, all the
variable names should also change but people are already familiar by the
meaning of "mz" even if "mn" would be a more appropriate name now.

Link: http://lkml.kernel.org/r/1467970510-21195-15-git-send-email-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Mel Gorman a9dd0a8310 mm, vmscan: make shrink_node decisions more node-centric
Earlier patches focused on having direct reclaim and kswapd use data
that is node-centric for reclaiming but shrink_node() itself still uses
too much zone information.  This patch removes unnecessary zone-based
information with the most important decision being whether to continue
reclaim or not.  Some memcg APIs are adjusted as a result even though
memcg itself still uses some zone information.

[mgorman@techsingularity.net: optimization]
  Link: http://lkml.kernel.org/r/1468588165-12461-2-git-send-email-mgorman@techsingularity.net
Link: http://lkml.kernel.org/r/1467970510-21195-14-git-send-email-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Mel Gorman 38087d9b03 mm, vmscan: simplify the logic deciding whether kswapd sleeps
kswapd goes through some complex steps trying to figure out if it should
stay awake based on the classzone_idx and the requested order.  It is
unnecessarily complex and passes in an invalid classzone_idx to
balance_pgdat().  What matters most of all is whether a larger order has
been requsted and whether kswapd successfully reclaimed at the previous
order.  This patch irons out the logic to check just that and the end
result is less headache inducing.

Link: http://lkml.kernel.org/r/1467970510-21195-10-git-send-email-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Mel Gorman 31483b6ad2 mm, vmscan: remove balance gap
The balance gap was introduced to apply equal pressure to all zones when
reclaiming for a higher zone.  With node-based LRU, the need for the
balance gap is removed and the code is dead so remove it.

[vbabka@suse.cz: Also remove KSWAPD_ZONE_BALANCE_GAP_RATIO]
Link: http://lkml.kernel.org/r/1467970510-21195-9-git-send-email-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Mel Gorman 0f66114893 mm, mmzone: clarify the usage of zone padding
Zone padding separates write-intensive fields used by page allocation,
compaction and vmstats but the comments are a little misleading and need
clarification.

Link: http://lkml.kernel.org/r/1467970510-21195-5-git-send-email-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Mel Gorman 599d0c954f mm, vmscan: move LRU lists to node
This moves the LRU lists from the zone to the node and related data such
as counters, tracing, congestion tracking and writeback tracking.

Unfortunately, due to reclaim and compaction retry logic, it is
necessary to account for the number of LRU pages on both zone and node
logic.  Most reclaim logic is based on the node counters but the retry
logic uses the zone counters which do not distinguish inactive and
active sizes.  It would be possible to leave the LRU counters on a
per-zone basis but it's a heavier calculation across multiple cache
lines that is much more frequent than the retry checks.

Other than the LRU counters, this is mostly a mechanical patch but note
that it introduces a number of anomalies.  For example, the scans are
per-zone but using per-node counters.  We also mark a node as congested
when a zone is congested.  This causes weird problems that are fixed
later but is easier to review.

In the event that there is excessive overhead on 32-bit systems due to
the nodes being on LRU then there are two potential solutions

1. Long-term isolation of highmem pages when reclaim is lowmem

   When pages are skipped, they are immediately added back onto the LRU
   list. If lowmem reclaim persisted for long periods of time, the same
   highmem pages get continually scanned. The idea would be that lowmem
   keeps those pages on a separate list until a reclaim for highmem pages
   arrives that splices the highmem pages back onto the LRU. It potentially
   could be implemented similar to the UNEVICTABLE list.

   That would reduce the skip rate with the potential corner case is that
   highmem pages have to be scanned and reclaimed to free lowmem slab pages.

2. Linear scan lowmem pages if the initial LRU shrink fails

   This will break LRU ordering but may be preferable and faster during
   memory pressure than skipping LRU pages.

Link: http://lkml.kernel.org/r/1467970510-21195-4-git-send-email-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Mel Gorman a52633d8e9 mm, vmscan: move lru_lock to the node
Node-based reclaim requires node-based LRUs and locking.  This is a
preparation patch that just moves the lru_lock to the node so later
patches are easier to review.  It is a mechanical change but note this
patch makes contention worse because the LRU lock is hotter and direct
reclaim and kswapd can contend on the same lock even when reclaiming
from different zones.

Link: http://lkml.kernel.org/r/1467970510-21195-3-git-send-email-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Reviewed-by: Minchan Kim <minchan@kernel.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Mel Gorman 75ef718405 mm, vmstat: add infrastructure for per-node vmstats
Patchset: "Move LRU page reclaim from zones to nodes v9"

This series moves LRUs from the zones to the node.  While this is a
current rebase, the test results were based on mmotm as of June 23rd.
Conceptually, this series is simple but there are a lot of details.
Some of the broad motivations for this are;

1. The residency of a page partially depends on what zone the page was
   allocated from.  This is partially combatted by the fair zone allocation
   policy but that is a partial solution that introduces overhead in the
   page allocator paths.

2. Currently, reclaim on node 0 behaves slightly different to node 1. For
   example, direct reclaim scans in zonelist order and reclaims even if
   the zone is over the high watermark regardless of the age of pages
   in that LRU. Kswapd on the other hand starts reclaim on the highest
   unbalanced zone. A difference in distribution of file/anon pages due
   to when they were allocated results can result in a difference in
   again. While the fair zone allocation policy mitigates some of the
   problems here, the page reclaim results on a multi-zone node will
   always be different to a single-zone node.
   it was scheduled on as a result.

3. kswapd and the page allocator scan zones in the opposite order to
   avoid interfering with each other but it's sensitive to timing.  This
   mitigates the page allocator using pages that were allocated very recently
   in the ideal case but it's sensitive to timing. When kswapd is allocating
   from lower zones then it's great but during the rebalancing of the highest
   zone, the page allocator and kswapd interfere with each other. It's worse
   if the highest zone is small and difficult to balance.

4. slab shrinkers are node-based which makes it harder to identify the exact
   relationship between slab reclaim and LRU reclaim.

The reason we have zone-based reclaim is that we used to have
large highmem zones in common configurations and it was necessary
to quickly find ZONE_NORMAL pages for reclaim. Today, this is much
less of a concern as machines with lots of memory will (or should) use
64-bit kernels. Combinations of 32-bit hardware and 64-bit hardware are
rare. Machines that do use highmem should have relatively low highmem:lowmem
ratios than we worried about in the past.

Conceptually, moving to node LRUs should be easier to understand. The
page allocator plays fewer tricks to game reclaim and reclaim behaves
similarly on all nodes.

The series has been tested on a 16 core UMA machine and a 2-socket 48
core NUMA machine. The UMA results are presented in most cases as the NUMA
machine behaved similarly.

pagealloc
---------

This is a microbenchmark that shows the benefit of removing the fair zone
allocation policy. It was tested uip to order-4 but only orders 0 and 1 are
shown as the other orders were comparable.

                                           4.7.0-rc4                  4.7.0-rc4
                                      mmotm-20160623                 nodelru-v9
Min      total-odr0-1               490.00 (  0.00%)           457.00 (  6.73%)
Min      total-odr0-2               347.00 (  0.00%)           329.00 (  5.19%)
Min      total-odr0-4               288.00 (  0.00%)           273.00 (  5.21%)
Min      total-odr0-8               251.00 (  0.00%)           239.00 (  4.78%)
Min      total-odr0-16              234.00 (  0.00%)           222.00 (  5.13%)
Min      total-odr0-32              223.00 (  0.00%)           211.00 (  5.38%)
Min      total-odr0-64              217.00 (  0.00%)           208.00 (  4.15%)
Min      total-odr0-128             214.00 (  0.00%)           204.00 (  4.67%)
Min      total-odr0-256             250.00 (  0.00%)           230.00 (  8.00%)
Min      total-odr0-512             271.00 (  0.00%)           269.00 (  0.74%)
Min      total-odr0-1024            291.00 (  0.00%)           282.00 (  3.09%)
Min      total-odr0-2048            303.00 (  0.00%)           296.00 (  2.31%)
Min      total-odr0-4096            311.00 (  0.00%)           309.00 (  0.64%)
Min      total-odr0-8192            316.00 (  0.00%)           314.00 (  0.63%)
Min      total-odr0-16384           317.00 (  0.00%)           315.00 (  0.63%)
Min      total-odr1-1               742.00 (  0.00%)           712.00 (  4.04%)
Min      total-odr1-2               562.00 (  0.00%)           530.00 (  5.69%)
Min      total-odr1-4               457.00 (  0.00%)           433.00 (  5.25%)
Min      total-odr1-8               411.00 (  0.00%)           381.00 (  7.30%)
Min      total-odr1-16              381.00 (  0.00%)           356.00 (  6.56%)
Min      total-odr1-32              372.00 (  0.00%)           346.00 (  6.99%)
Min      total-odr1-64              372.00 (  0.00%)           343.00 (  7.80%)
Min      total-odr1-128             375.00 (  0.00%)           351.00 (  6.40%)
Min      total-odr1-256             379.00 (  0.00%)           351.00 (  7.39%)
Min      total-odr1-512             385.00 (  0.00%)           355.00 (  7.79%)
Min      total-odr1-1024            386.00 (  0.00%)           358.00 (  7.25%)
Min      total-odr1-2048            390.00 (  0.00%)           362.00 (  7.18%)
Min      total-odr1-4096            390.00 (  0.00%)           362.00 (  7.18%)
Min      total-odr1-8192            388.00 (  0.00%)           363.00 (  6.44%)

This shows a steady improvement throughout. The primary benefit is from
reduced system CPU usage which is obvious from the overall times;

           4.7.0-rc4   4.7.0-rc4
        mmotm-20160623nodelru-v8
User          189.19      191.80
System       2604.45     2533.56
Elapsed      2855.30     2786.39

The vmstats also showed that the fair zone allocation policy was definitely
removed as can be seen here;

                             4.7.0-rc3   4.7.0-rc3
                         mmotm-20160623 nodelru-v8
DMA32 allocs               28794729769           0
Normal allocs              48432501431 77227309877
Movable allocs                       0           0

tiobench on ext4
----------------

tiobench is a benchmark that artifically benefits if old pages remain resident
while new pages get reclaimed. The fair zone allocation policy mitigates this
problem so pages age fairly. While the benchmark has problems, it is important
that tiobench performance remains constant as it implies that page aging
problems that the fair zone allocation policy fixes are not re-introduced.

                                         4.7.0-rc4             4.7.0-rc4
                                    mmotm-20160623            nodelru-v9
Min      PotentialReadSpeed        89.65 (  0.00%)       90.21 (  0.62%)
Min      SeqRead-MB/sec-1          82.68 (  0.00%)       82.01 ( -0.81%)
Min      SeqRead-MB/sec-2          72.76 (  0.00%)       72.07 ( -0.95%)
Min      SeqRead-MB/sec-4          75.13 (  0.00%)       74.92 ( -0.28%)
Min      SeqRead-MB/sec-8          64.91 (  0.00%)       65.19 (  0.43%)
Min      SeqRead-MB/sec-16         62.24 (  0.00%)       62.22 ( -0.03%)
Min      RandRead-MB/sec-1          0.88 (  0.00%)        0.88 (  0.00%)
Min      RandRead-MB/sec-2          0.95 (  0.00%)        0.92 ( -3.16%)
Min      RandRead-MB/sec-4          1.43 (  0.00%)        1.34 ( -6.29%)
Min      RandRead-MB/sec-8          1.61 (  0.00%)        1.60 ( -0.62%)
Min      RandRead-MB/sec-16         1.80 (  0.00%)        1.90 (  5.56%)
Min      SeqWrite-MB/sec-1         76.41 (  0.00%)       76.85 (  0.58%)
Min      SeqWrite-MB/sec-2         74.11 (  0.00%)       73.54 ( -0.77%)
Min      SeqWrite-MB/sec-4         80.05 (  0.00%)       80.13 (  0.10%)
Min      SeqWrite-MB/sec-8         72.88 (  0.00%)       73.20 (  0.44%)
Min      SeqWrite-MB/sec-16        75.91 (  0.00%)       76.44 (  0.70%)
Min      RandWrite-MB/sec-1         1.18 (  0.00%)        1.14 ( -3.39%)
Min      RandWrite-MB/sec-2         1.02 (  0.00%)        1.03 (  0.98%)
Min      RandWrite-MB/sec-4         1.05 (  0.00%)        0.98 ( -6.67%)
Min      RandWrite-MB/sec-8         0.89 (  0.00%)        0.92 (  3.37%)
Min      RandWrite-MB/sec-16        0.92 (  0.00%)        0.93 (  1.09%)

           4.7.0-rc4   4.7.0-rc4
        mmotm-20160623 approx-v9
User          645.72      525.90
System        403.85      331.75
Elapsed      6795.36     6783.67

This shows that the series has little or not impact on tiobench which is
desirable and a reduction in system CPU usage. It indicates that the fair
zone allocation policy was removed in a manner that didn't reintroduce
one class of page aging bug. There were only minor differences in overall
reclaim activity

                             4.7.0-rc4   4.7.0-rc4
                          mmotm-20160623nodelru-v8
Minor Faults                    645838      647465
Major Faults                       573         640
Swap Ins                             0           0
Swap Outs                            0           0
DMA allocs                           0           0
DMA32 allocs                  46041453    44190646
Normal allocs                 78053072    79887245
Movable allocs                       0           0
Allocation stalls                   24          67
Stall zone DMA                       0           0
Stall zone DMA32                     0           0
Stall zone Normal                    0           2
Stall zone HighMem                   0           0
Stall zone Movable                   0          65
Direct pages scanned             10969       30609
Kswapd pages scanned          93375144    93492094
Kswapd pages reclaimed        93372243    93489370
Direct pages reclaimed           10969       30609
Kswapd efficiency                  99%         99%
Kswapd velocity              13741.015   13781.934
Direct efficiency                 100%        100%
Direct velocity                  1.614       4.512
Percentage direct scans             0%          0%

kswapd activity was roughly comparable. There were differences in direct
reclaim activity but negligible in the context of the overall workload
(velocity of 4 pages per second with the patches applied, 1.6 pages per
second in the baseline kernel).

pgbench read-only large configuration on ext4
---------------------------------------------

pgbench is a database benchmark that can be sensitive to page reclaim
decisions. This also checks if removing the fair zone allocation policy
is safe

pgbench Transactions
                        4.7.0-rc4             4.7.0-rc4
                   mmotm-20160623            nodelru-v8
Hmean    1       188.26 (  0.00%)      189.78 (  0.81%)
Hmean    5       330.66 (  0.00%)      328.69 ( -0.59%)
Hmean    12      370.32 (  0.00%)      380.72 (  2.81%)
Hmean    21      368.89 (  0.00%)      369.00 (  0.03%)
Hmean    30      382.14 (  0.00%)      360.89 ( -5.56%)
Hmean    32      428.87 (  0.00%)      432.96 (  0.95%)

Negligible differences again. As with tiobench, overall reclaim activity
was comparable.

bonnie++ on ext4
----------------

No interesting performance difference, negligible differences on reclaim
stats.

paralleldd on ext4
------------------

This workload uses varying numbers of dd instances to read large amounts of
data from disk.

                               4.7.0-rc3             4.7.0-rc3
                          mmotm-20160623            nodelru-v9
Amean    Elapsd-1       186.04 (  0.00%)      189.41 ( -1.82%)
Amean    Elapsd-3       192.27 (  0.00%)      191.38 (  0.46%)
Amean    Elapsd-5       185.21 (  0.00%)      182.75 (  1.33%)
Amean    Elapsd-7       183.71 (  0.00%)      182.11 (  0.87%)
Amean    Elapsd-12      180.96 (  0.00%)      181.58 ( -0.35%)
Amean    Elapsd-16      181.36 (  0.00%)      183.72 ( -1.30%)

           4.7.0-rc4   4.7.0-rc4
        mmotm-20160623 nodelru-v9
User         1548.01     1552.44
System       8609.71     8515.08
Elapsed      3587.10     3594.54

There is little or no change in performance but some drop in system CPU usage.

                             4.7.0-rc3   4.7.0-rc3
                        mmotm-20160623  nodelru-v9
Minor Faults                    362662      367360
Major Faults                      1204        1143
Swap Ins                            22           0
Swap Outs                         2855        1029
DMA allocs                           0           0
DMA32 allocs                  31409797    28837521
Normal allocs                 46611853    49231282
Movable allocs                       0           0
Direct pages scanned                 0           0
Kswapd pages scanned          40845270    40869088
Kswapd pages reclaimed        40830976    40855294
Direct pages reclaimed               0           0
Kswapd efficiency                  99%         99%
Kswapd velocity              11386.711   11369.769
Direct efficiency                 100%        100%
Direct velocity                  0.000       0.000
Percentage direct scans             0%          0%
Page writes by reclaim            2855        1029
Page writes file                     0           0
Page writes anon                  2855        1029
Page reclaim immediate             771        1628
Sector Reads                 293312636   293536360
Sector Writes                 18213568    18186480
Page rescued immediate               0           0
Slabs scanned                   128257      132747
Direct inode steals                181          56
Kswapd inode steals                 59        1131

It basically shows that kswapd was active at roughly the same rate in
both kernels. There was also comparable slab scanning activity and direct
reclaim was avoided in both cases. There appears to be a large difference
in numbers of inodes reclaimed but the workload has few active inodes and
is likely a timing artifact.

stutter
-------

stutter simulates a simple workload. One part uses a lot of anonymous
memory, a second measures mmap latency and a third copies a large file.
The primary metric is checking for mmap latency.

stutter
                             4.7.0-rc4             4.7.0-rc4
                        mmotm-20160623            nodelru-v8
Min         mmap     16.6283 (  0.00%)     13.4258 ( 19.26%)
1st-qrtle   mmap     54.7570 (  0.00%)     34.9121 ( 36.24%)
2nd-qrtle   mmap     57.3163 (  0.00%)     46.1147 ( 19.54%)
3rd-qrtle   mmap     58.9976 (  0.00%)     47.1882 ( 20.02%)
Max-90%     mmap     59.7433 (  0.00%)     47.4453 ( 20.58%)
Max-93%     mmap     60.1298 (  0.00%)     47.6037 ( 20.83%)
Max-95%     mmap     73.4112 (  0.00%)     82.8719 (-12.89%)
Max-99%     mmap     92.8542 (  0.00%)     88.8870 (  4.27%)
Max         mmap   1440.6569 (  0.00%)    121.4201 ( 91.57%)
Mean        mmap     59.3493 (  0.00%)     42.2991 ( 28.73%)
Best99%Mean mmap     57.2121 (  0.00%)     41.8207 ( 26.90%)
Best95%Mean mmap     55.9113 (  0.00%)     39.9620 ( 28.53%)
Best90%Mean mmap     55.6199 (  0.00%)     39.3124 ( 29.32%)
Best50%Mean mmap     53.2183 (  0.00%)     33.1307 ( 37.75%)
Best10%Mean mmap     45.9842 (  0.00%)     20.4040 ( 55.63%)
Best5%Mean  mmap     43.2256 (  0.00%)     17.9654 ( 58.44%)
Best1%Mean  mmap     32.9388 (  0.00%)     16.6875 ( 49.34%)

This shows a number of improvements with the worst-case outlier greatly
improved.

Some of the vmstats are interesting

                             4.7.0-rc4   4.7.0-rc4
                          mmotm-20160623nodelru-v8
Swap Ins                           163         502
Swap Outs                            0           0
DMA allocs                           0           0
DMA32 allocs                 618719206  1381662383
Normal allocs                891235743   564138421
Movable allocs                       0           0
Allocation stalls                 2603           1
Direct pages scanned            216787           2
Kswapd pages scanned          50719775    41778378
Kswapd pages reclaimed        41541765    41777639
Direct pages reclaimed          209159           0
Kswapd efficiency                  81%         99%
Kswapd velocity              16859.554   14329.059
Direct efficiency                  96%          0%
Direct velocity                 72.061       0.001
Percentage direct scans             0%          0%
Page writes by reclaim         6215049           0
Page writes file               6215049           0
Page writes anon                     0           0
Page reclaim immediate           70673          90
Sector Reads                  81940800    81680456
Sector Writes                100158984    98816036
Page rescued immediate               0           0
Slabs scanned                  1366954       22683

While this is not guaranteed in all cases, this particular test showed
a large reduction in direct reclaim activity. It's also worth noting
that no page writes were issued from reclaim context.

This series is not without its hazards. There are at least three areas
that I'm concerned with even though I could not reproduce any problems in
that area.

1. Reclaim/compaction is going to be affected because the amount of reclaim is
   no longer targetted at a specific zone. Compaction works on a per-zone basis
   so there is no guarantee that reclaiming a few THP's worth page pages will
   have a positive impact on compaction success rates.

2. The Slab/LRU reclaim ratio is affected because the frequency the shrinkers
   are called is now different. This may or may not be a problem but if it
   is, it'll be because shrinkers are not called enough and some balancing
   is required.

3. The anon/file reclaim ratio may be affected. Pages about to be dirtied are
   distributed between zones and the fair zone allocation policy used to do
   something very similar for anon. The distribution is now different but not
   necessarily in any way that matters but it's still worth bearing in mind.

VM statistic counters for reclaim decisions are zone-based.  If the kernel
is to reclaim on a per-node basis then we need to track per-node
statistics but there is no infrastructure for that.  The most notable
change is that the old node_page_state is renamed to
sum_zone_node_page_state.  The new node_page_state takes a pglist_data and
uses per-node stats but none exist yet.  There is some renaming such as
vm_stat to vm_zone_stat and the addition of vm_node_stat and the renaming
of mod_state to mod_zone_state.  Otherwise, this is mostly a mechanical
patch with no functional change.  There is a lot of similarity between the
node and zone helpers which is unfortunate but there was no obvious way of
reusing the code and maintaining type safety.

Link: http://lkml.kernel.org/r/1467970510-21195-2-git-send-email-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Rik van Riel <riel@surriel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Johannes Weiner 55779ec759 mm: fix vm-scalability regression in cgroup-aware workingset code
Commit 23047a96d7 ("mm: workingset: per-cgroup cache thrash
detection") added a page->mem_cgroup lookup to the cache eviction,
refault, and activation paths, as well as locking to the activation
path, and the vm-scalability tests showed a regression of -23%.

While the test in question is an artificial worst-case scenario that
doesn't occur in real workloads - reading two sparse files in parallel
at full CPU speed just to hammer the LRU paths - there is still some
optimizations that can be done in those paths.

Inline the lookup functions to eliminate calls.  Also, page->mem_cgroup
doesn't need to be stabilized when counting an activation; we merely
need to hold the RCU lock to prevent the memcg from being freed.

This cuts down on overhead quite a bit:

23047a96d7 063f6715e77a7be5770d6081fe
---------------- --------------------------
         %stddev     %change         %stddev
             \          |                \
  21621405 +- 0%     +11.3%   24069657 +- 2%  vm-scalability.throughput

[linux@roeck-us.net: drop unnecessary include file]
[hannes@cmpxchg.org: add WARN_ON_ONCE()s]
  Link: http://lkml.kernel.org/r/20160707194024.GA26580@cmpxchg.org
Link: http://lkml.kernel.org/r/20160624175101.GA3024@cmpxchg.org
Reported-by: Ye Xiaolong <xiaolong.ye@intel.com>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Vladimir Davydov <vdavydov@virtuozzo.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Michal Hocko 11a410d516 mm, oom_reaper: do not attempt to reap a task more than twice
oom_reaper relies on the mmap_sem for read to do its job.  Many places
which might block readers have been converted to use down_write_killable
and that has reduced chances of the contention a lot.  Some paths where
the mmap_sem is held for write can take other locks and they might
either be not prepared to fail due to fatal signal pending or too
impractical to be changed.

This patch introduces MMF_OOM_NOT_REAPABLE flag which gets set after the
first attempt to reap a task's mm fails.  If the flag is present after
the failure then we set MMF_OOM_REAPED to hide this mm from the oom
killer completely so it can go and chose another victim.

As a result a risk of OOM deadlock when the oom victim would be blocked
indefinetly and so the oom killer cannot make any progress should be
mitigated considerably while we still try really hard to perform all
reclaim attempts and stay predictable in the behavior.

Link: http://lkml.kernel.org/r/1466426628-15074-10-git-send-email-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Vladimir Davydov <vdavydov@virtuozzo.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Michal Hocko 1af8bb4326 mm, oom: fortify task_will_free_mem()
task_will_free_mem is rather weak.  It doesn't really tell whether the
task has chance to drop its mm.  98748bd722 ("oom: consider
multi-threaded tasks in task_will_free_mem") made a first step into making
it more robust for multi-threaded applications so now we know that the
whole process is going down and probably drop the mm.

This patch builds on top for more complex scenarios where mm is shared
between different processes - CLONE_VM without CLONE_SIGHAND, or in kernel
use_mm().

Make sure that all processes sharing the mm are killed or exiting.  This
will allow us to replace try_oom_reaper by wake_oom_reaper because
task_will_free_mem implies the task is reapable now.  Therefore all paths
which bypass the oom killer are now reapable and so they shouldn't lock up
the oom killer.

Link: http://lkml.kernel.org/r/1466426628-15074-8-git-send-email-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Vladimir Davydov <vdavydov@virtuozzo.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Michal Hocko b18dc5f291 mm, oom: skip vforked tasks from being selected
vforked tasks are not really sitting on any memory.  They are sharing the
mm with parent until they exec into a new code.  Until then it is just
pinning the address space.  OOM killer will kill the vforked task along
with its parent but we still can end up selecting vforked task when the
parent wouldn't be selected.  E.g.  init doing vfork to launch a task or
vforked being a child of oom unkillable task with an updated oom_score_adj
to be killable.

Add a new helper to check whether a task is in the vfork sharing memory
with its parent and use it in oom_badness to skip over these tasks.

Link: http://lkml.kernel.org/r/1466426628-15074-6-git-send-email-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Vladimir Davydov <vdavydov@virtuozzo.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Michal Hocko 44a70adec9 mm, oom_adj: make sure processes sharing mm have same view of oom_score_adj
oom_score_adj is shared for the thread groups (via struct signal) but this
is not sufficient to cover processes sharing mm (CLONE_VM without
CLONE_SIGHAND) and so we can easily end up in a situation when some
processes update their oom_score_adj and confuse the oom killer.  In the
worst case some of those processes might hide from the oom killer
altogether via OOM_SCORE_ADJ_MIN while others are eligible.  OOM killer
would then pick up those eligible but won't be allowed to kill others
sharing the same mm so the mm wouldn't release the mm and so the memory.

It would be ideal to have the oom_score_adj per mm_struct because that is
the natural entity OOM killer considers.  But this will not work because
some programs are doing

	vfork()
	set_oom_adj()
	exec()

We can achieve the same though.  oom_score_adj write handler can set the
oom_score_adj for all processes sharing the same mm if the task is not in
the middle of vfork.  As a result all the processes will share the same
oom_score_adj.  The current implementation is rather pessimistic and
checks all the existing processes by default if there is more than 1
holder of the mm but we do not have any reliable way to check for external
users yet.

Link: http://lkml.kernel.org/r/1466426628-15074-5-git-send-email-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Vladimir Davydov <vdavydov@virtuozzo.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Linus Torvalds 76d5b28bba Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull quota update from Jan Kara:
 "time64 support for quota"

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  quota: use time64_t internally
2016-07-28 13:53:23 -07:00
Linus Torvalds 6784725ab0 Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs updates from Al Viro:
 "Assorted cleanups and fixes.

  Probably the most interesting part long-term is ->d_init() - that will
  have a bunch of followups in (at least) ceph and lustre, but we'll
  need to sort the barrier-related rules before it can get used for
  really non-trivial stuff.

  Another fun thing is the merge of ->d_iput() callers (dentry_iput()
  and dentry_unlink_inode()) and a bunch of ->d_compare() ones (all
  except the one in __d_lookup_lru())"

* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (26 commits)
  fs/dcache.c: avoid soft-lockup in dput()
  vfs: new d_init method
  vfs: Update lookup_dcache() comment
  bdev: get rid of ->bd_inodes
  Remove last traces of ->sync_page
  new helper: d_same_name()
  dentry_cmp(): use lockless_dereference() instead of smp_read_barrier_depends()
  vfs: clean up documentation
  vfs: document ->d_real()
  vfs: merge .d_select_inode() into .d_real()
  unify dentry_iput() and dentry_unlink_inode()
  binfmt_misc: ->s_root is not going anywhere
  drop redundant ->owner initializations
  ufs: get rid of redundant checks
  orangefs: constify inode_operations
  missed comment updates from ->direct_IO() prototype change
  file_inode(f)->i_mapping is f->f_mapping
  trim fsnotify hooks a bit
  9p: new helper - v9fs_parent_fid()
  debugfs: ->d_parent is never NULL or negative
  ...
2016-07-28 12:59:05 -07:00
Linus Torvalds 554828ee0d Merge branch 'salted-string-hash'
This changes the vfs dentry hashing to mix in the parent pointer at the
_beginning_ of the hash, rather than at the end.

That actually improves both the hash and the code generation, because we
can move more of the computation to the "static" part of the dcache
setup, and do less at lookup runtime.

It turns out that a lot of other hash users also really wanted to mix in
a base pointer as a 'salt' for the hash, and so the slightly extended
interface ends up working well for other cases too.

Users that want a string hash that is purely about the string pass in a
'salt' pointer of NULL.

* merge branch 'salted-string-hash':
  fs/dcache.c: Save one 32-bit multiply in dcache lookup
  vfs: make the string hashes salt the hash
2016-07-28 12:26:31 -07:00
Richard Cochran 4fae16dffb timers/core: Correct callback order during CPU hot plug
On the tear-down path, the dead CPU callback for the timers was
misplaced within the 'cpuhp_state' enumeration.  There is a hidden
dependency between the timers and block multiqueue.  The timers
callback must happen before the block multiqueue callback otherwise a
RCU stall occurs.

Move the timers callback to the proper place in the state machine.

Reported-and-tested-by: Jon Hunter <jonathanh@nvidia.com>
Reported-by: kbuild test robot <lkp@intel.com>
Fixes: 24f73b9971 ("timers/core: Convert to hotplug state machine")
Signed-off-by: Richard Cochran <rcochran@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: John Stultz <john.stultz@linaro.org>
Cc: rt@linutronix.de
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1469610498-25914-1-git-send-email-rcochran@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-28 18:56:22 +02:00
Arnd Bergmann a0f2b65275 ceph: fix symbol versioning for ceph_monc_do_statfs
The genksyms helper in the kernel cannot parse a type definition
like "typeof(((type *)0)->keyfld)" that is used in the DEFINE_RB_FUNCS
helper, causing the following EXPORT_SYMBOL() statement to be ignored
when computing the crcs, and triggering a warning about this:

WARNING: "ceph_monc_do_statfs" [fs/ceph/ceph.ko] has no CRC

To work around the problem, we can rewrite the type to reference
an undefined 'extern' symbol instead of a NULL pointer. This is
evidently ok for genksyms, and it no longer complains about the
line when calling it with 'genksyms -w'.

I've looked briefly into extending genksyms instead, but it seems
really hard to do. Jan Beulich introduced basic support for 'typeof'
a while ago in dc53324060 ("genksyms: fix typeof() handling"),
but that is not sufficient for the expression we have here.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: fcd00b68bb ("libceph: DEFINE_RB_FUNCS macro")
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Michal Marek <mmarek@suse.cz>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-07-28 15:32:53 +02:00
Rob Rice a24532f8d1 mailbox: Add Broadcom PDC mailbox driver
The Broadcom PDC mailbox driver is a mailbox controller that
manages data transfers to and from one or more offload engines.

Signed-off-by: Rob Rice <rob.rice@broadcom.com>
Reviewed-by: Scott Branden <scott.branden@broadcom.com>
Reviewed-by: Ray Jui <ray.jui@broadcom.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
2016-07-28 09:34:47 +05:30
Yan, Zheng 0cabbd94ff libceph: fsmap.user subscription support
Signed-off-by: Yan, Zheng <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-07-28 03:00:40 +02:00
Yan, Zheng 774a6a118c ceph: reduce i_nr_by_mode array size
Track usage count for individual fmode bit. This can reduce the
array size by half.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-07-28 02:55:39 +02:00
Yan, Zheng 30c156d995 libceph: rados pool namespace support
Add pool namesapce pointer to struct ceph_file_layout and struct
ceph_object_locator. Pool namespace is used by when mapping object
to PG, it's also used when composing OSD request.

The namespace pointer in struct ceph_file_layout is RCU protected.
So libceph can read namespace without taking lock.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
[idryomov@gmail.com: ceph_oloc_destroy(), misc minor changes]
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-07-28 02:55:37 +02:00
Yan, Zheng 51e9273796 libceph: introduce reference counted string
The data structure is for storing namesapce string. It allows namespace
string to be shared between cephfs inodes with same layout. This data
structure can also be referenced by OSD request.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-07-28 02:55:37 +02:00
Yan, Zheng 7627151ea3 libceph: define new ceph_file_layout structure
Define new ceph_file_layout structure and rename old ceph_file_layout
to ceph_file_layout_legacy. This is preparation for adding namespace
to ceph_file_layout structure.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-07-28 02:55:36 +02:00
Ilya Dryomov 22748f9d61 libceph: add start en/decoding block helpers
Add ceph_start_encoding() and ceph_start_decoding(), the equivalent of
ENCODE_START and DECODE_START in the userspace ceph code.

This is based on a patch from Mike Christie <michaelc@cs.wisc.edu>.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-07-28 02:55:36 +02:00
Ilya Dryomov 281dbe5db8 libceph: add an ONSTACK initializer for oids
An on-stack oid in ceph_ioctl_get_dataloc() is not initialized,
resulting in a WARN and a NULL pointer dereference later on.  We will
have more of these on-stack in the future, so fix it with a convenience
macro.

Fixes: d30291b985 ("libceph: variable-sized ceph_object_id")
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-07-28 02:55:35 +02:00
Ilya Dryomov b2aa5d0bc8 libceph: fix some missing includes
- decode.h needs slab.h for kmalloc()
- osd_client.h needs msgpool.h for struct ceph_msgpool
- msgpool.h doesn't need messenger.h

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-07-28 02:55:35 +02:00
Linus Torvalds 8448cefe49 HSI changes for the v4.8 series
* proper runtime pm support for omap-ssi and ssi-protocol
 * misc. fixes
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABCgAGBQJXlnMYAAoJENju1/PIO/qaLnwP/3BTdRGsAS/8UtwWG5ZJ3Xrz
 AdYCKfgrOPoNj97ba95Dd3aGbrQGkTOmx6UTkYM1SO9KNN/VwoeQuV5DnWRZlF9k
 vgH5lUh3L2AHbpOywEuQF/0MAiZXGguqPTfI3hx66k6t7vMYxq8VuahtZ69+ICPI
 TSG+JXYVMpdjcIMIQrmH4CC2R1G7IsKNnQpaPDCAyIfE0WsiOkCUIlB4OuLlHemT
 TK5BggWQYlnKN/bbIC/dx6ME5ZWPv9pXLUNzwUqH8C8G9wdxeXj2PTEt/ZZrw8dT
 tskFYXno7tUuU4btR5OkgrpZLHybXzMIQS2ZSFKE8oUvu0hymQLgGGuucWg1oTVG
 1kkat8Hyz49xv+YriiMbaj0XKO0WWLBcjOduexPho1RUKV95MsnMz6e55eXq2ptn
 ucCb71lap7Q9I4MJjntvLXaIfFlyQR+6FEyo4RVwvCy44HSsPEhHeXxHN6hIeQDB
 BFTOADy7RCT9B6KSuSItW5H1oT1jESY26TYp+NvWC+p6kz3mexxlO1gVDoln3NRB
 opktOsusml7MNr+3XBNap3CsHPKm2X8azpkWsikN9QbRrjeH/AxWhjWrVruw8Yhu
 6qe2u46vh4yI0YexE8CoYqj71Em/V2I6EkCq7cuoLQsBZew9i1kT/8qO69kP9BFV
 djK5b44/mAZJmC0d8t3/
 =WPyp
 -----END PGP SIGNATURE-----

Merge tag 'hsi-for-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-hsi

Pull HSI updates from Sebastian Reichel:

 - proper runtime pm support for omap-ssi and ssi-protocol

 - misc fixes

* tag 'hsi-for-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-hsi: (24 commits)
  HSI: omap_ssi: drop pm_runtime_irq_safe
  HSI: omap_ssi_port: use rpm autosuspend API
  HSI: omap_ssi: call msg->complete() from process context
  HSI: omap_ssi_port: ensure clocks are kept enabled during transfer
  HSI: omap_ssi_port: replace pm_runtime_put_sync with non-sync variant
  HSI: omap_ssi_port: avoid calling runtime_pm_*_sync inside spinlock
  HSI: omap_ssi_port: avoid pm_runtime_get_sync in ssi_start_dma and ssi_start_pio
  HSI: omap_ssi_port: switch to threaded pio irq
  HSI: omap_ssi_core: remove pm_runtime_get_sync call from tasklet
  HSI: omap_ssi_core: use pm_runtime_put instead of pm_runtime_put_sync
  HSI: omap_ssi_port: prepare start_tx/stop_tx for blocking pm_runtime calls
  HSI: core: switch port event notifier from atomic to blocking
  HSI: omap_ssi_port: replace wkin_cken with atomic bitmap operations
  HSI: omap_ssi: convert cawake irq handler to thread
  HSI: ssi_protocol: fix ssip_xmit invocation
  HSI: ssi_protocol: replace spin_lock with spin_lock_bh
  HSI: ssi_protocol: avoid ssi_waketest call with held spinlock
  HSI: omap_ssi: do not reset module
  HSI: omap_ssi_port: remove useless newline
  hsi: Only descend into hsi directory when CONFIG_HSI is set
  ...
2016-07-27 15:18:53 -07:00
Linus Torvalds d85486d471 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input updates from Dmitry Torokhov:
 "Updates for the input subsystem.  This contains the following new
  drivers promised in the last merge window:

   - driver for touchscreen controller found in Surface 3
   - driver for Pegasus Notetaker tablet
   - driver for Atmel Captouch Buttons
   - driver for Raydium I2C touchscreen controllers
   - powerkey driver for HISI 65xx SoC

  plus a few fixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (40 commits)
  Input: tty/vt/keyboard - use memdup_user()
  Input: pegasus_notetaker - set device mode in reset_resume() if in use
  Input: pegasus_notetaker - cancel workqueue's work in suspend()
  Input: pegasus_notetaker - fix usb_autopm calls to be balanced
  Input: pegasus_notetaker - handle usb control msg errors
  Input: wacom_w8001 - handle errors from input_mt_init_slots()
  Input: wacom_w8001 - resolution wasn't set for ABS_MT_POSITION_X/Y
  Input: pixcir_ts - add support for axis inversion / swapping
  Input: icn8318 - use of_touchscreen helpers for inverting / swapping axes
  Input: edt-ft5x06 - add support for inverting / swapping axes
  Input: of_touchscreen - add support for inverted / swapped axes
  Input: synaptics-rmi4 - use the RMI_F11_REL_BYTES define in rmi_f11_rel_pos_report
  Input: synaptics-rmi4 - remove unneeded variable
  Input: synaptics-rmi4 - remove pointer to rmi_function in f12_data
  Input: synaptics-rmi4 - support regulator supplies
  Input: raydium_i2c_ts - check CRC of incoming packets
  Input: xen-kbdfront - prefer xenbus_write() over xenbus_printf() where possible
  Input: fix a double word "is is" in include/linux/input.h
  Input: add powerkey driver for HISI 65xx SoC
  Input: apanel - spelling mistake - "skiping" -> "skipping"
  ...
2016-07-27 14:30:41 -07:00
Dmitry Torokhov 4097461897 Input: i8042 - break load dependency between atkbd/psmouse and i8042
As explained in 1407814240-4275-1-git-send-email-decui@microsoft.com we
have a hard load dependency between i8042 and atkbd which prevents
keyboard from working on Gen2 Hyper-V VMs.

> hyperv_keyboard invokes serio_interrupt(), which needs a valid serio
> driver like atkbd.c.  atkbd.c depends on libps2.c because it invokes
> ps2_command().  libps2.c depends on i8042.c because it invokes
> i8042_check_port_owner().  As a result, hyperv_keyboard actually
> depends on i8042.c.
>
> For a Generation 2 Hyper-V VM (meaning no i8042 device emulated), if a
> Linux VM (like Arch Linux) happens to configure CONFIG_SERIO_I8042=m
> rather than =y, atkbd.ko can't load because i8042.ko can't load(due to
> no i8042 device emulated) and finally hyperv_keyboard can't work and
> the user can't input: https://bugs.archlinux.org/task/39820
> (Ubuntu/RHEL/SUSE aren't affected since they use CONFIG_SERIO_I8042=y)

To break the dependency we move away from using i8042_check_port_owner()
and instead allow serio port owner specify a mutex that clients should use
to serialize PS/2 command stream.

Reported-by: Mark Laws <mdl@60hz.org>
Tested-by: Mark Laws <mdl@60hz.org>
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2016-07-27 14:20:09 -07:00
Linus Torvalds 66304207cd Merge branch 'i2c/for-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c updates from Wolfram Sang:
 "Here is the I2C pull request for 4.8:

   - the core and i801 driver gained support for SMBus Host Notify

   - core support for more than one address in DT

   - i2c_add_adapter() has now better error messages.  We can remove all
     error messages from drivers calling it as a next step.

   - bigger updates to rk3x driver to support rk3399 SoC

   - the at24 eeprom driver got refactored and can now read special
     variants with unique serials or fixed MAC addresses.

  The rest is regular driver updates and bugfixes"

* 'i2c/for-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (66 commits)
  i2c: i801: use IS_ENABLED() instead of checking for built-in or module
  Documentation: i2c: slave: give proper example for pm usage
  Documentation: i2c: slave: describe buffer problems a bit better
  i2c: bcm2835: Don't complain on -EPROBE_DEFER from getting our clock
  i2c: i2c-smbus: drop useless stubs
  i2c: efm32: fix a failure path in efm32_i2c_probe()
  Revert "i2c: core: Cleanup I2C ACPI namespace"
  Revert "i2c: core: Add function for finding the bus speed from ACPI"
  i2c: Update the description of I2C_SMBUS
  i2c: i2c-smbus: fix i2c_handle_smbus_host_notify documentation
  eeprom: at24: tweak the loop_until_timeout() macro
  eeprom: at24: add support for at24mac series
  eeprom: at24: support reading the serial number for 24csxx
  eeprom: at24: platform_data: use BIT() macro
  eeprom: at24: split at24_eeprom_write() into specialized functions
  eeprom: at24: split at24_eeprom_read() into specialized functions
  eeprom: at24: hide the read/write loop behind a macro
  eeprom: at24: call read/write functions via function pointers
  eeprom: at24: coding style fixes
  eeprom: at24: move at24_read() below at24_eeprom_write()
  ...
2016-07-27 14:19:25 -07:00
Linus Torvalds 7ae0ae4a02 spi: Updates for v4.8
Quite a lot of cleanup and maintainence work going on this release in
 various drivers, and also a fix for a nasty locking issue in the core:
 
  - A fix for locking issues when external drivers explicitly locked the
    bus with spi_bus_lock() - we were using the same lock to both control
    access to the physical bus in multi-threaded I/O operations and
    exclude multiple callers.  Confusion between these two caused us to
    have scenarios where we were dropping locks.  These are fixed by
    splitting into two separate locks like should have been done
    originally, making everything much clearer and correct.
  - Support for DMA in spi_flash_read().
  - Support for instantiating spidev on ACPI systems, including some test
    devices used in Windows validation.
  - Use of the core DMA mapping functionality in the McSPI driver.
  - Start of support for ThunderX SPI controllers, involving a very big
    set of changes to the Cavium driver.
  - Support for Braswell, Exynos 5433, Kaby Lake, Merrifield, RK3036,
    RK3228, RK3368 controllers.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJXmPFBAAoJECTWi3JdVIfQbisH/355nT/cyqc08l9iC+a1zRDw
 /Bf5kN8pqmu6+y3sMjAIdptZQTlXhgR4q1ZH+oNSfowCVgvJYWF6RVCEXDBh6XHs
 YBQAFlYeSOO5cLTPQSDnn06oFucV/HZJppC6hM0SNclbVboeMBBS6S6aljXqMbj+
 mFvtq6/iEsG6kgQcmcl3fm/SMOYF2OFDJyr66NimBXQGzjx84xJcG0eGk8kCIwEw
 xyiE/WmB9WT2scFSgAsfaOEE27ozaq9iANNUA/ceUibQgQYpQveBgy4XVXFjEzFo
 3BVvPYGGzjebzaXbMwDV6OvSgMwnTsMxjtZGsraxIEOcMdeeMEpn1/Ze4ksWi4c=
 =R4Zx
 -----END PGP SIGNATURE-----

Merge tag 'spi-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi

Pull spi updates from Mark Brown:
 "Quite a lot of cleanup and maintainence work going on this release in
  various drivers, and also a fix for a nasty locking issue in the core:

   - A fix for locking issues when external drivers explicitly locked
     the bus with spi_bus_lock() - we were using the same lock to both
     control access to the physical bus in multi-threaded I/O operations
     and exclude multiple callers.

     Confusion between these two caused us to have scenarios where we
     were dropping locks.  These are fixed by splitting into two
     separate locks like should have been done originally, making
     everything much clearer and correct.

   - Support for DMA in spi_flash_read().

   - Support for instantiating spidev on ACPI systems, including some
     test devices used in Windows validation.

   - Use of the core DMA mapping functionality in the McSPI driver.

   - Start of support for ThunderX SPI controllers, involving a very big
     set of changes to the Cavium driver.

   - Support for Braswell, Exynos 5433, Kaby Lake, Merrifield, RK3036,
     RK3228, RK3368 controllers"

* tag 'spi-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: (64 commits)
  spi: Split bus and I/O locking
  spi: octeon: Split driver into Octeon specific and common parts
  spi: octeon: Move include file from arch/mips to drivers/spi
  spi: octeon: Put register offsets into a struct
  spi: octeon: Store system clock freqency in struct octeon_spi
  spi: octeon: Convert driver to use readq()/writeq() functions
  spi: pic32-sqi: fixup wait_for_completion_timeout return handling
  spi: pic32: fixup wait_for_completion_timeout return handling
  spi: rockchip: limit transfers to (64K - 1) bytes
  spi: xilinx: Return IRQ_NONE if no interrupts were detected
  spi: xilinx: Handle errors from platform_get_irq()
  spi: s3c64xx: restore removed comments
  spi: s3c64xx: add Exynos5433 compatible for ioclk handling
  spi: s3c64xx: use error code from clk_prepare_enable()
  spi: s3c64xx: rename goto labels to meaningful names
  spi: s3c64xx: document the clocks and the clock-name property
  spi: s3c64xx: add exynos5433 spi compatible
  spi: s3c64xx: fix reference leak to master in s3c64xx_spi_remove()
  spi: spi-sh: Remove deprecated create_singlethread_workqueue
  spi: spi-topcliff-pch: Remove deprecated create_singlethread_workqueue
  ...
2016-07-27 14:11:43 -07:00
Linus Torvalds 607e11ab66 New LED class driver:
- LED driver for TI LP3952 6-Channel Color LED
 
 LED core improvements:
 - Only descend into leds directory when CONFIG_NEW_LEDS is set
 - Add no-op gpio_led_register_device when LED subsystem is disabled
 - MAINTAINERS: Add file patterns for led device tree bindings
 
 LED Trigger core improvements:
 - return error if invalid trigger name is provided via sysfs
 
 LED class drivers improvements
 - is31fl32xx: define complete i2c_device_id table
 - is31fl32xx: fix typo in id and match table names
 - leds-gpio: Set of_node for created LED devices
 - pca9532: Add device tree support
 
 Conversion of IDE trigger to common disk trigger:
 - leds: convert IDE trigger to common disk trigger
 - leds: documentation: 'ide-disk' to 'disk-activity'
 - unicore32: use the new LED disk activity trigger
 - parisc: use the new LED disk activity trigger
 - mips: use the new LED disk activity trigger
 - arm: use the new LED disk activity trigger
 - powerpc: use the new LED disk activity trigger
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.17 (GNU/Linux)
 
 iQIcBAABAgAGBQJXlbvcAAoJEL1qUBy3i3wm1NoP/1ujmu9eljBO/0VyyrZNSWvM
 F367twUAmfyrC+xos/lyxn8u8yknyLNfbUStU0L/cpz2GQRNwuIS4h+xrkyqsDCj
 JBCKrZQCqi/Hsj6jBuIUW+s0OFjxhBJpEkeYW7FM6i2+O9EUxZvCuuG2WPQBNayx
 73qG4RdHmB+qFWMlIp//iF8OviSSm2sd5+UEAJH8mYwrbCcv2gT7UjIaIRczpo4H
 JTkc/onR6y/qliUUrbLpMr40bJLA4+p9ZkqEGPS4lsKiUKf5SF7bLg59RNebIF75
 FZ52yhAw55KLiF0yokthlay8cBUjsd52lROjANkIe1iEHBiKoJRKLOA9WozY37qu
 3/9fKcddiPHxVF3ItcwhrldVru2ZxfWFRz/44rCPDTjKEfsuNN/2EkY8aVYCiHpO
 8gSXw/HjLPCSQIp6PVcqGSVZd1VWUYKcJY/xvCa3zQi0r7JdAmYUp76KCkmq9FgB
 DuQTkxM57ndKiz5WqlGU70mzht43/raqzPOQwLMwz5mGKj/nXvNMKFLQ1xhnyfeY
 vw7D2f/wqm74cTMhMS40IDlpJ81EoCYpsnzvtIHti3FPeo8wmVNz30H51JvFRdJm
 hzeju1tdSLwazGBIIAtjLSCLpnkaOq7rMwwGndGO5QCO/+A6gAC3iAKe4YFl+AXV
 CrS9BAdZSslYg+ddJYg6
 =8lmX
 -----END PGP SIGNATURE-----

Merge tag 'leds_for_4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds

Pull LED updates from Jacek Anaszewski:
 "New LED class driver:
   - LED driver for TI LP3952 6-Channel Color LED

  LED core improvements:
   - Only descend into leds directory when CONFIG_NEW_LEDS is set
   - Add no-op gpio_led_register_device when LED subsystem is disabled
   - MAINTAINERS: Add file patterns for led device tree bindings

  LED Trigger core improvements:
   - return error if invalid trigger name is provided via sysfs

  LED class drivers improvements
   - is31fl32xx: define complete i2c_device_id table
   - is31fl32xx: fix typo in id and match table names
   - leds-gpio: Set of_node for created LED devices
   - pca9532: Add device tree support

  Conversion of IDE trigger to common disk trigger:
   - leds: convert IDE trigger to common disk trigger
   - leds: documentation: 'ide-disk' to 'disk-activity'
   - unicore32: use the new LED disk activity trigger
   - parisc: use the new LED disk activity trigger
   - mips: use the new LED disk activity trigger
   - arm: use the new LED disk activity trigger
   - powerpc: use the new LED disk activity trigger"

* tag 'leds_for_4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds:
  leds: is31fl32xx: define complete i2c_device_id table
  leds: is31fl32xx: fix typo in id and match table names
  leds: LED driver for TI LP3952 6-Channel Color LED
  leds: leds-gpio: Set of_node for created LED devices
  leds: triggers: return error if invalid trigger name is provided via sysfs
  leds: Only descend into leds directory when CONFIG_NEW_LEDS is set
  leds: Add no-op gpio_led_register_device when LED subsystem is disabled
  unicore32: use the new LED disk activity trigger
  parisc: use the new LED disk activity trigger
  mips: use the new LED disk activity trigger
  arm: use the new LED disk activity trigger
  powerpc: use the new LED disk activity trigger
  leds: documentation: 'ide-disk' to 'disk-activity'
  leds: convert IDE trigger to common disk trigger
  leds: pca9532: Add device tree support
  MAINTAINERS: Add file patterns for led device tree bindings
2016-07-27 14:03:52 -07:00
Linus Torvalds 78d51aee04 Remove some old cruft that was disabled by default a long time ago.
No modern hardware should need this, and anybody who really
 doesn't have something to automatically detect IPMI can add the
 device by hand on the module commandline or hot add it.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iEYEABECAAYFAleY0zgACgkQIXnXXONXEReeewCgiZa5nEIWzmmroaWcmDK1VTs2
 wSQAn2KYA6ieZiDZso9e9xwhJWvRsQDl
 =CakW
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-4.8' of git://git.code.sf.net/p/openipmi/linux-ipmi

Pull IPMI updates from Corey Minyard:
 "Remove some old cruft that was disabled by default a long time ago.

  No modern hardware should need this, and anybody who really doesn't
  have something to automatically detect IPMI can add the device by hand
  on the module commandline or hot add it"

* tag 'for-linus-4.8' of git://git.code.sf.net/p/openipmi/linux-ipmi:
  ipmi: remove trydefaults parameter and default init
2016-07-27 13:46:07 -07:00
Linus Torvalds 468fc7ed55 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:

 1) Unified UDP encapsulation offload methods for drivers, from
    Alexander Duyck.

 2) Make DSA binding more sane, from Andrew Lunn.

 3) Support QCA9888 chips in ath10k, from Anilkumar Kolli.

 4) Several workqueue usage cleanups, from Bhaktipriya Shridhar.

 5) Add XDP (eXpress Data Path), essentially running BPF programs on RX
    packets as soon as the device sees them, with the option to mirror
    the packet on TX via the same interface.  From Brenden Blanco and
    others.

 6) Allow qdisc/class stats dumps to run lockless, from Eric Dumazet.

 7) Add VLAN support to b53 and bcm_sf2, from Florian Fainelli.

 8) Simplify netlink conntrack entry layout, from Florian Westphal.

 9) Add ipv4 forwarding support to mlxsw spectrum driver, from Ido
    Schimmel, Yotam Gigi, and Jiri Pirko.

10) Add SKB array infrastructure and convert tun and macvtap over to it.
    From Michael S Tsirkin and Jason Wang.

11) Support qdisc packet injection in pktgen, from John Fastabend.

12) Add neighbour monitoring framework to TIPC, from Jon Paul Maloy.

13) Add NV congestion control support to TCP, from Lawrence Brakmo.

14) Add GSO support to SCTP, from Marcelo Ricardo Leitner.

15) Allow GRO and RPS to function on macsec devices, from Paolo Abeni.

16) Support MPLS over IPV4, from Simon Horman.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1622 commits)
  xgene: Fix build warning with ACPI disabled.
  be2net: perform temperature query in adapter regardless of its interface state
  l2tp: Correctly return -EBADF from pppol2tp_getname.
  net/mlx5_core/health: Remove deprecated create_singlethread_workqueue
  net: ipmr/ip6mr: update lastuse on entry change
  macsec: ensure rx_sa is set when validation is disabled
  tipc: dump monitor attributes
  tipc: add a function to get the bearer name
  tipc: get monitor threshold for the cluster
  tipc: make cluster size threshold for monitoring configurable
  tipc: introduce constants for tipc address validation
  net: neigh: disallow transition to NUD_STALE if lladdr is unchanged in neigh_update()
  MAINTAINERS: xgene: Add driver and documentation path
  Documentation: dtb: xgene: Add MDIO node
  dtb: xgene: Add MDIO node
  drivers: net: xgene: ethtool: Use phy_ethtool_gset and sset
  drivers: net: xgene: Use exported functions
  drivers: net: xgene: Enable MDIO driver
  drivers: net: xgene: Add backward compatibility
  drivers: net: phy: xgene: Add MDIO driver
  ...
2016-07-27 12:03:20 -07:00
Linus Torvalds 08fd8c1768 xen: features and fixes for 4.8-rc0
- ACPI support for guests on ARM platforms.
 - Generic steal time support for arm and x86.
 - Support cases where kernel cpu is not Xen VCPU number (e.g., if
   in-guest kexec is used).
 - Use the system workqueue instead of a custom workqueue in various
   places.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJXmLlrAAoJEFxbo/MsZsTRvRQH/1wOMF8BmlbZfR7H3qwDfjst
 ApNifCiZE08xDtWBlwUaBFAQxyflQS9BBiNZDVK0sysIdXeOdpWV7V0ZjRoLL+xr
 czsaaGXDcmXxJxApoMDVuT7FeP6rEk6LVAYRoHpVjJjMZGW3BbX1vZaMW4DXl2WM
 9YNaF2Lj+rpc1f8iG31nUxwkpmcXFog6ct4tu7HiyCFT3hDkHt/a4ghuBdQItCkd
 vqBa1pTpcGtQBhSmWzlylN/PV2+NKcRd+kGiwd09/O/rNzogTMCTTWeHKAtMpPYb
 Cu6oSqJtlK5o0vtr0qyLSWEGIoyjE2gE92s0wN3iCzFY1PldqdsxUO622nIj+6o=
 =G6q3
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-4.8-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen updates from David Vrabel:
 "Features and fixes for 4.8-rc0:

   - ACPI support for guests on ARM platforms.
   - Generic steal time support for arm and x86.
   - Support cases where kernel cpu is not Xen VCPU number (e.g., if
     in-guest kexec is used).
   - Use the system workqueue instead of a custom workqueue in various
     places"

* tag 'for-linus-4.8-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (47 commits)
  xen: add static initialization of steal_clock op to xen_time_ops
  xen/pvhvm: run xen_vcpu_setup() for the boot CPU
  xen/evtchn: use xen_vcpu_id mapping
  xen/events: fifo: use xen_vcpu_id mapping
  xen/events: use xen_vcpu_id mapping in events_base
  x86/xen: use xen_vcpu_id mapping when pointing vcpu_info to shared_info
  x86/xen: use xen_vcpu_id mapping for HYPERVISOR_vcpu_op
  xen: introduce xen_vcpu_id mapping
  x86/acpi: store ACPI ids from MADT for future usage
  x86/xen: update cpuid.h from Xen-4.7
  xen/evtchn: add IOCTL_EVTCHN_RESTRICT
  xen-blkback: really don't leak mode property
  xen-blkback: constify instance of "struct attribute_group"
  xen-blkfront: prefer xenbus_scanf() over xenbus_gather()
  xen-blkback: prefer xenbus_scanf() over xenbus_gather()
  xen: support runqueue steal time on xen
  arm/xen: add support for vm_assist hypercall
  xen: update xen headers
  xen-pciback: drop superfluous variables
  xen-pciback: short-circuit read path used for merging write values
  ...
2016-07-27 11:35:37 -07:00
Linus Torvalds 0e6acf0204 xfs: update for 4.8-rc1
Changes in this update:
 o generic iomap based IO path infrastructure
 o generic iomap based fiemap implementation
 o xfs iomap based Io path implementation
 o buffer error handling fixes
 o tracking of in flight buffer IO for unmount serialisation
 o direct IO and DAX io path separation and simplification
 o shortform directory format definition changes for wider platform compatibility
 o various buffer cache fixes
 o cleanups in preparation for rmap merge
 o error injection cleanups and fixes
 o log item format buffer memory allocation restructuring to prevent rare OOM
   reclaim deadlocks
 o sparse inode chunks are now fully supported.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXmA5XAAoJEK3oKUf0dfodCc0QAKY5Jlfw5HwLria+Ad87HCcM
 Zi/LGMMC3CPh+vkbqsmDnLKHYjXRwi3HamBoXdufiE8E3UtOjp/sV98/fCw+zwhe
 tHDLmdAx23RLTn7gUhcsIXydKeXh0+HlRxPa4eBAlmnsJ3nGgrKrKQLgDT7Gjlum
 nPfRSTYjzm5gs2dpUTYhMV7MplenDW9GFz2uBMct6N9kYQ9m225I99fd/4nb/L7R
 o/8UocsK7iREUXP6decDoN9uIAzE2mYR720EL+Txy09CTYy+luNyGoNXOsQtxT5O
 plyoPZbzIIDvC44bvp6bZX96Udm7tAeTloieInCZG13I2zJy9gmTmLqkZ3M2at12
 kOyeAMSBOWQYSa3uh++FsEP+JGtBTlZXf+4DAYf+U08s8tMVE/61/RZrtJZF4OjW
 hyumRBD6zqZ9Y6Qtji2HaA3l9IGxOC2k4URw9JZdDDyMoRTQvawN1QWNAeZINXiv
 9ywqTruVsfQnoGDC1Gk1OEfQpubNztTAkEPqVM7ez5dkwOdwuOZXcZPL1Ltvb4Bt
 PLaWKLIYFYZKrM5kqgQlTERspSQA99++z8H9a21wFezfetaBby28fIqwMMfQAiSw
 nCq95WshJPwenogMtWjNfOgs/fqOBKdPdLFw0H6Jpmjwna2KpuFIZiTnwu25vvjz
 dHh4DVSuMTq1pBkXEU7B
 =vcSd
 -----END PGP SIGNATURE-----

Merge tag 'xfs-for-linus-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs

Pull xfs updates from Dave Chinner:
 "The major addition is the new iomap based block mapping
  infrastructure.  We've been kicking this about locally for years, but
  there are other filesystems want to use it too (e.g. gfs2).  Now it
  is fully working, reviewed and ready for merge and be used by other
  filesystems.

  There are a lot of other fixes and cleanups in the tree, but those are
  XFS internal things and none are of the scale or visibility of the
  iomap changes.  See below for details.

  I am likely to send another pull request next week - we're just about
  ready to merge some new functionality (on disk block->owner reverse
  mapping infrastructure), but that's a huge chunk of code (74 files
  changed, 7283 insertions(+), 1114 deletions(-)) so I'm keeping that
  separate to all the "normal" pull request changes so they don't get
  lost in the noise.

  Summary of changes in this update:
   - generic iomap based IO path infrastructure
   - generic iomap based fiemap implementation
   - xfs iomap based Io path implementation
   - buffer error handling fixes
   - tracking of in flight buffer IO for unmount serialisation
   - direct IO and DAX io path separation and simplification
   - shortform directory format definition changes for wider platform
     compatibility
   - various buffer cache fixes
   - cleanups in preparation for rmap merge
   - error injection cleanups and fixes
   - log item format buffer memory allocation restructuring to prevent
     rare OOM reclaim deadlocks
   - sparse inode chunks are now fully supported"

* tag 'xfs-for-linus-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs: (53 commits)
  xfs: remove EXPERIMENTAL tag from sparse inode feature
  xfs: bufferhead chains are invalid after end_page_writeback
  xfs: allocate log vector buffers outside CIL context lock
  libxfs: directory node splitting does not have an extra block
  xfs: remove dax code from object file when disabled
  xfs: skip dirty pages in ->releasepage()
  xfs: remove __arch_pack
  xfs: kill xfs_dir2_inou_t
  xfs: kill xfs_dir2_sf_off_t
  xfs: split direct I/O and DAX path
  xfs: direct calls in the direct I/O path
  xfs: stop using generic_file_read_iter for direct I/O
  xfs: split xfs_file_read_iter into buffered and direct I/O helpers
  xfs: remove s_maxbytes enforcement in xfs_file_read_iter
  xfs: kill ioflags
  xfs: don't pass ioflags around in the ioctl path
  xfs: track and serialize in-flight async buffers against unmount
  xfs: exclude never-released buffers from buftarg I/O accounting
  xfs: don't reset b_retries to 0 on every failure
  xfs: remove extraneous buffer flag changes
  ...
2016-07-27 09:53:35 -07:00
Tony Camuso b07b58a3e4 ipmi: remove trydefaults parameter and default init
Parameter trydefaults=1 causes the ipmi_init to initialize ipmi through
the legacy port io space that was designated for ipmi. Architectures
that do not map legacy port io can panic when trydefaults=1.

Rather than implement build-time conditional exceptions for each
architecture that does not map legacy port io, we have removed legacy
port io from the driver.

Parameter 'trydefaults' has been removed. Attempts to use it hereafter
will evoke the "Unknown symbol in module, or unknown parameter" message.

The patch was built against a number of architectures and tested for
regressions and functionality on x86_64 and ARM64.

Signed-off-by: Tony Camuso <tcamuso@redhat.com>

Removed the config entry and the address source entry for default,
since neither were used any more.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
2016-07-27 10:24:38 -05:00
Jiri Kosina bf262dcec6 module: fix noreturn attribute for __module_put_and_exit()
__module_put_and_exit() is makred noreturn in module.h declaration, but is
lacking the attribute in the definition, which makes some tools (such as
sparse) unhappy. Amend the definition with the attribute as well (and
reformat the declaration so that it uses more common format).

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2016-07-27 12:38:00 +09:30
Linus Torvalds 0e06f5c0de Merge branch 'akpm' (patches from Andrew)
Merge updates from Andrew Morton:

 - a few misc bits

 - ocfs2

 - most(?) of MM

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (125 commits)
  thp: fix comments of __pmd_trans_huge_lock()
  cgroup: remove unnecessary 0 check from css_from_id()
  cgroup: fix idr leak for the first cgroup root
  mm: memcontrol: fix documentation for compound parameter
  mm: memcontrol: remove BUG_ON in uncharge_list
  mm: fix build warnings in <linux/compaction.h>
  mm, thp: convert from optimistic swapin collapsing to conservative
  mm, thp: fix comment inconsistency for swapin readahead functions
  thp: update Documentation/{vm/transhuge,filesystems/proc}.txt
  shmem: split huge pages beyond i_size under memory pressure
  thp: introduce CONFIG_TRANSPARENT_HUGE_PAGECACHE
  khugepaged: add support of collapse for tmpfs/shmem pages
  shmem: make shmem_inode_info::lock irq-safe
  khugepaged: move up_read(mmap_sem) out of khugepaged_alloc_page()
  thp: extract khugepaged from mm/huge_memory.c
  shmem, thp: respect MADV_{NO,}HUGEPAGE for file mappings
  shmem: add huge pages support
  shmem: get_unmapped_area align huge page
  shmem: prepare huge= mount option and sysfs knob
  mm, rmap: account shmem thp pages
  ...
2016-07-26 19:55:54 -07:00
Linus Torvalds f7816ad0f8 power supply and reset changes for the v4.8 series
* introduce reboot mode driver
  * add DT support to max8903
  * add power supply support for axp221
  * misc. fixes
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABCgAGBQJXlm+bAAoJENju1/PIO/qaHCsP/0bwvb3PlvPozVs7iXba4hbK
 It6RxQDHl9Q0tvsgRitqoibomBlGqdMXADwr2yXsio255IYqNTKErJasiOn0KkO6
 Wuoo9FSiZV6KL1JMBvZOgMydK/x8q2yBKKJKm9JG9srLueJWuZa/4BqOWS4JSFJ6
 xlevxxA0XqUzWr6VUXFRTd/YXa0OaRDHaG9JN1NN8qvJWvvYZJLhpDzWdDu83KzU
 CPEP7g8UeOFLylcmjeHgT9rJuiHbd6UCr4yKjFqLvM6uM9SY/Uinz62u4tKL5zcp
 /9V4oo8/9RjJnG/gkm4ozAL2Rb1fpeajTZivUX3t+CiEV4Nc8xiQMCMDI2nKSqr4
 IF9nghYub+q6q7EfonWZ22Aa6VhGdI6ToG5iK3euag2vkypLYe9GD1tPyAnnxx1o
 xlnziiveGKcKwSIY3tTOFVAoGqhCWIpM06ZC7/Y9HvM6EPhAr4q4Pq1xNFwr0Rrs
 OoEobm5cRB7IUtu2Gl0GU/Ku5mDoen2LvSVzIjF36/aGN+atpd7Tkoam7LBtedZA
 aqsEV/qMfgsIUYWj5/lmLZOQuG6viEou/BWvQz8cczA0U2mFDcgY6UCLMwVeJ7sj
 ZT+1uVoupcO5UNEULwdzufUmaC0ze7HrGNeXZZwt5S1IWqNsRg71FzqFYYaIwZGy
 Rnt3P3tSqhxZrz1xeZ6r
 =bMvA
 -----END PGP SIGNATURE-----

Merge tag 'for-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply

Pull power supply and reset updates from Sebastian Reichel:
 - introduce reboot mode driver
 - add DT support to max8903
 - add power supply support for axp221
 - misc fixes

* tag 'for-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply:
  power: reset: add reboot mode driver
  dt-bindings: power: reset: add document for reboot-mode driver
  power_supply: fix return value of get_property
  power: qcom_smbb: Make an extcon for usb cable detection
  max8903: adds support for initiation via device tree
  max8903: adds documentation for device tree bindings.
  max8903: remove unnecessary 'out of memory' error message.
  max8903: removes non zero validity checks on gpios.
  max8903: adds requesting of gpios.
  max8903: cleans up confusing relationship between dc_valid, dok and dcm.
  max8903: store pointer to pdata instead of copying it.
  power_supply: bq27xxx_battery: Group register mappings into one table
  docs: Move brcm,bcm21664-resetmgr.txt
  power/reset: make syscon_poweroff() static
  power: axp20x_usb: Add support for usb power-supply on axp22x pmics
  power_supply: bq27xxx_battery: Index register numbers by enum
  power_supply: bq27xxx_battery: Fix copy/paste error in header comment
  MAINTAINERS: Add file patterns for power supply device tree bindings
  power: reset: keystone: Enable COMPILE_TEST
2016-07-26 19:49:09 -07:00
Linus Torvalds 6097d55e10 regulator: Updates for v4.8
A quiet regulator API release, a few new drivers and some fixes but
 nothing too notable.  There will also be some updates for the PWM
 regulator coming through the PWM tree which provide much smoother
 operation when taking over an already running PWM regulator after boot
 using some new PWM APIs.
 
  - Support for configuration of the initial suspend state from DT.
  - New drivers for Mediatek MT6323, Ricoh RN5T567 and X-Powers AXP809.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJXlk49AAoJECTWi3JdVIfQM4QH/jg2BGgdQSiuQspidmyN9tpL
 hZgvUzrKm0eXEvzvTaL4cPymqpDHgQgaWEFo3Kv2WiU+34XIzfKPTt8m0WI37HY9
 t4rTMlN8vL4WEMtAlB0i4DH1/jcEOJG2c88+tnSBGb5R8AXXIvx+F9xjWxOoJKJY
 ccf5QcupDfCJw99rs3Li5eIbHOW2RAcZgurGGUyV01kH+G/9LliNpIKa9uCxK4dw
 72aMH6QX+Hjid4UAqwL13C5W3sBDSebYYTwtTW61HosydLv818BfZC8IoezgjBRE
 unjh6kdxOpcZk0ftciNiNQFLTUhobdYAuQFjUZywNqijpqK28lbFypweP8yhoxg=
 =rDhl
 -----END PGP SIGNATURE-----

Merge tag 'regulator-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator

Pull regulator updates from Mark Brown:
 "A quiet regulator API release, a few new drivers and some fixes but
  nothing too notable.  There will also be some updates for the PWM
  regulator coming through the PWM tree which provide much smoother
  operation when taking over an already running PWM regulator after boot
  using some new PWM APIs.

  Summary:

   - Support for configuration of the initial suspend state from DT.

   - New drivers for Mediatek MT6323, Ricoh RN5T567 and X-Powers AXP809"

* tag 'regulator-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: (38 commits)
  regulator: da9053/52: Fix incorrectly stated minimum and maximum voltage limits
  regulator: mt6323: Constify struct regulator_ops
  regulator: mt6323: Fix module description
  regulator: mt6323: Add support for MT6323 regulator
  regulator: Add document for MT6323 regulator
  regulator: da9210: addition of device tree support
  regulator: act8865: Fix missing of_node_put() in act8865_pdata_from_dt()
  regulator: qcom_smd: Avoid overlapping linear voltage ranges
  regulator: s2mps11: Fix the voltage linear range for s2mps15
  regulator: pwm: Fix regulator ramp delay for continuous mode
  regulator: da9211: add descriptions for da9212/da9214
  mfd: rn5t618: Register restart handler
  mfd: rn5t618: Register power off callback optionally
  regulator: rn5t618: Add RN5T567 PMIC support
  mfd: rn5t618: Add Ricoh RN5T567 PMIC support
  ARM: dts: meson: minix-neo-x8: define PMIC as power controller
  regulator: tps65218: force set power-up/down strobe to 3 for dcdc3
  regulator: tps65218: Enable suspend configuration
  regulator: tps65217: Enable suspend configuration
  regulator: qcom_spmi: Add support for get_mode/set_mode on switches
  ...
2016-07-26 19:40:43 -07:00
Linus Torvalds ae9799975c regmap: Updates for v4.8
Several small updates and API enhancements:
 
  - Provide transparent unrolling of bulk writes into individual writes
    so they can be used with devices without raw formatting.
  - Fix compatibility between I2C controllers supporting block commands
    and devices with more than 8 bit wide registers.
  - Add some helpers for iopoll-like functionality and workarounds for
    weird interrupt controllers.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJXljTLAAoJECTWi3JdVIfQMSMH/1MRirJkwIds0SRbWcnG9hOe
 nMfUhkuFuCZJN82GwOh7YbAxLB1pGXKsGZq1XGMYFvwJawEizR3o83fgcRy1LQmG
 KOQHO1O/CWQqRo1ntGl33H2thhq6e67y6lpeMBhdMvp3VjqdyzlILNSva6iWbW/R
 N0iFSSKd4VdHlg/o+R3k7EzWCLZ4FZI7B9/5OjxWkoUR+tloNI8rR8Ecwl6+Q37K
 8+yo6R0Ntx/JXB7JvA5Y8vqPdUGi68+XpcnaAcpM5DaL95+lqzduWCjKfjG5oatZ
 3o/ORRVi+WyD95/NLXwKFMvSDSGokVDrU87HNgKph+f4sNF5TIoC/E+JhPUQBSU=
 =7BlM
 -----END PGP SIGNATURE-----

Merge tag 'regmap-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap

Pull regmap updates from Mark Brown:
 "Several small updates and API enhancements:

   - provide transparent unrolling of bulk writes into individual writes
     so they can be used with devices without raw formatting.

   - fix compatibility between I2C controllers supporting block commands
     and devices with more than 8 bit wide registers.

   - add some helpers for iopoll-like functionality and workarounds for
     weird interrupt controllers"

* tag 'regmap-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
  regmap: add iopoll-like polling macro
  regmap: Support bulk writes for devices without raw formatting
  regmap-i2c: Use i2c block command only if register value width is 8 bit
  regmap: irq: Add support to call client specific pre/post interrupt service
  regmap: Add file patterns for regmap device tree bindings
2016-07-26 19:24:33 -07:00
Linus Torvalds 9c1958fc32 media updates for v4.8-rc1
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXlfJvAAoJEAhfPr2O5OEVtLUP/RpCQ+W3YVryIdmLkdmYXoY7
 m2rXtUh7GmzBjaBkFzbRCGZtgROF7zl0e1R3nm4tLbCV4Becw8HO7YiMjqFJm9xr
 b6IngIyshsHf60Eii3RpLqUFvYrc/DDIMeYf8miwj/PvFAfI2BV9apraexJlpUuI
 wdyi28cfBHq4WYhubaXKoAyBQ8YRA/t8KNRAkDlifaOaMbSAxWHlmqoSmJWeQx73
 KHkSvbRPu4Hjo3R6q/ab8VhqmXeSnbqnQB9lgnxz7AmAZGhOlMYeAhV/K2ZwbBH8
 swv36RmJVO59Ov+vNR4p7GGGDL3+qk8JLj4LNVVfOcW0A+t7WrPQEmrL6VsyaZAy
 /+r4NEOcQN6Z5nFwbr3E0tYJ2Y5jFHOvsBfKd3EEGwty+hCl634akgb0vqtg06cg
 E2KG+XW983RBadVwEBnEudxJb0fWPWHGhXEqRrwOD+718FNmTqYM6dEvTEyxRup8
 EtCLj+eQQ4LmAyZxWyE8A+keKoMFQlHqk9LN9vQ7t7Wxq9mQ+V2l12T/lN4VhdTq
 4QZ4mrCMCGEvNcNzgSg6R/9lVb6RHDtMXZ3htbB/w+5xET/IKIANYyg1Hr7ahtdh
 rTW/4q6n3jtsu6tp5poteFvPzZKAblbrj2EptVzZYkonQ5BeAUisFTtneUL10Jmj
 EUf/sH0fqoOA0VvV6Tu+
 =mrOW
 -----END PGP SIGNATURE-----

Merge tag 'media/v4.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull media updates from Mauro Carvalho Chehab:

 - new framework support for HDMI CEC and remote control support

 - new encoding codec driver for Mediatek SoC

 - new frontend driver: helene tuner

 - added support for NetUp almost universal devices, with supports
   DVB-C/S/S2/T/T2 and ISDB-T

 - the mn88472 frontend driver got promoted from staging

 - a new driver for RCar video input

 - some soc_camera legacy drivers got removed: timb, omap1, mx2, mx3

 - lots of driver cleanups, improvements and fixups

* tag 'media/v4.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (377 commits)
  [media] cec: always check all_device_types and features
  [media] cec: poll should check if there is room in the tx queue
  [media] vivid: support monitor all mode
  [media] cec: fix test for unconfigured adapter in main message loop
  [media] cec: limit the size of the transmit queue
  [media] cec: zero unused msg part after msg->len
  [media] cec: don't set fh to NULL in CEC_TRANSMIT
  [media] cec: clear all status fields before transmit and always fill in sequence
  [media] cec: CEC_RECEIVE overwrote the timeout field
  [media] cxd2841er: Reading SNR for DVB-C added
  [media] cxd2841er: Reading BER and UCB for DVB-C added
  [media] cxd2841er: fix switch-case for DVB-C
  [media] cxd2841er: fix signal strength scale for ISDB-T
  [media] cxd2841er: adjust the dB scale for DVB-C
  [media] cxd2841er: provide signal strength for DVB-C
  [media] cxd2841er: fix BER report via DVBv5 stats API
  [media] mb86a20s: apply mask to val after checking for read failure
  [media] airspy: fix error logic during device register
  [media] s5p-cec/TODO: add TODO item
  [media] cec/TODO: drop comment about sphinx documentation
  ...
2016-07-26 18:59:59 -07:00
Linus Torvalds 1b3fc0bef8 pstore subsystem updates for v4.8
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 Comment: Kees Cook <kees@outflux.net>
 
 iQIcBAABCgAGBQJXloQYAAoJEIly9N/cbcAmGC0QAIpyoiqEuDiJq/XpRg1ux+PC
 Vyr15Pub9yQYwcrWffMH1Zr0GhlFmXb1iP9rp36zdtMhjBEfq7wegvblLVMlzl6G
 7nYt8hJjDh/h8iw1lElgDL2kwUbTym43HoczJNvY/lOmFuUMK8AoDIRYjFTLAKfQ
 S4KA9MFJe3kDh4OUoQVfQNrC2VReLD4uvXk4EUF0wDYoqjVKyU3WBHOMgEmggKTR
 cb+fwhg3Lj4cuMMtZqy8wCqZ/hqhaH8giHC9YbIZQyre3ylncH9xUZyfiqS6nQGc
 eLc03qxqDNsmZvcY6cJgXldLQ3tXM4o96Moakzn2n4sQcW9vh/3oZzDPd7gC8Ei1
 GfIXmRBXFhj5JaeHNGJxL6oCywK+JaqxG8nqD7cEcXTzJiHzjn5kKKSFlr3GmI7w
 47htXv9t07SMgQW0IlBws5yApfeB62dQXmhZc1kMtbonhGdAZCCUg2Nrv34VxrjX
 Dp+LCmD5bg/fBrnAt8f+IIQEd3pElngay+SmSEB9XFUejf3pKw8SvcoDbmE3LD7M
 zGh5bEkptHll2GMInVAt4b4tiC44e7u+0H1Rsi/ttA1cktXZ9hOtFewhXNvfOS2I
 hAMGSngOdpzR5v9Mof5hJAWrr1CLkoh757UoYMsb8u2V9aQw7oVZ5JRMzjZBU6iC
 qXqHm4h5P1bpAeit3DLF
 =fiRI
 -----END PGP SIGNATURE-----

Merge tag 'pstore-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull pstore subsystem updates from Kees Cook:
 "This expands the supported compressors, fixes some bugs, and finally
  adds DT bindings"

* tag 'pstore-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  pstore/ram: add Device Tree bindings
  efi-pstore: implement efivars_pstore_exit()
  pstore: drop file opened reference count
  pstore: add lzo/lz4 compression support
  pstore: Cleanup pstore_dump()
  pstore: Enable compression on normal path (again)
  ramoops: Only unregister when registered
2016-07-26 18:48:23 -07:00
Linus Torvalds 396d10993f The major change this cycle is deleting ext4's copy of the file system
encryption code and switching things over to using the copies in
 fs/crypto.  I've updated the MAINTAINERS file to add an entry for
 fs/crypto listing Jaeguk Kim and myself as the maintainers.
 
 There are also a number of bug fixes, most notably for some problems
 found by American Fuzzy Lop (AFL) courtesy of Vegard Nossum.  Also
 fixed is a writeback deadlock detected by generic/130, and some
 potential races in the metadata checksum code.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABCAAGBQJXlbP9AAoJEPL5WVaVDYGjGxgIAJ9YIqme//yix63oHYLhDNea
 lY/TLqZrb9/TdDRvGyZa3jYaKaIejL53eEQS9nhEB/JI0sEiDpHmOrDOxdj8Hlsw
 fm7nJyh1u4vFKPyklCbIvLAje1vl8X/6OvqQiwh45gIxbbsFftaBWtccW+UtEkIP
 Fx65Vk7RehJ/sNrM0cRrwB79YAmDS8P6BPyzdMRk+vO/uFqyq7Auc+pkd+bTlw/m
 TDAEIunlk0Ovjx75ru1zaemL1JJx5ffehrJmGCcSUPHVbMObOEKIrlV50gAAKVhO
 qbZAri3mhDvyspSLuS/73L9skeCiWFLhvojCBGu4t2aa3JJolmItO7IpKi4HdRU=
 =bxGK
 -----END PGP SIGNATURE-----

Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 updates from Ted Ts'o:
 "The major change this cycle is deleting ext4's copy of the file system
  encryption code and switching things over to using the copies in
  fs/crypto.  I've updated the MAINTAINERS file to add an entry for
  fs/crypto listing Jaeguk Kim and myself as the maintainers.

  There are also a number of bug fixes, most notably for some problems
  found by American Fuzzy Lop (AFL) courtesy of Vegard Nossum.  Also
  fixed is a writeback deadlock detected by generic/130, and some
  potential races in the metadata checksum code"

* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (21 commits)
  ext4: verify extent header depth
  ext4: short-cut orphan cleanup on error
  ext4: fix reference counting bug on block allocation error
  MAINTAINRES: fs-crypto maintainers update
  ext4 crypto: migrate into vfs's crypto engine
  ext2: fix filesystem deadlock while reading corrupted xattr block
  ext4: fix project quota accounting without quota limits enabled
  ext4: validate s_reserved_gdt_blocks on mount
  ext4: remove unused page_idx
  ext4: don't call ext4_should_journal_data() on the journal inode
  ext4: Fix WARN_ON_ONCE in ext4_commit_super()
  ext4: fix deadlock during page writeback
  ext4: correct error value of function verifying dx checksum
  ext4: avoid modifying checksum fields directly during checksum verification
  ext4: check for extents that wrap around
  jbd2: make journal y2038 safe
  jbd2: track more dependencies on transaction commit
  jbd2: move lockdep tracking to journal_s
  jbd2: move lockdep instrumentation for jbd2 handles
  ext4: respect the nobarrier mount option in nojournal mode
  ...
2016-07-26 18:35:55 -07:00
Linus Torvalds e663107fa1 ACPI material for v4.8-rc1
- Support for ACPI SSDT overlays allowing Secondary System
    Description Tables (SSDTs) to be loaded at any time from EFI
    variables or via configfs (Octavian Purdila, Mika Westerberg).
 
  - Support for the ACPI LPI (Low-Power Idle) feature introduced in
    ACPI 6.0 and allowing processor idle states to be represented in
    ACPI tables in a hierarchical way (with the help of Processor
    Container objects) and support for ACPI idle states management
    on ARM64, based on LPI (Sudeep Holla).
 
  - General improvements of ACPI support for NUMA and ARM64 support
    for ACPI-based NUMA (Hanjun Guo, David Daney, Robert Richter).
 
  - General improvements of the ACPI table upgrade mechanism and
    ARM64 support for that feature (Aleksey Makarov, Jon Masters).
 
  - Support for the Boot Error Record Table (BERT) in APEI and
    improvements of kernel messages printed by the error injection
    code (Huang Ying, Borislav Petkov).
 
  - New driver for the Intel Broxton WhiskeyCove PMIC operation
    region and support for the REGS operation region on Broxton,
    PMIC code cleanups (Bin Gao, Felipe Balbi, Paul Gortmaker).
 
  - New driver for the power participant device which is part of the
    Dynamic Power and Thermal Framework (DPTF) and DPTF-related code
    reorganization (Srinivas Pandruvada).
 
  - Support for the platform-initiated graceful shutdown feature
    introduced in ACPI 6.1 (Prashanth Prakash).
 
  - ACPI button driver update related to lid input events generated
    automatically on initialization and system resume that have been
    problematic for some time (Lv Zheng).
 
  - ACPI EC driver cleanups (Lv Zheng).
 
  - Documentation of the ACPICA release automation process and the
    in-kernel ACPI AML debugger (Lv Zheng).
 
  - New blacklist entry and two fixes for the ACPI backlight driver
    (Alex Hung, Arvind Yadav, Ralf Gerbig).
 
  - Cleanups of the ACPI pci_slot driver (Joe Perches, Paul Gortmaker).
 
  - ACPI CPPC code changes to make it more robust against possible
    defects in ACPI tables and new symbol definitions for PCC (Hoan
    Tran).
 
  - System reboot code modification to execute the ACPI _PTS (Prepare
    To Sleep) method in addition to _TTS (Ocean He).
 
  - ACPICA-related change to carry out lock ordering checks in ACPICA
    if ACPICA debug is enabled in the kernel (Lv Zheng).
 
  - Assorted minor fixes and cleanups (Andy Shevchenko, Baoquan He,
    Bhaktipriya Shridhar, Paul Gortmaker, Rafael Wysocki).
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABCAAGBQJXl8A7AAoJEILEb/54YlRxF0kQAI6mH0yan60Osu4598+VNvgv
 wxOWl1TEbKd+LaJkofRZ+FPzZkQf5c/h/8Oo8Q3LEpFhjkARhhX7ThDjS5v2Nx6v
 I/icQ64ynPUPrw6hGNVrmec9ofZjiAs3j3Rt2bEiae+YN6guvfhWE+kBCHo2G/nN
 o4BSaxYjkphUTDSi4/5BfaocV2sl3apvwjtAj8zgGn4RD81bFFLnblynHkqJVcoN
 HAfm7QTVjT01Zkv565OSZgK8CFcD8Ky2KKKBQvcIW8zQmD6IXaoTHSYSwL0SH+oK
 bxUZUmWVfFWw4kDTAY9mw0QwtWz9ODTWh/WMhs3itWRRN5qHfogs99rCVYFtFufQ
 ODVy4wpt4wmpzZVhyUDTTigAhznPAtCam6EpL1YeNbtyrRN4evvZVFHBZJnmhosX
 zI9iLF4eqdnJZKvh+L1VFU+py8aAZpz1ZEOatNMI+xdhArbGm7v89cldzaRkJhuW
 LZr+JqYQGaOZS5qSnymwJL1KfF66+2QGpzdvzJN5FNIDACoqanATbZ/Iie2ENcM+
 WwCEWrGJFDmM30raBNNcvx0yHFtVkcNbOymla4paVg7i29nu88Ynw4Z6seIIP11C
 DryzLFhw+3jdTg2zK/te/wkhciJ0F+iZjo6VXywSMnwatf36bpdp4r4JLUVfEo2t
 8DOGKyFMLYY1zOPMK9Th
 =YwbM
 -----END PGP SIGNATURE-----

Merge tag 'acpi-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI updates from Rafael Wysocki:
 "The new feaures here are the support for ACPI overlays (allowing ACPI
  tables to be loaded at any time from EFI variables or via configfs)
  and the LPI (Low-Power Idle) support.  Also notable is the ACPI-based
  NUMA support for ARM64.

  Apart from that we have two new drivers, for the DPTF (Dynamic Power
  and Thermal Framework) power participant device and for the Intel
  Broxton WhiskeyCove PMIC, some more PMIC-related changes, support for
  the Boot Error Record Table (BERT) in APEI and support for
  platform-initiated graceful shutdown.

  Plus two new pieces of documentation and usual assorted fixes and
  cleanups in quite a few places.

  Specifics:

   - Support for ACPI SSDT overlays allowing Secondary System
     Description Tables (SSDTs) to be loaded at any time from EFI
     variables or via configfs (Octavian Purdila, Mika Westerberg).

   - Support for the ACPI LPI (Low-Power Idle) feature introduced in
     ACPI 6.0 and allowing processor idle states to be represented in
     ACPI tables in a hierarchical way (with the help of Processor
     Container objects) and support for ACPI idle states management on
     ARM64, based on LPI (Sudeep Holla).

   - General improvements of ACPI support for NUMA and ARM64 support for
     ACPI-based NUMA (Hanjun Guo, David Daney, Robert Richter).

   - General improvements of the ACPI table upgrade mechanism and ARM64
     support for that feature (Aleksey Makarov, Jon Masters).

   - Support for the Boot Error Record Table (BERT) in APEI and
     improvements of kernel messages printed by the error injection code
     (Huang Ying, Borislav Petkov).

   - New driver for the Intel Broxton WhiskeyCove PMIC operation region
     and support for the REGS operation region on Broxton, PMIC code
     cleanups (Bin Gao, Felipe Balbi, Paul Gortmaker).

   - New driver for the power participant device which is part of the
     Dynamic Power and Thermal Framework (DPTF) and DPTF-related code
     reorganization (Srinivas Pandruvada).

   - Support for the platform-initiated graceful shutdown feature
     introduced in ACPI 6.1 (Prashanth Prakash).

   - ACPI button driver update related to lid input events generated
     automatically on initialization and system resume that have been
     problematic for some time (Lv Zheng).

   - ACPI EC driver cleanups (Lv Zheng).

   - Documentation of the ACPICA release automation process and the
     in-kernel ACPI AML debugger (Lv Zheng).

   - New blacklist entry and two fixes for the ACPI backlight driver
     (Alex Hung, Arvind Yadav, Ralf Gerbig).

   - Cleanups of the ACPI pci_slot driver (Joe Perches, Paul Gortmaker).

   - ACPI CPPC code changes to make it more robust against possible
     defects in ACPI tables and new symbol definitions for PCC (Hoan
     Tran).

   - System reboot code modification to execute the ACPI _PTS (Prepare
     To Sleep) method in addition to _TTS (Ocean He).

   - ACPICA-related change to carry out lock ordering checks in ACPICA
     if ACPICA debug is enabled in the kernel (Lv Zheng).

   - Assorted minor fixes and cleanups (Andy Shevchenko, Baoquan He,
     Bhaktipriya Shridhar, Paul Gortmaker, Rafael Wysocki)"

* tag 'acpi-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (71 commits)
  ACPI: enable ACPI_PROCESSOR_IDLE on ARM64
  arm64: add support for ACPI Low Power Idle(LPI)
  drivers: firmware: psci: initialise idle states using ACPI LPI
  cpuidle: introduce CPU_PM_CPU_IDLE_ENTER macro for ARM{32, 64}
  arm64: cpuidle: drop __init section marker to arm_cpuidle_init
  ACPI / processor_idle: Add support for Low Power Idle(LPI) states
  ACPI / processor_idle: introduce ACPI_PROCESSOR_CSTATE
  ACPI / DPTF: move int340x_thermal.c to the DPTF folder
  ACPI / DPTF: Add DPTF power participant driver
  ACPI / lpat: make it explicitly non-modular
  ACPI / dock: make dock explicitly non-modular
  ACPI / PCI: make pci_slot explicitly non-modular
  ACPI / PMIC: remove modular references from non-modular code
  ACPICA: Linux: Enable ACPI_MUTEX_DEBUG for Linux kernel
  ACPI: Rename configfs.c to acpi_configfs.c to prevent link error
  ACPI / debugger: Add AML debugger documentation
  ACPI: Add documentation describing ACPICA release automation
  ACPI: add support for loading SSDTs via configfs
  ACPI: add support for configfs
  efi / ACPI: load SSTDs from EFI variables
  ...
2016-07-26 17:56:45 -07:00
Linus Torvalds 6453dbdda3 Power management material for v4.8-rc1
- Rework the cpufreq governor interface to make it more straightforward
    and modify the conservative governor to avoid using transition
    notifications (Rafael Wysocki).
 
  - Rework the handling of frequency tables by the cpufreq core to make
    it more efficient (Viresh Kumar).
 
  - Modify the schedutil governor to reduce the number of wakeups it
    causes to occur in cases when the CPU frequency doesn't need to be
    changed (Steve Muckle, Viresh Kumar).
 
  - Fix some minor issues and clean up code in the cpufreq core and
    governors (Rafael Wysocki, Viresh Kumar).
 
  - Add Intel Broxton support to the intel_pstate driver (Srinivas
    Pandruvada).
 
  - Fix problems related to the config TDP feature and to the validity
    of the MSR_HWP_INTERRUPT register in intel_pstate (Jan Kiszka,
    Srinivas Pandruvada).
 
  - Make intel_pstate update the cpu_frequency tracepoint even if
    the frequency doesn't change to avoid confusing powertop (Rafael
    Wysocki).
 
  - Clean up the usage of __init/__initdata in intel_pstate, mark some
    of its internal variables as __read_mostly and drop an unused
    structure element from it (Jisheng Zhang, Carsten Emde).
 
  - Clean up the usage of some duplicate MSR symbols in intel_pstate
    and turbostat (Srinivas Pandruvada).
 
  - Update/fix the powernv, s3c24xx and mvebu cpufreq drivers (Akshay
    Adiga, Viresh Kumar, Ben Dooks).
 
  - Fix a regression (introduced during the 4.5 cycle) in the
    pcc-cpufreq driver by reverting the problematic commit (Andreas
    Herrmann).
 
  - Add support for Intel Denverton to intel_idle, clean up Broxton
    support in it and make it explicitly non-modular (Jacob Pan,
    Jan Beulich, Paul Gortmaker).
 
  - Add support for Denverton and Ivy Bridge server to the Intel RAPL
    power capping driver and make it more careful about the handing
    of MSRs that may not be present (Jacob Pan, Xiaolong Wang).
 
  - Fix resume from hibernation on x86-64 by making the CPU offline
    during resume avoid using MONITOR/MWAIT in the "play dead" loop
    which may lead to an inadvertent "revival" of a "dead" CPU and
    a page fault leading to a kernel crash from it (Rafael Wysocki).
 
  - Make memory management during resume from hibernation more
    straightforward (Rafael Wysocki).
 
  - Add debug features that should help to detect problems related
    to hibernation and resume from it (Rafael Wysocki, Chen Yu).
 
  - Clean up hibernation core somewhat (Rafael Wysocki).
 
  - Prevent KASAN from instrumenting the hibernation core which leads
    to large numbers of false-positives from it (James Morse).
 
  - Prevent PM (hibernate and suspend) notifiers from being called
    during the cleanup phase if they have not been called during the
    corresponding preparation phase which is possible if one of the
    other notifiers returns an error at that time (Lianwei Wang).
 
  - Improve suspend-related debug printout in the tasks freezer and
    clean up suspend-related console handling (Roger Lu, Borislav
    Petkov).
 
  - Update the AnalyzeSuspend script in the kernel sources to
    version 4.2 (Todd Brandt).
 
  - Modify the generic power domains framework to make it handle
    system suspend/resume better (Ulf Hansson).
 
  - Make the runtime PM framework avoid resuming devices synchronously
    when user space changes the runtime PM settings for them and
    improve its error reporting (Rafael Wysocki, Linus Walleij).
 
  - Fix error paths in devfreq drivers (exynos, exynos-ppmu, exynos-bus)
    and in the core, make some devfreq code explicitly non-modular and
    change some of it into tristate (Bartlomiej Zolnierkiewicz,
    Peter Chen, Paul Gortmaker).
 
  - Add DT support to the generic PM clocks management code and make
    it export some more symbols (Jon Hunter, Paul Gortmaker).
 
  - Make the PCI PM core code slightly more robust against possible
    driver errors (Andy Shevchenko).
 
  - Make it possible to change DESTDIR and PREFIX in turbostat
    (Andy Shevchenko).
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABCAAGBQJXl7/dAAoJEILEb/54YlRx+VgQAIQJOWvxKew3Yl02c/sdj9OT
 5VNnFrzGzdcAPofvvG9qGq8B0Es1vYehJpwwOB21ri8EvYv0riIiU1yrqslObojQ
 oaZOkSBpbIoKjGR4CpYA/A+feE+8EqIBdPGd+lx5a6oRdUi7tRVHBG9lyLO3FB/i
 jan1q8dMpZsmu+Y+rVVHGnCVuIlIEqr2ZnZfCwDAulO2Arp/QFAh4kH08ELATvrl
 bkPa25vq7/VMP/vCDzrfZKD5mUuKogIRu/J5wx4py1nE+FB35cKKyqBOgklLwAeY
 UI8vjDhr/myNUs54AZlktOkq47TCYvjvhX9kmOxBjuWqFbRusU012IRek1fYPRIV
 ZqbkqNX7UEVQwunAEg9AyFwyzEtOht93dQDT5RLEd4QzKuM76gmHpLeTGGMzE+nu
 FnmF9JGl4DVwqpZl9yU2+hR2Mt3bP8OF8qYmNiGUB3KO4emPslhSd+6y8liA5Bx2
 SJf0Gb//vaHCh3/uMnwAonYPqRkZvBLOMwuL1VUjNQfRMnQtDdgHMYB1aT/EglPA
 8ww6j4J8rVRLAxvYQ3UEmNA/vBNclKXblRR18+JddEZP9/oX0ATfwnCCUpr839uk
 xxyQhrm4/AI60+PHWCX4GG80YrKdOGTkF7LXCQZanVWjjuyF17rufegZ2YWLT07v
 JU1Cmumfdy2jJluT8xsR
 =uVGz
 -----END PGP SIGNATURE-----

Merge tag 'pm-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management updates from Rafael  Wysocki:
 "Again, the majority of changes go into the cpufreq subsystem, but
  there are no big features this time.  The cpufreq changes that stand
  out somewhat are the governor interface rework and improvements
  related to the handling of frequency tables.  Apart from those, there
  are fixes and new device/CPU IDs in drivers, cleanups and an
  improvement of the new schedutil governor.

  Next, there are some changes in the hibernation core, including a fix
  for a nasty problem related to the MONITOR/MWAIT usage by CPU offline
  during resume from hibernation, a few core improvements related to
  memory management during resume, a couple of additional debug features
  and cleanups.

  Finally, we have some fixes and cleanups in the devfreq subsystem,
  generic power domains framework improvements related to system
  suspend/resume, support for some new chips in intel_idle and in the
  power capping RAPL driver, a new version of the AnalyzeSuspend utility
  and some assorted fixes and cleanups.

  Specifics:

   - Rework the cpufreq governor interface to make it more
     straightforward and modify the conservative governor to avoid using
     transition notifications (Rafael Wysocki).

   - Rework the handling of frequency tables by the cpufreq core to make
     it more efficient (Viresh Kumar).

   - Modify the schedutil governor to reduce the number of wakeups it
     causes to occur in cases when the CPU frequency doesn't need to be
     changed (Steve Muckle, Viresh Kumar).

   - Fix some minor issues and clean up code in the cpufreq core and
     governors (Rafael Wysocki, Viresh Kumar).

   - Add Intel Broxton support to the intel_pstate driver (Srinivas
     Pandruvada).

   - Fix problems related to the config TDP feature and to the validity
     of the MSR_HWP_INTERRUPT register in intel_pstate (Jan Kiszka,
     Srinivas Pandruvada).

   - Make intel_pstate update the cpu_frequency tracepoint even if the
     frequency doesn't change to avoid confusing powertop (Rafael
     Wysocki).

   - Clean up the usage of __init/__initdata in intel_pstate, mark some
     of its internal variables as __read_mostly and drop an unused
     structure element from it (Jisheng Zhang, Carsten Emde).

   - Clean up the usage of some duplicate MSR symbols in intel_pstate
     and turbostat (Srinivas Pandruvada).

   - Update/fix the powernv, s3c24xx and mvebu cpufreq drivers (Akshay
     Adiga, Viresh Kumar, Ben Dooks).

   - Fix a regression (introduced during the 4.5 cycle) in the
     pcc-cpufreq driver by reverting the problematic commit (Andreas
     Herrmann).

   - Add support for Intel Denverton to intel_idle, clean up Broxton
     support in it and make it explicitly non-modular (Jacob Pan, Jan
     Beulich, Paul Gortmaker).

   - Add support for Denverton and Ivy Bridge server to the Intel RAPL
     power capping driver and make it more careful about the handing of
     MSRs that may not be present (Jacob Pan, Xiaolong Wang).

   - Fix resume from hibernation on x86-64 by making the CPU offline
     during resume avoid using MONITOR/MWAIT in the "play dead" loop
     which may lead to an inadvertent "revival" of a "dead" CPU and a
     page fault leading to a kernel crash from it (Rafael Wysocki).

   - Make memory management during resume from hibernation more
     straightforward (Rafael Wysocki).

   - Add debug features that should help to detect problems related to
     hibernation and resume from it (Rafael Wysocki, Chen Yu).

   - Clean up hibernation core somewhat (Rafael Wysocki).

   - Prevent KASAN from instrumenting the hibernation core which leads
     to large numbers of false-positives from it (James Morse).

   - Prevent PM (hibernate and suspend) notifiers from being called
     during the cleanup phase if they have not been called during the
     corresponding preparation phase which is possible if one of the
     other notifiers returns an error at that time (Lianwei Wang).

   - Improve suspend-related debug printout in the tasks freezer and
     clean up suspend-related console handling (Roger Lu, Borislav
     Petkov).

   - Update the AnalyzeSuspend script in the kernel sources to version
     4.2 (Todd Brandt).

   - Modify the generic power domains framework to make it handle system
     suspend/resume better (Ulf Hansson).

   - Make the runtime PM framework avoid resuming devices synchronously
     when user space changes the runtime PM settings for them and
     improve its error reporting (Rafael Wysocki, Linus Walleij).

   - Fix error paths in devfreq drivers (exynos, exynos-ppmu,
     exynos-bus) and in the core, make some devfreq code explicitly
     non-modular and change some of it into tristate (Bartlomiej
     Zolnierkiewicz, Peter Chen, Paul Gortmaker).

   - Add DT support to the generic PM clocks management code and make it
     export some more symbols (Jon Hunter, Paul Gortmaker).

   - Make the PCI PM core code slightly more robust against possible
     driver errors (Andy Shevchenko).

   - Make it possible to change DESTDIR and PREFIX in turbostat (Andy
     Shevchenko)"

* tag 'pm-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (89 commits)
  Revert "cpufreq: pcc-cpufreq: update default value of cpuinfo_transition_latency"
  PM / hibernate: Introduce test_resume mode for hibernation
  cpufreq: export cpufreq_driver_resolve_freq()
  cpufreq: Disallow ->resolve_freq() for drivers providing ->target_index()
  PCI / PM: check all fields in pci_set_platform_pm()
  cpufreq: acpi-cpufreq: use cached frequency mapping when possible
  cpufreq: schedutil: map raw required frequency to driver frequency
  cpufreq: add cpufreq_driver_resolve_freq()
  cpufreq: intel_pstate: Check cpuid for MSR_HWP_INTERRUPT
  intel_pstate: Update cpu_frequency tracepoint every time
  cpufreq: intel_pstate: clean remnant struct element
  PM / tools: scripts: AnalyzeSuspend v4.2
  x86 / hibernate: Use hlt_play_dead() when resuming from hibernation
  cpufreq: powernv: Replacing pstate_id with frequency table index
  intel_pstate: Fix MSR_CONFIG_TDP_x addressing in core_get_max_pstate()
  PM / hibernate: Image data protection during restoration
  PM / hibernate: Add missing braces in __register_nosave_region()
  PM / hibernate: Clean up comments in snapshot.c
  PM / hibernate: Clean up function headers in snapshot.c
  PM / hibernate: Add missing braces in hibernate_setup()
  ...
2016-07-26 17:29:07 -07:00
Linus Torvalds f7e6816994 - initially based on Jens' 'for-4.8/core' (given all the flag churn) and
later merged with 'for-4.8/core' to pickup the QUEUE_FLAG_DAX commits
   that DM depends on to provide its DAX support
 
 - clean up the bio-based vs request-based DM core code by moving the
   request-based DM core code out to dm-rq.[hc]
 
 - reinstate bio-based support in the DM multipath target (done with the
   idea that fast storage like NVMe over Fabrics could benefit) -- while
   preserving support for request_fn and blk-mq request-based DM mpath
 
 - SCSI and DM multipath persistent reservation fixes that were
   coordinated with Martin Petersen.
 
 - the DM raid target saw the most extensive change this cycle; it now
   provides reshape and takeover support (by layering ontop of the
   corresponding MD capabilities)
 
 - DAX support for DM core and the linear, stripe and error targets
 
 - A DM thin-provisioning block discard vs allocation race fix that
   addresses potential for corruption
 
 - A stable fix for DM verity-fec's block calculation during decode
 
 - A few cleanups and fixes to DM core and various targets
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJXkRZmAAoJEMUj8QotnQNat2wH/i4LpkoGI5tI6UhyKWxRkzJp
 vKaJ0zuZ2Ez73DucJujNuvaiyHq1IjHD5pfr8JQO3E8ygDkRC2KjF2O8EXp0Has6
 U1uLahQej72MAs0ZJTpvfE+JiY6qyIl4K+xxuPmYm2f2S5TWTIgOetYjJQmcMlQo
 Y8zFfcDYn4Dv5rMdvDT4+1ePETxq74wcBwTxyW3OAbHE1f0JjsUGdMKzXB1iTWcM
 VjLjWI//ETfFdIlDO0w2Qbd90aLUjmTR2k67RGnbPj5kNUNikv/X6iiY32KERR/0
 vMiiJ7JS+a44P7FJqCMoAVM/oBYFiSNpS4LYevOgHb0G0ikF8kaSeqBPC6sMYvg=
 =uYt9
 -----END PGP SIGNATURE-----

Merge tag 'dm-4.8-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull device mapper updates from Mike Snitzer:

 - initially based on Jens' 'for-4.8/core' (given all the flag churn)
   and later merged with 'for-4.8/core' to pickup the QUEUE_FLAG_DAX
   commits that DM depends on to provide its DAX support

 - clean up the bio-based vs request-based DM core code by moving the
   request-based DM core code out to dm-rq.[hc]

 - reinstate bio-based support in the DM multipath target (done with the
   idea that fast storage like NVMe over Fabrics could benefit) -- while
   preserving support for request_fn and blk-mq request-based DM mpath

 - SCSI and DM multipath persistent reservation fixes that were
   coordinated with Martin Petersen.

 - the DM raid target saw the most extensive change this cycle; it now
   provides reshape and takeover support (by layering ontop of the
   corresponding MD capabilities)

 - DAX support for DM core and the linear, stripe and error targets

 - a DM thin-provisioning block discard vs allocation race fix that
   addresses potential for corruption

 - a stable fix for DM verity-fec's block calculation during decode

 - a few cleanups and fixes to DM core and various targets

* tag 'dm-4.8-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (73 commits)
  dm: allow bio-based table to be upgraded to bio-based with DAX support
  dm snap: add fake origin_direct_access
  dm stripe: add DAX support
  dm error: add DAX support
  dm linear: add DAX support
  dm: add infrastructure for DAX support
  dm thin: fix a race condition between discarding and provisioning a block
  dm btree: fix a bug in dm_btree_find_next_single()
  dm raid: fix random optimal_io_size for raid0
  dm raid: address checkpatch.pl complaints
  dm: call PR reserve/unreserve on each underlying device
  sd: don't use the ALL_TG_PT bit for reservations
  dm: fix second blk_delay_queue() parameter to be in msec units not jiffies
  dm raid: change logical functions to actually return bool
  dm raid: use rdev_for_each in status
  dm raid: use rs->raid_disks to avoid memory leaks on free
  dm raid: support delta_disks for raid1, fix table output
  dm raid: enhance reshape check and factor out reshape setup
  dm raid: allow resize during recovery
  dm raid: fix rs_is_recovering() to allow for lvextend
  ...
2016-07-26 17:12:11 -07:00
Minchan Kim dd4123f324 mm: fix build warnings in <linux/compaction.h>
Randy reported below build error.

> In file included from ../include/linux/balloon_compaction.h:48:0,
>                  from ../mm/balloon_compaction.c:11:
> ../include/linux/compaction.h:237:51: warning: 'struct node' declared inside parameter list [enabled by default]
>  static inline int compaction_register_node(struct node *node)
> ../include/linux/compaction.h:237:51: warning: its scope is only this definition or declaration, which is probably not what you want [enabled by default]
> ../include/linux/compaction.h:242:54: warning: 'struct node' declared inside parameter list [enabled by default]
>  static inline void compaction_unregister_node(struct node *node)
>

It was caused by non-lru page migration which needs compaction.h but
compaction.h doesn't include any header to be standalone.

I think proper header for non-lru page migration is migrate.h rather
than compaction.h because migrate.h has already headers needed to work
non-lru page migration indirectly like isolate_mode_t, migrate_mode
MIGRATEPAGE_SUCCESS.

[akpm@linux-foundation.org: revert mm-balloon-use-general-non-lru-movable-page-feature-fix.patch temp fix]
Link: http://lkml.kernel.org/r/20160610003304.GE29779@bbox
Signed-off-by: Minchan Kim <minchan@kernel.org>
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Gioh Kim <gi-oh.kim@profitbricks.com>
Cc: Rafael Aquini <aquini@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-26 16:19:19 -07:00
Kirill A. Shutemov 779750d20b shmem: split huge pages beyond i_size under memory pressure
Even if user asked to allocate huge pages always (huge=always), we
should be able to free up some memory by splitting pages which are
partly byound i_size if memory presure comes or once we hit limit on
filesystem size (-o size=).

In order to do this we maintain per-superblock list of inodes, which
potentially have huge pages on the border of file size.

Per-fs shrinker can reclaim memory by splitting such pages.

If we hit -ENOSPC during shmem_getpage_gfp(), we try to split a page to
free up space on the filesystem and retry allocation if it succeed.

Link: http://lkml.kernel.org/r/1466021202-61880-37-git-send-email-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-26 16:19:19 -07:00
Kirill A. Shutemov e496cf3d78 thp: introduce CONFIG_TRANSPARENT_HUGE_PAGECACHE
For file mappings, we don't deposit page tables on THP allocation
because it's not strictly required to implement split_huge_pmd(): we can
just clear pmd and let following page faults to reconstruct the page
table.

But Power makes use of deposited page table to address MMU quirk.

Let's hide THP page cache, including huge tmpfs, under separate config
option, so it can be forbidden on Power.

We can revert the patch later once solution for Power found.

Link: http://lkml.kernel.org/r/1466021202-61880-36-git-send-email-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-26 16:19:19 -07:00
Kirill A. Shutemov f3f0e1d215 khugepaged: add support of collapse for tmpfs/shmem pages
This patch extends khugepaged to support collapse of tmpfs/shmem pages.
We share fair amount of infrastructure with anon-THP collapse.

Few design points:

  - First we are looking for VMA which can be suitable for mapping huge
    page;

  - If the VMA maps shmem file, the rest scan/collapse operations
    operates on page cache, not on page tables as in anon VMA case.

  - khugepaged_scan_shmem() finds a range which is suitable for huge
    page. The scan is lockless and shouldn't disturb system too much.

  - once the candidate for collapse is found, collapse_shmem() attempts
    to create a huge page:

      + scan over radix tree, making the range point to new huge page;

      + new huge page is not-uptodate, locked and freezed (refcount
        is 0), so nobody can touch them until we say so.

      + we swap in pages during the scan. khugepaged_scan_shmem()
        filters out ranges with more than khugepaged_max_ptes_swap
	swapped out pages. It's HPAGE_PMD_NR/8 by default.

      + old pages are isolated, unmapped and put to local list in case
        to be restored back if collapse failed.

  - if collapse succeed, we retract pte page tables from VMAs where huge
    pages mapping is possible. The huge page will be mapped as PMD on
    next minor fault into the range.

Link: http://lkml.kernel.org/r/1466021202-61880-35-git-send-email-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-26 16:19:19 -07:00
Kirill A. Shutemov b46e756f5e thp: extract khugepaged from mm/huge_memory.c
khugepaged implementation grew to the point when it deserve separate
file in source.

Let's move it to mm/khugepaged.c.

Link: http://lkml.kernel.org/r/1466021202-61880-32-git-send-email-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-26 16:19:19 -07:00
Kirill A. Shutemov 800d8c63b2 shmem: add huge pages support
Here's basic implementation of huge pages support for shmem/tmpfs.

It's all pretty streight-forward:

  - shmem_getpage() allcoates huge page if it can and try to inserd into
    radix tree with shmem_add_to_page_cache();

  - shmem_add_to_page_cache() puts the page onto radix-tree if there's
    space for it;

  - shmem_undo_range() removes huge pages, if it fully within range.
    Partial truncate of huge pages zero out this part of THP.

    This have visible effect on fallocate(FALLOC_FL_PUNCH_HOLE)
    behaviour. As we don't really create hole in this case,
    lseek(SEEK_HOLE) may have inconsistent results depending what
    pages happened to be allocated.

  - no need to change shmem_fault: core-mm will map an compound page as
    huge if VMA is suitable;

Link: http://lkml.kernel.org/r/1466021202-61880-30-git-send-email-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-26 16:19:19 -07:00
Hugh Dickins c01d5b3007 shmem: get_unmapped_area align huge page
Provide a shmem_get_unmapped_area method in file_operations, called at
mmap time to decide the mapping address.  It could be conditional on
CONFIG_TRANSPARENT_HUGEPAGE, but save #ifdefs in other places by making
it unconditional.

shmem_get_unmapped_area() first calls the usual mm->get_unmapped_area
(which we treat as a black box, highly dependent on architecture and
config and executable layout).  Lots of conditions, and in most cases it
just goes with the address that chose; but when our huge stars are
rightly aligned, yet that did not provide a suitable address, go back to
ask for a larger arena, within which to align the mapping suitably.

There have to be some direct calls to shmem_get_unmapped_area(), not via
the file_operations: because of the way shmem_zero_setup() is called to
create a shmem object late in the mmap sequence, when MAP_SHARED is
requested with MAP_ANONYMOUS or /dev/zero.  Though this only matters
when /proc/sys/vm/shmem_huge has been set.

Link: http://lkml.kernel.org/r/1466021202-61880-29-git-send-email-kirill.shutemov@linux.intel.com
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-26 16:19:19 -07:00
Kirill A. Shutemov 5a6e75f811 shmem: prepare huge= mount option and sysfs knob
This patch adds new mount option "huge=".  It can have following values:

  - "always":
	Attempt to allocate huge pages every time we need a new page;

  - "never":
	Do not allocate huge pages;

  - "within_size":
	Only allocate huge page if it will be fully within i_size.
	Also respect fadvise()/madvise() hints;

  - "advise:
	Only allocate huge pages if requested with fadvise()/madvise();

Default is "never" for now.

"mount -o remount,huge= /mountpoint" works fine after mount: remounting
huge=never will not attempt to break up huge pages at all, just stop
more from being allocated.

No new config option: put this under CONFIG_TRANSPARENT_HUGEPAGE, which
is the appropriate option to protect those who don't want the new bloat,
and with which we shall share some pmd code.

Prohibit the option when !CONFIG_TRANSPARENT_HUGEPAGE, just as mpol is
invalid without CONFIG_NUMA (was hidden in mpol_parse_str(): make it
explicit).

Allow enabling THP only if the machine has_transparent_hugepage().

But what about Shmem with no user-visible mount? SysV SHM, memfds,
shared anonymous mmaps (of /dev/zero or MAP_ANONYMOUS), GPU drivers' DRM
objects, Ashmem.  Though unlikely to suit all usages, provide sysfs knob
/sys/kernel/mm/transparent_hugepage/shmem_enabled to experiment with
huge on those.

And allow shmem_enabled two further values:

  - "deny":
	For use in emergencies, to force the huge option off from
	all mounts;
  - "force":
	Force the huge option on for all - very useful for testing;

Based on patch by Hugh Dickins.

Link: http://lkml.kernel.org/r/1466021202-61880-28-git-send-email-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-26 16:19:19 -07:00
Kirill A. Shutemov 65c453778a mm, rmap: account shmem thp pages
Let's add ShmemHugePages and ShmemPmdMapped fields into meminfo and
smaps.  It indicates how many times we allocate and map shmem THP.

NR_ANON_TRANSPARENT_HUGEPAGES is renamed to NR_ANON_THPS.

Link: http://lkml.kernel.org/r/1466021202-61880-27-git-send-email-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-26 16:19:19 -07:00
Kirill A. Shutemov c78c66d1dd radix-tree: implement radix_tree_maybe_preload_order()
The new helper is similar to radix_tree_maybe_preload(), but tries to
preload number of nodes required to insert (1 << order) continuous
naturally-aligned elements.

This is required to push huge pages into pagecache.

Link: http://lkml.kernel.org/r/1466021202-61880-24-git-send-email-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-26 16:19:19 -07:00
Kirill A. Shutemov e2f0a0db95 page-flags: relax policy for PG_mappedtodisk and PG_reclaim
These flags are in use for file THP.

Link: http://lkml.kernel.org/r/1466021202-61880-23-git-send-email-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-26 16:19:19 -07:00
Kirill A. Shutemov 9a73f61bdb thp, mlock: do not mlock PTE-mapped file huge pages
As with anon THP, we only mlock file huge pages if we can prove that the
page is not mapped with PTE.  This way we can avoid mlock leak into
non-mlocked vma on split.

We rely on PageDoubleMap() under lock_page() to check if the the page
may be PTE mapped.  PG_double_map is set by page_add_file_rmap() when
the page mapped with PTEs.

Link: http://lkml.kernel.org/r/1466021202-61880-21-git-send-email-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-26 16:19:19 -07:00
Kirill A. Shutemov 95ecedcd6a thp, vmstats: add counters for huge file pages
THP_FILE_ALLOC: how many times huge page was allocated and put page
cache.

THP_FILE_MAPPED: how many times file huge page was mapped.

Link: http://lkml.kernel.org/r/1466021202-61880-13-git-send-email-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-26 16:19:19 -07:00
Kirill A. Shutemov 1010245964 mm: introduce do_set_pmd()
With postponed page table allocation we have chance to setup huge pages.
do_set_pte() calls do_set_pmd() if following criteria met:

 - page is compound;
 - pmd entry in pmd_none();
 - vma has suitable size and alignment;

Link: http://lkml.kernel.org/r/1466021202-61880-12-git-send-email-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-26 16:19:19 -07:00
Kirill A. Shutemov dd78fedde4 rmap: support file thp
Naive approach: on mapping/unmapping the page as compound we update
->_mapcount on each 4k page.  That's not efficient, but it's not obvious
how we can optimize this.  We can look into optimization later.

PG_double_map optimization doesn't work for file pages since lifecycle
of file pages is different comparing to anon pages: file page can be
mapped again at any time.

Link: http://lkml.kernel.org/r/1466021202-61880-11-git-send-email-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-26 16:19:19 -07:00
Kirill A. Shutemov 7267ec008b mm: postpone page table allocation until we have page to map
The idea (and most of code) is borrowed again: from Hugh's patchset on
huge tmpfs[1].

Instead of allocation pte page table upfront, we postpone this until we
have page to map in hands.  This approach opens possibility to map the
page as huge if filesystem supports this.

Comparing to Hugh's patch I've pushed page table allocation a bit
further: into do_set_pte().  This way we can postpone allocation even in
faultaround case without moving do_fault_around() after __do_fault().

do_set_pte() got renamed to alloc_set_pte() as it can allocate page
table if required.

[1] http://lkml.kernel.org/r/alpine.LSU.2.11.1502202015090.14414@eggly.anvils

Link: http://lkml.kernel.org/r/1466021202-61880-10-git-send-email-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-26 16:19:19 -07:00