This is the support code for DAX-enabled filesystems to allow them to
provide huge pages in response to faults.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Cc: Hillf Danton <dhillf@gmail.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Similar to vm_insert_pfn(), but for PMDs rather than PTEs. The 'vmf_'
prefix instead of 'vm_' prefix is intended to indicate that it returns a
VMF_ value rather than an errno (which would only have to be converted
into a VMF_ value anyway).
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Cc: Hillf Danton <dhillf@gmail.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
To use the huge zero page in DAX, we need these functions exported.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Cc: Hillf Danton <dhillf@gmail.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Allow non-anonymous VMAs to provide huge pages in response to a page fault.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Cc: Hillf Danton <dhillf@gmail.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a vma_is_dax() helper macro to test whether the VMA is DAX, and use it
in zap_huge_pmd() and __split_huge_page_pmd().
[akpm@linux-foundation.org: fix build]
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Cc: Hillf Danton <dhillf@gmail.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In order to handle the !CONFIG_TRANSPARENT_HUGEPAGES case, we need to
return VM_FAULT_FALLBACK from the inlined dax_pmd_fault(), which is
defined in linux/mm.h. Given that we don't want to include <linux/mm.h>
in <linux/fs.h>, the easiest solution is to move the DAX-related
functions to a new header, <linux/dax.h>. We could also have moved
VM_FAULT_* definitions to a new header, or a different header that isn't
quite such a boil-the-ocean header as <linux/mm.h>, but this felt like
the best option.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Cc: Hillf Danton <dhillf@gmail.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This series of patches adds support for using PMD page table entries to
map DAX files. We expect NV-DIMMs to start showing up that are many
gigabytes in size and the memory consumption of 4kB PTEs will be
astronomical.
The patch series leverages much of the Transparant Huge Pages
infrastructure, going so far as to borrow one of Kirill's patches from
his THP page cache series.
This patch (of 10):
Since we're going to have huge pages in page cache, we need to call adjust
file-backed VMA, which potentially can contain huge pages.
For now we call it for all VMAs.
Probably later we will need to introduce a flag to indicate that the VMA
has huge pages.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
Acked-by: Hillf Danton <dhillf@gmail.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
special_mapping_fault() is absolutely broken. It seems it was always
wrong, but this didn't matter until vdso/vvar started to use more than
one page.
And after this change vma_is_anonymous() becomes really trivial, it
simply checks vm_ops == NULL. However, I do think the helper makes
sense. There are a lot of ->vm_ops != NULL checks, the helper makes the
caller's code more understandable (self-documented) and this is more
grep-friendly.
This patch (of 3):
Preparation. Add the new simple helper, vma_is_anonymous(vma), and change
handle_pte_fault() to use it. It will have more users.
The name is not accurate, say a hpet_mmap()'ed vma is not anonymous.
Perhaps it should be named vma_has_fault() instead. But it matches the
logic in mmap.c/memory.c (see next changes). "True" just means that a
page fault will use do_anonymous_page().
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
1/ Introduce ZONE_DEVICE and devm_memremap_pages() as a generic
mechanism for adding device-driver-discovered memory regions to the
kernel's direct map. This facility is used by the pmem driver to
enable pfn_to_page() operations on the page frames returned by DAX
('direct_access' in 'struct block_device_operations'). For now, the
'memmap' allocation for these "device" pages comes from "System
RAM". Support for allocating the memmap from device memory will
arrive in a later kernel.
2/ Introduce memremap() to replace usages of ioremap_cache() and
ioremap_wt(). memremap() drops the __iomem annotation for these
mappings to memory that do not have i/o side effects. The
replacement of ioremap_cache() with memremap() is limited to the
pmem driver to ease merging the api change in v4.3. Completion of
the conversion is targeted for v4.4.
3/ Similar to the usage of memcpy_to_pmem() + wmb_pmem() in the pmem
driver, update the VFS DAX implementation and PMEM api to provide
persistence guarantees for kernel operations on a DAX mapping.
4/ Convert the ACPI NFIT 'BLK' driver to map the block apertures as
cacheable to improve performance.
5/ Miscellaneous updates and fixes to libnvdimm including support
for issuing "address range scrub" commands, clarifying the optimal
'sector size' of pmem devices, a clarification of the usage of the
ACPI '_STA' (status) property for DIMM devices, and other minor
fixes.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJV6Nx7AAoJEB7SkWpmfYgCWyYQAI5ju6Gvw27RNFtPovHcZUf5
JGnxXejI6/AqeTQ+IulgprxtEUCrXOHjCDA5dkjr1qvsoqK1qxug+vJHOZLgeW0R
OwDtmdW4Qrgeqm+CPoxETkorJ8wDOc8mol81kTiMgeV3UqbYeeHIiTAmwe7VzZ0C
nNdCRDm5g8dHCjTKcvK3rvozgyoNoWeBiHkPe76EbnxDICxCB5dak7XsVKNMIVFQ
NuYlnw6IYN7+rMHgpgpRux38NtIW8VlYPWTmHExejc2mlioWMNBG/bmtwLyJ6M3e
zliz4/cnonTMUaizZaVozyinTa65m7wcnpjK+vlyGV2deDZPJpDRvSOtB0lH30bR
1gy+qrKzuGKpaN6thOISxFLLjmEeYwzYd7SvC9n118r32qShz+opN9XX0WmWSFlA
sajE1ehm4M7s5pkMoa/dRnAyR8RUPu4RNINdQ/Z9jFfAOx+Q26rLdQXwf9+uqbEb
bIeSQwOteK5vYYCstvpAcHSMlJAglzIX5UfZBvtEIJN7rlb0VhmGWfxAnTu+ktG1
o9cqAt+J4146xHaFwj5duTsyKhWb8BL9+xqbKPNpXEp+PbLsrnE/+WkDLFD67jxz
dgIoK60mGnVXp+16I2uMqYYDgAyO5zUdmM4OygOMnZNa1mxesjbDJC6Wat1Wsndn
slsw6DkrWT60CRE42nbK
=o57/
-----END PGP SIGNATURE-----
Merge tag 'libnvdimm-for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm updates from Dan Williams:
"This update has successfully completed a 0day-kbuild run and has
appeared in a linux-next release. The changes outside of the typical
drivers/nvdimm/ and drivers/acpi/nfit.[ch] paths are related to the
removal of IORESOURCE_CACHEABLE, the introduction of memremap(), and
the introduction of ZONE_DEVICE + devm_memremap_pages().
Summary:
- Introduce ZONE_DEVICE and devm_memremap_pages() as a generic
mechanism for adding device-driver-discovered memory regions to the
kernel's direct map.
This facility is used by the pmem driver to enable pfn_to_page()
operations on the page frames returned by DAX ('direct_access' in
'struct block_device_operations').
For now, the 'memmap' allocation for these "device" pages comes
from "System RAM". Support for allocating the memmap from device
memory will arrive in a later kernel.
- Introduce memremap() to replace usages of ioremap_cache() and
ioremap_wt(). memremap() drops the __iomem annotation for these
mappings to memory that do not have i/o side effects. The
replacement of ioremap_cache() with memremap() is limited to the
pmem driver to ease merging the api change in v4.3.
Completion of the conversion is targeted for v4.4.
- Similar to the usage of memcpy_to_pmem() + wmb_pmem() in the pmem
driver, update the VFS DAX implementation and PMEM api to provide
persistence guarantees for kernel operations on a DAX mapping.
- Convert the ACPI NFIT 'BLK' driver to map the block apertures as
cacheable to improve performance.
- Miscellaneous updates and fixes to libnvdimm including support for
issuing "address range scrub" commands, clarifying the optimal
'sector size' of pmem devices, a clarification of the usage of the
ACPI '_STA' (status) property for DIMM devices, and other minor
fixes"
* tag 'libnvdimm-for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (34 commits)
libnvdimm, pmem: direct map legacy pmem by default
libnvdimm, pmem: 'struct page' for pmem
libnvdimm, pfn: 'struct page' provider infrastructure
x86, pmem: clarify that ARCH_HAS_PMEM_API implies PMEM mapped WB
add devm_memremap_pages
mm: ZONE_DEVICE for "device memory"
mm: move __phys_to_pfn and __pfn_to_phys to asm/generic/memory_model.h
dax: drop size parameter to ->direct_access()
nd_blk: change aperture mapping from WC to WB
nvdimm: change to use generic kvfree()
pmem, dax: have direct_access use __pmem annotation
dax: update I/O path to do proper PMEM flushing
pmem: add copy_from_iter_pmem() and clear_pmem()
pmem, x86: clean up conditional pmem includes
pmem: remove layer when calling arch_has_wmb_pmem()
pmem, x86: move x86 PMEM API to new pmem.h header
libnvdimm, e820: make CONFIG_X86_PMEM_LEGACY a tristate option
pmem: switch to devm_ allocations
devres: add devm_memremap
libnvdimm, btt: write and validate parent_uuid
...
The changes with more meat are:
o Allowing the trace event filters to filter on CPU number and process ids
o Two new markers for trace output latency were added
(10 and 100 msec latencies)
o Have tracing_thresh filter function profiling time
I also worked on modifying the ring buffer code for some future
work, and moved the adding of the timestamp around. One of my changes
caused a regression, and since other changes were built on top of it
and already tested, I had to operate a revert of that change. Instead
of rebasing, this change set has the code that caused a regression
as well as the code to revert that change without touching the other
changes that were made on top of it.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJV6aZEAAoJEEjnJuOKh9ldrR4H/A1RcQf1prLLoUibPP4w3lat
dmQcdpS1NY+cqyiKuKPAOkFDGQL7qWzRqZ8whcPSJIsHq57ufqNSLf+0bbQYPzg9
g3CgGL7OApmGi5ulj0sNxhadvc9TFm/SAN0nVJlNuUWdm8e1UWHLsrJZaMfopu2r
RDEtkOhg619mhDL4rktNdS6rk0B92Fhu2o2PwLZPVlUl1NNEt4WJU+ejitXUVO1A
Nb70/rTGGJKtyHbW+74on4LnEN5Uu0Viu6rMwGfYyIgRmC2otdBDvE4xfKMiTUKr
SzBjzrhIoMIRn4Vl0vElfulkpYaw7pcC2BdpZ4d9VpIOiLSlZs0x/TgCtpFEv5M=
=baZ3
-----END PGP SIGNATURE-----
Merge tag 'trace-v4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing update from Steven Rostedt:
"Mostly this is just clean ups and micro optimizations.
The changes with more meat are:
- Allowing the trace event filters to filter on CPU number and
process ids
- Two new markers for trace output latency were added (10 and 100
msec latencies)
- Have tracing_thresh filter function profiling time
I also worked on modifying the ring buffer code for some future work,
and moved the adding of the timestamp around. One of my changes
caused a regression, and since other changes were built on top of it
and already tested, I had to operate a revert of that change. Instead
of rebasing, this change set has the code that caused a regression as
well as the code to revert that change without touching the other
changes that were made on top of it"
* tag 'trace-v4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
ring-buffer: Revert "ring-buffer: Get timestamp after event is allocated"
tracing: Don't make assumptions about length of string on task rename
tracing: Allow triggers to filter for CPU ids and process names
ftrace: Format MCOUNT_ADDR address as type unsigned long
tracing: Introduce two additional marks for delay
ftrace: Fix function_graph duration spacing with 7-digits
ftrace: add tracing_thresh to function profile
tracing: Clean up stack tracing and fix fentry updates
ring-buffer: Reorganize function locations
ring-buffer: Make sure event has enough room for extend and padding
ring-buffer: Get timestamp after event is allocated
ring-buffer: Move the adding of the extended timestamp out of line
ring-buffer: Add event descriptor to simplify passing data
ftrace: correct the counter increment for trace_buffer data
tracing: Fix for non-continuous cpu ids
tracing: Prefer kcalloc over kzalloc with multiply
Pull audit update from Paul Moore:
"This is one of the larger audit patchsets in recent history,
consisting of eight patches and almost 400 lines of changes.
The bulk of the patchset is the new "audit by executable"
functionality which allows admins to set an audit watch based on the
executable on disk. Prior to this, admins could only track an
application by PID, which has some obvious limitations.
Beyond the new functionality we also have some refcnt fixes and a few
minor cleanups"
* 'upstream' of git://git.infradead.org/users/pcmoore/audit:
fixup: audit: implement audit by executable
audit: implement audit by executable
audit: clean simple fsnotify implementation
audit: use macros for unset inode and device values
audit: make audit_del_rule() more robust
audit: fix uninitialized variable in audit_add_rule()
audit: eliminate unnecessary extra layer of watch parent references
audit: eliminate unnecessary extra layer of watch references
Pull security subsystem updates from James Morris:
"Highlights:
- PKCS#7 support added to support signed kexec, also utilized for
module signing. See comments in 3f1e1bea.
** NOTE: this requires linking against the OpenSSL library, which
must be installed, e.g. the openssl-devel on Fedora **
- Smack
- add IPv6 host labeling; ignore labels on kernel threads
- support smack labeling mounts which use binary mount data
- SELinux:
- add ioctl whitelisting (see
http://kernsec.org/files/lss2015/vanderstoep.pdf)
- fix mprotect PROT_EXEC regression caused by mm change
- Seccomp:
- add ptrace options for suspend/resume"
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (57 commits)
PKCS#7: Add OIDs for sha224, sha284 and sha512 hash algos and use them
Documentation/Changes: Now need OpenSSL devel packages for module signing
scripts: add extract-cert and sign-file to .gitignore
modsign: Handle signing key in source tree
modsign: Use if_changed rule for extracting cert from module signing key
Move certificate handling to its own directory
sign-file: Fix warning about BIO_reset() return value
PKCS#7: Add MODULE_LICENSE() to test module
Smack - Fix build error with bringup unconfigured
sign-file: Document dependency on OpenSSL devel libraries
PKCS#7: Appropriately restrict authenticated attributes and content type
KEYS: Add a name for PKEY_ID_PKCS7
PKCS#7: Improve and export the X.509 ASN.1 time object decoder
modsign: Use extract-cert to process CONFIG_SYSTEM_TRUSTED_KEYS
extract-cert: Cope with multiple X.509 certificates in a single file
sign-file: Generate CMS message as signature instead of PKCS#7
PKCS#7: Support CMS messages also [RFC5652]
X.509: Change recorded SKID & AKID to not include Subject or Issuer
PKCS#7: Check content type and versions
MAINTAINERS: The keyrings mailing list has moved
...
Pull NMI backtrace update from Russell King:
"These changes convert the x86 NMI handling to be a library
implementation which other architectures can make use of. Thomas
Gleixner has reviewed and tested these changes, and wishes me to send
these rather than taking them through the tip tree.
The final patch in the set adds an initial implementation using this
infrastructure to ARM, even though it doesn't send the IPI at "NMI"
level. Patches are in progress to add the ARM equivalent of NMI, but
we still need the IRQ-level fallback for systems where the "NMI" isn't
available due to secure firmware denying access to it"
* 'nmi' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
ARM: add basic support for on-demand backtrace of other CPUs
nmi: x86: convert to generic nmi handler
nmi: create generic NMI backtrace implementation
- Convert xen-blkfront to the multiqueue API
- [arm] Support binding event channels to different VCPUs.
- [x86] Support > 512 GiB in a PV guests (off by default as such a
guest cannot be migrated with the current toolstack).
- [x86] PMU support for PV dom0 (limited support for using perf with
Xen and other guests).
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJV7wIdAAoJEFxbo/MsZsTR0hEH/04HTKLKGnSJpZ5WbMPxqZxE
UqGlvhvVWNAmFocZmbPcEi9T1qtcFrX5pM55JQr6UmAp3ovYsT2q1Q1kKaOaawks
pSfc/YEH3oQW5VUQ9Lm9Ru5Z8Btox0WrzRREO92OF36UOgUOBOLkGsUfOwDinNIM
lSk2djbYwDYAsoeC3PHB32wwMI//Lz6B/9ZVXcyL6ULynt1ULdspETjGnptRPZa7
JTB5L4/soioKOn18HDwwOhKmvaFUPQv9Odnv7dc85XwZreajhM/KMu3qFbMDaF/d
WVB1NMeCBdQYgjOrUjrmpyr5uTMySiQEG54cplrEKinfeZgKlEyjKvjcAfJfiac=
=Ktjl
-----END PGP SIGNATURE-----
Merge tag 'for-linus-4.3-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen updates from David Vrabel:
"Xen features and fixes for 4.3:
- Convert xen-blkfront to the multiqueue API
- [arm] Support binding event channels to different VCPUs.
- [x86] Support > 512 GiB in a PV guests (off by default as such a
guest cannot be migrated with the current toolstack).
- [x86] PMU support for PV dom0 (limited support for using perf with
Xen and other guests)"
* tag 'for-linus-4.3-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (33 commits)
xen: switch extra memory accounting to use pfns
xen: limit memory to architectural maximum
xen: avoid another early crash of memory limited dom0
xen: avoid early crash of memory limited dom0
arm/xen: Remove helpers which are PV specific
xen/x86: Don't try to set PCE bit in CR4
xen/PMU: PMU emulation code
xen/PMU: Intercept PMU-related MSR and APIC accesses
xen/PMU: Describe vendor-specific PMU registers
xen/PMU: Initialization code for Xen PMU
xen/PMU: Sysfs interface for setting Xen PMU mode
xen: xensyms support
xen: remove no longer needed p2m.h
xen: allow more than 512 GB of RAM for 64 bit pv-domains
xen: move p2m list if conflicting with e820 map
xen: add explicit memblock_reserve() calls for special pages
mm: provide early_memremap_ro to establish read-only mapping
xen: check for initrd conflicting with e820 map
xen: check pre-allocated page tables for conflict with memory map
xen: check for kernel memory conflicting with memory layout
...
Pull more irq updates from Thomas Gleixner:
"The second part of irq related updates:
- Provide EOImode for GIC[V3] irq chips, which is a prerequisite for
direct interrupt handling in [KVM] guests"
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/GIC: Fix EOImode setting for non-DT/ACPI systems
irqchip/GIC: Don't deactivate interrupts forwarded to a guest
irqchip/GIC: Convert to EOImode == 1
irqchip/GICv3: Don't deactivate interrupts forwarded to a guest
irqchip/GICv3: Convert to EOImode == 1
The privcmd code is mixing the usage of GFN and MFN within the same
functions which make the code difficult to understand when you only work
with auto-translated guests.
The privcmd driver is only dealing with GFN so replace all the mention
of MFN into GFN.
The ioctl structure used to map foreign change has been left unchanged
given that the userspace is using it. Nonetheless, add a comment to
explain the expected value within the "mfn" field.
Signed-off-by: Julien Grall <julien.grall@citrix.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Based on include/xen/mm.h [1], Linux is mistakenly using MFN when GFN
is meant, I suspect this is because the first support for Xen was for
PV. This resulted in some misimplementation of helpers on ARM and
confused developers about the expected behavior.
For instance, with pfn_to_mfn, we expect to get an MFN based on the name.
Although, if we look at the implementation on x86, it's returning a GFN.
For clarity and avoid new confusion, replace any reference to mfn with
gfn in any helpers used by PV drivers. The x86 code will still keep some
reference of pfn_to_mfn which may be used by all kind of guests
No changes as been made in the hypercall field, even
though they may be invalid, in order to keep the same as the defintion
in xen repo.
Note that page_to_mfn has been renamed to xen_page_to_gfn to avoid a
name to close to the KVM function gfn_to_page.
Take also the opportunity to simplify simple construction such
as pfn_to_mfn(page_to_pfn(page)) into xen_page_to_gfn. More complex clean up
will come in follow-up patches.
[1] http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=e758ed14f390342513405dd766e874934573e6cb
Signed-off-by: Julien Grall <julien.grall@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Instead of using physical addresses for accounting of extra memory
areas available for ballooning switch to pfns as this is much less
error prone regarding partial pages.
Reported-by: Roger Pau Monné <roger.pau@citrix.com>
Tested-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Fixes compilation with ppc64_defconfig.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Highlights include:
Stable patches:
- Fix atomicity of pNFS commit list updates
- Fix NFSv4 handling of open(O_CREAT|O_EXCL|O_RDONLY)
- nfs_set_pgio_error sometimes misses errors
- Fix a thinko in xs_connect()
- Fix borkage in _same_data_server_addrs_locked()
- Fix a NULL pointer dereference of migration recovery ops for v4.2 client
- Don't let the ctime override attribute barriers.
- Revert "NFSv4: Remove incorrect check in can_open_delegated()"
- Ensure flexfiles pNFS driver updates the inode after write finishes
- flexfiles must not pollute the attribute cache with attrbutes from the DS
- Fix a protocol error in layoutreturn
- Fix a protocol issue with NFSv4.1 CLOSE stateids
Bugfixes + cleanups
- pNFS blocks bugfixes from Christoph
- Various cleanups from Anna
- More fixes for delegation corner cases
- Don't fsync twice for O_SYNC/IS_SYNC files
- Fix pNFS and flexfiles layoutstats bugs
- pnfs/flexfiles: avoid duplicate tracking of mirror data
- pnfs: Fix layoutget/layoutreturn/return-on-close serialisation issues.
- pnfs/flexfiles: error handling retries a layoutget before fallback to MDS
Features:
- Full support for the OPEN NFS4_CREATE_EXCLUSIVE4_1 mode from Kinglong
- More RDMA client transport improvements from Chuck
- Removal of the deprecated ib_reg_phys_mr() and ib_rereg_phys_mr() verbs
from the SUNRPC, Lustre and core infiniband tree.
- Optimise away the close-to-open getattr if there is no cached data
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJV7chgAAoJEGcL54qWCgDyqJQP/3kto9VXnXcatC382jF9Pfj5
F55XeSnviOXH7CyiKA4nSBhnxg/sLuWOTpbkVI/4Y+VyWhLby9h+mtcKURHOlBnj
d5BFoPwaBVDnUiKlHFQDkRjIyxjj2Sb6/uEb2V/u3v+3znR5AZZ4lzFx4cD85oaz
mcru7yGiSxaQCIH6lHExcCEKXaDP5YdvS9YFsyQfv2976JSaQHM9ZG04E0v6MzTo
E5wwC4CLMKmhuX9kmQMj85jzs1ASAKZ3N7b4cApTIo6F8DCDH0vKQphq/nEQC497
ECjEs5/fpxtNJUpSBu0gT7G4LCiW3PzE7pHa+8bhbaAn9OzxIR5+qWujKsfGYQhO
Oomp3K9zO6omshAc5w4MkknPpbImjoZjGAj/q/6DbtrDpnD7DzOTirwYY2yX0CA8
qcL81uJUb8+j4jJj4RTO+lTUBItrM1XTqTSd/3eSMr5DDRVZj+ERZxh17TaxRBZL
YrbrLHxCHrcbdSbPlovyvY+BwjJUUFJRcOxGQXLmNYR9u92fF59rb53bzVyzcRRO
wBozzrNRCFL+fPgfNPLEapIb6VtExdM3rl2HYsJGckHj4DPQdnoB3ytIT9iEFZEN
+/rE14XEZte7kuH3OP4el2UsP/hVsm7A49mtwrkdbd7rxMWD6XfQUp8DAggWUEtI
1H6T7RC1Y6wsu0X1fnVz
=knJA
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-4.3-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client updates from Trond Myklebust:
"Highlights include:
Stable patches:
- Fix atomicity of pNFS commit list updates
- Fix NFSv4 handling of open(O_CREAT|O_EXCL|O_RDONLY)
- nfs_set_pgio_error sometimes misses errors
- Fix a thinko in xs_connect()
- Fix borkage in _same_data_server_addrs_locked()
- Fix a NULL pointer dereference of migration recovery ops for v4.2
client
- Don't let the ctime override attribute barriers.
- Revert "NFSv4: Remove incorrect check in can_open_delegated()"
- Ensure flexfiles pNFS driver updates the inode after write finishes
- flexfiles must not pollute the attribute cache with attrbutes from
the DS
- Fix a protocol error in layoutreturn
- Fix a protocol issue with NFSv4.1 CLOSE stateids
Bugfixes + cleanups
- pNFS blocks bugfixes from Christoph
- Various cleanups from Anna
- More fixes for delegation corner cases
- Don't fsync twice for O_SYNC/IS_SYNC files
- Fix pNFS and flexfiles layoutstats bugs
- pnfs/flexfiles: avoid duplicate tracking of mirror data
- pnfs: Fix layoutget/layoutreturn/return-on-close serialisation
issues
- pnfs/flexfiles: error handling retries a layoutget before fallback
to MDS
Features:
- Full support for the OPEN NFS4_CREATE_EXCLUSIVE4_1 mode from
Kinglong
- More RDMA client transport improvements from Chuck
- Removal of the deprecated ib_reg_phys_mr() and ib_rereg_phys_mr()
verbs from the SUNRPC, Lustre and core infiniband tree.
- Optimise away the close-to-open getattr if there is no cached data"
* tag 'nfs-for-4.3-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (108 commits)
NFSv4: Respect the server imposed limit on how many changes we may cache
NFSv4: Express delegation limit in units of pages
Revert "NFS: Make close(2) asynchronous when closing NFS O_DIRECT files"
NFS: Optimise away the close-to-open getattr if there is no cached data
NFSv4.1/flexfiles: Clean up ff_layout_write_done_cb/ff_layout_commit_done_cb
NFSv4.1/flexfiles: Mark the layout for return in ff_layout_io_track_ds_error()
nfs: Remove unneeded checking of the return value from scnprintf
nfs: Fix truncated client owner id without proto type
NFSv4.1/flexfiles: Mark layout for return if the mirrors are invalid
NFSv4.1/flexfiles: RW layouts are valid only if all mirrors are valid
NFSv4.1/flexfiles: Fix incorrect usage of pnfs_generic_mark_devid_invalid()
NFSv4.1/flexfiles: Fix freeing of mirrors
NFSv4.1/pNFS: Don't request a minimal read layout beyond the end of file
NFSv4.1/pnfs: Handle LAYOUTGET return values correctly
NFSv4.1/pnfs: Don't ask for a read layout for an empty file.
NFSv4.1: Fix a protocol issue with CLOSE stateids
NFSv4.1/flexfiles: Don't mark the entire deviceid as bad for file errors
SUNRPC: Prevent SYN+SYNACK+RST storms
SUNRPC: xs_reset_transport must mark the connection as disconnected
NFSv4.1/pnfs: Ensure layoutreturn reserves space for the opaque payload
...
The documentation should say "peer" not "local" when referring to the
peer doorbell register.
Reported-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
There was a copy and paste error in the documentation for
ntb_link_is_up. The long description was mistakenly copied from
ntb_link_set_trans.
This adds the appropriate long description for ntb_link_is_up.
Reported-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Right now if we push the NTB really hard, we start dropping packets due
to not able to process the packets fast enough. We need to st:qop the
upper layer from flooding us when that happens.
A timer is necessary in order to restart the queue once the resource has
been processed on the receive side. Due to the way NTB is setup, the
resources on the tx side are tied to the processing of the rx side and
there's no async way to know when the rx side has released those
resources.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Since we're tracking modifications to the page cache on a per-page
basis, it makes sense to express the limit to how much we may cache
in units of pages.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
* fix for the sizeof() pointer type issue
* a fix for regulatory getting into a restore loop
* a fix for rfkill global 'all' state, it needs to be stored
everywhere to apply correctly to new rfkill instances
* properly refuse CQM RSSI when it cannot actually be used
* protect HT TDLS traffic properly in non-HT networks
* don't incorrectly advertise 80 MHz support when not allowed
-----BEGIN PGP SIGNATURE-----
iQIcBAABCAAGBQJV6av+AAoJEDBSmw7B7bqrNYYP/1kZjdk9s4wPHpQndfSWK/sj
yTjysh/VXZ5Jyzkcmg337AjEuuJ/SntDXjMXEKTxHHrt5fmHFBvmPErPUY7p7I/t
MJqxVnYGypGW4LImyXYRJbejabmPrxtHC+jcWibw24yBCzJGCXF/MeKCotksnvIn
IoJDvUHrVFZhfzFRepdozJi1Qy2Y04Q7zXywplW6XFYWxdfDWaWyno/xh+rM8oO7
xSgpkuj4FDfimDBjzJOHwBH3xOlt/uHoB97G0TJbJ6k0R6a8aTz8eK5OWB4rFcn+
BsuQfaWcrGg9fJsoqz4mYlWf/F4Qj/gU9Gr5sBUWWa0xQ7HctcgJ/t27XjUWP7oV
VnQ9YQt/SukG3BVf8GuNqaSo7/7oprWHdDYRH0tnvImXCWsUlHM8gv8MrG0UOHGC
4I4qgPLHon6Ui3lNHfivl8iSDFT+jfsMDMWiB6kKZRpUOdEicRrepMoDARVbTYck
h6uY4Tx6OQ39evLSXjyrCWb0aOr2WoDCypxEkpE+fgDsWr8yTV0Ti0QZfVqatO3d
d4GQb6U5Zkd14ymKtzULa5qjOw1TmnKpxVzSSdrONaPBVdUDsl4QH38gIppFjGsV
61Gn3PXRRgy+VAib/YyIYqilvdhZ2TdXaSVAEyKv6WjKavOFA/8KsDPmJsyKoCSn
1dDOwaiqgfu2bK1PTU9A
=xw8r
-----END PGP SIGNATURE-----
Merge tag 'mac80211-for-davem-2015-09-04' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
For the first round of fixes, we have this:
* fix for the sizeof() pointer type issue
* a fix for regulatory getting into a restore loop
* a fix for rfkill global 'all' state, it needs to be stored
everywhere to apply correctly to new rfkill instances
* properly refuse CQM RSSI when it cannot actually be used
* protect HT TDLS traffic properly in non-HT networks
* don't incorrectly advertise 80 MHz support when not allowed
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Tracepoint for dynamic halt_pool_ns, fired on every potential change.
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Change halt_poll_ns into per-VCPU variable, seeded from module parameter,
to allow greater flexibility.
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Conflicts:
include/net/netfilter/nf_conntrack.h
The conflict was an overlap between changing the type of the zone
argument to nf_ct_tmpl_alloc() whilst exporting nf_ct_tmpl_free.
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for net, they are:
1) Oneliner to restore maps in nf_tables since we support addressing registers
at 32 bits level.
2) Restore previous default behaviour in bridge netfilter when CONFIG_IPV6=n,
oneliner from Bernhard Thaler.
3) Out of bound access in ipset hash:net* set types, reported by Dave Jones'
KASan utility, patch from Jozsef Kadlecsik.
4) Fix ipset compilation with gcc 4.4.7 related to C99 initialization of
unnamed unions, patch from Elad Raz.
5) Add a workaround to address inconsistent endianess in the res_id field of
nfnetlink batch messages, reported by Florian Westphal.
6) Fix error paths of CT/synproxy since the conntrack template was moved to use
kmalloc, patch from Daniel Borkmann.
All of them look good to me to reach 4.2, I can route this to -stable myself
too, just let me know what you prefer.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull vfs updates from Al Viro:
"In this one:
- d_move fixes (Eric Biederman)
- UFS fixes (me; locking is mostly sane now, a bunch of bugs in error
handling ought to be fixed)
- switch of sb_writers to percpu rwsem (Oleg Nesterov)
- superblock scalability (Josef Bacik and Dave Chinner)
- swapon(2) race fix (Hugh Dickins)"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (65 commits)
vfs: Test for and handle paths that are unreachable from their mnt_root
dcache: Reduce the scope of i_lock in d_splice_alias
dcache: Handle escaped paths in prepend_path
mm: fix potential data race in SyS_swapon
inode: don't softlockup when evicting inodes
inode: rename i_wb_list to i_io_list
sync: serialise per-superblock sync operations
inode: convert inode_sb_list_lock to per-sb
inode: add hlist_fake to avoid the inode hash lock in evict
writeback: plug writeback at a high level
change sb_writers to use percpu_rw_semaphore
shift percpu_counter_destroy() into destroy_super_work()
percpu-rwsem: kill CONFIG_PERCPU_RWSEM
percpu-rwsem: introduce percpu_rwsem_release() and percpu_rwsem_acquire()
percpu-rwsem: introduce percpu_down_read_trylock()
document rwsem_release() in sb_wait_write()
fix the broken lockdep logic in __sb_start_write()
introduce __sb_writers_{acquired,release}() helpers
ufs_inode_get{frag,block}(): get rid of 'phys' argument
ufs_getfrag_block(): tidy up a bit
...
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJV6ifEAAoJEAhfPr2O5OEVn5kP/i2jM1tWcmV/ZEBKGAN0jpRk
5Y/Q+rnXvOpIJSQC3dEkweoBymVMclSgSB/wFSWCZtp5MaB8KrH4/2uc3UvolF91
7bqXt+fCUacMbDQyaabMCR83mz9tdOJLd5sf0ABqBgXGfwh5uXmBPaYBzmcYvKcW
4D89MFUpaFDPARTs9rdpVyr0aPRU4GcN0R3snRO9Ly+cQnyV/RxPf9NqCgnI+yPq
+NvA9ScUBcBt62piSIGR4egcAR8boxYC+0r57340S21/JVMvsHQ3ok9b1aT8/rtd
Yl24FkcKrRV0ShN5S1RmW5DLH/HRGabuMjkiEz9xq52FGD2sQQda0At58dWivsa4
XYdxS9UUfb9Z+qyeMdmCl1MUFRrV2G4H6VItP+GKyT3UZLEDcLl6hBg3SkyWxWB4
CSO5WuRThiIB86OVcIaREftzqDy5HdvH3ZKRD7QrW0DItGVjQwV5j6gvwqO9OEXs
99BnSohyKwUBonumE2ZtFGGhIwIomllrMSqg991bPH9+13bg/rPxUqntkPrVap/9
cV3qKO8ZFrz5UInBnR1U83l60ZK7rV4G6AVMSMKpM9XVK9TDKryAUN9Mhj5XWRH8
hbma89TQVdhdrITtt27uzj8F622cvZvxd1BqDBR8DjKVvtv/E2GPzJrAj7GHe3/o
NgzP5fF6X2Si32GNb7J8
=cIed
-----END PGP SIGNATURE-----
Merge tag 'media/v4.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media updates from Mauro Carvalho Chehab:
- new DVB frontend drivers: ascot2e, cxd2841er, horus3a, lnbh25
- new HDMI capture driver: tc358743
- new driver for NetUP DVB new boards (netup_unidvb)
- IR support for DVBSky cards (smipcie-ir)
- Coda driver has gain macroblock tiling support
- Renesas R-Car gains JPEG codec driver
- new DVB platform driver for STi boards: c8sectpfe
- added documentation for the media core kABI to device-drivers DocBook
- lots of driver fixups, cleanups and improvements
* tag 'media/v4.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (297 commits)
[media] c8sectpfe: Remove select on undefined LIBELF_32
[media] i2c: fix platform_no_drv_owner.cocci warnings
[media] cx231xx: Use wake_up_interruptible() instead of wake_up_interruptible_nr()
[media] tc358743: only queue subdev notifications if devnode is set
[media] tc358743: add missing Kconfig dependency/select
[media] c8sectpfe: Use %pad to print 'dma_addr_t'
[media] DocBook media: Fix typo "the the" in xml files
[media] tc358743: make reset gpio optional
[media] tc358743: set direction of reset gpio using devm_gpiod_get
[media] dvbdev: document most of the functions/data structs
[media] dvb_frontend.h: document the struct dvb_frontend
[media] dvb-frontend.h: document struct dtv_frontend_properties
[media] dvb-frontend.h: document struct dvb_frontend_ops
[media] dvb: Use DVBFE_ALGO_HW where applicable
[media] dvb_frontend.h: document struct analog_demod_ops
[media] dvb_frontend.h: Document struct dvb_tuner_ops
[media] Docbook: Document struct analog_parameters
[media] dvb_frontend.h: get rid of dvbfe_modcod
[media] add documentation for struct dvb_tuner_info
[media] dvb_frontend: document dvb_frontend_tune_settings
...
Pull mailbox updates from Jassi Brar:
"Mainly we move from jiffy based timer to HRTIMER for finer control
over polling. Then a controller reduces its polling period from 10 to
1ms"
* 'mailbox-for-next' of git://git.linaro.org/landing-teams/working/fujitsu/integration:
mailbox: arm_mhu: reduce txpoll_period from 10ms to 1 ms
mailbox: switch to hrtimer for tx_complete polling
mailbox: Drop owner assignment from platform_driver
- Add Jeff Layton as an nfsd co-maintainer: no change to
existing practice, just an acknowledgement of the status quo.
- Two patches ("nfsd: ensure that...") for a race overlooked by
the state locking rewrite, causing a crash noticed by multiple
users.
- Lots of smaller bugfixes all over from Kinglong Mee.
- From Jeff, some cleanup of server rpc code in preparation for
possible shift of nfsd threads to workqueues.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJV6fbLAAoJECebzXlCjuG+qGkP/j2YnZynwqCa4uz1+FU7qfYI
kZWNGFFQ7O7e1i9Wznp7BkSA020rvM5d1HPwZhtstURM3i52XWRtbppwKF2+IuEU
tpNdPKb28BPCZO29Z8mQk9IS2sX5jmBiibXRqBk0VK7e43PXrIwg1LJJ9HOfOpLh
b1MvxdEB7vqK+fAVIYyhlg0UDd5AHAkQ+vS8YuohRXbDcsdhhE4vmusLlUl5UKp8
5Yunz+b+pXfXPYaKidmpar6U2KoRSTPP1uO3bNfN6URO1W1nchPadLs0DnsBKlhb
U8II5RZEmc+YfiIMoeptkJHoNhWT6Zu7CNJR6B0USTKv4L6TmFQVpxptVutzYVwx
sGJ65lvCiXXOPz8JJwvBty//HTmbyOiCm64/vMbhQRlSNLSmcmTXEpw/uT5Huaxx
bX9lnznoVVCd3eRoXPwMdZTbg/uEKqREZsQWVoqA6gexYqeyp79kvGbttLoUJ27Z
IjtNb9W6akxfPKrHMgan6j7dy866o6TdSfWRayHwUoswbNnVOnMYKHjApOtF0oev
k2pdLuy9tjl2a9Ow9sSwHZDbNsXgJO76E0aYnSTBP/YvctlG7KoZ+E0oxa6DWTC+
0dE+g1xhIuUtW5WRL4pfWWk1G7jnf16J91bKkn91VveDn666RncAbLBtePmpIcIu
5Ah6KxztTVCW++i5pmHh
=aecc
-----END PGP SIGNATURE-----
Merge tag 'nfsd-4.3' of git://linux-nfs.org/~bfields/linux
Pull nfsd updates from Bruce Fields:
"Nothing major, but:
- Add Jeff Layton as an nfsd co-maintainer: no change to existing
practice, just an acknowledgement of the status quo.
- Two patches ("nfsd: ensure that...") for a race overlooked by the
state locking rewrite, causing a crash noticed by multiple users.
- Lots of smaller bugfixes all over from Kinglong Mee.
- From Jeff, some cleanup of server rpc code in preparation for
possible shift of nfsd threads to workqueues"
* tag 'nfsd-4.3' of git://linux-nfs.org/~bfields/linux: (52 commits)
nfsd: deal with DELEGRETURN racing with CB_RECALL
nfsd: return CLID_INUSE for unexpected SETCLIENTID_CONFIRM case
nfsd: ensure that delegation stateid hash references are only put once
nfsd: ensure that the ol stateid hash reference is only put once
net: sunrpc: fix tracepoint Warning: unknown op '->'
nfsd: allow more than one laundry job to run at a time
nfsd: don't WARN/backtrace for invalid container deployment.
fs: fix fs/locks.c kernel-doc warning
nfsd: Add Jeff Layton as co-maintainer
NFSD: Return word2 bitmask if setting security label in OPEN/CREATE
NFSD: Set the attributes used to store the verifier for EXCLUSIVE4_1
nfsd: SUPPATTR_EXCLCREAT must be encoded before SECURITY_LABEL.
nfsd: Fix an FS_LAYOUT_TYPES/LAYOUT_TYPES encode bug
NFSD: Store parent's stat in a separate value
nfsd: Fix two typos in comments
lockd: NLM grace period shouldn't block NFSv4 opens
nfsd: include linux/nfs4.h in export.h
sunrpc: Switch to using hash list instead single list
sunrpc/nfsd: Remove redundant code by exports seq_operations functions
sunrpc: Store cache_detail in seq_file's private directly
...
Merge patch-bomb from Andrew Morton:
- a few misc things
- Andy's "ambient capabilities"
- fs/nofity updates
- the ocfs2 queue
- kernel/watchdog.c updates and feature work.
- some of MM. Includes Andrea's userfaultfd feature.
[ Hadn't noticed that userfaultfd was 'default y' when applying the
patches, so that got fixed in this merge instead. We do _not_ mark
new features that nobody uses yet 'default y' - Linus ]
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (118 commits)
mm/hugetlb.c: make vma_has_reserves() return bool
mm/madvise.c: make madvise_behaviour_valid() return bool
mm/memory.c: make tlb_next_batch() return bool
mm/dmapool.c: change is_page_busy() return from int to bool
mm: remove struct node_active_region
mremap: simplify the "overlap" check in mremap_to()
mremap: don't do uneccesary checks if new_len == old_len
mremap: don't do mm_populate(new_addr) on failure
mm: move ->mremap() from file_operations to vm_operations_struct
mremap: don't leak new_vma if f_op->mremap() fails
mm/hugetlb.c: make vma_shareable() return bool
mm: make GUP handle pfn mapping unless FOLL_GET is requested
mm: fix status code which move_pages() returns for zero page
mm: memcontrol: bring back the VM_BUG_ON() in mem_cgroup_swapout()
genalloc: add support of multiple gen_pools per device
genalloc: add name arg to gen_pool_get() and devm_gen_pool_create()
mm/memblock: WARN_ON when nid differs from overlap region
Documentation/features/vm: add feature description and arch support status for batched TLB flush after unmap
mm: defer flush of writable TLB entries
mm: send one IPI per CPU to TLB flush all entries after unmapping pages
...
If century field is supported by the RTC CMOS device, then we should use
it and then do not consider years greater that 169 as an error.
For information, the year field of the rtc_time structure contains the
value to add to 1970 to obtain the current year.
This was a hack to be able to support years for 1970 to 2069.
This patch remains compatible with this implementation.
Signed-off-by: Sylvain Chouleur <sylvain.chouleur@intel.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
struct node_active_region is not used anymore. Remove it.
Signed-off-by: minkyung88.kim <minkyung88.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
vma->vm_ops->mremap() looks more natural and clean in move_vma(), and this
way ->mremap() can have more users. Say, vdso.
While at it, s/aio_ring_remap/aio_ring_mremap/.
Note: this is the minimal change before ->mremap() finds another user in
file_operations; this method should have more arguments, and it can be
used to kill arch_remap().
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This change fills devm_gen_pool_create()/gen_pool_get() "name" argument
stub with contents and extends of_gen_pool_get() functionality on this
basis.
If there is no associated platform device with a device node passed to
of_gen_pool_get(), the function attempts to get a label property or device
node name (= repeats MTD OF partition standard) and seeks for a named
gen_pool registered by device of the parent device node.
The main idea of the change is to allow registration of independent
gen_pools under the same umbrella device, say "partitions" on "storage
device", the original functionality of one "partition" per "storage
device" is untouched.
[akpm@linux-foundation.org: fix constness in devres_find()]
[dan.carpenter@oracle.com: freeing const data pointers]
Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com>
Cc: Philipp Zabel <p.zabel@pengutronix.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com>
Cc: Shawn Guo <shawnguo@kernel.org>
Cc: Sascha Hauer <kernel@pengutronix.de>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This change modifies gen_pool_get() and devm_gen_pool_create() client
interfaces adding one more argument "name" of a gen_pool object.
Due to implementation gen_pool_get() is capable to retrieve only one
gen_pool associated with a device even if multiple gen_pools are created,
fortunately right at the moment it is sufficient for the clients, hence
provide NULL as a valid argument on both producer devm_gen_pool_create()
and consumer gen_pool_get() sides.
Because only one created gen_pool per device is addressable, explicitly
add a restriction to devm_gen_pool_create() to create only one gen_pool
per device, this implies two possible error codes returned by the
function, account it on client side (only misc/sram). This completes
client side changes related to genalloc updates.
[akpm@linux-foundation.org: gen_pool_get() cleanup]
Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com>
Cc: Philipp Zabel <p.zabel@pengutronix.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com>
Cc: Shawn Guo <shawnguo@kernel.org>
Cc: Sascha Hauer <kernel@pengutronix.de>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If a PTE is unmapped and it's dirty then it was writable recently. Due to
deferred TLB flushing, it's best to assume a writable TLB cache entry
exists. With that assumption, the TLB must be flushed before any IO can
start or the page is freed to avoid lost writes or data corruption. This
patch defers flushing of potentially writable TLBs as long as possible.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
An IPI is sent to flush remote TLBs when a page is unmapped that was
potentially accesssed by other CPUs. There are many circumstances where
this happens but the obvious one is kswapd reclaiming pages belonging to a
running process as kswapd and the task are likely running on separate
CPUs.
On small machines, this is not a significant problem but as machine gets
larger with more cores and more memory, the cost of these IPIs can be
high. This patch uses a simple structure that tracks CPUs that
potentially have TLB entries for pages being unmapped. When the unmapping
is complete, the full TLB is flushed on the assumption that a refill cost
is lower than flushing individual entries.
Architectures wishing to do this must give the following guarantee.
If a clean page is unmapped and not immediately flushed, the
architecture must guarantee that a write to that linear address
from a CPU with a cached TLB entry will trap a page fault.
This is essentially what the kernel already depends on but the window is
much larger with this patch applied and is worth highlighting. The
architecture should consider whether the cost of the full TLB flush is
higher than sending an IPI to flush each individual entry. An additional
architecture helper called flush_tlb_local is required. It's a trivial
wrapper with some accounting in the x86 case.
The impact of this patch depends on the workload as measuring any benefit
requires both mapped pages co-located on the LRU and memory pressure. The
case with the biggest impact is multiple processes reading mapped pages
taken from the vm-scalability test suite. The test case uses NR_CPU
readers of mapped files that consume 10*RAM.
Linear mapped reader on a 4-node machine with 64G RAM and 48 CPUs
4.2.0-rc1 4.2.0-rc1
vanilla flushfull-v7
Ops lru-file-mmap-read-elapsed 159.62 ( 0.00%) 120.68 ( 24.40%)
Ops lru-file-mmap-read-time_range 30.59 ( 0.00%) 2.80 ( 90.85%)
Ops lru-file-mmap-read-time_stddv 6.70 ( 0.00%) 0.64 ( 90.38%)
4.2.0-rc1 4.2.0-rc1
vanilla flushfull-v7
User 581.00 611.43
System 5804.93 4111.76
Elapsed 161.03 122.12
This is showing that the readers completed 24.40% faster with 29% less
system CPU time. From vmstats, it is known that the vanilla kernel was
interrupted roughly 900K times per second during the steady phase of the
test and the patched kernel was interrupts 180K times per second.
The impact is lower on a single socket machine.
4.2.0-rc1 4.2.0-rc1
vanilla flushfull-v7
Ops lru-file-mmap-read-elapsed 25.33 ( 0.00%) 20.38 ( 19.54%)
Ops lru-file-mmap-read-time_range 0.91 ( 0.00%) 1.44 (-58.24%)
Ops lru-file-mmap-read-time_stddv 0.28 ( 0.00%) 0.47 (-65.34%)
4.2.0-rc1 4.2.0-rc1
vanilla flushfull-v7
User 58.09 57.64
System 111.82 76.56
Elapsed 27.29 22.55
It's still a noticeable improvement with vmstat showing interrupts went
from roughly 500K per second to 45K per second.
The patch will have no impact on workloads with no memory pressure or have
relatively few mapped pages. It will have an unpredictable impact on the
workload running on the CPU being flushed as it'll depend on how many TLB
entries need to be refilled and how long that takes. Worst case, the TLB
will be completely cleared of active entries when the target PFNs were not
resident at all.
[sasha.levin@oracle.com: trace tlb flush after disabling preemption in try_to_unmap_flush]
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When unmapping pages it is necessary to flush the TLB. If that page was
accessed by another CPU then an IPI is used to flush the remote CPU. That
is a lot of IPIs if kswapd is scanning and unmapping >100K pages per
second.
There already is a window between when a page is unmapped and when it is
TLB flushed. This series increases the window so multiple pages can be
flushed using a single IPI. This should be safe or the kernel is hosed
already.
Patch 1 simply made the rest of the series easier to write as ftrace
could identify all the senders of TLB flush IPIS.
Patch 2 tracks what CPUs potentially map a PFN and then sends an IPI
to flush the entire TLB.
Patch 3 tracks when there potentially are writable TLB entries that
need to be batched differently
Patch 4 increases SWAP_CLUSTER_MAX to further batch flushes
The performance impact is documented in the changelogs but in the optimistic
case on a 4-socket machine the full series reduces interrupts from 900K
interrupts/second to 60K interrupts/second.
This patch (of 4):
It is easy to trace when an IPI is received to flush a TLB but harder to
detect what event sent it. This patch makes it easy to identify the
source of IPIs being transmitted for TLB flushes on x86.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This implements mcopy_atomic and mfill_zeropage that are the lowlevel
VM methods that are invoked respectively by the UFFDIO_COPY and
UFFDIO_ZEROPAGE userfaultfd commands.
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Sanidhya Kashyap <sanidhya.gatech@gmail.com>
Cc: zhang.zhanghailiang@huawei.com
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Hugh Dickins <hughd@google.com>
Cc: Peter Feiner <pfeiner@google.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Huangpeng (Peter)" <peter.huangpeng@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This implements the uABI of UFFDIO_COPY and UFFDIO_ZEROPAGE.
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Sanidhya Kashyap <sanidhya.gatech@gmail.com>
Cc: zhang.zhanghailiang@huawei.com
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Hugh Dickins <hughd@google.com>
Cc: Peter Feiner <pfeiner@google.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Huangpeng (Peter)" <peter.huangpeng@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I had requests to return the full address (not the page aligned one) to
userland.
It's not entirely clear how the page offset could be relevant because
userfaults aren't like SIGBUS that can sigjump to a different place and it
actually skip resolving the fault depending on a page offset. There's
currently no real way to skip the fault especially because after a
UFFDIO_COPY|ZEROPAGE, the fault is optimized to be retried within the
kernel without having to return to userland first (not even self modifying
code replacing the .text that touched the faulting address would prevent
the fault to be repeated). Userland cannot skip repeating the fault even
more so if the fault was triggered by a KVM secondary page fault or any
get_user_pages or any copy-user inside some syscall which will return to
kernel code. The second time FAULT_FLAG_RETRY_NOWAIT won't be set leading
to a SIGBUS being raised because the userfault can't wait if it cannot
release the mmap_map first (and FAULT_FLAG_RETRY_NOWAIT is required for
that).
Still returning userland a proper structure during the read() on the uffd,
can allow to use the current UFFD_API for the future non-cooperative
extensions too and it looks cleaner as well. Once we get additional
fields there's no point to return the fault address page aligned anymore
to reuse the bits below PAGE_SHIFT.
The only downside is that the read() syscall will read 32bytes instead of
8bytes but that's not going to be measurable overhead.
The total number of new events that can be extended or of new future bits
for already shipped events, is limited to 64 by the features field of the
uffdio_api structure. If more will be needed a bump of UFFD_API will be
required.
[akpm@linux-foundation.org: use __packed]
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Sanidhya Kashyap <sanidhya.gatech@gmail.com>
Cc: zhang.zhanghailiang@huawei.com
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Hugh Dickins <hughd@google.com>
Cc: Peter Feiner <pfeiner@google.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Huangpeng (Peter)" <peter.huangpeng@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is (seems to be) the minimal thing that is required to unblock
standard uffd usage from the non-cooperative one. Now more bits can be
added to the features field indicating e.g. UFFD_FEATURE_FORK and others
needed for the latter use-case.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Sanidhya Kashyap <sanidhya.gatech@gmail.com>
Cc: zhang.zhanghailiang@huawei.com
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Hugh Dickins <hughd@google.com>
Cc: Peter Feiner <pfeiner@google.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Huangpeng (Peter)" <peter.huangpeng@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
vma->vm_userfaultfd_ctx is yet another vma parameter that vma_merge
must be aware about so that we can merge vmas back like they were
originally before arming the userfaultfd on some memory range.
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Sanidhya Kashyap <sanidhya.gatech@gmail.com>
Cc: zhang.zhanghailiang@huawei.com
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Hugh Dickins <hughd@google.com>
Cc: Peter Feiner <pfeiner@google.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Huangpeng (Peter)" <peter.huangpeng@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
These two flags gets set in vma->vm_flags to tell the VM common code
if the userfaultfd is armed and in which mode (only tracking missing
faults, only tracking wrprotect faults or both). If neither flags is
set it means the userfaultfd is not armed on the vma.
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Sanidhya Kashyap <sanidhya.gatech@gmail.com>
Cc: zhang.zhanghailiang@huawei.com
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Hugh Dickins <hughd@google.com>
Cc: Peter Feiner <pfeiner@google.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Huangpeng (Peter)" <peter.huangpeng@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This adds the vm_userfaultfd_ctx to the vm_area_struct.
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Sanidhya Kashyap <sanidhya.gatech@gmail.com>
Cc: zhang.zhanghailiang@huawei.com
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Hugh Dickins <hughd@google.com>
Cc: Peter Feiner <pfeiner@google.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Huangpeng (Peter)" <peter.huangpeng@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kernel header defining the methods needed by the VM common code to
interact with the userfaultfd.
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Sanidhya Kashyap <sanidhya.gatech@gmail.com>
Cc: zhang.zhanghailiang@huawei.com
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Hugh Dickins <hughd@google.com>
Cc: Peter Feiner <pfeiner@google.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Huangpeng (Peter)" <peter.huangpeng@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Defines the uAPI of the userfaultfd, notably the ioctl numbers and protocol.
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Sanidhya Kashyap <sanidhya.gatech@gmail.com>
Cc: zhang.zhanghailiang@huawei.com
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Hugh Dickins <hughd@google.com>
Cc: Peter Feiner <pfeiner@google.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Huangpeng (Peter)" <peter.huangpeng@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
userfaultfd needs to wake all waitqueues (pass 0 as nr parameter), instead
of the current hardcoded 1 (that would wake just the first waitqueue in
the head list).
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Sanidhya Kashyap <sanidhya.gatech@gmail.com>
Cc: zhang.zhanghailiang@huawei.com
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Hugh Dickins <hughd@google.com>
Cc: Peter Feiner <pfeiner@google.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Huangpeng (Peter)" <peter.huangpeng@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add the basic infrastructure for alloc/free operations on pointer arrays.
It includes a generic function in the common slab code that is used in
this infrastructure patch to create the unoptimized functionality for slab
bulk operations.
Allocators can then provide optimized allocation functions for situations
in which large numbers of objects are needed. These optimization may
avoid taking locks repeatedly and bypass metadata creation if all objects
in slab pages can be used to provide the objects required.
Allocators can extend the skeletons provided and add their own code to the
bulk alloc and free functions. They can keep the generic allocation and
freeing and just fall back to those if optimizations would not work (like
for example when debugging is on).
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rename watchdog_suspend() to lockup_detector_suspend() and
watchdog_resume() to lockup_detector_resume() to avoid confusion with the
watchdog subsystem and to be consistent with the existing name
lockup_detector_init().
Also provide comment blocks to explain the watchdog_running and
watchdog_suspended variables and their relationship.
Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com>
Reviewed-by: Aaron Tomlin <atomlin@redhat.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Ulrich Obergfell <uobergfe@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Stephane Eranian <eranian@google.com>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove watchdog_nmi_disable_all() and watchdog_nmi_enable_all() since
these functions are no longer needed. If a subsystem has a need to
deactivate the watchdog temporarily, it should utilize the
watchdog_suspend() and watchdog_resume() functions.
[akpm@linux-foundation.org: fix build with CONFIG_LOCKUP_DETECTOR=m]
Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com>
Reviewed-by: Aaron Tomlin <atomlin@redhat.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Ulrich Obergfell <uobergfe@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Stephane Eranian <eranian@google.com>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This interface can be utilized to deactivate the hard and soft lockup
detector temporarily. Callers are expected to minimize the duration of
deactivation. Multiple deactivations are allowed to occur in parallel but
should be rare in practice.
[akpm@linux-foundation.org: remove unneeded static initialization]
Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com>
Reviewed-by: Aaron Tomlin <atomlin@redhat.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Ulrich Obergfell <uobergfe@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Stephane Eranian <eranian@google.com>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The kernel's NMI watchdog has nothing to do with the watchdog subsystem.
Its header declarations should be in linux/nmi.h, not linux/watchdog.h.
The code provided two sets of dummy functions if HARDLOCKUP_DETECTOR is
not configured, one in the include file and one in kernel/watchdog.c.
Remove the dummy functions from kernel/watchdog.c and use those from the
include file.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Cc: Stephane Eranian <eranian@google.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Don Zickus <dzickus@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It makes the registration cheaper and simpler for the smpboot per-cpu
kthread users that don't need to always update the cpumask after threads
creation.
[sfr@canb.auug.org.au: fix for allow passing the cpumask on per-cpu thread registration]
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Reviewed-by: Chris Metcalf <cmetcalf@ezchip.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Many file systems that implement the show_options hook fail to correctly
escape their output which could lead to unescaped characters (e.g. new
lines) leaking into /proc/mounts and /proc/[pid]/mountinfo files. This
could lead to confusion, spoofed entries (resulting in things like
systemd issuing false d-bus "mount" notifications), and who knows what
else. This looks like it would only be the root user stepping on
themselves, but it's possible weird things could happen in containers or
in other situations with delegated mount privileges.
Here's an example using overlay with setuid fusermount trusting the
contents of /proc/mounts (via the /etc/mtab symlink). Imagine the use
of "sudo" is something more sneaky:
$ BASE="ovl"
$ MNT="$BASE/mnt"
$ LOW="$BASE/lower"
$ UP="$BASE/upper"
$ WORK="$BASE/work/ 0 0
none /proc fuse.pwn user_id=1000"
$ mkdir -p "$LOW" "$UP" "$WORK"
$ sudo mount -t overlay -o "lowerdir=$LOW,upperdir=$UP,workdir=$WORK" none /mnt
$ cat /proc/mounts
none /root/ovl/mnt overlay rw,relatime,lowerdir=ovl/lower,upperdir=ovl/upper,workdir=ovl/work/ 0 0
none /proc fuse.pwn user_id=1000 0 0
$ fusermount -u /proc
$ cat /proc/mounts
cat: /proc/mounts: No such file or directory
This fixes the problem by adding new seq_show_option and
seq_show_option_n helpers, and updating the vulnerable show_option
handlers to use them as needed. Some, like SELinux, need to be open
coded due to unusual existing escape mechanisms.
[akpm@linux-foundation.org: add lost chunk, per Kees]
[keescook@chromium.org: seq_show_option should be using const parameters]
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Acked-by: Jan Kara <jack@suse.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Cc: J. R. Okajima <hooanon05g@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
fsnotify_destroy_mark_locked() is subtle to use because it temporarily
releases group->mark_mutex. To avoid future problems with this
function, split it into two.
fsnotify_detach_mark() is the part that needs group->mark_mutex and
fsnotify_free_mark() is the part that must be called outside of
group->mark_mutex. This way it's much clearer what's going on and we
also avoid some pointless acquisitions of group->mark_mutex.
Signed-off-by: Jan Kara <jack@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Free list is used when all marks on given inode / mount should be
destroyed when inode / mount is going away. However we can free all of
the marks without using a special list with some care.
Signed-off-by: Jan Kara <jack@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jan Kara <jack@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Per Andrew Morgan's request, add a securebit to allow admins to disable
PR_CAP_AMBIENT_RAISE. This securebit will prevent processes from adding
capabilities to their ambient set.
For simplicity, this disables PR_CAP_AMBIENT_RAISE entirely rather than
just disabling setting previously cleared bits.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Andrew G. Morgan <morgan@kernel.org>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Serge Hallyn <serge.hallyn@canonical.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Aaron Jones <aaronmdjones@gmail.com>
Cc: Ted Ts'o <tytso@mit.edu>
Cc: Andrew G. Morgan <morgan@kernel.org>
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Austin S Hemmelgarn <ahferroin7@gmail.com>
Cc: Markku Savela <msa@moth.iki.fi>
Cc: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: James Morris <james.l.morris@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Credit where credit is due: this idea comes from Christoph Lameter with
a lot of valuable input from Serge Hallyn. This patch is heavily based
on Christoph's patch.
===== The status quo =====
On Linux, there are a number of capabilities defined by the kernel. To
perform various privileged tasks, processes can wield capabilities that
they hold.
Each task has four capability masks: effective (pE), permitted (pP),
inheritable (pI), and a bounding set (X). When the kernel checks for a
capability, it checks pE. The other capability masks serve to modify
what capabilities can be in pE.
Any task can remove capabilities from pE, pP, or pI at any time. If a
task has a capability in pP, it can add that capability to pE and/or pI.
If a task has CAP_SETPCAP, then it can add any capability to pI, and it
can remove capabilities from X.
Tasks are not the only things that can have capabilities; files can also
have capabilities. A file can have no capabilty information at all [1].
If a file has capability information, then it has a permitted mask (fP)
and an inheritable mask (fI) as well as a single effective bit (fE) [2].
File capabilities modify the capabilities of tasks that execve(2) them.
A task that successfully calls execve has its capabilities modified for
the file ultimately being excecuted (i.e. the binary itself if that
binary is ELF or for the interpreter if the binary is a script.) [3] In
the capability evolution rules, for each mask Z, pZ represents the old
value and pZ' represents the new value. The rules are:
pP' = (X & fP) | (pI & fI)
pI' = pI
pE' = (fE ? pP' : 0)
X is unchanged
For setuid binaries, fP, fI, and fE are modified by a moderately
complicated set of rules that emulate POSIX behavior. Similarly, if
euid == 0 or ruid == 0, then fP, fI, and fE are modified differently
(primary, fP and fI usually end up being the full set). For nonroot
users executing binaries with neither setuid nor file caps, fI and fP
are empty and fE is false.
As an extra complication, if you execute a process as nonroot and fE is
set, then the "secure exec" rules are in effect: AT_SECURE gets set,
LD_PRELOAD doesn't work, etc.
This is rather messy. We've learned that making any changes is
dangerous, though: if a new kernel version allows an unprivileged
program to change its security state in a way that persists cross
execution of a setuid program or a program with file caps, this
persistent state is surprisingly likely to allow setuid or file-capped
programs to be exploited for privilege escalation.
===== The problem =====
Capability inheritance is basically useless.
If you aren't root and you execute an ordinary binary, fI is zero, so
your capabilities have no effect whatsoever on pP'. This means that you
can't usefully execute a helper process or a shell command with elevated
capabilities if you aren't root.
On current kernels, you can sort of work around this by setting fI to
the full set for most or all non-setuid executable files. This causes
pP' = pI for nonroot, and inheritance works. No one does this because
it's a PITA and it isn't even supported on most filesystems.
If you try this, you'll discover that every nonroot program ends up with
secure exec rules, breaking many things.
This is a problem that has bitten many people who have tried to use
capabilities for anything useful.
===== The proposed change =====
This patch adds a fifth capability mask called the ambient mask (pA).
pA does what most people expect pI to do.
pA obeys the invariant that no bit can ever be set in pA if it is not
set in both pP and pI. Dropping a bit from pP or pI drops that bit from
pA. This ensures that existing programs that try to drop capabilities
still do so, with a complication. Because capability inheritance is so
broken, setting KEEPCAPS, using setresuid to switch to nonroot uids, and
then calling execve effectively drops capabilities. Therefore,
setresuid from root to nonroot conditionally clears pA unless
SECBIT_NO_SETUID_FIXUP is set. Processes that don't like this can
re-add bits to pA afterwards.
The capability evolution rules are changed:
pA' = (file caps or setuid or setgid ? 0 : pA)
pP' = (X & fP) | (pI & fI) | pA'
pI' = pI
pE' = (fE ? pP' : pA')
X is unchanged
If you are nonroot but you have a capability, you can add it to pA. If
you do so, your children get that capability in pA, pP, and pE. For
example, you can set pA = CAP_NET_BIND_SERVICE, and your children can
automatically bind low-numbered ports. Hallelujah!
Unprivileged users can create user namespaces, map themselves to a
nonzero uid, and create both privileged (relative to their namespace)
and unprivileged process trees. This is currently more or less
impossible. Hallelujah!
You cannot use pA to try to subvert a setuid, setgid, or file-capped
program: if you execute any such program, pA gets cleared and the
resulting evolution rules are unchanged by this patch.
Users with nonzero pA are unlikely to unintentionally leak that
capability. If they run programs that try to drop privileges, dropping
privileges will still work.
It's worth noting that the degree of paranoia in this patch could
possibly be reduced without causing serious problems. Specifically, if
we allowed pA to persist across executing non-pA-aware setuid binaries
and across setresuid, then, naively, the only capabilities that could
leak as a result would be the capabilities in pA, and any attacker
*already* has those capabilities. This would make me nervous, though --
setuid binaries that tried to privilege-separate might fail to do so,
and putting CAP_DAC_READ_SEARCH or CAP_DAC_OVERRIDE into pA could have
unexpected side effects. (Whether these unexpected side effects would
be exploitable is an open question.) I've therefore taken the more
paranoid route. We can revisit this later.
An alternative would be to require PR_SET_NO_NEW_PRIVS before setting
ambient capabilities. I think that this would be annoying and would
make granting otherwise unprivileged users minor ambient capabilities
(CAP_NET_BIND_SERVICE or CAP_NET_RAW for example) much less useful than
it is with this patch.
===== Footnotes =====
[1] Files that are missing the "security.capability" xattr or that have
unrecognized values for that xattr end up with has_cap set to false.
The code that does that appears to be complicated for no good reason.
[2] The libcap capability mask parsers and formatters are dangerously
misleading and the documentation is flat-out wrong. fE is *not* a mask;
it's a single bit. This has probably confused every single person who
has tried to use file capabilities.
[3] Linux very confusingly processes both the script and the interpreter
if applicable, for reasons that elude me. The results from thinking
about a script's file capabilities and/or setuid bits are mostly
discarded.
Preliminary userspace code is here, but it needs updating:
https://git.kernel.org/cgit/linux/kernel/git/luto/util-linux-playground.git/commit/?h=cap_ambient&id=7f5afbd175d2
Here is a test program that can be used to verify the functionality
(from Christoph):
/*
* Test program for the ambient capabilities. This program spawns a shell
* that allows running processes with a defined set of capabilities.
*
* (C) 2015 Christoph Lameter <cl@linux.com>
* Released under: GPL v3 or later.
*
*
* Compile using:
*
* gcc -o ambient_test ambient_test.o -lcap-ng
*
* This program must have the following capabilities to run properly:
* Permissions for CAP_NET_RAW, CAP_NET_ADMIN, CAP_SYS_NICE
*
* A command to equip the binary with the right caps is:
*
* setcap cap_net_raw,cap_net_admin,cap_sys_nice+p ambient_test
*
*
* To get a shell with additional caps that can be inherited by other processes:
*
* ./ambient_test /bin/bash
*
*
* Verifying that it works:
*
* From the bash spawed by ambient_test run
*
* cat /proc/$$/status
*
* and have a look at the capabilities.
*/
#include <stdlib.h>
#include <stdio.h>
#include <errno.h>
#include <cap-ng.h>
#include <sys/prctl.h>
#include <linux/capability.h>
/*
* Definitions from the kernel header files. These are going to be removed
* when the /usr/include files have these defined.
*/
#define PR_CAP_AMBIENT 47
#define PR_CAP_AMBIENT_IS_SET 1
#define PR_CAP_AMBIENT_RAISE 2
#define PR_CAP_AMBIENT_LOWER 3
#define PR_CAP_AMBIENT_CLEAR_ALL 4
static void set_ambient_cap(int cap)
{
int rc;
capng_get_caps_process();
rc = capng_update(CAPNG_ADD, CAPNG_INHERITABLE, cap);
if (rc) {
printf("Cannot add inheritable cap\n");
exit(2);
}
capng_apply(CAPNG_SELECT_CAPS);
/* Note the two 0s at the end. Kernel checks for these */
if (prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_RAISE, cap, 0, 0)) {
perror("Cannot set cap");
exit(1);
}
}
int main(int argc, char **argv)
{
int rc;
set_ambient_cap(CAP_NET_RAW);
set_ambient_cap(CAP_NET_ADMIN);
set_ambient_cap(CAP_SYS_NICE);
printf("Ambient_test forking shell\n");
if (execv(argv[1], argv + 1))
perror("Cannot exec");
return 0;
}
Signed-off-by: Christoph Lameter <cl@linux.com> # Original author
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Aaron Jones <aaronmdjones@gmail.com>
Cc: Ted Ts'o <tytso@mit.edu>
Cc: Andrew G. Morgan <morgan@kernel.org>
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Austin S Hemmelgarn <ahferroin7@gmail.com>
Cc: Markku Savela <msa@moth.iki.fi>
Cc: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: James Morris <james.l.morris@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- Make it clear that the `node' arg refers to memory allocations only:
kthread_create_on_node() does not pin the new thread to that node's
CPUs.
- Encourage the use of NUMA_NO_NODE.
[nzimmer@sgi.com: use NUMA_NO_NODE in kthread_create() also]
Cc: Nathan Zimmer <nzimmer@sgi.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull drm updates from Dave Airlie:
"This is the main pull request for the drm for 4.3. Nouveau is
probably the biggest amount of changes in here, since it missed 4.2.
Highlights below, along with the usual bunch of fixes.
All stuff outside drm should have applicable acks.
Highlights:
- new drivers:
freescale dcu kms driver
- core:
more atomic fixes
disable some dri1 interfaces on kms drivers
drop fb panic handling, this was just getting more broken, as more locking was required.
new core fbdev Kconfig support - instead of each driver enable/disabling it
struct_mutex cleanups
- panel:
more new panels
cleanup Kconfig
- i915:
Skylake support enabled by default
legacy modesetting using atomic infrastructure
Skylake fixes
GEN9 workarounds
- amdgpu:
Fiji support
CGS support for amdgpu
Initial GPU scheduler - off by default
Lots of bug fixes and optimisations.
- radeon:
DP fixes
misc fixes
- amdkfd:
Add Carrizo support for amdkfd using amdgpu.
- nouveau:
long pending cleanup to complete driver,
fully bisectable which makes it larger,
perfmon work
more reclocking improvements
maxwell displayport fixes
- vmwgfx:
new DX device support, supports OpenGL 3.3
screen targets support
- mgag200:
G200eW support
G200e new revision support
- msm:
dragonboard 410c support, msm8x94 support, msm8x74v1 support
yuv format support
dma plane support
mdp5 rotation
initial hdcp
- sti:
atomic support
- exynos:
lots of cleanups
atomic modesetting/pageflipping support
render node support
- tegra:
tegra210 support (dc, dsi, dp/hdmi)
dpms with atomic modesetting support
- atmel:
support for 3 more atmel SoCs
new input formats, PRIME support.
- dwhdmi:
preparing to add audio support
- rockchip:
yuv plane support"
* 'drm-next' of git://people.freedesktop.org/~airlied/linux: (1369 commits)
drm/amdgpu: rename gmc_v8_0_init_compute_vmid
drm/amdgpu: fix vce3 instance handling
drm/amdgpu: remove ib test for the second VCE Ring
drm/amdgpu: properly enable VM fault interrupts
drm/amdgpu: fix warning in scheduler
drm/amdgpu: fix buffer placement under memory pressure
drm/amdgpu/cz: fix cz_dpm_update_low_memory_pstate logic
drm/amdgpu: fix typo in dce11 watermark setup
drm/amdgpu: fix typo in dce10 watermark setup
drm/amdgpu: use top down allocation for non-CPU accessible vram
drm/amdgpu: be explicit about cpu vram access for driver BOs (v2)
drm/amdgpu: set MEC doorbell range for Fiji
drm/amdgpu: implement burst NOP for SDMA
drm/amdgpu: add insert_nop ring func and default implementation
drm/amdgpu: add amdgpu_get_sdma_instance helper function
drm/amdgpu: add AMDGPU_MAX_SDMA_INSTANCES
drm/amdgpu: add burst_nop flag for sdma
drm/amdgpu: add count field for the SDMA NOP packet v2
drm/amdgpu: use PT for VM sync on unmap
drm/amdgpu: make wait_event uninterruptible in push_job
...
There are little changes in core part, but lots of development are
found in drivers, especially ASoC. The diffstat shows regmap-
related changes for a slight API additions / changes, and that's all.
Looking at the code size statistics, the most significant addition
is for Intel Skylake. (Note that SKL support is still underway, the
codec driver is missing.) Also STI controller driver is a major
addition as well as a few new codec drivers.
In HD-audio side, there are fewer changes than the past. The
noticeable change is the support of ELD notification from i915
graphics driver. Thus this pull request carries a few changes in
drm/i915.
Other than that, USB-audio got a rewrite of runtime PM code. It
was initiated by lockdep warning, but resulted in a good cleanup in
the end.
Below are the highlights:
Common:
- Factoring out of AC'97 reset code from ASoC into the core helper
- A few regmap API extensions (in case it's not pulled yet)
ASoC:
- New drivers for Cirrus CS4349, GTM601, InvenSense ICS43432, Realtek
RT298 and ST STI controllers
- Machine drivers for Rockchip systems with MAX98090 and RT5645 and
RT5650
- Initial driver support for Intel Skylake devices
- Lots of rsnd cleanup and enhancements
- A few DAPM fixes and cleanups
- A large number of cleanups in various drivers (conversion and
standardized to regmap, component) mostly by Lars-Peter and Axel
HD-audio:
- Extended HD-audio core for Intel Skylake controller support
- Quirks for Dell headsets, Alienware 15
- Clean up of pin-based quirk tables for Realtek codecs
- ELD notifier implenetation for Intel HDMI/DP
USB-audio:
- Refactor runtime PM code to make lockdep happier
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJV6TwJAAoJEGwxgFQ9KSmkZoEP/06GrsGlfgIfBbnlAKcsZ0t0
RDDCbxmwD8IsjTk180Gs3qBuhVPurhmPxq6Leow5fBktkEK5bIN3eAQkO9aIMroW
xxU1UF6Q9XE2j97e/PhhUld7/NP0IQK/YTMuwX74G2kfEkA9Lktl4UjNMw9mKJX2
8OIwz8ZuqSG60znmGlgiqRE4M3Svs1L/jVP1wrPg2DXQfe+ptAJpUTsyVGOMRWm3
IaJ9h5OelPg8Jm61zcg6/pgsdYx4oquCV5wLwMz8rzIUfHb7ox8F7YKOzB+sXtYI
zcsTfF2CqifoBcQAh9c+XE4+gMamAdheA+uc8ScUkcskucTj4Fr5tXLiPSN9QMt4
QGOOVjqcpWv5rWwAgzUJvl1/PT4HyQfkXn5tEQVGdg9Ab1SIcQBzD1+nHUV94vKZ
N7/grMdqJ56zUGK2fEcBS6BEDlaSToOIHDrQ1iPFNBvmW8qjBq9tYaufTGC6Vtj2
0YKJukzIbyqLIgQtQf44aqLouFIz2lq437PqRQ4W+9C3FwGN9FKCYJ/JzvOGDIJa
sSjEwQkJ9vnmZ3E2B30NKb24TG8pPq9WPIN2Rqe5EbHctU3gEnMScwvmG7SmCSG5
LtDVr6Q5XKFM56cVb7tdZl6Jv97BvGu6EERM+zN+8YyMver206rC8upWOev6R2q3
asvLDEchv7Qm3upx+PYg
=/sXs
-----END PGP SIGNATURE-----
Merge tag 'sound-4.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound updates from Takashi Iwai:
"There are little changes in core part, but lots of development are
found in drivers, especially ASoC. The diffstat shows regmap-related
changes for a slight API additions / changes, and that's all.
Looking at the code size statistics, the most significant addition is
for Intel Skylake. (Note that SKL support is still underway, the
codec driver is missing.) Also STI controller driver is a major
addition as well as a few new codec drivers.
In HD-audio side, there are fewer changes than the past. The
noticeable change is the support of ELD notification from i915
graphics driver. Thus this pull request carries a few changes in
drm/i915.
Other than that, USB-audio got a rewrite of runtime PM code. It was
initiated by lockdep warning, but resulted in a good cleanup in the
end.
Below are the highlights:
Common:
- Factoring out of AC'97 reset code from ASoC into the core helper
- A few regmap API extensions (in case it's not pulled yet)
ASoC:
- New drivers for Cirrus CS4349, GTM601, InvenSense ICS43432, Realtek
RT298 and ST STI controllers
- Machine drivers for Rockchip systems with MAX98090 and RT5645 and
RT5650
- Initial driver support for Intel Skylake devices
- Lots of rsnd cleanup and enhancements
- A few DAPM fixes and cleanups
- A large number of cleanups in various drivers (conversion and
standardized to regmap, component) mostly by Lars-Peter and Axel
HD-audio:
- Extended HD-audio core for Intel Skylake controller support
- Quirks for Dell headsets, Alienware 15
- Clean up of pin-based quirk tables for Realtek codecs
- ELD notifier implenetation for Intel HDMI/DP
USB-audio:
- Refactor runtime PM code to make lockdep happier"
* tag 'sound-4.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (411 commits)
drm/i915: Add locks around audio component bind/unbind
drm/i915: Drop port_mst_index parameter from pin/eld callback
ALSA: hda - Fix missing inline for dummy snd_hdac_set_codec_wakeup()
ALSA: hda - Wake the codec up on pin/ELD notify events
ALSA: hda - allow codecs to access the i915 pin/ELD callback
drm/i915: Call audio pin/ELD notify function
drm/i915: Add audio pin sense / ELD callback
ASoC: zx296702-i2s: Fix resource leak when unload module
ASoC: sti_uniperif: Ensure component is unregistered when unload module
ASoC: au1x: psc-i2s: Convert to use devm_ioremap_resource
ASoC: sh: dma-sh7760: Convert to devm_snd_soc_register_platform
ASoC: spear_pcm: Use devm_snd_dmaengine_pcm_register to fix resource leak
ALSA: fireworks/bebob/dice/oxfw: fix substreams counting at vmalloc failure
ASoC: Clean up docbook warnings
ASoC: txx9: Convert to devm_snd_soc_register_platform
ASoC: pxa: Convert to devm_snd_soc_register_platform
ASoC: nuc900: Convert to devm_snd_soc_register_platform
ASoC: blackfin: Convert to devm_snd_soc_register_platform
ASoC: au1x: Convert to devm_snd_soc_register_platform
ASoC: qcom: Constify asoc_qcom_lpass_cpu_dai_ops
...
- Move PWM8941 WLED driver into Backlight
- Remove invalid use of IS_ERR_VALUE() macro
- Remove duplicate check for NULL data before unregistering
- Export I2C Device ID structure
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJV6FPVAAoJEFGvii+H/Hdh1NIP/2mcBwvmF8zgmTUALbRLKyAp
3LXx0/Nh64Pjz2rDsZPclo4NLlq43d9xq/A3CBI4b00VgiofgRG/Cihhzo8nxhTu
XN73TgVQ2CiNrQ0yn85dedoVEbv0cAlZYdyuHrp1SUUSPOUhaiRU2w6k7BkY0gc2
8lkzmVnMOkQjT+Elj6wVBcPiojm333Tu5wE125W8t6XDOwDfpLoVkzcxcv/V7eIo
pStB7u3MsXZ9kmG2S9aMSZX1PS6C772OeN1I7+3lFy67qiIDj3tQ+8MyaMFvWGdQ
/w8+f4K8k0t1m1OJVLgsT4UKR/7qprTOvEXYGrsVPPwZqSOypRZOooPOQbAVtkV6
qYsVYEnbsJerkCuAyWETL+H5eH+P56Zl66t6a9hgnzw4wKYarufo+xUe0b+ntmoO
dtegXgUVryY1P9teXsnnWx24r4pfaCWunO1RUXSZABCK+0qEeXa2SM/PFtkCWwEd
AGBp8Rcv9a2BHtY2oaii+rs6WFUH+CGeZ8rt1cKBZvT2gCjlTPb5WiGvCEnN3OBi
BCWgTLILldSRG7ERW6P6W6WkpP9u6gz8BChLNe0PsoOoNLXElHj6VCAlusOc1kHU
6WwvMpKcQ+SVMVNouFLv3Ur5K8sF7VkyaHDRoipeKufu0n3b3+AXEaRr/Ch1BsIh
cdUHgYJSujzcD/Zx8crp
=URbg
-----END PGP SIGNATURE-----
Merge tag 'backlight-for-linus-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight
Pull backlight updates from Lee Jones:
- Stop using LP855X Platform Data to control regulators
- Move PWM8941 WLED driver into Backlight
- Remove invalid use of IS_ERR_VALUE() macro
- Remove duplicate check for NULL data before unregistering
- Export I2C Device ID structure
* tag 'backlight-for-linus-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight:
backlight: tosa: Export I2C module alias information
backlight: lp8788_bl: Delete a check before backlight_device_unregister()
backlight: sky81452: Remove unneeded use of IS_ERR_VALUE() macro
backlight: pm8941-wled: Move PM8941 WLED driver to backlight
backlight: lp855x: Use private data for regulator control
- New Clocksource driver from ST
- New MFD/ACPI/DMA drivers for Intel's Sunrisepoint PCH based platforms
- Add support for Arizona WM8998 and WM1814
- Add support for Dialog Semi DA9062 and DA9063
- Add support for Kontron COMe-bBL6 and COMe-cBW6
- Add support for X-Powers AXP152
- Add support for Atmel, many
- Add support for STMPE, many
- Add support for USB in X-Powers AXP22X
- Core Frameworks
- New Base API to traverse devices and their children in reverse order
- Bug Fixes
- Fix race between runtime-suspend and IRQs
- Obtain platform data form more reliable source
- Fix-ups
- Constifying things
- Variable signage changes
- Kconfig depends|selects changes
- Make use of BIT() macro
- Do not supply .owner attribute in *_driver structures
- MAINTAINERS entries
- Stop using set_irq_flags()
- Start using irq_set_chained_handler_and_data()
- Export DT device ID structures
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJV6FCRAAoJEFGvii+H/HdhO2MP/1GxcbbywMXU4goj40gJaYfx
kk0zH0S7i8+A8hD7SoCIQNkWN5o7i6sNYUA6sCTnPqixbyrkduWCyid1XsATu+41
iiKEGyiCRyEHhCwnwCXvaFhpAZBzDi7FKj6hhf6nnRMHSEqwrs2aBqWgzNrOZTs0
u66i/JHccnDdfHHm9Y7XcKMA8pWVqRMnwwaHreuYTFqfrEB0UGCYpmEeEBynGVKh
MUGC0lCUrEKp59aOexZRtBUla/5BeALJd//vMQtf/+D0YPvE8lppDNwkgCe4buXN
ZlNHDQooIWIiZfTj7wbHaTWjrBK7MsOEHWBUjNsk2nyDvDOJoGhTrSdJwPeyhUSh
d2eUyW6sPEQY21XPwuD0DhfRKYKLOzVRhIcxvjlRAq9QHDWVXKyIlf3M70fculK8
5FN1Wb6Sc2h0OvMC5RemPpxMwZSq6Ks3XANa718Ju802TGK/xk6iRqhZrEut/qrN
rLYsU84TLUz6YindozTiI5FrGo+zSp9OlUU4z7HUh+4t3H5/opdsRjRp0ICwgIbY
NxAmsk2d/vJ7xX7FAAjwMY2rPIC0zIksbGEe1AJweWV455EcDMaBM1/e9zDzHciI
TXVxbzs3DFBadtQWlLv/VkwZmt43MTI8g6ozTTJJkPQNCKThtz4bvDSu8rQWqFua
bkbRyQroraX5fM0Q3HIs
=2blS
-----END PGP SIGNATURE-----
Merge tag 'mfd-for-linus-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd
Pull MFD updates from Lee Jones:
"New Device Support:
- New Clocksource driver from ST
- New MFD/ACPI/DMA drivers for Intel's Sunrisepoint PCH based platforms
- Add support for Arizona WM8998 and WM1814
- Add support for Dialog Semi DA9062 and DA9063
- Add support for Kontron COMe-bBL6 and COMe-cBW6
- Add support for X-Powers AXP152
- Add support for Atmel, many
- Add support for STMPE, many
- Add support for USB in X-Powers AXP22X
Core Frameworks:
- New Base API to traverse devices and their children in reverse order
Bug Fixes:
- Fix race between runtime-suspend and IRQs
- Obtain platform data form more reliable source
Fix-ups:
- Constifying things
- Variable signage changes
- Kconfig depends|selects changes
- Make use of BIT() macro
- Do not supply .owner attribute in *_driver structures
- MAINTAINERS entries
- Stop using set_irq_flags()
- Start using irq_set_chained_handler_and_data()
- Export DT device ID structures"
* tag 'mfd-for-linus-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: (69 commits)
mfd: jz4740-adc: Init mask cache in generic IRQ chip
mfd: cros_ec: spi: Add OF match table
mfd: stmpe: Add OF match table
mfd: max77686: Split out regulator part from the DT binding
mfd: Add DT binding for Maxim MAX77802 IC
mfd: max77686: Use a generic name for the PMIC node in the example
mfd: max77686: Don't suggest in binding to use a deprecated property
mfd: Add MFD_CROS_EC dependencies
mfd: cros_ec: Remove CROS_EC_PROTO dependency for SPI and I2C drivers
mfd: axp20x: Add a cell for the usb power_supply part of the axp20x PMICs
mfd: axp20x: Add missing registers, and mark more registers volatile
mfd: arizona: Fixup some formatting/white space errors
mfd: wm8994: Fix NULL pointer exception on missing pdata
of: Add vendor prefix for Nuvoton
mfd: mt6397: Implement wake handler and suspend/resume to handle wake up event
mfd: atmel-hlcdc: Add support for new SoCs
mfd: Export OF module alias information in missing drivers
mfd: stw481x: Export I2C module alias information
mfd: da9062: Support for the DA9063 OnKey in the DA9062 core
mfd: max899x: Avoid redundant irq_data lookup
...
This time we have aded a new capability for scatter-gathered memset using
dmaengine APIs. This is supported in xdmac & hdmac drivers
We have added support for reusing descriptors for examples like video
buffers etc. Driver will follow
The behaviour of descriptor ack has been clarified and documented
New devices added are:
- dma controller in sun[457]i SoCs
- lpc18xx dmamux
- ZTE ZX296702 dma controller
- Analog Devices AXI-DMAC DMA controller
- eDMA support for dma-crossbar
- imx6sx support in imx-sdma driver
- imx-sdma device to device support
Others
- jz4780 fixes
- ioatdma large refactor and cleanup for removal of ioat v1 and v2 which is
deprecated and fixes
- ACPI support in X-Gene DMA engine driver
- ipu irq fixes
- mvxor fixes
- minor fixes spread thru drivers
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJV5+nSAAoJEHwUBw8lI4NHiXQQAI/++7PmUGZ6BDZGu0B9Bj7U
JalNijm43p858nka1zVhDea8pi7Cq3zJdE8EAB7FPQGESvCODWr62oZBr+mSaQ1C
oU1RTIRTSiU2HPE4EFeGUvVGrnmTbHR2b1apI1SU41gKn+oQ5RJRRoQwEVwO6uuZ
1VYcUqhurIAZs1FrMIAUa2vg7KTcK9UotfwR2gGBmSvXMf1aJ/dNZC7i/pBJjoyt
v6KrLuYjEBAJvY7l368+NhLY/MS+2xdCMQo84B+HNEG7eA7y2MFOcRPXQA3a7dzr
NwNuAZcTYDU11r2jiAPcnBM5sPo4bokX6Td0oDbYH6Rn2uIWlof7jGIceUaWLQQq
QGZc4QPI4KdjTGNedRN8g9zqv0irFVfDr5v1A+B7N7ehvlubnB4jV8LmLpqN6UQH
B38VnDJ3hqdZ6j9RHQTyUoQskSYMPbOAUYbL0qQLkyx8AnLc8TRv7DgtSvZjnz5W
oF6So2A5SWZ7UmXKupd6TKtdyG3xtFAh+/MGVQ1RS9bCmnyhaIxJRiJwfftCBTBx
IZePOsqlwl2dojM62BDlGS4CLRZve2VgiUEJaPINsdm/On3tQs9+iDbNY3cpvLQS
P9u4po1TQPZnKG732vPAxEqdlq709kta7Fj5KIEvNjuWBBGKfypNP8BHKRvTLFlR
kcbO03NzwSO6PZpmiUsx
=gQZ6
-----END PGP SIGNATURE-----
Merge tag 'dmaengine-4.3-rc1' of git://git.infradead.org/users/vkoul/slave-dma
Pull dmaengine updates from Vinod Koul:
"This time we have aded a new capability for scatter-gathered memset
using dmaengine APIs. This is supported in xdmac & hdmac drivers
We have added support for reusing descriptors for examples like video
buffers etc. Driver will follow
The behaviour of descriptor ack has been clarified and documented
New devices added are:
- dma controller in sun[457]i SoCs
- lpc18xx dmamux
- ZTE ZX296702 dma controller
- Analog Devices AXI-DMAC DMA controller
- eDMA support for dma-crossbar
- imx6sx support in imx-sdma driver
- imx-sdma device to device support
Other:
- jz4780 fixes
- ioatdma large refactor and cleanup for removal of ioat v1 and v2
which is deprecated and fixes
- ACPI support in X-Gene DMA engine driver
- ipu irq fixes
- mvxor fixes
- minor fixes spread thru drivers"
[ The Kconfig and Makefile entries got re-sorted alphabetically, and I
handled the conflict with the new Intel integrated IDMA driver by
slightly mis-sorting it on purpose: "IDMA64" got sorted after "IMX" in
order to keep the Intel entries together. I think it might be a good
idea to just rename the IDMA64 config entry to INTEL_IDMA64 to make
the sorting be a true sort, not this mismash.
Also, this merge disables the COMPILE_TEST for the sun4i DMA
controller, because it does not compile cleanly at all. - Linus ]
* tag 'dmaengine-4.3-rc1' of git://git.infradead.org/users/vkoul/slave-dma: (89 commits)
dmaengine: ioatdma: add Broadwell EP ioatdma PCI dev IDs
dmaengine :ipu: change ipu_irq_handler() to remove compile warning
dmaengine: ioatdma: Fix variable array length
dmaengine: ioatdma: fix sparse "error" with prep lock
dmaengine: hdmac: Add memset capabilities
dmaengine: sort the sh Makefile
dmaengine: sort the sh Kconfig
dmaengine: sort the dw Kconfig
dmaengine: sort the Kconfig
dmaengine: sort the makefile
drivers/dma: make mv_xor.c driver explicitly non-modular
dmaengine: Add support for the Analog Devices AXI-DMAC DMA controller
devicetree: Add bindings documentation for Analog Devices AXI-DMAC
dmaengine: xgene-dma: Fix the lock to allow client for further submission of requests
dmaengine: ioatdma: fix coccinelle warning
dmaengine: ioatdma: fix zero day warning on incompatible pointer type
dmaengine: tegra-apb: Simplify locking for device using global pause
dmaengine: tegra-apb: Remove unnecessary return statements and variables
dmaengine: tegra-apb: Avoid unnecessary channel base address calculation
dmaengine: tegra-apb: Remove unused variables
...
cycle
Core changes:
- It is possible configure groups in debugfs.
- Consolidation of chained IRQ handler install/remove replacing
all call sites where irq_set_handler_data() and
irq_set_chained_handler() were done in succession with a
combined call to irq_set_chained_handler_and_data(). This
series was created by Thomas Gleixner after the problem was
observed by Russell King.
- Tglx also made another series of patches switching
__irq_set_handler_locked() for irq_set_handler_locked() which
is way cleaner.
- Tglx also wrote a good bunch of patches to make use of
irq_desc_get_xxx() accessors and avoid looking up irq_descs
from IRQ numbers. The goal is to get rid of the irq number
from the handlers in the IRQ flow which is nice.
Driver feature enhancements:
- Power management support for the SiRF SoC Atlas 7.
- Power down support for the Qualcomm driver.
- Intel Cherryview and Baytrail: switch drivers to use raw
spinlocks in IRQ handlers to play nice with the realtime
patch set.
- Rework and new modes handling for Qualcomm SPMI-MPP.
- Pinconf power source config for SH PFC.
New drivers and subdrivers:
- A new driver for Conexant Digicolor CX92755.
- A new driver for UniPhier PH1-LD4, PH1-Pro4, PH1-sLD8,
PH1-Pro5, ProXtream2 and PH1-LD6b SoC pin control support.
- Reverse-egineered the S/PDIF settings for the Allwinner
sun4i driver.
- Support for Qualcomm Technologies QDF2xxx ARM64 SoCs
- A new Freescale i.mx6ul subdriver.
Cleanup:
- Remove platform data support in a number of SH PFC
subdrivers.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJV6YzgAAoJEEEQszewGV1zbIAQAILzMrzWkxsy7bhvL4QdP5/K
OG3EodE//AE0G5gKugUDjg5t2lftdiIJVhjDA17ruETCSciuAxZSLThlMy1sQgyN
LPxy9LlCrmsqrYt9+fmJ9js8j52RBJikKK0RUyUVz0VojTBplRpElyEx/KxwM5sG
Hy3+hU61uKO0j9AyIcsa/RKP6SGavwZdHytJBsHNw+pODyE3UZCf52ChAVBsTPfE
MV70g3Qzfqur7ZFqcNgtUV7qCyYvlF12ooiihrGFDOsTL3sSq4/OXB7z1z1mGGHL
Dgq8pXJ6EIZlCbk+jFMTzPRSzy46dxNai0eErjTUVEldH1tOphzGMvKmOdm/nczH
4M/UOWOKBE1aOYZNPtnUgDy2MRt5K9VJStCNSHEQCB2lGdojNAtmj2cmr8flBN5m
gM9FDpIS1/C+OYYTkOY9ftPsH5zOk7sCLEHSH5USYRGJHihzLnkV90eiN6a7vlF1
hyTGrIyl6e//E5JBgamjnR3+fYuxQGr6WeAZEP/gXZRm7BCKCaPwCarq+kPZVG4A
nolZ/QQN6XYPSlveSPU97VYvLYEUvXaKN0Hf2DTbwkqvNFp7JORD65QLESPtQoIp
x95iHMdB/1+0OfgOqMmlOtKpOKREeQ/R+KWACxsrr5Rfv3/7CP4BMRGypIZ/iPmz
HWoyDI4lIebBR+JnjMjK
=4QFX
-----END PGP SIGNATURE-----
Merge tag 'pinctrl-v4.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control updates from Linus Walleij:
"This is the bulk of pin control changes for the v4.3 development
cycle.
Like with GPIO it's a lot of stuff. If my subsystems are any sign of
the overall tempo of the kernel v4.3 will be a gigantic diff.
[ It looks like 4.3 is calmer than 4.2 in most other subsystems, but
we'll see - Linus ]
Core changes:
- It is possible configure groups in debugfs.
- Consolidation of chained IRQ handler install/remove replacing all
call sites where irq_set_handler_data() and
irq_set_chained_handler() were done in succession with a combined
call to irq_set_chained_handler_and_data(). This series was
created by Thomas Gleixner after the problem was observed by
Russell King.
- Tglx also made another series of patches switching
__irq_set_handler_locked() for irq_set_handler_locked() which is
way cleaner.
- Tglx also wrote a good bunch of patches to make use of
irq_desc_get_xxx() accessors and avoid looking up irq_descs from
IRQ numbers. The goal is to get rid of the irq number from the
handlers in the IRQ flow which is nice.
Driver feature enhancements:
- Power management support for the SiRF SoC Atlas 7.
- Power down support for the Qualcomm driver.
- Intel Cherryview and Baytrail: switch drivers to use raw spinlocks
in IRQ handlers to play nice with the realtime patch set.
- Rework and new modes handling for Qualcomm SPMI-MPP.
- Pinconf power source config for SH PFC.
New drivers and subdrivers:
- A new driver for Conexant Digicolor CX92755.
- A new driver for UniPhier PH1-LD4, PH1-Pro4, PH1-sLD8, PH1-Pro5,
ProXtream2 and PH1-LD6b SoC pin control support.
- Reverse-egineered the S/PDIF settings for the Allwinner sun4i
driver.
- Support for Qualcomm Technologies QDF2xxx ARM64 SoCs
- A new Freescale i.mx6ul subdriver.
Cleanup:
- Remove platform data support in a number of SH PFC subdrivers"
* tag 'pinctrl-v4.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (95 commits)
pinctrl: at91: fix null pointer dereference
pinctrl: mediatek: Implement wake handler and suspend resume
pinctrl: mediatek: Fix multiple registration issue.
pinctrl: sh-pfc: r8a7794: add USB pin groups
pinctrl: at91: Use generic irq_{request,release}_resources()
pinctrl: cherryview: Use raw_spinlock for locking
pinctrl: baytrail: Use raw_spinlock for locking
pinctrl: imx6ul: Remove .owner field
pinctrl: zynq: Fix typos in smc0_nand_grp and smc0_nor_grp
pinctrl: sh-pfc: Implement pinconf power-source param for voltage switching
clk: rockchip: add pclk_pd_pmu to the list of rk3288 critical clocks
pinctrl: sun4i: add spdif to pin description.
pinctrl: atlas7: clear ugly branch statements for pull and drivestrength
pinctrl: baytrail: Serialize all register access
pinctrl: baytrail: Drop FSF mailing address
pinctrl: rockchip: only enable gpio clock when it setting
pinctrl/mediatek: fix spelling mistake in dev_err error message
pinctrl: cherryview: Serialize all register access
pinctrl: UniPhier: PH1-Pro5: add I2C ch6 pin-mux setting
pinctrl: nomadik: reflect current input value
...
Core changes:
- Root out the wrapper devm_gpiod_get() and gpiod_get() etc
versions of the descriptor calls that did not use the flags
argument on the end. This was around for too long and eventually
Uwe Kleine-König took the time to clean it out and the last
users are removed along with the macros in this tag. In several
cases the use of flags simplifies the code. For this reason we
have (ACKed) patches hitting in DRM, IIO, media, NFC, USB+PHY
up until we hammer in the nail with removing the macros.
- Add a fat document describing how much ready-made GPIO stuff
we have i the kernel to discourage people from reinventing
a square wheel in userspace, as so often happens.
- Create a separate lockdep class for each instance of a GPIO
IRQ chip instead of using one class for all chips, as the current
code will not work with systems with several GPIO chips doing
lockdep debugging.
- Protect against driver unloading also when a GPIO line is only
used as IRQ for the GPIOLIB_IRQCHIP helpers.
- If the GPIO chip has no designated owner, assign the parent
device driver owner as owner.
- Consolidation of chained IRQ handler install/remove replacing
all call sites where irq_set_handler_data() and
irq_set_chained_handler() were done in succession with a
combined call to irq_set_chained_handler_and_data(). This
series was created by Thomas Gleixner after the problem was
observed by Russell King.
- Tglx also made another series of patches switching
__irq_set_handler_locked() for irq_set_handler_locked() which
is way cleaner.
- Tglx and Jiang Liu wrote a good bunch of patches to make use of
irq_desc_get_xxx() accessors and avoid looking up irq_descs
from IRQ numbers. The goal is to get rid of the irq number
from the handlers in the IRQ flow which is nice.
- Rob Herring killed off the set_irq_flags() for all GPIO
drivers. This was an ARM specific function that is replaced
with the generic irq_modify_status() where special flags
are actually needed.
- When an OF node has a pin range for its GPIOs, return
-EPROBE_DEFER if the pin controller isn't available.
Pretty logical, yet needed to be fixed.
- If a driver using GPIOLIB_IRQCHIP has its own
irq_*_resources call back, then call these instead of the
defaults provided by the GPIOLIB.
- Fix an undocumented ABI hole: named GPIOs were not
properly documented.
Driver improvements:
- Add get_direction() support to the generic GPIO driver, it's
strange that we didn't have that before.
- Make it possible to have input-only GPIO chips using the
generic GPIO driver.
- Clean out platform data support from the Emma Mobile (EM)
driver
- Finegrained runtime PM support for the RCAR driver.
- Support r8a7795 (R-car H3) in the RCAR driver.
- Support interrupts on GPIOs 16 thru 31 in the DaVinci driver.
- Some consolidation and new support in the MPC8xxx driver,
we now support MPC5125.
- Preempt-RT-friendly patches: the OMAP, MPC8xxx, drivers uses raw
spinlocks making it work better with the realime patches.
- Interrupt support for the EXTRAXFS GPIO driver.
- Make the ETRAXFS GPIO driver support also ARTPEC-3.
- Interrupt and wakeup support for the BRCMSTB driver, also for
wakeup from S5 cold boot.
- Mask MXC IRQs during suspend.
- Improve OMAP2 GPIO set_debounce() to work according to spec.
- The VF610 driver handles IRQs properly.
New drivers:
- ZTE ZX GPIO driver.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJV52lvAAoJEEEQszewGV1zOVcP/2ZgkfRgl119LZnWShfrEJWq
UXaNzSaPNgvDzvGaqqi62SZQuhrdIWRKfMPtAuMGbEn5aJx0JC5UAOYjjfkKBqpO
toqc1w2DScc0JTorY8qgczIBDO1A3ZBAcIvXXpOduy/JaKPoQteRN8WYTynPw48/
0+97ZODwhyOkfeqmvUClkc9gW4XT68dudb0Lv1nQjsZmd1dHF2PZlwH3aL9sV68j
GJAqf09xNqZaWWQBhs+J3ptsYjaJfYjo9NOOUf0Y/UgqXO3vB+2S4EmRATaRHS2F
aHdj03sNZCNSDEa35WwetbLRGxPzSWmfxmBzQQ1baGdcJICn7Yv58EklPKRvwtMo
ZpUsgiOV4OUIEClPJohs4xbl2HRsOYB3VbcihkXjVAxS6i2/jgA3Tn8ATvUSZ8wq
TX8D6BfciigRCkT2G+B0TQBmcX1IQsMd1DBUNfw7Dk1TK/vxH4UYWbke422RjKGz
ORJ+0DfShMCdYjrCVlt7UbFcqE3L5CnrztLQ3oFt0om2JsSWztV9V579G+Dqo9CI
fE4G3xlsF33UCvXcmnOp6PuU+ZYBodLggkmK4REy2D3LCOnkcKq0U8Fj5RssApZ9
FdqVYck555ZpcBiN8ihB97WsmU+0XhBjblCbgzr6GxUw8EJ4x8H9nlraA6bluFoP
9c2qgPxjCq/VWA/F0YOU
=iQ2P
-----END PGP SIGNATURE-----
Merge tag 'gpio-v4.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio
Pull GPIO updates from Linus Walleij:
"This is the bulk of GPIO changes for the v4.3 kernel cycle.
There is quite a lot going on in the GPIO subsystem this merge window,
so the main matter is decribed below.
The hits in other subsystems when making the GPIO flags optional are
all ACKed by their respective subsystem maintainers.
Core changes:
- Root out the wrapper devm_gpiod_get() and gpiod_get() etc versions
of the descriptor calls that did not use the flags argument on the
end. This was around for too long and eventually Uwe Kleine-König
took the time to clean it out and the last users are removed along
with the macros in this tag. In several cases the use of flags
simplifies the code. For this reason we have (ACKed) patches
hitting in DRM, IIO, media, NFC, USB+PHY up until we hammer in the
nail with removing the macros.
- Add a fat document describing how much ready-made GPIO stuff we
have i the kernel to discourage people from reinventing a square
wheel in userspace, as so often happens.
- Create a separate lockdep class for each instance of a GPIO IRQ
chip instead of using one class for all chips, as the current code
will not work with systems with several GPIO chips doing lockdep
debugging.
- Protect against driver unloading also when a GPIO line is only used
as IRQ for the GPIOLIB_IRQCHIP helpers.
- If the GPIO chip has no designated owner, assign the parent device
driver owner as owner.
- Consolidation of chained IRQ handler install/remove replacing all
call sites where irq_set_handler_data() and
irq_set_chained_handler() were done in succession with a combined
call to irq_set_chained_handler_and_data().
This series was created by Thomas Gleixner after the problem was
observed by Russell King.
- Tglx also made another series of patches switching
__irq_set_handler_locked() for irq_set_handler_locked() which is
way cleaner.
- Tglx and Jiang Liu wrote a good bunch of patches to make use of
irq_desc_get_xxx() accessors and avoid looking up irq_descs from
IRQ numbers. The goal is to get rid of the irq number from the
handlers in the IRQ flow which is nice.
- Rob Herring killed off the set_irq_flags() for all GPIO drivers.
This was an ARM specific function that is replaced with the generic
irq_modify_status() where special flags are actually needed.
- When an OF node has a pin range for its GPIOs, return -EPROBE_DEFER
if the pin controller isn't available. Pretty logical, yet needed
to be fixed.
- If a driver using GPIOLIB_IRQCHIP has its own irq_*_resources call
back, then call these instead of the defaults provided by the
GPIOLIB.
- Fix an undocumented ABI hole: named GPIOs were not properly
documented.
Driver improvements:
- Add get_direction() support to the generic GPIO driver, it's
strange that we didn't have that before.
- Make it possible to have input-only GPIO chips using the generic
GPIO driver.
- Clean out platform data support from the Emma Mobile (EM) driver
- Finegrained runtime PM support for the RCAR driver.
- Support r8a7795 (R-car H3) in the RCAR driver.
- Support interrupts on GPIOs 16 thru 31 in the DaVinci driver.
- Some consolidation and new support in the MPC8xxx driver, we now
support MPC5125.
- Preempt-RT-friendly patches: the OMAP, MPC8xxx, drivers uses raw
spinlocks making it work better with the realime patches.
- Interrupt support for the EXTRAXFS GPIO driver.
- Make the ETRAXFS GPIO driver support also ARTPEC-3.
- Interrupt and wakeup support for the BRCMSTB driver, also for
wakeup from S5 cold boot.
- Mask MXC IRQs during suspend.
- Improve OMAP2 GPIO set_debounce() to work according to spec.
- The VF610 driver handles IRQs properly.
New drivers:
- ZTE ZX GPIO driver"
* tag 'gpio-v4.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: (87 commits)
Revert "gpio: extraxfs: fix returnvar.cocci warnings"
gpio: tc3589x: use static container helper
gpio: xlp: fix error return code
gpio: vf610: handle level IRQ's properly
gpio: max732x: Fix error handling in probe()
gpio: omap: fix clk_prepare/unprepare usage
gpio: omap: protect regs access in omap_gpio_irq_handler
gpio: omap: fix omap2_set_gpio_debounce
gpio: omap: switch to use platform_get_irq
gpio: omap: remove wrong irq_domain_remove usage in probe
gpiolib: add description for gpio irqchip fields in struct gpio_chip
gpio: extraxfs: fix returnvar.cocci warnings
gpiolib: irqchip: use different lockdep class for each gpio irqchip
gpio/grgpio: fix deadlock in grgpio_irq_unmap()
Documentation: gpio: consumer: describe active low property
gpio: mxc: fix section mismatch warning
gpio/mxc: mask gpio interrupts in suspend
gpio: omap: Fix missing raw locks conversion
gpio: brcmstb: support wakeup from S5 cold boot
gpio: brcmstb: Add interrupt and wakeup source support
...
This reverts commit 8cd90e50d1 as with
this patch the serial console is broken on lots of platforms.
Reported-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Jun Nie <jun.nie@linaro.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Tested-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pull tile updates from Chris Metcalf:
"This includes secure computing support as well as miscellaneous minor
improvements"
* git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
tile: correct some typos in opcode type names
tile/vdso: emit a GNU hash as well
tile: Remove finish_arch_switch
tile: enable full SECCOMP support
tile/time: Migrate to new 'set-state' interface
HT TDLS traffic should be protected in a non-HT BSS to avoid
collisions. Therefore, when TDLS peers join/leave, check if
protection is (now) needed and set the ht_operation_mode of
the virtual interface according to the HT capabilities of the
TDLS peer(s).
This works because a non-HT BSS connection never sets (or
otherwise uses) the ht_operation_mode; it just means that
drivers must be aware that this field applies to all HT
traffic for this virtual interface, not just the traffic
within the BSS. Document that.
Signed-off-by: Avri Altman <avri.altman@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Pull MIPS updates from Ralf Baechle:
"This is the main pull request for 4.3 for MIPS. Here's the summary:
Three fixes that didn't make 4.2-stable:
- a -Os build might compile the kernel using the MIPS16 instruction
set but the R2 optimized inline functions in <uapi/asm/swab.h> are
implemented using 32-bit wide instructions which is invalid.
- a build error in pgtable-bits.h for a particular kernel
configuration.
- accessing registers of the CM GCR might have been compiled to use
64 bit accesses but these registers are onl 32 bit wide.
And also a few new bits:
- move the ATH79 GPIO driver to drivers/gpio
- the definition of IRQCHIP_DECLARE has moved to linux/irqchip.h,
change ATH79 accordingly.
- fix definition of pgprot_writecombine
- add an implementation of dma_map_ops.mmap
- fix alignment of quiet build output for vmlinuz link
- BCM47xx: Use kmemdup rather than duplicating its implementation
- Netlogic: Fix 0x0x prefixes of constants.
- merge Bjorn Helgaas' series to remove most of the weak keywords
from function declarations.
- CP0 and CP1 registers are best considered treated as unsigned
values to avoid large values from becoming negative values.
- improve support for the MIPS GIC timer.
- enable common clock framework for Malta and SEAD3.
- a number of improvments and fixes to dump_tlb().
- document the MIPS TLB dump functionality in Magic SysRq.
- Cavium Octeon CN68XX improvments.
- NetLogic improvments.
- irq: Use access helper irq_data_get_affinity_mask.
- handle MSA unaligned accesses.
- a number of R6-related math-emu fixes.
- support for I6400.
- improvments to MSA support.
- add uprobes support.
- move from deprecated __initcall to arch_initcall.
- remove finish_arch_switch().
- IRQ cleanups by Thomas Gleixner.
- migrate to new 'set-state' interface.
- random small cleanups"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (148 commits)
MIPS: UAPI: Fix unrecognized opcode WSBH/DSBH/DSHD when using MIPS16.
MIPS: Fix alignment of quiet build output for vmlinuz link
MIPS: math-emu: Remove unused handle_dsemul function declaration
MIPS: math-emu: Add support for the MIPS R6 MAX{, A} FPU instruction
MIPS: math-emu: Add support for the MIPS R6 MIN{, A} FPU instruction
MIPS: math-emu: Add support for the MIPS R6 CLASS FPU instruction
MIPS: math-emu: Add support for the MIPS R6 RINT FPU instruction
MIPS: math-emu: Add support for the MIPS R6 MSUBF FPU instruction
MIPS: math-emu: Add support for the MIPS R6 MADDF FPU instruction
MIPS: math-emu: Add support for the MIPS R6 SELNEZ FPU instruction
MIPS: math-emu: Add support for the MIPS R6 SELEQZ FPU instruction
MIPS: math-emu: Add support for the CMP.condn.fmt R6 instruction
MIPS: inst.h: Add new MIPS R6 FPU opcodes
MIPS: Octeon: Fix management port MII address on Kontron S1901
MIPS: BCM47xx: Use kmemdup rather than duplicating its implementation
STAGING: Octeon: Use common helpers for determining interface and port
MIPS: Octeon: Support interfaces 4 and 5
MIPS: Octeon: Set up 1:1 mapping between CN68XX PKO queues and ports
MIPS: Octeon: Initialize CN68XX PKO
STAGING: Octeon: Support CN68XX style WQE
...
- Support "hybrid" iommu/direct DMA ops for coherent_mask < dma_mask from Benjamin Herrenschmidt
- EEH fixes for SRIOV from Gavin
- Introduce rtas_get_sensor_fast() for IRQ handlers from Thomas Huth
- Use hardware RNG for arch_get_random_seed_* not arch_get_random_* from Paul Mackerras
- Seccomp filter support from Michael Ellerman
- opal_cec_reboot2() handling for HMIs & machine checks from Mahesh Salgaonkar
- Add powerpc timebase as a trace clock source from Naveen N. Rao
- Misc cleanups in the xmon, signal & SLB code from Anshuman Khandual
- Add an inline function to update POWER8 HID0 from Gautham R. Shenoy
- Fix pte_pagesize_index() crash on 4K w/64K hash from Michael Ellerman
- Drop support for 64K local store on 4K kernels from Michael Ellerman
- move dma_get_required_mask() from pnv_phb to pci_controller_ops from Andrew Donnellan
- Initialize distance lookup table from drconf path from Nikunj A Dadhania
- Enable RTC class support from Vaibhav Jain
- Disable automatically blocked PCI config from Gavin Shan
- Add LEDs driver for PowerNV platform from Vasant Hegde
- Fix endianness issues in the HVSI driver from Laurent Dufour
- Kexec endian fixes from Samuel Mendoza-Jonas
- Fix corrupted pdn list from Gavin Shan
- Fix fenced PHB caused by eeh_slot_error_detail() from Gavin Shan
- Freescale updates from Scott: Highlights include 32-bit memcpy/memset
optimizations, checksum optimizations, 85xx config fragments and updates,
device tree updates, e6500 fixes for non-SMP, and misc cleanup and minor
fixes.
- A ton of cxl updates & fixes:
- Add explicit precision specifiers from Rasmus Villemoes
- use more common format specifier from Rasmus Villemoes
- Destroy cxl_adapter_idr on module_exit from Johannes Thumshirn
- Destroy afu->contexts_idr on release of an afu from Johannes Thumshirn
- Compile with -Werror from Daniel Axtens
- EEH support from Daniel Axtens
- Plug irq_bitmap getting leaked in cxl_context from Vaibhav Jain
- Add alternate MMIO error handling from Ian Munsie
- Allow release of contexts which have been OPENED but not STARTED from Andrew Donnellan
- Remove use of macro DEFINE_PCI_DEVICE_TABLE from Vaishali Thakkar
- Release irqs if memory allocation fails from Vaibhav Jain
- Remove racy attempt to force EEH invocation in reset from Daniel Axtens
- Fix + cleanup error paths in cxl_dev_context_init from Ian Munsie
- Fix force unmapping mmaps of contexts allocated through the kernel api from Ian Munsie
- Set up and enable PSL Timebase from Philippe Bergheaud
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJV5+GzAAoJEFHr6jzI4aWA0iAP/jcd0kNaNBzLgcDKKygKdgz4
xn4EWu81vfMfZYWesb0ATrjlH0hLsRxSXoFUqUMhtJTa5kNAoCIaz/M8WBALS50h
aT+i7br4WEU2j2FcaMyP3iAZx/2hl+2utODJSHPRWPkec1fUDBfEyBf++e520RWM
HUQGIGZXh8yq7KMA96Pwhsvls9vOB8hS2UdU/NS8ff3J5jFvXC1/WmF2qfzJBS1V
8iHyz26Jl8+dJ+et7iC2oD5XQAjIH1oJgOyPVPBzAQttfi8RjuVzRA30TfPBAUwI
lC9nlmPy6bCe4kiQYWVB1z7GegHyW/9vkeuMj/u8mZbqpaayMEMZmd2C3iNDXNHx
i2NSvdln539t4qWYsV2v6lVCfa/ayDHD73Wackj5Dk394tzXnpCPhxNzc2yKEd5v
h7vwYc9jBhsbfSCSogaM+gSHJ1APgCidggHJMYYNA2nN2u6V62RpsMB7zp/1+Q2v
yqYdD8oYF4Dm21x/ujaNFrlizROD46WS0UqdJ3yP6HAqRYIpRXtibmpECJgt1n5h
HjADEci4hQ2UQxdMdp/Q5KZnPTJebBtrZrmkW5r6cZBUaTB5TVkFaEWN44CT/Loh
tMNeA3qOBN06CaQS2WL3UUUWpbZq9fSbWuUZ5lWZDb5AOyRxe5eWVYNLkiyIXozY
L24l1bYdBhXahnjoS/kc
=n9+X
-----END PGP SIGNATURE-----
Merge tag 'powerpc-4.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
- support "hybrid" iommu/direct DMA ops for coherent_mask < dma_mask
from Benjamin Herrenschmidt
- EEH fixes for SRIOV from Gavin
- introduce rtas_get_sensor_fast() for IRQ handlers from Thomas Huth
- use hardware RNG for arch_get_random_seed_* not arch_get_random_*
from Paul Mackerras
- seccomp filter support from Michael Ellerman
- opal_cec_reboot2() handling for HMIs & machine checks from Mahesh
Salgaonkar
- add powerpc timebase as a trace clock source from Naveen N. Rao
- misc cleanups in the xmon, signal & SLB code from Anshuman Khandual
- add an inline function to update POWER8 HID0 from Gautham R. Shenoy
- fix pte_pagesize_index() crash on 4K w/64K hash from Michael Ellerman
- drop support for 64K local store on 4K kernels from Michael Ellerman
- move dma_get_required_mask() from pnv_phb to pci_controller_ops from
Andrew Donnellan
- initialize distance lookup table from drconf path from Nikunj A
Dadhania
- enable RTC class support from Vaibhav Jain
- disable automatically blocked PCI config from Gavin Shan
- add LEDs driver for PowerNV platform from Vasant Hegde
- fix endianness issues in the HVSI driver from Laurent Dufour
- kexec endian fixes from Samuel Mendoza-Jonas
- fix corrupted pdn list from Gavin Shan
- fix fenced PHB caused by eeh_slot_error_detail() from Gavin Shan
- Freescale updates from Scott: Highlights include 32-bit memcpy/memset
optimizations, checksum optimizations, 85xx config fragments and
updates, device tree updates, e6500 fixes for non-SMP, and misc
cleanup and minor fixes.
- a ton of cxl updates & fixes:
- add explicit precision specifiers from Rasmus Villemoes
- use more common format specifier from Rasmus Villemoes
- destroy cxl_adapter_idr on module_exit from Johannes Thumshirn
- destroy afu->contexts_idr on release of an afu from Johannes
Thumshirn
- compile with -Werror from Daniel Axtens
- EEH support from Daniel Axtens
- plug irq_bitmap getting leaked in cxl_context from Vaibhav Jain
- add alternate MMIO error handling from Ian Munsie
- allow release of contexts which have been OPENED but not STARTED
from Andrew Donnellan
- remove use of macro DEFINE_PCI_DEVICE_TABLE from Vaishali Thakkar
- release irqs if memory allocation fails from Vaibhav Jain
- remove racy attempt to force EEH invocation in reset from Daniel
Axtens
- fix + cleanup error paths in cxl_dev_context_init from Ian Munsie
- fix force unmapping mmaps of contexts allocated through the kernel
api from Ian Munsie
- set up and enable PSL Timebase from Philippe Bergheaud
* tag 'powerpc-4.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (140 commits)
cxl: Set up and enable PSL Timebase
cxl: Fix force unmapping mmaps of contexts allocated through the kernel api
cxl: Fix + cleanup error paths in cxl_dev_context_init
powerpc/eeh: Fix fenced PHB caused by eeh_slot_error_detail()
powerpc/pseries: Cleanup on pci_dn_reconfig_notifier()
powerpc/pseries: Fix corrupted pdn list
powerpc/powernv: Enable LEDS support
powerpc/iommu: Set default DMA offset in dma_dev_setup
cxl: Remove racy attempt to force EEH invocation in reset
cxl: Release irqs if memory allocation fails
cxl: Remove use of macro DEFINE_PCI_DEVICE_TABLE
powerpc/powernv: Fix mis-merge of OPAL support for LEDS driver
powerpc/powernv: Reset HILE before kexec_sequence()
powerpc/kexec: Reset secondary cpu endianness before kexec
powerpc/hvsi: Fix endianness issues in the HVSI driver
leds/powernv: Add driver for PowerNV platform
powerpc/powernv: Create LED platform device
powerpc/powernv: Add OPAL interfaces for accessing and modifying system LED states
powerpc/powernv: Fix the log message when disabling VF
cxl: Allow release of contexts which have been OPENED but not STARTED
...
Pull ARM development updates from Russell King:
"Included in this update:
- moving PSCI code from ARM64/ARM to drivers/
- removal of some architecture internals from global kernel view
- addition of software based "privileged no access" support using the
old domains register to turn off the ability for kernel
loads/stores to access userspace. Only the proper accessors will
be usable.
- addition of early fixup support for early console
- re-addition (and reimplementation) of OMAP special interconnect
barrier
- removal of finish_arch_switch()
- only expose cpuX/online in sysfs if hotpluggable
- a number of code cleanups"
* 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: (41 commits)
ARM: software-based priviledged-no-access support
ARM: entry: provide uaccess assembly macro hooks
ARM: entry: get rid of multiple macro definitions
ARM: 8421/1: smp: Collapse arch_cpu_idle_dead() into cpu_die()
ARM: uaccess: provide uaccess_save_and_enable() and uaccess_restore()
ARM: mm: improve do_ldrd_abort macro
ARM: entry: ensure that IRQs are enabled when calling syscall_trace_exit()
ARM: entry: efficiency cleanups
ARM: entry: get rid of asm_trace_hardirqs_on_cond
ARM: uaccess: simplify user access assembly
ARM: domains: remove DOMAIN_TABLE
ARM: domains: keep vectors in separate domain
ARM: domains: get rid of manager mode for user domain
ARM: domains: move initial domain setting value to asm/domains.h
ARM: domains: provide domain_mask()
ARM: domains: switch to keeping domain value in register
ARM: 8419/1: dma-mapping: harmonize definition of DMA_ERROR_CODE
ARM: 8417/1: refactor bitops functions with BIT_MASK() and BIT_WORD()
ARM: 8416/1: Feroceon: use of_iomap() to map register base
ARM: 8415/1: early fixmap support for earlycon
...
Pull locking and atomic updates from Ingo Molnar:
"Main changes in this cycle are:
- Extend atomic primitives with coherent logic op primitives
(atomic_{or,and,xor}()) and deprecate the old partial APIs
(atomic_{set,clear}_mask())
The old ops were incoherent with incompatible signatures across
architectures and with incomplete support. Now every architecture
supports the primitives consistently (by Peter Zijlstra)
- Generic support for 'relaxed atomics':
- _acquire/release/relaxed() flavours of xchg(), cmpxchg() and {add,sub}_return()
- atomic_read_acquire()
- atomic_set_release()
This came out of porting qwrlock code to arm64 (by Will Deacon)
- Clean up the fragile static_key APIs that were causing repeat bugs,
by introducing a new one:
DEFINE_STATIC_KEY_TRUE(name);
DEFINE_STATIC_KEY_FALSE(name);
which define a key of different types with an initial true/false
value.
Then allow:
static_branch_likely()
static_branch_unlikely()
to take a key of either type and emit the right instruction for the
case. To be able to know the 'type' of the static key we encode it
in the jump entry (by Peter Zijlstra)
- Static key self-tests (by Jason Baron)
- qrwlock optimizations (by Waiman Long)
- small futex enhancements (by Davidlohr Bueso)
- ... and misc other changes"
* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (63 commits)
jump_label/x86: Work around asm build bug on older/backported GCCs
locking, ARM, atomics: Define our SMP atomics in terms of _relaxed() operations
locking, include/llist: Use linux/atomic.h instead of asm/cmpxchg.h
locking/qrwlock: Make use of _{acquire|release|relaxed}() atomics
locking/qrwlock: Implement queue_write_unlock() using smp_store_release()
locking/lockref: Remove homebrew cmpxchg64_relaxed() macro definition
locking, asm-generic: Add _{relaxed|acquire|release}() variants for 'atomic_long_t'
locking, asm-generic: Rework atomic-long.h to avoid bulk code duplication
locking/atomics: Add _{acquire|release|relaxed}() variants of some atomic operations
locking, compiler.h: Cast away attributes in the WRITE_ONCE() magic
locking/static_keys: Make verify_keys() static
jump label, locking/static_keys: Update docs
locking/static_keys: Provide a selftest
jump_label: Provide a self-test
s390/uaccess, locking/static_keys: employ static_branch_likely()
x86, tsc, locking/static_keys: Employ static_branch_likely()
locking/static_keys: Add selftest
locking/static_keys: Add a new static_key interface
locking/static_keys: Rework update logic
locking/static_keys: Add static_key_{en,dis}able() helpers
...
Pull f2fs updates from Jaegeuk Kim:
"The major work includes fixing and enhancing the existing extent_cache
feature, which has been well settling down so far and now it becomes a
default mount option accordingly.
Also, this version newly registers a f2fs memory shrinker to reclaim
several objects consumed by a couple of data structures in order to
avoid memory pressures.
Another new feature is to add ioctl(F2FS_GARBAGE_COLLECT) which
triggers a cleaning job explicitly by users.
Most of the other patches are to fix bugs occurred in the corner cases
across the whole code area"
* tag 'for-f2fs-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (85 commits)
f2fs: upset segment_info repair
f2fs: avoid accessing NULL pointer in f2fs_drop_largest_extent
f2fs: update extent tree in batches
f2fs: fix to release inode correctly
f2fs: handle f2fs_truncate error correctly
f2fs: avoid unneeded initializing when converting inline dentry
f2fs: atomically set inode->i_flags
f2fs: fix wrong pointer access during try_to_free_nids
f2fs: use __GFP_NOFAIL to avoid infinite loop
f2fs: lookup neighbor extent nodes for merging later
f2fs: split __insert_extent_tree_ret for readability
f2fs: kill dead code in __insert_extent_tree
f2fs: adjust showing of extent cache stat
f2fs: add largest/cached stat in extent cache
f2fs: fix incorrect mapping for bmap
f2fs: add annotation for space utilization of regular/inline dentry
f2fs: fix to update cached_en of extent tree properly
f2fs: fix typo
f2fs: check the node block address of newly allocated nid
f2fs: go out for insert_inode_locked failure
...
When flushing queued messages in run-to-completion mode,
smi_event_handler() is recursively called.
flush_messages()
smi_event_handler()
handle_transaction_done()
deliver_recv_msg()
ipmi_smi_msg_received()
smi_recv_tasklet()
sender()
flush_messages()
smi_event_handler()
...
The depth of the recursive call depends on the number of queued
messages, so it can cause a stack overflow if many messages have
been queued.
To solve this problem, this patch removes flush_messages()
from sender()@ipmi_si_intf.c. Instead, add flush_messages() to
caller side of sender() if needed. Additionally, to implement this,
add new handler flush_messages to struct ipmi_smi_handlers.
Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Fixed up a comment and some spacing issues.
Signed-off-by: Corey Minyard <cminyard@mvista.com>
This set mainly includes a change to the way the
dlm uses the SCTP API in the kernel, removing the
direct dependency on the sctp module. Other odd
SCTP-related fixes are also included. The other
notable fix is for a long standing regression in
the behavior of lock value blocks for user space
locks.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJV5HwZAAoJEDgbc8f8gGmqoaQP/iz5zgKSjX0mOC3fz8BqXISk
85cKLPfsf0avDmGx6nkKp5wsmVDYkfrObkocvf7bOcemAuycuOmr9y22ZscNaAWM
vKLhTJQ0koAlZqhJmJx45w318BFY03RdDQmVKUnQHza9Ed7Uoa0CyR6jyuwBTuMP
gA9O6i6CezodtB8CLPySJa2znlt50CptLaJKj1V9/xCpBh7orwpihv4pBz8oH1lR
JXRj9hNEFy2+vk8Pce14fKmHgUROg5+y1V7jZeetpCbTxAAFOeFOL6EH28eWssbQ
YoWofcPugmOs9BDbnVZHf6+Y5xIaoiIylb2Q4/me4rjQfSmaiDbTZyqB4TtFrldF
BngaAJipmLQu8ELqQmwEMhZTAc/GsB60x1EcjrPVTKbW7pwsfVp2fPVV92a7koQe
prmz5rh8HCenrWuy3d4/EP7K+E4+W98ZXsDuym4pBNaoYwCPyvtWLa8kSqAdx47J
MNk/ak9ktP2NxsCs+EjCmP2hn2r+RTio6R2uCtKB2pdclfqOupIsYZkVdZERK5Ch
5+ALeVjHfxswFVRxGjbPQRs9x8ZclBydceAHgYbLQ2xDGRvTpQhnIyNLRXsZnkrD
t4mTokZG/GGgmWOscZ5nXOOGZt8SpX+UkICWWWbuy3dxuOK6al3lVeBcC0KW5Pki
KNHzcKrlGJJnCVr0nWTU
=iYRu
-----END PGP SIGNATURE-----
Merge tag 'dlm-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm
Pull dlm updates from David Teigland:
"This set mainly includes a change to the way the dlm uses the SCTP API
in the kernel, removing the direct dependency on the sctp module.
Other odd SCTP-related fixes are also included.
The other notable fix is for a long standing regression in the
behavior of lock value blocks for user space locks"
* tag 'dlm-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm:
dlm: print error from kernel_sendpage
dlm: fix lvb copy for user locks
dlm: sctp_accept_from_sock() can be static
dlm: fix reconnecting but not sending data
dlm: replace BUG_ON with a less severe handling
dlm: use sctp 1-to-1 API
dlm: fix not reconnecting on connecting error handling
dlm: fix race while closing connections
dlm: fix connection stealing if using SCTP
features and other churn going into 4.2.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQEcBAABCAAGBQJV55TlAAoJEPL5WVaVDYGjyzYH/1WtZpIzRjp7o+3H4/vFqONg
R1Fsw785C1w8WX2QuIK/m31u4XO+VeCV4jWA9DuqnSzWm9w9C/4kTqITd4Hwp416
/9gJvYoZHHaDikxpWWADptDi8IoLohTlcFVCHIvvf53cGehVEEsc2WOijUZo7Cgv
O454Nm3tK0CQ3yrCIlf5SyvkUZSMTiawLLJJzd4GCyvU13C1SnABNQj8UxKisBA5
cP8q4O2nPg/S9rkYxnFAifQyZppd3jMvorUaq9eHiWMjl95o6e/6+wYGnHhoFUvr
/P1dNjJYbzk+TUzlsDkq2zANK2UsB3iNNi8YwLFOpfFcuYopmUAYRIWOgIZWYUQ=
=ijuI
-----END PGP SIGNATURE-----
Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 updates from Ted Ts'o:
"Pretty much all bug fixes and clean ups for 4.3, after a lot of
features and other churn going into 4.2"
* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
Revert "ext4: remove block_device_ejected"
ext4: ratelimit the file system mounted message
ext4: silence a format string false positive
ext4: simplify some code in read_mmp_block()
ext4: don't manipulate recovery flag when freezing no-journal fs
jbd2: limit number of reserved credits
ext4 crypto: remove duplicate header file
ext4: update c/mtime on truncate up
jbd2: avoid infinite loop when destroying aborted journal
ext4, jbd2: add REQ_FUA flag when recording an error in the superblock
ext4 crypto: fix spelling typo in comment
ext4 crypto: exit cleanly if ext4_derive_key_aes() fails
ext4: reject journal options for ext2 mounts
ext4: implement cgroup writeback support
ext4: replace ext4_io_submit->io_op with ->io_wbc
ext4 crypto: check for too-short encrypted file names
ext4 crypto: use a jbd2 transaction when adding a crypto policy
jbd2: speedup jbd2_journal_dirty_metadata()
When the hfi1 driver was added these definitions were moved from the qib driver
to ib_mad.h to be used by both qib and hfi1. They should have been moved to
ib_smi.h instead.
Fixes: d4ab347005 ("IB/core: Add core header changes needed for OPA")
Reviewed-by: Hal Rosenstock <hal@mellanox.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Remove the unused IB_NOTICE_REPRESS_* defines.
When the hfi1 driver was added these definitions were moved from the qib driver
to ib_mad.h. They should have been removed instead.
Fixes: d4ab347005 ("IB/core: Add core header changes needed for OPA")
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Hal Rosenstock <hal@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
When the hfi1 driver was added a user space header file (hfi1_user.h) was added
to be shared between PSM2 and the driver. However, the file was not added to
the header install. Add it now.
Fixes: d4ab347005 ("IB/core: Add core header changes needed for OPA")
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Pull ext3 removal, quota & udf fixes from Jan Kara:
"The biggest change in the pull is the removal of ext3 filesystem
driver (~28k lines removed). Ext4 driver is a full featured
replacement these days and both RH and SUSE use it for several years
without issues. Also there are some workarounds in VM & block layer
mainly for ext3 which we could eventually get rid of.
Other larger change is addition of proper error handling for
dquot_initialize(). The rest is small fixes and cleanups"
[ I wasn't convinced about the ext3 removal and worried about things
falling through the cracks for legacy users, but ext4 maintainers
piped up and were all unanimously in favor of removal, and maintaining
all legacy ext3 support inside ext4. - Linus ]
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
udf: Don't modify filesystem for read-only mounts
quota: remove an unneeded condition
ext4: memory leak on error in ext4_symlink()
mm/Kconfig: NEED_BOUNCE_POOL: clean-up condition
ext4: Improve ext4 Kconfig test
block: Remove forced page bouncing under IO
fs: Remove ext3 filesystem driver
doc: Update doc about journalling layer
jfs: Handle error from dquot_initialize()
reiserfs: Handle error from dquot_initialize()
ocfs2: Handle error from dquot_initialize()
ext4: Handle error from dquot_initialize()
ext2: Handle error from dquot_initalize()
quota: Propagate error from ->acquire_dquot()
Pull networking updates from David Miller:
"Another merge window, another set of networking changes. I've heard
rumblings that the lightweight tunnels infrastructure has been voted
networking change of the year. But what do I know?
1) Add conntrack support to openvswitch, from Joe Stringer.
2) Initial support for VRF (Virtual Routing and Forwarding), which
allows the segmentation of routing paths without using multiple
devices. There are some semantic kinks to work out still, but
this is a reasonably strong foundation. From David Ahern.
3) Remove spinlock fro act_bpf fast path, from Alexei Starovoitov.
4) Ignore route nexthops with a link down state in ipv6, just like
ipv4. From Andy Gospodarek.
5) Remove spinlock from fast path of act_gact and act_mirred, from
Eric Dumazet.
6) Document the DSA layer, from Florian Fainelli.
7) Add netconsole support to bcmgenet, systemport, and DSA. Also
from Florian Fainelli.
8) Add Mellanox Switch Driver and core infrastructure, from Jiri
Pirko.
9) Add support for "light weight tunnels", which allow for
encapsulation and decapsulation without bearing the overhead of a
full blown netdevice. From Thomas Graf, Jiri Benc, and a cast of
others.
10) Add Identifier Locator Addressing support for ipv6, from Tom
Herbert.
11) Support fragmented SKBs in iwlwifi, from Johannes Berg.
12) Allow perf PMUs to be accessed from eBPF programs, from Kaixu Xia.
13) Add BQL support to 3c59x driver, from Loganaden Velvindron.
14) Stop using a zero TX queue length to mean that a device shouldn't
have a qdisc attached, use an explicit flag instead. From Phil
Sutter.
15) Use generic geneve netdevice infrastructure in openvswitch, from
Pravin B Shelar.
16) Add infrastructure to avoid re-forwarding a packet in software
that was already forwarded by a hardware switch. From Scott
Feldman.
17) Allow AF_PACKET fanout function to be implemented in a bpf
program, from Willem de Bruijn"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1458 commits)
netfilter: nf_conntrack: make nf_ct_zone_dflt built-in
netfilter: nf_dup{4, 6}: fix build error when nf_conntrack disabled
net: fec: clear receive interrupts before processing a packet
ipv6: fix exthdrs offload registration in out_rt path
xen-netback: add support for multicast control
bgmac: Update fixed_phy_register()
sock, diag: fix panic in sock_diag_put_filterinfo
flow_dissector: Use 'const' where possible.
flow_dissector: Fix function argument ordering dependency
ixgbe: Resolve "initialized field overwritten" warnings
ixgbe: Remove bimodal SR-IOV disabling
ixgbe: Add support for reporting 2.5G link speed
ixgbe: fix bounds checking in ixgbe_setup_tc for 82598
ixgbe: support for ethtool set_rxfh
ixgbe: Avoid needless PHY access on copper phys
ixgbe: cleanup to use cached mask value
ixgbe: Remove second instance of lan_id variable
ixgbe: use kzalloc for allocating one thing
flow: Move __get_hash_from_flowi{4,6} into flow_dissector.c
ixgbe: Remove unused PCI bus types
...
The port_mst_index parameter was reserved for future use, but
maintainers prefer to add it later when it is actually used.
[Note: this is an update patch to commit [51e1d83cab99: drm/i915: Call
audio pin/ELD notify function] where I mistakenly applied the older
version. Jani and Daniel's review tags were to the latest version,
so I add them below, too -- tiwai]
Signed-off-by: David Henningsson <david.henningsson@canonical.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Pull device mapper update from Mike Snitzer:
- a couple small cleanups in dm-cache, dm-verity, persistent-data's
dm-btree, and DM core.
- a 4.1-stable fix for dm-cache that fixes the leaking of deferred bio
prison cells
- a 4.2-stable fix that adds feature reporting for the dm-stats
features added in 4.2
- improve DM-snapshot to not invalidate the on-disk snapshot if
snapshot device write overflow occurs; but a write overflow triggered
through the origin device will still invalidate the snapshot.
- optimize DM-thinp's async discard submission a bit now that late bio
splitting has been included in block core.
- switch DM-cache's SMQ policy lock from using a mutex to a spinlock;
improves performance on very low latency devices (eg. NVMe SSD).
- document DM RAID 4/5/6's discard support
[ I did not pull the slab changes, which weren't appropriate for this
tree, and weren't obviously the right thing to do anyway. At the very
least they need some discussion and explanation before getting merged.
Because not pulling the actual tagged commit but doing a partial pull
instead, this merge commit thus also obviously is missing the git
signature from the original tag ]
* tag 'dm-4.3-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm cache: fix use after freeing migrations
dm cache: small cleanups related to deferred prison cell cleanup
dm cache: fix leaking of deferred bio prison cells
dm raid: document RAID 4/5/6 discard support
dm stats: report precise_timestamps and histogram in @stats_list output
dm thin: optimize async discard submission
dm snapshot: don't invalidate on-disk image on snapshot write overflow
dm: remove unlikely() before IS_ERR()
dm: do not override error code returned from dm_get_device()
dm: test return value for DM_MAPIO_SUBMITTED
dm verity: remove unused mempool
dm cache: move wake_waker() from free_migrations() to where it is needed
dm btree remove: remove unused function get_nr_entries()
dm btree: remove unused "dm_block_t root" parameter in btree_split_sibling()
dm cache policy smq: change the mutex to a spinlock
Fengguang reported, that some randconfig generated the following linker
issue with nf_ct_zone_dflt object involved:
[...]
CC init/version.o
LD init/built-in.o
net/built-in.o: In function `ipv4_conntrack_defrag':
nf_defrag_ipv4.c:(.text+0x93e95): undefined reference to `nf_ct_zone_dflt'
net/built-in.o: In function `ipv6_defrag':
nf_defrag_ipv6_hooks.c:(.text+0xe3ffe): undefined reference to `nf_ct_zone_dflt'
make: *** [vmlinux] Error 1
Given that configurations exist where we have a built-in part, which is
accessing nf_ct_zone_dflt such as the two handlers nf_ct_defrag_user()
and nf_ct6_defrag_user(), and a part that configures nf_conntrack as a
module, we must move nf_ct_zone_dflt into a fixed, guaranteed built-in
area when netfilter is configured in general.
Therefore, split the more generic parts into a common header under
include/linux/netfilter/ and move nf_ct_zone_dflt into the built-in
section that already holds parts related to CONFIG_NF_CONNTRACK in the
netfilter core. This fixes the issue on my side.
Fixes: 308ac9143e ("netfilter: nf_conntrack: push zone object into functions")
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull SG updates from Jens Axboe:
"This contains a set of scatter-gather related changes/fixes for 4.3:
- Add support for limited chaining of sg tables even for
architectures that do not set ARCH_HAS_SG_CHAIN. From Christoph.
- Add sg chain support to target_rd. From Christoph.
- Fixup open coded sg->page_link in crypto/omap-sham. From
Christoph.
- Fixup open coded crypto ->page_link manipulation. From Dan.
- Also from Dan, automated fixup of manual sg_unmark_end()
manipulations.
- Also from Dan, automated fixup of open coded sg_phys()
implementations.
- From Robert Jarzmik, addition of an sg table splitting helper that
drivers can use"
* 'for-4.3/sg' of git://git.kernel.dk/linux-block:
lib: scatterlist: add sg splitting function
scatterlist: use sg_phys()
crypto/omap-sham: remove an open coded access to ->page_link
scatterlist: remove open coded sg_unmark_end instances
crypto: replace scatterwalk_sg_chain with sg_chain
target/rd: always chain S/G list
scatterlist: allow limited chaining without ARCH_HAS_SG_CHAIN
Pull block driver updates from Jens Axboe:
"On top of the 4.3 core block IO changes, here are the driver related
changes for 4.3. Basically just NVMe and nbd this time around:
- NVMe:
- PRACT PI improvement from Alok Pandey.
- Cleanups and improvements on submission queue doorbell and
writing, using CMB if available. From Jon Derrick.
- From Keith, support for setting queue maximum segments, and
reset support.
- Also from Jon, fixup of u64 division issue on 32-bit archs and
wiring up of the reset support through and ioctl.
- Two small cleanups from Matias and Sunad
- Various code cleanups and fixes from Markus Pargmann"
* 'for-4.3/drivers' of git://git.kernel.dk/linux-block:
NVMe: Using PRACT bit to generate and verify PI by controller
NVMe:Remove unreachable code in nvme_abort_req
NVMe: Add nvme subsystem reset IOCTL
NVMe: Add nvme subsystem reset support
NVMe: removed unused nn var from nvme_dev_add
NVMe: Set queue max segments
nbd: flags is a u32 variable
nbd: Rename functions for clearness of recv/send path
nbd: Change 'disconnect' to be boolean
nbd: Add debugfs entries
nbd: Remove variable 'pid'
nbd: Move clear queue debug message
nbd: Remove 'harderror' and propagate error properly
nbd: restructure sock_shutdown
nbd: sock_shutdown, remove conditional lock
nbd: Fix timeout detection
nvme: Fixes u64 division which breaks i386 builds
NVMe: Use CMB for the IO SQes if available
NVMe: Unify SQ entry writing and doorbell ringing
Pull core block updates from Jens Axboe:
"This first core part of the block IO changes contains:
- Cleanup of the bio IO error signaling from Christoph. We used to
rely on the uptodate bit and passing around of an error, now we
store the error in the bio itself.
- Improvement of the above from myself, by shrinking the bio size
down again to fit in two cachelines on x86-64.
- Revert of the max_hw_sectors cap removal from a revision again,
from Jeff Moyer. This caused performance regressions in various
tests. Reinstate the limit, bump it to a more reasonable size
instead.
- Make /sys/block/<dev>/queue/discard_max_bytes writeable, by me.
Most devices have huge trim limits, which can cause nasty latencies
when deleting files. Enable the admin to configure the size down.
We will look into having a more sane default instead of UINT_MAX
sectors.
- Improvement of the SGP gaps logic from Keith Busch.
- Enable the block core to handle arbitrarily sized bios, which
enables a nice simplification of bio_add_page() (which is an IO hot
path). From Kent.
- Improvements to the partition io stats accounting, making it
faster. From Ming Lei.
- Also from Ming Lei, a basic fixup for overflow of the sysfs pending
file in blk-mq, as well as a fix for a blk-mq timeout race
condition.
- Ming Lin has been carrying Kents above mentioned patches forward
for a while, and testing them. Ming also did a few fixes around
that.
- Sasha Levin found and fixed a use-after-free problem introduced by
the bio->bi_error changes from Christoph.
- Small blk cgroup cleanup from Viresh Kumar"
* 'for-4.3/core' of git://git.kernel.dk/linux-block: (26 commits)
blk: Fix bio_io_vec index when checking bvec gaps
block: Replace SG_GAPS with new queue limits mask
block: bump BLK_DEF_MAX_SECTORS to 2560
Revert "block: remove artifical max_hw_sectors cap"
blk-mq: fix race between timeout and freeing request
blk-mq: fix buffer overflow when reading sysfs file of 'pending'
Documentation: update notes in biovecs about arbitrarily sized bios
block: remove bio_get_nr_vecs()
fs: use helper bio_add_page() instead of open coding on bi_io_vec
block: kill merge_bvec_fn() completely
md/raid5: get rid of bio_fits_rdev()
md/raid5: split bio for chunk_aligned_read
block: remove split code in blkdev_issue_{discard,write_same}
btrfs: remove bio splitting and merge_bvec_fn() calls
bcache: remove driver private bio splitting code
block: simplify bio_add_page()
block: make generic_make_request handle arbitrarily sized bios
blk-cgroup: Drop unlikely before IS_ERR(_OR_NULL)
block: don't access bio->bi_error after bio_put()
block: shrink struct bio down to 2 cache lines again
...
This includes one new driver: cxlflash plus the usual grab bag of updates for
the major drivers: qla2xxx, ipr, storvsc, pm80xx, hptiop, plus a few assorted
fixes.
Signed-off-by: James Bottomley <JBottomley@Odin.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQEcBAABAgAGBQJV5eGiAAoJEDeqqVYsXL0MCpwH/AoneZOPeCx0YdTlyiojasi4
Kc7ECmV9IJJoMoCbP8grwStvynYyCHDSphYmqopZPRlD021eG8ota2uRTHEGJI+q
SoiZUlq8ti8xgnD55mubwO+UNF+zoELMyHUok2pGzBoZN5alA6nvKuNY7Hif3P3b
YMT490oWQLjWmJkMW8TbpMn9nHpW0dfbP323uaggWsMy3CSI707+x36FLi1/ICg6
MZRyv4aESAcauZGUI5EG+SrIl3OBQX7snsYXyuqD3biGqzbGc3p3L9uWG1qXHDbM
OSGXhN+our0WYHCV1/UrGz7/IAWW1UU0W2qgCBwkXkDjkXJ4jqd36zLJxeuhSpE=
=KOmP
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull first round of SCSI updates from James Bottomley:
"This includes one new driver: cxlflash plus the usual grab bag of
updates for the major drivers: qla2xxx, ipr, storvsc, pm80xx, hptiop,
plus a few assorted fixes.
There's another tranch coming, but I want to incubate it another few
days in the checkers, plus it includes a mpt2sas separated lifetime
fix, which Avago won't get done testing until Friday"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (85 commits)
aic94xx: set an error code on failure
storvsc: Set the error code correctly in failure conditions
storvsc: Allow write_same when host is windows 10
storvsc: use storage protocol version to determine storage capabilities
storvsc: use correct defaults for values determined by protocol negotiation
storvsc: Untangle the storage protocol negotiation from the vmbus protocol negotiation.
storvsc: Use a single value to track protocol versions
storvsc: Rather than look for sets of specific protocol versions, make decisions based on ranges.
cxlflash: Remove unused variable from queuecommand
cxlflash: shift wrapping bug in afu_link_reset()
cxlflash: off by one bug in cxlflash_show_port_status()
cxlflash: Virtual LUN support
cxlflash: Superpipe support
cxlflash: Base error recovery support
qla2xxx: Update driver version to 8.07.00.26-k
qla2xxx: Add pci device id 0x2261.
qla2xxx: Fix missing device login retries.
qla2xxx: do not clear slot in outstanding cmd array
qla2xxx: Remove decrement of sp reference count in abort handler.
qla2xxx: Add support to show MPI and PEP FW version for ISP27xx.
...
Xen's PV network protocol includes messages to add/remove ethernet
multicast addresses to/from a filter list in the backend. This allows
the frontend to request the backend only forward multicast packets
which are of interest thus preventing unnecessary noise on the shared
ring.
The canonical netif header in git://xenbits.xen.org/xen.git specifies
the message format (two more XEN_NETIF_EXTRA_TYPEs) so the minimal
necessary changes have been pulled into include/xen/interface/io/netif.h.
To prevent the frontend from extending the multicast filter list
arbitrarily a limit (XEN_NETBK_MCAST_MAX) has been set to 64 entries.
This limit is not specified by the protocol and so may change in future.
If the limit is reached then the next XEN_NETIF_EXTRA_TYPE_MCAST_ADD
sent by the frontend will be failed with NETIF_RSP_ERROR.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull cgroup updates from Tejun Heo:
- a new PIDs controller is added. It turns out that PIDs are actually
an independent resource from kmem due to the limited PID space.
- more core preparations for the v2 interface. Once cpu side interface
is settled, it should be ready for lifting the devel mask.
for-4.3-unified-base was temporarily branched so that other trees
(block) can pull cgroup core changes that blkcg changes depend on.
- a non-critical idr_preload usage bug fix.
* 'for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup: pids: fix invalid get/put usage
cgroup: introduce cgroup_subsys->legacy_name
cgroup: don't print subsystems for the default hierarchy
cgroup: make cftype->private a unsigned long
cgroup: export cgrp_dfl_root
cgroup: define controller file conventions
cgroup: fix idr_preload usage
cgroup: add documentation for the PIDs controller
cgroup: implement the PIDs subsystem
cgroup: allow a cgroup subsystem to reject a fork
Pull percpu updates from Tejun Heo:
"Minor cleanups"
* 'for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
percpu: clean up of schunk->map[] assignment in pcpu_setup_first_chunk
percpu: update incorrect comment for this_cpu_*() operations
Pull workqueue updates from Tejun Heo:
"Only three trivial changes for workqueue this time - doc, MAINTAINERS
and EXPORT_SYMBOL updates"
* 'for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: fix some docbook warnings
workqueue: Make flush_workqueue() available again to non GPL modules
workqueue: add myself as a dedicated reviwer
commit 346add7834
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Tue Jul 14 18:07:30 2015 +0200
drm/i915: Use expcitly fixed type in compat32 structs
changed the type of param field in drm_i915_getparam from int to
s32. This header is exported to userspace and needs to use userspace
type __s32 instead.
This fixes userspace compilation errors like the following:
include/drm/i915_drm.h:361:2: error: unknown type name 's32'
s32 param;
Signed-off-by: Artem Savkov <asavkov@redhat.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
This lets the interested codec be notified when an i915 pin/ELD
event happens.
[tiwai: Fixed a trivial build error for CONFIG_SND_HDA_I915=n]
Signed-off-by: David Henningsson <david.henningsson@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
This callback will be called by the i915 driver to notify the hda
driver that its HDMI information needs to be refreshed, i e,
that audio output is now available (or unavailable) - usually as a
result of a monitor being plugged in (or unplugged).
Signed-off-by: David Henningsson <david.henningsson@canonical.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Commit c6cc1ca7f4 ("flowi: Abstract out functions to get flow hash
based on flowi") introduced a bug in __skb_set_sw_hash where we
require a dependency on evaluating arguments in a function in order.
There is no such ordering enforced in C, so this incorrect. This
patch fixes that by splitting out the arguments. This bug was
found via a compiler warning that keys may be uninitialized.
Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- ACPICA update to upstream revision 20150818 including method
tracing extensions to allow more in-depth AML debugging in the
kernel and a number of assorted fixes and cleanups (Bob Moore,
Lv Zheng, Markus Elfring).
- ACPI sysfs code updates and a documentation update related to
AML method tracing (Lv Zheng).
- ACPI EC driver fix related to serialized evaluations of _Qxx
methods and ACPI tools updates allowing the EC userspace tool
to be built from the kernel source (Lv Zheng).
- ACPI processor driver updates preparing it for future
introduction of CPPC support and ACPI PCC mailbox driver
updates (Ashwin Chaugule).
- ACPI interrupts enumeration fix for a regression related
to the handling of IRQ attribute conflicts between MADT
and the ACPI namespace (Jiang Liu).
- Fixes related to ACPI device PM (Mika Westerberg, Srinidhi Kasagar).
- ACPI device registration code reorganization to separate the
sysfs-related code and bus type operations from the rest (Rafael
J Wysocki).
- Assorted cleanups in the ACPI core (Jarkko Nikula, Mathias Krause,
Andy Shevchenko, Rafael J Wysocki, Nicolas Iooss).
- ACPI cpufreq driver and ia64 cpufreq driver fixes and cleanups
(Pan Xinhui, Rafael J Wysocki).
- cpufreq core cleanups on top of the previous changes allowing it
to preseve its sysfs directories over system suspend/resume (Viresh
Kumar, Rafael J Wysocki, Sebastian Andrzej Siewior).
- cpufreq fixes and cleanups related to governors (Viresh Kumar).
- cpufreq updates (core and the cpufreq-dt driver) related to the
turbo/boost mode support (Viresh Kumar, Bartlomiej Zolnierkiewicz).
- New DT bindings for Operating Performance Points (OPP), support
for them in the OPP framework and in the cpufreq-dt driver plus
related OPP framework fixes and cleanups (Viresh Kumar).
- cpufreq powernv driver updates (Shilpasri G Bhat).
- New cpufreq driver for Mediatek MT8173 (Pi-Cheng Chen).
- Assorted cpufreq driver (speedstep-lib, sfi, integrator) cleanups
and fixes (Abhilash Jindal, Andrzej Hajda, Cristian Ardelean).
- intel_pstate driver updates including Skylake-S support, support
for enabling HW P-states per CPU and an additional vendor bypass
list entry (Kristen Carlson Accardi, Chen Yu, Ethan Zhao).
- cpuidle core fixes related to the handling of coupled idle states
(Xunlei Pang).
- intel_idle driver updates including Skylake Client support and
support for freeze-mode-specific idle states (Len Brown).
- Driver core updates related to power management (Andy Shevchenko,
Rafael J Wysocki).
- Generic power domains framework fixes and cleanups (Jon Hunter,
Geert Uytterhoeven, Rajendra Nayak, Ulf Hansson).
- Device PM QoS framework update to allow the latency tolerance
setting to be exposed to user space via sysfs (Mika Westerberg).
- devfreq support for PPMUv2 in Exynos5433 and a fix for an incorrect
exynos-ppmu DT binding (Chanwoo Choi, Javier Martinez Canillas).
- System sleep support updates (Alan Stern, Len Brown, SungEun Kim).
- rockchip-io AVS support updates (Heiko Stuebner).
- PM core clocks support fixup (Colin Ian King).
- Power capping RAPL driver update including support for Skylake H/S
and Broadwell-H (Radivoje Jovanovic, Seiichi Ikarashi).
- Generic device properties framework fixes related to the handling
of static (driver-provided) property sets (Andy Shevchenko).
- turbostat and cpupower updates (Len Brown, Shilpasri G Bhat,
Shreyas B Prabhu).
/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJV5hhGAAoJEILEb/54YlRxs+EQAK51iFk48+IbpHYaZZ50Yo4m
ZZc2zBcbwRcBlU9vKERrhG+jieSl8J/JJNxT8vBjKqyvNw038mCjewQh02ol0HuC
R7nlDiVJkmZ50sLO4xwE/1UBZr/XqbddwCUnYzvFMkMTA0ePzFtf8BrJ1FXpT8S/
fkwSXQty6hvJDwxkfrbMSaA730wMju9lahx8D6MlmUAedWYZOJDMQKB4WKa/St5X
9uckBPHUBB2KiKlXxdbFPwKLNxHvLROq5SpDLc6cM/7XZB+QfNFy85CUjCUtYo1O
1W8k0qnztvZ6UEv27qz5dejGyAGOarMWGGNsmL9evoeGeHRpQL+dom7HcTnbAfUZ
walyhYSm/zKkdy7Vl3xWUUQkMG48+PviMI6K0YhHXb3Rm5wlR/yBNZTwNIty9SX/
fKCHEa8QynWwLxgm53c3xRkiitJxMsHNK03moLD9zQMjshTyTNvpNbZoahyKQzk6
H+9M1DBRHhkkREDWSwGutukxfEMtWe2vcZcyERrFiY7l5k1j58DwDBMPqjPhRv6q
P/1NlCzr0XYf83Y86J18LbDuPGDhTjjIEn6CqbtI2mmWqTg3+rF7zvS2ux+FzMnA
gisv8l6GT9JiWhxKFqqL/rrVpwtyHebWLYE/RpNUW6fEzLziRNj1qyYO9dqI/GGi
I3rfxlXoc/5xJWCgNB8f
=fTgI
-----END PGP SIGNATURE-----
Merge tag 'pm+acpi-4.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management and ACPI updates from Rafael Wysocki:
"From the number of commits perspective, the biggest items are ACPICA
and cpufreq changes with the latter taking the lead (over 50 commits).
On the cpufreq front, there are many cleanups and minor fixes in the
core and governors, driver updates etc. We also have a new cpufreq
driver for Mediatek MT8173 chips.
ACPICA mostly updates its debug infrastructure and adds a number of
fixes and cleanups for a good measure.
The Operating Performance Points (OPP) framework is updated with new
DT bindings and support for them among other things.
We have a few updates of the generic power domains framework and a
reorganization of the ACPI device enumeration code and bus type
operations.
And a lot of fixes and cleanups all over.
Included is one branch from the MFD tree as it contains some
PM-related driver core and ACPI PM changes a few other commits are
based on.
Specifics:
- ACPICA update to upstream revision 20150818 including method
tracing extensions to allow more in-depth AML debugging in the
kernel and a number of assorted fixes and cleanups (Bob Moore, Lv
Zheng, Markus Elfring).
- ACPI sysfs code updates and a documentation update related to AML
method tracing (Lv Zheng).
- ACPI EC driver fix related to serialized evaluations of _Qxx
methods and ACPI tools updates allowing the EC userspace tool to be
built from the kernel source (Lv Zheng).
- ACPI processor driver updates preparing it for future introduction
of CPPC support and ACPI PCC mailbox driver updates (Ashwin
Chaugule).
- ACPI interrupts enumeration fix for a regression related to the
handling of IRQ attribute conflicts between MADT and the ACPI
namespace (Jiang Liu).
- Fixes related to ACPI device PM (Mika Westerberg, Srinidhi
Kasagar).
- ACPI device registration code reorganization to separate the
sysfs-related code and bus type operations from the rest (Rafael J
Wysocki).
- Assorted cleanups in the ACPI core (Jarkko Nikula, Mathias Krause,
Andy Shevchenko, Rafael J Wysocki, Nicolas Iooss).
- ACPI cpufreq driver and ia64 cpufreq driver fixes and cleanups (Pan
Xinhui, Rafael J Wysocki).
- cpufreq core cleanups on top of the previous changes allowing it to
preseve its sysfs directories over system suspend/resume (Viresh
Kumar, Rafael J Wysocki, Sebastian Andrzej Siewior).
- cpufreq fixes and cleanups related to governors (Viresh Kumar).
- cpufreq updates (core and the cpufreq-dt driver) related to the
turbo/boost mode support (Viresh Kumar, Bartlomiej Zolnierkiewicz).
- New DT bindings for Operating Performance Points (OPP), support for
them in the OPP framework and in the cpufreq-dt driver plus related
OPP framework fixes and cleanups (Viresh Kumar).
- cpufreq powernv driver updates (Shilpasri G Bhat).
- New cpufreq driver for Mediatek MT8173 (Pi-Cheng Chen).
- Assorted cpufreq driver (speedstep-lib, sfi, integrator) cleanups
and fixes (Abhilash Jindal, Andrzej Hajda, Cristian Ardelean).
- intel_pstate driver updates including Skylake-S support, support
for enabling HW P-states per CPU and an additional vendor bypass
list entry (Kristen Carlson Accardi, Chen Yu, Ethan Zhao).
- cpuidle core fixes related to the handling of coupled idle states
(Xunlei Pang).
- intel_idle driver updates including Skylake Client support and
support for freeze-mode-specific idle states (Len Brown).
- Driver core updates related to power management (Andy Shevchenko,
Rafael J Wysocki).
- Generic power domains framework fixes and cleanups (Jon Hunter,
Geert Uytterhoeven, Rajendra Nayak, Ulf Hansson).
- Device PM QoS framework update to allow the latency tolerance
setting to be exposed to user space via sysfs (Mika Westerberg).
- devfreq support for PPMUv2 in Exynos5433 and a fix for an incorrect
exynos-ppmu DT binding (Chanwoo Choi, Javier Martinez Canillas).
- System sleep support updates (Alan Stern, Len Brown, SungEun Kim).
- rockchip-io AVS support updates (Heiko Stuebner).
- PM core clocks support fixup (Colin Ian King).
- Power capping RAPL driver update including support for Skylake H/S
and Broadwell-H (Radivoje Jovanovic, Seiichi Ikarashi).
- Generic device properties framework fixes related to the handling
of static (driver-provided) property sets (Andy Shevchenko).
- turbostat and cpupower updates (Len Brown, Shilpasri G Bhat,
Shreyas B Prabhu)"
* tag 'pm+acpi-4.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (180 commits)
cpufreq: speedstep-lib: Use monotonic clock
cpufreq: powernv: Increase the verbosity of OCC console messages
cpufreq: sfi: use kmemdup rather than duplicating its implementation
cpufreq: drop !cpufreq_driver check from cpufreq_parse_governor()
cpufreq: rename cpufreq_real_policy as cpufreq_user_policy
cpufreq: remove redundant 'policy' field from user_policy
cpufreq: remove redundant 'governor' field from user_policy
cpufreq: update user_policy.* on success
cpufreq: use memcpy() to copy policy
cpufreq: remove redundant CPUFREQ_INCOMPATIBLE notifier event
cpufreq: mediatek: Add MT8173 cpufreq driver
dt-bindings: mediatek: Add MT8173 CPU DVFS clock bindings
PM / Domains: Fix typo in description of genpd_dev_pm_detach()
PM / Domains: Remove unusable governor dummies
PM / Domains: Make pm_genpd_init() available to modules
PM / domains: Align column headers and data in pm_genpd_summary output
powercap / RAPL: disable the 2nd power limit properly
tools: cpupower: Fix error when running cpupower monitor
PM / OPP: Drop unlikely before IS_ERR(_OR_NULL)
PM / OPP: Fix static checker warning (broken 64bit big endian systems)
...
- Adding Frank Rowand as DT maintainer in preparation for Grant's
retirement.
- Generic MSI binding documentation and a few other minor doc updates
- Fix long standing issue with DT platorm device unregistration
- Fix loop forever bug in of_find_matching_node_by_address()
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJV5hAQAAoJEMhvYp4jgsXiguMIAJippY6yiqatAQr2TcWktkUx
KiN209DxPCoAFz9EeCyJ+ck0BV6xKSQf1x/ef93kynYXlQ2uWOWXW6guH8rQ5XFN
15aovQnd03PVcpzsMOSVi/ZxtyGJflxO+RBrcrJ0Vy2HaAYsMmfCiYdLAnhMmJnz
Wr4W1LfYLTxchNIS1LlBUKRAeZyOx47yRVEXfNqfjbCu+bJgXDtMj7NNH3muXqVf
WZIJu0XFb6UIgMPB/jDdC0Ln6FGe01HIDKQ/EpGgBm/flfSsywVH6eB06R7+EHym
T6ap5pRQHXB+oxRU8bGo88Z7xEbcBhKckRT/zvs/vPuG/DfWluIVxbJpR4htJ84=
=zTPm
-----END PGP SIGNATURE-----
Merge tag 'devicetree-for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull devicetree updates from Rob Herring:
- added Frank Rowand as DT maintainer in preparation for Grant's
retirement.
- generic MSI binding documentation and a few other minor doc updates
- fix long standing issue with DT platorm device unregistration
- fix loop forever bug in of_find_matching_node_by_address()
* tag 'devicetree-for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
MAINTAINERS: Add Frank Rowand as DT maintainer
mtd: nand: pxa3xx: add optional dma for pxa architecture
Documentation: DT: cpsw: document missing compatible
Docs: dt: add generic MSI bindings
drivercore: Fix unregistration path of platform devices
of/address: Don't loop forever in of_find_matching_node_by_address().
of: Add vendor prefix for JEDEC Solid State Technology Association
of/platform: add function to populate default bus
of: Add vendor prefix for Sharp Corporation
Just have a flags member instead.
In file included from include/linux/linkage.h:4:0,
from include/linux/kernel.h:6,
from net/core/flow_dissector.c:1:
In function 'flow_keys_hash_start',
inlined from 'flow_hash_from_keys' at net/core/flow_dissector.c:553:34:
>> include/linux/compiler.h:447:38: error: call to '__compiletime_assert_459' declared with attribute error: BUILD_BUG_ON failed: FLOW_KEYS_HASH_OFFSET % sizeof(u32)
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull user namespace updates from Eric Biederman:
"This finishes up the changes to ensure proc and sysfs do not start
implementing executable files, as the there are application today that
are only secure because such files do not exist.
It akso fixes a long standing misfeature of /proc/<pid>/mountinfo that
did not show the proper source for files bind mounted from
/proc/<pid>/ns/*.
It also straightens out the handling of clone flags related to user
namespaces, fixing an unnecessary failure of unshare(CLONE_NEWUSER)
when files such as /proc/<pid>/environ are read while <pid> is calling
unshare. This winds up fixing a minor bug in unshare flag handling
that dates back to the first version of unshare in the kernel.
Finally, this fixes a minor regression caused by the introduction of
sysfs_create_mount_point, which broke someone's in house application,
by restoring the size of /sys/fs/cgroup to 0 bytes. Apparently that
application uses the directory size to determine if a tmpfs is mounted
on /sys/fs/cgroup.
The bind mount escape fixes are present in Al Viros for-next branch.
and I expect them to come from there. The bind mount escape is the
last of the user namespace related security bugs that I am aware of"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
fs: Set the size of empty dirs to 0.
userns,pidns: Force thread group sharing, not signal handler sharing.
unshare: Unsharing a thread does not require unsharing a vm
nsfs: Add a show_path method to fix mountinfo
mnt: fs_fully_visible enforce noexec and nosuid if !SB_I_NOEXEC
vfs: Commit to never having exectuables on proc and sysfs.
Pull x86 apic updates from Thomas Gleixner:
"This udpate contains:
- rework the irq vector array to store a pointer to the irq
descriptor instead of the irq number to avoid a lookup of the irq
descriptor in the irq entry path
- lguest interrupt handling cleanups
- conversion of the local apic timer to the new clockevent callbacks
- preparatory changes for the irq argument removal of interrupt flow
handlers"
* 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/irq: Do not dereference irq descriptor before checking it
tools/lguest: Clean up include dir
tools/lguest: Fix redefinition of struct virtio_pci_cfg_cap
x86/irq: Store irq descriptor in vector array
genirq: Provide irq_desc_has_action
x86/irq: Get rid of an indentation level
x86/irq: Rename VECTOR_UNDEFINED to VECTOR_UNUSED
x86/irq: Replace numeric constant
x86/irq: Protect smp_cleanup_move
x86/lguest: Do not setup unused irq vectors
x86/lguest: Clean up lguest_setup_irq
x86/apic: Drop local_irq_save/restore in timer callbacks
x86/apic: Migrate apic timer to new set_state interface
x86/irq: Use access helper irq_data_get_affinity_mask()
x86/irq: Use accessor irq_data_get_irq_handler_data()
x86/irq: Use accessor irq_data_get_node()
Add an input flag to flow dissector on rather dissection should stop
when encapsulation is detected (IP/IP or GRE). Also, add a key_control
flag that indicates encapsulation was encountered during the
dissection.
Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add an input flag to flow dissector on rather dissection should be
stopped when a flow label is encountered. Presumably, the flow label
is derived from a sufficient hash of an inner transport packet so
further dissection is not needed (that is ports are not included in
the flow hash). Using the flow label instead of ports has the additional
benefit that packet fragments should hash to same value as non-fragments
for a flow (assuming that the same flow label is used).
We set this flag by default in for skb_get_hash.
Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add an input flag to flow dissector on rather dissection should be
stopped when an L3 packet is encountered. This would be useful if a
caller just wanted to get IP addresses of the outermost header (e.g.
to do an L3 hash).
Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add an input flag to flow dissector on rather dissection should be
attempted on a first fragment. Also add key_control flags to indicate
that a packet is a fragment or first fragment.
Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The flags argument will allow control of the dissection process (for
instance whether to parse beyond L3).
Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Create __get_hash_from_flowi6 and __get_hash_from_flowi4 to get the
flow keys and hash based on flowi structures. These are called by
__skb_get_hash_flowi6 and __skb_get_hash_flowi4. Also, created
get_hash_from_flowi6 and get_hash_from_flowi4 which can be called
when just the hash value for a flowi is needed.
Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move __skb_set_sw_hash to skbuff.h and add __skb_set_hash which is
a common method (between __skb_set_sw_hash and skb_set_hash) to set
the hash in an skbuff.
Also, move skb_clear_hash to be closer to __skb_set_hash.
Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move the flow dissector functions that are specific to skbuffs into
skbuff.h out of flow_dissector.h. This makes flow_dissector.h have
no dependencies on skbuff.h.
Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull irq updates from Thomas Gleixner:
"This updated pull request does not contain the last few GIC related
patches which were reported to cause a regression. There is a fix
available, but I let it breed for a couple of days first.
The irq departement provides:
- new infrastructure to support non PCI based MSI interrupts
- a couple of new irq chip drivers
- the usual pile of fixlets and updates to irq chip drivers
- preparatory changes for removal of the irq argument from interrupt
flow handlers
- preparatory changes to remove IRQF_VALID"
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (129 commits)
irqchip/imx-gpcv2: IMX GPCv2 driver for wakeup sources
irqchip: Add bcm2836 interrupt controller for Raspberry Pi 2
irqchip: Add documentation for the bcm2836 interrupt controller
irqchip/bcm2835: Add support for being used as a second level controller
irqchip/bcm2835: Refactor handle_IRQ() calls out of MAKE_HWIRQ
PCI: xilinx: Fix typo in function name
irqchip/gic: Ensure gic_cpu_if_up/down() programs correct GIC instance
irqchip/gic: Only allow the primary GIC to set the CPU map
PCI/MSI: pci-xgene-msi: Consolidate chained IRQ handler install/remove
unicore32/irq: Prepare puv3_gpio_handler for irq argument removal
tile/pci_gx: Prepare trio_handle_level_irq for irq argument removal
m68k/irq: Prepare irq handlers for irq argument removal
C6X/megamode-pic: Prepare megamod_irq_cascade for irq argument removal
blackfin: Prepare irq handlers for irq argument removal
arc/irq: Prepare idu_cascade_isr for irq argument removal
sparc/irq: Use access helper irq_data_get_affinity_mask()
sparc/irq: Use helper irq_data_get_irq_handler_data()
parisc/irq: Use access helper irq_data_get_affinity_mask()
mn10300/irq: Use access helper irq_data_get_affinity_mask()
irqchip/i8259: Prepare i8259_irq_dispatch for irq argument removal
...
A number of VRF patches used 'int' for table id. It should be u32 to be
consistent with the rest of the stack.
Fixes:
4e3c89920c ("net: Introduce VRF related flags and helpers")
15be405eb2 ("net: Add inet_addr lookup by table")
30bbaa1950 ("net: Fix up inet_addr_type checks")
021dd3b8a1 ("net: Add routes to the table associated with the device")
dc028da54e ("inet: Move VRF table lookup to inlined function")
f6d3c19274 ("net: FIB tracepoints")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull timer updates from Thomas Gleixner:
"Rather large, but nothing exiting:
- new range check for settimeofday() to prevent that boot time
becomes negative.
- fix for file time rounding
- a few simplifications of the hrtimer code
- fix for the proc/timerlist code so the output of clock realtime
timers is accurate
- more y2038 work
- tree wide conversion of clockevent drivers to the new callbacks"
* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (88 commits)
hrtimer: Handle failure of tick_init_highres() gracefully
hrtimer: Unconfuse switch_hrtimer_base() a bit
hrtimer: Simplify get_target_base() by returning current base
hrtimer: Drop return code of hrtimer_switch_to_hres()
time: Introduce timespec64_to_jiffies()/jiffies_to_timespec64()
time: Introduce current_kernel_time64()
time: Introduce struct itimerspec64
time: Add the common weak version of update_persistent_clock()
time: Always make sure wall_to_monotonic isn't positive
time: Fix nanosecond file time rounding in timespec_trunc()
timer_list: Add the base offset so remaining nsecs are accurate for non monotonic timers
cris/time: Migrate to new 'set-state' interface
kernel: broadcast-hrtimer: Migrate to new 'set-state' interface
xtensa/time: Migrate to new 'set-state' interface
unicore/time: Migrate to new 'set-state' interface
um/time: Migrate to new 'set-state' interface
sparc/time: Migrate to new 'set-state' interface
sh/localtimer: Migrate to new 'set-state' interface
score/time: Migrate to new 'set-state' interface
s390/time: Migrate to new 'set-state' interface
...
This is the usual large batch of DT updates. Lots and lots of smaller
changes, some of the larger ones to point out are:
- Rockchip veyron (Chromebook) support, as well as several other new boards
- DRM support on Atmel AT91SAM9N12EK
- USB additions on some Allwinner platforms
- Mediatek MT6580 support
- Freescale i.MX6UL support
- Cleanups for Renesas shmobile platforms
- Lots of added devices on LPC18xx
- Lots of added devices and boards on UniPhier
There's also some dependent code added here, in particular some branches
that are primarily merged through the clock tree.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJV5OMWAAoJEIwa5zzehBx3r2QP/1skn0zzgfvbK0kkPOh9q3Jk
jX1elN4Wde1SnScz8UbdVb9nmdbhxsuYE/3+Lz7yCndWScBiak4qcsNHrSRhh3FA
ST7Ub8DLc2TxY9K7eDkyVCcNkP35+UQTHCN76R5Lgrlfw3UO9Zr3xPFX3+Kd6aWz
9X8UnvJacQQIN/vO6J02kB96sKPEIANfuMgO6vDSbmcZ1RrdlHzjoRwAV0smECtJ
NyOh+NQdPBR0gSl/peyKzAXoDHNXpDotltTmIz3tPA+dYBO/qG//B73H/oqox0ql
AKAktyaDzdxXEuixPtAroo4dDy3xuIQ6xU+DNhPWQq0BgaxHWqkwq60d74ot8vCz
8gvC8pwA6gavbqVFNePOnwPNSyWZX01scX4fp903NjVM8/rGPvCR4y6p8lFIyVkG
P0L8rmY/UYq3fieaAb1W0odASDrQpgg3zsHD7to43hz6jaRnMRCpA8nTVqJcyHqI
E6YfGQH87Kpbvkjo0FYqo5P6xCCRTq+QUys6JruNYg05R/gd8AG7cXaVNO3yvg3T
lRwNXDBt/zcp2exKnGR0IdGMUMICzsuoB8ZePkQdIWwePrd4AzT5qYJe/txmg1rd
q+9VJqQkeF+txLd9XUV2W/Hcuzu3ZPCbs97I9tTKQHMGwKUZaPfuk2r4+4K+Ps5a
dYwdms39p6AIT43rK+m3
=D2Pm
-----END PGP SIGNATURE-----
Merge tag 'armsoc-dt' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM DT updates from Olof Johansson:
"Ladies and gentlemen, we proudly announce to you the latest branch of
ARM device tree contents for the mainline kernel. Come and see, come
and see!
No less than twentythree thousand lines of additions! Just imagine the
joy you will have of using your mainline kernel on newly supported
hardware such as Rockchip Chromebooks, Freescale i.MX6UL boards or
UniPhier hardware!
For those of you feeling less adventurous, added hardware support on
platforms such as TI DM814x and Gumstix Overo platforms might be more
of your liking.
We've got something for everyone here!
Ahem. Cough. So, anyway...
This is the usual large batch of DT updates. Lots and lots of smaller
changes, some of the larger ones to point out are:
- Rockchip veyron (Chromebook) support, as well as several other new boards
- DRM support on Atmel AT91SAM9N12EK
- USB additions on some Allwinner platforms
- Mediatek MT6580 support
- Freescale i.MX6UL support
- cleanups for Renesas shmobile platforms
- lots of added devices on LPC18xx
- lots of added devices and boards on UniPhier
There's also some dependent code added here, in particular some
branches that are primarily merged through the clock tree"
* tag 'armsoc-dt' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (389 commits)
ARM: tegra: Add gpio-ranges property
ARM: tegra: Fix AHB base address on Tegra20, Tegra30 and Tegra114
ARM: tegra: Add Tegra124 PMU support
ARM: tegra: jetson-tk1: Add GK20A GPU DT node
ARM: tegra: venice2: Add GK20A GPU DT node
ARM: tegra: Add IOMMU node to GK20A
ARM: tegra: Add CPU regulator to the Jetson TK1 device tree
ARM: tegra: Add entries for cpufreq on Tegra124
ARM: tegra: Enable the DFLL on the Jetson TK1
ARM: tegra: Add the DFLL to Tegra124 device tree
ARM: dts: zynq: Add devicetree entry for Xilinx Zynq reset controller.
ARM: dts: UniPhier: fix PPI interrupt CPU mask of timer nodes
ARM: dts: rockchip: correct regulator power states for suspend
ARM: dts: rockchip: correct regulator PM properties
ARM: dts: vexpress: Use assigned-clock-parents for sp810
pinctrl: tegra: Only set the gpio range if needed
arm: boot: dts: am4372: add ARM timers and SCU nodes
ARM: dts: AM4372: Add the am4372-rtc compatible string
ARM: shmobile: r8a7794 dtsi: Add CPG/MSTP Clock Domain
ARM: shmobile: r8a7793 dtsi: Add CPG/MSTP Clock Domain
...
Some releases this branch is nearly empty, others we have more stuff. It
tends to gather drivers that need SoC modification or dependencies such
that they have to (also) go in through our tree.
For this release, we have merged in part of the reset controller tree
(with handshake that the parts we have merged in will remain stable),
as well as dependencies on a few clock branches.
In general, new items here are:
- Qualcomm driver for SMM/SMD, which is how they communicate with the
coprocessors on (some) of their platforms
- Memory controller work for ARM's PL172 memory controller
- Reset drivers for various platforms
- PMU power domain support for Marvell platforms
- Tegra support for T132/T210 SoCs: PMC, fuse, memory controller per-SoC support
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJV5Ou9AAoJEIwa5zzehBx3/k4P/jA5CVNiDvIs0GoTR3uGOuec
MYd19oKf76reV1oL5bBSpg9uryJd3fPzK0JC/qU3pYfsCVFp2TWZD7liNpitqHyt
2xL02gzJQgjHzL3QrxTQrOFJDO6P8Vm2k/5pI0KX1beoulHvI+iHejNryXGjSKSx
9vbs1GPXU9IV831YOHSaMmHz727J65bbZE8Up113ctT+WbEIc1g/ihKzUgi/8xXW
RniMxGsX8HynE3VH+UBDMbY6XkOmzZa1Wabgll735MXwIUFG1+TsvHNuGehXUski
ySwqk67en25i0F/Q7oobLSZwCPbA6Ylxk9aOfr0AnAqOEKwgKWS+K7HkEiNMz7yh
nt22b5SVkQ80sTCbNEkdJajOZ8oRalUae19CGxvMfVh77LmQ2sRI9iJrwXcxkt8W
ASs6uDDAUNC5pIWfjeJE50vsDr//Hed/WtsIjenYOtb+RI1kru5iTTgp4oLPBiy5
OeHxOfiL7gPvyZQbuPgMKAGdoGBsa/7wTM7KWJCMP6mPGHpShO8XUUsuljqKHm4w
nBV7eZRMiIuWkjRKw4bjp7R0NVKR5sOfAkZhjCsXB0aqA/NU2zyNbViWcGCh6yj8
3beZ93SdEdrKX6N8pPiAhGTMFA6eev8YeUHO7kM4IhC91ILjHlPpCs1pYk3pwEkO
ABC7GyMY6Olg1pZJweEa
=B6jn
-----END PGP SIGNATURE-----
Merge tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC driver updates from Olof Johansson:
"Some releases this branch is nearly empty, others we have more stuff.
It tends to gather drivers that need SoC modification or dependencies
such that they have to (also) go in through our tree.
For this release, we have merged in part of the reset controller tree
(with handshake that the parts we have merged in will remain stable),
as well as dependencies on a few clock branches.
In general, new items here are:
- Qualcomm driver for SMM/SMD, which is how they communicate with the
coprocessors on (some) of their platforms
- memory controller work for ARM's PL172 memory controller
- reset drivers for various platforms
- PMU power domain support for Marvell platforms
- Tegra support for T132/T210 SoCs: PMC, fuse, memory controller
per-SoC support"
* tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (49 commits)
ARM: tegra: cpuidle: implement cpuidle_state.enter_freeze()
ARM: tegra: Disable cpuidle if PSCI is available
soc/tegra: pmc: Use existing pclk reference
soc/tegra: pmc: Remove unnecessary return statement
soc: tegra: Remove redundant $(CONFIG_ARCH_TEGRA) in Makefile
memory: tegra: Add Tegra210 support
memory: tegra: Add support for a variable-size client ID bitfield
clk: shmobile: rz: Add CPG/MSTP Clock Domain support
clk: shmobile: rcar-gen2: Add CPG/MSTP Clock Domain support
clk: shmobile: r8a7779: Add CPG/MSTP Clock Domain support
clk: shmobile: r8a7778: Add CPG/MSTP Clock Domain support
clk: shmobile: Add CPG/MSTP Clock Domain support
ARM: dove: create a proper PMU driver for power domains, PMU IRQs and resets
reset: reset-zynq: Adding support for Xilinx Zynq reset controller.
docs: dts: Added documentation for Xilinx Zynq Reset Controller bindings.
MIPS: ath79: Add the reset controller to the AR9132 dtsi
reset: Add a driver for the reset controller on the AR71XX/AR9XXX
devicetree: Add bindings for the ATH79 reset controller
reset: socfpga: Update reset-socfpga to read the altr,modrst-offset property
doc: dt: add documentation for lpc1850-rgu reset driver
...
New or improved SoC support:
- Addition of support for Atmel's SAMA5D2 SoC
- Addition of Freescale i.MX6UL
- Improved support of TI's DM814x platform
- Misc fixes and improvements for RockChip platforms
- Marvell MVEBU suspend/resume support
A few driver changes that ideally would belong in the drivers branch are
also here (acked by appropriate maintainers):
- Power key input driver for Freescale platforms (svns)
- RTC driver updates for Freescale platforms (svns/mxc)
- Clk fixes for TI DM814/816X
+ a bunch of other changes for various platforms
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJV5Mo6AAoJEIwa5zzehBx3VnMP/28LFJVUjIbd2xjJBo2gbSwV
jN7uGlTkKU+1kHjZqnUPuirlBxBzsXKgRfBvCoeu0cPOggwmFcaF915/HHPz7xuz
vTP7k98+Y5nSXScIohWkWCdZTpKKjve4sn74rXmiNakTUiuaHf5lKut/m7ldVrWd
hN1o9W4LN+5O1mOYbc9ZD98v3bkDb6eu+a22oK7qemXiEiQi+NIMoDx+IR2bd4pA
FeDaW7sOFWTEYU/p+M5nZNvI3n53P0/mlB5rPRiAYRjhQf9DrWHm5G7HdnMkUkgo
/s8/QlVjBkJwhkY0TqpwtHY23JjSSB6UtCnXzb1eVAkX1nJN6PJQpcpCz1zJhd9q
+sJ2k1zEvrfomJCK7/iZ1ubQE09KlJLEeb8xi5xCwD0MBOBAYC31bovDVAswCitV
8NHnfltEG+wCMyX955eqqGkVxkcw8sJMJUK5A95aK6w+vKqjd7gUgLJjXFC1u4eN
ECuVVUf1hVmUEmM799CDayTlfGDt4oGLmHGao+SiSCVc1XbG9HkWr7Lcgr6u9UHr
lNv3RIe6Axb85xxIU0/9hqLHrtB85uEQjjlbnQx3o7u8RsSOeaiZvCz4BzcjP9E1
VGyD6zkRWhuDiMFlPVXiAX0qIH5xSjIWkC0wPNJNy8eWFH9tkfGL0mOlLbl09oGR
gtuvOrjbF/BhILkPw38y
=8T5Z
-----END PGP SIGNATURE-----
Merge tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC platform updates from Olof Johansson:
"New or improved SoC support:
- add support for Atmel's SAMA5D2 SoC
- add support for Freescale i.MX6UL
- improved support for TI's DM814x platform
- misc fixes and improvements for RockChip platforms
- Marvell MVEBU suspend/resume support
A few driver changes that ideally would belong in the drivers branch
are also here (acked by appropriate maintainers):
- power key input driver for Freescale platforms (svns)
- RTC driver updates for Freescale platforms (svns/mxc)
- clk fixes for TI DM814/816X
+ a bunch of other changes for various platforms"
* tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (83 commits)
ARM: rockchip: pm: Fix PTR_ERR() argument
ARM: imx: mach-imx6ul: Fix allmodconfig build
clk: ti: fix for definition movement
ARM: uniphier: drop v7_invalidate_l1 call at secondary entry
memory: kill off set_irq_flags usage
rtc: snvs: select option REGMAP_MMIO
ARM: brcmstb: select ARCH_DMA_ADDR_T_64BIT for LPAE
ARM: BCM: Enable ARM erratum 798181 for BRCMSTB
ARM: OMAP2+: Fix power domain operations regression caused by 81xx
ARM: rockchip: enable PMU_GPIOINT_WAKEUP_EN when entering shallow suspend
ARM: rockchip: set correct stabilization thresholds in suspend
ARM: rockchip: rename osc_switch_to_32k variable
ARM: imx6ul: add fec MAC refrence clock and phy fixup init
ARM: imx6ul: add fec bits to GPR syscon definition
rtc: mxc: add support of device tree
dt-binding: document the binding for mxc rtc
rtc: mxc: use a second rtc clock
ARM: davinci: cp_intc: use IRQCHIP_SKIP_SET_WAKE instead of irq_set_wake callback
soc: mediatek: Fix SCPSYS compilation
ARM: at91/soc: add basic support for new sama5d2 SoC
...
A large cleanup branch this release, with a healthy 10k negative line delta.
Most of this is removal of legacy (non-DT) support of shmobile
platforms. There is also removal of two non-DT platforms on OMAP,
and the plat-samsung directory is cleaned out by moving most of the
previously shared-location-but-not-actually-shared files from there to
the appropriate mach directories instead.
There are other sets of changes in here as well:
- Rob Herring removed use of set_irq_flags under all platforms and
moved to genirq alternatives
- A series of timer API conversions to set-state interface
- ep93xx, nomadik and ux500 cleanups from Linus Walleij
- __init annotation fixes from Nicolas Pitre
+ a bunch of other changes that all add up to a nice set of cleanups
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJV16zRAAoJEIwa5zzehBx30OYQAIM2TaHWxDzcK0jOfrPEOyU0
jhnT3B2/FbIpYbt3UwcDOqJacHzA/syU4UjplCcWWhKYtTucXQIjOxi0BgRq5V3X
EFrgEbQMLXshhMGquBd4Nl6XhpRrlZcnnY4iFPGf7pR5jQfwQhZCiHEOXCLe4qlz
m9GorKjEmiSk2ID/PFpyOUx20XiiqkU2MOCsNqNiGwFfmQfpVo0vXtKgxlL6d6q+
9mrWFFTqgiOBMAU/X1j18U9jpFT6NOe8JcXp3F3tm4Cq5nJ2pTPZcaYWOORD0bGc
7Os1HRTN9SMNsb8sSvQa1N/Fy2AAFuvGdg8q4oLzgtXJogInRmxE4oGLbffq0As/
q81mEuWkdHjGPs5oyPNQdP2grYhflNP+M1+fA6QaUG/mDl5RFFcld9nfhCiBeIZW
oHAHZhz7//BImpFsmNiXOew9fmQaHxQ6Jy0K4YvhqaxwQ0N54ZadUPH47wmmxW5E
X5IY5Oo7xg3e4EU4/t1nFDTxMG936ZxLVtuq79dRHrl+AT6KVbzdUI4LLyrvEULh
ETbWIZUAHvE9i2N8GbsfDzc9YlpG/c5SpN8T5dg2R0BQpgE192iG6s5XoGyu80MG
MbtbEBml0y3tr36Bd9tCpsSrBjzZLUk5WRo6Rd/zWKlRkslV74iacKLvC7C5oa7n
GYL4xUSPE1RhJxnXezp0
=acy4
-----END PGP SIGNATURE-----
Merge tag 'armsoc-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC cleanups from Olof Johansson:
"A large cleanup branch this release, with a healthy 10k negative line
delta.
Most of this is removal of legacy (non-DT) support of shmobile
platforms. There is also removal of two non-DT platforms on OMAP, and
the plat-samsung directory is cleaned out by moving most of the
previously shared-location-but-not-actually-shared files from there to
the appropriate mach directories instead.
There are other sets of changes in here as well:
- Rob Herring removed use of set_irq_flags under all platforms and
moved to genirq alternatives
- a series of timer API conversions to set-state interface
- ep93xx, nomadik and ux500 cleanups from Linus Walleij
- __init annotation fixes from Nicolas Pitre
+ a bunch of other changes that all add up to a nice set of cleanups"
* tag 'armsoc-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (108 commits)
ARM/fb: ep93xx: switch framebuffer to use modedb only
ARM: gemini: Setup timer3 as free running timer
ARM: gemini: Use timer1 for clockevent
ARM: gemini: Add missing register definitions for gemini timer
ARM: ep93xx/timer: Migrate to new 'set-state' interface
ARM: nomadik: push accelerometer down to boards
ARM: nomadik: move l2x0 setup to device tree
ARM: nomadik: selectively enable UART0 on boards
ARM: nomadik: move hog code to use DT hogs
ARM: shmobile: Fix mismerges
ARM: ux500: simplify secondary CPU boot
ARM: SAMSUNG: remove keypad-core header in plat-samsung
ARM: SAMSUNG: local watchdog-reset header in mach-s3c64xx
ARM: SAMSUNG: local onenand-core header in mach-s3c64xx
ARM: SAMSUNG: local irq-uart header in mach-s3c64xx
ARM: SAMSUNG: local backlight header in mach-s3c64xx
ARM: SAMSUNG: local ata-core header in mach-s3c64xx
ARM: SAMSUNG: local regs-usb-hsotg-phy header in mach-s3c64xx
ARM: SAMSUNG: local spi-core header in mach-s3c24xx
ARM: SAMSUNG: local nand-core header in mach-s3c24xx
...
Pull x86 mm updates from Ingo Molnar:
"The dominant change in this cycle was the continued work to isolate
kernel drivers from MTRR legacies: this tree gets rid of all kernel
internal driver interfaces to MTRRs (mostly by rewriting it to proper
PAT interfaces), the only access left is the /proc/mtrr ABI.
This work was done by Luis R Rodriguez.
There's also some related PCI interface additions for which I've
Cc:-ed Bjorn"
* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
x86/mm/mtrr: Remove kernel internal MTRR interfaces: unexport mtrr_add() and mtrr_del()
s390/io: Add pci_iomap_wc() and pci_iomap_wc_range()
drivers/dma/iop-adma: Use dma_alloc_writecombine() kernel-style
drivers/video/fbdev/vt8623fb: Use arch_phys_wc_add() and pci_iomap_wc()
drivers/video/fbdev/s3fb: Use arch_phys_wc_add() and pci_iomap_wc()
drivers/video/fbdev/arkfb.c: Use arch_phys_wc_add() and pci_iomap_wc()
PCI: Add pci_iomap_wc() variants
drivers/video/fbdev/gxt4500: Use pci_ioremap_wc_bar() to map framebuffer
drivers/video/fbdev/kyrofb: Use arch_phys_wc_add() and pci_ioremap_wc_bar()
drivers/video/fbdev/i740fb: Use arch_phys_wc_add() and pci_ioremap_wc_bar()
PCI: Add pci_ioremap_wc_bar()
x86/mm: Make kernel/check.c explicitly non-modular
x86/mm/pat: Make mm/pageattr[-test].c explicitly non-modular
x86/mm/pat: Add comments to cachemode translation tables
arch/*/io.h: Add ioremap_uc() to all architectures
drivers/video/fbdev/atyfb: Use arch_phys_wc_add() and ioremap_wc()
drivers/video/fbdev/atyfb: Replace MTRR UC hole with strong UC
drivers/video/fbdev/atyfb: Clarify ioremap() base and length used
drivers/video/fbdev/atyfb: Carve out framebuffer length fudging into a helper
x86/mm, asm-generic: Add IOMMU ioremap_uc() variant default
...
Pull x86 asm changes from Ingo Molnar:
"The biggest changes in this cycle were:
- Revamp, simplify (and in some cases fix) Time Stamp Counter (TSC)
primitives. (Andy Lutomirski)
- Add new, comprehensible entry and exit handlers written in C.
(Andy Lutomirski)
- vm86 mode cleanups and fixes. (Brian Gerst)
- 32-bit compat code cleanups. (Brian Gerst)
The amount of simplification in low level assembly code is already
palpable:
arch/x86/entry/entry_32.S | 130 +----
arch/x86/entry/entry_64.S | 197 ++-----
but more simplifications are planned.
There's also the usual laudry mix of low level changes - see the
changelog for details"
* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (83 commits)
x86/asm: Drop repeated macro of X86_EFLAGS_AC definition
x86/asm/msr: Make wrmsrl() a function
x86/asm/delay: Introduce an MWAITX-based delay with a configurable timer
x86/asm: Add MONITORX/MWAITX instruction support
x86/traps: Weaken context tracking entry assertions
x86/asm/tsc: Add rdtscll() merge helper
selftests/x86: Add syscall_nt selftest
selftests/x86: Disable sigreturn_64
x86/vdso: Emit a GNU hash
x86/entry: Remove do_notify_resume(), syscall_trace_leave(), and their TIF masks
x86/entry/32: Migrate to C exit path
x86/entry/32: Remove 32-bit syscall audit optimizations
x86/vm86: Rename vm86->v86flags and v86mask
x86/vm86: Rename vm86->vm86_info to user_vm86
x86/vm86: Clean up vm86.h includes
x86/vm86: Move the vm86 IRQ definitions to vm86.h
x86/vm86: Use the normal pt_regs area for vm86
x86/vm86: Eliminate 'struct kernel_vm86_struct'
x86/vm86: Move fields from 'struct kernel_vm86_struct' to 'struct vm86'
x86/vm86: Move vm86 fields out of 'thread_struct'
...
* pm-sleep:
PM / suspend: make sync() on suspend-to-RAM build-time optional
PM / sleep: Allow devices without runtime PM to do direct-complete
PM / autosleep: Use workqueue for user space wakeup sources garbage collector
* pm-domains:
PM / Domains: Fix typo in description of genpd_dev_pm_detach()
PM / Domains: Remove unusable governor dummies
PM / Domains: Make pm_genpd_init() available to modules
PM / domains: Align column headers and data in pm_genpd_summary output
PM / Domains: Return -EPROBE_DEFER if we fail to init or turn-on domain
PM / Domains: Correct unit address in power-controller example
PM / Domains: Remove intermediate states from the power off sequence
* pm-avs:
PM / AVS: rockchip-io: add io selectors and supplies for rk3368
PM / AVS: rockchip-io: depend on CONFIG_POWER_AVS
* pm-cpuidle:
cpuidle/coupled: Remove redundant 'dev' argument of cpuidle_state_is_coupled()
cpuidle/coupled: Remove cpuidle_device::safe_state_index
intel_idle: Skylake Client Support
intel_idle: allow idle states to be freeze-mode specific
* pm-devfreq:
PM / devfreq: exynos-ppmu: Update documentation to support PPMUv2
PM / devfreq: exynos-ppmu: Add the support of PPMUv2 for Exynos5433
PM / devfreq: event: Remove incorrect property in exynos-ppmu DT binding
* pm-clk:
PM / clk: don't return int on __pm_clk_enable()
* pm-opp:
PM / OPP: Drop unlikely before IS_ERR(_OR_NULL)
PM / OPP: Fix static checker warning (broken 64bit big endian systems)
PM / OPP: Free resources and properly return error on failure
cpufreq-dt: make scaling_boost_freqs sysfs attr available when boost is enabled
cpufreq: dt: Add support for turbo/boost mode
cpufreq: dt: Add support for operating-points-v2 bindings
cpufreq: Allow drivers to enable boost support after registering driver
cpufreq: Update boost flag while initializing freq table from OPPs
PM / OPP: add dev_pm_opp_is_turbo() helper
PM / OPP: Add helpers for initializing CPU OPPs
PM / OPP: Add support for opp-suspend
PM / OPP: Add OPP sharing information to OPP library
PM / OPP: Add clock-latency-ns support
PM / OPP: Add support to parse "operating-points-v2" bindings
PM / OPP: Break _opp_add_dynamic() into smaller functions
PM / OPP: Allocate dev_opp from _add_device_opp()
PM / OPP: Create _remove_device_opp() for freeing dev_opp
PM / OPP: Relocate few routines
PM / OPP: Create a directory for opp bindings
PM / OPP: Update bindings to make opp-hz a 64 bit value
* pm-cpufreq: (53 commits)
cpufreq: speedstep-lib: Use monotonic clock
cpufreq: powernv: Increase the verbosity of OCC console messages
cpufreq: sfi: use kmemdup rather than duplicating its implementation
cpufreq: drop !cpufreq_driver check from cpufreq_parse_governor()
cpufreq: rename cpufreq_real_policy as cpufreq_user_policy
cpufreq: remove redundant 'policy' field from user_policy
cpufreq: remove redundant 'governor' field from user_policy
cpufreq: update user_policy.* on success
cpufreq: use memcpy() to copy policy
cpufreq: remove redundant CPUFREQ_INCOMPATIBLE notifier event
cpufreq: mediatek: Add MT8173 cpufreq driver
dt-bindings: mediatek: Add MT8173 CPU DVFS clock bindings
intel_pstate: append more Oracle OEM table id to vendor bypass list
intel_pstate: Add SKY-S support
intel_pstate: Fix possible overflow complained by Coverity
cpufreq: Correct a freq check in cpufreq_set_policy()
cpufreq: Lock CPU online/offline in cpufreq_register_driver()
cpufreq: Replace recover_policy with new_policy in cpufreq_online()
cpufreq: Separate CPU device registration from CPU online
cpufreq: powernv: Restore cpu frequency to policy->cur on unthrottling
...
Its all about caching min/max freq requested by userspace, and
the name 'cpufreq_real_policy' doesn't fit that well. Rename it to
cpufreq_user_policy.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Its always same as policy->policy, and there is no need to keep another
copy of it. Remove it.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Its always same as policy->governor, and there is no need to keep
another copy of it. Remove it.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
What's being done from CPUFREQ_INCOMPATIBLE, can also be done with
CPUFREQ_ADJUST. There is nothing special with CPUFREQ_INCOMPATIBLE
notifier.
Kill CPUFREQ_INCOMPATIBLE and fix its usage sites.
This also updates the numbering of notifier events to remove holes.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Commit 0838aa7fcf ("netfilter: fix netns dependencies with conntrack
templates") migrated templates to the new allocator api, but forgot to
update error paths for them in CT and synproxy to use nf_ct_tmpl_free()
instead of nf_conntrack_free().
Due to that, memory is being freed into the wrong kmemcache, but also
we drop the per net reference count of ct objects causing an imbalance.
In Brad's case, this leads to a wrap-around of net->ct.count and thus
lets __nf_conntrack_alloc() refuse to create a new ct object:
[ 10.340913] xt_addrtype: ipv6 does not support BROADCAST matching
[ 10.810168] nf_conntrack: table full, dropping packet
[ 11.917416] r8169 0000:07:00.0 eth0: link up
[ 11.917438] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 12.815902] nf_conntrack: table full, dropping packet
[ 15.688561] nf_conntrack: table full, dropping packet
[ 15.689365] nf_conntrack: table full, dropping packet
[ 15.690169] nf_conntrack: table full, dropping packet
[ 15.690967] nf_conntrack: table full, dropping packet
[...]
With slab debugging, it also reports the wrong kmemcache (kmalloc-512 vs.
nf_conntrack_ffffffff81ce75c0) and reports poison overwrites, etc. Thus,
to fix the problem, export and use nf_ct_tmpl_free() instead.
Fixes: 0838aa7fcf ("netfilter: fix netns dependencies with conntrack templates")
Reported-by: Brad Jackson <bjackson0971@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
So the drivers can be compiled with CONFIG_RESET_CONTROLLER disabled.
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
opts_size is only written and never read. Following patch
removes this unused variable.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull NOHZ updates from Ingo Molnar:
"The main changes, mostly written by Frederic Weisbecker, include:
- Fix some jiffies based cputime assumptions. (No real harm because
the concerned code isn't used by full dynticks.)
- Simplify jiffies <-> usecs conversions. Remove dead code.
- Remove early hacks on nohz full code that avoided messing up idle
nohz internals. Now nohz integrates well full and idle and such
hack have become needless.
- Restart nohz full tick from irq exit. (A simplification and a
preparation for future optimization on scheduler kick to nohz
full)
- Code cleanups.
- Tile driver isolation enhancement on top of nohz. (Chris Metcalf)"
* 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
nohz: Remove useless argument on tick_nohz_task_switch()
nohz: Move tick_nohz_restart_sched_tick() above its users
nohz: Restart nohz full tick from irq exit
nohz: Remove idle task special case
nohz: Prevent tilegx network driver interrupts
alpha: Fix jiffies based cputime assumption
apm32: Fix cputime == jiffies assumption
jiffies: Remove HZ > USEC_PER_SEC special case
Pull scheduler updates from Ingo Molnar:
"The biggest change in this cycle is the rewrite of the main SMP load
balancing metric: the CPU load/utilization. The main goal was to make
the metric more precise and more representative - see the changelog of
this commit for the gory details:
9d89c257df ("sched/fair: Rewrite runnable load and utilization average tracking")
It is done in a way that significantly reduces complexity of the code:
5 files changed, 249 insertions(+), 494 deletions(-)
and the performance testing results are encouraging. Nevertheless we
need to keep an eye on potential regressions, since this potentially
affects every SMP workload in existence.
This work comes from Yuyang Du.
Other changes:
- SCHED_DL updates. (Andrea Parri)
- Simplify architecture callbacks by removing finish_arch_switch().
(Peter Zijlstra et al)
- cputime accounting: guarantee stime + utime == rtime. (Peter
Zijlstra)
- optimize idle CPU wakeups some more - inspired by Facebook server
loads. (Mike Galbraith)
- stop_machine fixes and updates. (Oleg Nesterov)
- Introduce the 'trace_sched_waking' tracepoint. (Peter Zijlstra)
- sched/numa tweaks. (Srikar Dronamraju)
- misc fixes and small cleanups"
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (44 commits)
sched/deadline: Fix comment in enqueue_task_dl()
sched/deadline: Fix comment in push_dl_tasks()
sched: Change the sched_class::set_cpus_allowed() calling context
sched: Make sched_class::set_cpus_allowed() unconditional
sched: Fix a race between __kthread_bind() and sched_setaffinity()
sched: Ensure a task has a non-normalized vruntime when returning back to CFS
sched/numa: Fix NUMA_DIRECT topology identification
tile: Reorganize _switch_to()
sched, sparc32: Update scheduler comments in copy_thread()
sched: Remove finish_arch_switch()
sched, tile: Remove finish_arch_switch
sched, sh: Fold finish_arch_switch() into switch_to()
sched, score: Remove finish_arch_switch()
sched, avr32: Remove finish_arch_switch()
sched, MIPS: Get rid of finish_arch_switch()
sched, arm: Remove finish_arch_switch()
sched/fair: Clean up load average references
sched/fair: Provide runnable_load_avg back to cfs_rq
sched/fair: Remove task and group entity load when they are dead
sched/fair: Init cfs_rq's sched_entity load average
...
Pull perf updates from Ingo Molnar:
"Main perf kernel side changes:
- uprobes updates/fixes. (Oleg Nesterov)
- Add PERF_RECORD_SWITCH to indicate context switches and use it in
tooling. (Adrian Hunter)
- Support BPF programs attached to uprobes and first steps for BPF
tooling support. (Wang Nan)
- x86 generic x86 MSR-to-perf PMU driver. (Andy Lutomirski)
- x86 Intel PT, LBR and BTS updates. (Alexander Shishkin)
- x86 Intel Skylake support. (Andi Kleen)
- x86 Intel Knights Landing (KNL) RAPL support. (Dasaratharaman
Chandramouli)
- x86 Intel Broadwell-DE uncore support. (Kan Liang)
- x86 hw breakpoints robustization (Andy Lutomirski)
Main perf tooling side changes:
- Support Intel PT in several tools, enabling the use of the
processor trace feature introduced in Intel Broadwell processors:
(Adrian Hunter)
# dmesg | grep Performance
# [0.188477] Performance Events: PEBS fmt2+, 16-deep LBR, Broadwell events, full-width counters, Intel PMU driver.
# perf record -e intel_pt//u -a sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.216 MB perf.data ]
# perf script # then navigate in the tool output to some area, like this one:
184 1030 dl_main (/usr/lib64/ld-2.17.so) => 7f21ba661440 dl_main (/usr/lib64/ld-2.17.so)
185 1457 dl_main (/usr/lib64/ld-2.17.so) => 7f21ba669f10 _dl_new_object (/usr/lib64/ld-2.17.so)
186 9f37 _dl_new_object (/usr/lib64/ld-2.17.so) => 7f21ba677b90 strlen (/usr/lib64/ld-2.17.so)
187 7ba3 strlen (/usr/lib64/ld-2.17.so) => 7f21ba677c75 strlen (/usr/lib64/ld-2.17.so)
188 7c78 strlen (/usr/lib64/ld-2.17.so) => 7f21ba669f3c _dl_new_object (/usr/lib64/ld-2.17.so)
189 9f8a _dl_new_object (/usr/lib64/ld-2.17.so) => 7f21ba65fab0 calloc@plt (/usr/lib64/ld-2.17.so)
190 fab0 calloc@plt (/usr/lib64/ld-2.17.so) => 7f21ba675e70 calloc (/usr/lib64/ld-2.17.so)
191 5e87 calloc (/usr/lib64/ld-2.17.so) => 7f21ba65fa90 malloc@plt (/usr/lib64/ld-2.17.so)
192 fa90 malloc@plt (/usr/lib64/ld-2.17.so) => 7f21ba675e60 malloc (/usr/lib64/ld-2.17.so)
193 5e68 malloc (/usr/lib64/ld-2.17.so) => 7f21ba65fa80 __libc_memalign@plt (/usr/lib64/ld-2.17.so)
194 fa80 __libc_memalign@plt (/usr/lib64/ld-2.17.so) => 7f21ba675d50 __libc_memalign (/usr/lib64/ld-2.17.so)
195 5d63 __libc_memalign (/usr/lib64/ld-2.17.so) => 7f21ba675e20 __libc_memalign (/usr/lib64/ld-2.17.so)
196 5e40 __libc_memalign (/usr/lib64/ld-2.17.so) => 7f21ba675d73 __libc_memalign (/usr/lib64/ld-2.17.so)
197 5d97 __libc_memalign (/usr/lib64/ld-2.17.so) => 7f21ba675e18 __libc_memalign (/usr/lib64/ld-2.17.so)
198 5e1e __libc_memalign (/usr/lib64/ld-2.17.so) => 7f21ba675df9 __libc_memalign (/usr/lib64/ld-2.17.so)
199 5e10 __libc_memalign (/usr/lib64/ld-2.17.so) => 7f21ba669f8f _dl_new_object (/usr/lib64/ld-2.17.so)
200 9fc2 _dl_new_object (/usr/lib64/ld-2.17.so) => 7f21ba678e70 memcpy (/usr/lib64/ld-2.17.so)
201 8e8c memcpy (/usr/lib64/ld-2.17.so) => 7f21ba678ea0 memcpy (/usr/lib64/ld-2.17.so)
- Add support for using several Intel PT features (CYC, MTC packets),
the relevant documentation was updated in:
tools/perf/Documentation/intel-pt.txt
briefly describing those packets, its purposes, how to configure
them in the event config terms and relevant external documentation
for further reading. (Adrian Hunter)
- Introduce support for probing at an absolute address, for user and
kernel 'perf probe's, useful when one have the symbol maps on a
developer machine but not on an embedded system. (Wang Nan)
- Add Intel BTS support, with a call-graph script to show it and PT
in use in a GUI using 'perf script' python scripting with
postgresql and Qt. (Adrian Hunter)
- Allow selecting the type of callchains per event, including
disabling callchains in all but one entry in an event list, to save
space, and also to ask for the callchains collected in one event to
be used in other events. (Kan Liang)
- Beautify more syscall arguments in 'perf trace': (Arnaldo Carvalho
de Melo)
* A bunch more translate file/pathnames from pointers to strings.
* Convert numbers to strings for the 'keyctl' syscall 'option'
arg.
* Add missing 'clockid' entries.
- Introduce 'srcfile' sort key: (Andi Kleen)
# perf record -F 10000 usleep 1
# perf report --stdio --dsos '[kernel.vmlinux]' -s srcfile
<SNIP>
# Overhead Source File
26.49% copy_page_64.S
5.49% signal.c
0.51% msr.h
#
It can be combined with other fields, for instance, experiment with
'-s srcfile,symbol'.
There are some oddities in some distros and with some specific
DSOs, being investigated, so your mileage may vary.
- Support per-event 'freq' term: (Namhyung Kim)
$ perf record -e 'cpu/instructions,freq=1234/',cycles -c 1000 sleep 1
$ perf evlist -F
cpu/instructions,freq=1234/: sample_freq=1234
cycles: sample_period=1000
$
- Deref sys_enter pointer args with contents from probe:vfs_getname,
showing pathnames instead of pointers in many syscalls in 'perf
trace'. (Arnaldo Carvalho de Melo)
- Stop collecting /proc/kallsyms in perf.data files, saving about
4.5MB on a typical x86-64 system, use the the symbol resolution
routines used in all the other tools (report, top, etc) now that we
can ask libtraceevent to use perf's symbol resolution code.
(Arnaldo Carvalho de Melo)
- Allow filtering out of perf's PID via 'perf record --exclude-perf'.
(Wang Nan)
- 'perf trace' now supports syscall groups, like strace, i.e:
$ trace -e file touch file
Will expand 'file' into multiple, file related, syscalls. More
work needed to add extra groups for other syscall groups, and also
to complement what was added for the 'file' group, included as a
proof of concept. (Arnaldo Carvalho de Melo)
- Add lock_pi stresser to 'perf bench futex', to test the kernel code
related to FUTEX_(UN)LOCK_PI. (Davidlohr Bueso)
- Let user have timestamps with per-thread recording in 'perf record'
(Adrian Hunter)
- ... and tons of other changes, see the shortlog and the Git log for
details"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (240 commits)
perf evlist: Add backpointer for perf_env to evlist
perf tools: Rename perf_session_env to perf_env
perf tools: Do not change lib/api/fs/debugfs directly
perf tools: Add tracing_path and remove unneeded functions
perf buildid: Introduce sysfs/filename__sprintf_build_id
perf evsel: Add a backpointer to the evlist a evsel is in
perf trace: Add header with copyright and background info
perf scripts python: Add new compaction-times script
perf stat: Get correct cpu id for print_aggr
tools lib traceeveent: Allow for negative numbers in print format
perf script: Add --[no-]-demangle/--[no-]-demangle-kernel
tracing/uprobes: Do not print '0x (null)' when offset is 0
perf probe: Support probing at absolute address
perf probe: Fix error reported when offset without function
perf probe: Fix list result when address is zero
perf probe: Fix list result when symbol can't be found
tools build: Allow duplicate objects in the object list
perf tools: Remove export.h from MANIFEST
perf probe: Prevent segfault when reading probe point with absolute address
perf tools: Update Intel PT documentation
...