Commit Graph

870115 Commits

Author SHA1 Message Date
Vlastimil Babka 7e2f2a0cd1 mm, page_owner: record page owner for each subpage
Patch series "debug_pagealloc improvements through page_owner", v2.

The debug_pagealloc functionality serves a similar purpose on the page
allocator level that slub_debug does on the kmalloc level, which is to
detect bad users.  One notable feature that slub_debug has is storing
stack traces of who last allocated and freed the object.  On page level we
track allocations via page_owner, but that info is discarded when freeing,
and we don't track freeing at all.  This series improves those aspects.
With both debug_pagealloc and page_owner enabled, we can then get bug
reports such as the example in Patch 4.

SLUB debug tracking additionally stores cpu, pid and timestamp.  This could
be added later, if deemed useful enough to justify the additional page_ext
structure size.

This patch (of 3):

Currently, page owner info is only recorded for the first page of a
high-order allocation, and copied to tail pages in the event of a split
page.  With the plan to keep previous owner info after freeing the page,
it would be benefical to record page owner for each subpage upon
allocation.  This increases the overhead for high orders, but that should
be acceptable for a debugging option.

The order stored for each subpage is the order of the whole allocation.
This makes it possible to calculate the "head" pfn and to recognize "tail"
pages (quoted because not all high-order allocations are compound pages
with true head and tail pages).  When reading the page_owner debugfs file,
keep skipping the "tail" pages so that stats gathered by existing scripts
don't get inflated.

Link: http://lkml.kernel.org/r/20190820131828.22684-3-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:08 -07:00
Yu Zhao e7a1aaf287 mm: replace list_move_tail() with add_page_to_lru_list_tail()
This is a cleanup patch that replaces two historical uses of
list_move_tail() with relatively recent add_page_to_lru_list_tail().

Link: http://lkml.kernel.org/r/20190716212436.7137-1-yuzhao@google.com
Signed-off-by: Yu Zhao <yuzhao@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:08 -07:00
Matthew Wilcox (Oracle) d8c6546b1a mm: introduce compound_nr()
Replace 1 << compound_order(page) with compound_nr(page).  Minor
improvements in readability.

Link: http://lkml.kernel.org/r/20190721104612.19120-4-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:08 -07:00
Matthew Wilcox (Oracle) 94ad933810 mm: introduce page_shift()
Replace PAGE_SHIFT + compound_order(page) with the new page_shift()
function.  Minor improvements in readability.

[akpm@linux-foundation.org: fix build in tce_page_is_contained()]
  Link: http://lkml.kernel.org/r/201907241853.yNQTrJWd%25lkp@intel.com
Link: http://lkml.kernel.org/r/20190721104612.19120-3-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:08 -07:00
Matthew Wilcox (Oracle) a50b854e07 mm: introduce page_size()
Patch series "Make working with compound pages easier", v2.

These three patches add three helpers and convert the appropriate
places to use them.

This patch (of 3):

It's unnecessarily hard to find out the size of a potentially huge page.
Replace 'PAGE_SIZE << compound_order(page)' with page_size(page).

Link: http://lkml.kernel.org/r/20190721104612.19120-2-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:08 -07:00
YueHaibing 1f18b29669 mm/rmap.c: remove set but not used variable 'cstart'
Fixes gcc '-Wunused-but-set-variable' warning:

mm/rmap.c: In function page_mkclean_one:
mm/rmap.c:906:17: warning: variable cstart set but not used [-Wunused-but-set-variable]

It is not used any more since
commit cdb07bdea2 ("mm/rmap.c: remove redundant variable cend")

Link: http://lkml.kernel.org/r/20190724141453.38536-1-yuehaibing@huawei.com
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reported-by: Hulk Robot <hulkci@huawei.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:08 -07:00
Christophe JAILLET dbf7684e29 mm/page_poison.c: fix a typo in a comment
s/posioned/poisoned/

Link: http://lkml.kernel.org/r/20190721180908.6534-1-christophe.jaillet@wanadoo.fr
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:08 -07:00
Mark Rutland b92a953cb7 lib/test_kasan.c: add roundtrip tests
In several places we need to be able to operate on pointers which have
gone via a roundtrip:

	virt -> {phys,page} -> virt

With KASAN_SW_TAGS, we can't preserve the tag for SLUB objects, and the
{phys,page} -> virt conversion will use KASAN_TAG_KERNEL.

This patch adds tests to ensure that this works as expected, without
false positives which have recently been spotted [1,2] in testing.

[1] https://lore.kernel.org/linux-arm-kernel/20190819114420.2535-1-walter-zh.wu@mediatek.com/
[2] https://lore.kernel.org/linux-arm-kernel/20190819132347.GB9927@lakrids.cambridge.arm.com/

[akpm@linux-foundation.org: coding-style fixes]
Link: http://lkml.kernel.org/r/20190821153927.28630-1-mark.rutland@arm.com
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:08 -07:00
Walter Wu ae8f06b31a kasan: add memory corruption identification for software tag-based mode
Add memory corruption identification at bug report for software tag-based
mode.  The report shows whether it is "use-after-free" or "out-of-bound"
error instead of "invalid-access" error.  This will make it easier for
programmers to see the memory corruption problem.

We extend the slab to store five old free pointer tag and free backtrace,
we can check if the tagged address is in the slab record and make a good
guess if the object is more like "use-after-free" or "out-of-bound".
therefore every slab memory corruption can be identified whether it's
"use-after-free" or "out-of-bound".

[aryabinin@virtuozzo.com: simplify & clenup code]
  Link: https://lkml.kernel.org/r/3318f9d7-a760-3cc8-b700-f06108ae745f@virtuozzo.com]
Link: http://lkml.kernel.org/r/20190821180332.11450-1-aryabinin@virtuozzo.com
Signed-off-by: Walter Wu <walter-zh.wu@mediatek.com>
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Acked-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Alexander Potapenko <glider@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:07 -07:00
Qian Cai c59180ae3e mm/kmemleak: increase the max mem pool to 1M
There are some machines with slow disk and fast CPUs.  When they are under
memory pressure, it could take a long time to swap before the OOM kicks in
to free up some memory.  As the results, it needs a large mem pool for
kmemleak or suffering from higher chance of a kmemleak metadata allocation
failure.  524288 proves to be the good number for all architectures here.
Increase the upper bound to 1M to leave some room for the future.

Link: http://lkml.kernel.org/r/1565807572-26041-1-git-send-email-cai@lca.pw
Signed-off-by: Qian Cai <cai@lca.pw>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:07 -07:00
Qian Cai 0e965a6bda mm/kmemleak.c: record the current memory pool size
The only way to obtain the current memory pool size for a running kernel
is to check the kernel config file which is inconvenient.  Record it in
the kernel messages.

[akpm@linux-foundation.org: s/memory pool size/memory pool/available/, per Catalin]
Link: http://lkml.kernel.org/r/1565809631-28933-1-git-send-email-cai@lca.pw
Signed-off-by: Qian Cai <cai@lca.pw>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:07 -07:00
Catalin Marinas c566586818 mm: kmemleak: use the memory pool for early allocations
Currently kmemleak uses a static early_log buffer to trace all memory
allocation/freeing before the slab allocator is initialised.  Such early
log is replayed during kmemleak_init() to properly initialise the kmemleak
metadata for objects allocated up that point.  With a memory pool that
does not rely on the slab allocator, it is possible to skip this early log
entirely.

In order to remove the early logging, consider kmemleak_enabled == 1 by
default while the kmem_cache availability is checked directly on the
object_cache and scan_area_cache variables.  The RCU callback is only
invoked after object_cache has been initialised as we wouldn't have any
concurrent list traversal before this.

In order to reduce the number of callbacks before kmemleak is fully
initialised, move the kmemleak_init() call to mm_init().

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: remove WARN_ON(), per Catalin]
Link: http://lkml.kernel.org/r/20190812160642.52134-4-catalin.marinas@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Qian Cai <cai@lca.pw>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:07 -07:00
Catalin Marinas 0647398a8c mm: kmemleak: simple memory allocation pool for kmemleak objects
Add a memory pool for struct kmemleak_object in case the normal
kmem_cache_alloc() fails under the gfp constraints passed by the caller.
The mem_pool[] array size is currently fixed at 16000.

We are not using the existing mempool kernel API since this requires
the slab allocator to be available (for pool->elements allocation).  A
subsequent kmemleak patch will replace the static early log buffer with
the pool allocation introduced here and this functionality is required
to be available before the slab was initialised.

Link: http://lkml.kernel.org/r/20190812160642.52134-3-catalin.marinas@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Qian Cai <cai@lca.pw>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:07 -07:00
Catalin Marinas dba82d9431 mm: kmemleak: make the tool tolerant to struct scan_area allocation failures
Patch series "mm: kmemleak: Use a memory pool for kmemleak object
allocations", v3.

Following the discussions on v2 of this patch(set) [1], this series takes
slightly different approach:

- it implements its own simple memory pool that does not rely on the
  slab allocator

- drops the early log buffer logic entirely since it can now allocate
  metadata from the memory pool directly before kmemleak is fully
  initialised

- CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE option is renamed to
  CONFIG_DEBUG_KMEMLEAK_MEM_POOL_SIZE

- moves the kmemleak_init() call earlier (mm_init())

- to avoid a separate memory pool for struct scan_area, it makes the
  tool robust when such allocations fail as scan areas are rather an
  optimisation

[1] http://lkml.kernel.org/r/20190727132334.9184-1-catalin.marinas@arm.com

This patch (of 3):

Object scan areas are an optimisation aimed to decrease the false
positives and slightly improve the scanning time of large objects known to
only have a few specific pointers.  If a struct scan_area fails to
allocate, kmemleak can still function normally by scanning the full
object.

Introduce an OBJECT_FULL_SCAN flag and mark objects as such when scan_area
allocation fails.

Link: http://lkml.kernel.org/r/20190812160642.52134-2-catalin.marinas@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Qian Cai <cai@lca.pw>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:07 -07:00
Nicolas Boichat b751c52bb5 kmemleak: increase DEBUG_KMEMLEAK_EARLY_LOG_SIZE default to 16K
The current default value (400) is too low on many systems (e.g.  some
ARM64 platform takes up 1000+ entries).

syzbot uses 16000 as default value, and has proved to be enough on beefy
configurations, so let's pick that value.

This consumes more RAM on boot (each entry is 160 bytes, so in total
~2.5MB of RAM), but the memory would later be freed (early_log is
__initdata).

Link: http://lkml.kernel.org/r/20190730154027.101525-1-drinkcat@chromium.org
Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>
Suggested-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Joe Lawrence <joe.lawrence@redhat.com>
Cc: Uladzislau Rezki <urezki@gmail.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:07 -07:00
Qian Cai 9d5f0be0f7 mm/slub.c: fix -Wunused-function compiler warnings
tid_to_cpu() and tid_to_event() are only used in note_cmpxchg_failure()
when SLUB_DEBUG_CMPXCHG=y, so when SLUB_DEBUG_CMPXCHG=n by default, Clang
will complain that those unused functions.

Link: http://lkml.kernel.org/r/1568752232-5094-1-git-send-email-cai@lca.pw
Signed-off-by: Qian Cai <cai@lca.pw>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:07 -07:00
Waiman Long 9adeaa2269 mm, slab: move memcg_cache_params structure to mm/slab.h
The memcg_cache_params structure is only embedded into the kmem_cache of
slab and slub allocators as defined in slab_def.h and slub_def.h and used
internally by mm code.  There is no needed to expose it in a public
header.  So move it from include/linux/slab.h to mm/slab.h.  It is just a
refactoring patch with no code change.

In fact both the slub_def.h and slab_def.h should be moved into the mm
directory as well, but that will probably cause many merge conflicts.

Link: http://lkml.kernel.org/r/20190718180827.18758-1-longman@redhat.com
Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:07 -07:00
Waiman Long 04f768a39d mm, slab: extend slab/shrink to shrink all memcg caches
Currently, a value of '1" is written to /sys/kernel/slab/<slab>/shrink
file to shrink the slab by flushing out all the per-cpu slabs and free
slabs in partial lists.  This can be useful to squeeze out a bit more
memory under extreme condition as well as making the active object counts
in /proc/slabinfo more accurate.

This usually applies only to the root caches, as the SLUB_MEMCG_SYSFS_ON
option is usually not enabled and "slub_memcg_sysfs=1" not set.  Even if
memcg sysfs is turned on, it is too cumbersome and impractical to manage
all those per-memcg sysfs files in a real production system.

So there is no practical way to shrink memcg caches.  Fix this by enabling
a proper write to the shrink sysfs file of the root cache to scan all the
available memcg caches and shrink them as well.  For a non-root memcg
cache (when SLUB_MEMCG_SYSFS_ON or slub_memcg_sysfs is on), only that
cache will be shrunk when written.

On a 2-socket 64-core 256-thread arm64 system with 64k page after
a parallel kernel build, the the amount of memory occupied by slabs
before shrinking slabs were:

 # grep task_struct /proc/slabinfo
 task_struct        53137  53192   4288   61    4 : tunables    0    0
 0 : slabdata    872    872      0
 # grep "^S[lRU]" /proc/meminfo
 Slab:            3936832 kB
 SReclaimable:     399104 kB
 SUnreclaim:      3537728 kB

After shrinking slabs (by echoing "1" to all shrink files):

 # grep "^S[lRU]" /proc/meminfo
 Slab:            1356288 kB
 SReclaimable:     263296 kB
 SUnreclaim:      1092992 kB
 # grep task_struct /proc/slabinfo
 task_struct         2764   6832   4288   61    4 : tunables    0    0
 0 : slabdata    112    112      0

Link: http://lkml.kernel.org/r/20190723151445.7385-1-longman@redhat.com
Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Roman Gushchin <guro@fb.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:07 -07:00
Colin Ian King 1c3ce5417b ocfs2: fix spelling mistake "ambigous" -> "ambiguous"
There is a spelling mistake in a mlog_bug_on_msg message. Fix it.

Link: http://lkml.kernel.org/r/831bdff4-064e-038b-f45d-c4d265cbff1e@linux.alibaba.com
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:07 -07:00
Changwei Ge d7283b39db ocfs2: checkpoint appending truncate log transaction before flushing
Appending truncate log(TA) and and flushing truncate log(TF) are two
separated transactions.  They can be both committed but not checkpointed.
If crash occurs then, both transaction will be replayed with several
already released to global bitmap clusters.  Then truncate log will be
replayed resulting in cluster double free.

To reproduce this issue, just crash the host while punching hole to files.

Signed-off-by: Changwei Ge <gechangwei@live.cn>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:07 -07:00
Changwei Ge 0a3775e4f8 ocfs2: wait for recovering done after direct unlock request
There is a scenario causing ocfs2 umount hang when multiple hosts are
rebooting at the same time.

NODE1                           NODE2               NODE3
send unlock requset to NODE2
                                dies
                                                    become recovery master
                                                    recover NODE2
find NODE2 dead
mark resource RECOVERING
directly remove lock from grant list
calculate usage but RECOVERING marked
**miss the window of purging
clear RECOVERING

To reproduce this issue, crash a host and then umount ocfs2
from another node.

To solve this, just let unlock progress wait for recovery done.

Link: http://lkml.kernel.org/r/1550124866-20367-1-git-send-email-gechangwei@live.cn
Signed-off-by: Changwei Ge <gechangwei@live.cn>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:07 -07:00
Markus Elfring a89bd89fae ocfs2: delete unnecessary checks before brelse()
brelse() tests whether its argument is NULL and then returns immediately.
Thus the tests around the shown calls are not needed.

This issue was detected by using the Coccinelle software.

Link: http://lkml.kernel.org/r/55cde320-394b-f985-56ce-1a2abea782aa@web.de
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:07 -07:00
zhengbin 77461ba1d1 fs/ocfs2/dir.c: remove set but not used variables
Fixes gcc '-Wunused-but-set-variable' warning:

fs/ocfs2/dir.c: In function ocfs2_dx_dir_transfer_leaf:
fs/ocfs2/dir.c:3653:42: warning: variable new_list set but not used [-Wunused-but-set-variable]

Link: http://lkml.kernel.org/r/1566522588-63786-4-git-send-email-joseph.qi@linux.alibaba.com
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reported-by: Hulk Robot <hulkci@huawei.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Changwei Ge <chge@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:07 -07:00
zhengbin 236dcc2ae4 fs/ocfs2/file.c: remove set but not used variables
Fixes gcc '-Wunused-but-set-variable' warning:

fs/ocfs2/file.c: In function ocfs2_prepare_inode_for_write:
fs/ocfs2/file.c:2143:9: warning: variable end set but not used [-Wunused-but-set-variable]

Link: http://lkml.kernel.org/r/1566522588-63786-3-git-send-email-joseph.qi@linux.alibaba.com
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reported-by: Hulk Robot <hulkci@huawei.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Changwei Ge <chge@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:07 -07:00
zhengbin 225dcadf8e fs/ocfs2/namei.c: remove set but not used variables
Fixes gcc '-Wunused-but-set-variable' warning:

fs/ocfs2/namei.c: In function ocfs2_create_inode_in_orphan:
fs/ocfs2/namei.c:2503:23: warning: variable di set but not used [-Wunused-but-set-variable]

Link: http://lkml.kernel.org/r/1566522588-63786-2-git-send-email-joseph.qi@linux.alibaba.com
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reported-by: Hulk Robot <hulkci@huawei.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Changwei Ge <chge@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:07 -07:00
Guozhonghua bf5a526479 ocfs2: remove unused ocfs2_orphan_scan_exit() declaration
ocfs2_orphan_scan_exit() is declared but not implemented.  Also perform a
minor cleanup in ocfs2_link_credits()

Link: http://lkml.kernel.org/r/71604351584F6A4EBAE558C676F37CA4014FC208AC@H3CMLB12-EX.srv.huawei-3com.com
Signed-off-by: guozhonghua <guozhonghua@h3c.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:07 -07:00
Guozhonghua 3dd21cdbef ocfs2: remove unused ocfs2_calc_tree_trunc_credits()
ocfs2_calc_tree_trunc_credits() is not called anywhere.

Link: http://lkml.kernel.org/r/71604351584F6A4EBAE558C676F37CA4014FC2050F@H3CMLB12-EX.srv.huawei-3com.com
Signed-off-by: guozhonghua <guozhonghua@h3c.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:07 -07:00
Greg Kroah-Hartman 5e7a3ed9f1 ocfs2: further debugfs cleanups
There is no need to check return value of debugfs_create functions, but
the last sweep through ocfs missed a number of places where this was
happening.  There is also no need to save the individual dentries for the
debugfs files, as everything is can just be removed at once when the
directory is removed.

By getting rid of the file dentries for the debugfs entries, a bit of
local memory can be saved as well.

[colin.king@canonical.com: ensure ret is set to zero before returning]
  Link: http://lkml.kernel.org/r/20190807121929.28918-1-colin.king@canonical.com
Link: http://lkml.kernel.org/r/20190731132119.GA12603@kroah.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Jia Guo <guojia12@huawei.com>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:07 -07:00
Joseph Qi 963abb9aeb jbd2: remove jbd2_journal_inode_add_[write|wait]
Since ext4/ocfs2 are using jbd2_inode dirty range scoping APIs now,
jbd2_journal_inode_add_[write|wait] are not used any more, remove them.

Link: http://lkml.kernel.org/r/1562977611-8412-2-git-send-email-joseph.qi@linux.alibaba.com
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Ross Zwisler <zwisler@google.com>
Acked-by: Changwei Ge <chge@linux.alibaba.com>
Cc: Gang He <ghe@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:07 -07:00
Joseph Qi bbd0f32721 ocfs2: use jbd2_inode dirty range scoping
6ba0e7dc64 ("jbd2: introduce jbd2_inode dirty range scoping") allow us
scoping each of the inode dirty ranges associated with a given
transaction, and ext4 already does this way.

Now let's also use the newly introduced jbd2_inode dirty range scoping to
prevent us from waiting forever when trying to complete a journal
transaction in ocfs2.

Link: http://lkml.kernel.org/r/1562977611-8412-1-git-send-email-joseph.qi@linux.alibaba.com
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Ross Zwisler <zwisler@google.com>
Reviewed-by: Changwei Ge <chge@linux.alibaba.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:07 -07:00
Greg Thelen 6279eb3dd7 kbuild: clean compressed initramfs image
Since 9e3596b0c6 ("kbuild: initramfs cleanup, set target from Kconfig")
"make clean" leaves behind compressed initramfs images.  Example:

  $ make defconfig
  $ sed -i 's|CONFIG_INITRAMFS_SOURCE=""|CONFIG_INITRAMFS_SOURCE="/tmp/ir.cpio"|' .config
  $ make olddefconfig
  $ make -s
  $ make -s clean
  $ git clean -ndxf | grep initramfs
  Would remove usr/initramfs_data.cpio.gz

clean rules do not have CONFIG_* context so they do not know which
compression format was used.  Thus they don't know which files to delete.

Tell clean to delete all possible compression formats.

Once patched usr/initramfs_data.cpio.gz and friends are deleted by
"make clean".

Link: http://lkml.kernel.org/r/20190722063251.55541-1-gthelen@google.com
Fixes: 9e3596b0c6 ("kbuild: initramfs cleanup, set target from Kconfig")
Signed-off-by: Greg Thelen <gthelen@google.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:07 -07:00
Vitaly Wool 3f9d2b5766 z3fold: fix retry mechanism in page reclaim
z3fold_page_reclaim()'s retry mechanism is broken: on a second iteration
it will have zhdr from the first one so that zhdr is no longer in line
with struct page.  That leads to crashes when the system is stressed.

Fix that by moving zhdr assignment up.

While at it, protect against using already freed handles by using own
local slots structure in z3fold_page_reclaim().

Link: http://lkml.kernel.org/r/20190908162919.830388dc7404d1e2c80f4095@gmail.com
Signed-off-by: Vitaly Wool <vitalywool@gmail.com>
Reported-by: Markus Linnala <markus.linnala@gmail.com>
Reported-by: Chris Murphy <bugzilla@colorremedies.com>
Reported-by: Agustin Dall'Alba <agustin@dallalba.com.ar>
Cc: "Maciej S. Szmigiero" <mail@maciej.szmigiero.name>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Henry Burns <henrywolfeburns@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:06 -07:00
Arnd Bergmann 710ec38b0f mm: add dummy can_do_mlock() helper
On kernels without CONFIG_MMU, we get a link error for the siw driver:

drivers/infiniband/sw/siw/siw_mem.o: In function `siw_umem_get':
siw_mem.c:(.text+0x4c8): undefined reference to `can_do_mlock'

This is probably not the only driver that needs the function and could
otherwise build correctly without CONFIG_MMU, so add a dummy variant that
always returns false.

Link: http://lkml.kernel.org/r/20190909204201.931830-1-arnd@arndb.de
Fixes: 2251334dca ("rdma/siw: application buffer management")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Suggested-by: Jason Gunthorpe <jgg@mellanox.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Bernard Metzler <bmt@zurich.ibm.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:06 -07:00
Vitaly Wool 6e73fd25e2 Revert "mm/z3fold.c: fix race between migration and destruction"
With the original commit applied, z3fold_zpool_destroy() may get blocked
on wait_event() for indefinite time.  Revert this commit for the time
being to get rid of this problem since the issue the original commit
addresses is less severe.

Link: http://lkml.kernel.org/r/20190910123142.7a9c8d2de4d0acbc0977c602@gmail.com
Fixes: d776aaa989 ("mm/z3fold.c: fix race between migration and destruction")
Reported-by: Agustín Dall'Alba <agustin@dallalba.com.ar>
Signed-off-by: Vitaly Wool <vitalywool@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Vitaly Wool <vitalywool@gmail.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Jonathan Adams <jwadams@google.com>
Cc: Henry Burns <henrywolfeburns@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:06 -07:00
OGAWA Hirofumi 07bfa4415a fat: work around race with userspace's read via blockdev while mounting
If userspace reads the buffer via blockdev while mounting,
sb_getblk()+modify can race with buffer read via blockdev.

For example,

            FS                               userspace
    bh = sb_getblk()
    modify bh->b_data
                                  read
				    ll_rw_block(bh)
				      fill bh->b_data by on-disk data
				      /* lost modified data by FS */
				      set_buffer_uptodate(bh)
    set_buffer_uptodate(bh)

Userspace should not use the blockdev while mounting though, the udev
seems to be already doing this.  Although I think the udev should try to
avoid this, workaround the race by small overhead.

Link: http://lkml.kernel.org/r/87pnk7l3sw.fsf_-_@mail.parknet.co.jp
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Reported-by: Jan Stancek <jstancek@redhat.com>
Tested-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:06 -07:00
Linus Torvalds 5184d44960 Microblaze patches for 5.4-rc1
- Clean up reset gpio handler
 - Defconfig updates
 - Add support for 8 byte get_user()
 - Switch to generic dma code
 
 In merge please fix dma_atomic_pool_init reported also by:
 https://lkml.org/lkml/2019/9/2/393
 or
 https://lore.kernel.org/linux-next/20190902214011.2a5400c9@canb.auug.org.au/
 -----BEGIN PGP SIGNATURE-----
 
 iF0EABECAB0WIQQbPNTMvXmYlBPRwx7KSWXLKUoMIQUCXYnguwAKCRDKSWXLKUoM
 IaF8AKCawdYH+58xRg7riR7Evbv2kM0ghwCfXosnu6Ncv07UEY9Tv5zx/qibafk=
 =yI3X
 -----END PGP SIGNATURE-----

Merge tag 'microblaze-v5.4-rc1' of git://git.monstr.eu/linux-2.6-microblaze

Pull Microblaze updates from Michal Simek:

 - clean up reset gpio handler

 - defconfig updates

 - add support for 8 byte get_user()

 - switch to generic dma code

* tag 'microblaze-v5.4-rc1' of git://git.monstr.eu/linux-2.6-microblaze:
  microblaze: Switch to standard restart handler
  microblaze: defconfig synchronization
  microblaze: Enable Xilinx AXI emac driver by default
  arch/microblaze: support get_user() of size 8 bytes
  microblaze: remove ioremap_fullcache
  microblaze: use the generic dma coherent remap allocator
  microblaze/nommu: use the generic uncached segment support
2019-09-24 12:49:47 -07:00
Linus Torvalds baff384b0e platform-drivers-x86 for v5.4-2
* Fix compilation error of ASUS WMI driver when CONFIG_ACPI_BATTERY=n.
 * Fix I²C multi-instantiate driver to work with several USB PD devices.
 * Fix boot issue on Siemens SIMATIC IPC277E when PMC critical clock is
   being disabled.
 * Plenty of fixes to Intel Speed-Select Technology tools.
 
 The following is an automated git shortlog grouped by driver:
 
 asus-wmi:
  -  Make it depend on ACPI battery API
 
 i2c-multi-instantiate:
  -  Derive the device name from parent
 
 pmc_atom:
  -  Add Siemens SIMATIC IPC277E to critclk_systems DMI table
 
 tools/power/x86/intel-speed-select:
  -  Fix perf-profile command output
  -  Extend core-power command set
  -  Fix some debug prints
  -  Format get-assoc information
  -  Allow online/offline based on tdp
  -  Fix high priority core mask over count
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEqaflIX74DDDzMJJtb7wzTHR8rCgFAl2JyzMACgkQb7wzTHR8
 rChj1A//Uv4uFc5pWwYrvafvzho4uQ4TK0TDEvrDFEqxCcI7n2JGCWvXAuV8Lny4
 oJ7enVjqDDMWkmN4KcONR8BWhZii23TzB+CDr1enCKYQv5J//di9jezVHtANw5oC
 duKq2sPd7wkigpQmDk17sft5U2MPKOK9EgE/qMztNOSTm3XGcGSbD80Cr/o6P1w3
 3TAZy/lED5jqjwvKmkDq/6fB3GdCG/b6LK56jhay5lew9Xi+WK9bTO3rzPo9nlvx
 HKT3FuRhH3Dbx4EY4QO5ee1RVnwPG5swCjFw2ZPvpJoTsAxEMgbC2yaesRElLJvk
 odIZrGDh2LqP8GCvtg6CQACsnRHzrze3H8PK75sCkFLkVMmw5Tp0knDMEakDQ39T
 0lZWsHyN6x75Bmt15GIUCfYvDoBvvBar0UHwNwCQk4KS+IvH4F+CWC5gbCgHwvQZ
 6bw1OSkdpP/wjf99ad2HJ9yFKP19qeSPIMwDEyZUgyLZBJoU12kEOB66Yyrpskve
 djbsZe+hfBH8NFLMlgaBINFS4fISbUsYV+bOxPw2hSdbVdgajoOTPCaJOhfGEQap
 b//gWBrDoX8LyibDW/b2zRhUvp2X944Z+Ve+btC2+XSHNt7/oy8q3Kh33dLIlEic
 eMAyMKn3GhkxvT20AyxRoJf3Fy4W2KKPHu/QZ00VB/aKHz/6vc8=
 =Cd14
 -----END PGP SIGNATURE-----

Merge tag 'platform-drivers-x86-v5.4-2' of git://git.infradead.org/linux-platform-drivers-x86

Pull x86 platform-drivers fixes from Andy Shevchenko:

 - Fix compilation error of ASUS WMI driver when CONFIG_ACPI_BATTERY=n

 - Fix I²C multi-instantiate driver to work with several USB PD devices

 - Fix boot issue on Siemens SIMATIC IPC277E when PMC critical clock is
   being disabled

 - Plenty of fixes to Intel Speed-Select Technology tools

* tag 'platform-drivers-x86-v5.4-2' of git://git.infradead.org/linux-platform-drivers-x86:
  platform/x86: i2c-multi-instantiate: Derive the device name from parent
  platform/x86: pmc_atom: Add Siemens SIMATIC IPC277E to critclk_systems DMI table
  tools/power/x86/intel-speed-select: Fix perf-profile command output
  tools/power/x86/intel-speed-select: Extend core-power command set
  tools/power/x86/intel-speed-select: Fix some debug prints
  tools/power/x86/intel-speed-select: Format get-assoc information
  tools/power/x86/intel-speed-select: Allow online/offline based on tdp
  tools/power/x86/intel-speed-select: Fix high priority core mask over count
  platform/x86: asus-wmi: Make it depend on ACPI battery API
2019-09-24 12:39:40 -07:00
Linus Torvalds af5a7e99cc - First round of vmbus hibernation support from Dexuan Cui.
- Removal of dependencies on PAGE_SIZE by Maya Nakamura.
 - Moving the hyper-v tools/ code into the tools build system by Andy
 Shevchenko.
 - hyper-v balloon cleanups by Dexuan Cui.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE4n5dijQDou9mhzu83qZv95d3LNwFAl2JanAACgkQ3qZv95d3
 LNwf2Q//Rtmclmnk+lXn1BhEEiXtzliSY7wcjpRR87WCzTIj6p2y//R2PuweQr+b
 dAlXKd6reK/c2Q/FnaQ5Gf1daNWvMh/39viMaesGZcvoSWlT60gDPdnelj6Z8sO0
 gWDRQV5d4AHIs2garl2zTzO+TS9/8Ot/YD0gVpX4wNAy7j9ZeEvNVoanGvQ88Et3
 pSQFKTNLVPOLlMOchm6HAhwBo5k6Y1LB3/RE/qqcX1sR8/CLp4DT0VhsMVA1DZXV
 hb3a0tEzn8fJxifx8/iguZr84SetXA/qTKKWDG59xAU2kijJrLyb3KXRE92GOzlA
 HzwOlnX0vWpTTthEzaLlvOgFKybTNBGMEQQJKmpI2PucC0iaHmYVH2dDxhBb2gX5
 uJGGr4arHjMDQYfppCVy/VXE5hCpKE29L/7kl+DsElM6NkgyJAfK7Crpuxs8KMME
 HwHi5UwTSvaKv1XKilWIDy4PpuzvGx5ftPMyBqgEH/aLK9aP1N+folCTUc01qCFU
 vz/Yjrs/p/U7T9P4rDCXMb+IPiCpr1puBsC/z0RJvsKUdKrzDzpXPLU8Wagv6UxS
 iHpZRR/ArUYByRp3N42+PR8i9uqrcOxtNgzphnRsBo3lzOAphVaQY0tPQkBPSMp2
 SQI2NP1G74l3WdszeeHi446v6S40ichN/FYsDuiGCs9YJY78mMs=
 =Dk9i
 -----END PGP SIGNATURE-----

Merge tag 'hyperv-next-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux

Pull Hyper-V updates from Sasha Levin:

 - first round of vmbus hibernation support (Dexuan Cui)

 - remove dependencies on PAGE_SIZE (Maya Nakamura)

 - move the hyper-v tools/ code into the tools build system (Andy
   Shevchenko)

 - hyper-v balloon cleanups (Dexuan Cui)

* tag 'hyperv-next-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
  Drivers: hv: vmbus: Resume after fixing up old primary channels
  Drivers: hv: vmbus: Suspend after cleaning up hv_sock and sub channels
  Drivers: hv: vmbus: Clean up hv_sock channels by force upon suspend
  Drivers: hv: vmbus: Suspend/resume the vmbus itself for hibernation
  Drivers: hv: vmbus: Ignore the offers when resuming from hibernation
  Drivers: hv: vmbus: Implement suspend/resume for VSC drivers for hibernation
  Drivers: hv: vmbus: Add a helper function is_sub_channel()
  Drivers: hv: vmbus: Suspend/resume the synic for hibernation
  Drivers: hv: vmbus: Break out synic enable and disable operations
  HID: hv: Remove dependencies on PAGE_SIZE for ring buffer
  Tools: hv: move to tools buildsystem
  hv_balloon: Reorganize the probe function
  hv_balloon: Use a static page for the balloon_up send buffer
2019-09-24 12:36:31 -07:00
Linus Torvalds 0b36c9eed2 Merge branch 'work.mount3' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull more mount API conversions from Al Viro:
 "Assorted conversions of options parsing to new API.

  gfs2 is probably the most serious one here; the rest is trivial stuff.

  Other things in what used to be #work.mount are going to wait for the
  next cycle (and preferably go via git trees of the filesystems
  involved)"

* 'work.mount3' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  gfs2: Convert gfs2 to fs_context
  vfs: Convert spufs to use the new mount API
  vfs: Convert hypfs to use the new mount API
  hypfs: Fix error number left in struct pointer member
  vfs: Convert functionfs to use the new mount API
  vfs: Convert bpf to use the new mount API
2019-09-24 12:33:34 -07:00
Tony Luck 722e6f500a ia64: Fix some warnings introduced in merge window
Fix

  arch/ia64/kernel/irq_ia64.c:586:1: warning: no return statement in function returning non-void [-Wreturn-type]
  arch/ia64/mm/contig.c:111:6: warning: unused variable 'rc' [-Wunused-variable]
  arch/ia64/mm/discontig.c:189:39: warning: unused variable 'rc' [-Wunused-variable]

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 12:29:16 -07:00
YueHaibing 5addcd5dbd fuse: Make fuse_args_to_req static
Fix sparse warning:

fs/fuse/dev.c:468:6: warning: symbol 'fuse_args_to_req' was not declared. Should it be static?

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Fixes: 68583165f9 ("fuse: add pages to fuse_args")
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-24 15:28:02 +02:00
zhengbin 9ad09b1976 fuse: fix memleak in cuse_channel_open
If cuse_send_init fails, need to fuse_conn_put cc->fc.

cuse_channel_open->fuse_conn_init->refcount_set(&fc->count, 1)
                 ->fuse_dev_alloc->fuse_conn_get
                 ->fuse_dev_free->fuse_conn_put

Fixes: cc080e9e9b ("fuse: introduce per-instance fuse_dev structure")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-24 15:28:01 +02:00
Tejun Heo e5854b1cdf fuse: fix beyond-end-of-page access in fuse_parse_cache()
With DEBUG_PAGEALLOC on, the following triggers.

  BUG: unable to handle page fault for address: ffff88859367c000
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 3001067 P4D 3001067 PUD 406d3a8067 PMD 406d30c067 PTE 800ffffa6c983060
  Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
  CPU: 38 PID: 3110657 Comm: python2.7
  RIP: 0010:fuse_readdir+0x88f/0xe7a [fuse]
  Code: 49 8b 4d 08 49 39 4e 60 0f 84 44 04 00 00 48 8b 43 08 43 8d 1c 3c 4d 01 7e 68 49 89 dc 48 03 5c 24 38 49 89 46 60 8b 44 24 30 <8b> 4b 10 44 29 e0 48 89 ca 48 83 c1 1f 48 83 e1 f8 83 f8 17 49 89
  RSP: 0018:ffffc90035edbde0 EFLAGS: 00010286
  RAX: 0000000000001000 RBX: ffff88859367bff0 RCX: 0000000000000000
  RDX: 0000000000000000 RSI: ffff88859367bfed RDI: 0000000000920907
  RBP: ffffc90035edbe90 R08: 000000000000014b R09: 0000000000000004
  R10: ffff88859367b000 R11: 0000000000000000 R12: 0000000000000ff0
  R13: ffffc90035edbee0 R14: ffff889fb8546180 R15: 0000000000000020
  FS:  00007f80b5f4a740(0000) GS:ffff889fffa00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: ffff88859367c000 CR3: 0000001c170c2001 CR4: 00000000003606e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   iterate_dir+0x122/0x180
   __x64_sys_getdents+0xa6/0x140
   do_syscall_64+0x42/0x100
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

It's in fuse_parse_cache().  %rbx (ffff88859367bff0) is fuse_dirent
pointer - addr + offset.  FUSE_DIRENT_SIZE() is trying to dereference
namelen off of it but that derefs into the next page which is disabled
by pagealloc debug causing a PF.

This is caused by dirent->namelen being accessed before ensuring that
there's enough bytes in the page for the dirent.  Fix it by pushing
down reclen calculation.

Signed-off-by: Tejun Heo <tj@kernel.org>
Fixes: 5d7bc7e868 ("fuse: allow using readdir cache")
Cc: stable@vger.kernel.org # v4.20+
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-24 15:28:01 +02:00
Arnd Bergmann 0ed4059302 fuse: unexport fuse_put_request
This function has been made static, which now causes a compile-time
warning:

WARNING: "fuse_put_request" [vmlinux] is a static EXPORT_SYMBOL_GPL

Remove the unneeded export.

Fixes: 66abc3599c ("fuse: unexport request ops")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-24 15:28:01 +02:00
Khazhismel Kumykov dc69e98c24 fuse: kmemcg account fs data
account per-file, dentry, and inode data

blockdev/superblock and temporary per-request data was left alone, as
this usually isn't accounted

Reviewed-by: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Khazhismel Kumykov <khazhy@google.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-24 15:28:01 +02:00
Khazhismel Kumykov 30c6a23d34 fuse: on 64-bit store time in d_fsdata directly
Implements the optimization noted in commit f75fdf22b0 ("fuse: don't
use ->d_time"), as the additional memory can be significant.  (In
particular, on SLAB configurations this 8-byte alloc becomes 32 bytes).
Per-dentry, this can consume significant memory.

Reviewed-by: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Khazhismel Kumykov <khazhy@google.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-24 15:28:01 +02:00
Vasily Averin d5880c7a86 fuse: fix missing unlock_page in fuse_writepage()
unlock_page() was missing in case of an already in-flight write against the
same page.

Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Fixes: ff17be0864 ("fuse: writepage: skip already in flight")
Cc: <stable@vger.kernel.org> # v3.13
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-24 15:28:01 +02:00
Jussi Laako f41f900568 ALSA: usb-audio: Add DSD support for EVGA NU Audio
EVGA NU Audio is actually a USB audio device on a PCIexpress card,
with it's own USB controller. It supports both PCM and DSD.

Signed-off-by: Jussi Laako <jussi@sonarnerd.net>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20190924071143.30911-1-jussi@sonarnerd.net
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2019-09-24 13:58:20 +02:00
Linus Torvalds 4c07e2ddab - New Drivers
- Add support for Merrifield Basin Cove PMIC
 
  - New Device Support
    - Add support for Intel Tiger Lake to Intel LPSS PCI
    - Add support for Intel Sky Lake to Intel LPSS PCI
    - Add support for ST-Ericsson DB8520 to DB8500 PRCMU
 
  - New Functionality
    - Add RTC and PWRC support to MT6323
 
  - Fix-ups
    - Clean-up include files; davinci_voicecodec, asic3, sm501, mt6397
    - Ignore return values from debugfs_create*(); ab3100-*, ab8500-debugfs, aat2870-core
    - Device Tree changes; rn5t618, mt6397
    - Use new I2C API; tps80031, 88pm860x-core, ab3100-core, bcm590xx,
                       da9150-core, max14577, max77693, max77843, max8907,
                       max8925-i2c, max8997, max8998, palmas, twl-core,
    - Remove obsolete code; da9063, jz4740-adc
    - Simplify semantics; timberdale, htc-i2cpld
    - Add 'fall-through' tags; omap-usb-host, db8500-prcmu
    - Remove superfluous prints; ab8500-debugfs, db8500-prcmu, fsl-imx25-tsadc,
                                 intel_soc_pmic_bxtwc, qcom_rpm, sm501
    - Trivial rename/whitespace/typo fixes; mt6397-core, MAINTAINERS
    - Reorganise code structure; mt6397-*
    - Improve code consistency; intel-lpss
    - Use MODULE_SOFTDEP() helper; intel-lpss
    - Use DEFINE_RES_*() helpers; mt6397-core
 
  - Bug Fixes
    - Clean-up resources; max77620
    - Prevent input events being dropped on resume; intel-lpss-pci
    - Prevent sleeping in IRQ context; ezx-pcap
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEdrbJNaO+IJqU8IdIUa+KL4f8d2EFAl2JUDEACgkQUa+KL4f8
 d2HKPQ//TLMO2Ed7AdhajgpYQVRQ2wM5VGOQLaTj/sO0WTjk3DfPdmp13ZZh9UTN
 evfYiVJI2ofpcliK6Tg8X27oaNg/YK+ff562lyFzMNIQMJ7hpzjvyaAfGhMz4y7I
 tN8RgMNHSU5luywJvJ43OH6K5C+ckg+kPp4Ywvnh3Hwqm3y7QJ9/CshTON69BOCf
 kYvE9dvAqubbTl80aMOqNbl/p6+h/C9sFhvGsC7LnWdE9vev/SE1kYmjm1ha5Z1N
 hRbnAw1nXidV1mB6oIhzvsx3MgpvBjvi6UXg8TCWuZTUlwMplgAfsDKWb6J9czxB
 lhP0W5/wuJPIkdZPRSEGVJ6MCsxpF5GcGknWTf1dL2hsx0kh1JsT/AUQeHR02NpQ
 pnGF7YDgQugHnkHqr+JfbswlIunh5U3ZxspvHQWTxzaVQvQ0eGk91Kqr5zKhdHf9
 z5e1j/VtjzAsYqGkamAjXCXPES1PIvwqKT6qLrERAoFBgigwP/i4nxSlTJGvqTVY
 M3+RH7I7IWfGCuJ227PWpkvFgzX7N+okKxujOWo+Fd4DH6lhMDstnrIoVZxpEfE+
 cQLo5soG7mJtk9meGFTrxkfT3IqcVWMG0vCXeaureNS1N+OjtZLRDN7ZuG2d8zFx
 nRCi+tJoAZe34DenfiSe+VcczlOOtqR2JpZKAdwW3ZT3Kcfl9U4=
 =03Zw
 -----END PGP SIGNATURE-----

Merge tag 'mfd-next-5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd

Pull MFD updates from Lee Jones:
 "New Drivers:
   - Add support for Merrifield Basin Cove PMIC

  New Device Support:
   - Add support for Intel Tiger Lake to Intel LPSS PCI
   - Add support for Intel Sky Lake to Intel LPSS PCI
   - Add support for ST-Ericsson DB8520 to DB8500 PRCMU

  New Functionality:
   - Add RTC and PWRC support to MT6323

  Fix-ups:
   - Clean-up include files; davinci_voicecodec, asic3, sm501, mt6397
   - Ignore return values from debugfs_create*(); ab3100-*, ab8500-debugfs, aat2870-core
   - Device Tree changes; rn5t618, mt6397
   - Use new I2C API; tps80031, 88pm860x-core, ab3100-core, bcm590xx,
                      da9150-core, max14577, max77693, max77843, max8907,
                      max8925-i2c, max8997, max8998, palmas, twl-core,
   - Remove obsolete code; da9063, jz4740-adc
   - Simplify semantics; timberdale, htc-i2cpld
   - Add 'fall-through' tags; omap-usb-host, db8500-prcmu
   - Remove superfluous prints; ab8500-debugfs, db8500-prcmu, fsl-imx25-tsadc,
                                intel_soc_pmic_bxtwc, qcom_rpm, sm501
   - Trivial rename/whitespace/typo fixes; mt6397-core, MAINTAINERS
   - Reorganise code structure; mt6397-*
   - Improve code consistency; intel-lpss
   - Use MODULE_SOFTDEP() helper; intel-lpss
   - Use DEFINE_RES_*() helpers; mt6397-core

  Bug Fixes:
   - Clean-up resources; max77620
   - Prevent input events being dropped on resume; intel-lpss-pci
   - Prevent sleeping in IRQ context; ezx-pcap"

* tag 'mfd-next-5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: (48 commits)
  mfd: mt6323: Add MT6323 RTC and PWRC
  mfd: mt6323: Replace boilerplate resource code with DEFINE_RES_* macros
  mfd: mt6397: Add mutex include
  dt-bindings: mfd: mediatek: Add MT6323 Power Controller
  dt-bindings: mfd: mediatek: Update RTC to include MT6323
  dt-bindings: mfd: mediatek: mt6397: Change to relative paths
  mfd: db8500-prcmu: Support the higher DB8520 ARMSS
  mfd: intel-lpss: Use MODULE_SOFTDEP() instead of implicit request
  mfd: htc-i2cpld: Drop check because i2c_unregister_device() is NULL safe
  mfd: sm501: Include the GPIO driver header
  mfd: intel-lpss: Add Intel Skylake ACPI IDs
  mfd: intel-lpss: Consistently use GENMASK()
  mfd: Add support for Merrifield Basin Cove PMIC
  mfd: ezx-pcap: Replace mutex_lock with spin_lock
  mfd: asic3: Include the right header
  MAINTAINERS: altera-sysmgr: Fix typo in a filepath
  mfd: mt6397: Extract IRQ related code from core driver
  mfd: mt6397: Rename macros to something more readable
  mfd: Remove dev_err() usage after platform_get_irq()
  mfd: db8500-prcmu: Mark expected switch fall-throughs
  ...
2019-09-23 19:37:49 -07:00
Linus Torvalds d0b3cfee33 - Core Frameworks
- Obtain scale type through sysfs
 
  - New Functionality
    - Provide Device Tree functionality; rave-sp-backlight
    - Calculate if scale type is (non-)linear; pwm_bl
 
  - Fix-ups
    - Simplify code; lm3630a_bl
    - Trivial rename/whitespace/typo fixes; lms283gf05
    - Remove superfluous NULL check; tosa_lcd
    - Fix power state initialisation; gpio_backlight
    - List supported file; MAINTAINERS
 
  - Bug Fixes
    - Kconfig - default to not building unless requested; {LED,BACKLIGHT}_CLASS_DEVICE
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEdrbJNaO+IJqU8IdIUa+KL4f8d2EFAl2JTwMACgkQUa+KL4f8
 d2EC7g/+Pliuvx8dgW+MPX8K6cnap+m7xV8Z9OyyEYHroz+MX7TsBGmJeChoOz9K
 RoFCap5VCtEEyiobDbe8HKpxivY88CdweegYxxBx+Uo9qJRogDv93kqVz3BmHxBm
 v2ShTY+8r2nf6YNN30FXZJSM4WkFqAg2MvM6LXP3R4rPSQ/u2Xp6L5VnzVETl23b
 /ZsP71axvwDNpi437ekUSoraUl9GZnHaINc7/0R/SnRPvZ2aakko7IRNYU0z+NQ+
 unImiBpFjuC1UXDaebd2Bt9ayrM47QkBGLu6LgOTUSg9tFlosvF6Z3TDLo3L2ptD
 Td5SWuTXALFaOv1xmMaylcsA/36D/KRhwnmQMVn/Ld5c/M+l4vOm4YDNgS1G8JY4
 i7KzoD7Z2yhyI3OHYoBGXI8+fmj2qQs+VGThtdhb96LlaHVUibfSlA0i5OjeMKO+
 ddiTy9ksiH+pBcn0YPvHJRNOI0uVmqBMSPkJ1Bdwz6HS58QcL3LvBzHtMVNoGkCK
 iWPPn90hveZmbXAxFIb8a2KRjFz531sReSZh+9dIZ/zOOF5OdyOotMk1zacWCUrt
 Jre1paR4W1VPXf4cE3IcpE6dJeA9DN9niz+9bsMgbr6cYG6xhB7xELrTUCXZn5CR
 C9tNMWuiqUlv0NZHb7CsifQW1bwzw8hTPhDk9VpBb0cFCucLYV0=
 =DSnC
 -----END PGP SIGNATURE-----

Merge tag 'backlight-next-5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight

Pull backlight updates from Lee Jones:
 "Core Frameworks
   - Obtain scale type through sysfs

  New Functionality:
   - Provide Device Tree functionality in rave-sp-backlight
   - Calculate if scale type is (non-)linear in pwm_bl

  Fix-ups:
   - Simplify code in lm3630a_bl
   - Trivial rename/whitespace/typo fixes in lms283gf05
   - Remove superfluous NULL check in tosa_lcd
   - Fix power state initialisation in gpio_backlight
   - List supported file in MAINTAINERS

  Bug Fixes:
   - Kconfig - default to not building unless requested in
     {LED,BACKLIGHT}_CLASS_DEVICE"

* tag 'backlight-next-5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight:
  backlight: pwm_bl: Set scale type for brightness curves specified in the DT
  backlight: pwm_bl: Set scale type for CIE 1931 curves
  backlight: Expose brightness curve type through sysfs
  MAINTAINERS: Add entry for stable backlight sysfs ABI documentation
  backlight: gpio-backlight: Correct initial power state handling
  video: backlight: tosa_lcd: drop check because i2c_unregister_device() is NULL safe
  video: backlight: Drop default m for {LCD,BACKLIGHT_CLASS_DEVICE}
  backlight: lms283gf05: Fix a typo in the description passed to 'devm_gpio_request_one()'
  backlight: lm3630a: Switch to use fwnode_property_count_uXX()
  backlight: rave-sp: Leave initial state and register with correct device
2019-09-23 19:33:56 -07:00