OpenCloudOS-Kernel/include
Johannes Weiner 96f8bf4fb1 mm: vmscan: reclaim writepage is IO cost
The VM tries to balance reclaim pressure between anon and file so as to
reduce the amount of IO incurred due to the memory shortage.  It already
counts refaults and swapins, but in addition it should also count
writepage calls during reclaim.

For swap, this is obvious: it's IO that wouldn't have occurred if the
anonymous memory hadn't been under memory pressure.  From a relative
balancing point of view this makes sense as well: even if anon is cold and
reclaimable, a cache that isn't thrashing may have equally cold pages that
don't require IO to reclaim.

For file writeback, it's trickier: some of the reclaim writepage IO would
have likely occurred anyway due to dirty expiration.  But not all of it -
premature writeback reduces batching and generates additional writes.
Since the flushers are already woken up by the time the VM starts writing
cache pages one by one, let's assume that we'e likely causing writes that
wouldn't have happened without memory pressure.  In addition, the per-page
cost of IO would have probably been much cheaper if written in larger
batches from the flusher thread rather than the single-page-writes from
kswapd.

For our purposes - getting the trend right to accelerate convergence on a
stable state that doesn't require paging at all - this is sufficiently
accurate.  If we later wanted to optimize for sustained thrashing, we can
still refine the measurements.

Count all writepage calls from kswapd as IO cost toward the LRU that the
page belongs to.

Why do this dynamically?  Don't we know in advance that anon pages require
IO to reclaim, and so could build in a static bias?

First, scanning is not the same as reclaiming.  If all the anon pages are
referenced, we may not swap for a while just because we're scanning the
anon list.  During this time, however, it's important that we age
anonymous memory and the page cache at the same rate so that their
hot-cold gradients are comparable.  Everything else being equal, we still
want to reclaim the coldest memory overall.

Second, we keep copies in swap unless the page changes.  If there is
swap-backed data that's mostly read (tmpfs file) and has been swapped out
before, we can reclaim it without incurring additional IO.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Link: http://lkml.kernel.org/r/20200520232525.798933-14-hannes@cmpxchg.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-03 20:09:49 -07:00
..
acpi Merge branches 'acpi-apei', 'acpi-pmic', 'acpi-video' and 'acpi-dptf' 2020-06-01 17:19:43 +02:00
asm-generic arm64/mm: drop __HAVE_ARCH_HUGE_PTEP_GET 2020-06-03 20:09:46 -07:00
clocksource
crypto crypto: lib/sha1 - fold linux/cryptohash.h into crypto/sha.h 2020-05-08 15:32:17 +10:00
drm drm pull for 5.8-rc1 2020-06-02 15:04:15 -07:00
dt-bindings RISC-V Patches for the 5.7 Merge Window, Part 1 2020-04-09 10:51:30 -07:00
keys
kunit
kvm
linux mm: vmscan: reclaim writepage is IO cost 2020-06-03 20:09:49 -07:00
math-emu
media Update rmk's email address in various drivers 2020-04-21 17:50:09 +01:00
misc
net Merge branch 'uaccess.csum' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-06-01 16:03:37 -07:00
pcmcia
ras
rdma RDMA/core: Fix double destruction of uobject 2020-05-27 14:22:57 -03:00
scsi block: move dma_pad handling from blk_rq_map_sg into the callers 2020-04-22 10:47:39 -06:00
soc net: dsa: ocelot: the MAC table on Felix is twice as large 2020-05-06 17:15:24 -07:00
sound ALSA: rawmidi: Fix racy buffer resize under concurrent accesses 2020-05-07 22:29:14 +02:00
target
trace khugepaged: introduce 'max_ptes_shared' tunable 2020-06-03 20:09:46 -07:00
uapi for-5.8-tag 2020-06-02 19:59:25 -07:00
vdso vdso/datapage: Use correct clock mode name in comment 2020-04-20 19:19:52 +02:00
video
xen xen: Use evtchn_type_t as a type for event channels 2020-04-07 12:12:54 +02:00