linux-sg2042

Huang Ying cbc65df240 mm, swap: add swap readahead hit statistics

Patch series "mm, swap: VMA based swap readahead", v4.

The swap readahead is an important mechanism to reduce the swap in
latency. Although pure sequential memory access pattern isn't very
popular for anonymous memory, the space locality is still considered
valid.

In the original swap readahead implementation, the consecutive blocks in
swap device are readahead based on the global space locality estimation.
But the consecutive blocks in swap device just reflect the order of page
reclaiming, don't necessarily reflect the access pattern in virtual
memory space. And the different tasks in the system may have different
access patterns, which makes the global space locality estimation
incorrect.

In this patchset, when page fault occurs, the virtual pages near the
fault address will be readahead instead of the swap slots near the fault
swap slot in swap device. This avoid to readahead the unrelated swap
slots. At the same time, the swap readahead is changed to work on
per-VMA from globally. So that the different access patterns of the
different VMAs could be distinguished, and the different readahead policy
could be applied accordingly. The original core readahead detection and
scaling algorithm is reused, because it is an effect algorithm to detect
the space locality.

In addition to the swap readahead changes, some new sysfs interface is
added to show the efficiency of the readahead algorithm and some other
swap statistics.

This new implementation will incur more small random read, on SSD, the
improved correctness of estimation and readahead target should beat the
potential increased overhead, this is also illustrated in the test
results below. But on HDD, the overhead may beat the benefit, so the
original implementation will be used by default.

The test and result is as follow,

Common test condition
=====================

Test Machine: Xeon E5 v3 (2 sockets, 72 threads, 32G RAM)
Swap device: NVMe disk

Micro-benchmark with combined access pattern
============================================

vm-scalability, sequential swap test case, 4 processes to eat 50G
virtual memory space, repeat the sequential memory writing until 300
seconds. The first round writing will trigger swap out, the following
rounds will trigger sequential swap in and out.

At the same time, run vm-scalability random swap test case in
background, 8 processes to eat 30G virtual memory space, repeat the
random memory write until 300 seconds. This will trigger random swap-in
in the background.

This is a combined workload with sequential and random memory accessing
at the same time. The result (for sequential workload) is as follow,

                    Base            Optimized
                    ----            ---------
  throughput        345413 KB/s     414029 KB/s (+19.9%)
  latency.average   97.14 us        61.06 us (-37.1%)
  latency.50th      2 us            1 us
  latency.60th      2 us            1 us
  latency.70th      98 us           2 us
  latency.80th      160 us          2 us
  latency.90th      260 us          217 us
  latency.95th      346 us          369 us
  latency.99th      1.34 ms         1.09 ms
  ra_hit%           52.69%          99.98%

The original swap readahead algorithm is confused by the background
random access workload, so readahead hit rate is lower. The VMA-base
readahead algorithm works much better.

Linpack
=======

The test memory size is bigger than RAM to trigger swapping.

                    Base            Optimized
                    ----            ---------
  elapsed_time      393.49 s        329.88 s (-16.2%)
  ra_hit%           86.21%          98.82%

The score of base and optimized kernel hasn't visible changes. But the
elapsed time reduced and readahead hit rate improved, so the optimized
kernel runs better for startup and tear down stages. And the absolute
value of readahead hit rate is high, shows that the space locality is
still valid in some practical workloads.

This patch (of 5):

The statistics for total readahead pages and total readahead hits are
recorded and exported via the following sysfs interface.

  /sys/kernel/mm/swap/ra_hits
  /sys/kernel/mm/swap/ra_total

With them, the efficiency of the swap readahead could be measured, so
that the swap readahead algorithm and parameters could be tuned
accordingly.

[akpm@linux-foundation.org: don't display swap stats if CONFIG_SWAP=n]
Link: http://lkml.kernel.org/r/20170807054038.1843-2-ying.huang@intel.com
Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Shaohua Li <shli@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Tim Chen <tim.c.chen@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

2017-09-06 17:27:29 -07:00
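The commit message says the two counters can be used to measure readahead efficiency. A minimal userspace sketch of that calculation follows, assuming each file holds a single decimal counter and that the hit rate is hits divided by total, which is how the `ra_hit%` figures in the test results appear to be defined. The function names (`read_counter`, `ra_hit_percent`, `report_ra_hit_rate`) are hypothetical helpers introduced here, not kernel APIs, and the sysfs paths are taken from the commit message only; they may not exist on a given kernel.

```c
#include <stdio.h>

/* Read one unsigned decimal counter from a text file.
 * Returns 0 on success, -1 if the file is missing or unparsable. */
int read_counter(const char *path, unsigned long long *value)
{
    FILE *f = fopen(path, "r");
    int ok;

    if (!f)
        return -1;
    ok = (fscanf(f, "%llu", value) == 1);
    fclose(f);
    return ok ? 0 : -1;
}

/* Readahead hit rate in percent (assumed definition: hits / total).
 * Returns -1.0 when no pages have been read ahead yet. */
double ra_hit_percent(unsigned long long hits, unsigned long long total)
{
    if (total == 0)
        return -1.0;
    return 100.0 * (double)hits / (double)total;
}

/* Print the hit rate derived from the two counter files.
 * Returns 0 on success, -1 if either counter cannot be read. */
int report_ra_hit_rate(const char *hits_path, const char *total_path)
{
    unsigned long long hits, total;
    double pct;

    if (read_counter(hits_path, &hits) != 0 ||
        read_counter(total_path, &total) != 0) {
        fprintf(stderr, "swap readahead counters not available\n");
        return -1;
    }

    pct = ra_hit_percent(hits, total);
    if (pct < 0.0)
        printf("no swap readahead recorded yet\n");
    else
        printf("ra_hits=%llu ra_total=%llu ra_hit%%=%.2f%%\n",
               hits, total, pct);
    return 0;
}
```

Calling `report_ra_hit_rate("/sys/kernel/mm/swap/ra_hits", "/sys/kernel/mm/swap/ra_total")` from a `main` function would print the current ratio if both files are present. Sampling the counters before and after a workload and differencing them would give a per-run figure instead of a since-boot one.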
Documentation	fscache: remove unused ->now_uncached callback	2017-09-06 17:27:26 -07:00
arch	mm: arch: consolidate mmap hugetlb size encodings	2017-09-06 17:27:28 -07:00
block	Char/Misc drivers for 4.14-rc1	2017-09-05 11:08:17 -07:00
certs	modsign: add markers to endif-statements in certs/Makefile	2017-07-14 11:01:37 +10:00
crypto	crypto: algif_skcipher - only call put_page on referenced and used pages	2017-08-22 14:45:48 +08:00
drivers	block, THP: make block_device_operations.rw_page support THP	2017-09-06 17:27:27 -07:00
firmware	firmware/Makefile: force recompilation if makefile changes	2017-05-08 17:15:10 -07:00
fs	userfaultfd: provide pid in userfault msg - add feat union	2017-09-06 17:27:29 -07:00
include	mm, swap: add swap readahead hit statistics	2017-09-06 17:27:29 -07:00
init	mm, memory_hotplug: drop zone from build_all_zonelists	2017-09-06 17:27:25 -07:00
ipc	Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu	2017-08-21 09:45:19 +02:00
kernel	mm, devm_memremap_pages: use multi-order radix for ZONE_DEVICE lookups	2017-09-06 17:27:29 -07:00
lib	Driver core update for 4.14-rc1	2017-09-05 10:41:21 -07:00
mm	mm, swap: add swap readahead hit statistics	2017-09-06 17:27:29 -07:00
net	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid	2017-09-05 11:54:41 -07:00
samples	samples/bpf: fix bpf tunnel cleanup	2017-07-31 22:02:47 -07:00
scripts	modpost: simplify sec_name()	2017-09-06 17:27:24 -07:00
security	Now that IPC and other changes have landed, enable manual markings for	2017-07-19 08:55:18 -07:00
sound	Merge branch 'parisc-4.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux	2017-09-05 09:37:11 -07:00
tools	selftests/memfd: add memfd_create hugetlbfs selftest	2017-09-06 17:27:29 -07:00
usr	ramfs: clarify help text that compression applies to ramfs as well as legacy ramdisk.	2017-07-06 16:24:30 -07:00
virt	KVM: update to new mmu_notifier semantic v2	2017-08-31 16:13:00 -07:00
.cocciconfig	scripts: add Linux .cocciconfig for coccinelle	2016-07-22 12:13:39 +02:00
.get_maintainer.ignore	Add hch to .get_maintainer.ignore	2015-08-21 14:30:10 -07:00
.gitattributes	.gitattributes: set git diff driver for C source code files	2016-10-07 18:46:30 -07:00
.gitignore	kbuild: Add support to generate LLVM assembly files	2017-04-25 08:13:52 +09:00
.mailmap	power supply and reset changes for the v4.12 series (part 2)	2017-05-12 12:02:21 -07:00
COPYING	…
CREDITS	avr32: remove support for AVR32 architecture	2017-05-01 09:27:15 +02:00
Kbuild	kbuild: Consolidate header generation from ASM offset information	2017-04-13 05:43:37 +09:00
Kconfig	…
MAINTAINERS	ACPI updates for v4.14-rc1	2017-09-05 12:45:03 -07:00
Makefile	Merge branch 'docs-next' of git://git.lwn.net/linux	2017-09-03 21:07:29 -07:00
README	README: add a new README file, pointing to the Documentation/	2016-10-24 08:12:35 -02:00

README

Linux kernel
============

This file was moved to Documentation/admin-guide/README.rst

Please note that there are several guides for kernel developers and users.
These guides can be rendered in a number of formats, like HTML and PDF.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.
See Documentation/00-INDEX for a list of what is contained in each file.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result from upgrading your kernel.