OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Keith Busch	9f297db356	dmapool: create/destroy cleanup Set the 'empty' bool directly from the result of the function that determines its value instead of adding additional logic. Link: https://lkml.kernel.org/r/20230126215125.4069751-13-kbusch@meta.com Fixes: `2d55c16c0c` ("dmapool: create/destroy cleanup") Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Matthew Wilcox <willy@infradead.org> Cc: Tony Battersby <tonyb@cybernetics.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-06-09 16:25:17 -07:00
Keith Busch	da9619a30e	dmapool: link blocks across pages The allocated dmapool pages are never freed for the lifetime of the pool. There is no need for the two level list+stack lookup for finding a free block since nothing is ever removed from the list. Just use a simple stack, reducing time complexity to constant. The implementation inserts the stack linking elements and the dma handle of the block within itself when freed. This means the smallest possible dmapool block is increased to at most 16 bytes to accommodate these fields, but there are no exisiting users requesting a dma pool smaller than that anyway. Removing the list has a significant change in performance. Using the kernel's micro-benchmarking self test: Before: # modprobe dmapool_test dmapool test: size:16 blocks:8192 time:57282 dmapool test: size:64 blocks:8192 time:172562 dmapool test: size:256 blocks:8192 time:789247 dmapool test: size:1024 blocks:2048 time:371823 dmapool test: size:4096 blocks:1024 time:362237 After: # modprobe dmapool_test dmapool test: size:16 blocks:8192 time:24997 dmapool test: size:64 blocks:8192 time:26584 dmapool test: size:256 blocks:8192 time:33542 dmapool test: size:1024 blocks:2048 time:9022 dmapool test: size:4096 blocks:1024 time:6045 The module test allocates quite a few blocks that may not accurately represent how these pools are used in real life. For a more marco level benchmark, running fio high-depth + high-batched on nvme, this patch shows submission and completion latency reduced by ~100usec each, 1% IOPs improvement, and perf record's time spent in dma_pool_alloc/free were reduced by half. [kbusch@kernel.org: push new blocks in ascending order] Link: https://lkml.kernel.org/r/20230221165400.1595247-1-kbusch@meta.com Link: https://lkml.kernel.org/r/20230126215125.4069751-12-kbusch@meta.com Fixes: `2d55c16c0c` ("dmapool: create/destroy cleanup") Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Tony Battersby <tonyb@cybernetics.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-05-06 10:33:38 -07:00
Keith Busch	8ecc369554	dmapool: don't memset on free twice If debug is enabled, dmapool will poison the range, so no need to clear it to 0 immediately before writing over it. Link: https://lkml.kernel.org/r/20230126215125.4069751-11-kbusch@meta.com Fixes: `2d55c16c0c` ("dmapool: create/destroy cleanup") Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Matthew Wilcox <willy@infradead.org> Cc: Tony Battersby <tonyb@cybernetics.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-05-06 10:33:38 -07:00
Keith Busch	cc669954ab	dmapool: simplify freeing The actions for busy and not busy are mostly the same, so combine these and remove the unnecessary function. Also, the pool is about to be freed so there's no need to poison the page data since we only check for poison on alloc, which can't be done on a freed pool. Link: https://lkml.kernel.org/r/20230126215125.4069751-10-kbusch@meta.com Fixes: `2d55c16c0c` ("dmapool: create/destroy cleanup") Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Matthew Wilcox <willy@infradead.org> Cc: Tony Battersby <tonyb@cybernetics.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-05-06 10:33:37 -07:00
Keith Busch	f0bccea6bc	dmapool: consolidate page initialization Various fields of the dma pool are set in different places. Move it all to one function. Link: https://lkml.kernel.org/r/20230126215125.4069751-9-kbusch@meta.com Fixes: `2d55c16c0c` ("dmapool: create/destroy cleanup") Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Matthew Wilcox <willy@infradead.org> Cc: Tony Battersby <tonyb@cybernetics.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-05-06 10:33:37 -07:00
Keith Busch	5407df10e5	dmapool: rearrange page alloc failure handling Handle the error in a condition so the good path can be in the normal flow. Link: https://lkml.kernel.org/r/20230126215125.4069751-8-kbusch@meta.com Fixes: `2d55c16c0c` ("dmapool: create/destroy cleanup") Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Matthew Wilcox <willy@infradead.org> Cc: Tony Battersby <tonyb@cybernetics.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-05-06 10:33:37 -07:00
Keith Busch	d93e08b755	dmapool: move debug code to own functions Clean up the normal path by moving the debug code outside it. Link: https://lkml.kernel.org/r/20230126215125.4069751-7-kbusch@meta.com Fixes: `2d55c16c0c` ("dmapool: create/destroy cleanup") Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Matthew Wilcox <willy@infradead.org> Cc: Tony Battersby <tonyb@cybernetics.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-05-06 10:33:37 -07:00
Tony Battersby	290911c56f	dmapool: speedup DMAPOOL_DEBUG with init_on_alloc Avoid double-memset of the same allocated memory in dma_pool_alloc() when both DMAPOOL_DEBUG is enabled and init_on_alloc=1. Link: https://lkml.kernel.org/r/20230126215125.4069751-6-kbusch@meta.com Fixes: `2d55c16c0c` ("dmapool: create/destroy cleanup") Signed-off-by: Tony Battersby <tonyb@cybernetics.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-05-06 10:33:36 -07:00
Tony Battersby	790233528d	dmapool: cleanup integer types To represent the size of a single allocation, dmapool currently uses 'unsigned int' in some places and 'size_t' in other places. Standardize on 'unsigned int' to reduce overhead, but use 'size_t' when counting all the blocks in the entire pool. Link: https://lkml.kernel.org/r/20230126215125.4069751-5-kbusch@meta.com Fixes: `2d55c16c0c` ("dmapool: create/destroy cleanup") Signed-off-by: Tony Battersby <tonyb@cybernetics.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-05-06 10:33:36 -07:00
Tony Battersby	08cc96c894	dmapool: use sysfs_emit() instead of scnprintf() Use sysfs_emit instead of scnprintf, snprintf or sprintf. Link: https://lkml.kernel.org/r/20230126215125.4069751-4-kbusch@meta.com Fixes: `2d55c16c0c` ("dmapool: create/destroy cleanup") Signed-off-by: Tony Battersby <tonyb@cybernetics.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-05-06 10:33:36 -07:00
Tony Battersby	67a540c60c	dmapool: remove checks for dev == NULL dmapool originally tried to support pools without a device because dma_alloc_coherent() supports allocations without a device. But nobody ended up using dma pools without a device, and trying to do so will result in an oops. So remove the checks for pool->dev == NULL since they are unneeded bloat. [kbusch@kernel.org: add check for null dev on create] Link: https://lkml.kernel.org/r/20230126215125.4069751-3-kbusch@meta.com Fixes: `2d55c16c0c` ("dmapool: create/destroy cleanup") Signed-off-by: Tony Battersby <tonyb@cybernetics.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-05-06 10:33:36 -07:00
Christian König	cc6266f032	mm/dmapool.c: revert "make dma pool to use kmalloc_node" This reverts commit `2618c60b8b` ("dma: make dma pool to use kmalloc_node"). While working myself into the dmapool code I've found this little odd kmalloc_node(). What basically happens here is that we allocate the housekeeping structure on the numa node where the device is attached to. Since the device is never doing DMA to or from that memory this doesn't seem to make sense at all. So while this doesn't seem to cause much harm it's probably cleaner to revert the change for consistency. Link: https://lkml.kernel.org/r/20211221110724.97664-1-christian.koenig@amd.com Signed-off-by: Christian König <christian.koenig@amd.com> Cc: Yinghai Lu <yinghai.lu@sun.com> Cc: Andi Kleen <ak@suse.de> Cc: Christoph Lameter <clameter@sgi.com> Cc: David Rientjes <rientjes@google.com> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2022-01-15 16:30:28 +02:00
YueHaibing	e8df2c703d	mm/dmapool: use DEVICE_ATTR_RO macro Use DEVICE_ATTR_RO() helper instead of plain DEVICE_ATTR(), which makes the code a bit shorter and easier to read. Link: https://lkml.kernel.org/r/20210524112852.34716-1-yuehaibing@huawei.com Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2021-06-29 10:53:52 -07:00
Zhiyuan Dai	943f229e96	mm/dmapool: switch from strlcpy to strscpy strlcpy is marked as deprecated in Documentation/process/deprecated.rst, and there is no functional difference when the caller expects truncation (when not checking the return value). strscpy is relatively better as it also avoids scanning the whole source string. Link: https://lkml.kernel.org/r/1613962050-14188-1-git-send-email-daizhiyuan@phytium.com.cn Signed-off-by: Zhiyuan Dai <daizhiyuan@phytium.com.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2021-04-30 11:20:39 -07:00
Daniel Vetter	0f2f89b6de	mm/dmapool: use might_alloc() Now that my little helper has landed, use it more. On top of the existing check this also uses lockdep through the fs_reclaim annotations. Link: https://lkml.kernel.org/r/20210113135009.3606813-1-daniel.vetter@ffwll.ch Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2021-02-26 09:41:01 -08:00
Andy Shevchenko	41a04814a7	mm/dmapool.c: replace hard coded function name with __func__ No need to hard code function name when __func__ can be used. While here, replace specifiers for special types like dma_addr_t. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Matthew Wilcox <willy@infradead.org> Link: http://lkml.kernel.org/r/20200814135055.24898-2-andriy.shevchenko@linux.intel.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-10-13 18:38:32 -07:00
Andy Shevchenko	42286f83f8	mm/dmapool.c: replace open-coded list_for_each_entry_safe() There is a place in the code where open-coded version of list_for_each_entry_safe() is used. Replace that with the standard macro. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Matthew Wilcox <willy@infradead.org> Link: http://lkml.kernel.org/r/20200814135055.24898-1-andriy.shevchenko@linux.intel.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-10-13 18:38:32 -07:00
Mateusz Nosek	1386f7a3bf	mm/dmapool.c: micro-optimisation remove unnecessary branch Previously there was a check if 'size' is aligned to 'align' and if not then it was aligned. This check was expensive as both branch and division are expensive instructions in most architectures. 'ALIGN' function on already aligned value will not change it, and as it is cheaper than branch + division it can be executed all the time and branch can be removed. Signed-off-by: Mateusz Nosek <mateusznosek0@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/20200320173317.26408-1-mateusznosek0@gmail.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-04-07 10:43:42 -07:00
Alexander Potapenko	6471384af2	mm: security: introduce init_on_alloc=1 and init_on_free=1 boot options Patch series "add init_on_alloc/init_on_free boot options", v10. Provide init_on_alloc and init_on_free boot options. These are aimed at preventing possible information leaks and making the control-flow bugs that depend on uninitialized values more deterministic. Enabling either of the options guarantees that the memory returned by the page allocator and SL[AU]B is initialized with zeroes. SLOB allocator isn't supported at the moment, as its emulation of kmem caches complicates handling of SLAB_TYPESAFE_BY_RCU caches correctly. Enabling init_on_free also guarantees that pages and heap objects are initialized right after they're freed, so it won't be possible to access stale data by using a dangling pointer. As suggested by Michal Hocko, right now we don't let the heap users to disable initialization for certain allocations. There's not enough evidence that doing so can speed up real-life cases, and introducing ways to opt-out may result in things going out of control. This patch (of 2): The new options are needed to prevent possible information leaks and make control-flow bugs that depend on uninitialized values more deterministic. This is expected to be on-by-default on Android and Chrome OS. And it gives the opportunity for anyone else to use it under distros too via the boot args. (The init_on_free feature is regularly requested by folks where memory forensics is included in their threat models.) init_on_alloc=1 makes the kernel initialize newly allocated pages and heap objects with zeroes. Initialization is done at allocation time at the places where checks for __GFP_ZERO are performed. init_on_free=1 makes the kernel initialize freed pages and heap objects with zeroes upon their deletion. This helps to ensure sensitive data doesn't leak via use-after-free accesses. Both init_on_alloc=1 and init_on_free=1 guarantee that the allocator returns zeroed memory. The two exceptions are slab caches with constructors and SLAB_TYPESAFE_BY_RCU flag. Those are never zero-initialized to preserve their semantics. Both init_on_alloc and init_on_free default to zero, but those defaults can be overridden with CONFIG_INIT_ON_ALLOC_DEFAULT_ON and CONFIG_INIT_ON_FREE_DEFAULT_ON. If either SLUB poisoning or page poisoning is enabled, those options take precedence over init_on_alloc and init_on_free: initialization is only applied to unpoisoned allocations. Slowdown for the new features compared to init_on_free=0, init_on_alloc=0: hackbench, init_on_free=1: +7.62% sys time (st.err 0.74%) hackbench, init_on_alloc=1: +7.75% sys time (st.err 2.14%) Linux build with -j12, init_on_free=1: +8.38% wall time (st.err 0.39%) Linux build with -j12, init_on_free=1: +24.42% sys time (st.err 0.52%) Linux build with -j12, init_on_alloc=1: -0.13% wall time (st.err 0.42%) Linux build with -j12, init_on_alloc=1: +0.57% sys time (st.err 0.40%) The slowdown for init_on_free=0, init_on_alloc=0 compared to the baseline is within the standard error. The new features are also going to pave the way for hardware memory tagging (e.g. arm64's MTE), which will require both on_alloc and on_free hooks to set the tags for heap objects. With MTE, tagging will have the same cost as memory initialization. Although init_on_free is rather costly, there are paranoid use-cases where in-memory data lifetime is desired to be minimized. There are various arguments for/against the realism of the associated threat models, but given that we'll need the infrastructure for MTE anyway, and there are people who want wipe-on-free behavior no matter what the performance cost, it seems reasonable to include it in this series. [glider@google.com: v8] Link: http://lkml.kernel.org/r/20190626121943.131390-2-glider@google.com [glider@google.com: v9] Link: http://lkml.kernel.org/r/20190627130316.254309-2-glider@google.com [glider@google.com: v10] Link: http://lkml.kernel.org/r/20190628093131.199499-2-glider@google.com Link: http://lkml.kernel.org/r/20190617151050.92663-2-glider@google.com Signed-off-by: Alexander Potapenko <glider@google.com> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Michal Hocko <mhocko@suse.cz> [page and dmapool parts Acked-by: James Morris <jamorris@linux.microsoft.com>] Cc: Christoph Lameter <cl@linux.com> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: "Serge E. Hallyn" <serge@hallyn.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Kostya Serebryany <kcc@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Sandeep Patil <sspatil@android.com> Cc: Laura Abbott <labbott@redhat.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Jann Horn <jannh@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Marco Elver <elver@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-07-12 11:05:46 -07:00
Thomas Gleixner	b2139ce04f	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 403 Based on 1 normalized pattern(s): this software may be redistributed and or modified under the terms of the gnu general public license gpl version 2 as published by the free software foundation extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 1 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Armijn Hemel <armijn@tjaldur.nl> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190531190112.039124428@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-06-05 17:37:13 +02:00
Mike Rapoport	a862f68a8b	docs/core-api/mm: fix return value descriptions in mm/ Many kernel-doc comments in mm/ have the return value descriptions either misformatted or omitted at all which makes kernel-doc script unhappy: $ make V=1 htmldocs ... ./mm/util.c:36: info: Scanning doc for kstrdup ./mm/util.c:41: warning: No description found for return value of 'kstrdup' ./mm/util.c:57: info: Scanning doc for kstrdup_const ./mm/util.c:66: warning: No description found for return value of 'kstrdup_const' ./mm/util.c:75: info: Scanning doc for kstrndup ./mm/util.c:83: warning: No description found for return value of 'kstrndup' ... Fixing the formatting and adding the missing return value descriptions eliminates ~100 such warnings. Link: http://lkml.kernel.org/r/1549549644-4903-4-git-send-email-rppt@linux.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-03-05 21:07:20 -08:00
Joe Perches	0825a6f986	mm: use octal not symbolic permissions mm/.c files use symbolic and octal styles for permissions. Using octal and not symbolic permissions is preferred by many as more readable. https://lkml.org/lkml/2016/8/2/1945 Prefer the direct use of octal for permissions. Done using $ scripts/checkpatch.pl -f --types=SYMBOLIC_PERMS --fix-inplace mm/.c and some typing. Before: $ git grep -P -w "0[0-7]{3,3}" mm \| wc -l 44 After: $ git grep -P -w "0[0-7]{3,3}" mm \| wc -l 86 Miscellanea: o Whitespace neatening around these conversions. Link: http://lkml.kernel.org/r/2e032ef111eebcd4c5952bae86763b541d373469.1522102887.git.joe@perches.com Signed-off-by: Joe Perches <joe@perches.com> Acked-by: David Rientjes <rientjes@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2018-06-15 07:55:25 +09:00
Alexey Dobriyan	5b5e0928f7	lib/vsprintf.c: remove %Z support Now that %z is standartised in C99 there is no reason to support %Z. Unlike %L it doesn't even make format strings smaller. Use BUILD_BUG_ON in a couple ATM drivers. In case anyone didn't notice lib/vsprintf.o is about half of SLUB which is in my opinion is quite an achievement. Hopefully this patch inspires someone else to trim vsprintf.c more. Link: http://lkml.kernel.org/r/20170103230126.GA30170@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-02-27 18:43:47 -08:00
Miles Chen	199eaa05ad	mm: cleanups for printing phys_addr_t and dma_addr_t cleanup rest of dma_addr_t and phys_addr_t type casting in mm use %pad for dma_addr_t use %pa for phys_addr_t Link: http://lkml.kernel.org/r/1486618489-13912-1-git-send-email-miles.chen@mediatek.com Signed-off-by: Miles Chen <miles.chen@mediatek.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-02-24 17:46:56 -08:00
Joe Perches	1170532bb4	mm: convert printk(KERN_<LEVEL> to pr_<level> Most of the mm subsystem uses pr_<level> so make it consistent. Miscellanea: - Realign arguments - Add missing newline to format - kmemleak-test.c has a "kmemleak: " prefix added to the "Kmemleak testing" logging message via pr_fmt Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Tejun Heo <tj@kernel.org> [percpu] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-03-17 15:09:34 -07:00
Joe Perches	756a025f00	mm: coalesce split strings Kernel style prefers a single string over split strings when the string is 'user-visible'. Miscellanea: - Add a missing newline - Realign arguments Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Tejun Heo <tj@kernel.org> [percpu] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-03-17 15:09:34 -07:00
Mel Gorman	d0164adc89	mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd __GFP_WAIT has been used to identify atomic context in callers that hold spinlocks or are in interrupts. They are expected to be high priority and have access one of two watermarks lower than "min" which can be referred to as the "atomic reserve". __GFP_HIGH users get access to the first lower watermark and can be called the "high priority reserve". Over time, callers had a requirement to not block when fallback options were available. Some have abused __GFP_WAIT leading to a situation where an optimisitic allocation with a fallback option can access atomic reserves. This patch uses __GFP_ATOMIC to identify callers that are truely atomic, cannot sleep and have no alternative. High priority users continue to use __GFP_HIGH. __GFP_DIRECT_RECLAIM identifies callers that can sleep and are willing to enter direct reclaim. __GFP_KSWAPD_RECLAIM to identify callers that want to wake kswapd for background reclaim. __GFP_WAIT is redefined as a caller that is willing to enter direct reclaim and wake kswapd for background reclaim. This patch then converts a number of sites o __GFP_ATOMIC is used by callers that are high priority and have memory pools for those requests. GFP_ATOMIC uses this flag. o Callers that have a limited mempool to guarantee forward progress clear __GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall into this category where kswapd will still be woken but atomic reserves are not used as there is a one-entry mempool to guarantee progress. o Callers that are checking if they are non-blocking should use the helper gfpflags_allow_blocking() where possible. This is because checking for __GFP_WAIT as was done historically now can trigger false positives. Some exceptions like dm-crypt.c exist where the code intent is clearer if __GFP_DIRECT_RECLAIM is used instead of the helper due to flag manipulations. o Callers that built their own GFP flags instead of starting with GFP_KERNEL and friends now also need to specify __GFP_KSWAPD_RECLAIM. The first key hazard to watch out for is callers that removed __GFP_WAIT and was depending on access to atomic reserves for inconspicuous reasons. In some cases it may be appropriate for them to use __GFP_HIGH. The second key hazard is callers that assembled their own combination of GFP flags instead of starting with something like GFP_KERNEL. They may now wish to specify __GFP_KSWAPD_RECLAIM. It's almost certainly harmless if it's missed in most cases as other activity will wake kswapd. Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Vitaly Wool <vitalywool@gmail.com> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-11-06 17:50:42 -08:00
Robin Murphy	676bd99178	dmapool: fix overflow condition in pool_find_page() If a DMA pool lies at the very top of the dma_addr_t range (as may happen with an IOMMU involved), the calculated end address of the pool wraps around to zero, and page lookup always fails. Tweak the relevant calculation to be overflow-proof. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Sakari Ailus <sakari.ailus@iki.fi> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-10-01 21:42:35 -04:00
Sean O. Stalley	fa23f56d90	mm: add support for __GFP_ZERO flag to dma_pool_alloc() Currently a call to dma_pool_alloc() with a ___GFP_ZERO flag returns a non-zeroed memory region. This patchset adds support for the __GFP_ZERO flag to dma_pool_alloc(), adds 2 wrapper functions for allocing zeroed memory from a pool, and provides a coccinelle script for finding & replacing instances of dma_pool_alloc() followed by memset(0) with a single dma_pool_zalloc() call. There was some concern that this always calls memset() to zero, instead of passing __GFP_ZERO into the page allocator. [https://lkml.org/lkml/2015/7/15/881] I ran a test on my system to get an idea of how often dma_pool_alloc() calls into pool_alloc_page(). After Boot: [ 30.119863] alloc_calls:541, page_allocs:7 After an hour: [ 3600.951031] alloc_calls:9566, page_allocs:12 After copying 1GB file onto a USB drive: [ 4260.657148] alloc_calls:17225, page_allocs:12 It doesn't look like dma_pool_alloc() calls down to the page allocator very often (at least on my system). This patch (of 4): Currently the __GFP_ZERO flag is ignored by dma_pool_alloc(). Make dma_pool_alloc() zero the memory if this flag is set. Signed-off-by: Sean O. Stalley <sean.stalley@intel.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Vinod Koul <vinod.koul@intel.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Gilles Muller <Gilles.Muller@lip6.fr> Cc: Nicolas Palix <nicolas.palix@imag.fr> Cc: Michal Marek <mmarek@suse.cz> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-09-08 15:35:28 -07:00
Sergey Senozhatsky	44d7175da6	mm/dmapool: allow NULL `pool' pointer in dma_pool_destroy() dma_pool_destroy() does not tolerate a NULL dma_pool pointer argument and performs a NULL-pointer dereference. This requires additional attention and effort from developers/reviewers and forces all dma_pool_destroy() callers to do a NULL check if (pool) dma_pool_destroy(pool); Or, otherwise, be invalid dma_pool_destroy() users. Tweak dma_pool_destroy() and NULL-check the pointer there. Proposed by Andrew Morton. Link: https://lkml.org/lkml/2015/6/8/583 Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Julia Lawall <julia.lawall@lip6.fr> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-09-08 15:35:28 -07:00
Nicholas Krause	d9e7e37b4d	mm/dmapool.c: change is_page_busy() return from int to bool This makes the function is_page_busy() return bool rather then an int now due to this particular function's single return statement only ever evaulating to either one or zero. Signed-off-by: Nicholas Krause <xerofoify@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-09-04 16:54:41 -07:00
Paul McQuade	baa2ef8398	mm/dmapool.c: fixed a brace coding style issue Remove 3 brace coding style for any arm of this statement Signed-off-by: Paul McQuade <paulmcquad@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-09 22:26:00 -04:00
Sebastian Andrzej Siewior	01c2965f07	mm: dmapool: add/remove sysfs file outside of the pool lock lock cat /sys/.../pools followed by removal the device leads to: \|====================================================== \|[ INFO: possible circular locking dependency detected ] \|3.17.0-rc4+ #1498 Not tainted \|------------------------------------------------------- \|rmmod/2505 is trying to acquire lock: \| (s_active#28){++++.+}, at: [<c017f754>] kernfs_remove_by_name_ns+0x3c/0x88 \| \|but task is already holding lock: \| (pools_lock){+.+.+.}, at: [<c011494c>] dma_pool_destroy+0x18/0x17c \| \|which lock already depends on the new lock. \|the existing dependency chain (in reverse order) is: \| \|-> #1 (pools_lock){+.+.+.}: \| [<c0114ae8>] show_pools+0x30/0xf8 \| [<c0313210>] dev_attr_show+0x1c/0x48 \| [<c0180e84>] sysfs_kf_seq_show+0x88/0x10c \| [<c017f960>] kernfs_seq_show+0x24/0x28 \| [<c013efc4>] seq_read+0x1b8/0x480 \| [<c011e820>] vfs_read+0x8c/0x148 \| [<c011ea10>] SyS_read+0x40/0x8c \| [<c000e960>] ret_fast_syscall+0x0/0x48 \| \|-> #0 (s_active#28){++++.+}: \| [<c017e9ac>] __kernfs_remove+0x258/0x2ec \| [<c017f754>] kernfs_remove_by_name_ns+0x3c/0x88 \| [<c0114a7c>] dma_pool_destroy+0x148/0x17c \| [<c03ad288>] hcd_buffer_destroy+0x20/0x34 \| [<c03a4780>] usb_remove_hcd+0x110/0x1a4 The problem is the lock order of pools_lock and kernfs_mutex in dma_pool_destroy() vs show_pools() call path. This patch breaks out the creation of the sysfs file outside of the pools_lock mutex. The newly added pools_reg_lock ensures that there is no race of create vs destroy code path in terms whether or not the sysfs file has to be deleted (and was it deleted before we try to create a new one) and what to do if device_create_file() failed. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-09 22:25:59 -04:00
Krzysztof Hałasa	153a9f131f	Fix unbalanced mutex in dma_pool_create(). dma_pool_create() needs to unlock the mutex in error case. The bug was introduced in the 3.16 by commit `cc6b664aa2` ("mm/dmapool.c: remove redundant NULL check for dev in dma_pool_create()")/ Signed-off-by: Krzysztof Hałasa <khc@piap.pl> Cc: stable@vger.kernel.org # v3.16 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-09-18 10:39:16 -07:00
Andy Shevchenko	172cb4b3d4	mm/dmapool.c: reuse devres_release() to free resources Instead of calling an additional routine in dmam_pool_destroy() rely on what dmam_pool_release() is doing. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-06-04 16:54:08 -07:00
Daeseok Youn	cc6b664aa2	mm/dmapool.c: remove redundant NULL check for dev in dma_pool_create() "dev" cannot be NULL because it is already checked before calling dma_pool_create(). If dev ever was NULL, the code would oops in dev_to_node() after enabling CONFIG_NUMA. It is possible that some driver is using dev==NULL and has never been run on a NUMA machine. Such a driver is probably outdated, possibly buggy and will need some attention if it starts triggering NULL derefs. Signed-off-by: Daeseok Youn <daeseok.youn@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-06-04 16:54:04 -07:00
Hiroshige Sato	5835f25117	mm: Fix printk typo in dmapool.c Fix printk typo in dmapool.c Signed-off-by: Hiroshige Sato <sato.vintage@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2014-05-05 15:44:47 +02:00
Matthieu CASTET	5de55b265a	dmapool: make DMAPOOL_DEBUG detect corruption of free marker This can help to catch the case where hardware is writing after dma free. [akpm@linux-foundation.org: tidy code, fix comment, use sizeof(page->offset), use pr_err()] Signed-off-by: Matthieu Castet <matthieu.castet@parrot.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-12-11 17:22:24 -08:00
Marek Szyprowski	387870f2d6	mm: dmapool: use provided gfp flags for all dma_alloc_coherent() calls dmapool always calls dma_alloc_coherent() with GFP_ATOMIC flag, regardless the flags provided by the caller. This causes excessive pruning of emergency memory pools without any good reason. Additionaly, on ARM architecture any driver which is using dmapools will sooner or later trigger the following error: "ERROR: 256 KiB atomic DMA coherent pool is too small! Please increase it with coherent_pool= kernel parameter!". Increasing the coherent pool size usually doesn't help much and only delays such error, because all GFP_ATOMIC DMA allocations are always served from the special, very limited memory pool. This patch changes the dmapool code to correctly use gfp flags provided by the dmapool caller. Reported-by: Soeren Moch <smoch@web.de> Reported-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Tested-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Soeren Moch <smoch@web.de> Cc: stable@vger.kernel.org	2012-12-11 09:28:08 +01:00
Paul Gortmaker	7c77509c54	mm: fix implicit stat.h usage in dmapool.c The removal of the implicitly everywhere module.h and its child includes will reveal this implicit stat.h usage: mm/dmapool.c:108: error: ‘S_IRUGO’ undeclared here (not in a function) Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2011-10-31 09:20:12 -04:00
Paul Gortmaker	b95f1b31b7	mm: Map most files to use export.h instead of module.h The files changed within are only using the EXPORT_SYMBOL macro variants. They are not using core modular infrastructure and hence don't need module.h but only the export.h header. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2011-10-31 09:20:12 -04:00
Maxin B John	ae891a1b93	devres: fix possible use after free devres uses the pointer value as key after it's freed, which is safe but triggers spurious use-after-free warnings on some static analysis tools. Rearrange code to avoid such warnings. Signed-off-by: Maxin B. John <maxin.john@gmail.com> Reviewed-by: Rolf Eike Beer <eike-kernel@sf-tec.de> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-07-25 20:57:14 -07:00
Andrew Morton	684265d4a3	mm/dmapool.c: use TASK_UNINTERRUPTIBLE in dma_pool_alloc() As it stands this code will degenerate into a busy-wait if the calling task has signal_pending(). Cc: Rolf Eike Beer <eike-kernel@sf-tec.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-01-13 17:32:48 -08:00
Rolf Eike Beer	84bc227d7f	mm/dmapool.c: take lock only once in dma_pool_free() dma_pool_free() scans for the page to free in the pool list holding the pool lock. Then it releases the lock basically to acquire it immediately again. Modify the code to only take the lock once. This will do some additional loops and computations with the lock held in if memory debugging is activated. If it is not activated the only new operations with this lock is one if and one substraction. Signed-off-by: Rolf Eike Beer <eike-kernel@sf-tec.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-01-13 17:32:48 -08:00
Dima Zavin	ea05c8444e	mm: add a might_sleep_if() to dma_pool_alloc() Buggy drivers (e.g. fsl_udc) could call dma_pool_alloc from atomic context with GFP_KERNEL. In most instances, the first pool_alloc_page call would succeed and the sleeping functions would never be called. This allowed the buggy drivers to slip through the cracks. Add a might_sleep_if() checking for __GFP_WAIT in flags. Signed-off-by: Dima Zavin <dima@android.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-10-26 16:52:08 -07:00
Thomas Gleixner	c49568235d	dmapools: protect page_list walk in show_pools() show_pools() walks the page_list of a pool w/o protection against the list modifications in alloc/free. Take pool->lock to avoid stomping into nirvana. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-06-30 18:56:00 -07:00
Andi Kleen	b5ee5befa7	dmapool: enable debugging for CONFIG_SLUB_DEBUG_ON too Previously it was only enabled for CONFIG_DEBUG_SLAB. Not hooked into the slub runtime debug configuration, so you currently only get it with CONFIG_SLUB_DEBUG_ON, not plain CONFIG_SLUB_DEBUG Acked-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:20 -07:00
Matthew Wilcox	e34f44b351	pool: Improve memory usage for devices which can't cross boundaries The previous implementation simply refused to allocate more than a boundary's worth of data from an entire page. Some users didn't know this, so specified things like SMP_CACHE_BYTES, not realising the horrible waste of memory that this was. It's fairly easy to correct this problem, just by ensuring we don't cross a boundary within a page. This even helps drivers like EHCI (which can't cross a 4k boundary) on machines with larger page sizes. Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Acked-by: David S. Miller <davem@davemloft.net>	2007-12-04 10:39:58 -05:00
Matthew Wilcox	a35a345514	Change dmapool free block management Use a list of free blocks within a page instead of using a bitmap. Update documentation to reflect this. As well as being a slight reduction in memory allocation, locked ops and lines of code, it speeds up a transaction processing benchmark by 0.4%. Signed-off-by: Matthew Wilcox <willy@linux.intel.com>	2007-12-04 10:39:57 -05:00
Matthew Wilcox	6182a0943a	dmapool: Tidy up includes and add comments We were missing a copyright statement and license, so add GPLv2, David Brownell's copyright and my copyright. The asm/io.h include was superfluous, but we were missing a few other necessary includes. Signed-off-by: Matthew Wilcox <willy@linux.intel.com>	2007-12-04 10:39:57 -05:00

1 2

54 Commits