Order 4 pages are helpful for IOMMU mapping, but trying to get them can
take a long time in the page allocator. For responsiveness, I think
deterministic memory allocation speed is quite important.

An order 4 allocation with __GFP_RECLAIM may spend much time in the
reclaim and compaction logic. __GFP_NORETRY may also have an effect.
These cause unpredictable delays.

To get reasonable allocation speed from the dma-buf system heap, use
HIGH_ORDER_GFP for order 4 to avoid reclaim. Also remove the
meaningless __GFP_COMP for order 0.

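As a rough illustration of the trade-off, the largest-order-first
fallback can be modelled in userspace C. This is only a sketch under
assumptions: struct model, model_alloc() and model_fill() are
hypothetical names, not kernel API; may_reclaim[] merely stands in for
whether an order's gfp flags include __GFP_RECLAIM; and reclaim is
assumed to always succeed, only slowly.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_ORDERS 3

/* Orders tried, largest first (8, 4 and 0, as in the test results). */
static const unsigned int orders[NUM_ORDERS] = {8, 4, 0};

/* Hypothetical model state; none of this is kernel API. */
struct model {
	unsigned long free_blocks[NUM_ORDERS]; /* available without reclaim */
	bool may_reclaim[NUM_ORDERS];          /* stand-in for __GFP_RECLAIM */
	unsigned long slow_allocs;             /* allocations that hit the slow path */
};

/* Pretend allocation: fast if a block is free, slow if reclaim is allowed. */
static bool model_alloc(struct model *m, int i)
{
	if (m->free_blocks[i] > 0) {
		m->free_blocks[i]--;
		return true;
	}
	if (m->may_reclaim[i]) {
		m->slow_allocs++; /* assumed to succeed after reclaim/compaction */
		return true;
	}
	return false;
}

/*
 * Fill a buffer of total_pages base pages, largest order first.
 * counts[] must be zeroed by the caller; counts[i] receives the number
 * of blocks of orders[i] that were used.
 */
static void model_fill(struct model *m, unsigned long total_pages,
		       unsigned long counts[NUM_ORDERS])
{
	unsigned int max_order = orders[0];
	unsigned long remaining = total_pages;

	assert(m && counts);

	while (remaining > 0) {
		bool got = false;

		for (int i = 0; i < NUM_ORDERS; i++) {
			unsigned long block = 1UL << orders[i];

			/* Skip orders that are too big or already failed. */
			if (remaining < block || orders[i] > max_order)
				continue;
			if (!model_alloc(m, i))
				continue;
			counts[i]++;
			remaining -= block;
			max_order = orders[i]; /* never go back to a larger order */
			got = true;
			break;
		}
		if (!got)
			break; /* nothing could be allocated at any order */
	}
}
```

In this model, a 2560-page request with 11 free order 4 blocks ends up
as 160 order 4 blocks with 149 slow allocations when order 4 may
reclaim, and as 11 order 4 blocks plus 2384 order 0 pages with no slow
allocations when it may not.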
According to my tests, order 4 with MID_ORDER_GFP could get more order 4
pages, but the elapsed times could be very long.

          time   order 8   order 4   order 0
      584 usec         0       160         0
   28,428 usec         0       160         0
  100,701 usec         0       160         0
   76,645 usec         0       160         0
   25,522 usec         0       160         0
   38,798 usec         0       160         0
   89,012 usec         0       160         0
   23,015 usec         0       160         0
   73,360 usec         0       160         0
   76,953 usec         0       160         0
   31,492 usec         0       160         0
   75,889 usec         0       160         0
   84,551 usec         0       160         0
   84,352 usec         0       160         0
   57,103 usec         0       160         0
   93,452 usec         0       160         0

If HIGH_ORDER_GFP is used for order 4, the number of order 4 pages could
decrease, but the elapsed times were quite stable and fast enough.

          time   order 8   order 4   order 0
    1,356 usec         0       155        80
    1,901 usec         0        11      2384
    1,912 usec         0         0      2560
    1,911 usec         0         0      2560
    1,884 usec         0         0      2560
    1,577 usec         0         0      2560
    1,366 usec         0         0      2560
    1,711 usec         0         0      2560
    1,635 usec         0        28      2112
      544 usec        10         0         0
      633 usec         2       128         0
      848 usec         0       160         0
      729 usec         0       160         0
    1,000 usec         0       160         0
    1,358 usec         0       160         0
    2,638 usec         0        31      2064

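As a consistency check on the numbers, every row in both tables covers
the same 2560 base pages, since an order n block spans 2^n base pages.
The helpers below are a small sketch of that arithmetic; base_pages(),
size_kib() and check_row() are hypothetical names, and the 4 KiB base
page size is my assumption, not something measured here.

```c
#include <assert.h>

/* Base pages covered by a mix of order 8, order 4 and order 0 blocks. */
static unsigned long base_pages(unsigned long n8, unsigned long n4,
				unsigned long n0)
{
	return (n8 << 8) + (n4 << 4) + n0;
}

/* Buffer size in KiB, assuming 4 KiB base pages (an assumption). */
static unsigned long size_kib(unsigned long pages)
{
	return pages * 4;
}

/* Abort if a table row does not add up to the expected page count. */
static void check_row(unsigned long n8, unsigned long n4, unsigned long n0,
		      unsigned long expected_pages)
{
	assert(base_pages(n8, n4, n0) == expected_pages);
}
```

For example, check_row(0, 11, 2384, 2560) and check_row(2, 128, 0, 2560)
both pass, and size_kib(2560) gives 10240, i.e. 10 MiB under the 4 KiB
assumption.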
Link: https://lkml.kernel.org/r/20230303050332.10138-1-jaewon31.kim@samsung.com
Signed-off-by: Jaewon Kim <jaewon31.kim@samsung.com>
Reviewed-by: John Stultz <jstultz@google.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: T.J. Mercier <tjmercier@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>