Summary:
The current implementation of the allocator returning freed memory
back to OS (controlled by allocator_release_to_os_interval_ms flag)
requires sorting of the free chunks list, which has two major issues,
first, when free list grows to millions of chunks, sorting, even the
fastest one, is just too slow, and second, sorting chunks in place
is unacceptable for Scudo allocator as it makes allocations more
predictable and less secure.
The proposed approach is linear in complexity (altough requires quite
a bit more temporary memory). The idea is to count the number of free
chunks on each memory page and release pages containing free chunks
only. It requires one iteration over the free list of chunks and one
iteration over the array of page counters. The obvious disadvantage
is the allocation of the array of the counters, but even in the worst
case we support (4T allocator space, 64 buckets, 16 bytes bucket size,
full free list, which leads to 2 bytes per page counter and ~17M page
counters), requires just about 34Mb of the intermediate buffer (comparing
to ~64Gb of actually allocated chunks) and usually it stays under 100K
and released after each use. It is expected to be a relatively rare event,
releasing memory back to OS, keeping the buffer between those runs
and added complexity of the bookkeeping seems unnesessary here (it can
always be improved later, though, never say never).
The most interesting problem here is how to calculate the number of chunks
falling into each memory page in the bucket. Skipping all the details,
there are three cases when the number of chunks per page is constant:
1) P >= C, P % C == 0 --> N = P / C
2) C > P , C % P == 0 --> N = 1
3) C <= P, P % C != 0 && C % (P % C) == 0 --> N = P / C + 1
where P is page size, C is chunk size and N is the number of chunks per
page and the rest of the cases, where the number of chunks per page is
calculated on the go, during the page counter array iteration.
Among the rest, there are still cases where N can be deduced from the
page index, but they require not that much less calculations per page
than the current "brute force" way and 2/3 of the buckets fall into
the first three categories anyway, so, for the sake of simplicity,
it was decided to stick to those two variations. It can always be
refined and improved later, should we see that brute force way slows
us down unacceptably.
Reviewers: eugenis, cryptoad, dvyukov
Subscribers: kubamracek, mehdi_amini, llvm-commits
Differential Revision: https://reviews.llvm.org/D38245
llvm-svn: 314311
Summary:
In `sanitizer_allocator_primary32.h`:
- rounding up in `MapWithCallback` is not needed as `MmapOrDie` does it. Note
that the 64-bit counterpart doesn't round up, this keeps the behavior
consistent;
- since `IsAligned` exists, use it in `AllocateRegion`;
- in `PopulateFreeList`:
- checking `b->Count` to be greater than 0 when `b->Count() == max_count` is
redundant when done more than once. Just check that `max_count` is greater
than 0 out of the loop; the compiler (at least on ARM) didn't optimize it;
- mark the batch creation failure as `UNLIKELY`;
In `sanitizer_allocator_primary64.h`:
- in `MapWithCallback`, mark the failure condition as `UNLIKELY`;
In `sanitizer_posix.h`:
- mark a bunch of Mmap related failure conditions as `UNLIKELY`;
- in `MmapAlignedOrDieOnFatalError`, we have `IsAligned`, so use it; rearrange
the conditions as one test was redudant;
- in `MmapFixedImpl`, 30 chars was not large enough to hold the message and a
full 64-bit address (or at least a 48-bit usermode address), increase to 40.
Reviewers: alekseyshl
Reviewed By: alekseyshl
Subscribers: aemerson, kubamracek, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D34840
llvm-svn: 306834
Summary:
The current code was sometimes attempting to release huge chunks of
memory due to undesired RoundUp/RoundDown interaction when the requested
range is fully contained within one memory page.
Reviewers: eugenis
Subscribers: kubabrecka, llvm-commits
Patch by Aleksey Shlyapnikov.
Differential Revision: https://reviews.llvm.org/D27228
llvm-svn: 288271
Summary:
In order to avoid starting a separate thread to return unused memory to
the system (the thread interferes with process startup on Android,
Zygota waits for all threads to exit before fork, but this thread never
exits), try to return it right after free.
Reviewers: eugenis
Subscribers: cryptoad, filcab, danalbert, kubabrecka, llvm-commits
Patch by Aleksey Shlyapnikov.
Differential Revision: https://reviews.llvm.org/D27003
llvm-svn: 288091
Summary:
The sanitizer allocators can works with a dynamic address space
(i.e. specified with ~0ULL).
Unfortunately, the code was broken on GetMetadata and GetChunkIdx.
The current patch is moving the Win64 memory test to a dynamic
address space. There is a migration to move every concept to a
dynamic address space on windows.
To have a better coverage, the unittest are now testing
dynamic address space on other platforms too.
Reviewers: rnk, kcc
Subscribers: kubabrecka, dberris, llvm-commits, chrisha
Differential Revision: https://reviews.llvm.org/D23170
llvm-svn: 277745