llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_allocator_local_c...

Ignoring revisions in .git-blame-ignore-revs. Click here to bypass and see the normal blame view.

265 lines
9.0 KiB
C
Raw Normal View History

//===-- sanitizer_allocator_local_cache.h -----------------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
// Part of the Sanitizer Allocator.
//
//===----------------------------------------------------------------------===//
#ifndef SANITIZER_ALLOCATOR_H
#error This file must be included inside sanitizer_allocator.h
#endif
// Cache used by SizeClassAllocator64.
template <class SizeClassAllocator>
struct SizeClassAllocator64LocalCache {
typedef SizeClassAllocator Allocator;
void Init(AllocatorGlobalStats *s) {
stats_.Init();
if (s)
s->Register(&stats_);
}
void Destroy(SizeClassAllocator *allocator, AllocatorGlobalStats *s) {
Drain(allocator);
if (s)
s->Unregister(&stats_);
}
void *Allocate(SizeClassAllocator *allocator, uptr class_id) {
CHECK_NE(class_id, 0UL);
CHECK_LT(class_id, kNumClasses);
PerClass *c = &per_class_[class_id];
if (UNLIKELY(c->count == 0)) {
if (UNLIKELY(!Refill(c, allocator, class_id)))
return nullptr;
[sanitizer] Allocator local cache improvements Summary: Here are a few improvements proposed for the local cache: - `InitCache` always read from `per_class_[1]` in the fast path. This was not ideal as we are working with `per_class_[class_id]`. The latter offers the same property we are looking for (eg: `max_count != 0` means initialized), so we might as well use it and keep our memory accesses local to the same `per_class_` element. So change `InitCache` to take the current `PerClass` as an argument. This also makes the fast-path assembly of `Deallocate` a lot more compact; - Change the 32-bit `Refill` & `Drain` functions to mimic their 64-bit counterparts, by passing the current `PerClass` as an argument. This saves some array computations; - As far as I can tell, `InitCache` has no place in `Drain`: it's either called from `Deallocate` which calls `InitCache`, or from the "upper" `Drain` which checks for `c->count` to be greater than 0 (strictly). So remove it there. - Move the `stats_` updates to after we are done with the `per_class_` accesses in an attempt to preserve locality once more; - Change some `CHECK` to `DCHECK`: I don't think the ones changed belonged in the fast path and seemed to be overly cautious failsafes; - Mark some variables as `const`. The overall result is cleaner more compact fast path generated code, and some performance gains with Scudo (and likely other Sanitizers). Reviewers: alekseyshl Reviewed By: alekseyshl Subscribers: kubamracek, delcypher, #sanitizers, llvm-commits Differential Revision: https://reviews.llvm.org/D42851 llvm-svn: 324257
2018-02-06 02:06:45 +08:00
DCHECK_GT(c->count, 0);
}
CompactPtrT chunk = c->chunks[--c->count];
[sanitizer] Allocator local cache improvements Summary: Here are a few improvements proposed for the local cache: - `InitCache` always read from `per_class_[1]` in the fast path. This was not ideal as we are working with `per_class_[class_id]`. The latter offers the same property we are looking for (eg: `max_count != 0` means initialized), so we might as well use it and keep our memory accesses local to the same `per_class_` element. So change `InitCache` to take the current `PerClass` as an argument. This also makes the fast-path assembly of `Deallocate` a lot more compact; - Change the 32-bit `Refill` & `Drain` functions to mimic their 64-bit counterparts, by passing the current `PerClass` as an argument. This saves some array computations; - As far as I can tell, `InitCache` has no place in `Drain`: it's either called from `Deallocate` which calls `InitCache`, or from the "upper" `Drain` which checks for `c->count` to be greater than 0 (strictly). So remove it there. - Move the `stats_` updates to after we are done with the `per_class_` accesses in an attempt to preserve locality once more; - Change some `CHECK` to `DCHECK`: I don't think the ones changed belonged in the fast path and seemed to be overly cautious failsafes; - Mark some variables as `const`. The overall result is cleaner more compact fast path generated code, and some performance gains with Scudo (and likely other Sanitizers). Reviewers: alekseyshl Reviewed By: alekseyshl Subscribers: kubamracek, delcypher, #sanitizers, llvm-commits Differential Revision: https://reviews.llvm.org/D42851 llvm-svn: 324257
2018-02-06 02:06:45 +08:00
stats_.Add(AllocatorStatAllocated, c->class_size);
return reinterpret_cast<void *>(allocator->CompactPtrToPointer(
allocator->GetRegionBeginBySizeClass(class_id), chunk));
}
void Deallocate(SizeClassAllocator *allocator, uptr class_id, void *p) {
CHECK_NE(class_id, 0UL);
CHECK_LT(class_id, kNumClasses);
// If the first allocator call on a new thread is a deallocation, then
// max_count will be zero, leading to check failure.
PerClass *c = &per_class_[class_id];
[sanitizer] Allocator local cache improvements Summary: Here are a few improvements proposed for the local cache: - `InitCache` always read from `per_class_[1]` in the fast path. This was not ideal as we are working with `per_class_[class_id]`. The latter offers the same property we are looking for (eg: `max_count != 0` means initialized), so we might as well use it and keep our memory accesses local to the same `per_class_` element. So change `InitCache` to take the current `PerClass` as an argument. This also makes the fast-path assembly of `Deallocate` a lot more compact; - Change the 32-bit `Refill` & `Drain` functions to mimic their 64-bit counterparts, by passing the current `PerClass` as an argument. This saves some array computations; - As far as I can tell, `InitCache` has no place in `Drain`: it's either called from `Deallocate` which calls `InitCache`, or from the "upper" `Drain` which checks for `c->count` to be greater than 0 (strictly). So remove it there. - Move the `stats_` updates to after we are done with the `per_class_` accesses in an attempt to preserve locality once more; - Change some `CHECK` to `DCHECK`: I don't think the ones changed belonged in the fast path and seemed to be overly cautious failsafes; - Mark some variables as `const`. The overall result is cleaner more compact fast path generated code, and some performance gains with Scudo (and likely other Sanitizers). Reviewers: alekseyshl Reviewed By: alekseyshl Subscribers: kubamracek, delcypher, #sanitizers, llvm-commits Differential Revision: https://reviews.llvm.org/D42851 llvm-svn: 324257
2018-02-06 02:06:45 +08:00
InitCache(c);
if (UNLIKELY(c->count == c->max_count))
Drain(c, allocator, class_id, c->max_count / 2);
CompactPtrT chunk = allocator->PointerToCompactPtr(
allocator->GetRegionBeginBySizeClass(class_id),
reinterpret_cast<uptr>(p));
c->chunks[c->count++] = chunk;
[sanitizer] Allocator local cache improvements Summary: Here are a few improvements proposed for the local cache: - `InitCache` always read from `per_class_[1]` in the fast path. This was not ideal as we are working with `per_class_[class_id]`. The latter offers the same property we are looking for (eg: `max_count != 0` means initialized), so we might as well use it and keep our memory accesses local to the same `per_class_` element. So change `InitCache` to take the current `PerClass` as an argument. This also makes the fast-path assembly of `Deallocate` a lot more compact; - Change the 32-bit `Refill` & `Drain` functions to mimic their 64-bit counterparts, by passing the current `PerClass` as an argument. This saves some array computations; - As far as I can tell, `InitCache` has no place in `Drain`: it's either called from `Deallocate` which calls `InitCache`, or from the "upper" `Drain` which checks for `c->count` to be greater than 0 (strictly). So remove it there. - Move the `stats_` updates to after we are done with the `per_class_` accesses in an attempt to preserve locality once more; - Change some `CHECK` to `DCHECK`: I don't think the ones changed belonged in the fast path and seemed to be overly cautious failsafes; - Mark some variables as `const`. The overall result is cleaner more compact fast path generated code, and some performance gains with Scudo (and likely other Sanitizers). Reviewers: alekseyshl Reviewed By: alekseyshl Subscribers: kubamracek, delcypher, #sanitizers, llvm-commits Differential Revision: https://reviews.llvm.org/D42851 llvm-svn: 324257
2018-02-06 02:06:45 +08:00
stats_.Sub(AllocatorStatAllocated, c->class_size);
}
void Drain(SizeClassAllocator *allocator) {
for (uptr i = 1; i < kNumClasses; i++) {
PerClass *c = &per_class_[i];
while (c->count > 0)
Drain(c, allocator, i, c->count);
}
}
private:
typedef typename Allocator::SizeClassMapT SizeClassMap;
static const uptr kNumClasses = SizeClassMap::kNumClasses;
typedef typename Allocator::CompactPtrT CompactPtrT;
struct PerClass {
u32 count;
u32 max_count;
uptr class_size;
CompactPtrT chunks[2 * SizeClassMap::kMaxNumCachedHint];
};
PerClass per_class_[kNumClasses];
AllocatorStats stats_;
[sanitizer] Allocator local cache improvements Summary: Here are a few improvements proposed for the local cache: - `InitCache` always read from `per_class_[1]` in the fast path. This was not ideal as we are working with `per_class_[class_id]`. The latter offers the same property we are looking for (eg: `max_count != 0` means initialized), so we might as well use it and keep our memory accesses local to the same `per_class_` element. So change `InitCache` to take the current `PerClass` as an argument. This also makes the fast-path assembly of `Deallocate` a lot more compact; - Change the 32-bit `Refill` & `Drain` functions to mimic their 64-bit counterparts, by passing the current `PerClass` as an argument. This saves some array computations; - As far as I can tell, `InitCache` has no place in `Drain`: it's either called from `Deallocate` which calls `InitCache`, or from the "upper" `Drain` which checks for `c->count` to be greater than 0 (strictly). So remove it there. - Move the `stats_` updates to after we are done with the `per_class_` accesses in an attempt to preserve locality once more; - Change some `CHECK` to `DCHECK`: I don't think the ones changed belonged in the fast path and seemed to be overly cautious failsafes; - Mark some variables as `const`. The overall result is cleaner more compact fast path generated code, and some performance gains with Scudo (and likely other Sanitizers). Reviewers: alekseyshl Reviewed By: alekseyshl Subscribers: kubamracek, delcypher, #sanitizers, llvm-commits Differential Revision: https://reviews.llvm.org/D42851 llvm-svn: 324257
2018-02-06 02:06:45 +08:00
void InitCache(PerClass *c) {
if (LIKELY(c->max_count))
return;
for (uptr i = 1; i < kNumClasses; i++) {
PerClass *c = &per_class_[i];
const uptr size = Allocator::ClassIdToSize(i);
c->max_count = 2 * SizeClassMap::MaxCachedHint(size);
c->class_size = size;
}
[sanitizer] Allocator local cache improvements Summary: Here are a few improvements proposed for the local cache: - `InitCache` always read from `per_class_[1]` in the fast path. This was not ideal as we are working with `per_class_[class_id]`. The latter offers the same property we are looking for (eg: `max_count != 0` means initialized), so we might as well use it and keep our memory accesses local to the same `per_class_` element. So change `InitCache` to take the current `PerClass` as an argument. This also makes the fast-path assembly of `Deallocate` a lot more compact; - Change the 32-bit `Refill` & `Drain` functions to mimic their 64-bit counterparts, by passing the current `PerClass` as an argument. This saves some array computations; - As far as I can tell, `InitCache` has no place in `Drain`: it's either called from `Deallocate` which calls `InitCache`, or from the "upper" `Drain` which checks for `c->count` to be greater than 0 (strictly). So remove it there. - Move the `stats_` updates to after we are done with the `per_class_` accesses in an attempt to preserve locality once more; - Change some `CHECK` to `DCHECK`: I don't think the ones changed belonged in the fast path and seemed to be overly cautious failsafes; - Mark some variables as `const`. The overall result is cleaner more compact fast path generated code, and some performance gains with Scudo (and likely other Sanitizers). Reviewers: alekseyshl Reviewed By: alekseyshl Subscribers: kubamracek, delcypher, #sanitizers, llvm-commits Differential Revision: https://reviews.llvm.org/D42851 llvm-svn: 324257
2018-02-06 02:06:45 +08:00
DCHECK_NE(c->max_count, 0UL);
}
NOINLINE bool Refill(PerClass *c, SizeClassAllocator *allocator,
uptr class_id) {
[sanitizer] Allocator local cache improvements Summary: Here are a few improvements proposed for the local cache: - `InitCache` always read from `per_class_[1]` in the fast path. This was not ideal as we are working with `per_class_[class_id]`. The latter offers the same property we are looking for (eg: `max_count != 0` means initialized), so we might as well use it and keep our memory accesses local to the same `per_class_` element. So change `InitCache` to take the current `PerClass` as an argument. This also makes the fast-path assembly of `Deallocate` a lot more compact; - Change the 32-bit `Refill` & `Drain` functions to mimic their 64-bit counterparts, by passing the current `PerClass` as an argument. This saves some array computations; - As far as I can tell, `InitCache` has no place in `Drain`: it's either called from `Deallocate` which calls `InitCache`, or from the "upper" `Drain` which checks for `c->count` to be greater than 0 (strictly). So remove it there. - Move the `stats_` updates to after we are done with the `per_class_` accesses in an attempt to preserve locality once more; - Change some `CHECK` to `DCHECK`: I don't think the ones changed belonged in the fast path and seemed to be overly cautious failsafes; - Mark some variables as `const`. The overall result is cleaner more compact fast path generated code, and some performance gains with Scudo (and likely other Sanitizers). Reviewers: alekseyshl Reviewed By: alekseyshl Subscribers: kubamracek, delcypher, #sanitizers, llvm-commits Differential Revision: https://reviews.llvm.org/D42851 llvm-svn: 324257
2018-02-06 02:06:45 +08:00
InitCache(c);
const uptr num_requested_chunks = c->max_count / 2;
if (UNLIKELY(!allocator->GetFromAllocator(&stats_, class_id, c->chunks,
num_requested_chunks)))
return false;
c->count = num_requested_chunks;
return true;
}
NOINLINE void Drain(PerClass *c, SizeClassAllocator *allocator, uptr class_id,
uptr count) {
CHECK_GE(c->count, count);
[sanitizer] Allocator local cache improvements Summary: Here are a few improvements proposed for the local cache: - `InitCache` always read from `per_class_[1]` in the fast path. This was not ideal as we are working with `per_class_[class_id]`. The latter offers the same property we are looking for (eg: `max_count != 0` means initialized), so we might as well use it and keep our memory accesses local to the same `per_class_` element. So change `InitCache` to take the current `PerClass` as an argument. This also makes the fast-path assembly of `Deallocate` a lot more compact; - Change the 32-bit `Refill` & `Drain` functions to mimic their 64-bit counterparts, by passing the current `PerClass` as an argument. This saves some array computations; - As far as I can tell, `InitCache` has no place in `Drain`: it's either called from `Deallocate` which calls `InitCache`, or from the "upper" `Drain` which checks for `c->count` to be greater than 0 (strictly). So remove it there. - Move the `stats_` updates to after we are done with the `per_class_` accesses in an attempt to preserve locality once more; - Change some `CHECK` to `DCHECK`: I don't think the ones changed belonged in the fast path and seemed to be overly cautious failsafes; - Mark some variables as `const`. The overall result is cleaner more compact fast path generated code, and some performance gains with Scudo (and likely other Sanitizers). Reviewers: alekseyshl Reviewed By: alekseyshl Subscribers: kubamracek, delcypher, #sanitizers, llvm-commits Differential Revision: https://reviews.llvm.org/D42851 llvm-svn: 324257
2018-02-06 02:06:45 +08:00
const uptr first_idx_to_drain = c->count - count;
c->count -= count;
allocator->ReturnToAllocator(&stats_, class_id,
&c->chunks[first_idx_to_drain], count);
}
};
// Cache used by SizeClassAllocator32.
template <class SizeClassAllocator>
struct SizeClassAllocator32LocalCache {
typedef SizeClassAllocator Allocator;
typedef typename Allocator::TransferBatch TransferBatch;
void Init(AllocatorGlobalStats *s) {
stats_.Init();
if (s)
s->Register(&stats_);
}
// Returns a TransferBatch suitable for class_id.
TransferBatch *CreateBatch(uptr class_id, SizeClassAllocator *allocator,
TransferBatch *b) {
if (uptr batch_class_id = per_class_[class_id].batch_class_id)
return (TransferBatch*)Allocate(allocator, batch_class_id);
return b;
}
// Destroys TransferBatch b.
void DestroyBatch(uptr class_id, SizeClassAllocator *allocator,
TransferBatch *b) {
if (uptr batch_class_id = per_class_[class_id].batch_class_id)
Deallocate(allocator, batch_class_id, b);
}
void Destroy(SizeClassAllocator *allocator, AllocatorGlobalStats *s) {
Drain(allocator);
if (s)
s->Unregister(&stats_);
}
void *Allocate(SizeClassAllocator *allocator, uptr class_id) {
CHECK_NE(class_id, 0UL);
CHECK_LT(class_id, kNumClasses);
PerClass *c = &per_class_[class_id];
if (UNLIKELY(c->count == 0)) {
[sanitizer] Allocator local cache improvements Summary: Here are a few improvements proposed for the local cache: - `InitCache` always read from `per_class_[1]` in the fast path. This was not ideal as we are working with `per_class_[class_id]`. The latter offers the same property we are looking for (eg: `max_count != 0` means initialized), so we might as well use it and keep our memory accesses local to the same `per_class_` element. So change `InitCache` to take the current `PerClass` as an argument. This also makes the fast-path assembly of `Deallocate` a lot more compact; - Change the 32-bit `Refill` & `Drain` functions to mimic their 64-bit counterparts, by passing the current `PerClass` as an argument. This saves some array computations; - As far as I can tell, `InitCache` has no place in `Drain`: it's either called from `Deallocate` which calls `InitCache`, or from the "upper" `Drain` which checks for `c->count` to be greater than 0 (strictly). So remove it there. - Move the `stats_` updates to after we are done with the `per_class_` accesses in an attempt to preserve locality once more; - Change some `CHECK` to `DCHECK`: I don't think the ones changed belonged in the fast path and seemed to be overly cautious failsafes; - Mark some variables as `const`. The overall result is cleaner more compact fast path generated code, and some performance gains with Scudo (and likely other Sanitizers). Reviewers: alekseyshl Reviewed By: alekseyshl Subscribers: kubamracek, delcypher, #sanitizers, llvm-commits Differential Revision: https://reviews.llvm.org/D42851 llvm-svn: 324257
2018-02-06 02:06:45 +08:00
if (UNLIKELY(!Refill(c, allocator, class_id)))
return nullptr;
[sanitizer] Allocator local cache improvements Summary: Here are a few improvements proposed for the local cache: - `InitCache` always read from `per_class_[1]` in the fast path. This was not ideal as we are working with `per_class_[class_id]`. The latter offers the same property we are looking for (eg: `max_count != 0` means initialized), so we might as well use it and keep our memory accesses local to the same `per_class_` element. So change `InitCache` to take the current `PerClass` as an argument. This also makes the fast-path assembly of `Deallocate` a lot more compact; - Change the 32-bit `Refill` & `Drain` functions to mimic their 64-bit counterparts, by passing the current `PerClass` as an argument. This saves some array computations; - As far as I can tell, `InitCache` has no place in `Drain`: it's either called from `Deallocate` which calls `InitCache`, or from the "upper" `Drain` which checks for `c->count` to be greater than 0 (strictly). So remove it there. - Move the `stats_` updates to after we are done with the `per_class_` accesses in an attempt to preserve locality once more; - Change some `CHECK` to `DCHECK`: I don't think the ones changed belonged in the fast path and seemed to be overly cautious failsafes; - Mark some variables as `const`. The overall result is cleaner more compact fast path generated code, and some performance gains with Scudo (and likely other Sanitizers). Reviewers: alekseyshl Reviewed By: alekseyshl Subscribers: kubamracek, delcypher, #sanitizers, llvm-commits Differential Revision: https://reviews.llvm.org/D42851 llvm-svn: 324257
2018-02-06 02:06:45 +08:00
DCHECK_GT(c->count, 0);
}
void *res = c->batch[--c->count];
PREFETCH(c->batch[c->count - 1]);
[sanitizer] Allocator local cache improvements Summary: Here are a few improvements proposed for the local cache: - `InitCache` always read from `per_class_[1]` in the fast path. This was not ideal as we are working with `per_class_[class_id]`. The latter offers the same property we are looking for (eg: `max_count != 0` means initialized), so we might as well use it and keep our memory accesses local to the same `per_class_` element. So change `InitCache` to take the current `PerClass` as an argument. This also makes the fast-path assembly of `Deallocate` a lot more compact; - Change the 32-bit `Refill` & `Drain` functions to mimic their 64-bit counterparts, by passing the current `PerClass` as an argument. This saves some array computations; - As far as I can tell, `InitCache` has no place in `Drain`: it's either called from `Deallocate` which calls `InitCache`, or from the "upper" `Drain` which checks for `c->count` to be greater than 0 (strictly). So remove it there. - Move the `stats_` updates to after we are done with the `per_class_` accesses in an attempt to preserve locality once more; - Change some `CHECK` to `DCHECK`: I don't think the ones changed belonged in the fast path and seemed to be overly cautious failsafes; - Mark some variables as `const`. The overall result is cleaner more compact fast path generated code, and some performance gains with Scudo (and likely other Sanitizers). Reviewers: alekseyshl Reviewed By: alekseyshl Subscribers: kubamracek, delcypher, #sanitizers, llvm-commits Differential Revision: https://reviews.llvm.org/D42851 llvm-svn: 324257
2018-02-06 02:06:45 +08:00
stats_.Add(AllocatorStatAllocated, c->class_size);
return res;
}
void Deallocate(SizeClassAllocator *allocator, uptr class_id, void *p) {
CHECK_NE(class_id, 0UL);
CHECK_LT(class_id, kNumClasses);
// If the first allocator call on a new thread is a deallocation, then
// max_count will be zero, leading to check failure.
PerClass *c = &per_class_[class_id];
[sanitizer] Allocator local cache improvements Summary: Here are a few improvements proposed for the local cache: - `InitCache` always read from `per_class_[1]` in the fast path. This was not ideal as we are working with `per_class_[class_id]`. The latter offers the same property we are looking for (eg: `max_count != 0` means initialized), so we might as well use it and keep our memory accesses local to the same `per_class_` element. So change `InitCache` to take the current `PerClass` as an argument. This also makes the fast-path assembly of `Deallocate` a lot more compact; - Change the 32-bit `Refill` & `Drain` functions to mimic their 64-bit counterparts, by passing the current `PerClass` as an argument. This saves some array computations; - As far as I can tell, `InitCache` has no place in `Drain`: it's either called from `Deallocate` which calls `InitCache`, or from the "upper" `Drain` which checks for `c->count` to be greater than 0 (strictly). So remove it there. - Move the `stats_` updates to after we are done with the `per_class_` accesses in an attempt to preserve locality once more; - Change some `CHECK` to `DCHECK`: I don't think the ones changed belonged in the fast path and seemed to be overly cautious failsafes; - Mark some variables as `const`. The overall result is cleaner more compact fast path generated code, and some performance gains with Scudo (and likely other Sanitizers). Reviewers: alekseyshl Reviewed By: alekseyshl Subscribers: kubamracek, delcypher, #sanitizers, llvm-commits Differential Revision: https://reviews.llvm.org/D42851 llvm-svn: 324257
2018-02-06 02:06:45 +08:00
InitCache(c);
if (UNLIKELY(c->count == c->max_count))
[sanitizer] Allocator local cache improvements Summary: Here are a few improvements proposed for the local cache: - `InitCache` always read from `per_class_[1]` in the fast path. This was not ideal as we are working with `per_class_[class_id]`. The latter offers the same property we are looking for (eg: `max_count != 0` means initialized), so we might as well use it and keep our memory accesses local to the same `per_class_` element. So change `InitCache` to take the current `PerClass` as an argument. This also makes the fast-path assembly of `Deallocate` a lot more compact; - Change the 32-bit `Refill` & `Drain` functions to mimic their 64-bit counterparts, by passing the current `PerClass` as an argument. This saves some array computations; - As far as I can tell, `InitCache` has no place in `Drain`: it's either called from `Deallocate` which calls `InitCache`, or from the "upper" `Drain` which checks for `c->count` to be greater than 0 (strictly). So remove it there. - Move the `stats_` updates to after we are done with the `per_class_` accesses in an attempt to preserve locality once more; - Change some `CHECK` to `DCHECK`: I don't think the ones changed belonged in the fast path and seemed to be overly cautious failsafes; - Mark some variables as `const`. The overall result is cleaner more compact fast path generated code, and some performance gains with Scudo (and likely other Sanitizers). Reviewers: alekseyshl Reviewed By: alekseyshl Subscribers: kubamracek, delcypher, #sanitizers, llvm-commits Differential Revision: https://reviews.llvm.org/D42851 llvm-svn: 324257
2018-02-06 02:06:45 +08:00
Drain(c, allocator, class_id);
c->batch[c->count++] = p;
[sanitizer] Allocator local cache improvements Summary: Here are a few improvements proposed for the local cache: - `InitCache` always read from `per_class_[1]` in the fast path. This was not ideal as we are working with `per_class_[class_id]`. The latter offers the same property we are looking for (eg: `max_count != 0` means initialized), so we might as well use it and keep our memory accesses local to the same `per_class_` element. So change `InitCache` to take the current `PerClass` as an argument. This also makes the fast-path assembly of `Deallocate` a lot more compact; - Change the 32-bit `Refill` & `Drain` functions to mimic their 64-bit counterparts, by passing the current `PerClass` as an argument. This saves some array computations; - As far as I can tell, `InitCache` has no place in `Drain`: it's either called from `Deallocate` which calls `InitCache`, or from the "upper" `Drain` which checks for `c->count` to be greater than 0 (strictly). So remove it there. - Move the `stats_` updates to after we are done with the `per_class_` accesses in an attempt to preserve locality once more; - Change some `CHECK` to `DCHECK`: I don't think the ones changed belonged in the fast path and seemed to be overly cautious failsafes; - Mark some variables as `const`. The overall result is cleaner more compact fast path generated code, and some performance gains with Scudo (and likely other Sanitizers). Reviewers: alekseyshl Reviewed By: alekseyshl Subscribers: kubamracek, delcypher, #sanitizers, llvm-commits Differential Revision: https://reviews.llvm.org/D42851 llvm-svn: 324257
2018-02-06 02:06:45 +08:00
stats_.Sub(AllocatorStatAllocated, c->class_size);
}
void Drain(SizeClassAllocator *allocator) {
for (uptr i = 1; i < kNumClasses; i++) {
PerClass *c = &per_class_[i];
while (c->count > 0)
[sanitizer] Allocator local cache improvements Summary: Here are a few improvements proposed for the local cache: - `InitCache` always read from `per_class_[1]` in the fast path. This was not ideal as we are working with `per_class_[class_id]`. The latter offers the same property we are looking for (eg: `max_count != 0` means initialized), so we might as well use it and keep our memory accesses local to the same `per_class_` element. So change `InitCache` to take the current `PerClass` as an argument. This also makes the fast-path assembly of `Deallocate` a lot more compact; - Change the 32-bit `Refill` & `Drain` functions to mimic their 64-bit counterparts, by passing the current `PerClass` as an argument. This saves some array computations; - As far as I can tell, `InitCache` has no place in `Drain`: it's either called from `Deallocate` which calls `InitCache`, or from the "upper" `Drain` which checks for `c->count` to be greater than 0 (strictly). So remove it there. - Move the `stats_` updates to after we are done with the `per_class_` accesses in an attempt to preserve locality once more; - Change some `CHECK` to `DCHECK`: I don't think the ones changed belonged in the fast path and seemed to be overly cautious failsafes; - Mark some variables as `const`. The overall result is cleaner more compact fast path generated code, and some performance gains with Scudo (and likely other Sanitizers). Reviewers: alekseyshl Reviewed By: alekseyshl Subscribers: kubamracek, delcypher, #sanitizers, llvm-commits Differential Revision: https://reviews.llvm.org/D42851 llvm-svn: 324257
2018-02-06 02:06:45 +08:00
Drain(c, allocator, i);
}
}
private:
typedef typename Allocator::SizeClassMapT SizeClassMap;
static const uptr kBatchClassID = SizeClassMap::kBatchClassID;
static const uptr kNumClasses = SizeClassMap::kNumClasses;
// If kUseSeparateSizeClassForBatch is true, all TransferBatch objects are
// allocated from kBatchClassID size class (except for those that are needed
// for kBatchClassID itself). The goal is to have TransferBatches in a totally
// different region of RAM to improve security.
static const bool kUseSeparateSizeClassForBatch =
Allocator::kUseSeparateSizeClassForBatch;
struct PerClass {
uptr count;
uptr max_count;
uptr class_size;
uptr batch_class_id;
void *batch[2 * TransferBatch::kMaxNumCached];
};
PerClass per_class_[kNumClasses];
AllocatorStats stats_;
[sanitizer] Allocator local cache improvements Summary: Here are a few improvements proposed for the local cache: - `InitCache` always read from `per_class_[1]` in the fast path. This was not ideal as we are working with `per_class_[class_id]`. The latter offers the same property we are looking for (eg: `max_count != 0` means initialized), so we might as well use it and keep our memory accesses local to the same `per_class_` element. So change `InitCache` to take the current `PerClass` as an argument. This also makes the fast-path assembly of `Deallocate` a lot more compact; - Change the 32-bit `Refill` & `Drain` functions to mimic their 64-bit counterparts, by passing the current `PerClass` as an argument. This saves some array computations; - As far as I can tell, `InitCache` has no place in `Drain`: it's either called from `Deallocate` which calls `InitCache`, or from the "upper" `Drain` which checks for `c->count` to be greater than 0 (strictly). So remove it there. - Move the `stats_` updates to after we are done with the `per_class_` accesses in an attempt to preserve locality once more; - Change some `CHECK` to `DCHECK`: I don't think the ones changed belonged in the fast path and seemed to be overly cautious failsafes; - Mark some variables as `const`. The overall result is cleaner more compact fast path generated code, and some performance gains with Scudo (and likely other Sanitizers). Reviewers: alekseyshl Reviewed By: alekseyshl Subscribers: kubamracek, delcypher, #sanitizers, llvm-commits Differential Revision: https://reviews.llvm.org/D42851 llvm-svn: 324257
2018-02-06 02:06:45 +08:00
void InitCache(PerClass *c) {
if (LIKELY(c->max_count))
return;
const uptr batch_class_id = SizeClassMap::ClassID(sizeof(TransferBatch));
for (uptr i = 1; i < kNumClasses; i++) {
PerClass *c = &per_class_[i];
const uptr size = Allocator::ClassIdToSize(i);
const uptr max_cached = TransferBatch::MaxCached(size);
c->max_count = 2 * max_cached;
c->class_size = size;
// Precompute the class id to use to store batches for the current class
// id. 0 means the class size is large enough to store a batch within one
// of the chunks. If using a separate size class, it will always be
// kBatchClassID, except for kBatchClassID itself.
if (kUseSeparateSizeClassForBatch) {
c->batch_class_id = (i == kBatchClassID) ? 0 : kBatchClassID;
} else {
c->batch_class_id = (size <
TransferBatch::AllocationSizeRequiredForNElements(max_cached)) ?
batch_class_id : 0;
}
}
[sanitizer] Allocator local cache improvements Summary: Here are a few improvements proposed for the local cache: - `InitCache` always read from `per_class_[1]` in the fast path. This was not ideal as we are working with `per_class_[class_id]`. The latter offers the same property we are looking for (eg: `max_count != 0` means initialized), so we might as well use it and keep our memory accesses local to the same `per_class_` element. So change `InitCache` to take the current `PerClass` as an argument. This also makes the fast-path assembly of `Deallocate` a lot more compact; - Change the 32-bit `Refill` & `Drain` functions to mimic their 64-bit counterparts, by passing the current `PerClass` as an argument. This saves some array computations; - As far as I can tell, `InitCache` has no place in `Drain`: it's either called from `Deallocate` which calls `InitCache`, or from the "upper" `Drain` which checks for `c->count` to be greater than 0 (strictly). So remove it there. - Move the `stats_` updates to after we are done with the `per_class_` accesses in an attempt to preserve locality once more; - Change some `CHECK` to `DCHECK`: I don't think the ones changed belonged in the fast path and seemed to be overly cautious failsafes; - Mark some variables as `const`. The overall result is cleaner more compact fast path generated code, and some performance gains with Scudo (and likely other Sanitizers). Reviewers: alekseyshl Reviewed By: alekseyshl Subscribers: kubamracek, delcypher, #sanitizers, llvm-commits Differential Revision: https://reviews.llvm.org/D42851 llvm-svn: 324257
2018-02-06 02:06:45 +08:00
DCHECK_NE(c->max_count, 0UL);
}
[sanitizer] Allocator local cache improvements Summary: Here are a few improvements proposed for the local cache: - `InitCache` always read from `per_class_[1]` in the fast path. This was not ideal as we are working with `per_class_[class_id]`. The latter offers the same property we are looking for (eg: `max_count != 0` means initialized), so we might as well use it and keep our memory accesses local to the same `per_class_` element. So change `InitCache` to take the current `PerClass` as an argument. This also makes the fast-path assembly of `Deallocate` a lot more compact; - Change the 32-bit `Refill` & `Drain` functions to mimic their 64-bit counterparts, by passing the current `PerClass` as an argument. This saves some array computations; - As far as I can tell, `InitCache` has no place in `Drain`: it's either called from `Deallocate` which calls `InitCache`, or from the "upper" `Drain` which checks for `c->count` to be greater than 0 (strictly). So remove it there. - Move the `stats_` updates to after we are done with the `per_class_` accesses in an attempt to preserve locality once more; - Change some `CHECK` to `DCHECK`: I don't think the ones changed belonged in the fast path and seemed to be overly cautious failsafes; - Mark some variables as `const`. The overall result is cleaner more compact fast path generated code, and some performance gains with Scudo (and likely other Sanitizers). Reviewers: alekseyshl Reviewed By: alekseyshl Subscribers: kubamracek, delcypher, #sanitizers, llvm-commits Differential Revision: https://reviews.llvm.org/D42851 llvm-svn: 324257
2018-02-06 02:06:45 +08:00
NOINLINE bool Refill(PerClass *c, SizeClassAllocator *allocator,
uptr class_id) {
InitCache(c);
TransferBatch *b = allocator->AllocateBatch(&stats_, this, class_id);
if (UNLIKELY(!b))
return false;
CHECK_GT(b->Count(), 0);
b->CopyToArray(c->batch);
c->count = b->Count();
DestroyBatch(class_id, allocator, b);
return true;
}
[sanitizer] Allocator local cache improvements Summary: Here are a few improvements proposed for the local cache: - `InitCache` always read from `per_class_[1]` in the fast path. This was not ideal as we are working with `per_class_[class_id]`. The latter offers the same property we are looking for (eg: `max_count != 0` means initialized), so we might as well use it and keep our memory accesses local to the same `per_class_` element. So change `InitCache` to take the current `PerClass` as an argument. This also makes the fast-path assembly of `Deallocate` a lot more compact; - Change the 32-bit `Refill` & `Drain` functions to mimic their 64-bit counterparts, by passing the current `PerClass` as an argument. This saves some array computations; - As far as I can tell, `InitCache` has no place in `Drain`: it's either called from `Deallocate` which calls `InitCache`, or from the "upper" `Drain` which checks for `c->count` to be greater than 0 (strictly). So remove it there. - Move the `stats_` updates to after we are done with the `per_class_` accesses in an attempt to preserve locality once more; - Change some `CHECK` to `DCHECK`: I don't think the ones changed belonged in the fast path and seemed to be overly cautious failsafes; - Mark some variables as `const`. The overall result is cleaner more compact fast path generated code, and some performance gains with Scudo (and likely other Sanitizers). Reviewers: alekseyshl Reviewed By: alekseyshl Subscribers: kubamracek, delcypher, #sanitizers, llvm-commits Differential Revision: https://reviews.llvm.org/D42851 llvm-svn: 324257
2018-02-06 02:06:45 +08:00
NOINLINE void Drain(PerClass *c, SizeClassAllocator *allocator,
uptr class_id) {
const uptr count = Min(c->max_count / 2, c->count);
const uptr first_idx_to_drain = c->count - count;
TransferBatch *b = CreateBatch(
class_id, allocator, (TransferBatch *)c->batch[first_idx_to_drain]);
// Failure to allocate a batch while releasing memory is non recoverable.
// TODO(alekseys): Figure out how to do it without allocating a new batch.
if (UNLIKELY(!b)) {
Report("FATAL: Internal error: %s's allocator failed to allocate a "
"transfer batch.\n", SanitizerToolName);
Die();
}
b->SetFromArray(&c->batch[first_idx_to_drain], count);
[sanitizer] Allocator local cache improvements Summary: Here are a few improvements proposed for the local cache: - `InitCache` always read from `per_class_[1]` in the fast path. This was not ideal as we are working with `per_class_[class_id]`. The latter offers the same property we are looking for (eg: `max_count != 0` means initialized), so we might as well use it and keep our memory accesses local to the same `per_class_` element. So change `InitCache` to take the current `PerClass` as an argument. This also makes the fast-path assembly of `Deallocate` a lot more compact; - Change the 32-bit `Refill` & `Drain` functions to mimic their 64-bit counterparts, by passing the current `PerClass` as an argument. This saves some array computations; - As far as I can tell, `InitCache` has no place in `Drain`: it's either called from `Deallocate` which calls `InitCache`, or from the "upper" `Drain` which checks for `c->count` to be greater than 0 (strictly). So remove it there. - Move the `stats_` updates to after we are done with the `per_class_` accesses in an attempt to preserve locality once more; - Change some `CHECK` to `DCHECK`: I don't think the ones changed belonged in the fast path and seemed to be overly cautious failsafes; - Mark some variables as `const`. The overall result is cleaner more compact fast path generated code, and some performance gains with Scudo (and likely other Sanitizers). Reviewers: alekseyshl Reviewed By: alekseyshl Subscribers: kubamracek, delcypher, #sanitizers, llvm-commits Differential Revision: https://reviews.llvm.org/D42851 llvm-svn: 324257
2018-02-06 02:06:45 +08:00
c->count -= count;
allocator->DeallocateBatch(&stats_, class_id, b);
}
};