Commit Graph

12 Commits

Author SHA1 Message Date
Kostya Kortchinsky 3ad32a037e [scudo] Correct a behavior on the shared TSD registry
Summary:
There is an error in the shared TSD registry logic when looking for a
TSD in the slow path. There is an unlikely event when a TSD's precedence
was 0 after attempting a `tryLock` which indicated that it was grabbed
by another thread in between. We dealt with that case by continuing to
the next iteration, but that meant that the `Index` was not increased
and we ended up trying to lock the same TSD.
This would manifest in heavy contention, and in the end we would still
lock a TSD, but that was a wasted iteration.
So, do not `continue`, just skip the TSD as a potential candidate.

This is in both the standalone & non-standalone versions.

Reviewers: morehouse, eugenis, vitalybuka, hctim

Reviewed By: morehouse

Subscribers: delcypher, #sanitizers, llvm-commits

Tags: #llvm, #sanitizers

Differential Revision: https://reviews.llvm.org/D63783

llvm-svn: 364345
2019-06-25 19:58:11 +00:00
Chandler Carruth 2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Kostya Kortchinsky 4410e2c43f [scudo] Improve the scalability of the shared TSD model
Summary:
The shared TSD model in its current form doesn't scale. Here is an example of
rpc2-benchmark (with default parameters, which is threading heavy) on a 72-core
machines (defaulting to a `CompactSizeClassMap` and no Quarantine):
- with tcmalloc: 337K reqs/sec, peak RSS of 338MB;
- with scudo (exclusive): 321K reqs/sec, peak RSS of 637MB;
- with scudo (shared): 241K reqs/sec, peak RSS of 324MB.

This isn't great, since the exclusive model uses a lot of memory, while the
shared model doesn't even come close to be competitive.

This is mostly due to the fact that we are consistently scanning the TSD pool
starting at index 0 for an available TSD, which can result in a lot of failed
lock attempts, and touching some memory that needs not be touched.

This CL attempts to make things better in most situations:
- first, use a thread local variable on Linux (intead of pthread APIs) to store
  the current TSD in the shared model;
- move the locking boolean out of the TSD: this allows the compiler to use a
  register and potentially optimize out a branch instead of reading it from the
  TSD everytime (we also save a tiny bit of memory per TSD);
- 64-bit atomic operations on 32-bit ARM platforms happen to be expensive: so
  store the `Precedence` in a `uptr` instead of a `u64`. We lose some
  nanoseconds of precision and we'll wrap around at some point, but the benefit
  is worth it;
- change a `CHECK` to a `DCHECK`: this should never happen, but if something is
  ever terribly wrong, we'll crash on a near null AV if the TSD happens to be
  null;
- based on an idea by dvyukov@, we are implementing a bound random scan for
  an available TSD. This requires computing the coprimes for the number of TSDs,
  and attempting to lock up to 4 TSDs in an random order before falling back to
  the current one. This is obviously slightly more expansive when we have just
  2 TSDs (barely noticeable) but is otherwise beneficial. The `Precedence` still
  basically corresponds to the moment of the first contention on a TSD. To seed
  on random choice, we use the precedence of the current TSD since it is very
  likely to be non-zero (since we are in the slow path after a failed `tryLock`)

With those modifications, the benchmark yields to:
- with scudo (shared): 330K reqs/sec, peak RSS of 327MB.

So the shared model for this specific situation not only becomes competitive but
outperforms the exclusive model. I experimented with some values greater than 4
for the number of TSDs to attempt to lock and it yielded a decrease in QPS. Just
sticking with the current TSD is also a tad slower. Numbers on platforms with
less cores (eg: Android) remain similar.

Reviewers: alekseyshl, dvyukov, javed.absar

Reviewed By: alekseyshl, dvyukov

Subscribers: srhines, kristof.beyls, delcypher, llvm-commits, #sanitizers

Differential Revision: https://reviews.llvm.org/D47289

llvm-svn: 334410
2018-06-11 14:50:31 +00:00
Kostya Kortchinsky 5a3fdbd829 [scudo] Make getNumberOfCPUs Fuchsia compliant v2
Summary:
This change allows Fuchsia to boot properly using the Scudo allocator.

A first version of this commit was reverted by rL317834 because it broke Android
builds for toolchains generated with older NDKs. This commit introduces a
fall back to solve that issue.

Reviewers: cryptoad, krytarowski, rnk, alekseyshl

Reviewed By: cryptoad, krytarowski, alekseyshl

Subscribers: llvm-commits, srhines, kubamracek, krytarowski

Differential Revision: https://reviews.llvm.org/D40121

llvm-svn: 318802
2017-11-21 21:14:00 +00:00
Kostya Kortchinsky 5604ad1c9b [sanitizer] Revert rL317822
Summary:
This reverts D39490.

For toolchains generated with older NDKs (<=r13b as far as we tested),
`cpu_set_t` doesn't exist in `sched.h`.
We have to figure out another way to get the number of CPUs without this.

Reviewers: rnk

Reviewed By: rnk

Subscribers: kubamracek, llvm-commits, krytarowski

Differential Revision: https://reviews.llvm.org/D39867

llvm-svn: 317834
2017-11-09 21:26:07 +00:00
Kostya Kortchinsky 6458216b28 [scudo] Make getNumberOfCPUs Fuchsia compliant
Summary: This change allows Fuchsia to boot properly using the Scudo allocator.

Reviewers: cryptoad, alekseyshl, krytarowski

Reviewed By: cryptoad, krytarowski

Subscribers: rnk, krytarowski, kubamracek, llvm-commits

Differential Revision: https://reviews.llvm.org/D39490

llvm-svn: 317822
2017-11-09 19:18:55 +00:00
Reid Kleckner f7fdac4508 Revert "[scudo] Make getNumberOfCPUs Fuchsia compliant"
This reverts commit r317604.

Android doesn't have cpu_set_t.

llvm-svn: 317655
2017-11-08 01:33:15 +00:00
Kostya Kortchinsky 4e8ce0225f [scudo] Make getNumberOfCPUs Fuchsia compliant
Summary: This change allows Fuchsia to boot properly using the Scudo allocator.

Reviewers: cryptoad, alekseyshl, krytarowski

Reviewed By: cryptoad, krytarowski

Subscribers: krytarowski, kubamracek, llvm-commits

Differential Revision: https://reviews.llvm.org/D39490

llvm-svn: 317604
2017-11-07 19:30:08 +00:00
Kostya Kortchinsky 91b7558ca8 [scudo] Allow to specify the maximum number of TSDs at compile time
Summary:
This introduces `SCUDO_MAX_CACHES` allowing to define an upper bound to the
number of `ScudoTSD` created in the Shared TSD model (by default 32U).
This name felt clearer than `SCUDO_MAX_TSDS` which is technically what it really
is. I am opened to suggestions if that doesn't feel right.

Additionally change `getNumberOfCPUs` to return a `u32` to be more consistent.

Reviewers: alekseyshl

Reviewed By: alekseyshl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39338

llvm-svn: 316788
2017-10-27 20:10:14 +00:00
Kostya Kortchinsky f4c11e353a [scudo] Allow for non-Android Shared TSD platforms, part 2
Summary:
Follow up to D38826.

We introduce `pthread_{get,set}specific` versions of `{get,set}CurrentTSD` to
allow for non Android platforms to use the Shared TSD model.
We now allow `SCUDO_TSD_EXCLUSIVE` to be defined at compile time.

A couple of things:
- I know that `#if SANITIZER_ANDROID` is not ideal within a function, but in
  the end I feel it looks more compact and clean than going the .inc route; I
  am open to an alternative if anyone has one;
- `SCUDO_TSD_EXCLUSIVE=1` requires ELF TLS support (and not emutls as this uses
  malloc). I haven't found anything to enforce that, so it's currently not
  checked.

Reviewers: alekseyshl

Reviewed By: alekseyshl

Subscribers: srhines, llvm-commits

Differential Revision: https://reviews.llvm.org/D38854

llvm-svn: 315751
2017-10-13 20:55:31 +00:00
Kostya Kortchinsky 8d4ba5fd23 [scudo] Allow for non-Android Shared TSD platforms, part 1
Summary:
This first part just prepares the grounds for part 2 and doesn't add any new
functionality. It mostly consists of small refactors:
- move the `pthread.h` include higher as it will be used in the headers;
- use `errno.h` in `scudo_allocator.cpp` instead of the sanitizer one, update
  the `errno` assignments accordingly (otherwise it creates conflicts on some
  platforms due to `pthread.h` including `errno.h`);
- introduce and use `getCurrentTSD` and `setCurrentTSD` for the shared TSD
  model code;

Reviewers: alekseyshl

Reviewed By: alekseyshl

Subscribers: llvm-commits, srhines

Differential Revision: https://reviews.llvm.org/D38826

llvm-svn: 315583
2017-10-12 15:01:09 +00:00
Kostya Kortchinsky b59abb2590 [scudo] Scudo thread specific data refactor, part 3
Summary:
Previous parts: D38139, D38183.

In this part of the refactor, we abstract the Linux vs Android TSD dissociation
in favor of a Exclusive vs Shared one, allowing for easier platform introduction
and configuration.

Most of this change consist of shuffling the files around to reflect the new
organization.

We introduce `scudo_platform.h` where platform specific definition lie. This
involves the TSD model and the platform specific allocator parameters. In an
upcoming CL, those will be configurable via defines, but we currently stick
with conservative defaults.

Reviewers: alekseyshl, dvyukov

Reviewed By: alekseyshl, dvyukov

Subscribers: srhines, llvm-commits, mgorny

Differential Revision: https://reviews.llvm.org/D38244

llvm-svn: 314224
2017-09-26 17:20:02 +00:00