Commit Graph

76 Commits

Author SHA1 Message Date
Mitch Phillips 4f029d1be4 [GWP-ASan] Split the unwinder into segv/non-segv.
Note: Resubmission with frame pointers force-enabled to fix builds with
-DCOMPILER_RT_BUILD_BUILTINS=False

Summary:
Splits the unwinder into a non-segv (for allocation/deallocation traces) and a
segv unwinder. This ensures that implementations can select an accurate, slower
unwinder in the segv handler (if they choose to use the GWP-ASan provided one).
This is important as fast frame-pointer unwinders (like the sanitizer unwinder)
don't like unwinding through signal handlers.

Reviewers: morehouse, cryptoad

Reviewed By: morehouse, cryptoad

Subscribers: cryptoad, mgorny, eugenis, pcc, #sanitizers

Tags: #sanitizers

Differential Revision: https://reviews.llvm.org/D83994
2020-07-21 08:25:37 -07:00
Hans Wennborg ab6263c925 Revert 502f0cc0e3 "[GWP-ASan] Split the unwinder into segv/non-segv."
It was causing tests to fail in -DCOMPILER_RT_BUILD_BUILTINS=OFF builds:

   GwpAsan-Unittest :: ./GwpAsan-x86_64-Test/BacktraceGuardedPoolAllocator.DoubleFree
   GwpAsan-Unittest :: ./GwpAsan-x86_64-Test/BacktraceGuardedPoolAllocator.UseAfterFree

see comment on the code review.

> Summary:
> Splits the unwinder into a non-segv (for allocation/deallocation traces) and a
> segv unwinder. This ensures that implementations can select an accurate, slower
> unwinder in the segv handler (if they choose to use the GWP-ASan provided one).
> This is important as fast frame-pointer unwinders (like the sanitizer unwinder)
> don't like unwinding through signal handlers.
>
> Reviewers: morehouse, cryptoad
>
> Reviewed By: morehouse, cryptoad
>
> Subscribers: cryptoad, mgorny, eugenis, pcc, #sanitizers
>
> Tags: #sanitizers
>
> Differential Revision: https://reviews.llvm.org/D83994

This reverts commit 502f0cc0e3.
2020-07-21 11:18:07 +02:00
Mitch Phillips 502f0cc0e3 [GWP-ASan] Split the unwinder into segv/non-segv.
Summary:
Splits the unwinder into a non-segv (for allocation/deallocation traces) and a
segv unwinder. This ensures that implementations can select an accurate, slower
unwinder in the segv handler (if they choose to use the GWP-ASan provided one).
This is important as fast frame-pointer unwinders (like the sanitizer unwinder)
don't like unwinding through signal handlers.

Reviewers: morehouse, cryptoad

Reviewed By: morehouse, cryptoad

Subscribers: cryptoad, mgorny, eugenis, pcc, #sanitizers

Tags: #sanitizers

Differential Revision: https://reviews.llvm.org/D83994
2020-07-17 12:59:47 -07:00
Mitch Phillips a62586846f [GWP-ASan] Crash Handler API.
Summary:
Forewarning: This patch looks big in #LOC changed. I promise it's not that bad, it just moves a lot of content from one file to another. I've gone ahead and left inline comments on Phabricator for sections where this has happened.

This patch:
 1. Introduces the crash handler API (crash_handler_api.h).
 2. Moves information required for out-of-process crash handling into an AllocatorState. This is a trivially-copied POD struct that designed to be recovered from a deceased process, and used by the crash handler to create a GWP-ASan report (along with the other trivially-copied Metadata struct).
 3. Implements the crash handler API using the AllocatorState and Metadata.
 4. Adds tests for the crash handler.
 5. Reimplements the (now optionally linked by the supporting allocator) in-process crash handler (i.e. the segv handler) using the new crash handler API.
 6. Minor updates Scudo & Scudo Standalone to fix compatibility.
 7. Changed capitalisation of errors (e.g. /s/Use after free/Use After Free).

Reviewers: cryptoad, eugenis, jfb

Reviewed By: eugenis

Subscribers: merge_guards_bot, pcc, jfb, dexonsmith, mgorny, cryptoad, #sanitizers, llvm-commits

Tags: #sanitizers, #llvm

Differential Revision: https://reviews.llvm.org/D73557
2020-02-05 15:39:17 -08:00
Mitch Phillips d4acc4720e [GWP-ASan] [Scudo] Add GWP-ASan backtrace for alloc/free to Scudo.
Summary:
Adds allocation and deallocation stack trace support to Scudo. The
default provided backtrace library for GWP-ASan is supplied by the libc
unwinder, and is suitable for production variants of Scudo. If Scudo in future
has its own unwinder, it may choose to use its own over the generic unwinder
instead.

Reviewers: cryptoad

Reviewed By: cryptoad

Subscribers: kubamracek, mgorny, #sanitizers, llvm-commits, morehouse, vlad.tsyrklevich, eugenis

Tags: #sanitizers, #llvm

Differential Revision: https://reviews.llvm.org/D64085

llvm-svn: 364966
2019-07-02 20:33:19 +00:00
Mitch Phillips 21184ec5c4 [GWP-ASan] Integration with Scudo [5].
Summary:
See D60593 for further information.

This patch adds GWP-ASan support to the Scudo hardened allocator. It also
implements end-to-end integration tests using Scudo as the backing allocator.
The tests include crash handling for buffer over/underflow as well as
use-after-free detection.

Reviewers: vlad.tsyrklevich, cryptoad

Reviewed By: vlad.tsyrklevich, cryptoad

Subscribers: kubamracek, mgorny, #sanitizers, llvm-commits, morehouse

Tags: #sanitizers, #llvm

Differential Revision: https://reviews.llvm.org/D62929

llvm-svn: 363584
2019-06-17 17:45:34 +00:00
Kostya Kortchinsky f0fbeaf44a [scudo] Tuning changes based on feedback from current use
Summary:
This tunes several of the default parameters used within the allocator:
- disable the deallocation type mismatch on Android by default; this
  was causing too many issues with third party libraries;
- change the default `SizeClassMap` to `Dense`, it caches less entries
  and is way more memory efficient overall;
- relax the timing of the RSS checks, 10 times per second was too much,
  lower it to 4 times (every 250ms), and update the test so that it
  passes with the new default.

Reviewers: eugenis

Reviewed By: eugenis

Subscribers: srhines, delcypher, #sanitizers, llvm-commits

Differential Revision: https://reviews.llvm.org/D57116

llvm-svn: 352057
2019-01-24 15:56:54 +00:00
Chandler Carruth 2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Kostya Kortchinsky 9920489a2a [scudo] Replace eraseHeader with compareExchangeHeader for Quarantined chunks
Summary:
The reason for the existence of `eraseHeader` was that it was deemed faster
to null-out a chunk header, effectively making it invalid, rather than marking
it as available, which incurred a checksum computation and a cmpxchg.

A previous use of `eraseHeader` was removed with D50655 due to a race.

Now we remove the second use of it in the Quarantine deallocation path and
replace is with a `compareExchangeHeader`.

The reason for this is that greatly helps debugging some heap bugs as the chunk
header is now valid and the chunk marked available, as opposed to the header
being invalid. Eg: we get an invalid state error, instead of an invalid header
error, which reduces the possibilities. The computational penalty is negligible.

Reviewers: alekseyshl, flowerhack, eugenis

Reviewed By: eugenis

Subscribers: delcypher, jfb, #sanitizers, llvm-commits

Differential Revision: https://reviews.llvm.org/D51224

llvm-svn: 340633
2018-08-24 18:21:32 +00:00
Kostya Kortchinsky 3afc797e42 [scudo] Fix race condition in deallocation path when Quarantine is bypassed
Summary:
There is a race window in the deallocation path when the Quarantine is bypassed.
Initially we would just erase the header of a chunk if we were not to use the
Quarantine, as opposed to using a compare-exchange primitive, to make things
faster.

It turned out to be a poor decision, as 2 threads (or more) could simultaneously
deallocate the same pointer, and if the checks were to done before the header
got erased, this would result in the pointer being added twice (or more) to
distinct thread caches, and eventually be reused.

Winning the race is not trivial but can happen with enough control over the
allocation primitives. The repro added attempts to trigger the bug, with a
moderate success rate, but it should be enough to notice if the bug ever make
its way back into the code.

Since I am changing things in this file, there are 2 smaller changes tagging
along, marking a variable `const`, and improving the Quarantine bypass test at
runtime.

Reviewers: alekseyshl, eugenis, kcc, vitalybuka

Reviewed By: eugenis, vitalybuka

Subscribers: delcypher, #sanitizers, llvm-commits

Differential Revision: https://reviews.llvm.org/D50655

llvm-svn: 339705
2018-08-14 18:34:52 +00:00
Kostya Kortchinsky cccd21d42c [scudo] Simplify internal names (NFC)
Summary:
There is currently too much redundancy in the class/variable/* names in Scudo:
- we are in the namespace `__scudo`, so there is no point in having something
  named `ScudoX` to end up with a final name of `__scudo::ScudoX`;
- there are a lot of types/* that have `Allocator` in the name, given that
  Scudo is an allocator I figure this doubles up as well.

So change a bunch of the Scudo names to make them shorter, less redundant, and
overall simpler. They should still be pretty self explaining (or at least it
looks so to me).

The TSD part will be done in another CL (eg `__scudo::ScudoTSD`).

Reviewers: alekseyshl, eugenis

Reviewed By: alekseyshl

Subscribers: delcypher, #sanitizers, llvm-commits

Differential Revision: https://reviews.llvm.org/D49505

llvm-svn: 337557
2018-07-20 15:07:17 +00:00
Kostya Kortchinsky 63c33b1c9b [scudo] Move noinline functions definitions out of line
Summary:
Mark `isRssLimitExceeded` as `NOINLINE`, and move it's definition as well as
the one of `performSanityChecks` out of the class definition, as requested.

Reviewers: filcab, alekseyshl

Reviewed By: alekseyshl

Subscribers: delcypher, #sanitizers, llvm-commits

Differential Revision: https://reviews.llvm.org/D48228

llvm-svn: 335054
2018-06-19 15:36:30 +00:00
Kostya Kortchinsky 4adf24502e [scudo] Add verbose failures in place of CHECK(0)
Summary:
The current `FailureHandler` mechanism was fairly opaque with regard to the
failure reason due to using `CHECK(0)`. Scudo is a bit different from the other
Sanitizers as it prefers to avoid spurious processing in its failure path. So
we just `dieWithMessage` using a somewhat explicit string.

Adapted the tests for the new strings.

While this takes care of the `OnBadRequest` & `OnOOM` failures, the next step
is probably to migrate the other Scudo failures in the same failes (header
corruption, invalid state and so on).

Reviewers: alekseyshl

Reviewed By: alekseyshl

Subscribers: filcab, mgorny, delcypher, #sanitizers, llvm-commits

Differential Revision: https://reviews.llvm.org/D48199

llvm-svn: 334843
2018-06-15 16:45:19 +00:00
Kostya Kortchinsky 76969eaf3d [scudo] Add C++17 aligned new/delete operators support
Summary:
This CL adds support for aligned new/delete operators (C++17). Currently we
do not support alignment inconsistency detection on deallocation, as this
requires a header change, but the APIs are introduced and are functional.

Add a smoke test for the aligned version of the operators.

Reviewers: alekseyshl

Reviewed By: alekseyshl

Subscribers: delcypher, #sanitizers, llvm-commits

Differential Revision: https://reviews.llvm.org/D48031

llvm-svn: 334505
2018-06-12 14:42:40 +00:00
Kostya Kortchinsky 4410e2c43f [scudo] Improve the scalability of the shared TSD model
Summary:
The shared TSD model in its current form doesn't scale. Here is an example of
rpc2-benchmark (with default parameters, which is threading heavy) on a 72-core
machines (defaulting to a `CompactSizeClassMap` and no Quarantine):
- with tcmalloc: 337K reqs/sec, peak RSS of 338MB;
- with scudo (exclusive): 321K reqs/sec, peak RSS of 637MB;
- with scudo (shared): 241K reqs/sec, peak RSS of 324MB.

This isn't great, since the exclusive model uses a lot of memory, while the
shared model doesn't even come close to be competitive.

This is mostly due to the fact that we are consistently scanning the TSD pool
starting at index 0 for an available TSD, which can result in a lot of failed
lock attempts, and touching some memory that needs not be touched.

This CL attempts to make things better in most situations:
- first, use a thread local variable on Linux (intead of pthread APIs) to store
  the current TSD in the shared model;
- move the locking boolean out of the TSD: this allows the compiler to use a
  register and potentially optimize out a branch instead of reading it from the
  TSD everytime (we also save a tiny bit of memory per TSD);
- 64-bit atomic operations on 32-bit ARM platforms happen to be expensive: so
  store the `Precedence` in a `uptr` instead of a `u64`. We lose some
  nanoseconds of precision and we'll wrap around at some point, but the benefit
  is worth it;
- change a `CHECK` to a `DCHECK`: this should never happen, but if something is
  ever terribly wrong, we'll crash on a near null AV if the TSD happens to be
  null;
- based on an idea by dvyukov@, we are implementing a bound random scan for
  an available TSD. This requires computing the coprimes for the number of TSDs,
  and attempting to lock up to 4 TSDs in an random order before falling back to
  the current one. This is obviously slightly more expansive when we have just
  2 TSDs (barely noticeable) but is otherwise beneficial. The `Precedence` still
  basically corresponds to the moment of the first contention on a TSD. To seed
  on random choice, we use the precedence of the current TSD since it is very
  likely to be non-zero (since we are in the slow path after a failed `tryLock`)

With those modifications, the benchmark yields to:
- with scudo (shared): 330K reqs/sec, peak RSS of 327MB.

So the shared model for this specific situation not only becomes competitive but
outperforms the exclusive model. I experimented with some values greater than 4
for the number of TSDs to attempt to lock and it yielded a decrease in QPS. Just
sticking with the current TSD is also a tad slower. Numbers on platforms with
less cores (eg: Android) remain similar.

Reviewers: alekseyshl, dvyukov, javed.absar

Reviewed By: alekseyshl, dvyukov

Subscribers: srhines, kristof.beyls, delcypher, llvm-commits, #sanitizers

Differential Revision: https://reviews.llvm.org/D47289

llvm-svn: 334410
2018-06-11 14:50:31 +00:00
Kostya Kortchinsky e5c9e9f0bd [scudo] Quarantine optimization
Summary:
It turns out that the previous code construct was not optimizing the allocation
and deallocation of batches. The class id was read as a class member (even
though a precomputed one) and nothing else was optimized. By changing the
construct this way, the compiler actually optimizes most of the allocation and
deallocation away to only work with a single class id, which not only saves some
CPU but also some code footprint.

Reviewers: alekseyshl, dvyukov

Reviewed By: dvyukov

Subscribers: dvyukov, delcypher, llvm-commits, #sanitizers

Differential Revision: https://reviews.llvm.org/D46961

llvm-svn: 332502
2018-05-16 18:12:31 +00:00
Kostya Kortchinsky d8803d3d92 [scudo] Adding an interface function to print allocator stats
Summary:
This adds `__scudo_print_stats` as an interface function to display the Primary
and Secondary allocator statistics for Scudo.

Reviewers: alekseyshl, flowerhack

Reviewed By: alekseyshl

Subscribers: delcypher, llvm-commits, #sanitizers

Differential Revision: https://reviews.llvm.org/D46016

llvm-svn: 330857
2018-04-25 18:52:29 +00:00
Kostya Kortchinsky 4563b78b99 [sanitizer] Allow for the allocator "names" to be set by the tools
Summary:
In the same spirit of SanitizerToolName, allow the Primary & Secondary
allocators to have names that can be set by the tools via PrimaryAllocatorName
and SecondaryAllocatorName.

Additionally, set a non-default name for Scudo.

Reviewers: alekseyshl, vitalybuka

Reviewed By: alekseyshl, vitalybuka

Subscribers: kubamracek, delcypher, #sanitizers, llvm-commits

Differential Revision: https://reviews.llvm.org/D45600

llvm-svn: 330055
2018-04-13 19:21:27 +00:00
Kostya Kortchinsky a51139046e [scudo] Add Chunk::getSize, rework Chunk::getUsableSize
Summary:
Using `getActuallyAllocatedSize` from the Combined resulting in mediocre
compiled code, as the `ClassId != 0` predicament was not propagated there,
resulting in additional branches and dead code. Move the logic in the frontend,
which results in better compiled code. Also I think it makes it slightly easier
to distinguish between the size the user requested, and the size that was
actually allocated by the allocator.

`const` a couple of things as well.

This has no functional impact.

Reviewers: alekseyshl

Reviewed By: alekseyshl

Subscribers: delcypher, #sanitizers, llvm-commits

Differential Revision: https://reviews.llvm.org/D44444

llvm-svn: 327525
2018-03-14 15:50:32 +00:00
Kostya Kortchinsky e245ec0cf0 [scudo] Make logging more consistent
Summary:
A few changes related to logging:
- prepend `Scudo` to the error messages so that users can identify that we
  reported an error;
- replace a couple of `Report` calls in the RSS check code with
  `dieWithMessage`/`Print`, mark a condition as `UNLIKELY` in the process;
- change some messages so that they all look more or less the same. This
  includes the `CHECK` message;
- adapt a couple of tests with the new strings.

A couple of side notes: this results in a few 1-line-blocks, for which I left
brackets. There doesn't seem to be any style guide for that, I can remove them
if need be. I didn't use `SanitizerToolName` in the strings, but directly
`Scudo` because we are the only users, I could change that too.

Reviewers: alekseyshl, flowerhack

Reviewed By: alekseyshl

Subscribers: mgorny, delcypher, llvm-commits, #sanitizers

Differential Revision: https://reviews.llvm.org/D44171

llvm-svn: 326901
2018-03-07 16:22:16 +00:00
Kostya Kortchinsky beeea6227b [scudo] Introduce Chunk::getHeaderSize
Summary:
Instead of using `AlignedChunkHeaderSize`, introduce a `constexpr` function
`getHeaderSize` in the `Chunk` namespace. Switch `RoundUpTo` to a `constexpr`
as well (so we can use it in `constexpr` declarations). Mark a few variables
in the areas touched as `const`.

Overall this has no functional change, and is mostly to make things a bit more
consistent.

Reviewers: alekseyshl

Reviewed By: alekseyshl

Subscribers: delcypher, #sanitizers, llvm-commits

Differential Revision: https://reviews.llvm.org/D43772

llvm-svn: 326206
2018-02-27 16:14:49 +00:00
Kostya Kortchinsky 4223af42b8 [scudo] Add default implementations for weak functions
Summary:
This is in preparation for platforms where `SANITIZER_SUPPORTS_WEAK_HOOKS` is 0.
They require a default implementation.

Reviewers: alekseyshl

Reviewed By: alekseyshl

Subscribers: delcypher, llvm-commits, #sanitizers

Differential Revision: https://reviews.llvm.org/D42557

llvm-svn: 323795
2018-01-30 17:59:49 +00:00
Kostya Kortchinsky 1ebebde8b7 [scudo] Allow for weak hooks, gated by a define
Summary:
Hooks in the allocation & deallocation paths can be a security risk (see for an
example https://scarybeastsecurity.blogspot.com/2016/11/0day-exploit-advancing-exploitation.html
which used the glibc's __free_hook to complete exploitation).

But some users have expressed a need for them, even if only for tests and
memory benchmarks. So allow for `__sanitizer_malloc_hook` &
`__sanitizer_free_hook` to be called if defined, and gate them behind a global
define `SCUDO_CAN_USE_HOOKS` defaulting to 0.

Reviewers: alekseyshl

Reviewed By: alekseyshl

Subscribers: #sanitizers, llvm-commits

Differential Revision: https://reviews.llvm.org/D42430

llvm-svn: 323278
2018-01-23 23:07:42 +00:00
Kostya Kortchinsky 33802be579 [scudo] Fix for the Scudo interface function scope
Summary:
A forgotten include in `scudo_allocator.cpp` made the symbol only local :/

Before:
```
nm ./lib/clang/7.0.0/lib/linux/libclang_rt.scudo-i686-android.so | grep rss
00024730 t __scudo_set_rss_limit
```
After:
```
nm ./lib/clang/7.0.0/lib/linux/libclang_rt.scudo-i686-android.so | grep rs
00024760 T __scudo_set_rss_limit
```
And we want `T`!

This include also means that we can get rid of the `extern "C"` in the C++
file, the compiler does fine without it (note that this was already the case
for all the `__sanitizer_*` interface functions.

Reviewers: alekseyshl, eugenis

Reviewed By: eugenis

Subscribers: #sanitizers, llvm-commits

Differential Revision: https://reviews.llvm.org/D42199

llvm-svn: 322782
2018-01-17 23:10:02 +00:00
Kostya Kortchinsky 541c5a0797 [scudo] s/unsigned long/size_t/ for __scudo_set_rss_limit
Summary:
`__scudo_set_rss_limit`'s `LimitMb` should really be a `size_t`. Update
accordingly the prototype. To avoid the `NOLINT` and conform with the other
Sanitizers, use the sanitizers types for the internal definition. This should
have no functional change.

Additionally, capitalize a variable name to follow the LLVM coding standards.

Reviewers: alekseyshl, flowerhack

Reviewed By: alekseyshl

Subscribers: #sanitizers, llvm-commits

Differential Revision: https://reviews.llvm.org/D41704

llvm-svn: 321803
2018-01-04 17:05:04 +00:00
Kostya Kortchinsky efe3d3436a [scudo] Refactor ScudoChunk
Summary:
The initial implementation used an ASan like Chunk class that was deriving from
a Header class. Due to potential races, we ended up working with local copies
of the Header and never using the parent class fields. ScudoChunk was never
constructed but cast, and we were using `this` as the pointer needed for our
computations. This was meh.

So we refactored ScudoChunk to be now a series of static functions within the
namespace `__scudo::Chunk` that take a "user" pointer as first parameter (former
`this`). A compiled binary doesn't really change, but the code is more sensible.

Clang tends to inline all those small function (in -O2), but GCC left a few not
inlined, so we add the `INLINE` keyword to all.

Since we don't have `ScudoChunk` pointers anymore, a few variables were renamed
here and there to introduce a clearer distinction between a user pointer
(usually `Ptr`) and a backend pointer (`BackendPtr`).

Reviewers: alekseyshl, flowerhack

Reviewed By: alekseyshl

Subscribers: #sanitizers, llvm-commits

Differential Revision: https://reviews.llvm.org/D41200

llvm-svn: 320745
2017-12-14 21:32:57 +00:00
Kostya Kortchinsky f22f5fe910 [scudo] Adding a public Scudo interface
Summary:
The first and only function to start with allows to set the soft or hard RSS
limit at runtime. Add associated tests.

Reviewers: alekseyshl

Reviewed By: alekseyshl

Subscribers: mgorny, #sanitizers, llvm-commits

Differential Revision: https://reviews.llvm.org/D41128

llvm-svn: 320611
2017-12-13 20:41:35 +00:00
Kostya Kortchinsky f50246da65 [sanitizer] Introduce a vDSO aware timing function
Summary:
See D40657 & D40679 for previous versions of this patch & description.

A couple of things were fixed here to have it not break some bots.
Weak symbols can't be used with `SANITIZER_GO` so the previous version was
breakin TsanGo. I set up some additional local tests and those pass now.

I changed the workaround for the glibc vDSO issue: `__progname` is initialized
after the vDSO and is actually public and of known type, unlike
`__vdso_clock_gettime`. This works better, and with all compilers.

The rest is the same.

Reviewers: alekseyshl

Reviewed By: alekseyshl

Subscribers: srhines, kubamracek, krytarowski, llvm-commits, #sanitizers

Differential Revision: https://reviews.llvm.org/D41121

llvm-svn: 320594
2017-12-13 16:23:54 +00:00
Kostya Kortchinsky 4ac0b1e6e9 [scudo] Inline getScudoChunk function.
Summary:
getScudoChunk function is implicitly inlined for optimized builds on
clang, but not on gcc. It's a small enough function that it seems
sensible enough to just inline it by default.

Reviewers: cryptoad, alekseyshl

Reviewed By: cryptoad

Differential Revision: https://reviews.llvm.org/D41138

llvm-svn: 320592
2017-12-13 16:10:39 +00:00
Kostya Kortchinsky ab5f6aaa75 [sanitizer] Revert rL320409
Summary: D40679 broke a couple of builds, reverting while investigating.

Reviewers: alekseyshl

Reviewed By: alekseyshl

Subscribers: srhines, kubamracek, krytarowski, llvm-commits, #sanitizers

Differential Revision: https://reviews.llvm.org/D41088

llvm-svn: 320417
2017-12-11 21:03:12 +00:00
Kostya Kortchinsky d276d72441 [sanitizer] Introduce a vDSO aware time function, and use it in the allocator [redo]
Summary:
Redo of D40657, which had the initial discussion. The initial code had to move
into a libcdep file, and things had to be shuffled accordingly.

`NanoTime` is a time sink when checking whether or not to release memory to
the OS. While reducing the amount of calls to said function is in the works,
another solution that was found to be beneficial was to use a timing function
that can leverage the vDSO.

We hit a couple of snags along the way, like the fact that the glibc crashes
when clock_gettime is called from a preinit_array, or the fact that
`__vdso_clock_gettime` is mangled (for security purposes) and can't be used
directly, and also that clock_gettime can be intercepted.

The proposed solution takes care of all this as far as I can tell, and
significantly improve performances and some Scudo load tests with memory
reclaiming enabled.

@mcgrathr: please feel free to follow up on
https://reviews.llvm.org/D40657#940857 here. I posted a reply at
https://reviews.llvm.org/D40657#940974.

Reviewers: alekseyshl, krytarowski, flowerhack, mcgrathr, kubamracek

Reviewed By: alekseyshl, krytarowski

Subscribers: #sanitizers, mcgrathr, srhines, llvm-commits, kubamracek

Differential Revision: https://reviews.llvm.org/D40679

llvm-svn: 320409
2017-12-11 19:23:12 +00:00
Kostya Kortchinsky 9fcb91b3eb [scudo] Minor code generation improvement
Summary:
It looks like clang was generating somewhat weird assembly with the current
code. `FromPrimary`, even though `const`,  was replaced every time with the code
generated for `size <= SizeClassMap::kMaxSize` instead of using a variable or
register, and `FromPrimary` didn't induce `ClassId != 0` for the compiler, so a
dead branch was generated for `getActuallyAllocatedSize(Ptr, ClassId)` since
it's never called for `ClassId = 0` (Secondary backed allocations) [this one
was more wishful thinking on my side than anything else].

I rearranged the code bit so that the generated assembly is less clunky.

Also changed 2 whitespace inconsistencies that were bothering me.

Reviewers: alekseyshl, flowerhack

Reviewed By: flowerhack

Subscribers: llvm-commits, #sanitizers

Differential Revision: https://reviews.llvm.org/D40976

llvm-svn: 320160
2017-12-08 16:36:37 +00:00
Kostya Kortchinsky df6ba242bf [scudo] Get rid of the thread local PRNG & header salt
Summary:
It was deemed that the salt in the chunk header didn't improve security
significantly (and could actually decrease it). The initial idea was that the
same chunk would different headers on different allocations, allowing for less
predictability. The issue is that gathering the same chunk header with different
salts can give information about the other "secrets" (cookie, pointer), and that
if an attacker leaks a header, they can reuse it anyway for that same chunk
anyway since we don't enforce the salt value.

So we get rid of the salt in the header. This means we also get rid of the
thread local Prng, and that we don't need a global Prng anymore as well. This
makes everything faster.

We reuse those 8 bits to store the `ClassId` of a chunk now (0 for a secondary
based allocation). This way, we get some additional speed gains:
- `ClassId` is computed outside of the locked block;
- `getActuallyAllocatedSize` doesn't need the `GetSizeClass` call;
- same for `deallocatePrimary`;
We add a sanity check at init for this new field (all sanity checks are moved
in their own function, `init` was getting crowded).

Reviewers: alekseyshl, flowerhack

Reviewed By: alekseyshl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D40796

llvm-svn: 319791
2017-12-05 17:08:29 +00:00
Kostya Kortchinsky 0207b6fbbf [scudo] Overhaul hardware CRC32 feature detection
Summary:
This patch aims at condensing the hardware CRC32 feature detection and making
it slightly more effective on Android.

The following changes are included:
- remove the `CPUFeature` enum, and get rid of one level of nesting of
  functions: we only used CRC32, so we just implement and use
  `hasHardwareCRC32`;
- allow for a weak `getauxval`: the Android toolchain is compiled at API level
  14 for Android ARM, meaning no `getauxval` at compile time, yet we will run
  on API level 27+ devices. The `/proc/self/auxv` fallback can work but is
  worthless for a process like `init` where the proc filesystem doesn't exist
  yet. If a weak `getauxval` doesn't exist, then fallback.
- couple of extra corrections.

Reviewers: alekseyshl

Reviewed By: alekseyshl

Subscribers: kubamracek, aemerson, srhines, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D40322

llvm-svn: 318859
2017-11-22 18:30:44 +00:00
Kostya Kortchinsky 58f2656d7e [scudo] Soft and hard RSS limit checks
Summary:
This implements an opportunistic check for the RSS limit.

For ASan, this was implemented thanks to a background thread checking the
current RSS vs the set limit every 100ms. This was deemed problematic for Scudo
due to potential Android concerns (Zygote as pointed out by Aleksey) as well as
the general inconvenience of having a permanent background thread.

If a limit (soft or hard) is specified, we will attempt to update the RSS limit
status (exceeded or not) every 100ms. This is done in an opportunistic way: if
we can update it, we do it, if not we return the current status, mostly because
we don't need it to be fully consistent (it's done every 100ms anyway). If the
limit is exceeded `allocate` will act as if OOM for a soft limit, or just die
for a hard limit.

We use the `common_flags()`'s `hard_rss_limit_mb` & `soft_rss_limit_mb` for
configuration of the limits.

Reviewers: alekseyshl

Reviewed By: alekseyshl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D40038

llvm-svn: 318301
2017-11-15 16:40:27 +00:00
Kostya Kortchinsky a2b715f883 [scudo] Simplify initialization and flags
Summary:
This is mostly some cleanup and shouldn't affect functionalities.

Reviewing some code for a future addition, I realized that the complexity of
the initialization path was unnecessary, and so was maintaining a structure
for the allocator options throughout the initialization.

So we get rid of that structure, of an extraneous level of nesting for the
`init` function, and correct a couple of related code inaccuracies in the
flags cpp.

Reviewers: alekseyshl

Reviewed By: alekseyshl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39974

llvm-svn: 318157
2017-11-14 16:14:53 +00:00
Kostya Kortchinsky 8d4ba5fd23 [scudo] Allow for non-Android Shared TSD platforms, part 1
Summary:
This first part just prepares the grounds for part 2 and doesn't add any new
functionality. It mostly consists of small refactors:
- move the `pthread.h` include higher as it will be used in the headers;
- use `errno.h` in `scudo_allocator.cpp` instead of the sanitizer one, update
  the `errno` assignments accordingly (otherwise it creates conflicts on some
  platforms due to `pthread.h` including `errno.h`);
- introduce and use `getCurrentTSD` and `setCurrentTSD` for the shared TSD
  model code;

Reviewers: alekseyshl

Reviewed By: alekseyshl

Subscribers: llvm-commits, srhines

Differential Revision: https://reviews.llvm.org/D38826

llvm-svn: 315583
2017-10-12 15:01:09 +00:00
Kostya Kortchinsky b59abb2590 [scudo] Scudo thread specific data refactor, part 3
Summary:
Previous parts: D38139, D38183.

In this part of the refactor, we abstract the Linux vs Android TSD dissociation
in favor of a Exclusive vs Shared one, allowing for easier platform introduction
and configuration.

Most of this change consist of shuffling the files around to reflect the new
organization.

We introduce `scudo_platform.h` where platform specific definition lie. This
involves the TSD model and the platform specific allocator parameters. In an
upcoming CL, those will be configurable via defines, but we currently stick
with conservative defaults.

Reviewers: alekseyshl, dvyukov

Reviewed By: alekseyshl, dvyukov

Subscribers: srhines, llvm-commits, mgorny

Differential Revision: https://reviews.llvm.org/D38244

llvm-svn: 314224
2017-09-26 17:20:02 +00:00
Kostya Kortchinsky 22396c2f47 [scudo] Scudo thread specific data refactor, part 2
Summary:
Following D38139, we now consolidate the TSD definition, merging the shared
TSD definition with the exclusive TSD definition. We introduce a boolean set
at initializaton denoting the need for the TSD to be unlocked or not. This
adds some unused members to the exclusive TSD, but increases consistency and
reduces the definitions fragmentation.

We remove the fallback mechanism from `scudo_allocator.cpp` and add a fallback
TSD in the non-shared version. Since the shared version doesn't require one,
this makes overall more sense.

There are a couple of additional cosmetic changes: removing the header guards
from the remaining `.inc` files, added error string to a `CHECK`.

Question to reviewers: I thought about friending `getTSDAndLock` in `ScudoTSD`
so that the `FallbackTSD` could `Mutex.Lock()` directly instead of `lock()`
which involved zeroing out the `Precedence`, which is unused otherwise. Is it
worth doing?

Reviewers: alekseyshl, dvyukov, kcc

Reviewed By: dvyukov

Subscribers: srhines, llvm-commits

Differential Revision: https://reviews.llvm.org/D38183

llvm-svn: 314110
2017-09-25 15:12:08 +00:00
Kostya Kortchinsky 392480952c [scudo] Scudo thread specific data refactor, part 1
Summary:
We are going through an overhaul of Scudo's TSD, to allow for new platforms
to be integrated more easily, and make the code more sound.

This first part is mostly renaming, preferring some shorter names, correcting
some comments. I removed `getPrng` and `getAllocatorCache` to directly access
the members, there was not really any benefit to them (and it was suggested by
Dmitry in D37590).

The only functional change is in `scudo_tls_android.cpp`: we enforce bounds to
the `NumberOfTSDs` and most of the logic in `getTSDAndLockSlow` is skipped if we
only have 1 TSD.

Reviewers: alekseyshl, dvyukov, kcc

Reviewed By: dvyukov

Subscribers: llvm-commits, srhines

Differential Revision: https://reviews.llvm.org/D38139

llvm-svn: 313987
2017-09-22 15:35:37 +00:00
Kostya Kortchinsky 26e689f0c5 [scudo] Fix bad request handling when allocator has not been initialized
Summary:
In a few functions (`scudoMemalign` and the like), we would call
`ScudoAllocator::FailureHandler::OnBadRequest` if the parameters didn't check
out. The issue is that if the allocator had not been initialized (eg: if this
is the first heap related function called), we would use variables like
`allocator_may_return_null` and `exitcode` that still had their default value
(as opposed to the one set by the user or the initialization path).

To solve this, we introduce `handleBadRequest` that will call `initThreadMaybe`,
allowing the options to be correctly initialized.

Unfortunately, the tests were passing because `exitcode` was still 0, so the
results looked like success. Change those tests to do what they were supposed
to.

Reviewers: alekseyshl

Reviewed By: alekseyshl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D37853

llvm-svn: 313294
2017-09-14 20:34:32 +00:00
Kostya Kortchinsky 040c211bc4 [scudo] Fix improper TSD init after TLS destructors are called
Summary:
Some of glibc's own thread local data is destroyed after a user's thread local
destructors are called, via __libc_thread_freeres. This might involve calling
free, as is the case for strerror_thread_freeres.
If there is no prior heap operation in the thread, this free would end up
initializing some thread specific data that would never be destroyed properly
(as user's pthread destructors have already been called), while still being
deallocated when the TLS goes away. As a result, a program could SEGV, usually
in __sanitizer::AllocatorGlobalStats::Unregister, where one of the doubly linked
list links would refer to a now unmapped memory area.

To prevent this from happening, we will not do a full initialization from the
deallocation path. This means that the fallback cache & quarantine will be used
if no other heap operation has been called, and we effectively prevent the TSD
being initialized and never destroyed. The TSD will be fully initialized for all
other paths.

In the event of a thread doing only frees and nothing else, a TSD would never
be initialized for that thread, but this situation is unlikely and we can live
with that.

Reviewers: alekseyshl

Reviewed By: alekseyshl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D37697

llvm-svn: 312939
2017-09-11 19:59:40 +00:00
Kostya Kortchinsky 43917720a7 [scudo] Application & platform compatibility changes
Summary:
This patch changes a few (small) things around for compatibility purposes for
the current Android & Fuchsia work:
- `realloc`'ing some memory that was not allocated with `malloc`, `calloc` or
  `realloc`, while UB according to http://pubs.opengroup.org/onlinepubs/009695399/functions/realloc.html
  is more common that one would think. We now only check this if
  `DeallocationTypeMismatch` is set; change the "mismatch" error
  messages to be more homogeneous;
- some sketchily written but widely used libraries expect a call to `realloc`
  to copy the usable size of the old chunk to the new one instead of the
  requested size. We have to begrundingly abide by this de-facto standard.
  This doesn't seem to impact security either way, unless someone comes up with
  something we didn't think about;
- the CRC32 intrinsics for 64-bit take a 64-bit first argument. This is
  misleading as the upper 32 bits end up being ignored. This was also raising
  `-Wconversion` errors. Change things to take a `u32` as first argument.
  This also means we were (and are) only using 32 bits of the Cookie - not a
  big thing, but worth mentioning.
- Includes-wise: prefer `stddef.h` to `cstddef`, move `scudo_flags.h` where it
  is actually needed.
- Add tests for the memalign-realloc case, and the realloc-usable-size one.

(Edited typos)

Reviewers: alekseyshl

Reviewed By: alekseyshl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36754

llvm-svn: 311018
2017-08-16 16:40:48 +00:00
Kostya Kortchinsky 65fdf677f2 [scudo] Check for pvalloc overflow
Summary:
Previously we were rounding up the size passed to `pvalloc` to the next
multiple of page size no matter what. There is an overflow possibility that
wasn't accounted for. So now, return null in the event of an overflow. The man
page doesn't seem to indicate the errno to set in this particular situation,
but the glibc unit tests go for ENOMEM (https://code.woboq.org/userspace/glibc/malloc/tst-pvalloc.c.html#54)
so we'll do the same.
Update the aligned allocation funtions tests to check for properly aligned
returned pointers, and the `pvalloc` corner cases.

@alekseyshl: do you want me to do the same in the other Sanitizers?

Reviewers: alekseyshl

Reviewed By: alekseyshl

Subscribers: kubamracek, alekseyshl, llvm-commits

Differential Revision: https://reviews.llvm.org/D35818

llvm-svn: 309033
2017-07-25 21:18:02 +00:00
Kostya Kortchinsky 2d94405a32 [scudo] Quarantine overhaul
Summary:
First, some context.

The main feedback we get about the quarantine is that it's too memory hungry.
A single MB of quarantine will have an impact of 3 to 4MB of PSS/RSS, and
things quickly get out of hand in terms of memory usage, and the quarantine
ends up disabled.

The main objective of the quarantine is to protect from use-after-free
exploitation by making it harder for an attacker to reallocate a controlled
chunk in place of the targeted freed chunk. This is achieved by not making it
available to the backend right away for reuse, but holding it a little while.

Historically, what has usually been the target of such attacks was objects,
where vtable pointers or other function pointers could constitute a valuable
targeti to replace. Those are usually on the smaller side. There is barely any
advantage in putting the quarantine several megabytes of RGB data or the like.

Now for the patch.

This patch introduces a new way the Quarantine behaves in Scudo. First of all,
the size of the Quarantine will be defined in KB instead of MB, then we
introduce a new option: the size up to which (lower than or equal to) a chunk
will be quarantined. This way, we only quarantine smaller chunks, and the size
of the quarantine remains manageable. It also prevents someone from triggering
a recycle by allocating something huge. We default to 512 bytes on 32-bit and
2048 bytes on 64-bit platforms.

In details, the patches includes the following:
- introduce `QuarantineSizeKb`, but honor `QuarantineSizeMb` if set to fall
  back to the old behavior (meaning no threshold in that case);
  `QuarantineSizeMb` is described as deprecated in the options descriptios;
  documentation update will follow;
- introduce `QuarantineChunksUpToSize`, the new threshold value;
- update the `quarantine.cpp` test, and other tests using `QuarantineSizeMb`;
- remove `AllocatorOptions::copyTo`, it wasn't used;
- slightly change the logic around `quarantineOrDeallocateChunk` to accomodate
  for the new logic; rename a couple of variables there as well;

Rewriting the tests, I found a somewhat annoying bug where non-default aligned
chunks would account for more than needed when placed in the quarantine due to
`<< MinAlignment` instead of `<< MinAlignmentLog`. This is fixed and tested for
now.

Reviewers: alekseyshl, kcc

Reviewed By: alekseyshl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D35694

llvm-svn: 308884
2017-07-24 15:29:38 +00:00
Alex Shlyapnikov 42bea018af [Sanitizers] ASan/MSan/LSan allocators set errno on failure.
Summary:
ASan/MSan/LSan allocators set errno on allocation failures according to
malloc/calloc/etc. expected behavior.

MSan allocator was refactored a bit to make its structure more similar
with other allocators.

Also switch Scudo allocator to the internal errno definitions.

TSan allocator changes will follow.

Reviewers: eugenis

Subscribers: llvm-commits, kubamracek

Differential Revision: https://reviews.llvm.org/D35275

llvm-svn: 308344
2017-07-18 19:11:04 +00:00
Alex Shlyapnikov df18cbba55 [Sanitizers] Scudo allocator set errno on failure.
Summary:
Set proper errno code on alloction failure and change pvalloc and
posix_memalign implementation to satisfy their man-specified
requirements.

Reviewers: cryptoad

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D35429

llvm-svn: 308053
2017-07-14 21:17:16 +00:00
Kostya Kortchinsky b44364dd15 [scudo] Do not grab a cache for secondary allocation & per related changes
Summary:
Secondary backed allocations do not require a cache. While it's not necessary
an issue when each thread has its cache, it becomes one with a shared pool of
caches (Android), as a Secondary backed allocation or deallocation holds a
cache that could be useful to another thread doing a Primary backed allocation.

We introduce an additional PRNG and its mutex (to avoid contention with the
Fallback one for Primary allocations) that will provide the `Salt` needed for
Secondary backed allocations.

I changed some of the code in a way that feels more readable to me (eg: using
some values directly rather than going  through ternary assigned variables,
using directly `true`/`false` rather than `FromPrimary`). I will let reviewers
decide if it actually is.

An additional change is to mark `CheckForCallocOverflow` as `UNLIKELY`.

Reviewers: alekseyshl

Reviewed By: alekseyshl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D35358

llvm-svn: 307958
2017-07-13 21:01:19 +00:00
Kostya Kortchinsky 00582563be [scudo] PRNG makeover
Summary:
This follows the addition of `GetRandom` with D34412. We remove our
`/dev/urandom` code and use the new function. Additionally, change the PRNG for
a slightly faster version. One of the issues with the old code is that we have
64 full bits of randomness per "next", using only 8 of those for the Salt and
discarding the rest. So we add a cached u64 in the PRNG that can serve up to
8 u8 before having to call the "next" function again.

During some integration work, I also realized that some very early processes
(like `init`) do not benefit from `/dev/urandom` yet. So if there is no
`getrandom` syscall as well, we have to fallback to some sort of initialization
of the PRNG.

Now a few words on why XoRoShiRo and not something else. I have played a while
with various PRNGs on 32 & 64 bit platforms. Some results are below. LCG 32 & 64
are usually faster but produce respectively 15 & 31 bits of entropy, meaning
that to get a full 64-bit, you would need to call them several times. The simple
XorShift is fast, produces 32 bits but is mediocre with regard to PRNG test
suites, PCG is slower overall, and XoRoShiRo is faster than XorShift128+ and
produces full 64 bits.

%%%
root@tulip-chiphd:/data # ./randtest.arm
[+] starting xs32...
[?] xs32 duration: 22431833053ns
[+] starting lcg32...
[?] lcg32 duration: 14941402090ns
[+] starting pcg32...
[?] pcg32 duration: 44941973771ns
[+] starting xs128p...
[?] xs128p duration: 48889786981ns
[+] starting lcg64...
[?] lcg64 duration: 33831042391ns
[+] starting xos128p...
[?] xos128p duration: 44850878605ns

root@tulip-chiphd:/data # ./randtest.aarch64
[+] starting xs32...
[?] xs32 duration: 22425151678ns
[+] starting lcg32...
[?] lcg32 duration: 14954255257ns
[+] starting pcg32...
[?] pcg32 duration: 37346265726ns
[+] starting xs128p...
[?] xs128p duration: 22523807219ns
[+] starting lcg64...
[?] lcg64 duration: 26141304679ns
[+] starting xos128p...
[?] xos128p duration: 14937033215ns
%%%

Reviewers: alekseyshl

Reviewed By: alekseyshl

Subscribers: aemerson, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D35221

llvm-svn: 307798
2017-07-12 15:29:08 +00:00
Alex Shlyapnikov 346988bf02 Merge
llvm-svn: 306748
2017-06-29 21:54:38 +00:00