Commit Graph

15 Commits

Author SHA1 Message Date
Chandler Carruth 2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Kostya Kortchinsky 9fcb91b3eb [scudo] Minor code generation improvement
Summary:
It looks like clang was generating somewhat weird assembly with the current
code. `FromPrimary`, even though `const`,  was replaced every time with the code
generated for `size <= SizeClassMap::kMaxSize` instead of using a variable or
register, and `FromPrimary` didn't induce `ClassId != 0` for the compiler, so a
dead branch was generated for `getActuallyAllocatedSize(Ptr, ClassId)` since
it's never called for `ClassId = 0` (Secondary backed allocations) [this one
was more wishful thinking on my side than anything else].

I rearranged the code bit so that the generated assembly is less clunky.

Also changed 2 whitespace inconsistencies that were bothering me.

Reviewers: alekseyshl, flowerhack

Reviewed By: flowerhack

Subscribers: llvm-commits, #sanitizers

Differential Revision: https://reviews.llvm.org/D40976

llvm-svn: 320160
2017-12-08 16:36:37 +00:00
Kostya Kortchinsky df6ba242bf [scudo] Get rid of the thread local PRNG & header salt
Summary:
It was deemed that the salt in the chunk header didn't improve security
significantly (and could actually decrease it). The initial idea was that the
same chunk would different headers on different allocations, allowing for less
predictability. The issue is that gathering the same chunk header with different
salts can give information about the other "secrets" (cookie, pointer), and that
if an attacker leaks a header, they can reuse it anyway for that same chunk
anyway since we don't enforce the salt value.

So we get rid of the salt in the header. This means we also get rid of the
thread local Prng, and that we don't need a global Prng anymore as well. This
makes everything faster.

We reuse those 8 bits to store the `ClassId` of a chunk now (0 for a secondary
based allocation). This way, we get some additional speed gains:
- `ClassId` is computed outside of the locked block;
- `getActuallyAllocatedSize` doesn't need the `GetSizeClass` call;
- same for `deallocatePrimary`;
We add a sanity check at init for this new field (all sanity checks are moved
in their own function, `init` was getting crowded).

Reviewers: alekseyshl, flowerhack

Reviewed By: alekseyshl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D40796

llvm-svn: 319791
2017-12-05 17:08:29 +00:00
Kostya Kortchinsky 0207b6fbbf [scudo] Overhaul hardware CRC32 feature detection
Summary:
This patch aims at condensing the hardware CRC32 feature detection and making
it slightly more effective on Android.

The following changes are included:
- remove the `CPUFeature` enum, and get rid of one level of nesting of
  functions: we only used CRC32, so we just implement and use
  `hasHardwareCRC32`;
- allow for a weak `getauxval`: the Android toolchain is compiled at API level
  14 for Android ARM, meaning no `getauxval` at compile time, yet we will run
  on API level 27+ devices. The `/proc/self/auxv` fallback can work but is
  worthless for a process like `init` where the proc filesystem doesn't exist
  yet. If a weak `getauxval` doesn't exist, then fallback.
- couple of extra corrections.

Reviewers: alekseyshl

Reviewed By: alekseyshl

Subscribers: kubamracek, aemerson, srhines, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D40322

llvm-svn: 318859
2017-11-22 18:30:44 +00:00
Kostya Kortchinsky 4a0ebbfe97 [scudo] Rearrange #include order
Summary:
To be compliant with https://llvm.org/docs/CodingStandards.html#include-style,
system headers have to come after local headers.

Reviewers: alekseyshl

Reviewed By: alekseyshl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39623

llvm-svn: 317390
2017-11-03 23:48:25 +00:00
Kostya Kortchinsky e1dde07640 [sanitizers] Add a blocking boolean to GetRandom prototype
Summary:
On platforms with `getrandom`, the system call defaults to blocking. This
becomes an issue in the very early stage of the boot for Scudo, when the RNG
source is not set-up yet: the syscall will block and we'll stall.

Introduce a parameter to specify that the function should not block, defaulting
to blocking as the underlying syscall does.

Update Scudo to use the non-blocking version.

Reviewers: alekseyshl

Reviewed By: alekseyshl

Subscribers: llvm-commits, kubamracek

Differential Revision: https://reviews.llvm.org/D36399

llvm-svn: 310839
2017-08-14 14:53:47 +00:00
Kostya Kortchinsky 00582563be [scudo] PRNG makeover
Summary:
This follows the addition of `GetRandom` with D34412. We remove our
`/dev/urandom` code and use the new function. Additionally, change the PRNG for
a slightly faster version. One of the issues with the old code is that we have
64 full bits of randomness per "next", using only 8 of those for the Salt and
discarding the rest. So we add a cached u64 in the PRNG that can serve up to
8 u8 before having to call the "next" function again.

During some integration work, I also realized that some very early processes
(like `init`) do not benefit from `/dev/urandom` yet. So if there is no
`getrandom` syscall as well, we have to fallback to some sort of initialization
of the PRNG.

Now a few words on why XoRoShiRo and not something else. I have played a while
with various PRNGs on 32 & 64 bit platforms. Some results are below. LCG 32 & 64
are usually faster but produce respectively 15 & 31 bits of entropy, meaning
that to get a full 64-bit, you would need to call them several times. The simple
XorShift is fast, produces 32 bits but is mediocre with regard to PRNG test
suites, PCG is slower overall, and XoRoShiRo is faster than XorShift128+ and
produces full 64 bits.

%%%
root@tulip-chiphd:/data # ./randtest.arm
[+] starting xs32...
[?] xs32 duration: 22431833053ns
[+] starting lcg32...
[?] lcg32 duration: 14941402090ns
[+] starting pcg32...
[?] pcg32 duration: 44941973771ns
[+] starting xs128p...
[?] xs128p duration: 48889786981ns
[+] starting lcg64...
[?] lcg64 duration: 33831042391ns
[+] starting xos128p...
[?] xos128p duration: 44850878605ns

root@tulip-chiphd:/data # ./randtest.aarch64
[+] starting xs32...
[?] xs32 duration: 22425151678ns
[+] starting lcg32...
[?] lcg32 duration: 14954255257ns
[+] starting pcg32...
[?] pcg32 duration: 37346265726ns
[+] starting xs128p...
[?] xs128p duration: 22523807219ns
[+] starting lcg64...
[?] lcg64 duration: 26141304679ns
[+] starting xos128p...
[?] xos128p duration: 14937033215ns
%%%

Reviewers: alekseyshl

Reviewed By: alekseyshl

Subscribers: aemerson, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D35221

llvm-svn: 307798
2017-07-12 15:29:08 +00:00
Kostya Kortchinsky b0e96eb28e [scudo] CRC32 optimizations
Summary:
This change optimizes several aspects of the checksum used for chunk headers.

First, there is no point in checking the weak symbol `computeHardwareCRC32`
everytime, it will either be there or not when we start, so check it once
during initialization and set the checksum type accordingly.

Then, the loading of `HashAlgorithm` for SSE versions (and ARM equivalent) was
not optimized out, while not necessary. So I reshuffled that part of the code,
which duplicates a tiny bit of code, but ends up in a much cleaner assembly
(and faster as we avoid an extraneous load and some calls).

The following code is the checksum at the end of `scudoMalloc` for x86_64 with
full SSE 4.2, before:
```
mov     rax, 0FFFFFFFFFFFFFFh
shl     r10, 38h
mov     edi, dword ptr cs:_ZN7__scudoL6CookieE ; __scudo::Cookie
and     r14, rax
lea     rsi, [r13-10h]
movzx   eax, cs:_ZN7__scudoL13HashAlgorithmE ; __scudo::HashAlgorithm
or      r14, r10
mov     rbx, r14
xor     bx, bx
call    _ZN7__scudo20computeHardwareCRC32Ejm ; __scudo::computeHardwareCRC32(uint,ulong)
mov     rsi, rbx
mov     edi, eax
call    _ZN7__scudo20computeHardwareCRC32Ejm ; __scudo::computeHardwareCRC32(uint,ulong)
mov     r14w, ax
mov     rax, r13
mov     [r13-10h], r14
```
After:
```
mov     rax, cs:_ZN7__scudoL6CookieE ; __scudo::Cookie
lea     rcx, [rbx-10h]
mov     rdx, 0FFFFFFFFFFFFFFh
and     r14, rdx
shl     r9, 38h
or      r14, r9
crc32   eax, rcx
mov     rdx, r14
xor     dx, dx
mov     eax, eax
crc32   eax, rdx
mov     r14w, ax
mov     rax, rbx
mov     [rbx-10h], r14
```

Reviewers: dvyukov, alekseyshl, kcc

Reviewed By: alekseyshl

Subscribers: aemerson, rengolin, llvm-commits

Differential Revision: https://reviews.llvm.org/D32971

llvm-svn: 302538
2017-05-09 15:12:38 +00:00
Kostya Kortchinsky 36b3434161 [scudo] Move thread local variables into their own files
Summary:
This change introduces scudo_tls.h & scudo_tls_linux.cpp, where we move the
thread local variables used by the allocator, namely the cache, quarantine
cache & prng. `ScudoThreadContext` will hold those. This patch doesn't
introduce any new platform support yet, this will be the object of a later
patch. This also changes the PRNG so that the structure can be POD.

Reviewers: kcc, dvyukov, alekseyshl

Reviewed By: dvyukov, alekseyshl

Subscribers: llvm-commits, mgorny

Differential Revision: https://reviews.llvm.org/D32440

llvm-svn: 301584
2017-04-27 20:21:16 +00:00
Kostya Kortchinsky 006805d146 [scudo] Minor changes and refactoring
Summary:
This is part of D31947 that is being split into several smaller changes.

This one deals with all the minor changes, more specifically:
- Rename some variables and functions to make their purpose clearer;
- Reorder some code;
- Mark the hot termination incurring checks as `UNLIKELY`; if they happen, the
  program will die anyway;
- Add a `getScudoChunk` method;
- Add an `eraseHeader` method to ScudoChunk that will clear a header with 0s;
- Add a parameter to `allocate` to know if the allocated chunk should be filled
  with zeros. This allows `calloc` to not have to call
  `GetActuallyAllocatedSize`; more changes to get rid of this function on the
  hot paths will follow;
- reallocate was missing a check to verify that the pointer is properly
  aligned on `MinAlignment`;
- The `Stats` in the secondary have to be protected by a mutex as the `Add`
  and `Sub` methods are actually not atomic;
- The software CRC32 function was moved to the header to allow for inlining.

Reviewers: dvyukov, alekseyshl, kcc

Reviewed By: dvyukov

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32242

llvm-svn: 300846
2017-04-20 15:11:00 +00:00
Kostya Kortchinsky b39dff4551 [scudo] Refactor of CRC32 and ARM runtime CRC32 detection
Summary:
ARM & AArch64 runtime detection for hardware support of CRC32 has been added
via check of the AT_HWVAL auxiliary vector.

Following Michal's suggestions in D28417, the CRC32 code has been further
changed and looks better now. When compiled with full relro (which is strongly
suggested to benefit from additional hardening), the weak symbol for
computeHardwareCRC32 is read-only and the assembly generated is fairly clean
and straight forward. As suggested, an additional optimization is to skip
the runtime check if SSE 4.2 has been enabled globally, as opposed to only
for scudo_crc32.cpp.

scudo_crc32.h has no purpose anymore and was removed.

Reviewers: alekseyshl, kcc, rengolin, mgorny, phosek

Reviewed By: rengolin, mgorny

Subscribers: aemerson, rengolin, llvm-commits

Differential Revision: https://reviews.llvm.org/D28574

llvm-svn: 292409
2017-01-18 17:11:17 +00:00
Kostya Kortchinsky c4d6c938e3 [scudo] Separate hardware CRC32 routines
Summary:
As raised in D28304, enabling SSE 4.2 for the whole Scudo tree leads to the
emission of SSE 4.2 instructions everywhere, while the runtime checks only
applied to the CRC32 computing function.

This patch separates the CRC32 function taking advantage of the hardware into
its own file, and only enabled -msse4.2 for that file, if detected to be
supported by the compiler.

Another consequence of removing SSE4.2 globally is realizing that memcpy were
not being optimized, which turned out to be due to the -fno-builtin in
SANITIZER_COMMON_CFLAGS. So we now explicitely enable builtins for Scudo.

The resulting assembly looks good, with some CALLs are introduced instead of
the CRC32 code being inlined.

Reviewers: kcc, mgorny, alekseyshl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28417

llvm-svn: 291570
2017-01-10 16:39:36 +00:00
Kostya Kortchinsky 1148dc5274 [scudo] 32-bit and hardware agnostic support
Summary:
This update introduces i386 support for the Scudo Hardened Allocator, and
offers software alternatives for functions that used to require hardware
specific instruction sets. This should make porting to new architectures
easier.

Among the changes:
- The chunk header has been changed to accomodate the size limitations
  encountered on 32-bit architectures. We now fit everything in 64-bit. This
  was achieved by storing the amount of unused bytes in an allocation rather
  than the size itself, as one can be deduced from the other with the help
  of the GetActuallyAllocatedSize function. As it turns out, this header can
  be used for both 64 and 32 bit, and as such we dropped the requirement for
  the 128-bit compare and exchange instruction support (cmpxchg16b).
- Add 32-bit support for the checksum and the PRNG functions: if the SSE 4.2
  instruction set is supported, use the 32-bit CRC32 instruction, and in the
  XorShift128, use a 32-bit based state instead of 64-bit.
- Add software support for CRC32: if SSE 4.2 is not supported, fallback on a
  software implementation.
- Modify tests that were not 32-bit compliant, and expand them to cover more
  allocation and alignment sizes. The random shuffle test has been deactivated
  for linux-i386 & linux-i686 as the 32-bit sanitizer allocator doesn't
  currently randomize chunks.

Reviewers: alekseyshl, kcc

Subscribers: filcab, llvm-commits, tberghammer, danalbert, srhines, mgorny, modocache

Differential Revision: https://reviews.llvm.org/D26358

llvm-svn: 288255
2016-11-30 17:32:20 +00:00
Kostya Serebryany 8b4904f9d7 [scudo] add NORETURN to the declaration of dieWithMessage; this should fix a warning in lib/scudo/scudo_termination.cpp
llvm-svn: 277546
2016-08-02 23:23:13 +00:00
Kostya Serebryany 712fc9803a [sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.

Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc

Subscribers: kubabrecka, filcab, llvm-commits

Differential Revision: http://reviews.llvm.org/D20084

llvm-svn: 271968
2016-06-07 01:20:26 +00:00