[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
//===-- scudo_allocator.cpp -------------------------------------*- C++ -*-===//
|
|
|
|
//
|
2019-01-19 16:50:56 +08:00
|
|
|
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
|
|
|
|
// See https://llvm.org/LICENSE.txt for license information.
|
|
|
|
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
//
|
|
|
|
//===----------------------------------------------------------------------===//
|
|
|
|
///
|
|
|
|
/// Scudo Hardened Allocator implementation.
|
|
|
|
/// It uses the sanitizer_common allocator as a base and aims at mitigating
|
|
|
|
/// heap corruption vulnerabilities. It provides a checksum-guarded chunk
|
|
|
|
/// header, a delayed free list, and additional sanity checks.
|
|
|
|
///
|
|
|
|
//===----------------------------------------------------------------------===//
|
|
|
|
|
|
|
|
#include "scudo_allocator.h"
|
[scudo] CRC32 optimizations
Summary:
This change optimizes several aspects of the checksum used for chunk headers.
First, there is no point in checking the weak symbol `computeHardwareCRC32`
everytime, it will either be there or not when we start, so check it once
during initialization and set the checksum type accordingly.
Then, the loading of `HashAlgorithm` for SSE versions (and ARM equivalent) was
not optimized out, while not necessary. So I reshuffled that part of the code,
which duplicates a tiny bit of code, but ends up in a much cleaner assembly
(and faster as we avoid an extraneous load and some calls).
The following code is the checksum at the end of `scudoMalloc` for x86_64 with
full SSE 4.2, before:
```
mov rax, 0FFFFFFFFFFFFFFh
shl r10, 38h
mov edi, dword ptr cs:_ZN7__scudoL6CookieE ; __scudo::Cookie
and r14, rax
lea rsi, [r13-10h]
movzx eax, cs:_ZN7__scudoL13HashAlgorithmE ; __scudo::HashAlgorithm
or r14, r10
mov rbx, r14
xor bx, bx
call _ZN7__scudo20computeHardwareCRC32Ejm ; __scudo::computeHardwareCRC32(uint,ulong)
mov rsi, rbx
mov edi, eax
call _ZN7__scudo20computeHardwareCRC32Ejm ; __scudo::computeHardwareCRC32(uint,ulong)
mov r14w, ax
mov rax, r13
mov [r13-10h], r14
```
After:
```
mov rax, cs:_ZN7__scudoL6CookieE ; __scudo::Cookie
lea rcx, [rbx-10h]
mov rdx, 0FFFFFFFFFFFFFFh
and r14, rdx
shl r9, 38h
or r14, r9
crc32 eax, rcx
mov rdx, r14
xor dx, dx
mov eax, eax
crc32 eax, rdx
mov r14w, ax
mov rax, rbx
mov [rbx-10h], r14
```
Reviewers: dvyukov, alekseyshl, kcc
Reviewed By: alekseyshl
Subscribers: aemerson, rengolin, llvm-commits
Differential Revision: https://reviews.llvm.org/D32971
llvm-svn: 302538
2017-05-09 23:12:38 +08:00
|
|
|
#include "scudo_crc32.h"
|
2018-06-16 00:45:19 +08:00
|
|
|
#include "scudo_errors.h"
|
2017-08-17 00:40:48 +08:00
|
|
|
#include "scudo_flags.h"
|
2018-01-18 07:10:02 +08:00
|
|
|
#include "scudo_interface_internal.h"
|
[scudo] Scudo thread specific data refactor, part 3
Summary:
Previous parts: D38139, D38183.
In this part of the refactor, we abstract the Linux vs Android TSD dissociation
in favor of a Exclusive vs Shared one, allowing for easier platform introduction
and configuration.
Most of this change consist of shuffling the files around to reflect the new
organization.
We introduce `scudo_platform.h` where platform specific definition lie. This
involves the TSD model and the platform specific allocator parameters. In an
upcoming CL, those will be configurable via defines, but we currently stick
with conservative defaults.
Reviewers: alekseyshl, dvyukov
Reviewed By: alekseyshl, dvyukov
Subscribers: srhines, llvm-commits, mgorny
Differential Revision: https://reviews.llvm.org/D38244
llvm-svn: 314224
2017-09-27 01:20:02 +08:00
|
|
|
#include "scudo_tsd.h"
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
#include "scudo_utils.h"
|
|
|
|
|
2017-07-19 03:11:04 +08:00
|
|
|
#include "sanitizer_common/sanitizer_allocator_checks.h"
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
#include "sanitizer_common/sanitizer_allocator_interface.h"
|
|
|
|
#include "sanitizer_common/sanitizer_quarantine.h"
|
|
|
|
|
2019-06-18 01:45:34 +08:00
|
|
|
#ifdef GWP_ASAN_HOOKS
|
|
|
|
# include "gwp_asan/guarded_pool_allocator.h"
|
2019-07-03 04:33:19 +08:00
|
|
|
# include "gwp_asan/optional/backtrace.h"
|
2019-06-18 01:45:34 +08:00
|
|
|
# include "gwp_asan/optional/options_parser.h"
|
|
|
|
#endif // GWP_ASAN_HOOKS
|
|
|
|
|
2017-10-12 23:01:09 +08:00
|
|
|
#include <errno.h>
|
2017-04-20 23:11:00 +08:00
|
|
|
#include <string.h>
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
|
|
|
|
namespace __scudo {
|
|
|
|
|
|
|
|
// Global static cookie, initialized at start-up.
|
2017-12-06 01:08:29 +08:00
|
|
|
static u32 Cookie;
|
[scudo] 32-bit and hardware agnostic support
Summary:
This update introduces i386 support for the Scudo Hardened Allocator, and
offers software alternatives for functions that used to require hardware
specific instruction sets. This should make porting to new architectures
easier.
Among the changes:
- The chunk header has been changed to accomodate the size limitations
encountered on 32-bit architectures. We now fit everything in 64-bit. This
was achieved by storing the amount of unused bytes in an allocation rather
than the size itself, as one can be deduced from the other with the help
of the GetActuallyAllocatedSize function. As it turns out, this header can
be used for both 64 and 32 bit, and as such we dropped the requirement for
the 128-bit compare and exchange instruction support (cmpxchg16b).
- Add 32-bit support for the checksum and the PRNG functions: if the SSE 4.2
instruction set is supported, use the 32-bit CRC32 instruction, and in the
XorShift128, use a 32-bit based state instead of 64-bit.
- Add software support for CRC32: if SSE 4.2 is not supported, fallback on a
software implementation.
- Modify tests that were not 32-bit compliant, and expand them to cover more
allocation and alignment sizes. The random shuffle test has been deactivated
for linux-i386 & linux-i686 as the 32-bit sanitizer allocator doesn't
currently randomize chunks.
Reviewers: alekseyshl, kcc
Subscribers: filcab, llvm-commits, tberghammer, danalbert, srhines, mgorny, modocache
Differential Revision: https://reviews.llvm.org/D26358
llvm-svn: 288255
2016-12-01 01:32:20 +08:00
|
|
|
|
|
|
|
// We default to software CRC32 if the alternatives are not supported, either
|
|
|
|
// at compilation or at runtime.
|
|
|
|
static atomic_uint8_t HashAlgorithm = { CRC32Software };
|
|
|
|
|
2017-08-17 00:40:48 +08:00
|
|
|
INLINE u32 computeCRC32(u32 Crc, uptr Value, uptr *Array, uptr ArraySize) {
|
[scudo] CRC32 optimizations
Summary:
This change optimizes several aspects of the checksum used for chunk headers.
First, there is no point in checking the weak symbol `computeHardwareCRC32`
everytime, it will either be there or not when we start, so check it once
during initialization and set the checksum type accordingly.
Then, the loading of `HashAlgorithm` for SSE versions (and ARM equivalent) was
not optimized out, while not necessary. So I reshuffled that part of the code,
which duplicates a tiny bit of code, but ends up in a much cleaner assembly
(and faster as we avoid an extraneous load and some calls).
The following code is the checksum at the end of `scudoMalloc` for x86_64 with
full SSE 4.2, before:
```
mov rax, 0FFFFFFFFFFFFFFh
shl r10, 38h
mov edi, dword ptr cs:_ZN7__scudoL6CookieE ; __scudo::Cookie
and r14, rax
lea rsi, [r13-10h]
movzx eax, cs:_ZN7__scudoL13HashAlgorithmE ; __scudo::HashAlgorithm
or r14, r10
mov rbx, r14
xor bx, bx
call _ZN7__scudo20computeHardwareCRC32Ejm ; __scudo::computeHardwareCRC32(uint,ulong)
mov rsi, rbx
mov edi, eax
call _ZN7__scudo20computeHardwareCRC32Ejm ; __scudo::computeHardwareCRC32(uint,ulong)
mov r14w, ax
mov rax, r13
mov [r13-10h], r14
```
After:
```
mov rax, cs:_ZN7__scudoL6CookieE ; __scudo::Cookie
lea rcx, [rbx-10h]
mov rdx, 0FFFFFFFFFFFFFFh
and r14, rdx
shl r9, 38h
or r14, r9
crc32 eax, rcx
mov rdx, r14
xor dx, dx
mov eax, eax
crc32 eax, rdx
mov r14w, ax
mov rax, rbx
mov [rbx-10h], r14
```
Reviewers: dvyukov, alekseyshl, kcc
Reviewed By: alekseyshl
Subscribers: aemerson, rengolin, llvm-commits
Differential Revision: https://reviews.llvm.org/D32971
llvm-svn: 302538
2017-05-09 23:12:38 +08:00
|
|
|
// If the hardware CRC32 feature is defined here, it was enabled everywhere,
|
|
|
|
// as opposed to only for scudo_crc32.cpp. This means that other hardware
|
|
|
|
// specific instructions were likely emitted at other places, and as a
|
|
|
|
// result there is no reason to not use it here.
|
[scudo] Refactor of CRC32 and ARM runtime CRC32 detection
Summary:
ARM & AArch64 runtime detection for hardware support of CRC32 has been added
via check of the AT_HWVAL auxiliary vector.
Following Michal's suggestions in D28417, the CRC32 code has been further
changed and looks better now. When compiled with full relro (which is strongly
suggested to benefit from additional hardening), the weak symbol for
computeHardwareCRC32 is read-only and the assembly generated is fairly clean
and straight forward. As suggested, an additional optimization is to skip
the runtime check if SSE 4.2 has been enabled globally, as opposed to only
for scudo_crc32.cpp.
scudo_crc32.h has no purpose anymore and was removed.
Reviewers: alekseyshl, kcc, rengolin, mgorny, phosek
Reviewed By: rengolin, mgorny
Subscribers: aemerson, rengolin, llvm-commits
Differential Revision: https://reviews.llvm.org/D28574
llvm-svn: 292409
2017-01-19 01:11:17 +08:00
|
|
|
#if defined(__SSE4_2__) || defined(__ARM_FEATURE_CRC32)
|
[scudo] CRC32 optimizations
Summary:
This change optimizes several aspects of the checksum used for chunk headers.
First, there is no point in checking the weak symbol `computeHardwareCRC32`
everytime, it will either be there or not when we start, so check it once
during initialization and set the checksum type accordingly.
Then, the loading of `HashAlgorithm` for SSE versions (and ARM equivalent) was
not optimized out, while not necessary. So I reshuffled that part of the code,
which duplicates a tiny bit of code, but ends up in a much cleaner assembly
(and faster as we avoid an extraneous load and some calls).
The following code is the checksum at the end of `scudoMalloc` for x86_64 with
full SSE 4.2, before:
```
mov rax, 0FFFFFFFFFFFFFFh
shl r10, 38h
mov edi, dword ptr cs:_ZN7__scudoL6CookieE ; __scudo::Cookie
and r14, rax
lea rsi, [r13-10h]
movzx eax, cs:_ZN7__scudoL13HashAlgorithmE ; __scudo::HashAlgorithm
or r14, r10
mov rbx, r14
xor bx, bx
call _ZN7__scudo20computeHardwareCRC32Ejm ; __scudo::computeHardwareCRC32(uint,ulong)
mov rsi, rbx
mov edi, eax
call _ZN7__scudo20computeHardwareCRC32Ejm ; __scudo::computeHardwareCRC32(uint,ulong)
mov r14w, ax
mov rax, r13
mov [r13-10h], r14
```
After:
```
mov rax, cs:_ZN7__scudoL6CookieE ; __scudo::Cookie
lea rcx, [rbx-10h]
mov rdx, 0FFFFFFFFFFFFFFh
and r14, rdx
shl r9, 38h
or r14, r9
crc32 eax, rcx
mov rdx, r14
xor dx, dx
mov eax, eax
crc32 eax, rdx
mov r14w, ax
mov rax, rbx
mov [rbx-10h], r14
```
Reviewers: dvyukov, alekseyshl, kcc
Reviewed By: alekseyshl
Subscribers: aemerson, rengolin, llvm-commits
Differential Revision: https://reviews.llvm.org/D32971
llvm-svn: 302538
2017-05-09 23:12:38 +08:00
|
|
|
Crc = CRC32_INTRINSIC(Crc, Value);
|
|
|
|
for (uptr i = 0; i < ArraySize; i++)
|
|
|
|
Crc = CRC32_INTRINSIC(Crc, Array[i]);
|
|
|
|
return Crc;
|
[scudo] Refactor of CRC32 and ARM runtime CRC32 detection
Summary:
ARM & AArch64 runtime detection for hardware support of CRC32 has been added
via check of the AT_HWVAL auxiliary vector.
Following Michal's suggestions in D28417, the CRC32 code has been further
changed and looks better now. When compiled with full relro (which is strongly
suggested to benefit from additional hardening), the weak symbol for
computeHardwareCRC32 is read-only and the assembly generated is fairly clean
and straight forward. As suggested, an additional optimization is to skip
the runtime check if SSE 4.2 has been enabled globally, as opposed to only
for scudo_crc32.cpp.
scudo_crc32.h has no purpose anymore and was removed.
Reviewers: alekseyshl, kcc, rengolin, mgorny, phosek
Reviewed By: rengolin, mgorny
Subscribers: aemerson, rengolin, llvm-commits
Differential Revision: https://reviews.llvm.org/D28574
llvm-svn: 292409
2017-01-19 01:11:17 +08:00
|
|
|
#else
|
[scudo] CRC32 optimizations
Summary:
This change optimizes several aspects of the checksum used for chunk headers.
First, there is no point in checking the weak symbol `computeHardwareCRC32`
everytime, it will either be there or not when we start, so check it once
during initialization and set the checksum type accordingly.
Then, the loading of `HashAlgorithm` for SSE versions (and ARM equivalent) was
not optimized out, while not necessary. So I reshuffled that part of the code,
which duplicates a tiny bit of code, but ends up in a much cleaner assembly
(and faster as we avoid an extraneous load and some calls).
The following code is the checksum at the end of `scudoMalloc` for x86_64 with
full SSE 4.2, before:
```
mov rax, 0FFFFFFFFFFFFFFh
shl r10, 38h
mov edi, dword ptr cs:_ZN7__scudoL6CookieE ; __scudo::Cookie
and r14, rax
lea rsi, [r13-10h]
movzx eax, cs:_ZN7__scudoL13HashAlgorithmE ; __scudo::HashAlgorithm
or r14, r10
mov rbx, r14
xor bx, bx
call _ZN7__scudo20computeHardwareCRC32Ejm ; __scudo::computeHardwareCRC32(uint,ulong)
mov rsi, rbx
mov edi, eax
call _ZN7__scudo20computeHardwareCRC32Ejm ; __scudo::computeHardwareCRC32(uint,ulong)
mov r14w, ax
mov rax, r13
mov [r13-10h], r14
```
After:
```
mov rax, cs:_ZN7__scudoL6CookieE ; __scudo::Cookie
lea rcx, [rbx-10h]
mov rdx, 0FFFFFFFFFFFFFFh
and r14, rdx
shl r9, 38h
or r14, r9
crc32 eax, rcx
mov rdx, r14
xor dx, dx
mov eax, eax
crc32 eax, rdx
mov r14w, ax
mov rax, rbx
mov [rbx-10h], r14
```
Reviewers: dvyukov, alekseyshl, kcc
Reviewed By: alekseyshl
Subscribers: aemerson, rengolin, llvm-commits
Differential Revision: https://reviews.llvm.org/D32971
llvm-svn: 302538
2017-05-09 23:12:38 +08:00
|
|
|
if (atomic_load_relaxed(&HashAlgorithm) == CRC32Hardware) {
|
|
|
|
Crc = computeHardwareCRC32(Crc, Value);
|
|
|
|
for (uptr i = 0; i < ArraySize; i++)
|
|
|
|
Crc = computeHardwareCRC32(Crc, Array[i]);
|
|
|
|
return Crc;
|
|
|
|
}
|
|
|
|
Crc = computeSoftwareCRC32(Crc, Value);
|
|
|
|
for (uptr i = 0; i < ArraySize; i++)
|
|
|
|
Crc = computeSoftwareCRC32(Crc, Array[i]);
|
|
|
|
return Crc;
|
|
|
|
#endif // defined(__SSE4_2__) || defined(__ARM_FEATURE_CRC32)
|
[scudo] 32-bit and hardware agnostic support
Summary:
This update introduces i386 support for the Scudo Hardened Allocator, and
offers software alternatives for functions that used to require hardware
specific instruction sets. This should make porting to new architectures
easier.
Among the changes:
- The chunk header has been changed to accomodate the size limitations
encountered on 32-bit architectures. We now fit everything in 64-bit. This
was achieved by storing the amount of unused bytes in an allocation rather
than the size itself, as one can be deduced from the other with the help
of the GetActuallyAllocatedSize function. As it turns out, this header can
be used for both 64 and 32 bit, and as such we dropped the requirement for
the 128-bit compare and exchange instruction support (cmpxchg16b).
- Add 32-bit support for the checksum and the PRNG functions: if the SSE 4.2
instruction set is supported, use the 32-bit CRC32 instruction, and in the
XorShift128, use a 32-bit based state instead of 64-bit.
- Add software support for CRC32: if SSE 4.2 is not supported, fallback on a
software implementation.
- Modify tests that were not 32-bit compliant, and expand them to cover more
allocation and alignment sizes. The random shuffle test has been deactivated
for linux-i386 & linux-i686 as the 32-bit sanitizer allocator doesn't
currently randomize chunks.
Reviewers: alekseyshl, kcc
Subscribers: filcab, llvm-commits, tberghammer, danalbert, srhines, mgorny, modocache
Differential Revision: https://reviews.llvm.org/D26358
llvm-svn: 288255
2016-12-01 01:32:20 +08:00
|
|
|
}
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
|
2018-07-20 23:07:17 +08:00
|
|
|
static BackendT &getBackend();
|
2017-04-28 04:21:16 +08:00
|
|
|
|
2017-12-15 05:32:57 +08:00
|
|
|
namespace Chunk {
|
|
|
|
static INLINE AtomicPackedHeader *getAtomicHeader(void *Ptr) {
|
|
|
|
return reinterpret_cast<AtomicPackedHeader *>(reinterpret_cast<uptr>(Ptr) -
|
2018-02-28 00:14:49 +08:00
|
|
|
getHeaderSize());
|
2017-12-15 05:32:57 +08:00
|
|
|
}
|
|
|
|
static INLINE
|
|
|
|
const AtomicPackedHeader *getConstAtomicHeader(const void *Ptr) {
|
|
|
|
return reinterpret_cast<const AtomicPackedHeader *>(
|
2018-02-28 00:14:49 +08:00
|
|
|
reinterpret_cast<uptr>(Ptr) - getHeaderSize());
|
2017-12-15 05:32:57 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
static INLINE bool isAligned(const void *Ptr) {
|
|
|
|
return IsAligned(reinterpret_cast<uptr>(Ptr), MinAlignment);
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
}
|
|
|
|
|
2018-03-14 23:50:32 +08:00
|
|
|
// We can't use the offset member of the chunk itself, as we would double
|
|
|
|
// fetch it without any warranty that it wouldn't have been tampered. To
|
|
|
|
// prevent this, we work with a local copy of the header.
|
|
|
|
static INLINE void *getBackendPtr(const void *Ptr, UnpackedHeader *Header) {
|
|
|
|
return reinterpret_cast<void *>(reinterpret_cast<uptr>(Ptr) -
|
|
|
|
getHeaderSize() - (Header->Offset << MinAlignmentLog));
|
|
|
|
}
|
|
|
|
|
[scudo] 32-bit and hardware agnostic support
Summary:
This update introduces i386 support for the Scudo Hardened Allocator, and
offers software alternatives for functions that used to require hardware
specific instruction sets. This should make porting to new architectures
easier.
Among the changes:
- The chunk header has been changed to accomodate the size limitations
encountered on 32-bit architectures. We now fit everything in 64-bit. This
was achieved by storing the amount of unused bytes in an allocation rather
than the size itself, as one can be deduced from the other with the help
of the GetActuallyAllocatedSize function. As it turns out, this header can
be used for both 64 and 32 bit, and as such we dropped the requirement for
the 128-bit compare and exchange instruction support (cmpxchg16b).
- Add 32-bit support for the checksum and the PRNG functions: if the SSE 4.2
instruction set is supported, use the 32-bit CRC32 instruction, and in the
XorShift128, use a 32-bit based state instead of 64-bit.
- Add software support for CRC32: if SSE 4.2 is not supported, fallback on a
software implementation.
- Modify tests that were not 32-bit compliant, and expand them to cover more
allocation and alignment sizes. The random shuffle test has been deactivated
for linux-i386 & linux-i686 as the 32-bit sanitizer allocator doesn't
currently randomize chunks.
Reviewers: alekseyshl, kcc
Subscribers: filcab, llvm-commits, tberghammer, danalbert, srhines, mgorny, modocache
Differential Revision: https://reviews.llvm.org/D26358
llvm-svn: 288255
2016-12-01 01:32:20 +08:00
|
|
|
// Returns the usable size for a chunk, meaning the amount of bytes from the
|
|
|
|
// beginning of the user data to the end of the backend allocated chunk.
|
2017-12-15 05:32:57 +08:00
|
|
|
static INLINE uptr getUsableSize(const void *Ptr, UnpackedHeader *Header) {
|
2018-03-14 23:50:32 +08:00
|
|
|
const uptr ClassId = Header->ClassId;
|
|
|
|
if (ClassId)
|
2018-07-20 23:07:17 +08:00
|
|
|
return PrimaryT::ClassIdToSize(ClassId) - getHeaderSize() -
|
2018-03-14 23:50:32 +08:00
|
|
|
(Header->Offset << MinAlignmentLog);
|
2018-07-20 23:07:17 +08:00
|
|
|
return SecondaryT::GetActuallyAllocatedSize(
|
2018-03-14 23:50:32 +08:00
|
|
|
getBackendPtr(Ptr, Header)) - getHeaderSize();
|
|
|
|
}
|
|
|
|
|
|
|
|
// Returns the size the user requested when allocating the chunk.
|
|
|
|
static INLINE uptr getSize(const void *Ptr, UnpackedHeader *Header) {
|
|
|
|
const uptr SizeOrUnusedBytes = Header->SizeOrUnusedBytes;
|
|
|
|
if (Header->ClassId)
|
|
|
|
return SizeOrUnusedBytes;
|
2018-07-20 23:07:17 +08:00
|
|
|
return SecondaryT::GetActuallyAllocatedSize(
|
2018-03-14 23:50:32 +08:00
|
|
|
getBackendPtr(Ptr, Header)) - getHeaderSize() - SizeOrUnusedBytes;
|
[scudo] 32-bit and hardware agnostic support
Summary:
This update introduces i386 support for the Scudo Hardened Allocator, and
offers software alternatives for functions that used to require hardware
specific instruction sets. This should make porting to new architectures
easier.
Among the changes:
- The chunk header has been changed to accomodate the size limitations
encountered on 32-bit architectures. We now fit everything in 64-bit. This
was achieved by storing the amount of unused bytes in an allocation rather
than the size itself, as one can be deduced from the other with the help
of the GetActuallyAllocatedSize function. As it turns out, this header can
be used for both 64 and 32 bit, and as such we dropped the requirement for
the 128-bit compare and exchange instruction support (cmpxchg16b).
- Add 32-bit support for the checksum and the PRNG functions: if the SSE 4.2
instruction set is supported, use the 32-bit CRC32 instruction, and in the
XorShift128, use a 32-bit based state instead of 64-bit.
- Add software support for CRC32: if SSE 4.2 is not supported, fallback on a
software implementation.
- Modify tests that were not 32-bit compliant, and expand them to cover more
allocation and alignment sizes. The random shuffle test has been deactivated
for linux-i386 & linux-i686 as the 32-bit sanitizer allocator doesn't
currently randomize chunks.
Reviewers: alekseyshl, kcc
Subscribers: filcab, llvm-commits, tberghammer, danalbert, srhines, mgorny, modocache
Differential Revision: https://reviews.llvm.org/D26358
llvm-svn: 288255
2016-12-01 01:32:20 +08:00
|
|
|
}
|
|
|
|
|
2017-12-15 05:32:57 +08:00
|
|
|
// Compute the checksum of the chunk pointer and its header.
|
|
|
|
static INLINE u16 computeChecksum(const void *Ptr, UnpackedHeader *Header) {
|
[scudo] 32-bit and hardware agnostic support
Summary:
This update introduces i386 support for the Scudo Hardened Allocator, and
offers software alternatives for functions that used to require hardware
specific instruction sets. This should make porting to new architectures
easier.
Among the changes:
- The chunk header has been changed to accomodate the size limitations
encountered on 32-bit architectures. We now fit everything in 64-bit. This
was achieved by storing the amount of unused bytes in an allocation rather
than the size itself, as one can be deduced from the other with the help
of the GetActuallyAllocatedSize function. As it turns out, this header can
be used for both 64 and 32 bit, and as such we dropped the requirement for
the 128-bit compare and exchange instruction support (cmpxchg16b).
- Add 32-bit support for the checksum and the PRNG functions: if the SSE 4.2
instruction set is supported, use the 32-bit CRC32 instruction, and in the
XorShift128, use a 32-bit based state instead of 64-bit.
- Add software support for CRC32: if SSE 4.2 is not supported, fallback on a
software implementation.
- Modify tests that were not 32-bit compliant, and expand them to cover more
allocation and alignment sizes. The random shuffle test has been deactivated
for linux-i386 & linux-i686 as the 32-bit sanitizer allocator doesn't
currently randomize chunks.
Reviewers: alekseyshl, kcc
Subscribers: filcab, llvm-commits, tberghammer, danalbert, srhines, mgorny, modocache
Differential Revision: https://reviews.llvm.org/D26358
llvm-svn: 288255
2016-12-01 01:32:20 +08:00
|
|
|
UnpackedHeader ZeroChecksumHeader = *Header;
|
|
|
|
ZeroChecksumHeader.Checksum = 0;
|
|
|
|
uptr HeaderHolder[sizeof(UnpackedHeader) / sizeof(uptr)];
|
|
|
|
memcpy(&HeaderHolder, &ZeroChecksumHeader, sizeof(HeaderHolder));
|
2017-12-15 05:32:57 +08:00
|
|
|
const u32 Crc = computeCRC32(Cookie, reinterpret_cast<uptr>(Ptr),
|
|
|
|
HeaderHolder, ARRAY_SIZE(HeaderHolder));
|
[scudo] Refactor of CRC32 and ARM runtime CRC32 detection
Summary:
ARM & AArch64 runtime detection for hardware support of CRC32 has been added
via check of the AT_HWVAL auxiliary vector.
Following Michal's suggestions in D28417, the CRC32 code has been further
changed and looks better now. When compiled with full relro (which is strongly
suggested to benefit from additional hardening), the weak symbol for
computeHardwareCRC32 is read-only and the assembly generated is fairly clean
and straight forward. As suggested, an additional optimization is to skip
the runtime check if SSE 4.2 has been enabled globally, as opposed to only
for scudo_crc32.cpp.
scudo_crc32.h has no purpose anymore and was removed.
Reviewers: alekseyshl, kcc, rengolin, mgorny, phosek
Reviewed By: rengolin, mgorny
Subscribers: aemerson, rengolin, llvm-commits
Differential Revision: https://reviews.llvm.org/D28574
llvm-svn: 292409
2017-01-19 01:11:17 +08:00
|
|
|
return static_cast<u16>(Crc);
|
[scudo] 32-bit and hardware agnostic support
Summary:
This update introduces i386 support for the Scudo Hardened Allocator, and
offers software alternatives for functions that used to require hardware
specific instruction sets. This should make porting to new architectures
easier.
Among the changes:
- The chunk header has been changed to accomodate the size limitations
encountered on 32-bit architectures. We now fit everything in 64-bit. This
was achieved by storing the amount of unused bytes in an allocation rather
than the size itself, as one can be deduced from the other with the help
of the GetActuallyAllocatedSize function. As it turns out, this header can
be used for both 64 and 32 bit, and as such we dropped the requirement for
the 128-bit compare and exchange instruction support (cmpxchg16b).
- Add 32-bit support for the checksum and the PRNG functions: if the SSE 4.2
instruction set is supported, use the 32-bit CRC32 instruction, and in the
XorShift128, use a 32-bit based state instead of 64-bit.
- Add software support for CRC32: if SSE 4.2 is not supported, fallback on a
software implementation.
- Modify tests that were not 32-bit compliant, and expand them to cover more
allocation and alignment sizes. The random shuffle test has been deactivated
for linux-i386 & linux-i686 as the 32-bit sanitizer allocator doesn't
currently randomize chunks.
Reviewers: alekseyshl, kcc
Subscribers: filcab, llvm-commits, tberghammer, danalbert, srhines, mgorny, modocache
Differential Revision: https://reviews.llvm.org/D26358
llvm-svn: 288255
2016-12-01 01:32:20 +08:00
|
|
|
}
|
|
|
|
|
2017-04-20 23:11:00 +08:00
|
|
|
// Checks the validity of a chunk by verifying its checksum. It doesn't
|
|
|
|
// incur termination in the event of an invalid chunk.
|
2017-12-15 05:32:57 +08:00
|
|
|
static INLINE bool isValid(const void *Ptr) {
|
|
|
|
PackedHeader NewPackedHeader =
|
|
|
|
atomic_load_relaxed(getConstAtomicHeader(Ptr));
|
|
|
|
UnpackedHeader NewUnpackedHeader =
|
|
|
|
bit_cast<UnpackedHeader>(NewPackedHeader);
|
|
|
|
return (NewUnpackedHeader.Checksum ==
|
|
|
|
computeChecksum(Ptr, &NewUnpackedHeader));
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
}
|
|
|
|
|
[scudo] Replace eraseHeader with compareExchangeHeader for Quarantined chunks
Summary:
The reason for the existence of `eraseHeader` was that it was deemed faster
to null-out a chunk header, effectively making it invalid, rather than marking
it as available, which incurred a checksum computation and a cmpxchg.
A previous use of `eraseHeader` was removed with D50655 due to a race.
Now we remove the second use of it in the Quarantine deallocation path and
replace is with a `compareExchangeHeader`.
The reason for this is that greatly helps debugging some heap bugs as the chunk
header is now valid and the chunk marked available, as opposed to the header
being invalid. Eg: we get an invalid state error, instead of an invalid header
error, which reduces the possibilities. The computational penalty is negligible.
Reviewers: alekseyshl, flowerhack, eugenis
Reviewed By: eugenis
Subscribers: delcypher, jfb, #sanitizers, llvm-commits
Differential Revision: https://reviews.llvm.org/D51224
llvm-svn: 340633
2018-08-25 02:21:32 +08:00
|
|
|
// Ensure that ChunkAvailable is 0, so that if a 0 checksum is ever valid
|
|
|
|
// for a fully nulled out header, its state will be available anyway.
|
2017-04-20 23:11:00 +08:00
|
|
|
COMPILER_CHECK(ChunkAvailable == 0);
|
|
|
|
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
// Loads and unpacks the header, verifying the checksum in the process.
|
2017-12-15 05:32:57 +08:00
|
|
|
static INLINE
|
|
|
|
void loadHeader(const void *Ptr, UnpackedHeader *NewUnpackedHeader) {
|
|
|
|
PackedHeader NewPackedHeader =
|
|
|
|
atomic_load_relaxed(getConstAtomicHeader(Ptr));
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
*NewUnpackedHeader = bit_cast<UnpackedHeader>(NewPackedHeader);
|
2017-04-20 23:11:00 +08:00
|
|
|
if (UNLIKELY(NewUnpackedHeader->Checksum !=
|
2018-03-08 00:22:16 +08:00
|
|
|
computeChecksum(Ptr, NewUnpackedHeader)))
|
|
|
|
dieWithMessage("corrupted chunk header at address %p\n", Ptr);
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
// Packs and stores the header, computing the checksum in the process.
|
2017-12-15 05:32:57 +08:00
|
|
|
static INLINE void storeHeader(void *Ptr, UnpackedHeader *NewUnpackedHeader) {
|
|
|
|
NewUnpackedHeader->Checksum = computeChecksum(Ptr, NewUnpackedHeader);
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
PackedHeader NewPackedHeader = bit_cast<PackedHeader>(*NewUnpackedHeader);
|
2017-12-15 05:32:57 +08:00
|
|
|
atomic_store_relaxed(getAtomicHeader(Ptr), NewPackedHeader);
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
// Packs and stores the header, computing the checksum in the process. We
|
|
|
|
// compare the current header with the expected provided one to ensure that
|
|
|
|
// we are not being raced by a corruption occurring in another thread.
|
2017-12-15 05:32:57 +08:00
|
|
|
static INLINE void compareExchangeHeader(void *Ptr,
|
|
|
|
UnpackedHeader *NewUnpackedHeader,
|
|
|
|
UnpackedHeader *OldUnpackedHeader) {
|
|
|
|
NewUnpackedHeader->Checksum = computeChecksum(Ptr, NewUnpackedHeader);
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
PackedHeader NewPackedHeader = bit_cast<PackedHeader>(*NewUnpackedHeader);
|
|
|
|
PackedHeader OldPackedHeader = bit_cast<PackedHeader>(*OldUnpackedHeader);
|
2017-12-15 05:32:57 +08:00
|
|
|
if (UNLIKELY(!atomic_compare_exchange_strong(
|
|
|
|
getAtomicHeader(Ptr), &OldPackedHeader, NewPackedHeader,
|
2018-03-08 00:22:16 +08:00
|
|
|
memory_order_relaxed)))
|
|
|
|
dieWithMessage("race on chunk header at address %p\n", Ptr);
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
}
|
2017-12-15 05:32:57 +08:00
|
|
|
} // namespace Chunk
|
2017-04-20 23:11:00 +08:00
|
|
|
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
struct QuarantineCallback {
|
2018-07-20 23:07:17 +08:00
|
|
|
explicit QuarantineCallback(AllocatorCacheT *Cache)
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
: Cache_(Cache) {}
|
|
|
|
|
2017-05-12 05:40:45 +08:00
|
|
|
// Chunk recycling function, returns a quarantined chunk to the backend,
|
|
|
|
// first making sure it hasn't been tampered with.
|
2017-12-15 05:32:57 +08:00
|
|
|
void Recycle(void *Ptr) {
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
UnpackedHeader Header;
|
2017-12-15 05:32:57 +08:00
|
|
|
Chunk::loadHeader(Ptr, &Header);
|
2018-03-08 00:22:16 +08:00
|
|
|
if (UNLIKELY(Header.State != ChunkQuarantine))
|
|
|
|
dieWithMessage("invalid chunk state when recycling address %p\n", Ptr);
|
[scudo] Replace eraseHeader with compareExchangeHeader for Quarantined chunks
Summary:
The reason for the existence of `eraseHeader` was that it was deemed faster
to null-out a chunk header, effectively making it invalid, rather than marking
it as available, which incurred a checksum computation and a cmpxchg.
A previous use of `eraseHeader` was removed with D50655 due to a race.
Now we remove the second use of it in the Quarantine deallocation path and
replace is with a `compareExchangeHeader`.
The reason for this is that greatly helps debugging some heap bugs as the chunk
header is now valid and the chunk marked available, as opposed to the header
being invalid. Eg: we get an invalid state error, instead of an invalid header
error, which reduces the possibilities. The computational penalty is negligible.
Reviewers: alekseyshl, flowerhack, eugenis
Reviewed By: eugenis
Subscribers: delcypher, jfb, #sanitizers, llvm-commits
Differential Revision: https://reviews.llvm.org/D51224
llvm-svn: 340633
2018-08-25 02:21:32 +08:00
|
|
|
UnpackedHeader NewHeader = Header;
|
|
|
|
NewHeader.State = ChunkAvailable;
|
|
|
|
Chunk::compareExchangeHeader(Ptr, &NewHeader, &Header);
|
2017-12-15 05:32:57 +08:00
|
|
|
void *BackendPtr = Chunk::getBackendPtr(Ptr, &Header);
|
2017-12-06 01:08:29 +08:00
|
|
|
if (Header.ClassId)
|
2018-07-20 23:07:17 +08:00
|
|
|
getBackend().deallocatePrimary(Cache_, BackendPtr, Header.ClassId);
|
2017-07-14 05:01:19 +08:00
|
|
|
else
|
2018-07-20 23:07:17 +08:00
|
|
|
getBackend().deallocateSecondary(BackendPtr);
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
}
|
|
|
|
|
2017-05-12 05:40:45 +08:00
|
|
|
// Internal quarantine allocation and deallocation functions. We first check
|
|
|
|
// that the batches are indeed serviced by the Primary.
|
|
|
|
// TODO(kostyak): figure out the best way to protect the batches.
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
void *Allocate(uptr Size) {
|
2018-05-17 02:12:31 +08:00
|
|
|
const uptr BatchClassId = SizeClassMap::ClassID(sizeof(QuarantineBatch));
|
2018-07-20 23:07:17 +08:00
|
|
|
return getBackend().allocatePrimary(Cache_, BatchClassId);
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
void Deallocate(void *Ptr) {
|
2018-05-17 02:12:31 +08:00
|
|
|
const uptr BatchClassId = SizeClassMap::ClassID(sizeof(QuarantineBatch));
|
2018-07-20 23:07:17 +08:00
|
|
|
getBackend().deallocatePrimary(Cache_, Ptr, BatchClassId);
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
}
|
|
|
|
|
2018-07-20 23:07:17 +08:00
|
|
|
AllocatorCacheT *Cache_;
|
2017-12-06 01:08:29 +08:00
|
|
|
COMPILER_CHECK(sizeof(QuarantineBatch) < SizeClassMap::kMaxSize);
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
};
|
|
|
|
|
2018-07-20 23:07:17 +08:00
|
|
|
typedef Quarantine<QuarantineCallback, void> QuarantineT;
|
|
|
|
typedef QuarantineT::Cache QuarantineCacheT;
|
|
|
|
COMPILER_CHECK(sizeof(QuarantineCacheT) <=
|
2017-09-22 23:35:37 +08:00
|
|
|
sizeof(ScudoTSD::QuarantineCachePlaceHolder));
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
|
2018-07-20 23:07:17 +08:00
|
|
|
QuarantineCacheT *getQuarantineCache(ScudoTSD *TSD) {
|
|
|
|
return reinterpret_cast<QuarantineCacheT *>(TSD->QuarantineCachePlaceHolder);
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
}
|
|
|
|
|
2019-06-18 01:45:34 +08:00
|
|
|
#ifdef GWP_ASAN_HOOKS
|
|
|
|
static gwp_asan::GuardedPoolAllocator GuardedAlloc;
|
|
|
|
#endif // GWP_ASAN_HOOKS
|
|
|
|
|
2018-07-20 23:07:17 +08:00
|
|
|
struct Allocator {
|
2016-10-27 00:16:58 +08:00
|
|
|
static const uptr MaxAllowedMallocSize =
|
|
|
|
FIRST_32_SECOND_64(2UL << 30, 1ULL << 40);
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
|
2018-07-20 23:07:17 +08:00
|
|
|
BackendT Backend;
|
|
|
|
QuarantineT Quarantine;
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
|
[scudo] Quarantine overhaul
Summary:
First, some context.
The main feedback we get about the quarantine is that it's too memory hungry.
A single MB of quarantine will have an impact of 3 to 4MB of PSS/RSS, and
things quickly get out of hand in terms of memory usage, and the quarantine
ends up disabled.
The main objective of the quarantine is to protect from use-after-free
exploitation by making it harder for an attacker to reallocate a controlled
chunk in place of the targeted freed chunk. This is achieved by not making it
available to the backend right away for reuse, but holding it a little while.
Historically, what has usually been the target of such attacks was objects,
where vtable pointers or other function pointers could constitute a valuable
targeti to replace. Those are usually on the smaller side. There is barely any
advantage in putting the quarantine several megabytes of RGB data or the like.
Now for the patch.
This patch introduces a new way the Quarantine behaves in Scudo. First of all,
the size of the Quarantine will be defined in KB instead of MB, then we
introduce a new option: the size up to which (lower than or equal to) a chunk
will be quarantined. This way, we only quarantine smaller chunks, and the size
of the quarantine remains manageable. It also prevents someone from triggering
a recycle by allocating something huge. We default to 512 bytes on 32-bit and
2048 bytes on 64-bit platforms.
In details, the patches includes the following:
- introduce `QuarantineSizeKb`, but honor `QuarantineSizeMb` if set to fall
back to the old behavior (meaning no threshold in that case);
`QuarantineSizeMb` is described as deprecated in the options descriptios;
documentation update will follow;
- introduce `QuarantineChunksUpToSize`, the new threshold value;
- update the `quarantine.cpp` test, and other tests using `QuarantineSizeMb`;
- remove `AllocatorOptions::copyTo`, it wasn't used;
- slightly change the logic around `quarantineOrDeallocateChunk` to accomodate
for the new logic; rename a couple of variables there as well;
Rewriting the tests, I found a somewhat annoying bug where non-default aligned
chunks would account for more than needed when placed in the quarantine due to
`<< MinAlignment` instead of `<< MinAlignmentLog`. This is fixed and tested for
now.
Reviewers: alekseyshl, kcc
Reviewed By: alekseyshl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D35694
llvm-svn: 308884
2017-07-24 23:29:38 +08:00
|
|
|
u32 QuarantineChunksUpToSize;
|
|
|
|
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
bool DeallocationTypeMismatch;
|
|
|
|
bool ZeroContents;
|
|
|
|
bool DeleteSizeMismatch;
|
|
|
|
|
2017-11-16 00:40:27 +08:00
|
|
|
bool CheckRssLimit;
|
|
|
|
uptr HardRssLimitMb;
|
|
|
|
uptr SoftRssLimitMb;
|
|
|
|
atomic_uint8_t RssLimitExceeded;
|
|
|
|
atomic_uint64_t RssLastCheckedAtNS;
|
|
|
|
|
2018-07-20 23:07:17 +08:00
|
|
|
explicit Allocator(LinkerInitialized)
|
|
|
|
: Quarantine(LINKER_INITIALIZED) {}
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
|
2018-06-19 23:36:30 +08:00
|
|
|
NOINLINE void performSanityChecks();
|
2017-12-06 01:08:29 +08:00
|
|
|
|
|
|
|
void init() {
|
|
|
|
SanitizerToolName = "Scudo";
|
2018-04-14 03:21:27 +08:00
|
|
|
PrimaryAllocatorName = "ScudoPrimary";
|
|
|
|
SecondaryAllocatorName = "ScudoSecondary";
|
|
|
|
|
2017-12-06 01:08:29 +08:00
|
|
|
initFlags();
|
|
|
|
|
|
|
|
performSanityChecks();
|
|
|
|
|
2017-11-15 00:14:53 +08:00
|
|
|
// Check if hardware CRC32 is supported in the binary and by the platform,
|
|
|
|
// if so, opt for the CRC32 hardware version of the checksum.
|
2017-11-23 02:30:44 +08:00
|
|
|
if (&computeHardwareCRC32 && hasHardwareCRC32())
|
2017-11-15 00:14:53 +08:00
|
|
|
atomic_store_relaxed(&HashAlgorithm, CRC32Hardware);
|
|
|
|
|
|
|
|
SetAllocatorMayReturnNull(common_flags()->allocator_may_return_null);
|
2018-07-20 23:07:17 +08:00
|
|
|
Backend.init(common_flags()->allocator_release_to_os_interval_ms);
|
2017-11-16 00:40:27 +08:00
|
|
|
HardRssLimitMb = common_flags()->hard_rss_limit_mb;
|
|
|
|
SoftRssLimitMb = common_flags()->soft_rss_limit_mb;
|
2018-07-20 23:07:17 +08:00
|
|
|
Quarantine.Init(
|
2017-11-15 00:14:53 +08:00
|
|
|
static_cast<uptr>(getFlags()->QuarantineSizeKb) << 10,
|
|
|
|
static_cast<uptr>(getFlags()->ThreadLocalQuarantineSizeKb) << 10);
|
[scudo] Fix race condition in deallocation path when Quarantine is bypassed
Summary:
There is a race window in the deallocation path when the Quarantine is bypassed.
Initially we would just erase the header of a chunk if we were not to use the
Quarantine, as opposed to using a compare-exchange primitive, to make things
faster.
It turned out to be a poor decision, as 2 threads (or more) could simultaneously
deallocate the same pointer, and if the checks were to done before the header
got erased, this would result in the pointer being added twice (or more) to
distinct thread caches, and eventually be reused.
Winning the race is not trivial but can happen with enough control over the
allocation primitives. The repro added attempts to trigger the bug, with a
moderate success rate, but it should be enough to notice if the bug ever make
its way back into the code.
Since I am changing things in this file, there are 2 smaller changes tagging
along, marking a variable `const`, and improving the Quarantine bypass test at
runtime.
Reviewers: alekseyshl, eugenis, kcc, vitalybuka
Reviewed By: eugenis, vitalybuka
Subscribers: delcypher, #sanitizers, llvm-commits
Differential Revision: https://reviews.llvm.org/D50655
llvm-svn: 339705
2018-08-15 02:34:52 +08:00
|
|
|
QuarantineChunksUpToSize = (Quarantine.GetCacheSize() == 0) ? 0 :
|
|
|
|
getFlags()->QuarantineChunksUpToSize;
|
2017-11-15 00:14:53 +08:00
|
|
|
DeallocationTypeMismatch = getFlags()->DeallocationTypeMismatch;
|
|
|
|
DeleteSizeMismatch = getFlags()->DeleteSizeMismatch;
|
|
|
|
ZeroContents = getFlags()->ZeroContents;
|
|
|
|
|
2017-12-06 01:08:29 +08:00
|
|
|
if (UNLIKELY(!GetRandom(reinterpret_cast<void *>(&Cookie), sizeof(Cookie),
|
|
|
|
/*blocking=*/false))) {
|
|
|
|
Cookie = static_cast<u32>((NanoTime() >> 12) ^
|
|
|
|
(reinterpret_cast<uptr>(this) >> 4));
|
|
|
|
}
|
2017-11-16 00:40:27 +08:00
|
|
|
|
|
|
|
CheckRssLimit = HardRssLimitMb || SoftRssLimitMb;
|
|
|
|
if (CheckRssLimit)
|
2017-12-14 00:23:54 +08:00
|
|
|
atomic_store_relaxed(&RssLastCheckedAtNS, MonotonicNanoTime());
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
}
|
|
|
|
|
2017-04-20 23:11:00 +08:00
|
|
|
// Helper function that checks for a valid Scudo chunk. nullptr isn't.
|
2017-12-15 05:32:57 +08:00
|
|
|
bool isValidPointer(const void *Ptr) {
|
2017-04-28 04:21:16 +08:00
|
|
|
initThreadMaybe();
|
2017-12-15 05:32:57 +08:00
|
|
|
if (UNLIKELY(!Ptr))
|
[scudo] 32-bit and hardware agnostic support
Summary:
This update introduces i386 support for the Scudo Hardened Allocator, and
offers software alternatives for functions that used to require hardware
specific instruction sets. This should make porting to new architectures
easier.
Among the changes:
- The chunk header has been changed to accomodate the size limitations
encountered on 32-bit architectures. We now fit everything in 64-bit. This
was achieved by storing the amount of unused bytes in an allocation rather
than the size itself, as one can be deduced from the other with the help
of the GetActuallyAllocatedSize function. As it turns out, this header can
be used for both 64 and 32 bit, and as such we dropped the requirement for
the 128-bit compare and exchange instruction support (cmpxchg16b).
- Add 32-bit support for the checksum and the PRNG functions: if the SSE 4.2
instruction set is supported, use the 32-bit CRC32 instruction, and in the
XorShift128, use a 32-bit based state instead of 64-bit.
- Add software support for CRC32: if SSE 4.2 is not supported, fallback on a
software implementation.
- Modify tests that were not 32-bit compliant, and expand them to cover more
allocation and alignment sizes. The random shuffle test has been deactivated
for linux-i386 & linux-i686 as the 32-bit sanitizer allocator doesn't
currently randomize chunks.
Reviewers: alekseyshl, kcc
Subscribers: filcab, llvm-commits, tberghammer, danalbert, srhines, mgorny, modocache
Differential Revision: https://reviews.llvm.org/D26358
llvm-svn: 288255
2016-12-01 01:32:20 +08:00
|
|
|
return false;
|
2017-12-15 05:32:57 +08:00
|
|
|
if (!Chunk::isAligned(Ptr))
|
2017-04-20 23:11:00 +08:00
|
|
|
return false;
|
2017-12-15 05:32:57 +08:00
|
|
|
return Chunk::isValid(Ptr);
|
[scudo] 32-bit and hardware agnostic support
Summary:
This update introduces i386 support for the Scudo Hardened Allocator, and
offers software alternatives for functions that used to require hardware
specific instruction sets. This should make porting to new architectures
easier.
Among the changes:
- The chunk header has been changed to accomodate the size limitations
encountered on 32-bit architectures. We now fit everything in 64-bit. This
was achieved by storing the amount of unused bytes in an allocation rather
than the size itself, as one can be deduced from the other with the help
of the GetActuallyAllocatedSize function. As it turns out, this header can
be used for both 64 and 32 bit, and as such we dropped the requirement for
the 128-bit compare and exchange instruction support (cmpxchg16b).
- Add 32-bit support for the checksum and the PRNG functions: if the SSE 4.2
instruction set is supported, use the 32-bit CRC32 instruction, and in the
XorShift128, use a 32-bit based state instead of 64-bit.
- Add software support for CRC32: if SSE 4.2 is not supported, fallback on a
software implementation.
- Modify tests that were not 32-bit compliant, and expand them to cover more
allocation and alignment sizes. The random shuffle test has been deactivated
for linux-i386 & linux-i686 as the 32-bit sanitizer allocator doesn't
currently randomize chunks.
Reviewers: alekseyshl, kcc
Subscribers: filcab, llvm-commits, tberghammer, danalbert, srhines, mgorny, modocache
Differential Revision: https://reviews.llvm.org/D26358
llvm-svn: 288255
2016-12-01 01:32:20 +08:00
|
|
|
}
|
|
|
|
|
2018-06-19 23:36:30 +08:00
|
|
|
NOINLINE bool isRssLimitExceeded();
|
2017-11-16 00:40:27 +08:00
|
|
|
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
// Allocates a chunk.
|
2017-04-20 23:11:00 +08:00
|
|
|
void *allocate(uptr Size, uptr Alignment, AllocType Type,
|
|
|
|
bool ForceZeroContents = false) {
|
2017-04-28 04:21:16 +08:00
|
|
|
initThreadMaybe();
|
2019-06-18 01:45:34 +08:00
|
|
|
|
|
|
|
#ifdef GWP_ASAN_HOOKS
|
|
|
|
if (UNLIKELY(GuardedAlloc.shouldSample())) {
|
|
|
|
if (void *Ptr = GuardedAlloc.allocate(Size))
|
|
|
|
return Ptr;
|
|
|
|
}
|
|
|
|
#endif // GWP_ASAN_HOOKS
|
|
|
|
|
2018-06-16 00:45:19 +08:00
|
|
|
if (UNLIKELY(Alignment > MaxAlignment)) {
|
|
|
|
if (AllocatorMayReturnNull())
|
|
|
|
return nullptr;
|
|
|
|
reportAllocationAlignmentTooBig(Alignment, MaxAlignment);
|
|
|
|
}
|
2017-06-30 00:45:20 +08:00
|
|
|
if (UNLIKELY(Alignment < MinAlignment))
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
Alignment = MinAlignment;
|
2016-12-14 03:31:54 +08:00
|
|
|
|
2018-06-16 00:45:19 +08:00
|
|
|
const uptr NeededSize = RoundUpTo(Size ? Size : 1, MinAlignment) +
|
2018-02-28 00:14:49 +08:00
|
|
|
Chunk::getHeaderSize();
|
|
|
|
const uptr AlignedSize = (Alignment > MinAlignment) ?
|
|
|
|
NeededSize + (Alignment - Chunk::getHeaderSize()) : NeededSize;
|
2018-06-16 00:45:19 +08:00
|
|
|
if (UNLIKELY(Size >= MaxAllowedMallocSize) ||
|
|
|
|
UNLIKELY(AlignedSize >= MaxAllowedMallocSize)) {
|
|
|
|
if (AllocatorMayReturnNull())
|
|
|
|
return nullptr;
|
|
|
|
reportAllocationSizeTooBig(Size, AlignedSize, MaxAllowedMallocSize);
|
|
|
|
}
|
2016-12-14 03:31:54 +08:00
|
|
|
|
2018-06-16 00:45:19 +08:00
|
|
|
if (CheckRssLimit && UNLIKELY(isRssLimitExceeded())) {
|
|
|
|
if (AllocatorMayReturnNull())
|
|
|
|
return nullptr;
|
|
|
|
reportRssLimitExceeded();
|
|
|
|
}
|
2017-11-16 00:40:27 +08:00
|
|
|
|
2017-05-12 05:40:45 +08:00
|
|
|
// Primary and Secondary backed allocations have a different treatment. We
|
|
|
|
// deal with alignment requirements of Primary serviced allocations here,
|
|
|
|
// but the Secondary will take care of its own alignment needs.
|
2017-12-15 05:32:57 +08:00
|
|
|
void *BackendPtr;
|
|
|
|
uptr BackendSize;
|
2017-12-06 01:08:29 +08:00
|
|
|
u8 ClassId;
|
2018-07-20 23:07:17 +08:00
|
|
|
if (PrimaryT::CanAllocate(AlignedSize, MinAlignment)) {
|
2017-12-15 05:32:57 +08:00
|
|
|
BackendSize = AlignedSize;
|
|
|
|
ClassId = SizeClassMap::ClassID(BackendSize);
|
[scudo] Improve the scalability of the shared TSD model
Summary:
The shared TSD model in its current form doesn't scale. Here is an example of
rpc2-benchmark (with default parameters, which is threading heavy) on a 72-core
machines (defaulting to a `CompactSizeClassMap` and no Quarantine):
- with tcmalloc: 337K reqs/sec, peak RSS of 338MB;
- with scudo (exclusive): 321K reqs/sec, peak RSS of 637MB;
- with scudo (shared): 241K reqs/sec, peak RSS of 324MB.
This isn't great, since the exclusive model uses a lot of memory, while the
shared model doesn't even come close to be competitive.
This is mostly due to the fact that we are consistently scanning the TSD pool
starting at index 0 for an available TSD, which can result in a lot of failed
lock attempts, and touching some memory that needs not be touched.
This CL attempts to make things better in most situations:
- first, use a thread local variable on Linux (intead of pthread APIs) to store
the current TSD in the shared model;
- move the locking boolean out of the TSD: this allows the compiler to use a
register and potentially optimize out a branch instead of reading it from the
TSD everytime (we also save a tiny bit of memory per TSD);
- 64-bit atomic operations on 32-bit ARM platforms happen to be expensive: so
store the `Precedence` in a `uptr` instead of a `u64`. We lose some
nanoseconds of precision and we'll wrap around at some point, but the benefit
is worth it;
- change a `CHECK` to a `DCHECK`: this should never happen, but if something is
ever terribly wrong, we'll crash on a near null AV if the TSD happens to be
null;
- based on an idea by dvyukov@, we are implementing a bound random scan for
an available TSD. This requires computing the coprimes for the number of TSDs,
and attempting to lock up to 4 TSDs in an random order before falling back to
the current one. This is obviously slightly more expansive when we have just
2 TSDs (barely noticeable) but is otherwise beneficial. The `Precedence` still
basically corresponds to the moment of the first contention on a TSD. To seed
on random choice, we use the precedence of the current TSD since it is very
likely to be non-zero (since we are in the slow path after a failed `tryLock`)
With those modifications, the benchmark yields to:
- with scudo (shared): 330K reqs/sec, peak RSS of 327MB.
So the shared model for this specific situation not only becomes competitive but
outperforms the exclusive model. I experimented with some values greater than 4
for the number of TSDs to attempt to lock and it yielded a decrease in QPS. Just
sticking with the current TSD is also a tad slower. Numbers on platforms with
less cores (eg: Android) remain similar.
Reviewers: alekseyshl, dvyukov, javed.absar
Reviewed By: alekseyshl, dvyukov
Subscribers: srhines, kristof.beyls, delcypher, llvm-commits, #sanitizers
Differential Revision: https://reviews.llvm.org/D47289
llvm-svn: 334410
2018-06-11 22:50:31 +08:00
|
|
|
bool UnlockRequired;
|
|
|
|
ScudoTSD *TSD = getTSDAndLock(&UnlockRequired);
|
2018-07-20 23:07:17 +08:00
|
|
|
BackendPtr = Backend.allocatePrimary(&TSD->Cache, ClassId);
|
[scudo] Improve the scalability of the shared TSD model
Summary:
The shared TSD model in its current form doesn't scale. Here is an example of
rpc2-benchmark (with default parameters, which is threading heavy) on a 72-core
machines (defaulting to a `CompactSizeClassMap` and no Quarantine):
- with tcmalloc: 337K reqs/sec, peak RSS of 338MB;
- with scudo (exclusive): 321K reqs/sec, peak RSS of 637MB;
- with scudo (shared): 241K reqs/sec, peak RSS of 324MB.
This isn't great, since the exclusive model uses a lot of memory, while the
shared model doesn't even come close to be competitive.
This is mostly due to the fact that we are consistently scanning the TSD pool
starting at index 0 for an available TSD, which can result in a lot of failed
lock attempts, and touching some memory that needs not be touched.
This CL attempts to make things better in most situations:
- first, use a thread local variable on Linux (intead of pthread APIs) to store
the current TSD in the shared model;
- move the locking boolean out of the TSD: this allows the compiler to use a
register and potentially optimize out a branch instead of reading it from the
TSD everytime (we also save a tiny bit of memory per TSD);
- 64-bit atomic operations on 32-bit ARM platforms happen to be expensive: so
store the `Precedence` in a `uptr` instead of a `u64`. We lose some
nanoseconds of precision and we'll wrap around at some point, but the benefit
is worth it;
- change a `CHECK` to a `DCHECK`: this should never happen, but if something is
ever terribly wrong, we'll crash on a near null AV if the TSD happens to be
null;
- based on an idea by dvyukov@, we are implementing a bound random scan for
an available TSD. This requires computing the coprimes for the number of TSDs,
and attempting to lock up to 4 TSDs in an random order before falling back to
the current one. This is obviously slightly more expansive when we have just
2 TSDs (barely noticeable) but is otherwise beneficial. The `Precedence` still
basically corresponds to the moment of the first contention on a TSD. To seed
on random choice, we use the precedence of the current TSD since it is very
likely to be non-zero (since we are in the slow path after a failed `tryLock`)
With those modifications, the benchmark yields to:
- with scudo (shared): 330K reqs/sec, peak RSS of 327MB.
So the shared model for this specific situation not only becomes competitive but
outperforms the exclusive model. I experimented with some values greater than 4
for the number of TSDs to attempt to lock and it yielded a decrease in QPS. Just
sticking with the current TSD is also a tad slower. Numbers on platforms with
less cores (eg: Android) remain similar.
Reviewers: alekseyshl, dvyukov, javed.absar
Reviewed By: alekseyshl, dvyukov
Subscribers: srhines, kristof.beyls, delcypher, llvm-commits, #sanitizers
Differential Revision: https://reviews.llvm.org/D47289
llvm-svn: 334410
2018-06-11 22:50:31 +08:00
|
|
|
if (UnlockRequired)
|
|
|
|
TSD->unlock();
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
} else {
|
2017-12-15 05:32:57 +08:00
|
|
|
BackendSize = NeededSize;
|
2017-12-06 01:08:29 +08:00
|
|
|
ClassId = 0;
|
2018-07-20 23:07:17 +08:00
|
|
|
BackendPtr = Backend.allocateSecondary(BackendSize, Alignment);
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
}
|
2018-06-16 00:45:19 +08:00
|
|
|
if (UNLIKELY(!BackendPtr)) {
|
|
|
|
SetAllocatorOutOfMemory();
|
|
|
|
if (AllocatorMayReturnNull())
|
|
|
|
return nullptr;
|
|
|
|
reportOutOfMemory(Size);
|
|
|
|
}
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
|
[scudo] 32-bit and hardware agnostic support
Summary:
This update introduces i386 support for the Scudo Hardened Allocator, and
offers software alternatives for functions that used to require hardware
specific instruction sets. This should make porting to new architectures
easier.
Among the changes:
- The chunk header has been changed to accomodate the size limitations
encountered on 32-bit architectures. We now fit everything in 64-bit. This
was achieved by storing the amount of unused bytes in an allocation rather
than the size itself, as one can be deduced from the other with the help
of the GetActuallyAllocatedSize function. As it turns out, this header can
be used for both 64 and 32 bit, and as such we dropped the requirement for
the 128-bit compare and exchange instruction support (cmpxchg16b).
- Add 32-bit support for the checksum and the PRNG functions: if the SSE 4.2
instruction set is supported, use the 32-bit CRC32 instruction, and in the
XorShift128, use a 32-bit based state instead of 64-bit.
- Add software support for CRC32: if SSE 4.2 is not supported, fallback on a
software implementation.
- Modify tests that were not 32-bit compliant, and expand them to cover more
allocation and alignment sizes. The random shuffle test has been deactivated
for linux-i386 & linux-i686 as the 32-bit sanitizer allocator doesn't
currently randomize chunks.
Reviewers: alekseyshl, kcc
Subscribers: filcab, llvm-commits, tberghammer, danalbert, srhines, mgorny, modocache
Differential Revision: https://reviews.llvm.org/D26358
llvm-svn: 288255
2016-12-01 01:32:20 +08:00
|
|
|
// If requested, we will zero out the entire contents of the returned chunk.
|
2017-12-09 00:36:37 +08:00
|
|
|
if ((ForceZeroContents || ZeroContents) && ClassId)
|
2018-07-20 23:07:17 +08:00
|
|
|
memset(BackendPtr, 0, PrimaryT::ClassIdToSize(ClassId));
|
[scudo] 32-bit and hardware agnostic support
Summary:
This update introduces i386 support for the Scudo Hardened Allocator, and
offers software alternatives for functions that used to require hardware
specific instruction sets. This should make porting to new architectures
easier.
Among the changes:
- The chunk header has been changed to accomodate the size limitations
encountered on 32-bit architectures. We now fit everything in 64-bit. This
was achieved by storing the amount of unused bytes in an allocation rather
than the size itself, as one can be deduced from the other with the help
of the GetActuallyAllocatedSize function. As it turns out, this header can
be used for both 64 and 32 bit, and as such we dropped the requirement for
the 128-bit compare and exchange instruction support (cmpxchg16b).
- Add 32-bit support for the checksum and the PRNG functions: if the SSE 4.2
instruction set is supported, use the 32-bit CRC32 instruction, and in the
XorShift128, use a 32-bit based state instead of 64-bit.
- Add software support for CRC32: if SSE 4.2 is not supported, fallback on a
software implementation.
- Modify tests that were not 32-bit compliant, and expand them to cover more
allocation and alignment sizes. The random shuffle test has been deactivated
for linux-i386 & linux-i686 as the 32-bit sanitizer allocator doesn't
currently randomize chunks.
Reviewers: alekseyshl, kcc
Subscribers: filcab, llvm-commits, tberghammer, danalbert, srhines, mgorny, modocache
Differential Revision: https://reviews.llvm.org/D26358
llvm-svn: 288255
2016-12-01 01:32:20 +08:00
|
|
|
|
2017-05-12 05:40:45 +08:00
|
|
|
UnpackedHeader Header = {};
|
2018-02-28 00:14:49 +08:00
|
|
|
uptr UserPtr = reinterpret_cast<uptr>(BackendPtr) + Chunk::getHeaderSize();
|
2017-12-15 05:32:57 +08:00
|
|
|
if (UNLIKELY(!IsAligned(UserPtr, Alignment))) {
|
2017-05-12 05:40:45 +08:00
|
|
|
// Since the Secondary takes care of alignment, a non-aligned pointer
|
|
|
|
// means it is from the Primary. It is also the only case where the offset
|
|
|
|
// field of the header would be non-zero.
|
2017-12-15 05:32:57 +08:00
|
|
|
DCHECK(ClassId);
|
|
|
|
const uptr AlignedUserPtr = RoundUpTo(UserPtr, Alignment);
|
|
|
|
Header.Offset = (AlignedUserPtr - UserPtr) >> MinAlignmentLog;
|
|
|
|
UserPtr = AlignedUserPtr;
|
2017-05-12 05:40:45 +08:00
|
|
|
}
|
2018-03-14 23:50:32 +08:00
|
|
|
DCHECK_LE(UserPtr + Size, reinterpret_cast<uptr>(BackendPtr) + BackendSize);
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
Header.State = ChunkAllocated;
|
|
|
|
Header.AllocType = Type;
|
2017-12-09 00:36:37 +08:00
|
|
|
if (ClassId) {
|
|
|
|
Header.ClassId = ClassId;
|
2017-04-21 02:07:17 +08:00
|
|
|
Header.SizeOrUnusedBytes = Size;
|
|
|
|
} else {
|
|
|
|
// The secondary fits the allocations to a page, so the amount of unused
|
|
|
|
// bytes is the difference between the end of the user allocation and the
|
|
|
|
// next page boundary.
|
2017-12-09 00:36:37 +08:00
|
|
|
const uptr PageSize = GetPageSizeCached();
|
2017-12-15 05:32:57 +08:00
|
|
|
const uptr TrailingBytes = (UserPtr + Size) & (PageSize - 1);
|
2017-04-21 02:07:17 +08:00
|
|
|
if (TrailingBytes)
|
|
|
|
Header.SizeOrUnusedBytes = PageSize - TrailingBytes;
|
|
|
|
}
|
2017-12-15 05:32:57 +08:00
|
|
|
void *Ptr = reinterpret_cast<void *>(UserPtr);
|
|
|
|
Chunk::storeHeader(Ptr, &Header);
|
2018-01-24 07:07:42 +08:00
|
|
|
if (SCUDO_CAN_USE_HOOKS && &__sanitizer_malloc_hook)
|
|
|
|
__sanitizer_malloc_hook(Ptr, Size);
|
2017-12-15 05:32:57 +08:00
|
|
|
return Ptr;
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
}
|
|
|
|
|
[scudo] Quarantine overhaul
Summary:
First, some context.
The main feedback we get about the quarantine is that it's too memory hungry.
A single MB of quarantine will have an impact of 3 to 4MB of PSS/RSS, and
things quickly get out of hand in terms of memory usage, and the quarantine
ends up disabled.
The main objective of the quarantine is to protect from use-after-free
exploitation by making it harder for an attacker to reallocate a controlled
chunk in place of the targeted freed chunk. This is achieved by not making it
available to the backend right away for reuse, but holding it a little while.
Historically, what has usually been the target of such attacks was objects,
where vtable pointers or other function pointers could constitute a valuable
targeti to replace. Those are usually on the smaller side. There is barely any
advantage in putting the quarantine several megabytes of RGB data or the like.
Now for the patch.
This patch introduces a new way the Quarantine behaves in Scudo. First of all,
the size of the Quarantine will be defined in KB instead of MB, then we
introduce a new option: the size up to which (lower than or equal to) a chunk
will be quarantined. This way, we only quarantine smaller chunks, and the size
of the quarantine remains manageable. It also prevents someone from triggering
a recycle by allocating something huge. We default to 512 bytes on 32-bit and
2048 bytes on 64-bit platforms.
In details, the patches includes the following:
- introduce `QuarantineSizeKb`, but honor `QuarantineSizeMb` if set to fall
back to the old behavior (meaning no threshold in that case);
`QuarantineSizeMb` is described as deprecated in the options descriptios;
documentation update will follow;
- introduce `QuarantineChunksUpToSize`, the new threshold value;
- update the `quarantine.cpp` test, and other tests using `QuarantineSizeMb`;
- remove `AllocatorOptions::copyTo`, it wasn't used;
- slightly change the logic around `quarantineOrDeallocateChunk` to accomodate
for the new logic; rename a couple of variables there as well;
Rewriting the tests, I found a somewhat annoying bug where non-default aligned
chunks would account for more than needed when placed in the quarantine due to
`<< MinAlignment` instead of `<< MinAlignmentLog`. This is fixed and tested for
now.
Reviewers: alekseyshl, kcc
Reviewed By: alekseyshl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D35694
llvm-svn: 308884
2017-07-24 23:29:38 +08:00
|
|
|
// Place a chunk in the quarantine or directly deallocate it in the event of
|
|
|
|
// a zero-sized quarantine, or if the size of the chunk is greater than the
|
|
|
|
// quarantine chunk size threshold.
|
2017-12-15 05:32:57 +08:00
|
|
|
void quarantineOrDeallocateChunk(void *Ptr, UnpackedHeader *Header,
|
2017-04-22 02:10:53 +08:00
|
|
|
uptr Size) {
|
[scudo] Fix race condition in deallocation path when Quarantine is bypassed
Summary:
There is a race window in the deallocation path when the Quarantine is bypassed.
Initially we would just erase the header of a chunk if we were not to use the
Quarantine, as opposed to using a compare-exchange primitive, to make things
faster.
It turned out to be a poor decision, as 2 threads (or more) could simultaneously
deallocate the same pointer, and if the checks were to done before the header
got erased, this would result in the pointer being added twice (or more) to
distinct thread caches, and eventually be reused.
Winning the race is not trivial but can happen with enough control over the
allocation primitives. The repro added attempts to trigger the bug, with a
moderate success rate, but it should be enough to notice if the bug ever make
its way back into the code.
Since I am changing things in this file, there are 2 smaller changes tagging
along, marking a variable `const`, and improving the Quarantine bypass test at
runtime.
Reviewers: alekseyshl, eugenis, kcc, vitalybuka
Reviewed By: eugenis, vitalybuka
Subscribers: delcypher, #sanitizers, llvm-commits
Differential Revision: https://reviews.llvm.org/D50655
llvm-svn: 339705
2018-08-15 02:34:52 +08:00
|
|
|
const bool BypassQuarantine = !Size || (Size > QuarantineChunksUpToSize);
|
2017-04-22 02:10:53 +08:00
|
|
|
if (BypassQuarantine) {
|
[scudo] Fix race condition in deallocation path when Quarantine is bypassed
Summary:
There is a race window in the deallocation path when the Quarantine is bypassed.
Initially we would just erase the header of a chunk if we were not to use the
Quarantine, as opposed to using a compare-exchange primitive, to make things
faster.
It turned out to be a poor decision, as 2 threads (or more) could simultaneously
deallocate the same pointer, and if the checks were to done before the header
got erased, this would result in the pointer being added twice (or more) to
distinct thread caches, and eventually be reused.
Winning the race is not trivial but can happen with enough control over the
allocation primitives. The repro added attempts to trigger the bug, with a
moderate success rate, but it should be enough to notice if the bug ever make
its way back into the code.
Since I am changing things in this file, there are 2 smaller changes tagging
along, marking a variable `const`, and improving the Quarantine bypass test at
runtime.
Reviewers: alekseyshl, eugenis, kcc, vitalybuka
Reviewed By: eugenis, vitalybuka
Subscribers: delcypher, #sanitizers, llvm-commits
Differential Revision: https://reviews.llvm.org/D50655
llvm-svn: 339705
2018-08-15 02:34:52 +08:00
|
|
|
UnpackedHeader NewHeader = *Header;
|
|
|
|
NewHeader.State = ChunkAvailable;
|
|
|
|
Chunk::compareExchangeHeader(Ptr, &NewHeader, Header);
|
2017-12-15 05:32:57 +08:00
|
|
|
void *BackendPtr = Chunk::getBackendPtr(Ptr, Header);
|
2017-12-06 01:08:29 +08:00
|
|
|
if (Header->ClassId) {
|
[scudo] Improve the scalability of the shared TSD model
Summary:
The shared TSD model in its current form doesn't scale. Here is an example of
rpc2-benchmark (with default parameters, which is threading heavy) on a 72-core
machines (defaulting to a `CompactSizeClassMap` and no Quarantine):
- with tcmalloc: 337K reqs/sec, peak RSS of 338MB;
- with scudo (exclusive): 321K reqs/sec, peak RSS of 637MB;
- with scudo (shared): 241K reqs/sec, peak RSS of 324MB.
This isn't great, since the exclusive model uses a lot of memory, while the
shared model doesn't even come close to be competitive.
This is mostly due to the fact that we are consistently scanning the TSD pool
starting at index 0 for an available TSD, which can result in a lot of failed
lock attempts, and touching some memory that needs not be touched.
This CL attempts to make things better in most situations:
- first, use a thread local variable on Linux (intead of pthread APIs) to store
the current TSD in the shared model;
- move the locking boolean out of the TSD: this allows the compiler to use a
register and potentially optimize out a branch instead of reading it from the
TSD everytime (we also save a tiny bit of memory per TSD);
- 64-bit atomic operations on 32-bit ARM platforms happen to be expensive: so
store the `Precedence` in a `uptr` instead of a `u64`. We lose some
nanoseconds of precision and we'll wrap around at some point, but the benefit
is worth it;
- change a `CHECK` to a `DCHECK`: this should never happen, but if something is
ever terribly wrong, we'll crash on a near null AV if the TSD happens to be
null;
- based on an idea by dvyukov@, we are implementing a bound random scan for
an available TSD. This requires computing the coprimes for the number of TSDs,
and attempting to lock up to 4 TSDs in an random order before falling back to
the current one. This is obviously slightly more expansive when we have just
2 TSDs (barely noticeable) but is otherwise beneficial. The `Precedence` still
basically corresponds to the moment of the first contention on a TSD. To seed
on random choice, we use the precedence of the current TSD since it is very
likely to be non-zero (since we are in the slow path after a failed `tryLock`)
With those modifications, the benchmark yields to:
- with scudo (shared): 330K reqs/sec, peak RSS of 327MB.
So the shared model for this specific situation not only becomes competitive but
outperforms the exclusive model. I experimented with some values greater than 4
for the number of TSDs to attempt to lock and it yielded a decrease in QPS. Just
sticking with the current TSD is also a tad slower. Numbers on platforms with
less cores (eg: Android) remain similar.
Reviewers: alekseyshl, dvyukov, javed.absar
Reviewed By: alekseyshl, dvyukov
Subscribers: srhines, kristof.beyls, delcypher, llvm-commits, #sanitizers
Differential Revision: https://reviews.llvm.org/D47289
llvm-svn: 334410
2018-06-11 22:50:31 +08:00
|
|
|
bool UnlockRequired;
|
|
|
|
ScudoTSD *TSD = getTSDAndLock(&UnlockRequired);
|
2018-07-20 23:07:17 +08:00
|
|
|
getBackend().deallocatePrimary(&TSD->Cache, BackendPtr,
|
|
|
|
Header->ClassId);
|
[scudo] Improve the scalability of the shared TSD model
Summary:
The shared TSD model in its current form doesn't scale. Here is an example of
rpc2-benchmark (with default parameters, which is threading heavy) on a 72-core
machines (defaulting to a `CompactSizeClassMap` and no Quarantine):
- with tcmalloc: 337K reqs/sec, peak RSS of 338MB;
- with scudo (exclusive): 321K reqs/sec, peak RSS of 637MB;
- with scudo (shared): 241K reqs/sec, peak RSS of 324MB.
This isn't great, since the exclusive model uses a lot of memory, while the
shared model doesn't even come close to be competitive.
This is mostly due to the fact that we are consistently scanning the TSD pool
starting at index 0 for an available TSD, which can result in a lot of failed
lock attempts, and touching some memory that needs not be touched.
This CL attempts to make things better in most situations:
- first, use a thread local variable on Linux (intead of pthread APIs) to store
the current TSD in the shared model;
- move the locking boolean out of the TSD: this allows the compiler to use a
register and potentially optimize out a branch instead of reading it from the
TSD everytime (we also save a tiny bit of memory per TSD);
- 64-bit atomic operations on 32-bit ARM platforms happen to be expensive: so
store the `Precedence` in a `uptr` instead of a `u64`. We lose some
nanoseconds of precision and we'll wrap around at some point, but the benefit
is worth it;
- change a `CHECK` to a `DCHECK`: this should never happen, but if something is
ever terribly wrong, we'll crash on a near null AV if the TSD happens to be
null;
- based on an idea by dvyukov@, we are implementing a bound random scan for
an available TSD. This requires computing the coprimes for the number of TSDs,
and attempting to lock up to 4 TSDs in an random order before falling back to
the current one. This is obviously slightly more expansive when we have just
2 TSDs (barely noticeable) but is otherwise beneficial. The `Precedence` still
basically corresponds to the moment of the first contention on a TSD. To seed
on random choice, we use the precedence of the current TSD since it is very
likely to be non-zero (since we are in the slow path after a failed `tryLock`)
With those modifications, the benchmark yields to:
- with scudo (shared): 330K reqs/sec, peak RSS of 327MB.
So the shared model for this specific situation not only becomes competitive but
outperforms the exclusive model. I experimented with some values greater than 4
for the number of TSDs to attempt to lock and it yielded a decrease in QPS. Just
sticking with the current TSD is also a tad slower. Numbers on platforms with
less cores (eg: Android) remain similar.
Reviewers: alekseyshl, dvyukov, javed.absar
Reviewed By: alekseyshl, dvyukov
Subscribers: srhines, kristof.beyls, delcypher, llvm-commits, #sanitizers
Differential Revision: https://reviews.llvm.org/D47289
llvm-svn: 334410
2018-06-11 22:50:31 +08:00
|
|
|
if (UnlockRequired)
|
|
|
|
TSD->unlock();
|
2017-04-22 02:10:53 +08:00
|
|
|
} else {
|
2018-07-20 23:07:17 +08:00
|
|
|
getBackend().deallocateSecondary(BackendPtr);
|
2017-04-22 02:10:53 +08:00
|
|
|
}
|
|
|
|
} else {
|
[scudo] Quarantine overhaul
Summary:
First, some context.
The main feedback we get about the quarantine is that it's too memory hungry.
A single MB of quarantine will have an impact of 3 to 4MB of PSS/RSS, and
things quickly get out of hand in terms of memory usage, and the quarantine
ends up disabled.
The main objective of the quarantine is to protect from use-after-free
exploitation by making it harder for an attacker to reallocate a controlled
chunk in place of the targeted freed chunk. This is achieved by not making it
available to the backend right away for reuse, but holding it a little while.
Historically, what has usually been the target of such attacks was objects,
where vtable pointers or other function pointers could constitute a valuable
targeti to replace. Those are usually on the smaller side. There is barely any
advantage in putting the quarantine several megabytes of RGB data or the like.
Now for the patch.
This patch introduces a new way the Quarantine behaves in Scudo. First of all,
the size of the Quarantine will be defined in KB instead of MB, then we
introduce a new option: the size up to which (lower than or equal to) a chunk
will be quarantined. This way, we only quarantine smaller chunks, and the size
of the quarantine remains manageable. It also prevents someone from triggering
a recycle by allocating something huge. We default to 512 bytes on 32-bit and
2048 bytes on 64-bit platforms.
In details, the patches includes the following:
- introduce `QuarantineSizeKb`, but honor `QuarantineSizeMb` if set to fall
back to the old behavior (meaning no threshold in that case);
`QuarantineSizeMb` is described as deprecated in the options descriptios;
documentation update will follow;
- introduce `QuarantineChunksUpToSize`, the new threshold value;
- update the `quarantine.cpp` test, and other tests using `QuarantineSizeMb`;
- remove `AllocatorOptions::copyTo`, it wasn't used;
- slightly change the logic around `quarantineOrDeallocateChunk` to accomodate
for the new logic; rename a couple of variables there as well;
Rewriting the tests, I found a somewhat annoying bug where non-default aligned
chunks would account for more than needed when placed in the quarantine due to
`<< MinAlignment` instead of `<< MinAlignmentLog`. This is fixed and tested for
now.
Reviewers: alekseyshl, kcc
Reviewed By: alekseyshl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D35694
llvm-svn: 308884
2017-07-24 23:29:38 +08:00
|
|
|
// If a small memory amount was allocated with a larger alignment, we want
|
|
|
|
// to take that into account. Otherwise the Quarantine would be filled
|
|
|
|
// with tiny chunks, taking a lot of VA memory. This is an approximation
|
|
|
|
// of the usable size, that allows us to not call
|
|
|
|
// GetActuallyAllocatedSize.
|
2018-03-14 23:50:32 +08:00
|
|
|
const uptr EstimatedSize = Size + (Header->Offset << MinAlignmentLog);
|
2017-04-22 02:10:53 +08:00
|
|
|
UnpackedHeader NewHeader = *Header;
|
|
|
|
NewHeader.State = ChunkQuarantine;
|
2017-12-15 05:32:57 +08:00
|
|
|
Chunk::compareExchangeHeader(Ptr, &NewHeader, Header);
|
[scudo] Improve the scalability of the shared TSD model
Summary:
The shared TSD model in its current form doesn't scale. Here is an example of
rpc2-benchmark (with default parameters, which is threading heavy) on a 72-core
machines (defaulting to a `CompactSizeClassMap` and no Quarantine):
- with tcmalloc: 337K reqs/sec, peak RSS of 338MB;
- with scudo (exclusive): 321K reqs/sec, peak RSS of 637MB;
- with scudo (shared): 241K reqs/sec, peak RSS of 324MB.
This isn't great, since the exclusive model uses a lot of memory, while the
shared model doesn't even come close to be competitive.
This is mostly due to the fact that we are consistently scanning the TSD pool
starting at index 0 for an available TSD, which can result in a lot of failed
lock attempts, and touching some memory that needs not be touched.
This CL attempts to make things better in most situations:
- first, use a thread local variable on Linux (intead of pthread APIs) to store
the current TSD in the shared model;
- move the locking boolean out of the TSD: this allows the compiler to use a
register and potentially optimize out a branch instead of reading it from the
TSD everytime (we also save a tiny bit of memory per TSD);
- 64-bit atomic operations on 32-bit ARM platforms happen to be expensive: so
store the `Precedence` in a `uptr` instead of a `u64`. We lose some
nanoseconds of precision and we'll wrap around at some point, but the benefit
is worth it;
- change a `CHECK` to a `DCHECK`: this should never happen, but if something is
ever terribly wrong, we'll crash on a near null AV if the TSD happens to be
null;
- based on an idea by dvyukov@, we are implementing a bound random scan for
an available TSD. This requires computing the coprimes for the number of TSDs,
and attempting to lock up to 4 TSDs in an random order before falling back to
the current one. This is obviously slightly more expansive when we have just
2 TSDs (barely noticeable) but is otherwise beneficial. The `Precedence` still
basically corresponds to the moment of the first contention on a TSD. To seed
on random choice, we use the precedence of the current TSD since it is very
likely to be non-zero (since we are in the slow path after a failed `tryLock`)
With those modifications, the benchmark yields to:
- with scudo (shared): 330K reqs/sec, peak RSS of 327MB.
So the shared model for this specific situation not only becomes competitive but
outperforms the exclusive model. I experimented with some values greater than 4
for the number of TSDs to attempt to lock and it yielded a decrease in QPS. Just
sticking with the current TSD is also a tad slower. Numbers on platforms with
less cores (eg: Android) remain similar.
Reviewers: alekseyshl, dvyukov, javed.absar
Reviewed By: alekseyshl, dvyukov
Subscribers: srhines, kristof.beyls, delcypher, llvm-commits, #sanitizers
Differential Revision: https://reviews.llvm.org/D47289
llvm-svn: 334410
2018-06-11 22:50:31 +08:00
|
|
|
bool UnlockRequired;
|
|
|
|
ScudoTSD *TSD = getTSDAndLock(&UnlockRequired);
|
2018-07-20 23:07:17 +08:00
|
|
|
Quarantine.Put(getQuarantineCache(TSD), QuarantineCallback(&TSD->Cache),
|
|
|
|
Ptr, EstimatedSize);
|
[scudo] Improve the scalability of the shared TSD model
Summary:
The shared TSD model in its current form doesn't scale. Here is an example of
rpc2-benchmark (with default parameters, which is threading heavy) on a 72-core
machines (defaulting to a `CompactSizeClassMap` and no Quarantine):
- with tcmalloc: 337K reqs/sec, peak RSS of 338MB;
- with scudo (exclusive): 321K reqs/sec, peak RSS of 637MB;
- with scudo (shared): 241K reqs/sec, peak RSS of 324MB.
This isn't great, since the exclusive model uses a lot of memory, while the
shared model doesn't even come close to be competitive.
This is mostly due to the fact that we are consistently scanning the TSD pool
starting at index 0 for an available TSD, which can result in a lot of failed
lock attempts, and touching some memory that needs not be touched.
This CL attempts to make things better in most situations:
- first, use a thread local variable on Linux (intead of pthread APIs) to store
the current TSD in the shared model;
- move the locking boolean out of the TSD: this allows the compiler to use a
register and potentially optimize out a branch instead of reading it from the
TSD everytime (we also save a tiny bit of memory per TSD);
- 64-bit atomic operations on 32-bit ARM platforms happen to be expensive: so
store the `Precedence` in a `uptr` instead of a `u64`. We lose some
nanoseconds of precision and we'll wrap around at some point, but the benefit
is worth it;
- change a `CHECK` to a `DCHECK`: this should never happen, but if something is
ever terribly wrong, we'll crash on a near null AV if the TSD happens to be
null;
- based on an idea by dvyukov@, we are implementing a bound random scan for
an available TSD. This requires computing the coprimes for the number of TSDs,
and attempting to lock up to 4 TSDs in an random order before falling back to
the current one. This is obviously slightly more expansive when we have just
2 TSDs (barely noticeable) but is otherwise beneficial. The `Precedence` still
basically corresponds to the moment of the first contention on a TSD. To seed
on random choice, we use the precedence of the current TSD since it is very
likely to be non-zero (since we are in the slow path after a failed `tryLock`)
With those modifications, the benchmark yields to:
- with scudo (shared): 330K reqs/sec, peak RSS of 327MB.
So the shared model for this specific situation not only becomes competitive but
outperforms the exclusive model. I experimented with some values greater than 4
for the number of TSDs to attempt to lock and it yielded a decrease in QPS. Just
sticking with the current TSD is also a tad slower. Numbers on platforms with
less cores (eg: Android) remain similar.
Reviewers: alekseyshl, dvyukov, javed.absar
Reviewed By: alekseyshl, dvyukov
Subscribers: srhines, kristof.beyls, delcypher, llvm-commits, #sanitizers
Differential Revision: https://reviews.llvm.org/D47289
llvm-svn: 334410
2018-06-11 22:50:31 +08:00
|
|
|
if (UnlockRequired)
|
|
|
|
TSD->unlock();
|
2017-04-22 02:10:53 +08:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2017-12-15 05:32:57 +08:00
|
|
|
// Deallocates a Chunk, which means either adding it to the quarantine or
|
|
|
|
// directly returning it to the backend if criteria are met.
|
2018-06-12 22:42:40 +08:00
|
|
|
void deallocate(void *Ptr, uptr DeleteSize, uptr DeleteAlignment,
|
|
|
|
AllocType Type) {
|
[scudo] Fix improper TSD init after TLS destructors are called
Summary:
Some of glibc's own thread local data is destroyed after a user's thread local
destructors are called, via __libc_thread_freeres. This might involve calling
free, as is the case for strerror_thread_freeres.
If there is no prior heap operation in the thread, this free would end up
initializing some thread specific data that would never be destroyed properly
(as user's pthread destructors have already been called), while still being
deallocated when the TLS goes away. As a result, a program could SEGV, usually
in __sanitizer::AllocatorGlobalStats::Unregister, where one of the doubly linked
list links would refer to a now unmapped memory area.
To prevent this from happening, we will not do a full initialization from the
deallocation path. This means that the fallback cache & quarantine will be used
if no other heap operation has been called, and we effectively prevent the TSD
being initialized and never destroyed. The TSD will be fully initialized for all
other paths.
In the event of a thread doing only frees and nothing else, a TSD would never
be initialized for that thread, but this situation is unlikely and we can live
with that.
Reviewers: alekseyshl
Reviewed By: alekseyshl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D37697
llvm-svn: 312939
2017-09-12 03:59:40 +08:00
|
|
|
// For a deallocation, we only ensure minimal initialization, meaning thread
|
|
|
|
// local data will be left uninitialized for now (when using ELF TLS). The
|
|
|
|
// fallback cache will be used instead. This is a workaround for a situation
|
|
|
|
// where the only heap operation performed in a thread would be a free past
|
|
|
|
// the TLS destructors, ending up in initialized thread specific data never
|
|
|
|
// being destroyed properly. Any other heap operation will do a full init.
|
|
|
|
initThreadMaybe(/*MinimalInit=*/true);
|
2018-01-24 07:07:42 +08:00
|
|
|
if (SCUDO_CAN_USE_HOOKS && &__sanitizer_free_hook)
|
|
|
|
__sanitizer_free_hook(Ptr);
|
2017-12-15 05:32:57 +08:00
|
|
|
if (UNLIKELY(!Ptr))
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
return;
|
2019-06-18 01:45:34 +08:00
|
|
|
|
|
|
|
#ifdef GWP_ASAN_HOOKS
|
|
|
|
if (UNLIKELY(GuardedAlloc.pointerIsMine(Ptr))) {
|
|
|
|
GuardedAlloc.deallocate(Ptr);
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
#endif // GWP_ASAN_HOOKS
|
|
|
|
|
2018-03-08 00:22:16 +08:00
|
|
|
if (UNLIKELY(!Chunk::isAligned(Ptr)))
|
|
|
|
dieWithMessage("misaligned pointer when deallocating address %p\n", Ptr);
|
[scudo] Quarantine overhaul
Summary:
First, some context.
The main feedback we get about the quarantine is that it's too memory hungry.
A single MB of quarantine will have an impact of 3 to 4MB of PSS/RSS, and
things quickly get out of hand in terms of memory usage, and the quarantine
ends up disabled.
The main objective of the quarantine is to protect from use-after-free
exploitation by making it harder for an attacker to reallocate a controlled
chunk in place of the targeted freed chunk. This is achieved by not making it
available to the backend right away for reuse, but holding it a little while.
Historically, what has usually been the target of such attacks was objects,
where vtable pointers or other function pointers could constitute a valuable
targeti to replace. Those are usually on the smaller side. There is barely any
advantage in putting the quarantine several megabytes of RGB data or the like.
Now for the patch.
This patch introduces a new way the Quarantine behaves in Scudo. First of all,
the size of the Quarantine will be defined in KB instead of MB, then we
introduce a new option: the size up to which (lower than or equal to) a chunk
will be quarantined. This way, we only quarantine smaller chunks, and the size
of the quarantine remains manageable. It also prevents someone from triggering
a recycle by allocating something huge. We default to 512 bytes on 32-bit and
2048 bytes on 64-bit platforms.
In details, the patches includes the following:
- introduce `QuarantineSizeKb`, but honor `QuarantineSizeMb` if set to fall
back to the old behavior (meaning no threshold in that case);
`QuarantineSizeMb` is described as deprecated in the options descriptios;
documentation update will follow;
- introduce `QuarantineChunksUpToSize`, the new threshold value;
- update the `quarantine.cpp` test, and other tests using `QuarantineSizeMb`;
- remove `AllocatorOptions::copyTo`, it wasn't used;
- slightly change the logic around `quarantineOrDeallocateChunk` to accomodate
for the new logic; rename a couple of variables there as well;
Rewriting the tests, I found a somewhat annoying bug where non-default aligned
chunks would account for more than needed when placed in the quarantine due to
`<< MinAlignment` instead of `<< MinAlignmentLog`. This is fixed and tested for
now.
Reviewers: alekseyshl, kcc
Reviewed By: alekseyshl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D35694
llvm-svn: 308884
2017-07-24 23:29:38 +08:00
|
|
|
UnpackedHeader Header;
|
2017-12-15 05:32:57 +08:00
|
|
|
Chunk::loadHeader(Ptr, &Header);
|
2018-03-08 00:22:16 +08:00
|
|
|
if (UNLIKELY(Header.State != ChunkAllocated))
|
|
|
|
dieWithMessage("invalid chunk state when deallocating address %p\n", Ptr);
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
if (DeallocationTypeMismatch) {
|
|
|
|
// The deallocation type has to match the allocation one.
|
[scudo] Quarantine overhaul
Summary:
First, some context.
The main feedback we get about the quarantine is that it's too memory hungry.
A single MB of quarantine will have an impact of 3 to 4MB of PSS/RSS, and
things quickly get out of hand in terms of memory usage, and the quarantine
ends up disabled.
The main objective of the quarantine is to protect from use-after-free
exploitation by making it harder for an attacker to reallocate a controlled
chunk in place of the targeted freed chunk. This is achieved by not making it
available to the backend right away for reuse, but holding it a little while.
Historically, what has usually been the target of such attacks was objects,
where vtable pointers or other function pointers could constitute a valuable
targeti to replace. Those are usually on the smaller side. There is barely any
advantage in putting the quarantine several megabytes of RGB data or the like.
Now for the patch.
This patch introduces a new way the Quarantine behaves in Scudo. First of all,
the size of the Quarantine will be defined in KB instead of MB, then we
introduce a new option: the size up to which (lower than or equal to) a chunk
will be quarantined. This way, we only quarantine smaller chunks, and the size
of the quarantine remains manageable. It also prevents someone from triggering
a recycle by allocating something huge. We default to 512 bytes on 32-bit and
2048 bytes on 64-bit platforms.
In details, the patches includes the following:
- introduce `QuarantineSizeKb`, but honor `QuarantineSizeMb` if set to fall
back to the old behavior (meaning no threshold in that case);
`QuarantineSizeMb` is described as deprecated in the options descriptios;
documentation update will follow;
- introduce `QuarantineChunksUpToSize`, the new threshold value;
- update the `quarantine.cpp` test, and other tests using `QuarantineSizeMb`;
- remove `AllocatorOptions::copyTo`, it wasn't used;
- slightly change the logic around `quarantineOrDeallocateChunk` to accomodate
for the new logic; rename a couple of variables there as well;
Rewriting the tests, I found a somewhat annoying bug where non-default aligned
chunks would account for more than needed when placed in the quarantine due to
`<< MinAlignment` instead of `<< MinAlignmentLog`. This is fixed and tested for
now.
Reviewers: alekseyshl, kcc
Reviewed By: alekseyshl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D35694
llvm-svn: 308884
2017-07-24 23:29:38 +08:00
|
|
|
if (Header.AllocType != Type) {
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
// With the exception of memalign'd Chunks, that can be still be free'd.
|
2018-03-08 00:22:16 +08:00
|
|
|
if (Header.AllocType != FromMemalign || Type != FromMalloc)
|
|
|
|
dieWithMessage("allocation type mismatch when deallocating address "
|
|
|
|
"%p\n", Ptr);
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
}
|
|
|
|
}
|
2018-03-14 23:50:32 +08:00
|
|
|
const uptr Size = Chunk::getSize(Ptr, &Header);
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
if (DeleteSizeMismatch) {
|
2018-03-08 00:22:16 +08:00
|
|
|
if (DeleteSize && DeleteSize != Size)
|
|
|
|
dieWithMessage("invalid sized delete when deallocating address %p\n",
|
2017-12-15 05:32:57 +08:00
|
|
|
Ptr);
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
}
|
2018-06-12 22:42:40 +08:00
|
|
|
(void)DeleteAlignment; // TODO(kostyak): verify that the alignment matches.
|
2017-12-15 05:32:57 +08:00
|
|
|
quarantineOrDeallocateChunk(Ptr, &Header, Size);
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
// Reallocates a chunk. We can save on a new allocation if the new requested
|
|
|
|
// size still fits in the chunk.
|
|
|
|
void *reallocate(void *OldPtr, uptr NewSize) {
|
2017-04-28 04:21:16 +08:00
|
|
|
initThreadMaybe();
|
2019-06-18 01:45:34 +08:00
|
|
|
|
|
|
|
#ifdef GWP_ASAN_HOOKS
|
|
|
|
if (UNLIKELY(GuardedAlloc.pointerIsMine(OldPtr))) {
|
|
|
|
size_t OldSize = GuardedAlloc.getSize(OldPtr);
|
|
|
|
void *NewPtr = allocate(NewSize, MinAlignment, FromMalloc);
|
|
|
|
if (NewPtr)
|
|
|
|
memcpy(NewPtr, OldPtr, (NewSize < OldSize) ? NewSize : OldSize);
|
|
|
|
GuardedAlloc.deallocate(OldPtr);
|
|
|
|
return NewPtr;
|
|
|
|
}
|
|
|
|
#endif // GWP_ASAN_HOOKS
|
|
|
|
|
2018-03-08 00:22:16 +08:00
|
|
|
if (UNLIKELY(!Chunk::isAligned(OldPtr)))
|
|
|
|
dieWithMessage("misaligned address when reallocating address %p\n",
|
|
|
|
OldPtr);
|
[scudo] 32-bit and hardware agnostic support
Summary:
This update introduces i386 support for the Scudo Hardened Allocator, and
offers software alternatives for functions that used to require hardware
specific instruction sets. This should make porting to new architectures
easier.
Among the changes:
- The chunk header has been changed to accomodate the size limitations
encountered on 32-bit architectures. We now fit everything in 64-bit. This
was achieved by storing the amount of unused bytes in an allocation rather
than the size itself, as one can be deduced from the other with the help
of the GetActuallyAllocatedSize function. As it turns out, this header can
be used for both 64 and 32 bit, and as such we dropped the requirement for
the 128-bit compare and exchange instruction support (cmpxchg16b).
- Add 32-bit support for the checksum and the PRNG functions: if the SSE 4.2
instruction set is supported, use the 32-bit CRC32 instruction, and in the
XorShift128, use a 32-bit based state instead of 64-bit.
- Add software support for CRC32: if SSE 4.2 is not supported, fallback on a
software implementation.
- Modify tests that were not 32-bit compliant, and expand them to cover more
allocation and alignment sizes. The random shuffle test has been deactivated
for linux-i386 & linux-i686 as the 32-bit sanitizer allocator doesn't
currently randomize chunks.
Reviewers: alekseyshl, kcc
Subscribers: filcab, llvm-commits, tberghammer, danalbert, srhines, mgorny, modocache
Differential Revision: https://reviews.llvm.org/D26358
llvm-svn: 288255
2016-12-01 01:32:20 +08:00
|
|
|
UnpackedHeader OldHeader;
|
2017-12-15 05:32:57 +08:00
|
|
|
Chunk::loadHeader(OldPtr, &OldHeader);
|
2018-03-08 00:22:16 +08:00
|
|
|
if (UNLIKELY(OldHeader.State != ChunkAllocated))
|
|
|
|
dieWithMessage("invalid chunk state when reallocating address %p\n",
|
|
|
|
OldPtr);
|
2017-08-17 00:40:48 +08:00
|
|
|
if (DeallocationTypeMismatch) {
|
2018-03-08 00:22:16 +08:00
|
|
|
if (UNLIKELY(OldHeader.AllocType != FromMalloc))
|
|
|
|
dieWithMessage("allocation type mismatch when reallocating address "
|
|
|
|
"%p\n", OldPtr);
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
}
|
2017-12-15 05:32:57 +08:00
|
|
|
const uptr UsableSize = Chunk::getUsableSize(OldPtr, &OldHeader);
|
2017-04-21 02:07:17 +08:00
|
|
|
// The new size still fits in the current chunk, and the size difference
|
|
|
|
// is reasonable.
|
|
|
|
if (NewSize <= UsableSize &&
|
|
|
|
(UsableSize - NewSize) < (SizeClassMap::kMaxSize / 2)) {
|
2017-04-22 02:10:53 +08:00
|
|
|
UnpackedHeader NewHeader = OldHeader;
|
2017-04-21 02:07:17 +08:00
|
|
|
NewHeader.SizeOrUnusedBytes =
|
2017-12-06 01:08:29 +08:00
|
|
|
OldHeader.ClassId ? NewSize : UsableSize - NewSize;
|
2017-12-15 05:32:57 +08:00
|
|
|
Chunk::compareExchangeHeader(OldPtr, &NewHeader, &OldHeader);
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
return OldPtr;
|
|
|
|
}
|
|
|
|
// Otherwise, we have to allocate a new chunk and copy the contents of the
|
|
|
|
// old one.
|
|
|
|
void *NewPtr = allocate(NewSize, MinAlignment, FromMalloc);
|
|
|
|
if (NewPtr) {
|
2018-03-14 23:50:32 +08:00
|
|
|
const uptr OldSize = OldHeader.ClassId ? OldHeader.SizeOrUnusedBytes :
|
2017-04-21 02:07:17 +08:00
|
|
|
UsableSize - OldHeader.SizeOrUnusedBytes;
|
2017-08-17 00:40:48 +08:00
|
|
|
memcpy(NewPtr, OldPtr, Min(NewSize, UsableSize));
|
2017-12-15 05:32:57 +08:00
|
|
|
quarantineOrDeallocateChunk(OldPtr, &OldHeader, OldSize);
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
}
|
|
|
|
return NewPtr;
|
|
|
|
}
|
|
|
|
|
[scudo] 32-bit and hardware agnostic support
Summary:
This update introduces i386 support for the Scudo Hardened Allocator, and
offers software alternatives for functions that used to require hardware
specific instruction sets. This should make porting to new architectures
easier.
Among the changes:
- The chunk header has been changed to accomodate the size limitations
encountered on 32-bit architectures. We now fit everything in 64-bit. This
was achieved by storing the amount of unused bytes in an allocation rather
than the size itself, as one can be deduced from the other with the help
of the GetActuallyAllocatedSize function. As it turns out, this header can
be used for both 64 and 32 bit, and as such we dropped the requirement for
the 128-bit compare and exchange instruction support (cmpxchg16b).
- Add 32-bit support for the checksum and the PRNG functions: if the SSE 4.2
instruction set is supported, use the 32-bit CRC32 instruction, and in the
XorShift128, use a 32-bit based state instead of 64-bit.
- Add software support for CRC32: if SSE 4.2 is not supported, fallback on a
software implementation.
- Modify tests that were not 32-bit compliant, and expand them to cover more
allocation and alignment sizes. The random shuffle test has been deactivated
for linux-i386 & linux-i686 as the 32-bit sanitizer allocator doesn't
currently randomize chunks.
Reviewers: alekseyshl, kcc
Subscribers: filcab, llvm-commits, tberghammer, danalbert, srhines, mgorny, modocache
Differential Revision: https://reviews.llvm.org/D26358
llvm-svn: 288255
2016-12-01 01:32:20 +08:00
|
|
|
// Helper function that returns the actual usable size of a chunk.
|
|
|
|
uptr getUsableSize(const void *Ptr) {
|
2017-04-28 04:21:16 +08:00
|
|
|
initThreadMaybe();
|
2017-06-30 00:45:20 +08:00
|
|
|
if (UNLIKELY(!Ptr))
|
[scudo] 32-bit and hardware agnostic support
Summary:
This update introduces i386 support for the Scudo Hardened Allocator, and
offers software alternatives for functions that used to require hardware
specific instruction sets. This should make porting to new architectures
easier.
Among the changes:
- The chunk header has been changed to accomodate the size limitations
encountered on 32-bit architectures. We now fit everything in 64-bit. This
was achieved by storing the amount of unused bytes in an allocation rather
than the size itself, as one can be deduced from the other with the help
of the GetActuallyAllocatedSize function. As it turns out, this header can
be used for both 64 and 32 bit, and as such we dropped the requirement for
the 128-bit compare and exchange instruction support (cmpxchg16b).
- Add 32-bit support for the checksum and the PRNG functions: if the SSE 4.2
instruction set is supported, use the 32-bit CRC32 instruction, and in the
XorShift128, use a 32-bit based state instead of 64-bit.
- Add software support for CRC32: if SSE 4.2 is not supported, fallback on a
software implementation.
- Modify tests that were not 32-bit compliant, and expand them to cover more
allocation and alignment sizes. The random shuffle test has been deactivated
for linux-i386 & linux-i686 as the 32-bit sanitizer allocator doesn't
currently randomize chunks.
Reviewers: alekseyshl, kcc
Subscribers: filcab, llvm-commits, tberghammer, danalbert, srhines, mgorny, modocache
Differential Revision: https://reviews.llvm.org/D26358
llvm-svn: 288255
2016-12-01 01:32:20 +08:00
|
|
|
return 0;
|
2019-06-18 01:45:34 +08:00
|
|
|
|
|
|
|
#ifdef GWP_ASAN_HOOKS
|
|
|
|
if (UNLIKELY(GuardedAlloc.pointerIsMine(Ptr)))
|
|
|
|
return GuardedAlloc.getSize(Ptr);
|
|
|
|
#endif // GWP_ASAN_HOOKS
|
|
|
|
|
[scudo] 32-bit and hardware agnostic support
Summary:
This update introduces i386 support for the Scudo Hardened Allocator, and
offers software alternatives for functions that used to require hardware
specific instruction sets. This should make porting to new architectures
easier.
Among the changes:
- The chunk header has been changed to accomodate the size limitations
encountered on 32-bit architectures. We now fit everything in 64-bit. This
was achieved by storing the amount of unused bytes in an allocation rather
than the size itself, as one can be deduced from the other with the help
of the GetActuallyAllocatedSize function. As it turns out, this header can
be used for both 64 and 32 bit, and as such we dropped the requirement for
the 128-bit compare and exchange instruction support (cmpxchg16b).
- Add 32-bit support for the checksum and the PRNG functions: if the SSE 4.2
instruction set is supported, use the 32-bit CRC32 instruction, and in the
XorShift128, use a 32-bit based state instead of 64-bit.
- Add software support for CRC32: if SSE 4.2 is not supported, fallback on a
software implementation.
- Modify tests that were not 32-bit compliant, and expand them to cover more
allocation and alignment sizes. The random shuffle test has been deactivated
for linux-i386 & linux-i686 as the 32-bit sanitizer allocator doesn't
currently randomize chunks.
Reviewers: alekseyshl, kcc
Subscribers: filcab, llvm-commits, tberghammer, danalbert, srhines, mgorny, modocache
Differential Revision: https://reviews.llvm.org/D26358
llvm-svn: 288255
2016-12-01 01:32:20 +08:00
|
|
|
UnpackedHeader Header;
|
2017-12-15 05:32:57 +08:00
|
|
|
Chunk::loadHeader(Ptr, &Header);
|
[scudo] 32-bit and hardware agnostic support
Summary:
This update introduces i386 support for the Scudo Hardened Allocator, and
offers software alternatives for functions that used to require hardware
specific instruction sets. This should make porting to new architectures
easier.
Among the changes:
- The chunk header has been changed to accomodate the size limitations
encountered on 32-bit architectures. We now fit everything in 64-bit. This
was achieved by storing the amount of unused bytes in an allocation rather
than the size itself, as one can be deduced from the other with the help
of the GetActuallyAllocatedSize function. As it turns out, this header can
be used for both 64 and 32 bit, and as such we dropped the requirement for
the 128-bit compare and exchange instruction support (cmpxchg16b).
- Add 32-bit support for the checksum and the PRNG functions: if the SSE 4.2
instruction set is supported, use the 32-bit CRC32 instruction, and in the
XorShift128, use a 32-bit based state instead of 64-bit.
- Add software support for CRC32: if SSE 4.2 is not supported, fallback on a
software implementation.
- Modify tests that were not 32-bit compliant, and expand them to cover more
allocation and alignment sizes. The random shuffle test has been deactivated
for linux-i386 & linux-i686 as the 32-bit sanitizer allocator doesn't
currently randomize chunks.
Reviewers: alekseyshl, kcc
Subscribers: filcab, llvm-commits, tberghammer, danalbert, srhines, mgorny, modocache
Differential Revision: https://reviews.llvm.org/D26358
llvm-svn: 288255
2016-12-01 01:32:20 +08:00
|
|
|
// Getting the usable size of a chunk only makes sense if it's allocated.
|
2018-03-08 00:22:16 +08:00
|
|
|
if (UNLIKELY(Header.State != ChunkAllocated))
|
|
|
|
dieWithMessage("invalid chunk state when sizing address %p\n", Ptr);
|
2017-12-15 05:32:57 +08:00
|
|
|
return Chunk::getUsableSize(Ptr, &Header);
|
[scudo] 32-bit and hardware agnostic support
Summary:
This update introduces i386 support for the Scudo Hardened Allocator, and
offers software alternatives for functions that used to require hardware
specific instruction sets. This should make porting to new architectures
easier.
Among the changes:
- The chunk header has been changed to accomodate the size limitations
encountered on 32-bit architectures. We now fit everything in 64-bit. This
was achieved by storing the amount of unused bytes in an allocation rather
than the size itself, as one can be deduced from the other with the help
of the GetActuallyAllocatedSize function. As it turns out, this header can
be used for both 64 and 32 bit, and as such we dropped the requirement for
the 128-bit compare and exchange instruction support (cmpxchg16b).
- Add 32-bit support for the checksum and the PRNG functions: if the SSE 4.2
instruction set is supported, use the 32-bit CRC32 instruction, and in the
XorShift128, use a 32-bit based state instead of 64-bit.
- Add software support for CRC32: if SSE 4.2 is not supported, fallback on a
software implementation.
- Modify tests that were not 32-bit compliant, and expand them to cover more
allocation and alignment sizes. The random shuffle test has been deactivated
for linux-i386 & linux-i686 as the 32-bit sanitizer allocator doesn't
currently randomize chunks.
Reviewers: alekseyshl, kcc
Subscribers: filcab, llvm-commits, tberghammer, danalbert, srhines, mgorny, modocache
Differential Revision: https://reviews.llvm.org/D26358
llvm-svn: 288255
2016-12-01 01:32:20 +08:00
|
|
|
}
|
|
|
|
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
void *calloc(uptr NMemB, uptr Size) {
|
2017-04-28 04:21:16 +08:00
|
|
|
initThreadMaybe();
|
2018-06-16 00:45:19 +08:00
|
|
|
if (UNLIKELY(CheckForCallocOverflow(NMemB, Size))) {
|
|
|
|
if (AllocatorMayReturnNull())
|
|
|
|
return nullptr;
|
|
|
|
reportCallocOverflow(NMemB, Size);
|
|
|
|
}
|
2017-06-30 00:45:20 +08:00
|
|
|
return allocate(NMemB * Size, MinAlignment, FromMalloc, true);
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
}
|
|
|
|
|
2017-09-22 23:35:37 +08:00
|
|
|
void commitBack(ScudoTSD *TSD) {
|
2018-07-20 23:07:17 +08:00
|
|
|
Quarantine.Drain(getQuarantineCache(TSD), QuarantineCallback(&TSD->Cache));
|
|
|
|
Backend.destroyCache(&TSD->Cache);
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
}
|
2017-02-04 04:49:42 +08:00
|
|
|
|
|
|
|
uptr getStats(AllocatorStat StatType) {
|
2017-04-28 04:21:16 +08:00
|
|
|
initThreadMaybe();
|
2017-02-04 04:49:42 +08:00
|
|
|
uptr stats[AllocatorStatCount];
|
2018-07-20 23:07:17 +08:00
|
|
|
Backend.getStats(stats);
|
2017-02-04 04:49:42 +08:00
|
|
|
return stats[StatType];
|
|
|
|
}
|
2017-09-15 04:34:32 +08:00
|
|
|
|
2018-06-16 00:45:19 +08:00
|
|
|
bool canReturnNull() {
|
2017-09-15 04:34:32 +08:00
|
|
|
initThreadMaybe();
|
2018-06-16 00:45:19 +08:00
|
|
|
return AllocatorMayReturnNull();
|
2017-09-15 04:34:32 +08:00
|
|
|
}
|
2017-12-14 04:41:35 +08:00
|
|
|
|
|
|
|
void setRssLimit(uptr LimitMb, bool HardLimit) {
|
|
|
|
if (HardLimit)
|
|
|
|
HardRssLimitMb = LimitMb;
|
|
|
|
else
|
|
|
|
SoftRssLimitMb = LimitMb;
|
|
|
|
CheckRssLimit = HardRssLimitMb || SoftRssLimitMb;
|
|
|
|
}
|
2018-04-26 02:52:29 +08:00
|
|
|
|
|
|
|
void printStats() {
|
|
|
|
initThreadMaybe();
|
2018-07-20 23:07:17 +08:00
|
|
|
Backend.printStats();
|
2018-04-26 02:52:29 +08:00
|
|
|
}
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
};
|
|
|
|
|
2018-07-20 23:07:17 +08:00
|
|
|
NOINLINE void Allocator::performSanityChecks() {
|
2018-06-19 23:36:30 +08:00
|
|
|
// Verify that the header offset field can hold the maximum offset. In the
|
|
|
|
// case of the Secondary allocator, it takes care of alignment and the
|
|
|
|
// offset will always be 0. In the case of the Primary, the worst case
|
|
|
|
// scenario happens in the last size class, when the backend allocation
|
|
|
|
// would already be aligned on the requested alignment, which would happen
|
|
|
|
// to be the maximum alignment that would fit in that size class. As a
|
|
|
|
// result, the maximum offset will be at most the maximum alignment for the
|
|
|
|
// last size class minus the header size, in multiples of MinAlignment.
|
|
|
|
UnpackedHeader Header = {};
|
|
|
|
const uptr MaxPrimaryAlignment =
|
|
|
|
1 << MostSignificantSetBitIndex(SizeClassMap::kMaxSize - MinAlignment);
|
|
|
|
const uptr MaxOffset =
|
|
|
|
(MaxPrimaryAlignment - Chunk::getHeaderSize()) >> MinAlignmentLog;
|
|
|
|
Header.Offset = MaxOffset;
|
|
|
|
if (Header.Offset != MaxOffset)
|
|
|
|
dieWithMessage("maximum possible offset doesn't fit in header\n");
|
|
|
|
// Verify that we can fit the maximum size or amount of unused bytes in the
|
|
|
|
// header. Given that the Secondary fits the allocation to a page, the worst
|
|
|
|
// case scenario happens in the Primary. It will depend on the second to
|
|
|
|
// last and last class sizes, as well as the dynamic base for the Primary.
|
|
|
|
// The following is an over-approximation that works for our needs.
|
|
|
|
const uptr MaxSizeOrUnusedBytes = SizeClassMap::kMaxSize - 1;
|
|
|
|
Header.SizeOrUnusedBytes = MaxSizeOrUnusedBytes;
|
|
|
|
if (Header.SizeOrUnusedBytes != MaxSizeOrUnusedBytes)
|
|
|
|
dieWithMessage("maximum possible unused bytes doesn't fit in header\n");
|
|
|
|
|
|
|
|
const uptr LargestClassId = SizeClassMap::kLargestClassID;
|
|
|
|
Header.ClassId = LargestClassId;
|
|
|
|
if (Header.ClassId != LargestClassId)
|
|
|
|
dieWithMessage("largest class ID doesn't fit in header\n");
|
|
|
|
}
|
|
|
|
|
|
|
|
// Opportunistic RSS limit check. This will update the RSS limit status, if
|
2019-01-24 23:56:54 +08:00
|
|
|
// it can, every 250ms, otherwise it will just return the current one.
|
2018-07-20 23:07:17 +08:00
|
|
|
NOINLINE bool Allocator::isRssLimitExceeded() {
|
2018-06-19 23:36:30 +08:00
|
|
|
u64 LastCheck = atomic_load_relaxed(&RssLastCheckedAtNS);
|
|
|
|
const u64 CurrentCheck = MonotonicNanoTime();
|
2019-01-24 23:56:54 +08:00
|
|
|
if (LIKELY(CurrentCheck < LastCheck + (250ULL * 1000000ULL)))
|
2018-06-19 23:36:30 +08:00
|
|
|
return atomic_load_relaxed(&RssLimitExceeded);
|
|
|
|
if (!atomic_compare_exchange_weak(&RssLastCheckedAtNS, &LastCheck,
|
|
|
|
CurrentCheck, memory_order_relaxed))
|
|
|
|
return atomic_load_relaxed(&RssLimitExceeded);
|
|
|
|
// TODO(kostyak): We currently use sanitizer_common's GetRSS which reads the
|
|
|
|
// RSS from /proc/self/statm by default. We might want to
|
|
|
|
// call getrusage directly, even if it's less accurate.
|
|
|
|
const uptr CurrentRssMb = GetRSS() >> 20;
|
|
|
|
if (HardRssLimitMb && UNLIKELY(HardRssLimitMb < CurrentRssMb))
|
|
|
|
dieWithMessage("hard RSS limit exhausted (%zdMb vs %zdMb)\n",
|
|
|
|
HardRssLimitMb, CurrentRssMb);
|
|
|
|
if (SoftRssLimitMb) {
|
|
|
|
if (atomic_load_relaxed(&RssLimitExceeded)) {
|
|
|
|
if (CurrentRssMb <= SoftRssLimitMb)
|
|
|
|
atomic_store_relaxed(&RssLimitExceeded, false);
|
|
|
|
} else {
|
|
|
|
if (CurrentRssMb > SoftRssLimitMb) {
|
|
|
|
atomic_store_relaxed(&RssLimitExceeded, true);
|
|
|
|
Printf("Scudo INFO: soft RSS limit exhausted (%zdMb vs %zdMb)\n",
|
|
|
|
SoftRssLimitMb, CurrentRssMb);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
return atomic_load_relaxed(&RssLimitExceeded);
|
|
|
|
}
|
|
|
|
|
2018-07-20 23:07:17 +08:00
|
|
|
static Allocator Instance(LINKER_INITIALIZED);
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
|
2018-07-20 23:07:17 +08:00
|
|
|
static BackendT &getBackend() {
|
|
|
|
return Instance.Backend;
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
}
|
|
|
|
|
2017-11-15 00:14:53 +08:00
|
|
|
void initScudo() {
|
|
|
|
Instance.init();
|
2019-06-18 01:45:34 +08:00
|
|
|
#ifdef GWP_ASAN_HOOKS
|
|
|
|
gwp_asan::options::initOptions();
|
2019-07-03 04:33:19 +08:00
|
|
|
gwp_asan::options::Options &Opts = gwp_asan::options::getOptions();
|
|
|
|
Opts.Backtrace = gwp_asan::options::getBacktraceFunction();
|
|
|
|
Opts.PrintBacktrace = gwp_asan::options::getPrintBacktraceFunction();
|
|
|
|
GuardedAlloc.init(Opts);
|
2019-06-18 01:45:34 +08:00
|
|
|
#endif // GWP_ASAN_HOOKS
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
}
|
|
|
|
|
[scudo] Improve the scalability of the shared TSD model
Summary:
The shared TSD model in its current form doesn't scale. Here is an example of
rpc2-benchmark (with default parameters, which is threading heavy) on a 72-core
machines (defaulting to a `CompactSizeClassMap` and no Quarantine):
- with tcmalloc: 337K reqs/sec, peak RSS of 338MB;
- with scudo (exclusive): 321K reqs/sec, peak RSS of 637MB;
- with scudo (shared): 241K reqs/sec, peak RSS of 324MB.
This isn't great, since the exclusive model uses a lot of memory, while the
shared model doesn't even come close to be competitive.
This is mostly due to the fact that we are consistently scanning the TSD pool
starting at index 0 for an available TSD, which can result in a lot of failed
lock attempts, and touching some memory that needs not be touched.
This CL attempts to make things better in most situations:
- first, use a thread local variable on Linux (intead of pthread APIs) to store
the current TSD in the shared model;
- move the locking boolean out of the TSD: this allows the compiler to use a
register and potentially optimize out a branch instead of reading it from the
TSD everytime (we also save a tiny bit of memory per TSD);
- 64-bit atomic operations on 32-bit ARM platforms happen to be expensive: so
store the `Precedence` in a `uptr` instead of a `u64`. We lose some
nanoseconds of precision and we'll wrap around at some point, but the benefit
is worth it;
- change a `CHECK` to a `DCHECK`: this should never happen, but if something is
ever terribly wrong, we'll crash on a near null AV if the TSD happens to be
null;
- based on an idea by dvyukov@, we are implementing a bound random scan for
an available TSD. This requires computing the coprimes for the number of TSDs,
and attempting to lock up to 4 TSDs in an random order before falling back to
the current one. This is obviously slightly more expansive when we have just
2 TSDs (barely noticeable) but is otherwise beneficial. The `Precedence` still
basically corresponds to the moment of the first contention on a TSD. To seed
on random choice, we use the precedence of the current TSD since it is very
likely to be non-zero (since we are in the slow path after a failed `tryLock`)
With those modifications, the benchmark yields to:
- with scudo (shared): 330K reqs/sec, peak RSS of 327MB.
So the shared model for this specific situation not only becomes competitive but
outperforms the exclusive model. I experimented with some values greater than 4
for the number of TSDs to attempt to lock and it yielded a decrease in QPS. Just
sticking with the current TSD is also a tad slower. Numbers on platforms with
less cores (eg: Android) remain similar.
Reviewers: alekseyshl, dvyukov, javed.absar
Reviewed By: alekseyshl, dvyukov
Subscribers: srhines, kristof.beyls, delcypher, llvm-commits, #sanitizers
Differential Revision: https://reviews.llvm.org/D47289
llvm-svn: 334410
2018-06-11 22:50:31 +08:00
|
|
|
void ScudoTSD::init() {
|
2018-07-20 23:07:17 +08:00
|
|
|
getBackend().initCache(&Cache);
|
2017-04-28 04:21:16 +08:00
|
|
|
memset(QuarantineCachePlaceHolder, 0, sizeof(QuarantineCachePlaceHolder));
|
|
|
|
}
|
|
|
|
|
2017-09-22 23:35:37 +08:00
|
|
|
void ScudoTSD::commitBack() {
|
2017-04-28 04:21:16 +08:00
|
|
|
Instance.commitBack(this);
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
}
|
|
|
|
|
2018-06-12 22:42:40 +08:00
|
|
|
void *scudoAllocate(uptr Size, uptr Alignment, AllocType Type) {
|
|
|
|
if (Alignment && UNLIKELY(!IsPowerOfTwo(Alignment))) {
|
|
|
|
errno = EINVAL;
|
2018-06-16 00:45:19 +08:00
|
|
|
if (Instance.canReturnNull())
|
|
|
|
return nullptr;
|
|
|
|
reportAllocationAlignmentNotPowerOfTwo(Alignment);
|
2018-06-12 22:42:40 +08:00
|
|
|
}
|
|
|
|
return SetErrnoOnNull(Instance.allocate(Size, Alignment, Type));
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
}
|
|
|
|
|
2018-06-12 22:42:40 +08:00
|
|
|
void scudoDeallocate(void *Ptr, uptr Size, uptr Alignment, AllocType Type) {
|
|
|
|
Instance.deallocate(Ptr, Size, Alignment, Type);
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
void *scudoRealloc(void *Ptr, uptr Size) {
|
|
|
|
if (!Ptr)
|
2017-07-19 03:11:04 +08:00
|
|
|
return SetErrnoOnNull(Instance.allocate(Size, MinAlignment, FromMalloc));
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
if (Size == 0) {
|
2018-06-12 22:42:40 +08:00
|
|
|
Instance.deallocate(Ptr, 0, 0, FromMalloc);
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
return nullptr;
|
|
|
|
}
|
2017-07-19 03:11:04 +08:00
|
|
|
return SetErrnoOnNull(Instance.reallocate(Ptr, Size));
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
void *scudoCalloc(uptr NMemB, uptr Size) {
|
2017-07-19 03:11:04 +08:00
|
|
|
return SetErrnoOnNull(Instance.calloc(NMemB, Size));
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
void *scudoValloc(uptr Size) {
|
2017-07-19 03:11:04 +08:00
|
|
|
return SetErrnoOnNull(
|
|
|
|
Instance.allocate(Size, GetPageSizeCached(), FromMemalign));
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
void *scudoPvalloc(uptr Size) {
|
[scudo] Fix race condition in deallocation path when Quarantine is bypassed
Summary:
There is a race window in the deallocation path when the Quarantine is bypassed.
Initially we would just erase the header of a chunk if we were not to use the
Quarantine, as opposed to using a compare-exchange primitive, to make things
faster.
It turned out to be a poor decision, as 2 threads (or more) could simultaneously
deallocate the same pointer, and if the checks were to done before the header
got erased, this would result in the pointer being added twice (or more) to
distinct thread caches, and eventually be reused.
Winning the race is not trivial but can happen with enough control over the
allocation primitives. The repro added attempts to trigger the bug, with a
moderate success rate, but it should be enough to notice if the bug ever make
its way back into the code.
Since I am changing things in this file, there are 2 smaller changes tagging
along, marking a variable `const`, and improving the Quarantine bypass test at
runtime.
Reviewers: alekseyshl, eugenis, kcc, vitalybuka
Reviewed By: eugenis, vitalybuka
Subscribers: delcypher, #sanitizers, llvm-commits
Differential Revision: https://reviews.llvm.org/D50655
llvm-svn: 339705
2018-08-15 02:34:52 +08:00
|
|
|
const uptr PageSize = GetPageSizeCached();
|
2017-07-26 05:18:02 +08:00
|
|
|
if (UNLIKELY(CheckForPvallocOverflow(Size, PageSize))) {
|
2017-10-12 23:01:09 +08:00
|
|
|
errno = ENOMEM;
|
2018-06-16 00:45:19 +08:00
|
|
|
if (Instance.canReturnNull())
|
|
|
|
return nullptr;
|
|
|
|
reportPvallocOverflow(Size);
|
2017-07-26 05:18:02 +08:00
|
|
|
}
|
2017-07-15 05:17:16 +08:00
|
|
|
// pvalloc(0) should allocate one page.
|
|
|
|
Size = Size ? RoundUpTo(Size, PageSize) : PageSize;
|
2017-07-19 03:11:04 +08:00
|
|
|
return SetErrnoOnNull(Instance.allocate(Size, PageSize, FromMemalign));
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
int scudoPosixMemalign(void **MemPtr, uptr Alignment, uptr Size) {
|
2017-07-19 03:11:04 +08:00
|
|
|
if (UNLIKELY(!CheckPosixMemalignAlignment(Alignment))) {
|
2018-06-16 00:45:19 +08:00
|
|
|
if (!Instance.canReturnNull())
|
|
|
|
reportInvalidPosixMemalignAlignment(Alignment);
|
2017-10-12 23:01:09 +08:00
|
|
|
return EINVAL;
|
2017-06-30 00:45:20 +08:00
|
|
|
}
|
2017-07-15 05:17:16 +08:00
|
|
|
void *Ptr = Instance.allocate(Size, Alignment, FromMemalign);
|
2017-07-19 03:11:04 +08:00
|
|
|
if (UNLIKELY(!Ptr))
|
2017-10-12 23:01:09 +08:00
|
|
|
return ENOMEM;
|
2017-07-15 05:17:16 +08:00
|
|
|
*MemPtr = Ptr;
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
void *scudoAlignedAlloc(uptr Alignment, uptr Size) {
|
2017-07-19 03:11:04 +08:00
|
|
|
if (UNLIKELY(!CheckAlignedAllocAlignmentAndSize(Alignment, Size))) {
|
2017-10-12 23:01:09 +08:00
|
|
|
errno = EINVAL;
|
2018-06-16 00:45:19 +08:00
|
|
|
if (Instance.canReturnNull())
|
|
|
|
return nullptr;
|
|
|
|
reportInvalidAlignedAllocAlignment(Size, Alignment);
|
2017-07-15 05:17:16 +08:00
|
|
|
}
|
2017-07-19 03:11:04 +08:00
|
|
|
return SetErrnoOnNull(Instance.allocate(Size, Alignment, FromMalloc));
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
uptr scudoMallocUsableSize(void *Ptr) {
|
|
|
|
return Instance.getUsableSize(Ptr);
|
|
|
|
}
|
|
|
|
|
[scudo] 32-bit and hardware agnostic support
Summary:
This update introduces i386 support for the Scudo Hardened Allocator, and
offers software alternatives for functions that used to require hardware
specific instruction sets. This should make porting to new architectures
easier.
Among the changes:
- The chunk header has been changed to accomodate the size limitations
encountered on 32-bit architectures. We now fit everything in 64-bit. This
was achieved by storing the amount of unused bytes in an allocation rather
than the size itself, as one can be deduced from the other with the help
of the GetActuallyAllocatedSize function. As it turns out, this header can
be used for both 64 and 32 bit, and as such we dropped the requirement for
the 128-bit compare and exchange instruction support (cmpxchg16b).
- Add 32-bit support for the checksum and the PRNG functions: if the SSE 4.2
instruction set is supported, use the 32-bit CRC32 instruction, and in the
XorShift128, use a 32-bit based state instead of 64-bit.
- Add software support for CRC32: if SSE 4.2 is not supported, fallback on a
software implementation.
- Modify tests that were not 32-bit compliant, and expand them to cover more
allocation and alignment sizes. The random shuffle test has been deactivated
for linux-i386 & linux-i686 as the 32-bit sanitizer allocator doesn't
currently randomize chunks.
Reviewers: alekseyshl, kcc
Subscribers: filcab, llvm-commits, tberghammer, danalbert, srhines, mgorny, modocache
Differential Revision: https://reviews.llvm.org/D26358
llvm-svn: 288255
2016-12-01 01:32:20 +08:00
|
|
|
} // namespace __scudo
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
|
|
|
|
using namespace __scudo;
|
|
|
|
|
|
|
|
// MallocExtension helper functions
|
|
|
|
|
|
|
|
uptr __sanitizer_get_current_allocated_bytes() {
|
2017-02-04 04:49:42 +08:00
|
|
|
return Instance.getStats(AllocatorStatAllocated);
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
uptr __sanitizer_get_heap_size() {
|
2017-02-04 04:49:42 +08:00
|
|
|
return Instance.getStats(AllocatorStatMapped);
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
uptr __sanitizer_get_free_bytes() {
|
|
|
|
return 1;
|
|
|
|
}
|
|
|
|
|
|
|
|
uptr __sanitizer_get_unmapped_bytes() {
|
|
|
|
return 1;
|
|
|
|
}
|
|
|
|
|
2018-01-05 01:05:04 +08:00
|
|
|
uptr __sanitizer_get_estimated_allocated_size(uptr Size) {
|
|
|
|
return Size;
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
}
|
|
|
|
|
[scudo] 32-bit and hardware agnostic support
Summary:
This update introduces i386 support for the Scudo Hardened Allocator, and
offers software alternatives for functions that used to require hardware
specific instruction sets. This should make porting to new architectures
easier.
Among the changes:
- The chunk header has been changed to accomodate the size limitations
encountered on 32-bit architectures. We now fit everything in 64-bit. This
was achieved by storing the amount of unused bytes in an allocation rather
than the size itself, as one can be deduced from the other with the help
of the GetActuallyAllocatedSize function. As it turns out, this header can
be used for both 64 and 32 bit, and as such we dropped the requirement for
the 128-bit compare and exchange instruction support (cmpxchg16b).
- Add 32-bit support for the checksum and the PRNG functions: if the SSE 4.2
instruction set is supported, use the 32-bit CRC32 instruction, and in the
XorShift128, use a 32-bit based state instead of 64-bit.
- Add software support for CRC32: if SSE 4.2 is not supported, fallback on a
software implementation.
- Modify tests that were not 32-bit compliant, and expand them to cover more
allocation and alignment sizes. The random shuffle test has been deactivated
for linux-i386 & linux-i686 as the 32-bit sanitizer allocator doesn't
currently randomize chunks.
Reviewers: alekseyshl, kcc
Subscribers: filcab, llvm-commits, tberghammer, danalbert, srhines, mgorny, modocache
Differential Revision: https://reviews.llvm.org/D26358
llvm-svn: 288255
2016-12-01 01:32:20 +08:00
|
|
|
int __sanitizer_get_ownership(const void *Ptr) {
|
|
|
|
return Instance.isValidPointer(Ptr);
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
}
|
|
|
|
|
[scudo] 32-bit and hardware agnostic support
Summary:
This update introduces i386 support for the Scudo Hardened Allocator, and
offers software alternatives for functions that used to require hardware
specific instruction sets. This should make porting to new architectures
easier.
Among the changes:
- The chunk header has been changed to accomodate the size limitations
encountered on 32-bit architectures. We now fit everything in 64-bit. This
was achieved by storing the amount of unused bytes in an allocation rather
than the size itself, as one can be deduced from the other with the help
of the GetActuallyAllocatedSize function. As it turns out, this header can
be used for both 64 and 32 bit, and as such we dropped the requirement for
the 128-bit compare and exchange instruction support (cmpxchg16b).
- Add 32-bit support for the checksum and the PRNG functions: if the SSE 4.2
instruction set is supported, use the 32-bit CRC32 instruction, and in the
XorShift128, use a 32-bit based state instead of 64-bit.
- Add software support for CRC32: if SSE 4.2 is not supported, fallback on a
software implementation.
- Modify tests that were not 32-bit compliant, and expand them to cover more
allocation and alignment sizes. The random shuffle test has been deactivated
for linux-i386 & linux-i686 as the 32-bit sanitizer allocator doesn't
currently randomize chunks.
Reviewers: alekseyshl, kcc
Subscribers: filcab, llvm-commits, tberghammer, danalbert, srhines, mgorny, modocache
Differential Revision: https://reviews.llvm.org/D26358
llvm-svn: 288255
2016-12-01 01:32:20 +08:00
|
|
|
uptr __sanitizer_get_allocated_size(const void *Ptr) {
|
|
|
|
return Instance.getUsableSize(Ptr);
|
[sanitizer] Initial implementation of a Hardened Allocator
Summary:
This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:
- additional consistency checks on the allocation function parameters and on the heap chunks;
- use of checksum protected chunk header, to detect corruption;
- randomness to the allocator base;
- delayed freelist (quarantine), to mitigate use after free and overall determinism.
Additional mitigations are in the works.
Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
Subscribers: kubabrecka, filcab, llvm-commits
Differential Revision: http://reviews.llvm.org/D20084
llvm-svn: 271968
2016-06-07 09:20:26 +08:00
|
|
|
}
|
2017-12-14 04:41:35 +08:00
|
|
|
|
2018-01-31 01:59:49 +08:00
|
|
|
#if !SANITIZER_SUPPORTS_WEAK_HOOKS
|
|
|
|
SANITIZER_INTERFACE_WEAK_DEF(void, __sanitizer_malloc_hook,
|
|
|
|
void *Ptr, uptr Size) {
|
|
|
|
(void)Ptr;
|
|
|
|
(void)Size;
|
|
|
|
}
|
|
|
|
|
|
|
|
SANITIZER_INTERFACE_WEAK_DEF(void, __sanitizer_free_hook, void *Ptr) {
|
|
|
|
(void)Ptr;
|
|
|
|
}
|
|
|
|
#endif
|
|
|
|
|
2017-12-14 04:41:35 +08:00
|
|
|
// Interface functions
|
|
|
|
|
2018-01-05 01:05:04 +08:00
|
|
|
void __scudo_set_rss_limit(uptr LimitMb, s32 HardLimit) {
|
2017-12-14 04:41:35 +08:00
|
|
|
if (!SCUDO_CAN_USE_PUBLIC_INTERFACE)
|
|
|
|
return;
|
|
|
|
Instance.setRssLimit(LimitMb, !!HardLimit);
|
|
|
|
}
|
2018-04-26 02:52:29 +08:00
|
|
|
|
|
|
|
void __scudo_print_stats() {
|
|
|
|
Instance.printStats();
|
|
|
|
}
|