Summary:
Refactor the way /proc/self/maps entries are annotated to support most
(all?) posix platforms, with a special implementation for Android.
Extend the set of decorated Mmap* calls.
Replace shm_open with internal_open("/dev/shm/%s"). Shm_open is
problematic because it calls libc open() which may be intercepted.
Generic implementation has limits (max number of files under /dev/shm is
64K on my machine), which can be conceivably reached when sanitizing
multiple programs at once. Android implemenation is essentially free, and
enabled by default.
The test in sanitizer_common is copied to hwasan and not reused directly
because hwasan fails way too many common tests at the moment.
Reviewers: pcc, vitalybuka
Subscribers: srhines, kubamracek, jfb, llvm-commits, kcc
Differential Revision: https://reviews.llvm.org/D57720
llvm-svn: 353255
This function initializes enough of the runtime to be able to run
instrumented code in a statically linked executable. It replaces
__hwasan_shadow_init() which wasn't doing enough initialization for
instrumented code that uses either TLS or IFUNC to work.
Differential Revision: https://reviews.llvm.org/D57490
llvm-svn: 352816
Summary:
Release memory pages for thread data (allocator cache, stack allocations
ring buffer, etc) when a thread exits. We can not simply munmap them
because this memory is custom allocated within a limited address range,
and it needs to stay "reserved".
This change alters thread storage layout by putting the ring buffer
before Thread instead of after it. This makes it possible to find the
start of the thread aux allocation given only the Thread pointer.
Reviewers: kcc, pcc
Subscribers: kubamracek, llvm-commits
Differential Revision: https://reviews.llvm.org/D56621
llvm-svn: 352151
Each hwasan check requires emitting a small piece of code like this:
https://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html#memory-accesses
The problem with this is that these code blocks typically bloat code
size significantly.
An obvious solution is to outline these blocks of code. In fact, this
has already been implemented under the -hwasan-instrument-with-calls
flag. However, as currently implemented this has a number of problems:
- The functions use the same calling convention as regular C functions.
This means that the backend must spill all temporary registers as
required by the platform's C calling convention, even though the
check only needs two registers on the hot path.
- The functions take the address to be checked in a fixed register,
which increases register pressure.
Both of these factors can diminish the code size effect and increase
the performance hit of -hwasan-instrument-with-calls.
The solution that this patch implements is to involve the aarch64
backend in outlining the checks. An intrinsic and pseudo-instruction
are created to represent a hwasan check. The pseudo-instruction
is register allocated like any other instruction, and we allow the
register allocator to select almost any register for the address to
check. A particular combination of (register selection, type of check)
triggers the creation in the backend of a function to handle the check
for specifically that pair. The resulting functions are deduplicated by
the linker. The pseudo-instruction (really the function) is specified
to preserve all registers except for the registers that the AAPCS
specifies may be clobbered by a call.
To measure the code size and performance effect of this change, I
took a number of measurements using Chromium for Android on aarch64,
comparing a browser with inlined checks (the baseline) against a
browser with outlined checks.
Code size: Size of .text decreases from 243897420 to 171619972 bytes,
or a 30% decrease.
Performance: Using Chromium's blink_perf.layout microbenchmarks I
measured a median performance regression of 6.24%.
The fact that a perf/size tradeoff is evident here suggests that
we might want to make the new behaviour conditional on -Os/-Oz.
But for now I've enabled it unconditionally, my reasoning being that
hwasan users typically expect a relatively large perf hit, and ~6%
isn't really adding much. We may want to revisit this decision in
the future, though.
I also tried experimenting with varying the number of registers
selectable by the hwasan check pseudo-instruction (which would result
in fewer variants being created), on the hypothesis that creating
fewer variants of the function would expose another perf/size tradeoff
by reducing icache pressure from the check functions at the cost of
register pressure. Although I did observe a code size increase with
fewer registers, I did not observe a strong correlation between the
number of registers and the performance of the resulting browser on the
microbenchmarks, so I conclude that we might as well use ~all registers
to get the maximum code size improvement. My results are below:
Regs | .text size | Perf hit
-----+------------+---------
~all | 171619972 | 6.24%
16 | 171765192 | 7.03%
8 | 172917788 | 5.82%
4 | 177054016 | 6.89%
Differential Revision: https://reviews.llvm.org/D56954
llvm-svn: 351920
Reports correct size and tags when either size is not power of two
or offset to bad granule is not zero.
Differential revision: https://reviews.llvm.org/D56603
llvm-svn: 351730
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
Summary:
Whenever a large shadow region is tagged to zero, madvise(DONT_NEED)
as much of it as possible.
This reduces shadow RSS on Android by 45% or so, and total memory use
by 2-4%, probably even more on long running multithreaded programs.
CPU time seems to be in the noise.
Reviewers: kcc, pcc
Subscribers: srhines, kubamracek, llvm-commits
Differential Revision: https://reviews.llvm.org/D56757
llvm-svn: 351620
Now that memory intrinsics are instrumented, it's more likely that
CheckAddressSized will be called with size 0. (It was possible before
with IR like:
%val = load [0 x i8], [0 x i8]* %ptr
but I don't think clang will generate IR like that and the optimizer
would normally remove it by the time it got anywhere near our pass
anyway). The right thing to do in both cases is to disable the
addressing checks (since the underlying memory intrinsic is a no-op),
so that's what we do.
Differential Revision: https://reviews.llvm.org/D56465
llvm-svn: 350683
Summary:
This patch lets ASan run when /proc is not accessible (ex. not mounted
yet). It includes a special test-only flag that emulates this condition
in an unpriviledged process.
This only matters on Linux, where /proc is necessary to enumerate
virtual memory mappings.
Reviewers: vitalybuka, pcc, krytarowski
Subscribers: kubamracek, llvm-commits
Differential Revision: https://reviews.llvm.org/D56141
llvm-svn: 350590
We still need the interceptor on non-aarch64 to untag the pthread_t
and pthread_attr_t pointers and disable tagging on allocations done
internally by glibc.
llvm-svn: 350445
The problem is similar to D55986 but for threads: a process with the
interceptor hwasan library loaded might have some threads started by
instrumented libraries and some by uninstrumented libraries, and we
need to be able to run instrumented code on the latter.
The solution is to perform per-thread initialization lazily. If a
function needs to access shadow memory or add itself to the per-thread
ring buffer its prologue checks to see whether the value in the
sanitizer TLS slot is null, and if so it calls __hwasan_thread_enter
and reloads from the TLS slot. The runtime does the same thing if it
needs to access this data structure.
This change means that the code generator needs to know whether we
are targeting the interceptor runtime, since we don't want to pay
the cost of lazy initialization when targeting a platform with native
hwasan support. A flag -fsanitize-hwaddress-abi={interceptor,platform}
has been introduced for selecting the runtime ABI to target. The
default ABI is set to interceptor since it's assumed that it will
be more common that users will be compiling application code than
platform code.
Because we can no longer assume that the TLS slot is initialized,
the pthread_create interceptor is no longer necessary, so it has
been removed.
Ideally, lazy initialization should only cost one instruction in the
hot path, but at present the call may cause us to spill arguments
to the stack, which means more instructions in the hot path (or
theoretically in the cold path if the spills are moved with shrink
wrapping). With an appropriately chosen calling convention for
the per-thread initialization function (TODO) the hot path should
always need just one instruction and the cold path should need two
instructions with no spilling required.
Differential Revision: https://reviews.llvm.org/D56038
llvm-svn: 350429
The Android dynamic loader has a non-standard feature that allows
libraries such as the hwasan runtime to interpose symbols even after
the symbol already has a value. The new value of the symbol is used to
relocate libraries loaded after the interposing library, but existing
libraries keep the old value. This behaviour is activated by the
DF_1_GLOBAL flag in DT_FLAGS_1, which is set by passing -z global to
the linker, which is what we already do to link the hwasan runtime.
What this means in practice is that if we have .so files that depend
on interceptor-mode hwasan without the main executable depending on
it, some of the libraries in the process will be using the hwasan
allocator and some will be using the system allocator, and these
allocators need to interact somehow. For example, if an instrumented
library calls a function such as strdup that allocates memory on
behalf of the caller, the instrumented library can reasonably expect
to be able to call free to deallocate the memory.
We can handle that relatively easily with hwasan by using tag 0 to
represent allocations from the system allocator. If hwasan's realloc
or free functions are passed a pointer with tag 0, the system allocator
is called.
One limitation is that this scheme doesn't work in reverse: if an
instrumented library allocates memory, it must free the memory itself
and cannot pass ownership to a system library. In a future change,
we may want to expose an API for calling the system allocator so
that instrumented libraries can safely transfer ownership of memory
to system libraries.
Differential Revision: https://reviews.llvm.org/D55986
llvm-svn: 350427
Summary:
Replace the 32-bit allocator with a 64-bit one with a non-constant
base address, and reduce both the number of size classes and the maximum
size of per-thread caches.
As measured on [1], this reduces average weighted memory overhead
(MaxRSS) from 26% to 12% over stock android allocator. These numbers
include overhead from code instrumentation and hwasan shadow (i.e. not a
pure allocator benchmark).
This switch also enables release-to-OS functionality, which is not
implemented in the 32-bit allocator. I have not seen any effect from
that on the benchmark.
[1] https://android.googlesource.com/platform/system/extras/+/master/memory_replay/
Reviewers: vitalybuka, kcc
Subscribers: kubamracek, cryptoad, llvm-commits
Differential Revision: https://reviews.llvm.org/D56239
llvm-svn: 350370
Revert r350104 "[asan] Fix build on windows."
Revert r350101 "[asan] Support running without /proc."
These changes break Mac build, too.
llvm-svn: 350112
Summary:
This patch lets ASan run when /proc is not accessible (ex. not mounted
yet). It includes a special test-only flag that emulates this condition
in an unpriviledged process.
This only matters on Linux, where /proc is necessary to enumerate
virtual memory mappings.
Reviewers: pcc, vitalybuka
Subscribers: kubamracek, llvm-commits
Differential Revision: https://reviews.llvm.org/D55874
llvm-svn: 350101
This is patch complements D55117 implementing __hwasan_mem*
functions in runtime
Differential revision: https://reviews.llvm.org/D55554
llvm-svn: 349730
As of r349413 it's now possible for a binary to contain an empty
hwasan frame section. Handle that case simply by doing nothing.
Differential Revision: https://reviews.llvm.org/D55796
llvm-svn: 349428
Summary:
This is a follow up patch to r346956 for the `SizeClassAllocator32`
allocator.
This patch makes `AddressSpaceView` a template parameter both to the
`ByteMap` implementations (but makes `LocalAddressSpaceView` the
default), some `AP32` implementations and is used in `SizeClassAllocator32`.
The actual changes to `ByteMap` implementations and
`SizeClassAllocator32` are very simple. However the patch is large
because it requires changing all the `AP32` definitions, and users of
those definitions.
For ASan and LSan we make `AP32` and `ByteMap` templateds type that take
a single `AddressSpaceView` argument. This has been done because we will
instantiate the allocator with a type that isn't `LocalAddressSpaceView`
in the future patches. For the allocators used in the other sanitizers
(i.e. HWAsan, MSan, Scudo, and TSan) use of `LocalAddressSpaceView` is
hard coded because we do not intend to instantiate the allocators with
any other type.
In the cases where untemplated types have become templated on a single
`AddressSpaceView` parameter (e.g. `PrimaryAllocator`) their name has
been changed to have a `ASVT` suffix (Address Space View Type) to
indicate they are templated. The only exception to this are the `AP32`
types due to the desire to keep the type name as short as possible.
In order to check that template is instantiated in the correct a way a
`static_assert(...)` has been added that checks that the
`AddressSpaceView` type used by `Params::ByteMap::AddressSpaceView` matches
the `Params::AddressSpaceView`. This uses the new `sanitizer_type_traits.h`
header.
rdar://problem/45284065
Reviewers: kcc, dvyukov, vitalybuka, cryptoad, eugenis, kubamracek, george.karpenkov
Subscribers: mgorny, llvm-commits, #sanitizers
Differential Revision: https://reviews.llvm.org/D54904
llvm-svn: 349138
Summary:
Add a check that TLS_SLOT_TSAN / TLS_SLOT_SANITIZER, whichever
android_get_tls_slot is using, is not conflicting with
TLS_SLOT_DLERROR.
Reviewers: rprichard, vitalybuka
Subscribers: srhines, kubamracek, llvm-commits
Differential Revision: https://reviews.llvm.org/D55587
llvm-svn: 348979
Summary:
With free_checks_tail_magic=1 (default) HWASAN
writes magic bytes to the tail of every heap allocation
(last bytes of the last granule, if the last granule is not fully used)
and checks these bytes on free().
This feature will detect buffer overwires within the last granule
at the time of free().
This is an alternative to malloc_align_right=[1289] that should have
fewer compatibility issues. It is also weaker since it doesn't
detect read overflows and reports bugs at free() instead of at access.
Reviewers: eugenis
Subscribers: kubamracek, delcypher, #sanitizers, llvm-commits
Differential Revision: https://reviews.llvm.org/D54656
llvm-svn: 347116
Summary:
... so that we can find intra-granule buffer overflows.
The default is still to always align left.
It remains to be seen wether we can enable this mode at scale.
Reviewers: eugenis
Reviewed By: eugenis
Subscribers: jfb, dvyukov, kubamracek, delcypher, #sanitizers, llvm-commits
Differential Revision: https://reviews.llvm.org/D53789
llvm-svn: 347082
Summary:
When reporting a fatal error, collect and add the entire report text to
android_set_abort_message so that it can be found in the tombstone.
Reviewers: kcc, vitalybuka
Subscribers: srhines, kubamracek, llvm-commits
Differential Revision: https://reviews.llvm.org/D54284
llvm-svn: 346557
We have seen failing builds due to a race condition between
RTAsan_dynamic and libc++ headers builds, specifically libc++
headers depend on __config and if this header hasn't been copied
into the final location, including other headers will typically
result in failure. To avoid this race, we add an explicit dependency
on libc++ headers which ensures that they've been copied into place
before the sanitizer object library build starts.
Differential Revision: https://reviews.llvm.org/D54198
llvm-svn: 346339
Summary:
At compile-time, create an array of {PC,HumanReadableStackFrameDescription}
for every function that has an instrumented frame, and pass this array
to the run-time at the module-init time.
Similar to how we handle pc-table in SanitizerCoverage.
The run-time is dummy, will add the actual logic in later commits.
Reviewers: morehouse, eugenis
Reviewed By: eugenis
Subscribers: srhines, llvm-commits, kubamracek
Differential Revision: https://reviews.llvm.org/D53227
llvm-svn: 344985
Summary:
GetStackTrace treats top PC as a return address from an error reporting
function, and adjusts it down by 1 instruction. This is not necessary in
a signal handler, so adjust PC up to compensate.
Reviewers: kcc, vitalybuka, jfb
Subscribers: kubamracek, llvm-commits
Differential Revision: https://reviews.llvm.org/D52802
llvm-svn: 343638
Summary:
This essentially reverts r337010 since it breaks UBSan, which is used
for a few platform libraries. The "-z global" flag is now added for
Scudo as well. The only other sanitizer shared libraries are for asan
and hwasan, which have also been reinstated to use the global flag.
Reviewers: cryptoad, eugenis
Reviewed By: cryptoad
Subscribers: kubamracek, mgorny, delcypher, #sanitizers, nickdesaulniers, chh, kongyi, pirama, llvm-commits
Differential Revision: https://reviews.llvm.org/D52770
llvm-svn: 343599
Summary:
Display a list of recent stack frames (not a stack trace!) when
tag-mismatch is detected on a stack address.
The implementation uses alignment tricks to get both the address of
the history buffer, and the base address of the shadow with a single
8-byte load. See the comment in hwasan_thread_list.h for more
details.
Developed in collaboration with Kostya Serebryany.
Reviewers: kcc
Subscribers: srhines, kubamracek, mgorny, hiraditya, jfb, llvm-commits
Differential Revision: https://reviews.llvm.org/D52249
llvm-svn: 342923
Summary:
Display a list of recent stack frames (not a stack trace!) when
tag-mismatch is detected on a stack address.
The implementation uses alignment tricks to get both the address of
the history buffer, and the base address of the shadow with a single
8-byte load. See the comment in hwasan_thread_list.h for more
details.
Developed in collaboration with Kostya Serebryany.
Reviewers: kcc
Subscribers: srhines, kubamracek, mgorny, hiraditya, jfb, llvm-commits
Differential Revision: https://reviews.llvm.org/D52249
llvm-svn: 342921
Summary:
When building without COMPILER_RT_HWASAN_WITH_INTERCEPTORS, skip
interceptors for malloc/free/etc and only export their versions with
__sanitizer_ prefix.
Also remove a hack in mallinfo() interceptor that does not apply to
hwasan.
Reviewers: kcc
Subscribers: kubamracek, krytarowski, llvm-commits
Differential Revision: https://reviews.llvm.org/D51711
llvm-svn: 341598