Now that page aliasing for x64 has landed, we don't need to worry about
passing tagged pointers to libc, and thus D98875 removed it.
Unfortunately, we still test on aarch64 devices that don't have the
kernel tagged address ABI (https://reviews.llvm.org/D98875#2649269).
All the memory that we pass to the kernel in these tests is from global
variables. Instead of having architecture-specific untagging mechanisms
for this memory, let's just not tag the globals.
Reviewed By: eugenis, morehouse
Differential Revision: https://reviews.llvm.org/D101121
This patch does a few cleanup things:
1. The non-standalone scudo has a problem where GWP-ASan allocations
may not meet alignment requirements where Scudo was requested to have
alignment >= 16. Use the new GWP-ASan API to fix this.
2. The standalone variant loses some debugging information inside of
GWP-ASan because we ask GWP-ASan to allocate an aligned size in the
frontend. This means reports end up with 'UaF on a 16-byte allocation'
for a 1-byte allocation with 16-byte alignment. Also use the new API to
fix this.
3. Add post-alloc hooks for GWP-ASan intercepted allocations, and add
stats tracking for GWP-ASan allocations.
4. Add a small test that checks the alignment of the frontend
allocator, so that it can be used under GWP-ASan torture mode.
5. Add GWP-ASan torture mode as a testing configuration to catch these
regressions.
Depends on D94830, D95889.
Reviewed By: cryptoad
Differential Revision: https://reviews.llvm.org/D95884
From a cache perspective it's better to store the header immediately
after loading it. If we delay this operation until after we've
retagged it's more likely that our header will have been evicted from
the cache and we'll need to fetch it again in order to perform the
compare-exchange operation.
For similar reasons, store the deallocation stack before retagging
instead of afterwards.
Differential Revision: https://reviews.llvm.org/D101137
It looks like there's some old version of gcc that doesn't like this
static_assert (I couldn't reproduce the issue with gcc 8, 9 or 10).
Work around the issue by only checking the static_assert under clang,
which should provide sufficient coverage.
Should hopefully fix this buildbot:
https://lab.llvm.org/buildbot/#/builders/112/builds/5356
With AndroidSizeClassMap all of the LSBs are in the range 4-6 so we
only need 2 bits of information per size class. Furthermore we have
32 size classes, which conveniently lets us fit all of the information
into a 64-bit integer. Do so if possible so that we can avoid a table
lookup entirely.
Differential Revision: https://reviews.llvm.org/D101105
In the most common case we call computeOddEvenMaskForPointerMaybe()
from quarantineOrDeallocateChunk(), in which case we need to look up
the class size from the SizeClassMap in order to compute the LSB. Since
we need to do a lookup anyway, we may as well look up the LSB itself
and avoid computing it every time.
While here, switch to a slightly more efficient way of computing the
odd/even mask.
Differential Revision: https://reviews.llvm.org/D101018
The first version of origin tracking tracks only memory stores. Although
this is sufficient for understanding correct flows, it is hard to figure
out where an undefined value is read from. To find reading undefined values,
we still have to do a reverse binary search from the last store in the chain
with printing and logging at possible code paths. This is
quite inefficient.
Tracking memory load instructions can help this case. The main issues of
tracking loads are performance and code size overheads.
With tracking only stores, the code size overhead is 38%,
memory overhead is 1x, and cpu overhead is 3x. In practice #load is much
larger than #store, so both code size and cpu overhead increases. The
first blocker is code size overhead: link fails if we inline tracking
loads. The workaround is using external function calls to propagate
metadata. This is also the workaround ASan uses. The cpu overhead
is ~10x. This is a trade off between debuggability and performance,
and will be used only when debugging cases that tracking only stores
is not enough.
Reviewed By: gbalats
Differential Revision: https://reviews.llvm.org/D100967
Since we already have a tagged pointer available to us, we can just
extract the tag from it and avoid an LDG instruction.
Differential Revision: https://reviews.llvm.org/D101014
Now that we have a more efficient implementation of storeTags(),
we should start using it from resizeTaggedChunk(). With that, plus
a new storeTag() function, resizeTaggedChunk() can be made generic,
and so can prepareTaggedChunk(). Make it so.
Now that the functions are generic, move them to combined.h so that
memtag.h no longer needs to know about chunks.
Differential Revision: https://reviews.llvm.org/D100911
DC GZVA can operate on multiple granules at a time (corresponding to
the CPU's cache line size) so we can generally expect it to be faster
than STZG in a loop.
Differential Revision: https://reviews.llvm.org/D100910
An empty macro that expands to just `... else ;` can get
warnings from some compilers (e.g. GCC's -Wempty-body).
Reviewed By: cryptoad, vitalybuka
Differential Revision: https://reviews.llvm.org/D100693
If these sizes do not match, asan will not work as expected. Previously, we added compile-time checks for non-iOS platforms. We check at run time for iOS because we get the max VM size from the kernel at run time.
rdar://76477969
Reviewed By: delcypher
Differential Revision: https://reviews.llvm.org/D100784
This test from @MaskRay comment on D69428. The patch is looking to
break this behavior. If we go with D69428 I hope we will have some
workaround for this test or include explicit test update into the patch.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D100906
sanitizer_linux_libcdep.cpp doesn't build for Linux sparc (with minimum support
but can build) after D98926. I wasn't aware because the file didn't mention
`__sparc__`.
While here, add the relevant support since it does not add complexity
(the D99566 approach). Adds an explicit `#error` for unsupported
non-Android Linux and FreeBSD architectures.
ThreadDescriptorSize is only used by lsan to scan thread-specific data keys in
the thread control block.
On TLS Variant II architectures (i386/x86_64/s390/sparc), our dl_iterate_phdr
based approach can cover the region from the first byte of the static TLS block
(static TLS surplus) to the thread pointer.
We just need to extend the range to include the first few members of struct
pthread. offsetof(struct pthread, specific_used) satisfies the requirement and
has not changed since 2007-05-10. We don't need to update ThreadDescriptorSize
for each glibc version.
Technically we could use the 524/1552 for x86_64 as well but there is potential
risk that large applications with thousands of shared object dependency may
dislike the time complexity increase if there are many threads, so I don't make
the simplification for now.
Differential Revision: https://reviews.llvm.org/D100892
The previous check was wrong because it only checks that the LLVM CMake
directory exists. However, it's possible that the directory exists but
the `LLVMConfig.cmake` file does not. When this happens we would
incorectly try to include the non-existant file.
To fix this we make the check stricter by checking that the file
we want to include actually exists.
This is a follow up to fd28517d87.
rdar://76870467
If these sizes do not match, asan will not work as expected.
If possible, assert at compile time that the vm size is less than or equal to mmap range.
If a compile time assert is not possible, check at run time (for iOS)
rdar://76477969
Reviewed By: delcypher, yln
Differential Revision: https://reviews.llvm.org/D100239
We previously shrunk the mmap range size on ios, but those settings got inherited by apple silicon macs.
Don't shrink the vm range on apple silicon Mac since we have access to the full range.
Also don't shrink vm range for iOS simulators because they have the same range as the host OS, not the simulated OS.
rdar://75302812
Reviewed By: delcypher, kubamracek, yln
Differential Revision: https://reviews.llvm.org/D100234
As mentioned in https://gcc.gnu.org/PR100114 , glibc starting with the
https://sourceware.org/git/?p=glibc.git;a=commit;h=6c57d320484988e87e446e2e60ce42816bf51d53
change doesn't define SIGSTKSZ and MINSIGSTKSZ macros to constants, but to sysconf function call.
sanitizer_posix_libcdep.cpp has
static const uptr kAltStackSize = SIGSTKSZ * 4; // SIGSTKSZ is not enough.
which is generally fine, just means that when SIGSTKSZ is not a compile time constant will be initialized later.
The problem is that kAltStackSize is used in SetAlternateSignalStack which is called very early, from .preinit_array
initialization, i.e. far before file scope variables are constructed, which means it is not initialized and
mmapping 0 will fail:
==145==ERROR: AddressSanitizer failed to allocate 0x0 (0) bytes of SetAlternateSignalStack (error code: 22)
Here is one possible fix, another one could be to make kAltStackSize a preprocessor macro if _SG_SIGSTKSZ is defined
(but perhaps with having an automatic const variable initialized to it so that sysconf isn't at least called twice
during SetAlternateSignalStack.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D100645
In order to integrate libFuzzer with a dynamic symbolic execution tool
Sydr we need to print loaded file paths.
Reviewed By: morehouse
Differential Revision: https://reviews.llvm.org/D100303
... so that FreeBSD specific GetTls/glibc specific pthread_self code can be
removed. This also helps FreeBSD arm64/powerpc64 which don't have GetTls
implementation yet.
GetTls is the range of
* thread control block and optional TLS_PRE_TCB_SIZE
* static TLS blocks plus static TLS surplus
On glibc, lsan requires the range to include
`pthread::{specific_1stblock,specific}` so that allocations only referenced by
`pthread_setspecific` can be scanned.
This patch uses `dl_iterate_phdr` to collect TLS blocks. Find the one
with `dlpi_tls_modid==1` as one of the initially loaded module, then find
consecutive ranges. The boundaries give us addr and size.
This allows us to drop the glibc internal `_dl_get_tls_static_info` and
`InitTlsSize`. However, huge glibc x86-64 binaries with numerous shared objects
may observe time complexity penalty, so exclude them for now. Use the simplified
method with non-Android Linux for now, but in theory this can be used with *BSD
and potentially other ELF OSes.
This removal of RISC-V `__builtin_thread_pointer` makes the code compilable with
more compiler versions (added in Clang in 2020-03, added in GCC in 2020-07).
This simplification enables D99566 for TLS Variant I architectures.
Note: as of musl 1.2.2 and FreeBSD 12.2, dlpi_tls_data returned by
dl_iterate_phdr is not desired: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=254774
This can be worked around by using `__tls_get_addr({modid,0})` instead
of `dlpi_tls_data`. The workaround can be shared with the workaround for glibc<2.25.
This fixes some tests on Alpine Linux x86-64 (musl)
```
test/lsan/Linux/cleanup_in_tsd_destructor.c
test/lsan/Linux/fork.cpp
test/lsan/Linux/fork_threaded.cpp
test/lsan/Linux/use_tls_static.cpp
test/lsan/many_tls_keys_thread.cpp
test/msan/tls_reuse.cpp
```
and `test/lsan/TestCases/many_tls_keys_pthread.cpp` on glibc aarch64.
The number of sanitizer test failures does not change on FreeBSD/amd64 12.2.
Differential Revision: https://reviews.llvm.org/D98926
While attempting to roll the latest Scudo in Fuchsia, some issues
arose. While trying to debug them, it appeared that `DCHECK`s were
also never exercised in Fuchsia. This CL fixes the following
problems:
- the size of a block in the TransferBatch class must be a multiple
of the compact pointer scale. In some cases, it wasn't true, which
lead to obscure crashes. Now, we round up `sizeof(TransferBatch)`.
This only materialized in Fuchsia due to the specific parameters
of the `DefaultConfig`;
- 2 `DCHECK` statements in Fuchsia were incorrect;
- `map()` & co. require a size multiple of a page (as enforced in
Fuchsia `DCHECK`s), which wasn't the case for `PackedCounters`.
- In the Secondary, a parameter was marked as `UNUSED` while it is
actually used.
Differential Revision: https://reviews.llvm.org/D100524
This change adds convenient composed constants to be used for tsan_read_try_lock annotations, reducing the boilerplate at the instrumentation site.
Reviewed By: dvyukov
Differential Revision: https://reviews.llvm.org/D99595
Do not hold the free/live thread list lock longer than necessary.
This change speeds up the following benchmark 10x.
constexpr int kTopThreads = 50;
constexpr int kChildThreads = 20;
constexpr int kChildIterations = 8;
void Thread() {
for (int i = 0; i < kChildIterations; ++i) {
std::vector<std::thread> threads;
for (int i = 0; i < kChildThreads; ++i)
threads.emplace_back([](){});
for (auto& t : threads)
t.join();
}
}
int main() {
std::vector<std::thread> threads;
for (int i = 0; i < kTopThreads; ++i)
threads.emplace_back(Thread);
for (auto& t : threads)
t.join();
}
Differential Revision: https://reviews.llvm.org/D100348
The GNU assembler can't parse `.arch_extension ...` before a `;`.
So instead uniformly use raw string syntax with separate lines
instead of `;` separators in the assembly code.
Reviewed By: pcc
Differential Revision: https://reviews.llvm.org/D100413
Allow test contents to be copied before execution by using
`%ld_flags_rpath_so`, `%ld_flags_rpath_exe`, and `%dynamiclib`
substitutions.
rdar://76302416
Differential Revision: https://reviews.llvm.org/D100240
This will allow us to make osx specific changes easier. Because apple silicon macs also run on aarch64, it was easy to confuse it with iOS.
rdar://75302812
Reviewed By: yln
Differential Revision: https://reviews.llvm.org/D100157
Mark the test as unsupported to bring the bot online. Could probably be
permanently fixed by using one of the workarounds already present in
compiler-rt.
The current variable name isn't used anywhere else, which indicates it's
a typo. Let's fix it before someone copy+pastes it somewhere else.
Reviewed By: Jim
Differential Revision: https://reviews.llvm.org/D39157
ASan declares these functions as strongly-defined, which results in
'duplicate symbol' errors when trying to replace them in user code when
linking the runtimes statically.
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D100220
D99763 fixed `SizeClassAllocatorLocalCache::drain` but with the
assumption that `BatchClassId` is 0 - which is currently true. I would
rather not make the assumption so that if we ever change the ID of
the batch class, the loop would still work. Since `BatchClassId` is
used more often in `local_cache.h`, introduce a constant so that we
don't have to specify `SizeClassMap::` every time.
Differential Revision: https://reviews.llvm.org/D100062
After a follow-up change (D98332) this header can be included the same time
as fenv.h when running the tests. To avoid enum members conflicting with
the macros/enums defined in the host fenv.h, prefix them with CRT_.
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D98333
These tests were added in:
1daa48f00559e422c90b
malloc_zero.c and realloc_too_big.c fail when only
leak sanitizer is enabled.
http://lab.llvm.org:8011/#/builders/59/builds/1635
(also in an armv8 32 bit build)
(I would XFAIL them but the same test is run with
address and leak sanitizer enabled and that one does
pass)
Fixes the ASan RISC-V memory mapping (originally introduced by D87580 and
D87581). This should be an improvement both in terms of first principles
soundness and observed test failures --- test failures would occur
non-deterministically depending on the ASLR random offset.
On RISC-V Linux (64-bit), `TASK_UNMAPPED_BASE` is currently defined as
`PAGE_ALIGN(TASK_SIZE / 3)`. The non-power-of-two divisor makes the result
be the not very round number 0x1555556000. That address had to be further
rounded to ensure page alignment after the shadow scale shifting is applied.
Still, that value explains why the mapping table may look less regular than
expected.
Further cleanups:
- Moved the mapping table comment, to ensure that the two Linux/AArch64
tables stayed together;
- Removed mention of Sv48. Neither the original mapping nor this one are
compatible with an actual Linux Sv48 address space (mainline Linux still
operates Sv48 in Sv39 mode). A future patch can improve this;
- Removed the additional comments, for consistency.
Differential Revision: https://reviews.llvm.org/D97646
Previously it wasn't possible to configure a standalone compiler-rt
build if the `LLVMConfig.cmake` file isn't present in a shipped
toolchain.
This patch adds a fallback behaviour for when `LLVMConfig.cmake` is not
available in the toolchain being used for configure. The fallback
behaviour mocks out the bare minimum required to make a configure
succeed when the host is Darwin. Support for other platforms could
be added in future patches.
The new code path is taken either in one of the following cases:
* `llvm-config` is not available.
* `llvm-config` is available but it provides an invalid path for the CMake files.
The motivation here is to be able to generate the compiler-rt lit test
suites for an arbitrary LLVM toolchain and then run the tests against
it.
The invocation to do this looks something like.
```
CC=/path/to/cc \
CXX=/path/to/c++ \
cmake \
-G Ninja \
-DLLVM_CONFIG_PATH=/path/to/llvm-config \
-DCOMPILER_RT_INCLUDE_TESTS=ON \
/path/to/llvm-project/compiler-rt
# Note we don't compile compiler-rt in this workflow.
bin/llvm-lit -v test/path/to/generated/test_suite
```
A possible alternative approach is to configure the
`cmake/modules/LLVMConfig.cmake.in` file in the LLVM source tree
and then include it. This approach was not taken because it is more
complicated.
An interesting side benefit of this patch is that it is now
possible to configure on Darwin without `llvm-config` being available
by configuring with `-DLLVM_CONFIG_PATH=""`. This moves us a step
closer to a world where no LLVM build artefacts are required to
build compiler-rt.
rdar://76016632
Differential Revision: https://reviews.llvm.org/D99621
layout.
When doing a standalone compiler-rt build we currently rely on
getting information from the `llvm-config` binary. Previously
we would rely on calling `llvm-config --src-root` to find the
LLVM sources. Unfortunately the returned path could easily be wrong
if the sources were built on another machine.
Now that compiler-rt is part of a monorepo we can easily fix this
problem by finding the LLVM source tree next to `compiler-rt` in
the monorepo. We do this regardless of whether or not the `llvm-config`
binary is available which moves us one step closer to not requiring
`llvm-config` to be available.
To try avoid anyone breaking anyone who relies on the current behavior,
if the path assuming the monorepo layout doesn't exist we invoke
`llvm-config --src-root` to get the path. A deprecation warning is
emitted if this path is taken because we should remove this path
in the future given that other runtimes already assume the monorepo
layout.
We also now emit a warning if `LLVM_MAIN_SRC_DIR` does not exist.
The intention is that this should be a hard error in future but
to avoid breaking existing users we'll keep this as a warning
for now.
rdar://76016632
Differential Revision: https://reviews.llvm.org/D99620
This was reverted by f176803ef1 due to
Ubuntu 16.04 x86-64 glibc 2.23 problems.
This commit additionally calls `__tls_get_addr({modid,0})` to work around the
dlpi_tls_data==NULL issues for glibc<2.25
(https://sourceware.org/bugzilla/show_bug.cgi?id=19826)
GetTls is the range of
* thread control block and optional TLS_PRE_TCB_SIZE
* static TLS blocks plus static TLS surplus
On glibc, lsan requires the range to include
`pthread::{specific_1stblock,specific}` so that allocations only referenced by
`pthread_setspecific` can be scanned.
This patch uses `dl_iterate_phdr` to collect TLS blocks. Find the one
with `dlpi_tls_modid==1` as one of the initially loaded module, then find
consecutive ranges. The boundaries give us addr and size.
This allows us to drop the glibc internal `_dl_get_tls_static_info` and
`InitTlsSize` entirely. Use the simplified method with non-Android Linux for
now, but in theory this can be used with *BSD and potentially other ELF OSes.
This simplification enables D99566 for TLS Variant I architectures.
See https://reviews.llvm.org/D93972#2480556 for analysis on GetTls usage
across various sanitizers.
Differential Revision: https://reviews.llvm.org/D98926
The check was removed in D99786 as it seems that quarantine is
irrelevant for the just created allocator. However there is internal
issues with tagged memory access.
We should be able to fix iterateOverChunks for taggin later.
Existing implementations took up to 30 minutues to execute on my setup.
Now it's more convenient to debug a single test.
Reviewed By: cryptoad
Differential Revision: https://reviews.llvm.org/D99786
Linux-only for now. Some mac bits stubbed out, but not tested.
Good enough for the tiny_race.c example at
https://clang.llvm.org/docs/ThreadSanitizer.html :
$ out/gn/bin/clang -fsanitize=address -g -O1 tiny_race.c
$ while true; do ./a.out || echo $? ; done
While here, also make `-fsanitize=address` work for .c files.
Differential Revision: https://reviews.llvm.org/D99795
This change adds a SimpleFastHash64 variant of SimpleFastHash which allows call sites to specify a starting value and get a 64 bit hash in return. This allows a hash to be "resumed" with more data.
A later patch needs this to be able to hash a sequence of module-relative values one at a time, rather than just a region a memory.
Reviewed By: morehouse
Differential Revision: https://reviews.llvm.org/D94510
Trying to build the builtins code fails because `arm64_32_SOURCES` is
missing. Setting it to the same list used for `aarch64_SOURCES` solves
that problem and allow the builtins to compile for that architecture.
Additionally, arm64_32 is added as a possible architecture for watchos
platforms.
Reviewed By: compnerd
Differential Revision: https://reviews.llvm.org/D99690
On 64-bit systems with small VMAs (e.g. 39-bit) we can't use
SizeClassAllocator64 parameterized with size class maps containing a large
number of classes, as that will make the allocator region size too small
(< 2^32). Several tests were already disabled for Android because of this.
This patch provides the correct allocator configuration for RISC-V
(riscv64), generalizes the gating condition for tests that can't be enabled
for small VMA systems, and tweaks the tests that can be made compatible with
those systems to enable them.
I think the previous gating on Android should instead be AArch64+Android, so
the patch reflects that.
Differential Revision: https://reviews.llvm.org/D97234
The previous code may underestimate the static TLS surplus part, which may cause
false positives to LeakSanitizer if a dynamically loaded module uses the surplus
and there is an allocation only referenced by a thread's TLS.
With D98926, many_tls_keys_pthread.cpp appears to be working.
On glibc 2.30-0ubuntu2, swapcontext.cpp and Linux/fork_and_leak.cpp work fine
but they strangely fail on clang-cmake-aarch64-full
(https://lab.llvm.org/buildbot/#/builders/7/builds/2240).
Disable them for now.
Note: check-lsan was recently enabled on AArch64 in D98985. A test takes
10+ seconds. We should figure out the bottleneck.
```
/b/sanitizer-x86_64-linux/build/llvm-project/compiler-rt/test/memprof/TestCases/test_terse.cpp:11:11: error: CHECK: expected string not found in input
// CHECK: MIB:[[STACKID:[0-9]+]]/1/40.00/40/40/20.00/20/20/[[AVELIFETIME:[0-9]+]].00/[[AVELIFETIME]]/[[AVELIFETIME]]/0/0/0/0
^
<stdin>:1:1: note: scanning from here
MIB:StackID/AllocCount/AveSize/MinSize/MaxSize/AveAccessCount/MinAccessCount/MaxAccessCount/AveLifetime/MinLifetime/MaxLifetime/NumMigratedCpu/NumLifetimeOverlaps/NumSameAllocCpu/NumSameDeallocCpu
^
<stdin>:4:1: note: possible intended match here
MIB:134217729/1/40.00/40/40/20.00/20/20/7.00/7/7/1/0/0/0
```