Using the CRT's `_mkdir` during ASan init leads to a launch failure and a hang on Windows.
You can trigger it by calling:
> set ASAN_OPTIONS=log_path=a/a/a
> .\asan_program.exe
The crash dump shows the following stack trace:
```
_guard_dispatch_icall_nop()
__acrt_get_utf8_acp_compatibility_codepage()
_mkdir(const char * path)
```
I suspect there is a CFG (Control Flow Guard) check in the CRT, which may lead to calling an uninitialized CFG guard function pointer. Also, `_mkdir` supports UTF-8 encoding of the path and calls `_wmkdir`, but that's not necessary in this case: the other file APIs in sanitizer_win.cpp assume the ANSI code page only, so it makes sense to use `CreateDirectoryA` to match the other file API calls in the same file.
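As a rough illustration of that direction (a sketch with a hypothetical helper name, not the actual patch), directory creation can go through the Win32 ANSI API directly:
```
// Sketch: create the log_path directory via the Win32 ANSI API instead of the
// CRT's _mkdir, matching the other file APIs used in sanitizer_win.cpp.
#include <windows.h>

static bool CreateDirIfMissing(const char *path) {
  // CreateDirectoryA avoids the CRT's UTF-8/_wmkdir path and any CFG guard
  // dispatch that may not be initialized this early during ASan init.
  if (::CreateDirectoryA(path, /*lpSecurityAttributes=*/nullptr))
    return true;
  return ::GetLastError() == ERROR_ALREADY_EXISTS;
}
```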
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D114760
Google-signed apexes appear on Android build servers' symbol files as
being under /apex/com.google.android.<foo>/. In reality, the apexes are
always installed as /apex/com.android.<foo>/ (note the lack of
'google'). In order for local symbolization under hwasan_symbolize to
work correctly, we also try the 'google' directory.
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D114919
According to the comments on D44404, something like this was the goal.
Reviewed By: morehouse, kstoimenov
Differential Revision: https://reviews.llvm.org/D114991
The goal is to identify the bot and try to fix it.
SetSoftRssLimitExceededCallback is set in AsanInitInternal, as I assume
that only MaybeStartBackgroudThread needs to be delayed to constructors.
Later I want to move the MaybeStartBackgroudThread call into sanitizer_common.
If this needs to be reverted, please provide more info, like the bot or details about the setup.
Reviewed By: kstoimenov
Differential Revision: https://reviews.llvm.org/D114934
Compress by factor 4x, takes about 10ms per 8 MiB block.
Depends on D114498.
Reviewed By: morehouse
Differential Revision: https://reviews.llvm.org/D114503
It should be NFC, as they already intercept pthread_create.
This will let us fix BackgroundThread for these sanitizers.
In followup patches I will fix MaybeStartBackgroudThread for them
and corresponding tests.
Reviewed By: kstoimenov
Differential Revision: https://reviews.llvm.org/D114935
We call UnmapShadow before the actual munmap, and at that point we don't yet
know if the provided address/size are sane. We can't call UnmapShadow
after the actual munmap because at that point the memory range can
already be reused for something else, so we can't rely on the munmap
return value to understand whether the values are sane.
While calling munmap with insane values (non-canonical address, negative
size, etc.) is an error, the kernel won't crash. We must also try not to
crash, as the failure mode is very confusing (a page fault inside the
runtime on some derived shadow address).
Such invalid arguments are observed on Chromium tests:
https://bugs.chromium.org/p/chromium/issues/detail?id=1275581
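A standalone demo (64-bit Linux assumed; not runtime code) of that kernel behavior: munmap rejects bogus arguments with an error code rather than crashing, so any validation has to happen on our side before touching shadow:
```
#include <cstddef>
#include <cstdio>
#include <sys/mman.h>

int main() {
  // Unaligned/non-canonical address: the kernel returns an error, no crash.
  int rc = munmap(reinterpret_cast<void *>(0x1234567812345678ULL), 4096);
  std::printf("bogus addr: rc=%d\n", rc);  // expected: -1 (EINVAL)
  // Absurd length that overflows the address space: also just an error.
  rc = munmap(nullptr, static_cast<size_t>(-4096));
  std::printf("bogus size: rc=%d\n", rc);  // expected: -1 (EINVAL)
  return 0;
}
```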
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D114944
The added test demonstrates loading a dynamic library with static TLS.
Such static TLS is a hack that allows a dynamic library to have faster TLS,
but it can be loaded only if all threads happened to allocate some excess
of static TLS space for whatever reason. If that's not the case, loading fails with:
dlopen: cannot load any more object with static TLS
We used to produce a false positive because dlopen writes into the TLS
of all existing threads to initialize/zero the TLS region for the loaded library.
And this appears to be racing with initialization of TLS in the thread,
since we model a write into the whole static TLS region (we don't know what part
of it is currently unused):
WARNING: ThreadSanitizer: data race (pid=2317365)
Write of size 1 at 0x7f1fa9bfcdd7 by main thread:
0 memset
1 init_one_static_tls
2 __pthread_init_static_tls
[[ this is where main calls dlopen ]]
3 main
Previous write of size 8 at 0x7f1fa9bfcdd0 by thread T1:
0 __tsan_tls_initialization
Fix this by ignoring accesses during dlopen.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D114953
Broke the build on Windows, where MprotectReadOnly() isn't defined, see comment
on the code review.
> Compress by factor 4x, takes about 10ms per 8 MiB block.
>
> Depends on D114498.
>
> Reviewed By: morehouse
>
> Differential Revision: https://reviews.llvm.org/D114503
This reverts commit 1d8f295759.
Compress by factor 4x, takes about 10ms per 8 MiB block.
Depends on D114498.
Reviewed By: morehouse
Differential Revision: https://reviews.llvm.org/D114503
The first 8 bytes of each raw profile section need to be 8-byte aligned, since
the first item in each section is a u64 count of the number of items in
the section.
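For illustration (hypothetical helper, not the llvm-profdata code), the padding rule is just rounding each section offset up to the next 8-byte boundary:
```
#include <cstdint>

// Round a raw profile section offset up to 8 bytes so the leading u64 item
// count can be read with an aligned load.
constexpr uint64_t AlignTo8(uint64_t Offset) { return (Offset + 7) & ~uint64_t(7); }
static_assert(AlignTo8(13) == 16, "13 is padded up to the next 8-byte boundary");
```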
Summary of changes:
* Assert alignment when reading counts.
* Update test to check alignment, relax some size checks to allow padding.
* Update raw binary inputs for llvm-profdata tests.
Differential Revision: https://reviews.llvm.org/D114826
Add a Compression::Test type which just pretends to pack
but does nothing useful. It's only called from tests for now.
Depends on D114493.
Reviewed By: kstoimenov
Differential Revision: https://reviews.llvm.org/D114494
We would like to use TLS to store the ThreadState object (or at least a
reference to it), but on Darwin accessing TLS via __thread or manually
by using pthread_key_* is problematic, because there are several places
where interceptors are called when TLS is not accessible (early process
startup, thread cleanup, ...).
Previously, we used a "poor man's TLS" implementation, where we use the
shadow memory of the pointer returned by pthread_self() to store a
pointer to the ThreadState object.
The problem with that was that certain operations can populate shadow
bytes unbeknownst to TSan, and we later interpret these non-zero bytes
as the pointer to our ThreadState object and crash when dereferencing
the pointer.
This patch changes the storage location of our reference to the
ThreadState object to "real" TLS. We make this work by artificially
keeping this reference alive in the pthread_key destructor by resetting
the key value with pthread_setspecific().
This change also fixes the issue where the ThreadState object is
re-allocated after DestroyThreadState() because intercepted functions
can still get called on the terminating thread after the
THREAD_TERMINATE event.
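A minimal sketch of that keep-alive trick (assumed shape, not the actual tsan code):
```
#include <pthread.h>

static pthread_key_t key;

// The destructor re-arms the key: libpthread keeps the slot non-NULL and
// calls the destructor again on a later iteration (up to
// PTHREAD_DESTRUCTOR_ITERATIONS), so the reference stays available for
// interceptors that run after the THREAD_TERMINATE event.
static void ThreadStateDestructor(void *value) {
  pthread_setspecific(key, value);
}

static void InitThreadStateKey() {
  pthread_key_create(&key, ThreadStateDestructor);
}
```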
Radar-Id: rdar://problem/72010355
Reviewed By: dvyukov
Differential Revision: https://reviews.llvm.org/D110236
The memprof unit tests use an older version of gmock (included in the
repo) which does not build cleanly with -pedantic:
https://github.com/google/googletest/issues/2650
For now just silence the warning by disabling pedantic and add the
appropriate flags for gcc and clang.
This commit adds initial support to llvm-profdata to read and print
summaries of raw memprof profiles.
Summary of changes:
* Refactor shared defs to MemProfData.inc
* Extend show_main to display memprof profile summaries.
* Add a simple raw memprof profile reader.
* Add a couple of tests to tools/llvm-profdata.
Differential Revision: https://reviews.llvm.org/D114286
In a multi-threaded application, concurrent StackStore::Store calls may
finish in an order different from the assigned Ids. So we can't assume
that after we switch to writing the next block the previous one is done.
The workaround is to count the exact number of uptrs stored into the block,
including the skipped tail/head which were not able to fit an entire trace.
Depends on D114490.
Reviewed By: morehouse
Differential Revision: https://reviews.llvm.org/D114493
Larger blocks are more convenient for compression.
Blocks are allocated with MmapNoReserveOrDie to save some memory.
Also it's 15% faster on StackDepotBenchmarkSuite
Depends on D114464.
Reviewed By: morehouse
Differential Revision: https://reviews.llvm.org/D114488
Sometimes stacks for at_exit callbacks don't include any of the user functions/files.
For example, a race with a global std container destructor will only contain
the container type name and our at_exit_wrapper function, with no sign of what global variable
this is.
Remember and include in reports the function that installed the at_exit callback.
This should give clues as to what variable is being destroyed.
Depends on D114606.
Reviewed By: vitalybuka, melver
Differential Revision: https://reviews.llvm.org/D114607
Add a test for a common C++ bug when a global object is destroyed
while background threads still use it.
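A self-contained illustration of the bug class (not the exact lit test):
```
#include <thread>
#include <vector>

std::vector<int> global(1, 0);

int main() {
  // Detached background thread keeps touching the global while static
  // destructors run after main returns; tsan reports the race with ~vector().
  std::thread background([] {
    for (;;) global[0]++;
  });
  background.detach();
  return 0;
}
```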
Depends on D114604.
Reviewed By: vitalybuka, melver
Differential Revision: https://reviews.llvm.org/D114605
[NFC] As part of using inclusive language within the llvm project, this patch
replaces master and slave with primary and secondary respectively in
`sanitizer_mac.cpp`.
Reviewed By: ZarkoCA
Differential Revision: https://reviews.llvm.org/D114255
This change switches tsan to the new runtime which features:
- 2x smaller shadow memory (2x of app memory)
- faster fully vectorized race detection
- small fixed-size vector clocks (512b)
- fast vectorized vector clock operations
- unlimited number of alive threads/goroutines
Depends on D112602.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D112603
Linux/fork_deadlock.cpp currently hangs in debug mode in the following stack.
Disable memory access handling in OnUserAlloc/Free around fork.
1 0x000000000042c54b in __sanitizer::internal_sched_yield () at sanitizer_linux.cpp:452
2 0x000000000042da15 in __sanitizer::StaticSpinMutex::LockSlow (this=0x57ef02 <__sanitizer::internal_allocator_cache_mu>) at sanitizer_mutex.cpp:24
3 0x0000000000423927 in __sanitizer::StaticSpinMutex::Lock (this=0x57ef02 <__sanitizer::internal_allocator_cache_mu>) at sanitizer_mutex.h:32
4 0x000000000042354c in __sanitizer::GenericScopedLock<__sanitizer::StaticSpinMutex>::GenericScopedLock (this=this@entry=0x7ffcabfca0b8, mu=0x1) at sanitizer_mutex.h:367
5 0x0000000000423653 in __sanitizer::RawInternalAlloc (size=size@entry=72, cache=cache@entry=0x0, alignment=8, alignment@entry=0) at sanitizer_allocator.cpp:52
6 0x00000000004235e9 in __sanitizer::InternalAlloc (size=size@entry=72, cache=0x1, cache@entry=0x0, alignment=4, alignment@entry=0) at sanitizer_allocator.cpp:86
7 0x000000000043aa15 in __sanitizer::SymbolizedStack::New (addr=4802655) at sanitizer_symbolizer.cpp:45
8 0x000000000043b353 in __sanitizer::Symbolizer::SymbolizePC (this=0x7f578b77a028, addr=4802655) at sanitizer_symbolizer_libcdep.cpp:90
9 0x0000000000439dbe in __sanitizer::(anonymous namespace)::StackTraceTextPrinter::ProcessAddressFrames (this=this@entry=0x7ffcabfca208, pc=4802655) at sanitizer_stacktrace_libcdep.cpp:36
10 0x0000000000439c89 in __sanitizer::StackTrace::PrintTo (this=this@entry=0x7ffcabfca2a0, output=output@entry=0x7ffcabfca260) at sanitizer_stacktrace_libcdep.cpp:109
11 0x0000000000439fe0 in __sanitizer::StackTrace::Print (this=0x18) at sanitizer_stacktrace_libcdep.cpp:132
12 0x0000000000495359 in __sanitizer::PrintMutexPC (pc=4802656) at tsan_rtl.cpp:774
13 0x000000000042e0e4 in __sanitizer::InternalDeadlockDetector::Lock (this=0x7f578b1ca740, type=type@entry=2, pc=pc@entry=4371612) at sanitizer_mutex.cpp:177
14 0x000000000042df65 in __sanitizer::CheckedMutex::LockImpl (this=<optimized out>, pc=4) at sanitizer_mutex.cpp:218
15 0x000000000042bc95 in __sanitizer::CheckedMutex::Lock (this=0x600001000000) at sanitizer_mutex.h:127
16 __sanitizer::Mutex::Lock (this=0x600001000000) at sanitizer_mutex.h:165
17 0x000000000042b49c in __sanitizer::GenericScopedLock<__sanitizer::Mutex>::GenericScopedLock (this=this@entry=0x7ffcabfca370, mu=0x1) at sanitizer_mutex.h:367
18 0x000000000049504f in __tsan::TraceSwitch (thr=0x7f578b1ca980) at tsan_rtl.cpp:656
19 0x000000000049523e in __tsan_trace_switch () at tsan_rtl.cpp:683
20 0x0000000000499862 in __tsan::TraceAddEvent (thr=0x7f578b1ca980, fs=..., typ=__tsan::EventTypeMop, addr=4499472) at tsan_rtl.h:624
21 __tsan::MemoryAccessRange (thr=0x7f578b1ca980, pc=4499472, addr=135257110102784, size=size@entry=16, is_write=true) at tsan_rtl_access.cpp:563
22 0x000000000049853a in __tsan::MemoryRangeFreed (thr=thr@entry=0x7f578b1ca980, pc=pc@entry=4499472, addr=addr@entry=135257110102784, size=16) at tsan_rtl_access.cpp:487
23 0x000000000048f6bf in __tsan::OnUserFree (thr=thr@entry=0x7f578b1ca980, pc=pc@entry=4499472, p=p@entry=135257110102784, write=true) at tsan_mman.cpp:260
24 0x000000000048f61f in __tsan::user_free (thr=thr@entry=0x7f578b1ca980, pc=4499472, p=p@entry=0x7b0400004300, signal=true) at tsan_mman.cpp:213
25 0x000000000044a820 in __interceptor_free (p=0x7b0400004300) at tsan_interceptors_posix.cpp:708
26 0x00000000004ad599 in alloc_free_blocks () at fork_deadlock.cpp:25
27 __tsan_test_only_on_fork () at fork_deadlock.cpp:32
28 0x0000000000494870 in __tsan::ForkBefore (thr=0x7f578b1ca980, pc=pc@entry=4904437) at tsan_rtl.cpp:510
29 0x000000000046fcb4 in syscall_pre_fork (pc=1) at tsan_interceptors_posix.cpp:2577
30 0x000000000046fc9b in __sanitizer_syscall_pre_impl_fork () at sanitizer_common_syscalls.inc:3094
31 0x00000000004ad5f5 in myfork () at syscall.h:9
32 main () at fork_deadlock.cpp:46
Depends on D114595.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D114597
We currently use a wrong value for the heap block
(it only works for C++, but not for Java).
Use the correct value (we already computed it before, just forgot to use it).
Depends on D114593.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D114595
Add a basic test that checks races between vector/non-vector
read/write accesses of different sizes/offsets in different orders.
This gives coverage of __tsan_read/write16 callbacks.
Depends on D114591.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D114592
Vector SSE accesses make compiler emit __tsan_[unaligned_]read/write16 callbacks.
Make it possible to test these.
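For illustration (assumed to match the kind of access the test exercises, not the exact test source), a 16-byte SSE load/store compiled with -fsanitize=thread is instrumented with __tsan_read16/__tsan_write16 (or the unaligned variants):
```
#include <emmintrin.h>

void copy16(const void *src, void *dst) {
  __m128i v = _mm_loadu_si128(static_cast<const __m128i *>(src));
  _mm_storeu_si128(static_cast<__m128i *>(dst), v);
}
```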
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D114591
Building cfi with recent clang on a 64-bit system results in the
following warnings:
compiler-rt/lib/cfi/cfi.cpp:233:64: warning: format specifies type 'void *' but the argument has type '__sanitizer::uptr' (aka 'unsigned long') [-Wformat]
VReport(1, "Can not handle: symtab > strtab (%p > %zx)\n", symtab, strtab);
~~ ^~~~~~
%lu
compiler-rt/lib/sanitizer_common/sanitizer_common.h:231:46: note: expanded from macro 'VReport'
if ((uptr)Verbosity() >= (level)) Report(__VA_ARGS__); \
^~~~~~~~~~~
compiler-rt/lib/cfi/cfi.cpp:253:59: warning: format specifies type 'void *' but the argument has type '__sanitizer::uptr' (aka 'unsigned long') [-Wformat]
VReport(1, "Can not handle: symtab %p, strtab %zx\n", symtab, strtab);
~~ ^~~~~~
%lu
compiler-rt/lib/sanitizer_common/sanitizer_common.h:231:46: note: expanded from macro 'VReport'
if ((uptr)Verbosity() >= (level)) Report(__VA_ARGS__); \
^~~~~~~~~~~
Since `__sanitizer::uptr` has the same size as `size_t`, consistently
use `%z` as a printf specifier.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D114466
Now that we lock the internal allocator around fork,
it's possible it will create additional deadlocks.
Add a fake mutex that substitutes the internal allocator
for the purposes of deadlock detection.
Depends on D114531.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D114532
There is a small chance that the internal allocator is locked
during fork, and then the new process is created with a locked
internal allocator and any attempt to use it will deadlock.
For example, if we detected a suppressed race in the parent during fork
and then another suppressed race after the fork.
This becomes much more likely with the new tsan runtime
as it uses the internal allocator for more things.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D114531
The test tries to provoke the internal allocator into being locked during fork
and then forces the child process to use the internal allocator.
This test sometimes deadlocks with the new tsan runtime.
Depends on D114514.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D114515
It was introduced in:
9cffc9550b tsan: allow to force use of __libc_malloc in sanitizer_common
and used in:
512a18e518 tsan: add standalone deadlock detector
and later used for Go support.
But now both uses are gone. Nothing defines SANITIZER_USE_MALLOC.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D114514
Test a size larger than clear_shadow_mmap_threshold,
which is handled differently.
Depends on D114348.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D114366
Nothing uses more than 8 bits now, so the rest of the headers can store other data.
kStackTraceMax is 256 now, but all sanitizers by default store just 20-30 frames here.
Verified that no diff exists between the previous version, the new version on Python 2, and Python 3 for an example stack.
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D114404
The INSTR_PROF_RAW_MAGIC_* number in profraw files should match during
profile merging. This causes an error with 32-bit and 64-bit variants
of the same code: the module signatures for the two binaries are
identical, but they use different INSTR_PROF_RAW_MAGIC_* values, causing a
failure when profile merging is used. Including the magic when computing the
module signature yields different signatures for the 32-bit and 64-bit
profiles.
Differential Revision: https://reviews.llvm.org/D114054
This change switches tsan to the new runtime which features:
- 2x smaller shadow memory (2x of app memory)
- faster fully vectorized race detection
- small fixed-size vector clocks (512b)
- fast vectorized vector clock operations
- unlimited number of alive threads/goroutines
Differential Revision: https://reviews.llvm.org/D112603
We dropped the printing of live on exit blocks in rG1243cef245f6 -
the commit changed the insertOrMerge logic. Remove the message since it
is no longer needed (all live blocks are inserted into the hashmap)
before serializing/printing the profile. Furthermore, the original
intent was to capture evicted blocks so it wasn't entirely correct.
Also update the binary format test invocation to remove the redundant
print_text directive now that it is the default.
Differential Revision: https://reviews.llvm.org/D114285
MemProfUnitTest now depends on libcxx, but the dependency is not
explicitly expressed in the build system, causing build races. This patch
addresses this issue.
Differential Revision: https://reviews.llvm.org/D114267
This change switches tsan to the new runtime which features:
- 2x smaller shadow memory (2x of app memory)
- faster fully vectorized race detection
- small fixed-size vector clocks (512b)
- fast vectorized vector clock operations
- unlimited number of alive threads/goroutines
Depends on D112602.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D112603
All runtime callbacks must be non-instrumented with the new tsan runtime
(it's now more picky with respect to recursion into runtime).
Disable instrumentation in Darwin tests as we do in all other tests now.
Differential Revision: https://reviews.llvm.org/D114348
Add a fork test that models what happens on Mac
where fork calls malloc/free inside of our atfork
callbacks.
Reviewed By: vitalybuka, yln
Differential Revision: https://reviews.llvm.org/D114250
Using the out-of-line LSE atomics helpers for AArch64 on FreeBSD also
requires adding support for initializing __aarch64_have_lse_atomics
correctly. On Linux this is done with getauxval(3), on FreeBSD with
elf_aux_info(3), which has a slightly different interface.
Differential Revision: https://reviews.llvm.org/D109330
We need libfuzzer libraries on Arm32 so that we can fuzz
Arm32 binaries on Linux (Chrome OS). Android already
allows Arm32 for libfuzzer.
Reviewed By: morehouse
Differential Revision: https://reviews.llvm.org/D112091
pthread_join needs to map the pthread_t of the joined thread to our Tid.
Currently we do this with a linear search over all threads.
This has quadratic complexity and becomes much worse with the new
tsan runtime, which remembers all threads that ever existed.
To resolve this, add a hash map of live threads only (those still
associated with a pthread_t) and use it for the mapping.
With the new tsan runtime some programs spent 1/3 of their time in this mapping.
After this change the mapping disappears from profiles.
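A minimal sketch of the idea (standard containers and assumed names stand in for the runtime's internal map):
```
#include <mutex>
#include <pthread.h>
#include <unordered_map>

using Tid = int;

static std::mutex live_mu;
static std::unordered_map<pthread_t, Tid> live_threads;  // live, joinable threads only

void OnThreadCreate(pthread_t pt, Tid tid) {
  std::lock_guard<std::mutex> lock(live_mu);
  live_threads[pt] = tid;
}

Tid OnThreadJoin(pthread_t pt) {
  std::lock_guard<std::mutex> lock(live_mu);
  auto it = live_threads.find(pt);
  Tid tid = (it == live_threads.end()) ? -1 : it->second;
  live_threads.erase(pt);  // the pthread_t may be reused after join
  return tid;
}
```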
Depends on D113996.
Reviewed By: vitalybuka, melver
Differential Revision: https://reviews.llvm.org/D113997
The __flags variable needs to be of type 'long' in order to get sign extended
properly.
internal_clone() uses an svc (Supervisor Call) directly (as opposed to
internal_syscall), and therefore needs to take care to set errno and return
-1 as needed.
Review: Ulrich Weigand
asan does not use user_id for anything,
so don't pass it to ThreadCreate.
Passing a random uninitialized field of AsanThread
as user_id does not make much sense anyway.
Depends on D113921.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D113922
memprof does not use user_id for anything,
so don't pass it to ThreadCreate.
Passing a random field of MemprofThread as user_id
does not make much sense anyway.
Depends on D113920.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D113921
They don't seem to do anything useful in lsan.
They are needed only if a tool needs to execute
some custom logic during detach/join, or if it uses
thread registry quarantine. Lsan does none of this.
And if a tool cares then it would also need to intercept
pthread_tryjoin_np and pthread_timedjoin_np, otherwise
it will mess up thread states.
Fwiw, asan does not intercept these functions either.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D113920
Since libclang_rt.profile is added later in the command line, a
definition of __llvm_profile_raw_version is not included if it is
provided from an earlier object, e.g. from a shared dependency.
This causes an extra dependence edge where if libA.so depends on libB.so
and both are coverage-instrumented, libA.so uses libB.so's definition of
__llvm_profile_raw_version. This leads to a runtime link failure if the
libB.so available at runtime does not provide this symbol (but provides
the other dependent symbols). Such a scenario can occur in Android's
mainline modules.
E.g.:
ld -o libB.so libclang_rt.profile-x86_64.a
ld -o libA.so -l B libclang_rt.profile-x86_64.a
libB.so has a global definition of __llvm_profile_raw_version. libA.so
uses libB.so's definition of __llvm_profile_raw_version. At runtime,
libB.so may not be coverage-instrumented (i.e. not export
__llvm_profile_raw_version) so runtime linking of libA.so will fail.
Marking this symbol as hidden forces each binary to use the definition
of __llvm_profile_raw_version from libclang_rt.profile. The visibility
is unchanged for Apple platforms where its presence is checked by the
TAPI tool.
Reviewed By: MaskRay, phosek, davidxl
Differential Revision: https://reviews.llvm.org/D111759
The new test started failing on bots with:
CHECK failed: tsan_rtl.cpp:327 "((addr + size)) <= ((TraceMemEnd()))"
(0xf06200e03010, 0xf06200000000) (tid=4073872)
https://lab.llvm.org/buildbot#builders/179/builds/1761
This is a latent bug in the aarch64 virtual address space layout:
there is not enough address space to fit traces for all threads.
But since the trace space is going away with the new tsan runtime
(D112603), disable the test.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D113990
Use of gethostent provokes caching of some resources inside of libc.
They are freed in __libc_thread_freeres very late in thread lifetime,
after our ThreadFinish. __libc_thread_freeres calls free which
previously crashed in malloc hooks.
Fix it by setting ignore_interceptors for finished threads,
which in turn prevents malloc hooks.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D113989
Precisely specifying the unused parts of the bitfield is critical for
performance. If we don't specify them, the compiler will generate code to load
the old value and shuffle it to extract the unused bits to apply to the new
value. If we specify the unused part and store 0 in there, all that
unnecessary code goes away (store of the 0 const is combined with other
constant parts).
I don't see a good way to ensure we cover all of u64 bits with fields.
So at least introduce named kUnusedBits consts and check that bits
sum up to 64.
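Illustration with made-up field names and widths: spelling out the unused bits (and always storing 0 there) lets the compiler emit a plain constant store instead of a load/mask/merge of the old word, and the static_assert checks full coverage of the 64 bits:
```
#include <cstdint>

constexpr unsigned kTypeBits = 16, kAddrBits = 40, kSizeBits = 4;
constexpr unsigned kUnusedBits = 64 - (kTypeBits + kAddrBits + kSizeBits);
static_assert(kTypeBits + kAddrBits + kSizeBits + kUnusedBits == 64,
              "all 64 bits are accounted for");

struct Event {
  uint64_t type : kTypeBits;
  uint64_t addr : kAddrBits;
  uint64_t size : kSizeBits;
  uint64_t unused : kUnusedBits;  // always written as 0
};
```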
Depends on D113978.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D113979
In the old runtime we used to use a different number of trace parts
for C++ and Go to reduce trace memory consumption for Go.
But now it's easier and better to use smaller parts because
we already use the minimal possible number of parts for C++ (3).
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D113978
man pthread_equal:
The pthread_equal() function is necessary because thread IDs
should be considered opaque: there is no portable way for
applications to directly compare two pthread_t values.
Depends on D113916.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D113919
pthread_setname_np does a linear search over all thread descriptors
to map a pthread_t to the thread descriptor. This has O(N^2) complexity
and becomes much worse in the new tsan runtime, which keeps all threads
that ever existed in the thread registry.
Replace the linear search with direct access if pthread_setname_np
is called for the current thread (a very common case).
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D113916
We used to mmap the C++ shadow stack as part of the trace region
before ed7f3f5bc9 ("tsan: move shadow stack into ThreadState"),
which moved the shadow stack into TLS. This started causing
timeouts and OOMs on some of our internal tests that repeatedly
create and destroy thousands of threads.
Allocate the C++ shadow stack with mmap and small pages again.
This prevents the observed timeouts and OOMs.
But we now need to be more careful with interceptors that
run after thread finalization because FuncEntry/Exit and
TraceAddEvent all need the shadow stack.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D113786
Similar to how the other swift sections are registered by the ORC
runtime's macho platform, add the __swift5_types section, which contains
type metadata. Add a simple test that demonstrates that the swift
runtime recognized the registered types.
rdar://85358530
Differential Revision: https://reviews.llvm.org/D113811
Since glibc 2.34, dlsym does
1. malloc 1
2. malloc 2
3. free pointer from malloc 1
4. free pointer from malloc 2
This sequence was not handled by the trivial dlsym hack.
This fixes https://bugs.llvm.org/show_bug.cgi?id=52278
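For context, a sketch of the general shape of such a hack (assumed, not the actual implementation): early allocations are served from a small static pool, and free() of pool pointers is a no-op, so any interleaving of allocations and frees performed by dlsym before interceptors are ready is tolerated:
```
#include <cstddef>

namespace {
alignas(16) char pool[1024];
size_t pool_used = 0;

void *PoolAllocate(size_t size) {
  size = (size + 15) & ~size_t(15);
  if (pool_used + size > sizeof(pool)) return nullptr;
  void *res = pool + pool_used;
  pool_used += size;
  return res;
}

bool PointerIsFromPool(const void *p) {
  const char *c = static_cast<const char *>(p);
  return c >= pool && c < pool + sizeof(pool);
}

void PoolFree(void *p) {
  (void)p;  // pool memory is never reused; a few leaked bytes at startup are fine
}
}  // namespace
```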
Reviewed By: eugenis, morehouse
Differential Revision: https://reviews.llvm.org/D112588
The check was failing because it was matching against the end of the range, not
the start.
This bug wasn't causing the ORC-RT MachO TLV regression test to fail because
we were only logging deallocation errors (including TLV deregistration errors)
and not actually returning a failure code. This commit updates llvm-jitlink to
report the errors properly.
This change switches tsan to the new runtime which features:
- 2x smaller shadow memory (2x of app memory)
- faster fully vectorized race detection
- small fixed-size vector clocks (512b)
- fast vectorized vector clock operations
- unlimited number of alive threads/goroutines
Depends on D112602.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D112603
Start the background thread only after fork, but not after clone.
For fork we did this always and it's known to work (or user code has adapted).
But if we do this for the new clone interceptor, some code (sandbox2) fails.
So keep the model we used for years and don't start the background thread after clone.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D113744
The compiler does not recognize HACKY_CALL as a call
(we intentionally hide it from the compiler so that it can
compile non-leaf functions as leaf functions).
To compensate for that, the hacky call thunk saves and restores
all caller-saved registers. However, it saves only
general-purpose registers and does not save XMM registers.
This is a latent bug that was masked up until a recent "NFC" commit
d736002e90 ("tsan: move memory access functions to a separate file"),
which allowed more inlining and exposed the 10-year bug.
Save and restore caller-saved XMM registers (all) as well.
Currently the bug manifests as, e.g., the frexp interceptor messing up the
return value, and the added test fails with:
i=8177 y=0.000000 exp=4
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D113742
Some compiler-rt tests are inherently incompatible with VE because:
* No consistent denormal support on VE. We skip denormal fp inputs in builtin tests.
* `madvise` unsupported on VE.
* Instruction alignment requirements.
Reviewed By: phosek
Differential Revision: https://reviews.llvm.org/D113093
Set the default memprof serialization format to binary. 9 tests are
updated to use print_text=true. Also fixed an issue with concatenation
of default and test-specified options (missing separator).
Differential Revision: https://reviews.llvm.org/D113617
This change implements the raw binary format discussed in
https://lists.llvm.org/pipermail/llvm-dev/2021-September/153007.html
Summary of changes
* Add a new memprof option to choose binary or text (default) format.
* Add a rawprofile library which serializes the MIB map to profile.
* Add a unit test for rawprofile.
* Mark sanitizer procmaps methods as virtual to be able to mock them.
* Extend memprof_profile_dump regression test.
Differential Revision: https://reviews.llvm.org/D113317
The existing implementation uses a cache + eviction based scheme to
record heap profile information. This design was adopted to ensure a
constant memory overhead (due to fixed number of cache entries) along
with incremental write-to-disk for evictions. We find that since the
number of entries to track is O(unique-allocation-contexts), the overhead
of keeping all contexts in memory is not very high. On a clang workload,
the max number of unique allocation contexts was ~35K, median ~11K.
For each context, we (currently) store 64 bytes of data - this amounts
to 5.5MB (max). Given the low overheads for a complex workload, we can
simplify the implementation by using a hashmap without eviction.
Other changes:
* Memory map is dumped at the end rather than at startup. The relative
order in the profile dump is unchanged since we no longer have evicted
entries at runtime.
* Added a test to check meminfoblocks are merged.
Differential Revision: https://reviews.llvm.org/D111676
Move the memprof MemInfoBlock struct to its own header as requested
during the review of D111676.
Differential Revision: https://reviews.llvm.org/D113315
This change adds a ForEach method to the AddrHashMap class which can
then be used to iterate over all the key value pairs in the hash map.
I intend to use this in an upcoming change to the memprof runtime.
Added a unit test to cover basic insertion and the ForEach callback.
Differential Revision: https://reviews.llvm.org/D111368
Clone does not exist on Mac.
There is a chance it will break on other OSes.
Enable it incrementally, starting with Linux only;
other OSes can enable it later as needed.
Reviewed By: melver, thakis
Differential Revision: https://reviews.llvm.org/D113693
gtest uses clone for death tests and it needs the same
handling as fork to prevent deadlock (take runtime mutexes
before and release them after).
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D113677
Currently, SANITIZER_COMMON_SUPPORTED_OS is being used to enable many libraries.
Unfortunately this makes it impossible to selectively disable a library based on the OS.
This patch removes this limitation by adding a separate list of supported OSs for the lsan, ubsan, ubsan_minimal, and stats libraries.
Reviewed By: delcypher
Differential Revision: https://reviews.llvm.org/D113444
Entropic scheduling with the exec-time option can be misled if inputs
on the right path to becoming crashing inputs accidentally take more
time to execute before they are added to the corpus. This patch, by letting
more of such inputs be added to the corpus (four inputs of size 7 to 10,
instead of a single input of size 2), reduces the chance of being
influenced by timing flakiness.
A longer-term fix could be to reduce timing flakiness in the fuzzer;
one way could be to execute inputs multiple times and take the average of
their execution times before they are added to the corpus.
Reviewed By: morehouse
Differential Revision: https://reviews.llvm.org/D113544
If an async signal handler is called while we are in MsanThread::Init,
the signal handler may trigger false reports.
I failed to reproduce this locally for a test.
Differential Revision: https://reviews.llvm.org/D113328
Add tracing for loads and stores.
The primary goal is to have more options for data-flow-guided fuzzing,
i.e. use data flow insights to perform better mutations or more aggressive corpus expansion.
But the feature is general purpose and could be used for other things too.
Pipe the flag through clang and the clang driver, same as for the other SanitizerCoverage flags.
While at it, change some plain arrays into std::array.
Tests: clang flags test, LLVM IR test, compiler-rt executable test.
Reviewed By: morehouse
Differential Revision: https://reviews.llvm.org/D113447
Apply the same rules as for the LLVM_BUILD_INSTRUMENTED build in the cmake files.
With this patch, we are able to enable/disable the instrumented + coverage build
of the compiler-rt project when building instrumented LLVM.
Differential Revision: https://reviews.llvm.org/D108127
getting the TLS base address: unlike Linux arm64, tpidr_el0 always returns 0 (i.e. it is unused),
so use tpidrro_el0 instead, clearing the CPU id encoded in the lower bits.
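A hedged sketch of the resulting read (the low-bit mask is an assumption):
```
#if defined(__APPLE__) && defined(__aarch64__)
static inline unsigned long GetThreadPointer() {
  unsigned long tp;
  __asm__ volatile("mrs %0, tpidrro_el0" : "=r"(tp));
  return tp & ~0x07ul;  // strip the CPU id kept in the low bits (assumption)
}
#endif
```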
Reviewed-By: yln
Differential Revision: https://reviews.llvm.org/D112866
D98452 introduced a mismatch with clang's expectations for the
builtins library name for baremetal targets on arm. Fix it by
adding a case for baremetal. This now matches the output of
"clang -target armv7m-none-eabi -print-libgcc-file-name \
-rtlib=compiler-rt"
Reviewed By: mstorsjo
Differential Revision: https://reviews.llvm.org/D113357
https://sourceware.org/bugzilla/show_bug.cgi?id=22742
uc_mcontext.__reserved probably should not be considered user-visible API, but
unfortunately it is: it is the only way to access the CPU state of some Linux
asm/sigcontext.h extensions. That said, the declaration may be
long double __reserved[256]; (used by musl)
instead of
unsigned char __reserved[4096] __attribute__((__aligned__(16))); (glibc)
to avoid dependency on a GNU variable attribute.
GCC introduced `__attribute__((mode(unwind_word)))` to work around
Cell Broadband Engine SPU (which was removed from GCC in 2019-09),
which is irrelevant to hwasan.
_Unwind_GetGR/_Unwind_GetCFA from llvm-project/libunwind don't use unwind_word.
Using _Unwind_Word can lead to build failures if libunwind's unwind.h is
preferred over unwind.h in the Clang resource directory (e.g. built with GCC).
Turning on the `enable_noundef_analysis` flag allows better codegen by removing freeze instructions.
I modified clang by renaming the `enable_noundef_analysis` flag to `disable-noundef-analysis` and turning it off by default.
Test updates are made as a separate patch: D108453
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D105169
[Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default (2)
This patch updates test files after D105169.
Autogenerated test code is changed by `utils/update_cc_test_checks.py`, and non-autogenerated test code is changed as follows:
(1) I wrote a python script that (partially) updates the tests using regex: {F18594904} The script is not perfect, but I believe it gives hints about which patterns are updated to have `noundef` attached.
(2) The remaining tests are updated manually.
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D108453
Resolve lit failures in clang after 8ca4b3e's land
Fix lit test failures in clang-ppc* and clang-x64-windows-msvc
Fix missing failures in clang-ppc64be* and retry fixing clang-x64-windows-msvc
Fix internal_clone(aarch64) inline assembly
Turning on the `enable_noundef_analysis` flag allows better codegen by removing freeze instructions.
I modified clang by renaming the `enable_noundef_analysis` flag to `disable-noundef-analysis` and turning it off by default.
Test updates are made as a separate patch: D108453
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D105169
It's called from ATTRIBUTE_NO_SANITIZE_MEMORY code.
It worked as expected if inlined and complained otherwise.
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D113323
Fixes:
sanitizer_stacktrace.h:212:5: warning: ISO C++ forbids braced-groups within expressions [-Wpedantic]
Differential Revision: https://reviews.llvm.org/D113292
Read each pointer in the argv and envp arrays before dereferencing
it; this correctly marks an error when these pointers point into
memory that has been freed.
Differential Revision: https://reviews.llvm.org/D113046
As pointed out in Bug 52371, the Solaris version of
`MemoryMappingLayout::Next` completely failed to handle `readlink` errors
or properly NUL-terminate the result.
This patch fixes this. Originally provided in the PR with slight
formatting changes.
Tested on `amd64-pc-solaris2.11`.
Differential Revision: https://reviews.llvm.org/D112998
I recently spent some extra time debugging a false positive because I
didn't realize the "real" tag was in the short granule. Adding the
short tag here makes it more obvious that we could be dealing with a
short granule.
Reviewed By: hctim, eugenis
Differential Revision: https://reviews.llvm.org/D112949
Previously we only applied it to the first one, which could allow
subsequent global tags to exceed the valid number of bits.
Reviewed By: hctim
Differential Revision: https://reviews.llvm.org/D112853
This is a NOOP on x86_64.
On aarch64 it avoids a Data Memory Barrier, with visible improvements on micro benchmarks.
Reviewed By: dvyukov
Differential Revision: https://reviews.llvm.org/D112391
MachOPlatform used to make an EPC-call (registerObjectSections) to register the
eh-frame and thread-data sections for each linked object with the ORC runtime.
Now that JITLinkMemoryManager supports allocation actions we can use these
instead of an EPC call. This saves us one EPC-call per object linked, and
manages registration/deregistration in the executor, rather than the controller
process. In the future we may use this to allow JIT'd code in the executor to
outlive the controller object while still being able to be cleanly destroyed.
Since the code for allocation actions must be available when the actions are
run, and since the eh-frame registration code lives in the ORC runtime itself,
this change required that MachO eh-frame support be split out of
macho_platform.cpp and into its own macho_ehframe_registration.cpp file that has
no other dependencies. During bootstrap we start by forcing emission of
macho_ehframe_registration.cpp so that eh-frame registration is guaranteed to be
available for the rest of the bootstrap process. Then we load the rest of the
MachO-platform runtime support, erroring out if there is any attempt to use
TLVs. Once the bootstrap process is complete all subsequent code can use all
features.
The ParseUnixMemoryProfile function is defined only for a subset
of platforms. Define the test for the same set of platforms.
Also disable the test for 32-bit platforms b/c the pointer
values used in the test are 64-bit and don't fit into 32-bit uptr.
Reported-by: Jan Svoboda (jansvoboda11)
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D112815
The new `Posix/mmap_write_exec.cpp` test FAILs on 32-bit Solaris/x86. This
happens because only `mmap` is intercepted, but not `mmap64` which is used
for largefile support.
Fixed by also intercepting `mmap64`.
Tested on `amd64-pc-solaris2.11`.
Differential Revision: https://reviews.llvm.org/D112810
I am hitting some cases where /proc/self/maps does not fit into 64MB.
256MB is lots of memory, but it's not radically more than the current 64MB.
Ideally we should read/parse these huge files incrementally,
but that's lots of work for a debugging/introspection interface.
So for now just bump the limit.
Depends on D112793.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D112794
ParseUnixMemoryProfile assumes well-formed input with \n at the end, etc.
It can over-read the input and crash on basically every line
in the case of malformed input.
ReadFileToBuffer caps the max file size (64MB) and returns
truncated contents if the file is larger. Thus even if the kernel behaves,
ParseUnixMemoryProfile crashes on a too-large /proc/self/smaps.
Fix input over-reading in ParseUnixMemoryProfile.
Depends on D112792.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D112793
The stats_size argument is unnecessary in GetMemoryProfile and in the callback.
It just clutters the code. The callback knows how many stats to expect.
Depends on D112789.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D112790
Move parsing of /proc/self/smaps into a separate function
so that it can be tested.
Depends on D112788.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D112789
D112630 ("sanitizer_common: fix up onprint.cpp test")
added O_CREAT, but we also need O_TRUNC b/c the file
may not exist, or may already exist with stale contents.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D112788
WrapperFunctionCall represents a call to a wrapper function as a pair of a
target function (as an ExecutorAddr), and an argument buffer range (as an
ExecutorAddrRange). WrapperFunctionCall instances can be serialized via
SPS to send to remote machines (only the argument buffer address range is
copied, not any buffer content).
This utility will simplify the implementation of JITLinkMemoryManager
allocation actions in the ORC runtime.
tsan_rtl.cpp is huge and does lots of things.
Move everything related to memory access and tracing
to a separate tsan_rtl_access.cpp file.
No functional changes, only code movement.
Reviewed By: vitalybuka, melver
Differential Revision: https://reviews.llvm.org/D112625
Add `__c11_atomic_fetch_nand` builtin to language extensions and support `__atomic_fetch_nand` libcall in compiler-rt.
Reviewed By: theraven
Differential Revision: https://reviews.llvm.org/D112400
There are a lot of duplicated calls to find various compiler-rt libraries
from the build of runtime libraries like libunwind, libc++, libc++abi and
compiler-rt. The compiler-rt helper module already implemented caching
of results to avoid repeated Clang invocations.
This change moves the compiler-rt implementation into a shared location
and reuses it from other runtimes to reduce duplication and speed up
the build.
Differential Revision: https://reviews.llvm.org/D88458
We were writing a pointer to a selector string into the contents of a
string instead of overwriting the pointer to the string, leading to
corruption. This was causing non-deterministic failures of the
'trivial-objc-methods' test case.
Differential Revision: https://reviews.llvm.org/D112671
Enables the arm64 MachO platform, adds basic tests, and implements the
missing TLV relocations and runtime wrapper function. The TLV
relocations are just handled as GOT accesses.
rdar://84671534
Differential Revision: https://reviews.llvm.org/D112656
Commit D112602 ("sanitizer_common: tighten on_print hook test")
changed fopen to open in this test. fopen created the file
if it does not exist, but open does not. This went unnoticed
during local testing because lit is not hermetic and reuses
files from previous runs, but it started failing on bots.
Fix the open call.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D112630
The new tsan runtime does not support arbitrary forms
of recursing into the runtime from hooks.
Disable instrumentation of the hook and use write instead
of fwrite (which calls malloc internally).
The new version still recurses (write is intercepted),
but does not fail now (the issue at hand was malloc).
Depends on D112601.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D112602
Gtest's EXPECT calls a whole lot of libc functions
(mem*, malloc) even when EXPECT does not fail.
This does not play well with tsan runtime unit tests
b/c e.g. we call some EXPECTs with runtime mutexes locked.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D112601
Don't leak caller_pc var from the macro
(it's not supposed to be used by interceptors).
Use UNUSED instead of (void) cast.
Depends on D112540.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D112541
If the real function is not intercepted,
we are going to crash one way or another.
The question is just in the failure mode:
error message vs NULL deref. But the message
costs us a check in every interceptor, and these checks
have not been observed to fail in real life
for a long time; other sanitizers don't
have this check either (they also crash on
a NULL deref if that happens).
Remove the check from non-debug mode.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D112540
This is a test-only failure. The test wrongly assumes that this gets us
a tagged pointer:
```
NSObject* num1 = @7;
assert(isTaggedPtr(num1));
```
However, on newer deployment targets that have “const data support” we
get a “normal” pointer to constant object.
Radar-Id: rdar://problem/83217293
All tsan interceptors check for initialization and/or initialize things
as necessary lazily, so we can pretend everything is initialized in the
COMMON_INTERCEPTOR_NOTHING_IS_INITIALIZED check to avoid double-checking
for initialization (this is only necessary for sanitizers that don't
handle initialization on common grounds).
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D112446
Print PC of the previous lock, not the current one.
The current one will be printed during unwind.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D112533
In TSan, we use a function reference (`__tsan_stack_initialization`)
in a call to `StackTrace::GetNextInstructionPc(uptr pc)`. We sign
function pointers, so we need to strip the signature from this function
pointer.
Caused by: https://reviews.llvm.org/D111147
Radar-Id: rdar://problem/83940546
MutexSet is too large to be allocated on the stack.
But we need local MutexSet objects in a few places
and use various hacks to allocate them.
Add a DynamicMutexSet helper that simplifies allocation
of such objects.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D112449