Currently we effectively duplicate "once" logic for __cxa_guard_acquire
and pthread_once. Unify the implementations.
This is not a no-op change:
- constants used for pthread_once are changed to match __cxa_guard_acquire
(the __cxa_guard_acquire constants are tied to the ABI, but that does not seem
to be the case for pthread_once)
- pthread_once now also uses PotentiallyBlockingRegion annotations
- __cxa_guard_acquire checks thr->in_ignored_lib to skip user synchronization
It's unclear whether these two differences are intentional or just a sloppy inconsistency.
Since all tests still pass, let's assume the latter.
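For illustration, a minimal sketch of the shared state machine (hypothetical helper names and constants; the real code also emits the user-visible synchronization, blocking-region annotations, and futex-style waiting):

  #include <atomic>
  #include <sched.h>

  // Hypothetical states; the real __cxa_guard_acquire values are fixed by the ABI.
  enum : unsigned { kGuardInit = 0, kGuardDone = 1, kGuardRunning = 1 << 16 };

  // Returns true if the caller must run the initializer and then call
  // guard_release(); returns false once initialization is already done.
  inline bool guard_acquire(std::atomic<unsigned> *g) {
    for (;;) {
      unsigned cmp = g->load(std::memory_order_acquire);
      if (cmp == kGuardDone)
        return false;  // already initialized
      if (cmp == kGuardInit &&
          g->compare_exchange_weak(cmp, kGuardRunning, std::memory_order_relaxed))
        return true;   // this thread owns the initialization
      sched_yield();   // another thread is running the initializer
    }
  }

  inline void guard_release(std::atomic<unsigned> *g) {
    g->store(kGuardDone, std::memory_order_release);
  }

Both the __cxa_guard_acquire interceptor and the pthread_once interceptor can then be thin wrappers around the same two helpers.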
Reviewed By: vitalybuka, melver
Differential Revision: https://reviews.llvm.org/D107359
Atomic functions are semi-hot in profiles.
The CHECKs verify values passed by the compiler
and they have never fired, so replace them with DCHECKs.
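For reference, a simplified sketch of the difference (not the exact sanitizer_common definitions): CHECK fires in all builds, while DCHECK compiles away outside debug builds, so the hot atomic paths no longer pay for verifying values the compiler already guarantees.

  #include <cstdlib>

  // Simplified sketch: the real macros print a report before dying.
  #define CHECK(cond) \
    do { if (!(cond)) abort(); } while (0)
  #if SANITIZER_DEBUG
  #define DCHECK(cond) CHECK(cond)
  #else
  #define DCHECK(cond) do {} while (0)
  #endif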
Reviewed By: vitalybuka, melver
Differential Revision: https://reviews.llvm.org/D107373
Similar to qsort, bsearch can be called from non-instrumented
code in glibc. When that happens, the TLS used for the arguments can be in an
uninitialized state.
Unlike qsort, bsearch does not move data, so we don't need to
check or initialize the searched memory or the key. The instrumented
comparator will do that on its own.
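A hedged sketch of the idea (hypothetical names; the real interceptor reuses the existing qsort machinery): wrap the comparator so the argument TLS is reset before instrumented code runs.

  #include <stdlib.h>

  // Placeholder for the runtime's real reset of the argument shadow TLS.
  inline void ClearParamShadowTLS() {}

  namespace {
  typedef int (*cmp_fn)(const void *, const void *);
  thread_local cmp_fn real_cmp;  // comparator supplied by the user

  int cmp_wrapper(const void *a, const void *b) {
    // The non-instrumented glibc frame left the argument TLS in an
    // unspecified state; reset it before entering instrumented code.
    ClearParamShadowTLS();
    return real_cmp(a, b);
  }
  }  // namespace

  // Unlike qsort, we don't check or initialize `key` or `base` here;
  // the instrumented comparator does that on its own.
  void *bsearch_interceptor(const void *key, const void *base, size_t nmemb,
                            size_t size, cmp_fn cmp) {
    real_cmp = cmp;
    return bsearch(key, base, nmemb, size, cmp_wrapper);
  }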
Differential Revision: https://reviews.llvm.org/D107387
On macOS the unit tests currently rely on libmalloc being used for
allocations (since the interceptors are not functional there) while also
having the ASan/TSan allocator initialized in the same process.
This leads to crashes with the macOS 12.0 libmalloc nano allocator, so
disable use of the allocator while running unit tests as a workaround.
rdar://80086125
Differential Revision: https://reviews.llvm.org/D107412
A `Vector` that doesn't require an initial `reserve()` (e.g. with a
default or small enough capacity) can have a constant initializer.
This changes the code in a few places to make that possible:
- mark a few other functions as `constexpr`
- do without any `reinterpret_cast`
- allow skipping `reserve` in `init`
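A rough sketch of what this enables (simplified; the real class lives in sanitizer_common and uses the internal allocator rather than new/delete):

  #include <stddef.h>

  template <typename T>
  class Vector {
   public:
    constexpr Vector() : begin_(nullptr), end_(nullptr), last_(nullptr) {}

    void PushBack(const T &v) {
      if (end_ == last_)
        Grow();          // reserve happens lazily, not in the constructor
      *end_++ = v;
    }
    size_t Size() const { return end_ - begin_; }

   private:
    void Grow() {        // simplified; no reinterpret_cast needed
      size_t old = Size(), cap = old ? 2 * old : 4;
      T *p = new T[cap];
      for (size_t i = 0; i < old; i++) p[i] = begin_[i];
      delete[] begin_;
      begin_ = p; end_ = p + old; last_ = p + cap;
    }
    T *begin_, *end_, *last_;
  };

  // The constexpr, non-allocating constructor means a global instance gets a
  // constant initializer instead of a static initialization function:
  Vector<int> g_vec;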
Differential Revision: https://reviews.llvm.org/D107308
The mallopt calls are left over from the times when we used
__libc_malloc/__libc_free for internal allocations.
Now we have our own internal allocator, so this is not needed.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D107342
We currently use ad-hoc spin waiting to synchronize thread creation
and thread start both ways. But spinning tends to degrade ungracefully
under high contention (lots of threads created at the same time).
Use semaphores for synchronization instead.
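A minimal sketch of the handshake with POSIX semaphores (the runtime uses its own Semaphore type built on OS primitives, but the shape is the same): each side blocks in the kernel instead of spinning.

  #include <pthread.h>
  #include <semaphore.h>

  struct ThreadArgs {
    sem_t created;  // posted by the parent once registration is finished
    sem_t started;  // posted by the child once it is actually running
  };

  static void *ThreadStart(void *p) {
    ThreadArgs *args = static_cast<ThreadArgs *>(p);
    sem_wait(&args->created);  // block (not spin) until the parent is done
    sem_post(&args->started);  // tell the parent we are running
    // ... thread body ...
    return nullptr;
  }

  static void CreateAndStartThread(ThreadArgs *args) {
    sem_init(&args->created, 0, 0);
    sem_init(&args->started, 0, 0);
    pthread_t th;
    pthread_create(&th, nullptr, ThreadStart, args);
    // ... register the new thread in the runtime's structures ...
    sem_post(&args->created);
    sem_wait(&args->started);  // parked in the kernel under contention
    pthread_join(th, nullptr);
  }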
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D107337
Currently the unaligned access functions are defined in tsan_interface.cpp
and make a real call to MemoryAccess. This means we pay for a call
and get no read/write constant propagation.
Unaligned memory access can be quite hot for some programs
(observed on some compression algorithms with ~90% of unaligned accesses).
Move them to tsan_interface_inl.h to avoid the additional call
and enable constant propagation.
Also reorder the actual store and memory access handling for
__sanitizer_unaligned_store callbacks to enable tail calling
in MemoryAccess.
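Roughly, for one of the callbacks (a hedged sketch with stand-in declarations; the real code lives in tsan_interface_inl.h and uses the runtime's own types):

  // Stand-ins for the real runtime declarations.
  typedef unsigned long uptr;
  struct ThreadState;
  ThreadState *cur_thread();
  typedef uptr AccessType;
  enum : AccessType { kAccessWrite = 0 };
  void UnalignedMemoryAccess(ThreadState *thr, uptr pc, uptr addr, uptr size,
                             AccessType typ);
  #define CALLERPC ((uptr)__builtin_return_address(0))

  extern "C" void __sanitizer_unaligned_store32(void *addr, unsigned v) {
    __builtin_memcpy(addr, &v, sizeof(v));  // do the actual store first
    // Last statement: can become a tail call, and the constant size and
    // access type propagate into MemoryAccess once this is inlined.
    UnalignedMemoryAccess(cur_thread(), CALLERPC, (uptr)addr, sizeof(v),
                          kAccessWrite);
  }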
Depends on D107282.
Reviewed By: vitalybuka, melver
Differential Revision: https://reviews.llvm.org/D107283
Add an AccessVptr access type.
For now it's converted to the existing thr->is_vptr_access flag,
but later it will be passed directly to ReportRace
and will enable efficient tail calling in the MemoryAccess function
(currently __tsan_vptr_update/__tsan_vptr_read can't use
tail calls in MemoryAccess because of the trailing assignment
to thr->is_vptr_access).
Depends on D107276.
Reviewed By: vitalybuka, melver
Differential Revision: https://reviews.llvm.org/D107282
Currently we have a MemoryAccess function that accepts
"bool kAccessIsWrite, bool kIsAtomic" and 4 wrappers:
MemoryRead/MemoryWrite/MemoryReadAtomic/MemoryWriteAtomic.
Such a scheme with bool flags is not particularly scalable/extensible.
Because of that we did not have Read/Write wrappers for UnalignedMemoryAccess,
and "true, false" or "false, true" at call sites is not very readable.
Moreover, the new tsan runtime will introduce more flags
(e.g. move "freed" and "vptr access" to memory acccess flags).
We can't have 16 wrappers and each flag also takes whole
64-bit register for non-inlined calls.
Introduce AccessType enum that contains bit mask of
read/write, atomic/non-atomic, and later free/non-free,
vptr/non-vptr.
Such scheme is more scalable, more readble, more efficient
(don't consume multiple registers for these flags during calls)
and allows to cover unaligned and range variations of memory
access functions as well.
Also switch from size log to just size.
The new tsan runtime won't have the limitation of supporting
only 1/2/4/8 access sizes, so we don't need the logarithms.
Also add an inline thunk that converts the new interface to the old one.
For inlined calls it should not add any overhead because
all flags/size can be computed at compile time.
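A hedged sketch of the shape of the new interface (simplified stand-ins; the exact flag values and the full set of wrappers live in the tsan headers):

  typedef unsigned long uptr;
  struct ThreadState;

  // One bit mask describes the access: a single register instead of
  // several bool arguments.
  typedef uptr AccessType;
  enum : AccessType {
    kAccessWrite  = 0,
    kAccessRead   = 1 << 0,
    kAccessAtomic = 1 << 1,
    // later: kAccessFree, kAccessVptr, ...
  };

  // Stand-in for the existing bool-based entry point.
  void MemoryAccessOld(ThreadState *thr, uptr pc, uptr addr, int size_log,
                       bool is_write, bool is_atomic);

  // New interface: plain size plus an AccessType mask. The inline thunk
  // converts it to the old interface; at inlined call sites with constant
  // size/typ everything is computed at compile time.
  inline void MemoryAccess(ThreadState *thr, uptr pc, uptr addr, uptr size,
                           AccessType typ) {
    int size_log = size == 1 ? 0 : size == 2 ? 1 : size == 4 ? 2 : 3;
    MemoryAccessOld(thr, pc, addr, size_log, !(typ & kAccessRead),
                    typ & kAccessAtomic);
  }

  // Readable call sites:
  //   MemoryAccess(thr, pc, addr, 8, kAccessRead | kAccessAtomic);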
Reviewed By: vitalybuka, melver
Differential Revision: https://reviews.llvm.org/D107276
... and rename it to 'warnIfNonZero' to better reflect what it actually
does.
The goal is to minimize the amount of logic that's conditionally
compiled under '#if __APPLE__'.
Add new fixed-size vector clock for the new tsan runtime.
For now it's unused.
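Conceptually, it is just a fixed array of epochs indexed by thread slot (a simplified sketch; the real class is tuned for vectorization and uses the runtime's own types and slot count):

  #include <stdint.h>
  #include <algorithm>

  typedef uint64_t Epoch;                      // assumption: epoch width
  constexpr unsigned kThreadSlotCount = 256;   // assumption: slot count

  class VectorClock {
   public:
    Epoch Get(unsigned sid) const { return clk_[sid]; }
    void Set(unsigned sid, Epoch v) { clk_[sid] = v; }

    // Element-wise maximum ("join") with another clock, the core operation
    // behind acquire semantics.
    void Acquire(const VectorClock &src) {
      for (unsigned i = 0; i < kThreadSlotCount; i++)
        clk_[i] = std::max(clk_[i], src.clk_[i]);
    }

   private:
    Epoch clk_[kThreadSlotCount] = {};
  };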
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D107167
Currently we always save the creation stack for sync objects.
But it's not needed for some sync objects, most notably atomics.
We simply don't use the creation stacks of atomics anywhere.
Allow callers to control saving of the creation stack
and don't save it for atomics.
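The interface change is roughly (stand-in declarations, names approximate the real MetaMap method):

  typedef unsigned long uptr;
  struct ThreadState;
  struct SyncVar;

  // Callers now decide whether the creation stack is worth saving.
  SyncVar *GetSyncOrCreate(ThreadState *thr, uptr pc, uptr addr,
                           bool save_stack);

  // Mutexes and other user-visible sync objects keep their creation stack:
  //   GetSyncOrCreate(thr, pc, addr, /*save_stack=*/true);
  // Atomics skip it, since their creation stack is never used in reports:
  //   GetSyncOrCreate(thr, pc, addr, /*save_stack=*/false);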
Depends on D107257.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D107258
MetaMap::GetSync shows up in profiles,
so add LIKELY/UNLIKELY annotations.
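The annotations expand to __builtin_expect, keeping the hot path fall-through; a hypothetical example of the kind of hinting added (not the real method body):

  #define LIKELY(x)   __builtin_expect(!!(x), 1)
  #define UNLIKELY(x) __builtin_expect(!!(x), 0)

  struct SyncVar;
  SyncVar *SlowLookup(unsigned long addr);

  inline SyncVar *GetSyncSketch(unsigned long addr, unsigned meta_idx) {
    if (UNLIKELY(meta_idx != 0))
      return SlowLookup(addr);  // cold path: metadata exists, do the full lookup
    return nullptr;             // hypothetical hot case: nothing attached
  }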
Depends on D107256.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D107257
Don't lock the sync object inside of MetaMap methods.
This has several advantages:
- the new interface does not confuse thread-safety analysis
so we can remove a bunch of NO_THREAD_SAFETY_ANALYSIS attributes
- this allows use of scoped lock objects
- this allows more flexibility, e.g. locking some other mutex
between searching and locking the sync object
Also prefix the methods with GetSync to be consistent with the GetBlock method.
Also make the interface wrappers inlinable; otherwise we either end up with
2 copies of the method, or with an additional call.
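After the change, a caller looks roughly like this (self-contained stand-ins; the real types are MetaMap, SyncVar, and the sanitizer Mutex/Lock):

  struct Mutex { void Lock() {} void Unlock() {} };  // stand-in
  struct SyncVar { Mutex mtx; };
  class Lock {                                       // minimal scoped lock
   public:
    explicit Lock(Mutex *m) : m_(m) { m_->Lock(); }
    ~Lock() { m_->Unlock(); }
   private:
    Mutex *m_;
  };

  // Stand-in for MetaMap::GetSyncOrCreate after the change: the returned
  // sync object is no longer locked by the callee.
  SyncVar *GetSyncOrCreate(unsigned long addr) {
    (void)addr;
    static SyncVar s;
    return &s;
  }

  void ExampleCaller(unsigned long addr) {
    SyncVar *s = GetSyncOrCreate(addr);  // lookup only, returned unlocked
    Lock l(&s->mtx);  // the caller locks with a scoped object; other mutexes
                      // can be acquired between the lookup and this point
    // ... read or update *s ...
  }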
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D107256
ProcessPendingSignals is called in all interceptors
and user atomic operations. Make the fast-path check
(no pending signals) inlinable.
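The shape of the change, as a hedged sketch (the slow path stays out of line; field and function names are stand-ins):

  #include <atomic>

  struct ThreadState {
    std::atomic<int> pending_signal_count{0};  // stand-in for the real field
  };

  void ProcessPendingSignalsImpl(ThreadState *thr);  // slow path, out of line

  // Fast path: one relaxed load, inlinable into every interceptor and atomic
  // operation; the real call happens only if a signal is actually pending.
  inline void ProcessPendingSignals(ThreadState *thr) {
    if (thr->pending_signal_count.load(std::memory_order_relaxed))
      ProcessPendingSignalsImpl(thr);
  }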
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D107217
Some bots started failing with the following error
after changing Alloc to New. Change it back.
ThreadSanitizer: CHECK failed: ((locked[i].recursion)) == ((0))
4 __sanitizer::CheckedMutex::CheckNoLocks()
5 __tsan::ScopedInterceptor::~ScopedInterceptor()
6 memset
7 __tsan::New<__tsan::ExpectRace>()
8 __tsan::AddExpectRace()
9 BenignRaceImpl()
Differential Revision: https://reviews.llvm.org/D107212
Currently we inconsistently use u32 and int for thread ids;
there are also "unique tid" and "os tid" and lots of other
things identified by integers.
Additionally, the new tsan runtime will introduce yet another
thread identifier that is very different from the current tids.
Similarly for stack IDs, it's easy to confuse u32 with other
integer identifiers. And when a function accepts u32 or a struct
contains a u32 field, it's not always clear what it is.
Add Tid and StackID typedefs to make it clear what is what.
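Roughly (a sketch; the exact underlying types and sentinel values are whatever tsan_defs.h declares):

  // Distinct names make signatures and struct fields self-describing, even
  // though the underlying types stay plain integers.
  typedef int Tid;
  constexpr Tid kInvalidTid = -1;

  typedef unsigned StackID;
  constexpr StackID kInvalidStackID = 0;

  // Before: void ReportRace(u32 tid, u32 stack);  // which u32 is which?
  // After the typedefs, the intent is visible in the signature:
  void ReportRaceSketch(Tid tid, StackID stack);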
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D107152
We currently build tests without -g, which is quite inconvenient.
Crash stacks don't have line numbers, and gdb doesn't show line numbers either.
Always build tests with -g.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D107168
"Expected" races is a very ancient facility used in tsanv1 tests.
It's not used/needed anymore. Remove it.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D107175
Currently we set up either a sigaction signal handler with 3 arguments
or an old-style signal handler with 1 argument, depending on the user handler type.
This unnecessarily complicates the code. Always set up a sigaction handler.
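A hedged sketch of the approach (the real interceptor also records signals and forwards them through tsan's own machinery): always register a 3-argument handler with the kernel and adapt when the user installed an old-style one.

  #include <signal.h>

  typedef void (*handler1_t)(int);
  typedef void (*handler3_t)(int, siginfo_t *, void *);

  // Stand-in for the per-signal bookkeeping kept by the runtime.
  static const int kMaxSignals = 64;  // stand-in bound
  static handler1_t user_handler1[kMaxSignals];
  static handler3_t user_handler3[kMaxSignals];

  // The only handler ever registered with the kernel.
  static void UnifiedHandler(int sig, siginfo_t *info, void *ctx) {
    if (user_handler3[sig])
      user_handler3[sig](sig, info, ctx);
    else if (user_handler1[sig])
      user_handler1[sig](sig);  // old-style handler: drop info/ctx
  }

  // Installation is always the same: SA_SIGINFO + sa_sigaction, regardless
  // of which kind of handler the user asked for.
  static void InstallUnifiedHandler(int sig) {
    struct sigaction sa = {};
    sigemptyset(&sa.sa_mask);
    sa.sa_sigaction = UnifiedHandler;
    sa.sa_flags = SA_SIGINFO;
    sigaction(sig, &sa, nullptr);
  }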
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D107186
This fixes support for merging profiles, which broke as a consequence
of e50a38840d. The issue was a missing
adjustment in the merge logic to account for the binary IDs which are
now included in the raw profile just after the header.
In addition, this change also:
* Includes the version in module signature that's used for merging
to avoid accidental attempts to merge incompatible profiles.
* Moves the binary IDs size field after the version field in the header,
as was suggested in the review.
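Conceptually (a heavily hedged sketch with hypothetical field names, not the actual raw profile header layout): the reader has to skip the binary IDs blob that now sits between the header and the profile data, and the version has to participate in the compatibility check.

  #include <stdint.h>
  #include <string.h>

  // Hypothetical view of the raw profile prefix.
  struct RawHeaderSketch {
    uint64_t Magic;
    uint64_t Version;        // now part of the merge compatibility signature
    uint64_t BinaryIdsSize;  // size field placed right after Version
    // ... remaining size/offset fields ...
  };

  // Where the profile data begins in a raw buffer: after the header *and*
  // after the binary IDs blob; forgetting the latter is what broke merging.
  const char *ProfileDataBegin(const char *buf) {
    RawHeaderSketch h;
    memcpy(&h, buf, sizeof(h));
    return buf + sizeof(RawHeaderSketch) + h.BinaryIdsSize;
  }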
Differential Revision: https://reviews.llvm.org/D107143
As the code diverges from Google style, we need
to add more and more exceptions to suppress
conflicts with clang-format and clang-tidy.
At this point it does not provide additional value.
Reviewed By: morehouse
Differential Revision: https://reviews.llvm.org/D107197
Multiple copies of emulated TLS state mean inconsistent results when
accessing the same thread-local variable from different shared objects
(https://github.com/android/ndk/issues/1551). Making `__emutls_get_address`
a weak, default-visibility symbol should make the dynamic linker
ensure that only a single copy gets used at runtime. This is best-effort, but
the more robust approach of putting emulated TLS into its own shared
object has two drawbacks: (a) it would be a much bigger change, and (b) shared
objects are pretty heavyweight, and adding a new one to a space-constrained
environment isn't an easy sell. Given the expected rarity of direct
accesses to emulated TLS variables across different shared objects, the
best-effort approach should suffice.
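The mechanism, roughly (a hedged sketch; the real definition is the existing emutls implementation in compiler-rt, only its linkage/visibility changes): with weak linkage and default visibility, the dynamic linker resolves every DSO's reference to the first definition it finds, so one copy of the emulated-TLS state wins.

  // Sketch of the attribute change on the entry point; the body is elided.
  extern "C"
  __attribute__((weak, visibility("default")))
  void *__emutls_get_address(void *control) {
    // ... existing implementation, unchanged (placeholder body for the sketch) ...
    (void)control;
    return nullptr;
  }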
Reviewed By: danalbert, rprichard
Differential Revision: https://reviews.llvm.org/D107127