Currently we either define SANITIZER_GO for Go or don't define it at all for C++.
This works fine with the preprocessor (ifdef/ifndef/defined), but does not work
in C++ if statements (e.g. if (SANITIZER_GO) {...}). It is also inconsistent
with the majority of SANITIZER_FOO macros, which are always defined to either 0 or 1.
Always define SANITIZER_GO to either 0 or 1.
This makes it possible to use SANITIZER_GO in expressions and in flag default values.
Also remove kGoMode and kCppMode, which were meant to be used in expressions
but were not defined in sanitizer_common code, so SANITIZER_GO became the prevalent form.
Also convert some preprocessor checks to C++ if's or ternary expressions.
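A minimal sketch of the pattern this enables (illustrative only, not actual runtime code; kCollectHistory below is a hypothetical flag):

  // SANITIZER_GO is now unconditionally defined to 0 or 1, so it works
  // both in preprocessor conditionals...
  #if SANITIZER_GO
  // Go-only code
  #endif

  // ...and in ordinary C++ expressions and flag defaults:
  static const bool kCollectHistory = !SANITIZER_GO;  // hypothetical flag
  void Example() {
    if (SANITIZER_GO) {
      // Go-specific branch; dead-code-eliminated in C++ mode
    }
  }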
The majority of this change was done mechanically with:
sed "s#ifdef SANITIZER_GO#if SANITIZER_GO#g"
sed "s#ifndef SANITIZER_GO#if \!SANITIZER_GO#g"
sed "s#defined(SANITIZER_GO)#SANITIZER_GO#g"
llvm-svn: 285443
This is a reincarnation of http://reviews.llvm.org/D17648 with the bug fix pointed out by Adhemerval (zatrazz).
Currently ThreadState holds both logical state (required for the race-detection algorithm, user-visible)
and physical state (various caches, most notably the malloc cache). Move the physical state into a new
Processor entity. Besides just being the right thing from an abstraction point of view, this solves several
problems:
1. Cache everything on P level in Go. Currently we cache on a mix of goroutine and OS thread levels.
This unnecessarily increases memory consumption.
2. Properly handle free operations in Go. Frees are issued by the GC, which does not have goroutine context.
As a result, we could not do anything more than just clear the shadow. For example, we leaked
sync objects and heap block descriptors.
3. This will allow us to get rid of libc malloc in Go (now we have a Processor context for the internal allocator cache).
This in turn will allow us to get rid of the dependency on libc entirely.
4. Potentially we can make Processor per-CPU in C++ mode instead of per-thread, which will
reduce resource consumption.
The distinction between Thread and Processor is currently used only by Go; C++ creates a Processor per OS thread,
which is equivalent to the current scheme.
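A conceptual sketch of the split (names and fields simplified; the real definitions live in the tsan runtime headers):

  // Logical, user-visible state stays in ThreadState: clocks, shadow
  // stack, trace position, and other race-detection state.
  struct ThreadState {
    // ... race-detection state only ...
    Processor *proc;  // physical context currently executing this thread
  };

  // Physical state moves into Processor: attached to a goroutine P in Go
  // and to an OS thread in C++ mode.
  struct Processor {
    // ... malloc cache, internal allocator cache, sync/block caches ...
  };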
llvm-svn: 267678
Currently ThreadState holds both logical state (required for the race-detection algorithm, user-visible)
and physical state (various caches, most notably the malloc cache). Move the physical state into a new
Processor entity. Besides just being the right thing from an abstraction point of view, this solves several
problems:
1. Cache everything on P level in Go. Currently we cache on a mix of goroutine and OS thread levels.
This unnecessarily increases memory consumption.
2. Properly handle free operations in Go. Frees are issued by the GC, which does not have goroutine context.
As a result, we could not do anything more than just clear the shadow. For example, we leaked
sync objects and heap block descriptors.
3. This will allow us to get rid of libc malloc in Go (now we have a Processor context for the internal allocator cache).
This in turn will allow us to get rid of the dependency on libc entirely.
4. Potentially we can make Processor per-CPU in C++ mode instead of per-thread, which will
reduce resource consumption.
The distinction between Thread and Processor is currently used only by Go; C++ creates a Processor per OS thread,
which is equivalent to the current scheme.
llvm-svn: 262037
On OS X, for weak functions (which users can override by providing their own implementation in the main binary), we need `extern "C" SANITIZER_INTERFACE_ATTRIBUTE SANITIZER_WEAK_ATTRIBUTE NOINLINE`.
Fixes a broken test case on OS X, java_symbolization.cc, which uses the weak function __tsan_symbolize_external.
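The resulting declaration shape (a sketch; the parameter list is reproduced from the runtime of that era and should be treated as illustrative):

  extern "C" SANITIZER_INTERFACE_ATTRIBUTE SANITIZER_WEAK_ATTRIBUTE NOINLINE
  bool __tsan_symbolize_external(uptr pc, char *func_buf, uptr func_siz,
                                 char *file_buf, uptr file_siz,
                                 int *line, int *col) {
    return false;  // weak default; the main binary may override it
  }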
Differential Revision: http://reviews.llvm.org/D14907
llvm-svn: 254298
On OS X, GCD worker threads are created without a call to pthread_create. We need to properly register these threads with ThreadCreate and ThreadStart. This patch uses a libpthread API (`pthread_introspection_hook_install`) to get notifications about new threads and about threads that are about to be destroyed.
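A minimal sketch of the hook installation (the helper names here are hypothetical; the real handling in the runtime does more bookkeeping):

  #include <pthread/introspection.h>

  static pthread_introspection_hook_t prev_hook;

  static void ThreadIntrospectionHook(unsigned int event, pthread_t thread,
                                      void *addr, size_t size) {
    if (event == PTHREAD_INTROSPECTION_THREAD_START) {
      // Register the thread with the runtime (ThreadCreate + ThreadStart)
      // if it wasn't created through an intercepted pthread_create.
    } else if (event == PTHREAD_INTROSPECTION_THREAD_TERMINATE) {
      // Run thread-destruction bookkeeping before the thread exits.
    }
    if (prev_hook) prev_hook(event, thread, addr, size);  // chain previous hook
  }

  void InstallIntrospectionHook() {
    prev_hook = pthread_introspection_hook_install(&ThreadIntrospectionHook);
  }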
Differential Revision: http://reviews.llvm.org/D14328
llvm-svn: 252049
Embed the UBSan runtime into the TSan and MSan runtimes in the same way
as we do in ASan. Extend the UBSan test suite to also run tests for these
combinations.
llvm-svn: 235954
Provide defaults for TSAN_COLLECT_STATS and TSAN_NO_HISTORY.
Replace #ifdef directives with #if. This fixes a bug introduced
in r229112, where building the TSan runtime with -DTSAN_COLLECT_STATS=0
would still enable stats collection and reporting.
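The failure mode, sketched:

  // With -DTSAN_COLLECT_STATS=0 the macro is still *defined* (to 0), so
  // the old check enabled stats anyway:
  #ifdef TSAN_COLLECT_STATS   // true even when the value is 0
  // ... stats collection ...
  #endif

  // The fix: provide a default and test the value, not the definition.
  #ifndef TSAN_COLLECT_STATS
  # define TSAN_COLLECT_STATS 0
  #endif
  #if TSAN_COLLECT_STATS      // respects -DTSAN_COLLECT_STATS=0
  // ... stats collection ...
  #endif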
llvm-svn: 229581
TSAN_SHADOW_COUNT is defined to 4 in all environments.
Other values of TSAN_SHADOW_COUNT were never tested and
were broken by recent changes to shadow mapping.
Remove it, as there is no reason to fix or maintain it.
llvm-svn: 226466
Summary:
This change removes the `__tsan::StackTrace` class. There are
now three alternatives:
1. Lightweight `__sanitizer::StackTrace`, which doesn't own a buffer
of PCs. It is used in functions that need stack traces in read-only
mode, and helps to prevent unnecessary allocations/copies (e.g.
for StackTraces fetched from StackDepot).
2. `__sanitizer::BufferedStackTrace`, which stores a buffer of PCs in
a fixed-size array. It is used in TraceHeader (non-Go version).
3. `__tsan::VarSizeStackTrace`, which owns a buffer of PCs dynamically
allocated via the TSan internal allocator.
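A rough sketch of the three shapes (simplified; the real definitions are in sanitizer_stacktrace.h and tsan_stack_trace.h):

  struct StackTrace {                       // non-owning view over PCs
    const uptr *trace;
    u32 size;
  };

  struct BufferedStackTrace : StackTrace {  // owns a fixed-size buffer
    uptr trace_buffer[kStackTraceMax];
  };

  struct VarSizeStackTrace : StackTrace {   // owns a heap buffer
    uptr *trace_buffer;  // allocated via the TSan internal allocator
  };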
Test Plan: compiler-rt test suite
Reviewers: dvyukov, kcc
Reviewed By: kcc
Subscribers: llvm-commits, kcc
Differential Revision: http://reviews.llvm.org/D6004
llvm-svn: 221194
The optimization is two-fold:
First, the algorithm now uses SSE instructions to
handle all 4 shadow slots at once. This makes processing
faster.
Second, if the shadow already contains the same access, we do not
store the event in the trace. This increases the effective
trace size; that is, tsan can remember up to 10x more
previous memory accesses.
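A sketch of the first part (a hypothetical helper, not the actual runtime code; it assumes four contiguous 8-byte shadow cells and an 8-byte encoded access):

  #include <emmintrin.h>  // SSE2

  // Returns true if any of the 4 shadow slots holds exactly this access,
  // in which case the event does not need to be stored in the trace.
  bool ContainsSameAccess(const unsigned long long shadow[4],
                          unsigned long long access) {
    const __m128i want = _mm_set1_epi64x((long long)access);
    const __m128i s01 = _mm_loadu_si128((const __m128i *)&shadow[0]);
    const __m128i s23 = _mm_loadu_si128((const __m128i *)&shadow[2]);
    // Byte-wise equality; a slot matches iff all 8 of its bytes match.
    const int m01 = _mm_movemask_epi8(_mm_cmpeq_epi8(s01, want));
    const int m23 = _mm_movemask_epi8(_mm_cmpeq_epi8(s23, want));
    return (m01 & 0x00ff) == 0x00ff || (m01 & 0xff00) == 0xff00 ||
           (m23 & 0x00ff) == 0x00ff || (m23 & 0xff00) == 0xff00;
  }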
Performance impact:
Before:
[ OK ] DISABLED_BENCH.Mop8Read (2461 ms)
[ OK ] DISABLED_BENCH.Mop8Write (1836 ms)
After:
[ OK ] DISABLED_BENCH.Mop8Read (1204 ms)
[ OK ] DISABLED_BENCH.Mop8Write (976 ms)
But this measures only the fast path.
On large real applications the speedup is ~20%.
Trace size impact:
On app1:
Memory accesses : 1163265870
Including same : 791312905 (68%)
On app2:
Memory accesses : 166875345
Including same : 150449689 (90%)
90% of filtered events means that trace size is effectively 10x larger.
llvm-svn: 209897
The new storage (MetaMap) is based on direct shadow (instead of a hashmap + per-block lists).
This solves a number of problems:
- eliminates quadratic behaviour in SyncTab::GetAndLock (https://code.google.com/p/thread-sanitizer/issues/detail?id=26)
- eliminates contention in SyncTab
- eliminates contention in the internal allocator during allocation of sync objects
- removes a bunch of ad-hoc code in the Java interface
- reduces the Java shadow from 2x to 1/2x
- allows memorizing heap block meta info for Java and Go
- allows cleaning up sync object meta info for Go
- which in turn enables the deadlock detector for Go
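The direct-shadow idea, sketched (the constants and layout here are hypothetical; only the shape of the mapping matters):

  // Each kMetaShadowCell bytes of application memory map to one 32-bit
  // meta cell, so lookup is plain address arithmetic instead of a
  // hash-table walk:
  u32 *MemToMeta(uptr addr) {
    return (u32 *)(kMetaShadowBeg +
                   (addr - kAppMemBeg) / kMetaShadowCell * sizeof(u32));
  }
  // A cell stores the index of a slab-allocated block/sync descriptor
  // (0 means no metadata), which is what makes it cheap to attach and
  // reclaim meta info for heap blocks in Java and Go.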
llvm-svn: 209810
This allows increasing the max shadow stack size to 64K,
and reliably catching shadow stack overflows instead of silently
corrupting memory.
llvm-svn: 192797