introduce a BufferedStackTrace class, which owns this array.
Summary:
This change splits __sanitizer::StackTrace class into a lightweight
__sanitizer::StackTrace, which doesn't own array of PCs, and BufferedStackTrace,
which owns it. This would allow us to simplify the interface of StackDepot,
and eventually merge __sanitizer::StackTrace with __tsan::StackTrace.
Test Plan: regression test suite.
Reviewers: kcc, dvyukov
Reviewed By: dvyukov
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D5985
llvm-svn: 220635
The current handling (manual execution of atexit callbacks)
is overly complex and leads to constant problems due to mutual ordering of callbacks.
Instead simply wrap callbacks into our wrapper to establish
the necessary synchronization.
Fixes issue https://code.google.com/p/thread-sanitizer/issues/detail?id=80
llvm-svn: 219675
On some tests we see that signals are not delivered
when a thread is blocked in epoll_wait. The hypothesis
is that the signal is delivered right before epoll_wait
call. The signal is queued as in_blocking_func is not set
yet, and then the thread just blocks in epoll_wait forever.
So double check pending signals *after* setting
in_blocking_func. This way we either queue a signal
and handle it in the beginning of a blocking func,
or process the signal synchronously if it's delivered
when in_blocking_func is set.
llvm-svn: 218070
I don't remember that crash on mmap in internal allocator
ever yielded anything useful, only crashes in rare wierd untested situations.
One of the reasons for crash was to catch if tsan starts allocating
clocks using mmap. Tsan does not allocate clocks using internal_alloc anymore.
Solve it once and for all by allowing mmaps.
llvm-svn: 217929
We may as well just use Symbolizer::GetOrInit() in all the cases.
Don't call Symbolizer::Get() early in tools initialization: these days
it doesn't do any important setup work, and we may as well create the
symbolizer the first time it's actually needed.
llvm-svn: 217558
There interceptors do not seem to be strictly necessary for tsan.
But we see cases where the interceptors consume 70% of execution time.
Memory blocks passed to fgetgrent_r are "written to" by tsan several times.
First, there is some recursion (getgrnam_r calls fgetgrent_r), and each
function "writes to" the buffer. Then, the same memory is "written to"
twice, first as buf and then as pwbufp (both of them refer to the same addresses).
llvm-svn: 216904
Vector clocks is the most actively allocated object in tsan runtime.
Current internal allocator is not scalable enough to handle allocation
of clocks in scalable way (too small caches). This changes transforms
clocks to 2-level array with 512-byte blocks. Since all blocks are of
the same size, it's possible to cache them more efficiently in per-thread caches.
llvm-svn: 214912
Suppression context might be used in multiple sanitizers working
simultaneously (e.g. LSan and UBSan) and not knowing about each other.
llvm-svn: 214831
Convert TSan and LSan to the new interface. More changes will follow:
1) "suppressions" should become a common runtime flag.
2) Code for parsing suppressions file should be moved to SuppressionContext::Init().
llvm-svn: 214334
Get rid of Symbolizer::Init(path_to_external) in favor of
thread-safe Symbolizer::GetOrInit(), and use the latter version
everywhere. Implicitly depend on the value of external_symbolizer_path
runtime flag instead of passing it around manually.
No functionality change.
llvm-svn: 214005
It is currently broken because it reads a wrong value from profile (heap instead of total).
Also make it faster by reading /proc/self/statm. Reading of /proc/self/smaps
can consume more than 50% of time on beefy apps if done every 100ms.
llvm-svn: 213942
The tsan's deadlock detector has been used in Chromium for a while;
it found a few real bugs and reported no false positives.
So, it's time to give it a bit more exposure.
llvm-svn: 212533
The bug happens in the following case:
Mutex is located at heap block beginning,
when we call MutexDestroy, s->next is set to 0,
so free can't find the MBlock related to the block.
llvm-svn: 212531
Introduce new public header <sanitizer/allocator_interface.h> and a set
of functions __sanitizer_get_ownership(), __sanitizer_malloc_hook() etc.
that will eventually replace their tool-specific equivalents
(__asan_get_ownership(), __msan_get_ownership() etc.). Tool-specific
functions are now deprecated and implemented as stubs redirecting
to __sanitizer_ versions (which are implemented differently in each tool).
Replace all uses of __xsan_ versions with __sanitizer_ versions in unit
and lit tests.
llvm-svn: 212469
The former used to crash with a null deref if it was given a not owned pointer,
while the latter returned 0. Now they both return 0. This is still not the best possible
behavior: it is better to print an error report with a stack trace, pointing
to the error in user code, as we do in ASan.
llvm-svn: 212112
The optimization is two-fold:
First, the algorithm now uses SSE instructions to
handle all 4 shadow slots at once. This makes processing
faster.
Second, if shadow contains the same access, we do not
store the event into trace. This increases effective
trace size, that is, tsan can remember up to 10x more
previous memory accesses.
Perofrmance impact:
Before:
[ OK ] DISABLED_BENCH.Mop8Read (2461 ms)
[ OK ] DISABLED_BENCH.Mop8Write (1836 ms)
After:
[ OK ] DISABLED_BENCH.Mop8Read (1204 ms)
[ OK ] DISABLED_BENCH.Mop8Write (976 ms)
But this measures only fast-path.
On large real applications the speedup is ~20%.
Trace size impact:
On app1:
Memory accesses : 1163265870
Including same : 791312905 (68%)
on app2:
Memory accesses : 166875345
Including same : 150449689 (90%)
90% of filtered events means that trace size is effectively 10x larger.
llvm-svn: 209897
The new storage (MetaMap) is based on direct shadow (instead of a hashmap + per-block lists).
This solves a number of problems:
- eliminates quadratic behaviour in SyncTab::GetAndLock (https://code.google.com/p/thread-sanitizer/issues/detail?id=26)
- eliminates contention in SyncTab
- eliminates contention in internal allocator during allocation of sync objects
- removes a bunch of ad-hoc code in java interface
- reduces java shadow from 2x to 1/2x
- allows to memorize heap block meta info for Java and Go
- allows to cleanup sync object meta info for Go
- which in turn enabled deadlock detector for Go
llvm-svn: 209810
The refactoring makes suppressions more flexible
and allow to suppress based on arbitrary number of stacks.
In particular it fixes:
https://code.google.com/p/thread-sanitizer/issues/detail?id=64
"Make it possible to suppress deadlock reports by any stack (not just first)"
llvm-svn: 209757
This way does not require a __sanitizer_cov_dump() call. That's
important on Android, where apps can be killed at arbitrary time.
We write raw PCs to disk instead of module offsets; we also write
memory layout to a separate file. This increases dump size by the
factor of 2 on 64-bit systems.
llvm-svn: 209653
Move fflush and fclose interceptors to sanitizer_common.
Use a metadata map to keep information about the external locations
that must be updated when the file is written to.
llvm-svn: 208676
If a multi-threaded program calls fork(), TSan ignores all memory accesses
in the child to prevent deadlocks in TSan runtime. This is OK, as child is
probably going to call exec() as soon as possible. However, a rare deadlocks
could be caused by ThreadIgnoreBegin() function itself.
ThreadIgnoreBegin() remembers the current stack trace and puts it into the
StackDepot to report a warning later if a thread exited with ignores enabled.
Using StackDepotPut in a child process is dangerous: it locks a mutex on
a slow path, which could be already locked in a parent process.
The fix is simple: just don't put current stack traces to StackDepot in
ThreadIgnoreBegin() and ThreadIgnoreSyncBegin() functions if we're
running after a multithreaded fork. We will not report any
"thread exited with ignores enabled" errors in this case anyway.
Submitting this without a testcase, as I believe the standalone reproducer
is pretty hard to construct.
llvm-svn: 205534
Make vector clock operations O(1) for several important classes of use cases.
See comments for details.
Below are stats from a large server app, 77% of all clock operations are handled as O(1).
Clock acquire : 25983645
empty clock : 6288080
fast from release-store : 14917504
contains my tid : 4515743
repeated (fast) : 2141428
full (slow) : 2636633
acquired something : 1426863
Clock release : 2544216
resize : 6241
fast1 : 197693
fast2 : 1016293
fast3 : 2007
full (slow) : 1797488
was acquired : 709227
clear tail : 1
last overflow : 0
Clock release store : 3446946
resize : 200516
fast : 469265
slow : 2977681
clear tail : 0
Clock acquire-release : 820028
llvm-svn: 204656
Extend ParseFlag to accept the |description| parameter, add dummy values for all existing flags.
As the flags are parsed their descriptions are stored in a global linked list.
The tool can later call __sanitizer::PrintFlagDescriptions() to dump all the flag names and their descriptions.
Add the 'help' flag and make ASan, TSan and MSan print the flags if 'help' is set to 1.
llvm-svn: 204339
Introduce DDetector interface between the tool and the DD itself.
It will help to experiment with other DD implementation,
as well as reuse DD in other tools.
llvm-svn: 202485