This patch is by Simone Atzeni with portions by Adhemerval Zanella.
This contains the LLVM patches to enable the thread sanitizer for
PPC64, both big- and little-endian. Two different virtual memory
sizes are supported: Old kernels use a 44-bit address space, while
newer kernels require a 46-bit address space.
There are two companion patches that will be added shortly. There is
a Clang patch to actually turn on the use of the thread sanitizer for
PPC64. There is also a patch that I wrote to provide interceptor
support for setjmp/longjmp on PPC64.
Patch discussion at reviews.llvm.org/D12841.
llvm-svn: 255057
This patch unify the 39 and 42-bit support for AArch64 by using an external
memory read to check the runtime detected VMA and select the better mapping
and transformation. Although slower, this leads to same instrumented binary
to be independent of the kernel.
Along with this change this patch also fix some 42-bit failures with
ALSR disable by increasing the upper high app memory threshold and also
the 42-bit madvise value for non large page set.
llvm-svn: 254151
This implements a "poor man's TLV" to be used for TSan's ThreadState on OS X. Based on the fact that `pthread_self()` is always available and reliable and returns a valid pointer to memory, we'll use the shadow memory of this pointer as a thread-local storage. No user code should ever read/write to this internal libpthread structure, so it's safe to use it for this purpose. We lazily allocate the ThreadState object and store the pointer here.
Differential Revision: http://reviews.llvm.org/D14288
llvm-svn: 252159
This patch moves a few functions from `sanitizer_linux_libcdep.cc` to `sanitizer_posix_libcdep.cc` in order to use them on OS X as well. Plus a few more small build fixes.
This is part of an effort to port TSan to OS X, and it's one the very first steps. Don't expect TSan on OS X to actually work or pass tests at this point.
Differential Revision: http://reviews.llvm.org/D14235
llvm-svn: 251918
Race deduplication code proved to be a performance bottleneck in the past if suppressions/annotations are used, or just some races left unaddressed. And we still get user complaints about this:
https://groups.google.com/forum/#!topic/thread-sanitizer/hB0WyiTI4e4
ReportRace already has several layers of caching for racy pcs/addresses to make deduplication faster. However, ReportRace still takes a global mutex (ThreadRegistry and ReportMutex) during deduplication and also calls mmap/munmap (which take process-wide semaphore in kernel), this makes deduplication non-scalable.
This patch moves race deduplication outside of global mutexes and also removes all mmap/munmap calls.
As the result, race_stress.cc with 100 threads and 10000 iterations become 30x faster:
before:
real 0m21.673s
user 0m5.932s
sys 0m34.885s
after:
real 0m0.720s
user 0m23.646s
sys 0m1.254s
http://reviews.llvm.org/D12554
llvm-svn: 246758
This patch adds support for tsan on aarch64-linux with 42-bit VMA
(current default config for 64K pagesize kernels). The support is
enabled by defining the SANITIZER_AARCH64_VMA to 42 at build time
for both clang/llvm and compiler-rt. The default VMA is 39 bits.
It also enabled tsan for previous supported VMA (39).
llvm-svn: 246330
This patch enabled TSAN for aarch64 with 39-bit VMA layout. As defined by
tsan_platform.h the layout used is:
0000 4000 00 - 0200 0000 00: main binary
2000 0000 00 - 4000 0000 00: shadow memory
4000 0000 00 - 5000 0000 00: metainfo
5000 0000 00 - 6000 0000 00: -
6000 0000 00 - 6200 0000 00: traces
6200 0000 00 - 7d00 0000 00: -
7d00 0000 00 - 7e00 0000 00: heap
7e00 0000 00 - 7fff ffff ff: modules and main thread stack
Which gives it about 8GB for main binary, 4GB for heap and 8GB for
modules and main thread stack.
Most of tests are passing, with the exception of:
* ignore_lib0, ignore_lib1, ignore_lib3 due a kernel limitation for
no support to make mmap page non-executable.
* longjmp tests due missing specialized assembly routines.
These tests are xfail for now.
The only tsan issue still showing is:
rtl/TsanRtlTest/Posix.ThreadLocalAccesses
Which still required further investigation. The test is disable for
aarch64 for now.
llvm-svn: 244055
This is done by creating a named shared memory region, unlinking it
and setting up a private (i.e. copy-on-write) mapping of that instead
of a regular anonymous mapping. I've experimented with regular
(sparse) files, but they can not be scaled to the size of MSan shadow
mapping, at least on Linux/X86_64 and ext3 fs.
Controlled by a common flag, decorate_proc_maps, disabled by default.
This patch has a few shortcomings:
* not all mappings are annotated, especially in TSan.
* our handling of memset() of shadow via mmap() puts small anonymous
mappings inside larger named mappings, which looks ugly and can, in
theory, hit the mapping number limit.
llvm-svn: 238621
The patch is generated using clang-tidy misc-use-override check.
This command was used:
tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py \
-checks='-*,misc-use-override' -header-filter='llvm|clang' -j=32 -fix \
-format
llvm-svn: 234680
The problem is that without SA_RESTORER flag, kernel ignores the handler. So tracer actually did not setup any handler.
Add SA_RESTORER flag when setting up handlers.
Add a test that causes SIGSEGV in stoptheworld callback.
Move SignalContext from asan to sanitizer_common to print better diagnostics about signal in the tracer thread.
http://reviews.llvm.org/D8005
llvm-svn: 230978
Provide defaults for TSAN_COLLECT_STATS and TSAN_NO_HISTORY.
Replace #ifdef directives with #if. This fixes a bug introduced
in r229112, where building TSan runtime with -DTSAN_COLLECT_STATS=0
would still enable stats collection and reporting.
llvm-svn: 229581
Return a linked list of AddressInfo objects, instead of using an array of
these objects as an output parameter. This simplifies the code in callers
of this function (especially TSan).
Fix a few memory leaks from internal allocator, when the returned
AddressInfo objects were not properly cleared.
llvm-svn: 223145
This flag can be used to specify the format of stack frames - user
can now provide a string with placeholders, which should be printed
for each stack frame with placeholders replaced with actual data.
For example "%p" will be replaced by PC, "%s" will be replaced by
the source file name etc.
"DEFAULT" value enforces default stack trace format currently used in
all the sanitizers except TSan.
This change also implements __sanitizer_print_stack_trace interface
function in TSan.
llvm-svn: 221469
Summary:
This change removes `__tsan::StackTrace` class. There are
now three alternatives:
# Lightweight `__sanitizer::StackTrace`, which doesn't own a buffer
of PCs. It is used in functions that need stack traces in read-only
mode, and helps to prevent unnecessary allocations/copies (e.g.
for StackTraces fetched from StackDepot).
# `__sanitizer::BufferedStackTrace`, which stores buffer of PCs in
a constant array. It is used in TraceHeader (non-Go version)
# `__tsan::VarSizeStackTrace`, which owns buffer of PCs, dynamically
allocated via TSan internal allocator.
Test Plan: compiler-rt test suite
Reviewers: dvyukov, kcc
Reviewed By: kcc
Subscribers: llvm-commits, kcc
Differential Revision: http://reviews.llvm.org/D6004
llvm-svn: 221194
Vector clocks is the most actively allocated object in tsan runtime.
Current internal allocator is not scalable enough to handle allocation
of clocks in scalable way (too small caches). This changes transforms
clocks to 2-level array with 512-byte blocks. Since all blocks are of
the same size, it's possible to cache them more efficiently in per-thread caches.
llvm-svn: 214912
The optimization is two-fold:
First, the algorithm now uses SSE instructions to
handle all 4 shadow slots at once. This makes processing
faster.
Second, if shadow contains the same access, we do not
store the event into trace. This increases effective
trace size, that is, tsan can remember up to 10x more
previous memory accesses.
Perofrmance impact:
Before:
[ OK ] DISABLED_BENCH.Mop8Read (2461 ms)
[ OK ] DISABLED_BENCH.Mop8Write (1836 ms)
After:
[ OK ] DISABLED_BENCH.Mop8Read (1204 ms)
[ OK ] DISABLED_BENCH.Mop8Write (976 ms)
But this measures only fast-path.
On large real applications the speedup is ~20%.
Trace size impact:
On app1:
Memory accesses : 1163265870
Including same : 791312905 (68%)
on app2:
Memory accesses : 166875345
Including same : 150449689 (90%)
90% of filtered events means that trace size is effectively 10x larger.
llvm-svn: 209897
The new storage (MetaMap) is based on direct shadow (instead of a hashmap + per-block lists).
This solves a number of problems:
- eliminates quadratic behaviour in SyncTab::GetAndLock (https://code.google.com/p/thread-sanitizer/issues/detail?id=26)
- eliminates contention in SyncTab
- eliminates contention in internal allocator during allocation of sync objects
- removes a bunch of ad-hoc code in java interface
- reduces java shadow from 2x to 1/2x
- allows to memorize heap block meta info for Java and Go
- allows to cleanup sync object meta info for Go
- which in turn enabled deadlock detector for Go
llvm-svn: 209810
The refactoring makes suppressions more flexible
and allow to suppress based on arbitrary number of stacks.
In particular it fixes:
https://code.google.com/p/thread-sanitizer/issues/detail?id=64
"Make it possible to suppress deadlock reports by any stack (not just first)"
llvm-svn: 209757
Introduce DDetector interface between the tool and the DD itself.
It will help to experiment with other DD implementation,
as well as reuse DD in other tools.
llvm-svn: 202485
Currently correct programs can deadlock after fork, because atomic operations and async-signal-safe calls are not async-signal-safe under tsan.
With this change:
- if a single-threaded program forks, the child continues running with verification enabled (the tsan background thread is recreated as well)
- if a multi-threaded program forks, then the child runs with verification disabled (memory accesses, atomic operations and interceptors are disabled); it's expected that it will exec soon anyway
- if the child tries to create more threads after multi-threaded fork, the program aborts with error message
- die_after_fork flag is added that allows to continue running, but all bets are off
http://llvm-reviews.chandlerc.com/D2614
llvm-svn: 199993
This is intended to address the following problem.
Episodically we see CHECK-failures when recursive interceptors call back into user code. Effectively we are not "in_rtl" at this point, but it's very complicated and fragile to properly maintain in_rtl property. Instead get rid of it. It was used mostly for sanity CHECKs, which basically never uncover real problems.
Instead introduce ignore_interceptors flag, which is used in very few narrow places to disable recursive interceptors (e.g. during runtime initialization).
llvm-svn: 197979