Summary:
Due to sandbox restrictions in the recent versions of the simulator runtime the
atos program is no longer able to access the task port of a parent process
without additional help.
This patch fixes this by registering a task port for the parent process
before spawning atos and also tells atos to look for this by setting
a special environment variable.
This patch is based on an Apple internal fix (rdar://problem/43693565) that
unfortunately contained a bug (rdar://problem/58789439) because it used
setenv() to set the special environment variable. This is not safe because in
certain circumstances this can trigger a call to realloc() which can fail
during symbolization leading to deadlock. A test case is included that captures
this problem.
The approach used to set the necessary environment variable is as
follows:
1. Calling `putenv()` early during process init (but late enough that
malloc/realloc works) to set a dummy value for the environment variable.
2. Just before `atos` is spawned the storage for the environment
variable is modified to contain the correct PID.
A flaw with this approach is that if the application messes with the
atos environment variable (i.e. unsets it or changes it) between the
time its set and the time we need it then symbolization will fail. We
will ignore this issue for now but a `DCHECK()` is included in the patch
that documents this assumption but doesn't check it at runtime to avoid
calling `getenv()`.
The issue reported in rdar://problem/58789439 manifested as a deadlock
during symbolization in the following situation:
1. Before TSan detects an issue something outside of the runtime calls
setenv() that sets a new environment variable that wasn't previously
set. This triggers a call to malloc() to allocate a new environment
array. This uses TSan's normal user-facing allocator. LibC stores this
pointer for future use later.
2. TSan detects an issue and tries to launch the symbolizer. When we are in the
symbolizer we switch to a different (internal allocator) and then we call
setenv() to set a new environment variable. When this happen setenv() sees
that it needs to make the environment array larger and calls realloc() on the
existing enviroment array because it remembers that it previously allocated
memory for it. Calling realloc() fails here because it is being called on a
pointer its never seen before.
The included test case closely reproduces the originally reported
problem but it doesn't replicate the `((kBlockMagic)) ==
((((u64*)addr)[0])` assertion failure exactly. This is due to the way
TSan's normal allocator allocates the environment array the first time
it is allocated. In the test program addr[0] accesses an inaccessible
page and raises SIGBUS. If TSan's SIGBUS signal handler is active, the
signal is caught and symbolication is attempted again which results in
deadlock.
In the originally reported problem the pointer is successfully derefenced but
then the assert fails due to the provided pointer not coming from the active
allocator. When the assert fails TSan tries to symbolicate the stacktrace while
already being in the middle of symbolication which results in deadlock.
rdar://problem/58789439
Reviewers: kubamracek, yln
Subscribers: jfb, #sanitizers, llvm-commits
Tags: #sanitizers
Differential Revision: https://reviews.llvm.org/D78179
Summary:
These tests pass with clang, but fail if gcc was used.
gcc build creates similar but not the same stacks.
Reviewers: vitalybuka
Reviewed By: vitalybuka
Subscribers: dvyukov, llvm-commits, #sanitizers
Tags: #sanitizers
Differential Revision: https://reviews.llvm.org/D78114
Summary:
Previously `AtosSymbolizer` would set the PID to examine in the
constructor which is called early on during sanitizer init. This can
lead to incorrect behaviour in the case of a fork() because if the
symbolizer is launched in the child it will be told examine the parent
process rather than the child.
To fix this the PID is determined just before the symbolizer is
launched.
A test case is included that triggers the buggy behaviour that existed
prior to this patch. The test observes the PID that `atos` was called
on. It also examines the symbolized stacktrace. Prior to this patch
`atos` failed to symbolize the stacktrace giving output that looked
like...
```
#0 0x100fc3bb5 in __sanitizer_print_stack_trace asan_stack.cpp:86
#1 0x10490dd36 in PrintStack+0x56 (/path/to/print-stack-trace-in-code-loaded-after-fork.cpp.tmp_shared_lib.dylib:x86_64+0xd36)
#2 0x100f6f986 in main+0x4a6 (/path/to/print-stack-trace-in-code-loaded-after-fork.cpp.tmp_loader:x86_64+0x100001986)
#3 0x7fff714f1cc8 in start+0x0 (/usr/lib/system/libdyld.dylib:x86_64+0x1acc8)
```
After this patch stackframes `#1` and `#2` are fully symbolized.
This patch is also a pre-requisite refactor for rdar://problem/58789439.
Reviewers: kubamracek, yln
Subscribers: #sanitizers, llvm-commits
Tags: #sanitizers
Differential Revision: https://reviews.llvm.org/D77623
Summary:
In preparation for writing a test for a bug fix we need to be able to
see the command used to launch the symbolizer process. This feature
will likely be useful for debugging how the Sanitizers use the
symbolizer in general.
This patch causes the command line used to launch the process to be
shown at verbosity level 3 and higher.
A small test case is included.
Reviewers: kubamracek, yln, vitalybuka, eugenis, kcc
Subscribers: #sanitizers, llvm-commits
Tags: #sanitizers
Differential Revision: https://reviews.llvm.org/D77622
For targets where char is unsigned (like PowerPC), something like
char c = fgetc(...) will never produce a char that will compare
equal to EOF so this loop does not terminate.
Change the type to int (which appears to be the POSIX return type
for fgetc).
This allows the test case to terminate normally on PPC.
Buildbots say:
[126/127] Running lint check for sanitizer sources...
FAILED: projects/compiler-rt/lib/CMakeFiles/SanitizerLintCheck
cd /home/buildbots/ppc64be-clang-multistage-test/clang-ppc64be-multistage/stage1/projects/compiler-rt/lib && env LLVM_CHECKOUT=/home/buildbots/ppc64be-clang-multistage-test/clang-ppc64be-multistage/llvm/llvm SILENT=1 TMPDIR= PYTHON_EXECUTABLE=/usr/bin/python COMPILER_RT=/home/buildbots/ppc64be-clang-multistage-test/clang-ppc64be-multistage/llvm/compiler-rt /home/buildbots/ppc64be-clang-multistage-test/clang-ppc64be-multistage/llvm/compiler-rt/lib/sanitizer_common/scripts/check_lint.sh
/home/buildbots/ppc64be-clang-multistage-test/clang-ppc64be-multistage/llvm/compiler-rt/test/tsan/fiber_cleanup.cpp:71: Could not find a newline character at the end of the file. [whitespace/ending_newline] [5]
ninja: build stopped: subcommand failed.
Somehow this check is not part of 'ninja check-tsan'.
When creating and destroying fibers in tsan a thread state is created and destroyed. Currently, a memory mapping is leaked with each fiber (in __tsan_destroy_fiber). This causes applications with many short running fibers to crash or hang because of linux vm.max_map_count.
The root of this is that ThreadState holds a pointer to ThreadSignalContext for handling signals. The initialization and destruction of it is tied to platform specific events in tsan_interceptors_posix and missed when destroying a fiber (specifically, SigCtx is used to lazily create the ThreadSignalContext in tsan_interceptors_posix). This patch cleans up the memory by makinh the ThreadState create and destroy the ThreadSignalContext.
The relevant code causing the leak with fibers is the fiber destruction:
void FiberDestroy(ThreadState *thr, uptr pc, ThreadState *fiber) {
FiberSwitchImpl(thr, fiber);
ThreadFinish(fiber);
FiberSwitchImpl(fiber, thr);
internal_free(fiber);
}
Author: Florian
Reviewed-in: https://reviews.llvm.org/D76073
Summary:
This commit adds two command-line options to clang.
These options let the user decide which functions will receive SanitizerCoverage instrumentation.
This is most useful in the libFuzzer use case, where it enables targeted coverage-guided fuzzing.
Patch by Yannis Juglaret of DGA-MI, Rennes, France
libFuzzer tests its target against an evolving corpus, and relies on SanitizerCoverage instrumentation to collect the code coverage information that drives corpus evolution. Currently, libFuzzer collects such information for all functions of the target under test, and adds to the corpus every mutated sample that finds a new code coverage path in any function of the target. We propose instead to let the user specify which functions' code coverage information is relevant for building the upcoming fuzzing campaign's corpus. To this end, we add two new command line options for clang, enabling targeted coverage-guided fuzzing with libFuzzer. We see targeted coverage guided fuzzing as a simple way to leverage libFuzzer for big targets with thousands of functions or multiple dependencies. We publish this patch as work from DGA-MI of Rennes, France, with proper authorization from the hierarchy.
Targeted coverage-guided fuzzing can accelerate bug finding for two reasons. First, the compiler will avoid costly instrumentation for non-relevant functions, accelerating fuzzer execution for each call to any of these functions. Second, the built fuzzer will produce and use a more accurate corpus, because it will not keep the samples that find new coverage paths in non-relevant functions.
The two new command line options are `-fsanitize-coverage-whitelist` and `-fsanitize-coverage-blacklist`. They accept files in the same format as the existing `-fsanitize-blacklist` option <https://clang.llvm.org/docs/SanitizerSpecialCaseList.html#format>. The new options influence SanitizerCoverage so that it will only instrument a subset of the functions in the target. We explain these options in detail in `clang/docs/SanitizerCoverage.rst`.
Consider now the woff2 fuzzing example from the libFuzzer tutorial <https://github.com/google/fuzzer-test-suite/blob/master/tutorial/libFuzzerTutorial.md>. We are aware that we cannot conclude much from this example because mutating compressed data is generally a bad idea, but let us use it anyway as an illustration for its simplicity. Let us use an empty blacklist together with one of the three following whitelists:
```
# (a)
src:*
fun:*
# (b)
src:SRC/*
fun:*
# (c)
src:SRC/src/woff2_dec.cc
fun:*
```
Running the built fuzzers shows how many instrumentation points the compiler adds, the fuzzer will output //XXX PCs//. Whitelist (a) is the instrument-everything whitelist, it produces 11912 instrumentation points. Whitelist (b) focuses coverage to instrument woff2 source code only, ignoring the dependency code for brotli (de)compression; it produces 3984 instrumented instrumentation points. Whitelist (c) focuses coverage to only instrument functions in the main file that deals with WOFF2 to TTF conversion, resulting in 1056 instrumentation points.
For experimentation purposes, we ran each fuzzer approximately 100 times, single process, with the initial corpus provided in the tutorial. We let the fuzzer run until it either found the heap buffer overflow or went out of memory. On this simple example, whitelists (b) and (c) found the heap buffer overflow more reliably and 5x faster than whitelist (a). The average execution times when finding the heap buffer overflow were as follows: (a) 904 s, (b) 156 s, and (c) 176 s.
We explain these results by the fact that WOFF2 to TTF conversion calls the brotli decompression algorithm's functions, which are mostly irrelevant for finding bugs in WOFF2 font reconstruction but nevertheless instrumented and used by whitelist (a) to guide fuzzing. This results in longer execution time for these functions and a partially irrelevant corpus. Contrary to whitelist (a), whitelists (b) and (c) will execute brotli-related functions without instrumentation overhead, and ignore new code paths found in them. This results in faster bug finding for WOFF2 font reconstruction.
The results for whitelist (b) are similar to the ones for whitelist (c). Indeed, WOFF2 to TTF conversion calls functions that are mostly located in SRC/src/woff2_dec.cc. The 2892 extra instrumentation points allowed by whitelist (b) do not tamper with bug finding, even though they are mostly irrelevant, simply because most of these functions do not get called. We get a slightly faster average time for bug finding with whitelist (b), which might indicate that some of the extra instrumentation points are actually relevant, or might just be random noise.
Reviewers: kcc, morehouse, vitalybuka
Reviewed By: morehouse, vitalybuka
Subscribers: pratyai, vitalybuka, eternalsakura, xwlin222, dende, srhines, kubamracek, #sanitizers, lebedev.ri, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #sanitizers, #llvm
Differential Revision: https://reviews.llvm.org/D63616
Looks like this test fails on Darwin x86_64 as well:
http://green.lab.llvm.org/green/job/clang-stage1-RA/8593/
Command Output (stderr):
--
fatal error: error in backend: Global variable '__sancov_gen_' has an invalid section specifier '__DATA,__sancov_bool_flag': mach-o section specifier requires a section whose length is between 1 and 16 characters.
The intent of the `llvm_gcda_start_file` function is that only
one process create the .gcda file and initialize it to be updated
by other processes later.
Before this change, if multiple processes are started simultaneously,
some of them may initialize the file because both the first and
second `open` calls may succeed in a race condition and `new_file`
becomes 1 in those processes. This leads incorrect coverage counter
values. This often happens in MPI (Message Passing Interface) programs.
The test program added in this change is a simple reproducer.
This change ensures only one process creates/initializes the file by
using the `O_EXCL` flag.
Differential Revision: https://reviews.llvm.org/D76206
Summary:
Follow up fix to 445b810fbd. The `log show` command only works for
privileged users so run a quick test of the command during lit config to
see if the command works and only add the `darwin_log_cmd` feature if
this is the case.
Unfortunately this means the `asan/TestCases/Darwin/duplicate_os_log_reports.cpp`
test and any other tests in the future that use this feature won't run
for unprivileged users which is likely the case in CI.
rdar://problem/55986279
Reviewers: kubamracek, yln, dcoughlin
Subscribers: Charusso, #sanitizers, llvm-commits
Tags: #sanitizers
Differential Revision: https://reviews.llvm.org/D76899
Summary:
When ASan reports an issue the contents of the system log buffer
(`error_message_buffer`) get flushed to the system log (via
`LogFullErrorReport()`). After this happens the buffer is not cleared
but this is usually fine because the process usually exits soon after
reporting the issue.
However, when ASan runs in `halt_on_error=0` mode execution continues
without clearing the buffer. This leads to problems if more ASan
issues are found and reported.
1. Duplicate ASan reports in the system log. The Nth (start counting from 1)
ASan report will be duplicated (M - N) times in the system log if M is the
number of ASan issues reported.
2. Lost ASan reports. Given a sufficient
number of reports the buffer will fill up and consequently cannot be appended
to. This means reports can be lost.
The fix here is to reset `error_message_buffer_pos` to 0 which
effectively clears the system log buffer.
A test case is included but unfortunately it is Darwin specific because
querying the system log is an OS specific activity.
rdar://problem/55986279
Reviewers: kubamracek, yln, vitalybuka, kcc, filcab
Subscribers: #sanitizers, llvm-commits
Tags: #sanitizers
Differential Revision: https://reviews.llvm.org/D76749
Disable symbolization of results, since llvm-symbolizer cannot start
due to restricted readlink(), causing the test to die with SIGPIPE.
Author: Ilya Leoshkevich
Reviewed By: Evgenii Stepanov
Differential Revision: https://reviews.llvm.org/D76576
Temporarily revert "tsan: fix leak of ThreadSignalContext for fibers"
because it breaks the LLDB bot on GreenDragon.
This reverts commit 93f7743851.
This reverts commit d8a0f76de7.
When creating and destroying fibers in tsan a thread state
is created and destroyed. Currently, a memory mapping is
leaked with each fiber (in __tsan_destroy_fiber).
This causes applications with many short running fibers
to crash or hang because of linux vm.max_map_count.
The root of this is that ThreadState holds a pointer to
ThreadSignalContext for handling signals. The initialization
and destruction of it is tied to platform specific events
in tsan_interceptors_posix and missed when destroying a fiber
(specifically, SigCtx is used to lazily create the
ThreadSignalContext in tsan_interceptors_posix). This patch
cleans up the memory by inverting the control from the
platform specific code calling the generic ThreadFinish to
ThreadFinish calling a platform specific clean-up routine
after finishing a thread.
The relevant code causing the leak with fibers is the fiber destruction:
void FiberDestroy(ThreadState *thr, uptr pc, ThreadState *fiber) {
FiberSwitchImpl(thr, fiber);
ThreadFinish(fiber);
FiberSwitchImpl(fiber, thr);
internal_free(fiber);
}
I would appreciate feedback if this way of fixing the leak is ok.
Also, I think it would be worthwhile to more closely look at the
lifecycle of ThreadState (i.e. it uses no constructor/destructor,
thus requiring manual callbacks for cleanup) and how OS-Threads/user
level fibers are differentiated in the codebase. I would be happy to
contribute more if someone could point me at the right place to
discuss this issue.
Reviewed-in: https://reviews.llvm.org/D76073
Author: Florian (Florian)
struct stack_t on Linux x86_64 has internal padding which may be left
uninitialized. The check should be replaced with multiple checks for
individual fields of the struct. For now, remove the check altogether.
Summary:
Move interceptor from msan to sanitizer_common_interceptors.inc, so that
other sanitizers could benefit.
Adjust FixedCVE_2016_2143() to deal with the intercepted uname().
Patch by Ilya Leoshkevich.
Reviewers: eugenis, vitalybuka, uweigand, jonpa
Reviewed By: eugenis, vitalybuka
Subscribers: dberris, krytarowski, #sanitizers, stefansf, Andreas-Krebbel
Tags: #sanitizers
Differential Revision: https://reviews.llvm.org/D76578
This test case fails due to different handling of weak items between
LLD and LD on PPC. The issue only occurs when the default linker is LLD
and the test case is run on a system where ASLR is enabled.
A recent change to MemorySSA caused LLVM to start optimizing the call to
'f(x)' into just 'x', despite the 'noinline' attribute. So try harder to
prevent this optimization from firing.
After a first attempt to fix the test-suite failures, my first recommit
caused the same failures again. I had updated CMakeList.txt files of
tests that needed -fcommon, but it turns out that there are also
Makefiles which are used by some bots, so I've updated these Makefiles
now too.
See the original commit message for more details on this change:
0a9fc9233e
This includes fixes for:
- test-suite: some benchmarks need to be compiled with -fcommon, see D75557.
- compiler-rt: one test needed -fcommon, and another a change, see D75520.
and follow-ups:
a2ca1c2d "build: disable zlib by default on Windows"
2181bf40 "[CMake] Link against ZLIB::ZLIB"
1079c68a "Attempt to fix ZLIB CMake logic on Windows"
This changed the output of llvm-config --system-libs, and more
importantly it broke stand-alone builds. Instead of piling on more fix
attempts, let's revert this to reduce the risk of more breakages.
After the format change from D69471, there can be more than one section
in an object that contains coverage function records. Look up each of
these sections and concatenate all the records together.
This re-enables the instrprof-merging.cpp test, which previously was
failing on OSes which use comdats.
Thanks to Jeremy Morse, who very kindly provided object files from the
bot I broke to help me debug.
An execution count goes missing for a constructor, this needs
investigation:
http://lab.llvm.org:8011/builders/clang-ppc64be-linux/builds/45132/
```
/home/buildbots/ppc64be-clang-test/clang-ppc64be/llvm/compiler-rt/test/profile/instrprof-merging.cpp:28:16:
error: V1: expected string not found in input
A() {} // V1: [[@LINE]]{{ *}}|{{ *}}1
<stdin>:28:32: note: possible intended match here
28| | A() {} // V1: [[@LINE]]{{ *}}|{{ *}}1
```
Hope this fixes:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-android/builds/27977/steps/run%20lit%20tests%20%5Bi686%2Ffugu-userdebug%2FN2G48C%5D/logs/stdio
```
: 'RUN: at line 8'; UBSAN_OPTIONS=suppressions=/var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/compiler_rt_build_android_i686/test/ubsan/Standalone-i386/TestCases/Misc/Output/nullability.c.tmp.supp /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/compiler_rt_build_android_i686/test/ubsan/Standalone-i386/TestCases/Misc/Output/nullability.c.tmp 2>&1 | count 0
--
Exit Code: 1
Command Output (stderr):
--
Expected 0 lines, got 2.
```
Not sure what this would be printing though, a sanitizer initialization message?
Summary:
When -dfsan-event-callbacks is specified, insert a call to
__dfsan_mem_transfer_callback on every memcpy and memmove.
Reviewers: vitalybuka, kcc, pcc
Reviewed By: kcc
Subscribers: eugenis, hiraditya, #sanitizers, llvm-commits
Tags: #sanitizers, #llvm
Differential Revision: https://reviews.llvm.org/D75386
Summary:
For now just insert the callback for stores, similar to how MSan tracks
origins. In the future we may want to add callbacks for loads, memcpy,
function calls, CMPs, etc.
Reviewers: pcc, vitalybuka, kcc, eugenis
Reviewed By: vitalybuka, kcc, eugenis
Subscribers: eugenis, hiraditya, #sanitizers, llvm-commits, kcc
Tags: #sanitizers, #llvm
Differential Revision: https://reviews.llvm.org/D75312
Summary:
Sanitizer tests don't entirely pass on an R device. Fix up all the
incompatibilities with the new system.
Reviewers: eugenis, pcc
Reviewed By: eugenis
Subscribers: #sanitizers, llvm-commits
Tags: #sanitizers, #llvm
Differential Revision: https://reviews.llvm.org/D75303
Generally we ignore interceptors coming from called_from_lib-suppressed libraries.
However, we must not ignore critical interceptors like e.g. pthread_create,
otherwise runtime will lost track of threads.
pthread_detach is one of these interceptors we should not ignore as it affects
thread states and behavior of pthread_join which we don't ignore as well.
Currently we can produce very obscure false positives. For more context see:
https://groups.google.com/forum/#!topic/thread-sanitizer/ecH2P0QUqPs
The added test captures this pattern.
While we are here rename ThreadTid to ThreadConsumeTid to make it clear that
it's not just a "getter", it resets user_id to 0. This lead to confusion recently.
Reviewed in https://reviews.llvm.org/D74828
Like was done before in D67999 for `logbf`, this patch fixes the tests for
the internal compiler-rt implementations of `logb` and `logbl` to consider
all NaNs equivalent. Not doing so was resulting in test failures for
riscv64, since the the NaNs had different signs, but the spec doesn't
specify the NaN signedness or payload.
Fixes bug 44244.
Reviewers: rupprecht, delcypher
Reviewed By: rupprecht, delcypher
Differential Revision: https://reviews.llvm.org/D74826
Summary:
This substitution expands to the appropriate minimum deployment target
flag where thread local storage (TLS) was first introduced on Darwin
platforms. For all other platforms the substitution expands to an empty
string.
E.g. for macOS the substitution expands to `-mmacosx-version-min=10.12`
This patch adds support for the substitution (and future substitutions)
by doing a minor refactor and then uses the substitution in the relevant
TSan tests.
rdar://problem/59568956
Reviewers: yln, kubamracek, dvyukov, vitalybuka
Subscribers: #sanitizers, llvm-commits
Tags: #sanitizers, #llvm
Differential Revision: https://reviews.llvm.org/D74802