Fixes 2 issues in origins arising from realloc() calls:
* In the in-place grow case origin for the new memory is not set at all.
* In the copy-realloc case __msan_memcpy is used, which unwinds stack from
inside the MSan runtime. This does not generally work (as we may be built
w/o frame pointers), and produces "bad" stack trace anyway, with several
uninteresting (internal) frames on top.
This change also makes realloc() honor "zeroise" and "poison_in_malloc" flags.
See https://code.google.com/p/memory-sanitizer/issues/detail?id=73.
llvm-svn: 226674
InternalAlloc is quite complex and its behavior may depend on the values of
flags. As such, it should not be used while parsing flags.
Sadly, LowLevelAlloc does not support deallocation of memory.
llvm-svn: 226453
The new parser is a lot stricter about syntax, reports unrecognized
flags, and will make it easier to implemented some of the planned features.
llvm-svn: 226169
This mirrors r225239 to all the rest sanitizers:
ASan, DFSan, LSan, MSan, TSan, UBSan.
Now the runtime flag type, name, default value and
description is located in the single place in the
.inc file.
llvm-svn: 225327
Fix test failures by introducing CommonFlags::CopyFrom() to make sure
compiler doesn't insert memcpy() calls into runtime code.
Original commit message:
Protect CommonFlags singleton by adding const qualifier to
common_flags() accessor. The only ways to modify the flags are
SetCommonFlagsDefaults(), ParseCommonFlagsFromString() and
OverrideCommonFlags() functions, which are only supposed to be
called during initialization.
llvm-svn: 225088
We've got some internal users that either aren't compatible with this or
have found a bug with it. Either way, this is an isolated cleanup and so
I'm reverting it to un-block folks while we investigate. Alexey and
I will be working on fixing everything up so this can be re-committed
soon. Sorry for the noise and any inconvenience.
llvm-svn: 225079
This is a re-commit of r224838 + r224839, previously reverted in r224850.
Test failures were likely (still can not reproduce) caused by two lit tests
using the same name for an intermediate build target.
llvm-svn: 224853
Summary:
Protect CommonFlags singleton by adding const qualifier to
common_flags() accessor. The only ways to modify the flags are
SetCommonFlagsDefaults(), ParseCommonFlagsFromString() and
OverrideCommonFlags() functions, which are only supposed to be
called during initialization.
Test Plan: regression test suite
Reviewers: kcc, eugenis, glider
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D6741
llvm-svn: 224736
Add CommonFlags::SetDefaults() and CommonFlags::ParseFromString(),
so that this object can be easily tested. Enforce
that ParseCommonFlagsFromString() and SetCommonFlagsDefaults()
work only with singleton CommonFlags, shared across all sanitizer
runtimes.
llvm-svn: 224617
pthread_getspecific is not async-signal-safe.
MsanThread pointer is now stored in a TLS variable, and the TSD slot
is used only for its destructor, and never from a signal handler.
This should fix intermittent CHECK failures in MsanTSDSet.
llvm-svn: 224423
Summary:
Turn "allocator_may_return_null" common flag into an
Allocator::may_return_null bool flag. We want to make sure
that common flags are immutable after initialization. There
are cases when we want to change this flag in the allocator
at runtime: e.g. in unit tests and during ASan activation
on Android.
Test Plan: regression test suite, real-life applications
Reviewers: kcc, eugenis
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D6623
llvm-svn: 224148
Previously, all origin ids were "chained" origins, i.e values of
ChainedOriginDepot. This added a level of indirection for simple
stack and heap allocation, which were represented as chains of
length 1. This costs both RAM and CPU, but provides a joined 2**29
origin id space. It also made function (any instrumented function)
entry non-async-signal-safe, but that does not really matter because
memory stores in track-origins=2 mode are not async-signal-safe anyway.
With this change, the type of the origin is encoded in origin id.
See comment in msan_origin.h for more details. This reduces chained and stack
origin id range to 2**28 each, but leaves extra 2**31 for heap origins.
This change should not have any user-visible effects.
llvm-svn: 223233
Summary:
Exactly what the title says. I've tested this change against the libc++ test failures and it solves all of them. The check-msan rule also still passes.
I'm not sure why it called memset originally.
I can add tests if requested but currently there are no tests involving wide chars and they are a c++11 features.
Reviewers: kcc, eugenis
Reviewed By: eugenis
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D6352
llvm-svn: 222673
MSanDR is a dynamic instrumentation tool that can instrument the code
(prebuilt libraries and such) that could not be instrumented at compile time.
This code is unused (to the best of our knowledge) and unmaintained, and
starting to bit-rot.
llvm-svn: 222232
introduce a BufferedStackTrace class, which owns this array.
Summary:
This change splits __sanitizer::StackTrace class into a lightweight
__sanitizer::StackTrace, which doesn't own array of PCs, and BufferedStackTrace,
which owns it. This would allow us to simplify the interface of StackDepot,
and eventually merge __sanitizer::StackTrace with __tsan::StackTrace.
Test Plan: regression test suite.
Reviewers: kcc, dvyukov
Reviewed By: dvyukov
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D5985
llvm-svn: 220635
ParamTLS (shadow for function arguments) is of limited size. This change
makes all arguments that do not fit unpoisoned, and avoids writing
past the end of a TLS buffer.
llvm-svn: 220351
Chained origins make plain memory stores async-signal-unsafe.
We already disable it inside signal handlers.
This change grabs all origin-related locks before fork() and releases
them after fork() to avoid a deadlock in the child process.
llvm-svn: 217140
Get rid of Symbolizer::Init(path_to_external) in favor of
thread-safe Symbolizer::GetOrInit(), and use the latter version
everywhere. Implicitly depend on the value of external_symbolizer_path
runtime flag instead of passing it around manually.
No functionality change.
llvm-svn: 214005
This was done by calling __cxa_demangle directly, which is bad
when c++abi library is instrumented. The following line always
contains the demangled name (when running with a symbolizer) anyway.
llvm-svn: 212929
Our versions are not exactly as fast as libc's, and
MSan uses them heavily (even compared to other sanitizers).
This will break if libc version of mem* are instrumented,
but they never are, and if they are, we should be able
to fix it on libc side.
llvm-svn: 212799
Introduce new public header <sanitizer/allocator_interface.h> and a set
of functions __sanitizer_get_ownership(), __sanitizer_malloc_hook() etc.
that will eventually replace their tool-specific equivalents
(__asan_get_ownership(), __msan_get_ownership() etc.). Tool-specific
functions are now deprecated and implemented as stubs redirecting
to __sanitizer_ versions (which are implemented differently in each tool).
Replace all uses of __xsan_ versions with __sanitizer_ versions in unit
and lit tests.
llvm-svn: 212469
Origin history should only be recorded for uninitialized values, because it is
meaningless otherwise. This change moves __msan_chain_origin to the runtime
library side and makes it conditional on the corresponding shadow value.
Previous code was correct, but _very_ inefficient.
llvm-svn: 211700
This way does not require a __sanitizer_cov_dump() call. That's
important on Android, where apps can be killed at arbitrary time.
We write raw PCs to disk instead of module offsets; we also write
memory layout to a separate file. This increases dump size by the
factor of 2 on 64-bit systems.
llvm-svn: 209653
Generalize StackDepot and create a new specialized instance of it to
efficiently (i.e. without duplicating stack trace data) store the
origin history tree.
This reduces memory usage for chained origins roughly by an order of
magnitude.
Most importantly, this new design allows us to put two limits on
stored history data (exposed in MSAN_OPTIONS) that help avoid
exponential growth in used memory on certain workloads.
See comments in lib/msan/msan_origin.h for more details.
llvm-svn: 209284
This change lets MSan rely on libcxx's own build system instead of manually
compiling its sources and setting up all the necessary compile flags. It would
also simplify compiling libcxx with another sanitizers (in particular, TSan).
The tricky part is to make sure libcxx is reconfigured/rebuilt when Clang or
MSan runtime library is changed. "clobber" step used in this patch works well
for me, but it's possible it would break for other configurations - will
watch the buildbots.
llvm-svn: 208451
Format string parsing is disabled by default.
This is not expected to meaningfully change the tool behavior.
With this change, check_printf flag could be used to evaluate printf format
string parsing in MSan.
llvm-svn: 208295
Soon there will be an option to build compiler-rt parts as shared libraries
on Linux. Extracted from http://llvm-reviews.chandlerc.com/D3042
by Yuri Gribov.
llvm-svn: 205183
These interceptors require deep unpoisoning of return values.
While at it, we do the same for all other pw/gr interceptors to
reduce dependency on libc implementation details.
llvm-svn: 205004
The interceptors had code that after macro expansion ended up looking like
extern "C" void memalign()
__attribute__((weak, alias("__interceptor_memalign")));
extern "C" void __interceptor_memalign() {}
extern "C" void __interceptor___libc_memalign()
__attribute__((alias("memalign")));
That is,
* __interceptor_memalign is a function
* memalign is a weak alias to __interceptor_memalign
* __interceptor___libc_memalign is an alias to memalign
Both gcc and clang produce assembly that look like
__interceptor_memalign:
...
.weak memalign
memalign = __interceptor_memalign
.globl __interceptor___libc_memalign
__interceptor___libc_memalign = memalign
What it means in the end is that we have 3 symbols pointing to the
same position in the file, one of which is weak:
8: 0000000000000000 1 FUNC GLOBAL DEFAULT 1
__interceptor_memalign
9: 0000000000000000 1 FUNC WEAK DEFAULT 1 memalign
10: 0000000000000000 1 FUNC GLOBAL DEFAULT 1
__interceptor___libc_memalign
In particular, note that __interceptor___libc_memalign will always
point to __interceptor_memalign, even if we do link in a strong symbol
for memalign. In fact, the above code produces exactly the same binary
as
extern "C" void memalign()
__attribute__((weak, alias("__interceptor_memalign")));
extern "C" void __interceptor_memalign() {}
extern "C" void __interceptor___libc_memalign()
__attribute__((alias("__interceptor_memalign")));
If nothing else, this patch makes it more obvious what is going on.
llvm-svn: 204823
Using __msan_unpoison() on null-terminated strings is awkward because
strlen() can't be called on a poisoned string. This case warrants a special
interface function.
llvm-svn: 204448
Extend ParseFlag to accept the |description| parameter, add dummy values for all existing flags.
As the flags are parsed their descriptions are stored in a global linked list.
The tool can later call __sanitizer::PrintFlagDescriptions() to dump all the flag names and their descriptions.
Add the 'help' flag and make ASan, TSan and MSan print the flags if 'help' is set to 1.
llvm-svn: 204339
Compiler-rt part of MSan implementation of advanced origin tracking,
when we record not only creation point, but all locations where
an uninitialized value was stored to memory, too.
llvm-svn: 204152
This reverts commit r201910.
While __func__ may be standard in C++11, it was only recently added to
MSVC in 2013 CTP, and LLVM supports MSVC 2012. __FUNCTION__ may not be
standard, but it's *very* portable.
llvm-svn: 201916
This is covered by existing ASan test.
This does not change anything for TSan by default (but provides a flag to
change the threshold size).
Based on a patch by florent.bruneau here:
https://code.google.com/p/address-sanitizer/issues/detail?id=256
llvm-svn: 201400
Because of the way Bionic sets up signal stack frames, libc unwinder is unable
to step through it, resulting in broken SEGV stack traces.
Luckily, libcorkscrew.so on Android implements an unwinder that can start with
a signal context, thus sidestepping the issue.
llvm-svn: 201151
Some build environments are missing the required headers.
This reverts r200544, r200547, r200551. This does not revert the change that
introduced READWRITE ioctl type.
llvm-svn: 200567
Also rename internal_sigaction() into internal_sigaction_norestorer(), as this function doesn't fully
implement the sigaction() functionality on Linux.
This change is a part of refactoring intended to have common signal handling behavior in all tools.
llvm-svn: 200535
Express the strto* interceptors though macros. This removes a lot of
duplicate code and fixes a couple of copypasto bugs (where "res" was declared of
a different type than the actual return type). Also, add a few more interceptors
for strto*_l.
llvm-svn: 200316
warning: ISO C99 requires rest arguments to be used [enabled by default]
INTERCEPTOR(char *, dlerror) {
warning: invoking macro INTERCEPTOR argument 3: empty macro arguments are undefined in ISO C90 and ISO C++98 [enabled by default]
llvm-svn: 199873
Intercept and sanitize arguments passed to printf functions in ASan and TSan
(don't do this in MSan for now). The checks are controlled by runtime flag
(off by default for now).
Patch http://llvm-reviews.chandlerc.com/D2480 by Yuri Gribov!
llvm-svn: 199729
This code is not robust enough and triggers when simply linking with
libdynamorio.so, without any code translation at all. Disabling it is safe
(i.e. we may unpoison too much memory and see false negatives, but never false
positives).
llvm-svn: 197568
*h_errno is written not on success, but on failure.
In fact, it seems like it can be written even when return value signals
success, so we just unpoison it in all cases.
llvm-svn: 197383
Before we did it lazily on the first stack unwind in the thread.
It resulted in deadlock when the unwind was caused by memory allocation
inside pthread_getattr_np:
pthread_getattr_np <<< not reentable
GetThreadStackTopAndBottom
__interceptor_realloc
pthread_getattr_np
llvm-svn: 197026
This change unifies the summary printing across sanitizers:
now each tool uses specific version of ReportErrorSummary() method,
which deals with symbolization of the top frame and formatting a
summary message. This change modifies the summary line for ASan+LSan mode:
now the summary mentions "AddressSanitizer" instead of "LeakSanitizer".
llvm-svn: 193864
Summary:
TSan and MSan need to know if interceptor was called by the
user code or by the symbolizer and use pre- and post-symbolization hooks
for that. Make Symbolizer class responsible for calling these hooks instead.
This would ensure the hooks are only called when necessary (during
in-process symbolization, they are not needed for out-of-process) and
save specific sanitizers from tracing all places in the code where symbolization
will be performed.
Reviewers: eugenis, dvyukov
Reviewed By: eugenis
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D2067
llvm-svn: 193807
This moves away from creating the symbolizer object and initializing the
external symbolizer as separate steps. Those steps now always take place
together.
Sanitizers with a legacy requirement to specify their own symbolizer path
should use InitSymbolizer to initialize the symbolizer with the desired
path, and GetSymbolizer to access the symbolizer. Sanitizers with no
such requirement (e.g. UBSan) can use GetOrInitSymbolizer with no need for
initialization.
The symbolizer interface has been made thread-safe (as far as I can
tell) by protecting its member functions with mutexes.
Finally, the symbolizer interface no longer relies on weak externals, the
introduction of which was probably a mistake on my part.
Differential Revision: http://llvm-reviews.chandlerc.com/D1985
llvm-svn: 193448
Origin copying may destroy valid origin info. This is caused by
__msan_copy_origin widening the address range to the nearest 4-byte aligned
addresses both on the left and on the right. If the target buffer is
uninitialized and the source is fully initialized, this will result in
overriding valid origin of target buffer with stale (possibly 0) origin of the
source buffer.
With this change the widened origin is copied only if corresponding shadow
values are non zero.
llvm-svn: 193338