Origin is meaningless for fully initialized values. Avoid
storing origin for function arguments that are known to
be always initialized (i.e. shadow is a compile-time null
constant).
This is not about correctness, but purely an optimization.
Seems to affect compilation time of blacklisted functions
significantly.
llvm-svn: 213239
Currently ASan instrumentation pass creates a string with global name
for each instrumented global (to include global names in the error report). Global
name is already mangled at this point, and we may not be able to demangle it
at runtime (e.g. there is no __cxa_demangle on Android).
Instead, create a string with fully qualified global name in Clang, and pass it
to ASan instrumentation pass in llvm.asan.globals metadata. If there is no metadata
for some global, ASan will use the original algorithm.
This fixes https://code.google.com/p/address-sanitizer/issues/detail?id=264.
llvm-svn: 212872
Turn llvm::SpecialCaseList into a simple class that parses text files in
a specified format and knows nothing about LLVM IR. Move this class into
LLVMSupport library. Implement two users of this class:
* DFSanABIList in DFSan instrumentation pass.
* SanitizerBlacklist in Clang CodeGen library.
The latter will be modified to use actual source-level information from frontend
(source file names) instead of unstable LLVM IR things (LLVM Module identifier).
Remove dependency edge from ClangCodeGen/ClangDriver to LLVMTransformUtils.
No functionality change.
llvm-svn: 212643
All blacklisting logic is now moved to the frontend (Clang).
If a function (or source file it is in) is blacklisted, it doesn't
get sanitize_address attribute and is therefore not instrumented.
If a global variable (or source file it is in) is blacklisted, it is
reported to be blacklisted by the entry in llvm.asan.globals metadata,
and is not modified by the instrumentation.
The latter may lead to certain false positives - not all the globals
created by Clang are described in llvm.asan.globals metadata (e.g,
RTTI descriptors are not), so we may start reporting errors on them
even if "module" they appear in is blacklisted. We assume it's fine
to take such risk:
1) errors on these globals are rare and usually indicate wild memory access
2) we can lazily add descriptors for these globals into llvm.asan.globals
lazily.
llvm-svn: 212505
With this change all values passed through blacklisted functions
become fully initialized. Previous behavior was to initialize all
loads in blacklisted functions, but apply normal shadow propagation
logic for all other operation.
This makes blacklist applicable in a wider range of situations.
It also makes code for blacklisted functions a lot shorter, which
works as yet another workaround for PR17409.
llvm-svn: 212268
With this change all values passed through blacklisted functions
become fully initialized. Previous behavior was to initialize all
loads in blacklisted functions, but apply normal shadow propagation
logic for all other operation.
This makes blacklist applicable in a wider range of situations.
It also makes code for blacklisted functions a lot shorter, which
works as yet another workaround for PR17409.
llvm-svn: 212265
See https://code.google.com/p/address-sanitizer/issues/detail?id=299 for the
original feature request.
Introduce llvm.asan.globals metadata, which Clang (or any other frontend)
may use to report extra information about global variables to ASan
instrumentation pass in the backend. This metadata replaces
llvm.asan.dynamically_initialized_globals that was used to detect init-order
bugs. llvm.asan.globals contains the following data for each global:
1) source location (file/line/column info);
2) whether it is dynamically initialized;
3) whether it is blacklisted (shouldn't be instrumented).
Source location data is then emitted in the binary and can be picked up
by ASan runtime in case it needs to print error report involving some global.
For example:
0x... is located 4 bytes to the right of global variable 'C::array' defined in '/path/to/file:17:8' (0x...) of size 40
These source locations are printed even if the binary doesn't have any
debug info.
This is an ABI-breaking change. ASan initialization is renamed to
__asan_init_v4(). Pre-built libraries compiled with older Clang will not work
with the fresh runtime.
llvm-svn: 212188
string_ostream is a safe and efficient string builder that combines opaque
stack storage with a built-in ostream interface.
small_string_ostream<bytes> additionally permits an explicit stack storage size
other than the default 128 bytes to be provided. Beyond that, storage is
transferred to the heap.
This convenient class can be used in most places an
std::string+raw_string_ostream pair or SmallString<>+raw_svector_ostream pair
would previously have been used, in order to guarantee consistent access
without byte truncation.
The patch also converts much of LLVM to use the new facility. These changes
include several probable bug fixes for truncated output, a programming error
that's no longer possible with the new interface.
llvm-svn: 211749
Origin history should only be recorded for uninitialized values, because it is
meaningless otherwise. This change moves __msan_chain_origin to the runtime
library side and makes it conditional on the corresponding shadow value.
Previous code was correct, but _very_ inefficient.
llvm-svn: 211700
Multiplication by an integer with a number of trailing zero bits leaves
the same number of lower bits of the result initialized to zero.
This change makes MSan take this into account in the case of multiplication by
a compile-time constant.
We don't handle the general, non-constant, case because
(a) it's not going to be cheap (computation-wise);
(b) multiplication by a partially uninitialized value in user code is
a bad idea anyway.
Constant case must be handled because it appears from LLVM optimization of a
completely valid user code, as the test case in compiler-rt demonstrates.
llvm-svn: 211092
Init-order and use-after-return modes can currently be enabled
by runtime flags. use-after-scope mode is not really working at the
moment.
The only problem I see is that users won't be able to disable extra
instrumentation for init-order and use-after-scope by a top-level Clang flag.
But this instrumentation was implicitly enabled for quite a while and
we didn't hear from users hurt by it.
llvm-svn: 210924
This commit adds a weak variant of the cmpxchg operation, as described
in C++11. A cmpxchg instruction with this modifier is permitted to
fail to store, even if the comparison indicated it should.
As a result, cmpxchg instructions must return a flag indicating
success in addition to their original iN value loaded. Thus, for
uniformity *all* cmpxchg instructions now return "{ iN, i1 }". The
second flag is 1 when the store succeeded.
At the DAG level, a new ATOMIC_CMP_SWAP_WITH_SUCCESS node has been
added as the natural representation for the new cmpxchg instructions.
It is a strong cmpxchg.
By default this gets Expanded to the existing ATOMIC_CMP_SWAP during
Legalization, so existing backends should see no change in behaviour.
If they wish to deal with the enhanced node instead, they can call
setOperationAction on it. Beware: as a node with 2 results, it cannot
be selected from TableGen.
Currently, no use is made of the extra information provided in this
patch. Test updates are almost entirely adapting the input IR to the
new scheme.
Summary for out of tree users:
------------------------------
+ Legacy Bitcode files are upgraded during read.
+ Legacy assembly IR files will be invalid.
+ Front-ends must adapt to different type for "cmpxchg".
+ Backends should be unaffected by default.
llvm-svn: 210903
never be true in a well-defined context. The checking for null pointers
has been moved into the caller logic so it does not rely on undefined behavior.
llvm-svn: 210497
Instrumentation passes now use attributes
address_safety/thread_safety/memory_safety which are added by Clang frontend.
Clang parses the blacklist file and adds the attributes accordingly.
Currently blacklist is still used in ASan module pass to disable instrumentation
for certain global variables. We should fix this as well by collecting the
set of globals we're going to instrument in Clang and passing it to ASan
in metadata (as we already do for dynamically-initialized globals and init-order
checking).
This change also removes -tsan-blacklist and -msan-blacklist LLVM commandline
flags in favor of -fsanitize-blacklist= Clang flag.
llvm-svn: 210038
Clang knows about the sanitizer blacklist and it makes no sense to
add global to the list of llvm.asan.dynamically_initialized_globals if it
will be blacklisted in the instrumentation pass anyway. Instead, we should
do as much blacklisting as possible (if not all) in the frontend.
llvm-svn: 209790
Don't assume that dynamically initialized globals are all initialized from
_GLOBAL__<module_name>I_ function. Instead, scan the llvm.global_ctors and
insert poison/unpoison calls to each function there.
Patch by Nico Weber!
llvm-svn: 209780
Most importantly, it gives debug location info to the coverage callback.
This change also removes 2 cases of unnecessary setDebugLoc when IRBuilder
is created with the same debug location.
llvm-svn: 208767
definition below all of the header #include lines, lib/Transforms/...
edition.
This one is tricky for two reasons. We again have a couple of passes
that define something else before the includes as well. I've sunk their
name macros with the DEBUG_TYPE.
Also, InstCombine contains headers that need DEBUG_TYPE, so now those
headers #define and #undef DEBUG_TYPE around their code, leaving them
well formed modular headers. Fixing these headers was a large motivation
for all of these changes, as "leaky" macros of this form are hard on the
modules implementation.
llvm-svn: 206844
This flag replaces inline instrumentation for checks and origin stores with
calls into MSan runtime library. This is a workaround for PR17409.
Disabled by default.
llvm-svn: 206585
Also updated as many loops as I could find using df_begin/idf_begin -
strangely I found no uses of idf_begin. Is that just used out of tree?
Also a few places couldn't use df_begin because either they used the
member functions of the depth first iterators or had specific ordering
constraints (I added a comment in the latter case).
Based on a patch by Jim Grosbach. (Jim - you just had iterator_range<T>
where you needed iterator_range<idf_iterator<T>>)
llvm-svn: 206016
This adds back r204781.
Original message:
Aliases are just another name for a position in a file. As such, the
regular symbol resolutions are not applied. For example, given
define void @my_func() {
ret void
}
@my_alias = alias weak void ()* @my_func
@my_alias2 = alias void ()* @my_alias
We produce without this patch:
.weak my_alias
my_alias = my_func
.globl my_alias2
my_alias2 = my_alias
That is, in the resulting ELF file my_alias, my_func and my_alias are
just 3 names pointing to offset 0 of .text. That is *not* the
semantics of IR linking. For example, linking in a
@my_alias = alias void ()* @other_func
would require the strong my_alias to override the weak one and
my_alias2 would end up pointing to other_func.
There is no way to represent that with aliases being just another
name, so the best solution seems to be to just disallow it, converting
a miscompile into an error.
llvm-svn: 204934
This reverts commit r204781.
I will follow up to with msan folks to see what is what they
were trying to do with aliases to weak aliases.
llvm-svn: 204784
Aliases are just another name for a position in a file. As such, the
regular symbol resolutions are not applied. For example, given
define void @my_func() {
ret void
}
@my_alias = alias weak void ()* @my_func
@my_alias2 = alias void ()* @my_alias
We produce without this patch:
.weak my_alias
my_alias = my_func
.globl my_alias2
my_alias2 = my_alias
That is, in the resulting ELF file my_alias, my_func and my_alias are
just 3 names pointing to offset 0 of .text. That is *not* the
semantics of IR linking. For example, linking in a
@my_alias = alias void ()* @other_func
would require the strong my_alias to override the weak one and
my_alias2 would end up pointing to other_func.
There is no way to represent that with aliases being just another
name, so the best solution seems to be to just disallow it, converting
a miscompile into an error.
llvm-svn: 204781
LLVM part of MSan implementation of advanced origin tracking,
when we record not only creation point, but all locations where
an uninitialized value was stored to memory, too.
llvm-svn: 204151
The syntax for "cmpxchg" should now look something like:
cmpxchg i32* %addr, i32 42, i32 3 acquire monotonic
where the second ordering argument gives the required semantics in the case
that no exchange takes place. It should be no stronger than the first ordering
constraint and cannot be either "release" or "acq_rel" (since no store will
have taken place).
rdar://problem/15996804
llvm-svn: 203559
This requires a number of steps.
1) Move value_use_iterator into the Value class as an implementation
detail
2) Change it to actually be a *Use* iterator rather than a *User*
iterator.
3) Add an adaptor which is a User iterator that always looks through the
Use to the User.
4) Wrap these in Value::use_iterator and Value::user_iterator typedefs.
5) Add the range adaptors as Value::uses() and Value::users().
6) Update *all* of the callers to correctly distinguish between whether
they wanted a use_iterator (and to explicitly dig out the User when
needed), or a user_iterator which makes the Use itself totally
opaque.
Because #6 requires churning essentially everything that walked the
Use-Def chains, I went ahead and added all of the range adaptors and
switched them to range-based loops where appropriate. Also because the
renaming requires at least churning every line of code, it didn't make
any sense to split these up into multiple commits -- all of which would
touch all of the same lies of code.
The result is still not quite optimal. The Value::use_iterator is a nice
regular iterator, but Value::user_iterator is an iterator over User*s
rather than over the User objects themselves. As a consequence, it fits
a bit awkwardly into the range-based world and it has the weird
extra-dereferencing 'operator->' that so many of our iterators have.
I think this could be fixed by providing something which transforms
a range of T&s into a range of T*s, but that *can* be separated into
another patch, and it isn't yet 100% clear whether this is the right
move.
However, this change gets us most of the benefit and cleans up
a substantial amount of code around Use and User. =]
llvm-svn: 203364
This compiles with no changes to clang/lld/lldb with MSVC and includes
overloads to various functions which are used by those projects and llvm
which have OwningPtr's as parameters. This should allow out of tree
projects some time to move. There are also no changes to libs/Target,
which should help out of tree targets have time to move, if necessary.
llvm-svn: 203083
already lived there and it is where it belongs -- this is the in-memory
debug location representation.
This is just cleanup -- Modules can actually cope with this, but that
doesn't make it right. After chatting with folks that have out-of-tree
stuff, going ahead and moving the rest of the headers seems preferable.
llvm-svn: 202960
this would have been required because of the use of DataLayout, but that
has moved into the IR proper. It is still required because this folder
uses the constant folding in the analysis library (which uses the
datalayout) as the more aggressive basis of its folder.
llvm-svn: 202832
directly care about the Value class (it is templated so that the key can
be any arbitrary Value subclass), it is in fact concretely tied to the
Value class through the ValueHandle's CallbackVH interface which relies
on the key type being some Value subclass to establish the value handle
chain.
Ironically, the unittest is already in the right library.
llvm-svn: 202824
business.
This header includes Function and BasicBlock and directly uses the
interfaces of both classes. It has to do with the IR, it even has that
in the name. =] Put it in the library it belongs to.
This is one step toward making LLVM's Support library survive a C++
modules bootstrap.
llvm-svn: 202814
After this I will set the default back to F_None. The advantage is that
before this patch forgetting to set F_Binary would corrupt a file on windows.
Forgetting to set F_Text produces one that cannot be read in notepad, which
is a better failure mode :-)
llvm-svn: 202052
I am really sorry for the noise, but the current state where some parts of the
code use TD (from the old name: TargetData) and other parts use DL makes it
hard to write a patch that changes where those variables come from and how
they are passed along.
llvm-svn: 201827
r201608 made llvm corretly handle private globals with MachO. r201622 fixed
a bug in it and r201624 and r201625 were changes for using private linkage,
assuming that llvm would do the right thing.
They all got reverted because r201608 introduced a crash in LTO. This patch
includes a fix for that. The issue was that TargetLoweringObjectFile now has
to be initialized before we can mangle names of private globals. This is
trivially true during the normal codegen pipeline (the asm printer does it),
but LTO has to do it manually.
llvm-svn: 201700
The entry block of a function starts with all the static allocas. The change
in r195513 splits the block before those allocas, which has the effect of
turning them into dynamic allocas. That breaks all sorts of things. Change to
split after the initial allocas, and also add a comment explaining why the
block is split.
llvm-svn: 200515
flag from clang, and disable zero-base shadow support on all platforms
where it is not the default behavior.
- It is completely unused, as far as we know.
- It is ABI-incompatible with non-zero-base shadow, which means all
objects in a process must be built with the same setting. Failing to
do so results in a segmentation fault at runtime.
- It introduces a backward dependency of compiler-rt on user code,
which is uncommon and complicates testing.
This is the LLVM part of a larger change.
llvm-svn: 199371
are part of the core IR library in order to support dumping and other
basic functionality.
Rename the 'Assembly' include directory to 'AsmParser' to match the
library name and the only functionality left their -- printing has been
in the core IR library for quite some time.
Update all of the #includes to match.
All of this started because I wanted to have the layering in good shape
before I started adding support for printing LLVM IR using the new pass
infrastructure, and commandline support for the new pass infrastructure.
llvm-svn: 198688
subsequent changes are easier to review. About to fix some layering
issues, and wanted to separate out the necessary churn.
Also comment and sink the include of "Windows.h" in three .inc files to
match the usage in Memory.inc.
llvm-svn: 198685
This should fix http://llvm.org/bugs/show_bug.cgi?id=17976
Another test checking for the global variables' locations and prefixes on Darwin will be committed separately.
llvm-svn: 198017
Summary:
Before this change the instrumented code before Ret instructions looked like:
<Unpoison Frame Redzones>
if (Frame != OriginalFrame) // I.e. Frame is fake
<Poison Complete Frame>
Now the instrumented code looks like:
if (Frame != OriginalFrame) // I.e. Frame is fake
<Poison Complete Frame>
else
<Unpoison Frame Redzones>
Reviewers: eugenis
Reviewed By: eugenis
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D2458
llvm-svn: 197907
Currently SplitBlockAndInsertIfThen requires that branch condition is an
Instruction itself, which is very inconvenient, because it is sometimes an
Operator, or even a Constant.
llvm-svn: 197677
It was failing because ASan was adding all of the following to one
function:
- dynamic alloca
- stack realignment
- inline asm
This patch avoids making the static alloca dynamic when coverage is
used.
ASan should probably not be inserting empty inline asm blobs to inhibit
duplicate tail elimination.
llvm-svn: 196973
Summary:
Rewrite asan's stack frame layout.
First, most of the stack layout logic is moved into a separte file
to make it more testable and (potentially) useful for other projects.
Second, make the frames more compact by using adaptive redzones
(smaller for small objects, larger for large objects).
Third, try to minimized gaps due to large alignments (this is hypothetical since
today we don't see many stack vars aligned by more than 32).
The frames indeed become more compact, but I'll still need to run more benchmarks
before committing, but I am sking for review now to get early feedback.
This change will be accompanied by a trivial change in compiler-rt tests
to match the new frame sizes.
Reviewers: samsonov, dvyukov
Reviewed By: samsonov
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D2324
llvm-svn: 196568
This currently breaks clang/test/CodeGen/code-coverage.c. The root cause
is that the newly introduced access to Funcs[j] is out of bounds.
llvm-svn: 196365
gcov expects every function to contain an entry block that
unconditionally branches into the next block. clang does not implement
basic blocks in this manner, so gcov did not output correct branch info
if the entry block branched to multiple blocks.
This change splits every function's entry block into an empty block and
a block with the rest of the instructions. The instrumentation code will
take care of the rest.
llvm-svn: 195513
The new command line flags are -dfsan-ignore-pointer-label-on-store and -dfsan-ignore-pointer-label-on-load. Their default value matches the current labelling scheme.
Additionally, the function __dfsan_union_load is marked as readonly.
Patch by Lorenzo Martignoni!
Differential Revision: http://llvm-reviews.chandlerc.com/D2187
llvm-svn: 195382
Instead of permanently outputting "MVLL" as the file checksum, clang
will create gcno and gcda checksums by hashing the destination block
numbers of every arc. This allows for llvm-cov to check if the two gcov
files are synchronized.
Regenerated the test files so they contain the checksum. Also added
negative test to ensure error when the checksums don't match.
llvm-svn: 195191
I was able to successfully run a bootstrapped LTO build of clang with
r194701, so this change does not seem to be the cause of our failing
buildbots.
llvm-svn: 194789
This reverts commit 194701. Apple's bootstrapped LTO builds have been failing,
and this change (along with compiler-rt 194702-194704) is the only thing on
the blamelist. I will either reappy these changes or help debug the problem,
depending on whether this fixes the buildbots.
llvm-svn: 194780
Indirect call wrapping helps MSanDR (dynamic instrumentation companion tool
for MSan) to catch all cases where execution leaves a compiler-instrumented
module by allowing the tool to rewrite targets of indirect calls.
This change is an optimization that skips wrapping for calls when target is
inside the current module. This relies on the linker providing symbols at the
begin and end of the module code (or code + data, does not really matter).
Gold linker provides such symbols by default. GNU (BFD) linker needs a link
flag: -Wl,--defsym=__executable_start=0.
More info:
https://code.google.com/p/memory-sanitizer/wiki/MSanDR#Native_exec
llvm-svn: 194697
LLVM optimizers may widen accesses to packed structures that overflow the structure itself, but should be in bounds up to the alignment of the object
llvm-svn: 193317
Summary:
Given a global array G[N], which is declared in this CU and has static initializer
avoid instrumenting accesses like G[i], where 'i' is a constant and 0<=i<N.
Also add a bit of stats.
This eliminates ~1% of instrumentations on SPEC2006
and also partially helps when asan is being run together with coverage.
Reviewers: samsonov
Reviewed By: samsonov
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1947
llvm-svn: 192794
Currently MSan checks that arguments of *cvt* intrinsics are fully initialized.
That's too much to ask: some of them only operate on lower half, or even
quarter, of the input register.
llvm-svn: 192599
infrastructure.
This was essentially work toward PGO based on a design that had several
flaws, partially dating from a time when LLVM had a different
architecture, and with an effort to modernize it abandoned without being
completed. Since then, it has bitrotted for several years further. The
result is nearly unusable, and isn't helping any of the modern PGO
efforts. Instead, it is getting in the way, adding confusion about PGO
in LLVM and distracting everyone with maintenance on essentially dead
code. Removing it paves the way for modern efforts around PGO.
Among other effects, this removes the last of the runtime libraries from
LLVM. Those are being developed in the separate 'compiler-rt' project
now, with somewhat different licensing specifically more approriate for
runtimes.
llvm-svn: 191835
Adds a flag to the MemorySanitizer pass that enables runtime rewriting of
indirect calls. This is part of MSanDR implementation and is needed to return
control to the DynamiRio-based helper tool on transition between instrumented
and non-instrumented modules. Disabled by default.
llvm-svn: 191006
The work on this project was left in an unfinished and inconsistent state.
Hopefully someone will eventually get a chance to implement this feature, but
in the meantime, it is better to put things back the way the were. I have
left support in the bitcode reader to handle the case-range bitcode format,
so that we do not lose bitcode compatibility with the llvm 3.3 release.
This reverts the following commits: 155464, 156374, 156377, 156613, 156704,
156757, 156804 156808, 156985, 157046, 157112, 157183, 157315, 157384, 157575,
157576, 157586, 157612, 157810, 157814, 157815, 157880, 157881, 157882, 157884,
157887, 157901, 158979, 157987, 157989, 158986, 158997, 159076, 159101, 159100,
159200, 159201, 159207, 159527, 159532, 159540, 159583, 159618, 159658, 159659,
159660, 159661, 159703, 159704, 160076, 167356, 172025, 186736
llvm-svn: 190328
instead of having its own implementation.
The implementation of isTBAAVtableAccess is in TypeBasedAliasAnalysis.cpp
since it is related to the format of TBAA metadata.
The path for struct-path tbaa will be exercised by
test/Instrumentation/ThreadSanitizer/read_from_global.ll, vptr_read.ll, and
vptr_update.ll when struct-path tbaa is on by default.
llvm-svn: 190216
Select condition shadow was being ignored resulting in false negatives.
This change OR-s sign-extended condition shadow into the result shadow.
llvm-svn: 189785
The code was erroneously reading overflow area shadow from the TLS slot,
bypassing the local copy. Reading shadow directly from TLS is wrong, because
it can be overwritten by a nested vararg call, if that happens before va_start.
llvm-svn: 189104
DFSan changes the ABI of each function in the module. This makes it possible
for a function with the native ABI to be called with the instrumented ABI,
or vice versa, thus possibly invoking undefined behavior. A simple way
of statically detecting instances of this problem is to prepend the prefix
"dfs$" to the name of each instrumented-ABI function.
This will not catch every such problem; in particular function pointers passed
across the instrumented-native barrier cannot be used on the other side.
These problems could potentially be caught dynamically.
Differential Revision: http://llvm-reviews.chandlerc.com/D1373
llvm-svn: 189052
There are situations which can affect the correctness (or at least expectation)
of the gcov output. For instance, if a call to __gcov_flush() occurs within a
block before the execution count is registered and then the program aborts in
some way, then that block will not be marked as executed. This is not normally
what the user expects.
If we move the code that's registering when a block is executed to the
beginning, we can catch these types of situations.
PR16893
llvm-svn: 188849
Summary:
When the -dfsan-debug-nonzero-labels parameter is supplied, the code
is instrumented such that when a call parameter, return value or load
produces a nonzero label, the function __dfsan_nonzero_label is called.
The idea is that a debugger breakpoint can be set on this function
in a nominally label-free program to help identify any bugs in the
instrumentation pass causing labels to be introduced.
Reviewers: eugenis
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1405
llvm-svn: 188472
This replaces the old incomplete greylist functionality with an ABI
list, which can provide more detailed information about the ABI and
semantics of specific functions. The pass treats every function in
the "uninstrumented" category in the ABI list file as conforming to
the "native" (i.e. unsanitized) ABI. Unless the ABI list contains
additional categories for those functions, a call to one of those
functions will produce a warning message, as the labelling behaviour
of the function is unknown. The other supported categories are
"functional", "discard" and "custom".
- "discard" -- This function does not write to (user-accessible) memory,
and its return value is unlabelled.
- "functional" -- This function does not write to (user-accessible)
memory, and the label of its return value is the union of the label of
its arguments.
- "custom" -- Instead of calling the function, a custom wrapper __dfsw_F
is called, where F is the name of the function. This function may wrap
the original function or provide its own implementation.
Differential Revision: http://llvm-reviews.chandlerc.com/D1345
llvm-svn: 188402
DataFlowSanitizer is a generalised dynamic data flow analysis.
Unlike other Sanitizer tools, this tool is not designed to detect a
specific class of bugs on its own. Instead, it provides a generic
dynamic data flow analysis framework to be used by clients to help
detect application-specific issues within their own code.
Differential Revision: http://llvm-reviews.chandlerc.com/D965
llvm-svn: 187923
The globals being generated here were given the 'private' linkage type. However,
this caused them to end up in different sections with the wrong prefix. E.g.,
they would be in the __TEXT,__const section with an 'L' prefix instead of an 'l'
(lowercase ell) prefix.
The problem is that the linker will eat a literal label with 'L'. If a weak
symbol is then placed into the __TEXT,__const section near that literal, then it
cannot distinguish between the literal and the weak symbol.
Part of the problems here was introduced because the address sanitizer converted
some C strings into constant initializers with trailing nuls. (Thus putting them
in the __const section with the wrong prefix.) The others were variables that
the address sanitizer created but simply had the wrong linkage type.
llvm-svn: 187827
This patch provides basic support for powerpc64le as an LLVM target.
However, use of this target will not actually generate little-endian
code. Instead, use of the target will cause the correct little-endian
built-in defines to be generated, so that code that tests for
__LITTLE_ENDIAN__, for example, will be correctly parsed for
syntax-only testing. Code generation will otherwise be the same as
powerpc64 (big-endian), for now.
The patch leaves open the possibility of creating a little-endian
PowerPC64 back end, but there is no immediate intent to create such a
thing.
The LLVM portions of this patch simply add ppc64le coverage everywhere
that ppc64 coverage currently exists. There is nothing of any import
worth testing until such time as little-endian code generation is
implemented. In the corresponding Clang patch, there is a new test
case variant to ensure that correct built-in defines for little-endian
code are generated.
llvm-svn: 187179
A special case list can now specify categories for specific globals,
which can be used to instruct an instrumentation pass to treat certain
functions or global variables in a specific way, such as by omitting
certain aspects of instrumentation while keeping others, or informing
the instrumentation pass that a specific uninstrumentable function
has certain semantics, thus allowing the pass to instrument callers
according to those semantics.
For example, AddressSanitizer now uses the "init" category instead of
global-init prefixes for globals whose initializers should not be
instrumented, but which in all other respects should be instrumented.
The motivating use case is DataFlowSanitizer, which will have a
number of different categories for uninstrumentable functions, such
as "functional" which specifies that a function has pure functional
semantics, or "discard" which indicates that a function's return
value should not be labelled.
Differential Revision: http://llvm-reviews.chandlerc.com/D1092
llvm-svn: 185978
- Build debug metadata for 'bare' Modules using DIBuilder
- DebugIR can be constructed to generate an IR file (to be seen by a debugger)
or not in cases where the user already has an IR file on disk.
llvm-svn: 185193
No functionality change.
It should suffice to check the type of a debug info metadata, instead of
calling Verify. For cases where we know the type of a DI metadata, use
assert.
Also update testing cases to make them conform to the format of DI classes.
llvm-svn: 185135
Before this change, each module defined a weak_odr global __msan_track_origins
with a value of 1 if origin tracking is enabled, 0 if disabled. If there are
modules with different values, any of them may win. If 0 wins, and there is at
least one module with 1, the program will most likely crash.
With this change, __msan_track_origins is only emitted if origin tracking is
on. Then runtime library detects if there is at least one module with origin
tracking, and enables runtime support for it.
llvm-svn: 182997
- move AsmWriter.h from public headers into lib
- marked all AssemblyWriter functions as non-virtual; no need to override them
- DebugIR now "plugs into" AssemblyWriter with an AssemblyAnnotationWriter helper
- exposed flags to control hiding of a) debug metadata b) debug intrinsic calls
C/R: Paul Redmond
llvm-svn: 182617
- requires existing debug information to be present
- fixes up file name and line number information in metadata
- emits a "<orig_filename>-debug.ll" succinct IR file (without !dbg metadata
or debug intrinsics) that can be read by a debugger
- initialize pass in opt tool to enable the "-debug-ir" flag
- lit tests to follow
llvm-svn: 181467
the things, and renames it to CBindingWrapping.h. I also moved
CBindingWrapping.h into Support/.
This new file just contains the macros for defining different wrap/unwrap
methods.
The calls to those macros, as well as any custom wrap/unwrap definitions
(like for array of Values for example), are put into corresponding C++
headers.
Doing this required some #include surgery, since some .cpp files relied
on the fact that including Wrap.h implicitly caused the inclusion of a
bunch of other things.
This also now means that the C++ headers will include their corresponding
C API headers; for example Value.h must include llvm-c/Core.h. I think
this is harmless, since the C API headers contain just external function
declarations and some C types, so I don't believe there should be any
nasty dependency issues here.
llvm-svn: 180881
If we compile a single source program, the `.gcda' file will be generated where
the program was executed. This isn't desirable, because that place may be at an
unpredictable place (the program could call `chdir' for instance).
Instead, we will output the `.gcda' file in the same place we output the `.gcno'
file. I.e., the directory where the executable was generated. This matches GCC's
behavior.
<rdar://problem/13061072> & PR11809
llvm-svn: 178084
Before: the function name was stored by the compiler as a constant string
and the run-time was printing it.
Now: the PC is stored instead and the run-time prints the full symbolized frame.
This adds a couple of instructions into every function with non-empty stack frame,
but also reduces the binary size because we store less strings (I saw 2% size reduction).
This change bumps the asan ABI version to v3.
llvm part.
Example of report (now):
==31711==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7fffa77cf1c5 at pc 0x41feb0 bp 0x7fffa77cefb0 sp 0x7fffa77cefa8
READ of size 1 at 0x7fffa77cf1c5 thread T0
#0 0x41feaf in Frame0(int, char*, char*, char*) stack-oob-frames.cc:20
#1 0x41f7ff in Frame1(int, char*, char*) stack-oob-frames.cc:24
#2 0x41f477 in Frame2(int, char*) stack-oob-frames.cc:28
#3 0x41f194 in Frame3(int) stack-oob-frames.cc:32
#4 0x41eee0 in main stack-oob-frames.cc:38
#5 0x7f0c5566f76c (/lib/x86_64-linux-gnu/libc.so.6+0x2176c)
#6 0x41eb1c (/usr/local/google/kcc/llvm_cmake/a.out+0x41eb1c)
Address 0x7fffa77cf1c5 is located in stack of thread T0 at offset 293 in frame
#0 0x41f87f in Frame0(int, char*, char*, char*) stack-oob-frames.cc:12 <<<<<<<<<<<<<< this is new
This frame has 6 object(s):
[32, 36) 'frame.addr'
[96, 104) 'a.addr'
[160, 168) 'b.addr'
[224, 232) 'c.addr'
[288, 292) 's'
[352, 360) 'd'
llvm-svn: 177724
Use the new `llvm_gcov_init' function to register the writeout and flush
functions. The initialization function will also call `atexit' for some cleanups
and final writout calls. But it does this only once. This is better than
checking for the `main' function, because in a library that function may not
exist.
<rdar://problem/12439551>
llvm-svn: 177579
We don't want to write out >1000 files at the same time. That could make things
prohibitively expensive. Instead, register the "writeout" function so that it's
emitted serially.
<rdar://problem/12439551>
llvm-svn: 177437
For each compile unit, we want to register a function that will flush that
compile unit. Otherwise, __gcov_flush() would only flush the counters within the
current compile unit, and not any outside of it.
PR15191 & <rdar://problem/13167507>
llvm-svn: 177340
constructs default arguments. It can now take default arguments from
cl::opt'ions. Add a new -default-gcov-version=... option, and actually test it!
Sink the reverse-order of the version into GCOVProfiling, hiding it from our
users.
llvm-svn: 177002
emitProfileNotes(), similar to emitProfileArcs(). Also update its comment.
Also add a comment on Version[4] (there will be another comment in clang later),
and compress lines that exceeded 80 columns.
llvm-svn: 176994
it. Fortunately, versions of gcov that predate the extra checksum also ignore
any extra data, so this isn't a problem. There will be a matching commit in
compiler-rt.
llvm-svn: 176745
into the actual gcov file.
Instead of using the bottom 4 bytes as the function identifier, use a counter.
This makes the identifier numbers stable across multiple runs.
llvm-svn: 176616
Shadow checks are disabled and memory loads always produce fully initialized
values in functions that don't have a sanitize_memory attribute. Value and
argument shadow is propagated as usual.
This change also updates blacklist behaviour to match the above.
llvm-svn: 176247
These are two related changes (one in llvm, one in clang).
LLVM:
- rename address_safety => sanitize_address (the enum value is the same, so we preserve binary compatibility with old bitcode)
- rename thread_safety => sanitize_thread
- rename no_uninitialized_checks -> sanitize_memory
CLANG:
- add __attribute__((no_sanitize_address)) as a synonym for __attribute__((no_address_safety_analysis))
- add __attribute__((no_sanitize_thread))
- add __attribute__((no_sanitize_memory))
for S in address thread memory
If -fsanitize=S is present and __attribute__((no_sanitize_S)) is not
set llvm attribute sanitize_S
llvm-svn: 176075
This patch makes asan instrument memory accesses with unusual sizes (e.g. 5 bytes or 10 bytes), e.g. long double or
packed structures.
Instrumentation is done with two 1-byte checks
(first and last bytes) and if the error is found
__asan_report_load_n(addr, real_size) or
__asan_report_store_n(addr, real_size)
is called.
Also, call these two new functions in memset/memcpy
instrumentation.
asan-rt part will follow.
llvm-svn: 175507
This flag makes asan use a small (<2G) offset for 64-bit asan shadow mapping.
On x86_64 this saves us a register, thus achieving ~2/3 of the
zero-base-offset's benefits in both performance and code size.
Thanks Jakub Jelinek for the idea.
llvm-svn: 174886
This reverts r171041. This was a nice idea that didn't work out well.
Clang warnings need to be associated with warning groups so that they can
be selectively disabled, promoted to errors, etc. This simplistic patch didn't
allow for that. Enhancing it to provide some way for the backend to specify
a front-end warning type seems like overkill for the few uses of this, at
least for now.
llvm-svn: 174748
It is way too slow. Change the default option value to 0.
Always do exact shadow propagation for unsigned ICmp with constants, it is
cheap (under 1% cpu time) and required for correctness.
llvm-svn: 173682
Only for integers, pointers, and vectors of those. No floats.
Instrumentation seems very heavy, and may need to be replaced
with some approximation in the future.
llvm-svn: 173452
This fixes va_start/va_copy of a va_list field which happens to not
be laid out at a 16-byte boundary.
Differential Revision: http://llvm-reviews.chandlerc.com/D276
llvm-svn: 172128
code that includes Intrinsics.gen directly.
This never showed up in my testing because the old Intrinsics.gen was
still kicking around in the make build system and was correct there. =[
Thankfully, some of the bots to clean rebuilds and that caught this.
llvm-svn: 171373
into their new header subdirectory: include/llvm/IR. This matches the
directory structure of lib, and begins to correct a long standing point
of file layout clutter in LLVM.
There are still more header files to move here, but I wanted to handle
them in separate commits to make tracking what files make sense at each
layer easier.
The only really questionable files here are the target intrinsic
tablegen files. But that's a battle I'd rather not fight today.
I've updated both CMake and Makefile build systems (I think, and my
tests think, but I may have missed something).
I've also re-sorted the includes throughout the project. I'll be
committing updates to Clang, DragonEgg, and Polly momentarily.
llvm-svn: 171366
utils/sort_includes.py script.
Most of these are updating the new R600 target and fixing up a few
regressions that have creeped in since the last time I sorted the
includes.
llvm-svn: 171362
directly.
This is in preparation for removing the use of the 'Attribute' class as a
collection of attributes. That will shift to the AttributeSet class instead.
llvm-svn: 171253