Summary:
On PowerPC64 Linux PTRACE_GETREGS is a #define and PT_GETREGS is not.
On other systems it's the other way round. Extend the #ifs to check for
both PTRACE_* and PT_*.
This fixes test/sanitizer_common/TestCases/Linux/ptrace.cc when msan is
enabled for PowerPC64.
Reviewers: zatrazz, kcc, eugenis, samsonov
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14564
llvm-svn: 252730
Basic blocks that are always executed can not be error blocks as their execution
can not possibly be an unlikely event. In this commit we tighten the check
if an error block to basic blcoks that do not dominate the exit condition, but
that dominate all exiting blocks of the scop.
llvm-svn: 252726
r252713 introduced a couple of regressions due to later basic blocks refering
to instructions defined in error blocks which have not yet been modeled.
This commit is currently just encoding limitations of our modeling and code
generation backends to ensure correctness. In theory, we should be able to
generate and optimize such regions, as everything that is dominated by an error
region is assumed to not be executed anyhow. We currently just lack the code
to make this happen in practice.
llvm-svn: 252725
If possible and profitable, replace lea %reg, 1(%reg) and lea %reg, -1(%reg) with inc %reg and dec %reg respectively.
Patch by: anton.nadolsky@intel.com
Differential Revision: http://reviews.llvm.org/D14059
llvm-svn: 252722
Summary:
For clarity and ease of maintenance, I suggest porting this test
to use the same tooling as the rest of the tests.
Reviewers: joerg, rengolin, dougk, yaron.keren
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D14548
llvm-svn: 252720
Atomic RMW is not necessary in InitializeGuardArray.
It is supposed to run when no user code runs.
And if user code runs concurrently, then the atomic
RMW won't help anyway. So replace it with non-atomic RMW.
InitializeGuardArray takes more than 50% of time during re2 fuzzing:
real 0m47.215s
51.56% a.out a.out [.] __sanitizer_reset_coverage
6.68% a.out a.out [.] __sanitizer_cov
3.41% a.out a.out [.] __sanitizer::internal_bzero_aligned16(void*, unsigned long)
1.79% a.out a.out [.] __asan::Allocator::Allocate(unsigned long, unsigned long,
With this change:
real 0m31.661s
26.21% a.out a.out [.] sanitizer_reset_coverage
10.12% a.out a.out [.] sanitizer_cov
5.38% a.out a.out [.] __sanitizer::internal_bzero_aligned16(void*, unsigned long)
2.53% a.out a.out [.] __asan::Allocator::Allocate(unsigned long, unsigned long,
That's 33% speedup.
Reviewed in http://reviews.llvm.org/D14537
llvm-svn: 252715
Thinking more about the last commit I came to realize that for testing the
new functionality it is sufficient to verify that the iteration domains
we construct for a simple test case do not contain any of the complexity that
caused compile time issues for larger inputs.
llvm-svn: 252714
Previously, we just skipped error blocks during scop construction. With
this change we make sure we can construct domains for error blocks such that
these domains can be forwarded to subsequent basic blocks.
This change ensures that basic blocks that post-dominate and are dominated by
a basic block that branches to an error condition have the very same iteration
domain as the branching basic block. Before, this change we would construct
a domain that excludes all error conditions. Such domains could become _very_
complex and were undesirable to build.
Another solution would have been to drop these constraints using a
dominance/post-dominance check instead of modeling the error blocks. Such
a solution could also work in case of unreachable statements or infinite
loops in the scop. However, as we currently (to my believe incorrectly) model
unreachable basic blocks in the post-dominance tree, such a solution is not
yet feasible and requires first a change to LLVM's post-dominance tree
construction.
This commit addresses the most sever compile time issue reported in:
http://llvm.org/PR25458
llvm-svn: 252713
Especially for structs, the SAI object of a base pointer does not
describe all the types that the user might expect when he loads from
that base pointer. While we will still cast integers and pointers we
will now reload the value with the correct type if floating point and
non-floating point values are involved. However, there are now TODOs
where we use bitcasts instead of a proper conversion or reloading.
This fixes bug 25479.
llvm-svn: 252706
This test fails most of the time when run under heavy load. The dsym
variant doesn't seem to be failing.
Tracking XFAIL marker with:
https://llvm.org/bugs/show_bug.cgi?id=25485
llvm-svn: 252702
We now create all invariant equivalence classes for required invariant loads
instead of creating them on-demand. This way we can check if a parameter
references an invariant load that is actually not executed and was therefor
not materialized. If that happens the parameter is not materialized either.
This fixes bug 25469.
llvm-svn: 252701
Summary:
On Darwin, interposed functions are prefixed with "wrap_". On Linux,
they are prefixed with "__interceptor_".
Reviewers: dvyukov, samsonov, glider, kcc, kubabrecka
Subscribers: zaks.anna, llvm-commits
Differential Revision: http://reviews.llvm.org/D14512
llvm-svn: 252695
Re-implement `ilist_node::getNextNode()` and `getPrevNode()` without
relying on the sentinel having a "next" pointer. Instead, get access to
the owning list and compare against the `begin()` and `end()` iterators.
This only works when the node *can* get access to the owning list. The
new support is in `ilist_node_with_parent<>`, and any class `Ty`
inheriting from `ilist_node<NodeTy>` that wants `getNextNode()` and/or
`getPrevNode()` should inherit from
`ilist_node_with_parent<NodeTy, ParentTy>` instead. The requirements:
- `NodeTy` must have a `getParent()` function that returns the parent.
- `ParentTy` must have a `getSublistAccess()` static that, given a(n
ignored) `NodeTy*` (to determine which list), returns a member field
pointer to the appropriate `ilist<>`.
This isn't the cleanest way to get access to the owning list, but it
leverages the API already used in the IR hierarchy (see, e.g.,
`Instruction::getSublistAccess()`).
If anyone feels like ripping out the calls to `getNextNode()` and
`getPrevNode()` and replacing with direct iterator logic, they can also
remove the access function, etc., but as an incremental step, I'm
maintaining the API where it's currently used in tree.
If these requirements are *not* met, call sites with access to the ilist
can call `iplist<NodeTy>::getNextNode(NodeTy*)` directly, as in
ilistTest.cpp.
Why rewrite this?
The old code was broken, calling `getNext()` on a sentinel that possibly
didn't have a "next" pointer at all! The new code avoids that
particular flavour of UB (see the commit message for r252538 for more
details about the "lucky" memory layout that made this function so
interesting).
There's still some UB here: the end iterator gets downcast to `NodeTy*`,
even when it's a sentinel (which is typically
`ilist_half_node<NodeTy*>`). I'll tackle that in follow-up commits.
See this llvm-dev thread for more details:
http://lists.llvm.org/pipermail/llvm-dev/2015-October/091115.html
What's the danger?
There might be some code that relies on `getNextNode()` or
`getPrevNode()` *never* returning `nullptr` -- i.e., that relies on them
being broken when the sentinel is an `ilist_half_node<NodeTy>`. I tried
to root out those cases with the audits I did leading up to r252380, but
it's possible I missed one or two. I hope not.
(If (1) you have out-of-tree code, (2) you've reverted r252380
temporarily, and (3) you get some weird crashes with this commit, then I
recommend un-reverting r252380 and auditing the compile errors looking
for "strange" implicit conversions.)
llvm-svn: 252694
https://gcc.gnu.org/onlinedocs/gcc/Typeof.html
Differences from the GCC extension:
* __auto_type is also permitted in C++ (but only in places where
it could appear in C), allowing its use in headers that might
be shared across C and C++, or used from C++98
* __auto_type can be combined with a declarator, as with C++ auto
(for instance, "__auto_type *p")
* multiple variables can be declared in a single __auto_type
declaration, with the C++ semantics (the deduced type must be
the same in each case)
This patch also adds a missing restriction on applying typeof to
a bit-field, which GCC has historically rejected in C (due to
lack of clarity as to whether the operand should be promoted).
The same restriction also applies to __auto_type in C (in both
GCC and Clang).
This also fixes PR25449.
Patch by Nicholas Allegra!
llvm-svn: 252690
std::initializer_list<T> type. Instead, the list must contain a single element
and the type is deduced from that.
In Clang 3.7, we warned by default on all the cases that would change meaning
due to this change. In Clang 3.8, we will support only the new rules -- per
the request in N3922, this change is applied as a Defect Report against earlier
versions of the C++ standard.
This change is not entirely trivial, because for lambda init-captures we
previously did not track the difference between direct-list-initialization and
copy-list-initialization. The difference was not previously observable, because
the two forms of initialization always did the same thing (the elements of the
initializer list were always copy-initialized regardless of the initialization
style used for the init-capture).
llvm-svn: 252688
Summary:
First batch of sancov.py rewrite in C++.
Supports "-print" and "-covered_functions" commands.
Differential Revision: http://reviews.llvm.org/D14356
llvm-svn: 252683
leaq symbol@tlsld(%rip), %rdi
call __tls_get_addr@plt
symbol@tlsld (R_X86_64_TLSLD) instructs the linker to generate a tls_index entry (two GOT slots) in the GOT for the entire module (shared object or executable) with an offset of 0. The symbol for this GOT entry doesn't matter (as long as it's either local to the module or null), and gold doesn't put a symbol in the dynamic R_X86_64_DTPMOD64 relocation for the GOT entry.
All other platforms defined in http://www.akkadia.org/drepper/tls.pdf except for Itanium use a similar model where global and local dynamic GOT entries take up 2 contiguous GOT slots, so we can handle this in a unified manner if we don't care about Itanium.
While scanning relocations we need to identify local dynamic relocations and generate a single tls_index entry in the GOT for the module and store the address of it somewhere so we can later statically resolve the offset for R_X86_64_TLSLD relocations. We also need to generate a R_X86_64_DTPMOD64 relocation in the RelaDyn relocation section.
This implementation is a bit hacky. It side steps the issue of GotSection and RelocationSection only handling SymbolBody entries by relying on a specific relocation type. The alternative to this seemed to be completely rewriting how GotSection and RelocationSection work, or using a different hacky signaling method.
llvm-svn: 252682