The specializations were broken. For example,
void foo(const CallGraph *G) {
auto I = GraphTraits<const CallGraph *>::nodes_begin(G);
auto K = I++;
...
}
or
void bar(const CallGraphNode *N) {
auto I = GraphTraits<const CallGraphNode *>::nodes_begin(G);
auto K = I++;
....
}
would not compile.
Patch by Speziale Ettore!
llvm-svn: 222149
This creates a TargetThreadWindows class and updates the thread
list of the Process with the main thread. Additionally, we
fill out a few more overrides of Process base class methods. We
do not yet update the thread list as threads are created and/or
destroyed, and we do not yet propagate stop reasons to threads as
their states change.
llvm-svn: 222148
Summary:
The google-readability-namespace-comments/llvm-namespace-comment
warnings are quite confusing when they appear at the beginning of a long
namespace and the closing brace is not in sight.
For convenience added notes pointing to the start of the namespace.
Reviewers: klimek
Reviewed By: klimek
Subscribers: curdeius, cfe-commits
Differential Revision: http://reviews.llvm.org/D6251
llvm-svn: 222145
This test has intermittently failed on FreeBSD for quite some time when
run as part of the full test suite. It generally passes when run by
itself. Mark as expected failure for now to reduce buildbot noise.
llvm.org/pr15039 test fails intermittently on FreeBSD
llvm-svn: 222134
Summary:
The generic variadic matcher is faster (one less virtual function call
per match) and doesn't require template instantiations which reduces
compile time and binary size.
Registry.cpp.o generates ~14% less symbols and compiles ~7.5% faster.
The change also speeds up our clang-tidy benchmark by ~2%.
Reviewers: klimek
Subscribers: klimek, cfe-commits
Differential Revision: http://reviews.llvm.org/D6278
llvm-svn: 222131
The triple parser should only accept existing architecture names
when the triple starts with armv, armebv, thumbv or thumbebv.
Patch by Gabor Ballabas.
llvm-svn: 222129
information from the compact unwind section used on darwin
for exception handling, and dump that information..
The UNWIND_X86_64_MODE_RBP_FRAME and UNWIND_X86_64_MODE_DWARF
entries look to be handled correctly.
UNWIND_X86_64_MODE_STACK_IMMD and UNWIND_X86_64_MODE_STACK_IND
are still a work in progress.
Only x86_64 is supported right now. Given that this is an
experiment in parsing the section contents, I don't expect to
add other architectures; they are trivial variations on this
arch. There exists a real dumper included in the Xcode tools,
unwinddump.
llvm-svn: 222127
SCEVDivision::divide constructed an object of SCEVDivision<Derived>
instead of Derived. divide would call visit which would cast the
SCEVDivision<Derived> to type Derived. As it happens,
SCEVDivision<Derived> and Derived currently have the same layout but
this is fragile and grounds for UB.
Instead, just construct Derived. No functional change intended.
llvm-svn: 222126
This was motivated by a bug which caused code like this to be
miscompiled:
declare void @take_ptr(i8*)
define void @test() {
%addr1.32 = alloca i8
%addr2.32 = alloca i32, i32 1028
call void @take_ptr(i8* %addr1)
ret void
}
This was emitting the following assembly to get the value of %addr1:
add r0, sp, #1020
add r0, r0, #8
However, "add r0, r0, #8" is not a valid Thumb1 instruction, and this
could not be assembled. The generated object file contained this,
resulting in r0 holding SP+8 rather tha SP+1028:
add r0, sp, #1020
add r0, sp, #8
This function looked like it could have caused miscompilations for
other combinations of registers and offsets (though I don't think it is
currently called with these), and the heuristic it used did not match
the emitted code in all cases.
llvm-svn: 222125
We were a little lax in a few areas:
- We pretended that import libraries were like any old COFF file, they
are not. In fact, they aren't really COFF files at all, we should
probably grow some specialized functionality to handle them smarter.
- Our symbol iterators were more than happy to attempt to go past the
end of the symbol table if you had a symbol with a bad list of
auxiliary symbols.
llvm-svn: 222124
Some optimisations in DAGCombiner cause miscompilations for targets that use
TargetLowering::UndefinedBooleanContent, because they assume that the results
of a SELECT_CC node are boolean values, and can be safely ANDed, ORed and
XORed. These optimisations are only valid for targets that use
ZeroOrOneBooleanContent or ZeroOrNegativeOneBooleanContent.
This is a follow-up to D6210/r221693.
llvm-svn: 222123
This is a simple optimization for switch table lookup:
It computes the output value directly with an (optional) mul and add if there is a linear mapping between index and output.
Example:
int f1(int x) {
switch (x) {
case 0: return 10;
case 1: return 11;
case 2: return 12;
case 3: return 13;
}
return 0;
}
generates:
define i32 @f1(i32 %x) #0 {
entry:
%0 = icmp ult i32 %x, 4
br i1 %0, label %switch.lookup, label %return
switch.lookup:
%switch.offset = add i32 %x, 10
ret i32 %switch.offset
return:
ret i32 0
}
llvm-svn: 222121
Indices into the table are stored in each MCRegisterClass instead of a pointer. A new method, getRegClassName, is added to MCRegisterInfo and TargetRegisterInfo to lookup the string in the table.
llvm-svn: 222118
This adds back r222061, but now calls initializePAEvalPass from the correct
library to avoid link problems.
Original message:
Don't make assumptions about the name of private global variables.
Private variables are can be renamed, so it is not reliable to make
decisions on the name.
The name is also dropped by the assembler before getting to the
linker, so using the name causes a disconnect between how llvm makes a
decision (var name) and how the linker makes a decision (section it is
in).
This patch changes one case where we were looking at the variable name to use
the section instead.
Test tuning by Michael Gottesman.
llvm-svn: 222117
SCEV based code generation allows Polly to detect and generate code for loops
that do not have an explicit induction variable, but only virtual induction
variables given by SCEV.
Being able to do so has two main benefits:
- We can detect more scops by default
- We require less canonicalization before Polly, which means we get closer
to our goal of not touching the IR before analyzing its properties.
Specifically, we do not need to run -polly-indvars to introduce explicit
canonical induction variables.
This switch became possible as both the isl code generation and -polly-parallel
are LNT error free with SCEV based code generation and the isl ast generator.
llvm-svn: 222113
This patch includes tests where we actually need to adjust the CHECK lines
for SCEV based code generation. Besides these adjustments we add explicit
calls to -polly-codegen-scev=[true|false] and make sure we test both cases.
llvm-svn: 222112
In particular, make SanitizerArgs responsible for parsing
and passing down to frontend -fsanitize-recover and
-fsanitize-undefined-trap-on-error flags.
Simplify parsing -f(no-)sanitize= flags parsing: get rid of
too complex filterUnsupportedKinds function.
No functionality change.
llvm-svn: 222105
It turns out that not all users of SCEVDivision want the same
signedness. Let the users determine which operation they'd like by
explicitly choosing SCEVUDivision or SCEVSDivision.
findArrayDimensions and computeAccessFunctions will use SCEVSDivision
while HowFarToZero will use SCEVUDivision.
llvm-svn: 222104
This prevents SCEVs to reference values not valid any more and as a consequence
solves a bug where such values reintroduced during ast generation caused the
independent blocks pass to fail validation.
http://llvm.org/PR21204
llvm-svn: 222103
The isl based backend has been tested since a long time and with the recently
commited OpenMP support the last missing piece of functionality was ported from
the CLooG backend.
The isl based backend gives us interesting new functionality:
- Run-time alias checks (enabled by default)
Optimize scops that contain possibly aliasing pointers. This feature has
largely increased the number of loop nests we consider for optimization.
Thanks Johannes!
- Delinearization (not yet enabled by default)
Model accesses to multi-dimensional arrays precisely. This will allow us to
understand kernels with multi-dimensional VLAs written in Julia, boost::ublas,
coremark or C99.
Thanks Sebastian!
- Generation of higher quality code
Sven and me spent a long time to optimize the quality of the generated code. A
major focus were expressions as they result from modulos/divisions or
piecewise affine expressions (a ? b : c).
- Full/Partial tile separation, polyhedral unrolling
The isl code generation provides functionality to generate specialized code
for core and cleanup loops and to specialize code using polyhedral context
information while unrolling statements.
(not yet exploited in Polly)
- Modifieable access functions
We can now use standard isl functionality to remap memory accesses to new
data locations. A standard use case is the use of shared memory, where
accesses to a larger region in global memory need to be mapped to a smaller
shared memory region using a modulo mapping.
(not yet exploited in Polly)
The cloog based code generation is still available for comparision, but is
scheduled for removal.
llvm-svn: 222101
Summary:
Several places in DependenceAnalysis assumes both SCEVs in a subscript pair
share the same integer type. For instance, isKnownPredicate calls
SE->getMinusSCEV(X, Y) which asserts X and Y share the same type. However,
DependenceAnalysis fails to ensure this assumption when producing a subscript
pair, causing tests such as NonCanonicalizedSubscript to crash. With this
patch, DependenceAnalysis runs unifySubscriptType before producing any
subscript pair, ensuring the assumption.
Test Plan:
Added NonCanonicalizedSubscript.ll on which DependenceAnalysis before the fix
crashed because subscripts have different types.
Reviewers: spop, sebpop, jingyue
Reviewed By: jingyue
Subscribers: eliben, meheff, llvm-commits
Differential Revision: http://reviews.llvm.org/D6289
llvm-svn: 222100
Instead of parallelizing every parallel outermost loop, we now use a very
minimalistic cost model. Specifically, we assume innermost loops are not
worth parallelising and all non-innermost loops are.
When parallelizing all loops in LNT we got several slowdowns/timeouts due to
us parallelizing innermost loops that are executed only a couple of times
(number of iterations not known statically). With this basic heuristic enabled
LNT does not show any more timeouts, while several interesting loops are still
parallelized.
There are many ways to obtain an improved heuristic. Constructing such an
improvide heuristic from a position of minimal slow-down and zero code size
increase seems to be the best, as it allows us to track progress on LNT.
llvm-svn: 222096