The control flow finalizer would sometimes use an ALU_POP_AFTER
instruction before the vetex fetch clause instead of using a POP
instruction after it.
llvm-svn: 199917
Implement the getUnrollingPreferences() function for
AMDGPUTargetTransformInfo so that loops that do address calculations
on pointers derived from alloca are unconditionally unrolled.
Unrolling these loops makes it more likely that SROA will be able to
eliminate the allocas, which is a big win for R600 since memory
allocated by alloca (private memory) is really slow.
llvm-svn: 199916
Argument promotion can replace an argument of a call with an alloca. This
requires clearing the tail marker as it is very likely that the callee is now
using an alloca in the caller.
This fixes pr14710.
llvm-svn: 199909
This reverts Host.cpp LaunchProcess spawn behavior on FreeBSD to be
like Linux (and unlike OS X) with regards to how default signal
handlers and setup on the spawned process. FreeBSD does not reset
default signal handlers on the spawned process after this change.
llvm-svn: 199908
The unit test is now disabled on non-asserts builds.
The CF stack can be corrupted if you use CF_ALU_PUSH_BEFORE,
CF_ALU_ELSE_AFTER, CF_ALU_BREAK, or CF_ALU_CONTINUE when the number of
sub-entries on the stack is greater than or equal to the stack entry
size and sub-entries modulo 4 is either 0 or 3 (on cedar the bug is
present when number of sub-entries module 8 is either 7 or 0)
We choose to be conservative and always apply the work-around when the
number of sub-enries is greater than or equal to the stack entry size,
so that we can safely over-allocate the stack when we are unsure of the
stack allocation rules.
reviewed-by: Vincent Lejeune <vljn at ovi.com>
llvm-svn: 199905
Due to statement expressions supported as GCC extension, it is possible
to put 'break' or 'continue' into a loop/switch statement but outside
its body, for example:
for ( ; ({ if (first) { first = 0; continue; } 0; }); )
This code is rejected by GCC if compiled in C mode but is accepted in C++
code. GCC bug 44715 tracks this discrepancy. Clang used code generation
that differs from GCC in both modes: only statement of the third
expression of 'for' behaves as if it was inside loop body.
This change makes code generation more close to GCC, considering 'break'
or 'continue' statement in condition and increment expressions of a
loop as it was inside the loop body. It also adds error for the cases
when 'break'/'continue' appear outside loop due to this syntax. If
code generation differ from GCC, warning is issued.
Differential Revision: http://llvm-reviews.chandlerc.com/D2518
llvm-svn: 199897
This is a simpler rule, broadly in line with previous Darwin (which chose
between "soft" and "softfp") but probably safer. In practice the only real
reason for "softfp" is ABI compatibility, not usually an issue on limited chips
like these, so anyone who wanted hard-float should already be saying so.
That's my story and I'm sticking to it.
rdar://problem/15887493
llvm-svn: 199896
More universal way of removing trailing whitespace characters then 'chomp' does. Chomp "removes any trailing string that corresponds to the current value of $/" (quote from perldoc). In my case an input ended with '\r\r\n', chomp left '\r' at the end of input and the script ended up with an error "Use of uninitialized value in concatenation (.) or string"
llvm-svn: 199892
With constant-sharing, litpool loads consume 4 + N*2 bytes of code, but
movw/movt pairs consume 8*N. This means litpools are better than movw/movt even
with just one use. Other materialisation strategies can still be better though,
so the logic is a little odd.
llvm-svn: 199891
function and a FunctionPass.
This has many benefits. The motivating use case was to be able to
compute function analysis passes *after* running LoopSimplify (to avoid
invalidating them) and then to run other passes which require
LoopSimplify. Specifically passes like unrolling and vectorization are
critical to wire up to BranchProbabilityInfo and BlockFrequencyInfo so
that they can be profile aware. For the LoopVectorize pass the only
things in the way are LoopSimplify and LCSSA. This fixes LoopSimplify
and LCSSA is next on my list.
There are also a bunch of other benefits of doing this:
- It is now very feasible to make more passes *preserve* LoopSimplify
because they can simply run it after changing a loop. Because
subsequence passes can assume LoopSimplify is preserved we can reduce
the runs of this pass to the times when we actually mutate a loop
structure.
- The new pass manager should be able to more easily support loop passes
factored in this way.
- We can at long, long last observe that LoopSimplify is preserved
across SCEV. This *halves* the number of times we run LoopSimplify!!!
Now, getting here wasn't trivial. First off, the interfaces used by
LoopSimplify are all over the map regarding how analysis are updated. We
end up with weird "pass" parameters as a consequence. I'll try to clean
at least some of this up later -- I'll have to have it all clean for the
new pass manager.
Next up I discovered a really frustrating bug. LoopUnroll *claims* to
preserve LoopSimplify. That's actually a lie. But the way the
LoopPassManager ends up running the passes, it always ran LoopSimplify
on the unrolled-into loop, rectifying this oversight before any
verification could kick in and point out that in fact nothing was
preserved. So I've added code to the unroller to *actually* simplify the
surrounding loop when it succeeds at unrolling.
The only functional change in the test suite is that we now catch a case
that was previously missed because SCEV and other loop transforms see
their containing loops as simplified and thus don't miss some
opportunities. One test case has been converted to check that we catch
this case rather than checking that we miss it but at least don't get
the wrong answer.
Note that I have #if-ed out all of the verification logic in
LoopSimplify! This is a temporary workaround while extracting these bits
from the LoopPassManager. Currently, there is no way to have a pass in
the LoopPassManager which preserves LoopSimplify along with one which
does not. The LPM will try to verify on each loop in the nest that
LoopSimplify holds but the now-Function-pass cannot distinguish what
loop is being verified and so must try to verify all of them. The inner
most loop is clearly no longer simplified as there is a pass which
didn't even *attempt* to preserve it. =/ Once I get LCSSA out (and maybe
LoopVectorize and some other fixes) I'll be able to re-enable this check
and catch any places where we are still failing to preserve
LoopSimplify. If this causes problems I can back this out and try to
commit *all* of this at once, but so far this seems to work and allow
much more incremental progress.
llvm-svn: 199884
Eliminate the copies LLVM's System mmap and cache invalidation code. These were
slowly drifting away from the original version, and moreover the copied code
was a dead end in terms of portability.
We now statically link to Support but in practice with stripping this adds next
to no weight to the resultant binary.
Also avoid installing lli-child-target to the user's $PATH. It's not meant to
be run directly.
llvm-svn: 199881
warning: ISO C99 requires rest arguments to be used [enabled by default]
INTERCEPTOR(char *, dlerror) {
warning: invoking macro INTERCEPTOR argument 3: empty macro arguments are undefined in ISO C90 and ISO C++98 [enabled by default]
llvm-svn: 199873
New/delete implementations in system libraries almost always are built without
frame pointers. As we switched to frame pointer based unwinder on ARM, they no
longer work for us, resulting in broken allocation/deallocation stacks.
Note that this does not work with statically linked
libstdc++/libc++/libstlport.
llvm-svn: 199872
e.g. linkonce, to TargetMachine and set it when we've done so
for ELF targets currently. This involved making TargetMachine
non-const in a TLOF use and propagating that change around - I'm
open to other ideas.
This will be used in a future commit to handle emitting debug
information with ranges.
llvm-svn: 199871
Some ABIs have different return types for constructors and
destructors, and we're just looking for the end of the function
here. Loosen up the regex.
llvm-svn: 199870
If there are non-trivially-copyable types /other/ than C++ records, we
won't have a synthesized copy expression, but we can't just use a simple
load/return.
Also, add comments and shore up tests, making sure to test in both ARC
and non-ARC.
llvm-svn: 199869