taken (and used!). This prevents merging the blocks (invalidating
the block addresses) in a case like this:
#define _THIS_IP_ ({ __label__ __here; __here: (unsigned long)&&__here; })
void foo() {
printf("%p\n", _THIS_IP_);
printf("%p\n", _THIS_IP_);
printf("%p\n", _THIS_IP_);
}
which fixes PR4151.
llvm-svn: 125829
No one uses *-mingw64. mingw-w64 is represented as {i686|x86_64}-w64-mingw32. In llvm side, i686 and x64 can be treated as similar way.
llvm-svn: 125747
transformation if we can't legally create a build vector of the correct
type. Check that we can make the transformation first, and add a TODO to
refactor this code with similar cases.
Fixes: PR9223 and rdar://9000350
llvm-svn: 125631
Machine instruction range consisting of only DBG_VALUE MIs only contributes consecutive labels in assembly output, which is harmless, and empty scope entry in DebugInfo, which confuses debugger tools.
llvm-svn: 125577
The i64_buildvector test in this file relies on the alignment of i64 and
f64 types being the same, which is true for Darwin but not AAPCS.
llvm-svn: 125525
- Add custom operand matching for imod and iflags.
- Rename SplitMnemonicAndCC to SplitMnemonic since it splits more than CC
from mnemonic.
- While adding ".w" as an operand, don't change "Head" to avoid passing the
wrong mnemonic to ParseOperand.
- Add asm parser tests.
- Add disassembler tests just to make sure it can catch all cps versions.
llvm-svn: 125489
have their low bits set to zero. This allows us to optimize
out explicit stack alignment code like in stack-align.ll:test4 when
it is redundant.
Doing this causes the code generator to start turning FI+cst into
FI|cst all over the place, which is general goodness (that is the
canonical form) except that various pieces of the code generator
don't handle OR aggressively. Fix this by introducing a new
SelectionDAG::isBaseWithConstantOffset predicate, and using it
in places that are looking for ADD(X,CST). The ARM backend in
particular was missing a lot of addressing mode folding opportunities
around OR.
llvm-svn: 125470
plus some variations of this. According to my auto-simplifier this occurs a lot
but usually in combination with max/min idioms. Because max/min aren't handled
yet this unfortunately doesn't have much effect in the testsuite.
llvm-svn: 125462
It caused a crash in MultiSource/Benchmarks/Bullet.
Opt hit an assertion with "opt -std-compile-opts" because
Constant::getAllOnesValue doesn't know how to handle floats.
This patch added a test to reproduce the problem and a check that the
destination vector is of integer type.
Thank you Benjamin!
llvm-svn: 125459
These are just FXSAVE and FXRSTOR with REX.W prefixes. These versions use
64-bit pointer values instead of 32-bit pointer values in the memory map they
dump and restore.
llvm-svn: 125446
The DAGCombiner created illegal BUILD_VECTOR operations.
The patch added a check that either illegal operations are
allowed or that the created operation is legal.
llvm-svn: 125435
unsigned overflow (e.g. "gep P, -1"), and while they can have
signed wrap in theoretical situations, modelling an AddRec as
not having signed wrap is going enough for any case we can
think of today. In the future if this isn't enough, we can
revisit this. Modeling them as having NUW isn't causing any
known problems either FWIW.
llvm-svn: 125410
This
define float @foo(float %x, float %y) nounwind readnone {
entry:
%0 = tail call float @copysignf(float %x, float %y) nounwind readnone
ret float %0
}
Was compiled to:
vmov s0, r1
bic r0, r0, #-2147483648
vmov s1, r0
vcmpe.f32 s0, #0
vmrs apsr_nzcv, fpscr
it lt
vneglt.f32 s1, s1
vmov r0, s1
bx lr
This fails to copy the sign of -0.0f because it's lost during the float to int
conversion. Also, it's sub-optimal when the inputs are in GPR registers.
Now it uses integer and + or operations when it's profitable. And it's correct!
lsrs r1, r1, #31
bfi r0, r1, #31, #1
bx lr
rdar://8984306
llvm-svn: 125357
gep to explicit addressing, we know that none of the intermediate
computation overflows.
This could use review: it seems that the shifts certainly wouldn't
overflow, but could the intermediate adds overflow if there is a
negative index?
Previously the testcase would instcombine to:
define i1 @test(i64 %i) {
%p1.idx.mask = and i64 %i, 4611686018427387903
%cmp = icmp eq i64 %p1.idx.mask, 1000
ret i1 %cmp
}
now we get:
define i1 @test(i64 %i) {
%cmp = icmp eq i64 %i, 1000
ret i1 %cmp
}
llvm-svn: 125271
exact/nsw/nuw shifts and have instcombine infer them when it can prove
that the relevant properties are true for a given shift without them.
Also, a variety of refactoring to use the new patternmatch logic thrown
in for good luck. I believe that this takes care of a bunch of related
code quality issues attached to PR8862.
llvm-svn: 125267
optimizations to be much more aggressive in the face of
exact/nsw/nuw div and shifts. For example, these (which
are the same except the first is 'exact' sdiv:
define i1 @sdiv_icmp4_exact(i64 %X) nounwind {
%A = sdiv exact i64 %X, -5 ; X/-5 == 0 --> x == 0
%B = icmp eq i64 %A, 0
ret i1 %B
}
define i1 @sdiv_icmp4(i64 %X) nounwind {
%A = sdiv i64 %X, -5 ; X/-5 == 0 --> x == 0
%B = icmp eq i64 %A, 0
ret i1 %B
}
compile down to:
define i1 @sdiv_icmp4_exact(i64 %X) nounwind {
%1 = icmp eq i64 %X, 0
ret i1 %1
}
define i1 @sdiv_icmp4(i64 %X) nounwind {
%X.off = add i64 %X, 4
%1 = icmp ult i64 %X.off, 9
ret i1 %1
}
This happens when you do something like:
(ptr1-ptr2) == 42
where the pointers are pointers to non-unit types.
llvm-svn: 125266
When matching operands for a candidate opcode match in the auto-generated
AsmMatcher, check each operand against the expected operand match class.
Previously, operands were classified independently of the opcode being
handled, which led to difficulties when operand match classes were
more complicated than simple subclass relationships.
llvm-svn: 125245
could end up removing a different function than we intended because it was
functionally equivalent, then end up with a comparison of a function against
itself in the next round of comparisons (the one in the function set and the
one on the deferred list). To fix this, I introduce a choice in the form of
comparison for ComparableFunctions, either normal or "pointer only" used to
find exact Function*'s in lookups.
Also add some debugging statements.
llvm-svn: 125180
The vld1-lane, vld1-dup and vst1-lane instructions do not yet support using
post-increment versions, but all the rest of the NEON load/store instructions
should be handled now.
llvm-svn: 125014
failures with relocations.
The code committed is a first cut at compatibility for emitted relocations in
ELF .o.
Why do this? because existing ARM tools like emitting relocs symbols as
explicit relocations, not as section-offset relocs.
Result is that with these changes,
1) relocs are now substantially identical what to gcc outputs.
2) larger apps (including many spec2k tests) compile, cross-link, and pass
Added reminder fixme to tests for future conversion to .s form.
llvm-svn: 124996
(yes, this is different from R_ARM_CALL)
- Adds a new method getARMBranchTargetOpValue() which handles the
necessary distinction between the conditional and unconditional br/bl
needed for ARM/ELF
At least for ARM mode, the needed fixup for conditional versus unconditional
br/bl is identical, but the ARM docs and existing ARM tools expect this
reloc type...
Added a few FIXME's for future naming fixups in ARMInstrInfo.td
llvm-svn: 124895
auto-simplifier). This has a big impact on Ada code, but not much else.
Unfortunately the impact is mostly negative! This is due to PR9004 (aka
SCCP failing to resolve conditional branch conditions in the destination
blocks of the branch), in which simple correlated expressions are not
resolved but complicated ones are, so simplifying has a bad effect!
llvm-svn: 124788
Reversing the operands allows us to fold, but doesn't force us to. Also, at
this point the DAG is still being optimized, so the check for hasOneUse is not
very precise.
llvm-svn: 124773
overflow (nsw flag), which was disabled because it breaks 254.gap. I have
informed the GAP authors of the mistake in their code, and arranged for the
testsuite to use -fwrapv when compiling this benchmark.
llvm-svn: 124746
This makes the job of the later optzn passes easier, allowing the vast amount of
icmp transforms to chew on it.
We transform 840 switches in gcc.c, leading to a 16k byte shrink of the resulting
binary on i386-linux.
The testcase from README.txt now compiles into
decl %edi
cmpl $3, %edi
sbbl %eax, %eax
andl $1, %eax
ret
llvm-svn: 124724