These copies would coalesce easily, but the resulting value would be
defined by a deleted instruction. Now we also remove the undefined value
number from the destination register.
This fixes PR10503.
llvm-svn: 136174
On x86 we can't encode an immediate LHS of a sub directly. If the RHS comes from a XOR with a constant we can
fold the negation into the xor and add one to the immediate of the sub. Then we can turn the sub into an add,
which can be commuted and encoded efficiently.
This code is generated for __builtin_clz and friends.
llvm-svn: 136167
shuffle before inserting on a 256-bit vector.
- Add AVX versions of movd/movq instructions
- Introduce a few COPY patterns to match insert_subvector instructions.
This turns a trivial insert_subvector instruction into a register copy,
coalescing the xmm into a ymm and avoid emiting on more instruction.
llvm-svn: 136002
Fix the Rn register encoding for both SSAT and USAT. Update the parsing of the
shift operand to correctly handle the allowed shift types and immediate ranges
and issue meaningful diagnostics when an illegal value or shift type is
specified. Add aliases to parse an ommitted shift operand (default value of
'lsl #0').
Add tests for diagnostics and proper encoding.
llvm-svn: 135990
The .local, .hidden, .internal, and .protected are not legal for all supported
file formats (in particular, they're invalid for MachO). Move the parsing for
them into the ELF assembly parser since that's the format they're for.
Similarly, .weak is used by COFF and ELF, but not MachO, so move the parsing
to the COFF and ELF asm parsers. Previously, using any of these directives
on Darwin would result in an assertion failure in the parser; now we get
a diagnostic as we should.
rdar://9827089
llvm-svn: 135921
This fixes PR10463. A two-address instruction with an <undef> use
operand was incorrectly rewritten so the def and use no longer used the
same register, violating the tie constraint.
Fix this by always rewriting <undef> operands with the register a def
operand would use.
llvm-svn: 135885
and was actually very wrong, fix it and make it simpler. Also remove the
ConcatVectors function, which is unused now.
- Fix a introduction of useless nodes in r126664 and r126264. The
VUNPCKL* should never be introduced cause we don't want duplicate
nodes for 128 AVX and non-AVX modes, the actual instruction
difference only exists during isel, but not for target specific DAG
nodes. We only introduce V* target nodes when there is no 128-bit
version already there.
- Fix a fragile test and make it more useful.
llvm-svn: 135729
size but different element types, so that it filters out the cases
that CreateShuffleVectorCast doesn't handle. This fixes rdar://9786827.
llvm-svn: 135721
Aliases for LDM/STM. The single-register versions should encode to LDR/STR
with writeback, but we don't (yet) get that correct. Neither does Darwin's
system assembler, though, so that's not a deal-breaker of a limitation.
llvm-svn: 135702
instruction introduced in AVX, which can operate on 128 and 256-bit vectors.
It considers a 256-bit vector as two independent 128-bit lanes. It can permute
any 32 or 64 elements inside a lane, and restricts the second lane to
have the same permutation of the first one. With the improved splat support
introduced early today, adding codegen for this instruction enable more
efficient 256-bit code:
Instead of:
vextractf128 $0, %ymm0, %xmm0
punpcklbw %xmm0, %xmm0
punpckhbw %xmm0, %xmm0
vinsertf128 $0, %xmm0, %ymm0, %ymm1
vinsertf128 $1, %xmm0, %ymm1, %ymm0
vextractf128 $1, %ymm0, %xmm1
shufps $1, %xmm1, %xmm1
movss %xmm1, 28(%rsp)
movss %xmm1, 24(%rsp)
movss %xmm1, 20(%rsp)
movss %xmm1, 16(%rsp)
vextractf128 $0, %ymm0, %xmm0
shufps $1, %xmm0, %xmm0
movss %xmm0, 12(%rsp)
movss %xmm0, 8(%rsp)
movss %xmm0, 4(%rsp)
movss %xmm0, (%rsp)
vmovaps (%rsp), %ymm0
We get:
vextractf128 $0, %ymm0, %xmm0
punpcklbw %xmm0, %xmm0
punpckhbw %xmm0, %xmm0
vinsertf128 $0, %xmm0, %ymm0, %ymm1
vinsertf128 $1, %xmm0, %ymm1, %ymm0
vpermilps $85, %ymm0, %ymm0
llvm-svn: 135662
The system register spec should be case insensitive. The preferred form for
output with mask values of 4, 8, and 12 references APSR rather than CPSR.
Update and tidy up tests accordingly.
llvm-svn: 135532
Correct the handling of the 's' suffix when parsing ARM mode. It's only a
truly separate opcode in Thumb. Add test cases to make sure we handle
the s and condition suffices correctly, including diagnostics.
llvm-svn: 135513
Add range checking for the immediate operand and handle the "mov" mnemonic
choosing between encodings based on the value of the immediate. Add tests
for fixups, encoding choice and values, and diagnostic for out of range values.
llvm-svn: 135500
- In EmitAtomicBinaryPartword, mask incr in loopMBB only if atomic.swap is the
instruction being expanded, instead of masking it in thisMBB.
- Remove redundant Or in EmitAtomicCmpSwap.
llvm-svn: 135495
(including compilation, assembly). Move relocation model Reloc::Model from
TargetMachine to MCCodeGenInfo so it's accessible even without TargetMachine.
llvm-svn: 135468
For -disable-iv-rewrite, perform LFTR without generating a new
"canonical" induction variable. Instead find the "best" existing
induction variable for use in the loop exit test and compute the final
value of that IV for use in the new loop exit test. In short,
convert to a simple eq/ne exit test as long as it's cheap to do so.
llvm-svn: 135420
moving them out of the loop. Previously, stores and loads to a stack frame
object were inserted to accomplish this. Remove the code that was needed to do
this. Patch by Sasa Stankovic.
llvm-svn: 135415
When splitting a live range immediately before an LDR_POST instruction
that redefines the address register, make sure to use the correct value
number in leaveIntvBefore.
We need the value number entering the instruction.
<rdar://problem/9793765>
llvm-svn: 135413