as we don't necessarily need to do this yet - though we could move
the base class to the TargetMachine as it isn't subtarget dependent.
This reverts commit r232103.
llvm-svn: 232665
Currently v2i64 vectors shifts (non-equal shift amounts) are scalarized, costing 4 x extract, 2 x x86-shifts and 2 x insert instructions - and it gets even more awkward on 32-bit targets.
This patch separately shifts the vector by both shift amounts and then shuffles the partial results back together, costing 2 x shuffles and 2 x sse-shifts instructions (+ 2 movs on pre-AVX hardware).
Note - this patch only improves the SHL / LSHR logical shifts as only these are supported in SSE hardware.
Differential Revision: http://reviews.llvm.org/D8416
llvm-svn: 232660
Memcpy, and other memory intrinsics, typically tries to use LDM/STM if
the source and target addresses are 4-byte aligned. In CodeGenPrepare
look for calls to memory intrinsics and, if the object is on the
stack, 4-byte align it if it's large enough that we expect that memcpy
would want to use LDM/STM to copy it.
Differential Revision: http://reviews.llvm.org/D7908
llvm-svn: 232627
Currently, there are no itineraries defined for ext and ins instructions.
This patch adds these itineraries and uses them in the instruction definitions.
Reviewed By: dsanders
Differential Revision: http://reviews.llvm.org/D7209
llvm-svn: 232613
COFF COMDATs (for selection kinds other than 'select any') require at
least one non-section symbol in the symbol table.
Satisfy this by morally enhancing the linkage from private to internal.
Differential Revision: http://reviews.llvm.org/D8394
llvm-svn: 232570
Summary: Building FP16 constant vectors caused the FP16 data to be bitcast to i64. This patch creates a BITCAST node with the correct value, and adds a test to verify correct handling.
Reviewers: mcrosier
Reviewed By: mcrosier
Subscribers: mcrosier, jmolloy, ab, srhines, llvm-commits, rengolin, aemerson
Differential Revision: http://reviews.llvm.org/D8369
llvm-svn: 232562
Summary:
COFF COMDATs (for selection kinds other than 'select any') require at
least one non-section symbol in the symbol table.
Satisfy this by morally enhancing the linkage from private to internal.
Reviewers: rafael
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D8374
llvm-svn: 232539
Before this patch code wanting to create temporary labels for a given entity
(function, cu, exception range, etc) had to keep its own counter to have stable
symbol names.
createTempSymbol would still add a suffix to make sure a new symbol was always
returned, but it kept a single counter. Because of that, if we were to use
just createTempSymbol("cu_begin"), the label could change from cu_begin42 to
cu_begin43 because some other code started using temporary labels.
Simplify this by just keeping one counter per prefix and removing the various
specialized counters.
llvm-svn: 232535
We have observed that noreg was being generated due to a bug in FastIsel and was not being detected during emission. It happens that in the Asm emission there is an assertion that detects this in getRegisterName() from the tbl-generated file PPCGenAsmWriter.inc. However, when emitting an Obj file, invalid registers can be emitted given that no check are made in getBinaryCodeFromInstr() from PPCGenMCCodeEmitter.inc. In order to cover all cases this adds an assertion for reg operands in LowerPPCMachineInstrToMCInst.
llvm-svn: 232525
The input offset to needsFrameBaseReg is a negative value below the top of the
stack frame, but when converting to a positive offset from the bottom of the
stack frame this value was negated, causing the final offset to be too large
by twice the input offset's magnitude. Fix that by not negating the offset.
Patch by John Brawn
Differential Revision: http://reviews.llvm.org/D8316
llvm-svn: 232513
Summary:
But still handle them the same way since I don't know how they differ on
this target.
No functional change intended.
Reviewers: uweigand
Reviewed By: uweigand
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D8251
llvm-svn: 232495
The VSX stores are sometimes generated with a undefined index register, causing %noreg to be used and R0 to be emitted later on. The semantics of the VSX store (e.g. stdsdx) requires R0 to be used as base if we want zero to be used in the computation of the effective address instead of the content of R0. This patch checks if no index register was generated and forces R0 to be used as base address.
llvm-svn: 232486
Summary:
But still handle them the same way since I don't know how they differ on
this target.
No functional change intended.
Reviewers: kparzysz, adasgupt
Reviewed By: kparzysz, adasgupt
Subscribers: colinl, llvm-commits
Differential Revision: http://reviews.llvm.org/D8204
Like for the PowerPC target, I've had to add 'i' to the constraint mappings in
order to pass 2007-12-17-InvokeAsm.ll. It's not clear why 'i' has historically
been treated as a memory constraint.
llvm-svn: 232480
Summary:
This adds a MipsInstAlias which expands to XORi $reg,$reg,imm. For example, "xor $6, 0x3A" should be expanded to "xori $6, $6, 58".
This should work for all MIPS ISAs.
Reviewers: dsanders
Reviewed By: dsanders
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D8284
llvm-svn: 232473
It's not completely clear why 'i' has historically been treated as a memory
constraint. According to the documentation, it represents a constant immediate.
llvm-svn: 232470
ARMv6K is another layer between ARMV6 and ARMV6T2. This is the LLVM
side of the changes.
ARMV6 family LLVM implementation.
+-------------------------------------+
| ARMV6 |
+----------------+--------------------+
| ARMV6M (thumb) | ARMV6K (arm,thumb) | <- From ARMV6K and ARMV6M processors
+----------------+--------------------+ have support for hint instructions
| ARMV6T2 (arm,thumb,thumb2) | (SEV/WFE/WFI/NOP/YIELD). They can
+-------------------------------------+ be either real or default to NOP.
| ARMV7 (arm,thumb,thumb2) | The two processors also use
+-------------------------------------+ different encoding for them.
Patch by Vinicius Tinti.
llvm-svn: 232468
Summary:
But still handle them the same way since I don't know how they differ on
this target.
Of these, 'es', and 'Q' do not have backend tests but are accepted by
clang.
No functional change intended. Depends on D8173.
Reviewers: hfinkel
Reviewed By: hfinkel
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D8213
llvm-svn: 232466
Optimize concat_vectors of truncated vectors, where the intermediate
type is illegal, to avoid said illegality, e.g.,
(v4i16 (concat_vectors (v2i16 (truncate (v2i64))),
(v2i16 (truncate (v2i64)))))
->
(v4i16 (truncate (v4i32 (concat_vectors (v2i32 (truncate (v2i64))),
(v2i32 (truncate (v2i64)))))))
This isn't really target-specific, and, as such, would best go in the
DAGCombiner. However, ISD::TRUNCATE legality isn't keyed on both input
and result type, so we might generate worse code when we don't know
better. On AArch64 we know it's fine for v2i64->v4i16 and v4i32->v8i8.
rdar://20022387
llvm-svn: 232459
Fix justify error for small structures bigger than 32 bits in fixed
arguments for MIPS64 big endian. There was a problem when small structures
are passed as fixed arguments. The structures that are bigger than 32 bits
but smaller than 64 bits were not left justified properly on MIPS64 big
endian. This is fixed by shifting the value to make it left justified when
appropriate.
Patch by Aleksandar Beserminji.
Differential Revision: http://reviews.llvm.org/D8174
llvm-svn: 232382