to TargetFrameLowering, where it belongs. Incidentally, this allows us
to delete some duplicated (and slightly different!) code in TRI.
There are potentially other layering problems that can be cleaned up
as a result, or in a similar manner.
The refactoring was OK'd by Anton Korobeynikov on llvmdev.
Note: this touches the target interfaces, so out-of-tree targets may
be affected.
llvm-svn: 175788
Large code model is identical to medium code model except that the
addis/addi sequence for "local" accesses is never used. All accesses
use the addis/ld sequence.
The coding changes are straightforward; most of the patch is taken up
with creating variants of the medium model tests for large model.
llvm-svn: 175767
exists solely to enable it to call itself for i8 with some registers.
The proposed patch simplifies the function somewhat to make the High
bit only meaningful for the i8 mode, which makes sense. No functional
difference (getX86SubSuperRegister is not getting called from anywhere
outside with i64 and High=true).
llvm-svn: 175762
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 175758
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 175757
It actually fixes quite a bunch of piglit tests.
This is a candidate for the mesa-stable branch.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 175756
Instead of using custom inserters, it's simpler and
should make DAG folding easier.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 175755
v2: put implicit parameters in []
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 175754
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 175753
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 175752
Order the classes and add asm operands.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 175751
Fixing asm operation names.
v2: fix name of the e64 encoding, also add asm operands
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 175750
Fixing asm operation names.
v2: use ZERO constant, also add asm operands
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 175749
Fixing asm operation names.
v2: use ZERO constant, also add asm operands
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 175748
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 175747
Those two files got mixed up.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 175746
Fixes for-loop.cl piglit test
Patch By: Vincent Lejeune
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
NOTE: This is a candidate for the Mesa stable branch.
llvm-svn: 175742
The constructs %hi() and %lo() represent the high and low 16
bits of the address.
Because the 16 bit offset field of an LW instruction is
interpreted as signed, if bit 15 of the low part is 1 then the
low part will act as a negative and 1 needs to be added to the
high part.
Contributer: Vladimir Medic
llvm-svn: 175707
This patch implements the PPCDAGToDAGISel::PostprocessISelDAG virtual
method to perform post-selection peephole optimizations on the DAG
representation.
One optimization is implemented here: folds to clean up complex
addressing expressions for thread-local storage and medium code
model. It will also be useful for large code model sequences when
those are added later. I originally thought about doing this on the
MI representation prior to register assignment, but it's difficult to
do effective global dead code elimination at that point. DCE is
trivial on the DAG representation.
A typical example of a candidate code sequence in assembly:
addis 3, 2, globalvar@toc@ha
addi 3, 3, globalvar@toc@l
lwz 5, 0(3)
When the final instruction is a load or store with an immediate offset
of zero, the offset from the add-immediate can replace the zero,
provided the relocation information is carried along:
addis 3, 2, globalvar@toc@ha
lwz 5, globalvar@toc@l(3)
Since the addi can in general have multiple uses, we need to only
delete the instruction when the last use is removed.
llvm-svn: 175697
excluding visibility bits.
Mips specific standalone assembler directive "set at".
This directive changes the general purpose register
that the assembler will use when given the symbolic
register name $at.
This does not include negative testing. That will come
in a future patch.
A side affect of this patch recognizes the different
GPR register names for temporaries between old abi
and new abi so a test case for that is included.
Contributer: Vladimir Medic
llvm-svn: 175686
This handles the cases where the 6-bit splat element is odd, converting
to a three-instruction sequence to add or subtract two splats. With this
fix, the XFAIL in test/CodeGen/PowerPC/vec_constants.ll is removed.
llvm-svn: 175663
The PPC backend doesn't handle these correctly. This patch uses logic
similar to that in the X86 and ARM backends to track these arguments
properly.
llvm-svn: 175635
Add HexagonMCInst class which adds various Hexagon VLIW annotations.
In addition, this class also includes some APIs related to the
constant extenders.
llvm-svn: 175634
During lowering of a BUILD_VECTOR, we look for opportunities to use a
vector splat. When the splatted value fits in 5 signed bits, a single
splat does the job. When it doesn't fit in 5 bits but does fit in 6,
and is an even value, we can splat on half the value and add the result
to itself.
This last optimization hasn't been working recently because of improved
constant folding. To circumvent this, create a pseudo VADD_SPLAT that
can be expanded during instruction selection.
llvm-svn: 175632
sext <4 x i1> to <4 x i64>
sext <4 x i8> to <4 x i64>
sext <4 x i16> to <4 x i64>
I'm running Combine on SIGN_EXTEND_IN_REG and revert SEXT patterns:
(sext_in_reg (v4i64 anyext (v4i32 x )), ExtraVT) -> (v4i64 sext (v4i32 sext_in_reg (v4i32 x , ExtraVT)))
The sext_in_reg (v4i32 x) may be lowered to shl+sar operations.
The "sar" does not exist on 64-bit operation, so lowering sext_in_reg (v4i64 x) has no vector solution.
I also added a cost of this operations to the AVX costs table.
llvm-svn: 175619
It is possible that frame pointer is not found in the
callee saved info, thus FramePtrSpillFI may be incorrect
if we don't check the result of hasFP(MF).
Besides, if we enable the stack coloring algorithm, there
will be an assertion to ensure the slot is live. But in
the test case, %var1 is not live in the prologue of the
function, and we will get the assertion failure.
Note: There is similar code in ARMFrameLowering.cpp.
llvm-svn: 175616
SltCCRxRy16, SltiCCRxImmX16, SltiuCCRxImmX16, SltuCCRxRy16
$T8 shows up as register $24 when emitted from C++ code so we had
to change some tests that were already there for this functionality.
llvm-svn: 175593
MS-style inline assembly.
This is a follow-on to r175334. Forcing a FP to be emitted doesn't ensure it
will be used. Therefore, force the base pointer as well. We now treat MS
inline assembly in the same way we treat functions with dynamic stack
realignment and VLAs. This guarantees the BP will be used to reference
parameters and locals.
rdar://13218191
llvm-svn: 175576
excluding visibility bits.
Mips (o32 abi) specific e_header setting.
EF_MIPS_ABI_O32 needs to be set in the
ELF header flags for o32 abi output.
Contributer: Reed Kotler
llvm-svn: 175569
excluding visibility bits.
Mips (Mips16) specific e_header setting.
EF_MIPS_ARCH_ASE_M16 needs to be set in the
ELF header flags for Mips16.
Contributer: Reed Kotler
llvm-svn: 175566
excluding visibility bits.
Mips (MicroMips) specific STO handling .
The st_other field settig for STO_MIPS_MICROMIPS
Contributer: Zoran Jovanovic
llvm-svn: 175564
In my previous commit:
"Merge a f32 bitcast of a v2i32 extractelt
A vectorized sitfp on doubles will get scalarized to a sequence of an
extract_element of <2 x i32>, a bitcast to f32 and a sitofp.
Due to the the extract_element, and the bitcast we will uneccessarily generate
moves between scalar and vector registers."
I added a pattern containing a copy_to_regclass. The copy_to_regclass is
actually not needed.
radar://13191881
llvm-svn: 175555
When creating an allocation hint for a register pair, make sure the hint
for the physical register reference is still in the allocation order.
rdar://13240556
llvm-svn: 175541