Commit Graph

9130 Commits

Author SHA1 Message Date
Juergen Ributzka 2b97f9b211 [DAG] Fix the recognition of opaque constants in the SelectionDAGBuilder.
This fix checks the original LLVM IR node to identify opaque constants by
looking for the bitcast-constant pattern. Originally we looked at the generated
SDNode, but this might lead to incorrect results. The SDNode could have been
generated by an constant expression that was folded to a constant.

This fixes <rdar://problem/16050719>

llvm-svn: 201291
2014-02-13 04:19:26 +00:00
Hao Liu 4f345f3c03 [AArch64]Add support for spilling FPR8/FPR16.
llvm-svn: 201287
2014-02-13 02:36:58 +00:00
Andrea Di Biagio 386d566395 [X86] Teach the backend how to lower vector shift left into multiply rather than scalarizing it.
Instead of expanding a packed shift into a sequence of scalar shifts,
the backend now tries (when possible) to convert the vector shift into a
vector multiply.

Before this change, a shift of a MVT::v8i16 vector by a
build_vector of constants was always scalarized into a long sequence of "vector
extracts + scalar shifts + vector insert".
With this change, if there is SSE2 support, we emit a single vector multiply.

This change also affects SSE4.1, AVX, AVX2 shifts:
 - A shift of a MVT::v4i32 vector by a build_vector of non uniform constants
is now lowered when possible into a single SSE4.1 vector multiply.
 - Packed v16i16 shift left by constant build_vector are now expanded when
possible into a single AVX2 vpmullw.
This change also improves the lowering of AVX512f vector shifts.

Added test CodeGen/X86/vec_shift6.ll with some code examples that are affected
by this change.

llvm-svn: 201271
2014-02-12 23:42:28 +00:00
Akira Hatanaka a07ffb5b31 Pass edges weights to MachineBasicBlock::addSuccessor in TailDuplicatePass to
preserve branch probability information.

<rdar://problem/15893208>

llvm-svn: 201245
2014-02-12 18:09:18 +00:00
Daniel Sanders abe212a3b8 Revert r201237+r201238: Demote EmitRawText call in AsmPrinter::EmitInlineAsm() and remove hasRawTextSupport() call
It introduced multiple test failures in the buildbots.

llvm-svn: 201241
2014-02-12 15:39:20 +00:00
Daniel Sanders a7d504cf58 Demote EmitRawText call in AsmPrinter::EmitInlineAsm() and remove hasRawTextSupport() call
Summary:
AsmPrinter::EmitInlineAsm() will no longer use the EmitRawText() call for targets with mature MC support. Such targets will always parse the inline assembly (even when emitting assembly). Targets without mature MC support continue to use EmitRawText() for assembly output.

The hasRawTextSupport() check in AsmPrinter::EmitInlineAsm() has been replaced with MCAsmInfo::UseIntegratedAs which when true, causes the integrated assembler to parse inline assembly (even when emitting assembly output). UseIntegratedAs is set to true for targets that consider any failure to parse valid assembly to be a bug. Target specific subclasses generally enable the integrated assembler in their constructor. The default value can be overridden with -no-integrated-as.

All tests that rely on inline assembly supporting invalid assembly (for example, those that use mnemonics such as 'foo' or 'hello world') have been updated to disable the integrated assembler.

Reviewers: rafael

Reviewed By: rafael

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D2686

llvm-svn: 201237
2014-02-12 14:44:54 +00:00
Evan Cheng 57add3e4ee Tweak ARM fastcc by adopting these two AAPCS rules:
* CPRCs may be allocated to co-processor registers or the stack – they may never be allocated to core registers
* When a CPRC is allocated to the stack, all other VFP registers should be marked as unavailable

The difference is only noticeable in rare cases where there are a large number of floating point arguments (e.g.
7 doubles + additional float, double arguments). Although it's probably still better to avoid vmov as it can cause
stalls in some older ARM cores. The other, more subtle benefit, is to minimize difference between the various
calling conventions.

rdar://16039676

llvm-svn: 201193
2014-02-11 23:49:31 +00:00
David Blaikie 6bd395f3f0 DebugInfo: Remove dependence on file numbering in the line table.
These tests were unnecessarily sensitive to the presence and ordering of
elements in the line table file_names list which will break on a future
change I'm working on.

llvm-svn: 201185
2014-02-11 21:46:46 +00:00
Matt Arsenault 71b71d25eb R600/SI: Fix assertion on infinite loops.
This isn't the most useful case to fix in the real world,
but bugpoint runs into this.

llvm-svn: 201177
2014-02-11 21:12:38 +00:00
Robert Lougher 7d9084ffa1 Teach the DAGCombiner how to fold concat_vector nodes when the input is two
BUILD_VECTOR nodes, e.g.:

(concat_vectors (BUILD_VECTOR a1, a2, a3, a4), (BUILD_VECTOR b1, b2, b3, b4))
->
(BUILD_VECTOR a1, a2, a3, a4, b1, b2, b3, b4)

This fixes an issue with AVX, where a sequence was not recognized as a 256-bit
vbroadcast due to the concat_vectors.

llvm-svn: 201158
2014-02-11 15:42:46 +00:00
Robert Lytton 70b5ba49c3 XCore target: fix const section handling
Xcore target ABI requires const data that is externally visible
to be handled differently if it has C-language linkage rather than
C++ language linkage.

Clang now emits ".cp.rodata" section information.

All other externally visible constant data will be placed in the DP section.

llvm-svn: 201144
2014-02-11 10:36:26 +00:00
Robert Lytton 9b6bb461b1 XCore target: Lower ATOMIC_LOAD & ATOMIC_STORE
llvm-svn: 201143
2014-02-11 10:36:18 +00:00
Elena Demikhovsky 1f32c313f1 AVX: fixed a bug in LowerVECTOR_SHUFFLE
llvm-svn: 201140
2014-02-11 10:21:53 +00:00
Elena Demikhovsky 2aafc22ed9 AVX-512: Optimized BUILD_VECTOR pattern;
fixed encoding of VEXTRACTPS instruction.

llvm-svn: 201134
2014-02-11 07:25:59 +00:00
Quentin Colombet 64ca544d95 [CodeGenPrepare] Test case for the promotions that bypass the
profitability check due to some other checks in the addressing
mode matcher. I.e., test case for commit r201121.

<rdar://problem/16020230>

llvm-svn: 201132
2014-02-11 06:55:43 +00:00
Tom Stellard 5d7aaaed7d R600/SI: Initialize M0 and emit S_WQM_B64 whenever DS instructions are used
DS instructions that access local memory can only uses addresses that
are less than or equal to the value of M0.  When M0 is uninitialized,
then we experience undefined behavior.

This patch also changes the behavior to emit S_WQM_B64 on pixel shaders
no matter what kind of DS instruction is used.

llvm-svn: 201097
2014-02-10 16:58:30 +00:00
Tim Northover b0430415e6 ARM: use natural LLVM IR for vshll instructions
Similarly to the vshrn instructions, these are simple zext/sext + trunc
operations. Using normal LLVM IR should allow for better code, and more sharing
with the AArch64 backend.

llvm-svn: 201093
2014-02-10 16:20:29 +00:00
Oliver Stannard 8dcaa761a2 ARM: r12 is callee-saved for interrupt handlers
For A- and R-class processors, r12 is not normally callee-saved, but is for
interrupt handlers. See AAPCS, 5.3.1.1, "Use of IP by the linker".

llvm-svn: 201089
2014-02-10 14:24:23 +00:00
Tim Northover 170daafe01 ARM: use LLVM IR to represent the vshrn operation
vshrn is just the combination of a right shift and a truncate (and the limits
on the immediate value actually mean the signedness of the shift doesn't
matter). Using that representation allows us to get rid of an ARM-specific
intrinsic, share more code with AArch64 and hopefully get better code out of
the mid-end optimisers.

llvm-svn: 201085
2014-02-10 14:04:07 +00:00
Robert Lougher 48ee75b7e3 Test commit - added a new line to vec_shuf-insert.ll.
llvm-svn: 201083
2014-02-10 12:42:13 +00:00
Matheus Almeida 4b27eb588c [mips][msa] Add DLSA instruction.
llvm-svn: 201081
2014-02-10 12:05:17 +00:00
Matheus Almeida b4133b25e7 [mips][msa] Update FileCheck prefix in preparation for
the addition of Mips64 tests.

No functional changes.

llvm-svn: 201080
2014-02-10 11:30:09 +00:00
Elena Demikhovsky 9f423d6f25 AVX-512: Fixed extract_vector_elt for v16i1 and v8i1 vectors.
llvm-svn: 201066
2014-02-10 07:02:39 +00:00
Hao Liu 6e73761dc8 [AArch64]Implement the copy of two FPR8 registers by using FMOVss of two FPR32 registers in copyPhysReg.
llvm-svn: 201061
2014-02-10 03:16:22 +00:00
Rafael Espindola 5054362920 Always create a temporary symbol to use with the cfi frame.
This is a small simplification and a small step in fixing pr18743 since
private functions on MachO should be using a 'l' prefix.

llvm-svn: 200994
2014-02-07 21:23:18 +00:00
Rafael Espindola e393aab13b Use FileCheck variables to simplify this test.
llvm-svn: 200992
2014-02-07 21:11:33 +00:00
Renato Golin 3dc5ade8bb Fix Darwin bots from EHABI change
llvm-svn: 200990
2014-02-07 20:32:32 +00:00
Matt Arsenault 9ad27e8d76 R600/SI: Add failing test for 3 x i64 vectors.
Stores of <4 x i64> do work (although they do expand to 4 stores
instead of 2), but 3 x i64 vectors fail to select.

llvm-svn: 200989
2014-02-07 20:29:40 +00:00
Renato Golin 78a6eba862 Remove -arm-disable-ehabi option
llvm-svn: 200988
2014-02-07 20:12:49 +00:00
Sasa Stankovic 4c80bdae72 [mips] Forbid the use of registers t6, t7 and t8 if the target is NaCl.
Differential Revision: http://llvm-reviews.chandlerc.com/D2694

llvm-svn: 200978
2014-02-07 17:16:40 +00:00
Rafael Espindola 61acf5d9b0 Fix a bug with .weak_def_can_be_hidden: Mutable variables cannot use it.
Thanks to John McCall for noticing it.

llvm-svn: 200977
2014-02-07 16:21:30 +00:00
Oliver Stannard 1dc1034218 LLVM-1163: AAPCS-VFP violation when CPRC allocated to stack
According to the AAPCS, when a CPRC is allocated to the stack, all other
VFP registers should be marked as unavailable.

I have also modified the rules for allocating non-CPRCs to the stack, to make
it more explicit that all GPRs must be made unavailable. I cannot think of a
case where the old version would produce incorrect answers, so there is no test
for this.

llvm-svn: 200970
2014-02-07 11:19:53 +00:00
Venkatraman Govindaraju fd07500dd1 [Sparc] Emit relocations for Thread Local Storage (TLS) when integrated assembler is used.
llvm-svn: 200962
2014-02-07 05:54:20 +00:00
Venkatraman Govindaraju 104643d0aa [Sparc] Emit correct relocations for PIC code when integrated assembler is used.
llvm-svn: 200961
2014-02-07 04:24:35 +00:00
Manman Ren 37c9267107 PGO branch weight: fix PR18752.
Fix a bug triggered in IfConverterTriangle when CvtBB has multiple predecessors
by getting the weights before removing a successor.

llvm-svn: 200958
2014-02-07 00:38:56 +00:00
Jim Grosbach e9008de652 X86: Resolve a long standing FIXME and properly isel pextr[bw].
Generalize the AArch64 .td nodes for AssertZext and AssertSext. Use
them to match the relevant pextr store instructions.

The test widen_load-2.ll requires a slight change because with the
stores gone, the remaining instructions are scheduled in a different
order.

Add test cases for SSE4 and AVX variants.

Resolves rdar://13414672.

Patch by Adam Nemet <anemet@apple.com>.

llvm-svn: 200957
2014-02-07 00:16:33 +00:00
Rafael Espindola 803fb10874 Convert test to FileCheck.
llvm-svn: 200955
2014-02-06 23:35:22 +00:00
Quentin Colombet 3a4bf0405e [CodeGenPrepare] Move away sign extensions that get in the way of addressing
mode.

Basically the idea is to transform code like this:
%idx = add nsw i32 %a, 1
%sextidx = sext i32 %idx to i64
%gep = gep i8* %myArray, i64 %sextidx
load i8* %gep

Into:
%sexta = sext i32 %a to i64
%idx = add nsw i64 %sexta, 1
%gep = gep i8* %myArray, i64 %idx
load i8* %gep

That way the computation can be folded into the addressing mode.

This transformation is done as part of the addressing mode matcher.
If the matching fails (not profitable, addressing mode not legal, etc.), the
matcher will revert the related promotions.

<rdar://problem/15519855>

llvm-svn: 200947
2014-02-06 21:44:56 +00:00
Tom Stellard e236794578 R600/SI: Add a MUBUF store pattern for Reg+Imm offsets
llvm-svn: 200935
2014-02-06 18:36:41 +00:00
Tom Stellard 2937cbc005 R600/SI: Add a MUBUF store pattern for Imm offsets
llvm-svn: 200934
2014-02-06 18:36:39 +00:00
Tom Stellard 11624bc577 R600/SI: Add a MUBUF load pattern for Reg+Imm offsets
llvm-svn: 200933
2014-02-06 18:36:38 +00:00
Tom Stellard 044e418f15 R600/SI: Use immediates offsets for SMRD instructions whenever possible
There was a problem with the old pattern, so we were copying some
larger immediates into registers when we could have been encoding
them in the instruction.

llvm-svn: 200932
2014-02-06 18:36:34 +00:00
Juergen Ributzka fa0eba6c8b [DAG] Don't pull the binary operation though the shift if the operands have opaque constants.
During DAGCombine visitShiftByConstant assumes that certain binary operations
with only constant operands can always be folded successfully. This is no longer
true when the constant is opaque. This commit fixes visitShiftByConstant by not
performing the optimization for opaque constants. Otherwise we would end up in
an infinite DAGCombine loop.

llvm-svn: 200900
2014-02-06 04:09:06 +00:00
Quentin Colombet 87769713cf [RegAlloc] Add a last chance recoloring mechanism when everything else failed to
find a register.

The idea is to choose a color for the variable that cannot be allocated and
recolor its interferences around. Unlike the current register allocation scheme,
it is allowed to change the color of an already assigned (but maybe not
splittable or spillable) live interval while propagating this change to its
neighbors.
In other word, there are two things that may help finding an available color:
- Already assigned variables (RS_Done) can be recolored to different color.
- The recoloring allows to catch solutions that needs to touch more that just
  the neighbors of the current allocated variable.

E.g.,
vA can use {R1, R2    }
vB can use {    R2, R3}
vC can use {R1        }
Where vA, vB, and vC cannot be split anymore (they are reloads for instance) and
they all interfere.

vA is assigned R1
vB is assigned R2
vC tries to evict vA but vA is already done.
=> Regular register allocation heuristic fails.

Last chance recoloring kicks in:
vC does as if vA was evicted => vC uses R1.
vC is marked as fixed.
vA needs to find a color.
None are available.
vA cannot evict vC: vC is a fixed virtual register now.
vA does as if vB was evicted => vA uses R2.
vB needs to find a color.
R3 is available.
Recoloring => vC = R1, vA = R2, vB = R3.

<rdar://problem/15947839>

llvm-svn: 200883
2014-02-05 22:13:59 +00:00
Rafael Espindola b4eec1daa1 Remove support for not using .loc directives.
Clang itself was not using this. The only way to access it was via llc.

llvm-svn: 200862
2014-02-05 18:00:21 +00:00
Petar Jovanovic 9725016af3 [mips] Add NaCl target and forbid indexed loads and stores for it
This patch adds NaCl target for Mips. It also forbids indexed loads and
stores if the target is NaCl.

Patch by Sasa Stankovic.

Differential Revision: http://llvm-reviews.chandlerc.com/D2690

llvm-svn: 200855
2014-02-05 17:19:30 +00:00
Elena Demikhovsky 0b79be8ab2 AVX-512: optimized icmp -> sext -> icmp pattern
llvm-svn: 200849
2014-02-05 16:17:36 +00:00
Michel Danzer 5d26fdfcba R600/SI: Add pattern for zero-extending i1 to i32
Fixes opencl-example if_* tests with radeonsi.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74469

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 200830
2014-02-05 09:48:05 +00:00
Elena Demikhovsky a30e437659 AVX-512: Added intrinsic for cvtph2ps.
Added VPTESTNM instruction.
Added a pattern to vselect (lit tests will follow).

llvm-svn: 200823
2014-02-05 07:05:03 +00:00
David Peixotto b9b7362cdc Fix PR18345: ldr= pseudo instruction produces incorrect code when using in inline assembly
This patch fixes the ldr-pseudo implementation to work when used in
inline assembly.  The fix is to move arm assembler constant pools
from the ARMAsmParser class to the ARMTargetStreamer class.

Previously we kept the assembler generated constant pools in the
ARMAsmParser object. This does not work for inline assembly because
a new parser object is created for each blob of inline assembly.
This patch moves the constant pools to the ARMTargetStreamer class
so that the constant pool will remain alive for the entire code
generation process.

An ARMTargetStreamer class is now required for the arm backend.
There was no existing implementation for MachO, only Asm and ELF.
Instead of creating an empty MachO subclass, we decided to make the
ARMTargetStreamer a non-abstract class and provide default
(llvm_unreachable) implementations for the non constant-pool related
methods.

Differential Revision: http://llvm-reviews.chandlerc.com/D2638

llvm-svn: 200777
2014-02-04 17:22:40 +00:00