These require multiple mov instructions to copy,
but the default value is that 1 instruction is needed.
I'm not sure if this actually changes anything.
llvm-svn: 248651
Trying to use the version with the explicit output operand
would complain because of the missing WriteSALU. I'm not sure
why it doesn't complain about this with the implicit VCC def.
llvm-svn: 248646
It's easier to understand creating a full instruction
than the current situation where sometimes a new
instruction is created and sometimes it is awkwardly
mutated in place.
llvm-svn: 248627
Don't run passes related to stack maps, garbage collection,
exceptions since these aren't useful for GPUs.
There might be a few more to turn off that I'm less sure about
(e.g. ShrinkWrapping) or I'm not sure how to disable
(SafeStack and StackProtector)
llvm-svn: 248591
This fixes a select error when the i64 source was also
bitcasted to v2i32 in the original source.
Instead of awkwardly trying to select the modified source value and
the store, replace before isel begins.
Uses a worklist to avoid possible problems from mutating the DAG,
although it seems to work OK without it.
llvm-svn: 248589
When buffer resource descriptors were built, the upper two components
of the descriptor were first composed into a 64-bit register because
legalizeOperands assumed all operands had the same register class.
Fix that problem, but keep the workaround. I'm not sure anything
actually is actually emitting such a REG_SEQUENCE now.
If multiple resource descriptors are set up with different base
pointers, this is copied with a single s_mov_b64. We probably
should fix this better by recognizing a pair of s_mov_b32 later,
but for now delete the dead code.
llvm-svn: 248585
This was only set on the final _si/_vi version, but not
on the pseudos most of codegen sees.
No test since these instructions aren't used yet.
llvm-svn: 248583
These were all using the default 32-bit VALU write class,
but the i64/f64 compares are half rate.
I'm not sure this is really correct, because they are still using
the write to VALU write class, even though they really write
to the SALU.
llvm-svn: 248582
These are necessary for implementing mem_fence for
OpenCL 2.0.
The VI assembler tests are disabled since it seems to be
using the wrong encoding or opcode.
llvm-svn: 248532
Allow a target to do something other than search for copies
that will avoid cross register bank copies.
Implement for SI by only rewriting the most basic copies,
so it should look through anything like a subregister extract.
I'm not entirely satisified with this because it seems like
eliminating a reg_sequence that isn't fully used should work
generically for all targets without them having to override
something. However, it seems to be tricky to have a simple
implementation of this without rewriting to invalid kinds
of subregister copies on some targets.
I'm not sure if there is currently a generic way to easily check
if a subregister index would be valid for the current use.
The current set of TargetRegisterInfo::get*Class functions don't
quite behave like I would expect (e.g. getSubClassWithSubReg
returns the maximal register class rather than the minimal), so
I'm not sure how to make the generic test keep searching if
SrcRC:SrcSubReg is a valid replacement for DefRC:DefSubReg. Making
the default implementation to check for simple copies breaks
a variety of ARM and x86 tests by producing illegal subregister uses.
The ARM tests are not actually changed since it should still be using
the same sharesSameRegisterFile implementation, this just relaxes
them to not check for specific registers.
llvm-svn: 248478
Instead of always inserting a copy in case
the super register is itself a subregister,
only extract to the super reg class if this is
actually the case.
This shouldn't really change codegen, but
makes looking at the output of SIFixSGPRCopies
easier to read.
llvm-svn: 248467
After D10403, we had FMF in the DAG but disabled by default. Nick reported no crashing errors after some stress testing,
so I enabled them at r243687. However, Escha soon notified us of a bug not covered by any in-tree regression tests:
if we don't propagate the flags, we may fail to CSE DAG nodes because differing FMF causes them to not match. There is
one test case in this patch to prove that point.
This patch hopes to fix or leave a 'TODO' for all of the in-tree places where we create nodes that are FMF-capable. I
did this by putting an assert in SelectionDAG.getNode() to find any FMF-capable node that was being created without FMF
( D11807 ). I then ran all regression tests and test-suite and confirmed that everything passes.
This patch exposes remaining work to get DAG FMF to be fully functional: (1) add the flags to non-binary nodes such as
FCMP, FMA and FNEG; (2) add the flags to intrinsics; (3) use the flags as conditions for transforms rather than the
current global settings.
Differential Revision: http://reviews.llvm.org/D12095
llvm-svn: 247815
Summary:
This is the first patch in the series to migrate Triple's (which are ambiguous)
to TargetTuple's (which aren't).
For the moment, TargetTuple simply passes all requests to the Triple object it
holds. Once it has replaced Triple, it will start to implement the interface in
a more suitable way.
This change makes some changes to the public C++ API. In particular,
InitMCSubtargetInfo(), createMCRelocationInfo(), and createMCSymbolizer()
now take TargetTuples instead of Triples. The other public C++ API's have
been left as-is for the moment to reduce patch size.
This commit also contains a trivial patch to clang to account for the C++ API
change. Thanks go to Pavel Labath for fixing LLDB for me.
Reviewers: rengolin
Subscribers: jyknight, dschuff, arsenm, rampitec, danalbert, srhines, javed.absar, dsanders, echristo, emaste, jholewinski, tberghammer, ted, jfb, llvm-commits, rengolin
Differential Revision: http://reviews.llvm.org/D10969
llvm-svn: 247692
Summary:
This is the first patch in the series to migrate Triple's (which are ambiguous)
to TargetTuple's (which aren't).
For the moment, TargetTuple simply passes all requests to the Triple object it
holds. Once it has replaced Triple, it will start to implement the interface in
a more suitable way.
This change makes some changes to the public C++ API. In particular,
InitMCSubtargetInfo(), createMCRelocationInfo(), and createMCSymbolizer()
now take TargetTuples instead of Triples. The other public C++ API's have
been left as-is for the moment to reduce patch size.
This commit also contains a trivial patch to clang to account for the C++ API
change.
Reviewers: rengolin
Subscribers: jyknight, dschuff, arsenm, rampitec, danalbert, srhines, javed.absar, dsanders, echristo, emaste, jholewinski, tberghammer, ted, jfb, llvm-commits, rengolin
Differential Revision: http://reviews.llvm.org/D10969
llvm-svn: 247683
Summary: This fixes a variety of typos in docs, code and headers.
Subscribers: jholewinski, sanjoy, arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D12626
llvm-svn: 247495
Instead of extracting both 32-bit components from the 128-bit
register. This produces fewer copies and is easier for
the copy peephole optimizer to understand and see the actual uses
as extracts from a reg_sequence.
This avoids needing to handle subregister composing in the
PeepholeOptimizer's ValueTracker for this case.
llvm-svn: 247162
Summary:
This helps mostly when we use add instructions for address calculations
that contain immediates.
Reviewers: arsenm
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D12256
llvm-svn: 247157