Very sorry, this was a premature patch that I still need to investigate and
finish off (for some reason beyond me at the moment it doesn't actually fix the
issue in all cases).
This reverts commit r199091.
llvm-svn: 199093
There are two attempted optimisations in reMaterializeTrivialDef, trying to
avoid promoting the size of a register too much when rematerializing.
Unfortunately, both appear to be flawed. First, we see if the original register
would have worked, but this is inadequate. Consider:
v1 = SOMETHING (v1 is QQ)
v2:Q0 = COPY v1:Q1 (v1, v2 are QQ)
...
uses of v2
In this case even though v2 *could* be used directly as the output of
SOMETHING, this would set the wrong bits of the QQ register involved. The
correct rematerialization must be:
v2:Q0_Q1 = SOMETHING (v2 promoted to QQQ)
...
uses of v2:Q1_Q2
For the second optimisation, if the correct remat is "v2:idx = SOMETHING" then
we can't necessarily expect v2 itself to be valid for SOMETHING, but we do try
to hunt for a class between v1 and v2 that works. Unfortunately, this is also
wrong:
v1 = SOMETHING (v1 is QQ)
v2:Q0_Q1 = COPY v1 (v1 is QQ, v2 is QQQ)
...
uses of v2 as a QQQ
The canonical rematerialization here is "v2:Q0_Q1 = SOMETHING". However current
logic would decide that v2 could be a QQ (no interest is taken in later uses).
This patch, therefore, always accepts the widened register class without trying
to be clever. Generally there is no penalty to this (e.g. in the common GR32 <
GR64 case, expanding the width doesn't matter because it's not like you were
going to do anything else with the high bits of a GR32 register). It can
increase register pressure in cases like the ARM VFP regs though (multiple
non-overlapping but equivalent subregisters). Hopefully this situation is rare
enough that it won't matter.
Unfortunately, no in-tree targets actually expose this as far as I can tell
(there are so few isAsCheapAsAMove instructions for it to trigger on) so I've
been unable to produce a test. It was exposed in our ARM64 SPEC tests though,
and I will be adding a test there that we should be able to contribute
soon(TM).
llvm-svn: 199091
The Segment struct contains a single interval; multiple instances of this struct
are used to construct a live range, but the struct is not a live range by
itself.
llvm-svn: 192392
This makes using array_pod_sort significantly safer. The implementation relies
on function pointer casting but that should be safe as we're dealing with void*
here.
llvm-svn: 191175
Track new virtual registers by register number, rather than by the live
interval created for them. This is the first step in separating the
creation of new virtual registers and new live intervals. Eventually
live intervals will be created and populated on demand after the virtual
registers have been created and used in instructions.
llvm-svn: 188434
When we're rematerializing into a not-quite-right register we already add the
real definition as an imp-def, but we should also be marking the "official"
register as dead, since nothing else is going to use it as a result of this
remat.
Not doing this can affect pressure tracking.
rdar://problem/14158833
llvm-svn: 184002
r182872 introduced a bug in how the register-coalescer's rematerialization
handled defining a physical register. It relied on the output of the
coalescer's setRegisters method to determine whether the replacement
instruction needed an implicit-def. However, this value isn't necessarily the
same as the CopyMI's actual destination register which is what the rest of the
basic-block expects us to be defining.
The commit changes the rematerializer to use the actual register attached to
CopyMI in its decision.
This will be tested soon by an X86 patch which moves everything to using
MOV32r0 instead of other sizes.
llvm-svn: 182925
This allows rematerialization during register coalescing to handle
more cases involving operations like SUBREG_TO_REG which might need to
be rematerialized using sub-register indices.
For example, code like:
v1(GPR64):sub_32 = MOVZ something
v2(GPR64) = COPY v1(GPR64)
should be convertable to:
v2(GPR64):sub_32 = MOVZ something
but previously we just gave up in places like this
llvm-svn: 182872
of the copy is a subregister def. The current code assumes that it can do a full
def of the destination register, but it is not checking that the def operand is
read-undef. It also doesn't clear the subregister index of the destination in
the new instruction to reflect the full subregister def.
These issues were found running 'make check' with my next commit that enables
rematerialization in more cases.
llvm-svn: 175122
RegisterCoalescer used to depend on LiveDebugVariable. LDV removes DBG_VALUEs
without emitting them at the end.
We fix this by removing LDV from RegisterCoalescer. Also add an assertion to
make sure we call emitDebugValues if DBG_VALUEs are removed at
runOnMachineFunction.
rdar://problem/13183203
Reviewed by Andy & Jakob
llvm-svn: 175023
Most IMPLICIT_DEF instructions are removed by the ProcessImplicitDefs
pass, and a few are reinserted by PHIElimination when a PHI argument is
<undef>.
RegisterCoalescer was assuming that all IMPLICIT_DEF live ranges look
like those created by PHIElimination, and that their live range never
leaves the basic block.
The PR14732 test case does tricks with PHI nodes that causes a longer
IMPLICIT_DEF live range to appear. This happens very rarely, but
RegisterCoalescer should be able to handle it.
llvm-svn: 171435
into their new header subdirectory: include/llvm/IR. This matches the
directory structure of lib, and begins to correct a long standing point
of file layout clutter in LLVM.
There are still more header files to move here, but I wanted to handle
them in separate commits to make tracking what files make sense at each
layer easier.
The only really questionable files here are the target intrinsic
tablegen files. But that's a battle I'd rather not fight today.
I've updated both CMake and Makefile build systems (I think, and my
tests think, but I may have missed something).
I've also re-sorted the includes throughout the project. I'll be
committing updates to Clang, DragonEgg, and Polly momentarily.
llvm-svn: 171366
Sooooo many of these had incorrect or strange main module includes.
I have manually inspected all of these, and fixed the main module
include to be the nearest plausible thing I could find. If you own or
care about any of these source files, I encourage you to take some time
and check that these edits were sensible. I can't have broken anything
(I strictly added headers, and reordered them, never removed), but they
may not be the headers you'd really like to identify as containing the
API being implemented.
Many forward declarations and missing includes were added to a header
files to allow them to parse cleanly when included first. The main
module rule does in fact have its merits. =]
llvm-svn: 169131
No functional change, just moved header files.
Targets can inject custom passes between register allocation and
rewriting. This makes it possible to tweak the register allocation
before rewriting, using the full global interference checking available
from LiveRegMatrix.
llvm-svn: 168806
Jakub Staszak spotted this in review. I don't notice these things
until I manually rerun benchmarks. But reducing unit tests is a very
high priority.
llvm-svn: 168021
This allows me to begin enabling (or backing out) misched by default
for one subtarget at a time. To run misched we typically want to:
- Disable SelectionDAG scheduling (use the source order scheduler)
- Enable more aggressive coalescing (until we decide to always run the coalescer this way)
- Enable MachineScheduler pass itself.
Disabling PostRA sched may follow for some subtargets.
llvm-svn: 167826
This adds the -join-globalcopies option which can be enabled by
default once misched is also enabled.
Ideally, the register coalescer would be able to split local live
ranges in a way that produces copies that can be easily resolved by
the scheduler. Until then, this heuristic should be good enough to at
least allow the scheduler to run after coalescing.
llvm-svn: 167825
This teaches the register coalescer to be less prone to split critical
edges. I am currently benchmarking this with the new (post-coalescer)
scheduler. I plan to enable this by default and remove the option as
soon as misched is enabled.
llvm-svn: 167758
Partial copies can show up even when CoalescerPair.isPartial() returns
false. For example:
%vreg24:dsub_0<def> = COPY %vreg31:dsub_0; QPR:%vreg24,%vreg31
Such a partial-partial copy is not good enough for the transformation
adjustCopiesBackFrom() needs to do.
llvm-svn: 166944