If it's safe to clobber the condition flags, we can do a few extra things:
it's then possible to reset the base register writeback using a SUBS, so
we can try to merge even if the base register isn't dead after the merged
instruction.
This is effectively a (heavily bug-fixed) rewrite of r208992.
llvm-svn: 218386
ADDS/SUBS unless it's safe to clobber the condition flags.
If the merged instructions are in a range where the CPSR is live,
e.g. between a CMP -> Bcc, we can't safely materialize a new base
register.
This problem is quite rare, I couldn't come up with a test case and I've
never actually seen this happen in the tests I'm running - there is a
potential trigger for this in LNT/oggenc (spills being inserted between
a CMP/Bcc), but at the moment this isn't being merged. I'll try to
reduce that into a small test case once I've committed my upcoming patch
to make merging less conservative.
llvm-svn: 217881
There are two add-immediate instructions in Thumb1: tADDi8 and tADDi3. Only
the latter supports using different source and destination registers, so
whenever we materialize a new base register (at a certain offset) we'd do
so by moving the base register value to the new register and then adding in
place. This patch changes the code to use a single tADDi3 if the offset is
small enough to fit in 3 bits.
Differential Revision: http://reviews.llvm.org/D5006
llvm-svn: 216193
In a previous iteration of the pass, we would try to compensate for
writeback by updating later instructions and/or inserting a SUBS to
reset the base register if necessary.
Since such a SUBS sets the condition flags it's not generally safe to do
this. For now, only merge LDR/STRs if there is no writeback to the base
register (LDM that loads into the base register) or the base register is
killed by one of the merged instructions. These cases are clear wins
both in terms of instruction count and performance.
Also add three new test cases, and update the existing ones accordingly.
llvm-svn: 215729
This adds some code back that was deleted in r92053. The location of the
last merged memory operation needs to be kept up-to-date since MemOps
may be in a different order to the original instruction stream to
allow merging (since registers need to be in ascending order). Also
simplify the logic to determine BaseKill using findRegisterUseOperandIdx
to use an equivalent function call instead.
llvm-svn: 215728
shorter/easier and have the DAG use that to do the same lookup. This
can be used in the future for TargetMachine based caching lookups from
the MachineFunction easily.
Update the MIPS subtarget switching machinery to update this pointer
at the same time it runs.
llvm-svn: 214838
Originally I switched the LD/ST optimizer off in TargetMachine as it was previously, but Eric has suggested he'd prefer that it be short-circuited in the pass itself.
No functionality change.
llvm-svn: 211037
Previously, the basic block was searched for future uses of the base register,
and if necessary any writeback to the base register was reset using a SUB
instruction (e.g. before calling a function) just before such a use. However,
this step happened *before* the merged LDM/STM instruction was built. So if
there was (e.g.) a function call directly after the not-yet-formed LDM/STM,
the pass would first insert a SUB instruction to reset the base register,
and then (at the same location, incorrectly) insert the LDM/STM itself.
This patch fixes PR19972. Patch by Moritz Roth.
llvm-svn: 210542
When we had a sequence like:
s1 = VLDRS [r0, 1], Q0<imp-def>
s3 = VLDRS [r0, 2], Q0<imp-use,kill>, Q0<imp-def>
s0 = VLDRS [r0, 0], Q0<imp-use,kill>, Q0<imp-def>
s2 = VLDRS [r0, 4], Q0<imp-use,kill>, Q0<imp-def>
we were gathering the {s0, s1} loads below the s3 load. This is fine,
but confused the verifier since now the s3 load had Q0<imp-use> with
no definition above it.
This should mark such uses <undef> as well. The liveness structure at
the beginning and end of the block is unaffected, and the true sN
definitions should prevent any dodgy reorderings being introduced
elsewhere.
rdar://problem/15124449
llvm-svn: 192344
it at the moment.
This allows to form more paired loads even when stack coloring pass destroys the
memoryoperand's value.
<rdar://problem/13978317>
llvm-svn: 184492
sure the base register and would-be writeback register don't conflict for
stores. This was already being done for loads.
Unfortunately, it is rather difficult to create a test case for this issue. It
was exposed in 450.soplex at LTO and requires unlucky register allocation.
<rdar://13394908>
llvm-svn: 177874
into their new header subdirectory: include/llvm/IR. This matches the
directory structure of lib, and begins to correct a long standing point
of file layout clutter in LLVM.
There are still more header files to move here, but I wanted to handle
them in separate commits to make tracking what files make sense at each
layer easier.
The only really questionable files here are the target intrinsic
tablegen files. But that's a battle I'd rather not fight today.
I've updated both CMake and Makefile build systems (I think, and my
tests think, but I may have missed something).
I've also re-sorted the includes throughout the project. I'll be
committing updates to Clang, DragonEgg, and Polly momentarily.
llvm-svn: 171366
This function is often used to decorate dangling instructions, so a
context reference is required to allocate memory for the operands.
Also add a corresponding MachineInstrBuilder method.
llvm-svn: 170797
Sooooo many of these had incorrect or strange main module includes.
I have manually inspected all of these, and fixed the main module
include to be the nearest plausible thing I could find. If you own or
care about any of these source files, I encourage you to take some time
and check that these edits were sensible. I can't have broken anything
(I strictly added headers, and reordered them, never removed), but they
may not be the headers you'd really like to identify as containing the
API being implemented.
Many forward declarations and missing includes were added to a header
files to allow them to parse cleanly when included first. The main
module rule does in fact have its merits. =]
llvm-svn: 169131
the register info for getEncodingValue. This builds on the
small patch of yesterday to set HWEncoding in the register
file.
One (deprecated) use was turned into a hard number to avoid
needing register info in the old JIT.
llvm-svn: 161628
The getPointerRegClass() hook can return register classes that depend on
the calling convention of the current function (ptr_rc_tailcall).
So far, we have been able to infer the calling convention from the
subtarget alone, but as we add support for multiple calling conventions
per target, that no longer works.
Patch by Yiannis Tsiouris!
llvm-svn: 156328
The load/store optimizer splits LDRD/STRD into two instructions when the
register pairing doesn't work out. For negative offsets in Thumb2, it uses
t2STRi8 to do that. That's fine, except for the case when the offset is in
the range [-4,-1]. In that case, we'll also form a second t2STRi8 with
the original offset plus 4, resulting in a t2STRi8 with a non-negative
offset, which ends up as if it were an STRT, which is completely bogus.
Similarly for loads.
No testcase, unfortunately, as any I've been able to construct is both large
and extremely fragile.
rdar://11193937
llvm-svn: 154141
When an strd instruction doesn't get the registers it wants, it can be
expanded into two str instructions. Make sure the first str doesn't kill
the base register in the case where the base and data registers are
identical:
t2STRi12 %R0<kill>, %R0, 4, pred:14, pred:%noreg
t2STRi12 %R2<kill>, %R0, 8, pred:14, pred:%noreg
<rdar://problem/11101911>
llvm-svn: 153611
When a number of sub-register VLRDS instructions are combined into a
VLDM, preserve any super-register implicit defs. This is required to
keep the register scavenger and machine code verifier happy.
Enable machine code verification after ARMLoadStoreOptimizer.
ARM/2012-01-26-CopyPropKills.ll was failing because of this.
llvm-svn: 153610
Revert r153519: "ARMLoadStoreOptimizer invalidates register liveness."
These patches caused miscompilations in povray by turning off branch
folding's updating of live-in lists.
It turns out the the late scheduler depends on the live-in lists, even
if it doesn't need correct kill flags.
<rdar://problem/11139228>
llvm-svn: 153593
This pass tries to update kill flags, but there are still many bugs.
Passes after the load/store optimizer don't need accurate liveness, so
don't even try.
<rdar://problem/11101911>
llvm-svn: 153519