With the new composite physical registers to represent arbitrary pairs
of DPR registers, we don't need the pseudo-registers anymore. Get rid of
a bunch of them that use DPR register pairs and just use the real
instructions directly instead.
llvm-svn: 152045
MachineOperands that define part of a virtual register must have an
<undef> flag if they are not intended as read-modify-write operands.
The old trick of adding an <imp-def> operand doesn't work any longer.
Fixes PR12177.
llvm-svn: 152008
Without this hook, functions w/ a completely empty body (including no
epilogue) will cause an MCEmitter assertion failure.
For example,
define internal fastcc void @empty_function() {
unreachable
}
rdar://10947471
llvm-svn: 151673
Now that canRealignStack() understands frozen reserved registers, it is
safe to use it for aligned spill instructions.
It will only return true if the registers reserved at the beginning of
register allocation allow for dynamic stack realignment.
<rdar://problem/10625436>
llvm-svn: 147579
This patch caused a miscompilation of oggenc because a frame pointer was
suddenly needed halfway through register allocation.
<rdar://problem/10625436>
llvm-svn: 147487
Use the spill slot alignment as well as the local variable alignment to
determine when the stack needs to be realigned. This works now that the
ARM target can always realign the stack by using a base pointer.
Still respect the ARMBaseRegisterInfo::canRealignStack() function
vetoing a realigned stack. Don't use aligned spill code in that case.
llvm-svn: 146997
r0 = mov #0
r0 = moveq #1
Then the second instruction has an implicit data dependency on the first
instruction. Sadly I have yet to come up with a small test case that
demonstrate the post-ra scheduler taking advantage of this.
llvm-svn: 146583
to finalize MI bundles (i.e. add BUNDLE instruction and computing register def
and use lists of the BUNDLE instruction) and a pass to unpack bundles.
- Teach more of MachineBasic and MachineInstr methods to be bundle aware.
- Switch Thumb2 IT block to MI bundles and delete the hazard recognizer hack to
prevent IT blocks from being broken apart.
llvm-svn: 146542
Refactor the instructions into fixed writeback and register-stride
writeback variants to simplify the offset operand (no more optional
register operand using reg0). This is a simpler representation and allows
the assembly parser to more easily handle these instructions.
Add tests for the instruction variants now supported.
llvm-svn: 146278
generator to it. For non-bundle instructions, these behave exactly the same
as the MC layer API.
For properties like mayLoad / mayStore, look into the bundle and if any of the
bundled instructions has the property it would return true.
For properties like isPredicable, only return true if *all* of the bundled
instructions have the property.
For properties like canFoldAsLoad, isCompare, conservatively return false for
bundles.
llvm-svn: 146026
1. Added opcode BUNDLE
2. Taught MachineInstr class to deal with bundled MIs
3. Changed MachineBasicBlock iterator to skip over bundled MIs; added an iterator to walk all the MIs
4. Taught MachineBasicBlock methods about bundled MIs
llvm-svn: 145975
This pseudo-instruction contains a .align directive in its expansion, so
the total size may vary by 2 bytes.
It is too difficult to accurately keep track of this alignment
directive, just use the worst-case size instead.
llvm-svn: 145971
This will widen 32-bit register vmov instructions to 64-bit when
possible. The 64-bit vmovd instructions can then be translated to NEON
vorr instructions by the execution dependency fix pass.
The copies are only widened if they are marked as clobbering the whole
D-register.
llvm-svn: 144734
Split am6offset into fixed and register offset variants so the instruction
encodings are explicit rather than relying an a magic reg0 marker.
Needed to being able to parse these.
llvm-svn: 142853
Clean up the patterns, fix comments, and avoid confusing both tools
and coders. Note that the special adds/subs SelectionDAG nodes no
longer have the dummy cc_out operand.
llvm-svn: 142397
When widening a copy, we are reading a larger register that may not be
live. Use an <undef> flag to tell the register scavenger and machine
code verifier that we know the value isn't defined.
We now widen:
%S6<def> = COPY %S4<kill>, %D3<imp-def>
into:
%D3<def> = VMOVD %D2<undef>, pred:14, pred:%noreg, %S4<imp-use,kill>
This also keeps the <kill> flag on %S4 so we don't inadvertently kill a
live value in %S5.
Finally, ensure that ARMBaseInstrInfo::setExecutionDomain() preserves
the <undef> flag when converting VMOVD to VORR.
llvm-svn: 141746
The VMOVS widening needs to look at the implicit COPY operands. Trying
to dig out the COPY instruction from an iterator in copyPhysReg() is the
wrong approach.
The expandPostRAPseudo() hook gets to look at COPY instructions before
they are converted to copyPhysReg() calls.
llvm-svn: 141619
This is still a hack until we can teach tblgen to generate the
optional CPSR operand rather than an implicit CPSR def. But the
strangeness is now limited to the selection DAG. ADD/SUB MI's no
longer have implicit CPSR defs, nor do we allow flag setting variants
of these opcodes in machine code. There are several corner cases to
consider, and getting one wrong would previously lead to nasty
miscompilation. It's not the first time I've debugged one, so this
time I added enough verification to ensure it won't happen again.
llvm-svn: 140228
It appears that our use of the imp-use and imp-def flags with
sub-registers is not yet robust enough to support this.
The failing test case is complicated, I am working on a reduction.
<rdar://problem/10044201>
llvm-svn: 138861
There is no non-writeback store multiple instruction in Thumb1, so
don't define one. As a result load multiple is the only instantiation of
the multiclass, so refactor that away entirely.
llvm-svn: 138338
This pleases the register scavenger and brings
test/CodeGen/ARM/2011-08-12-vmovqqqq-pseudo.ll a little closer to
working with -verify-machineinstrs.
llvm-svn: 138164
On Cortex-A8, we use the NEON v2f32 instructions for f32 arithmetic. For
better latency, we also send D-register copies down the NEON pipeline by
translating them to vorr instructions.
This patch promotes even S-register copies to D-register copies when
possible so they can also go down the NEON pipeline. Example:
vldr.32 s0, LCPI0_0
loop:
vorr d1, d0, d0
loop2:
...
vadd.f32 d1, d1, d16
The vorr instruction looked like this after regalloc:
%S2<def> = COPY %S0, %D1<imp-def>
Copies involving odd S-registers, and copies that don't define the full
D-register are left alone.
llvm-svn: 137182