1. Fix pre-ra scheduler so it doesn't try to push instructions above calls to
"optimize for latency". Call instructions don't have the right latency and
this is more likely to use introduce spills.
2. Fix if-converter cost function. For ARM, it should use instruction latencies,
not # of micro-ops since multi-latency instructions is completely executed
even when the predicate is false. Also, some instruction will be "slower"
when they are predicated due to the register def becoming implicit input.
rdar://8598427
llvm-svn: 118135
explicit about the operands. Split out the different variants into separate
instructions. This gives us the ability to, among other things, assign
different scheduling itineraries to the variants. rdar://8477752.
llvm-svn: 117409
"long latency" enough to hoist even if it may increase spilling. Reloading
a value from spill slot is often cheaper than performing an expensive
computation in the loop. For X86, that means machine LICM will hoist
SQRT, DIV, etc. ARM will be somewhat aggressive with VFP and NEON
instructions.
- Enable register pressure aware machine LICM by default.
llvm-svn: 116781
allow target to correctly compute latency for cases where static scheduling
itineraries isn't sufficient. e.g. variable_ops instructions such as
ARM::ldm.
This also allows target without scheduling itineraries to compute operand
latencies. e.g. X86 can return (approximated) latencies for high latency
instructions such as division.
- Compute operand latencies for those defined by load multiple instructions,
e.g. ldm and those used by store multiple instructions, e.g. stm.
llvm-svn: 115755
stick with a constant estimate of 90% (branch predictors are good!), but we might find that we want to provide
more nuanced estimates in the future.
llvm-svn: 115364
Rather than having arbitrary cutoffs, actually try to cost model the conversion.
For now, the constants are tuned to more or less match our existing behavior, but these will be
changed to reflect realistic values as this work proceeds.
llvm-svn: 114973
into OptimizeCompareInstr.
This necessitates the passing of CmpValue around,
so widen the virtual functions to accomodate.
No functionality changes.
llvm-svn: 114428
iterator when an optimization took place. This allows us to do more insane
things with the code than just remove an instruction or two.
llvm-svn: 113640
take multiple cycles to decode.
For the current if-converter clients (actually only ARM), the instructions that
are predicated on false are not nops. They would still take machine cycles to
decode. Micro-coded instructions such as LDM / STM can potentially take multiple
cycles to decode. If-converter should take treat them as non-micro-coded
simple instructions.
llvm-svn: 113570
instruction in the class would be decoded to. Or zero if the number of
uOPs must be determined dynamically.
This will be used to determine the cost-effectiveness of predicating a
micro-coded instruction.
llvm-svn: 113513
(I discovered 2 more copies of the ARM instruction format list, bringing the
total to 4!! Two of them were already out of sync. I haven't yet gotten into
the disassembler enough to know the best way to fix this, but something needs
to be done.) Add support for encoding these instructions.
llvm-svn: 110754
relatively expensive comparison analyzer on each instruction. Also rename the
comparison analyzer method to something more in line with what it actually does.
This pass is will eventually be folded into the Machine CSE pass.
llvm-svn: 110539
This pass tries to remove comparison instructions when possible. For instance,
if you have this code:
sub r1, 1
cmp r1, 0
bz L1
and "sub" either sets the same flag as the "cmp" instruction or could be
converted to set the same flag, then we can eliminate the "cmp" instruction all
together. This is a important for ARM where the ALU instructions could set the
CPSR flag, but need a special suffix ('s') to do so.
llvm-svn: 110423
ARM/PPC/MSP430-specific code (which are the only targets that
implement the hook) can directly reference their target-specific
instrinfo classes.
llvm-svn: 109171
The only folding these load/store architectures can do is converting COPY into a
load or store, and the target independent part of foldMemoryOperand already
knows how to do that.
llvm-svn: 108099
void t(int *cp0, int *cp1, int *dp, int fmd) {
int c0, c1, d0, d1, d2, d3;
c0 = (*cp0++ & 0xffff) | ((*cp1++ << 16) & 0xffff0000);
c1 = (*cp0++ & 0xffff) | ((*cp1++ << 16) & 0xffff0000);
/* ... */
}
It code gens into something pretty bad. But with this change (analogous to the
X86 back-end), it will use ldm and generate few instructions.
llvm-svn: 106693
- This fixed a number of bugs in if-converter, tail merging, and post-allocation
scheduler. If-converter now runs branch folding / tail merging first to
maximize if-conversion opportunities.
- Also changed the t2IT instruction slightly. It now defines the ITSTATE
register which is read by instructions in the IT block.
- Added Thumb2 specific hazard recognizer to ensure the scheduler doesn't
change the instruction ordering in the IT block (since IT mask has been
finalized). It also ensures no other instructions can be scheduled between
instructions in the IT block.
This is not yet enabled.
llvm-svn: 106344
addresses a longstanding deficiency noted in many FIXMEs scattered
across all the targets.
This effectively moves the problem up one level, replacing eleven
FIXMEs in the targets with eight FIXMEs in CodeGen, plus one path
through FastISel where we actually supply a DebugLoc, fixing Radar
7421831.
llvm-svn: 106243
the machine instruction representation of the immediate value to be encoded
into an integer with similar fields as the actual VMOV instruction. This makes
things easier for the disassembler, since it can just stuff the bits into the
immediate operand, but harder for the asm printer since it has to decode the
value to be printed. Testcase for the encoding will follow later when MC has
more support for ARM.
llvm-svn: 105836
instruction defines subregisters.
Any existing subreg indices on the original instruction are preserved or
composed with the new subreg index.
Also substitute multiple operands mentioning the original register by using the
new MachineInstr::substituteRegister() function. This is necessary because there
will soon be <imp-def> operands added to non read-modify-write partial
definitions. This instruction:
%reg1234:foo = FLAP %reg1234<imp-def>
will reMaterialize(%reg3333, bar) like this:
%reg3333:bar-foo = FLAP %reg333:bar<imp-def>
Finally, replace the TargetRegisterInfo pointer argument with a reference to
indicate that it cannot be NULL.
llvm-svn: 105358