regB = move RCX
regA = op regB, regC
RAX = move regA
where both regB and regC are killed. If regB is constrainted to non-compatible
physical registers but regC is not constrainted at all, then it's better to
commute the instruction.
movl %edi, %eax
shlq $32, %rcx
leaq (%rcx,%rax), %rax
=>
movl %edi, %eax
shlq $32, %rcx
orq %rcx, %rax
rdar://8762995
llvm-svn: 121793
must be called in the pass's constructor. This function uses static dependency declarations to recursively initialize
the pass's dependencies.
Clients that only create passes through the createFooPass() APIs will require no changes. Clients that want to use the
CommandLine options for passes will need to manually call the appropriate initialization functions in PassInitialization.h
before parsing commandline arguments.
I have tested this with all standard configurations of clang and llvm-gcc on Darwin. It is possible that there are problems
with the static dependencies that will only be visible with non-standard options. If you encounter any crash in pass
registration/creation, please send the testcase to me directly.
llvm-svn: 116820
perform initialization without static constructors AND without explicit initialization
by the client. For the moment, passes are required to initialize both their
(potential) dependencies and any passes they preserve. I hope to be able to relax
the latter requirement in the future.
llvm-svn: 116334
If we are emitting COPY instructions for the REG_SEQUENCE, make sure the kill
flag goes on the last COPY. Otherwise we may be using a killed register.
<rdar://problem/8287792>
llvm-svn: 110589
ScheduleDAGEmit, TwoAddressLowering, and PHIElimination.
This switches the bulk of register copies to using COPY, but many less used
copyRegToReg calls remain.
llvm-svn: 108050
EXTRACT_SUBREG no longer appears as a machine instruction. Use COPY instead.
Add isCopy() checks in many places using isMoveInstr() and isExtractSubreg().
The isMoveInstr hook will be removed later.
llvm-svn: 107879
INSERT_SUBREG will now only appear in SSA machine instructions.
Fix the handling of partial redefs in ProcessImplicitDefs. This is now relevant
since partial redef COPY instructions appear.
llvm-svn: 107726
- X86 unfolding should check if the instructions being unfolded has memoperands.
If there is no memoperands, then it must assume conservative alignment. If this
would introduce an expensive sse unaligned load / store, then unfoldMemoryOperand
etc. should not unfold the instruction.
llvm-svn: 107509
opportunities. For example, this lets it emit this:
movq (%rax), %rcx
addq %rdx, %rcx
instead of this:
movq %rdx, %rcx
addq (%rax), %rcx
in the case where %rdx has subsequent uses. It's the same number
of instructions, and usually the same encoding size on x86, but
it appears faster, and in general, it may allow better scheduling
for the load.
llvm-svn: 106493
combined to an insert_subreg, i.e., where the destination register is larger
than the source. We need to check that the subregs can be composed for that
case in a symmetrical way to the case when the destination is smaller.
llvm-svn: 106004
replacing the overly conservative checks that I had introduced recently to
deal with correctness issues. This makes a pretty noticable difference
in our testcases where reg_sequences are used. I've updated one test to
check that we no longer emit the unnecessary subreg moves.
llvm-svn: 105991
Check that all the instructions are in the same basic block, that the
EXTRACT_SUBREGs write to the same subregs that are being extracted, and that
the source and destination registers are in the same regclass. Some of
these constraints can be relaxed with a bit more work. Jakob suggested
that the loop that checks for subregs when NewSubIdx != 0 should use the
"nodbg" iterator, so I made that change here, too.
llvm-svn: 105437
instruction defines subregisters.
Any existing subreg indices on the original instruction are preserved or
composed with the new subreg index.
Also substitute multiple operands mentioning the original register by using the
new MachineInstr::substituteRegister() function. This is necessary because there
will soon be <imp-def> operands added to non read-modify-write partial
definitions. This instruction:
%reg1234:foo = FLAP %reg1234<imp-def>
will reMaterialize(%reg3333, bar) like this:
%reg3333:bar-foo = FLAP %reg333:bar<imp-def>
Finally, replace the TargetRegisterInfo pointer argument with a reference to
indicate that it cannot be NULL.
llvm-svn: 105358
that are aliases of the specified register.
- Rename modifiesRegister to definesRegister since it's looking a def of the
specific register or one of its super-registers. It's not looking for def of a
sub-register or alias that could change the specified register.
- Added modifiesRegister to look for defs of aliases.
llvm-svn: 104377
instructions.
e.g.
%reg1026<def> = VLDMQ %reg1025<kill>, 260, pred:14, pred:%reg0
%reg1027<def> = EXTRACT_SUBREG %reg1026, 6
%reg1028<def> = EXTRACT_SUBREG %reg1026<kill>, 5
...
%reg1029<def> = REG_SEQUENCE %reg1028<kill>, 5, %reg1027<kill>, 6, %reg1028, 7, %reg1027, 8, %reg1028, 9, %reg1027, 10, %reg1030<kill>, 11, %reg1032<kill>, 12
After REG_SEQUENCE is eliminated, we are left with:
%reg1026<def> = VLDMQ %reg1025<kill>, 260, pred:14, pred:%reg0
%reg1029:6<def> = EXTRACT_SUBREG %reg1026, 6
%reg1029:5<def> = EXTRACT_SUBREG %reg1026<kill>, 5
The regular coalescer will not be able to coalesce reg1026 and reg1029 because it doesn't
know how to combine sub-register indices 5 and 6. Now 2-address pass will consult the
target whether sub-registers 5 and 6 of reg1026 can be combined to into a larger
sub-register (or combined to be reg1026 itself as is the case here). If it is possible,
it will be able to replace references of reg1026 with reg1029 + the larger sub-register
index.
llvm-svn: 103835
into TargetOpcodes.h. #include the new TargetOpcodes.h
into MachineInstr. Add new inline accessors (like isPHI())
to MachineInstr, and start using them throughout the
codebase.
llvm-svn: 95687
When TwoAddressInstructionPass deletes a dead instruction, make sure that all
register kills are accounted for. The 2-addr register does not get special
treatment.
llvm-svn: 89246
is trivially rematerializable and integrate it into
TargetInstrInfo::isTriviallyReMaterializable. This way, all places that
need to know whether an instruction is rematerializable will get the
same answer.
This enables the useful parts of the aggressive-remat option by
default -- using AliasAnalysis to determine whether a memory location
is invariant, and removes the questionable parts -- rematting operations
with virtual register inputs that may not be live everywhere.
llvm-svn: 83687
for the complicated case where one register is tied to multiple destinations.
This avoids the extra scan of instruction operands that was introduced by
my recent change. I also pulled some code out into a separate
TryInstructionTransform method, added more comments, and renamed some
variables.
Besides all those changes, this takes care of a FIXME in the code regarding
an assumption about there being a single tied use of a register when
converting to a 3-address form. I'm not aware of cases where that assumption
is violated, but the code now only attempts to transform an instruction,
either by commuting its operands or by converting to a 3-address form,
for the simple case where there is a single pair of tied operands.
llvm-svn: 80945
tied to different source registers, the TwoAddressInstructionPass needs to
be smarter. Change it to check before replacing a source register whether
that source register is tied to a different destination register, and if so,
defer handling it until a subsequent iteration.
llvm-svn: 80654
- Some clients which used DOUT have moved to DEBUG. We are deprecating the
"magic" DOUT behavior which avoided calling printing functions when the
statement was disabled. In addition to being unnecessary magic, it had the
downside of leaving code in -Asserts builds, and of hiding potentially
unnecessary computations.
llvm-svn: 77019
with SUBREG_TO_REG, teach SimpleRegisterCoalescing to coalesce
SUBREG_TO_REG instructions (which are similar to INSERT_SUBREG
instructions), and teach the DAGCombiner to take advantage of this on
targets which support it. This eliminates many redundant
zero-extension operations on x86-64.
This adds a new TargetLowering hook, isZExtFree. It's similar to
isTruncateFree, except it only applies to actual definitions, and not
no-op truncates which may not zero the high bits.
Also, this adds a new optimization to SimplifyDemandedBits: transform
operations like x+y into (zext (add (trunc x), (trunc y))) on targets
where all the casts are no-ops. In contexts where the high part of the
add is explicitly masked off, this allows the mask operation to be
eliminated. Fix the DAGCombiner to avoid undoing these transformations
to eliminate casts on targets where the casts are no-ops.
Also, this adds a new two-address lowering heuristic. Since
two-address lowering runs before coalescing, it helps to be able to
look through copies when deciding whether commuting and/or
three-address conversion are profitable.
Also, fix a bug in LiveInterval::MergeInClobberRanges. It didn't handle
the case that a clobber range extended both before and beyond an
existing live range. In that case, multiple live ranges need to be
added. This was exposed by the new subreg coalescing code.
Remove 2008-05-06-SpillerBug.ll. It was bugpoint-reduced, and the
spiller behavior it was looking for no longer occurrs with the new
instruction selection.
llvm-svn: 68576
e.g.
%reg1024<def> = MOV r1
%reg1025<def> = ADD %reg1024, %reg1026
r0 = MOV %reg1025
If it's not possible / profitable to commute ADD, then turning ADD into a LEA saves a copy.
llvm-svn: 68065
%reg1028<def> = EXTRACT_SUBREG %reg1027<kill>, 1
%reg1029<def> = MOV8rr %reg1028
%reg1029<def> = SHR8ri %reg1029, 7, %EFLAGS<imp-def,dead>
insert => %reg1030<def> = MOV8rr %reg1028
%reg1030<def> = ADD8rr %reg1028<kill>, %reg1029<kill>, %EFLAGS<imp-def,dead>
In this case, it might not be possible to coalesce the second MOV8rr
instruction if the first one is coalesced. So it would be profitable to
commute it:
%reg1028<def> = EXTRACT_SUBREG %reg1027<kill>, 1
%reg1029<def> = MOV8rr %reg1028
%reg1029<def> = SHR8ri %reg1029, 7, %EFLAGS<imp-def,dead>
insert => %reg1030<def> = MOV8rr %reg1029
%reg1030<def> = ADD8rr %reg1029<kill>, %reg1028<kill>, %EFLAGS<imp-def,dead>
llvm-svn: 62954
Running /Users/void/llvm/llvm.src/test/CodeGen/X86/dg.exp ...
FAIL: /Users/void/llvm/llvm.src/test/CodeGen/X86/2007-11-30-LoadFolding-Bug.ll
Failed with exit(1) at line 1
while running: llvm-as < /Users/void/llvm/llvm.src/test/CodeGen/X86/2007-11-30-LoadFolding-Bug.ll | llc -march=x86 -mattr=+sse2 -stats |& grep {1 .*folded into instructions}
child process exited abnormally
Make this conditional for now.
llvm-svn: 51563
address of the PassInfo directly instead of calling getPassInfo.
This eliminates a bunch of dynamic initializations of static data.
Also, fold RegisterPassBase into PassInfo, make a bunch of its
data members const, and rearrange some code to initialize data
members in constructors instead of using setter member functions.
llvm-svn: 51022
all clients over to using predicates instead of these flags directly.
These are now private values which are only to be used to statically
initialize the tables.
llvm-svn: 45692
flags that can be set. Add predicates for the ones lacking it, and switch
some clients over to using the predicates instead of Flags directly.
llvm-svn: 45690
that it is cheap and efficient to get.
Move a variety of predicates from TargetInstrInfo into
TargetInstrDescriptor, which makes it much easier to query a predicate
when you don't have TII around. Now you can use MI->getDesc()->isBranch()
instead of going through TII, and this is much more efficient anyway. Not
all of the predicates have been moved over yet.
Update old code that used MI->getInstrDescriptor()->Flags to use the
new predicates in many places.
llvm-svn: 45674
that "machine" classes are used to represent the current state of
the code being compiled. Given this expanded name, we can start
moving other stuff into it. For now, move the UsedPhysRegs and
LiveIn/LoveOuts vectors from MachineFunction into it.
Update all the clients to match.
This also reduces some needless #includes, such as MachineModuleInfo
from MachineFunction.
llvm-svn: 45467