VirtRegMap keeps track of allocations so it knows what's not used. As a horrible hack, the stack coloring can color spill slots with *free* registers. That is, it replace reload and spills with copies from and to the free register. It unfold instructions that load and store the spill slot and replace them with register using variants.
Not yet enabled. This is part 1. More coming.
llvm-svn: 70787
This fixes a very subtle bug. vr defined by an implicit_def is allowed overlap with any register since it doesn't actually modify anything. However, if it's used as a two-address use, its live range can be extended and it can be spilled. The spiller must take care not to emit a reload for the vn number that's defined by the implicit_def. This is both a correctness and performance issue.
llvm-svn: 69743
%reg1498<def> = MOV32rm %reg1024, 1, %reg0, 12, %reg0, Mem:LD(4,4) [sunkaddr39 + 0]
%reg1506<def> = MOV32rm %reg1024, 1, %reg0, 8, %reg0, Mem:LD(4,4) [sunkaddr42 + 0]
%reg1486<def> = MOV32rr %reg1506
%reg1486<def> = XOR32rr %reg1486, %reg1498, %EFLAGS<imp-def,dead>
%reg1510<def> = MOV32rm %reg1024, 1, %reg0, 4, %reg0, Mem:LD(4,4) [sunkaddr45 + 0]
=>
%reg1498<def> = MOV32rm %reg2036, 1, %reg0, 12, %reg0, Mem:LD(4,4) [sunkaddr39 + 0]
%reg1506<def> = MOV32rm %reg2037, 1, %reg0, 8, %reg0, Mem:LD(4,4) [sunkaddr42 + 0]
%reg1486<def> = MOV32rr %reg1506
%reg1486<def> = XOR32rr %reg1486, %reg1498, %EFLAGS<imp-def,dead>
%reg1510<def> = MOV32rm %reg2038, 1, %reg0, 4, %reg0, Mem:LD(4,4) [sunkaddr45 + 0]
From linearscan's point of view, each of reg2036, 2037, and 2038 are separate registers, each is "killed" after a single use. The reloaded register is available and it's often clobbered right away. e.g. In thise case reg1498 is allocated EAX while reg2036 is allocated RAX. This means we end up with multiple reloads from the same stack slot in the same basic block.
Now linearscan recognize there are other reloads from same SS in the same BB. So it'll "downgrade" RAX (and its aliases) after reg2036 is allocated until the next reload (reg2037) is done. This greatly increase the likihood reloads from SS are reused.
This speeds up sha1 from OpenSSL by 5.8%. It is also an across the board win for SPEC2000 and 2006.
llvm-svn: 69585
e.g. allocating for GR32, bh is not used, updating bl spill weight.
bl should get the same spill weight otherwise it will be choosen
as a spill candidate since spilling bh doesn't make ebx available.
This fix PR2866.
llvm-svn: 67574
instead of requiring all "short description" strings to begin with
two spaces. This makes these strings less mysterious, and it fixes
some cases where short description strings mistakenly did not
begin with two spaces.
llvm-svn: 57521
"If a re-materializable instruction has a register
operand, the spiller will change the register operand's
spill weight to HUGE_VAL to avoid it being spilled.
However, if the operand is already in the queue ready
to be spilled, avoid re-materializing it".
llvm-svn: 56837
RA problem by expanding the live interval of an
earlyclobber def back one slot. Remove
overlap-earlyclobber throughout. Remove
earlyclobber bits and their handling from
live internals.
llvm-svn: 56539
with an earlyclobber operand elsewhere. Propagate
this bit and the earlyclobber bit through SDISel.
Change linear-scan RA not to allocate regs in a way
that conflicts with an earlyclobber. See also comments.
llvm-svn: 56290
to multiply the instruction count by a constant factor in a few places, which
caused the register allocator to require many more iterations.
llvm-svn: 53959
instead of init'ing it maximally to zeros on entry. getFreePhysReg
is pretty hot and only a few elements are typically used. This speeds
up linscan by 5% on 176.gcc.
llvm-svn: 47631
that "machine" classes are used to represent the current state of
the code being compiled. Given this expanded name, we can start
moving other stuff into it. For now, move the UsedPhysRegs and
LiveIn/LoveOuts vectors from MachineFunction into it.
Update all the clients to match.
This also reduces some needless #includes, such as MachineModuleInfo
from MachineFunction.
llvm-svn: 45467
When a live interval is being spilled, rather than creating short, non-spillable
intervals for every def / use, split the interval at BB boundaries. That is, for
every BB where the live interval is defined or used, create a new interval that
covers all the defs and uses in the BB.
This is designed to eliminate one common problem: multiple reloads of the same
value in a single basic block. Note, it does *not* decrease the number of spills
since no copies are inserted so the split intervals are *connected* through
spill and reloads (or rematerialization). The newly created intervals can be
spilled again, in that case, since it does not span multiple basic blocks, it's
spilled in the usual manner. However, it can reuse the same stack slot as the
previously split interval.
This is currently controlled by -split-intervals-at-bb.
llvm-svn: 44198
can be eliminated by the allocator is the destination and source targets the
same register. The most common case is when the source and destination registers
are in different class. For example, on x86 mov32to32_ targets GR32_ which
contains a subset of the registers in GR32.
The allocator can do 2 things:
1. Set the preferred allocation for the destination of a copy to that of its source.
2. After allocation is done, change the allocation of a copy destination (if
legal) so the copy can be eliminated.
This eliminates 443 extra moves from 403.gcc.
llvm-svn: 43662
simultaneously. Move that pass to SimpleRegisterCoalescing.
This makes it easier to implement alternative register allocation and
coalescing strategies while maintaining reuse of the existing live
interval analysis.
llvm-svn: 37520
long live interval that has low usage density.
1. Change order of coalescing to join physical registers with virtual
registers first before virtual register intervals become too long.
2. Check size and usage density to determine if it's worthwhile to join.
3. If joining is aborted, assign virtual register live interval allocation
preference field to the physical register.
4. Register allocator should try to allocate to the preferred register
first (if available) to create identify moves that can be eliminated.
llvm-svn: 36218