This nicely handles the most common case of virtual register sets, but
also handles anticipated cases where we will map pointers to IDs.
The goal is not to develop a completely generic SparseSet
template. Instead we want to handle the expected uses within llvm
without any template antics in the client code. I'm adding a bit of
template nastiness here, and some assumption about expected usage in
order to make the client code very clean.
The expected common uses cases I'm designing for:
- integer keys that need to be reindexed, and may map to additional
data
- densely numbered objects where we want pointer keys because no
number->object map exists.
llvm-svn: 155227
This makes RAFast 4% faster, and it gets rid of the dodgy DenseMap
iteration.
This also revealed that RAFast would sometimes dereference DenseMap
iterators after erasing other elements from the map. That does seem to
work in the current DenseMap implementation, but SparseSet doesn't allow
it.
llvm-svn: 151111
Passes after RegAlloc should be able to rely on MRI->getNumVirtRegs() == 0.
This makes sharing code for pre/postRA passes more robust.
Now, to check if a pass is running before the RA pipeline begins, use MRI->isSSA().
To check if a pass is running after the RA pipeline ends, use !MRI->getNumVirtRegs().
PEI resets virtual regs when it's done scavenging.
PTX will either have to provide its own PEI pass or assign physregs.
llvm-svn: 151032
MRI keeps track of which physregs have been used. Make sure it gets
updated with all the regmask-clobbered registers.
Delete the closePhysRegsUsed() function which isn't necessary.
llvm-svn: 150830
Creates a configurable regalloc pipeline.
Ensure specific llc options do what they say and nothing more: -reglloc=... has no effect other than selecting the allocator pass itself. This patch introduces a new umbrella flag, "-optimize-regalloc", to enable/disable the optimizing regalloc "superpass". This allows for example testing coalscing and scheduling under -O0 or vice-versa.
When a CodeGen pass requires the MachineFunction to have a particular property, we need to explicitly define that property so it can be directly queried rather than naming a specific Pass. For example, to check for SSA, use MRI->isSSA, not addRequired<PHIElimination>.
CodeGen transformation passes are never "required" as an analysis
ProcessImplicitDefs does not require LiveVariables.
We have a plan to massively simplify some of the early passes within the regalloc superpass.
llvm-svn: 150226
This removes implicit assumption about the form of MI coming into regalloc. In particular, it should be independent of ProcessImplicitDefs which will eventually become a standard part of coming out of SSA--unless we simply can eliminate IMPLICIT_DEF completely. Current unit tests expose this once I remove incidental pass ordering restrictions.
This is not a final fix. Just a temporary workaround until I figure out the right way.
llvm-svn: 149360
The register allocators don't currently support adding reserved
registers while they are running. Extend the MRI API to keep track of
the set of reserved registers when register allocation started.
Target hooks like hasFP() and needsStackRealignment() can look at this
set to avoid reserving more registers during register allocation.
llvm-svn: 147577
generator to it. For non-bundle instructions, these behave exactly the same
as the MC layer API.
For properties like mayLoad / mayStore, look into the bundle and if any of the
bundled instructions has the property it would return true.
For properties like isPredicable, only return true if *all* of the bundled
instructions have the property.
For properties like canFoldAsLoad, isCompare, conservatively return false for
bundles.
llvm-svn: 146026
sink them into MC layer.
- Added MCInstrInfo, which captures the tablegen generated static data. Chang
TargetInstrInfo so it's based off MCInstrInfo.
llvm-svn: 134021
In particular, don't spill dirty registers only to satisfy a hint. It is
not worth it.
The attached test case provides an example where the fast allocator
would spill a register when other registers are available.
llvm-svn: 132900
When compiling a program with lots of small functions like
483.xalancbmk, this makes RAFast 11% faster.
Add some comments to clarify the difference between unallocatable and
reserved registers. It's quite subtle.
The fast register allocator depends on EFLAGS' not being allocatable on
x86. That way it can completely avoid tracking liveness, and it won't
mind when there are multiple uses of a single def.
llvm-svn: 132514
registers for fast allocation a different way. This has us updating
used registers only when we're using that exact register.
Fixes rdar://9207598
llvm-svn: 129711
when no virtual registers have been allocated.
It was only used to resize IndexedMaps, so provide an IndexedMap::resize()
method such that
Map.grow(MRI.getLastVirtReg());
can be replaced with the simpler
Map.resize(MRI.getNumVirtRegs());
This works correctly when no virtuals are allocated, and it bypasses the to/from
index conversions.
llvm-svn: 123130
Print virtual registers numbered from 0 instead of the arbitrary
FirstVirtualRegister. The first virtual register is printed as %vreg0.
TRI::NoRegister is printed as %noreg.
llvm-svn: 123107
must be called in the pass's constructor. This function uses static dependency declarations to recursively initialize
the pass's dependencies.
Clients that only create passes through the createFooPass() APIs will require no changes. Clients that want to use the
CommandLine options for passes will need to manually call the appropriate initialization functions in PassInitialization.h
before parsing commandline arguments.
I have tested this with all standard configurations of clang and llvm-gcc on Darwin. It is possible that there are problems
with the static dependencies that will only be visible with non-standard options. If you encounter any crash in pass
registration/creation, please send the testcase to me directly.
llvm-svn: 116820
overload UserInInstr. Explicitly check Allocatable. The early exit in the
condition will mean the performance impact of the extra test should be
minimal.
llvm-svn: 113016
multiple defs, like t2LDRSB_POST.
The first def could accidentally steal the physreg that the second, tied def was
required to be allocated to.
Now, the tied use-def is treated more like an early clobber, and the physreg is
reserved before allocating the other defs.
This would never be a problem when the tied def was the only def which is the
usual case.
This fixes MallocBench/gs for thumb2 -O0.
llvm-svn: 109715
A partial redefine needs to be treated like a tied operand, and the register
must be reloaded while processing use operands.
This fixes a bug where partially redefined registers were processed as normal
defs with a reload added. The reload could clobber another use operand if it was
a kill that allowed register reuse.
llvm-svn: 107193
When an instruction has tied operands and physreg defines, we must take extra
care that the tied operands conflict with neither physreg defs nor uses.
The special treatment is given to inline asm and instructions with tied operands
/ early clobbers and physreg defines.
This fixes PR7509.
llvm-svn: 107043
Early clobbers defining a virtual register were first alocated to a physreg and
then processed as a physreg EC, spilling the virtreg.
This fixes PR7382.
llvm-svn: 105998
register allocation.
Process all of the clobber lists at the end of the function, marking the
registers as used in MachineRegisterInfo.
This is necessary in case the calls clobber callee-saved registers (sic).
llvm-svn: 105473
A partial redef now triggers a reload if required. Also don't add
<imp-def,dead> operands for physical superregisters.
Kill flags are still treated as full register kills, and <imp-use,kill> operands
are added for physical superregisters as before.
llvm-svn: 104167
This fixes the miscompilations of MultiSource/Applications/JM/l{en,de}cod.
Clang now successfully self hosts in a debug build with the fast register allocator.
llvm-svn: 103975
While that approach works wonders for register pressure, it tends to break
everything.
This should unbreak the arm-linux builder and fix a number of miscompilations.
llvm-svn: 103946
out aliases when allocating. Clean up allocVirtReg().
Use calcSpillCost() to allow more aggressive hinting. Now the hint is always
taken unless blocked by a reserved register. This leads to more coalescing,
lower register pressure, and less spilling.
llvm-svn: 103939
This is safe to do because the physreg has been marked UsedInInstr and the kill flag will be set on the last operand using the virtreg if there are more then one.
llvm-svn: 103933
Debug code doesn't use callee saved registers anyway, and the code is simpler this way. Now spillVirtReg always kills, and the isKill parameter is not needed.
llvm-svn: 103927
a condition's grouping. Every other use of Allocatable.test(Hint) groups it the
same way as it is indented, so move the parentheses to agree with that
grouping.
llvm-svn: 103869
When working top-down in a basic block, substituting physregs for virtregs, the use-def chains are kept up to date. That means we can recognize a virtreg kill by the use-def chain becoming empty.
This makes the fast allocator independent of incoming kill flags.
llvm-svn: 103866
- Kill is implicit when use and def registers are identical.
- Only virtual registers can differ.
Add a -verify-fast-regalloc to run the verifier before the fast allocator.
llvm-svn: 103797
This loop is quadratic in the capacity for a DenseMap:
while(!map.empty())
map.erase(map.begin());
Instead we now do a normal begin() - end() iteration followed by map.clear().
That also has the nice sideeffect of shrinking the map capacity on demand.
llvm-svn: 103747
Sorry for the big change. The path leading up to this patch had some TableGen
changes that I didn't want to commit before I knew they were useful. They
weren't, and this version does not need them.
The fast register allocator now does no liveness calculations. Instead it relies
on kill flags provided by isel. (Currently those kill flags are also ignored due
to isel bugs). The allocation algorithm is supposed to work with any subset of
valid kill flags. More kill flags simply means fewer spills inserted.
Registers are allocated from a working set that contains no aliases. That means
most allocations can be done directly without expensive alias checks. When the
working set runs out of registers we do the full alias check to find new free
registers.
llvm-svn: 103488
This actually makes everything slower, but the plan is to have isel add <kill>
flags the way it is already adding <dead> flags. Then LiveVariables can be
removed again.
When ignoring the time spent in LiveVariables, -regalloc=fast is now twice as
fast as -regalloc=local.
llvm-svn: 102034
So far this is just a clone of -regalloc=local that has been lobotomized to run
25% faster. It drops the least-recently-used calculations, and is just plain
stupid when it runs out of registers.
The plan is to make this go even faster for -O0 by taking advantage of the short
live intervals in unoptimized code. It should not be necessary to calculate
liveness when most virtual registers are killed 2-3 instructions after they are
born.
llvm-svn: 102006