forked from OSchip/llvm-project
bd1be4d01c
discussions with Andy. Fundamentally, the previous algorithm is both counter productive on several fronts and prioritizing things which aren't necessarily the most important: static branch prediction. The new algorithm uses the existing loop CFG structure information to walk through the CFG itself to layout blocks. It coalesces adjacent blocks within the loop where the CFG allows based on the most likely path taken. Finally, it topologically orders the block chains that have been formed. This allows it to choose a (mostly) topologically valid ordering which still priorizes fallthrough within the structural constraints. As a final twist in the algorithm, it does violate the CFG when it discovers a "hot" edge, that is an edge that is more than 4x hotter than the competing edges in the CFG. These are forcibly merged into a fallthrough chain. Future transformations that need te be added are rotation of loop exit conditions to be fallthrough, and better isolation of cold block chains. I'm also planning on adding statistics to model how well the algorithm does at laying out blocks based on the probabilities it receives. The old tests mostly still pass, and I have some new tests to add, but the nested loops are still behaving very strangely. This almost seems like working-as-intended as it rotated the exit branch to be fallthrough, but I'm not convinced this is actually the best layout. It is well supported by the probabilities for loops we currently get, but those are pretty broken for nested loops, so this may change later. llvm-svn: 142743 |
||
---|---|---|
.. | ||
ARM | ||
Alpha | ||
Blackfin | ||
CBackend | ||
CPP | ||
CellSPU | ||
Generic | ||
MBlaze | ||
MSP430 | ||
Mips | ||
PTX | ||
PowerPC | ||
SPARC | ||
SystemZ | ||
Thumb | ||
Thumb2 | ||
X86 | ||
XCore |