forked from OSchip/llvm-project
8fe62b7af1
This patch allows PRE of the following type of loads: ``` preheader: br label %loop loop: br i1 ..., label %merge, label %clobber clobber: call foo() // Clobbers %p br label %merge merge: ... br i1 ..., label %loop, label %exit ``` Into ``` preheader: %x0 = load %p br label %loop loop: %x.pre = phi(x0, x2) br i1 ..., label %merge, label %clobber clobber: call foo() // Clobbers %p %x1 = load %p br label %merge merge: x2 = phi(x.pre, x1) ... br i1 ..., label %loop, label %exit ``` So instead of loading from %p on every iteration, we load only when the actual clobber happens. The typical pattern which it is trying to address is: hot loop, with all code inlined and provably having no side effects, and some side-effecting calls on cold path. The worst overhead from it is, if we always take clobber block, we make 1 more load overall (in preheader). It only matters if loop has very few iteration. If clobber block is not taken at least once, the transform is neutral or profitable. There are several improvements prospect open up: - We can sometimes be smarter in loop-exiting blocks via split of critical edges; - If we have block frequency info, we can handle multiple clobbers. The only obstacle now is that we don't know if their sum is colder than the header. Differential Revision: https://reviews.llvm.org/D99926 Reviewed By: reames |
||
---|---|---|
.. | ||
Analysis | ||
Assembler | ||
Bindings | ||
Bitcode | ||
BugPoint | ||
CodeGen | ||
DebugInfo | ||
Demangle | ||
Examples | ||
ExecutionEngine | ||
Feature | ||
FileCheck | ||
Instrumentation | ||
Integer | ||
JitListener | ||
LTO | ||
Linker | ||
MC | ||
MachineVerifier | ||
Object | ||
ObjectYAML | ||
Other | ||
SafepointIRVerifier | ||
Support | ||
SymbolRewriter | ||
TableGen | ||
ThinLTO/X86 | ||
Transforms | ||
Unit | ||
Verifier | ||
YAMLParser | ||
tools | ||
.clang-format | ||
CMakeLists.txt | ||
TestRunner.sh | ||
lit.cfg.py | ||
lit.site.cfg.py.in |