This website requires JavaScript.
Explore
Help
Sign In
maxjhandsome
/
llvm-project
forked from
OSchip/llvm-project
Watch
1
Star
0
Fork
You've already forked llvm-project
0
Code
Issues
Pull Requests
Packages
Releases
Wiki
Activity
3b243608f5
llvm-project
/
llvm
/
test
/
Transforms
/
LoopUnroll
/
X86
/
lit.local.cfg
4 lines
68 B
INI
Raw
Normal View
History
Unescape
Escape
Reduce verbiage of lit.local.cfg files We can just split targets_to_build in one place and make it immutable. llvm-svn: 210496
2014-06-10 06:42:55 +08:00
if not 'X86' in config.root.targets:
Implement X86TTI::getUnrollingPreferences This provides an initial implementation of getUnrollingPreferences for x86. getUnrollingPreferences is used by the generic (concatenation) unroller, which is distinct from the unrolling done by the loop vectorizer. Many modern x86 cores have some kind of uop cache and loop-stream detector (LSD) used to efficiently dispatch small loops, and taking full advantage of this requires unrolling small loops (small here means 10s of uops). These caches also have limits on the number of taken branches in the loop, and so we also cap the loop unrolling factor based on the maximum "depth" of the loop. This is currently calculated with a partial DFS traversal (partial because it will stop early if the path length grows too much). This is still an approximation, and one that is both conservative (because it does not account for branches eliminated via block placement) and optimistic (because it is only recording the maximum depth over minimum paths). Nevertheless, because the loops that fit in these uop caches are so small, it is not clear how much the details matter. The original set of patches posted for review produced the following test-suite performance results (from the TSVC benchmark) at that time: ControlLoops-dbl - 13% speedup ControlLoops-flt - 15% speedup Reductions-dbl - 7.5% speedup llvm-svn: 205348
2014-04-02 02:50:34 +08:00
config.unsupported
=
True