llvm-project/llvm/lib
Chandler Carruth e24f3973eb [vectorize] Initial version of respecting PGO in the vectorizer: treat
cold loops as if they were being optimized for size.

Nothing fancy here. Simple test case included. The nice thing is that we
can now incrementally build on top of this to drive other heuristics.
All of the infrastructure work is done to get the profile information
into this layer.
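
As a rough illustration of the rule described above, the decision can be
sketched as a comparison between how often the loop is entered and how often
the function is entered. This is a standalone sketch, not the code in the
pass: the LoopProfile struct, the treatAsOptForSize name, and the 1/5
coldness threshold are all assumptions made for the example. In LLVM the
frequencies would come from the profile-driven block frequency analysis.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the profile data the pass would query.
struct LoopProfile {
  std::uint64_t functionEntryFreq; // frequency of the function's entry block
  std::uint64_t loopEntryFreq;     // frequency of the block that enters the loop
};

// Returns true when the loop should be treated as if it were being optimized
// for size. coldNum/coldDen is an assumed threshold (1/5 by default); it is
// not a value stated in the commit message.
// Assumes the frequencies are small enough that the products do not overflow.
bool treatAsOptForSize(const LoopProfile &p, bool functionIsOptSize,
                       std::uint64_t coldNum = 1, std::uint64_t coldDen = 5) {
  assert(coldDen > 0 && coldNum <= coldDen);
  // A function already marked for size keeps that behaviour for every loop.
  if (functionIsOptSize)
    return true;
  // Cold means: loopEntryFreq < functionEntryFreq * coldNum / coldDen,
  // rewritten without division to stay in integer arithmetic.
  return p.loopEntryFreq * coldDen < p.functionEntryFreq * coldNum;
}
```

With the default threshold, a loop entered 100 times in a function entered
1000 times counts as cold, while one entered 200 times does not.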

The remaining work necessary to make this a fully general purpose loop
unroller for very hot loops is to make it a fully general purpose loop
unroller. Things I know of but am not going to have time to benchmark
and fix in the immediate future:

1) Don't disable the entire pass when the target is lacking vector
   registers. This really doesn't make any sense any more.
2) Teach the unroller at least and the vectorizer potentially to handle
   non-if-converted loops. This is trivial for the unroller but hard for
   the vectorizer.
3) Compute the relative hotness of the loop and thread that down to the
   various places that make cost tradeoffs (very likely only the
   unroller makes sense here, and then only when dealing with loops that
   are small enough for unrolling to not completely blow out the LSD).
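
Item 3 could look something like the sketch below: compute a relative
hotness for the loop and let it gate an unroll factor, capped so the
unrolled body still fits a fixed micro-op budget (standing in for the LSD
limit). Everything here is hypothetical: the function names, the hotness
threshold of 1.0, the power-of-two factors, and the micro-op counts are
assumptions for illustration, not measured values or existing LLVM APIs.

```cpp
#include <cassert>
#include <cstdint>

// Relative hotness: how often the loop header runs per function entry.
double relativeHotness(std::uint64_t loopHeaderFreq, std::uint64_t entryFreq) {
  assert(entryFreq > 0);
  return static_cast<double>(loopHeaderFreq) / static_cast<double>(entryFreq);
}

// Picks a power-of-two unroll factor. Loops below the (assumed) hotness
// threshold are left alone; hot loops are unrolled as far as maxFactor
// allows while the unrolled body stays within the micro-op budget.
unsigned chooseUnrollFactor(double hotness, unsigned loopBodyUops,
                            unsigned uopBudget, unsigned maxFactor,
                            double hotThreshold = 1.0) {
  if (hotness < hotThreshold)
    return 1; // not hot enough to spend code size on
  unsigned factor = 1;
  while (factor * 2 <= maxFactor && loopBodyUops * factor * 2 <= uopBudget)
    factor *= 2;
  return factor;
}
```

For example, a 6-uop body with a 28-uop budget and a maximum factor of 8
would be unrolled by 4 when hot, and not at all when its hotness is 0.5.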

I'm still dubious how useful hotness information will be. So far, my
experiments show that if we can get the correct logic for determining
when unrolling actually helps performance, the code size impact is
completely unimportant and we can unroll in all cases. But at least
we'll no longer burn code size on cold code.

One somewhat unrelated idea that I've had forever but not had time to
implement: mark all functions which are only reachable via the global
constructors rigging in the module as optsize. This would also decrease
the impact of any more aggressive heuristics here on code size.
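
A minimal sketch of that idea, under simplifying assumptions: the call graph
is a plain map from function name to callees, and the set of "other" entry
points (for instance externally visible functions) is given by the caller.
The names CallGraph, reachableFrom, and ctorOnlyFunctions are made up for
this example, and indirect calls or address-taken functions are ignored, so
a real implementation would have to be more conservative.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical call graph: function name -> names of the functions it calls.
using CallGraph = std::map<std::string, std::vector<std::string>>;

// Collects every function reachable from the given roots (roots included).
std::set<std::string> reachableFrom(const CallGraph &g,
                                    const std::vector<std::string> &roots) {
  std::set<std::string> seen;
  std::vector<std::string> work(roots.begin(), roots.end());
  while (!work.empty()) {
    std::string f = work.back();
    work.pop_back();
    if (!seen.insert(f).second)
      continue; // already visited
    auto it = g.find(f);
    if (it == g.end())
      continue; // leaf function with no recorded callees
    for (const auto &callee : it->second)
      work.push_back(callee);
  }
  return seen;
}

// Functions reachable from the global constructors but from no other root.
// These are the candidates that would be marked optsize.
std::set<std::string> ctorOnlyFunctions(const CallGraph &g,
                                        const std::vector<std::string> &ctors,
                                        const std::vector<std::string> &otherRoots) {
  std::set<std::string> fromCtors = reachableFrom(g, ctors);
  std::set<std::string> fromOthers = reachableFrom(g, otherRoots);
  std::set<std::string> result;
  for (const auto &f : fromCtors)
    if (fromOthers.count(f) == 0)
      result.insert(f);
  return result;
}
```

A helper called from both a constructor and main is excluded, since it is
also reachable from ordinary code.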

llvm-svn: 200219
2014-01-27 13:11:50 +00:00
Analysis Fix crasher introduced in r200203 and caught by a libc++ buildbot. Don't assume that getMulExpr returns a SCEVMulExpr, it may have simplified it to something else! 2014-01-27 10:47:44 +00:00
AsmParser Add an inalloca flag to allocas 2014-01-17 23:58:17 +00:00
Bitcode Make parseBitcodeFile return an ErrorOr<Module *>. 2014-01-15 01:08:23 +00:00
CodeGen Fix for PR18102. 2014-01-27 09:18:31 +00:00
DebugInfo [Sparc] Add support for parsing DW_CFA_GNU_window_save. 2014-01-26 05:13:44 +00:00
ExecutionEngine Fix known typos 2014-01-24 17:20:08 +00:00
IR Fix llvm-dis to print the inalloca bit on allocas. 2014-01-25 01:24:06 +00:00
IRReader Make parseBitcodeFile return an ErrorOr<Module *>. 2014-01-15 01:08:23 +00:00
LTO Construct the MCStreamer before constructing the MCTargetStreamer. 2014-01-26 06:06:37 +00:00
Linker Reapply r194218 with fix: 2014-01-16 06:29:36 +00:00
MC AsmParser: improve diagnostics for invalid variants 2014-01-26 22:29:43 +00:00
Object llvm-readobj: add support for PE32+ (Windows 64 bit executable). 2014-01-26 04:15:52 +00:00
Option Avoid buffer copies when a Twine already is a StringRef. 2013-12-03 18:18:28 +00:00
Support Roll back the ConstStringRef change for now 2014-01-27 05:24:39 +00:00
TableGen [TableGen] Correctly generate implicit anonymous prototype defs in multiclasses 2014-01-02 20:47:09 +00:00
Target XCore: Fix typo in function name. 2014-01-27 11:50:13 +00:00
Transforms [vectorize] Initial version of respecting PGO in the vectorizer: treat 2014-01-27 13:11:50 +00:00
CMakeLists.txt Move LTO support library to a component, allowing it to be tested 2013-09-24 23:52:22 +00:00
LLVMBuild.txt Move LTO support library to a component, allowing it to be tested 2013-09-24 23:52:22 +00:00
Makefile Reformat Makefile. No other changes. 2013-10-30 04:03:03 +00:00