llvm-project/clang/lib/CodeGen
Rong Xu 9c6f1538cc [PGO] Change profile use cc1 option to handle IR level profiles
This patch changes cc1 option for PGO profile use from
-fprofile-instr-use=<path> to -fprofile-instrument-use-path=<path>.
-fprofile-instr-use=<path> is now a driver only option.

In addition to decoupling the cc1 option from the driver-level option, this patch
also enables IR-level profile use. cc1 option handling now reads the profile
header and sets CodeGenOpt ProfileUse (valid values are {None, Clang, LLVM}
-- this is a common enum for -fprofile-instrument={}, for the profile
instrumentation), and invokes the pipeline to enable the respective PGO use pass.

Reviewers: silvas, davidxl

Differential Revision: http://reviews.llvm.org/D17737

llvm-svn: 262515
2016-03-02 20:59:36 +00:00
ABIInfo.h Add support for Android Vector calling convention for AArch64 2016-02-22 16:48:42 +00:00
Address.h Work around build failure due to GCC 4.8.1 bug. We don't completely understand 2016-02-02 23:11:49 +00:00
BackendUtil.cpp [PGO] Change profile use cc1 option to handle IR level profiles 2016-03-02 20:59:36 +00:00
CGAtomic.cpp [MSVC Compat] Don't provide /volatile:ms semantics to types > pointer 2016-01-22 16:36:44 +00:00
CGBlocks.cpp Move DebugInfoKind into its own header to cut the cyclic dependency edge from Driver to Frontend. 2016-02-02 11:06:51 +00:00
CGBlocks.h Move BlockByrefHelpers back to CodeGenModule.h to placate MSVC. 2015-09-08 08:21:11 +00:00
CGBuilder.h Revert "Change memcpy/memset/memmove to have dest and source alignments." 2015-11-19 05:55:59 +00:00
CGBuiltin.cpp Add __builtin_canonicalize 2016-02-27 09:06:18 +00:00
CGCUDABuiltin.cpp [CUDA] Don't crash when trying to printf a non-scalar object. 2016-02-11 02:00:52 +00:00
CGCUDANV.cpp [CUDA] Do not generate unnecessary runtime init code. 2016-03-02 18:28:53 +00:00
CGCUDARuntime.cpp Roll-back r250822. 2015-10-20 13:23:58 +00:00
CGCUDARuntime.h [CUDA] Emit host-side 'shadows' for device-side global variables 2016-03-02 18:28:50 +00:00
CGCXX.cpp Use CodeGenModule::addReplacement() instead of directly accessing Replacements[]. 2016-02-07 12:44:35 +00:00
CGCXXABI.cpp Add the `pass_object_size` attribute to clang. 2015-12-02 21:58:08 +00:00
CGCXXABI.h Fix use-after-free when a C++ thread_local variable gets replaced (because its 2015-12-01 01:10:48 +00:00
CGCall.cpp [CUDA] Mark all CUDA device-side function defs, decls, and calls as convergent. 2016-02-24 21:55:11 +00:00
CGCall.h Don't emit exceptional stackrestore cleanups around inalloca functions 2015-10-08 00:17:45 +00:00
CGClass.cpp Add whole-program vtable optimization feature to Clang. 2016-02-24 20:46:36 +00:00
CGCleanup.cpp Update for LLVM function name change. 2016-01-14 21:00:27 +00:00
CGCleanup.h Update for LLVM function name change. 2016-01-14 21:00:27 +00:00
CGDebugInfo.cpp Reapply r261657. 2016-02-23 19:30:08 +00:00
CGDebugInfo.h Revert r261634 "Supporting all entities declared in lexical scope in LLVM debug info." and r261657 2016-02-23 19:10:16 +00:00
CGDecl.cpp Serialize `#pragma detect_mismatch`. 2016-03-02 19:28:54 +00:00
CGDeclCXX.cpp [CUDA] Do not allow dynamic initialization of global device side variables. 2016-02-02 22:29:48 +00:00
CGException.cpp Reword a misleading comment discussing landingpads and SEH 2016-03-01 19:51:48 +00:00
CGExpr.cpp [OPENMP 4.0] Fixed support of array sections/array subscripts. 2016-02-04 11:27:03 +00:00
CGExprAgg.cpp Default vaarg lowering should support indirect struct types. 2016-02-24 02:59:33 +00:00
CGExprCXX.cpp Add whole-program vtable optimization feature to Clang. 2016-02-24 20:46:36 +00:00
CGExprComplex.cpp [Bugfix] Fix ICE on constexpr vector splat. 2016-01-13 01:52:39 +00:00
CGExprConstant.cpp Update for LLVM function name change. 2016-01-14 21:00:27 +00:00
CGExprScalar.cpp Default vaarg lowering should support indirect struct types. 2016-02-24 02:59:33 +00:00
CGLoopInfo.cpp [OpenCL] Generate metadata for opencl_unroll_hint attribute 2016-02-19 18:30:11 +00:00
CGLoopInfo.h Add new llvm.loop.unroll.enable metadata for use with "#pragma unroll". 2015-08-10 17:29:39 +00:00
CGObjC.cpp Emit calls to objc_unsafeClaimAutoreleasedReturnValue when 2016-01-27 18:32:30 +00:00
CGObjCGNU.cpp Reduce the number of implicit StringRef->std::string conversions by threading StringRef through more APIs. 2016-02-13 13:42:54 +00:00
CGObjCMac.cpp Objective-C: Add a size field to non-fragile category metadata. 2016-02-24 17:49:50 +00:00
CGObjCRuntime.cpp Update for LLVM function name change. 2016-01-14 21:00:27 +00:00
CGObjCRuntime.h Reduce the number of implicit StringRef->std::string conversions by threading StringRef through more APIs. 2016-02-13 13:42:54 +00:00
CGOpenCLRuntime.cpp [OpenCL] Pipe type support 2016-01-09 12:53:17 +00:00
CGOpenCLRuntime.h [OpenCL] Pipe type support 2016-01-09 12:53:17 +00:00
CGOpenMPRuntime.cpp [OPENMP] Improved layout of CGOpenMPRuntime class, NFC. 2016-02-19 10:38:26 +00:00
CGOpenMPRuntime.h [OPENMP] Improved layout of CGOpenMPRuntime class, NFC. 2016-02-19 10:38:26 +00:00
CGOpenMPRuntimeNVPTX.cpp Re-apply for the 2nd-time r259977 - [OpenMP] Reorganize code to allow specialized code generation for different devices. 2016-02-08 15:59:20 +00:00
CGOpenMPRuntimeNVPTX.h Re-apply for the 2nd-time r259977 - [OpenMP] Reorganize code to allow specialized code generation for different devices. 2016-02-08 15:59:20 +00:00
CGRecordLayout.h Make CodeGen headers self-contained. 2016-02-02 16:05:18 +00:00
CGRecordLayoutBuilder.cpp Annotate dump() methods with LLVM_DUMP_METHOD, addressing Richard Smith r259192 post commit comment. 2016-01-29 19:38:18 +00:00
CGStmt.cpp [PGO] cc1 option name change for profile instrumentation 2016-02-04 18:39:09 +00:00
CGStmtOpenMP.cpp [OPENMP 4.5] Codegen for data members in 'reduction' clause. 2016-03-02 04:57:40 +00:00
CGVTT.cpp Remove and forbid raw_svector_ostream::flush() calls. 2015-08-13 18:12:56 +00:00
CGVTables.cpp Add whole-program vtable optimization feature to Clang. 2016-02-24 20:46:36 +00:00
CGVTables.h [CodeGen] Remove dead code. NFC. 2015-10-15 15:29:40 +00:00
CGValue.h [Sema] PR26444 fix crash when alignment value is >= 2**16 2016-03-02 06:48:47 +00:00
CMakeLists.txt Re-apply for the 2nd-time r259977 - [OpenMP] Reorganize code to allow specialized code generation for different devices. 2016-02-08 15:59:20 +00:00
CodeGenABITypes.cpp Add the `pass_object_size` attribute to clang. 2015-12-02 21:58:08 +00:00
CodeGenAction.cpp Serialize `#pragma detect_mismatch`. 2016-03-02 19:28:54 +00:00
CodeGenFunction.cpp [MSVC Compat] Correctly handle finallys nested within finallys 2016-03-01 19:42:53 +00:00
CodeGenFunction.h [MSVC Compat] Correctly handle finallys nested within finallys 2016-03-01 19:42:53 +00:00
CodeGenModule.cpp [PGO] Change profile use cc1 option to handle IR level profiles 2016-03-02 20:59:36 +00:00
CodeGenModule.h Add whole-program vtable optimization feature to Clang. 2016-02-24 20:46:36 +00:00
CodeGenPGO.cpp [PGO] code simplification: use existing VP annotation API /NFC 2016-02-04 19:54:17 +00:00
CodeGenPGO.h [PGO] Windows buildbot failure fix. [NFC] 2016-01-24 00:56:19 +00:00
CodeGenTBAA.cpp [PR26550] Use a different TBAA root for C++ vs C. 2016-02-11 19:19:18 +00:00
CodeGenTBAA.h Make the remaining headers self-contained. 2016-02-02 14:24:21 +00:00
CodeGenTypeCache.h Compute and preserve alignment more faithfully in IR-generation. 2015-09-08 08:05:57 +00:00
CodeGenTypes.cpp [MS ABI] Allow a member pointers' converted type to change 2016-01-26 19:30:26 +00:00
CodeGenTypes.h [MS ABI] Allow a member pointers' converted type to change 2016-01-26 19:30:26 +00:00
CoverageMappingGen.cpp [Coverage] Fix crash when handling certain macro expansions 2016-02-08 19:25:45 +00:00
CoverageMappingGen.h [Coverage] Reduce complexity of adding function mapping records 2016-01-21 19:25:35 +00:00
EHScopeStack.h Update clang to use the updated LLVM EH instructions 2015-12-12 05:39:21 +00:00
ItaniumCXXABI.cpp Add whole-program vtable optimization feature to Clang. 2016-02-24 20:46:36 +00:00
MicrosoftCXXABI.cpp Add whole-program vtable optimization feature to Clang. 2016-02-24 20:46:36 +00:00
ModuleBuilder.cpp Serialize `#pragma detect_mismatch`. 2016-03-02 19:28:54 +00:00
ObjectFilePCHContainerOperations.cpp Move DebugInfoKind into its own header to cut the cyclic dependency edge from Driver to Frontend. 2016-02-02 11:06:51 +00:00
README.txt
SanitizerMetadata.cpp [ASan] Initial support for Kernel AddressSanitizer 2015-06-19 12:19:07 +00:00
SanitizerMetadata.h Removing LLVM_DELETED_FUNCTION, as MSVC 2012 was the last reason for requiring the macro. NFC; Clang edition. 2015-02-15 22:54:08 +00:00
TargetInfo.cpp Default vaarg lowering should support indirect struct types. 2016-02-24 02:59:33 +00:00
TargetInfo.h Fix Clang-tidy modernize-use-nullptr warnings in headers and generated files; other minor cleanups. 2015-09-29 20:56:43 +00:00

README.txt

IRgen optimization opportunities.

//===---------------------------------------------------------------------===//

The common pattern of
--
short x; // or char, etc
(x == 10)
--
generates a zext/sext of x which can easily be avoided.
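
A rough illustration of why the extension is avoidable, written as plain C
rather than as clang's actual output. The function names are made up for this
sketch. C itself always promotes the operands to int, so eq10_narrow only
models the narrow compare that an i16 icmp against the constant 10 would
perform; the point is that both forms agree for every 16-bit value.

```c
#include <stdint.h>

/* Hypothetical helpers, named only for this sketch. */

/* What the source means: x is widened (sign-extended) to int, then compared. */
int eq10_promoted(int16_t x) {
    int wide = (int)x;              /* models the sext emitted for x */
    return wide == 10;
}

/* What could be emitted instead: compare the 16-bit pattern directly
   against the constant truncated to 16 bits. */
int eq10_narrow(int16_t x) {
    uint16_t bits = (uint16_t)x;    /* same 16 bits, no widening of x needed */
    return bits == (uint16_t)10;
}
```

The narrow form is valid because the constant fits in the narrow type; a
constant outside the range of short would need separate handling.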

//===---------------------------------------------------------------------===//

Bitfield accesses can be shifted to simplify masking and sign
extension. For example, if the bitfield width is 8 and it is
appropriately aligned, then it is a lot shorter to just load the char
directly.
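
A minimal sketch of the two access strategies for an 8-bit, byte-aligned
signed field. Real bitfield layout is implementation-defined, so this models
the storage unit as four bytes in little-endian order with the field at bits
8..15; that layout and all names here are assumptions made for the example.

```c
#include <stdint.h>

/* Sign-extend the low 8 bits of v to int without implementation-defined casts. */
static int sign_extend8(unsigned v) {
    v &= 0xFFu;
    return (int)(v ^ 0x80u) - 0x80;
}

/* Wide access: load the whole 32-bit storage unit, then shift and mask. */
int field_via_word(const uint8_t bytes[4]) {
    uint32_t word = (uint32_t)bytes[0]
                  | ((uint32_t)bytes[1] << 8)
                  | ((uint32_t)bytes[2] << 16)
                  | ((uint32_t)bytes[3] << 24);
    return sign_extend8((word >> 8) & 0xFFu);   /* field is bits 8..15 */
}

/* Narrow access: the field is 8 bits wide and byte-aligned, so a single
   byte load followed by the sign extension is enough. */
int field_via_byte(const uint8_t bytes[4]) {
    return sign_extend8(bytes[1]);
}
```

Both functions return the same value for any contents of the storage unit;
the second one simply skips the wide load, the shift, and the mask.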

//===---------------------------------------------------------------------===//

It may be worth avoiding creation of allocas for formal arguments
for the common situation where the argument is never written to and
never has its address taken. The idea would be to begin generating code
by using the argument directly and, if its address is taken or it is
stored to, then generate the alloca and patch up the existing code.

In theory, the same optimization could be a win for block local
variables as long as the declaration dominates all statements in the
block.

NOTE: The main case we care about here is -O0 -g compile-time
performance, and in that scenario we currently need to emit the alloca
anyway to emit proper debug info. So this is blocked on
being able to emit debug information which refers to an LLVM
temporary, not an alloca.
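
To make the two situations concrete, here is a small C sketch. The function
names are hypothetical, and the comments describe the idea from the text
above, not an implemented clang behavior.

```c
/* Case 1: a and b are only read. Neither is stored to and neither has its
   address taken, so IRgen could use the incoming argument values directly
   instead of giving each one an alloca. */
int sum_readonly(int a, int b) {
    return a + b;
}

/* Helper that needs a pointer, forcing its argument into memory. */
static void bump(int *p) {
    *p += 1;
}

/* Case 2: v is stored to and has its address taken, so it needs a memory
   slot. Under the scheme described above, the alloca would be created
   when the first such use is seen, and earlier code patched up. */
int clamp_then_bump(int v) {
    if (v < 0)
        v = 0;      /* store to the parameter */
    bump(&v);       /* address of the parameter is taken */
    return v;
}
```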

//===---------------------------------------------------------------------===//

We should try to avoid generating basic blocks which contain only
jumps. At -O0, this penalizes us all the way from IRgen (malloc &
instruction overhead) down through code generation and
assembly time.

On 176.gcc:expr.ll, it looks like over 12% of basic blocks are just
direct branches!
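
A toy model of how such blocks can be counted and skipped. The Block struct
and the function names are invented for this sketch and are not clang's or
LLVM's data structures; it only shows the shape of the check.

```c
#include <stddef.h>

/* Toy CFG node, for illustration only. */
typedef struct {
    int n_other_insts;  /* instructions other than the terminator */
    int is_uncond_br;   /* 1 if the terminator is an unconditional branch */
    int target;         /* index of the branch target when is_uncond_br is 1 */
} Block;

/* A block is "jump-only" when it holds nothing but a direct branch. */
static int is_jump_only(const Block *b) {
    return b->n_other_insts == 0 && b->is_uncond_br;
}

/* Count the jump-only blocks in a function of n blocks. */
size_t count_jump_only(const Block *blocks, size_t n) {
    size_t count = 0;
    for (size_t i = 0; i < n; ++i)
        count += (size_t)is_jump_only(&blocks[i]);
    return count;
}

/* Follow a chain of jump-only blocks to the first block that does real
   work. The hop limit guards against a cycle of empty blocks. */
int resolve_target(const Block *blocks, size_t n, int start) {
    int cur = start;
    for (size_t hops = 0; hops < n && is_jump_only(&blocks[cur]); ++hops)
        cur = blocks[cur].target;
    return cur;
}
```

Branching straight to the result of resolve_target, instead of to the empty
block, is one way the extra blocks could be avoided at emission time.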

//===---------------------------------------------------------------------===//