forked from OSchip/llvm-project
fa80642205
This patch adds a new common code feature that allows platform code to request minimum alignment of global symbols. The background for this is that on SystemZ, the most efficient way to load addresses of global symbol is the LOAD ADDRESS RELATIVE LONG (LARL) instruction. This instruction provides PC-relative addressing, but only to *even* addresses. For this reason, existing compilers will guarantee that global symbols are always aligned to at least 2. [ Since symbols would otherwise already use a default alignment based on their type, this will usually only affect global objects of character type or character arrays. ] GCC also allows creating symbols without that extra alignment by using explicit "aligned" attributes (which then need to be used on both definition and each use of the symbol). To enable support for this with Clang, this patch adds a TargetInfo::MinGlobalAlign variable that provides a global minimum for the alignment of every global object (unless overridden via explicit alignment attribute), and adds code to respect this setting. Within this patch, no platform actually sets the value to anything but the default 1, resulting in no change in behaviour on any existing target. This version of the patch incorporates feedback from reviews by Eric Christopher and John McCall. Thanks to all reviewers! Patch by Richard Sandiford. llvm-svn: 181210 |
||
---|---|---|
.. | ||
ABIInfo.h | ||
BackendUtil.cpp | ||
CGAtomic.cpp | ||
CGBlocks.cpp | ||
CGBlocks.h | ||
CGBuilder.h | ||
CGBuiltin.cpp | ||
CGCUDANV.cpp | ||
CGCUDARuntime.cpp | ||
CGCUDARuntime.h | ||
CGCXX.cpp | ||
CGCXXABI.cpp | ||
CGCXXABI.h | ||
CGCall.cpp | ||
CGCall.h | ||
CGClass.cpp | ||
CGCleanup.cpp | ||
CGCleanup.h | ||
CGDebugInfo.cpp | ||
CGDebugInfo.h | ||
CGDecl.cpp | ||
CGDeclCXX.cpp | ||
CGException.cpp | ||
CGExpr.cpp | ||
CGExprAgg.cpp | ||
CGExprCXX.cpp | ||
CGExprComplex.cpp | ||
CGExprConstant.cpp | ||
CGExprScalar.cpp | ||
CGObjC.cpp | ||
CGObjCGNU.cpp | ||
CGObjCMac.cpp | ||
CGObjCRuntime.cpp | ||
CGObjCRuntime.h | ||
CGOpenCLRuntime.cpp | ||
CGOpenCLRuntime.h | ||
CGRTTI.cpp | ||
CGRecordLayout.h | ||
CGRecordLayoutBuilder.cpp | ||
CGStmt.cpp | ||
CGVTT.cpp | ||
CGVTables.cpp | ||
CGVTables.h | ||
CGValue.h | ||
CMakeLists.txt | ||
CodeGenAction.cpp | ||
CodeGenFunction.cpp | ||
CodeGenFunction.h | ||
CodeGenModule.cpp | ||
CodeGenModule.h | ||
CodeGenTBAA.cpp | ||
CodeGenTBAA.h | ||
CodeGenTypes.cpp | ||
CodeGenTypes.h | ||
ItaniumCXXABI.cpp | ||
Makefile | ||
MicrosoftCXXABI.cpp | ||
ModuleBuilder.cpp | ||
README.txt | ||
TargetInfo.cpp | ||
TargetInfo.h |
README.txt
IRgen optimization opportunities. //===---------------------------------------------------------------------===// The common pattern of -- short x; // or char, etc (x == 10) -- generates an zext/sext of x which can easily be avoided. //===---------------------------------------------------------------------===// Bitfields accesses can be shifted to simplify masking and sign extension. For example, if the bitfield width is 8 and it is appropriately aligned then is is a lot shorter to just load the char directly. //===---------------------------------------------------------------------===// It may be worth avoiding creation of alloca's for formal arguments for the common situation where the argument is never written to or has its address taken. The idea would be to begin generating code by using the argument directly and if its address is taken or it is stored to then generate the alloca and patch up the existing code. In theory, the same optimization could be a win for block local variables as long as the declaration dominates all statements in the block. NOTE: The main case we care about this for is for -O0 -g compile time performance, and in that scenario we will need to emit the alloca anyway currently to emit proper debug info. So this is blocked by being able to emit debug information which refers to an LLVM temporary, not an alloca. //===---------------------------------------------------------------------===// We should try and avoid generating basic blocks which only contain jumps. At -O0, this penalizes us all the way from IRgen (malloc & instruction overhead), all the way down through code generation and assembly time. On 176.gcc:expr.ll, it looks like over 12% of basic blocks are just direct branches! //===---------------------------------------------------------------------===//