forked from OSchip/llvm-project
61c9b7cb9f
As background, when constructing a complete object, virtual bases are constructed first. If an exception is thrown later in the ctor, those virtual bases are destroyed, so sema marks the relevant constructors and destructors of virtual bases as referenced. If necessary, they are emitted. However, an abstract class can never be used to construct a complete object. In the Itanium C++ ABI, this works out nicely, because we never end up emitting the "complete" constructor variant, only the "base" constructor variant, which can be called by constructors of derived classes. Clang's Sema::MarkBaseAndMemberDestructorsReferenced is aware of this optimization, and it does not mark ctors and dtors of virtual bases referenced when the constructor of an abstract class is emitted. In the Microsoft ABI, there are no complete/base variants, so before this change, the constructor of an abstract class could reference ctors and dtors of a virtual base without marking them referenced. This could lead to unresolved symbol errors at link time, as reported in PR41065. The fix is to implement the same optimization as Sema: If the class is abstract, don't bother initializing its virtual bases. The "is this class the most derived class" check in the constructor will never pass, and the virtual base constructor calls are always dead. Skip them. I think Richard noticed this missed optimization back in 2016 when he was implementing inheriting constructors. I wasn't able to find any bugs or email about it, though. Fixes PR41065 llvm-svn: 356425 |
||
---|---|---|
ABIInfo.h | ||
Address.h | ||
BackendUtil.cpp | ||
CGAtomic.cpp | ||
CGBlocks.cpp | ||
CGBlocks.h | ||
CGBuilder.h | ||
CGBuiltin.cpp | ||
CGCUDANV.cpp | ||
CGCUDARuntime.cpp | ||
CGCUDARuntime.h | ||
CGCXX.cpp | ||
CGCXXABI.cpp | ||
CGCXXABI.h | ||
CGCall.cpp | ||
CGCall.h | ||
CGClass.cpp | ||
CGCleanup.cpp | ||
CGCleanup.h | ||
CGCoroutine.cpp | ||
CGDebugInfo.cpp | ||
CGDebugInfo.h | ||
CGDecl.cpp | ||
CGDeclCXX.cpp | ||
CGException.cpp | ||
CGExpr.cpp | ||
CGExprAgg.cpp | ||
CGExprCXX.cpp | ||
CGExprComplex.cpp | ||
CGExprConstant.cpp | ||
CGExprScalar.cpp | ||
CGGPUBuiltin.cpp | ||
CGLoopInfo.cpp | ||
CGLoopInfo.h | ||
CGNonTrivialStruct.cpp | ||
CGObjC.cpp | ||
CGObjCGNU.cpp | ||
CGObjCMac.cpp | ||
CGObjCRuntime.cpp | ||
CGObjCRuntime.h | ||
CGOpenCLRuntime.cpp | ||
CGOpenCLRuntime.h | ||
CGOpenMPRuntime.cpp | ||
CGOpenMPRuntime.h | ||
CGOpenMPRuntimeNVPTX.cpp | ||
CGOpenMPRuntimeNVPTX.h | ||
CGRecordLayout.h | ||
CGRecordLayoutBuilder.cpp | ||
CGStmt.cpp | ||
CGStmtOpenMP.cpp | ||
CGVTT.cpp | ||
CGVTables.cpp | ||
CGVTables.h | ||
CGValue.h | ||
CMakeLists.txt | ||
CodeGenABITypes.cpp | ||
CodeGenAction.cpp | ||
CodeGenFunction.cpp | ||
CodeGenFunction.h | ||
CodeGenModule.cpp | ||
CodeGenModule.h | ||
CodeGenPGO.cpp | ||
CodeGenPGO.h | ||
CodeGenTBAA.cpp | ||
CodeGenTBAA.h | ||
CodeGenTypeCache.h | ||
CodeGenTypes.cpp | ||
CodeGenTypes.h | ||
ConstantEmitter.h | ||
ConstantInitBuilder.cpp | ||
CoverageMappingGen.cpp | ||
CoverageMappingGen.h | ||
EHScopeStack.h | ||
ItaniumCXXABI.cpp | ||
MacroPPCallbacks.cpp | ||
MacroPPCallbacks.h | ||
MicrosoftCXXABI.cpp | ||
ModuleBuilder.cpp | ||
ObjectFilePCHContainerOperations.cpp | ||
README.txt | ||
SanitizerMetadata.cpp | ||
SanitizerMetadata.h | ||
SwiftCallingConv.cpp | ||
TargetInfo.cpp | ||
TargetInfo.h | ||
VarBypassDetector.cpp | ||
VarBypassDetector.h |
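The commit message at the top of this listing relies on a language rule: the mem-initializer an abstract class writes for its virtual base never runs, because an abstract class is never the most derived class. The following is a minimal sketch of that rule in plain C++, not the Clang change itself. The names `VBase`, `Abstract`, `Concrete`, and `construct_concrete` are hypothetical and chosen only for illustration.

```cpp
#include <string>
#include <vector>

// Records which mem-initializer actually constructed the virtual base.
static std::vector<std::string> g_log;

struct VBase {
  explicit VBase(const char *who) { g_log.push_back(who); }
  virtual ~VBase() = default;
};

// Abstract: it can never be the most derived class, so its
// mem-initializer for VBase is never executed.
struct Abstract : virtual VBase {
  Abstract() : VBase("Abstract") {}
  virtual void f() = 0;
};

// The most derived class is the one that initializes the virtual base.
struct Concrete : Abstract {
  Concrete() : VBase("Concrete") {}
  void f() override {}
};

// Constructs one Concrete object and returns the construction log.
std::vector<std::string> construct_concrete() {
  g_log.clear();
  Concrete c;
  return g_log;
}
```

Calling `construct_concrete()` should yield a log with a single entry written by `Concrete`. That is why a virtual base constructor call emitted inside the constructor of `Abstract` would be dead code.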
README.txt
IRgen optimization opportunities.

//===---------------------------------------------------------------------===//

The common pattern of
--
short x; // or char, etc
(x == 10)
--
generates a zext/sext of x which can easily be avoided.

//===---------------------------------------------------------------------===//

Bitfield accesses can be shifted to simplify masking and sign
extension. For example, if the bitfield width is 8 and it is
appropriately aligned then it is a lot shorter to just load the char
directly.

//===---------------------------------------------------------------------===//

It may be worth avoiding creation of allocas for formal arguments
for the common situation where the argument is never written to or has
its address taken. The idea would be to begin generating code by using
the argument directly and if its address is taken or it is stored to
then generate the alloca and patch up the existing code.

In theory, the same optimization could be a win for block local
variables as long as the declaration dominates all statements in the
block.

NOTE: The main case we care about this for is for -O0 -g compile time
performance, and in that scenario we will need to emit the alloca
anyway currently to emit proper debug info. So this is blocked on
being able to emit debug information which refers to an LLVM
temporary, not an alloca.

//===---------------------------------------------------------------------===//

We should try to avoid generating basic blocks which only contain
jumps. At -O0, this penalizes us all the way from IRgen (malloc &
instruction overhead), all the way down through code generation and
assembly time.

On 176.gcc:expr.ll, it looks like over 12% of basic blocks are just
direct branches!

//===---------------------------------------------------------------------===//
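To illustrate the first item above (the `short x; (x == 10)` pattern), here is a small sketch of why the extension can be skipped: the constant 10 fits in 16 bits, so comparing the narrow bit pattern gives the same answer as comparing after promotion to int. The function names are hypothetical, and this models the equivalence at the source level only. It does not show what IRgen actually emits.

```cpp
#include <cstdint>

// What the language semantics describe: x is sign-extended to int first.
bool equals_ten_promoted(std::int16_t x) {
  return static_cast<std::int32_t>(x) == 10;
}

// Narrow form: compare the 16-bit pattern against the constant truncated
// to 16 bits. Valid here only because 10 is representable in 16 bits.
bool equals_ten_narrow(std::int16_t x) {
  return static_cast<std::uint16_t>(x) == static_cast<std::uint16_t>(10);
}

// Exhaustively checks that both forms agree for every 16-bit value.
bool forms_agree() {
  for (std::int32_t v = INT16_MIN; v <= INT16_MAX; ++v) {
    auto x = static_cast<std::int16_t>(v);
    if (equals_ten_promoted(x) != equals_ten_narrow(x))
      return false;
  }
  return true;
}
```

The exhaustive loop is cheap at 16 bits, and it makes the assumption explicit: the shortcut holds for equality against a constant that fits in the narrow type, not for arbitrary constants.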