forked from OSchip/llvm-project
bed4d9c897
Summary:
A recent change to enable more importing of global variables with references exposed some efficiency issues with export computation. See D73724 for more information and detailed analysis.

The first was specific to variable importing. The code was marking every copy of a referenced value (from possibly thousands of files in the case of linkonce_odr) as exported, and we only need to mark the copy in the module containing the variable def being imported as exported. The reason is that this is tracking what values are newly exported as a result of importing. Anything that was defined in another module and simply used in the exporting module is already exported, and would have been identified by the caller (e.g. the LTO API implementations).

The second issue is that the code was re-adding previously exported values (along with all references). It is easy to identify when a variable was already imported into the same module (via the import list insert call return value), and we already did this for function importing. However, what we weren't doing for either function or variable importing was avoiding a re-insertion when it was previously exported into a different importing module. The reason we couldn't do this is that there was no way of telling from the export list whether a value was previously inserted there because its definition was exported (in which case we already marked all its references as exported) or because it was referenced by another exported value (in which case we haven't yet inserted its own references).

To address this we can restructure the way the export list is constructed. This patch only adds the actual imported definitions (variable or function) to the export list for its module during the import computation. After import computation is complete, where we were already post-processing the export list, we go ahead and add all references made by those exported values to the export list.

These changes speed up the thin link not only with constant variable importing enabled, but also without (due to the efficiency improvement in function importing).

Some thin link user time measurements for one large application, average of 5 runs:

With constant variable importing enabled:
- without this patch: 479.5s
- with this patch: 74.6s

Without constant variable importing enabled:
- without this patch: 80.6s
- with this patch: 70.3s

Note I have not re-enabled constant variable importing here, as I would like to do additional compile time measurements with these fixes first.

Reviewers: evgeny777

Subscribers: mehdi_amini, inglorion, hiraditya, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73851

||
---|---|---|
.. | ||
AlwaysInliner.cpp | ||
ArgumentPromotion.cpp | ||
Attributor.cpp | ||
BarrierNoopPass.cpp | ||
BlockExtractor.cpp | ||
CMakeLists.txt | ||
CalledValuePropagation.cpp | ||
ConstantMerge.cpp | ||
CrossDSOCFI.cpp | ||
DeadArgumentElimination.cpp | ||
ElimAvailExtern.cpp | ||
ExtractGV.cpp | ||
ForceFunctionAttrs.cpp | ||
FunctionAttrs.cpp | ||
FunctionImport.cpp | ||
GlobalDCE.cpp | ||
GlobalOpt.cpp | ||
GlobalSplit.cpp | ||
HotColdSplitting.cpp | ||
IPConstantPropagation.cpp | ||
IPO.cpp | ||
InferFunctionAttrs.cpp | ||
InlineSimple.cpp | ||
Inliner.cpp | ||
Internalize.cpp | ||
LLVMBuild.txt | ||
LoopExtractor.cpp | ||
LowerTypeTests.cpp | ||
MergeFunctions.cpp | ||
PartialInlining.cpp | ||
PassManagerBuilder.cpp | ||
PruneEH.cpp | ||
SCCP.cpp | ||
SampleProfile.cpp | ||
StripDeadPrototypes.cpp | ||
StripSymbols.cpp | ||
SyntheticCountsPropagation.cpp | ||
ThinLTOBitcodeWriter.cpp | ||
WholeProgramDevirt.cpp |
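
The two-phase export list construction described in the commit message above can be illustrated with a small standalone sketch. This is not the code in FunctionImport.cpp: `ToySummary`, `ToyIndex`, `ExportLists`, `recordImportedDef`, `addReferencesOfExports` and `exampleExports` are hypothetical names invented for illustration, and values are identified by plain strings rather than by LLVM's summary index types. It assumes each value has a single defining module, which sidesteps the linkonce_odr multiple-copy case.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a per-value summary; not an LLVM type.
struct ToySummary {
  std::string DefiningModule;     // module that holds this definition
  std::vector<std::string> Refs;  // values this definition references or calls
};

using ToyIndex = std::map<std::string, ToySummary>;  // value name -> summary
using ExportLists =
    std::map<std::string, std::set<std::string>>;    // module -> exported values

// Phase 1 (during import computation): record only the imported definition
// itself. Returns true if it was newly added to its module's export list, so
// a caller can tell that the value was already exported for another importer.
bool recordImportedDef(const ToyIndex &Index, const std::string &Name,
                       ExportLists &Exports) {
  auto It = Index.find(Name);
  assert(It != Index.end() && "imported value must have a summary");
  return Exports[It->second.DefiningModule].insert(Name).second;
}

// Phase 2 (after import computation): every entry in an export list is known
// to be an exported definition, so its references are added exactly once.
void addReferencesOfExports(const ToyIndex &Index, ExportLists &Exports) {
  for (auto &Entry : Exports) {
    std::set<std::string> WithRefs = Entry.second;
    for (const std::string &Def : Entry.second) {
      const ToySummary &S = Index.at(Def);
      WithRefs.insert(S.Refs.begin(), S.Refs.end());
    }
    Entry.second = std::move(WithRefs);
  }
}

// Toy scenario: "f" (defined in a.o) is imported by two other modules.
ExportLists exampleExports() {
  ToyIndex Index = {
      {"f", {"a.o", {"g", "v"}}},
      {"g", {"a.o", {}}},
      {"v", {"a.o", {"w"}}},
      {"w", {"a.o", {}}},
  };
  ExportLists Exports;
  recordImportedDef(Index, "f", Exports);  // first importer: newly exported
  recordImportedDef(Index, "f", Exports);  // second importer: already present
  addReferencesOfExports(Index, Exports);
  return Exports;
}
```

In this toy scenario the export list for `a.o` ends up containing `f`, `g` and `v`, but not `w`: `v` is exported only because `f` references it, so `v`'s own references are not added. Keeping phase 1 restricted to imported definitions is what removes the ambiguity the commit message describes, since an entry present during import computation can only mean "definition exported".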