llvm-project

Commit Graph

Author	SHA1	Message	Date
Gulfem Savrun Yeniceri	e3a6d70c68	Revert "[Passes] Add relative lookup table converter pass" This reverts commit `78a65cd945` which caused buildbot failures.	2021-03-23 00:43:16 +00:00
Gulfem Savrun Yeniceri	78a65cd945	[Passes] Add relative lookup table converter pass Lookup tables generate non PIC-friendly code, which requires dynamic relocation as described in: https://bugs.llvm.org/show_bug.cgi?id=45244 This patch adds a new pass that converts lookup tables to relative lookup tables to make them PIC-friendly. Differential Revision: https://reviews.llvm.org/D94355	2021-03-22 22:09:02 +00:00
Sriraman Tallam	0ba1ebcbb7	Remove original implementation of UniqueInternalLinkageNames pass. D96109 was recently submitted which contains the refactored implementation of -funique-internal-linkage-names by adding the unique suffixes in clang rather than as an LLVM pass. Deleting the former implementation in this change. Differential Revision: https://reviews.llvm.org/D98234	2021-03-10 11:57:40 -08:00
Ta-Wei Tu	8a003861a3	[NPM] Add -enable-loopinterchange option to NPM We have the `enable-loopinterchange` option in legacy pass manager but not in NPM. Add `LoopInterchange` pass to the optimization pipeline (at the same position as before) when `enable-loopinterchange` is turned on. Reviewed By: aeubanks, fhahn Differential Revision: https://reviews.llvm.org/D98116	2021-03-07 02:39:28 +08:00
Arthur Eubanks	a9b33ffb8f	[ThinLTO][NewPM] Clean up dead code under -O0 We're running into undefined references using ThinLTO with -O0 on Windows/Chrome. This fixes that. This matches the legacy PM. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D97414	2021-02-24 17:08:57 -08:00
Vitaly Buka	8560c2d426	[ThinLTO, NewPM] Run OptimizerLastEPCallbacks from buildThinLTOPreLinkDefaultPipeline -O1 and above don't call the real optimizer pipeline in ThinLTO PreLink. Also clang can't add PostLink OptimizerLastEPCallbacks for in-process ThinLTO. This results in missing sanitizer passes with ThinLTO. A simple working solution is to just call OptimizerLastEPCallbacks at the end of buildThinLTOPreLinkDefaultPipeline. Differential Revision: https://reviews.llvm.org/D96320	2021-02-23 22:14:41 -08:00
Wenlei He	a952d7291e	[SampleFDO] Skip PreLink ICP for better profile quality of MonoLTO PostLink For ThinLTO, PreLink ICP is skipped to favor better profile annotation during LTO PostLink. This change applies the same tweak for MonoLTO. Note that PreLink ICP not only makes PostLink profile annotation harder, it is also uncoordinated with PostLink ICP so duplicated ICP could happen. Differential Revision: https://reviews.llvm.org/D97028	2021-02-19 19:35:23 -08:00
Nikita Popov	71a8e4e7d6	[MemCopyOpt] Enable MemorySSA by default This enables use of MemorySSA instead of MemDep in MemCpyOpt. To allow this without significant compile-time impact, the MemCpyOpt pass is moved directly before DSE (in the cases where this was not already the case), which allows us to reuse the existing MemorySSA analysis. Unlike the MemDep-based implementation, the MemorySSA-based MemCpyOpt can also perform simple optimizations across basic blocks. Differential Revision: https://reviews.llvm.org/D94376	2021-02-19 18:06:25 +01:00
David Green	c141c6551b	[NPM][LTO] Do not enable MemorySSA with LoopFullUnrollPass As with the standard opt pipeline, we disable the MemorySSA dependency in the LTO LPM pipeline as not all passes preserve MemorySSA.	2021-02-19 08:35:11 +00:00
David Green	908ac47ef4	[NPM][LTO] Update buildLTODefaultPipeline to be more in-line with the old pass manager The NPM LTO pipeline has a lot of fixme's and missing passes, causing a lot of regressions after the switch in `c70737b`. Notably unrolling and vectorization were both disabled, but many other passes are missing compared to the old pass manager. This attempts to enable the most obvious missing passes like the unroller, vectorization and other loop passes, fixing the existing FIXME comments. Differential Revision: https://reviews.llvm.org/D96780	2021-02-17 16:56:28 +00:00
Sanne Wouda	93d9a4c95a	Use LoopRotate PrepareForLTO stage in NPM The PrepareForLTO stage of LoopRotate tries to avoid unrolling loops with calls that might be inlined later. See D94232 where this was introduced. We didn't catch all occurrences of the LoopRotatePass in the New Pass Manager, so the original regression in astar returned with the pass manager switch.	2021-02-17 14:06:57 +00:00
Sameer Sahasrabuddhe	11bf7da64a	[NewPM] Introduce (GPU)DivergenceAnalysis in the new pass manager The GPUDivergenceAnalysis is now renamed to just "DivergenceAnalysis" since there is no conflict with LegacyDivergenceAnalysis. In the legacy PM, this analysis can only be used through the legacy DA serving as a wrapper. It is now made available as a pass in the new PM, and has no relation with the legacy DA. The new DA currently cannot handle irreducible control flow; its presence can cause the analysis to run indefinitely. The analysis is now modified to detect this and report all instructions in the function as divergent. This is super conservative, but allows the analysis to be used without hanging the compiler. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D96615	2021-02-16 10:26:45 +05:30
Kazu Hirata	910e2d1e57	[llvm] Use llvm::is_contained (NFC)	2021-02-14 08:36:20 -08:00
Arthur Eubanks	5d960cba34	[opt][NewPM] Add a --print-passes flag to print all available passes It seems nicer to list passes given a flag rather than displaying all passes in opt --help. This is awkwardly structured because a PassBuilder is required, but reusing the PassBuilder in runPassPipeline() doesn't work because we read the input IR before getting to runPassPipeline(). So printing the list of passes needs to happen before reading the input IR. If we remove the legacy PM code in main() and move everything from NewPMDriver.cpp into opt.cpp, we can create the PassBuilder before reading IR and check if we should print the list of passes and exit. But until then this hack seems fine. Compared to the legacy PM, the new PM passes are lacking descriptions. We'll need to figure out a way to add descriptions if we think this is important. Also, this only works for passes specified in PassRegistry.def. If we want to print other custom registered passes, we'll need a different mechanism. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D96101	2021-02-10 11:22:12 -08:00
Jamie Schmeiser	4b661b4059	Introduce -print-changed=[diff \| diff-quiet] which show changes in patch-like format Summary: Introduce base classes that hold a textual representation of the IR based on basic blocks and a base class for comparing this representation. A new change printer is introduced that uses these classes to save and compare representations of the IR before and after each pass. It only reports when changes are made by a pass (similar to -print-changed) except that the changes are shown in a patch-like format with those lines that are removed shown in red prefixed with '-' and those added shown in green with '+'. This functionality was introduced in my tutorial at the 2020 virtual developer's meeting. Author: Jamie Schmeiser <schmeise@ca.ibm.com> Reviewed By: aeubanks (Arthur Eubanks) Differential Revision: https://reviews.llvm.org/D91890	2021-02-08 10:11:22 -05:00
Arthur Eubanks	f020544601	[NewPM][HelloWorld] Move HelloWorld to Utils To prevent creating a new component, which creates a new library. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D95907	2021-02-03 12:59:40 -08:00
Hongtao Yu	3d89b3cbec	[CSSPGO] Introducing distribution factor for pseudo probe. Sample re-annotation is required in LTO time to achieve a reasonable post-inline profile quality. However, we have seen that such LTO-time re-annotation degrades profile quality. This is mainly caused by preLTO code duplication that is done by passes such as loop unrolling, jump threading, indirect call promotion etc, where samples corresponding to a source location are aggregated multiple times due to the duplicates. In this change we are introducing a concept of distribution factor for pseudo probes so that samples can be distributed for duplicated probes scaled by a factor. We hope that optimizations duplicating code well-maintain the branch frequency information (BFI) based on which probe distribution factors are calculated. Distribution factors are updated at the end of preLTO pipeline to reflect an estimated portion of the real execution count. This change also introduces a pseudo probe verifier that can be run after each IR pass to detect duplicated pseudo probes. A saturated distribution factor stands for 1.0. A pseudo probe will carry a factor with the value ranged from 0.0 to 1.0. A 64-bit integral distribution factor field that represents [0.0, 1.0] is associated to each block probe. Unfortunately this cannot be done for callsite probes due to the size limitation of a 32-bit Dwarf discriminator. A 7-bit distribution factor is used instead. Changes are also needed to the sample profile inliner to deal with prorated callsite counts. Call sites duplicated by PreLTO passes, when later on inlined in LTO time, should have the callees' probes prorated based on the Prelink-computed distribution factors. The distribution factors should also be taken into account when computing hotness for inline candidates. Also, indirect call promotion results in multiple callsites. The original samples should be distributed across them. This is fixed by adjusting the callsites' distribution factors. Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D93264	2021-02-02 11:55:01 -08:00
Arthur Eubanks	7739f9ff97	[NewPM][Unswitch] Add option to disable -O3 non-trivial unswitching Some benchmarks regress with non-trivial unswitching, so add an option to opt-out of performing non-trivial unswitching while investigating. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D95796	2021-02-01 11:11:59 -08:00
Bjorn Pettersson	a9bd3d37bd	[NewPM] Add ExtraVectorizerPasses support As it looks like NewPM generally is using SimpleLoopUnswitch instead of LoopUnswitch, this patch also uses SimpleLoopUnswitch in the ExtraVectorizerPasses sequence (compared with LegacyPM which uses the LoopUnswitch pass). Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D95457	2021-01-26 22:59:10 +01:00
Florian Hahn	83daa49758	[LoopRotate] Add PrepareForLTO stage, avoid rotating with inline cands. D84108 exposed a bad interaction between inlining and loop-rotation during regular LTO, which is causing notable regressions in at least CINT2006/473.astar. The problem boils down to: we now rotate a loop just before the vectorizer which requires duplicating a function call in the preheader when compiling the individual files ('prepare for LTO'). But this then prevents further inlining of the function during LTO. This patch tries to resolve this issue by making LoopRotate more conservative with respect to rotating loops that have inline-able calls during the 'prepare for LTO' stage. I think this change intuitively improves the current situation in general. Loop-rotate tries hard to avoid creating headers that are 'too big'. At the moment, it assumes all inlining already happened and the cost of duplicating a call is equal to just doing the call. But with LTO, inlining also happens during full LTO and it is possible that a previously duplicated call is actually a huge function which gets inlined during LTO. From the perspective of LV, not much should change overall. Most loops calling user-provided functions won't get vectorized to start with (unless we can infer that the function does not touch memory, has no other side effects). If we do not inline the 'inline-able' call during the LTO stage, we merely delayed loop-rotation & vectorization. If we inline during LTO, chances should be very high that the inlined code is itself vectorizable or the user call was not vectorizable to start with. There could of course be scenarios where we inline a sufficiently large function with code not profitable to vectorize, which would have been vectorized earlier (by scalarizing the call). But even in that case, there probably is no big performance impact, because it should be mostly down to the cost-model to reject vectorization in that case. And then the version with scalarized calls should also not be beneficial. In a way, LV should have strictly more information after inlining and make more accurate decisions (barring cost-model issues). There is of course plenty of room for things to go wrong unexpectedly, so we need to keep a close look at actual performance and address any follow-up issues. I took a look at the impact on statistics for MultiSource/SPEC2000/SPEC2006. There are a few benchmarks with fewer loops rotated, but no change to the number of loops vectorized. Reviewed By: sanwou01 Differential Revision: https://reviews.llvm.org/D94232	2021-01-19 10:15:29 +00:00
Mircea Trofin	e8049dc3c8	[NewPM][Inliner] Move the 'always inliner' case in the same CGSCC pass as 'regular' inliner Expanding from D94808 - we ensure the same InlineAdvisor is used by both InlinerPass instances. The notion of mandatory inlining is moved into the core InlineAdvisor: advisors anyway have to handle that case, so this change also factors out that a bit better. Differential Revision: https://reviews.llvm.org/D94825	2021-01-15 17:59:38 -08:00
Kazu Hirata	7dc3575ef2	[llvm] Remove redundant return and continue statements (NFC) Identified with readability-redundant-control-flow.	2021-01-14 20:30:34 -08:00
Arthur Eubanks	a03ffa9850	[NewPM] Fix placement of LoopFlatten https://reviews.llvm.org/D90402 was inconsistent with where it put LoopFlatten between the two pass managers. It also missed adding it to the non-O1 function simplification pipeline. PR48738 Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D94650	2021-01-14 09:49:31 -08:00
Arthur Eubanks	b196dc6607	[NFC] Remove unused entry in PassRegistry.def	2021-01-13 19:01:07 -08:00
Wei Mi	86341247c4	[NFC] Rename ThinLTOPhase to ThinOrFullLTOPhase and move it from PassBuilder.h to Pass.h. In some compiler passes like SampleProfileLoaderPass, we want to know which LTO/ThinLTO phase the pass is in. Currently the phase is represented in enum class PassBuilder::ThinLTOPhase, so it is only available in PassBuilder and it also cannot represent phase in full LTO. The patch extends it to include full LTO phases and move it from PassBuilder.h to Pass.h, then it is much easier for PassBuilder to communicate with each pass about current LTO phase. Differential Revision: https://reviews.llvm.org/D94613	2021-01-13 15:55:40 -08:00
Arthur Eubanks	39e6d24237	[NewPM] Only non-trivially loop unswitch at -O3 and for non-optsize functions This matches the legacy pipeline/pass. Reviewed By: asbirlea, SjoerdMeijer Differential Revision: https://reviews.llvm.org/D94559	2021-01-13 14:54:49 -08:00
Arthur Eubanks	f748e92295	[NewPM] Run non-trivial loop unswitching under -O2/3/s/z Fixes https://bugs.llvm.org/show_bug.cgi?id=48715. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D94448	2021-01-12 11:04:40 -08:00
Jamie Schmeiser	43a830ed94	Introduce new quiet mode and new option handling for -print-changed. Summary: Introduce a new mode of operation for -print-changed that only reports after a pass changes the IR with all of the other messages suppressed (ie, no initial IR and no messages about ignored, filtered or non-modifying passes). The option processing for -print-changed is changed to take an optional string indicating options for print-changed. Initially, the only option supported is quiet (as described above). This new quiet mode is specified with -print-changed=quiet while -print-changed will continue to function in the same way. It is intended that there will be more options in the future. Author: Jamie Schmeiser <schmeise@ca.ibm.com> Reviewed By: aeubanks (Arthur Eubanks) Differential Revision: https://reviews.llvm.org/D92589	2021-01-11 14:15:18 -05:00
Arthur Eubanks	756dd70766	[NewPM] Run ObjC ARC passes Match the legacy PM in running various ObjC ARC passes. This requires making some module passes into function passes. These were initially ported as module passes since they add function declarations (e.g. https://reviews.llvm.org/D86178), but that's still up for debate and other passes do so. Reviewed By: ahatanak Differential Revision: https://reviews.llvm.org/D93743	2021-01-08 15:47:11 -08:00
Arthur Eubanks	69cf735062	[NewPM] Don't error when there's an unrecognized pass name This currently blocks --print-before/after with a legacy PM pass, for example when we use the new PM for the optimization pipeline but the legacy PM for the codegen pipeline. Also in the future when the codegen pipeline works with the new PM there will be multiple places to specify passes, so even when everything is using the new PM, there will still be multiple places that can accept different pass names. Reviewed By: hoy, ychen Differential Revision: https://reviews.llvm.org/D94283	2021-01-07 22:33:32 -08:00
Arthur Eubanks	28a326eba0	[NFC] Rename registerAliasAnalyses -> registerDefaultAliasAnalyses To clarify that this only affects the "default" AA. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D93980	2021-01-05 11:07:58 -08:00
Kazu Hirata	eb198f4c3c	[llvm] Use llvm::any_of (NFC)	2021-01-04 11:42:47 -08:00
Florian Hahn	c367258b5c	[SimplifyCFG] Enabled hoisting late in LTO pipeline. `bb7d3af113` disabled hoisting in SimplifyCFG by default, but enabled it late in the pipeline. But it appears as if the LTO pipelines got missed. This patch adjusts the LTO pipelines to also enable hoisting in the later stages. Unfortunately there's no easy way to add a test for the change I think. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D93684	2021-01-04 16:26:58 +00:00
Hongtao Yu	01f0d162d6	Moving UniqueInternalLinkageNamesPass to the start of IR pipelines. `UniqueInternalLinkageNamesPass` is useful to CSSPGO, especially when pseudo probe is used. It solves naming conflict for static functions which otherwise will share a merged profile and likely have a profile quality issue with mismatched CFG checksums. Since the pseudo probe instrumentation happens very early in the pipeline, I'm moving `UniqueInternalLinkageNamesPass` right before it. This is being done only to the new pass manager. Reviewed By: dblaikie, aeubanks Differential Revision: https://reviews.llvm.org/D93656	2021-01-02 14:26:21 -08:00
Arthur Eubanks	c2ef06d3dd	[NewPM] Port infer-address-spaces And add it to the AMDGPU opt pipeline. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D93880	2020-12-28 19:58:12 -08:00
Arthur Eubanks	6c36286a2e	[NewPM] Fix CGSCCOptimizerLateEPCallbacks place in pipeline CGSCCOptimizerLateEPCallbacks are supposed to be run before the function simplification pipeline, like in the legacy PM and as specified in the comments for registerCGSCCOptimizerLateEPCallback(). Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D93871	2020-12-28 14:03:10 -08:00
Arthur Eubanks	0219cf7dfa	[NewPM] Fix objc-arc-apelim pass typo	2020-12-22 21:40:43 -08:00
Arthur Eubanks	76f4f42eba	[NewPM] Add TargetMachine method to add alias analyses AMDGPUTargetMachine::adjustPassManager() adds some alias analyses to the legacy PM. We need a way to do the same for the new PM in order to port AMDGPUTargetMachine::adjustPassManager() to the new PM. Currently the new PM adds alias analyses by creating an AAManager via PassBuilder and overriding the AAManager a PassManager uses via FunctionAnalysisManager::registerPass(). We will continue to respect a custom AA pipeline that specifies an exact AA pipeline to use, but for "default" we will now add alias analyses that backends specify. Most uses of PassManager use the "default" AAManager created by PassBuilder::buildDefaultAAPipeline(). Backends can override the newly added TargetMachine::registerAliasAnalyses() to add custom alias analyses. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D93261	2020-12-21 13:46:07 -08:00
Samuel Eubanks	47dbee6790	Make NPM OptBisectInstrumentation use global singleton OptBisect Currently there is an issue where the legacy pass manager uses a different OptBisect counter than the new pass manager. This fix makes the npm OptBisectInstrumentation use the global OptBisect. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D92897	2020-12-20 13:47:56 -08:00
Andrew Litteken	dae34463e3	[IRSim][IROutliner] Adding the extraction basics for the IROutliner. Extracting the similar regions is the first step in the IROutliner. Using the IRSimilarityIdentifier, we collect the SimilarityGroups and sort them by how many instructions will be removed. Each IRSimilarityCandidate is used to define an OutlinableRegion. Each region is ordered by their occurrence in the Module and the regions that are not compatible with previously outlined regions are discarded. Each region is then extracted with the CodeExtractor into its own function. We test that correctly extract in: test/Transforms/IROutliner/extraction.ll test/Transforms/IROutliner/address-taken.ll test/Transforms/IROutliner/outlining-same-globals.ll test/Transforms/IROutliner/outlining-same-constants.ll test/Transforms/IROutliner/outlining-different-structure.ll Recommit of `bf899e8913` fixing memory leaks. Reviewers: paquette, jroelofs, yroux Differential Revision: https://reviews.llvm.org/D86975	2020-12-17 11:27:26 -06:00
dfukalov	9ed8e0caab	[NFC] Reduce include files dependency and AA header cleanup (part 2). Continuing work started in https://reviews.llvm.org/D92489: Removed a bunch of includes from "AliasAnalysis.h" and "LoopPassManager.h". Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D92852	2020-12-17 14:04:48 +03:00
Bardia Mahjour	6eff12788e	[DDG] Data Dependence Graph - DOT printer - recommit This is being recommitted to try and address the MSVC complaint. This patch implements a DDG printer pass that generates a graph in the DOT description language, providing a more visually appealing representation of the DDG. Similar to the CFG DOT printer, this functionality is provided under an option called -dot-ddg and can be generated in a less verbose mode under -dot-ddg-only option. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D90159	2020-12-16 12:37:36 -05:00
Bardia Mahjour	a29ecca781	Revert "[DDG] Data Dependence Graph - DOT printer" This reverts commit `fd4a10732c`, to investigate the failure on windows: http://lab.llvm.org:8011/#/builders/127/builds/3274	2020-12-14 16:54:20 -05:00
Bardia Mahjour	fd4a10732c	[DDG] Data Dependence Graph - DOT printer This patch implements a DDG printer pass that generates a graph in the DOT description language, providing a more visually appealing representation of the DDG. Similar to the CFG DOT printer, this functionality is provided under an option called -dot-ddg and can be generated in a less verbose mode under -dot-ddg-only option. Differential Revision: https://reviews.llvm.org/D90159	2020-12-14 16:41:14 -05:00
Zequan Wu	b5216b2950	[PGO] Enable preinline and cleanup when optimize for size Differential Revision: https://reviews.llvm.org/D91673	2020-12-10 12:29:17 -08:00
Arthur Eubanks	ff7e1da68f	[NPM] Support -fmerge-functions I tried to put it in the same place in the pipeline as the legacy PM. Fixes PR48399. Reviewed By: asbirlea, nikic Differential Revision: https://reviews.llvm.org/D93002	2020-12-10 11:45:08 -08:00
Anna Thomas	29356e3279	[ScalarizeMaskedMemIntrin] Add new PM support This patch adds new PM support for the pass and the pass can be now used during middle-end transforms. The old pass is renamed to ScalarizeMaskedMemIntrinLegacyPass. Reviewed-By: skatkov, aeubanks Differential Revision: https://reviews.llvm.org/D92743	2020-12-08 17:15:22 -05:00
Arthur Eubanks	0173eb0faf	Use isIgnored instead of checking pass name In preparation for https://reviews.llvm.org/D92616 which will remove angle brackets from pass manager/adaptor names. Reviewed By: dexonsmith, thakis Differential Revision: https://reviews.llvm.org/D92625	2020-12-03 18:37:57 -08:00
Arthur Eubanks	2f0de58294	[NewPM] Support --print-before/after in NPM This changes --print-before/after to be a list of strings rather than legacy passes. (this also has the effect of not showing the entire list of passes in --help-hidden after --print-before/after, which IMO is great for making it less verbose). Currently PrintIRInstrumentation passes the class name rather than pass name to llvm::shouldPrintBeforePass(), meaning llvm::shouldPrintBeforePass() never functions as intended in the NPM. There is no easy way of converting class names to pass names outside of within an instance of PassBuilder. This adds a map of pass class names to their short names in PassRegistry.def within PassInstrumentationCallbacks. It is populated inside the constructor of PassBuilder, which takes a PassInstrumentationCallbacks. Add a pointer to PassInstrumentationCallbacks inside PrintIRInstrumentation and use the newly created map. This is a bit hacky, but I can't think of a better way since the short id to class name only exists within PassRegistry.def. This also doesn't handle passes not in PassRegistry.def but rather added via PassBuilder::registerPipelineParsingCallback(). llvm/test/CodeGen/Generic/print-after.ll doesn't seem very useful now with this change. Reviewed By: ychen, jamieschmeiser Differential Revision: https://reviews.llvm.org/D87216	2020-12-03 16:52:14 -08:00
Mircea Trofin	5fe10263ab	[llvm][inliner] Reuse the inliner pass to implement 'always inliner' Enable performing mandatory inlinings upfront, by reusing the same logic as the full inliner, instead of the AlwaysInliner. This has the following benefits: - reduce code duplication - one inliner codebase - open the opportunity to help the full inliner by performing additional function passes after the mandatory inlinings, but before the full inliner. Performing the mandatory inlinings first simplifies the problem the full inliner needs to solve: less call sites, more contextualization, and, depending on the additional function optimization passes run between the 2 inliners, higher accuracy of cost models / decision policies. Note that this patch does not yet enable much in terms of post-always inline function optimization. Differential Revision: https://reviews.llvm.org/D91567	2020-11-30 12:03:39 -08:00
Hongtao Yu	c083fededf	[CSSPGO] A Clang switch -fpseudo-probe-for-profiling for pseudo-probe instrumentation. This change introduces a new clang switch `-fpseudo-probe-for-profiling` to enable AutoFDO with pseudo instrumentation. Please refer to https://reviews.llvm.org/D86193 for the whole story. One implication from pseudo-probe instrumentation is that the profile is now sensitive to CFG changes. We perform the pseudo instrumentation very early in the pre-LTO pipeline, before any CFG transformation. This ensures that the CFG instrumented and annotated is stable and optimization-resilient. The early instrumentation also allows the inliner to duplicate probes for inlined instances. When a probe along with the other instructions of a callee function are inlined into its caller function, the GUID of the callee function goes with the probe. This allows samples collected on inlined probes to be reported for the original callee function. Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D86502	2020-11-30 10:16:54 -08:00
Hongtao Yu	64fa8cce22	[CSSPGO] Pseudo probe instrumentation pass This change introduces a pseudo probe instrumentation pass for block instrumentation. Please refer to https://reviews.llvm.org/D86193 for the whole story. Given the following LLVM IR: ``` define internal void @foo2(i32 %x, void (i32)* %f) !dbg !4 { bb0: %cmp = icmp eq i32 %x, 0 br i1 %cmp, label %bb1, label %bb2 bb1: br label %bb3 bb2: br label %bb3 bb3: ret void } ``` The instrumented IR will look like below. Note that each llvm.pseudoprobe intrinsic call represents a pseudo probe at a block, of which the first parameter is the GUID of the probe’s owner function and the second parameter is the probe’s ID. ``` define internal void @foo2(i32 %x, void (i32)* %f) !dbg !4 { bb0: %cmp = icmp eq i32 %x, 0 call void @llvm.pseudoprobe(i64 837061429793323041, i64 1) br i1 %cmp, label %bb1, label %bb2 bb1: call void @llvm.pseudoprobe(i64 837061429793323041, i64 2) br label %bb3 bb2: call void @llvm.pseudoprobe(i64 837061429793323041, i64 3) br label %bb3 bb3: call void @llvm.pseudoprobe(i64 837061429793323041, i64 4) ret void } ``` Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D86499	2020-11-30 10:16:54 -08:00
Andrew Litteken	a8a43b6338	Revert "[IRSim][IROutliner] Adding the extraction basics for the IROutliner." Reverting commit due to address sanitizer errors. > Extracting the similar regions is the first step in the IROutliner. > > Using the IRSimilarityIdentifier, we collect the SimilarityGroups and > sort them by how many instructions will be removed. Each > IRSimilarityCandidate is used to define an OutlinableRegion. Each > region is ordered by their occurrence in the Module and the regions that > are not compatible with previously outlined regions are discarded. > > Each region is then extracted with the CodeExtractor into its own > function. > > We test that correctly extract in: > test/Transforms/IROutliner/extraction.ll > test/Transforms/IROutliner/address-taken.ll > test/Transforms/IROutliner/outlining-same-globals.ll > test/Transforms/IROutliner/outlining-same-constants.ll > test/Transforms/IROutliner/outlining-different-structure.ll > > Reviewers: paquette, jroelofs, yroux > > Differential Revision: https://reviews.llvm.org/D86975 This reverts commit `bf899e8913`.	2020-11-27 19:55:57 -06:00
Andrew Litteken	bf899e8913	[IRSim][IROutliner] Adding the extraction basics for the IROutliner. Extracting the similar regions is the first step in the IROutliner. Using the IRSimilarityIdentifier, we collect the SimilarityGroups and sort them by how many instructions will be removed. Each IRSimilarityCandidate is used to define an OutlinableRegion. Each region is ordered by their occurrence in the Module and the regions that are not compatible with previously outlined regions are discarded. Each region is then extracted with the CodeExtractor into its own function. We test that correctly extract in: test/Transforms/IROutliner/extraction.ll test/Transforms/IROutliner/address-taken.ll test/Transforms/IROutliner/outlining-same-globals.ll test/Transforms/IROutliner/outlining-same-constants.ll test/Transforms/IROutliner/outlining-different-structure.ll Reviewers: paquette, jroelofs, yroux Differential Revision: https://reviews.llvm.org/D86975	2020-11-27 19:08:29 -06:00
Roman Lebedev	a8d74517dc	[PassManager] Run Induction Variable Simplification pass after Recognize loop idioms pass, not before Currently, `-indvars` runs first, and then immediately after `-loop-idiom` does. I'm not really sure if `-loop-idiom` requires `-indvars` to run beforehand, but i'm very sure that `-indvars` requires `-loop-idiom` to run afterwards, as it can be seen in the phase-ordering test. LoopIdiom runs on two types of loops: countable ones, and uncountable ones. For uncountable ones, IndVars obviously didn't make any change to them, since they are uncountable, so for them the order should be irrelevant. For countable ones, well, they should have been countable before IndVars for IndVars to make any change to them, and since SCEV is used on them, it shouldn't matter if IndVars have already canonicalized them. So i don't really see why we'd want the current ordering. Should this cause issues, it will give us a reproducer test case that shows flaws in this logic, and we then could adjust accordingly. While this is quite likely beneficial in-the-wild already, it's a required part for the full motivational pattern behind `left-shift-until-bittest` loop idiom (D91038). Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D91800	2020-11-25 19:20:07 +03:00
Arthur Eubanks	2c7870dcca	[NewPM] Add pipeline EP callback after initial frontend cleanup This matches the legacy PM's EP_ModuleOptimizerEarly. Some backends use this extension point and adding the pass somewhere else like PipelineStartEPCallback doesn't work. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D91804	2020-11-24 21:14:36 -08:00
Arthur Eubanks	aff058b1a9	Reland [CGSCC] Detect devirtualization in more cases The devirtualization wrapper misses cases where if it wraps a pass manager, an individual pass may devirtualize an indirect call created by a previous pass. For example, inlining may create a new indirect call which is devirtualized by instcombine. Currently the devirtualization wrapper will not see that because it only checks cgscc edges at the very beginning and end of the pass (manager) it wraps. This fixes some tests testing this exact behavior in the legacy PM. Instead of checking WeakTrackingVHs for CallBases at the very beginning and end of the pass it wraps, check every time updateCGAndAnalysisManagerForPass() is called. check-llvm and check-clang with -abort-on-max-devirt-iterations-reached on by default doesn't show any failures outside of tests specifically testing it so it doesn't needlessly rerun passes more than necessary. (The NPM -O2/3 pipeline run the inliner/function simplification pipeline under a devirtualization repeater pass up to 4 times by default). http://llvm-compile-time-tracker.com/?config=O3&stat=instructions&remote=aeubanks shows that 7zip has ~1% compile time regression. I looked at it and saw that there indeed was devirtualization happening that was not previously caught, so now it reruns the CGSCC pipeline on some SCCs, which is WAI. The initial land assumed CallBase WeakTrackingVHs would always be CallBases, but they can be RAUW'd with undef. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D89587	2020-11-23 21:28:59 -08:00
Arthur Eubanks	6a2799cf8e	Revert "[CGSCC] Detect devirtualization in more cases" This reverts commit `14a68b4aa9`. Causes building self hosted clang to crash when using NPM.	2020-11-23 13:21:05 -08:00
Arthur Eubanks	3c811ce4f3	[NPM] Share pass building options with legacy PM We should share options when possible. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D91741	2020-11-23 13:04:05 -08:00
Arthur Eubanks	7167e5203a	Port -print-memderefs to NPM There is lots of code duplication, but hopefully it won't matter soon. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D91683	2020-11-23 11:56:22 -08:00
Arthur Eubanks	14a68b4aa9	[CGSCC] Detect devirtualization in more cases The devirtualization wrapper misses cases where if it wraps a pass manager, an individual pass may devirtualize an indirect call created by a previous pass. For example, inlining may create a new indirect call which is devirtualized by instcombine. Currently the devirtualization wrapper will not see that because it only checks cgscc edges at the very beginning and end of the pass (manager) it wraps. This fixes some tests testing this exact behavior in the legacy PM. Instead of checking WeakTrackingVHs for CallBases at the very beginning and end of the pass it wraps, check every time updateCGAndAnalysisManagerForPass() is called. check-llvm and check-clang with -abort-on-max-devirt-iterations-reached on by default doesn't show any failures outside of tests specifically testing it so it doesn't needlessly rerun passes more than necessary. (The NPM -O2/3 pipeline run the inliner/function simplification pipeline under a devirtualization repeater pass up to 4 times by default). http://llvm-compile-time-tracker.com/?config=O3&stat=instructions&remote=aeubanks shows that 7zip has ~1% compile time regression. I looked at it and saw that there indeed was devirtualization happening that was not previously caught, so now it reruns the CGSCC pipeline on some SCCs, which is WAI. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D89587	2020-11-23 11:55:20 -08:00
Jamie Schmeiser	621efa6a5a	[NFC intended] Refactor the code for printChanged for reuse and to facilitate subsequent reporters of changes to the IR in the new pass manager. Summary: [NFC intended] Refactor the code for printChanged for reuse and to facilitate subsequent reporters of changes to the IR in the new pass manager. Create abstract template base classes for common functionality and give classes more appropriate names. The base classes handle all of the determination of when a function or pass is "interesting" and should be reported or filtered out. They have pure virtual functions which are called when a change by a pass has been recognized so the derived class need only provide the overrides to present the information about the changing IR. There are at least 2 more change reporters to come (which were presented in my tutorial at the 2020 llvm developer's meeting) that derive from these classes. Respond to review comments: move function out of line, remove inline keyword, remove unneeded qualifiers, simplify comparison. Author: Jamie Schmeiser <schmeise@ca.ibm.com> Reviewed By: aeubanks (Arthur Eubanks), madhur13490 (Madhur Amilkanthwar) Differential Revision: https://reviews.llvm.org/D87000	2020-11-20 09:43:06 -05:00
Arthur Eubanks	b77436047a	[PGO] Make -disable-preinline work with NPM Fixes cspgo_profile_summary.ll under NPM. Reviewed By: xur Differential Revision: https://reviews.llvm.org/D91826	2020-11-19 22:58:55 -08:00
Arthur Eubanks	513d165b80	Port -lower-matrix-intrinsics-minimal to NPM This reuses the existing lower-matrix-intrinsics pass rather than going the legacy pass route of creating a new pass. Use this new variant in the NPM -O0 pipeline. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D91811	2020-11-19 17:42:48 -08:00
Arthur Eubanks	72badbcdcc	[NPM] Move more O0 pass building into PassBuilder This moves handling of alwaysinline, coroutines, matrix lowering, PGO, and LTO-required passes into PassBuilder. Much of this is replicated between Clang and opt. Other out-of-tree users also replicate some of this, such as Rust [1] replicating the alwaysinline, LTO, and PGO passes. The LTO passes are also now run in build(Thin)LTOPreLinkDefaultPipeline() since they are semantically required for (Thin)LTO. [1]: `f5230fbf76/compiler/rustc_llvm/llvm-wrapper/PassWrapper.cpp (L896)` Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D91585	2020-11-19 11:22:23 -08:00
Arthur Eubanks	67f16e9e91	[NPM] Remove -enable-npm-optnone flag It has been on by default for a couple months without complaint. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D91743	2020-11-18 15:49:16 -08:00
Yevgeny Rouban	cba3e78338	[NewPM] Disable PreservedCFGChecker and add regression unit tests The design of the PreservedCFG Checker (landed with the commit `28012e00d8`) has a fundamental flaw which makes it incorrect. The checker is based on the PreservedAnalyses result returned by functional passes: if CFGAnalyses is in the returned PreservedAnalyses set, then the checker asserts that the CFG snapshot saved before the pass is equal to the CFG snapshot taken after the pass. The problem is in passes that change CFG and invalidate CFGAnalyses on their own. Such passes do not return CFGAnalyses in the returned PreservedAnalyses. So the checker mistakenly expects CFG unchanged. As an example see the class TestSimplifyCFGInvalidatingAnalysisPass in the new tests. It is interesting that the bug was not found in LLVM. That is because the CFG checker ran only if CFGAnalyses was checked incorrectly: if (!PassPA.allAnalysesInSetPreserved<CFGAnalyses>()) return; but must be checked as follows: auto PAC = PA.getChecker<PreservedCFGCheckerAnalysis>(); if (!(PAC.preserved() || PAC.preservedSet<AllAnalysesOn<Function>>() || PAC.preservedSet<CFGAnalyses>())) return; A fully redesigned checker will be sent as a separate follow-up patch. Reviewed By: Serguei Katkov, Jakub Kuderski Differential Revision: https://reviews.llvm.org/D91324	2020-11-18 10:02:47 +07:00
Florian Hahn	8dbe44cb29	Add pass to add !annotate metadata from @llvm.global.annotations. This patch adds a new pass to add !annotation metadata for entries in @llvm.global.annotations, which is generated using __attribute__((annotate("_name"))) on functions in Clang. This has been discussed on llvm-dev as part of RFC: Combining Annotation Metadata and Remarks http://lists.llvm.org/pipermail/llvm-dev/2020-November/146393.html Reviewed By: thegameg Differential Revision: https://reviews.llvm.org/D91195	2020-11-16 14:57:11 +00:00
Arthur Eubanks	6e04da0a5a	[DCE] Port -redundant-dbg-inst-elim to NPM This is used to test RemoveRedundantDbgInstrs(), which is used by other passes. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D91477	2020-11-14 16:55:20 -08:00
Florian Hahn	8bb6347939	Add !annotation metadata and remarks pass. This patch adds a new !annotation metadata kind which can be used to attach annotation strings to instructions. It also adds a new pass that emits summary remarks per function with the counts for each annotation kind. The intended uses cases for this new metadata is annotating 'interesting' instructions and the remarks should provide additional insight into transformations applied to a program. To motivate this, consider these specific questions we would like to get answered: * How many stores added for automatic variable initialization remain after optimizations? Where are they? * How many runtime checks inserted by a frontend could be eliminated? Where are the ones that did not get eliminated? Discussed on llvm-dev as part of 'RFC: Combining Annotation Metadata and Remarks' (http://lists.llvm.org/pipermail/llvm-dev/2020-November/146393.html) Reviewed By: thegameg, jdoerfert Differential Revision: https://reviews.llvm.org/D91188	2020-11-13 13:24:10 +00:00
serge-sans-paille	9218ff50f9	llvmbuildectomy - replace llvm-build by plain cmake No longer rely on an external tool to build the llvm component layout. Instead, leverage the existing `add_llvm_componentlibrary` cmake function and introduce `add_llvm_component_group` to accurately describe component behavior. These function store extra properties in the created targets. These properties are processed once all components are defined to resolve library dependencies and produce the header expected by llvm-config. Differential Revision: https://reviews.llvm.org/D90848	2020-11-13 10:35:24 +01:00
Jamie Schmeiser	782d6a6963	Introduce -print-before-changed, making -print-changed also print before passes that modify IR Summary: Add an option -print-before-changed that modifies the print-changed behaviour so that it prints the IR before a pass that changed it in addition to printing the IR after the pass. Note that the option does nothing in isolation. The filtering options work as expected. Lit tests are included. Author: Jamie Schmeiser <schmeise@ca.ibm.com> Reviewed By: aeubanks (Arthur Eubanks) Differential Revision: https://reviews.llvm.org/D88757	2020-11-12 15:20:50 +00:00
Arthur Eubanks	b6ccff3d5f	[NewPM] Provide method to run all pipeline callbacks, used for -O0 Some targets may add required passes via TargetMachine::registerPassBuilderCallbacks(). We need to run those even under -O0. As an example, BPFTargetMachine adds BPFAbstractMemberAccessPass, a required pass. This also allows us to clean up BackendUtil.cpp (and out-of-tree Rust usage of the NPM) by allowing us to share added passes like coroutines and sanitizers between -O0 and other optimization levels. Since callbacks may end up not adding passes, we need to check if the pass managers are empty before adding them, so PassManager now has an isEmpty() function. For example, polly adds callbacks but doesn't always add passes in those callbacks, so this is necessary to keep -debug-pass-manager tests' output from changing depending on if polly is enabled or not. Tests are a continuation of those added in https://reviews.llvm.org/D89083. Reviewed By: asbirlea, Meinersbur Differential Revision: https://reviews.llvm.org/D89158	2020-11-11 15:10:27 -08:00
Sjoerd Meijer	2ef47910d5	[LoopFlatten] Run it earlier, just before IndVarSimplify This is a prep step for widening induction variables in LoopFlatten if this is possible (D90640), to avoid having to perform certain overflow checks. Since IndVarSimplify may already widen induction variables, we want to run LoopFlatten just before IndVarSimplify. This is a minor reshuffle as both passes were already close after each other. Differential Revision: https://reviews.llvm.org/D90402	2020-11-10 20:22:41 +00:00
Sjoerd Meijer	706ead0e87	[LoopFlatten] Make it a FunctionPass This converts LoopFlatten from a LoopPass to a FunctionPass so that we don't run into problems of a loop pass deleting a (inner)loop. Differential Revision: https://reviews.llvm.org/D90940	2020-11-10 20:03:31 +00:00
Arthur Eubanks	1cbf8e89b5	[NewPM] Port -separate-const-offset-from-gep Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D91095	2020-11-09 17:42:36 -08:00
Josh Stone	4463b73e79	Enable opt-bisect for the new pass manager This instruments a should-run-optional-pass callback using the existing OptBisect class to decide if new passes should be skipped. Passes that force isRequired never reach this at all, so they are not included in "BISECT:" output nor its pass count. The test case is resurrected from r267022, an early version of D19172 that had new pass manager support (later reverted and redone without). Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D87951	2020-11-09 15:57:48 -08:00
Arthur Eubanks	cdb51bfaa7	[NewPM] Add unique-internal-linkage-names to PassRegistry.def Pass was already ported, just not properly hooked up.	2020-11-09 12:54:13 -08:00
Pedro Tammela	5e8ecff0d8	[Reg2Mem] add support for the new pass manager This patch refactors the pass to accommodate the new pass manager boilerplate. Differential Revision: https://reviews.llvm.org/D91005	2020-11-08 11:14:05 +00:00
Arthur Eubanks	226e179f74	Revert "[NewPM] Provide method to run all pipeline callbacks, used for -O0" This reverts commit `ae38540042`. As well as some follow-up test fixes. The original change causes new-pass-manager.ll to fail when polly is enabled.	2020-11-08 00:32:35 -08:00
Arthur Eubanks	ae38540042	[NewPM] Provide method to run all pipeline callbacks, used for -O0 Some targets may add required passes via TargetMachine::registerPassBuilderCallbacks(). We need to run those even under -O0. As an example, BPFTargetMachine adds BPFAbstractMemberAccessPass, a required pass. This also allows us to clean up BackendUtil.cpp (and out-of-tree Rust usage of the NPM) by allowing us to share added passes like coroutines and sanitizers between -O0 and other optimization levels. Tests are a continuation of those added in https://reviews.llvm.org/D89083. In order to prevent TargetMachines from adding unnecessary optimization passes at -O0, TargetMachine::registerPassBuilderCallbacks() will be changed to take an OptimizationLevel, but that will be done separately. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D89158	2020-11-04 22:27:16 -08:00
Arthur Eubanks	ab0ddbc38a	Reland [NewPM] Add OptimizationLevel param to registerPipelineStartEPCallback This allows targets to skip optional optimization passes at -O0. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D90777	2020-11-04 13:11:40 -08:00
Arthur Eubanks	9173b5a99d	Revert "[NewPM] Add OptimizationLevel param to registerPipelineStartEPCallback" This reverts commit `7a83aa0520`. Causing buildbot failures.	2020-11-04 12:57:32 -08:00
Arthur Eubanks	7a83aa0520	[NewPM] Add OptimizationLevel param to registerPipelineStartEPCallback This allows targets to skip optional optimization passes at -O0. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D90777	2020-11-04 12:53:30 -08:00
Arthur Eubanks	d8f531c42c	[NewPM] Don't run before pass instrumentation on required passes This allows those instrumentation to log when they decide to skip a pass. This provides extra helpful info for optnone functions and also will help with opt-bisect. Have OptNoneInstrumentation print when it skips due to seeing optnone. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D90545	2020-11-04 09:45:10 -08:00
Arthur Eubanks	06926e0f01	Port print-must-be-executed-contexts and print-mustexecute to NPM Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D90207	2020-11-03 21:06:46 -08:00
Arthur Eubanks	2e31727a88	[NFC] Clean up PassBuilder Make DebugLogging a member variable so that users of PassBuilder don't need to pass it around so much. Move call to TargetMachine::registerPassBuilderCallbacks() within PassBuilder so users don't need to remember to call it. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D90437	2020-10-30 10:03:59 -07:00
Serguei Katkov	b69919b537	[GVN LoadPRE] Add an option to disable splitting backedge GVN Load PRE can split the backedge causing breaking the loop structure where the latch contains the conditional branch with for example induction variable. Different optimizations expect this form of the loop, so it is better to preserve it for some time. This CL adds an option to control an ability to split backedge. Default value is true so technically it is NFC and current behavior is not changed. Reviewers: fedor.sergeev, mkazantsev, nikic, reames, fhahn Reviewed By: mkazasntsev Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D89854	2020-10-27 11:59:52 +07:00
Arthur Eubanks	3dd1c72458	Port -objc-arc-expand to NPM Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D90182	2020-10-26 20:05:10 -07:00
Arthur Eubanks	90c0b0d3d6	Port -objc-arc-apelim to NPM Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D90181	2020-10-26 20:01:46 -07:00
TaWeiTu	0efbfa38ae	[NPM] Port -slsr to NPM `-separate-const-offset-from-gep` has not yet be ported, so some tests are not updated. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D90149	2020-10-27 09:21:40 +08:00
Arthur Eubanks	c039e83a2c	Fix typo SSC -> SCC	2020-10-24 16:26:48 -07:00
TaWeiTu	65a36bbc3d	[NPM] Port -loop-versioning-licm to NPM Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D89371	2020-10-24 21:51:18 +08:00
Arthur Eubanks	baffd052b0	[StructurizeCFG][NewPM] Port -structurizecfg to NPM This doesn't support -structurizecfg-skip-uniform-regions since that would require porting LegacyDivergenceAnalysis. The NPM doesn't support adding a non-analysis pass as a dependency of another, so I had to add -lowerswitch to some tests or pin them to the legacy PM. This is the only RegionPass in tree, so I simply copied the logic for finding all Regions from the legacy PM's RGManager into StructurizeCFG::run(). Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D89026	2020-10-23 15:54:03 -07:00
Arthur Eubanks	92d9a3868a	Port -instnamer to NPM Some clang tests use this. Reviewed By: akhuang Differential Revision: https://reviews.llvm.org/D89931	2020-10-22 12:08:36 -07:00
Arthur Eubanks	cb9ca35977	[LoopRotate][NPM] Disable header duplication under -Oz It was already disabled under -Oz in buildFunctionSimplificationPipeline(), but not in buildModuleOptimizationPipeline()/addPGOInstrPasses(). Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D89927	2020-10-22 08:39:12 -07:00
Arthur Eubanks	8d9466a385	[BlockExtract][NewPM] Port -extract-blocks to NPM Reviewed By: thakis Differential Revision: https://reviews.llvm.org/D89015	2020-10-21 12:51:11 -07:00
Florian Hahn	88241ffb56	[Passes] Move ADCE before DSE & LICM. The adjustment seems to have very little impact on optimizations. The only binary change with -O3 MultiSource/SPEC2000/SPEC2006 on X86 is in consumer-typeset and the size there actually decreases by -0.1%, with not significant changes in the stats. On its own, it is mildly positive in terms of compile-time, most likely due to LICM & DSE having to process slightly less instructions. It should also be unlikely that DSE/LICM make much new code dead. http://llvm-compile-time-tracker.com/compare.php?from=df63eedef64d715ce1f31843f7de9c11fe1e597f&to=e3bdfcf94a9eeae6e006d010464f0c1b3550577d&stat=instructions With DSE & MemorySSA, it gives some nice compile-time improvements, due to the fact that DSE can re-use the PDT from ADCE, if it does not make any changes: http://llvm-compile-time-tracker.com/compare.php?from=15fdd6cd7c24c745df1bb419e72ff66fd138aa7e&to=481f494515fc89cb7caea8d862e40f2c910dc994&stat=instructions Reviewed By: xbolva00 Differential Revision: https://reviews.llvm.org/D87322	2020-10-21 10:30:56 +01:00
Ta-Wei Tu	529ecd19df	[NPM] port -unify-loop-exits to NPM Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D89774	2020-10-20 10:46:57 -07:00
Ta-Wei Tu	59286b36df	[NPM] Port -mergereturn to NPM Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D89781	2020-10-20 10:33:58 -07:00
Amy Huang	ea693a1627	[NPM] Port module-debuginfo pass to the new pass manager Port pass to NPM and update tests in DebugInfo/Generic. Differential Revision: https://reviews.llvm.org/D89730	2020-10-19 14:31:17 -07:00
Hans Wennborg	0628bea513	Revert "[PM/CC1] Add -f[no-]split-cold-code CC1 option to toggle splitting" This broke Chromium's PGO build, it seems because hot-cold-splitting got turned on unintentionally. See comment on the code review for repro etc. > This patch adds -f[no-]split-cold-code CC1 options to clang. This allows > the splitting pass to be toggled on/off. The current method of passing > `-mllvm -hot-cold-split=true` to clang isn't ideal as it may not compose > correctly (say, with `-O0` or `-Oz`). > > To implement the -fsplit-cold-code option, an attribute is applied to > functions to indicate that they may be considered for splitting. This > removes some complexity from the old/new PM pipeline builders, and > behaves as expected when LTO is enabled. > > Co-authored by: Saleem Abdulrasool <compnerd@compnerd.org> > Differential Revision: https://reviews.llvm.org/D57265 > Reviewed By: Aditya Kumar, Vedant Kumar > Reviewers: Teresa Johnson, Aditya Kumar, Fedor Sergeev, Philip Pfaffe, Vedant Kumar This reverts commit `273c299d5d`.	2020-10-19 12:31:14 +02:00
Vedant Kumar	273c299d5d	[PM/CC1] Add -f[no-]split-cold-code CC1 option to toggle splitting This patch adds -f[no-]split-cold-code CC1 options to clang. This allows the splitting pass to be toggled on/off. The current method of passing `-mllvm -hot-cold-split=true` to clang isn't ideal as it may not compose correctly (say, with `-O0` or `-Oz`). To implement the -fsplit-cold-code option, an attribute is applied to functions to indicate that they may be considered for splitting. This removes some complexity from the old/new PM pipeline builders, and behaves as expected when LTO is enabled. Co-authored by: Saleem Abdulrasool <compnerd@compnerd.org> Differential Revision: https://reviews.llvm.org/D57265 Reviewed By: Aditya Kumar, Vedant Kumar Reviewers: Teresa Johnson, Aditya Kumar, Fedor Sergeev, Philip Pfaffe, Vedant Kumar	2020-10-15 23:13:33 +00:00
Arthur Eubanks	518ec05a10	[LoopExtract][NewPM] Port -loop-extract to NPM -loop-extract-single is just -loop-extract on one loop. -loop-extract depended on -break-crit-edges and -loop-simplify in the legacy PM, but the NPM doesn't allow specifying pass dependencies like that, so manually add those passes to the RUN lines where necessary. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D89016	2020-10-13 22:55:42 -07:00
Arthur Eubanks	0689dab844	[FixIrreducible][NewPM] Port -fix-irreducible to NPM In the NPM, a pass cannot depend on another non-analysis pass. So pin the test that tests that -lowerswitch is run automatically to legacy PM. Reviewed By: sameerds Differential Revision: https://reviews.llvm.org/D89051	2020-10-09 09:22:09 -07:00
Arthur Eubanks	9c21c6c966	[LoopInterchange][NewPM] Port -loop-interchange to NPM Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D89058	2020-10-09 09:21:31 -07:00
Arthur Eubanks	6dcbea877b	[NewPM] Use PassInstrumentation for -verify-each This removes "VerifyEachPass" parameters from a lot of functions which is nice. Don't verify after special passes or VerifierPass. This introduces verification on loop and cgscc passes, verifying the corresponding function/module. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D88764	2020-10-07 19:24:25 -07:00
Reid Kleckner	940d7aaea9	Port StripGCRelocates pass to NPM Fixes one test under NPM Differential Revision: https://reviews.llvm.org/D88766	2020-10-07 14:41:29 -07:00
Reid Kleckner	da48fe1732	[NPM] Port strip nonlinetable debuginfo pass to the new pass manager Fixes a few tests in llvm/test/Transforms/Utils. Differential Revision: https://reviews.llvm.org/D88762	2020-10-07 14:35:36 -07:00
Arthur Eubanks	d4e08c95e5	[NewPM] Set -enable-npm-optnone to true by default This makes the NPM skip not required passes on functions marked optnone. If this causes a pass that should be required but has not been marked required to be skipped, add `static bool isRequired() { return true; }` to the pass class. AlwaysInlinerPass is an example. clang/test/CodeGen/O0-no-skipped-passes.c is useful for checking that no passes are skipped under -O0. The -enable-npm-optnone option will be removed once this has been stable for long enough without issues. Reviewed By: ychen, asbirlea Differential Revision: https://reviews.llvm.org/D87869	2020-10-05 18:42:32 -07:00
Arthur Eubanks	321986fe68	[MetaRenamer][NewPM] Port metarenamer to NPM Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D88690	2020-10-02 15:42:25 -07:00
Jamie Schmeiser	71124a9dbd	Reland No.3: Add new hidden option -print-changed which only reports changes to IR A new hidden option -print-changed is added along with code to support printing the IR as it passes through the opt pipeline in the new pass manager. Only those passes that change the IR are reported, with others only having the banner reported, indicating that they did not change the IR, were filtered out or ignored. Filtering of output via the -filter-print-funcs is supported and a new supporting hidden option -filter-passes is added. The latter takes a comma separated list of pass names and filters the output to only show those passes in the list that change the IR. The output can also be modified via the -print-module-scope function. The code introduces an abstract template base class that generalizes the comparison of IRs that takes an IR representation as template parameter. Derived classes provide overrides that provide an event based API for generalized reporting of IRs as they are changed in the opt pipeline through the new pass manager. The first of several instantiations is provided that prints the IR in a form similar to that produced by -print-after-all with the above mentioned filtering capabilities. This version, and the others to follow will be introduced at the upcoming developer's conference. Reviewed By: aeubanks (Arthur Eubanks), yrouban (Yevgeny Rouban), ychen (Yuanfang Chen), MaskRay (Fangrui Song) Differential Revision: https://reviews.llvm.org/D86360	2020-10-01 17:39:13 +00:00
Sjoerd Meijer	d53b4bee0c	[LoopFlatten] Add a loop-flattening pass This is a simple pass that flattens nested loops. The intention is to optimise loop nests like this, which together access an array linearly: for (int i = 0; i < N; ++i) for (int j = 0; j < M; ++j) f(A[i*M+j]); into one loop: for (int i = 0; i < (N*M); ++i) f(A[i]); It can also flatten loops where the induction variables are not used in the loop. This can help with codesize and runtime, especially on simple cpus without advanced branch prediction. This is only worth flattening if the induction variables are only used in an expression like i*M+j. If they had any other uses, we would have to insert a div/mod to reconstruct the original values, so this wouldn't be profitable. This partially fixes PR40581 as this pass triggers on one of the two cases. I will follow up on this to learn LoopFlatten a few more (small) tricks. Please note that LoopFlatten is not yet enabled by default. Patch by Oliver Stannard, with minor tweaks from Dave Green and myself. Differential Revision: https://reviews.llvm.org/D42365	2020-10-01 13:54:45 +01:00
Arthur Eubanks	460dda071e	[WholeProgramDevirt][NewPM] Add NPM testing path to match legacy pass The legacy pass's default constructor sets UseCommandLine = true and goes down a separate testing route. Match that in the NPM pass. This fixes all tests in llvm/test/Transforms/WholeProgramDevirt under NPM. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D88588	2020-09-30 17:27:37 -07:00
Arthur Eubanks	4fbd83c716	[ObjCARCAA][NewPM] Add already ported objc-arc-aa to PassRegistry.def Also add missing AnalysisKey definition.	2020-09-30 08:50:44 -07:00
Chuanqi Xu	b3a722e66b	[Coroutines] Reuse storage for local variables with non-overlapping lifetimes Bug 45566 shows that the process of building the coroutine frame does not take into account that the lifetimes of different local variables may not overlap, which means the compiler could generate a smaller frame. This patch calculates the lifetime range of each alloca using the StackLifetime class. The patch then builds non-overlapped sets for allocas whose lifetime ranges do not overlap. We use the largest type in a non-overlapped set as the field type in the frame. In the insertSpills process, if we find that the type of the field is not the same as that of the alloca, we cast the pointer to the field type to a pointer to the alloca type. Since the lifetime ranges of the allocas in one non-overlapped set do not overlap with each other, it should be ok to reuse the storage space in the frame. Test plan: check-llvm, check-clang, cppcoro, folly Reviewers: junparser, lxfind, modocache Differential Revision: https://reviews.llvm.org/D87596	2020-09-28 15:48:00 +08:00
Fangrui Song	50bd71e1d7	[NewPM] Port ConstraintElimination to the new pass manager If -enable-constraint-elimination is specified, add it to the -O2/-O3 pipeline. (-O1 uses a separate function now.) Reviewed By: fhahn, aeubanks Differential Revision: https://reviews.llvm.org/D88365	2020-09-27 11:12:26 -07:00
Arthur Eubanks	83e3ea2cfc	[LowerTypeTests][NewPM] Add constructor that uses command line flags This matches the legacy PM pass by having one constructor use command line flags, and the other use parameters to the pass. This fixes all tests under Transforms/LowerTypeTests using NPM. Reviewed By: ychen, pcc Differential Revision: https://reviews.llvm.org/D87845	2020-09-25 17:39:59 -07:00
Arthur Eubanks	d3f6972abb	[LoopReroll][NewPM] Port -loop-reroll to NPM Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D87957	2020-09-25 12:09:06 -07:00
Hans Wennborg	4f1897c6f0	Move PassBuilder::registerParseTopLevelPipelineCallback out-of-line For some mysterious reason it doesn't build with clang-cl when compiled as part of the includes in clang's CodeGenAction.cpp (crbug.com/1132292).	2020-09-25 19:55:40 +02:00
Andrew Litteken	f02c4c87b4	[IRSim] Adding wrapper pass for IRSimilarityIdentifier This introduces an analysis pass that wraps IRSimilarityIdentifier, and adds a printer pass to examine in what function similarities are being found. Tests for what the printer pass can find are in test/Analysis/IRSimilarityIdentifier. Reviewed by: paquette, jroelofs Differential Revision: https://reviews.llvm.org/D86973	2020-09-24 14:59:41 -05:00
Arthur Eubanks	29aaa18848	Revert "[NewPM] Add callbacks to PassBuilder to run before/after parsing a pass" This reverts commit `111aa4e366`.	2020-09-23 18:43:13 -07:00
Arthur Eubanks	111aa4e366	[NewPM] Add callbacks to PassBuilder to run before/after parsing a pass This is in preparation for supporting -debugify-each, which adds a debug info pass before and after each pass. Switch VerifyEach to use this. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D88107	2020-09-23 15:25:40 -07:00
Arthur Eubanks	a031ef6f3a	[GVNSink][NewPM] Add GVNSinkPass to PassRegistry.def	2020-09-22 08:24:09 -07:00
Arthur Eubanks	9db0c572c1	[Delinearization][NewPM] Port delinearization to NPM Also make tests in Analysis/Delinearization work under NPM. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D87741	2020-09-21 17:59:08 -07:00
Arthur Eubanks	024979b7b6	[ObjCARC][NewPM] Port objc-arc-contract to NPM Similar to https://reviews.llvm.org/D86178. This is a module pass instead of a function pass since ARCRuntimeEntryPoints can lazily add function declarations. Reviewed By: ahatanak Differential Revision: https://reviews.llvm.org/D87806	2020-09-21 09:40:14 -07:00
Arthur Eubanks	5249e6f248	[LoopSimplifyCFG][NewPM] Rename simplify-cfg -> loop-simplifycfg This matches the legacy PM name and makes all tests in Transforms/LoopSimplifyCFG pass under NPM. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D87948	2020-09-21 08:27:19 -07:00
Douglas Yung	b03c2b8395	Revert "Re-land: Add new hidden option -print-changed which only reports changes to IR" The test added in this commit is failing on Windows bots: http://lab.llvm.org:8011/builders/llvm-clang-win-x-armv7l/builds/1269 This reverts commit `f9e6d1edc0` and follow-up commit `6859d95ea2`.	2020-09-17 01:32:29 -07:00
Arthur Eubanks	f4ea0f9814	[NewPM] Port -print-alias-sets to NPM Really it should be named print<alias-sets>, but for the sake of changing fewer tests, added a TODO to rename after NPM switch and test cleanup. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D87713	2020-09-16 18:34:56 -07:00
Michael Liao	6859d95ea2	Fix build.	2020-09-16 14:52:00 -04:00
Jamie Schmeiser	f9e6d1edc0	Re-land: Add new hidden option -print-changed which only reports changes to IR A new hidden option -print-changed is added along with code to support printing the IR as it passes through the opt pipeline in the new pass manager. Only those passes that change the IR are reported, with others only having the banner reported, indicating that they did not change the IR, were filtered out or ignored. Filtering of output via the -filter-print-funcs is supported and a new supporting hidden option -filter-passes is added. The latter takes a comma separated list of pass names and filters the output to only show those passes in the list that change the IR. The output can also be modified via the -print-module-scope function. The code introduces a template base class that generalizes the comparison of IRs that takes an IR representation as template parameter. The constructor takes a series of lambdas that provide an event based API for generalized reporting of IRs as they are changed in the opt pipeline through the new pass manager. The first of several instantiations is provided that prints the IR in a form similar to that produced by -print-after-all with the above mentioned filtering capabilities. This version, and the others to follow will be introduced at the upcoming developer's conference. Reviewed By: aeubanks (Arthur Eubanks), yrouban (Yevgeny Rouban), ychen (Yuanfang Chen) Differential Revision: https://reviews.llvm.org/D86360	2020-09-16 17:25:18 +00:00
Arthur Eubanks	09c342493d	[NPM] Translate alias analysis into require<> as well 'require<globals-aa>' is needed to make globals-aa work in NPM, since globals-aa is a module analysis but function passes cannot run module analyses on demand. So don't skip translating alias analyses to 'require<>'. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D87743	2020-09-16 08:54:09 -07:00
Arthur Eubanks	ba12e77ec1	[NewPM] Port strip* passes to NPM strip-nondebug and strip-debug-declare have no existing associated tests Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D87639	2020-09-15 18:25:12 -07:00
Arthur Eubanks	f7aa1563eb	[LowerSwitch][NewPM] Port lowerswitch to NPM Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D87726	2020-09-15 18:18:31 -07:00
Wenlei He	2c391a5a14	[LICM] Make Loop ICM profile aware again D65060 was reverted because it introduced non-determinism by using BFI counts from already freed blocks. The parent of this revision fixes that by using a VH callback on blocks to prevent this from happening and makes sure BFI data is passed correctly in LoopStandardAnalysisResults. This re-introduces the previous optimization of using BFI data to prevent LICM from hoisting/sinking if the instruction will end up moving to a colder block. Internally at Facebook this change results in a ~7% win in a CPU related metric in one of our big services by preventing hoisting cold code into a hot pre-header like the added test case demonstrates. Testing: ninja check Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D87551	2020-09-15 17:21:58 -07:00
Wenlei He	2ea4c2c598	[BFI] Make BFI information available through loop passes inside LoopStandardAnalysisResults ~~D65060 uncovered that trying to use BFI in loop passes can lead to non-deterministic behavior when blocks are re-used while retaining old BFI data.~~ ~~To make sure BFI is preserved through loop passes a Value Handle (VH) callback is registered on blocks themselves. When a block is freed it now also wipes out the accompanying BFI entry such that stale BFI data can no longer persist resolving the determinism issue. ~~ ~~An optimistic approach would be to incrementally update BFI information throughout the loop passes rather than only invalidating them on removed blocks. The issues with that are:~~ ~~1. It is not clear how BFI information should be incrementally updated: If a block is duplicated does its BFI information come with? How about if it's split/modified/moved around? ~~ ~~2. Assuming we can address these problems the implementation here will be a massive undertaking. ~~ ~~There's a known need of BFI in LICM analysis which requires correct but not incrementally updated BFI data. A follow-up change can register BFI in all loop passes so this preserved but potentially lossy data is available to any loop pass that wants it.~~ See: D75341 for an identical implementation of preserving BFI via VH callbacks. The previous statements do still apply but this change no longer has to be in this diff because it's already upstream 😄 . This diff also moves BFI to be a part of LoopStandardAnalysisResults since the previous method using getCachedResults now (correctly!) statically asserts (D72893) that this data isn't static through the loop passes. Testing Ninja check Reviewed By: asbirlea, nikic Differential Revision: https://reviews.llvm.org/D86156	2020-09-15 16:16:24 -07:00
Arthur Eubanks	3f69b2140f	[NewPM][opt] Fix -globals-aa not being recognized as alias analysis in NPM Was missing MODULE_ALIAS_ANALYSIS, previously only FUNCTION_ALIAS_ANALYSIS was taken into account. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D87664	2020-09-15 11:18:19 -07:00
Arthur Eubanks	10b12d4035	Reland [docs][NewPM] Add docs for writing NPM passes So as not to conflict with the legacy PM example passes under llvm/lib/Transforms/Hello, this is under HelloNew. This makes the CMakeLists.txt and general directory structure less confusing for people following the example. Much of the doc structure was taken from WritingAnLLVMPass.rst. This adds a HelloWorld pass which simply prints out each function name. More will follow after this, e.g. passes over different units of IR, analyses. https://llvm.org/docs/WritingAnLLVMPass.html contains a lot more. Relanded with missing "Support" dependency in LLVMBuild.txt. Reviewed By: ychen, asbirlea Differential Revision: https://reviews.llvm.org/D86979	2020-09-14 16:06:19 -07:00
Arthur Eubanks	39ec36415d	Revert "[docs][NewPM] Add docs for writing NPM passes" This reverts commit `c2590de30d`. Breaks shared libs build	2020-09-14 15:55:17 -07:00
Arthur Eubanks	c2590de30d	[docs][NewPM] Add docs for writing NPM passes So as not to conflict with the legacy PM example passes under llvm/lib/Transforms/Hello, this is under HelloNew. This makes the CMakeLists.txt and general directory structure less confusing for people following the example. Much of the doc structure was taken from WritingAnLLVMPass.rst. This adds a HelloWorld pass which simply prints out each function name. More will follow after this, e.g. passes over different units of IR, analyses. https://llvm.org/docs/WritingAnLLVMPass.html contains a lot more. Reviewed By: ychen, asbirlea Differential Revision: https://reviews.llvm.org/D86979	2020-09-14 13:26:03 -07:00
Teresa Johnson	226d80ebe2	[MemProf] Rename HeapProfiler to MemProfiler for consistency This is consistent with the clang option added in `7ed8124d46`, and the comments on the runtime patch in D87120. Differential Revision: https://reviews.llvm.org/D87622	2020-09-14 13:14:57 -07:00
Yevgeny Rouban	28012e00d8	[NewPM] Introduce PreserveCFG check Check that all passes, which report they preserve CFG, are really preserving CFG. A new standard instrumentation is introduced. It can be switched on/off by the flag verify-cfg-preserved, which is on by default for debug builds. Reviewers: kuhar, fedor.sergeev Differential Revision: https://reviews.llvm.org/D81558	2020-09-11 14:32:21 +07:00
Juneyoung Lee	1b9884df8d	Enable InsertFreeze flag of JumpThreading when used in LTO This patch enables inserting freeze when JumpThreading converts a select to a conditional branch when it is run in LTO. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D85534	2020-09-10 19:05:49 +09:00
Roman Lebedev	bb7d3af113	Reland [SimplifyCFG][LoopRotate] SimplifyCFG: disable common instruction hoisting by default, enable late in pipeline This was reverted in `503deec218` because it caused gigantic increase (3x) in branch mispredictions in certain benchmarks on certain CPU's, see https://reviews.llvm.org/D84108#2227365. It has since been investigated and here are the results: https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20200907/827578.html > It's an amazingly severe regression, but it's also all due to branch > mispredicts (about 3x without this). The code layout looks ok so there's > probably something else to deal with. I'm not sure there's anything we can > reasonably do so we'll just have to take the hit for now and wait for > another code reorganization to make the branch predictor a bit more happy :) > > Thanks for giving us some time to investigate and feel free to recommit > whenever you'd like. > > -eric So let's just reland this. Original commit message: I've been looking at missed vectorizations in one codebase. One particular thing that stands out is that some of the loops reach vectorizer in a rather mangled form, with weird PHI's, and some of the loops aren't even in a rotated form. After taking a more detailed look, that happened because the loop's headers were too big by then. It is evident that SimplifyCFG's common code hoisting transform is at fault there, because the pattern it handles is precisely the unrotated loop basic block structure. Surprisingly, `SimplifyCFGOpt::HoistThenElseCodeToIf()` is enabled by default, and is always run, unlike its friend, common code sinking transform, `SinkCommonCodeFromPredecessors()`, which is not enabled by default and is only run once very late in the pipeline. I'm proposing to harmonize this, and disable common code hoisting until //late// in pipeline. 
Definition of //late// may vary, here currently i've picked the same one as for code sinking, but i suppose we could enable it as soon as right after loop rotation happens. Experimentation shows that this does indeed unsurprisingly help, more loops got rotated, although other issues remain elsewhere. Now, this undoubtedly seriously shakes phase ordering. This will undoubtedly be a mixed bag in terms of both compile- and run- time performance, codesize. Since we no longer aggressively hoist+deduplicate common code, we don't pay the price of said hoisting (which wasn't big). That may allow more loops to be rotated, so we pay that price. That, in turn, may enable all the transforms that require canonical (rotated) loop form, including but not limited to vectorization, so we pay that too. And in general, no deduplication means more [duplicate] instructions going through the optimizations. But there's still late hoisting, some of them will be caught late. As per benchmarks i've run {F12360204}, this is mostly within the noise, there are some small improvements, some small regressions. One big regression i saw i fixed in rG8d487668d09fb0e4e54f36207f07c1480ffabbfd, but i'm sure this will expose many more pre-existing missed optimizations, as usual :S llvm-compile-time-tracker.com thoughts on this: http://llvm-compile-time-tracker.com/compare.php?from=e40315d2b4ed1e38962a8f33ff151693ed4ada63&to=c8289c0ecbf235da9fb0e3bc052e3c0d6bff5cf9&stat=instructions * this does regress compile-time by +0.5% geomean (unsurprisingly) * size impact varies; for ThinLTO it's actually an improvement The largest fallout appears to be in GVN's load partial redundancy elimination, it spends much more time in `MemoryDependenceResults::getNonLocalPointerDependency()`. Non-local `MemoryDependenceResults` is widely-known to be, uh, costly. 
There does not appear to be a proper solution to this issue, other than silencing the compile-time performance regression by tuning cut-off thresholds in `MemoryDependenceResults`, at the cost of potentially regressing run-time performance. D84609 attempts to move in that direction, but the path is unclear and is going to take some time. If we look at stats before/after diffs, some excerpts: * RawSpeed (the target) {F12360200} * -14 (-73.68%) loops not rotated due to the header size (yay) * -272 (-0.67%) `"Number of live out of a loop variables"` - good for vectorizer * -3937 (-64.19%) common instructions hoisted * +561 (+0.06%) x86 asm instructions * -2 basic blocks * +2418 (+0.11%) IR instructions * vanilla test-suite + RawSpeed + darktable {F12360201} * -36396 (-65.29%) common instructions hoisted * +1676 (+0.02%) x86 asm instructions * +662 (+0.06%) basic blocks * +4395 (+0.04%) IR instructions It is likely to be sub-optimal when optimizing for code size, so one might want to tune the pipeline by enabling sinking/hoisting when optimizing for size. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D84108 This reverts commit `503deec218`.	2020-09-08 00:24:03 +03:00
Arthur Eubanks	c9771391ce	[NewPM][Lint] Port -lint to NewPM This also changes -lint from an analysis to a pass. It's similar to -verify, and that is a normal pass, and lives in llvm/IR. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D87057	2020-09-03 13:03:44 -07:00
Jamie Schmeiser	b2e65cf950	Revert "Add new hidden option -print-changed which only reports changes to IR" This reverts commit `7bc9924cb2` due to failure caused by missing a space between trailing >>, required by some versions of C++.	2020-09-03 18:41:20 +00:00
Jamie Schmeiser	7bc9924cb2	Add new hidden option -print-changed which only reports changes to IR A new hidden option -print-changed is added along with code to support printing the IR as it passes through the opt pipeline in the new pass manager. Only those passes that change the IR are reported, with others only having the banner reported, indicating that they did not change the IR, were filtered out or ignored. Filtering of output via the -filter-print-funcs is supported and a new supporting hidden option -filter-passes is added. The latter takes a comma separated list of pass names and filters the output to only show those passes in the list that change the IR. The output can also be modified via the -print-module-scope function. The code introduces a template base class that generalizes the comparison of IRs that takes an IR representation as template parameter. The constructor takes a series of lambdas that provide an event based API for generalized reporting of IRs as they are changed in the opt pipeline through the new pass manager. The first of several instantiations is provided that prints the IR in a form similar to that produced by -print-after-all with the above mentioned filtering capabilities. This version, and the others to follow will be introduced at the upcoming developer's conference. See https://hotcrp.llvm.org/usllvm2020/paper/29 for more information. Reviewed By: yrouban (Yevgeny Rouban) Differential Revision: https://reviews.llvm.org/D86360	2020-09-03 15:52:35 +00:00
Arthur Eubanks	e440b4933a	Revert "[NewPM][Lint] Port -lint to NewPM" This reverts commit `883399c840`.	2020-09-02 21:34:29 -07:00
Arthur Eubanks	883399c840	[NewPM][Lint] Port -lint to NewPM This also changes -lint from an analysis to a pass. It's similar to -verify, and that is a normal pass, and lives in llvm/IR. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D87057	2020-09-02 21:13:01 -07:00
Arthur Eubanks	cfde93e5d6	[ObjCARCOpt] Port objc-arc to NPM Since doInitialization() in the legacy pass modifies the module, the NPM pass is a Module pass. Reviewed By: ahatanak, ychen Differential Revision: https://reviews.llvm.org/D86178	2020-08-28 12:59:33 -07:00

781 Commits