llvm-project

Roorda, Jan-Willem 4b8bcf007b [Pipeliner] Fixed node order issue related to zero latency edges

Summary:
A desired property of the node order in Swing Modulo Scheduling is that for
nodes outside circuits the following holds: none of them is scheduled after
both a successor and a predecessor. We call node orders that meet this
property valid.

Although invalid node orders do not lead to the generation of incorrect code,
they can prevent the pipeliner from finding a pipelined schedule for an
arbitrary II. The reason is that after scheduling the successor and the
predecessor of a node, no room may be left to schedule the node itself.

For data flow graphs with 0-latency edges, the node ordering algorithm of
Swing Modulo Scheduling can generate such undesired invalid node orders. This
patch fixes that.

In the remainder of this commit message, I will give an example demonstrating
the issue, explain the fix, and explain how the fix is tested.

Consider, as an example, the following data flow graph with all edge latencies
0 and all edges pointing downward.

```
    n0
   /  \
 n1    n3
   \  /
    n2
    |
    n4
```

Consider the implemented node order algorithm in top-down mode. In that mode,
the algorithm orders the nodes based on greatest Height and, in case of equal
Height, on lowest Movability. Finally, in case of equal Height and Movability,
given two nodes with an edge between them, the algorithm prefers the source
node.

In the graph, for every node, the Height and Movability are equal to 0. As
will be explained below, the algorithm can generate the order n0, n1, n2, n3,
n4. So, node n3 is scheduled after its predecessor n0 and after its successor
n2.

The reason that the algorithm can put node n2 in the order before node n3,
even though they have an edge between them in which node n3 is the source, is
the following: Suppose the algorithm has constructed the partial node order
n0, n1. Then, the nodes left to be ordered are nodes n2, n3, and n4. Suppose
that the while-loop in the implemented algorithm considers the nodes in the
order n4, n3, n2. The algorithm will start with node n4, and look for more
preferable nodes. First, node n4 will be compared with node n3. As the nodes
have equal Height and Movability and have no edge between them, the algorithm
will stick with node n4. Then node n4 is compared with node n2. Again the
Height and Movability are equal. But, this time, there is an edge between the
two nodes, and the algorithm will prefer the source node n2. As there are no
nodes left to compare, the algorithm will add node n2 to the node order,
yielding the partial node order n0, n1, n2. In this way node n2 arrives in the
node order before node n3.

To solve this, this patch introduces the ZeroLatencyHeight (ZLH) property for
nodes. It is defined as the maximum unweighted length of a path from the given
node to an arbitrary node in which each edge has latency 0. So, ZLH(n0)=3,
ZLH(n1)=ZLH(n3)=2, ZLH(n2)=1, and ZLH(n4)=0.

In this patch, the preference for a greater ZeroLatencyHeight is added in the
top-down mode of the node ordering algorithm, after the preference for a
greater Height, and before the preference for a lower Movability.

Therefore, the two allowed node orders are n0, n1, n3, n2, n4 and n0, n3, n1,
n2, n4. Both of them are valid node orders.

In the same way, the bottom-up mode of the node ordering algorithm is adapted
by introducing the ZeroLatencyDepth property for nodes.

The patch is tested by adding extra checks to the following existing
lit-tests:
  test/CodeGen/Hexagon/SUnit-boundary-prob.ll
  test/CodeGen/Hexagon/frame-offset-overflow.ll
  test/CodeGen/Hexagon/vect/vect-shuffle.ll

Before this patch, the pipeliner failed to pipeline the loops in these tests
due to invalid node orders. After the patch, the pipeliner successfully
pipelines all these loops.

Reviewers: bcahoon
Reviewed By: bcahoon
Subscribers: Ayal, mgrang, llvm-commits
Differential Revision: https://reviews.llvm.org/D43620
llvm-svn: 326925
2018-03-07 18:53:36 +00:00
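
The ZeroLatencyHeight definition and the top-down preference order described
in the commit message can be sketched in a few lines of C++. This is only an
illustration under stated assumptions: the Edge, Graph, and NodeAttrs types and
the function names are hypothetical and are not taken from
MachinePipeliner.cpp, the graph is assumed to be acyclic, and the comparison
leaves out the final "prefer the source node" tie-break.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Illustrative edge type; not the dependence class that LLVM uses.
struct Edge {
  int To;           // index of the successor node
  unsigned Latency; // latency of the dependence
};

using Graph = std::vector<std::vector<Edge>>;

// Memoized helper: maximum number of zero-latency edges on a path from N.
static int zlhOf(const Graph &G, int N, std::vector<int> &Memo) {
  assert(N >= 0 && N < static_cast<int>(G.size()) && "node out of range");
  if (Memo[N] >= 0)
    return Memo[N];
  int Best = 0;
  for (const Edge &E : G[N])
    if (E.Latency == 0)
      Best = std::max(Best, 1 + zlhOf(G, E.To, Memo));
  return Memo[N] = Best;
}

// Computes ZLH for every node. Assumes the graph is acyclic.
std::vector<int> computeZeroLatencyHeight(const Graph &G) {
  std::vector<int> Memo(G.size(), -1);
  for (int N = 0; N < static_cast<int>(G.size()); ++N)
    zlhOf(G, N, Memo);
  return Memo;
}

// Per-node attributes used by the simplified preference below.
struct NodeAttrs {
  int Height;
  int ZeroLatencyHeight;
  int Movability;
};

// Returns true if A is preferred over B in top-down mode: greater Height,
// then greater ZeroLatencyHeight, then lower Movability.
bool preferTopDown(const NodeAttrs &A, const NodeAttrs &B) {
  if (A.Height != B.Height)
    return A.Height > B.Height;
  if (A.ZeroLatencyHeight != B.ZeroLatencyHeight)
    return A.ZeroLatencyHeight > B.ZeroLatencyHeight;
  return A.Movability < B.Movability;
}

// The example graph from the commit message; index i stands for node n<i>.
// All edges have latency 0.
Graph exampleGraph() {
  Graph G(5);
  auto AddEdge = [&G](int From, int To) { G[From].push_back(Edge{To, 0}); };
  AddEdge(0, 1);
  AddEdge(0, 3);
  AddEdge(1, 2);
  AddEdge(3, 2);
  AddEdge(2, 4);
  return G;
}
```

On the example graph this yields the ZLH values listed in the commit message.
With Height and Movability equal to 0 everywhere, n3 is preferred over both n2
and n4, so n2 can no longer enter the order ahead of n3.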
AsmPrinter	Revert "Reapply "[DWARFv5] Emit file 0 to the line table.""	2018-03-07 16:27:44 +00:00
GlobalISel	GlobalISel: IRTranslate llvm.fabs.* intrinsic	2018-03-05 22:31:55 +00:00
MIRParser	[GlobalISel] Print/Parse FailedISel MachineFunction property	2018-02-28 17:55:45 +00:00
SelectionDAG	[TargetLowering] Add vector BITCAST support to SimplifyDemandedVectorElts	2018-03-06 22:32:01 +00:00
AggressiveAntiDepBreaker.cpp	[CodeGen] Don't print "pred:" and "opt:" in -debug output	2018-01-09 17:31:07 +00:00
AggressiveAntiDepBreaker.h	Fix a bunch more layering of CodeGen headers that are in Target	2017-11-17 01:07:10 +00:00
AllocationOrder.cpp	[CodeGen] Rename functions PrintReg* to printReg*	2017-11-28 12:42:37 +00:00
AllocationOrder.h	[RegAlloc, SystemZ] Increase number of LOCRs by passing "hard" regalloc hints.	2017-11-10 08:46:26 +00:00
Analysis.cpp	MachineFunction: Return reference from getFunction(); NFC	2017-12-15 22:22:58 +00:00
AntiDepBreaker.h	[CodeGen] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).	2017-09-29 21:55:49 +00:00
AtomicExpandPass.cpp	Fix a bunch more layering of CodeGen headers that are in Target	2017-11-17 01:07:10 +00:00
BasicTargetTransformInfo.cpp	Fix a bunch more layering of CodeGen headers that are in Target	2017-11-17 01:07:10 +00:00
BranchFolding.cpp	MachineFunction: Return reference from getFunction(); NFC	2017-12-15 22:22:58 +00:00
BranchFolding.h	[CodeGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).	2017-10-10 22:33:29 +00:00
BranchRelaxation.cpp	Changes in the branch relaxation algorithm.	2018-01-04 07:08:45 +00:00
BreakFalseDeps.cpp	Separate LoopTraversal, ReachingDefAnalysis and BreakFalseDeps into their own files.	2018-01-22 10:06:50 +00:00
BuiltinGCs.cpp	Sort the remaining #include lines in include/... and lib/....	2017-06-06 11:49:48 +00:00
CMakeLists.txt	Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre..	2018-01-22 22:05:25 +00:00
CalcSpillWeights.cpp	Rename LiveIntervalAnalysis.h to LiveIntervals.h	2017-12-13 02:51:04 +00:00
CallingConvLower.cpp	Fix a bunch more layering of CodeGen headers that are in Target	2017-11-17 01:07:10 +00:00
CodeGen.cpp	Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre..	2018-01-22 22:05:25 +00:00
CodeGenPrepare.cpp	Adding a width of the GEP index to the Data Layout.	2018-02-14 06:58:08 +00:00
CriticalAntiDepBreaker.cpp	[CodeGen] Don't print "pred:" and "opt:" in -debug output	2018-01-09 17:31:07 +00:00
CriticalAntiDepBreaker.h	[CodeGen] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).	2017-09-29 21:55:49 +00:00
DFAPacketizer.cpp	Fix a bunch more layering of CodeGen headers that are in Target	2017-11-17 01:07:10 +00:00
DeadMachineInstructionElim.cpp	MachineFunction: Return reference from getFunction(); NFC	2017-12-15 22:22:58 +00:00
DetectDeadLanes.cpp	Remove redundant includes from lib/CodeGen.	2017-12-13 21:30:47 +00:00
DwarfEHPrepare.cpp	Fix a bunch more layering of CodeGen headers that are in Target	2017-11-17 01:07:10 +00:00
EarlyIfConversion.cpp	MachineFunction: Return reference from getFunction(); NFC	2017-12-15 22:22:58 +00:00
EdgeBundles.cpp	[CodeGen] Unify MBB reference format in both MIR and debug output	2017-12-04 17:18:51 +00:00
ExecutionDomainFix.cpp	Fixing warnings caused by commit 323095	2018-01-22 13:24:10 +00:00
ExpandISelPseudos.cpp	Fix a bunch more layering of CodeGen headers that are in Target	2017-11-17 01:07:10 +00:00
ExpandMemCmp.cpp	[x86, MemCmpExpansion] allow 2 pairs of loads per block (PR33325)	2018-01-06 16:16:04 +00:00
ExpandPostRAPseudos.cpp	[CodeGen] Use MachineOperand::print in the MIRPrinter for MO_Register.	2017-12-07 10:40:31 +00:00
ExpandReductions.cpp	[IR] redefine 'UnsafeAlgebra' / 'reassoc' fast-math-flags and add 'trans' fast-math-flag	2017-11-06 16:27:15 +00:00
FEntryInserter.cpp	MachineFunction: Return reference from getFunction(); NFC	2017-12-15 22:22:58 +00:00
FaultMaps.cpp	Sort the remaining #include lines in include/... and lib/....	2017-06-06 11:49:48 +00:00
FuncletLayout.cpp	Sort the remaining #include lines in include/... and lib/....	2017-06-06 11:49:48 +00:00
GCMetadata.cpp	[CodeGen] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).	2017-06-07 23:53:32 +00:00
GCMetadataPrinter.cpp	[CodeGen] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).	2017-06-07 23:53:32 +00:00
GCRootLowering.cpp	MachineFunction: Return reference from getFunction(); NFC	2017-12-15 22:22:58 +00:00
GCStrategy.cpp	…
GlobalMerge.cpp	[GlobalMerge] Allow merging of dllexported variables	2018-02-12 21:14:21 +00:00
IfConversion.cpp	MachineFunction: Return reference from getFunction(); NFC	2017-12-15 22:22:58 +00:00
ImplicitNullChecks.cpp	[NFC] fix trivial typos in comments and documents	2018-01-26 08:15:29 +00:00
IndirectBrExpandPass.cpp	Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre..	2018-01-22 22:05:25 +00:00
InlineSpiller.cpp	LiveStacks: Rename LiveStack.{h|cpp} to LiveStacks.{h|cpp}; NFC	2017-12-18 23:19:44 +00:00
InterferenceCache.cpp	Report fatal error in the case of out of memory	2018-02-20 05:41:26 +00:00
InterferenceCache.h	[CodeGen] Fix build bots which uses old Clang broken in r314046. (NFC)	2017-09-22 23:55:32 +00:00
InterleavedAccessPass.cpp	Fix a bunch more layering of CodeGen headers that are in Target	2017-11-17 01:07:10 +00:00
IntrinsicLowering.cpp	[CodeGen] fix documentation comments; NFC	2017-12-15 18:34:45 +00:00
LLVMBuild.txt	LLVMCodeGen: Add ProfileData into deps corresponding to r300277.	2017-04-14 00:36:06 +00:00
LLVMTargetMachine.cpp	[CodeGen] Add a -trap-unreachable option for debugging	2018-02-12 11:06:27 +00:00
LatencyPriorityQueue.cpp	Assert correct removal of SUnit in LatencyPriorityQueue	2017-11-16 10:18:07 +00:00
LazyMachineBlockFrequencyInfo.cpp	…
LexicalScopes.cpp	MachineFunction: Return reference from getFunction(); NFC	2017-12-15 22:22:58 +00:00
LiveDebugValues.cpp	[LiveDebugValues] recognize spilled reg killed in instruction after spill	2018-01-16 14:46:05 +00:00
LiveDebugVariables.cpp	Fixup for rL326769 (RegState::Debug is being truncated to a bool)	2018-03-06 13:23:28 +00:00
LiveDebugVariables.h	[CodeGen] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).	2017-08-24 21:21:39 +00:00
LiveInterval.cpp	LiveInterval: Print weight in print() function.	2018-01-29 22:03:00 +00:00
LiveIntervalUnion.cpp	Report fatal error in the case of out of memory	2018-02-20 05:41:26 +00:00
LiveIntervals.cpp	[LiveIntervals] Handle moving up dead partial write	2018-02-26 14:42:13 +00:00
LivePhysRegs.cpp	[LivePhysRegs] Fix handling of return instructions.	2018-02-06 23:00:17 +00:00
LiveRangeCalc.cpp	Rename LiveIntervalAnalysis.h to LiveIntervals.h	2017-12-13 02:51:04 +00:00
LiveRangeCalc.h	[CodeGen] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).	2017-08-24 21:21:39 +00:00
LiveRangeEdit.cpp	LiveRangeEdit: Inline markDeadRemat() into only user; NFC	2018-01-10 22:36:26 +00:00
LiveRangeShrink.cpp	MachineFunction: Return reference from getFunction(); NFC	2017-12-15 22:22:58 +00:00
LiveRangeUtils.h	…
LiveRegMatrix.cpp	Take into account the cost of local intervals when selecting split candidate.	2018-01-31 13:31:08 +00:00
LiveRegUnits.cpp	Fix a bunch more layering of CodeGen headers that are in Target	2017-11-17 01:07:10 +00:00
LiveStacks.cpp	LiveStacks: Rename LiveStack.{h|cpp} to LiveStacks.{h|cpp}; NFC	2017-12-18 23:19:44 +00:00
LiveVariables.cpp	Remove redundant includes from lib/CodeGen.	2017-12-13 21:30:47 +00:00
LocalStackSlotAllocation.cpp	Fix a bunch more layering of CodeGen headers that are in Target	2017-11-17 01:07:10 +00:00
LoopTraversal.cpp	Fixing warnings caused by commit 323095	2018-01-22 13:24:10 +00:00
LowLevelType.cpp	[GlobalISel] Support vector-of-pointers in LLT	2017-04-19 07:23:57 +00:00
LowerEmuTLS.cpp	[TLS] use emulated TLS if the target supports only this mode	2018-02-28 17:48:55 +00:00
MIRCanonicalizerPass.cpp	Remove redundant includes from lib/CodeGen.	2017-12-13 21:30:47 +00:00
MIRPrinter.cpp	[GlobalISel] Print/Parse FailedISel MachineFunction property	2018-02-28 17:55:45 +00:00
MIRPrintingPass.cpp	Remove redundant includes from lib/CodeGen.	2017-12-13 21:30:47 +00:00
MachineBasicBlock.cpp	[CodeGen] Don't omit any redundant information in -debug output	2018-02-26 15:23:42 +00:00
MachineBlockFrequencyInfo.cpp	MachineFunction: Return reference from getFunction(); NFC	2017-12-15 22:22:58 +00:00
MachineBlockPlacement.cpp	Add hasProfileData() to check if a function has profile data. NFC.	2017-12-22 01:33:52 +00:00
MachineBranchProbabilityInfo.cpp	[CodeGen] Unify MBB reference format in both MIR and debug output	2017-12-04 17:18:51 +00:00
MachineCSE.cpp	GlobalISel: Make MachineCSE runnable in the middle of the GlobalISel	2018-01-18 02:06:56 +00:00
MachineCombiner.cpp	The final step to close D41278 [MachineCombiner] Improve debug output (NFC).	2018-02-26 09:43:21 +00:00
MachineCopyPropagation.cpp	Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding"	2018-02-27 16:59:10 +00:00
MachineDominanceFrontier.cpp	[Dominators] Make IsPostDominator a template parameter	2017-07-14 18:26:09 +00:00
MachineDominators.cpp	[Dominators] Remove verifyDomTree and add some verifying for Post Dom Trees	2018-02-28 11:00:08 +00:00
MachineFrameInfo.cpp	MachineFrameInfo: Cleanup some parameter naming inconsistencies; NFC	2017-12-05 01:18:15 +00:00
MachineFunction.cpp	[CodeGen] Don't omit any redundant information in -debug output	2018-02-26 15:23:42 +00:00
MachineFunctionPass.cpp	CodeGen: Refactor MIR parsing	2017-06-06 00:44:35 +00:00
MachineFunctionPrinterPass.cpp	Sort the remaining #include lines in include/... and lib/....	2017-06-06 11:49:48 +00:00
MachineInstr.cpp	The final step to close D41278 [MachineCombiner] Improve debug output (NFC).	2018-02-26 09:43:21 +00:00
MachineInstrBundle.cpp	Fix a bunch more layering of CodeGen headers that are in Target	2017-11-17 01:07:10 +00:00
MachineLICM.cpp	Split MachineLICM into EarlyMachineLICM and MachineLICM; NFC	2018-01-19 06:46:10 +00:00
MachineLoopInfo.cpp	Reverting r315590; it did not include changes for llvm-tblgen, which is causing link errors for several people.	2017-10-15 14:32:27 +00:00
MachineModuleInfo.cpp	MachineFunction: Slight refactoring; NFC	2017-12-15 22:22:46 +00:00
MachineModuleInfoImpls.cpp	[MachineModuleInfoImpls] Replace qsort with array_pod_sort	2017-10-26 16:07:20 +00:00
MachineOperand.cpp	[MachineOperand][Target] MachineOperand::isRenamable semantics changes	2018-02-23 18:25:08 +00:00
MachineOptimizationRemarkEmitter.cpp	[CodeGen][NFC] Rename IsVerbose to IsStandalone in Machine*::print	2018-01-18 18:05:15 +00:00
MachineOutliner.cpp	[MachineOutliner] Freeze registers in new functions	2018-01-31 20:15:16 +00:00
MachinePassRegistry.cpp	…
MachinePipeliner.cpp	[Pipeliner] Fixed node order issue related to zero latency edges	2018-03-07 18:53:36 +00:00
MachinePostDominators.cpp	[Dominators] Make IsPostDominator a template parameter	2017-07-14 18:26:09 +00:00
MachineRegionInfo.cpp	Reverting r315590; it did not include changes for llvm-tblgen, which is causing link errors for several people.	2017-10-15 14:32:27 +00:00
MachineRegisterInfo.cpp	GlobalISel: Make MachineCSE runnable in the middle of the GlobalISel	2018-01-18 02:06:56 +00:00
MachineSSAUpdater.cpp	Fix a bunch more layering of CodeGen headers that are in Target	2017-11-17 01:07:10 +00:00
MachineScheduler.cpp	[MachineScheduler] Dump SUnits before calling SchedImpl->initialize()	2018-03-05 16:31:49 +00:00
MachineSink.cpp	MachineFunction: Return reference from getFunction(); NFC	2017-12-15 22:22:58 +00:00
MachineTraceMetrics.cpp	[CodeGen] Unify MBB reference format in both MIR and debug output	2017-12-04 17:18:51 +00:00
MachineVerifier.cpp	[GlobalISel] Print/Parse FailedISel MachineFunction property	2018-02-28 17:55:45 +00:00
MacroFusion.cpp	[CodeGen] Improve the consistency of instruction fusion*	2017-12-11 21:09:27 +00:00
OptimizePHIs.cpp	MachineFunction: Return reference from getFunction(); NFC	2017-12-15 22:22:58 +00:00
PHIElimination.cpp	Rename LiveIntervalAnalysis.h to LiveIntervals.h	2017-12-13 02:51:04 +00:00
PHIEliminationUtils.cpp	…
PHIEliminationUtils.h	…
ParallelCG.cpp	Pass a reference to a module to the bitcode writer.	2018-02-14 19:11:32 +00:00
PatchableFunction.cpp	MachineFunction: Return reference from getFunction(); NFC	2017-12-15 22:22:58 +00:00
PeepholeOptimizer.cpp	PeepholeOpt cleanup/refactor; NFC	2018-01-11 22:59:33 +00:00
PostRAHazardRecognizer.cpp	Fix a bunch more layering of CodeGen headers that are in Target	2017-11-17 01:07:10 +00:00
PostRASchedulerList.cpp	MachineFunction: Return reference from getFunction(); NFC	2017-12-15 22:22:58 +00:00
PreISelIntrinsicLowering.cpp	[CodeGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).	2017-09-13 21:15:20 +00:00
ProcessImplicitDefs.cpp	[CodeGen] Unify MBB reference format in both MIR and debug output	2017-12-04 17:18:51 +00:00
PrologEpilogInserter.cpp	[PEI][NFC] Move StackSize opt-remark code next to -warn-stack code	2018-02-05 22:46:54 +00:00
PseudoSourceValue.cpp	Target/TargetInstrInfo.h -> CodeGen/TargetInstrInfo.h to match layering	2017-11-08 01:01:31 +00:00
README.txt	LiveStacks: Rename LiveStack.{h|cpp} to LiveStacks.{h|cpp}; NFC	2017-12-18 23:19:44 +00:00
ReachingDefAnalysis.cpp	Fixing warnings caused by commit 323095	2018-01-22 13:24:10 +00:00
RegAllocBase.cpp	Rename LiveIntervalAnalysis.h to LiveIntervals.h	2017-12-13 02:51:04 +00:00
RegAllocBase.h	[CodeGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).	2017-09-13 21:15:20 +00:00
RegAllocBasic.cpp	LiveStacks: Rename LiveStack.{h|cpp} to LiveStacks.{h|cpp}; NFC	2017-12-18 23:19:44 +00:00
RegAllocFast.cpp	[MachineOperand][Target] MachineOperand::isRenamable semantics changes	2018-02-23 18:25:08 +00:00
RegAllocGreedy.cpp	Take into account the cost of local intervals when selecting split candidate.	2018-01-31 13:31:08 +00:00
RegAllocPBQP.cpp	[PBQP] Fix PR33038 by pruning empty intervals in initializeGraph.	2018-02-20 22:15:09 +00:00
RegUsageInfoCollector.cpp	MachineFunction: Return reference from getFunction(); NFC	2017-12-15 22:22:58 +00:00
RegUsageInfoPropagate.cpp	MachineFunction: Return reference from getFunction(); NFC	2017-12-15 22:22:58 +00:00
RegisterClassInfo.cpp	[RegisterClassInfo] Invalidate the register pressure set limit cache when reserved regs or callee saved regs change	2018-02-14 18:53:29 +00:00
RegisterCoalescer.cpp	Rename LiveIntervalAnalysis.h to LiveIntervals.h	2017-12-13 02:51:04 +00:00
RegisterCoalescer.h	[CodeGen] Fix some Clang-tidy modernize-use-default-member-init and Include What You Use warnings; other minor fixes (NFC).	2017-09-22 23:46:57 +00:00
RegisterPressure.cpp	Report fatal error in the case of out of memory	2018-02-20 05:41:26 +00:00
RegisterScavenging.cpp	[CodeGen] Use MachineOperand::print in the MIRPrinter for MO_Register.	2017-12-07 10:40:31 +00:00
RegisterUsageInfo.cpp	[CodeGen] Always use `printReg` to print registers in both MIR and debug	2017-11-30 16:12:24 +00:00
RenameIndependentSubregs.cpp	Rename LiveIntervalAnalysis.h to LiveIntervals.h	2017-12-13 02:51:04 +00:00
ResetMachineFunctionPass.cpp	[GlobalISel] Print/Parse FailedISel MachineFunction property	2018-02-28 17:55:45 +00:00
SafeStack.cpp	[SafeStack] Use updated CreateMemCpy API to set more accurate source and destination alignments.	2018-02-12 22:39:47 +00:00
SafeStackColoring.cpp	Reverting r315590; it did not include changes for llvm-tblgen, which is causing link errors for several people.	2017-10-15 14:32:27 +00:00
SafeStackColoring.h	[CodeGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).	2017-10-10 22:33:29 +00:00
SafeStackLayout.cpp	[SafeStack] Use updated CreateMemCpy API to set more accurate source and destination alignments.	2018-02-12 22:39:47 +00:00
SafeStackLayout.h	[SafeStack] Use updated CreateMemCpy API to set more accurate source and destination alignments.	2018-02-12 22:39:47 +00:00
ScalarizeMaskedMemIntrin.cpp	Fix a bunch more layering of CodeGen headers that are in Target	2017-11-17 01:07:10 +00:00
ScheduleDAG.cpp	[CodeGen] Rename functions PrintReg* to printReg*	2017-11-28 12:42:37 +00:00
ScheduleDAGInstrs.cpp	Revert "[CodeGen] Move printing '\n' from MachineInstr::print to MachineBasicBlock::print"	2018-02-19 15:08:49 +00:00
ScheduleDAGPrinter.cpp	Remove redundant includes from lib/CodeGen.	2017-12-13 21:30:47 +00:00
ScoreboardHazardRecognizer.cpp	Target/TargetInstrInfo.h -> CodeGen/TargetInstrInfo.h to match layering	2017-11-08 01:01:31 +00:00
ShadowStackGCLowering.cpp	[CodeGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).	2017-09-13 21:15:20 +00:00
ShrinkWrap.cpp	[LV][CFG] Add irreducible CFG detection for outer loops	2018-03-02 12:24:25 +00:00
SjLjEHPrepare.cpp	[SjLj] Replace recursive block marking algorithm with iterative algorithm	2017-07-12 23:05:15 +00:00
SlotIndexes.cpp	Remove redundant includes from lib/CodeGen.	2017-12-13 21:30:47 +00:00
SpillPlacement.cpp	[CodeGen] Fix some Clang-tidy modernize-use-bool-literals and Include What You Use warnings; other minor fixes (NFC).	2017-09-21 23:20:16 +00:00
SpillPlacement.h	[CodeGen] Fix some Clang-tidy modernize-use-bool-literals and Include What You Use warnings; other minor fixes (NFC).	2017-09-21 23:20:16 +00:00
Spiller.h	[CodeGen] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).	2017-08-29 22:32:07 +00:00
SplitKit.cpp	SplitKit: Fix liveness recomputation in some remat cases.	2018-02-02 00:08:19 +00:00
SplitKit.h	SplitKit: Fix liveness recomputation in some remat cases.	2018-02-02 00:08:19 +00:00
StackColoring.cpp	MachineFunction: Return reference from getFunction(); NFC	2017-12-15 22:22:58 +00:00
StackMapLivenessAnalysis.cpp	Fix a bunch more layering of CodeGen headers that are in Target	2017-11-17 01:07:10 +00:00
StackMaps.cpp	Mark all library options as hidden.	2017-12-01 00:53:10 +00:00
StackProtector.cpp	Re-commit r319490 "XOR the frame pointer with the stack cookie when protecting the stack"	2017-12-05 20:22:20 +00:00
StackSlotColoring.cpp	LiveStacks: Rename LiveStack.{h|cpp} to LiveStacks.{h|cpp}; NFC	2017-12-18 23:19:44 +00:00
TailDuplication.cpp	Split TailDuplicatePass into pre- and post-RA variant; NFC	2018-01-19 06:08:17 +00:00
TailDuplicator.cpp	[DWARF] Allow duplication of tails with CFI instructions	2018-01-31 15:57:57 +00:00
TargetFrameLoweringImpl.cpp	MachineFunction: Return reference from getFunction(); NFC	2017-12-15 22:22:58 +00:00
TargetInstrInfo.cpp	[AMDGPU][X86][Mips] Make sure renamable bit not set for reserved regs	2018-01-29 18:47:48 +00:00
TargetLoweringBase.cpp	[SelectionDAG] Add LegalTypes flag to getShiftAmountTy. Use it to unify and simplify DAGCombiner and simplifySetCC code and fix a bug.	2018-02-20 17:41:05 +00:00
TargetLoweringObjectFileImpl.cpp	CodeGen: support an extension to pass linker options on ELF	2018-01-30 16:29:29 +00:00
TargetOptionsImpl.cpp	MachineFunction: Return reference from getFunction(); NFC	2017-12-15 22:22:58 +00:00
TargetPassConfig.cpp	[MergeICmps] Revert 324317 "Enable the MergeICmps Pass by default."	2018-03-02 14:34:49 +00:00
TargetRegisterInfo.cpp	[GISel][NFC]: Move RegisterBankInfo::getSizeInBits into TargetRegisterInfo.	2018-02-02 19:42:07 +00:00
TargetSchedule.cpp	Fix a bunch more layering of CodeGen headers that are in Target	2017-11-17 01:07:10 +00:00
TargetSubtargetInfo.cpp	Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre..	2018-01-22 22:05:25 +00:00
TwoAddressInstructionPass.cpp	MachineFunction: Return reference from getFunction(); NFC	2017-12-15 22:22:58 +00:00
UnreachableBlockElim.cpp	Target/TargetInstrInfo.h -> CodeGen/TargetInstrInfo.h to match layering	2017-11-08 01:01:31 +00:00
VirtRegMap.cpp	[MachineOperand][Target] MachineOperand::isRenamable semantics changes	2018-02-23 18:25:08 +00:00
WinEHPrepare.cpp	Use phi ranges to simplify code. No functionality change intended.	2017-12-30 15:27:33 +00:00
XRayInstrumentation.cpp	MachineFunction: Return reference from getFunction(); NFC	2017-12-15 22:22:58 +00:00

README.txt

//===---------------------------------------------------------------------===//

Common register allocation / spilling problem:

        mul lr, r4, lr
        str lr, [sp, #+52]
        ldr lr, [r1, #+32]
        sxth r3, r3
        ldr r4, [sp, #+52]
        mla r4, r3, lr, r4

can be:

        mul lr, r4, lr
        mov r4, lr
        str lr, [sp, #+52]
        ldr lr, [r1, #+32]
        sxth r3, r3
        mla r4, r3, lr, r4

and then "merge" mul and mov:

        mul r4, r4, lr
        str r4, [sp, #+52]
        ldr lr, [r1, #+32]
        sxth r3, r3
        mla r4, r3, lr, r4

It also increases the likelihood that the store becomes dead.

//===---------------------------------------------------------------------===//

bb27 ...
        ...
        %reg1037 = ADDri %reg1039, 1
        %reg1038 = ADDrs %reg1032, %reg1039, %noreg, 10
    Successors according to CFG: 0x8b03bf0 (#5)

bb76 (0x8b03bf0, LLVM BB @0x8b032d0, ID#5):
    Predecessors according to CFG: 0x8b0c5f0 (#3) 0x8b0a7c0 (#4)
        %reg1039 = PHI %reg1070, mbb<bb76.outer,0x8b0c5f0>, %reg1037, mbb<bb27,0x8b0a7c0>

Note that ADDri is not a two-address instruction. However, its result %reg1037
is an operand of the PHI node in bb76, and its operand %reg1039 is the result
of that PHI node. We should treat it as two-address code and make sure the
ADDri is scheduled after any node that reads %reg1039.

//===---------------------------------------------------------------------===//

Use local info (i.e. register scavenger) to assign it a free register to allow
reuse:
        ldr r3, [sp, #+4]
        add r3, r3, #3
        ldr r2, [sp, #+8]
        add r2, r2, #2
        ldr r1, [sp, #+4]  <==
        add r1, r1, #1
        ldr r0, [sp, #+4]
        add r0, r0, #2

//===---------------------------------------------------------------------===//

LLVM aggressively lifts common subexpressions out of loops. Sometimes this can
have negative side effects:

R1 = X + 4
R2 = X + 7
R3 = X + 15

loop:
load [i + R1]
...
load [i + R2]
...
load [i + R3]

Suppose register pressure is high, so that R1, R2, and R3 may be spilled. We
need to implement proper re-materialization to handle this:

R1 = X + 4
R2 = X + 7
R3 = X + 15

loop:
R1 = X + 4  @ re-materialized
load [i + R1]
...
R2 = X + 7 @ re-materialized
load [i + R2]
...
R3 = X + 15 @ re-materialized
load [i + R3]

Furthermore, with re-association, we can enable sharing:

R1 = X + 4
R2 = X + 7
R3 = X + 15

loop:
T = i + X
load [T + 4]
...
load [T + 7]
...
load [T + 15]
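
The re-association step relies on i + (X + c) and (i + X) + c naming the same
address. A minimal C++ sketch of that identity follows; the function names are
made up for illustration, and addresses are modeled as 64-bit unsigned integers
so that wrap-around arithmetic keeps the identity exact.

```cpp
#include <cassert>
#include <cstdint>

// Address as computed in the original loop: R = X + Offset is computed
// outside the loop and occupies its own register, then the loop adds i.
std::uint64_t addressHoisted(std::uint64_t I, std::uint64_t X,
                             std::uint64_t Offset) {
  std::uint64_t R = X + Offset;
  return I + R;
}

// Address after re-association: T = i + X is shared by all loads, and the
// constant offset is folded into each load.
std::uint64_t addressReassociated(std::uint64_t I, std::uint64_t X,
                                  std::uint64_t Offset) {
  std::uint64_t T = I + X;
  return T + Offset;
}

// Checks that both forms agree for the three offsets used in the example.
bool sameAddresses(std::uint64_t I, std::uint64_t X) {
  const std::uint64_t Offsets[] = {4, 7, 15};
  for (std::uint64_t Offset : Offsets)
    if (addressHoisted(I, X, Offset) != addressReassociated(I, X, Offset))
      return false;
  return true;
}
```

In the re-associated form only T is live across the loads, instead of R1, R2,
and R3.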

//===---------------------------------------------------------------------===//

It's not always a good idea to choose rematerialization over spilling. If all
the load / store instructions would be folded then spilling is cheaper because
it won't require new live intervals / registers. See 2003-05-31-LongShifts for
an example.

//===---------------------------------------------------------------------===//

With a copying garbage collector, derived pointers must not be retained across
collector safe points; the collector could move the objects and invalidate the
derived pointer. This is bad enough in the first place, but safe points can
crop up unpredictably. Consider:

        %array = load { i32, [0 x %obj] }** %array_addr
        %nth_el = getelementptr { i32, [0 x %obj] }* %array, i32 0, i32 %n
        %old = load %obj** %nth_el
        %z = div i64 %x, %y
        store %obj* %new, %obj** %nth_el

If the i64 division is lowered to a libcall, then a safe point will (must)
appear for the call site. If a collection occurs, %array and %nth_el no longer
point into the correct object.

The fix for this is to copy address calculations so that dependent pointers
are never live across safe point boundaries. But the loads cannot be copied
like this if there is an intervening store, so this may be hard to get right.

Only a concurrent mutator can trigger a collection at the libcall safe point.
So single-threaded programs do not have this requirement, even with a copying
collector. Still, LLVM optimizations would probably undo a front-end's careful
work.

//===---------------------------------------------------------------------===//

The ocaml frametable structure supports liveness information. It would be good
to support it.

//===---------------------------------------------------------------------===//

The FIXME in ComputeCommonTailLength in BranchFolding.cpp needs to be
revisited. The check is there to work around a misuse of directives in inline
assembly.

//===---------------------------------------------------------------------===//

It would be good to detect collector/target compatibility instead of silently
doing the wrong thing.

//===---------------------------------------------------------------------===//

It would be really nice to be able to write patterns in .td files for copies,
which would eliminate a bunch of explicit predicates on them (e.g. no side 
effects).  Once this is in place, it would be even better to have tblgen 
synthesize the various copy insertion/inspection methods in TargetInstrInfo.

//===---------------------------------------------------------------------===//

Stack coloring improvements:

1. Do proper LiveStacks analysis on all stack objects including those which are
   not spill slots.
2. Reorder objects to fill in gaps between objects.
   e.g. 4, 1, <gap>, 4, 1, 1, 1, <gap>, 4 => 4, 1, 1, 1, 1, 4, 4
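
The effect of reordering in item 2 can be checked with a small C++ sketch. It
assumes, for illustration only, that every object's alignment equals its size
and that sizes are powers of two. The function names are hypothetical, and
sorting by descending size is just one simple ordering that avoids gaps under
these assumptions, not necessarily the one a real implementation would choose.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Returns the frame size when objects are laid out in the given order.
// Assumption: each object is aligned to its own size (a power of two).
unsigned frameSize(const std::vector<unsigned> &Sizes) {
  unsigned Offset = 0;
  for (unsigned Size : Sizes) {
    assert(Size != 0 && (Size & (Size - 1)) == 0 &&
           "size must be a power of two");
    Offset = (Offset + Size - 1) & ~(Size - 1); // round up to the alignment
    Offset += Size;
  }
  return Offset;
}

// One simple ordering without padding under the assumptions above:
// largest objects first.
std::vector<unsigned> sortBySizeDescending(std::vector<unsigned> Sizes) {
  std::sort(Sizes.begin(), Sizes.end(), std::greater<unsigned>());
  return Sizes;
}
```

For the example above, the original order 4, 1, 4, 1, 1, 1, 4 occupies 20
bytes because of the two gaps, while 4, 1, 1, 1, 1, 4, 4 and the
descending-size order both occupy 16 bytes.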

//===---------------------------------------------------------------------===//

The scheduler should be able to sort nearby instructions by their address. For
example, in an expanded memset sequence it's not uncommon to see code like this:

  movl $0, 4(%rdi)
  movl $0, 8(%rdi)
  movl $0, 12(%rdi)
  movl $0, 0(%rdi)

Each of the stores is independent, and the scheduler is currently making an
arbitrary decision about the order.
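
As a rough illustration of the intended ordering, the following C++ sketch
sorts a list of stores by offset. The Store record and the function name are
hypothetical and do not correspond to any scheduler data structure. The sketch
also assumes the caller has already established that the stores share a base
register and do not depend on one another.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Illustrative store record: an immediate written at Offset from a common
// base register.
struct Store {
  int Offset;
  int Value;
};

// Orders stores by ascending offset. Only valid when the stores are mutually
// independent, which the caller must have checked.
std::vector<Store> sortByAddress(std::vector<Store> Stores) {
  std::stable_sort(
      Stores.begin(), Stores.end(),
      [](const Store &A, const Store &B) { return A.Offset < B.Offset; });
  return Stores;
}
```

Applied to the sequence above, the stores would be emitted at offsets 0, 4, 8,
and 12.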

//===---------------------------------------------------------------------===//

Another opportunity in this code is that the $0 could be moved to a register:

  movl $0, 4(%rdi)
  movl $0, 8(%rdi)
  movl $0, 12(%rdi)
  movl $0, 0(%rdi)

This would save substantial code size, especially for longer sequences like
this. It would be easy to have a rule telling isel to avoid matching MOV32mi
if the immediate has more than some fixed number of uses. It's more involved
to teach the register allocator how to do late folding to recover from
excessive register pressure.
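
The use-count rule could look roughly like the C++ sketch below. Everything in
it is an assumption made for illustration: the function name, the threshold
parameter, and the byte sizes, which are supplied by the caller rather than
derived from real instruction encodings. It is not an existing isel hook.

```cpp
#include <cassert>

// Decides whether an immediate should be materialized in a register instead
// of being folded into each store.
//   NumUses          - number of stores using the immediate
//   MaxImmediateUses - hypothetical threshold: with this many uses or fewer,
//                      keep the immediate form
//   ImmStoreBytes    - assumed size of one store-immediate instruction
//   RegStoreBytes    - assumed size of one store-register instruction
//   MaterializeBytes - assumed size of loading the immediate into a register
bool preferRegisterForImmediate(unsigned NumUses, unsigned MaxImmediateUses,
                                unsigned ImmStoreBytes, unsigned RegStoreBytes,
                                unsigned MaterializeBytes) {
  if (NumUses <= MaxImmediateUses)
    return false;
  unsigned WithImmediate = NumUses * ImmStoreBytes;
  unsigned WithRegister = MaterializeBytes + NumUses * RegStoreBytes;
  return WithRegister < WithImmediate;
}
```

With illustrative sizes of 7 bytes per store-immediate, 3 bytes per
store-register, and 2 bytes to materialize the constant, four stores cost 28
bytes in immediate form versus 14 bytes with a register. The threshold stands
in for the register-pressure concern mentioned above.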