llvm-project

Commit Graph

Author	SHA1	Message	Date
Evan Cheng	1355bbdd11	Be careful about scheduling nodes above previous calls. It increase usages of more callee-saved registers and introduce copies. Only allows it if scheduling a node above calls would end up lessen register pressure. Call operands also has added ABI restrictions for register allocation, so be extra careful with hoisting them above calls. rdar://9329627 llvm-svn: 130245	2011-04-26 21:31:35 +00:00
Chris Lattner	0ab5e2cded	Fix a ton of comment typos found by codespell. Patch by Luis Felipe Strano Moraes! llvm-svn: 129558	2011-04-15 05:18:47 +00:00
Andrew Trick	bfbd972b1f	In the pre-RA scheduler, maintain cmp+br proximity. This is done by pushing physical register definitions close to their use, which happens to handle flag definitions if they're not glued to the branch. This seems to be generally a good thing though, so I didn't need to add a target hook yet. The primary motivation is to generate code closer to what people expect and rule out missed opportunity from enabling macro-op fusion. As a side benefit, we get several 2-5% gains on x86 benchmarks. There is one regression: SingleSource/Benchmarks/Shootout/lists slows down be -10%. But this is an independent scheduler bug that will be tracked separately. See rdar://problem/9283108. Incidentally, pre-RA scheduling is only half the solution. Fixing the later passes is tracked by: <rdar://problem/8932804> [pre-RA-sched] on x86, attempt to schedule CMP/TEST adjacent with condition jump Fixes: <rdar://problem/9262453> Scheduler unnecessary break of cmp/jump fusion llvm-svn: 129508	2011-04-14 05:15:06 +00:00
Andrew Trick	b53a00d2cb	Recommit r129383. PreRA scheduler heuristic fixes: VRegCycle, TokenFactor latency. Additional fixes: Do something reasonable for subtargets with generic itineraries by handle node latency the same as for an empty itinerary. Now nodes default to unit latency unless an itinerary explicitly specifies a zero cycle stage or it is a TokenFactor chain. Original fixes: UnitsSharePred was a source of randomness in the scheduler: node priority depended on the queue data structure. I rewrote the recent VRegCycle heuristics to completely replace the old heuristic without any randomness. To make the ndoe latency adjustments work, I also needed to do something a little more reasonable with TokenFactor. I gave it zero latency to its consumers and always schedule it as low as possible. llvm-svn: 129421	2011-04-13 00:38:32 +00:00
Andrew Trick	1b60ad6644	Revert 129383. It causes some targets to hit a scheduler assert. llvm-svn: 129385	2011-04-12 20:14:07 +00:00
Andrew Trick	c5dd24a542	PreRA scheduler heuristic fixes: VRegCycle, TokenFactor latency. UnitsSharePred was a source of randomness in the scheduler: node priority depended on the queue data structure. I rewrote the recent VRegCycle heuristics to completely replace the old heuristic without any randomness. To make these heuristic adjustments to node latency work, I also needed to do something a little more reasonable with TokenFactor. I gave it zero latency to its consumers and always schedule it as low as possible. llvm-svn: 129383	2011-04-12 19:54:36 +00:00
Andrew Trick	2ad0b37318	Added a check in the preRA scheduler for potential interference on a induction variable. The preRA scheduler is unaware of induction vars, so we look for potential "virtual register cycles" instead. Fixes <rdar://problem/8946719> Bad scheduling prevents coalescing llvm-svn: 129100	2011-04-07 19:54:57 +00:00
Andrew Trick	072ed2ee0d	Improve pre-RA-sched register pressure tracking for duplicate operands. This helps cases like 2008-07-19-movups-spills.ll, but doesn't have an obvious impact on benchmarks llvm-svn: 127347	2011-03-09 19:12:43 +00:00
Eric Christopher	7238cba180	Fix some latent bugs if the nodes are unschedulable. We'd gotten away with this before since none of the register tracking or nightly tests had unschedulable nodes. This should probably be refixed with a special default Node that just returns some "don't touch me" values. Fixes PR9427 llvm-svn: 127263	2011-03-08 19:35:47 +00:00
Andrew Trick	d7f4c21684	Fix for -sched-high-latency-cycles in sched=list-ilp mode. llvm-svn: 127071	2011-03-05 09:18:16 +00:00
Andrew Trick	641e2d4f8c	Increased the register pressure limit on x86_64 from 8 to 12 regs. This is the only change in this checkin that may affects the default scheduler. With better register tracking and heuristics, it doesn't make sense to artificially lower the register limit so much. Added -sched-high-latency-cycles and X86InstrInfo::isHighLatencyDef to give the scheduler a way to account for div and sqrt on targets that don't have an itinerary. It is currently defaults to 10 (the actual number doesn't matter much), but only takes effect on non-default schedulers: list-hybrid and list-ilp. Added several heuristics that can be individually disabled for the non-default sched=list-ilp mode. This helps us determine how much better we can do on a given benchmark than the default scheduler. Certain compute intensive loops run much faster in this mode with the right set of heuristics, and it doesn't seem to have much negative impact elsewhere. Not all of the heuristics are needed, but we still need to experiment to decide which should be disabled by default for sched=list-ilp. llvm-svn: 127067	2011-03-05 08:00:22 +00:00
Andrew Trick	d0548ae750	Introducing a new method of tracking register pressure. We can't precisely track pressure on a selection DAG, but we can at least keep it balanced. This design accounts for various interesting aspects of selection DAGS: register and subregister copies, glued nodes, dead nodes, unused registers, etc. Added SUnit::NumRegDefsLeft and ScheduleDAGSDNodes::RegDefIter. Note: I disabled PrescheduleNodesWithMultipleUses when register pressure is enabled, based on no evidence other than I don't think it makes sense to have both enabled. llvm-svn: 124853	2011-02-04 03:18:17 +00:00
Andrew Trick	3f924e4e87	whitespace llvm-svn: 124827	2011-02-03 23:00:17 +00:00
Devang Patel	92b7077f9e	Reapply 124301 llvm-svn: 124339	2011-01-27 00:13:27 +00:00
Devang Patel	b370bf329a	Revert 124301. llvm-svn: 124327	2011-01-26 21:41:22 +00:00
Devang Patel	9d4eb2f480	Process valid SDDbgValues even if the node does not have any order assigned. llvm-svn: 124301	2011-01-26 18:42:32 +00:00
Devang Patel	1448e7c8b6	Refactor. llvm-svn: 124300	2011-01-26 18:20:04 +00:00
Devang Patel	04b649d48a	This assertion is too restrictive, it does not apply for dangling dbg value nodes (nodes where dbg.value intrinsic preceds use of the value). llvm-svn: 124202	2011-01-25 18:09:33 +00:00
Chris Lattner	11a33811b6	flags -> glue for selectiondag llvm-svn: 122509	2010-12-23 17:24:32 +00:00
Chris Lattner	3e5fbd74ed	rename MVT::Flag to MVT::Glue. "Flag" is a terrible name for something that just glues two nodes together, even if it is sometimes used for flags. llvm-svn: 122310	2010-12-21 02:38:05 +00:00
Evan Cheng	debf9c502a	Two sets of changes. Sorry they are intermingled. 1. Fix pre-ra scheduler so it doesn't try to push instructions above calls to "optimize for latency". Call instructions don't have the right latency and this is more likely to use introduce spills. 2. Fix if-converter cost function. For ARM, it should use instruction latencies, not # of micro-ops since multi-latency instructions is completely executed even when the predicate is false. Also, some instruction will be "slower" when they are predicated due to the register def becoming implicit input. rdar://8598427 llvm-svn: 118135	2010-11-03 00:45:17 +00:00
Evan Cheng	6c1414f9c2	Avoiding overly aggressive latency scheduling. If the two nodes share an operand and one of them has a single use that is a live out copy, favor the one that is live out. Otherwise it will be difficult to eliminate the copy if the instruction is a loop induction variable update. e.g. BB: sub r1, r3, #1 str r0, [r2, r3] mov r3, r1 cmp bne BB => BB: str r0, [r2, r3] sub r3, r3, #1 cmp bne BB This fixed the recent 256.bzip2 regression. llvm-svn: 117675	2010-10-29 18:09:28 +00:00
Evan Cheng	ff310737e5	Re-commit 117518 and 117519 now that ARM MC test failures are out of the way. llvm-svn: 117531	2010-10-28 06:47:08 +00:00
Evan Cheng	e2c211c1b9	Revert 117518 and 117519 for now. They changed scheduling and cause MC tests to fail. Ugh. llvm-svn: 117520	2010-10-28 02:00:25 +00:00
Evan Cheng	523fa3a2e8	Fix a major bug in operand latency computation. The use index must be adjusted by the number of defs first for it to match the instruction itinerary. llvm-svn: 117518	2010-10-28 01:46:29 +00:00
Evan Cheng	49d4c0bd18	- Add TargetInstrInfo::getOperandLatency() to compute operand latencies. This allow target to correctly compute latency for cases where static scheduling itineraries isn't sufficient. e.g. variable_ops instructions such as ARM::ldm. This also allows target without scheduling itineraries to compute operand latencies. e.g. X86 can return (approximated) latencies for high latency instructions such as division. - Compute operand latencies for those defined by load multiple instructions, e.g. ldm and those used by store multiple instructions, e.g. stm. llvm-svn: 115755	2010-10-06 06:27:31 +00:00
Evan Cheng	4a010fd1ea	Model Cortex-a9 load to SUB, RSB, ADD, ADC, SBC, RSC, CMN, MVN, or CMP pipeline forwarding path. llvm-svn: 115098	2010-09-29 22:42:35 +00:00
Evan Cheng	bf4070756f	Teach if-converter to be more careful with predicating instructions that would take multiple cycles to decode. For the current if-converter clients (actually only ARM), the instructions that are predicated on false are not nops. They would still take machine cycles to decode. Micro-coded instructions such as LDM / STM can potentially take multiple cycles to decode. If-converter should take treat them as non-micro-coded simple instructions. llvm-svn: 113570	2010-09-10 01:29:16 +00:00
Evan Cheng	23ef829096	Add missing null check reported by Amaury Pouly. llvm-svn: 110649	2010-08-10 02:39:45 +00:00
Dan Gohman	a64a323564	Fix a bug in the code which re-inserts DBG_VALUE nodes after scheduling; if a block is split (by a custom inserter), the insert point may be in a different block than it was originally. This fixes 32-bit llvm-gcc bootstrap builds, and I haven't been able to reproduce it otherwise. llvm-svn: 108060	2010-07-10 22:42:31 +00:00
Dan Gohman	d7b5ce3312	Reapply bottom-up fast-isel, with several fixes for x86-32: - Check getBytesToPopOnReturn(). - Eschew ST0 and ST1 for return values. - Fix the PIC base register initialization so that it doesn't ever fail to end up the top of the entry block. llvm-svn: 108039	2010-07-10 09:00:22 +00:00
Bob Wilson	6586e9b203	--- Reverse-merging r107947 into '.': U utils/TableGen/FastISelEmitter.cpp --- Reverse-merging r107943 into '.': U test/CodeGen/X86/fast-isel.ll U test/CodeGen/X86/fast-isel-loads.ll U include/llvm/Target/TargetLowering.h U include/llvm/Support/PassNameParser.h U include/llvm/CodeGen/FunctionLoweringInfo.h U include/llvm/CodeGen/CallingConvLower.h U include/llvm/CodeGen/FastISel.h U include/llvm/CodeGen/SelectionDAGISel.h U lib/CodeGen/LLVMTargetMachine.cpp U lib/CodeGen/CallingConvLower.cpp U lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp U lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp U lib/CodeGen/SelectionDAG/FastISel.cpp U lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp U lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp U lib/CodeGen/SelectionDAG/InstrEmitter.cpp U lib/CodeGen/SelectionDAG/TargetLowering.cpp U lib/Target/XCore/XCoreISelLowering.cpp U lib/Target/XCore/XCoreISelLowering.h U lib/Target/X86/X86ISelLowering.cpp U lib/Target/X86/X86FastISel.cpp U lib/Target/X86/X86ISelLowering.h llvm-svn: 107987	2010-07-09 16:37:18 +00:00
Dan Gohman	0b5aa1cdd3	Re-apply bottom-up fast-isel, with fixes. Be very careful to avoid emitting a DBG_VALUE after a terminator, or emitting any instructions before an EH_LABEL. llvm-svn: 107943	2010-07-09 00:39:23 +00:00
Jim Grosbach	caf9b3ab7d	grammar tweak in comment. llvm-svn: 107321	2010-06-30 21:27:56 +00:00
Rafael Espindola	38a7d7cbc3	Add a VT argument to getMinimalPhysRegClass and replace the copy related uses of getPhysicalRegisterRegClass with it. If we want to make a copy (or estimate its cost), it is better to use the smallest class as more efficient operations might be possible. llvm-svn: 107140	2010-06-29 14:02:34 +00:00
Duncan Sands	2dc70bea54	Remove variables which are assigned to but for which the value is not used. Spotted by gcc-4.6. llvm-svn: 106854	2010-06-25 14:48:39 +00:00
Bill Wendling	2d3c490026	It's possible that a flag is added to the SDNode that points back to the original SDNode. This is badness. Also, this function allows one SDNode to point multiple flags to another SDNode. Badness as well. llvm-svn: 106793	2010-06-24 22:00:37 +00:00
Bill Wendling	a136521a17	MorphNodeTo doesn't preserve the memory operands. Because we're morphing a node into the same node, but with different non-memory operands, we need to replace the memory operands after it's finished morphing. llvm-svn: 106643	2010-06-23 18:16:24 +00:00
Dan Gohman	dd41bba517	Use A.append(...) instead of A.insert(A.end(), ...) when A is a SmallVector, and other SmallVector simplifications. llvm-svn: 106452	2010-06-21 19:47:52 +00:00
Evan Cheng	38f6560461	Code refactoring, no functionality changes. llvm-svn: 105775	2010-06-10 02:09:31 +00:00
Evan Cheng	cc2efe11db	Fix some latency computation bugs: if the use is not a machine opcode do not just return zero. llvm-svn: 105061	2010-05-28 23:26:21 +00:00
Evan Cheng	4401f8873c	Allow targets more controls on what nodes are scheduled by reg pressure, what for latency in hybrid mode. llvm-svn: 104293	2010-05-20 23:26:43 +00:00
Evan Cheng	bdd062dae0	Add a hybrid bottom up scheduler that reduce register usage while avoiding pipeline stall. It's useful for targets like ARM cortex-a8. NEON has a lot of long latency instructions so a strict register pressure reduction scheduler does not work well. Early experiments show this speeds up some NEON loops by over 30%. llvm-svn: 104216	2010-05-20 06:13:19 +00:00
Evan Cheng	70e506e18a	Code clean up. llvm-svn: 104173	2010-05-19 22:42:23 +00:00
Dan Gohman	25c1653700	Get rid of the EdgeMapping map. Instead, just check for BasicBlock changes before doing phi lowering for switches. llvm-svn: 102809	2010-05-01 00:01:06 +00:00
Dan Gohman	8acc8f7dfd	EmitDbgValue doesn't need its EdgeMapping argument. llvm-svn: 102742	2010-04-30 19:35:33 +00:00
Dale Johannesen	e098352ed1	Add DBG_VALUE handling for byval parameters; this produces a comment on targets that support it, but the Dwarf writer is not hooked up yet. llvm-svn: 102372	2010-04-26 20:06:49 +00:00
Evan Cheng	ed69b382ea	- Move TargetLowering::EmitTargetCodeForFrameDebugValue to TargetInstrInfo and rename it to emitFrameIndexDebugValue. - Teach spiller to modify DBG_VALUE instructions to reference spill slots. llvm-svn: 102323	2010-04-26 07:38:55 +00:00
Dan Gohman	1f0f2142cc	Fix -Wcast-qual warnings. llvm-svn: 101655	2010-04-17 17:42:52 +00:00
Evan Cheng	1889440b52	Scheduler assumes SDDbgValue nodes are in source order. That's true currently. But add an assertion to verify it. llvm-svn: 99501	2010-03-25 07:16:57 +00:00

1 2

91 Commits