llvm-project

Commit Graph

Author	SHA1	Message	Date
Evan Cheng	4266a79351	Add a if-conversion optimization that allows 'true' side of a diamond to be unpredicated. That is, turn subeq r0, r1, #1 addne r0, r1, #1 into sub r0, r1, #1 addne r0, r1, #1 For targets where conditional instructions are always executed, this may be beneficial. It may remove pseudo anti-dependency in out-of-order execution CPUs. e.g. op r1, ... str r1, [r10] ; end-of-life of r1 as div result cmp r0, #65 movne r1, #44 ; raw dependency on previous r1 moveq r1, #12 If movne is unpredicated, then op r1, ... str r1, [r10] cmp r0, #65 mov r1, #44 ; r1 written unconditionally moveq r1, #12 Both mov and moveq are no longer depdendent on the first instruction. This gives the out-of-order execution engine more freedom to reorder them. This has passed entire LLVM test suite. But it has not been enabled for any ARM variant pending more performance evaluation. rdar://8951196 llvm-svn: 146914	2011-12-19 22:01:30 +00:00
Evan Cheng	7f8e563a69	Add bundle aware API for querying instruction properties and switch the code generator to it. For non-bundle instructions, these behave exactly the same as the MC layer API. For properties like mayLoad / mayStore, look into the bundle and if any of the bundled instructions has the property it would return true. For properties like isPredicable, only return true if all of the bundled instructions have the property. For properties like canFoldAsLoad, isCompare, conservatively return false for bundles. llvm-svn: 146026	2011-12-07 07:15:52 +00:00
Pete Cooper	77c703f11c	Added missing &. Fixes <rdar://problem/10393723> llvm-svn: 143753	2011-11-04 23:49:14 +00:00
Jakub Staszak	3ef20e35f9	Fix typo in #include which revealed in the case-sensitive filesystem. llvm-svn: 136828	2011-08-03 22:53:41 +00:00
Jakub Staszak	15e5b742ad	Use MachineBranchProbabilityInfo in If-Conversion instead of its own heuristics. llvm-svn: 136826	2011-08-03 22:34:43 +00:00
Jakub Staszak	7987ea7460	Revert patch which broke some IfConversion tests. llvm-svn: 135738	2011-07-22 00:55:15 +00:00
Jakub Staszak	76d711582c	Fix typo in #include which revealed in the case-sensitive filesystem. llvm-svn: 135734	2011-07-22 00:39:00 +00:00
Jakub Staszak	44860314d2	Use MachineBranchProbabilityInfo instead of MachineLoopInfo in IfConversion. llvm-svn: 135724	2011-07-21 23:48:55 +00:00
Jakub Staszak	9b07c0ab6b	Use BranchProbability instead of floating points in IfConverter. llvm-svn: 134858	2011-07-10 02:58:07 +00:00
Jakub Staszak	a4a18f092c	Don't analyze block if it's not considered for ifcvt anymore. llvm-svn: 134856	2011-07-10 02:00:16 +00:00
Evan Cheng	8264e272a9	Sink SubtargetFeature and TargetInstrItineraries (renamed MCInstrItineraries) into MC. llvm-svn: 134049	2011-06-29 01:14:12 +00:00
Evan Cheng	6cc775f905	- Rename TargetInstrDesc, TargetOperandInfo to MCInstrDesc and MCOperandInfo and sink them into MC layer. - Added MCInstrInfo, which captures the tablegen generated static data. Chang TargetInstrInfo so it's based off MCInstrInfo. llvm-svn: 134021	2011-06-28 19:10:37 +00:00
Evan Cheng	cfdf33904b	Re-commit 131172 with fix. MachineInstr identity checks should check dead markers. In some cases a register def is dead on one path, but not on another. This is passing Clang self-hosting. llvm-svn: 131214	2011-05-12 00:56:58 +00:00
Rafael Espindola	2a09d65979	Revert 131172 as it is causing clang to miscompile itself. I will try to provide a reduced testcase. llvm-svn: 131176	2011-05-11 03:27:17 +00:00
Evan Cheng	05fc35e275	Add a late optimization to BranchFolding that hoist common instruction sequences at the start of basic blocks to their common predecessor. It's actually quite common (e.g. about 50 times in JM/lencod) and has shown to be a nice code size benefit. e.g. pushq %rax testl %edi, %edi jne LBB0_2 ## BB#1: xorb %al, %al popq %rdx ret LBB0_2: xorb %al, %al callq _foo popq %rdx ret => pushq %rax xorb %al, %al testl %edi, %edi je LBB0_2 ## BB#1: callq _foo LBB0_2: popq %rdx ret rdar://9145558 llvm-svn: 131172	2011-05-11 01:03:01 +00:00
Evan Cheng	9808d31b9e	If converter was being too cute. It look for root BBs (which don't have successors) and use inverse depth first search to traverse the BBs. However that doesn't work when the CFG has infinite loops. Simply do a linear traversal of all BBs work just fine. rdar://9344645 llvm-svn: 130324	2011-04-27 19:32:43 +00:00
Benjamin Kramer	63abc84630	Prune includes. llvm-svn: 118342	2010-11-06 11:45:59 +00:00
Evan Cheng	debf9c502a	Two sets of changes. Sorry they are intermingled. 1. Fix pre-ra scheduler so it doesn't try to push instructions above calls to "optimize for latency". Call instructions don't have the right latency and this is more likely to use introduce spills. 2. Fix if-converter cost function. For ARM, it should use instruction latencies, not # of micro-ops since multi-latency instructions is completely executed even when the predicate is false. Also, some instruction will be "slower" when they are predicated due to the register def becoming implicit input. rdar://8598427 llvm-svn: 118135	2010-11-03 00:45:17 +00:00
Bob Wilson	e1961fe289	When the "true" and "false" blocks of a diamond if-conversion are the same, do not double-count the duplicate instructions by counting once from the beginning and again from the end. Keep track of where the duplicates from the beginning ended and don't go past that point when counting duplicates at the end. Radar 8589805. This change causes one of the MC/ARM/simple-fp-encoding tests to produce different (better!) code without the vmovne instruction being tested. I changed the test to produce vmovne and vmoveq instructions but moving between register files in the opposite direction. That's not quite the same but predicated versions of those instructions weren't being tested before, so at least the test coverage is not any worse, just different. llvm-svn: 117333	2010-10-26 00:02:24 +00:00
Bob Wilson	efd360c535	Change if-conversion to keep track of the extra cost due to microcoded instructions separately from the count of non-predicated instructions. The instruction count is used in places to determine how many instructions to copy, predicate, etc. and things get confused if that count includes the extra cost for microcoded ops. llvm-svn: 117332	2010-10-26 00:02:21 +00:00
Owen Anderson	6c18d1aac0	Get rid of static constructors for pass registration. Instead, every pass exposes an initializeMyPassFunction(), which must be called in the pass's constructor. This function uses static dependency declarations to recursively initialize the pass's dependencies. Clients that only create passes through the createFooPass() APIs will require no changes. Clients that want to use the CommandLine options for passes will need to manually call the appropriate initialization functions in PassInitialization.h before parsing commandline arguments. I have tested this with all standard configurations of clang and llvm-gcc on Darwin. It is possible that there are problems with the static dependencies that will only be visible with non-standard options. If you encounter any crash in pass registration/creation, please send the testcase to me directly. llvm-svn: 116820	2010-10-19 17:21:58 +00:00
Owen Anderson	8ac477ffb5	Begin adding static dependence information to passes, which will allow us to perform initialization without static constructors AND without explicit initialization by the client. For the moment, passes are required to initialize both their (potential) dependencies and any passes they preserve. I hope to be able to relax the latter requirement in the future. llvm-svn: 116334	2010-10-12 19:48:12 +00:00
Owen Anderson	df7a4f2515	Now with fewer extraneous semicolons! llvm-svn: 115996	2010-10-07 22:25:06 +00:00
Owen Anderson	f31f33ea89	Thread the determination of branch prediction hit rates back through the if-conversion heuristic APIs. For now, stick with a constant estimate of 90% (branch predictors are good!), but we might find that we want to provide more nuanced estimates in the future. llvm-svn: 115364	2010-10-01 22:45:50 +00:00
Benjamin Kramer	2016f0eaac	Silence msvc warnings. llvm-svn: 115097	2010-09-29 22:38:50 +00:00
Owen Anderson	1b35f4cc66	Give the if-converter access to MachineLoopInfo, and use it to generate plausible branch prediction estimates. llvm-svn: 114981	2010-09-28 20:42:15 +00:00
Owen Anderson	88af7d00fc	Part one of switching to using a more sane heuristic for determining if-conversion profitability. Rather than having arbitrary cutoffs, actually try to cost model the conversion. For now, the constants are tuned to more or less match our existing behavior, but these will be changed to reflect realistic values as this work proceeds. llvm-svn: 114973	2010-09-28 18:32:13 +00:00
Evan Cheng	bf4070756f	Teach if-converter to be more careful with predicating instructions that would take multiple cycles to decode. For the current if-converter clients (actually only ARM), the instructions that are predicated on false are not nops. They would still take machine cycles to decode. Micro-coded instructions such as LDM / STM can potentially take multiple cycles to decode. If-converter should take treat them as non-micro-coded simple instructions. llvm-svn: 113570	2010-09-10 01:29:16 +00:00
Owen Anderson	a7aed18624	Reapply r110396, with fixes to appease the Linux buildbot gods. llvm-svn: 110460	2010-08-06 18:33:48 +00:00
Owen Anderson	bda59bd247	Revert r110396 to fix buildbots. llvm-svn: 110410	2010-08-06 00:23:35 +00:00
Owen Anderson	755aceb5d0	Don't use PassInfo* as a type identifier for passes. Instead, use the address of the static ID member as the sole unique type identifier. Clean up APIs related to this change. llvm-svn: 110396	2010-08-05 23:42:04 +00:00
Owen Anderson	a57b97e7e7	Fix batch of converting RegisterPass<> to INTIALIZE_PASS(). llvm-svn: 109045	2010-07-21 22:09:45 +00:00
Bob Wilson	1e5da550e5	Reapply my if-conversion cleanup from svn r106939 with fixes. There are 2 changes relative to the previous version of the patch: 1) For the "simple" if-conversion case, there's no need to worry about RemoveExtraEdges not handling an unanalyzable branch. Predicated terminators are ignored in this context, so RemoveExtraEdges does the right thing. This might break someday if we ever treat indirect branches (BRIND) as predicable, but for now, I just removed this part of the patch, because in the case where we do not add an unconditional branch, we rely on keeping the fall-through edge to CvtBBI (which is empty after this transformation). The change relative to the previous patch is: @@ -1036,10 +1036,6 @@ IterIfcvt = false; } - // RemoveExtraEdges won't work if the block has an unanalyzable branch, - // which is typically the case for IfConvertSimple, so explicitly remove - // CvtBBI as a successor. - BBI.BB->removeSuccessor(CvtBBI->BB); RemoveExtraEdges(BBI); // Update block info. BB can be iteratively if-converted. 2) My patch exposed a bug in the code for merging the tail of a "diamond", which had previously never been exercised. The code was simply checking that the tail had a single predecessor, but there was a case in MultiSource/Benchmarks/VersaBench/dbms where that single predecessor was neither edge of the diamond. I added the following change to check for that: @@ -1276,7 +1276,18 @@ // tail, add a unconditional branch to it. if (TailBB) { BBInfo TailBBI = BBAnalysis[TailBB->getNumber()]; - if (TailBB->pred_size() == 1 && !TailBBI.HasFallThrough) { + bool CanMergeTail = !TailBBI.HasFallThrough; + // There may still be a fall-through edge from BBI1 or BBI2 to TailBB; + // check if there are any other predecessors besides those. + unsigned NumPreds = TailBB->pred_size(); + if (NumPreds > 1) + CanMergeTail = false; + else if (NumPreds == 1 && CanMergeTail) { + MachineBasicBlock::pred_iterator PI = TailBB->pred_begin(); + if (PI != BBI1->BB && PI != BBI2->BB) + CanMergeTail = false; + } + if (CanMergeTail) { MergeBlocks(BBI, TailBBI); TailBBI.IsDone = true; } else { With these fixes, I was able to run all the SingleSource and MultiSource tests successfully. llvm-svn: 107110	2010-06-29 00:55:23 +00:00
Jim Grosbach	ee6e29aa72	new, no longer brain-dead, r106907 llvm-svn: 107060	2010-06-28 20:26:00 +00:00
Daniel Dunbar	b8c058cbb0	Revert r106907, "make sure to handle dbg_value instructions in the middle of the block, not...", it caused a bunch of nightly test regressions. llvm-svn: 107009	2010-06-28 15:47:17 +00:00
Bob Wilson	418e64a385	Revert my if-conversion cleanup since it caused a bunch of nightly test regressions. --- Reverse-merging r106939 into '.': U test/CodeGen/Thumb2/thumb2-ifcvt3.ll U lib/CodeGen/IfConversion.cpp llvm-svn: 106951	2010-06-26 17:47:06 +00:00
Bob Wilson	c72da6bb56	Clean up some problems with extra CFG edges being introduced during if-conversion. The RemoveExtraEdges function doesn't work for blocks that end with unanalyzable branches, so in those cases, the "extra" edges must be explicitly removed. The CopyAndPredicateBlock and MergeBlocks methods can also avoid copying successor edges due to branches that have already been removed. The latter case is especially helpful when MergeBlocks is called for handling "diamond" if-conversions, where otherwise you can end up with some weird intermediate states in the CFG. Unfortunately I've been unable to find cases where this cleanup actually makes a significant difference in the code. There is one test where we manage to remove an empty block at the end of a function. Radar 6911268. llvm-svn: 106939	2010-06-26 04:27:33 +00:00
Jim Grosbach	c34befc78f	make sure to handle dbg_value instructions in the middle of the block, not just at the head, when doing diamond if-conversion. rdar://7797940 llvm-svn: 106907	2010-06-25 23:05:46 +00:00
Evan Cheng	02b184de5b	Change if-conversion block size limit checks to add some flexibility. llvm-svn: 106901	2010-06-25 22:42:03 +00:00
Jim Grosbach	8a6deefec6	80 column and typo fix llvm-svn: 106894	2010-06-25 22:02:28 +00:00
Dan Gohman	d2d1ae105d	Use pre-increment instead of post-increment when the result is not used. llvm-svn: 106542	2010-06-22 15:08:57 +00:00
Bob Wilson	4581434c27	Tidy. llvm-svn: 106383	2010-06-19 05:33:57 +00:00
Evan Cheng	2d51c7c592	Allow ARM if-converter to be run after post allocation scheduling. - This fixed a number of bugs in if-converter, tail merging, and post-allocation scheduler. If-converter now runs branch folding / tail merging first to maximize if-conversion opportunities. - Also changed the t2IT instruction slightly. It now defines the ITSTATE register which is read by instructions in the IT block. - Added Thumb2 specific hazard recognizer to ensure the scheduler doesn't change the instruction ordering in the IT block (since IT mask has been finalized). It also ensures no other instructions can be scheduled between instructions in the IT block. This is not yet enabled. llvm-svn: 106344	2010-06-18 23:09:54 +00:00
Evan Cheng	cf9e8a987f	Fix an inverted condition. llvm-svn: 106330	2010-06-18 22:17:13 +00:00
Evan Cheng	c0e0d85b18	Teach iff-converter to properly count # of dups. It was not skipping over dbg_value's which resulted in non-duplicated instructions being deleted. rdar://8104384. llvm-svn: 106323	2010-06-18 21:52:57 +00:00
Bob Wilson	f82c8fcc58	Fix PR7372: Conditional branches (at least on ARM) are treated as predicated, so when IfConverter::CopyAndPredicateBlock checks to see if it should ignore an instruction because it is a branch, it should not check if the branch is predicated. This case (when IgnoreBr is true) is only relevant from IfConvertTriangle, where new branches are inserted after the block has been copied and predicated. If the original branch is not removed, we end up with multiple conditional branches (possibly conflicting) at the end of the block. Aside from any immediate errors resulting from that, this confuses the AnalyzeBranch functions so that the branches are not analyzable. That in turn causes the IfConverter to think that the "Simple" pattern can be applied, and things go downhill fast because the "Simple" pattern does _not_ apply if the block can fall through. This is pretty fragile. If there are other degenerate cases where AnalyzeBranch fails, but where the block may still fall through, the IfConverter should not perform its "Simple" if-conversion. But, I don't know how to do that with the current AnalyzeBranch interface, so for now, the best thing seems to be to avoid creating branches that AnalyzeBranch cannot handle. Evan, please review! llvm-svn: 106291	2010-06-18 17:07:23 +00:00
Stuart Hastings	0125b6410a	Add a DebugLoc parameter to TargetInstrInfo::InsertBranch(). This addresses a longstanding deficiency noted in many FIXMEs scattered across all the targets. This effectively moves the problem up one level, replacing eleven FIXMEs in the targets with eight FIXMEs in CodeGen, plus one path through FastISel where we actually supply a DebugLoc, fixing Radar 7421831. llvm-svn: 106243	2010-06-17 22:43:56 +00:00
Evan Cheng	f128bdcb55	Make post-ra scheduling, anti-dep breaking, and register scavenger (conservatively) aware of predicated instructions. This enables ARM to move if-conversion before post-ra scheduler. llvm-svn: 106091	2010-06-16 07:35:02 +00:00
Bob Wilson	8105144fcd	Fix 80col violations, remove trailing whitespace, and clarify a comment. llvm-svn: 106057	2010-06-15 22:18:54 +00:00
Bob Wilson	fc7d739422	IfConversion's AnalyzeBlocks method always returns false; clean it up. llvm-svn: 106027	2010-06-15 18:57:15 +00:00

1 2 3 4

152 Commits