llvm-project/llvm/test/Transforms
Kevin Qin fc02e3c363 Use a loop to simplify the runtime unrolling prologue.
Runtime unrolling will create a prologue to execute the extra
iterations which is can't divided by the unroll factor. It
generates an if-then-else sequence to jump into a factor -1
times unrolled loop body, like

    extraiters = tripcount % loopfactor
    if (extraiters == 0) jump Loop:
    if (extraiters == loopfactor) jump L1
    if (extraiters == loopfactor-1) jump L2
    ...
    L1:  LoopBody;
    L2:  LoopBody;
    ...
    if tripcount < loopfactor jump End
    Loop:
    ...
    End:

It means if the unroll factor is 4, the loop body will be 7
times unrolled, 3 are in loop prologue, and 4 are in the loop.
This commit is to use a loop to execute the extra iterations
in prologue, like

        extraiters = tripcount % loopfactor
        if (extraiters == 0) jump Loop:
        else jump Prol
 Prol:  LoopBody;
        extraiters -= 1                 // Omitted if unroll factor is 2.
        if (extraiters != 0) jump Prol: // Omitted if unroll factor is 2.
        if (tripcount < loopfactor) jump End
 Loop:
 ...
 End:

Then when unroll factor is 4, the loop body will be copied by
only 5 times, 1 in the prologue loop, 4 in the original loop.
And if the unroll factor is 2, new loop won't be created, just
as the original solution.

llvm-svn: 218604
2014-09-29 11:15:00 +00:00
..
ADCE
AddDiscriminators Use DILexicalBlockFile, rather than DILexicalBlock, to track discriminator changes to ensure discriminator changes don't introduce new DWARF DW_TAG_lexical_blocks. 2014-08-21 22:45:21 +00:00
AlignmentFromAssumptions [AlignmentFromAssumptions] Don't crash just because the target is 32-bit 2014-09-11 08:40:17 +00:00
ArgumentPromotion Don't promote byval pointer arguments when padding matters 2014-08-28 22:42:00 +00:00
AtomicExpand/ARM Use target-dependent emitLeading/TrailingFence instead of the target-independent insertLeading/TrailingFence (in AtomicExpandPass) 2014-09-03 21:01:03 +00:00
BBVectorize Reduce verbiage of lit.local.cfg files 2014-06-09 22:42:55 +00:00
BranchFolding Re-commit: Demote EmitRawText call in AsmPrinter::EmitInlineAsm() and remove hasRawTextSupport() call 2014-02-13 14:44:26 +00:00
CodeExtractor
CodeGenPrepare CodeGenPrep: fall back to MVT::Other if instruction's type isn't an EVT. 2014-07-29 10:20:22 +00:00
ConstProp Fix a bug around truncating vector in const prop. 2014-08-21 02:12:35 +00:00
ConstantHoisting [ConstantHoisting][X86] Improve the cost model for small constants with large types (i64 and above). 2014-06-10 00:32:29 +00:00
ConstantMerge Remove the linker_private and linker_private_weak linkages. 2014-03-13 23:18:37 +00:00
CorrelatedValuePropagation
DeadArgElim musttail: Don't eliminate varargs packs if there is a forwarding call 2014-08-26 00:59:51 +00:00
DeadStoreElimination Relax the constraint more in MemoryDependencyAnalysis.cpp 2014-08-29 20:32:58 +00:00
DebugIR Use right pointer type in DebugIR 2013-09-27 22:26:25 +00:00
EarlyCSE
FunctionAttrs [optnone] Make the optnone attribute effective at suppressing function 2014-08-13 10:49:33 +00:00
GCOVProfiling Fix coverage for files with global constructors again. Adds a testcase to the commit from r206671, as requested by David Blaikie. 2014-06-05 04:31:43 +00:00
GVN Relax the constraint more in MemoryDependencyAnalysis.cpp 2014-08-29 20:32:58 +00:00
GlobalDCE Remove dangling initializers in GlobalDCE 2014-08-25 17:51:14 +00:00
GlobalOpt GlobalOpt: Preserve comdats of unoptimized initializers 2014-09-23 22:33:01 +00:00
IPConstantProp No need for those tests to go thru llvm-as and/or llvm-dis. 2014-05-27 22:03:28 +00:00
IndVarSimplify [IndVar] Don't widen loop compare unless IV user is sign extended. 2014-09-26 20:05:35 +00:00
Inline Add functions for finding ephemeral values 2014-09-07 13:49:57 +00:00
InstCombine [InstCombine] Fix wrong folding of constant comparison involving ahsr and negative quantities (PR20945). 2014-09-17 11:32:31 +00:00
InstMerge MergedLoadStoreMotion pass 2014-07-18 19:13:09 +00:00
InstSimplify InstSimplify: Don't allow (x srem y) urem y -> x srem y 2014-09-17 04:16:35 +00:00
Internalize Use "weak alias" instead of "alias weak" 2014-07-30 22:51:54 +00:00
JumpThreading Make use of @llvm.assume from LazyValueInfo 2014-09-07 20:29:59 +00:00
LCSSA
LICM Fix assertion in LICM doFinalization() 2014-09-24 16:48:31 +00:00
LoadCombine Add LoadCombine pass. 2014-05-29 01:55:07 +00:00
LoopDeletion
LoopIdiom R600: Implement TTI:getPopcntSupport 2014-07-18 06:07:13 +00:00
LoopReroll Fix loop rerolling pass failure with non-consant loop lower bound 2014-01-03 17:20:01 +00:00
LoopRotate [LPM] Fix PR18643, another scary place where loop transforms failed to 2014-01-29 13:16:53 +00:00
LoopSimplify FileCheckize. NFC. 2014-09-12 17:55:16 +00:00
LoopStrengthReduce Reduce verbiage of lit.local.cfg files 2014-06-09 22:42:55 +00:00
LoopUnroll Use a loop to simplify the runtime unrolling prologue. 2014-09-29 11:15:00 +00:00
LoopUnswitch
LoopVectorize Rename getMaximumUnrollFactor -> getMaxInterleaveFactor; also rename option names controlling this variable. 2014-09-10 17:58:16 +00:00
LowerAtomic IR: add "cmpxchg weak" variant to support permitted failure. 2014-06-13 14:24:07 +00:00
LowerExpectIntrinsic Lower llvm.expect intrinsic correctly for i1 2014-02-02 22:43:55 +00:00
LowerInvoke Remove LowerInvoke's obsolete "-enable-correct-eh-support" option 2014-03-20 19:54:47 +00:00
LowerSwitch Added test for commit r212802 that was missing 2014-07-11 10:36:00 +00:00
Mem2Reg Debug Info: update testing cases to specify the debug info version number. 2013-11-22 21:49:45 +00:00
MemCpyOpt Fix a really bad miscompile introduced in r216865 - the else-if logic 2014-09-01 10:09:18 +00:00
MergeFunc Fix crash when looking up the addrspace of GEPs with vector types 2014-09-02 18:47:54 +00:00
MetaRenamer Use "weak alias" instead of "alias weak" 2014-07-30 22:51:54 +00:00
ObjCARC Fix use_iterator crash in ObjCArc from r203364 2014-03-18 22:32:43 +00:00
PartiallyInlineLibCalls PartiallyInlineLibCalls: Check sqrt result type before transforming it. 2014-08-01 23:21:21 +00:00
PhaseOrdering
PruneEH
Reassociate Reassociate x + -0.1234 * y into x - 0.1234 * y 2014-08-21 10:45:30 +00:00
Reg2Mem
SCCP SCCP: update for cmpxchg returning { iN, i1 } now. 2014-06-13 14:54:09 +00:00
SLPVectorizer Preserve IR flags (nsw, nuw, exact, fast-math) in SLP vectorizer (PR20802). 2014-09-03 17:40:30 +00:00
SROA SROA: Don't insert instructions before a PHI 2014-09-01 21:20:14 +00:00
SampleProfile Use DILexicalBlockFile, rather than DILexicalBlock, to track discriminator changes to ensure discriminator changes don't introduce new DWARF DW_TAG_lexical_blocks. 2014-08-21 22:45:21 +00:00
ScalarRepl Fix PR18800. llvm intrinsic memcpy takes 5 arguments void @llvm.memcpy.p0i8.p0i8.i32(i8* <dest>, i8* <src>, i32 <len>, i32 <align>, i1 <isvolatile>).The test case incorrectly uses the old format resulting in isVolatile function in MemIntrinsic to crash during SROA transformation.Modified the test case to use correct signature of memcpy and memset. 2014-03-13 04:50:29 +00:00
Scalarizer Fix Scalarizer insertion point when replacing PHIs with insertelements 2013-12-23 14:51:56 +00:00
SeparateConstOffsetFromGEP/NVPTX Partially revert r210444 due to performance regression 2014-07-16 23:25:00 +00:00
SimplifyCFG Add a test for hoisting instructions with metadata out of then/else blocks 2014-09-09 17:10:21 +00:00
Sink Sink: Don't sink static allocas from the entry block 2014-03-21 15:51:51 +00:00
StripSymbols Add a debug info code generation level to the compile unit metadata 2014-02-27 01:24:56 +00:00
StructurizeCFG StructurizeCFG: Fix verification failure with some loops. 2013-11-22 19:24:39 +00:00
TailCallElim We may visit a call that uses an alloca multiple times in callUsesLocalStack, sometimes with IsNocapture true and sometimes with IsNocapture false. We accidentally skipped work we needed to do in the IsNocapture=false case if we were called with IsNocapture=true the first time. Fixes PR20405! 2014-07-23 06:24:49 +00:00
TailDup Reduce verbiage of lit.local.cfg files 2014-06-09 22:42:55 +00:00
Util utils: Fix segfault in flattencfg 2014-08-13 20:31:53 +00:00