llvm-project/llvm/test/CodeGen
Sjoerd Meijer d1522513d4 [ARM] Reimplement MVE Tail-Predication pass using @llvm.get.active.lane.mask
To set up a tail-predicated loop, we need to calculate the number of
elements processed by the loop. We can now use the intrinsic
@llvm.get.active.lane.mask() to do this, which is emitted by the vectoriser in
D79100. This intrinsic generates a predicate for the masked loads/stores, and
consumes the Backedge Taken Count (BTC) as its second argument. We can now use
that to reconstruct the loop tripcount, instead of the IR pattern match
approach we were using before.
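
As a rough illustration of the idea above (not the pass's actual code), the sketch below models the lane mask in plain Python. It assumes the intrinsic's semantics at the time of this commit were "lane i is active while IV + i <= BTC" (unsigned), so the element count is BTC + 1. The helper names `active_lane_mask`, `vctp` and `tail_predicated_masks` are hypothetical and exist only for this example.

```python
def active_lane_mask(base, btc, vf):
    """Model of the lane mask: lane i is active while base + i <= btc.

    Assumption: the second argument is the backedge-taken count (BTC),
    as described in the commit message.
    """
    return [base + i <= btc for i in range(vf)]


def vctp(remaining, vf):
    """Model of a VCTP-style predicate: the first min(remaining, vf) lanes are active."""
    return [i < remaining for i in range(vf)]


def tail_predicated_masks(btc, vf):
    """Reconstruct the element count from BTC and derive per-iteration predicates."""
    num_elements = btc + 1  # trip count of the original scalar loop
    masks = []
    remaining = num_elements
    while remaining > 0:
        masks.append(vctp(remaining, vf))
        remaining -= vf  # elements left after this vector iteration
    return num_elements, masks


# With BTC = 9 and a vector width of 4, both formulations agree lane by lane.
num_elements, masks = tail_predicated_masks(9, 4)
lane_masks = [active_lane_mask(base, 9, 4) for base in range(0, num_elements, 4)]
```

Under these assumptions the predicate computed from the remaining element count matches the lane mask computed from the induction variable and BTC, which is why the element count is all the pass needs to recover.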

Many thanks to Eli Friedman and Sam Parker for all their help with this work.

This also adds overflow checks for the new expressions that we
create: the loop tripcount, and the sub expression that calculates the
remaining elements to be processed. For the latter, SCEV is not able to
calculate precise enough bounds, so we work around that for now; the workaround
is not entirely correct yet, but it is conservative. The overflow checks can be
overruled with a force flag, which is thus potentially unsafe (though not in
practice, because the vectoriser is the only place where this intrinsic is
emitted at the moment). Note also that the tail-predication pass is not yet
enabled by default. We will follow up to see if we can implement these
overflow checks better, either by a change in SCEV or by revising the
definition of llvm.get.active.lane.mask.
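
To show why such overflow checks matter, here is a small Python sketch that emulates 32-bit unsigned arithmetic. It is not the check implemented by the pass. It only assumes that the trip count is formed as BTC + 1 and that the number of vector iterations is computed with the usual round-up idiom (n + VF - 1) / VF. The function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF  # emulate i32 wraparound


def trip_count_i32(btc):
    """Return (BTC + 1 wrapped to 32 bits, whether the addition wrapped)."""
    tc = (btc + 1) & MASK32
    return tc, tc == 0  # wraps only when btc == 0xFFFFFFFF


def vector_iterations_i32(num_elements, vf):
    """Return (ceil(num_elements / vf) in 32-bit arithmetic, whether rounding wrapped)."""
    rounded = num_elements + vf - 1
    overflowed = rounded > MASK32
    return (rounded & MASK32) // vf, overflowed


ok = trip_count_i32(9)
wrapped = trip_count_i32(0xFFFFFFFF)
iters_ok = vector_iterations_i32(10, 4)
iters_wrapped = vector_iterations_i32(0xFFFFFFFE, 4)
```

In both wrapped cases the computed value is far smaller than the true one, so a loop set up from it would process too few elements; a conservative check rejects such cases rather than risk that.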

Differential Revision: https://reviews.llvm.org/D79175
2020-06-17 15:17:42 +01:00
AArch64 [gicombiner] Allow disable-rule option to disable all-except-... 2020-06-16 16:57:16 -07:00
AMDGPU [AMDGPU] Fix failure in VCC spilling 2020-06-17 20:11:15 +09:00
ARC
ARM [ARM][MachineOutliner] Fix no-lr-save testcase. 2020-06-15 16:09:31 +02:00
AVR [AVR] Remove faulty stack pushing behavior 2020-06-16 13:53:32 +02:00
BPF [BPF] fix incorrect type in BPFISelDAGToDAG readonly load optimization 2020-06-11 19:31:06 -07:00
Generic [LLParser] Delete temp CallInst when error occurs 2020-06-16 11:41:25 +08:00
Hexagon Simplify MachineVerifier's block-successor verification. 2020-06-06 22:30:51 -04:00
Inputs
Lanai
MIR [MachineVerifier] Verify that a DBG_VALUE has a debug location 2020-05-28 13:53:40 -07:00
MSP430
Mips [DAGCombine] Generalize the case (add (or x, c1), c2) -> (add x, (c1 + c2)) 2020-06-12 13:53:08 -04:00
NVPTX
PowerPC [NFC]][PowerPC] Remove unused intrinsic for old CTR loop pass 2020-06-17 07:06:46 +00:00
RISCV [DAGCombine] Generalize the case (add (or x, c1), c2) -> (add x, (c1 + c2)) 2020-06-12 13:53:08 -04:00
SPARC [SPARC] Lower fp16 ops to libcalls 2020-06-10 19:15:26 -07:00
SystemZ [SystemZ] Bugfix in storeLoadCanUseBlockBinary(). 2020-06-17 09:49:31 +02:00
Thumb
Thumb2 [ARM] Reimplement MVE Tail-Predication pass using @llvm.get.active.lane.mask 2020-06-17 15:17:42 +01:00
VE [VE] Support relocation information in MC layer 2020-06-15 11:24:53 +02:00
WebAssembly [WebAssembly] Adding 64-bit versions of all load & store ops. 2020-06-15 08:31:56 -07:00
WinCFGuard
WinEH
X86 [X86][SSE] MatchVectorAllZeroTest - handle OR vector reductions 2020-06-16 09:42:34 +01:00
XCore