llvm-project/llvm/test/CodeGen
Nicolai Haehnle 3b572002a2 AMDGPU: add execfix flag to SI_ELSE
Summary:
SI_ELSE is lowered into two parts:

s_or_saveexec_b64 dst, src (at the start of the basic block)

s_xor_b64 exec, exec, dst (at the end of the basic block)

The idea is that dst contains the exec mask of the preceding IF block. It can
happen that SIWholeQuadMode decides to switch from WQM to Exact mode inside
the basic block that contains SI_ELSE, in which case it introduces an instruction

s_and_b64 exec, exec, s[...]

which masks out bits that can correspond to both the IF and the ELSE paths.
So the resulting sequence must be:

s_or_saveexec_b64 dst, src

s_and_b64 exec, exec, s[...] <-- added by SIWholeQuadMode
s_and_b64 dst, dst, exec <-- added by SILowerControlFlow

s_xor_b64 exec, exec, dst

Whether to add the additional s_and_b64 dst, dst, exec is currently determined
via the ExecModified tracking. With this change, it is instead determined by
an additional flag on SI_ELSE which is set by SIWholeQuadMode.
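
The effect of the extra AND can be checked with a small bit-mask simulation. This is only an illustrative sketch, not code from the patch: the function name `lower_si_else`, the 4-lane masks, and the `live_mask` parameter (standing in for the `s[...]` operand inserted by SIWholeQuadMode) are all assumptions made for the example. It models `s_or_saveexec_b64 dst, src` as `dst = exec; exec = src | exec`.

```python
def lower_si_else(exec_mask, src, live_mask=None, execfix=False):
    """Simulate the lowered SI_ELSE sequence on integer lane masks.

    exec_mask: exec on entry (lanes that ran the IF path)
    src:       lanes saved for the ELSE path
    live_mask: hypothetical mask applied by SIWholeQuadMode, or None
    execfix:   whether the extra 's_and_b64 dst, dst, exec' is emitted
    Returns (exec, dst) after the final s_xor_b64.
    """
    # s_or_saveexec_b64 dst, src
    dst = exec_mask
    exec_mask = src | exec_mask

    # s_and_b64 exec, exec, s[...]   (WQM -> Exact switch, if any)
    if live_mask is not None:
        exec_mask &= live_mask

    # s_and_b64 dst, dst, exec       (only when the flag is set)
    if execfix:
        dst &= exec_mask

    # s_xor_b64 exec, exec, dst
    exec_mask ^= dst
    return exec_mask, dst


# Made-up example: lanes 0-1 took the IF path, lanes 2-3 take the ELSE path,
# and only lanes 1-2 remain enabled after the switch to Exact mode.
IF_LANES, ELSE_LANES, LIVE = 0b0011, 0b1100, 0b0110

broken, _ = lower_si_else(IF_LANES, ELSE_LANES, LIVE, execfix=False)
fixed, _ = lower_si_else(IF_LANES, ELSE_LANES, LIVE, execfix=True)
print(f"without fix: {broken:04b}")  # lane 0 is switched back on by the XOR
print(f"with fix:    {fixed:04b}")   # equals ELSE_LANES & LIVE
```

Without the extra AND, the final XOR re-enables an IF-path lane that the Exact-mode mask had already disabled; with it, exec ends up as exactly the ELSE lanes that are still live.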

Finally: It also occurred to me that an alternative approach for the long run
is for SILowerControlFlow to unconditionally emit

s_or_saveexec_b64 dst, src

...

s_and_b64 dst, dst, exec
s_xor_b64 exec, exec, dst

and have a pass that detects and cleans up the "redundant AND with exec"
pattern where possible. This could be useful anyway, because we also add
instructions

s_and_b64 vcc, exec, vcc

before s_cbranch_scc (in moveToALU), and those are often redundant. I have
some pending changes to how KILL is lowered that could also benefit from
such a cleanup pass.
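
To make the cleanup idea concrete, here is a toy sketch of what such a pass might look for. It is not an LLVM pass and does not use any LLVM API: instructions are modeled as plain tuples `(opcode, dst, src...)`, and the function name `drop_redundant_exec_and` is hypothetical. The sketch assumes an AND of a register with exec is redundant when that register is already known to be a subset of exec and exec has not been written since.

```python
def drop_redundant_exec_and(insts):
    """Remove 's_and_b64 X, X, exec' when X is known to be a subset of exec.

    insts: list of tuples (opcode, dst, src0[, src1]) for one basic block.
    Returns a new list with the redundant ANDs dropped.
    """
    known_subset = set()  # registers currently known to satisfy X & exec == X
    kept = []
    for inst in insts:
        op, dst, *srcs = inst
        and_with_exec = op == "s_and_b64" and "exec" in srcs and dst != "exec"

        # X = X & exec is a no-op if X is already a subset of exec.
        if and_with_exec and dst in srcs and dst in known_subset:
            continue

        # Any write to a register invalidates what we knew about it.
        known_subset.discard(dst)
        # Any write to exec (explicit or via *_saveexec) invalidates everything.
        if dst == "exec" or "saveexec" in op:
            known_subset.clear()

        # s_or_saveexec_b64: dst = old exec, exec = src | old exec, so dst <= exec.
        # s_and_b64 X, ..., exec: the result is a subset of exec by construction.
        if op == "s_or_saveexec_b64" or and_with_exec:
            known_subset.add(dst)

        kept.append(inst)
    return kept


unmodified = [
    ("s_or_saveexec_b64", "dst", "src"),
    ("s_and_b64", "dst", "dst", "exec"),   # redundant: exec untouched
    ("s_xor_b64", "exec", "exec", "dst"),
]
modified = [
    ("s_or_saveexec_b64", "dst", "src"),
    ("s_and_b64", "exec", "exec", "live"),  # exec shrinks here
    ("s_and_b64", "dst", "dst", "exec"),    # needed: must be kept
    ("s_xor_b64", "exec", "exec", "dst"),
]
print(len(drop_redundant_exec_and(unmodified)))
print(len(drop_redundant_exec_and(modified)))
```

A real pass would have to work on MachineInstrs and account for control flow, register aliasing, and implicit defs; the sketch only tracks straight-line code within one block.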

In any case, this current patch could help in the short term with the whole
ExecModified business.

Reviewers: tstellarAMD, arsenm

Subscribers: arsenm, llvm-commits, kzhuravl

Differential Revision: https://reviews.llvm.org/D22846

llvm-svn: 276972
2016-07-28 11:39:24 +00:00
AArch64 GlobalISel: support zero-sized allocas 2016-07-27 17:47:54 +00:00
AMDGPU AMDGPU: add execfix flag to SI_ELSE 2016-07-28 11:39:24 +00:00
ARM MIRParser: Use shorter cfi identifiers 2016-07-26 18:20:00 +00:00
BPF [BPF] Remove exit-on-error from tests (PR27768, PR27769) 2016-05-30 08:28:34 +00:00
Generic Move mempcpy_call.ll to X86 subdirectory 2016-07-13 18:28:45 +00:00
Hexagon [Hexagon] Find speculative loop preheader in hardware loop generation 2016-07-27 21:20:54 +00:00
Inputs
Lanai [lanai] Use peephole optimizer to generate more conditional ALU operations. 2016-07-07 23:36:04 +00:00
MIR MIRParser: Use dot instead of colon to mark subregisters 2016-07-26 21:49:34 +00:00
MSP430
Mips [mips] MIPS64R6 compact branch support 2016-07-26 10:25:07 +00:00
NVPTX Fix NVPTX/call-with-alloca-buffer.ll after r276777. 2016-07-26 18:28:33 +00:00
PowerPC Revert "RegScavenging: Add scavengeRegisterBackwards()" 2016-07-20 00:21:32 +00:00
SPARC VirtRegMap: Replace some identity copies with KILL instructions. 2016-07-09 00:19:07 +00:00
SystemZ Revert "RegScavenging: Add scavengeRegisterBackwards()" 2016-07-20 00:21:32 +00:00
Thumb Revert "RegScavenging: Add scavengeRegisterBackwards()" 2016-07-20 00:21:32 +00:00
Thumb2 [Thumb] Reapply r272251 with a fix for PR28348 (mk 2) 2016-07-05 12:37:13 +00:00
WebAssembly [WebAssembly] Emit type signatures for declared functions 2016-06-03 18:34:36 +00:00
WinEH Revert EH-specific checks in BranchFolding that were causing blow ups in compile time. 2016-07-27 17:55:33 +00:00
X86 Revert EH-specific checks in BranchFolding that were causing blow ups in compile time. 2016-07-27 17:55:33 +00:00
XCore IR: Introduce Module::global_objects(). 2016-06-22 20:29:42 +00:00