Commit Graph

7 Commits

Author SHA1 Message Date
Matt Arsenault 57c37b2dcd AMDGPU: Fix producing saveexec when the copy is spilled
If the register from the copy from exec was spilled,
the copy before the spill was deleted leaving a spill
of undefined register verifier error and miscompiling.
Check for other use instructions of the copy register.

llvm-svn: 318132
2017-11-14 02:16:54 +00:00
Matt Arsenault f42074b699 AMDGPU: Fix missing skipFunction calls
llvm-svn: 315361
2017-10-10 20:48:36 +00:00
Stanislav Mekhanoshin da0edef1bd [AMDGPU] Turn s_and_saveexec_b64 into s_and_b64 if result is unused
With SI_END_CF elimination for some nested control flow we can now
eliminate saved exec register completely by turning a saveexec version
of instruction into just a logical instruction.

Differential Revision: https://reviews.llvm.org/D36007

llvm-svn: 309766
2017-08-01 23:44:35 +00:00
Nicolai Haehnle 87bc4c218b AMDGPU: Fix use-after-free in SIOptimizeExecMasking
Summary:
There was a bug with sequences like

   s_mov_b64 s[0:1], exec
   s_and_b64 s[2:3]<def>, s[0:1], s[2:3]<kill>
   ...
   s_mov_b64_term exec, s[2:3]

because s[2:3] was defined and used in the same instruction, ending up with
SaveExecInst inside OtherUseInsts.

Note that the test case also exposes an unrelated bug.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98028

Reviewers: tstellarAMD, arsenm

Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye

Differential Revision: https://reviews.llvm.org/D25306

llvm-svn: 283528
2016-10-07 08:40:14 +00:00
Konstantin Zhuravlyov c9786f0d8a [AMDGPU] Remove unused variables from SIOptimizeExecMasking
Differential Revision: https://reviews.llvm.org/D25110

llvm-svn: 283087
2016-10-03 04:43:22 +00:00
Mehdi Amini 117296c0a0 Use StringRef in Pass/PassManager APIs (NFC)
llvm-svn: 283004
2016-10-01 02:56:57 +00:00
Matt Arsenault e6740754f0 AMDGPU: Partially fix control flow at -O0
Fixes to allow spilling all registers at the end of the block
work with exec modifications. Don't emit s_and_saveexec_b64 for
if lowering, and instead emit copies. Mark control flow mask
instructions as terminators to get correct spill code placement
with fast regalloc, and then have a separate optimization pass
form the saveexec.

This should work if SGPRs are spilled to VGPRs, but
will likely fail in the case that an SGPR spills to memory
and no workitem takes a divergent branch.

llvm-svn: 282667
2016-09-29 01:44:16 +00:00