llvm-project/llvm/test/CodeGen/AMDGPU/wqm.mir

# RUN: llc -march=amdgcn -verify-machineinstrs -run-pass si-wqm -o -  %s | FileCheck %s

---
# Check for awareness that s_or_saveexec_b64 clobbers SCC
#
#CHECK: ENTER_WWM
#CHECK: S_CMP_LT_I32
#CHECK: S_CSELECT_B32
name:            test_wwm_scc
alignment:       1
exposesReturnsTwice: false
legalized:       false
regBankSelected: false
selected:        false
tracksRegLiveness: true
registers:       
  - { id: 0, class: sgpr_32, preferred-register: '' }
  - { id: 1, class: sgpr_32, preferred-register: '' }
  - { id: 2, class: sgpr_32, preferred-register: '' }
  - { id: 3, class: vgpr_32, preferred-register: '' }
  - { id: 4, class: vgpr_32, preferred-register: '' }
  - { id: 5, class: sgpr_32, preferred-register: '' }
  - { id: 6, class: vgpr_32, preferred-register: '' }
  - { id: 7, class: vgpr_32, preferred-register: '' }
  - { id: 8, class: sreg_32_xm0, preferred-register: '' }
  - { id: 9, class: sreg_32, preferred-register: '' }
  - { id: 10, class: sreg_32, preferred-register: '' }
  - { id: 11, class: vgpr_32, preferred-register: '' }
  - { id: 12, class: vgpr_32, preferred-register: '' }
liveins:         
  - { reg: '$sgpr0', virtual-reg: '%0' }
  - { reg: '$sgpr1', virtual-reg: '%1' }
  - { reg: '$sgpr2', virtual-reg: '%2' }
  - { reg: '$vgpr0', virtual-reg: '%3' }
body:             |
  bb.0:
    liveins: $sgpr0, $sgpr1, $sgpr2, $vgpr0
  
    %3 = COPY $vgpr0
    %2 = COPY $sgpr2
    %1 = COPY $sgpr1
    %0 = COPY $sgpr0
    S_CMP_LT_I32 0, %0, implicit-def $scc
    %12 = V_ADD_I32_e32 %3, %3, implicit-def $vcc, implicit $exec
    %5 = S_CSELECT_B32 %2, %1, implicit $scc
    %11 = V_ADD_I32_e32 %5, %12, implicit-def $vcc, implicit $exec
    $vgpr0 = WWM %11, implicit $exec
    SI_RETURN_TO_EPILOG $vgpr0

...
[AMDGPU] Add support for Whole Wavefront Mode Summary: Whole Wavefront Wode (WWM) is similar to WQM, except that all of the lanes are always enabled, regardless of control flow. This is required for implementing wavefront reductions in non-uniform control flow, where we need to use the inactive lanes to propagate intermediate results, so they need to be enabled. We need to propagate WWM to uses (unless they're explicitly marked as exact) so that they also propagate intermediate results correctly. We do the analysis and exec mask munging during the WQM pass, since there are interactions with WQM for things that require both WQM and WWM. For simplicity, WWM is entirely block-local -- blocks are never WWM on entry or exit of a block, and WWM is not propagated to the block level. This means that computations involving WWM cannot involve control flow, but we only ever plan to use WWM for a few limited purposes (none of which involve control flow) anyways. Shaders can ask for WWM using the @llvm.amdgcn.wwm intrinsic. There isn't yet a way to turn WWM off -- that will be added in a future change. Finally, it turns out that turning on inactive lanes causes a number of problems with register allocation. While the best long-term solution seems like teaching LLVM's register allocator about predication, for now we need to add some hacks to prevent ourselves from getting into trouble due to constraints that aren't currently expressed in LLVM. For the gory details, see the comments at the top of SIFixWWMLiveness.cpp. Reviewers: arsenm, nhaehnle, tpr Subscribers: kzhuravl, wdng, mgorny, yaxunl, dstuttard, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D35524 llvm-svn: 310087 2017-08-05 02:36:52 +08:00			`# RUN: llc -march=amdgcn -verify-machineinstrs -run-pass si-wqm -o - %s \| FileCheck %s`

			`---`
			`# Check for awareness that s_or_saveexec_b64 clobbers SCC`
			`#`
[AMDGPU] Pre-allocate WWM registers to reduce VGPR pressure. This change incorporates an effort by Connor Abbot to change how we deal with WWM operations potentially trashing valid values in inactive lanes. Previously, the SIFixWWMLiveness pass would work out which registers were being trashed within WWM regions, and ensure that the register allocator did not have any values it was depending on resident in those registers if the WWM section would trash them. This worked perfectly well, but would cause sometimes severe register pressure when the WWM section resided before divergent control flow (or at least that is where I mostly observed it). This fix instead runs through the WWM sections and pre allocates some registers for WWM. It then reserves these registers so that the register allocator cannot use them. This results in a significant register saving on some WWM shaders I'm working with (130 -> 104 VGPRs, with just this change!). Differential Revision: https://reviews.llvm.org/D59295 llvm-svn: 357400 2019-04-01 23:19:52 +08:00			`#CHECK: ENTER_WWM`
[AMDGPU] Add support for Whole Wavefront Mode Summary: Whole Wavefront Wode (WWM) is similar to WQM, except that all of the lanes are always enabled, regardless of control flow. This is required for implementing wavefront reductions in non-uniform control flow, where we need to use the inactive lanes to propagate intermediate results, so they need to be enabled. We need to propagate WWM to uses (unless they're explicitly marked as exact) so that they also propagate intermediate results correctly. We do the analysis and exec mask munging during the WQM pass, since there are interactions with WQM for things that require both WQM and WWM. For simplicity, WWM is entirely block-local -- blocks are never WWM on entry or exit of a block, and WWM is not propagated to the block level. This means that computations involving WWM cannot involve control flow, but we only ever plan to use WWM for a few limited purposes (none of which involve control flow) anyways. Shaders can ask for WWM using the @llvm.amdgcn.wwm intrinsic. There isn't yet a way to turn WWM off -- that will be added in a future change. Finally, it turns out that turning on inactive lanes causes a number of problems with register allocation. While the best long-term solution seems like teaching LLVM's register allocator about predication, for now we need to add some hacks to prevent ourselves from getting into trouble due to constraints that aren't currently expressed in LLVM. For the gory details, see the comments at the top of SIFixWWMLiveness.cpp. Reviewers: arsenm, nhaehnle, tpr Subscribers: kzhuravl, wdng, mgorny, yaxunl, dstuttard, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D35524 llvm-svn: 310087 2017-08-05 02:36:52 +08:00			`#CHECK: S_CMP_LT_I32`
			`#CHECK: S_CSELECT_B32`
			`name: test_wwm_scc`
[Alignment] Use llvm::Align in MachineFunction and TargetLowering - fixes mir parsing Summary: This catches malformed mir files which specify alignment as log2 instead of pow2. See https://reviews.llvm.org/D65945 for reference, This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: MatzeB, qcolombet, dschuff, arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, s.egerton, pzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67433 llvm-svn: 371608 2019-09-11 19:16:48 +08:00			`alignment: 1`
[AMDGPU] Add support for Whole Wavefront Mode Summary: Whole Wavefront Wode (WWM) is similar to WQM, except that all of the lanes are always enabled, regardless of control flow. This is required for implementing wavefront reductions in non-uniform control flow, where we need to use the inactive lanes to propagate intermediate results, so they need to be enabled. We need to propagate WWM to uses (unless they're explicitly marked as exact) so that they also propagate intermediate results correctly. We do the analysis and exec mask munging during the WQM pass, since there are interactions with WQM for things that require both WQM and WWM. For simplicity, WWM is entirely block-local -- blocks are never WWM on entry or exit of a block, and WWM is not propagated to the block level. This means that computations involving WWM cannot involve control flow, but we only ever plan to use WWM for a few limited purposes (none of which involve control flow) anyways. Shaders can ask for WWM using the @llvm.amdgcn.wwm intrinsic. There isn't yet a way to turn WWM off -- that will be added in a future change. Finally, it turns out that turning on inactive lanes causes a number of problems with register allocation. While the best long-term solution seems like teaching LLVM's register allocator about predication, for now we need to add some hacks to prevent ourselves from getting into trouble due to constraints that aren't currently expressed in LLVM. For the gory details, see the comments at the top of SIFixWWMLiveness.cpp. Reviewers: arsenm, nhaehnle, tpr Subscribers: kzhuravl, wdng, mgorny, yaxunl, dstuttard, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D35524 llvm-svn: 310087 2017-08-05 02:36:52 +08:00			`exposesReturnsTwice: false`
			`legalized: false`
			`regBankSelected: false`
			`selected: false`
			`tracksRegLiveness: true`
			`registers:`
			`- { id: 0, class: sgpr_32, preferred-register: '' }`
			`- { id: 1, class: sgpr_32, preferred-register: '' }`
			`- { id: 2, class: sgpr_32, preferred-register: '' }`
			`- { id: 3, class: vgpr_32, preferred-register: '' }`
			`- { id: 4, class: vgpr_32, preferred-register: '' }`
			`- { id: 5, class: sgpr_32, preferred-register: '' }`
			`- { id: 6, class: vgpr_32, preferred-register: '' }`
			`- { id: 7, class: vgpr_32, preferred-register: '' }`
			`- { id: 8, class: sreg_32_xm0, preferred-register: '' }`
			`- { id: 9, class: sreg_32, preferred-register: '' }`
			`- { id: 10, class: sreg_32, preferred-register: '' }`
			`- { id: 11, class: vgpr_32, preferred-register: '' }`
			`- { id: 12, class: vgpr_32, preferred-register: '' }`
			`liveins:`
Followup on Proposal to move MIR physical register namespace to '$' sigil. Discussed here: http://lists.llvm.org/pipermail/llvm-dev/2018-January/120320.html In preparation for adding support for named vregs we are changing the sigil for physical registers in MIR to '$' from '%'. This will prevent name clashes of named physical register with named vregs. llvm-svn: 323922 2018-02-01 06:04:26 +08:00			`- { reg: '$sgpr0', virtual-reg: '%0' }`
			`- { reg: '$sgpr1', virtual-reg: '%1' }`
			`- { reg: '$sgpr2', virtual-reg: '%2' }`
			`- { reg: '$vgpr0', virtual-reg: '%3' }`
[AMDGPU] Add support for Whole Wavefront Mode Summary: Whole Wavefront Wode (WWM) is similar to WQM, except that all of the lanes are always enabled, regardless of control flow. This is required for implementing wavefront reductions in non-uniform control flow, where we need to use the inactive lanes to propagate intermediate results, so they need to be enabled. We need to propagate WWM to uses (unless they're explicitly marked as exact) so that they also propagate intermediate results correctly. We do the analysis and exec mask munging during the WQM pass, since there are interactions with WQM for things that require both WQM and WWM. For simplicity, WWM is entirely block-local -- blocks are never WWM on entry or exit of a block, and WWM is not propagated to the block level. This means that computations involving WWM cannot involve control flow, but we only ever plan to use WWM for a few limited purposes (none of which involve control flow) anyways. Shaders can ask for WWM using the @llvm.amdgcn.wwm intrinsic. There isn't yet a way to turn WWM off -- that will be added in a future change. Finally, it turns out that turning on inactive lanes causes a number of problems with register allocation. While the best long-term solution seems like teaching LLVM's register allocator about predication, for now we need to add some hacks to prevent ourselves from getting into trouble due to constraints that aren't currently expressed in LLVM. For the gory details, see the comments at the top of SIFixWWMLiveness.cpp. Reviewers: arsenm, nhaehnle, tpr Subscribers: kzhuravl, wdng, mgorny, yaxunl, dstuttard, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D35524 llvm-svn: 310087 2017-08-05 02:36:52 +08:00			`body: \|`
			`bb.0:`
Followup on Proposal to move MIR physical register namespace to '$' sigil. Discussed here: http://lists.llvm.org/pipermail/llvm-dev/2018-January/120320.html In preparation for adding support for named vregs we are changing the sigil for physical registers in MIR to '$' from '%'. This will prevent name clashes of named physical register with named vregs. llvm-svn: 323922 2018-02-01 06:04:26 +08:00			`liveins: $sgpr0, $sgpr1, $sgpr2, $vgpr0`
[AMDGPU] Add support for Whole Wavefront Mode Summary: Whole Wavefront Wode (WWM) is similar to WQM, except that all of the lanes are always enabled, regardless of control flow. This is required for implementing wavefront reductions in non-uniform control flow, where we need to use the inactive lanes to propagate intermediate results, so they need to be enabled. We need to propagate WWM to uses (unless they're explicitly marked as exact) so that they also propagate intermediate results correctly. We do the analysis and exec mask munging during the WQM pass, since there are interactions with WQM for things that require both WQM and WWM. For simplicity, WWM is entirely block-local -- blocks are never WWM on entry or exit of a block, and WWM is not propagated to the block level. This means that computations involving WWM cannot involve control flow, but we only ever plan to use WWM for a few limited purposes (none of which involve control flow) anyways. Shaders can ask for WWM using the @llvm.amdgcn.wwm intrinsic. There isn't yet a way to turn WWM off -- that will be added in a future change. Finally, it turns out that turning on inactive lanes causes a number of problems with register allocation. While the best long-term solution seems like teaching LLVM's register allocator about predication, for now we need to add some hacks to prevent ourselves from getting into trouble due to constraints that aren't currently expressed in LLVM. For the gory details, see the comments at the top of SIFixWWMLiveness.cpp. Reviewers: arsenm, nhaehnle, tpr Subscribers: kzhuravl, wdng, mgorny, yaxunl, dstuttard, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D35524 llvm-svn: 310087 2017-08-05 02:36:52 +08:00
Followup on Proposal to move MIR physical register namespace to '$' sigil. Discussed here: http://lists.llvm.org/pipermail/llvm-dev/2018-January/120320.html In preparation for adding support for named vregs we are changing the sigil for physical registers in MIR to '$' from '%'. This will prevent name clashes of named physical register with named vregs. llvm-svn: 323922 2018-02-01 06:04:26 +08:00			`%3 = COPY $vgpr0`
			`%2 = COPY $sgpr2`
			`%1 = COPY $sgpr1`
			`%0 = COPY $sgpr0`
			`S_CMP_LT_I32 0, %0, implicit-def $scc`
			`%12 = V_ADD_I32_e32 %3, %3, implicit-def $vcc, implicit $exec`
			`%5 = S_CSELECT_B32 %2, %1, implicit $scc`
			`%11 = V_ADD_I32_e32 %5, %12, implicit-def $vcc, implicit $exec`
			`$vgpr0 = WWM %11, implicit $exec`
			`SI_RETURN_TO_EPILOG $vgpr0`
[AMDGPU] Add support for Whole Wavefront Mode Summary: Whole Wavefront Wode (WWM) is similar to WQM, except that all of the lanes are always enabled, regardless of control flow. This is required for implementing wavefront reductions in non-uniform control flow, where we need to use the inactive lanes to propagate intermediate results, so they need to be enabled. We need to propagate WWM to uses (unless they're explicitly marked as exact) so that they also propagate intermediate results correctly. We do the analysis and exec mask munging during the WQM pass, since there are interactions with WQM for things that require both WQM and WWM. For simplicity, WWM is entirely block-local -- blocks are never WWM on entry or exit of a block, and WWM is not propagated to the block level. This means that computations involving WWM cannot involve control flow, but we only ever plan to use WWM for a few limited purposes (none of which involve control flow) anyways. Shaders can ask for WWM using the @llvm.amdgcn.wwm intrinsic. There isn't yet a way to turn WWM off -- that will be added in a future change. Finally, it turns out that turning on inactive lanes causes a number of problems with register allocation. While the best long-term solution seems like teaching LLVM's register allocator about predication, for now we need to add some hacks to prevent ourselves from getting into trouble due to constraints that aren't currently expressed in LLVM. For the gory details, see the comments at the top of SIFixWWMLiveness.cpp. Reviewers: arsenm, nhaehnle, tpr Subscribers: kzhuravl, wdng, mgorny, yaxunl, dstuttard, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D35524 llvm-svn: 310087 2017-08-05 02:36:52 +08:00
			`...`