[AMDGPU] Extend the SI Load/Store optimizer
Summary:
Extend the SI Load/Store optimizer to merge MIMG load instructions. Handle
different flavours of image_load and image_sample instructions.
When the instructions of the same subclass differ only in dmask, merge
them and update dmask accordingly.
Reviewers: nhaehnle
Reviewed By: nhaehnle
Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D64911
llvm-svn: 374984
2019-10-16 18:17:02 +08:00
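The dmask rule in the summary above can be sketched as follows. This is a minimal illustration inferred only from the expectations in the tests in this file, not the pass's actual implementation; the function names `dmasks_can_merge`, `merged_dmask`, and `subreg_slices` are hypothetical, and the sketch assumes non-zero 4-bit masks.

```python
def dmasks_can_merge(dmask_a: int, dmask_b: int) -> bool:
    """Assumed rule: every bit of the smaller mask must lie below the
    lowest set bit of the larger mask (no overlap, no interleaving)."""
    lo, hi = min(dmask_a, dmask_b), max(dmask_a, dmask_b)
    lowest_set_bit_of_hi = hi & -hi
    return lo < lowest_set_bit_of_hi


def merged_dmask(dmask_a: int, dmask_b: int) -> int:
    """The merged instruction enables the union of both channel sets."""
    return dmask_a | dmask_b


def subreg_slices(dmask_a: int, dmask_b: int):
    """Return (first_dword, dword_count) of each input inside the merged
    result; the instruction with the smaller dmask gets the low dwords."""
    count_a = bin(dmask_a).count("1")
    count_b = bin(dmask_b).count("1")
    if dmask_a < dmask_b:
        return (0, count_a), (count_a, count_b)
    return (count_b, count_a), (0, count_b)
```

Under these assumptions, dmasks 1 and 14 combine to 15 with the first result in `sub0` and the second in `sub1_sub2_sub3`, while 4 and 7 (overlapping) or 4 and 11 (interleaved) are rejected, matching the `not_merged` cases below.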
# RUN: llc -march=amdgcn -mcpu=gfx900 -verify-machineinstrs -run-pass si-load-store-opt -o - %s | FileCheck -check-prefix=GFX9 %s

# GFX9-LABEL: name: image_sample_l_merged_v1v3
# GFX9: %{{[0-9]+}}:vreg_128 = IMAGE_SAMPLE_L_V4_V4 %5, %3, %2, 15, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec, implicit $exec :: (dereferenceable load 16, align 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vgpr_32 = COPY %8.sub0
# GFX9: %{{[0-9]+}}:vreg_96 = COPY killed %8.sub1_sub2_sub3

name: image_sample_l_merged_v1v3
body: |
  bb.0.entry:
    %0:sgpr_64 = COPY $sgpr0_sgpr1
    %1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
    %2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
    %3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
    %4:vgpr_32 = COPY %2.sub3
    %5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
    %6:vgpr_32 = IMAGE_SAMPLE_L_V1_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 1, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
    %7:vreg_96 = IMAGE_SAMPLE_L_V3_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 14, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---
# GFX9-LABEL: name: image_sample_l_merged_v1v3_reversed
# GFX9: %{{[0-9]+}}:vreg_128 = IMAGE_SAMPLE_L_V4_V4 %5, %3, %2, 15, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec, implicit $exec :: (dereferenceable load 16, align 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vgpr_32 = COPY %8.sub3
# GFX9: %{{[0-9]+}}:vreg_96 = COPY killed %8.sub0_sub1_sub2

name: image_sample_l_merged_v1v3_reversed
body: |
  bb.0.entry:
    %0:sgpr_64 = COPY $sgpr0_sgpr1
    %1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
    %2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
    %3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
    %4:vgpr_32 = COPY %2.sub3
    %5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
    %6:vgpr_32 = IMAGE_SAMPLE_L_V1_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 8, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
    %7:vreg_96 = IMAGE_SAMPLE_L_V3_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 7, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_l_merged_v2v2
# GFX9: %{{[0-9]+}}:vreg_128 = IMAGE_SAMPLE_L_V4_V4 %5, %3, %2, 15, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec, implicit $exec :: (dereferenceable load 16, addrspace 4)
# GFX9: %{{[0-9]+}}:vreg_64 = COPY %8.sub0_sub1
# GFX9: %{{[0-9]+}}:vreg_64 = COPY killed %8.sub2_sub3

name: image_sample_l_merged_v2v2
body: |
  bb.0.entry:
    %0:sgpr_64 = COPY $sgpr0_sgpr1
    %1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
    %2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
    %3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
    %4:vgpr_32 = COPY %2.sub3
    %5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
    %6:vreg_64 = IMAGE_SAMPLE_L_V2_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 3, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 8, align 16, addrspace 4)
    %7:vreg_64 = IMAGE_SAMPLE_L_V2_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 12, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 8, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_l_merged_v2v2_reversed
# GFX9: %{{[0-9]+}}:vreg_128 = IMAGE_SAMPLE_L_V4_V4 %5, %3, %2, 15, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec, implicit $exec :: (dereferenceable load 16, addrspace 4)
# GFX9: %{{[0-9]+}}:vreg_64 = COPY %8.sub2_sub3
# GFX9: %{{[0-9]+}}:vreg_64 = COPY killed %8.sub0_sub1

name: image_sample_l_merged_v2v2_reversed
body: |
  bb.0.entry:
    %0:sgpr_64 = COPY $sgpr0_sgpr1
    %1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
    %2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
    %3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
    %4:vgpr_32 = COPY %2.sub3
    %5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
    %6:vreg_64 = IMAGE_SAMPLE_L_V2_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 12, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 8, align 16, addrspace 4)
    %7:vreg_64 = IMAGE_SAMPLE_L_V2_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 3, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 8, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_l_merged_v3v1
# GFX9: %{{[0-9]+}}:vreg_128 = IMAGE_SAMPLE_L_V4_V4 %5, %3, %2, 15, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec, implicit $exec :: (dereferenceable load 16, addrspace 4)
# GFX9: %{{[0-9]+}}:vreg_96 = COPY %8.sub0_sub1_sub2
# GFX9: %{{[0-9]+}}:vgpr_32 = COPY killed %8.sub3

name: image_sample_l_merged_v3v1
body: |
  bb.0.entry:
    %0:sgpr_64 = COPY $sgpr0_sgpr1
    %1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
    %2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
    %3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
    %4:vgpr_32 = COPY %2.sub3
    %5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
    %6:vreg_96 = IMAGE_SAMPLE_L_V3_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 7, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
    %7:vgpr_32 = IMAGE_SAMPLE_L_V1_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 8, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_l_merged_v3v1_reversed
# GFX9: %{{[0-9]+}}:vreg_128 = IMAGE_SAMPLE_L_V4_V4 %5, %3, %2, 15, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec, implicit $exec :: (dereferenceable load 16, addrspace 4)
# GFX9: %{{[0-9]+}}:vreg_96 = COPY %8.sub1_sub2_sub3
# GFX9: %{{[0-9]+}}:vgpr_32 = COPY killed %8.sub0

name: image_sample_l_merged_v3v1_reversed
body: |
  bb.0.entry:
    %0:sgpr_64 = COPY $sgpr0_sgpr1
    %1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
    %2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
    %3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
    %4:vgpr_32 = COPY %2.sub3
    %5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
    %6:vreg_96 = IMAGE_SAMPLE_L_V3_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 14, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
    %7:vgpr_32 = IMAGE_SAMPLE_L_V1_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 1, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_l_divided_merged
# GFX9: %{{[0-9]+}}:vreg_128 = IMAGE_SAMPLE_L_V4_V4 %5, %3, %2, 15, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec, implicit $exec :: (dereferenceable load 16, align 4, addrspace 4)

name: image_sample_l_divided_merged
body: |
  bb.0.entry:
    %0:sgpr_64 = COPY $sgpr0_sgpr1
    %1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
    %2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
    %3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
    %4:vgpr_32 = COPY %2.sub3
    %5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
    %6:vgpr_32 = IMAGE_SAMPLE_L_V1_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 8, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
    %7:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
    %8:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
    %9:vreg_96 = IMAGE_SAMPLE_L_V3_V4 %7:vreg_128, %3:sgpr_256, %2:sgpr_128, 7, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
    %10:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
    %11:vreg_96 = IMAGE_SAMPLE_L_V3_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 7, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_l_divided_not_merged
# GFX9: %{{[0-9]+}}:vgpr_32 = IMAGE_SAMPLE_L_V1_V4 %5, %3, %2, 8, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vreg_96 = IMAGE_SAMPLE_L_V3_V4 %5, %3, %2, 7, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)

name: image_sample_l_divided_not_merged
body: |
  bb.0.entry:
    %0:sgpr_64 = COPY $sgpr0_sgpr1
    %1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
    %2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
    %3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
    %4:vreg_128 = COPY %2
    %5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
    %6:vgpr_32 = IMAGE_SAMPLE_L_V1_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 8, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
    IMAGE_STORE_V4_V4 %4:vreg_128, %5:vreg_128, %3:sgpr_256, 15, -1, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (store 16)
    %7:vreg_96 = IMAGE_SAMPLE_L_V3_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 7, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_l_dmask_overlapped_not_merged
# GFX9: %{{[0-9]+}}:vgpr_32 = IMAGE_SAMPLE_L_V1_V4 %5, %3, %2, 4, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vreg_96 = IMAGE_SAMPLE_L_V3_V4 %5, %3, %2, 7, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)

name: image_sample_l_dmask_overlapped_not_merged
body: |
  bb.0.entry:
    %0:sgpr_64 = COPY $sgpr0_sgpr1
    %1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
    %2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
    %3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
    %4:vgpr_32 = COPY %2.sub3
    %5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
    %6:vgpr_32 = IMAGE_SAMPLE_L_V1_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 4, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
    %7:vreg_96 = IMAGE_SAMPLE_L_V3_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 7, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_l_dmask_not_disjoint_not_merged
# GFX9: %{{[0-9]+}}:vgpr_32 = IMAGE_SAMPLE_L_V1_V4 %5, %3, %2, 4, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vreg_96 = IMAGE_SAMPLE_L_V3_V4 %5, %3, %2, 11, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)

name: image_sample_l_dmask_not_disjoint_not_merged
body: |
  bb.0.entry:
    %0:sgpr_64 = COPY $sgpr0_sgpr1
    %1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
    %2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
    %3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
    %4:vgpr_32 = COPY %2.sub3
    %5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
    %6:vgpr_32 = IMAGE_SAMPLE_L_V1_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 4, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
    %7:vreg_96 = IMAGE_SAMPLE_L_V3_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 11, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_l_not_merged_0
# GFX9: %{{[0-9]+}}:vgpr_32 = IMAGE_SAMPLE_L_V1_V4 %5, %3, %2, 8, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vreg_96 = IMAGE_SAMPLE_L_V3_V4 %6, %3, %2, 7, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)

name: image_sample_l_not_merged_0
body: |
  bb.0.entry:
    %0:sgpr_64 = COPY $sgpr0_sgpr1
    %1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
    %2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
    %3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
    %4:vgpr_32 = COPY %2.sub3
    %5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
    %6:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2, 1, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
    %7:vgpr_32 = IMAGE_SAMPLE_L_V1_V4 %5, %3, %2, 8, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
    %8:vreg_96 = IMAGE_SAMPLE_L_V3_V4 %6, %3, %2, 7, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_l_not_merged_1
# GFX9: %{{[0-9]+}}:vgpr_32 = IMAGE_SAMPLE_L_V1_V4 %6, %3, %2, 8, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vreg_96 = IMAGE_SAMPLE_L_V3_V4 %6, %4, %2, 7, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)

name: image_sample_l_not_merged_1
body: |
  bb.0.entry:
    %0:sgpr_64 = COPY $sgpr0_sgpr1
    %1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
    %2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
    %3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
    %4:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
    %5:vgpr_32 = COPY %2.sub3
    %6:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
    %7:vgpr_32 = IMAGE_SAMPLE_L_V1_V4 %6, %3, %2, 8, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
    %8:vreg_96 = IMAGE_SAMPLE_L_V3_V4 %6, %4, %2, 7, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_l_not_merged_2
# GFX9: %{{[0-9]+}}:vgpr_32 = IMAGE_SAMPLE_L_V1_V4 %6, %4, %2, 8, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vreg_96 = IMAGE_SAMPLE_L_V3_V4 %6, %4, %3, 7, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)

name: image_sample_l_not_merged_2
body: |
  bb.0.entry:
    %0:sgpr_64 = COPY $sgpr0_sgpr1
    %1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
    %2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
    %3:sgpr_128 = COPY $sgpr92_sgpr93_sgpr94_sgpr95
    %4:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
    %5:vgpr_32 = COPY %2.sub3
    %6:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
    %7:vgpr_32 = IMAGE_SAMPLE_L_V1_V4 %6, %4, %2, 8, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
    %8:vreg_96 = IMAGE_SAMPLE_L_V3_V4 %6, %4, %3, 7, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_l_not_merged_3
# GFX9: %{{[0-9]+}}:vgpr_32 = IMAGE_SAMPLE_L_V1_V4 %5, %3, %2, 8, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vreg_96 = IMAGE_SAMPLE_L_V3_V4 %5, %3, %2, 7, 1, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)

name: image_sample_l_not_merged_3
body: |
  bb.0.entry:
    %0:sgpr_64 = COPY $sgpr0_sgpr1
    %1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
    %2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
    %3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
    %4:vgpr_32 = COPY %2.sub3
    %5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
    %6:vgpr_32 = IMAGE_SAMPLE_L_V1_V4 %5, %3, %2, 8, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
    %7:vreg_96 = IMAGE_SAMPLE_L_V3_V4 %5, %3, %2, 7, 1, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_l_not_merged_4
# GFX9: %{{[0-9]+}}:vgpr_32 = IMAGE_SAMPLE_L_V1_V4 %5, %3, %2, 8, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vreg_96 = IMAGE_SAMPLE_L_V3_V4 %5, %3, %2, 7, 0, 1, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)

name: image_sample_l_not_merged_4
body: |
  bb.0.entry:
    %0:sgpr_64 = COPY $sgpr0_sgpr1
    %1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
    %2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
    %3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
    %4:vgpr_32 = COPY %2.sub3
    %5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
    %6:vgpr_32 = IMAGE_SAMPLE_L_V1_V4 %5, %3, %2, 8, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
    %7:vreg_96 = IMAGE_SAMPLE_L_V3_V4 %5, %3, %2, 7, 0, 1, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_l_not_merged_5
# GFX9: %{{[0-9]+}}:vgpr_32 = IMAGE_SAMPLE_L_V1_V4 %5, %3, %2, 8, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vreg_96 = IMAGE_SAMPLE_L_V3_V4 %5, %3, %2, 7, 0, 0, 1, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)

name: image_sample_l_not_merged_5
body: |
  bb.0.entry:
    %0:sgpr_64 = COPY $sgpr0_sgpr1
    %1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
    %2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
    %3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
    %4:vgpr_32 = COPY %2.sub3
    %5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
    %6:vgpr_32 = IMAGE_SAMPLE_L_V1_V4 %5, %3, %2, 8, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
    %7:vreg_96 = IMAGE_SAMPLE_L_V3_V4 %5, %3, %2, 7, 0, 0, 1, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_l_not_merged_6
# GFX9: %{{[0-9]+}}:vgpr_32 = IMAGE_SAMPLE_L_V1_V4 %5, %3, %2, 8, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vreg_96 = IMAGE_SAMPLE_L_V3_V4 %5, %3, %2, 7, 0, 0, 0, 1, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)

name: image_sample_l_not_merged_6
body: |
  bb.0.entry:
    %0:sgpr_64 = COPY $sgpr0_sgpr1
    %1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
    %2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
    %3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
    %4:vgpr_32 = COPY %2.sub3
    %5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
    %6:vgpr_32 = IMAGE_SAMPLE_L_V1_V4 %5, %3, %2, 8, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
    %7:vreg_96 = IMAGE_SAMPLE_L_V3_V4 %5, %3, %2, 7, 0, 0, 0, 1, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_l_not_merged_7
# GFX9: %{{[0-9]+}}:vreg_64 = IMAGE_SAMPLE_L_V2_V4 %5, %3, %2, 8, 0, 0, 0, 0, 1, 0, -1, 0, implicit $exec :: (dereferenceable load 8, addrspace 4)
# GFX9: %{{[0-9]+}}:vreg_96 = IMAGE_SAMPLE_L_V3_V4 %5, %3, %2, 7, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)

name: image_sample_l_not_merged_7
body: |
  bb.0.entry:
    %0:sgpr_64 = COPY $sgpr0_sgpr1
    %1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
    %2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
    %3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
    %4:vgpr_32 = COPY %2.sub3
    %5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
    %6:vreg_64 = IMAGE_SAMPLE_L_V2_V4 %5, %3, %2, 8, 0, 0, 0, 0, 1, 0, -1, 0, implicit $exec :: (dereferenceable load 8, addrspace 4)
    %7:vreg_96 = IMAGE_SAMPLE_L_V3_V4 %5, %3, %2, 7, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_l_not_merged_8
# GFX9: %{{[0-9]+}}:vreg_64 = IMAGE_SAMPLE_L_V2_V4 %5, %3, %2, 8, 0, 0, 0, 0, 0, 1, -1, 0, implicit $exec :: (dereferenceable load 8, addrspace 4)
# GFX9: %{{[0-9]+}}:vreg_96 = IMAGE_SAMPLE_L_V3_V4 %5, %3, %2, 7, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)

name: image_sample_l_not_merged_8
body: |
  bb.0.entry:
    %0:sgpr_64 = COPY $sgpr0_sgpr1
    %1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
    %2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
    %3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
    %4:vgpr_32 = COPY %2.sub3
    %5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
    %6:vreg_64 = IMAGE_SAMPLE_L_V2_V4 %5, %3, %2, 8, 0, 0, 0, 0, 0, 1, -1, 0, implicit $exec :: (dereferenceable load 8, addrspace 4)
    %7:vreg_96 = IMAGE_SAMPLE_L_V3_V4 %5, %3, %2, 7, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_l_not_merged_9
# GFX9: %{{[0-9]+}}:vgpr_32 = IMAGE_SAMPLE_L_V1_V4 %5, %3, %2, 8, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vreg_96 = IMAGE_SAMPLE_L_V3_V4 %5, %3, %2, 7, 0, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)

name: image_sample_l_not_merged_9
body: |
  bb.0.entry:
    %0:sgpr_64 = COPY $sgpr0_sgpr1
    %1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
    %2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
    %3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
    %4:vgpr_32 = COPY %2.sub3
    %5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
    %6:vgpr_32 = IMAGE_SAMPLE_L_V1_V4 %5, %3, %2, 8, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
    %7:vreg_96 = IMAGE_SAMPLE_L_V3_V4 %5, %3, %2, 7, 0, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_l_not_merged_10
# GFX9: %{{[0-9]+}}:vgpr_32 = IMAGE_SAMPLE_L_V1_V4 %5, %3, %2, 8, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vreg_96 = IMAGE_SAMPLE_L_V3_V4 %5, %3, %2, 7, 0, 0, 0, 0, 0, 0, -1, 1, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)

name: image_sample_l_not_merged_10
body: |
  bb.0.entry:
    %0:sgpr_64 = COPY $sgpr0_sgpr1
    %1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
    %2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
    %3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
    %4:vgpr_32 = COPY %2.sub3
    %5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
    %6:vgpr_32 = IMAGE_SAMPLE_L_V1_V4 %5, %3, %2, 8, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
    %7:vreg_96 = IMAGE_SAMPLE_L_V3_V4 %5, %3, %2, 7, 0, 0, 0, 0, 0, 0, -1, 1, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_merged_v1v3
# GFX9: %{{[0-9]+}}:vreg_128 = IMAGE_SAMPLE_V4_V4 %5, %3, %2, 15, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec, implicit $exec :: (dereferenceable load 16, align 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vgpr_32 = COPY %8.sub0
# GFX9: %{{[0-9]+}}:vreg_96 = COPY killed %8.sub1_sub2_sub3

name: image_sample_merged_v1v3
body: |
  bb.0.entry:
    %0:sgpr_64 = COPY $sgpr0_sgpr1
    %1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
    %2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
    %3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
    %4:vgpr_32 = COPY %2.sub3
    %5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
    %6:vgpr_32 = IMAGE_SAMPLE_V1_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 1, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
    %7:vreg_96 = IMAGE_SAMPLE_V3_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 14, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_b_merged_v1v3
# GFX9: %{{[0-9]+}}:vreg_128 = IMAGE_SAMPLE_B_V4_V4 %5, %3, %2, 15, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec, implicit $exec :: (dereferenceable load 16, align 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vgpr_32 = COPY %8.sub0
# GFX9: %{{[0-9]+}}:vreg_96 = COPY killed %8.sub1_sub2_sub3

name: image_sample_b_merged_v1v3
body: |
  bb.0.entry:
    %0:sgpr_64 = COPY $sgpr0_sgpr1
    %1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
    %2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
    %3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
    %4:vgpr_32 = COPY %2.sub3
    %5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
    %6:vgpr_32 = IMAGE_SAMPLE_B_V1_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 1, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
    %7:vreg_96 = IMAGE_SAMPLE_B_V3_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 14, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_b_cl_merged_v1v3
# GFX9: %{{[0-9]+}}:vreg_128 = IMAGE_SAMPLE_B_CL_V4_V4 %5, %3, %2, 15, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec, implicit $exec :: (dereferenceable load 16, align 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vgpr_32 = COPY %8.sub0
# GFX9: %{{[0-9]+}}:vreg_96 = COPY killed %8.sub1_sub2_sub3

name: image_sample_b_cl_merged_v1v3
body: |
  bb.0.entry:
    %0:sgpr_64 = COPY $sgpr0_sgpr1
    %1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
    %2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
    %3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
    %4:vgpr_32 = COPY %2.sub3
    %5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
    %6:vgpr_32 = IMAGE_SAMPLE_B_CL_V1_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 1, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
    %7:vreg_96 = IMAGE_SAMPLE_B_CL_V3_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 14, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_b_cl_o_merged_v1v3
# GFX9: %{{[0-9]+}}:vreg_128 = IMAGE_SAMPLE_B_CL_O_V4_V4 %5, %3, %2, 15, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec, implicit $exec :: (dereferenceable load 16, align 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vgpr_32 = COPY %8.sub0
# GFX9: %{{[0-9]+}}:vreg_96 = COPY killed %8.sub1_sub2_sub3

name: image_sample_b_cl_o_merged_v1v3
body: |
  bb.0.entry:
    %0:sgpr_64 = COPY $sgpr0_sgpr1
    %1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
    %2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
    %3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
    %4:vgpr_32 = COPY %2.sub3
    %5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
    %6:vgpr_32 = IMAGE_SAMPLE_B_CL_O_V1_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 1, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
    %7:vreg_96 = IMAGE_SAMPLE_B_CL_O_V3_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 14, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_b_o_merged_v1v3
# GFX9: %{{[0-9]+}}:vreg_128 = IMAGE_SAMPLE_B_O_V4_V4 %5, %3, %2, 15, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec, implicit $exec :: (dereferenceable load 16, align 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vgpr_32 = COPY %8.sub0
# GFX9: %{{[0-9]+}}:vreg_96 = COPY killed %8.sub1_sub2_sub3

name: image_sample_b_o_merged_v1v3
body: |
  bb.0.entry:
    %0:sgpr_64 = COPY $sgpr0_sgpr1
    %1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
    %2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
    %3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
    %4:vgpr_32 = COPY %2.sub3
    %5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
    %6:vgpr_32 = IMAGE_SAMPLE_B_O_V1_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 1, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
    %7:vreg_96 = IMAGE_SAMPLE_B_O_V3_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 14, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_c_merged_v1v3
# GFX9: %{{[0-9]+}}:vreg_128 = IMAGE_SAMPLE_C_V4_V4 %5, %3, %2, 15, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec, implicit $exec :: (dereferenceable load 16, align 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vgpr_32 = COPY %8.sub0
# GFX9: %{{[0-9]+}}:vreg_96 = COPY killed %8.sub1_sub2_sub3

name: image_sample_c_merged_v1v3
body: |
  bb.0.entry:
    %0:sgpr_64 = COPY $sgpr0_sgpr1
    %1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
    %2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
    %3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
    %4:vgpr_32 = COPY %2.sub3
    %5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
    %6:vgpr_32 = IMAGE_SAMPLE_C_V1_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 1, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
    %7:vreg_96 = IMAGE_SAMPLE_C_V3_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 14, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_cd_merged_v1v3
# GFX9: %{{[0-9]+}}:vreg_128 = IMAGE_SAMPLE_CD_V4_V4 %5, %3, %2, 15, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec, implicit $exec :: (dereferenceable load 16, align 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vgpr_32 = COPY %8.sub0
# GFX9: %{{[0-9]+}}:vreg_96 = COPY killed %8.sub1_sub2_sub3

name: image_sample_cd_merged_v1v3
body: |
  bb.0.entry:
    %0:sgpr_64 = COPY $sgpr0_sgpr1
    %1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
    %2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
    %3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
    %4:vgpr_32 = COPY %2.sub3
    %5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
    %6:vgpr_32 = IMAGE_SAMPLE_CD_V1_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 1, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
    %7:vreg_96 = IMAGE_SAMPLE_CD_V3_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 14, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_cd_cl_merged_v1v3
# GFX9: %{{[0-9]+}}:vreg_128 = IMAGE_SAMPLE_CD_CL_V4_V4 %5, %3, %2, 15, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec, implicit $exec :: (dereferenceable load 16, align 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vgpr_32 = COPY %8.sub0
# GFX9: %{{[0-9]+}}:vreg_96 = COPY killed %8.sub1_sub2_sub3

name: image_sample_cd_cl_merged_v1v3
body: |
  bb.0.entry:
    %0:sgpr_64 = COPY $sgpr0_sgpr1
    %1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
    %2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
    %3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
    %4:vgpr_32 = COPY %2.sub3
    %5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
    %6:vgpr_32 = IMAGE_SAMPLE_CD_CL_V1_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 1, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
    %7:vreg_96 = IMAGE_SAMPLE_CD_CL_V3_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 14, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_cd_cl_o_merged_v1v3
# GFX9: %{{[0-9]+}}:vreg_128 = IMAGE_SAMPLE_CD_CL_O_V4_V4 %5, %3, %2, 15, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec, implicit $exec :: (dereferenceable load 16, align 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vgpr_32 = COPY %8.sub0
# GFX9: %{{[0-9]+}}:vreg_96 = COPY killed %8.sub1_sub2_sub3

name: image_sample_cd_cl_o_merged_v1v3
body: |
bb.0.entry:
%0:sgpr_64 = COPY $sgpr0_sgpr1
%1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
%2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
%3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
%4:vgpr_32 = COPY %2.sub3
%5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
%6:vgpr_32 = IMAGE_SAMPLE_CD_CL_O_V1_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 1, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
%7:vreg_96 = IMAGE_SAMPLE_CD_CL_O_V3_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 14, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_cd_o_merged_v1v3
# GFX9: %{{[0-9]+}}:vreg_128 = IMAGE_SAMPLE_CD_O_V4_V4 %5, %3, %2, 15, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec, implicit $exec :: (dereferenceable load 16, align 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vgpr_32 = COPY %8.sub0
# GFX9: %{{[0-9]+}}:vreg_96 = COPY killed %8.sub1_sub2_sub3

name: image_sample_cd_o_merged_v1v3
body: |
bb.0.entry:
%0:sgpr_64 = COPY $sgpr0_sgpr1
%1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
%2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
%3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
%4:vgpr_32 = COPY %2.sub3
%5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
%6:vgpr_32 = IMAGE_SAMPLE_CD_O_V1_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 1, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
%7:vreg_96 = IMAGE_SAMPLE_CD_O_V3_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 14, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_cl_merged_v1v3
# GFX9: %{{[0-9]+}}:vreg_128 = IMAGE_SAMPLE_CL_V4_V4 %5, %3, %2, 15, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec, implicit $exec :: (dereferenceable load 16, align 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vgpr_32 = COPY %8.sub0
# GFX9: %{{[0-9]+}}:vreg_96 = COPY killed %8.sub1_sub2_sub3

name: image_sample_cl_merged_v1v3
body: |
bb.0.entry:
%0:sgpr_64 = COPY $sgpr0_sgpr1
%1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
%2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
%3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
%4:vgpr_32 = COPY %2.sub3
%5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
%6:vgpr_32 = IMAGE_SAMPLE_CL_V1_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 1, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
%7:vreg_96 = IMAGE_SAMPLE_CL_V3_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 14, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_cl_o_merged_v1v3
# GFX9: %{{[0-9]+}}:vreg_128 = IMAGE_SAMPLE_CL_O_V4_V4 %5, %3, %2, 15, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec, implicit $exec :: (dereferenceable load 16, align 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vgpr_32 = COPY %8.sub0
# GFX9: %{{[0-9]+}}:vreg_96 = COPY killed %8.sub1_sub2_sub3

name: image_sample_cl_o_merged_v1v3
body: |
bb.0.entry:
%0:sgpr_64 = COPY $sgpr0_sgpr1
%1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
%2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
%3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
%4:vgpr_32 = COPY %2.sub3
%5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
%6:vgpr_32 = IMAGE_SAMPLE_CL_O_V1_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 1, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
%7:vreg_96 = IMAGE_SAMPLE_CL_O_V3_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 14, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_c_b_merged_v1v3
# GFX9: %{{[0-9]+}}:vreg_128 = IMAGE_SAMPLE_C_B_V4_V4 %5, %3, %2, 15, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec, implicit $exec :: (dereferenceable load 16, align 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vgpr_32 = COPY %8.sub0
# GFX9: %{{[0-9]+}}:vreg_96 = COPY killed %8.sub1_sub2_sub3

name: image_sample_c_b_merged_v1v3
body: |
bb.0.entry:
%0:sgpr_64 = COPY $sgpr0_sgpr1
%1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
%2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
%3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
%4:vgpr_32 = COPY %2.sub3
%5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
%6:vgpr_32 = IMAGE_SAMPLE_C_B_V1_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 1, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
%7:vreg_96 = IMAGE_SAMPLE_C_B_V3_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 14, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_c_b_cl_merged_v1v3
# GFX9: %{{[0-9]+}}:vreg_128 = IMAGE_SAMPLE_C_B_CL_V4_V4 %5, %3, %2, 15, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec, implicit $exec :: (dereferenceable load 16, align 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vgpr_32 = COPY %8.sub0
# GFX9: %{{[0-9]+}}:vreg_96 = COPY killed %8.sub1_sub2_sub3

name: image_sample_c_b_cl_merged_v1v3
body: |
bb.0.entry:
%0:sgpr_64 = COPY $sgpr0_sgpr1
%1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
%2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
%3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
%4:vgpr_32 = COPY %2.sub3
%5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
%6:vgpr_32 = IMAGE_SAMPLE_C_B_CL_V1_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 1, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
%7:vreg_96 = IMAGE_SAMPLE_C_B_CL_V3_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 14, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_c_b_cl_o_merged_v1v3
# GFX9: %{{[0-9]+}}:vreg_128 = IMAGE_SAMPLE_C_B_CL_O_V4_V4 %5, %3, %2, 15, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec, implicit $exec :: (dereferenceable load 16, align 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vgpr_32 = COPY %8.sub0
# GFX9: %{{[0-9]+}}:vreg_96 = COPY killed %8.sub1_sub2_sub3

name: image_sample_c_b_cl_o_merged_v1v3
body: |
bb.0.entry:
%0:sgpr_64 = COPY $sgpr0_sgpr1
%1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
%2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
%3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
%4:vgpr_32 = COPY %2.sub3
%5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
%6:vgpr_32 = IMAGE_SAMPLE_C_B_CL_O_V1_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 1, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
%7:vreg_96 = IMAGE_SAMPLE_C_B_CL_O_V3_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 14, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_c_b_o_merged_v1v3
# GFX9: %{{[0-9]+}}:vreg_128 = IMAGE_SAMPLE_C_B_O_V4_V4 %5, %3, %2, 15, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec, implicit $exec :: (dereferenceable load 16, align 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vgpr_32 = COPY %8.sub0
# GFX9: %{{[0-9]+}}:vreg_96 = COPY killed %8.sub1_sub2_sub3

name: image_sample_c_b_o_merged_v1v3
body: |
bb.0.entry:
%0:sgpr_64 = COPY $sgpr0_sgpr1
%1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
%2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
%3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
%4:vgpr_32 = COPY %2.sub3
%5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
%6:vgpr_32 = IMAGE_SAMPLE_C_B_O_V1_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 1, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
%7:vreg_96 = IMAGE_SAMPLE_C_B_O_V3_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 14, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_c_cd_merged_v1v3
# GFX9: %{{[0-9]+}}:vreg_128 = IMAGE_SAMPLE_C_CD_V4_V4 %5, %3, %2, 15, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec, implicit $exec :: (dereferenceable load 16, align 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vgpr_32 = COPY %8.sub0
# GFX9: %{{[0-9]+}}:vreg_96 = COPY killed %8.sub1_sub2_sub3

name: image_sample_c_cd_merged_v1v3
body: |
bb.0.entry:
%0:sgpr_64 = COPY $sgpr0_sgpr1
%1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
%2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
%3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
%4:vgpr_32 = COPY %2.sub3
%5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
%6:vgpr_32 = IMAGE_SAMPLE_C_CD_V1_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 1, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
%7:vreg_96 = IMAGE_SAMPLE_C_CD_V3_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 14, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_c_cd_cl_merged_v1v3
# GFX9: %{{[0-9]+}}:vreg_128 = IMAGE_SAMPLE_C_CD_CL_V4_V4 %5, %3, %2, 15, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec, implicit $exec :: (dereferenceable load 16, align 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vgpr_32 = COPY %8.sub0
# GFX9: %{{[0-9]+}}:vreg_96 = COPY killed %8.sub1_sub2_sub3

name: image_sample_c_cd_cl_merged_v1v3
body: |
bb.0.entry:
%0:sgpr_64 = COPY $sgpr0_sgpr1
%1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
%2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
%3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
%4:vgpr_32 = COPY %2.sub3
%5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
%6:vgpr_32 = IMAGE_SAMPLE_C_CD_CL_V1_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 1, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
%7:vreg_96 = IMAGE_SAMPLE_C_CD_CL_V3_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 14, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_c_cd_cl_o_merged_v1v3
# GFX9: %{{[0-9]+}}:vreg_128 = IMAGE_SAMPLE_C_CD_CL_O_V4_V4 %5, %3, %2, 15, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec, implicit $exec :: (dereferenceable load 16, align 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vgpr_32 = COPY %8.sub0
# GFX9: %{{[0-9]+}}:vreg_96 = COPY killed %8.sub1_sub2_sub3

name: image_sample_c_cd_cl_o_merged_v1v3
body: |
bb.0.entry:
%0:sgpr_64 = COPY $sgpr0_sgpr1
%1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
%2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
%3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
%4:vgpr_32 = COPY %2.sub3
%5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
%6:vgpr_32 = IMAGE_SAMPLE_C_CD_CL_O_V1_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 1, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
%7:vreg_96 = IMAGE_SAMPLE_C_CD_CL_O_V3_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 14, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_c_cd_o_merged_v1v3
# GFX9: %{{[0-9]+}}:vreg_128 = IMAGE_SAMPLE_C_CD_O_V4_V4 %5, %3, %2, 15, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec, implicit $exec :: (dereferenceable load 16, align 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vgpr_32 = COPY %8.sub0
# GFX9: %{{[0-9]+}}:vreg_96 = COPY killed %8.sub1_sub2_sub3

name: image_sample_c_cd_o_merged_v1v3
body: |
bb.0.entry:
%0:sgpr_64 = COPY $sgpr0_sgpr1
%1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
%2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
%3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
%4:vgpr_32 = COPY %2.sub3
%5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
%6:vgpr_32 = IMAGE_SAMPLE_C_CD_O_V1_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 1, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
%7:vreg_96 = IMAGE_SAMPLE_C_CD_O_V3_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 14, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_c_cl_merged_v1v3
# GFX9: %{{[0-9]+}}:vreg_128 = IMAGE_SAMPLE_C_CL_V4_V4 %5, %3, %2, 15, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec, implicit $exec :: (dereferenceable load 16, align 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vgpr_32 = COPY %8.sub0
# GFX9: %{{[0-9]+}}:vreg_96 = COPY killed %8.sub1_sub2_sub3

name: image_sample_c_cl_merged_v1v3
body: |
bb.0.entry:
%0:sgpr_64 = COPY $sgpr0_sgpr1
%1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
%2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
%3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
%4:vgpr_32 = COPY %2.sub3
%5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
%6:vgpr_32 = IMAGE_SAMPLE_C_CL_V1_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 1, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
%7:vreg_96 = IMAGE_SAMPLE_C_CL_V3_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 14, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
|
|
|
|
---
|
|
|
|
|
|
|
|
|
|
|
|
# GFX9-LABEL: name: image_sample_c_cl_o_merged_v1v3
|
|
|
|
# GFX9: %{{[0-9]+}}:vreg_128 = IMAGE_SAMPLE_C_CL_O_V4_V4 %5, %3, %2, 15, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec, implicit $exec :: (dereferenceable load 16, align 4, addrspace 4)
|
|
|
|
# GFX9: %{{[0-9]+}}:vgpr_32 = COPY %8.sub0
|
|
|
|
# GFX9: %{{[0-9]+}}:vreg_96 = COPY killed %8.sub1_sub2_sub3
|
|
|
|
|
|
|
|
name: image_sample_c_cl_o_merged_v1v3
|
|
|
|
body: |
|
|
|
|
bb.0.entry:
|
|
|
|
%0:sgpr_64 = COPY $sgpr0_sgpr1
|
|
|
|
%1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
|
|
|
|
%2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
|
2020-04-22 18:08:08 +08:00
|
|
|
%3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
|
[AMDGPU] Extend the SI Load/Store optimizer
Summary:
Extend the SI Load/Store optimizer to merge MIMG load instructions. Handle
different flavours of image_load and image_sample instructions.
When the instructions of the same subclass differ only in dmask, merge
them and update dmask accordingly.
Reviewers: nhaehnle
Reviewed By: nhaehnle
Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D64911
llvm-svn: 374984
2019-10-16 18:17:02 +08:00
|
|
|
%4:vgpr_32 = COPY %2.sub3
|
|
|
|
%5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
|
2020-04-22 18:08:08 +08:00
|
|
|
%6:vgpr_32 = IMAGE_SAMPLE_C_CL_O_V1_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 1, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
|
2020-05-29 08:38:16 +08:00
|
|
|
%7:vreg_96 = IMAGE_SAMPLE_C_CL_O_V3_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 14, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
|
...
---

# GFX9-LABEL: name: image_sample_c_d_merged_v1v3
# GFX9: %{{[0-9]+}}:vreg_128 = IMAGE_SAMPLE_C_D_V4_V4 %5, %3, %2, 15, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec, implicit $exec :: (dereferenceable load 16, align 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vgpr_32 = COPY %8.sub0
# GFX9: %{{[0-9]+}}:vreg_96 = COPY killed %8.sub1_sub2_sub3

name: image_sample_c_d_merged_v1v3
body: |
  bb.0.entry:
    %0:sgpr_64 = COPY $sgpr0_sgpr1
    %1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
    %2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
    %3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
    %4:vgpr_32 = COPY %2.sub3
    %5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
    %6:vgpr_32 = IMAGE_SAMPLE_C_D_V1_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 1, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
    %7:vreg_96 = IMAGE_SAMPLE_C_D_V3_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 14, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_c_d_cl_merged_v1v3
# GFX9: %{{[0-9]+}}:vreg_128 = IMAGE_SAMPLE_C_D_CL_V4_V4 %5, %3, %2, 15, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec, implicit $exec :: (dereferenceable load 16, align 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vgpr_32 = COPY %8.sub0
# GFX9: %{{[0-9]+}}:vreg_96 = COPY killed %8.sub1_sub2_sub3

name: image_sample_c_d_cl_merged_v1v3
body: |
  bb.0.entry:
    %0:sgpr_64 = COPY $sgpr0_sgpr1
    %1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
    %2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
    %3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
    %4:vgpr_32 = COPY %2.sub3
    %5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
    %6:vgpr_32 = IMAGE_SAMPLE_C_D_CL_V1_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 1, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
    %7:vreg_96 = IMAGE_SAMPLE_C_D_CL_V3_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 14, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_c_d_cl_o_merged_v1v3
# GFX9: %{{[0-9]+}}:vreg_128 = IMAGE_SAMPLE_C_D_CL_O_V4_V4 %5, %3, %2, 15, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec, implicit $exec :: (dereferenceable load 16, align 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vgpr_32 = COPY %8.sub0
# GFX9: %{{[0-9]+}}:vreg_96 = COPY killed %8.sub1_sub2_sub3

name: image_sample_c_d_cl_o_merged_v1v3
body: |
  bb.0.entry:
    %0:sgpr_64 = COPY $sgpr0_sgpr1
    %1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
    %2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
    %3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
    %4:vgpr_32 = COPY %2.sub3
    %5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
    %6:vgpr_32 = IMAGE_SAMPLE_C_D_CL_O_V1_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 1, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
    %7:vreg_96 = IMAGE_SAMPLE_C_D_CL_O_V3_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 14, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_c_d_o_merged_v1v3
# GFX9: %{{[0-9]+}}:vreg_128 = IMAGE_SAMPLE_C_D_O_V4_V4 %5, %3, %2, 15, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec, implicit $exec :: (dereferenceable load 16, align 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vgpr_32 = COPY %8.sub0
# GFX9: %{{[0-9]+}}:vreg_96 = COPY killed %8.sub1_sub2_sub3

name: image_sample_c_d_o_merged_v1v3
body: |
  bb.0.entry:
    %0:sgpr_64 = COPY $sgpr0_sgpr1
    %1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
    %2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
    %3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
    %4:vgpr_32 = COPY %2.sub3
    %5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
    %6:vgpr_32 = IMAGE_SAMPLE_C_D_O_V1_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 1, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
    %7:vreg_96 = IMAGE_SAMPLE_C_D_O_V3_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 14, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_c_l_merged_v1v3
# GFX9: %{{[0-9]+}}:vreg_128 = IMAGE_SAMPLE_C_L_V4_V4 %5, %3, %2, 15, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec, implicit $exec :: (dereferenceable load 16, align 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vgpr_32 = COPY %8.sub0
# GFX9: %{{[0-9]+}}:vreg_96 = COPY killed %8.sub1_sub2_sub3

name: image_sample_c_l_merged_v1v3
body: |
  bb.0.entry:
    %0:sgpr_64 = COPY $sgpr0_sgpr1
    %1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
    %2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
    %3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
    %4:vgpr_32 = COPY %2.sub3
    %5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
    %6:vgpr_32 = IMAGE_SAMPLE_C_L_V1_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 1, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
    %7:vreg_96 = IMAGE_SAMPLE_C_L_V3_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 14, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_c_lz_merged_v1v3
# GFX9: %{{[0-9]+}}:vreg_128 = IMAGE_SAMPLE_C_LZ_V4_V4 %5, %3, %2, 15, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec, implicit $exec :: (dereferenceable load 16, align 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vgpr_32 = COPY %8.sub0
# GFX9: %{{[0-9]+}}:vreg_96 = COPY killed %8.sub1_sub2_sub3

name: image_sample_c_lz_merged_v1v3
body: |
  bb.0.entry:
    %0:sgpr_64 = COPY $sgpr0_sgpr1
    %1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
    %2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
    %3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
    %4:vgpr_32 = COPY %2.sub3
    %5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
    %6:vgpr_32 = IMAGE_SAMPLE_C_LZ_V1_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 1, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
    %7:vreg_96 = IMAGE_SAMPLE_C_LZ_V3_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 14, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_c_lz_o_merged_v1v3
# GFX9: %{{[0-9]+}}:vreg_128 = IMAGE_SAMPLE_C_LZ_O_V4_V4 %5, %3, %2, 15, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec, implicit $exec :: (dereferenceable load 16, align 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vgpr_32 = COPY %8.sub0
# GFX9: %{{[0-9]+}}:vreg_96 = COPY killed %8.sub1_sub2_sub3

name: image_sample_c_lz_o_merged_v1v3
body: |
  bb.0.entry:
    %0:sgpr_64 = COPY $sgpr0_sgpr1
    %1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
    %2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
    %3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
    %4:vgpr_32 = COPY %2.sub3
    %5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
    %6:vgpr_32 = IMAGE_SAMPLE_C_LZ_O_V1_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 1, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
    %7:vreg_96 = IMAGE_SAMPLE_C_LZ_O_V3_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 14, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_c_l_o_merged_v1v3
# GFX9: %{{[0-9]+}}:vreg_128 = IMAGE_SAMPLE_C_L_O_V4_V4 %5, %3, %2, 15, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec, implicit $exec :: (dereferenceable load 16, align 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vgpr_32 = COPY %8.sub0
# GFX9: %{{[0-9]+}}:vreg_96 = COPY killed %8.sub1_sub2_sub3

name: image_sample_c_l_o_merged_v1v3
body: |
  bb.0.entry:
    %0:sgpr_64 = COPY $sgpr0_sgpr1
    %1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
    %2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
    %3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
    %4:vgpr_32 = COPY %2.sub3
    %5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
    %6:vgpr_32 = IMAGE_SAMPLE_C_L_O_V1_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 1, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
    %7:vreg_96 = IMAGE_SAMPLE_C_L_O_V3_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 14, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_c_o_merged_v1v3
# GFX9: %{{[0-9]+}}:vreg_128 = IMAGE_SAMPLE_C_O_V4_V4 %5, %3, %2, 15, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec, implicit $exec :: (dereferenceable load 16, align 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vgpr_32 = COPY %8.sub0
# GFX9: %{{[0-9]+}}:vreg_96 = COPY killed %8.sub1_sub2_sub3

name: image_sample_c_o_merged_v1v3
body: |
  bb.0.entry:
    %0:sgpr_64 = COPY $sgpr0_sgpr1
    %1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
    %2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
    %3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
    %4:vgpr_32 = COPY %2.sub3
    %5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
    %6:vgpr_32 = IMAGE_SAMPLE_C_O_V1_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 1, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
    %7:vreg_96 = IMAGE_SAMPLE_C_O_V3_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 14, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_d_merged_v1v3
# GFX9: %{{[0-9]+}}:vreg_128 = IMAGE_SAMPLE_D_V4_V4 %5, %3, %2, 15, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec, implicit $exec :: (dereferenceable load 16, align 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vgpr_32 = COPY %8.sub0
# GFX9: %{{[0-9]+}}:vreg_96 = COPY killed %8.sub1_sub2_sub3

name: image_sample_d_merged_v1v3
body: |
  bb.0.entry:
    %0:sgpr_64 = COPY $sgpr0_sgpr1
    %1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
    %2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
    %3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
    %4:vgpr_32 = COPY %2.sub3
    %5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
    %6:vgpr_32 = IMAGE_SAMPLE_D_V1_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 1, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
    %7:vreg_96 = IMAGE_SAMPLE_D_V3_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 14, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_d_cl_merged_v1v3
# GFX9: %{{[0-9]+}}:vreg_128 = IMAGE_SAMPLE_D_CL_V4_V4 %5, %3, %2, 15, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec, implicit $exec :: (dereferenceable load 16, align 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vgpr_32 = COPY %8.sub0
# GFX9: %{{[0-9]+}}:vreg_96 = COPY killed %8.sub1_sub2_sub3

name: image_sample_d_cl_merged_v1v3
body: |
  bb.0.entry:
    %0:sgpr_64 = COPY $sgpr0_sgpr1
    %1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
    %2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
    %3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
    %4:vgpr_32 = COPY %2.sub3
    %5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
    %6:vgpr_32 = IMAGE_SAMPLE_D_CL_V1_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 1, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
    %7:vreg_96 = IMAGE_SAMPLE_D_CL_V3_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 14, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_d_cl_o_merged_v1v3
# GFX9: %{{[0-9]+}}:vreg_128 = IMAGE_SAMPLE_D_CL_O_V4_V4 %5, %3, %2, 15, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec, implicit $exec :: (dereferenceable load 16, align 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vgpr_32 = COPY %8.sub0
# GFX9: %{{[0-9]+}}:vreg_96 = COPY killed %8.sub1_sub2_sub3

name: image_sample_d_cl_o_merged_v1v3
body: |
  bb.0.entry:
    %0:sgpr_64 = COPY $sgpr0_sgpr1
    %1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
    %2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
    %3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
    %4:vgpr_32 = COPY %2.sub3
    %5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
    %6:vgpr_32 = IMAGE_SAMPLE_D_CL_O_V1_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 1, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
    %7:vreg_96 = IMAGE_SAMPLE_D_CL_O_V3_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 14, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_d_o_merged_v1v3
# GFX9: %{{[0-9]+}}:vreg_128 = IMAGE_SAMPLE_D_O_V4_V4 %5, %3, %2, 15, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec, implicit $exec :: (dereferenceable load 16, align 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vgpr_32 = COPY %8.sub0
# GFX9: %{{[0-9]+}}:vreg_96 = COPY killed %8.sub1_sub2_sub3

name: image_sample_d_o_merged_v1v3
body: |
  bb.0.entry:
    %0:sgpr_64 = COPY $sgpr0_sgpr1
    %1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
    %2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
    %3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
    %4:vgpr_32 = COPY %2.sub3
    %5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
    %6:vgpr_32 = IMAGE_SAMPLE_D_O_V1_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 1, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
    %7:vreg_96 = IMAGE_SAMPLE_D_O_V3_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 14, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
[AMDGPU] Extend the SI Load/Store optimizer
Summary:
Extend the SI Load/Store optimizer to merge MIMG load instructions. Handle
different flavours of image_load and image_sample instructions.
When the instructions of the same subclass differ only in dmask, merge
them and update dmask accordingly.
Reviewers: nhaehnle
Reviewed By: nhaehnle
Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D64911
llvm-svn: 374984
2019-10-16 18:17:02 +08:00
|
|
|
...
|
|
|
|
---

# GFX9-LABEL: name: image_sample_lz_merged_v1v3
# GFX9: %{{[0-9]+}}:vreg_128 = IMAGE_SAMPLE_LZ_V4_V4 %5, %3, %2, 15, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec, implicit $exec :: (dereferenceable load 16, align 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vgpr_32 = COPY %8.sub0
# GFX9: %{{[0-9]+}}:vreg_96 = COPY killed %8.sub1_sub2_sub3

name: image_sample_lz_merged_v1v3
body: |
  bb.0.entry:
    %0:sgpr_64 = COPY $sgpr0_sgpr1
    %1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
    %2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
    %3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
    %4:vgpr_32 = COPY %2.sub3
    %5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
    %6:vgpr_32 = IMAGE_SAMPLE_LZ_V1_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 1, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
    %7:vreg_96 = IMAGE_SAMPLE_LZ_V3_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 14, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_lz_o_merged_v1v3
# GFX9: %{{[0-9]+}}:vreg_128 = IMAGE_SAMPLE_LZ_O_V4_V4 %5, %3, %2, 15, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec, implicit $exec :: (dereferenceable load 16, align 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vgpr_32 = COPY %8.sub0
# GFX9: %{{[0-9]+}}:vreg_96 = COPY killed %8.sub1_sub2_sub3

name: image_sample_lz_o_merged_v1v3
body: |
  bb.0.entry:
    %0:sgpr_64 = COPY $sgpr0_sgpr1
    %1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
    %2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
    %3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
    %4:vgpr_32 = COPY %2.sub3
    %5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
    %6:vgpr_32 = IMAGE_SAMPLE_LZ_O_V1_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 1, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
    %7:vreg_96 = IMAGE_SAMPLE_LZ_O_V3_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 14, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_l_o_merged_v1v3
# GFX9: %{{[0-9]+}}:vreg_128 = IMAGE_SAMPLE_L_O_V4_V4 %5, %3, %2, 15, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec, implicit $exec :: (dereferenceable load 16, align 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vgpr_32 = COPY %8.sub0
# GFX9: %{{[0-9]+}}:vreg_96 = COPY killed %8.sub1_sub2_sub3

name: image_sample_l_o_merged_v1v3
body: |
  bb.0.entry:
    %0:sgpr_64 = COPY $sgpr0_sgpr1
    %1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
    %2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
    %3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
    %4:vgpr_32 = COPY %2.sub3
    %5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
    %6:vgpr_32 = IMAGE_SAMPLE_L_O_V1_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 1, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
    %7:vreg_96 = IMAGE_SAMPLE_L_O_V3_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 14, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---

# GFX9-LABEL: name: image_sample_o_merged_v1v3
# GFX9: %{{[0-9]+}}:vreg_128 = IMAGE_SAMPLE_O_V4_V4 %5, %3, %2, 15, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec, implicit $exec :: (dereferenceable load 16, align 4, addrspace 4)
# GFX9: %{{[0-9]+}}:vgpr_32 = COPY %8.sub0
# GFX9: %{{[0-9]+}}:vreg_96 = COPY killed %8.sub1_sub2_sub3

name: image_sample_o_merged_v1v3
body: |
  bb.0.entry:
    %0:sgpr_64 = COPY $sgpr0_sgpr1
    %1:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 36, 0, 0
    %2:sgpr_128 = COPY $sgpr96_sgpr97_sgpr98_sgpr99
    %3:sgpr_256 = S_LOAD_DWORDX8_IMM %1, 208, 0, 0
    %4:vgpr_32 = COPY %2.sub3
    %5:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET %2:sgpr_128, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load 16)
    %6:vgpr_32 = IMAGE_SAMPLE_O_V1_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 1, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 4, addrspace 4)
    %7:vreg_96 = IMAGE_SAMPLE_O_V3_V4 %5:vreg_128, %3:sgpr_256, %2:sgpr_128, 14, 0, 0, 0, 0, 0, 0, -1, 0, implicit $exec :: (dereferenceable load 12, align 16, addrspace 4)
...
---