llvm-project/llvm/test/CodeGen/AMDGPU/movreld-bug.ll

; RUN: llc -march=amdgcn -mcpu=verde -verify-machineinstrs < %s | FileCheck -check-prefix=GCN %s
; RUN: llc -march=amdgcn -mcpu=tonga -verify-machineinstrs < %s | FileCheck -check-prefix=GCN %s

; GCN-LABEL: {{^}}main:
; GCN: v_movreld_b32_e32 v0,
; GCN: v_mov_b32_e32 v0, v1
; GCN: ; return
define amdgpu_ps float @main(i32 inreg %arg) #0 {
main_body:
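  ; Insert at a dynamic index: the <16 x float> vector has to live in
  ; consecutive VGPRs, and the write is expected to use v_movreld.
  %tmp24 = insertelement <16 x float> undef, float 0.000000e+00, i32 %arg
  ; Read back the fixed element 1; the checks above expect it to come from v1.
  %tmp25 = extractelement <16 x float> %tmp24, i32 1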
  ret float %tmp25
}
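
; Companion sketch (an assumption for illustration, not part of the upstream
; test): the commit message for this test notes that v_movrels does not suffer
; from the same tied-operand problem. The hypothetical function below
; (@extract_dynamic, %vec and %idx are made-up names) exercises the read side
; with a dynamic extractelement. It carries no FileCheck lines because the
; exact instruction sequence has not been verified here.
define amdgpu_ps float @extract_dynamic(<16 x float> %vec, i32 inreg %idx) #0 {
main_body:
  %elt = extractelement <16 x float> %vec, i32 %idx
  ret float %elt
}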

attributes #0 = { "InitialPSInputAddr"="36983" }

; History (from the blame view of this file)
;
; llvm-svn: 284980, 2016-10-24 22:56:02 +08:00
; (all lines except the insertelement and extractelement lines)
;   AMDGPU: Fix Two Address problems with v_movreld
;
;   Summary: The v_movreld machine instruction is used with three operands
;   that are in a sense tied to each other (the explicit VGPR_32 def and the
;   implicit VGPR_NN def and use). There is no way to express that using the
;   currently available operand bits, and indeed there are cases where the
;   Two Address instructions pass does the wrong thing.
;
;   This patch introduces a new set of pseudo instructions that are identical
;   in intended semantics as v_movreld, but they only have two tied operands.
;
;   Having to add a new set of pseudo instructions is admittedly annoying,
;   but it's a fairly straightforward and solid approach. The only
;   alternative I see is to try to teach the Two Address instructions pass
;   about Three Address instructions, and I'm afraid that's trickier and is
;   going to end up more fragile.
;
;   Note that v_movrels does not suffer from this problem, and so this patch
;   does not touch it.
;
;   This fixes several GL45-CTS.shaders.indexing.* tests.
;
;   Reviewers: tstellarAMD, arsenm
;   Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye
;   Differential Revision: https://reviews.llvm.org/D25633
;
; llvm-svn: 347231, 2018-11-20 01:39:20 +08:00
; (the insertelement and extractelement lines)
;   [AMDGPU] Convert insert_vector_elt into set of selects
;
;   This allows to avoid scratch use or indirect VGPR addressing for small
;   vectors.
;
;   Differential Revision: https://reviews.llvm.org/D54606