llvm-project/llvm/test/CodeGen/AMDGPU/commute-shifts.ll

; RUN: llc -march=amdgcn -mcpu=verde -verify-machineinstrs < %s | FileCheck -check-prefix=GCN -check-prefix=SI %s
; RUN: llc -march=amdgcn -mcpu=tonga -verify-machineinstrs < %s | FileCheck -check-prefix=GCN -check-prefix=VI %s

; GCN-LABEL: {{^}}main:
; SI: v_lshl_b32_e32 v{{[0-9]+}}, 1, v{{[0-9]+}}
; VI: v_lshlrev_b32_e64 v{{[0-9]+}}, v{{[0-9]+}}, 1
define amdgpu_ps float @main(float %arg0, float %arg1) #0 {
bb:
  %tmp = fptosi float %arg0 to i32
  %tmp1 = call <4 x float> @llvm.amdgcn.image.load.v4f32.v4i32.v8i32(<4 x i32> undef, <8 x i32> undef, i32 15, i1 false, i1 false, i1 false, i1 false)
  %tmp2.f = extractelement <4 x float> %tmp1, i32 0
  %tmp2 = bitcast float %tmp2.f to i32
  %tmp3 = and i32 %tmp, 7
  %tmp4 = shl i32 1, %tmp3
  %tmp5 = and i32 %tmp2, %tmp4
  %tmp6 = icmp eq i32 %tmp5, 0
  %tmp7 = select i1 %tmp6, float 0.000000e+00, float %arg1
  %tmp8 = call <2 x half> @llvm.amdgcn.cvt.pkrtz(float undef, float %tmp7)
  %tmp9 = bitcast <2 x half> %tmp8 to float
  ret float %tmp9
}

declare <2 x half> @llvm.amdgcn.cvt.pkrtz(float, float) #1
declare <4 x float> @llvm.amdgcn.image.load.v4f32.v4i32.v8i32(<4 x i32>, <8 x i32>, i32, i1, i1, i1, i1) #2

attributes #0 = { nounwind }
attributes #1 = { nounwind readnone }
attributes #2 = { nounwind readonly }
AMDGPU: really don't commute REV opcodes if the target variant doesn't exist If pseudoToMCOpcode failed, we would return the original opcode, so operands would be swapped, but the instruction would remain the same. It resulted in LSHLREV a, b ---> LSHLREV b, a. This fixes Glamor text rendering and piglit/arb_sample_shading-builtin-gl-sample-mask on VI. This is a candidate for stable branches. v2: the test was simplified by Tom Stellard llvm-svn: 240824 2015-06-27 04:29:10 +08:00			`; RUN: llc -march=amdgcn -mcpu=verde -verify-machineinstrs < %s \| FileCheck -check-prefix=GCN -check-prefix=SI %s`
			`; RUN: llc -march=amdgcn -mcpu=tonga -verify-machineinstrs < %s \| FileCheck -check-prefix=GCN -check-prefix=VI %s`

			`; GCN-LABEL: {{^}}main:`
			`; SI: v_lshl_b32_e32 v{{[0-9]+}}, 1, v{{[0-9]+}}`
			`; VI: v_lshlrev_b32_e64 v{{[0-9]+}}, v{{[0-9]+}}, 1`
AMDGPU: Remove some uses of llvm.SI.export in tests Merge some of the old, smaller tests into more complete versions. llvm-svn: 295792 2017-02-22 08:02:21 +08:00			`define amdgpu_ps float @main(float %arg0, float %arg1) #0 {`
AMDGPU: Remove old sample intrinsics I did my best to try to update all the uses in tests that just happened to use the old ones to the newer intrinsics. I'm not sure I got all of the immediate operand conversions correct, since the value seems to have been ignored by the old pattern but I don't think it really matters. llvm-svn: 258787 2016-01-26 12:38:08 +08:00			`bb:`
AMDGPU: Fix a few slightly broken tests Fix minor bugs and uses of undef which break when pointer related optimization passes are run. llvm-svn: 269944 2016-05-18 23:48:44 +08:00			`%tmp = fptosi float %arg0 to i32`
AMDGPU: Convert image intrinsic uses in tests llvm-svn: 298386 2017-03-22 00:24:12 +08:00			`%tmp1 = call <4 x float> @llvm.amdgcn.image.load.v4f32.v4i32.v8i32(<4 x i32> undef, <8 x i32> undef, i32 15, i1 false, i1 false, i1 false, i1 false)`
AMDGPU: Remove old sample intrinsics I did my best to try to update all the uses in tests that just happened to use the old ones to the newer intrinsics. I'm not sure I got all of the immediate operand conversions correct, since the value seems to have been ignored by the old pattern but I don't think it really matters. llvm-svn: 258787 2016-01-26 12:38:08 +08:00			`%tmp2.f = extractelement <4 x float> %tmp1, i32 0`
			`%tmp2 = bitcast float %tmp2.f to i32`
			`%tmp3 = and i32 %tmp, 7`
			`%tmp4 = shl i32 1, %tmp3`
			`%tmp5 = and i32 %tmp2, %tmp4`
			`%tmp6 = icmp eq i32 %tmp5, 0`
AMDGPU: Fix a few slightly broken tests Fix minor bugs and uses of undef which break when pointer related optimization passes are run. llvm-svn: 269944 2016-05-18 23:48:44 +08:00			`%tmp7 = select i1 %tmp6, float 0.000000e+00, float %arg1`
AMDGPU: Add cvt.pkrtz intrinsic Convert llvm.SI.packf16 test uses llvm-svn: 295797 2017-02-22 08:27:34 +08:00			`%tmp8 = call <2 x half> @llvm.amdgcn.cvt.pkrtz(float undef, float %tmp7)`
			`%tmp9 = bitcast <2 x half> %tmp8 to float`
AMDGPU: Remove some uses of llvm.SI.export in tests Merge some of the old, smaller tests into more complete versions. llvm-svn: 295792 2017-02-22 08:02:21 +08:00			`ret float %tmp9`
AMDGPU: really don't commute REV opcodes if the target variant doesn't exist If pseudoToMCOpcode failed, we would return the original opcode, so operands would be swapped, but the instruction would remain the same. It resulted in LSHLREV a, b ---> LSHLREV b, a. This fixes Glamor text rendering and piglit/arb_sample_shading-builtin-gl-sample-mask on VI. This is a candidate for stable branches. v2: the test was simplified by Tom Stellard llvm-svn: 240824 2015-06-27 04:29:10 +08:00			`}`

AMDGPU: Add cvt.pkrtz intrinsic Convert llvm.SI.packf16 test uses llvm-svn: 295797 2017-02-22 08:27:34 +08:00			`declare <2 x half> @llvm.amdgcn.cvt.pkrtz(float, float) #1`
AMDGPU: Convert image intrinsic uses in tests llvm-svn: 298386 2017-03-22 00:24:12 +08:00			`declare <4 x float> @llvm.amdgcn.image.load.v4f32.v4i32.v8i32(<4 x i32>, <8 x i32>, i32, i1, i1, i1, i1) #2`
AMDGPU: really don't commute REV opcodes if the target variant doesn't exist If pseudoToMCOpcode failed, we would return the original opcode, so operands would be swapped, but the instruction would remain the same. It resulted in LSHLREV a, b ---> LSHLREV b, a. This fixes Glamor text rendering and piglit/arb_sample_shading-builtin-gl-sample-mask on VI. This is a candidate for stable branches. v2: the test was simplified by Tom Stellard llvm-svn: 240824 2015-06-27 04:29:10 +08:00
AMDGPU: Remove superfluous string attributes from tests Also fix v_mac.ll not testing right thing for fneg llvm-svn: 275129 2016-07-12 07:35:48 +08:00			`attributes #0 = { nounwind }`
AMDGPU: really don't commute REV opcodes if the target variant doesn't exist If pseudoToMCOpcode failed, we would return the original opcode, so operands would be swapped, but the instruction would remain the same. It resulted in LSHLREV a, b ---> LSHLREV b, a. This fixes Glamor text rendering and piglit/arb_sample_shading-builtin-gl-sample-mask on VI. This is a candidate for stable branches. v2: the test was simplified by Tom Stellard llvm-svn: 240824 2015-06-27 04:29:10 +08:00			`attributes #1 = { nounwind readnone }`
AMDGPU: Convert image intrinsic uses in tests llvm-svn: 298386 2017-03-22 00:24:12 +08:00			`attributes #2 = { nounwind readonly }`