llvm-project/llvm/test/CodeGen/RISCV/rvv/tail-agnostic-impdef-copy.mir

# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
# RUN: llc %s -mtriple=riscv64 -mattr=experimental-v -riscv-v-vector-bits-min=128 -run-pass=finalize-isel -o - | FileCheck %s
# This test makes sure we peek through the COPY instruction between the
# IMPLICIT_DEF and PseudoVLE64_V_M8_MASK in order to select the tail agnostic
# policy. The test passes if the second argument to PseudoVSETVLI has bit 6
# set.
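#
# Note on the generated checks below: assuming the usual operand order of the
# masked RVV load pseudos, the trailing operands of PseudoVLE64_V_M8_MASK are
# the AVL (-1, i.e. VLMAX), log2(SEW) (6, i.e. SEW=64), and the tail policy
# operand (1, i.e. tail agnostic), so the trailing `1` is what shows that the
# tail agnostic policy was selected.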
--- |
  ; ModuleID = 'test.ll'
  source_filename = "test.ll"
  target datalayout = "e-m:e-p:64:64-i64:64-i128:128-n64-S128"
  target triple = "riscv64"

  ; Function Attrs: nounwind
  define <vscale x 8 x i64> @masked_load_nxv8i64(<vscale x 8 x i64>* %a, <vscale x 8 x i1> %mask) #0 {
    %load = call <vscale x 8 x i64> @llvm.masked.load.nxv8i64.p0nxv8i64(<vscale x 8 x i64>* %a, i32 8, <vscale x 8 x i1> %mask, <vscale x 8 x i64> undef)
    ret <vscale x 8 x i64> %load
  }

  ; Function Attrs: argmemonly nofree nosync nounwind readonly willreturn
  declare <vscale x 8 x i64> @llvm.masked.load.nxv8i64.p0nxv8i64(<vscale x 8 x i64>*, i32 immarg, <vscale x 8 x i1>, <vscale x 8 x i64>) #1

  attributes #0 = { nounwind "target-features"="+experimental-v" }
  attributes #1 = { argmemonly nofree nosync nounwind readonly willreturn "target-features"="+experimental-v" }

...
---
name:            masked_load_nxv8i64
alignment:       4
tracksRegLiveness: true
registers:
  - { id: 0, class: gpr }
  - { id: 1, class: vr }
  - { id: 2, class: vrm8nov0 }
  - { id: 3, class: vrm8 }
  - { id: 4, class: vrm8nov0 }
liveins:
  - { reg: '$x10', virtual-reg: '%0' }
  - { reg: '$v0', virtual-reg: '%1' }
frameInfo:
  maxAlignment:    1
machineFunctionInfo: {}
body:             |
  bb.0 (%ir-block.0):
    liveins: $x10, $v0

    ; CHECK-LABEL: name: masked_load_nxv8i64
    ; CHECK: liveins: $x10, $v0
    ; CHECK-NEXT: {{  $}}
    ; CHECK-NEXT: [[COPY:%[0-9]+]]:vr = COPY $v0
    ; CHECK-NEXT: [[COPY1:%[0-9]+]]:gpr = COPY $x10
    ; CHECK-NEXT: $v0 = COPY [[COPY]]
    ; CHECK-NEXT: [[DEF:%[0-9]+]]:vrm8 = IMPLICIT_DEF
    ; CHECK-NEXT: [[COPY2:%[0-9]+]]:vrm8nov0 = COPY [[DEF]]
    ; CHECK-NEXT: [[PseudoVLE64_V_M8_MASK:%[0-9]+]]:vrm8nov0 = PseudoVLE64_V_M8_MASK [[COPY2]], [[COPY1]], $v0, -1, 6, 1 :: (load (s512) from %ir.a, align 8)
    ; CHECK-NEXT: $v8m8 = COPY [[PseudoVLE64_V_M8_MASK]]
    ; CHECK-NEXT: PseudoRET implicit $v8m8
    %1:vr = COPY $v0
    %0:gpr = COPY $x10
    $v0 = COPY %1
    %3:vrm8 = IMPLICIT_DEF
    %4:vrm8nov0 = COPY %3
    %2:vrm8nov0 = PseudoVLE64_V_M8_MASK %4, %0, $v0, -1, 6, 1 :: (load (s512) from %ir.a, align 8)
    $v8m8 = COPY %2
    PseudoRET implicit $v8m8
...