llvm-project/llvm/test/CodeGen/AArch64/vecreduce-fmax-legalization...

Ignoring revisions in .git-blame-ignore-revs. Click here to bypass and see the normal blame view.

73 lines
2.3 KiB
LLVM
Raw Normal View History

; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
; RUN: llc < %s -mtriple=aarch64-none-linux-gnu -mattr=+neon | FileCheck %s --check-prefix=CHECK
declare half @llvm.vector.reduce.fmax.v1f16(<1 x half> %a)
declare float @llvm.vector.reduce.fmax.v1f32(<1 x float> %a)
declare double @llvm.vector.reduce.fmax.v1f64(<1 x double> %a)
declare fp128 @llvm.vector.reduce.fmax.v1f128(<1 x fp128> %a)
declare float @llvm.vector.reduce.fmax.v3f32(<3 x float> %a)
declare fp128 @llvm.vector.reduce.fmax.v2f128(<2 x fp128> %a)
declare float @llvm.vector.reduce.fmax.v16f32(<16 x float> %a)
define half @test_v1f16(<1 x half> %a) nounwind {
; CHECK-LABEL: test_v1f16:
; CHECK: // %bb.0:
; CHECK-NEXT: ret
%b = call half @llvm.vector.reduce.fmax.v1f16(<1 x half> %a)
ret half %b
}
define float @test_v1f32(<1 x float> %a) nounwind {
; CHECK-LABEL: test_v1f32:
; CHECK: // %bb.0:
; CHECK-NEXT: // kill: def $d0 killed $d0 def $q0
; CHECK-NEXT: // kill: def $s0 killed $s0 killed $q0
; CHECK-NEXT: ret
%b = call float @llvm.vector.reduce.fmax.v1f32(<1 x float> %a)
ret float %b
}
define double @test_v1f64(<1 x double> %a) nounwind {
; CHECK-LABEL: test_v1f64:
; CHECK: // %bb.0:
; CHECK-NEXT: ret
%b = call double @llvm.vector.reduce.fmax.v1f64(<1 x double> %a)
ret double %b
}
define fp128 @test_v1f128(<1 x fp128> %a) nounwind {
; CHECK-LABEL: test_v1f128:
; CHECK: // %bb.0:
; CHECK-NEXT: ret
%b = call fp128 @llvm.vector.reduce.fmax.v1f128(<1 x fp128> %a)
ret fp128 %b
}
; TODO: This doesn't work, because ExpandReductions only supports power of two
; unordered reductions.
;define float @test_v3f32(<3 x float> %a) nounwind {
; %b = call float @llvm.vector.reduce.fmax.v3f32(<3 x float> %a)
; ret float %b
;}
define fp128 @test_v2f128(<2 x fp128> %a) nounwind {
; CHECK-LABEL: test_v2f128:
; CHECK: // %bb.0:
[Intrinsics] define semantics for experimental fmax/fmin vector reductions As discussed on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2020-April/140729.html This is hopefully the final remaining showstopper before we can remove the 'experimental' from the reduction intrinsics. No behavior was specified for the FP min/max reductions, so we have a mess of different interpretations. There are a few potential options for the semantics of these max/min ops. I think this is the simplest based on current behavior/implementation: make the reductions inherit from the existing llvm.maxnum/minnum intrinsics. These correspond to libm fmax/fmin, and those are similar to the (now deprecated?) IEEE-754 maxNum/minNum functions (NaNs are treated as missing data). So the default expansion creates calls to libm functions. Another option would be to inherit from llvm.maximum/minimum (NaNs propagate), but most targets just crash in codegen when given those nodes because no default expansion was ever implemented AFAICT. We could also just assume 'nnan' semantics by default (we are already assuming 'nsz' semantics in the maxnum/minnum intrinsics), but some targets (AArch64, PowerPC) support the more defined behavior, so it doesn't make much sense to not allow a tighter spec. Fast-math-flags (nnan) can be used to loosen the semantics. (Note that D67507 was proposed to update the LangRef to acknowledge the more recent IEEE-754 2019 standard, but that patch seems to have stalled. If we do update based on the new standard, the reduction instructions can seamlessly inherit from whatever updates are made to the max/min intrinsics.) x86 sees a regression here on 'nnan' tests because we have underlying, longstanding bugs in FMF creation/propagation. Those need to be fixed apart from this change (for example: https://llvm.org/PR35538). The expansion sequence before this patch may not have been correct. Differential Revision: https://reviews.llvm.org/D87391
2020-09-12 21:08:07 +08:00
; CHECK-NEXT: b fmaxl
%b = call fp128 @llvm.vector.reduce.fmax.v2f128(<2 x fp128> %a)
ret fp128 %b
}
define float @test_v16f32(<16 x float> %a) nounwind {
; CHECK-LABEL: test_v16f32:
; CHECK: // %bb.0:
; CHECK-NEXT: fmaxnm v1.4s, v1.4s, v3.4s
; CHECK-NEXT: fmaxnm v0.4s, v0.4s, v2.4s
; CHECK-NEXT: fmaxnm v0.4s, v0.4s, v1.4s
[Intrinsics] define semantics for experimental fmax/fmin vector reductions As discussed on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2020-April/140729.html This is hopefully the final remaining showstopper before we can remove the 'experimental' from the reduction intrinsics. No behavior was specified for the FP min/max reductions, so we have a mess of different interpretations. There are a few potential options for the semantics of these max/min ops. I think this is the simplest based on current behavior/implementation: make the reductions inherit from the existing llvm.maxnum/minnum intrinsics. These correspond to libm fmax/fmin, and those are similar to the (now deprecated?) IEEE-754 maxNum/minNum functions (NaNs are treated as missing data). So the default expansion creates calls to libm functions. Another option would be to inherit from llvm.maximum/minimum (NaNs propagate), but most targets just crash in codegen when given those nodes because no default expansion was ever implemented AFAICT. We could also just assume 'nnan' semantics by default (we are already assuming 'nsz' semantics in the maxnum/minnum intrinsics), but some targets (AArch64, PowerPC) support the more defined behavior, so it doesn't make much sense to not allow a tighter spec. Fast-math-flags (nnan) can be used to loosen the semantics. (Note that D67507 was proposed to update the LangRef to acknowledge the more recent IEEE-754 2019 standard, but that patch seems to have stalled. If we do update based on the new standard, the reduction instructions can seamlessly inherit from whatever updates are made to the max/min intrinsics.) x86 sees a regression here on 'nnan' tests because we have underlying, longstanding bugs in FMF creation/propagation. Those need to be fixed apart from this change (for example: https://llvm.org/PR35538). The expansion sequence before this patch may not have been correct. Differential Revision: https://reviews.llvm.org/D87391
2020-09-12 21:08:07 +08:00
; CHECK-NEXT: fmaxnmv s0, v0.4s
; CHECK-NEXT: ret
%b = call float @llvm.vector.reduce.fmax.v16f32(<16 x float> %a)
ret float %b
}