llvm-project/llvm/test/CodeGen/NVPTX/fast-math.ll

; RUN: llc < %s -march=nvptx -mcpu=sm_20 | FileCheck %s

declare float @llvm.nvvm.sqrt.f(float)

; CHECK-LABEL: sqrt_div
; CHECK: sqrt.rn.f32
; CHECK: div.rn.f32
define float @sqrt_div(float %a, float %b) {
  %t1 = tail call float @llvm.nvvm.sqrt.f(float %a)
  %t2 = fdiv float %t1, %b
  ret float %t2
}

; CHECK-LABEL: sqrt_div_fast
; CHECK: sqrt.approx.f32
; CHECK: div.approx.f32
define float @sqrt_div_fast(float %a, float %b) #0 {
  %t1 = tail call float @llvm.nvvm.sqrt.f(float %a)
  %t2 = fdiv float %t1, %b
  ret float %t2
}

; CHECK-LABEL: fadd
; CHECK: add.rn.f32
define float @fadd(float %a, float %b) {
  %t1 = fadd float %a, %b
  ret float %t1
}

; CHECK-LABEL: fadd_ftz
; CHECK: add.rn.ftz.f32
define float @fadd_ftz(float %a, float %b) #1 {
  %t1 = fadd float %a, %b
  ret float %t1
}

declare float @llvm.sin.f32(float)
declare float @llvm.cos.f32(float)

; CHECK-LABEL: fsin_approx
; CHECK:       sin.approx.f32
define float @fsin_approx(float %a) #0 {
  %r = tail call float @llvm.sin.f32(float %a)
  ret float %r
}

; CHECK-LABEL: fcos_approx
; CHECK:       cos.approx.f32
define float @fcos_approx(float %a) #0 {
  %r = tail call float @llvm.cos.f32(float %a)
  ret float %r
}

attributes #0 = { "unsafe-fp-math" = "true" }
attributes #1 = { "nvptx-f32ftz" = "true" }
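
; Illustrative sketch, not part of the upstream test. The r291618 commit
; message describes a function without the fast-math attribute inheriting
; fast-math from an earlier function that enabled it, and mentions explicitly
; setting "unsafe-fp-math"="false" as the workaround. The pair below sketches
; that ordering. The function names and attribute group #2 are hypothetical;
; the expected instructions are taken from that commit message ("add.f32"
; when fast-math is on, "add.rn.f32" otherwise) and assume the same RUN line.

; CHECK-LABEL: fadd_fast_sketch
; CHECK: add.f32
define float @fadd_fast_sketch(float %a, float %b) #0 {
  %t1 = fadd float %a, %b
  ret float %t1
}

; This function follows a fast-math function, so it is the case the bug
; affected; the explicit "false" attribute keeps the rounded add.
; CHECK-LABEL: fadd_explicit_precise_sketch
; CHECK: add.rn.f32
define float @fadd_explicit_precise_sketch(float %a, float %b) #2 {
  %t1 = fadd float %a, %b
  ret float %t1
}

attributes #2 = { "unsafe-fp-math" = "false" }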

; Revision history (from the blame view of this file):
;
; llvm-svn: 186820, 2013-07-22 20:18:04 +08:00
;   [NVPTX] Use approximate FP ops when unsafe-fp-math is used, and append
;   .ftz to instructions if the nvptx-f32ftz attribute is set to "true"
;   Lines: the RUN line, the @llvm.nvvm.sqrt.f declaration, the CHECK lines
;   and bodies of sqrt_div and sqrt_div_fast, the definitions of fadd and
;   fadd_ftz, and both attribute groups.
;
; llvm-svn: 291617, 2017-01-11 07:42:46 +08:00
;   [NVPTX] Add CHECK-LABEL where appropriate to fast-math.ll test. Also fix
;   up whitespace. Test-only change.
;   Lines: the CHECK-LABEL lines for sqrt_div, sqrt_div_fast, fadd, and
;   fadd_ftz.
;
; llvm-svn: 291618, 2017-01-11 07:43:04 +08:00
;   [TM] Restore default TargetOptions in TargetMachine::resetTargetOptions.
;
;   Summary: Previously if you had
;     * a function with the fast-math-enabled attr, followed by
;     * a function without the fast-math attr,
;   the second function would inherit the first function's fast-math-ness.
;
;   This means that mixing fast-math and non-fast-math functions in a module
;   was completely broken unless you explicitly annotated every non-fast-math
;   function with "unsafe-fp-math"="false". This appears to have been broken
;   since r176986 (March 2013), when the resetTargetOptions function was
;   introduced.
;
;   This patch tests the correct behavior as best we can. I don't think I can
;   test FPDenormalMode and NoTrappingFPMath, because they aren't used in any
;   backends during function lowering. Surprisingly, I also can't find any
;   uses at all of LessPreciseFPMAD affecting generated code.
;
;   The NVPTX/fast-math.ll test changes are an expected result of fixing this
;   bug. When FMA is disabled, we emit add as "add.rn.f32", which prevents fma
;   combining. Before this patch, fast-math was enabled in all functions
;   following the one which explicitly enabled it on itself, so we were
;   emitting plain "add.f32" where we should have generated "add.rn.f32".
;
;   Reviewers: mkuper
;   Subscribers: hfinkel, majnemer, jholewinski, nemanjai, llvm-commits
;   Differential Revision: https://reviews.llvm.org/D28507
;   Lines: "; CHECK: add.rn.f32" and "; CHECK: add.rn.ftz.f32".
;
; llvm-svn: 291936, 2017-01-14 02:48:13 +08:00
;   [NVPTX] Only lower sin/cos to approximate instructions if unsafe math is
;   allowed.
;
;   Previously we'd always lower @llvm.{sin,cos}.f32 to {sin.cos}.approx.f32
;   instruction even when unsafe FP math was not allowed.
;
;   Clang-generated IR is not affected by this as it uses precise sin/cos from
;   CUDA's libdevice when unsafe math is disabled.
;
;   Differential Revision: https://reviews.llvm.org/D28619
;   Lines: the @llvm.sin.f32 and @llvm.cos.f32 declarations and the
;   fsin_approx and fcos_approx tests.