llvm-project/llvm/test/CodeGen/NVPTX/fp16.ll

; RUN: llc -march=nvptx -verify-machineinstrs < %s | FileCheck %s

declare float @llvm.convert.from.fp16.f32(i16) nounwind readnone
declare double @llvm.convert.from.fp16.f64(i16) nounwind readnone
declare i16 @llvm.convert.to.fp16.f32(float) nounwind readnone
declare i16 @llvm.convert.to.fp16.f64(double) nounwind readnone

; CHECK-LABEL: @test_convert_fp16_to_fp32
; CHECK: cvt.f32.f16
define void @test_convert_fp16_to_fp32(float addrspace(1)* noalias %out, i16 addrspace(1)* noalias %in) nounwind {
  %val = load i16, i16 addrspace(1)* %in, align 2
  %cvt = call float @llvm.convert.from.fp16.f32(i16 %val) nounwind readnone
  store float %cvt, float addrspace(1)* %out, align 4
  ret void
}


; CHECK-LABEL: @test_convert_fp16_to_fp64
; CHECK: cvt.f64.f16
define void @test_convert_fp16_to_fp64(double addrspace(1)* noalias %out, i16 addrspace(1)* noalias %in) nounwind {
  %val = load i16, i16 addrspace(1)* %in, align 2
  %cvt = call double @llvm.convert.from.fp16.f64(i16 %val) nounwind readnone
  store double %cvt, double addrspace(1)* %out, align 4
  ret void
}


; CHECK-LABEL: @test_convert_fp32_to_fp16
; CHECK: cvt.rn.f16.f32
define void @test_convert_fp32_to_fp16(i16 addrspace(1)* noalias %out, float addrspace(1)* noalias %in) nounwind {
  %val = load float, float addrspace(1)* %in, align 2
  %cvt = call i16 @llvm.convert.to.fp16.f32(float %val) nounwind readnone
  store i16 %cvt, i16 addrspace(1)* %out, align 4
  ret void
}


; CHECK-LABEL: @test_convert_fp64_to_fp16
; CHECK: cvt.rn.f16.f64
define void @test_convert_fp64_to_fp16(i16 addrspace(1)* noalias %out, double addrspace(1)* noalias %in) nounwind {
  %val = load double, double addrspace(1)* %in, align 2
  %cvt = call i16 @llvm.convert.to.fp16.f64(double %val) nounwind readnone
  store i16 %cvt, i16 addrspace(1)* %out, align 4
  ret void
}
NVPTX: support direct f16 <-> f64 conversions via intrinsics. Clang may well start emitting these soon, and while it may not be directly relevant for OpenCL or GLSL, the instructions were just sitting there waiting to be used. llvm-svn: 213356 2014-07-18 16:30:10 +08:00			`; RUN: llc -march=nvptx -verify-machineinstrs < %s \| FileCheck %s`

			`declare float @llvm.convert.from.fp16.f32(i16) nounwind readnone`
			`declare double @llvm.convert.from.fp16.f64(i16) nounwind readnone`
			`declare i16 @llvm.convert.to.fp16.f32(float) nounwind readnone`
			`declare i16 @llvm.convert.to.fp16.f64(double) nounwind readnone`

			`; CHECK-LABEL: @test_convert_fp16_to_fp32`
			`; CHECK: cvt.f32.f16`
			`define void @test_convert_fp16_to_fp32(float addrspace(1)* noalias %out, i16 addrspace(1)* noalias %in) nounwind {`
[opaque pointer type] Add textual IR support for explicit type parameter to load instruction Essentially the same as the GEP change in r230786. A similar migration script can be used to update test cases, though a few more test case improvements/changes were required this time around: (r229269-r229278) import fileinput import sys import re pat = re.compile(r"((?:=\|:\|^)\sload (?:atomic )?(?:volatile )?(.?))(\| addrspace\(\d+\) )\($\| (?:%\|@\|null\|undef\|blockaddress\|getelementptr\|addrspacecast\|bitcast\|inttoptr\|\[\[[a-zA-Z]\|\{\{).$)") for line in sys.stdin: sys.stdout.write(re.sub(pat, r"\1, \2\3*\4", line)) Reviewers: rafael, dexonsmith, grosser Differential Revision: http://reviews.llvm.org/D7649 llvm-svn: 230794 2015-02-28 05:17:42 +08:00			`%val = load i16, i16 addrspace(1)* %in, align 2`
NVPTX: support direct f16 <-> f64 conversions via intrinsics. Clang may well start emitting these soon, and while it may not be directly relevant for OpenCL or GLSL, the instructions were just sitting there waiting to be used. llvm-svn: 213356 2014-07-18 16:30:10 +08:00			`%cvt = call float @llvm.convert.from.fp16.f32(i16 %val) nounwind readnone`
			`store float %cvt, float addrspace(1)* %out, align 4`
			`ret void`
			`}`


			`; CHECK-LABEL: @test_convert_fp16_to_fp64`
			`; CHECK: cvt.f64.f16`
			`define void @test_convert_fp16_to_fp64(double addrspace(1)* noalias %out, i16 addrspace(1)* noalias %in) nounwind {`
[opaque pointer type] Add textual IR support for explicit type parameter to load instruction Essentially the same as the GEP change in r230786. A similar migration script can be used to update test cases, though a few more test case improvements/changes were required this time around: (r229269-r229278) import fileinput import sys import re pat = re.compile(r"((?:=\|:\|^)\sload (?:atomic )?(?:volatile )?(.?))(\| addrspace\(\d+\) )\($\| (?:%\|@\|null\|undef\|blockaddress\|getelementptr\|addrspacecast\|bitcast\|inttoptr\|\[\[[a-zA-Z]\|\{\{).$)") for line in sys.stdin: sys.stdout.write(re.sub(pat, r"\1, \2\3*\4", line)) Reviewers: rafael, dexonsmith, grosser Differential Revision: http://reviews.llvm.org/D7649 llvm-svn: 230794 2015-02-28 05:17:42 +08:00			`%val = load i16, i16 addrspace(1)* %in, align 2`
NVPTX: support direct f16 <-> f64 conversions via intrinsics. Clang may well start emitting these soon, and while it may not be directly relevant for OpenCL or GLSL, the instructions were just sitting there waiting to be used. llvm-svn: 213356 2014-07-18 16:30:10 +08:00			`%cvt = call double @llvm.convert.from.fp16.f64(i16 %val) nounwind readnone`
			`store double %cvt, double addrspace(1)* %out, align 4`
			`ret void`
			`}`


			`; CHECK-LABEL: @test_convert_fp32_to_fp16`
			`; CHECK: cvt.rn.f16.f32`
			`define void @test_convert_fp32_to_fp16(i16 addrspace(1)* noalias %out, float addrspace(1)* noalias %in) nounwind {`
[opaque pointer type] Add textual IR support for explicit type parameter to load instruction Essentially the same as the GEP change in r230786. A similar migration script can be used to update test cases, though a few more test case improvements/changes were required this time around: (r229269-r229278) import fileinput import sys import re pat = re.compile(r"((?:=\|:\|^)\sload (?:atomic )?(?:volatile )?(.?))(\| addrspace\(\d+\) )\($\| (?:%\|@\|null\|undef\|blockaddress\|getelementptr\|addrspacecast\|bitcast\|inttoptr\|\[\[[a-zA-Z]\|\{\{).$)") for line in sys.stdin: sys.stdout.write(re.sub(pat, r"\1, \2\3*\4", line)) Reviewers: rafael, dexonsmith, grosser Differential Revision: http://reviews.llvm.org/D7649 llvm-svn: 230794 2015-02-28 05:17:42 +08:00			`%val = load float, float addrspace(1)* %in, align 2`
NVPTX: support direct f16 <-> f64 conversions via intrinsics. Clang may well start emitting these soon, and while it may not be directly relevant for OpenCL or GLSL, the instructions were just sitting there waiting to be used. llvm-svn: 213356 2014-07-18 16:30:10 +08:00			`%cvt = call i16 @llvm.convert.to.fp16.f32(float %val) nounwind readnone`
			`store i16 %cvt, i16 addrspace(1)* %out, align 4`
			`ret void`
			`}`


			`; CHECK-LABEL: @test_convert_fp64_to_fp16`
			`; CHECK: cvt.rn.f16.f64`
			`define void @test_convert_fp64_to_fp16(i16 addrspace(1)* noalias %out, double addrspace(1)* noalias %in) nounwind {`
[opaque pointer type] Add textual IR support for explicit type parameter to load instruction Essentially the same as the GEP change in r230786. A similar migration script can be used to update test cases, though a few more test case improvements/changes were required this time around: (r229269-r229278) import fileinput import sys import re pat = re.compile(r"((?:=\|:\|^)\sload (?:atomic )?(?:volatile )?(.?))(\| addrspace\(\d+\) )\($\| (?:%\|@\|null\|undef\|blockaddress\|getelementptr\|addrspacecast\|bitcast\|inttoptr\|\[\[[a-zA-Z]\|\{\{).$)") for line in sys.stdin: sys.stdout.write(re.sub(pat, r"\1, \2\3*\4", line)) Reviewers: rafael, dexonsmith, grosser Differential Revision: http://reviews.llvm.org/D7649 llvm-svn: 230794 2015-02-28 05:17:42 +08:00			`%val = load double, double addrspace(1)* %in, align 2`
NVPTX: support direct f16 <-> f64 conversions via intrinsics. Clang may well start emitting these soon, and while it may not be directly relevant for OpenCL or GLSL, the instructions were just sitting there waiting to be used. llvm-svn: 213356 2014-07-18 16:30:10 +08:00			`%cvt = call i16 @llvm.convert.to.fp16.f64(double %val) nounwind readnone`
			`store i16 %cvt, i16 addrspace(1)* %out, align 4`
			`ret void`
			`}`