llvm-project/llvm/test/CodeGen/AMDGPU/fp32_to_fp16.ll

; RUN: llc -march=amdgcn -verify-machineinstrs < %s | FileCheck -check-prefix=GCN -check-prefix=FUNC %s
; RUN: llc -march=amdgcn -mcpu=tonga -mattr=-flat-for-global -verify-machineinstrs < %s | FileCheck -check-prefix=GCN -check-prefix=FUNC %s
; RUN: llc -march=r600 -mcpu=cypress -verify-machineinstrs < %s | FileCheck -check-prefix=EG -check-prefix=FUNC %s

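; Verify that llvm.convert.to.fp16.f32 selects the hardware float-to-half
; conversion instruction on GCN (v_cvt_f16_f32) and on Evergreen (FLT32_TO_FLT16).
declare i16 @llvm.convert.to.fp16.f32(float) nounwind readnone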

; FUNC-LABEL: {{^}}test_convert_fp32_to_fp16:
; GCN: buffer_load_dword [[VAL:v[0-9]+]]
; GCN: v_cvt_f16_f32_e32 [[RESULT:v[0-9]+]], [[VAL]]
; GCN: buffer_store_short [[RESULT]]

; EG: MEM_RAT MSKOR
; EG: VTX_READ_32
; EG: FLT32_TO_FLT16
define void @test_convert_fp32_to_fp16(i16 addrspace(1)* noalias %out, float addrspace(1)* noalias %in) nounwind {
  %val = load float, float addrspace(1)* %in, align 4
  %cvt = call i16 @llvm.convert.to.fp16.f32(float %val) nounwind readnone
  store i16 %cvt, i16 addrspace(1)* %out, align 2
  ret void
}