This commit fixes an IRGen bug that caused clang to generate completely
broken code for __fp16 vectors on X86. For example, when the following
code is compiled:
half4 hv0, hv1, hv2; // these are vectors of __fp16.
void foo221() {
  hv0 = hv1 + hv2;
}
clang generates the following IR, in which two i16 vectors are added:
@hv1 = common global <4 x i16> zeroinitializer, align 8
@hv2 = common global <4 x i16> zeroinitializer, align 8
@hv0 = common global <4 x i16> zeroinitializer, align 8
define void @foo221() {
  %0 = load <4 x i16>, <4 x i16>* @hv1, align 8
  %1 = load <4 x i16>, <4 x i16>* @hv2, align 8
  %add = add <4 x i16> %0, %1
  store <4 x i16> %add, <4 x i16>* @hv0, align 8
  ret void
}
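To see why adding the raw i16 lanes is wrong, the following C sketch
decodes binary16 bit patterns by hand and compares an integer add of the
bits with a floating-point add of the decoded values. The helper names
(pow2, half_bits_to_float, lane_add_as_i16, lane_add_as_float) are
hypothetical and not part of clang, and the sketch assumes __fp16 uses
the IEEE 754 binary16 layout.

```c
#include <math.h>    /* NAN and INFINITY macros only */
#include <stdint.h>

/* Exact power of two as a float (hypothetical helper). */
float pow2(int e) {
    float r = 1.0f;
    while (e > 0) { r *= 2.0f; e--; }
    while (e < 0) { r *= 0.5f; e++; }
    return r;
}

/* Decode a binary16 bit pattern into a float. */
float half_bits_to_float(uint16_t h) {
    int sign = (h >> 15) & 0x1;
    int exp  = (h >> 10) & 0x1F;
    int frac = h & 0x3FF;
    float mag;
    if (exp == 0) {
        mag = (float)frac * pow2(-24);                 /* zero or subnormal */
    } else if (exp == 31) {
        mag = frac ? NAN : INFINITY;                   /* NaN or infinity */
    } else {
        mag = (float)(frac | 0x400) * pow2(exp - 25);  /* normal number */
    }
    return sign ? -mag : mag;
}

/* One lane of the miscompiled IR: the bit patterns are added as integers. */
float lane_add_as_i16(uint16_t a, uint16_t b) {
    return half_bits_to_float((uint16_t)(a + b));
}

/* One lane of what the source means: the decoded values are added. */
float lane_add_as_float(uint16_t a, uint16_t b) {
    return half_bits_to_float(a) + half_bits_to_float(b);
}
```

For two lanes holding 1.0 (bit pattern 0x3C00), lane_add_as_float
returns 2.0, while lane_add_as_i16 returns 32768.0, because
0x3C00 + 0x3C00 = 0x7800.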
To fix the bug, this commit builds on the code committed in r314056,
which modified clang to promote __fp16 vector operands to float vectors
and truncate the results back to __fp16 vectors in the AST. It also
fixes another IRGen bug where a short value is assigned to an __fp16
variable without any integer-to-floating-point conversion, as shown in
the following example:
__fp16 a;
short b;
void foo1() {
  a = b;
}
For this code, clang generates the following IR, which copies the i16
bits of b directly into a:
@b = common global i16 0, align 2
@a = common global i16 0, align 2
define void @foo1() #0 {
  %0 = load i16, i16* @b, align 2
  store i16 %0, i16* @a, align 2
  ret void
}
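The effect of the missing conversion can be illustrated with a small C
sketch that contrasts copying the bits of a short with converting its
value to binary16. The function names (store_without_conversion,
short_to_half_bits) are hypothetical, and the sketch assumes the IEEE
754 binary16 layout with round-to-nearest-even; it is not the code clang
emits.

```c
#include <stdint.h>

/* What the buggy IR does: the 16 bits of the short are stored unchanged. */
uint16_t store_without_conversion(short v) {
    return (uint16_t)v;
}

/* What the assignment should do: convert the integer value to binary16,
 * rounding to nearest with ties to even. Every short is within the finite
 * range of binary16 after rounding, so no overflow handling is needed. */
uint16_t short_to_half_bits(short v) {
    uint32_t sign = v < 0 ? 0x8000u : 0u;
    uint32_t mag  = v < 0 ? (uint32_t)(-(int32_t)v) : (uint32_t)v;
    if (mag == 0) {
        return (uint16_t)sign;
    }

    /* Position of the highest set bit, i.e. the unbiased exponent. */
    int msb = 0;
    while ((mag >> (msb + 1)) != 0) {
        msb++;
    }

    uint32_t frac;
    if (msb <= 10) {
        /* The value fits in the 11-bit significand exactly. */
        frac = (mag << (10 - msb)) & 0x3FFu;
    } else {
        /* Drop the low bits and round to nearest, ties to even. */
        int shift = msb - 10;
        uint32_t kept = mag >> shift;
        uint32_t rem  = mag & ((1u << shift) - 1u);
        uint32_t half = 1u << (shift - 1);
        if (rem > half || (rem == half && (kept & 1u))) {
            kept++;
        }
        if (kept == 0x800u) {   /* rounding carried into the next exponent */
            kept = 0x400u;
            msb++;
        }
        frac = kept & 0x3FFu;
    }
    return (uint16_t)(sign | ((uint32_t)(msb + 15) << 10) | frac);
}
```

For b = 1, store_without_conversion yields 0x0001, which as binary16 is
the smallest positive subnormal (2^-24), whereas short_to_half_bits
yields 0x3C00, the encoding of 1.0.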
rdar://problem/20625184
Differential Revision: https://reviews.llvm.org/D40112
llvm-svn: 320215
This commit fixes a bug in the handling of storage-only __fp16 vectors
where clang didn't promote __fp16 vector operands to float vectors.
Conceptually, it performs the following transformation on the AST in
CreateBuiltinBinOp and CreateBuiltinUnaryOp:
(Before)
typedef __fp16 half4 __attribute__ ((vector_size (8)));
typedef float float4 __attribute__ ((vector_size (16)));
half4 hv0, hv1, hv2, hv3;
hv0 = hv1 + hv2 + hv3;
(After)
float4 t0 = (float4)hv1 + (float4)hv2;
float4 t1 = t0 + (float4)hv3;
hv0 = (half4)t1;
Note that this commit only fixes the bug for targets that set
HalfArgsAndReturns to true (ARM and ARM64). Targets that use intrinsics
such as llvm.convert.to.fp16 to handle __fp16 are still broken.
rdar://problem/20625184
Differential Revision: https://reviews.llvm.org/D32520
llvm-svn: 314056