llvm-project/clang/test/CodeGen/arm64_vcvtfp.c

49 lines
1.5 KiB
C
Raw Normal View History

// RUN: %clang_cc1 -O1 -triple arm64-apple-ios7 -target-feature +neon -ffreestanding -S -o - -emit-llvm %s | FileCheck %s
#include <arm_neon.h>
float64x2_t test_vcvt_f64_f32(float32x2_t x) {
// CHECK-LABEL: test_vcvt_f64_f32
return vcvt_f64_f32(x);
// CHECK: fpext <2 x float> {{%.*}} to <2 x double>
// CHECK-NEXT: ret
}
float64x2_t test_vcvt_high_f64_f32(float32x4_t x) {
// CHECK-LABEL: test_vcvt_high_f64_f32
return vcvt_high_f64_f32(x);
// CHECK: [[HIGH:%.*]] = shufflevector <4 x float> {{%.*}}, <4 x float> undef, <2 x i32> <i32 2, i32 3>
// CHECK-NEXT: fpext <2 x float> [[HIGH]] to <2 x double>
// CHECK-NEXT: ret
}
float32x2_t test_vcvt_f32_f64(float64x2_t v) {
// CHECK: test_vcvt_f32_f64
return vcvt_f32_f64(v);
// CHECK: fptrunc <2 x double> {{%.*}} to <2 x float>
// CHECK-NEXT: ret
}
float32x4_t test_vcvt_high_f32_f64(float32x2_t x, float64x2_t v) {
// CHECK: test_vcvt_high_f32_f64
return vcvt_high_f32_f64(x, v);
// CHECK: [[TRUNC:%.*]] = fptrunc <2 x double> {{.*}} to <2 x float>
// CHECK-NEXT: shufflevector <2 x float> {{.*}}, <2 x float> [[TRUNC]], <4 x i32> <i32 0, i32 1, i32 2, i32 3>
// CHECK-NEXT: ret
}
float32x2_t test_vcvtx_f32_f64(float64x2_t v) {
// CHECK: test_vcvtx_f32_f64
return vcvtx_f32_f64(v);
// CHECK: llvm.aarch64.neon.fcvtxn.v2f32.v2f64
// CHECK-NEXT: ret
}
float32x4_t test_vcvtx_high_f32_f64(float32x2_t x, float64x2_t v) {
// CHECK: test_vcvtx_high_f32_f64
return vcvtx_high_f32_f64(x, v);
// CHECK: llvm.aarch64.neon.fcvtxn.v2f32.v2f64
// CHECK: shufflevector
Rewrite ARM NEON intrinsic emission completely. There comes a time in the life of any amateur code generator when dumb string concatenation just won't cut it any more. For NeonEmitter.cpp, that time has come. There were a bunch of magic type codes which meant different things depending on the context. There were a bunch of special cases that really had no reason to be there but the whole thing was so creaky that removing them would cause something weird to fall over. There was a 1000 line switch statement for code generation involving string concatenation, which actually did lexical scoping to an extent (!!) with a bunch of semi-repeated cases. I tried to refactor this three times in three different ways without success. The only way forward was to rewrite the entire thing. Luckily the testing coverage on this stuff is absolutely massive, both with regression tests and the "emperor" random test case generator. The main change is that previously, in arm_neon.td a bunch of "Operation"s were defined with special names. NeonEmitter.cpp knew about these Operations and would emit code based on a huge switch. Actually this doesn't make much sense - the type information was held as strings, so type checking was impossible. Also TableGen's DAG type actually suits this sort of code generation very well (surprising that...) So now every operation is defined in terms of TableGen DAGs. There are a bunch of operators to use, including "op" (a generic unary or binary operator), "call" (to call other intrinsics) and "shuffle" (take a guess...). One of the main advantages of this apart from making it more obvious what is going on, is that we have proper type inference. This has two obvious advantages: 1) TableGen can error on bad intrinsic definitions easier, instead of just generating wrong code. 2) Calls to other intrinsics are typechecked too. So we no longer need to work out whether the thing we call needs to be the Q-lane version or the D-lane version - TableGen knows that itself! Here's an example: before: case OpAbdl: { std::string abd = MangleName("vabd", typestr, ClassS) + "(__a, __b)"; if (typestr[0] != 'U') { // vabd results are always unsigned and must be zero-extended. std::string utype = "U" + typestr.str(); s += "(" + TypeString(proto[0], typestr) + ")"; abd = "(" + TypeString('d', utype) + ")" + abd; s += Extend(utype, abd) + ";"; } else { s += Extend(typestr, abd) + ";"; } break; } after: def OP_ABDL : Op<(cast "R", (call "vmovl", (cast $p0, "U", (call "vabd", $p0, $p1))))>; As an example of what happens if you do something wrong now, here's what happens if you make $p0 unsigned before the call to "vabd" - that is, $p0 -> (cast "U", $p0): arm_neon.td:574:1: error: No compatible intrinsic found - looking up intrinsic 'vabd(uint8x8_t, int8x8_t)' Available overloads: - float64x2_t vabdq_v(float64x2_t, float64x2_t) - float64x1_t vabd_v(float64x1_t, float64x1_t) - float64_t vabdd_f64(float64_t, float64_t) - float32_t vabds_f32(float32_t, float32_t) ... snip ... This makes it seriously easy to work out what you've done wrong in fairly nasty intrinsics. As part of this I've massively beefed up the documentation in arm_neon.td too. Things still to do / on the radar: - Testcase generation. This was implemented in the previous version and not in the new one, because - Autogenerated tests are not being run. The testcase in test/ differs from the autogenerated version. - There were a whole slew of special cases in the testcase generation that just felt (and looked) like hacks. If someone really feels strongly about this, I can try and reimplement it too. - Big endian. That's coming soon and should be a very small diff on top of this one. llvm-svn: 211101
2014-06-17 21:11:27 +08:00
// CHECK: ret
}