llvm-project/clang/test/CodeGen/arm64_vdupq_n_f64.c

// RUN: %clang_cc1 -triple arm64-apple-ios7 -target-feature +neon -ffreestanding -fallow-half-arguments-and-returns -S -o - -disable-O0-optnone -emit-llvm %s | opt -S -mem2reg | FileCheck %s

#include <arm_neon.h>

// vdupq_n_f64 -> dup.2d v0, v0[0]
//
// CHECK-LABEL: define <2 x double> @test_vdupq_n_f64(double %w) #0 {
// CHECK:   [[VECINIT_I:%.*]] = insertelement <2 x double> undef, double %w, i32 0
// CHECK:   [[VECINIT1_I:%.*]] = insertelement <2 x double> [[VECINIT_I]], double %w, i32 1
// CHECK:   ret <2 x double> [[VECINIT1_I]]
float64x2_t test_vdupq_n_f64(float64_t w) {
    return vdupq_n_f64(w);
}

// Might as well test this while we're here:
// vdupq_n_f32 -> dup.4s v0, v0[0]
// CHECK-LABEL: define <4 x float> @test_vdupq_n_f32(float %w) #0 {
// CHECK:   [[VECINIT_I:%.*]] = insertelement <4 x float> undef, float %w, i32 0
// CHECK:   [[VECINIT1_I:%.*]] = insertelement <4 x float> [[VECINIT_I]], float %w, i32 1
// CHECK:   [[VECINIT2_I:%.*]] = insertelement <4 x float> [[VECINIT1_I]], float %w, i32 2
// CHECK:   [[VECINIT3_I:%.*]] = insertelement <4 x float> [[VECINIT2_I]], float %w, i32 3
// CHECK:   ret <4 x float> [[VECINIT3_I]]
float32x4_t test_vdupq_n_f32(float32_t w) {
    return vdupq_n_f32(w);
}

// vdupq_lane_f64 -> dup.2d v0, v0[0]
// This was in <rdar://problem/11778405> but had already been implemented;
// test it anyway.
// CHECK-LABEL: define <2 x double> @test_vdupq_lane_f64(<1 x double> %V) #0 {
// CHECK:   [[SHUFFLE:%.*]] = shufflevector <1 x double> %V, <1 x double> %V, <2 x i32> zeroinitializer
// CHECK:   ret <2 x double> [[SHUFFLE]]
float64x2_t test_vdupq_lane_f64(float64x1_t V) {
    return vdupq_lane_f64(V, 0);
}

// vmovq_n_f64 -> dup Vd.2d,X0
// This wasn't in <rdar://problem/11778405>, but it was between the vdups.
// CHECK-LABEL: define <2 x double> @test_vmovq_n_f64(double %w) #0 {
// CHECK:   [[VECINIT_I:%.*]] = insertelement <2 x double> undef, double %w, i32 0
// CHECK:   [[VECINIT1_I:%.*]] = insertelement <2 x double> [[VECINIT_I]], double %w, i32 1
// CHECK:   ret <2 x double> [[VECINIT1_I]]
float64x2_t test_vmovq_n_f64(float64_t w) {
  return vmovq_n_f64(w);
}

// CHECK-LABEL: define <4 x half> @test_vmov_n_f16(half* %a1) #1 {
// CHECK:   [[TMP0:%.*]] = load half, half* %a1, align 2
// CHECK:   [[VECINIT:%.*]] = insertelement <4 x half> undef, half [[TMP0]], i32 0
// CHECK:   [[VECINIT1:%.*]] = insertelement <4 x half> [[VECINIT]], half [[TMP0]], i32 1
// CHECK:   [[VECINIT2:%.*]] = insertelement <4 x half> [[VECINIT1]], half [[TMP0]], i32 2
// CHECK:   [[VECINIT3:%.*]] = insertelement <4 x half> [[VECINIT2]], half [[TMP0]], i32 3
// CHECK:   ret <4 x half> [[VECINIT3]]
float16x4_t test_vmov_n_f16(float16_t *a1) {
  return vmov_n_f16(*a1);
}

/*
float64x1_t test_vmov_n_f64(float64_t a1) {
  return vmov_n_f64(a1);
}
*/

// CHECK-LABEL: define <8 x half> @test_vmovq_n_f16(half* %a1) #0 {
// CHECK:   [[TMP0:%.*]] = load half, half* %a1, align 2
// CHECK:   [[VECINIT:%.*]] = insertelement <8 x half> undef, half [[TMP0]], i32 0
// CHECK:   [[VECINIT1:%.*]] = insertelement <8 x half> [[VECINIT]], half [[TMP0]], i32 1
// CHECK:   [[VECINIT2:%.*]] = insertelement <8 x half> [[VECINIT1]], half [[TMP0]], i32 2
// CHECK:   [[VECINIT3:%.*]] = insertelement <8 x half> [[VECINIT2]], half [[TMP0]], i32 3
// CHECK:   [[VECINIT4:%.*]] = insertelement <8 x half> [[VECINIT3]], half [[TMP0]], i32 4
// CHECK:   [[VECINIT5:%.*]] = insertelement <8 x half> [[VECINIT4]], half [[TMP0]], i32 5
// CHECK:   [[VECINIT6:%.*]] = insertelement <8 x half> [[VECINIT5]], half [[TMP0]], i32 6
// CHECK:   [[VECINIT7:%.*]] = insertelement <8 x half> [[VECINIT6]], half [[TMP0]], i32 7
// CHECK:   ret <8 x half> [[VECINIT7]]
float16x8_t test_vmovq_n_f16(float16_t *a1) {
  return vmovq_n_f16(*a1);
}

// CHECK: attributes #0 ={{.*}}"min-legal-vector-width"="128"
// CHECK: attributes #1 ={{.*}}"min-legal-vector-width"="64"