llvm-project/clang/test/CodeGen/attr-cpuspecific-avx-abi.c

// RUN: %clang_cc1 -triple x86_64-linux-gnu -target-feature +avx -emit-llvm -o - %s | FileCheck %s --check-prefixes=CHECK

// Make sure the features from the command line are honored regardless of what
// CPU is specified in the cpu_specific attribute.
// In this test, if the 'avx' feature isn't honored, we'll generate an error for
// the return type having a different ABI without 'avx' being enabled.

typedef double __m256d __attribute__((vector_size(32)));

extern __m256d bar_avx1(void);
extern __m256d bar_avx2(void);

// AVX1/AVX2 dispatcher
__attribute__((cpu_dispatch(generic, core_4th_gen_avx)))
__m256d foo_pd64x4(void);

__attribute__((cpu_specific(generic)))
__m256d foo(void) { return bar_avx1(); }
// CHECK: define{{.*}} @foo.A() #[[A:[0-9]+]]

__attribute__((cpu_specific(core_4th_gen_avx)))
__m256d foo(void) { return bar_avx2(); }
// CHECK: define{{.*}} @foo.V() #[[V:[0-9]+]]

// CHECK: attributes #[[A]] = {{.*}}"target-features"="+avx,+crc32,+cx8,+mmx,+popcnt,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+ssse3,+x87,+xsave"
// CHECK-SAME: "tune-cpu"="generic"
// CHECK: attributes #[[V]] = {{.*}}"target-features"="+avx,+avx2,+bmi,+cmov,+crc32,+cx8,+f16c,+fma,+lzcnt,+mmx,+movbe,+popcnt,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+ssse3,+x87,+xsave"
// CHECK-SAME: "tune-cpu"="haswell"