llvm-project

Commit Graph

Author	SHA1	Message	Date
Nathan Chancellor	ef58ae86ba	[RISCV] Fix mcount name GCC's name for this symbol is _mcount, which the Linux kernel expects in a few different place: $ echo 'int main(void) { return 0; }' \| riscv32-linux-gcc -c -pg -o tmp.o -x c - $ llvm-objdump -dr tmp.o \| grep mcount 0000000c: R_RISCV_CALL _mcount $ echo 'int main(void) { return 0; }' \| riscv64-linux-gcc -c -pg -o tmp.o -x c - $ llvm-objdump -dr tmp.o \| grep mcount 000000000000000c: R_RISCV_CALL _mcount $ echo 'int main(void) { return 0; }' \| clang -c -pg -o tmp.o --target=riscv32-linux-gnu -x c - $ llvm-objdump -dr tmp.o \| grep mcount 0000000a: R_RISCV_CALL_PLT mcount $ echo 'int main(void) { return 0; }' \| clang -c -pg -o tmp.o --target=riscv64-linux-gnu -x c - $ llvm-objdump -dr tmp.o \| grep mcount 000000000000000a: R_RISCV_CALL_PLT mcount Set MCountName to "_mcount" in RISCVTargetInfo then prevent it from getting overridden in certain OSTargetInfo constructors. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D98881 Signed-off-by: Nathan Chancellor <nathan@kernel.org>	2021-03-24 18:11:37 -07:00
Nemanja Ivanovic	4020932706	[PowerPC] Make altivec.h work with AIX which has no __int128 There are a number of functions in altivec.h that use vector __int128 which isn't supported on AIX. Those functions need to be guarded for targets that don't support the type. Furthermore, the functions that produce quadword instructions without using the type need a builtin. This patch adds the macro guards to altivec.h using the __SIZEOF_INT128__ which is only defined on targets that support the __int128 type.	2021-03-24 00:35:51 -05:00
Zakk Chen	88c2d4c8eb	[RISCV][Clang] Add RVV Vector Indexed Load intrinsic functions. Support Complex type transformer to define more complexity legal type. Overall our downstream implementation there are only four instructions need to use complex type transformer, it's not a common case. I still feel using a string for prototypes is simple and clear. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D98848	2021-03-23 19:18:50 -07:00
Bruno Cardoso Lopes	431e3138a1	[CGAtomic] Lift stronger requirements on cmpxch and support acquire failure mode - Fix `emitAtomicCmpXchgFailureSet` to support release/acquire (succ/fail) memory order. - Remove stronger checks for cmpxch. Effectively, this addresses http://wg21.link/p0418 Differential Revision: https://reviews.llvm.org/D98995	2021-03-23 16:45:37 -07:00
Nancy Wang	f46c41febb	[SystemZ][z/OS] fix lit test related to alignment This patch is to fix lit test case failure relate to alignment, on z/OS, maximum alignment value for 64 bit mode is 16 and also fixed clang/test/Layout/itanium-union-bitfield.cpp, attribute ((aligned(4))) is needed for bit-field member in Union for z/OS because single bit-field has one byte alignment, this will make sure size and alignment will be correct value on z/OS. Differential Revision: https://reviews.llvm.org/D98793	2021-03-23 13:15:19 -04:00
Zakk Chen	0bc1959f51	[RISCV][NFC] Fix RVV intrinsic tests. 1. Skip the temporary file 2. Test cc1 with -S to verify codegen work well. Add '-target-feature +m' because the backend requires it to calculate the vscaled size/offset. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D99082	2021-03-23 06:06:05 -07:00
Nemanja Ivanovic	2f782a796a	[PowerPC] Add more missing overloads to altivec.h Add overloads that perform subtraction on v1i128 that take and produce vector unsigned char to avoid needing to use __int128. The overloads are suffixed with _u128 and are needed for targets where __int128 isn't supported (AIX).	2021-03-23 05:52:36 -05:00
Nemanja Ivanovic	54e4654f04	[PowerPC] Add more missing overloads to altivec.h Add overloads that perform addition on v1i128 that take and produce vector unsigned char to avoid needing to use __int128. The overloads are suffixed with _u128 and are needed for targets where __int128 isn't supported (AIX).	2021-03-23 05:09:19 -05:00
Nemanja Ivanovic	10cc5bcd86	[PowerPC] Add more missing overloads to altivec.h Add vec_permi as a synonym for vec_xxpermdi (but only for doubleword vectors).	2021-03-22 23:09:41 -05:00
Nemanja Ivanovic	b5e96e0ad6	[PowerPC] Add more missing overloads to altivec.h Add vec_gbb as a synonym for vec_vgbbd but for doubleword vectors.	2021-03-22 22:25:28 -05:00
Nemanja Ivanovic	d8e574c8e6	[PowerPC] Add more missing overloads to altivec.h Add vec_cvf as a synonym for vec_doublee/vec_floate.	2021-03-22 22:08:43 -05:00
Zakk Chen	1ea07ee453	Revert "[RISCV][NFC] Fix RVV intrinsic tests." This reverts commit `ab082b582d`.	2021-03-22 18:51:48 -07:00
Zakk Chen	ab082b582d	[RISCV][NFC] Fix RVV intrinsic tests. 1. Skip the temporary file 2. Test cc1 with -S to verify codegen work well. Add '-target-feature +m' because the backend requires it to calculate the vscaled size/offset. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D99082	2021-03-22 18:24:03 -07:00
Nemanja Ivanovic	bef2cb9062	[PowerPC] Add more missing overloads to altivec.h Add vec_ctd which is similar to vec_ctf except the return type is vector double rather than vector float.	2021-03-22 20:23:07 -05:00
Bradley Smith	48f5a392cb	[IR] Add vscale_range IR function attribute This attribute represents the minimum and maximum values vscale can take. For now this attribute is not hooked up to anything during codegen, this will be added in the future when such codegen is considered stable. Additionally hook up the -msve-vector-bits=<x> clang option to emit this attribute. Differential Revision: https://reviews.llvm.org/D98030	2021-03-22 12:05:06 +00:00
Hongtao Yu	fc1812a0ad	[UniqueLinkageName] Use consistent checks when mangling symbo linkage name and debug linkage name. C functions may be declared and defined in different prototypes like below. This patch unifies the checks for mangling names in symbol linkage name emission and debug linkage name emission so that the two names are consistent. static int go(int); static int go(a) int a; { return a; } Test Plan: Differential Revision: https://reviews.llvm.org/D98799	2021-03-18 22:11:16 -07:00
Thomas Lively	f5764a8654	[WebAssembly] Finalize SIMD names and opcodes Updates the names (e.g. widen => extend, saturate => sat) and opcodes of all SIMD instructions to match the finalized SIMD spec. Deliberately does not change the public interface in wasm_simd128.h yet; that will require more care. Depends on D98466. Differential Revision: https://reviews.llvm.org/D98676	2021-03-18 11:21:25 -07:00
Thomas Lively	2f2ae08da9	[WebAssembly] Remove experimental SIMD instructions Removes the instruction definitions, intrinsics, and builtins for qfma/qfms, signselect, and prefetch instructions, which were not included in the final WebAssembly SIMD spec. Depends on D98457. Differential Revision: https://reviews.llvm.org/D98466	2021-03-18 11:21:24 -07:00
Thomas Lively	8638c897f4	[WebAssembly] Remove unimplemented-simd target feature Now that the WebAssembly SIMD specification is finalized and engines are generally up-to-date, there is no need for a separate target feature for gating SIMD instructions that engines have not implemented. With this change, v128.const is now enabled by default with the simd128 target feature. Differential Revision: https://reviews.llvm.org/D98457	2021-03-18 10:23:12 -07:00
Mircea Trofin	92ccc6cb17	Reapply "[NPM][CGSCC] FunctionAnalysisManagerCGSCCProxy: do not clear immutable function passes" This reverts commit `11b70b9e3a`. The bot failure was due to ArgumentPromotion deleting functions without deleting their analyses. This was separately fixed in `4b1c807`.	2021-03-18 09:44:34 -07:00
Thomas Preud'homme	e5cd5b352f	[test] Fix variable definition in acle_sve_ld1.sh Clang test acle_sve_ld1.sh is missing the colon in one of the string variable definition separating the variable name from the regex. This leads the substitution block to be parsed as a numeric variable use. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D98852	2021-03-18 12:15:45 +00:00
Zakk Chen	be947aded0	[RISCV][Clang] Add RVV vle/vse intrinsic functions. Add new field PermuteOperands to mapping different operand order between C/C++ API and clang builtin. Reviewed By: craig.topper, rogfer01 Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com> Co-Authored-by: Hsiangkai Wang <kai.wang@sifive.com> Co-Authored-by: Zakk Chen <zakk.chen@sifive.com> Differential Revision: https://reviews.llvm.org/D98388	2021-03-17 20:31:25 -07:00
Zakk Chen	95c0125f2b	[Clang][RISCV] Add rvv vsetvl and vsetvlmax intrinsic functions. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D96843	2021-03-17 20:26:06 -07:00
Alex Lorenz	d672d5219a	Revert "[CodeGenModule] Set dso_local for Mach-O GlobalValue" This reverts commit `809a1e0ffd`. Mach-O doesn't support dso_local and this change broke XNU because of the use of dso_local. Differential Revision: https://reviews.llvm.org/D98458	2021-03-17 17:27:41 -07:00
Thomas Preud'homme	2426b1fa66	[Test] Fix undef var in attr-speculative-load-hardening.c Fix use of undefined variable in CHECK-NOT directive in clang test CodeGen/attr-speculative-load-hardening.c. Reviewed By: kristof.beyls Differential Revision: https://reviews.llvm.org/D93347	2021-03-17 19:12:25 +00:00
Bradley Smith	cf0da91ba5	[AArch64][SVE/NEON] Add support for FROUNDEVEN for both NEON and fixed length SVE Previously NEON used a target specific intrinsic for frintn, given that the FROUNDEVEN ISD node now exists, move over to that instead and add codegen support for that node for both NEON and fixed length SVE. Differential Revision: https://reviews.llvm.org/D98487	2021-03-17 11:41:22 +00:00
Bing1 Yu	320b72e9cd	[X86][AMX] Rename amx-bf16 intrinsic according to correct naming convention __tile_tdpbf16ps should be renamed with __tile_dpbf16ps Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D98685	2021-03-17 11:22:52 +08:00
Amy Huang	f5352dd9da	Emit inline implementation of __builtin__wmemchr on MSVCRT platforms. The MSVC runtime library doesn't have a definition for wmemchr, so provide an inline implementation. Differential Revision: https://reviews.llvm.org/D98472	2021-03-15 15:30:55 -07:00
diggerlin	d1f1bff81b	[AIX][XCOFF] Fixed the test case which failed at aix OS because enable -mignore-xcoff-visibility by default. Summary: because we enable -mignore-xcoff-visibility by default when there is no -fvisibility option in the clang in AIX OS it will cause some test case fail at aix os. in order to let the -mignore-xcoff-visibility to be disable, we need to add the -fvisibility=default for those test case. Reviewers: hubert.reinterpretcast daltenty Differential Revision: https://reviews.llvm.org/D98660	2021-03-15 17:33:02 -04:00
Jonas Paulsson	9cfd301ec8	[SystemZ] Test for isinf and isfinite in testFPKind(). Recognize BI__builtin_isinf and BI__builtin_isfinite (and a few other opcodes for finite) in testFPKind() and handle with TDC. Review: Ulrich Weigand. Differential Revision: https://reviews.llvm.org/D97901	2021-03-15 15:02:39 -06:00
Stelios Ioannou	ab86edbc88	[AArch64] Implement __rndr, __rndrrs intrinsics This patch implements the __rndr and __rndrrs intrinsics to provide access to the random number instructions introduced in Armv8.5-A. They are only defined for the AArch64 execution state and are available when __ARM_FEATURE_RNG is defined. These intrinsics store the random number in their pointer argument and return a status code if the generation succeeded. The difference between __rndr __rndrrs, is that the latter intrinsic reseeds the random number generator. The instructions write the NZCV flags indicating the success of the operation that we can then read with a CSET. [1] https://developer.arm.com/docs/101028/latest/data-processing-intrinsics [2] https://bugs.llvm.org/show_bug.cgi?id=47838 Differential Revision: https://reviews.llvm.org/D98264 Change-Id: I8f92e7bf5b450e5da3e59943b53482edf0df6efc	2021-03-15 17:51:48 +00:00
Melanie Blower	33b1f3f42c	[clang][patch] Solve PR49479, File scope fp pragma should propagate to functions nested in struct, and initialization expressions Previously, the CurFPFeatures state was set to command line settings before semantic analysis of the nested member functions and initialization expressions, that's not correct, it should use the pragma state which is in effect at the lexical position. Reviewed By: Erich Keane, Aaron Ballman Differential Revision: https://reviews.llvm.org/D98211	2021-03-15 12:15:20 -04:00
Thomas Preud'homme	f60b35340f	Stop traping on sNaN in __builtin_isinf __builtin_isinf currently generates a floating-point compare operation which triggers a trap when faced with a signaling NaN in StrictFP mode. This commit uses integer operations instead to not generate any trap in such a case. Reviewed By: mibintc Differential Revision: https://reviews.llvm.org/D97125	2021-03-15 15:38:08 +00:00
David Green	2b3c813143	[Clang][ARM] Reenable arm_acle.c test. This test was apparently disabled in `6fcd4e080f`, without any sign of how it was going to be reenabled. This patch rewrites the test to use update_cc_test_checks, with midend optimizations other that mem2reg disabled. The first attempt of this patch in `5ae949a927` failed on bots even though it worked locally. I've attempted to adjust the RUN lines and made the test AArch64/ARM specific. Differential Revision: https://reviews.llvm.org/D98510	2021-03-14 10:59:24 +00:00
Nico Weber	d7b7e2026b	Revert "[Clang][ARM] Reenable arm_acle.c test." This reverts commit `5ae949a927`. Test fails everywhere.	2021-03-12 14:37:37 -05:00
David Green	5ae949a927	[Clang][ARM] Reenable arm_acle.c test. This test was apparently disabled in `6fcd4e080f`, without any sign of how it was going to be reenabled. This patch rewrites the test to use update_cc_test_checks, with midend optimizations other that mem2reg disabled.	2021-03-12 19:21:21 +00:00
Nemanja Ivanovic	b5fae4b9b2	[PowerPC] Add more missing overloads to altivec.h We are missing more predicate forms for 'vector double' and some tests. This adds the missing overloads and completes the set of test cases for them.	2021-03-12 10:51:57 -06:00
Sriraman Tallam	cdb42a4cc4	Disable unique linkage suffixes ifor global vars until demanglers can be fixed. D96109 added support for unique internal linkage names for both internal linkage functions and global variables. There was a lot of discussion on how to get the demangling right for functions but I completely missed the point that demanglers do not support suffixes for global vars. For example: $ c++filt _ZL3foo foo $ c++filt _ZL3foo.uniq.123 _ZL3foo.uniq.123 The demangling for functions works as expected. I am not sure of the impact of this. I don't understand how debuggers and other tools depend on the correctness of global variable demangling so I am pre-emptively disabling it until we can get the demangling support added. Importantly, uniquefying global variables is not needed right now as we do not do profile attribution to global vars based on sampling. It was added for completeness and so this feature is not exactly missed. Differential Revision: https://reviews.llvm.org/D98392	2021-03-11 20:59:30 -08:00
Mircea Trofin	11b70b9e3a	Revert "[NPM][CGSCC] FunctionAnalysisManagerCGSCCProxy: do not clear immutable function passes" This reverts commit `5eaeb0fa67`. It appears there are analyses that assume clearing - example: https://lab.llvm.org/buildbot#builders/36/builds/5964	2021-03-11 18:31:19 -08:00
Mircea Trofin	5eaeb0fa67	[NPM][CGSCC] FunctionAnalysisManagerCGSCCProxy: do not clear immutable function passes Check with the analysis result by calling invalidate instead of clear on the analysis manager. Differential Revision: https://reviews.llvm.org/D98440	2021-03-11 18:15:28 -08:00
Florian Hahn	c92ec0dd92	[Matrix] Add support for matrix-by-scalar division. This patch extends the matrix spec to allow matrix-by-scalar division. Originally support for `/` was left out to avoid ambiguity for the matrix-matrix version of `/`, which could either be elementwise or specified as matrix multiplication M1 * (1/M2). For the matrix-scalar version, no ambiguity exists; `*` is also an elementwise operation in that case. Matrix-by-scalar division is commonly supported by systems including Matlab, Mathematica or NumPy. Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D97857	2021-03-11 22:21:23 +00:00
Zakk Chen	d6a0560bf2	[Clang][RISCV] Add custom TableGen backend for riscv-vector intrinsics. Demonstrate how to generate vadd/vfadd intrinsic functions 1. add -gen-riscv-vector-builtins for clang builtins. 2. add -gen-riscv-vector-builtin-codegen for clang codegen. 3. add -gen-riscv-vector-header for riscv_vector.h. It also generates ifdef directives with extension checking, base on D94403. 4. add -gen-riscv-vector-generic-header for riscv_vector_generic.h. Generate overloading version Header for generic api. https://github.com/riscv/rvv-intrinsic-doc/blob/master/rvv-intrinsic-rfc.md#c11-generic-interface 5. update tblgen doc for riscv related options. riscv_vector.td also defines some unused type transformers for vadd, because I think it could demonstrate how tranfer type work and we need them for the whole intrinsic functions implementation in the future. Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com> Co-Authored-by: Zakk Chen <zakk.chen@sifive.com> Reviewed By: jrtc27, craig.topper, HsiangKai, Jim, Paul-C-Anagnostopoulos Differential Revision: https://reviews.llvm.org/D95016	2021-03-10 18:43:43 -08:00
Jingu Kang	25951c5ab8	[AArch64] Add missing intrinsics for scalar FP rounding Differential Revision: https://reviews.llvm.org/D98269	2021-03-10 13:22:29 +00:00
Fangrui Song	b4948c27d2	Revert D97743 "Define __GCC_HAVE_DWARF2_CFI_ASM if applicable" This reverts commit `c11ff4bbad` & `df67d35269`. Trying to make the change to the driver to avoid round-trip issues.	2021-03-09 12:14:12 -08:00
Fangrui Song	df67d35269	[test] Fix debug-info-macro.c	2021-03-09 12:04:51 -08:00
diggerlin	46d4d1fea4	[AIX] do not emit visibility attribute into IR when there is -mignore-xcoff-visibility SUMMARY: n the patch https://reviews.llvm.org/D87451 "add new option -mignore-xcoff-visibility" we did as "The option -mignore-xcoff-visibility has no effect on visibility attribute when compile with -emit-llvm option to generated LLVM IR." in these patch we let -mignore-xcoff-visibility effect on generating IR too. the new feature only work on AIX OS Reviewer: Jason Liu, Differential Revision: https://reviews.llvm.org/D89986	2021-03-09 10:38:00 -05:00
Tomas Matheson	7e5cea5b50	[Clang][Sema] Warn when function argument is less aligned than parameter See https://bugs.llvm.org/show_bug.cgi?id=42154. GCC's __attribute__((align)) can reduce the alignment of a type when applied to a typedef. However, functions which take a pointer or reference to the original type are compiled assuming the original alignment. Therefore when any such function is passed an object of the new, less-aligned type, an alignment fault can occur. In particular, this applies to the constructor, which is defined for the original type and called for the less-aligned object. This change adds a warning whenever an pointer or reference to an object is passed to a function that was defined for a more-aligned type. The calls to ASTContext::getTypeAlignInChars seem change the order in which record layouts are evaluated, which caused changes to the output of -fdump-record-layouts. As such some tests needed to be updated: * Use CHECK-LABEL rather than counting the number of "Dumping AST Record Layout" headers. * Check for end of line in labels, so that struct B1 doesn't match struct B etc. * Add --strict-whitespace, since the whitespace shows meaningful structure. * The order in which record layouts are printed has changed in some cases. * clang-format for regions changed Differential Revision: https://reviews.llvm.org/D97187	2021-03-09 10:37:32 +00:00
Ahsan Saghir	acce401068	[PowerPC] Change target data layout for 16-byte stack alignment This changes the target data layout to make stack align to 16 bytes on Power10. Before this change, stack was being aligned to 32 bytes. Reviewed By: #powerpc, nemanjai Differential Revision: https://reviews.llvm.org/D96265	2021-03-08 08:13:08 -06:00
Saurabh Jha	63851a701e	[Matrix] Implement += and -= for MatrixType. Make sure CompLHSTy is set correctly for += and -= and matrix type operands. Bugzilla ticket is here https://bugs.llvm.org/show_bug.cgi?id=46164 Patch by Saurabh Jha <saurabh.jhaa@gmail.com> Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D98075	2021-03-08 09:32:11 +00:00
Nemanja Ivanovic	f4ad7a1a15	[PowerPC] Add missing double precision vec_all overloads to altivec.h We somehow missed vec_all_nlt, vec_all_nle and vec_all_numeric overloads for double precision vectors when VSX is enabled.	2021-03-05 18:42:12 -06:00
Sriraman Tallam	78d0e91865	Refactor -funique-internal-linakge-names implementation. The option -funique-internal-linkage-names was added in D73307 and D78243 as a LLVM early pass to insert a unique suffix to internal linkage functions and vars. The unique suffix was the hash of the module path. However, we found that this can be done more cleanly in clang early and the fixes that need to be done later can be completely avoided. The fixes in particular are trying to modify the DW_AT_linkage_name and finding the right place to insert the pass. This patch ressurects the original implementation proposed in D73307 which was reviewed and then ditched in favor of the pass based approach. Differential Revision: https://reviews.llvm.org/D96109	2021-03-05 13:32:17 -08:00
Chen Zheng	afa76fe67a	[XCOFF][DWARF] set default DWARF version to 3. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D98010	2021-03-05 09:21:57 -05:00
Jingu Kang	9b302513f6	[AArch64] Add missing intrinsics for vrnd	2021-03-05 11:26:12 +00:00
Gui Andrade	10264a1b21	Introduce noundef attribute at call sites for stricter poison analysis This change adds a new IR noundef attribute, which denotes when a function call argument or return val may never contain uninitialized bits. In MemorySanitizer, this attribute enables optimizations which decrease instrumented code size by up to 17% (measured with an instrumented build of clang) . I'll introduce the change allowing msan to take advantage of this information in a separate patch. Differential Revision: https://reviews.llvm.org/D81678	2021-03-04 12:15:12 -08:00
Thomas Preud'homme	52bfe6605a	Add __builtin_isnan(__fp16) testcase Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D97777	2021-03-04 13:03:48 +00:00
Thomas Preud'homme	6d6e7132f9	Revert "Add __builtin_isnan(__fp16) testcase" This reverts commit `e77b5c40d5` because it fails without `1b6eb56aa0`.	2021-03-04 12:18:03 +00:00
Thomas Preud'homme	b7aeece47c	Revert "Stop traping on sNaN in __builtin_isinf" This reverts commit `1b6eb56aa0` because the invert logic for isfinite is incorrect.	2021-03-04 12:07:35 +00:00
Wang, Pengfei	e7e67c930a	Add Windows ehcont section support (/guard:ehcont). Add option /guard:ehcont Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D96709	2021-03-04 11:47:29 +08:00
Fangrui Song	584cb67d2d	[IRSymTab] Set FB_used on llvm.compiler.used symbols IR symbol table does not parse inline asm. A symbol only referenced by inline asm is not in the IR symbol table, so LTO does not know that the definition (in another translation unit) is referenced and may internalize it, even if that definition has `__attribute__((used))` (which lowers to `llvm.compiler.used` on ELF targets since D97446). ``` // cabac.c __attribute__((used)) const uint8_t ff_h264_cabac_tables[...] = {...}; // h264_cabac.c asm("lea ff_h264_cabac_tables(%rip), %0" : ...); ``` `__attribute__((used))` is the recommended way to tell the compiler there may be inline asm references, so the usage is perfectly fine. This patch conservatively sets the `FB_used` bit on `llvm.compiler.used` symbols to work around the IR symbol table limitation. Note: before D97446, Clang never emitted symbols in the `llvm.compiler.used` list, so this change does not punish any Clang emitted global object. Without the patch, `ff_h264_cabac_tables` may be assigned to a non-external partition and get internalized. Then we will get a linker error because the `cabac.c` definition is not exposed. Differential Revision: https://reviews.llvm.org/D97755	2021-03-03 16:22:30 -08:00
JinGu Kang	394a4d0433	[AArch64] Add missing intrinsics for vcls Differential Revision: https://reviews.llvm.org/D97775	2021-03-03 10:17:56 +00:00
Thomas Preud'homme	e77b5c40d5	Add __builtin_isnan(__fp16) testcase Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D97777	2021-03-02 21:01:51 +00:00
Thomas Preud'homme	1b6eb56aa0	Stop traping on sNaN in __builtin_isinf __builtin_isinf currently generates a floating-point compare operation which triggers a trap when faced with a signaling NaN in StrictFP mode. This commit uses integer operations instead to not generate any trap in such a case. Reviewed By: mibintc Differential Revision: https://reviews.llvm.org/D97125	2021-03-02 15:54:56 +00:00
Nemanja Ivanovic	1ff93618e5	[PowerPC] Add missing overloads of vec_promote to altivec.h The VSX-only overloads (for 8-byte element vectors) are missing. Add the missing overloads and convert element numbering to modulo arithmetic to match GCC and XLC.	2021-03-01 21:40:30 -06:00
Fangrui Song	d942a82a07	Make -f[no-]split-dwarf-inlining CC1 default align with driver default (no inlining) This makes CC1 and driver defaults consistent. In addition, for more common cases (-g is specified without -gsplit-dwarf), users will not see -fno-split-dwarf-inlining in CC1 options. Verified that the below is still true: * `clang -g` => `splitDebugInlining: false` in DICompileUnit * `clang -g -gsplit-dwarf` => `splitDebugInlining: false` in DICompileUnit * `clang -g -gsplit-dwarf -fsplit-dwarf-inlining` => no `splitDebugInlining: false` Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D97706	2021-03-01 10:55:19 -08:00
Yonghong Song	283db5f083	BPF: fix enum value 0 issue for __builtin_preserve_enum_value() Lorenz Bauer reported that the following code will have compilation error for bpf target: enum e { TWO }; bpf_core_enum_value_exists(enum e, TWO); The clang emitted the following error message: __builtin_preserve_enum_value argument 1 invalid In SemaChecking, an expression like "(enum NAME)1" will have cast kind CK_IntegralToPointer, but "(enum NAME)0" will have cast kind CK_NullToPointer. Current implementation only permits CK_IntegralToPointer, missing enum value 0 case. This patch permits CK_NullToPointer cast kind and the above test case can pass now. Differential Revision: https://reviews.llvm.org/D97659	2021-03-01 10:23:24 -08:00
Sean Fertile	3f40dbbbc7	[PowerPC][AIX] Enable passing vectors in variadic functions. Differential Revision: https://reviews.llvm.org/D97474	2021-03-01 13:08:28 -05:00
Arthur Eubanks	040c1b49d7	Move EntryExitInstrumentation pass location This seems to be more of a Clang thing rather than a generic LLVM thing, so this moves it out of LLVM pipelines and as Clang extension hooks into LLVM pipelines. Move the post-inline EEInstrumentation out of the backend pipeline and into a late pass, similar to other sanitizer passes. It doesn't fit into the codegen pipeline. Also fix up EntryExitInstrumentation not running at -O0 under the new PM. PR49143 Reviewed By: hans Differential Revision: https://reviews.llvm.org/D97608	2021-03-01 10:08:10 -08:00
Fangrui Song	a0c1cd642d	[test] Add -triple x86_64 to attr-retain.c	2021-02-26 17:26:26 -08:00
Fangrui Song	8afdacba9d	Add GNU attribute 'retain' For ELF targets, GCC 11 will set SHF_GNU_RETAIN on the section of a `__attribute__((retain))` function/variable to prevent linker garbage collection. (See AttrDocs.td for the linker support). This patch adds `retain` functions/variables to the `llvm.used` list, which has the desired linker GC semantics. Note: `retain` does not imply `used`, so an unused function/variable can be dropped by Sema. Before 'retain' was introduced, previous ELF solutions require inline asm or linker tricks, e.g. `asm volatile(".reloc 0, R_X86_64_NONE, target");` (architecture dependent) or define a non-local symbol in the section and use `ld -u`. There was no elegant source-level solution. With D97448, `__attribute__((retain))` will set `SHF_GNU_RETAIN` on ELF targets. Differential Revision: https://reviews.llvm.org/D97447	2021-02-26 16:37:50 -08:00
Fangrui Song	28cb620321	Change some addUsedGlobal to addUsedOrCompilerUsedGlobal An global value in the `llvm.used` list does not have GC root semantics on ELF targets. This will be changed in a subsequent backend patch. Change some `llvm.used` in the ELF code path to use `llvm.compiler.used` to prevent undesired GC root semantics. Change one extern "C" alias (due to `__attribute__((used))` in extern "C") to use `llvm.compiler.used` on all targets. GNU ld has a rule "`__start_/__stop_` references from a live input section retain the associated C identifier name sections", which LLD may drop entirely (currently refined to exclude SHF_LINK_ORDER/SHF_GROUP) in a future release (the rule makes it clumsy to GC metadata sections; D96914 added a way to try the potential future behavior). For `llvm.used` global values defined in a C identifier name section, keep using `llvm.used` so that the future LLD change will not affect them. rnk kindly categorized the changes: ``` ObjC/blocks: this wants GC root semantics, since ObjC mainly runs on Mac. MS C++ ABI stuff: wants GC root semantics, no change OpenMP: unsure, but GC root semantics probably don't hurt CodeGenModule: affected in this patch to not use GC root semantics so that __attribute__((used)) behavior remains the same on ELF, plus two other minor use cases that don't want GC semantics Coverage: Probably want GC root semantics CGExpr.cpp: refers to LTO, wants GC root CGDeclCXX.cpp: one is MS ABI specific, so yes GC root, one is some other C++ init functionality, which should form GC roots (C++ initializers can have side effects and must run) CGDecl.cpp: Changed in this patch for __attribute__((used)) ``` Differential Revision: https://reviews.llvm.org/D97446	2021-02-26 10:42:07 -08:00
Petr Hosek	8459b8ef39	[Driver] Rename -fprofile-{prefix-map,compilation-dir} to -fcoverage-{prefix-map,compilation-dir} These flags affect coverage mapping (-fcoverage-mapping), not -fprofile-[instr-]generate so it makes more sense to use the -fcoverage-* prefix. Differential Revision: https://reviews.llvm.org/D97434	2021-02-25 21:40:12 -08:00
Dan Liew	7b1d2a2891	[NFC] Switch to auto marshalling infrastructure for `-fsanitize-address-destructor-kind=` flag. This change simplifies `clang/lib/Frontend/CompilerInvocation.cpp` because we no longer need to manually parse the flag and set codegen options in the frontend. However, we still need to manually parse the flag in the driver because: * The marshalling infrastructure doesn't operate there. * We need to do some platform specific checks in the driver that will likely never be supported by any kind of marshalling infrastructure. rdar://71609176 Differential Revision: https://reviews.llvm.org/D97327	2021-02-25 13:24:50 -08:00
Dan Liew	5d64dd8e3c	[Clang][ASan] Introduce `-fsanitize-address-destructor-kind=` driver & frontend option. The new `-fsanitize-address-destructor-kind=` option allows control over how module destructors are emitted by ASan. The new option is consumed by both the driver and the frontend and is propagated into codegen options by the frontend. Both the legacy and new pass manager code have been updated to consume the new option from the codegen options. It would be nice if the new utility functions (`AsanDtorKindToString` and `AsanDtorKindFromString`) could live in LLVM instead of Clang so they could be consumed by other language frontends. Unfortunately that doesn't work because the clang driver doesn't link against the LLVM instrumentation library. rdar://71609176 Differential Revision: https://reviews.llvm.org/D96572	2021-02-25 12:02:21 -08:00
Liu, Chen3	4bc7c8631a	[X86] Support amx-bf16 intrinsic. Adding support for intrinsics of AMX-BF16. This patch alse fix a bug that AMX-INT8 instructions will be selected with wrong predicate. Differential Revision: https://reviews.llvm.org/D97358	2021-02-25 09:06:48 +08:00
Dávid Bolvanský	053dc95839	Reduce the number of attributes attached to each function Patch takes advantage of the implicit default behavior to reduce the number of attributes, which in turns reduces compilation time. Reviewed By: serge-sans-paille Differential Revision: https://reviews.llvm.org/D97116	2021-02-24 07:08:44 +01:00
Hsiangkai Wang	1a35a1b074	[RISCV] Add vadd with mask and without mask builtin. Demonstrate how to add RISC-V V builtins and lower them to IR intrinsics for V extension. Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com> Co-Authored-by: Hsiangkai Wang <kai.wang@sifive.com> Differential Revision: https://reviews.llvm.org/D93446	2021-02-24 07:57:31 +08:00
Liu, Chen3	f8b9035aae	[X86] Support amx-int8 intrinsic. Adding support for intrinsics of TDPBSUD/TDPBUSD/TDPBUUD. Differential Revision: https://reviews.llvm.org/D97259	2021-02-23 17:08:05 +08:00
Ryan Santhiraraja	2c25efcbd3	[AArch64] Adding SHA3 Intrinsics support This patch adds the following SHA3 Intrinsics: vsha512hq_u64, vsha512h2q_u64, vsha512su0q_u64, vsha512su1q_u64 veor3q_u8 veor3q_u16 veor3q_u32 veor3q_u64 veor3q_s8 veor3q_s16 veor3q_s32 veor3q_s64 vrax1q_u64 vxarq_u64 vbcaxq_u8 vbcaxq_u16 vbcaxq_u32 vbcaxq_u64 vbcaxq_s8 vbcaxq_s16 vbcaxq_s32 vbcaxq_s64 Note need to include +sha3 and +crypto when building from the front-end Reviewed By: DavidSpickett Differential Revision: https://reviews.llvm.org/D96381	2021-02-22 12:09:20 +00:00
Dávid Bolvanský	ee51c42e00	Reduce the number of attributes attached to each function This takes advantage of the implicit default behavior to reduce the number of attributes.	2021-02-20 06:57:47 +01:00
Dávid Bolvanský	cd54c57919	Reland "[Libcalls, Attrs] Annotate libcalls with noundef" Fixed Clang tests.	2021-02-20 06:18:48 +01:00
Christopher Tetreault	55448ab540	[AArch64] Adding Neon Polynomial vadd Intrinsics This patch adds the following intrinsics: vadd_p8 vadd_p16 vadd_p64 vaddq_p8 vaddq_p16 vaddq_p64 vaddq_p128 Reviewed By: t.p.northover, DavidSpickett, ctetreau Differential Revision: https://reviews.llvm.org/D96825	2021-02-19 14:48:12 -08:00
Artem Belevich	1a368ae3b7	[CUDA] fix builtin constraints for PTX 7.2 This fixes build issues w/ CUDA-11 introduced by https://reviews.llvm.org/D95974 Reviewed By: yaxunl Differential Revision: https://reviews.llvm.org/D97009	2021-02-19 09:57:21 -08:00
Nikita Popov	71a8e4e7d6	[MemCopyOpt] Enable MemorySSA by default This enables use of MemorySSA instead of MemDep in MemCpyOpt. To allow this without significant compile-time impact, the MemCpyOpt pass is moved directly before DSE (in the cases where this was not already the case), which allows us to reuse the existing MemorySSA analysis. Unlike the MemDep-based implementation, the MemorySSA-based MemCpyOpt can also perform simple optimizations across basic blocks. Differential Revision: https://reviews.llvm.org/D94376	2021-02-19 18:06:25 +01:00
Petr Hosek	5fbd1a333a	[Coverage] Store compilation dir separately in coverage mapping We currently always store absolute filenames in coverage mapping. This is problematic for several reasons. It poses a problem for distributed compilation as source location might vary across machines. We are also duplicating the path prefix potentially wasting space. This change modifies how we store filenames in coverage mapping. Rather than absolute paths, it stores the compilation directory and file paths as given to the compiler, either relative or absolute. Later when reading the coverage mapping information, we recombine relative paths with the working directory. This approach is similar to handling ofDW_AT_comp_dir in DWARF. Finally, we also provide a new option, -fprofile-compilation-dir akin to -fdebug-compilation-dir which can be used to manually override the compilation directory which is useful in distributed compilation cases. Differential Revision: https://reviews.llvm.org/D95753	2021-02-18 14:34:39 -08:00
Petr Hosek	fbf8b957fd	Revert "[Coverage] Store compilation dir separately in coverage mapping" This reverts commit `97ec8fa5bb` since the test is failing on some bots.	2021-02-18 12:50:24 -08:00
Pengxuan Zheng	0ec32f1326	Revert "[AArch64] Adding Neon Polynomial vadd Intrinsics" Revert the patch due to buildbot failures. This reverts commit `d9645059c5`.	2021-02-18 12:38:16 -08:00
Petr Hosek	97ec8fa5bb	[Coverage] Store compilation dir separately in coverage mapping We currently always store absolute filenames in coverage mapping. This is problematic for several reasons. It poses a problem for distributed compilation as source location might vary across machines. We are also duplicating the path prefix potentially wasting space. This change modifies how we store filenames in coverage mapping. Rather than absolute paths, it stores the compilation directory and file paths as given to the compiler, either relative or absolute. Later when reading the coverage mapping information, we recombine relative paths with the working directory. This approach is similar to handling ofDW_AT_comp_dir in DWARF. Finally, we also provide a new option, -fprofile-compilation-dir akin to -fdebug-compilation-dir which can be used to manually override the compilation directory which is useful in distributed compilation cases. Differential Revision: https://reviews.llvm.org/D95753	2021-02-18 12:27:42 -08:00
Pengxuan Zheng	d9645059c5	[AArch64] Adding Neon Polynomial vadd Intrinsics This patch adds the following intrinsics: vadd_p8 vadd_p16 vadd_p64 vaddq_p8 vaddq_p16 vaddq_p64 vaddq_p128 Reviewed By: t.p.northover, DavidSpickett Differential Revision: https://reviews.llvm.org/D96825	2021-02-18 11:33:24 -08:00
Jonas Paulsson	e57bd1ff4f	[CFE, SystemZ] New target hook testFPKind() for checks of FP values. The recent commit `00a6254` "Stop traping on sNaN in builtin_isnan" changed the lowering in constrained FP mode of builtin_isnan from an FP comparison to integer operations to avoid trapping. SystemZ has a special instruction "Test Data Class" which is the preferred way to do this check. This patch adds a new target hook "testFPKind()" that lets SystemZ emit the s390_tdc intrinsic instead. testFPKind() takes the BuiltinID as an argument and is expected to soon handle more opcodes than just 'builtin_isnan'. Review: Thomas Preud'homme, Ulrich Weigand Differential Revision: https://reviews.llvm.org/D96568	2021-02-18 12:36:46 -06:00
Jeroen Dobbelaere	46757ccb49	[clang] functions with the 'const' or 'pure' attribute must always return. As described in * https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-pure-function-attribute * https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-const-function-attribute An `__attribute__((pure))` function must always return, as well as an `__attribute__((const))` function. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D96960	2021-02-18 17:29:46 +01:00
Hsiangkai Wang	766ee1096f	[Clang][RISCV] Define RISC-V V builtin types Add the types for the RISC-V V extension builtins. These types will be used by the RISC-V V intrinsics which require types of the form <vscale x 1 x i64>(LMUL=1 element size=64) or <vscale x 4 x i32>(LMUL=2 element size=32), etc. The vector_size attribute does not work for us as it doesn't create a scalable vector type. We want these types to be opaque and have no operators defined for them. We want them to be sizeless. This makes them similar to the ARM SVE builtin types. But we will have quite a bit more types. This patch adds around 60. Later patches will add another 230 or so types representing tuples of these types similar to the x2/x3/x4 types in ARM SVE. But with extra complexity that these types are combined with the LMUL concept that is unique to RISCV. For more background see this RFC http://lists.llvm.org/pipermail/llvm-dev/2020-October/145850.html Authored-by: Roger Ferrer Ibanez <roger.ferrer@bsc.es> Co-Authored-by: Hsiangkai Wang <kai.wang@sifive.com> Differential Revision: https://reviews.llvm.org/D92715	2021-02-18 10:17:31 +08:00
Sriraman Tallam	e741916330	Basic block sections should enable not function sections implicitly. Basic block sections enables function sections implicitly, this is not needed and is inefficient with "=list" option. We had basic block sections enable function sections implicitly in clang. This is particularly inefficient with "=list" option as it places functions that do not have any basic block sections in separate sections. This causes unnecessary object file overhead for large applications. This patch disables this implicit behavior. It only creates function sections for those functions that require basic block sections. This patch is the second of two patches and this patch removes the implicit enabling of function sections with basic block sections in clang. Differential Revision: https://reviews.llvm.org/D93876	2021-02-17 12:37:50 -08:00
Igor Kudrin	aa84289629	[DebugInfo] Keep the DWARF64 flag in the module metadata This allows the option to affect the LTO output. Module::Max helps to generate debug info for all modules in the same format. Differential Revision: https://reviews.llvm.org/D96597	2021-02-17 17:03:34 +07:00
serge-sans-paille	3c8bf29f14	Reduce the number of attributes attached to each function This takes advantage of the implicit default behavior to reduce the number of attributes, which in turns reduces compilation time. I've observed -3% in instruction count when compiling sqlite3 amalgamation with -O0 Differential Revision: https://reviews.llvm.org/D96400	2021-02-16 16:19:54 +01:00
Wang, Pengfei	61da20575d	[X86] Convert fmin/fmax _mm_reduce_* intrinsics to emit llvm.reduction intrinsics (PR47506) This is a follow up of D92940. We have successfully converted fadd/fmul _mm_reduce_* intrinsics to llvm.reduction + reassoc flag. We can do the same approach for fmin/fmax too, i.e. llvm.reduction + nnan flag. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D93179	2021-02-15 08:52:06 +08:00
Jonas Paulsson	b3ac5b84cd	[SystemZ] Fix vecintrin.h to not emit alignment hints in vec_xl/vec_xst. vec_xl() and vec_xst() should not emit alignment hints since they take a scalar pointer and also add a byte offset if passed. This patch uses memcpy to achieve the desired result. Review: Ulrich Weigand Differential Revision: https://reviews.llvm.org/D96471	2021-02-12 18:26:36 -06:00
Florian Hahn	51bf4c0e6d	[clang] Add -ffinite-loops & -fno-finite-loops options. This patch adds 2 new options to control when Clang adds `mustprogress`: 1. -ffinite-loops: assume all loops are finite; mustprogress is added to all loops, regardless of the selected language standard. 2. -fno-finite-loops: assume no loop is finite; mustprogress is not added to any loop or function. We could add mustprogress to functions without loops, but we would have to detect that in Clang, which is probably not worth it. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D96419	2021-02-12 19:25:49 +00:00
Florian Hahn	fb4d8fe807	[clang] Update mustprogress tests. This unifies the positive and negative tests in a single file and manually adjusts the check lines to check for differences surgically.	2021-02-12 16:53:51 +00:00
James Y Knight	8043d5a964	NFC: update clang tests to check ordering and alignment for atomicrmw/cmpxchg. The ability to specify alignment was recently added, and it's an important property which we should ensure is set as expected by Clang. (Especially before making further changes to Clang's code in this area.) But, because it's on the end of the lines, the existing tests all ignore it. Therefore, update all the tests to also verify the expected alignment for atomicrmw and cmpxchg. While I was in there, I also updated uses of 'load atomic' and 'store atomic', and added the memory ordering, where that was missing.	2021-02-11 17:35:09 -05:00
Pengxuan Zheng	61cca0f2e5	[AArch64] Adding Neon Sm3 & Sm4 Intrinsics This adds SM3 and SM4 Intrinsics support for AArch64, specifically: vsm3ss1q_u32 vsm3tt1aq_u32 vsm3tt1bq_u32 vsm3tt2aq_u32 vsm3tt2bq_u32 vsm3partw1q_u32 vsm3partw2q_u32 vsm4eq_u32 vsm4ekeyq_u32 Reviewed By: labrinea Differential Revision: https://reviews.llvm.org/D95655	2021-02-11 14:20:20 -08:00
Douglas Yung	7b4832648a	NFCI. With the move to the new pass manager by default, sanitize-coverage.c is now passing on ARM. This change removes the XFAIL from the original test and duplicates the test into sanitize-coverage-old-pm.c which uses the old pass manager and has the corresponding XFAIL. This should fix the XPASS from this and similar runs: http://lab.llvm.org:8011/#/builders/60/builds/1875	2021-02-11 13:18:18 -08:00
Paul Robinson	5ea2d4fa48	Avoid conflicts between debug-info and pseudo-probe profiling After D93264, using both -fdebug-info-for-profiling and -fpseudo-probe-for-profiling will cause the compiler to crash. Diagnose these conflicting options in the driver. Also, the existing CodeGen test was using the driver when it should be running cc1. Differential Revision: https://reviews.llvm.org/D96354	2021-02-10 07:09:18 -08:00
Wang, Pengfei	dd2460ed5d	[X86] Always assign reassoc flag for intrinsics reduce_add/mul_ps/pd. Intrinsics reduce_add/mul_ps/pd have assumption that the elements in the vector are reassociable. So we need to always assign the reassoc flag when we call _mm_reduce_* intrinsics. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D96231	2021-02-09 21:14:06 +08:00
Petr Hosek	9fd9b5a9c9	Don't emit coverage mapping for excluded functions When a function or a file is excluded using -fprofile-list= option, don't emit coverage mapping as doing so confuses users since those functions would always have zero count. This also reduces the binary size considerably in cases where only a few functions or files are being instrumented. Differential Revision: https://reviews.llvm.org/D96000	2021-02-05 13:03:57 -08:00
Thomas Preud'homme	00a62547da	Stop traping on sNaN in __builtin_isnan __builtin_isnan currently generates a floating-point compare operation which triggers a trap when faced with a signaling NaN in StrictFP mode. This commit uses integer operations instead to not generate any trap in such a case. Reviewed By: kpn Differential Revision: https://reviews.llvm.org/D95948	2021-02-05 18:28:48 +00:00
Krzysztof Parzyszek	bc097f645e	[Hexagon] Add clang builtin definitions for Hexagon V68	2021-02-04 09:54:52 -06:00
Kevin P. Neal	81b69879c9	[FPEnv][X86] Platform builtins edition: clang should get from the AST the metadata for constrained FP builtins Currently clang is not correctly retrieving from the AST the metadata for constrained FP builtins. This patch fixes that for the X86 specific builtins. Differential Revision: https://reviews.llvm.org/D94614	2021-02-03 11:49:17 -05:00
Hongtao Yu	3d89b3cbec	[CSSPGO] Introducing distribution factor for pseudo probe. Sample re-annotation is required in LTO time to achieve a reasonable post-inline profile quality. However, we have seen that such LTO-time re-annotation degrades profile quality. This is mainly caused by preLTO code duplication that is done by passes such as loop unrolling, jump threading, indirect call promotion etc, where samples corresponding to a source location are aggregated multiple times due to the duplicates. In this change we are introducing a concept of distribution factor for pseudo probes so that samples can be distributed for duplicated probes scaled by a factor. We hope that optimizations duplicating code well-maintain the branch frequency information (BFI) based on which probe distribution factors are calculated. Distribution factors are updated at the end of preLTO pipeline to reflect an estimated portion of the real execution count. This change also introduces a pseudo probe verifier that can be run after each IR passes to detect duplicated pseudo probes. A saturated distribution factor stands for 1.0. A pesudo probe will carry a factor with the value ranged from 0.0 to 1.0. A 64-bit integral distribution factor field that represents [0.0, 1.0] is associated to each block probe. Unfortunately this cannot be done for callsite probes due to the size limitation of a 32-bit Dwarf discriminator. A 7-bit distribution factor is used instead. Changes are also needed to the sample profile inliner to deal with prorated callsite counts. Call sites duplicated by PreLTO passes, when later on inlined in LTO time, should have the callees’s probe prorated based on the Prelink-computed distribution factors. The distribution factors should also be taken into account when computing hotness for inline candidates. Also, Indirect call promotion results in multiple callisites. The original samples should be distributed across them. This is fixed by adjusting the callisites' distribution factors. Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D93264	2021-02-02 11:55:01 -08:00
Fangrui Song	74c94b5d9c	[test] Default clang/test to FileCheck --allow-unused-prefixes=false	2021-02-02 11:22:46 -08:00
Zarko Todorovski	eb3426a528	[AIX] Improve option processing for mabi=vec-extabi and mabi=vec=defaul Opening this revision to better address comments by @hubert.reinterpretcast in https://reviews.llvm.org/rGcaaaebcde462 Reviewed By: hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D95702	2021-02-02 10:59:21 -05:00
Nico Weber	f2b4cc91e0	Revert "[test] Default clang/test to FileCheck --allow-unused-prefixes=false" This reverts commit `80f539526e`. Many test failures on mac: http://45.33.8.238/macm1/2772/summary.html One on win: http://45.33.8.238/win/32442/summary.html	2021-02-02 07:38:44 -05:00
Fangrui Song	80f539526e	[test] Default clang/test to FileCheck --allow-unused-prefixes=false	2021-02-01 22:02:59 -08:00
Petr Hosek	0217f1c7a3	Make the profile-filter.c test compatible with 32-bit systems This addresses PR48930. Differential Revision: https://reviews.llvm.org/D95658	2021-01-29 09:58:32 -08:00
Abhina Sreeskantharajan	42a21778f6	[test] Use host platform specific error message substitution in lit tests On z/OS, the following error message is not matched correctly in lit tests. ``` EDC5129I No such file or directory. ``` This patch uses a lit config substitution to check for platform specific error messages. Reviewed By: muiez, jhenderson Differential Revision: https://reviews.llvm.org/D95246	2021-01-29 07:16:30 -05:00
Thomas Lively	4b68b64dcc	[WebAssembly] Prototype i8x16 to i32x4 widening instructions As proposed in https://github.com/WebAssembly/simd/pull/395 and matching the opcodes used in V8: https://chromium-review.googlesource.com/c/v8/v8/+/2617385/4/src/wasm/wasm-opcodes.h Differential Revision: https://reviews.llvm.org/D95557	2021-01-28 10:59:32 -08:00
James Y Knight	a7246ba02a	Itanium Mangling: In 'enable_if', omit X/E around <expr-primary>. The Clang enable_if extension is mangled as an <extended-qualifier>, which is supposed to contain <template-args>. However, we were unconditionally emitting X/E around its arguments, neglecting the fact that <expr-primary> should be emitted directly without the surrounding X/E. Differential Revision: https://reviews.llvm.org/D95488	2021-01-27 16:46:52 -05:00
Fangrui Song	3e80686186	[test] Fix clang/test/CodeGen tests	2021-01-27 10:55:27 -08:00
Freddy Ye	1edb76cc91	[X86] merge "={eax}" and "~{eax}" into "=&eax" for MSInlineASM Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D94466	2021-01-27 22:54:17 +08:00
Petr Hosek	bb9eb19829	Support for instrumenting only selected files or functions This change implements support for applying profile instrumentation only to selected files or functions. The implementation uses the sanitizer special case list format to select which files and functions to instrument, and relies on the new noprofile IR attribute to exclude functions from instrumentation. Differential Revision: https://reviews.llvm.org/D94820	2021-01-26 17:13:34 -08:00
Petr Hosek	1e634f3952	Revert "Support for instrumenting only selected files or functions" This reverts commit `4edf35f11a` because the test fails on Windows bots.	2021-01-26 12:25:28 -08:00
Fangrui Song	189f311130	CGDebugInfo CreatedLimitedType: Drop file/line for RecordType with invalid location For Clang synthesized `__va_list_tag` (`CreateX86_64ABIBuiltinVaListDecl`), its DW_AT_decl_file/DW_AT_decl_line are arbitrarily set from `CurLoc`. In a stage 2 `-DCMAKE_BUILD_TYPE=Debug` clang build, I observe that in driver.cpp, DW_AT_decl_file/DW_AT_decl_line may be set to an `#include` line (the transitively included file uses va_arg (`__builtin_va_arg`)). This seems arbitrary. Drop that. Reviewed By: #debug-info, dblaikie Differential Revision: https://reviews.llvm.org/D94735	2021-01-26 11:53:25 -08:00
Petr Hosek	4edf35f11a	Support for instrumenting only selected files or functions This change implements support for applying profile instrumentation only to selected files or functions. The implementation uses the sanitizer special case list format to select which files and functions to instrument, and relies on the new noprofile IR attribute to exclude functions from instrumentation. Differential Revision: https://reviews.llvm.org/D94820	2021-01-26 11:11:39 -08:00
Mircea Trofin	0c0d009a88	[NFC] Disallow unused prefixes under clang/test/CodeGen Differential Revision: https://reviews.llvm.org/D95417	2021-01-26 08:05:45 -08:00
Zarko Todorovski	028d7a3668	Remove requirement for -maltivec to be used when using -mabi=vec-extabi or -mabi=vec-default when not using vector code The previous implementation required that `-maltivec` be specified when using either `-mabi=vec-extabi` or `-mabi=vec-default`, this patch removes that requirement. Reviewed By: cebowleratibm Differential Revision: https://reviews.llvm.org/D94986	2021-01-26 07:58:01 -05:00
Abhina Sreeskantharajan	978444d531	Revert "[SystemZ][z/OS] Fix No such file or directory expression error" This reverts commit `06f8a49693`.	2021-01-25 08:29:38 -05:00
Jeroen Dobbelaere	2b9a834c43	[InlineFunction] Use llvm.experimental.noalias.scope.decl for noalias arguments. Insert a llvm.experimental.noalias.scope.decl intrinsic that identifies where a noalias argument was inlined. This patch includes some refactorings from D90104. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D93040	2021-01-23 12:10:57 +01:00
Thomas Lively	11802eced5	[WebAssembly] Prototype new f64x2 conversions As proposed in https://github.com/WebAssembly/simd/pull/383. Differential Revision: https://reviews.llvm.org/D95012	2021-01-20 11:28:06 -08:00
George Burgess IV	b270fd59f0	Revert "[clang] Change builtin object size when subobject is invalid" This reverts commit `275f30df8a`. As noted on the code review (https://reviews.llvm.org/D92892), this change causes us to reject valid code in a few cases. Reverting so we have more time to figure out what the right fix{es are, is} here.	2021-01-20 11:03:34 -08:00
Luo, Yuanke	7e1d2224b4	[X86][AMX] Fix the typo. The dpbsud should be dpbssd. Differential Revision: https://reviews.llvm.org/D94943	2021-01-19 16:57:34 +08:00
Florian Hahn	291ac7e622	[AArch64] Revert back to Intrinsic<> for TME instructions. This patch reverts back to Intrinsic for the instructions for the transactional memory extension, so nosync is not included.	2021-01-18 18:03:58 +00:00
Abhina Sreeskantharajan	689aaba7ac	[SystemZ][z/OS] Fix No such file or directory expression error matching in lit tests On z/OS, the following error message is not matched correctly in lit tests. This patch updates the CHECK expression to match successfully. ``` EDC5129I No such file or directory. ``` Reviewed By: muiez Differential Revision: https://reviews.llvm.org/D94239	2021-01-18 07:14:37 -05:00
Mircea Trofin	e8049dc3c8	[NewPM][Inliner] Move the 'always inliner' case in the same CGSCC pass as 'regular' inliner Expanding from D94808 - we ensure the same InlineAdvisor is used by both InlinerPass instances. The notion of mandatory inlining is moved into the core InlineAdvisor: advisors anyway have to handle that case, so this change also factors out that a bit better. Differential Revision: https://reviews.llvm.org/D94825	2021-01-15 17:59:38 -08:00
Qiu Chaofan	168be42083	[Clang] Mutate long-double math builtins into f128 under IEEE-quad Under -mabi=ieeelongdouble on PowerPC, IEEE-quad floating point semantic is used for long double. This patch mutates call to related builtins into f128 version on PowerPC. And in theory, this should be applied to other targets when their backend supports IEEE 128-bit style libcalls. GCC already has these mutations except nansl, which is not available on PowerPC along with other variants (nans, nansf). Reviewed By: RKSimon, nemanjai Differential Revision: https://reviews.llvm.org/D92080	2021-01-15 16:56:20 +08:00
Erich Keane	9e53c94d8d	[NFC] Update test to not check for 'opaque' in the file name. The intent presumably is to avoid generating 'opaque' in the IR, but the header contains the filename. Thus, having the workspace in a directory with opaque in it causes this test to fail. This just adds a 'CHECK' line on target-triple, which is the last line of the IR-header.	2021-01-14 11:24:06 -08:00
Lucas Prates	2b1e25befe	[AArch64] Adding ACLE intrinsics for the LS64 extension This introduces the ARMv8.7-A LS64 extension's intrinsics for 64 bytes atomic loads and stores: `__arm_ld64b`, `__arm_st64b`, `__arm_st64bv`, and `__arm_st64bv0`. These are selected into the LS64 instructions LD64B, ST64B, ST64BV and ST64BV0, respectively. Based on patches written by Simon Tatham. Reviewed By: tmatheson Differential Revision: https://reviews.llvm.org/D93232	2021-01-14 09:43:58 +00:00
Zequan Wu	e53bbd9951	[IR] move nomerge attribute from function declaration/definition to callsites Move nomerge attribute from function declaration/definition to callsites to allow virtual function calls attach the attribute. Differential Revision: https://reviews.llvm.org/D94537	2021-01-12 12:10:46 -08:00
Jan Svoboda	7ab803095a	[clang][cli] Remove -f[no-]trapping-math from -cc1 command line This patch removes the -f[no-]trapping-math flags from the -cc1 command line. These flags are ignored in the command line parser and their semantics is fully handled by -ffp-exception-mode. This patch does not remove -f[no-]trapping-math from the driver command line. The driver flags are being used and do affect compilation. Reviewed By: dexonsmith, SjoerdMeijer Differential Revision: https://reviews.llvm.org/D93395	2021-01-12 10:00:23 +01:00
Sriraman Tallam	d8c6d24359	-funique-internal-linkage-names appends a hex md5hash suffix to the symbol name which is not demangler friendly, convert it to decimal. Please see D93747 for more context which tries to make linkage names of internal linkage functions to be the uniqueified names. This causes a problem with gdb because breaking using the demangled function name will not work if the new uniqueified name cannot be demangled. The problem is the generated suffix which is a mix of integers and letters which do not demangle. The demangler accepts either all numbers or all letters. This patch simply converts the hash to decimal. There is no loss of uniqueness by doing this as the precision is maintained. The symbol names get longer by a few characters though. Differential Revision: https://reviews.llvm.org/D94154	2021-01-11 11:10:29 -08:00
Joe Ellis	8ea72b3887	[clang][AArch64][SVE] Avoid going through memory for coerced VLST return values VLST return values are coerced to VLATs in the function epilog for consistency with the VLAT ABI. Previously, this coercion was done through memory. It is preferable to use the llvm.experimental.vector.insert intrinsic to avoid going through memory here. Reviewed By: c-rhodes Differential Revision: https://reviews.llvm.org/D94290	2021-01-11 12:10:59 +00:00
Esme-Yi	ffa67873a3	[PowerPC] Add variants of 64-bit vector types for vec_sel. Summary: This patch added variants of vec_sel and fixed bugzilla 46770. Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D94162	2021-01-11 03:52:16 +00:00
Fangrui Song	b41b743d46	[test] Improve weakref & weak_import tests	2021-01-09 23:56:55 -08:00
Fangrui Song	e2e82c9983	[CodeGenModule] Drop dso_local on function declarations for ELF -fno-pic -fno-direct-access-external-data ELF -fno-pic sets dso_local on a function declaration to allow direct accesses when taking its address (similar to a data symbol). The emitted code follows the traditional GCC/Clang -fno-pic behavior: an absolute relocation is produced. If the function is not defined in the executable, a canonical PLT entry will be needed at link time. This is similar to a copy relocation and is incompatible with (-Bsymbolic or --dynamic-list linked shared objects / protected symbols in a shared object). This patch gives -fno-pic code a way to avoid such a canonical PLT entry. The FIXME was about a generalization for -fpie -mpie-copy-relocations (now -fpie -fdirect-access-external-data). While we could set dso_local to avoid GOT when taking the address of a function declaration (there is an ignorable difference about R_386_PC32 vs R_386_PLT32 on i386), it likely does not provide any benefit and can just cause trouble, so we don't make the generalization.	2021-01-09 16:31:56 -08:00
Fangrui Song	38a716c30f	Make -fno-pic respect -fno-direct-access-external-data D92633 added -f[no-]direct-access-external-data to supersede -m[no-]pie-copy-relocations. (The option works for -fpie but is a no-op for -fno-pic and -fpic.) This patch makes -fno-pic -fno-direct-access-external-data drop dso_local from global variable declarations. This usually causes the backend to emit a GOT indirection for external data access. With a GOT relocation, the subsequent -no-pie link will not have copy relocation even if the data symbol turns out to be defined by a shared object. Differential Revision: https://reviews.llvm.org/D92714	2021-01-09 00:32:02 -08:00
Fangrui Song	1d3ebbf537	Add -f[no-]direct-access-external-data to supersede -mpie-copy-relocations GCC r218397 "x86-64: Optimize access to globals in PIE with copy reloc" made -fpie code emit R_X86_64_PC32 to reference external data symbols by default. Clang adopted -mpie-copy-relocations D19996 as a flexible alternative. The name -mpie-copy-relocations can be improved [1] and does not capture the idea that this option can apply to -fno-pic and -fpic [2], so this patch introduces -f[no-]direct-access-external-data and makes -mpie-copy-relocations their aliases for compatibility. [1] For ``` extern int var; int get() { return var; } ``` if var is defined in another translation unit in the link unit, there is no copy relocation. [2] -fno-pic -fno-direct-access-external-data is useful to avoid copy relocations. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65888 If a shared object is linked with -Bsymbolic or --dynamic-list and exports a data symbol, normally the data symbol cannot be accessed by -fno-pic code (because by default an absolute relocation is produced which will lead to a copy relocation). -fno-direct-access-external-data can prevent copy relocations. -fpic -fdirect-access-external-data can avoid GOT indirection. This is like the undefined counterpart of -fno-semantic-interposition. However, the user should define var in another translation unit and link with -Bsymbolic or --dynamic-list, otherwise the linker will error in a -shared link. Generally the user has better tools for their goal but I want to mention that this combination is valid. On COFF, the behavior is like always -fdirect-access-external-data. `__declspec(dllimport)` is needed to enable indirect access. There is currently no plan to affect non-ELF behaviors or -fpic behaviors. -fno-pic -fno-direct-access-external-data will be implemented in the subsequent patch. GCC feature request https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98112 Reviewed By: tmsriram Differential Revision: https://reviews.llvm.org/D92633	2021-01-09 00:32:01 -08:00
Umesh Kalappa	33c8e16f66	PR47391: Canonicalize DIFiles Like @aprantl suggested, modify to use the canonicalized DIFile, if we don't know the loc info and filename for the compiler generated functions for example static initialization functions. Reviewed By: dblaikie, aprantl Differential Revision: https://reviews.llvm.org/D87147	2021-01-08 22:11:16 -08:00
Heejin Ahn	7be271537e	[WebAssembly] Rename wasm_rethrow_in_catch intrinsic/builtin `wasm_rethrow_in_catch` intrinsic and builtin are used in order to rethrow an exception when the exception is caught but there is no matching clause within the current `catch`. For example, ``` try { foo(); } catch (int n) { ... } ``` If the caught exception does not correspond to C++ `int` type, it should be rethrown. These intrinsic/builtin were renamed `rethrow_in_catch` because at the time I thought there would be another intrinsic for C++'s `throw` keyword, which rethrows an exception. It turned out that `throw` keyword doesn't require wasm's `rethrow` instruction, so we rename `rethrow_in_catch` to just `rethrow` here. Reviewed By: dschuff, tlively Differential Revision: https://reviews.llvm.org/D94038	2021-01-08 06:55:04 -08:00
Jeffrey T Mott	275f30df8a	[clang] Change builtin object size when subobject is invalid Motivating example: ``` struct { int v[10]; } t[10]; __builtin_object_size( &t[0].v[11], // access past end of subobject 1 // request remaining bytes of closest surrounding // subobject ); ``` In GCC, this returns 0. https://godbolt.org/z/7TeGs7 In current clang, however, this returns 356, the number of bytes remaining in the whole variable, as if the `type` was 0 instead of 1. https://godbolt.org/z/6Kffox This patch checks for the specific case where we're requesting a subobject's size (type 1) but the subobject is invalid. Differential Revision: https://reviews.llvm.org/D92892	2021-01-07 12:34:07 -08:00
Jeroen Dobbelaere	59fce6b066	[NFC] make clang/test/CodeGen/arm_neon_intrinsics.c resistent to function attribute id changes When introducing support for @llvm.experimental.noalias.scope.decl, this tests started failing because it checks (for no good reason) for a function attribute id of '#8' which now becomes '#9' Reviewed By: pratlucas Differential Revision: https://reviews.llvm.org/D94233	2021-01-07 17:08:15 +00:00
Thomas Lively	497026c902	[WebAssembly] Prototype prefetch instructions As proposed in https://github.com/WebAssembly/simd/pull/352 and using the opcodes used in the V8 prototype: https://chromium-review.googlesource.com/c/v8/v8/+/2543167. These instructions are only usable via intrinsics and clang builtins to make them opt-in while they are being benchmarked. Differential Revision: https://reviews.llvm.org/D93883	2021-01-05 11:32:03 -08:00
Florian Hahn	51d5991f04	[Clang] Add AArch64 VCMLA LANE variants. This patch adds the LANE variants for VCMLA on AArch64 as defined in "Arm Neon Intrinsics Reference for ACLE Q3 2020" [1] This patch also updates `dup_typed` to accept constant type strings directly. Based on a patch by Tim Northover. [1] https://developer.arm.com/documentation/ihi0073/latest Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D93014	2021-01-05 16:14:00 +00:00
Joe Ellis	3d5b18a3fd	[clang][AArch64][SVE] Avoid going through memory for coerced VLST arguments VLST arguments are coerced to VLATs at the function boundary for consistency with the VLAT ABI. They are then bitcast back to VLSTs in the function prolog. Previously, this conversion is done through memory. With the introduction of the llvm.vector.{insert,extract} intrinsic, we can avoid going through memory here. Depends on D92761 Differential Revision: https://reviews.llvm.org/D92762	2021-01-05 15:18:21 +00:00
Brandon Bergren	6cee9d0cf8	[PowerPC] Support powerpcle target in Clang [3/5] Add powerpcle support to clang. For FreeBSD, assume a freestanding environment for now, as we only need it in the first place to build loader, which runs in the OpenFirmware environment instead of the FreeBSD environment. For Linux, recognize glibc and musl environments to match current usage in Void Linux PPC. Adjust driver to match current binutils behavior regarding machine naming. Adjust and expand tests. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D93919	2021-01-02 12:17:58 -06:00
Fangrui Song	d1fd72343c	Refactor how -fno-semantic-interposition sets dso_local on default visibility external linkage definitions The idea is that the CC1 default for ELF should set dso_local on default visibility external linkage definitions in the default -mrelocation-model pic mode (-fpic/-fPIC) to match COFF/Mach-O and make output IR similar. The refactoring is made available by `2820a2ca3a`. Currently only x86 supports local aliases. We move the decision to the driver. There are three CC1 states: * -fsemantic-interposition: make some linkages interposable and make default visibility external linkage definitions dso_preemptable. * (default): selected if the target supports .Lfoo$local: make default visibility external linkage definitions dso_local * -fhalf-no-semantic-interposition: if neither option is set or the target does not support .Lfoo$local: like -fno-semantic-interposition but local aliases are not used. So references can be interposed if not optimized out. Add -fhalf-no-semantic-interposition to a few tests using the half-based semantic interposition behavior.	2020-12-31 13:59:45 -08:00
Fangrui Song	fd739804e0	[test] Add {{.}} to make ELF tests immune to dso_local/dso_preemptable/(none) differences For a default visibility external linkage definition, dso_local is set for ELF -fno-pic/-fpie and COFF and Mach-O. Since default clang -cc1 for ELF is similar to -fpic ("PIC Level" is not set), this nuance causes unneeded binary format differences. To make emitted IR similar, ELF -cc1 -fpic will default to -fno-semantic-interposition, which sets dso_local for default visibility external linkage definitions. To make this flip smooth and enable future (dso_local as definition default), this patch replaces (function) `define ` with `define{{.}} `, (variable/constant/alias) `= ` with `={{.}} `, or inserts appropriate `{{.}} `.	2020-12-31 00:27:11 -08:00
Fangrui Song	f2cc2669a0	[test] Fix -triple and delete UNSUPPORTED: system-windows	2020-12-31 00:13:34 -08:00
Luo, Yuanke	08665b1805	Support tilezero intrinsic and c interface for AMX. Differential Revision: https://reviews.llvm.org/D92837	2020-12-31 13:24:57 +08:00
Fangrui Song	6b3351792c	[test] Add {{.}} to make tests immune to dso_local/dso_preemptable/(none) differences For a definition (of most linkage types), dso_local is set for ELF -fno-pic/-fpie and COFF, but not for Mach-O. This nuance causes unneeded binary format differences. This patch replaces (function) `define ` with `define{{.}} `, (variable/constant/alias) `= ` with `={{.}} `, or inserts appropriate `{{.}} ` if there is an explicit linkage. * Clang will set dso_local for Mach-O, which is currently implied by TargetMachine.cpp. This will make COFF/Mach-O and executable ELF similar. * Eventually I hope we can make dso_local the textual LLVM IR default (write explicit "dso_preemptable" when applicable) and -fpic ELF will be similar to everything else. This patch helps move toward that goal.	2020-12-30 20:52:01 -08:00
Juneyoung Lee	420d046d6b	clang-format, address warnings	2020-12-30 23:05:07 +09:00
Juneyoung Lee	9b29610228	Use unary CreateShuffleVector if possible As mentioned in D93793, there are quite a few places where unary `IRBuilder::CreateShuffleVector(X, Mask)` can be used instead of `IRBuilder::CreateShuffleVector(X, Undef, Mask)`. Let's update them. Actually, it would have been more natural if the patches were made in this order: (1) let them use unary CreateShuffleVector first (2) update IRBuilder::CreateShuffleVector to use poison as a placeholder value (D93793) The order is swapped, but in terms of correctness it is still fine. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D93923	2020-12-30 22:36:08 +09:00
Fangrui Song	2820a2ca3a	Move -fno-semantic-interposition dso_local logic from TargetMachine to Clang CodeGenModule This simplifies TargetMachine::shouldAssumeDSOLocal and and gives frontend the decision to use dso_local. For LLVM synthesized functions/globals, they may lose inferred dso_local but such optimizations are probably not very useful. Note: the hasComdat() condition in canBenefitFromLocalAlias (D77429) may be dead now. (llvm/CodeGen/X86/semantic-interposition-comdat.ll) (Investigate whether we need test coverage when Fuchsia C++ ABI is clearer)	2020-12-29 23:37:55 -08:00
Luo, Yuanke	981a0bd858	[X86] Add x86_amx type for intel AMX. The x86_amx is used for AMX intrisics. <256 x i32> is bitcast to x86_amx when it is used by AMX intrinsics, and x86_amx is bitcast to <256 x i32> when it is used by load/store instruction. So amx intrinsics only operate on type x86_amx. It can help to separate amx intrinsics from llvm IR instructions (+-*/). Thank Craig for the idea. This patch depend on https://reviews.llvm.org/D87981. Differential Revision: https://reviews.llvm.org/D91927	2020-12-30 13:52:13 +08:00
Juneyoung Lee	278aa65cc4	[IR] Let IRBuilder's CreateVectorSplat/CreateShuffleVector use poison as placeholder This patch updates IRBuilder to create insertelement/shufflevector using poison as a placeholder. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D93793	2020-12-30 04:21:04 +09:00
Thomas Lively	5e09e9979b	[WebAssembly] Prototype extending pairwise add instructions As proposed in https://github.com/WebAssembly/simd/pull/380. This commit makes the new instructions available only via clang builtins and LLVM intrinsics to make their use opt-in while they are still being evaluated for inclusion in the SIMD proposal. Depends on D93771. Differential Revision: https://reviews.llvm.org/D93775	2020-12-28 14:11:14 -08:00
Juneyoung Lee	9d70dbdc2b	[InstCombine] use poison as placeholder for undemanded elems Currently undef is used as a don’t-care vector when constructing a vector using a series of insertelement. However, this is problematic because undef isn’t undefined enough. Especially, a sequence of insertelement can be optimized to shufflevector, but using undef as its placeholder makes shufflevector a poison-blocking instruction because undef cannot be optimized to poison. This makes a few straightforward optimizations incorrect, such as: ``` ; https://bugs.llvm.org/show_bug.cgi?id=44185 define <4 x float> @insert_not_undef_shuffle_translate_commute(float %x, <4 x float> %y, <4 x float> %q) { %xv = insertelement <4 x float> %q, float %x, i32 2 %r = shufflevector <4 x float> %y, <4 x float> %xv, <4 x i32> { 0, 6, 2, undef } ret <4 x float> %r ; %r[3] is undef } => define <4 x float> @insert_not_undef_shuffle_translate_commute(float %x, <4 x float> %y, <4 x float> %q) { %r = insertelement <4 x float> %y, float %x, i32 1 ret <4 x float> %r ; %r[3] = %y[3], incorrect if %y[3] = poison } Transformation doesn't verify! ERROR: Target is more poisonous than source ``` I’d like to suggest 1. Using poison as insertelement’s placeholder value (IRBuilder::CreateVectorSplat should be patched too) 2. Updating shufflevector’s semantics to return poison element if mask is undef Note that poison is currently lowered into UNDEF in SelDag, so codegen part is okay. m_Undef() matches PoisonValue as well, so existing optimizations will still fire. The only concern is hidden miscompilations that will go incorrect when poison constant is given. A conservative way is copying all tests having `insertelement undef` & replacing it with `insertelement poison` & run Alive2 on it, but it will create many tests and people won’t like it. :( Instead, I’ll simply locally maintain the tests and run Alive2. If there is any bug found, I’ll report it. Relevant links: https://bugs.llvm.org/show_bug.cgi?id=43958 , http://lists.llvm.org/pipermail/llvm-dev/2019-November/137242.html Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D93586	2020-12-28 08:58:15 +09:00
Sriraman Tallam	34e70d722d	Append ".__part." to every basic block section symbol. Every basic block section symbol created by -fbasic-block-sections will contain ".__part." to know that this symbol corresponds to a basic block fragment of the function. This patch solves two problems: a) Like D89617, we want function symbols with suffixes to be properly qualified so that external tools like profile aggregators know exactly what this symbol corresponds to. b) The current basic block naming just adds a ".N" to the symbol name where N is some integer. This collides with how clang creates __cxx_global_var_init.N. clang creates these symbol names to call constructor functions and basic block symbol naming should not use the same style. Fixed all the test cases and added an extra test for __cxx_global_var_init breakage. Differential Revision: https://reviews.llvm.org/D93082	2020-12-23 11:35:44 -08:00
Arthur Eubanks	db1616c768	[test] Fix new-pass-manager-opt-bisect.c Requires x86 target to be registered.	2020-12-20 17:13:42 -08:00
Samuel Eubanks	47dbee6790	Make NPM OptBisectInstrumentation use global singleton OptBisect Currently there is an issue where the legacy pass manager uses a different OptBisect counter than the new pass manager. This fix makes the npm OptBisectInstrumentation use the global OptBisect. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D92897	2020-12-20 13:47:56 -08:00
Roman Lebedev	897c985e1e	[InstCombine] Canonicalize SPF to abs intrinsic This patch enables canonicalization of SPF_ABS and SPF_ABS to the abs intrinsic. This is a recommit, the original try was `05d4c4ebc2`, but it was reverted due to an apparent miscompile, which since then has just been fixed by the previous commit. Differential Revision: https://reviews.llvm.org/D87188	2020-12-18 21:18:14 +03:00
Kevin P. Neal	7fef551cb1	Revert "Revert "[FPEnv] Teach the IRBuilder about invoke's correct use of the strictfp attribute."" Similar to D69312, and documented in D69839, the IRBuilder needs to add the strictfp attribute to invoke instructions when constrained floating point is enabled. This is try 2, with the test corrected. Differential Revision: https://reviews.llvm.org/D93134	2020-12-18 12:42:06 -05:00
Rong Xu	3733463dbb	[IR][PGO] Add hot func attribute and use hot/cold attribute in func section Clang FE currently has hot/cold function attribute. But we only have cold function attribute in LLVM IR. This patch adds support of hot function attribute to LLVM IR. This attribute will be used in setting function section prefix/suffix. Currently .hot and .unlikely suffix only are added in PGO (Sample PGO) compilation (through isFunctionHotInCallGraph and isFunctionColdInCallGraph). This patch changes the behavior. The new behavior is: (1) If the user annotates a function as hot or isFunctionHotInCallGraph is true, this function will be marked as hot. Otherwise, (2) If the user annotates a function as cold or isFunctionColdInCallGraph is true, this function will be marked as cold. The changes are: (1) user annotated function attribute will used in setting function section prefix/suffix. (2) hot attribute overwrites profile count based hotness. (3) profile count based hotness overwrite user annotated cold attribute. The intention for these changes is to provide the user a way to mark certain function as hot in cases where training input is hard to cover all the hot functions. Differential Revision: https://reviews.llvm.org/D92493	2020-12-17 18:41:12 -08:00
Tom Stellard	3203143f13	CodeGen: Improve generated IR for __builtin_mul_overflow(uint, uint, int) Add a special case for handling __builtin_mul_overflow with unsigned inputs and a signed output to avoid emitting the __muloti4 library call on x86_64. __muloti4 is not implemented in libgcc, so avoiding this call fixes compilation of some programs that call __builtin_mul_overflow with these arguments. For example, this fixes the build of cpio with clang, which includes code from gnulib that calls __builtin_mul_overflow with these argument types. Reviewed By: vsk Differential Revision: https://reviews.llvm.org/D84405	2020-12-17 14:30:31 -08:00
Baptiste Saleil	c2892978e9	[PowerPC] Rename the vector pair intrinsics and builtins to replace the _mma_ prefix by _vsx_ On PPC, the vector pair instructions are independent from MMA. This patch renames the vector pair LLVM intrinsics and Clang builtins to replace the _mma_ prefix by _vsx_ in their names. We also move the vector pair type/intrinsic/builtin tests to their own files. Differential Revision: https://reviews.llvm.org/D91974	2020-12-17 13:19:27 -05:00
Tomas Matheson	f500662924	Detect section type conflicts between functions and variables If two variables are declared with __attribute__((section(name))) and the implicit section types (e.g. read only vs writeable) conflict, an error is raised. Extend this mechanism so that an error is raised if the section type implied by a function's __attribute__((section)) conflicts with that of another variable.	2020-12-17 11:43:47 -05:00
Zequan Wu	fb0f728805	[Clang] Make nomerge attribute a function attribute as well as a statement attribute. Differential Revision: https://reviews.llvm.org/D92800	2020-12-17 07:45:38 -08:00
Thomas Preud'homme	150fe05db4	[Test] Fix undef var in catch-undef-behavior.c Commit `9e52c43090` removed the directive defining LINE_1600 but left a string substitution to that variable in a CHECK-NOT directive. This will make that CHECK-NOT directive always fail to match, no matter the string. This commit follows the pattern done in `9e52c43090` of simplifying the CHECK-NOT to only look for the function name and the opening parenthesis, thereby not requiring the LINE_1600 variable. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D93350	2020-12-16 22:39:41 +00:00
Joe Ellis	dad07baf12	[clang][AArch64][SVE] Avoid going through memory for VLAT <-> VLST casts This change makes use of the llvm.vector.extract intrinsic to avoid going through memory when performing bitcasts between vector-length agnostic types and vector-length specific types. Depends on D91362 Reviewed By: c-rhodes Differential Revision: https://reviews.llvm.org/D92761	2020-12-16 12:24:32 +00:00
Qiu Chaofan	f141d1afc5	[NFC] Pre-commit test for long-double builtins This test reflects clang behavior on long-double type math library builtins under default or explicit 128-bit long-double options.	2020-12-16 17:19:54 +08:00
Johannes Doerfert	b9c77542e2	[Clang][Attr] Introduce the `assume` function attribute The `assume` attribute is a way to provide additional, arbitrary information to the optimizer. For now, assumptions are restricted to strings which will be accumulated for a function and emitted as comma separated string function attribute. The key of the LLVM-IR function attribute is `llvm.assume`. Similar to `llvm.assume` and `__builtin_assume`, the `assume` attribute provides a user defined assumption to the compiler. A follow up patch will introduce an LLVM-core API to query the assumptions attached to a function. We also expect to add more options, e.g., expression arguments, to the `assume` attribute later on. The `omp [begin] asssumes` pragma will leverage this attribute and expose the functionality in the absence of OpenMP. Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D91979	2020-12-15 16:51:34 -06:00
Kevin P. Neal	2ec5973fdd	Revert "[FPEnv] Teach the IRBuilder about invoke's correct use of the strictfp attribute." The test is busted on some hosts that aren't the one I'm using. This reverts commit `67a1ffd88a`.	2020-12-15 12:58:47 -05:00
Kevin P. Neal	67a1ffd88a	[FPEnv] Teach the IRBuilder about invoke's correct use of the strictfp attribute. Similar to D69312, and documented in D69839, the IRBuilder needs to add the strictfp attribute to invoke instructions when constrained floating point is enabled. Differential Revision: https://reviews.llvm.org/D93134	2020-12-15 12:38:10 -05:00
Joe Ellis	5a2a8369e8	[AArch64][NEON] Remove undocumented vceqz{,q}_p16, vml{a,s}q_n_f64 intrinsics Prior to this patch, Clang supported the following C/C++ intrinsics: vceqz_p16 vceqzq_p16 vmlaq_n_f64 vmlsq_n_f64 ... exposed through arm_neon.h. However, these intrinsics are not part of the ACLE, allowing developers to write code that is not compatible with other toolchains. This patch removes these intrinsics. There is a bug report capturing this issue here: https://bugs.llvm.org/show_bug.cgi?id=47471 Reviewed By: bsmith Differential Revision: https://reviews.llvm.org/D93206	2020-12-15 17:19:16 +00:00
Jan Svoboda	56c5548d7f	[clang][cli] Squash multiple cc1 -fxxx-exceptions flags into single -exception-model=xxx option This patch enables marshalling of the exception model options while enforcing their mutual exclusivity. The clang driver interface remains the same, this only affects the cc1 command line. Depends on D93215. Reviewed By: dexonsmith Differential Revision: https://reviews.llvm.org/D93216	2020-12-15 10:15:58 +01:00
Gulfem Savrun Yeniceri	7c0e3a77bc	[clang][IR] Add support for leaf attribute This patch adds support for leaf attribute as an optimization hint in Clang/LLVM. Differential Revision: https://reviews.llvm.org/D90275	2020-12-14 14:48:17 -08:00
Philip Reames	3b3eb7f07f	Speculative fix for build bot failures (The clang build fails for me locally, so this is based on built bot output and a guess as to root cause.) `f5fe849` made the execution of LAA conditional, so I'm guessing that's the root cause.	2020-12-14 13:44:40 -08:00
Matt Arsenault	ef4da3c2ba	clang: Add byval on x86_intrcc parameter 0 This will allow removing the special case treatment of the parameter and avoid depending on the pointer's element type.	2020-12-14 16:34:37 -05:00
Simon Pilgrim	4855a1004d	[X86] Convert fadd/fmul _mm_reduce_* intrinsics to emit llvm.reduction intrinsics (PR47506) Followup to D87604, having confirmed on PR47506 that we can use the llvm codegen expansion for fadd/fmul as well. Differential Revision: https://reviews.llvm.org/D92940	2020-12-13 15:37:35 +00:00
Alexey Bader	a500a43587	[CodeGen][AMDGPU] Fix ICE for static initializer IR generation Differential Revision: https://reviews.llvm.org/D92782	2020-12-12 23:26:54 +03:00
Nico Weber	a5c65de295	mac/arm: XFAIL the last 3 failing tests We should fix them, but let's XFAIL them for now so that we can start running check-clang on bots and lock in the passing tests. Part of 46644.	2020-12-12 15:09:17 -05:00
Melanie Blower	320af6b138	Create SPIRABIInfo to enable SPIR_FUNC calling convention. Background: Call to library arithmetic functions for div is emitted by the compiler and it set wrong “C” calling convention for calls to these functions, whereas library functions are declared with `spir_function` calling convention. InstCombine optimization replaces such calls with “unreachable” instruction. It looks like clang lacks SPIRABIInfo class which should specify default calling conventions for “system” function calls. SPIR supports only SPIR_FUNC and SPIR_KERNEL calling convention. Reviewers: Erich Keane, Anastasia Differential Revision: https://reviews.llvm.org/D92721	2020-12-12 05:48:20 -08:00
Marco Elver	c28b18af19	[KernelAddressSanitizer] Fix globals exclusion for indirect aliases GlobalAlias::getAliasee() may not always point directly to a GlobalVariable. In such cases, try to find the canonical GlobalVariable that the alias refers to. Link: https://github.com/ClangBuiltLinux/linux/issues/1208 Reviewed By: dvyukov, nickdesaulniers Differential Revision: https://reviews.llvm.org/D92846	2020-12-11 12:20:40 +01:00
Florian Hahn	9c4cddb53a	[Clang] Add vcmla and rotated variants for Arm ACLE. This patch adds vcmla and the rotated variants as defined in "Arm Neon Intrinsics Reference for ACLE Q3 2020" [1] The _lane_ are still missing, but they can be added separately. This patch only adds the builtin mapping for AArch64. [1] https://developer.arm.com/documentation/ihi0073/latest Reviewed By: t.p.northover Differential Revision: https://reviews.llvm.org/D92930	2020-12-10 16:54:08 +00:00
Luo, Yuanke	f80b29878b	[X86] AMX programming model. This patch implements amx programming model that discussed in llvm-dev (http://lists.llvm.org/pipermail/llvm-dev/2020-August/144302.html). Thank Hal for the good suggestion in the RA. The fast RA is not in the patch yet. This patch implemeted 7 components. 1. The c interface to end user. 2. The AMX intrinsics in LLVM IR. 3. Transform load/store <256 x i32> to AMX intrinsics or split the type into two <128 x i32>. 4. The Lowering from AMX intrinsics to AMX pseudo instruction. 5. Insert psuedo ldtilecfg and build the def-use between ldtilecfg to amx intruction. 6. The register allocation for tile register. 7. Morph AMX pseudo instruction to AMX real instruction. Change-Id: I935e1080916ffcb72af54c2c83faa8b2e97d5cb0 Differential Revision: https://reviews.llvm.org/D87981	2020-12-10 17:01:54 +08:00
Yuanfang Chen	fc3942526f	[NFCI] Add a missing triple in clang/test/CodeGen/ppc64le-varargs-f128.c	2020-12-09 18:17:34 -08:00
Kevin P. Neal	acd4950d4f	[FPEnv] Correct constrained metadata in fp16-ops-strict.c This test shows we're in some cases not getting strictfp information from the AST. Correct that. Differential Revision: https://reviews.llvm.org/D92596	2020-12-08 10:18:32 -05:00
Tim Northover	c5978f42ec	UBSAN: emit distinctive traps Sometimes people get minimal crash reports after a UBSAN incident. This change tags each trap with an integer representing the kind of failure encountered, which can aid in tracking down the root cause of the problem.	2020-12-08 10:28:26 +00:00
Luís Marques	3af354e863	[Clang][CodeGen][RISCV] Fix hard float ABI for struct with empty struct and complex Fixes bug 44904. Differential Revision: https://reviews.llvm.org/D91278	2020-12-08 09:19:05 +00:00
Luís Marques	fa8f5bfa4e	[Clang][CodeGen][RISCV] Fix hard float ABI test cases with empty struct The code seemed not to account for the field 1 offset. Differential Revision: https://reviews.llvm.org/D91270	2020-12-08 09:19:05 +00:00
Luís Marques	ca93f9abdc	[Clang][CodeGen][RISCV] Add hard float ABI tests with empty struct This patch adds tests that showcase a behavior that is currently buggy. Fix in a follow-up patch. Differential Revision: https://reviews.llvm.org/D91269	2020-12-08 09:19:05 +00:00
Qiu Chaofan	5e85a2ba16	[PowerPC] Implement intrinsic for DARN instruction Instruction darn was introduced in ISA 3.0. It means 'Deliver A Random Number'. The immediate number L means: - L=0, the number is 32-bit (higher 32-bits are all-zero) - L=1, the number is 'conditioned' (processed by hardware to reduce bias) - L=2, the number is not conditioned, directly from noise source GCC implements them in three separate intrinsics: __builtin_darn, __builtin_darn_32 and __builtin_darn_raw. This patch implements the same intrinsics. And this change also addresses Bugzilla PR39800. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D92465	2020-12-08 14:08:52 +08:00
Jennifer Yu	f8d5b49c78	Fix missing error for use of 128-bit integer inside SPIR64 device code. Emit error for use of 128-bit integer inside device code had been already implemented in https://reviews.llvm.org/D74387. However, the error is not emitted for SPIR64, because for SPIR64, hasInt128Type return true. hasInt128Type: is also used to control generation of certain 128-bit predefined macros, initializer predefined 128-bit integer types and build 128-bit ArithmeticTypes. Except predefined macros, only the device target is considered, since error only emit when 128-bit integer is used inside device code, the host target (auxtarget) also needs to be considered. The change address: 1. (SPIR.h) Correct hasInt128Type() for SPIR targets. 2. Sema.cpp and SemaOverload.cpp: Add additional check to consider host target(auxtarget) when call to hasInt128Type. So that __int128_t and __int128() are allowed to avoid error when they used outside device code. 3. SemaType.cpp: add check for SYCLIsDevice to delay the error message. The error will be emitted if the use of 128-bit integer in the device code. Reviewed By: Johannes Doerfert and Aaron Ballman Differential Revision: https://reviews.llvm.org/D92439	2020-12-07 10:42:32 -08:00

... 2 3 4 5 6 ...

6779 Commits