llvm-project

Commit Graph

Author	SHA1	Message	Date
Arthur Eubanks	ad727ab7d9	[NFC] Migrate some callers away from Function/AttributeLists methods that take an index These methods can be confusing.	2021-08-17 21:05:40 -07:00
Arthur Eubanks	46cf82532c	[NFC] Replace Function handling of attributes with less confusing calls To avoid magic constants and confusing indexes.	2021-08-17 21:05:40 -07:00
Wang, Pengfei	5aeca3b0a5	[CFE][X86] Enable complex _Float16 support Support complex _Float16 on X86 in C/C++ following the latest X86 psABI. (https://gitlab.com/x86-psABIs) Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D105331	2021-08-18 11:16:14 +08:00
Wang, Pengfei	2379949aad	[X86] AVX512FP16 instructions enabling 3/6 Enable FP16 conversion instructions. Ref.: https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D105265	2021-08-18 09:03:41 +08:00
Dylan Fleming	ef198cd99e	[SVE] Remove usage of getMaxVScale for AArch64, in favour of IR Attribute Removed AArch64 usage of the getMaxVScale interface, replacing it with the vscale_range(min, max) IR Attribute. Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D106277	2021-08-17 14:42:47 +01:00
Wang, Pengfei	f1de9d6dae	[X86] AVX512FP16 instructions enabling 2/6 Enable FP16 binary operator instructions. Ref.: https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D105264	2021-08-15 08:56:33 +08:00
Arthur Eubanks	8e9ffa1dc6	[NFC] Cleanup callers of AttributeList::hasAttributes() AttributeList::hasAttributes() is confusing, use clearer methods like hasFnAttrs().	2021-08-13 12:16:52 -07:00
Arthur Eubanks	80ea2bb574	[NFC] Rename AttributeList::getParam/Ret/FnAttributes() -> get*Attributes() This is more consistent with similar methods.	2021-08-13 11:16:52 -07:00
Arthur Eubanks	92ce6db9ee	[NFC] Rename AttributeList::hasFnAttribute() -> hasFnAttr() This is more consistent with similar methods.	2021-08-13 11:09:18 -07:00
Michael Kruse	b1de32d6dd	[OMPIRBuilder] Clarify CanonicalLoopInfo. NFC. Add in-source documentation on how CanonicalLoopInfo is intended to be used. In particular, clarify which parts of a CanonicalLoopInfo are considered part of the loop, that those parts must be side-effect free, and that InsertPoints to instructions outside those parts can be expected to be preserved after method calls implementing loop-associated directives. A CanonicalLoopInfo is now invalidated once it no longer describes a canonical loop, and asserts when used afterwards. In addition, rename `createXYZWorkshareLoop` to `applyXYZWorkshareLoop` and remove the update location to avoid the impression that they insert something from scratch at that location, when in reality its InsertPoint is ignored. createStaticWorkshareLoop does not return a CanonicalLoopInfo anymore. First, it was not a canonical loop in the clarified sense (containing side-effects in form of calls to the OpenMP runtime). Second, it is ambiguous which of the two possible canonical loops it should actually return. It will not be needed before a feature expected to be introduced in OpenMP 6.0. Also see discussion in D105706. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D107540	2021-08-12 21:02:19 -05:00
Arnold Schwaighofer	9eb99d2e73	CodeGen: No need to check for isExternC if HasStrictReturn is already false NFC intended. Differential Revision: https://reviews.llvm.org/D107841	2021-08-11 07:42:48 -07:00
Wang, Pengfei	6f7f5b54c8	[X86] AVX512FP16 instructions enabling 1/6 1. Enable FP16 type support and basic declarations used by following patches. 2. Enable new instructions VMOVW and VMOVSH. Ref.: https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D105263	2021-08-10 12:46:01 +08:00
Michael Liao	6ec36d18ec	[cuda] Mark builtin texture/surface reference variable as 'externally_initialized'. - They need to be preserved even if there's no reference within the device code as the host code may need to initialize them based on the application logic. Reviewed By: tra Differential Revision: https://reviews.llvm.org/D107718	2021-08-09 13:27:40 -04:00
Roger Ferrer Ibanez	bfb77364d0	[OpenMP] Fix accidental reuse of VLA size We were using an OpaqueValueExpr allocated on the stack to store the size of a VLA. Because the VLASizeMap in CodegenFunction uses the address of the expression to avoid recomputing VLAs, we were accidentally reusing an earlier llvm::Value. This led to invalid LLVM IR. This is a temporary solution until VLASizeMap can be pushed and popped based on the context. Differential Revision: https://reviews.llvm.org/D107666	2021-08-07 05:55:27 +00:00
Joseph Huber	41a6b50c25	[OpenMP]Fix PR51349: Remove AlwaysInline for if regions. After D94315 we add the `NoInline` attribute to the outlined function to handle data environments in the OpenMP if clause. This conflicted with the `AlwaysInline` attribute added to the outlined function for better performance in D106799. The data environments should ideally not require NoInline, but for now this fixes PR51349. Reviewed By: mikerice Differential Revision: https://reviews.llvm.org/D107649	2021-08-06 17:53:04 -04:00
Serge Pavlov	4c4093e6e3	Introduce intrinsic llvm.isnan This is a recommit of the patch `16ff91ebcc`, reverted in `0c28a7c990` because it had an error in the call of getFastMathFlags (the base type should be FPMathOperator, not Instruction). The original commit message is duplicated below: Clang has a builtin function '__builtin_isnan', which implements the C library function 'isnan'. This function now is implemented entirely in clang codegen, which expands the function into a set of IR operations. There are three mechanisms by which the expansion can be made. * The most common mechanism is using an unordered comparison made by instruction 'fcmp uno'. This simple solution is target-independent and works well in most cases. It however is not suitable if floating point exceptions are tracked. The corresponding IEEE 754 operation and C function must never raise an FP exception, even if the argument is a signaling NaN. Compare instructions usually do not have such a property; they raise the 'invalid' exception in such a case. So this mechanism is unsuitable when exception behavior is strict. In particular it could result in unexpected trapping if the argument is an SNaN. * Another solution was implemented in https://reviews.llvm.org/D95948. It is used in the cases when raising FP exceptions by 'isnan' is not allowed. This solution implements 'isnan' using integer operations. It solves the problem of exceptions, but offers one solution for all targets, although some can do the check in a more efficient way. * The solution implemented by https://reviews.llvm.org/D96568 introduced a hook 'clang::TargetCodeGenInfo::testFPKind', which injects target specific code into IR. Now only SystemZ implements this hook and it generates a call to a target specific intrinsic function. Although these mechanisms allow 'isnan' to be implemented with enough efficiency, expanding 'isnan' in clang has drawbacks: * The operation 'isnan' is hidden behind generic integer operations or target-specific intrinsics. This complicates analysis and can prevent some optimizations. * IR can be created by tools other than clang, in which case the treatment of 'isnan' has to be duplicated in that tool. Another issue with the current implementation of 'isnan' comes from the use of options '-ffast-math' or '-fno-honor-nans'. If such an option is specified, 'fcmp uno' may be optimized to 'false'. It is a valid optimization in general, but it results in 'isnan' always returning 'false'. For example, in some libc++ implementations the following code returns 'false': std::isnan(std::numeric_limits<float>::quiet_NaN()) The options '-ffast-math' and '-fno-honor-nans' imply that FP operation operands are never NaNs. This assumption however should not be applied to the functions that check FP number properties, including 'isnan'. If such a function returns the expected result instead of actually making checks, it becomes useless in many cases. The option '-ffast-math' is often used for performance critical code, as it can speed up execution at the expense of manual treatment of corner cases. If 'isnan' returns an assumed result, a user cannot use it in the manual treatment of NaNs and has to invent replacements, like making the check using integer operations. There is a discussion in https://reviews.llvm.org/D18513#387418, which also expresses the opinion that limitations imposed by '-ffast-math' should be applied only to 'math' functions but not to 'tests'. To overcome these drawbacks, this change introduces a new IR intrinsic function 'llvm.isnan', which realizes the check as specified by IEEE-754 and C standards in a target-agnostic way. During IR transformations it does not undergo undesirable optimizations. It reaches instruction selection, where it is lowered in a target-dependent way. The lowering can vary depending on options like '-ffast-math' or '-ffp-model' so the resulting code satisfies requested semantics. Differential Revision: https://reviews.llvm.org/D104854	2021-08-06 14:32:27 +07:00
Fangrui Song	c38efb4899	[clang] Implement -falign-loops=N (N is a power of 2) for non-LTO GCC supports multiple forms of -falign-loops=. -falign-loops= is currently ignored in Clang. This patch implements the simplest but the most useful form where N is a power of 2. The underlying implementation uses a `llvm::TargetOptions` option for now. Bitcode generation ignores this option. Differential Revision: https://reviews.llvm.org/D106701	2021-08-05 12:17:50 -07:00
Anshil Gandhi	39dac1f7f6	[clang] Add clang builtins support for gfx90a Implement target builtins for gfx90a including fadd64, fadd32, add2h, max and min on various global, flat and ds address spaces for which intrinsics are implemented. Differential Revision: https://reviews.llvm.org/D106909	2021-08-05 02:08:06 -06:00
Pavel Asyutchenko	7df405e079	Apply -fmacro-prefix-map to __builtin_FILE() This matches the behavior of GCC. Patch does not change remapping logic itself, so adding one simple smoke test should be enough. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D107393	2021-08-04 16:42:14 -07:00
Bradley Smith	e57e1e4e00	[clang][AArch64][SVE] Avoid going through memory for fixed/scalable predicate casts For fixed SVE types, predicates are represented using vectors of i8, whereas for scalable types they are represented using vectors of i1. We can avoid going through memory for casts between these by bitcasting the i1 scalable vectors to/from a scalable i8 vector of matching size, which can then use the existing vector insert/extract logic. Differential Revision: https://reviews.llvm.org/D106860	2021-08-04 16:10:37 +00:00
Serge Pavlov	0c28a7c990	Revert "Introduce intrinsic llvm.isnan" This reverts commit `16ff91ebcc`. Several errors were reported mainly test-suite execution time. Reverted for investigation.	2021-08-04 17:18:15 +07:00
Serge Pavlov	16ff91ebcc	Introduce intrinsic llvm.isnan Clang has a builtin function '__builtin_isnan', which implements the C library function 'isnan'. This function now is implemented entirely in clang codegen, which expands the function into a set of IR operations. There are three mechanisms by which the expansion can be made. * The most common mechanism is using an unordered comparison made by instruction 'fcmp uno'. This simple solution is target-independent and works well in most cases. It however is not suitable if floating point exceptions are tracked. The corresponding IEEE 754 operation and C function must never raise an FP exception, even if the argument is a signaling NaN. Compare instructions usually do not have such a property; they raise the 'invalid' exception in such a case. So this mechanism is unsuitable when exception behavior is strict. In particular it could result in unexpected trapping if the argument is an SNaN. * Another solution was implemented in https://reviews.llvm.org/D95948. It is used in the cases when raising FP exceptions by 'isnan' is not allowed. This solution implements 'isnan' using integer operations. It solves the problem of exceptions, but offers one solution for all targets, although some can do the check in a more efficient way. * The solution implemented by https://reviews.llvm.org/D96568 introduced a hook 'clang::TargetCodeGenInfo::testFPKind', which injects target specific code into IR. Now only SystemZ implements this hook and it generates a call to a target specific intrinsic function. Although these mechanisms allow 'isnan' to be implemented with enough efficiency, expanding 'isnan' in clang has drawbacks: * The operation 'isnan' is hidden behind generic integer operations or target-specific intrinsics. This complicates analysis and can prevent some optimizations. * IR can be created by tools other than clang, in which case the treatment of 'isnan' has to be duplicated in that tool. Another issue with the current implementation of 'isnan' comes from the use of options '-ffast-math' or '-fno-honor-nans'. If such an option is specified, 'fcmp uno' may be optimized to 'false'. It is a valid optimization in general, but it results in 'isnan' always returning 'false'. For example, in some libc++ implementations the following code returns 'false': std::isnan(std::numeric_limits<float>::quiet_NaN()) The options '-ffast-math' and '-fno-honor-nans' imply that FP operation operands are never NaNs. This assumption however should not be applied to the functions that check FP number properties, including 'isnan'. If such a function returns the expected result instead of actually making checks, it becomes useless in many cases. The option '-ffast-math' is often used for performance critical code, as it can speed up execution at the expense of manual treatment of corner cases. If 'isnan' returns an assumed result, a user cannot use it in the manual treatment of NaNs and has to invent replacements, like making the check using integer operations. There is a discussion in https://reviews.llvm.org/D18513#387418, which also expresses the opinion that limitations imposed by '-ffast-math' should be applied only to 'math' functions but not to 'tests'. To overcome these drawbacks, this change introduces a new IR intrinsic function 'llvm.isnan', which realizes the check as specified by IEEE-754 and C standards in a target-agnostic way. During IR transformations it does not undergo undesirable optimizations. It reaches instruction selection, where it is lowered in a target-dependent way. The lowering can vary depending on options like '-ffast-math' or '-ffp-model' so the resulting code satisfies requested semantics. Differential Revision: https://reviews.llvm.org/D104854	2021-08-04 15:27:49 +07:00
Alexandros Lamprineas	29b263a34f	[Clang][AArch64] Inline assembly support for the ACLE type 'data512_t' In LLVM IR terms the ACLE type 'data512_t' is essentially an aggregate type { [8 x i64] }. When emitting code for inline assembly operands, clang tries to scalarize aggregate types to an integer of the equivalent length, otherwise it passes them by-reference. This patch adds a target hook to tell whether a given inline assembly operand is scalarizable so that clang can emit code to pass/return it by-value. Differential Revision: https://reviews.llvm.org/D94098	2021-07-31 09:51:28 +01:00
Tarindu Jayatilaka	7a797b2902	Take OptimizationLevel class out of Pass Builder Pulled out the OptimizationLevel class from PassBuilder in order to be able to access it from within the PassManager and avoid include conflicts. Reviewed By: mtrofin Differential Revision: https://reviews.llvm.org/D107025	2021-07-29 21:57:23 -07:00
Melanie Blower	bc5b5ea037	[clang][patch][FPEnv] Make initialization of C++ globals strictfp aware @kpn pointed out that the global variable initialization functions didn't have the "strictfp" metadata set correctly, and @rjmccall said that there was buggy code in SetFPModel and StartFunction, this patch is to solve those problems. When Sema creates a FunctionDecl, it sets the FunctionDeclBits.UsesFPIntrin to "true" if the lexical FP settings (i.e. a combination of command line options and #pragma float_control settings) correspond to ConstrainedFP mode. That bit is used when CodeGen starts codegen for a llvm function, and it translates into the "strictfp" function attribute. See bugs.llvm.org/show_bug.cgi?id=44571 Reviewed By: Aaron Ballman Differential Revision: https://reviews.llvm.org/D102343	2021-07-29 12:02:37 -04:00
Kai Luo	e4902e69e9	[PowerPC] Fix return type of XL compat CAS `__compare_and_swap*` should return `i32` rather than `i1`. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D107077	2021-07-29 14:49:26 +00:00
Fangrui Song	828767f325	COFF/ELF: Place llvm.global_ctors elements in llvm.used if comdat is used On ELF, an SHT_INIT_ARRAY outside a section group is a GC root. The current codegen abuses SHT_INIT_ARRAY in a section group to mean a GC root. On PE/COFF, the dynamic initialization for `__declspec(selectany)` in a comdat can be garbage collected by `-opt:ref`. Call `addUsedGlobal` for the two cases to fix the abuse/bug. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D106925	2021-07-28 11:44:19 -07:00
Matheus Izvekov	4819b751bd	[clang] NFC: change uses of `Expr->getValueKind` into `is?Value` Signed-off-by: Matheus Izvekov <mizvekov@gmail.com> Reviewed By: rsmith Differential Revision: https://reviews.llvm.org/D100733	2021-07-28 03:09:31 +02:00
Jose M Monsalve Diaz	0276db1416	[OpenMP] Creating the `omp_target_num_teams` and `omp_target_thread_limit` attributes to outlined functions The device runtime contains several calls to __kmpc_get_hardware_num_threads_in_block and __kmpc_get_hardware_num_blocks. If the thread_limit and the num_teams are constant, these calls can be folded to the constant value. In commit D106033 we have the optimization phase. This commit adds the attributes to the outlined function for the grid size. The two attributes are `omp_target_num_teams` and `omp_target_thread_limit`. These values are added as long as they are constant. Two functions are created: `getNumThreadsExprForTargetDirective` and `getNumTeamsExprForTargetDirective`. The original functions `emitNumTeamsForTargetDirective` and `emitNumThreadsForTargetDirective` identify the expression and emit the code. However, for the Device version of the outlined function, we cannot emit anything. Therefore, this is a first attempt to separate emission of code from deduction of the values. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D106298	2021-07-27 17:21:04 -04:00
Thomas Lively	33786576fd	[WebAssembly] Codegen for extmul SIMD instructions Replace the clang builtins and LLVM intrinsics for the SIMD extmul instructions with normal codegen patterns. Differential Revision: https://reviews.llvm.org/D106724	2021-07-27 08:41:30 -07:00
Reid Kleckner	f9f56488e0	[DebugInfo] Use per-enumerator signedness for DIEnumerator Allegedly the DWARF backend ignores this field of DIEnumerator, but we set it nonetheless in case we decide to use it in the future. Alternatively, we could remove it, but it is simpler to pass down the signed bit as it is in the AST for now. Implemented to address comments on D106585	2021-07-26 16:14:28 -07:00
Joseph Huber	af000197c4	[OpenMP] Always inline the OpenMP outlined function This patch adds the always inline attribute to the outlined functions generated by OpenMP regions. Because there is only a single instance of this function and it always has internal linkage it is safe to inline in every instance it is created. This could potentially lead to performance degradation due to inflated register counts in the parallel region. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D106799	2021-07-26 17:27:59 -04:00
Reid Kleckner	3230493299	Fix clang debug info irgen of i128 enums DIEnumerator stores an APInt as of April 2020, so now we don't need to truncate the enumerator value to 64 bits. Fixes assertions during IRGen. Split from D105320, thanks to Matheus Izvekov for the test case and report. Differential Revision: https://reviews.llvm.org/D106585	2021-07-26 12:25:29 -07:00
Nemanja Ivanovic	1c50a5da36	[PowerPC] Implement partial vector ld/st builtins for XL compatibility XL provides functions __vec_ldrmb/__vec_strmb for loading/storing a sequence of 1 to 16 bytes in big endian order, right justified in the vector register (regardless of target endianness). This is equivalent to vec_xl_len_r/vec_xst_len_r which are only available on Power9. This patch simply uses the Power9 functions when compiled for Power9, but provides a more general implementation for Power8. Differential revision: https://reviews.llvm.org/D106757	2021-07-26 13:19:52 -05:00
Shilei Tian	3274cdc83e	[Clang][OpenMP] Remove the mandatory flush for capture for OpenMP 5.1 In OpenMP 5.1: > If the `write` or `update` clause is specified, the atomic operation is not an atomic conditional update for which the comparison fails, and the effective memory ordering is `release`, `acq_rel`, or `seq_cst`, the strong flush on entry to the atomic operation is also a release flush. If the `read` or `update` clause is specified and the effective memory ordering is `acquire`, `acq_rel`, or `seq_cst` then the strong flush on exit from the atomic operation is also an acquire flush. In OpenMP 5.0: > If the `write`, `update`, or `capture` clause is specified and the `release`, `acq_rel`, or `seq_cst` clause is specified then the strong flush on entry to the atomic operation is also a release flush. If the `read` or `capture` clause is specified and the `acquire`, `acq_rel`, or `seq_cst` clause is specified then the strong flush on exit from the atomic operation is also an acquire flush. From my understanding, in OpenMP 5.1, `capture` is removed from the requirement for flush, therefore we don't have to enforce it. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D100768	2021-07-26 11:00:44 -04:00
Thomas Lively	85157c0079	[WebAssembly] Codegen for pmin and pmax Replace the clang builtins and LLVM intrinsics for {f32x4,f64x2}.{pmin,pmax} with standard codegen patterns. Since wasm_simd128.h uses an integer vector as the standard single vector type, the IR for the pmin and pmax intrinsic functions contains bitcasts that would not be there otherwise. Add extra codegen patterns that can still select the pmin and pmax instructions in the presence of these bitcasts. Differential Revision: https://reviews.llvm.org/D106612	2021-07-23 14:49:21 -07:00
Fangrui Song	7290ddd6b1	Revert "[clang] -falign-loops=" This reverts commit `42896eeed9`. Unfinished. Accidentally pushed when reverting a clangd commit.	2021-07-23 09:58:35 -07:00
Fangrui Song	42896eeed9	[clang] -falign-loops=	2021-07-23 09:50:43 -07:00
Yaxun (Sam) Liu	44dbbe6106	[HIP] Preserve ASAN bitcode library functions Address sanitizer passes may generate calls to ASAN bitcode library functions after bitcode linking in lld, therefore lld cannot add those symbols since it does not know they will be used later. To solve this issue, clang emits a reference to a bitcode library function which calls all ASAN functions which need to be preserved. This basically forces all ASAN functions to be linked in. Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D106315	2021-07-23 10:35:52 -04:00
Kai Luo	e4ed93cb25	[PowerPC] Implement XL compatible behavior of __compare_and_swap According to https://www.ibm.com/docs/en/xl-c-and-cpp-aix/16.1?topic=functions-compare-swap-compare-swaplp XL's `__compare_and_swap` has a weird behavior that > In either case, the contents of the memory location specified by addr are copied into the memory location specified by old_val_addr. (unlike C11 `atomic_compare_exchange` specified in http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf) This patch makes clang's implementation follow this behavior. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D106344	2021-07-23 01:16:02 +00:00
Florian Mayer	96c63492cb	[hwasan] Use stack safety analysis. This avoids unnecessary instrumentation. Reviewed By: eugenis, vitalybuka Differential Revision: https://reviews.llvm.org/D105703	2021-07-22 16:20:27 -07:00
Alexey Bataev	b88a68c45e	[OPENMP]Fix PR49787: Codegen for calling __tgt_target_teams_nowait_mapper has too few arguments. Added missed arguments in __tgt_target_teams_nowait_mapper/__tgt_target_nowait_mapper runtime functions calls. Differential Revision: https://reviews.llvm.org/D106542	2021-07-22 08:44:37 -07:00
Alexey Bataev	f828f0a90f	Revert "[OPENMP]Fix PR49787: Codegen for calling __tgt_target_teams_nowait_mapper has too few arguments." This reverts commit `b455f7f225` to fix buildbots.	2021-07-22 08:06:29 -07:00
Alexey Bataev	b455f7f225	[OPENMP]Fix PR49787: Codegen for calling __tgt_target_teams_nowait_mapper has too few arguments. Added missed arguments in __tgt_target_teams_nowait_mapper/__tgt_target_nowait_mapper runtime functions calls. Differential Revision: https://reviews.llvm.org/D106542	2021-07-22 07:53:37 -07:00
Florian Mayer	789a4a2e5c	Revert "[hwasan] Use stack safety analysis." This reverts commit `bde9415fef`.	2021-07-22 12:16:16 +01:00
Florian Mayer	bde9415fef	[hwasan] Use stack safety analysis. This avoids unnecessary instrumentation. Reviewed By: eugenis, vitalybuka Differential Revision: https://reviews.llvm.org/D105703	2021-07-22 12:04:54 +01:00
Simon Tatham	bd41136746	[clang] Use i64 for the !srcloc metadata on asm IR nodes. This is part of a patch series working towards the ability to make SourceLocation into a 64-bit type to handle larger translation units. !srcloc is generated in clang codegen, and pulled back out by llvm functions like AsmPrinter::emitInlineAsm that need to report errors in the inline asm. From there it goes to LLVMContext::emitError, is stored in DiagnosticInfoInlineAsm, and ends up back in clang, at BackendConsumer::InlineAsmDiagHandler(), which reconstitutes a true clang::SourceLocation from the integer cookie. Throughout this code path, it's now 64-bit rather than 32, which means that if SourceLocation is expanded to a 64-bit type, this error report won't lose half of the data. The compiler will tolerate both of i32 and i64 !srcloc metadata in input IR without faulting. Test added in llvm/MC. (The semantic accuracy of the metadata is another matter, but I don't know of any situation where that matters: if you're reading an IR file written by a previous run of clang, you don't have the SourceManager that can relate those source locations back to the original source files.) Original version of the patch by Mikhail Maltsev. Reviewed By: dexonsmith Differential Revision: https://reviews.llvm.org/D105491	2021-07-22 10:24:52 +01:00
Joseph Huber	754eb1c210	[OpenMP] Change `__kmpc_free_shared` to include the paired allocation size This patch changes `__kmpc_free_shared` to take an additional argument corresponding to the associated allocation's size. This makes it easier to implement the allocator in the runtime. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D106496	2021-07-21 20:56:21 -04:00
Thomas Lively	8af333cf1a	[WebAssembly] Replace @llvm.wasm.popcnt with @llvm.ctpop.v16i8 Use the standard target-independent intrinsic to take advantage of standard optimizations. Differential Revision: https://reviews.llvm.org/D106506	2021-07-21 16:45:54 -07:00
Thomas Lively	db7efcab7d	[WebAssembly] Remove clang builtins for extract_lane and replace_lane These builtins were added to capture the fact that the underlying Wasm instructions return i32s and implicitly sign or zero extend the extracted lanes in the case of the i8x16 and i16x8 variants. But we do sufficient optimizations during code gen that these low-level details do not need to be exposed to users. This commit replaces the use of the builtins in wasm_simd128.h with normal target-independent vector code. As a result, we can switch the relevant intrinsics to use functions rather than macros and can use more user-friendly return types rather than trying to precisely expose the underlying Wasm types. Note, however, that the generated LLVM IR is no different after this change. Differential Revision: https://reviews.llvm.org/D106500	2021-07-21 16:11:00 -07:00
Thomas Lively	1a57ee1276	[WebAssembly] Codegen for v128.load{32,64}_zero Replace the experimental clang builtins and LLVM intrinsics for these instructions with normal instruction selection patterns. The wasm_simd128.h intrinsics header was already using portable code for the corresponding intrinsics, so now it produces the correct instructions. Differential Revision: https://reviews.llvm.org/D106400	2021-07-21 09:02:12 -07:00
Quinn Pham	e002d251dd	[PowerPC] Floating Point Builtins for XL Compat. This patch is in a series of patches to provide builtins for compatibility with the XL compiler. This patch adds builtins related to floating point operations Reviewed By: #powerpc, nemanjai, amyk, NeHuang Differential Revision: https://reviews.llvm.org/D103986	2021-07-21 08:33:39 -05:00
Simon Tatham	21401a7262	[clang] Introduce SourceLocation::[U]IntTy typedefs. This is part of a patch series working towards the ability to make SourceLocation into a 64-bit type to handle larger translation units. NFC: this patch introduces typedefs for the integer type used by SourceLocation and makes all the boring changes to use the typedefs everywhere, but for the moment, they are unconditionally defined to uint32_t. Patch originally by Mikhail Maltsev. Reviewed By: tmatheson Differential Revision: https://reviews.llvm.org/D105492	2021-07-21 10:45:46 +01:00
Albion Fung	2fd1520247	[PowerPC] Implemented mtmsr, mfspr, mtspr Builtins Implemented builtins for mtmsr, mfspr, mtspr on PowerPC; the patch is intended for XL Compatibility. Differential revision: https://reviews.llvm.org/D106130	2021-07-20 17:51:00 -05:00
Albion Fung	3434ac9e39	[PowerPC] Store, load, move from and to registers related builtins This patch implements store, load, move from and to registers related builtins, as well as the builtin for stfiw. The patch aims to provide feature parity with xlC on AIX. Differential revision: https://reviews.llvm.org/D105946	2021-07-20 15:46:14 -05:00
Victor Huang	1a762f93f8	[PowerPC] Add PowerPC cmpb builtin and emit target independent code for XL compatibility This patch is in a series of patches to provide builtins for compatibility with the XL compiler. This patch adds the builtin and emits target independent code for __cmpb. Reviewed By: nemanjai, #powerpc Differential revision: https://reviews.llvm.org/D105194	2021-07-20 13:06:22 -05:00
Melanie Blower	ea864c9933	[clang][patch][NFC] Refactor calculation of FunctionDecl to avoid duplicate code	2021-07-20 11:01:22 -04:00
Quinn Pham	fd855c24c7	[PowerPC] Restore FastMathFlags of Builder for Vector FDiv Builtins This patch fixes `__builtin_ppc_recipdivf`, `__builtin_ppc_recipdivd`, `__builtin_ppc_rsqrtf`, and `__builtin_ppc_rsqrtd`. FastMathFlags are set to fast immediately before emitting these builtins. Now the flags are restored to their previous values after the builtins are emitted. Reviewed By: nemanjai, #powerpc Differential Revision: https://reviews.llvm.org/D105984	2021-07-20 09:41:00 -05:00
Stefan Pintilie	02cd937945	[PowerPC][Builtins] Added a number of builtins for compatibility with XL. Added a number of different builtins that exist in the XL compiler. Most of these builtins already exist in clang under a different name. Reviewed By: nemanjai, #powerpc Differential Revision: https://reviews.llvm.org/D104386	2021-07-20 08:57:55 -05:00
Florian Mayer	5f08219322	Revert "[hwasan] Use stack safety analysis." This reverts commit `e9c63ed10b`.	2021-07-20 10:36:46 +01:00
Florian Mayer	e9c63ed10b	[hwasan] Use stack safety analysis. This avoids unnecessary instrumentation. Reviewed By: eugenis, vitalybuka Differential Revision: https://reviews.llvm.org/D105703	2021-07-20 10:06:35 +01:00
Quinn Pham	0268e123be	[PowerPC] swdiv_nochk Builtins for XL Compat This patch is in a series of patches to provide builtins for compatibility with the XL compiler. This patch adds software divide builtins with no checking. These builtins are each emitted as a fast fdiv. Reviewed By: #powerpc, nemanjai Differential Revision: https://reviews.llvm.org/D106150	2021-07-19 16:51:10 -05:00
Giorgis Georgakoudis	fb0cf01795	Revert "[OpenMP] Codegen aggregate for outlined function captures" This reverts commit `e9c7291cb2`. Fix failing tests	2021-07-19 07:54:26 -07:00
Jamie Schmeiser	73840f9f81	thread_local support for AIX Summary: The AIX linker will produce errors on unresolved weak symbols. Change the generated code to not check for the initialization function but just call it and ensure that it always exists. Also, the AIX atexit routine has a different name (and signature) so call it correctly. Update the lit tests to test on AIX appropriately. Author: Jamie Schmeiser <schmeise@ca.ibm.com> Reviewed By: hubert.reinterpretcast (Hubert Tong) Differential Revision: https://reviews.llvm.org/D104420	2021-07-19 10:03:22 -04:00
Florian Mayer	807d50100c	Revert "[hwasan] Use stack safety analysis." This reverts commit `12268fe14a`.	2021-07-19 12:08:32 +01:00
Florian Mayer	12268fe14a	[hwasan] Use stack safety analysis. This avoids unnecessary instrumentation. Reviewed By: eugenis, vitalybuka Differential Revision: https://reviews.llvm.org/D105703	2021-07-19 11:54:44 +01:00
David Blaikie	dac582ad3a	DebugInfo: Name class templates with default arguments consistently (both direct naming, and as a template argument for a function template) It's noteworthy that GCC has the same bug here, which is a bit surprising. Both Clang and GCC's bug is only for function template arguments that are themselves templates with default template arguments (f1<t1<int[, missing_default_here]>>). Probably because function name matching isn't generally necessary - whereas type matching is necessary for DWARF consumers to associate declarations and definitions across translation units, so the bug's been addressed there already - but continued to exist for function templates since it's fairly benign there. I came across this while working on a change that could reconstitute these pretty printed names based on the rest of the DWARF, reducing the size of the DWARF by not having to encode all the template parameters in the name string. That reconstitution code can't tell the difference between a defaulted argument or not, so couldn't create the current buggy-ish output. Making the names more consistent between direct and indirect references, and between function and class templates seems all to the good. (I fixed the function template version of this a few years back in `9fdd09a4cc` - clearly I should've looked more closely and generalized the code better so it only had to be fixed once - well, doing that here now)	2021-07-17 23:58:15 -07:00
Nikita Popov	2c68ecccc9	[OpaquePtr] Remove uses of CreateGEP() without element type Remove uses of to-be-deprecated API. In cases where the correct element type was not immediately obvious to me, fall back to explicit getPointerElementType().	2021-07-17 22:56:27 +02:00
Nikita Popov	6225d0cc6e	[OpaquePtr] Remove uses of CreateInBoundsGEP() without element type Remove uses of to-be-deprecated API. Unfortunately this one mostly just makes the use of getPointerElementType() explicit, as the correct type to use wasn't immediately available (deriving it from QualType is left as an exercise to the reader).	2021-07-17 21:27:16 +02:00
Nikita Popov	4ace6008f2	[OpaquePtr] Remove uses of CreateStructGEP() without element type Remove uses of to-be-deprecated API.	2021-07-17 18:48:21 +02:00
Nikita Popov	6d3e7c783b	[OpaquePtr] Remove uses of CreateConstGEP1_32() without element type Remove uses of to-be-deprecated API. I've fallen back to calling getPointerElementType() in some cases where the correct type wasn't immediately obvious to me.	2021-07-17 18:32:36 +02:00
Nikita Popov	5071360eb1	[OpaquePtr] Remove uses of CGF.Builder.CreateConstInBoundsGEP1_64() without type Remove uses of to-be-deprecated API.	2021-07-17 17:07:46 +02:00
Nikita Popov	357756ecf6	[OpaquePtr] Remove uses of CreateConstGEP1_64() without element type Remove uses of to-be-deprecated API.	2021-07-17 16:43:20 +02:00
Nikita Popov	4737eebc0d	[OpaquePtr] Remove uses of CreateConstInBoundsGEP2_64() without type Remove uses of to-be-deprecated API.	2021-07-17 16:42:10 +02:00
Giorgis Georgakoudis	e9c7291cb2	[OpenMP] Codegen aggregate for outlined function captures Parallel regions are outlined as functions with capture variables explicitly generated as distinct parameters in the function's argument list. That complicates the fork_call interface in the OpenMP runtime: (1) the fork_call is variadic since there is a variable number of arguments to forward to the outlined function, (2) wrapping/unwrapping arguments happens in the OpenMP runtime, which is sub-optimal, has been a source of ABI bugs, and has a hardcoded limit (16) in the number of arguments, (3) forwarded arguments must cast to pointer types, which complicates debugging. This patch avoids those issues by aggregating captured arguments in a struct to pass to the fork_call. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D102107	2021-07-16 23:27:44 -07:00
Nemanja Ivanovic	35a18a981f	[PowerPC] Implement intrinsics for mtfsf[i] This provides intrinsics for emitting instructions that set the FPSCR (`mtfsf/mtfsfi`). The patch also conservatively marks the rounding mode as an implicit def for both since they both may set the rounding mode depending on the operands. Reviewed By: #powerpc, qiucf Differential Revision: https://reviews.llvm.org/D105957	2021-07-16 16:26:11 -05:00
Victor Huang	4eb107ccba	[PowerPC] Add PowerPC population count, reversed load and store related builtins and intrinsics for XL compatibility This patch is in a series of patches to provide builtins for compatibility with the XL compiler. This patch adds the builtins and intrinsics for population count, reversed load and store related operations. Reviewed By: nemanjai, #powerpc Differential revision: https://reviews.llvm.org/D106021	2021-07-15 17:23:56 -05:00
Artem Belevich	d774b4aa5e	[NVPTX, CUDA] Add .and.popc variant of the b1 MMA instruction. That should allow clang to compile mma.h from CUDA-11.3. Differential Revision: https://reviews.llvm.org/D105384	2021-07-15 12:02:09 -07:00
Quinn Pham	de3956605a	[PowerPC] Fix popcntb XL Compat Builtin for 32bit This patch implements the `__popcntb` XL compatibility builtin for 32bit in the frontend and backend. This patch also updates tests for `__popcntb` and other XL Compat sync related builtins. Reviewed By: #powerpc, nemanjai, amyk Differential Revision: https://reviews.llvm.org/D105360	2021-07-15 13:19:47 -05:00
Victor Huang	d40e8091bd	[PowerPC] Add PowerPC rotate related builtins and emit target independent code for XL compatibility This patch is in a series of patches to provide builtins for compatibility with the XL compiler. This patch adds the builtins and emits target independent code for rotate related operations. Reviewed By: nemanjai, #powerpc Differential revision: https://reviews.llvm.org/D104744	2021-07-15 10:23:54 -05:00
Chuanqi Xu	8a1727ba51	[Coroutines] Run coroutine passes by default This patch makes coroutine passes run by default in the LLVM pipeline. Now clang and opt can handle IR inputs containing coroutine intrinsics without special options. It should be fine. On the one hand, the coroutine passes seem to be stable, since there are already many projects using the coroutine feature. On the other hand, the coroutine passes should do nothing for IR that doesn't contain coroutine intrinsics. Test Plan: check-llvm Reviewed by: lxfind, aeubanks Differential Revision: https://reviews.llvm.org/D105877	2021-07-15 14:33:40 +08:00
Thomas Lively	4a4229f70f	[WebAssembly] Codegen for v128.storeX_lane instructions Replace the experimental clang builtins and LLVM intrinsics for these instructions with normal codegen patterns. Resolves PR50435. Differential Revision: https://reviews.llvm.org/D106019	2021-07-14 16:15:25 -07:00
Thomas Lively	970e090010	[WebAssembly] Codegen for v128.loadX_lane instructions Replace the experimental clang builtin and LLVM intrinsics for these instructions with normal codegen patterns. Resolves PR50433. Differential Revision: https://reviews.llvm.org/D105950	2021-07-14 11:31:53 -07:00
Albion Fung	f1aca5ac96	[PowerPC] Fix L[D|W]ARX Implementation LDARX and LWARX sometimes get optimized out by the compiler when they are critical to the correctness of the code. This inline asm generation ensures that they are preserved. Differential Revision: https://reviews.llvm.org/D105754	2021-07-13 11:02:07 -05:00
Dave MacLachlan	45ffe6341d	[clang/objc] Optimize getters for non-atomic, copied properties Properties that were declared `@property(copy, nonatomic) id foo` make an unnecessary call to objc_get_property(). This call can be replaced with a direct access to the backing variable identical to how a `@property(nonatomic) id foo` would do it. This reduces codegen by 4 bytes (x86_64/arm64) and removes a cross linkage unit function call per property declared as copy/nonatomic. Differential Revision: https://reviews.llvm.org/D105311	2021-07-13 09:22:13 -04:00
Thomas Lively	cbabfc63b1	[WebAssembly] Custom combines for f32x4.demote_zero_f64x2 Replace the clang builtin function and LLVM intrinsic for f32x4.demote_zero_f64x2 with combines from normal SDNodes. Also add missing combines for i32x4.trunc_sat_zero_f64x2_{s,u}, which share the same pattern. Differential Revision: https://reviews.llvm.org/D105755	2021-07-12 10:32:18 -07:00
Johannes Doerfert	e2cfbfcc0c	[OpenMP] Unified entry point for SPMD & generic kernels in the device RTL In the spirit of TRegions [0], this patch provides a simpler and uniform interface for a kernel to set up the device runtime. The OMPIRBuilder is used for reuse in Flang. A custom state machine will be generated in the follow up patch. The "surplus" threads of the "master warp" will not exit early anymore so we need to use non-aligned barriers. The new runtime will not have an extra warp but also require these non-aligned barriers. [0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11 This was in parts extracted from D59319. Reviewed By: ABataev, JonChesterfield Differential Revision: https://reviews.llvm.org/D101976	2021-07-10 17:53:56 -05:00
Nico Weber	d3e7491333	Revert Attributor patch series Broke check-clang, see https://reviews.llvm.org/D102307#2869065 Ran `git revert -n ebbe149a6f08535ede848a531a601ae6591cfbc5..269416d41908bb670f67af689155d5ab8eea689a`	2021-07-10 16:15:55 -04:00
Johannes Doerfert	1d5711c3ee	[OpenMP] Unified entry point for SPMD & generic kernels in the device RTL In the spirit of TRegions [0], this patch provides a simpler and uniform interface for a kernel to set up the device runtime. The OMPIRBuilder is used for reuse in Flang. A custom state machine will be generated in the follow up patch. The "surplus" threads of the "master warp" will not exit early anymore so we need to use non-aligned barriers. The new runtime will not have an extra warp but also require these non-aligned barriers. [0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11 This was in parts extracted from D59319. Reviewed By: ABataev, JonChesterfield Differential Revision: https://reviews.llvm.org/D101976	2021-07-10 12:32:50 -05:00
Thomas Lively	e5220104d0	[WebAssembly] Custom combines for f64x2.promote_low_f32x4 Replace the clang builtin function and LLVM intrinsic previously used to select the f64x2.promote_low_f32x4 instruction with custom combines from standard SelectionDAG nodes. Implement the new combines to share code with the similar combines for f64x2.convert_low_i32x4_{s,u}. Resolves PR50232. Differential Revision: https://reviews.llvm.org/D105675	2021-07-09 18:59:29 -07:00
Alexey Bataev	ab8989ab87	[OPENMP]Fix overlapped mapping for dereferenced pointer members. If the base is used in a map clause and later we have a memberexpr with this base, and the member is a pointer, and this pointer is dereferenced anyhow (subscript, array section, dereference, etc.), such components should be considered as overlapped, otherwise it may lead to incorrect size computations, since we try to map a pointee as a part of the whole struct, which is not true for the pointer members. Differential Revision: https://reviews.llvm.org/D105562	2021-07-09 12:51:26 -07:00
David Blaikie	768e3af634	PR51034: Debug Info: Remove 'prototyped' from K&R function declarations Regression caused by `6c9559b67b`.	2021-07-09 12:07:36 -07:00
Varun Gandhi	92dcb1d2db	[Clang] Introduce Swift async calling convention. This change is intended as initial setup. The plan is to add more semantic checks later. I plan to update the documentation as more semantic checks are added (instead of documenting the details up front). Most of the code closely mirrors that for the Swift calling convention. Three places are marked as [FIXME: swiftasynccc]; those will be addressed once the corresponding convention is introduced in LLVM. Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D95561	2021-07-09 11:50:10 -07:00
David Blaikie	1def2579e1	PR51018: Remove explicit conversions from SmallString to StringRef to future-proof against C++23 C++23 will make these conversions ambiguous - so fix them to make the codebase forward-compatible with C++23 (& a follow-up change I've made will make this ambiguous/invalid even in <C++23 so we don't regress this & it generally improves the code anyway)	2021-07-08 13:37:57 -07:00
Nikita Popov	a0ea367562	[CodeGen] Avoid nullptr arg to CreateStructGEP (NFC) For now just make the getPointerElementType() explicit.	2021-07-08 21:21:43 +02:00
Alexey Bataev	f57d396dca	[OPENMP]Do not privatize const firstprivates in target regions. No need to emit a private copy for firstprivate constants in target regions; we can use the original copy instead. Differential Revision: https://reviews.llvm.org/D105647	2021-07-08 11:55:37 -07:00
Nikita Popov	693251fb2f	[CodeGen] Avoid CreateGEP with nullptr type (NFC) In preparation for dropping support for it. I've replaced it with a proper type where the correct type was obvious and left an explicit getPointerElementType() where it wasn't.	2021-07-08 20:38:54 +02:00
Alexey Bataev	b3c80dd894	[OPENMP]Remove const firstprivate allocation as a variable in a constant space. Current implementation is not compatible with asynchronous target regions, need to remove it. Differential Revision: https://reviews.llvm.org/D105375	2021-07-07 05:56:48 -07:00
Hsiangkai Wang	593bf9b4de	[Clang][RISCV] Implement vlseg and vlsegff. Differential Revision: https://reviews.llvm.org/D103527	2021-07-07 13:44:40 +08:00
David Blaikie	6c9559b67b	DebugInfo: Mangle K&R declarations for debug info linkage names This fixes a gap in the `overloadable` attribute support (K&R declared functions would get mangled symbol names, but that name wouldn't be represented in the debug info linkage name field for the function) and in -funique-internal-linkage-names (this came up in review discussion on D98799) where K&R static declarations would not get the uniqued linkage names.	2021-07-06 16:28:02 -07:00

14551 Commits