llvm-project

Commit Graph

Author	SHA1	Message	Date
Alexander Potapenko	dd145f953d	[asan] Add support for disable_sanitizer_instrumentation attribute For ASan this will effectively serve as a synonym for __attribute__((no_sanitize("address"))) This is a reland of https://reviews.llvm.org/D114421 Reviewed By: melver, eugenis Differential Revision: https://reviews.llvm.org/D119726	2022-02-15 14:06:12 +01:00
Momchil Velikov	6398903ac8	Extend the `uwtable` attribute with unwind table kind We have the `clang -cc1` command-line option `-funwind-tables=1|2` and the codegen option `VALUE_CODEGENOPT(UnwindTables, 2, 0) ///< Unwind tables (1) or asynchronous unwind tables (2)`. However, this is encoded in LLVM IR by the presence or the absence of the `uwtable` attribute, i.e. we lose the information whether to generate just some unwind tables or asynchronous unwind tables. Asynchronous unwind tables take more space in the runtime image, I'd estimate something like 80-90% more, as the difference is adding roughly the same number of CFI directives as for prologues, only a bit simpler (e.g. `.cfi_offset reg, off` vs. `.cfi_restore reg`). Or even more, if you consider tail duplication of epilogue blocks. Asynchronous unwind tables could also restrict code generation to having only a finite number of frame pointer adjustments (an example of not having a finite number of `SP` adjustments is on AArch64 when untagging the stack (MTE): in some cases the compiler can modify `SP` in a loop). Having the CFI precise up to an instruction generally also means one cannot bundle together CFI instructions once the prologue is done, they need to be interspersed with ordinary instructions, which means extra `DW_CFA_advance_loc` commands, further increasing the unwind tables size. That is to say, async unwind tables impose a non-negligible overhead, yet for the most common use cases (like C++ exceptions), they are not even needed. This patch extends the `uwtable` attribute with an optional value: - `uwtable` (default to `async`) - `uwtable(sync)`, synchronous unwind tables - `uwtable(async)`, asynchronous (instruction precise) unwind tables Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D114543	2022-02-14 14:35:02 +00:00
Nikita Popov	1aeb4c6b50	[ItaniumCXXABI] Avoid pointer element type accesses	2022-02-14 15:17:14 +01:00
Nikita Popov	f208644ed3	[CGBuilder] Remove CreateBitCast() method Use CreateElementBitCast() instead, or don't work on Address where not necessary.	2022-02-14 15:06:04 +01:00
phyBrackets	de4e855204	Refactor nested if else with ternary operator in CGExprScalar.cpp Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D119364	2022-02-13 00:15:35 +05:30
Evgenii Stepanov	a730b6a41a	[NFC] clang-format one function. fix code formatting Differential Revision: https://reviews.llvm.org/D119299	2022-02-11 15:00:29 -08:00
Weverything	d5c314cdf4	[Clang][OpaquePtr] Remove deprecated Address constructor calls Remove most calls to deprecated Address constructor in CGExpr.cpp Differential Revision: https://reviews.llvm.org/D119496	2022-02-11 13:02:09 -08:00
Arthur Eubanks	87dd3d350c	[clang][OpaquePtr] Remove call to getPointerElementType() in CodeGenModule::GetAddrOfGlobalTemporary()	2022-02-11 10:39:49 -08:00
Sameer Sahasrabuddhe	d8f99bb6e0	[AMDGPU] replace hostcall module flag with function attribute The module flag to indicate use of hostcall is insufficient to catch all cases where hostcall might be in use by a kernel. This is now replaced by a function attribute that gets propagated to top-level kernel functions via their respective call-graph. If the attribute "amdgpu-no-hostcall-ptr" is absent on a kernel, the default behaviour is to emit kernel metadata indicating that the kernel uses the hostcall buffer pointer passed as an implicit argument. The attribute may be placed explicitly by the user, or inferred by the AMDGPU attributor by examining the call-graph. The attribute is inferred only if the function is not being sanitized, and the implicitarg_ptr does not result in a load of any byte in the hostcall pointer argument. Reviewed By: jdoerfert, arsenm, kpyzhov Differential Revision: https://reviews.llvm.org/D119216	2022-02-11 22:51:56 +05:30
Simon Pilgrim	9ece72c159	[clang] VisitCastExpr - use cast<> instead of dyn_cast<> to avoid dereference of nullptr The pointer is always dereferenced, so assert the cast is correct (which it should be as we just created that ScalableVectorType) instead of returning nullptr	2022-02-11 10:51:34 +00:00
Sander de Smalen	0b41238ae7	[AArch64] Emit TBAA metadata for SVE load/store intrinsics In Clang we can attach TBAA metadata based on the load/store intrinsics based on the operation's element type. This also contains changes to InstCombine where the AArch64-specific intrinsics are transformed into generic LLVM load/store operations, to ensure that all metadata is transferred to the new instruction. There will be some further work after this patch to also emit TBAA metadata for SVE's gather/scatter- and struct load/store intrinsics. Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D119319	2022-02-11 09:00:29 +00:00
Arthur Eubanks	e487ddc5c6	[clang][OpaquePtr] Use proper Address constructor in AtomicInfo::getAtomicAddress()	2022-02-10 18:29:51 -08:00
David Blaikie	389f67b35b	DebugInfo: Don't simplify names referencing local enums Due to the way type units work, this would lead to a declaration in a type unit of a local type in a CU - which is ambiguous. Rather than trying to resolve that relative to the CU that references the type unit, let's just not try to simplify these names. Longer term this should be fixed by not putting the template instantiation in a type unit to begin with - since it references an internal linkage type, it can't legitimately be duplicated/in more than one translation unit, so skip the type unit overhead. (but the right fix for that is to move type unit management into a DICompositeType flag (dropping the "identifier" field is not a perfect solution since it breaks LLVM IR linking decl/def merging during IR linking))	2022-02-10 15:51:47 -08:00
David Blaikie	26c5cf8fa0	Fix Windows build that fails if a class has a member with the same name	2022-02-10 15:27:31 -08:00
Yuanfang Chen	f927021410	Reland "[clang-cl] Support the /JMC flag" This relands commit `b380a31de0`. Restrict the tests to Windows only since the flag symbol hash depends on system-dependent path normalization.	2022-02-10 15:16:17 -08:00
David Blaikie	f3a2cfc103	DebugInfo: Don't simplify any template referencing a lambda Lambda names aren't entirely canonical (as demonstrated by the cross-project-test added here) at the moment (we should fix that for a bunch of reasons) - even if the template referencing them is non-simplified, other names referencing /that/ template can't be simplified either because type units might cause a different template to be picked up that would conflict with the expected name. (other than for roundtripping precision, it'd be OK to simplify types that reference types that reference lambdas - but best be consistent between the roundtrip/verify mode and the actual simplified template names mode)	2022-02-10 14:56:54 -08:00
Yuanfang Chen	b380a31de0	Revert "[clang-cl] Support the /JMC flag" This reverts commit `bd3a1de683`. Break bots: https://luci-milo.appspot.com/ui/p/fuchsia/builders/toolchain.ci/clang-windows-x64/b8822587673277278177/overview	2022-02-10 14:17:37 -08:00
Yuanfang Chen	bd3a1de683	[clang-cl] Support the /JMC flag The introduction and some examples are on this page: https://devblogs.microsoft.com/cppblog/announcing-jmc-stepping-in-visual-studio/ The `/JMC` flag enables these instrumentations: - Insert at the beginning of every function immediately after the prologue with a call to `void __fastcall __CheckForDebuggerJustMyCode(unsigned char *JMC_flag)`. The argument for `__CheckForDebuggerJustMyCode` is the address of a boolean global variable (the global variable is initialized to 1) with the name convention `__<hash>_<filename>`. All such global variables are placed in the `.msvcjmc` section. - The `<hash>` part of `__<hash>_<filename>` has a one-to-one mapping with a directory path. MSVC uses some unknown hashing function. Here I used DJB. - Add a dummy/empty COMDAT function `__JustMyCode_Default`. - Add `/alternatename:__CheckForDebuggerJustMyCode=__JustMyCode_Default` link option via ".drectve" section. This is to prevent failure in case `__CheckForDebuggerJustMyCode` is not provided during linking. Implementation: All the instrumentations are implemented in an IR codegen pass. The pass is placed immediately before CodeGenPrepare pass. This is to not interfere with mid-end optimizations and make the instrumentation target-independent (I'm still working on an ELF port in a separate patch). Reviewed By: hans Differential Revision: https://reviews.llvm.org/D118428	2022-02-10 10:26:30 -08:00
Yaxun (Sam) Liu	1d97cb1f6e	[HIP] Emit amdgpu_code_object_version module flag code object version determines ABI, therefore should not be mixed. This patch emits amdgpu_code_object_version module flag in LLVM IR based on code object version (default 4). The amdgpu_code_object_version value is code object version times 100. LLVM IR with different amdgpu_code_object_version module flag cannot be linked. The -cc1 option -mcode-object-version=none is for ROCm device library use only, which supports multiple ABI. Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D119026	2022-02-08 21:58:40 -05:00
Bill Wendling	deaf22bc0e	[X86] Implement -fzero-call-used-regs option The "-fzero-call-used-regs" option tells the compiler to zero out certain registers before the function returns. It's also available as a function attribute: zero_call_used_regs. The two upper categories are: - "used": Zero out used registers. - "all": Zero out all registers, whether used or not. The individual options are: - "skip": Don't zero out any registers. This is the default. - "used": Zero out all used registers. - "used-arg": Zero out used registers that are used for arguments. - "used-gpr": Zero out used registers that are GPRs. - "used-gpr-arg": Zero out used GPRs that are used as arguments. - "all": Zero out all registers. - "all-arg": Zero out all registers used for arguments. - "all-gpr": Zero out all GPRs. - "all-gpr-arg": Zero out all GPRs used for arguments. This is used to help mitigate Return-Oriented Programming exploits. Reviewed By: nickdesaulniers Differential Revision: https://reviews.llvm.org/D110869	2022-02-08 17:42:54 -08:00
Arthur Eubanks	f05a63f9a0	[clang] Properly cache member pointer LLVM types When not going through the main Clang->LLVM type cache, we'd accidentally create multiple different opaque types for a member pointer type. This allows us to remove the -verify-type-cache flag now that check-clang passes with it on. We can do the verification in expensive builds. Previously microsoft-abi-member-pointers.cpp was failing with -verify-type-cache. I suspect that there may be more issues when we have multiple member pointer types and we clear the cache, but we can leave that for later. Followup to D118744. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D119215	2022-02-08 13:22:24 -08:00
Dawid Jurczak	5d8d3a11c4	[NFC] Increase initial size of FoldingSets used in ASTContext and CodeGenTypes Among many FoldingSet users most notable seem to be ASTContext and CodeGenTypes. The reasons we spend a not-so-tiny amount of time in FoldingSet calls from there are the following: 1. The default FoldingSet capacity of 2^6 items is very often not enough. For PointerTypes/ElaboratedTypes/ParenTypes it's not unlikely to observe growing it to 256 or 512 items. FunctionProtoTypes can easily exceed 1k items capacity growing up to 4k or even 8k size. 2. FoldingSetBase::GrowBucketCount cost itself is not very bad (pure reallocations are rather cheap thanks to BumpPtrAllocator). What matters is the high collision rate when a lot of items end up in the same bucket, slowing down FoldingSetBase::FindNodeOrInsertPos and thrashing the CPU cache (as items with the same hash are organized in an intrusive linked list which needs to be traversed). This change addresses both issues by increasing the initial size of FoldingSets used in ASTContext and CodeGenTypes. Extracted from: https://reviews.llvm.org/D118385 Differential Revision: https://reviews.llvm.org/D118608	2022-02-08 17:54:04 +01:00
Nikita Popov	18834dca2d	[OpenCL] Mark kernel arguments as ABI aligned Following the discussion on D118229, this marks all pointer-typed kernel arguments as having ABI alignment, per section 6.3.5 of the OpenCL spec: > For arguments to a __kernel function declared to be a pointer to > a data type, the OpenCL compiler can assume that the pointee is > always appropriately aligned as required by the data type. Differential Revision: https://reviews.llvm.org/D118894	2022-02-08 16:12:51 +01:00
Simon Pilgrim	09857a4bd1	[X86] Remove __builtin_ia32_padd/psub saturated intrinsics and use generic __builtin_elementwise_add/sub_sat D117898 added the generic __builtin_elementwise_add_sat and __builtin_elementwise_sub_sat with the same integer behaviour as the SSE/AVX instructions This patch removes the __builtin_ia32_padd/psub saturated intrinsics and just uses the generics - the existing tests see no changes: __m256i test_mm256_adds_epi8(__m256i a, __m256i b) { // CHECK-LABEL: test_mm256_adds_epi8 // CHECK: call <32 x i8> @llvm.sadd.sat.v32i8(<32 x i8> %{{.*}}, <32 x i8> %{{.*}}) return _mm256_adds_epi8(a, b); }	2022-02-08 15:00:10 +00:00
Simon Pilgrim	a59faf272e	Revert rG6c174ab2ad0676b295f11f6c3913eff9289fa6b9 "[X86] Remove __builtin_ia32_padd/psub saturated intrinsics and use generic __builtin_elementwise_add/sub_sat" Missed some legacy builtin tests that need cleaning up first	2022-02-08 14:45:28 +00:00
Simon Pilgrim	6c174ab2ad	[X86] Remove __builtin_ia32_padd/psub saturated intrinsics and use generic __builtin_elementwise_add/sub_sat D117898 added the generic __builtin_elementwise_add_sat and __builtin_elementwise_sub_sat with the same integer behaviour as the SSE/AVX instructions This patch removes the __builtin_ia32_padd/psub saturated intrinsics and just uses the generics - the existing tests see no changes: __m256i test_mm256_adds_epi8(__m256i a, __m256i b) { // CHECK-LABEL: test_mm256_adds_epi8 // CHECK: call <32 x i8> @llvm.sadd.sat.v32i8(<32 x i8> %{{.*}}, <32 x i8> %{{.*}}) return _mm256_adds_epi8(a, b); }	2022-02-08 14:21:20 +00:00
David Pagan	0a7cc078ac	Enable inoutset dependency-type in depend clause. Done in manner similar to mutexinoutset (see https://reviews.llvm.org/D57576) Runtime support already exists in LLVM OpenMP runtime (see https://reviews.llvm.org/D97085). The value used to identify an inoutset dependency type in the LLVM OpenMP runtime is 8. Some tests updated due to change in dependency type error messages that now include new dependency type. Also updated test/OpenMP/task_codegen.cpp to verify we emit the right code.	2022-02-08 08:35:36 -05:00
Simon Pilgrim	c00db97159	[Clang] Add elementwise saturated add/sub builtins This patch implements `__builtin_elementwise_add_sat` and `__builtin_elementwise_sub_sat` builtins. These map to the add/sub saturated math intrinsics described here: https://llvm.org/docs/LangRef.html#saturation-arithmetic-intrinsics With this in place we should then be able to replace the x86 SSE adds/subs intrinsics with these generic variants - it looks like other targets should be able to use these as well (arm/aarch64/webassembly all have similar examples in cgbuiltin). Differential Revision: https://reviews.llvm.org/D117898	2022-02-08 11:22:01 +00:00
Arthur Eubanks	45084eab5e	[clang] Fix some clang->llvm type cache invalidation issues Take the following as an example struct z { z (*p)(); }; z f(); When we attempt to get the LLVM type of f, we recurse into z. z itself has a function pointer with the same type as f. Given the recursion, Clang simply treats z::p as a pointer to an empty struct `{}`. The LLVM type of f is as expected. So we have two different potential LLVM types for a given Clang type. If we store one of those into the cache, when we access the cache with a different context (e.g. we are/aren't recursing on z) we may get an incorrect result. There is some attempt to clear the cache in these cases, but it doesn't seem to handle all cases. This change makes it so we only use the cache when we are not in any sort of function context, i.e. `noRecordsBeingLaidOut() && FunctionsBeingProcessed.empty()`, which are the cases where we may decide to choose a different LLVM type for a given Clang type. LLVM types for builtin types are never recursive so they're always ok. This allows us to clear the type cache less often (as seen with the removal of one of the calls to `TypeCache.clear()`). We still need to clear it when we use a placeholder type then replace it later with the final type and other dependent types need to be recalculated. I've added a check that the cached type matches what we compute. It triggered in this test case without the fix. It's currently not check-clang clean so it's not on by default for something like expensive checks builds. This change uncovered another issue where the LLVM types for an argument and its local temporary don't match. For example in type-cache-3, when expanding z::dc's argument into a temporary alloca, we ConvertType() the type of z::p which is `void ({})`, which doesn't match the alloca GEP type of `{}*`. No noticeable compile time changes: https://llvm-compile-time-tracker.com/compare.php?from=3918dd6b8acf8c5886b9921138312d1c638b2937&to=50bdec9836ed40e38ece0657f3058e730adffc4c&stat=instructions Fixes #53465. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D118744	2022-02-07 18:59:09 -08:00
Arthur Eubanks	2724c153f9	[clang] Cache OpenCL types If we call CGOpenCLRuntime::convertOpenCLSpecificType() multiple times we should get the same type back. Reviewed By: svenvh Differential Revision: https://reviews.llvm.org/D119011	2022-02-07 09:23:04 -08:00
Nikita Popov	c45a99f36b	[MatrixBuilder] Require explicit element type in CreateColumnMajorLoad() This makes the method compatible with opaque pointers.	2022-02-07 16:57:33 +01:00
Nikita Popov	cdc0573f75	[MatrixBuilder] Remove unnecessary IRBuilder template (NFC) IRBuilderBase exists specifically to avoid the need for this.	2022-02-07 16:42:38 +01:00
Yaxun (Sam) Liu	171da443d5	[HIPSPV] Fix literals are mapped to Generic address space This issue is an oversight in D108621. Literals in HIP are emitted as global constant variables with default address space which maps to Generic address space for HIPSPV. In SPIR-V such variables translate to OpVariable instructions with Generic storage class which are not legal. Fix by mapping literals to CrossWorkGroup address space. The literals are not mapped to UniformConstant because the “flat” pointers in HIP may reference them and “flat” pointers are modeled as Generic pointers in SPIR-V. In SPIR-V/OpenCL UniformConstant pointers may not be cast to Generic. Patch by: Henry Linjamäki Reviewed by: Yaxun Liu Differential Revision: https://reviews.llvm.org/D118876	2022-02-05 17:26:52 -05:00
James Y Knight	caa1ebde70	Don't assume that a new cleanup was added to InnermostEHScope. After `fa87fa97fb`, this was no longer guaranteed to be the cleanup just added by this code, if IsEHCleanup got disabled. Instead, use stable_begin(), which _is_ guaranteed to be the cleanup just added. This caused a crash when an object that is callee destroyed (e.g. with the MS ABI) was passed in a call from a noexcept function. Added a test to verify. Fixes: `fa87fa97fb`	2022-02-04 23:39:42 -05:00
Joseph Huber	034adaf5be	[OpenMP] Completely remove old device runtime This patch completely removes the old OpenMP device runtime. Previously, the new runtime had the prefix `libomptarget-new-` and the old runtime was simply called `libomptarget-`. This patch makes the formerly new runtime the only runtime available. The entire project has been deleted, and all references to the `libomptarget-new` runtime have been replaced with `libomptarget-`. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D118934	2022-02-04 15:31:33 -05:00
Shilei Tian	b35be6fe98	[Clang][Sema][OpenMP] Sema support for `atomic compare` This patch adds the Sema support for `atomic compare`. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D116637	2022-02-04 12:30:56 -05:00
Hans Wennborg	853e0aa424	Don't dllexport reference temporaries Even if the reference itself is dllexport, the temporary should not be. In fact, we're already giving it internal linkage, so dllexporting it is not just wasteful, but will fail to link, as in the example below: $ cat /tmp/a.cc void _DllMainCRTStartup() {} const int __declspec(dllexport) &foo = 42; $ clang-cl -fuse-ld=lld /tmp/a.cc /Zl /link /dll /out:a.dll lld-link: error: <root>: undefined symbol: int const &foo::$RT1 Differential revision: https://reviews.llvm.org/D118980	2022-02-04 16:31:51 +01:00
John Brawn	bca998ed3c	[AArch64] Generate fcmps when appropriate for neon intrinsics Differential Revision: https://reviews.llvm.org/D118257	2022-02-04 12:55:38 +00:00
Jan Svoboda	42afaf7f47	[clang][CodeGen] Use memory type representation in `va_arg` Some types (e.g. `_Bool`) have different scalar and memory representations. CodeGen for `va_arg` didn't take this into account, leading to assertion failures with different types. This patch makes sure we use memory representation for `va_arg`. Reviewed By: ahatanak Differential Revision: https://reviews.llvm.org/D118904	2022-02-04 12:10:57 +01:00
James Y Knight	fa87fa97fb	Skip exception cleanups when the innermost scope is EHTerminateScope. EHTerminateScope is used to implement C++ noexcept semantics. Per C++ [except.terminate], it is implementation-defined whether no, some, or all cleanups are run prior to termination. Therefore, the code to run cleanups on the way towards termination is unnecessary, and may be omitted. After this change, we will still run some cleanups: any cleanups in a function called from the noexcept function will continue to run, while those in the noexcept function itself will not. (Commit attempt 2: check InnermostEHScope != stable_end() before accessing it.) Differential Revision: https://reviews.llvm.org/D113620	2022-02-02 17:50:18 -05:00
Rainer Orth	efdd0a29b7	[clang][Sparc] Fix __builtin_extract_return_addr etc. While investigating the failures of `symbolize_pc.cpp` and `symbolize_pc_inline.cpp` on SPARC (both Solaris and Linux), I noticed that `__builtin_extract_return_addr` is a no-op in `clang` on all targets, while `gcc` has non-default implementations for arm, mips, s390, and sparc. This patch provides the SPARC implementation. For background see `SparcISelLowering.cpp` (`SparcTargetLowering::LowerReturn_32`), the SPARC psABI p.3-12, `%i7` and p.3-16/17, and SCD 2.4.1, p.3P-10, `%i7` and p.3P-15. Tested (after enabling the `sanitizer_common` tests on SPARC) on `sparcv9-sun-solaris2.11`. Differential Revision: https://reviews.llvm.org/D91607	2022-02-02 19:20:02 +01:00
Alex Lorenz	116c1bea65	[clang][macho] add clang frontend support for emitting macho files with two build version load commands This patch extends clang frontend to add metadata that can be used to emit macho files with two build version load commands. It utilizes "darwin.target_variant.triple" and "darwin.target_variant.SDK Version" metadata names for that. MachO uses two build version load commands to represent an object file / binary that is targeting both the macOS target, and the Mac Catalyst target. At runtime, a dynamic library that supports both targets can be loaded from either a native macOS or a Mac Catalyst app on a macOS system. We want to upstream support for this to LLVM to be able to build compiler-rt for both targets, to finish the complete support for the Mac Catalyst platform, which is right now targetable by upstream clang, but the compiler-rt bits aren't supported because of the lack of this multiple build version support. Differential Revision: https://reviews.llvm.org/D115415	2022-02-02 08:30:39 -08:00
serge-sans-paille	e188aae406	Cleanup header dependencies in LLVMCore Based on the output of include-what-you-use. This is a big chunk of changes. It is very likely to break downstream code unless they took a lot of care in avoiding hidden header dependencies, something the LLVM codebase doesn't do that well :-/ I've tried to summarize the biggest change below: - llvm/include/llvm-c/Core.h: no longer includes llvm-c/ErrorHandling.h - llvm/IR/DIBuilder.h no longer includes llvm/IR/DebugInfo.h - llvm/IR/IRBuilder.h no longer includes llvm/IR/IntrinsicInst.h - llvm/IR/LLVMRemarkStreamer.h no longer includes llvm/Support/ToolOutputFile.h - llvm/IR/LegacyPassManager.h no longer includes llvm/Pass.h - llvm/IR/Type.h no longer includes llvm/ADT/SmallPtrSet.h - llvm/IR/PassManager.h no longer includes llvm/Pass.h nor llvm/Support/Debug.h And the usual count of preprocessed lines: $ clang++ -E -Iinclude -I../llvm/include ../llvm/lib/IR/*.cpp -std=c++14 -fno-rtti -fno-exceptions | wc -l before: 6400831 after: 6189948 200k lines less to process is not that bad ;-) Discourse thread on the topic: https://llvm.discourse.group/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D118652	2022-02-02 06:54:20 +01:00
Joseph Huber	53d5757ea2	[OpenMP] Add kernel string attribute to kernel function This patch adds a function attribute to the kernel function generated in OpenMP offloading. We already create a `nvvm.annotations` metadata node indicating the kernels present in the program. However, this created some indirection when trying to identify if a specific function was an entry. We add a single function attribute for each function now to simplify this. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D118708	2022-02-01 13:49:31 -05:00
Fangrui Song	7aaf024dac	[BitcodeWriter] Fix cases of some functions `WriteIndexToFile` is used by external projects so I do not touch it.	2022-01-31 16:46:11 -08:00
Fangrui Song	85dfe19b36	[ModuleUtils] Move EmbedBufferInModule to LLVMTransformsUtils D116542 adds EmbedBufferInModule which introduces a layer violation (https://llvm.org/docs/CodingStandards.html#library-layering). See `2d5f857a1e` for detail. EmbedBufferInModule does not use BitcodeWriter functionality and should be moved to LLVMTransformsUtils. While here, change the function case to the prevailing convention. It seems that EmbedBufferInModule just follows the steps of EmbedBitcodeInModule. EmbedBitcodeInModule calls WriteBitcodeToFile but has IR update operations which ideally should be refactored to another library. Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D118666	2022-01-31 16:33:57 -08:00
Itay Bookstein	2a868802a3	[clang][CodeGen][NFC] Remove unused CodeGenModule fields Signed-off-by: Itay Bookstein <ibookstein@gmail.com> Reviewed By: erichkeane Differential Revision: https://reviews.llvm.org/D118619	2022-01-31 23:45:53 +02:00
Joseph Huber	551b177452	[OpenMP] Add a flag for embedding a file into the module This patch adds support for a flag `-fembed-offload-binary` to embed a file as an ELF section in the output by placing it in a global variable. This can be used to bundle offloading files with the host binary so it can be accessed by the linker. The section is named using the `-fembed-offload-section` option. Depends on D116541 Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D116542	2022-01-31 15:56:00 -05:00
tyb0807	51e188d079	[AArch64] Support for memset tagged intrinsic This introduces a new ACLE intrinsic for memset tagged (https://github.com/ARM-software/acle/blob/next-release/main/acle.md#memcpy-family-of-operations-intrinsics---mops). void *__builtin_arm_mops_memset_tag(void *, int, size_t) A corresponding LLVM intrinsic is introduced: i8* llvm.aarch64.mops.memset.tag(i8*, i8, i64) The types match llvm.memset but the return type is not void. This is part 1/4 of a series of patches split from https://reviews.llvm.org/D117405 to facilitate reviewing. Patch by Tomas Matheson Differential Revision: https://reviews.llvm.org/D117753	2022-01-31 20:49:34 +00:00
Ben Shi	653836251a	[clang][AVR] Set '-fno-use-cxa-atexit' to default AVR is a baremetal environment, so the avr-libc does not support '__cxa_atexit()'. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D118445	2022-01-30 02:26:19 +00:00
Weverything	be2147db05	Remove reference type when checking const structs ConstStructBuilder::Finalize in CGExprConstant.cpp assumes that the passed in QualType is a RecordType. In some instances, the type is a reference to a RecordType and the reference needs to be removed first. Differential Revision: https://reviews.llvm.org/D117376	2022-01-28 13:08:58 -08:00
Amilendra Kodithuwakku	1f08b08674	[clang][ARM] Emit warnings when PACBTI-M is used with unsupported architectures Branch protection in M-class is supported by - Armv8.1-M.Main - Armv8-M.Main - Armv7-M Attempting to enable this for other architectures, either by command-line (e.g -mbranch-protection=bti) or by target attribute in source code (e.g. __attribute__((target("branch-protection=..."))) ) will generate a warning. In both cases function attributes related to branch protection will not be emitted. Regardless of the warning, module level attributes related to branch protection will be emitted when it is enabled via the command-line. The following people also contributed to this patch: - Victor Campos Reviewed By: chill Differential Revision: https://reviews.llvm.org/D115501	2022-01-28 09:59:58 +00:00
Joseph Huber	2945f11c60	[OpenMP] Only generate runtime flags with host input This patch changes the code generation of runtime flags to only occur if a host bitcode file was passed in. This is a cheap way to determine if we are compiling the OpenMP device runtime itself or user code. This is needed because the global flags we generate for the device runtime e.g. __omp_rtl_debug_kind were being generated with default values when we compiled the runtime library. This would then invalidate the ones we want to be able to add in when the user defines it. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D118399	2022-01-27 18:43:41 -05:00
Arthur Eubanks	662ef6d177	[NFC][Clang][OpaquePtr] Move away from deprecated Address constructor in VisitArrayInitLoopExpr With this we can bootstrap an `-O0 -g0` clang with `-mllvm -opaque-pointers`!	2022-01-27 14:44:53 -08:00
Arthur Eubanks	6e8a66bdad	[NFC][Clang][OpaquePtr] Move away from deprecated Address constructor in EmitCXXMemberDataPointerAddress()	2022-01-27 14:44:53 -08:00
Arthur Eubanks	f17123831e	[NFC][Clang][OpaquePtr] Move away from deprecated Address constructor in CreateTempAlloca() Specify the Address element type, which is the bitcast destination type. (the whole bitcast won't be needed after opaque pointers)	2022-01-27 14:18:54 -08:00
Arthur Eubanks	63cf2063a2	[NFC][Clang][OpaquePtr] Move away from deprecated Address constructor in EmitNewArrayInitializer() Specify the Address element type, which is the same for all pointers in the array.	2022-01-27 14:00:16 -08:00
Sri Hari Krishna Narayanan	5aa24558cf	OMPIRBuilder for Interop directive Implements the OMPIRBuilder portion for the Interop directive. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D105876	2022-01-27 14:53:18 -05:00
David Green	82973edfb7	[ARM][AArch64] Introduce qrdmlah and qrdmlsh intrinsics Since its introduction, the qrdmlah has been represented as a qrdmulh and a sadd_sat. This doesn't produce the same result for all input values though. This patch fixes that by introducing a qrdmlah (and qrdmlsh) intrinsic specifically for the vqrdmlah and sqrdmlah instructions. The old test cases will now produce a qrdmulh and sqadd, as expected. Fixes #53120 and #50905 and #51761. Differential Revision: https://reviews.llvm.org/D117592	2022-01-27 19:19:46 +00:00
Dawid Jurczak	b88ca619d3	[NFC][CodeGen] Use llvm::DenseMap for DeferredDecls CodeGenModule::DeferredDecls std::map::operator[] seems to be hot, especially while code generating huge compilation units. In such cases using DenseMap instead gives an observable compile time improvement. The patch was tested on a Linux build with the default config acting as benchmark. The build was performed on isolated CPU cores in a silent x86-64 Linux environment following the https://llvm.org/docs/Benchmarking.html#linux rules. The compile time statistics diff produced by perf and time before and after the change is as follows: instructions -0.15%, cycles -0.7%, max-rss +0.65%. Using StringMap instead of DenseMap doesn't bring any visible gains. Differential Revision: https://reviews.llvm.org/D118169	2022-01-27 10:57:48 +01:00
Ahmed Bougacha	ecb502342c	[ObjC] Emit selector load right before msgSend call. We currently emit the selector load early, but only because we need it to compute the signature (so that we know which msgSend variant to call). We can prepare the signature with a plain undef, and replace it with the materialized selector value if (and only if) needed, later. Concretely, this usually doesn't have an effect, but tests need updating because we reordered the receiver bitcast and the selector load, which is always fine. There is one notable change: with this, when a msgSend needs a receiver null check, the selector is now loaded in the non-null block, instead of before the null check. That should be a mild improvement.	2022-01-26 20:52:54 -08:00
Arthur Eubanks	eee97f1617	[clang] Use proper type to left shift after D117262 Causing warnings like warning C4334: '<<': result of 32-bit shift implicitly converted to 64 bits as reported in D117262.	2022-01-26 17:54:37 -08:00
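The C4334 warning quoted in the row above is about shifting a 32-bit `1` and only afterwards widening the result. A minimal sketch of the usual fix, widening before the shift; `bit64` is a hypothetical helper name, not something taken from the patch.

```cpp
#include <cstdint>

// Returns a 64-bit value with only bit `n` set. The literal is widened to
// 64 bits BEFORE the shift, so the shift itself is performed in 64 bits.
// Writing `1 << n` instead would shift a 32-bit int and convert afterwards,
// which is the pattern MSVC reports as C4334 (and is undefined for n >= 32).
constexpr std::uint64_t bit64(unsigned n) {
  return std::uint64_t{1} << n;
}

static_assert(bit64(0) == 1, "lowest bit");
static_assert(bit64(40) == 0x10000000000ULL, "bit 40 needs a 64-bit shift");
```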
Arthur Eubanks	6a953d931c	[clang] Fix -Wsubobject-linkage after D117262 /home/buildbot/llvm-avr-linux/llvm-avr-linux/llvm/clang/lib/CodeGen/Address.h:76:7: warning: 'clang::CodeGen::Address' has a field 'clang::CodeGen::Address::A' whose type uses the anonymous namespace [-Wsubobject-linkage] https://lab.llvm.org/buildbot/#/builders/112/builds/12047	2022-01-26 11:43:44 -08:00
Arthur Eubanks	b1613f05ae	[NFC] Store Address's alignment into PointerIntPairs This mitigates the extra memory caused by D115725. On 32-bit arches where we only have 2 bits per PointerIntPair we fall back to simply storing alignment separately. Reviewed By: rnk, nikic Differential Revision: https://reviews.llvm.org/D117262	2022-01-26 10:35:28 -08:00
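The idea behind the row above, keeping a small integer such as log2 of an alignment in the otherwise unused low bits of an aligned pointer, can be sketched as follows. This is not `llvm::PointerIntPair`; `TinyPointerIntPair`, `Widget` and `roundTripLog2Align` are hypothetical names used only for illustration, and the sketch assumes the pointee alignment leaves at least `Bits` low bits free.

```cpp
#include <cassert>
#include <cstdint>

// Packs a small integer into the low bits of a pointer. Those bits are zero
// for any valid T* because T is aligned to at least (1 << Bits) bytes.
template <typename T, unsigned Bits>
class TinyPointerIntPair {
  static_assert(alignof(T) >= (1u << Bits), "not enough free low bits");
  static constexpr std::uintptr_t IntMask = (std::uintptr_t{1} << Bits) - 1;
  std::uintptr_t Value = 0;

public:
  TinyPointerIntPair(T *Ptr, unsigned Int)
      : Value(reinterpret_cast<std::uintptr_t>(Ptr) | (Int & IntMask)) {
    assert((reinterpret_cast<std::uintptr_t>(Ptr) & IntMask) == 0 &&
           "pointer is not sufficiently aligned");
    assert(Int <= IntMask && "integer does not fit in the spare bits");
  }

  T *getPointer() const { return reinterpret_cast<T *>(Value & ~IntMask); }
  unsigned getInt() const { return static_cast<unsigned>(Value & IntMask); }
};

// Example pointee with 8-byte alignment, which frees three low bits.
struct alignas(8) Widget {
  int Field;
};

// Stores a log2 alignment next to a pointer and reads it back; returns ~0u
// if the pointer did not survive the round trip.
inline unsigned roundTripLog2Align(unsigned Log2Align) {
  static Widget W{};
  TinyPointerIntPair<Widget, 3> P(&W, Log2Align);
  return P.getPointer() == &W ? P.getInt() : ~0u;
}
```

The pair stays the size of one pointer, which is the memory saving the commit is after; with fewer spare bits (the 32-bit case the message mentions) the integer no longer fits and has to be stored separately.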
Benjamin Kramer	f15014ff54	Revert "Rename llvm::array_lengthof into llvm::size to match std::size from C++17" This reverts commit `ef82063207`. - It conflicts with the existing llvm::size in STLExtras, which will now never be called. - Calling it without llvm:: breaks C++17 compat	2022-01-26 16:55:53 +01:00
serge-sans-paille	ef82063207	Rename llvm::array_lengthof into llvm::size to match std::size from C++17 As a consequence move llvm::array_lengthof from STLExtras.h to STLForwardCompat.h (which is included by STLExtras.h so no build breakage expected).	2022-01-26 16:17:45 +01:00
JackAKirk	0ad19a8331	[CUDA,NVPTX] Corrected fragment size for tf32 LD B matrix. Signed-off-by: JackAKirk <jack.kirk@codeplay.com> Reviewed By: tra Differential Revision: https://reviews.llvm.org/D118023	2022-01-25 11:29:19 -08:00
Nikita Popov	30d4a7e295	[IRBuilder] Require explicit element type in CreatePtrDiff() For opaque pointer compatibility, we cannot derive the element type from the pointer type.	2022-01-25 12:43:57 +01:00
Nikita Popov	caff8591ef	[OpenMP] Simplify pointer comparison Rather than checking ptrdiff(a, b) != 0, directly check a != b.	2022-01-25 12:38:37 +01:00
Nikita Popov	99adacbcb7	[clang] Remove some getPointerElementType() uses Some cases where the call can be removed in a straightforward way.	2022-01-25 12:09:06 +01:00
Nikita Popov	aa97bc116d	[NFC] Remove uses of PointerType::getElementType() Instead use either Type::getPointerElementType() or Type::getNonOpaquePointerElementType(). This is part of D117885, in preparation for deprecating the API.	2022-01-25 09:44:52 +01:00
Simon Pilgrim	e4074432d5	[X86] Remove avx512f integer and/or/xor/min/max reduction intrinsics and use generic equivalents None of these have any reordering issues, and they still emit the same reduction intrinsics without any change in the existing test coverage: llvm-project\clang\test\CodeGen\X86\avx512-reduceIntrin.c llvm-project\clang\test\CodeGen\X86\avx512-reduceMinMaxIntrin.c Differential Revision: https://reviews.llvm.org/D117881	2022-01-24 11:57:53 +00:00
Simon Pilgrim	3e50593b18	[X86] Remove `__builtin_ia32_pmax/min` intrinsics and use generic `__builtin_elementwise_max/min` D111985 added the generic `__builtin_elementwise_max` and `__builtin_elementwise_min` intrinsics with the same integer behaviour as the SSE/AVX instructions This patch removes the `__builtin_ia32_pmax/min` intrinsics and just uses `__builtin_elementwise_max/min` - the existing tests see no changes: ``` __m256i test_mm256_max_epu32(__m256i a, __m256i b) { // CHECK-LABEL: test_mm256_max_epu32 // CHECK: call <8 x i32> @llvm.umax.v8i32(<8 x i32> %{{.*}}, <8 x i32> %{{.*}}) return _mm256_max_epu32(a, b); } ``` This requires us to add a `__v64qs` explicitly signed char vector type (we already have `__v16qs` and `__v32qs`). Sibling patch to D117791 Differential Revision: https://reviews.llvm.org/D117798	2022-01-24 11:40:29 +00:00
Simon Pilgrim	e5147f82e1	[X86] Remove __builtin_ia32_pabs intrinsics and use generic __builtin_elementwise_abs D111986 added the generic `__builtin_elementwise_abs()` intrinsic with the same integer absolute behaviour as the SSE/AVX instructions (abs(INT_MIN) == INT_MIN) This patch removes the `__builtin_ia32_pabs` intrinsics and just uses `__builtin_elementwise_abs` - the existing tests see no changes: ``` __m256i test_mm256_abs_epi8(__m256i a) { // CHECK-LABEL: test_mm256_abs_epi8 // CHECK: [[ABS:%.*]] = call <32 x i8> @llvm.abs.v32i8(<32 x i8> %{{.*}}, i1 false) return _mm256_abs_epi8(a); } ``` This requires us to add a `__v64qs` explicitly signed char vector type (we already have `__v16qs` and `__v32qs`). Differential Revision: https://reviews.llvm.org/D117791	2022-01-24 11:25:21 +00:00
Wei Wang	55d887b833	[time-trace] Add optimizer and codegen regions to NPM Optimizer and codegen regions were only added to legacy PM. Add them to NPM as well. Differential Revision: https://reviews.llvm.org/D117605	2022-01-21 19:17:57 -08:00
Simon Pilgrim	0abaf64580	Revert rG4727d29d908f9dd608dd97a58c0af1ad579fd3ca "[X86] Remove __builtin_ia32_pabs intrinsics and use generic __builtin_elementwise_abs" Some build bots are referencing the `__builtin_ia32_pabs` intrinsics via alternative headers	2022-01-21 12:35:36 +00:00
Simon Pilgrim	3ef88b3184	Revert rG8ee135dcf8ff060656ad481c3e980fe8763576f5 "[X86] Remove `__builtin_ia32_pmax/min` intrinsics and use generic `__builtin_elementwise_max/min`" Some build bots are referencing the `__builtin_ia32_pmax/min` intrinsics via alternative headers	2022-01-21 12:34:19 +00:00
Simon Pilgrim	8ee135dcf8	[X86] Remove `__builtin_ia32_pmax/min` intrinsics and use generic `__builtin_elementwise_max/min` D111985 added the generic `__builtin_elementwise_max` and `__builtin_elementwise_min` intrinsics with the same integer behaviour as the SSE/AVX instructions This patch removes the `__builtin_ia32_pmax/min` intrinsics and just uses `__builtin_elementwise_max/min` - the existing tests see no changes: ``` __m256i test_mm256_max_epu32(__m256i a, __m256i b) { // CHECK-LABEL: test_mm256_max_epu32 // CHECK: call <8 x i32> @llvm.umax.v8i32(<8 x i32> %{{.*}}, <8 x i32> %{{.*}}) return _mm256_max_epu32(a, b); } ``` This requires us to add a `__v64qs` explicitly signed char vector type (we already have `__v16qs` and `__v32qs`). Sibling patch to D117791 Differential Revision: https://reviews.llvm.org/D117798	2022-01-21 12:24:58 +00:00
Simon Pilgrim	4727d29d90	[X86] Remove __builtin_ia32_pabs intrinsics and use generic __builtin_elementwise_abs D111986 added the generic `__builtin_elementwise_abs()` intrinsic with the same integer absolute behaviour as the SSE/AVX instructions (abs(INT_MIN) == INT_MIN) This patch removes the `__builtin_ia32_pabs` intrinsics and just uses `__builtin_elementwise_abs` - the existing tests see no changes: ``` __m256i test_mm256_abs_epi8(__m256i a) { // CHECK-LABEL: test_mm256_abs_epi8 // CHECK: [[ABS:%.*]] = call <32 x i8> @llvm.abs.v32i8(<32 x i8> %{{.*}}, i1 false) return _mm256_abs_epi8(a); } ``` This requires us to add a `__v64qs` explicitly signed char vector type (we already have `__v16qs` and `__v32qs`). Differential Revision: https://reviews.llvm.org/D117791	2022-01-21 11:59:08 +00:00
Joao Moreira	82af95029e	[X86] Enable ibt-seal optimization when LTO is used in Kernel Intel's CET/IBT requires every indirect branch target to be an ENDBR instruction. Because of that, the compiler needs to correctly emit these instructions in function prologues. Because this is a security feature, it is desirable that only functions that are actual indirect-branch targets are emitted with ENDBRs. While it is possible to identify address-taken functions through LTO, minimizing these ENDBR instructions remains a hard task for user-space binaries because exported functions may end up being reachable through PLT entries, which use an indirect branch to reach them. Because this cannot be determined at compile time, the compiler currently emits ENDBRs for every non-local-linkage function. Despite the challenge presented for user-space, the kernel landscape is different as no PLTs are used. With the intent of providing the best-fitting ENDBR emission for the kernel, kernel developers proposed an optimization named "ibt-seal" which replaces the ENDBRs with NOPs directly in the binary. The discussion of this feature can be seen in [1]. This diff enables the flag -mibt-seal, which in combination with LTO enforces a different policy for ENDBR placement when the code-model is set to "kernel". In this scenario, the compiler will only emit ENDBRs for address-taken functions, ignoring non-address-taken functions that don't have local linkage. A comparison between LTO-compiled kernel binaries without and with the -mibt-seal feature enabled shows that when -mibt-seal was used, the number of ENDBRs in the vmlinux.o binary patched by objtool decreased from 44383 to 33192, and that the number of superfluous ENDBR instructions nopped-out decreased from 11730 to 540. The 540 missed superfluous ENDBRs need to be investigated further, but hypotheses are: assembly code not being taken care of by the compiler, kernel exported symbols mechanisms creating bogus address taken situations or even these being removed due to other binary optimizations like kernel's static_calls. For now, I assume that the large drop in the number of ENDBR instructions already justifies the feature being merged. [1] - https://lkml.org/lkml/2021/11/22/591 Reviewed By: xiangzhangllvm Differential Revision: https://reviews.llvm.org/D116070	2022-01-21 10:55:34 +08:00
Alexandre Ganea	5af2433e17	[clang-cl] Support the /HOTPATCH flag This patch adds support for the MSVC /HOTPATCH flag: https://docs.microsoft.com/sv-se/cpp/build/reference/hotpatch-create-hotpatchable-image?view=msvc-170&viewFallbackFrom=vs-2019 The flag is translated to a new -fms-hotpatch flag, which in turn adds a 'patchable-function' attribute for each function in the TU. This is then picked up by the PatchableFunction pass which would generate a TargetOpcode::PATCHABLE_OP of minsize = 2 (which means the target instruction must resolve to at least two bytes). TargetOpcode::PATCHABLE_OP is only implemented for x86/x64. When targeting ARM/ARM64, /HOTPATCH isn't required (instructions are always 2/4 bytes and suitable for hotpatching). Additionally, when using /Z7, we generate a 'hot patchable' flag in the CodeView debug stream, in the S_COMPILE3 record. This flag is then picked up by LLD (or link.exe) and is used in conjunction with the linker /FUNCTIONPADMIN flag to generate extra space before each function, to accommodate live patching long jumps. Please see: `d703b92296/lld/COFF/Writer.cpp (L1298)` The outcome is that we can finally use Live++ or Recode along with clang-cl. NOTE: It seems that MSVC cl.exe always enables /HOTPATCH on x64 by default, although if we did the same I thought we might generate sub-optimal code (if this flag was active by default). Additionally, MSVC always generates a .debug$S section and a S_COMPILE3 record, which Clang doesn't do without /Z7. Therefore, the following MSVC command-line "cl /c file.cpp" would have to be written with Clang such as "clang-cl /c file.cpp /HOTPATCH /Z7" in order to obtain the same result. Depends on D43002, D80833 and D81301 for the full feature. Differential Revision: https://reviews.llvm.org/D116511	2022-01-20 12:57:19 -05:00
Florian Hahn	67aa314bce	[IRGen] Do not overwrite existing attributes in CGCall. When adding new attributes, existing attributes are dropped. While this appears to be a longstanding issue, this was highlighted by D105169 which dropped a lot of attributes due to adding the new noundef attribute. Ahmed Bougacha (@ab) tracked down the issue and provided the fix in CGCall.cpp. I bundled it up and updated the tests.	2022-01-20 13:45:19 +00:00
Chenbing.Zheng	0be3da1fab	[RISCV] Add intrinsic for Zbt extension RV32: fsl, fsr, fsri RV64: fsl, fsr, fsri, fslw, fsrw, fsriw Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117468	2022-01-20 08:27:05 +00:00
Yaxun (Sam) Liu	85c2bd2a0e	Prevent adding module flag amdgpu_hostcall multiple times HIP program with printf call fails to compile with -fsanitize=address option, because of appending module flag - amdgpu_hostcall twice, one for printf and one for sanitize option. This patch fixes that issue. Patch by: Praveen Velliengiri Reviewed by: Yaxun Liu, Roman Lebedev Differential Revision: https://reviews.llvm.org/D116216	2022-01-19 12:52:33 -05:00
Arnamoy Bhattacharyya	9fbd33ad62	[OMPIRBuilder] Add support for simd (loop) directive. This patch adds OMPIRBuilder support for the simd directive (without any clause). This will be a first step towards lowering simd directive in LLVM_Flang. The patch uses existing CanonicalLoop infrastructure of IRBuilder to add the support. Also adds necessary code to add llvm.access.group and llvm.loop metadata wherever needed. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D114379	2022-01-19 11:32:17 -05:00
Ben Shi	a2f488c6a5	[clang][AVR] Implement '__flashN' for variables on different flash banks Reviewed By: aykevl Differential Revision: https://reviews.llvm.org/D115982	2022-01-19 11:24:01 +00:00
hyeongyu kim	1b1c8d83d3	[Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default Turning on `enable_noundef_analysis` flag allows better codegen by removing freeze instructions. I modified clang by renaming `enable_noundef_analysis` flag to `disable-noundef-analysis` and turning it off by default. Test updates are made as a separate patch: D108453 Reviewed By: eugenis Differential Revision: https://reviews.llvm.org/D105169	2022-01-16 18:54:17 +09:00
Nikita Popov	c63a3175c2	[AttrBuilder] Remove ctor accepting AttributeList and Index Use the AttributeSet constructor instead. There's no good reason why AttrBuilder itself should extract the AttributeSet from the AttributeList. Moving this out of the AttrBuilder generally results in cleaner code.	2022-01-15 22:39:31 +01:00
James Y Knight	0d3f2fd269	Revert "Skip exception cleanups when the innermost scope is EHTerminateScope." Breaks tests on some platforms. Reverting while investigating. This reverts commit `a4e255f9c6`.	2022-01-14 18:59:24 -05:00
Matt Arsenault	33315ef321	clang/AMDGPU: Don't set implicit arg attribute to default size Since `2959e082e1`, we conservatively assume all inputs are enabled by default. This isn't the best interface for controlling these anyway, since it's not granular and only allows trimming the last fields.	2022-01-14 18:43:30 -05:00
James Y Knight	a4e255f9c6	Skip exception cleanups when the innermost scope is EHTerminateScope. EHTerminateScope is used to implement C++ noexcept semantics. Per C++ [except.terminate], it is implementation-defined whether no, some, or all cleanups are run prior to termination. Therefore, the code to run cleanups on the way towards termination is unnecessary, and may be omitted. After this change, we will still run some cleanups: any cleanups in a function called from the noexcept function will continue to run, while those in the noexcept function itself will not. Differential Revision: https://reviews.llvm.org/D113620	2022-01-14 18:01:29 -05:00
Erich Keane	2bcba21c8b	[CPU-Dispatch] Make sure Dispatch names get updated if previously mangled Cases where there is a mangling of a cpu-dispatch/cpu-specific function before the function becomes 'multiversion' (such as a member function) causes the wrong name to be emitted for one of the variants/resolver, since the name is cached. Make sure we invalidate the cache in cpu-dispatch/cpu-specific modes, like we previously did for just target multiversioning.	2022-01-14 10:45:55 -08:00
Benjamin Kramer	765dd8b8a4	[CGBuiltin] Simplify code. NFCI.	2022-01-14 16:02:02 +01:00
Jun Zhang	8de0c1feca	[Clang] Add __builtin_reduce_or and __builtin_reduce_and This patch implements two builtins specified in D111529. The last one, __builtin_reduce_add, will be separated into another patch. Differential Revision: https://reviews.llvm.org/D116736	2022-01-14 22:05:26 +08:00
Kevin Athey	a0458b531c	Add -fsanitize-address-param-retval to clang. With the introduction of this flag, it is no longer necessary to enable noundef analysis with 4 separate flags. (-Xclang -enable-noundef-analysis -mllvm -msan-eager-checks=1). This change only covers the introduction into the compiler. This is a follow up to: https://reviews.llvm.org/D116855 Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D116633	2022-01-14 00:41:28 -08:00
Maurice Heumann	072e2a7c67	[MS] Implement on-demand TLS initialization for Microsoft CXX ABI TLS initializers, for example constructors of thread-local variables, don't necessarily get called. If a thread was created before a module is loaded, the module's TLS initializers are not executed for this particular thread. This is why Microsoft added support for dynamic TLS initialization. Before every use of thread-local variables, a check is added that runs the module's TLS initializers on-demand. To do this, the method `__dyn_tls_on_demand_init` gets called. Internally, it simply calls `__dyn_tls_init`. No additional TLS initializer that sets the guard needs to be emitted, as the guard always gets set by `__dyn_tls_init`. The guard is also checked again within `__dyn_tls_init`. This makes our check redundant, however, as Microsoft's compiler also emits this check, the behaviour is adopted here. Reviewed By: majnemer Differential Revision: https://reviews.llvm.org/D115456	2022-01-13 21:23:23 -08:00
Elizabeth Andrews	4eaf5846d0	[clang] Fix function pointer address space Function pointers should be created with the program address space. This patch introduces the program address space in TargetInfo. Targets with a non-default (default is 0) address space for functions should explicitly set this value. This patch fixes a crash on an lvalue reference to a function pointer (in device code) when using the oneAPI DPC++ compiler. Differential Revision: https://reviews.llvm.org/D111566	2022-01-13 08:06:19 -08:00
Erich Keane	b699e8b11a	Add another assert to cpu-dispatch emission to help track down a tough to repro error. As mentioned yesterday, I've got a problem that I can only reproduce on Godbolt (none of the build configs on my local machine!), so this is at least somewhat usable until I figure out a cause.	2022-01-13 06:54:08 -08:00
Lian Wang	16877c5d2c	[RISCV] Add bfp and bfpw intrinsic in zbf extension Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D116994	2022-01-13 02:53:00 +00:00
Kevin Athey	a141e47138	[NFC] Minimize noundef analysis when disabled Minor adjustment in order of noundef analysis to be a bit more optimal (when disabled). Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D117078	2022-01-12 17:21:19 -08:00
Erich Keane	6e77ad11ff	Add an assert in cpudispatch emit to try to track down an error. I'm attempting to debug an issue that I can only get to happen on godbolt, where the cpu-dispatch resolver for an out of line member function is generated with the wrong name, causing a link failure.	2022-01-12 10:31:28 -08:00
Simon Pilgrim	497a4b26c4	CGBuiltin - Use castAs<> instead of getAs<> to avoid dereference of nullptr The pointer is always dereferenced immediately below, so assert the cast is correct instead of returning nullptr	2022-01-12 15:35:37 +00:00
Marco Elver	732ad8ea62	[clang][auto-init] Provide __builtin_alloca*_uninitialized variants When `-ftrivial-auto-var-init=` is enabled, allocas unconditionally receive auto-initialization since [1]. In certain cases, it turns out, this is causing problems. For example, when using alloca to add a random stack offset, as the Linux kernel does on syscall entry [2]. In this case, none of the alloca'd stack memory is ever used, and initializing it should be controllable; furthermore, it is not always possible to safely call memset (see [2]). Introduce `__builtin_alloca_uninitialized()` (and `__builtin_alloca_with_align_uninitialized`), which never performs initialization when `-ftrivial-auto-var-init=` is enabled. [1] https://reviews.llvm.org/D60548 [2] https://lkml.kernel.org/r/YbHTKUjEejZCLyhX@elver.google.com Reviewed By: glider Differential Revision: https://reviews.llvm.org/D115440	2022-01-12 15:13:10 +01:00
Sven van Haastregt	4b85800bfd	[OpenCL] Set external linkage for block enqueue kernels All kernels can be called from the host as per the SPIR_KERNEL calling convention. As such, all kernels should have external linkage, but block enqueue kernels were created with internal linkage. Reported-by: Pedro Olsen Ferreira Differential Revision: https://reviews.llvm.org/D115523	2022-01-12 13:30:09 +00:00
Adam Magier	b2715660ed	[clang][CodeGen][UBSan] VLA size checking for unsigned integer parameter The code generation for the UBSan VLA size check was qualified by a condition that the parameter must be a signed integer, however the C spec does not make any distinction that only signed integer parameters can be used to declare a VLA, only qualifying that it must be greater than zero if it is not a constant. Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D116048	2022-01-12 01:11:52 +01:00
Nick Desaulniers	5c562f62a4	[clang] number labels in asm goto strings after tied inputs I noticed that the following case would compile in Clang but not GCC: void *x(void) { void *p = &&foo; asm goto ("# %0\n\t# %l1":"+r"(p):::foo); foo:; return p; } Changing the output template above from %l1 to %l2 would compile in GCC but not Clang. This demonstrates that when using tied outputs (say via the "+r" output constraint), the hidden inputs occur or are numbered BEFORE the labels, at least with GCC. In fact, GCC does denote this in its documentation: https://gcc.gnu.org/onlinedocs/gcc-11.2.0/gcc/Extended-Asm.html#Goto-Labels > Output operand with constraint modifier ‘+’ is counted as two operands > because it is considered as one output and one input operand. For the sake of compatibility, I think it's worthwhile to just make this change. It's better to use symbolic names for compatibility (especially now between released versions of Clang that support asm goto with outputs). ie. %l1 from the above would be %l[foo]. The GCC docs also make this recommendation. Also, I cleaned up some cruft in GCCAsmStmt::getNamedOperand. AFAICT, NumPlusOperands was no longer used, though I couldn't find which commit didn't clean that up correctly. Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98096 Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103640 Link: https://gcc.gnu.org/onlinedocs/gcc-11.2.0/gcc/Extended-Asm.html#Goto-Labels Reviewed By: void Differential Revision: https://reviews.llvm.org/D115471	2022-01-11 12:09:24 -08:00
Nick Desaulniers	c8463fd22b	[clang][CGStmt] emit i constraint rather than X for asm goto indirect dests As suggested in: https://reviews.llvm.org/D114895#3177794 X will be converted to i by SelectionDAGISEL anyways. Reviewed By: void, jyknight Differential Revision: https://reviews.llvm.org/D115311	2022-01-11 11:48:40 -08:00
Akira Hatanaka	e5df9cc098	[CodeGen] Treat ObjC `__unsafe_unretained` and class types as trivial when generating copy/dispose helper functions Analyze the block captures just once before generating copy/dispose block helper functions and honor the inert `__unsafe_unretained` qualifier. This refactor fixes a bug where captures of ObjC `__unsafe_unretained` and class types were needlessly retained/released by the copy/dispose helper functions. Differential Revision: https://reviews.llvm.org/D116948	2022-01-11 11:18:24 -08:00
Nikita Popov	acc39873b7	[CodeGen] Avoid deprecated Address constructor	2022-01-11 13:07:02 +01:00
Hans Wennborg	0b48d0fe12	[ADT] Add an in-place version of toHex() and use that to simplify MD5's hex string code which was previously using a string stream, as well as Clang's CGDebugInfo::computeChecksum(). Differential revision: https://reviews.llvm.org/D116960	2022-01-11 11:51:04 +01:00
Nikita Popov	2d1b55ebea	[CodeGen] Make element type in emitArrayDestroy() predictable When calling emitArrayDestroy(), the pointer will usually have ConvertTypeForMem(EltType) as the element type, as one would expect. However, globals with initializers sometimes don't use the same types as values normally would, e.g. here the global uses { double, i32 } rather than %struct.T as element type. Add an early cast to the global destruction path to avoid this special case. The cast would happen later on anyway, it only gets moved to an earlier point. Differential Revision: https://reviews.llvm.org/D116219	2022-01-11 09:25:29 +01:00
Jennifer Yu	140a6b1e5c	[clang][OpenMP5.1] Initial parsing/sema for 'indirect' clause Differential Revision: https://reviews.llvm.org/D116764	2022-01-10 16:58:56 -08:00
Adrian Prantl	eb200e584e	Emit the C++ dialect in -gmodules .pcm files. Because of commit: https://reviews.llvm.org/D104291 the -gmodules .pcm files do not have the same DW_AT_language dialect as the .o file. This was a simple matter of passing the DebugStrictDwarf flag to the PCHContainerGenerator object's CodeGenOpts from the CompilerInstance passed in to it. Before this change if you ran dwarfdump on the gmodule cache folder you would get DW_AT_language (DW_LANG_C_plus_plus) even when using -std=c++14 with clang Patch by Shubham Rastogi! Differential Revision: https://reviews.llvm.org/D116790	2022-01-10 16:13:40 -08:00
Nikita Popov	7725331ccd	[CodeGen] Avoid some pointer element type accesses Possibly this is sufficient to fix PR53089.	2022-01-10 15:02:55 +01:00
Serge Guelton	d2cc6c2d0c	Use a sorted array instead of a map to store AttrBuilder string attributes Using a std::map<SmallString, SmallString> for target dependent attributes is inefficient: it makes its constructor slightly heavier, and involves extra allocation for each new string attribute. Storing the attribute key/value as strings implies an extra allocation/copy step. Use a sorted vector instead. Given the low number of attributes generally involved, this is cheaper, as showcased by https://llvm-compile-time-tracker.com/compare.php?from=5de322295f4ade692dc4f1823ae4450ad3c48af2&to=05bc480bf641a9e3b466619af43a2d123ee3f71d&stat=instructions Differential Revision: https://reviews.llvm.org/D116599	2022-01-10 14:49:53 +01:00
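The trade-off in the row above, a sorted vector with binary search in place of a node-based map, can be sketched with standard containers. This is an illustration only and not the AttrBuilder code; `SortedStringAttrs` and its members are hypothetical names, and `std::string` stands in for whatever string type the real implementation uses.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Key/value string attributes kept in one contiguous vector, sorted by key.
class SortedStringAttrs {
  using Entry = std::pair<std::string, std::string>;
  std::vector<Entry> Attrs;

  // Comparator used to binary-search entries by key.
  static bool keyLess(const Entry &E, const std::string &Key) {
    return E.first < Key;
  }

public:
  // Inserts a new attribute at its sorted position or overwrites an old one.
  void set(const std::string &Key, std::string Value) {
    auto It = std::lower_bound(Attrs.begin(), Attrs.end(), Key, keyLess);
    if (It != Attrs.end() && It->first == Key)
      It->second = std::move(Value);
    else
      Attrs.emplace(It, Key, std::move(Value));
  }

  // Returns the value for Key, or nullptr if the attribute is absent.
  const std::string *find(const std::string &Key) const {
    auto It = std::lower_bound(Attrs.begin(), Attrs.end(), Key, keyLess);
    return (It != Attrs.end() && It->first == Key) ? &It->second : nullptr;
  }

  std::size_t size() const { return Attrs.size(); }
};
```

Insertion is linear in the number of attributes because later entries shift, but for the handful of attributes the message mentions that cost is small and lookups avoid per-node allocations.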
Kazu Hirata	17d4bd3d78	[clang] Fix bugprone argument comments (NFC) Identified with bugprone-argument-comment.	2022-01-09 00:19:49 -08:00
Kazu Hirata	40446663c7	[clang] Use true/false instead of 1/0 (NFC) Identified with modernize-use-bool-literals.	2022-01-09 00:19:47 -08:00
Johannes Doerfert	37639b72a1	[OpenMP][FIX] Emit debug declares only if debug info is available The `EmitDeclareOfAutoVariable` introduced in D114504 and D115510 has a precondition that cannot be violated. It is unclear if we should call it directly given the sparse usage in clang but for now we should at least not crash if the debug info kind is too low. Fixes #52938. Differential Revision: https://reviews.llvm.org/D116865	2022-01-08 17:01:19 -06:00
Kazu Hirata	d1b127b5b7	[clang] Remove unused forward declarations (NFC)	2022-01-08 11:56:40 -08:00
Simon Pilgrim	6ee589e2f5	[CGObjCMac] Use castAs<> instead of getAs<> to avoid dereference of nullptr inside BuildRCBlockVarRecordLayout This will assert the cast is correct instead of returning nullptr (UnionType is a subtype of RecordType so this should be clean).	2022-01-08 16:18:55 +00:00
Simon Pilgrim	06e9733fec	[CGExpr] Use castAs<> instead of getAs<> to avoid dereference of nullptr This will assert the cast is correct instead of returning nullptr	2022-01-08 14:26:09 +00:00
Jun Zhang	5be131922c	[NFC] Test commit. This is just a test commit to check whether the permission I got is correct or not.	2022-01-08 10:36:09 +08:00
Jun Zhang	b2ed9f3f44	[Clang] Implement the rest of __builtin_elementwise_* functions. The patch implements the rest of the __builtin_elementwise_* functions specified in D111529, including: * __builtin_elementwise_floor * __builtin_elementwise_roundeven * __builtin_elementwise_trunc Signed-off-by: Jun <jun@junz.org> Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D115429	2022-01-07 15:11:36 +00:00
Nikita Popov	e8b98a5216	[CodeGen] Emit elementtype attributes for indirect inline asm constraints This implements the clang side of D116531. The elementtype attribute is added for all indirect constraints (*) and tests are updated accordingly. Differential Revision: https://reviews.llvm.org/D116666	2022-01-06 09:29:22 +01:00
David Pagan	7df2371bc6	Add codegen for allocate directive's 'align' clause	2022-01-05 12:40:58 -05:00
Chuanqi Xu	c75cedc237	[Coroutines] Set presplit attribute in Clang and mlir This fixes bug49264. Simply put, a coroutine shouldn't be inlined before CoroSplit. The marker for a pre-split coroutine was created in the CoroEarly pass, which ran after the AlwaysInliner pass in the O0 pipeline, so the AlwaysInliner couldn't detect that it shouldn't inline a coroutine. That caused the error. This patch sets the presplit attribute in clang and mlir, so the inliner always sees the attribute before splitting. Reviewed By: rjmccall, ezhulenev Differential Revision: https://reviews.llvm.org/D115790	2022-01-05 10:25:02 +08:00
serge-sans-paille	9290ccc3c1	Introduce the AttributeMask class This class is solely used as a lightweight and clean way to build a set of attributes to be removed from an AttrBuilder. Previously AttrBuilder was used both for building and removing, which introduced odd situation like creation of Attribute with dummy value because the only relevant part was the attribute kind. Differential Revision: https://reviews.llvm.org/D116110	2022-01-04 15:37:46 +01:00
Jun Zhang	82020de532	Recommit "[Clang] Extend emitUnaryBuiltin to avoid duplicate logic."" This reverts the revert commit `f552ba6e84`. Recommit with fixed author name.	2022-01-04 13:46:41 +00:00
Florian Hahn	f552ba6e84	Revert "[Clang] Extend emitUnaryBuiltin to avoid duplicate logic." This reverts commit `5c57e6aa57`. Reverted due to a typo in the authors name. Will recommit soon with fixed authorship.	2022-01-04 13:45:28 +00:00
Jun Zhan	5c57e6aa57	[Clang] Extend emitUnaryBuiltin to avoid duplicate logic. This patch extends `emitUnaryBuiltin` so that we can emit better IR when implementing the builtins specified in D111529. Also contains some NFC changes, applying it to existing code. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D116161	2022-01-04 11:47:41 +00:00
Kazu Hirata	d677a7cb05	[clang] Remove redundant member initialization (NFC) Identified with readability-redundant-member-init.	2022-01-02 10:20:23 -08:00
Kazu Hirata	683e6ee7d0	[CodeGen] Remove redundant string initialization (NFC) Identified with readability-redundant-string-init.	2022-01-01 09:14:23 -08:00
Kazu Hirata	298367ee6e	[clang] Use nullptr instead of 0 or NULL (NFC) Identified with modernize-use-nullptr.	2021-12-29 08:34:20 -08:00
Johannes Doerfert	944aa0421c	Reapply "[OpenMP][NFCI] Embed the source location string size in the ident_t" This reverts commit `73ece231ee` and reapplies `7bfcdbcbf3` with mlir changes. Also reverts commit `423ba12971` and includes the unit test changes of `16da214004`.	2021-12-29 01:10:38 -06:00
Mehdi Amini	73ece231ee	Revert "[OpenMP][NFCI] Embed the source location string size in the ident_t" This reverts commit `7bfcdbcbf3`. Broke MLIR build	2021-12-29 06:57:36 +00:00
Johannes Doerfert	7bfcdbcbf3	[OpenMP][NFCI] Embed the source location string size in the ident_t One of the unused ident_t fields now holds the size of the string (=const char *) field so we have an easier time dealing with those in the future. Differential Revision: https://reviews.llvm.org/D113126	2021-12-28 23:53:29 -06:00
Joseph Huber	7cdaa5a94e	[OpenMP][FIX] Change globalization alignment to 16 This patch changes the default alignment from 8 to 16, and encodes this information in the `__kmpc_alloc_shared` runtime call to communicate it to the HeapToStack pass. The previous alignment of 8 was not sufficient for the maximum size of primitive types on 64-bit systems, and needs to be increased. This reduces the amount of space available in the data sharing stack, so this implementation will need to be improved later to include the alignment requirements in the allocation call, and use it properly in the data sharing stack in the runtime. Depends on D115888 Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D115971	2021-12-27 16:58:25 -05:00
Nikita Popov	3e65861131	[CodeGen] Avoid one more pointer element type access The number of elements is always a SizeTy here.	2021-12-27 12:58:22 +01:00
Nikita Popov	1f07a4a569	[CodeGen] Avoid more pointer element type accesses	2021-12-27 12:00:22 +01:00
Shao-Ce SUN	ec501f15a8	[clang][CodeGen] Remove the signed version of createExpression Fix a TODO. Remove the callers of this signed version and delete. Reviewed By: CodaFi Differential Revision: https://reviews.llvm.org/D116014	2021-12-27 14:16:08 +08:00
Kazu Hirata	31cfb3f4f6	[clang] Remove redundant calls to c_str() (NFC) Identified with readability-redundant-string-cstr.	2021-12-26 13:31:40 -08:00
Kazu Hirata	0542d15211	Remove redundant string initialization (NFC) Identified with readability-redundant-string-init.	2021-12-26 09:39:26 -08:00
Kazu Hirata	2d303e6781	Remove redundant return and continue statements (NFC) Identified with readability-redundant-control-flow.	2021-12-24 23:17:54 -08:00
Kazu Hirata	76f0f1cc5c	Use {DenseSet,SetVector,SmallPtrSet}::contains (NFC)	2021-12-24 21:43:06 -08:00
Kazu Hirata	9c0a4227a9	Use Optional::getValueOr (NFC)	2021-12-24 20:57:40 -08:00
Shilei Tian	c7a589a2c4	[Clang][OpenMP] Add the support for atomic compare in parser This patch adds support for `atomic compare` in the parser. The support in Sema and CodeGen will come soon. For now, it simply emits an error when it is encountered. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D115561	2021-12-24 08:16:51 -05:00
Nikita Popov	dd903173c0	[OpenMP] Avoid creating null pointer lvalue (NFC) The reduction initialization code creates a "naturally aligned null pointer to void lvalue", which I found somewhat odd, even though it works out in the end because it is not actually used. It doesn't look like this code actually needs an LValue for anything though, and we can use an invalid Address to represent this case instead. Differential Revision: https://reviews.llvm.org/D116214	2021-12-24 09:01:56 +01:00
Chuanqi Xu	f3d4e168db	[C++20] Conform coroutine's comments in clang (NFC-ish) The comments for coroutines in clang were written for the Coroutines TS. Now that coroutines are merged into the standard, update the comments to conform.	2021-12-24 12:41:44 +08:00
Nikita Popov	7977fd7cfc	[OpenMP] Remove no-op cast (NFC) This was casting the address to its own element type, which is a no-op.	2021-12-23 15:15:26 +01:00
Nikita Popov	bf2b5551f9	[CodeGen] Use CreateConstInBoundsGEP() in one more place This does exactly what this code manually implemented.	2021-12-23 14:58:47 +01:00
Nikita Popov	2c7dc13146	[CGBuilder] Add CreateGEP() overload that accepts an Address Add an overload for an Address and a single non-constant offset. This makes it easier to preserve the element type and adjust the alignment appropriately.	2021-12-23 14:53:42 +01:00
Nikita Popov	53f0538181	[CodeGen] Use correct element type for store to sret sret is special in that it does not use the memory type representation. Manually construct the LValue using ConvertType instead of ConvertTypeForMem here. This fixes matrix-lowering-opt-levels.c on s390x.	2021-12-23 13:02:49 +01:00
Nikita Popov	09669e6c5f	[CodeGen] Avoid pointer element type access when creating LValue This required fixing two places that were passing the pointer type rather than the expected pointee type to the method.	2021-12-23 10:53:15 +01:00
Nikita Popov	1201a0f395	[OpenMP] Fix incorrect type when casting from uintptr MakeNaturalAlignAddrLValue() expects the pointee type, but the pointer type was passed. As a result, the natural alignment of the pointer (usually 8) was always used in place of the natural alignment of the value type. Differential Revision: https://reviews.llvm.org/D116171	2021-12-23 08:57:11 +01:00
Krzysztof Parzyszek	dcb3e8083a	[Hexagon] Make conversions to vector predicate types explicit for builtins HVX does not have load/store instructions for vector predicates (i.e. bool vectors). Because of that, vector predicates need to be converted to another type before being stored, and the most convenient representation is an HVX vector. As a consequence, in C/C++, source-level builtins that either take or produce vector predicates take or return regular vectors instead. On the other hand, the corresponding LLVM intrinsics do have boolean types, and so a conversion of the operand or the return value was necessary. This conversion would happen inside clang's codegen, but was somewhat fragile. This patch changes the strategy: a builtin that takes a vector predicate now really expects a vector predicate. Since such a predicate cannot be provided via a variable, this builtin must be composed with other builtins that either convert vector to a predicate (V6_vandvrt) or predicate to a vector (V6_vandqrt). For users using builtins defined in hvx_hexagon_protos.h there is no impact: the conversions were added to that file. Other users will need to insert - __builtin_HEXAGON_V6_vandvrt[_128B](V, -1) to convert vector V to a vector predicate, or - __builtin_HEXAGON_V6_vandqrt[_128B](Q, -1) to convert vector predicate Q to a vector. Builtins __builtin_HEXAGON_V6_vmaskedstore.* are a temporary exception to that, but they are deprecated and should not be used anyway. In the future they will either follow the same rule, or be removed.	2021-12-22 12:52:24 -08:00
Jeremy Morse	ea22fdd120	[Clang][DebugInfo] Cease turning instruction-referencing off by default Over in D114631 I turned this debug-info feature on by default, for x86_64 only. I'd previously stripped out the clang cc1 option that controlled it in `651122fc4a`, unfortunately that turned out to not be completely effective, and the two things deleted in this patch continued to keep it off-by-default. Oooff. As a follow-up, this patch removes the last few things to do with ValueTrackingVariableLocations from clang, which was the original purpose of D114631. In an ideal world, if this patch causes you trouble you'd revert `3c04507088` instead, which was where this behaviour was supposed to start being the default, although that might not be practical any more.	2021-12-22 16:30:05 +00:00
Alok Kumar Sharma	5eb271880c	[clang][OpenMP][DebugInfo] Debug support for variables in shared clause of OpenMP task construct Currently variables appearing inside shared clause of OpenMP task construct are not visible inside lldb debugger. After the current patch, lldb is able to show the variable ``` * thread #1, name = 'a.out', stop reason = breakpoint 1.1 frame #0: 0x0000000000400934 a.out`.omp_task_entry. [inlined] .omp_outlined.(.global_tid.=0, .part_id.=0x000000000071f0d0, .privates.=0x000000000071f0e8, .copy_fn.=(a.out`.omp_task_privates_map. at testshared.cxx:8), .task_t.=0x000000000071f0c0, __context=0x000000000071f0f0) at testshared.cxx:10:34 7 else { 8 #pragma omp task shared(svar) firstprivate(n) 9 { -> 10 printf("Task svar = %d\n", svar); 11 printf("Task n = %d\n", n); 12 svar = fib(n - 1); 13 } (lldb) p svar (int) $0 = 9 ``` Reviewed By: djtodoro Differential Revision: https://reviews.llvm.org/D115510	2021-12-22 20:04:21 +05:30
Jun Zhan	b55ea2fbc0	[Clang] Add __builtin_reduce_xor This patch implements __builtin_reduce_xor as specified in D111529. Reviewed By: fhahn, aaron.ballman Differential Revision: https://reviews.llvm.org/D115231	2021-12-22 10:00:27 +00:00
Alexandre Ganea	a282ea4898	Reland - [CodeView] Emit S_OBJNAME record Reland integrates build fixes & further review suggestions. Thanks to @zturner for the initial S_OBJNAME patch! Differential Revision: https://reviews.llvm.org/D43002	2021-12-21 19:02:14 -05:00
Alexandre Ganea	5bb5142e80	Revert [CodeView] Emit S_OBJNAME record Also revert all subsequent fixes: - `abd1cbf5e5` [Clang] Disable debug-info-objname.cpp test on Unix until I sort out the issue. - `00ec441253` [Clang] debug-info-objname.cpp test: explictly encode a x86 target when using %clang_cl to avoid falling back to a native CPU triple. - `cd407f6e52` [Clang] Fix build by restricting debug-info-objname.cpp test to x86.	2021-12-21 19:02:14 -05:00
Nikita Popov	a995cdab19	[CodeGen] Avoid more pointer element type accesses	2021-12-21 15:52:18 +01:00
Alexandre Ganea	f44e3fbadd	[CodeView] Emit S_OBJNAME record Thanks to @zturner for the initial patch! Differential Revision: https://reviews.llvm.org/D43002	2021-12-21 09:26:36 -05:00
Nikita Popov	9a05a7b00c	[CodeGen] Accept Address in CreateLaunderInvariantGroup Add an overload that accepts and returns an Address, as we generally just want to replace the pointer with a laundered one, while retaining remaining information.	2021-12-21 14:43:20 +01:00
Nikita Popov	e751d97863	[CodeGen] Avoid some pointer element type accesses This avoids some pointer element type accesses when compiling C++ code.	2021-12-21 14:16:28 +01:00
Nikita Popov	55d7a12b86	[CodeGen] Avoid pointee type access during global var declaration All callers pass in a GlobalVariable, so we can conveniently fetch the type from there.	2021-12-21 11:48:37 +01:00
Sami Tolvanen	ec2e26eaf6	[Clang] Add __builtin_function_start Control-Flow Integrity (CFI) replaces references to address-taken functions with pointers to the CFI jump table. This is a problem for low-level code, such as operating system kernels, which may need the address of an actual function body without the jump table indirection. This change adds the __builtin_function_start() builtin, which accepts an argument that can be constant-evaluated to a function, and returns the address of the function body. Link: https://github.com/ClangBuiltLinux/linux/issues/1353 Depends on D108478 Reviewed By: pcc, rjmccall Differential Revision: https://reviews.llvm.org/D108479	2021-12-20 12:55:33 -08:00
Ellis Hoag	ac719d7c9a	[InstrProf] Don't profile merge by default in lightweight mode Profile merging is not supported when using debug info profile correlation because the data section won't be in the binary at runtime. Change the default profile name in this mode to `default_%p.proflite` so we don't use profile merging. Reviewed By: kyulee Differential Revision: https://reviews.llvm.org/D115979	2021-12-20 09:51:49 -08:00
Sam McCall	af27466c50	Reland "[AST] Add UsingType: a sugar type for types found via UsingDecl" This reverts commit `cc56c66f27`. Fixed a bad assertion, the target of a UsingShadowDecl must not have local qualifiers, but it can be a typedef whose underlying type is qualified.	2021-12-20 18:03:15 +01:00
Sam McCall	cc56c66f27	Revert "[AST] Add UsingType: a sugar type for types found via UsingDecl" This reverts commit `e1600db19d`. Breaks sanitizer tests, at least on windows: https://lab.llvm.org/buildbot/#/builders/127/builds/21592/steps/4/logs/stdio	2021-12-20 17:53:56 +01:00
Sam McCall	e1600db19d	[AST] Add UsingType: a sugar type for types found via UsingDecl Currently there's no way to find the UsingDecl that a typeloc found its underlying type through. Compare to DeclRefExpr::getFoundDecl(). Design decisions: - a sugar type, as there are many contexts this type of use may appear in - UsingType is a leaf like TypedefType, the underlying type has no TypeLoc - not unified with UnresolvedUsingType: a single name is appealing, but being sometimes-sugar is often fiddly. - not unified with TypedefType: the UsingShadowDecl is not a TypedefNameDecl or even a TypeDecl, and users think of these differently. - does not cover other rarer aliases like objc @compatibility_alias, in order to have a concrete API that's easy to understand. - implicitly desugared by the hasDeclaration ASTMatcher, to avoid breaking existing patterns and following the precedent of ElaboratedType. Scope: - This does not cover types associated with template names introduced by using declarations. A future patch should introduce a sugar TemplateName variant for this. (CTAD deduced types fall under this) - There are enough AST matchers to fix the in-tree clang-tidy tests and probably any other matchers, though more may be useful later. Caveats: - This changes a fairly common pattern in the AST people may depend on matching. Previously, typeLoc(loc(recordType())) matched whether a struct was referred to by its original scope or introduced via using-decl. Now, the using-decl case is not matched, and needs a separate matcher. This is similar to the case of typedefs but nevertheless both adds complexity and breaks existing code. Differential Revision: https://reviews.llvm.org/D114251	2021-12-20 17:15:38 +01:00
Yaxun (Sam) Liu	a6786cdd57	[HIPSPV][3/4] Enable SPIR-V emission for HIP This patch enables SPIR-V binary emission for HIP device code via the HIPSPV tool chain. ‘--offload’ option, which is envisioned in [1], is added for specifying offload targets. This option is used to override default device target (amdgcn-amd-amdhsa) for HIP compilation for emitting device code as SPIR-V binary. The option is handled in getHIPOffloadTargetTriple(). getOffloadingDeviceToolChain() function (based on the design in the SYCL repository) is added to select HIPSPVToolChain when HIP offload target is ‘spirv64’. The HIPActionBuilder is modified to produce LLVM IR at the backend phase. HIPSPV tool chain expects to receive HIP device code as LLVM IR so it can run external LLVM passes over them. HIPSPV TC is also responsible for emitting the SPIR-V binary. A Cuda GPU architecture ‘generic’ is added. The name is picked from the LLVM SPIR-V Backend. In the HIPSPV code path the architecture name is inserted to the bundle entry ID as target ID. Target ID is expected to be always present so a component in the target triple is not mistaken as target ID. Tests are added for checking the HIPSPV tool chain. [1]: https://lists.llvm.org/pipermail/cfe-dev/2020-December/067362.html Patch by: Henry Linjamäki Reviewed by: Yaxun Liu, Artem Belevich, Alexey Bader Differential Revision: https://reviews.llvm.org/D110622	2021-12-20 10:45:09 -05:00
jacquesguan	9c11e95286	[Clang][RISCV] Fix upper bound of RISC-V V type in debug info The UpperBound of an RVV type in debug info should be the element count minus one, as the LowerBound starts from zero. Reviewed By: HsiangKai Differential Revision: https://reviews.llvm.org/D115430	2021-12-20 14:25:06 +08:00
Esme-Yi	18f087c21c	[DebugInfo][Clang] record the access flag for class/struct/union types. Summary: This patch records the access flag for class/struct/union types in the clang part. The summary of binary size change and debug info size change due to the DW_AT_accessibility attribute are as the following table. They are built with flags of `clang -O0 -g` (no -gz). \| section \| before \| after \| change \| % \| \| .debug_loc \| 929821 \| 929821 \|0\|0\| \|.debug_abbrev \| 5885289 \| 5971547 \|+86258\|+1.466%\| \|.debug_info \| 497613455 \| 498122074 \|+508619\|+0.102%\| \|.debug_ranges \| 45731664 \| 45731664 \|0\|0\| \|.debug_str \| 233842595 \| 233839388 \|-3207\| -0.001%\| \|.debug_line \| 149773166 \| 149764583 \|-8583\|-0.006%\| \|total (debug) \|933775990 \|934359077\|+583087 \|+0.062%\| \|total (binary) \|1394617288 \| 1395200024\| +582736\|+0.042%\| Reviewed By: dblaikie, shchenz Differential Revision: https://reviews.llvm.org/D115503	2021-12-20 02:40:42 +00:00
Sanjay Patel	1965cc4695	[CodeGen] remove creation of FP cast function attribute This is the last cleanup step resulting from D115804 . Now that clang uses intrinsics when we're in the special FP mode, we don't need a function attribute as an indicator to the backend. The LLVM part of the change is in D115885. Differential Revision: https://reviews.llvm.org/D115886	2021-12-19 11:55:00 -05:00
Kazu Hirata	713ee230f8	[clang] Use llvm::reverse (NFC)	2021-12-17 16:51:42 -08:00
Nikita Popov	9e45146721	[CodeGen] Fix element type for sret argument Fix a mistake in 9bf917394eba3ba4df77cc17690c6d04f4e9d57f: sret arguments use ConvertType, not ConvertTypeForMem, see the handling in CodeGenTypes::GetFunctionType(). This fixes fp-matrix-pragma.c on s390x.	2021-12-17 16:13:28 +01:00
Nikita Popov	9bf917394e	[CodeGen] Avoid more pointer element type accesses	2021-12-17 12:11:50 +01:00
Nikita Popov	ba31cb4d38	[CodeGen] Store element type in RValue For aggregates, we need to store the element type to be able to reconstruct the aggregate Address. This increases the size of this packed structure (as the second value is already used for alignment in this case), but I did not observe any compile-time or memory usage regression from this change.	2021-12-17 09:05:59 +01:00
Ellis Hoag	58d9c1aec8	[Try2][InstrProf] Attach debug info to counters Add the llvm flag `-debug-info-correlate` to attach debug info to instrumentation counters so we can correlate raw profile data to their functions. Raw profiles are dumped as `.proflite` files. The next diff enables `llvm-profdata` to consume `.proflite` and debug info files to produce a normal `.profdata` profile. Part of the "lightweight instrumentation" work: https://groups.google.com/g/llvm-dev/c/r03Z6JoN7d4 The original diff https://reviews.llvm.org/D114565 was reverted because of the `Instrumentation/InstrProfiling/debug-info-correlate.ll` test, which is fixed in this commit. Reviewed By: kyulee Differential Revision: https://reviews.llvm.org/D115693	2021-12-16 14:20:30 -08:00
Mike Rice	2d0bf14397	[clang] Cleanup unneeded Function nullptr checks [NFC] Add an assert and avoid unneeded checks of Fn in CodeGenFunction::GenerateCode. Differential Revision: https://reviews.llvm.org/D115817	2021-12-16 08:28:10 -08:00
Nikita Popov	2d89382b5a	[CodeGen] Avoid more pointer element type accesses This is enough to build sqlite3 with opaque pointers.	2021-12-16 16:34:09 +01:00
Nikita Popov	8285522014	[CodeGen] Always update map entry after adding initializer With opaque pointers the pointer cast may be a no-op, such that var and castedAddr are the same. However, we still need to update the map entry as the underlying global changed. We could explicitly check whether the global was replaced, but we may as well just always update the entry.	2021-12-16 16:29:35 +01:00
Nikita Popov	a0cf066eac	[CodeGen] Store element type in ParamValue ParamValue is basically a union between an Address and a Value*. To be able to reconstruct the Address, we now need to store the pointer element type.	2021-12-16 15:31:55 +01:00
Nikita Popov	58c8c53263	[CodeGen] Avoid more pointer element type accesses	2021-12-16 15:26:21 +01:00
Sanjay Patel	8c7f2a4f87	[CodeGen] use saturating FP casts when compiling with "no-strict-float-cast-overflow" We got an unintended consequence of the optimizer getting smarter when compiling in a non-standard mode, and there's no good way to inhibit those optimizations at a later stage. The test is based on an example linked from D92270. We allow the "no-strict-float-cast-overflow" exception to normal C cast rules to preserve legacy code that does not expect overflowing casts from FP to int to produce UB. See D46236 for details. Differential Revision: https://reviews.llvm.org/D115804	2021-12-16 09:10:12 -05:00
Nikita Popov	34eb715f61	[CodeGen] Avoid more pointer element type accesses	2021-12-16 12:03:11 +01:00
Nikita Popov	9fa15e0073	[CodeGen] Remove an unused MakeAddrLValue() overload (NFC) This is unused and we should prefer the overloads accepting Address.	2021-12-16 11:49:20 +01:00
Nikita Popov	6bca9a428e	[CodeGen] Store ElementType in LValue Store the pointer element type inside LValue so that we can preserve it when converting it back into an Address. Storing the pointer element type might not be strictly required here in that we could probably re-derive it from the QualType (which would require CGF access though), but storing it seems like the simpler solution. The global register case is special and does not store an element type, as the value is not a pointer type in that case and it's not possible to create an Address from it. This is the main remaining part from D103465. Differential Revision: https://reviews.llvm.org/D115791	2021-12-16 09:23:33 +01:00
Nikita Popov	b9492ec649	[CodeGen] Avoid some pointer element type accesses	2021-12-15 14:46:10 +01:00
Nikita Popov	d930c3155c	[CodeGen] Pass element type to EmitCheckedInBoundsGEP() Same as for other GEP creation methods.	2021-12-15 14:03:33 +01:00
Nikita Popov	90bbf79c7b	[CodeGen] Avoid some deprecated Address constructors Some of these are on the critical path towards making something minimal work with opaque pointers.	2021-12-15 12:45:23 +01:00
Nikita Popov	481de0ed80	[CodeGen] Prefer CreateElementBitCast() where possible CreateElementBitCast() can preserve the pointer element type in the presence of opaque pointers, so use it in place of CreateBitCast() in some places. This also sometimes simplifies the code a bit.	2021-12-15 11:48:39 +01:00
Nikita Popov	834c8ff587	[CodeGen] Avoid some uses of deprecated Address constructor Explicitly pass in the element type instead.	2021-12-15 11:13:10 +01:00
Nikita Popov	c3b624a191	[CodeGen] Avoid deprecated ConstantAddress constructor Change all uses of the deprecated constructor to pass the element type explicitly and drop it. For cases where the correct element type was not immediately obvious to me or would require a slightly larger change I'm falling back to explicitly calling getPointerElementType() for now.	2021-12-15 10:42:41 +01:00
Nikita Popov	b4f46555d7	[CodeGen] Avoid some pointer element type accesses	2021-12-15 09:29:27 +01:00
Nikita Popov	abbc2e997b	[CodeGen] Store ElementType in Address Explicitly track the pointer element type in Address, rather than deriving it from the pointer type, which will no longer be possible with opaque pointers. This just adds the basic facility, for now everything is still going through the deprecated constructors. I had to adjust one place in the LValue implementation to satisfy the new assertions: Global registers are represented as a MetadataAsValue, which does not have a pointer type. We should avoid using Address in this case. This implements a part of D103465. Differential Revision: https://reviews.llvm.org/D115725	2021-12-15 08:59:44 +01:00
Sindhu Chittireddy	4706a297fb	Avoid setting tbaa on the store of return type of call to inline assembler. In 32bit mode, attaching TBAA metadata to the store following the call to inline assembler results in describing the wrong type by making a fake lvalue(i.e., whatever the inline assembler happens to leave in EAX:EDX.) Even if inline assembler somehow describes the correct type, setting TBAA information on return type of call to inline assembler is likely not correct, since TBAA rules need not apply to inline assembler. Differential Revision: https://reviews.llvm.org/D115320	2021-12-14 17:40:33 -08:00
Nikita Popov	b81450afb6	[CodeGen] Add std:: qualifier Hopefully addresses the buildbot failures.	2021-12-14 12:17:55 +01:00
Nikita Popov	b8d121eb1d	[CodeGen] Require use of Address::invalid() for invalid address (NFC) This no longer allows creating an invalid Address through the regular constructor. There were only two places that did this (AggValueSlot and EHCleanupScope) which did this by converting a potential nullptr into an Address. I've fixed both of these by directly storing an Address instead. This is intended as a bit of preliminary cleanup for D103465. Differential Revision: https://reviews.llvm.org/D115630	2021-12-14 12:06:05 +01:00
Ellis Hoag	c809da7d9c	Revert "[InstrProf] Attach debug info to counters" This reverts commit `800bf8ed29`. The `Instrumentation/InstrProfiling/debug-info-correlate.ll` test was failing because I forgot the `llc` commands are architecture specific. I'll follow up with a fix. Differential Revision: https://reviews.llvm.org/D115689	2021-12-13 18:15:17 -08:00
Ellis Hoag	800bf8ed29	[InstrProf] Attach debug info to counters Add the llvm flag `-debug-info-correlate` to attach debug info to instrumentation counters so we can correlate raw profile data to their functions. Raw profiles are dumped as `.proflite` files. The next diff enables `llvm-profdata` to consume `.proflite` and debug info files to produce a normal `.profdata` profile. Part of the "lightweight instrumentation" work: https://groups.google.com/g/llvm-dev/c/r03Z6JoN7d4 Reviewed By: kyulee Differential Revision: https://reviews.llvm.org/D114565	2021-12-13 17:51:22 -08:00
Ethan Stewart	d1327f8a57	[clang][amdgpu] - Choose when to promote VarDecl to address space 4. There are instances where clang codegen creates stores to address space 4 in ctors, which causes a crash in llc. This store was being optimized out at opt levels > 0. For example: pragma omp declare target static const double log_smallx = log2(smallx); pragma omp end declare target This patch ensures that any global const that does not have constant initialization stays in address space 1. Note - a second patch is in the works where all global constants are placed in address space 1 during codegen and then the opt pass InferAddressSpaces will promote to address space 4 where necessary. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D115661	2021-12-13 16:31:24 -06:00
Matt Devereau	41def32040	[AArch64][SVE][NEON] Add NEON-SVE-Bridge intrinsics Adds svset_neonq, svget_neonq, svdup_neonq AArch64 intrinsics. These are described in the ACLE specification: https://github.com/ARM-software/acle/pull/72 https://reviews.llvm.org/D114713	2021-12-13 11:31:57 +00:00
Andrew Browne	7c004c2bc9	Revert "[asan] Add support for disable_sanitizer_instrumentation attribute" This reverts commit `2b554920f1`. This change causes tsan test timeout on x86_64-linux-autoconf. The timeout can be reproduced by: git clone https://github.com/llvm/llvm-zorg.git BUILDBOT_CLOBBER= BUILDBOT_REVISION=eef8f3f85679c5b1ae725bade1c23ab7bb6b924f llvm-zorg/zorg/buildbot/builders/sanitizers/buildbot_standard.sh	2021-12-10 14:33:38 -08:00
Alexander Potapenko	2b554920f1	[asan] Add support for disable_sanitizer_instrumentation attribute For ASan this will effectively serve as a synonym for __attribute__((no_sanitize("address"))) Differential Revision: https://reviews.llvm.org/D114421	2021-12-10 12:17:26 +01:00
Joseph Huber	bc9c4d7216	[OpenMP][FIX] Pass the num_threads value directly to parallel_51 The problem with the old scheme is that we would need to keep track of the "next region" and reset the num_threads value after it. The new RT doesn't do it and an assertion is triggered. The old RT doesn't do it either, I haven't tested it but I assume a num_threads clause might impact multiple parallel regions "accidentally". Further, in SPMD mode num_threads was simply ignored, for some reason beyond me. In any case, parallel_51 is designed to take the clause value directly, so let's do that instead. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D113623	2021-12-09 16:30:29 -05:00
Chuanqi Xu	352e36e10d	[Coroutines] Remove unused coroutine builtin/intrinsics llvm.coro.param (NFC-ish) I found that the coroutine intrinsic llvm.coro.param in the documentation (https://llvm.org/docs/Coroutines.html#id101) isn't actually used, since there is no lowering code for it in LLVM. I also checked the implementations of libstdc++ and libc++. Neither of them uses llvm.coro.param. So I am pretty sure that the llvm.coro.param intrinsic is unused. I think it would be better to remove it to avoid possible misunderstandings. Note: according to [class.copy.elision]/p1.3, this optimization is allowed by the C++ language specification. Let's make it someday. Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D115222	2021-12-09 14:40:25 +08:00
Duncan P. N. Exon Smith	cfd1d49dc0	OpenMP: Avoid using SmallVector::set_size() Update `OpenMPIRBuilder::collapseLoops()` to call `resize()` instead of `set_size()`. The latter asserts on capacity limits and cannot grow, which seems likely to be unintentional here (if it is, I think a local assertion would be good for clarity). Also update `CodeGenFunction::EmitOMPCollapsedCanonicalLoopNest()` to use `pop_back_n()` instead of `set_size()`. Differential Revision: https://reviews.llvm.org/D115378	2021-12-08 15:22:50 -08:00
Jun Zhang	8680f951c2	Add __builtin_elementwise_ceil This patch implements one of the missing builtin functions specified in https://reviews.llvm.org/D111529.	2021-12-08 08:29:33 -05:00
Henry Linjamäki	9ae5810b53	[HIPSPV] Convert HIP kernels to SPIR-V kernels This patch translates HIP kernels to SPIR-V kernels when the HIP compilation mode is targeting SPIR-V. This involves: * Setting Cuda calling convention to CC_OpenCLKernel (which maps to SPIR_KERNEL in LLVM IR later on). * Coercing pointer arguments with default address space (AS) qualifier to CrossWorkGroup AS (__global in OpenCL). HIPSPV's device code is ultimately SPIR-V for OpenCL execution environment (as starter/default) where Generic or Function (OpenCL's private) is not supported as storage class for kernel pointer types. This leaves the CrossWorkGroup to be the only reasonable choice for HIP buffers. Reviewed By: yaxunl Differential Revision: https://reviews.llvm.org/D109818	2021-12-08 12:18:15 +03:00
Yaxun (Sam) Liu	3b172f60c6	[HIP] Fix -fgpu-rdc for Windows This patch fixes issues for -fgpu-rdc for Windows MSVC toolchain: Fix COFF specific section flags and remove section types in llvm-mc input file for Windows. Escape fatbin path in llvm-mc input file. Add -triple option to llvm-mc. Put __hip_gpubin_handle in comdat when it has linkonce_odr linkage. Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D115039	2021-12-06 16:42:23 -05:00
Aaron Ballman	6c75ab5f66	Introduce _BitInt, deprecate _ExtInt WG14 adopted the _ExtInt feature from Clang for C23, but renamed the type to be _BitInt. This patch does the vast majority of the work to rename _ExtInt to _BitInt, which accounts for most of its size. The new type is exposed in older C modes and all C++ modes as a conforming extension. However, there are functional changes worth calling out: * Deprecates _ExtInt with a fix-it to help users migrate to _BitInt. * Updates the mangling for the type. * Updates the documentation and adds a release note to warn users what is going on. * Adds new diagnostics for use of _BitInt to call out when it's used as a Clang extension or as a pre-C23 compatibility concern. * Adds new tests for the new diagnostic behaviors. I want to call out the ABI break specifically. We do not believe that this break will cause a significant imposition for early adopters of the feature, and so this is being done as a full break. If it turns out there are critical uses where recompilation is not an option for some reason, we can consider using ABI tags to ease the transition.	2021-12-06 12:52:01 -05:00
Jonas Devlieghere	4cb79294e8	Revert "[clang][DebugInfo] Allow function-local statics and types to be scoped within a lexical block" This reverts commit `e403f4fdc8` because it breaks TestSetData.py on GreenDragon: https://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/39089/	2021-12-06 09:34:53 -08:00
Kristina Bessonova	e403f4fdc8	[clang][DebugInfo] Allow function-local statics and types to be scoped within a lexical block This is almost a reincarnation of https://reviews.llvm.org/D15977 originally implemented by Amjad Aboud. It was discussed on llvm-dev [0], committed with its backend counterpart [1], but finally reverted [2]. This patch makes clang emit correctly scoped debug info for function-local static variables, records (classes, structs and unions) and typedefs when those function-local entities are defined within a lexical (bracketed) block. Before this patch, clang emits all those entities directly scoped in DISubprogram no matter where they were really defined, causing debug info loss (reported several times in [3], [4], [5]). [0] https://lists.llvm.org/pipermail/llvm-dev/2015-November/092551.html [1] https://reviews.llvm.org/rG30e7a8f694a19553f64b3a3a5de81ce317b9ec2f [2] https://reviews.llvm.org/rGdc4531e552af6c880a69d226d3666756198fbdc8 [3] https://bugs.llvm.org/show_bug.cgi?id=19238 [4] https://bugs.llvm.org/show_bug.cgi?id=23164 [5] https://bugs.llvm.org/show_bug.cgi?id=44695 Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D113743	2021-12-06 12:19:09 +02:00
Jay Foad	2774bad112	[AMDGPU] Change llvm.amdgcn.image.bvh.intersect.ray to take vec3 args The ray_origin, ray_dir and ray_inv_dir arguments should all be vec3 to match how the hardware instruction works. Don't change the API of the corresponding OpenCL builtins. Differential Revision: https://reviews.llvm.org/D115032	2021-12-04 10:32:11 +00:00
Peter Collingbourne	0a14674f27	CodeGen: Strip exception specifications from function types in CFI type names. With C++17 the exception specification has been made part of the function type, and therefore part of mangled type names. However, it's valid to convert function pointers with an exception specification to function pointers with the same argument and return types but without an exception specification, which means that e.g. a function of type "void () noexcept" can be called through a pointer of type "void ()". We must therefore consider the two types to be compatible for CFI purposes. We can do this by stripping the exception specification before mangling the type name, which is what this patch does. Differential Revision: https://reviews.llvm.org/D115015	2021-12-03 14:50:52 -05:00
Qiu Chaofan	b9adaa1782	[PowerPC] [Clang] Fix alignment adjustment of single-elemented float128 This does similar thing to `6b1341e`, but fixes single element 128-bit float type: `struct { long double x; }`. Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D114937	2021-12-03 18:07:34 +08:00
Qiu Chaofan	4f94c02616	[Clang] Mutate builtin names under IEEE128 on PPC64 Glibc 2.32 and newer use these symbol names to support IEEE-754 128-bit float. GCC transforms the names of these builtins to align with Glibc header behavior. Since Clang doesn't have all GCC-compatible builtins implemented, this patch only mutates the implemented part. Note nexttoward is a special case (no nexttowardf128) so it's also handled here. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D112401	2021-12-03 17:50:18 +08:00
Matt Arsenault	2f0a571418	Reapply "OpenMP: Start calling setTargetAttributes for generated kernels" This reverts commit `25eb7fa01d`. Previous buildbot failures appear to have been a fluke from a dirty build.	2021-12-02 14:55:56 -05:00
David Greene	53adfa8750	[clang] Do not duplicate "EnableSplitLTOUnit" module flag If clang's output is set to bitcode and LTO is enabled, clang would unconditionally add the flag to the module. Unfortunately, if the input were a bitcode or IR file and had the flag set, this would result in two copies of the flag, which is illegal IR. Guard the setting of the flag by checking whether it already exists. This follows existing practice for the related "ThinLTO" module flag. Differential Revision: https://reviews.llvm.org/D112177	2021-12-02 08:24:56 -08:00
skc7	16b781e6d1	[AMDGPU][clang] Fix __builtin_nontemporal_store() failure on AMDGPU Reviewed By: yaxunl, sameerds Differential Revision: https://reviews.llvm.org/D114849	2021-12-02 05:53:25 +00:00
Ties Stuij	e3b2f0226b	[clang][ARM] PACBTI-M frontend support Handle branch protection option on the commandline as well as a function attribute. One patch for both mechanisms, as they use the same underlying parsing mechanism. These are recorded in a set of LLVM IR module-level attributes like we do for AArch64 PAC/BTI (see https://reviews.llvm.org/D85649): - command-line options are "translated" to module-level LLVM IR attributes (metadata). - functions have PAC/BTI specific attributes iff the __attribute__((target("branch-protection=..."))) was used in the function declaration. - command-line option -mbranch-protection to armclang targeting Arm, following this grammar: branch-protection ::= "-mbranch-protection=" <protection> protection ::= "none" \| "standard" \| "bti" [ "+" <pac-ret-clause> ] \| <pac-ret-clause> [ "+" "bti"] pac-ret-clause ::= "pac-ret" [ "+" <pac-ret-option> ] pac-ret-option ::= "leaf" ["+" "b-key"] \| "b-key" ["+" "leaf"] b-key is simply a placeholder to make it consistent with AArch64's version. In Arm, however, it triggers a warning informing that b-key is unsupported and a-key will be selected instead. - Handle __attribute__((target("branch-protection=..."))) for AArch32 with the same grammar as the commandline options. This patch is part of a series that adds support for the PACBTI-M extension of the Armv8.1-M architecture, as detailed here: https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/armv8-1-m-pointer-authentication-and-branch-target-identification-extension The PACBTI-M specification can be found in the Armv8-M Architecture Reference Manual: https://developer.arm.com/documentation/ddi0553/latest The following people contributed to this patch: - Momchil Velikov - Victor Campos - Ties Stuij Reviewed By: vhscampos Differential Revision: https://reviews.llvm.org/D112421	2021-12-01 10:37:16 +00:00
Matt Arsenault	25eb7fa01d	Revert "OpenMP: Start calling setTargetAttributes for generated kernels" This reverts commit `6c27d389c8`. This is failing on the buildbots	2021-11-29 15:47:10 -05:00
Anshil Gandhi	df0560ca00	[HIP] Add atomic load, atomic store and atomic cmpxchng_weak builtin support in HIP-clang Introduce `__hip_atomic_load`, `__hip_atomic_store` and `__hip_atomic_compare_exchange_weak` builtins in HIP. Reviewed By: yaxunl Differential Revision: https://reviews.llvm.org/D114553	2021-11-29 12:07:13 -07:00
Matt Arsenault	6c27d389c8	OpenMP: Start calling setTargetAttributes for generated kernels This wasn't setting any of the attributes the target would expect to emit for kernels.	2021-11-29 13:43:34 -05:00
Erich Keane	fc53eb69c2	Reapply 'Implement target_clones multiversioning' See discussion in D51650, this change was a little aggressive in an error while doing a 'while we were here', so this removes that error condition, as it is apparently useful. This reverts commit `bb4934601d`.	2021-11-29 06:30:01 -08:00
Alok Kumar Sharma	36cb7477d1	[clang][OpenMP][DebugInfo] Debug support for private variables inside an OpenMP task construct Currently variables appearing inside private/firstprivate/lastprivate clause of openmp task construct are not visible inside lldb debugger. This is because compiler does not generate debug info for it. Please consider the testcase debug_private.c attached with patch. ``` 28 #pragma omp task shared(res) private(priv1, priv2) firstprivate(fpriv) 29 { 30 priv1 = n; 31 priv2 = n + 2; 32 printf("Task n=%d,priv1=%d,priv2=%d,fpriv=%d\n",n,priv1,priv2,fpriv); 33 -> 34 res = priv1 + priv2 + fpriv + foo(n - 1); 35 } 36 #pragma omp taskwait 37 return res; (lldb) p priv1 error: <user expression 0>:1:1: use of undeclared identifier 'priv1' priv1 ^ (lldb) p priv2 error: <user expression 1>:1:1: use of undeclared identifier 'priv2' priv2 ^ (lldb) p fpriv error: <user expression 2>:1:1: use of undeclared identifier 'fpriv' fpriv ^ ``` After the current patch, lldb is able to show the variables ``` (lldb) p priv1 (int) $0 = 10 (lldb) p priv2 (int) $1 = 12 (lldb) p fpriv (int) $2 = 14 ``` Reviewed By: djtodoro Differential Revision: https://reviews.llvm.org/D114504	2021-11-25 19:55:22 +05:30
Yaxun (Sam) Liu	aa9b90ca44	Fix warning due to default switch label Fix warning due to default label in switch which covers all enumeration values	2021-11-23 10:52:51 -05:00
Yaxun (Sam) Liu	e13246a2ec	[HIP] Add HIP scope atomic operations Add an AtomicScopeModel for HIP and support for OpenCL builtins that are missing in HIP. Patch by: Michael Liao Revised by: Anshil Gandhi Reviewed by: Yaxun Liu Differential Revision: https://reviews.llvm.org/D113925	2021-11-23 10:13:37 -05:00
Alexey Bataev	80256605f8	[OpenMP] support depend clause for taskwait directive, by Deepak Eachempati. This patch adds clang (parsing, sema, serialization, codegen) support for the 'depend' clause on the 'taskwait' directive. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D113540	2021-11-19 06:30:17 -08:00
Phoebe Wang	de34a940ae	[X86] Add -mskip-rax-setup support to align with GCC AMD64 ABI mandates caller to specify the number of used SSE registers when passing variable arguments. GCC also provides option -mskip-rax-setup to skip the setup of rax when SSE is disabled. This helps to reduce the code size, see pr23258. Reviewed By: nickdesaulniers Differential Revision: https://reviews.llvm.org/D112413	2021-11-18 11:20:32 +08:00
Nico Weber	ae98182cf7	[clang] Make -masm=intel affect inline asm style With this, void f() { __asm__("mov eax, ebx"); } now compiles with clang with -masm=intel. This matches gcc. The flag is not accepted in clang-cl mode. It has no effect on MSVC-style `__asm {}` blocks, which are unconditionally in intel mode both before and after this change. One difference to gcc is that in clang, inline asm strings are "local" while they're "global" in gcc. Building the following with -masm=intel works with clang, but not with gcc where the ".att_syntax" from the 2nd __asm__() is in effect until file end (or until a ".intel_syntax" somewhere later in the file): __asm__("mov eax, ebx"); __asm__(".att_syntax\nmovl %ebx, %eax"); __asm__("mov eax, ebx"); This also updates clang's intrinsic headers to work both in -masm=att (the default) and -masm=intel modes. The official solution for this according to "Multiple assembler dialects in asm templates" in gcc docs->Extensions->Inline Assembly->Extended Asm is to write every inline asm snippet twice: bt{l %[Offset],%[Base] \| %[Base],%[Offset]} This works in LLVM after D113932 and D113894, so use that. (Just putting `.att_syntax` at the start of the snippet works in some but not all cases: When LLVM interpolates in parameters like `%0`, it uses at&t or intel syntax according to the inline asm snippet's flavor, so the `.att_syntax` within the snippet happens too late: The interpolated-in parameter is already in intel style, and then won't parse in the switched `.att_syntax`.) It might be nice to invent a `#pragma clang asm_dialect push "att"` / `#pragma clang asm_dialect pop` to be able to force asm style per snippet, so that the inline asm string doesn't contain the same code in two variants, but let's leave that for a follow-up. Fixes PR21401 and PR20241. Differential Revision: https://reviews.llvm.org/D113707	2021-11-17 13:41:59 -05:00
Ahsan Saghir	4c8b8e0154	[PowerPC] Allow MMA built-ins to accept non-void pointers and arrays Calls to MMA builtins that take pointer to void do not accept other pointers/arrays whereas normal functions with the same parameter do. This patch allows MMA built-ins to accept non-void pointers and arrays. Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D113306	2021-11-16 09:14:41 -06:00
Kazu Hirata	d0ac215dd5	[clang] Use isa instead of dyn_cast (NFC)	2021-11-14 09:32:40 -08:00
Josh Learn	7611e16fce	[clang][objc][codegen] Skip emitting ObjC category metadata when the category is empty Currently, if we create a category in ObjC that is empty, we still emit runtime metadata for that category. This is a scenario that could commonly be run into when using __attribute__((objc_direct_members)), which elides the need for much of the category metadata. This is slightly wasteful and can be easily skipped by checking the category metadata contents during CodeGen. rdar://66177182 Differential Revision: https://reviews.llvm.org/D113455	2021-11-12 16:21:21 -08:00
Adrian Kuegel	bb4934601d	Revert "Implement target_clones multiversioning" This reverts commit `9deab60ae7`. There is a possibly unintended semantic change.	2021-11-12 11:05:58 +01:00
David Blaikie	6512098877	DebugInfo/Printing: Improve name of policy for including types for template arguments Feedback from Richard Smith that the policy should be named closer to the context its used in.	2021-11-11 21:59:27 -08:00
Erich Keane	9deab60ae7	Implement target_clones multiversioning As discussed here: https://lwn.net/Articles/691932/ GCC6.0 adds target_clones multiversioning. This functionality is an odd cross between the cpu_dispatch and 'target' MV, but is compatible with neither. This attribute allows you to list all options, then emits a separately optimized version of each function per-option (similar to the cpu_specific attribute). It automatically generates a resolver, just like the other two. The mangling however, is... ODD to say the least. The mangling format is: <normal_mangling>.<option string>.<option ordinal>. Differential Revision:https://reviews.llvm.org/D51650	2021-11-11 11:11:16 -08:00
James Y Knight	fddc4e4116	Correct handling of the 'throw()' exception specifier in C++17. Per C++17 [except.spec], 'throw()' has become equivalent to 'noexcept', and should therefore call std::terminate, not std::unexpected. Differential Revision: https://reviews.llvm.org/D113517	2021-11-10 17:40:16 -05:00
Yaxun (Sam) Liu	4b3881e9f3	Emit hidden hostcall argument for sanitized kernels The patch https://reviews.llvm.org/D110337 changes how the hostcall hidden argument is emitted for printf, but sanitized kernels also use the hostcall buffer to report an error for invalid memory access. That case is not handled by the above patch, which leads to a vdi runtime error: Device::callbackQueue aborting with error : HSA_STATUS_ERROR_MEMORY_FAULT: Agent attempted to access an inaccessible address. code: 0x2b Patch by: Praveen Velliengiri Reviewed by: Yaxun Liu, Matt Arsenault Differential Revision: https://reviews.llvm.org/D112820	2021-11-10 17:05:57 -05:00
Yaxun (Sam) Liu	80072fde61	[CUDA][HIP] Allow comdat for kernels Two identical instantiations of a template function can be emitted by two TU's with linkonce_odr linkage without causing duplicate symbols in linker. MSVC also requires these symbols be in comdat sections. Linux does not require the symbols in comdat sections to be merged by linker but by default clang puts them in comdat sections. If a template kernel is instantiated identically in two TU's, MSVC requires them to be in comdat sections; otherwise the MSVC linker will diagnose them as duplicate symbols. However, currently clang does not put instantiated template kernels in comdat sections, which causes link error for MSVC. This patch allows putting instantiated template kernels into comdat sections. Reviewed by: Artem Belevich, Reid Kleckner Differential Revision: https://reviews.llvm.org/D112492	2021-11-10 16:42:23 -05:00
Igor Kirillov	4860f6cb25	[OpenMP] Fix: opposite attributes could be set by -fno-inline After the changes introduced by D106799 it is possible to tag an outlined function with both AlwaysInline and NoInline attributes using the -fno-inline command line option. This issue is similar to D107649. Differential Revision: https://reviews.llvm.org/D112645	2021-11-10 16:48:09 +00:00
Jon Chesterfield	27177b82d4	[OpenMP] Lower printf to __llvm_omp_vprintf Extension of D112504. Lower amdgpu printf to `__llvm_omp_vprintf` which takes the same const char, void arguments as cuda vprintf and also passes the size of the void* alloca which will be needed by a non-stub implementation of `__llvm_omp_vprintf` for amdgpu. This removes the amdgpu link error on any printf in a target region in favour of silently compiling code that doesn't print anything to stdout. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D112680	2021-11-10 15:30:56 +00:00
Vassil Vassilev	4fb0805c65	[clang-repl] Allow Interpreter::getSymbolAddress to take a mangled name.	2021-11-10 12:52:05 +00:00
Jorge Gorbe Moya	770ddf599d	Fix unused variable warning in release build	2021-11-09 19:48:42 -08:00
hsmahesha	3b9a85d10a	[CFE][Codegen] Make sure to maintain the contiguity of all the static allocas at the start of the entry block, which in turn would aid better code transformation/optimization. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D110257	2021-11-10 08:45:21 +05:30
Joseph Huber	4b5c3e591d	[OpenMP] Remove doing assumption propagation in the front end. This patch removes the assumption propagation that was added in D110655 primarily to get assumption information on opaque call sites for optimizations. The analysis done in D111445 allows us to do this more intelligently in the back-end. Depends on D111445 Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D111463	2021-11-09 17:39:24 -05:00
Kostya Serebryany	b7f3a4f4fa	[sancov] add tracing for loads and stores add tracing for loads and stores. The primary goal is to have more options for data-flow-guided fuzzing, i.e. use data flow insights to perform better mutations or more aggressive corpus expansion. But the feature is general purpose, could be used for other things too. Pipe the flag through clang and clang driver, same as for the other SanitizerCoverage flags. While at it, change some plain arrays into std::array. Tests: clang flags test, LLVM IR test, compiler-rt executable test. Reviewed By: morehouse Differential Revision: https://reviews.llvm.org/D113447	2021-11-09 14:35:13 -08:00
Itay Bookstein	9efce0baee	[clang] Run LLVM Verifier in modes without CodeGen too Previously, the Backend_Emit{Nothing,BC,LL} modes did not run the LLVM verifier since it is usually added via the TargetMachine::addPassesToEmitFile method according to the DisableVerify parameter. This is called from EmitAssemblyHelper::AddEmitPasses, which is only relevant for BackendAction-s that require CodeGen. Note: * In these particular situations the verifier is added to the optimization pipeline rather than the codegen pipeline so that it runs prior to the BC/LL emission pass. * This change applies to both the old and the new PMs. * Because the clang tests use -emit-llvm ubiquitously, this change will enable the verifier for them. * A small bug is fixed in emitIFuncDefinition so that the clang/test/CodeGen/ifunc.c test would pass: the emitIFuncDefinition incorrectly passed the GlobalDecl of the IFunc itself to the call to GetOrCreateLLVMFunction for creating the resolver. Signed-off-by: Itay Bookstein <ibookstein@gmail.com> Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D113352	2021-11-09 23:57:13 +02:00
Itay Bookstein	3b1fd19357	[CodeGen] Diagnose and reject non-function ifunc resolvers Signed-off-by: Itay Bookstein <ibookstein@gmail.com> Reviewed By: MaskRay, erichkeane Differential Revision: https://reviews.llvm.org/D112868	2021-11-09 23:51:36 +02:00
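
The `-mbranch-protection` grammar quoted in commit `e3b2f0226b` above can be sketched as a small recognizer. This is only an illustration of the grammar as written in that commit message, not clang's parser; the function name `parse_branch_protection` and the result keys are hypothetical, and since the message does not say what `standard` expands to, the sketch records it as a separate flag rather than guessing.

```python
# Illustrative recognizer for the -mbranch-protection grammar quoted in
# commit e3b2f0226b. Not clang's implementation; all names are hypothetical.

PREFIX = "-mbranch-protection="

# pac-ret-option ::= "leaf" ["+" "b-key"] | "b-key" ["+" "leaf"]  (or absent)
VALID_PAC_RET_OPTIONS = ([], ["leaf"], ["b-key"],
                         ["leaf", "b-key"], ["b-key", "leaf"])


def parse_branch_protection(option):
    """Return a dict of flags for a valid option string, else raise ValueError."""
    if not option.startswith(PREFIX):
        raise ValueError("expected an option starting with " + PREFIX)
    parts = option[len(PREFIX):].split("+")
    result = {"standard": False, "bti": False, "pac_ret": False,
              "leaf": False, "b_key": False}

    if parts == ["none"]:
        return result
    if parts == ["standard"]:
        # The commit message does not define what "standard" enables.
        result["standard"] = True
        return result

    # protection ::= "bti" ["+" pac-ret-clause] | pac-ret-clause ["+" "bti"]
    # Peel one "bti" off the front or the back, never both.
    if parts[0] == "bti":
        result["bti"] = True
        parts = parts[1:]
    elif parts[-1] == "bti":
        result["bti"] = True
        parts = parts[:-1]

    if parts:
        # pac-ret-clause ::= "pac-ret" ["+" pac-ret-option]
        if parts[0] != "pac-ret":
            raise ValueError("unexpected token: " + repr(parts[0]))
        options = parts[1:]
        if options not in VALID_PAC_RET_OPTIONS:
            raise ValueError("invalid pac-ret options: " + repr(options))
        result["pac_ret"] = True
        result["leaf"] = "leaf" in options
        result["b_key"] = "b-key" in options
    return result
```

For example, `parse_branch_protection("-mbranch-protection=bti+pac-ret+leaf")` sets the `bti`, `pac_ret` and `leaf` flags. The sketch only records `b-key`; per the commit message, on Arm the real option triggers a warning and a-key is selected instead.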

15165 Commits