llvm-project

Commit Graph

Author	SHA1	Message	Date
Stanislav Mekhanoshin	932f628121	[AMDGPU] new gfx940 fp atomics Differential Revision: https://reviews.llvm.org/D121028	2022-03-07 12:32:02 -08:00
David Blaikie	c0a6433f2b	Simplify OpenMP Lambda use * Use default ref capture for non-escaping lambdas (this makes maintenance easier by allowing new uses, removing uses, having conditional uses (such as in assertions) not require updates to an explicit capture list) * Simplify addPrivate API not to take a lambda, since it calls it unconditionally/immediately anyway - most callers are simply passing in a named value or short expression anyway and the lambda syntax just adds noise/overhead Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D121077	2022-03-07 18:23:20 +00:00
Qiu Chaofan	b2497e5435	[PowerPC] Add generic fnmsub intrinsic Currently in Clang, we have two types of builtins for fnmsub operation: one for float/double vector, they'll be transformed into IR operations; one for float/double scalar, they'll generate corresponding intrinsics. But for the vector version of builtin, the 3 op chain may be recognized as expensive by some passes (like early cse). We need some way to keep the fnmsub form until code generation. This patch introduces ppc.fnmsub.* intrinsic to unify four fnmsub intrinsics. Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D116015	2022-03-07 13:00:06 +08:00
Shao-Ce SUN	fa9c8bab0c	[RISCV] Support k-ext clang intrinsics Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D112774	2022-03-05 13:57:18 +08:00
Akira Hatanaka	3717b9661f	[NFC][Clang][OpaquePtr] Remove calls to Address::deprecated in CGBlocks.cpp Differential Revision: https://reviews.llvm.org/D120856	2022-03-03 08:54:46 -08:00
Aakanksha	840695814a	[AMDGPU] Add gfx1036 target Differential Revision: https://reviews.llvm.org/D120846	2022-03-02 23:26:38 +00:00
Stanislav Mekhanoshin	2e2e64df4a	[AMDGPU] Add gfx940 target This is target definition only. Differential Revision: https://reviews.llvm.org/D120688	2022-03-02 13:54:48 -08:00
Tong Zhang	f76d3b800f	[clang][CGStmt] fix crash on invalid asm statement Clang is crashing on the following statement char var[9]; __asm__ ("" : "=r" (var) : "0" (var)); This is similar to existing test: crbug_999160_regtest The issue happens when EmitAsmStmt is trying to convert input to match output type length. However, that is not guaranteed to be successful all the time and if the statement itself is invalid like having an array type in the example, we should give a regular error message here instead of using assert(). Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D120596	2022-03-02 11:18:55 -08:00
Akira Hatanaka	d112cc2756	[NFC][Clang][OpaquePtr] Remove the call to Address::deprecated in CreatePointerBitCastOrAddrSpaceCast Differential Revision: https://reviews.llvm.org/D120757	2022-03-02 08:58:00 -08:00
Tong Zhang	17ce89fa80	[SanitizerBounds] Add support for NoSanitizeBounds function Currently adding attribute no_sanitize("bounds") isn't disabling -fsanitize=local-bounds (also enabled in -fsanitize=bounds). The Clang frontend handles fsanitize=array-bounds which can already be disabled by no_sanitize("bounds"). However, instrumentation added by the BoundsChecking pass in the middle-end cannot be disabled by the attribute. The fix is very similar to D102772 that added the ability to selectively disable sanitizer pass on certain functions. In this patch, if no_sanitize("bounds") is provided, an additional function attribute (NoSanitizeBounds) is attached to IR to let the BoundsChecking pass know we want to disable local-bounds checking. In order to support this feature, the IR is extended (similar to D102772) to make Clang able to preserve the information and let BoundsChecking pass know bounds checking is disabled for certain function. Reviewed By: melver Differential Revision: https://reviews.llvm.org/D119816	2022-03-01 18:47:02 +01:00
Michael Kruse	a66f7769a3	[OpenMPIRBuilder] Implement static-chunked workshare-loop schedules. Add applyStaticChunkedWorkshareLoop method implementing static schedule when chunk-size is specified. Unlike a static schedule without chunk-size (where chunk-size is chosen by the runtime such that each thread receives one chunk), we need two nested loops: one for looping over the iterations of a chunk, and a second for looping over all chunks assigned to the threads. This patch includes the following related changes: * Adapt applyWorkshareLoop to triage between the schedule types, now possible since all schedules have been implemented. The default schedule is assumed to be non-chunked static, as without OpenMPIRBuilder. * Remove the chunk parameter from applyStaticWorkshareLoop, it is ignored by the runtime. Change the value for the value passed to the init function to 0, as without OpenMPIRBuilder. * Refactor CanonicalLoopInfo::setTripCount and CanonicalLoopInfo::mapIndVar as used by both, applyStaticWorkshareLoop and applyStaticChunkedWorkshareLoop. * Enable Clang to use the OpenMPIRBuilder in the presence of the schedule clause. Differential Revision: https://reviews.llvm.org/D114413	2022-02-28 18:18:33 -06:00
Dávid Bolvanský	223b824022	[Clang] noinline call site attribute Motivation: ``` int foo(int x, int y) { // any compiler will happily inline this function return x / y; } int test(int x, int y) { int r = 0; [[clang::noinline]] r += foo(x, y); // for some reason we don't want any inlining here return r; } ``` In 2018, @kuhar proposed "Introduce per-callsite inline intrinsics" in https://reviews.llvm.org/D51200 to solve this motivation case (and many others). This patch solves this problem with call site attribute. The implementation is "smaller" wrt approach which uses new intrinsics and thanks to https://reviews.llvm.org/D79121 (Add nomerge statement attribute to clang), we have got some basic infrastructure to deal with attrs on statements with call expressions. GCC devs are more inclined to call attribute solution as well, as builtins are problematic for them - https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104187. But they have no patch proposal yet so.. We have free hands here. If this approach makes sense, next future steps would be support for call site attributes for always_inline / flatten. Reviewed By: aaron.ballman, kuhar Differential Revision: https://reviews.llvm.org/D119061	2022-02-28 21:21:17 +01:00
Itay Bookstein	f3480390be	[clang][CodeGen] Avoid emitting ifuncs with undefined resolvers The purpose of this change is to fix the following codegen bug: ``` // main.c __attribute__((cpu_specific(generic))) int *foo(void) { static int z; return &z;} int main() { return *foo() = 5; } // other.c __attribute__((cpu_dispatch(generic))) int *foo(void); // run: clang main.c other.c -o main; ./main ``` This will segfault prior to the change, and return the correct exit code 5 after the change. The underlying cause is that when a translation unit contains a cpu_specific function without the corresponding cpu_dispatch the generated code binds the reference to foo() against a GlobalIFunc whose resolver is undefined. This is invalid: the resolver must be defined in the same translation unit as the ifunc, but historically the LLVM bitcode verifier did not check that. The generated code then binds against the resolver rather than the ifunc, so it ends up calling the resolver rather than the resolvee. In the example above it treats its return value as an int *, therefore trying to write to program text. The root issue at the representation level is that GlobalIFunc, like GlobalAlias, does not support a "declaration" state. The object which provides the correct semantics in these cases is a Function declaration, but unlike Functions, changing a declaration to a definition in the GlobalIFunc case constitutes a change of the object type, as opposed to simply emitting code into a Function. I think this limitation is unlikely to change, so I implemented the fix by returning a function declaration rather than an ifunc when encountering cpu_specific, and upgrading it to an ifunc when emitting cpu_dispatch. This uses `takeName` + `replaceAllUsesWith` in similar vein to other places where the correct IR object type cannot be known locally/up-front, like in `CodeGenModule::EmitAliasDefinition`. Previous discussion in: https://reviews.llvm.org/D112349 Signed-off-by: Itay Bookstein <ibookstein@gmail.com> Reviewed By: erichkeane Differential Revision: https://reviews.llvm.org/D120266	2022-02-26 11:17:49 +02:00
Adrian Prantl	bc7aeea854	Revert "Don't append the working directory to absolute paths" This reverts commit `2cd9a86da5`.	2022-02-25 17:00:10 -08:00
Adrian Prantl	2cd9a86da5	Don't append the working directory to absolute paths This fixes a bug that happens when using -fdebug-prefix-map to remap an absolute path to a relative path. Since the path was absolute before remapping, it is safe to assume that concatenating the remapped working directory would be wrong. Differential Revision: https://reviews.llvm.org/D113718	2022-02-25 13:03:59 -08:00
Alexey Bataev	d04d9220e1	[OPENMP]Fix PR50347: Mapping of global scope deep object fails. Changed the way we handle llvm::Constants in sizes arrays. ConstExprs and GlobalValues cannot be used as initializers, need to put them at the runtime, otherwise there might be compilation errors. Differential Revision: https://reviews.llvm.org/D105297	2022-02-25 10:54:24 -08:00
Shangwu Yao	c2f501f395	[CUDA][SPIRV] Assign global address space to CUDA kernel arguments (resubmit https://reviews.llvm.org/D119207 after fixing the test for some build settings) This patch converts CUDA pointer kernel arguments with default address space to CrossWorkGroup address space (__global in OpenCL). This is because Generic or Function (OpenCL's private) is not supported as storage class for kernel pointer types. Differential revision: https://reviews.llvm.org/D120366	2022-02-24 20:51:43 -08:00
Alexey Bataev	ca6fa71b7e	Revert "[OPENMP]Fix PR50347: Mapping of global scope deep object fails." This reverts commit `638938117a`. Need to fix reported fail https://lab.llvm.org/buildbot/#/builders/193/builds/7496	2022-02-24 12:04:39 -08:00
Alexey Bataev	638938117a	[OPENMP]Fix PR50347: Mapping of global scope deep object fails. Changed the way we handle llvm::Constants in sizes arrays. ConstExprs and GlobalValues cannot be used as initializers, need to put them at the runtime, otherwise there might be compilation errors. Differential Revision: https://reviews.llvm.org/D105297	2022-02-24 11:49:14 -08:00
Joseph Huber	7aef8b3754	[OpenMP] Make section variable external to prevent collisions Summary: We use a section to embed offloading code into the host for later linking. This is normally unique to the translation unit as it is thrown away during linking. However, if the user performs a relocatable link the sections will be merged and we won't be able to access the files stored inside. This patch changes the section variables to have external linkage and a name defined by the section name, so if two sections are combined during linking we get an error.	2022-02-24 10:57:09 -05:00
Yaxun (Sam) Liu	9d899d8f01	[HIP] Support `-fgpu-default-stream` Introduce -fgpu-default-stream={legacy|per-thread} option to support per-thread default stream for HIP runtime. When -fgpu-default-stream=per-thread, HIP kernels are launched through hipLaunchKernel_spt instead of hipLaunchKernel. Also HIP_API_PER_THREAD_DEFAULT_STREAM=1 is defined by the preprocessor to enable other per-thread stream API's. Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D120298	2022-02-23 22:28:29 -05:00
Fangrui Song	0477cac332	[asan] Allow -fsanitize-address-globals-dead-stripping with -fno-data-sections for ELF -fdata-sections decides whether global variables go into different sections. This is orthogonal to whether we place their metadata (`.data` or `asan_globals`) into different sections. With -fno-data-sections, `-fsanitize-address-globals-dead-stripping` can still: * deduplicate COMDAT `asan.module_ctor` and `asan.module_dtor` * (with ld --gc-sections): for a data section (e.g. `.data`), if all global variables defined relative to it are unreferenced, discard them and associated `asan_globals` sections (rare but no need to exclude this case) Similar to `c7b90947bd` for PE/COFF. Reviewed By: #sanitizers, kstoimenov, vitalybuka Differential Revision: https://reviews.llvm.org/D120394	2022-02-23 16:08:25 -08:00
Joseph Huber	119d71cb73	[OpenMP][NFC] Address warnings and lint messages in CGOpenMPRuntime Summary: This patch addressed the warnings and linting messages for the CGOpenMPRuntime.cpp file. This was causing some -Werror builds to fail.	2022-02-23 18:07:25 -05:00
Reid Kleckner	1d1b089c5d	Fix more unused lambda capture warnings, NFC	2022-02-23 14:07:04 -08:00
Reid Kleckner	cd37594c03	Fix unused lambda capture warning, NFC	2022-02-23 14:01:01 -08:00
Joseph Huber	2b97b16f29	[OpenMP] Add option to make offloading mandatory Currently when we generate OpenMP offloading code we always make fallback code for the CPU. This is necessary for implementing features like conditional offloading and ensuring that unhandled pragmas don't result in missing symbols. However, this is problematic for a few cases. For offloading tests we can silently fail to the host without realizing that offloading failed. Additionally, this makes it impossible to provide interoperability to other offloading schemes like HIP or CUDA because those methods do not provide any such host fallback guarantee. This patch adds the `-fopenmp-offload-mandatory` flag to prevent generating the fallback symbol on the CPU and instead replaces the function with a dummy global and the failed branch with 'unreachable'. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D120353	2022-02-23 16:45:36 -05:00
Arthur Eubanks	4cb24ef90a	[clang] Remove Address::deprecated() from CGClass.cpp	2022-02-23 13:31:56 -08:00
Arthur Eubanks	6eec483584	[clang] Remove getPointerElementType() in EmitVTableTypeCheckedLoad()	2022-02-23 09:38:33 -08:00
Nikita Popov	b1863d8245	[Clang][OpenMP] Remove use of getPointerElementType() This new pointer element type use snuck in via D118632.	2022-02-23 16:14:24 +01:00
Arthur Eubanks	36e335eeb5	[clang] Remove Address::deprecated() calls in CodeGenFunction.cpp	2022-02-22 18:28:49 -08:00
Arthur Eubanks	cde658fa1f	[clang] Remove Address::deprecated() calls in CGVTables.cpp	2022-02-22 16:54:28 -08:00
Arthur Eubanks	3ef7e6c53c	[clang] Remove an Address::deprecated() call in CGClass.cpp	2022-02-22 16:19:06 -08:00
Shilei Tian	104d9a6743	[Clang][OpenMP] Add the codegen support for `atomic compare` This patch adds the codegen support for `atomic compare` in clang. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D118632	2022-02-22 13:01:39 -05:00
Shilei Tian	ccebf8ac8c	[Clang][OpenMP] Add support for compare capture in parser This patch adds the support for `atomic compare capture` in parser and part of sema. We don't create an AST node for this because the spec doesn't say `compare` and `capture` clauses should be used tightly, so we cannot look one more token ahead in the parser. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D116261	2022-02-18 10:23:59 -05:00
Joseph Huber	0870a4f59a	[OpenMP] Add flag for disabling thread state in runtime The runtime uses thread state values to indicate when we use an ICV or are in nested parallelism. This is done for OpenMP correctness, but it is not needed in the majority of cases. The new flag added is `-fopenmp-assume-no-thread-state`. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D120106	2022-02-18 08:35:05 -05:00
Alexander Potapenko	c85a26454d	[asan] Add support for disable_sanitizer_instrumentation attribute For ASan this will effectively serve as a synonym for __attribute__((no_sanitize("address"))). Adding the disable_sanitizer_instrumentation to functions will drop the sanitize_XXX attributes on the IR level. This is the third reland of https://reviews.llvm.org/D114421. Now that TSan test is fixed (https://reviews.llvm.org/D120050) there should be no deadlocks. Differential Revision: https://reviews.llvm.org/D120055	2022-02-18 09:51:54 +01:00
hyeongyukim	b529744c29	[Clang] Rename `disable-noundef-analysis` flag to `-[no-]enable-noundef-analysis` This flag was previously renamed `enable_noundef_analysis` to `disable-noundef-analysis,` which is not a conventional name. (Driver and CC1's boolean options are using [no-] prefix) As discussed at https://reviews.llvm.org/D105169, this patch reverts its name to `[no-]enable_noundef_analysis` and enables noundef-analysis as default. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D119998	2022-02-18 17:02:41 +09:00
Arthur Eubanks	0b5fe2c9f2	[clang] Remove Address::deprecated() in emitVoidPtrDirectVAArg()	2022-02-17 15:05:50 -08:00
Matthew Voss	9ce09099bb	Revert "[CUDA][SPIRV] Assign global address space to CUDA kernel arguments" This reverts commit `9de4fc0f2d`. Reverting due to test failure: https://lab.llvm.org/buildbot/#/builders/139/builds/17199	2022-02-17 14:32:10 -08:00
Arthur Eubanks	ba9944ea1d	[clang] Remove Address::deprecated() in CGCXXABI.h	2022-02-17 14:23:02 -08:00
Arthur Eubanks	0e219af475	[clang] Remove Address::deprecated() call in CGExprCXX.cpp	2022-02-17 13:58:26 -08:00
Shafik Yaghmour	f56cb520d8	[DEBUGINFO] [LLDB] Add support for generating debug-info for structured bindings of structs and arrays Currently we are not emitting debug-info for all cases of structured bindings, a C++17 feature which allows us to bind names to subobjects in an initializer. A structured binding is represented by a DecompositionDecl AST node and the bindings are represented by a BindingDecl. It looks like the original implementation only covered the tuple-like case, which is represented by a DeclRefExpr which contains a VarDecl. If the binding is to a subobject of the struct the binding will contain a MemberExpr and in the case of arrays it will contain an ArraySubscriptExpr. This PR adds support for emitting debug-info for the MemberExpr and ArraySubscriptExpr cases as well as llvm and lldb tests for these cases as well as the tuple case. Differential Revision: https://reviews.llvm.org/D119178	2022-02-17 11:14:14 -08:00
Shangwu Yao	9de4fc0f2d	[CUDA][SPIRV] Assign global address space to CUDA kernel arguments This patch converts CUDA pointer kernel arguments with default address space to CrossWorkGroup address space (__global in OpenCL). This is because Generic or Function (OpenCL's private) is not supported as storage class for kernel pointer types. Differential Revision: https://reviews.llvm.org/D119207	2022-02-17 09:38:06 -08:00
Simon Pilgrim	57fc9798d7	[clang] CGDebugInfo::getOrCreateMethodType - use castAs<> instead of getAs<> to avoid dereference of nullptr The pointer is always dereferenced, so assert the cast is correct instead of returning nullptr	2022-02-17 13:18:23 +00:00
Simon Pilgrim	2614de8202	[clang] CGCXXABI::EmitLoadOfMemberFunctionPointer - use castAs<> instead of getAs<> to avoid dereference of nullptr The pointer is always dereferenced by arrangeCXXMethodType, so assert the cast is correct instead of returning nullptr	2022-02-17 13:18:23 +00:00
Nikita Popov	5065076698	[CodeGen] Rename deprecated Address constructor To make uses of the deprecated constructor easier to spot, and to ensure that no new uses are introduced, rename it to Address::deprecated(). While doing the rename, I've filled in element types in cases where it was relatively obvious, but we're still left with 135 calls to the deprecated constructor.	2022-02-17 11:26:42 +01:00
Nikita Popov	fe3407a91b	[CGBuilder] Assert that CreateAddrSpaceCast does not change element type Address space casts in general may change the element type, but don't allow it in the method working on Address, so we can preserve the element type. CreatePointerBitCastOrAddrSpaceCast() still needs to be addressed.	2022-02-16 15:17:08 +01:00
Chuanqi Xu	d30ca5e2e2	[C++20] [Coroutines] Implement return value optimization for get_return_object This patch tries to implement RVO for coroutine's return object got from get_return_object. From [dcl.fct.def.coroutine]/p7 we could know that the return value of get_return_object is either a reference or a prvalue. So it makes sense to do copy elision for the return value. The return object should be constructed directly into the storage where they would otherwise be copied/moved to. Test Plan: folly, check-all Reviewed By: junparser Differential revision: https://reviews.llvm.org/D117087	2022-02-16 13:38:00 +08:00
David Blaikie	9980a3f831	DebugInfo: Disable simplified template names for -gmlt and below Since -gmlt doesn't carry any type information necessary to rebuild template names.	2022-02-15 11:58:40 -08:00
David Blaikie	1ea326634b	DebugInfo: Don't simplify template names using _BitInt(N) _BitInt(N) only encodes the byte size in DWARF, not the bit size, so can't be reconstituted.	2022-02-15 11:58:40 -08:00
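
Commit a66f7769a3 above describes the static-chunked workshare-loop schedule as two nested loops: one over the chunks assigned to a thread and one over the iterations inside each chunk. The sketch below illustrates that iteration pattern in plain C++, assuming the round-robin chunk assignment that OpenMP specifies for schedule(static, chunk). The function name staticChunkedIterations and its parameters are hypothetical and chosen for illustration only; this is not the OpenMPIRBuilder implementation, which emits LLVM IR and calls into the OpenMP runtime.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper (not an LLVM API): lists the logical iterations that
// thread `tid` of `numThreads` would run under schedule(static, chunkSize),
// assuming chunks are handed out round-robin by thread number.
std::vector<std::int64_t> staticChunkedIterations(std::int64_t tripCount,
                                                  std::int64_t chunkSize,
                                                  std::int64_t numThreads,
                                                  std::int64_t tid) {
  std::vector<std::int64_t> iterations;
  if (tripCount <= 0 || chunkSize <= 0 || numThreads <= 0)
    return iterations;

  // Outer loop: walk over every chunk assigned to this thread. Consecutive
  // chunks of the same thread are numThreads * chunkSize iterations apart.
  for (std::int64_t chunkStart = tid * chunkSize; chunkStart < tripCount;
       chunkStart += numThreads * chunkSize) {
    // The last chunk may be shorter than chunkSize.
    std::int64_t chunkEnd = std::min(chunkStart + chunkSize, tripCount);

    // Inner loop: walk over the iterations of the current chunk.
    for (std::int64_t i = chunkStart; i < chunkEnd; ++i)
      iterations.push_back(i);
  }
  return iterations;
}
```

With 10 iterations, a chunk size of 2 and 3 threads, thread 0 would get iterations 0, 1, 6, 7 under this assignment, thread 1 would get 2, 3, 8, 9, and thread 2 would get 4, 5. A static schedule without a chunk size needs only a single loop per thread, since each thread receives exactly one chunk.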

15016 Commits