llvm-project

Joseph Huber 68d133a3e8 [OpenMP] Simplify GPU memory globalization Summary: Memory globalization is required to maintain OpenMP standard semantics for data sharing between worker and master threads. The GPU cannot share data between its threads, so it must allocate global or shared memory to store the data in. Currently this is implemented fully in the frontend using the `__kmpc_data_sharing_push_stack` and `__kmpc_data_sharing_pop_stack` functions to emulate standard CPU stack sharing. The front-end scans the target region for variables that escape the region and must be shared between the threads. Each variable then has a field created for it in a global record type. This patch replaces this functionality with a single allocation command, effectively mimicking an alloca instruction for the variables that must be shared between the threads. This will be much slower than the current solution, but makes it much easier to optimize as we can analyze each variable independently and determine if it is not captured. In the future, we can replace these calls with an `alloca` and small allocations can be pushed to shared memory. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D97680		2021-06-22 10:52:46 -04:00
ABIInfo.h	[ABI][NFC] Fix the confusion of ByVal and ByRef argument names	2020-08-06 15:20:18 +03:00
Address.h	…
BackendUtil.cpp	[clang-cl][sanitizer] Add -fsanitize-address-use-after-return to clang.	2021-06-11 12:07:35 -07:00
CGAtomic.cpp	[CGAtomic] Delete outdated code comparing success/failure ordering for cmpxchg.	2021-05-28 15:36:01 -07:00
CGBlocks.cpp	[clang] NFC: Rename rvalue to prvalue	2021-06-09 12:27:10 +02:00
CGBlocks.h	[CodeGen] Simplify the way lifetime of block captures is extended	2020-06-11 16:06:22 -07:00
CGBuilder.h	[OpaquePtrs] Remove some uses of type-less CreateGEP() (NFC)	2021-03-12 21:01:16 +01:00
CGBuiltin.cpp	Update @llvm.powi to handle different int sizes for the exponent	2021-06-17 09:38:28 +02:00
CGCUDANV.cpp	Reimplement __builtin_unique_stable_name-	2021-05-27 07:12:20 -07:00
CGCUDARuntime.cpp	…
CGCUDARuntime.h	[HIP] Emit kernel symbol	2021-03-01 16:31:40 -05:00
CGCXX.cpp	[OpaquePtrs] Remove some uses of type-less CreateLoad APIs (NFC)	2021-03-11 14:40:57 +01:00
CGCXXABI.cpp	Fix PR35902: incorrect alignment used for ubsan check.	2020-12-28 18:11:17 -05:00
CGCXXABI.h	[clang][aarch64] Precondition isHomogeneousAggregate on isCXX14Aggregate	2021-01-12 19:44:01 +00:00
CGCall.cpp	[Clang][CodeGen] Set the size of llvm.lifetime to unknown for scalable types.	2021-06-07 23:30:13 +08:00
CGCall.h	Replace `T(x)` with `reinterpret_cast<T>(x)` everywhere it means reinterpret_cast. NFC.	2020-12-22 19:54:29 -05:00
CGClass.cpp	[CUDA][HIP] Fix store of vtbl in ctor	2021-06-08 10:24:44 -04:00
CGCleanup.cpp	[Windows SEH]: Fix -O2 crash for Windows -EHa	2021-06-04 14:07:44 -07:00
CGCleanup.h	[XCOFF][AIX] Generate LSDA data and compact unwind section on AIX	2020-12-02 18:42:44 +00:00
CGCoroutine.cpp	Revert "[Coroutines] Set presplit attribute in Clang instead of CoroEarly pass"	2021-04-18 17:22:28 -07:00
CGDebugInfo.cpp	[Debug-Info] guard DW_LANG_C_plus_plus_14 under strict dwarf	2021-06-16 03:17:56 +00:00
CGDebugInfo.h	[clang] p1099 using enum part 2	2021-06-08 11:11:46 -07:00
CGDecl.cpp	[clang] p1099 using enum part 2	2021-06-08 11:11:46 -07:00
CGDeclCXX.cpp	[CUDA][HIP] Fix device variables used by host	2021-05-20 17:04:29 -04:00
CGException.cpp	[WebAssembly] Warn on exception spec for Emscripten EH	2021-05-20 13:00:20 -07:00
CGExpr.cpp	[clang] NFC: Rename rvalue to prvalue	2021-06-09 12:27:10 +02:00
CGExprAgg.cpp	[Clang][CodeGen] Set the size of llvm.lifetime to unknown for scalable types.	2021-06-07 23:30:13 +08:00
CGExprCXX.cpp	Implemented [[clang::musttail]] attribute for guaranteed tail calls.	2021-04-15 17:12:21 -07:00
CGExprComplex.cpp	[Matrix] Implement C-style explicit type conversions for matrix types.	2021-04-10 11:48:41 +01:00
CGExprConstant.cpp	[Matrix] Implement C-style explicit type conversions for matrix types.	2021-04-10 11:48:41 +01:00
CGExprScalar.cpp	[clang] NFC: Rename rvalue to prvalue	2021-06-09 12:27:10 +02:00
CGGPUBuiltin.cpp	…
CGLoopInfo.cpp	[Clang] Ensure vector predication loop metadata is always emitted when pragma is specified.	2021-02-13 17:35:54 -06:00
CGLoopInfo.h	[SVE] Add support to vectorize_width loop pragma for scalable vectors	2021-01-08 11:37:27 +00:00
CGNonTrivialStruct.cpp	[Clang][Codegen] Do not annotate thunk's this/return types with align/deref/nonnull attrs	2021-05-13 20:33:08 +03:00
CGObjC.cpp	[clang] NFC: Rename rvalue to prvalue	2021-06-09 12:27:10 +02:00
CGObjCGNU.cpp	[clang] NFC: Fix range-based for loop warnings related to decl lookup	2021-04-19 18:31:31 +02:00
CGObjCMac.cpp	[clang] NFC: Fix range-based for loop warnings related to decl lookup	2021-04-19 18:31:31 +02:00
CGObjCRuntime.cpp	[OpaquePtrs] Remove some uses of type-less CreateGEP() (NFC)	2021-03-12 21:01:16 +01:00
CGObjCRuntime.h	[clang] Implement objc_non_runtime_protocol to remove protocol metadata	2020-10-02 17:35:50 -04:00
CGOpenCLRuntime.cpp	…
CGOpenCLRuntime.h	…
CGOpenMPRuntime.cpp	[OpenMP] Implement '#pragma omp unroll'.	2021-06-10 14:30:17 -05:00
CGOpenMPRuntime.h	[OpenMP] Fix non-determinism in clang task codegen	2021-05-03 10:34:38 -07:00
CGOpenMPRuntimeAMDGCN.cpp	[libomptarget][amdgpu] Call into deviceRTL instead of ockl	2021-01-04 16:48:47 +00:00
CGOpenMPRuntimeAMDGCN.h	[OpenMP][AMDGCN] Support OpenMP offloading for AMDGCN architecture - Part 3	2020-08-03 05:38:39 +00:00
CGOpenMPRuntimeGPU.cpp	[OpenMP] Simplify GPU memory globalization	2021-06-22 10:52:46 -04:00
CGOpenMPRuntimeGPU.h	[OpenMP] Simplify GPU memory globalization	2021-06-22 10:52:46 -04:00
CGOpenMPRuntimeNVPTX.cpp	[OpenMP][AMDGCN] Support OpenMP offloading for AMDGCN architecture - Part 3	2020-08-03 05:38:39 +00:00
CGOpenMPRuntimeNVPTX.h	[OpenMP][AMDGCN] Support OpenMP offloading for AMDGCN architecture - Part 3	2020-08-03 05:38:39 +00:00
CGRecordLayout.h	[ARM] Follow AACPS standard for volatile bit-fields access width	2020-10-13 10:31:48 +01:00
CGRecordLayoutBuilder.cpp	[CodeGen] Use getCharWidth() more consistently in CGRecordLowering. NFC	2021-01-22 21:12:17 +01:00
CGStmt.cpp	[OpenMP] Implement '#pragma omp unroll'.	2021-06-10 14:30:17 -05:00
CGStmtOpenMP.cpp	[clang] Remove unused capture in closure	2021-06-22 15:09:39 +01:00
CGVTT.cpp	[AMDGPU] Set the default globals address space to 1	2020-11-20 15:46:53 +00:00
CGVTables.cpp	[Clang][Codegen] Do not annotate thunk's this/return types with align/deref/nonnull attrs	2021-05-13 20:33:08 +03:00
CGVTables.h	[clang] Frontend components for the relative vtables ABI (round 2)	2020-06-11 11:17:08 -07:00
CGValue.h	[AST] Change return type of getTypeInfoInChars to a proper struct instead of std::pair.	2020-10-13 13:26:56 +02:00
CMakeLists.txt	Remove dependency on clangASTMatchers.	2020-09-10 22:17:48 -04:00
CodeGenABITypes.cpp	[CodeGen] Add public function to emit C++ destructor call.	2020-07-01 11:01:23 -07:00
CodeGenAction.cpp	[clang-repl] Recommit "Land initial infrastructure for incremental parsing"	2021-05-13 06:30:29 +00:00
CodeGenFunction.cpp	[IR] convert warn-stack-size from module flag to fn attr	2021-06-21 15:09:25 -07:00
CodeGenFunction.h	[Clang][OpenMP] Monotonic does not apply to SIMD	2021-06-22 10:24:11 +01:00
CodeGenModule.cpp	[IR] convert warn-stack-size from module flag to fn attr	2021-06-21 15:09:25 -07:00
CodeGenModule.h	[NFC] Pass GV value type instead of pointer type to GetOrCreateLLVMGlobal	2021-05-17 18:41:17 -07:00
CodeGenPGO.cpp	[PGO] Don't reference functions unless value profiling is enabled	2021-05-20 11:09:24 -07:00
CodeGenPGO.h	[PGO] Don't reference functions unless value profiling is enabled	2021-05-20 11:09:24 -07:00
CodeGenTBAA.cpp	Reland Implement _ExtInt as an extended int type specifier.	2020-04-17 10:45:48 -07:00
CodeGenTBAA.h	…
CodeGenTypeCache.h	[CGExpr] Use getCharWidth() more consistently in CCGExprConstant. NFC	2021-01-22 21:12:17 +01:00
CodeGenTypes.cpp	[Clang][RISCV] Define RISC-V V builtin types	2021-02-18 10:17:31 +08:00
CodeGenTypes.h	CodeGenTypes::CGRecordLayouts: Use unique_ptr to simplify memory management	2020-04-28 22:31:16 -07:00
ConstantEmitter.h	attempt to fix failing buildbots after `3bab88b7ba`	2020-06-15 12:58:37 +02:00
ConstantInitBuilder.cpp	Fix ConstantAggregateBuilderBase::getRelativeOffset	2020-06-15 12:23:20 -07:00
CoverageMappingGen.cpp	Revert "Revert "[Coverage] Emit gap region between statements if first statements contains terminate statements.""	2021-03-04 11:52:43 -08:00
CoverageMappingGen.h	[Driver] Rename -fprofile-{prefix-map,compilation-dir} to -fcoverage-{prefix-map,compilation-dir}	2021-02-25 21:40:12 -08:00
EHScopeStack.h	[Windows SEH]: HARDWARE EXCEPTION HANDLING (MSVC -EHa) - Part 1	2021-05-17 22:42:17 -07:00
ItaniumCXXABI.cpp	[Clang][Codegen] Do not annotate thunk's this/return types with align/deref/nonnull attrs	2021-05-13 20:33:08 +03:00
MacroPPCallbacks.cpp	…
MacroPPCallbacks.h	…
MicrosoftCXXABI.cpp	[Windows SEH]: HARDWARE EXCEPTION HANDLING (MSVC -EHa) - Part 1	2021-05-17 22:42:17 -07:00
ModuleBuilder.cpp	[clang/Basic] Make TargetInfo.h not use DataLayout again	2021-04-27 22:26:10 -04:00
ObjectFilePCHContainerOperations.cpp	[clang/Basic] Make TargetInfo.h not use DataLayout again	2021-04-27 22:26:10 -04:00
PatternInit.cpp	Clean up usages of asserting vector getters in Type	2020-04-13 13:01:40 -07:00
PatternInit.h	…
README.txt	Revert "This is a test commit"	2020-12-23 13:04:37 -06:00
SanitizerMetadata.cpp	[clang][patch] Inclusive language, modify filename SanitizerBlacklist.h to NoSanitizeList.h	2021-02-22 15:11:37 -05:00
SanitizerMetadata.h	[Analysis/Transforms/Sanitizers] As part of using inclusive language	2020-06-20 00:42:26 -07:00
SwiftCallingConv.cpp	Teach the swift calling convention about _Atomic types	2020-08-31 07:07:25 -07:00
TargetInfo.cpp	[clang] Apply MS ABI details on __builtin_ms_va_list on non-windows platforms on x86_64	2021-06-08 12:14:12 +03:00
TargetInfo.h	[CFE, SystemZ] New target hook testFPKind() for checks of FP values.	2021-02-18 12:36:46 -06:00
VarBypassDetector.cpp	[clang,NFC] Fix typos in file headers	2021-02-25 12:47:02 -08:00
VarBypassDetector.h	[clang,NFC] Fix typos in file headers	2021-02-25 12:47:02 -08:00

README.txt

IRgen optimization opportunities.

//===---------------------------------------------------------------------===//

The common pattern of
--
short x; // or char, etc
(x == 10)
--
generates a zext/sext of x, which can easily be avoided.
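
A minimal sketch of why the extension is unnecessary, assuming a
16-bit short (the function names are hypothetical, not part of
Clang). Because 10 is representable in the narrow type, comparing the
sign-extended value and comparing the zero-extended bit pattern give
the same answer, so the compare could be done at the narrow width:

```c
#include <stdbool.h>
#include <stdint.h>

/* Comparison as C defines it: x is sign-extended to int first. */
bool eq10_sext(int16_t x) {
    return (int32_t)x == 10;
}

/* The same comparison on the zero-extended 16-bit pattern of x. */
bool eq10_zext(int16_t x) {
    return (uint32_t)(uint16_t)x == 10u;
}
```

Both functions agree for every 16-bit input, which suggests that the
kind of extension does not matter here and that a 16-bit compare
against the constant would be sufficient.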

//===---------------------------------------------------------------------===//

Bitfield accesses can be shifted to simplify masking and sign
extension. For example, if the bitfield width is 8 and it is
appropriately aligned, then it is a lot shorter to just load the char
directly.
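
The two access strategies can be sketched as follows. The layout is
an assumption made for illustration: a 32-bit storage unit held as
four bytes in little-endian order, with a signed 8-bit field in bits
8..15. The type and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical storage unit: four bytes, least significant first. */
typedef struct { uint8_t bytes[4]; } Storage;

/* Generic read: load the whole unit, shift, mask, then sign-extend. */
int32_t read_field_generic(const Storage *s) {
    uint32_t word = (uint32_t)s->bytes[0]
                  | ((uint32_t)s->bytes[1] << 8)
                  | ((uint32_t)s->bytes[2] << 16)
                  | ((uint32_t)s->bytes[3] << 24);
    uint32_t field = (word >> 8) & 0xFFu;      /* isolate bits 8..15 */
    return (int32_t)(field ^ 0x80u) - 0x80;    /* sign-extend 8 bits */
}

/* Shortcut: the field is byte-sized and byte-aligned, so load only
   that byte. The narrowing cast is implementation-defined before C23
   but wraps as two's complement on mainstream compilers. */
int32_t read_field_direct(const Storage *s) {
    return (int8_t)s->bytes[1];
}
```

Both functions return the same value for every byte pattern; the
direct version needs one load and one extension instead of a wide
load followed by a shift and a mask.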

//===---------------------------------------------------------------------===//

It may be worth avoiding creation of allocas for formal arguments
in the common situation where the argument is never written to and
never has its address taken. The idea would be to begin generating
code by using the argument directly and, if its address is taken or
it is stored to, then generate the alloca and patch up the existing
code.

In theory, the same optimization could be a win for block local
variables as long as the declaration dominates all statements in the
block.

NOTE: The main case we care about here is -O0 -g compile-time
performance, and in that scenario we currently need to emit the
alloca anyway to emit proper debug info. So this is blocked on
being able to emit debug information which refers to an LLVM
temporary, not an alloca.
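
As an illustration of which formal arguments would qualify, here is a
small sketch with hypothetical function names. Only the first
function leaves its argument unwritten and never takes its address;
under the scheme described above, the other two would still need the
alloca.

```c
/* 'a' is only read: its incoming value could be used directly. */
int add_one(int a) {
    return a + 1;
}

/* 'a' is stored to, so it needs a memory slot. */
int clamp_to_zero(int a) {
    if (a < 0)
        a = 0;
    return a;
}

/* Helper that writes through a pointer. */
static void bump(int *p) {
    *p += 1;
}

/* 'a' has its address taken, so it also needs a memory slot. */
int add_one_via_pointer(int a) {
    bump(&a);
    return a;
}
```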

//===---------------------------------------------------------------------===//

We should try to avoid generating basic blocks which contain only
jumps. At -O0, this penalizes us all the way from IRgen (malloc &
instruction overhead) down through code generation and assembly
time.

On 176.gcc:expr.ll, it looks like over 12% of basic blocks are just
direct branches!
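
A small source pattern that tends to produce such blocks is a nested
if, sketched here with a hypothetical function. The merge point of
the inner if has no work of its own, so an unoptimized lowering would
be expected to emit a block there whose only instruction is a branch
to the outer merge point. The exact blocks emitted depend on the
Clang version, so treat this as an assumption and inspect the output
of clang -O0 -S -emit-llvm to confirm.

```c
int classify(int a, int b) {
    int r = 0;
    if (a) {
        if (b)
            r = 1;
        /* Inner merge point: nothing to do here except fall through
           to the outer merge point. */
    }
    /* Outer merge point: the return value is loaded here. */
    return r;
}
```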

//===---------------------------------------------------------------------===//