llvm-project/clang/lib/CodeGen
Yaxun Liu 0bc4b2d337 [OpenCL] Generate opaque type for sampler_t and function call for the initializer
Currently Clang uses int32 to represent sampler_t, which has been a source of issues for some backends because they cannot represent sampler_t as int32. Those backends have to depend on kernel argument metadata and use IPA to find the sampler arguments and global variables and transform them to a target-specific sampler type.

This patch uses the opaque pointer type opencl.sampler_t* for sampler_t. For each use of a file-scope sampler variable, it generates a call to __translate_sampler_initializer. For each initialization of a function-scope sampler variable, it also generates a call to __translate_sampler_initializer.

Each builtin library can implement its own __translate_sampler_initializer(). Since the real sampler type tends to be architecture dependent, allowing it to be initialized by a library function simplifies backend design. A typical implementation of __translate_sampler_initializer could be a table lookup of real sampler literal values. Since its argument is always a literal, the returned pointer is known at compile time and is easily optimized into literal values placed directly in image read instructions.
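
The table-lookup idea can be sketched in plain C as follows. This is only an illustration under stated assumptions: the descriptor layout, the table contents, and the function name translate_sampler_initializer_sketch are all hypothetical (a real library would define __translate_sampler_initializer for its own target, and the patch does not prescribe the layout).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical target-specific sampler descriptor. The real layout is
   architecture dependent and is not specified by the patch. */
struct opencl_sampler_t {
    uint32_t hw_bits; /* made-up hardware encoding */
};

/* One pre-built descriptor per supported initializer value.
   The values are illustrative only. */
static const struct opencl_sampler_t sampler_table[] = {
    { 0x00u }, { 0x11u }, { 0x22u }, { 0x33u },
};

/* Sketch of a library-side translator: map the literal initializer to a
   pointer to the matching descriptor, or NULL if the value is unknown. */
const struct opencl_sampler_t *
translate_sampler_initializer_sketch(uint32_t init)
{
    size_t n = sizeof sampler_table / sizeof sampler_table[0];
    return init < n ? &sampler_table[init] : NULL;
}
```

Because the argument is a compile-time literal, an optimizer that inlines such a function can fold the lookup to a constant pointer into the table.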

This patch is partially based on Alexey Sotkin's work in Khronos Clang (3d4eec6162).

Differential Revision: https://reviews.llvm.org/D21567

llvm-svn: 277024
2016-07-28 19:26:30 +00:00
..
ABIInfo.h IRGen-level lowering for the Swift calling convention. 2016-04-04 18:33:08 +00:00
Address.h Work around build failure due to GCC 4.8.1 bug. We don't completely understand 2016-02-02 23:11:49 +00:00
BackendUtil.cpp Add flags to toggle preservation of assembly comments 2016-07-27 19:57:40 +00:00
CGAtomic.cpp [NFC] Header cleanup 2016-07-18 19:02:11 +00:00
CGBlocks.cpp CodeGen: correct assertion 2016-06-03 23:26:30 +00:00
CGBlocks.h [NFC] Header cleanup 2016-07-18 19:02:11 +00:00
CGBuilder.h Remove compile time PreserveName in favor of a runtime cc1 -discard-value-names option 2016-03-13 21:05:23 +00:00
CGBuiltin.cpp [X86][AVX] Added support for lowering to VBROADCASTF128/VBROADCASTI128 with generic IR 2016-07-22 13:58:56 +00:00
CGCUDABuiltin.cpp [CUDA] Align kernel launch args correctly when the LLVM type's alignment is different from the clang type's alignment. 2016-07-27 22:36:21 +00:00
CGCUDANV.cpp [CUDA] Align kernel launch args correctly when the LLVM type's alignment is different from the clang type's alignment. 2016-07-27 22:36:21 +00:00
CGCUDARuntime.cpp Roll-back r250822. 2015-10-20 13:23:58 +00:00
CGCUDARuntime.h [CUDA] Emit host-side 'shadows' for device-side global variables 2016-03-02 18:28:50 +00:00
CGCXX.cpp revert SVN r265702, r265640 2016-04-08 16:52:00 +00:00
CGCXXABI.cpp Add the `pass_object_size` attribute to clang. 2015-12-02 21:58:08 +00:00
CGCXXABI.h [DebugInfo] Set DISubprogram ThisAdjustment in the MS ABI 2016-07-01 02:41:25 +00:00
CGCall.cpp [OpenCL] Add missing -cl-no-signed-zeros option into driver 2016-07-08 20:28:29 +00:00
CGCall.h [NFC] Header cleanup 2016-07-18 19:02:11 +00:00
CGClass.cpp When copying an array into a lambda, destroy temporaries from 2016-07-20 21:02:43 +00:00
CGCleanup.cpp [Temporary, Lifetime] Add lifetime marks for temporaries 2016-07-01 21:08:47 +00:00
CGCleanup.h Widen EHScope::ClenupBitFields::FixupDepth to avoid overflowing it (PR23490) 2016-06-22 16:21:14 +00:00
CGDebugInfo.cpp [OpenCL] Generate opaque type for sampler_t and function call for the initializer 2016-07-28 19:26:30 +00:00
CGDebugInfo.h [OpenCL] Generate opaque type for sampler_t and function call for the initializer 2016-07-28 19:26:30 +00:00
CGDecl.cpp P0217R3: Parsing support and framework for AST representation of C++1z 2016-07-22 23:36:59 +00:00
CGDeclCXX.cpp Clang changes for overloading invariant.start and end intrinsics 2016-07-22 17:50:08 +00:00
CGException.cpp [SEH] Remove nounwind/noinline from outlined finally funclets 2016-03-11 17:36:16 +00:00
CGExpr.cpp [OpenCL] Generate opaque type for sampler_t and function call for the initializer 2016-07-28 19:26:30 +00:00
CGExprAgg.cpp [OpenCL] Generate opaque type for sampler_t and function call for the initializer 2016-07-28 19:26:30 +00:00
CGExprCXX.cpp [CodeGen] Fix a segfault caused by pass_object_size. 2016-06-16 23:06:04 +00:00
CGExprComplex.cpp [OpenCL] Generate opaque type for sampler_t and function call for the initializer 2016-07-28 19:26:30 +00:00
CGExprConstant.cpp [OpenCL] Generate opaque type for sampler_t and function call for the initializer 2016-07-28 19:26:30 +00:00
CGExprScalar.cpp [OpenCL] Generate opaque type for sampler_t and function call for the initializer 2016-07-28 19:26:30 +00:00
CGLoopInfo.cpp Add loop pragma for Loop Distribution 2016-06-14 12:04:26 +00:00
CGLoopInfo.h [NFC] Header cleanup 2016-07-18 19:02:11 +00:00
CGObjC.cpp Remove CXXConstructExpr::getFoundDecl(); it turned out to not be useful. 2016-06-10 00:58:19 +00:00
CGObjCGNU.cpp CodeGen: honour dllstorage on ObjC types 2016-07-17 22:27:44 +00:00
CGObjCMac.cpp CodeGen: honour dllstorage on ObjC types 2016-07-17 22:27:44 +00:00
CGObjCRuntime.cpp Preserve ExtParameterInfos into CGFunctionInfo. 2016-03-11 04:30:31 +00:00
CGObjCRuntime.h Reduce the number of implicit StringRef->std::string conversions by threading StringRef through more APIs. 2016-02-13 13:42:54 +00:00
CGOpenCLRuntime.cpp [OpenCL] Generate opaque type for sampler_t and function call for the initializer 2016-07-28 19:26:30 +00:00
CGOpenCLRuntime.h [OpenCL] Generate opaque type for sampler_t and function call for the initializer 2016-07-28 19:26:30 +00:00
CGOpenMPRuntime.cpp [OpenMP] Change name of variable in mappable expression. 2016-07-28 15:31:29 +00:00
CGOpenMPRuntime.h [OpenMP] Codegen for use_device_ptr clause. 2016-07-28 14:23:26 +00:00
CGOpenMPRuntimeNVPTX.cpp [OPENMP] Codegen for teams directive for NVPTX 2016-04-04 15:55:02 +00:00
CGOpenMPRuntimeNVPTX.h [OPENMP] Codegen for teams directive for NVPTX 2016-04-04 15:55:02 +00:00
CGRecordLayout.h Make CodeGen headers self-contained. 2016-02-02 16:05:18 +00:00
CGRecordLayoutBuilder.cpp revert SVN r265702, r265640 2016-04-08 16:52:00 +00:00
CGStmt.cpp Reverting r275115 which caused PR28634. 2016-07-21 23:28:18 +00:00
CGStmtOpenMP.cpp [OpenMP] Codegen for use_device_ptr clause. 2016-07-28 14:23:26 +00:00
CGVTT.cpp Update clang for D20348 2016-06-14 21:02:05 +00:00
CGVTables.cpp [NFC] Header cleanup 2016-07-18 19:02:11 +00:00
CGVTables.h [CodeGen] Remove dead code. NFC. 2015-10-15 15:29:40 +00:00
CGValue.h [Sema] PR26444 fix crash when alignment value is >= 2**16 2016-03-02 06:48:47 +00:00
CMakeLists.txt Use the new path for coverage related headers and update CMakeLists.txt 2016-04-29 18:53:16 +00:00
CodeGenABITypes.cpp Various improvements to the public IRGen interface. 2016-05-18 05:21:18 +00:00
CodeGenAction.cpp [CodeGen] Handle recursion in LLVMIRGeneration Timer. 2016-07-21 06:28:48 +00:00
CodeGenFunction.cpp Add XRay flags to Clang. We implement two flags to control the XRay behaviour: 2016-07-13 22:32:15 +00:00
CodeGenFunction.h [OpenMP] Codegen for use_device_ptr clause. 2016-07-28 14:23:26 +00:00
CodeGenModule.cpp [OpenCL] Generate opaque type for sampler_t and function call for the initializer 2016-07-28 19:26:30 +00:00
CodeGenModule.h [OpenCL] Generate opaque type for sampler_t and function call for the initializer 2016-07-28 19:26:30 +00:00
CodeGenPGO.cpp [Coverage] Move logic to skip decl's into a helper (NFC) 2016-07-11 22:57:44 +00:00
CodeGenPGO.h [NFC] Header cleanup 2016-07-18 19:02:11 +00:00
CodeGenTBAA.cpp revert SVN r265702, r265640 2016-04-08 16:52:00 +00:00
CodeGenTBAA.h Make the remaining headers self-contained. 2016-02-02 14:24:21 +00:00
CodeGenTypeCache.h Compute and preserve alignment more faithfully in IR-generation. 2015-09-08 08:05:57 +00:00
CodeGenTypes.cpp Enable support for __float128 in Clang and enable it on pertinent platforms 2016-05-09 08:52:33 +00:00
CodeGenTypes.h [NFC] Header cleanup 2016-07-18 19:02:11 +00:00
CoverageMappingGen.cpp [Coverage] Do not write out coverage mappings with zero entries 2016-07-26 00:24:59 +00:00
CoverageMappingGen.h [NFC] Header cleanup 2016-07-18 19:02:11 +00:00
EHScopeStack.h [Temporary, Lifetime] Add lifetime marks for temporaries 2016-07-01 21:08:47 +00:00
ItaniumCXXABI.cpp Don't crash when generating code for __attribute__((naked)) member functions. 2016-07-27 22:04:24 +00:00
MicrosoftCXXABI.cpp Don't crash when generating code for __attribute__((naked)) member functions. 2016-07-27 22:04:24 +00:00
ModuleBuilder.cpp Various improvements to the public IRGen interface. 2016-05-18 05:21:18 +00:00
ObjectFilePCHContainerOperations.cpp Frontend: Simplify ownership model for clang's output streams. 2016-07-15 00:55:40 +00:00
README.txt
SanitizerMetadata.cpp [ASan] Initial support for Kernel AddressSanitizer 2015-06-19 12:19:07 +00:00
SanitizerMetadata.h Removing LLVM_DELETED_FUNCTION, as MSVC 2012 was the last reason for requiring the macro. NFC; Clang edition. 2015-02-15 22:54:08 +00:00
SwiftCallingConv.cpp Silencing a 32-bit shift implicit conversion warning from MSVC; NFC. 2016-04-08 12:21:58 +00:00
TargetInfo.cpp Adjust coercion of aggregates on RenderScript 2016-07-27 19:01:51 +00:00
TargetInfo.h [OpenCL] AMDGCN target will generate images in constant address space 2016-07-20 19:21:11 +00:00

README.txt

IRgen optimization opportunities.

//===---------------------------------------------------------------------===//

The common pattern of
--
short x; // or char, etc
(x == 10)
--
generates a zext/sext of x which can easily be avoided.
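
As a rough illustration (the function names are made up for this note),
the two functions below always agree. In C source both operands are still
promoted to int; the second function only mirrors, at source level, the
narrow compare that IRgen could emit instead of extending x first.

```c
#include <stdbool.h>

/* What the language rules describe: x is promoted to int and compared
   at int width, i.e. an extension followed by a wide compare. */
bool eq10_promoted(short x)
{
    return (int)x == 10;
}

/* The constant 10 fits in a short, so comparing at the narrow width
   gives the same answer and needs no extension of x. */
bool eq10_narrow(short x)
{
    return x == (short)10;
}
```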

//===---------------------------------------------------------------------===//

Bitfield accesses can be shifted to simplify masking and sign
extension. For example, if the bitfield width is 8 and it is
appropriately aligned then it is a lot shorter to just load the char
directly.
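
A small sketch of the two access paths, assuming a layout chosen purely
for illustration: a 32-bit storage unit held as four bytes in
little-endian order, with a signed 8-bit field in bits 8..15. The
function names are hypothetical.

```c
#include <stdint.h>

/* General path: load the whole unit, shift, mask, then sign-extend. */
int32_t field_via_mask(const uint8_t unit[4])
{
    uint32_t word = (uint32_t)unit[0] | ((uint32_t)unit[1] << 8) |
                    ((uint32_t)unit[2] << 16) | ((uint32_t)unit[3] << 24);
    uint32_t raw = (word >> 8) & 0xFFu;
    /* Portable sign extension of an 8-bit value. */
    return (int32_t)(raw ^ 0x80u) - 0x80;
}

/* Shortcut: the field occupies exactly one aligned byte, so load just
   that byte. The conversion to int8_t assumes the usual two's
   complement behaviour. */
int32_t field_via_byte(const uint8_t unit[4])
{
    return (int8_t)unit[1];
}
```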

//===---------------------------------------------------------------------===//

It may be worth avoiding creation of allocas for formal arguments
in the common situation where the argument is never written to and
never has its address taken. The idea would be to begin generating code
by using the argument directly and, if its address is taken or it is
stored to, then generate the alloca and patch up the existing code.

In theory, the same optimization could be a win for block local
variables as long as the declaration dominates all statements in the
block.

NOTE: The main case where this matters is -O0 -g compile time
performance, and in that scenario we currently need to emit the alloca
anyway in order to emit proper debug info. So this is blocked on
being able to emit debug information which refers to an LLVM
temporary, not an alloca.
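
For illustration only (function names invented for this note), the three
cases the scheme would have to distinguish look like this at source
level. The comments describe what the proposed approach would do, not
what IRgen does today.

```c
/* n is only read: IRgen could use the incoming argument value directly
   and never create a stack slot for it. */
int scale(int n)
{
    return n * 3 + 1;
}

/* n is stored to inside the loop: under the proposed scheme the first
   store would force creation of the alloca and patching of earlier uses. */
int sum_down(int n)
{
    int total = 0;
    while (n > 0) {
        total += n;
        n--;
    }
    return total;
}

/* The address of n is taken and passed to a helper, so n needs a real
   memory location. */
static void bump(int *p)
{
    *p += 1;
}

int bumped(int n)
{
    bump(&n);
    return n;
}
```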

//===---------------------------------------------------------------------===//

We should try to avoid generating basic blocks which only contain
jumps. At -O0, this penalizes us all the way from IRgen (malloc &
instruction overhead), all the way down through code generation and
assembly time.

On 176.gcc:expr.ll, it looks like over 12% of basic blocks are just
direct branches!
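
One source shape that tends to produce such blocks is a nested if with
no code after the inner statement. The function below is a made-up
example; the comment describes what a straightforward
statement-by-statement lowering would be expected to emit, which may
differ from the exact output of any given compiler version.

```c
int classify(int a, int b)
{
    int r = 0;
    if (a) {
        if (b)
            r = 1;
        /* Join point of the inner if: there is no code here, so a naive
           lowering creates a block whose only instruction is a branch
           to the join point of the outer if. */
    }
    return r;
}
```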

//===---------------------------------------------------------------------===//