llvm-project

Commit Graph

Author	SHA1	Message	Date
Yaxun Liu	b122ed9181	[AMDGPU] Temporarily change constant address space from 4 to 2 for the new address space mapping Change constant address space from 4 to 2 for the new address space mapping in Clang. Differential Revision: https://reviews.llvm.org/D31771 llvm-svn: 299691	2017-04-06 19:18:36 +00:00
Stanislav Mekhanoshin	921a42314b	[AMDGPU] Translate reqd_work_group_size into amdgpu_flat_work_group_size These two attributes specify the same info in a different way. AMGPU BE only checks the latter as a target specific attribute as opposed to language specific reqd_work_group_size. This change produces amdgpu_flat_work_group_size out of reqd_work_group_size if specified. Differential Revision: https://reviews.llvm.org/D31728 llvm-svn: 299678	2017-04-06 18:15:44 +00:00
Egor Churaev	a8d2451533	[OpenCL] Enables passing sampler initializer to function argument Reviewers: Anastasia, cfe-commits Reviewed By: Anastasia Subscribers: yaxunl, bader Differential Revision: https://reviews.llvm.org/D31594 llvm-svn: 299524	2017-04-05 09:02:56 +00:00
Jin-Gu Kang	e7cdcdea73	Preserve vec3 type. Summary: Preserve vec3 type with CodeGen option. Reviewers: Anastasia, bruno Reviewed By: Anastasia Subscribers: bruno, ahatanak, cfe-commits Differential Revision: https://reviews.llvm.org/D30810 llvm-svn: 299445	2017-04-04 16:40:25 +00:00
Egor Churaev	ba8b84d7fb	[OpenCL] Do not generate "kernel_arg_type_qual" metadata for non-pointer args Summary: "kernel_arg_type_qual" metadata should contain const/volatile/restrict tags only for pointer types to match the corresponding requirement of the OpenCL specification. OpenCL 2.0 spec 5.9.3 Kernel Object Queries: CL_KERNEL_ARG_TYPE_VOLATILE is returned if the argument is a pointer and the referenced type is declared with the volatile qualifier. [...] Similarly, CL_KERNEL_ARG_TYPE_CONST is returned if the argument is a pointer and the referenced type is declared with the restrict or const qualifier. [...] CL_KERNEL_ARG_TYPE_RESTRICT will be returned if the pointer type is marked restrict. Reviewers: Anastasia, cfe-commits Reviewed By: Anastasia Subscribers: bader, yaxunl Differential Revision: https://reviews.llvm.org/D31321 llvm-svn: 299192	2017-03-31 10:14:52 +00:00
Egor Churaev	45c26ee0bf	[OpenCL] Extended mapping of parcing CodeGen arguments Summary: Enable cl_mad_enamle and cl_no_signed_zeros options when user turns on cl_unsafe_math_optimizations or cl_fast_relaxed_math options. Reviewers: Anastasia, cfe-commits Reviewed By: Anastasia Subscribers: bader, yaxunl Differential Revision: https://reviews.llvm.org/D31324 llvm-svn: 298838	2017-03-27 10:38:01 +00:00
Yaxun Liu	3464f92e23	[AMDGPU] Switch address space mapping by triple environment amdgiz For target environment amdgiz and amdgizcl (giz means Generic Is Zero), AMDGPU will use new address space mapping where generic address space is 0 and private address space is 5. The data layout is also changed correspondingly. Differential Revision: https://reviews.llvm.org/D31210 llvm-svn: 298767	2017-03-25 03:46:25 +00:00
Konstantin Zhuravlyov	9c1e310c16	Fix array sizes where address space is not yet known For variables in generic address spaces, for example: ``` unsigned char V[6442450944]; ... ``` the address space is not yet known when we get into getConstantArrayType, it is 0. AMDGCN target's address space 0 has 32 bits pointers, so when we call getPointerWidth with 0, the array size is trimmed to 32 bits, which is not right. Differential Revision: https://reviews.llvm.org/D30845 llvm-svn: 298420	2017-03-21 18:55:39 +00:00
Egor Churaev	c217f37cb6	[OpenCL] Added implicit conversion rank for overloading functions with vector data type in OpenCL Summary: I added a new rank to ImplicitConversionRank enum to resolve the function overload ambiguity with vector types. Rank of scalar types conversion is lower than vector splat. So, we can choose which function should we call. See test for more details. Reviewers: Anastasia, cfe-commits Reviewed By: Anastasia Subscribers: bader, yaxunl Differential Revision: https://reviews.llvm.org/D30816 llvm-svn: 298366	2017-03-21 12:55:55 +00:00
Matt Arsenault	bf5e3e4391	AMDGPU: Make 0 the private nullptr value We can't actually pretend that 0 is valid for address space 0. r295877 added a workaround to stop allocating user objects there, so we can use 0 as the invalid pointer. Some of the tests seemed to be using private as the non-0 null test address space, so add copies using local to make sure this is still stressed. llvm-svn: 297659	2017-03-13 19:47:53 +00:00
Yaxun Liu	4d86799219	[AMDGPU] Add builtin functions readlane ds_permute mov_dpp Differential Revision: https://reviews.llvm.org/D30551 llvm-svn: 297436	2017-03-10 01:30:46 +00:00
Konstantin Zhuravlyov	2b4917fcc9	[DebugInfo] Append extended dereferencing mechanism to variables' DIExpression for targets that support more than one address space Differential Revision: https://reviews.llvm.org/D29673 llvm-svn: 297397	2017-03-09 18:06:23 +00:00
Konstantin Zhuravlyov	d1ba16e762	[DebugInfo] Add address space when creating DIDerivedTypes Differential Revision: https://reviews.llvm.org/D29671 llvm-svn: 297321	2017-03-08 23:56:48 +00:00
Jan Vesely	9488560bb8	AMDGPU: export s_sendmsg{halt} instrinsics Differential Revision: https://reviews.llvm.org/D30366 llvm-svn: 296241	2017-02-25 04:20:24 +00:00
Jan Vesely	c255097517	AMDGPU: export l1 cache invalidation intrinsics Differential Revision: https://reviews.llvm.org/D30360 llvm-svn: 296240	2017-02-25 04:20:22 +00:00
Jan Vesely	d26dbb389f	AMDGPU: export s_waitcnt builtin Differential Revision: https://reviews.llvm.org/D30359 llvm-svn: 296239	2017-02-25 04:20:20 +00:00
Matt Arsenault	a0c6dca15b	AMDGPU: Add fmed3 half builtin llvm-svn: 295874	2017-02-22 20:55:59 +00:00
Jan Vesely	a6f369c727	[OpenCL] r600 needs OpenCL kernel calling convention Differential Revision: https://reviews.llvm.org/D30236 llvm-svn: 295843	2017-02-22 15:01:42 +00:00
Anastasia Stulova	58984e7087	[OpenCL] Correct ndrange_t implementation Removed ndrange_t as Clang builtin type and added as a struct type in the OpenCL header. Use type name to do the Sema checking in enqueue_kernel and modify IR generation accordingly. Review: D28058 Patch by Dmitry Borisenkov! llvm-svn: 295311	2017-02-16 12:27:47 +00:00
Matt Arsenault	77ce553891	AMDGPU: Add a test checking alignments of emitted globals/allocas Make sure the spec required type alignments are used in preparation for a possible change which may break this. llvm-svn: 294278	2017-02-07 04:28:02 +00:00
Matt Arsenault	a274b209f5	AMDGPU: Add builtin for fmed3 intrinsic llvm-svn: 293600	2017-01-31 03:42:07 +00:00
Anastasia Stulova	af0a7bbbe2	[OpenCL] Add missing address spaces in IR generation of blocks Modify ObjC blocks impl wrt address spaces as follows: - keep default private address space for blocks generated as local variables (with captures); - add global address space for global block literals (no captures); - make the block invoke function and enqueue_kernel prototype with the generic AS block pointer parameter to accommodate both private and global AS cases from above; - add block handling into default AS because it's implemented as a special pointer type (BlockPointer) in the frontend and therefore it is used as a pointer everywhere. This is also needed to accommodate both private and global AS blocks for the two cases above. - removes ObjC RT specific symbols (NSConcreteStackBlock and NSConcreteGlobalBlock) in the OpenCL mode. Review: https://reviews.llvm.org/D28814 llvm-svn: 293286	2017-01-27 15:11:34 +00:00
Matt Arsenault	09cca093a3	AMDGPU: Update for changed subtarget feature name llvm-svn: 292838	2017-01-23 22:31:14 +00:00
Matt Arsenault	24b5ae4497	AMDGPU: Add builtin for getreg intrinsic llvm-svn: 292636	2017-01-20 19:24:22 +00:00
Egor Churaev	28f00aab73	[OpenCL] Align fake address space map with the SPIR target maps. Summary: We compile user opencl kernel code with spir triple. But built-ins are written in OpenCL and we compile it with triple x86_64 to be able to use x86 intrinsics. And we need address spaces to match in both cases. So, we change fake address space map in OpenCL for matching with spir. On CPU address spaces are not really important but we'd like to preserve address space information in order to perform optimizations relying on this info like enhanced alias analysis. Reviewers: pekka.jaaskelainen, Anastasia Subscribers: pekka.jaaskelainen, yaxunl, bader, cfe-commits Differential Revision: https://reviews.llvm.org/D28048 llvm-svn: 290436	2016-12-23 16:11:25 +00:00
Egor Churaev	89831421af	Fix problems in "[OpenCL] Enabling the usage of CLK_NULL_QUEUE as compare operand." Summary: Fixed warnings in commit: https://reviews.llvm.org/rL290171 Reviewers: djasper, Anastasia Subscribers: yaxunl, cfe-commits, bader Differential Revision: https://reviews.llvm.org/D27981 llvm-svn: 290431	2016-12-23 14:55:49 +00:00
Chandler Carruth	fcd33149b4	Cleanup the handling of noinline function attributes, -fno-inline, -fno-inline-functions, -O0, and optnone. These were really, really tangled together: - We used the noinline LLVM attribute for -fno-inline - But not for -fno-inline-functions (breaking LTO) - But we did use it for -finline-hint-functions (yay, LTO is happy!) - But we didn't for -O0 (LTO is sad yet again...) - We had weird structuring of CodeGenOpts with both an inlining enumeration and a boolean. They interacted in weird ways and needlessly. - A lot of set smashing went on with setting these, and then got worse when we considered optnone and other inlining-effecting attributes. - A bunch of inline affecting attributes were managed in a completely different place from -fno-inline. - Even with -fno-inline we failed to put the LLVM noinline attribute onto many generated function definitions because they didn't show up as AST-level functions. - If you passed -O0 but -finline-functions we would run the normal inliner pass in LLVM despite it being in the O0 pipeline, which really doesn't make much sense. - Lastly, we used things like '-fno-inline' to manipulate the pass pipeline which forced the pass pipeline to be much more parameterizable than it really needs to be. Instead we can just use the optimization level to select a pipeline and control the rest via attributes. Sadly, this causes a bunch of churn in tests because we don't run the optimizer in the tests and check the contents of attribute sets. It would be awesome if attribute sets were a bit more FileCheck friendly, but oh well. I think this is a significant improvement and should remove the semantic need to change what inliner pass we run in order to comply with the requested inlining semantics by relying completely on attributes. It also cleans up tho optnone and related handling a bit. One unfortunate aspect of this is that for generating alwaysinline routines like those in OpenMP we end up removing noinline and then adding alwaysinline. I tried a bunch of other approaches, but because we recompute function attributes from scratch and don't have a declaration here I couldn't find anything substantially cleaner than this. Differential Revision: https://reviews.llvm.org/D28053 llvm-svn: 290398	2016-12-23 01:24:49 +00:00
George Burgess IV	e37633713d	Add the alloc_size attribute to clang, attempt 2. This is a recommit of r290149, which was reverted in r290169 due to msan failures. msan was failing because we were calling `isMostDerivedAnUnsizedArray` on an invalid designator, which caused us to read uninitialized memory. To fix this, the logic of the caller of said function was simplified, and we now have a `!Invalid` assert in `isMostDerivedAnUnsizedArray`, so we can catch this particular bug more easily in the future. Fingers crossed that this patch sticks this time. :) Original commit message: This patch does three things: - Gives us the alloc_size attribute in clang, which lets us infer the number of bytes handed back to us by malloc/realloc/calloc/any user functions that act in a similar manner. - Teaches our constexpr evaluator that evaluating some `const` variables is OK sometimes. This is why we have a change in test/SemaCXX/constant-expression-cxx11.cpp and other seemingly unrelated tests. Richard Smith okay'ed this idea some time ago in person. - Uniques some Blocks in CodeGen, which was reviewed separately at D26410. Lack of uniquing only really shows up as a problem when combined with our new eagerness in the face of const. llvm-svn: 290297	2016-12-22 02:50:20 +00:00
Daniel Jasper	9068938eb0	Revert "[OpenCL] Enabling the usage of CLK_NULL_QUEUE as compare operand." This reverts commit r290171. It triggers a bunch of warnings, because the new enumerator isn't handled in all switches. We want a warning-free build. Replied on the commit with more details. llvm-svn: 290173	2016-12-20 10:05:04 +00:00
Egor Churaev	67c3f3ec68	[OpenCL] Enabling the usage of CLK_NULL_QUEUE as compare operand. Summary: Enabling the compression of CLK_NULL_QUEUE to variable of type queue_t. Reviewers: Anastasia Subscribers: cfe-commits, yaxunl, bader Differential Revision: https://reviews.llvm.org/D27569 llvm-svn: 290171	2016-12-20 09:15:21 +00:00
Chandler Carruth	d7738fe6ad	Revert r290149: Add the alloc_size attribute to clang. This commit fails MSan when running test/CodeGen/object-size.c in a confusing way. After some discussion with George, it isn't really clear what is going on here. We can make the MSan failure go away by testing for the invalid bit, but why things are invalid isn't clear. And yet, other code in the surrounding area is doing precisely this and testing for invalid. George is going to take a closer look at this to better understand the nature of the failure and recommit it, for now backing it out to clean up MSan builds. llvm-svn: 290169	2016-12-20 08:28:19 +00:00
George Burgess IV	a747027bc6	Add the alloc_size attribute to clang. This patch does three things: - Gives us the alloc_size attribute in clang, which lets us infer the number of bytes handed back to us by malloc/realloc/calloc/any user functions that act in a similar manner. - Teaches our constexpr evaluator that evaluating some `const` variables is OK sometimes. This is why we have a change in test/SemaCXX/constant-expression-cxx11.cpp and other seemingly unrelated tests. Richard Smith okay'ed this idea some time ago in person. - Uniques some Blocks in CodeGen, which was reviewed separately at D26410. Lack of uniquing only really shows up as a problem when combined with our new eagerness in the face of const. Differential Revision: https://reviews.llvm.org/D14274 llvm-svn: 290149	2016-12-20 01:05:42 +00:00
Yaxun Liu	5b74665a41	Recommit r289979 [OpenCL] Allow disabling types and declarations associated with extensions Fixed undefined behavior due to cast integer to bool in initializer list. llvm-svn: 290056	2016-12-18 05:18:55 +00:00
Yaxun Liu	35f6d66b0d	Revert r289979 due to regressions llvm-svn: 289991	2016-12-16 21:23:55 +00:00
Yaxun Liu	2e8331cab6	[OpenCL] Allow disabling types and declarations associated with extensions Added a map to associate types and declarations with extensions. Refactored existing diagnostic for disabled types associated with extensions and extended it to declarations for generic situation. Fixed some bugs for types associated with extensions. Allow users to use pragma to declare types and functions for supported extensions, e.g. #pragma OPENCL EXTENSION the_new_extension_name : begin // declare types and functions associated with the extension here #pragma OPENCL EXTENSION the_new_extension_name : end Differential Revision: https://reviews.llvm.org/D21698 llvm-svn: 289979	2016-12-16 19:22:08 +00:00
Yaxun Liu	402804b6d6	Re-commit r289252 and r289285, and fix PR31374 llvm-svn: 289787	2016-12-15 08:09:08 +00:00
Nico Weber	7849eeb035	Revert 289252 (and follow-up 289285), it caused PR31374 llvm-svn: 289713	2016-12-14 21:38:18 +00:00
Neil Hickey	c881be1c23	Fixing build failure by adding triple option to new test condition. Adding -triple option to ensure target supports double for fpmath test. llvm-svn: 289552	2016-12-13 17:04:33 +00:00
Neil Hickey	88c0fac534	Improve handling of floating point literals in OpenCL to only use double precision if the target supports fp64. This change makes sure single-precision floating point types are used if the cl_fp64 extension is not supported by the target. Also removed the check to see whether the OpenCL version is >= 1.2, as this has been incorporated into the extension setting code. Differential Revision: https://reviews.llvm.org/D24235 llvm-svn: 289544	2016-12-13 16:22:50 +00:00
Egor Churaev	24939d479e	[OpenCL] Enable unroll hint for OpenCL 1.x. Summary: Although the feature was introduced only in OpenCL C v2.0 spec., it's useful for OpenCL 1.x too and doesn't require HW support. Reviewers: Anastasia Subscribers: yaxunl, cfe-commits, bader Differential Revision: https://reviews.llvm.org/D27453 llvm-svn: 289535	2016-12-13 14:02:35 +00:00
Yaxun Liu	8f66b4b44a	Add support for non-zero null pointer for C and OpenCL In amdgcn target, null pointers in global, constant, and generic address space take value 0 but null pointers in private and local address space take value -1. Currently LLVM assumes all null pointers take value 0, which results in incorrectly translated IR. To workaround this issue, instead of emit null pointers in local and private address space, a null pointer in generic address space is emitted and casted to local and private address space. Tentative definition of global variables with non-zero initializer will have weak linkage instead of common linkage since common linkage requires zero initializer and does not have explicit section to hold the non-zero value. Virtual member functions getNullPointer and performAddrSpaceCast are added to TargetCodeGenInfo which by default returns ConstantPointerNull and emitting addrspacecast instruction. A virtual member function getNullPointerValue is added to TargetInfo which by default returns 0. Each target can override these virtual functions to get target specific null pointer and the null pointer value for specific address space, and perform specific translations for addrspacecast. Wrapper functions getNullPointer is added to CodegenModule and getTargetNullPointerValue is added to ASTContext to facilitate getting the target specific null pointers and their values. This change has no effect on other targets except amdgcn target. Other targets can provide support of non-zero null pointer in a similar way. This change only provides support for non-zero null pointer for C and OpenCL. Supporting for other languages will be added later incrementally. Differential Revision: https://reviews.llvm.org/D26196 llvm-svn: 289252	2016-12-09 19:01:11 +00:00
Alexey Bader	a60db59d6f	[OpenCL] Added a LIT test for ensuring address space mangling is done the same both in OpenCL1.2 and OpenCL2.0. Patch by Egor Churaev (echuraev). Reviewers: Anastasia Subscribers: yaxunl, cfe-commits, bader Differential Revision: https://reviews.llvm.org/D27403 llvm-svn: 288891	2016-12-07 08:43:49 +00:00
Alexey Bader	b3190829e5	[OpenCL] Fix SPIR version generation. Patch by Egor Churaev (echuraev). Reviewers: Anastasia Subscribers: bader, yaxunl, cfe-commits Differential Revision: https://reviews.llvm.org/D27300 llvm-svn: 288890	2016-12-07 08:38:24 +00:00
Anastasia Stulova	e4a1c38109	[OpenCL] Prevent generation of globals in non-constant AS for OpenCL. Avoid using shortcut for const qualified non-constant address space aggregate variables while generating them on the stack such that the alloca object is used instead of a global variable containing initializer. Review: https://reviews.llvm.org/D27109 llvm-svn: 288163	2016-11-29 17:01:19 +00:00
Konstantin Zhuravlyov	62ae8f671c	[AMDGPU] Change frexp.exp builtin to return i16 for f16 input Differential Revision: https://reviews.llvm.org/D26863 llvm-svn: 287390	2016-11-18 22:31:51 +00:00
Stanislav Mekhanoshin	cd433d2811	[AMDGPU] Add wave barrier builtin The wave barrier represents the discardable barrier. Its main purpose is to carry convergent attribute, thus preventing illegal CFG optimizations. All lanes in a wave come to convergence point simultaneously with SIMT, thus no special instruction is needed in the ISA. The barrier is discarded during code generation. Differential Revision: https://reviews.llvm.org/D26584 llvm-svn: 287006	2016-11-15 18:58:03 +00:00
Anastasia Stulova	0df4ac3f94	[OpenCL] Fix for integer parameters of enqueue_kernel Make handling integer parameters more flexible: - For the number of events argument allow to pass larger integers than 32 bits as soon as compiler can prove that the range fits in 32 bits. If not, the diagnostic will be given. - Change type of the arguments specifying the sizes of the corresponding block arguments to be size_t. Review: https://reviews.llvm.org/D26509 llvm-svn: 286849	2016-11-14 17:39:58 +00:00
Anastasia Stulova	2b46120a09	[OpenCL] Change to clk_event parameter in enqueue_kernel. - Accept NULL pointer as a valid parameter value for clk_event. - Generate clk_event_t arguments of internal __enqueue_kernel_XXX function as pointers in generic address space. Review: https://reviews.llvm.org/D26507 llvm-svn: 286836	2016-11-14 15:34:01 +00:00
Pekka Jaaskelainen	5136dd81ad	Fix r286819 (accidentally patched multiple times. llvm-svn: 286821	2016-11-14 13:14:38 +00:00
Pekka Jaaskelainen	2a1cc587bf	[OpenCL] always use SPIR address spaces for kernel_arg_addr_space MD It doesn't make sense to use the target's address space ids in this context as this is metadata that should be referring to the "logical" OpenCL address spaces. For flat AS machines like all "CPUs" in general, the logical AS info gets lost as there's only one address space (0). This commit changes the logic such that we always use the SPIR address space ids for the argument metadata. It thus allows implementing the clGetKernelArgInfo() and the other detection needs. https://reviews.llvm.org/D26157 llvm-svn: 286819	2016-11-14 13:08:30 +00:00

1 2 3 4 5

232 Commits