llvm-project

Commit Graph

Author	SHA1	Message	Date
Daniel Neilson	6e938effaa	Change memcpy/memmove/memset to have dest and source alignment attributes (Step 1). Summary: Upstream LLVM is changing the prototypes of the @llvm.memcpy/memmove/memset intrinsics. This change updates the Clang tests for this change. The @llvm.memcpy/memmove/memset intrinsics currently have an explicit argument which is required to be a constant integer. It represents the alignment of the dest (and source), and so must be the minimum of the actual alignment of the two. This change removes the alignment argument in favour of placing the alignment attribute on the source and destination pointers of the memory intrinsic call. For example, code which used to read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 100, i32 4, i1 false) will now read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 4 %dest, i8* align 4 %src, i32 100, i1 false) At this time the source and destination alignments must be the same (Step 1). Step 2 of the change, to be landed shortly, will relax that constraint and allow the source and destination to have different alignments. llvm-svn: 322964	2018-01-19 17:12:54 +00:00
Yaxun Liu	c325d30d2c	CodeGen: Fix invalid bitcasts for memcpy CreateCoercedLoad/CreateCoercedStore assumes pointer argument of memcpy is in addr space 0, which is not correct and causes invalid bitcasts for triple amdgcn---amdgiz. It is fixed by using alloca addr space instead. Differential Revision: https://reviews.llvm.org/D40806 llvm-svn: 320000	2017-12-07 01:39:52 +00:00
Alexey Bader	bed400957b	[OpenCL] Fix code generation of function-scope constant samplers. Summary: Constant samplers are handled as static variables in clang's code generation library, which leads to llvm::unreachable. We bypass emitting the sampler variable as static since it's translated to a function call later. Reviewers: yaxunl, Anastasia Reviewed By: yaxunl, Anastasia Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D34342 llvm-svn: 318290	2017-11-15 11:38:17 +00:00
Matt Arsenault	a5888a730d	OpenCL: Assume inline asm is convergent Already done for CUDA. llvm-svn: 318098	2017-11-13 22:40:55 +00:00
Yaxun Liu	e45b3d5dad	CodeGen: Fix missing debug loc due to alloca Builder save/restores insertion pointer when emitting addr space cast for alloca, but does not save/restore debug loc, which causes verifier failure for certain call instructions. This patch fixes that. Differential Revision: https://reviews.llvm.org/D39069 llvm-svn: 316484	2017-10-24 19:14:43 +00:00
Yaxun Liu	bd3823618d	CodeGen: Fix invalid bitcast in partial initialization of automatic array variable Differential Revision: https://reviews.llvm.org/D39184 llvm-svn: 316353	2017-10-23 17:49:26 +00:00
Yaxun Liu	f2127d1728	[AMDGPU] Fix bug in enqueued block codegen due to an extra line llvm-svn: 316165	2017-10-19 15:56:13 +00:00
Yaxun Liu	8ab5ab066a	CodeGen: Fix invalid bitcasts for atomic builtins Currently clang assumes the temporary variables emitted during codegen of atomic builtins have address space 0, which is not true for target triple amdgcn---amdgiz and causes invalid bitcasts. This patch fixes that. Differential Revision: https://reviews.llvm.org/D38966 llvm-svn: 316000	2017-10-17 14:19:29 +00:00
Yaxun Liu	c2a87a05f1	[OpenCL] Emit enqueued block as kernel In OpenCL, kernel functions and non-kernel functions have different calling conventions. For certain targets they have different argument ABIs. Also kernels have special function attributes and metadata for runtime to launch them. The blocks passed to enqueue_kernel are supposed to be executed as kernels. As such, the block invoke function should be emitted as a kernel with the proper calling convention and argument ABI. This patch emits enqueued block as kernel. If a block is both called directly and passed to enqueue_kernel, separate functions will be generated. Differential Revision: https://reviews.llvm.org/D38134 llvm-svn: 315804	2017-10-14 12:23:50 +00:00
Yaxun Liu	d029234fc6	Fix regression of test/CodeGenOpenCL/address-spaces.cl on ppc llvm-svn: 315678	2017-10-13 13:53:06 +00:00
Yaxun Liu	b7318e02c1	[OpenCL] Add LangAS::opencl_private to represent private address space in AST Currently Clang uses default address space (0) to represent private address space for OpenCL in AST. There are two issues with this: (1) multiple address spaces including private address space cannot be diagnosed; (2) there is no mangling for default address space. For example, if private int* is emitted as i32 addrspace(5)* in IR, it is supposed to be mangled as PUAS5i but it is mangled as Pi instead. This patch attempts to represent OpenCL private address space explicitly in AST. It adds a new enum LangAS::opencl_private and adds it to the variable types which are implicitly private: automatic variables without address space qualifier, and function parameter pointee types without address space qualifier (OpenCL 1.2 and below). Differential Revision: https://reviews.llvm.org/D35082 llvm-svn: 315668	2017-10-13 03:37:48 +00:00
Matt Arsenault	f12e3b848a	AMDGPU: Add read_exec_lo/hi builtins llvm-svn: 315238	2017-10-09 20:06:37 +00:00
Matt Arsenault	cbe0dd13d2	AMDGPU: Fix missing declaration for __builtin_amdgcn_dispatch_ptr llvm-svn: 315219	2017-10-09 17:44:18 +00:00
Matt Arsenault	a1cf61b6fc	OpenCL: Assume functions are convergent This was done for CUDA functions in r261779, and for the same reason this also needs to be done for OpenCL. An arbitrary function could have a barrier() call in it, which in turn requires the calling function to be convergent. llvm-svn: 315094	2017-10-06 19:34:40 +00:00
Yaxun Liu	10712d9203	[OpenCL] Clean up and add missing fields for block struct Currently block is translated to a structure equivalent to struct Block { void *isa; int flags; int reserved; void *invoke; void *descriptor; }; Except invoke, which is the pointer to the block invoke function, all other fields are useless for OpenCL, which clutter the IR and also waste memory since the block struct is passed to the block invoke function as argument. On the other hand, the size and alignment of the block struct is not stored in the struct, which causes difficulty to implement __enqueue_kernel as library function, since the library function needs to know the size and alignment of the argument which needs to be passed to the kernel. This patch removes the useless fields from the block struct and adds size and align fields. The equivalent block struct will become struct Block { int size; int align; generic void *invoke; /* custom fields */ }; It also changes the pointer to the invoke function to be a generic pointer since the address space of a function may not be private on certain targets. Differential Revision: https://reviews.llvm.org/D37822 llvm-svn: 314932	2017-10-04 20:32:17 +00:00
Anastasia Stulova	89a947da72	[OpenCL] Fixed CL version in failing test. llvm-svn: 314317	2017-09-27 17:03:35 +00:00
Anastasia Stulova	0a72ed40d3	[OpenCL] Handle address space conversion while setting type alignment. Added missing addrspacecast case in alignment computation logic of pointer type emission in IR generation. Differential Revision: https://reviews.llvm.org/D37804 llvm-svn: 314304	2017-09-27 14:37:00 +00:00
Yaxun Liu	1f33d8eb19	Add more tests for OpenCL atomic builtin functions Add tests for different address spaces and insert some blank lines to make them more readable. Differential Revision: https://reviews.llvm.org/D37742 llvm-svn: 313172	2017-09-13 18:56:25 +00:00
Yaxun Liu	e37793545e	[AMDGPU] Change addr space of clk_event_t, queue_t and reserve_id_t to global Differential Revision: https://reviews.llvm.org/D37703 llvm-svn: 313171	2017-09-13 18:50:42 +00:00
Jan Vesely	31ecb4bf60	[OpenCL] Add half load and store builtins This enables load/stores of half type, without half being a legal type. Differential Revision: https://reviews.llvm.org/D37231 llvm-svn: 312742	2017-09-07 19:39:10 +00:00
Yaxun Liu	29a5ee358e	[OpenCL] Do not use vararg in emitted functions for enqueue_kernel Not all targets support vararg (e.g. amdgpu). Instead of using vararg in the emitted functions for enqueue_kernel, this patch creates a temporary array of size_t, stores the size arguments in the temporary array and passes it to the emitted functions for enqueue_kernel. Differential Revision: https://reviews.llvm.org/D36678 llvm-svn: 312441	2017-09-03 13:52:24 +00:00
Adrian Prantl	9e83fb0838	Adapt testcases to LLVM change r312144 in DIGlobalVariableExpression llvm-svn: 312148	2017-08-30 18:22:23 +00:00
Reid Kleckner	6d353348e5	Parse and print DIExpressions inline to ease IR and MIR testing Summary: Most DIExpressions are empty or very simple. When they are complex, they tend to be unique, so checking them inline is reasonable. This also avoids the need for CodeGen passes to append to the llvm.dbg.mir named md node. See also PR22780, for making DIExpression not be an MDNode. Reviewers: aprantl, dexonsmith, dblaikie Subscribers: qcolombet, javed.absar, eraman, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D37075 llvm-svn: 311594	2017-08-23 20:31:27 +00:00
Yaxun Liu	504d4e2403	Attempt to fix failure in CodeGenOpenCL/atomic-ops.cl again llvm-svn: 310937	2017-08-15 17:59:26 +00:00
Yaxun Liu	adcfe5d881	Attempt to fix failure in CodeGenOpenCL/atomic-ops.cl llvm-svn: 310932	2017-08-15 17:16:44 +00:00
Yaxun Liu	99d56d291f	Remove -finclude-default-header in OpenCL atomic tests Differential Revision: https://reviews.llvm.org/D36676 llvm-svn: 310927	2017-08-15 16:30:31 +00:00
Yaxun Liu	30d652a447	[OpenCL] Support variable memory scope in atomic builtins Differential Revision: https://reviews.llvm.org/D36580 llvm-svn: 310924	2017-08-15 16:02:49 +00:00
Sven van Haastregt	efb4d4c78c	[OpenCL] Allow targets to select address space per type Generalize getOpenCLImageAddrSpace into getOpenCLTypeAddrSpace, such that targets can select the address space per type. No functional changes intended. Initial patch by Simon Perretta. Differential Revision: https://reviews.llvm.org/D33989 llvm-svn: 310911	2017-08-15 09:38:18 +00:00
Matt Arsenault	3fe7395fbc	AMDGPU: Use direct struct returns and arguments This is an improvement over always using byval for structs. This will use registers until ~16 are used, and then switch back to byval. This needs more work, since I'm not sure it ever really makes sense to use byval. If the register limit is exceeded, the arguments still end up passed on the stack, but with a different ABI. It also may make sense to base this on number of registers used for non-struct arguments, rather than just arguments that appear first in the argument list. llvm-svn: 310527	2017-08-09 21:44:58 +00:00
Yaxun Liu	39195062c2	Add OpenCL 2.0 atomic builtin functions as Clang builtin OpenCL 2.0 atomic builtin functions have a scope argument which is ideally represented as synchronization scope argument in LLVM atomic instructions. Clang supports translating Clang atomic builtin functions to LLVM atomic instructions. However it currently does not support synchronization scope of LLVM atomic instructions. Without this, users have to use LLVM assembly code to implement OpenCL atomic builtin functions. This patch adds OpenCL 2.0 atomic builtin functions as Clang builtin functions, which supports generating LLVM atomic instructions with synchronization scope operand. Currently only constant memory scope argument is supported. Support of non-constant memory scope argument will be added later. Differential Revision: https://reviews.llvm.org/D28691 llvm-svn: 310082	2017-08-04 18:16:31 +00:00
Joey Gouly	fa76b49cef	[OpenCL] Add missing subgroup builtins This adds get_kernel_max_sub_group_size_for_ndrange and get_kernel_sub_group_count_for_ndrange. llvm-svn: 309678	2017-08-01 13:27:09 +00:00
Joey Gouly	53160cdc45	[OpenCL] Enable subgroup extension in tests This fixes the test, so that it can be run on different hosts that may have different OpenCL extensions enabled. llvm-svn: 309571	2017-07-31 15:50:27 +00:00
Joey Gouly	84ae3364df	[OpenCL] Add extension Sema check for subgroup builtins Check the subgroup extension is enabled, before doing other Sema checks. llvm-svn: 309567	2017-07-31 15:15:59 +00:00
Alexey Sotkin	7d7f0dc08b	[OpenCL] Fix access qualifiers metadata for kernel arguments with typedef Subscribers: cfe-commits, yaxunl, Anastasia Differential Revision: https://reviews.llvm.org/D35420 llvm-svn: 309155	2017-07-26 18:49:54 +00:00
Egor Churaev	53f9a30543	[OpenCL] Added extended tests on metadata generation for half data type and arrays. Reviewers: Anastasia Reviewed By: Anastasia Subscribers: bader, cfe-commits, yaxunl Differential Revision: https://reviews.llvm.org/D35000 llvm-svn: 308266	2017-07-18 06:04:01 +00:00
Yaxun Liu	25d1b4341f	[AMDGPU] Fix size and alignment of size_t and pointer types Differential Revision: https://reviews.llvm.org/D34995 llvm-svn: 307121	2017-07-05 04:58:24 +00:00
Yaxun Liu	3ba4a720ad	[AMDGPU] Fix regressions on mesa/clover with libclc due to address space Currently AMDGPUTargetInfo does not initialize AddrSpaceMap in constructor, which causes regressions in mesa/clover with libclc. This patch fixes that. Differential Revision: https://reviews.llvm.org/D34987 llvm-svn: 307105	2017-07-04 19:57:18 +00:00
Yaxun Liu	e9e5c4f975	CodeGen: Fix invalid bitcast for coerced function argument Clang assumes coerced function argument is in address space 0, which is not always true and results in invalid bitcasts. This patch fixes failure in OpenCL conformance test api/get_kernel_arg_info with amdgcn---amdgizcl triple, where non-zero alloca address space is used. Differential Revision: https://reviews.llvm.org/D34777 llvm-svn: 306721	2017-06-29 18:47:45 +00:00
Alexey Bader	364a11651e	[OpenCL] Fix OpenCL and SPIR version metadata generation. Summary: OpenCL and SPIR version metadata must be generated once per module instead of once per mangled global value. Reviewers: Anastasia, yaxunl Reviewed By: Anastasia Subscribers: ahatanak, cfe-commits Differential Revision: https://reviews.llvm.org/D34235 llvm-svn: 305796	2017-06-20 14:30:18 +00:00
Pekka Jaaskelainen	2eb0bcc9e6	[OpenCL] spir_kern by default: fix old test cases llvm-svn: 304396	2017-06-01 08:19:43 +00:00
Pekka Jaaskelainen	fc2629a65a	[OpenCL] Makes kernels use the SPIR_KERNEL CC by default. Rationale: OpenCL kernels are called via an explicit runtime API with arguments set with clSetKernelArg(), not as normal sub-functions. Return SPIR_KERNEL by default as the kernel calling convention to ensure the fingerprint is fixed in such a way that each OpenCL argument gets one matching argument in the produced kernel function argument list, enabling a feasible implementation of clSetKernelArg() with aggregates etc. If we used the default C calling convention here, clSetKernelArg() might break depending on the target-specific conventions; different targets might split structs passed as values to multiple function arguments etc. https://reviews.llvm.org/D33639 llvm-svn: 304389	2017-06-01 07:18:49 +00:00
Javed Absar	0841d620c5	Fix issue with test that caused buildbot failure These tests did not specify the target. The failure was triggered by change - https://reviews.llvm.org/D33205 http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15-full/builds/7314 which sets vector alignment to 8-byte for arm-targets (except for Android). So, fixing the test to make it target specific. llvm-svn: 304210	2017-05-30 13:34:26 +00:00
Egor Churaev	dd7d82c408	[OpenCL] Test on half immediate support. Reviewers: Anastasia Reviewed By: Anastasia Subscribers: yaxunl, cfe-commits, bader Differential Revision: https://reviews.llvm.org/D33592 llvm-svn: 304134	2017-05-29 07:44:22 +00:00
Mehdi Amini	6aa9e9b41a	IRGen: Add optnone attribute on function during O0 Among other things, this will help LTO to correctly handle/honor files compiled with O0, helping debugging failures. It also seems in line with how we handle other options, like how -fno-inline adds the appropriate attribute as well. Differential Revision: https://reviews.llvm.org/D28404 llvm-svn: 304127	2017-05-29 05:38:20 +00:00
Konstantin Zhuravlyov	1f144a18ff	Resubmit r303861. [AMDGPU] add __builtin_amdgcn_s_getpc Patch by Tim Corringham llvm-svn: 304033	2017-05-26 21:08:20 +00:00
Reid Kleckner	581a6c5d56	Revert "[AMDGPU] add __builtin_amdgcn_s_getpc" This reverts commit r303861, the LLVM intrinsic was reverted. llvm-svn: 303908	2017-05-25 20:28:26 +00:00
Tim Corringham	702fe45bcd	[AMDGPU] add __builtin_amdgcn_s_getpc Summary: Added the builtin corresponding to the s_getpc intrinsic added in llvm D32862 Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D33276 llvm-svn: 303861	2017-05-25 14:16:11 +00:00
Yaxun Liu	af3d4db64b	[AMDGPU] Do not require opencl triple environment for OpenCL A recent change requires opencl triple environment for compiling OpenCL program, which causes regressions in libclc. This patch fixes that. Instead of deducing language based on triple environment, it checks LangOptions. Differential Revision: https://reviews.llvm.org/D33445 llvm-svn: 303644	2017-05-23 16:15:53 +00:00
Yaxun Liu	6d96f16347	CodeGen: Cast alloca to expected address space Alloca always returns a pointer in alloca address space, which may be different from the type defined by the language. For example, in C++ the auto variables are in the default address space. Therefore cast alloca to the expected address space when necessary. Differential Revision: https://reviews.llvm.org/D32248 llvm-svn: 303370	2017-05-18 18:51:09 +00:00
Yaxun Liu	4f33b3d396	[OpenCL] Emit function-scope variable in constant address space as static variable Differential Revision: https://reviews.llvm.org/D32977 llvm-svn: 303072	2017-05-15 14:47:47 +00:00
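
The intrinsic signature change described in commit 6e938effaa can be illustrated with a small text rewrite. The following is only a sketch, not the tooling used upstream: the helper name `rewrite_call` and the regular expression are hypothetical, and it covers just the two-pointer memcpy/memmove call shape quoted in that commit message, assuming a nonzero alignment as in the quoted example. It moves the explicit `i32` alignment argument onto both pointers as an `align` attribute, using the same value for source and destination as Step 1 requires.

```python
import re

# Hypothetical pattern for the pre-change call form quoted in the commit message:
#   call void @llvm.memcpy...(<ty> %dest, <ty> %src, <len>, i32 <align>, i1 <volatile>)
OLD_CALL = re.compile(
    r"(?P<head>call void @llvm\.mem(?:cpy|move)\.[\w.]+)\("
    r"(?P<dty>[^,%]+?) (?P<dst>%[\w.]+), "
    r"(?P<sty>[^,%]+?) (?P<src>%[\w.]+), "
    r"(?P<len>i\d+ [^,]+), "
    r"i32 (?P<align>\d+), "
    r"(?P<vol>i1 \w+)\)"
)


def rewrite_call(line: str) -> str:
    """Rewrite an old-style memcpy/memmove call to the attribute-based form.

    Lines that do not match the old call shape are returned unchanged.
    """

    def repl(m: re.Match) -> str:
        align = m["align"]
        # Step 1: the same alignment is placed on both pointer arguments.
        return (
            f"{m['head']}({m['dty']} align {align} {m['dst']}, "
            f"{m['sty']} align {align} {m['src']}, {m['len']}, {m['vol']})"
        )

    return OLD_CALL.sub(repl, line)


if __name__ == "__main__":
    old = ("call void @llvm.memcpy.p0i8.p0i8.i32("
           "i8* %dest, i8* %src, i32 100, i32 4, i1 false)")
    print(rewrite_call(old))
```

Applied to the old-form call from the commit message, the sketch produces the new-form call quoted there. The memset intrinsic, which takes a single pointer, is deliberately left out.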


288 Commits