llvm-project

Commit Graph

Author	SHA1	Message	Date
Anastasia Stulova	ab4a5d14b5	[OpenCL] Fix vector literal test broken in rL367675. Avoid unnecessarily checking alignment, which is not portable among targets. llvm-svn: 367823	2019-08-05 09:50:28 +00:00
Tim Northover	a009a60a91	IR: print value numbers for unnamed function arguments For consistency with normal instructions and clarity when reading IR, it's best to print the %0, %1, ... names of function arguments in definitions. Also modifies the parser to accept IR in that form for obvious reasons. llvm-svn: 367755	2019-08-03 14:28:34 +00:00
Anastasia Stulova	8d99a5c0e6	[OpenCL] Allow OpenCL C style vector initialization in C++ Allow creating vector literals from other vectors. float4 a = (float4)(1.0f, 2.0f, 3.0f, 4.0f); float4 v = (float4)(a.s23, a.s01); Differential revision: https://reviews.llvm.org/D65286 llvm-svn: 367675	2019-08-02 11:19:35 +00:00
Matt Arsenault	64d7af09f5	AMDGPU: Add missing builtin declarations llvm-svn: 367431	2019-07-31 14:03:05 +00:00
Anastasia Stulova	88ed70e247	[OpenCL] Rename lang mode flag for C++ mode Rename lang mode flag to -cl-std=clc++/-cl-std=CLC++ or -std=clc++/-std=CLC++. This aligns with OpenCL C conversion and removes ambiguity with OpenCL C++. Differential Revision: https://reviews.llvm.org/D65102 llvm-svn: 367008	2019-07-25 11:04:29 +00:00
Christudasan Devadasan	8c5e6fa657	Updated the signature for some stack related intrinsics (CLANG) Modified the intrinsics int_addressofreturnaddress, int_frameaddress & int_sponentry. This commit depends on the changes in rL366679 Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D64563 llvm-svn: 366683	2019-07-22 12:50:30 +00:00
Matt Arsenault	e56865d40c	AMDGPU: Add some missing builtins llvm-svn: 366286	2019-07-17 00:01:03 +00:00
Neil Hickey	8ece3b6719	[OpenCL] Fixing sampler initialisations for C++ mode. Allow conversions between integer and sampler type. Differential Revision: https://reviews.llvm.org/D64791 llvm-svn: 366212	2019-07-16 14:57:32 +00:00
Vyacheslav Zakharin	de811d1f51	[clang] Preserve names of addrspacecast'ed values. Differential Revision: https://reviews.llvm.org/D63846 llvm-svn: 365666	2019-07-10 17:10:05 +00:00
Christudasan Devadasan	18ba9d6077	[AMDGPU] Increased the number of implicit argument bytes for both OpenCL and HIP (CLANG). To enable a new implicit kernel argument, increased the number of argument bytes from 48 to 56. Reviewed By: yaxunl Differential Revision: https://reviews.llvm.org/D63756 llvm-svn: 365643	2019-07-10 15:10:08 +00:00
Reid Kleckner	9b28d9c331	Use the Itanium C++ ABI for the pipe_builtin.cl test Certain OpenCL constructs cannot yet be mangled in the MS C++ ABI. Add a FIXME for it if anyone cares to implement it. llvm-svn: 365557	2019-07-09 21:02:06 +00:00
Stanislav Mekhanoshin	0cfd75a07d	[AMDGPU] gfx908 clang target Differential Revision: https://reviews.llvm.org/D64430 llvm-svn: 365528	2019-07-09 18:19:00 +00:00
Marco Antognini	b00d5f732c	[OpenCL][Sema] Fix builtin rewriting This patch ensures built-in functions are rewritten using the proper parent declaration. Existing tests are modified to run in C++ mode to ensure the functionality works also with C++ for OpenCL while not increasing the testing runtime. llvm-svn: 365499	2019-07-09 15:04:23 +00:00
Brian Homerding	e6ba22542f	Add nofree attribute to CodeGenOpenCL/convergent.cl test The revision at https://reviews.llvm.org/rL365336 added inference of the nofree attribute. This revision updates the test to reflect this. Differential Revision: https://reviews.llvm.org/D49165 llvm-svn: 365341	2019-07-08 16:24:10 +00:00
Matt Arsenault	5495f78165	AMDGPU: Fix missing declaration for mbcnt builtins llvm-svn: 364251	2019-06-24 23:34:06 +00:00
Leonard Chan	f336eb344c	[clang][NewPM] Add RUNS for tests that produce slightly different IR under new PM For CodeGenOpenCL/convergent.cl, the new PM produced a slightly different for loop, but this still checks for no loop unrolling as intended. This is committed separately from D63174. llvm-svn: 364202	2019-06-24 16:49:18 +00:00
Matt Arsenault	fc84925208	AMDGPU: Fix target builtins for gfx10 This wasn't setting some of the features from older generations. llvm-svn: 364123	2019-06-22 01:30:00 +00:00
Matt Arsenault	bcdbc9a115	AMDGPU: Add DS GWS sema builtins llvm-svn: 363986	2019-06-20 21:33:57 +00:00
Matt Arsenault	f46f41411b	Reapply "r363684: AMDGPU: Add GWS instruction builtins" llvm-svn: 363871	2019-06-19 19:55:49 +00:00
Simon Pilgrim	6828bc5614	Revert rL363684 : AMDGPU: Add GWS instruction builtins ........ Depends on rL363678 which was reverted at rL363797 llvm-svn: 363824	2019-06-19 15:35:45 +00:00
Matt Arsenault	2acc717627	AMDGPU: Add GWS instruction builtins llvm-svn: 363684	2019-06-18 14:10:01 +00:00
Stanislav Mekhanoshin	cafccd7a53	[AMDGPU] gfx1011/gfx1012 clang support Differential Revision: https://reviews.llvm.org/D63308 llvm-svn: 363345	2019-06-14 00:33:59 +00:00
Stanislav Mekhanoshin	8a8131a3f6	[AMDGPU] gfx1010 wave32 clang support Differential Revision: https://reviews.llvm.org/D63209 llvm-svn: 363341	2019-06-13 23:47:59 +00:00
Tim Northover	c46827c7ed	LLVM IR: Generate new-style byval-with-Type from Clang LLVM IR recently added a Type parameter to the byval Attribute, so that when pointers become opaque and no longer have an element type the information will still be present in IR. For now the Type parameter is optional (which is why Clang didn't need this change at the time), but it will become mandatory soon. llvm-svn: 362652	2019-06-05 21:12:14 +00:00
Tim Northover	fcb00d4aec	Reapply: LLVM IR: update Clang tests for byval being a typed attribute. Since byval is now a typed attribute it gets sorted slightly differently by LLVM when the order of attributes is being canonicalized. This updates the few Clang tests that depend on the old order. Clang patch is unchanged. llvm-svn: 362129	2019-05-30 18:49:19 +00:00
Anastasia Stulova	f61b5481fd	[OpenCL] Fix OpenCL/SPIR version metadata in C++ mode. C++ is derived from OpenCL v2.0 therefore set the versions identically. Differential Revision: https://reviews.llvm.org/D62657 llvm-svn: 362102	2019-05-30 15:18:07 +00:00
Sven van Haastregt	ce127bb60e	[OpenCL] Support logical vector operators in C++ mode Support logical operators on vectors in C++ for OpenCL mode, to preserve backwards compatibility with OpenCL C. Differential Revision: https://reviews.llvm.org/D62588 llvm-svn: 362087	2019-05-30 12:35:19 +00:00
Tim Northover	4b281755ae	Revert "LLVM IR: update Clang tests for byval being a typed attribute." The underlying LLVM change couldn't cope with llvm-link and broke LTO builds. llvm-svn: 362028	2019-05-29 20:45:32 +00:00
Tim Northover	45e8cc6639	LLVM IR: update Clang tests for byval being a typed attribute. Since byval is now a typed attribute it gets sorted slightly differently by LLVM when the order of attributes is being canonicalized. This updates the few Clang tests that depend on the old order. llvm-svn: 362013	2019-05-29 19:13:29 +00:00
Yaxun Liu	a53d48b7f4	[OpenCL] Fix file-scope const sampler variable for 2.0 OpenCL spec v2.0 s6.13.14: Samplers can also be declared as global constants in the program source using the following syntax. const sampler_t <sampler name> = <value> This works fine for OpenCL 1.2 but fails for 2.0, because clang deduces address space of file-scope const sampler variable to be in global address space whereas spec v2.0 s6.9.b forbids file-scope sampler variable to be in global address space. The fix is not to deduce address space for file-scope sampler variables. Differential Revision: https://reviews.llvm.org/D62197 llvm-svn: 361757	2019-05-27 11:19:07 +00:00
Kevin Petit	aa7754cc90	[OpenCL] Add support for the cl_arm_integer_dot_product extensions The specification is available in the Khronos OpenCL registry: https://www.khronos.org/registry/OpenCL/extensions/arm/cl_arm_integer_dot_product.txt Signed-off-by: Kevin Petit <kevin.petit@arm.com> llvm-svn: 361641	2019-05-24 14:53:52 +00:00
Sven van Haastregt	151d4f88dc	[NFC] Fix line endings in OpenCL tests llvm-svn: 361004	2019-05-17 09:25:38 +00:00
Stanislav Mekhanoshin	91792f1b93	[AMDGPU] gfx1010 clang target Differential Revision: https://reviews.llvm.org/D61875 llvm-svn: 360634	2019-05-13 23:15:59 +00:00
Anastasia Stulova	5b6dda33d1	[Sema][OpenCL] Make address space conversions a bit stricter. The semantics for converting nested pointers between address spaces are not very well defined. Some conversions which do not really carry any meaning only produce warnings, and in some cases warnings hide invalid conversions, such as 'global int' to 'local float'! This patch changes the logic in checkPointerTypesForAssignment and checkAddressSpaceCast to fail properly on implicit conversions that should definitely not be permitted. We also dig deeper into the pointer types and warn on explicit conversions where the address space in a nested pointer changes, regardless of whether the address space is compatible with the corresponding pointer nesting level on the destination type. Fixes PR39674! Patch by ebevhan (Bevin Hansson)! Differential Revision: https://reviews.llvm.org/D58236 llvm-svn: 360258	2019-05-08 14:23:49 +00:00
Scott Linder	fb59fef7dc	Move setTargetAttributes after setGVProperties in SetFunctionAttributes AMDGPU currently relies on global properties being set before setTargetProperties is called. Existing targets like MIPS which rely on setTargetProperties do not rely on the current behavior, so this patch moves the call later in SetFunctionAttributes. Differential Revision: https://reviews.llvm.org/D60967 llvm-svn: 359039	2019-04-23 21:50:11 +00:00
Alexey Sotkin	1b01f9728f	[OpenCL] Re-fix invalid address space generation for clk_event_t arguments of enqueue_kernel builtin function Summary: https://reviews.llvm.org/D53809 fixed wrong address space (assert in debug build) generated for event_ret argument. But exactly the same problem exists for event_wait_list argument. This patch should fix both. Reviewers: Anastasia, yaxunl Reviewed By: Anastasia Subscribers: kristina, ebevhan, cfe-commits Differential Revision: https://reviews.llvm.org/D59985 llvm-svn: 358151	2019-04-11 06:18:17 +00:00
Stanislav Mekhanoshin	1d9f286ecb	[AMDGPU] rename vi-insts into gfx8-insts Differential Revision: https://reviews.llvm.org/D60293 llvm-svn: 357792	2019-04-05 18:25:00 +00:00
Konstantin Zhuravlyov	ec28a1dcef	AMDGPU: Add support for cross address space synchronization scopes (clang) Differential Revision: https://reviews.llvm.org/D59494 llvm-svn: 356947	2019-03-25 20:54:00 +00:00
Andrew Savonichev	76b178d949	[OpenCL] Generate 'unroll.enable' metadata for __attribute__((opencl_unroll_hint)) Summary: For both !{!"llvm.loop.unroll.enable"} and !{!"llvm.loop.unroll.full"} the unroller will try to fully unroll a loop unless the trip count is not known at compile time. In that case, for '.full' metadata no unrolling will be performed, while for '.enable' the loop will be partially unrolled with a heuristically chosen unroll factor. See: docs/LanguageExtensions.rst From https://www.khronos.org/registry/OpenCL/sdk/2.0/docs/man/xhtml/attributes-loopUnroll.html __attribute__((opencl_unroll_hint)) for (int i=0; i<2; i++) { ... } In the example above, the compiler will determine how much to unroll the loop. Before the patch, the metadata !{!"llvm.loop.unroll.full"} was generated for __attribute__((opencl_unroll_hint)), which limits the ability of the loop unroller to decide how much to unroll the loop. Reviewers: Anastasia, yaxunl Reviewed By: Anastasia Subscribers: zzheng, dmgreen, jdoerfert, cfe-commits, asavonic, AlexeySotkin Tags: #clang Differential Revision: https://reviews.llvm.org/D59493 llvm-svn: 356571	2019-03-20 16:43:07 +00:00
Michael Liao	3c2aadbe67	[AMDGPU] Add the missing clang change of the experimental buffer fat pointer llvm-svn: 356385	2019-03-18 18:11:37 +00:00
Konstantin Zhuravlyov	3161c89a22	AMDGPU: Fix the mapping of sub group sync scope Map memory_scope_sub_group to "wavefront" sync scope Differential Revision: https://reviews.llvm.org/D58847 llvm-svn: 355549	2019-03-06 20:54:48 +00:00
Yaxun Liu	d83c74028d	[OpenCL] Fix assertion due to blocks A recent change caused assertion in CodeGenFunction::EmitBlockCallExpr when a block is called. There is code Func = CGM.getOpenCLRuntime().getInvokeFunction(E->getCallee()); getCalleeDecl calls Expr::getReferencedDeclOfCallee, which does not handle BlockExpr and returns nullptr, which causes isa to assert. This patch fixes that. Differential Revision: https://reviews.llvm.org/D58658 llvm-svn: 354893	2019-02-26 16:20:41 +00:00
Andrew Savonichev	43fceb2727	[OpenCL] Simplify LLVM IR generated for OpenCL blocks Summary: Emit direct call of block invoke functions when possible, i.e. in case the block is not passed as a function argument. Also doing some refactoring of `CodeGenFunction::EmitBlockCallExpr()` Reviewers: Anastasia, yaxunl, svenvh Reviewed By: Anastasia Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D58388 llvm-svn: 354568	2019-02-21 11:02:10 +00:00
Alexey Bader	24fa0c18e6	[OpenCL] Change type of block pointer for OpenCL Summary: For some reason OpenCL blocks in LLVM IR are represented as function pointers. These pointers do not point to any real function and never get called. Actually they point to some structure, which in turn contains pointer to the real block invoke function. This patch changes representation of OpenCL blocks in LLVM IR from function pointers to pointers to `%struct.__block_literal_generic`. Such representation avoids unnecessary bitcasts and simplifies further processing (e.g. translation to SPIR-V) of the module for targets which do not support function pointers. Patch by: Alexey Sotkin. Reviewers: Anastasia, yaxunl, svenvh Reviewed By: Anastasia Subscribers: alexbatashev, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D58277 llvm-svn: 354337	2019-02-19 15:19:06 +00:00
Anastasia Stulova	2c4730ded8	[OpenCL][PR40707] Allow OpenCL C types in C++ mode. Allow all OpenCL types to be parsed in C++ mode. llvm-svn: 354121	2019-02-15 12:07:57 +00:00
Scott Linder	80a1ee46d8	[AMDGPU] Require at least protected visibility for certain symbols This allows the global visibility controls to be restrictive while still populating the dynamic symbol table where required. Differential Revision: https://reviews.llvm.org/D56871 llvm-svn: 353870	2019-02-12 18:30:38 +00:00
Stanislav Mekhanoshin	1607a37308	[AMDGPU] Split dot-insts feature Differential Revision: https://reviews.llvm.org/D57972 llvm-svn: 353588	2019-02-09 00:34:41 +00:00
James Y Knight	f5f1b0e59e	[opaque pointer types] Cleanup CGBuilder's Create*GEP. Some of these functions take some extraneous arguments, e.g. EltSize, Offset, which are computable from the Type and DataLayout. Add some asserts to ensure that the computed values are consistent with the passed-in values, in preparation for eliminating the extraneous arguments. This also asserts that the Type is an Array for the calls named "Array" and a Struct for the calls named "Struct". Then, correct a couple of errors: 1. Using CreateStructGEP on an array type. (this causes the majority of the test differences, as struct GEPs are created with i32 indices, while array GEPs are created with i64 indices) 2. Passing the wrong Offset to CreateStructGEP in TargetInfo.cpp on x86-64 NACL (which uses 32-bit pointers). Differential Revision: https://reviews.llvm.org/D57766 llvm-svn: 353529	2019-02-08 15:34:12 +00:00
Anastasia Stulova	a4b1cf3282	[OpenCL] Fixed addr space mangling test. Fixed typo in the FileCheck directive and changed the test to verify output correctly. Fixes PR40029! llvm-svn: 352760	2019-01-31 15:23:48 +00:00
Matt Arsenault	297afb14ec	Revert "OpenCL: Extend argument promotion rules to vector types" This reverts r348083. This was based on a misreading of the spec for printf specifiers. Also revert r343653, as without a subsequent patch, a correctly specified format for a vector will incorrectly warn. Fixes bug 40491. llvm-svn: 352539	2019-01-29 20:49:47 +00:00
Matt Arsenault	b72888647b	AMDGPU: Add ds append/consume builtins llvm-svn: 352443	2019-01-28 23:59:18 +00:00
Tim Corringham	6d5348cca5	[AMDGPU] Add interpolation builtins Summary: Added builtins for the interpolation intrinsics, and related LIT test. Reviewers: arsenm, tpr, dstuttard, #amdgpu Reviewed By: arsenm, #amdgpu Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, cfe-commits Differential Revision: https://reviews.llvm.org/D46871 llvm-svn: 352358	2019-01-28 13:50:37 +00:00
Stanislav Mekhanoshin	6332f4d0d4	[AMDGPU] Separate feature dot-insts Differential Revision: https://reviews.llvm.org/D56525 llvm-svn: 350794	2019-01-10 03:25:47 +00:00
Erich Keane	e8abbecaf7	Fix opencl test broken on windows by r350643. Windows doesn't allow common with alignment >32 bits, so these tests were broken in windows mode. This patch makes 'common' optional in these cases. Change-Id: I4d5fdd07ecdafc3570ef9b09cd816c2e5e4ed15e llvm-svn: 350645	2019-01-08 19:10:43 +00:00
Andrew Savonichev	1bf1a156d6	[OpenCL][CodeGen] Fix replacing memcpy with addrspacecast Summary: If a function argument is byval and RV is located in default or alloca address space an optimization of creating addrspacecast instead of memcpy is performed. That is not correct for OpenCL, where that can lead to a situation of address space casting from __private * to __global *. See an example below: ``` typedef struct { int x; } MyStruct; void foo(MyStruct val) {} kernel void KernelOneMember(__global MyStruct* x) { foo (*x); } ``` for this code clang generated following IR: ... %0 = load %struct.MyStruct addrspace(1)*, %struct.MyStruct addrspace(1)** %x.addr, align 4 %1 = addrspacecast %struct.MyStruct addrspace(1)* %0 to %struct.MyStruct* ... So the optimization was disallowed for OpenCL if RV is located in an address space different than that of the argument (0). Reviewers: yaxunl, Anastasia Reviewed By: Anastasia Subscribers: cfe-commits, asavonic Differential Revision: https://reviews.llvm.org/D54947 llvm-svn: 348752	2018-12-10 12:03:00 +00:00
Matt Arsenault	af07de4059	OpenCL: Extend argument promotion rules to vector types The spec is ambiguous on whether vector types are allowed to be implicitly converted. The only legal context I think this can be used for OpenCL is printf, where it seems necessary. llvm-svn: 348083	2018-12-01 21:56:10 +00:00
Marco Antognini	06d9d070c7	Derive builtin return type from its definition Summary: Prior to this patch, OpenCL code such as the following would attempt to create a BranchInst with a non-bool argument: if (enqueue_kernel(get_default_queue(), 0, nd, ^(void){})) /* ... */ This patch is a follow up on a similar issue with pipe builtin operations. See commit r280800 and https://bugs.llvm.org/show_bug.cgi?id=30219. This change, while being conservative on non-builtin functions, should set the type of expressions invoking builtins to the proper type, instead of defaulting to `bool` and requiring manual overrides in Sema::CheckBuiltinFunctionCall. In addition to tests for enqueue_kernel, the tests are extended to check other OpenCL builtins. Reviewers: Anastasia, spatel, rsmith Reviewed By: Anastasia Subscribers: kristina, cfe-commits, svenvh Differential Revision: https://reviews.llvm.org/D52879 llvm-svn: 347658	2018-11-27 14:54:58 +00:00
JF Bastien	3a881e6bbc	CGDecl::emitStoresForConstant fix synthesized constant's name Summary: The name of the synthesized constants for constant initialization was using mangling for statics, which isn't generally correct and (in a yet-uncommitted patch) causes the mangler to assert out because the static ends up trying to mangle function parameters and this makes no sense. Instead, mangle to `"__const." + FunctionName + "." + DeclName`. Reviewers: rjmccall Subscribers: dexonsmith, cfe-commits Differential Revision: https://reviews.llvm.org/D54055 llvm-svn: 346915	2018-11-15 00:19:18 +00:00
Alexey Sotkin	692f12b389	[OpenCL] Fix invalid address space generation for clk_event_t Summary: Addrspace(32) was generated when putting 0 in clk_event_t * event_ret parameter for enqueue_kernel function. Patch by Viktoria Maksimova Reviewers: Anastasia, yaxunl, AlexeySotkin Reviewed By: Anastasia, AlexeySotkin Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D53809 llvm-svn: 346838	2018-11-14 09:40:05 +00:00
Andrew Savonichev	3fee351867	[OpenCL] Add support of cl_intel_device_side_avc_motion_estimation extension Summary: Documentation can be found at https://www.khronos.org/registry/OpenCL/extensions/intel/cl_intel_device_side_avc_motion_estimation.txt Patch by Kristina Bessonova Reviewers: Anastasia, yaxunl, shafik Reviewed By: Anastasia Subscribers: arphaman, sidorovd, AlexeySotkin, krisb, bader, asavonic, cfe-commits Differential Revision: https://reviews.llvm.org/D51484 llvm-svn: 346392	2018-11-08 11:25:41 +00:00
Andrew Savonichev	3b12b7e702	Revert r346326 [OpenCL] Add support of cl_intel_device_side_avc_motion_estimation This patch breaks Index/opencl-types.cl LIT test: Script: -- : 'RUN: at line 1'; stage1/bin/c-index-test -test-print-type llvm/tools/clang/test/Index/opencl-types.cl -cl-std=CL2.0 \| stage1/bin/FileCheck llvm/tools/clang/test/Index/opencl-types.cl -- Command Output (stderr): -- llvm/tools/clang/test/Index/opencl-types.cl:3:26: warning: unsupported OpenCL extension 'cl_khr_fp16' - ignoring [-Wignored-pragmas] llvm/tools/clang/test/Index/opencl-types.cl:4:26: warning: unsupported OpenCL extension 'cl_khr_fp64' - ignoring [-Wignored-pragmas] llvm/tools/clang/test/Index/opencl-types.cl:8:9: error: use of type 'double' requires cl_khr_fp64 extension to be enabled llvm/tools/clang/test/Index/opencl-types.cl:11:8: error: declaring variable of type 'half' is not allowed llvm/tools/clang/test/Index/opencl-types.cl:15:3: error: use of type 'double' requires cl_khr_fp64 extension to be enabled llvm/tools/clang/test/Index/opencl-types.cl:16:3: error: use of type 'double4' (vector of 4 'double' values) requires cl_khr_fp64 extension to be enabled llvm/tools/clang/test/Index/opencl-types.cl:26:26: warning: unsupported OpenCL extension 'cl_khr_gl_msaa_sharing' - ignoring [-Wignored-pragmas] llvm/tools/clang/test/Index/opencl-types.cl:35:44: error: use of type '__read_only image2d_msaa_t' requires cl_khr_gl_msaa_sharing extension to be enabled llvm/tools/clang/test/Index/opencl-types.cl:36:49: error: use of type '__read_only image2d_array_msaa_t' requires cl_khr_gl_msaa_sharing extension to be enabled llvm/tools/clang/test/Index/opencl-types.cl:37:49: error: use of type '__read_only image2d_msaa_depth_t' requires cl_khr_gl_msaa_sharing extension to be enabled llvm/tools/clang/test/Index/opencl-types.cl:38:54: error: use of type '__read_only image2d_array_msaa_depth_t' requires cl_khr_gl_msaa_sharing extension to be enabled llvm-svn: 346338	2018-11-07 18:34:19 +00:00
Andrew Savonichev	35dfce723c	[OpenCL] Add support of cl_intel_device_side_avc_motion_estimation extension Summary: Documentation can be found at https://www.khronos.org/registry/OpenCL/extensions/intel/cl_intel_device_side_avc_motion_estimation.txt Patch by Kristina Bessonova Reviewers: Anastasia, yaxunl, shafik Reviewed By: Anastasia Subscribers: arphaman, sidorovd, AlexeySotkin, krisb, bader, asavonic, cfe-commits Differential Revision: https://reviews.llvm.org/D51484 llvm-svn: 346326	2018-11-07 15:44:01 +00:00
Craig Topper	3113ec3dc7	[CodeGen] Update min-legal-vector width based on function argument and return types This is a continuation of my patches to inform the X86 backend about what the largest IR types are in the function so that we can restrict the backend type legalizer to prevent 512-bit vectors on SKX when -mprefer-vector-width=256 is specified if no explicit 512 bit vectors were specified by the user. This patch updates the vector width based on the argument and return types of the current function and from the types of any functions it calls. This is intended to make sure the backend type legalizer doesn't disturb any types that are required for ABI. Differential Revision: https://reviews.llvm.org/D52441 llvm-svn: 345168	2018-10-24 17:42:17 +00:00
Yaxun Liu	aae1e87f4b	AMDGPU: add __builtin_amdgcn_update_dpp Emit llvm.amdgcn.update.dpp for both __builtin_amdgcn_mov_dpp and __builtin_amdgcn_update_dpp. The first argument to llvm.amdgcn.update.dpp will be undef for __builtin_amdgcn_mov_dpp. Differential Revision: https://reviews.llvm.org/D52320 llvm-svn: 344665	2018-10-17 02:32:26 +00:00
Sven van Haastregt	a3c6b407ec	[OpenCL] Add block argument CodeGen test r326937 ("[OpenCL] Remove block invoke function from emitted block literal struct", 2018-03-07) broke block argument handling. In particular the commit was causing a crash during code generation, see the discussion in https://reviews.llvm.org/D43783 . The offending commit has just been reverted; add a test to avoid breaking this again in the future. llvm-svn: 343583	2018-10-02 13:02:27 +00:00
Sven van Haastregt	da3b632057	Revert r326937 "[OpenCL] Remove block invoke function from emitted block literal struct" This reverts r326937 as it broke block argument handling in OpenCL. See the discussion on https://reviews.llvm.org/D43783 . The next commit will add a test case that revealed the issue. llvm-svn: 343582	2018-10-02 13:02:24 +00:00
Matt Arsenault	94abc57e37	AMDGPU: Add another missing builtin llvm-svn: 339395	2018-08-09 22:18:37 +00:00
Matt Arsenault	45bc148093	AMDGPU: Fix enabling denormals by default on pre-VI targets Fast FMAF is not a sufficient condition to enable denormals. Before VI, enabling denormals caused F32 instructions to run at F64 speeds. llvm-svn: 339278	2018-08-08 17:48:37 +00:00
Scott Linder	58df0e4d2c	[DebugInfo][OpenCL] Address post-commit review for r338299 NFC refactor of code to generate debug info for OpenCL 2.X blocks. Differential Revision: https://reviews.llvm.org/D50099 llvm-svn: 339265	2018-08-08 15:56:12 +00:00
Douglas Yung	be1166a43b	Fix one hard coded value I missed in r339185. llvm-svn: 339188	2018-08-07 21:37:14 +00:00
Douglas Yung	dca675a0d8	Make test more robust by not checking hard coded debug info values, but instead check the relationships between them. llvm-svn: 339185	2018-08-07 21:22:49 +00:00
Scott Linder	f8b3df4dec	[OpenCL] Restore r338899 (reverted in r338904), fixing stack-use-after-return Always emit alloca in entry block for enqueue_kernel builtin. Ensures the statically sized alloca is not converted to DYNAMIC_STACKALLOC later because it is not in the entry block. llvm-svn: 339150	2018-08-07 15:52:49 +00:00
Matt Arsenault	31c895ecdf	AMDGPU: Add builtin for s_dcache_wb llvm-svn: 339110	2018-08-07 07:49:13 +00:00
Matt Arsenault	24f3924709	AMDGPU: Add builtin for s_dcache_inv_vol llvm-svn: 339109	2018-08-07 07:49:04 +00:00
Vlad Tsyrklevich	c7d3d34b98	Revert "[OpenCL] Always emit alloca in entry block for enqueue_kernel builtin" This reverts commit r338899, it was causing ASan test failures on sanitizer-x86_64-linux-fast. llvm-svn: 338904	2018-08-03 17:47:58 +00:00
Scott Linder	91f578467c	[OpenCL] Always emit alloca in entry block for enqueue_kernel builtin Ensures the statically sized alloca is not converted to DYNAMIC_STACKALLOC later because it is not in the entry block. Differential Revision: https://reviews.llvm.org/D50104 llvm-svn: 338899	2018-08-03 15:50:52 +00:00
Matt Arsenault	e3d81572c1	AMDGPU: Fix missing declaration of queue ptr builtin llvm-svn: 338754	2018-08-02 18:24:55 +00:00
Matt Arsenault	c65f966d76	Try to make builtin address space declarations not useless The way address space declarations for builtins currently work is nearly useless. The code assumes the address spaces used for builtins is a confusingly named "target address space" from user code using __attribute__((address_space(N))) that matches the builtin declaration. There's no way to use this to declare a builtin that returns a language specific address space. The terminology used is highly confusing since it has nothing to do with the address space selected by the target to use for a language address space. This feature is essentially unused as-is. AMDGPU and NVPTX are the only in-tree targets attempting to use this. The AMDGPU builtins certainly do not behave as intended (i.e. all of the builtins returning pointers can never compile because the numbered address space never matches the expected named address space). The NVPTX builtins are missing tests for some, and the others seem to rely on an implicit addrspacecast. Change the used address space for builtins based on a target hook to allow using a language address space for a builtin. This allows the same builtin declaration to be used for multiple languages with similarly purposed address spaces (e.g. the same AMDGPU builtin can be used in OpenCL and CUDA even though the constant address spaces are arbitrarily different). This breaks the possibility of using arbitrary numbered address spaces alongside the named address spaces for builtins. If this is an issue we probably need to introduce another builtin declaration character to distinguish language address spaces from so-called "target address spaces". llvm-svn: 338707	2018-08-02 12:14:28 +00:00
Konstantin Zhuravlyov	9057546c5b	AMDGPU: Add clamp bit to dot builtins Differential Revision: https://reviews.llvm.org/D50011 llvm-svn: 338471	2018-08-01 01:32:21 +00:00
Scott Linder	2b5cf04180	[DebugInfo][OpenCL] Generate correct block literal debug info for OpenCL OpenCL block literal structs have different fields which are now correctly identified in the debug info. Differential Revision: https://reviews.llvm.org/D49930 llvm-svn: 338299	2018-07-30 20:31:11 +00:00
JF Bastien	9aab85a6a0	CodeGen: specify alignment + inbounds for automatic variable initialization Summary: Automatic variable initialization was generating default-aligned stores (which are deprecated) instead of using the known alignment from the alloca. Further, they didn't specify inbounds. Subscribers: dexonsmith, cfe-commits Differential Revision: https://reviews.llvm.org/D49209 llvm-svn: 337041	2018-07-13 20:33:23 +00:00
Daniil Fukalov	1b14a3ad3d	[AMDGPU] fixes for lds f32 builtins 1. added restrictions to memory scope, order and volatile parameters 2. added custom processing for these builtins - currently is not used code, needed to switch off GCCBuiltin link to the builtins (ongoing change to llvm tree) 3. builtins renamed as requested Differential Revision: https://reviews.llvm.org/D43281 llvm-svn: 332848	2018-05-21 16:18:07 +00:00
Sanjay Patel	cda77b30e5	[OpenCL] make test independent of optimizer There shouldn't be any tests that run the entire optimizer here, but the last test in this file is definitely going to break with a change in LLVM IR canonicalization. Change that part to check the unoptimized IR because that's the real intent of this file. llvm-svn: 332473	2018-05-16 14:38:07 +00:00
Yaxun Liu	3cab24aa4f	[OpenCL] Fix typos in emitted enqueue kernel function names Two typos: vaarg => vararg get_kernel_preferred_work_group_multiple => get_kernel_preferred_work_group_size_multiple Differential Revision: https://reviews.llvm.org/D46601 llvm-svn: 331895	2018-05-09 17:07:06 +00:00
Anastasia Stulova	59055b94af	[OpenCL] Add constant address space to __func__ in AST. Added string literal helper function to obtain the type attributed by a constant address space. Also fixed predefined __func__ expr to use the helper to construct the string literal correctly. Differential Revision: https://reviews.llvm.org/D46049 llvm-svn: 331877	2018-05-09 13:23:26 +00:00
Erich Keane	14c1085317	Add Microsoft Mangling for OpenCL Half Type Half-type mangling is accomplished following the method introduced by Erich Keane for mangling _Float16. Updated the half.cl LIT test to cover this particular case. Patch By: vbridgers Differential Revision: https://reviews.llvm.org/D46131 llvm-svn: 331263	2018-05-01 14:16:15 +00:00
Matt Arsenault	d2da3c20d7	AMDGPU: Add Vega12 and Vega20 Changes by Matt Arsenault Konstantin Zhuravlyov llvm-svn: 331216	2018-04-30 19:08:27 +00:00
Sven van Haastregt	4700faa28e	[OpenCL] Add separate read_only and write_only pipe IR types SPIR-V encodes the read_only and write_only access qualifiers of pipes, so separate LLVM IR types are required to target SPIR-V. Other backends may also find this useful. These new types are `opencl.pipe_ro_t` and `opencl.pipe_wo_t`, which replace `opencl.pipe_t`. This replaces __get_pipe_num_packets(...) and __get_pipe_max_packets(...) which took a read_only pipe with separate versions for read_only and write_only pipes, namely: * __get_pipe_num_packets_ro(...) * __get_pipe_num_packets_wo(...) * __get_pipe_max_packets_ro(...) * __get_pipe_max_packets_wo(...) These separate versions exist to avoid needing a bitcast to one of the two qualified pipe types. Patch by Stuart Brady. Differential Revision: https://reviews.llvm.org/D46015 llvm-svn: 331026	2018-04-27 10:37:04 +00:00
Hans Wennborg	a417362c28	Fix some tests that were failing on Windows llvm-svn: 330441	2018-04-20 15:33:44 +00:00
Alexey Sotkin	3858e26f22	[OpenCL] Add 'denorms-are-zero' function attribute Summary: Generate attribute 'denorms-are-zero'='true' if '-cl-denorms-are-zero' compile option was specified and 'denorms-are-zero'='false' otherwise. Patch by krisb Reviewers: Anastasia, yaxunl Reviewed By: yaxunl Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D45808 llvm-svn: 330404	2018-04-20 08:08:04 +00:00
Alexander Kornienko	2a8c18d991	Fix typos in clang Found via codespell -q 3 -I ../clang-whitelist.txt Where whitelist consists of: archtype cas classs checkk compres definit frome iff inteval ith lod methode nd optin ot pres statics te thru Patch by luzpaz! (This is a subset of D44188 that applies cleanly with a few files that have dubious fixes reverted.) Differential revision: https://reviews.llvm.org/D44188 llvm-svn: 329399	2018-04-06 15:14:32 +00:00
Matt Arsenault	b130ea5605	AMDGPU: Update datalayout for stack alignment llvm-svn: 328657	2018-03-27 19:26:51 +00:00
Yaxun Liu	ac1263cd54	[AMDGPU] Fix codegen for inline assembly Need to override convertConstraint to recognise amdgpu specific register names. Differential Revision: https://reviews.llvm.org/D44533 llvm-svn: 328359	2018-03-23 19:43:42 +00:00
Tony Tye	68e11a6eca	[AMDGPU] Update OpenCL to use 48 bytes of implicit arguments for AMDGPU (CLANG) Add two additional implicit arguments for OpenCL for the AMDGPU target using the AMDHSA runtime to support device enqueue. Differential Revision: https://reviews.llvm.org/D44696 llvm-svn: 328350	2018-03-23 18:51:45 +00:00
Tony Tye	1a3f3a2d14	[AMDGPU] Remove use of OpenCL triple environment and replace with function attribute for AMDGPU (CLANG) - Remove use of the opencl and amdopencl environment member of the target triple for the AMDGPU target. - Use a function attribute to communicate to the AMDGPU backend. Differential Revision: https://reviews.llvm.org/D43735 llvm-svn: 328347	2018-03-23 18:43:15 +00:00
Yaxun Liu	5b330e8d61	Recommit r326946 after reducing CallArgList memory footprint llvm-svn: 327634	2018-03-15 15:25:19 +00:00
Richard Smith	007cb6df58	Revert r326946. It caused stack overflows by significantly increasing the size of a CallArgList. llvm-svn: 327195	2018-03-10 01:47:22 +00:00
Yaxun Liu	06dd81149f	CodeGen: Fix address space of indirect function argument The indirect function argument is in alloca address space in LLVM IR. However, during Clang codegen for C++, the address space of indirect function argument should match its address space in the source code, i.e., default addr space, even for indirect argument. This is because destructor of the indirect argument may be called in the caller function, and address of the indirect argument may be taken, in either case the indirect function argument is expected to be in default addr space, not the alloca address space. Therefore, the indirect function argument should be mapped to the temp var casted to default address space. The caller will cast it to alloca addr space when passing it to the callee. In the callee, the argument is also casted to the default address space and used. CallArg is refactored to facilitate this fix. Differential Revision: https://reviews.llvm.org/D34367 llvm-svn: 326946	2018-03-07 21:45:40 +00:00
Yaxun Liu	cb35e9fa94	[OpenCL] Remove block invoke function from emitted block literal struct OpenCL runtime tracks the invoke function emitted for any block expression. Due to restrictions on blocks in OpenCL (v2.0 s6.12.5), it is always possible to know the block invoke function when emitting call of block expression or __enqueue_kernel builtin functions. Since __enqueue_kernel already has an argument for the invoke function, it is redundant to have invoke function member in the llvm block literal structure. This patch removes invoke function from the llvm block literal structure. It also removes the bitcast of block invoke function to the generic block literal type which is useless for OpenCL. This will save some space for the kernel argument, and also eliminate some store instructions. Differential Revision: https://reviews.llvm.org/D43783 llvm-svn: 326937	2018-03-07 19:32:58 +00:00
Rafael Espindola	922f2aa9b2	Bring r325915 back. The tests that failed on a windows host have been fixed. Original message: Start setting dso_local for COFF. With this there are still some GVs where we don't set dso_local because setGVProperties is never called. I intend to fix that in followup commits. This is just the bare minimum to teach shouldAssumeDSOLocal what it should do for COFF. llvm-svn: 325940	2018-02-23 19:30:48 +00:00
Alexey Sotkin	20f65928e1	[OpenCL] Add '-cl-uniform-work-group-size' compile option Summary: OpenCL 2.0 specification defines '-cl-uniform-work-group-size' option, which requires that the global work-size be a multiple of the work-group size specified to clEnqueueNDRangeKernel and allows optimizations that are made possible by this restriction. The patch introduces the support of this option. To keep information about whether an OpenCL kernel has uniform work group size or not, clang generates 'uniform-work-group-size' function attribute for every kernel: - "uniform-work-group-size"="true" for OpenCL 1.2 and lower, - "uniform-work-group-size"="true" for OpenCL 2.0 and higher if '-cl-uniform-work-group-size' option was specified, - "uniform-work-group-size"="false" for OpenCL 2.0 and higher if no '-cl-uniform-work-group-size' option was specified. If the function is not an OpenCL kernel, 'uniform-work-group-size' attribute isn't generated. Patch by: krisb Reviewers: yaxunl, Anastasia, b-sumner Reviewed By: yaxunl, Anastasia Subscribers: nhaehnle, yaxunl, Anastasia, cfe-commits Differential Revision: https://reviews.llvm.org/D43570 llvm-svn: 325771	2018-02-22 11:54:14 +00:00
Yaxun Liu	f8ad59d99d	Clean up AMDGCN tests Differential Revision: https://reviews.llvm.org/D43340 llvm-svn: 325279	2018-02-15 19:12:41 +00:00
Yaxun Liu	fa13d015a3	[OpenCL] Fix __enqueue_block for block with captures The following test case causes issue with codegen of __enqueue_block void (^block)(void) = ^{ callee(id, out); }; enqueue_kernel(queue, 0, ndrange, block); Clang first does codegen for block expression in the first line and deletes its block info. Clang then tries to do codegen for the same block expression again for the second line, and fails because the block info is gone. The fix is to do normal codegen for both lines. Introduce an API to OpenCL runtime to record llvm block invoke function and llvm block literal emitted for each AST block expression, and use the recorded information for generating the wrapper kernel. The EmitBlockLiteral APIs are cleaned up to minimize changes to the normal codegen of blocks. Another minor issue is that some clean up AST expression is generated for block with captures, which can be stripped by IgnoreImplicit. Differential Revision: https://reviews.llvm.org/D43240 llvm-svn: 325264	2018-02-15 16:39:19 +00:00
Yaxun Liu	651bd73c02	[AMDGPU] Change constant addr space to 4 Differential Revision: https://reviews.llvm.org/D43171 llvm-svn: 325031	2018-02-13 18:01:21 +00:00
Matt Arsenault	e7da136a74	AMDGPU: Update for datalayout change llvm-svn: 324748	2018-02-09 16:58:41 +00:00
Matt Arsenault	935574a490	Fix crash on array initializer with non-0 alloca addrspace llvm-svn: 324641	2018-02-08 19:37:09 +00:00
Daniil Fukalov	da2a0558ea	Recommit rL323890: [AMDGPU] Add ds_fadd, ds_fmin, ds_fmax builtins functions Fixed asserts in tests. llvm-svn: 324201	2018-02-04 22:32:07 +00:00
Yaxun Liu	f5f45e5e63	[AMDGPU] Switch to the new addr space mapping by default This requires corresponding llvm change. Differential Revision: https://reviews.llvm.org/D40956 llvm-svn: 324102	2018-02-02 16:08:24 +00:00
Daniil Fukalov	07df4ffae7	Revert "[AMDGPU] Add ds_fadd, ds_fmin, ds_fmax builtins functions" This reverts https://reviews.llvm.org/rL323890 This reverts commit 251524ebd8c346a936f0e74b09d609d49fbaae4a. llvm-svn: 323896	2018-01-31 18:49:49 +00:00
Daniil Fukalov	e72cde57d2	[AMDGPU] Add ds_fadd, ds_fmin, ds_fmax builtins functions Reviewed by arsenm Differential Revision: https://reviews.llvm.org/D42578 llvm-svn: 323890	2018-01-31 16:55:09 +00:00
Daniel Neilson	6e938effaa	Change memcpy/memmove/memset to have dest and source alignment attributes (Step 1). Summary: Upstream LLVM is changing the prototypes of the @llvm.memcpy/memmove/memset intrinsics. This change updates the Clang tests for this change. The @llvm.memcpy/memmove/memset intrinsics currently have an explicit argument which is required to be a constant integer. It represents the alignment of the dest (and source), and so must be the minimum of the actual alignment of the two. This change removes the alignment argument in favour of placing the alignment attribute on the source and destination pointers of the memory intrinsic call. For example, code which used to read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 100, i32 4, i1 false) will now read call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 4 %dest, i8* align 4 %src, i32 100, i1 false) At this time the source and destination alignments must be the same (Step 1). Step 2 of the change, to be landed shortly, will relax that constraint and allow the source and destination to have different alignments. llvm-svn: 322964	2018-01-19 17:12:54 +00:00
Yaxun Liu	c325d30d2c	CodeGen: Fix invalid bitcasts for memcpy CreateCoercedLoad/CreateCoercedStore assumes pointer argument of memcpy is in addr space 0, which is not correct and causes invalid bitcasts for triple amdgcn---amdgiz. It is fixed by using alloca addr space instead. Differential Revision: https://reviews.llvm.org/D40806 llvm-svn: 320000	2017-12-07 01:39:52 +00:00
Alexey Bader	bed400957b	[OpenCL] Fix code generation of function-scope constant samplers. Summary: Constant samplers are handled as static variables and clang's code generation library, which leads to llvm::unreachable. We bypass emitting sampler variable as static since it's translated to a function call later. Reviewers: yaxunl, Anastasia Reviewed By: yaxunl, Anastasia Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D34342 llvm-svn: 318290	2017-11-15 11:38:17 +00:00
Matt Arsenault	a5888a730d	OpenCL: Assume inline asm is convergent Already done for CUDA. llvm-svn: 318098	2017-11-13 22:40:55 +00:00
Yaxun Liu	e45b3d5dad	CodeGen: Fix missing debug loc due to alloca Builder save/restores insertion pointer when emitting addr space cast for alloca, but does not save/restore debug loc, which causes verifier failure for certain call instructions. This patch fixes that. Differential Revision: https://reviews.llvm.org/D39069 llvm-svn: 316484	2017-10-24 19:14:43 +00:00
Yaxun Liu	bd3823618d	CodeGen: Fix invalid bitcast in partial initialization of automatic array variable Differential Revision: https://reviews.llvm.org/D39184 llvm-svn: 316353	2017-10-23 17:49:26 +00:00
Yaxun Liu	f2127d1728	[AMDGPU] Fix bug in enqueued block codegen due to an extra line llvm-svn: 316165	2017-10-19 15:56:13 +00:00
Yaxun Liu	8ab5ab066a	CodeGen: Fix invalid bitcasts for atomic builtins Currently clang assumes the temporary variables emitted during codegen of atomic builtins have address space 0, which is not true for target triple amdgcn---amdgiz and causes invalid bitcasts. This patch fixes that. Differential Revision: https://reviews.llvm.org/D38966 llvm-svn: 316000	2017-10-17 14:19:29 +00:00
Yaxun Liu	c2a87a05f1	[OpenCL] Emit enqueued block as kernel In OpenCL the kernel function and non-kernel function have different calling conventions. For certain targets they have different argument ABIs. Also kernels have special function attributes and metadata for runtime to launch them. The blocks passed to enqueue_kernel are supposed to be executed as kernels. As such, the block invoke function should be emitted as kernel with proper calling convention and argument ABI. This patch emits enqueued block as kernel. If a block is both called directly and passed to enqueue_kernel, separate functions will be generated. Differential Revision: https://reviews.llvm.org/D38134 llvm-svn: 315804	2017-10-14 12:23:50 +00:00
Yaxun Liu	d029234fc6	Fix regression of test/CodeGenOpenCL/address-spaces.cl on ppc llvm-svn: 315678	2017-10-13 13:53:06 +00:00
Yaxun Liu	b7318e02c1	[OpenCL] Add LangAS::opencl_private to represent private address space in AST Currently Clang uses default address space (0) to represent private address space for OpenCL in AST. There are two issues with this: (1) multiple address spaces including private address space cannot be diagnosed; (2) there is no mangling for default address space. For example, if private int* is emitted as i32 addrspace(5)* in IR, it is supposed to be mangled as PUAS5i but it is mangled as Pi instead. This patch attempts to represent OpenCL private address space explicitly in AST. It adds a new enum LangAS::opencl_private and adds it to the variable types which are implicitly private: automatic variables without address space qualifier, and function parameter pointee type without address space qualifier (OpenCL 1.2 and below). Differential Revision: https://reviews.llvm.org/D35082 llvm-svn: 315668	2017-10-13 03:37:48 +00:00
Matt Arsenault	f12e3b848a	AMDGPU: Add read_exec_lo/hi builtins llvm-svn: 315238	2017-10-09 20:06:37 +00:00
Matt Arsenault	cbe0dd13d2	AMDGPU: Fix missing declaration for __builtin_amdgcn_dispatch_ptr llvm-svn: 315219	2017-10-09 17:44:18 +00:00
Matt Arsenault	a1cf61b6fc	OpenCL: Assume functions are convergent This was done for CUDA functions in r261779, and for the same reason this also needs to be done for OpenCL. An arbitrary function could have a barrier() call in it, which in turn requires the calling function to be convergent. llvm-svn: 315094	2017-10-06 19:34:40 +00:00
Yaxun Liu	10712d9203	[OpenCL] Clean up and add missing fields for block struct Currently block is translated to a structure equivalent to struct Block { void *isa; int flags; int reserved; void *invoke; void *descriptor; }; Except invoke, which is the pointer to the block invoke function, all other fields are useless for OpenCL, which clutter the IR and also waste memory since the block struct is passed to the block invoke function as argument. On the other hand, the size and alignment of the block struct is not stored in the struct, which causes difficulty to implement __enqueue_kernel as library function, since the library function needs to know the size and alignment of the argument which needs to be passed to the kernel. This patch removes the useless fields from the block struct and adds size and align fields. The equivalent block struct will become struct Block { int size; int align; generic void *invoke; /* custom fields */ }; It also changes the pointer to the invoke function to be a generic pointer since the address space of a function may not be private on certain targets. Differential Revision: https://reviews.llvm.org/D37822 llvm-svn: 314932	2017-10-04 20:32:17 +00:00
Anastasia Stulova	89a947da72	[OpenCL] Fixed CL version in failing test. llvm-svn: 314317	2017-09-27 17:03:35 +00:00
Anastasia Stulova	0a72ed40d3	[OpenCL] Handle address space conversion while setting type alignment. Added missing addrspacecast case in alignment computation logic of pointer type emission in IR generation. Differential Revision: https://reviews.llvm.org/D37804 llvm-svn: 314304	2017-09-27 14:37:00 +00:00
Yaxun Liu	1f33d8eb19	Add more tests for OpenCL atomic builtin functions Add tests for different address spaces and insert some blank lines to make them more readable. Differential Revision: https://reviews.llvm.org/D37742 llvm-svn: 313172	2017-09-13 18:56:25 +00:00
Yaxun Liu	e37793545e	[AMDGPU] Change addr space of clk_event_t, queue_t and reserve_id_t to global Differential Revision: https://reviews.llvm.org/D37703 llvm-svn: 313171	2017-09-13 18:50:42 +00:00
Jan Vesely	31ecb4bf60	[OpenCL] Add half load and store builtins This enables load/stores of half type, without half being a legal type. Differential Revision: https://reviews.llvm.org/D37231 llvm-svn: 312742	2017-09-07 19:39:10 +00:00
Yaxun Liu	29a5ee358e	[OpenCL] Do not use vararg in emitted functions for enqueue_kernel Not all targets support vararg (e.g. amdgpu). Instead of using vararg in the emitted functions for enqueue_kernel, this patch creates a temporary array of size_t, stores the size arguments in the temporary array and passes it to the emitted functions for enqueue_kernel. Differential Revision: https://reviews.llvm.org/D36678 llvm-svn: 312441	2017-09-03 13:52:24 +00:00
Adrian Prantl	9e83fb0838	Adapt testcases to LLVM change r312144 in DIGlobalVariableExpression llvm-svn: 312148	2017-08-30 18:22:23 +00:00
Reid Kleckner	6d353348e5	Parse and print DIExpressions inline to ease IR and MIR testing Summary: Most DIExpressions are empty or very simple. When they are complex, they tend to be unique, so checking them inline is reasonable. This also avoids the need for CodeGen passes to append to the llvm.dbg.mir named md node. See also PR22780, for making DIExpression not be an MDNode. Reviewers: aprantl, dexonsmith, dblaikie Subscribers: qcolombet, javed.absar, eraman, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D37075 llvm-svn: 311594	2017-08-23 20:31:27 +00:00
Yaxun Liu	504d4e2403	Attempt to fix failure in CodeGenOpenCL/atomic-ops.cl again llvm-svn: 310937	2017-08-15 17:59:26 +00:00
Yaxun Liu	adcfe5d881	Attempt to fix failure in CodeGenOpenCL/atomic-ops.cl llvm-svn: 310932	2017-08-15 17:16:44 +00:00
Yaxun Liu	99d56d291f	Remove -finclude-default-header in OpenCL atomic tests Differential Revision: https://reviews.llvm.org/D36676 llvm-svn: 310927	2017-08-15 16:30:31 +00:00
Yaxun Liu	30d652a447	[OpenCL] Support variable memory scope in atomic builtins Differential Revision: https://reviews.llvm.org/D36580 llvm-svn: 310924	2017-08-15 16:02:49 +00:00
Sven van Haastregt	efb4d4c78c	[OpenCL] Allow targets to select address space per type Generalize getOpenCLImageAddrSpace into getOpenCLTypeAddrSpace, such that targets can select the address space per type. No functional changes intended. Initial patch by Simon Perretta. Differential Revision: https://reviews.llvm.org/D33989 llvm-svn: 310911	2017-08-15 09:38:18 +00:00
Matt Arsenault	3fe7395fbc	AMDGPU: Use direct struct returns and arguments This is an improvement over always using byval for structs. This will use registers until ~16 are used, and then switch back to byval. This needs more work, since I'm not sure it ever really makes sense to use byval. If the register limit is exceeded, the arguments still end up passed on the stack, but with a different ABI. It also may make sense to base this on number of registers used for non-struct arguments, rather than just arguments that appear first in the argument list. llvm-svn: 310527	2017-08-09 21:44:58 +00:00
Yaxun Liu	39195062c2	Add OpenCL 2.0 atomic builtin functions as Clang builtin OpenCL 2.0 atomic builtin functions have a scope argument which is ideally represented as synchronization scope argument in LLVM atomic instructions. Clang supports translating Clang atomic builtin functions to LLVM atomic instructions. However it currently does not support synchronization scope of LLVM atomic instructions. Without this, users have to use LLVM assembly code to implement OpenCL atomic builtin functions. This patch adds OpenCL 2.0 atomic builtin functions as Clang builtin functions, which supports generating LLVM atomic instructions with synchronization scope operand. Currently only constant memory scope argument is supported. Support of non-constant memory scope argument will be added later. Differential Revision: https://reviews.llvm.org/D28691 llvm-svn: 310082	2017-08-04 18:16:31 +00:00
Joey Gouly	fa76b49cef	[OpenCL] Add missing subgroup builtins This adds get_kernel_max_sub_group_size_for_ndrange and get_kernel_sub_group_count_for_ndrange. llvm-svn: 309678	2017-08-01 13:27:09 +00:00
Joey Gouly	53160cdc45	[OpenCL] Enable subgroup extension in tests This fixes the test, so that it can be run on different hosts that may have different OpenCL extensions enabled. llvm-svn: 309571	2017-07-31 15:50:27 +00:00
Joey Gouly	84ae3364df	[OpenCL] Add extension Sema check for subgroup builtins Check the subgroup extension is enabled, before doing other Sema checks. llvm-svn: 309567	2017-07-31 15:15:59 +00:00
Alexey Sotkin	7d7f0dc08b	[OpenCL] Fix access qualifiers metadata for kernel arguments with typedef Subscribers: cfe-commits, yaxunl, Anastasia Differential Revision: https://reviews.llvm.org/D35420 llvm-svn: 309155	2017-07-26 18:49:54 +00:00
Egor Churaev	53f9a30543	[OpenCL] Added extended tests on metadata generation for half data type and arrays. Reviewers: Anastasia Reviewed By: Anastasia Subscribers: bader, cfe-commits, yaxunl Differential Revision: https://reviews.llvm.org/D35000 llvm-svn: 308266	2017-07-18 06:04:01 +00:00
Yaxun Liu	25d1b4341f	[AMDGPU] Fix size and alignment of size_t and pointer types Differential Revision: https://reviews.llvm.org/D34995 llvm-svn: 307121	2017-07-05 04:58:24 +00:00
Yaxun Liu	3ba4a720ad	[AMDGPU] Fix regressions on mesa/clover with libclc due to address space Currently AMDGPUTargetInfo does not initialize AddrSpaceMap in constructor, which causes regressions in mesa/clover with libclc. This patch fixes that. Differential Revision: https://reviews.llvm.org/D34987 llvm-svn: 307105	2017-07-04 19:57:18 +00:00
Yaxun Liu	e9e5c4f975	CodeGen: Fix invalid bitcast for coerced function argument Clang assumes coerced function argument is in address space 0, which is not always true and results in invalid bitcasts. This patch fixes failure in OpenCL conformance test api/get_kernel_arg_info with amdgcn---amdgizcl triple, where non-zero alloca address space is used. Differential Revision: https://reviews.llvm.org/D34777 llvm-svn: 306721	2017-06-29 18:47:45 +00:00
Alexey Bader	364a11651e	[OpenCL] Fix OpenCL and SPIR version metadata generation. Summary: OpenCL and SPIR version metadata must be generated once per module instead of once per mangled global value. Reviewers: Anastasia, yaxunl Reviewed By: Anastasia Subscribers: ahatanak, cfe-commits Differential Revision: https://reviews.llvm.org/D34235 llvm-svn: 305796	2017-06-20 14:30:18 +00:00
Pekka Jaaskelainen	2eb0bcc9e6	[OpenCL] spir_kern by default: fix old test cases llvm-svn: 304396	2017-06-01 08:19:43 +00:00
Pekka Jaaskelainen	fc2629a65a	[OpenCL] Makes kernels use the SPIR_KERNEL CC by default. Rationale: OpenCL kernels are called via an explicit runtime API with arguments set with clSetKernelArg(), not as normal sub-functions. Return SPIR_KERNEL by default as the kernel calling convention to ensure the fingerprint is fixed such way that each OpenCL argument gets one matching argument in the produced kernel function argument list to enable feasible implementation of clSetKernelArg() with aggregates etc. In case we would use the default C calling conv here, clSetKernelArg() might break depending on the target-specific conventions; different targets might split structs passed as values to multiple function arguments etc. https://reviews.llvm.org/D33639 llvm-svn: 304389	2017-06-01 07:18:49 +00:00
Javed Absar	0841d620c5	Fix issue with test that caused buildbot failure These tests did not specify the target. The failure was triggered by change - https://reviews.llvm.org/D33205 http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15-full/builds/7314 which sets vector alignment to 8-byte for arm-targets (except for Android). So, fixing the test to make it target specific. llvm-svn: 304210	2017-05-30 13:34:26 +00:00
Egor Churaev	dd7d82c408	[OpenCL] Test on half immediate support. Reviewers: Anastasia Reviewed By: Anastasia Subscribers: yaxunl, cfe-commits, bader Differential Revision: https://reviews.llvm.org/D33592 llvm-svn: 304134	2017-05-29 07:44:22 +00:00
Mehdi Amini	6aa9e9b41a	IRGen: Add optnone attribute on function during O0 Amongst other, this will help LTO to correctly handle/honor files compiled with O0, helping debugging failures. It also seems in line with how we handle other options, like how -fnoinline adds the appropriate attribute as well. Differential Revision: https://reviews.llvm.org/D28404 llvm-svn: 304127	2017-05-29 05:38:20 +00:00
Konstantin Zhuravlyov	1f144a18ff	Resubmit r303861. [AMDGPU] add __builtin_amdgcn_s_getpc Patch by Tim Corringham llvm-svn: 304033	2017-05-26 21:08:20 +00:00
Reid Kleckner	581a6c5d56	Revert "[AMDGPU] add __builtin_amdgcn_s_getpc" This reverts commit r303861, the LLVM intrinsic was reverted. llvm-svn: 303908	2017-05-25 20:28:26 +00:00
Tim Corringham	702fe45bcd	[AMDGPU] add __builtin_amdgcn_s_getpc Summary: Added the builtin corresponding to the s_getpc intrinsic added in llvm D32862 Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D33276 llvm-svn: 303861	2017-05-25 14:16:11 +00:00
Yaxun Liu	af3d4db64b	[AMDGPU] Do not require opencl triple environment for OpenCL A recent change requires opencl triple environment for compiling OpenCL program, which causes regressions in libclc. This patch fixes that. Instead of deducing language based on triple environment, it checks LangOptions. Differential Revision: https://reviews.llvm.org/D33445 llvm-svn: 303644	2017-05-23 16:15:53 +00:00
Yaxun Liu	6d96f16347	CodeGen: Cast alloca to expected address space Alloca always returns a pointer in alloca address space, which may be different from the type defined by the language. For example, in C++ the auto variables are in the default address space. Therefore cast alloca to the expected address space when necessary. Differential Revision: https://reviews.llvm.org/D32248 llvm-svn: 303370	2017-05-18 18:51:09 +00:00
Yaxun Liu	4f33b3d396	[OpenCL] Emit function-scope variable in constant address space as static variable Differential Revision: https://reviews.llvm.org/D32977 llvm-svn: 303072	2017-05-15 14:47:47 +00:00
Xiuli Pan	be6da4bbdb	[OpenCL] Add intel_reqd_sub_group_size attribute support Summary: Add intel_reqd_sub_group_size attribute support as intel extension cl_intel_required_subgroup_size from https://www.khronos.org/registry/OpenCL/extensions/intel/cl_intel_required_subgroup_size.txt Reviewers: Anastasia, bader, hfinkel, pxli168 Reviewed By: Anastasia, bader, pxli168 Subscribers: cfe-commits, yaxunl Differential Revision: https://reviews.llvm.org/D30805 llvm-svn: 302125	2017-05-04 07:31:20 +00:00
Adrian Prantl	c3782a1a6f	Debug Info: Remove special-casing of indirect function argument handling. LLVM has changed the semantics of dbg.declare for describing function arguments. After this patch a dbg.declare always takes the address of a variable as the first argument, even if the argument is not an alloca. https://bugs.llvm.org/show_bug.cgi?id=32382 rdar://problem/31205000 llvm-svn: 300523	2017-04-18 01:22:01 +00:00
Yaxun Liu	d7523283a7	CodeGen: Let byval parameter use alloca address space Differential Revision: https://reviews.llvm.org/D32133 llvm-svn: 300487	2017-04-17 20:10:44 +00:00
Yaxun Liu	7f7f323e4f	CodeGen: Let lifetime intrinsic use alloca address space Differential Revision: https://reviews.llvm.org/D31717 llvm-svn: 300485	2017-04-17 20:03:11 +00:00
Konstantin Zhuravlyov	e668b1cd1e	[AMDGPU][GFX9] Set +fp32-denormals for >=gfx900 unless -cl-denorms-are-zero is set Differential Revision: https://reviews.llvm.org/D31482 llvm-svn: 300306	2017-04-14 05:33:57 +00:00
Yaxun Liu	b34ec829be	[OpenCL] Map default address space to alloca address space For OpenCL, the private address space qualifier is 0 in AST. Before this change, 0 address space qualifier is always mapped to target address space 0. As now target private address space is specified by alloca address space in data layout, address space qualifier 0 needs to be mapped to alloca addr space specified by the data layout. This change has no impact on targets whose alloca addr space is 0. With contributions from Matt Arsenault, Tony Tye and Wen-Heng (Jack) Chung Differential Revision: https://reviews.llvm.org/D31404 llvm-svn: 299965	2017-04-11 17:24:23 +00:00
Yaxun Liu	b122ed9181	[AMDGPU] Temporarily change constant address space from 4 to 2 for the new address space mapping Change constant address space from 4 to 2 for the new address space mapping in Clang. Differential Revision: https://reviews.llvm.org/D31771 llvm-svn: 299691	2017-04-06 19:18:36 +00:00
Stanislav Mekhanoshin	921a42314b	[AMDGPU] Translate reqd_work_group_size into amdgpu_flat_work_group_size These two attributes specify the same info in a different way. AMDGPU BE only checks the latter as a target specific attribute as opposed to language specific reqd_work_group_size. This change produces amdgpu_flat_work_group_size out of reqd_work_group_size if specified. Differential Revision: https://reviews.llvm.org/D31728 llvm-svn: 299678	2017-04-06 18:15:44 +00:00
Egor Churaev	a8d2451533	[OpenCL] Enables passing sampler initializer to function argument Reviewers: Anastasia, cfe-commits Reviewed By: Anastasia Subscribers: yaxunl, bader Differential Revision: https://reviews.llvm.org/D31594 llvm-svn: 299524	2017-04-05 09:02:56 +00:00
Jin-Gu Kang	e7cdcdea73	Preserve vec3 type. Summary: Preserve vec3 type with CodeGen option. Reviewers: Anastasia, bruno Reviewed By: Anastasia Subscribers: bruno, ahatanak, cfe-commits Differential Revision: https://reviews.llvm.org/D30810 llvm-svn: 299445	2017-04-04 16:40:25 +00:00
Egor Churaev	ba8b84d7fb	[OpenCL] Do not generate "kernel_arg_type_qual" metadata for non-pointer args Summary: "kernel_arg_type_qual" metadata should contain const/volatile/restrict tags only for pointer types to match the corresponding requirement of the OpenCL specification. OpenCL 2.0 spec 5.9.3 Kernel Object Queries: CL_KERNEL_ARG_TYPE_VOLATILE is returned if the argument is a pointer and the referenced type is declared with the volatile qualifier. [...] Similarly, CL_KERNEL_ARG_TYPE_CONST is returned if the argument is a pointer and the referenced type is declared with the restrict or const qualifier. [...] CL_KERNEL_ARG_TYPE_RESTRICT will be returned if the pointer type is marked restrict. Reviewers: Anastasia, cfe-commits Reviewed By: Anastasia Subscribers: bader, yaxunl Differential Revision: https://reviews.llvm.org/D31321 llvm-svn: 299192	2017-03-31 10:14:52 +00:00
Egor Churaev	45c26ee0bf	[OpenCL] Extended mapping of parsing CodeGen arguments Summary: Enable cl_mad_enable and cl_no_signed_zeros options when user turns on cl_unsafe_math_optimizations or cl_fast_relaxed_math options. Reviewers: Anastasia, cfe-commits Reviewed By: Anastasia Subscribers: bader, yaxunl Differential Revision: https://reviews.llvm.org/D31324 llvm-svn: 298838	2017-03-27 10:38:01 +00:00
Yaxun Liu	3464f92e23	[AMDGPU] Switch address space mapping by triple environment amdgiz For target environment amdgiz and amdgizcl (giz means Generic Is Zero), AMDGPU will use new address space mapping where generic address space is 0 and private address space is 5. The data layout is also changed correspondingly. Differential Revision: https://reviews.llvm.org/D31210 llvm-svn: 298767	2017-03-25 03:46:25 +00:00
Konstantin Zhuravlyov	9c1e310c16	Fix array sizes where address space is not yet known For variables in generic address spaces, for example: ``` unsigned char V[6442450944]; ... ``` the address space is not yet known when we get into getConstantArrayType, it is 0. AMDGCN target's address space 0 has 32 bits pointers, so when we call getPointerWidth with 0, the array size is trimmed to 32 bits, which is not right. Differential Revision: https://reviews.llvm.org/D30845 llvm-svn: 298420	2017-03-21 18:55:39 +00:00
Egor Churaev	c217f37cb6	[OpenCL] Added implicit conversion rank for overloading functions with vector data type in OpenCL Summary: I added a new rank to ImplicitConversionRank enum to resolve the function overload ambiguity with vector types. Rank of scalar types conversion is lower than vector splat. So, we can choose which function we should call. See test for more details. Reviewers: Anastasia, cfe-commits Reviewed By: Anastasia Subscribers: bader, yaxunl Differential Revision: https://reviews.llvm.org/D30816 llvm-svn: 298366	2017-03-21 12:55:55 +00:00
Matt Arsenault	bf5e3e4391	AMDGPU: Make 0 the private nullptr value We can't actually pretend that 0 is valid for address space 0. r295877 added a workaround to stop allocating user objects there, so we can use 0 as the invalid pointer. Some of the tests seemed to be using private as the non-0 null test address space, so add copies using local to make sure this is still stressed. llvm-svn: 297659	2017-03-13 19:47:53 +00:00
Yaxun Liu	4d86799219	[AMDGPU] Add builtin functions readlane ds_permute mov_dpp Differential Revision: https://reviews.llvm.org/D30551 llvm-svn: 297436	2017-03-10 01:30:46 +00:00
Konstantin Zhuravlyov	2b4917fcc9	[DebugInfo] Append extended dereferencing mechanism to variables' DIExpression for targets that support more than one address space Differential Revision: https://reviews.llvm.org/D29673 llvm-svn: 297397	2017-03-09 18:06:23 +00:00
Konstantin Zhuravlyov	d1ba16e762	[DebugInfo] Add address space when creating DIDerivedTypes Differential Revision: https://reviews.llvm.org/D29671 llvm-svn: 297321	2017-03-08 23:56:48 +00:00
Jan Vesely	9488560bb8	AMDGPU: export s_sendmsg{halt} intrinsics Differential Revision: https://reviews.llvm.org/D30366 llvm-svn: 296241	2017-02-25 04:20:24 +00:00
Jan Vesely	c255097517	AMDGPU: export l1 cache invalidation intrinsics Differential Revision: https://reviews.llvm.org/D30360 llvm-svn: 296240	2017-02-25 04:20:22 +00:00
Jan Vesely	d26dbb389f	AMDGPU: export s_waitcnt builtin Differential Revision: https://reviews.llvm.org/D30359 llvm-svn: 296239	2017-02-25 04:20:20 +00:00
Matt Arsenault	a0c6dca15b	AMDGPU: Add fmed3 half builtin llvm-svn: 295874	2017-02-22 20:55:59 +00:00
Jan Vesely	a6f369c727	[OpenCL] r600 needs OpenCL kernel calling convention Differential Revision: https://reviews.llvm.org/D30236 llvm-svn: 295843	2017-02-22 15:01:42 +00:00
Anastasia Stulova	58984e7087	[OpenCL] Correct ndrange_t implementation Removed ndrange_t as Clang builtin type and added as a struct type in the OpenCL header. Use type name to do the Sema checking in enqueue_kernel and modify IR generation accordingly. Review: D28058 Patch by Dmitry Borisenkov! llvm-svn: 295311	2017-02-16 12:27:47 +00:00
Matt Arsenault	77ce553891	AMDGPU: Add a test checking alignments of emitted globals/allocas Make sure the spec required type alignments are used in preparation for a possible change which may break this. llvm-svn: 294278	2017-02-07 04:28:02 +00:00
Matt Arsenault	a274b209f5	AMDGPU: Add builtin for fmed3 intrinsic llvm-svn: 293600	2017-01-31 03:42:07 +00:00
Anastasia Stulova	af0a7bbbe2	[OpenCL] Add missing address spaces in IR generation of blocks Modify ObjC blocks impl wrt address spaces as follows: - keep default private address space for blocks generated as local variables (with captures); - add global address space for global block literals (no captures); - make the block invoke function and enqueue_kernel prototype with the generic AS block pointer parameter to accommodate both private and global AS cases from above; - add block handling into default AS because it's implemented as a special pointer type (BlockPointer) in the frontend and therefore it is used as a pointer everywhere. This is also needed to accommodate both private and global AS blocks for the two cases above. - removes ObjC RT specific symbols (NSConcreteStackBlock and NSConcreteGlobalBlock) in the OpenCL mode. Review: https://reviews.llvm.org/D28814 llvm-svn: 293286	2017-01-27 15:11:34 +00:00
Matt Arsenault	09cca093a3	AMDGPU: Update for changed subtarget feature name llvm-svn: 292838	2017-01-23 22:31:14 +00:00
Matt Arsenault	24b5ae4497	AMDGPU: Add builtin for getreg intrinsic llvm-svn: 292636	2017-01-20 19:24:22 +00:00
Egor Churaev	28f00aab73	[OpenCL] Align fake address space map with the SPIR target maps. Summary: We compile user opencl kernel code with spir triple. But built-ins are written in OpenCL and we compile it with triple x86_64 to be able to use x86 intrinsics. And we need address spaces to match in both cases. So, we change fake address space map in OpenCL for matching with spir. On CPU address spaces are not really important but we'd like to preserve address space information in order to perform optimizations relying on this info like enhanced alias analysis. Reviewers: pekka.jaaskelainen, Anastasia Subscribers: pekka.jaaskelainen, yaxunl, bader, cfe-commits Differential Revision: https://reviews.llvm.org/D28048 llvm-svn: 290436	2016-12-23 16:11:25 +00:00
Egor Churaev	89831421af	Fix problems in "[OpenCL] Enabling the usage of CLK_NULL_QUEUE as compare operand." Summary: Fixed warnings in commit: https://reviews.llvm.org/rL290171 Reviewers: djasper, Anastasia Subscribers: yaxunl, cfe-commits, bader Differential Revision: https://reviews.llvm.org/D27981 llvm-svn: 290431	2016-12-23 14:55:49 +00:00
Chandler Carruth	fcd33149b4	Cleanup the handling of noinline function attributes, -fno-inline, -fno-inline-functions, -O0, and optnone. These were really, really tangled together: - We used the noinline LLVM attribute for -fno-inline - But not for -fno-inline-functions (breaking LTO) - But we did use it for -finline-hint-functions (yay, LTO is happy!) - But we didn't for -O0 (LTO is sad yet again...) - We had weird structuring of CodeGenOpts with both an inlining enumeration and a boolean. They interacted in weird ways and needlessly. - A lot of set smashing went on with setting these, and then got worse when we considered optnone and other inlining-affecting attributes. - A bunch of inline affecting attributes were managed in a completely different place from -fno-inline. - Even with -fno-inline we failed to put the LLVM noinline attribute onto many generated function definitions because they didn't show up as AST-level functions. - If you passed -O0 but -finline-functions we would run the normal inliner pass in LLVM despite it being in the O0 pipeline, which really doesn't make much sense. - Lastly, we used things like '-fno-inline' to manipulate the pass pipeline which forced the pass pipeline to be much more parameterizable than it really needs to be. Instead we can just use the optimization level to select a pipeline and control the rest via attributes. Sadly, this causes a bunch of churn in tests because we don't run the optimizer in the tests and check the contents of attribute sets. It would be awesome if attribute sets were a bit more FileCheck friendly, but oh well. I think this is a significant improvement and should remove the semantic need to change what inliner pass we run in order to comply with the requested inlining semantics by relying completely on attributes. It also cleans up the optnone and related handling a bit. 
One unfortunate aspect of this is that for generating alwaysinline routines like those in OpenMP we end up removing noinline and then adding alwaysinline. I tried a bunch of other approaches, but because we recompute function attributes from scratch and don't have a declaration here I couldn't find anything substantially cleaner than this. Differential Revision: https://reviews.llvm.org/D28053 llvm-svn: 290398	2016-12-23 01:24:49 +00:00
George Burgess IV	e37633713d	Add the alloc_size attribute to clang, attempt 2. This is a recommit of r290149, which was reverted in r290169 due to msan failures. msan was failing because we were calling `isMostDerivedAnUnsizedArray` on an invalid designator, which caused us to read uninitialized memory. To fix this, the logic of the caller of said function was simplified, and we now have a `!Invalid` assert in `isMostDerivedAnUnsizedArray`, so we can catch this particular bug more easily in the future. Fingers crossed that this patch sticks this time. :) Original commit message: This patch does three things: - Gives us the alloc_size attribute in clang, which lets us infer the number of bytes handed back to us by malloc/realloc/calloc/any user functions that act in a similar manner. - Teaches our constexpr evaluator that evaluating some `const` variables is OK sometimes. This is why we have a change in test/SemaCXX/constant-expression-cxx11.cpp and other seemingly unrelated tests. Richard Smith okay'ed this idea some time ago in person. - Uniques some Blocks in CodeGen, which was reviewed separately at D26410. Lack of uniquing only really shows up as a problem when combined with our new eagerness in the face of const. llvm-svn: 290297	2016-12-22 02:50:20 +00:00
Daniel Jasper	9068938eb0	Revert "[OpenCL] Enabling the usage of CLK_NULL_QUEUE as compare operand." This reverts commit r290171. It triggers a bunch of warnings, because the new enumerator isn't handled in all switches. We want a warning-free build. Replied on the commit with more details. llvm-svn: 290173	2016-12-20 10:05:04 +00:00
Egor Churaev	67c3f3ec68	[OpenCL] Enabling the usage of CLK_NULL_QUEUE as compare operand. Summary: Enabling the comparison of CLK_NULL_QUEUE to variable of type queue_t. Reviewers: Anastasia Subscribers: cfe-commits, yaxunl, bader Differential Revision: https://reviews.llvm.org/D27569 llvm-svn: 290171	2016-12-20 09:15:21 +00:00
Chandler Carruth	d7738fe6ad	Revert r290149: Add the alloc_size attribute to clang. This commit fails MSan when running test/CodeGen/object-size.c in a confusing way. After some discussion with George, it isn't really clear what is going on here. We can make the MSan failure go away by testing for the invalid bit, but why things are invalid isn't clear. And yet, other code in the surrounding area is doing precisely this and testing for invalid. George is going to take a closer look at this to better understand the nature of the failure and recommit it, for now backing it out to clean up MSan builds. llvm-svn: 290169	2016-12-20 08:28:19 +00:00
George Burgess IV	a747027bc6	Add the alloc_size attribute to clang. This patch does three things: - Gives us the alloc_size attribute in clang, which lets us infer the number of bytes handed back to us by malloc/realloc/calloc/any user functions that act in a similar manner. - Teaches our constexpr evaluator that evaluating some `const` variables is OK sometimes. This is why we have a change in test/SemaCXX/constant-expression-cxx11.cpp and other seemingly unrelated tests. Richard Smith okay'ed this idea some time ago in person. - Uniques some Blocks in CodeGen, which was reviewed separately at D26410. Lack of uniquing only really shows up as a problem when combined with our new eagerness in the face of const. Differential Revision: https://reviews.llvm.org/D14274 llvm-svn: 290149	2016-12-20 01:05:42 +00:00
Yaxun Liu	5b74665a41	Recommit r289979 [OpenCL] Allow disabling types and declarations associated with extensions Fixed undefined behavior due to cast integer to bool in initializer list. llvm-svn: 290056	2016-12-18 05:18:55 +00:00
Yaxun Liu	35f6d66b0d	Revert r289979 due to regressions llvm-svn: 289991	2016-12-16 21:23:55 +00:00

548 Commits