llvm-project

Commit Graph

Author	SHA1	Message	Date
Matt Arsenault	c64ca93053	clang: Treat ieee mode as the default for denormal-fp-math The IR hasn't switched the default yet, so explicitly add the ieee attributes. I'm still not really sure how the target default denormal mode should interact with -fno-unsafe-math-optimizations. The target may have selected the default mode to be non-IEEE based on the flags or based on its true behavior, but we don't know which is the case. Since the only users of a non-IEEE mode without a flag still support IEEE mode, just reset to IEEE.	2020-03-04 23:34:02 -05:00
Sjoerd Meijer	4e363563fa	Revert "[Driver] Default to -fno-common for all targets" This reverts commit `0a9fc9233e`. Going to look at the asan failures. I find the failures in the test suite weird, because they look like compile time test and I don't understand how that can be failing, but will have a brief look at that too.	2020-03-03 10:00:36 +00:00
Sjoerd Meijer	0a9fc9233e	[Driver] Default to -fno-common for all targets This makes -fno-common the default for all targets because this has performance and code-size benefits and is more language conforming for C code. Additionally, GCC10 also defaults to -fno-common and so we get consistent behaviour with GCC. With this change, C code that uses tentative definitions as definitions of a variable in multiple translation units will trigger multiple-definition linker errors. Generally, this occurs when the use of the extern keyword is neglected in the declaration of a variable in a header file. In some cases, no specific translation unit provides a definition of the variable. The previous behavior can be restored by specifying -fcommon. As GCC has switched already, we benefit from applications already being ported and existing documentation how to do this. For example: - https://gcc.gnu.org/gcc-10/porting_to.html - https://wiki.gentoo.org/wiki/Gcc_10_porting_notes/fno_common Differential revision: https://reviews.llvm.org/D75056	2020-03-03 09:15:07 +00:00
Yaxun (Sam) Liu	a57d9652a0	Make __builtin_amdgcn_dispatch_ptr dereferenceable and align at 4 Differential Revision: https://reviews.llvm.org/D75028	2020-02-25 13:58:20 -05:00
Yaxun (Sam) Liu	fb44b9db95	[OpenCL][CUDA][HIP][SYCL] Add norecurse norecurse function attr indicates the function is not called recursively directly or indirectly. Add norecurse to OpenCL functions, SYCL functions in device compilation and CUDA/HIP kernels. Although there is LLVM pass adding norecurse to functions, it only works for whole-program compilation. Also FE adding norecurse can make that pass run faster since functions with norecurse do not need to be checked again. Differential Revision: https://reviews.llvm.org/D73651	2020-02-16 20:41:00 -05:00
Konstantin Pyzhov	987aa3435f	Corrected clang amdgpu-features.cl test for `6d614a82a4` (AMDGPU MFMA built-ins) Differential Revision: https://reviews.llvm.org/D72723	2020-01-28 05:41:42 -05:00
Konstantin Pyzhov	ac9b2a6297	Add missing clang tests for `6d614a82a4` (AMDGPU MFMA built-ins) Differential Revision: https://reviews.llvm.org/D72723	2020-01-28 04:41:21 -05:00
Konstantin Pyzhov	6d614a82a4	Summary: This CL adds clang declarations of built-in functions for AMDGPU MFMA intrinsics and instructions. OpenCL tests for new built-ins are included. Differential Revision: https://reviews.llvm.org/D72723	2020-01-28 03:51:27 -05:00
Matt Arsenault	a4451d88ee	Consolidate internal denormal flushing controls Currently there are 4 different mechanisms for controlling denormal flushing behavior, and about as many equivalent frontend controls. - AMDGPU uses the fp32-denormals and fp64-f16-denormals subtarget features - NVPTX uses the nvptx-f32ftz attribute - ARM directly uses the denormal-fp-math attribute - Other targets indirectly use denormal-fp-math in one DAGCombine - cl-denorms-are-zero has a corresponding denorms-are-zero attribute AMDGPU wants a distinct control for f32 flushing from f16/f64, and as far as I can tell the same is true for NVPTX (based on the attribute name). Work on consolidating these into the denormal-fp-math attribute, and a new type specific denormal-fp-math-f32 variant. Only ARM seems to support the two different flush modes, so this is overkill for the other use cases. Ideally we would error on the unsupported positive-zero mode on other targets from somewhere. Move the logic for selecting the flush mode into the compiler driver, instead of handling it in cc1. denormal-fp-math/denormal-fp-math-f32 are now both cc1 flags, but denormal-fp-math-f32 is not yet exposed as a user flag. -cl-denorms-are-zero, -fcuda-flush-denormals-to-zero and -fno-cuda-flush-denormals-to-zero will be mapped to -fp-denormal-math-f32=ieee or preserve-sign rather than the old attributes. Stop emitting the denorms-are-zero attribute for the OpenCL flag. It has no in-tree users. The meaning would also be target dependent, such as the AMDGPU choice to treat this as only meaning allow flushing of f32 and not f16 or f64. The naming is also potentially confusing, since DAZ in other contexts refers to instructions implicitly treating input denormals as zero, not necessarily flushing output denormals to zero. This also does not attempt to change the behavior for the current attribute. The LangRef now states that the default is ieee behavior, but this is inaccurate for the current implementation. The clang handling is slightly hacky to avoid touching the existing denormal-fp-math uses. Fixing this will be left for a future patch. AMDGPU is still using the subtarget feature to control the denormal mode, but the new attribute are now emitted. A future change will switch this and remove the subtarget features.	2020-01-17 20:09:53 -05:00
Matt Arsenault	9b549f26fa	AMDGPU: Update clang test	2020-01-16 18:10:29 -05:00
Sven van Haastregt	6713670b17	[OpenCL] Fix mangling of single-overload builtins Commit `9a8d477a0e` ("[OpenCL] Add builtin function attribute handling", 2019-11-05) stopped Clang from mangling single-overload builtins, which is incorrect.	2019-12-03 11:09:16 +00:00
Sven van Haastregt	9a8d477a0e	[OpenCL] Add builtin function attribute handling Add handling for the "pure", "const" and "convergent" function attributes for OpenCL builtin functions. Patch by Pierre Gondois and Sven van Haastregt. Differential Revision: https://reviews.llvm.org/D64319	2019-11-05 10:26:47 +00:00
Matt Arsenault	281f2e2c37	AMDGPU: Add builtins for is_shared/is_private llvm-svn: 371010	2019-09-05 03:00:43 +00:00
Matt Arsenault	eac783a900	AMDGPU: Always emit amdgpu-flat-work-group-size The backend default maximum should be the hardware maximum, so the frontend should set the implementation defined default maximum. llvm-svn: 370101	2019-08-27 19:25:40 +00:00
Matt Arsenault	acd0a53c02	Builtins: Start adding half versions of math builtins The implementation of the OpenCL builtin currently library uses 2 different hacks to get to the corresponding IR intrinsics from the source. This will allow removal of those. This is the set that is currently used (minus a few vector ones). llvm-svn: 367973	2019-08-06 03:28:37 +00:00
Anastasia Stulova	ab4a5d14b5	[OpenCL] Fix vector literal test broken in rL367675. Avoid checking alignment unnecessary that is not portable among targets. llvm-svn: 367823	2019-08-05 09:50:28 +00:00
Tim Northover	a009a60a91	IR: print value numbers for unnamed function arguments For consistency with normal instructions and clarity when reading IR, it's best to print the %0, %1, ... names of function arguments in definitions. Also modifies the parser to accept IR in that form for obvious reasons. llvm-svn: 367755	2019-08-03 14:28:34 +00:00
Anastasia Stulova	8d99a5c0e6	[OpenCL] Allow OpenCL C style vector initialization in C++ Allow creating vector literals from other vectors. float4 a = (float4)(1.0f, 2.0f, 3.0f, 4.0f); float4 v = (float4)(a.s23, a.s01); Differential revision: https://reviews.llvm.org/D65286 llvm-svn: 367675	2019-08-02 11:19:35 +00:00
Matt Arsenault	64d7af09f5	AMDGPU: Add missing builtin declarations llvm-svn: 367431	2019-07-31 14:03:05 +00:00
Anastasia Stulova	88ed70e247	[OpenCL] Rename lang mode flag for C++ mode Rename lang mode flag to -cl-std=clc++/-cl-std=CLC++ or -std=clc++/-std=CLC++. This aligns with OpenCL C conversion and removes ambiguity with OpenCL C++. Differential Revision: https://reviews.llvm.org/D65102 llvm-svn: 367008	2019-07-25 11:04:29 +00:00
Christudasan Devadasan	8c5e6fa657	Updated the signature for some stack related intrinsics (CLANG) Modified the intrinsics int_addressofreturnaddress, int_frameaddress & int_sponentry. This commit depends on the changes in rL366679 Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D64563 llvm-svn: 366683	2019-07-22 12:50:30 +00:00
Matt Arsenault	e56865d40c	AMDGPU: Add some missing builtins llvm-svn: 366286	2019-07-17 00:01:03 +00:00
Neil Hickey	8ece3b6719	[OpenCL] Fixing sampler initialisations for C++ mode. Allow conversions between integer and sampler type. Differential Revision: https://reviews.llvm.org/D64791 llvm-svn: 366212	2019-07-16 14:57:32 +00:00
Vyacheslav Zakharin	de811d1f51	[clang] Preserve names of addrspacecast'ed values. Differential Revision: https://reviews.llvm.org/D63846 llvm-svn: 365666	2019-07-10 17:10:05 +00:00
Christudasan Devadasan	18ba9d6077	[AMDGPU] Increased the number of implicit argument bytes for both OpenCL and HIP (CLANG). To enable a new implicit kernel argument, increased the number of argument bytes from 48 to 56. Reviewed By: yaxunl Differential Revision: https://reviews.llvm.org/D63756 llvm-svn: 365643	2019-07-10 15:10:08 +00:00
Reid Kleckner	9b28d9c331	Use the Itanium C++ ABI for the pipe_builtin.cl test Certain OpenCL constructs cannot yet be mangled in the MS C++ ABI. Add a FIXME for it if anyone cares to implement it. llvm-svn: 365557	2019-07-09 21:02:06 +00:00
Stanislav Mekhanoshin	0cfd75a07d	[AMDGPU] gfx908 clang target Differential Revision: https://reviews.llvm.org/D64430 llvm-svn: 365528	2019-07-09 18:19:00 +00:00
Marco Antognini	b00d5f732c	[OpenCL][Sema] Fix builtin rewriting This patch ensures built-in functions are rewritten using the proper parent declaration. Existing tests are modified to run in C++ mode to ensure the functionality works also with C++ for OpenCL while not increasing the testing runtime. llvm-svn: 365499	2019-07-09 15:04:23 +00:00
Brian Homerding	e6ba22542f	Add nofree attribute to CodeGenOpenCL/convergent.cl test The revision at https://reviews.llvm.org/rL365336 added inference of the nofree attribute. This revision updates the test to reflect this. Differential Revision: https://reviews.llvm.org/D49165 llvm-svn: 365341	2019-07-08 16:24:10 +00:00
Matt Arsenault	5495f78165	AMDGPU: Fix missing declaration for mbcnt builtins llvm-svn: 364251	2019-06-24 23:34:06 +00:00
Leonard Chan	f336eb344c	[clang][NewPM] Add RUNS for tests that produce slightly different IR under new PM For CodeGenOpenCL/convergent.cl, the new PM produced a slightly different for loop, but this still checks for no loop unrolling as intended. This is committed separately from D63174. llvm-svn: 364202	2019-06-24 16:49:18 +00:00
Matt Arsenault	fc84925208	AMDGPU: Fix target builtins for gfx10 This wasn't setting some of the features from older generations. llvm-svn: 364123	2019-06-22 01:30:00 +00:00
Matt Arsenault	bcdbc9a115	AMDGPU: Add DS GWS sema builtins llvm-svn: 363986	2019-06-20 21:33:57 +00:00
Matt Arsenault	f46f41411b	Reapply "r363684: AMDGPU: Add GWS instruction builtins" llvm-svn: 363871	2019-06-19 19:55:49 +00:00
Simon Pilgrim	6828bc5614	Revert rL363684 : AMDGPU: Add GWS instruction builtins ........ Depends on rL363678 which was reverted at rL363797 llvm-svn: 363824	2019-06-19 15:35:45 +00:00
Matt Arsenault	2acc717627	AMDGPU: Add GWS instruction builtins llvm-svn: 363684	2019-06-18 14:10:01 +00:00
Stanislav Mekhanoshin	cafccd7a53	[AMDGPU] gfx1011/gfx1012 clang support Differential Revision: https://reviews.llvm.org/D63308 llvm-svn: 363345	2019-06-14 00:33:59 +00:00
Stanislav Mekhanoshin	8a8131a3f6	[AMDGPU] gfx1010 wave32 clang support Differential Revision: https://reviews.llvm.org/D63209 llvm-svn: 363341	2019-06-13 23:47:59 +00:00
Tim Northover	c46827c7ed	LLVM IR: Generate new-style byval-with-Type from Clang LLVM IR recently added a Type parameter to the byval Attribute, so that when pointers become opaque and no longer have an element type the information will still be present in IR. For now the Type parameter is optional (which is why Clang didn't need this change at the time), but it will become mandatory soon. llvm-svn: 362652	2019-06-05 21:12:14 +00:00
Tim Northover	fcb00d4aec	Reapply: LLVM IR: update Clang tests for byval being a typed attribute. Since byval is now a typed attribute it gets sorted slightly differently by LLVM when the order of attributes is being canonicalized. This updates the few Clang tests that depend on the old order. Clang patch is unchanged. llvm-svn: 362129	2019-05-30 18:49:19 +00:00
Anastasia Stulova	f61b5481fd	[OpenCL] Fix OpenCL/SPIR version metadata in C++ mode. C++ is derived from OpenCL v2.0 therefore set the versions identically. Differential Revision: https://reviews.llvm.org/D62657 llvm-svn: 362102	2019-05-30 15:18:07 +00:00
Sven van Haastregt	ce127bb60e	[OpenCL] Support logical vector operators in C++ mode Support logical operators on vectors in C++ for OpenCL mode, to preserve backwards compatibility with OpenCL C. Differential Revision: https://reviews.llvm.org/D62588 llvm-svn: 362087	2019-05-30 12:35:19 +00:00
Tim Northover	4b281755ae	Revert "LLVM IR: update Clang tests for byval being a typed attribute." The underlying LLVM change couldn't cope with llvm-link and broke LTO builds. llvm-svn: 362028	2019-05-29 20:45:32 +00:00
Tim Northover	45e8cc6639	LLVM IR: update Clang tests for byval being a typed attribute. Since byval is now a typed attribute it gets sorted slightly differently by LLVM when the order of attributes is being canonicalized. This updates the few Clang tests that depend on the old order. llvm-svn: 362013	2019-05-29 19:13:29 +00:00
Yaxun Liu	a53d48b7f4	[OpenCL] Fix file-scope const sampler variable for 2.0 OpenCL spec v2.0 s6.13.14: Samplers can also be declared as global constants in the program source using the following syntax. const sampler_t <sampler name> = <value> This works fine for OpenCL 1.2 but fails for 2.0, because clang duduces address space of file-scope const sampler variable to be in global address space whereas spec v2.0 s6.9.b forbids file-scope sampler variable to be in global address space. The fix is not to deduce address space for file-scope sampler variables. Differential Revision: https://reviews.llvm.org/D62197 llvm-svn: 361757	2019-05-27 11:19:07 +00:00
Kevin Petit	aa7754cc90	[OpenCL] Add support for the cl_arm_integer_dot_product extensions The specification is available in the Khronos OpenCL registry: https://www.khronos.org/registry/OpenCL/extensions/arm/cl_arm_integer_dot_product.txt Signed-off-by: Kevin Petit <kevin.petit@arm.com> llvm-svn: 361641	2019-05-24 14:53:52 +00:00
Sven van Haastregt	151d4f88dc	[NFC] Fix line endings in OpenCL tests llvm-svn: 361004	2019-05-17 09:25:38 +00:00
Stanislav Mekhanoshin	91792f1b93	[AMDGPU] gfx1010 clang target Differential Revision: https://reviews.llvm.org/D61875 llvm-svn: 360634	2019-05-13 23:15:59 +00:00
Anastasia Stulova	5b6dda33d1	[Sema][OpenCL] Make address space conversions a bit stricter. The semantics for converting nested pointers between address spaces are not very well defined. Some conversions which do not really carry any meaning only produce warnings, and in some cases warnings hide invalid conversions, such as 'global int' to 'local float'! This patch changes the logic in checkPointerTypesForAssignment and checkAddressSpaceCast to fail properly on implicit conversions that should definitely not be permitted. We also dig deeper into the pointer types and warn on explicit conversions where the address space in a nested pointer changes, regardless of whether the address space is compatible with the corresponding pointer nesting level on the destination type. Fixes PR39674! Patch by ebevhan (Bevin Hansson)! Differential Revision: https://reviews.llvm.org/D58236 llvm-svn: 360258	2019-05-08 14:23:49 +00:00
Scott Linder	fb59fef7dc	Move setTargetAttributes after setGVProperties in SetFunctionAttributes AMDGPU currently relies on global properties being set before setTargetProperties is called. Existing targets like MIPS which rely on setTargetProperties do not rely on the current behavior, so this patch moves the call later in SetFunctionAttributes. Differential Revision: https://reviews.llvm.org/D60967 llvm-svn: 359039	2019-04-23 21:50:11 +00:00
Alexey Sotkin	1b01f9728f	[OpenCL] Re-fix invalid address space generation for clk_event_t arguments of enqueue_kernel builtin function Summary: https://reviews.llvm.org/D53809 fixed wrong address space(assert in debug build) generated for event_ret argument. But exactly the same problem exists for event_wait_list argument. This patch should fix both. Reviewers: Anastasia, yaxunl Reviewed By: Anastasia Subscribers: kristina, ebevhan, cfe-commits Differential Revision: https://reviews.llvm.org/D59985 llvm-svn: 358151	2019-04-11 06:18:17 +00:00
Stanislav Mekhanoshin	1d9f286ecb	[AMDGPU] rename vi-insts into gfx8-insts Differential Revision: https://reviews.llvm.org/D60293 llvm-svn: 357792	2019-04-05 18:25:00 +00:00
Konstantin Zhuravlyov	ec28a1dcef	AMDGPU: Add support for cross address space synchronization scopes (clang) Differential Revision: https://reviews.llvm.org/D59494 llvm-svn: 356947	2019-03-25 20:54:00 +00:00
Andrew Savonichev	76b178d949	[OpenCL] Generate 'unroll.enable' metadata for __attribute__((opencl_unroll_hint)) Summary: [OpenCL] Generate 'unroll.enable' metadata for __attribute__((opencl_unroll_hint)) For both !{!"llvm.loop.unroll.enable"} and !{!"llvm.loop.unroll.full"} the unroller will try to fully unroll a loop unless the trip count is not known at compile time. In that case for '.full' metadata no unrolling will be processed, while for '.enable' the loop will be partially unrolled with a heuristically chosen unroll factor. See: docs/LanguageExtensions.rst From https://www.khronos.org/registry/OpenCL/sdk/2.0/docs/man/xhtml/attributes-loopUnroll.html __attribute__((opencl_unroll_hint)) for (int i=0; i<2; i++) { ... } In the example above, the compiler will determine how much to unroll the loop. Before the patch for __attribute__((opencl_unroll_hint)) was generated metadata !{!"llvm.loop.unroll.full"}, which limits ability of loop unroller to decide, how much to unroll the loop. Reviewers: Anastasia, yaxunl Reviewed By: Anastasia Subscribers: zzheng, dmgreen, jdoerfert, cfe-commits, asavonic, AlexeySotkin Tags: #clang Differential Revision: https://reviews.llvm.org/D59493 llvm-svn: 356571	2019-03-20 16:43:07 +00:00
Michael Liao	3c2aadbe67	[AMDGPU] Add the missing clang change of the experimental buffer fat pointer llvm-svn: 356385	2019-03-18 18:11:37 +00:00
Konstantin Zhuravlyov	3161c89a22	AMDGPU: Fix the mapping of sub group sync scope Map memory_scope_sub_group to "wavefront" sync scope Differential Revision: https://reviews.llvm.org/D58847 llvm-svn: 355549	2019-03-06 20:54:48 +00:00
Yaxun Liu	d83c74028d	[OpenCL] Fix assertion due to blocks A recent change caused assertion in CodeGenFunction::EmitBlockCallExpr when a block is called. There is code Func = CGM.getOpenCLRuntime().getInvokeFunction(E->getCallee()); getCalleeDecl calls Expr::getReferencedDeclOfCallee, which does not handle BlockExpr and returns nullptr, which causes isa to assert. This patch fixes that. Differential Revision: https://reviews.llvm.org/D58658 llvm-svn: 354893	2019-02-26 16:20:41 +00:00
Andrew Savonichev	43fceb2727	[OpenCL] Simplify LLVM IR generated for OpenCL blocks Summary: Emit direct call of block invoke functions when possible, i.e. in case the block is not passed as a function argument. Also doing some refactoring of `CodeGenFunction::EmitBlockCallExpr()` Reviewers: Anastasia, yaxunl, svenvh Reviewed By: Anastasia Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D58388 llvm-svn: 354568	2019-02-21 11:02:10 +00:00
Alexey Bader	24fa0c18e6	[OpenCL] Change type of block pointer for OpenCL Summary: For some reason OpenCL blocks in LLVM IR are represented as function pointers. These pointers do not point to any real function and never get called. Actually they point to some structure, which in turn contains pointer to the real block invoke function. This patch changes represntation of OpenCL blocks in LLVM IR from function pointers to pointers to `%struct.__block_literal_generic`. Such representation allows to avoid unnecessary bitcasts and simplifies further processing (e.g. translation to SPIR-V ) of the module for targets which do not support function pointers. Patch by: Alexey Sotkin. Reviewers: Anastasia, yaxunl, svenvh Reviewed By: Anastasia Subscribers: alexbatashev, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D58277 llvm-svn: 354337	2019-02-19 15:19:06 +00:00
Anastasia Stulova	2c4730ded8	[OpenCL][PR40707] Allow OpenCL C types in C++ mode. Allow all OpenCL types to be parsed in C++ mode. llvm-svn: 354121	2019-02-15 12:07:57 +00:00
Scott Linder	80a1ee46d8	[AMDGPU] Require at least protected visibility for certain symbols This allows the global visibility controls to be restrictive while still populating the dynamic symbol table where required. Differential Revision: https://reviews.llvm.org/D56871 llvm-svn: 353870	2019-02-12 18:30:38 +00:00
Stanislav Mekhanoshin	1607a37308	[AMDGPU] Split dot-insts feature Differential Revision: https://reviews.llvm.org/D57972 llvm-svn: 353588	2019-02-09 00:34:41 +00:00
James Y Knight	f5f1b0e59e	[opaque pointer types] Cleanup CGBuilder's Create*GEP. Some of these functions take some extraneous arguments, e.g. EltSize, Offset, which are computable from the Type and DataLayout. Add some asserts to ensure that the computed values are consistent with the passed-in values, in preparation for eliminating the extraneous arguments. This also asserts that the Type is an Array for the calls named "Array" and a Struct for the calls named "Struct". Then, correct a couple of errors: 1. Using CreateStructGEP on an array type. (this causes the majority of the test differences, as struct GEPs are created with i32 indices, while array GEPs are created with i64 indices) 2. Passing the wrong Offset to CreateStructGEP in TargetInfo.cpp on x86-64 NACL (which uses 32-bit pointers). Differential Revision: https://reviews.llvm.org/D57766 llvm-svn: 353529	2019-02-08 15:34:12 +00:00
Anastasia Stulova	a4b1cf3282	[OpenCL] Fixed addr space manging test. Fixed typo in the Filecheck directive and changed the test to verify output correctly. Fixes PR40029! llvm-svn: 352760	2019-01-31 15:23:48 +00:00
Matt Arsenault	297afb14ec	Revert "OpenCL: Extend argument promotion rules to vector types" This reverts r348083. This was based on a misreading of the spec for printf specifiers. Also revert r343653, as without a subsequent patch, a correctly specified format for a vector will incorrectly warn. Fixes bug 40491. llvm-svn: 352539	2019-01-29 20:49:47 +00:00
Matt Arsenault	b72888647b	AMDGPU: Add ds append/consume builtins llvm-svn: 352443	2019-01-28 23:59:18 +00:00
Tim Corringham	6d5348cca5	[AMDGPU] Add interpolation builtins Summary: Added builtins for the interpolation intrinsics, and related LIT test. Reviewers: arsenm, tpr, dstuttard, #amdgpu Reviewed By: arsenm, #amdgpu Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, cfe-commits Differential Revision: https://reviews.llvm.org/D46871 llvm-svn: 352358	2019-01-28 13:50:37 +00:00
Stanislav Mekhanoshin	6332f4d0d4	[AMDGPU] Separate feature dot-insts Differential Revision: https://reviews.llvm.org/D56525 llvm-svn: 350794	2019-01-10 03:25:47 +00:00
Erich Keane	e8abbecaf7	Fix opencl test broken on windows by r350643. Windows doesn't allow common with alignment >32 bits, so these tests were broken in windows mode. This patch makes 'common' optional in these cases. Change-Id: I4d5fdd07ecdafc3570ef9b09cd816c2e5e4ed15e llvm-svn: 350645	2019-01-08 19:10:43 +00:00
Andrew Savonichev	1bf1a156d6	[OpenCL][CodeGen] Fix replacing memcpy with addrspacecast Summary: If a function argument is byval and RV is located in default or alloca address space an optimization of creating addrspacecast instead of memcpy is performed. That is not correct for OpenCL, where that can lead to a situation of address space casting from __private * to __global . See an example below: ``` typedef struct { int x; } MyStruct; void foo(MyStruct val) {} kernel void KernelOneMember(__global MyStruct x) { foo (x); } ``` for this code clang generated following IR: ... %0 = load %struct.MyStruct addrspace(1), %struct.MyStruct addrspace(1)** %x.addr, align 4 %1 = addrspacecast %struct.MyStruct addrspace(1)* %0 to %struct.MyStruct* ... So the optimization was disallowed for OpenCL if RV is located in an address space different than that of the argument (0). Reviewers: yaxunl, Anastasia Reviewed By: Anastasia Subscribers: cfe-commits, asavonic Differential Revision: https://reviews.llvm.org/D54947 llvm-svn: 348752	2018-12-10 12:03:00 +00:00
Matt Arsenault	af07de4059	OpenCL: Extend argument promotion rules to vector types The spec is ambiguous on whether vector types are allowed to be implicitly converted. The only legal context I think this can be used for OpenCL is printf, where it seems necessary. llvm-svn: 348083	2018-12-01 21:56:10 +00:00
Marco Antognini	06d9d070c7	Derive builtin return type from its definition Summary: Prior to this patch, OpenCL code such as the following would attempt to create a BranchInst with a non-bool argument: if (enqueue_kernel(get_default_queue(), 0, nd, ^(void){})) /* ... */ This patch is a follow up on a similar issue with pipe builtin operations. See commit r280800 and https://bugs.llvm.org/show_bug.cgi?id=30219. This change, while being conservative on non-builtin functions, should set the type of expressions invoking builtins to the proper type, instead of defaulting to `bool` and requiring manual overrides in Sema::CheckBuiltinFunctionCall. In addition to tests for enqueue_kernel, the tests are extended to check other OpenCL builtins. Reviewers: Anastasia, spatel, rsmith Reviewed By: Anastasia Subscribers: kristina, cfe-commits, svenvh Differential Revision: https://reviews.llvm.org/D52879 llvm-svn: 347658	2018-11-27 14:54:58 +00:00
JF Bastien	3a881e6bbc	CGDecl::emitStoresForConstant fix synthesized constant's name Summary: The name of the synthesized constants for constant initialization was using mangling for statics, which isn't generally correct and (in a yet-uncommitted patch) causes the mangler to assert out because the static ends up trying to mangle function parameters and this makes no sense. Instead, mangle to `"__const." + FunctionName + "." + DeclName`. Reviewers: rjmccall Subscribers: dexonsmith, cfe-commits Differential Revision: https://reviews.llvm.org/D54055 llvm-svn: 346915	2018-11-15 00:19:18 +00:00
Alexey Sotkin	692f12b389	[OpenCL] Fix invalid address space generation for clk_event_t Summary: Addrspace(32) was generated when putting 0 in clk_event_t * event_ret parameter for enqueue_kernel function. Patch by Viktoria Maksimova Reviewers: Anastasia, yaxunl, AlexeySotkin Reviewed By: Anastasia, AlexeySotkin Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D53809 llvm-svn: 346838	2018-11-14 09:40:05 +00:00
Andrew Savonichev	3fee351867	[OpenCL] Add support of cl_intel_device_side_avc_motion_estimation extension Summary: Documentation can be found at https://www.khronos.org/registry/OpenCL/extensions/intel/cl_intel_device_side_avc_motion_estimation.txt Patch by Kristina Bessonova Reviewers: Anastasia, yaxunl, shafik Reviewed By: Anastasia Subscribers: arphaman, sidorovd, AlexeySotkin, krisb, bader, asavonic, cfe-commits Differential Revision: https://reviews.llvm.org/D51484 llvm-svn: 346392	2018-11-08 11:25:41 +00:00
Andrew Savonichev	3b12b7e702	Revert r346326 [OpenCL] Add support of cl_intel_device_side_avc_motion_estimation This patch breaks Index/opencl-types.cl LIT test: Script: -- : 'RUN: at line 1'; stage1/bin/c-index-test -test-print-type llvm/tools/clang/test/Index/opencl-types.cl -cl-std=CL2.0 \| stage1/bin/FileCheck llvm/tools/clang/test/Index/opencl-types.cl -- Command Output (stderr): -- llvm/tools/clang/test/Index/opencl-types.cl:3:26: warning: unsupported OpenCL extension 'cl_khr_fp16' - ignoring [-Wignored-pragmas] llvm/tools/clang/test/Index/opencl-types.cl:4:26: warning: unsupported OpenCL extension 'cl_khr_fp64' - ignoring [-Wignored-pragmas] llvm/tools/clang/test/Index/opencl-types.cl:8:9: error: use of type 'double' requires cl_khr_fp64 extension to be enabled llvm/tools/clang/test/Index/opencl-types.cl:11:8: error: declaring variable of type 'half' is not allowed llvm/tools/clang/test/Index/opencl-types.cl:15:3: error: use of type 'double' requires cl_khr_fp64 extension to be enabled llvm/tools/clang/test/Index/opencl-types.cl:16:3: error: use of type 'double4' (vector of 4 'double' values) requires cl_khr_fp64 extension to be enabled llvm/tools/clang/test/Index/opencl-types.cl:26:26: warning: unsupported OpenCL extension 'cl_khr_gl_msaa_sharing' - ignoring [-Wignored-pragmas] llvm/tools/clang/test/Index/opencl-types.cl:35:44: error: use of type '__read_only image2d_msaa_t' requires cl_khr_gl_msaa_sharing extension to be enabled llvm/tools/clang/test/Index/opencl-types.cl:36:49: error: use of type '__read_only image2d_array_msaa_t' requires cl_khr_gl_msaa_sharing extension to be enabled llvm/tools/clang/test/Index/opencl-types.cl:37:49: error: use of type '__read_only image2d_msaa_depth_t' requires cl_khr_gl_msaa_sharing extension to be enabled llvm/tools/clang/test/Index/opencl-types.cl:38:54: error: use of type '__read_only image2d_array_msaa_depth_t' requires cl_khr_gl_msaa_sharing extension to be enabled llvm-svn: 346338	2018-11-07 18:34:19 +00:00
Andrew Savonichev	35dfce723c	[OpenCL] Add support of cl_intel_device_side_avc_motion_estimation extension Summary: Documentation can be found at https://www.khronos.org/registry/OpenCL/extensions/intel/cl_intel_device_side_avc_motion_estimation.txt Patch by Kristina Bessonova Reviewers: Anastasia, yaxunl, shafik Reviewed By: Anastasia Subscribers: arphaman, sidorovd, AlexeySotkin, krisb, bader, asavonic, cfe-commits Differential Revision: https://reviews.llvm.org/D51484 llvm-svn: 346326	2018-11-07 15:44:01 +00:00
Craig Topper	3113ec3dc7	[CodeGen] Update min-legal-vector width based on function argument and return types This is a continuation of my patches to inform the X86 backend about what the largest IR types are in the function so that we can restrict the backend type legalizer to prevent 512-bit vectors on SKX when -mprefer-vector-width=256 is specified if no explicit 512 bit vectors were specified by the user. This patch updates the vector width based on the argument and return types of the current function and from the types of any functions it calls. This is intended to make sure the backend type legalizer doesn't disturb any types that are required for ABI. Differential Revision: https://reviews.llvm.org/D52441 llvm-svn: 345168	2018-10-24 17:42:17 +00:00
Yaxun Liu	aae1e87f4b	AMDGPU: add __builtin_amdgcn_update_dpp Emit llvm.amdgcn.update.dpp for both __builtin_amdgcn_mov_dpp and __builtin_amdgcn_update_dpp. The first argument to llvm.amdgcn.update.dpp will be undef for __builtin_amdgcn_mov_dpp. Differential Revision: https://reviews.llvm.org/D52320 llvm-svn: 344665	2018-10-17 02:32:26 +00:00
Sven van Haastregt	a3c6b407ec	[OpenCL] Add block argument CodeGen test r326937 ("[OpenCL] Remove block invoke function from emitted block literal struct", 2018-03-07) broke block argument handling. In particular the commit was causing a crash during code generation, see the discussion in https://reviews.llvm.org/D43783 . The offending commit has just been reverted; add a test to avoid breaking this again in the future. llvm-svn: 343583	2018-10-02 13:02:27 +00:00
Sven van Haastregt	da3b632057	Revert r326937 "[OpenCL] Remove block invoke function from emitted block literal struct" This reverts r326937 as it broke block argument handling in OpenCL. See the discussion on https://reviews.llvm.org/D43783 . The next commit will add a test case that revealed the issue. llvm-svn: 343582	2018-10-02 13:02:24 +00:00
Matt Arsenault	94abc57e37	AMDGPU: Add another missing builtin llvm-svn: 339395	2018-08-09 22:18:37 +00:00
Matt Arsenault	45bc148093	AMDGPU: Fix enabling denormals by default on pre-VI targets Fast FMAF is not a sufficient condition to enable denormals. Before VI, enabling denormals caused F32 instructions to run at F64 speeds. llvm-svn: 339278	2018-08-08 17:48:37 +00:00
Scott Linder	58df0e4d2c	[DebugInfo][OpenCL] Address post-commit review for r338299 NFC refactor of code to generate debug info for OpenCL 2.X blocks. Differential Revision: https://reviews.llvm.org/D50099 llvm-svn: 339265	2018-08-08 15:56:12 +00:00
Douglas Yung	be1166a43b	Fix one hard coded value I missed in r339185. llvm-svn: 339188	2018-08-07 21:37:14 +00:00
Douglas Yung	dca675a0d8	Make test more robust by not checking hard coded debug info values, but instead check the relationships between them. llvm-svn: 339185	2018-08-07 21:22:49 +00:00
Scott Linder	f8b3df4dec	[OpenCL] Restore r338899 (reverted in r338904), fixing stack-use-after-return Always emit alloca in entry block for enqueue_kernel builtin. Ensures the statically sized alloca is not converted to DYNAMIC_STACKALLOC later because it is not in the entry block. llvm-svn: 339150	2018-08-07 15:52:49 +00:00
Matt Arsenault	31c895ecdf	AMDGPU: Add builtin for s_dcache_wb llvm-svn: 339110	2018-08-07 07:49:13 +00:00
Matt Arsenault	24f3924709	AMDGPU: Add builtin for s_dcache_inv_vol llvm-svn: 339109	2018-08-07 07:49:04 +00:00
Vlad Tsyrklevich	c7d3d34b98	Revert "[OpenCL] Always emit alloca in entry block for enqueue_kernel builtin" This reverts commit r338899, it was causing ASan test failures on sanitizer-x86_64-linux-fast. llvm-svn: 338904	2018-08-03 17:47:58 +00:00
Scott Linder	91f578467c	[OpenCL] Always emit alloca in entry block for enqueue_kernel builtin Ensures the statically sized alloca is not converted to DYNAMIC_STACKALLOC later because it is not in the entry block. Differential Revision: https://reviews.llvm.org/D50104 llvm-svn: 338899	2018-08-03 15:50:52 +00:00
Matt Arsenault	e3d81572c1	AMDGPU: Fix missing declaration of queue ptr builtin llvm-svn: 338754	2018-08-02 18:24:55 +00:00
Matt Arsenault	c65f966d76	Try to make builtin address space declarations not useless The way address space declarations for builtins currently work is nearly useless. The code assumes the address spaces used for builtins is a confusingly named "target address space" from user code using __attribute__((address_space(N))) that matches the builtin declaration. There's no way to use this to declare a builtin that returns a language specific address space. The terminology used is highly cofusing since it has nothing to do with the the address space selected by the target to use for a language address space. This feature is essentially unused as-is. AMDGPU and NVPTX are the only in-tree targets attempting to use this. The AMDGPU builtins certainly do not behave as intended (i.e. all of the builtins returning pointers can never compile because the numbered address space never matches the expected named address space). The NVPTX builtins are missing tests for some, and the others seem to rely on an implicit addrspacecast. Change the used address space for builtins based on a target hook to allow using a language address space for a builtin. This allows the same builtin declaration to be used for multiple languages with similarly purposed address spaces (e.g. the same AMDGPU builtin can be used in OpenCL and CUDA even though the constant address spaces are arbitarily different). This breaks the possibility of using arbitrary numbered address spaces alongside the named address spaces for builtins. If this is an issue we probably need to introduce another builtin declaration character to distinguish language address spaces from so-called "target address spaces". llvm-svn: 338707	2018-08-02 12:14:28 +00:00
Konstantin Zhuravlyov	9057546c5b	AMDGPU: Add clamp bit to dot builtins Differential Revision: https://reviews.llvm.org/D50011 llvm-svn: 338471	2018-08-01 01:32:21 +00:00
Scott Linder	2b5cf04180	[DebugInfo][OpenCL] Generate correct block literal debug info for OpenCL OpenCL block literal structs have different fields which are now correctly identified in the debug info. Differential Revision: https://reviews.llvm.org/D49930 llvm-svn: 338299	2018-07-30 20:31:11 +00:00
JF Bastien	9aab85a6a0	CodeGen: specify alignment + inbounds for automatic variable initialization Summary: Automatic variable initialization was generating default-aligned stores (which are deprecated) instead of using the known alignment from the alloca. Further, they didn't specify inbounds. Subscribers: dexonsmith, cfe-commits Differential Revision: https://reviews.llvm.org/D49209 llvm-svn: 337041	2018-07-13 20:33:23 +00:00
Daniil Fukalov	1b14a3ad3d	[AMDGPU] fixes for lds f32 builtins 1. added restrictions to memory scope, order and volatile parameters 2. added custom processing for these builtins - currently is not used code, needed to switch off GCCBuiltin link to the builtins (ongoing change to llvm tree) 3. builtins renamed as requested Differential Revision: https://reviews.llvm.org/D43281 llvm-svn: 332848	2018-05-21 16:18:07 +00:00
Sanjay Patel	cda77b30e5	[OpenCL] make test independent of optimizer There shouldn't be any tests that run the entire optimizer here, but the last test in this file is definitely going to break with a change in LLVM IR canonicalization. Change that part to check the unoptimized IR because that's the real intent of this file. llvm-svn: 332473	2018-05-16 14:38:07 +00:00
Yaxun Liu	3cab24aa4f	[OpenCL] Fix typos in emitted enqueue kernel function names Two typos: vaarg => vararg get_kernel_preferred_work_group_multiple => get_kernel_preferred_work_group_size_multiple Differential Revision: https://reviews.llvm.org/D46601 llvm-svn: 331895	2018-05-09 17:07:06 +00:00
Anastasia Stulova	59055b94af	[OpenCL] Add constant address space to __func__ in AST. Added string literal helper function to obtain the type attributed by a constant address space. Also fixed predefind __func__ expr to use the helper to constract the string literal correctly. Differential Revision: https://reviews.llvm.org/D46049 llvm-svn: 331877	2018-05-09 13:23:26 +00:00

1 2 3 4 5 ...

463 Commits