llvm-project

Commit Graph

Author	SHA1	Message	Date
Yaxun (Sam) Liu	3f4b5893ef	[AMDGPU] Add option -munsafe-fp-atomics Add an option -munsafe-fp-atomics for AMDGPU target. When enabled, clang adds function attribute "amdgpu-unsafe-fp-atomics" to any functions for amdgpu target. This allows amdgpu backend to use unsafe fp atomic instructions in these functions. Differential Revision: https://reviews.llvm.org/D91546	2020-11-16 21:52:12 -05:00
Tim Renouf	89d41f3a2b	[AMDGPU] Add gfx1033 target Differential Revision: https://reviews.llvm.org/D90447 Change-Id: If2650fc7f31bbdd49c76e74a9ca8e3734d769761	2020-11-03 16:27:48 +00:00
Tim Renouf	ee3e642627	[AMDGPU] Add gfx90c target This differentiates the Ryzen 4000/4300/4500/4700 series APUs that were previously included in gfx909. Differential Revision: https://reviews.llvm.org/D90419 Change-Id: Ia901a7157eb2f73ccd9f25dbacec38427312377d	2020-11-03 16:27:43 +00:00
Stanislav Mekhanoshin	d1beb95d12	[AMDGPU] gfx1032 target Differential Revision: https://reviews.llvm.org/D89487	2020-10-15 12:41:18 -07:00
Tim Renouf	666ef0db20	[AMDGPU] Add gfx602, gfx705, gfx805 targets At AMD, in an internal audit of our code, we found some corner cases where we were not quite differentiating targets enough for some old hardware. This commit is part of fixing that by adding three new targets: * The "Oland" and "Hainan" variants of gfx601 are now split out into gfx602. LLPC (in the GPUOpen driver) and other front-ends could use that to avoid using the shaderZExport workaround on gfx602. * One variant of gfx703 is now split out into gfx705. LLPC and other front-ends could use that to avoid using the shaderSpiCsRegAllocFragmentation workaround on gfx705. * The "TongaPro" variant of gfx802 is now split out into gfx805. TongaPro has a faster 64-bit shift than its former friends in gfx802, and a subtarget feature could be set up for that to take advantage of it. This commit does not make that change; it just adds the target. V2: Add clang changes. Put TargetParser list in order. V3: AMDGCNGPUs table in TargetParser.cpp needs to be in GPUKind order, so fix the GPUKind order. Differential Revision: https://reviews.llvm.org/D88916 Change-Id: Ia901a7157eb2f73ccd9f25dbacec38427312377d	2020-10-10 17:22:22 +01:00
Yaxun (Sam) Liu	36501b180a	Emit predefined macro for wavefront size for amdgcn Also fix the issue of multiple -m[no-]wavefrontsize64 options to make the last one wins. Differential Revision: https://reviews.llvm.org/D88370	2020-10-02 10:17:21 -04:00
Yaxun (Sam) Liu	7546b29e76	[HIP] Support target id by --offload-arch This patch introduces support of target id by -offload-arch. Differential Revision: https://reviews.llvm.org/D60620	2020-08-18 23:43:53 -04:00
Stanislav Mekhanoshin	ea7d0e2996	[AMDGPU] gfx1031 target Differential Revision: https://reviews.llvm.org/D85337	2020-08-05 12:36:26 -07:00
Alexey Bader	8d27be8dba	[OpenCL] Add global_device and global_host address spaces This patch introduces 2 new address spaces in OpenCL: global_device and global_host which are a subset of a global address space, so the address space scheme will be looking like: ``` generic->global->host ->device ->private ->local constant ``` Justification: USM allocations may be associated with both host and device memory. We want to give users a way to tell the compiler the allocation type of a USM pointer for optimization purposes. (Link to the Unified Shared Memory extension: https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/USM/cl_intel_unified_shared_memory.asciidoc) Before this patch USM pointer could be only in opencl_global address space, hence a device backend can't tell if a particular pointer points to host or device memory. On FPGAs at least we can generate more efficient hardware code if the user tells us where the pointer can point - being able to distinguish between these types of pointers at compile time allows us to instantiate simpler load-store units to perform memory transactions. Patch by Dmitry Sidorov. Reviewed By: Anastasia Differential Revision: https://reviews.llvm.org/D82174	2020-07-29 17:24:53 +03:00
Stanislav Mekhanoshin	9ee272f13d	[AMDGPU] Add gfx1030 target Differential Revision: https://reviews.llvm.org/D81886	2020-06-15 16:18:05 -07:00
Saiyedul Islam	4022bc2a6c	[OpenMP][AMDGCN] Support OpenMP offloading for AMDGCN architecture - Part 2 Summary: New file include to support platform dependent grid constants. It will be used by clang, libomptarget plugins, and deviceRTLs to access constant values consistently and with fast access in the deviceRTLs. Originally authored by Greg Rodgers (@gregrodgers). Reviewers: arsenm, sameerds, jdoerfert, yaxunl, b-sumner, scchan, JonChesterfield Reviewed By: arsenm Subscribers: llvm-commits, pdhaliwal, jholewinski, jvesely, wdng, nhaehnle, guansong, kerbowa, sstefan1, cfe-commits, ronlieb, gregrodgers Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D80917	2020-06-10 18:09:59 +00:00
Michael Liao	86e3b735cd	[hip] Claim builtin type `__float128` supported if the host target supports it. Reviewers: tra, yaxunl Subscribers: jvesely, nhaehnle, kerbowa, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D78513	2020-04-21 15:56:40 -04:00
Yaxun (Sam) Liu	a46e7d7a5f	[AMDGPU] Allow AGPR in inline asm Differential Revision: https://reviews.llvm.org/D77329	2020-04-03 09:08:13 -04:00
Matt Arsenault	ce2258c1cd	clang/AMDGPU: Stop setting old denormal subtarget features	2020-04-02 17:17:12 -04:00
Matt Arsenault	a3c814d234	Separately track input and output denormal mode AMDGPU and x86 at least both have separate controls for whether denormal results are flushed on output, and for whether denormals are implicitly treated as 0 as an input. The current DAGCombiner use only really cares about the input treatment of denormals.	2020-02-04 12:59:21 -05:00
Konstantin Pyzhov	6d614a82a4	Summary: This CL adds clang declarations of built-in functions for AMDGPU MFMA intrinsics and instructions. OpenCL tests for new built-ins are included. Differential Revision: https://reviews.llvm.org/D72723	2020-01-28 03:51:27 -05:00
Matt Arsenault	a4451d88ee	Consolidate internal denormal flushing controls Currently there are 4 different mechanisms for controlling denormal flushing behavior, and about as many equivalent frontend controls. - AMDGPU uses the fp32-denormals and fp64-f16-denormals subtarget features - NVPTX uses the nvptx-f32ftz attribute - ARM directly uses the denormal-fp-math attribute - Other targets indirectly use denormal-fp-math in one DAGCombine - cl-denorms-are-zero has a corresponding denorms-are-zero attribute AMDGPU wants a distinct control for f32 flushing from f16/f64, and as far as I can tell the same is true for NVPTX (based on the attribute name). Work on consolidating these into the denormal-fp-math attribute, and a new type specific denormal-fp-math-f32 variant. Only ARM seems to support the two different flush modes, so this is overkill for the other use cases. Ideally we would error on the unsupported positive-zero mode on other targets from somewhere. Move the logic for selecting the flush mode into the compiler driver, instead of handling it in cc1. denormal-fp-math/denormal-fp-math-f32 are now both cc1 flags, but denormal-fp-math-f32 is not yet exposed as a user flag. -cl-denorms-are-zero, -fcuda-flush-denormals-to-zero and -fno-cuda-flush-denormals-to-zero will be mapped to -fp-denormal-math-f32=ieee or preserve-sign rather than the old attributes. Stop emitting the denorms-are-zero attribute for the OpenCL flag. It has no in-tree users. The meaning would also be target dependent, such as the AMDGPU choice to treat this as only meaning allow flushing of f32 and not f16 or f64. The naming is also potentially confusing, since DAZ in other contexts refers to instructions implicitly treating input denormals as zero, not necessarily flushing output denormals to zero. This also does not attempt to change the behavior for the current attribute. The LangRef now states that the default is ieee behavior, but this is inaccurate for the current implementation. The clang handling is slightly hacky to avoid touching the existing denormal-fp-math uses. Fixing this will be left for a future patch. AMDGPU is still using the subtarget feature to control the denormal mode, but the new attribute are now emitted. A future change will switch this and remove the subtarget features.	2020-01-17 20:09:53 -05:00
Amy Huang	a85f5efd95	Add support for the MS qualifiers __ptr32, __ptr64, __sptr, __uptr. Summary: This adds parsing of the qualifiers __ptr32, __ptr64, __sptr, and __uptr and lowers them to the corresponding address space pointer for 32-bit and 64-bit pointers. (32/64-bit pointers added in https://reviews.llvm.org/D69639) A large part of this patch is making these pointers ignore the address space when doing things like overloading and casting. https://bugs.llvm.org/show_bug.cgi?id=42359 Reviewers: rnk, rsmith Subscribers: jholewinski, jvesely, nhaehnle, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D71039	2019-12-18 10:41:12 -08:00
Bjorn Pettersson	78424e5f84	Prune include of DataLayout.h from include/clang/Basic/TargetInfo.h. NFC Summary: Use a forward declaration of DataLayout instead of including DataLayout.h in clang's TargetInfo.h. This reduces include dependencies toward DataLayout.h (and other headers such as DerivedTypes.h, Type.h that are included by DataLayout.h). Needed to move implementation of TargetInfo::resetDataLayout from TargetInfo.h to TargetInfo.cpp. Reviewers: rnk Reviewed By: rnk Subscribers: jvesely, nhaehnle, cfe-commits, llvm-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D69262 llvm-svn: 375438	2019-10-21 17:58:14 +00:00
Matt Arsenault	281f2e2c37	AMDGPU: Add builtins for is_shared/is_private llvm-svn: 371010	2019-09-05 03:00:43 +00:00
Stanislav Mekhanoshin	c17705b7fb	[AMDGPU] Do not assume a default GCN target Differential Revision: https://reviews.llvm.org/D66246 llvm-svn: 368917	2019-08-14 20:55:15 +00:00
Stanislav Mekhanoshin	0cfd75a07d	[AMDGPU] gfx908 clang target Differential Revision: https://reviews.llvm.org/D64430 llvm-svn: 365528	2019-07-09 18:19:00 +00:00
Matt Arsenault	fc84925208	AMDGPU: Fix target builtins for gfx10 This wasn't setting some of the features from older generations. llvm-svn: 364123	2019-06-22 01:30:00 +00:00
Stanislav Mekhanoshin	cafccd7a53	[AMDGPU] gfx1011/gfx1012 clang support Differential Revision: https://reviews.llvm.org/D63308 llvm-svn: 363345	2019-06-14 00:33:59 +00:00
Stanislav Mekhanoshin	91792f1b93	[AMDGPU] gfx1010 clang target Differential Revision: https://reviews.llvm.org/D61875 llvm-svn: 360634	2019-05-13 23:15:59 +00:00
Yaxun Liu	4469701207	AMDGPU: Enable _Float16 llvm-svn: 359594	2019-04-30 18:35:37 +00:00
Stanislav Mekhanoshin	1d9f286ecb	[AMDGPU] rename vi-insts into gfx8-insts Differential Revision: https://reviews.llvm.org/D60293 llvm-svn: 357792	2019-04-05 18:25:00 +00:00
Michael Liao	3c2aadbe67	[AMDGPU] Add the missing clang change of the experimental buffer fat pointer llvm-svn: 356385	2019-03-18 18:11:37 +00:00
Stanislav Mekhanoshin	1607a37308	[AMDGPU] Split dot-insts feature Differential Revision: https://reviews.llvm.org/D57972 llvm-svn: 353588	2019-02-09 00:34:41 +00:00
Yaxun Liu	277e064bf5	Do not copy long double and 128-bit fp format from aux target for AMDGPU rC352620 caused regressions because it copied floating point format from aux target. floating point format decides whether extended long double is supported. It is x86_fp80 on x86 but IEEE double on amdgcn. Document usage of long double type in HIP programming guide https://github.com/ROCm-Developer-Tools/HIP/pull/890 Differential Revision: https://reviews.llvm.org/D57527 llvm-svn: 352801	2019-01-31 21:57:51 +00:00
Yaxun Liu	95f2ca541f	[HIP] Fix size_t for MSVC environment In 64 bit MSVC environment size_t is defined as unsigned long long. In single source language like HIP, data layout should be consistent in device and host compilation, therefore copy data layout controlling fields from Aux target for AMDGPU target. Differential Revision: https://reviews.llvm.org/D56318 llvm-svn: 352620	2019-01-30 12:26:54 +00:00
Chandler Carruth	2946cd7010	Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636	2019-01-19 08:50:56 +00:00
Stanislav Mekhanoshin	6332f4d0d4	[AMDGPU] Separate feature dot-insts Differential Revision: https://reviews.llvm.org/D56525 llvm-svn: 350794	2019-01-10 03:25:47 +00:00
Richard Trieu	6368818fd5	Move CodeGenOptions from Frontend to Basic Basic uses CodeGenOptions and should not depend on Frontend. llvm-svn: 348827	2018-12-11 03:18:39 +00:00
Konstantin Zhuravlyov	06570954e2	AMDGPU: Handle gfx909 in AMDGPUTargetInfo::initFeatureMap + add required tests llvm-svn: 345181	2018-10-24 19:07:56 +00:00
Matt Arsenault	b666e73dd9	AMDGPU: Move target code into TargetParser llvm-svn: 340292	2018-08-21 16:13:29 +00:00
Matt Arsenault	45bc148093	AMDGPU: Fix enabling denormals by default on pre-VI targets Fast FMAF is not a sufficient condition to enable denormals. Before VI, enabling denormals caused F32 instructions to run at F64 speeds. llvm-svn: 339278	2018-08-08 17:48:37 +00:00
Matt Arsenault	31c895ecdf	AMDGPU: Add builtin for s_dcache_wb llvm-svn: 339110	2018-08-07 07:49:13 +00:00
Matt Arsenault	24f3924709	AMDGPU: Add builtin for s_dcache_inv_vol llvm-svn: 339109	2018-08-07 07:49:04 +00:00
Matt Arsenault	d2da3c20d7	AMDGPU: Add Vega12 and Vega20 Changes by Matt Arsenault Konstantin Zhuravlyov llvm-svn: 331216	2018-04-30 19:08:27 +00:00
Yaxun Liu	bec8a66454	[CUDA] Revert defining __CUDA_ARCH__ for amdgcn targets amdgcn targets only support HIP, which does not define __CUDA_ARCH__. this is a partial unroll of r329232 / D45277. Differential Revision: https://reviews.llvm.org/D45387 llvm-svn: 329584	2018-04-09 15:43:01 +00:00
Yaxun Liu	8a5fc15aa4	[CUDA] Add amdgpu sub archs Patch by Greg Rodgers. Revised and lit tests added by Yaxun Liu. Differential Revision: https://reviews.llvm.org/D45277 llvm-svn: 329232	2018-04-04 21:19:27 +00:00
Matt Arsenault	b130ea5605	AMDGPU: Update datalayout for stack alignment llvm-svn: 328657	2018-03-27 19:26:51 +00:00
Yaxun Liu	1578a0a55d	[AMDGPU] Clean up old address space mapping and fix constant address space value Differential Revision: https://reviews.llvm.org/D43911 llvm-svn: 326725	2018-03-05 17:50:10 +00:00
Konstantin Zhuravlyov	d6b3453bdb	AMDGPU: Define FP_FAST_FMA{F} macros for amdgcn - Expand GK_*s (i.e. GFX6 -> GFX600, GFX601, etc.) - This allows us to choose features correctly in some cases (for example, fast fmaf is available on gfx600, but not gfx601) - Move HasFMAF, HasFP64, HasLDEXPF to GPUInfo tables - Add HasFastFMA, HasFastFMAF to GPUInfo tables - Add missing tests llvm-svn: 326254	2018-02-27 21:48:05 +00:00
Konstantin Zhuravlyov	cf71761495	Reapply r325193 llvm-svn: 325203	2018-02-15 02:37:04 +00:00
Konstantin Zhuravlyov	b7b86127f5	Revert r325193 as it breaks buildbots llvm-svn: 325200	2018-02-15 02:27:45 +00:00
Richard Smith	47c9b5d4d6	Add missing definition for class static after r325193. llvm-svn: 325195	2018-02-15 01:01:06 +00:00
Konstantin Zhuravlyov	5c9d4e7957	AMDGPU: Cleanup most of the macros - Insert __AMD__ macro - Insert __AMDGPU__ macro - Insert __devicename__ macro - Add missing tests for arch macros Differential Revision: https://reviews.llvm.org/D36802 llvm-svn: 325193	2018-02-15 00:20:26 +00:00
Yaxun Liu	651bd73c02	[AMDGPU] Change constant addr space to 4 Differential Revision: https://reviews.llvm.org/D43171 llvm-svn: 325031	2018-02-13 18:01:21 +00:00

60 Commits