llvm-project

Commit Graph

Author	SHA1	Message	Date
Anshil Gandhi	508b06699a	[Remarks] [AMDGPU] Emit optimization remarks for atomics generating hardware instructions Produce remarks when atomic instructions are expanded into hardware instructions in SIISelLowering.cpp. Currently, these remarks are only emitted for atomic fadd instructions. Differential Revision: https://reviews.llvm.org/D108150	2021-08-19 20:51:19 -06:00
Sven van Haastregt	7bda1a0711	[OpenCL] Fix as_type(vec3) invalid store creation With -fpreserve-vec3-type enabled, a cast was not created when converting from a vec3 type to a non-vec3 type, even though a conversion to vec4 was performed. This resulted in creation of invalid store instructions. Differential Revision: https://reviews.llvm.org/D107963	2021-08-19 11:57:09 +01:00
Anshil Gandhi	f22ba51873	[Remarks] Emit optimization remarks for atomics generating CAS loop Implements ORE in AtomicExpand pass to report atomics generating a compare and swap loop. Differential Revision: https://reviews.llvm.org/D106891	2021-08-16 14:56:01 -06:00
Dávid Bolvanský	49de6070a2	Revert "[Remarks] Emit optimization remarks for atomics generating CAS loop" This reverts commit `435785214f`. Still same compile time issues for -O0 -g, eg. +1.3% for sqlite3.	2021-08-15 11:44:13 +02:00
Anshil Gandhi	435785214f	[Remarks] Emit optimization remarks for atomics generating CAS loop Implements ORE in AtomicExpand pass to report atomics generating a compare and swap loop. Differential Revision: https://reviews.llvm.org/D106891	2021-08-14 23:37:23 -06:00
Anshil Gandhi	29e11a1aa3	Revert "[Remarks] Emit optimization remarks for atomics generating CAS loop" This reverts commit `c4e5425aa5`.	2021-08-13 23:58:04 -06:00
Anshil Gandhi	c4e5425aa5	[Remarks] Emit optimization remarks for atomics generating CAS loop Implements ORE in AtomicExpandPass to report atomics generating a compare and swap loop. Differential Revision: https://reviews.llvm.org/D106891	2021-08-13 22:44:08 -06:00
Sven van Haastregt	696ad3c491	[OpenCL] Tidy up preserve_vec3 test Add CHECK-LABELs and fix string substitution to actually match the previous definition.	2021-08-12 14:51:20 +01:00
Oliver Stannard	e345b45bf1	Mark tests as requiring AMDGPU target	2021-08-05 10:02:51 +01:00
Anshil Gandhi	39dac1f7f6	[clang] Add clang builtins support for gfx90a Implement target builtins for gfx90a including fadd64, fadd32, add2h, max and min on various global, flat and ds address spaces for which intrinsics are implemented. Differential Revision: https://reviews.llvm.org/D106909	2021-08-05 02:08:06 -06:00
Anton Zabaznov	4e124ff256	[OpenCL] Replace test for pipe struct to test it with fixed triple Reviewed By: Anastasia Differential Revision: https://reviews.llvm.org/D107176	2021-07-30 21:49:20 +03:00
Anton Zabaznov	acc5850495	[OpenCL] Add support of __opencl_c_pipes feature macro. 'pipe' keyword is introduced in OpenCL C 2.0: so do checks for OpenCL C version while parsing and then later on check for language options to construct actual pipe. This feature requires support of __opencl_c_generic_address_space, so diagnostics for that is provided as well. This is the same patch as in D106748 but with a tiny fix in checking of diagnostic messages. Also added tests when program scope global variables are not supported. Reviewed By: Anastasia Differential Revision: https://reviews.llvm.org/D107154	2021-07-30 18:10:25 +03:00
Anton Zabaznov	da6626d126	Revert "[OpenCL] Add support of __opencl_c_pipes feature macro." This reverts commit `d1e4b25756`.	2021-07-30 06:34:29 +03:00
Anton Zabaznov	d1e4b25756	[OpenCL] Add support of __opencl_c_pipes feature macro. 'pipe' keyword is introduced in OpenCL C 2.0: so do checks for OpenCL C version while parsing and then later on check for language options to construct actual pipe. This feature requires support of __opencl_c_generic_address_space, so diagnostics for that is provided as well. Reviewed By: Anastasia Differential Revision: https://reviews.llvm.org/D106748	2021-07-30 05:27:55 +03:00
Eli Friedman	0fb16d5ad1	Fix clang regression test after `5c486ce0`	2021-07-26 11:59:40 -07:00
Anastasia Stulova	81600160b3	[OpenCL] Change default standard version to CL1.2 Set default version for OpenCL C to 1.2. This means that the absence of any standard flag will be equivalent to passing '-cl-std=CL1.2'. Note that this patch also fixes incorrect version check for the pointer to pointer kernel arguments diagnostic and atomic test. Differential Revision: https://reviews.llvm.org/D106504	2021-07-26 15:04:34 +01:00
Anton Zabaznov	05eb59e1d0	[OpenCL] Add support of __opencl_c_program_scope_global_variables feature macro Reviewed By: Anastasia Differential Revision: https://reviews.llvm.org/D103191	2021-07-15 17:21:19 +03:00
Anton Zabaznov	78463ebde2	[OpenCL] Add support of __opencl_c_generic_address_space feature macro Reviewed By: Anastasia Differential Revision: https://reviews.llvm.org/D103401	2021-07-13 13:14:10 +03:00
Nikita Popov	ff8b1b1b9c	Reapply [IR] Don't mark mustprogress as type attribute Reapply with fixes for clang tests. ----- This is a simple enum attribute. Test changes are because enum attributes are sorted before type attributes, so mustprogress is now in a different position.	2021-07-09 20:57:44 +02:00
Nico Weber	97c675d3d4	Revert "Revert "Temporarily do not drop volatile stores before unreachable"" This reverts commit `52aeacfbf5`. There isn't full agreement on a path forward yet, but there is agreement that this shouldn't land as-is. See discussion on https://reviews.llvm.org/D105338 Also reverts unreviewed "[clang] Improve `-Wnull-dereference` diag to be more in-line with reality" This reverts commit `f4877c78c0`. And all the related changes to tests: This reverts commit `9a0152799f`. This reverts commit `3f7c9cc274`. This reverts commit `329f8197ef`. This reverts commit `aa9f58cc2c`. This reverts commit `2df37d5ddd`. This reverts commit `a72a441812`.	2021-07-09 11:44:34 -04:00
Roman Lebedev	329f8197ef	[NFC][Clang][CodegenOpenCL] Fix test not to rely on volatile store not being removed	2021-07-09 14:16:54 +03:00
Yaxun (Sam) Liu	434bd5bf54	[AMDGPU] Add builtin functions image_bvh_intersect_ray Reviewed by: Stanislav Mekhanoshin, Matt Arsenault Differential Revision: https://reviews.llvm.org/D104946	2021-06-30 13:10:47 -04:00
Stuart Brady	e47027d091	[OpenCL] Use DW_LANG_OpenCL language tag for OpenCL C Note regarding C++ for OpenCL: When compiling C++ for OpenCL, DW_LANG_C_plus_plus* is emitted. There is no DWARF language code defined for C++ for OpenCL as of yet, but DWARF issue 210514.1 has been raised to request one. In the mean time, continuing to emit DW_LANG_C_plus_plus* for C++ for OpenCL allows the potential to distinguish between C++ for OpenCL and OpenCL C in !DICompileUnit nodes, whereas using DW_LANG_OpenCL for C++ for OpenCL would prevent this. This change therefore leaves C++ for OpenCL as-is. Reviewed By: shchenz, Anastasia Differential Revision: https://reviews.llvm.org/D104118	2021-06-25 11:48:42 +01:00
Aakanksha Patil	3453f3dd46	[AMDGPU] Add gfx1035 target Differential Revision: https://reviews.llvm.org/D104804	2021-06-24 14:32:41 -04:00
Brendon Cahoon	294efbbd3e	Reland "[AMDGPU] Add gfx1013 target" This reverts commit `211e584fa2`. Fixed a use-after-free error that caused the sanitizers to fail.	2021-06-08 21:15:35 -04:00
Brendon Cahoon	211e584fa2	Revert "[AMDGPU] Add gfx1013 target" This reverts commit `ea10a86984`. A sanitizer buildbot reports an error.	2021-06-08 16:29:41 -04:00
Brendon Cahoon	ea10a86984	[AMDGPU] Add gfx1013 target Differential Revision: https://reviews.llvm.org/D103663	2021-06-08 12:49:49 -04:00
Jason Zheng	333987b045	[OpenCL] Add DWARF address spaces mapping for SPIR Extend debug info handling by adding DWARF address space mapping for SPIR, with corresponding test case. Reviewed By: Anastasia Differential Revision: https://reviews.llvm.org/D103097	2021-06-04 18:10:54 +01:00
Anton Zabaznov	826905787a	[OpenCL] Add support of OpenCL C 3.0 __opencl_c_fp64 There already exists cl_khr_fp64 extension. So OpenCL C 3.0 and higher should use the feature, earlier versions still use the extension. OpenCL C 3.0 API spec states that extension will be not described in the option string if corresponding optional functionality is not supported (see 4.2. Querying Devices). Due to that fact the usage of features for OpenCL C 3.0 must be as follows: ``` $ clang -Xclang -cl-ext=+cl_khr_fp64,+__opencl_c_fp64 ... $ clang -Xclang -cl-ext=-cl_khr_fp64,-__opencl_c_fp64 ... ``` e.g. the feature and the equivalent extension (if exists) must be set to the same values Reviewed By: Anastasia Differential Revision: https://reviews.llvm.org/D96524	2021-05-21 15:01:19 +03:00
Anastasia Stulova	3549466ac0	[OpenCL] Drop pragma handling for extension types/decls. Drop non-conformant extension pragma implementation as it does not properly disable anything and therefore enabling non-disabled logic has no meaning. This simplifies clang code and user interface to the extension functionality. With this patch extension pragma 'begin'/'end' and 'enable'/'disable' are only accepted for backward compatibility and no longer have any default behavior. Differential Revision: https://reviews.llvm.org/D101043	2021-05-17 12:09:43 +01:00
Aakanksha Patil	464e4dc50f	[AMDGPU] Add gfx1034 target Differential Revision: https://reviews.llvm.org/D102306	2021-05-13 14:25:18 -04:00
Aaron En Ye Shi	6a67e05a26	[HIP] Add __builtin_amdgcn_groupstaticsize Differential Revision: https://reviews.llvm.org/D102403	2021-05-13 15:50:08 +00:00
Anastasia Stulova	58d18dde5c	[OpenCL] Remove pragma requirement from Arm dot extension. This removed the pointless need for extension pragma since it doesn't disable anything properly and it doesn't need to enable anything that is not possible to disable. The change doesn't break existing kernels since it allows to compile more cases i.e. without pragma statements but the pragma continues to be accepted. Differential Revision: https://reviews.llvm.org/D100985	2021-05-12 16:25:33 +01:00
Bruno Cardoso Lopes	819e0d105e	[CGAtomic] Lift strong requirement for remaining compare_exchange combinations Follow up on `431e3138a` and complete the other possible combinations. Besides enforcing the new behavior, it also mitigates TSAN false positives when combining orders that used to be stronger.	2021-05-06 21:05:20 -07:00
Stanislav Mekhanoshin	c714d03785	[AMDGPU] Expose __builtin_amdgcn_perm for v_perm_b32 Differential Revision: https://reviews.llvm.org/D102022	2021-05-06 16:17:33 -07:00
Juneyoung Lee	8a156d1c27	[InstCombine] Fully disable select to and/or i1 folding This is a patch that disables the poison-unsafe select -> and/or i1 folding. It has been blocking D72396 and also has been the source of a few miscompilations described in llvm.org/pr49688 . D99674 conditionally blocked this folding and successfully fixed the latter one. The former one was still blocked, and this patch addresses it. Note that a few test functions that has `_logical` suffix are now deoptimized. These are created by @nikic to check the impact of disabling this optimization by copying existing original functions and replacing and/or with select. I can see that most of these are poison-unsafe; they can be revived by introducing freeze instruction. I left comments at fcmp + select optimizations (or-fcmp.ll, and-fcmp.ll) because I think they are good targets for freeze fix. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D101191	2021-05-06 09:29:52 +09:00
Yaxun (Sam) Liu	0175999805	[AMDGPU] Add options -mamdgpu-ieee -mno-amdgpu-ieee AMDGPU backend need to know whether floating point opcodes that support exception flag gathering quiet and propagate signaling NaN inputs per IEEE754-2008, which is conveyed by a function attribute "amdgpu-ieee". "amdgpu-ieee"="false" turns this off. Without this function attribute backend assumes it is on for compute functions. -mamdgpu-ieee and -mno-amdgpu-ieee are added to Clang to control this function attribute. By default it is on. -mno-amdgpu-ieee requires -fno-honor-nans or equivalent. Reviewed by: Matt Arsenault Differential Revision: https://reviews.llvm.org/D77013	2021-05-01 09:02:55 -04:00
Philip Reames	f549176ad9	[funcattrs] Add the maximal set of implied attributes to definitions Have funcattrs expand all implied attributes into the IR. This expands the infrastructure from D100400, but for definitions not declarations this time. Somewhat subtly, this mostly isn't semantic. Because the accessors did the inference, any client which used the accessor was already getting the stronger result. Clients that directly checked presence of attributes (there are some), will see a stronger result now. The old behavior can end up quite confusing for two reasons: * Without this change, we have situations where function-attrs appears to fail when inferring an attribute (as seen by a human reading IR), but that consuming code will see that it should have been implied. As a human trying to sanity check test results and study IR for optimization possibilities, this is exceeding error prone and confusing. (I'll note that I wasted several hours recently because of this.) * We can have transforms which trigger without the IR appearing (on inspection) to meet the preconditions. This change doesn't prevent this from happening (as the accessors still involve multiple checks), but it should make it less frequent. I'd argue in favor of deleting the extra checks out of the accessors after this lands, but I want that in it's own review as a) it's purely stylistic, and b) I already know there's some disagreement. Once this lands, I'm also going to do a cleanup change which will delete some now redundant duplicate predicates in the inference code, but again, that deserves to be a change of it's own. Differential Revision: https://reviews.llvm.org/D100226	2021-04-16 14:22:19 -07:00
Philip Reames	dd985551c2	Reapply "[InferAttributes] Materialize all infered attributes for declaration"" and follow on patches. This reverts commit `ab98f2c712` and `98eea392cd`. It includes a fix for the clang test which triggered the revert. I failed to notice this one because there was another AMDGPU llvm test with a similiar name and the exact same text in the error message. Odd. Since only one build bot reported the clang test, I didn't notice that one.	2021-04-14 16:38:07 -07:00
Yaxun (Sam) Liu	61d065e21f	Let clang atomic builtins fetch add/sub support floating point types Recently atomicrmw started to support fadd/fsub: https://reviews.llvm.org/D53965 However clang atomic builtins fetch add/sub still does not support emitting atomicrmw fadd/fsub. This patch adds that. Reviewed by: John McCall, Artem Belevich, Matt Arsenault, JF Bastien, James Y Knight, Louis Dionne, Olivier Giroux Differential Revision: https://reviews.llvm.org/D71726	2021-04-06 15:44:00 -04:00
Thomas Preud'homme	828ec9e9e5	[OpenCL, test] Fix use of undef FileCheck var Clang test CodeGenOpenCL/fpmath.cl uses a variable defined in an earlier CHECK-NOT directive. However, by definition the pattern in that directive is not supposed to occur so no variable will be defined. This commit solves the issue by using a regex match with the same regex as in the definition. It also changes the definition into a regex match since no variable is going to be defined. Reviewed By: yaxunl Differential Revision: https://reviews.llvm.org/D99857	2021-04-05 21:11:39 +01:00
Bruno Cardoso Lopes	431e3138a1	[CGAtomic] Lift stronger requirements on cmpxch and support acquire failure mode - Fix `emitAtomicCmpXchgFailureSet` to support release/acquire (succ/fail) memory order. - Remove stronger checks for cmpxch. Effectively, this addresses http://wg21.link/p0418 Differential Revision: https://reviews.llvm.org/D98995	2021-03-23 16:45:37 -07:00
Sven van Haastregt	20d93267e1	[OpenCL] Use -fdeclare-opencl-builtins for some tests This speeds up the test running times, as the large `opencl-c.h` header no longer needs to be parsed.	2021-03-22 09:46:28 +00:00
Jay Foad	967b64beb4	[AMDGPU] Split dot2-insts feature Split out some of the instructions predicated on the dot2-insts target feature into a new dot7-insts, in preparation for subtargets that have some but not all of these instructions. NFCI. Differential Revision: https://reviews.llvm.org/D98717	2021-03-17 09:42:21 +00:00
Luke Drummond	fcfd3fda71	[OpenCL] Respect calling convention for builtin `__translate_sampler_initializer` has a calling convention of `spir_func`, but clang generated calls to it using the default CC. Instruction Combining was lowering these mismatching calling conventions to `store i1* undef` which itself was subsequently lowered to a trap instruction by simplifyCFG resulting in runtime `SIGILL` There are arguably two bugs here: but whether there's any wisdom in converting an obviously invalid call into a runtime crash over aborting with a sensible error message will require further discussion. So for now it's enough to set the right calling convention on the runtime helper. Reviewed By: svenh, bader Differential Revision: https://reviews.llvm.org/D98411	2021-03-15 17:26:51 +00:00
Sven van Haastregt	6f912a2cd4	[OpenCL] Set calling convention for -fdeclare-opencl-builtins IR produced using TableGen builtin function declarations (`fdeclare-opencl-builtins.cl`) did not have the target's calling convention applied to builtin calls. Fix this, and update the codegen test to check that IR produced using opencl-c.h and `-fdeclare-opencl-builtins` is identical with respect to the builtin calls. Differential Revision: https://reviews.llvm.org/D98039	2021-03-10 10:03:57 +00:00
Jay Foad	99682bc039	Revert "Revert "[AMDGPU] Restore the s_memtime instruction in gfx1030"" This reverts commit `e58d68fcd0`. This reinstates commit `fc28f600e5` with a fix to initialize HasShaderCyclesRegister. See https://reviews.llvm.org/D97928.	2021-03-06 09:00:01 +00:00
Mitch Phillips	e58d68fcd0	Revert "[AMDGPU] Restore the s_memtime instruction in gfx1030" Broke the ASan/MSan buildbots. See more comments in the original patch, https://reviews.llvm.org/D97928. Build failure at http://lab.llvm.org:8011/#/builders/5/builds/5327 This reverts commit `fc28f600e5`.	2021-03-05 18:24:59 -08:00
Jay Foad	fc28f600e5	[AMDGPU] Restore the s_memtime instruction in gfx1030 gfx1030 added a new way to implement readcyclecounter using the SHADER_CYCLES hardware register, but the s_memtime instruction still exists, so the MC layer should still accept it and the llvm.amdgcn.s.memtime intrinsic should still work. Differential Revision: https://reviews.llvm.org/D97928	2021-03-05 20:19:11 +00:00
Sven van Haastregt	f0686569cc	[OpenCL] Fix `mix` builtin overloads `mix` is subtly different from `clamp`: in the overloads where the last argument is a scalar, the second argument should be a gentype for `mix`. As scalars can be implicitly converted to vectors, this cannot be caught in the Sema test. Hence adding a CodeGen test, where we can verify the types using the mangled name.	2021-03-05 13:43:30 +00:00


548 Commits