llvm-project

Author	SHA1	Message	Date
Jan Vesely	3a7e8e77e6	amdgcn: Consolidate atomic minmax helpers Removes most overrides Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewer: Aaron Watry llvm-svn: 347665	2018-11-27 16:01:13 +00:00
Jan Vesely	f663e7e6da	amdgcn: Move __clc_amdgcn_s_waitcnt definition to clc file Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-By: Aaron Watry <awatry@gmail.com> llvm-svn: 346082	2018-11-04 00:39:27 +00:00
Jan Vesely	0e95b6a579	amdgcn: Convert get_num_groups to clc Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-By: Aaron Watry <awatry@gmail.com> llvm-svn: 346081	2018-11-04 00:39:25 +00:00
Jan Vesely	97283de27d	amdgcn: Convert get_global_size to clc Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-By: Aaron Watry <awatry@gmail.com> llvm-svn: 346080	2018-11-04 00:39:20 +00:00
Jan Vesely	ea2f32b75d	amdgcn: Convert get_local_size to clc Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-By: Aaron Watry <awatry@gmail.com> llvm-svn: 346079	2018-11-04 00:39:16 +00:00
Jan Vesely	faa1ff16c1	amdgcn: Use __constant AS for amdgcn builtins. Fixes build after clang r338707. Reviewer: Matthew.Arsenault@amd.com Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 338898	2018-08-03 15:14:08 +00:00
Jan Vesely	70a270da5f	Add initial support for half precision builtins v2: fix fmax implementation use consistent checks for __CLC_FP_SIZE add missing TODOs fix whitespace in definitions.h v3: undef ZERO in modf.inc Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> reviewer: Jeroen Ketema <j.ketema@xs4all.nl> Reviewed-by: Aaron Watry <awatry@gmail.com> Tested-by: Aaron Watry <awatry@gmail.com> llvm-svn: 332677	2018-05-17 22:55:30 +00:00
Jan Vesely	8fa100dfe3	amdgcn/fmin: Fix typos that reduced precision Not sure how these sneaked in. Fixes fminD and few other tests(fractD, cosD) on carrizo Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Aaron Watry <awatry@gmail.com> llvm-svn: 330198	2018-04-17 18:11:29 +00:00
Jan Vesely	fd11db19c2	amdgcn: Update datalayout after LLVM r328656 Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Aaron Watry <awatry@gmail.com> llvm-svn: 329290	2018-04-05 14:47:44 +00:00
Jan Vesely	f96b1b88f8	amdgcn/fmax: fcanonicalize operands v_max instruction needs canonicalized operands. Passes CTS on carrizo Reviewer: Aaron Watry <awatry@gmail.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 327076	2018-03-08 23:01:01 +00:00
Jan Vesely	e724e346ab	amdgcn/fmin: fcanonicalize operands v_min instruction needs canonicalized operands. Passes CTS on carrizo Reviewer: Aaron Watry <awatry@gmail.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 327075	2018-03-08 23:00:58 +00:00
Jan Vesely	04a46bf0a2	amdgcn,popcount: Workaround broken llvm.ctpop intrinsic on some GCN ASICs This is only really needed for VI+ ASICs. However, llvm would cast the value to i32 for older asics anyway. The proper fix is in LLVM-7 (r326535). Fixes CTS popcount on carrizo. Reviewer: Aaron Watry <awatry@gmail.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 327044	2018-03-08 18:58:07 +00:00
Jan Vesely	1ad6a94676	amdgcn: Fix build after GDS/const AS swap in r325030 Acked-by: Aaron Watry <awatry@gmail.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 325866	2018-02-23 07:37:01 +00:00
Jan Vesely	eda1872d04	amdgcn: Fix datalayout after addition of 32bit const AS in r324747 Acked-by: Aaron Watry <awatry@gmail.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 325865	2018-02-23 07:36:54 +00:00
Jan Vesely	911666f3fa	amdgcn: Fix datalayout after clang r324101 r324101 switched around AS numbering Acked-by: Aaron Watry <awatry@gmail.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 325863	2018-02-23 07:36:39 +00:00
Jan Vesely	c420b61b26	amdgcn: Add missing datalayout info to .ll files Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Acked-by: Aaron Watry <awatry@gmail.com> llvm-svn: 316239	2017-10-20 21:10:18 +00:00
Jeroen Ketema	fe9fa89854	Let get_work_dim take exactly 0 arguments Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 314634	2017-10-01 20:11:46 +00:00
Jan Vesely	ce29e8cde1	Restore support for llvm-3.9 Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Acked-by: Aaron Watry <awatry@gmail.com> llvm-svn: 314543	2017-09-29 19:06:41 +00:00
Jan Vesely	c9bbbe2403	Implement cl_khr_int64_extended_atomics builtins Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Aaron Watry <awatry@gmail.com> Tested-by: Aaron Watry <awatry@gmail.com> llvm-svn: 313811	2017-09-20 20:42:19 +00:00
Jan Vesely	3d1db3de74	amdgcn,waitcnt: Add datalayout info This file is only compiled for GCN which all share the same layout Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Aaron Watry <awatry@gmail.com> llvm-svn: 312493	2017-09-04 15:52:07 +00:00
Jan Vesely	999b1d9426	amdgcn: rewrite barrier() using fence and clang __builtin_amdgcn_s_barrier Specs require using fences when barrier() is invoked: "The barrier function will either flush any variables stored in local memory or queue a memory fence to ensure correct ordering of memory operations to local memory." and "The barrier function will queue a memory fence to ensure correct ordering of memory operations to global memory." Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Aaron Watry <awatry@gmail.com> Tested-by: Aaron Watry <awatry@gmail.com> llvm-svn: 311022	2017-08-16 17:09:00 +00:00
Jan Vesely	1977092dc3	amdgcn: Implement {read_,write_,}mem_fence builtin v2: add more detailed comment about waitcnt instruction Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Aaron Watry <awatry@gmail.com> Tested-by: Aaron Watry <awatry@gmail.com> llvm-svn: 311021	2017-08-16 17:08:56 +00:00
Matt Arsenault	958fce3192	amdgcn: Fix return type of get_num_groups llvm-svn: 279723	2016-08-25 07:31:40 +00:00
Matt Arsenault	26d9c41ff6	amdgcn: Fix return type for get_global_size llvm-svn: 279644	2016-08-24 17:52:04 +00:00
Matt Arsenault	314364cbd2	amdgpu: Fix default case value for get_local_size llvm-svn: 279359	2016-08-20 04:17:17 +00:00
Matt Arsenault	220268d177	amdgcn: Fix get_local_size IR return type llvm-svn: 279350	2016-08-20 00:01:21 +00:00
Matt Arsenault	2ce3d94a01	amdgcn: Correct return types to be size_t llvm-svn: 279343	2016-08-19 22:49:39 +00:00
Jan Vesely	a82e080b57	AMDGPU: Implement get_global_offset builtin Also fix get_global_id to consider offset No idea how to add this for ptx, so they are stuck with the old get_global_id implementation. v2: split to a separate patch v3: Switch R600 to use implicitarg.ptr Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 276443	2016-07-22 17:24:24 +00:00
Jan Vesely	74f02db922	AMDGPU: Use clang intrinsics for workitem builtins v2: split into 2 patches use clang builtins for other intrinsics as well v3: Fix warnings Switch r600 to use implicitarg.ptr Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 276442	2016-07-22 17:24:20 +00:00
Matt Arsenault	b456c6dd56	Replace llvm.AMDGPU.ldexp with llvm.amdgcn.ldexp It didn't really work on r600 to begin with, which should get its own intrinsic. llvm-svn: 275813	2016-07-18 16:42:50 +00:00
Matt Arsenault	45e6eaaa05	amdgcn: Use new workitem intrinsics llvm-svn: 261042	2016-02-17 00:27:27 +00:00
Matt Arsenault	a48e15c6cb	Split sources for amdgcn and r600 Most files remain in a common amdgpu directory. Also switches barriers to use convergent, and use llvm.amdgcn.s.barrier. This now requires 3.9/trunk to build amdgcn. llvm-svn: 260777	2016-02-13 01:01:59 +00:00

32 Commits