llvm-project

Commit Graph

Author	SHA1	Message	Date
Jan Vesely	f96b1b88f8	amdgcn/fmax: fcanonicalize operands v_max instruction needs canonicalized operands. Passes CTS on carrizo Reviewer: Aaron Watry <awatry@gmail.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 327076	2018-03-08 23:01:01 +00:00
Jan Vesely	e724e346ab	amdgcn/fmin: fcanonicalize operands v_min instruction needs canonicalized operands. Passes CTS on carrizo Reviewer: Aaron Watry <awatry@gmail.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 327075	2018-03-08 23:00:58 +00:00
Jan Vesely	04a46bf0a2	amdgcn,popcount: Workaround broken llvm.ctpop intrinsic on some GCN ASICs This is only really needed for VI+ ASICs. However, llvm would cast the value to i32 for older asics anyway. The proper fix is in LLVM-7 (r326535). Fixes CTS popcount on carrizo. Reviewer: Aaron Watry <awatry@gmail.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 327044	2018-03-08 18:58:07 +00:00
Jan Vesely	1ad6a94676	amdgcn: Fix build after GDS/const AS swap in r325030 Acked-by: Aaron Watry <awatry@gmail.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 325866	2018-02-23 07:37:01 +00:00
Jan Vesely	eda1872d04	amdgcn: Fix datalayout after addition of 32bit const AS in r324747 Acked-by: Aaron Watry <awatry@gmail.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 325865	2018-02-23 07:36:54 +00:00
Jan Vesely	911666f3fa	amdgcn: Fix datalayout after clang r324101 r324101 switched around AS numbering Acked-by: Aaron Watry <awatry@gmail.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 325863	2018-02-23 07:36:39 +00:00
Jan Vesely	c420b61b26	amdgcn: Add missing datalayout info to .ll files Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Acked-by: Aaron Watry <awatry@gmail.com> llvm-svn: 316239	2017-10-20 21:10:18 +00:00
Jeroen Ketema	fe9fa89854	Let get_work_dim take exactly 0 arguments Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 314634	2017-10-01 20:11:46 +00:00
Jan Vesely	ce29e8cde1	Restore support for llvm-3.9 Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Acked-by: Aaron Watry <awatry@gmail.com> llvm-svn: 314543	2017-09-29 19:06:41 +00:00
Jan Vesely	c9bbbe2403	Implement cl_khr_int64_extended_atomics builtins Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Aaron Watry <awatry@gmail.com> Tested-by: Aaron Watry <awatry@gmail.com> llvm-svn: 313811	2017-09-20 20:42:19 +00:00
Jan Vesely	3d1db3de74	amdgcn,waitcnt: Add datalayout info This file is only compiled for GCN which all share the same layout Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Aaron Watry <awatry@gmail.com> llvm-svn: 312493	2017-09-04 15:52:07 +00:00
Jan Vesely	999b1d9426	amdgcn: rewrite barrier() using fence and clang __builtin_amdgcn_s_barrier Specs require using fences when barrier() is invoked: "The barrier function will either flush any variables stored in local memory or queue a memory fence to ensure correct ordering of memory operations to local memory." and "The barrier function will queue a memory fence to ensure correct ordering of memory operations to global memory." Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Aaron Watry <awatry@gmail.com> Tested-by: Aaron Watry <awatry@gmail.com> llvm-svn: 311022	2017-08-16 17:09:00 +00:00
Jan Vesely	1977092dc3	amdgcn: Implement {read_,write_,}mem_fence builtin v2: add more detailed comment about waitcnt instruction Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Aaron Watry <awatry@gmail.com> Tested-by: Aaron Watry <awatry@gmail.com> llvm-svn: 311021	2017-08-16 17:08:56 +00:00
Matt Arsenault	958fce3192	amdgcn: Fix return type of get_num_groups llvm-svn: 279723	2016-08-25 07:31:40 +00:00
Matt Arsenault	26d9c41ff6	amdgcn: Fix return type for get_global_size llvm-svn: 279644	2016-08-24 17:52:04 +00:00
Matt Arsenault	314364cbd2	amdgpu: Fix default case value for get_local_size llvm-svn: 279359	2016-08-20 04:17:17 +00:00
Matt Arsenault	220268d177	amdgcn: Fix get_local_size IR return type llvm-svn: 279350	2016-08-20 00:01:21 +00:00
Matt Arsenault	2ce3d94a01	amdgcn: Correct return types to be size_t llvm-svn: 279343	2016-08-19 22:49:39 +00:00
Jan Vesely	a82e080b57	AMDGPU: Implement get_global_offset builtin Also fix get_global_id to consider offset No idea how to add this for ptx, so they are stuck with the old get_global_id implementation. v2: split to a separate patch v3: Switch R600 to use implicitarg.ptr Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 276443	2016-07-22 17:24:24 +00:00
Jan Vesely	74f02db922	AMDGPU: Use clang intrinsics for workitem builtins v2: split into 2 patches use clang builtins for other intrinsics as well v3: Fix warnings Switch r600 to use implicitarg.ptr Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 276442	2016-07-22 17:24:20 +00:00
Matt Arsenault	b456c6dd56	Replace llvm.AMDGPU.ldexp with llvm.amdgcn.ldexp It didn't really work on r600 to begin with, which should get its own intrinsic. llvm-svn: 275813	2016-07-18 16:42:50 +00:00
Matt Arsenault	45e6eaaa05	amdgcn: Use new workitem intrinsics llvm-svn: 261042	2016-02-17 00:27:27 +00:00
Matt Arsenault	a48e15c6cb	Split sources for amdgcn and r600 Most files remain in a common amdgpu directory. Also switches barriers to use convergent, and use llvm.amdgcn.s.barrier. This now requires 3.9/trunk to build amdgcn. llvm-svn: 260777	2016-02-13 01:01:59 +00:00

23 Commits