llvm-project

Commit Graph

Author	SHA1	Message	Date
Jeroen Ketema	80d2e8ffc1	Move BufferPtr into the block where it is being used The previous location outside the block would crash prepare-builtins when the builtins file was accidentally not passed on the command line. llvm-svn: 294916	2017-02-12 21:33:49 +00:00
Jeroen Ketema	ed98e8d099	Add the correct prefixes to the cl_khr_fp64 pragma llvm-svn: 294915	2017-02-12 21:31:41 +00:00
Matt Arsenault	9df2b9781c	math: Add native_rsqrt builtin function Trivial define to rsqrt. Patch by Vedran Miletić <vedran@miletic.net> llvm-svn: 294608	2017-02-09 18:39:26 +00:00
Aaron Watry	c606efabb7	math: Add logb builtin Ported from the amd-builtins branch. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Matt Arsenault <Matthew.Arsenault@amd.com> CC: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 292335	2017-01-18 03:14:10 +00:00
Aaron Watry	900bd7eb7f	math: Add expm1 builtin function Ported from the amd-builtins branch. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Matt Arsenault <Matthew.Arsenault@amd.com> CC: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 292334	2017-01-18 03:13:37 +00:00
Tom Stellard	d83eb34ee7	Fix build since r286752. llvm-svn: 286839	2016-11-14 16:06:33 +00:00
Tom Stellard	088faab429	Fix build since llvm r286566 and require at least llvm 4.0 llvm-svn: 286634	2016-11-11 21:34:47 +00:00
Jan Vesely	0a5aac3fc4	Provide vstore_half helper to work around clc restrictions clang won't accept half precision loads and stores without cl_khr_fp16 since r281904 llvm-svn: 282106	2016-09-21 20:15:55 +00:00
Tom Stellard	6b195ece57	configure: Add amdgcn-mesa-mesa3d target llvm-svn: 281793	2016-09-16 22:43:33 +00:00
Tom Stellard	f19cf403c4	amdgcn-amdhsa: Add get_num_groups implementation llvm-svn: 281792	2016-09-16 22:43:31 +00:00
Tom Stellard	e7ad23bad3	amdgcn-amdhsa: Add get_global_size() implementation llvm-svn: 281791	2016-09-16 22:43:29 +00:00
Aaron Watry	af569547fa	math: Implement tgamma Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 281566	2016-09-15 00:17:34 +00:00
Aaron Watry	e9009cdd21	math: Implement lgamma Just use lgamma_r and ignore the value returned in the second argument Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 281565	2016-09-15 00:17:31 +00:00
Aaron Watry	0ab07e1bde	math: Implement lgamma_r Ported from the amd-builtins branch, which is itself based on the Sun Microsystems implementation. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 281564	2016-09-15 00:17:28 +00:00
Aaron Watry	f969413a82	Add ADDR_SPACE parameter to _CLC_V_V_VP_VECTORIZE This macro is currently unused, but I plan to use it shortly. The previous form did casts of pointers without an address space, which doesn't work so well for CL 1.x. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 281563	2016-09-15 00:17:22 +00:00
Matt Arsenault	fbfd828d2a	Replace nextafter implementation This one passes conformance. llvm-svn: 280961	2016-09-08 16:37:56 +00:00
Jan Vesely	eade17271a	Avoid ambiguity in calling atom_add functions. clang (since r280553) allows pointer casts in function overloads, so we need to disambiguate the second argument. clang might be smarter about overloads in the future see https://reviews.llvm.org/D24113, but let's be safe in libclc anyway. llvm-svn: 280871	2016-09-07 22:11:02 +00:00
Niels Ole Salscheider	63f71057c0	configure.py: Add polaris10 and polaris11 llvm-svn: 280121	2016-08-30 18:00:41 +00:00
Matt Arsenault	958fce3192	amdgcn: Fix return type of get_num_groups llvm-svn: 279723	2016-08-25 07:31:40 +00:00
Matt Arsenault	7ef7e6aacd	Strip opencl.ocl.version metadata This should be uniqued when linking, but right now it creates a lot of metadata spam listing the same version. This should also probably be reporting the compiled version of the user program, which may differ from the library. Currently the library IR files report 1.0 while 1.1/1.2 are the default for user programs. llvm-svn: 279692	2016-08-25 00:25:10 +00:00
Matt Arsenault	d0a275228e	amdgcn: Also correct get_local_size type for HSA llvm-svn: 279656	2016-08-24 19:11:52 +00:00
Matt Arsenault	26d9c41ff6	amdgcn: Fix return type for get_global_size llvm-svn: 279644	2016-08-24 17:52:04 +00:00
Matt Arsenault	314364cbd2	amdgpu: Fix default case value for get_local_size llvm-svn: 279359	2016-08-20 04:17:17 +00:00
Matt Arsenault	220268d177	amdgcn: Fix get_local_size IR return type llvm-svn: 279350	2016-08-20 00:01:21 +00:00
Matt Arsenault	2ce3d94a01	amdgcn: Correct return types to be size_t llvm-svn: 279343	2016-08-19 22:49:39 +00:00
Jan Vesely	ad8672727c	Implement vstore_half{,n} Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 278962	2016-08-17 20:02:11 +00:00
Jan Vesely	4c59714a52	Make min follow the OCL 1.0 specs OpenCL 1.0: "Returns y if y < x, otherwise it returns x. If x and y are infinite or NaN, the return values are undefined." OpenCL 1.1+: "Returns y if y < x, otherwise it returns x. If x or y are infinite or NaN, the return values are undefined." The 1.0 version is stricter so use that one. Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 276704	2016-07-25 22:36:22 +00:00
Tom Stellard	d835b3f1af	Implement cbrt builtin This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 276497	2016-07-22 23:45:15 +00:00
Tom Stellard	9cb070f96a	Implement cosh builtin This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 276496	2016-07-22 23:45:13 +00:00
Tom Stellard	ff13926a60	geometric/floatn.inc: Add vec8 and vec16 types llvm-svn: 276495	2016-07-22 23:45:11 +00:00
Jan Vesely	a82e080b57	AMDGPU: Implement get_global_offset builtin Also fix get_global_id to consider offset No idea how to add this for ptx, so they are stuck with the old get_global_id implementation. v2: split to a separate patch v3: Switch R600 to use implicitarg.ptr Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 276443	2016-07-22 17:24:24 +00:00
Jan Vesely	74f02db922	AMDGPU: Use clang intrinsics for workitem builtins v2: split into 2 patches use clang builtins for other intrinsics as well v3: Fix warnings Switch r600 to use implicitarg.ptr Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 276442	2016-07-22 17:24:20 +00:00
Jan Vesely	7846c9b8f0	ptx: Fix builtin names after clang r274770 Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Acked-By: Aaron Watry <awatry@gmail.com> llvm-svn: 276423	2016-07-22 15:00:08 +00:00
Matt Arsenault	633d749da7	amdgpu: Use right builtin for rsq The r600 path has never actually worked since double is not implemented there. llvm-svn: 276009	2016-07-19 19:02:01 +00:00
Matt Arsenault	1ab0d9c1ee	R600: Use new barrier intrinsic llvm-svn: 275874	2016-07-18 18:42:17 +00:00
Matt Arsenault	b456c6dd56	Replace llvm.AMDGPU.ldexp with llvm.amdgcn.ldexp It didn't really work on r600 to begin with, which should get its own intrinsic. llvm-svn: 275813	2016-07-18 16:42:50 +00:00
Jan Vesely	e97deffb6a	configure: Remove device specific defines Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <tom@stellard.net> llvm-svn: 273044	2016-06-17 20:30:50 +00:00
Jan Vesely	5fd84d028d	nvptx: Drop feature defines. This is now handled by clang Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <tom@stellard.net> llvm-svn: 273043	2016-06-17 20:30:49 +00:00
Jan Vesely	3317f253de	64 bit integers are legal in full profile without an extension Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <tom@stellard.net> llvm-svn: 273042	2016-06-17 20:30:41 +00:00
Jan Vesely	973c1fa5f5	math: Use single precision fmax in sp path Fixes fdim piglit on Turks v2: use CL fmax instead of __builtin Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <tom.stellard@amd.com> llvm-svn: 269807	2016-05-17 19:44:01 +00:00
Jan Vesely	c374cb76f4	math: Add erf ported from amd-builtins The scalar float/double function bodies are a direct copy/paste, aside from the removed (optional) code in float function body that requires subnormals. reviewers: jvesely Patch by: Vedran Miletić <rivanvx@gmail.com> llvm-svn: 268766	2016-05-06 18:02:30 +00:00
Aaron Watry	55a8e0fd6d	math: Add fdim implementation Based on the amd-builtin, but explicitly vectorized for all sizes (not just float4), and includes a vectorized double implementation. Passes piglit (float) tests on pitcairn. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 268708	2016-05-06 03:34:45 +00:00
Tom Stellard	6cb18a09b1	prepare-builtins: Remove call to getGlobalContext() This function has been removed from LLVM. Patch By: Laurent Carlier llvm-svn: 266430	2016-04-15 14:18:58 +00:00
Konstantin Zhuravlyov	f8a81f869f	[AMDGPU] Implement get_local_size for amdgcn--amdhsa triple Differential Revision: http://reviews.llvm.org/D18284 llvm-svn: 265713	2016-04-07 19:54:19 +00:00
Paul Robinson	36a6eaeb34	Update copyright year to 2016. llvm-svn: 264949	2016-03-30 22:39:03 +00:00
Aaron Watry	09f3c99a86	math: Fix ilogb(double) return type Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 261714	2016-02-24 00:52:15 +00:00
Aaron Watry	d6d0454231	math: Add ilogb ported from amd-builtins The scalar float/double function bodies are a direct copy/paste with usage of the CLC wrappers to vectorize them. This commit also adds in the FP_ILOGB0 and FP_ILOGBNAN macros which are equal to the results of ilogb(0.0f) and ilogb(float nan) respectively. v2: Add FP_ILOGB0 and FP_ILOGBNAN definitions Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu> v1 Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 261639	2016-02-23 14:43:09 +00:00
Matt Arsenault	34766ca34b	Add .gitignore for build directories llvm-svn: 261043	2016-02-17 00:27:31 +00:00
Matt Arsenault	45e6eaaa05	amdgcn: Use new workitem intrinsics llvm-svn: 261042	2016-02-17 00:27:27 +00:00
Matt Arsenault	d4cd67ab9f	Update page to list supported targets llvm-svn: 260778	2016-02-13 01:02:06 +00:00


340 Commits