llvm-project

Commit Graph

Author	SHA1	Message	Date
Tom Stellard	903a78b7c6	Implement sin builtin for float types This double version still uses @llvm.sin. llvm-svn: 213762	2014-07-23 15:16:21 +00:00
Tom Stellard	c0ab2f81e3	Implement cos builtin for float types The double version still uses @llvm.cos. llvm-svn: 213761	2014-07-23 15:16:18 +00:00
Tom Stellard	f9caca8b9d	Implement atan2 builtin llvm-svn: 213760	2014-07-23 15:16:16 +00:00
Tom Stellard	47882923c7	Implement atan builtin llvm-svn: 213759	2014-07-23 15:16:13 +00:00
Aaron Watry	d7f022a582	relational: Implement isnotequal v2: Use relational macros instead of hand-rolled ones Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 213320	2014-07-17 22:07:32 +00:00
Aaron Watry	30102536c0	relational: Implement isgreaterequal v2: Use relational macros instead of hand-rolled macros Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 213319	2014-07-17 22:07:27 +00:00
Aaron Watry	803a992f04	relational: Implement isgreater v2: Use relational macros instead of hand-rolled macros Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 213318	2014-07-17 22:07:19 +00:00
Aaron Watry	9335fe8eff	relational/signbit: Refactor to use relational macros Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 213317	2014-07-17 22:05:25 +00:00
Aaron Watry	d5aace4874	Fix isnan definition for vector results Vector true is -1, not 1, which means we need to use the relational unary macro instead of the normal unary builtin one. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 213316	2014-07-17 22:05:22 +00:00
Aaron Watry	13116cf01a	relational: create re-usable macros for relational declarations relational.h includes relational macros for defining functions which need to return 1 for scalar true and -1 for vector true. I believe that this is the only place that this behavior is required, so the macro is placed at its lowest useful level (same directory as it is used in). This also creates re-usable unary/binary declaration and floatn includes which should simplify relational builtin declarations. Mostly patterned off of include/math/[binary_decl\|unary_decl\|floatn].inc but with required changes for relational functions. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 213315	2014-07-17 22:05:16 +00:00
Aaron Watry	d7f158e006	relational: Fix signbit The vector components were mistakenly using () instead of {}, which caused all but the last vector component to be dropped on the floor. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Jeroen Ketema <j.ketema@imperial.ac.uk> llvm-svn: 211733	2014-06-25 21:08:38 +00:00
Aaron Watry	d9ee196eab	relational: Implement signbit v2 Changes: - use __builtin_signbit instead of shifting by hand - significantly improve vector shuffling - Works correctly now for signbit(float16) on radeonsi Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 211696	2014-06-25 13:29:23 +00:00
Jeroen Ketema	42df5d2a8f	Add exp10 Reviewed-by: Tom Stellard <tom@stellard.net> llvm-svn: 211680	2014-06-25 10:06:35 +00:00
Jeroen Ketema	526fe2d501	Move clcmacro.h to avoid cluttering user namespace v2 v2: - use quotes instead of <> - add include to r600/lib/math/nextafter.c changed Reviewed-by: Tom Stellard <tom@stellard.net> Reviewed-by: Aaron Watry <awatry@gmail.com> llvm-svn: 211576	2014-06-24 09:36:32 +00:00
Jeroen Ketema	bfdb1c0c2f	Protect functions taking double by #ifdef cl_khr_fp64 Also change the order of the functions to be consistent with the order in the header files. llvm-svn: 211496	2014-06-23 14:15:39 +00:00
Jeroen Ketema	09516fa27d	Add pown Reviewed-by: Tom Stellard <tom@stellard.net> llvm-svn: 211211	2014-06-18 19:42:23 +00:00
Aaron Watry	d9afe9def0	Fix definition of INFINITY and add NAN/HUGE_VAL[F] v3: change __builtin_nanf() to __builtin_nanf("") This doesn't work yet, but it was agreed to commit as-is with the logic that "broken" is better than "completely missing" and this should be fixed in clang. v2: use __builtin_inff() and also add nan/huge_val definitions Signed-off-by: Aaron Watry <awatry@gmail.com> llvm-svn: 211065	2014-06-16 22:32:58 +00:00
Aaron Watry	6af2969a61	math: Implement mix builtin Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <tom@stellard.net> llvm-svn: 211047	2014-06-16 19:53:59 +00:00
Aaron Watry	f7f79d2a94	relational: Add isequal(floatN) builtin Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <tom@stellard.net> llvm-svn: 211046	2014-06-16 19:53:57 +00:00
Aaron Watry	e167db9238	Add all(igentype) builtin Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <tom@stellard.net> llvm-svn: 211045	2014-06-16 19:53:54 +00:00
Jeroen Ketema	e2a0f050d8	Add files forgotten in the previous commit llvm-svn: 210896	2014-06-13 12:33:40 +00:00
Jeroen Ketema	82aaa41286	Implementations for exp(float) and exp(double) v2 Use separate implementations instead of a macro to ensure the constant multiplied with is of higher precision. v2: Use the correct formula, spotted by Dan Liew <daniel.liew@imperial.ac.uk> Reviewed-by: Aaron Warty <awatry@gmail.com> Reviewed-by: Tom Stellard <tom@stellard.net> llvm-svn: 210891	2014-06-13 09:40:09 +00:00
Tom Stellard	3a12fc6a07	Add sincos Patch by: Jeroen Ketema Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 204478	2014-03-21 16:22:01 +00:00
Tom Stellard	074e7a8ed0	Add cross for double3 and double4 Patch by: Jeroen Ketema Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 204477	2014-03-21 16:21:58 +00:00
Tom Stellard	457e35912e	Implement builtins for cl_khr_global_int32_base_atomics extension llvm-svn: 195021	2013-11-18 18:21:23 +00:00
Tom Stellard	3a9632d544	s/_CLC_DECL/_CLC_DEF/ Some function definitions were using _CLC_DECL, which meant that they weren't being marked as always_inline. Reviewed-by and Tested-by: Aaron Watry <awatry@gmail.com> llvm-svn: 193754	2013-10-31 15:50:53 +00:00
Tom Stellard	f21e3ea972	Port pocl's gen_convert.py script to libclc This script generates implementations for the entire set of convert_* functions, llvm-svn: 192385	2013-10-10 19:09:01 +00:00
Tom Stellard	436bf70519	Implement sign() builtin llvm-svn: 192384	2013-10-10 19:08:56 +00:00
Tom Stellard	6c7b86c106	Implement nextafter() builtin There are two implementations of nextafter(): 1. Using clang's __builtin_nextafter. Clang replaces this builtin with a call to nextafter which is part of libm. Therefore, this implementation will only work for targets with an implementation of libm (e.g. most CPU targets). 2. The other implementation is written in OpenCL C. This function is known internally as __clc_nextafter and can be used by targets that don't have access to libm. llvm-svn: 192383	2013-10-10 19:08:51 +00:00
Tom Stellard	e36e9dec65	Implement isnan() builtin llvm-svn: 192382	2013-10-10 19:08:41 +00:00
Aaron Watry	283e3fa011	Add atomic_sub and atomic_dec builtin functions Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 190201	2013-09-06 20:20:21 +00:00
Aaron Watry	50a7bcbac9	Add atomic_inc and atomic_add builtins Reviewed-by: Aaron Watry <awatry@gmail.com> llvm-svn: 190058	2013-09-05 16:04:01 +00:00
Aaron Watry	fbe439f8c0	Add mul_hi implementation [v2] Everything except long/ulong is handled by just casting to the next larger type, doing the math and then shifting/casting the result. For 64-bit types, we break the high/low parts of each operand apart, and do a FOIL-based multiplication. v2: Discard the stack-overflow implementation due to copyright concerns. - The implementation is still FOIL-based, but discards the previous code. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 188684	2013-08-19 18:31:49 +00:00
Aaron Watry	8548725f29	Add rhadd builtin rhadd = (x+y+1)>>1 Implemented as: (x>>1) + (y>>1) + ((x&1)\|(y&1)) This prevents us having to do assembly addition and overflow detection Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 188477	2013-08-15 19:21:10 +00:00
Aaron Watry	7659157f1b	Add hadd builtin (x + y) >> 1 gets changed to: (x>>1) + (y>>1) + (x&y&1) Saves us having to do any llvm assembly and overflow checking in the addition. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 188476	2013-08-15 19:21:07 +00:00
Aaron Watry	0c21c7c747	Add intN vloadN() implementations for address spaces 3 and 4 Not hooked up to R600 yet due to current lack of support, at least on EG. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 188181	2013-08-12 14:42:51 +00:00
Aaron Watry	7d52565321	Add vload* for addrspace(2) and use as constant load for R600 Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 188179	2013-08-12 14:42:49 +00:00
Tom Stellard	41ef85df0a	Add some missing convert_* functions llvm-svn: 188131	2013-08-10 03:40:37 +00:00
Aaron Watry	1769b1fca9	Implement generic upsample() Reduces all vector upsamples down to its scalar components, so probably not the most efficient thing in the world, but it does what the spec says it needs to do. Another possible implementation would be to convert/cast everything as unsigned if necessary, upsample the input vectors, create the upsampled value, and then cast back to signed if required. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard at amd.com> llvm-svn: 186691	2013-07-19 16:44:37 +00:00
Aaron Watry	99a2f3b274	Fix and re-enable R600 vload/vstore assembly The assembly optimizations were making unsafe assumptions about which address spaces had which identifiers. Also, fix vload/vstore with 64-bit pointers. This was broken previously on Radeon SI. This version still only has assembly versions of int/uint 2/4/8/16 for global loads and stores on R600, but it does it in a way that would be very easily extended to private/local/constant and could also be handled easily on other architectures. v2: 1) Leave v[load\|store]_impl.ll in generic/lib 2) Remove vload_if.ll and vstore_if.ll interfaces 3) Fix address+offset calculations 3) Remove offset from assembly arg list llvm-svn: 186416	2013-07-16 14:29:01 +00:00
Aaron Watry	4cb7cf276d	libclc: vload/vstore disable assembly and fix offset calculation This commit gets us back to pure CLC and fixes offset calculations. The next commit will re-enable the assembly implementation for R600, fix bugs related to 64-bit address spaces, and also fix the incorrect assumption that address space identifiers are the same in all architectures. llvm-svn: 186415	2013-07-16 14:28:58 +00:00
Tom Stellard	6f33168bb7	Implement mad24() and mul24() builtins Reviewed-by: Aaron Watry <awatry@gmail.com> llvm-svn: 185839	2013-07-08 17:27:13 +00:00
Tom Stellard	d768ac0395	Add __CLC_ prefix to all macro definitions in headers libclc was defining and undefing GENTYPE and several other macros with common names in its header files. This was preventing applications from defining macros with identical names as command line arguments to the compiler, because the definitions in the header files were masking the macros defined as compiler arguements. Reviewed-by: Aaron Watry <awatry@gmail.com> llvm-svn: 185838	2013-07-08 17:27:02 +00:00
Tom Stellard	64b3bbae1e	libclc: Add assembly versions of vstore for global [u]int4/8/16 The assembly should be generic, but at least currently R600 only supports 32-bit stores of [u]int1/4, and I believe that only global is well-supported. R600 lowers the 8/16 component stores to multiple 4-component stores. The unoptimized C versions of the other stuff is left in place. Patch by: Aaron Watry llvm-svn: 185009	2013-06-26 18:22:20 +00:00
Tom Stellard	922ac056e3	libclc: Add assembly versions of vload for global int4/8/16 The assembly should be generic, but at least currently R600 only supports 32-bit loads of int1/4, and I believe that only global is well-supported. R600 lowers the 8/16 component vectors to multiple 4-bit loads. The unoptimized C versions of the other stuff is left in place. Patch by: Aaron Watry llvm-svn: 185008	2013-06-26 18:22:15 +00:00
Tom Stellard	51441f80c5	libclc: Initial vstore implementation Assumes that the target supports byte-addressable stores. Completely unoptimized. Patch by: Aaron Watry llvm-svn: 185007	2013-06-26 18:22:11 +00:00
Tom Stellard	66ecbc7c18	libclc: Initial vload implementation Should work for all targets and data types. Completely unoptimized. Patch by: Aaron Watry llvm-svn: 185006	2013-06-26 18:22:05 +00:00
Tom Stellard	e78344dfae	libclc: Implement clz() builtin Squashed commit of the following: commit a0df0a0e86c55c1bdc0b9c0f5a739e5adef4b056 Author: Aaron Watry <awatry@gmail.com> Date: Mon Apr 15 18:42:04 2013 -0500 libclc: Rename clz.ll to clz_if.ll to ensure it gets built. configure.py treats files that have the same name with the .cl and .ll extensions as overriding eachother. E.g. If you have clz.cl and clz.ll both specified to be built in the same SOURCES file, only the first file listed will actually be built. Since the contents of clz.ll were an interface that is implemented in clz_impl.ll, rename clz.ll to clz_if.ll to make sure that the interface is built. commit 931b62bed05c58f737de625bd415af09571a6a5a Author: Aaron Watry <awatry@gmail.com> Date: Sat Apr 13 12:32:54 2013 -0500 libclc: llvm assembly implementation of clz Untested... currently crashes in the same manner as add_sat. commit 6ef0b7b0b6d2e5584086b4b9a9243743b2e0538f Author: Aaron Watry <awatry@gmail.com> Date: Sat Mar 23 12:35:27 2013 -0500 libclc: Add stub clz builtin For scalar int/uint, attempt to use the clz llvm builtin.. for all others return 0 until an actual implementation is finished. Patch by: Aaron Watry llvm-svn: 185004	2013-06-26 18:21:55 +00:00
Tom Stellard	34f513df7c	libclc: Add clamp(vec, scalar, scalar) and max(vec, scalar) For any GENTYPE that isn't scalar, we need to implement a mixed vector/scalar version of clamp/max. This depends on the min() patches I sent to the list a few minutes ago. Patch by: Aaron Watry llvm-svn: 185003	2013-06-26 18:21:49 +00:00
Tom Stellard	075b31a2fa	libclc: Implement the min(vec, scalar) version of the min builtin. Checks if the current GENTYPE is scalar, and if not, then defines a separate implementation of the function which casts the second arg to vector before proceeding. Patch by: Aaron Watry llvm-svn: 185002	2013-06-26 18:21:44 +00:00

1 2

73 Commits