llvm-project

Commit Graph

Author	SHA1	Message	Date
Aaron Watry	803a992f04	relational: Implement isgreater v2: Use relational macros instead of hand-rolled macros Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 213318	2014-07-17 22:07:19 +00:00
Aaron Watry	d9ee196eab	relational: Implement signbit v2 Changes: - use __builtin_signbit instead of shifting by hand - significantly improve vector shuffling - Works correctly now for signbit(float16) on radeonsi Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 211696	2014-06-25 13:29:23 +00:00
Jeroen Ketema	42df5d2a8f	Add exp10 Reviewed-by: Tom Stellard <tom@stellard.net> llvm-svn: 211680	2014-06-25 10:06:35 +00:00
Jeroen Ketema	09516fa27d	Add pown Reviewed-by: Tom Stellard <tom@stellard.net> llvm-svn: 211211	2014-06-18 19:42:23 +00:00
Aaron Watry	6af2969a61	math: Implement mix builtin Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <tom@stellard.net> llvm-svn: 211047	2014-06-16 19:53:59 +00:00
Aaron Watry	f7f79d2a94	relational: Add isequal(floatN) builtin Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <tom@stellard.net> llvm-svn: 211046	2014-06-16 19:53:57 +00:00
Aaron Watry	e167db9238	Add all(igentype) builtin Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <tom@stellard.net> llvm-svn: 211045	2014-06-16 19:53:54 +00:00
Jeroen Ketema	82aaa41286	Implementations for exp(float) and exp(double) v2 Use separate implementations instead of a macro to ensure the constant multiplied with is of higher precision. v2: Use the correct formula, spotted by Dan Liew <daniel.liew@imperial.ac.uk> Reviewed-by: Aaron Warty <awatry@gmail.com> Reviewed-by: Tom Stellard <tom@stellard.net> llvm-svn: 210891	2014-06-13 09:40:09 +00:00
Tom Stellard	3a12fc6a07	Add sincos Patch by: Jeroen Ketema Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 204478	2014-03-21 16:22:01 +00:00
Tom Stellard	457e35912e	Implement builtins for cl_khr_global_int32_base_atomics extension llvm-svn: 195021	2013-11-18 18:21:23 +00:00
Tom Stellard	436bf70519	Implement sign() builtin llvm-svn: 192384	2013-10-10 19:08:56 +00:00
Tom Stellard	6c7b86c106	Implement nextafter() builtin There are two implementations of nextafter(): 1. Using clang's __builtin_nextafter. Clang replaces this builtin with a call to nextafter which is part of libm. Therefore, this implementation will only work for targets with an implementation of libm (e.g. most CPU targets). 2. The other implementation is written in OpenCL C. This function is known internally as __clc_nextafter and can be used by targets that don't have access to libm. llvm-svn: 192383	2013-10-10 19:08:51 +00:00
Tom Stellard	e36e9dec65	Implement isnan() builtin llvm-svn: 192382	2013-10-10 19:08:41 +00:00
Aaron Watry	50a7bcbac9	Add atomic_inc and atomic_add builtins Reviewed-by: Aaron Watry <awatry@gmail.com> llvm-svn: 190058	2013-09-05 16:04:01 +00:00
Aaron Watry	fbe439f8c0	Add mul_hi implementation [v2] Everything except long/ulong is handled by just casting to the next larger type, doing the math and then shifting/casting the result. For 64-bit types, we break the high/low parts of each operand apart, and do a FOIL-based multiplication. v2: Discard the stack-overflow implementation due to copyright concerns. - The implementation is still FOIL-based, but discards the previous code. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 188684	2013-08-19 18:31:49 +00:00
Aaron Watry	8548725f29	Add rhadd builtin rhadd = (x+y+1)>>1 Implemented as: (x>>1) + (y>>1) + ((x&1)|(y&1)) This prevents us having to do assembly addition and overflow detection Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 188477	2013-08-15 19:21:10 +00:00
Aaron Watry	7659157f1b	Add hadd builtin (x + y) >> 1 gets changed to: (x>>1) + (y>>1) + (x&y&1) Saves us having to do any llvm assembly and overflow checking in the addition. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 188476	2013-08-15 19:21:07 +00:00
Aaron Watry	1769b1fca9	Implement generic upsample() Reduces all vector upsamples down to its scalar components, so probably not the most efficient thing in the world, but it does what the spec says it needs to do. Another possible implementation would be to convert/cast everything as unsigned if necessary, upsample the input vectors, create the upsampled value, and then cast back to signed if required. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard at amd.com> llvm-svn: 186691	2013-07-19 16:44:37 +00:00
Aaron Watry	4cb7cf276d	libclc: vload/vstore disable assembly and fix offset calculation This commit gets us back to pure CLC and fixes offset calculations. The next commit will re-enable the assembly implementation for R600, fix bugs related to 64-bit address spaces, and also fix the incorrect assumption that address space identifiers are the same in all architectures. llvm-svn: 186415	2013-07-16 14:28:58 +00:00
Tom Stellard	6f33168bb7	Implement mad24() and mul24() builtins Reviewed-by: Aaron Watry <awatry@gmail.com> llvm-svn: 185839	2013-07-08 17:27:13 +00:00
Tom Stellard	64b3bbae1e	libclc: Add assembly versions of vstore for global [u]int4/8/16 The assembly should be generic, but at least currently R600 only supports 32-bit stores of [u]int1/4, and I believe that only global is well-supported. R600 lowers the 8/16 component stores to multiple 4-component stores. The unoptimized C versions of the other stuff is left in place. Patch by: Aaron Watry llvm-svn: 185009	2013-06-26 18:22:20 +00:00
Tom Stellard	922ac056e3	libclc: Add assembly versions of vload for global int4/8/16 The assembly should be generic, but at least currently R600 only supports 32-bit loads of int1/4, and I believe that only global is well-supported. R600 lowers the 8/16 component vectors to multiple 4-bit loads. The unoptimized C versions of the other stuff is left in place. Patch by: Aaron Watry llvm-svn: 185008	2013-06-26 18:22:15 +00:00
Tom Stellard	51441f80c5	libclc: Initial vstore implementation Assumes that the target supports byte-addressable stores. Completely unoptimized. Patch by: Aaron Watry llvm-svn: 185007	2013-06-26 18:22:11 +00:00
Tom Stellard	66ecbc7c18	libclc: Initial vload implementation Should work for all targets and data types. Completely unoptimized. Patch by: Aaron Watry llvm-svn: 185006	2013-06-26 18:22:05 +00:00
Tom Stellard	e78344dfae	libclc: Implement clz() builtin Squashed commit of the following: commit a0df0a0e86c55c1bdc0b9c0f5a739e5adef4b056 Author: Aaron Watry <awatry@gmail.com> Date: Mon Apr 15 18:42:04 2013 -0500 libclc: Rename clz.ll to clz_if.ll to ensure it gets built. configure.py treats files that have the same name with the .cl and .ll extensions as overriding eachother. E.g. If you have clz.cl and clz.ll both specified to be built in the same SOURCES file, only the first file listed will actually be built. Since the contents of clz.ll were an interface that is implemented in clz_impl.ll, rename clz.ll to clz_if.ll to make sure that the interface is built. commit 931b62bed05c58f737de625bd415af09571a6a5a Author: Aaron Watry <awatry@gmail.com> Date: Sat Apr 13 12:32:54 2013 -0500 libclc: llvm assembly implementation of clz Untested... currently crashes in the same manner as add_sat. commit 6ef0b7b0b6d2e5584086b4b9a9243743b2e0538f Author: Aaron Watry <awatry@gmail.com> Date: Sat Mar 23 12:35:27 2013 -0500 libclc: Add stub clz builtin For scalar int/uint, attempt to use the clz llvm builtin.. for all others return 0 until an actual implementation is finished. Patch by: Aaron Watry llvm-svn: 185004	2013-06-26 18:21:55 +00:00
Tom Stellard	0be3acfc70	libclc: implement initial version of min() This doesn't handle the integer cases for min(vector, scalar). Patch by: Aaron Watry llvm-svn: 185001	2013-06-26 18:21:38 +00:00
Tom Stellard	29b5b9816b	libclc: Rename [add|sub]_sat.ll to [add|sub]_sat_if.ll configure.py allows overloading .cl with .ll, but will only ever build the first file listed in SOURCES of ${file}.cl and ${file}.ll add_sat, sub_sat, (and the soon to be submitted clz) all define interfaces in ${function_name}.ll which are implemented in ${function_name}_impl.ll. Renaming the interface files is enough to get them to build again, fixing CL usage of these functions. Tested on clover/r600g. Patch by: Aaron Watry llvm-svn: 185000	2013-06-26 18:21:31 +00:00
Tom Stellard	0bb381eaec	libclc: implement rotate builtin This implementation does a lot of bit shifting and masking. Suffice to say, this is somewhat suboptimal... but it does look to produce correct results (after the piglit tests were corrected for sign extension issues). Someone who knows LLVM better than I could re-write this more efficiently. Patch by: Aaron Watry llvm-svn: 184996	2013-06-26 18:21:13 +00:00
Tom Stellard	cb133c9322	libclc: Move max builtin to shared/ Max(x,y) is available for all integer/floating types. Patch by: Aaron Watry llvm-svn: 184995	2013-06-26 18:21:06 +00:00
Tom Stellard	fe23a30ef5	libclc: Add clamp() builtin for integer/floating point Created under a new shared/ directory for functions which are available for both integer and floating point types. Patch by: Aaron Watry llvm-svn: 184994	2013-06-26 18:20:56 +00:00
Tom Stellard	cd88a4ebb6	libclc: Fix abs_diff builtin integer function Patch by: Aaron Watry llvm-svn: 184993	2013-06-26 18:20:50 +00:00
Tom Stellard	ec87fb0b0c	libclc: Add max() builtin function Adds this function for both int and floating data types. Patch by: Aaron Watry llvm-svn: 184992	2013-06-26 18:20:46 +00:00
Tom Stellard	509b3b2104	Implement fmax() and fmin() builtins llvm-svn: 184987	2013-06-26 18:20:25 +00:00
Peter Collingbourne	bf3fd44b10	Implement any() builtin. Patch by Tom Stellard! llvm-svn: 165386	2012-10-08 03:39:21 +00:00
Peter Collingbourne	a385c53413	PTX: move implementations of work-item and synchronisation functions to lib, and add header files in generic. Incorporates a patch by Tom Stellard! llvm-svn: 161313	2012-08-05 22:25:37 +00:00
Peter Collingbourne	1e373f07af	Implement sub_sat builtin. Patch by Lei Mou! llvm-svn: 161312	2012-08-05 22:25:12 +00:00
Peter Collingbourne	de7227e5bd	Add fma, hypot builtins. llvm-svn: 157613	2012-05-29 13:35:28 +00:00
Peter Collingbourne	b7fdecd2ec	Implement mad builtin. llvm-svn: 157599	2012-05-29 00:42:38 +00:00
Peter Collingbourne	3a78a47ace	Explicit conversions. llvm-svn: 157590	2012-05-28 20:42:54 +00:00
Peter Collingbourne	d5395fbf03	Initial commit. llvm-svn: 147756	2012-01-08 22:09:58 +00:00

40 Commits
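
The hadd, rhadd, and mul_hi commit messages above describe their arithmetic only in passing. The following is a minimal sketch of those three formulas in plain C, restricted to unsigned scalars. It is not the libclc source: the function names (hadd_u32, rhadd_u32, mul_hi_u64) and the 64-bit reference helpers are hypothetical and exist only for illustration, and the mul_hi routine is one possible way to do the "break the high/low parts apart" multiplication the message mentions.

```c
#include <assert.h>
#include <stdint.h>

/* hadd: (x + y) >> 1 without a wider intermediate.
 * Halve each operand first; the only bit lost is the carry out of the
 * two low bits, which is 1 exactly when both low bits are set. */
static uint32_t hadd_u32(uint32_t x, uint32_t y) {
    return (x >> 1) + (y >> 1) + (x & y & 1u);
}

/* rhadd: (x + y + 1) >> 1 without a wider intermediate.
 * The extra +1 means the result rounds up when either low bit is set. */
static uint32_t rhadd_u32(uint32_t x, uint32_t y) {
    return (x >> 1) + (y >> 1) + ((x & 1u) | (y & 1u));
}

/* Reference versions (hypothetical helpers) using 64-bit intermediates. */
static uint32_t hadd_ref(uint32_t x, uint32_t y) {
    return (uint32_t)(((uint64_t)x + y) >> 1);
}
static uint32_t rhadd_ref(uint32_t x, uint32_t y) {
    return (uint32_t)(((uint64_t)x + y + 1u) >> 1);
}

/* mul_hi for 64-bit operands: split each operand into 32-bit halves,
 * form the four partial products (First, Outer, Inner, Last), and
 * propagate the carry out of the middle 32-bit column. */
static uint64_t mul_hi_u64(uint64_t x, uint64_t y) {
    uint64_t x_lo = x & 0xFFFFFFFFu, x_hi = x >> 32;
    uint64_t y_lo = y & 0xFFFFFFFFu, y_hi = y >> 32;

    uint64_t lo_lo = x_lo * y_lo;
    uint64_t hi_lo = x_hi * y_lo;
    uint64_t lo_hi = x_lo * y_hi;
    uint64_t hi_hi = x_hi * y_hi;

    /* Sum of everything that lands in bits 32..63; at most 3 * (2^32 - 1),
     * so it cannot overflow 64 bits. Its upper half is the carry. */
    uint64_t mid = (lo_lo >> 32) + (hi_lo & 0xFFFFFFFFu) + (lo_hi & 0xFFFFFFFFu);

    return hi_hi + (hi_lo >> 32) + (lo_hi >> 32) + (mid >> 32);
}

/* Spot checks at the edge of the 32-bit range, where a naive
 * (x + y) >> 1 in 32-bit arithmetic would wrap around. */
static void check_halving_adds(void) {
    assert(hadd_u32(UINT32_MAX, UINT32_MAX) == hadd_ref(UINT32_MAX, UINT32_MAX));
    assert(rhadd_u32(UINT32_MAX, UINT32_MAX - 1u) == rhadd_ref(UINT32_MAX, UINT32_MAX - 1u));
}
```

Signed types would need arithmetic shifts, and the vector overloads required by OpenCL are not covered here.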