98 Commits

Author SHA1 Message Date
Aaron Watry 0d976ba497 atomic: Add generic atom[ic]_cmpxchg
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 217918
2014-09-16 22:34:49 +00:00
Aaron Watry 025d79ad6c atomic: Implement generic atom[ic]_xchg
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 217917
2014-09-16 22:34:45 +00:00
Aaron Watry 7cfa12c2a5 atomic: Add generic atomic_min implementation
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 217916
2014-09-16 22:34:41 +00:00
Aaron Watry 3f0a1a4c27 atomic: Add generic atom[ic]_xor
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 217915
2014-09-16 22:34:36 +00:00
Aaron Watry 31e67d1cff atomic: Add atom[ic]_or
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 217914
2014-09-16 22:34:32 +00:00
Aaron Watry cc68405761 atomics: Add generic atom[ic]_and
Not used yet.

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 217913
2014-09-16 22:34:28 +00:00
Aaron Watry 49614fbfd9 atomic: Add generic implementation of atom[ic]_max
Not used yet...

v2: Correct int/uint behavior

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 217912
2014-09-16 22:34:24 +00:00
Aaron Watry c9b88d32be atomic: define extension functions for existing atomic implementations
We were missing the local versions of the atom_* before

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 217911
2014-09-16 22:34:21 +00:00
Aaron Watry 947bdd059a math: Add tan implementation
Uses the algorithm:
tan(x) = sin(x) / sqrt(1-sin^2(x))

An alternative is:
tan(x) = sin(x) / cos(x)

Which produces more verbose bitcode and longer assembly.

Either way, the generated bitcode seems pretty nasty and a more optimized
but still precise-enough solution is welcome.

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 217511
2014-09-10 15:43:35 +00:00
Aaron Watry 951ab64d19 math: Add asin implementation
asin(x) = atan2(x, sqrt( 1-x^2 ))

alternatively:
asin(x) = PI/2 - acos(x)

Use the atan2 implementation since it produces slightly shorter bitcode and
R600 machine code.

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 217510
2014-09-10 15:43:32 +00:00
Aaron Watry 268beab921 math: Add acos implementation
Passes the tests that were submitted to the piglit list

Tested on R600 (Pitcairn)

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 217509
2014-09-10 15:43:29 +00:00
Jan Vesely 05a60b7ac3 add isordered builtin
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 217247
2014-09-05 13:59:15 +00:00
Jan Vesely 63486c1f0e add isunordered builtin
v2: remove trailing newline

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 217246
2014-09-05 13:59:13 +00:00
Jan Vesely 41a0c491de add islessgreater builtin
v2: remove trailing newline

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 217245
2014-09-05 13:59:11 +00:00
Jan Vesely 369e20353c add isnormal builtin
v2: simplify and remove isnan leftovers
    remove trailing newline

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 217244
2014-09-05 13:59:09 +00:00
Jan Vesely a5a3b023b4 add isfinite builtin
v2: simplify and remove isinf leftovers
    remove trailing newline

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 217243
2014-09-05 13:59:06 +00:00
Tom Stellard 7a9e2c6879 Implement isinf builtin
llvm-svn: 217046
2014-09-03 15:55:40 +00:00
Tom Stellard d8a73abfc3 Fix implementation of copysign
This was previously implemented with a macro and we were using
__builtin_copysign(), which takes double inputs for the float
version of copysign().

Reviewed-and-Tested-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 217045
2014-09-03 15:55:38 +00:00
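The copysign fix above concerns picking a builtin that matches the operand type. In GCC and Clang, __builtin_copysign() operates on double and __builtin_copysignf() on float. One way to keep the float overload in float is sketched below; the wrapper names are hypothetical and this is not the actual libclc change.

```c
/* Hypothetical wrappers: one function per type instead of a single macro. */
float copysign_f(float x, float y)
{
    /* Float builtin: no promotion of the operands to double. */
    return __builtin_copysignf(x, y);
}

double copysign_d(double x, double y)
{
    return __builtin_copysign(x, y);
}
```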
Jan Vesely ef513d392b Implement generic mad_sat
v2: Fix trailing whitespace
    Fix signed long overflow
    improve comment

v3: fix typo

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <tom@stellard.net>
llvm-svn: 216923
2014-09-02 17:55:02 +00:00
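As an illustration of what mad_sat computes (a * b + c, clamped to the range of the type), here is a minimal C sketch for 32-bit signed integers. The function name is hypothetical. It widens to 64 bits, which is not possible for the long case that the "signed long overflow" note refers to, so that case is not covered here.

```c
#include <stdint.h>

/* Hypothetical helper: saturating multiply-add for 32-bit signed integers. */
int32_t mad_sat_i32(int32_t a, int32_t b, int32_t c)
{
    /* |a * b| <= 2^62 and |c| <= 2^31, so the 64-bit sum cannot overflow. */
    int64_t wide = (int64_t)a * (int64_t)b + (int64_t)c;
    if (wide > INT32_MAX)
        return INT32_MAX;
    if (wide < INT32_MIN)
        return INT32_MIN;
    return (int32_t)wide;
}
```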
Aaron Watry 9447097636 Revert "Implement generic mad_sat"
This reverts commit cf62eded8b623a1c10d3692d25e5882b7939f564.

I didn't mean to commit this...  Jan has a v3 incoming

llvm-svn: 216322
2014-08-23 14:06:01 +00:00
Aaron Watry 6bfac7ae69 Implement generic mad_sat
v2: Fix trailing whitespace
    Fix signed long overflow
    improve comment

Signed-off-by: Jan Vesely <jan.vesely at rutgers.edu>
llvm-svn: 216320
2014-08-23 14:04:33 +00:00
Tom Stellard 2ad4243bf7 Implement prefetch builtin
The default implementation is a no-op.  Targets should override this
with their own implementations.

llvm-svn: 216127
2014-08-20 21:23:03 +00:00
Aaron Watry f991505d02 vload/vstore: Use casts instead of scalarizing everything in CLC version
This generates bitcode which is indistinguishable from what was
hand-written for int32 types in v[load|store]_impl.ll.

v4: Use vec2+scalar for vec3 load/stores to prevent corruption (per Tom)
v3: Also remove unused generic/lib/shared/v[load|store]_impl.ll
v2: (Per Matt Arsenault) Fix alignment issues with vector load stores

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
CC: Matt Arsenault <Matthew.Arsenault@amd.com>
CC: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 216069
2014-08-20 13:58:57 +00:00
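The v4 note in the vload/vstore commit above can be pictured with a plain C sketch. In OpenCL a 3-component vector occupies the storage of a 4-component one, while vstore3 addresses memory packed at three elements per vector, so a store presumably has to avoid touching a fourth element. The struct and function below are hypothetical stand-ins, not the libclc code.

```c
#include <string.h>

/* Hypothetical stand-in for OpenCL's int3, which occupies four ints of storage. */
typedef struct { int x, y, z, pad; } int3_t;

/* Hypothetical helper: store exactly three ints at p + 3 * offset. */
void vstore3_sketch(int3_t v, size_t offset, int *p)
{
    int *dst = p + 3 * offset;
    memcpy(dst, &v.x, 2 * sizeof(int)); /* "vec2" part: x and y */
    dst[2] = v.z;                       /* scalar part: z */
}
```

Writing the first two components as a pair and the third as a scalar leaves the element after the vector untouched.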
Jan Vesely 12c660827e relational: Add islessequal(floatN) builtin
v2: remove the initial undef

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 214568
2014-08-01 21:50:59 +00:00
Jan Vesely acba2c98eb relational: Add isless(floatN) builtin
v2: remove the initial undef

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 214567
2014-08-01 21:50:55 +00:00
Tom Stellard 903a78b7c6 Implement sin builtin for float types
The double version still uses @llvm.sin.

llvm-svn: 213762
2014-07-23 15:16:21 +00:00
Tom Stellard c0ab2f81e3 Implement cos builtin for float types
The double version still uses @llvm.cos.

llvm-svn: 213761
2014-07-23 15:16:18 +00:00
Tom Stellard f9caca8b9d Implement atan2 builtin
llvm-svn: 213760
2014-07-23 15:16:16 +00:00
Tom Stellard 47882923c7 Implement atan builtin
llvm-svn: 213759
2014-07-23 15:16:13 +00:00
Aaron Watry d7f022a582 relational: Implement isnotequal
v2: Use relational macros instead of hand-rolled ones

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 213320
2014-07-17 22:07:32 +00:00
Aaron Watry 30102536c0 relational: Implement isgreaterequal
v2: Use relational macros instead of hand-rolled macros

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 213319
2014-07-17 22:07:27 +00:00
Aaron Watry 803a992f04 relational: Implement isgreater
v2: Use relational macros instead of hand-rolled macros

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 213318
2014-07-17 22:07:19 +00:00
Aaron Watry 9335fe8eff relational/signbit: Refactor to use relational macros
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 213317
2014-07-17 22:05:25 +00:00
Aaron Watry d5aace4874 Fix isnan definition for vector results
Vector true is -1, not 1, which means we need to use the relational unary
macro instead of the normal unary builtin one.

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 213316
2014-07-17 22:05:22 +00:00
Aaron Watry 13116cf01a relational: create re-usable macros for relational declarations
relational.h includes relational macros for defining functions which need to
return 1 for scalar true and -1 for vector true.

I believe that this is the only place that this behavior is required, so the
macro is placed at its lowest useful level (same directory as it is used in).

This also creates re-usable unary/binary declaration and floatn includes which
should simplify relational builtin declarations.

Mostly patterned off of include/math/[binary_decl|unary_decl|floatn].inc
but with required changes for relational functions.

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 213315
2014-07-17 22:05:16 +00:00
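The convention described in the commit above (1 for scalar true, -1 in each lane for vector true) can be sketched in plain C with an array standing in for a float4. The function names are hypothetical, and the real code generates these functions with macros in OpenCL C.

```c
/* Hypothetical scalar form: true is 1. */
int isnan_scalar(float x)
{
    return x != x; /* only NaN compares unequal to itself */
}

/* Hypothetical 4-lane form: true is -1 (all bits set) in each lane. */
void isnan_vec4(const float x[4], int out[4])
{
    for (int i = 0; i < 4; ++i)
        out[i] = (x[i] != x[i]) ? -1 : 0;
}
```

A unary helper that simply applies the scalar function to every lane would produce 1 instead of -1, which is the kind of mismatch a dedicated relational macro avoids.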
Aaron Watry d7f158e006 relational: Fix signbit
The vector components were mistakenly using () instead of {}, which caused
all but the last vector component to be dropped on the floor.

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Jeroen Ketema <j.ketema@imperial.ac.uk>
llvm-svn: 211733
2014-06-25 21:08:38 +00:00
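The effect described in the signbit fix above matches how a parenthesized, comma-separated list behaves in plain C: it is a comma expression that evaluates to its last operand, whereas braces form an initializer list. The sketch below shows only that general C behavior with hypothetical names; it is not the libclc code, and compilers may warn about the discarded operands.

```c
/* Hypothetical stand-in for a 4-component integer vector. */
typedef struct { int s0, s1, s2, s3; } int4_t;

/* Parentheses: comma operator, only the last operand survives. */
int paren_list(int a, int b, int c, int d)
{
    int v = (a, b, c, d);
    return v;
}

/* Braces: initializer list, every component is kept. */
int4_t brace_list(int a, int b, int c, int d)
{
    int4_t v = {a, b, c, d};
    return v;
}
```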
Aaron Watry d9ee196eab relational: Implement signbit
v2 Changes:
   - use __builtin_signbit instead of shifting by hand
   - significantly improve vector shuffling
   - Works correctly now for signbit(float16) on radeonsi

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 211696
2014-06-25 13:29:23 +00:00
Jeroen Ketema 42df5d2a8f Add exp10
Reviewed-by: Tom Stellard <tom@stellard.net>
llvm-svn: 211680
2014-06-25 10:06:35 +00:00
Jeroen Ketema 526fe2d501 Move clcmacro.h to avoid cluttering user namespace v2
v2: - use quotes instead of <>
    - add include to r600/lib/math/nextafter.c changed

Reviewed-by: Tom Stellard <tom@stellard.net>
Reviewed-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 211576
2014-06-24 09:36:32 +00:00
Jeroen Ketema bfdb1c0c2f Protect functions taking double by #ifdef cl_khr_fp64
Also change the order of the functions to be consistent with
the order in the header files.

llvm-svn: 211496
2014-06-23 14:15:39 +00:00
Jeroen Ketema 09516fa27d Add pown
Reviewed-by: Tom Stellard <tom@stellard.net>
llvm-svn: 211211
2014-06-18 19:42:23 +00:00
Aaron Watry d9afe9def0 Fix definition of INFINITY and add NAN/HUGE_VAL[F]
v3: change __builtin_nanf() to __builtin_nanf("")
    This doesn't work yet, but it was agreed to commit as-is with the logic
    that "broken" is better than "completely missing" and this should be
    fixed in clang.

v2: use __builtin_inff() and also add nan/huge_val definitions

Signed-off-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 211065
2014-06-16 22:32:58 +00:00
Aaron Watry 6af2969a61 math: Implement mix builtin
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <tom@stellard.net>
llvm-svn: 211047
2014-06-16 19:53:59 +00:00
Aaron Watry f7f79d2a94 relational: Add isequal(floatN) builtin
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <tom@stellard.net>
llvm-svn: 211046
2014-06-16 19:53:57 +00:00
Aaron Watry e167db9238 Add all(igentype) builtin
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <tom@stellard.net>
llvm-svn: 211045
2014-06-16 19:53:54 +00:00
Jeroen Ketema e2a0f050d8 Add files forgotten in the previous commit
llvm-svn: 210896
2014-06-13 12:33:40 +00:00
Jeroen Ketema 82aaa41286 Implementations for exp(float) and exp(double) v2
Use separate implementations instead of a macro
to ensure the constant multiplied with is of
higher precision.

v2: Use the correct formula, spotted by Dan Liew <daniel.liew@imperial.ac.uk>

Reviewed-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <tom@stellard.net>
llvm-svn: 210891
2014-06-13 09:40:09 +00:00
Tom Stellard 3a12fc6a07 Add sincos
Patch by: Jeroen Ketema

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 204478
2014-03-21 16:22:01 +00:00
Tom Stellard 074e7a8ed0 Add cross for double3 and double4
Patch by: Jeroen Ketema

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 204477
2014-03-21 16:21:58 +00:00
Tom Stellard 457e35912e Implement builtins for cl_khr_global_int32_base_atomics extension
llvm-svn: 195021
2013-11-18 18:21:23 +00:00