Tom Stellard
da2969fca7
Implement atanh builtin
...
This implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 234324
2015-04-07 16:20:22 +00:00
Tom Stellard
ca4d382e11
Implement acosh builtin
...
This implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 234323
2015-04-07 16:20:20 +00:00
Tom Stellard
03dc366e79
Implement atanpi builtin
...
This implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 233928
2015-04-02 17:01:58 +00:00
Tom Stellard
eea0997566
Implement asinpi builtin
...
This implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 233927
2015-04-02 17:01:56 +00:00
Tom Stellard
2b4ef39b2f
Implement asinh builtin
...
This implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 233926
2015-04-02 17:01:54 +00:00
Tom Stellard
084124a8fa
Implement acospi builtin
...
This implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 233925
2015-04-02 17:01:52 +00:00
Tom Stellard
1ded220cc0
Implement fmax using __builtin_fmax
...
This ensures correct handling of NaN.
This has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 233713
2015-03-31 16:59:23 +00:00
Tom Stellard
310da7bfd2
Implement fmin using __builtin_fmin
...
This ensures correct handling of NaN.
This has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 233712
2015-03-31 16:59:21 +00:00
Tom Stellard
bd4da7a0ef
Implement fast_distance builtin
...
This implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 232978
2015-03-23 18:10:04 +00:00
Tom Stellard
cb80e14f2c
Implement fast_length builtin
...
This implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 232977
2015-03-23 18:10:02 +00:00
Tom Stellard
d2a1559846
Implement half_sqrt builtin v2
...
This is a generic implementation which just calls sqrt. Targets should
override this if they want a faster implementation.
v2:
- Alphabetize SOURCES
llvm-svn: 232965
2015-03-23 17:01:37 +00:00
Tom Stellard
551a669e80
Implement distance builtin v2
...
This implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.
v2:
- Remove unnecessary copyright.
llvm-svn: 232964
2015-03-23 17:01:35 +00:00
Tom Stellard
cb1c0d7939
Fix implementation of length builtin v2
...
v2:
- Move common code into a macro
- Use the same constant for all vector types.
llvm-svn: 232963
2015-03-23 17:01:33 +00:00
Tom Stellard
8d3a4e3af2
Add __clc_ prefix to functions in sincos_helpers.cl
...
This will help avoid naming conflicts with functions defined in
kernels linking with libclc.
llvm-svn: 232960
2015-03-23 16:20:24 +00:00
Aaron Watry
2cf4d5f312
math: Implement erfc
...
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 232674
2015-03-18 21:52:07 +00:00
Tom Stellard
adfd96f742
Fix bitselect for float/double types v2
...
We need to reinterpret float/double types as uint/ulong in order to
perform the bitwise operations.
This has been tested with piglit, OpenCV, and the ocl conformance tests.
v2:
- Use vector operations rather than splitting vectors into scalar
components.
Reviewed-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 231373
2015-03-05 15:31:05 +00:00
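The reinterpretation described in the bitselect commit above can be sketched in plain C. This is an illustration under assumptions, not the libclc code: OpenCL C would use `as_uint`/`as_float` on vector types, whereas this scalar sketch uses `memcpy`, and the helper names `float_bits`, `bits_float`, and `bitselect_f32` are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of a float as a 32-bit unsigned integer. */
static uint32_t float_bits(float f) {
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Reinterpret a 32-bit unsigned integer as a float. */
static float bits_float(uint32_t u) {
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* Each result bit comes from a where the bit of c is 0,
   and from b where the bit of c is 1. */
static float bitselect_f32(float a, float b, float c) {
    uint32_t ua = float_bits(a), ub = float_bits(b), uc = float_bits(c);
    return bits_float((ua & ~uc) | (ub & uc));
}
```

For example, a mask of `-0.0f` has only the sign bit set, so `bitselect_f32(1.5f, -2.0f, -0.0f)` takes the sign from `b` and everything else from `a`.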
Aaron Watry
1314630ec3
Move mix from math to common
...
It has been part of the common functions since OpenCL 1.0.
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 231137
2015-03-03 21:25:08 +00:00
Tom Stellard
9d0d374c5b
Implement step builtin
...
This has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 230970
2015-03-02 15:29:41 +00:00
Tom Stellard
1f28b14bba
Implement smoothstep builtin v2
...
This has been tested with piglit, OpenCV, and the ocl conformance tests.
v2:
- Fix typo in smoothstep.h
llvm-svn: 230969
2015-03-02 15:29:39 +00:00
Tom Stellard
f5e5b0171d
Implement radians builtin v2
...
This has been tested with piglit, OpenCV, and the ocl conformance tests.
v2:
- Move to the common/ directory
llvm-svn: 230968
2015-03-02 15:29:37 +00:00
Tom Stellard
8336b3a604
Implement degrees builtin v2
...
This has been tested with piglit, OpenCV, and the ocl conformance tests.
v2:
- Move to the common/ directory
llvm-svn: 230967
2015-03-02 15:29:35 +00:00
Aaron Watry
f89bcca0b7
libclc/math: Add cospi
...
Ported from the libclc/amd-builtins branch
v2: Rename sincos_f_piby4 to __libclc__sincosf_piby4
Add cospi(double) implementation instead of using llvm.cos
Notes:
The sincosD_piby4.h file is mostly the same as the builtin implementation
released by AMD. The inline attribute declaration is changed, and M_PI is
used instead of a constant double. Otherwise, the only difference is that
the header explicitly enables the fp64 pragma.
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Jeroen Ketema <j.ketema@imperial.ac.uk>
CC: Tom Stellard <tom@stellard.net>
CC: Matt Arsenault <Matthew.Arsenault@amd.com>
llvm-svn: 230641
2015-02-26 15:42:00 +00:00
Jan Vesely
51702e6e75
Implement log10
...
v2: Use constant and multiplication instead of division
v3: Use hex constants
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 227585
2015-01-30 18:00:34 +00:00
Tom Stellard
0f39721261
Use amdgcn triple for SI+ GPUs
...
llvm-svn: 225296
2015-01-06 20:42:12 +00:00
Tom Stellard
24ea64e050
r600: get_work_dim: Update metadata syntax for LLVM 3.6
...
llvm-svn: 225042
2014-12-31 15:27:59 +00:00
Tom Stellard
1d77071e0d
Require LLVM 3.6 and bump version to 0.1.0
...
Some functions are implemented using hand-written LLVM IR, and
LLVM assembly format is allowed to change between versions, so we
should require a specific version of LLVM.
llvm-svn: 225041
2014-12-31 15:27:53 +00:00
Jeroen Ketema
c9526139bc
Remove wrong semi-colons
...
Patch by Alastair Donaldson
llvm-svn: 224568
2014-12-19 09:18:23 +00:00
Jeroen Ketema
7a22aebbda
Don't include <stddef.h>
...
Including a standard or system header isn't allowed in OpenCL.
The type "size_t" needs to be explicitly defined now.
v2: Use __SIZE_TYPE__ instead of unsigned int.
v3: Define ptrdiff_t and NULL.
Patch-by: Jean-Sébastien Pédron
Reviewed-by: Jeroen Ketema
Reviewed-by: Jan Vesely
llvm-svn: 222235
2014-11-18 14:19:27 +00:00
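Defining these types without a system header, as the commit above describes, might look like the sketch below. `__SIZE_TYPE__` is named in the commit; using `__PTRDIFF_TYPE__` for `ptrdiff_t` and `((void *)0)` for `NULL` is an assumption, since the log does not show how v3 defines them. Both macros are predefined by GCC and Clang. The `my_`/`MY_` names are hypothetical and chosen only to avoid clashing with a hosted C environment.

```c
/* Compiler-provided type macros, so no <stddef.h> is needed. */
typedef __SIZE_TYPE__    my_size_t;     /* unsigned type of sizeof results */
typedef __PTRDIFF_TYPE__ my_ptrdiff_t;  /* signed pointer-difference type  */

/* Null pointer constant without a header (assumed definition). */
#define MY_NULL ((void *)0)
```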
NAKAMURA Takumi
729be14435
Prune CRLF.
...
llvm-svn: 220678
2014-10-27 12:37:26 +00:00
Jan Vesely
ae50c89589
r600: Fix get_work_dim range metadata
...
Reviewed-by: Matt Arsenault <Matthew.Arsenault@amd.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 220388
2014-10-22 14:32:53 +00:00
Jan Vesely
260827caa2
r600: Use llvm intrinsic to read work dimension information
...
v2: Fix function declaration
Add range metadata to r600 implementation
v3: change prefix to AMDGPU
Reviewed-by: Tom Stellard <tom@stellard.net>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 219793
2014-10-15 15:08:06 +00:00
Tom Stellard
bf9f76fbe0
Implement log1p builtin
...
llvm-svn: 219230
2014-10-07 20:22:42 +00:00
Jan Vesely
8f64c3d842
Implement fmod
...
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <tom@stellard.net>
Reviewed-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 219087
2014-10-05 20:24:52 +00:00
Tom Stellard
081e778d22
Implement async_work_group_copy builtin v3
...
This is a simple implementation which just copies data synchronously.
v2:
- Use size_t.
v3:
- Fix possible race condition by splitting the copy among multiple
work items.
llvm-svn: 219008
2014-10-03 19:49:39 +00:00
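The v3 note above, splitting the copy among multiple work items, can be simulated on a host in plain C. This is a sketch under assumptions: the log does not say how the patch partitions the elements, so a strided partition is assumed here, and `copy_slice` and `simulate_group_copy` are hypothetical names. On a device each work item would run its slice concurrently and a barrier would follow before the data is read.

```c
#include <stddef.h>

/* Work performed by one work item: copy elements local_id,
   local_id + local_size, local_id + 2 * local_size, ... */
static void copy_slice(int *dst, const int *src, size_t n,
                       size_t local_id, size_t local_size) {
    for (size_t i = local_id; i < n; i += local_size)
        dst[i] = src[i];
}

/* Host-side stand-in for a work-group: run every work item in turn.
   Each element is written by exactly one work item. */
static void simulate_group_copy(int *dst, const int *src, size_t n,
                                size_t local_size) {
    for (size_t id = 0; id < local_size; ++id)
        copy_slice(dst, src, n, id, local_size);
}
```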
Tom Stellard
ed5bbfdb1b
Implement async_work_group_strided_copy builtin v2
...
This is a simple implementation which just copies data synchronously.
v2:
- Use size_t.
llvm-svn: 219007
2014-10-03 19:49:37 +00:00
Tom Stellard
b5064f79ef
Implement wait_group_events builtin v2
...
This is a simple default implementation which just calls barrier().
v2:
- Only call barrier() once.
llvm-svn: 219006
2014-10-03 19:49:34 +00:00
Jeroen Ketema
87d2ca57d7
Remove more redundant semi-colons
...
llvm-svn: 218039
2014-09-18 09:23:40 +00:00
Aaron Watry
db770b5bb5
atomic: undef macros that are included from atomic_decl.inc
...
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Jeroen Ketema <j.ketema@imperial.ac.uk>
llvm-svn: 217958
2014-09-17 15:27:39 +00:00
Jeroen Ketema
839b8a62d9
Remove redundant semi-colons
...
llvm-svn: 217954
2014-09-17 14:40:48 +00:00
Aaron Watry
f4133b8a10
R600: Map Address spaces for atomic_cmpxchg
...
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 217925
2014-09-16 22:34:59 +00:00
Aaron Watry
e210cae126
R600: Map address spaces for atomic_xchg
...
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 217924
2014-09-16 22:34:58 +00:00
Aaron Watry
0545fa3fb0
R600: Map address spaces for atomic_min
...
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 217923
2014-09-16 22:34:56 +00:00
Aaron Watry
dd754f4b33
R600: Map address spaces for atomic_xor
...
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 217922
2014-09-16 22:34:55 +00:00
Aaron Watry
ea32a57060
R600: Map addr spaces and use atomic_max
...
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 217921
2014-09-16 22:34:53 +00:00
Aaron Watry
5ab82be926
R600: Map address spaces for atomic_or
...
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 217920
2014-09-16 22:34:52 +00:00
Aaron Watry
348db3c666
R600: Map atomic_and address spaces
...
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 217919
2014-09-16 22:34:51 +00:00
Aaron Watry
0d976ba497
atomic: Add generic atom[ic]_cmpxchg
...
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 217918
2014-09-16 22:34:49 +00:00
Aaron Watry
025d79ad6c
atomic: Implement generic atom[ic]_xchg
...
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 217917
2014-09-16 22:34:45 +00:00
Aaron Watry
7cfa12c2a5
atomic: Add generic atomic_min implementation
...
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 217916
2014-09-16 22:34:41 +00:00
Aaron Watry
3f0a1a4c27
atomic: Add generic atom[ic]_xor
...
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 217915
2014-09-16 22:34:36 +00:00