We now use the ${DESTDIR} syntax instead of $(DESTDIR) because it
works both in the shell (at least in bash) and in make (at least in
GNU Make).
Patch By: Dan Liew
llvm-svn: 200414
The OpenCL C language specification says that trunc rounds toward zero.
The llvm.trunc.* intrinsic rounds to the nearest integer not larger in
magnitude. These definitions are equivalent.
Patch by: Jan Vesely
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 197769
Some function definitions were using _CLC_DECL, which meant that they
weren't being marked as always_inline.
Reviewed-by and Tested-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 193754
This will prevent LLVM optimization passes from creating illegal uses
of the barrier() intrinsic (e.g. calling barrier() from a conditional
that is not executed by all threads).
llvm-svn: 193753
The C++ compiler used to build prepare-builtins
may differ from the llvm/clang for which we are
building libclc.
Use 'clang++' as the default compiler.
Patch by: Jeroen Ketema
llvm-svn: 193220
There are two implementations of nextafter():
1. Using clang's __builtin_nextafter. Clang replaces this builtin with
a call to nextafter which is part of libm. Therefore, this
implementation will only work for targets with an implementation of
libm (e.g. most CPU targets).
2. The other implementation is written in OpenCL C. This function is
known internally as __clc_nextafter and can be used by targets that
don't have access to libm.
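
To illustrate how a nextafter() can be written without libm, here is a
minimal sketch in plain C. It is not the libclc __clc_nextafter code; the
name nextafter_sketch is hypothetical, and the sketch assumes that float is
an IEEE-754 binary32 value. It steps the bit pattern by one in the
direction of y.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, assuming IEEE-754 binary32 floats. */
static float nextafter_sketch(float x, float y) {
    uint32_t bits;
    float result;

    if (x != x || y != y)   /* either operand is NaN: propagate NaN */
        return x + y;
    if (x == y)             /* nothing to step over */
        return y;

    if (x == 0.0f) {
        /* Step off zero to the smallest subnormal with the sign of y. */
        bits = (y > 0.0f) ? 0x00000001u : 0x80000001u;
    } else {
        memcpy(&bits, &x, sizeof bits);
        /* Moving away from zero increments the bit pattern,
           moving toward zero decrements it. */
        if ((x < y) == (x > 0.0f))
            bits++;
        else
            bits--;
    }

    memcpy(&result, &bits, sizeof result);
    return result;
}
```

Because the sketch only uses integer arithmetic and comparisons, it has no
dependency on a math library, which is the property the second
implementation needs.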
llvm-svn: 192383
We already have a working mul_hi, and the spec gives us the implementation as:
Returns mul_hi(a,b)+c.
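
As a rough illustration of that definition, the following plain C sketch
covers only the 32-bit unsigned case. The helper names mul_hi_u32 and
mad_hi_u32 are hypothetical and are not the libclc functions; the widening
multiply stands in for whatever mul_hi does on a given target.

```c
#include <stdint.h>

/* High 32 bits of the 64-bit product a * b. */
static uint32_t mul_hi_u32(uint32_t a, uint32_t b) {
    return (uint32_t)(((uint64_t)a * (uint64_t)b) >> 32);
}

/* mad_hi as defined above: mul_hi(a, b) + c, wrapping modulo 2^32. */
static uint32_t mad_hi_u32(uint32_t a, uint32_t b, uint32_t c) {
    return mul_hi_u32(a, b) + c;
}
```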
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 190211
libclc is ABI-agnostic, and $prefix/lib/pkgconfig causes issues
on multilib setups. Using $prefix/share/pkgconfig allows us to reuse
a single libclc build across all system ABIs.
Patch by: Michał Górny
llvm-svn: 190107
Everything except long/ulong is handled by just casting to the next larger type,
doing the math and then shifting/casting the result.
For 64-bit types, we split each operand into its high and low halves and do
a FOIL-based (first, outer, inner, last) multiplication.
v2:
- Discard the Stack Overflow implementation due to copyright concerns.
- The implementation is still FOIL-based, but discards the previous code.
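
For readers unfamiliar with the approach, here is an independent sketch of
a FOIL-based high multiply for unsigned 64-bit operands in plain C. It is
not the code from this commit; the name mul_hi_u64 and the variable names
are hypothetical.

```c
#include <stdint.h>

/* High 64 bits of the 128-bit product x * y, using only 64-bit math. */
static uint64_t mul_hi_u64(uint64_t x, uint64_t y) {
    const uint64_t mask = 0xFFFFFFFFu;
    uint64_t x_lo = x & mask, x_hi = x >> 32;
    uint64_t y_lo = y & mask, y_hi = y >> 32;

    /* x * y = first * 2^64 + (outer + inner) * 2^32 + last */
    uint64_t first = x_hi * y_hi;
    uint64_t outer = x_hi * y_lo;
    uint64_t inner = x_lo * y_hi;
    uint64_t last  = x_lo * y_lo;

    /* Sum everything that lands in bits 32..63; at most 3 * (2^32 - 1),
       so this cannot overflow. Its upper half is the carry into bit 64. */
    uint64_t mid = (last >> 32) + (outer & mask) + (inner & mask);

    return first + (outer >> 32) + (inner >> 32) + (mid >> 32);
}
```

Each partial product is a 32-bit by 32-bit multiply, so none of them can
overflow 64 bits, and the carries are collected explicitly in mid.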
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 188684
rhadd = (x+y+1)>>1
Implemented as:
(x>>1) + (y>>1) + ((x&1)|(y&1))
This saves us from having to do the addition and overflow detection in
LLVM assembly.
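
The identity can be checked exhaustively for a small type. The following
plain C sketch does so for 8-bit unsigned values; the function names are
hypothetical, and the widened reference is only there for comparison.

```c
#include <stdint.h>

/* Reference: widen to 16 bits so x + y + 1 cannot overflow. */
static uint8_t rhadd_ref(uint8_t x, uint8_t y) {
    return (uint8_t)(((uint16_t)x + (uint16_t)y + 1u) >> 1);
}

/* Overflow-free form: halve each operand, then round up if either
   operand had its low bit set. */
static uint8_t rhadd_u8(uint8_t x, uint8_t y) {
    return (uint8_t)((x >> 1) + (y >> 1) + ((x & 1) | (y & 1)));
}

/* Counts the input pairs where the two forms disagree. */
static int rhadd_mismatches(void) {
    int bad = 0;
    for (int x = 0; x < 256; x++)
        for (int y = 0; y < 256; y++)
            if (rhadd_ref((uint8_t)x, (uint8_t)y) !=
                rhadd_u8((uint8_t)x, (uint8_t)y))
                bad++;
    return bad;
}
```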
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 188477
(x + y) >> 1 gets changed to:
(x>>1) + (y>>1) + (x&y&1)
This saves us from having to write any LLVM assembly or do overflow checking
in the addition.
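
As a sanity check of the rewrite, this plain C sketch compares both forms
for every pair of 8-bit unsigned values. The function names are
hypothetical, and the widened reference exists only for comparison.

```c
#include <stdint.h>

/* Reference: widen to 16 bits so x + y cannot overflow. */
static uint8_t hadd_ref(uint8_t x, uint8_t y) {
    return (uint8_t)(((uint16_t)x + (uint16_t)y) >> 1);
}

/* Overflow-free form: halve each operand, then add back the carry that
   is lost only when both low bits are set. */
static uint8_t hadd_u8(uint8_t x, uint8_t y) {
    return (uint8_t)((x >> 1) + (y >> 1) + (x & y & 1));
}

/* Counts the input pairs where the two forms disagree. */
static int hadd_mismatches(void) {
    int bad = 0;
    for (int x = 0; x < 256; x++)
        for (int y = 0; y < 256; y++)
            if (hadd_ref((uint8_t)x, (uint8_t)y) !=
                hadd_u8((uint8_t)x, (uint8_t)y))
                bad++;
    return bad;
}
```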
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 188476
Not hooked up to R600 yet due to current lack of support, at least on
Evergreen (EG).
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 188181
It's supported by the R600 LLVM back-end now, at least for Evergreen.
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 188180
The get_num_groups function was missing for r600g. The implementation
follows the same approach as the other workitem functions.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 187059