Use separate implementations instead of a macro
to ensure the constant multiplied with is of
higher precision.
v2: Use the correct formula, spotted by Dan Liew <daniel.liew@imperial.ac.uk>
Reviewed-by: Aaron Warty <awatry@gmail.com>
Reviewed-by: Tom Stellard <tom@stellard.net>
llvm-svn: 210891
We use ${DESTDIR} syntax now instead of $(DESTDIR) because that syntax
works both is the shell (at least it does for bash) and for make (at
least it does for GNU Make)
Patch By: Dan Liew
llvm-svn: 200414
OpenCL C lang says that trunc rounds towards zero.
llvm.trunc.* intrinsic rounds to integer not larger in magnitude.
These definitions are equivalent.
Patch by: Jan Vesely
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 197769
Some function definitions were using _CLC_DECL, which meant that they
weren't being marked as always_inline.
Reviewed-by and Tested-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 193754
This will prevent LLVM optimization passes from creating illegal uses
of the barrier() intrinsic (e.g. calling barrier() from a conditional
that is not executed by all threads).
llvm-svn: 193753
The C++ compiler used to build prepare-builtins
may differ from the llvm/clang for which we are
building libclc.
Use 'clang++' as the default compiler.
Patch by: Jeroen Ketema
llvm-svn: 193220
There are two implementations of nextafter():
1. Using clang's __builtin_nextafter. Clang replaces this builtin with
a call to nextafter which is part of libm. Therefore, this
implementation will only work for targets with an implementation of
libm (e.g. most CPU targets).
2. The other implementation is written in OpenCL C. This function is
known internally as __clc_nextafter and can be used by targets that
don't have access to libm.
llvm-svn: 192383
We already have a working mul_hi, and the spec gives us the implementation as:
Returns mul_hi(a,b)+c.
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 190211
libclc is ABI-agnostic, and $prefix/lib/pkgconfig causes issues
on multilib setups. Using $prefix/share/pkgconfig allows us to reuse
a single libclc build across all system ABIs.
Patch by: Michał Górny
llvm-svn: 190107
Everything except long/ulong is handled by just casting to the next larger type,
doing the math and then shifting/casting the result.
For 64-bit types, we break the high/low parts of each operand apart, and do
a FOIL-based multiplication.
v2:
Discard the stack-overflow implementation due to copyright concerns.
- The implementation is still FOIL-based, but discards the previous code.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 188684
rhadd = (x+y+1)>>1
Implemented as:
(x>>1) + (y>>1) + ((x&1)|(y&1))
This prevents us having to do assembly addition and overflow detection
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 188477