212 Commits

Author SHA1 Message Date
Tom Stellard 903a78b7c6 Implement sin builtin for float types
The double version still uses @llvm.sin.

llvm-svn: 213762
2014-07-23 15:16:21 +00:00
Tom Stellard c0ab2f81e3 Implement cos builtin for float types
The double version still uses @llvm.cos.

llvm-svn: 213761
2014-07-23 15:16:18 +00:00
Tom Stellard f9caca8b9d Implement atan2 builtin
llvm-svn: 213760
2014-07-23 15:16:16 +00:00
Tom Stellard 47882923c7 Implement atan builtin
llvm-svn: 213759
2014-07-23 15:16:13 +00:00
Aaron Watry 9ef589e9cf Add several missing double constant definitions
These were present in CL 1.0, just not implemented yet.

v2: Use hex values and fix commit message

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Jeroen Ketema <j.ketema@imperial.ac.uk>
CC: Matt Arsenault <Matthew.Arsenault@amd.com>
llvm-svn: 213321
2014-07-17 22:07:35 +00:00
Aaron Watry d7f022a582 relational: Implement isnotequal
v2: Use relational macros instead of hand-rolled ones

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 213320
2014-07-17 22:07:32 +00:00
Aaron Watry 30102536c0 relational: Implement isgreaterequal
v2: Use relational macros instead of hand-rolled macros

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 213319
2014-07-17 22:07:27 +00:00
Aaron Watry 803a992f04 relational: Implement isgreater
v2: Use relational macros instead of hand-rolled macros

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 213318
2014-07-17 22:07:19 +00:00
Aaron Watry 9335fe8eff relational/signbit: Refactor to use relational macros
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 213317
2014-07-17 22:05:25 +00:00
Aaron Watry d5aace4874 Fix isnan definition for vector results
Vector true is -1, not 1, which means we need to use the relational unary
macro instead of the normal unary builtin one.

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 213316
2014-07-17 22:05:22 +00:00
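The scalar/vector truth convention described in d5aace4874 can be illustrated in plain C. This is only a sketch: the `float4_sketch` and `int4_sketch` structs and all function names are hypothetical stand-ins for OpenCL vector types and are not taken from libclc.

```c
#include <stdint.h>

/* Hypothetical stand-ins for OpenCL's float4 and int4. */
typedef struct { float s[4]; } float4_sketch;
typedef struct { int32_t s[4]; } int4_sketch;

/* Helper that builds a four-lane value from scalars. */
float4_sketch make_float4_sketch(float a, float b, float c, float d) {
    float4_sketch v = {{a, b, c, d}};
    return v;
}

/* Produces a NaN at run time (0/0) so no libm call is needed. */
float make_nan_sketch(void) {
    volatile float zero = 0.0f;
    return zero / zero;
}

/* Scalar relational result: 1 for true, 0 for false. */
int isnan_scalar_sketch(float x) {
    return x != x; /* only NaN compares unequal to itself */
}

/* Vector relational result: -1 (all bits set) per true lane, 0 otherwise. */
int4_sketch isnan_vec4_sketch(float4_sketch v) {
    int4_sketch r;
    for (int i = 0; i < 4; ++i) {
        r.s[i] = -(int32_t)(v.s[i] != v.s[i]);
    }
    return r;
}
```

A unary wrapper that simply forwards the scalar result to each lane would yield 1 instead of -1, which is the mismatch the commit message describes.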
Aaron Watry 13116cf01a relational: create re-usable macros for relational declarations
relational.h includes relational macros for defining functions which need to
return 1 for scalar true and -1 for vector true.

I believe that this is the only place that this behavior is required, so the
macro is placed at its lowest useful level (same directory as it is used in).

This also creates re-usable unary/binary declaration and floatn includes which
should simplify relational builtin declarations.

Mostly patterned off of include/math/[binary_decl|unary_decl|floatn].inc
but with required changes for relational functions.

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 213315
2014-07-17 22:05:16 +00:00
Tom Stellard 8d92c396d9 prepare-builtins: Fix broken build due to recent LLVM API change
llvm-svn: 212470
2014-07-07 17:46:45 +00:00
Jeroen Ketema 575fb84cc3 OpenCL 1.1 does not define CL_VERSION_1_2 so use hardcoded number instead
Otherwise the test evaluates to true on OpenCL 1.1 and earlier. Since we
therefore cannot use the CL_VERSION_?_? macros, move them to the proper
position in the top-level header.

llvm-svn: 211787
2014-06-26 15:26:38 +00:00
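The pitfall behind 575fb84cc3 comes from a C preprocessor rule: an identifier that is not defined as a macro is replaced by 0 inside an `#if` expression. The sketch below reproduces it with hypothetical `SKETCH_*` names; the version value 110 and the hardcoded 120 are assumptions chosen to mimic an OpenCL 1.1 build, not values copied from the libclc headers.

```c
/* Pretend we are compiling for OpenCL 1.1 (hypothetical macro name). */
#define SKETCH_OPENCL_VERSION 110

/* SKETCH_CL_VERSION_1_2 is deliberately left undefined, as CL_VERSION_1_2
   would be on OpenCL 1.1. In #if it silently evaluates to 0. */
#if SKETCH_OPENCL_VERSION >= SKETCH_CL_VERSION_1_2
#define SKETCH_NAIVE_GUARD 1 /* taken: 110 >= 0 */
#else
#define SKETCH_NAIVE_GUARD 0
#endif

/* Comparing against a hardcoded number avoids the problem. */
#if SKETCH_OPENCL_VERSION >= 120
#define SKETCH_HARDCODED_GUARD 1
#else
#define SKETCH_HARDCODED_GUARD 0 /* taken: 110 < 120 */
#endif

int naive_guard_sketch(void) { return SKETCH_NAIVE_GUARD; }
int hardcoded_guard_sketch(void) { return SKETCH_HARDCODED_GUARD; }
```

Compilers can flag the first comparison when asked to (for example GCC's `-Wundef`), which is a cheap way to catch this class of bug.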
Aaron Watry d7f158e006 relational: Fix signbit
The vector components were mistakenly using () instead of {}, which caused
all but the last vector component to be dropped on the floor.

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Jeroen Ketema <j.ketema@imperial.ac.uk>
llvm-svn: 211733
2014-06-25 21:08:38 +00:00
Aaron Watry d9ee196eab relational: Implement signbit
v2 Changes:
   - use __builtin_signbit instead of shifting by hand
   - significantly improve vector shuffling
   - Works correctly now for signbit(float16) on radeonsi

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 211696
2014-06-25 13:29:23 +00:00
Jeroen Ketema 42df5d2a8f Add exp10
Reviewed-by: Tom Stellard <tom@stellard.net>
llvm-svn: 211680
2014-06-25 10:06:35 +00:00
Jeroen Ketema dd1fbc0082 Add half limits
These are apparently only defined in OpenCL 1.2.

HALF_MAX, HALF_MIN and HALF_EPSILON are currently omitted. Clang does
not seem to support the ‘h’ suffix for half float constants even with
the cl_khr_fp16 extension enabled.

Reviewed-by: Tom Stellard <tom@stellard.net>
llvm-svn: 211579
2014-06-24 09:51:01 +00:00
Jeroen Ketema 046b47fbbe Introduce CLC_VERSION macros v2
Add these out-of-order in clc.h so we can use these in other headers.

v2: Take into account the lack of a definition in OpenCL 1.0

Reviewed-by: Tom Stellard <tom@stellard.net>
llvm-svn: 211578
2014-06-24 09:46:52 +00:00
Jeroen Ketema 985a1381b2 Add MAXFLOAT
Align definitions while we are here.

Reviewed-by: Tom Stellard <tom@stellard.net>
llvm-svn: 211577
2014-06-24 09:41:28 +00:00
Jeroen Ketema 526fe2d501 Move clcmacro.h to avoid cluttering user namespace v2
v2: - use quotes instead of <>
    - add include to r600/lib/math/nextafter.c changed

Reviewed-by: Tom Stellard <tom@stellard.net>
Reviewed-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 211576
2014-06-24 09:36:32 +00:00
Jeroen Ketema bfdb1c0c2f Protect functions taking double by #ifdef cl_khr_fp64
Also change the order of the functions to be consistent with
the order in the header files.

llvm-svn: 211496
2014-06-23 14:15:39 +00:00
Jeroen Ketema d253e66ec7 Fix breakage after r211259
While we are here introduce the proper headers for the error code.

Reviewed-by: Tom Stellard <tom@stellard.net>
llvm-svn: 211432
2014-06-21 09:20:31 +00:00
Jeroen Ketema 09516fa27d Add pown
Reviewed-by: Tom Stellard <tom@stellard.net>
llvm-svn: 211211
2014-06-18 19:42:23 +00:00
Jeroen Ketema fdee0d3efe Add missing undefs
Reviewed-by: Tom Stellard <tom@stellard.net>
llvm-svn: 211210
2014-06-18 19:37:34 +00:00
Aaron Watry d9afe9def0 Fix definition of INFINITY and add NAN/HUGE_VAL[F]
v3: change __builtin_nanf() to __builtin_nanf("")
    This doesn't work yet, but it was agreed to commit as-is with the logic
    that "broken" is better than "completely missing" and this should be
    fixed in clang.

v2: use __builtin_inff() and also add nan/huge_val definitions

Signed-off-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 211065
2014-06-16 22:32:58 +00:00
Jeroen Ketema f3bd08ae63 Add remaining float constants
Reviewed-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 211062
2014-06-16 22:15:50 +00:00
Aaron Watry 50f518be65 Revert "clctypes.h: Don't rely on stddef.h for size_t and ptrdiff_t"
This reverts commit 4cf021ae67b6ea8cfd42aa76ce6f5e1c329e145a.

llvm-svn: 211049
2014-06-16 20:21:19 +00:00
Aaron Watry 6af2969a61 math: Implement mix builtin
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <tom@stellard.net>
llvm-svn: 211047
2014-06-16 19:53:59 +00:00
Aaron Watry f7f79d2a94 relational: Add isequal(floatN) builtin
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <tom@stellard.net>
llvm-svn: 211046
2014-06-16 19:53:57 +00:00
Aaron Watry e167db9238 Add all(igentype) builtin
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <tom@stellard.net>
llvm-svn: 211045
2014-06-16 19:53:54 +00:00
Aaron Watry c164fc384b clctypes.h: Don't rely on stddef.h for size_t and ptrdiff_t
llvm-svn: 211044
2014-06-16 19:53:52 +00:00
Jan Vesely bd37b6884c Add intptr types
Based on clang's stdint.h

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <tom@stellard.net>
llvm-svn: 210933
2014-06-13 19:43:18 +00:00
Jeroen Ketema e2a0f050d8 Add files forgotten in the previous commit
llvm-svn: 210896
2014-06-13 12:33:40 +00:00
Jeroen Ketema 82aaa41286 Implementations for exp(float) and exp(double) v2
Use separate implementations instead of a macro
to ensure the constant multiplied with is of
higher precision.

v2: Use the correct formula, spotted by Dan Liew <daniel.liew@imperial.ac.uk>

Reviewed-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <tom@stellard.net>
llvm-svn: 210891
2014-06-13 09:40:09 +00:00
Tom Stellard 0ab103422d prepare-builtins: Use std:: prefix for error_code
This fixes the build with newer LLVM.

llvm-svn: 210867
2014-06-13 01:30:14 +00:00
Jeroen Ketema a3089bd7e5 Remove unused include which breaks build after r210803
Tested with llvm 3.4 and trunk.

llvm-svn: 210825
2014-06-12 21:10:17 +00:00
Jeroen Ketema f3943efe17 Fix build broken by LLVM commit r209103
Reviewed-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 210111
2014-06-03 15:43:57 +00:00
Jeroen Ketema 75c1a0c6e2 Add more log related float constants
llvm-svn: 209850
2014-05-29 21:30:28 +00:00
Jeroen Ketema d1bb82a722 Fix _F definitions
The 'f' was missing and, hence, the values were
considered to be doubles instead of floats.

Reviewed by: Tom Stellard

llvm-svn: 209849
2014-05-29 21:29:34 +00:00
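The effect of the missing 'f' in d1bb82a722 can be seen in plain C, where an unsuffixed floating-point constant has type double. The macro names below are hypothetical and the constant is just an illustrative hexadecimal value near pi, not the definition used in libclc.

```c
/* Without a suffix the constant has type double. */
#define SKETCH_M_PI_UNSUFFIXED 0x1.921fb6p+1
/* With the 'f' suffix the constant has type float. */
#define SKETCH_M_PI_F          0x1.921fb6p+1f

int unsuffixed_is_double_sketch(void) {
    return sizeof(SKETCH_M_PI_UNSUFFIXED) == sizeof(double);
}

int suffixed_is_float_sketch(void) {
    return sizeof(SKETCH_M_PI_F) == sizeof(float);
}

/* A float multiplied by the unsuffixed constant is computed as a double,
   because the usual arithmetic conversions widen the float operand. */
int product_widens_to_double_sketch(float x) {
    return sizeof(x * SKETCH_M_PI_UNSUFFIXED) == sizeof(double);
}
```

On targets without double support, that silent widening is more than a performance concern, which is presumably why the suffix matters for the `_F` constants.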
Jeroen Ketema a16fdbfac2 Add definition for M_PI
Reviewed by: Tom Stellard

llvm-svn: 209848
2014-05-29 21:24:57 +00:00
Tom Stellard 98dccb109b Fix build broken by LLVM commit r207593
llvm-svn: 207685
2014-04-30 18:35:20 +00:00
Tom Stellard 998602dac2 Remove clc/gentype.inc
This file duplicates clc/math/gentype.inc and is not
actually being used.

Patch by: Jeroen Ketema

llvm-svn: 207684
2014-04-30 18:35:17 +00:00
Tom Stellard f83fe5a6dc Introduce M_LOG2E_F and M_LOG2E
Patch by: Jeroen Ketema

llvm-svn: 205055
2014-03-28 21:19:03 +00:00
Tom Stellard ce43db105e Replace tabs by spaces
Patch by: Jeroen Ketema

llvm-svn: 205054
2014-03-28 21:19:00 +00:00
Tom Stellard 6378f7a5e2 Add definition for M_PI_F v3
v2:
  - Use a hexadecimal constant.

v3:
  - Use a hexadecimal constant in floating-point notation.

llvm-svn: 204666
2014-03-24 20:36:44 +00:00
Tom Stellard 3a12fc6a07 Add sincos
Patch by: Jeroen Ketema

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 204478
2014-03-21 16:22:01 +00:00
Tom Stellard 074e7a8ed0 Add cross for double3 and double4
Patch by: Jeroen Ketema

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 204477
2014-03-21 16:21:58 +00:00
Tom Stellard 7aee1cfa99 Fix build since r202052
sys::fs_F_Binary has been replaced with sys::fs_F_Text

llvm-svn: 202081
2014-02-24 21:31:56 +00:00
Tom Stellard d873c8e05a Add generic nvptx targets
These do not import the code specific to nvidiacl

Patch by: Jeroen Ketema

llvm-svn: 201431
2014-02-14 20:33:53 +00:00
Tom Stellard 55d3746dbe Revert "Enforce python2 for systems that use python3 as their default."
This reverts commit r200413.

This was breaking the build on systems where the python 2.x executable
was called python.

llvm-svn: 201239
2014-02-12 14:54:17 +00:00
Tom Stellard 0d35ed912a Updated README.TXT with information about using DESTDIR and building with Ninja.
Patch by: Dan Liew

llvm-svn: 200416
2014-01-29 20:03:28 +00:00
Tom Stellard 8a3770ab97 Fixed rules names so they are unique when aliases are present.
This is necessary for building with Ninja because it does not allow
duplicate rule names.

Patch by: Dan Liew

llvm-svn: 200415
2014-01-29 20:03:27 +00:00
Tom Stellard 91d51db800 Fixed ninja build issues relating to use of $(DESTDIR)
We use ${DESTDIR} syntax now instead of $(DESTDIR) because that syntax
works both in the shell (at least it does for bash) and in make (at
least it does for GNU Make).

Patch By: Dan Liew

llvm-svn: 200414
2014-01-29 20:03:26 +00:00
Tom Stellard ac0fb621ce Enforce python2 for systems that use python3 as their default.
Patch by: Dan Liew

llvm-svn: 200413
2014-01-29 20:03:24 +00:00
Tom Stellard 8a63b15b3c Fix build broken by LLVM commit r199279
Patch by: Udo van den Heuvel

Tom Stellard:
  - Added ifdef and error handling

llvm-svn: 199687
2014-01-20 20:28:48 +00:00
NAKAMURA Takumi d40d387fb1 Update the copyright credits -- Happy new year 2014!
FIXME: Dragonegg may be updated at non-trivial changes.
llvm-svn: 198274
2014-01-01 08:27:31 +00:00
Aaron Watry 8ef48d07ef Pass -fno-builtin flag to clang to silence warnings
Reviewed-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 198168
2013-12-29 16:39:55 +00:00
Aaron Watry b38037f7b7 Fix build with LLVM 3.5
Reviewed-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 198167
2013-12-29 16:39:53 +00:00
Tom Stellard ce0709aa61 Add floating-point macro definitions v2
v2:
  - Fix typo.

Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 197784
2013-12-20 05:13:42 +00:00
Tom Stellard 1f3c9ba9f1 Implement trunc builtin.
OpenCL C lang says that trunc rounds towards zero.
llvm.trunc.* intrinsic rounds to integer not larger in magnitude.
These definitions are equivalent.

Patch by: Jan Vesely

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 197769
2013-12-20 02:08:46 +00:00
Tom Stellard 8bb6cb8009 Fix a C&P error in r195021 (65a950abab3cb8435ccb2646ac4773986c995c81)
Patch by: Kai Wasserbäch

Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
llvm-svn: 195898
2013-11-28 00:17:29 +00:00
Tom Stellard eedd3064de R600: Add aliases for Sea Islands GPUs
llvm-svn: 195023
2013-11-18 18:21:30 +00:00
Tom Stellard 5abf149bf3 Implement round builtin
llvm-svn: 195022
2013-11-18 18:21:27 +00:00
Tom Stellard 457e35912e Implement builtins for cl_khr_global_int32_base_atomics extension
llvm-svn: 195021
2013-11-18 18:21:23 +00:00
Tom Stellard 3a9632d544 s/_CLC_DECL/_CLC_DEF/
Some function definitions were using _CLC_DECL, which meant that they
weren't being marked as always_inline.

Reviewed-by and Tested-by: Aaron Watry <awatry@gmail.com>

llvm-svn: 193754
2013-10-31 15:50:53 +00:00
Tom Stellard d2e83929a9 R600: Set the noduplicate attribute on barrier() intrinsics
This will prevent LLVM optimization passes from creating illegal uses
of the barrier() intrinsic (e.g. calling barrier() from a conditional
that is not executed by all threads).

llvm-svn: 193753
2013-10-31 15:50:48 +00:00
Tom Stellard 9fabcb3edb Clean-up dependency files
Patch by: Jeroen Ketema

llvm-svn: 193221
2013-10-23 02:49:33 +00:00
Tom Stellard 9f48bb3b9a Make C++ compiler configurable
The C++ compiler used to build prepare-builtins
may differ from the llvm/clang for which we are
building libclc.

Use 'clang++' as the default compiler.

Patch by: Jeroen Ketema

llvm-svn: 193220
2013-10-23 02:49:27 +00:00
Tom Stellard f21e3ea972 Port pocl's gen_convert.py script to libclc
This script generates implementations for the entire set of convert_*
functions.

llvm-svn: 192385
2013-10-10 19:09:01 +00:00
Tom Stellard 436bf70519 Implement sign() builtin
llvm-svn: 192384
2013-10-10 19:08:56 +00:00
Tom Stellard 6c7b86c106 Implement nextafter() builtin
There are two implementations of nextafter():
1. Using clang's __builtin_nextafter.  Clang replaces this builtin with
a call to nextafter which is part of libm.  Therefore, this
implementation will only work for targets with an implementation of
libm (e.g. most CPU targets).

2. The other implementation is written in OpenCL C.  This function is
known internally as __clc_nextafter and can be used by targets that
don't have access to libm.

llvm-svn: 192383
2013-10-10 19:08:51 +00:00
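The second approach in 6c7b86c106, a nextafter that does not depend on libm, can be sketched in C by stepping the IEEE-754 bit pattern. This is an illustrative single-precision version under the assumption of 32-bit IEEE floats; it is not the `__clc_nextafter` source, and the function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a float as its raw 32-bit pattern and back. */
static uint32_t float_to_bits_sketch(float f) {
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

static float bits_to_float_sketch(uint32_t u) {
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* Next representable float after x in the direction of y. */
float nextafterf_sketch(float x, float y) {
    if (x != x || y != y) {
        return x + y; /* propagate NaN */
    }
    if (x == y) {
        return y;
    }
    if (x == 0.0f) {
        /* Smallest subnormal carrying the sign of y. */
        return bits_to_float_sketch((float_to_bits_sketch(y) & 0x80000000u) | 1u);
    }
    uint32_t bits = float_to_bits_sketch(x);
    /* Moving away from zero grows the magnitude bits; moving toward zero
       shrinks them. The sign bit is left untouched. */
    if ((x < y) == (x > 0.0f)) {
        bits += 1u;
    } else {
        bits -= 1u;
    }
    return bits_to_float_sketch(bits);
}
```

Because it uses only integer operations and comparisons, a routine of this shape can run on targets that have no libm to call into.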
Tom Stellard e36e9dec65 Implement isnan() builtin
llvm-svn: 192382
2013-10-10 19:08:41 +00:00
Tom Stellard ef13294c93 Add missing as_{float,double} functions
llvm-svn: 192381
2013-10-10 19:08:29 +00:00
Aaron Watry dfd8afa02b Parenthesize arguments for mad_hi
Thanks to Jordan Rose <jordan_rose@apple.com> for pointing this out.

llvm-svn: 190310
2013-09-09 14:36:21 +00:00
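Why parenthesizing macro arguments matters for something like mad_hi (dfd8afa02b) can be shown with a small C sketch. It assumes mad_hi is written as a function-like macro over mul_hi, which the log does not confirm; all names here are hypothetical, and the 32-bit mul_hi uses the widening-cast approach.

```c
#include <stdint.h>

/* High 32 bits of a 32x32 multiply, via the next larger type. */
static uint32_t mul_hi_u32_sketch(uint32_t a, uint32_t b) {
    return (uint32_t)(((uint64_t)a * b) >> 32);
}

/* Arguments pasted in as-is: operator precedence can split them. */
#define MAD_HI_UNSAFE_SKETCH(a, b, c) (mul_hi_u32_sketch(a, b) + c)
/* Each argument parenthesized: it is evaluated as one unit. */
#define MAD_HI_SAFE_SKETCH(a, b, c) (mul_hi_u32_sketch((a), (b)) + (c))

/* Expands to (mul_hi(a, b) + flags & 1u), i.e. ((mul_hi + flags) & 1u). */
uint32_t mad_hi_unsafe_demo(uint32_t a, uint32_t b, uint32_t flags) {
    return MAD_HI_UNSAFE_SKETCH(a, b, flags & 1u);
}

/* Expands to (mul_hi((a), (b)) + (flags & 1u)), as intended. */
uint32_t mad_hi_safe_demo(uint32_t a, uint32_t b, uint32_t flags) {
    return MAD_HI_SAFE_SKETCH(a, b, flags & 1u);
}
```

With a = 0x80000000, b = 4 and flags = 3, mul_hi is 2, so the safe form gives 3 while the unsafe form gives (2 + 3) & 1 = 1.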
Aaron Watry 3466342f57 Implement mad_hi built-in
We already have a working mul_hi, and the spec gives us the implementation as:
Returns mul_hi(a,b)+c.

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 190211
2013-09-06 22:09:51 +00:00
Aaron Watry 283e3fa011 Add atomic_sub and atomic_dec builtin functions
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 190201
2013-09-06 20:20:21 +00:00
Tom Stellard 93d674f7b3 Place pkg-config file in $prefix/share/pkgconfig.
libclc is ABI-agnostic, and $prefix/lib/pkgconfig causes issues
on multilib setups. Using $prefix/share/pkgconfig allows us to reuse
a single libclc build across all system ABIs.

Patch by: Michał Górny

llvm-svn: 190107
2013-09-05 23:27:58 +00:00
Aaron Watry 7171a2f965 Remove unneeded semi-colons
Reviewed-By: Aaron Watry <awatry@gmail.com>
llvm-svn: 190059
2013-09-05 16:04:07 +00:00
Aaron Watry 50a7bcbac9 Add atomic_inc and atomic_add builtins
Reviewed-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 190058
2013-09-05 16:04:01 +00:00
Aaron Watry fbe439f8c0 Add mul_hi implementation [v2]
Everything except long/ulong is handled by just casting to the next larger type,
doing the math and then shifting/casting the result.

For 64-bit types, we break the high/low parts of each operand apart, and do
a FOIL-based multiplication.

v2:
  Discard the stack-overflow implementation due to copyright concerns.
  - The implementation is still FOIL-based, but discards the previous code.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 188684
2013-08-19 18:31:49 +00:00
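The FOIL-based 64-bit multiplication mentioned in fbe439f8c0 can be sketched in C as follows. This is an independent illustration of the technique for unsigned operands, not the libclc code, and the function name is hypothetical.

```c
#include <stdint.h>

/* High 64 bits of the 128-bit product x * y, using only 64-bit arithmetic. */
uint64_t mul_hi_u64_sketch(uint64_t x, uint64_t y) {
    /* Split each operand into 32-bit halves. */
    uint64_t x_lo = x & 0xFFFFFFFFu, x_hi = x >> 32;
    uint64_t y_lo = y & 0xFFFFFFFFu, y_hi = y >> 32;

    /* First, Outer, Inner, Last partial products; none can overflow. */
    uint64_t first = x_hi * y_hi;
    uint64_t outer = x_hi * y_lo;
    uint64_t inner = x_lo * y_hi;
    uint64_t last  = x_lo * y_lo;

    /* Sum everything that lands in bits 32..63 of the full product; the
       upper half of this sum is the carry into the high word. */
    uint64_t mid = (last >> 32) + (outer & 0xFFFFFFFFu) + (inner & 0xFFFFFFFFu);

    return first + (outer >> 32) + (inner >> 32) + (mid >> 32);
}
```

The narrower types need none of this: as the message says, they can be widened, multiplied, and shifted back down.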
Aaron Watry 8548725f29 Add rhadd builtin
rhadd = (x+y+1)>>1

Implemented as:
(x>>1) + (y>>1) + ((x&1)|(y&1))

This prevents us having to do assembly addition and overflow detection

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 188477
2013-08-15 19:21:10 +00:00
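The rhadd formula from 8548725f29 translates directly to C. The sketch below covers only the unsigned 32-bit case, and the function name is hypothetical.

```c
#include <stdint.h>

/* Rounded halving add: (x + y + 1) >> 1 without the intermediate overflow. */
uint32_t rhadd_u32_sketch(uint32_t x, uint32_t y) {
    /* Halve each operand first, then round up if either low bit was set. */
    return (x >> 1) + (y >> 1) + ((x & 1u) | (y & 1u));
}
```

For example, with both inputs at UINT32_MAX the naive `x + y + 1` wraps around, while this form returns UINT32_MAX.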
Aaron Watry 7659157f1b Add hadd builtin
(x + y) >> 1 gets changed to:
(x>>1) + (y>>1) + (x&y&1)

Saves us having to do any llvm assembly and overflow checking in the addition.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 188476
2013-08-15 19:21:07 +00:00
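The hadd rewrite from 7659157f1b can be sketched the same way in C, again for the unsigned 32-bit case only and with a hypothetical function name.

```c
#include <stdint.h>

/* Halving add: (x + y) >> 1 without the intermediate overflow. */
uint32_t hadd_u32_sketch(uint32_t x, uint32_t y) {
    /* Halve each operand, then add back the carry that is lost only when
       both low bits are set. */
    return (x >> 1) + (y >> 1) + (x & y & 1u);
}
```

The only difference from rhadd is the correction term: `x & y & 1` rounds down, whereas `(x & 1) | (y & 1)` rounds up.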
Aaron Watry 0c21c7c747 Add intN vloadN() implementations for address spaces 3 and 4
Not hooked up to R600 yet due to current lack of support, at least on EG.

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 188181
2013-08-12 14:42:51 +00:00
Aaron Watry c0aa6e0291 Enable assembly vload3 int/uint constant/global for R600
It's supported by the R600 LLVM back-end now, at least for evergreen.

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 188180
2013-08-12 14:42:50 +00:00
Aaron Watry 7d52565321 Add vload* for addrspace(2) and use as constant load for R600
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 188179
2013-08-12 14:42:49 +00:00
Tom Stellard 41ef85df0a Add some missing convert_* functions
llvm-svn: 188131
2013-08-10 03:40:37 +00:00
Tom Stellard abbfd2bde0 Implement generic rint()
llvm-svn: 188130
2013-08-10 03:40:33 +00:00
Tom Stellard da920eab42 configure: Fix build when clang is installed to a non-standard prefix
llvm-svn: 188129
2013-08-10 03:40:26 +00:00
Aaron Watry 88ac12591c Add missing integer min/max definitions
Found in CL 1.1 spec section 6.11.3

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 187200
2013-07-26 13:02:02 +00:00
Aaron Watry bde11213e7 Added get_num_groups
The get_num_groups function was missing for r600g. I did the same
thing as the other workitem functions.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 187059
2013-07-24 18:03:38 +00:00
Aaron Watry 1769b1fca9 Implement generic upsample()
Reduces all vector upsamples down to their scalar components, so probably
not the most efficient thing in the world, but it does what the
spec says it needs to do.

Another possible implementation would be to convert/cast everything as
unsigned if necessary, upsample the input vectors, create the upsampled
value, and then cast back to signed if required.

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
llvm-svn: 186691
2013-07-19 16:44:37 +00:00
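The component-wise approach described in 1769b1fca9 can be sketched in C for the 8-bit to 16-bit case. The function names are hypothetical and plain arrays stand in for OpenCL vector types; the signed variant goes through unsigned arithmetic because left-shifting a negative value is undefined in C.

```c
#include <stdint.h>

/* Scalar upsample: hi becomes the upper byte, lo the lower byte. */
uint16_t upsample_u8_sketch(uint8_t hi, uint8_t lo) {
    return (uint16_t)(((uint16_t)hi << 8) | lo);
}

/* Signed variant: the sign comes from hi; lo is always unsigned. */
int16_t upsample_s8_sketch(int8_t hi, uint8_t lo) {
    return (int16_t)(uint16_t)(((uint16_t)(uint8_t)hi << 8) | lo);
}

/* "Vector" upsample reduced to its scalar components, lane by lane. */
void upsample_u8x4_sketch(const uint8_t hi[4], const uint8_t lo[4],
                          uint16_t out[4]) {
    for (int i = 0; i < 4; ++i) {
        out[i] = upsample_u8_sketch(hi[i], lo[i]);
    }
}
```

The alternative the message mentions, operating on whole vectors as unsigned and casting back, would avoid the per-lane loop at the cost of extra conversions.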
Aaron Watry 0da3d3b5ba Fix build with LLVM 3.4
F_Binary and friends were moved to include/Support/FileSystem.h

v2: Maintain compatibility with LLVM 3.3

Signed-off-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 186610
2013-07-18 21:24:35 +00:00
Aaron Watry 99a2f3b274 Fix and re-enable R600 vload/vstore assembly
The assembly optimizations were making unsafe assumptions about which address
spaces had which identifiers.

Also, fix vload/vstore with 64-bit pointers. This was broken previously on
Radeon SI.

This version still only has assembly versions of int/uint 2/4/8/16 for global
loads and stores on R600, but it does it in a way that would be very easily
extended to private/local/constant and could also be handled easily on other
architectures.

v2: 1) Leave v[load|store]_impl.ll in generic/lib
    2) Remove vload_if.ll and vstore_if.ll interfaces
    3) Fix address+offset calculations
    4) Remove offset from assembly arg list
llvm-svn: 186416
2013-07-16 14:29:01 +00:00
Aaron Watry 4cb7cf276d libclc: vload/vstore disable assembly and fix offset calculation
This commit gets us back to pure CLC and fixes offset calculations.

The next commit will re-enable the assembly implementation for R600,
fix bugs related to 64-bit address spaces, and also fix the
incorrect assumption that address space identifiers are the same in
all architectures.

llvm-svn: 186415
2013-07-16 14:28:58 +00:00
Tom Stellard eaa534450c Add integer-gentype.inc: Missing file from r185839
llvm-svn: 186326
2013-07-15 15:20:05 +00:00
Tom Stellard 6f33168bb7 Implement mad24() and mul24() builtins
Reviewed-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 185839
2013-07-08 17:27:13 +00:00
Tom Stellard d768ac0395 Add __CLC_ prefix to all macro definitions in headers
libclc was defining and undefing GENTYPE and several other macros with
common names in its header files.  This was preventing applications from
defining macros with identical names as command line arguments to the
compiler, because the definitions in the header files were masking the
macros defined as compiler arguments.

Reviewed-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 185838
2013-07-08 17:27:02 +00:00
Tom Stellard 3a81b5d083 Implement barrier() builtin
Reviewed and Tested-by: Aaron Watry <awatry@gmail.com>

llvm-svn: 185837
2013-07-08 17:26:39 +00:00
Tom Stellard a4cadba551 Add bitselect() builtin
Reviewed-By: Aaron Watry <awatry@gmail.com>
llvm-svn: 185836
2013-07-08 17:26:33 +00:00
Tom Stellard 64b3bbae1e libclc: Add assembly versions of vstore for global [u]int4/8/16
The assembly should be generic, but at least currently R600 only supports
32-bit stores of [u]int1/4, and I believe that only global is well-supported.

R600 lowers the 8/16 component stores to multiple 4-component stores.

The unoptimized C versions of the other stuff are left in place.

Patch by: Aaron Watry

llvm-svn: 185009
2013-06-26 18:22:20 +00:00