15 Commits

Author SHA1 Message Date
Aaron Watry f991505d02 vload/vstore: Use casts instead of scalarizing everything in CLC version
This generates bitcode which is indistinguishable from what was
hand-written for int32 types in v[load|store]_impl.ll.

v4: Use vec2+scalar for vec3 load/stores to prevent corruption (per Tom)
v3: Also remove unused generic/lib/shared/v[load|store]_impl.ll
v2: (Per Matt Arsenault) Fix alignment issues with vector load stores

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
CC: Matt Arsenault <Matthew.Arsenault@amd.com>
CC: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 216069
2014-08-20 13:58:57 +00:00
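The v4 note above (vec2+scalar for vec3 loads and stores) can be sketched in plain C. This is an illustration only, not the libclc source: `demo_int2`, `demo_int3`, `demo_vload3` and `demo_vstore3` are hypothetical names, and `memcpy` stands in for the pointer cast used in CLC. One plausible reading of the "corruption" is that an OpenCL 3-component vector occupies the storage of a 4-component one, so a single vector-wide store to a packed array would overwrite the first component of the next element.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for OpenCL's int2 and int3 vector types. */
typedef struct { int x, y; } demo_int2;
typedef struct { int x, y, z; } demo_int3;

/* Load element `offset` of a packed int3 array: its components live at
 * p[3*offset] .. p[3*offset + 2]. Two components are read as a pair and
 * the third as a scalar, so nothing past the third component is touched. */
demo_int3 demo_vload3(size_t offset, const int *p)
{
    assert(p != NULL);
    const int *base = p + 3 * offset;

    demo_int2 lo;
    memcpy(&lo, base, sizeof lo);   /* vec2 part: components 0 and 1 */
    int hi = base[2];               /* scalar part: component 2 */

    demo_int3 v = { lo.x, lo.y, hi };
    return v;
}

/* Store with the same split, leaving the neighbouring element intact. */
void demo_vstore3(demo_int3 v, size_t offset, int *p)
{
    assert(p != NULL);
    int *base = p + 3 * offset;

    demo_int2 lo = { v.x, v.y };
    memcpy(base, &lo, sizeof lo);   /* vec2 part */
    base[2] = v.z;                  /* scalar part */
}
```

Storing to element 0 this way writes exactly three ints, so element 1 of the same buffer is unchanged.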
Aaron Watry 0c21c7c747 Add intN vloadN() implementations for address spaces 3 and 4
Not hooked up to R600 yet due to current lack of support, at least on EG.

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 188181
2013-08-12 14:42:51 +00:00
Aaron Watry 7d52565321 Add vload* for addrspace(2) and use as constant load for R600
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 188179
2013-08-12 14:42:49 +00:00
Aaron Watry 99a2f3b274 Fix and re-enable R600 vload/vstore assembly
The assembly optimizations were making unsafe assumptions about which address
spaces had which identifiers.

Also, fix vload/vstore with 64-bit pointers. This was broken previously on
Radeon SI.

This version still only has assembly versions of int/uint 2/4/8/16 for global
loads and stores on R600, but it does it in a way that would be very easily
extended to private/local/constant and could also be handled easily on other
architectures.

v2: 1) Leave v[load|store]_impl.ll in generic/lib
    2) Remove vload_if.ll and vstore_if.ll interfaces
    3) Fix address+offset calculations
    4) Remove offset from assembly arg list
llvm-svn: 186416
2013-07-16 14:29:01 +00:00
Aaron Watry 4cb7cf276d libclc: vload/vstore disable assembly and fix offset calculation
This commit gets us back to pure CLC and fixes offset calculations.

The next commit will re-enable the assembly implementation for R600,
fix bugs related to 64-bit address spaces, and also fix the
incorrect assumption that address space identifiers are the same in
all architectures.

llvm-svn: 186415
2013-07-16 14:28:58 +00:00
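The commit message does not say what was wrong with the offset calculation. For reference, the OpenCL specification defines vloadn(offset, p) as reading from address p + (offset * n), with the offset counted in whole vectors rather than bytes or scalar elements. A minimal C sketch of that rule, with hypothetical names (`demo_int4`, `demo_vload4`) and assuming a 4-component int vector:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for OpenCL's int4. */
typedef struct { int s[4]; } demo_int4;

/* vload4(offset, p): the offset is in units of whole vectors, so the
 * first component read is p[offset * 4]. */
demo_int4 demo_vload4(size_t offset, const int *p)
{
    assert(p != NULL);
    const int *base = p + offset * 4;

    demo_int4 v;
    for (int i = 0; i < 4; ++i)
        v.s[i] = base[i];
    return v;
}
```

With a buffer holding 0, 1, 2, ... in order, `demo_vload4(2, buf)` returns components 8 through 11.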
Tom Stellard d768ac0395 Add __CLC_ prefix to all macro definitions in headers
libclc was defining and undefing GENTYPE and several other macros with
common names in its header files.  This was preventing applications from
defining macros with identical names as command line arguments to the
compiler, because the definitions in the header files were masking the
macros defined as compiler arguments.

Reviewed-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 185838
2013-07-08 17:27:02 +00:00
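The effect of the rename can be sketched in a single C file. This is a simulation, not the libclc headers: the first `#define` stands in for a `-DGENTYPE=7` compiler argument, and `DEMO_CLC_GENTYPE` is a hypothetical stand-in for the prefixed name (identifiers beginning with two underscores are reserved in ordinary C code, so the sketch avoids them).

```c
/* Stands in for passing -DGENTYPE=7 on the compiler command line. */
#define GENTYPE 7

/* Simulated library header after the rename: it defines and undefines
 * only its own prefixed macro, never the bare name GENTYPE. */
#define DEMO_CLC_GENTYPE float
DEMO_CLC_GENTYPE demo_half(DEMO_CLC_GENTYPE x) { return x * 0.5f; }
#undef DEMO_CLC_GENTYPE

/* Application code: GENTYPE still has the command-line value. */
int demo_user_value(void) { return GENTYPE; }
```

Had the simulated header used the bare name, its `#undef GENTYPE` would have removed the application's definition before `demo_user_value` was compiled.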
Tom Stellard 64b3bbae1e libclc: Add assembly versions of vstore for global [u]int4/8/16
The assembly should be generic, but at least currently R600 only supports
32-bit stores of [u]int1/4, and I believe that only global is well-supported.

R600 lowers the 8/16 component stores to multiple 4-component stores.

The unoptimized C versions of everything else are left in place.

Patch by: Aaron Watry

llvm-svn: 185009
2013-06-26 18:22:20 +00:00
Tom Stellard 922ac056e3 libclc: Add assembly versions of vload for global int4/8/16
The assembly should be generic, but at least currently R600 only supports
32-bit loads of int1/4, and I believe that only global is well-supported.

R600 lowers the 8/16 component vectors to multiple 4-component loads.

The unoptimized C versions of everything else are left in place.

Patch by: Aaron Watry

llvm-svn: 185008
2013-06-26 18:22:15 +00:00
Tom Stellard 51441f80c5 libclc: Initial vstore implementation
Assumes that the target supports byte-addressable stores.

Completely unoptimized.

Patch by: Aaron Watry

llvm-svn: 185007
2013-06-26 18:22:11 +00:00
Tom Stellard 66ecbc7c18 libclc: Initial vload implementation
Should work for all targets and data types.  Completely unoptimized.

Patch by: Aaron Watry

llvm-svn: 185006
2013-06-26 18:22:05 +00:00
Tom Stellard 34f513df7c libclc: Add clamp(vec, scalar, scalar) and max(vec, scalar)
For any GENTYPE that isn't scalar, we need to implement a mixed
vector/scalar version of clamp/max.

This depends on the min() patches I sent to the list a few minutes ago.

Patch by: Aaron Watry

llvm-svn: 185003
2013-06-26 18:21:49 +00:00
Tom Stellard 075b31a2fa libclc: Implement the min(vec, scalar) version of the min builtin.
Checks if the current GENTYPE is scalar, and if not, then defines a separate
implementation of the function which casts the second arg to vector before
proceeding.

Patch by: Aaron Watry

llvm-svn: 185002
2013-06-26 18:21:44 +00:00
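The approach described in this message (widen the scalar second argument to a vector, then reuse the vector/vector version) might look like the following in plain C. The names `demo_int4`, `demo_splat`, `demo_min_vv` and `demo_min_vs` are hypothetical, and a struct of four ints stands in for an OpenCL vector type; in CLC the widening would be a cast rather than a helper function.

```c
/* Hypothetical stand-in for OpenCL's int4. */
typedef struct { int s[4]; } demo_int4;

/* Widen a scalar so that every component holds the same value. */
static demo_int4 demo_splat(int a)
{
    demo_int4 v = { { a, a, a, a } };
    return v;
}

/* Component-wise min of two vectors. */
demo_int4 demo_min_vv(demo_int4 a, demo_int4 b)
{
    demo_int4 r;
    for (int i = 0; i < 4; ++i)
        r.s[i] = (b.s[i] < a.s[i]) ? b.s[i] : a.s[i];
    return r;
}

/* min(vec, scalar): convert the second argument to a vector, then defer
 * to the vector/vector implementation. */
demo_int4 demo_min_vs(demo_int4 a, int b)
{
    return demo_min_vv(a, demo_splat(b));
}
```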
Tom Stellard 0be3acfc70 libclc: implement initial version of min()
This doesn't handle the integer cases for min(vector, scalar).

Patch by: Aaron Watry

llvm-svn: 185001
2013-06-26 18:21:38 +00:00
Tom Stellard cb133c9322 libclc: Move max builtin to shared/
Max(x,y) is available for all integer/floating types.

Patch by: Aaron Watry

llvm-svn: 184995
2013-06-26 18:21:06 +00:00
Tom Stellard fe23a30ef5 libclc: Add clamp() builtin for integer/floating point
Created under a new shared/ directory for functions which are available for
both integer and floating point types.

Patch by: Aaron Watry

llvm-svn: 184994
2013-06-26 18:20:56 +00:00
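As a rough illustration of one definition serving both integer and floating-point types, the sketch below generates a clamp function per type from a single macro body. It assumes the usual definition clamp(x, lo, hi) = min(max(x, lo), hi); `DEMO_DEFINE_CLAMP`, `demo_clamp_int` and `demo_clamp_float` are hypothetical names, not libclc identifiers.

```c
/* One body, instantiated once per type: clamp(x, lo, hi) is computed as
 * min(max(x, lo), hi). */
#define DEMO_DEFINE_CLAMP(TYPE, NAME)                  \
    TYPE NAME(TYPE x, TYPE lo, TYPE hi)                \
    {                                                  \
        TYPE at_least_lo = (x < lo) ? lo : x; /* max */ \
        return (at_least_lo > hi) ? hi : at_least_lo;  \
    }

DEMO_DEFINE_CLAMP(int, demo_clamp_int)
DEMO_DEFINE_CLAMP(float, demo_clamp_float)

#undef DEMO_DEFINE_CLAMP
```

For example, `demo_clamp_int(15, 0, 10)` yields 10 and `demo_clamp_float(0.5f, 0.0f, 1.0f)` yields 0.5f.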