Commit Graph

148 Commits

Author SHA1 Message Date
Daniel Stone 291bfff5db libclc: Add a __builtin to let SPIRV targets select between SW and HW FMA
Reviewer: jenatali jvesely
Differential Revision: https://reviews.llvm.org/D85910
2020-09-16 01:37:22 -04:00
Boris Brezillon 3a7051d9c2 libclc: Fix FP_ILOGBNAN definition
Fix FP_ILOGBNAN definition to match the opencl-c-base.h one and
guarantee that FP_ILOGBNAN and FP_ILOGB0 are different. Doing that
implies fixing ilogb() implementation to return the right value.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>

Reviewed By: jvesely

Differential Revision: https://reviews.llvm.org/D83473
2020-08-17 13:45:43 -07:00
Jan Vesely efeafa1bda libclc: Use acos implementation from amd_builtins
Fixes acos CTS (1 thread, scalar) on AMD Turks.
Reviewer: tstellar
Differential Revision: https://reviews.llvm.org/D74011
2020-02-20 23:36:14 -05:00
Jan Vesely 4b23a2e8e9 libclc: Move rsqrt implementation to a .cl file
Reviewer: awatry
Differential Revision: https://reviews.llvm.org/D74013
2020-02-09 14:42:09 -05:00
Aaron Watry 64a8e1b83e libclc/asin: Switch to amd builtins version of asin
Fixes a wimpy-mode CTS failure for asin(float).

Passes non-wimpy for both float/double on RX580.

Signed-off-by: Aaron Watry <awatry@gmail.com>
Tested-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
2020-02-04 14:29:20 -05:00
Jan Vesely 5b136ca125 Move unary_instrinsic.inc to private headers.
Reviewer: Aaron Watry
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 356021
2019-03-13 07:06:19 +00:00
Jan Vesely ee555aa992 trunc: Remove llvm intrinsic from the header.
Reviewer: Aaron Watry
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 356018
2019-03-13 07:06:10 +00:00
Jan Vesely 1c395b74bf round: Remove llvm intrinsic from the header
Reviewer: Aaron Watry
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 356017
2019-03-13 07:06:08 +00:00
Jan Vesely b3d64e4a83 rint: Remove llvm intrinsic from the header.
Reviewer: Aaron Watry
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 356016
2019-03-13 07:06:06 +00:00
Jan Vesely fd199f0139 floor: Remove llvm isntrinsic from the header.
Reviewer: Aaron Watry
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 356015
2019-03-13 07:06:03 +00:00
Jan Vesely fda15e56a6 fabs: Remove llvm intrinsic from the header.
Reviewer: Aaron Watry
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 356014
2019-03-13 07:06:00 +00:00
Jan Vesely 54eb4d3a6d ceil: Remove llvm intrinsic from the header.
Reviewer: Aaron Watry
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 356013
2019-03-13 07:05:58 +00:00
Jan Vesely 82c6c846af sqrt: Split function generation to a shared inc file.
This will be reused by other unary functions.
Reviewer: Aaron Watry
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>

llvm-svn: 356012
2019-03-13 07:05:56 +00:00
Jan Vesely 6e85e6309d math/fma: Add fp32 software implementation
Passes CTS on carrizo (when forced to use sw fma) and turks.
Reviewer: Tom Stellard <tstellar@redhat.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>

llvm-svn: 334226
2018-06-07 20:27:43 +00:00
Jan Vesely 70a270da5f Add initial support for half precision builtins
v2: fix fmax implementation
    use consistent checks for __CLC_FP_SIZE
    add missing TODOs
    fix whitespace in definitions.h
v3: undef ZERO in modf.inc

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
Reviewed-by: Aaron Watry <awatry@gmail.com>
Tested-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 332677
2018-05-17 22:55:30 +00:00
Jan Vesely 58fdb3b09a rootn: Use denormal path only
It's OK to either flush to 0 or return denormal result if the device
does not support denormals. See sec 7.2 and 7.5.3 of OCL specs
Use 0.0f explicitly intead of relying on GPU to flush it.
Fixes CTS on carrizo and turks

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Acked-by: Aaron Watry <awatry@gmail.com>
Tested-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 332324
2018-05-15 04:22:43 +00:00
Jan Vesely 21e77037c0 remquo: Flush denormals if not supported
It's OK to either flush to 0 or return denormal result if the device
does not support denormals. See sec 7.2 and 7.5.3 of OCL specs.
Fixes CTS on carrizo and turks.

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewer: Aaron Watry <awatry@gmail.com>
llvm-svn: 331435
2018-05-03 05:44:28 +00:00
Jan Vesely 8db45e4cf1 remquo: Port from amd builtins
double version passes on carrizo. float version fails on denormals.

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewer: Aaron Watry <awatry@gmail.com>
llvm-svn: 331434
2018-05-03 05:44:26 +00:00
Jan Vesely 6146eda75d math: Add helper function to flush denormals if not supported.
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewer: Aaron Watry <awatry@gmail.com>
llvm-svn: 331433
2018-05-03 05:44:22 +00:00
Jan Vesely e7d567ee0d log10: Use sw implementation from amd builtins
Add missing table.
Fixes log10d CTS on carrizo.
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Acked-by: Aaron Watry <awatry@gmail.com>
Tested-by: Aaron Watry <awatry@gmail.com>

llvm-svn: 330649
2018-04-23 21:10:42 +00:00
Jan Vesely 96591b6202 powr: Use denormal path only
It's OK to either flush to 0 or return denormal result if the device
does not support denormals. See sec 7.2 and 7.5.3 of OCL specs
Fixes CTS on carrizo and turks.
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewer: Aaron Watry <awatry@gmail.com>

llvm-svn: 330207
2018-04-17 19:35:32 +00:00
Jan Vesely 4388d2883c pown: Use denormal path only
It's OK to either flush to 0 or return denormal result if the device
does not support denormals. See sec 7.2 and 7.5.3 of OCL specs
Fixes CTS on carrizo and turks.
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewer: Aaron Watry <awatry@gmail.com>

llvm-svn: 330206
2018-04-17 19:35:30 +00:00
Jan Vesely 0d92f3047f pow: Use denormal path only
It's OK to either flush to 0 or return denormal result if the device
does not support denormals. See sec 7.2 and 7.5.3 of OCL specs
Fixes CTS on carrizo and turks.
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewer: Aaron Watry <awatry@gmail.com>

llvm-svn: 330205
2018-04-17 19:35:28 +00:00
Jan Vesely 15c388cd79 exp10: Port from amd builtins
Passes CTS on carrizo and turks.
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed and Tested (on RX 580) by: Aaron Watry <awatry@gmail.com>

llvm-svn: 330197
2018-04-17 18:08:08 +00:00
Jan Vesely 4be0339023 hypot: Port from amd builtins
v2: Fix whitespace errors

Use only subnormal path.
Passes CTS on carrizo and turks.
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewer: Aaron Watry <awatry@gmail.com>

llvm-svn: 329647
2018-04-10 00:11:58 +00:00
Jan Vesely 93af966747 fmod: Port from amd_builtins
Uses only denormal path for fp32.
Passes CTS on carrizo and turks.

v2: whitespace fix

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewer: Aaron Watry <awatry@gmail.com>
llvm-svn: 329433
2018-04-06 17:43:08 +00:00
Jan Vesely 5b10494fa8 remainder: Port from amd builtins
Mostly ported from amd_builtins, uses only denormal path for fp32.
Passes CTS on carrizo and turks

Reviewer: Aaron Watry <awatry@gmail.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 327818
2018-03-19 01:01:10 +00:00
Jan Vesely b672f7a251 nan: Implement
Passes CTS on carrizo and turks

Reviewer: Aaron Watry <awatry@gmail.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 327324
2018-03-12 19:46:52 +00:00
Jan Vesely c15a48dd9c lgamma_r: Move code from .inc to .cl file
Reviewed-by: Aaron Watry <awatry@gmail.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 326821
2018-03-06 17:48:47 +00:00
Jan Vesely 2dcb382efc frexp: Reuse types provided by gentype.inc
v2: Use select instead of bitselect to consolidate scalar and vector
versions

Passes CTS on Carrizo

Reviewed-by: Aaron Watry <awatry@gmail.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 326820
2018-03-06 17:48:45 +00:00
Jan Vesely 44f21978a2 minmag: Condition variable needs to be the same bitwidth as operands
No changes wrt CTS

Reviewed-by: Aaron Watry <awatry@gmail.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 326818
2018-03-06 17:48:40 +00:00
Jan Vesely 4e72300929 maxmag: Condition variable needs to be the same bitwidth as operands
No changes wrt CTS

Reviewed-by: Aaron Watry <awatry@gmail.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 326817
2018-03-06 17:48:38 +00:00
Jan Vesely dbaf6d0f7c Move cl_khr_fp64 exntension enablement to gentype include lists
This will make adding cl_khr_fp16 support easier

Reviewed-by: Aaron Watry <awatry@gmail.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 326816
2018-03-06 17:48:35 +00:00
Jan Vesely 3b8b4eb64d half_powr: Implement using powr
v2: Use full precision implementation

Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 323942
2018-02-01 03:00:35 +00:00
Jan Vesely a75677c2b7 math.h: Use logical operations instead of bit operations for readability
Trivial.

Reported-by: Roman Lebedev <lebedev.ri@gmail.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 323920
2018-01-31 21:53:42 +00:00
Jan Vesely 0ecb5e511e math.h: Set HAVE_HW_FMA32 based on compiler provided macro
Fixes sin/cos piglits on non-FMA capable asics.
Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=35983

Reviewer: Tom Stellard
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 323677
2018-01-29 19:05:08 +00:00
Jan Vesely 7013857f95 tanpi: Port from amd_builtins
Passes piglit on turks and carrizo.
Passes CTS on carrizo.

Acked-By: Aaron Watry <awatry@gmail.com>
Tested-By: Aaron Watry <awatry@gmail.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 322980
2018-01-19 18:57:22 +00:00
Jan Vesely 03937bdec3 tan: Port from amd_builtins
v2: fixup constant precision
Passes piglit on turks and carrizo.
Passes CTS on carrizo
Fixes half_tan to pass CTS on carrizo

Acked-By: Aaron Watry <awatry@gmail.com>
Tested-By: Aaron Watry <awatry@gmail.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 322979
2018-01-19 18:57:19 +00:00
Jan Vesely 44e0522c09 half_divide: Implement using x/y
Passes CTS on carrizo
v2: Use full precision implementation

Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 322899
2018-01-18 21:12:06 +00:00
Jan Vesely 2813b4f8d9 half_tan: Implement using tan
v2: Use full precision implementation

Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 322898
2018-01-18 21:12:04 +00:00
Jan Vesely bf38fae8de half_sin: Implement using sin
Passes CTS on carrizo
v2: Use full precision implementation

Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 322897
2018-01-18 21:12:01 +00:00
Jan Vesely 398108b91e half_recip: Implement using 1/x
Passes CTS on carrizo
v2: Use full precision implementation

Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 322896
2018-01-18 21:11:58 +00:00
Jan Vesely a1aba44ffa half_log2: Implement using log2
Passes CTS on carrizo
v2: Use full precision implementation

Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 322895
2018-01-18 21:11:56 +00:00
Jan Vesely b3b72af4b9 half_log10: Implement using log10
Passes CTS on carrizo
v2: Use full precision implementation

Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 322894
2018-01-18 21:11:53 +00:00
Jan Vesely 6852023802 half_log: Implement using log
Passes CTS on carrizo
v2: Use full precision implementation

Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 322893
2018-01-18 21:11:50 +00:00
Jan Vesely aa4c3899b5 half_exp10: Implement using exp10
Passes CTS on carrizo
v2: Use full precision implementation

Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 322892
2018-01-18 21:11:48 +00:00
Jan Vesely 3c0e19b61a half_exp2: Implement using exp2
Passes CTS on carrizo
v2: Use full precision implementation

Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 322891
2018-01-18 21:11:45 +00:00
Jan Vesely caa9000b1c half_exp: Implement using exp
Passes CTS on carrizo
v2: Use full precision implementation

Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 322890
2018-01-18 21:11:43 +00:00
Jan Vesely b5d556061d half_cos: Implement using cos
Passes CTS on carrizo
v2: Use full precision implementation

Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 322889
2018-01-18 21:11:40 +00:00
Jan Vesely e53ae3b596 half_sqrt: Cleanup implementation
Passes CTS on carrizo
v2: Use full precision implementation

Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 322888
2018-01-18 21:11:38 +00:00