Commit Graph

365 Commits

Author SHA1 Message Date
Jan Vesely 21e77037c0 remquo: Flush denormals if not supported
It's OK to either flush to 0 or return denormal result if the device
does not support denormals. See sec 7.2 and 7.5.3 of OCL specs.
Fixes CTS on carrizo and turks.

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewer: Aaron Watry <awatry@gmail.com>
llvm-svn: 331435
2018-05-03 05:44:28 +00:00
Jan Vesely 8db45e4cf1 remquo: Port from amd builtins
double version passes on carrizo. float version fails on denormals.

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewer: Aaron Watry <awatry@gmail.com>
llvm-svn: 331434
2018-05-03 05:44:26 +00:00
Jan Vesely 6146eda75d math: Add helper function to flush denormals if not supported.
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewer: Aaron Watry <awatry@gmail.com>
llvm-svn: 331433
2018-05-03 05:44:22 +00:00
Jan Vesely 616a38a693 clc_sqrt: Reuse unary_decl.inc
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Aaron Watry <awatry@gmail.com>
Tested-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 331366
2018-05-02 16:06:52 +00:00
Jan Vesely 1647e50359 relational/select: Condition types for half are short/ushort, not char/uchar
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 330851
2018-04-25 17:36:36 +00:00
Jan Vesely e7d567ee0d log10: Use sw implementation from amd builtins
Add missing table.
Fixes log10d CTS on carrizo.
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Acked-by: Aaron Watry <awatry@gmail.com>
Tested-by: Aaron Watry <awatry@gmail.com>

llvm-svn: 330649
2018-04-23 21:10:42 +00:00
Jan Vesely 96591b6202 powr: Use denormal path only
It's OK to either flush to 0 or return denormal result if the device
does not support denormals. See sec 7.2 and 7.5.3 of OCL specs
Fixes CTS on carrizo and turks.
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewer: Aaron Watry <awatry@gmail.com>

llvm-svn: 330207
2018-04-17 19:35:32 +00:00
Jan Vesely 4388d2883c pown: Use denormal path only
It's OK to either flush to 0 or return denormal result if the device
does not support denormals. See sec 7.2 and 7.5.3 of OCL specs
Fixes CTS on carrizo and turks.
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewer: Aaron Watry <awatry@gmail.com>

llvm-svn: 330206
2018-04-17 19:35:30 +00:00
Jan Vesely 0d92f3047f pow: Use denormal path only
It's OK to either flush to 0 or return denormal result if the device
does not support denormals. See sec 7.2 and 7.5.3 of OCL specs
Fixes CTS on carrizo and turks.
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewer: Aaron Watry <awatry@gmail.com>

llvm-svn: 330205
2018-04-17 19:35:28 +00:00
Jan Vesely 15c388cd79 exp10: Port from amd builtins
Passes CTS on carrizo and turks.
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed and Tested (on RX 580) by: Aaron Watry <awatry@gmail.com>

llvm-svn: 330197
2018-04-17 18:08:08 +00:00
Jan Vesely 4be0339023 hypot: Port from amd builtins
v2: Fix whitespace errors

Use only subnormal path.
Passes CTS on carrizo and turks.
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewer: Aaron Watry <awatry@gmail.com>

llvm-svn: 329647
2018-04-10 00:11:58 +00:00
Jan Vesely 4c1112612c select: simplify implementation and fix fp16
Fix half precision implementation
Vector ?: operator should behave exactly as select
Passes CTS on carrizo

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
llvm-svn: 329462
2018-04-06 22:00:00 +00:00
Jan Vesely 93af966747 fmod: Port from amd_builtins
Uses only denormal path for fp32.
Passes CTS on carrizo and turks.

v2: whitespace fix

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewer: Aaron Watry <awatry@gmail.com>
llvm-svn: 329433
2018-04-06 17:43:08 +00:00
Jan Vesely 5b10494fa8 remainder: Port from amd builtins
Mostly ported from amd_builtins, uses only denormal path for fp32.
Passes CTS on carrizo and turks

Reviewer: Aaron Watry <awatry@gmail.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 327818
2018-03-19 01:01:10 +00:00
Jan Vesely b672f7a251 nan: Implement
Passes CTS on carrizo and turks

Reviewer: Aaron Watry <awatry@gmail.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 327324
2018-03-12 19:46:52 +00:00
Jan Vesely 0883c4d365 integer/gentype: Add __CLC_VECSIZE macro
Reviewer: Aaron Watry <awatry@gmail.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 327043
2018-03-08 18:58:05 +00:00
Jan Vesely 17e8679493 popcount: Provide function implementation rather than intrinsic redirect
amdgcn will need to override this

Reviewer: Aaron Watry <awatry@gmail.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 327042
2018-03-08 18:58:00 +00:00
Jan Vesely c15a48dd9c lgamma_r: Move code from .inc to .cl file
Reviewed-by: Aaron Watry <awatry@gmail.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 326821
2018-03-06 17:48:47 +00:00
Jan Vesely 2dcb382efc frexp: Reuse types provided by gentype.inc
v2: Use select instead of bitselect to consolidate scalar and vector
versions

Passes CTS on Carrizo

Reviewed-by: Aaron Watry <awatry@gmail.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 326820
2018-03-06 17:48:45 +00:00
Jan Vesely ae156b66f8 select: Add vector implementation
Passes CTS on Carrizo

Reviewed-by: Aaron Watry <awatry@gmail.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 326819
2018-03-06 17:48:43 +00:00
Jan Vesely 44f21978a2 minmag: Condition variable needs to be the same bitwidth as operands
No changes wrt CTS

Reviewed-by: Aaron Watry <awatry@gmail.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 326818
2018-03-06 17:48:40 +00:00
Jan Vesely 4e72300929 maxmag: Condition variable needs to be the same bitwidth as operands
No changes wrt CTS

Reviewed-by: Aaron Watry <awatry@gmail.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 326817
2018-03-06 17:48:38 +00:00
Jan Vesely dbaf6d0f7c Move cl_khr_fp64 exntension enablement to gentype include lists
This will make adding cl_khr_fp16 support easier

Reviewed-by: Aaron Watry <awatry@gmail.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 326816
2018-03-06 17:48:35 +00:00
Jan Vesely 1c570566c3 Add vstore_half_rte implementation
Passes CTS on carrizo

Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 324376
2018-02-06 18:44:50 +00:00
Jan Vesely f2d876ae83 Add vstore_half_rtp implementation
Passes CTS on carrizo

Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 324375
2018-02-06 18:44:47 +00:00
Jan Vesely 2655312c69 Add vstore_half_rtn implementation
Passes CTS on carrizo

Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 324374
2018-02-06 18:44:45 +00:00
Jan Vesely d526a2b6e8 Add vstore_half_rtz implementation
Passes CTS on carrizo

Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 324373
2018-02-06 18:44:43 +00:00
Jan Vesely 4475aca172 vstore_half: Consolidate declarations
Add support for rounding suffix

Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 324372
2018-02-06 18:44:41 +00:00
Jan Vesely 187ec00556 vstore_half: Add support for custom rounding functions
Add another layer of indirection
This will be used for specific rounding modes

Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 324371
2018-02-06 18:44:39 +00:00
Jan Vesely 87036d2701 vstore_half: Make sure the helper function is always inline
Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 324370
2018-02-06 18:44:35 +00:00
Jan Vesely 3b8b4eb64d half_powr: Implement using powr
v2: Use full precision implementation

Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 323942
2018-02-01 03:00:35 +00:00
Jan Vesely a75677c2b7 math.h: Use logical operations instead of bit operations for readability
Trivial.

Reported-by: Roman Lebedev <lebedev.ri@gmail.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 323920
2018-01-31 21:53:42 +00:00
Jan Vesely 0ecb5e511e math.h: Set HAVE_HW_FMA32 based on compiler provided macro
Fixes sin/cos piglits on non-FMA capable asics.
Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=35983

Reviewer: Tom Stellard
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 323677
2018-01-29 19:05:08 +00:00
Jan Vesely 7013857f95 tanpi: Port from amd_builtins
Passes piglit on turks and carrizo.
Passes CTS on carrizo.

Acked-By: Aaron Watry <awatry@gmail.com>
Tested-By: Aaron Watry <awatry@gmail.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 322980
2018-01-19 18:57:22 +00:00
Jan Vesely 03937bdec3 tan: Port from amd_builtins
v2: fixup constant precision
Passes piglit on turks and carrizo.
Passes CTS on carrizo
Fixes half_tan to pass CTS on carrizo

Acked-By: Aaron Watry <awatry@gmail.com>
Tested-By: Aaron Watry <awatry@gmail.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 322979
2018-01-19 18:57:19 +00:00
Jan Vesely 44e0522c09 half_divide: Implement using x/y
Passes CTS on carrizo
v2: Use full precision implementation

Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 322899
2018-01-18 21:12:06 +00:00
Jan Vesely 2813b4f8d9 half_tan: Implement using tan
v2: Use full precision implementation

Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 322898
2018-01-18 21:12:04 +00:00
Jan Vesely bf38fae8de half_sin: Implement using sin
Passes CTS on carrizo
v2: Use full precision implementation

Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 322897
2018-01-18 21:12:01 +00:00
Jan Vesely 398108b91e half_recip: Implement using 1/x
Passes CTS on carrizo
v2: Use full precision implementation

Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 322896
2018-01-18 21:11:58 +00:00
Jan Vesely a1aba44ffa half_log2: Implement using log2
Passes CTS on carrizo
v2: Use full precision implementation

Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 322895
2018-01-18 21:11:56 +00:00
Jan Vesely b3b72af4b9 half_log10: Implement using log10
Passes CTS on carrizo
v2: Use full precision implementation

Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 322894
2018-01-18 21:11:53 +00:00
Jan Vesely 6852023802 half_log: Implement using log
Passes CTS on carrizo
v2: Use full precision implementation

Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 322893
2018-01-18 21:11:50 +00:00
Jan Vesely aa4c3899b5 half_exp10: Implement using exp10
Passes CTS on carrizo
v2: Use full precision implementation

Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 322892
2018-01-18 21:11:48 +00:00
Jan Vesely 3c0e19b61a half_exp2: Implement using exp2
Passes CTS on carrizo
v2: Use full precision implementation

Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 322891
2018-01-18 21:11:45 +00:00
Jan Vesely caa9000b1c half_exp: Implement using exp
Passes CTS on carrizo
v2: Use full precision implementation

Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 322890
2018-01-18 21:11:43 +00:00
Jan Vesely b5d556061d half_cos: Implement using cos
Passes CTS on carrizo
v2: Use full precision implementation

Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 322889
2018-01-18 21:11:40 +00:00
Jan Vesely e53ae3b596 half_sqrt: Cleanup implementation
Passes CTS on carrizo
v2: Use full precision implementation

Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 322888
2018-01-18 21:11:38 +00:00
Jan Vesely a95db14461 half_rsqrt: Cleanup implementation
Passes CTS on carrizo
v2: Use full precision implementation

Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 322887
2018-01-18 21:11:35 +00:00
Jan Vesely fe8e00bc3c rootn: Port from amd_builtins
Passes piglit on turks and carrizo
fp64 passes ctx on carrizo

v2: fix formatting
    check fp32 denormal support at runtime

Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 322763
2018-01-17 21:22:14 +00:00
Jan Vesely c45ec604f5 powr: Port from amd_builtins
Passes piglit on turks and carrizo
fp64 passes cts on carrizo

v2: fix formatting
    check fp32 denormal support at runtime

Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 322762
2018-01-17 21:22:06 +00:00