Matt Arsenault
747bf8afa8
AMDGPU: Re-use TM.getNullPointerValue
...
llvm-svn: 297662
2017-03-13 20:18:14 +00:00
Matt Arsenault
971c85ebb4
AMDGPU: Treat 0 as private null pointer in addrspacecast lowering
...
llvm-svn: 297658
2017-03-13 19:47:31 +00:00
Matt Arsenault
dd905b0e9b
AMDGPU: Remove packf16 intrinsic
...
llvm-svn: 297557
2017-03-11 05:51:16 +00:00
Matt Arsenault
10268f93e8
AMDGPU: Use v_med3_{f16|i16|u16}
...
llvm-svn: 296401
2017-02-27 22:40:39 +00:00
Matt Arsenault
eb522e68bc
AMDGPU: Support v2i16/v2f16 packed operations
...
llvm-svn: 296396
2017-02-27 22:15:25 +00:00
Matt Arsenault
7596f13d15
AMDGPU: Support inlineasm for packed instructions
...
Add packed types as legal so they may be used with inlineasm.
Keep all operations expanded for now.
llvm-svn: 296379
2017-02-27 20:52:10 +00:00
Matt Arsenault
79a45db7f5
AMDGPU: Use clamp with f64
...
llvm-svn: 295908
2017-02-22 23:53:37 +00:00
Wei Ding
f2cce02eb2
AMDGPU : Update TrapCode based on Trap Handler ABI.
...
Differential Revision: http://reviews.llvm.org/D30232
llvm-svn: 295904
2017-02-22 23:22:19 +00:00
Matt Arsenault
f5262256a1
AMDGPU: Add replacement bfe intrinsics
...
llvm-svn: 295899
2017-02-22 23:04:58 +00:00
Matt Arsenault
93e65ea733
AMDGPU: Don't look at chain users when adjusting writemask
...
Fixes not adjusting using new intrinsics with chains.
llvm-svn: 295878
2017-02-22 21:16:41 +00:00
Wei Ding
6ade56e0a0
Revert "AMDGPU : Update TrapCode based on Trap Handler ABI."
...
This reverts commit r295867.
llvm-svn: 295871
2017-02-22 20:29:22 +00:00
Wei Ding
4991d3570f
AMDGPU : Update TrapCode based on Trap Handler ABI.
...
Differential Revision: http://reviews.llvm.org/D30232
llvm-svn: 295867
2017-02-22 20:05:06 +00:00
Matt Arsenault
1f17c66890
AMDGPU: Add cvt.pkrtz intrinsic
...
Convert llvm.SI.packf16 test uses
llvm-svn: 295797
2017-02-22 00:27:34 +00:00
Matt Arsenault
2fdf2a1a18
AMDGPU: Redefine clamp node as clamp 0.0-1.0
...
Change implementation to use max instead of add.
min/max/med3 do not flush denormals regardless of the mode,
so it is OK to use it whether or not they are enabled.
Also allow using clamp with f16, and use knowledge
of dx10_clamp.
llvm-svn: 295788
2017-02-21 23:35:48 +00:00
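Note on r295788: a minimal C++ sketch of the idea in the message above, read as "attach the clamp to a max rather than an add". The function names `clamp01` and `maxWithClamp` are hypothetical and are not taken from the LLVM sources. Hardware details such as denormal flushing and dx10_clamp NaN handling are not modelled.

```cpp
#include <algorithm>

// Clamp to [0.0, 1.0], the range the commit message gives for the clamp node.
static float clamp01(float x) { return std::min(std::max(x, 0.0f), 1.0f); }

// Hypothetical model of "max with clamp": max(x, x) returns x unchanged,
// so the max only serves as a carrier for the clamp.
static float maxWithClamp(float x) { return clamp01(std::max(x, x)); }
```

Under these assumptions `maxWithClamp(-2.0f)` is `0.0f`, `maxWithClamp(0.5f)` is `0.5f`, and `maxWithClamp(3.0f)` is `1.0f`.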
Matt Arsenault
7d6b71db4f
AMDGPU: Formatting fixes
...
llvm-svn: 295783
2017-02-21 22:50:41 +00:00
Matt Arsenault
c2a44e4c3c
AMDGPU: Remove llvm.AMDGPU.flbit intrinsic
...
llvm-svn: 295754
2017-02-21 19:27:33 +00:00
Matt Arsenault
e0bf7d02f0
AMDGPU: Don't use stack space for SGPR->VGPR spills
...
Before frame offsets are calculated, try to eliminate the
frame indexes used by SGPR spills. Then we can delete them
after.
I think for now we can be sure that no other instruction
will be re-using the same frame indexes. It should be easy
to notice if this assumption ever breaks since everything
asserts if it tries to use a dead frame index later.
The unused emergency stack slot seems to still be left behind,
so an additional 4 bytes is still wasted.
llvm-svn: 295753
2017-02-21 19:12:08 +00:00
Matt Arsenault
e823d92f7f
AMDGPU: Merge initial gfx9 support
...
llvm-svn: 295554
2017-02-18 18:29:53 +00:00
Matt Arsenault
f6cf1032fd
AMDGPU: Fix crashes on invalid icmp/fcmp intrinsics
...
llvm-svn: 295489
2017-02-17 19:49:10 +00:00
Matt Arsenault
eb65cda986
AMDGPU: Remove llvm.AMDGPU.rsq intrinsic
...
llvm-svn: 295358
2017-02-16 19:08:58 +00:00
Matt Arsenault
d3e5cb77e4
AMDGPU: Remove llvm.SI.sendmsg
...
llvm-svn: 295270
2017-02-16 02:01:17 +00:00
Matt Arsenault
d2c8a337aa
AMDGPU: Remove SI_fs_constant and SI_fs_interp intrinsics
...
Update test uses with expansion in terms of new intrinsics.
llvm-svn: 295269
2017-02-16 02:01:13 +00:00
Matt Arsenault
a78ca62c64
AMDGPU: Consolidate sendmsg/sendmsghalt handling and tests
...
llvm-svn: 295244
2017-02-15 22:17:09 +00:00
Matt Arsenault
b4493e909f
AMDGPU: Fix trailing whitespace
...
llvm-svn: 294694
2017-02-10 02:42:31 +00:00
Wei Ding
205bfdb3e9
AMDGPU : Add trap handler support.
...
Differential Revision: http://reviews.llvm.org/D26010
llvm-svn: 294692
2017-02-10 02:15:29 +00:00
Matt Arsenault
f84e5d9a27
AMDGPU: Generalize matching of v_med3_f32
...
I think this is safe as long as no inputs are known to ever
be nans.
Also add an intrinsic for fmed3 to be able to handle all safe
math cases.
llvm-svn: 293598
2017-01-31 03:07:46 +00:00
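Note on r293598: the equivalence behind this kind of med3 matching can be sketched in plain C++. This is an illustration only, assuming no input is NaN and that the lower bound does not exceed the upper bound. The helper names `med3` and `minMaxClamp` are hypothetical and do not come from the LLVM sources.

```cpp
#include <algorithm>

// Median of three values; for non-NaN inputs this is the middle value.
static float med3(float a, float b, float c) {
  return std::max(std::min(a, b), std::min(std::max(a, b), c));
}

// The min(max(x, lo), hi) pattern, which equals med3(x, lo, hi)
// whenever lo <= hi and no input is NaN.
static float minMaxClamp(float x, float lo, float hi) {
  return std::min(std::max(x, lo), hi);
}
```

With `lo <= hi`, `minMaxClamp(x, lo, hi)` and `med3(x, lo, hi)` agree for every non-NaN `x`. With NaN inputs the two forms can differ, which is why the message restricts the match to inputs known never to be NaN.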
Matt Arsenault
ee3f0acf20
AMDGPU: Make i32 uaddo/usubo legal
...
llvm-svn: 293514
2017-01-30 18:11:38 +00:00
Tom Stellard
08efb7ebf6
AMDGPU/SI: Move some ISel helpers into utils so they can be shared with GISel
...
Reviewers: arsenm
Reviewed By: arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye
Differential Revision: https://reviews.llvm.org/D29068
llvm-svn: 293321
2017-01-27 18:41:14 +00:00
Tom Stellard
2f3f9855f0
AMDGPU add support for spilling to a user sgpr pointed buffers
...
Summary:
This lets you select which sort of spilling you want, either s[0:1] or 64-bit loads from s[0:1].
Patch By: Dave Airlie
Reviewers: nhaehnle, arsenm, tstellarAMD
Reviewed By: arsenm
Subscribers: mareko, llvm-commits, kzhuravl, wdng, yaxunl, tony-tye
Differential Revision: https://reviews.llvm.org/D25428
llvm-svn: 293000
2017-01-25 01:25:13 +00:00
Wei Ding
ee21a36f8a
AMDGPU : Add trap handler support.
...
llvm-svn: 292893
2017-01-24 06:41:21 +00:00
Matt Arsenault
3aef809384
AMDGPU: Custom lower more vector operations
...
This avoids stack usage.
llvm-svn: 292846
2017-01-23 23:09:58 +00:00
Matt Arsenault
78916e17ea
AMDGPU: Remove unnecessary check
...
There are no scalar FP types that can be extended.
llvm-svn: 292816
2017-01-23 19:00:15 +00:00
Eugene Zelenko
6620376da7
[AMDGPU] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
...
llvm-svn: 292688
2017-01-21 00:53:49 +00:00
Matt Arsenault
4165efdc58
AMDGPU: Add replacement export intrinsics
...
llvm-svn: 292205
2017-01-17 07:26:53 +00:00
Benjamin Kramer
061f4a5fe6
Apply clang-tidy's performance-unnecessary-value-param to LLVM.
...
With some minor manual fixes for using function_ref instead of
std::function. No functional change intended.
llvm-svn: 291904
2017-01-13 14:39:03 +00:00
Diana Picus
116bbab4e4
[CodeGen] Rename MachineInstrBuilder::addOperand. NFC
...
Rename from addOperand to just add, to match the other method that has been
added to MachineInstrBuilder for adding more than just 1 operand.
See https://reviews.llvm.org/D28057 for the whole discussion.
Differential Revision: https://reviews.llvm.org/D28556
llvm-svn: 291891
2017-01-13 09:58:52 +00:00
Matt Arsenault
6dca542b4a
AMDGPU: Add Assert[SZ]Ext during argument load creation
...
For i16 zeroext arguments when i16 was a legal type, the
known bits information from the truncate was lost. Insert
a zeroext so the known bits optimizations work with the 32-bit
loads.
Fixes code quality regressions vs. SI in min.ll test.
llvm-svn: 291461
2017-01-09 18:52:39 +00:00
Jan Vesely
06200bd7bc
AMDGPU/R600: Don't use REGISTER_{LOAD,STORE} ISD nodes
...
This will make transition to SCRATCH_MEMORY easier
Differential Revision: https://reviews.llvm.org/D24746
llvm-svn: 291279
2017-01-06 21:00:46 +00:00
Jan Vesely
d48445d513
AMDGPU/SI: Implement sendmsghalt intrinsic
...
v2: expose using amdgcn prefix
Differential Revision: https://reviews.llvm.org/D23511
llvm-svn: 290977
2017-01-04 18:06:55 +00:00
Matt Arsenault
941632839f
AMDGPU: Use i16 for i16 shift amount
...
llvm-svn: 290351
2016-12-22 16:36:25 +00:00
Matt Arsenault
18f56be3d2
AMDGPU: Use i16 comparison instructions
...
llvm-svn: 290348
2016-12-22 16:27:11 +00:00
Matt Arsenault
e7d8ed32f9
AMDGPU: Swap order of operands in fadd/fsub combine
...
FMA is canonicalized to constant in the middle operand. Do
the same so fmad matches and avoid an extra combine step.
llvm-svn: 290313
2016-12-22 04:03:40 +00:00
Matt Arsenault
46e6b7adef
AMDGPU: Check fast math flags in fadd/fsub combines
...
llvm-svn: 290312
2016-12-22 04:03:35 +00:00
Matt Arsenault
770ec8680a
AMDGPU: Form more FMAs if fusion is allowed
...
Extend the existing fadd/fsub->fmad combines to produce
FMA if allowed.
llvm-svn: 290311
2016-12-22 03:55:35 +00:00
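Note on r290311: a minimal C++ sketch of an fadd-to-FMA rewrite of this kind, assuming the combined pattern has the shape (a + a) + b. That shape is an assumption, since the message does not spell it out. The function names are hypothetical. The constant 2.0 is placed in the middle operand to match the canonical form mentioned in r290313.

```cpp
#include <cmath>

// Unfused form: two separate additions.
static float unfused(float a, float b) { return (a + a) + b; }

// Fused form: one multiply-add with the constant as the middle operand.
static float fused(float a, float b) { return std::fma(a, 2.0f, b); }
```

Doubling a float is exact unless it overflows, so in this particular shape both forms round only once and give the same result. For a general a * b + c, the fused form skips the intermediate rounding and can differ in the last bit, which is why the message makes fusion conditional on being allowed.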
Matt Arsenault
d8b73d5304
AMDGPU: Move combines into separate functions
...
llvm-svn: 290309
2016-12-22 03:44:42 +00:00
Matt Arsenault
ef82ad94ea
AMDGPU: Enable some f32 fadd/fsub combines for f16
...
llvm-svn: 290308
2016-12-22 03:40:39 +00:00
Matt Arsenault
9e22bc2cd3
AMDGPU: Implement isFMAFasterThanFMulAndFAdd for f16
...
llvm-svn: 290307
2016-12-22 03:21:48 +00:00
Matt Arsenault
cdff21b14e
AMDGPU: Allow rcp and rsq usage with f16
...
llvm-svn: 290302
2016-12-22 03:05:44 +00:00
Matt Arsenault
4052a576c0
AMDGPU: Custom lower f16 fdiv
...
llvm-svn: 290301
2016-12-22 03:05:41 +00:00
Matt Arsenault
ce84130f85
AMDGPU: Implement f16 fcanonicalize
...
llvm-svn: 290300
2016-12-22 03:05:37 +00:00