Matt Arsenault
86e02ce2dc
AMDGPU: Fix unnecessary ands when packing f16 vectors
...
computeKnownBits didn't handle fp_to_fp16 to report
the high bits as 0. ARM maps the generic node to an instruction
that does not modify the high bits of the register, so introduce
a target node where the high bits are known 0.
llvm-svn: 297873
2017-03-15 19:04:26 +00:00
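A minimal Python sketch of why the high bits matter when packing two f16 values into one 32-bit register. It is not the backend code: the helper names are hypothetical, and `fp_to_fp16_zext` only models the assumed behaviour of the new target node (half-precision bits in the low 16 bits, high bits 0).

```python
import struct

MASK32 = 0xFFFFFFFF

def fp_to_fp16_zext(value):
    """Hypothetical model of the target node: f16 bits in [15:0], bits [31:16] are 0."""
    (bits,) = struct.unpack("<H", struct.pack("<e", value))
    return bits

def pack_with_and(lo, hi):
    """Packing when nothing is known about the high bits of lo: mask first."""
    return ((lo & 0xFFFF) | (hi << 16)) & MASK32

def pack_without_and(lo, hi):
    """Packing when the high bits of lo are known to be 0: the mask is redundant."""
    return (lo | (hi << 16)) & MASK32

lo = fp_to_fp16_zext(1.0)
hi = fp_to_fp16_zext(-2.0)
print(hex(pack_without_and(lo, hi)))
```

If the high bits of `lo` could hold stale data, as the message describes for the ARM lowering of the generic node, only `pack_with_and` would be safe.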
Matt Arsenault
a9e16e6597
AMDGPU: Add another BFE pattern
...
This is the pattern that falls out of the instruction's
definition if offset == 0.
llvm-svn: 295912
2017-02-23 00:23:43 +00:00
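The offset == 0 case can be sketched in Python as follows. This assumes the usual unsigned bitfield-extract definition, (src >> offset) & ((1 << width) - 1), and restricts width to 0..31; the function names are hypothetical and the exact pattern added to the backend is not shown in this log.

```python
MASK32 = 0xFFFFFFFF

def bfe_u32(src, offset, width):
    """Assumed definition of unsigned bitfield extract on 32-bit values."""
    return ((src & MASK32) >> offset) & ((1 << width) - 1)

def low_bits(src, width):
    """What the definition collapses to when offset == 0: a plain low-bit mask."""
    return (src & MASK32) & ((1 << width) - 1)

for width in range(32):
    assert bfe_u32(0xDEADBEEF, 0, width) == low_bits(0xDEADBEEF, width)
```

With a zero offset the shift disappears, so an expression that only masks the low `width` bits is a candidate for the same instruction.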
Jan Vesely
334f51a6fe
AMDGPU/EG,CM: Implement _noret global atomics
...
_RTN versions will be a lot more complicated
Differential Revision: https://reviews.llvm.org/D28067
llvm-svn: 292162
2017-01-16 21:20:13 +00:00
Jan Vesely
0d6cb1caaf
AMDGPU/EG,CM: Add fp16 conversion instructions
...
Differential Revision: https://reviews.llvm.org/D28164
llvm-svn: 291622
2017-01-11 00:12:39 +00:00
Matt Arsenault
2712d4a3d8
AMDGPU: Select mulhi 24-bit instructions
...
llvm-svn: 279902
2016-08-27 01:32:27 +00:00
Jan Vesely
0486f739a4
AMDGPU/R600: Convert buffer id to VTX_READ input
...
Use patterns instead of multiple instructions
Add buffer id to asm string
https://reviews.llvm.org/D22650
llvm-svn: 278749
2016-08-15 21:38:30 +00:00
Matt Arsenault
4c519d3518
AMDGPU/R600: Replace barrier intrinsics
...
llvm-svn: 275870
2016-07-18 18:34:59 +00:00
Jan Vesely
2fa28c330c
AMDGPU/R600: Add implicitarg.ptr intrinsic
...
Differential Revision: http://reviews.llvm.org/D21622
llvm-svn: 275024
2016-07-10 21:20:29 +00:00
Tom Stellard
4a105d73a9
AMDGPU/R600: Add PatFrags for selecting the correct vtx id for loads
...
This moves the r600 logic out of isGlobalLoad() and into the
TableGen files.
Differential Revision: http://reviews.llvm.org/D21710
llvm-svn: 274527
2016-07-05 00:12:51 +00:00
Jan Vesely
81f1b30035
AMDGPU/EG,CM: Add instruction to read from constant AS (VTX2)
...
Reviewers: tstellard
Subscribers: arsenm
Differential Revision: http://reviews.llvm.org/D19785
llvm-svn: 269473
2016-05-13 20:39:16 +00:00
Matt Arsenault
295875efda
AMDGPU: Remove 24-bit intrinsics
...
The known bit matching code seems to work reasonably well,
so these shouldn't really be needed.
llvm-svn: 259180
2016-01-29 10:05:16 +00:00
Matt Arsenault
ee0930821a
AMDGPU: Remove random TGSI intrinsic
...
I don't think this was ever used.
llvm-svn: 258514
2016-01-22 18:42:44 +00:00
Matt Arsenault
de5fbe9c60
AMDGPU: Pattern match ffbh pattern to instruction.
...
The hardware instruction's output on 0 is -1 rather than 32.
Eliminate a test and select to -1. This removes an extra instruction
from the compatibility function with HSAIL's firstbit instruction.
llvm-svn: 257352
2016-01-11 17:02:00 +00:00
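A rough Python model of the ffbh match described above. It assumes ffbh counts leading zeros from the most significant bit for nonzero inputs and returns -1 for 0, as the message states; the function names are hypothetical and this is not the selection pattern itself.

```python
MASK32 = 0xFFFFFFFF

def ctlz32(x):
    """Generic count-leading-zeros on 32 bits; yields 32 for an input of 0."""
    return 32 - (x & MASK32).bit_length()

def firstbit_with_select(x):
    """Compatibility form: test against 0 and select -1, otherwise use ctlz."""
    return -1 if (x & MASK32) == 0 else ctlz32(x)

def ffbh_u32_model(x):
    """Assumed hardware behaviour: scan from the MSB, return -1 if no bit is set."""
    x &= MASK32
    for pos in range(32):
        if x & (0x80000000 >> pos):
            return pos
    return -1

for value in (0, 1, 2, 0x00010000, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF):
    assert firstbit_with_select(value) == ffbh_u32_model(value)
```

Because the two forms agree on every input under this model, the compare and select around ctlz can be dropped in favour of the single instruction.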
Tom Stellard
e0e582c9aa
AMDGPU: Add MEM_RAT STORE_TYPED.
...
v2: Add test (Matt).
Fix capitalization of isEOP (Matt).
Move pattern to class parameter (Matt).
Make the instruction available to Cayman (Matt).
Change name from MEM_RAT WRITE_TYPED to MEM_RAT STORE_TYPED.
Patch by: Zoltan Gilian
llvm-svn: 249042
2015-10-01 17:51:34 +00:00
Tom Stellard
45bb48ea19
R600 -> AMDGPU rename
...
llvm-svn: 239657
2015-06-13 03:28:10 +00:00