Matt Arsenault
0084adc516
AMDGPU: Add Vega12 and Vega20
...
Changes by
Matt Arsenault
Konstantin Zhuravlyov
llvm-svn: 331215
2018-04-30 19:08:16 +00:00
Jan Vesely
39aeab4f30
AMDGPU/EG: Add a new FeatureFMA and use it to selectively enable FMA instruction
...
Only used by pre-GCN targets
v2: fix predicate setting for FMA_Common
Differential Revision: https://reviews.llvm.org/D40692
llvm-svn: 319712
2017-12-04 23:07:28 +00:00
Jan Vesely
d1c9b61e2b
AMDGPU: Disable fp64 support on pre GCN asics
...
It's not implemented.
Passing the +fp64-fp16-denormal feature enables fp64 even on ASICs that don't support it.
v2: fix hasFP64 query
Differential Revision: https://reviews.llvm.org/D39931
llvm-svn: 319709
2017-12-04 22:57:29 +00:00
Matt Arsenault
36b4b0bed7
AMDGPU: Remove -mcpu=SI
...
Leftover from before amdgcn/r600 split.
llvm-svn: 310277
2017-08-07 18:30:35 +00:00
Alexander Timofeev
982aee6a38
[AMDGPU] Switch scalarize global loads ON by default
...
Differential revision: https://reviews.llvm.org/D34407
llvm-svn: 307097
2017-07-04 17:32:00 +00:00
NAKAMURA Takumi
e4a741376b
Revert r307026, "[AMDGPU] Switch scalarize global loads ON by default"
...
It broke a testcase.
Failing Tests (1):
LLVM :: CodeGen/AMDGPU/alignbit-pat.ll
llvm-svn: 307054
2017-07-04 02:14:18 +00:00
Alexander Timofeev
ea7f08bee5
[AMDGPU] Switch scalarize global loads ON by default
...
Differential revision: https://reviews.llvm.org/D34407
llvm-svn: 307026
2017-07-03 14:54:11 +00:00
Matt Arsenault
3dbeefa978
AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernel
...
Currently the default C calling convention functions are treated
the same as compute kernels. Make this explicit so the default
calling convention can be changed to a non-kernel.
Converted with perl -pi -e 's/define void/define amdgpu_kernel void/'
on the relevant test directories (and undoing in one place that actually
wanted a non-kernel).
llvm-svn: 298444
2017-03-21 21:39:51 +00:00
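Note on 3d1c1deb... predecessor 3dbeefa978: the perl one-liner quoted in that message rewrites the first "define void" on each line to "define amdgpu_kernel void". A minimal Python sketch of the same substitution follows, for readers who want to try it on a string rather than on files in place. The helper name mark_as_kernel and the sample IR are hypothetical and are not part of the commit; the commit itself used perl on the test directories.

```python
import re

# Literal text targeted by the perl one-liner in the commit message.
PATTERN = re.compile(r"define void")
REPLACEMENT = "define amdgpu_kernel void"


def mark_as_kernel(ir_text: str) -> str:
    """Hypothetical helper: apply the substitution once per line.

    perl's s/// without the /g flag replaces only the first match on
    each input line, so count=1 is used per line to mirror that.
    """
    lines = ir_text.splitlines(keepends=True)
    return "".join(PATTERN.sub(REPLACEMENT, line, count=1) for line in lines)


# Made-up sample input, only for illustration.
sample = "define void @foo(i32 %x) {\n  ret void\n}\n"

if __name__ == "__main__":
    print(mark_as_kernel(sample), end="")
```

Because the replacement text no longer contains "define void", running the substitution a second time leaves the text unchanged. As the commit notes, one test wanted a non-kernel function and had to be reverted by hand, which a blanket substitution like this cannot detect.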
Matt Arsenault
3d1c1deb04
AMDGPU: Run SIFoldOperands after PeepholeOptimizer
...
PeepholeOptimizer cleans up redundant copies, which makes
the operand folding more effective.
shader-db stats:
Totals:
SGPRS: 34200 -> 34336 (0.40 %)
VGPRS: 22118 -> 21655 (-2.09 %)
Code Size: 632144 -> 633460 (0.21 %) bytes
LDS: 11 -> 11 (0.00 %) blocks
Scratch: 10240 -> 11264 (10.00 %) bytes per wave
Max Waves: 8822 -> 8918 (1.09 %)
Wait states: 0 -> 0 (0.00 %)
Totals from affected shaders:
SGPRS: 7704 -> 7840 (1.77 %)
VGPRS: 5169 -> 4706 (-8.96 %)
Code Size: 234444 -> 235760 (0.56 %) bytes
LDS: 2 -> 2 (0.00 %) blocks
Scratch: 0 -> 1024 (0.00 %) bytes per wave
Max Waves: 1188 -> 1284 (8.08 %)
Wait states: 0 -> 0 (0.00 %)
Increases:
SGPRS: 35 (0.01 %)
VGPRS: 1 (0.00 %)
Code Size: 59 (0.02 %)
LDS: 0 (0.00 %)
Scratch: 1 (0.00 %)
Max Waves: 48 (0.02 %)
Wait states: 0 (0.00 %)
Decreases:
SGPRS: 26 (0.01 %)
VGPRS: 54 (0.02 %)
Code Size: 68 (0.03 %)
LDS: 0 (0.00 %)
Scratch: 0 (0.00 %)
Max Waves: 4 (0.00 %)
Wait states: 0 (0.00 %)
llvm-svn: 266378
2016-04-14 21:58:24 +00:00
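Note on 3d1c1deb04: the percentages in the "Totals" and "Totals from affected shaders" blocks of that message are consistent with a plain relative change, (after - before) / before * 100, printed to two decimals. A minimal Python sketch that reproduces a few of the quoted figures follows. The function name percent_change is hypothetical, and returning 0.0 for a zero baseline is an assumption chosen only to match the "Scratch: 0 -> 1024 (0.00 %)" line; this is not the shader-db report script itself.

```python
def percent_change(before: int, after: int) -> float:
    """Relative change from before to after, in percent.

    Assumption: a zero baseline yields 0.0, matching the
    'Scratch: 0 -> 1024 (0.00 %)' line in the commit message.
    """
    if before == 0:
        return 0.0
    return (after - before) / before * 100.0


# (before, after) pairs copied from the "Totals" block of the commit message.
totals = {
    "SGPRS": (34200, 34336),
    "VGPRS": (22118, 21655),
    "Code Size": (632144, 633460),
    "Max Waves": (8822, 8918),
}

if __name__ == "__main__":
    for name, (before, after) in totals.items():
        change = percent_change(before, after)
        print(f"{name}: {before} -> {after} ({change:.2f} %)")
```

Read this way, the VGPR total falls by about 2 % overall and by about 9 % among the affected shaders, while SGPR count and code size rise slightly.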
Tom Stellard
45bb48ea19
R600 -> AMDGPU rename
...
llvm-svn: 239657
2015-06-13 03:28:10 +00:00