Konstantin Zhuravlyov
0a1a7b6b23
Revert "AMDGPU: Enable ConstrainCopy DAG mutation"
...
This reverts commit r287146.
This breaks a few conformance tests.
llvm-svn: 287233
2016-11-17 16:41:49 +00:00
Matt Arsenault
3b36bb1d87
AMDGPU: Enable ConstrainCopy DAG mutation
...
This fixes a probably unintended divergence from the default
scheduler behavior.
llvm-svn: 287146
2016-11-16 20:35:23 +00:00
Matt Arsenault
5d8eb25e78
AMDGPU: Use unsigned compare for eq/ne
...
For some reason both signed and unsigned variants are available,
except for scalar 64-bit compares, which only have u64. I'm not sure
why both exist (I'm guessing it's for the one-bit inputs we
don't use), but for consistency always use the
unsigned one.
llvm-svn: 282832
2016-09-30 01:50:20 +00:00
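The reasoning in the commit above (either variant works for eq/ne) can be illustrated with a small sketch. It is not taken from the LLVM sources; the helper names `to_signed` and `compare_all` are hypothetical and only model two's-complement integers in Python. Equality does not depend on how the bits are interpreted, while ordering does.

```python
def to_signed(value: int, bits: int = 32) -> int:
    """Reinterpret an unsigned bit pattern as a two's-complement signed value."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def compare_all(a: int, b: int, bits: int = 32) -> dict:
    """Compare two bit patterns under both signed and unsigned interpretations."""
    mask = (1 << bits) - 1
    ua, ub = a & mask, b & mask                            # unsigned view
    sa, sb = to_signed(ua, bits), to_signed(ub, bits)      # signed view
    return {
        "eq_unsigned": ua == ub,
        "eq_signed": sa == sb,
        "lt_unsigned": ua < ub,
        "lt_signed": sa < sb,
    }


if __name__ == "__main__":
    # 0xFFFFFFFF is 4294967295 unsigned but -1 signed.
    result = compare_all(0xFFFFFFFF, 1)
    print(result)
```

For eq/ne the signed and unsigned answers always agree, so choosing one variant consistently loses nothing; for lt/gt the choice matters.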
Matt Arsenault
bbb47da8a1
AMDGPU: Support commuting with immediate in src0
...
llvm-svn: 280970
2016-09-08 17:19:29 +00:00
Tom Stellard
0d23ebe888
AMDGPU/SI: Implement a custom MachineSchedStrategy
...
Summary:
GCNSchedStrategy re-uses most of GenericScheduler; it just uses
a different method to compute the excess and critical register
pressure limits.
It's not enabled by default; to enable it, pass -misched=gcn
to llc.
Shader DB stats:
32464 shaders in 17874 tests
Totals:
SGPRS: 1542846 -> 1643125 (6.50 %)
VGPRS: 1005595 -> 904653 (-10.04 %)
Spilled SGPRs: 29929 -> 27745 (-7.30 %)
Spilled VGPRs: 334 -> 352 (5.39 %)
Scratch VGPRs: 1612 -> 1624 (0.74 %) dwords per thread
Code Size: 36688188 -> 37034900 (0.95 %) bytes
LDS: 1913 -> 1913 (0.00 %) blocks
Max Waves: 254101 -> 265125 (4.34 %)
Wait states: 0 -> 0 (0.00 %)
Totals from affected shaders:
SGPRS: 1338220 -> 1438499 (7.49 %)
VGPRS: 886221 -> 785279 (-11.39 %)
Spilled SGPRs: 29869 -> 27685 (-7.31 %)
Spilled VGPRs: 334 -> 352 (5.39 %)
Scratch VGPRs: 1612 -> 1624 (0.74 %) dwords per thread
Code Size: 34315716 -> 34662428 (1.01 %) bytes
LDS: 1551 -> 1551 (0.00 %) blocks
Max Waves: 188127 -> 199151 (5.86 %)
Wait states: 0 -> 0 (0.00 %)
Reviewers: arsenm, mareko, nhaehnle, MatzeB, atrick
Subscribers: arsenm, kzhuravl, llvm-commits
Differential Revision: https://reviews.llvm.org/D23688
llvm-svn: 279995
2016-08-29 19:42:52 +00:00
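The percentages in the Shader DB stats above are relative changes, (after - before) / before. A minimal sketch for reproducing them follows; the function name `percent_change` is hypothetical and not part of any shader-db tooling, and the input figures are copied from the "Totals" block of the commit message.

```python
def percent_change(before: int, after: int) -> float:
    """Relative change from before to after, in percent, rounded to 2 places."""
    return round((after - before) / before * 100, 2)


# (before, after) pairs copied from the "Totals" block of the commit message.
totals = {
    "SGPRS": (1542846, 1643125),
    "VGPRS": (1005595, 904653),
    "Spilled SGPRs": (29929, 27745),
    "Spilled VGPRs": (334, 352),
    "Code Size": (36688188, 37034900),
    "Max Waves": (254101, 265125),
}

if __name__ == "__main__":
    for name, (before, after) in totals.items():
        print(f"{name}: {before} -> {after} ({percent_change(before, after):.2f} %)")
```

Read this way, the trade-off in the stats is a roughly 10 % drop in VGPR usage in exchange for a roughly 6.5 % rise in SGPR usage, with a net gain in maximum waves.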
Matt Arsenault
1cc4991412
AMDGPU: Fix inconsistent lowering of select of vectors
...
f32 vectors would use a sequence of BFI instructions instead
of unrolled cmp + select. This was better in the case of a VALU
select with SGPR inputs, but we don't have a way of dealing with that
in the DAG.
llvm-svn: 270731
2016-05-25 17:34:58 +00:00
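The two lowerings mentioned in the commit above can be modeled in a few lines. This is a sketch under assumptions, not the backend's code: it assumes the BFI form computes (mask & a) | (~mask & b) on 32-bit lanes and that the condition is expanded to an all-ones or all-zeros mask. The names `bfi`, `select_via_bfi`, and `select_direct` are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def bfi(mask: int, a: int, b: int) -> int:
    """Bitfield insert: take bits of a where mask is 1, bits of b elsewhere."""
    return ((mask & a) | (~mask & b)) & MASK32


def select_via_bfi(cond: bool, a: int, b: int) -> int:
    """Select modeled as BFI with the condition expanded to a full-width mask (assumed)."""
    mask = MASK32 if cond else 0
    return bfi(mask, a, b)


def select_direct(cond: bool, a: int, b: int) -> int:
    """Select modeled as a plain conditional on the compare result."""
    return a if cond else b


def select_vector(conds, xs, ys, lane_select=select_direct):
    """Unrolled per-lane select over equal-length sequences of 32-bit lanes."""
    return [lane_select(c, x, y) for c, x, y in zip(conds, xs, ys)]


if __name__ == "__main__":
    conds = [True, False, True, False]
    xs = [0x3F800000, 0x40000000, 0x40400000, 0x40800000]  # f32 bit patterns 1.0..4.0
    ys = [0, 0, 0, 0]
    print(select_vector(conds, xs, ys))
    print(select_vector(conds, xs, ys, lane_select=select_via_bfi))
```

Both forms produce the same lanes; the commit only makes the choice between them consistent across vector types.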
Tom Stellard
e48fe2a27a
AMDGPU/SI: Add support for shrinking v_cndmask_b32_e32 instructions
...
Reviewers: arsenm
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11061
llvm-svn: 242146
2015-07-14 14:15:03 +00:00
Tom Stellard
45bb48ea19
R600 -> AMDGPU rename
...
llvm-svn: 239657
2015-06-13 03:28:10 +00:00