Commit Graph

8 Commits

Author SHA1 Message Date
Stanislav Mekhanoshin d4ae470d2e [AMDGPU] Prevent post-RA scheduler from breaking memory clauses
The pre-RA scheduler does load/store clustering, but post-RA
scheduler undoes it. Add mutation to prevent it.

Differential Revision: https://reviews.llvm.org/D38014

llvm-svn: 313670
2017-09-19 20:54:38 +00:00
Matt Arsenault 9aa45f047f AMDGPU: Add macro fusion schedule DAG mutation
Try to increase opportunities to shrink vcc uses.

llvm-svn: 307313
2017-07-06 20:57:05 +00:00
Matt Arsenault 3dbeefa978 AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernel
Currently the default C calling convention functions are treated
the same as compute kernels. Make this explicit so the default
calling convention can be changed to a non-kernel.

Converted with perl -pi -e 's/define void/define amdgpu_kernel void/'
on the relevant test directories (and undoing in one place that actually
wanted a non-kernel).

llvm-svn: 298444
2017-03-21 21:39:51 +00:00
Stanislav Mekhanoshin 79da2a7698 [AMDGPU] Remove getBidirectionalReasonRank
This method inverts the Reason field of a scheduling candidate.
It does right comparison between RegCritical and RegExcess, but
everything else is broken. In fact it can prefer less strong reason
such as Weak over RegCritical because Weak > -RegCritical.

The CandReason enum is properly sorted, so just remove artificial
ranking.

Differential Revision: https://reviews.llvm.org/D30557

llvm-svn: 297536
2017-03-11 00:29:27 +00:00
Matt Arsenault 7aad8fd8f4 Enable FeatureFlatForGlobal on Volcanic Islands
This switches to the workaround that HSA defaults to
for the mesa path.

This should be applied to the 4.0 branch.

Patch by Vedran Miletić <vedran@miletic.net>

llvm-svn: 292982
2017-01-24 22:02:15 +00:00
Matt Arsenault 0b26e47345 AMDGPU: Invert cmp + select with constant
Canonicalize a select with a constant to the false side. This
enables more instruction shrinking opportunities since an
inline immediate can be used for the false side of v_cndmask_b32_e32.

This seems to usually be better but causes some code size regressions
in some tests.

llvm-svn: 290372
2016-12-22 21:40:08 +00:00
Matt Arsenault 9c47dd583a AMDGPU: Remove some old intrinsic uses from tests
llvm-svn: 260493
2016-02-11 06:02:01 +00:00
Tom Stellard 45bb48ea19 R600 -> AMDGPU rename
llvm-svn: 239657
2015-06-13 03:28:10 +00:00