Commit Graph

8 Commits

Author SHA1 Message Date
Yaxun Liu 0124b5484c [AMDGPU] Change constant addr space to 4
Differential Revision: https://reviews.llvm.org/D43170

llvm-svn: 325030
2018-02-13 18:00:25 +00:00
Tim Renouf 6eaad1e539 [AMDGPU] Fixed incorrect uniform branch condition
Summary:
I had a case where multiple nested uniform ifs resulted in code that did
v_cmp comparisons, combining the results with s_and_b64, s_or_b64 and
s_xor_b64 and using the resulting mask in s_cbranch_vccnz, without first
ensuring that bits for inactive lanes were clear.

There was already code for inserting an "s_and_b64 vcc, exec, vcc" to
clear bits for inactive lanes in the case that the branch is instruction
selected as s_cbranch_scc1 and is then changed to s_cbranch_vccnz in
SIFixSGPRCopies. I have added the same code into SILowerControlFlow for
the case that the branch is instruction selected as s_cbranch_vccnz.

This de-optimizes the code in some cases where the s_and is not needed,
because vcc is the result of a v_cmp, or multiple v_cmp instructions
combined by s_and/s_or. We should add a pass to re-optimize those cases.

Reviewers: arsenm, kzhuravl

Subscribers: wdng, yaxunl, t-tye, llvm-commits, dstuttard, timcorringham, nhaehnle

Differential Revision: https://reviews.llvm.org/D41292

llvm-svn: 322119
2018-01-09 21:34:43 +00:00
Mark Searles 70359ac60d [AMDGPU] Turn on the new waitcnt insertion pass. Adjust tests.
-enable-si-insert-waitcnts=1 becomes the default
-enable-si-insert-waitcnts=0 to use old pass

Differential Revision: https://reviews.llvm.org/D33730

llvm-svn: 304551
2017-06-02 14:19:25 +00:00
Matt Arsenault 3dbeefa978 AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernel
Currently the default C calling convention functions are treated
the same as compute kernels. Make this explicit so the default
calling convention can be changed to a non-kernel.

Converted with perl -pi -e 's/define void/define amdgpu_kernel void/'
on the relevant test directories (and undoing in one place that actually
wanted a non-kernel).

llvm-svn: 298444
2017-03-21 21:39:51 +00:00
Matt Arsenault 7aad8fd8f4 Enable FeatureFlatForGlobal on Volcanic Islands
This switches to the workaround that HSA defaults to
for the mesa path.

This should be applied to the 4.0 branch.

Patch by Vedran Miletić <vedran@miletic.net>

llvm-svn: 292982
2017-01-24 22:02:15 +00:00
Matt Arsenault f530e8b3f0 AMDGPU: Remove unnecessary and on conditional branch
The comment explaining why this was necessary is incorrect
in its description of v_cmp's behavior for inactive workitems.

llvm-svn: 286134
2016-11-07 19:09:33 +00:00
Matt Arsenault bbb47da8a1 AMDGPU: Support commuting with immediate in src0
llvm-svn: 280970
2016-09-08 17:19:29 +00:00
Tom Stellard bc4497b13c AMDGPU/SI: Detect uniform branches and emit s_cbranch instructions
Reviewers: arsenm

Subscribers: mareko, MatzeB, qcolombet, arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D16603

llvm-svn: 260765
2016-02-12 23:45:29 +00:00