r600 uses dummy pointer info for lowering load/store. Since dummy pointer info
assumes address space 0, this causes isel failure when temporary load/store SDNodes
are generated for amdgiz environment.
Since the offest is not constant, FixedStack pseudo source value cannot be used
to create the pointer info. This patch creates pointer info using llvm undef value.
At least this provides correct address space so that isel can be done correctly.
Differential Revision: https://reviews.llvm.org/D39698
llvm-svn: 317862
We already have a combine for this pattern when the input to shl is add, so we just need to enable the transformation when the input is or.
Original patch by @tstellar
Differential Revision: https://reviews.llvm.org/D19325
llvm-svn: 313251
This fixes 17 of the 41 -verify-machineinstrs test failures identified in PR33045
Differential Revision: https://reviews.llvm.org/D33451
llvm-svn: 303691
If workgroup size is known inform llvm about range returned by local
id and local size queries.
Differential Revision: https://reviews.llvm.org/D31804
llvm-svn: 300102
Currently the default C calling convention functions are treated
the same as compute kernels. Make this explicit so the default
calling convention can be changed to a non-kernel.
Converted with perl -pi -e 's/define void/define amdgpu_kernel void/'
on the relevant test directories (and undoing in one place that actually
wanted a non-kernel).
llvm-svn: 298444
This switches to the workaround that HSA defaults to
for the mesa path.
This should be applied to the 4.0 branch.
Patch by Vedran Miletić <vedran@miletic.net>
llvm-svn: 292982
This breaks the tests that were meant for testing
64-bit inline immediates, so move those to shl where
they won't be broken up.
This should be repeated for the other related bit ops.
llvm-svn: 258095
Re-comitting with a change that avoids undefined uses getting put into
the VRegUses list.
The new algorithm remembers the uses encountered while walking backwards
until a matching def is found. Contrary to the previous version this:
- Works without LiveIntervals being available
- Allows to increase the precision to subregisters/lanemasks
(not used for now)
The changes in the AMDGPU tests are necessary because the R600 scheduler
is not stable with respect to the order of nodes in the ready queues.
Differential Revision: http://reviews.llvm.org/D9068
llvm-svn: 254683
This works mostly fine but breaks some stage 1 builders when compiling
compiler-rt on i386. Revert for further investigation as I can't see an
obvious cause/fix.
This reverts commit r254577.
llvm-svn: 254586
The new algorithm remembers the uses encountered while walking backwards
until a matching def is found. Contrary to the previous version this:
- Works without LiveIntervals being available
- Allows to increase the precision to subregisters/lanemasks
(not used for now)
The changes in the AMDGPU tests are necessary because the R600 scheduler
is not stable with respect to the order of nodes in the ready queues.
Differential Revision: http://reviews.llvm.org/D9068
llvm-svn: 254577
This can be done only with moves which theoretically
will optimize better later.
Although this transform increases the instruction count,
it should be code size / cycle count neutral in the worst
VALU case. It also seems to slightly improve a couple
of testcases due to other DAG combines this exposes.
This is probably slightly worse for the SALU case, so
it might be better to handle this during moveToVALU,
although then you lose some simplifications like
the load width reducing in the simple testcase.
llvm-svn: 242177