All shift operations will be selected as SALU instructions and then
if necessary lowered to VALU instructions in the SIFixSGPRCopies pass.
This allows us to do more operations on the SALU which will improve
performance and is also required for implementing private memory
using indirect addressing, since the private memory pointers must stay
in the scalar registers.
This patch includes some fixes from Matt Arsenault.
llvm-svn: 194625
The SelectionDAGBuilder was promoting vector kernel arguments to legal
types, but this won't work for R600 and SI since kernel arguments are
stored in memory and can't be promoted. In order to handle vector
arguments correctly we need to look at the original types from the LLVM IR
function.
llvm-svn: 193215
During instruction selection, we rewrite the destination register
class for MIMG instructions based on their writemasks. This creates
machine verifier errors since the new register class does not match
the register class in the MIMG instruction definition.
We can avoid this by defining different MIMG instructions for each
possible destination type and then switching to the correct instruction
when we change the register class.
llvm-svn: 192365
For _XYZ, the type of VDATA is v4i32, because v3i32 doesn't exist.
The ADDR64 bit is not exposed. A simpler intrinsic that doesn't take
a resource descriptor might be nicer.
The maximum number of input SGPRs is bumped to 17.
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 190575
This adds minimal support to the SelectionDAG for handling address spaces
with different pointer sizes. The SelectionDAG should now correctly
lower pointer function arguments to the correct size as well as generate
the correct code when lowering getelementptr.
This patch also updates the R600 DataLayout to use 32-bit pointers for
the local address space.
v2:
- Add more helper functions to TargetLoweringBase
- Use CHECK-LABEL for tests
llvm-svn: 189221
The SIInsertWaits pass was overwriting the first operand (gds bit) of
DS_WRITE_B32 with the second operand (value to write). This meant that
any time the value to write was stored in an odd number VGPR, the gds
bit would be set causing the instruction to write to GDS instead of LDS.
llvm-svn: 188522
Now that compute support is better on SI, we can't continue using v16i8
for descriptors since this is also a legal type in OpenCL.
This patch fixes numerous hangs with the piglit OpenCL test and since
we now use a target specific DAG node for LOAD_CONSTANT with the
correct MemOperandFlags, this should also fix:
https://bugs.freedesktop.org/show_bug.cgi?id=66805
llvm-svn: 188429
Using REG_SEQUENCE for BUILD_VECTOR rather than a series of INSERT_SUBREG
instructions should make it easier for the register allocator to coalasce
unnecessary copies.
v2:
- Use an SGPR register class if all the operands of BUILD_VECTOR are
SGPRs.
llvm-svn: 188427
The instruction selector will now try to infer the destination register
so it can decided whether to use V_MOV_B32 or S_MOV_B32 when copying
immediates.
llvm-svn: 188426
The previous code declared the operand as unknown:$vaddr, which made
it possible for scalar registers to be used instead of vector registers.
llvm-svn: 188425