The backend now assumes that all immediates are integers. This allows
us to simplify immediate handling code, becasue we no longer need to
handle fp and integer immediates differently.
llvm-svn: 225844
Use VGPR_32 register class instead. These two register classes were
identical and having separate classes was causing
SIInstrInfo::isLegalOperands() to be overly conservative in some cases.
This change is necessary to prevent future paches from missing a folding
opportunity in fneg-fabs.ll.
llvm-svn: 225382
This s_mov_b32 will write to a virtual register from the M0Reg
class and all the ds instructions now take an extra M0Reg explicit
argument.
This change is necessary to prevent issues with the scheduler
mixing together instructions that expect different values in the m0
registers.
llvm-svn: 222583
shorter/easier and have the DAG use that to do the same lookup. This
can be used in the future for TargetMachine based caching lookups from
the MachineFunction easily.
Update the MIPS subtarget switching machinery to update this pointer
at the same time it runs.
llvm-svn: 214838
We need to store a value greater than or equal to the number of LDS
bytes allocated by the shader in the m0 register in order for LDS
instructions to work correctly.
We always initialize m0 at the beginning of a shader, but this register
is also used for indirect addressing offsets, so we need to
re-initialize it any time we use indirect addressing.
llvm-svn: 211107
This was causing my llc to go into an infinite loop on
CodeGen/R600/address-space.ll (just triggered recently by some allocator
changes).
llvm-svn: 205005
This instructions writes to an 32-bit SGPR. This change required adding
the 32-bit VCC_LO and VCC_HI registers, because the full VCC register
is 64 bits.
This fixes verifier errors on several of the indirect addressing piglit
tests.
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 204055
If the SI_KILL operand is constant, we can either clear the exec mask if
the operand is negative, or do nothing otherwise.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 202337
DS instructions that access local memory can only uses addresses that
are less than or equal to the value of M0. When M0 is uninitialized,
then we experience undefined behavior.
This patch also changes the behavior to emit S_WQM_B64 on pixel shaders
no matter what kind of DS instruction is used.
llvm-svn: 201097
This doesn't change any functionality, since we only have two shader
types (compute and pixel) that use local memory. We're just changing
the logic to match the documentation.
llvm-svn: 201096
Restore the EXEC mask early, otherwise a copy might end up not beeing executed.
Candidate for the mesa stable branch.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 178018
Seems to be allot simpler, and also paves the
way for further improvements.
v2: rebased on master, use 0 in BUFFER_LOAD_FORMAT_XYZW,
use VGPR0 in dummy EXP, avoid compiler warning, break
after encoding the first literal.
v3: correctly use V_ADD_F32_e64
This is a candidate for the stable branch.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 175354
We shouldn't insert KILL optimization if we don't have a
kill instruction at all.
Patch by: Christian König
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Christian König <deathsimple@vodafone.de>
llvm-svn: 172845
Branch if we have enough instructions so that it makes sense.
Also remove branches if they don't make sense.
Patch by: Christian König
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Christian König <deathsimple@vodafone.de>
llvm-svn: 170592
This patch replaces the control flow handling with a new
pass which structurize the graph before transforming it to
machine instruction. This has a couple of different advantages
and currently fixes 20 piglit tests without a single regression.
It is now a general purpose transformation that could be not
only be used for SI/R6xx, but also for other hardware
implementations that use a form of structurized control flow.
v2: further cleanup, fixes and documentation
Patch by: Christian König
Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 170591