Commit Graph

41 Commits

Author SHA1 Message Date
Tom Stellard 1aaad6970c R600/SI: Add instruction shrinking pass
This pass converts 64-bit instructions to 32-bit when possible.

llvm-svn: 213561
2014-07-21 16:55:33 +00:00
Tom Stellard b02094e115 R600/SI: Use scratch memory for large private arrays
llvm-svn: 213551
2014-07-21 15:45:01 +00:00
Matt Arsenault 0163e033e2 R600: Remove unused function
llvm-svn: 213472
2014-07-20 06:31:06 +00:00
Tom Stellard 2e59a45f80 R600: Move AMDGPUInstrInfo from AMDGPUTargetMachine into AMDGPUSubtarget
llvm-svn: 210869
2014-06-13 01:32:00 +00:00
Matt Arsenault 8333e4378e R600/SI: Implement i64 ctpop
llvm-svn: 210568
2014-06-10 19:18:24 +00:00
Matt Arsenault 689f325099 R600/SI: Keep 64-bit not on SALU
llvm-svn: 210476
2014-06-09 16:36:31 +00:00
Tom Stellard c721a23882 R600/SI: Refactor the VOP3_32 tablegen class
This will allow us to use a single MachineInstr to represent
instructions which behave the same but have different encodings
on some subtargets.

llvm-svn: 209028
2014-05-16 20:56:47 +00:00
Tom Stellard eba61071d7 R600/SI: Only create one instruction when spilling/restoring register v3
The register spiller assumes that only one new instruction is created
when spilling and restoring registers, so we need to emit pseudo
instructions for vector register spills and lower them after
register allocation.

v2:
  - Fix calculation of lane index
  - Extend VGPR liveness to end of program.

v3:
  - Use SIMM16 field of S_NOP to specify multiple NOPs.

https://bugs.freedesktop.org/show_bug.cgi?id=75005

llvm-svn: 207843
2014-05-02 15:41:42 +00:00
Tom Stellard 0c354f25c9 R600/SI: Teach moveToVALU how to handle some SMRD instructions
llvm-svn: 207660
2014-04-30 15:31:29 +00:00
Craig Topper 5656db4a8b [C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. R600 edition
llvm-svn: 207503
2014-04-29 07:57:24 +00:00
Craig Topper e73658ddbb [C++] Use 'nullptr'.
llvm-svn: 207394
2014-04-28 04:05:08 +00:00
Matt Arsenault 27cc958dff R600/SI: Match sign_extend_inreg to s_sext_i32_i8 and s_sext_i32_i16
llvm-svn: 206547
2014-04-18 01:53:18 +00:00
Matt Arsenault d7bdcc46a6 R600/SI: Implement shouldConvertConstantLoadToIntImm
llvm-svn: 205244
2014-03-31 19:54:27 +00:00
Tom Stellard 30f59417cf R600/SI: Implement SIInstrInfo::isTriviallyRematerializable()
llvm-svn: 205188
2014-03-31 14:01:56 +00:00
Matt Arsenault 248b7b6ba1 R600/SI: Sub-optimial fix for 64-bit immediates with SALU ops.
No longer asserts, but now you get moves loading legal immediates
into the split 32-bit operations.

llvm-svn: 204661
2014-03-24 20:08:09 +00:00
Matt Arsenault f35182c783 R600/SI: Fix 64-bit bit ops that require the VALU.
Try to match scalar and first like the other instructions.
Expand 64-bit ands to a pair of 32-bit ands since that is not
available on the VALU.

llvm-svn: 204660
2014-03-24 20:08:05 +00:00
Matt Arsenault bd9958038c R600/SI: Move splitting 64-bit immediates to separate function.
llvm-svn: 204651
2014-03-24 18:26:52 +00:00
Tom Stellard 1583409e33 R600/SI: Handle MUBUF instructions in SIInstrInfo::moveToVALU()
llvm-svn: 204476
2014-03-21 15:51:57 +00:00
Matt Arsenault 99395fa98f R600: Remove unused method declaration.
llvm-svn: 204357
2014-03-20 16:41:06 +00:00
Matt Arsenault faa297e89e Remove incomplete comment
llvm-svn: 203518
2014-03-11 00:01:37 +00:00
Matt Arsenault 6dde30354a Move trivial getter into header.
llvm-svn: 203517
2014-03-11 00:01:34 +00:00
Tom Stellard 5d7aaaed7d R600/SI: Initialize M0 and emit S_WQM_B64 whenever DS instructions are used
DS instructions that access local memory can only uses addresses that
are less than or equal to the value of M0.  When M0 is uninitialized,
then we experience undefined behavior.

This patch also changes the behavior to emit S_WQM_B64 on pixel shaders
no matter what kind of DS instruction is used.

llvm-svn: 201097
2014-02-10 16:58:30 +00:00
Matt Arsenault eaa3a7efab Use llvm_unreachable instead of assert(0)
llvm-svn: 196971
2013-12-10 21:37:42 +00:00
Tom Stellard c149dc02d3 R600/SI: Implement spilling of SGPRs v5
SGPRs are spilled into VGPRs using the {READ,WRITE}LANE_B32 instructions.

v2:
  - Fix encoding of Lane Mask
  - Use correct register flags, so we don't overwrite the low dword
    when restoring multi-dword registers.

v3:
  - Register spilling seems to hang the GPU, so replace all shaders
    that need spilling with a dummy shader.

v4:
  - Fix *LANE definitions
  - Change destination reg class for 32-bit SMRD instructions

v5:
  - Remove small optimization that was crashing Serious Sam 3.

https://bugs.freedesktop.org/show_bug.cgi?id=68224
https://bugs.freedesktop.org/show_bug.cgi?id=71285

NOTE: This is a candidate for the 3.4 branch.
llvm-svn: 195880
2013-11-27 21:23:35 +00:00
Matt Arsenault f14032af0e Make method static
llvm-svn: 194858
2013-11-15 22:02:28 +00:00
Tom Stellard 81d871dee3 R600/SI: Add support for private address space load/store
Private address space is emulated using the register file with
MOVRELS and MOVRELD instructions.

llvm-svn: 194626
2013-11-13 23:36:50 +00:00
Tom Stellard 8216602a0b R600/SI: Prefer SALU instructions for bit shift operations
All shift operations will be selected as SALU instructions and then
if necessary lowered to VALU instructions in the SIFixSGPRCopies pass.

This allows us to do more operations on the SALU which will improve
performance and is also required for implementing private memory
using indirect addressing, since the private memory pointers must stay
in the scalar registers.

This patch includes some fixes from Matt Arsenault.

llvm-svn: 194625
2013-11-13 23:36:37 +00:00
Tom Stellard 26a3b67b3b R600: Simplify handling of private address space
The AMDGPUIndirectAddressing pass was previously responsible for
lowering private loads and stores to indirect addressing instructions.
However, this pass was buggy and way too complicated.  The only
advantage it had over the new simplified code was that it saved one
instruction per direct write to private memory.  This optimization
likely has a minimal impact on performance, and we may be able
to duplicate it using some other transformation.

For the private address space, we now:
1. Lower private loads/store to Register(Load|Store) instructions
2. Reserve part of the register file as 'private memory'
3. After regalloc lower the Register(Load|Store) instructions to
   MOV instructions that use indirect addressing.

llvm-svn: 193179
2013-10-22 18:19:10 +00:00
Tom Stellard c460b0dcf1 R600: Remove unused InstrInfo::getMovImmInstr() function
llvm-svn: 193178
2013-10-22 18:19:01 +00:00
Matt Arsenault df90c02e68 Fix missing C++ mode thing in header
llvm-svn: 192751
2013-10-15 23:44:45 +00:00
Tom Stellard 93fabcebf1 R600/SI: Implement SIInstrInfo::verifyInstruction() for VOP*
The function is used by the machine verifier and checks that VOP*
instructions have legal operands.

llvm-svn: 192367
2013-10-10 17:11:55 +00:00
Michel Danzer 20680b1cc5 R600/SI: Fix broken encoding of DS_WRITE_B32
The logic in SIInsertWaits::getHwCounts() only really made sense for SMRD
instructions, and trying to shoehorn it into handling DS_WRITE_B32 caused
it to corrupt the encoding of that by clobbering the first operand with
the second one.

Undo that damage and only apply the SMRD logic to that.

Fixes some derivates related piglit regressions with radeonsi.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 188558
2013-08-16 16:19:24 +00:00
Tom Stellard 16a9a205c8 R600/SI: Assign a register class to the $vaddr operand for MIMG instructions
The previous code declared the operand as unknown:$vaddr, which made
it possible for scalar registers to be used instead of vector registers.

llvm-svn: 188425
2013-08-14 23:24:17 +00:00
Christian Konig 8e06e2a8c4 R600/SI: adjust writemask to only the used components
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 179165
2013-04-10 08:39:08 +00:00
Christian Konig 3c14580acb R600/SI: add cummuting of rev instructions
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 178127
2013-03-27 09:12:59 +00:00
Christian Konig f741fbfb1b R600/SI: add VOP mapping functions
Make it possible to map between e32 and e64 encoding opcodes.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 176104
2013-02-26 17:52:42 +00:00
Christian Konig 76edd4f2bc R600/SI: add some more instruction flags
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 176102
2013-02-26 17:52:29 +00:00
Tom Stellard 1c822a8929 R600/SI: cleanup VGPR encoding
Remove all the unused code.

Patch by: Christian König

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 174656
2013-02-07 19:39:45 +00:00
Tom Stellard f3b2a1e8b3 R600: Support for indirect addressing v4
Only implemented for R600 so far.  SI is missing implementations of a
few callbacks used by the Indirect Addressing pass and needs code to
handle frame indices.

At the moment R600 only supports array sizes of 16 dwords or less.
Register packing of vector types is currently disabled, which means that a
vec4 is stored in T0_X, T1_X, T2_X, T3_X, rather than T0_XYZW. In order
to correctly pack registers in all cases, we will need to implement an
analysis pass for R600 that determines the correct vector width for each
array.

v2:
  - Add support for i8 zext load from stack.
  - Coding style fixes

v3:
  - Don't reserve registers for indirect addressing when it isn't
    being used.
  - Fix bug caused by LLVM limiting the number of SubRegIndex
    declarations.

v4:
  - Fix 64-bit defines

llvm-svn: 174525
2013-02-06 17:32:29 +00:00
Tom Stellard c4cabef782 R600: Proper insert S_WAITCNT instructions
Some instructions like memory reads/writes are executed
asynchronously, so we need to insert S_WAITCNT instructions
to block before accessing their results. Previously we have
just inserted S_WAITCNT instructions after each async
instruction, this patch fixes this and adds a prober
insertion pass.

Patch by: Christian König

Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Christian König <deathsimple@vodafone.de>
llvm-svn: 172846
2013-01-18 21:15:53 +00:00
Tom Stellard 75aadc2813 Add R600 backend
A new backend supporting AMD GPUs: Radeon HD2XXX - HD7XXX

llvm-svn: 169915
2012-12-11 21:25:42 +00:00