Commit Graph

104 Commits

Author SHA1 Message Date
Matt Arsenault 209a7b92b5 R600: Minor cleanups.
Fix indentation, better line wrapping, unused includes.

llvm-svn: 206562
2014-04-18 07:40:20 +00:00
Matt Arsenault 4e46665a80 R600: Expand sign extension of vectors.
Setting vector types to expand will result in scalarization on pre SI hw,
as those gpus don't have vector shifts either.
Expand also i32 vectors, this helps llvm make the correct decision
about scalarizing the vector ops.

v2: move setOperation() calls to R600ISelLowering.cpp.
    cleanup the SI code to make it obvious that this patch does is nop for SI

Patch by: Jan Vesely <jan.vesely@rutgers.edu>

llvm-svn: 206348
2014-04-16 01:41:30 +00:00
Nick Lewycky aad475b324 Break PseudoSourceValue out of the Value hierarchy. It is now the root of its own tree containing FixedStackPseudoSourceValue (which you can use isa/dyn_cast on) and MipsCallEntry (which you can't). Anything that needs to use either a PseudoSourceValue* and Value* is strongly encouraged to use a MachinePointerInfo instead.
llvm-svn: 206255
2014-04-15 07:22:52 +00:00
Matt Arsenault e1f030ca66 R600: Check if a sextload should be used for parameter loads.
Through some oddity where truncate (sextload x) isn't folded into
an anyextload for vectors, the sextload remains if the
vector isn't immediately scalarized. This keeps the expected
zextload instructions in the kernel-args test when small type
vectors aren't scalarized.

llvm-svn: 206070
2014-04-11 20:59:54 +00:00
Tom Stellard 50122a5890 R600: Match 24-bit arithmetic patterns in a Target DAGCombine
Moving these patterns from TableGen files to PerformDAGCombine()
should allow us to generate better code by eliminating unnecessary
shifts and extensions earlier.

This also fixes a bug where the MAD pattern was calling
SimplifyDemandedBits with a 24-bit mask on the first operand
even when the full pattern wasn't being matched.  This occasionally
resulted in some instructions being incorrectly deleted from the
program.

v2:
  - Fix bug with 64-bit mul

llvm-svn: 205731
2014-04-07 19:45:41 +00:00
Matt Arsenault 7939acd7fa Use .data() instead of &x[0]
llvm-svn: 205722
2014-04-07 16:44:24 +00:00
Matt Arsenault d125d74a73 R600/SI: Fix unreachable with a sext_in_reg to an illegal type.
llvm-svn: 204945
2014-03-27 17:23:24 +00:00
Matt Arsenault fae02989b7 R600: Match sign_extend_inreg to BFE instructions
llvm-svn: 204072
2014-03-17 18:58:11 +00:00
Benjamin Kramer b6d0bd48bd [C++11] Replace llvm::next and llvm::prior with std::next and std::prev.
Remove the old functions.

llvm-svn: 202636
2014-03-02 12:27:27 +00:00
Alp Toker cb40291100 Fix known typos
Sweep the codebase for common typos. Includes some changes to visible function
names that were misspelt.

llvm-svn: 200018
2014-01-24 17:20:08 +00:00
Tom Stellard e93736057f R600/SI: Add support for i8 and i16 private loads/stores
llvm-svn: 199823
2014-01-22 19:24:14 +00:00
Matt Arsenault eaa3a7efab Use llvm_unreachable instead of assert(0)
llvm-svn: 196971
2013-12-10 21:37:42 +00:00
Vincent Lejeune cc0ea74c7b R600: Fix an infinite loop when trying to reorganize export/tex vector input
llvm-svn: 196923
2013-12-10 14:43:31 +00:00
Alp Toker f907b891da Correct word hyphenations
This patch tries to avoid unrelated changes other than fixing a few
hyphen-related ambiguities and contractions in nearby lines.

llvm-svn: 196471
2013-12-05 05:44:44 +00:00
Tom Stellard 8f9fc20751 R600: Fix scheduling of instructions that use the LDS output queue
The LDS output queue is accessed via the OQAP register.  The OQAP
register cannot be live across clauses, so if value is written to the
output queue, it must be retrieved before the end of the clause.
With the machine scheduler, we cannot statisfy this constraint, because
it lacks proper alias analysis and it will mark some LDS accesses as
having a chain dependency on vertex fetches.  Since vertex fetches
require a new clauses, the dependency may end up spiltting OQAP uses and
defs so the end up in different clauses.  See the lds-output-queue.ll
test for a more detailed explanation.

To work around this issue, we now combine the LDS read and the OQAP
copy into one instruction and expand it after register allocation.

This patch also adds some checks to the EmitClauseMarker pass, so that
it doesn't end a clause with a value still in the output queue and
removes AR.X and OQAP handling from the scheduler (AR.X uses and defs
were already being expanded post-RA, so the scheduler will never see
them).

Reviewed-by: Vincent Lejeune <vljn at ovi.com>
llvm-svn: 194755
2013-11-15 00:12:45 +00:00
Tom Stellard 81d871dee3 R600/SI: Add support for private address space load/store
Private address space is emulated using the register file with
MOVRELS and MOVRELD instructions.

llvm-svn: 194626
2013-11-13 23:36:50 +00:00
Matt Arsenault 00a0d6f672 R600: Fix selection failure on EXTLOAD
llvm-svn: 194547
2013-11-13 02:39:07 +00:00
Vincent Lejeune aee3a10440 R600: Reenable llvm.R600.load.input/interp.input for compatibility
llvm-svn: 194484
2013-11-12 16:26:47 +00:00
Vincent Lejeune f143af3fe9 R600: Use function inputs to represent data stored in gpr
llvm-svn: 194425
2013-11-11 22:10:24 +00:00
Matt Arsenault ef1a950b48 Use isa<> instead of dyn_cast<> with unused value
llvm-svn: 193869
2013-11-01 17:39:26 +00:00
Matt Arsenault 909d0c063f Fix a few typos
llvm-svn: 193723
2013-10-30 23:43:29 +00:00
NAKAMURA Takumi 8a0464393f Prune utf8 chars in comments.
llvm-svn: 193512
2013-10-28 04:07:38 +00:00
NAKAMURA Takumi 4bb85f90fd Target/R600: Un-tab-ify.
llvm-svn: 193510
2013-10-28 04:07:23 +00:00
Tom Stellard af77543244 R600: Fix handling of vector kernel arguments
The SelectionDAGBuilder was promoting vector kernel arguments to legal
types, but this won't work for R600 and SI since kernel arguments are
stored in memory and can't be promoted.  In order to handle vector
arguments correctly we need to look at the original types from the LLVM IR
function.

llvm-svn: 193215
2013-10-23 00:44:32 +00:00
Vincent Lejeune fa58a5fb60 R600: Use masked read sel for texture instructions
llvm-svn: 192554
2013-10-13 17:56:10 +00:00
Vincent Lejeune 301beb80d4 R600: fix swizzle export
llvm-svn: 192553
2013-10-13 17:56:04 +00:00
Vincent Lejeune 6df39438af R600: Add a ldptr intrinsic to support MSAA.
llvm-svn: 191838
2013-10-02 16:00:33 +00:00
Tom Stellard 0351ea2010 R600: Fix handling of NAN in comparison instructions
We were completely ignoring the unorder/ordered attributes of condition
codes and also incorrectly lowering seto and setuo.

Reviewed-by: Vincent Lejeune<vljn at ovi.com>
llvm-svn: 191603
2013-09-28 02:50:50 +00:00
Tom Stellard 5694d3090a SelectionDAG: Improve legalization of SELECT_CC with illegal condition codes
SelectionDAG will now attempt to inverse an illegal conditon in order to
find a legal one and if that doesn't work, it will attempt to swap the
operands using the inverted condition.

There are no new test cases for this, but a nubmer of the existing R600
tests hit this path.

llvm-svn: 191602
2013-09-28 02:50:43 +00:00
Tom Stellard cd42818d86 SelectionDAG: Try to expand all condition codes using getCCSwappedOperands()
This is useful for targets like R600, which only support GT, GE, NE, and EQ
condition codes as it removes the need to handle unsupported condition
codes in target specific code.

There are no tests with this commit, but R600 has been updated to take
advantage of this new feature, so its existing selectcc tests are now
testing the swapped operands path.

llvm-svn: 191601
2013-09-28 02:50:38 +00:00
Vincent Lejeune 0167a313da R600: Move clamp handling code to R600IselLowering.cpp
llvm-svn: 190645
2013-09-12 23:45:00 +00:00
Vincent Lejeune 9a248e5c2d R600: Move code handling literal folding into R600ISelLowering.
llvm-svn: 190644
2013-09-12 23:44:53 +00:00
Vincent Lejeune ab3baf80a8 R600: Move fabs/fneg/sel folding logic into PostProcessIsel
This move makes possible to correctly handle multiples instructions
from a single pattern.

llvm-svn: 190643
2013-09-12 23:44:44 +00:00
Tom Stellard 13c68ef88b R600: Add support for local memory atomic add
llvm-svn: 190080
2013-09-05 18:38:09 +00:00
Tom Stellard 53f2f90eb4 R600: Expand SELECT nodes rather than custom lowering them
llvm-svn: 190079
2013-09-05 18:38:03 +00:00
Tom Stellard 35bb18c2a7 R600: Add support for vector local memory loads
llvm-svn: 189226
2013-08-26 15:06:04 +00:00
Tom Stellard c6f4a29ed5 R600: Add support for i8 and i16 local memory loads
llvm-svn: 189225
2013-08-26 15:05:59 +00:00
Tom Stellard 2ffc330673 R600: Add support for v4i32 and v2i32 local stores
llvm-svn: 189222
2013-08-26 15:05:44 +00:00
Tom Stellard a92ff87929 R600: Expand vector float operations for both SI and R600
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 188596
2013-08-16 23:51:24 +00:00
Tom Stellard fbab827e2a R600: Add support for global vector stores with elements less than 32-bits
Tested-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 188520
2013-08-16 01:12:11 +00:00
Tom Stellard d3ee8c103a R600: Add support for i16 and i8 global stores
Tested-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 188519
2013-08-16 01:12:06 +00:00
Tom Stellard fc455471c3 R600: Set scheduling preference to Sched::Source
R600 doesn't need to do any scheduling on the SelectionDAG now that it
has a very good MachineScheduler.  Also, using the VLIW SelectionDAG
scheduler was having a major impact on compile times. For example with
the phatk kernel here are the LLVM IR to machine code compile times:

With Sched::VLIW

Total Compile Time:                  1.4890 Seconds (User + System)
SelectionDAG Instruction Scheduling: 1.1670 Seconds (User + System)

With Sched::Source

Total Compile Time:                  0.3330 Seconds (User + System)
SelectionDAG Instruction Scheduling: 0.0070 Seconds (User + System)

The code ouput was identical with both schedulers.  This may not be true
for all programs, but it gives me confidence that there won't be much
reduction, if any, in code quality by using Sched::Source.

llvm-svn: 188215
2013-08-12 22:33:21 +00:00
Tom Stellard 0344cdfe39 R600: Add 64-bit float load/store support
* Added R600_Reg64 class
* Added T#Index#.XY registers definition
* Added v2i32 register reads from parameter and global space
* Added f32 and i32 elements extraction from v2f32 and v2i32
* Added v2i32 -> v2f32 conversions

Tom Stellard:
  - Mark vec2 operations as expand.  The addition of a vec2 register
    class made them all legal.

Patch by: Dmitry Cherkassov

Signed-off-by: Dmitry Cherkassov <dcherkassov@gmail.com>
llvm-svn: 187582
2013-08-01 15:23:42 +00:00
Tom Stellard aa313d0a74 R600/SI: Expand vector fp <-> int conversions
llvm-svn: 187421
2013-07-30 14:31:03 +00:00
Quentin Colombet e2e0548d77 [R600] Replicate old DAGCombiner behavior in target specific DAG combine.
build_vector is lowered to REG_SEQUENCE, which is something the register
allocator does a good job at optimizing.

llvm-svn: 187397
2013-07-30 00:27:16 +00:00
Tom Stellard 840214437b R600: Move CONST_ADDRESS folding into AMDGPUDAGToDAGISel::Select()
This increases the number of opportunites we have for folding.  With the
previous implementation we were unable to fold into any instructions
other than the first when multiple instructions were selected from a
single SDNode.

Reviewed-by: Vincent Lejeune <vljn at ovi.com>
llvm-svn: 186919
2013-07-23 01:48:24 +00:00
Tom Stellard 1e80309ebe R600: Use KCache for kernel arguments
Reviewed-by: Vincent Lejeune <vljn at ovi.com>
llvm-svn: 186918
2013-07-23 01:48:18 +00:00
Tom Stellard acfeebf883 R600: Use the same compute kernel calling convention for all GPUs
A side-effect of this is that now the compiler expects kernel arguments
to be 4-byte aligned.

Reviewed-by: Vincent Lejeune <vljn at ovi.com>
llvm-svn: 186916
2013-07-23 01:48:05 +00:00
Tom Stellard 78e012969c R600: Use correct LoadExtType when lowering kernel arguments
Reviewed-by: Vincent Lejeune <vljn at ovi.com>
llvm-svn: 186915
2013-07-23 01:47:58 +00:00
Tom Stellard 33dd04bfbe R600: Clean up extended load patterns
Reviewed-by: Vincent Lejeune <vljn at ovi.com>
llvm-svn: 186914
2013-07-23 01:47:52 +00:00