Commit Graph

40 Commits

Author SHA1 Message Date
NAKAMURA Takumi 4f328e1c2f R600ISelLowering.cpp: Avoid "using namespace Intrinsic;" to appease MSC. Specify namespaces explicitly here.
MSC confuses "memcpy" from <cstring> with llvm::Intrinsic::memcpy when llvm::Intrinsic is exposed.

llvm-svn: 182452
2013-05-22 06:37:31 +00:00
NAKAMURA Takumi 18ca09c1cc R600: Whitespace and untabify.
llvm-svn: 182451
2013-05-22 06:37:25 +00:00
Tom Stellard 5643c4ac72 R600: Swap the legality of rotl and rotr
The hardware supports rotr and not rotl.

llvm-svn: 182285
2013-05-20 15:02:19 +00:00
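
Note on 5643c4ac72: the commit message only says that the hardware supports rotr and not rotl. The sketch below illustrates the standard identity that lets a left rotate be expressed through a right rotate. It assumes 32-bit values, and the function names are hypothetical; it does not claim to show how the backend or the LLVM legalizer actually rewrites the node.

```python
MASK32 = 0xFFFFFFFF


def rotr32(x, n):
    """Rotate a 32-bit value right by n bits (n is taken modulo 32)."""
    n %= 32
    x &= MASK32
    return ((x >> n) | (x << (32 - n))) & MASK32


def rotl32_reference(x, n):
    """Direct left rotate, used only to check the identity."""
    n %= 32
    x &= MASK32
    return ((x << n) | (x >> (32 - n))) & MASK32


def rotl32_via_rotr(x, n):
    """Left rotate expressed with the right rotate only:
    rotl(x, n) == rotr(x, (32 - n) mod 32)."""
    return rotr32(x, (32 - n) % 32)
```

With this identity, a target that only has a right rotate can still serve a left rotate at the cost of computing the adjusted shift amount.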
Matt Arsenault 75865923c9 Add LLVMContext argument to getSetCCResultType
llvm-svn: 182180
2013-05-18 00:21:46 +00:00
Vincent Lejeune d3fcb5016c R600: Lower int_load_input to copyFromReg instead of Register node
This fixes a bug uncovered by the Dot4 patch, where the register class of
the int_load_input use was ignored.

llvm-svn: 182130
2013-05-17 16:51:06 +00:00
Vincent Lejeune 519f21eed3 R600: Relax some vector constraints on Dot4.
Dot4 now uses 8 scalar operands instead of 2 vector ones, which allows the
register coalescer to remove some unneeded COPYs.
This patch also defines some structures/functions that can be used to handle
every vector instruction (CUBE, Cayman special instructions...) in a similar
fashion.

llvm-svn: 182126
2013-05-17 16:50:32 +00:00
Vincent Lejeune d3eed66e8c R600: Improve texture handling
llvm-svn: 182125
2013-05-17 16:50:20 +00:00
Tom Stellard 3a7c34c778 R600: Expand SUB for v2i32/v4i32
Patch by: Aaron Watry

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Aaron Watry <awatry@gmail.com>

NOTE: This is a candidate for the 3.3 branch.
llvm-svn: 181579
2013-05-10 02:09:39 +00:00
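
Note on the "Expand ... for v2i32/v4i32" commits (3a7c34c778, 3deddc5079, 7fb3963498): a rough model of what expanding a vector operation means, assuming that Expand here results in one scalar operation per lane. The helper names are hypothetical, and this is a sketch of the idea rather than the legalizer's implementation.

```python
def wrap_i32(v):
    """Wrap a Python int to a signed 32-bit value (two's complement)."""
    return ((v + 2**31) % 2**32) - 2**31


def expand_vector_op(scalar_op, lhs, rhs):
    """Apply a scalar i32 operation lane by lane to two equal-length vectors."""
    if len(lhs) != len(rhs):
        raise ValueError("vector operands must have the same number of lanes")
    return [wrap_i32(scalar_op(a, b)) for a, b in zip(lhs, rhs)]


def scalar_sub(a, b):
    return a - b


def scalar_mul(a, b):
    return a * b


# A v4i32 SUB becomes four scalar SUBs, one per lane.
v4_result = expand_vector_op(scalar_sub, [1, 2, 3, 4], [4, 3, 2, 1])
```

The same helper covers v2i32 and v4i32, since only the number of lanes differs.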
Tom Stellard 3deddc5079 R600: Expand MUL for v4i32/v2i32
Fixes piglit test for OpenCL builtin mul24, and allows mad24 to run.

Patch by: Aaron Watry

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Aaron Watry <awatry@gmail.com>

NOTE: This is a candidate for the 3.3 branch.
llvm-svn: 181578
2013-05-10 02:09:34 +00:00
Tom Stellard 7fb3963498 R600: Expand SRA for v4i32/v2i32
v2: Add v4i32 test

Patch by: Aaron Watry

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Aaron Watry <awatry@gmail.com>

NOTE: This is a candidate for the 3.3 branch.
llvm-svn: 181577
2013-05-10 02:09:29 +00:00
Tom Stellard a99c6ae47a R600: Expand vselect for v4i32 and v2i32
v2: Add vselect v4i32 test

Patch by: Aaron Watry

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Aaron Watry <awatry@gmail.com>

NOTE: This is a candidate for the 3.3 branch.
llvm-svn: 181576
2013-05-10 02:09:24 +00:00
Tom Stellard 4489b85f2b R600: Expand vector or, shl, srl, and xor nodes
llvm-svn: 181035
2013-05-03 17:21:31 +00:00
Tom Stellard 87047f69ad R600: Initialize BooleanVectorContents
Fixes test/CodeGen/R600/setcc.ll

llvm-svn: 180231
2013-04-24 23:56:18 +00:00
Christian Konig 70a5032c1b R600/SI: add mulhu/mulhs patterns
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 178126
2013-03-27 09:12:51 +00:00
Michel Danzer a2e28156b4 R600: Use legacy (0 * anything = 0) MUL instructions for pow intrinsics
Fixes wrong lighting in some corner cases with r600g and radeonsi, e.g.
manifested by failure of two piglit/glean tests and intermittent black
patches in many apps.

Tested on SI and RS880.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62012 [radeonsi]
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=58150 [r600g]

NOTE: This is a candidate for the Mesa stable branch.

Reviewed-by: Christian König <christian.koenig@amd.com>
llvm-svn: 177730
2013-03-22 14:09:10 +00:00
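
Note on a2e28156b4: a small model of the difference between an IEEE multiply and the legacy rule named in the title (0 * anything = 0). The pow lowering shown, exp2(y * log2(x)), is an assumption made for illustration, as are all function names. The sketch only covers x >= 0.

```python
import math


def mul_ieee(a, b):
    """IEEE multiply: 0 * inf is NaN."""
    return a * b


def mul_legacy(a, b):
    """Legacy multiply as described in the commit title: 0 * anything = 0."""
    if a == 0.0 or b == 0.0:
        return 0.0
    return a * b


def log2_hw(x):
    """log2 that yields -inf for 0 instead of raising (assumed behavior)."""
    return -math.inf if x == 0.0 else math.log2(x)


def pow_via_exp_log(x, y, mul):
    """pow(x, y) modeled as exp2(mul(y, log2(x))); an assumed lowering."""
    return 2.0 ** mul(y, log2_hw(x))
```

Under this model, pow(0, 0) evaluates 0 * -inf. The IEEE multiply turns that into NaN, which then propagates through the exponent, while the legacy multiply yields 0 and the result is 1.0.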
Vincent Lejeune e5ecf10a02 R600: Fix JUMP handling so that MachineInstr verification can occur
This allows the R600 target to use the newly created -verify-misched llc flag.

llvm-svn: 176819
2013-03-11 18:15:06 +00:00
Tom Stellard 5e524897ed R600: Optimize another selectcc case
fold selectcc (selectcc x, y, a, b, cc), b, a, b, setne ->
     selectcc x, y, a, b, cc

Reviewed-by: Christian König <christian.koenig@amd.com>
llvm-svn: 176700
2013-03-08 15:37:11 +00:00
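
Note on 5e524897ed: the fold can be checked with a small model of selectcc, where selectcc(lhs, rhs, t, f, cc) yields t when cc(lhs, rhs) holds and f otherwise. The sketch assumes integer operands (no NaN handling), and the function names are hypothetical.

```python
import operator
from itertools import product


def selectcc(lhs, rhs, true_val, false_val, cc):
    """Model of selectcc: pick true_val if cc(lhs, rhs) holds, else false_val."""
    return true_val if cc(lhs, rhs) else false_val


def unfolded(x, y, a, b, cc):
    """selectcc (selectcc x, y, a, b, cc), b, a, b, setne"""
    return selectcc(selectcc(x, y, a, b, cc), b, a, b, operator.ne)


def folded(x, y, a, b, cc):
    """selectcc x, y, a, b, cc"""
    return selectcc(x, y, a, b, cc)


def fold_is_valid(values, conditions):
    """Exhaustively compare both forms over the given values and conditions."""
    return all(
        unfolded(x, y, a, b, cc) == folded(x, y, a, b, cc)
        for x, y, a, b in product(values, repeat=4)
        for cc in conditions
    )
```

The outer selectcc only tests whether the inner result differs from b. When the inner condition holds, the inner result is a, so the outer select returns a (or b, which is equal to a in the one case where the comparison fails). When it does not hold, the inner result is b and the outer select returns b as well.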
Tom Stellard 2add82de09 R600: Improve custom lowering of select_cc
Two changes:
1. Prefer SET* instructions when possible
2. Handle the CND*_INT case with floating-point args

Reviewed-by: Christian König <christian.koenig@amd.com>
llvm-svn: 176699
2013-03-08 15:37:09 +00:00
Tom Stellard 492ebeabe9 R600: Change operation action from Custom to Expand for BR_CC
Reviewed-by: Christian König <christian.koenig@amd.com>
llvm-svn: 176698
2013-03-08 15:37:07 +00:00
Tom Stellard e8f9f2877b R600: Change operation action from Custom to Expand for SETCC
Reviewed-by: Christian König <christian.koenig@amd.com>
llvm-svn: 176697
2013-03-08 15:37:05 +00:00
Tom Stellard b852af5dc4 R600: Set BooleanContents to ZeroOrNegativeOneBooleanContent
Reviewed-by: Christian König <christian.koenig@amd.com>
llvm-svn: 176696
2013-03-08 15:37:03 +00:00
Christian Konig 189357c6b2 R600/SI: remove SGPR address space v2
v2: fix R600 regressions

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 176624
2013-03-07 09:03:59 +00:00
Christian Konig 3625055b8c R600/SI: remove shader type intrinsic
Just encode the type as target specific attribute.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 176622
2013-03-07 09:03:46 +00:00
Vincent Lejeune 0b72f1021d R600: Remove LowerConstCopyPass and lower CONST_COPY right after ISel.
Keeping CONST_COPY instructions until pre-emit may prevent some ifcvt cases,
and taking them into account for scheduling is difficult for no real benefit.

llvm-svn: 176488
2013-03-05 15:04:55 +00:00
Vincent Lejeune 743dca0446 R600: Add support for indirect addressing of non default const buffer
NOTE: This is a candidate for the Mesa stable branch.
llvm-svn: 176484
2013-03-05 15:04:29 +00:00
Tom Stellard 8d469edbe3 R600: Fix scheduler crash caused by invalid MachinePointerInfo
Kernel function arguments are lowered to loads from the PARAM_I address
space.  When creating these load instructions, we were initializing
their MachinePointerInfo with an Argument object that was not attached
to any function.  This was causing the MachineScheduler to crash when
it tried to access the parent of the Argument.

This has been fixed by initializing the MachinePointerInfo with an
UndefValue instead.

NOTE: This is a candidate for the Mesa stable branch.
llvm-svn: 175517
2013-02-19 15:22:44 +00:00
Vincent Lejeune d80bc1561a R600: Fold zero/one in export instructions
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 175181
2013-02-14 16:55:06 +00:00
Tom Stellard e06163a9a6 R600: Add support for SET*_DX10 instructions
These instructions compare two floating point values and return an
integer true (-1) or false (0) value.

When compiling code generated by the Mesa GLSL frontend, the SET*_DX10
instructions save us four instructions for most branch decisions that
use floating-point comparisons.

llvm-svn: 174609
2013-02-07 14:02:35 +00:00
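
Note on e06163a9a6: a minimal model of the two result conventions. The commit message states that SET*_DX10 returns an integer -1 or 0. The sketch assumes that the plain SET* forms return a floating-point 1.0 or 0.0; all names are hypothetical.

```python
import operator


def set_float(cmp, a, b):
    """Plain SET* style (assumed): floating-point 1.0 for true, 0.0 for false."""
    return 1.0 if cmp(a, b) else 0.0


def set_dx10(cmp, a, b):
    """SET*_DX10 style: integer -1 (all bits set) for true, 0 for false."""
    return -1 if cmp(a, b) else 0


def as_u32(value):
    """View a signed integer result as its 32-bit pattern."""
    return value & 0xFFFFFFFF
```

An all-ones or all-zeros integer can be used directly as a branch condition or a mask, which matches the ZeroOrNegativeOneBooleanContent convention that b852af5dc4 later sets for the target.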
Tom Stellard f3b2a1e8b3 R600: Support for indirect addressing v4
Only implemented for R600 so far.  SI is missing implementations of a
few callbacks used by the Indirect Addressing pass and needs code to
handle frame indices.

At the moment R600 only supports array sizes of 16 dwords or less.
Register packing of vector types is currently disabled, which means that a
vec4 is stored in T0_X, T1_X, T2_X, T3_X, rather than T0_XYZW. In order
to correctly pack registers in all cases, we will need to implement an
analysis pass for R600 that determines the correct vector width for each
array.

v2:
  - Add support for i8 zext load from stack.
  - Coding style fixes

v3:
  - Don't reserve registers for indirect addressing when it isn't
    being used.
  - Fix bug caused by LLVM limiting the number of SubRegIndex
    declarations.

v4:
  - Fix 64-bit defines

llvm-svn: 174525
2013-02-06 17:32:29 +00:00
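
Note on f3b2a1e8b3: the difference between the unpacked layout described in the message (a vec4 in T0_X, T1_X, T2_X, T3_X) and a packed layout (T0_XYZW) can be sketched as a mapping from dword index to register name. The register-name strings follow the commit message; the functions and the base-register parameter are hypothetical.

```python
CHANNELS = "XYZW"


def unpacked_layout(base_reg, num_dwords):
    """One dword per register, always in the X channel (current behavior
    as described in the commit message)."""
    return [f"T{base_reg + i}_X" for i in range(num_dwords)]


def packed_layout(base_reg, num_dwords):
    """Four dwords per register, filling X, Y, Z, W in turn (the layout
    the message says would need an analysis pass to get right)."""
    return [
        f"T{base_reg + i // 4}_{CHANNELS[i % 4]}"
        for i in range(num_dwords)
    ]
```

For the 16-dword limit mentioned in the message, the unpacked layout touches 16 registers where a fully packed layout would touch 4.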
Jakob Stoklund Olesen fdc37670f6 Don't use MRI liveouts in R600.
Something very strange is going on with the output registers in this
target. Its ISelLowering code is inserting dangling CopyToReg nodes,
hoping that those physregs won't get clobbered before the RETURN.

This patch adds the output registers as implicit uses on RETURN
instructions in the custom emission pass. I'd much prefer to have those
CopyToReg nodes glued to the RETURNs, but I don't see how.

llvm-svn: 174400
2013-02-05 17:53:52 +00:00
Tom Stellard 41afe6a6fe R600: improve inputs/interpolation handling
Use one intrinsic for all sorts of interpolation.
Use two separate unexpanded instructions to represent INTERP_XY and _ZW;
this allows one part to be eliminated if it is not used.
Track liveness of special interpolation regs instead of reserving them;
this allows those regs to be reused, lowering reg pressure.

Patch By: Vadim Girlin

v2[Vincent Lejeune]: Rebased against current llvm master

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 174394
2013-02-05 17:09:14 +00:00
Tom Stellard dd04c83a4d R600: Consider bitcast when folding const_address node.
Patch by: Vincent Lejeune

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 174098
2013-01-31 22:11:53 +00:00
Tom Stellard 6f1b8657f9 R600: Add an llvm.R600.store.swizzle intrinsic
This intrinsic is translated to ALLOC_EXPORT_WORD1_SWIZ, hence its
name. It is used to store vs/fs outputs.

Patch by: Vincent Lejeune

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 173297
2013-01-23 21:39:49 +00:00
Tom Stellard d8ac91d436 R600: Simplify stream outputs intrinsic
Patch by: Vincent Lejeune

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 173296
2013-01-23 21:39:47 +00:00
Tom Stellard 365366f9ef R600: rework handling of the constants
Remove Cxxx registers, add new special register - "ALU_CONST" and new
operand for each alu src - "sel". ALU_CONST is used to designate that the
new operand contains the value to override src.sel, src.kc_bank, src.chan
for constants in the driver.

Patch by: Vadim Girlin

Vincent Lejeune:
  - Use pointers for constants
  - Fold CONST_ADDRESS when possible

Tom Stellard:
  - Give CONSTANT_BUFFER_0 its own address space
  - Use integer types for constant loads

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 173222
2013-01-23 02:09:06 +00:00
Chandler Carruth 9fb823bbd4 Move all of the header files which are involved in modelling the LLVM IR
into their new header subdirectory: include/llvm/IR. This matches the
directory structure of lib, and begins to correct a long standing point
of file layout clutter in LLVM.

There are still more header files to move here, but I wanted to handle
them in separate commits to make tracking what files make sense at each
layer easier.

The only really questionable files here are the target intrinsic
tablegen files. But that's a battle I'd rather not fight today.

I've updated both CMake and Makefile build systems (I think, and my
tests think, but I may have missed something).

I've also re-sorted the includes throughout the project. I'll be
committing updates to Clang, DragonEgg, and Polly momentarily.

llvm-svn: 171366
2013-01-02 11:36:10 +00:00
Chandler Carruth be81023d74 Resort the #include lines in include/... and lib/... with the
utils/sort_includes.py script.

Most of these are updating the new R600 target and fixing up a few
regressions that have crept in since the last time I sorted the
includes.

llvm-svn: 171362
2013-01-02 10:22:59 +00:00
Tom Stellard a8b0351720 R600: Expand vec4 INT <-> FP conversions
llvm-svn: 170901
2012-12-21 16:33:24 +00:00
Tom Stellard 6975d35979 Fix warnings with -DNDEBUG
Patch by: NAKAMURA Takumi

llvm-svn: 170142
2012-12-13 19:38:52 +00:00
Tom Stellard 75aadc2813 Add R600 backend
A new backend supporting AMD GPUs: Radeon HD2XXX - HD7XXX

llvm-svn: 169915
2012-12-11 21:25:42 +00:00