Commit Graph

507 Commits

Author SHA1 Message Date
Aaron Watry 00aeb119db R600/SI: Expand and of v2i32/v4i32 for SI
Also add lit test for both cases on SI, and v2i32 for evergreen.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 184837
2013-06-25 13:55:23 +00:00
Tom Stellard 0125f2a6e4 R600/SI: Report unaligned memory accesses as legal for > 32-bit types
In reality, some unaligned memory accesses are legal for 32-bit types and
smaller too, but it all depends on the address space.  Allowing
unaligned loads/stores for > 32-bit types is mainly to prevent the
legalizer from splitting one load into multiple loads of smaller types.

https://bugs.freedesktop.org/show_bug.cgi?id=65873

llvm-svn: 184822
2013-06-25 02:39:35 +00:00
Tom Stellard 9810ec613c R600: Add support for i32 loads from the constant address space on Cayman
Tested-By: Aaron Watry <awatry@gmail.com>
llvm-svn: 184821
2013-06-25 02:39:30 +00:00
Tom Stellard b06f3fc1be R600/SI: Add support for v4i32 and v4f32 kernel args
Tested-By: Aaron Watry <awatry@gmail.com>
llvm-svn: 184820
2013-06-25 02:39:25 +00:00
Tom Stellard 9d2e1500b4 R600: Fix typo in R600Schedule.td
This should only make a difference in programs that use a lot of the
vector ALU instructions like BFI_INT and BIT_ALIGN.  There is a slight
improvement in the phatk bitcoin mining kernel with this patch on
Evergreen (vector size == 1):

Before:
1173 Instruction Groups / 9520 dwords

After:
1167 Instruction Groups / 9510 dwords

Reviewed-by: Reviewed-by: Vincent Lejeune<vljn at ovi.com>
llvm-svn: 184819
2013-06-25 02:39:20 +00:00
Aaron Watry 52a72c926c R600: Fix spelling error in comment
our -> or

llvm-svn: 184756
2013-06-24 16:57:57 +00:00
Tom Stellard 96d38760fc R600/SI: Expand sub for v2i32 and v4i32 for SI
Also add a v2i32 test to the existing v4i32 test.

Patch by: Aaron Watry

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Aaron Watry<awatry@gmail.com>
llvm-svn: 184482
2013-06-20 21:55:37 +00:00
Tom Stellard 043795e818 R600/SI: Expand add for v2i32 and v4i32
Also add SI tests to existing file and a v2i32 test for both
R600 and SI.

Patch by: Aaron Watry

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 184481
2013-06-20 21:55:30 +00:00
Tom Stellard 6ec9e8043c R600: Expand v2i32 load/store instead of custom lowering
The custom lowering causes llc to crash with a segfault.

Ideally, the custom lowering can be fixed, but this allows
programs which load/store v2i32 to work without crashing.

Patch by: Aaron Watry

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Aaron Watry<awatry@gmail.com>
llvm-svn: 184480
2013-06-20 21:55:23 +00:00
Bill Wendling a3cd350249 Access the TargetLoweringInfo from the TargetMachine object instead of caching it. The TLI may change between functions. No functionality change.
llvm-svn: 184360
2013-06-19 21:36:55 +00:00
Matt Arsenault d46fce1141 Move StructurizeCFG out of R600 to generic Transforms.
Register it with PassManager

llvm-svn: 184343
2013-06-19 20:18:24 +00:00
Matt Arsenault 2aabb06175 Use GetUnderlyingObject instead of custom function
llvm-svn: 184261
2013-06-18 23:37:58 +00:00
Bill Wendling b7b1681157 Remove dead prototype.
llvm-svn: 184173
2013-06-18 06:24:14 +00:00
Vincent Lejeune 41d4cf26b4 R600: PV stores Reg id, not index
llvm-svn: 184117
2013-06-17 20:16:40 +00:00
Vincent Lejeune 8bd10421ec R600: Properly set COUNT_3 bit in TEX clause initiating inst for pre EG gen.
Fixes rv7x0 bug in Heaven reported here:
https://bugs.freedesktop.org/show_bug.cgi?id=64257

llvm-svn: 184116
2013-06-17 20:16:26 +00:00
Tom Stellard 371573448c R600: Add SI load support for v[24]i32 and store for v2i32
Also add a seperate vector lit test file, since r600 doesn't seem to handle
v2i32 load/store yet, but we can test both for SI.

Patch by: Aaron Watry

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 184021
2013-06-15 00:09:31 +00:00
Tom Stellard ecf9d86404 R600: Use correct encoding for Vertex Fetch instructions on Cayman
Reviewed-by: Vincent Lejeune<vljn at ovi.com>
llvm-svn: 184016
2013-06-14 22:12:30 +00:00
Tom Stellard 6aa0d5578d R600: Use EXPORT_RAT_INST_STORE_DWORD for stores on Cayman
We were using RAT_INST_STORE_RAW, which seemed to work, but the docs
say this instruction doesn't exist for Cayman, so it's probably safer
to use a documented instruction instead.

Reviewed-by: Vincent Lejeune<vljn at ovi.com>
llvm-svn: 184015
2013-06-14 22:12:24 +00:00
Tom Stellard d99b7932ae R600: Factor the instruction encoding out the RAT_WRITE_CACHELESS_eg class
Reviewed-by: Vincent Lejeune<vljn at ovi.com>
llvm-svn: 184014
2013-06-14 22:12:19 +00:00
Tom Stellard 3d0823f1cd R600: Move instruction encoding definitions into a separate .td file
Reviewed-by: Vincent Lejeune<vljn at ovi.com>
llvm-svn: 184013
2013-06-14 22:12:09 +00:00
Tom Stellard adba083bc2 R600: Don't try to fix reg class when copying IMPLICIT_DEF to a register
The test case for this is way too complex to be useful as a lit test,
and I was unable to reduce it.

https://bugs.freedesktop.org/show_bug.cgi?id=65438

llvm-svn: 183937
2013-06-13 20:14:00 +00:00
Benjamin Kramer 193960c822 R600: Make helper functions static.
llvm-svn: 183744
2013-06-11 13:32:25 +00:00
Vincent Lejeune d1a9d18120 R600: Use a refined heuristic to choose when switching clause
This is using a hint from AMD APP OpenCL Programming Guide with
empirically tweaked parameters.
I used Unigine Heaven 3.0 to determine best parameters on my system
(i7 2600/Radeon 6950/Kernel 3.9.4) the benchmark :
it went from 38.8 average fps to 39.6, which is ~3% gain.
(Lightmark 2008.2 gain is much more marginal: from 537 to 539)

There is no lit test provided as the parameter were determined
empirically and it it would be nearly impossiblet to find a test
program that check for optimal behavior.

llvm-svn: 183593
2013-06-07 23:30:34 +00:00
Vincent Lejeune 4d143328df R600: Anti dep better handled in tex clause
llvm-svn: 183592
2013-06-07 23:30:26 +00:00
Tom Stellard d74583777f R600: Fix calculation of stack offset in AMDGPUFrameLowering
We weren't computing structure size correctly and we were relying on
the original alloca instruction to compute the offset, which isn't
always reliable.

Reviewed-by: Vincent Lejeune <vljn@ovi.com>
llvm-svn: 183568
2013-06-07 20:52:05 +00:00
Tom Stellard a6c6e1bfc2 R600: Rework subtarget info and remove AMDILDevice classes
This should simplify the subtarget definitions and make it easier to
add new ones.

Reviewed-by: Vincent Lejeune <vljn@ovi.com>
llvm-svn: 183566
2013-06-07 20:37:48 +00:00
Bill Wendling 37e9adb091 Don't cache the instruction and register info from the TargetMachine, because
the internals of TargetMachine could change.

No functionality change intended.

llvm-svn: 183561
2013-06-07 20:28:55 +00:00
Tom Stellard 3498e4ff1d R600: Fix the fetch limits for R600 generation GPUs
Reviewed-by: Vincent Lejeune <vljn@ovi.com>

https://bugs.freedesktop.org/show_bug.cgi?id=64257

llvm-svn: 183560
2013-06-07 20:28:55 +00:00
Tom Stellard 99792774a4 R600: Move Subtarget feature definitions into AMDGPU.td
This is the convention used by the other targets.

Reviewed-by: Vincent Lejeune <vljn@ovi.com>
llvm-svn: 183559
2013-06-07 20:28:49 +00:00
Tom Stellard b0804ec2ad R600: Remove unnecessary include
Reviewed-by: Vincent Lejeune <vljn@ovi.com>
llvm-svn: 183558
2013-06-07 20:28:43 +00:00
Benjamin Kramer 705d841bb6 R600: Don't compare iterators of different maps.
Found be libstdc's debug mode.

llvm-svn: 183549
2013-06-07 19:59:34 +00:00
Benjamin Kramer ebe0be9ca4 Vincent says the element is at most once in the vector, so we don't need a full std::remove.
llvm-svn: 183541
2013-06-07 18:18:12 +00:00
Benjamin Kramer a857fe115b R600: Fix a potential iterator invalidation issue.
As a bonus this reduces the loop from O(n^2) to O(n).

llvm-svn: 183532
2013-06-07 16:13:49 +00:00
Vincent Lejeune 931bb768fd R600: Remove an extra break in R600OptimizeVectorRegisters.cpp
llvm-svn: 183528
2013-06-07 15:44:53 +00:00
Vincent Lejeune 0030362ed9 R600: Rewrite an awkward loop in R600MachineScheduler
llvm-svn: 183458
2013-06-06 23:08:32 +00:00
Vincent Lejeune 54476a1503 R600: Remove leftover code in R600MachineScheduler.cpp
Spotted by Benjamin Kramer.

llvm-svn: 183413
2013-06-06 14:18:29 +00:00
Bill Wendling b91216817f Cast to the correct type. Pointer, not reference.
llvm-svn: 183385
2013-06-06 05:39:29 +00:00
NAKAMURA Takumi 4a8f079371 R600OptimizeVectorRegisters.cpp: Tweak a warning. [-Wsometimes-uninitialized]
FIXME: Is it false alarm?
llvm-svn: 183371
2013-06-06 02:15:12 +00:00
NAKAMURA Takumi e5555fc238 R600OptimizeVectorRegisters.cpp: Suppress a warning. [-Wunused-variable]
llvm-svn: 183370
2013-06-06 02:15:06 +00:00
NAKAMURA Takumi 372574d447 Trailing linefeed.
llvm-svn: 183369
2013-06-06 02:15:00 +00:00
Bill Wendling e410576865 Cast to the proper type.
llvm-svn: 183365
2013-06-06 01:04:21 +00:00
Tom Stellard acec99c948 R600: Replace predicate loop with predicate function
llvm-svn: 183351
2013-06-05 23:39:50 +00:00
Vincent Lejeune dec1875207 R600: Add a pass that merge Vector Register
Previously commited @183279 but tests were failing, reverted @183286
It was broken because @183336 was missing, now it's there.

llvm-svn: 183343
2013-06-05 21:38:04 +00:00
Vincent Lejeune 4b5b849753 R600: Schedule copy from phys register at beginning of block
It allows regalloc pass to remove them by trivially assigning associated reg

llvm-svn: 183336
2013-06-05 20:27:35 +00:00
Tom Stellard aad5376fb6 R600: Make sure to schedule AR register uses and defs in the same clause
Reviewed-by: vljn at ovi.com
llvm-svn: 183294
2013-06-05 03:43:06 +00:00
Rafael Espindola beef23fe21 Revert "R600: Add a pass that merge Vector Register"
This reverts commit r183279. CodeGen/R600/texture-input-merge.ll was failing.

llvm-svn: 183286
2013-06-05 01:48:30 +00:00
Vincent Lejeune a45aafabfe R600: Add a pass that merge Vector Register
llvm-svn: 183279
2013-06-04 23:17:26 +00:00
Vincent Lejeune c689679173 R600: Const/Neg/Abs can be folded to dot4
llvm-svn: 183278
2013-06-04 23:17:15 +00:00
Vincent Lejeune 276ceb8d5f R600: Swizzle texture/export instructions
llvm-svn: 183229
2013-06-04 15:04:53 +00:00
Aaron Ballman 19978553d4 Silencing an MSVC warning about mixing bool and unsigned int.
llvm-svn: 183176
2013-06-04 01:03:03 +00:00
Tom Stellard 94593ee8c3 R600/SI: Add support for work item and work group intrinsics
llvm-svn: 183138
2013-06-03 17:40:18 +00:00
Tom Stellard ed882c2f1b R600/SI: Add a calling convention for compute shaders
llvm-svn: 183137
2013-06-03 17:40:11 +00:00
Tom Stellard 046039e81b R600/SI: Custom lower i64 sign_extend
llvm-svn: 183136
2013-06-03 17:40:03 +00:00
Tom Stellard 0518ff89ba R600/SI: Adjust some instructions' out register class after ISel
This is necessary to avoid generating VGPR to SGPR copies in some
cases.

llvm-svn: 183135
2013-06-03 17:39:58 +00:00
Tom Stellard bad1f59212 R600/SI: Handle REG_SEQUENCE in fitsRegClass()
llvm-svn: 183134
2013-06-03 17:39:54 +00:00
Tom Stellard b5a97004fb R600/SI: Handle nodes with glue results correctly SITargetLowering::foldOperands()
llvm-svn: 183133
2013-06-03 17:39:50 +00:00
Tom Stellard 2183b70523 R600/SI: Fixup CopyToReg register class in PostprocessISelDAG()
The CopyToReg nodes will sometimes try to copy a value from a VGPR to an
SGPR.  This kind of copy is not possible, so we need to detect
VGPR->SGPR copies and do something else.  The current strategy is to
replace these copies with VGPR->VGPR copies and hope that all the users
of CopyToReg can accept VGPRs as arguments.

llvm-svn: 183132
2013-06-03 17:39:46 +00:00
Tom Stellard 07a10a3d3f R600/SI: Add support for global loads
llvm-svn: 183131
2013-06-03 17:39:43 +00:00
Tom Stellard 556d9aa841 R600/SI: Rework MUBUF store instructions
The lowering of stores is now mostly handled in the tablegen files.  No
more BUFFER_STORE nodes I generated during legalization.

llvm-svn: 183130
2013-06-03 17:39:37 +00:00
Vincent Lejeune 91a942b93e R600: 3 op instructions have no write bit but the result are store in PV
llvm-svn: 183111
2013-06-03 15:56:12 +00:00
Vincent Lejeune eabf83e0a2 R600: CALL_FS consumes a stack size entry
llvm-svn: 183108
2013-06-03 15:44:42 +00:00
Vincent Lejeune f83df1f1cb R600: use capital letter for PV channel
llvm-svn: 183107
2013-06-03 15:44:35 +00:00
Vincent Lejeune a09873dda7 R600: Constraints input regs of interp_xy,_zw
llvm-svn: 183106
2013-06-03 15:44:16 +00:00
Ahmed Bougacha b1a4d9da3b Make SubRegIndex size mandatory, following r183020.
This also makes TableGen able to compute sizes/offsets of synthesized
indices representing tuples.

llvm-svn: 183061
2013-05-31 23:45:26 +00:00
Patrik Hagglund ae8faf2e9a Temporary fix to get rid of gcc warning.
llvm-svn: 182832
2013-05-29 07:32:08 +00:00
Andrew Trick ef9de2a739 Track IR ordering of SelectionDAG nodes 2/4.
Change SelectionDAG::getXXXNode() interfaces as well as call sites of
these functions to pass in SDLoc instead of DebugLoc.

llvm-svn: 182703
2013-05-25 02:42:55 +00:00
Tom Stellard 1b086cbcb8 R600: Fix R600ControlFlowFinalizer not considering VTX_READ 128 bit dst reg
Patch by: Vincent Lejeune

https://bugs.freedesktop.org/show_bug.cgi?id=64877

NOTE: This is a candidate for the 3.3 branch.
llvm-svn: 182600
2013-05-23 18:26:42 +00:00
Benjamin Kramer d78bb468bd Move passes from namespace llvm into anonymous namespaces. Sort includes while there.
llvm-svn: 182594
2013-05-23 17:10:37 +00:00
Benjamin Kramer 635e368e33 R600: Hide symbols of implementation details.
Also removes an unused function.

llvm-svn: 182587
2013-05-23 15:43:05 +00:00
Aaron Ballman 15f193a1a3 Setting the default value (fixes CRT assertions about uninitialized variable use when doing debug MSVC builds), and fixing coding style.
llvm-svn: 182585
2013-05-23 14:55:00 +00:00
Rafael Espindola 00345fa97b Fix 32 bit build in c++11 mode.
The error was:
error: non-constant-expression cannot be narrowed from type 'long long' to 'long' in initializer list [-Wc++11-narrowing]
        MI.getOperand(6).getImm() & 0x1F,

llvm-svn: 182584
2013-05-23 13:22:30 +00:00
Rafael Espindola 39aca620db Fix a leak on the r600 backend.
This should bring the valgrind bot back to life.

llvm-svn: 182561
2013-05-23 03:31:47 +00:00
Rafael Espindola bd6847fbea clang-format this file.
llvm-svn: 182560
2013-05-23 03:28:39 +00:00
Rafael Espindola e3d83fb8c3 Fix use after free (pr16103).
llvm-svn: 182482
2013-05-22 15:31:11 +00:00
Rafael Espindola ebd8e38849 Check that a function starts with llvm. before using GET_FUNCTION_RECOGNIZER.
Fixes a use of uninitialized memory found by asan and valgind.

llvm-svn: 182480
2013-05-22 14:57:42 +00:00
NAKAMURA Takumi 4f328e1c2f R600ISelLowering.cpp: Avoid "using namespace Intrinsic;" to appease MSC. Specify namespaces explicitly here.
MSC is confused about "memcpy" between <cstring> and llvm::Intrinsic::memcpy, when llvm::Intrinsic were exposed.

llvm-svn: 182452
2013-05-22 06:37:31 +00:00
NAKAMURA Takumi 18ca09c1cc R600: Whitespace and untabify.
llvm-svn: 182451
2013-05-22 06:37:25 +00:00
Owen Anderson 616852848a Create an FPOW SDNode opcode def in the target independent .td file rather than in a specific backend.
llvm-svn: 182450
2013-05-22 06:36:09 +00:00
Rafael Espindola 21ea01d132 Attempt to fix the mingw32 bot.
This should hopefully fix
http://lab.llvm.org:8011/builders/clang-x86_64-darwin11-self-mingw32

llvm-svn: 182446
2013-05-22 02:30:47 +00:00
Rafael Espindola 525cf28652 s/u_int32_t/uint32_t/
llvm-svn: 182444
2013-05-22 01:36:19 +00:00
Rafael Espindola f568827654 Fix warning in non-assert build.
llvm-svn: 182443
2013-05-22 01:29:38 +00:00
Benjamin Kramer 927ca942ce R600: Fix bug detected by GCC warning.
R600TextureIntrinsicsReplacer.cpp:232: warning: the address of ‘ArgsType’ will always evaluate as ‘true’

This doesn't have any effect on the output as a vararg intrinsic behaves the
same way as a non-vararg one.

llvm-svn: 182293
2013-05-20 15:58:43 +00:00
Tom Stellard f1ee716446 R600/SI: Use a multiclass for MUBUF_Load_Helper
This will simplify the instructions and also the pattern definitions.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 182288
2013-05-20 15:02:31 +00:00
Tom Stellard b8458f88d6 R600/SI: Add a pattern for S_LOAD_DWORDX2_* instructions
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 182287
2013-05-20 15:02:28 +00:00
Tom Stellard d2eebf001e R600/SI: Add pattern for rotr
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 182286
2013-05-20 15:02:24 +00:00
Tom Stellard 5643c4ac72 R600: Swap the legality of rotl and rotr
The hardware supports rotr and not rotl.

llvm-svn: 182285
2013-05-20 15:02:19 +00:00
Tom Stellard 1cfd7a50bb R600/SI: Add patterns for 64-bit shift operations
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 182284
2013-05-20 15:02:12 +00:00
Tom Stellard 459a79a81c R600/SI: Use the same names for VOP3 operands and encoding fields
This makes it possible to reorder the operands without breaking the
encoding.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 182283
2013-05-20 15:02:08 +00:00
Tom Stellard b35efba4d9 R600/SI: Make fitsRegClass() operands const
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 182282
2013-05-20 15:02:01 +00:00
Matt Arsenault 75865923c9 Add LLVMContext argument to getSetCCResultType
llvm-svn: 182180
2013-05-18 00:21:46 +00:00
Rafael Espindola 5986ce0e5d Fix the build in c++11 mode.
The errors were:

non-constant-expression cannot be narrowed from type 'int64_t' (aka 'long') to 'uint32_t' (aka 'unsigned int') in initializer list

and

non-constant-expression cannot be narrowed from type 'long' to 'uint32_t' (aka 'unsigned int') in initializer list

llvm-svn: 182168
2013-05-17 22:45:52 +00:00
Vincent Lejeune d3fcb5016c R600: Lower int_load_input to copyFromReg instead of Register node
It solves a bug uncovered by dot4 patch where the register class of
int_load_input use was ignored.

llvm-svn: 182130
2013-05-17 16:51:06 +00:00
Vincent Lejeune 3d5118ca40 R600: Use bottom up scheduling algorithm
llvm-svn: 182129
2013-05-17 16:50:56 +00:00
Vincent Lejeune 4c81d4da6f R600: Use depth first scheduling algorithm
It should increase PV substitution opportunities and lower gpr
usage (pending computations path are "flushed" sooner)

llvm-svn: 182128
2013-05-17 16:50:44 +00:00
Vincent Lejeune e958c8e0d8 R600: Replace big texture opcode switch in scheduler by usesTC/usesVC
llvm-svn: 182127
2013-05-17 16:50:37 +00:00
Vincent Lejeune 519f21eed3 R600: Relax some vector constraints on Dot4.
Dot4 now uses 8 scalar operands instead of 2 vectors one which allows register
coalescer to remove some unneeded COPY.
This patch also defines some structures/functions that can be used to handle
every vector instructions (CUBE, Cayman special instructions...) in a similar
fashion.

llvm-svn: 182126
2013-05-17 16:50:32 +00:00
Vincent Lejeune d3eed66e8c R600: Improve texture handling
llvm-svn: 182125
2013-05-17 16:50:20 +00:00
Vincent Lejeune 4ebef18ab5 R600: Rename 128 bit registers.
Almost all instructions that takes a 128 bits reg as input (fetch, export...)
have the abilities to swizzle their argument and output. Instead of printing
default swizzle for each 128 bits reg, rename T*.XYZW to T* and let instructions
print potentially optimized swizzles themselves.

llvm-svn: 182124
2013-05-17 16:50:09 +00:00
Vincent Lejeune 0fca91d52e R600: Some factorization
llvm-svn: 182123
2013-05-17 16:50:02 +00:00
Vincent Lejeune f9f4e1e7db R600: Factorize Fetch size limit inside AMDGPUSubTarget
llvm-svn: 182122
2013-05-17 16:49:55 +00:00
Vincent Lejeune 709e01688d R600: prettier dump of clamp
llvm-svn: 182121
2013-05-17 16:49:49 +00:00
Tom Stellard ecc2ad1cd4 R600: Fix encoding for R600 family GPUs
Reviewed-by: Vincent Lejeune <vljn@ovi.com>

https://bugs.freedesktop.org/show_bug.cgi?id=64193
https://bugs.freedesktop.org/show_bug.cgi?id=64257
https://bugs.freedesktop.org/show_bug.cgi?id=64320

NOTE: This is a candidate for the 3.3 branch.
llvm-svn: 182113
2013-05-17 15:23:21 +00:00
Tom Stellard edade94bbc R600: Pass MCSubtargetInfo reference to R600CodeEmitter
llvm-svn: 182112
2013-05-17 15:23:12 +00:00
Christian Konig b7be72df5b R600/SI: return undef instead of null for skipped arguments
This is a candidate for the stable branch.

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=64694

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 182084
2013-05-17 09:46:48 +00:00
Tom Stellard 1e21b53020 R600/SI: Add processor type for Hainan asic
Patch by: Alex Deucher

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

NOTE: This is a candidate for the 3.3 branch.
llvm-svn: 181792
2013-05-14 14:42:56 +00:00
Rafael Espindola b84cde5219 Remove unused fields and arguments.
llvm-svn: 181706
2013-05-13 14:34:48 +00:00
Rafael Espindola 227144c23c Remove the MachineMove class.
It was just a less powerful and more confusing version of
MCCFIInstruction. A side effect is that, since MCCFIInstruction uses
dwarf register numbers, calls to getDwarfRegNum are pushed out, which
should allow further simplifications.

I left the MachineModuleInfo::addFrameMove interface unchanged since
this patch was already fairly big.

llvm-svn: 181680
2013-05-13 01:16:13 +00:00
Rafael Espindola 86067ad6a9 Fix the R600 build.
llvm-svn: 181621
2013-05-10 18:31:42 +00:00
Tom Stellard 2b971eb0d0 R600: Remove AMDILPeeopholeOptimizer and replace optimizations with tablegen patterns
The BFE optimization was the only one we were actually using, and it was
emitting an intrinsic that we don't support.

https://bugs.freedesktop.org/show_bug.cgi?id=64201

Reviewed-by: Christian König <christian.koenig@amd.com>

NOTE: This is a candidate for the 3.3 branch.
llvm-svn: 181580
2013-05-10 02:09:45 +00:00
Tom Stellard 3a7c34c778 R600: Expand SUB for v2i32/v4i32
Patch by: Aaron Watry

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Aaron Watry <awatry@gmail.com>

NOTE: This is a candidate for the 3.3 branch.
llvm-svn: 181579
2013-05-10 02:09:39 +00:00
Tom Stellard 3deddc5079 R600: Expand MUL for v4i32/v2i32
Fixes piglit test for OpenCL builtin mul24, and allows mad24 to run.

Patch by: Aaron Watry

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Aaron Watry <awatry@gmail.com>

NOTE: This is a candidate for the 3.3 branch.
llvm-svn: 181578
2013-05-10 02:09:34 +00:00
Tom Stellard 7fb3963498 R600: Expand SRA for v4i32/v2i32
v2: Add v4i32 test

Patch by: Aaron Watry

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Aaron Watry <awatry@gmail.com>

NOTE: This is a candidate for the 3.3 branch.
llvm-svn: 181577
2013-05-10 02:09:29 +00:00
Tom Stellard a99c6ae47a R600: Expand vselect for v4i32 and v2i32
v2: Add vselect v4i32 test

Patch by: Aaron Watry

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Aaron Watry <awatry@gmail.com>

NOTE: This is a candidate for the 3.3 branch.
llvm-svn: 181576
2013-05-10 02:09:24 +00:00
Tom Stellard f787ef1d96 R600/SI: Add intrinsic for MIMG IMAGE_GET_RESINFO opcode
Patch by: Michel Dänzer

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 181269
2013-05-06 23:02:19 +00:00
Tom Stellard e363dbf7eb R600/SI: Handle arbitrary destination type in SITargetLowering::adjustWritemask
Patch by: Michel Dänzer

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 181268
2013-05-06 23:02:15 +00:00
Tom Stellard 353b336e8c R600/SI: Add intrinsic for texture image loading
Patch by: Michel Dänzer

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 181267
2013-05-06 23:02:12 +00:00
Tom Stellard c932d7329c R600/SI: Add pattern for uint_to_fp
Patch by: Michel Dänzer

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 181266
2013-05-06 23:02:07 +00:00
Tom Stellard cf6452c7d4 R600/SI: Add patterns for integer maxima / minima
Patch by: Michel Dänzer

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 181265
2013-05-06 23:02:04 +00:00
Tom Stellard 9b3d2535bf R600/SI: Add pattern for AMDGPU.trunc intrinsic
Patch by: Michel Dänzer

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 181263
2013-05-06 23:02:00 +00:00
Tom Stellard d93cede8e4 R600: Remove dead code from the CodeEmitter v2
v2:
  - Replace switch statement with TSFlags query

Reviewed-by: Vincent Lejeune <vljn@ovi.com>
Tested-By: Aaron Watry <awatry@gmail.com>
llvm-svn: 181229
2013-05-06 17:50:57 +00:00
Tom Stellard 043de4c5af R600: Emit config values in register / value pairs
Reviewed-by: Vincent Lejeune <vljn@ovi.com>
Tested-By: Aaron Watry <awatry@gmail.com>
llvm-svn: 181228
2013-05-06 17:50:51 +00:00
Tom Stellard cfe2ef8fea R600: Stop emitting the instruction type byte before each instruction
Reviewed-by: Vincent Lejeune <vljn@ovi.com>
Tested-By: Aaron Watry <awatry@gmail.com>
llvm-svn: 181225
2013-05-06 17:50:44 +00:00
Tom Stellard dbbcaf31b6 R600: Emit ISA for CALL_FS_* instructions
Reviewed-by: Vincent Lejeune <vljn@ovi.com>
Tested-By: Aaron Watry <awatry@gmail.com>
llvm-svn: 181223
2013-05-06 17:50:26 +00:00
Tom Stellard 4489b85f2b R600: Expand vector or, shl, srl, and xor nodes
llvm-svn: 181035
2013-05-03 17:21:31 +00:00
Tom Stellard 6a6ecedcb7 R600: BFI_INT is a vector-only instruction
llvm-svn: 181034
2013-05-03 17:21:24 +00:00
Tom Stellard eac65dde30 R600: Add pattern for SHA-256 Ma function
This can be optimized using the BFI_INT instruction.

llvm-svn: 181033
2013-05-03 17:21:20 +00:00
Tom Stellard c2516c6e40 R600: Clean up comments in Processors.td
llvm-svn: 181032
2013-05-03 17:21:14 +00:00
Vincent Lejeune ddd43383ef R600: Signed literals are 64bits wide
llvm-svn: 180960
2013-05-02 21:53:03 +00:00
Vincent Lejeune 2a44ae0053 R600: If previous bundle is dot4, PV valid chan is always X
llvm-svn: 180959
2013-05-02 21:52:55 +00:00
Vincent Lejeune b0422e24a9 R600: Improve asmPrint of ALU clause
llvm-svn: 180957
2013-05-02 21:52:40 +00:00
Vincent Lejeune f97af796a9 R600: Prettier asmPrint of Alu
llvm-svn: 180956
2013-05-02 21:52:30 +00:00
Tom Stellard 40b7f1f6c3 R600: Use new tablegen syntax for patterns
All but two patterns have been converted to the new syntax.  The
remaining two patterns will require COPY_TO_REGCLASS instructions, which
the VLIW DAG Scheduler cannot handle.

llvm-svn: 180922
2013-05-02 15:30:12 +00:00
Tom Stellard 5447ae20ff R600/SI: remove nonsense select pattern
Fortunately this pattern never matched, otherwise
we would have generated incorrect code.

Signed-off-by: Christian K??nig <christian.koenig@amd.com>
llvm-svn: 180921
2013-05-02 15:30:07 +00:00
Vincent Lejeune 3a8d78a2c3 R600: Always use texture cache for compute shaders
This will improve the performance of memory reads.

llvm-svn: 180762
2013-04-30 00:14:44 +00:00
Vincent Lejeune 3abdbf1cad R600: use native for alu
llvm-svn: 180761
2013-04-30 00:14:38 +00:00
Vincent Lejeune 147700b8b4 R600: Packetize instructions
llvm-svn: 180760
2013-04-30 00:14:27 +00:00
Vincent Lejeune 076c0b28e3 R600: Rework Scheduling to handle difference between VLIW4 and VLIW5 chips
llvm-svn: 180759
2013-04-30 00:14:17 +00:00
Vincent Lejeune 22c4248213 R600: Add a Bank Swizzle operand
llvm-svn: 180758
2013-04-30 00:14:08 +00:00
Vincent Lejeune 7c395f77de R600: Take inner dependency into tex/vtx clauses
llvm-svn: 180757
2013-04-30 00:14:00 +00:00
Vincent Lejeune 3f1d136b02 R600: Turn TEX/VTX into native instructions
llvm-svn: 180756
2013-04-30 00:13:53 +00:00
Vincent Lejeune c299164284 R600: Add FetchInst bit to instruction defs to denote vertex/tex instructions
v2[Vincent Lejeune]: Split FetchInst into usesTextureCache/usesVertexCache

llvm-svn: 180755
2013-04-30 00:13:39 +00:00
Vincent Lejeune 7d820c0bef R600: Add some new processor variants
llvm-svn: 180753
2013-04-30 00:13:27 +00:00
Vincent Lejeune f501ea298b R600: Clean up instruction class definitions
llvm-svn: 180752
2013-04-30 00:13:20 +00:00
Vincent Lejeune 4a0beb5207 R600: config section now reports use of killgt
llvm-svn: 180751
2013-04-30 00:13:13 +00:00
Tom Stellard 119ad03c67 R600: Use correct CF_END instruction on Northern Island GPUs
llvm-svn: 180735
2013-04-29 22:23:58 +00:00
Tom Stellard 8367067e02 R600: Fix encoding of CF_END_{EG, R600} instructions
The EOP bit was not being encoded.

llvm-svn: 180734
2013-04-29 22:23:54 +00:00
Tom Stellard 456adc6c4e R600: Initialize AMDGPUMachineFunction::ShaderType to ShaderType::COMPUTE
We need to intialize this to something and since clang does not set
the shader type attribute and clang is used only for compute shaders,
initializing it to COMPUTE seems like the best choice.

Reviewed-by: Christian König <christian.koenig@amd.com>
llvm-svn: 180620
2013-04-26 18:32:24 +00:00
Tom Stellard 87047f69ad R600: Initialize BooleanVectorContents
Fixes test/CodeGen/R600/setcc.ll

llvm-svn: 180231
2013-04-24 23:56:18 +00:00
Tom Stellard 34e4068d05 R600: Use SHT_PROGBITS for the .AMDGPU.config section
The libelf implementation that is distributed here:
http://www.mr511.de/software/english.html
will not parse sections that are marked SHT_NULL.

llvm-svn: 180230
2013-04-24 23:56:14 +00:00
Vincent Lejeune 117f075f6e R600: Use .AMDGPU.config section to emit stacksize
llvm-svn: 180124
2013-04-23 17:34:12 +00:00
Vincent Lejeune b6bfe85a07 R600: Add CF_END
llvm-svn: 180123
2013-04-23 17:34:00 +00:00
Matt Arsenault 034ca0fe41 Remove unused DwarfSectionOffsetDirective string
The value isn't actually used, and setting it emits a COFF specific
directive.

llvm-svn: 180064
2013-04-22 22:49:11 +00:00
Michael Liao b53d8963ce ArrayRefize getMachineNode(). No functionality change.
llvm-svn: 179901
2013-04-19 22:22:57 +00:00
Tom Stellard 9d10c4ce86 R600: Add pattern for the BFI_INT instruction
llvm-svn: 179830
2013-04-19 02:11:06 +00:00
Tom Stellard ea977bc0e3 R600/SI: Use InstFlag for VOP3 modifier operands
InstFlag has a default value of 0 and will simplify the VOP3 patterns.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 179829
2013-04-19 02:11:00 +00:00
Vincent Lejeune 2d5c341cee R600: Make Export Instruction not duplicable
llvm-svn: 179686
2013-04-17 15:17:39 +00:00
Vincent Lejeune 218093e834 R600: Export is emitted as a CF_NATIVE inst
llvm-svn: 179685
2013-04-17 15:17:32 +00:00
Vincent Lejeune 98a7380859 R600: Emit used GPRs count
llvm-svn: 179684
2013-04-17 15:17:25 +00:00
Tom Stellard cb97e3acfa R600/SI: Emit config values in register value pairs.
Instead of emitting config values in a predefined order, the code
emitter will now emit a 32-bit register index followed by the 32-bit
config value.

llvm-svn: 179546
2013-04-15 17:51:35 +00:00
Tom Stellard 3a7beafb32 R600/SI: Emit configuration value in the .AMDGPU.config ELF section
llvm-svn: 179545
2013-04-15 17:51:30 +00:00
Tom Stellard 9991659fab R600: Emit ELF formatted code rather than raw ISA.
llvm-svn: 179544
2013-04-15 17:51:21 +00:00
NAKAMURA Takumi 3ee2b1e26f R600ControlFlowFinalizer.cpp: Fix a warning. [-Wunused-variable]
llvm-svn: 179263
2013-04-11 04:16:27 +00:00
NAKAMURA Takumi 3b0853be56 Whitespace.
llvm-svn: 179262
2013-04-11 04:16:22 +00:00
Michel Danzer 8caa904bde R600/SI: Add pattern for AMDGPUurecip
21 more little piglits with radeonsi.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 179186
2013-04-10 17:17:56 +00:00
Vincent Lejeune 04d9aa4822 R600: Add VTX_READ_* and RAT_WRITE_CACHELESS_* when computing cf addr
llvm-svn: 179174
2013-04-10 13:29:20 +00:00
Christian Konig 8b1ed28ef1 R600/SI: dynamical figure out the reg class of MIMG
Depending on the number of bits set in the writemask.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 179166
2013-04-10 08:39:16 +00:00
Christian Konig 8e06e2a8c4 R600/SI: adjust writemask to only the used components
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 179165
2013-04-10 08:39:08 +00:00
Christian Konig 4ace663255 R600/SI: remove image sample writemask
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 179164
2013-04-10 08:39:01 +00:00
Vincent Lejeune 5f11dd390a R600: Control Flow support for pre EG gen
llvm-svn: 179020
2013-04-08 13:05:49 +00:00
Tom Stellard 754f80ff3a R600/SI: Add support for buffer stores v2
v2:
  - Use the ADDR64 bit

Reviewed-by: Christian König <christian.koenig@amd.com>
llvm-svn: 178931
2013-04-05 23:31:51 +00:00
Tom Stellard 6db08eb42f R600/SI: Use same names for corresponding MUBUF operands and encoding fields
The code emitter knows how to encode operands whose name matches one of
the encoding fields.  If there is no match, the code emitter relies on
the order of the operand and field definitions to determine how operands
should be encoding.  Matching by order makes it easy to accidentally break
the instruction encodings, so we prefer to match by name.

Reviewed-by: Christian König <christian.koenig@amd.com>
llvm-svn: 178930
2013-04-05 23:31:44 +00:00
Tom Stellard 60174bb9ca R600: Add RV670 processor
This is an R600 GPU with double support.

Reviewed-by: Christian König <christian.koenig@amd.com>
llvm-svn: 178929
2013-04-05 23:31:40 +00:00
Tom Stellard 2f21c7e551 R600/SI: Add processor types for each SI variant
Reviewed-by: Christian König <christian.koenig@amd.com>
llvm-svn: 178928
2013-04-05 23:31:35 +00:00
Tom Stellard edbf1eb42b R600/SI: Avoid generating S_MOVs with 64-bit immediates v2
SITargetLowering::analyzeImmediate() was converting the 64-bit values
to 32-bit and then checking if they were an inline immediate.  Some
of these conversions caused this check to succeed and produced
S_MOV instructions with 64-bit immediates, which are illegal.

v2:
  - Clean up logic

Reviewed-by: Christian König <christian.koenig@amd.com>
llvm-svn: 178927
2013-04-05 23:31:20 +00:00
Vincent Lejeune bcbb13d691 R600: Use a mask for offsets when encoding instructions
llvm-svn: 178763
2013-04-04 14:00:09 +00:00
Vincent Lejeune 8e377fdba6 R600: Fix wrong address when substituting ENDIF
llvm-svn: 178762
2013-04-04 14:00:03 +00:00
Vincent Lejeune c44fa99719 R600: Take export into account when computing cf address
llvm-svn: 178761
2013-04-04 13:59:59 +00:00
Vincent Lejeune c3d3f9b66e R600: Fix last ALU of a clause being emitted in a separate clause
llvm-svn: 178675
2013-04-03 18:24:47 +00:00
Vincent Lejeune 80031d9fc4 R600: Factorize maximum alu per clause in a single location
llvm-svn: 178667
2013-04-03 16:49:34 +00:00
Vincent Lejeune b6d6c0d458 R600: Simplify data structure and add DEBUG to R600ControlFlowFinalizer
llvm-svn: 178665
2013-04-03 16:24:09 +00:00
Vincent Lejeune 9931298b30 R600: Consider KILLGT as an ALU instruction
Mesa does not override llvm behavior wrt KILLGT anymore so llvm
has to handle KILLGT on its own.

llvm-svn: 178664
2013-04-03 16:24:04 +00:00
NAKAMURA Takumi fd98f7f2b6 Target/R600: Fix CMake build to add missing files.
llvm-svn: 178508
2013-04-01 22:05:58 +00:00
Vincent Lejeune bfaa63a6db R600: Add support for native control flow
llvm-svn: 178505
2013-04-01 21:48:05 +00:00
Vincent Lejeune ace6f7351e R600/SI: Share code recording ShaderTypeAttribute between generations
llvm-svn: 178504
2013-04-01 21:47:53 +00:00
Vincent Lejeune f43bc57b66 R600: Emit CF_ALU and use true kcache register.
llvm-svn: 178503
2013-04-01 21:47:42 +00:00
Vincent Lejeune 53f3525d35 R600: Emit native instructions for tex
llvm-svn: 178452
2013-03-31 19:33:04 +00:00
Eric Christopher 6c75232cf0 These two are default in the constructor for MCAsmInfo.
llvm-svn: 178293
2013-03-28 21:37:18 +00:00
Christian Konig 08f5929942 R600/SI: add SETO/SETUO patterns
6 more piglit tests.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 178145
2013-03-27 15:27:31 +00:00
Christian Konig 3c14580acb R600/SI: add cummuting of rev instructions
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 178127
2013-03-27 09:12:59 +00:00
Christian Konig 70a5032c1b R600/SI: add mulhu/mulhs patterns
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 178126
2013-03-27 09:12:51 +00:00
Christian Konig 20a7e6b764 R600/SI: add srl/sha patterns for SI
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 178125
2013-03-27 09:12:44 +00:00
NAKAMURA Takumi 3234178bf9 R600/SIMCCodeEmitter.cpp: Prune a couple of unused members, STI and Ctx. [-Wunused-private-field]
llvm-svn: 178065
2013-03-26 19:42:48 +00:00
Christian Konig 8370dbbffd R600/SI: improve post ISel folding
Not only fold immediates, but avoid unnecessary copies as well.

Signed-off-by: Christian König <christian.koenig@amd.com>
llvm-svn: 178024
2013-03-26 14:04:17 +00:00
Christian Konig 082c661f94 R600/SI: improve vector interpolation
Prevent loading M0 multiple times.

Signed-off-by: Christian König <christian.koenig@amd.com>
llvm-svn: 178023
2013-03-26 14:04:12 +00:00
Christian Konig 25ce3e9f4c R600/SI: avoid unecessary subreg extraction in IMAGE_SAMPLE
Just define the address as unknown instead of VReg_32.

Signed-off-by: Christian König <christian.koenig@amd.com>
llvm-svn: 178022
2013-03-26 14:04:07 +00:00
Christian Konig eecebd0bab R600/SI: switch back to RegPressure scheduling
Signed-off-by: Christian König <christian.koenig@amd.com>
llvm-svn: 178021
2013-03-26 14:04:02 +00:00
Christian Konig 727d06de1d R600/SI: mark most intrinsics as readnone v2
They read from constant register space anyway.

v2: fix lit tests

Signed-off-by: Christian König <christian.koenig@amd.com>
llvm-svn: 178020
2013-03-26 14:03:57 +00:00
Christian Konig 737d4a1665 R600/SI: replace WQM intrinsic
Just enable WQM when we see an LDS interpolation instruction.

Signed-off-by: Christian König <christian.koenig@amd.com>
llvm-svn: 178019
2013-03-26 14:03:50 +00:00
Christian Konig 6a9d390b6b R600/SI: fix ELSE pseudo op handling
Restore the EXEC mask early, otherwise a copy might end up not beeing executed.

Candidate for the mesa stable branch.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 178018
2013-03-26 14:03:44 +00:00
Christian Konig 90b45124cd R600: fix DenseMap with pointer key iteration in the structurizer
Use a MapVector on types where the iteration order matters.
Otherwise we doesn't always produce a deterministic output.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 177999
2013-03-26 10:24:20 +00:00