* Added R600_Reg64 class
* Added T#Index#.XY registers definition
* Added v2i32 register reads from parameter and global space
* Added f32 and i32 elements extraction from v2f32 and v2i32
* Added v2i32 -> v2f32 conversions
Tom Stellard:
- Mark vec2 operations as expand. The addition of a vec2 register
class made them all legal.
Patch by: Dmitry Cherkassov
Signed-off-by: Dmitry Cherkassov <dcherkassov@gmail.com>
llvm-svn: 187582
We were using two instructions for similar purpose : break and
predicated break. Only predicated_break was emitted and it was
lowered at R600ControlFlowFinalizer to JUMP;CF_BREAK;POP.
This commit simplify the situation by making AMDILCFGStructurizer
emit IF_PREDICATE;BREAK;ENDIF; instead of predicated_break (which
is now removed).
There is no functionality change.
llvm-svn: 187510
These are really the same address space in hardware. The only
difference is that CONSTANT_ADDRESS uses a special cache for faster
access. When we are unable to use the constant kcache for some reason
(e.g. smaller types or lack of indirect addressing) then the instruction
selector must use GLOBAL_ADDRESS loads instead.
llvm-svn: 187006
Test is not included as it is several 1000 lines long.
To test this functionnality, a test case must generate at least 2 ALU clauses,
where an ALU clause is ~110 instructions long.
NOTE: This is a candidate for the stable branch.
llvm-svn: 185943
We were using RAT_INST_STORE_RAW, which seemed to work, but the docs
say this instruction doesn't exist for Cayman, so it's probably safer
to use a documented instruction instead.
Reviewed-by: Vincent Lejeune<vljn at ovi.com>
llvm-svn: 184015
Dot4 now uses 8 scalar operands instead of 2 vectors one which allows register
coalescer to remove some unneeded COPY.
This patch also defines some structures/functions that can be used to handle
every vector instructions (CUBE, Cayman special instructions...) in a similar
fashion.
llvm-svn: 182126
Almost all instructions that takes a 128 bits reg as input (fetch, export...)
have the abilities to swizzle their argument and output. Instead of printing
default swizzle for each 128 bits reg, rename T*.XYZW to T* and let instructions
print potentially optimized swizzles themselves.
llvm-svn: 182124
The BFE optimization was the only one we were actually using, and it was
emitting an intrinsic that we don't support.
https://bugs.freedesktop.org/show_bug.cgi?id=64201
Reviewed-by: Christian König <christian.koenig@amd.com>
NOTE: This is a candidate for the 3.3 branch.
llvm-svn: 181580
All but two patterns have been converted to the new syntax. The
remaining two patterns will require COPY_TO_REGCLASS instructions, which
the VLIW DAG Scheduler cannot handle.
llvm-svn: 180922