llvm-project

Commit Graph

Author	SHA1	Message	Date
Tom Stellard	11624bc577	R600/SI: Add a MUBUF load pattern for Reg+Imm offsets llvm-svn: 200933	2014-02-06 18:36:38 +00:00
Tom Stellard	044e418f15	R600/SI: Use immediate offsets for SMRD instructions whenever possible There was a problem with the old pattern, so we were copying some larger immediates into registers when we could have been encoding them in the instruction. llvm-svn: 200932	2014-02-06 18:36:34 +00:00
Matt Arsenault	25793a3f22	Add address space argument to allowsUnalignedMemoryAccess. On R600, some address spaces have more strict alignment requirements than others. llvm-svn: 200887	2014-02-05 23:15:53 +00:00
Michel Danzer	5d26fdfcba	R600/SI: Add pattern for zero-extending i1 to i32 Fixes opencl-example if_* tests with radeonsi. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74469 Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 200830	2014-02-05 09:48:05 +00:00
Duncan P. N. Exon Smith	8e661efc00	cleanup: scc_iterator consumers should use isAtEnd No functional change. Updated loops from: for (I = scc_begin(), E = scc_end(); I != E; ++I) to: for (I = scc_begin(); !I.isAtEnd(); ++I) for teh win. llvm-svn: 200789	2014-02-04 19:19:07 +00:00
Rafael Espindola	7cbbd28c67	Every target uses .align. Simplify. llvm-svn: 200782	2014-02-04 18:39:51 +00:00
Tom Stellard	aeb456438c	R600/SI: Expand i1 BR_CC This fixes crashes in the OpenCV test suite and also the scrypt kernel in bfgminer. I was unable to come up with a reduced test case for this. https://bugs.freedesktop.org/show_bug.cgi?id=72785 llvm-svn: 200776	2014-02-04 17:18:43 +00:00
Tom Stellard	b8725d84d6	R600/SI: Don't assume copies will be coalesced in SIFixSGPRCopies There is no lit test for this, because it would be too big and complicated, but it does fix a crash in the Arithm/Absdiff.* OpenCV test. llvm-svn: 200775	2014-02-04 17:18:42 +00:00
Tom Stellard	0ec134f3d6	R600/SI: Custom lower i64 ISD::SELECT llvm-svn: 200774	2014-02-04 17:18:40 +00:00
Tom Stellard	bfebd1fc7e	R600: Enable vector fpow. The OpenCL specs say: "The vector versions of the math functions operate component-wise. The description is per-component." Patch by: Jan Vesely Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 200773	2014-02-04 17:18:37 +00:00
Michel Danzer	624b02aa67	R600/SI: Fix fneg for 0.0 V_ADD_F32 with source modifier does not produce -0.0 for this. Just manipulate the sign bit directly instead. Also add a pattern for (fneg (fabs ...)). Fixes a bunch of bit encoding piglit tests with radeonsi. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 200743	2014-02-04 07:12:38 +00:00
Matt Arsenault	d5ab971b54	Add DEBUG_TYPE to SIAnnotateControlFlow llvm-svn: 200720	2014-02-03 22:58:05 +00:00
Matt Arsenault	f5958dded4	R600/SI: Fix insertelement with dynamic indices. This didn't work for any integer vectors, and didn't work with some sizes of float vectors. This should now work with all sizes of float and i32 vectors. llvm-svn: 200619	2014-02-02 00:05:35 +00:00
Rafael Espindola	277f9061fc	Remove the last hasRawTextSupport call from R600. There is nothing wrong with printing the disassembly section when printing text. A hypothetical assembler would then produce a .o just like our direct object emission produces. llvm-svn: 200583	2014-01-31 22:14:06 +00:00
Rafael Espindola	887541fe27	Replace another use with hasRawTextSupport+EmitRawText with emitRawComment. llvm-svn: 200582	2014-01-31 22:08:19 +00:00
Rafael Espindola	19656ba7ea	Use emitRawComment to avoid a call to hasRawTextSupport. llvm-svn: 200581	2014-01-31 21:54:49 +00:00
David Woodhouse	d2cca113df	Delete MCSubtargetInfo data members from target MCCodeEmitter classes The subtarget info is explicitly passed to the EncodeInstruction method and we should use that subtarget info to influence any encoding decisions. llvm-svn: 200350	2014-01-28 23:13:25 +00:00
David Woodhouse	3fa98a65e9	Propagate MCSubtargetInfo through TableGen's getBinaryCodeForInstr() llvm-svn: 200349	2014-01-28 23:13:18 +00:00
David Woodhouse	9784cef38d	Explicitly pass MCSubtargetInfo to MCCodeEmitter::EncodeInstruction() llvm-svn: 200348	2014-01-28 23:13:07 +00:00
David Woodhouse	e6c13e4abd	Change MCStreamer EmitInstruction interface to take subtarget info llvm-svn: 200345	2014-01-28 23:12:42 +00:00
Michel Danzer	bf1a641060	R600/SI: Add pattern for truncating i32 to i1 Fixes half a dozen piglit tests with radeonsi. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 200283	2014-01-28 03:01:16 +00:00
Michel Danzer	13736221e3	R600/SI: Add intrinsic for BUFFER_LOAD_DWORD* instructions Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 200196	2014-01-27 07:20:51 +00:00
Michel Danzer	6064f57ae8	R600/SI: Add intrinsic for S_SENDMSG instruction Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 200195	2014-01-27 07:20:44 +00:00
Rafael Espindola	98f5b54f85	Add back spaces I missed in the conversion to emitRawComments. Sorry about that. llvm-svn: 200171	2014-01-27 00:19:41 +00:00
Rafael Espindola	bcf890bf07	Use emitRawComment instead of EmitRawText. llvm-svn: 200170	2014-01-27 00:16:00 +00:00
Rafael Espindola	e41383f899	Pass a MCSubtargetInfo down to the TargetStreamer creation. With this the target streamers will be able to know the target features that are in use. llvm-svn: 200135	2014-01-26 06:38:58 +00:00
Rafael Espindola	24ea09ef7d	Construct the MCStreamer before constructing the MCTargetStreamer. This has a few advantages: * Only targets that use a MCTargetStreamer have to worry about it. * There is never a MCTargetStreamer without a MCStreamer, so we can use a reference. * A MCTargetStreamer can talk to the MCStreamer in its constructor. llvm-svn: 200129	2014-01-26 06:06:37 +00:00
Juergen Ributzka	3e752e7af9	Add final and override keywords to TargetTransformInfo's subclasses. llvm-svn: 200021	2014-01-24 18:22:59 +00:00
Alp Toker	cb40291100	Fix known typos Sweep the codebase for common typos. Includes some changes to visible function names that were misspelt. llvm-svn: 200018	2014-01-24 17:20:08 +00:00
Tom Stellard	a64353e5bd	R600: Remove successive JUMP in AnalyzeBranch when AllowModify is true This fixes a crash in the OpenCV OpenCL test suite. There is no lit test for this, because the test would be very large and could easily be invalidated by changes to the scheduler or other parts of the compiler. Patch by: Vincent Lejeune llvm-svn: 199919	2014-01-23 18:49:34 +00:00
Tom Stellard	a2a4b8ee2f	R600: Disable the BFE pattern This pattern uses an SDNodeXForm, which isn't being emitted for some reason. I can get it to work by attaching the PatLeaf that has the XForm to the argument in the output pattern, but this results in an immediate being used in a register operand, which the backend can't handle yet. llvm-svn: 199918	2014-01-23 18:49:33 +00:00
Tom Stellard	805890b252	R600: Correctly handle vertex fetch clauses that precede ENDIFs The control flow finalizer would sometimes use an ALU_POP_AFTER instruction before the vertex fetch clause instead of using a POP instruction after it. llvm-svn: 199917	2014-01-23 18:49:31 +00:00
Tom Stellard	8cce9bdf17	R600: Unconditionally unroll loops that contain GEPs with alloca pointers Implement the getUnrollingPreferences() function for AMDGPUTargetTransformInfo so that loops that do address calculations on pointers derived from alloca are unconditionally unrolled. Unrolling these loops makes it more likely that SROA will be able to eliminate the allocas, which is a big win for R600 since memory allocated by alloca (private memory) is really slow. llvm-svn: 199916	2014-01-23 18:49:28 +00:00
Tom Stellard	348273df97	R600: Recommit 199842: Add work-around for the CF stack entry HW bug The unit test is now disabled on non-asserts builds. The CF stack can be corrupted if you use CF_ALU_PUSH_BEFORE, CF_ALU_ELSE_AFTER, CF_ALU_BREAK, or CF_ALU_CONTINUE when the number of sub-entries on the stack is greater than or equal to the stack entry size and sub-entries modulo 4 is either 0 or 3 (on cedar the bug is present when the number of sub-entries modulo 8 is either 7 or 0). We choose to be conservative and always apply the work-around when the number of sub-entries is greater than or equal to the stack entry size, so that we can safely over-allocate the stack when we are unsure of the stack allocation rules. reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 199905	2014-01-23 16:18:02 +00:00
Tom Stellard	31e16388d7	Revert "R600: Add work-around for the CF stack entry HW bug" This reverts commit 35b8331cad6eb512a2506adbc394201181da94ba. The -debug-only flag for llc doesn't appear to be available in all build configurations. llvm-svn: 199845	2014-01-22 22:20:54 +00:00
Tom Stellard	e89373e062	R600: Add work-around for the CF stack entry HW bug The CF stack can be corrupted if you use CF_ALU_PUSH_BEFORE, CF_ALU_ELSE_AFTER, CF_ALU_BREAK, or CF_ALU_CONTINUE when the number of sub-entries on the stack is greater than or equal to the stack entry size and sub-entries modulo 4 is either 0 or 3 (on cedar the bug is present when the number of sub-entries modulo 8 is either 7 or 0). We choose to be conservative and always apply the work-around when the number of sub-entries is greater than or equal to the stack entry size, so that we can safely over-allocate the stack when we are unsure of the stack allocation rules. reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 199842	2014-01-22 21:55:46 +00:00
Tom Stellard	59ed4794c4	R600: Add some missing CF instruction definitions to the .td files. reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 199841	2014-01-22 21:55:44 +00:00
Tom Stellard	a40f97154b	R600: Refactor stack size calculation reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 199840	2014-01-22 21:55:43 +00:00
Tom Stellard	afbb697e0b	R600: CF_PUSH is the same on Evergreen and Cayman reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 199839	2014-01-22 21:55:41 +00:00
Tom Stellard	8c347b024e	R600: Add wavefront size property to the subtargets v2 v2: - Initialize wavefront size to 0 reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 199838	2014-01-22 21:55:40 +00:00
Tom Stellard	08b6af91c3	R600: Add stack size to .AMDGPUcsdata section reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 199837	2014-01-22 21:55:35 +00:00
Tom Stellard	476437cbbc	R600: MOVA is vector only llvm-svn: 199827	2014-01-22 19:24:24 +00:00
Tom Stellard	598f3945c0	R600: Take alignment into account when calculating the stack offset llvm-svn: 199826	2014-01-22 19:24:23 +00:00
Tom Stellard	04c0e9851b	R600: Add support for global addresses with constant initializers llvm-svn: 199825	2014-01-22 19:24:21 +00:00
Tom Stellard	27982b1d4a	R600: Begin private memory at the second GPR. This way private memory does not over-write work group information stored in GPRs 0 and 1. llvm-svn: 199824	2014-01-22 19:24:19 +00:00
Tom Stellard	e93736057f	R600/SI: Add support for i8 and i16 private loads/stores llvm-svn: 199823	2014-01-22 19:24:14 +00:00
Rafael Espindola	f69b850d60	CommentColumn is always 40. Simplify. llvm-svn: 199357	2014-01-16 07:04:11 +00:00
Chandler Carruth	73523021d0	[PM] Split DominatorTree into a concrete analysis result object which can be used by both the new pass manager and the old. This removes it from any of the virtual mess of the pass interfaces and lets it derive cleanly from the DominatorTreeBase<> template. In turn, tons of boilerplate interface can be nuked and it turns into a very straightforward extension of the base DominatorTree interface. The old analysis pass is now a simple wrapper. The names and style of this split should match the split between CallGraph and CallGraphWrapperPass. All of the users of DominatorTree have been updated to match using many of the same tricks as with CallGraph. The goal is that the common type remains the resulting DominatorTree rather than the pass. This will make subsequent work toward the new pass manager significantly easier. Also in numerous places things became cleaner because I switched from re-running the pass (!!! mid way through some other passes run!!!) to directly recomputing the domtree. llvm-svn: 199104	2014-01-13 13:07:17 +00:00
Chandler Carruth	e509db410a	[PM] Pull the generic graph algorithms and data structures for dominator trees into the Support library. These are all expressed in terms of the generic GraphTraits and CFG, with no reliance on any concrete IR types. Putting them in support clarifies that and makes the fact that the static analyzer in Clang uses them much more sane. When moving the Dominators.h file into the IR library I claimed that this was the right home for it but not something I planned to work on. Oops. So why am I doing this? It happens to be one step toward breaking the requirement that IR verification can only be performed from inside of a pass context, which completely blocks the implementation of verification for the new pass manager infrastructure. Fixing it will also allow removing the concept of the "preverify" step (WTF???) and allow the verifier to cleanly flag functions which fail verification in a way that precludes even computing dominance information. Currently, that results in a fatal error even when you ask the verifier to not fatally error. It's awesome like that. The yak shaving will continue... llvm-svn: 199095	2014-01-13 10:52:56 +00:00
Chandler Carruth	5ad5f15cff	[cleanup] Move the Dominators.h and Verifier.h headers into the IR directory. These passes are already defined in the IR library, and it doesn't make any sense to have the headers in Analysis. Long term, I think there is going to be a much better way to divide these matters. The dominators code should be fully separated into the abstract graph algorithm and have that put in Support where it becomes obvious that even Clang's CFGBlocks can use it. Then the verifier can manually construct dominance information from the Support-driven interface while the Analysis library can provide a pass which caches, reconstructs, and supports a nice update API. But those are very long term, and so I don't want to leave the really confusing structure until that day arrives. llvm-svn: 199082	2014-01-13 09:26:24 +00:00
Matt Arsenault	a64ee177a0	Move declaration of variables down to first use. llvm-svn: 198794	2014-01-08 21:47:14 +00:00
Chandler Carruth	8a8cd2bab9	Re-sort all of the includes with ./utils/sort_includes.py so that subsequent changes are easier to review. About to fix some layering issues, and wanted to separate out the necessary churn. Also comment and sink the include of "Windows.h" in three .inc files to match the usage in Memory.inc. llvm-svn: 198685	2014-01-07 11:48:04 +00:00
Andrew Trick	d7f890edb0	Factor MI-Sched in preparation for post-ra scheduling support. Factor the MachineFunctionPass into MachineSchedulerBase. Split the DAG class into ScheduleDAGMI and SchedulerDAGMILive. llvm-svn: 198119	2013-12-28 21:56:47 +00:00
Tom Stellard	eddfa69465	R600: Allow ftrunc v2: Add ftrunc->TRUNC pattern instead of replacing int_AMDGPU_trunc v3: move ftrunc pattern next to TRUNC definition, it's available since R600 Patch By: Jan Vesely Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 197783	2013-12-20 05:11:55 +00:00
Rafael Espindola	4fa79758b7	Small simplification, p0 is the same as p. llvm-svn: 197699	2013-12-19 16:51:03 +00:00
Matt Arsenault	a98cd6a56e	R600/SI: Make private pointers be 32-bit. Different sized address spaces should theoretically work most of the time now, and since 64-bit add is currently disabled, using more 32-bit pointers fixes some cases. llvm-svn: 197659	2013-12-19 05:32:55 +00:00
Andrew Trick	e339828b90	Allow MachineCSE to coalesce trivial subregister copies the same way that it coalesces normal copies. Without this, MachineCSE is powerless to handle redundant operations with truncated source operands. This required fixing the 2-addr pass to handle tied subregisters. It isn't clear what combinations of subregisters can legally be tied, but the simple case of truncated source operands is now safely handled: %vreg11<def> = COPY %vreg1:sub_32bit; GR32:%vreg11 GR64:%vreg1 %vreg12<def> = COPY %vreg2:sub_32bit; GR32:%vreg12 GR64:%vreg2 %vreg13<def,tied1> = ADD32rr %vreg11<tied0>, %vreg12<kill>, %EFLAGS<imp-def> Test case: cse-add-with-overflow.ll. This exposed an existing bug in PPCInstrInfo::commuteInstruction. Thanks to Rafael for the test case: PowerPC/crash.ll. llvm-svn: 197465	2013-12-17 04:50:45 +00:00
Matt Arsenault	cb34f84e39	Fix typo in instruction name. SI_KIL -> SI_KILL llvm-svn: 197425	2013-12-16 20:58:33 +00:00
Rafael Espindola	e89b41495a	One last cleanup of LLVM's DataLayout strings. Produce them in the same order on every target. The order is that of getStringRepresentation: e|E-i-f-v-a-s-n-S*. llvm-svn: 197411	2013-12-16 19:31:14 +00:00
Rafael Espindola	0eb1ebeaac	Structure R600's computeDataLayout more like every other target. While there, simplify "p3:32:32:32" to "p3:32:32". llvm-svn: 197407	2013-12-16 19:18:57 +00:00
Rafael Espindola	bccb9d45ad	The preferred alignment defaults to the abi alignment. Omit if it is the same. llvm-svn: 197400	2013-12-16 18:01:51 +00:00
Rafael Espindola	f057093fdc	Don't duplicate the DataLayout defaults for integer, floats and vectors. llvm-svn: 197398	2013-12-16 17:41:15 +00:00
Rafael Espindola	8afbb28cea	On DataLayout, omit the default of p:64:64:64. llvm-svn: 197397	2013-12-16 17:15:29 +00:00
Matt Arsenault	52226f9a8e	Don't manually calculate size in bytes llvm-svn: 197327	2013-12-14 18:21:59 +00:00
Rafael Espindola	ceb0c4962a	Turn AMDGPUSubtarget::getDataLayout into a static function. No functionality change. llvm-svn: 197310	2013-12-14 06:13:44 +00:00
Rafael Espindola	009e758628	Don't set unused variable. llvm-svn: 197064	2013-12-11 20:40:57 +00:00
Tom Stellard	d7e146ede6	R600: Re-format Processors.td This makes it a little easier to read. Reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 197058	2013-12-11 17:51:51 +00:00
Tom Stellard	f2ba972af6	R600: Register AMDGPUCFGStructurizer pass This enables -print-before-all to dump MachineInstrs after it is run. Reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 197057	2013-12-11 17:51:47 +00:00
Tom Stellard	1de5582d06	R600: Register R600EmitClauseMarkers pass This enables -print-before-all to dump MachineInstrs after it is run. Reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 197056	2013-12-11 17:51:41 +00:00
NAKAMURA Takumi	8bc9bfaa5a	Prune redundant dependencies in LLVMBuild.txt. llvm-svn: 196988	2013-12-11 00:30:57 +00:00
Matt Arsenault	eaa3a7efab	Use llvm_unreachable instead of assert(0) llvm-svn: 196971	2013-12-10 21:37:42 +00:00
Vincent Lejeune	cc0ea74c7b	R600: Fix an infinite loop when trying to reorganize export/tex vector input llvm-svn: 196923	2013-12-10 14:43:31 +00:00
Vincent Lejeune	f92d64d160	R600: Fix input modifiers lost for Cayman llvm-svn: 196922	2013-12-10 14:43:27 +00:00
NAKAMURA Takumi	396d4d3c7e	Add proper dependencies to LLVMBuild.txt in llvm/lib. I'll prune redundant deps in LLVMBuild.txt, later. llvm-svn: 196881	2013-12-10 05:39:34 +00:00
NAKAMURA Takumi	e3afe2ef62	Whitespaces. llvm-svn: 196880	2013-12-10 05:39:12 +00:00
Rafael Espindola	e2a1418e68	Don't set a variable to its default value. llvm-svn: 196807	2013-12-09 19:36:11 +00:00
Vincent Lejeune	92b0a64906	Add a RequireStructuredCFG Field to TargetMachine. llvm-svn: 196634	2013-12-07 01:49:19 +00:00
Vincent Lejeune	ae7e96062c	R600: Remove orphaned declarations llvm-svn: 196633	2013-12-07 01:49:10 +00:00
Eric Christopher	99952a0823	Fix an index array check. Patch by Marius Wachtler. llvm-svn: 196561	2013-12-06 02:45:24 +00:00
Rafael Espindola	4cc2b87375	Add a default constructor to get deterministic behavior. Should fix the msan and valgrind bots. llvm-svn: 196509	2013-12-05 16:21:17 +00:00
Alp Toker	f907b891da	Correct word hyphenations This patch tries to avoid unrelated changes other than fixing a few hyphen-related ambiguities and contractions in nearby lines. llvm-svn: 196471	2013-12-05 05:44:44 +00:00
Matt Arsenault	89cc49fe5d	R600/SI: Add comments for number of used registers. llvm-svn: 196467	2013-12-05 05:15:35 +00:00
Rafael Espindola	20a8621e5f	Don't set PrivateGlobalPrefix for NVPTX and R600. These targets have special asm printers that don't use these. llvm-svn: 196187	2013-12-03 01:03:35 +00:00
Rafael Espindola	04867ce9b0	Convert two char* that are only ever used as booleans to bool. llvm-svn: 196168	2013-12-02 23:04:51 +00:00
Vincent Lejeune	4b8d9e303c	R600: Workaround for cayman loop bug llvm-svn: 196121	2013-12-02 17:29:37 +00:00
Rafael Espindola	50712a456d	Change the default of AsmWriterClassName and isMCAsmWriter. llvm-svn: 196065	2013-12-02 04:55:42 +00:00
NAKAMURA Takumi	226e10edff	[CMake] Let add_public_tablegen_target() provide intrinsics_gen, too. I think, in principle, intrinsics_gen may be added explicitly. That said, it can be added incidentally, since each target already has dependencies to llvm-tblgen. Almost all source files depend on both CommonTableGen and intrinsics_gen. Explicit add_dependencies() have been pruned under lib/Target. llvm-svn: 195929	2013-11-28 17:04:31 +00:00
NAKAMURA Takumi	ce746c6c49	[CMake] Make add_public_tablegen_target responsible for providing the dependency on CommonTableGen. add_public_tablegen_target adds *CommonTableGen to LLVM_COMMON_DEPENDS. LLVM_COMMON_DEPENDS affects add_llvm_library (and other add_target stuff) within its scope. llvm-svn: 195927	2013-11-28 17:04:04 +00:00
NAKAMURA Takumi	b2abd160b3	[CMake] Prune include_directories() in llvm/lib/Target, take #2 . I forgot to commit them. They were staging in my local repo. llvm-svn: 195924	2013-11-28 15:30:37 +00:00
Rafael Espindola	429e3fb068	The R600 has its own asm printer which doesn't use GlobalPrefix. Drop it. llvm-svn: 195883	2013-11-27 21:52:37 +00:00
Tom Stellard	175e7a8c97	R600: Expand vector FABS NOTE: This is a candidate for the 3.4 branch. llvm-svn: 195881	2013-11-27 21:23:39 +00:00
Tom Stellard	c149dc02d3	R600/SI: Implement spilling of SGPRs v5 SGPRs are spilled into VGPRs using the {READ,WRITE}LANE_B32 instructions. v2: - Fix encoding of Lane Mask - Use correct register flags, so we don't overwrite the low dword when restoring multi-dword registers. v3: - Register spilling seems to hang the GPU, so replace all shaders that need spilling with a dummy shader. v4: - Fix *LANE definitions - Change destination reg class for 32-bit SMRD instructions v5: - Remove small optimization that was crashing Serious Sam 3. https://bugs.freedesktop.org/show_bug.cgi?id=68224 https://bugs.freedesktop.org/show_bug.cgi?id=71285 NOTE: This is a candidate for the 3.4 branch. llvm-svn: 195880	2013-11-27 21:23:35 +00:00
Tom Stellard	859199dad8	R600/SI: Use SGPR_32 register class for 32-bit SMRD outputs Writing to the M0 register from an SMRD instruction hangs the GPU, so we need to use the SGPR_32 register class, which does not include M0. NOTE: This is a candidate for the 3.4 branch. llvm-svn: 195879	2013-11-27 21:23:29 +00:00
Tom Stellard	4d566b2edf	R600: Add support for ISD::FROUND NOTE: This is a candidate for the 3.4 branch. llvm-svn: 195878	2013-11-27 21:23:20 +00:00
Tom Stellard	c0845334da	R600/SI: Fixing handling of condition codes We were ignoring the ordered/unordered bits and also the signed/unsigned bits of condition codes when lowering the DAG to MachineInstrs. NOTE: This is a candidate for the 3.4 branch. llvm-svn: 195514	2013-11-22 23:07:58 +00:00
Tom Stellard	cd6b0a658a	R600: Implement TargetInstrInfo::isLegalToSplitMBBAt() Splitting a basic block will create a new ALU clause, so we need to make sure we aren't moving uses of registers that are local to their current clause into a new one. I had a test case for this, but unfortunately unrelated schedule changes invalidated it, and I haven't been able to come up with another one. NOTE: This is a candidate for the 3.4 branch. llvm-svn: 195399	2013-11-22 00:41:08 +00:00
Juergen Ributzka	d12ccbd343	[weak vtables] Remove a bunch of weak vtables This patch removes most of the trivial cases of weak vtables by pinning them to a single object file. The memory leaks in this version have been fixed. Thanks Alexey for pointing them out. Differential Revision: http://llvm-reviews.chandlerc.com/D2068 Reviewed by Andy llvm-svn: 195064	2013-11-19 00:57:56 +00:00
Matt Arsenault	3a4d86a1a4	R600/SI: Fix moveToVALU when the first operand is VSrc. Moving into a VSrc doesn't always work, since it could be replaced with an SGPR later. llvm-svn: 195042	2013-11-18 20:09:55 +00:00
Matt Arsenault	08f7e37aa9	R600/SI: Fix multiple SGPR reads when using VCC. No other SGPR operands are allowed, so if VCC is used, move the other to a VGPR. llvm-svn: 195041	2013-11-18 20:09:50 +00:00
Matt Arsenault	fb826fa6e1	R600/SI: Implement add i64, but do not yet enable. Test doesn't actually check the output. I need to fix add i64 being matched for the addressing calculations. llvm-svn: 195040	2013-11-18 20:09:47 +00:00
Matt Arsenault	bf6e1e7ff7	R600/SI: Specify SSrc operands llvm-svn: 195039	2013-11-18 20:09:43 +00:00
Matt Arsenault	e8d214662a	R600/SI: addc / adde i32 are legal llvm-svn: 195038	2013-11-18 20:09:40 +00:00
Matt Arsenault	04fca446b1	R600/SI: Match addc to S_ADD_U32. The carry always goes to SCC. llvm-svn: 195037	2013-11-18 20:09:37 +00:00
Matt Arsenault	f8c089ac25	R600/SI: Match adde/sube to S_ADDC_U32/S_SUBB_U32 llvm-svn: 195036	2013-11-18 20:09:34 +00:00
Matt Arsenault	e27a41b5a4	R600/SI: Specify S_ADD/S_SUB set SCC and add is commutable llvm-svn: 195035	2013-11-18 20:09:32 +00:00
Matt Arsenault	43b8e4ed3b	R600/SI: Move patterns to match add / sub to scalar instructions llvm-svn: 195034	2013-11-18 20:09:29 +00:00
Matt Arsenault	f0b1e3a776	R600/SI: Fix extra defs of VCC / SCC. When replacing scalar operations with vector, the wrong implicit output register was used. llvm-svn: 195033	2013-11-18 20:09:21 +00:00
Tom Stellard	66df8a2c0a	R600: Enable the IR structurizer by default llvm-svn: 195031	2013-11-18 19:43:44 +00:00
Tom Stellard	827ec9b630	R600: Fix a crash in the AMDILCFGStructurizer The ifPatternMatch() function was not correctly reporting the number of matches in some cases. llvm-svn: 195030	2013-11-18 19:43:38 +00:00
Tom Stellard	783893a893	R600: Add a SubtargetFeature for disabling the ifcvt pass. This is useful when writing test cases for the AMDIL structurizer. llvm-svn: 195029	2013-11-18 19:43:33 +00:00
Tom Stellard	f1e3f77507	R600: Use lower-case for EnableIRStructurizer feature llc converts all values passed to -mattr= to lowercase, so this enables us to toggle this feature when using llc. llvm-svn: 195028	2013-11-18 19:43:29 +00:00
Tom Stellard	f340787d79	R600/SI: Fix illegal VGPR->SGPR copy inside of loop llvm-svn: 195026	2013-11-18 18:50:20 +00:00
Tom Stellard	13de545693	R600/SI: Fix another case of illegal VGPR->SGPR copy llvm-svn: 195025	2013-11-18 18:50:15 +00:00
Alexey Samsonov	49109a279c	Revert r194865 and r194874. This change is incorrect. If you delete the virtual destructor of both a base class and a subclass, then the following code: Base *foo = new Child(); delete foo; will not call the destructors for members of the Child class. As a result, I observe plenty of memory leaks. Notable examples I investigated are: ObjectBuffer and ObjectBufferStream, AttributeImpl and StringSAttributeImpl. llvm-svn: 194997	2013-11-18 09:31:53 +00:00
Vincent Lejeune	745d4298b1	R600: Make dot_4 instructions predicable llvm-svn: 194927	2013-11-16 16:24:41 +00:00
Juergen Ributzka	dbedae89b9	[weak vtables] Remove a bunch of weak vtables This patch removes most of the trivial cases of weak vtables by pinning them to a single object file. Differential Revision: http://llvm-reviews.chandlerc.com/D2068 Reviewed by Andy llvm-svn: 194865	2013-11-15 22:34:48 +00:00
Matt Arsenault	f14032af0e	Make method static llvm-svn: 194858	2013-11-15 22:02:28 +00:00
Tom Stellard	519ae39c45	R600/SI: Add VReg_96 register class to SIRegisterInfo::hasVGPRs() This fixes a crash with GNOME settings manager. llvm-svn: 194836	2013-11-15 18:26:45 +00:00
Matt Arsenault	c5559bb14b	Add target hook to prevent folding some bitcasted loads. This is to avoid this transformation in some cases: fold (conv (load x)) -> (load (conv*)x) On architectures that don't natively support some vector loads efficiently casting the load to a smaller vector of larger types and loading is more efficient. Patch by Micah Villmow. llvm-svn: 194783	2013-11-15 04:42:23 +00:00
Tom Stellard	8f9fc20751	R600: Fix scheduling of instructions that use the LDS output queue The LDS output queue is accessed via the OQAP register. The OQAP register cannot be live across clauses, so if a value is written to the output queue, it must be retrieved before the end of the clause. With the machine scheduler, we cannot satisfy this constraint, because it lacks proper alias analysis and it will mark some LDS accesses as having a chain dependency on vertex fetches. Since vertex fetches require a new clause, the dependency may end up splitting OQAP uses and defs so they end up in different clauses. See the lds-output-queue.ll test for a more detailed explanation. To work around this issue, we now combine the LDS read and the OQAP copy into one instruction and expand it after register allocation. This patch also adds some checks to the EmitClauseMarker pass, so that it doesn't end a clause with a value still in the output queue and removes AR.X and OQAP handling from the scheduler (AR.X uses and defs were already being expanded post-RA, so the scheduler will never see them). Reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 194755	2013-11-15 00:12:45 +00:00
Tom Stellard	81229a14a3	R600/SI: Add processor type for Hawaii Patch by: Alex Deucher Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> llvm-svn: 194752	2013-11-14 23:46:00 +00:00
Matt Arsenault	855e0b71d4	R600/SI: Remove redundant legalizeOperands call llvm-svn: 194749	2013-11-14 23:44:25 +00:00
Hans Wennborg	a74fd70ac7	Add #include raw_ostream.h in lib/Target/R600/SIFixSGPRCopies.cpp This was causing my release+asserts build on Windows to fail. llvm-svn: 194747	2013-11-14 23:24:09 +00:00
Matt Arsenault	3383eecd68	R600/SI: Specify S_ADDK/S_MULK set SCC and are commutable llvm-svn: 194738	2013-11-14 22:32:49 +00:00
Matt Arsenault	671a005e4a	Indentation fixes llvm-svn: 194688	2013-11-14 10:08:50 +00:00
Matt Arsenault	f4760455e8	Add a comment llvm-svn: 194684	2013-11-14 08:06:38 +00:00
Matt Arsenault	269092d747	Fix trailing whitespace in debug printing llvm-svn: 194683	2013-11-14 08:06:35 +00:00
NAKAMURA Takumi	b88288f64b	R600/SIFixSGPRCopies.cpp: Fix \param to \return. [-Wdocumentation] llvm-svn: 194662	2013-11-14 04:05:28 +00:00
NAKAMURA Takumi	78e80cd17d	Whitespace. llvm-svn: 194661	2013-11-14 04:05:22 +00:00
Tom Stellard	415ef6db68	R600: Fix uninitialized variable usage llvm-svn: 194632	2013-11-13 23:58:51 +00:00
Tom Stellard	81d871dee3	R600/SI: Add support for private address space load/store Private address space is emulated using the register file with MOVRELS and MOVRELD instructions. llvm-svn: 194626	2013-11-13 23:36:50 +00:00
Tom Stellard	8216602a0b	R600/SI: Prefer SALU instructions for bit shift operations All shift operations will be selected as SALU instructions and then if necessary lowered to VALU instructions in the SIFixSGPRCopies pass. This allows us to do more operations on the SALU which will improve performance and is also required for implementing private memory using indirect addressing, since the private memory pointers must stay in the scalar registers. This patch includes some fixes from Matt Arsenault. llvm-svn: 194625	2013-11-13 23:36:37 +00:00
Rafael Espindola	fdc88137f4	Remove AllowQuotesInName and friends from MCAsmInfo. Accepting quotes is a property of an assembler, not of an object file. For example, ELF can support any names for sections and symbols, but the gnu assembler only accepts quotes in some contexts and llvm-mc in a few more. LLVM should not produce different symbols based on a guess about which assembler will be reading the code it is printing. llvm-svn: 194575	2013-11-13 14:01:59 +00:00
Matt Arsenault	00a0d6f672	R600: Fix selection failure on EXTLOAD llvm-svn: 194547	2013-11-13 02:39:07 +00:00
Vincent Lejeune	aee3a10440	R600: Reenable llvm.R600.load.input/interp.input for compatibility llvm-svn: 194484	2013-11-12 16:26:47 +00:00
Matt Arsenault	72b31eee0b	R600/SI: Change formatting of printed registers. Print the range of registers used with a single letter prefix. This better matches what the shader compiler produces and is overall less obnoxious than concatenating all of the subregister names together. Instead of SGPR0, it will print s0. Instead of SGPR0_SGPR1, it will print s[0:1] and so on. There doesn't appear to be a straightforward way to get the actual register info in the InstPrinter, so this parses the generated name to print with the new syntax. The required test changes are pretty nasty, and register matching regexes are now worse. Since there isn't a way to add to a variable in FileCheck, some of the tests now don't check the exact number of registers used, but I don't think that will be a real problem. llvm-svn: 194443	2013-11-12 02:35:51 +00:00
Vincent Lejeune	f143af3fe9	R600: Use function inputs to represent data stored in gpr llvm-svn: 194425	2013-11-11 22:10:24 +00:00
Matt Arsenault	c9ad7c9fcb	Make method static llvm-svn: 194340	2013-11-10 01:04:02 +00:00
Matt Arsenault	d82c183d70	Fix missing C++ mode comment llvm-svn: 194339	2013-11-10 01:03:59 +00:00
Vincent Lejeune	4f3751f2af	R600: Fix LowerUDIVREM llvm-svn: 194153	2013-11-06 17:36:04 +00:00
Matt Arsenault	ef1a950b48	Use isa<> instead of dyn_cast<> with unused value llvm-svn: 193869	2013-11-01 17:39:26 +00:00
Rafael Espindola	4b102d0ead	Remove another unused flag. llvm-svn: 193756	2013-10-31 15:58:33 +00:00
Rafael Espindola	74e1d0a0a0	Remove unused flag. llvm-svn: 193752	2013-10-31 15:49:39 +00:00
Matt Arsenault	909d0c063f	Fix a few typos llvm-svn: 193723	2013-10-30 23:43:29 +00:00
Tom Stellard	c947d8ca64	R600: Custom lower f32 = uint_to_fp i64 llvm-svn: 193701	2013-10-30 17:22:05 +00:00
Aaron Ballman	9ab670fb54	Removing a switch statement that contains only a default label. This resolves an MSVC warning. No functional change intended. llvm-svn: 193649	2013-10-29 20:40:52 +00:00
Tom Stellard	6e1ee476ab	R600/SI: Add compute support for CI v2 v2: - Fix LDS size calculation Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 193621	2013-10-29 16:37:28 +00:00
Tom Stellard	e118b8becd	R600: Expand vector FSQRT ops llvm-svn: 193620	2013-10-29 16:37:20 +00:00
NAKAMURA Takumi	8a0464393f	Prune utf8 chars in comments. llvm-svn: 193512	2013-10-28 04:07:38 +00:00
NAKAMURA Takumi	4bb85f90fd	Target/R600: Un-tab-ify. llvm-svn: 193510	2013-10-28 04:07:23 +00:00
Tom Stellard	03a5c08de6	R600/SI: Replace ffs(x) - 1 with countTrailingZeros(x) ffs(x) broke the mingw buildbot. llvm-svn: 193225	2013-10-23 03:50:25 +00:00
Tom Stellard	54774e5681	R600/SI: fix MIMG writemask adjustment This fixes piglit: - shaders/glsl-fs-texture2d-masked - shaders/glsl-fs-texture2d-masked-4 Patch by: Marek Olšák Signed-off-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 193222	2013-10-23 02:53:47 +00:00
Tom Stellard	af77543244	R600: Fix handling of vector kernel arguments The SelectionDAGBuilder was promoting vector kernel arguments to legal types, but this won't work for R600 and SI since kernel arguments are stored in memory and can't be promoted. In order to handle vector arguments correctly we need to look at the original types from the LLVM IR function. llvm-svn: 193215	2013-10-23 00:44:32 +00:00
Tom Stellard	fb9616905a	R600/SI: Add support for i64 bitwise or llvm-svn: 193213	2013-10-23 00:44:19 +00:00
Tom Stellard	a66cafa096	R600/SI: Use S_LOAD_DWORD instructions for v8i32 and v16i32 llvm-svn: 193212	2013-10-23 00:44:12 +00:00
Matt Arsenault	65864e3182	R600/SI: Don't assert on SCC usage llvm-svn: 193198	2013-10-22 21:11:31 +00:00
Tom Stellard	debb4cf5ea	R600/SI: Use llvm_unreachable() for an always false assert llvm-svn: 193183	2013-10-22 18:42:03 +00:00
Tom Stellard	8be4dd234a	R600/SI: Fix warning on non-asserts build llvm-svn: 193180	2013-10-22 18:31:45 +00:00
Tom Stellard	26a3b67b3b	R600: Simplify handling of private address space The AMDGPUIndirectAddressing pass was previously responsible for lowering private loads and stores to indirect addressing instructions. However, this pass was buggy and way too complicated. The only advantage it had over the new simplified code was that it saved one instruction per direct write to private memory. This optimization likely has a minimal impact on performance, and we may be able to duplicate it using some other transformation. For the private address space, we now: 1. Lower private loads/store to Register(Load|Store) instructions 2. Reserve part of the register file as 'private memory' 3. After regalloc lower the Register(Load|Store) instructions to MOV instructions that use indirect addressing. llvm-svn: 193179	2013-10-22 18:19:10 +00:00
Tom Stellard	c460b0dcf1	R600: Remove unused InstrInfo::getMovImmInstr() function llvm-svn: 193178	2013-10-22 18:19:01 +00:00
Benjamin Kramer	a9fe95b6c2	R600: Remove \ at EOL from ascii art comments. Completely harmless, but GCC likes to warn about it even when the next line is a comment. llvm-svn: 192974	2013-10-18 14:12:50 +00:00
Tom Stellard	b34186ae38	R600: Fix a crash in the AMDILCFGStructurizer We were calling llvm_unreachable() when failing to optimize the branch into if case. However, it is still possible for us to structurize the CFG by duplicating blocks even if this optimization fails. Reviewed-by: Vincent Lejeune<vljn at ovi.com> llvm-svn: 192813	2013-10-16 17:06:02 +00:00
Tom Stellard	69f86d199a	R600: Remove some dead code from the AMDILCFGStructurizer Reviewed-by: Vincent Lejeune<vljn at ovi.com> llvm-svn: 192812	2013-10-16 17:05:56 +00:00
Matt Arsenault	226580656b	Fix typo llvm-svn: 192752	2013-10-15 23:44:48 +00:00
Matt Arsenault	df90c02e68	Fix missing C++ mode thing in header llvm-svn: 192751	2013-10-15 23:44:45 +00:00
Vincent Lejeune	5d6c2c318b	R600/SI: Remove some leftover MI dump call llvm-svn: 192743	2013-10-15 22:48:51 +00:00
Vincent Lejeune	d6cbede9c5	R600: improve dump of S_WAITCNT llvm-svn: 192557	2013-10-13 17:56:28 +00:00
Vincent Lejeune	4ee6dd6136	R600/SI: Add SinkingPass before ISel llvm-svn: 192556	2013-10-13 17:56:21 +00:00
Vincent Lejeune	d623644d17	R600/SI: Support byval arguments llvm-svn: 192555	2013-10-13 17:56:16 +00:00
Vincent Lejeune	fa58a5fb60	R600: Use masked read sel for texture instructions llvm-svn: 192554	2013-10-13 17:56:10 +00:00
Vincent Lejeune	301beb80d4	R600: fix swizzle export llvm-svn: 192553	2013-10-13 17:56:04 +00:00
Vincent Lejeune	533352f696	R600: Clear the VPM bit of export instructions. Setting this bit apparently makes no difference, but the docs recommend leaving it cleared. llvm-svn: 192552	2013-10-13 17:55:57 +00:00
Tom Stellard	ed69925998	R600: Store disassembly in a special ELF section when feature +DumpCode is enabled. Patch by: Jay Cornwall Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 192523	2013-10-12 05:02:51 +00:00
Matt Arsenault	8fb373891f	Fix typo llvm-svn: 192499	2013-10-11 21:03:36 +00:00
Matt Arsenault	1408b60291	Fix typo llvm-svn: 192406	2013-10-10 23:05:37 +00:00
Matt Arsenault	204cfa6e43	R600: Fix trunc i64 to i32 on SI llvm-svn: 192375	2013-10-10 18:04:16 +00:00
Tom Stellard	93fabcebf1	R600/SI: Implement SIInstrInfo::verifyInstruction() for VOP* The function is used by the machine verifier and checks that VOP* instructions have legal operands. llvm-svn: 192367	2013-10-10 17:11:55 +00:00
Tom Stellard	682bfbc43d	R600/SI: Define a separate MIMG instruction for each possible output value type During instruction selection, we rewrite the destination register class for MIMG instructions based on their writemasks. This creates machine verifier errors since the new register class does not match the register class in the MIMG instruction definition. We can avoid this by defining different MIMG instructions for each possible destination type and then switching to the correct instruction when we change the register class. llvm-svn: 192365	2013-10-10 17:11:24 +00:00
Tom Stellard	1b99ed8290	R600/SI: Mark the EXEC register as reserved This prevents the machine verifier from complaining about uses of an undefined physical register. Reviewed-by: Vincent Lejeune<vljn at ovi.com> llvm-svn: 192364	2013-10-10 17:11:19 +00:00
Tom Stellard	ed0ceec1c1	R600: Use StructurizeCFGPass for non SI targets The StructurizeCFG pass makes complex CFGs reducible; it allows a lot of shaders from shadertoy (which exhibit complex control flow constructs) to work correctly with respect to CFG handling (and allows us to detect potential bugs in other parts of the backend). We provide a command-line argument to disable the pass for debugging purposes. Patch by: Vincent Lejeune Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 192363	2013-10-10 17:11:12 +00:00
Rafael Espindola	a17151ad5a	Add a MCTargetStreamer interface. This patch fixes an old FIXME by creating a MCTargetStreamer interface and moving the target specific functions for ARM, Mips and PPC to it. The ARM streamer is still declared in a common place because it is used from lib/CodeGen/ARMException.cpp, but the Mips and PPC are completely hidden in the corresponding Target directories. I will send an email to llvmdev with instructions on how to use this. llvm-svn: 192181	2013-10-08 13:08:17 +00:00
Vincent Lejeune	6df39438af	R600: Add a ldptr intrinsic to support MSAA. llvm-svn: 191838	2013-10-02 16:00:33 +00:00
Vincent Lejeune	a4da6fb535	R600: add a pass that merges clauses. llvm-svn: 191790	2013-10-01 19:32:58 +00:00
Vincent Lejeune	0b342d6f74	R600: Put PRED_X instruction in its own clause llvm-svn: 191789	2013-10-01 19:32:49 +00:00
Vincent Lejeune	269708b98d	R600: Enable -verify-machineinstrs in some tests. llvm-svn: 191788	2013-10-01 19:32:38 +00:00
Arnold Schwaighofer	d2f96b91ca	IfConverter: Use TargetSchedule for instruction latencies For targets that have instruction itineraries this means no change. Targets that move over to the new schedule model will be able to use the new schedule model for instruction latencies in the if-converter (the logic is such that if there is no itinerary we will use the new sched model for the latencies). Before, we queried "TTI->getInstructionLatency()" for the instruction latency and the extra predication cost. Now, we query the TargetSchedule abstraction for the instruction latency and TargetInstrInfo for the extra predication cost. The TargetSchedule abstraction will internally call "TTI->getInstructionLatency" if an itinerary exists, otherwise it will use the new schedule model. ATTENTION: Out of tree targets! (I will also send out an email later to LLVMDev) This means, if your target implements unsigned getInstrLatency(const InstrItineraryData ItinData, const MachineInstr MI, unsigned PredCost); and returns a value for "PredCost", you now also need to implement unsigned getPredictationCost(const MachineInstr MI); (if your target uses the IfConversion.cpp pass) radar://15077010 llvm-svn: 191671	2013-09-30 15:28:56 +00:00
Robert Wilhelm	2788d3ec99	Even more spelling fixes for "instruction". llvm-svn: 191611	2013-09-28 13:42:22 +00:00
Tom Stellard	0351ea2010	R600: Fix handling of NAN in comparison instructions We were completely ignoring the unordered/ordered attributes of condition codes and also incorrectly lowering seto and setuo. Reviewed-by: Vincent Lejeune<vljn at ovi.com> llvm-svn: 191603	2013-09-28 02:50:50 +00:00
Tom Stellard	5694d3090a	SelectionDAG: Improve legalization of SELECT_CC with illegal condition codes SelectionDAG will now attempt to invert an illegal condition in order to find a legal one and if that doesn't work, it will attempt to swap the operands using the inverted condition. There are no new test cases for this, but a number of the existing R600 tests hit this path. llvm-svn: 191602	2013-09-28 02:50:43 +00:00
Tom Stellard	cd42818d86	SelectionDAG: Try to expand all condition codes using getCCSwappedOperands() This is useful for targets like R600, which only support GT, GE, NE, and EQ condition codes as it removes the need to handle unsupported condition codes in target specific code. There are no tests with this commit, but R600 has been updated to take advantage of this new feature, so its existing selectcc tests are now testing the swapped operands path. llvm-svn: 191601	2013-09-28 02:50:38 +00:00
David Majnemer	1ccd2f2aee	MC: Remove vestigial PCSymbol field from AsmInfo llvm-svn: 191362	2013-09-25 09:36:11 +00:00
Tim Northover	31d093c705	ISelDAG: spot chain cycles involving MachineNodes Previously, the DAGISel function WalkChainUsers was spotting that it had entered already-selected territory by whether a node was a MachineNode (amongst other things). Since it's fairly common practice to insert MachineNodes during ISelLowering, this was not the correct check. Looking around, it seems that other nodes get their NodeId set to -1 upon selection, so this makes sure the same thing happens to all MachineNodes and uses that characteristic to determine whether we should stop looking for a loop during selection. This should fix PR15840. llvm-svn: 191165	2013-09-22 08:21:56 +00:00
Andrew Trick	978674b2bc	Allow subtarget selection of the default MachineScheduler and document the interface. The global registry is used to allow command line override of the scheduler selection, but does not work well as the normal selection API. For example, the same LLVM process should be able to target multiple targets or subtargets. llvm-svn: 191071	2013-09-20 05:14:41 +00:00
Vincent Lejeune	0167a313da	R600: Move clamp handling code to R600ISelLowering.cpp llvm-svn: 190645	2013-09-12 23:45:00 +00:00
Vincent Lejeune	9a248e5c2d	R600: Move code handling literal folding into R600ISelLowering. llvm-svn: 190644	2013-09-12 23:44:53 +00:00
Vincent Lejeune	ab3baf80a8	R600: Move fabs/fneg/sel folding logic into PostProcessIsel This move makes it possible to correctly handle multiple instructions from a single pattern. llvm-svn: 190643	2013-09-12 23:44:44 +00:00
Tom Stellard	afcf12f33a	R600/SI: expose TBUFFER_STORE_FORMAT_* for OpenGL transform feedback For _XYZ, the type of VDATA is v4i32, because v3i32 doesn't exist. The ADDR64 bit is not exposed. A simpler intrinsic that doesn't take a resource descriptor might be nicer. The maximum number of input SGPRs is bumped to 17. Signed-off-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 190575	2013-09-12 02:55:14 +00:00
Tom Stellard	7f6fa4c4c5	R600: Don't use trans slot for instructions that read LDS source registers This fixes some regressions in the piglit local memory store tests introduced by recent commits which made the scheduler aware of the trans slot. It's not possible to test this using lit, because there is no way to determine from the assembly dumps whether or not an instruction is in the trans slot. Even if this were possible, the test would be highly sensitive to changes in the scheduler and might generate confusing false negatives. Reviewed-by: Vincent Lejeune<vljn at ovi.com> llvm-svn: 190574	2013-09-12 02:55:06 +00:00
Bill Wendling	58e2d3d856	Generate compact unwind encoding from CFI directives. We used to generate the compact unwind encoding from the machine instructions. However, this had the problem that if the user used `-save-temps' or compiled their hand-written `.s' file (with CFI directives), we wouldn't generate the compact unwind encoding. Move the algorithm that generates the compact unwind encoding into the MCAsmBackend. This way we can generate the encoding whether the code is from a `.ll' or `.s' file. <rdar://problem/13623355> llvm-svn: 190290	2013-09-09 02:37:14 +00:00
Aaron Watry	372cecf642	R600: Add support for LDS atomic subtract Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 190200	2013-09-06 20:17:42 +00:00
Tom Stellard	8bc633ac09	R600: Coding style llvm-svn: 190110	2013-09-05 23:55:13 +00:00
Matt Arsenault	6f24379974	R600: Fix i64 to i32 trunc on SI llvm-svn: 190091	2013-09-05 19:41:10 +00:00
Tom Stellard	13c68ef88b	R600: Add support for local memory atomic add llvm-svn: 190080	2013-09-05 18:38:09 +00:00
Tom Stellard	53f2f90eb4	R600: Expand SELECT nodes rather than custom lowering them llvm-svn: 190079	2013-09-05 18:38:03 +00:00
Tom Stellard	de60e25278	R600: Fix incorrect LDS size calculation GlobalAddress nodes that appeared in more than one basic block were being counted twice. llvm-svn: 190078	2013-09-05 18:37:57 +00:00
Tom Stellard	d50bb3c8d4	R600/SI: Don't emit S_WQM_B64 instruction for compute shaders llvm-svn: 190077	2013-09-05 18:37:52 +00:00
Tom Stellard	624741fded	R600: Fix segfault in R600TextureIntrinsicReplacer This pass was segfaulting when it ran into a non-intrinsic function call. Function calls are not supported, so now instead of segfaulting, we will get an assertion failure with a nice error message. I'm not sure how to test this using lit. llvm-svn: 190076	2013-09-05 18:37:45 +00:00
Vincent Lejeune	744efa4dca	R600: Use shared op optimization when checking cycle compatibility llvm-svn: 189981	2013-09-04 19:53:54 +00:00
Vincent Lejeune	7e2c83256b	R600: Non vector only instruction can be scheduled on trans unit llvm-svn: 189980	2013-09-04 19:53:46 +00:00
Vincent Lejeune	4d5c5e53d0	R600: Use SchedModel enum for is{Trans,Vector}Only functions llvm-svn: 189979	2013-09-04 19:53:30 +00:00
Michael Gottesman	c9f5859f81	Add llvm namespace to llvm::next. llvm-svn: 189912	2013-09-04 04:26:09 +00:00
Michael Gottesman	114ac1a230	Use llvm::next() instead of incrementing begin iterators of std::vector. Iterator of std::vector may be implemented as a raw pointer. In this case begin iterators are rvalues and cannot be incremented. For example, this is the case with STDCXX implementation of vector. Patch by Konstantin Tokarev <annulen@yandex.ru>. llvm-svn: 189911	2013-09-04 04:19:01 +00:00
Benjamin Kramer	bda73fff49	Mark an unreachable code path with llvm_unreachable. Pacifies GCC. llvm-svn: 189726	2013-08-31 21:20:04 +00:00
Tom Stellard	35bb18c2a7	R600: Add support for vector local memory loads llvm-svn: 189226	2013-08-26 15:06:04 +00:00
Tom Stellard	c6f4a29ed5	R600: Add support for i8 and i16 local memory loads llvm-svn: 189225	2013-08-26 15:05:59 +00:00
Tom Stellard	f3d166aa1e	R600: Add support for i8 and i16 local memory stores llvm-svn: 189223	2013-08-26 15:05:49 +00:00
Tom Stellard	2ffc330673	R600: Add support for v4i32 and v2i32 local stores llvm-svn: 189222	2013-08-26 15:05:44 +00:00
Tom Stellard	fd155828ed	SelectionDAG: Use correct pointer size when lowering function arguments v2 This adds minimal support to the SelectionDAG for handling address spaces with different pointer sizes. The SelectionDAG should now correctly lower pointer function arguments to the correct size as well as generate the correct code when lowering getelementptr. This patch also updates the R600 DataLayout to use 32-bit pointers for the local address space. v2: - Add more helper functions to TargetLoweringBase - Use CHECK-LABEL for tests llvm-svn: 189221	2013-08-26 15:05:36 +00:00
Tom Stellard	15e4811455	R600/SI: Fix another case of illegal VGPR to SGPR copy This fixes a crash in Unigine Tropics. https://bugs.freedesktop.org/show_bug.cgi?id=68389 llvm-svn: 189057	2013-08-22 20:21:02 +00:00
Tom Stellard	f6d8023ca4	R600: Remove unnecessary casts Spotted by Bill Wendling. llvm-svn: 188942	2013-08-21 22:14:17 +00:00
Dmitri Gribenko	8b2a3d1fea	Remove unused stdio.h includes llvm-svn: 188626	2013-08-18 08:29:51 +00:00
Tom Stellard	59ed08b238	R600: Fix possible use of an uninitialized variable Spotted by Nick Lewycky! llvm-svn: 188599	2013-08-17 00:06:51 +00:00
Tom Stellard	b249b75726	R600: Expand vector FRINT ops Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 188598	2013-08-16 23:51:33 +00:00
Tom Stellard	ad3aff246c	R600: Expand vector FFLOOR ops Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 188597	2013-08-16 23:51:29 +00:00
Tom Stellard	a92ff87929	R600: Expand vector float operations for both SI and R600 Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 188596	2013-08-16 23:51:24 +00:00
Michel Danzer	8522270d7e	R600/SI: Add pattern for xor of i1 Fixes two recent piglit regressions with radeonsi. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 188559	2013-08-16 16:19:31 +00:00
Michel Danzer	20680b1cc5	R600/SI: Fix broken encoding of DS_WRITE_B32 The logic in SIInsertWaits::getHwCounts() only really made sense for SMRD instructions, and trying to shoehorn it into handling DS_WRITE_B32 caused it to corrupt the encoding of that by clobbering the first operand with the second one. Undo that damage and only apply the SMRD logic to that. Fixes some derivatives-related piglit regressions with radeonsi. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 188558	2013-08-16 16:19:24 +00:00
Benjamin Kramer	a8eecee121	R600: Allocate memoperand in the MachineFunction so it doesn't leak. llvm-svn: 188555	2013-08-16 14:48:09 +00:00
Tom Stellard	dba25713a6	Revert "R600/SI: Fix incorrect encoding of DS_WRITE_B32 instructions" This reverts commit a6a39ced095c2f453624ce62c4aead25db41a18f. This is the wrong version of this fix. llvm-svn: 188523	2013-08-16 01:18:43 +00:00
Tom Stellard	82bef57f20	R600/SI: Fix incorrect encoding of DS_WRITE_B32 instructions The SIInsertWaits pass was overwriting the first operand (gds bit) of DS_WRITE_B32 with the second operand (value to write). This meant that any time the value to write was stored in an odd number VGPR, the gds bit would be set causing the instruction to write to GDS instead of LDS. llvm-svn: 188522	2013-08-16 01:12:20 +00:00
Tom Stellard	b03edeca67	R600: Add support for global vector loads with element types less than 32-bits Tested-by: Aaron Watry <awatry@gmail.com> llvm-svn: 188521	2013-08-16 01:12:16 +00:00
Tom Stellard	fbab827e2a	R600: Add support for global vector stores with elements less than 32-bits Tested-by: Aaron Watry <awatry@gmail.com> llvm-svn: 188520	2013-08-16 01:12:11 +00:00
Tom Stellard	d3ee8c103a	R600: Add support for i16 and i8 global stores Tested-by: Aaron Watry <awatry@gmail.com> llvm-svn: 188519	2013-08-16 01:12:06 +00:00
Tom Stellard	6d1379e180	R600: Add support for v4i32 stores on Cayman Tested-by: Aaron Watry <awatry@gmail.com> llvm-svn: 188518	2013-08-16 01:12:00 +00:00
Tom Stellard	16da74c205	R600: Enable folding of inline literals into REG_SEQUENCE instructions Tested-by: Aaron Watry <awatry@gmail.com> llvm-svn: 188517	2013-08-16 01:11:55 +00:00
Tom Stellard	676c16d088	R600: Add IsExport bit to TableGen instruction definitions Tested-by: Aaron Watry <awatry@gmail.com> llvm-svn: 188516	2013-08-16 01:11:51 +00:00
Tom Stellard	ac00f9df79	R600: Change the RAT instruction assembly names so they match the docs Tested-by: Aaron Watry <awatry@gmail.com> llvm-svn: 188515	2013-08-16 01:11:46 +00:00
Matt Arsenault	5cae894a13	Fix spelling llvm-svn: 188506	2013-08-15 23:11:03 +00:00
Alexey Samsonov	3186eb3efd	Tentative fix for global-buffer-overflow caused by r188426. Found by AddressSanitizer llvm-svn: 188448	2013-08-15 07:11:34 +00:00
Tom Stellard	d86003e31f	R600/SI: Improve legalization of vector operations This should fix hangs in the OpenCL piglit tests. llvm-svn: 188431	2013-08-14 23:25:00 +00:00
Tom Stellard	6785065ace	R600/SI: Replace v1i32 type with i32 in imageload and sample intrinsics llvm-svn: 188430	2013-08-14 23:24:53 +00:00
Tom Stellard	9fa1791a1b	R600/SI: Convert v16i8 resource descriptors to i128 Now that compute support is better on SI, we can't continue using v16i8 for descriptors since this is also a legal type in OpenCL. This patch fixes numerous hangs with the piglit OpenCL test and since we now use a target specific DAG node for LOAD_CONSTANT with the correct MemOperandFlags, this should also fix: https://bugs.freedesktop.org/show_bug.cgi?id=66805 llvm-svn: 188429	2013-08-14 23:24:45 +00:00
Tom Stellard	8e5da41374	R600/SI: Lower BUILD_VECTOR to REG_SEQUENCE v2 Using REG_SEQUENCE for BUILD_VECTOR rather than a series of INSERT_SUBREG instructions should make it easier for the register allocator to coalesce unnecessary copies. v2: - Use an SGPR register class if all the operands of BUILD_VECTOR are SGPRs. llvm-svn: 188427	2013-08-14 23:24:32 +00:00
Tom Stellard	df94dc3917	R600/SI: Choose the correct MOV instruction for copying immediates The instruction selector will now try to infer the destination register so it can decide whether to use V_MOV_B32 or S_MOV_B32 when copying immediates. llvm-svn: 188426	2013-08-14 23:24:24 +00:00
Tom Stellard	16a9a205c8	R600/SI: Assign a register class to the $vaddr operand for MIMG instructions The previous code declared the operand as unknown:$vaddr, which made it possible for scalar registers to be used instead of vector registers. llvm-svn: 188425	2013-08-14 23:24:17 +00:00
Tom Stellard	3494b7ee42	R600/SI: Handle MSAA texture targets Patch by: Marek Olšák Signed-off-by: Marek Olšák <marek.olsak@amd.com> llvm-svn: 188421	2013-08-14 22:22:14 +00:00
Tom Stellard	20ee94f152	R600/SI: Allow conversion between v32i8 and v8i32 Patch by: Marek Olšák Signed-off-by: Marek Olšák <marek.olsak@amd.com> llvm-svn: 188420	2013-08-14 22:22:09 +00:00
Tom Stellard	a36f077159	R600/SI: Fix an obvious typo Patch by: Marek Olšák Signed-off-by: Marek Olšák <marek.olsak@amd.com> llvm-svn: 188419	2013-08-14 22:22:03 +00:00
Tom Stellard	73c31d541e	R600/SI: Add pattern for fp_to_uint This fixes the F2U opcode for the Mesa driver. Patch by: Marek Olšák Signed-off-by: Marek Olšák <marek.olsak@amd.com> llvm-svn: 188418	2013-08-14 22:21:57 +00:00
Tom Stellard	fc455471c3	R600: Set scheduling preference to Sched::Source R600 doesn't need to do any scheduling on the SelectionDAG now that it has a very good MachineScheduler. Also, using the VLIW SelectionDAG scheduler was having a major impact on compile times. For example with the phatk kernel here are the LLVM IR to machine code compile times: With Sched::VLIW Total Compile Time: 1.4890 Seconds (User + System) SelectionDAG Instruction Scheduling: 1.1670 Seconds (User + System) With Sched::Source Total Compile Time: 0.3330 Seconds (User + System) SelectionDAG Instruction Scheduling: 0.0070 Seconds (User + System) The code output was identical with both schedulers. This may not be true for all programs, but it gives me confidence that there won't be much reduction, if any, in code quality by using Sched::Source. llvm-svn: 188215	2013-08-12 22:33:21 +00:00
Niels Ole Salscheider	d3a039fed2	R600/SI: FMA is faster than fmul and fadd for f64 llvm-svn: 188136	2013-08-10 10:38:54 +00:00
Niels Ole Salscheider	6509ac65a9	R600/SI: Add FMA pattern llvm-svn: 188135	2013-08-10 10:38:47 +00:00
Niels Ole Salscheider	719fbc9ae7	R600/SI: Implement fp32<->fp64 conversions llvm-svn: 187988	2013-08-08 16:06:15 +00:00
Niels Ole Salscheider	4715d886f8	R600/SI: Implement sint<->fp64 conversions llvm-svn: 187987	2013-08-08 16:06:08 +00:00
Evgeniy Stepanov	bc8808ce4a	Initialize SIInsertWaits::ExpInstrTypesSeen in the pass constructor. This value may be used uninitialized in SIInsertWaits::insertWait. Found with MemorySanitizer. llvm-svn: 187869	2013-08-07 07:47:41 +00:00
Tom Stellard	f5a988b35f	R600: Add new file from r187831 to CMakeLists.txt llvm-svn: 187834	2013-08-06 23:12:34 +00:00
Tom Stellard	2f7cdda57e	R600/SI: Use VSrc_* register classes as the default classes for types Since the VSrc_* register classes contain both VGPRs and SGPRs, copies that used to be emitted by isel like this: SGPR = COPY VGPR Will now be emitted like this: VSrC = COPY VGPR This patch also adds a pass that tries to identify and fix situations where a VGPR to SGPR copy may occur. Hopefully, these changes will make it impossible for the compiler to generate illegal VGPR to SGPR copies. llvm-svn: 187831	2013-08-06 23:08:28 +00:00
Tom Stellard	4c0ffccbbf	R600/SI: Add more special cases for opcodes to ensureSRegLimit() Also factor out the register class lookup to its own function. llvm-svn: 187830	2013-08-06 23:08:18 +00:00
NAKAMURA Takumi	aaf66c7357	Target//CMakeLists.txt: Add the dependency to CommonTableGen explicitly for each corresponding CodeGen. Without explicit dependencies, both per-file action and in-CommonTableGen action could run in parallel. It races to emit .inc files simultaneously. llvm-svn: 187780	2013-08-06 06:38:37 +00:00
Tom Stellard	aa664d9b92	Factor FlattenCFG out from SimplifyCFG Patch by: Mei Ye llvm-svn: 187764	2013-08-06 02:43:45 +00:00
Tom Stellard	28d06de6f6	R600: Implement TargetLowering::getVectorIdxTy() We use MVT::i32 for the vector index type, because we use 32-bit operations to calculate offsets when dynamically indexing vectors. llvm-svn: 187749	2013-08-05 22:22:07 +00:00
Tom Stellard	0344cdfe39	R600: Add 64-bit float load/store support * Added R600_Reg64 class * Added T#Index#.XY registers definition * Added v2i32 register reads from parameter and global space * Added f32 and i32 elements extraction from v2f32 and v2i32 * Added v2i32 -> v2f32 conversions Tom Stellard: - Mark vec2 operations as expand. The addition of a vec2 register class made them all legal. Patch by: Dmitry Cherkassov Signed-off-by: Dmitry Cherkassov <dcherkassov@gmail.com> llvm-svn: 187582	2013-08-01 15:23:42 +00:00
Tom Stellard	53698938a4	R600: Use 64-bit alignment for 64-bit kernel arguments llvm-svn: 187581	2013-08-01 15:23:31 +00:00
Tom Stellard	98f675a994	R600/SI: Custom lower i64 ZERO_EXTEND llvm-svn: 187580	2013-08-01 15:23:26 +00:00
Tom Stellard	ca69a53bae	Revert "R600: Non vector only instruction can be scheduled on trans unit" This reverts commit 98ce62780ea7185ba710868bf83c8077e8d7f6d6. llvm-svn: 187526	2013-07-31 20:43:27 +00:00
Tom Stellard	4dd41845ec	Revert "R600: Use SchedModel enum for is{Trans,Vector}Only functions" This reverts commit 3f1de26cb5cc0543a6a1d71259a7a39d97139051. llvm-svn: 187524	2013-07-31 20:43:03 +00:00
Vincent Lejeune	220db748b0	R600: Do not merge vector after a vector reg is used If we merge a vector when a vector is used, it will generate an artificial antidependency that can prevent 2 tex/vtx instructions from using the same clause and thus generate extra clauses that reduce performance. There is no test case as such a situation is really hard to predict. llvm-svn: 187516	2013-07-31 19:32:12 +00:00
Vincent Lejeune	bb3f931123	R600: Avoid more than 4 literals in the same instruction group at scheduling llvm-svn: 187515	2013-07-31 19:32:07 +00:00
Vincent Lejeune	df18804e26	R600: Non vector only instruction can be scheduled on trans unit llvm-svn: 187514	2013-07-31 19:31:56 +00:00
Vincent Lejeune	21de8baa15	R600: Don't mix LDS and non-LDS instructions in the same group There are a lot of restrictions on instruction groups that contain LDS instructions, so for now we will be conservative and not packetize anything else with them. llvm-svn: 187513	2013-07-31 19:31:41 +00:00
Vincent Lejeune	79afe17e99	R600: Use SchedModel enum for is{Trans,Vector}Only functions llvm-svn: 187512	2013-07-31 19:31:35 +00:00
Vincent Lejeune	0c5ed2b437	R600: Remove predicated_break inst We were using two instructions for a similar purpose: break and predicated break. Only predicated_break was emitted and it was lowered at R600ControlFlowFinalizer to JUMP;CF_BREAK;POP. This commit simplifies the situation by making AMDILCFGStructurizer emit IF_PREDICATE;BREAK;ENDIF; instead of predicated_break (which is now removed). There is no functionality change. llvm-svn: 187510	2013-07-31 19:31:14 +00:00
Tom Stellard	aa313d0a74	R600/SI: Expand vector fp <-> int conversions llvm-svn: 187421	2013-07-30 14:31:03 +00:00
Quentin Colombet	e2e0548d77	[R600] Replicate old DAGCombiner behavior in target specific DAG combine. build_vector is lowered to REG_SEQUENCE, which is something the register allocator does a good job at optimizing. llvm-svn: 187397	2013-07-30 00:27:16 +00:00
Tom Stellard	8b1e021e85	SimplifyCFG: Use parallel-and and parallel-or mode to consolidate branch conditions Merge consecutive if-regions if they contain identical statements. Both transformations reduce number of branches. The transformation is guarded by a target-hook, and is currently enabled only for +R600, but the correctness has been tested on X86 target using a variety of CPU benchmarks. Patch by: Mei Ye llvm-svn: 187278	2013-07-27 00:01:07 +00:00
Tom Stellard	c54731aa9d	DAGCombiner: Pass the correct type to TargetLowering::isF(Abs|Neg)Free This commit also implements these functions for R600 and removes a test case that was relying on the buggy behavior. llvm-svn: 187007	2013-07-23 23:55:03 +00:00
Tom Stellard	8cb0e47c9e	R600: Treat CONSTANT_ADDRESS loads like GLOBAL_ADDRESS loads when necessary These are really the same address space in hardware. The only difference is that CONSTANT_ADDRESS uses a special cache for faster access. When we are unable to use the constant kcache for some reason (e.g. smaller types or lack of indirect addressing) then the instruction selector must use GLOBAL_ADDRESS loads instead. llvm-svn: 187006	2013-07-23 23:54:56 +00:00
Tom Stellard	5263948a7b	R600: Add support for 24-bit MAD instructions Reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 186923	2013-07-23 01:48:49 +00:00
Tom Stellard	41fc7853be	R600: Add support for 24-bit MUL instructions Reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 186922	2013-07-23 01:48:42 +00:00
Tom Stellard	9f95033d33	R600: Improve support for < 32-bit loads Reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 186921	2013-07-23 01:48:35 +00:00
Tom Stellard	ba30932908	R600: Rename AMDILISelDAGToDAG.cpp -> AMDGPUISelDAGToDAG.cpp Reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 186920	2013-07-23 01:48:29 +00:00
Tom Stellard	840214437b	R600: Move CONST_ADDRESS folding into AMDGPUDAGToDAGISel::Select() This increases the number of opportunities we have for folding. With the previous implementation we were unable to fold into any instructions other than the first when multiple instructions were selected from a single SDNode. Reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 186919	2013-07-23 01:48:24 +00:00
Tom Stellard	1e80309ebe	R600: Use KCache for kernel arguments Reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 186918	2013-07-23 01:48:18 +00:00
Tom Stellard	34ed721af4	R600: Simplify assembly for KCache registers using the TableGen !add operator Before: MOV * T0.W, KC0[131-128].Y After: MOV * T0.W, KC0[3].Y Reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 186917	2013-07-23 01:48:08 +00:00
Tom Stellard	acfeebf883	R600: Use the same compute kernel calling convention for all GPUs A side-effect of this is that now the compiler expects kernel arguments to be 4-byte aligned. Reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 186916	2013-07-23 01:48:05 +00:00
Tom Stellard	78e012969c	R600: Use correct LoadExtType when lowering kernel arguments Reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 186915	2013-07-23 01:47:58 +00:00
Tom Stellard	33dd04bfbe	R600: Clean up extended load patterns Reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 186914	2013-07-23 01:47:52 +00:00
Tom Stellard	beed74af48	R600: Expand vector FNEG llvm-svn: 186913	2013-07-23 01:47:46 +00:00
Vincent Lejeune	8b8a7b5514	R600: Don't emit empty then clause and use alu_pop_after llvm-svn: 186725	2013-07-19 21:45:15 +00:00
Vincent Lejeune	960a622ca6	R600: Simplify AMDILCFGStructurize by removing templates and assuming single exit llvm-svn: 186724	2013-07-19 21:45:06 +00:00
Vincent Lejeune	a8c38fedd6	R600: Replace legacy debug code in AMDILCFGStructurizer.cpp llvm-svn: 186723	2013-07-19 21:44:56 +00:00
Tom Stellard	8374720aad	R600/SI: Fix crash with VSELECT https://bugs.freedesktop.org/show_bug.cgi?id=66175 llvm-svn: 186616	2013-07-18 21:43:53 +00:00
Tom Stellard	adf732cfbc	R600/SI: Add support for v2f32 loads llvm-svn: 186615	2013-07-18 21:43:48 +00:00
Tom Stellard	ed2f6149f3	R600/SI: Add support for v2f32 stores llvm-svn: 186614	2013-07-18 21:43:42 +00:00
Tom Stellard	67ae4762ef	R600: Expand VSELECT for all types llvm-svn: 186613	2013-07-18 21:43:35 +00:00
Craig Topper	8fc4096fab	Move string pointer from being a static class member to just a static global in the one file its needed in. llvm-svn: 186476	2013-07-17 00:31:35 +00:00
Craig Topper	d3a34f81f8	Add 'const' qualifiers to static const char* variables. llvm-svn: 186371	2013-07-16 01:17:10 +00:00
Tom Stellard	31209cc8eb	R600/SI: Add support for 64-bit loads https://bugs.freedesktop.org/show_bug.cgi?id=65873 llvm-svn: 186339	2013-07-15 19:00:09 +00:00
Craig Topper	0afd0ab749	Make some arrays 'static const' llvm-svn: 186307	2013-07-15 06:39:13 +00:00
Craig Topper	5871321e49	Use llvm::array_lengthof to replace sizeof(array)/sizeof(array[0]). llvm-svn: 186301	2013-07-15 04:27:47 +00:00
Craig Topper	b94011fd28	Use SmallVectorImpl& instead of SmallVector to avoid repeating small vector size. llvm-svn: 186274	2013-07-14 04:42:23 +00:00
Benjamin Kramer	c22c790f89	R600: Remove unsafe type punning. No intended functionality change. llvm-svn: 186196	2013-07-12 20:18:05 +00:00
Tom Stellard	ccae60acc3	R600/SI: Add support for f64 kernel arguments Patch by: Niels Ole Salscheider Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 186182	2013-07-12 18:15:26 +00:00
Tom Stellard	4e1100ab75	R600/SI: Implement select and compares for SI Patch by: Niels Ole Salscheider Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 186181	2013-07-12 18:15:19 +00:00
Tom Stellard	8ed7b45da3	R600/SI: Add fsqrt pattern for SI Patch by: Niels Ole Salscheider Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 186180	2013-07-12 18:15:13 +00:00
Tom Stellard	2a6a610516	R600/SI: Add double precision fsub pattern for SI Patch by: Niels Ole Salscheider Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 186179	2013-07-12 18:15:08 +00:00
Tom Stellard	ab8a8c84d4	R600/SI: SI support for 64bit ConstantFP Patch by: Niels Ole Salscheider Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 186178	2013-07-12 18:15:02 +00:00
Tom Stellard	7512c0803c	R600/SI: Add initial double precision support for SI Patch by: Niels Ole Salscheider Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 186177	2013-07-12 18:14:56 +00:00
Aaron Ballman	f04bbd8b7f	Replacing an empty switch with its moral equivalent. No functional changes intended. llvm-svn: 186017	2013-07-10 17:19:22 +00:00
Michel Danzer	49812b5bbd	R600/SI: Initial local memory support Enough for the radeonsi driver to use it for calculating derivatives. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 186012	2013-07-10 16:37:07 +00:00
Michel Danzer	1f87df365f	R600/SI: Add pattern for the AMDGPU.barrier.local intrinsic lit test coverage to follow in the next commit. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 186011	2013-07-10 16:36:57 +00:00
Michel Danzer	8d69617b27	R600/SI: Add intrinsic for retrieving the current thread ID Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 186010	2013-07-10 16:36:52 +00:00
Michel Danzer	1c45430e76	R600/SI: Initial support for LDS/GDS instructions Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 186009	2013-07-10 16:36:43 +00:00
Michel Danzer	83f87c4c2e	R600/SI: Add intrinsics for texture sampling with user derivatives Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 186008	2013-07-10 16:36:36 +00:00
Vincent Lejeune	ce499744b3	R600: Do not predicated basic block with multiple alu clause Test is not included as it is several 1000 lines long. To test this functionality, a test case must generate at least 2 ALU clauses, where an ALU clause is ~110 instructions long. NOTE: This is a candidate for the stable branch. llvm-svn: 185943	2013-07-09 15:03:33 +00:00
Vincent Lejeune	b8aac8d720	R600: Fix a rare bug where swizzle optimization returns wrong values llvm-svn: 185942	2013-07-09 15:03:25 +00:00
Vincent Lejeune	a4d8d2ef2b	R600: Fix wrong export reswizzling llvm-svn: 185941	2013-07-09 15:03:19 +00:00
Vincent Lejeune	b55940cc7d	R600: Use DAG lowering pass to handle fcos/fsin NOTE: This is a candidate for the stable branch. llvm-svn: 185940	2013-07-09 15:03:11 +00:00
Vincent Lejeune	f10d1cd2a3	R600: Print Export Swizzle llvm-svn: 185939	2013-07-09 15:03:03 +00:00
Craig Topper	31ee5866de	Use SmallVectorImpl::iterator/const_iterator instead of SmallVector to avoid specifying the vector size. llvm-svn: 185540	2013-07-03 15:07:05 +00:00
Rafael Espindola	64e1af8eb9	Remove address spaces from MC. This is dead code since PIC16 was removed in 2010. The result was an odd mix, where some parts would carefully pass it along and others would assert it was zero (most of the object streamer for example). llvm-svn: 185436	2013-07-02 15:49:13 +00:00
Chad Rosier	797ee3e3c6	Add a newline. llvm-svn: 185385	2013-07-01 21:31:10 +00:00
Vincent Lejeune	a8a50248d8	R600: Fix an uninitialized variable in R600InstrInfo.cpp llvm-svn: 185294	2013-06-30 21:44:06 +00:00
Benjamin Kramer	396906456f	R600: Unbreak GCC build. operator++ on an enum is not legal. clang happens to accept it anyways, I think that's a known bug. llvm-svn: 185269	2013-06-29 20:04:19 +00:00
Vincent Lejeune	77a8352476	R600: Support schedule and packetization of trans-only inst llvm-svn: 185268	2013-06-29 19:32:43 +00:00
Vincent Lejeune	bb8a872158	R600: Bank Swizzle now display SCL equivalent llvm-svn: 185267	2013-06-29 19:32:29 +00:00
Tom Stellard	c46e56721e	R600/SI: Add processor types for each CIK variant Patch By: Alex Deucher Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> llvm-svn: 185209	2013-06-28 20:23:29 +00:00
Tom Stellard	c026e8bc8e	R600: Add local memory support via LDS Reviewed-by: Vincent Lejeune<vljn at ovi.com> llvm-svn: 185162	2013-06-28 15:47:08 +00:00
Tom Stellard	ce540330df	R600: Add support for GROUP_BARRIER instruction Reviewed-by: Vincent Lejeune<vljn at ovi.com> llvm-svn: 185161	2013-06-28 15:46:59 +00:00
Tom Stellard	5eb903d9c5	R600: Add ALUInst bit to tablegen definitions v2 v2: - Remove functions left over from a previous rebase. Reviewed-by: Vincent Lejeune<vljn at ovi.com> llvm-svn: 185160	2013-06-28 15:46:53 +00:00
Tom Stellard	02661d9605	R600: Use new getNamedOperandIdx function generated by TableGen llvm-svn: 184880	2013-06-25 21:22:18 +00:00
Aaron Watry	0a794a4612	R600: Consolidate expansion of v2i32/v4i32 ops for EG/SI By default, we expand these operations for both EG and SI. Move the duplicated code into a common space for now. If the targets ever actually implement these operations as instructions, we can override that in the relevant target. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 184848	2013-06-25 13:55:57 +00:00
Aaron Watry	daabb20e1b	R600/SI: Expand xor v2i32/v4i32 Add test cases for both vector sizes on SI and also add v2i32 test for EG. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 184846	2013-06-25 13:55:52 +00:00
Aaron Watry	83fa6006bc	R600/SI: Expand urem of v2i32/v4i32 for SI Also add lit test for both cases on SI, and v2i32 for evergreen. Note: I followed the guidance of the v4i32 EG check... UREM produces really complex code, so let's just check that the instruction was lowered successfully. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 184844	2013-06-25 13:55:46 +00:00
Aaron Watry	5527b6c6b6	R600/SI: Expand udiv v[24]i32 for SI and v2i32 for EG Also add lit test for both cases on SI, and v2i32 for evergreen. Note: I followed the guidance of the v4i32 EG check... UDIV produces really complex code, so let's just check that the instruction was lowered successfully. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 184843	2013-06-25 13:55:43 +00:00
Aaron Watry	16d80c0529	R600/SI: Expand ashr of v2i32/v4i32 for SI Also add lit test for both cases on SI, and v2i32 for evergreen. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 184842	2013-06-25 13:55:40 +00:00
Aaron Watry	f63791e778	R600/SI: Expand srl of v2i32/v4i32 for SI Also add lit test for both cases on SI, and v2i32 for evergreen. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 184841	2013-06-25 13:55:37 +00:00
Aaron Watry	5584553984	R600/SI: Expand shl of v2i32/v4i32 for SI Also add lit test for both cases on SI, and v2i32 for evergreen. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 184840	2013-06-25 13:55:32 +00:00
Aaron Watry	2fa162e88e	R600/SI: Expand or of v2i32/v4i32 for SI Also add lit test for both cases on SI, and v2i32 for evergreen. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 184839	2013-06-25 13:55:29 +00:00
Aaron Watry	265eef5efe	R600/SI: Expand mul of v2i32/v4i32 for SI Also add lit test for both cases on SI, and v2i32 for evergreen. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 184838	2013-06-25 13:55:26 +00:00
Aaron Watry	00aeb119db	R600/SI: Expand and of v2i32/v4i32 for SI Also add lit test for both cases on SI, and v2i32 for evergreen. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 184837	2013-06-25 13:55:23 +00:00
Tom Stellard	0125f2a6e4	R600/SI: Report unaligned memory accesses as legal for > 32-bit types In reality, some unaligned memory accesses are legal for 32-bit types and smaller too, but it all depends on the address space. Allowing unaligned loads/stores for > 32-bit types is mainly to prevent the legalizer from splitting one load into multiple loads of smaller types. https://bugs.freedesktop.org/show_bug.cgi?id=65873 llvm-svn: 184822	2013-06-25 02:39:35 +00:00
Tom Stellard	9810ec613c	R600: Add support for i32 loads from the constant address space on Cayman Tested-By: Aaron Watry <awatry@gmail.com> llvm-svn: 184821	2013-06-25 02:39:30 +00:00
Tom Stellard	b06f3fc1be	R600/SI: Add support for v4i32 and v4f32 kernel args Tested-By: Aaron Watry <awatry@gmail.com> llvm-svn: 184820	2013-06-25 02:39:25 +00:00
Tom Stellard	9d2e1500b4	R600: Fix typo in R600Schedule.td This should only make a difference in programs that use a lot of the vector ALU instructions like BFI_INT and BIT_ALIGN. There is a slight improvement in the phatk bitcoin mining kernel with this patch on Evergreen (vector size == 1): Before: 1173 Instruction Groups / 9520 dwords After: 1167 Instruction Groups / 9510 dwords Reviewed-by: Vincent Lejeune<vljn at ovi.com> llvm-svn: 184819	2013-06-25 02:39:20 +00:00
Aaron Watry	52a72c926c	R600: Fix spelling error in comment our -> or llvm-svn: 184756	2013-06-24 16:57:57 +00:00
Tom Stellard	96d38760fc	R600/SI: Expand sub for v2i32 and v4i32 for SI Also add a v2i32 test to the existing v4i32 test. Patch by: Aaron Watry Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Aaron Watry<awatry@gmail.com> llvm-svn: 184482	2013-06-20 21:55:37 +00:00
Tom Stellard	043795e818	R600/SI: Expand add for v2i32 and v4i32 Also add SI tests to existing file and a v2i32 test for both R600 and SI. Patch by: Aaron Watry Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Aaron Watry <awatry@gmail.com> llvm-svn: 184481	2013-06-20 21:55:30 +00:00
Tom Stellard	6ec9e8043c	R600: Expand v2i32 load/store instead of custom lowering The custom lowering causes llc to crash with a segfault. Ideally, the custom lowering can be fixed, but this allows programs which load/store v2i32 to work without crashing. Patch by: Aaron Watry Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Aaron Watry<awatry@gmail.com> llvm-svn: 184480	2013-06-20 21:55:23 +00:00
Bill Wendling	a3cd350249	Access the TargetLoweringInfo from the TargetMachine object instead of caching it. The TLI may change between functions. No functionality change. llvm-svn: 184360	2013-06-19 21:36:55 +00:00
Matt Arsenault	d46fce1141	Move StructurizeCFG out of R600 to generic Transforms. Register it with PassManager llvm-svn: 184343	2013-06-19 20:18:24 +00:00
Matt Arsenault	2aabb06175	Use GetUnderlyingObject instead of custom function llvm-svn: 184261	2013-06-18 23:37:58 +00:00
Bill Wendling	b7b1681157	Remove dead prototype. llvm-svn: 184173	2013-06-18 06:24:14 +00:00
Vincent Lejeune	41d4cf26b4	R600: PV stores Reg id, not index llvm-svn: 184117	2013-06-17 20:16:40 +00:00
Vincent Lejeune	8bd10421ec	R600: Properly set COUNT_3 bit in TEX clause initiating inst for pre EG gen. Fixes rv7x0 bug in Heaven reported here: https://bugs.freedesktop.org/show_bug.cgi?id=64257 llvm-svn: 184116	2013-06-17 20:16:26 +00:00
Tom Stellard	371573448c	R600: Add SI load support for v[24]i32 and store for v2i32 Also add a separate vector lit test file, since r600 doesn't seem to handle v2i32 load/store yet, but we can test both for SI. Patch by: Aaron Watry Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Aaron Watry <awatry@gmail.com> llvm-svn: 184021	2013-06-15 00:09:31 +00:00
Tom Stellard	ecf9d86404	R600: Use correct encoding for Vertex Fetch instructions on Cayman Reviewed-by: Vincent Lejeune<vljn at ovi.com> llvm-svn: 184016	2013-06-14 22:12:30 +00:00
Tom Stellard	6aa0d5578d	R600: Use EXPORT_RAT_INST_STORE_DWORD for stores on Cayman We were using RAT_INST_STORE_RAW, which seemed to work, but the docs say this instruction doesn't exist for Cayman, so it's probably safer to use a documented instruction instead. Reviewed-by: Vincent Lejeune<vljn at ovi.com> llvm-svn: 184015	2013-06-14 22:12:24 +00:00
Tom Stellard	d99b7932ae	R600: Factor the instruction encoding out the RAT_WRITE_CACHELESS_eg class Reviewed-by: Vincent Lejeune<vljn at ovi.com> llvm-svn: 184014	2013-06-14 22:12:19 +00:00
Tom Stellard	3d0823f1cd	R600: Move instruction encoding definitions into a separate .td file Reviewed-by: Vincent Lejeune<vljn at ovi.com> llvm-svn: 184013	2013-06-14 22:12:09 +00:00
Tom Stellard	adba083bc2	R600: Don't try to fix reg class when copying IMPLICIT_DEF to a register The test case for this is way too complex to be useful as a lit test, and I was unable to reduce it. https://bugs.freedesktop.org/show_bug.cgi?id=65438 llvm-svn: 183937	2013-06-13 20:14:00 +00:00
Benjamin Kramer	193960c822	R600: Make helper functions static. llvm-svn: 183744	2013-06-11 13:32:25 +00:00
Vincent Lejeune	d1a9d18120	R600: Use a refined heuristic to choose when switching clause This is using a hint from AMD APP OpenCL Programming Guide with empirically tweaked parameters. I used Unigine Heaven 3.0 to determine best parameters on my system (i7 2600/Radeon 6950/Kernel 3.9.4) the benchmark : it went from 38.8 average fps to 39.6, which is ~3% gain. (Lightmark 2008.2 gain is much more marginal: from 537 to 539) There is no lit test provided as the parameters were determined empirically and it would be nearly impossible to find a test program that checks for optimal behavior. llvm-svn: 183593	2013-06-07 23:30:34 +00:00
Vincent Lejeune	4d143328df	R600: Anti dep better handled in tex clause llvm-svn: 183592	2013-06-07 23:30:26 +00:00
Tom Stellard	d74583777f	R600: Fix calculation of stack offset in AMDGPUFrameLowering We weren't computing structure size correctly and we were relying on the original alloca instruction to compute the offset, which isn't always reliable. Reviewed-by: Vincent Lejeune <vljn@ovi.com> llvm-svn: 183568	2013-06-07 20:52:05 +00:00
Tom Stellard	a6c6e1bfc2	R600: Rework subtarget info and remove AMDILDevice classes This should simplify the subtarget definitions and make it easier to add new ones. Reviewed-by: Vincent Lejeune <vljn@ovi.com> llvm-svn: 183566	2013-06-07 20:37:48 +00:00
Bill Wendling	37e9adb091	Don't cache the instruction and register info from the TargetMachine, because the internals of TargetMachine could change. No functionality change intended. llvm-svn: 183561	2013-06-07 20:28:55 +00:00
Tom Stellard	3498e4ff1d	R600: Fix the fetch limits for R600 generation GPUs Reviewed-by: Vincent Lejeune <vljn@ovi.com> https://bugs.freedesktop.org/show_bug.cgi?id=64257 llvm-svn: 183560	2013-06-07 20:28:55 +00:00
Tom Stellard	99792774a4	R600: Move Subtarget feature definitions into AMDGPU.td This is the convention used by the other targets. Reviewed-by: Vincent Lejeune <vljn@ovi.com> llvm-svn: 183559	2013-06-07 20:28:49 +00:00
Tom Stellard	b0804ec2ad	R600: Remove unnecessary include Reviewed-by: Vincent Lejeune <vljn@ovi.com> llvm-svn: 183558	2013-06-07 20:28:43 +00:00
Benjamin Kramer	705d841bb6	R600: Don't compare iterators of different maps. Found by libstdc's debug mode. llvm-svn: 183549	2013-06-07 19:59:34 +00:00
Benjamin Kramer	ebe0be9ca4	Vincent says the element is at most once in the vector, so we don't need a full std::remove. llvm-svn: 183541	2013-06-07 18:18:12 +00:00
Benjamin Kramer	a857fe115b	R600: Fix a potential iterator invalidation issue. As a bonus this reduces the loop from O(n^2) to O(n). llvm-svn: 183532	2013-06-07 16:13:49 +00:00
Vincent Lejeune	931bb768fd	R600: Remove an extra break in R600OptimizeVectorRegisters.cpp llvm-svn: 183528	2013-06-07 15:44:53 +00:00
Vincent Lejeune	0030362ed9	R600: Rewrite an awkward loop in R600MachineScheduler llvm-svn: 183458	2013-06-06 23:08:32 +00:00
Vincent Lejeune	54476a1503	R600: Remove leftover code in R600MachineScheduler.cpp Spotted by Benjamin Kramer. llvm-svn: 183413	2013-06-06 14:18:29 +00:00
Bill Wendling	b91216817f	Cast to the correct type. Pointer, not reference. llvm-svn: 183385	2013-06-06 05:39:29 +00:00
NAKAMURA Takumi	4a8f079371	R600OptimizeVectorRegisters.cpp: Tweak a warning. [-Wsometimes-uninitialized] FIXME: Is it false alarm? llvm-svn: 183371	2013-06-06 02:15:12 +00:00
NAKAMURA Takumi	e5555fc238	R600OptimizeVectorRegisters.cpp: Suppress a warning. [-Wunused-variable] llvm-svn: 183370	2013-06-06 02:15:06 +00:00
NAKAMURA Takumi	372574d447	Trailing linefeed. llvm-svn: 183369	2013-06-06 02:15:00 +00:00
Bill Wendling	e410576865	Cast to the proper type. llvm-svn: 183365	2013-06-06 01:04:21 +00:00
Tom Stellard	acec99c948	R600: Replace predicate loop with predicate function llvm-svn: 183351	2013-06-05 23:39:50 +00:00
Vincent Lejeune	dec1875207	R600: Add a pass that merge Vector Register Previously committed @183279 but tests were failing, reverted @183286 It was broken because @183336 was missing, now it's there. llvm-svn: 183343	2013-06-05 21:38:04 +00:00
Vincent Lejeune	4b5b849753	R600: Schedule copy from phys register at beginning of block It allows regalloc pass to remove them by trivially assigning associated reg llvm-svn: 183336	2013-06-05 20:27:35 +00:00
Tom Stellard	aad5376fb6	R600: Make sure to schedule AR register uses and defs in the same clause Reviewed-by: vljn at ovi.com llvm-svn: 183294	2013-06-05 03:43:06 +00:00
Rafael Espindola	beef23fe21	Revert "R600: Add a pass that merge Vector Register" This reverts commit r183279. CodeGen/R600/texture-input-merge.ll was failing. llvm-svn: 183286	2013-06-05 01:48:30 +00:00
Vincent Lejeune	a45aafabfe	R600: Add a pass that merge Vector Register llvm-svn: 183279	2013-06-04 23:17:26 +00:00
Vincent Lejeune	c689679173	R600: Const/Neg/Abs can be folded to dot4 llvm-svn: 183278	2013-06-04 23:17:15 +00:00
Vincent Lejeune	276ceb8d5f	R600: Swizzle texture/export instructions llvm-svn: 183229	2013-06-04 15:04:53 +00:00
Aaron Ballman	19978553d4	Silencing an MSVC warning about mixing bool and unsigned int. llvm-svn: 183176	2013-06-04 01:03:03 +00:00
Tom Stellard	94593ee8c3	R600/SI: Add support for work item and work group intrinsics llvm-svn: 183138	2013-06-03 17:40:18 +00:00
Tom Stellard	ed882c2f1b	R600/SI: Add a calling convention for compute shaders llvm-svn: 183137	2013-06-03 17:40:11 +00:00
Tom Stellard	046039e81b	R600/SI: Custom lower i64 sign_extend llvm-svn: 183136	2013-06-03 17:40:03 +00:00
Tom Stellard	0518ff89ba	R600/SI: Adjust some instructions' out register class after ISel This is necessary to avoid generating VGPR to SGPR copies in some cases. llvm-svn: 183135	2013-06-03 17:39:58 +00:00
Tom Stellard	bad1f59212	R600/SI: Handle REG_SEQUENCE in fitsRegClass() llvm-svn: 183134	2013-06-03 17:39:54 +00:00
Tom Stellard	b5a97004fb	R600/SI: Handle nodes with glue results correctly SITargetLowering::foldOperands() llvm-svn: 183133	2013-06-03 17:39:50 +00:00
Tom Stellard	2183b70523	R600/SI: Fixup CopyToReg register class in PostprocessISelDAG() The CopyToReg nodes will sometimes try to copy a value from a VGPR to an SGPR. This kind of copy is not possible, so we need to detect VGPR->SGPR copies and do something else. The current strategy is to replace these copies with VGPR->VGPR copies and hope that all the users of CopyToReg can accept VGPRs as arguments. llvm-svn: 183132	2013-06-03 17:39:46 +00:00
Tom Stellard	07a10a3d3f	R600/SI: Add support for global loads llvm-svn: 183131	2013-06-03 17:39:43 +00:00
Tom Stellard	556d9aa841	R600/SI: Rework MUBUF store instructions The lowering of stores is now mostly handled in the tablegen files. No more BUFFER_STORE nodes I generated during legalization. llvm-svn: 183130	2013-06-03 17:39:37 +00:00
Vincent Lejeune	91a942b93e	R600: 3 op instructions have no write bit but the result are store in PV llvm-svn: 183111	2013-06-03 15:56:12 +00:00

1047 Commits