llvm-project

Author	SHA1	Message	Date
Vincent Lejeune	4ee6dd6136	R600/SI: Add SinkingPass before ISel llvm-svn: 192556	2013-10-13 17:56:21 +00:00
Tom Stellard	ed0ceec1c1	R600: Use StructurizeCFGPass for non SI targets The StructurizeCFG pass makes complex CFGs reducible; it allows many shaders from shadertoy (which exhibit complex control flow constructs) to work correctly with respect to CFG handling (and allows us to detect potential bugs in other parts of the backend). We provide a command-line argument to disable the pass for debugging purposes. Patch by: Vincent Lejeune Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 192363	2013-10-10 17:11:12 +00:00
Vincent Lejeune	a4da6fb535	R600: add a pass that merges clauses. llvm-svn: 191790	2013-10-01 19:32:58 +00:00
Andrew Trick	978674b2bc	Allow subtarget selection of the default MachineScheduler and document the interface. The global registry is used to allow command line override of the scheduler selection, but does not work well as the normal selection API. For example, the same LLVM process should be able to target multiple targets or subtargets. llvm-svn: 191071	2013-09-20 05:14:41 +00:00
Tom Stellard	9fa1791a1b	R600/SI: Convert v16i8 resource descriptors to i128 Now that compute support is better on SI, we can't continue using v16i8 for descriptors since this is also a legal type in OpenCL. This patch fixes numerous hangs with the piglit OpenCL test and since we now use a target specific DAG node for LOAD_CONSTANT with the correct MemOperandFlags, this should also fix: https://bugs.freedesktop.org/show_bug.cgi?id=66805 llvm-svn: 188429	2013-08-14 23:24:45 +00:00
Tom Stellard	2f7cdda57e	R600/SI: Use VSrc_* register classes as the default classes for types Since the VSrc_* register classes contain both VGPRs and SGPRs, copies that used be emitted by isel like this: SGPR = COPY VGPR Will now be emitted like this: VSrC = COPY VGPR This patch also adds a pass that tries to identify and fix situations where a VGPR to SGPR copy may occur. Hopefully, these changes will make it impossible for the compiler to generate illegal VGPR to SGPR copies. llvm-svn: 187831	2013-08-06 23:08:28 +00:00
Tom Stellard	aa664d9b92	Factor FlattenCFG out from SimplifyCFG Patch by: Mei Ye llvm-svn: 187764	2013-08-06 02:43:45 +00:00
Tom Stellard	8b1e021e85	SimplifyCFG: Use parallel-and and parallel-or mode to consolidate branch conditions Merge consecutive if-regions if they contain identical statements. Both transformations reduce number of branches. The transformation is guarded by a target-hook, and is currently enabled only for +R600, but the correctness has been tested on X86 target using a variety of CPU benchmarks. Patch by: Mei Ye llvm-svn: 187278	2013-07-27 00:01:07 +00:00
Vincent Lejeune	960a622ca6	R600: Simplify AMDILCFGStructurize by removing templates and assuming single exit llvm-svn: 186724	2013-07-19 21:45:06 +00:00
Vincent Lejeune	ce499744b3	R600: Do not predicated basic block with multiple alu clause Test is not included as it is several thousand lines long. To test this functionality, a test case must generate at least 2 ALU clauses, where an ALU clause is ~110 instructions long. NOTE: This is a candidate for the stable branch. llvm-svn: 185943	2013-07-09 15:03:33 +00:00
Matt Arsenault	d46fce1141	Move StructurizeCFG out of R600 to generic Transforms. Register it with PassManager llvm-svn: 184343	2013-06-19 20:18:24 +00:00
Tom Stellard	a6c6e1bfc2	R600: Rework subtarget info and remove AMDILDevice classes This should simplify the subtarget definitions and make it easier to add new ones. Reviewed-by: Vincent Lejeune <vljn@ovi.com> llvm-svn: 183566	2013-06-07 20:37:48 +00:00
Vincent Lejeune	dec1875207	R600: Add a pass that merge Vector Register Previously committed @183279 but tests were failing, reverted @183286. It was broken because @183336 was missing; now it's there. llvm-svn: 183343	2013-06-05 21:38:04 +00:00
Rafael Espindola	beef23fe21	Revert "R600: Add a pass that merge Vector Register" This reverts commit r183279. CodeGen/R600/texture-input-merge.ll was failing. llvm-svn: 183286	2013-06-05 01:48:30 +00:00
Vincent Lejeune	a45aafabfe	R600: Add a pass that merge Vector Register llvm-svn: 183279	2013-06-04 23:17:26 +00:00
Rafael Espindola	39aca620db	Fix a leak on the r600 backend. This should bring the valgrind bot back to life. llvm-svn: 182561	2013-05-23 03:31:47 +00:00
Vincent Lejeune	d3eed66e8c	R600: Improve texture handling llvm-svn: 182125	2013-05-17 16:50:20 +00:00
Rafael Espindola	227144c23c	Remove the MachineMove class. It was just a less powerful and more confusing version of MCCFIInstruction. A side effect is that, since MCCFIInstruction uses dwarf register numbers, calls to getDwarfRegNum are pushed out, which should allow further simplifications. I left the MachineModuleInfo::addFrameMove interface unchanged since this patch was already fairly big. llvm-svn: 181680	2013-05-13 01:16:13 +00:00
Tom Stellard	2b971eb0d0	R600: Remove AMDILPeeopholeOptimizer and replace optimizations with tablegen patterns The BFE optimization was the only one we were actually using, and it was emitting an intrinsic that we don't support. https://bugs.freedesktop.org/show_bug.cgi?id=64201 Reviewed-by: Christian König <christian.koenig@amd.com> NOTE: This is a candidate for the 3.3 branch. llvm-svn: 181580	2013-05-10 02:09:45 +00:00
Vincent Lejeune	147700b8b4	R600: Packetize instructions llvm-svn: 180760	2013-04-30 00:14:27 +00:00
Vincent Lejeune	bfaa63a6db	R600: Add support for native control flow llvm-svn: 178505	2013-04-01 21:48:05 +00:00
Vincent Lejeune	f43bc57b66	R600: Emit CF_ALU and use true kcache register. llvm-svn: 178503	2013-04-01 21:47:42 +00:00
Christian Konig	99ee0f4790	R600/SI: rework input interpolation v2 v2: update CMakeLists.txt as well Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 176626	2013-03-07 09:04:14 +00:00
Vincent Lejeune	68b6b6ddfb	R600: initial scheduler code This is a skeleton for a pre-RA MachineInstr scheduler strategy. Currently it only tries to expose more parallelism for ALU instructions (this also makes the distribution of GPR channels more uniform and increases the chances of ALU instructions being packed together in a single VLIW group). It also tries to reduce clause switching by grouping instructions of the same kind (ALU/FETCH/CF) together. Vincent Lejeune: - Support for VLIW4 Slot assignment - Recomputation of ScheduleDAG to get more parallelism opportunities Tom Stellard: - Fix assertion failure when trying to determine an instruction's slot based on its destination register's class - Fix some compiler warnings Vincent Lejeune: [v2] - Remove recomputation of ScheduleDAG (will be provided in a later patch) - Improve estimation of an ALU clause size so that heuristic does not emit cf instructions at the wrong position. - Make schedule heuristic smarter using SUnit Depth - Take constant read limitations into account Vincent Lejeune: [v3] - Fix some uninitialized values in ConstPair - Add asserts to ensure an ALU slot is always populated llvm-svn: 176498	2013-03-05 18:41:32 +00:00
Vincent Lejeune	0b72f1021d	R600: Remove LowerConstCopyPass and lower CONST_COPY right after ISel. Maintaining CONST_COPY Instructions until Pre Emit may prevent some ifcvt case and taking them in account for scheduling is difficult for no real benefit. llvm-svn: 176488	2013-03-05 15:04:55 +00:00
Christian Konig	c756cb9901	R600/SI: cleanup literal handling v3 Seems to be a lot simpler, and also paves the way for further improvements. v2: rebased on master, use 0 in BUFFER_LOAD_FORMAT_XYZW, use VGPR0 in dummy EXP, avoid compiler warning, break after encoding the first literal. v3: correctly use V_ADD_F32_e64 This is a candidate for the stable branch. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 175354	2013-02-16 11:28:22 +00:00
Tom Stellard	f3b2a1e8b3	R600: Support for indirect addressing v4 Only implemented for R600 so far. SI is missing implementations of a few callbacks used by the Indirect Addressing pass and needs code to handle frame indices. At the moment R600 only supports array sizes of 16 dwords or less. Register packing of vector types is currently disabled, which means that a vec4 is stored in T0_X, T1_X, T2_X, T3_X, rather than T0_XYZW. In order to correctly pack registers in all cases, we will need to implement an analysis pass for R600 that determines the correct vector width for each array. v2: - Add support for i8 zext load from stack. - Coding style fixes v3: - Don't reserve registers for indirect addressing when it isn't being used. - Fix bug caused by LLVM limiting the number of SubRegIndex declarations. v4: - Fix 64-bit defines llvm-svn: 174525	2013-02-06 17:32:29 +00:00
Tom Stellard	df063e617f	R600: Fold remaining CONST_COPY after expand pseudo inst Patch by: Vincent Lejeune Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 174395	2013-02-05 17:09:16 +00:00
Tom Stellard	365366f9ef	R600: rework handling of the constants Remove Cxxx registers, add new special register - "ALU_CONST" and new operand for each alu src - "sel". ALU_CONST is used to designate that the new operand contains the value to override src.sel, src.kc_bank, src.chan for constants in the driver. Patch by: Vadim Girlin Vincent Lejeune: - Use pointers for constants - Fold CONST_ADDRESS when possible Tom Stellard: - Give CONSTANT_BUFFER_0 its own address space - Use integer types for constant loads Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 173222	2013-01-23 02:09:06 +00:00
Tom Stellard	c4cabef782	R600: Proper insert S_WAITCNT instructions Some instructions like memory reads/writes are executed asynchronously, so we need to insert S_WAITCNT instructions to block before accessing their results. Previously we just inserted S_WAITCNT instructions after each async instruction; this patch fixes this and adds a proper insertion pass. Patch by: Christian König Tested-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Christian König <deathsimple@vodafone.de> llvm-svn: 172846	2013-01-18 21:15:53 +00:00
Tom Stellard	f8794354b2	R600: New control flow for SI v2 This patch replaces the control flow handling with a new pass which structurizes the graph before transforming it to machine instructions. This has a couple of different advantages and currently fixes 20 piglit tests without a single regression. It is now a general purpose transformation that could be used not only for SI/R6xx, but also for other hardware implementations that use a form of structurized control flow. v2: further cleanup, fixes and documentation Patch by: Christian König Signed-off-by: Christian König <deathsimple@vodafone.de> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Tested-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 170591	2012-12-19 22:10:31 +00:00
Tom Stellard	75aadc2813	Add R600 backend A new backend supporting AMD GPUs: Radeon HD2XXX - HD7XXX llvm-svn: 169915	2012-12-11 21:25:42 +00:00

32 Commits