llvm-project

Commit Graph

Author	SHA1	Message	Date
Vincent Lejeune	519f21eed3	R600: Relax some vector constraints on Dot4. Dot4 now uses 8 scalar operands instead of 2 vector ones, which allows the register coalescer to remove some unneeded COPYs. This patch also defines some structures/functions that can be used to handle all vector instructions (CUBE, Cayman special instructions...) in a similar fashion. llvm-svn: 182126	2013-05-17 16:50:32 +00:00
Vincent Lejeune	d3eed66e8c	R600: Improve texture handling llvm-svn: 182125	2013-05-17 16:50:20 +00:00
Vincent Lejeune	4ebef18ab5	R600: Rename 128 bit registers. Almost all instructions that take a 128-bit reg as input (fetch, export...) have the ability to swizzle their argument and output. Instead of printing the default swizzle for each 128-bit reg, rename T.XYZW to T and let instructions print potentially optimized swizzles themselves. llvm-svn: 182124	2013-05-17 16:50:09 +00:00
Vincent Lejeune	709e01688d	R600: prettier dump of clamp llvm-svn: 182121	2013-05-17 16:49:49 +00:00
Tom Stellard	2b971eb0d0	R600: Remove AMDILPeepholeOptimizer and replace optimizations with tablegen patterns The BFE optimization was the only one we were actually using, and it was emitting an intrinsic that we don't support. https://bugs.freedesktop.org/show_bug.cgi?id=64201 Reviewed-by: Christian König <christian.koenig@amd.com> NOTE: This is a candidate for the 3.3 branch. llvm-svn: 181580	2013-05-10 02:09:45 +00:00
Tom Stellard	6a6ecedcb7	R600: BFI_INT is a vector-only instruction llvm-svn: 181034	2013-05-03 17:21:24 +00:00
Tom Stellard	eac65dde30	R600: Add pattern for SHA-256 Ma function This can be optimized using the BFI_INT instruction. llvm-svn: 181033	2013-05-03 17:21:20 +00:00
Vincent Lejeune	b0422e24a9	R600: Improve asmPrint of ALU clause llvm-svn: 180957	2013-05-02 21:52:40 +00:00
Vincent Lejeune	f97af796a9	R600: Prettier asmPrint of Alu llvm-svn: 180956	2013-05-02 21:52:30 +00:00
Tom Stellard	40b7f1f6c3	R600: Use new tablegen syntax for patterns All but two patterns have been converted to the new syntax. The remaining two patterns will require COPY_TO_REGCLASS instructions, which the VLIW DAG Scheduler cannot handle. llvm-svn: 180922	2013-05-02 15:30:12 +00:00
Vincent Lejeune	3abdbf1cad	R600: use native for alu llvm-svn: 180761	2013-04-30 00:14:38 +00:00
Vincent Lejeune	076c0b28e3	R600: Rework Scheduling to handle difference between VLIW4 and VLIW5 chips llvm-svn: 180759	2013-04-30 00:14:17 +00:00
Vincent Lejeune	22c4248213	R600: Add a Bank Swizzle operand llvm-svn: 180758	2013-04-30 00:14:08 +00:00
Vincent Lejeune	3f1d136b02	R600: Turn TEX/VTX into native instructions llvm-svn: 180756	2013-04-30 00:13:53 +00:00
Vincent Lejeune	c299164284	R600: Add FetchInst bit to instruction defs to denote vertex/tex instructions v2[Vincent Lejeune]: Split FetchInst into usesTextureCache/usesVertexCache llvm-svn: 180755	2013-04-30 00:13:39 +00:00
Vincent Lejeune	f501ea298b	R600: Clean up instruction class definitions llvm-svn: 180752	2013-04-30 00:13:20 +00:00
Tom Stellard	8367067e02	R600: Fix encoding of CF_END_{EG, R600} instructions The EOP bit was not being encoded. llvm-svn: 180734	2013-04-29 22:23:54 +00:00
Vincent Lejeune	117f075f6e	R600: Use .AMDGPU.config section to emit stacksize llvm-svn: 180124	2013-04-23 17:34:12 +00:00
Vincent Lejeune	b6bfe85a07	R600: Add CF_END llvm-svn: 180123	2013-04-23 17:34:00 +00:00
Tom Stellard	9d10c4ce86	R600: Add pattern for the BFI_INT instruction llvm-svn: 179830	2013-04-19 02:11:06 +00:00
Vincent Lejeune	2d5c341cee	R600: Make Export Instruction not duplicable llvm-svn: 179686	2013-04-17 15:17:39 +00:00
Vincent Lejeune	218093e834	R600: Export is emitted as a CF_NATIVE inst llvm-svn: 179685	2013-04-17 15:17:32 +00:00
Michel Danzer	8caa904bde	R600/SI: Add pattern for AMDGPUurecip 21 more little piglits with radeonsi. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 179186	2013-04-10 17:17:56 +00:00
Vincent Lejeune	5f11dd390a	R600: Control Flow support for pre EG gen llvm-svn: 179020	2013-04-08 13:05:49 +00:00
Vincent Lejeune	bfaa63a6db	R600: Add support for native control flow llvm-svn: 178505	2013-04-01 21:48:05 +00:00
Vincent Lejeune	f43bc57b66	R600: Emit CF_ALU and use true kcache register. llvm-svn: 178503	2013-04-01 21:47:42 +00:00
Vincent Lejeune	53f3525d35	R600: Emit native instructions for tex llvm-svn: 178452	2013-03-31 19:33:04 +00:00
Michel Danzer	a2e28156b4	R600: Use legacy (0 * anything = 0) MUL instructions for pow intrinsics Fixes wrong lighting in some corner cases with r600g and radeonsi, e.g. manifested by failure of two piglit/glean tests and intermittent black patches in many apps. Tested on SI and RS880. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62012 [radeonsi] Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=58150 [r600g] NOTE: This is a candidate for the Mesa stable branch. Reviewed-by: Christian König <christian.koenig@amd.com> llvm-svn: 177730	2013-03-22 14:09:10 +00:00
Christian Konig	4a1b9c3bb9	R600/SI: add float vector types Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 177276	2013-03-18 11:34:10 +00:00
Vincent Lejeune	e5ecf10a02	R600: Fix JUMP handling so that MachineInstr verification can occur This allows R600 Target to use the newly created -verify-misched llc flag llvm-svn: 176819	2013-03-11 18:15:06 +00:00
Tom Stellard	2add82de09	R600: Improve custom lowering of select_cc Two changes: 1. Prefer SET* instructions when possible 2. Handle the CND*_INT case with floating-point args Reviewed-by: Christian König <christian.koenig@amd.com> llvm-svn: 176699	2013-03-08 15:37:09 +00:00
Vincent Lejeune	0b72f1021d	R600: Remove LowerConstCopyPass and lower CONST_COPY right after ISel. Maintaining CONST_COPY instructions until Pre Emit may prevent some ifcvt cases, and taking them into account for scheduling is difficult for no real benefit. llvm-svn: 176488	2013-03-05 15:04:55 +00:00
Vincent Lejeune	10a5e4773e	R600: CONST_ADDRESS node is not marked as mayLoad anymore Reviewed-by: Tom Stellard <thomas.stellard at amd.com> mayLoad complicates scheduling and does not bring any useful info as the location is not writeable at all. llvm-svn: 176486	2013-03-05 15:04:42 +00:00
Vincent Lejeune	a199d01e4d	R600: Use MUL_IEEE for trig/fdiv intrinsic Reviewed-by: Tom Stellard <thomas.stellard at amd.com> llvm-svn: 176485	2013-03-05 15:04:37 +00:00
Vincent Lejeune	743dca0446	R600: Add support for indirect addressing of non default const buffer NOTE: This is a candidate for the Mesa stable branch. llvm-svn: 176484	2013-03-05 15:04:29 +00:00
Tom Stellard	0d171c8877	R600: Fix for Unigine when MachineSched is enabled Fixes for-loop.cl piglit test Patch By: Vincent Lejeune Reviewed-by: Tom Stellard <thomas.stellard@amd.com> NOTE: This is a candidate for the Mesa stable branch. llvm-svn: 175742	2013-02-21 15:06:59 +00:00
Vincent Lejeune	1ce13f553e	R600/SI: Use MULADD_IEEE/V_MAD_F32 instruction for mad pattern llvm-svn: 175446	2013-02-18 14:11:28 +00:00
Vincent Lejeune	685018009b	R600: Support for TBO NOTE: This is a candidate for the Mesa stable branch. Reviewed-by: Tom Stellard <thomas.stellard at amd.com> llvm-svn: 175445	2013-02-18 14:11:19 +00:00
Vincent Lejeune	ea710fe419	R600: Export instructions are no longer terminator This allows MachineInstScheduler to reorder them, and thus make scheduling more efficient. Reviewed-by: Tom Stellard <thomas.stellard at amd.com> llvm-svn: 175182	2013-02-14 16:55:11 +00:00
Vincent Lejeune	d80bc1561a	R600: Fold zero/one in export instructions Reviewed-by: Tom Stellard <thomas.stellard at amd.com> llvm-svn: 175181	2013-02-14 16:55:06 +00:00
Tom Stellard	91da4e9199	R600: Add support for 128-bit parameters NOTE: This is a candidate for the Mesa stable branch. llvm-svn: 175096	2013-02-13 22:05:20 +00:00
Michel Danzer	3bb17ebd93	R600: Fix regression with shadow array sampler on pre-SI GPUs. 'R600/SI: Use proper instructions for array/shadow samplers.' removed two cases from TEX_SHADOW. Vincent Lejeune reported on IRC that this broke some shadow array piglit tests with the r600g driver. Reinstating the removed cases should fix this, and still works with radeonsi as well. I will follow up with some lit tests which would have caught the regression. NOTE: This is a candidate for the Mesa stable branch. Tested-by: Vincent Lejeune <vljn@ovi.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 174963	2013-02-12 12:11:23 +00:00
Vincent Lejeune	44bf8158c5	Test Commit - Remove some trailing whitespace in R600Instructions.td llvm-svn: 174839	2013-02-10 17:57:33 +00:00
Tom Stellard	462516b737	R600/SI: Use proper instructions for array/shadow samplers. Patch by: Michel Dänzer Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 174634	2013-02-07 17:02:14 +00:00
Tom Stellard	9355b22180	R600: Consolidate sub register indices. Use sub0-15 everywhere. Patch by: Michel Dänzer Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 174610	2013-02-07 14:02:37 +00:00
Tom Stellard	e06163a9a6	R600: Add support for SET_DX10 instructions These instructions compare two floating point values and return an integer true (-1) or false (0) value. When compiling code generated by the Mesa GLSL frontend, the SET_DX10 instructions save us four instructions for most branch decisions that use floating-point comparisons. llvm-svn: 174609	2013-02-07 14:02:35 +00:00
Tom Stellard	b40ada9b85	R600: Fix assembly name for SETGT_INT llvm-svn: 174607	2013-02-07 14:02:27 +00:00
Tom Stellard	f3b2a1e8b3	R600: Support for indirect addressing v4 Only implemented for R600 so far. SI is missing implementations of a few callbacks used by the Indirect Addressing pass and needs code to handle frame indices. At the moment R600 only supports array sizes of 16 dwords or less. Register packing of vector types is currently disabled, which means that a vec4 is stored in T0_X, T1_X, T2_X, T3_X, rather than T0_XYZW. In order to correctly pack registers in all cases, we will need to implement an analysis pass for R600 that determines the correct vector width for each array. v2: - Add support for i8 zext load from stack. - Coding style fixes v3: - Don't reserve registers for indirect addressing when it isn't being used. - Fix bug caused by LLVM limiting the number of SubRegIndex declarations. v4: - Fix 64-bit defines llvm-svn: 174525	2013-02-06 17:32:29 +00:00
Jakob Stoklund Olesen	fdc37670f6	Don't use MRI liveouts in R600. Something very strange is going on with the output registers in this target. Its ISelLowering code is inserting dangling CopyToReg nodes, hoping that those physregs won't get clobbered before the RETURN. This patch adds the output registers as implicit uses on RETURN instructions in the custom emission pass. I'd much prefer to have those CopyToReg nodes glued to the RETURNs, but I don't see how. llvm-svn: 174400	2013-02-05 17:53:52 +00:00
Tom Stellard	41afe6a6fe	R600: improve inputs/interpolation handling Use one intrinsic for all sorts of interpolation. Use two separate unexpanded instructions to represent INTERP_XY and _ZW - this will allow to eliminate one part if it's not used. Track liveness of special interpolation regs instead of reserving them - this will allow to reuse those regs, lowering reg pressure. Patch By: Vadim Girlin v2[Vincent Lejeune]: Rebased against current llvm master Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 174394	2013-02-05 17:09:14 +00:00
Tom Stellard	af1bce7d1d	R600: Make store_dummy intrinsic more general by passing export type Patch by: Vincent Lejeune Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 174097	2013-01-31 22:11:46 +00:00
Tom Stellard	6f1b8657f9	R600: Add a llvm.R600.store.swizzle intrinsic This intrinsic is translated to ALLOC_EXPORT_WORD1_SWIZ, hence its name. It is used to store vs/fs outputs. Patch by: Vincent Lejeune Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 173297	2013-01-23 21:39:49 +00:00
Tom Stellard	d8ac91d436	R600: Simplify stream outputs intrinsic Patch by: Vincent Lejeune Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 173296	2013-01-23 21:39:47 +00:00
Tom Stellard	365366f9ef	R600: rework handling of the constants Remove Cxxx registers, add new special register - "ALU_CONST" and new operand for each alu src - "sel". ALU_CONST is used to designate that the new operand contains the value to override src.sel, src.kc_bank, src.chan for constants in the driver. Patch by: Vadim Girlin Vincent Lejeune: - Use pointers for constants - Fold CONST_ADDRESS when possible Tom Stellard: - Give CONSTANT_BUFFER_0 its own address space - Use integer types for constant loads Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 173222	2013-01-23 02:09:06 +00:00
Tom Stellard	ff62c35da0	R600: Add a CONST_ADDRESS node to model constant buf read Patch by: Vincent Lejeune Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 173221	2013-01-23 02:09:03 +00:00
Tom Stellard	ab28e9a30a	R600: Factorise VTX_WORD0 and VTX_WORD1 in tblgen def Patch by: Vincent Lejeune Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 173220	2013-01-23 02:09:01 +00:00
Tom Stellard	c9b903138d	R600/SI: Use unnormalized coordinates for sampling with the RECT target. Patch by: Michel Dänzer Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 173053	2013-01-21 15:40:48 +00:00
Tom Stellard	41398026e7	R600: Fix MAX_UINT definition Patch by: Vadim Girlin Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 170922	2012-12-21 20:12:01 +00:00
Tom Stellard	4fa7ac29f1	R600: Add SHADOWCUBE to TEX_SHADOW pattern Patch by: Vadim Girlin Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 170921	2012-12-21 20:11:59 +00:00
Tom Stellard	f8794354b2	R600: New control flow for SI v2 This patch replaces the control flow handling with a new pass which structurizes the graph before transforming it to machine instructions. This has a couple of different advantages and currently fixes 20 piglit tests without a single regression. It is now a general purpose transformation that could be used not only for SI/R6xx, but also for other hardware implementations that use a form of structurized control flow. v2: further cleanup, fixes and documentation Patch by: Christian König Signed-off-by: Christian König <deathsimple@vodafone.de> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Tested-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 170591	2012-12-19 22:10:31 +00:00
Tom Stellard	75aadc2813	Add R600 backend A new backend supporting AMD GPUs: Radeon HD2XXX - HD7XXX llvm-svn: 169915	2012-12-11 21:25:42 +00:00

111 Commits