llvm-project

Commit Graph

Author	SHA1	Message	Date
Matt Arsenault	b96b57347a	AMDGPU: Add frexp_mant intrinsic llvm-svn: 263948	2016-03-21 16:11:05 +00:00
Nicolai Haehnle	95e8ffd398	AMDGPU: Overload return type of llvm.amdgcn.buffer.load.format Summary: Allow the selection of BUFFER_LOAD_FORMAT_x and _XY. Do this now before the frontend patches land in Mesa. Eventually, we may want to automatically reduce the size of loads at the LLVM IR level, which requires such overloads, and in some cases Mesa can generate them directly. Reviewers: tstellarAMD, arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18255 llvm-svn: 263792	2016-03-18 16:24:40 +00:00
Nicolai Haehnle	ad63638f6d	AMDGPU/SI: Add llvm.amdgcn.buffer.atomic.* intrinsics Summary: These intrinsics expose the BUFFER_ATOMIC_* instructions and will be used by Mesa to implement atomics with buffer semantics. The intrinsic interface matches that of buffer.load.format and buffer.store.format, except that the GLC bit is not exposed (it is automatically deduced based on whether the return value is used). The change of hasSideEffects is required for TableGen to accept the pattern that matches the intrinsic. Reviewers: tstellarAMD, arsenm Subscribers: arsenm, rivanvx, llvm-commits Differential Revision: http://reviews.llvm.org/D18151 llvm-svn: 263791	2016-03-18 16:24:31 +00:00
Nicolai Haehnle	3003ba00a3	AMDGPU: use ComplexPattern for offsets in llvm.amdgcn.buffer.load/store.format Summary: We cannot easily deduce that an offset is in an SGPR, but the Mesa frontend cannot easily make use of an explicit soffset parameter either. Furthermore, it is likely that in the future, LLVM will be in a better position than the frontend to choose an SGPR offset if possible. Since there aren't any frontend uses of these intrinsics in upstream repositories yet, I would like to take this opportunity to change the intrinsic signatures to a single offset parameter, which is then selected to immediate offsets or voffsets using a ComplexPattern. Reviewers: arsenm, tstellarAMD, mareko Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18218 llvm-svn: 263790	2016-03-18 16:24:20 +00:00
Changpeng Fang	01f6062227	AMDGPU/SI: Implement GroupStaticSize Intrinsic for Dynamic LDS Summary: Static LDS size is saved in MachineFunctionInfo::LDSSize, We define a pseudo instruction with usesCustomInserter bit set. Then, in EmitInstrWithCustomInserter, we replace this pseudo instruction with a mov of MachineFunctionInfo::LDSSize. Reviewers: arsenm tstellarAMD Subscribers llvm-commits, arsenm Differential Revision: http://reviews.llvm.org/D18064 llvm-svn: 263563	2016-03-15 17:28:44 +00:00
Nikolay Haustov	79af6b33e0	[AMDGPU] Assembler: SOP* instruction fixes s_bitset0_b64, s_bitset1_b64 has 32-bit src0, not 64-bit. s_rfe_b64 has just one destination operand and no source. Uncomment S_BITCMP* and S_SETVSKIP, adjust SOPC_* classes for that. Add s_memrealtime test and change comments in smem.s to follow common style. Change test for s_memtime to use non-zero register to make it really test encoding. Add tests for s_buffer_load*. Add tests for SOPC instructions (same for SI and VI) Differential Revision: http://reviews.llvm.org/D18040 llvm-svn: 263420	2016-03-14 11:17:19 +00:00
Nikolay Haustov	6560781c4f	[AMDGPU] Assembler: change v_madmk operands to have same order as mad. The constant is now at source operand 1 (previously at 2). This is also how it is in legacy AMD sp3 assembler. Update tests. Differential Revision: http://reviews.llvm.org/D17984 llvm-svn: 263212	2016-03-11 09:27:25 +00:00
Nicolai Haehnle	b142770bfe	AMDGPU/SI: add llvm.amdgcn.buffer.load/store.format intrinsics Summary: They correspond to BUFFER_LOAD/STORE_FORMAT_XYZW and will be used by Mesa to implement the GL_ARB_shader_image_load_store extension. The intention is that for llvm.amdgcn.buffer.load.format, LLVM will decide whether one of the _X/_XY/_XYZ opcodes can be used (similar to image sampling and loads). However, this is not currently implemented. For llvm.amdgcn.buffer.store, LLVM cannot decide to use one of the "smaller" opcodes and therefore the intrinsic is overloaded. Currently, only the v4f32 is actually implemented since GLSL also only has a vec4 variant of the store instructions, although it's conceivable that Mesa will want to be smarter about this in the future. BUFFER_LOAD_FORMAT_XYZW is already exposed via llvm.SI.vs.load.input, which has a legacy name, pretends not to access memory, and does not capture the full flexibility of the instruction. Reviewers: arsenm, tstellarAMD, mareko Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17277 llvm-svn: 263140	2016-03-10 18:43:50 +00:00
Changpeng Fang	278a5b31a5	AMDGPU/SI: Define S_GETREG Intrinsic Summary: Define s_getreg intrinsic to generate s_getreg instruction to read hardware registers. Reviewers: tstellarAMD, arsenm Subscribers: llvm-commits, arsenm Differential Revision: http://reviews.llvm.org/D17892 llvm-svn: 263124	2016-03-10 16:47:15 +00:00
Valery Pykhtin	a4db224d54	[AMDGPU] Fix SMEM instructions encoding/operand namings Differential Revision: http://reviews.llvm.org/D17651 llvm-svn: 263108	2016-03-10 13:06:08 +00:00
Nikolay Haustov	8e3f099497	[AMDGPU] Assembler: Fix s_setpc_b64 s_setpc_b64 has just one 64-bit source which is the address of instruction to jump to. Differential Revision: http://reviews.llvm.org/D17888 llvm-svn: 263005	2016-03-09 10:56:19 +00:00
Matt Arsenault	c89f2919a4	AMDGPU: Match more med3 integer patterns llvm-svn: 262864	2016-03-07 21:54:48 +00:00
Valery Pykhtin	0c6293da68	[AMDGPU] SOPxx instructions operand naming fixed in td files. dst -> sdst ssrcN -> srcN Differential Revision: http://reviews.llvm.org/D17646 llvm-svn: 262801	2016-03-06 10:31:44 +00:00
Tom Stellard	649b5db557	AMDGPU/SI: Add support for spilling SGPRs to scratch buffer Summary: This is necessary for when we run out of VGPRs and can no longer use v_{read,write}_lane for spilling SGPRs. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17592 llvm-svn: 262732	2016-03-04 18:31:18 +00:00
Nikolay Haustov	5bf46ac150	AMDGPU/SI: add llvm.amdgcn.image.atomic.* intrinsics These correspond to IMAGE_ATOMIC_* and are going to be used by Mesa for the GL_ARB_shader_image_load_store extension. Initial change by Nicolai Hähnle Differential Revision: http://reviews.llvm.org/D17401 llvm-svn: 262701	2016-03-04 10:39:50 +00:00
Matt Arsenault	274d34e725	AMDGPU: Add s_sleep intrinsic llvm-svn: 262120	2016-02-27 08:53:52 +00:00
Matt Arsenault	61738cbcb6	AMDGPU: Implement readcyclecounter This matches the behavior of the HSAIL clock instruction. s_realmemtime is used if the subtarget supports it, and falls back to s_memtime if not. Also introduces new intrinsics for each of s_memtime / s_memrealtime. llvm-svn: 262119	2016-02-27 08:53:46 +00:00
Nikolay Haustov	2f684f1347	[AMDGPU] Assembler: Basic support for MIMG Add parsing and printing of image operands. Matches legacy sp3 assembler. Change image instruction order to have data/image/sampler operands in the beginning. This is needed because optional operands in MC are always last. Update SITargetLowering for new order. Add basic MC test. Update CodeGen tests. Review: http://reviews.llvm.org/D17574 llvm-svn: 261995	2016-02-26 09:51:05 +00:00
Nikolay Haustov	2e4c72977c	[AMDGPU] Assembler: Simplify handling of optional operands Resubmit with index problem fixed. Verified with valgrind. Prepare to support DPP encodings. For DPP encodings, we want row_mask/bank_mask/bound_ctrl to be optional operands. However this means that when parsing instruction which has no mnemonic prefix, we cannot add both default values for VOP3 and for DPP optional operands to OperandVector - neither instructions would match. So add default values for optional operands to MCInst during conversion instead. Mark more operands as IsOptional = 1 in .td files. Do not add default values for optional operands to OperandVector in AMDGPUAsmParser. Add default values for optional operands during conversion using new helper addOptionalImmOperand. Change to cvtVOP3_2_mod to check instruction flag instead of presence of modifiers. In the future, cvtVOP3* functions can be combined into one. Separate cvtFlat and cvtFlatAtomic. Fix CNDMASK_B32 definition to have no modifiers. Review: http://reviews.llvm.org/D17445 llvm-svn: 261856	2016-02-25 10:58:54 +00:00
NAKAMURA Takumi	3d3d0f4151	Revert r261742, "[AMDGPU] Assembler: Simplify handling of optional operands" It brought undefined behavior. llvm-svn: 261839	2016-02-25 08:35:27 +00:00
Nikolay Haustov	4f073ca7fa	[AMDGPU] Assembler: Simplify handling of optional operands Prepare to support DPP encodings. For DPP encodings, we want row_mask/bank_mask/bound_ctrl to be optional operands. However this means that when parsing instruction which has no mnemonic prefix, we cannot add both default values for VOP3 and for DPP optional operands to OperandVector - neither instructions would match. So add default values for optional operands to MCInst during conversion instead. Mark more operands as IsOptional = 1 in .td files. Do not add default values for optional operands to OperandVector in AMDGPUAsmParser. Add default values for optional operands during conversion using new helper addOptionalImmOperand. Change to cvtVOP3_2_mod to check instruction flag instead of presence of modifiers. In the future, cvtVOP3* functions can be combined into one. Separate cvtFlat and cvtFlatAtomic. Fix CNDMASK_B32 definition to have no modifiers. Review: http://reviews.llvm.org/D17445 Reviewers: tstellarAMD llvm-svn: 261742	2016-02-24 14:22:47 +00:00
Nikolay Haustov	2a62b3c244	[AMDGPU] Fix operands of S_BFE_U64 and S_BFM_B64 src1 of s_bfe_u64 is 32-bit (same as s_bfe_i64). src0 and src1 of s_bfm_b64 are 32-bit. Update tests. Review: http://reviews.llvm.org/D17480 Reviewers: arsenm llvm-svn: 261621	2016-02-23 09:19:14 +00:00
Nicolai Haehnle	f2c64db55a	AMDGPU/SI: add llvm.amdgcn.image.load/store[.mip] intrinsics Summary: These correspond to IMAGE_LOAD/STORE[_MIP] and are going to be used by Mesa for the GL_ARB_shader_image_load_store extension. IMAGE_LOAD is already matched by llvm.SI.image.load. That intrinsic has a legacy name and pretends not to read memory. Differential Revision: http://reviews.llvm.org/D17276 llvm-svn: 261224	2016-02-18 16:44:18 +00:00
Tom Stellard	e1818af8c5	[AMDGPU] Disassembler: Added basic disassembler for AMDGPU target Changes: - Added disassembler project - Fixed all decoding conflicts in .td files - Added DecoderMethod=“NONE” option to Target.td that allows to disable decoder generation for an instruction. - Created decoding functions for VS_32 and VReg_32 register classes. - Added stubs for decoding all register classes. - Added several tests for disassembler Disassembler only supports: - VI subtarget - VOP1 instruction encoding - 32-bit register operands and inline constants [Valery] One of the point that requires to pay attention to is how decoder conflicts were resolved: - Groups of target instructions were separated by using different DecoderNamespace (SICI, VI, CI) using similar to AssemblerPredicate approach. - There were conflicts in IMAGE_<> instructions caused by two different reasons: 1. dmask wasn’t specified for the output (fixed) 2. There are image instructions that differ only by the number of the address components but have the same encoding by the HW spec. The actual number of address components is determined by the HW at runtime using image resource descriptor starting from the VGPR encoded in an IMAGE instruction. This means that we should choose only one instruction from conflicting group to be the rule for decoder. I didn’t find the way to disable decoder generation for an arbitrary instruction and therefore made a onelinear fix to tablegen generator that would suppress decoder generation when DecoderMethod is set to “NONE”. This is a change that should be reviewed and submitted first. Otherwise I would need to specify different DecoderNamespace for every instruction in the conflicting group. I haven’t checked yet if DecoderMethod=“NONE” is not used in other targets. 3. IMAGE_GATHER decoder generation is for now disabled and to be done later. [/Valery] Patch By: Sam Kolton Differential Revision: http://reviews.llvm.org/D16723 llvm-svn: 261185	2016-02-18 03:42:32 +00:00
Tom Stellard	cc4c8718ed	[AMDGPU] Rename $dst operand to $vdst for VOP instructions. Summary: This change renames output operand for VOP instructions from dst to vdst. This is needed to enable decoding named operands for disassembler. Reviewers: vpykhtin, tstellarAMD, arsenm Subscribers: arsenm, llvm-commits, nhaustov Projects: #llvm-amdgpu-spb Differential Revision: http://reviews.llvm.org/D16920 llvm-svn: 260986	2016-02-16 18:14:56 +00:00
Matt Arsenault	ce56a0ef54	AMDGPU: Add intrinsics for sin/cos These provide direct access to the hardware instruction without the unit version required like llvm.sin/llvm.cos lowering requires. llvm-svn: 260782	2016-02-13 01:19:56 +00:00
Matt Arsenault	79963e80b8	AMDGPU: Rename intrinsic to better match instruction name Also fixes missing f32 test. llvm-svn: 260780	2016-02-13 01:03:00 +00:00
Tom Stellard	bc4497b13c	AMDGPU/SI: Detect uniform branches and emit s_cbranch instructions Reviewers: arsenm Subscribers: mareko, MatzeB, qcolombet, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D16603 llvm-svn: 260765	2016-02-12 23:45:29 +00:00
Tom Stellard	e993451f5c	[AMDGPU] Fix for "v_div_scale_f64 reg, vcc, ..." parsing Summary: Added support for "VOP3Only" attribute in VOP3bInst encoding. Set VOP3Only=1 for V_DIV_SCALE_F64/32 insns. Added support for multi-dest instructions in AMDGPUAs::cvt*(). Added lit test for "V_DIV_SCALE_F64\|F32 vreg,vcc\|sreg,vreg,vreg,vreg". Reviewers: tstellarAMD, arsenm Subscribers: arsenm, SamWot, nhaustov, vpykhtin Differential Revision: http://reviews.llvm.org/D16995 Patch By: Artem Tamazov llvm-svn: 260560	2016-02-11 18:25:26 +00:00
Tom Stellard	a90b9526df	[AMDGPU] Assembler: Fix VOP3 only instructions Separate methods to convert parsed instructions to MCInst: - VOP3 only instructions (always create modifiers as operands in MCInst) - VOP2 instrunctions with modifiers (create modifiers as operands in MCInst when e64 encoding is forced or modifiers are parsed) - VOP2 instructions without modifiers (do not create modifiers as operands in MCInst) - Add VOP3Only flag. Pass HasMods flag to VOP3Common. - Simplify code that deals with modifiers (-1 is now same as 0). This is no longer needed. - Add few tests (more will be added separately). Update error message now correct. Patch By: Nikolay Haustov Differential Revision: http://reviews.llvm.org/D16778 llvm-svn: 260483	2016-02-11 03:28:15 +00:00
Matt Arsenault	f639c32739	AMDGPU: Match some med3 patterns llvm-svn: 259089	2016-01-28 20:53:42 +00:00
Matt Arsenault	382d945d16	AMDGPU: Tidy minor td file issues Make comments and indentation more consistent. Rearrange a few things to be in a more consistent order, such as organizing subtarget features from those describing an actual device property, and those used as options. llvm-svn: 258789	2016-01-26 04:49:22 +00:00
Matt Arsenault	c5f6152911	AMDGPU: Make v32i8/v64i8 illegal types Old intrinsics were forcing these, but they have now all been removed. This fixes large i8 vector operations generally being broken. llvm-svn: 258788	2016-01-26 04:43:48 +00:00
Matt Arsenault	018179fc46	AMDGPU: Remove old sample intrinsics I did my best to try to update all the uses in tests that just happened to use the old ones to the newer intrinsics. I'm not sure I got all of the immediate operand conversions correct, since the value seems to have been ignored by the old pattern but I don't think it really matters. llvm-svn: 258787	2016-01-26 04:38:08 +00:00
Matt Arsenault	051d6f9fde	AMDGPU: Add new amdgcn intrinsics for cube instructions More cleanup to try to get all intrinsics using the correct amdgcn prefix that are as close to the instruction as possible. llvm-svn: 258786	2016-01-26 04:29:56 +00:00
Matt Arsenault	7713162c32	AMDGPU: Remove more unused intrinsics Replace tests with lrp with basic IR expansion llvm-svn: 258612	2016-01-23 05:42:38 +00:00
Matt Arsenault	10ca39ca8b	AMDGPU: Add new name for barrier intrinsic llvm-svn: 258558	2016-01-22 21:30:43 +00:00
Matt Arsenault	7898b90ee1	AMDGPU: Change control flow intrinsics to use amdgcn prefix These aren't supposed to be used outside of the backend, so there aren't any users to worry about. llvm-svn: 258516	2016-01-22 18:42:55 +00:00
Matt Arsenault	8d903029e8	AMDGPU: Don't use separate mulhu/mulhs Pats llvm-svn: 258515	2016-01-22 18:42:49 +00:00
Matt Arsenault	de5fbe9c60	AMDGPU: Pattern match ffbh pattern to instruction. The hardware instruction's output on 0 is -1 rather than 32. Eliminate a test and select to -1. This removes an extra instruction from the compatability function with HSAIL's firstbit instruction. llvm-svn: 257352	2016-01-11 17:02:00 +00:00
Matt Arsenault	905042774d	AMDGPU: Remove redundant let mayLoad = 1 This is already set on the SMRD format class. llvm-svn: 256813	2016-01-05 04:50:28 +00:00
Tom Stellard	a6f24c6565	AMDGPU/SI: Select constant loads with non-uniform addresses to MUBUF instructions Summary: We were previously selecting all constant loads to SMRD instructions and legalizing the SMRDs with non-uniform addresses during the SIFixSGPRCopesPass. This new solution is more simple and also generates much better code, because the instruction selector is able to take advantage of all the MUBUF addressing modes that are legalization pass wasn't able to. We also no longer need to generate v_add_* instructions when we have a uniform pointer and a non-uniform offset, as this is now folded into the MUBUF instruction during instruction selection. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15425 llvm-svn: 255672	2015-12-15 20:55:55 +00:00
Tom Stellard	8f307217c3	AMDGPU/SI: Fix bitcast between v2f32 and f64 The radeonsi fp64 support can hit these now that some redundant bitcasts are folded. Patch by: Michel Dänzer Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 255657	2015-12-15 17:11:17 +00:00
Tom Stellard	43f52df0b5	AMDGPU/SI: Add llvm.amdgcn.mbcnt.* intrinsics Summary: These are meant to be used instead of the llvm.SI.tid intrinsic which will be deprecated at some point. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15475 llvm-svn: 255652	2015-12-15 17:02:52 +00:00
Matt Arsenault	d079285e05	AMDGPU: Use generic bitreverse intrinsic Also fix bug in vector legalization for bitreverse. llvm-svn: 255512	2015-12-14 17:25:38 +00:00
Matt Arsenault	fbd9bbfda3	Start replacing vector_extract/vector_insert with extractelt/insertelt These are redundant pairs of nodes defined for INSERT_VECTOR_ELEMENT/EXTRACT_VECTOR_ELEMENT. insertelement/extractelement are slightly closer to the corresponding C++ node name, and has stricter type checking so prefer it. Update targets to only use these nodes where it is trivial to do so. AArch64, ARM, and Mips all have various type errors on simple replacement, so they will need work to fix. Example from AArch64: def : Pat<(sext_inreg (vector_extract (v16i8 V128:$Rn), VectorIndexB:$idx), i8), (i32 (SMOVvi8to32 V128:$Rn, VectorIndexB:$idx))>; Which is trying to do sext_inreg i8, i8. llvm-svn: 255359	2015-12-11 19:20:16 +00:00
Tom Stellard	c93fc11f36	AMDGPU/SI: Emit constant arrays in the .text section Summary: This allows us to remove the END_OF_TEXT_LABEL hack we had been using and simplifies the fixups used to compute the address of constant arrays. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15257 llvm-svn: 255204	2015-12-10 02:13:01 +00:00
Tom Stellard	38b7cbe3e0	AMDGPU/SI: Remove REGISTER_STORE/REGISTER_LOAD code which is now dead Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15050 llvm-svn: 254427	2015-12-01 17:45:22 +00:00
Marek Olsak	7ed6b2f414	AMDGPU/SI: select S_ABS_I32 when possible (v2) v2: added more tests, moved the SALU->VALU conversion to a separate function It looks like it's not possible to get subregisters in the S_ABS lowering code, and I don't feel like guessing without testing what the correct code would look like. llvm-svn: 254095	2015-11-25 21:22:45 +00:00
Matt Arsenault	61001bbc03	AMDGPU: Make v2i64/v2f64 legal types. They can be loaded and stored, so count them as legal. This is mostly to fix a number of common cases for load/store merging. llvm-svn: 254086	2015-11-25 19:58:34 +00:00
Tom Stellard	41b7e63040	AMDGPU/SI: Refactor VOP[12C] tablegen definitions Summary: Pass the VOPProfile object all the through to *_m multiclasses. This will allow us to do more simplifications in the future. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D13437 llvm-svn: 252339	2015-11-06 20:56:18 +00:00
Matt Arsenault	08f14de244	AMDGPU: Remove unused scratch resource operands The SGPR spill pseudos don't actually use them. llvm-svn: 252324	2015-11-06 18:07:53 +00:00
Marek Olsak	74d084f466	AMDGPU/SI: use S_OR for fneg (fabs f32) llvm-svn: 251631	2015-10-29 15:29:05 +00:00
Marek Olsak	f924dd6f3c	AMDGPU/SI: use S_AND for i1 trunc llvm-svn: 251630	2015-10-29 15:05:03 +00:00
Matt Arsenault	fc0ad42516	AMDGPU: Fix missing implicit m0 uses on movrel instructions llvm-svn: 249577	2015-10-07 17:46:32 +00:00
Matt Arsenault	284192730a	AMDGPU: Use explicit register size indirect pseudos This stops using an unknown reg class operand. Currently build_vector selection has a broken looking check where it tries to use a VGPR reg class and an SGPR one if it sees an SGPR use. With the source operand has an explicit VGPR class, illegal copies will be inserted that SIFixSGPRCopies will take care of normally later, which will allow removing the weird check of build_vector users. Without this, when removed v_movrels_b32 would still be emitted even though all of the values were only stored in SGPRs. llvm-svn: 249494	2015-10-07 00:42:51 +00:00
Tom Stellard	88e0b25181	AMDGPU/SI: Add 64-bit versions of v_nop and v_clrexcp Summary: The assembly printing of these is still missing the encoding size suffix, but this will be fixed in a later commit. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D13436 llvm-svn: 249424	2015-10-06 15:57:53 +00:00
Matt Arsenault	d092a068ba	AMDGPU/SI: Add verifier check for exec reads Make sure we aren't accidentally not setting these in the instruction definitions. llvm-svn: 249170	2015-10-02 18:58:37 +00:00
Matt Arsenault	e98a074c42	AMDGPU: VOP3b definition cleanups llvm-svn: 248647	2015-09-26 02:25:48 +00:00
Matt Arsenault	e66621b306	AMDGPU: Add s_dcache_* instructions llvm-svn: 248533	2015-09-24 19:52:27 +00:00
Matt Arsenault	d6adfb401c	AMDGPU: Add cache invalidation instructions. These are necessary for implementing mem_fence for OpenCL 2.0. The VI assembler tests are disabled since it seems to be using the wrong encoding or opcode. llvm-svn: 248532	2015-09-24 19:52:21 +00:00
Matt Arsenault	80f766a032	AMDGPU/SI: Fix more cases of losing exec operands llvm-svn: 247230	2015-09-10 01:23:28 +00:00
Matt Arsenault	86d336e91b	AMDGPU/SI: Fix input vcc operand for VOP2b instructions Adds vcc to output string input for e32. Allows option of using e64 encoding with assembler. Also fixes these instructions not implicitly reading exec. llvm-svn: 247074	2015-09-08 21:15:00 +00:00
Matt Arsenault	8ac35cd031	AMDGPU: Mark s_barrier as a high latency instruction These were marked as WriteSALU, which is low latency. I'm guessing at the value to use, but it should probably be considered the highest latency instruction. I'm not sure this has any actual effect since hasSideEffects probably is preventing any moving of these. llvm-svn: 247060	2015-09-08 19:54:32 +00:00
Matt Arsenault	8fb810a1d2	AMDGPU: Fix s_barrier flags This should be convergent. This is not a barrier in the isBarrier sense, nor hasCtrlDep. llvm-svn: 247059	2015-09-08 19:54:25 +00:00
Matt Arsenault	e4d0c142e8	AMDGPU: Add sdst operand to VOP2b instructions The VOP3 encoding of these allows any SGPR pair for the i1 output, but this was forced before to always use vcc. This doesn't yet try to use this, but does add the operand to the definitions so the main change is adding vcc to the output of the VOP2 encoding. llvm-svn: 246358	2015-08-29 07:16:50 +00:00
Matt Arsenault	9a32cd3d3b	AMDGPU: Set mem operands for spill instructions llvm-svn: 246357	2015-08-29 06:48:57 +00:00
Matt Arsenault	8a067121f8	AMDGPU: Delete dead code There is no context where s_mov_b64 is emitted and could potentially be moved to the VALU. It is currently only emitted for materializing immediates, which can't be dependent on vector sources. The immediate splitting is already done when selecting constants. I'm not sure what contexts if any the register splitting would have been used before. Also clean up using s_mov_b64 in place of v_mov_b64_pseudo, although this isn't required and just skips the extra step of eliminating the copy from the SReg_64. llvm-svn: 246080	2015-08-26 20:48:08 +00:00
Matt Arsenault	0a3ac1be43	AMDGPU: Allow specifying different opcode on VI for SMRD/SMEM Although the basic s_load_* instructions happen to use the same opcode, some of the special case SMRD instructions have different opcodes. llvm-svn: 245775	2015-08-22 00:54:31 +00:00
Matt Arsenault	e8df879948	AMDGPU: Improve accuracy of instruction rates for some FP instructions llvm-svn: 245774	2015-08-22 00:50:41 +00:00
Matt Arsenault	6adf07a92e	AMDGPU: Move CI instructions into CIInstructions.td There are still a couple of CI patterns left in SIInstructions. llvm-svn: 245767	2015-08-22 00:16:34 +00:00
Matt Arsenault	6942d1a034	AMDGPU/SI: Remove source uses of VCCReg llvm-svn: 244379	2015-08-08 00:41:45 +00:00
Tom Stellard	85656cabfb	AMDGPU/SI: v_mac_legacy_f32 does not exist on VI Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11810 llvm-svn: 244322	2015-08-07 15:34:30 +00:00
Tom Stellard	217361c33f	AMDGPU/SI: Add support for 32-bit immediate SMRD offsets on CI Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11604 llvm-svn: 244254	2015-08-06 19:28:38 +00:00
Tom Stellard	dee26a2876	AMDGPU/SI: Use ComplexPatterns for SMRD addressing modes Summary: This allows us to consolidate several of the TableGen patterns. Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11602 llvm-svn: 244253	2015-08-06 19:28:30 +00:00
Matt Arsenault	95f0606e62	AMDGPU/SI: Remove EXECReg For the same reasons as the other physical registers. llvm-svn: 244062	2015-08-05 16:42:57 +00:00
Matt Arsenault	4c0487bff6	AMDGPU: Remove SCCReg. These should be handled as a physical register rather than a virtual register class with one member. llvm-svn: 244061	2015-08-05 16:42:54 +00:00
Tom Stellard	82325598c3	AMDGPU/SI: Remove unused pattern for f32 constant loads Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11603 llvm-svn: 243719	2015-07-31 01:02:32 +00:00
Marek Olsak	93df060871	AMDGPU: don't match vgpr loads for constant loads Author: Dave Airlie <airlied@redhat.com> In order to implement indirect sampler loads, we don't want to match on a VGPR load but an SGPR one for constants, as we cannot feed VGPRs to the sampler only SGPRs. this should be applicable for llvm 3.7 as well. llvm-svn: 243294	2015-07-27 18:16:08 +00:00
Marek Olsak	1354b87695	AMDGPU/SI: Fix the V_FRACT_F64 SI bug workaround This is a candidate for 3.7. llvm-svn: 243263	2015-07-27 11:37:42 +00:00
Matt Arsenault	f849bb49cc	AMDGPU: Set isMoveImm on s_movk_i32 llvm-svn: 242747	2015-07-21 00:40:08 +00:00
Tom Stellard	db5a11f698	AMDGPU/SI: Select mad patterns to v_mac_f32 The two-address instruction pass will convert these back to v_mad_f32 if necessary. Differential Revision: http://reviews.llvm.org/D11060 llvm-svn: 242038	2015-07-13 15:47:57 +00:00
Tom Stellard	45bb48ea19	R600 -> AMDGPU rename llvm-svn: 239657	2015-06-13 03:28:10 +00:00
Tom Stellard	1be1aa84ec	Revert "AMDGPU: Add core backend files for R600/SI codegen v6" This reverts commit 4ea70107c5e51230e9e60f0bf58a0f74aa4885ea. llvm-svn: 160303	2012-07-16 18:19:53 +00:00
Tom Stellard	bcce80fa95	AMDGPU: Add core backend files for R600/SI codegen v6 llvm-svn: 160270	2012-07-16 14:17:08 +00:00

185 Commits