llvm-project

Commit Graph

Author	SHA1	Message	Date
Wei Ding	4991d3570f	AMDGPU : Update TrapCode based on Trap Handler ABI. Differential Revision: http://reviews.llvm.org/D30232 llvm-svn: 295867	2017-02-22 20:05:06 +00:00
Matt Arsenault	1f17c66890	AMDGPU: Add cvt.pkrtz intrinsic Convert llvm.SI.packf16 test uses llvm-svn: 295797	2017-02-22 00:27:34 +00:00
Matt Arsenault	2fdf2a1a18	AMDGPU: Redefine clamp node as clamp 0.0-1.0 Change implementation to use max instead of add. min/max/med3 do not flush denormals regardless of the mode, so it is OK to use it whether or not they are enabled. Also allow using clamp with f16, and use knowledge of dx10_clamp. llvm-svn: 295788	2017-02-21 23:35:48 +00:00
Wei Ding	205bfdb3e9	AMDGPU : Add trap handler support. Differential Revision: http://reviews.llvm.org/D26010 llvm-svn: 294692	2017-02-10 02:15:29 +00:00
Matt Arsenault	9dba9bd4cf	AMDGPU: Use source modifiers with f16->f32 conversions The operand types were defined to fit the fp16_to_fp node, which has the half as an integer type. v_cvt_f32_f16 does support source modifiers, so change this to have an FP type and modifiers. For targets without legal f16, this requires recognizing the bit operations and trying to produce them. llvm-svn: 293857	2017-02-02 02:27:04 +00:00
Matt Arsenault	f84e5d9a27	AMDGPU: Generalize matching of v_med3_f32 I think this is safe as long as no inputs are known to ever be nans. Also add an intrinsic for fmed3 to be able to handle all safe math cases. llvm-svn: 293598	2017-01-31 03:07:46 +00:00
Matt Arsenault	af635240d5	AMDGPU: Undo sub x, c -> add x, -c canonicalization This is worse if the original constant is an inline immediate. This should also be done for 64-bit adds, but requires fixing operand folding bugs first. llvm-svn: 293540	2017-01-30 19:30:24 +00:00
Marek Olsak	e81adb52b1	AMDGPU: Remove a useless VI SMRD pattern Summary: already covered by complex patterns Reviewers: arsenm, nhaehnle, tstellarAMD Subscribers: kzhuravl, wdng, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D28995 llvm-svn: 293477	2017-01-30 12:25:14 +00:00
Marek Olsak	8e93529020	AMDGPU: Fix assembler encoding for EXP instructions on VI Reviewers: arsenm, tstellarAMD Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D28992 llvm-svn: 293476	2017-01-30 12:25:03 +00:00
Matt Arsenault	4165efdc58	AMDGPU: Add replacement export intrinsics llvm-svn: 292205	2017-01-17 07:26:53 +00:00
Sam Kolton	9772eb3907	[AMDGPU] Assembler: SDWA/DPP should not accept scalar registers and immediate operands Reviewers: artem.tamazov, nhaustov, vpykhtin, tstellarAMD Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D28157 llvm-svn: 291668	2017-01-11 11:46:30 +00:00
Sam Kolton	e66365e07d	[AMDGPU] Assembler: support SDWA and DPP for VOP2b instructions Reviewers: nhaustov, artem.tamazov, vpykhtin, tstellarAMD Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D28051 llvm-svn: 290599	2016-12-27 10:06:42 +00:00
Jan Vesely	206a510e54	AMDGPU: split ret/noret patterns for global atomics Differential Revision: https://reviews.llvm.org/D27989 llvm-svn: 290435	2016-12-23 15:34:51 +00:00
Tom Stellard	6f9ef14b9d	AMDGPU/SI: Add a MachineMemOperand when lowering llvm.amdgcn.buffer.load.* Reviewers: arsenm, nhaehnle, mareko Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D27834 llvm-svn: 290184	2016-12-20 17:19:44 +00:00
Matt Arsenault	0e8a299f19	AMDGPU: Assembler support for vintrp instructions llvm-svn: 289866	2016-12-15 20:40:20 +00:00
Matt Arsenault	ebfba7027e	AMDGPU: Change vintrp printing llvm-svn: 289664	2016-12-14 16:36:12 +00:00
Matt Arsenault	4bd7236193	AMDGPU: Fix handling of 16-bit immediates Since 32-bit instructions with 32-bit input immediate behavior are used to materialize 16-bit constants in 32-bit registers for 16-bit instructions, determining the legality based on the size is incorrect. Change operands to have the size specified in the type. Also adds a workaround for a disassembler bug that produces an immediate MCOperand for an operand that is supposed to be OPERAND_REGISTER. The assembler appears to accept out of bounds immediates and truncates them, but this seems to be an issue for 32-bit already. llvm-svn: 289306	2016-12-10 00:39:12 +00:00
Matt Arsenault	bf6bdac1ad	AMDGPU: Assembler support for exp compr is not currently parsed (or printed) correctly, but that should probably be fixed along with intrinsic changes. llvm-svn: 288698	2016-12-05 20:42:41 +00:00
Matt Arsenault	8a63cb9044	AMDGPU: Change how exp is printed This is an improvement over a long list of unreadable numbers. A follow up patch will try to match how sc formats these. llvm-svn: 288697	2016-12-05 20:31:49 +00:00
Matt Arsenault	7bee6ac798	AMDGPU: Refactor exp instructions Structure the definitions a bit more like the other classes. The main change here is to split EXP with the done bit set to a separate opcode, so we can set mayLoad = 1 so that it won't be reordered before the other exp stores, since this has the special constraint that if the done bit is set then this should be the last exp in she shader. Previously all exp instructions were inferred to have unmodeled side effects. llvm-svn: 288695	2016-12-05 20:23:10 +00:00
Konstantin Zhuravlyov	aefee42e0f	[AMDGPU] Change frexp.exp intrinsic to return i16 for f16 input Differential Revision: https://reviews.llvm.org/D26862 llvm-svn: 287389	2016-11-18 22:31:08 +00:00
Sam Kolton	c01faa383f	[AMDGPU] TableGen: change individual instruction flags to bit type from bits<1> Summary: This is needed to be able to use this flags in InstrMappings. Reviewers: tstellarAMD, vpykhtin Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D26666 llvm-svn: 286960	2016-11-15 13:39:07 +00:00
Konstantin Zhuravlyov	f86e4b7266	[AMDGPU] Add f16 support (VI+) Differential Revision: https://reviews.llvm.org/D25975 llvm-svn: 286753	2016-11-13 07:01:11 +00:00
Tom Stellard	115a61560e	AMDGPU: Add VI i16 support Patch By: Wei Ding Differential Revision: https://reviews.llvm.org/D18049 llvm-svn: 286464	2016-11-10 16:02:37 +00:00
Tom Stellard	2d2d33f1dc	Revert "AMDGPU: Add VI i16 support" This reverts commit r285939 and r285948. These broke some conformance tests. llvm-svn: 285995	2016-11-04 13:06:34 +00:00
Tom Stellard	2b3379cdff	AMDGPU: Add VI i16 support Patch By: Wei Ding Differential Revision: https://reviews.llvm.org/D18049 llvm-svn: 285939	2016-11-03 17:13:50 +00:00
Matt Arsenault	3d463193a9	AMDGPU: Default to using scalar mov to materialize immediate This is the conservatively correct way because it's easy to move or replace a scalar immediate. This was incorrect in the case when the register class wasn't known from the static instruction definition, but still needed to be an SGPR. The main example of this is inlineasm has an SGPR constraint. Also start verifying the register classes of inlineasm operands. llvm-svn: 285762	2016-11-01 22:55:07 +00:00
Matt Arsenault	4b6a6cc8e9	AMDGPU: Rename glc operand type While trying to add the glc bit to SMEM instructions on VI with the new refactoring I ran into some kind of shadowing problem for the glc operand when using the pseudoinstruction as a multiclass parameter. Everywhere that currently uses it defines the operand to have the same name as its type, i.e. glc:$glc which works. For some reason now it conflicts, and its up evaluating to the wrong thing. For the real encoding classes, let Inst{16} = !if(ps.has_glc, glc, ?); was not being evaluated and still visible in the Inst initializer in the expanded td file. In other cases I got a a different error about an illegal operand where this was using { 0 } initializer from the bits<1> glc initializer instead of evaluating it as false in the if. For consistency all of the operand types should probably be captialized to avoid conflicting with the variable names unless somebody has a better idea of how to fix this. llvm-svn: 285462	2016-10-28 21:55:08 +00:00
Konstantin Zhuravlyov	c96b5d7073	[AMDGPU] Emit 32-bit lo/hi got and pc relative variant kinds for external and global address space variables Differential Revision: https://reviews.llvm.org/D25562 llvm-svn: 284196	2016-10-14 04:37:34 +00:00
Sam Kolton	3381d7a216	[AMDGPU] Disassembler: print label names in branch instructions Summary: Add AMDGPUSymbolizer for finding names for labels from ELF symbol table. Initialize MCObjectFileInfo with some default values. Reviewers: vpykhtin, artem.tamazov, tstellarAMD Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D24802 llvm-svn: 283450	2016-10-06 13:46:08 +00:00
Sam Kolton	984461062f	Revert "[AMDGPU] Disassembler: print label names in branch instructions" This reverts commit 6c6dbe625263ec9fcf8de0df27263cf147cde550. llvm-svn: 282396	2016-09-26 11:29:03 +00:00
Sam Kolton	1559f76257	[AMDGPU] Disassembler: print label names in branch instructions Summary: Add AMDGPUSymbolizer for finding names for labels from ELF symbol table. Reviewers: vpykhtin, artem.tamazov, tstellarAMD Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D24802 llvm-svn: 282394	2016-09-26 10:05:50 +00:00
Valery Pykhtin	355103f6c0	[AMDGPU] Refactor VOP1 and VOP2 instruction TD definitions Differential revision: https://reviews.llvm.org/D24738 llvm-svn: 282234	2016-09-23 09:08:07 +00:00
Valery Pykhtin	e330cfa294	[AMDGPU] Refactor VOP3 instruction TD definitions Differential revision: https://reviews.llvm.org/D24664 llvm-svn: 281965	2016-09-20 10:41:16 +00:00
Valery Pykhtin	2828b9be1e	[AMDGPU] Refactor VOPC instruction TD definitions Differential Revision: https://reviews.llvm.org/D24546 llvm-svn: 281903	2016-09-19 14:39:49 +00:00
Matt Arsenault	ac0fc849cf	AMDGPU: Fix broken FrameIndex handling We were trying to avoid using a FrameIndex operand in non-pointer operands in a convoluted way, and would break because of using TargetFrameIndex. The TargetFrameIndex should only be used in the case where it makes sense to fold it as part of the addressing mode, otherwise it requires materialization like a normal constant. This wasn't working reliably and failed in the added testcase, hitting the assert when processing the frame index. The TargetFrameIndex was coming from trying to produce an AssertZext limiting the maximum stack size. I'm not sure this was correct to begin with, because it is apparently possible to have a single workitem dispatch that requires all 4G of private memory. llvm-svn: 281824	2016-09-17 16:09:55 +00:00
Matt Arsenault	7ccf6cd104	AMDGPU: Use SOPK compare instructions llvm-svn: 281780	2016-09-16 21:41:16 +00:00
Matt Arsenault	f40b70fa75	Revert "AMDGPU: Use SOPK compare instructions" Accidentally committed llvm-svn: 281514	2016-09-14 18:04:42 +00:00
Matt Arsenault	f757c87959	AMDGPU: Use SOPK compare instructions llvm-svn: 281513	2016-09-14 18:03:53 +00:00
Sam Kolton	fb0d9d9c13	[AMDGPU] Assembler: Move disabled SDWA and DPP instruction into Disable asm variant Summary: This removes disabled instructions from match tables so we will not match them at all. Reviewers: tstellarAMD, vpykhtin, artem.tamazov Subscribers: wdng, nhaehnle, arsenm Differential Revision: https://reviews.llvm.org/D24452 llvm-svn: 281216	2016-09-12 14:42:43 +00:00
Valery Pykhtin	b66e5eb612	[AMDGPU] Refactor MUBUF/MTBUF instructions Differential revision: https://reviews.llvm.org/D24295 llvm-svn: 281137	2016-09-10 13:09:16 +00:00
Wei Ding	06f8d39424	AMDGPU : Fix mqsad_u32_u8 instruction incorrect data type. Differential Revision: http://reviews.llvm.org/D23700 llvm-svn: 281081	2016-09-09 19:31:51 +00:00
Sam Kolton	1eeb11bfd4	AMDGPU] Assembler: better support for immediate literals in assembler. Summary: Prevously assembler parsed all literals as either 32-bit integers or 32-bit floating-point values. Because of this we couldn't support f64 literals. E.g. in instruction "v_fract_f64 v[0:1], 0.5", literal 0.5 was encoded as 32-bit literal 0x3f000000, which is incorrect and will be interpreted as 3.0517578125E-5 instead of 0.5. Correct encoding is inline constant 240 (optimal) or 32-bit literal 0x3FE00000 at least. With this change the way immediate literals are parsed is changed. All literals are always parsed as 64-bit values either integer or floating-point. Then we convert parsed literals to correct form based on information about type of operand parsed (was literal floating or binary) and type of expected instruction operands (is this f32/64 or b32/64 instruction). Here are rules how we convert literals: - We parsed fp literal: - Instruction expects 64-bit operand: - If parsed literal is inlinable (e.g. v_fract_f64_e32 v[0:1], 0.5) - then we do nothing this literal - Else if literal is not-inlinable but instruction requires to inline it (e.g. this is e64 encoding, v_fract_f64_e64 v[0:1], 1.5) - report error - Else literal is not-inlinable but we can encode it as additional 32-bit literal constant - If instruction expect fp operand type (f64) - Check if low 32 bits of literal are zeroes (e.g. v_fract_f64 v[0:1], 1.5) - If so then do nothing - Else (e.g. v_fract_f64 v[0:1], 3.1415) - report warning that low 32 bits will be set to zeroes and precision will be lost - set low 32 bits of literal to zeroes - Instruction expects integer operand type (e.g. s_mov_b64_e32 s[0:1], 1.5) - report error as it is unclear how to encode this literal - Instruction expects 32-bit operand: - Convert parsed 64 bit fp literal to 32 bit fp. Allow lose of precision but not overflow or underflow - Is this literal inlinable and are we required to inline literal (e.g. v_trunc_f32_e64 v0, 0.5) - do nothing - Else report error - Do nothing. We can encode any other 32-bit fp literal (e.g. v_trunc_f32 v0, 10000000.0) - Parsed binary literal: - Is this literal inlinable (e.g. v_trunc_f32_e32 v0, 35) - do nothing - Else, are we required to inline this literal (e.g. v_trunc_f32_e64 v0, 35) - report error - Else, literal is not-inlinable and we are not required to inline it - Are high 32 bit of literal zeroes or same as sign bit (32 bit) - do nothing (e.g. v_trunc_f32 v0, 0xdeadbeef) - Else - report error (e.g. v_trunc_f32 v0, 0x123456789abcdef0) For this change it is required that we know operand types of instruction (are they f32/64 or b32/64). I added several new register operands (they extend previous register operands) and set operand types to corresponding types: ''' enum OperandType { OPERAND_REG_IMM32_INT, OPERAND_REG_IMM32_FP, OPERAND_REG_INLINE_C_INT, OPERAND_REG_INLINE_C_FP, } ''' This is not working yet: - Several tests are failing - Problems with predicate methods for inline immediates - LLVM generated assembler parts try to select e64 encoding before e32. More changes are required for several AsmOperands. Reviewers: vpykhtin, tstellarAMD Subscribers: arsenm, kzhuravl, artem.tamazov Differential Revision: https://reviews.llvm.org/D22922 llvm-svn: 281050	2016-09-09 14:44:04 +00:00
Sam Kolton	d63d8a7c05	[AMDGPU] Assembler: match e32 VOP instructions before e64. Summary: Split assembler match table in 4 tables with assembler variants: Default - all instructions except VOP3, SDWA and DPP - VOP3 - SDWA - DPP First match Default table then VOP3, SDWA and DPP. Reviewers: tstellarAMD, artem.tamazov, vpykhtin Subscribers: arsenm, wdng, nhaehnle, AMDGPU Differential Revision: https://reviews.llvm.org/D24252 llvm-svn: 281023	2016-09-09 09:37:51 +00:00
Valery Pykhtin	8bc659637c	[AMDGPU] Refactor FLAT TD instructions Differential revision: https://reviews.llvm.org/D24072 llvm-svn: 280655	2016-09-05 11:22:51 +00:00
Changpeng Fang	b28fe0307f	AMDGPU/SI: MIMG TD Refactoring. Summary: Created a new td file MIMGInstructions.td which contains all definitions of MIMG related instructions. Reviewed by: kzhuravl, vpykhtin Differential Revision: http://reviews.llvm.org/D24106 llvm-svn: 280385	2016-09-01 17:54:54 +00:00
Valery Pykhtin	1b13886b5f	[AMDGPU] Scalar Memory instructions TD refactoring Differential revision: https://reviews.llvm.org/D23996 llvm-svn: 280349	2016-09-01 09:56:47 +00:00
Valery Pykhtin	a34fb49f8f	[AMDGPU] Refactor SOP instructions TD files. Differential revision: https://reviews.llvm.org/D23617 llvm-svn: 280101	2016-08-30 15:20:31 +00:00
Wei Ding	52bb661dec	AMDGPU : Fix QSAD and MQSAD instructions' incorrect data type. Differential Revision: http://reviews.llvm.org/D23689 llvm-svn: 279126	2016-08-18 19:51:14 +00:00
Matt Arsenault	18da70dd2d	AMDGPU: Remove unused tablegen utilities llvm-svn: 278414	2016-08-11 21:08:43 +00:00
Wei Ding	70cda07526	AMDGPU : Add intrinsic for instruction v_cvt_pk_u8_f32 Differential Revision: http://reviews.llvm.org/D23336 llvm-svn: 278403	2016-08-11 20:34:48 +00:00
Nicolai Haehnle	8a482b33fe	AMDGPU: Stay in WQM for non-intrinsic stores Summary: Two types of stores are possible in pixel shaders: stores to memory that are explicitly requested at the API level, and stores that are an implementation detail of register spilling or lowering of arrays. For the first kind of store, we must ensure that helper pixels have no effect and hence WQM must be disabled. The second kind of store must always be executed, because the written value may be loaded again in a way that is relevant for helper pixels as well -- and there are no externally visible effects anyway. This is a candidate for the 3.9 release branch. Reviewers: arsenm, tstellarAMD, mareko Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: https://reviews.llvm.org/D22675 llvm-svn: 277504	2016-08-02 19:31:14 +00:00
Valery Pykhtin	902db3101b	[AMDGPU] refactor DS instruction definitions. NFC. Differential revision: https://reviews.llvm.org/D22522 llvm-svn: 277344	2016-08-01 14:21:30 +00:00
Matt Arsenault	dc744412ad	AMDGPU: Remove unused pattern llvm-svn: 277258	2016-07-30 01:40:30 +00:00
Matt Arsenault	cb540bc03c	AMDGPU: Expand register indexing pseudos in custom inserter This is to help moveSILowerControlFlow to before regalloc. There are a couple of tradeoffs with this. The complete CFG is visible to more passes, the loop body avoids an extra copy of m0, vcc isn't required, and immediate offsets can be shrunk into s_movk_i32. The disadvantage is the register allocator doesn't understand that the single lane's vector is dead within the loop body, so an extra register is used to outlive the loop block when expanding the VGPR -> m0 loop. This also now results in worse waitcnt insertion before the loop instead of after for pending operations at the point of the indexing, but that should be fixed by future improvements to cross block waitcnt insertion. v_movreld_b32's operands are now modeled more correctly since vdst is not a true output. This is kind of a hack to treat vdst as a use operand. Extra checking is required in the verifier since I can't seem to get tablegen to emit an implicit operand for a virtual register. llvm-svn: 275934	2016-07-19 00:35:03 +00:00
Matt Arsenault	10531d1020	AMDGPU: Set isConvergent on v_cmpx* instructions No test since these aren't used now, except for one place in a pre-emit pass. llvm-svn: 275200	2016-07-12 18:41:03 +00:00
Nicolai Haehnle	7968c34586	AMDGPU: Unify MOVRELSOffset and MOVRELDOffset Summary: Previously, constant index insertelements would be turned into SI_INDIRECT_DST, which is bound to prevent some optimization opportunities. Worse, it mislead the heuristic that decides whether immediates should be lowered to S_MOV_B32 or V_MOV_B32 in a way that resulted in unnecessary v_readfirstlanes. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: http://reviews.llvm.org/D22217 llvm-svn: 275160	2016-07-12 08:12:16 +00:00
Nicolai Haehnle	c06bfa1daa	AMDGPU: Treat texture gather instructions more like other MIMG instructions Summary: Setting MIMG to 0 has a bunch of unexpected side effects, including that isVMEM returns false which leads to incorrect treatment in the hazard recognizer. The reason I noticed it is that it also leads to incorrect treatment in VGPR-to-SGPR copies, which is one cause of the referenced bug. The only reason why MIMG was set to 0 is to signal the special handling of dmasks, but that can be checked differently. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96877 Reviewers: arsenm, tstellarAMD Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: http://reviews.llvm.org/D22210 llvm-svn: 275113	2016-07-11 21:59:43 +00:00
Matt Arsenault	1322b6f8bb	AMDGPU: Improve offset folding for register indexing llvm-svn: 274954	2016-07-09 01:13:56 +00:00
Valery Pykhtin	68853ab2c5	[AMDGPU] fix ds_swizzle_b32 opcode for VI (bz 28371) Differential Revision: http://reviews.llvm.org/D22049 llvm-svn: 274852	2016-07-08 15:12:46 +00:00
Valery Pykhtin	af8b1bddbd	[AMDGPU] fix ds_write_src2 encoding (bz26027) Differential revision: http://reviews.llvm.org/D22041 llvm-svn: 274756	2016-07-07 14:23:38 +00:00
Matt Arsenault	ffc8275f2b	AMDGPU: Fix folding SGPRs into madak/madmk src0 Because of the special immediate operand, the constant bus is already used so SGPRs are never useful. r263212 changed the name of the immediate operand, which broke the verifier check for the restriction. llvm-svn: 274564	2016-07-05 17:09:01 +00:00
Tom Stellard	a4b746d808	AMDGPU/SI: Remove address space query functions from AMDGPUDAGToDAGISel Summary: These have been replaced with TableGen code (except for isConstantLoad, which is still used for R600). The queries were broken for cases where MemOperand was a PseudoSourceValue. Reviewers: arsenm Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: http://reviews.llvm.org/D21684 llvm-svn: 274561	2016-07-05 16:10:44 +00:00
Valery Pykhtin	e65b39ec09	[AMDGPU] rename DS_1A1D_Off8_NORET to DS_1A2D_Off8_NORET as ds_write2xx use 2 source registers. NFC. llvm-svn: 274556	2016-07-05 15:15:28 +00:00
Tom Stellard	17a0ec5400	AMDGPU/SI: Remove hack for selecting < 32-bit loads to MUBUF instructions Summary: The isGlobalLoad() query was returning true for constant address space loads with memory types less than 32-bits, which is wrong. This logic has been replaced with PatFrag in the TableGen files, to provide the same functionality. Reviewers: arsenm Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: http://reviews.llvm.org/D21696 llvm-svn: 274521	2016-07-04 20:41:48 +00:00
Matt Arsenault	7f681ac7a9	AMDGPU: Add feature for unaligned access llvm-svn: 274398	2016-07-01 23:03:44 +00:00
Sam Kolton	5196b88f07	[AMDGPU] Assembler: support SDWA for VOPC instructions Summary: dst_sel and dst_unused disabled for VOPC as they have no effect on result Reviewers: artem.tamazov, tstellarAMD, vpykhtin Subscribers: arsenm, kzhuravl Differential Revision: http://reviews.llvm.org/D21376 llvm-svn: 274340	2016-07-01 09:59:21 +00:00
Matt Arsenault	43e92fe306	AMDGPU: Cleanup subtarget handling. Split AMDGPUSubtarget into amdgcn/r600 specific subclasses. This removes most of the static_casting of the basic codegen classes everywhere, and tries to restrict the features visible on the wrong target. llvm-svn: 273652	2016-06-24 06:30:11 +00:00
Wei Ding	ab3d91b8f1	AMDGPU: Add v_mad 16-bit instructions definition. Differential Revision: http://reviews.llvm.org/D21362 llvm-svn: 272919	2016-06-16 16:50:04 +00:00
Tom Stellard	bf3e6e5bb4	AMDGPU/SI: Refactor fixup handling for constant addrspace variables Summary: We now use a standard fixup type applying the pc-relative address of constant address space variables, and we have the GlobalAddress lowering code add the required 4 byte offset to the global address rather than doing it as part of the fixup. This refactoring will make it easier to use the same code for global address space variables and also simplifies the code. Re-commit this after fixing a bug where we were trying to use a reference to a Triple object that had already been destroyed. Reviewers: arsenm, kzhuravl Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: http://reviews.llvm.org/D21154 llvm-svn: 272705	2016-06-14 20:29:59 +00:00
Tom Stellard	b1a523fa68	Revert "AMDGPU/SI: Refactor fixup handling for constant addrspace variables" This reverts commit r272675. llvm-svn: 272677	2016-06-14 15:16:35 +00:00
Tom Stellard	5e6298b0f2	AMDGPU/SI: Refactor fixup handling for constant addrspace variables Summary: We now use a standard fixup type applying the pc-relative address of constant address space variables, and we have the GlobalAddress lowering code add the required 4 byte offset to the global address rather than doing it as part of the fixup. This refactoring will make it easier to use the same code for global address space variables and also simplifies the code. Reviewers: arsenm, kzhuravl Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: http://reviews.llvm.org/D21154 llvm-svn: 272675	2016-06-14 15:11:01 +00:00
Sam Kolton	945231ada5	[AMDGPU] AsmParser: Support for sext() modifier in SDWA. Some code cleaning in AMDGPUOperand. Summary: sext() modifier is supported in SDWA instructions only for integer operands. Spec is unclear should integer operands support abs and neg modifiers with sext - for now they are not supported. Renamed InputModsWithNoDefault to FloatInputMods. Added SextInputMods for operands that support sext() modifier. Added AMDGPUOperand::Modifier struct to handle register and immediate modifiers. Code cleaning in AMDGPUOperand class: organize method in groups (render-, predicate-methods...). Reviewers: vpykhtin, artem.tamazov, tstellarAMD Subscribers: arsenm, kzhuravl Differential Revision: http://reviews.llvm.org/D20968 llvm-svn: 272384	2016-06-10 09:57:59 +00:00
Matt Arsenault	58ddad5bd6	AMDGPU: v_cndmask_b32 does not def vcc Fixes verifier errors after SIShrinkInstructions. llvm-svn: 272351	2016-06-10 00:18:41 +00:00
Matt Arsenault	7757c59e48	AMDGPU: Fix flat atomics The flat atomics could already be selected, but only when using flat instructions for global memory. Add patterns for flat addresses. llvm-svn: 272345	2016-06-09 23:42:54 +00:00
Artem Tamazov	135487767b	[AMDGPU][llvm-mc] v_cndmask_b32: src2 is mandatory; do not enforce VOP2 when src2 == VCC. Another step for unification llvm assembler/disassembler with sp3. Besides, CodeGen output is a bit improved, thus changes in CodeGen tests. Assembler/Disassembler tests updated/added. Differential Revision: http://reviews.llvm.org/D20796 llvm-svn: 271900	2016-06-06 15:23:43 +00:00
Sam Kolton	a4a99ad1bc	[AMDGPU] Assembler: More tests for SDWA instructions. Fix for SDWA float modifiers. Summary: Depends on D20625 Reviewers: tstellarAMD, vpykhtin, artem.tamazov Subscribers: arsenm, kzhuravl Differential Revision: http://reviews.llvm.org/D20674 llvm-svn: 271662	2016-06-03 11:43:09 +00:00
Sam Kolton	05ef1c940f	[AMDGPU] Assembler: Custom converters for SDWA instructions. Support for _dpp and _sdwa suffixes in mnemonics. Summary: Added custom converters for SDWA instruction to support optional operands and modifiers. Support for _dpp and _sdwa suffixes that allows to force DPP or SDWA encoding for instructions. Reviewers: tstellarAMD, vpykhtin, artem.tamazov Subscribers: arsenm, kzhuravl Differential Revision: http://reviews.llvm.org/D20625 llvm-svn: 271655	2016-06-03 10:27:37 +00:00
Matt Arsenault	d8d304d1d6	AMDGPU: Fix trailing whitespace llvm-svn: 271081	2016-05-28 00:50:51 +00:00
Sam Kolton	11de370cca	[AMDGPU] Assembler: rework parsing of optional operands. Summary: Change process of parsing of optional operands. All optional operands use same parsing method - parseOptionalOperand(). No default values are added to OperandsVector. Get rid of WORKAROUND_USE_DUMMY_OPERANDS_INSTEAD_MUTIPLE_DEFAULT_OPERANDS. Reviewers: tstellarAMD, vpykhtin, artem.tamazov, nhaustov Subscribers: arsenm, kzhuravl Differential Revision: http://reviews.llvm.org/D20527 llvm-svn: 270556	2016-05-24 12:38:33 +00:00
Sam Kolton	1bdcef7697	[AMDGPU] Assembler: refactor parsing of modifiers and immediates. Allow modifiers for imms. Reviewers: nhaustov, tstellarAMD Subscribers: kzhuravl, arsenm Differential Revision: http://reviews.llvm.org/D20166 llvm-svn: 270415	2016-05-23 09:59:02 +00:00
Artem Tamazov	8ce1f7177b	[AMDGPU][llvm-mc] Fixes to support buffer atomics. Fixes for MUBUF_Atomic instructions to make operand list valid: - For RTN insns, make a copy of $vdata_in operand as $vdata. - Do not add operand for GLC, it is hardcoded and comes as a token. Workaround to avoid adding multiple default optional operands. Tests added. Differential Revision: http://reviews.llvm.org/D20257 llvm-svn: 270049	2016-05-19 12:22:39 +00:00
Jan Vesely	fbcb754e66	AMDGPU: Make CONST_DATA_PTR available to R600 Rename to AMDGPUconstdata_ptr Reviewers: tstellard Subscribers: arsenm Differential Revision: http://reviews.llvm.org/D19786 llvm-svn: 269474	2016-05-13 20:39:18 +00:00
Artem Tamazov	f0b6b40fa4	[AMDGPU][llvm-mc] Some refactoring of .td files Some custom Operands and AsmOperandClasses moved to proper place. No functional changes. Differential Revision: http://reviews.llvm.org/D20012 llvm-svn: 268780	2016-05-06 19:32:38 +00:00
Artem Tamazov	ebe71ce36a	[AMDGPU][llvm-mc] Add support for sendmsg(...) syntax. Added support for sendmsg(MSG[, OP[, STREAM_ID]]) syntax in s_sendmsg and s_sendmsghalt instructions. The syntax matches the SP3 assembler/disassembler rules. That is why implicit inputs (like M0 and EXEC) are not printed to disassembly output anymore. sendmsg(...) allows only known message types and attributes, even if literals are used instead of symbolic names. However, raw literal (without "sendmsg") still can be used, and that allows for any 16-bit value. Tests updated/added. Differential Revision: http://reviews.llvm.org/D19596 llvm-svn: 268762	2016-05-06 17:48:48 +00:00
Sam Kolton	5f10a137d0	[TableGen] AsmMatcher: support for default values for optional operands Summary: This change allows to specify "DefaultMethod" for optional operand (IsOptional = 1) in AsmOperandClass that return default value for operand. This is used in convertToMCInst to set default values in MCInst. Previously if you wanted to set default value for operand you had to create custom converter method. With this change it is possible to use standard converters even when optional operands presented. Reviewers: tstellarAMD, ab, craig.topper Subscribers: jyknight, dsanders, arsenm, nhaustov, llvm-commits Differential Revision: http://reviews.llvm.org/D18242 llvm-svn: 268726	2016-05-06 11:31:17 +00:00
Tom Stellard	92b24f324b	AMDGPU/SI: Add offset field to ds_permute/ds_bpermute instructions Summary: These instructions can add an immediate offset to the address, like other ds instructions. Reviewers: arsenm Subscribers: arsenm, scchan Differential Revision: http://reviews.llvm.org/D19233 llvm-svn: 268043	2016-04-29 14:34:26 +00:00
Nikolay Haustov	4f672a34ed	AMDGPU/SI: Assembler: Unify parsing/printing of operands. Summary: The goal is for each operand type to have its own parse function and at the same time share common code for tracking state as different instruction types share operand types (e.g. glc/glc_flat, etc). Introduce parseAMDGPUOperand which can parse any optional operand. DPP and Clamp/OMod have custom handling for now. Sam also suggested to have class hierarchy for operand types instead of table. This can be done in separate change. Remove parseVOP3OptionalOps, parseDS*OptionalOps, parseFlatOptionalOps, parseMubufOptionalOps, parseDPPOptionalOps. Reduce number of definitions of AsmOperand's and MatchClasses' by using common base class. Rename AsmMatcher/InstPrinter methods accordingly. Print immediate type when printing parsed immediate operand. Use 'off' if offset/index register is unused instead of skipping it to make it more readable (also agreed with SP3). Update tests. Reviewers: tstellarAMD, SamWot, artem.tamazov Subscribers: qcolombet, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19584 llvm-svn: 268015	2016-04-29 09:02:30 +00:00
Sam Kolton	3025e7f25f	[AMDGPU] Assembler: basic support for SDWA instructions Support for SDWA instructions for VOP1 and VOP2 encoding. Not done yet: - converters for support optional operands and modifiers - VOPC - sext() modifier - intrinsics - VOP2b (see vop_dpp.s) - V_MAC_F32 (see vop_dpp.s) Differential Revision: http://reviews.llvm.org/D19360 llvm-svn: 267553	2016-04-26 13:33:56 +00:00
Artem Tamazov	d6468666b5	[AMDGPU][llvm-mc] s_getreg/setreg* - Add hwreg(...) syntax. Added hwreg(reg[,offset,width]) syntax. Default offset = 0, default width = 32. Possibility to specify 16-bit immediate kept. Added out-of-range checks. Disassembling is always to hwreg(...) format. Tests updated/added. Differential Revision: http://reviews.llvm.org/D19329 llvm-svn: 267410	2016-04-25 14:13:51 +00:00
Matt Arsenault	a9dbdcae04	AMDGPU: Add atomic_inc + atomic_dec intrinsics These are different than atomicrmw add 1 because they have an additional input value to clamp the result. llvm-svn: 266074	2016-04-12 14:05:04 +00:00
Matt Arsenault	21ecfe43ba	AMDGPU: Remove trailing whitespace llvm-svn: 266073	2016-04-12 14:04:54 +00:00
Tom Stellard	0ffdf65eaa	Revert "AMDGPU/SI: Do not generate s_waitcnt after ds_permute/ds_bpermute" This reverts commit r263720. Just confirmed that s_waitcnt is required after ds_permute/ds_bpermute. llvm-svn: 265992	2016-04-11 20:38:40 +00:00
Jan Vesely	43b7b5b846	AMDGPU/SI: Implement atomic load/store for i32 and i64 Standard load/store instructions with GLC bit set. Reviewers: tstellardAMD, arsenm Differential Revision: http://reviews.llvm.org/D18760 llvm-svn: 265709	2016-04-07 19:23:11 +00:00
Tom Stellard	9112758077	AMDGPU/SI: Add latency for export instructions Reviewers: arsenm, nhaehnle Subscribers: nhaehnle, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18599 llvm-svn: 265708	2016-04-07 18:30:05 +00:00
Sam Kolton	ff90c60a78	[AMDGPU] AsmParser: disable DPP for unsupported instructions. New dpp tests. Fix v_nop_dpp. Summary: 1. Disable DPP encoding for instructions that do not support it: - VOP1: - v_readfirstlane_b32 - v_clrexcp - v_movreld_b32 - v_movrels_b32 - v_movrelsd_b32 - VOP2: - v_madmk_f16/32 - v_madak_f16/32 - VOPC, VINTRP, VOP3 2. Fix DPP for v_nop 3. New DPP tests for VOP1 and VOP2 instructions Reviewers: nhaustov, tstellarAMD, vpykhtin Subscribers: tstellarAMD, arsenm Differential Revision: http://reviews.llvm.org/D18552 llvm-svn: 265538	2016-04-06 13:29:59 +00:00
Valery Pykhtin	5b3559c1ec	[AMDGPU] fix MADAK/MADMK instructions operand namings to match encoding fields. $vsrc1 -> $src1, $k -> $imm Differential Revision: http://reviews.llvm.org/D18659 llvm-svn: 265141	2016-04-01 13:13:12 +00:00
Sam Kolton	1048fb1818	[AMDGPU] Disassembler: support for DPP Review: http://reviews.llvm.org/D18642 llvm-svn: 265015	2016-03-31 14:15:04 +00:00
Valery Pykhtin	c0a77c5064	[AMDGPU] Fix missing assembler predicates. Differential Revision: http://reviews.llvm.org/D18351 llvm-svn: 264137	2016-03-23 04:27:26 +00:00
Nicolai Haehnle	ad63638f6d	AMDGPU/SI: Add llvm.amdgcn.buffer.atomic.* intrinsics Summary: These intrinsics expose the BUFFER_ATOMIC_* instructions and will be used by Mesa to implement atomics with buffer semantics. The intrinsic interface matches that of buffer.load.format and buffer.store.format, except that the GLC bit is not exposed (it is automatically deduced based on whether the return value is used). The change of hasSideEffects is required for TableGen to accept the pattern that matches the intrinsic. Reviewers: tstellarAMD, arsenm Subscribers: arsenm, rivanvx, llvm-commits Differential Revision: http://reviews.llvm.org/D18151 llvm-svn: 263791	2016-03-18 16:24:31 +00:00

1 2 3 4 5

220 Commits