llvm-project

Commit Graph

Author	SHA1	Message	Date
Konstantin Zhuravlyov	5b0bf2ff0d	AMDGPU: Remove deprecated and unused elf definitions Differential Revision: https://reviews.llvm.org/D33689 llvm-svn: 304737	2017-06-05 21:33:40 +00:00
Sam Kolton	f7659d71eb	[AMDGPU] SDWA: Add assembler support for GFX9 Summary: Added separate pseudo and real instruction for GFX9 SDWA instructions. Currently supports only in assembler. Depends D32493 Reviewers: vpykhtin, artem.tamazov Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D33132 llvm-svn: 303620	2017-05-23 10:08:55 +00:00
Matt Arsenault	2b1f9aa577	AMDGPU: Start defining a calling convention Partially implement callee-side for arguments and return values. byval doesn't work properly, and most likely sret or other on-stack return values most as well. llvm-svn: 303308	2017-05-17 21:56:25 +00:00
Matt Arsenault	efa9f4b210	AMDGPU: Refactor SIMachineFunctionInfo slightly Prepare for handling non-entry functions. llvm-svn: 299999	2017-04-11 22:29:28 +00:00
Matt Arsenault	e622dc3803	AMDGPU: Refactor argument lowering Split into smaller functions and prepare for handling non-entry functions. llvm-svn: 299998	2017-04-11 22:29:24 +00:00
Matt Arsenault	678e111e11	AMDGPU: Fix crash when disassembling VOP3 mac The unused dummy src2_modifiers is missing, so it crashes when trying to print it. I tried to fully remove src2_modifiers, but there are some irritations in the places where it is converted to mad since it starts to require modifying use lists while iterating over them. llvm-svn: 299861	2017-04-10 17:58:06 +00:00
Yaxun Liu	1a14bfa022	[AMDGPU] Get address space mapping by target triple environment As we introduced target triple environment amdgiz and amdgizcl, the address space values are no longer enums. We have to decide the value by target triple. The basic idea is to use struct AMDGPUAS to represent address space values. For address space values which are not depend on target triple, use static const members, so that they don't occupy extra memory space and is equivalent to a compile time constant. Since the struct is lightweight and cheap, it can be created on the fly at the point of usage. Or it can be added as member to a pass and created at the beginning of the run* function. Differential Revision: https://reviews.llvm.org/D31284 llvm-svn: 298846	2017-03-27 14:04:01 +00:00
Dmitry Preobrazhensky	03880f8d24	[AMDGPU][MC] Fix for Bug 30829 + LIT tests Added code to check constant bus restrictions for VOP formats (only one SGPR value or literal-constant may be used by the instruction). Note that the same checks are performed by SIInstrInfo::verifyInstruction (used by lowering code). Added LIT tests. llvm-svn: 296873	2017-03-03 14:31:06 +00:00
Matt Arsenault	9be7b0d485	AMDGPU: Add VOP3P instruction format Add a few non-VOP3P but instructions related to packed. Includes hack with dummy operands for the benefit of the assembler llvm-svn: 296368	2017-02-27 18:49:11 +00:00
Matt Arsenault	e823d92f7f	AMDGPU: Merge initial gfx9 support llvm-svn: 295554	2017-02-18 18:29:53 +00:00
Eugene Zelenko	d96089b248	[MC] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). Same changes in files affected by reduced MC headers dependencies. llvm-svn: 295009	2017-02-14 00:33:36 +00:00
Konstantin Zhuravlyov	9f89ede107	[AMDGPU] Add target information that is required by tools to metadata Differential Revision: https://reviews.llvm.org/D28760#fb670e28 llvm-svn: 294449	2017-02-08 14:05:23 +00:00
Tom Stellard	08efb7ebf6	AMDGPU/SI: Move some ISel helpers into utils so they can be shared with GISel Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D29068 llvm-svn: 293321	2017-01-27 18:41:14 +00:00
Matt Arsenault	4bd7236193	AMDGPU: Fix handling of 16-bit immediates Since 32-bit instructions with 32-bit input immediate behavior are used to materialize 16-bit constants in 32-bit registers for 16-bit instructions, determining the legality based on the size is incorrect. Change operands to have the size specified in the type. Also adds a workaround for a disassembler bug that produces an immediate MCOperand for an operand that is supposed to be OPERAND_REGISTER. The assembler appears to accept out of bounds immediates and truncates them, but this seems to be an issue for 32-bit already. llvm-svn: 289306	2016-12-10 00:39:12 +00:00
Matt Arsenault	26faed3960	AMDGPU: Consolidate inline immediate predicate functions llvm-svn: 288718	2016-12-05 22:26:17 +00:00
Tom Stellard	b133fbb9a4	AMDGPU/SI: Handle hazard with > 8 byte VMEM stores Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D25577 llvm-svn: 285359	2016-10-27 23:05:31 +00:00
Konstantin Zhuravlyov	08326b6256	[AMDGPU] Emit constant address space data in .rodata section and use relocations instead of fixups (amdhsa only) Differential Revision: https://reviews.llvm.org/D25693 llvm-svn: 284759	2016-10-20 18:12:38 +00:00
Krzysztof Parzyszek	c87155037b	[AMDGPU] Stop using MCRegisterClass::getSize() Differential Review: https://reviews.llvm.org/D24675 llvm-svn: 284619	2016-10-19 17:40:36 +00:00
Konstantin Zhuravlyov	cdd4547607	[AMDGPU] Refactor waitcnt encoding - Refactor bit packing/unpacking - Calculate bit mask given bit shift and bit width - Introduce function for decoding bits of waitcnt - Introduce function for encoding bits of waitcnt - Introduce function for getting waitcnt mask (instead of using bare numbers) - Introduce function fot getting max waitcnt(s) (instead of using bare numbers) Differential Revision: https://reviews.llvm.org/D25298 llvm-svn: 283919	2016-10-11 18:58:22 +00:00
Sam Kolton	a3ec5c10e2	[AMDGPU] Assembler: support v_mac_f32 DPP and SDWA. Move getNamedOperandIdx to AMDGPUBaseInfo.h Reviewers: artem.tamazov, tstellarAMD Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D25084 llvm-svn: 283560	2016-10-07 14:46:06 +00:00
Konstantin Zhuravlyov	836cbff695	[AMDGPU] Choose VMCNT, EXPCNT, LGKMCNT masks and shifts based on the isa version Differential Revision: https://reviews.llvm.org/D24973 llvm-svn: 282877	2016-09-30 17:01:40 +00:00
Sam Kolton	1eeb11bfd4	AMDGPU] Assembler: better support for immediate literals in assembler. Summary: Prevously assembler parsed all literals as either 32-bit integers or 32-bit floating-point values. Because of this we couldn't support f64 literals. E.g. in instruction "v_fract_f64 v[0:1], 0.5", literal 0.5 was encoded as 32-bit literal 0x3f000000, which is incorrect and will be interpreted as 3.0517578125E-5 instead of 0.5. Correct encoding is inline constant 240 (optimal) or 32-bit literal 0x3FE00000 at least. With this change the way immediate literals are parsed is changed. All literals are always parsed as 64-bit values either integer or floating-point. Then we convert parsed literals to correct form based on information about type of operand parsed (was literal floating or binary) and type of expected instruction operands (is this f32/64 or b32/64 instruction). Here are rules how we convert literals: - We parsed fp literal: - Instruction expects 64-bit operand: - If parsed literal is inlinable (e.g. v_fract_f64_e32 v[0:1], 0.5) - then we do nothing this literal - Else if literal is not-inlinable but instruction requires to inline it (e.g. this is e64 encoding, v_fract_f64_e64 v[0:1], 1.5) - report error - Else literal is not-inlinable but we can encode it as additional 32-bit literal constant - If instruction expect fp operand type (f64) - Check if low 32 bits of literal are zeroes (e.g. v_fract_f64 v[0:1], 1.5) - If so then do nothing - Else (e.g. v_fract_f64 v[0:1], 3.1415) - report warning that low 32 bits will be set to zeroes and precision will be lost - set low 32 bits of literal to zeroes - Instruction expects integer operand type (e.g. s_mov_b64_e32 s[0:1], 1.5) - report error as it is unclear how to encode this literal - Instruction expects 32-bit operand: - Convert parsed 64 bit fp literal to 32 bit fp. Allow lose of precision but not overflow or underflow - Is this literal inlinable and are we required to inline literal (e.g. v_trunc_f32_e64 v0, 0.5) - do nothing - Else report error - Do nothing. We can encode any other 32-bit fp literal (e.g. v_trunc_f32 v0, 10000000.0) - Parsed binary literal: - Is this literal inlinable (e.g. v_trunc_f32_e32 v0, 35) - do nothing - Else, are we required to inline this literal (e.g. v_trunc_f32_e64 v0, 35) - report error - Else, literal is not-inlinable and we are not required to inline it - Are high 32 bit of literal zeroes or same as sign bit (32 bit) - do nothing (e.g. v_trunc_f32 v0, 0xdeadbeef) - Else - report error (e.g. v_trunc_f32 v0, 0x123456789abcdef0) For this change it is required that we know operand types of instruction (are they f32/64 or b32/64). I added several new register operands (they extend previous register operands) and set operand types to corresponding types: ''' enum OperandType { OPERAND_REG_IMM32_INT, OPERAND_REG_IMM32_FP, OPERAND_REG_INLINE_C_INT, OPERAND_REG_INLINE_C_FP, } ''' This is not working yet: - Several tests are failing - Problems with predicate methods for inline immediates - LLVM generated assembler parts try to select e64 encoding before e32. More changes are required for several AsmOperands. Reviewers: vpykhtin, tstellarAMD Subscribers: arsenm, kzhuravl, artem.tamazov Differential Revision: https://reviews.llvm.org/D22922 llvm-svn: 281050	2016-09-09 14:44:04 +00:00
Konstantin Zhuravlyov	1d65026ca6	[AMDGPU] Wave and register controls - Implemented amdgpu-flat-work-group-size attribute - Implemented amdgpu-num-active-waves-per-eu attribute - Implemented amdgpu-num-sgpr attribute - Implemented amdgpu-num-vgpr attribute - Dynamic LDS constraints are in a separate patch Patch by Tom Stellard and Konstantin Zhuravlyov Differential Revision: https://reviews.llvm.org/D21562 llvm-svn: 280747	2016-09-06 20:22:28 +00:00
Matt Arsenault	8300272823	AMDGPU: Fix getIntegerAttribute type and error message llvm-svn: 269268	2016-05-12 02:45:18 +00:00
Tom Stellard	79a1fd718c	AMDGPU: allow specifying a workgroup size that needs to fit in a compute unit Summary: For GL_ARB_compute_shader we need to support workgroup sizes of at least 1024. However, if we want to allow large workgroup sizes, we may need to use less registers, as we have to run more waves per SIMD. This patch adds an attribute to specify the maximum work group size the compiled program needs to support. It defaults, to 256, as that has no wave restrictions. Reducing the number of registers available is done similarly to how the registers were reserved for chips with the sgpr init bug. Reviewers: mareko, arsenm, tstellarAMD, nhaehnle Subscribers: FireBurn, kerberizer, llvm-commits, arsenm Differential Revision: http://reviews.llvm.org/D18340 Patch By: Bas Nieuwenhuizen llvm-svn: 266337	2016-04-14 16:27:07 +00:00
Nicolai Haehnle	df3a20cd80	AMDGPU: Add a shader calling convention This makes it possible to distinguish between mesa shaders and other kernels even in the presence of compute shaders. Patch By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Differential Revision: http://reviews.llvm.org/D18559 llvm-svn: 265589	2016-04-06 19:40:20 +00:00
Marek Olsak	fccabaf57e	AMDGPU/SI: Add new target attribute InitialPSInputAddr Summary: This allows Mesa to pass initial SPI_PS_INPUT_ADDR to LLVM. The register assigns VGPR locations to PS inputs, while the ENA register determines whether or not they are loaded. Mesa needs to set some inputs as not-movable, so that a pixel shader prolog binary appended at the beginning can assume where some inputs are. v2: Make PSInputAddr private, because there is never enough silly getters and setters for people to read. Reviewers: tstellarAMD, arsenm Subscribers: arsenm Differential Revision: http://reviews.llvm.org/D16030 llvm-svn: 257591	2016-01-13 11:45:36 +00:00
Tom Stellard	2b65ed306d	AMDGPU/SI: Fix encoding for FLAT_SCRATCH registers on VI Summary: These register has different encodings on CI and VI, so we add pseudo FLAT_SCRACTH registers to be used before MC, and subtarget specific registers to be used by the MC layer. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15661 llvm-svn: 256178	2015-12-21 18:44:27 +00:00
Tom Stellard	ac00eb5470	AMDGPU/SI: Add getShaderType() function to Utils/ Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15424 llvm-svn: 255650	2015-12-15 16:26:16 +00:00
Tom Stellard	9760f03757	AMDGPU/SI: Emit constant arrays in the .hsrodata_readonly_agent section Summary: This is done only when targeting HSA. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D13807 llvm-svn: 254587	2015-12-03 03:34:32 +00:00
Tom Stellard	00f2f91af4	AMDGPU/SI: Correctly emit agent global segment variables when targeting HSA Differential Revision: http://reviews.llvm.org/D14508 llvm-svn: 254540	2015-12-02 19:47:57 +00:00
Tom Stellard	e3b5aeaf83	AMDGPU/SI: Don't emit group segment global variables Summary: Only global or readonly segment variables should appear in object files. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15111 llvm-svn: 254519	2015-12-02 17:00:42 +00:00
Tom Stellard	e135ffd554	AMDGPU/SI: Use .hsatext section instead of .text for HSA Reviewers: arsenm, grosbach, rafael Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D12424 llvm-svn: 248619	2015-09-25 21:41:28 +00:00
Tom Stellard	ff7416ba06	AMDGPU/SI: Update amd_kernel_code_t definition and add assembler support Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10772 llvm-svn: 240839	2015-06-26 21:58:31 +00:00
Tom Stellard	347ac79b15	AMDGPU/SI: Add hsa code object directives Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10757 llvm-svn: 240831	2015-06-26 21:15:07 +00:00

35 Commits