llvm-project

Commit Graph

Author	SHA1	Message	Date
Guillaume Chatelet	b65fa48305	[Alignment] Migrate Attribute::getWith(Stack)Alignment Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet, jdoerfert Reviewed By: courbet Subscribers: arsenm, jvesely, nhaehnle, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D68792 llvm-svn: 374884	2019-10-15 12:56:24 +00:00
Matt Arsenault	12994a70cf	AMDGPU: Use SGPR_128 instead of SReg_128 for vregs SGPR_128 only includes the real allocatable SGPRs, and SReg_128 adds the additional non-allocatable TTMP registers. There's no point in allocating SReg_128 vregs. This shrinks the size of the classes regalloc needs to consider, which is usually good. llvm-svn: 374284	2019-10-10 07:11:33 +00:00
Matt Arsenault	ff07631b48	AMDGPU: Add amdgpu-32bit-address-high-bits to MIR serialization llvm-svn: 370089	2019-08-27 18:18:38 +00:00
Stanislav Mekhanoshin	2594fa8593	[AMDGPU] Fix high occupancy calculation and print it We had couple places which still return 10 as a maximum occupancy. Fixed. Also print comment about occupancy as compiler see it. Differential Revision: https://reviews.llvm.org/D65423 llvm-svn: 367381	2019-07-31 01:07:10 +00:00
Stanislav Mekhanoshin	937ff6e701	[AMDGPU] gfx908 agpr spilling Differential Revision: https://reviews.llvm.org/D64594 llvm-svn: 365833	2019-07-11 21:54:13 +00:00
Matt Arsenault	58426a3707	AMDGPU: Serialize mode from MachineFunctionInfo llvm-svn: 365653	2019-07-10 16:09:26 +00:00
Matt Arsenault	acc9e1e4c2	AMDGPU: Fix stray typing llvm-svn: 365373	2019-07-08 19:05:19 +00:00
Matt Arsenault	71dfb7ec5c	AMDGPU: Make s34 the FP register Make the FP register callee saved. This is tricky because now the FP needs to be spilled in the prolog relative to the incoming SP register, rather than the frame register used throughout the rest of the function. I don't like how this bypassess the standard mechanism for CSR spills just to get the correct insert point. I may look for a better solution, since all CSR VGPRs may also need to have all lanes activated. Another option might be to make getFrameIndexReference change the base register if the frame index is a CSR, and then try to figure out the right insertion point in emitProlog. If there is a free VGPR lane available for SGPR spilling, try to use it for the FP. If that would require intrtoducing a new VGPR spill, try to use a free call clobbered SGPR. Only fallback to introducing a new VGPR spill as a last resort. This also doesn't attempt to handle SGPR spilling with scalar stores. llvm-svn: 365372	2019-07-08 19:03:38 +00:00
Michael Liao	7a9ad430fe	[AMDGPU] Correct the setting of `FlatScratchInit`. Summary: - That flag setting should skip spilling stack slot. Reviewers: arsenm, rampitec Subscribers: qcolombet, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64143 llvm-svn: 365137	2019-07-04 13:29:45 +00:00
Michael Liao	80177ca5a9	[AMDGPU] Enable serializing of argument info. Summary: - Support serialization of all arguments in machine function info. This enables fabricating MIR tests depending on argument info. Reviewers: arsenm, rampitec Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64096 llvm-svn: 364995	2019-07-03 02:00:21 +00:00
Nicolai Haehnle	4dc3b2bf95	AMDGPU: Support GDS atomics Summary: Original patch by Marek Olšák Change-Id: Ia97d5d685a63a377d86e82942436d1fe6e429bab Reviewers: mareko, arsenm, rampitec Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, jfb, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63452 llvm-svn: 364814	2019-07-01 17:17:45 +00:00
Matt Arsenault	e0b8443460	AMDGPU: Check MRI for callee saved regs instead of TRI This should the same, but MRI does allow dynamically changing the CSR set, although currently not used. llvm-svn: 364425	2019-06-26 13:39:29 +00:00
Stanislav Mekhanoshin	4be636ebb3	[AMDGPU] Removed dead SIMachineFunctionInfo::getWorkItemIDVGPR() Differential Revision: https://reviews.llvm.org/D63780 llvm-svn: 364339	2019-06-25 18:33:53 +00:00
Matt Arsenault	d88db6d7fc	AMDGPU: Always use s33 for global scratch wave offset Every called function could possibly need this to calculate the absolute address of stack objectst, and this avoids inserting a copy around every call site in the kernel. It's also somewhat cleaner to keep this in a callee saved SGPR. llvm-svn: 363990	2019-06-20 21:58:24 +00:00
Sander de Smalen	7f23e0a62f	Enforce StackID definition in PEI There are various places in LLVM where the definition of StackID is not properly honoured, for example in PEI where objects with a StackID > 0 are allocated on the default stack (StackID0). This patch enforces that PEI only considers allocating objects to StackID 0. Reviewers: arsenm, thegameg, MatzeB Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D60062 llvm-svn: 357460	2019-04-02 09:46:52 +00:00
Matt Arsenault	055e4dce45	AMDGPU: Remove dx10-clamp from subtarget features Since this can be set with s_setreg*, it should not be a subtarget property. Set a default based on the calling convention, and Introduce a new amdgpu-dx10-clamp attribute to override this if desired. Also introduce a new amdgpu-ieee attribute to match. The values need to match to allow inlining. I think it is OK for the caller's dx10-clamp attribute to override the callee, but there doesn't appear to be the infrastructure to do this currently without definining the attribute in the generic Attributes.td. Eventually the calling convention lowering will need to insert a mode switch somewhere for these. llvm-svn: 357302	2019-03-29 19:14:54 +00:00
Tim Renouf	8723a56551	[MsgPack][AMDGPU] Fix unflushed raw_string_ostream bugs on windows expensive checks bot This fixes a couple of unflushed raw_string_ostream bugs in recent commits that only show up on a bot building on windows with expensive checks. Differential Revision: https://reviews.llvm.org/D59396 Change-Id: I9c6208325503b3ee0786b4b688e13fc24a15babf llvm-svn: 356394	2019-03-18 19:00:46 +00:00
Matt Arsenault	bc6d07ca46	MIR: Allow targets to serialize MachineFunctionInfo This has been a very painful missing feature that has made producing reduced testcases difficult. In particular the various registers determined for stack access during function lowering were necessary to avoid undefined register errors in a large percentage of cases. Implement a subset of the important fields that need to be preserved for AMDGPU. Most of the changes are to support targets parsing register fields and properly reporting errors. The biggest sort-of bug remaining is for fields that can be initialized from the IR section will be overwritten by a default initialized machineFunctionInfo section. Another remaining bug is the machineFunctionInfo section is still printed even if empty. llvm-svn: 356215	2019-03-14 22:54:43 +00:00
Matt Arsenault	aa6fb4c45e	AMDGPU: Remove debugger related subtarget features As far as I know these aren't needed anymore. llvm-svn: 354634	2019-02-21 23:27:46 +00:00
Chandler Carruth	2946cd7010	Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636	2019-01-19 08:50:56 +00:00
Scott Linder	c6c627253d	[AMDGPU] Remove FeatureVGPRSpilling This feature is only relevant to shaders, and is no longer used. When disabled, lowering of reserved registers for shaders causes a compiler crash. Remove the feature and add a test for compilation of shaders at OptNone. Differential Revision: https://reviews.llvm.org/D53829 llvm-svn: 345763	2018-10-31 18:54:06 +00:00
Konstantin Zhuravlyov	aa067cb9fb	AMDGPU: Rename isAmdCodeObjectV2 -> isAmdHsaOrMesa The isAmdCodeObjectV2 is a misleading name which actually checks whether the os is amdhsa or mesa. Also add a test to make sure we do not generate old kernel header for code object v3. Differential Revision: https://reviews.llvm.org/D52897 llvm-svn: 343813	2018-10-04 21:02:16 +00:00
Matt Arsenault	4bec7d4261	Reapply "AMDGPU: Fix handling of alignment padding in DAG argument lowering" Reverts r337079 with fix for msan error. llvm-svn: 337535	2018-07-20 09:05:08 +00:00
Evgeniy Stepanov	1971ba097d	Revert "AMDGPU: Fix handling of alignment padding in DAG argument lowering" This reverts commit r337021. WARNING: MemorySanitizer: use-of-uninitialized-value #0 0x1415cd65 in void write_signed<long>(llvm::raw_ostream&, long, unsigned long, llvm::IntegerStyle) /code/llvm-project/llvm/lib/Support/NativeFormatting.cpp:95:7 #1 0x1415c900 in llvm::write_integer(llvm::raw_ostream&, long, unsigned long, llvm::IntegerStyle) /code/llvm-project/llvm/lib/Support/NativeFormatting.cpp:121:3 #2 0x1472357f in llvm::raw_ostream::operator<<(long) /code/llvm-project/llvm/lib/Support/raw_ostream.cpp:117:3 #3 0x13bb9d4 in llvm::raw_ostream::operator<<(int) /code/llvm-project/llvm/include/llvm/Support/raw_ostream.h:210:18 #4 0x3c2bc18 in void printField<unsigned int, &(amd_kernel_code_s::amd_kernel_code_version_major)>(llvm::StringRef, amd_kernel_code_s const&, llvm::raw_ostream&) /code/llvm-project/llvm/lib/Target/AMDGPU/Utils/AMDKernelCodeTUtils.cpp:78:23 #5 0x3c250ba in llvm::printAmdKernelCodeField(amd_kernel_code_s const&, int, llvm::raw_ostream&) /code/llvm-project/llvm/lib/Target/AMDGPU/Utils/AMDKernelCodeTUtils.cpp:104:5 #6 0x3c27ca3 in llvm::dumpAmdKernelCode(amd_kernel_code_s const, llvm::raw_ostream&, char const) /code/llvm-project/llvm/lib/Target/AMDGPU/Utils/AMDKernelCodeTUtils.cpp:113:5 #7 0x3a46e6c in llvm::AMDGPUTargetAsmStreamer::EmitAMDKernelCodeT(amd_kernel_code_s const&) /code/llvm-project/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp:161:3 #8 0xd371e4 in llvm::AMDGPUAsmPrinter::EmitFunctionBodyStart() /code/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp:204:26 [...] Uninitialized value was created by an allocation of 'KernelCode' in the stack frame of function '_ZN4llvm16AMDGPUAsmPrinter21EmitFunctionBodyStartEv' #0 0xd36650 in llvm::AMDGPUAsmPrinter::EmitFunctionBodyStart() /code/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp:192 llvm-svn: 337079	2018-07-14 01:20:53 +00:00
Matt Arsenault	de95077780	AMDGPU: Fix handling of alignment padding in DAG argument lowering This was completely broken if there was ever a struct argument, as this information is thrown away during the argument analysis. The offsets as passed in to LowerFormalArguments are not useful, as they partially depend on the legalized result register type, and they don't consider the alignment in the first place. Ignore the Ins array, and instead figure out from the raw IR type what we need to do. This seems to fix the padding computation if the DAG lowering is forced (and stops breaking arguments following padded arguments if the arguments were only partially lowered in the IR) llvm-svn: 337021	2018-07-13 16:40:25 +00:00
Tom Stellard	5bfbae5cb1	AMDGPU: Refactor Subtarget classes Summary: This is a follow-up to r335942. - Merge SISubtarget into AMDGPUSubtarget and rename to GCNSubtarget - Rename AMDGPUCommonSubtarget to AMDGPUSubtarget - Merge R600Subtarget::Generation and GCNSubtarget::Generation into AMDGPUSubtarget::Generation. Reviewers: arsenm, jvesely Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D49037 llvm-svn: 336851	2018-07-11 20:59:01 +00:00
Konstantin Zhuravlyov	766c77efd7	AMDGPU/AMDHSA: Remove GridWorkGroupCountX/Y/Z and everything that comes with it from implementation and v3 header files. Leave definition in v2 header files for backwards compatibility. Differential Revision: https://reviews.llvm.org/D48191 llvm-svn: 335267	2018-06-21 18:36:04 +00:00
Stanislav Mekhanoshin	d4b500cb08	[AMDGPU] Track occupancy in MFI Keep track of achieved occupancy in SIMachineFunctionInfo. At the moment we have a lot of duplicated or even missed code to query and maintain occupancy info. Record it in the MFI and query in a single call. Interfaces: - getOccupancy() - returns current recorded achieved occupancy. - getMinAllowedOccupancy() - returns lesser of the achieved occupancy and the lowest occupancy we are ready to tolerate. For example if a kernel is memory bound we are ready to tolerate 4 waves. - limitOccupancy() - record occupancy level if we have to lower it. - increaseOccupancy() - record occupancy if scheduler managed to increase the occupancy. MFI takes care of integrating different checks affecting occupancy, including LDS use and waves-per-eu attribute. Note that scheduler starts with not yet known register pressure, so has to record either limit or increase in occupancy after it is done. Later passes can just query a resulting value. New interface is used in the active scheduler and NFC wrt its work. Changes are also made to experimental schedulers to use it and record an occupancy after they are done. Before the change waves-per-eu was ignored by experimental schedulers and tolerance window for memory bound kernels was not used. Differential Revision: https://reviews.llvm.org/D47509 llvm-svn: 333629	2018-05-31 05:36:04 +00:00
Matt Arsenault	1ea0402e82	AMDGPU: Round up kernel argument allocation size AFAIK the driver's allocation will actually have to round this up anyway. It is useful to track the rounded up size, so that the end of the kernel segment is known to be dereferencable so a wider s_load_dword can be used for a short argument at the end of the segment. llvm-svn: 333456	2018-05-29 19:35:00 +00:00
Matt Arsenault	ceafc55e5a	AMDGPU: Pass function directly instead of MachineFunction These functions just query the underlying IR function, so pass it directly. llvm-svn: 333442	2018-05-29 17:42:50 +00:00
Tom Stellard	44b30b4537	AMDGPU: Remove #include "MCTargetDesc/AMDGPUMCTargetDesc.h" from common headers Summary: MCTargetDesc/AMDGPUMCTargetDesc.h contains enums for all the instuction and register defintions, which are huge so we only want to include them where needed. This will also make it easier if we want to split the R600 and GCN definitions into separate tablegenerated files. I was unable to remove AMDGPUMCTargetDesc.h from SIMachineFunctionInfo.h because it uses some enums from the header to initialize default values for the SIMachineFunction class, so I ended up having to remove includes of SIMachineFunctionInfo.h from headers too. Reviewers: arsenm, nhaehnle Reviewed By: nhaehnle Subscribers: MatzeB, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D46272 llvm-svn: 332930	2018-05-22 02:03:23 +00:00
Matt Arsenault	17f3338015	AMDGPU: Fix not preserving CSR VGPR if used for SGPR spills Before this was not done if the function had no calls in it. This is still a possible issue with any callable function, regardless of calls present. llvm-svn: 328659	2018-03-27 19:42:55 +00:00
Matt Arsenault	923712b6b5	Reapply "AMDGPU: Add 32-bit constant address space" This reverts r324494 and reapplies r324487. llvm-svn: 324747	2018-02-09 16:57:57 +00:00
Rafael Espindola	f4e3f3e31c	Revert "AMDGPU: Add 32-bit constant address space" This reverts commit r324487. It broke clang tests. llvm-svn: 324494	2018-02-07 18:09:35 +00:00
Marek Olsak	871c30e540	AMDGPU: Add 32-bit constant address space Note: This is a candidate for LLVM 6.0, because it was planned to be in that release but was delayed due to a long review period. Merge conflict in release_60 - resolution: Add "-p6:32:32" into the second (non-amdgiz) string. Only scalar loads support 32-bit pointers. An address in a VGPR will fail to compile. That's OK because the results of loads will only be used in places where VGPRs are forbidden. Updated AMDGPUAliasAnalysis and used SReg_64_XEXEC. The tests cover all uses cases we need for Mesa. Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D41651 llvm-svn: 324487	2018-02-07 16:01:00 +00:00
Matt Arsenault	e19bc2ee0f	AMDGPU: Use unique PSVs for buffer resources Also fixes using the wrong memory type for some intrinsics when custom lowering them. llvm-svn: 321557	2017-12-29 17:18:21 +00:00
Matt Arsenault	905f3518ba	AMDGPU: Implement getTgtMemIntrinsic for images Currently all images are lowered to have a single image PseudoSourceValue. Image stores happen to have overly strict mayLoad/mayStore/hasSideEffects flags set on them, so this happens to work. When these are fixed to be correct, the scheduler breaks this because the identical PSVs are assumed to be the same address. These need to be unique to the image resource value. llvm-svn: 321555	2017-12-29 17:18:14 +00:00
Matthias Braun	f1caa2833f	MachineFunction: Return reference from getFunction(); NFC The Function can never be nullptr so we can return a reference. llvm-svn: 320884	2017-12-15 22:22:58 +00:00
Tim Renouf	132291589f	[AMDGPU] AMDPAL scratch buffer support Summary: Added support for scratch (including spilling) for OS type amdpal: generates code to set up the scratch descriptor if it is needed. With amdpal, the scratch resource descriptor is loaded from offset 0 of the global information table. The low 32 bits of the address of the global information table is passed in s0. Added amdgpu-git-ptr-high function attribute to hard-wire the high 32 bits of the address of the global information table. If the function attribute is not specified, or is 0xffffffff, then the backend generates code to use the high 32 bits of pc. The documentation for the AMDPAL ABI will be added in a later commit. Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye Differential Revision: https://reviews.llvm.org/D37483 llvm-svn: 314501	2017-09-29 09:49:35 +00:00
Jan Sjodin	312ccf761c	Add AddresSpace to PseudoSourceValue. Differential Revision: https://reviews.llvm.org/D35089 llvm-svn: 313297	2017-09-14 20:53:51 +00:00
Tim Renouf	6cb007fc72	AMDGPU: trivial comment change ... to check commit access for new committer. llvm-svn: 312900	2017-09-11 08:31:32 +00:00
Eugene Zelenko	59e128266c	[AMDGPU] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 310328	2017-08-08 00:47:13 +00:00
Matt Arsenault	817c253e60	AMDGPU: Fix implicitarg.ptr handling special inputs llvm-svn: 310002	2017-08-03 23:12:44 +00:00
Matt Arsenault	8623e8d864	AMDGPU: Pass special input registers to functions llvm-svn: 309998	2017-08-03 23:00:29 +00:00
Matt Arsenault	8e8f8f43b0	AMDGPU: Fix clobbering CSR VGPRs when spilling SGPR to it llvm-svn: 309783	2017-08-02 01:52:45 +00:00
Matt Arsenault	9166ce86e8	AMDGPU: Annotate implicitarg.ptr usage We need to pass something to functions for this to work. It isn't derivable just from the kernarg segment pointer because the implicit arguments are placed after the kernel arguments. Also fixes missing test for the intrinsic. llvm-svn: 309398	2017-07-28 15:52:08 +00:00
Matt Arsenault	254ad3de5c	AMDGPU: Annotate necessity of flat-scratch-init As an approximation of the existing handling to avoid regressions. Fixes using too many registers with calls on subtargets with the SGPR allocation bug. llvm-svn: 308326	2017-07-18 16:44:58 +00:00
Matt Arsenault	1cc47f8413	AMDGPU: Figure out private memory regs after lowering Introduce pseudo-registers for registers needed for stack access, which are replaced during finalizeLowering. Note these pseudo-registers are currently only used for the used register location, and not for determining their input argument register. This is better because it avoids the need to try to predict whether a call will be emitted from the IR, and also detects stack objects introduced by legalization. Test changes are from the HasStackObjects check being more accurate since stack objects introduced during legalization are now known. llvm-svn: 308325	2017-07-18 16:44:56 +00:00
Matt Arsenault	e15855d9e3	AMDGPU: Annotate features from x work item/group IDs. This wasn't necessary before since they are always enabled for kernels, but this is necessary if they need to be forwarded to a callable function. llvm-svn: 308226	2017-07-17 22:35:50 +00:00
Matt Arsenault	23e4df6a59	AMDGPU: Detect kernarg segment pointer This is necessary to pass the kernarg segment pointer to callee functions. Also don't unconditionally enable for kernels. llvm-svn: 307978	2017-07-14 00:11:13 +00:00

1 2

94 Commits