llvm-project

Commit Graph

Author	SHA1	Message	Date
Stanislav Mekhanoshin	555d8f4ef5	[AMDGPU] Bundle loads before post-RA scheduler We are relying on artificial DAG edges inserted by the MemOpClusterMutation to keep loads and stores together in the post-RA scheduler. This does not work all the time since it allows to schedule a completely independent instruction in the middle of the cluster. Removed the DAG mutation and added pass to bundle already clustered instructions. These bundles are unpacked before the memory legalizer because it does not work with bundles but also because it allows to insert waitcounts in the middle of a store cluster. Removing artificial edges also allows a more relaxed scheduling. Differential Revision: https://reviews.llvm.org/D72737	2020-01-24 11:33:38 -08:00
Ron Lieberman	f48e43bbf7	[AMDGPU] Disable SReg Global LD/ST, perf regression Differential Revision: https://reviews.llvm.org/D55093 llvm-svn: 348014	2018-11-30 18:29:17 +00:00
Ron Lieberman	cac749ac88	[AMDGPU] Add FixupVectorISel pass, currently Supports SREGs in GLOBAL LD/ST Add a pass to fixup various vector ISel issues. Currently we handle converting GLOBAL_{LOAD|STORE}_* and GLOBAL_Atomic_* instructions into their _SADDR variants. This involves feeding the sreg into the saddr field of the new instruction. llvm-svn: 347008	2018-11-16 01:13:34 +00:00
Matt Arsenault	84445dd13c	AMDGPU: Use gfx9 carry-less add/sub instructions llvm-svn: 319491	2017-11-30 22:51:26 +00:00
Matt Arsenault	3f71c0e3ee	AMDGPU: Select DS insts without m0 initialization GFX9 stopped using m0 for most DS instructions. Select a different instruction without the use. I think this will be less error prone than trying to manually maintain m0 uses as needed. llvm-svn: 319270	2017-11-29 00:55:57 +00:00
Matt Arsenault	3dbeefa978	AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernel Currently the default C calling convention functions are treated the same as compute kernels. Make this explicit so the default calling convention can be changed to a non-kernel. Converted with perl -pi -e 's/define void/define amdgpu_kernel void/' on the relevant test directories (and undoing in one place that actually wanted a non-kernel). llvm-svn: 298444	2017-03-21 21:39:51 +00:00
Matt Arsenault	45f8216cee	AMDGPU: Remove superfluous string attributes from tests Also fix v_mac.ll not testing right thing for fneg llvm-svn: 275129	2016-07-11 23:35:48 +00:00
Matt Arsenault	44e5483ada	AMDGPU: Add volatile to test loads and stores When the memory vectorizer is enabled, these tests break. These tests don't really care about the memory instructions, and it's easier to write check lines with the unmerged loads. llvm-svn: 266071	2016-04-12 13:38:18 +00:00
Matt Arsenault	9c47dd583a	AMDGPU: Remove some old intrinsic uses from tests llvm-svn: 260493	2016-02-11 06:02:01 +00:00
Matt Arsenault	2aed6ca1d3	AMDGPU: Switch barrier intrinsics to using convergent noduplicate prevents unrolling of small loops that happen to have barriers in them. If a loop has a barrier in it, it is OK to duplicate it for the unroll. llvm-svn: 256075	2015-12-19 01:46:41 +00:00
Matt Arsenault	84db5d97b0	AMDGPU/SI: Fix read2 merging into a super register. If the read2 produced was supposed to be writing into a super register, it would use the wrong subregister indices. Fix this by inserting copies, so we only ever write to a vreg_64. Run the register coalescer again to clean this up, although this isn't ideal and often does result in an extra move. Also remove the assert that offset1 > offset0. There isn't a real reason to not allow this other than a minor convenience in the compiler, and it doesn't seem worth the effort of avoiding it. llvm-svn: 242174	2015-07-14 17:57:36 +00:00
Tom Stellard	45bb48ea19	R600 -> AMDGPU rename llvm-svn: 239657	2015-06-13 03:28:10 +00:00

12 Commits