llvm-project

Author	SHA1	Message	Date
Jay Foad	6c0204c794	[AMDGPU] Mark s_barrier as having side effects but not accessing memory. Summary: This fixes poor scheduling in a function containing a barrier and a few load instructions. Without this fix, ScheduleDAGInstrs::buildSchedGraph adds an artificial edge in the dependency graph from the barrier instruction to the exit node representing live-out latency, with a latency of about 500 cycles. Because of this it thinks the critical path through the graph also has a latency of about 500 cycles. And because of that it does not think that any of the load instructions are on the critical path, so it schedules them with no regard for their (80 cycle) latency, which gives poor results. Reviewers: arsenm, dstuttard, tpr, nhaehnle Subscribers: kzhuravl, jvesely, wdng, yaxunl, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67218 llvm-svn: 371192	2019-09-06 10:07:28 +00:00
Nicolai Haehnle	2710171a15	AMDGPU: Write LDS objects out as global symbols in code generation Summary: The symbols use the processor-specific SHN_AMDGPU_LDS section index introduced with a previous change. The linker is then expected to resolve relocations, which are also emitted. Initially disabled for HSA and PAL environments until they have caught up in terms of linker and runtime loader. Some notes: - The llvm.amdgcn.groupstaticsize intrinsic can no longer be lowered to a constant at compile time, which means some tests can no longer be applied. The current "solution" is a terrible hack, but the intrinsic isn't used by Mesa, so we can keep it for now. - We no longer know the full LDS size per kernel at compile time, which means that we can no longer generate a relevant error message at compile time. It would be possible to add a check for the size of individual variables, but ultimately the linker will have to perform the final check. Change-Id: If66dbf33fccfbf3609aefefa2558ac0850d42275 Reviewers: arsenm, rampitec, t-tye, b-sumner, jsjodin Subscribers: qcolombet, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61494 llvm-svn: 364297	2019-06-25 11:52:30 +00:00
Stanislav Mekhanoshin	c90347d760	[AMDGPU] Generate range metadata for workitem id If workgroup size is known inform llvm about range returned by local id and local size queries. Differential Revision: https://reviews.llvm.org/D31804 llvm-svn: 300102	2017-04-12 20:48:56 +00:00
Matt Arsenault	3dbeefa978	AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernel Currently the default C calling convention functions are treated the same as compute kernels. Make this explicit so the default calling convention can be changed to a non-kernel. Converted with perl -pi -e 's/define void/define amdgpu_kernel void/' on the relevant test directories (and undoing in one place that actually wanted a non-kernel). llvm-svn: 298444	2017-03-21 21:39:51 +00:00
Tom Stellard	c2ff0eb697	AMDGPU/SI: Improve SILoadStoreOptimizer and run it before the scheduler Summary: The SILoadStoreOptimizer can now look ahead more than one instruction when looking for instructions to merge, which greatly improves the number of loads/stores that we are able to merge. Moving the pass before scheduling avoids increasing register pressure after the scheduler, so that the scheduler's register pressure estimates will be more accurate. It also gives more consistent results, since it is no longer affected by minor scheduling changes. Reviewers: arsenm Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: https://reviews.llvm.org/D23814 llvm-svn: 279991	2016-08-29 19:15:22 +00:00
Matt Arsenault	4c519d3518	AMDGPU/R600: Replace barrier intrinsics llvm-svn: 275870	2016-07-18 18:34:59 +00:00

6 Commits