llvm-project

Commit Graph

Author	SHA1	Message	Date
Tim Corringham	6c6d5e24cd	AMDGPU: fix missing s_waitcnt Summary: The pass that inserts s_waitcnt instructions where needed propagated info used to track dependencies for each block by iterating over the predecessor blocks. The iteration was terminated when a predecessor that had not yet been processed was encountered. Any info in blocks later in the list was therefore not processed, leading to the possibility of a required s_waitcnt not being inserted. The fix is simply to change the "break" to "continue" for the relevant loops, so that all visited blocks are processed. This is likely what was intended when the code was written. There is no test case provided for this fix because: 1) the only example that reproduces this is large and resistant to being reduced 2) the change is trivial Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D40544 llvm-svn: 319651	2017-12-04 12:30:49 +00:00
Matt Arsenault	a41351e37c	AMDGPU: Move hazard avoidance out of waitcnt pass. This is mostly moving VMEM clause breaking into the hazard recognizer. Also move another hazard currently handled in the waitcnt pass. Also stops breaking clauses unless xnack is enabled. llvm-svn: 318557	2017-11-17 21:35:32 +00:00
NAKAMURA Takumi	1657f2ad99	Fix warnings discovered by rL317076. [-Wunused-private-field] llvm-svn: 317091	2017-11-01 13:47:55 +00:00
Evgeny Mankov	bf9751760a	[AMDGPU] NFC: test commit llvm-svn: 311019	2017-08-16 16:47:29 +00:00
Eugene Zelenko	59e128266c	[AMDGPU] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 310328	2017-08-08 00:47:13 +00:00
Matt Arsenault	0ed39d329d	AMDGPU: Partially fix improper reliance on memoperands There are 2 more places doing this, but I'm not sure what they are doing and don't make any sense to me llvm-svn: 308770	2017-07-21 18:54:54 +00:00
Matt Arsenault	6ab9ea9614	AMDGPU: Don't track lgkmcnt for global_/scratch_ instructions llvm-svn: 308766	2017-07-21 18:34:51 +00:00
Mark Searles	602ee930bf	[AMDGPU] Fix uninit'ed var (RevisitLoop) Differential Revision: https://reviews.llvm.org/D33907 llvm-svn: 304729	2017-06-05 19:29:01 +00:00
Konstantin Zhuravlyov	be6c0ca5e2	AMDGPU: Make auto waitcnt before barrier a feature Differential Revision: https://reviews.llvm.org/D33793 llvm-svn: 304571	2017-06-02 17:40:26 +00:00
Mark Searles	11d0a04050	[AMDGPU] Fix bugs in new waitcnt pass. Add test. - new waitcnt pass remains off by default; -enable-si-insert-waitcnts=1 to enable it - fix handling of PERMUTE ops - fix insertion of waitcnt instrs at function begin/end ( port of analogous code that was added to old waitcnt pass ) - add new test Differential Revision: https://reviews.llvm.org/D33114 llvm-svn: 304311	2017-05-31 16:44:23 +00:00
Kannan Narayanan	5e73b04b84	[AMDGPU] In the new waitcnt insertion pass, use getHeader instead of getTopBlock to find the loop header. Differential Revision: https://reviews.llvm.org/D32831 llvm-svn: 302290	2017-05-05 21:10:17 +00:00
Krzysztof Parzyszek	44e25f37ae	Move size and alignment information of regclass to TargetRegisterInfo 1. RegisterClass::getSize() is split into two functions: - TargetRegisterInfo::getRegSizeInBits(const TargetRegisterClass &RC) const; - TargetRegisterInfo::getSpillSize(const TargetRegisterClass &RC) const; 2. RegisterClass::getAlignment() is replaced by: - TargetRegisterInfo::getSpillAlignment(const TargetRegisterClass &RC) const; This will allow making those values depend on subtarget features in the future. Differential Revision: https://reviews.llvm.org/D31783 llvm-svn: 301221	2017-04-24 18:55:33 +00:00
Kannan Narayanan	acb089e12a	[AMDGPU] Add a new pass to insert waitcnts. Leave under an option for testing. Based on comments in https://reviews.llvm.org/D31161. llvm-svn: 300023	2017-04-12 03:25:12 +00:00

13 Commits
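
The "break" to "continue" change described in commit 6c6d5e24cd can be illustrated with a minimal sketch. This is not the code from the waitcnt pass: the `Block` struct, its `Visited` and `PendingCount` fields, and both function names are hypothetical stand-ins, chosen only to show how stopping at the first unprocessed predecessor drops information from predecessors later in the list.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for a basic block; not the actual LLVM data structure.
struct Block {
  bool Visited = false;              // has the pass already processed this block?
  int PendingCount = 0;              // illustrative per-block dependency info
  std::vector<const Block *> Preds;  // predecessor blocks, in list order
};

// Shape of the bug: the loop ends at the first unvisited predecessor,
// so visited predecessors later in the list are never merged.
int mergeWithBreak(const Block &B) {
  int Max = 0;
  for (const Block *Pred : B.Preds) {
    if (!Pred->Visited)
      break;
    Max = std::max(Max, Pred->PendingCount);
  }
  return Max;
}

// Shape of the fix: skip the unvisited predecessor and keep going,
// so every visited predecessor contributes its info.
int mergeWithContinue(const Block &B) {
  int Max = 0;
  for (const Block *Pred : B.Preds) {
    if (!Pred->Visited)
      continue;
    Max = std::max(Max, Pred->PendingCount);
  }
  return Max;
}
```

With predecessors ordered as visited, unvisited, visited, the first version only sees the first block, while the second also picks up the third. That difference is what could leave a required wait uninserted.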