llvm-project

Commit Graph

Author	SHA1	Message	Date
Mark Searles	72da47df25	run post-RA hazard recognizer pass late Memory legalizer, waitcnt, and shrink passes can perturb the instructions, which means that the post-RA hazard recognizer pass should run after them. Otherwise, one of those passes may invalidate the work done by the hazard recognizer. Note that this has adverse side-effect that any consecutive S_NOP 0's, emitted by the hazard recognizer, will not be shrunk into a single S_NOP <N>. This should be addressed in a follow-on patch. Differential Revision: https://reviews.llvm.org/D49288 llvm-svn: 337154	2018-07-16 10:02:41 +00:00
Connor Abbott	79f3ade51a	[AMDGPU] Add pseudo "old" source to all DPP instructions Summary: All instructions with the DPP modifier may not write to certain lanes of the output if bound_ctrl=1 is set or any bits in bank_mask or row_mask aren't set, so the destination register may be both defined and modified. The right way to handle this is to add a constraint that the destination register is the same as one of the inputs. We could tie the destination to the first source, but that would be too restrictive for some use-cases where we want the destination to be some other value before the instruction executes. Instead, add a fake "old" source and tie it to the destination. Effectively, the "old" source defines what value unwritten lanes will get. We'll expose this functionality to users with a new intrinsic later. Also, we want to use DPP instructions for computing derivatives, which means we need to set WQM for them. We also need to enable the entire wavefront when using DPP intrinsics to implement nonuniform subgroup reductions, since otherwise we'll get incorrect results in some cases. To accomodate this, add a new operand to all DPP instructions which will be interpreted by the SI WQM pass. This will be exposed with a new intrinsic later. We'll also add support for Whole Wavefront Mode later. I also fixed llvm.amdgcn.mov.dpp to overwrite the source and fixed up the test. However, I could also keep the old behavior (where lanes that aren't written are undefined) if people want it. Reviewers: tstellar, arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D34716 llvm-svn: 310283	2017-08-07 19:10:56 +00:00
Connor Abbott	82267a553d	test commit llvm-svn: 309981	2017-08-03 20:22:30 +00:00
Matt Arsenault	3dbeefa978	AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernel Currently the default C calling convention functions are treated the same as compute kernels. Make this explicit so the default calling convention can be changed to a non-kernel. Converted with perl -pi -e 's/define void/define amdgpu_kernel void/' on the relevant test directories (and undoing in one place that actually wanted a non-kernel). llvm-svn: 298444	2017-03-21 21:39:51 +00:00
Matt Arsenault	7aad8fd8f4	Enable FeatureFlatForGlobal on Volcanic Islands This switches to the workaround that HSA defaults to for the mesa path. This should be applied to the 4.0 branch. Patch by Vedran Miletić <vedran@miletic.net> llvm-svn: 292982	2017-01-24 22:02:15 +00:00
Tom Stellard	a27007eb4f	AMDGPU/SI: Use hazard recognizer to detect DPP hazards Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18603 llvm-svn: 268247	2016-05-02 16:23:09 +00:00
Sam Kolton	a74cd526e9	[AMDGPU] Assembler: Change dpp_ctrl syntax to match sp3 Review: http://reviews.llvm.org/D18267 llvm-svn: 263789	2016-03-18 15:35:51 +00:00
Tom Stellard	331f981cc9	AMDGPU/SI: Handle wait states required for DPP instructions Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17543 llvm-svn: 263447	2016-03-14 17:05:56 +00:00
Sam Kolton	dfa29f7c5b	[AMDGPU] Assembler: Support DPP instructions. Supprot DPP syntax as used in SP3 (except several operands syntax). Added dpp-specific operands in td-files. Added DPP flag to TSFlags to determine if instruction is dpp in InstPrinter. Support for VOP2 DPP instructions in td-files. Some tests for DPP instructions. ToDo: - VOP2bInst: - vcc is considered as operand - AsmMatcher doesn't apply mnemonic aliases when parsing operands - v_mac_f32 - v_nop - disable instructions with 64-bit operands - change dpp_ctrl assembler representation to conform sp3 Review: http://reviews.llvm.org/D17804 llvm-svn: 263008	2016-03-09 12:29:31 +00:00
Tom Stellard	4409051d00	AMDGPU/SI: Add llvm.amdgcn.mov.dpp intrinsic This intrinsic will be used to expose dpp functionality to higher-level languages. It will map to the dpp version of v_mov_b32. llvm-svn: 260792	2016-02-13 02:09:49 +00:00

10 Commits