llvm-project

Commit Graph

Author	SHA1	Message	Date
Matt Arsenault	eb6eb694e4	AMDGPU/GlobalISel: Allow selection of scalar min/max I believe all of the uniform/divergent pattern predicates are redundant and can be removed. The uniformity bit already influences the register class, and nothing has broken when I've removed this and others. llvm-svn: 372450	2019-09-21 02:37:33 +00:00
Stanislav Mekhanoshin	af77ca7e6e	Remove assert from MachineLoop::getLoopPredecessor() According to the documentation, the method returns the predecessor if the given loop's header has exactly one unique predecessor outside the loop, and otherwise returns null. In reality it asserts if there is no predecessor outside of the loop. The testcase has a loop whose predecessors outside of the loop were not identified because analyzeBranch() was unable to process the mask branch and returned true. It is also not correct to assert for truly dead loops. Differential Revision: https://reviews.llvm.org/D67634 llvm-svn: 372405	2019-09-20 15:26:10 +00:00
Nico Weber	03475adcf7	Revert r372366 "Use getTargetConstant for BLENDI, and add a test to catch it." This reverts commit `52621307bc`. Tests have been failing all night with [0/2] ACTION //llvm/test:check-llvm(//llvm/utils/gn/build/toolchain:unix) -- Testing: 33647 tests, 64 threads -- Testing: 0 .. 10.. UNRESOLVED: LLVM :: CodeGen/AMDGPU/GlobalISel/isel-blendi-gettargetconstant.ll (6943 of 33647) ****************** TEST 'LLVM :: CodeGen/AMDGPU/GlobalISel/isel-blendi-gettargetconstant.ll' FAILED **************** Test has no run line! ****************** Since there were other concerns on https://reviews.llvm.org/D67785, I'm just reverting for now. llvm-svn: 372383	2019-09-20 12:05:29 +00:00
Sterling Augustine	52621307bc	Use getTargetConstant for BLENDI, and add a test to catch it. Summary: This fixes a crasher introduced by r372338. Reviewers: echristo, arsenm Subscribers: jvesely, wdng, nhaehnle, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67785 Tighten up the test case. llvm-svn: 372366	2019-09-20 02:29:16 +00:00
Matt Arsenault	dd74f4839b	MachineScheduler: Fix missing dependency with multiple subreg defs If an instruction had multiple subregister defs, and one of them was undef, this would improperly conclude all other lanes are killed. There could still be other defs of those read-undef lanes in other operands. This would improperly remove register uses from CurrentVRegUses, so the visitation of later operands would not find the necessary register dependency. This would also mean this would fail or not depending on how different subregister def operands were ordered. On an undef subregister def, scan the instruction for other subregister defs and avoid killing those. This possibly should be deferring removing anything from CurrentVRegUses until the entire instruction has been processed instead. llvm-svn: 372362	2019-09-20 00:09:15 +00:00
Alexander Timofeev	e2f9bc3b11	[AMDGPU] Unnecessary -amdgpu-scalarize-global-loads=false flag removed from min/max lit tests. Reviewers: arsenm Differential Revision: https://reviews.llvm.org/D67712 llvm-svn: 372340	2019-09-19 16:44:38 +00:00
Matt Arsenault	3ecab8e455	Reapply r372285 "GlobalISel: Don't materialize immarg arguments to intrinsics" This reverts r372314, reapplying r372285 and the commits which depend on it (r372286-r372293, and r372296-r372297) This was missing one switch to getTargetConstant in an untested case. llvm-svn: 372338	2019-09-19 16:26:14 +00:00
Hans Wennborg	13bdae8541	Revert r372285 "GlobalISel: Don't materialize immarg arguments to intrinsics" This broke the Chromium build, causing it to fail with e.g. fatal error: error in backend: Cannot select: t362: v4i32 = X86ISD::VSHLI t392, Constant:i8<15> See llvm-commits thread of r372285 for details. This also reverts r372286, r372287, r372288, r372289, r372290, r372291, r372292, r372293, r372296, and r372297, which seemed to depend on the main commit. > Encode them directly as an imm argument to G_INTRINSIC. > > Since intrinsics can now define what parameters are required to be > immediates, avoid using registers for them. Intrinsics could > potentially want a constant that isn't a legal register type. Also, > since G_CONSTANT is subject to CSE and legalization, transforms could > potentially obscure the value (and create extra work for the > selector). The register bank of a G_CONSTANT is also meaningful, so > this could throw off future folding and legalization logic for AMDGPU. > > This will be much more convenient to work with than needing to call > getConstantVRegVal and checking if it may have failed for every > constant intrinsic parameter. AMDGPU has quite a lot of intrinsics with > immarg operands, many of which need inspection during lowering. Having > to find the value in a register is going to add a lot of boilerplate > and waste compile time. > > SelectionDAG has always provided TargetConstant for constants which > should not be legalized or materialized in a register. The distinction > between Constant and TargetConstant was somewhat fuzzy, and there was > no automatic way to force usage of TargetConstant for certain > intrinsic parameters. They were both ultimately ConstantSDNode, and it > was inconsistently used. It was quite easy to mis-select an > instruction requiring an immediate. For SelectionDAG, start emitting > TargetConstant for these arguments, and using timm to match them. 
> > Most of the work here is to cleanup target handling of constants. Some > targets process intrinsics through intermediate custom nodes, which > need to preserve TargetConstant usage to match the intrinsic > expectation. Pattern inputs now need to distinguish whether a constant > is merely compatible with an operand or whether it is mandatory. > > The GlobalISelEmitter needs to treat timm as a special case of a leaf > node, similar to MachineBasicBlock operands. This should also enable > handling of patterns for some G_ instructions with immediates, like > G_FENCE or G_EXTRACT. > > This does include a workaround for a crash in GlobalISelEmitter when > ARM tries to use "imm" in an output with a "timm" pattern source. llvm-svn: 372314	2019-09-19 12:33:07 +00:00
Matt Arsenault	bffbeecb44	AMDGPU/GlobalISel: RegBankSelect llvm.amdgcn.ds.swizzle llvm-svn: 372297	2019-09-19 04:11:17 +00:00
Matt Arsenault	494243597b	AMDGPU/GlobalISel: Select llvm.amdgcn.raw.buffer.store.format This needs special handling due to some subtargets that have a nonstandard register layout for f16 vectors Also reject some illegal types on other targets. llvm-svn: 372293	2019-09-19 02:35:08 +00:00
Matt Arsenault	67f1f6ff8c	AMDGPU/GlobalISel: Select llvm.amdgcn.raw.buffer.store llvm-svn: 372292	2019-09-19 02:30:27 +00:00
Matt Arsenault	838ff36553	AMDGPU/GlobalISel: RegBankSelect struct buffer load/store llvm-svn: 372291	2019-09-19 02:26:53 +00:00
Matt Arsenault	a62ef58346	AMDGPU/GlobalISel: RegBankSelect llvm.amdgcn.raw.buffer.{load\|store} llvm-svn: 372290	2019-09-19 02:25:09 +00:00
Matt Arsenault	a30d022db6	AMDGPU/GlobalISel: Attempt to RegBankSelect image intrinsics Images should always have 2 consecutive, mandatory SGPR arguments. llvm-svn: 372289	2019-09-19 02:23:06 +00:00
Matt Arsenault	01213407c4	Fix typo llvm-svn: 372288	2019-09-19 02:15:29 +00:00
Matt Arsenault	c189f023ac	MachineScheduler: Fix assert from not checking subregs The assert would fail if there was a dead def of a subregister if there was a previous use of a different subregister. llvm-svn: 372287	2019-09-19 02:14:12 +00:00
Matt Arsenault	22e2c09515	AMDGPU/GlobalISel: Fix RegBankSelect G_SMULH/G_UMULH pre-gfx9 The scalar versions were only introduced in gfx9. llvm-svn: 372286	2019-09-19 01:42:34 +00:00
Matt Arsenault	d8399d12cd	GlobalISel: Don't materialize immarg arguments to intrinsics Encode them directly as an imm argument to G_INTRINSIC. Since intrinsics can now define what parameters are required to be immediates, avoid using registers for them. Intrinsics could potentially want a constant that isn't a legal register type. Also, since G_CONSTANT is subject to CSE and legalization, transforms could potentially obscure the value (and create extra work for the selector). The register bank of a G_CONSTANT is also meaningful, so this could throw off future folding and legalization logic for AMDGPU. This will be much more convenient to work with than needing to call getConstantVRegVal and checking if it may have failed for every constant intrinsic parameter. AMDGPU has quite a lot of intrinsics with immarg operands, many of which need inspection during lowering. Having to find the value in a register is going to add a lot of boilerplate and waste compile time. SelectionDAG has always provided TargetConstant for constants which should not be legalized or materialized in a register. The distinction between Constant and TargetConstant was somewhat fuzzy, and there was no automatic way to force usage of TargetConstant for certain intrinsic parameters. They were both ultimately ConstantSDNode, and it was inconsistently used. It was quite easy to mis-select an instruction requiring an immediate. For SelectionDAG, start emitting TargetConstant for these arguments, and using timm to match them. Most of the work here is to cleanup target handling of constants. Some targets process intrinsics through intermediate custom nodes, which need to preserve TargetConstant usage to match the intrinsic expectation. Pattern inputs now need to distinguish whether a constant is merely compatible with an operand or whether it is mandatory. The GlobalISelEmitter needs to treat timm as a special case of a leaf node, similar to MachineBasicBlock operands. 
This should also enable handling of patterns for some G_ instructions with immediates, like G_FENCE or G_EXTRACT. This does include a workaround for a crash in GlobalISelEmitter when ARM tries to use "imm" in an output with a "timm" pattern source. llvm-svn: 372285	2019-09-19 01:33:14 +00:00
Tim Renouf	1786117111	[AMDGPU] Allow FP inline constant in v_madak_f16 and v_fmaak_f16 Differential Revision: https://reviews.llvm.org/D67680 Change-Id: Ic38f47cb2079c2c1070a441b5943854844d80a7c llvm-svn: 372208	2019-09-18 09:32:06 +00:00
Alexander Timofeev	6524a7a2b9	[AMDGPU]: PHI Elimination hooks added for custom COPY insertion. Fixed. Differential Revision: https://reviews.llvm.org/D67101 Reviewers: rampitec, vpykhtin llvm-svn: 372086	2019-09-17 09:08:58 +00:00
Amara Emerson	9d64721ca5	[GlobalISel] Partially revert r371901. r371901 was overeager and widenScalarDst() and the like in the legalizer attempt to increment the insert point given in order to add new instructions after the currently legalizing inst. In cases where the insertion point is not exactly the current instruction, then callers need to de-compensate for the behaviour by decrementing the insertion iterator before calling them. It's not a nice state of affairs, for now just undo the problematic parts of the change. llvm-svn: 372050	2019-09-16 23:46:03 +00:00
Matt Arsenault	07b8597656	AMDGPU/GlobalISel: Fix some broken run lines llvm-svn: 371992	2019-09-16 14:14:40 +00:00
Matt Arsenault	1fc07d6648	AMDGPU/GlobalISel: Fix RegBankSelect for G_FRINT and G_FCEIL llvm-svn: 371991	2019-09-16 14:14:37 +00:00
Matt Arsenault	bf7524db35	AMDGPU/GlobalISel: Remove another illegal select test llvm-svn: 371990	2019-09-16 14:14:31 +00:00
Matt Arsenault	255d157672	AMDGPU/GlobalISel: Remove illegal select tests These fail in a release build. llvm-svn: 371955	2019-09-16 04:21:10 +00:00
Matt Arsenault	bc8de8a8da	AMDGPU/GlobalISel: Select SMRD loads for more types llvm-svn: 371954	2019-09-16 00:54:07 +00:00
Matt Arsenault	48b158acae	AMDGPU/GlobalISel: RegBankSelect for kill llvm-svn: 371953	2019-09-16 00:48:37 +00:00
Matt Arsenault	01c7f40de3	AMDGPU/GlobalISel: Legalize s1 source G_[SU]ITOFP llvm-svn: 371952	2019-09-16 00:37:10 +00:00
Matt Arsenault	60169ed613	AMDGPU/GlobalISel: Set type on vgpr live in special arguments Fixes assertion with workitem ID intrinsics used in non-kernel functions. llvm-svn: 371951	2019-09-16 00:33:00 +00:00
Matt Arsenault	9f52c1ea58	AMDGPU/GlobalISel: Select S16->S32 fptoint llvm-svn: 371950	2019-09-16 00:32:56 +00:00
Matt Arsenault	0a6123595f	AMDGPU/GlobalISel: Select s32->s16 G_[US]ITOFP llvm-svn: 371949	2019-09-16 00:29:12 +00:00
Matt Arsenault	f5d5cd205e	AMDGPU/GlobalISel: Fix VALU s16 fneg llvm-svn: 371948	2019-09-16 00:20:54 +00:00
Amara Emerson	02bcc86b08	[GlobalISel] Fix insertion point of new instructions to be after PHIs. For some reason we sometimes insert new instructions one instruction before the first non-PHI when legalizing. This can result in having non-PHI instructions before PHIs, which means that PHI elimination doesn't catch them. Differential Revision: https://reviews.llvm.org/D67570 llvm-svn: 371901	2019-09-13 21:49:24 +00:00
Alexander Timofeev	9ff70132bf	Revert for: [AMDGPU]: PHI Elimination hooks added for custom COPY insertion. llvm-svn: 371873	2019-09-13 17:37:30 +00:00
Matt Arsenault	a4be3eff5c	AMDGPU/GlobalISel: Legalize s32->s16 G_SITOFP/G_UITOFP llvm-svn: 371811	2019-09-13 04:04:55 +00:00
Matt Arsenault	67d9349dad	AMDGPU/GlobalISel: Fix RegBankSelect for amdgcn.else llvm-svn: 371808	2019-09-13 03:55:49 +00:00
Matt Arsenault	638f802381	AMDGPU/GlobalISel: Select 16-bit VALU bit ops llvm-svn: 371807	2019-09-13 03:55:43 +00:00
Matt Arsenault	f457dd2bd4	AMDGPU/GlobalISel: Legalize G_FFLOOR llvm-svn: 371803	2019-09-13 01:48:15 +00:00
Tim Shen	a31c521f5e	Temporarily revert r371640 "LiveIntervals: Split live intervals on multiple dead defs". It reveals a miscompile on Hexagon. See PR43302 for details. llvm-svn: 371802	2019-09-13 01:34:25 +00:00
Matt Arsenault	4d33918034	AMDGPU/GlobalISel: Legalize G_FMAD Unlike SelectionDAG, treat this as a normally legalizable operation. In SelectionDAG this is supposed to only ever be formed if it's legal, but I've found that to be restricting. For AMDGPU this is contextually legal depending on whether denormal flushing is allowed in the use function. Technically we currently treat the denormal mode as a subtarget feature, so custom lowering could be avoided. However I consider this to be a defect, and this should be contextually dependent on the controllable rounding mode of the parent function. llvm-svn: 371800	2019-09-13 00:44:35 +00:00
Matt Arsenault	4a73c6eada	AMDGPU/GlobalISel: Select G_CTPOP llvm-svn: 371798	2019-09-13 00:11:20 +00:00
Matt Arsenault	b85c8c4bbd	LiveIntervals: Remove assertion This testcase is invalid, and caught by the verifier. For the verifier to catch it, the live interval computation needs to complete. Remove the assert so the verifier catches this, which is less confusing. In this testcase there is an undefined use of a subregister, and lanes which aren't used or defined. An equivalent testcase with the super-register shrunk to have no untouched lanes already hit this verifier error. llvm-svn: 371792	2019-09-12 23:46:51 +00:00
Matt Arsenault	8382ce5f1b	AMDGPU: Inline constant when materializing FI with add on gfx9 This was relying on the SGPR usable for the carry out clobber to also be used for the input. There was no carry out on gfx9. With no carry out clobber to worry about, the literal can just be directly used with a VOP2 add. llvm-svn: 371791	2019-09-12 23:46:46 +00:00
Qiu Chaofan	b7fb5d0f6f	[DAGCombiner] Improve division estimation of floating points. Current implementation of estimating divisions loses precision since it estimates reciprocal first and does multiplication. This patch is to re-order arithmetic operations in the last iteration in DAGCombiner to improve the accuracy. Reviewed By: Sanjay Patel, Jinsong Ji Differential Revision: https://reviews.llvm.org/D66050 llvm-svn: 371713	2019-09-12 07:51:24 +00:00
Austin Kerbow	666af6714c	AMDGPU: Move m0 initializations earlier Summary: After hoisting and merging m0 initializations schedule them as early as possible in the MBB. This helps the scheduler avoid hazards in some cases. Reviewers: rampitec, arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67450 llvm-svn: 371671	2019-09-11 21:28:41 +00:00
Michael Liao	7957d4c015	[AMDGPU] Fix crash in phi-elimination hook. Summary: - Pre-check in case there's just a single PHI insn. Reviewers: alex-t, rampitec, arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, dstuttard, tpr, t-tye, hiraditya, llvm-commits, yaxunl Tags: #llvm Differential Revision: https://reviews.llvm.org/D67451 llvm-svn: 371649	2019-09-11 19:55:20 +00:00
Matt Arsenault	81196a595c	LiveIntervals: Split live intervals on multiple dead defs If there are multiple dead defs of the same virtual register, these are required to be split into multiple virtual registers with separate live intervals to avoid a verifier error. llvm-svn: 371640	2019-09-11 17:59:21 +00:00
Guillaume Chatelet	48904e9452	[Alignment] Use llvm::Align in MachineFunction and TargetLowering - fixes mir parsing Summary: This catches malformed mir files which specify alignment as log2 instead of pow2. See https://reviews.llvm.org/D65945 for reference. This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: MatzeB, qcolombet, dschuff, arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, s.egerton, pzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67433 llvm-svn: 371608	2019-09-11 11:16:48 +00:00
Matt Arsenault	4a23ae5e78	GlobalISel/TableGen: Handle REG_SEQUENCE patterns The scalar f64 patterns don't work yet because they fail on multiple results from the unused implicit def of scc in the result bit operation. llvm-svn: 371542	2019-09-10 17:57:33 +00:00
Matt Arsenault	e1895aba3d	AMDGPU/GlobalISel: Select G_FABS/G_FNEG f64 doesn't work yet because tablegen currently doesn't handle REG_SEQUENCE. This does regress some multi use VALU fneg cases since now the immediate remains in an SGPR, and more moves are used for legalizing the xor. This is a SIFixSGPRCopies deficiency. llvm-svn: 371540	2019-09-10 17:19:46 +00:00

2733 Commits