llvm-project

Commit Graph

Author	SHA1	Message	Date
Matt Arsenault	14d0b646b7	AMDGPU/GlobalISel: RegBankSelect for amdgcn.class llvm-svn: 364214	2019-06-24 18:00:47 +00:00
Matt Arsenault	8fcd5ade3e	AMDGPU/GlobalISel: Split VALU s64 G_ZEXT/G_SEXT in RegBankSelect Scalar extends to s64 can use S_BFE_{I64|U64}, but vector extends need to extend to the 32-bit half, and then to 64. I'm not sure what the line should be between what RegBankSelect handles, and what instruction select does, but for now I'm erring on the side of RegBankSelect for future post-RBS combines. llvm-svn: 364212	2019-06-24 17:54:12 +00:00
Matt Arsenault	f8a841b88e	AMDGPU/GlobalISel: Fix selecting G_IMPLICIT_DEF for s1 Try to fail for scc, since I don't think that should ever be produced. llvm-svn: 364199	2019-06-24 16:24:03 +00:00
Matt Arsenault	5dbd9228c4	AMDGPU/GlobalISel: Fix RegBankSelect for s1 sext/zext/anyext This needs different handling if the source is known to be a valid condition or not. Handle turning it into shifts or a select during regbankselect. llvm-svn: 364186	2019-06-24 14:53:58 +00:00
Amara Emerson	8f25a021dd	[AArch64][GlobalISel] Make s8 and s16 G_CONSTANTs legal. We sometimes get poor code size because constants of types < 32b are legalized as 32 bit G_CONSTANTs with a truncate to fit. This works but means that the localizer can no longer sink them (although it's possible to extend it to do so). On AArch64 however s8 and s16 constants can be selected in the same way as s32 constants, with a mov pseudo into a W register. If we make s8 and s16 constants legal then we can avoid unnecessary truncates, they can be CSE'd, and the localizer can sink them as normal. There is a caveat: if the user of a smaller constant has to widen the sources, we end up with an anyext of the smaller typed G_CONSTANT. This can cause regressions because of the additional extend and missed pattern matching. To remedy this, there's a new artifact combiner to generate the wider G_CONSTANT if it's legal for the target. Differential Revision: https://reviews.llvm.org/D63587 llvm-svn: 364075	2019-06-21 16:43:50 +00:00
Matt Arsenault	d5ce8ec778	AMDGPU/GlobalISel: RegBankSelect for amdgcn.div.scale llvm-svn: 363667	2019-06-18 12:23:42 +00:00
Matt Arsenault	5a321b899e	GlobalISel: Use the original flags when lowering fneg to fsub This was ignoring the flag on fneg, and using the source instruction's flags. Also fixes tests missing from r358702. Note the expansion itself isn't correct without nnan, but that should be fixed separately. llvm-svn: 363637	2019-06-17 23:48:43 +00:00
Matt Arsenault	3e140066bc	GlobalISel: Ignore callsite attributes when picking intrinsic type A target intrinsic may be defined as possibly reading memory, but the call site may have additional knowledge that it doesn't read memory. The intrinsic lowering will expect the pessimistic assumption of the intrinsic definition, so the chain should still be used. I fixed the same bug in SelectionDAG in r287593. llvm-svn: 363580	2019-06-17 17:01:35 +00:00
Matt Arsenault	a7f09f3c9e	GlobalISel: Verify intrinsics I keep using the wrong instruction when manually writing tests. This really needs to check the number of operands, but I don't see an easy way to do that right now. llvm-svn: 363579	2019-06-17 17:01:32 +00:00
Tom Stellard	8b1c53b528	AMDGPU/GlobalISel: Implement select for G_ICMP and G_SELECT Reviewers: arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60640 llvm-svn: 363576	2019-06-17 16:27:43 +00:00
Matt Arsenault	9487278010	Reapply "GlobalISel: Avoid producing Illegal copies in RegBankSelect" This reapplies r363410, avoiding null dereference if there is no AltRegBank. llvm-svn: 363478	2019-06-15 00:33:26 +00:00
Mitch Phillips	0d44f129bb	Revert "GlobalISel: Avoid producing Illegal copies in RegBankSelect" This patch breaks UBSan build bots. See https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild for a guide as to how to reproduce the error. This reverts commit `c2864c0de0`. This reverts rL363410. llvm-svn: 363476	2019-06-14 23:45:34 +00:00
Matt Arsenault	c2864c0de0	GlobalISel: Avoid producing Illegal copies in RegBankSelect Avoid producing illegal register bank copies for reg_sequence and phi. The default implementation assumes it is possible to pick any operand's bank and use that for the result, introducing a copy for operands with a different bank. This does not check for illegal copies. It is not legal to introduce a VGPR->SGPR copy, so any VGPR operand requires the result to be a VGPR. The changes in getInstrMappingImpl aren't strictly necessary, since AMDGPU now just bypasses this for reg_sequence/phi. This could be replaced with an assert in case other targets run into this. It is currently responsible for producing the error for unsatisfiable copies, but this will be better served with a verifier check. For phis, for now assume any undetermined operands must be VGPRs. Eventually, this needs to be able to defer mapping these operations. This also does not yet have a way to check for whether the block is in a divergent region. llvm-svn: 363410	2019-06-14 15:22:25 +00:00
Stanislav Mekhanoshin	000f9cc62a	[AMDGPU] more gfx1010 tests. NFC. llvm-svn: 363190	2019-06-12 18:44:11 +00:00
Matt Arsenault	61f6395fd0	AMDGPU/GlobalISel: Fix using illegal situations in tests These were using illegal copies as the side effecting use, so make them legal. llvm-svn: 363168	2019-06-12 14:23:28 +00:00
Matt Arsenault	e0a4da8c0a	AMDGPU/GlobalISel: Add wave scratch offset argument Avoids crashing in PEI in a future change. llvm-svn: 362136	2019-05-30 19:33:18 +00:00
Matt Arsenault	9ffd8b5a6f	AMDGPU/GlobalISel: Remove unnecessary REQUIREs This has been a mandatory part of the build for a while. llvm-svn: 361956	2019-05-29 13:14:35 +00:00
Matt Arsenault	0f3ba44b57	AMDGPU/GlobalISel: Legality for integer min/max llvm-svn: 361519	2019-05-23 17:58:48 +00:00
Matt Arsenault	2f29220d6d	AMDGPU/GlobalISel: Implement s64->s64 [SU]ITOFP llvm-svn: 361082	2019-05-17 23:05:18 +00:00
Matt Arsenault	02b5ca8cd1	GlobalISel: Implement lower for S64->S32 [SU]ITOFP This is ported from the custom AMDGPU DAG implementation. I think this is a better default expansion than what the DAG currently uses, at least if the target has CTLZ. This implements the signed version in terms of the unsigned conversion, which is implemented with bit operations. SelectionDAG has several other implementations that should eventually be ported depending on what instructions are legal. llvm-svn: 361081	2019-05-17 23:05:13 +00:00
Matt Arsenault	a510b570c2	AMDGPU/GlobalISel: Legalize G_FCEIL llvm-svn: 361028	2019-05-17 12:20:05 +00:00
Matt Arsenault	6aebcd5499	AMDGPU/GlobalISel: Legalize G_INTRINSIC_TRUNC llvm-svn: 361027	2019-05-17 12:20:01 +00:00
Matt Arsenault	6aafc5e19d	AMDGPU/GlobalISel: Legalize G_FRINT llvm-svn: 361026	2019-05-17 12:19:57 +00:00
Matt Arsenault	1448f5689e	AMDGPU/GlobalISel: Legalize G_FCOPYSIGN llvm-svn: 361025	2019-05-17 12:19:52 +00:00
Matt Arsenault	568f193847	AMDGPU/GlobalISel: RegBankSelect for llvm.amdgcn.s.buffer.load llvm-svn: 361023	2019-05-17 12:02:34 +00:00
Matt Arsenault	a3b5a386fa	AMDGPU/GlobalISel: Use subreg index instead of extra unmerge This saves instructions and extra steps, but I'm not sure about introducing subregister indexes at this point. llvm-svn: 361022	2019-05-17 12:02:31 +00:00
Matt Arsenault	b3dc73634c	AMDGPU/GlobalISel: Use waterfall loop for buffer_load This adds support for more complex waterfall loops that need to handle operands > 32-bits, and multiple operands. llvm-svn: 361021	2019-05-17 12:02:27 +00:00
Matt Arsenault	a8f88c388f	AMDGPU/GlobalISel: Correct regbank for 1-bit and/or/xor Bool values should use the scc/vcc regbank since r350611. llvm-svn: 360877	2019-05-16 12:06:41 +00:00
Stanislav Mekhanoshin	a6322941ff	[AMDGPU] gfx1010 VMEM and SMEM implementation Differential Revision: https://reviews.llvm.org/D61330 llvm-svn: 359621	2019-04-30 22:08:23 +00:00
Matt Arsenault	2b6f76f05f	AMDGPU/GlobalISel: Fix non-power-of-2 G_EXTRACT sources llvm-svn: 358894	2019-04-22 15:22:46 +00:00
Matt Arsenault	8f624abc1d	GlobalISel: Legalize scalar G_EXTRACT sources llvm-svn: 358892	2019-04-22 15:10:42 +00:00
Amara Emerson	946b1246d6	[GlobalISel] Enable CSE in the IRTranslator & legalizer for -O0 with constants only. Other opcodes shouldn't be CSE'd until we can be sure debug info quality won't be degraded. This change also improves the IRTranslator so that in most places, but not all, it creates constants using the MIRBuilder directly instead of first creating a new destination vreg and then creating a constant. By doing this, the buildConstant() method can just return the vreg of an existing G_CONSTANT instead of having to create a COPY from it. I measured a 0.2% improvement in compile time and a 0.9% improvement in code size at -O0 ARM64.
Compile time:
Program                                        base     cse      diff
test-suite...ark/tramp3d-v4/tramp3d-v4.test    9.04     9.12     0.8%
test-suite...Mark/mafft/pairlocalalign.test    2.68     2.66     -0.7%
test-suite...-typeset/consumer-typeset.test    5.53     5.51     -0.4%
test-suite :: CTMark/lencod/lencod.test        5.30     5.28     -0.3%
test-suite :: CTMark/Bullet/bullet.test        25.82    25.76    -0.2%
test-suite...:: CTMark/ClamAV/clamscan.test    6.92     6.90     -0.2%
test-suite...TMark/7zip/7zip-benchmark.test    34.24    34.17    -0.2%
test-suite :: CTMark/SPASS/SPASS.test          6.25     6.24     -0.1%
test-suite...:: CTMark/sqlite3/sqlite3.test    1.66     1.66     -0.1%
test-suite :: CTMark/kimwitu++/kc.test         13.61    13.60    -0.0%
Geomean difference                                               -0.2%
Code size:
Program                                        base     cse      diff
test-suite...-typeset/consumer-typeset.test    1315632  1266480  -3.7%
test-suite...:: CTMark/ClamAV/clamscan.test    1313892  1297508  -1.2%
test-suite :: CTMark/lencod/lencod.test        1439504  1423112  -1.1%
test-suite...TMark/7zip/7zip-benchmark.test    2936980  2904172  -1.1%
test-suite :: CTMark/Bullet/bullet.test        3478276  3445460  -0.9%
test-suite...ark/tramp3d-v4/tramp3d-v4.test    8082868  8033492  -0.6%
test-suite :: CTMark/kimwitu++/kc.test         3870380  3853972  -0.4%
test-suite :: CTMark/SPASS/SPASS.test          1434904  1434896  -0.0%
test-suite...Mark/mafft/pairlocalalign.test    764528   764528   0.0%
test-suite...:: CTMark/sqlite3/sqlite3.test    782092   782092   0.0%
Geomean difference                                               -0.9%
Differential Revision: https://reviews.llvm.org/D60580 llvm-svn: 358369	2019-04-15 05:04:20 +00:00
Matt Arsenault	7187272b2b	GlobalISel: Support legalizing G_CONSTANT with irregular breakdown llvm-svn: 358109	2019-04-10 17:27:53 +00:00
Matt Arsenault	9e0eeba569	GlobalISel: Handle odd breakdowns for bit ops llvm-svn: 358105	2019-04-10 17:07:56 +00:00
Tom Stellard	206b9927f8	AMDGPU/GlobalISel: Implement call lowering for shaders returning values Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, jvesely, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, volkan, llvm-commits Differential Revision: https://reviews.llvm.org/D57166 llvm-svn: 357964	2019-04-09 02:26:03 +00:00
Matt Arsenault	4ed6ccab9b	AMDGPU/GlobalISel: Fix non-power-of-2 select llvm-svn: 357762	2019-04-05 14:03:04 +00:00
Matt Arsenault	5fddf09187	AMDGPU/GlobalISel: Insert waterfall loop for vector indexing The register index can only really be an SGPR. Lie that a VGPR index is legal, and then rewrite the instruction in a waterfall loop to handle the index. llvm-svn: 357235	2019-03-29 03:54:56 +00:00
Amara Emerson	381188f1f3	[GlobalISel] Fix legalizer artifact combiner from crashing with invalid dead instructions. The artifact combiners push instructions which have been marked for deletion onto a list for the legalizer to deal with on return. However, for trunc(ext) combines the combiner routine recursively calls itself. When it does this the dead instructions list may not be empty, and the other combiners don't expect to be dealing with essentially invalid MIR (multiple vreg defs etc). This change fixes it by ensuring that the dead instructions are processed on entry into tryCombineInstruction. As a result, this fix exposed a few places in tests where G_TRUNC instructions were not being deleted even though they were dead. Differential Revision: https://reviews.llvm.org/D59892 llvm-svn: 357101	2019-03-27 17:47:42 +00:00
Matt Arsenault	733b8571b4	MIR: Freeze reserved regs after parsing everything The AMDGPU implementation of getReservedRegs depends on MachineFunctionInfo fields that are parsed from the YAML section. This was reserving the wrong register since it was setting the reserved regs before parsing the correct one. Some tests were relying on the default reserved set for the assumed default calling convention. llvm-svn: 357083	2019-03-27 16:12:26 +00:00
Matt Arsenault	b34afa311d	GlobalISel: Fix RegBankSelect for REG_SEQUENCE The AArch64 test was broken since the result register already had a set register class, so this test was a no-op. The mapping verify call would fail because the result size is not the same as the inputs like in a copy or phi. The AMDGPU testcases are half broken and introduce illegal VGPR->SGPR copies which need much more work to handle correctly (same for phis), but add them as a baseline. llvm-svn: 356713	2019-03-21 20:45:36 +00:00
Matt Arsenault	2065206a9d	AMDGPU: Don't look for constant in insert/extract_vector_elt regbankselect The constantness shouldn't change the register bank choice. We also don't need to restrict this to only indexing VGPRs, since it's possible to index SGPRs (but SelectionDAG made using this difficult). Allow directly indexing SGPRs when appropriate. llvm-svn: 356611	2019-03-20 20:41:34 +00:00
Matt Arsenault	133716929c	GlobalISel: Use multiple returns for intrinsic structs This is consistent with what SelectionDAG does and is much easier to work with than the extract sequence with an artificial wide register. For the AMDGPU control flow intrinsics, this was producing an s128 for the i64, i1 tuple return. Any legalization that should apply to a real s128 value would badly obscure the direct values that need to be seen. llvm-svn: 356147	2019-03-14 14:18:56 +00:00
David Stuttard	20ea21c6ed	[AMDGPU] Add support for immediate operand for S_ENDPGM Summary: Add support for immediate operand in S_ENDPGM Change-Id: I0c56a076a10980f719fb2a8f16407e9c301013f6 Reviewers: alexshap Subscribers: qcolombet, arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, tpr, t-tye, eraman, arphaman, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59213 llvm-svn: 355902	2019-03-12 09:52:58 +00:00
Petar Avramovic	0b17e59b5c	[MIPS GlobalISel] NarrowScalar G_MUL Narrow Scalar G_MUL for MIPS32. Revisit NarrowScalar implementation in LegalizerHelper. Introduce new helper function multiplyRegisters. It performs generic multiplication of values held in multiple registers. Generated instructions use only types NarrowTy and i1. Destination can be same or two times size of the source. Differential Revision: https://reviews.llvm.org/D58824 llvm-svn: 355814	2019-03-11 10:00:17 +00:00
Tom Stellard	33634d1b25	AMDGPU/GlobalISel: Implement select for G_INSERT Re-commit r344310. Reviewers: arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D53116 llvm-svn: 355159	2019-03-01 00:50:26 +00:00
Tom Stellard	41f32196a0	AMDGPU/GlobalISel: Implement select for G_EXTRACT Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D49714 llvm-svn: 355156	2019-02-28 23:37:48 +00:00
Matt Arsenault	bf1bf706c8	AMDGPU/GlobalISel: Add regbankselect test for phis Add baseline for future fixes. These mostly show how this is broken and producing illegal situations. llvm-svn: 355057	2019-02-28 00:52:36 +00:00
Matt Arsenault	d3093c2f1f	GlobalISel: Implement fewerElementsVector for phi llvm-svn: 355048	2019-02-28 00:16:32 +00:00
Matt Arsenault	72bcf15dbf	GlobalISel: Implement moreElementsVector for phi llvm-svn: 355047	2019-02-28 00:01:05 +00:00
Matt Arsenault	752579736e	RegBankSelect: Handle slightly more complex value mappings Try to use concat_vectors. Also remove unnecessary assert on pointers. Fixes asserting for <4 x s16> operations and 64-bit pointers for AMDGPU. llvm-svn: 354828	2019-02-25 22:24:13 +00:00
Matt Arsenault	f4bfe4cd17	AMDGPU/GlobalISel: Fix bit ops for non-power-of-2 sizes llvm-svn: 354825	2019-02-25 21:32:48 +00:00
Matt Arsenault	82b103998b	AMDGPU/GlobalISel: Clamp max implicit_def elements llvm-svn: 354818	2019-02-25 20:46:06 +00:00
Matt Arsenault	2e0ee47712	AMDGPU/GlobalISel: Make phis legal llvm-svn: 354592	2019-02-21 15:48:13 +00:00
Matt Arsenault	b10fa8df3f	AMDGPU/GlobalISel: Fix bit count ops for non-power-of-2 types llvm-svn: 354587	2019-02-21 15:22:20 +00:00
Tom Stellard	79b5c3842b	AMDGPU/GlobalISel: Move SMRD selection logic to TableGen Reviewers: arsenm Reviewed By: arsenm Subscribers: volkan, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D52922 llvm-svn: 354516	2019-02-20 21:02:37 +00:00
Matt Arsenault	75e30c4d5d	GlobalISel: Fix fewerElementsVector for ctlz with different result type Also complete the set of related operations. llvm-svn: 354480	2019-02-20 16:42:52 +00:00
Matt Arsenault	c4d07554e4	GlobalISel: Implement moreElementsVector for g_insert results llvm-svn: 354477	2019-02-20 16:11:22 +00:00
Matt Arsenault	b4c95b338b	GlobalISel: Implement moreElementsVector for select llvm-svn: 354354	2019-02-19 17:03:09 +00:00
Matt Arsenault	4d88427a58	GlobalISel: Implement moreElementsVector for G_EXTRACT source llvm-svn: 354348	2019-02-19 16:44:22 +00:00
Matt Arsenault	26b7e859ef	GlobalISel: Implement moreElementsVector for bit ops llvm-svn: 354345	2019-02-19 16:30:19 +00:00
Matt Arsenault	fbe92a53d0	GlobalISel: Implement widenScalar for g_extract scalar results llvm-svn: 354293	2019-02-18 22:39:27 +00:00
Matt Arsenault	9e5e868d95	AMDGPU/GlobalISel: Fix RegBankSelect for GEP. This is basically a pointer typed add, so shouldn't be any different. This was assuming everything was an SGPR, which is not true. Also cleanup legality for GEP. I don't seem to be seeing the problem the hack marking s64 as a legal pointer type the comment mentions. llvm-svn: 354067	2019-02-14 22:24:28 +00:00
Matt Arsenault	d3d496338e	AMDGPU/GlobalISel: Handle split for 64-bit VALU select llvm-svn: 354065	2019-02-14 21:58:12 +00:00
Matt Arsenault	a180554020	AMDGPU/GlobalISel: Add more insert/extract testcases llvm-svn: 353848	2019-02-12 15:04:03 +00:00
Matt Arsenault	00ccd13c73	AMDGPU/GlobalISel: Only make f16 constants legal on f16 targets We could deal with it, but there's no real point. llvm-svn: 353845	2019-02-12 14:54:55 +00:00
Matt Arsenault	b2d245771f	GlobalISel: Verify G_EXTRACT llvm-svn: 353759	2019-02-11 22:12:43 +00:00
Matt Arsenault	18ec382698	GlobalISel: Implement moreElementsVector for implicit_def llvm-svn: 353754	2019-02-11 22:00:39 +00:00
Matt Arsenault	9dba67f431	GlobalISel: Add G_FCANONICALIZE instruction llvm-svn: 353719	2019-02-11 17:05:20 +00:00
Matt Arsenault	ca9583a70a	AMDGPU/GlobalISel: Fix broken tests llvm-svn: 353559	2019-02-08 19:59:39 +00:00
Matt Arsenault	b0a227049f	AMDGPU/GlobalISel: Fix shift legalization for non-power-of-2 clampScalar doesn't do anything for non-power-of-2 in range. There should probably be a combination rule to reduce the number of matching rules. llvm-svn: 353526	2019-02-08 15:06:24 +00:00
Matt Arsenault	0f2debb1c2	AMDGPU/GlobalISel: Fix non-power-of-2 implicit_def llvm-svn: 353522	2019-02-08 14:46:27 +00:00
Petar Avramovic	c98b26d326	[MIPS GlobalISel] Select any extending load and truncating store Make behavior of G_LOAD in widenScalar same as for G_ZEXTLOAD and G_SEXTLOAD. That is perform widenScalarDst to size given by the target and avoid additional checks in common code. Targets can reorder or add additional rules in LegalizeRuleSet for the opcode to achieve desired behavior. Select extending load that does not have specified type of extension into zero extending load. Select truncating store that stores number of bytes indicated by size in MachineMemoperand. Differential Revision: https://reviews.llvm.org/D57454 llvm-svn: 353520	2019-02-08 14:27:23 +00:00
Matt Arsenault	dc88a2ce35	AMDGPU/GlobalISel: Don't use a copy in addrspacecast lowering llvm-svn: 353516	2019-02-08 14:16:11 +00:00
Matt Arsenault	a8b4339c2f	AMDGPU/GlobalISel: Legalize addrspacecast Use a placeholder constant for now on targets that need the load from the queue ptr. llvm-svn: 353497	2019-02-08 02:40:47 +00:00
Matt Arsenault	fbec8fe93b	GlobalISel: Implement narrowScalar for shift main type This is pretty much directly ported from SelectionDAG. Doesn't include the shift by non-constant but known bits version, since there isn't a globalisel version of computeKnownBits yet. This shows a disadvantage of targets not specifying which type should be used for the shift amount. If type 0 is legalized before type 1, the operations on the shift amount type use the wider type (which are also less likely to legalize). This can be avoided by targets specifying legalization actions on type 1 earlier than for type 0. llvm-svn: 353455	2019-02-07 19:37:44 +00:00
Matt Arsenault	d914189a2e	AMDGPU/GlobalISel: Restrict g_implicit_def legality llvm-svn: 353452	2019-02-07 19:10:15 +00:00
Matt Arsenault	d6212f9f1b	GlobalISel: Fix artifact combiner constant legality checks for vectors Since G_CONSTANT is illegal for vectors, this needs to check what buildConstant will produce for a splat vector. llvm-svn: 353449	2019-02-07 18:58:28 +00:00
Matt Arsenault	60b33fb6fc	AMDGPU/GlobalISel: Don't use g_implicit_def in a few tests llvm-svn: 353443	2019-02-07 18:33:22 +00:00
Matt Arsenault	c0f7569aab	AMDGPU/GlobalISel: Legalize fsqrt llvm-svn: 353438	2019-02-07 18:14:39 +00:00
Matt Arsenault	93fdec739b	AMDGPU/GlobalISel: Legalize some f16 operations llvm-svn: 353436	2019-02-07 18:03:11 +00:00
Matt Arsenault	c83b82363c	GlobalISel: Implement fewerElementsVector for shifts Introduce a new function which handles instructions with multiple type indices, but have the same number of vector elements. Also legalize v2s16 shifts when applicable. llvm-svn: 353432	2019-02-07 17:38:00 +00:00
Matt Arsenault	7f09fd6b04	GlobalISel: Consolidate load/store legalization The fewerElementsVectors implementation for load/stores handles the scalar reduction case just as well, so drop the redundant code in narrowScalar. This also introduces support for narrowing irregular size breakdowns for scalars. llvm-svn: 353125	2019-02-05 00:26:12 +00:00
Matt Arsenault	81511e5428	GlobalISel: Implement narrowScalar for select Don't handle vector conditions. I think this can be merged in the future with fewerElementsVectorSelect, although this becomes slightly tricky with a vector condition. llvm-svn: 353122	2019-02-05 00:13:44 +00:00
Matt Arsenault	24f14993e8	GlobalISel: Combine g_extract with g_merge_values Try to use the underlying source registers. This enables legalization in more cases where some irregular operations are widened and others narrowed. This seems to make the test_combines_2 AArch64 test worse, since the MERGE_VALUES has multiple uses. Since this should be required for legalization, a hasOneUse check is probably inappropriate (or maybe should only be used if the merge is legal?). llvm-svn: 353121	2019-02-04 23:41:59 +00:00
Matt Arsenault	1f795e2c2a	GlobalISel: Enforce operand types for constants A number of tests were using imm operands, not cimm. Since CSE relies on the exact ConstantInt* pointer used, and implicit conversions are generally evil, also enforce the bitsize of the types. llvm-svn: 353113	2019-02-04 23:29:31 +00:00
Matt Arsenault	3d6a49b0b9	GlobalISel: Fix not calling observer when legalizing bitcount ops This was hiding bugs from never legalizing the source type. llvm-svn: 353102	2019-02-04 22:26:33 +00:00
Matt Arsenault	10547230f3	AMDGPU/GlobalISel: Legalize select for v4s16 Also add some more select tests to help show future legalization changes. llvm-svn: 353045	2019-02-04 14:04:52 +00:00
Matt Arsenault	888aa5dedd	GlobalISel: Implement widenScalar for G_UNMERGE_VALUES For the scalar case only. Also move the similar G_MERGE_VALUES handling to a separate function and cleanup to make them look more similar. llvm-svn: 352979	2019-02-03 00:07:33 +00:00
Matt Arsenault	0e5d856eb8	GlobalISel: Implement widenScalar for G_EXTRACT vector sources Handle the basic element extract case. llvm-svn: 352978	2019-02-02 23:56:00 +00:00
Matt Arsenault	58f9d3df97	AMDGPU/GlobalISel: Legalize icmp for pointer types llvm-svn: 352976	2019-02-02 23:35:15 +00:00
Matt Arsenault	2065c94dd3	AMDGPU/GlobalISel: Legalize constant for pointer types llvm-svn: 352975	2019-02-02 23:33:49 +00:00
Matt Arsenault	2491f82679	AMDGPU/GlobalISel: Legalize select for pointer types llvm-svn: 352974	2019-02-02 23:31:50 +00:00
Matt Arsenault	cbaada6bc1	GlobalISel: Legalization for inttoptr/ptrtoint llvm-svn: 352973	2019-02-02 23:29:55 +00:00
Matt Arsenault	c7bce739ad	GlobalISel: Handle odd splits in fewerElementsVector for load/store llvm-svn: 352720	2019-01-31 02:46:05 +00:00
Matt Arsenault	d1bfc8d0c3	GlobalISel: Implement narrowScalar for bswap llvm-svn: 352719	2019-01-31 02:34:03 +00:00
Matt Arsenault	d5684f76e0	GlobalISel: Allow bitcount ops to have different result type For AMDGPU the result is always 32-bit for 64-bit inputs. llvm-svn: 352717	2019-01-31 02:09:57 +00:00
Matt Arsenault	dc6c78596b	GlobalISel: Implement fewerElementsVector for select llvm-svn: 352601	2019-01-30 04:19:31 +00:00
Matt Arsenault	f6cab16258	AMDGPU/GlobalISel: Fix clamping shifts with 16-bit insts llvm-svn: 352599	2019-01-30 03:36:25 +00:00
Matt Arsenault	045bc9a4a6	GlobalISel: Support narrowScalar for uneven loads llvm-svn: 352594	2019-01-30 02:35:38 +00:00
Matt Arsenault	ccefbbd0f0	GlobalISel: Handle some odd splits in fewerElementsVector Also add some quick hacks to AMDGPU legality for the tests. llvm-svn: 352591	2019-01-30 02:22:13 +00:00
Matt Arsenault	92c5001136	GlobalISel: Handle more cases for widenScalar for G_STORE llvm-svn: 352585	2019-01-30 02:04:31 +00:00
Matt Arsenault	d45b03bb81	GlobalISel: Verify pointer casts Not sure if the old AArch64 tests should be just deleted or not. llvm-svn: 352562	2019-01-29 23:29:00 +00:00
Matt Arsenault	d8d193d5e2	GlobalISel: Partially implement widenScalar for MERGE_VALUES llvm-svn: 352560	2019-01-29 23:17:35 +00:00
Matt Arsenault	18619afe1d	GlobalISel: Fix narrowScalar for load/store with different mem size This was ignoring the memory size, and producing multiple loads/stores if the operand size was different from the memory size. I assume this is the intent of not having an explicit G_ANYEXTLOAD (although I think that would probably be better). llvm-svn: 352523	2019-01-29 18:13:02 +00:00
Matt Arsenault	cfca2a7adf	GlobalISel: Don't reduce elements for atomic load/store This is invalid for the same reason as in the narrowScalar handling for load. llvm-svn: 352334	2019-01-27 22:36:24 +00:00
Matt Arsenault	fdfb7d78f1	GlobalISel: Verify load/store has a pointer input I expected this to be automatically verified, but it seems nothing uses that the type index was declared as a "ptype" llvm-svn: 352319	2019-01-27 15:57:23 +00:00
Matt Arsenault	211e89d4dd	GlobalISel: Implement narrowScalar for mul llvm-svn: 352300	2019-01-27 00:52:51 +00:00
Matt Arsenault	2e5f900849	GlobalISel: fewerElementsVector for intrinsic_trunc/intrinsic_round llvm-svn: 352298	2019-01-27 00:12:21 +00:00
Matt Arsenault	26a6c74fbe	AMDGPU/GlobalISel: Legalize more bit ops llvm-svn: 352295	2019-01-26 23:47:07 +00:00
Matt Arsenault	4d47594fc5	AMDGPU/GlobalISel: Widen small uaddo/usubo llvm-svn: 352294	2019-01-26 23:44:51 +00:00
Matt Arsenault	3e08b772b3	AMDGPU/GlobalISel: Scalarize add/sub llvm-svn: 352167	2019-01-25 04:53:57 +00:00
Matt Arsenault	e6cebd0d69	GlobalISel: fewerElementsVector for more cast types llvm-svn: 352166	2019-01-25 04:37:33 +00:00
Matt Arsenault	95fd95cfe0	GlobalISel: fewerElementsVector for a few more trivial ops llvm-svn: 352165	2019-01-25 04:03:38 +00:00
Matt Arsenault	5d622fbcc1	AMDGPU/GlobalISel: Legalize smulh/umulh and scalarize mul llvm-svn: 352162	2019-01-25 03:23:04 +00:00
Matt Arsenault	1b1e685f10	GlobalISel: Support fewerElementsVector for icmp/fcmp Also legalize 64-bit compares for AMDGPU llvm-svn: 352157	2019-01-25 02:59:34 +00:00
Matt Arsenault	ca676343a9	GlobalISel: Implement fewerElementsVector for extensions llvm-svn: 352155	2019-01-25 02:36:32 +00:00
Aditya Nandakumar	3ba0d94bce	[GISel]: Change how CSE is enabled by default for each pass https://reviews.llvm.org/D57178 Now add a hook in TargetPassConfig to query if CSE needs to be enabled. By default this hook returns false only for O0 opt level but this can be overridden by the target. As a consequence of the default of enabled for non O0, a few tests needed to be updated to not use CSE (by passing in -O0) to the run line. reviewed by: arsenm llvm-svn: 352126	2019-01-24 23:11:25 +00:00
Matt Arsenault	baa5d2e69c	RegBankSelect: Support some more complex part mappings llvm-svn: 352123	2019-01-24 22:47:04 +00:00
Matt Arsenault	4c5e8f51e7	AMDGPU/GlobalISel: Start selectively legalizing 16-bit operations It might be a bit nicer to use the fancy .legalIf and co. predicates, but this was requiring more boilerplate and disables the coverage assertions. llvm-svn: 351886	2019-01-22 22:00:19 +00:00
Matt Arsenault	736cfa9ffb	AMDGPU/GlobalISel: Handle legality/regbanks for 32/64-bit shifts llvm-svn: 351884	2019-01-22 21:51:38 +00:00
Matt Arsenault	6378629609	GlobalISel: Implement widen for extract_vector_elt elt type llvm-svn: 351871	2019-01-22 20:38:15 +00:00
Matt Arsenault	aebb2ee036	GlobalISel: Implement fewerElementsVector for basic FP ops llvm-svn: 351866	2019-01-22 20:14:29 +00:00
Matt Arsenault	6614f852b6	GlobalISel: Support narrowing zextload/sextload llvm-svn: 351856	2019-01-22 19:02:10 +00:00
Matt Arsenault	a7cd83bc88	GlobalISel: Disallow vectors for G_CONSTANT/G_FCONSTANT llvm-svn: 351853	2019-01-22 18:53:41 +00:00
Matt Arsenault	fb67164ebc	AMDGPU/GlobalISel: Legalize more fp<->int conversions llvm-svn: 351767	2019-01-22 00:20:17 +00:00
Matt Arsenault	7ac79ed8f0	AMDGPU: Legalize more bitcasts llvm-svn: 351700	2019-01-20 19:45:18 +00:00
Matt Arsenault	46ffe68d77	AMDGPU/GlobalISel: Really legalize exts from i1 There is a combine that was hiding these tests not actually testing what they should be, although they were producing the expected end result. llvm-svn: 351698	2019-01-20 19:28:20 +00:00
Matt Arsenault	745fd9f547	GlobalISel: Implement widenScalar for basic FP ops llvm-svn: 351696	2019-01-20 19:10:31 +00:00
Matt Arsenault	cfd9e7f594	AMDGPU/GlobalISel: Legalize f32->f16 fptrunc llvm-svn: 351695	2019-01-20 19:10:26 +00:00
Matt Arsenault	ff6a9a275b	AMDGPU/GlobalISel: Fix some crashes in g_unmerge_values/g_merge_values This was crashing in the predicate function assuming the value is a vector. Copy more of what AArch64 uses. This probably needs more refinement later, but I don't exactly understand what it means in some cases, particularly since any legalization for these seems to be missing. llvm-svn: 351693	2019-01-20 18:40:36 +00:00
Matt Arsenault	2a2086b830	AMDGPU/GlobalISel: Regbank select for fpext llvm-svn: 351692	2019-01-20 18:35:41 +00:00
Matt Arsenault	24563ef628	AMDGPU/GlobalISel: Cleanup legality for extensions llvm-svn: 351691	2019-01-20 18:34:24 +00:00
Matt Arsenault	96e4701401	AMDGPU/GlobalISel: Legalize more types for select llvm-svn: 351599	2019-01-18 21:42:55 +00:00
Matt Arsenault	4599159ac3	AMDGPU/GlobalISel: Legalize illegal g_constant llvm-svn: 351596	2019-01-18 21:33:50 +00:00
Matt Arsenault	c765240060	AMDGPU/GlobalISel: Introduce vcc reg bank I'm not entirely sure this is the correct thing to do with the global isel philosophy, but I think this is necessary to handle how differently SGPRs are used normally vs. from a condition. For example, it makes sense to allow a copy from a VGPR to an SGPR, but it makes no sense to allow a copy from VGPRs to SGPRs used as a select mask. This avoids regbankselecting strange code with a truncate feeding directly into a condition field. Now a copy is forced from sgpr(s1) to vcc, which is more sensible to handle. Some of these issues could probably be avoided by making enough operations resulting in i1 illegal. I think we can't avoid this register bank for legality. For example, an i1 and where one source is from a truncate, and one source is a compare needs some kind of copy inserted to make sure both are in condition registers. llvm-svn: 350611	2019-01-08 06:30:53 +00:00
Matt Arsenault	a1515d2d33	AMDGPU/GlobalISel: Legalize concat_vectors llvm-svn: 350598	2019-01-08 01:30:02 +00:00
Matt Arsenault	adc40baa29	RegBankSelect: Fix copy insertion point for terminators If a copy was needed to handle the condition of brcond, it was being inserted before the defining instruction. Add tests for iterator edge cases. I find the existing code here suspect for the case where it's looking for terminators that modify the register. It's going to insert a copy in the middle of the terminators, which isn't allowed (it might be necessary to have a COPY_terminator if anybody actually needs this). Also legalize brcond for AMDGPU. llvm-svn: 350595	2019-01-08 01:22:47 +00:00
Matt Arsenault	ae6f1e07fc	AMDGPU/GlobalISel: Disallow VGPR->SCC copies This fixes using scalar adds when only the carry in is a VGPR using greedy regbankselect. llvm-svn: 350593	2019-01-08 01:13:20 +00:00
Matt Arsenault	68c668a5f3	AMDGPU/GlobalISel: RegBankSelect for carry-in I'm not sure we should be allowing the truncate to s1 for the inputs. It may be necessary to create a new VCC reg bank. llvm-svn: 350592	2019-01-08 01:09:09 +00:00
Matt Arsenault	2cc15b67b7	AMDGPU/GlobalISel: RegBankSelect for add/sub with carry out llvm-svn: 350589	2019-01-08 01:03:58 +00:00
Matt Arsenault	299302fbe7	AMDGPU/GlobalISel: InstrMapping for G_UNMERGE_VALUES llvm-svn: 350588	2019-01-08 00:46:19 +00:00
Matt Arsenault	369acb8470	AMDGPU: Remove VS/SV mappings from select These would violate the constant bus restriction llvm-svn: 350517	2019-01-07 13:21:36 +00:00
Matt Arsenault	3eae3c4590	AMDGPU/GlobalISel: RegBankSelect for amdgcn.wqm.vote llvm-svn: 349882	2018-12-21 03:20:54 +00:00
Matt Arsenault	f4c21c575a	AMDGPU/GlobalISel: RegBankSelect for some fp ops llvm-svn: 349880	2018-12-21 03:14:45 +00:00
Matt Arsenault	bee2ad7185	AMDGPU/GlobalISel: Redo legality for build_vector It seems better to avoid using the callback if possible since there are coverage assertions which are disabled if this is used. Also fix missing tests. Only test the legal cases since it seems legalization for build_vector is quite lacking. llvm-svn: 349878	2018-12-21 03:03:11 +00:00
Matt Arsenault	4339883710	AMDGPU: Make i1/i64/v2i32 and/or/xor legal The 64-bit types do depend on the register bank, but that's another issue to deal with later. llvm-svn: 349716	2018-12-20 01:35:49 +00:00
Matt Arsenault	8cc98bee8a	AMDGPU/GlobalISel: Fix ValueMapping tables for i1 This was incorrectly selecting SGPR for any i1 values, e.g. G_TRUNC to i1 from a VGPR was still an SGPR. llvm-svn: 349715	2018-12-20 01:33:43 +00:00
Matt Arsenault	dff33c38e1	AMDGPU/GlobalISel: RegBankSelect for fp conversions llvm-svn: 349709	2018-12-20 00:37:02 +00:00
Matt Arsenault	36d4092173	AMDGPU/GlobalISel: Legality/regbankselect for atomicrmw/atomic_cmpxchg llvm-svn: 349708	2018-12-20 00:33:49 +00:00
Matt Arsenault	b110e2277c	AMDGPU/GlobalISel: Regbankselect for fsub llvm-svn: 349608	2018-12-19 09:07:58 +00:00
Matt Arsenault	c94e26c71d	AMDGPU: Legalize/regbankselect frame_index llvm-svn: 349468	2018-12-18 09:46:13 +00:00
Matt Arsenault	c0ea221068	AMDGPU: Legalize/regbankselect fma llvm-svn: 349467	2018-12-18 09:39:56 +00:00
Matt Arsenault	e01e7c81f2	AMDGPU/GlobalISel: Legalize/regbankselect fneg/fabs/fsub llvm-svn: 349463	2018-12-18 09:19:03 +00:00
Matt Arsenault	934e534c47	AMDGPU/GlobalISel: Legalize/regbankselect block_addr llvm-svn: 349081	2018-12-13 20:34:15 +00:00
Matt Arsenault	577b9fc543	AMDGPU/GlobalISel: Legalize f64 fadd/fmul llvm-svn: 349014	2018-12-13 08:27:48 +00:00
Matt Arsenault	f38f483bef	AMDGPU/GlobalISel: RegBankSelect some simple operations llvm-svn: 349012	2018-12-13 08:23:51 +00:00
Matt Arsenault	7acf89a21a	AMDGPU/GlobalISel: Test cleanups Remove IR and registers sections llvm-svn: 349011	2018-12-13 08:11:45 +00:00
Amara Emerson	5ec146046c	[GlobalISel] Restrict G_MERGE_VALUES capability and replace with new opcodes. This patch restricts the capability of G_MERGE_VALUES, and uses the new G_BUILD_VECTOR and G_CONCAT_VECTORS opcodes instead in the appropriate places. This patch also includes AArch64 support for selecting G_BUILD_VECTOR of <4 x s32> and <2 x s64> vectors. Differential Revisions: https://reviews.llvm.org/D53629 llvm-svn: 348788	2018-12-10 18:44:58 +00:00
Tom Stellard	a894043910	Revert "AMDGPU/GlobalISel: Implement select for G_INSERT" This reverts commit r344310. The test case was failing on some bots. llvm-svn: 344317	2018-10-11 23:36:46 +00:00
Tom Stellard	4733be6e7b	AMDGPU/GlobalISel: Implement select for G_INSERT Reviewers: arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D53116 llvm-svn: 344310	2018-10-11 22:49:54 +00:00
Tom Stellard	14d8807d9a	AMDGPU/GlobalISel: Select amdgcn.cvt.pkrtz to 64-bit instructions Summary: The 32-bit variants do not exist on VI+. Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D52958 llvm-svn: 343985	2018-10-08 17:49:29 +00:00
Tom Stellard	7c65078f04	AMDGPU/GlobalISel: Add support for G_INTTOPTR Summary: This is a no-op. Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D52916 llvm-svn: 343839	2018-10-05 04:34:09 +00:00
Tom Stellard	ffc6bd6f3d	AMDGPU/GlobalISel: Define instruction mapping for G_SELECT Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D49737 llvm-svn: 341271	2018-09-01 02:41:19 +00:00
Tom Stellard	8adc86a7dc	AMDGPU/GlobalISel: Define instruction mapping for G_INSERT Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D49625 llvm-svn: 339491	2018-08-11 00:51:54 +00:00
Tom Stellard	e9bdc5f1d8	AMDGPU/GlobalISel: Fix crash in regbankselect on non-power-of-2 types Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D49624 llvm-svn: 338102	2018-07-27 06:04:40 +00:00
Tom Stellard	b7f19e6d1e	AMDGPU/GlobalISel: Legalize G_INSERT Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D49601 llvm-svn: 337798	2018-07-24 02:19:20 +00:00
Tom Stellard	ac68471326	AMDGPU/GlobalISel: Implement select() for 32-bit @llvm.minnum and @llvm.maxnum Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D46172 llvm-svn: 337056	2018-07-13 22:16:03 +00:00
Tom Stellard	390a5f4774	AMDGPU/GlobalISel: Implement select() for @llvm.amdgcn.exp Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D45882 llvm-svn: 337046	2018-07-13 21:05:14 +00:00
Matt Arsenault	29f303799b	AMDGPU/GlobalISel: Implement custom kernel arg lowering Avoid using allocateKernArg / AssignFn. We do not want any of the type splitting properties of normal calling convention lowering. For now at least this exists alongside the IR argument lowering pass. This is necessary to handle struct padding correctly while some arguments are still skipped by the IR argument lowering pass. llvm-svn: 336373	2018-07-05 17:01:20 +00:00
Tom Stellard	eebbfc2809	AMDGPU/GlobalISel: Make IMPLICIT_DEF of all sizes < 512 legal. Summary: We could split sizes that are not power of two into smaller sized G_IMPLICIT_DEF instructions, but this ends up generating G_MERGE_VALUES instructions which we then have to handle in the instruction selector. Since G_IMPLICIT_DEF is really a no-op it's easier just to keep everything that can fit into a register legal. Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D48777 llvm-svn: 336041	2018-06-30 04:09:44 +00:00
Matt Arsenault	8c4a35237a	AMDGPU: Add pass to lower kernel arguments to loads This replaces most argument uses with loads, but for now not all. The code in SelectionDAG for calling convention lowering is actively harmful for amdgpu_kernel. It attempts to split the argument types into register legal types, which results in low quality code for arbitrary types. Since all kernel arguments are passed in memory, we just want the raw types. I've tried a couple of methods of mitigating this in SelectionDAG, but it's easier to just bypass this problem altogether. It's possible to hack around the problem in the initial lowering, but the real problem is the DAG then expects to be able to use CopyToReg/CopyFromReg for uses of the arguments outside the block. Exposing the argument loads in the IR also has the advantage that the LoadStoreVectorizer can merge them. I'm not sure what the best approach to dealing with the IR argument list is. The patch as-is just leaves the IR arguments in place, so all the existing code will still compute the same kernarg size and pointlessly lowers the arguments. Arguably the frontend should emit kernels with an empty argument list in the first place. Alternatively a dummy array could be inserted as a single argument just to reserve space. This does have some disadvantages. Local pointer kernel arguments can no longer have AssertZext placed on them as the equivalent !range metadata is not valid on pointer typed loads. This is mostly bad for SI which needs to know about the known bits in order to use the DS instruction offset, so in this case this is not done. More importantly, this skips noalias arguments since this pass does not yet convert this to the equivalent !alias.scope and !noalias metadata. Producing this metadata correctly seems to be tricky, although this logically is the same as inlining into a function which doesn't exist.
Additionally, exposing these loads to the vectorizer may result in degraded aliasing information if a pointer load is merged with another argument load. I'm also not entirely sure this is preserving the current clover ABI, although I would greatly prefer if it would stop widening arguments and match the HSA ABI. As-is I think it is extending < 4-byte arguments to 4-bytes but doesn't align them to 4-bytes. llvm-svn: 335650	2018-06-26 19:10:00 +00:00
Matt Arsenault	b1cc4f52ff	AMDGPU/GlobalISel: Add support for llvm.amdgcn.kernarg.segment.ptr Note a normal select test is not currently possible because this relies on input registers tracked in SIMachineFunctionInfo which are not currently serializable in MIR, but this does work end-to-end from the IR. llvm-svn: 335490	2018-06-25 16:17:48 +00:00
Matt Arsenault	b3feccd7fa	AMDGPU/GlobalISel: Fix G_IMPLICIT_DEF for pointers llvm-svn: 335485	2018-06-25 15:42:12 +00:00
Tom Stellard	26fac0f8e1	AMDGPU/GlobalISel: legalize and select 32-bit G_ASHR Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D48196 llvm-svn: 335318	2018-06-22 02:54:57 +00:00
Tom Stellard	9a6535718e	AMDGPU/GlobalISel: legalize and select 32-bit G_SITOFP Reviewers: arsenm, nhaehnle Reviewed By: arsenm Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D48195 llvm-svn: 335316	2018-06-22 02:34:29 +00:00
Tom Stellard	7712ee8891	AMDGPU/GlobalISel: Implement select() for COPY Reviewers: arsenm, nhaehnle Reviewed By: nhaehnle Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D46151 llvm-svn: 335315	2018-06-22 00:44:29 +00:00
Tom Stellard	3f1c6fe156	AMDGPU/GlobalISel: Implement select() for G_IMPLICIT_DEF Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D46150 llvm-svn: 335307	2018-06-21 23:38:20 +00:00
Tom Stellard	a92847359a	AMDGPU/GlobalISel: Implement select() for @llvm.amdgcn.cvt.pkrtz Reviewers: arsenm, nhaehnle Reviewed By: arsenm Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D45907 llvm-svn: 334757	2018-06-14 19:26:37 +00:00
Tom Stellard	46bbbc33c0	AMDGPU/GlobalISel: Implement select() for 32-bit G_FADD and G_FMUL Reviewers: arsenm, nhaehnle Reviewed By: arsenm Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D46171 llvm-svn: 334665	2018-06-13 22:30:47 +00:00
Tom Stellard	e182b28ae4	AMDGPU/GlobalISel: Implement select() for G_FCONSTANT Summary: Also clean up G_CONSTANT selection. Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D46170 llvm-svn: 332379	2018-05-15 17:57:09 +00:00
Tom Stellard	655fdd3f82	AMDGPU/GlobalISel: Implement select() for >32-bit G_STORE Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D46153 llvm-svn: 332154	2018-05-11 23:12:49 +00:00
Tom Stellard	dcc95e9385	AMDGPU/GlobalISel: Implement select() for 32-bit G_FPTOUI Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D45883 llvm-svn: 332082	2018-05-11 05:44:16 +00:00
Tom Stellard	1e0edad4bb	AMDGPU/GlobalISel: Implement select() for G_BITCAST s32 <--> <2 x s16> Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D45881 llvm-svn: 332042	2018-05-10 21:20:10 +00:00
Tom Stellard	1dc90204bf	AMDGPU/GlobalISel: Enable TableGen'd instruction selector Reviewers: arsenm, nhaehnle Reviewed By: arsenm Subscribers: kzhuravl, wdng, mgorny, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D45994 llvm-svn: 332039	2018-05-10 20:53:06 +00:00
Roman Tereshin	d2421f9445	[MachineVerifier][GlobalISel] Verifying generic extends and truncates Making sure we don't truncate / extend pointers, don't try to change vector topology or bitcast vectors to scalars or back, and most importantly, don't extend to a smaller type or truncate to a large one. Reviewers: qcolombet t.p.northover aditya_nandakumar Reviewed By: qcolombet Subscribers: rovka, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D46490 llvm-svn: 331718	2018-05-08 02:48:15 +00:00
Daniel Sanders	acc008cb0c	[globalisel] Remove redundant -global-isel option from tests that use -run-pass. NFC As Roman Tereshin pointed out in https://reviews.llvm.org/D45541, the -global-isel option is redundant when -run-pass is given. -global-isel sets up the GlobalISel passes in the pass manager but -run-pass skips that entirely and configures its own pipeline. llvm-svn: 331603	2018-05-05 21:19:59 +00:00
Tom Stellard	257882ff72	AMDGPU/GlobalISel: Fall-back to SelectionDAG for non-void functions Reviewers: arsenm, nhaehnle Reviewed By: nhaehnle Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D45843 llvm-svn: 330774	2018-04-24 21:29:36 +00:00
Tom Stellard	c7709e1c29	AMDGPU/GlobalISel: Add support for amdgpu_ps calling convention Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D45837 llvm-svn: 330767	2018-04-24 20:51:28 +00:00
Matt Arsenault	fed0a45036	AMDGPU/GlobalISel: RegBankSelect for basic int ops llvm-svn: 327843	2018-03-19 14:07:23 +00:00
Matt Arsenault	abdc4f2dc7	AMDGPU/GlobalISel: Cleanup constant legality llvm-svn: 327774	2018-03-17 15:17:48 +00:00
Matt Arsenault	685d1e8157	AMDGPU/GlobalISel: Basic G_GEP legality llvm-svn: 327773	2018-03-17 15:17:45 +00:00
Matt Arsenault	85803366d6	AMDGPU/GlobalISel: Basic legality for load/store llvm-svn: 327772	2018-03-17 15:17:41 +00:00
Matt Arsenault	7b9ed89dcf	AMDGPU/GlobalISel: Legality and RegBankInfo for G_{INSERT\|EXTRACT}_VECTOR_ELT llvm-svn: 327269	2018-03-12 13:35:53 +00:00
Matt Arsenault	c0aefd561e	AMDGPU/GlobalISel: InstrMapping for G_MERGE_VALUES llvm-svn: 327268	2018-03-12 13:35:49 +00:00
Matt Arsenault	503afda95f	AMDGPU/GlobalISel: Make some G_MERGE_VALUEs legal llvm-svn: 327267	2018-03-12 13:35:43 +00:00
Matt Arsenault	e31ab94e97	AMDGPU/GlobalISel: Add InstrMapping for G_EXTRACT llvm-svn: 326715	2018-03-05 16:25:18 +00:00
Matt Arsenault	71272e6d4e	AMDGPU/GlobalISel: Make some G_EXTRACTs legal As far as I can tell legalization of weird sizes for the output type isn't implemented. llvm-svn: 326714	2018-03-05 16:25:15 +00:00
Matt Arsenault	b9699c009d	AMDGPU/GlobalISel: InstrMapping for G_ZEXT llvm-svn: 326589	2018-03-02 16:55:37 +00:00
Matt Arsenault	1c1aab99ae	AMDGPU/GlobalISel: InstrMapping for G_TRUNC llvm-svn: 326588	2018-03-02 16:55:33 +00:00
Matt Arsenault	ef8db767d7	AMDGPU/GlobalISel: Define InstrMappings for G_FCMP Patch by Tom Stellard llvm-svn: 326587	2018-03-02 16:53:15 +00:00
Matt Arsenault	2607dc60de	AMDGPU/GlobalISel: Define instruction mapping for @llvm.minnum Patch by Tom Stellard llvm-svn: 326586	2018-03-02 16:40:17 +00:00
Matt Arsenault	b46c191c49	AMDGPU/GlobalISel: Define instruction mapping for @llvm.maxnum Patch by Tom Stellard llvm-svn: 326567	2018-03-02 12:23:00 +00:00
Matt Arsenault	41d2e3d98e	AMDGPU/GlobalISel: Define instruction mapping for G_FPTOSI Patch by Tom Stellard llvm-svn: 326534	2018-03-02 02:19:16 +00:00
Matt Arsenault	b23041ad4d	AMDGPU/GlobalISel: Define instruction mapping for G_FPTOUI Patch by Tom Stellard llvm-svn: 326533	2018-03-02 02:19:11 +00:00
Matt Arsenault	327d5fb2e5	AMDGPU/GlobalISel: Define instruction mapping for G_FMUL llvm-svn: 326532	2018-03-02 02:17:01 +00:00
Matt Arsenault	5a9e834eac	AMDGPU/GlobalISel: Define instruction mapping for G_FADD Patch by Tom Stellard llvm-svn: 326526	2018-03-02 01:22:13 +00:00
Matt Arsenault	d99317f1b3	AMDGPU/GlobalISel: Define instruction mapping for G_SHL Patch by Tom Stellard llvm-svn: 326525	2018-03-02 01:22:10 +00:00
Matt Arsenault	3c7a123ccc	AMDGPU/GlobalISel: Define instruction mapping for G_XOR llvm-svn: 326524	2018-03-02 01:22:06 +00:00
Matt Arsenault	c0f34c9e36	AMDGPU/GlobalISel: Define instruction mapping for G_AND Patch by Tom Stellard llvm-svn: 326523	2018-03-02 01:22:01 +00:00
Matt Arsenault	364f12e8f9	AMDGPU/GlobalISel: Define instruction mapping for @llvm.amdgcn.cvt.pkrtz Patch by Tom Stellard llvm-svn: 326490	2018-03-01 21:25:30 +00:00
Matt Arsenault	5320ee4a05	AMDGPU/GlobalISel: Define instruction mapping for G_OR Patch by Tom Stellard llvm-svn: 326489	2018-03-01 21:25:25 +00:00
Matt Arsenault	62669ede94	AMDGPU/GlobalISel: Define instruction mapping for G_BITCAST Patch by Tom Stellard llvm-svn: 326482	2018-03-01 20:59:44 +00:00
Matt Arsenault	0529a8e2de	AMDGPU/GlobalISel: Mark i32->i64 zext as legal llvm-svn: 326481	2018-03-01 20:56:21 +00:00
Matt Arsenault	36b99e1937	AMDGPU/GlobalISel: InstrMapping for llvm.amdgcn.exp.compr Patch by Tom Stellard llvm-svn: 326479	2018-03-01 20:40:55 +00:00
Matt Arsenault	8931bbf8df	AMDGPU/GlobalISel: Define instruction mapping for @llvm.amdgcn.exp Patch by Tom Stellard llvm-svn: 326477	2018-03-01 20:24:37 +00:00
Matt Arsenault	50721ab325	AMDGPU/GlobalISel: Define InstrMappings for G_ICMP Patch by Tom Stellard llvm-svn: 326472	2018-03-01 19:27:10 +00:00
Matt Arsenault	dc14ec05d4	AMDGPU/GlobalISel: Make i32 mul legal llvm-svn: 326471	2018-03-01 19:22:05 +00:00
Matt Arsenault	06cbb27a79	AMDGPU/GlobalISel: Define instruction mapping for G_IMPLICIT_DEF Patch by Tom Stellard llvm-svn: 326470	2018-03-01 19:16:52 +00:00
Matt Arsenault	e3d9ecf2b9	AMDGPU/GlobalISel: Define instruction mapping for G_FCONSTANT Patch by Tom Stellard llvm-svn: 326468	2018-03-01 19:13:30 +00:00
Matt Arsenault	3f6a204eaa	AMDGPU/GlobalISel: Make i32 xor legal llvm-svn: 326466	2018-03-01 19:09:21 +00:00
Matt Arsenault	8e80a5fbca	AMDGPU/GlobalISel: Mark 32/64-bit G_FCMP as legal Patch by Tom Stellard llvm-svn: 326465	2018-03-01 19:09:16 +00:00
Matt Arsenault	dd022ce064	AMDGPU/GlobalISel: Mark 32-bit G_FPTOSI as legal Patch by Tom Stellard llvm-svn: 326464	2018-03-01 19:04:25 +00:00
Matt Arsenault	2a26a286db	AMDGPU/GlobalISel: Make f64 constants legal llvm-svn: 326101	2018-02-26 17:20:43 +00:00
Yaxun Liu	0124b5484c	[AMDGPU] Change constant addr space to 4 Differential Revision: https://reviews.llvm.org/D43170 llvm-svn: 325030	2018-02-13 18:00:25 +00:00
Tom Stellard	33445765dd	AMDGPU/GlobalISel: Mark 32-bit G_FPTOUI as legal Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D42152 llvm-svn: 324446	2018-02-07 04:47:59 +00:00
Puyan Lotfi	43e94b15ea	Followup on Proposal to move MIR physical register namespace to '$' sigil. Discussed here: http://lists.llvm.org/pipermail/llvm-dev/2018-January/120320.html In preparation for adding support for named vregs we are changing the sigil for physical registers in MIR to '$' from '%'. This will prevent name clashes of named physical register with named vregs. llvm-svn: 323922	2018-01-31 22:04:26 +00:00
Aditya Nandakumar	e6201c8724	[GISel]: Rework legalization algorithm for better elimination of artifacts along with DCE Legalization Artifacts are all those insts that are there to make the type system happy. Currently, the target needs to say all combinations of extends and truncs are legal and there's no way of verifying that post legalization, we only have truly legal instructions. This patch changes roughly the legalization algorithm to process all illegal insts at one go, and then process all truncs/extends that were added to satisfy the type constraints separately trying to combine trivial cases until they converge. This has the added benefit that, the target legalizerinfo can only say which truncs and extends are okay and the artifact combiner would combine away other exts and truncs. Updated legalization algorithm to roughly the following pseudo code. WorkList Insts, Artifacts; collect_all_insts_and_artifacts(Insts, Artifacts); do { for (Inst in Insts) legalizeInstrStep(Inst, Insts, Artifacts); for (Artifact in Artifacts) tryCombineArtifact(Artifact, Insts, Artifacts); } while(!Insts.empty()); Also, wrote a simple wrapper equivalent to SetVector, except for erasing, it avoids moving all elements over by one and instead just nulls them out. llvm-svn: 318210	2017-11-14 22:42:19 +00:00
Bjorn Pettersson	a42ed3e361	[MIRPrinter] Use %subreg.xxx syntax for subregister index operands Summary: Print %subreg.<subregidxname> instead of just the subregister index when printing immediate operands corresponding to subreg indices in INSERT_SUBREG, EXTRACT_SUBREG, SUBREG_TO_REG and REG_SEQUENCE. Reviewers: qcolombet, MatzeB Reviewed By: MatzeB Subscribers: nhaehnle, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D39696 llvm-svn: 317513	2017-11-06 21:46:06 +00:00
Tom Stellard	d0c6cf2e8c	AMDGPU/GlobalISel: Mark 32-bit G_FADD as legal Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D38439 llvm-svn: 316815	2017-10-27 23:57:41 +00:00
Justin Bogner	6c452834a1	MIR: Print the register class or bank in vreg defs This updates the MIRPrinter to include the regclass when printing virtual register defs, which is already valid syntax for the parser. That is, given 64 bit %0 and %1 in a "gpr" regbank, %1(s64) = COPY %0(s64) would now be written as %1:gpr(s64) = COPY %0(s64) While this change alone introduces a bit of redundancy with the registers block, it allows us to update the tests to be more concise and understandable and brings us closer to being able to remove the registers block completely. Note: We generally only print the class in defs, but there is one exception. If there are uses without any defs whatsoever, we'll print the class on all uses. I'm not completely convinced this comes up in meaningful machine IR, but for now the MIRParser and MachineVerifier both accept that kind of stuff, so we don't want to have a situation where we can print something we can't parse. llvm-svn: 316479	2017-10-24 18:04:54 +00:00
Justin Bogner	d45849f703	Canonicalize a large number of mir tests using update_mir_test_checks This converts a large and somewhat arbitrary set of tests to use update_mir_test_checks. I ran the script on all of the tests I expect to need to modify for an upcoming mir syntax change and kept the ones that obviously didn't change the tests in ways that might make it harder to understand. llvm-svn: 316137	2017-10-18 23:18:12 +00:00
Matt Arsenault	36b4b0bed7	AMDGPU: Remove -mcpu=SI Leftover from before amdgcn/r600 split. llvm-svn: 310277	2017-08-07 18:30:35 +00:00
Tom Stellard	3337d74399	AMDGPU/GlobalISel: Mark 32-bit G_FMUL as legal Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D36218 llvm-svn: 309898	2017-08-02 22:56:30 +00:00
Tom Stellard	9d8337d857	AMDGPU/GlobalISel: Add support for amdgpu_vs calling convention Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D35916 llvm-svn: 309675	2017-08-01 12:38:33 +00:00
Tom Stellard	55038cd1d3	AMDGPU/GlobalISel: Mark 32-bit G_OR as legal Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D35127 llvm-svn: 309165	2017-07-26 20:00:53 +00:00
Tom Stellard	eb8f1e27d9	AMDGPU/GlobalISel: Mark 32-bit G_SHL as legal Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D34589 llvm-svn: 306298	2017-06-26 15:56:52 +00:00
Tom Stellard	af552dc352	AMDGPU/GlobalISel: Mark 32-bit G_AND as legal Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D34349 llvm-svn: 306112	2017-06-23 15:17:17 +00:00
Tom Stellard	ff63ee0db5	AMDGPU/GlobalISel: Mark G_BITCAST s32 <--> <2 x s16> legal Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D34129 llvm-svn: 305692	2017-06-19 13:15:45 +00:00
Tom Stellard	ee6e6452df	AMDGPU/GlobalISel: Mark 32-bit G_ADD as legal Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D33992 llvm-svn: 305232	2017-06-12 20:54:56 +00:00
Matt Arsenault	fd02314113	AMDGPU: Start adding offset fields to flat instructions llvm-svn: 305194	2017-06-12 15:55:58 +00:00
Tom Stellard	2860a428f7	AMDGPU/GlobalISel: Mark 32-bit G_SELECT as legal Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D33949 llvm-svn: 304910	2017-06-07 13:54:51 +00:00
Tom Stellard	8cd60a5067	AMDGPU/GlobalISel: Mark 32-bit G_ICMP as legal Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D33890 llvm-svn: 304797	2017-06-06 14:16:50 +00:00
Vivek Pandya	56d87ef5d7	[Improve CodeGen Testing] This patch re-enables MIRPrinter printing of fields whose value equals the default. If the -simplify-mir option is passed, MIRPrinter will not print such fields. This change also required changing some lit test cases in the CodeGen directory. Reviewed By: MatzeB Differential Revision: https://reviews.llvm.org/D32304 llvm-svn: 304779	2017-06-06 08:16:19 +00:00
Tom Stellard	e042412ef1	AMDGPU/GlobalISel: Mark 1-bit integer constants as legal Summary: These are mostly legal, but will probably need special lowering for some cases. Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D33791 llvm-svn: 304628	2017-06-03 01:13:33 +00:00
Tom Stellard	dde28a8c92	AMDGPU/GlobalISel: Mark 32-bit float constants as legal Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D33212 llvm-svn: 304003	2017-05-26 16:40:03 +00:00
Matt Arsenault	2b1f9aa577	AMDGPU: Start defining a calling convention Partially implement callee-side for arguments and return values. byval doesn't work properly, and most likely neither do sret or other on-stack return values. llvm-svn: 303308	2017-05-17 21:56:25 +00:00
Tom Stellard	fab6b1af6e	AMDGPU: Add lit.local.cfg to disable global-isel tests when global-isel is disabled This should fix bots broken by r302919. llvm-svn: 302928	2017-05-12 17:59:30 +00:00
Tom Stellard	a0d67c748a	AMDGPU/GlobalISel: Mark 32-bit integer constants as legal Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D33115 llvm-svn: 302919	2017-05-12 16:46:46 +00:00
Matt Arsenault	47ccafe787	AMDGPU: Remove tfe bit from flat instruction definitions We don't use it and it was removed in gfx9, and the encoding bit repurposed. Additionally actually using it requires changing the output register class, which wasn't done anyway. llvm-svn: 302814	2017-05-11 17:38:33 +00:00
Matt Arsenault	3dbeefa978	AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernel Currently the default C calling convention functions are treated the same as compute kernels. Make this explicit so the default calling convention can be changed to a non-kernel. Converted with perl -pi -e 's/define void/define amdgpu_kernel void/' on the relevant test directories (and undoing in one place that actually wanted a non-kernel). llvm-svn: 298444	2017-03-21 21:39:51 +00:00
Ahmed Bougacha	931904d777	[GlobalISel] Don't select trivially dead instructions. Folding instructions when selecting can cause them to become dead. Don't select these dead instructions (if they don't have other side effects, and don't define physical registers). Preserve existing tests by adding COPYs. In some tests, the G_CONSTANT vregs never get constrained to a class: the only use of the vreg was folded into another instruction, so the G_CONSTANT, now dead, never gets selected. llvm-svn: 298224	2017-03-19 16:13:00 +00:00
Tom Stellard	124f5cc8c2	AMDGPU/SI: Fix inst-select-load-smrd.mir on some builds Summary: For some reason instructions are being inserted in the wrong order with some builds. I'm not sure why this is happening. Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, tpr, llvm-commits Differential Revision: https://reviews.llvm.org/D29325 llvm-svn: 293639	2017-01-31 15:24:11 +00:00
Tom Stellard	ca16621b2a	Re-commit AMDGPU/GlobalISel: Add support for simple shaders Fix build when global-isel is disabled and fix a warning. Summary: We can select constant/global G_LOAD, global G_STORE, and G_GEP. Reviewers: qcolombet, MatzeB, t.p.northover, ab, arsenm Subscribers: mehdi_amini, vkalintiris, kzhuravl, wdng, nhaehnle, mgorny, yaxunl, tony-tye, modocache, llvm-commits, dberris Differential Revision: https://reviews.llvm.org/D26730 llvm-svn: 293551	2017-01-30 21:56:46 +00:00
Tom Stellard	7a19d56f73	Revert "AMDGPU/GlobalISel: Add support for simple shaders" This reverts commit r293503. Revert while I investigate some of the buildbot failures. llvm-svn: 293509	2017-01-30 17:42:41 +00:00
Tom Stellard	e48f60aec8	AMDGPU/GlobalISel: Add support for simple shaders Summary: We can select constant/global G_LOAD, global G_STORE, and G_GEP. Reviewers: qcolombet, MatzeB, t.p.northover, ab, arsenm Subscribers: mehdi_amini, vkalintiris, kzhuravl, wdng, nhaehnle, mgorny, yaxunl, tony-tye, modocache, llvm-commits, dberris Differential Revision: https://reviews.llvm.org/D26730 llvm-svn: 293503	2017-01-30 17:09:15 +00:00
Tim Northover	0f140c769a	GlobalISel: move type information to MachineRegisterInfo. We want each register to have a canonical type, which means the best place to store this is in MachineRegisterInfo rather than on every MachineInstr that happens to use or define that register. Most changes following from this are pretty simple (you need an MRI anyway if you're going to be doing any transformations, so just check the type there). But legalization doesn't really want to check redundant operands (when, for example, a G_ADD only ever has one type) so I've made use of MCInstrDesc's operand type field to encode these constraints and limit legalization's work. As an added bonus, more validation is possible, both in MachineVerifier and MachineIRBuilder (coming soon). llvm-svn: 281035	2016-09-09 11:46:34 +00:00
Tim Northover	26e40bdb9b	GlobalISel: omit braces on MachineInstr types when there's only one. Tidies up the representation a bit in the common case. llvm-svn: 276772	2016-07-26 17:28:01 +00:00
Tim Northover	98a56eb7f4	GlobalISel: allow multiple types on MachineInstrs. llvm-svn: 276481	2016-07-22 22:13:36 +00:00
Tim Northover	62ae568bbb	GlobalISel: implement low-level type with just size & vector lanes. This should be all the low-level instruction selection needs to determine how to implement an operation, with the remaining context taken from the opcode (e.g. G_ADD vs G_FADD) or other flags not based on type (e.g. fast-math). llvm-svn: 276158	2016-07-20 19:09:30 +00:00
Tom Stellard	000c5af3e6	AMDGPU: Add skeleton GlobalIsel implementation Summary: This adds the necessary target code to be able to run the ir translator. Lowering function arguments and returns is a nop and there is no support for RegBankSelect. Reviewers: arsenm, qcolombet Subscribers: arsenm, joker.eph, vkalintiris, llvm-commits Differential Revision: http://reviews.llvm.org/D19077 llvm-svn: 266356	2016-04-14 19:09:28 +00:00

610 Commits