llvm-project

Commit Graph

Author	SHA1	Message	Date
Matt Arsenault	0de547ed4a	AMDGPU/GlobalISel: Ensure subreg is valid when selecting G_UNMERGE_VALUES Fixes verifier error with SGPR unmerges with 96-bit result types.	2020-08-04 12:27:34 -04:00
Matt Arsenault	1782fbbc69	GlobalISel: Reimplement moreElementsVectorDst Use pad with undef and unmerge with unused results. This is annoyingly similar to several other places in LegalizerHelper, but they're all slightly different.	2020-08-03 09:03:48 -04:00
Dominik Montada	052c962ced	[GlobalISel] Combine scalar unmerge(trunc) Summary: Combine unmerge(trunc) to enable other merge combines. Without this combine, the scalar unmerge(trunc(merge)) pattern cannot be combined and easily lead to hard-to-legalize merge/unmerge artifacts. Reviewed By: arsenm Tags: #llvm Differential Revision: https://reviews.llvm.org/D79567	2020-06-02 08:56:18 +02:00
Matt Arsenault	3af85fa8f0	GlobalISel: Handle more cases in lowerUnmergeValues Handle scalar sources, as well as vectors.	2020-05-09 19:33:32 -04:00
Matt Arsenault	ee1a69824d	GlobalISel: Combine G_UNMERGE_VALUES with G_TRUNC G_BITCAST can be lowered with a pair of G_UNMERGE_VALUES and G_MERGE_VALUES with different types, but G_UNMERGE_VALUES of a vector can also be implemented with a bitcast to a scalar, which introduces the possibility for infinite loops. Try to eliminate an illegal source register type in the artifact combiner to avoid this from happening. Avoids infinite looping in the legalizer in a future patch which allows lowering G_UNMERGE_VALUES of a vector source with a G_BITCAST.	2020-05-09 16:14:32 -04:00
Dominik Montada	e5d666d768	Revert "Revert "[GlobalISel] Fix invalid combine of unmerge(merge) with intermediate cast"" This reverts commit `1265899c5f`.	2020-04-16 09:30:34 +02:00
Dominik Montada	1265899c5f	Revert "[GlobalISel] Fix invalid combine of unmerge(merge) with intermediate cast" This reverts commit `bddac41b9f`.	2020-04-15 18:47:39 +02:00
Dominik Montada	bddac41b9f	[GlobalISel] Fix invalid combine of unmerge(merge) with intermediate cast Summary: The combine for unmerge(cast(merge)) is only valid for vectors, but was missing a corresponding check. Add a check that the operands are vectors to avoid an invalid combine. Without this check, the combiner would emit incorrect code for scalars and pointers because the artifact cast (trunc/ext) only affects bits at the end of the type, while this combine assumes that the casted bits appear between meaningful bits. This also uncovered a segmentation fault in the AMDGPU InstructionSelector. The tests triggering this bug have been moved to their own file and a check for the segmentation fault has been added. Reviewers: arsenm, dsanders, aemerson, paquette, aditya_nandakumar Reviewed By: arsenm Subscribers: tpr, jvesely, wdng, nhaehnle, rovka, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78191	2020-04-15 17:19:14 +02:00
Matt Arsenault	baa78179fe	AMDGPU/GlobalISel: Add a testcase for G_UNMERGE_VALUES legalization I had a note that this doesn't work, but it seems to now.	2020-03-24 21:54:43 -04:00
Dominik Montada	ccf49b9ef0	[GlobalISel] support widen unmerge if WideTy > SrcTy Summary: Widening G_UNMERGE_VALUES to a type which is larger than the original source type is the same as widening it to the same type as the source type: in both cases, G_UNMERGE_VALUES has to be replaced with bit arithmetic which. Although the arithmetic itself is independent of whether the source type is smaller or equal to the widen type, widening the source type to the widen type should result in less artifacts being emitted, since this is the type that the user explicitly requested. Reviewers: arsenm, dsanders, aemerson, aditya_nandakumar Reviewed By: arsenm, dsanders Subscribers: jvesely, wdng, nhaehnle, rovka, hiraditya, volkan, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76494	2020-03-23 09:16:45 +01:00
Matt Arsenault	57d896e838	AMDGPU/GlobalISel: Make some large merges legal We allow up to 1024-bit registers, so we should support merges all the way to the maximum.	2020-03-16 10:49:10 -04:00
Dominik Montada	c0241f150d	[GlobalISel] combine G_TRUNC with G_MERGE_VALUES Summary: Truncating the result of a merge means that most likely we could have done without merge in the first place and just used the input merge inputs directly. This can be done in three cases: 1. If the truncation result is smaller than the merge source, we can use the source in the trunc directly 2. If the sizes are the same, we can replace the register or use a copy 3. If the truncation size is a multiple of the merge source size, we can build a smaller merge This gets rid of most of the larger, hard-to-legalize merges. Reviewers: qcolombet, aditya_nandakumar, aemerson, paquette, arsenm, Petar.Avramovic Reviewed By: arsenm Subscribers: sdardis, jvesely, wdng, nhaehnle, rovka, jrtc27, atanasyan, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75915	2020-03-16 14:42:01 +01:00
Matt Arsenault	9087ef0765	GlobalISel: Allow CSE of G_IMPLICIT_DEF The legalizer produces a lot of these, and they make reading legalized MIR annoying. For some reason, this does seem to sometimes introduce copies of implicit def, which is dumb.	2020-02-05 17:47:21 -05:00
Matt Arsenault	bc101ffd77	GlobalISel: Support widening unmerge results with pointer source	2020-02-01 10:47:03 -05:00
Matt Arsenault	2a160ba5b0	GlobalISel: Reimplement widenScalar for G_UNMERGE_VALUES results Only use shifts if the requested type exactly matches the source type, and create sub-unmerges otherwise.	2020-01-27 06:18:26 -08:00
Matt Arsenault	5181c67feb	AMDGPU/GlobalISel: Add some baseline tests for unmerge legalization	2020-01-21 08:31:10 -05:00
Matt Arsenault	a66d2817ca	GlobalISel: Don't ignore requested ext narrowing type This was assuming the narrow target was the source type. Respect the requested type when these don't match by using intermediate merges. This avoids producing very wide, illegal shift expansions.	2020-01-16 14:29:37 -05:00
Matt Arsenault	91715617ad	GlobalISel: Fix narrowScalar for G_ANYEXT results This is nearly the same as G_ZEXT.	2020-01-15 08:58:57 -05:00
Craig Topper	a5376f6322	[GlobalISel][AArch64][AMDGPU][X86] Teach LegalizationArtifactCombiner to combine trunc(g_constant). This allows X86 to properly form shift by immediate instructions since we require an 8-bit constant to match the imported SelectionDAG patterns.	2019-10-24 12:59:26 -07:00
Petar Avramovic	d568ed40e0	[GlobalISel] Fix narrowScalar for shifts to match algorithm from SDAG Fix typos. Use Hi and Lo prefixes for Or instead of LHS and RHS to match names of surrounding variables. Differential Revision: https://reviews.llvm.org/D66587 llvm-svn: 370062	2019-08-27 14:22:32 +00:00
Matt Arsenault	954a012b4c	GlobalISel: Implement moreElementsVector for G_UNMERGE_VALUES sources This is necessary for handling <3 x s16> on AMDGPU, assuming this should be handled as 2 separate legalization actions. The alternative would be for fewerElementsVector to handle 3->2. llvm-svn: 369547	2019-08-21 16:59:10 +00:00
Matt Arsenault	28215caa60	GlobalISel: Partially implement fewerElementsVector G_UNMERGE_VALUES Odd sized vectors aren't handled yet. llvm-svn: 368713	2019-08-13 16:26:28 +00:00
Matt Arsenault	d9d30a408e	GlobalISel: Lower scalarizing unmerge of a vector to shifts AMDGPU sometimes has legal s16 and <2 x s16> operations, but all registers are really 32-bit. An unmerge destination really should be widened to a 32-bit register. If widening a scalarizing vector with a target size that matches the vector size, bitcast to integer and extract the relevant bits with shifts. I'm not sure if this is the right place for this. This could arguably be part of widenScalar for the result. I also have a growing feeling that we're missing a bitcast legalize action. llvm-svn: 367604	2019-08-01 19:10:05 +00:00
Amara Emerson	946b1246d6	[GlobalISel] Enable CSE in the IRTranslator & legalizer for -O0 with constants only. Other opcodes shouldn't be CSE'd until we can be sure debug info quality won't be degraded. This change also improves the IRTranslator so that in most places, but not all, it creates constants using the MIRBuilder directly instead of first creating a new destination vreg and then creating a constant. By doing this, the buildConstant() method can just return the vreg of an existing G_CONSTANT instead of having to create a COPY from it. I measured a 0.2% improvement in compile time and a 0.9% improvement in code size at -O0 ARM64. Compile time: Program base cse diff test-suite...ark/tramp3d-v4/tramp3d-v4.test 9.04 9.12 0.8% test-suite...Mark/mafft/pairlocalalign.test 2.68 2.66 -0.7% test-suite...-typeset/consumer-typeset.test 5.53 5.51 -0.4% test-suite :: CTMark/lencod/lencod.test 5.30 5.28 -0.3% test-suite :: CTMark/Bullet/bullet.test 25.82 25.76 -0.2% test-suite...:: CTMark/ClamAV/clamscan.test 6.92 6.90 -0.2% test-suite...TMark/7zip/7zip-benchmark.test 34.24 34.17 -0.2% test-suite :: CTMark/SPASS/SPASS.test 6.25 6.24 -0.1% test-suite...:: CTMark/sqlite3/sqlite3.test 1.66 1.66 -0.1% test-suite :: CTMark/kimwitu++/kc.test 13.61 13.60 -0.0% Geomean difference -0.2% Code size: Program base cse diff test-suite...-typeset/consumer-typeset.test 1315632 1266480 -3.7% test-suite...:: CTMark/ClamAV/clamscan.test 1313892 1297508 -1.2% test-suite :: CTMark/lencod/lencod.test 1439504 1423112 -1.1% test-suite...TMark/7zip/7zip-benchmark.test 2936980 2904172 -1.1% test-suite :: CTMark/Bullet/bullet.test 3478276 3445460 -0.9% test-suite...ark/tramp3d-v4/tramp3d-v4.test 8082868 8033492 -0.6% test-suite :: CTMark/kimwitu++/kc.test 3870380 3853972 -0.4% test-suite :: CTMark/SPASS/SPASS.test 1434904 1434896 -0.0% test-suite...Mark/mafft/pairlocalalign.test 764528 764528 0.0% test-suite...:: CTMark/sqlite3/sqlite3.test 782092 782092 0.0% Geomean difference -0.9% Differential Revision: https://reviews.llvm.org/D60580 llvm-svn: 358369	2019-04-15 05:04:20 +00:00
Amara Emerson	381188f1f3	[GlobalISel] Fix legalizer artifact combiner from crashing with invalid dead instructions. The artifact combiners push instructions which have been marked for deletion onto a list for the legalizer to deal with on return. However, for trunc(ext) combines the combiner routine recursively calls itself. When it does this the dead instructions list may not be empty, and the other combiners don't expect to be dealing with essentially invalid MIR (multiple vreg defs etc). This change fixes it by ensuring that the dead instructions are processed on entry into tryCombineInstruction. As a result, this fix exposed a few places in tests where G_TRUNC instructions were not being deleted even though they were dead. Differential Revision: https://reviews.llvm.org/D59892 llvm-svn: 357101	2019-03-27 17:47:42 +00:00
Matt Arsenault	f4bfe4cd17	AMDGPU/GlobalISel: Fix bit ops for non-power-of-2 sizes llvm-svn: 354825	2019-02-25 21:32:48 +00:00
Matt Arsenault	18ec382698	GlobalISel: Implement moreElementsVector for implicit_def llvm-svn: 353754	2019-02-11 22:00:39 +00:00
Matt Arsenault	fbec8fe93b	GlobalISel: Implement narrowScalar for shift main type This is pretty much directly ported from SelectionDAG. Doesn't include the shift by non-constant but known bits version, since there isn't a globalisel version of computeKnownBits yet. This shows a disadvantage of targets not specifically which type should be used for the shift amount. If type 0 is legalized before type 1, the operations on the shift amount type use the wider type (which are also less likely to legalize). This can be avoided by targets specifying legalization actions on type 1 earlier than for type 0. llvm-svn: 353455	2019-02-07 19:37:44 +00:00
Matt Arsenault	888aa5dedd	GlobalISel: Implement widenScalar for G_UNMERGE_VALUES For the scalar case only. Also move the similar G_MERGE_VALUES handling to a separate function and cleanup to make them look more similar. llvm-svn: 352979	2019-02-03 00:07:33 +00:00
Matt Arsenault	ff6a9a275b	AMDGPU/GlobalISel: Fix some crashes in g_unmerge_values/g_merge_values This was crashing in the predicate function assuming the value is a vector. Copy more of what AArch64 uses. This probably needs more refinement later, but I don't exactly understand what it means in some cases, particularly since any legalization for these seems to be missing. llvm-svn: 351693	2019-01-20 18:40:36 +00:00
Amara Emerson	5ec146046c	[GlobalISel] Restrict G_MERGE_VALUES capability and replace with new opcodes. This patch restricts the capability of G_MERGE_VALUES, and uses the new G_BUILD_VECTOR and G_CONCAT_VECTORS opcodes instead in the appropriate places. This patch also includes AArch64 support for selecting G_BUILD_VECTOR of <4 x s32> and <2 x s64> vectors. Differential Revisions: https://reviews.llvm.org/D53629 llvm-svn: 348788	2018-12-10 18:44:58 +00:00
Daniel Sanders	acc008cb0c	[globalisel] Remove redundant -global-isel option from tests that use -run-pass. NFC As Roman Tereshin pointed out in https://reviews.llvm.org/D45541, the -global-isel option is redundant when -run-pass is given. -global-isel sets up the GlobalISel passes in the pass manager but -run-pass skips that entirely and configures its own pipeline. llvm-svn: 331603	2018-05-05 21:19:59 +00:00
Matt Arsenault	503afda95f	AMDGPU/GlobalISel: Make some G_MERGE_VALUEs legal llvm-svn: 327267	2018-03-12 13:35:43 +00:00

33 Commits