llvm-project

Commit Graph

Author	SHA1	Message	Date
Matt Arsenault	e75afc9acf	GlobalISel: Use unmerge when copying wide vectors to result registers Avoid using G_EXTRACT and move towards a more consistent vector legalization strategy.	2020-09-24 15:19:51 -04:00
Matt Arsenault	18bbd9f15e	GlobalISel: Artifact combine unmerge of unmerge Unmerges have the same fundamental problem as G_TRUNC, and G_TRUNC could be implemented in terms of G_UNMERGE_VALUES. Reducing the number of elements in unmerge results ends up producing the original unmerge type profile, so the artifact combiner needs to eliminate the intermediate illegal registers. This avoids infinite looping in the legalizer in a future change. Assuming an unmerge has each result unmerged the same way, this ends up producing a new unmerge of the source for every definition. I'm not sure if the artifact combiner should either insert temporary merges here and erase the original merge, or if the combiner should look at uses from defs rather than defs from uses for unmerges. In a few cases this regresses from using 16-bit shifts for 8-bit values to using 32-bit shifts, but I think these can be legalized later (the other legalization rules don't try very hard to use 16-bit shifts either).	2020-09-01 11:01:33 -04:00
Matt Arsenault	0d2fe90063	AMDGPU/GlobalISel: Use more accurate legality rules for merge/unmerge Most notably, we were incorrectly reporting <3 x s16> as a legal type for these. Make sure these aren't legal to help make progress on fixing the artifact combiner and vector legalizer rules. Unfortunately, this means spreading the -global-isel-abort=0 hack, although this doesn't change the legalizer result in any situation.	2020-08-25 09:40:20 -04:00
Matt Arsenault	bdb25b3ce5	AMDGPU/GlobalISel: Use different technique for sample v3s16 values Avoid relying on implicit_def values, and odd sized G_INSERT/G_EXTRACT	2020-08-24 10:07:30 -04:00
Matt Arsenault	901e3317fe	GlobalISel: Merge FewerElements for G_BUILD_VECTOR/G_CONCAT_VECTORS This switches from using G_EXTRACT in odd cases to widen with undef and unmerge.	2020-08-22 10:25:53 -04:00
Matt Arsenault	93311a9812	AMDGPU/GlobalISel: Fix custom lowering of llvm.trunc.f64 for SI This was missing an operand from BFE and not erasing the original instruction.	2020-07-20 10:06:18 -04:00
Matt Arsenault	49ae0fc2f0	GlobalISel: Fix incorrect lowering G_FCOPYSIGN In the basic case, this was reading the sign from the wrong operand.	2020-04-10 21:00:25 -04:00
Matt Arsenault	19a0350187	GlobalISel: Fix round lowering I used the implementation for floor instead of round. It also turns out the OpenCL builtin library wasn't using the round builtin, but implemented the expanded form.	2020-03-16 11:37:30 -04:00
Matt Arsenault	9087ef0765	GlobalISel: Allow CSE of G_IMPLICIT_DEF The legalizer produces a lot of these, and they make reading legalized MIR annoying. For some reason, this does seem to sometimes introduce copies of implicit def, which is dumb.	2020-02-05 17:47:21 -05:00
Matt Arsenault	dfa9420f09	AMDGPU/GlobalISel: Don't use legal v2s16 G_BUILD_VECTOR If we have s_pack_* instructions, legalize this to G_BUILD_VECTOR_TRUNC from s32 elements. This is closer to how the s_pack_* instructions really behave. If we don't have s_pack_* instructions, expand this by creating a merge to s32 and bitcasting. This expands to the expected bit operations. I think this eventually should go in a new bitcast legalize action type in LegalizerHelper. We already directly emit the shift operations in RegBankSelect for the vector case. This could possibly be cleaned up, but I also may want to defer doing this expansion to selection anyway. I'll see about that when I try to actually match VOP3P instructions. This breaks the selection of the build_vector since tablegen doesn't know how to match G_BUILD_VECTOR_TRUNC yet, so just xfail it for now.	2020-02-05 11:52:18 -05:00
Matt Arsenault	f3de8ab5cc	GlobalISel: Implement lower for G_INTRINSIC_ROUND Mostly copied from AMDGPU lowering implementation, except used G_SITOFP instead of directly creating a select on -1.0, 0.0.	2020-01-06 18:26:42 -05:00
Matt Arsenault	2e5f900849	GlobalISel: fewerElementsVector for intrinsic_trunc/intrinsic_round llvm-svn: 352298	2019-01-27 00:12:21 +00:00
Matt Arsenault	f4c21c575a	AMDGPU/GlobalISel: RegBankSelect for some fp ops llvm-svn: 349880	2018-12-21 03:14:45 +00:00

13 Commits