llvm-project

Commit Graph

Author	SHA1	Message	Date
Dominik Montada	9fedb6900d	[GlobalISel] add helper function to create arbitrary libcalls Summary: The existing helper function can only create a libcall to functions available in RTLIB. Add a helper function that can create a libcall to a given function name using the provided calling convention. Reviewers: aditya_nandakumar, t.p.northover, rovka, arsenm, dsanders Reviewed By: arsenm Subscribers: wdng, hiraditya, volkan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76845	2020-03-26 16:11:13 +01:00
Matt Arsenault	39c55cef21	GlobalISel: Introduce bitcast legalize action For some operations, the type is unimportant and only the number of bits matters. For example, I don't want to treat <4 x s8> as a legal type, but I also don't want to decompose loads of this into smaller pieces to get legal register types. On AMDGPU in SelectionDAG, we legalize a number of operations (most notably load and store) by coercing all types to vectors of i32. For GlobalISel, I'm trying very hard to avoid doing this for every type, but I don't think this strategy can be completely avoided. I'm trying to avoid bitcasts for any legitimately legal type we can operate on, since the intervening bitcasts have proven to be a hassle. For loads, I think I can get away without ever casting the result type, and handling any arbitrary bitwidth during selection (I will eventually want new tablegen support to help with this, rather than having to add every possible type as legal). The unmerge required to do anything with the value should expand to the expected shifts. This is trickier for stores, since it would now require handling a wide array of truncates during selection, which I don't want. A potentially interesting future case is vector indexing, where sub-dword types should be indexed in s32 pieces.	2020-03-24 19:33:33 -04:00
Jessica Paquette	02187ed45a	[GlobalISel] Combine G_SELECTs of the form (cond ? x : x) into x When we find something like this: ``` %a:_(s32) = G_SOMETHING ... ... %select:_(s32) = G_SELECT %cond(s1), %a, %a ``` We can remove the select and just replace it entirely with `%a` because it's always going to result in `%a`. Same if we have ``` %select:_(s32) = G_SELECT %cond(s1), %a, %b ``` where we can deduce that `%a == %b`. This implements the following cases: - `%select:_(s32) = G_SELECT %cond(s1), %a, %a` -> `%a` - `%select:_(s32) = G_SELECT %cond(s1), %a, %some_copy_from_a` -> `%a` - `%select:_(s32) = G_SELECT %cond(s1), %a, %b` -> `%a` when `%a` and `%b` are defined by identical instructions This gives a few minor code size improvements on CTMark at -O3 for AArch64. Differential Revision: https://reviews.llvm.org/D76523	2020-03-23 16:46:03 -07:00
Matt Arsenault	aa63eb6a46	GlobalISel: Add computeKnownBitsForTargetInstr I think we can save the MRI argument from these since it's in GISelKnownBits already, but currently not accessible. Implementation deferred to avoid dependency on other patches.	2020-03-23 15:02:30 -04:00
Jay Foad	0444d16a16	[GlobalISel] Add generic opcodes for saturating add/subtract Summary: Add new generic MIR opcodes G_SADDSAT etc. Add support in IRTranslator for translating the saturating add/subtract intrinsics to the new opcodes. Reviewers: aemerson, dsanders, paquette, arsenm Subscribers: jvesely, wdng, nhaehnle, rovka, hiraditya, volkan, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76600	2020-03-23 15:16:45 +00:00
Dominik Montada	ccf49b9ef0	[GlobalISel] support widen unmerge if WideTy > SrcTy Summary: Widening G_UNMERGE_VALUES to a type which is larger than the original source type is the same as widening it to the same type as the source type: in both cases, G_UNMERGE_VALUES has to be replaced with bit arithmetic. Although the arithmetic itself is independent of whether the source type is smaller than or equal to the widen type, widening the source type to the widen type should result in fewer artifacts being emitted, since this is the type that the user explicitly requested. Reviewers: arsenm, dsanders, aemerson, aditya_nandakumar Reviewed By: arsenm, dsanders Subscribers: jvesely, wdng, nhaehnle, rovka, hiraditya, volkan, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76494	2020-03-23 09:16:45 +01:00
Adrian Kuegel	baa6f6a782	Revert "[TableGen][GlobalISel] Account for HwMode in RegisterBank register sizes" This reverts commit `e9f22fd429`. When building with -DLLVM_USE_SANITIZER="Thread", check-llvm has 70 failing tests with this revision, and 29 without this revision.	2020-03-20 11:02:50 +01:00
Jessica Paquette	c999084619	[GlobalISel] Port some basic shufflevector undef combines from the DAGCombiner Port over the following: - shuffle undef, undef, any_mask -> undef - shuffle anything, anything, undef_mask -> undef This sort of thing shows up a lot when you try to bugpoint code containing shufflevector. Differential Revision: https://reviews.llvm.org/D76382	2020-03-19 16:46:06 -07:00
lewis-revill	e9f22fd429	[TableGen][GlobalISel] Account for HwMode in RegisterBank register sizes This patch generates TableGen descriptions for the specified register banks which contain a list of register sizes corresponding to the available HwModes. The appropriate size is used during codegen according to the current HwMode. As this HwMode was not available on generation, it is set upon construction of the RegisterBankInfo class. Targets simply need to provide the HwMode argument to the <target>GenRegisterBankInfo constructor. The RISC-V RegisterBankInfo constructor has been updated accordingly (plus an unused argument removed). Differential Revision: https://reviews.llvm.org/D76007	2020-03-18 19:52:23 +00:00
Jessica Paquette	dc5f982639	[GlobalISel] Port some basic undef combines from DAGCombiner.cpp This ports some combines from DAGCombiner.cpp which perform some trivial transformations on instructions with undef operands. Not having these can make it extremely annoying to find out where we differ from SelectionDAG by looking at existing lit tests. Without them, we tend to produce pretty bad code generation when we run into instructions which use undef operands. Also remove the nonpow2_store_narrowing testcase from arm64-fallback.ll, since we no longer fall back on the add. Differential Revision: https://reviews.llvm.org/D76339	2020-03-18 11:05:44 -07:00
Matt Arsenault	2e77362626	GlobalISel: Fix lower bswap for vectors This would hit an assertion from trying to use the wrong bitwidth for the constants.	2020-03-16 13:59:08 -04:00
Matt Arsenault	19a0350187	GlobalISel: Fix round lowering I used the implementation for floor instead of round. It also turns out the OpenCL builtin library wasn't using the round builtin, but implemented the expanded form.	2020-03-16 11:37:30 -04:00
Dominik Montada	8ff2dcb18b	[GlobalISel] add additional lowering support for G_INSERT Summary: Add lowering support for inserting pointers or scalars into scalars, vectors or pointers Reviewers: arsenm, dsanders Reviewed By: arsenm Subscribers: jvesely, wdng, nhaehnle, rovka, hiraditya, volkan, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75994	2020-03-16 16:27:17 +01:00
Dominik Montada	6b96623dcb	[GlobalISel] fix crash in narrowScalarExtract if DstRegs only has one register Summary: When narrowing a scalar G_EXTRACT where the destination lines up perfectly with a single result of the emitted G_UNMERGE_VALUES a COPY should be emitted instead of unconditionally trying to emit a G_MERGE_VALUES. Reviewers: arsenm, dsanders Reviewed By: arsenm Subscribers: wdng, rovka, hiraditya, volkan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75743	2020-03-12 09:14:35 +01:00
Matt Arsenault	c0ad75e758	GlobalISel: Don't try to narrow extending loads/trunc store If the loaded memory size was smaller than the result size, this would produce out of bounds memory accesses. I'm wondering if we need a distinct narrow memory legalize action type, since a case I care about is decomposing a 4-byte unaligned access into 4 extending loads, which would leave the original result register type. I'm currently awkwardly using narrowScalar to handle unaligned accesses that need to be split.	2020-03-10 23:34:10 -04:00
Matt Arsenault	b17a81f8b2	GlobalISel: Add missing add/sub with carries to MachineIRBuilder	2020-03-10 22:39:55 -04:00
Matt Arsenault	ce8a1f7294	GlobalISel: Implement fewerElementsVector for G_TRUNC Extend fewerElementsVectorBasic to handle operands with different element types.	2020-03-10 15:17:20 -07:00
Amara Emerson	c1a97e992d	Revert "Revert "[GlobalISel][Localizer] Enable intra-block localization of already-local uses."" This reverts commit `5583c2f2fb`. The lldb bot failure was a test that was fragile and sensitive to irrelevant changes in instruction ordering. Re-committing this as the test should have been skipped for AArch64 now. Differential Revision: https://reviews.llvm.org/D75555	2020-03-06 21:35:08 -08:00
Dominik Montada	feb20a1594	[GlobalISel] add missing libcalls and 128-bit support for floating points Add libcall support for G_FMINNUM, G_FMAXNUM, G_FSQRT, G_FRINT, G_FNEARBYINT. Add 128-bit libcall support for all simple libcalls. Reviewers: arsenm, Petar.Avramovic, dsanders, petarj, paquette Subscribers: wdng, rovka, hiraditya, volkan, llvm-commits Differential Revision: https://reviews.llvm.org/D75516	2020-03-06 09:06:13 +01:00
Muhammad Omair Javaid	5583c2f2fb	Revert "[GlobalISel][Localizer] Enable intra-block localization of already-local uses." This reverts commit `e91e1df6ab`.	2020-03-05 03:12:28 +05:00
Matt Arsenault	b71203a751	GlobalISel: Move some legalizer functions to utils	2020-03-04 16:40:00 -05:00
Matt Arsenault	fb0c35fa34	GlobalISel: Set alignment on function argument stack load/store	2020-03-04 16:38:46 -05:00
Amara Emerson	e91e1df6ab	[GlobalISel][Localizer] Enable intra-block localization of already-local uses. This changes the localizer to attempt intra-block localization of instructions that have local uses. This is useful because sometimes the entry block itself has many uses of constant-like instructions, which would benefit from shortening live ranges. Previously, if an inst had no non-local uses, we wouldn't add it to the list of instructions to attempt further intra-block localization. This gives a 0.7% geomean code size improvement on CTMark. Differential Revision: https://reviews.llvm.org/D75555	2020-03-03 18:14:57 -08:00
Volkan Keles	4167645d1e	GlobalISel: Move Localizer::shouldLocalize(..) to TargetLowering Add a new target hook for shouldLocalize so that targets can customize the logic. https://reviews.llvm.org/D75207	2020-03-02 09:15:40 -08:00
Matt Arsenault	6fc0d00823	GlobalISel: Fix lowering for G_UADDE/G_USUBE The type parameter passed into lower is invalid and should be removed from the function.	2020-02-26 19:10:52 -08:00
Matt Arsenault	c7e8d8b13e	GlobalISel: Cleanup code with MachineIRBuilder features	2020-02-26 19:10:34 -08:00
Quentin Colombet	5bf0023b0d	[GISel][KnownBits] Update a comment regarding the effect of cache on PHIs Unlike what I claimed in my previous commit, the caching is actually not NFC on PHIs. When we put a big enough max depth, we end up simulating loops. The cache is effectively cutting the simulation short and we get less information as a result. E.g., ``` v0 = G_CONSTANT i8 0xC0 jump v1 = G_PHI i8 v0, v2 v2 = G_LSHR i8 v1, 1 ``` Let's say we want the known bits of v1. - With cache: Set v1 cache to we know nothing v1 is v0 & v2 v0 gives us 0xC0 v2 gives us known bits of v1 >> 1 v1 is in the cache => v1 is 0, thus v2 is 0x80 Finally v1 is v0 & v2 => 0x80 - Without cache and enough depth to do two iterations of the loop: v1 is v0 & v2 v0 gives us 0xC0 v2 gives us known bits of v1 >> 1 v1 is v0 & v2 v0 is 0xC0 v2 is v1 >> 1 Reach the max depth for v1... unwinding v1 is know nothing v2 is 0x80 v0 is 0xC0 v1 is 0x80 v2 is 0xC0 v0 is 0xC0 v1 is 0xC0 Thus now v1 is 0xC0 instead of 0x80. I've added a unittest demonstrating that. NFC	2020-02-25 15:56:15 -08:00
Jay Foad	ccee390767	GlobalISel: NFC minor cleanup to avoid a couple of fixed size local arrays	2020-02-25 09:49:19 +00:00
Matt Arsenault	11e3dde625	GlobalISel: Reimplement fewerElementsVectorBasic Changes the handling of odd breakdowns, and avoids using G_EXTRACT/G_INSERT. Pad with undef to a wider size, and unmerge. Also avoid introducing instructions for the fully undef components.	2020-02-24 21:19:47 -05:00
Quentin Colombet	b6d63c92ec	[GISel][KnownBits] Suppress unused warning on the dump method NFC	2020-02-21 21:07:04 -08:00
Quentin Colombet	618dec2aef	[GISel][KnownBits] Add a cache mechanism to speed compile time This patch adds a cache that is valid only for the duration of a call to getKnownBits. With such short lived cache we avoid all the problems of cache invalidation while still getting the benefits of reusing the information we already computed. This cache is useful whenever an instruction occurs more than once in a chain of computation. E.g., v0 = G_ADD v1, v2 v3 = G_ADD v0, v1 Previously we would compute the known bits for: v1, v2, v0, then v1 again and finally v3. With the patch, now we won't have to recompute v1 again. NFC	2020-02-21 14:31:42 -08:00
Jay Foad	cab39e4b8c	GlobalISel: Fix narrowing of (G_ASHR i64:x, 32) Reviewers: arsenm Subscribers: jvesely, wdng, nhaehnle, rovka, hiraditya, volkan, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74950	2020-02-21 16:51:03 +00:00
Quentin Colombet	e4a9225f5d	[GISel][KnownBits] Give up on PHI analysis as soon as we don't know anything When analyzing PHIs, we gather the known bits for every operand and merge them together to get the known bits of the result of the PHI. It is not unusual for merging the information to leave nothing known about the result (e.g., phi a: i8 3, b: i8 unknown, ...: after looking at the second argument we know that nothing will be known about the result). As soon as we reach that state, stop analyzing the remaining operands (i.e., in the previous example, we won't process anything after looking at `b`). This improves compile time, in particular for PHIs with a large number of operands. NFC.	2020-02-20 11:34:01 -08:00
Aditya Nandakumar	b91d9ec0bb	[GlobalISel]: Fix some non determinism exposed in CSE due to not notifying observers about mutations + add verification for CSE https://reviews.llvm.org/D67133 While investigating some non determinism (CSE doesn't produce wrong code, it just doesn't CSE sometimes) in GISel CSE on an out of tree target, I realized that the core issue was that there was a lot of code that mutates (setReg, setRegClass, etc.) but doesn't notify observers (CSE in this case, but this could be any other observer). In order to make the Observer available in various parts of the code and to avoid having to thread it through various APIs, the MachineFunction now has the observer as a field. This allows it to be easily used in helper functions such as constrainOperandRegClass. Also added an invariant verification method in CSEInfo which can catch these issues (when CSE is enabled).	2020-02-18 14:54:57 -08:00
Matt Arsenault	0e2eb357e0	GlobalISel: Extend narrowing to G_ASHR	2020-02-17 10:42:59 -08:00
Matt Arsenault	8550859535	GlobalISel: Extend shift narrowing to G_SHL	2020-02-17 09:13:37 -08:00
Matt Arsenault	78d455adf0	GlobalISel: Add combine to narrow G_LSHR Produce an unmerge to a narrower type and introduce a narrower shift if needed. I wasn't sure if there was a better way to parameterize the target's preferred shift type for the GICombineRule, so manually call the combine helper.	2020-02-17 08:04:52 -08:00
Matt Arsenault	3bb0ff8341	GlobalISel: Remove unused function argument	2020-02-14 15:57:39 -08:00
Matt Arsenault	bfbfa18591	GlobalISel: Lower s64->s16 G_FPTRUNC This is more or less directly ported from the AMDGPU custom lowering for FP_TO_FP16. I made a few minor fixups (using G_UNMERGE_VALUES instead of creating shift/trunc to extract the two halves, and zexting an inverted compare instead of select_cc). This also does not include the fast math expansion in the DAG, which converts to f32 and then to f16. I think that belongs in a pre-legalize combine instead.	2020-02-14 10:46:58 -08:00
Volkan Keles	187686a22f	[GlobalISel] LegalizationArtifactCombiner: Fix a bug in tryCombineMerges Like COPY instructions explained in D70616, we don't check the constraints when combining G_UNMERGE_VALUES. Use the same logic used in D70616 to check if registers can be replaced, or a COPY instruction needs to be built. https://reviews.llvm.org/D70564	2020-02-14 10:45:58 -08:00
Matt Arsenault	de256478e6	GlobalISel: Don't use LLT references These should always be passed by value	2020-02-13 15:25:30 -05:00
Jay Foad	32aac25637	[KnownBits] Introduce anyext instead of passing a flag into zext Summary: This was a very odd API, where you had to pass a flag into a zext function to say whether the extended bits really were zero or not. All callers passed in a literal true or false. I think it's much clearer to make the function name reflect the operation being performed on the value we're tracking (rather than on the KnownBits Zero and One fields), so zext means the value is being zero extended and new function anyext means the value is being extended with unknown bits. NFC. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74482	2020-02-12 19:06:53 +00:00
Amara Emerson	067dd9c6b1	[GlobalISel][CallLowering] Use stripPointerCasts(). A downstream test exposed a simple logic bug with the manual pointer stripping code, fix that by just using stripPointerCasts() on the value. I don't think there's a way to expose this issue upstream.	2020-02-10 15:43:57 -08:00
Amara Emerson	21c9d9ad43	[GlobalISel][CallLowering] Tighten constantexpr check for callee. I'm not sure there's a test case for this, but it's better to be safe.	2020-02-09 22:59:48 -08:00
Matt Arsenault	312a9d1b83	GlobalISel: Fix narrowScalar for G_{CTLZ\|CTTZ}_ZERO_UNDEF Narrow these for 64-bit VALU for AMDGPU.	2020-02-09 19:02:38 -05:00
Matt Arsenault	6135f5eda4	GlobalISel: Fix narrowing of G_CTLZ/G_CTTZ The result type is separate from the source type.	2020-02-09 18:11:43 -05:00
Amara Emerson	35c63d66aa	[GlobalISel][CallLowering] Look through bitcasts from constant function pointers. Calls to ObjC's objc_msgSend function are done by bitcasting the function global to the required function type signature. This patch looks through this bitcast so that we can do a direct call with bl on arm64 instead of using an indirect blr. Differential Revision: https://reviews.llvm.org/D74241	2020-02-07 15:32:54 -08:00
Petar Avramovic	7df5fc9e03	[GlobalISel] Add buildMerge with SrcOp initializer list Allows more flexible use of buildMerge in places where use operands are available as SrcOp since it does not require explicit conversion to Register. Simplify code with new buildMerge. Differential Revision: https://reviews.llvm.org/D74223	2020-02-07 18:43:45 +01:00
Amara Emerson	28d22c2c9c	[GlobalISel][IRTranslator] Add special case support for ~memory inline asm clobber. This is a one off special case, since actually implementing full inline asm support will be much more involved. This lets us compile a lot more code as a common simple case. Differential Revision: https://reviews.llvm.org/D74201	2020-02-07 08:55:23 -08:00
Matt Arsenault	3b198518ad	GlobalISel: Fix narrowing of G_CTPOP The result type is separate from the source type. Tests will be included in a future AMDGPU patch which uses this from RegBankSelect/applyMappingImpl.	2020-02-07 06:58:00 -08:00
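
The zext/anyext distinction described in commit 32aac25637 can be illustrated with a toy model. This is a minimal Python sketch, not LLVM's C++ `KnownBits` class: the dataclass, its `width`/`zero`/`one` fields, and the method signatures are assumptions made for illustration only. It shows the one idea the commit message states: `zext` means the value is zero extended, so the new high bits are known to be zero, while `anyext` means the value is extended with unknown bits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnownBits:
    """Toy model (hypothetical, not LLVM's API).

    `zero` has a bit set where the value is known to be 0,
    `one` has a bit set where the value is known to be 1.
    """
    width: int
    zero: int
    one: int

    def _high_mask(self, new_width: int) -> int:
        # Mask covering only the bits added by the extension.
        if new_width < self.width:
            raise ValueError("extension must not shrink the value")
        return ((1 << new_width) - 1) & ~((1 << self.width) - 1)

    def zext(self, new_width: int) -> "KnownBits":
        # The value is zero extended: every new high bit is known zero.
        return KnownBits(new_width, self.zero | self._high_mask(new_width), self.one)

    def anyext(self, new_width: int) -> "KnownBits":
        # The value is extended with unknown bits: nothing new is learned.
        self._high_mask(new_width)  # only validates new_width
        return KnownBits(new_width, self.zero, self.one)


# An 8-bit value whose top two bits are known 1 and low four bits known 0.
kb = KnownBits(width=8, zero=0x0F, one=0xC0)
print(hex(kb.zext(16).zero), hex(kb.anyext(16).zero))
```

In this sketch both extensions keep what was already known about the low 8 bits; they differ only in whether the added bits enter the known-zero mask.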


1149 Commits