llvm-project

Author	SHA1	Message	Date
Amara Emerson	13af1ed8e3	[GlobalISel] Support for inlining memcpy, memset and memmove calls. This introduces a new family of combiner helper routines that re-use the target specific cost model from SelectionDAG, and generate inline implementations of the memcpy family of intrinsics. The combines are only enabled at optimization levels higher than -O0, and give very substantial performance improvements. Differential Revision: https://reviews.llvm.org/D65167 llvm-svn: 366951	2019-07-24 22:17:31 +00:00
Amara Emerson	a1997ce2e5	[AArch64][GlobalISel] Fix a crash during s128 G_ICMP legalization due to r366317. r366317 added a legalization for s128 G_ICMP narrow scalar which tried to hard code the result type of the new legalized G_SELECT. Change this to instead use type of the original G_ICMP result and allow the target to legalize it if necessary later. llvm-svn: 366943	2019-07-24 20:46:42 +00:00
Aditya Nandakumar	d7504a1569	[GISel]: Attach missing range metadata while translating G_LOADs https://reviews.llvm.org/D65048 Attach range information to G_LOAD when only defining one register. reviewed by: arsenm llvm-svn: 366656	2019-07-21 14:07:54 +00:00
Amara Emerson	cf12c7815f	[GlobalISel] Translate calls to memcpy et al to G_INTRINSIC_W_SIDE_EFFECTs and legalize later. I plan on adding memcpy optimizations in the GlobalISel pipeline, but we can't do that unless we delay lowering to actual function calls. This patch changes the translator to generate G_INTRINSIC_W_SIDE_EFFECTS for these functions, and then have each target specify that using the new custom legalizer for intrinsics hook that they want it expanded into a libcall. Differential Revision: https://reviews.llvm.org/D64895 llvm-svn: 366516	2019-07-19 00:24:45 +00:00
Matt Arsenault	0966dd0d69	GlobalISel: Handle widenScalar of arbitrary G_MERGE_VALUES sources Extract the sources to the GCD of the original size and target size, padding with implicit_def as necessary. Also fix the case where the requested source type is wider than the original result type. This was ignoring the type, and just using the destination. Do the operation in the requested type and truncate back. llvm-svn: 366367	2019-07-17 20:22:44 +00:00
Matt Arsenault	914a59cad8	GlobalISel: Handle more cases for widenScalar of G_MERGE_VALUES Use an anyext to the requested type for the leftover operand to produce a slightly wider type, and then truncate the final merge. I have another implementation almost ready which handles arbitrary widens, but I think it produces worse code in this example (which I think is 90% due to not folding redundant copies or folding out implicit_def users), so I wanted to add this as a baseline first. llvm-svn: 366366	2019-07-17 20:22:38 +00:00
Petar Avramovic	1e62635d05	[MIPS GlobalISel] ClampScalar and select pointer G_ICMP Add narrowScalar to half of original size for G_ICMP. ClampScalar G_ICMP's operands 2 and 3 to s32. Select G_ICMP for pointers for MIPS32. Pointer compare is same as for integers, it is enough to declare them as legal type. Differential Revision: https://reviews.llvm.org/D64856 llvm-svn: 366317	2019-07-17 12:08:01 +00:00
Matt Arsenault	1c3f4ec7fc	GlobalISel: Add overload of handleAssignments with CCState AMDGPU needs to allocate special argument registers separately from the user function argument list, so needs direct control over the CCState. The ArgLocs argument is only really necessary because CCState doesn't allow access to it. llvm-svn: 366279	2019-07-16 22:41:34 +00:00
Matt Arsenault	434d664095	GlobalISel: Implement narrowScalar for vector extract/insert indexes llvm-svn: 366113	2019-07-15 19:37:34 +00:00
Fangrui Song	b251cc0d91	Delete dead stores llvm-svn: 365903	2019-07-12 14:58:15 +00:00
Matt Arsenault	7e71902b79	GlobalISel: Use Register llvm-svn: 365780	2019-07-11 14:18:19 +00:00
Amara Emerson	7a4d2df04a	[AArch64][GlobalISel] Optimize compare and branch cases with G_INTTOPTR and unknown values. Since we have distinct types for pointers and scalars, G_INTTOPTRs can sometimes obstruct attempts to find constant source values. These usually come about when we try to do some kind of null pointer check. Teaching getConstantVRegValWithLookThrough about this operation allows the CBZ/CBNZ optimization to catch more cases. This change also improves the case where we can't find a constant source at all. Previously we would emit a cmp, cset and tbnz for that. Now we try to just emit a cmp and conditional branch, saving an instruction. The cumulative code size improvement of this change plus D64354 is 5.5% geomean on arm64 CTMark -O0. Differential Revision: https://reviews.llvm.org/D64377 llvm-svn: 365690	2019-07-10 19:21:43 +00:00
Matt Arsenault	6ce1b4fec5	GlobalISel: Legalization for G_FMINNUM/G_FMAXNUM llvm-svn: 365658	2019-07-10 16:31:19 +00:00
Matt Arsenault	e595a2c964	GlobalISel: Define the full family of FP min/max instructions llvm-svn: 365657	2019-07-10 16:31:15 +00:00
Matt Arsenault	b1843e130a	GlobalISel: Implement lower for G_FCOPYSIGN In SelectionDAG AMDGPU treated these as legal, but this was mostly because the bitcasts required for FP types were painful. Theoretically the bitpattern should eventually match to bfi, so don't bother trying to get the patterns to import. llvm-svn: 365583	2019-07-09 23:34:29 +00:00
Matt Arsenault	14a4495155	GlobalISel: Combine unmerge of merge with intermediate cast This eliminates some illegal intermediate vectors when operations are scalarized. llvm-svn: 365566	2019-07-09 22:19:13 +00:00
Amara Emerson	6616e269a6	[AArch64][GlobalISel] Optimize conditional branches followed by unconditional branches If we have an icmp->brcond->br sequence where the brcond just branches to the next block jumping over the br, while the br takes the false edge, then we can modify the conditional branch to jump to the br's target while inverting the condition of the incoming icmp. This means we can eliminate the br as an unconditional branch to the fallthrough block. Differential Revision: https://reviews.llvm.org/D64354 llvm-svn: 365510	2019-07-09 16:05:59 +00:00
Petar Avramovic	be20e36107	[MIPS GlobalISel] Register bank select for G_PHI. Select i64 phi Select gprb or fprb when def/use register operand of G_PHI is used/defined by either: copy to/from physical register or instruction with only one mapping available for that use/def operand. Integer s64 phi is handled with narrowScalar when mapping is applied, produced artifacts are combined away. Manually set gprb to all register operands of instructions created during narrowScalar. Differential Revision: https://reviews.llvm.org/D64351 llvm-svn: 365494	2019-07-09 14:36:17 +00:00
Matt Arsenault	079f77b590	GlobalISel: Convert some build functions to using SrcOp/DstOp llvm-svn: 365343	2019-07-08 16:27:47 +00:00
Matt Arsenault	bd791b57f8	GlobalISel: widenScalar for G_BUILD_VECTOR llvm-svn: 365320	2019-07-08 13:48:06 +00:00
Matt Arsenault	43cbca50e4	GlobalISel: Fix widenScalar for pointer typed G_MERGE_VALUES llvm-svn: 365093	2019-07-03 23:08:06 +00:00
Matt Arsenault	ce690544a6	GlobalISel: Add G_FENCE The pattern importer is for some reason emitting checks for G_CONSTANT for the immediate operands. llvm-svn: 364926	2019-07-02 14:16:39 +00:00
Matt Arsenault	c9f14f29f5	GlobalISel: Try to widen merges with other merges If the requested source type can be used as a merge source type, create a merge of merges. This avoids creating large, illegal extensions and bit-ops directly to the result type. llvm-svn: 364841	2019-07-01 19:36:10 +00:00
Aditya Nandakumar	1023a2eca3	[GlobalISel]: Allow backends to custom legalize Intrinsics https://reviews.llvm.org/D31359 Add a hook "legalizeIntrinsic" to allow backends to override this and custom lower/legalize intrinsics. llvm-svn: 364821	2019-07-01 17:53:50 +00:00
Matt Arsenault	6f74f55750	GlobalISel: Implement lower for min/max llvm-svn: 364816	2019-07-01 17:18:03 +00:00
Diana Picus	2ba16011c1	Fixup r364512 Fix stack-use-after-scope errors from r364512. One instance was already fixed in r364611 - this patch simplifies that fix and addresses one more instance of similar code. Discussed in: https://reviews.llvm.org/D63905 llvm-svn: 364778	2019-07-01 15:07:38 +00:00
Fangrui Song	78ee2fbf98	Cleanup: llvm::bsearch -> llvm::partition_point after r364719 llvm-svn: 364720	2019-06-30 11:19:56 +00:00
Matt Arsenault	3018d1845b	GlobalISel: Use Register llvm-svn: 364618	2019-06-28 01:47:44 +00:00
Matt Arsenault	5e66db6b8c	GlobalISel: Convert rest of MachineIRBuilder to using Register llvm-svn: 364615	2019-06-28 01:16:41 +00:00
Amara Emerson	ecb7ac35f9	[GlobalISel][IRTranslator] Fix some PHI bugs related to jump tables when optimizations are used. The new switch lowering code that tries to generate jump tables and range checks was tested at -O0 on arm64, but on -O3 the generic switch lowering code goes to town on trying to generate optimized lowerings, e.g. multiple jump tables, range checks etc. This exposed bugs in the way PHI nodes are handled because the CFG looks even stranger after all of this is done. llvm-svn: 364613	2019-06-27 23:56:34 +00:00
Rumeet Dhindsa	ddc2804e1a	Fix ASAN error caused by commit r364512. This patch intends to fix ASAN stack-use-after-scope error. This is at least a short-term fix to unbreak LLVM's mainline. Differential Revision: https://reviews.llvm.org/D63905 llvm-svn: 364611	2019-06-27 23:37:04 +00:00
Diana Picus	74a50a723b	[GlobalISel] Remove [un]packRegs from IRTranslator Remove the last use of packRegs from IRTranslator and delete pack/unpackRegs. This introduces a fallback to DAGISel for intrinsics with aggregate arguments, since we don't have a testcase for them so it's hard to tell how we'd want to handle them. Discussed in https://reviews.llvm.org/D63551 llvm-svn: 364514	2019-06-27 09:49:07 +00:00
Diana Picus	43fb5ae50c	[GlobalISel] Accept multiple vregs for lowerCall's args Change the interface of CallLowering::lowerCall to accept several virtual registers for each argument, instead of just one. This is a follow-up to D46018. CallLowering::lowerReturn was similarly refactored in D49660 and lowerFormalArguments in D63549. With this change, we no longer pack the virtual registers generated for aggregates into one big lump before delegating to the target. Therefore, the target can decide itself whether it wants to handle them as separate pieces or use one big register. ARM and AArch64 have been updated to use the passed in virtual registers directly, which means we no longer need to generate so many merge/extract instructions. NFCI for AMDGPU, Mips and X86. Differential Revision: https://reviews.llvm.org/D63551 llvm-svn: 364512	2019-06-27 09:18:03 +00:00
Diana Picus	8138996128	[GlobalISel] Accept multiple vregs for lowerCall's result Change the interface of CallLowering::lowerCall to accept several virtual registers for the call result, instead of just one. This is a follow-up to D46018. CallLowering::lowerReturn was similarly refactored in D49660 and lowerFormalArguments in D63549. With this change, we no longer pack the virtual registers generated for aggregates into one big lump before delegating to the target. Therefore, the target can decide itself whether it wants to handle them as separate pieces or use one big register. ARM and AArch64 have been updated to use the passed in virtual registers directly, which means we no longer need to generate so many merge/extract instructions. NFCI for AMDGPU, Mips and X86. Differential Revision: https://reviews.llvm.org/D63550 llvm-svn: 364511	2019-06-27 09:15:53 +00:00
Diana Picus	c3dbe23977	[GlobalISel] Accept multiple vregs in lowerFormalArgs Change the interface of CallLowering::lowerFormalArguments to accept several virtual registers for each formal argument, instead of just one. This is a follow-up to D46018. CallLowering::lowerReturn was similarly refactored in D49660. lowerCall will be refactored in the same way in follow-up patches. With this change, we forward the virtual registers generated for aggregates to CallLowering. Therefore, the target can decide itself whether it wants to handle them as separate pieces or use one big register. We also copy the pack/unpackRegs helpers to CallLowering to facilitate this. ARM and AArch64 have been updated to use the passed in virtual registers directly, which means we no longer need to generate so many merge/extract instructions. AArch64 seems to have had a bug when lowering e.g. [1 x i8*], which was put into a s64 instead of a p0. Added a test-case which illustrates the problem more clearly (it crashes without this patch) and fixed the existing test-case to expect p0. AMDGPU has been updated to unpack into the virtual registers for kernels. I think the other code paths fall back for aggregates, so this should be NFC. Mips doesn't support aggregates yet, so it's also NFC. x86 seems to have code for dealing with aggregates, but I couldn't find the tests for it, so I just added a fallback to DAGISel if we get more than one virtual register for an argument. Differential Revision: https://reviews.llvm.org/D63549 llvm-svn: 364510	2019-06-27 08:54:17 +00:00
Diana Picus	69ce1c1319	[GlobalISel] Allow multiple VRegs in ArgInfo. NFC Allow CallLowering::ArgInfo to contain more than one virtual register. This is useful when passes split aggregates into several virtual registers, but need to also provide information about the original type to the call lowering. Used in follow-up patches. Differential Revision: https://reviews.llvm.org/D63548 llvm-svn: 364509	2019-06-27 08:50:53 +00:00
Matt Arsenault	faeaedf8e9	GlobalISel: Remove unsigned variant of SrcOp Force using Register. One downside is the generated register enums require explicit conversion. llvm-svn: 364194	2019-06-24 16:16:12 +00:00
Matt Arsenault	e3a676e9ad	CodeGen: Introduce a class for registers Avoids using a plain unsigned for registers throughout codegen. Doesn't attempt to change every register use, just something a little more than the set needed to build after changing the return type of MachineOperand::getReg(). llvm-svn: 364191	2019-06-24 15:50:29 +00:00
Fangrui Song	43e14390b0	Make GlobalISel depend on SelectionDAG after D63169 GlobalISel/IRTranslator.cpp now references SelectionDAG/FunctionLoweringInfo.cpp. This fixes a link error in -DBUILD_SHARED_LIBS=on builds: ld.lld: error: undefined symbol: llvm::FunctionLoweringInfo::clear() >>> referenced by IRTranslator.cpp:2198 (../lib/CodeGen/GlobalISel/IRTranslator.cpp:2198) >>> lib/CodeGen/GlobalISel/CMakeFiles/LLVMGlobalISel.dir/IRTranslator.cpp.o:(llvm::IRTranslator::finalizeFunction()) llvm-svn: 364124	2019-06-22 01:30:17 +00:00
Amara Emerson	fe4625fb24	[GlobalISel][IRTranslator] Change switch table translation to generate jump tables and range checks. This change makes use of the newly refactored SwitchLoweringUtils code from SelectionDAG in order to generate jump tables and range checks where appropriate. Much of this code is ported from SDAG with some modifications. We generate G_JUMP_TABLE and G_BRJT instructions when JT opportunities are found. This means that targets which previously relied on the naive one MBB per case stmt translation will now start falling back until they add support for the new opcodes. For range checks, we don't generate any previously unused operations. This just recognizes contiguous ranges of case values and generates a single block per range. Single case value blocks are just a special case of ranges so we get that support almost for free. There are still some optimizations missing that I haven't ported over, and bit-tests are also unimplemented. This patch series is already complex enough. Actual arm64 support for selection of jump tables is coming in a later patch. Differential Revision: https://reviews.llvm.org/D63169 llvm-svn: 364085	2019-06-21 18:10:38 +00:00
Amara Emerson	bc0d08e0ee	[GlobalISel][Localizer] Allow localization of G_INTTOPTR and chains of instructions. G_INTTOPTR can prevent the localizer from moving G_CONSTANTs, but since it's essentially a side effect free cast instruction we can remat both instructions. This patch changes the localizer to enable localization of the chains by iterating over the entry block instructions in reverse order. That way, uses will be localized first, and then the defs are free to be localized as well. This also changes the previous SmallPtrSet of localized instructions to use a SetVector instead. We're dealing with pointers and need deterministic iteration order. Overall, this change improves ARM64 -O0 CTMark code size by around 0.7% geomean. Differential Revision: https://reviews.llvm.org/D63630 llvm-svn: 364001	2019-06-21 00:36:19 +00:00
Petar Avramovic	153bd24eda	[MIPS GlobalISel] Select integer to floating point conversions Select G_SITOFP and G_UITOFP for MIPS32. Differential Revision: https://reviews.llvm.org/D63542 llvm-svn: 363912	2019-06-20 09:05:02 +00:00
Petar Avramovic	4b4dae1c76	[MIPS GlobalISel] Select floating point to integer conversions Select G_FPTOSI and G_FPTOUI for MIPS32. Differential Revision: https://reviews.llvm.org/D63541 llvm-svn: 363911	2019-06-20 08:52:53 +00:00
Amara Emerson	d11ea2c8c5	[GlobalISel][Localizer] Remove redundant set lookup. After changing the algorithm to only process the entry block we never revisit a processed instruction. llvm-svn: 363745	2019-06-18 22:08:40 +00:00
Tom Stellard	1f7f64665c	GlobalISel: Remove redundant pass initialization Summary: All the GlobalISel passes are initialized when the target calls initializeGlobalISel(), so we don't need to call the initializers from the pass constructors. Reviewers: qcolombet, t.p.northover, paquette, dsanders, aemerson, aditya_nandakumar Reviewed By: aemerson Subscribers: rovka, kristof.beyls, hiraditya, volkan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63235 llvm-svn: 363642	2019-06-18 02:05:06 +00:00
Matt Arsenault	5a321b899e	GlobalISel: Use the original flags when lowering fneg to fsub This was ignoring the flag on fneg, and using the source instruction's flags. Also fixes tests missing from r358702. Note the expansion itself isn't correct without nnan, but that should be fixed separately. llvm-svn: 363637	2019-06-17 23:48:43 +00:00
Amara Emerson	146882242f	[GlobalISel][Localizer] Rewrite localizer to run in 2 phases, inter & intra block. Inter-block localization is the same as what currently happens, except now it only runs on the entry block because that's where the problematic constants with long live ranges come from. The second phase is a new intra-block localization phase which attempts to re-sink the already localized instructions further right before one of the multiple uses. One additional change is to also localize G_GLOBAL_VALUE as they're constants too. However, on some targets like arm64 it takes multiple instructions to materialize the value, so some additional heuristics with a TTI hook have been introduced to attempt to prevent code size regressions when localizing these. Overall, these changes improve CTMark code size on arm64 by 1.2%. Full code size results: Program baseline new diff ------------------------------------------------------------------------------ test-suite...-typeset/consumer-typeset.test 1249984 1217216 -2.6% test-suite...:: CTMark/ClamAV/clamscan.test 1264928 1232152 -2.6% test-suite :: CTMark/SPASS/SPASS.test 1394092 1361316 -2.4% test-suite...Mark/mafft/pairlocalalign.test 731320 714928 -2.2% test-suite :: CTMark/lencod/lencod.test 1340592 1324200 -1.2% test-suite :: CTMark/kimwitu++/kc.test 3853512 3820420 -0.9% test-suite :: CTMark/Bullet/bullet.test 3406036 3389652 -0.5% test-suite...ark/tramp3d-v4/tramp3d-v4.test 8017000 8016992 -0.0% test-suite...TMark/7zip/7zip-benchmark.test 2856588 2856588 0.0% test-suite...:: CTMark/sqlite3/sqlite3.test 765704 765704 0.0% Geomean difference -1.2% Differential Revision: https://reviews.llvm.org/D63303 llvm-svn: 363632	2019-06-17 23:20:29 +00:00
Michael Berg	f9bff2a55e	Propagate fmf in IRTranslate for fneg Summary: This case is related to D63405 in that we need to be propagating FMF on negates. Reviewers: volkan, spatel, arsenm Reviewed By: arsenm Subscribers: wdng, javed.absar Differential Revision: https://reviews.llvm.org/D63458 llvm-svn: 363631	2019-06-17 23:19:40 +00:00
Daniel Sanders	184c8ee920	[globalisel] Fix iterator invalidation in the extload combines Summary: Change the way we deal with iterator invalidation in the extload combines as it was still possible to neglect to visit a use. Even worse, it happened in the in-tree test cases and the checks weren't good enough to detect it. We now take a cheap copy of the use list before iterating over it. This prevents iterator invalidation from occurring and has the nice side effect of making the existing schedule-for-erase/schedule-for-insert mechanism moot. Reviewers: aditya_nandakumar Reviewed By: aditya_nandakumar Subscribers: rovka, kristof.beyls, javed.absar, volkan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61813 llvm-svn: 363616	2019-06-17 20:56:31 +00:00
Matt Arsenault	3e140066bc	GlobalISel: Ignore callsite attributes when picking intrinsic type A target intrinsic may be defined as possibly reading memory, but the call site may have additional knowledge that it doesn't read memory. The intrinsic lowering will expect the pessimistic assumption of the intrinsic definition, so the chain should still be used. I fixed the same bug in SelectionDAG in r287593. llvm-svn: 363580	2019-06-17 17:01:35 +00:00
Matt Arsenault	9487278010	Reapply "GlobalISel: Avoid producing Illegal copies in RegBankSelect" This reapplies r363410, avoiding null dereference if there is no AltRegBank. llvm-svn: 363478	2019-06-15 00:33:26 +00:00
Mitch Phillips	0d44f129bb	Revert "GlobalISel: Avoid producing Illegal copies in RegBankSelect" This patch breaks UBSan build bots. See https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild for a guide as to how to reproduce the error. This reverts commit c2864c0de0. This reverts rL363410. llvm-svn: 363476	2019-06-14 23:45:34 +00:00
Amara Emerson	f79d3bc724	[GlobalISel] Add a G_BRJT opcode. This is a branch opcode that takes a jump table pointer, jump table index and an index into the table to do an indirect branch. We pass both the table pointer and JTI to allow targets like ARM64 to more easily use the existing jump table compression optimization without having to walk up the block to find a paired G_JUMP_TABLE. Differential Revision: https://reviews.llvm.org/D63159 llvm-svn: 363434	2019-06-14 17:55:48 +00:00
Matt Arsenault	c2864c0de0	GlobalISel: Avoid producing Illegal copies in RegBankSelect Avoid producing illegal register bank copies for reg_sequence and phi. The default implementation assumes it is possible to pick any operand's bank and use that for the result, introducing a copy for operands with a different bank. This does not check for illegal copies. It is not legal to introduce a VGPR->SGPR copy, so any VGPR operand requires the result to be a VGPR. The changes in getInstrMappingImpl aren't strictly necessary, since AMDGPU now just bypasses this for reg_sequence/phi. This could be replaced with an assert in case other targets run into this. It is currently responsible for producing the error for unsatisfiable copies, but this will be better served with a verifier check. For phis, for now assume any undetermined operands must be VGPRs. Eventually, this needs to be able to defer mapping these operations. This also does not yet have a way to check for whether the block is in a divergent region. llvm-svn: 363410	2019-06-14 15:22:25 +00:00
Matt Arsenault	731a81598e	RegBankSelect: Remove checks for invalid mappings Avoid a check for valid and a set of redundant asserts. The place InstructionMapping is constructed asserts all of the default fields are passed anyway for an invalid mapping, so don't overcomplicate this. llvm-svn: 363391	2019-06-14 13:42:40 +00:00
Amara Emerson	fb0a40f064	[GlobalISel][IRTranslator] Add debug loc with line 0 to constants emitted into the entry block. Constants, including G_GLOBAL_VALUE, are all emitted into the entry block which lets us use the vreg def assuming it dominates all other users. However, it can cause jumpy debug behaviour since the DebugLoc attached to these MIs are from a user instruction that could be in a different block. Fixes PR40887. Differential Revision: https://reviews.llvm.org/D63286 llvm-svn: 363331	2019-06-13 22:15:35 +00:00
Amara Emerson	d133c15925	[GlobalISel] Add a G_JUMP_TABLE opcode. This opcode generates a pointer to the address of the jump table specified by the source operand, which is a jump table index. It will be used in conjunction with an upcoming G_BRJT opcode to support jump table codegen with GlobalISel. Differential Revision: https://reviews.llvm.org/D63111 llvm-svn: 363096	2019-06-11 19:58:06 +00:00
Jessica Paquette	b22954384e	[GlobalISel] Translate memset/memmove/memcpy from undef ptrs into nops If the source is undef, then just don't do anything. This matches SelectionDAG's behaviour in SelectionDAG.cpp. Also add a test showing that we do the right thing here. (irtranslator-memfunc-undef.ll) Differential Revision: https://reviews.llvm.org/D63095 llvm-svn: 362989	2019-06-10 21:53:56 +00:00
Volkan Keles	97204a6788	[GlobalISel] IRTranslator: Translate the intrinsics ignored by CodeGen Summary: Translate `llvm.assume`, `llvm.var.annotation` and `llvm.sideeffect` to nothing as they have no effect on CodeGen. Reviewers: qcolombet, aditya_nandakumar, dsanders, paquette, aemerson, arsenm Reviewed By: arsenm Subscribers: hiraditya, wdng, rovka, kristof.beyls, javed.absar, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63022 llvm-svn: 362834	2019-06-07 20:19:27 +00:00
Petar Avramovic	faaa2b5d21	[MIPS GlobalISel] Select floor and ceil Select G_FFLOOR and G_FCEIL for MIPS32. Differential Revision: https://reviews.llvm.org/D62901 llvm-svn: 362688	2019-06-06 09:02:24 +00:00
Ulrich Weigand	6c5d5ce551	Allow target to handle STRICT floating-point nodes The ISD::STRICT_ nodes used to implement the constrained floating-point intrinsics are currently never passed to the target back-end, which makes it impossible to handle them correctly (e.g. mark instructions as depending on a floating-point status and control register, or mark instructions as possibly trapping). This patch allows the target to use setOperationAction to switch the action on ISD::STRICT_ nodes to Legal. If this is done, the SelectionDAG common code will stop converting the STRICT nodes to regular floating-point nodes, but instead pass the STRICT nodes to the target using normal SelectionDAG matching rules. To avoid having the back-end duplicate all the floating-point instruction patterns to handle both strict and non-strict variants, we make the MI codegen explicitly aware of the floating-point exceptions by introducing two new concepts: - A new MCID flag "mayRaiseFPException" that the target should set on any instruction that possibly can raise FP exception according to the architecture definition. - A new MI flag FPExcept that CodeGen/SelectionDAG will set on any MI instruction resulting from expansion of any constrained FP intrinsic. Any MI instruction that is both marked as mayRaiseFPException and FPExcept then needs to be considered as raising exceptions by MI-level codegen (e.g. scheduling). Setting those two new flags is straightforward. The mayRaiseFPException flag is simply set via TableGen by marking all relevant instruction patterns in the .td files. The FPExcept flag is set in SDNodeFlags when creating the STRICT_ nodes in the SelectionDAG, and gets inherited in the MachineSDNode nodes created from it during instruction selection. The flag is then transferred to an MIFlag when creating the MI from the MachineSDNode. This is handled just like fast-math flags like no-nans are handled today. This patch includes both common code changes required to implement the new features, and the SystemZ implementation. Reviewed By: andrew.w.kaylor Differential Revision: https://reviews.llvm.org/D55506 llvm-svn: 362663	2019-06-05 22:33:10 +00:00
Tim Northover	b7141207a4	Reapply: IR: add optional type to 'byval' function parameters When we switch to opaque pointer types we will need some way to describe how many bytes a 'byval' parameter should occupy on the stack. This adds a (for now) optional extra type parameter. If present, the type must match the pointee type of the argument. The original commit did not remap byval types when linking modules, which broke LTO. This version fixes that. Note to front-end maintainers: if this causes test failures, it's probably because the "byval" attribute is printed after attributes without any parameter after this change. llvm-svn: 362128	2019-05-30 18:48:23 +00:00
Tim Northover	71ee3d0237	Revert "IR: add optional type to 'byval' function parameters" The IRLinker doesn't delve into the new byval attribute when mapping types, and this breaks LTO. llvm-svn: 362029	2019-05-29 20:46:38 +00:00
Tim Northover	6e07f16fae	IR: add optional type to 'byval' function parameters When we switch to opaque pointer types we will need some way to describe how many bytes a 'byval' parameter should occupy on the stack. This adds a (for now) optional extra type parameter. If present, the type must match the pointee type of the argument. Note to front-end maintainers: if this causes test failures, it's probably because the "byval" attribute is printed after attributes without any parameter after this change. llvm-svn: 362012	2019-05-29 19:12:48 +00:00
Pengfei Wang	72e3f9662b	Revert "[X86] Use 'llvm_unreachable' instead of nullptr in unreachable code to" This reverts commit c1b3716614bc0a107e6f41a7d3d503baefad8a5b. llvm-svn: 361918	2019-05-29 02:49:59 +00:00
Pengfei Wang	818c652643	[X86] Use 'llvm_unreachable' instead of nullptr in unreachable code to avoid static check fail RegClassOrBank is an object of RegClassOrRegBank, which is defined as using llvm::RegClassOrRegBank = typedef PointerUnion<const TargetRegisterClass *, const RegisterBank *> so control flow can not get here. Use "llvm_unreachable" here to avoid "null pointer" confusion. Patch by Shengchen Kan (skan) Differential Revision: https://reviews.llvm.org/D62006 Signed-off-by: pengfei <pengfei.wang@intel.com> llvm-svn: 361912	2019-05-29 02:20:37 +00:00
Tim Northover	3b2157aeed	GlobalISel: support swifterror attribute on AArch64. swifterror marks an argument as a register pretending to be a pointer, so we need a guaranteed mem2reg-like analysis of its uses. Fortunately most of the infrastructure can be reused from the DAG world. llvm-svn: 361608	2019-05-24 08:40:13 +00:00
Matt Arsenault	0f3ba44b57	AMDGPU/GlobalISel: Legality for integer min/max llvm-svn: 361519	2019-05-23 17:58:48 +00:00
Matt Arsenault	02b5ca8cd1	GlobalISel: Implement lower for S64->S32 [SU]ITOFP This is ported from the custom AMDGPU DAG implementation. I think this is a better default expansion than what the DAG currently uses, at least if the target has CTLZ. This implements the signed version in terms of the unsigned conversion, which is implemented with bit operations. SelectionDAG has several other implementations that should eventually be ported depending on what instructions are legal. llvm-svn: 361081	2019-05-17 23:05:13 +00:00
Matt Arsenault	f3cedf4823	GlobalISel: Define integer min/max instructions Doesn't attempt to emit them for anything yet, but some legalizations I want to port use them. llvm-svn: 361061	2019-05-17 18:36:31 +00:00
Matt Arsenault	1448f5689e	AMDGPU/GlobalISel: Legalize G_FCOPYSIGN llvm-svn: 361025	2019-05-17 12:19:52 +00:00
Fangrui Song	ec6dc3089e	[GlobalISel] Fix -Wsign-compare on 32-bit -DLLVM_ENABLE_ASSERTIONS=on builds llvm-svn: 360989	2019-05-17 05:53:39 +00:00
Matt Arsenault	27ac8408f6	GlobalISel: Add DstOp version of buildIntrinsic llvm-svn: 360879	2019-05-16 12:22:56 +00:00
Matt Arsenault	11be78bc7a	GlobalISel: Add buildFConstant for APFloat llvm-svn: 360853	2019-05-16 04:09:06 +00:00
Matt Arsenault	012ecbbbba	GlobalISel: Fix indentation llvm-svn: 360851	2019-05-16 04:08:46 +00:00
Matt Arsenault	55146d3139	GlobalISel: Add G_FCOPYSIGN llvm-svn: 360850	2019-05-16 04:08:39 +00:00
Diana Picus	a568222ddd	[IRTranslator] Don't hardcode GEP index type When breaking up loads and stores of aggregates, the IRTranslator uses LLT::scalar(64) for the index type of the G_GEP instructions that compute the addresses. This is unnecessarily large for 32-bit targets. Use the int ptr type provided by the DataLayout instead. Note that we're already doing the right thing when translating getelementptr instructions from the IR. This is just an oversight when generating new ones while translating loads/stores. Both x86 and AArch64 already have tests confirming that the old behaviour is preserved for 64-bit targets. Differential Revision: https://reviews.llvm.org/D61852 llvm-svn: 360656	2019-05-14 09:25:17 +00:00
Quentin Colombet	c9256cc6ba	[IRTranslator] Use the alloc size instead of the store size when translating allocas We used to incorrectly use the store size instead of the alloc size when creating the stack slot for allocas. On aarch64 this can be demonstrated by allocating weirdly sized types. For instance, in the added test case, we use an alloca for i19. We used to allocate a slot of size 24-bit (19 rounded up to the next byte), whereas we really want to use a full 32-bit slot for this type. llvm-svn: 359856	2019-05-03 01:23:56 +00:00
Daniel Sanders	8f079844d0	[globalisel] Improve Legalizer debug output * LegalizeAction should be printed by name rather than number * Newly created instructions are incomplete at the point the observer first sees them. They are therefore recorded in a small vector and printed just before the legalizer moves on to another instruction. By this point, the instruction must be complete. llvm-svn: 359481	2019-04-29 18:45:59 +00:00
Marcello Maggioni	c596584f67	[GlobalISel] Fix inserting copies in the right position for reg definitions When constrainRegClass is called if the constraining happens on a use the COPY needs to be inserted before the instruction that contains the MachineOperand, but if we are constraining a definition it actually needs to be added after the instruction. In addition, the COPY needs to have its operands flipped (in the use case we are copying from the old unconstrained register to the new constrained register, while in the definition case we are copying from the new constrained register that the instruction defines to the old unconstrained register). llvm-svn: 359282	2019-04-26 07:21:56 +00:00
Jessica Paquette	ba55767f51	[GlobalISel][AArch64] Legalize G_FNEARBYINT Add legalizer support for G_FNEARBYINT. It's the same as G_FCEIL etc. Since the importer allows us to automatically select this after legalization, also add tests for selection etc. Also update arm64-vfloatintrinsics.ll. llvm-svn: 359204	2019-04-25 16:44:40 +00:00
Jessica Paquette	bd7ac30b15	[GlobalISel] Add IRTranslator support for G_FNEARBYINT Translate llvm.nearbyint into G_FNEARBYINT as a simple intrinsic. Update arm64-irtranslator.ll. Differential Revision: https://reviews.llvm.org/D60922 llvm-svn: 359203	2019-04-25 16:39:28 +00:00
Bjorn Pettersson	71e8c6f20f	Add "const" in GetUnderlyingObjects. NFC Summary: Both the input Value pointer and the returned Value pointers in GetUnderlyingObjects are now declared as const. It turned out that all current (in-tree) uses of GetUnderlyingObjects were trivial to update, being satisfied with having those Value pointers declared as const. Actually, in the past several of the users had to use const_cast, just because of ValueTracking not providing a version of GetUnderlyingObjects with "const" Value pointers. With this patch we get rid of those const casts. Reviewers: hfinkel, materi, jkorous Reviewed By: jkorous Subscribers: dexonsmith, jkorous, jholewinski, sdardis, eraman, hiraditya, jrtc27, atanasyan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61038 llvm-svn: 359072	2019-04-24 06:55:50 +00:00
Jessica Paquette	3cc6d1f542	[AArch64][GlobalISel] Legalize G_INTRINSIC_ROUND Add it to the same rule as G_FCEIL etc. Add a legalizer test, and add a missing switch case to AArch64LegalizerInfo.cpp. llvm-svn: 359033	2019-04-23 21:11:57 +00:00
Jessica Paquette	56342642a0	[AArch64][GlobalISel] Legalize G_INTRINSIC_TRUNC Same patch as G_FCEIL etc. Add the missing switch case in widenScalar, add G_INTRINSIC_TRUNC to the correct rule in AArch64LegalizerInfo.cpp, and add a test. llvm-svn: 359021	2019-04-23 18:20:44 +00:00
Matt Arsenault	8f624abc1d	GlobalISel: Legalize scalar G_EXTRACT sources llvm-svn: 358892	2019-04-22 15:10:42 +00:00
Amara Emerson	4286652556	Revert r358800. Breaks Obsequi from the test suite. The last attempt fixed gcc and consumer-typeset, but Obsequi seems to fail with a different issue. llvm-svn: 358829	2019-04-20 21:25:00 +00:00
Amara Emerson	eac69e9377	Revert "Revert "[GlobalISel] Add legalization support for non-power-2 loads and stores"" We were shifting the wrong component of a split load when trying to combine them back into a single value. llvm-svn: 358800	2019-04-19 23:54:44 +00:00
Jessica Paquette	d5c69e0836	[GlobalISel][AArch64] Legalize + select G_FRINT Exactly the same as G_FCEIL, G_FABS, etc. Add tests for the fp16/nofp16 behaviour, update arm64-vfloatintrinsics, etc. Differential Revision: https://reviews.llvm.org/D60895 llvm-svn: 358799	2019-04-19 23:41:52 +00:00
Jessica Paquette	ad69af3e95	[GlobalISel] Add IRTranslator support for G_FRINT Add it as a simple intrinsic, update arm64-irtranslator.ll. Differential Revision: https://reviews.llvm.org/D60893 llvm-svn: 358787	2019-04-19 21:46:12 +00:00
Amara Emerson	36c5baef49	Revert "[GlobalISel] Add legalization support for non-power-2 loads and stores" This introduces some runtime failures which I'll need to investigate further. llvm-svn: 358771	2019-04-19 17:42:13 +00:00
Jessica Paquette	dfd87f6fa1	[GlobalISel][AArch64] Legalize vector G_FPOW This instruction is legalized in the same way as G_FSIN, G_FCOS, G_FLOG10, etc. Update legalize-pow.mir and arm64-vfloatintrinsics.ll to reflect the change. Differential Revision: https://reviews.llvm.org/D60218 llvm-svn: 358764	2019-04-19 16:28:08 +00:00
Michael Berg	d573aa0156	[NFC] FMF propagation for GlobalIsel llvm-svn: 358702	2019-04-18 18:48:57 +00:00
Aditya Nandakumar	9266337656	[GISel]:IRTranslator: Prefer a buildInstr form that allows CSE of cast instructions https://reviews.llvm.org/D60844 Use the style of buildInstr that allows CSEing. llvm-svn: 358637	2019-04-18 02:19:29 +00:00
Amara Emerson	d51adf0568	Add a getSizeInBits() accessor to MachineMemOperand. NFC. Cleans up a bunch of places where we do getSize() * 8. Differential Revision: https://reviews.llvm.org/D60799 llvm-svn: 358617	2019-04-17 22:21:05 +00:00
Amara Emerson	daf6e66ac5	[GlobalISel] Add legalization support for non-power-2 loads and stores Legalize things like i24 load/store by splitting them into smaller power of 2 operations. This matches how SelectionDAG handles these operations. Differential Revision: https://reviews.llvm.org/D59971 llvm-svn: 358613	2019-04-17 21:30:07 +00:00
Fangrui Song	c82e92bca8	Change some llvm::{lower,upper}_bound to llvm::bsearch. NFC llvm-svn: 358564	2019-04-17 07:58:05 +00:00
Amara Emerson	02a90ea73d	[AArch64][GlobalISel] Don't do extending loads combine for non-pow-2 types. Since non-pow-2 types are going to get split up into multiple loads anyway, don't do the [SZ]EXTLOAD combine for those and save us trouble later in legalization. llvm-svn: 358458	2019-04-15 22:34:08 +00:00
Amara Emerson	946b1246d6	[GlobalISel] Enable CSE in the IRTranslator & legalizer for -O0 with constants only. Other opcodes shouldn't be CSE'd until we can be sure debug info quality won't be degraded. This change also improves the IRTranslator so that in most places, but not all, it creates constants using the MIRBuilder directly instead of first creating a new destination vreg and then creating a constant. By doing this, the buildConstant() method can just return the vreg of an existing G_CONSTANT instead of having to create a COPY from it. I measured a 0.2% improvement in compile time and a 0.9% improvement in code size at -O0 ARM64. Compile time (Program, base, cse, diff): test-suite...ark/tramp3d-v4/tramp3d-v4.test 9.04 9.12 0.8%; test-suite...Mark/mafft/pairlocalalign.test 2.68 2.66 -0.7%; test-suite...-typeset/consumer-typeset.test 5.53 5.51 -0.4%; test-suite :: CTMark/lencod/lencod.test 5.30 5.28 -0.3%; test-suite :: CTMark/Bullet/bullet.test 25.82 25.76 -0.2%; test-suite...:: CTMark/ClamAV/clamscan.test 6.92 6.90 -0.2%; test-suite...TMark/7zip/7zip-benchmark.test 34.24 34.17 -0.2%; test-suite :: CTMark/SPASS/SPASS.test 6.25 6.24 -0.1%; test-suite...:: CTMark/sqlite3/sqlite3.test 1.66 1.66 -0.1%; test-suite :: CTMark/kimwitu++/kc.test 13.61 13.60 -0.0%; Geomean difference -0.2%. Code size (Program, base, cse, diff): test-suite...-typeset/consumer-typeset.test 1315632 1266480 -3.7%; test-suite...:: CTMark/ClamAV/clamscan.test 1313892 1297508 -1.2%; test-suite :: CTMark/lencod/lencod.test 1439504 1423112 -1.1%; test-suite...TMark/7zip/7zip-benchmark.test 2936980 2904172 -1.1%; test-suite :: CTMark/Bullet/bullet.test 3478276 3445460 -0.9%; test-suite...ark/tramp3d-v4/tramp3d-v4.test 8082868 8033492 -0.6%; test-suite :: CTMark/kimwitu++/kc.test 3870380 3853972 -0.4%; test-suite :: CTMark/SPASS/SPASS.test 1434904 1434896 -0.0%; test-suite...Mark/mafft/pairlocalalign.test 764528 764528 0.0%; test-suite...:: CTMark/sqlite3/sqlite3.test 782092 782092 0.0%; Geomean difference -0.9%. Differential Revision: https://reviews.llvm.org/D60580 llvm-svn: 358369	2019-04-15 05:04:20 +00:00
Amara Emerson	d189680baa	[GlobalISel] Introduce a CSEConfigBase class to allow targets to define their own CSE configs. Because CodeGen can't depend on GlobalISel, we need a way to encapsulate the CSE configs that can be passed between TargetPassConfig and the targets' custom pass configs. This CSEConfigBase allows targets to create custom CSE configs which is then used by the GISel passes for the CSEMIRBuilder. This support will be used in a follow up commit to allow constant-only CSE for -O0 compiles in D60580. llvm-svn: 358368	2019-04-15 04:53:46 +00:00
Amara Emerson	93e58d2396	[AArch64][GlobalISel] Enable copy elision in the pre-legalizer combine and fix a crash. This enables the simple copy combine that already exists in the CombinerHelper. However, it exposed a bug in the GISelChangeObserver where it wouldn't clear a set of MIs to process, and so would end up causing a crash when deleted MIs were being added to the combiner worklist again. Differential Revision: https://reviews.llvm.org/D60579 llvm-svn: 358318	2019-04-13 00:33:25 +00:00
Amara Emerson	bdb5e4e4ca	[GlobalISel] Fix a crash when handling an invalid MVT during call lowering. This crash was introduced in r358032 as we try to construct an EVT from an MVT in order to find the register type for the calling conv. Fall back instead of trying to do this with an invalid MVT coming from i256. llvm-svn: 358314	2019-04-12 22:05:46 +00:00
Fangrui Song	fb79ff6ab5	Use llvm::upper_bound. NFC llvm-svn: 358277	2019-04-12 11:31:16 +00:00
Fangrui Song	cecc435250	Use llvm::lower_bound. NFC This reapplies rL358161. That commit inadvertently reverted an exegesis file to an old version. llvm-svn: 358246	2019-04-12 02:02:06 +00:00
Ali Tamur	7822b46188	Revert "Use llvm::lower_bound. NFC" This reverts commit rL358161. This patch has broken the test: llvm/test/tools/llvm-exegesis/X86/uops-CMOV16rm-noreg.s llvm-svn: 358199	2019-04-11 17:35:20 +00:00
Fangrui Song	71cce580b9	Use llvm::lower_bound. NFC llvm-svn: 358161	2019-04-11 10:25:41 +00:00
Amara Emerson	ae878dab03	[AArch64][GlobalISel] Scalarize vector SDIV. llvm-svn: 358142	2019-04-10 23:06:08 +00:00
Matt Arsenault	2064e45ce3	GlobalISel: Move computeValueLLTs Call lowering should use this directly instead of going through the EVT version, but more work is needed to deal with this (mostly the passing of the IR type pointer instead of the relevant properties in ArgInfo). llvm-svn: 358111	2019-04-10 17:27:56 +00:00
Matt Arsenault	0aab99902b	GlobalISel: Fix invoke lowering creating invalid type registers Unlike the call handling, this wasn't checking for void results and creating a register with the invalid LLT llvm-svn: 358110	2019-04-10 17:27:55 +00:00
Matt Arsenault	7187272b2b	GlobalISel: Support legalizing G_CONSTANT with irregular breakdown llvm-svn: 358109	2019-04-10 17:27:53 +00:00
Matt Arsenault	9e0eeba569	GlobalISel: Handle odd breakdowns for bit ops llvm-svn: 358105	2019-04-10 17:07:56 +00:00
Amara Emerson	2b523f8162	[GlobalISel][AArch64] Allow CallLowering to handle types which are normally required to be passed as different register types. E.g. <2 x i16> may need to be passed as a larger <2 x i32> type, so formal arg lowering needs to be able to truncate it back. Likewise, when dealing with returns of these types, they need to be widened in the appropriate way back. Differential Revision: https://reviews.llvm.org/D60425 llvm-svn: 358032	2019-04-09 21:22:33 +00:00
Matt Arsenault	106429b4e4	GlobalISel: Add another overload of buildUnmerge It's annoying to have to create an array of the result type, particularly when you don't care about the size of the value. llvm-svn: 357763	2019-04-05 14:03:07 +00:00
Evandro Menezes	85bd3978ae	[IR] Refactor attribute methods in Function class (NFC) Rename the functions that query the optimization kind attributes. Differential revision: https://reviews.llvm.org/D60287 llvm-svn: 357731	2019-04-04 22:40:06 +00:00
Evandro Menezes	7c711ccf36	[IR] Create new method in `Function` class (NFC) Create method `optForNone()` testing for the function level equivalent of `-O0` and refactor appropriately. Differential revision: https://reviews.llvm.org/D59852 llvm-svn: 357638	2019-04-03 21:27:03 +00:00
Jessica Paquette	e794121cd0	[AArch64][GlobalISel] Legalize G_FEXP2 Same as G_EXP. Add a test, and update legalizer-info-validation.mir and f16-instructions.ll. Differential Revision: https://reviews.llvm.org/D60165 llvm-svn: 357605	2019-04-03 16:58:32 +00:00
Jessica Paquette	ed23352379	[GlobalISel] Add IRTranslator support for llvm.stacksave and llvm.stackrestore Also update arm64-irtranslator.ll. Differential Revision: https://reviews.llvm.org/D60140 llvm-svn: 357538	2019-04-02 22:46:31 +00:00
Amara Emerson	381188f1f3	[GlobalISel] Fix legalizer artifact combiner from crashing with invalid dead instructions. The artifact combiners push instructions which have been marked for deletion onto a list for the legalizer to deal with on return. However, for trunc(ext) combines the combiner routine recursively calls itself. When it does this the dead instructions list may not be empty, and the other combiners don't expect to be dealing with essentially invalid MIR (multiple vreg defs etc). This change fixes it by ensuring that the dead instructions are processed on entry into tryCombineInstruction. As a result, this fix exposed a few places in tests where G_TRUNC instructions were not being deleted even though they were dead. Differential Revision: https://reviews.llvm.org/D59892 llvm-svn: 357101	2019-03-27 17:47:42 +00:00
Matt Arsenault	b34afa311d	GlobalISel: Fix RegBankSelect for REG_SEQUENCE The AArch64 test was broken since the result register already had a set register class, so this test was a no-op. The mapping verify call would fail because the result size is not the same as the inputs like in a copy or phi. The AMDGPU testcases are half broken and introduce illegal VGPR->SGPR copies which need much more work to handle correctly (same for phis), but add them as a baseline. llvm-svn: 356713	2019-03-21 20:45:36 +00:00
Amara Emerson	a140276a1e	[GlobalISel] Include missing change from r356396 Forgot to add a change to relax some asserts in r356396. llvm-svn: 356411	2019-03-18 21:29:21 +00:00
Amara Emerson	8627178d46	Revert r356304: remove subreg parameter from MachineIRBuilder::buildCopy() After review comments, it was preferred to not teach MachineIRBuilder about non-generic instructions beyond using buildInstr(). For AArch64 I've changed the buildCopy() calls to buildInstr() + a separate addReg() call. This also relaxes the MachineIRBuilder's COPY checking more because it may not always have a SrcOp given to it. llvm-svn: 356396	2019-03-18 19:20:10 +00:00
Amara Emerson	7097e83dab	[GlobalISel] Make isel verification checks of vregs run under NDEBUG only. llvm-svn: 356309	2019-03-16 01:02:10 +00:00
Amara Emerson	3739a20875	[GlobalISel] Allow MachineIRBuilder to build subregister copies. This relaxes some asserts about sizes, and adds an optional subreg parameter to buildCopy(). Also update AArch64 instruction selector to use this in places where we previously used MachineInstrBuilder manually. Differential Revision: https://reviews.llvm.org/D59434 llvm-svn: 356304	2019-03-15 21:59:50 +00:00
Matt Arsenault	133716929c	GlobalISel: Use multiple returns for intrinsic structs This is consistent with what SelectionDAG does and is much easier to work with than the extract sequence with an artificial wide register. For the AMDGPU control flow intrinsics, this was producing an s128 for the i64, i1 tuple return. Any legalization that should apply to a real s128 value would badly obscure the direct values that need to be seen. llvm-svn: 356147	2019-03-14 14:18:56 +00:00
Quentin Colombet	e77e5f44b8	[GlobalISel][Utils] Add a getConstantVRegVal variant that looks through instrs getConstantVRegVal used to only look for G_CONSTANT when looking at unboxing the value of a vreg. However, constants are sometimes not directly used and are hidden behind trunc, s|zext or copy chain of computation. In particular this may be introduced by the legalization process that doesn't want to simplify these patterns because it can lead to an infinite loop when legalizing a constant. To circumvent that problem, add a new variant of getConstantVRegVal, named getConstantVRegValWithLookThrough, that allows looking through extensions. Differential Revision: https://reviews.llvm.org/D59227 llvm-svn: 356116	2019-03-14 01:37:13 +00:00
Jessica Paquette	42d16501e6	[GlobalISel][AArch64] Always fall back on aarch64.neon.addp.* Overloaded intrinsics aren't necessarily safe for instruction selection. One such intrinsic is aarch64.neon.addp.*. This is a temporary workaround to ensure that we always fall back on that intrinsic. Eventually this will be replaced with a proper solution. https://bugs.llvm.org/show_bug.cgi?id=40968 Differential Revision: https://reviews.llvm.org/D59062 llvm-svn: 355865	2019-03-11 20:51:17 +00:00
Benjamin Kramer	6ff32e143a	[MIPS GlobalISel] Silence uninitialized variable warning The control flow here cannot ever use the uninitialized value, but it's too hard for the compiler to figure that out. Clang warns: llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp:2600:28: error: variable 'CarrySum' is used uninitialized whenever 'for' loop exits because its condition is false [-Werror,-Wsometimes-uninitialized] for (unsigned i = 2; i < Factors.size(); ++i) ^~~~~~~~~~~~~~~~~~ llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp:2604:26: note: uninitialized use occurs here CarrySumPrevDstIdx = CarrySum; ^~~~~~~~ llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp:2600:28: note: remove the condition if it is always true for (unsigned i = 2; i < Factors.size(); ++i) ^~~~~~~~~~~~~~~~~~ llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp:2583:22: note: initialize the variable 'CarrySum' to silence this warning unsigned CarrySum; ^ = 0 llvm-svn: 355818	2019-03-11 10:39:15 +00:00
Petar Avramovic	5229f47f9f	[MIPS GlobalISel] NarrowScalar G_UMULH NarrowScalar G_UMULH in LegalizerHelper using multiplyRegisters helper function. NarrowScalar G_UMULH for MIPS32. Differential Revision: https://reviews.llvm.org/D58825 llvm-svn: 355815	2019-03-11 10:08:44 +00:00
Petar Avramovic	0b17e59b5c	[MIPS GlobalISel] NarrowScalar G_MUL Narrow Scalar G_MUL for MIPS32. Revisit NarrowScalar implementation in LegalizerHelper. Introduce new helper function multiplyRegisters. It performs generic multiplication of values held in multiple registers. Generated instructions use only types NarrowTy and i1. Destination can be same or two times size of the source. Differential Revision: https://reviews.llvm.org/D58824 llvm-svn: 355814	2019-03-11 10:00:17 +00:00
Matt Arsenault	d3093c2f1f	GlobalISel: Implement fewerElementsVector for phi llvm-svn: 355048	2019-02-28 00:16:32 +00:00
Matt Arsenault	72bcf15dbf	GlobalISel: Implement moreElementsVector for phi llvm-svn: 355047	2019-02-28 00:01:05 +00:00
Petar Avramovic	bd39569913	[MIPS GlobalISel] Select G_UADDO Lower G_UADDO. Legalize G_UADDO for MIPS32 Differential Revision: https://reviews.llvm.org/D58671 llvm-svn: 354900	2019-02-26 17:22:42 +00:00
Matt Arsenault	752579736e	RegBankSelect: Handle slightly more complex value mappings Try to use concat_vectors. Also remove unnecessary assert on pointers. Fixes asserting for <4 x s16> operations and 64-bit pointers for AMDGPU. llvm-svn: 354828	2019-02-25 22:24:13 +00:00
Matt Arsenault	8df2f3dab2	RegBankSelect: Allow targets to introduce control flow for mapping For AMDGPU, if an operand requires an SGPR but is only available as a VGPR, a loop needs to be introduced to execute the instruction with each unique combination of values across all lanes. The rest of the instructions in the block will be moved to a new block following the loop. Check if the next instruction's parent changed, and update the iterators and insertion block if this happened. Tests will be included in a future patch. llvm-svn: 354591	2019-02-21 15:48:13 +00:00
Matt Arsenault	75e30c4d5d	GlobalISel: Fix fewerElementsVector for ctlz with different result type Also complete the set of related operations. llvm-svn: 354480	2019-02-20 16:42:52 +00:00
Matt Arsenault	c4d07554e4	GlobalISel: Implement moreElementsVector for g_insert results llvm-svn: 354477	2019-02-20 16:11:22 +00:00
Matt Arsenault	b4c95b338b	GlobalISel: Implement moreElementsVector for select llvm-svn: 354354	2019-02-19 17:03:09 +00:00
Matt Arsenault	4d88427a58	GlobalISel: Implement moreElementsVector for G_EXTRACT source llvm-svn: 354348	2019-02-19 16:44:22 +00:00
Matt Arsenault	26b7e859ef	GlobalISel: Implement moreElementsVector for bit ops llvm-svn: 354345	2019-02-19 16:30:19 +00:00
Jessica Paquette	b53e0f4b81	[GlobalISel][AArch64] Legalize + select some llvm.ctlz.* intrinsics Legalize/select llvm.ctlz.* Add select-ctlz to show that we actually select them. Update arm64-clrsb.ll and arm64-vclz.ll to show that we perform valid transformations in optimized builds, and document where GISel can improve. Differential Revision: https://reviews.llvm.org/D58155 llvm-svn: 354299	2019-02-18 23:33:24 +00:00
Matt Arsenault	fbe92a53d0	GlobalISel: Implement widenScalar for g_extract scalar results llvm-svn: 354293	2019-02-18 22:39:27 +00:00
Matt Arsenault	e84bdce609	GlobalISel: Make buildExtract use DstOp/SrcOp llvm-svn: 354292	2019-02-18 22:39:22 +00:00
Matt Arsenault	debaf4bd31	GlobalISel: Fix double count of offset for irregular vector breakdowns Fixes cases with odd vectors that break into multiple requested size pieces. llvm-svn: 354280	2019-02-18 17:01:09 +00:00
Aditya Nandakumar	0e362ec19a	[GISel][NFC]: Add methods to speed up insertion into GISelWorklist https://reviews.llvm.org/D58073 Speed up insertion during the initial populating phase into the GISelWorkList by deferring repeatedly resizing the DenseMap. This results in ~10% improvement in the combiner passes, and ~3% speedup in the Legalizer. reviewed by: aemerson. llvm-svn: 354093	2019-02-15 01:37:54 +00:00
Matt Arsenault	530d05e94a	GlobalISel: Add alignment to LegalityQuery MMOs This allows targets to specify the minimum alignment required for the load/store. llvm-svn: 354071	2019-02-14 22:41:09 +00:00
Petar Avramovic	5d9b8eed85	[MIPS GlobalISel] Select branch instructions Select G_BR and G_BRCOND for MIPS32. Unconditional branch G_BR does not have register operand, for that reason we only add tests. Since conditional branch G_BRCOND compares register to zero on MIPS32, explicit extension must be performed on i1 condition in order to set high bits to appropriate value. Differential Revision: https://reviews.llvm.org/D58182 llvm-svn: 354022	2019-02-14 11:39:53 +00:00
Daniel Sanders	dfa0f556bf	[globalisel][combine] Split existing rules into a match and apply step Summary: The declarative tablegen definitions split rules into match and apply steps. Prepare for that by doing the same in the C++ implementations. This aids some of the migration effort while the tablegen version is incomplete. Reviewers: bogner, volkan, aditya_nandakumar, paquette, aemerson Reviewed By: aditya_nandakumar Subscribers: rovka, kristof.beyls, Petar.Avramovic, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58150 llvm-svn: 353996	2019-02-14 00:15:28 +00:00
Jessica Paquette	acbb7ca26c	[GlobalISel][NFC] Gardening: Make translateSimpleUnaryIntrinsic general Instead of only having this code work for unary intrinsics, have it work for an arbitrary number of parameters. Factor out the cases that fall under this (fma, pow). This makes it a bit easier to add more intrinsics which don't require any special work. Differential Revision: https://reviews.llvm.org/D58079 llvm-svn: 353863	2019-02-12 17:38:34 +00:00
Jessica Paquette	0e71e73faa	[GlobalISel][AArch64] Select llvm.bswap* for non-vector types This teaches the IRTranslator to emit G_BSWAP when it runs into Intrinsic::bswap. This allows us to select G_BSWAP for non-vector types in AArch64. Add a select-bswap.mir test, and add global isel checks to a couple existing tests in test/CodeGen/AArch64. This doesn't handle every bswap case, since some of these rely on known bits stuff. This just lets us handle the naive case. Differential Revision: https://reviews.llvm.org/D58081 llvm-svn: 353861	2019-02-12 17:28:17 +00:00
Matt Arsenault	996c66620e	GlobalISel: Use default rounding mode when extending fconstant I don't think this matters since the values should all be exactly representable. llvm-svn: 353844	2019-02-12 14:54:54 +00:00

997 Commits