llvm-project

Commit Graph

Author	SHA1	Message	Date
Ahsan Saghir	a28e9f1208	[PowerPC] Add support for vmsumudm This patch adds support for Vector Multiply-Sum Unsigned Doubleword Modulo instruction; vmsumudm. Differential Revision: https://reviews.llvm.org/D80294	2020-05-22 14:35:13 -05:00
Jean-Michel Gorius	65cd2c7a80	Revert "[CodeGen] Add support for multiple memory operands in MachineInstr::mayAlias" This temporarily reverts commit `7019cea26d`. It seems that, for some targets, there are instructions with a lot of memory operands (probably more than would be expected). This causes a lot of buildbots to timeout and notify failed builds. While investigations are ongoing to find out why this happens, revert the changes.	2020-05-22 21:26:46 +02:00
Sanjay Patel	6438ea45e0	[VectorCombine] position pass after SLP in the optimization pipeline rather than before There are 2 known problem patterns shown in the test diffs here: vector horizontal ops (an x86 specialization) and vector reductions. SLP has greater ability to match and fold those than vector-combine, so let SLP have first chance at that. This is a quick fix while we continue to improve vector-combine and possibly canonicalize to reduction intrinsics. In the longer term, we should improve matching of these patterns because if they were created in the "bad" forms shown here, then we would miss optimizing them. I'm not sure what is happening with alias analysis on the addsub test. The old pass manager now shows an extra line for that, and we see an improvement that comes from SLP vectorizing a store. I don't know what's missing with the new pass manager to make that happen. Strangely, I can't reproduce the behavior if I compile from C++ with clang and invoke the new PM with "-fexperimental-new-pass-manager". Differential Revision: https://reviews.llvm.org/D80236	2020-05-22 12:22:44 -04:00
Pengxuan Zheng	22ed724975	[RISCV] Register null target streamer for RISC-V Summary: This fixes two llc crashes with the following tests when RISC-V is the default target. LLVM :: DebugInfo/Generic/global.ll LLVM :: DebugInfo/Generic/inlined-strings.ll Reviewers: HsiangKai Reviewed By: HsiangKai Subscribers: hiraditya, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, lenary, s.egerton, sameer.abuasal, apazos, luismarques, evandro, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80352	2020-05-22 09:18:23 -07:00
Simon Pilgrim	c479052a74	[CGP] Ensure address offset is representable as int64_t AddressingModeMatcher::matchAddr was calling getSExtValue for a constant before ensuring that we can actually represent the value as int64_t Fixes PR46004 / OSSFuzz#22357	2020-05-22 17:00:22 +01:00
Sanjay Patel	2f7c24fe30	[InstCombine] (A + B) + B --> A + (B << 1) This eliminates a use of 'B', so it can enable follow-on transforms as well as improve analysis/codegen. The PhaseOrdering test was added for D61726, and that shows the limits of instcombine vs. real reassociation. We would need to run some form of CSE to collapse that further. The intermediate variable naming here is intentional because there's a test at llvm/test/Bitcode/value-with-long-name.ll that would break with the usual nameless value. I'm not sure how to improve that test to be more robust. The naming may also be helpful to debug regressions if this change exposes weaknesses in the reassociation pass for example.	2020-05-22 11:46:59 -04:00
Denis Antrushin	5451289aba	[SCEV] Constant fold MultExpr before applying depth limit. Summary: Users of SCEV reasonably assume that multiplication of two constant SCEVs will in turn be constant. However, that is not always the case: First, we can get here with reached depth limit, and will create MultExpr SCEV `C1 * C2` and cache it. Then, we can get here with the same operands, but with small depth level. But this time we will find existing MultExpr SCEV and return it, instead of expected constant SCEV. This patch changes getMultExpr to not apply depth limit to all constant operands expression, allowing them to be folded. Reviewers: reames, mkazantsev Subscribers: hiraditya, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79893	2020-05-22 18:34:32 +03:00
Xiangling_Liao	2419dce5d1	[NFC][AIX] Remove spaces after the comma for '.csect' directive To be consistent with other directives like '.comm', '.lcomm', we remove the spaces after the comma for '.csect' on AIX. Differential Revision: https://reviews.llvm.org/D80247	2020-05-22 11:10:32 -04:00
Matt Arsenault	66fe60220c	AMDGPU/GlobalISel: Fix masked control flow with fallthrough blocks Unlike SelectionDAGBuilder, IRTranslator omits the unconditional branch in fallthrough cases. Confusingly, the control flow pseudos function in the opposite way the intrinsics are used, and the branch targets always need to be swapped. We're inverting the target blocks, so we need to figure out the old fallthrough block and insert a branch to the original unconditional branch target.	2020-05-22 10:31:44 -04:00
Anh Tuyen Tran	13bf6039c9	Title: [LV] Handle Fold-Tail of loops with vectorization factor equal to 1 Summary: When handling loops whose VF is 1, fold-tail vectorization sets the backedge taken count of the original loop with a vector of a single element. This causes a type mismatch during instruction generation. The purpose of this patch is to address the case of VF==1. Reviewer: Ayal (Ayal Zaks), bmahjour (Bardia Mahjour), fhahn (Florian Hahn), gilr (Gil Rapaport), rengolin (Renato Golin) Reviewed By: Ayal (Ayal Zaks), bmahjour (Bardia Mahjour), fhahn (Florian Hahn) Subscribers: Ayal (Ayal Zaks), rkruppe (Hanna Kruppe), bmahjour (Bardia Mahjour), rogfer01 (Roger Ferrer Ibanez), vkmr (Vineet Kumar), bollu (Siddharth Bhat), hiraditya (Aditya Kumar), llvm-commits (Mailing List llvm-commits) Tag: LLVM Differential Revision: https://reviews.llvm.org/D79976	2020-05-22 13:30:56 +00:00
Simon Pilgrim	4ed909bb5b	TargetLowering.h - remove unnecessary includes. NFC. Replace with forward declarations and move SizeOpts.h down to TargetLoweringBase.cpp	2020-05-22 14:26:27 +01:00
Simon Pilgrim	d4c0a082a4	[TargetLowering] Move TargetLoweringBase::isJumpTableRelative() implementation into TargetLoweringBase.cpp. NFC. This will help with reducing header dependencies in TargetLowering.h in a future patch.	2020-05-22 14:26:27 +01:00
Sanjay Patel	21f7cf4057	[SLP] fix verification check for valid IR This is a fix for PR45965 - https://bugs.llvm.org/show_bug.cgi?id=45965 - which was left out of D80106 because of a test failure. SLP does its own mini-CSE after potentially creating redundant instructions, so we need to wait for that to complete before running the verifier. Otherwise, we will see a test failure for test/Transforms/SLPVectorizer/X86/crash_vectorizeTree.ll (not changed here) because a phi temporarily has identical but different incoming values for the same incoming block. A related, but independent, test that would have been altered here was fixed with: rG880df55 The test was escaping verification in SLP without this change because we were not running verifyFunction() unless SLP actually changed the IR. Differential Revision: https://reviews.llvm.org/D80401	2020-05-22 09:15:27 -04:00
Nemanja Ivanovic	1a493b0fa5	[PowerPC] Add missing handling for half precision The fix for PR39865 took care of some of the handling for half precision but it missed a number of issues that still exist. This patch fixes the remaining issues that cause crashes in the PPC back end. Fixes: https://bugs.llvm.org/show_bug.cgi?id=45776 Differential revision: https://reviews.llvm.org/D79283	2020-05-22 07:50:11 -05:00
Marek Kurdej	9301e3aaca	[Target] Fix typos. NFC	2020-05-22 14:40:43 +02:00
Matt Arsenault	88c20fa3d2	InstCombine: Add constant folding/simplify for amdgcn.ldexp intrinsic This really belongs in InstructionSimplify since it doesn't introduce new instructions. Put it in instcombine to avoid increasing the number of passes considering target intrinsics. I also noticed that we seem to now be interpreting strictfp attributes on call sites, so try to handle that.	2020-05-22 08:21:38 -04:00
Simon Pilgrim	1386728fc2	[AVR] Remove unsigned <= 0 checks. NFCI. D77207 changed the bounds checks resulting in tests for positive unsigned values - dropping the superfluous check to fix gcc+Werror "error: comparison of unsigned expression >= 0 is always true [-Werror=type-limits]" warning.	2020-05-22 12:28:39 +01:00
Dmitry Preobrazhensky	933ebc4078	[AMDGPU][MC][GFX8+] Enabled clamp for v_mul_i32_i24_e64 and v_mul_u32_u24_e64 See bug 45925: https://bugs.llvm.org/show_bug.cgi?id=45925 Reviewers: arsenm, rampitec Differential Revision: https://reviews.llvm.org/D80287	2020-05-22 14:11:31 +03:00
Victor Campos	872ee78f65	Revert "[ARM] Improve codegen of volatile load/store of i64" This reverts commit `8a12553223`. A bug has been found when generating code for Thumb2. In some very specific cases, the prologue/epilogue emitter generates erroneous stack offsets for the new LDRD instructions that access the stack. This bug does not seem to be caused by the reverted patch though. Likely the latter has made an undiscovered issue emerge in the prologue/epilogue emission pass. Nevertheless, this reversion is necessary since it is blocking users of the ARM backend.	2020-05-22 11:01:57 +01:00
Simon Pilgrim	b9def827b7	StatepointLowering.h - remove unused includes. NFC.	2020-05-22 10:49:11 +01:00
Simon Pilgrim	c9797cf98b	Fix msvc "switch statement contains 'default' but no 'case' labels" warning. NFC. Stripped out the switch statement, but kept case labels as comments for future reference.	2020-05-22 10:49:10 +01:00
Simon Pilgrim	1041e8b886	MILexer.h/cpp - remove unused includes. NFC. Remove duplicates in MILexer.cpp that are already included in MILexer.h.	2020-05-22 10:49:10 +01:00
Roman Lebedev	cd921accf9	[NFC] InstCombineNegator: use auto where type is obvious from the cast	2020-05-22 11:14:54 +03:00
Max Kazantsev	403810557b	[InstCombine] Sink pure instructions down to return and unreachable blocks If the only user of `Instr` is in a return or unreachable block, we can sink `Instr` to the `User` safely (unless it reads/writes memory). Return or unreachable blocks are guaranteed to execute zero or one time, and `Instr` always dominates `User`, so they either will be executed together (execution of `User` always implies execution of `Instr`) or not executed at all. Differential Revision: https://reviews.llvm.org/D80120 Reviewed By: asbirlea, jdoerfert	2020-05-22 14:33:42 +07:00
Lang Hames	2e40cf06df	[JITLink] Initial implementation of ELF / x86-64 support for JITLink. This initial implementation supports section and symbol parsing, but no relocation support. It enables JITLink to link and execute ELF relocatable objects that do not require relocations. Patch by Jared Wyles. Thanks Jared! Differential Revision: https://reviews.llvm.org/D79832	2020-05-21 21:44:00 -07:00
Jessica Paquette	49a4f3f7d8	[AArch64][GlobalISel] Add a post-legalizer combiner with a very simple combine. (This patch is by Jessica, I'm just committing it on her behalf because I need a post-legalizer combiner for something else). This supersedes D77250, which did equivalent work in the selector. This can be done pre-legalization or post-legalization. Post-legalization is more likely to hit, since G_IMPLICIT_DEFs tend to appear during legalization. There's no reason to not do it pre-legalization though-- if it can be caught earlier, great. (I also think that it might be worth reimplementing D78769 using a target-specific post-legalization combine too after thinking about it for a while.) Differential Revision: https://reviews.llvm.org/D78852	2020-05-21 18:47:32 -07:00
Vedant Kumar	77ffce6954	[Instruction] Set metadata uses to undef on deletion Summary: Replace any extant metadata uses of a dying instruction with undef to preserve debug info accuracy. Some alternatives include: - Treat Instruction like any other Value, and point its extant metadata uses to an empty ValueAsMetadata node. This makes extant dbg.value uses trivially dead (i.e. fair game for deletion in many passes), leading to stale dbg.values being in effect for too long. - Call salvageDebugInfoOrMarkUndef. Not needed to make instruction removal correct. OTOH results in wasted work in some common cases (e.g. when all instructions in a BasicBlock are deleted). This came up while discussing some basic cases in https://reviews.llvm.org/D80052. Reviewers: jmorse, TWeaver, aprantl, dexonsmith, jdoerfert Subscribers: jholewinski, qcolombet, hiraditya, jfb, sstefan1, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80264	2020-05-21 15:58:12 -07:00
Alexey Lapshin	bf242c067e	[AARCH64][NEON] Allow to sink operands of aarch64_neon_pmull64. Summary: This patch fixes a problem when pmull2 instruction is not generated for vmull_high_p64 intrinsic. ISel has a pattern for int_aarch64_neon_pmull64 intrinsic to generate PMULL2 instruction. That pattern assumes that extraction operations are located in the same basic block. We need to sink them if they are not. Handle operands of int_aarch64_neon_pmull64 into AArch64TargetLowering::shouldSinkOperands. Reviewed by: efriedma Differential Revision: https://reviews.llvm.org/D80320	2020-05-22 01:35:24 +03:00
Craig Topper	f96a7706d9	[Target] Use Align in TargetLoweringObjectFile::getSectionForConstant. Differential Revision: https://reviews.llvm.org/D80363	2020-05-21 15:23:29 -07:00
Arthur Eubanks	fc937806ef	Don't jump to landing pads in Control Flow Optimizer Summary: Likely fixes https://bugs.llvm.org/show_bug.cgi?id=45858. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80047	2020-05-21 15:19:10 -07:00
Tim Renouf	d13a508820	[AMDGPU] Fixed incorrect PAL metadata register naming This only affects assembly and -filetype=asm codegen of PAL metadata. Differential Revision: https://reviews.llvm.org/D78860 Change-Id: I7b822e1917bf7b403486820d31afc483be207652	2020-05-21 22:13:19 +01:00
Tim Renouf	db16eb33ce	[MsgPack] Added convenience assignment to MsgPackDocument This commit increases the convenience of using the MsgPackDocument API, especially when creating a document for writing out. It adds direct assignment of bool, integer and string types to a DocNode, as long as that DocNode is already inside a document, e.g. the result of a map lookup. It also adds map lookup given an integer type (it already had that for string). So, to assign a string to a map element whose key is an int, you can now write MyMap[42] = "towel"; instead of MyMap[MyMap.getDocument()->getNode(42)] = MyMap.getDocument()->getNode("towel"); Also added MapDocNode::erase methods. Differential Revision: https://reviews.llvm.org/D80121 Change-Id: I17301fa15bb9802231c52542798af5b54beb583e	2020-05-21 22:13:19 +01:00
Jean-Michel Gorius	7019cea26d	[CodeGen] Add support for multiple memory operands in MachineInstr::mayAlias Summary: To support all targets, the mayAlias member function needs to support instructions with multiple operands. This revision also changes the order of the emitted instructions in some test cases. Reviewers: efriedma, hfinkel, craig.topper, dmgreen Reviewed By: efriedma Subscribers: MatzeB, dmgreen, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80161	2020-05-21 23:02:54 +02:00
Stanislav Mekhanoshin	689e616ed0	[AMDGPU] Promote alloca to vector in opt Promote alloca to vector before SROA and loop unroll. If we manage to eliminate allocas before unroll we may choose to unroll less. Differential Revision: https://reviews.llvm.org/D80386	2020-05-21 13:49:51 -07:00
Eli Friedman	f09d220c71	[AArch64][SVE] Fill out missing unpredicated load/store patterns. The set of patterns for unpredicated load/store was incomplete: it only included non-extending stores. Fill out the remaining patterns for extending stores, and add the corresponding support to frame offset lowering. Differential Revision: https://reviews.llvm.org/D80349	2020-05-21 13:29:30 -07:00
Tim Renouf	e79d002309	[MsgPack] MsgPackDocument::readFromBlob now merges The readFromBlob method can now be used to read MsgPack into a Document that already contains something, merging the two. There is a new Merger argument to readFromBlob, a callback function to resolve conflicts. Differential Revision: https://reviews.llvm.org/D79671 Change-Id: Icf3e959217fe33cd907a41516c0386aef2847c0c	2020-05-21 21:26:26 +01:00
Hendrik Greving	8a6a2c4cb6	[ModuloSchedule] Add missing comma. This is a test commit as per Chris to verify commit access. Thanks!	2020-05-21 13:18:07 -07:00
Marcello Maggioni	dbaed589ab	[SelectionDAG] Add the option of disabling generic combines. Summary: For some targets generic combines don't really do much and they consume a disproportionate amount of time. There's not really a mechanism in SDISel to tactically disable combines, but we can have a switch to disable all of them and let the targets just implement what they specifically need. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79112	2020-05-21 20:11:29 +00:00
Hiroshi Yamauchi	01909b4e85	[IR] Make Module::setProfileSummary replace an existing ProfileSummary flag. Summary: Module::setProfileSummary currently calls addModuleFlag. This prevents updating the ProfileSummary metadata in the module and results in a second ProfileSummary being added instead of replacing the existing one. I don't think this is the expected behavior. It prevents updating the ProfileSummary and it does not make sense to have more than one. To address this, add Module::setModuleFlag and use it from setProfileSummary. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79902	2020-05-21 11:38:39 -07:00
Jonas Devlieghere	f6cc1c08f1	Revert "Revert "[YAMLTraits] Add trait for char"" Reverting this to unblock all the LLDB bots while we try to figure out a solution for Solaris in https://reviews.llvm.org/D79745.	2020-05-21 10:33:09 -07:00
Hiroshi Yamauchi	b5c59d77c3	[ProfileSummary] Add the PartialProfileRatio field in ProfileSummary metadata. Summary: PartialProfileRatio approximately represents the ratio of the number of profile counters of the program being built to the number of profile counters in the partial sample profile. It is used to scale the working set size under the partial sample profile to reflect the size of the program being built and to improve the working set size heuristics. This is a split from D79831. Reviewers: davidxl Subscribers: eraman, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79951	2020-05-21 09:12:23 -07:00
Stanislav Mekhanoshin	1dfd1b3e4b	[AMDGPU] Tune threshold for cmp/select vector lowering It was set in terms of total vector size, while the idea was to limit the number of instructions. Now that it has started to work with doubles, the threshold needs to be updated. Differential Revision: https://reviews.llvm.org/D80322	2020-05-21 08:59:35 -07:00
Rainer Orth	c4169a3efe	Revert "[YAMLTraits] Add trait for char" This reverts commit `fab08bf489`. It has left the Solaris buildbots broken for a week and a half as reported in https://reviews.llvm.org/D79745.	2020-05-21 17:33:42 +02:00
Dinar Temirbulatov	df3b95bc0a	[SLP][NFC] PR45269 getVectorElementSize() is slow The algorithm inside getVectorElementSize() has almost O(x^2) complexity; for example, compiling MultiSource/Applications/ClamAV/shared_sha256.c, with 1k instructions inside the sha256_transform() function, resulted in ~800k iterations. The following change uses a map to bring the algorithm down to linear complexity. Differential Revision: https://reviews.llvm.org/D80241	2020-05-21 17:26:50 +02:00
Thomas Raoux	20c0527af7	[ModuloSchedule] Trivial fix for instruction with more than one destination in modulo peeler. When moving an instruction into a block where it was referenced by a phi when peeling, refer to the phi's register number and assert that the instruction has it in its destinations. This way, it also covers instructions with more than one destination. Patch by Hendrik Greving! Differential Revision: https://reviews.llvm.org/D80027	2020-05-21 08:14:42 -07:00
Jean-Michel Gorius	439c8b2884	[x86] NFC: Fix typo in command line option description	2020-05-21 16:53:25 +02:00
Benjamin Kramer	c476abfd37	[BitcodeReader] Simplify code. NFCI.	2020-05-21 16:03:09 +02:00
Benjamin Kramer	8f9d3b937c	[StringRef] Use some trickery to avoid initializing the std::string returned by upper()/lower()	2020-05-21 16:03:09 +02:00
James Henderson	79e5ecfa7a	On Windows, handle interrupt signals without crash message For LLVM on *nix systems, the signal handlers are not run on signals such as SIGINT due to CTRL-C. See sys::CleanupOnSignal. This makes sense, as such signals are not really crashes. Prior to this change, this wasn't the case on Windows, however. This patch changes the Windows behaviour to be consistent with Linux, and adds testing that verifies this. The test uses llvm-symbolizer, but any tool with an interactive mode would do the job. Fixes https://bugs.llvm.org/show_bug.cgi?id=45754. Reviewed by: MaskRay, rnk, aganea Differential Revision: https://reviews.llvm.org/D79847	2020-05-21 13:27:10 +01:00
Sam Parker	259eb619ff	Revert "[CostModel] Unify Intrinsic Costs." This reverts commit `de71def3f5`. This is causing some very large changes, so I'm first going to break this patch down and re-commit in parts.	2020-05-21 12:50:24 +01:00
Ehud Katz	111ddc57d3	[FlattenCFG] Fix `MergeIfRegion` in case then-path is empty In case the then-path of an if-region is empty, then merging with the else-path should be handled with the inverse of the condition (leading to that path). Fix PR37662 Differential Revision: https://reviews.llvm.org/D78881	2020-05-21 14:06:44 +03:00
Roman Lebedev	b2df961231	[IndVarSimplify][LoopUtils] Avoid TOCTOU/ordering issues (PR45835) Summary: Currently, `rewriteLoopExitValues()`'s logic is roughly as follows: > Loop over each incoming value in each PHI node. > Query whether the SCEV for that incoming value is high-cost. > Expand the SCEV. > Perform sanity check (`isValidRewrite()`, D51582) > Record the info > Afterwards, see if we can drop the loop given replacements. > Maybe perform replacements. The problem is that we interleave SCEV cost checking and expansion. This is A Problem, because `isHighCostExpansion()` takes special care to not bill for the expansions that were already expanded, and we can reuse. While it makes sense in general - if we know that we will expand some SCEV, all the other SCEV's costs should account for that, which might cause some of them to become non-high-cost too, and cause chain reaction. But that isn't what we are doing here. We expand all SCEV's, unconditionally. So every next SCEV's cost will be affected by the already-performed expansions for previous SCEV's. Even if we are not planning on keeping some of the expansions we performed. Worse yet, this current "bonus" depends on the exact PHI node incoming value processing order. This is completely wrong. As an example of an issue, see @dmajor's `pr45835.ll` - if we happen to have a PHI node with two(!) identical high-cost incoming values for the same basic blocks, we would decide first time around that it is high-cost, expand it, and immediately decide that it is not high-cost because we have an expansion that we could reuse (because we expanded it right before, temporarily), and replace the second incoming value but not the first one; thus resulting in a broken PHI. What we instead should do for now, is not perform any expansions until after we've queried all the costs. Later, in particular after `isValidRewrite()` is an assertion (D51582) we could improve upon that, but in a more coherent fashion. See PR45835 (https://bugs.llvm.org/show_bug.cgi?id=45835) Reviewers: dmajor, reames, mkazantsev, fhahn, efriedma Reviewed By: dmajor, mkazantsev Subscribers: smeenai, nikic, hiraditya, javed.absar, llvm-commits, dmajor Tags: #llvm Differential Revision: https://reviews.llvm.org/D79787	2020-05-21 13:05:55 +03:00
Sjoerd Meijer	b0614509a0	[HardwareLoops] llvm.loop.decrement.reg definition This is split off from D80316, slightly tightening the definition of the overloaded hardware-loop intrinsic llvm.loop.decrement.reg by specifying that both operands and its result have the same type.	2020-05-21 10:48:16 +01:00
Denis Antrushin	dedcefe09d	[Statepoint] Constant fold FP deopt args. We do not have any special handling for constant FP deopt arguments. They are just spilled to stack or generated in register by MOVS instruction. This is inefficient and, when we have too many such constant arguments, may result in register allocation failure. Instead, we can bitcast such constant FP operands to appropriately sized integer and record as constant into statepoint and later, into StackMap. Reviewed By: skatkov Differential Revision: https://reviews.llvm.org/D80318	2020-05-21 11:02:54 +03:00
Benjamin Kramer	5b0d1f04bf	Fix a layering violation by not depending from Transforms/Utils on Transforms/Scalar. NFC.	2020-05-21 09:51:58 +02:00
David Sherwood	1c3d9c2f36	[SVE] Remove IITDescriptor::ScalableVecArgument I have refactored the code so that we no longer need the ScalableVecArgument descriptor - the scalable property of vectors is now encoded using the ElementCount class in IITDescriptor. This means that when matching intrinsics we know precisely how to match the arguments and return values. Differential Revision: https://reviews.llvm.org/D80107	2020-05-21 08:15:10 +01:00
Chen Zheng	8086cdd1b0	[PowerPC] add more high latency opcodes for machine combiner pass Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D80097	2020-05-21 02:39:20 -04:00
Sam Parker	de71def3f5	[CostModel] Unify Intrinsic Costs. With the two getIntrinsicInstrCosts folded into one, now fold in the scalar/code-size orientated getIntrinsicCost. This involved sinking cost of the TTIImpl into the base implementation, as it performs no target checks. The opcodes remaining were memcpy, cttz and ctlz which now have special handling in the BasicTTI implementation. getInstructionThroughput can now directly return the result of getUserCost. This had required a change in the AMDGPU backend for fabs and it's always 'free'. I've also changed the X86 backend to return '1' for any intrinsic when the CostKind isn't RecipThroughput. Though this is intended to be a non-functional change, there are many paths being combined here so I would be very surprised if this didn't have an effect. Differential Revision: https://reviews.llvm.org/D80012	2020-05-21 07:38:25 +01:00
Sam Parker	fb3ba38021	[CostModel] Remove getExtCost This has not been implemented by any backends which appear to cover the functionality through getCastInstrCost. Sink what there is in the default implementation into BasicTTI. Differential Revision: https://reviews.llvm.org/D78922	2020-05-21 07:18:06 +01:00
Igor Kudrin	0e41d647ce	[MC] Simplify MakeStartMinusEndExpr(). NFC. The function does not need an MCStreamer per se; it was used only to get access to the MCContext. Differential Revision: https://reviews.llvm.org/D80205	2020-05-21 13:05:38 +07:00
Yevgeny Rouban	8138487468	[BranchProbabilityInfo] Set edge probabilities at once and fix calcMetadataWeights() Hide the method that allows setting the probability for a particular edge and introduce a public method that sets probabilities for all outgoing edges at once. Setting individual edge probabilities is error prone. Moreover, it is difficult to check that the total probability is 1.0 because there is no easy way to know when the user has finished setting all the probabilities. A related bug is fixed in BranchProbabilityInfo::calcMetadataWeights(). Changing unreachable branch probabilities to raw(1) and distributing the rest (oldProbability - raw(1)) over the reachable branches could introduce a total probability inaccuracy bigger than 1/numOfBranches. Reviewers: yamauchi, ebrevnov Tags: #llvm Differential Revision: https://reviews.llvm.org/D79396	2020-05-21 12:52:37 +07:00
Craig Topper	ae5ab2f40a	[LegalizeDAG] Modify ExpandLegalINT_TO_FP to swap data for little/big endian instead of the pointers. Will make it easier to pass the pointer info and alignment correctly to the loads/stores. While there also make the i32 stores independent and use a token factor to join before the load.	2020-05-20 22:29:59 -07:00
Juneyoung Lee	d9a4a24413	Add CanonicalizeFreezeInLoops pass Summary: If an induction variable is frozen and used, SCEV yields imprecise result because it doesn't say anything about frozen variables. Due to this reason, performance degradation happened after https://reviews.llvm.org/D76483 is merged, causing SCEV yield imprecise result and preventing LSR to optimize a loop. The suggested solution here is to add a pass which canonicalizes frozen variables inside a loop. To be specific, it pushes freezes out of the loop by freezing the initial value and step values instead & dropping nsw/nuw flags from instructions used by freeze. This solution was also mentioned at https://reviews.llvm.org/D70623 . Reviewers: spatel, efriedma, lebedev.ri, fhahn, jdoerfert Reviewed By: fhahn Subscribers: nikic, mgorny, hiraditya, javed.absar, llvm-commits, sanwou01, nlopes Tags: #llvm Differential Revision: https://reviews.llvm.org/D77523	2020-05-21 09:29:29 +09:00
Eli Friedman	b4f9b34701	[AArch64] Fix unwind info generated by outliner. The offsets were wrong. The result is now the same as what the compiler would generate for a function that spills lr normally. Differential Revision: https://reviews.llvm.org/D80238	2020-05-20 16:39:00 -07:00
Eli Friedman	f26bdb539e	Make Value::getPointerAlignment() return an Align, not a MaybeAlign. If we don't know anything about the alignment of a pointer, Align(1) is still correct: all pointers are at least 1-byte aligned. Included in this patch is a bugfix for an issue discovered during this cleanup: pointers with "dereferenceable" attributes/metadata were assumed to be aligned according to the type of the pointer. This wasn't intentional, as far as I can tell, so Loads.cpp was fixed to stop making this assumption. Frontends may need to be updated. I updated clang's handling of C++ references, and added a release note for this. Differential Revision: https://reviews.llvm.org/D80072	2020-05-20 16:37:20 -07:00
Francis Visoiu Mistrih	161122ea1c	[AArch64] Provide Darwin variants of most calling conventions With the new SVE stack layout, we now need to provide a Darwin variant for all the calling conventions based on the main AAPCS CSR save order. This also changes APCS_SwiftError to have a Darwin and a non-Darwin version, assuming it could be used on other platforms these days, and restricts the AArch64_CXX_TLS calling convention to Darwin. Differential Revision: https://reviews.llvm.org/D73805	2020-05-20 16:03:48 -07:00
Stanislav Mekhanoshin	4eecf17164	[AMDGPU] Always expand ext/insertelement with divergent idx Even though a series of cmp/cndmask can produce quite a lot of code, that is still better than a loop. In case of doubles we would even produce two loops. Differential Revision: https://reviews.llvm.org/D80032	2020-05-20 15:51:29 -07:00
Craig Topper	17bd86bc9b	[LegalizeVectorTypes] Create correct memoperands in SplitVecRes_INSERT_SUBVECTOR. Previously this code just used a default constructed MachinePointerInfo. But we know the accesses are to a fixed stack object or at least somewhere on the stack. While there fix the alignment passed to the full vector load/stores. I don't think this function is currently exercised in tree so I don't know how to test it. I just noticed it when I removed non-constant index support in this function. Differential Revision: https://reviews.llvm.org/D80058	2020-05-20 15:06:36 -07:00
Nico Weber	bc1c3655bf	Give microsoftDemangle() an outparam for how many input bytes were consumed. Demangling Itanium symbols either consumes the whole input or fails, but Microsoft symbols can be successfully demangled with just some of the input. Add an outparam that enables clients to know how much of the input was consumed, and use this flag to give llvm-undname an opt-in warning on partially consumed symbols. Differential Revision: https://reviews.llvm.org/D80173	2020-05-20 16:17:31 -04:00
Roman Lebedev	55430f53f3	[InstCombine] `insertelement` is negatible if both sources are negatible ---------------------------------------- define <2 x i4> @negate_insertelement(<2 x i4> %src, i4 %a, i32 %x, <2 x i4> %b) { %0: %t0 = sub <2 x i4> { 0, 0 }, %src %t1 = sub i4 0, %a %t2 = insertelement <2 x i4> %t0, i4 %t1, i32 %x %t3 = sub <2 x i4> %b, %t2 ret <2 x i4> %t3 } => define <2 x i4> @negate_insertelement(<2 x i4> %src, i4 %a, i32 %x, <2 x i4> %b) { %0: %t2.neg = insertelement <2 x i4> %src, i4 %a, i32 %x %t3 = add <2 x i4> %t2.neg, %b ret <2 x i4> %t3 } Transformation seems to be correct!	2020-05-20 21:44:31 +03:00
Roman Lebedev	ebed96fdbf	[InstCombine] Negator: `extractelement` is negatible if src is negatible ---------------------------------------- define i4 @negate_extractelement(<2 x i4> %x, i32 %y, i4 %z) { %0: %t0 = sub <2 x i4> { 0, 0 }, %x call void @use_v2i4(<2 x i4> %t0) %t1 = extractelement <2 x i4> %t0, i32 %y %t2 = sub i4 %z, %t1 ret i4 %t2 } => define i4 @negate_extractelement(<2 x i4> %x, i32 %y, i4 %z) { %0: %t0 = sub <2 x i4> { 0, 0 }, %x call void @use_v2i4(<2 x i4> %t0) %t1.neg = extractelement <2 x i4> %x, i32 %y %t2 = add i4 %t1.neg, %z ret i4 %t2 } Transformation seems to be correct!	2020-05-20 21:44:31 +03:00
aartbik	645bba8d3d	[llvm] [CodeGen] [X86] Fix issues with v4i1 instruction selection Summary: Fixes issue https://bugs.llvm.org/show_bug.cgi?id=45995 Reviewers: mehdi_amini, nicolasvasilache, reidtatge, craig.topper, ftynse, bkramer Reviewed By: craig.topper Subscribers: RKSimon, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80231	2020-05-20 11:34:56 -07:00
Arthur Eubanks	8a88755610	Reland [X86] Codegen for preallocated See https://reviews.llvm.org/D74651 for the preallocated IR constructs and LangRef changes. In X86TargetLowering::LowerCall(), if a call is preallocated, record each argument's offset from the stack pointer and the total stack adjustment. Associate the call Value with an integer index. Store the info in X86MachineFunctionInfo with the integer index as the key. This adds two new target independent ISDOpcodes and two new target dependent Opcodes corresponding to @llvm.call.preallocated.{setup,arg}. The setup ISelDAG node takes in a chain and outputs a chain and a SrcValue of the preallocated call Value. It is lowered to a target dependent node with the SrcValue replaced with the integer index key by looking in X86MachineFunctionInfo. In X86TargetLowering::EmitInstrWithCustomInserter() this is lowered to an %esp adjustment, the exact amount determined by looking in X86MachineFunctionInfo with the integer index key. The arg ISelDAG node takes in a chain, a SrcValue of the preallocated call Value, and the arg index int constant. It produces a chain and the pointer to the arg. It is lowered to a target dependent node with the SrcValue replaced with the integer index key by looking in X86MachineFunctionInfo. In X86TargetLowering::EmitInstrWithCustomInserter() this is lowered to a lea of the stack pointer plus an offset determined by looking in X86MachineFunctionInfo with the integer index key. Force any function containing a preallocated call to use the frame pointer. Does not yet handle a setup without a call, or a conditional call. Does not yet handle musttail. That requires a LangRef change first. Tried to look at all references to inalloca and see if they apply to preallocated. I've made preallocated versions of tests testing inalloca whenever possible and when they make sense (e.g. not alloca related, inalloca edge cases). 
Aside from the tests added here, I checked that this codegen produces correct code for something like ``` struct A { A(); A(A&&); ~A(); }; void bar() { foo(foo(foo(foo(foo(A(), 4), 5), 6), 7), 8); } ``` by replacing the inalloca version of the .ll file with the appropriate preallocated code. Running the executable produces the same results as using the current inalloca implementation. Reverted due to unexpectedly passing tests, added REQUIRES: asserts for reland. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77689	2020-05-20 11:25:44 -07:00
Arthur Eubanks	b8cbff51d3	Revert "[X86] Codegen for preallocated" This reverts commit `810567dc69`. Some tests are unexpectedly passing	2020-05-20 10:04:55 -07:00
Hiroshi Yamauchi	f9a6163f64	[ProfileSummary] Refactor getFromMD to prepare for another optional field. NFC. Summary: Rename 'i' to 'I'. Factor out the optional field handling to getOptionalVal(). Split out of D79951. Reviewers: davidxl Subscribers: eraman, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80230	2020-05-20 09:44:39 -07:00
Arthur Eubanks	810567dc69	[X86] Codegen for preallocated See https://reviews.llvm.org/D74651 for the preallocated IR constructs and LangRef changes. In X86TargetLowering::LowerCall(), if a call is preallocated, record each argument's offset from the stack pointer and the total stack adjustment. Associate the call Value with an integer index. Store the info in X86MachineFunctionInfo with the integer index as the key. This adds two new target independent ISDOpcodes and two new target dependent Opcodes corresponding to @llvm.call.preallocated.{setup,arg}. The setup ISelDAG node takes in a chain and outputs a chain and a SrcValue of the preallocated call Value. It is lowered to a target dependent node with the SrcValue replaced with the integer index key by looking in X86MachineFunctionInfo. In X86TargetLowering::EmitInstrWithCustomInserter() this is lowered to an %esp adjustment, the exact amount determined by looking in X86MachineFunctionInfo with the integer index key. The arg ISelDAG node takes in a chain, a SrcValue of the preallocated call Value, and the arg index int constant. It produces a chain and the pointer to the arg. It is lowered to a target dependent node with the SrcValue replaced with the integer index key by looking in X86MachineFunctionInfo. In X86TargetLowering::EmitInstrWithCustomInserter() this is lowered to a lea of the stack pointer plus an offset determined by looking in X86MachineFunctionInfo with the integer index key. Force any function containing a preallocated call to use the frame pointer. Does not yet handle a setup without a call, or a conditional call. Does not yet handle musttail. That requires a LangRef change first. Tried to look at all references to inalloca and see if they apply to preallocated. I've made preallocated versions of tests testing inalloca whenever possible and when they make sense (e.g. not alloca related, inalloca edge cases). 
Aside from the tests added here, I checked that this codegen produces correct code for something like ``` struct A { A(); A(A&&); ~A(); }; void bar() { foo(foo(foo(foo(foo(A(), 4), 5), 6), 7), 8); } ``` by replacing the inalloca version of the .ll file with the appropriate preallocated code. Running the executable produces the same results as using the current inalloca implementation. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77689	2020-05-20 09:20:38 -07:00
Matt Arsenault	e8f6b0e583	AMDGPU/GlobalISel: Fix splitting 64-bit extensions This was replicating the low bits into the high bits for G_ZEXT, rather than using 0.	2020-05-20 11:13:32 -04:00
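The difference described in the commit above can be sketched in plain C++. This is an illustration of the arithmetic only, assuming a 32-bit source split into two 32-bit halves; `Split64`, `splitZExt`, `splitZExtBuggy`, and `join` are hypothetical names and do not correspond to the actual AMDGPU GlobalISel code.

```cpp
#include <cstdint>

// Hypothetical model of a 64-bit value held in two 32-bit registers.
struct Split64 {
  std::uint32_t Lo;
  std::uint32_t Hi;
};

// Correct zero-extension: the high half is the constant 0.
constexpr Split64 splitZExt(std::uint32_t X) { return Split64{X, 0u}; }

// The behaviour the commit message describes as the bug: the low bits are
// replicated into the high half instead of using 0.
constexpr Split64 splitZExtBuggy(std::uint32_t X) { return Split64{X, X}; }

// Reassemble the two halves into one 64-bit integer for comparison.
constexpr std::uint64_t join(Split64 S) {
  return (static_cast<std::uint64_t>(S.Hi) << 32) | S.Lo;
}

// Zero-extension must preserve the numeric value of the source.
static_assert(join(splitZExt(0xDEADBEEFu)) == 0xDEADBEEFull,
              "zext keeps the value");
// Replicating the low bits changes the value for any non-zero input.
static_assert(join(splitZExtBuggy(0xDEADBEEFu)) == 0xDEADBEEFDEADBEEFull,
              "replicated halves are not a zero-extension");
```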
Pierre-vh	835251f7d9	[Target][ARM] Make Low Overhead Loops coexist with VPT blocks. Previously, the LowOverheadLoops pass couldn't handle VPT blocks with conditions, or with multiple VCTPs. This patch improves the LowOverheadLoops pass so it can handle those cases. It also adds support for VCMPs before the VCTP. Differential Revision: https://reviews.llvm.org/D78206	2020-05-20 12:24:55 +01:00
Sam Parker	8cc911fa5b	[NFCI][CostModel] Refactor getIntrinsicInstrCost Combine the two API calls into one by introducing a structure to hold the relevant data. This has the added benefit of moving the boiler plate code for arguments and flags, into the constructors. This is intended to be a non-functional change, but the complicated web of logic involved here makes it very hard to guarantee. Differential Revision: https://reviews.llvm.org/D79941	2020-05-20 11:59:08 +01:00
Georgii Rymar	baf3225987	[yaml2obj] - Implement the "Offset" property for the Fill Chunk. Similar to a regular section chunk, a Fill should have this property. This patch implements it. Differential revision: https://reviews.llvm.org/D80190	2020-05-20 13:38:48 +03:00
Florian Hahn	bcbd26bfe6	[SCEV] Move ScalarEvolutionExpander.cpp to Transforms/Utils (NFC). SCEVExpander modifies the underlying function so it is more suitable in Transforms/Utils, rather than Analysis. This allows using other transform utils in SCEVExpander. This patch was originally committed as `b8a3c34eee`, but broke the modules build, as LoopAccessAnalysis was using the Expander. The code-gen part of LAA was moved to lib/Transforms recently, so this patch can be landed again. Reviewers: sanjoy.google, efriedma, reames Reviewed By: sanjoy.google Differential Revision: https://reviews.llvm.org/D71537	2020-05-20 10:53:40 +01:00
Kang Zhang	3f376ecad0	[PowerPC] Enable machine verification for 3 passes Summary: For PowerPC, there are 3 passes that have machine verification disabled. ``` PPCTargetMachine.cpp: addPass(&LiveVariablesID, false); PPCTargetMachine.cpp: addPass(createPPCEarlyReturnPass(), false); PPCTargetMachine.cpp: addPass(createPPCBranchSelectionPass(), false); ``` This patch enables machine verification for the above three passes. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D79840	2020-05-20 09:40:25 +00:00
Simon Pilgrim	d9b9ce6c04	CommandFlags.h - remove unnecessary includes. NFC. Replace with forward declarations and move necessary includes down to source files. Exposes an implicit dependency on TargetMachine.h in llvm-opt-fuzzer.cpp	2020-05-20 09:58:37 +01:00
Jay Foad	e5fc9a3604	[IR] Simplify BasicBlock::removePredecessor. NFCI. This is the second attempt at landing this patch, after fixing the KeepOneInputPHIs behaviour to also keep zero input PHIs. Differential Revision: https://reviews.llvm.org/D80141	2020-05-20 09:58:21 +01:00
Jay Foad	b42b30c335	Revert "[IR] Simplify BasicBlock::removePredecessor. NFCI." This reverts commit `59f49f7ee7`. It was causing buildbot failures.	2020-05-20 08:01:43 +01:00
Stanislav Mekhanoshin	677929e352	[AMDGPU] Process V_MOV_B32_indirect in SET_GPR_IDX optimization Differential Revision: https://reviews.llvm.org/D80256	2020-05-19 21:37:14 -07:00
QingShan Zhang	2b59e9f1bd	[DAGCombine] Remove the getNegatibleCost to avoid the out of sync with getNegatedExpression We have getNegatibleCost/getNegatedExpression to evaluate the cost and negate the expression. However, while negating the expression, the cost might change because we are changing the DAG, and we then hit an assertion if we negated the wrong expression, since the cost is no longer trustworthy. This patch removes getNegatibleCost so that it cannot get out of sync with getNegatedExpression, and checks the cost while negating the expression. It also reduces the duplicated code between getNegatibleCost and getNegatedExpression, and fixes the crash for the test in D76638. Reviewed By: RKSimon, spatel Differential Revision: https://reviews.llvm.org/D77319	2020-05-20 02:12:16 +00:00
Matt Arsenault	21d2884a9c	AMDGPU: Annotate functions that have stack objects Relying on any MachineFunction state in the MachineFunctionInfo constructor is hazardous, because the construction time is unclear and determined by the first use. The function may be only partially constructed, which is part of why we have many of these hacky string attributes to track what we need for ABI lowering. For SelectionDAG, all stack objects are created up-front before calling convention lowering so stack objects are visible at construction time. For GlobalISel, none of the IR function has been visited yet and the allocas haven't been added to the MachineFrameInfo yet. This should fix failing to set flat_scratch_init in GlobalISel when needed. This pass really needs to be turned into some kind of analysis, but I haven't found a nice way to use one here.	2020-05-19 18:51:00 -04:00
Matt Arsenault	08ae945318	GlobalISel: Copy correct flags to select This was looking for a compare condition, and copying the compare flags. I don't think this was ever correct outside of certain min/max patterns which aren't checked, but this probably predates select instructions having fast math flags.	2020-05-19 18:31:24 -04:00
Matt Arsenault	074b802654	AMDGPU: Fix DAG divergence for implicit function arguments This should be directly implied from the register class, and there's no need to special case live ins here. This was getting the wrong answer for the queue ptr argument in callable functions, since it's not an explicit IR argument and is always uniform. Fixes not using scalar loads for the aperture in addrspacecast lowering, and any other places that use implicit SGPR arguments.	2020-05-19 18:11:34 -04:00
Matt Arsenault	61813b8069	AMDGPU: Use member initializers in MFI	2020-05-19 18:11:34 -04:00
Brian Cain	cfba1a9668	[Hexagon] pX.new cannot be used with p3:0 as producer Writes to p3:0 do not produce new values, we should bar any .new consumer trying to use it as a producer.	2020-05-19 17:06:34 -05:00
Matt Arsenault	e6658079ac	GlobalISel: Remove unused include	2020-05-19 17:56:55 -04:00
Matt Arsenault	4dad4914f7	CodeGen: Use Register	2020-05-19 17:56:55 -04:00
Eli Friedman	5d2c3a0b8c	[AArch64] Disable MachineOutliner on Windows. The handling of unwind info is broken, so disable it for now.	2020-05-19 13:49:03 -07:00
Benjamin Kramer	350dadaa8a	Give helpers internal linkage. NFC.	2020-05-19 22:16:37 +02:00
Lei Huang	2e6e27583c	[PowerPC][NFC] Cleanup load/store spilling code Summary: Cleanup and commonize code used for spilling to the stack. Reviewers: stefanp, nemanjai, #powerpc, kamaub Reviewed By: nemanjai, #powerpc, kamaub Subscribers: kamaub, hiraditya, wuzish, shchenz, llvm-commits, kbarton Tags: #llvm, #powerpc Differential Revision: https://reviews.llvm.org/D79736	2020-05-19 14:57:32 -05:00
Thomas Lively	8a43d41a40	[WebAssembly] Fix bug in custom shuffle combine Summary: The code previously assumed the source of the bitcast in the combined pattern was a vector type, but this is not always true. This patch adds a check to avoid an assertion failure in that case. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80164	2020-05-19 12:54:15 -07:00
Thomas Lively	3181273be7	[WebAssembly] Implement i64x2.mul and remove i8x16.mul Summary: This reflects changes in the spec proposal made since basic arithmetic was first implemented. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D80174	2020-05-19 12:50:44 -07:00
Jay Foad	59f49f7ee7	[IR] Simplify BasicBlock::removePredecessor. NFCI. Differential Revision: https://reviews.llvm.org/D80141	2020-05-19 19:34:49 +01:00


134713 Commits