llvm-project

Commit Graph

Author	SHA1	Message	Date
Fangrui Song	10a2162588	Use unique_ptr to hold AsmInfo,MRI,MII,STI Reviewers: pcc, dblaikie Reviewed By: dblaikie Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52389 llvm-svn: 342945	2018-09-25 06:19:31 +00:00
Mikael Holmen	adf5e0d91d	Use TRI->regsOverlap() in MachineBasicBlock::computeRegisterLiveness Summary: For the loop that used MCRegAliasIterator this should be NFC. For the loop that previously used MCSubRegIterator we should now detect more cases where the register is actually live out that we previously missed. Reviewers: MatzeB, arsenm Reviewed By: MatzeB Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D52410 llvm-svn: 342944	2018-09-25 06:10:04 +00:00
Hsiangkai Wang	9c2463622d	[DebugInfo] Do not generate address info for removed debug labels. In some senario, LLVM will remove llvm.dbg.labels in IR. For example, when the labels are in unreachable blocks, these labels will not be generated in LLVM IR. In the case, these debug labels will have address zero as their address. It is not legal address for debugger to set breakpoints or query sources. So, the patch inhibits the address info (DW_AT_low_pc) of removed labels. Differential Revision: https://reviews.llvm.org/D51908 llvm-svn: 342943	2018-09-25 06:09:50 +00:00
Justin Bogner	e152483623	[MachineCopyPropagation] Reimplement CopyTracker in terms of register units Change the copy tracker to keep a single map of register units instead of 3 maps of registers. This gives a very significant compile time performance improvement to the pass. I measured a 30-40% decrease in time spent in MCP on x86 and AArch64 and much more significant improvements on out of tree targets with more registers. Differential Revision: https://reviews.llvm.org/D52374 llvm-svn: 342942	2018-09-25 05:16:44 +00:00
Justin Bogner	db02d3d4b3	[MachineCopyPropagation] Rework how we manage RegMask clobbers Instead of updating the CopyTracker's maps each time we come across a RegMask, defer checking for this kind of interference until we're actually trying to propagate a copy. This avoids the need to repeatedly iterate over maps in the cases where we don't end up doing any work. This is a slight compile time improvement for MachineCopyPropagation as is, but it also enables a much bigger improvement that I'll follow up with soon. Differential Revision: https://reviews.llvm.org/D52370 llvm-svn: 342940	2018-09-25 04:45:25 +00:00
Fedor Sergeev	662e5686fe	[New PM][PassInstrumentation] IR printing support for New Pass Manager Implementing -print-before-all/-print-after-all/-filter-print-func support through PassInstrumentation callbacks. - PrintIR routines implement printing callbacks. - StandardInstrumentations class provides a central place to manage all the "standard" in-tree pass instrumentations. Currently it registers PrintIR callbacks. Reviewers: chandlerc, paquette, philip.pfaffe Differential Revision: https://reviews.llvm.org/D50923 llvm-svn: 342896	2018-09-24 16:08:15 +00:00
Sanjay Patel	2c901742ca	[DAGCombiner] use UADDO to optimize saturated unsigned add This is a preliminary step towards solving PR14613: https://bugs.llvm.org/show_bug.cgi?id=14613 If we have an 'add' instruction that sets flags, we can use that to eliminate an explicit compare instruction or some other instruction (cmn) that sets flags for use in the later select. As shown in the unchanged tests that use 'icmp ugt %x, %a', we're effectively reversing an IR icmp canonicalization that replaces a variable operand with a constant: https://rise4fun.com/Alive/V1Q But we're not using 'uaddo' in those cases via DAG transforms. This happens in CGP after D8889 without checking target lowering to see if the op is supported. So AArch already shows 'uaddo' codegen for the i8/i16/i32/i64 test variants with "using_cmp_sum" in the title. That's the pattern that CGP matches as an unsigned saturated add and converts to uaddo without checking target capabilities. This patch is gated by isOperationLegalOrCustom(ISD::UADDO, VT), so we see only see AArch diffs for i32/i64 in the tests with "using_cmp_notval" in the title (unlike x86 which sees improvements for all sizes because all sizes are 'custom'). But the AArch code (like x86) looks better when translated to 'uaddo' in all cases. So someone that is involved with AArch may want to set i8/i16 to 'custom' for UADDO, so this patch will fire on those tests. Another possibility given the existing behavior: we could remove the legal-or-custom check altogether because we're assuming that a UADDO sequence is canonical/optimal before we ever reach here. But that seems like a bug to me. If the target doesn't have an add-with-flags op, then it's not likely that we'll get optimal DAG combining using a UADDO node. This is similar justification for why we don't canonicalize IR to the overflow math intrinsic sibling (llvm.uadd.with.overflow) for UADDO in the first place. Differential Revision: https://reviews.llvm.org/D51929 llvm-svn: 342886	2018-09-24 14:47:15 +00:00
Hans Wennborg	83d15dfe2d	Remove debug printf leftover from r342397 llvm-svn: 342863	2018-09-24 08:18:47 +00:00
Craig Topper	5bef27e808	[DAGCombiner] Remove some dead code from ConstantFoldBITCASTofBUILD_VECTOR This code handled SCALAR_TO_VECTOR being returned by the recursion, but the code that used to return SCALAR_TO_VECTOR was removed in 2015. llvm-svn: 342856	2018-09-24 02:03:11 +00:00
Craig Topper	b3b94a8e8b	[DAGCombiner] Clarify a comment. NFC This comment was misleading about why we were restricting to before legalize types. The reason given would only apply to before legalize ops. But there is a before legalize types reason that should also be listed. llvm-svn: 342851	2018-09-23 21:17:56 +00:00
Craig Topper	bec5967176	[LegalizeTypes] Fix bad indentation. NFC llvm-svn: 342850	2018-09-23 21:17:55 +00:00
Sanjay Patel	0027946915	[DAGCombiner][x86] extend decompose of integer multiply into shift/add with negation This is an alternative to https://reviews.llvm.org/D37896. We can't decompose multiplies generically without a target hook to tell us when it's profitable. ARM and AArch64 may be able to remove some existing code that overlaps with this transform. This extends D52195 and may resolve PR34474: https://bugs.llvm.org/show_bug.cgi?id=34474 (still an open question about transforming legal vector multiplies, but we could open another bug report for those) llvm-svn: 342844	2018-09-23 18:41:38 +00:00
Craig Topper	81f67f7afb	[DAGCombiner] Simplify some code in visitBITCAST. NFCI llvm-svn: 342826	2018-09-22 23:12:34 +00:00
Craig Topper	e79a588cac	[DAGCombiner] Rewrite r331896 in a different way to address a FIXME. NFCI llvm-svn: 342809	2018-09-22 18:03:14 +00:00
Justin Bogner	45b3ddc5a4	[MachineCopyPropagation] Refactor copy tracking into a class. NFC This is a bit easier to follow than handling the copy and src maps directly in the pass, and will make upcoming changes to how this is done easier to follow. llvm-svn: 342703	2018-09-21 00:51:04 +00:00
Justin Bogner	927b75dfba	[MachineCopyPropagation] Minor clang-formatting. NFC llvm-svn: 342700	2018-09-21 00:08:33 +00:00
Aditya Nandakumar	e5909431b5	Add the ability to register callbacks for removal and insertion of MachineInstrs https://reviews.llvm.org/D52127 This patch adds the ability to watch for insertions/deletions of MachineInstructions similar to MachineRegisterInfo. llvm-svn: 342696	2018-09-20 23:01:56 +00:00
Jessica Paquette	b320ca2642	[MachineOutliner][NFC] Don't add MBBs with a size < 2 to the search space The suffix tree won't ever consider sequences with a length less than 2. Therefore, we really ought to not even consider them in the first place. Also add a FIXME explaining that this should be defined in terms of the size in B of an outlined call versus the size in B of the MBB. llvm-svn: 342688	2018-09-20 21:53:25 +00:00
Walter Lee	f75e803679	[RegAllocGreedy] Fix crash in tryLocalSplit tryLocalSplit only handles a single use block, but an interval may have multiple use blocks. So don't crash in that case. This fixes PR38795. Differential revision: https://reviews.llvm.org/D52277 llvm-svn: 342682	2018-09-20 20:05:57 +00:00
Jessica Paquette	cc06a782ba	[MachineOutliner][NFC] Move debug info emission to createOutlinedFunction When you create an outlined function, you know everything you need to know to decide if debug info should be created. If we emit debug info in createOutlinedFunction, then we don't need to keep track of every IR function we create. llvm-svn: 342677	2018-09-20 18:53:53 +00:00
Sanjay Patel	8a1227ccc8	[SelectionDAG] replace duplicated peekThroughBitcast helper functions; NFCI x86 had 2 versions of peekThroughBitcast. DAGCombiner had 1. Plus, it had a 1-off implementation for the one-use variant. Move the x86 versions of the code to SelectionDAG, so we don't have different copies of the code. No functional change intended. I'm putting this next to isBitwiseNot() because I am planning to use it in there. Another option is next to the helpers in the ISD namespace (eg, ISD::isConstantSplatVector()). But if there's no good reason for those to be there, I'd prefer to pull other helpers over to SelectionDAG in follow-up steps. Differential Revision: https://reviews.llvm.org/D52285 llvm-svn: 342669	2018-09-20 17:34:08 +00:00
George Rimar	425f75172f	[DWARF] - Emit the correct value for DW_AT_addr_base. Currently, we emit DW_AT_addr_base that points to the beginning of the .debug_addr section. That is not correct for the DWARF5 case because address table contains the header and the attribute should point to the first entry following the header. This is currently the reason why LLDB does not work with such executables correctly. Patch fixes the issue. Differential revision: https://reviews.llvm.org/D52168 llvm-svn: 342635	2018-09-20 09:17:36 +00:00
Bjorn Pettersson	b2154af25f	[MachineVerifier] Relax checkLivenessAtDef regarding dead subreg defs Summary: Consider an instruction that has multiple defs of the same vreg, but defining different subregs: %7.sub1:rc, dead %7.sub2:rc = inst Calling checkLivenessAtDef for the live interval associated with %7 incorrectly reported "live range continues after a dead def". The live range for %7 has a dead def at the slot index for "inst" even if the live range continues (given that there are later uses of %7.sub1). This patch adjusts MachineVerifier::checkLivenessAtDef to allow dead subregister definitions, unless we are checking a subrange (when tracking subregister liveness). A limitation is that we do not detect the situation when the live range continues past an instruction that defines the full virtual register by multiple dead subreg defines. I also removed some dead code related to physical register in checkLivenessAtDef. Wwe only call that method for virtual registers, so I added an assertion instead. Reviewers: kparzysz Reviewed By: kparzysz Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52237 llvm-svn: 342618	2018-09-20 06:59:18 +00:00
Sanjay Patel	fdc0de19cb	[SelectionDAG] allow vector types with isBitwiseNot() The test diff in not-and-simplify.ll is from a use in SimplifyDemandedBits, and the test diff in add.ll is from a DAGCombiner transform. llvm-svn: 342594	2018-09-19 21:48:30 +00:00
Matthias Braun	3136e42039	MachineScheduler: Add -misched-print-dags flag Add a flag to dump the schedule DAG to the debug stream. This will be used in upcoming commits to test schedule DAG mutations such as macro fusion. llvm-svn: 342589	2018-09-19 20:50:49 +00:00
Michael Berg	894c39f770	Copy utilities updated and added for MI flags Summary: This patch adds a GlobalIsel copy utility into MI for flags and updates the instruction emitter for the SDAG path. Some tests show new behavior and I added one for GlobalIsel which mirrors an SDAG test for handling nsw/nuw. Reviewers: spatel, wristow, arsenm Reviewed By: arsenm Subscribers: wdng Differential Revision: https://reviews.llvm.org/D52006 llvm-svn: 342576	2018-09-19 18:52:08 +00:00
Sanjay Patel	4fd2e2a498	[DAGCombiner][x86] add transform/hook to decompose integer multiply into shift/add This is an alternative to D37896. I don't see a way to decompose multiplies generically without a target hook to tell us when it's profitable. ARM and AArch64 may be able to remove some duplicate code that overlaps with this transform. As a first step, we're only getting the most clear wins on the vector examples requested in PR34474: https://bugs.llvm.org/show_bug.cgi?id=34474 As noted in the code comment, it's likely that the x86 constraints are tighter than necessary, but it may not always be a win to replace a pmullw/pmulld. Differential Revision: https://reviews.llvm.org/D52195 llvm-svn: 342554	2018-09-19 15:57:40 +00:00
Alex Bradbury	79518b02cd	[AtomicExpandPass]: Add a hook for custom cmpxchg expansion in IR This involves changing the shouldExpandAtomicCmpXchgInIR interface, but I have updated the in-tree backends using this hook (ARM, AArch64, Hexagon) so they will see no functional change. Previously this hook returned bool, but it now returns AtomicExpansionKind. This hook allows targets to select how a given cmpxchg is to be expanded. D48131 uses this to expand part-word cmpxchg to a target-specific intrinsic. See my associated RFC for more info on the motivation for this change <http://lists.llvm.org/pipermail/llvm-dev/2018-June/123993.html>. Differential Revision: https://reviews.llvm.org/D48130 llvm-svn: 342550	2018-09-19 14:51:42 +00:00
Alex Bradbury	21aea51e71	[RISCV] Codegen for i8, i16, and i32 atomicrmw with RV32A Introduce a new RISCVExpandPseudoInsts pass to expand atomic pseudo-instructions after register allocation. This is necessary in order to ensure that register spills aren't introduced between LL and SC, thus breaking the forward progress guarantee for the operation. AArch64 does something similar for CmpXchg (though only at O0), and Mips is moving towards this approach (see D31287). See also [this mailing list post](http://lists.llvm.org/pipermail/llvm-dev/2016-May/099490.html) from James Knight, which summarises the issues with lowering to ll/sc in IR or pre-RA. See the [accompanying RFC thread](http://lists.llvm.org/pipermail/llvm-dev/2018-June/123993.html) for an overview of the lowering strategy. Differential Revision: https://reviews.llvm.org/D47882 llvm-svn: 342534	2018-09-19 10:54:22 +00:00
Matthias Braun	726e12cf0c	ScheduleDAG: Cleanup dumping code; NFC - Instead of having both `SUnit::dump(ScheduleDAG)` and `ScheduleDAG::dumpNode(ScheduleDAG)`, just keep the latter around. - Add `ScheduleDAG::dump()` and avoid code duplication in several places. Implement it for different ScheduleDAG variants. - Add `ScheduleDAG::dumpNodeName()` in favor of the `SUnit::print()` functions. They were only ever used for debug dumping and putting the function into ScheduleDAG is consistent with the `dumpNode()` change. llvm-svn: 342520	2018-09-19 00:23:35 +00:00
Krzysztof Parzyszek	c1e2f39b35	[PostRASink] Make sure to remove subregisters from live-ins as well llvm-svn: 342492	2018-09-18 16:10:51 +00:00
Hans Wennborg	01c3154971	Revert r342457 "Fixes removal of dead elements from PressureDiff (PR37252)." This broke the lit tests on a bunch of buildbots, e.g. http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast/builds/36679 > Reviewed By: MatzeB > > Differential Revision: https://reviews.llvm.org/D51495 llvm-svn: 342482	2018-09-18 14:12:54 +00:00
John Brawn	83d7414e19	[TargetLowering] Android has sincos functions Since Android API version 9 the Android libm has had the sincos functions, so they should be recognised as libcalls and sincos optimisation should be applied. Differential Revision: https://reviews.llvm.org/D52025 llvm-svn: 342471	2018-09-18 13:18:21 +00:00
Yury Gribov	53db663afb	Fixes removal of dead elements from PressureDiff (PR37252). Reviewed By: MatzeB Differential Revision: https://reviews.llvm.org/D51495 llvm-svn: 342457	2018-09-18 09:53:42 +00:00
Jessica Paquette	bd72988c3a	[MachineOutliner][NFC] Don't map more illegal instrs than you have to We were mapping an instruction every time we saw something we couldn't map before this. Since each illegal mapping is unique, we only have to do this once. This makes it so that we don't map illegal instructions when the previous mapped instruction was illegal. In CTMark (AArch64), this results in 240 fewer instruction mappings on average over 619 files in total. The largest improvement is 12576 fewer mappings in one file, and the smallest is 0. The median improvement is 101 fewer mappings. llvm-svn: 342405	2018-09-17 18:40:21 +00:00
Amara Emerson	91c2913522	Revert "Revert r342183 "[DAGCombine] Fix crash when store merging created an extract_subvector with invalid index."" Fixed the assertion failure. llvm-svn: 342397	2018-09-17 14:40:13 +00:00
Kristina Brooks	46c6d3fe75	[DebugInfo] Fix build when std::vector::iterator is a pointer std::vector::iterator type may be a pointer, then iterator::value_type fails to compile since iterator is not a class, namespace, or enumeration. Patch by orivej (Orivej Desh) Differential Revision: https://reviews.llvm.org/D52142 llvm-svn: 342354	2018-09-16 22:21:59 +00:00
Sanjay Patel	3eaf500a6d	[DAGCombiner] try to convert pow(x, 1/3) to cbrt(x) This is a follow-up suggested in D51630 and originally proposed as an IR transform in D49040. Copying the motivational statement by @evandro from that patch: "This transformation helps some benchmarks in SPEC CPU2000 and CPU2006, such as 188.ammp, 447.dealII, 453.povray, and especially 300.twolf, as well as some proprietary benchmarks. Otherwise, no regressions on x86-64 or A64." I'm proposing to add only the minimum support for a DAG node here. Since we don't have an LLVM IR intrinsic for cbrt, and there are no other DAG ways to create a FCBRT node yet, I don't think we need to worry about DAG builder, legalization, a strict variant, etc. We should be able to expand as needed when adding more functionality/transforms. For reference, these are transform suggestions currently listed in SimplifyLibCalls.cpp: // * cbrt(expN(X)) -> expN(x/3) // * cbrt(sqrt(x)) -> pow(x,1/6) // * cbrt(cbrt(x)) -> pow(x,1/9) Also, given that we bail out on long double for now, there should not be any logical differences between platforms (unless there's some platform out there that has pow() but not cbrt()). Differential Revision: https://reviews.llvm.org/D51753 llvm-svn: 342348	2018-09-16 16:50:26 +00:00
Vedant Kumar	1b02dad9f2	[CodeGenPrepare] Preserve debug locs in OptimizeExtractBits CodeGenPrepare has a transform that sinks {lshr, trunc} pairs to make it easier for the backend to emit fancy extract-bits instructions (e.g UBFX). Teach it to preserve debug locations and salvage debug values. llvm-svn: 342319	2018-09-15 04:08:52 +00:00
Craig Topper	5692ac6ce7	[BreakFalseDeps] Fix bad formatting. NFC llvm-svn: 342293	2018-09-14 22:26:09 +00:00
Reid Kleckner	b3d456a79e	[codeview] Remove dead code llvm-svn: 342285	2018-09-14 21:14:08 +00:00
Reid Kleckner	4d1b75c6b7	Revert r342183 "[DAGCombine] Fix crash when store merging created an extract_subvector with invalid index." Causes 'isVector() && "Invalid vector type!"' assertion when building Skia in Chrome. llvm-svn: 342265	2018-09-14 19:39:40 +00:00
Adrian Prantl	16f58d1850	Fix debug info for SelectionDAG legalization of DAG nodes with two results. This patch fixes the debug info handling for SelectionDAG legalization of DAG nodes with two results. When an replaced SDNode has more than one result, transferDbgValues was always copying the SDDbgValue from the first result and attaching them to all members. In reality SelectionDAG::ReplaceAllUsesWith() is given an array of SDNodes (though the type signature doesn't make this obvious (cf. the call site code in ReplaceNode()). rdar://problem/44162227 Differential Revision: https://reviews.llvm.org/D52112 llvm-svn: 342264	2018-09-14 19:38:45 +00:00
Adrian Prantl	66945cf6e3	fix noasserts build llvm-svn: 342247	2018-09-14 17:32:52 +00:00
Adrian Prantl	55b8756b8a	SelectionDAG: Add compact SDDbgValue representation to -dag-dump-verbose output llvm-svn: 342245	2018-09-14 17:08:02 +00:00
Adrian Prantl	86497ad2af	fix typos llvm-svn: 342241	2018-09-14 16:12:14 +00:00
Amara Emerson	ef600cbd86	[DAGCombine] Fix crash when store merging created an extract_subvector with invalid index. Differential Revision: https://reviews.llvm.org/D51831 llvm-svn: 342183	2018-09-13 21:28:58 +00:00
Craig Topper	2f88006ced	[MachineInstr] In addRegisterKilled and addRegisterDead, don't remove operands from inline assembly instructions if they have an associated flag operand. INLINEASM instructions use extra operands to carry flags. If a register operand is removed without removing the flag operand, then the flags will no longer make sense. This patch fixes this by preventing the removal when a flag operand is present. The included test case was generated by MS inline assembly. Longer term maybe we should fix the inline assembly parsing to not generate redundant operands. Differential Revision: https://reviews.llvm.org/D51829 llvm-svn: 342176	2018-09-13 20:51:27 +00:00
Matt Arsenault	842cda6312	DAG: Fix expansion of unaligned FP loads and stores This was trying to scalarizing a scalar FP type, resulting in an assert. Fixes unaligned f64 stack stores for AMDGPU. llvm-svn: 342132	2018-09-13 12:14:23 +00:00
Tim Northover	c15d47bb01	ARM: align loops to 4 bytes on Cortex-M3 and Cortex-M4. The Technical Reference Manuals for these two CPUs state that branching to an unaligned 32-bit instruction incurs an extra pipeline reload penalty. That's bad. This also enables the optimization at -Os since it costs on average one byte per loop in return for 1 cycle per iteration, which is pretty good going. llvm-svn: 342127	2018-09-13 10:28:05 +00:00

1 2 3 4 5 ...

24977 Commits