llvm-project

Commit Graph

Author	SHA1	Message	Date
Alexander Kornienko	3635c89070	Fix uninitialized variable. Flags variable was not initialized and later used (both isMBBSafeToOutlineFrom implementations assume it's initialized), which breaks test/CodeGen/AArch64/machine-outliner.mir. under memory sanitizer: MemorySanitizer: use-of-uninitialized-value #0 in llvm::AArch64InstrInfo::getOutliningType(llvm::MachineInstrBundleIterator<llvm::MachineInstr, false>&, unsigned int) const llvm/lib/Target/AArch64/AArch64InstrInfo.cpp:5494:9 #1 in (anonymous namespace)::InstructionMapper::convertToUnsignedVec(llvm::MachineBasicBlock&, llvm::TargetInstrInfo const&) llvm/lib/CodeGen/MachineOutliner.cpp:772:19 #2 in (anonymous namespace)::MachineOutliner::populateMapper((anonymous namespace)::InstructionMapper&, llvm::Module&, llvm::MachineModuleInfo&) llvm/lib/CodeGen/MachineOutliner.cpp:1543:14 #3 in (anonymous namespace)::MachineOutliner::runOnModule(llvm::Module&) llvm/lib/CodeGen/MachineOutliner.cpp:1645:3 #4 in (anonymous namespace)::MPPassManager::runOnModule(llvm::Module&) llvm/lib/IR/LegacyPassManager.cpp:1744:27 #5 in llvm::legacy::PassManagerImpl::run(llvm::Module&) llvm/lib/IR/LegacyPassManager.cpp:1857:44 #6 in compileModule(char**, llvm::LLVMContext&) llvm/tools/llc/llc.cpp:597:8 llvm-svn: 346761	2018-11-13 16:41:05 +00:00
Craig Topper	0b33b468a1	[DAGCombiner] Enable tryToFoldExtendOfConstant to run after legalize vector ops It should be ok to create a new build_vector after legal operations so long as it doesn't cause an infinite loop in DAG combiner. Unfortunately, X86's custom constant folding in combineVSZext is hiding any test changes from this. But I'm trying to get to a point where that X86 specific code isn't necessary at all. Differential Revision: https://reviews.llvm.org/D54285 llvm-svn: 346728	2018-11-13 01:59:32 +00:00
Jessica Paquette	82d9c0a3fa	[MachineOutliner][NFC] Change getMachineOutlinerMBBFlags to isMBBSafeToOutlineFrom Instead of returning Flags, return true if the MBB is safe to outline from. This lets us check for unsafe situations, like say, in AArch64, X17 is live across a MBB without being defined in that MBB. In that case, there's no point in performing an instruction mapping. llvm-svn: 346718	2018-11-12 23:51:32 +00:00
Philip Reames	e44a55dc98	[GC][NFC] Simplify code now that we only have one safepoint kind This is the NFC follow up to exploit the semantic simplification from r346701 llvm-svn: 346712	2018-11-12 22:03:53 +00:00
Ali Tamur	d482b01a62	Use a data structure better suited for large sets in SimplificationTracker. Summary: D44571 changed SimplificationTracker to use SmallSetVector to keep phi nodes. As a result, when the number of phi nodes is large, the build time performance suffers badly. When building for power pc, we have a case where there are more than 600.000 nodes, and it takes too long to compile. In this change, I partially revert D44571 to use SmallPtrSet, which does an acceptable job with any number of elements. In the original patch, having a deterministic iteration order was mentioned as a motivation, however I think it only applies to the nodes already matched in MatchPhiSet method, which I did not touch. Reviewers: bjope, skatkov Reviewed By: bjope, skatkov Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D54007 llvm-svn: 346710	2018-11-12 21:43:43 +00:00
Philip Reames	c75a0c3f69	[GC] Remove so called PreCall safepoints Remove another bit of unused configuration potential from GCStrategy. It's not entirely clear what the intention here was, but from the docs, it sounds like this may have been subsumed by patchable call support. Note: This change is deliberately small to make it clear that while implemented, there's nothing using the option. A following NFC will do most of the simplifications. llvm-svn: 346701	2018-11-12 20:15:34 +00:00
Stanislav Mekhanoshin	5f9513147a	Fix MachineInstr::findRegisterUseOperandIdx subreg checks The function only checks that instruction reads a super-register containing requested physical register. In case if a sub-register if being read that is also a use of a super-reg, so added the check. In particular MI->readsRegister() is broken because of the missing check. The resulting check is essentially regsOverlap(). Differential Revision: https://reviews.llvm.org/D54128 llvm-svn: 346686	2018-11-12 18:12:28 +00:00
Jessica Paquette	9702144341	[MachineOutliner][NFC] Early exit pruning when candidates don't share an MBB There's no way they can overlap in this case. This can save a few iterations when the candidate is close to the beginning of a MachineBasicBlock. It's particularly useful when the average length of a MachineBasicBlock in the program is small. llvm-svn: 346682	2018-11-12 17:50:56 +00:00
Jessica Paquette	3954272ac1	[MachineOutliner][NFC] Put suffix tree in buildCandidateList It's only used there, so it doesn't make much sense to have it in runOnModule. llvm-svn: 346681	2018-11-12 17:50:55 +00:00
Paul Robinson	5b302bfc8e	[DWARFv5] Emit split type units in .debug_info.dwo. Differential Revision: https://reviews.llvm.org/D54350 llvm-svn: 346674	2018-11-12 16:55:11 +00:00
Nirav Dave	a395e2df56	[DAGCombiner] Fix load-store forwarding of indexed loads. Summary: Handle extra output from index loads in cases where we wish to forward a load value directly from a preceeding store. Fixes PR39571. Reviewers: peter.smith, rengolin Subscribers: javed.absar, hiraditya, arphaman, llvm-commits Differential Revision: https://reviews.llvm.org/D54265 llvm-svn: 346654	2018-11-12 14:05:40 +00:00
Philip Reames	8b48ceac80	[GC] Remove unused configuration variable The custom root mechanism didn't actually do anything. ShadowStackGC, the only one which used it, just removed the gcroots before they reached the normal lowering in SelectionDAG. As a result, the state flag had no value. llvm-svn: 346632	2018-11-12 02:34:54 +00:00
Philip Reames	1559021751	[GC] Minor style modernization llvm-svn: 346631	2018-11-12 02:26:26 +00:00
Philip Reames	18945d6c99	[GCRoot] Remove some unneccessary complexity The GCStrategy provides three configuration options were are largely redundant. 1) Support for conditionally lowering gcread and gcwrite to loads and stores. This is redundant since any GC which wished to use these abstractions would lower them out of existance before the built in lowering anyways. As such, there's no need to have the lowering being conditional. 2) Conditional initialization for allocas marked via gcroot. Semantically, roots have to be initialized before first potential use. Arguably, the frontend really should have responsibility for that, but the old API allowed the frontend to ignore this detail. Only one builtin GC used the non-initializing mode. Since no one to my knowledge actually uses the ErlangGC strategy, I decide the slight pessimization was worth the simplicity. If that turns out to be problematic, we can always improve the insertion algorithm to detect more existing initializing stores. llvm-svn: 346621	2018-11-11 21:13:09 +00:00
Craig Topper	d23cdbbeb2	[DAGCombiner] Make tryToFoldExtendOfConstant return an SDValue instead of an SDNode*. NFC Removes the need to call getNode internally and to recreate an SDValue after the call. llvm-svn: 346600	2018-11-10 23:46:03 +00:00
Sanjay Patel	0a515595a7	[x86] allow vector load narrowing with multi-use values This is a long-awaited follow-up suggested in D33578. Since then, we've picked up even more opportunities for vector narrowing from changes like D53784, so there are a lot of test diffs. Apart from 2-3 strange cases, these are all wins. I've structured this to be no-functional-change-intended for any target except for x86 because I couldn't tell if AArch64, ARM, and AMDGPU would improve or not. All of those targets have existing regression tests (4, 4, 10 files respectively) that would be affected. Also, Hexagon overrides the shouldReduceLoadWidth() hook, but doesn't show any regression test diffs. The trade-off is deciding if an extra vector load is better than a single wide load + extract_subvector. For x86, this is almost always better (on paper at least) because we often can fold loads into subsequent ops and not increase the official instruction count. There's also some unknown -- but potentially large -- benefit from using narrower vector ops if wide ops are implemented with multiple uops and/or frequency throttling is avoided. Differential Revision: https://reviews.llvm.org/D54073 llvm-svn: 346595	2018-11-10 20:05:31 +00:00
Philip Reames	9b8c102675	[GC] Rename a header for consistency llvm-svn: 346588	2018-11-10 16:08:10 +00:00
Matthias Braun	fb93aecf8d	RegAllocFast: Further cleanups; NFC llvm-svn: 346576	2018-11-10 00:36:27 +00:00
Philip Reames	afa1742b4b	[GC] Simplify linking of GC builtin GC strategies llvm-svn: 346569	2018-11-09 23:56:21 +00:00
Craig Topper	f2e65f8636	[SelectionDAG] Fix a -Wparentheses warning from gcc in an assert. NFC gcc wants parentheses around the logical OR since there is a logical AND for the string. llvm-svn: 346564	2018-11-09 23:11:30 +00:00
Paul Robinson	ddbde9a4ad	[DWARFv5] Emit normal type units in .debug_info comdats. Differential Revision: https://reviews.llvm.org/D54282 llvm-svn: 346540	2018-11-09 19:06:09 +00:00
Craig Topper	9a7e19b8f2	[DAGCombiner][X86][Mips] Enable combineShuffleOfScalars to run between vector op legalization and DAG legalization. Fix bad one use check in combineShuffleOfScalars It's possible for vector op legalization to generate a shuffle. If that happens we should give a chance for DAG combine to combine that with a build_vector input. I also fixed a bug in combineShuffleOfScalars that was considering the number of uses on a undef input to a shuffle. We don't care how many times undef is used. Differential Revision: https://reviews.llvm.org/D54283 llvm-svn: 346530	2018-11-09 18:04:34 +00:00
Serge Guelton	86f8b70f1b	Type safe version of MachinePassRegistry Previous version used type erasure through a `void* (*)()` pointer, which triggered gcc warning and implied a lot of reinterpret_cast. This version should make it harder to hit ourselves in the foot. Differential revision: https://reviews.llvm.org/D54203 llvm-svn: 346522	2018-11-09 17:19:45 +00:00
Zaara Syeda	5c179bf14b	[Power9] Allow gpr callee saved spills in prologue to vectors registers Currently in llvm, CalleeSavedInfo can only assign a callee saved register to stack frame index to be spilled in the prologue. We would like to enable spilling gprs to vector registers. This patch adds the capability to spill to other registers aside from just the stack. It also adds the changes for power9 to spill gprs to volatile vector registers when they are available. This happens only for leaf functions when using the option -ppc-enable-pe-vector-spills. Differential Revision: https://reviews.llvm.org/D39386 llvm-svn: 346512	2018-11-09 16:36:24 +00:00
Alexandros Lamprineas	e15c982f6d	[SelectionDAG] swap select_cc operands to enable folding The DAGCombiner tries to SimplifySelectCC as follows: select_cc(x, y, 16, 0, cc) -> shl(zext(set_cc(x, y, cc)), 4) It can't cope with the situation of reordered operands: select_cc(x, y, 0, 16, cc) In that case we just need to swap the operands and invert the Condition Code: select_cc(x, y, 16, 0, ~cc) Differential Revision: https://reviews.llvm.org/D53236 llvm-svn: 346484	2018-11-09 11:09:40 +00:00
Craig Topper	8cca8bd4aa	[SelectionDAG] Assert on the width of DemandedElts argument to computeKnownBits for all vector typed operations not just build_vector. Fix AArch64 unit test that fails with the assertion added. llvm-svn: 346437	2018-11-08 20:29:17 +00:00
Nirav Dave	6ce9f72f76	[DAGCombine] Improve alias analysis for chain of independent stores. FindBetterNeighborChains simulateanously improves the chain dependencies of a chain of related stores avoiding the generation of extra token factors. For chains longer than the GatherAllAliasDepths, stores further down in the chain will necessarily fail, a potentially significant waste and preventing otherwise trivial parallelization. This patch directly parallelize the chains of stores before improving each store. This generally improves DAG-level parallelism. Reviewers: courbet, spatel, RKSimon, bogner, efriedma, craig.topper, rnk Subscribers: sdardis, javed.absar, hiraditya, jrtc27, atanasyan, llvm-commits Differential Revision: https://reviews.llvm.org/D53552 llvm-svn: 346432	2018-11-08 19:14:20 +00:00
David Blaikie	c8f7e6c1a9	NFC: DebugInfo: Track the origin CU rather than just the base address for range lists Turns out knowing more than just the base address might be useful - specifically a future change to respect a DICompileUnit flag for the use of base address specifiers in DWARF < 5. llvm-svn: 346380	2018-11-08 00:35:54 +00:00
Jessica Paquette	c4cf775ae0	[MachineOutliner][NFC] Only map blocks which have adjacent legal instructions If a block doesn't have any ranges of adjacent legal instructions, then it can't have outlining candidates. There's no point in mapping legal isntructions in situations like this. I noticed this reduces the size of the suffix tree in sqlite3 for AArch64 at -Oz by about 3%. llvm-svn: 346379	2018-11-08 00:33:38 +00:00
Jessica Paquette	267d266c29	[MachineOutliner][NFC] Don't map MBBs that don't contain legal instructions I noticed that there are lots of basic blocks that don't have enough legal instructions in them to warrant outlining. We can skip mapping these entirely. In sqlite3, compiled for AArch64 at -Oz, this results in a 10% reduction of the total nodes in the suffix tree. These nodes can never be part of a repeated substring, and so they don't impact the result at all. Before this, there were 62128 nodes in the tree for sqlite3. After this, there are 56457 nodes. llvm-svn: 346373	2018-11-08 00:02:11 +00:00
Jessica Paquette	df5b09b8ce	[MachineOutliner][NFC] Remove Parent field from SuffixTreeNode This is only used for calculating ConcatLen. This isn't necessary, since it's easily derived from the traversal setting suffix indices. Remove that. Rename CurrIdx to CurrNodeLen to better describe what's going on. llvm-svn: 346349	2018-11-07 19:56:13 +00:00
Jessica Paquette	a409cc959b	[MachineOutliner][NFC] Traverse suffix tree using a RepeatedSubstring iterator This takes the traversal methods introduced in r346269 and adapts them into an iterator. This allows the outliner to iterate over repeated substrings within the suffix tree directly without having to initially find all of the substrings and then iterate over them after you've found them. llvm-svn: 346345	2018-11-07 19:20:55 +00:00
Jessica Paquette	a3eb0fac3b	[MachineOutliner] Don't store outlined function numberings on OutlinedFunction NFC-ish. This doesn't change the behaviour of the outliner, but does make sure that you won't end up with say OUTLINED_FUNCTION_2: ... ret OUTLINED_FUNCTION_248: ... ret as the only outlined functions in your module. Those should really be OUTLINED_FUNCTION_0: ... ret OUTLINED_FUNCTION_1: ... ret If we produce outlined functions, they probably should have sequential numbers attached to them. This makes it a bit easier+stable to write outliner tests. The point of this is to move towards a bit more stability in outlined function names. By doing this, we at least don't rely on the traversal order of the suffix tree. Instead, we rely on the order of the candidate list, which is far more consistent. The candidate list is ordered by the end indices of candidates, so we're more likely to get a stable ordering. This is still susceptible to changes in the cost model though (like, if we suddenly find new candidates, for example). llvm-svn: 346340	2018-11-07 18:36:43 +00:00
Serge Guelton	a4d9e2293a	Fix ignorded type qualifier warning [NFC] llvm-svn: 346332	2018-11-07 16:17:30 +00:00
James Y Knight	72f76bf230	Add support for llvm.is.constant intrinsic (PR4898) This adds the llvm-side support for post-inlining evaluation of the __builtin_constant_p GCC intrinsic. Also fixed SCCPSolver::visitCallSite to not blow up when seeing a call to a function where canConstantFoldTo returns true, and one of the arguments is a struct. Updated from patch initially by Janusz Sobczak. Differential Revision: https://reviews.llvm.org/D4276 llvm-svn: 346322	2018-11-07 15:24:12 +00:00
Matthias Braun	5b7c90b4e2	RegAllocFast: Leave unassigned virtreg entries in map Set `LiveReg::PhysReg` to zero when freeing a register instead of removing it from the entry from `LiveRegMap`. This way no iterators get invalidated and we can avoid passing around and updating iterators all over the place. This does not change any allocator decisions. It is not completely NFC because the arbitrary iteration order through `LiveRegMap` in `spillAll()` changes so we may get a different order in those spill sequences (the amount of spills does not change). This is in preparation of https://reviews.llvm.org/D52010. llvm-svn: 346298	2018-11-07 06:57:03 +00:00
Matthias Braun	b0ecbef428	RegAllocFast: Further cleanups; NFC This is in preparation of https://reviews.llvm.org/D52010. llvm-svn: 346297	2018-11-07 06:57:02 +00:00
Matthias Braun	0804dca358	RegAllocFast: Refactor PhysRegState usage; NFC This is in preparation of https://reviews.llvm.org/D52010. llvm-svn: 346296	2018-11-07 06:57:00 +00:00
Matthias Braun	b4c76ff77c	RegAllocFast: Factor spill/reload creation into their own functions; NFC This is in preparation of https://reviews.llvm.org/D52010. llvm-svn: 346289	2018-11-07 02:04:12 +00:00
Matthias Braun	ebcf5437bc	RegAllocFast: Cleanups; NFC This is in preparation of https://reviews.llvm.org/D52010. llvm-svn: 346288	2018-11-07 02:04:11 +00:00
Matthias Braun	14af82a608	RegAllocFast: Rename statistic from NumCopies to NumCoalesced The metric does not return the number of remaining (or inserted) copies but the number of copies that were coalesced. Pick a more descriptive name. llvm-svn: 346287	2018-11-07 02:04:07 +00:00
Jessica Paquette	935d373db9	[MachineOutliner][NFC] Remove OccurrenceCount from SuffixTreeNode After changing the way we find candidates in r346269, this is no longer used. llvm-svn: 346275	2018-11-06 22:23:13 +00:00
Jessica Paquette	979cf1e566	[MachineOutliner][NFC] Remove IsInTree from SuffixTreeNode After changing the way we find repeated substrings in r346269, this field is no longer used by anything, so it can be removed. llvm-svn: 346274	2018-11-06 22:21:11 +00:00
Jessica Paquette	4e54ef8883	[MachineOutliner][NFC] Add findRepeatedSubstrings to SuffixTree, kill LeafVector Instead of iterating over the leaves to find repeated substrings, and walking collecting leaf children when we don't necessarily need them, let's just calculate what we need and iterate over that. By doing this, we don't have to save every leaf. It's easier to read the code too and understand what's going on. The goal here, at the end of the day, is to set up to allow us to do something like for (RepeatedSubstring &RS : ST) { ... do stuff with RS ... } Which would let us perform the cost model stuff and the repeated substring query at the same time. llvm-svn: 346269	2018-11-06 21:46:41 +00:00
Matthias Braun	c6613879ce	LivePhysRegs/IfConversion: Change some types from unsigned to MCPhysReg; NFC Change the type in a couple of lists and sets that only store physical registers from unsigned to MCPhysRegs. The later is only 16bits and saves us a bit of memory. llvm-svn: 346254	2018-11-06 19:00:11 +00:00
Matthias Braun	7a75a91b5b	MachineFunction: Store more specific reference to LLVMTargetMachine; NFC MachineFunction can only be used in code using lib/CodeGen, hence we can keep a more specific reference to LLVMTargetMachine rather than just TargetMachine around. Do the same for references in ScheduleDAG and RegUsageInfoCollector. llvm-svn: 346183	2018-11-05 23:49:14 +00:00
Matthias Braun	3d849f67cb	MachineModuleInfo: Store more specific reference to LLVMTargetMachine; NFC MachineModuleInfo can only be used in code using lib/CodeGen, hence we can keep a more specific reference to LLVMTargetMachine rather than just TargetMachine around. llvm-svn: 346182	2018-11-05 23:49:13 +00:00
Cameron McInally	9757d5d6c1	[FPEnv] Add constrained CEIL/FLOOR/ROUND/TRUNC intrinsics Differential Revision: https://reviews.llvm.org/D53411 llvm-svn: 346141	2018-11-05 15:59:49 +00:00
Simon Pilgrim	6bd468bd8b	[TargetLowering] Begin generalizing TargetLowering::expandFP_TO_SINT support. NFCI. Prior to initial work to add vector expansion support, remove assumptions that we're working on scalar types. llvm-svn: 346139	2018-11-05 15:49:09 +00:00
Craig Topper	8f2f2a76b9	[DAGCombiner] Use tryFoldToZero to simplify some code and make it work correctly between LegalTypes and LegalOperations. The original code avoided creating a zero vector after type legalization, but if we're after type legalization the type we have is legal. The real hazard we need to avoid is creating a build vector after op legalization. tryFoldToZero takes care of checking for this. llvm-svn: 346119	2018-11-05 05:53:06 +00:00
Craig Topper	8d64abddd1	[DAGCombiner] Remove an unused argument from tryFoldToZero. NFC llvm-svn: 346118	2018-11-05 05:53:03 +00:00
Craig Topper	3292ea03d3	[DAGCombiner] Remove 'else' after return. NFC This makes this code consistent with the nearly identical code in visitZERO_EXTEND. llvm-svn: 346090	2018-11-04 06:56:32 +00:00
Craig Topper	1ba86188cf	[SelectionDAG] Remove special methods for creating *_EXTEND_VECTOR_INREG nodes. Move asserts into getNode. These methods were just wrappers around getNode with additional asserts (identical and repeated 3 times). But getNode already has a switch that can be used to hold these asserts that allows them to be shared for all 3 opcodes. This also enables checking on the places that create these nodes without using the wrappers. The rest of the patch is just changing all callers to use getNode directly. llvm-svn: 346087	2018-11-04 02:10:18 +00:00
Reid Kleckner	2bcb288ade	[codeview] Let the X86 backend tell us the VFRAME offset adjustment Use MachineFrameInfo's OffsetAdjustment field to pass this information from the target to CodeViewDebug.cpp. The X86 backend doesn't use it for any other purpose. This fixes PR38857 in the case where there is a non-aligned quantity of CSRs and a non-aligned quantity of locals. llvm-svn: 346062	2018-11-03 00:41:52 +00:00
Craig Topper	60c202a494	[X86] Don't emit *_extend_vector_inreg nodes when both the input and output types are legal with AVX1 We already have custom lowering for the AVX case in LegalizeVectorOps. So its better to keep the regular extend op around as long as possible. I had to qualify one place in DAG combine that created illegal vector extending load operations. This change by itself had no effect on any tests which is why its included here. I've made a few cleanups to the custom lowering. The sign extend code no longer creates an identity shuffle with undef elements. The zero extend code now emits a zero_extend_vector_inreg instead of an unpckl with a zero vector. For the high half of the custom lowering of zero_extend/any_extend, we're now using an unpckh with a zero vector or undef. Previously we used used a pshufd to move the upper 64-bits to the lower 64-bits and then used a zero_extend_vector_inreg. I think the zero vector should require less execution resources and be smaller code size. Differential Revision: https://reviews.llvm.org/D54024 llvm-svn: 346043	2018-11-02 21:09:49 +00:00
Jeremy Morse	d538352b3e	[MachineSink][DebugInfo] Correctly sink DBG_VALUEs As reported in PR38952, postra-machine-sink relies on DBG_VALUE insns being adjacent to the def of the register that they reference. This is not always true, leading to register copies being sunk but not the associated DBG_VALUEs, which gives the debugger a bad variable location. This patch collects DBG_VALUEs as we walk through a BB looking for copies to sink, then passes them down to performSink. Compile-time impact should be negligable. Differential Revision: https://reviews.llvm.org/D53992 llvm-svn: 345996	2018-11-02 16:52:48 +00:00
Simon Pilgrim	cdcbeb4997	[DAGCombiner] Remove reduceBuildVecConvertToConvertBuildVec and rely on the vectorizers instead (PR35732) reduceBuildVecConvertToConvertBuildVec vectorizes int2float in the DAGCombiner, which means that even if the LV/SLP has decided to keep scalar code using the cost models, this will override this. While there are cases where vectorization is necessary in the DAG (mainly due to legalization artefacts), I don't think this is the case here, we should assume that the vectorizers know what they are doing. Differential Revision: https://reviews.llvm.org/D53712 llvm-svn: 345964	2018-11-02 11:06:18 +00:00
Matthias Braun	e8f717aea8	LLVMTargetMachine/TargetPassConfig: Simplify handling of start/stop options; NFC - Make some TargetPassConfig methods that just check whether options have been set static. - Shuffle code in LLVMTargetMachine around so addPassesToGenerateCode only deals with TargetPassConfig now (but not with MCContext or the creation of MachineModuleInfo) llvm-svn: 345918	2018-11-02 01:31:50 +00:00
Mandeep Singh Grang	547a0d765a	[COFF, ARM64] Implement Intrinsic.sponentry for AArch64 Summary: This patch adds Intrinsic.sponentry. This intrinsic is required to correctly support setjmp for AArch64 Windows platform. Patch by: Yin Ma (yinma@codeaurora.org) Reviewers: mgrang, ssijaric, eli.friedman, TomTan, mstorsjo, rnk, compnerd, efriedma Reviewed By: efriedma Subscribers: efriedma, javed.absar, kristof.beyls, chrib, llvm-commits Differential Revision: https://reviews.llvm.org/D53996 llvm-svn: 345909	2018-11-01 23:22:25 +00:00
Craig Topper	e2483020f2	[DAGCombiner] Make the isTruncateOf call from visitZERO_EXTEND work for vectors. Remove FIXME. I'm having trouble creating a test case for the ISD::TRUNCATE part of this that shows any codegen differences. But I was able to test the setcc path which is what the test changes here cover. llvm-svn: 345908	2018-11-01 23:21:45 +00:00
Jessica Paquette	c991cf3687	[MachineOutliner][NFC] Remember when you map something illegal across MBBs Instruction mapping in the outliner uses "illegal numbers" to signify that something can't ever be part of an outlining candidate. This means that the number is unique and can't be part of any repeated substring. Because each of these is unique, we can use a single unique number to represent a range of things we can't outline. The outliner tries to leverage this using a flag which is set in an MBB when the previous instruction we tried to map was "illegal". This patch improves that logic to work across MBBs. As a bonus, this also simplifies the mapping logic somewhat. This also updates the machine-outliner-remarks test, which was impacted by the order of Candidates on an OutlinedFunction changing. This order isn't guaranteed, so I added a FIXME to fix that in a follow-up. The order of Candidates on an OutlinedFunction isn't important, so this still is NFC. llvm-svn: 345906	2018-11-01 23:09:06 +00:00
Simon Pilgrim	b34a052852	[LegalizeDAG] Add generic vector CTPOP expansion (PR32655) This patch adds support for expanding vector CTPOP instructions and removes the x86 'bitmath' lowering which replicates the same expansion. Differential Revision: https://reviews.llvm.org/D53258 llvm-svn: 345869	2018-11-01 18:22:11 +00:00
Mandeep Singh Grang	b0cdf56dd7	Revert "[COFF, ARM64] Implement Intrinsic.sponentry for AArch64" This reverts commit 585b6667b4712e3c7f32401e929855b3313b4ff2. llvm-svn: 345863	2018-11-01 17:53:57 +00:00
Sanjay Patel	c5fe3ce2ec	[DAGCombiner] make sure we have a whole-number extract before trying to narrow a vector op (PR39511) The test causes a crash because we were trying to extract v4f32 to v3f32, and the narrowing factor was then 4/3 = 1 producing a bogus narrow type. This should fix: https://bugs.llvm.org/show_bug.cgi?id=39511 llvm-svn: 345842	2018-11-01 15:41:12 +00:00
Zachary Turner	56a5a0c3ce	[CodeView] Emit the correct TypeIndex for std::nullptr_t. The TypeIndex used by cl.exe is 0x103, which indicates a SimpleTypeMode of NearPointer (note the absence of the bitness, normally pointers use a mode of NearPointer32 or NearPointer64) and a SimpleTypeKind of void. So this is basically a void, but without a specified size, which makes sense given how std::nullptr_t is defined. clang-cl was actually not emitting anything* for this. Instead, when we encountered std::nullptr_t in a DIType, we would actually just emit a TypeIndex of 0, which is obviously wrong. std::nullptr_t in DWARF is represented as a DW_TAG_unspecified_type with a name of "decltype(nullptr)", so we add that logic along with a test, as well as an update to the dumping code so that we no longer print void* when dumping 0x103 (which would previously treat Void/NearPointer no differently than Void/NearPointer64). Differential Revision: https://reviews.llvm.org/D53957 llvm-svn: 345811	2018-11-01 04:02:41 +00:00
Mandeep Singh Grang	88ad9ac720	[COFF, ARM64] Implement Intrinsic.sponentry for AArch64 Summary: This patch adds Intrinsic.sponentry. This intrinsic is required to correctly support setjmp for AArch64 Windows platform. Reviewers: mgrang, TomTan, rnk, compnerd, mstorsjo, efriedma Reviewed By: efriedma Subscribers: majnemer, chrib, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D53673 llvm-svn: 345791	2018-10-31 23:16:20 +00:00
Stanislav Mekhanoshin	222e9c11f7	Check shouldReduceLoadWidth from SimplifySetCC SimplifySetCC could shrink a load without checking for profitability or legality of such shink with a target. Added checks to prevent shrinking of aligned scalar loads in AMDGPU below dword as scalar engine does not support it. Differential Revision: https://reviews.llvm.org/D53846 llvm-svn: 345778	2018-10-31 21:24:30 +00:00
Scott Linder	92bb783cfe	[SelectionDAG] Handle constant range [0,1) in lowerRangeToAssertZExt lowerRangeToAssertZExt currently relies on something like EarlyCSE having eliminated the constant range [0,1). At -O0 this leads to an assert. Differential Revision: https://reviews.llvm.org/D53888 llvm-svn: 345770	2018-10-31 19:57:36 +00:00
Craig Topper	eeac12af6d	[SelectionDAGISel] Suppress a -Wunused-but-set-variable warning in release builds. NFC llvm-svn: 345761	2018-10-31 18:46:15 +00:00
Simon Pilgrim	077a9adb00	Fix comment typo. NFCI. llvm-svn: 345758	2018-10-31 18:19:52 +00:00
Simon Pilgrim	805cdcfe73	[SelectionDAG] SelectionDAGLegalize::ExpandBITREVERSE - ensure we use ShiftTy We should be using the getShiftAmountTy value type for shift amounts. llvm-svn: 345756	2018-10-31 18:14:14 +00:00
Daniel Sanders	3b39040ad4	[globalisel][irtranslator] Verify that DILocations aren't lost in translation Summary: Also fix a couple bugs where DILocations are lost. EntryBuilder wasn't passing on debug locations for PHI's, constants, GLOBAL_VALUE, etc. Reviewers: aprantl, vsk, bogner, aditya_nandakumar, volkan, rtereshin, aemerson Reviewed By: aemerson Subscribers: aemerson, rovka, kristof.beyls, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D53740 llvm-svn: 345743	2018-10-31 17:31:23 +00:00
Matthias Braun	8763c0c5b7	MachineModuleInfo: Initialize DbgInfoAvailable depending on debug_cus existing Before this patch DbgInfoAvailable was set to true in DwarfDebug::beginModule() or CodeViewDebug::CodeViewDebug(). This made MIR testing weird since passes would suddenly stop dealing with debug info just because we stopped the pipeline before the debug printers. This patch changes the logic to initialize DbgInfoAvailable based on the fact that debug_compile_units exist in the llvm Module. The debug printers may then override it with false in case of debug printing being disabled. Differential Revision: https://reviews.llvm.org/D53885 llvm-svn: 345740	2018-10-31 17:18:41 +00:00
David Bolvansky	d0080c3a5f	[DAGCombiner] Fold 0 div/rem X to 0 Reviewers: RKSimon, spatel, javed.absar, craig.topper, t.p.northover Reviewed By: RKSimon Subscribers: craig.topper, llvm-commits Differential Revision: https://reviews.llvm.org/D52504 llvm-svn: 345721	2018-10-31 14:18:57 +00:00
Matthias Braun	9fd397b423	ADT/STLExtras: Introduce llvm::empty; NFC This is modeled after C++17 std::empty(). Differential Revision: https://reviews.llvm.org/D53909 llvm-svn: 345679	2018-10-31 00:23:23 +00:00
Matthias Braun	a83403892a	MachineOperand/MIParser: Do not print debug-use flag, infer it The debug-use flag must be set exactly for uses on DBG_VALUEs. This is so obvious that it can be trivially inferred while parsing. This will reduce noise when printing while omitting an information that has little value to the user. The parser will keep recognizing the flag for compatibility with old `.mir` files. Differential Revision: https://reviews.llvm.org/D53903 llvm-svn: 345671	2018-10-30 23:28:27 +00:00
Cameron McInally	2ad870e785	[FPEnv] [FPEnv] Add constrained intrinsics for MAXNUM and MINNUM Differential Revision: https://reviews.llvm.org/D53216 llvm-svn: 345650	2018-10-30 21:01:29 +00:00
Craig Topper	4104c00658	[ScalarizeMaskedMemIntrin] Limit the scope of some variables that are only used inside loops. llvm-svn: 345638	2018-10-30 20:33:58 +00:00
Bjorn Pettersson	fe09a20f09	[DAGCombiner] Fix for big endian in ForwardStoreValueToDirectLoad Summary: Normalize the offset for endianess before checking if the store cover the load in ForwardStoreValueToDirectLoad. Without this we missed out on some optimizations for big endian targets. If for example having a 4 bytes store followed by a 1 byte load, loading the least significant byte from the store, the STCoversLD check would fail (see @test4 in test/CodeGen/AArch64/load-store-forwarding.ll). This patch also fixes a problem seen in an out-of-tree target. The target has i40 as a legal type, it is big endian, and the StoreSize for i40 is 48 bits. So when normalizing the offset for endianess we need to take the StoreSize into account (assuming that padding added when storing into a larger StoreSize always is added at the most significant end). Reviewers: niravd Reviewed By: niravd Subscribers: javed.absar, kristof.beyls, llvm-commits, uabelho Differential Revision: https://reviews.llvm.org/D53776 llvm-svn: 345636	2018-10-30 20:16:39 +00:00
Nirav Dave	eedd2ccd02	[DAG] Add const variants for BaseIndexOffset functions. llvm-svn: 345623	2018-10-30 18:26:43 +00:00
Jonas Paulsson	611b533f1d	[SchedModel] Fix for read advance cycles with implicit pseudo operands. The SchedModel allows the addition of ReadAdvances to express that certain operands of the instructions are needed at a later point than the others. RegAlloc may add pseudo operands that are not part of the instruction descriptor, and therefore cannot have any read advance entries. This meant that in some cases the desired read advance was nullified by such a pseudo operand, which still had the original latency. This patch fixes this by making sure that such pseudo operands get a zero latency during DAG construction. Review: Matthias Braun, Ulrich Weigand. https://reviews.llvm.org/D49671 llvm-svn: 345606	2018-10-30 15:04:40 +00:00
Sanjay Patel	8b207defea	[DAGCombiner] narrow vector binops when extraction is cheap Narrowing vector binops came up in the demanded bits discussion in D52912. I don't think we're going to be able to do this transform in IR as a canonicalization because of the risk of creating unsupported widths for vector ops, but we already have a DAG TLI hook to allow what I was hoping for: isExtractSubvectorCheap(). This is currently enabled for x86, ARM, and AArch64 (although only x86 has existing regression test diffs). This is artificially limited to not look through bitcasts because there are so many test diffs already, but that's marked with a TODO and is a small follow-up. Differential Revision: https://reviews.llvm.org/D53784 llvm-svn: 345602	2018-10-30 14:14:34 +00:00
Sanjay Patel	680c9227ca	[SelectionDAG] fix build warning for mismatched signs in compare; NFC llvm-svn: 345598	2018-10-30 13:47:19 +00:00
Francis Visoiu Mistrih	85d3f1ee8f	[llc] Error out when -print-machineinstrs is used with an unknown pass We used to assert instead of reporting an error. PR39494 llvm-svn: 345589	2018-10-30 12:07:18 +00:00
Simon Pilgrim	858303b827	[SelectionDAG] Add FoldBUILD_VECTOR to simplify new BUILD_VECTOR nodes Similar to FoldCONCAT_VECTORS, this patch adds FoldBUILD_VECTOR to simplify cases that can avoid the creation of the BUILD_VECTOR - if all the operands are UNDEF or if the BUILD_VECTOR simplifies to a copy. This exposed an assumption in some AMDGPU code that getBuildVector was guaranteed to be a BUILD_VECTOR node that I've tried to handle. Differential Revision: https://reviews.llvm.org/D53760 llvm-svn: 345578	2018-10-30 10:32:11 +00:00
David Bolvansky	dfdbb038e8	[DAGCombiner] Improve X div/rem Y fold if single bit element type Summary: Tests by @spatel, thanks Reviewers: spatel, RKSimon Reviewed By: spatel Subscribers: sdardis, atanasyan, llvm-commits, spatel Differential Revision: https://reviews.llvm.org/D52668 llvm-svn: 345575	2018-10-30 09:07:22 +00:00
Craig Topper	b293322cee	[LegalizeTypes] Teach PromoteIntRes_BITCAST to better handle a bitcast with vector output type and a vector input type that needs to be widened Summary: Previously if we had a bitcast vector output type that needs promotion and a vector input type that needs widening we would just do a stack store and load to handle the conversion. We can do a little better if we can widen the bitcast to a legal vector type the same size as the widened input type. Then we can do the bitcast between this widened type and the widened input type. Afterwards we can extract_subvector back to the original output and any_extend that. Type legalization will then circle back and handle promotion of the extract_subvector and the any_extend will just be removed. This will avoid going through the stack and allows us to remove a custom version of this legalization from X86. Reviewers: efriedma, RKSimon Reviewed By: efriedma Subscribers: javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D53229 llvm-svn: 345567	2018-10-30 03:27:15 +00:00
Matt Arsenault	0ae1b52186	Remove dead declaration llvm-svn: 345555	2018-10-30 01:12:12 +00:00
Matt Arsenault	901a95f729	Pass TRI to printReg llvm-svn: 345553	2018-10-30 01:11:31 +00:00
Jessica Paquette	e3932eeea4	[MachineOutliner] Inherit target features from parent function If a function has target features, it may contain instructions that aren't represented in the default set of instructions. If the outliner pulls out one of these instructions, and the function doesn't have the right attributes attached, we'll run into an LLVM error explaining that the target doesn't support the necessary feature for the instruction. This makes outlined functions inherit target features from their parents. It also updates the machine-outliner.ll test to check that we're properly inheriting target features. llvm-svn: 345535	2018-10-29 20:27:07 +00:00
Leonard Chan	905abe5b5d	[Intrinsic] Signed and Unsigned Saturation Subtraction Intirnsics Add an intrinsic that takes 2 integers and perform saturation subtraction on them. This is a part of implementing fixed point arithmetic in clang where some of the more complex operations will be implemented as intrinsics. Differential Revision: https://reviews.llvm.org/D53783 llvm-svn: 345512	2018-10-29 16:54:37 +00:00
Craig Topper	7a18b4bc51	[SelectionDAG] Fix bad indentation. NFC llvm-svn: 345481	2018-10-28 21:24:20 +00:00
Simon Pilgrim	3497d536f7	[TargetLowering] Move i64/vXi64 to f32/vXf32 UINT_TO_FP handling to TargetLowering::expandUINT_TO_FP. llvm-svn: 345478	2018-10-28 15:34:35 +00:00
Simon Pilgrim	9b77f0c291	[VectorLegalizer] Enable TargetLowering::expandFP_TO_UINT support. Add vector support to TargetLowering::expandFP_TO_UINT. This exposes an issue in X86TargetLowering::LowerVSELECT which was assuming that the select mask was the same width as the LHS/RHS ops - as long as the result is a sign splat we can easily sext/trunk this. llvm-svn: 345473	2018-10-28 13:07:25 +00:00
Craig Topper	c4b785ae1e	[DAGCombiner] Better constant vector support for FCOPYSIGN. Enable constant folding when both operands are vectors of constants. Turn into FNEG/FABS when the RHS is a splat constant vector. llvm-svn: 345469	2018-10-28 01:32:49 +00:00
Simon Pilgrim	3cf33fcdd6	[TargetLowering] Move LegalizeDAG FP_TO_UINT handling to TargetLowering::expandFP_TO_UINT. NFCI. First step towards fixing PR17686 and adding vector support. llvm-svn: 345452	2018-10-27 12:15:58 +00:00
Sanjin Sijaric	96f2ea3dd4	[ARM64][Windows] MCLayer support for exception handling Add ARM64 unwind codes to MCLayer, as well SEH directives that will be emitted by the frame lowering patch to follow. We only emit unwind codes into object object files for now. Differential Revision: https://reviews.llvm.org/D50166 llvm-svn: 345450	2018-10-27 06:13:06 +00:00
Sanjay Patel	0eddd4730f	[DAGCombiner] rearrange code in narrowExtractedVectorBinOp(); NFC We can extend this code to handle many more cases if an extract is cheap, so prepping for that change. llvm-svn: 345430	2018-10-26 21:32:04 +00:00
Craig Topper	7bf85f5c8d	[LegalizeTypes] Stop DAGTypeLegalizer::getSETCCWidenedResultTy from creating illegal setccs. Add checks for valid setccs The DAGTypeLegalizer::getSETCCWidenedResultTy was widening the MaskVT, but the code in convertMask called after getSETCCWidenedResultTy had no idea this widening had occurred. So none of the operands were widened when convertMask created new setccs with the widened VT. This patch removes the widening and adds some asserts to getNode to validate the types of setccs to prevent issues like this in the future. Differential Revision: https://reviews.llvm.org/D53743 llvm-svn: 345428	2018-10-26 20:59:55 +00:00
Eli Friedman	2ac1162917	[ARM] Make InstrEmitter mark CPSR defs dead for Thumb1. The "dead" markings allow existing target-independent optimizations, like MachineSink, to trigger more frequently. The CPSR defs would have eventually been marked dead by LiveVariables, so this only affects optimizations before regalloc. The ARMBaseInstrInfo.cpp change is fixing a bug which is only visible with this change: the transform adds a use to an otherwise dead def of CPSR. This is covered by existing regression tests. thumb2-tbh.ll breaks for Thumb1 due to MachineLICM changing the generated code; I'll fix it in D53452. Differential Revision: https://reviews.llvm.org/D53453 llvm-svn: 345420	2018-10-26 19:32:24 +00:00
George Rimar	088d96b43d	[Codegen] - Implement basic .debug_loclists section emission (DWARF5). .debug_loclists is the DWARF 5 version of the .debug_loc. With that patch, it will be emitted when DWARF 5 is used. Differential revision: https://reviews.llvm.org/D53365 llvm-svn: 345377	2018-10-26 11:25:12 +00:00
Heejin Ahn	24faf859e5	Reland "[WebAssembly] LSDA info generation" Summary: This adds support for LSDA (exception table) generation for wasm EH. Wasm EH mostly follows the structure of Itanium-style exception tables, with one exception: a call site table entry in wasm EH corresponds to not a call site but a landing pad. In wasm EH, the VM is responsible for stack unwinding. After an exception occurs and the stack is unwound, the control flow is transferred to wasm 'catch' instruction by the VM, after which the personality function is called from the compiler-generated code. (Refer to WasmEHPrepare pass for more information on this part.) This patch: - Changes wasm.landingpad.index intrinsic to take a token argument, to make this 1:1 match with a catchpad instruction - Stores landingpad index info and catch type info MachineFunction in before instruction selection - Lowers wasm.lsda intrinsic to an MCSymbol pointing to the start of an exception table - Adds WasmException class with overridden methods for table generation - Adds support for LSDA section in Wasm object writer Reviewers: dschuff, sbc100, rnk Subscribers: mgorny, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D52748 llvm-svn: 345345	2018-10-25 23:55:10 +00:00
Jonas Paulsson	f213f81d9c	Fix in MachineOperand::printIRValueReference(). Handle the case where getCurrentFunction() returns nullptr by passing -1 to printIRSlotNumber(). This will result in <badref> being printed instead of an assertion failure. Review: Francis Visoiu Mistrih https://reviews.llvm.org/D53333 llvm-svn: 345342	2018-10-25 23:39:07 +00:00
David Blaikie	73c2f197d2	DebugInfo: Explain why DW_LLE_(GNU_)startx_length is used This isn't the most object-size efficient encoding, but it's the only one GDB supports for the pre-standard fission format. I've written fixes for this twice now... - so perhaps this comment will help me remember why neither of these have been committed and why I shouldn't try to write a third fix another year from now... llvm-svn: 345326	2018-10-25 22:26:25 +00:00
Sumanth Gundapaneni	ada0f511ba	[Pipeliner] Ignore Artificial dependences while computing recurrences. The artificial dependencies are not real dependencies. In some cases, they form circuits with bigger MII. However, they are used to schedule instructions better. Differential Revision: https://reviews.llvm.org/D53450 llvm-svn: 345319	2018-10-25 21:27:08 +00:00
Sumanth Gundapaneni	dfdbc716e4	[Pipeliner] Remove the unneeded include header(NFC). Differential Revision: https://reviews.llvm.org/D53451 llvm-svn: 345318	2018-10-25 21:25:30 +00:00
Cameron McInally	384a74b0e6	[FPEnv] Last BinaryOperator::isFNeg(...) to m_FNeg(...) changes Replacing BinaryOperator::isFNeg(...) to avoid regressions when we separate FNeg from the FSub IR instruction. Differential Revision: https://reviews.llvm.org/D53650 llvm-svn: 345295	2018-10-25 18:09:33 +00:00
Volkan Keles	60c6affcb0	[GlobalISel] LegalizerHelper: Fix the incorrect alignment when splitting loads/stores in narrowScalar Reviewers: dsanders, bogner, jpaquette, aemerson, ab, paquette Reviewed By: dsanders Subscribers: rovka, kristof.beyls, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D53664 llvm-svn: 345292	2018-10-25 17:52:19 +00:00
Simon Pilgrim	f02c0f8af6	[LegalizeDAG] Remove dead SINT_TO_FP legalization code As noticed on D52965, the SINT_TO_FP i64 to f32 legalization code has been dead for years - protected by an assert. Differential Revision: https://reviews.llvm.org/D53703 llvm-svn: 345290	2018-10-25 17:43:36 +00:00
Volkan Keles	f87473fe1c	[GISel] LegalizerInfo: Rename MemDesc::Size to SizeInBits to make the value clearer Requested in D53679. llvm-svn: 345288	2018-10-25 17:37:07 +00:00
Amara Emerson	cbd86d8429	[GlobalISel] Use the target preferred type for G_EXTRACT_VECTOR_ELT index. Allows for better imported pattern re-use. llvm-svn: 345265	2018-10-25 14:04:54 +00:00
Simon Pilgrim	49d79a864c	Missing semicolon. llvm-svn: 345257	2018-10-25 11:38:17 +00:00
Simon Pilgrim	838eb24014	[TargetLowering] Improve vXi64 UINT_TO_FP vXf64 support (P38226) As suggested on D52965, this patch moves the i64 to f64 UINT_TO_FP expansion code from LegalizeDAG into TargetLowering and makes it available to LegalizeVectorOps as well. Not only does this help perform X86 lowering as a true vectorization instead of (partially vectorized) scalar conversions, it avoids the HADDPD op from the scalar code which can be slow on most targets. The AVX512F does have the vcvtusi2sdq scalar operation but we don't unroll to use it as it seems to only help for the v2f64 case - otherwise the unrolling cost will certainly be too high. My feeling is that we should leave it to the vectorizers - and if it generates the vector UINT_TO_FP we should use it. Differential Revision: https://reviews.llvm.org/D53649 llvm-svn: 345256	2018-10-25 11:15:57 +00:00
David Blaikie	60fddac907	DebugInfo: Reuse common addresses for rnglist base address selections This makes the offsets larger (since they are further from the base address) but those are in the .dwo - and allows removing addresses and relocations from the .o file. This could be built into the AddressPool more fundamentally, perhaps - when you ask for an AddressPool entry you could say "or give me some other entry and an offset I need to use" - though what to do about situations where the first use of an address in a section is not the earliest address in that section... is tricky. At least with range addresses we can be fairly sure we've seen the earliest address first because we see the start address for the function. llvm-svn: 345224	2018-10-24 23:36:29 +00:00
Thomas Lively	30f1d69115	[NFC] Rename minnan and maxnan to minimum and maximum Summary: Changes all uses of minnan/maxnan to minimum/maximum globally. These names emphasize that the semantic difference between these operations is more than just NaN-propagation. Reviewers: arsenm, aheejin, dschuff, javed.absar Subscribers: jholewinski, sdardis, wdng, sbc100, jgravelle-google, jrtc27, atanasyan, llvm-commits Differential Revision: https://reviews.llvm.org/D53112 llvm-svn: 345218	2018-10-24 22:49:55 +00:00
Thomas Lively	43bc46207a	[SelectionDAG] DAG combiner for fminnan and fmaxnan Summary: Depends on D52765. Reviewers: aheejin, dschuff Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52768 llvm-svn: 345210	2018-10-24 22:18:54 +00:00
Tim Northover	05fe8f918b	[DAG] check more operands for cycles when merging stores. Until now, we've only checked whether merging stores would cause a cycle via the value argument, but the address and indexed offset arguments are also capable of creating cycles in some situations. The addresses are all base+offset with notionally the same base, but the base SDNode may still be different (e.g. via an indexed load in one case, and an ISD::ADD elsewhere). This allows cycles to creep in if one of these sources depends on another. The indexed offset is usually undef (representing a non-indexed store), but on some architectures (e.g. 32-bit ARM-mode ARM) it can be an arbitrary value, again allowing dependency cycles to creep in. llvm-svn: 345200	2018-10-24 21:36:34 +00:00
Sanjin Sijaric	625d08eea1	[MIR] Add hasWinCFI field Adding hasWinCFI field so that I can add MIR test cases to https://reviews.llvm.org/D50166. Differential Revision: https://reviews.llvm.org/D51201 llvm-svn: 345196	2018-10-24 21:07:38 +00:00
Reid Kleckner	953bdce68d	[MC] Separate masm integer literal lexer support from inline asm Summary: This renames the IsParsingMSInlineAsm member variable of AsmLexer to LexMasmIntegers and moves it up to MCAsmLexer. This is the only behavior controlled by that variable. I added a public setter, so that it can be set from outside or from the llvm-mc command line. We may need to arrange things so that users can get this behavior from clang, but that's future work. I also put additional hex literal lexing functionality under this flag to fix PR32973. It appears that this hex literal parsing wasn't intended to be enabled in non-masm-style blocks. Now, masm integers (0b1101 and 0ABCh) work in __asm blocks from clang, but 0b label references work when using .intel_syntax in standalone .s files. However, 0b label references will not work from __asm blocks in clang. They will work from GCC inline asm blocks, which it sounds like is important for Crypto++ as mentioned in PR36144. Essentially, we only lex masm literals for inline asm blobs that use intel syntax. If the .intel_syntax directive is used inside a gnu-style inline asm statement, masm literals will not be lexed, which is compatible with gas and llvm-mc standalone .s assembly. This fixes PR36144 and PR32973. Reviewers: Gerolf, avt77 Subscribers: eraman, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D53535 llvm-svn: 345189	2018-10-24 20:23:57 +00:00
Simon Pilgrim	6f53b38fd4	[TargetLowering] Add SimplifyDemandedBitsForTargetNode callback Add a SimplifyDemandedBitsForTargetNode callback to handle target nodes. Differential Revision: https://reviews.llvm.org/D53643 llvm-svn: 345179	2018-10-24 19:00:56 +00:00
Robert Lougher	18bfb3a5ec	[CodeGen] skip lifetime end marker in isInTailCallPosition A lifetime end intrinsic between a tail call and the return should not prevent the call from being tail call optimized. Differential Revision: https://reviews.llvm.org/D53519 llvm-svn: 345163	2018-10-24 17:03:19 +00:00
Simon Pilgrim	c8c7451063	[LegalizeDAG] ExpandLegalINT_TO_FP - cleanup UINT_TO_FP i64 -> f32 expansion. Use SrcVT/DestVT types and correct shift type. Part of prep work for D52965 llvm-svn: 345158	2018-10-24 16:35:01 +00:00
Alexey Bataev	c15c853c3a	[DEBUGINFO, NVPTX] Try to pack bytes data into a single string. Summary: If the target does not support `.asciz` and `.ascii` directives, the strings are represented as bytes and each byte is placed on the new line as a separate byte directive `.b8 <data>`. NVPTX target allows to represent the vector of the data of the same type as a vector, where values are separated using `,` symbol: `.b8 <data1>,<data2>,...`. This allows to reduce the size of the final PTX file. Ptxas tool includes ptx files into the resulting binary object, so reducing the size of the PTX file is important. Reviewers: tra, jlebar, echristo Subscribers: jholewinski, llvm-commits Differential Revision: https://reviews.llvm.org/D45822 llvm-svn: 345142	2018-10-24 14:04:00 +00:00
Matthias Braun	4f82406c46	SelectionDAG: Reuse bigger sized constants in memset expansion. When implementing memset's today we often see this pattern: $x0 = MOV 0xXYXYXYXYXYXYXYXY store $x0, ... $w1 = MOV 0xXYXYXYXY store $w1, ... We first create a 64bit constant in a 64bit register with all bytes the same and then create a 32bit constant with all bytes the same in a 32bit register. In many targets we could just access the lower byte of the 64bit register instead. - Ideally this would be handled by the ConstantHoist pass but it runs too early when memset isn't expanded yet. - The memset expansion code already had this optimization implemented, however SelectionDAG constantfolding would constantfold the "trunc(bigconstnat)" pattern to "smallconstant". - This patch makes the memset expansion mark the constant as Opaque and stop DAGCombiner from constant folding in this situation. (Similar to how ConstantHoisting marks things as Opaque to avoid folding ADD/SUB/etc.) Differential Revision: https://reviews.llvm.org/D53181 llvm-svn: 345102	2018-10-23 23:19:23 +00:00
Matt Arsenault	9ef8e51cec	Fix typo in verifier error message llvm-svn: 345083	2018-10-23 21:23:52 +00:00
Peter Collingbourne	abd820a92b	CGP: Clear data structures at the end of a loop iteration instead of the beginning. Clearing LargeOffsetGEPMap at the end fixes a bug where if a large offset GEP is in a dead basic block, we fail an assertion when trying to delete the block due to the asserting VH in LargeOffsetGEPMap. Differential Revision: https://reviews.llvm.org/D53464 llvm-svn: 345082	2018-10-23 21:23:18 +00:00
Simon Pilgrim	8c4796deb4	[LegalizeDAG] Share Vector/Scalar CTPOP Expansion As suggested on D53258, this patch move the CTPOP expansion code from SelectionDAGLegalize to TargetLowering to allow it to be reused by the VectorLegalizer. Proper vector support will be added by D53258. llvm-svn: 345066	2018-10-23 18:28:24 +00:00
Simon Pilgrim	d705ba97dd	[LegalizeDAG] Share Vector/Scalar CTLZ Expansion As suggested on D53258, this patch shares common CTLZ expansion code between VectorLegalizer and SelectionDAGLegalize by putting it in TargetLowering. Extension to D53474 llvm-svn: 345060	2018-10-23 17:48:30 +00:00
Sanjay Patel	ad12df829c	[SelectionDAG] use 'match' to simplify code; NFC Vector types are not possible here because this code only starts matching from the scalar bool value of a conditional branch, but this is another step towards completely removing the fake binop queries for not/neg/fneg. llvm-svn: 345041	2018-10-23 15:46:10 +00:00
Benjamin Kramer	1e212e8abb	[LegalizeDAG] Remove unused variable llvm-svn: 345040	2018-10-23 15:43:36 +00:00
Simon Pilgrim	b975ff4700	[LegalizeDAG] Share Vector/Scalar CTTZ Expansion As suggested on D53258, this patch demonstrates sharing common CTTZ expansion code between VectorLegalizer and SelectionDAGLegalize by putting it in TargetLowering. I intend to move CTLZ and (scalar) CTPOP over as well and then update D53258 accordingly. Differential Revision: https://reviews.llvm.org/D53474 llvm-svn: 345039	2018-10-23 15:37:19 +00:00
Aleksandr Urakov	00d4c38668	Revert "[MachinePipeliner] Split MachinePipeliner code into header and cpp files" This reverts commit 40760b733d9eef841c897338af5e9d81b12551bf. It seems that the commit is a cuse of the build failure. llvm-svn: 345032	2018-10-23 14:27:45 +00:00
Lama Saba	7d9b3a682e	[MachinePipeliner] Split MachinePipeliner code into header and cpp files Split MachinePipeliner code into header and cpp files to allow inheritance from SwingSchedulerDAG Differential Revision: https://reviews.llvm.org/D53477 llvm-svn: 345008	2018-10-23 07:58:41 +00:00
Leonard Chan	0acfc6be38	[Intrinsic] Unigned Saturation Addition Intrinsic Add an intrinsic that takes 2 integers and perform unsigned saturation addition on them. This is a part of implementing fixed point arithmetic in clang where some of the more complex operations will be implemented as intrinsics. Differential Revision: https://reviews.llvm.org/D53340 llvm-svn: 344971	2018-10-22 23:08:40 +00:00
Craig Topper	c8e183f9ee	Recommit r344877 "[X86] Stop promoting integer loads to vXi64" I've included a fix to DAGCombiner::ForwardStoreValueToDirectLoad that I believe will prevent the previous miscompile. Original commit message: Theoretically this was done to simplify the amount of isel patterns that were needed. But it also meant a substantial number of our isel patterns have to match an explicit bitcast. By making the vXi32/vXi16/vXi8 types legal for loads, DAG combiner should be able to change the load type to rem I had to add some additional plain load instruction patterns and a few other special cases, but overall the isel table has reduced in size by ~12000 bytes. So it looks like this promotion was hurting us more than helping. I still have one crash in vector-trunc.ll that I'm hoping @RKSimon can help with. It seems to relate to using getTargetConstantFromNode on a load that was shrunk due to an extract_subvector combine after the constant pool entry was created. So we end up decoding more mask elements than the lo I'm hoping this patch will simplify the number of patterns needed to remove the and/or/xor promotion. Reviewers: RKSimon, spatel Reviewed By: RKSimon Subscribers: llvm-commits, RKSimon Differential Revision: https://reviews.llvm.org/D53306 llvm-svn: 344965	2018-10-22 22:14:05 +00:00
Vedant Kumar	74533bd3b8	[DWARF] Use a function-local offset for AT_call_return_pc Logs provided by @stella.stamenova indicate that on Linux, lldb adds a spurious slide offset to the return PC it loads from AT_call_return_pc attributes (see the list thread: "[PATCH] D50478: Add support for artificial tail call frames"). This patch side-steps the issue by getting rid of the load address calculation in lldb's CallEdge::GetReturnPCAddress. The idea is to have the DWARF writer emit function-local offsets to the instruction after a call. I.e. return-pc = label-after-call-insn - function-entry. LLDB can simply add this offset to the base address of a function to get the return PC. Differential Revision: https://reviews.llvm.org/D53469 llvm-svn: 344960	2018-10-22 21:44:21 +00:00
Justin Bogner	912adfba7e	Reapply "[MachineCopyPropagation] Reimplement CopyTracker in terms of register units" Recommits r342942, which was reverted in r343189, with a fix for an issue where we would propagate unsafely if we defined only the upper part of a register. Original message: Change the copy tracker to keep a single map of register units instead of 3 maps of registers. This gives a very significant compile time performance improvement to the pass. I measured a 30-40% decrease in time spent in MCP on x86 and AArch64 and much more significant improvements on out of tree targets with more registers. llvm-svn: 344942	2018-10-22 19:51:31 +00:00
Matt Arsenault	687ec75d10	DAG: Change behavior of fminnum/fmaxnum nodes Introduce new versions that follow the IEEE semantics to help with legalization that may need quieted inputs. There are some regressions from inserting unnecessary canonicalizes when these are matched from fast math fcmp + select which should be fixed in a future commit. llvm-svn: 344914	2018-10-22 16:27:27 +00:00
Sanjay Patel	e439cc2745	[DAGCombiner] reduce insert+bitcast+extract vector ops to truncate (PR39016) This is a late backend subset of the IR transform added with: D52439 We can confirm that the conversion to a 'trunc' is correct by running: $ opt -instcombine -data-layout="e" (assuming the IR transforms are correct; change "e" to "E" for big-endian) As discussed in PR39016: https://bugs.llvm.org/show_bug.cgi?id=39016 ...the pattern may emerge during legalization, so that's we are waiting for an insertelement to become a scalar_to_vector in the pattern matching here. The DAG allows for fun variations that are not possible in IR. Result types for extracts and scalar_to_vector don't necessarily match input types, so that means we have to be a bit more careful in the transform (see code comments). The tests show that we don't handle cases that require a shift (as we did in the IR version). I've left that as a potential follow-up because I'm not sure if that's a real concern at this late stage. Differential Revision: https://reviews.llvm.org/D53201 llvm-svn: 344872	2018-10-21 20:13:29 +00:00
David Blaikie	14cfa0dcdc	DebugInfo: Use base address specifiers more aggressively Using a base address specifier even for a single-element range is a size win for object files (7 words versus 8 words - more significant savings if the debug info is compressed (since it's 3 words of uncompressable reloc + 4 compressable words compared to 6 uncompressable reloc + 2 compressable words) - does trade off executable size increase though. llvm-svn: 344841	2018-10-20 09:16:49 +00:00
David Blaikie	2df23a4e2e	DebugInfo: Use DW_OP_addrx in DWARFv5 Reuse addresses in the address pool, even in non-split cases. llvm-svn: 344838	2018-10-20 08:54:05 +00:00
David Blaikie	32e09de91c	DebugInfo: Implement debug_rnglists.dwo Save space/relocations in .o files by keeping dwo ranges in the dwo file rather than the .o file. llvm-svn: 344837	2018-10-20 08:12:36 +00:00
David Blaikie	c4af8bf29f	DebugInfo: Use address pool forms in debug_rnglists Save no relocations by reusing addresses from the address pool. llvm-svn: 344836	2018-10-20 07:36:39 +00:00
David Blaikie	161dd3c186	DebugInfo: Use debug_addr for non-dwo addresses in DWARF 5 Putting addresses in the address pool, even with non-fission, can reduce relocations - reusing the addresses from debug_info and debug_rnglists (the latter coming soon) llvm-svn: 344834	2018-10-20 06:02:15 +00:00
Roman Tereshin	8d6ff4c0af	[MachineCSE][GlobalISel] Making sure MachineCSE works mid-GlobalISel (again) Change of approach, it looks like it's a much better idea to deal with the vregs that have LLTs and reg classes both properly, than trying to avoid creating those across all GlobalISel passes and all targets. The change mostly touches MachineRegisterInfo::constrainRegClass, which is apparently only used by MachineCSE. The changes are NFC for any pipeline but one that contains MachineCSE mid-GlobalISel. NOTE on isCallerPreservedOrConstPhysReg change in MachineCSE: There is no test covering it as the only way to insert a new pass (MachineCSE) from a command line I know of is llc's -run-pass option, which only works with MIR, but MIRParser freezes reserved registers upon MachineFunctions creation, making it impossible to reproduce the state that exposes the issue. Reviwed By: aditya_nandakumar Differential Revision: https://reviews.llvm.org/D53144 llvm-svn: 344822	2018-10-20 00:06:15 +00:00
Aditya Nandakumar	cd04e366d7	[GISel]: Allow PHIs to be DCEd https://reviews.llvm.org/D53304 Currently dead phis are not cleaned up during DCE. This patch allows dead PHI and G_PHI insts to be deleted. Reviewed by: dsanders llvm-svn: 344811	2018-10-19 20:11:52 +00:00
Krzysztof Pszeniczny	2bfe759a8d	Fix a use-after-RAUW bug in large GEP splitting Summary: Large GEP splitting, introduced in rL332015, uses a `DenseMap<AssertingVH<Value>, ...>`. This causes an assertion to fail (in debug builds) or undefined behaviour to occur (in release builds) when a value is RAUWed. This manifested itself in the 7zip benchmark from the llvm test suite built on ARM with `-fstrict-vtable-pointers` enabled while RAUWing invariant group launders and splits in CodeGenPrepare. This patch merges the large offsets of the argument and the result of an invariant.group strip/launder intrinsic before RAUWing. Reviewers: Prazek, javed.absar, haicheng, efriedma Reviewed By: Prazek, efriedma Subscribers: kristof.beyls, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D51936 llvm-svn: 344802	2018-10-19 19:02:16 +00:00
Fangrui Song	2e83b2e9ee	Use llvm::{all,any,none}_of instead std::{all,any,none}_of. NFC llvm-svn: 344774	2018-10-19 06:12:02 +00:00
Sumanth Gundapaneni	62ac69d45c	[Pipeliner] copyToPhi DAG Mutation to improve scheduling. In a loop, create artificial dependences between the source of a COPY/REG_SEQUENCE to the use in next iteration. Eg: SRC ----Data Dep--> COPY COPY ---Anti Dep--> PHI (implies, to be used in next iteration) PHI ----Data Dep--> USE This patches creates USE ----Artificial Dep---> SRC This will effectively schedule the COPY late to eliminate additional copies. Before this patch, the schedule can be SRC, COPY, USE : The COPY is used in next iteration and it needs to be preserved. After this patch, the schedule can be USE, SRC, COPY : The COPY is used in next iteration and the live interval is reduced. Differential Revision: https://reviews.llvm.org/D53303 llvm-svn: 344748	2018-10-18 15:51:16 +00:00
Krasimir Georgiev	547d824da6	Revert "[WebAssembly] LSDA info generation" This reverts commit r344575. Newly introduced test eh-lsda.ll.test fails with use-after-free under ASAN build. llvm-svn: 344639	2018-10-16 18:50:09 +00:00
Leonard Chan	699b3b54da	[Intrinsic] Signed Saturation Addition Intrinsic Add an intrinsic that takes 2 integers and perform saturation addition on them. This is a part of implementing fixed point arithmetic in clang where some of the more complex operations will be implemented as intrinsics. Differential Revision: https://reviews.llvm.org/D53053 llvm-svn: 344629	2018-10-16 17:35:41 +00:00
Simon Pilgrim	ac58636ec5	[LegalizeDAG] ExpandLegalINT_TO_FP - cleanup UINT_TO_FP i64 -> f64 expansion. Use SrcVT/DestVT types, correct shift type and AND instead of ZERO_EXTEND_IN_REG. Part of prep work for D52965 llvm-svn: 344602	2018-10-16 10:06:15 +00:00
Heejin Ahn	0981eaab47	[WebAssembly] LSDA info generation Summary: This adds support for LSDA (exception table) generation for wasm EH. Wasm EH mostly follows the structure of Itanium-style exception tables, with one exception: a call site table entry in wasm EH corresponds to not a call site but a landing pad. In wasm EH, the VM is responsible for stack unwinding. After an exception occurs and the stack is unwound, the control flow is transferred to wasm 'catch' instruction by the VM, after which the personality function is called from the compiler-generated code. (Refer to WasmEHPrepare pass for more information on this part.) This patch: - Changes wasm.landingpad.index intrinsic to take a token argument, to make this 1:1 match with a catchpad instruction - Stores landingpad index info and catch type info MachineFunction in before instruction selection - Lowers wasm.lsda intrinsic to an MCSymbol pointing to the start of an exception table - Adds WasmException class with overridden methods for table generation - Adds support for LSDA section in Wasm object writer Reviewers: dschuff, sbc100, rnk Subscribers: mgorny, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D52748 llvm-svn: 344575	2018-10-16 00:09:12 +00:00
Sanjay Patel	4cf1da0e02	[SelectionDAG] allow FP binops in SimplifyDemandedVectorElts This is intended to make the backend on par with functionality that was added to the IR version of SimplifyDemandedVectorElts in: rL343727 ...and the original motivation is that we need to improve demanded-vector-elements in several ways to avoid problems that would be exposed in D51553. Differential Revision: https://reviews.llvm.org/D52912 llvm-svn: 344541	2018-10-15 18:05:34 +00:00
Sanjay Patel	8bd74785f0	[DAGCombiner] allow undef elts in vector fmul matching llvm-svn: 344534	2018-10-15 16:54:07 +00:00
Sanjay Patel	89e2197c33	[DAGCombiner] refactor folds for fadd (fmul X, -2.0), Y; NFCI The transform doesn't work if the vector constant has undef elements. llvm-svn: 344532	2018-10-15 16:47:01 +00:00
Sanjay Patel	9e7e0fd828	[DAGCombiner] allow undef elts in vector fma matching llvm-svn: 344528	2018-10-15 15:56:39 +00:00
Sanjay Patel	4e970ff022	[DAGCombiner] allow undef elts in vector fma matching llvm-svn: 344525	2018-10-15 15:38:38 +00:00
Chandler Carruth	edb12a838a	[TI removal] Make variables declared as `TerminatorInst` and initialized by `getTerminator()` calls instead be declared as `Instruction`. This is the biggest remaining chunk of the usage of `getTerminator()` that insists on the narrow type and so is an easy batch of updates. Several files saw more extensive updates where this would cascade to requiring API updates within the file to use `Instruction` instead of `TerminatorInst`. All of these were trivial in nature (pervasively using `Instruction` instead just worked). llvm-svn: 344502	2018-10-15 10:04:59 +00:00
Bjorn Pettersson	064944352e	[TwoAddressInstructionPass] Replace subregister uses when processing tied operands Summary: TwoAddressInstruction pass typically rewrites %1:short = foo %0.sub_lo:long as %1:short = COPY %0.sub_lo:long %1:short = foo %1:short when having tied operands. If there are extra un-tied operands that uses the same reg and subreg, such as the second and third inputs to fie here: %1:short = fie %0.sub_lo:long, %0.sub_hi:long, %0.sub_lo:long then there was a bug which replaced the register %0 also for the un-tied operand, but without changing the subregister indices. So we used to get: %1:short = COPY %0.sub_lo:long %1:short = fie %1, %1.sub_hi:short, %1.sub_lo:short With this fix we instead get: %1:short = COPY %0.sub_lo:long %1:short = fie %1, %0.sub_hi:long, %1 Reviewers: arsenm, JesperAntonsson, kparzysz, MatzeB Reviewed By: MatzeB Subscribers: bjope, kparzysz, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D36224 llvm-svn: 344492	2018-10-15 08:36:03 +00:00
Simon Pilgrim	a0590a4f7a	[LegalizeDAG] Don't bother with final MUL+SRL stage for byte CTPOP. The final stage of CTPOP expansion (v = (v * 0x01010101...) >> (Len - 8)) is completely pointless for the byte (Len = 8) case as it reduces to (v = (v * 0x01...) >> 0), but annoyingly this doesn't always get optimized away. Found while investigating generic vector CTPOP expansion (PR32655). llvm-svn: 344477	2018-10-14 15:56:28 +00:00
Simon Pilgrim	28a143f738	Pull out repeated variables from SelectionDAGLegalize::ExpandBitCount. The CTPOP case has been changed from VT.getSizeInBits to VT.getScalarSizeInBits - but this fits in with future work for vector support (PR32655) and doesn't affect any current (scalar) uses. llvm-svn: 344461	2018-10-13 18:40:48 +00:00
Craig Topper	189e5b4ab6	[LegalizeTypes] Prevent an assertion from PromoteIntRes_BSWAP and PromoteIntRes_BITREVERSE if the shift amount is too large for the VT returned by getShiftAmountTy Summary: getShiftAmountTy for X86 returns MVT::i8. If a BSWAP or BITREVERSE is created that requires promotion and the difference between the original VT and the promoted VT is more than 255 then we won't able to create the constant. This patch adds a check to replace the result from getShiftAmountTy to MVT::i32 if the difference won't fit. This should get legalized later when the shift is ultimately expanded since its clearly an illegal type that we're only promoting to make it a power of 2 bit width. Alternatively we could base the decision completely on the largest shift amount the promoted VT could use. Vectors should be immune here because getShiftAmountTy always returns the incoming VT for vectors. Only the scalar shift amount can be changed by the targets. Reviewers: eli.friedman, RKSimon, spatel Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D53232 llvm-svn: 344460	2018-10-13 17:47:20 +00:00
Simon Pilgrim	c5d7c6e5f6	[X86][SSE] Remove most of vector CTTZ custom lowering and use LegalizeDAG instead. There is one remnant - AVX1 custom splitting of 256-bit vectors - which is due to a regression where the X86ISD::ANDNP is still performed as a YMM. I've also tightened the CTLZ or CTPOP lowering in SelectionDAGLegalize::ExpandBitCount to require a legal CTLZ - it doesn't affect existing users and fixes an issue with AVX512 codegen. llvm-svn: 344457	2018-10-13 16:11:15 +00:00
Simon Pilgrim	1c2051ead7	[X86][SSE] Begin removing vector CTTZ custom lowering and use LegalizeDAG instead. Adds CTTZ vector legalization support and begins the removal of the X86/SSE custom lowering. llvm-svn: 344453	2018-10-13 15:16:55 +00:00
Thomas Lively	16c349d892	[Intrinsic] Add llvm.minimum and llvm.maximum instrinsic functions Summary: These new intrinsics have the semantics of the `minimum` and `maximum` operations specified by the latest draft of IEEE 754-2018. Unlike llvm.minnum and llvm.maxnum, these new intrinsics propagate NaNs and always treat -0.0 as less than 0.0. `minimum` and `maximum` lower directly to the existing `fminnan` and `fmaxnan` ISel DAG nodes. It is safe to reuse these DAG nodes because before this patch were only emitted in situations where there were known to be no NaN arguments or where NaN propagation was correct and there were known to be no zero arguments. I know of only four backends that lower fminnan and fmaxnan: WebAssembly, ARM, AArch64, and SystemZ, and each of these lowers fminnan and fmaxnan to instructions that are compatible with the IEEE 754-2018 semantics. Reviewers: aheejin, dschuff, sunfish, javed.absar Subscribers: kristof.beyls, dexonsmith, kristina, llvm-commits Differential Revision: https://reviews.llvm.org/D52764 llvm-svn: 344437	2018-10-13 07:21:44 +00:00
Craig Topper	a796580903	[LegalizeVectorTypes] Use TLI.getVectorIdxTy instead of DAG.getIntPtrConstant. There's no guarantee that vector indices should use pointer types. So use the correct query method. llvm-svn: 344428	2018-10-12 22:55:17 +00:00
Craig Topper	435e38a5df	[LegalizeVectorTypes] When widening the result of a bitcast from a scalar type, use a scalar_to_vector to turn the scalar into a vector intead of a build vector full of mostly undefs. This is more consistent with what we usually do and matches some code X86 custom emits in some cases that I think I can cleanup. The MIPS test change just looks to be an instruction ordering change. llvm-svn: 344422	2018-10-12 21:59:55 +00:00
Eli Friedman	a6e3a823b3	Revert BTF commit series. The initial patch was not reviewed, and does not have any tests; it should not have been merged. This reverts 344395, 344390, 344387, 344385, 344381, 344376, and 344366. llvm-svn: 344405	2018-10-12 19:41:05 +00:00
Craig Topper	1bb0c6041a	[LegalizeVectorTypes] When widening the operands to a concat_vectors, see if we can use the widened operand 0 if the width matches and the other operands are undef. This saves a conversion to extracts and build_vector. We already do this when both the result and the input need to be widened to the same type. This changed the sse-intrinsics-fast-isel test because we don't lower (insert_vector_elt (scalar_to_vector X), Y, 1) well. We turn it into (vector_shuffle (scalar_to_vector X), (scalar_to_vector Y), <0, 4, 2, 3>) losing track of the fact that the upper elts could be undef. We should probably find a way to prevent the scalarization of the <2 x f32> load on these tests. llvm-svn: 344404	2018-10-12 19:37:49 +00:00
Craig Topper	05f014a684	[LegalizeVectorTypes] When unrolling in WidenVecRes_Convert, make sure we use the original vector element count. Not min of the widened result type and the possibly widened input type. If the input type is widened as well, but we still were forced to unroll, we shouldn't be considering the widened input element count. We should only create as many scalar operations as the original type called for. This will be important for an upcoming patch. llvm-svn: 344403	2018-10-12 19:37:47 +00:00
Rui Ueyama	0f3a56c850	Replace assert() with llvm_unreachable because it's obviously a typo. llvm-svn: 344395	2018-10-12 18:29:30 +00:00
Reid Kleckner	810687cb57	[codeview] Emit S_BUILDINFO and LF_BUILDINFO with cwd and source file Summary: We can fill in the command line and compiler path later if we want. Reviewers: zturner Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D53179 llvm-svn: 344393	2018-10-12 18:19:06 +00:00
Fangrui Song	d15d602654	[BPF] Use cstdint {,u}int*_t instead of linux/types.h __u32 __u16 ... llvm-svn: 344387	2018-10-12 17:57:07 +00:00
Eric Liu	0916efc232	Disambiguate: s/make_unique/llvm::make_unique/. NFC llvm-svn: 344385	2018-10-12 17:55:21 +00:00
Fangrui Song	19b8fa5c5a	[BPF] Don't include linux/types.h and fix style llvm-svn: 344381	2018-10-12 17:41:12 +00:00
Zachary Turner	5bba1cafbe	Better support for POSIX paths in PDBs. This a resubmission of a patch which was previously reverted due to breaking several lld tests. The issues causing those failures have been fixed, so the patch is now resubmitted. ---Original Commit Message--- While it doesn't make a ton of sense for POSIX paths to be in PDBs, it's possible to occur in real scenarios involving cross compilation. The tools need to be able to handle this, because certain types of debugging scenarios are possible without a running process and so don't necessarily require you to be on a Windows system. These include post-mortem debugging and binary forensics (e.g. using a debugger to disassemble functions and examine symbols without running the process). There's changes in clang, LLD, and lldb in this patch. After this the cross-platform disassembly and source-list tests pass on Linux. Furthermore, the behavior of LLD can now be summarized by a much simpler rule than before: Unless you specify /pdbsourcepath and /pdbaltpath, the PDB ends up with paths that are valid within the context of the machine that the link is performed on. Differential Revision: https://reviews.llvm.org/D53149 llvm-svn: 344377	2018-10-12 17:26:19 +00:00
Fangrui Song	2c31e12d62	[BPF] Some fixes after rL344366 * Move #include outside of namespaces * Add missing #include * Add out-of-line virtual destructor to BTFTypeEntry designated initializers should also be fixed llvm-svn: 344376	2018-10-12 17:23:25 +00:00
Yonghong Song	6c2327a09e	[BPF] Add BTF generation for BPF target BTF is the debug format for BPF, a kernel virtual machine and widely used for tracing, networking and security, etc ([1]). Currently only instruction streams are passed to kernel, the kernel verifier verifies them before execution. In order to provide better visibility of bpf programs to user space tools, some debug information, e.g., function names and debug line information are desirable for kernel so tools can get such information with better annotation for jited instructions for performance or other reasons. The dwarf is too complicated in kernel and for BPF. Hence, BTF is designed to be the debug format for BPF ([2]). Right now, pahole supports BTF for types, which are generated based on dwarf sections in the ELF file. In order to annotate performance metrics for jited bpf insns, it is necessary to pass debug line info to the kernel. Furthermore, we want to pass the actual code to the kernel because of the following reasons: . bpf program typically is small so storage overhead should be small. . in bpf land, it is totally possible that an application loads the bpf program into the kernel and then that application quits, so holding debug info by the user space application is not practical. . having source codes directly kept by kernel would ease deployment since the original source code does not need ship on every hosts and kernel-devel package does not need to be deployed even if kernel headers are used. The only reliable time to get the source code is during compilation time. This will result in both more accurate information and easier deployment as stated in the above. Another consideration is for JIT. The project like bcc use MCJIT to compile a C program into bpf insns and load them to the kernel ([3]). The generated BTF sections will be readily available for such cases as well. This patch implemented generation of BTF info in llvm compiler. The BTF related sections will be generated when both -target bpf and -g are specified. Two sections are generated: .BTF contains all the type and string information, and .BTF.ext contains the func_info and line_info. The separation is related to how two sections are used differently in bpf loader, e.g., linux libbpf ([4]). The .BTF section can be loaded into the kernel directly while .BTF.ext needs loader manipulation before loading to the kernel. The format of the each section is roughly defined in llvm:include/llvm/MC/MCBTFContext.h and from the implementation in llvm:lib/MC/MCBTFContext.cpp. A later example also shows the contents in each section. The type and func_info are gathered during CodeGen/AsmPrinter by traversing dwarf debug_info. The line_info is gathered in MCObjectStreamer before writing to the object file. After all the information is gathered, the two sections are emitted in MCObjectStreamer::finishImpl. With cmake CMAKE_BUILD_TYPE=Debug, the compiler can dump out all the tables except insn offset, which will be resolved later as relocation records. The debug type "btf" is used for BTFContext dump. Dwarf tests the debug info generation with llvm-dwarfdump to decode the binary sections and check whether the result is expected. Currently we do not have such a tool yet. We will implement btf dump functionality in bpftool ([5]) as the bpftool is considered the recommended tool for bpf introspection. The implementation for type and func_info is tested with linux kernel test cases. The line_info is visually checked with dump from linux kernel libbpf ([4]) and checked with readelf dumping section raw data. Note that the .BTF and .BTF.ext information will not be emitted to assembly code and there is no assembler support for BTF either. In the below, with a clang/llvm built with CMAKE_BUILD_TYPE=Debug, Each table contents are shown for a simple C program. -bash-4.2$ cat -n test.c 1 struct A { 2 int a; 3 char b; 4 }; 5 6 int test(struct A t) { 7 return t->a; 8 } -bash-4.2$ clang -O2 -target bpf -g -mllvm -debug-only=btf -c test.c Type Table: [1] FUNC name_off=1 info=0x0c000001 size/type=2 param_type=3 [2] INT name_off=12 info=0x01000000 size/type=4 desc=0x01000020 [3] PTR name_off=0 info=0x02000000 size/type=4 [4] STRUCT name_off=16 info=0x04000002 size/type=8 name_off=18 type=2 bit_offset=0 name_off=20 type=5 bit_offset=32 [5] INT name_off=22 info=0x01000000 size/type=1 desc=0x02000008 String Table: 0 : 1 : test 6 : .text 12 : int 16 : A 18 : a 20 : b 22 : char 27 : test.c 34 : int test(struct A t) { 58 : return t->a; FuncInfo Table: sec_name_off=6 insn_offset=<Omitted> type_id=1 LineInfo Table: sec_name_off=6 insn_offset=<Omitted> file_name_off=27 line_off=34 line_num=6 column_num=0 insn_offset=<Omitted> file_name_off=27 line_off=58 line_num=7 column_num=3 -bash-4.2$ readelf -S test.o ...... [12] .BTF PROGBITS 0000000000000000 0000028d 00000000000000c1 0000000000000000 0 0 1 [13] .BTF.ext PROGBITS 0000000000000000 0000034e 0000000000000050 0000000000000000 0 0 1 [14] .rel.BTF.ext REL 0000000000000000 00000648 0000000000000030 0000000000000010 16 13 8 ...... -bash-4.2$ The latest linux kernel ([6]) can already support .BTF with type information. The [7] has the reference implementation in linux kernel side to support .BTF.ext func_info. The .BTF.ext line_info support is not implemented yet. If you have difficulty accessing [6], you can manually do the following to access the code: git clone https://github.com/yonghong-song/bpf-next-linux.git cd bpf-next-linux git checkout btf The change will push to linux kernel soon once this patch is landed. References: [1]. https://www.kernel.org/doc/Documentation/networking/filter.txt [2]. https://lwn.net/Articles/750695/ [3]. https://github.com/iovisor/bcc [4]. https://github.com/torvalds/linux/tree/master/tools/lib/bpf [5]. https://github.com/torvalds/linux/tree/master/tools/bpf/bpftool [6]. https://github.com/torvalds/linux [7]. https://github.com/yonghong-song/bpf-next-linux/tree/btf Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Differential Revision: https://reviews.llvm.org/D52950 llvm-svn: 344366	2018-10-12 17:01:46 +00:00
Nick Desaulniers	47bab69a2e	[MC][ELF] fix newly added test Summary: Reland of - r344197 "[MC][ELF] compute entity size for explicit sections" - r344206 "[MC][ELF] Fix section_mergeable_size.ll" after being reverted in r344278 due to build breakages from not specifying a target triple. Move test from test/CodeGen/Generic/ to test/MC/ELF/. Add explicit target triple so we don't try to run this test on non ELF targets. Reported: https://reviews.llvm.org/D53056#1261707 Reviewers: fhahn, rnk, espindola, NoQ Reviewed By: fhahn, rnk Subscribers: NoQ, MaskRay, rengolin, emaste, arichardson, llvm-commits, pirama, srhines Differential Revision: https://reviews.llvm.org/D53146 llvm-svn: 344360	2018-10-12 16:35:44 +00:00
Simon Pilgrim	c046b6856e	Pull out repeated value types. NFCI. llvm-svn: 344355	2018-10-12 15:49:19 +00:00
Simon Pilgrim	b926fd7b36	Pull out repeated value types. NFCI. llvm-svn: 344354	2018-10-12 15:48:47 +00:00
Simon Pilgrim	b8339c0167	[SelectionDAG] Move VectorLegalizer::ExpandCTLZ codegen into SelectionDAGLegalize Generalize SelectionDAGLegalize's CTLZ expansion to handle vectors - lets VectorLegalizer::ExpandCTLZ to just pass the expansion on instead of repeating the same codegen. llvm-svn: 344349	2018-10-12 14:45:57 +00:00
Sanjay Patel	56b6660d2e	[DAGCombiner] rearrange extract_element+bitcast fold; NFC I want to add another pattern here that includes scalar_to_vector, so this makes that patch smaller. I was hoping to remove the hasOneUse() check because it shouldn't be necessary for common codegen, but an AMDGPU test has a comment suggesting that the extra check makes things better on one of those targets. llvm-svn: 344320	2018-10-11 23:56:56 +00:00
Matthias Braun	c7efb6f990	Revert "DwarfDebug: Pick next location in case of missing location at block begin" It originally triggered a stepping problem in the debugger, which could be fixed by adjusting CodeGen/LexicalScopes.cpp however it seems we prefer the previous behavior anyway. See the discussion for details: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20181008/593833.html This reverts commit r343880. This reverts commit r343874. llvm-svn: 344318	2018-10-11 23:37:58 +00:00
Sumanth Gundapaneni	77418a3753	[Pipeliner] Use the Index from Topo instead of relying on NodeNum. (NFC) In future, if we may add any new DAG mutations other than artificial dependencies, the NodeNum may not be valid. Instead the index from topological schedule DAG can be used as long as we update it with the DAG change. Differential Revision: https://reviews.llvm.org/D53104 llvm-svn: 344283	2018-10-11 19:45:07 +00:00
Sumanth Gundapaneni	8916e438c4	[Pipeliner] Fix the Schedule DAG topoligical order. This patch updates the DAG change to reflect in the topological ordering of the nodes. Differential Revision: https://reviews.llvm.org/D53105 llvm-svn: 344282	2018-10-11 19:42:46 +00:00
Zachary Turner	e8a6c3eb96	Revert SymbolFileNativePDB plugin. This was originally causing some test failures on non-Windows platforms, which required fixes in the compiler and linker. After those fixes, however, other tests started failing. Reverting temporarily until I can address everything. llvm-svn: 344279	2018-10-11 18:45:44 +00:00
Artem Dergachev	2ce1d6faf8	Revert r344197 "[MC][ELF] compute entity size for explicit sections" Revert r344206 "[MC][ELF] Fix section_mergeable_size.ll" They were causing failures on too many important buildbots for too long. Please revert eagerly if your fix takes more than a couple of hours to land! llvm-svn: 344278	2018-10-11 18:43:08 +00:00
Nirav Dave	f1f2a2a31a	[DAG] Fix Big Endian in Load-Store forwarding Summary: Correct offset calculation in load-store forwarding for big-endian targets. Reviewers: rnk, RKSimon, waltl Subscribers: sdardis, nemanjai, hiraditya, jrtc27, atanasyan, jsji, llvm-commits Differential Revision: https://reviews.llvm.org/D53147 llvm-svn: 344272	2018-10-11 18:28:59 +00:00
Zachary Turner	e502f8b315	Better support for POSIX paths in PDBs. While it doesn't make a ton of sense for POSIX paths to be in PDBs, it's possible to occur in real scenarios involving cross compilation. The tools need to be able to handle this, because certain types of debugging scenarios are possible without a running process and so don't necessarily require you to be on a Windows system. These include post-mortem debugging and binary forensics (e.g. using a debugger to disassemble functions and examine symbols without running the process). There's changes in clang, LLD, and lldb in this patch. After this the cross-platform disassembly and source-list tests pass on Linux. Furthermore, the behavior of LLD can now be summarized by a much simpler rule than before: Unless you specify /pdbsourcepath and /pdbaltpath, the PDB ends up with paths that are valid within the context of the machine that the link is performed on. Differential Revision: https://reviews.llvm.org/D53149 llvm-svn: 344269	2018-10-11 18:01:55 +00:00
Sanjay Patel	4875662e57	[DAGCombiner] move comment closer to the corresponding code; NFC llvm-svn: 344255	2018-10-11 16:07:25 +00:00
Nick Desaulniers	335315697a	[MC][ELF] compute entity size for explicit sections Summary: Global variables might declare themselves to be in explicit sections. Calculate the entity size always to prevent assembler warnings "entity size for SHF_MERGE not specified" when sections are to be marked merge-able. Fixes PR31828. Reviewers: rnk, echristo Reviewed By: rnk Subscribers: llvm-commits, pirama, srhines Differential Revision: https://reviews.llvm.org/D53056 llvm-svn: 344197	2018-10-10 22:52:32 +00:00
George Burgess IV	6ef8002c2c	Replace most users of UnknownSize with LocationSize::unknown(); NFC Moving away from UnknownSize is part of the effort to migrate us to LocationSizes (e.g. the cleanup promised in D44748). This doesn't entirely remove all of the uses of UnknownSize; some uses require tweaks to assume that UnknownSize isn't just some kind of int. This patch is intended to just be a trivial replacement for all places where LocationSize::unknown() will Just Work. llvm-svn: 344186	2018-10-10 21:28:44 +00:00
Nirav Dave	07acc992dc	[DAGCombine] Improve Load-Store Forwarding Summary: Extend analysis forwarding loads from preceeding stores to work with extended loads and truncated stores to the same address so long as the load is fully subsumed by the store. Hexagon's swp-epilog-phis.ll and swp-memrefs-epilog1.ll test are deleted as they've no longer seem to be relevant. Reviewers: RKSimon, rnk, kparzysz, javed.absar Subscribers: sdardis, nemanjai, hiraditya, atanasyan, llvm-commits Differential Revision: https://reviews.llvm.org/D49200 llvm-svn: 344142	2018-10-10 14:15:52 +00:00
Simon Pilgrim	bc7b6251b6	[TargetLowering] SimplifyDemandedBits - rename demanded mask args. NFCI. Help stop bugs like rL343935 by making the 'original' DemandedBits arg more obviously not the mask that is actually used. llvm-svn: 344138	2018-10-10 13:00:49 +00:00
Simon Pilgrim	53a503f6ac	[TargetLowering] SimplifyDemandedBits - pull out repeated getOperands. NFCI. Part of a minor cleanup to make all the switch statements more consistent prior to improving vector support. llvm-svn: 344136	2018-10-10 12:32:13 +00:00
Simon Pilgrim	5cb3a82892	[TargetLowering] Add root node back to work list after successful SimplifyDemandedBits/SimplifyDemandedVectorElts Similar to what already happens in the DAGCombiner wrappers, this patch adds the root nodes back onto the worklist if the DCI wrappers' SimplifyDemandedBits/SimplifyDemandedVectorElts were successful. Differential Revision: https://reviews.llvm.org/D53026 llvm-svn: 344132	2018-10-10 10:44:15 +00:00
Nemanja Ivanovic	72d4866e57	[DAGCombiner] Expand combining of FP logical ops to sign-setting FP ops We already do the following combines: (bitcast int (and (bitcast fp X to int), 0x7fff...) to fp) -> fabs X (bitcast int (xor (bitcast fp X to int), 0x8000...) to fp) -> fneg X When the target has "bit preserving fp logic". This patch just extends it to also combine: (bitcast int (or (bitcast fp X to int), 0x8000...) to fp) -> fneg (fabs X) As some targets have fnabs and even those that don't can efficiently lower both the fabs and the fneg. Differential revision: https://reviews.llvm.org/D44548 llvm-svn: 344093	2018-10-09 23:20:11 +00:00
Simon Pilgrim	23f880317a	[SelectionDAG] Add SIGN_EXTEND_VECTOR_INREG and CONCAT_VECTORS support to SimplifyDemandedBits Fix for AVX1 masked load/store regression on D52964 llvm-svn: 344043	2018-10-09 13:13:35 +00:00
Matthias Braun	1c96c9687f	ExpandPostRAPseudos: Fix alldefsAreDead() not removing operands One case left around nonsensical operands for the KILL instruction which the machine verifier checks for nowadays. While this should not hurt in release builds we should fix the machine verifier errors anyway. llvm-svn: 344008	2018-10-09 00:07:34 +00:00
Matthias Braun	c9dac6b9b4	TwoAddressInstructionPass: Modernize/fix some comments; NFC llvm-svn: 344006	2018-10-08 23:47:35 +00:00
Matthias Braun	66badb38b2	PHIElimination: Remove wrong comment; NFC The comment was contradicting the code. Looking at history the feature was implemented a day after the comment was written without dropping the comment. llvm-svn: 344005	2018-10-08 23:47:35 +00:00
Matthias Braun	7d88f0e386	MachineFunctionPrinterPass: Declare SlotIndexes as used if available; NFC This makes print-machineinstrs print the slot indexes in more situations. NFC for normal compilation. llvm-svn: 344004	2018-10-08 23:47:34 +00:00
Sanjay Patel	b64c0d7b53	[DAGCombiner] simplify code for fmul with constant fold; NFCI llvm-svn: 343997	2018-10-08 21:17:20 +00:00
Alex Bradbury	f27c67af12	[SelectionDAGBuilder][NFC] Pass LHSTy to getShiftAmountTy rather than RHSTy r126518 introduced a a type parameter to the getShiftAmountTy target hook. It produces the type of the shift (RHSTy), parameterised by the type of the value being shifted (LHSTy). SelectionDAGBuilder::visitShift passed RHSTy rather than LHSTy and this patch corrects this. The change is a no-op because in LLVM IR the LHS and RHS types for a shift must be equal anyway. llvm-svn: 343955	2018-10-08 06:24:59 +00:00
Craig Topper	98dd9d6896	Revert r343948 "[LegalizeDAG] Make one of the ReplaceNode signatures take an ArrayRef instead a pointer to an array. Add assert on size of array. NFC" The assert is failing some asan tests on the bots. llvm-svn: 343950	2018-10-08 03:12:12 +00:00
Craig Topper	c058a68784	[LegalizeDAG] Make one of the ReplaceNode signatures take an ArrayRef instead a pointer to an array. Add assert on size of array. NFC llvm-svn: 343948	2018-10-08 02:02:08 +00:00
Craig Topper	cd38de8b15	[LegalizeDAG] Move legalization of scatter and masked store from LegalizeVectorOps to LegalizeDAG. This is where we legalize gather and masked load so this is consistent. Since these ops are always on vectors I've chosen to go with LegalizeDAG since that's what we do for other vector only ops like BUILD_VECTOR, VECTOR_SHUFFLE, etc. The ScalarizeMaskedMemIntrinsic pass should take care of scalarizing these before SelectionDAG so hopefully we don't need to worry about illegally typed scalar ops being emitted in the legalizing. If we did we would need to do this in LegalizeVectorOps so we could get the second type legalization that runs between LegalizeVectorOps and LegalizeDAG. llvm-svn: 343947	2018-10-08 00:04:55 +00:00
Sanjay Patel	ecc8af61e7	[DAGCombiner] allow undef elts in vector fadd matching llvm-svn: 343945	2018-10-07 16:30:42 +00:00
Sanjay Patel	ef76e27985	[DAGCombiner] allow undefs when matching vector splats for fmul folds llvm-svn: 343942	2018-10-07 16:05:37 +00:00
Sanjay Patel	0b74c840dd	[DAGCombiner] allow undef elts in vector fabs/fneg matching This change is proposed as a part of D44548, but we need this independently to avoid regressions from improved undef propagation in SimplifyDemandedVectorElts(). llvm-svn: 343940	2018-10-07 15:32:06 +00:00
Sanjay Patel	46a9dc2e3e	[DAGCombiner] shorten code for bitcast+fabs fold; NFC llvm-svn: 343939	2018-10-07 15:18:30 +00:00
Simon Pilgrim	3b04a4e322	[SelectionDAG] Respect multiple uses in SimplifyDemandedBits to SimplifyDemandedVectorElts simplification rL343913 was using SimplifyDemandedBits's original demanded mask instead of the adjusted 'NewMask' that accounts for multiple uses of the op (those variable names really need improving....). Annoyingly many of the test changes (back to pre-rL343913 state) are actually safe - but only because their multiple uses are all by PMULDQ/PMULUDQ. Thanks to Jan Vesely (@jvesely) for bisecting the bug. llvm-svn: 343935	2018-10-07 11:45:46 +00:00
Craig Topper	e4d199e360	[LegalizeVectorOps] Make ExpandStrictFPOp return the result corresponding to the result number of the SDValue passed in. It was always returning the chain which seems to be the result number of the SDValue in the lit tests we have. But I don't know if that's guaranteed. llvm-svn: 343933	2018-10-07 07:16:44 +00:00
Simon Pilgrim	9c9c97bcf4	[SelectionDAG] Add SimplifyDemandedBits to SimplifyDemandedVectorElts simplification This patch enables SimplifyDemandedBits to call SimplifyDemandedVectorElts in cases where the demanded bits mask covers entire elements of a bitcasted source vector. There are a couple of cases here where simplification at a deeper level (such as through bitcasts) prevents further simplification - CommitTargetLoweringOpt only adds immediate uses/users back to the worklist when we might want to combine the original caller again to see what else it can simplify. As well as that I had to disable handling of bool vector until SimplifyDemandedVectorElts better supports some of their opcodes (SETCC, shifts etc.). Fixes PR39178 Differential Revision: https://reviews.llvm.org/D52935 llvm-svn: 343913	2018-10-06 10:20:04 +00:00
Vedant Kumar	8c46668b6e	[LiveDebugValues] Extend var ranges through artificial blocks ASan often introduces basic blocks consisting exclusively of instructions without debug locations, or with line 0 debug locations. LiveDebugValues needs to extend variable ranges through these artificial blocks. Otherwise, a lot of variables disappear -- even at -O0. Typically, LiveDebugValues does not extend a variable's range into a block unless the block is essentially "part of" the variable's scope (for a precise definition, see LexicalScopes::dominates). This patch relaxes the lexical dominance check for artificial blocks. This makes the following Swift program debuggable at -O0: ``` 1\| var x = 100 2\| print("x = \(x)") ``` rdar://39127144 Differential Revision: https://reviews.llvm.org/D52921 llvm-svn: 343890	2018-10-05 21:44:15 +00:00
Vedant Kumar	9b558380dd	Clarify debug output in LiveDebugValues MachineBasicBlocks often do not have names, so it helps to refer to them by block number when printing debug messages. llvm-svn: 343889	2018-10-05 21:44:00 +00:00
Jessica Paquette	b328d95333	[GlobalIsel] Add llvm.invariant.start and llvm.invariant.end Port over the implementation in SelectionDAGBuilder.cpp into the IRTranslator and update the arm64-irtranslator test. These were causing fallbacks in CTMark/Bullet (-Rpass-missed=gisel-select), and this patch fixes that. https://reviews.llvm.org/D52945 llvm-svn: 343885	2018-10-05 21:02:46 +00:00
Vedant Kumar	5931b4e5b5	[DebugInfo] Add support for DWARF5 call site-related attributes DWARF v5 introduces DW_AT_call_all_calls, a subprogram attribute which indicates that all calls (both regular and tail) within the subprogram have call site entries. The information within these call site entries can be used by a debugger to populate backtraces with synthetic tail call frames. Tail calling frames go missing in backtraces because the frame of the caller is reused by the callee. Call site entries allow a debugger to reconstruct a sequence of (tail) calls which led from one function to another. This improves backtrace quality. There are limitations: tail recursion isn't handled, variables within synthetic frames may not survive to be inspected, etc. This approach is not novel, see: https://gcc.gnu.org/wiki/summit2010?action=AttachFile&do=get&target=jelinek.pdf This patch adds an IR-level flag (DIFlagAllCallsDescribed) which lowers to DW_AT_call_all_calls. It adds the minimal amount of DWARF generation support needed to emit standards-compliant call site entries. For easier deployment, when the debugger tuning is LLDB, the DWARF requirement is adjusted to v4. Testing: Apart from check-{llvm, clang}, I built a stage2 RelWithDebInfo clang binary. Its dSYM passed verification and grew by 1.4% compared to the baseline. 151,879 call site entries were added. rdar://42001377 Differential Revision: https://reviews.llvm.org/D49887 llvm-svn: 343883	2018-10-05 20:37:17 +00:00
Matthias Braun	fb43114ba2	DwarfDebug: Pick next location in case of missing location at block begin Context: Compiler generated instructions do not have a debug location assigned to them. However emitting 0-line records for all of them bloats the line tables for very little benefit so we usually avoid doing that. Not emitting anything will lead to the previous debug location getting applied to the locationless instructions. This is not desirable for block begin and after labels. Previously we would emit simply emit line-0 records in this case, this patch changes the behavior to do a forward search for a debug location in these cases before emitting a line-0 record to further reduce line table bloat. Inspired by the discussion in https://reviews.llvm.org/D52862 llvm-svn: 343874	2018-10-05 18:29:24 +00:00
Sanjay Patel	f6a160a102	[SelectionDAG] allow undefs when matching splat constants And use that to transform fsub with zero constant operands. The integer part isn't used yet, but it is proposed for use in D44548, so adding both enhancements here makes that patch simpler. llvm-svn: 343865	2018-10-05 17:42:19 +00:00
Jonas Paulsson	faad1b3056	[TargetRegisterInfo] Remove temporary hook enableMultipleCopyHints() Finally all targets are enabling multiple regalloc hints, so the hook to disable this can now be removed. NFC. Review: Simon Pilgrim https://reviews.llvm.org/D52316 llvm-svn: 343851	2018-10-05 14:23:11 +00:00
Daniel Sanders	a464ffd52c	[globalisel][combine] When placing truncates, handle the case when the BB is empty GlobalISel uses MIR with implicit fallthrough on each basic block. As a result, getFirstNonPhi() can return end(). llvm-svn: 343829	2018-10-04 23:47:37 +00:00
Daniel Sanders	ab358bfd09	[globalisel][combine] Fix a rare crash when encountering an instruction whose op0 isn't a reg The simplest instance of this is an intrinsic with no results which will have the intrinsic ID as operand 0. Also fix some benign incorrectness when op0 is a reg but isn't a def that was guarded against by checking for the extension opcodes. llvm-svn: 343821	2018-10-04 21:44:32 +00:00
Craig Topper	7d2155e3f9	[X86][LegalizeVectorOps] Use MERGE_VALUES to return two results from LowerLoad. Remove special case code in LegalizeVectorOps that allowed us to only return one result. Previously we replaced the chain use ourself and return the data result. LegalizeVectorOps then detected that we'd done this and assumed the chain had already been handled. This commit instead returns a MERGE_VALUES node with two results joined from nodes. This allows LegalizeVectorOps to do all the replacements for us without any special casing. The MERGE_VALUES will be removed by DAG combine. llvm-svn: 343817	2018-10-04 21:24:24 +00:00
Daniel Sanders	a05c7583c9	[globalisel][combine] Improve the truncate placement for the extending-loads combine This brings the extending loads patch back to the original intent but minus the PHI bug and with another small improvement to de-dupe truncates that are inserted into the same block. The truncates are sunk to their uses unless this would require inserting before a phi in which case it sinks to the _beginning_ of the predecessor block for that path (but no earlier than the def). The reason for choosing the beginning of the predecessor is that it makes de-duping multiple truncates in the same block simple, and optimized code is going to run a scheduler at some point which will likely change the position anyway. llvm-svn: 343804	2018-10-04 18:44:58 +00:00
Craig Topper	08ae6774eb	[LegalizeIntegerTypes] Fix typo in comment. NFC llvm-svn: 343750	2018-10-04 02:40:35 +00:00
Daniel Sanders	1b493739e0	[machineverifier] Detect PHI's that are preceeded by non-PHI's If present, PHI nodes must appear before non-PHI nodes in a basic block. The register allocator relies on this and will fail to eliminate PHI's that do not meet this requirement. llvm-svn: 343731	2018-10-03 22:05:31 +00:00
Heejin Ahn	9d224346b2	Make meanings of variables clearer in action table generation (NFC) Summary: Reviewers: kristina, zhmu, dschuff, rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52680 llvm-svn: 343724	2018-10-03 21:30:15 +00:00
Matthew Voss	f8ab35a4f4	Emit template type and value parameter DIEs for template variables. Summary: Ensure the TemplateParam attribute of the DIGlobalVariable node is translated into the proper DIEs. Resolves https://bugs.llvm.org/show_bug.cgi?id=22119 Reviewers: dblaikie, probinson, aprantl, JDevlieghere, clayborg, whitequark, deadalnix Reviewed By: dblaikie Subscribers: llvm-commits Tags: #debug-info Differential Revision: https://reviews.llvm.org/D52057 llvm-svn: 343706	2018-10-03 18:44:53 +00:00
Daniel Sanders	10dedc00d0	Correct implementation of -verify-machineinstrs such that it's still overridable for EXPENSIVE_CHECKS -verify-machineinstrs was implemented as a simple bool. As a result, the 'VerifyMachineCode == cl::BOU_UNSET' used by EXPENSIVE_CHECKS to make it on by default but possible to disable didn't work as intended. Changed -verify-machineinstrs to a boolOrDefault to correct this. llvm-svn: 343696	2018-10-03 16:29:24 +00:00
Daniel Sanders	fb9b99b26e	[globalisel][combines] Don't sink G_TRUNC down to use if that use is a G_PHI This fixes a problem where the register allocator fails to eliminate a PHI because there's a non-PHI in the middle of the PHI instructions at the start of a BB. This G_TRUNC can be better placed but this at least fixes the correctness issue quickly. I'll follow up with a patch to the verifier to catch this kind of bug in future. llvm-svn: 343693	2018-10-03 15:43:39 +00:00
Jonas Paulsson	fb3a97bec0	[RA CopyHints] Fix compile-time regression This patch makes sure that a register is only hinted once to RA. In extreme cases the same register can otherwise be hinted numerous times and cause a compile time slowdown. Review: Simon Pilgrim https://reviews.llvm.org/D52826 llvm-svn: 343686	2018-10-03 12:51:19 +00:00
Jonas Toth	602e3a640f	[CodeGen] NFC fix pedantic warning from extra semicolon llvm-svn: 343674	2018-10-03 10:59:19 +00:00
Daniel Sanders	c973ad1878	Re-commit: [globalisel] Add a combiner helpers for extending loads and use them in a pre-legalize combiner for AArch64 Summary: Depends on D45541 Reviewers: ab, aditya_nandakumar, bogner, rtereshin, volkan, rovka, javed.absar, aemerson Subscribers: aemerson, rengolin, mgorny, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D45543 The previous commit failed portions of the test-suite on GreenDragon due to duplicate COPY instructions and iterator invalidation. Both issues have now been fixed. To assist with this, a helper (cloneVirtualRegister) has been added to MachineRegisterInfo that can be used to get another register that has the same type and class/bank as an existing one. llvm-svn: 343654	2018-10-03 02:12:17 +00:00
Aaron Smith	da0602c154	[CodeView] Only add the Scoped flag for an enum type when it has an immediate function scope to match MSVC Reviewers: rnk, zturner, llvm-commits Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D52706 llvm-svn: 343627	2018-10-02 20:28:15 +00:00
Aaron Smith	802b033d78	[CodeView] Emit function options for subprogram and member functions Summary: Use the newly added DebugInfo (DI) Trivial flag, which indicates if a C++ record is trivial or not, to determine Codeview::FunctionOptions. Clang and MSVC generate slightly different Codeview for C++ records. For example, here is the C++ code for a class with a defaulted ctor, class C { public: C() = default; }; Clang will produce a LF for the defaulted ctor while MSVC does not. For more details, refer to FIXMEs in the test cases in "function-options.ll" included with this set of changes. Reviewers: zturner, rnk, llvm-commits, aleksandr.urakov Reviewed By: rnk Subscribers: Hui, JDevlieghere Differential Revision: https://reviews.llvm.org/D45123 llvm-svn: 343626	2018-10-02 20:21:05 +00:00
Daniel Sanders	74de21d06f	[globalisel][verifier] Run the MachineVerifier from IRTranslator onwards -verify-machineinstrs inserts the MachineVerifier after every MachineInstr-based pass. However, GlobalISel creates MachineInstr-based passes earlier than DAGISel and the corresponding verifiers are not being added. This patch fixes that. If GlobalISel triggers the fallback path then the MIR can be left in a bad state that is going to be cleared by ResetMachineFunctions. In this situation verifying between GlobalISel passes will prevent the fallback path from recovering from this. As a result, we bail out of verifying a function if the FailedISel attribute is present. llvm-svn: 343613	2018-10-02 17:56:58 +00:00
Reid Kleckner	d5e4ec74e3	[codeview] Fix 32-bit x86 variable locations in realigned stack frames Add the .cv_fpo_stackalign directive so that we can define $T0, or the VFRAME virtual register, with it. This was overlooked in the initial implementation because unlike MSVC, we push CSRs before allocating stack space, so this value is only needed to describe local variable locations. Variables that the compiler now addresses via ESP are instead described as being stored at offsets from VFRAME, which for us is ESP after alignment in the prologue. This adds tests that show that we use the VFRAME register properly in our S_DEFRANGE records, and that we emit the correct FPO data to define it. Fixes PR38857 llvm-svn: 343603	2018-10-02 16:43:52 +00:00
Daniel Sanders	33f42f97af	Revert: r343521 and r343541: [globalisel] Add a combiner helpers for extending loads and use them in a pre-legalize combiner for AArch64 There's a strange assertion on two of the Green Dragon bots that goes away when this is reverted. The assertion is in RegBankAlloc and if it is this commit then -verify-machine-instrs should have caught it earlier in the pipeline. llvm-svn: 343546	2018-10-01 22:32:08 +00:00
Reid Kleckner	8d7c421a70	[codeview] Simplify S_DEFRANGE emission code, NFC These assembler directives are still pretty unreadable and it would be nice to clean them up at some point. llvm-svn: 343544	2018-10-01 22:25:49 +00:00
Reid Kleckner	9ea2c01264	[codeview] Emit S_FRAMEPROC and use S_DEFRANGE_FRAMEPOINTER_REL Summary: Before this change, LLVM would always describe locals on the stack as being relative to some specific register, RSP, ESP, EBP, ESI, etc. Variables in stack memory are pretty common, so there is a special S_DEFRANGE_FRAMEPOINTER_REL symbol for them. This change uses it to reduce the size of our debug info. On top of the size savings, there are cases on 32-bit x86 where local variables are addressed from ESP, but ESP changes across the function. Unlike in DWARF, there is no FPO data to describe the stack adjustments made to push arguments onto the stack and pop them off after the call, which makes it hard for the debugger to find the local variables in frames further up the stack. To handle this, CodeView has a special VFRAME register, which corresponds to the $T0 variable set by our FPO data in 32-bit. Offsets to local variables are instead relative to this value. This is part of PR38857. Reviewers: hans, zturner, javed.absar Subscribers: aprantl, hiraditya, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D52217 llvm-svn: 343543	2018-10-01 21:59:45 +00:00
Reid Kleckner	7a6966ec27	Fix the Windows build in GlobalISel Clang-cl was complaining about some sort of constexpr narrowing bug: C:\src\llvm-project\llvm\lib\CodeGen\GlobalISel\CombinerHelper.cpp(136,31): error: non-constant-expression cannot be narrowed from type 'llvm::TargetOpcode::(anonymous enum at C:\src\llvm-project\llvm\include\llvm/CodeGen/TargetOpcodes.h:22:1)' to 'unsigned int' in initializer list [-Wc++11-narrowing] unsigned(MI.getOpcode()) == unsigned(TargetOpcode::G_LOAD) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C:\src\llvm-project\llvm\lib\CodeGen\GlobalISel\CombinerHelper.cpp(136,31): note: insert an explicit cast to silence this issue unsigned(MI.getOpcode()) == unsigned(TargetOpcode::G_LOAD) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ static_cast<unsigned int>( llvm-svn: 343541	2018-10-01 21:39:39 +00:00
Daniel Sanders	9659bfda5a	[globalisel] Add a combiner helpers for extending loads and use them in a pre-legalize combiner for AArch64 Summary: Depends on D45541 Reviewers: ab, aditya_nandakumar, bogner, rtereshin, volkan, rovka, javed.absar, aemerson Subscribers: aemerson, rengolin, mgorny, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D45543 llvm-svn: 343521	2018-10-01 18:56:47 +00:00
Matthias Braun	7159daa68e	MIRParser: Check that instructions only reference DILocation metadata llvm-svn: 343505	2018-10-01 17:50:52 +00:00
Matthias Braun	004fe6bf83	DAGCombiner: StoreMerging: Fix bad index calculating when adjusting mismatching vector types This fixes a case of bad index calculation when merging mismatching vector types. This changes the existing code to just use the existing extract_{subvector\|element} and a bitcast (instead of bitcast first and then newly created extract_xxx) so we don't need to adjust any indices in the first place. rdar://44584718 Differential Revision: https://reviews.llvm.org/D52681 llvm-svn: 343493	2018-10-01 16:25:50 +00:00
Carlos Alberto Enciso	81d8ef2196	[DebugInfo][Dexter] Incorrect DBG_VALUE after MCP dead copy instruction removal. When MachineCopyPropagation eliminates a dead 'copy', its associated debug information becomes invalid. as the recorded register has been removed. It causes the debugger to display wrong variable value. Differential Revision: https://reviews.llvm.org/D52614 llvm-svn: 343445	2018-10-01 08:14:44 +00:00
Fangrui Song	3507c6e884	Use the container form llvm::sort(C, ...) There are a few leftovers in rL343163 which span two lines. This commit changes these llvm::sort(C.begin(), C.end, ...) to llvm::sort(C, ...) llvm-svn: 343426	2018-09-30 22:31:29 +00:00
Bjorn Pettersson	c2fc53ac90	[PHIElimination] Lower a PHI node with only undef uses as IMPLICIT_DEF Summary: The lowering of PHI nodes used to detect if all inputs originated from IMPLICIT_DEF's. If so the PHI node was replaced by an IMPLICIT_DEF. Now we also consider undef uses when checking the inputs. So if all inputs are implicitly defined or undef we lower the PHI to an IMPLICIT_DEF. This makes PHIElimination::LowerPHINode more consistent as it checks both implicit and undef properties at later stages. Reviewers: MatzeB, tstellar Reviewed By: MatzeB Subscribers: jvesely, nhaehnle, llvm-commits Differential Revision: https://reviews.llvm.org/D52558 llvm-svn: 343417	2018-09-30 17:26:58 +00:00

... 3 4 5 6 7 ...

25471 Commits