25271 Commits

Author SHA1 Message Date
Alexander Kornienko 3635c89070 Fix uninitialized variable.
The Flags variable was not initialized but was later used (both isMBBSafeToOutlineFrom
implementations assume it's initialized), which breaks
test/CodeGen/AArch64/machine-outliner.mir under memory sanitizer:
MemorySanitizer: use-of-uninitialized-value
    #0  in llvm::AArch64InstrInfo::getOutliningType(llvm::MachineInstrBundleIterator<llvm::MachineInstr, false>&, unsigned int) const llvm/lib/Target/AArch64/AArch64InstrInfo.cpp:5494:9
    #1  in (anonymous namespace)::InstructionMapper::convertToUnsignedVec(llvm::MachineBasicBlock&, llvm::TargetInstrInfo const&) llvm/lib/CodeGen/MachineOutliner.cpp:772:19
    #2  in (anonymous namespace)::MachineOutliner::populateMapper((anonymous namespace)::InstructionMapper&, llvm::Module&, llvm::MachineModuleInfo&) llvm/lib/CodeGen/MachineOutliner.cpp:1543:14
    #3  in (anonymous namespace)::MachineOutliner::runOnModule(llvm::Module&) llvm/lib/CodeGen/MachineOutliner.cpp:1645:3
    #4  in (anonymous namespace)::MPPassManager::runOnModule(llvm::Module&) llvm/lib/IR/LegacyPassManager.cpp:1744:27
    #5  in llvm::legacy::PassManagerImpl::run(llvm::Module&) llvm/lib/IR/LegacyPassManager.cpp:1857:44
    #6  in compileModule(char**, llvm::LLVMContext&) llvm/tools/llc/llc.cpp:597:8

llvm-svn: 346761
2018-11-13 16:41:05 +00:00
Craig Topper 0b33b468a1 [DAGCombiner] Enable tryToFoldExtendOfConstant to run after legalize vector ops
It should be ok to create a new build_vector after legal operations so long as it doesn't cause an infinite loop in DAG combiner.

Unfortunately, X86's custom constant folding in combineVSZext is hiding any test changes from this. But I'm trying to get to a point where that X86 specific code isn't necessary at all.

Differential Revision: https://reviews.llvm.org/D54285

llvm-svn: 346728
2018-11-13 01:59:32 +00:00
Jessica Paquette 82d9c0a3fa [MachineOutliner][NFC] Change getMachineOutlinerMBBFlags to isMBBSafeToOutlineFrom
Instead of returning Flags, return true if the MBB is safe to outline from.

This lets us check for unsafe situations, like say, in AArch64, X17 is live
across a MBB without being defined in that MBB. In that case, there's no point
in performing an instruction mapping.

llvm-svn: 346718
2018-11-12 23:51:32 +00:00
Philip Reames e44a55dc98 [GC][NFC] Simplify code now that we only have one safepoint kind
This is the NFC follow up to exploit the semantic simplification from r346701

llvm-svn: 346712
2018-11-12 22:03:53 +00:00
Ali Tamur d482b01a62 Use a data structure better suited for large sets in SimplificationTracker.
Summary:
D44571 changed SimplificationTracker to use SmallSetVector to keep phi nodes. As a result, when the number of phi nodes is large, build-time performance suffers badly. When building for PowerPC, we have a case where there are more than 600,000 nodes, and it takes too long to compile.

In this change, I partially revert D44571 to use SmallPtrSet, which does an acceptable job with any number of elements. In the original patch, having a deterministic iteration order was mentioned as a motivation; however, I think it only applies to the nodes already matched in the MatchPhiSet method, which I did not touch.

Reviewers: bjope, skatkov

Reviewed By: bjope, skatkov

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D54007

llvm-svn: 346710
2018-11-12 21:43:43 +00:00
Philip Reames c75a0c3f69 [GC] Remove so called PreCall safepoints
Remove another bit of unused configuration potential from GCStrategy.  It's not entirely clear what the intention here was, but from the docs, it sounds like this may have been subsumed by patchable call support.

Note: This change is deliberately small to make it clear that while implemented, there's nothing using the option.  A following NFC will do most of the simplifications.

llvm-svn: 346701
2018-11-12 20:15:34 +00:00
Stanislav Mekhanoshin 5f9513147a Fix MachineInstr::findRegisterUseOperandIdx subreg checks
The function only checks that the instruction reads a super-register
containing the requested physical register. If a sub-register
is being read, that is also a use of the super-register, so this adds that check.
In particular, MI->readsRegister() is broken because of the missing
check. The resulting check is essentially regsOverlap().

Differential Revision: https://reviews.llvm.org/D54128

llvm-svn: 346686
2018-11-12 18:12:28 +00:00
Jessica Paquette 9702144341 [MachineOutliner][NFC] Early exit pruning when candidates don't share an MBB
There's no way they can overlap in this case.

This can save a few iterations when the candidate is close to the beginning
of a MachineBasicBlock. It's particularly useful when the average length of
a MachineBasicBlock in the program is small.

llvm-svn: 346682
2018-11-12 17:50:56 +00:00
Jessica Paquette 3954272ac1 [MachineOutliner][NFC] Put suffix tree in buildCandidateList
It's only used there, so it doesn't make much sense to have it in runOnModule.

llvm-svn: 346681
2018-11-12 17:50:55 +00:00
Paul Robinson 5b302bfc8e [DWARFv5] Emit split type units in .debug_info.dwo.
Differential Revision: https://reviews.llvm.org/D54350

llvm-svn: 346674
2018-11-12 16:55:11 +00:00
Nirav Dave a395e2df56 [DAGCombiner] Fix load-store forwarding of indexed loads.
Summary:
Handle the extra output from indexed loads in cases where we wish to
forward a load value directly from a preceding store.

Fixes PR39571.

Reviewers: peter.smith, rengolin

Subscribers: javed.absar, hiraditya, arphaman, llvm-commits

Differential Revision: https://reviews.llvm.org/D54265

llvm-svn: 346654
2018-11-12 14:05:40 +00:00
Philip Reames 8b48ceac80 [GC] Remove unused configuration variable
The custom root mechanism didn't actually do anything.  ShadowStackGC, the only one which used it, just removed the gcroots before they reached the normal lowering in SelectionDAG.  As a result, the state flag had no value.

llvm-svn: 346632
2018-11-12 02:34:54 +00:00
Philip Reames 1559021751 [GC] Minor style modernization
llvm-svn: 346631
2018-11-12 02:26:26 +00:00
Philip Reames 18945d6c99 [GCRoot] Remove some unnecessary complexity
The GCStrategy provides three configuration options which are largely redundant.

1) Support for conditionally lowering gcread and gcwrite to loads and stores.  This is redundant since any GC which wished to use these abstractions would lower them out of existence before the built-in lowering anyway.  As such, there's no need for the lowering to be conditional.
2) Conditional initialization for allocas marked via gcroot.  Semantically, roots have to be initialized before first potential use.  Arguably, the frontend really should have responsibility for that, but the old API allowed the frontend to ignore this detail.  Only one builtin GC used the non-initializing mode.  Since no one to my knowledge actually uses the ErlangGC strategy, I decided the slight pessimization was worth the simplicity.  If that turns out to be problematic, we can always improve the insertion algorithm to detect more existing initializing stores.

llvm-svn: 346621
2018-11-11 21:13:09 +00:00
Craig Topper d23cdbbeb2 [DAGCombiner] Make tryToFoldExtendOfConstant return an SDValue instead of an SDNode*. NFC
Removes the need to call getNode internally and to recreate an SDValue after the call.

llvm-svn: 346600
2018-11-10 23:46:03 +00:00
Sanjay Patel 0a515595a7 [x86] allow vector load narrowing with multi-use values
This is a long-awaited follow-up suggested in D33578. Since then, we've picked up even more
opportunities for vector narrowing from changes like D53784, so there are a lot of test diffs.
Apart from 2-3 strange cases, these are all wins.

I've structured this to be no-functional-change-intended for any target except for x86
because I couldn't tell if AArch64, ARM, and AMDGPU would improve or not. All of those
targets have existing regression tests (4, 4, 10 files respectively) that would be
affected. Also, Hexagon overrides the shouldReduceLoadWidth() hook, but doesn't show
any regression test diffs. The trade-off is deciding if an extra vector load is better
than a single wide load + extract_subvector.

For x86, this is almost always better (on paper at least) because we often can fold
loads into subsequent ops and not increase the official instruction count. There's also
some unknown -- but potentially large -- benefit from using narrower vector ops if wide
ops are implemented with multiple uops and/or frequency throttling is avoided.

Differential Revision: https://reviews.llvm.org/D54073

llvm-svn: 346595
2018-11-10 20:05:31 +00:00
Philip Reames 9b8c102675 [GC] Rename a header for consistency
llvm-svn: 346588
2018-11-10 16:08:10 +00:00
Matthias Braun fb93aecf8d RegAllocFast: Further cleanups; NFC
llvm-svn: 346576
2018-11-10 00:36:27 +00:00
Philip Reames afa1742b4b [GC] Simplify linking of GC builtin GC strategies
llvm-svn: 346569
2018-11-09 23:56:21 +00:00
Craig Topper f2e65f8636 [SelectionDAG] Fix a -Wparentheses warning from gcc in an assert. NFC
gcc wants parentheses around the logical OR since there is a logical AND for the string.

llvm-svn: 346564
2018-11-09 23:11:30 +00:00
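
Note on r346564: `&&` binds more tightly than `||`, so an assert written as `A || B && "msg"` parses as `A || (B && "msg")`. The value is unchanged because a string literal is never null, but gcc's -Wparentheses flags the mix. The sketch below shows the parenthesized form; the function and its condition are hypothetical and are not taken from SelectionDAG.

```cpp
#include <cassert>

// Hypothetical helper: the precondition is "small or even".
inline int halveOrKeep(int V) {
  // The inner parentheses make the grouping explicit: the message string
  // is AND-ed with the whole OR expression, not only with its right side.
  assert((V < 8 || V % 2 == 0) && "expected a small or an even value");
  return V < 8 ? V : V / 2;
}
```

Small values pass through unchanged, and larger even values are halved.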
Paul Robinson ddbde9a4ad [DWARFv5] Emit normal type units in .debug_info comdats.
Differential Revision: https://reviews.llvm.org/D54282

llvm-svn: 346540
2018-11-09 19:06:09 +00:00
Craig Topper 9a7e19b8f2 [DAGCombiner][X86][Mips] Enable combineShuffleOfScalars to run between vector op legalization and DAG legalization. Fix bad one use check in combineShuffleOfScalars
It's possible for vector op legalization to generate a shuffle. If that happens we should give a chance for DAG combine to combine that with a build_vector input.

I also fixed a bug in combineShuffleOfScalars that was considering the number of uses on an undef input to a shuffle. We don't care how many times undef is used.

Differential Revision: https://reviews.llvm.org/D54283

llvm-svn: 346530
2018-11-09 18:04:34 +00:00
Serge Guelton 86f8b70f1b Type safe version of MachinePassRegistry
Previous version used type erasure through a `void* (*)()` pointer,
which triggered gcc warning and implied a lot of reinterpret_cast.

This version should make it harder to hit ourselves in the foot.

Differential revision: https://reviews.llvm.org/D54203

llvm-svn: 346522
2018-11-09 17:19:45 +00:00
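
Note on r346522: the general idea of replacing a type-erased `void* (*)()` with a typed factory can be sketched as a registry templated on the factory type. Everything below (`Registry`, `Scheduler`, `makeFast`) is a hypothetical miniature for illustration and does not reproduce the real MachinePassRegistry interface.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical registry: the factory type is a template parameter, so no
// reinterpret_cast is needed when a factory is stored or looked up.
template <typename FactoryT> class Registry {
  struct Entry {
    std::string Name;
    FactoryT Ctor;
  };
  std::vector<Entry> Entries;

public:
  void add(std::string Name, FactoryT Ctor) {
    assert(Ctor != nullptr && "factory must not be null");
    Entries.push_back({std::move(Name), Ctor});
  }

  // Returns the registered factory, or nullptr when Name is unknown.
  FactoryT lookup(const std::string &Name) const {
    for (const Entry &E : Entries)
      if (E.Name == Name)
        return E.Ctor;
    return nullptr;
  }
};

struct Scheduler {
  int Id;
};
using SchedulerFactory = Scheduler (*)();
inline Scheduler makeFast() { return Scheduler{1}; }
```

With this shape, registering a function of the wrong signature is a compile error rather than a silent cast.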
Zaara Syeda 5c179bf14b [Power9] Allow gpr callee saved spills in prologue to vector registers
Currently in llvm, CalleeSavedInfo can only assign a callee saved register to
a stack frame index to be spilled in the prologue. We would like to enable
spilling gprs to vector registers. This patch adds the capability to spill to
other registers aside from just the stack. It also adds the changes for power9
to spill gprs to volatile vector registers when they are available.
This happens only for leaf functions when using the option
-ppc-enable-pe-vector-spills.

Differential Revision: https://reviews.llvm.org/D39386

llvm-svn: 346512
2018-11-09 16:36:24 +00:00
Alexandros Lamprineas e15c982f6d [SelectionDAG] swap select_cc operands to enable folding
The DAGCombiner tries to SimplifySelectCC as follows:

  select_cc(x, y, 16, 0, cc) -> shl(zext(set_cc(x, y, cc)), 4)

It can't cope with the situation of reordered operands:

  select_cc(x, y, 0, 16, cc)

In that case we just need to swap the operands and invert the Condition Code:

  select_cc(x, y, 16, 0, ~cc)

Differential Revision: https://reviews.llvm.org/D53236

llvm-svn: 346484
2018-11-09 11:09:40 +00:00
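
Note on r346484: the identities behind this fold can be checked on plain integers. The sketch below models select_cc as a ternary and picks "less than" as the condition code, with "greater or equal" as its inverse; all function names are hypothetical and none of this is DAGCombiner code.

```cpp
#include <cassert>
#include <cstdint>

// select_cc(x, y, t, f, lt): yields t when x < y, otherwise f.
inline uint32_t selectLT(int32_t X, int32_t Y, uint32_t T, uint32_t F) {
  return X < Y ? T : F;
}

// The same select with the inverted condition code (ge).
inline uint32_t selectGE(int32_t X, int32_t Y, uint32_t T, uint32_t F) {
  return X >= Y ? T : F;
}

// shl(zext(set_cc(x, y, cc)), 4) for cc = lt and cc = ge.
inline uint32_t foldedLT(int32_t X, int32_t Y) {
  return static_cast<uint32_t>(X < Y) << 4;
}
inline uint32_t foldedGE(int32_t X, int32_t Y) {
  return static_cast<uint32_t>(X >= Y) << 4;
}

// Checks all three steps for one (x, y) pair; aborts on a mismatch.
inline void checkFold(int32_t X, int32_t Y) {
  assert(selectLT(X, Y, 16, 0) == foldedLT(X, Y));          // existing fold
  assert(selectLT(X, Y, 0, 16) == selectGE(X, Y, 16, 0));   // swap + invert
  assert(selectGE(X, Y, 16, 0) == foldedGE(X, Y));          // fold applies again
}
```

Calling `checkFold` on pairs where x < y, x == y, and x > y exercises both outcomes of the condition.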
Craig Topper 8cca8bd4aa [SelectionDAG] Assert on the width of DemandedElts argument to computeKnownBits for all vector typed operations not just build_vector.
Fix AArch64 unit test that fails with the assertion added.

llvm-svn: 346437
2018-11-08 20:29:17 +00:00
Nirav Dave 6ce9f72f76 [DAGCombine] Improve alias analysis for chain of independent stores.
FindBetterNeighborChains simultaneously improves the chain
dependencies of a chain of related stores, avoiding the generation of
extra token factors. For chains longer than the GatherAllAliasDepths,
stores further down in the chain will necessarily fail, which is a potentially
significant waste and prevents otherwise trivial parallelization.

This patch directly parallelizes the chains of stores before improving
each store. This generally improves DAG-level parallelism.

Reviewers: courbet, spatel, RKSimon, bogner, efriedma, craig.topper, rnk

Subscribers: sdardis, javed.absar, hiraditya, jrtc27, atanasyan, llvm-commits

Differential Revision: https://reviews.llvm.org/D53552

llvm-svn: 346432
2018-11-08 19:14:20 +00:00
David Blaikie c8f7e6c1a9 NFC: DebugInfo: Track the origin CU rather than just the base address for range lists
Turns out knowing more than just the base address might be useful -
specifically a future change to respect a DICompileUnit flag for the use
of base address specifiers in DWARF < 5.

llvm-svn: 346380
2018-11-08 00:35:54 +00:00
Jessica Paquette c4cf775ae0 [MachineOutliner][NFC] Only map blocks which have adjacent legal instructions
If a block doesn't have any ranges of adjacent legal instructions, then it
can't have outlining candidates. There's no point in mapping legal instructions
in situations like this.

I noticed this reduces the size of the suffix tree in sqlite3 for AArch64 at
-Oz by about 3%.

llvm-svn: 346379
2018-11-08 00:33:38 +00:00
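
Note on r346379: the filter described in this entry can be sketched as a scan for two legal instructions in a row. The sketch assumes a block is represented as a vector of per-instruction legality flags, which is a simplification; `hasAdjacentLegal` is a hypothetical name and not the outliner's own function.

```cpp
#include <cstddef>
#include <vector>

// Returns true when at least two consecutive instructions are legal to
// outline. A block without such a pair cannot contain a candidate, so a
// caller could skip mapping it altogether.
inline bool hasAdjacentLegal(const std::vector<bool> &Legal) {
  for (std::size_t I = 1; I < Legal.size(); ++I)
    if (Legal[I - 1] && Legal[I])
      return true;
  return false;
}
```

A block whose legal instructions are all separated by illegal ones is rejected, as is an empty block.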
Jessica Paquette 267d266c29 [MachineOutliner][NFC] Don't map MBBs that don't contain legal instructions
I noticed that there are lots of basic blocks that don't have enough legal
instructions in them to warrant outlining. We can skip mapping these entirely.

In sqlite3, compiled for AArch64 at -Oz, this results in a 10% reduction of
the total nodes in the suffix tree. These nodes can never be part of a
repeated substring, and so they don't impact the result at all.

Before this, there were 62128 nodes in the tree for sqlite3. After this, there
are 56457 nodes.

llvm-svn: 346373
2018-11-08 00:02:11 +00:00
Jessica Paquette df5b09b8ce [MachineOutliner][NFC] Remove Parent field from SuffixTreeNode
This is only used for calculating ConcatLen. This isn't necessary,
since it's easily derived from the traversal setting suffix indices.

Remove that. Rename CurrIdx to CurrNodeLen to better describe what's
going on.

llvm-svn: 346349
2018-11-07 19:56:13 +00:00
Jessica Paquette a409cc959b [MachineOutliner][NFC] Traverse suffix tree using a RepeatedSubstring iterator
This takes the traversal methods introduced in r346269 and adapts them
into an iterator. This allows the outliner to iterate over repeated substrings
within the suffix tree directly without having to initially find all of the
substrings and then iterate over them after you've found them.

llvm-svn: 346345
2018-11-07 19:20:55 +00:00
Jessica Paquette a3eb0fac3b [MachineOutliner] Don't store outlined function numberings on OutlinedFunction
NFC-ish. This doesn't change the behaviour of the outliner, but does make sure
that you won't end up with say

OUTLINED_FUNCTION_2:
...
ret

OUTLINED_FUNCTION_248:
...
ret

as the only outlined functions in your module. Those should really be

OUTLINED_FUNCTION_0:
...
ret

OUTLINED_FUNCTION_1:
...
ret

If we produce outlined functions, they probably should have sequential numbers
attached to them. This makes it a bit easier to write stable outliner tests.

The point of this is to move towards a bit more stability in outlined function
names. By doing this, we at least don't rely on the traversal order of the
suffix tree. Instead, we rely on the order of the candidate list, which is
*far* more consistent. The candidate list is ordered by the end indices of
candidates, so we're more likely to get a stable ordering. This is still
susceptible to changes in the cost model though (like, if we suddenly find new
candidates, for example).

llvm-svn: 346340
2018-11-07 18:36:43 +00:00
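
Note on r346340: one way to get gap-free names is to draw the number from a counter only when a function is actually emitted, rather than storing a number on each candidate up front. The sketch below assumes candidates arrive as a list of "profitable or not" flags; `nameOutlined` is a hypothetical helper and not the outliner's code.

```cpp
#include <string>
#include <vector>

// Assigns names in emission order. Rejected candidates consume no number,
// so the resulting names are always 0, 1, 2, ... without gaps.
inline std::vector<std::string>
nameOutlined(const std::vector<bool> &Profitable) {
  std::vector<std::string> Names;
  unsigned Next = 0;
  for (bool P : Profitable)
    if (P)
      Names.push_back("OUTLINED_FUNCTION_" + std::to_string(Next++));
  return Names;
}
```

With four candidates of which only the second and fourth are emitted, the names are OUTLINED_FUNCTION_0 and OUTLINED_FUNCTION_1.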
Serge Guelton a4d9e2293a Fix ignored type qualifier warning [NFC]
llvm-svn: 346332
2018-11-07 16:17:30 +00:00
James Y Knight 72f76bf230 Add support for llvm.is.constant intrinsic (PR4898)
This adds the llvm-side support for post-inlining evaluation of the
__builtin_constant_p GCC intrinsic.

Also fixed SCCPSolver::visitCallSite to not blow up when seeing a call
to a function where canConstantFoldTo returns true, and one of the
arguments is a struct.

Updated from patch initially by Janusz Sobczak.

Differential Revision: https://reviews.llvm.org/D4276

llvm-svn: 346322
2018-11-07 15:24:12 +00:00
Matthias Braun 5b7c90b4e2 RegAllocFast: Leave unassigned virtreg entries in map
Set `LiveReg::PhysReg` to zero when freeing a register instead of
removing the entry from `LiveRegMap`. This way no iterators get
invalidated and we can avoid passing around and updating iterators all
over the place.

This does not change any allocator decisions. It is not completely NFC
because the arbitrary iteration order through `LiveRegMap` in
`spillAll()` changes so we may get a different order in those spill
sequences (the number of spills does not change).

This is in preparation of https://reviews.llvm.org/D52010.

llvm-svn: 346298
2018-11-07 06:57:03 +00:00
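
Note on r346298: the "mark as unassigned instead of erasing" idea can be sketched with a plain vector. The `LiveReg` struct below is a hypothetical stand-in with two fields, and `std::vector` replaces whatever container the allocator really uses; only the in-place clearing is the point.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an allocator's live-register entry.
struct LiveReg {
  unsigned VirtReg;
  unsigned PhysReg; // 0 means "no physical register assigned"
};

// Frees every entry that holds Victim by zeroing PhysReg in place.
// Nothing is erased, so the loop's iterators stay valid throughout.
inline std::size_t freePhysReg(std::vector<LiveReg> &Map, unsigned Victim) {
  assert(Victim != 0 && "0 is reserved for 'unassigned'");
  std::size_t Freed = 0;
  for (LiveReg &LR : Map) {
    if (LR.PhysReg == Victim) {
      LR.PhysReg = 0;
      ++Freed;
    }
  }
  return Freed;
}
```

The map keeps its size after a free; callers that walk it must skip entries whose `PhysReg` is zero.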
Matthias Braun b0ecbef428 RegAllocFast: Further cleanups; NFC
This is in preparation of https://reviews.llvm.org/D52010.

llvm-svn: 346297
2018-11-07 06:57:02 +00:00
Matthias Braun 0804dca358 RegAllocFast: Refactor PhysRegState usage; NFC
This is in preparation of https://reviews.llvm.org/D52010.

llvm-svn: 346296
2018-11-07 06:57:00 +00:00
Matthias Braun b4c76ff77c RegAllocFast: Factor spill/reload creation into their own functions; NFC
This is in preparation of https://reviews.llvm.org/D52010.

llvm-svn: 346289
2018-11-07 02:04:12 +00:00
Matthias Braun ebcf5437bc RegAllocFast: Cleanups; NFC
This is in preparation of https://reviews.llvm.org/D52010.

llvm-svn: 346288
2018-11-07 02:04:11 +00:00
Matthias Braun 14af82a608 RegAllocFast: Rename statistic from NumCopies to NumCoalesced
The metric does not return the number of remaining (or inserted) copies
but the number of copies that were coalesced. Pick a more descriptive
name.

llvm-svn: 346287
2018-11-07 02:04:07 +00:00
Jessica Paquette 935d373db9 [MachineOutliner][NFC] Remove OccurrenceCount from SuffixTreeNode
After changing the way we find candidates in r346269, this is no longer used.

llvm-svn: 346275
2018-11-06 22:23:13 +00:00
Jessica Paquette 979cf1e566 [MachineOutliner][NFC] Remove IsInTree from SuffixTreeNode
After changing the way we find repeated substrings in r346269, this
field is no longer used by anything, so it can be removed.

llvm-svn: 346274
2018-11-06 22:21:11 +00:00
Jessica Paquette 4e54ef8883 [MachineOutliner][NFC] Add findRepeatedSubstrings to SuffixTree, kill LeafVector
Instead of iterating over the leaves to find repeated substrings, and walking
collecting leaf children when we don't necessarily need them, let's just
calculate what we need and iterate over that.

By doing this, we don't have to save every leaf. It's easier to read the code
too and understand what's going on.

The goal here, at the end of the day, is to set up to allow us to do something
like

for (RepeatedSubstring &RS : ST) {
 ... do stuff with RS ...
}

Which would let us perform the cost model stuff and the repeated substring
query at the same time.

llvm-svn: 346269
2018-11-06 21:46:41 +00:00
Matthias Braun c6613879ce LivePhysRegs/IfConversion: Change some types from unsigned to MCPhysReg; NFC
Change the type in a couple of lists and sets that only store physical
registers from unsigned to MCPhysReg. The latter is only 16 bits and
saves us a bit of memory.

llvm-svn: 346254
2018-11-06 19:00:11 +00:00
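
Note on r346254: the saving comes from element size alone. The sketch below uses `std::uint16_t` as a stand-in for MCPhysReg (an assumption based on the "16 bits" stated above) and compares payload sizes; `PhysReg16` and `payloadBytes` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for MCPhysReg, assumed here to be a 16-bit unsigned integer.
using PhysReg16 = std::uint16_t;
static_assert(sizeof(PhysReg16) <= sizeof(unsigned),
              "the narrow type must not be larger than unsigned");

// Bytes occupied by the elements themselves (container overhead excluded).
template <typename RegT>
std::size_t payloadBytes(const std::vector<RegT> &Regs) {
  return Regs.size() * sizeof(RegT);
}
```

On platforms where `unsigned` is 32 bits, the narrow list needs half the payload of the wide one.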
Matthias Braun 7a75a91b5b MachineFunction: Store more specific reference to LLVMTargetMachine; NFC
MachineFunction can only be used in code using lib/CodeGen, hence we
can keep a more specific reference to LLVMTargetMachine rather than just
TargetMachine around.

Do the same for references in ScheduleDAG and RegUsageInfoCollector.

llvm-svn: 346183
2018-11-05 23:49:14 +00:00
Matthias Braun 3d849f67cb MachineModuleInfo: Store more specific reference to LLVMTargetMachine; NFC
MachineModuleInfo can only be used in code using lib/CodeGen, hence we
can keep a more specific reference to LLVMTargetMachine rather than just
TargetMachine around.

llvm-svn: 346182
2018-11-05 23:49:13 +00:00
Cameron McInally 9757d5d6c1 [FPEnv] Add constrained CEIL/FLOOR/ROUND/TRUNC intrinsics
Differential Revision: https://reviews.llvm.org/D53411

llvm-svn: 346141
2018-11-05 15:59:49 +00:00
Simon Pilgrim 6bd468bd8b [TargetLowering] Begin generalizing TargetLowering::expandFP_TO_SINT support. NFCI.
Prior to initial work to add vector expansion support, remove assumptions that we're working on scalar types.

llvm-svn: 346139
2018-11-05 15:49:09 +00:00
Craig Topper 8f2f2a76b9 [DAGCombiner] Use tryFoldToZero to simplify some code and make it work correctly between LegalTypes and LegalOperations.
The original code avoided creating a zero vector after type legalization, but if we're after type legalization the type we have is legal. The real hazard we need to avoid is creating a build vector after op legalization. tryFoldToZero takes care of checking for this.

llvm-svn: 346119
2018-11-05 05:53:06 +00:00