Commit Graph

25980 Commits

Author SHA1 Message Date
Simon Pilgrim 8fbe439345 [SelectionDAG] Add SimplifyDemandedBits handling for ISD::SCALAR_TO_VECTOR
Fixes a lot of constant folding mismatches between i686 and x86_64

llvm-svn: 356273
2019-03-15 17:00:55 +00:00
Mikael Holmen 339daae806 [CodeGenPrepare] avoid crashing from replacing a phi twice
Summary:
This is a fix to bug 41052:
https://bugs.llvm.org/show_bug.cgi?id=41052

While trying to optimize a memory instruction in a dead basic block, we end up registering the same phi for replacement twice. This patch avoids registering more than the first replacement candidate for a phi.

Patch by: JesperAntonsson

Reviewers: skatkov, aprantl

Reviewed By: aprantl

Subscribers: jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59358

llvm-svn: 356260
2019-03-15 13:51:05 +00:00
Sanjay Patel 2c9275a790 [CGP] add another bailout for degenerate code (PR41064)
This is almost the same as:
rL355345
...and should prevent any potential crashing from examples like:
https://bugs.llvm.org/show_bug.cgi?id=41064
...although the bug was masked by:
rL355823
...and I'm not sure how to repro the problem after that change.

llvm-svn: 356218
2019-03-14 23:14:31 +00:00
Matt Arsenault bc6d07ca46 MIR: Allow targets to serialize MachineFunctionInfo
This has been a very painful missing feature that has made producing
reduced testcases difficult. In particular the various registers
determined for stack access during function lowering were necessary to
avoid undefined register errors in a large percentage of
cases. Implement a subset of the important fields that need to be
preserved for AMDGPU.

Most of the changes are to support targets parsing register fields and
properly reporting errors. The biggest sort-of bug remaining is for
fields that can be initialized from the IR section will be overwritten
by a default initialized machineFunctionInfo section. Another
remaining bug is the machineFunctionInfo section is still printed even
if empty.

llvm-svn: 356215
2019-03-14 22:54:43 +00:00
Philip Reames 70d156991c Allow code motion (and thus folding) for atomic (but unordered) memory operands
Building on the work done in D57601, now that we can distinguish between atomic and volatile memory accesses, go ahead and allow code motion of unordered atomics. As seen in the diffs, this allows much better folding of memory operations into using instructions. (Mostly done by the PeepholeOpt pass.)

Note: I have not reviewed all callers of hasOrderedMemoryRef since one of them - isSafeToMove - is very widely used. I'm relying on the documented semantics of each method to judge correctness.

Differential Revision: https://reviews.llvm.org/D59345

llvm-svn: 356170
2019-03-14 17:20:59 +00:00
Adrian Prantl e69917f166 Add IR debug info support for Elemental, Pure, and Recursive Procedures.
Patch by Eric Schweitz!

Differential Revision: https://reviews.llvm.org/D54043

llvm-svn: 356163
2019-03-14 16:29:54 +00:00
Matt Arsenault 133716929c GlobalISel: Use multiple returns for intrinsic structs
This is consistent with what SelectionDAG does and is much easier to
work with than the extract sequence with an artificial wide register.

For the AMDGPU control flow intrinsics, this was producing an s128 for
the i64, i1 tuple return. Any legalization that should apply to a real
s128 value would badly obscure the direct values that need to be seen.

llvm-svn: 356147
2019-03-14 14:18:56 +00:00
Quentin Colombet e77e5f44b8 [GlobalISel][Utils] Add a getConstantVRegVal variant that looks through instrs
getConstantVRegVal used to only look for G_CONSTANT when looking at
unboxing the value of a vreg. However, constants are sometimes not
directly used and are hidden behind trunc, s|zext or copy chain of
computation.

In particular this may be introduced by the legalization process that
doesn't want to simplify these patterns because it can lead to infine
loop when legalizing a constant.

To circumvent that problem, add a new variant of getConstantVRegVal,
named getConstantVRegValWithLookThrough, that allow to look through
extensions.

Differential Revision: https://reviews.llvm.org/D59227

llvm-svn: 356116
2019-03-14 01:37:13 +00:00
Craig Topper 66df7361ff [ResetMachineFunctionPass] Add visited functions statistics info
Adding a "NumFunctionsVisited" for collecting the visited function number.
It can be used to collect function pass rate in some tests,
the pass rate = (NumberVisited - NumberReset)/NumberVisited.
e.g. it can be used for caculating GlobalISel pass rate in Test-Suite.

Patch by Tianyang Zhu (zhutianyang)

Differential Revision: https://reviews.llvm.org/D59285

llvm-svn: 356114
2019-03-14 01:13:15 +00:00
Nirav Dave ee5183c796 [DAGCombiner] Fix Comment. NFC.
llvm-svn: 356069
2019-03-13 17:44:40 +00:00
Nirav Dave d6351340bb [DAGCombiner] If a TokenFactor would be merged into its user, consider the user later.
Summary:
A number of optimizations are inhibited by single-use TokenFactors not
being merged into the TokenFactor using it. This makes we consider if
we can do the merge immediately.

Most tests changes here are due to the change in visitation causing
minor reorderings and associated reassociation of paired memory
operations.

CodeGen tests with non-reordering changes:

  X86/aligned-variadic.ll -- memory-based add folded into stored leaq
  value.

  X86/constant-combiners.ll -- Optimizes out overlap between stores.

  X86/pr40631_deadstore_elision -- folds constant byte store into
  preceding quad word constant store.

Reviewers: RKSimon, craig.topper, spatel, efriedma, courbet

Reviewed By: courbet

Subscribers: dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, javed.absar, eraman, hiraditya, kbarton, jrtc27, atanasyan, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59260

llvm-svn: 356068
2019-03-13 17:07:09 +00:00
Clement Courbet 3bb5d0bb9b Re-land r354244 "[DAGCombiner] Eliminate dead stores to stack."
Always check candidates for hasOtherUses(), not only stores.

llvm-svn: 356050
2019-03-13 13:56:23 +00:00
Simon Pilgrim 360ce82db2 [DAG] Move integer setcc %x, %x folding into FoldSetCC
First step towards PR40800 - I intend to move the float case in a separate future patch.

I had to tweak the (overly reduced) thumb2 test and the x86 widening test change is annoying (no longer rematerializable) but we should address this separately.

Differential Revision: https://reviews.llvm.org/D59244

llvm-svn: 356040
2019-03-13 11:08:57 +00:00
Philip Reames 21a50ccf9c [ImplicitNullChecks] Support unordered atomic accesses
Update the INC pass to allow folding unordered atomics.  This is the first optimization unblocked by the changes landed from D57601.

llvm-svn: 356006
2019-03-13 03:25:20 +00:00
Matt Arsenault bdfb6cfdf1 MIR: Stop reinitializing target information for every use
Every time a physical register reference was parsed, this would
initialize a string map for every register in in target, and discard
it for the next. The same applies for the other fields initialized
from target information.

Follow along with how the function state is tracked, and add a new
tracking class for target information.

The string->register class/register bank for some reason were kept
separately, so track them in the same place.

llvm-svn: 355970
2019-03-12 20:42:12 +00:00
Philip Reames 18408d5e79 [CodeGen] Add MMOs to statepoint nodes during SelectionDAG
The existing statepoint lowering code does something odd; it adds machine memory operands post instruction selection. This was copied from the stackmap/patchpoint implementation, but appears to be non-idiomatic.

This change is largely NFC. It moves the MMO creation logic into SelectionDAG building. It ends up not quite being NFC because the size of the stack slot is reflected in the MMO. The old code blindly used pointer size for the MMO size, which appears to have always been incorrect for larger values. It just happened nothing actually relied on the MMOs, so it worked out okay.

For context, I'm planning on removing the MOVolatile flag from these in a future commit, and then removing the MOStore flag from deopt spill slots in a separate one. Doing so is motivated by a small test case where we should be able to better schedule spill slots, but don't do so due to a memory use/def implied by the statepoint.

Differential Revision: https://reviews.llvm.org/D59106

llvm-svn: 355953
2019-03-12 19:12:33 +00:00
Nikita Popov 149bc099f6 [SDAG] Expand pow2 mulo using shifts
Expand MULO with constant power of two operand into a shift. The
overflow is checked with (x << shift) >> shift == x, where the right
shift will be logical for umulo and arithmetic for smulo (with
exception for multiplications by signed_min).

Differential Revision: https://reviews.llvm.org/D59041

llvm-svn: 355937
2019-03-12 16:57:25 +00:00
Simon Pilgrim 9f0a5ca843 [DAGCombine] Pull out repeated demanded bitmask generation. NFCI.
llvm-svn: 355932
2019-03-12 15:58:28 +00:00
Tim Northover 8935aca9c7 CodeGenPrep: preserve inbounds attribute when sinking GEPs.
Targets can potentially emit more efficient code if they know address
computations never overflow. For example ILP32 code on AArch64 (which only has
64-bit address computation) can ignore the possibility of overflow with this
extra information.

llvm-svn: 355926
2019-03-12 15:22:23 +00:00
Eugene Leviant 1e249caaec [CGP] Fix UB when GEP is bound to trivial PHINode
Differential revision: https://reviews.llvm.org/D59140

llvm-svn: 355904
2019-03-12 10:10:29 +00:00
Sanjoy Das 3f5ce18658 Reland "Relax constraints for reduction vectorization"
Change from original commit: move test (that uses an X86 triple) into the X86
subdirectory.

Original description:
Gating vectorizing reductions on *all* fastmath flags seems unnecessary;
`reassoc` should be sufficient.

Reviewers: tvvikram, mkuper, kristof.beyls, sdesmalen, Ayal

Reviewed By: sdesmalen

Subscribers: dcaballe, huntergr, jmolloy, mcrosier, jlebar, bixia, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57728

llvm-svn: 355889
2019-03-12 01:31:44 +00:00
Nathan Lanza cc51dc649a Add Swift enumerator value for CodeView::SourceLanguage
Summary:
Swift now generates PDBs for debugging on Windows. llvm and lldb
need a language enumerator value too properly handle the output
emitted by swiftc.

Subscribers: jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59231

llvm-svn: 355882
2019-03-11 23:27:59 +00:00
Sanjoy Das 2136a5bc49 Revert "Relax constraints for reduction vectorization"
This reverts commit r355868.  Breaks hexagon.

llvm-svn: 355873
2019-03-11 22:37:31 +00:00
Evgeniy Stepanov aedec3f684 Remove ASan asm instrumentation.
Summary: It is incomplete and has no users AFAIK.

Reviewers: pcc, vitalybuka

Subscribers: srhines, kubamracek, mgorny, krytarowski, eraman, hiraditya, jdoerfert, #sanitizers, llvm-commits, thakis

Tags: #sanitizers, #llvm

Differential Revision: https://reviews.llvm.org/D59154

llvm-svn: 355870
2019-03-11 21:50:10 +00:00
Sanjoy Das 93f8cc186a Relax constraints for reduction vectorization
Summary:
Gating vectorizing reductions on *all* fastmath flags seems unnecessary;
`reassoc` should be sufficient.

Reviewers: tvvikram, mkuper, kristof.beyls, sdesmalen, Ayal

Reviewed By: sdesmalen

Subscribers: dcaballe, huntergr, jmolloy, mcrosier, jlebar, bixia, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57728

llvm-svn: 355868
2019-03-11 21:36:41 +00:00
Jessica Paquette 42d16501e6 [GlobalISel][AArch64] Always fall back on aarch64.neon.addp.*
Overloaded intrinsics aren't necessarily safe for instruction selection. One
such intrinsic is aarch64.neon.addp.*.

This is a temporary workaround to ensure that we always fall back on that
intrinsic. Eventually this will be replaced with a proper solution.

https://bugs.llvm.org/show_bug.cgi?id=40968

Differential Revision: https://reviews.llvm.org/D59062

llvm-svn: 355865
2019-03-11 20:51:17 +00:00
Nikita Popov aa7cfa75f9 [SDAG][AArch64] Legalize VECREDUCE
Fixes https://bugs.llvm.org/show_bug.cgi?id=36796.

Implement basic legalizations (PromoteIntRes, PromoteIntOp,
ExpandIntRes, ScalarizeVecOp, WidenVecOp) for VECREDUCE opcodes.
There are more legalizations missing (esp float legalizations),
but there's no way to test them right now, so I'm not adding them.

This also includes a few more changes to make this work somewhat
reasonably:

 * Add support for expanding VECREDUCE in SDAG. Usually
   experimental.vector.reduce is expanded prior to codegen, but if the
   target does have native vector reduce, it may of course still be
   necessary to expand due to legalization issues. This uses a shuffle
   reduction if possible, followed by a naive scalar reduction.
 * Allow the result type of integer VECREDUCE to be larger than the
   vector element type. For example we need to be able to reduce a v8i8
   into an (nominally) i32 result type on AArch64.
 * Use the vector operand type rather than the scalar result type to
   determine the action, so we can control exactly which vector types are
   supported. Also change the legalize vector op code to handle
   operations that only have vector operands, but no vector results, as
   is the case for VECREDUCE.
 * Default VECREDUCE to Expand. On AArch64 (only target using VECREDUCE),
   explicitly specify for which vector types the reductions are supported.

This does not handle anything related to VECREDUCE_STRICT_*.

Differential Revision: https://reviews.llvm.org/D58015

llvm-svn: 355860
2019-03-11 20:22:13 +00:00
Jonas Paulsson 8b8dc50e79 [RegAlloc] Avoid compile time regression with multiple copy hints.
As a fix for https://bugs.llvm.org/show_bug.cgi?id=40986 ("excessive compile
time building opencollada"), this patch makes sure that no phys reg is hinted
more than once from getRegAllocationHints().

This handles the case were many virtual registers are assigned to the same
physreg. The previous compile time fix (r343686) in weightCalcHelper() only
made sure that physical/virtual registers are passed no more than once to
addRegAllocationHint().

Review: Dimitry Andric, Quentin Colombet
https://reviews.llvm.org/D59201

llvm-svn: 355854
2019-03-11 19:00:37 +00:00
Simon Pilgrim f3be93a2ff [DAG] FoldSetCC - reuse valuetype + ensure its simple.
llvm-svn: 355847
2019-03-11 17:56:18 +00:00
Brian Gesiak 4349dc76fa [Utils] Extract EliminateUnreachableBlocks (NFC)
Summary:
Extract the functionality of eliminating unreachable basic blocks
within a function, previously encapsulated within the
-unreachableblockelim pass, and make it available as a function within
BlockUtils.h. No functional change intended other than making the logic
reusable.

Exposing this logic makes it easier to implement
https://reviews.llvm.org/D59068, which fixes coroutines bug
https://bugs.llvm.org/show_bug.cgi?id=40979.

Reviewers: mkazantsev, wmi, davidxl, silvas, davide

Reviewed By: davide

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59069

llvm-svn: 355846
2019-03-11 17:51:57 +00:00
Simon Pilgrim 1bb5b56485 [DAG] Move SetCC NaN handling into FoldSetCC
llvm-svn: 355845
2019-03-11 17:43:10 +00:00
Simon Pilgrim 53518b45a5 [DAG] TargetLowering::SimplifySetCC - call FoldSetCC early to handle constant/commute folds.
Noticed while looking at PR40800 (and also D57921)

llvm-svn: 355828
2019-03-11 15:01:31 +00:00
Sam Parker 52760bf435 [CGP] Limit distance between overflow math and cmp
Inserting an overflowing arithmetic intrinsic can increase register
pressure by producing two values at a point where only one is needed,
while the second use maybe several blocks away. This increase in
pressure is likely to be more detrimental on performance than
rematerialising one of the original instructions.
    
So, check that the arithmetic and compare instructions are no further
apart than their immediate successor/predecessor.

Differential Revision: https://reviews.llvm.org/D59024

llvm-svn: 355823
2019-03-11 13:19:46 +00:00
Benjamin Kramer 6ff32e143a [MIPS GlobalISel] Silence uninitialized variable warning
The control flow here cannot ever use the uninitialized value, but it's
too hard for the compiler to figure that out. Clang warns:

llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp:2600:28: error: variable 'CarrySum' is used uninitialized whenever 'for' loop exits because its condition is false [-Werror,-Wsometimes-uninitialized]
      for (unsigned i = 2; i < Factors.size(); ++i)
                           ^~~~~~~~~~~~~~~~~~
llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp:2604:26: note: uninitialized use occurs here
    CarrySumPrevDstIdx = CarrySum;
                         ^~~~~~~~
llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp:2600:28: note: remove the condition if it is always true
      for (unsigned i = 2; i < Factors.size(); ++i)
                           ^~~~~~~~~~~~~~~~~~
llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp:2583:22: note: initialize the variable 'CarrySum' to silence this warning
    unsigned CarrySum;
                     ^
                      = 0

llvm-svn: 355818
2019-03-11 10:39:15 +00:00
Petar Avramovic 5229f47f9f [MIPS GlobalISel] NarrowScalar G_UMULH
NarrowScalar G_UMULH in LegalizerHelper 
using multiplyRegisters helper function.
NarrowScalar G_UMULH for MIPS32.

Differential Revision: https://reviews.llvm.org/D58825

llvm-svn: 355815
2019-03-11 10:08:44 +00:00
Petar Avramovic 0b17e59b5c [MIPS GlobalISel] NarrowScalar G_MUL
Narrow Scalar G_MUL for MIPS32.
Revisit NarrowScalar implementation in LegalizerHelper.
Introduce new helper function multiplyRegisters.
It performs generic multiplication of values held in multiple registers.
Generated instructions use only types NarrowTy and i1.
Destination can be same or two times size of the source.

Differential Revision: https://reviews.llvm.org/D58824

llvm-svn: 355814
2019-03-11 10:00:17 +00:00
Amaury Sechet a135fd5562 Remove redundant extractBooleanFlip argument. NFC
llvm-svn: 355794
2019-03-11 00:37:01 +00:00
Sanjay Patel 7d8260feb6 [CGP] fix comments; NFC
llvm-svn: 355791
2019-03-10 18:42:30 +00:00
Craig Topper 1a872f2b15 Recommit r355224 "[TableGen][SelectionDAG][X86] Add specific isel matchers for immAllZerosV/immAllOnesV. Remove bitcasts from X86 patterns that are no longer necessary."
Includes a fix to emit a CheckOpcode for build_vector when immAllZerosV/immAllOnesV is used as a pattern root. This means it can't be used to look through bitcasts when used as a root, but that's probably ok. This extra CheckOpcode will ensure that the first match in the isel table will be a SwitchOpcode which is needed by the caching optimization in the ISel Matcher.

Original commit message:

Previously we had build_vector PatFrags that called ISD::isBuildVectorAllZeros/Ones. Internally the ISD::isBuildVectorAllZeros/Ones look through bitcasts, but we aren't able to take advantage of that in isel. Instead of we have to canonicalize the types of the all zeros/ones build_vectors and insert bitcasts. Then we have to pattern match those exact bitcasts.

By emitting specific matchers for these 2 nodes, we can make isel look through any bitcasts without needing to explicitly match them. We should also be able to remove the canonicalization to vXi32 from lowering, but I've left that for a follow up.

This removes something like 40,000 bytes from the X86 isel table.

Differential Revision: https://reviews.llvm.org/D58595

llvm-svn: 355784
2019-03-10 05:21:52 +00:00
Amaury Sechet b62642a115 Refactor isBooleanFlip into extractBooleanFlip so that users do not depend on the patern matched. NFC
llvm-svn: 355769
2019-03-09 02:51:52 +00:00
Craig Topper 69f8c1653d [ScalarizeMaskedMemIntrin] Use IRBuilder functions that take uint32_t/uint64_t for getelementptr, extractelement, and insertelement.
This saves needing to call getInt32 ourselves. Making the code a little shorter.

The test changes are because insert/extract use getInt64 internally. Shouldn't be a functional issue.

This cleanup because I plan to write similar code for expandload/compressstore.

llvm-svn: 355767
2019-03-09 02:08:41 +00:00
Wei Mi 98214347c4 Rename a local variable counter to Counter.
llvm-svn: 355759
2019-03-08 23:32:07 +00:00
Wei Mi fb9693d1c9 [RegisterCoalescer][NFC] bind a DenseMap access to a reference to avoid
repeated lookup operations

llvm-svn: 355757
2019-03-08 23:29:46 +00:00
Craig Topper d84f605910 [ScalarizeMaskedMemIntrin] Only set the ModifiedDT flag if new basic blocks were added.
There are special cases in the scalarization for constant masks. If we hit one of the special cases we don't need to reset the iteration.

Noticed while starting work on adding expandload/compressstore to this pass.

llvm-svn: 355754
2019-03-08 23:03:43 +00:00
Rong Xu ce3be45cac [CodeGenPrepare] Fix ModifiedDT flag in optimizeSelectInst
r44412 fixed a huge compile time regression but it needed ModifiedDT flag to be
maintained correctly in optimizations in optimizeBlock() and optimizeInst().
Function optimizeSelectInst() does not update the flag.
This patch propagates the flag in optimizeSelectInst() back to
optimizeBlock().

This patch also removes ModifiedDT in CodeGenPrepare class (which is not used).
The property of ModifiedDT is now recorded in a ref parameter.

Differential Revision: https://reviews.llvm.org/D59139

llvm-svn: 355751
2019-03-08 22:46:18 +00:00
Matt Arsenault 26e76ef0e2 DAG: Don't try to cluster loads with tied inputs
This avoids breaking possible value dependencies when sorting loads by
offset.

AMDGPU has some load instructions that write into the high or low bits
of the destination register, and have a tied input for the other input
bits. These can easily have the same base pointer, but be a swizzle so
the high address load needs to come first. This was inserting glue
forcing the opposite ordering, producing a cycle the InstrEmitter
would assert on. It may be potentially expensive to look for the
dependency between the other loads, so just skip any where this could
happen.

Fixes bug 40936 by reverting r351379, which added a hacky attempt to
fix this by adding chains in this case, which I think was just working
around broken glue before the InstrEmitter. The core of the patch is
re-implementing the fix for that problem.

llvm-svn: 355728
2019-03-08 20:46:15 +00:00
Amaury Sechet 782ac933b5 [DAGCombiner] fold (add (add (xor a, -1), b), 1) -> (sub b, a)
Summary: This pattern is sometime created after legalization.

Reviewers: efriedma, spatel, RKSimon, zvi, bkramer

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58874

llvm-svn: 355716
2019-03-08 19:39:32 +00:00
Wei Mi 72ec6801b5 [RegisterCoalescer] Limit the number of joins for large live interval with
many valnos.

Recently we found compile time out problem in several cases when
SpeculativeLoadHardening was enabled. The significant compile time was spent
in register coalescing pass, where register coalescer tried to join many other
live intervals with some very large live intervals with many valnos.

Specifically, every time JoinVals::mapValues is called, computeAssignment will
be called by getNumValNums() times of the target live interval. If the large
live interval has N valnos and has N copies associated with it, trying to
coalescing those copies will at least cost N^2 complexity.

The patch adds some limit to the effort trying to join those very large live
intervals with others. By default, for live interval with > 100 valnos, and
when it has been coalesced with other live interval by more than 100 times,
we will stop coalescing for the live interval anymore. That put a compile
time cap for the N^2 algorithm and effectively solves the compile time
problem we saw.

Differential revision: https://reviews.llvm.org/D59143

llvm-svn: 355714
2019-03-08 19:25:32 +00:00
Simon Pilgrim 04e8439f72 [DAGCombine] Merge visitSMULO+visitUMULO into visitMULO. NFCI.
llvm-svn: 355690
2019-03-08 11:41:18 +00:00
Simon Pilgrim c71d6d157f [DAGCombine] Merge visitSADDO+visitUADDO into visitADDO. NFCI.
llvm-svn: 355689
2019-03-08 11:30:33 +00:00