Commit Graph

29920 Commits

Author SHA1 Message Date
Yuanfang Chen 94427af60c Revert "[NewPM][CodeGen] Introduce CodeGenPassBuilder to help build codegen pipeline"
This reverts commit 4646de5d75.

Some bots have build failure.
2020-12-28 17:44:22 -08:00
Yuanfang Chen 4646de5d75 [NewPM][CodeGen] Introduce CodeGenPassBuilder to help build codegen pipeline
Following up on D67687.
Please refer to the RFC here http://lists.llvm.org/pipermail/llvm-dev/2020-July/143309.html

`CodeGenPassBuilder` is the NPM counterpart of `TargetPassConfig` with below differences.
- Debugging features (MIR print/verify, disable pass, start/stop-before/after, etc.) living in `TargetPassConfig` are moved to use PassInstrument as much as possible. (Implementation also lives in `TargetPassConfig.cpp`)
- `TargetPassConfig` is a polymorphic base (virtual inheritance) to build the target-dependent pipeline whereas `CodeGenPassBuilder` is the CRTP base/helper to implement the target-dependent pipeline. The motivation is flexibility for targets to customize the pipeline, inlining opportunity, and fits the overall NPM value semantics design.
- `TargetPassConfig` is a legacy immutable pass to declare hooks for targets to customize some target-independent codegen layer behavior. This is partially ported to TargetMachine::options. The rest, such as `createMachineScheduler/createPostMachineScheduler`, are left out for now. They should be implemented in LLVMTargetMachine in the future.

Reviewed By: arsenm, aeubanks

Differential Revision: https://reviews.llvm.org/D83608
2020-12-28 17:36:36 -08:00
Gabriel Hjort Åkerlund b9a7c89d43 [MIRPrinter] Fix incorrect output of unnamed stack names
The MIRParser expects unnamed stack entries to have empty names ('').
In case of unnamed alloca instructions, the MIRPrinter would output
'<unnamed alloca>', which caused the MIRParser to reject the generated
code.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D93685
2020-12-28 18:01:40 +01:00
Chen Zheng 31c2b93d83 [MachineSink] add threshold in machinesink pass to reduce compiling time. 2020-12-27 23:23:07 -05:00
Kazu Hirata 789d250613 [CodeGen, Transforms] Use *Map::lookup (NFC) 2020-12-27 09:57:27 -08:00
Amara Emerson 7df3544e80 [GlobalISel] Fix assertion failures after "GlobalISel: Return APInt from getConstantVRegVal" landed.
APInt binary ops don't promote types but instead assert, which a combine was
relying on.
2020-12-26 23:51:44 -08:00
Kazu Hirata e457896a6e [CodeGen] Remove unused function hasInlineAsmMemConstraint (NFC)
The last use of the function was removed on Sep 13, 2010 in commit
1094c80281.
2020-12-24 09:17:58 -08:00
Kazu Hirata df812115e3 [CodeGen, Transforms] Use llvm::any_of (NFC) 2020-12-24 09:08:36 -08:00
Evgeniy Brevnov e0751234ef [CodeGen] Add "noreturn" attirbute to _Unwind_Resume
Currently 'resume' is lowered to _Unwind_Resume with out "noreturn" attribute. Semantically _Unwind_Resume  library call is expected to never return and should be marked as such. Though I didn't find any changes in behavior of existing tests there will be a difference once https://reviews.llvm.org/D79485 lands.

I was not able to come up with the test case anything better than just checking for presence of "noreturn" attribute. Please let me know if there is a better way to test the change.

Reviewed By: xbolva00

Differential Revision: https://reviews.llvm.org/D93682
2020-12-24 18:14:18 +07:00
Layton Kifer d29f93bda5 [DAGCombiner] Don't create sexts of deleted xors when they were in-visit replaced
Fixes a bug introduced by D91589.

When folding `(sext (not i1 x)) -> (add (zext i1 x), -1)`, we try to replace the not first when possible. If we replace the not in-visit, then the now invalidated node will be returned, and subsequently we will return an invalid sext. In cases where the not is replaced in-visit we can simply return SDValue, as the not in the current sext should have already been replaced.

Thanks @jgorbe, for finding the below reproducer.

The following reduced test case crashes clang when built with `clang -O1 -frounding-math`:

```
template <class> class a {
  int b() { return c == 0.0 ? 0 : -1; }
  int c;
};
template class a<long>;
```

A debug build of clang produces this "assertion failed" error:
```
clang: /home/jgorbe/code/llvm/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp:264: void {anonymous}::DAGCombiner::AddToWorklist(llvm::
SDNode*): Assertion `N->getOpcode() != ISD::DELETED_NODE && "Deleted Node added to Worklist"' failed.
```

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D93274
2020-12-23 16:16:26 -08:00
Sriraman Tallam 34e70d722d Append ".__part." to every basic block section symbol.
Every basic block section symbol created by -fbasic-block-sections will contain
".__part." to know that this symbol corresponds to a basic block fragment of
the function.

This patch solves two problems:

a) Like D89617, we want function symbols with suffixes to be properly qualified
   so that external tools like profile aggregators know exactly what this
   symbol corresponds to.
b) The current basic block naming just adds a ".N" to the symbol name where N is
   some integer. This collides with how clang creates __cxx_global_var_init.N.
   clang creates these symbol names to call constructor functions and basic
   block symbol naming should not use the same style.

Fixed all the test cases and added an extra test for __cxx_global_var_init
breakage.

Differential Revision: https://reviews.llvm.org/D93082
2020-12-23 11:35:44 -08:00
Matt Arsenault 581d13f8ae GlobalISel: Return APInt from getConstantVRegVal
Returning int64_t was arbitrarily limiting for wide integer types, and
the functions should handle the full generality of the IR.

Also changes the full form which returns the originally defined
vreg. Add another wrapper for the common case of just immediately
converting to int64_t (arguably this would be useful for the full
return value case as well).

One possible issue with this change is some of the existing uses did
break without conversion to getConstantVRegSExtVal, and it's possible
some without adequate test coverage are now broken.
2020-12-22 22:23:58 -05:00
Matt Arsenault 5bec082834 VirtRegMap: Use Register 2020-12-22 20:56:14 -05:00
Arthur O'Dwyer 22cf54a7fb Replace `T(x)` with `reinterpret_cast<T>(x)` everywhere it means reinterpret_cast. NFC.
Differential Revision: https://reviews.llvm.org/D76572
2020-12-22 19:54:29 -05:00
Sjoerd Meijer 9a6de74d5a [MachineLICM] Add llvm debug messages to SinkIntoLoop. NFC.
I am investigating sinking instructions back into the loop under high
register pressure. This is just a first NFC step to add some debug
messages that allows tracing of the decision making.
2020-12-22 09:19:43 +00:00
Pavel Labath 8d75d902a9 [DebugInfo] Don't use DW_OP_implicit_value for fragments
Currently using DW_OP_implicit_value in fragments produces invalid DWARF
expressions. (Such a case can occur in complex floats, for example.)

This problem manifests itself as a missing DW_OP_piece operation after
the last fragment. This happens because the function for printing
constant float value skips printing the accompanying DWARF expression,
as that would also print DW_OP_stack_value (which is not desirable in
this case). However, this also results in DW_OP_piece being skipped.

The reason that DW_OP_piece is missing only for the last piece is that
the act of printing the next fragment corrects this. However, it does
that for the wrong reason -- the code emitting this DW_OP_piece thinks
that the previous fragment was missing, and so it thinks that it needs
to skip over it in order to be able to print itself.

In a simple scenario this works out, but it's likely that in a more
complex setup (where some pieces are in fact missing), this logic would
go badly wrong. In a simple setup gdb also seems to not mind the fact
that the DW_OP_piece is missing, but it would also likely not handle
more complex use cases.

For this reason, this patch disables the usage of DW_OP_implicit_value
in the frament scenario (we will use DW_OP_const*** instead), until we
figure out the right way to deal with this. This guarantees that we
produce valid expressions, and gdb can handle both kinds of inputs
anyway.

Differential Revision: https://reviews.llvm.org/D92013
2020-12-22 10:07:47 +01:00
Bing1 Yu e8ade4569b [LegalizeType] When LegalizeType procedure widens a masked_gather, set MemoryType's EltNum equal to Result's EltNum
When LegalizeType procedure widens a masked_gather, set MemoryType's EltNum equal to Result's EltNum.

As I mentioned in https://reviews.llvm.org/D91092, in previous code, If we have a v17i32's masked_gather in avx512, we widen it to a v32i32's masked_gather with a v17i32's MemoryType. When the SplitVecRes_MGATHER process this v32i32's masked_gather, GetSplitDestVTs will assert fail since what you are going to split is v17i32.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D93610
2020-12-22 13:27:38 +08:00
Fangrui Song d9a0c40bce [MC] Split MCContext::createTempSymbol, default AlwaysAddSuffix to true, and add comments
CanBeUnnamed is rarely false. Splitting to a createNamedTempSymbol makes the
intention clearer and matches the direction of reverted r240130 (to drop the
unneeded parameters).

No behavior change.
2020-12-21 14:04:13 -08:00
Denis Antrushin 6f45049fb6 [Statepoints] Disable VReg lowering for values used on exception path of invoke.
Currently we lower invokes the same way as usual calls, e.g.:

V1 = STATEPOINT ... V (tied-def 0)

But this is incorrect is V1 is used on exceptional path.
By LLVM rules V1 neither dominates its uses in landing pad, nor
its live range is live on entry to landing pad. So compiler is
allowed to do various weird transformations like splitting live
range after statepoint and use split LR in catch block.

Until (and if) we find better solution to this problem, let's
use old lowering (spilling) for those values which are used on
exceptional path and allow VReg lowering for values used only
on normal path.

Differential Revision: https://reviews.llvm.org/D93449
2020-12-21 20:27:05 +07:00
Fangrui Song 1635dea266 [AsmPrinter] Replace a reachable report_fatal_error with MCContext::reportError 2020-12-20 23:45:49 -08:00
Pushpinder Singh e2303a448e [FastRA] Fix handling of bundled MIs
Fast register allocator skips bundled MIs, as the main assignment
loop uses MachineBasicBlock::iterator (= MachineInstrBundleIterator)
This was causing SIInsertWaitcnts to crash which expects all
instructions to have registers assigned.

This patch makes sure to set everything inside bundle to the same
assignments done on BUNDLE header.

Reviewed By: qcolombet

Differential Revision: https://reviews.llvm.org/D90369
2020-12-21 02:10:55 -05:00
Chen Zheng 4dce7c2e20 [MachineLICM] delete dead flag if the duplicated def outside of loop is dead.
Fixup dead flags for CSE-ed instructions.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D92557
2020-12-20 19:26:22 -05:00
Kazu Hirata 3285ee143b [Analysis, IR, CodeGen] Use llvm::erase_if (NFC) 2020-12-20 09:19:35 -08:00
Kazu Hirata 805d59593f [Analysis, CodeGen, IR] Use contains (NFC) 2020-12-18 19:08:17 -08:00
Chih-Ping Chen 5f75dcf571 [DebugInfo] Support Fortran 'use <external module>' statement.
The main change is to add a 'IsDecl' field to DIModule so
that when IsDecl is set to true, the debug info entry generated
for the module would be marked as a declaration. That way, the debugger
would look up the definition of the module in the gloabl scope.

Please see the comments in llvm/test/DebugInfo/X86/dimodule.ll
for what the debug info entries would look like.

Differential Revision: https://reviews.llvm.org/D93462
2020-12-18 13:10:57 -05:00
Craig Blackmore 698ae90f30 [RegisterScavenging] Fix assert in scavengeRegisterBackwards
According to the documentation, if a spill is required to make a
register available and AllowSpill is false, then NoRegister should be
returned, however, this scenario was actually triggering an assertion
failure.

This patch moves the assertion after the handling of AllowSpill.

Authored by: Lewis Revill

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D92104
2020-12-18 16:57:05 +00:00
Matt Arsenault fd0f5fb8de PEI: Only call updateLiveness once per function
This only needs to be called once for the function, and it visits all
the necessary blocks in the function. It looks like
631f6b888c accidentally moved this into
the loop over all save blocks.
2020-12-18 11:02:28 -05:00
Bjorn Pettersson a89d751fb4 Add intrinsics for saturating float to int casts
This patch adds support for the fptoui.sat and fptosi.sat intrinsics,
which provide basically the same functionality as the existing fptoui
and fptosi instructions, but will saturate (or return 0 for NaN) on
values unrepresentable in the target type, instead of returning
poison. Related mailing list discussion can be found at:
https://groups.google.com/d/msg/llvm-dev/cgDFaBmCnDQ/CZAIMj4IBAAJ

The intrinsics have overloaded source and result type and support
vector operands:

    i32 @llvm.fptoui.sat.i32.f32(float %f)
    i100 @llvm.fptoui.sat.i100.f64(double %f)
    <4 x i32> @llvm.fptoui.sat.v4i32.v4f16(half %f)
    // etc

On the SelectionDAG layer two new ISD opcodes are added,
FP_TO_UINT_SAT and FP_TO_SINT_SAT. These opcodes have two operands
and one result. The second operand is an integer constant specifying
the scalar saturation width. The idea here is that initially the
second operand and the scalar width of the result type are the same,
but they may change during type legalization. For example:

    i19 @llvm.fptsi.sat.i19.f32(float %f)
    // builds
    i19 fp_to_sint_sat f, 19
    // type legalizes (through integer result promotion)
    i32 fp_to_sint_sat f, 19

I went for this approach, because saturated conversion does not
compose well. There is no good way of "adjusting" a saturating
conversion to i32 into one to i19 short of saturating twice.
Specifying the saturation width separately allows directly saturating
to the correct width.

There are two baseline expansions for the fp_to_xint_sat opcodes. If
the integer bounds can be exactly represented in the float type and
fminnum/fmaxnum are legal, we can expand to something like:

    f = fmaxnum f, FP(MIN)
    f = fminnum f, FP(MAX)
    i = fptoxi f
    i = select f uo f, 0, i # unnecessary if unsigned as 0 = MIN

If the bounds cannot be exactly represented, we expand to something
like this instead:

    i = fptoxi f
    i = select f ult FP(MIN), MIN, i
    i = select f ogt FP(MAX), MAX, i
    i = select f uo f, 0, i # unnecessary if unsigned as 0 = MIN

It should be noted that this expansion assumes a non-trapping fptoxi.

Initial tests are for AArch64, x86_64 and ARM. This exercises all of
the scalar and vector legalization. ARM is included to test float
softening.

Original patch by @nikic and @ebevhan (based on D54696).

Differential Revision: https://reviews.llvm.org/D54749
2020-12-18 11:09:41 +01:00
Rong Xu 3733463dbb [IR][PGO] Add hot func attribute and use hot/cold attribute in func section
Clang FE currently has hot/cold function attribute. But we only have
cold function attribute in LLVM IR.

This patch adds support of hot function attribute to LLVM IR.  This
attribute will be used in setting function section prefix/suffix.
Currently .hot and .unlikely suffix only are added in PGO (Sample PGO)
compilation (through isFunctionHotInCallGraph and
isFunctionColdInCallGraph).

This patch changes the behavior. The new behavior is:
(1) If the user annotates a function as hot or isFunctionHotInCallGraph
    is true, this function will be marked as hot. Otherwise,
(2) If the user annotates a function as cold or
    isFunctionColdInCallGraph is true, this function will be marked as
    cold.

The changes are:
(1) user annotated function attribute will used in setting function
    section prefix/suffix.
(2) hot attribute overwrites profile count based hotness.
(3) profile count based hotness overwrite user annotated cold attribute.

The intention for these changes is to provide the user a way to mark
certain function as hot in cases where training input is hard to cover
all the hot functions.

Differential Revision: https://reviews.llvm.org/D92493
2020-12-17 18:41:12 -08:00
Layton Kifer 385e9a2a04 [DAGCombiner] Improve shift by select of constant
Clean up a TODO, to support folding a shift of a constant by a
select of constants, on targets with different shift operand sizes.

Reviewed By: RKSimon, lebedev.ri

Differential Revision: https://reviews.llvm.org/D90349
2020-12-18 02:21:42 +00:00
dfukalov 9ed8e0caab [NFC] Reduce include files dependency and AA header cleanup (part 2).
Continuing work started in https://reviews.llvm.org/D92489:

Removed a bunch of includes from "AliasAnalysis.h" and "LoopPassManager.h".

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D92852
2020-12-17 14:04:48 +03:00
Krasimir Georgiev e71a4cc207 fix a -Wunused-variable warning in release build 2020-12-17 11:52:00 +01:00
Barry Revzin 92310454bf Make LLVM build in C++20 mode
Part of the <=> changes in C++20 make certain patterns of writing equality
operators ambiguous with themselves (sorry!).
This patch goes through and adjusts all the comparison operators such that
they should work in both C++17 and C++20 modes. It also makes two other small
C++20-specific changes (adding a constructor to a type that cases to be an
aggregate, and adding casts from u8 literals which no longer have type
const char*).

There were four categories of errors that this review fixes.
Here are canonical examples of them, ordered from most to least common:

// 1) Missing const
namespace missing_const {
    struct A {
    #ifndef FIXED
        bool operator==(A const&);
    #else
        bool operator==(A const&) const;
    #endif
    };

    bool a = A{} == A{}; // error
}

// 2) Type mismatch on CRTP
namespace crtp_mismatch {
    template <typename Derived>
    struct Base {
    #ifndef FIXED
        bool operator==(Derived const&) const;
    #else
        // in one case changed to taking Base const&
        friend bool operator==(Derived const&, Derived const&);
    #endif
    };

    struct D : Base<D> { };

    bool b = D{} == D{}; // error
}

// 3) iterator/const_iterator with only mixed comparison
namespace iter_const_iter {
    template <bool Const>
    struct iterator {
        using const_iterator = iterator<true>;

        iterator();

        template <bool B, std::enable_if_t<(Const && !B), int> = 0>
        iterator(iterator<B> const&);

    #ifndef FIXED
        bool operator==(const_iterator const&) const;
    #else
        friend bool operator==(iterator const&, iterator const&);
    #endif
    };

    bool c = iterator<false>{} == iterator<false>{} // error
          || iterator<false>{} == iterator<true>{}
          || iterator<true>{} == iterator<false>{}
          || iterator<true>{} == iterator<true>{};
}

// 4) Same-type comparison but only have mixed-type operator
namespace ambiguous_choice {
    enum Color { Red };

    struct C {
        C();
        C(Color);
        operator Color() const;
        bool operator==(Color) const;
        friend bool operator==(C, C);
    };

    bool c = C{} == C{}; // error
    bool d = C{} == Red;
}

Differential revision: https://reviews.llvm.org/D78938
2020-12-17 10:44:10 +00:00
Simon Pilgrim cdb692ee0c [X86] Add X86ISD::SUBV_BROADCAST_LOAD and begin removing X86ISD::SUBV_BROADCAST (PR38969)
Subvector broadcasts are only load instructions, yet X86ISD::SUBV_BROADCAST treats them more generally, requiring a lot of fallback tablegen patterns.

This initial patch replaces constant vector lowering inside lowerBuildVectorAsBroadcast with direct X86ISD::SUBV_BROADCAST_LOAD loads which helps us merge a number of equivalent loads/broadcasts.

As well as general plumbing/analysis additions for SUBV_BROADCAST_LOAD, I needed to wrap SelectionDAG::makeEquivalentMemoryOrdering so it can handle result chains from non generic LoadSDNode nodes.

Later patches will continue to replace X86ISD::SUBV_BROADCAST usage.

Differential Revision: https://reviews.llvm.org/D92645
2020-12-17 10:25:25 +00:00
QingShan Zhang ebdd20f430 Expand the fp_to_int/int_to_fp/fp_round/fp_extend as libcall for fp128
X86 and AArch64 expand it as libcall inside the target. And PowerPC also
want to expand them as libcall for P8. So, propose an implement in the
legalizer to common the logic and remove the code for X86/AArch64 to
avoid the duplicate code.

Reviewed By: Craig Topper

Differential Revision: https://reviews.llvm.org/D91331
2020-12-17 07:59:30 +00:00
Fangrui Song c70f36865e Use basic_string::find(char) instead of basic_string::find(const char *s, size_type pos=0)
Many (StringRef) cannot be detected by clang-tidy performance-faster-string-find.
2020-12-16 23:28:32 -08:00
Xiang1 Zhang 39584ae5b5 [Debugify] Support checking Machine IR debug info
Add mir-check-debug pass to check MIR-level debug info.

For IR-level, currently, LLVM have debugify + check-debugify to generate
and check debug IR. Much like the IR-level pass debugify, mir-debugify
inserts sequentially increasing line locations to each MachineInstr in a
Module, But there is no equivalent MIR-level check-debugify pass, So now
we support it at "mir-check-debug".

Reviewed By: djtodoro

Differential Revision: https://reviews.llvm.org/D91595
2020-12-16 22:17:25 -08:00
Kazu Hirata 5501b92957 [IR, CodeGen] Use llvm::is_contained (NFC) 2020-12-16 21:30:44 -08:00
Xiang1 Zhang 1e42ad9d62 Revert "[Debugify] Support checking Machine IR debug info"
This reverts commit 50aaa8c274.
2020-12-16 20:12:33 -08:00
Xiang1 Zhang 50aaa8c274 [Debugify] Support checking Machine IR debug info
Add mir-check-debug pass to check MIR-level debug info.

For IR-level, currently, LLVM have debugify + check-debugify to generate
and check debug IR. Much like the IR-level pass debugify, mir-debugify
inserts sequentially increasing line locations to each MachineInstr in a
Module, But there is no equivalent MIR-level check-debugify pass, So now
we support it at "mir-check-debug".

Reviewed By: djtodoro

Differential Revision: https://reviews.llvm.org/D91595
2020-12-16 18:04:05 -08:00
Guozhi Wei 687e80be7f [MBP] Add whole chain to BlockFilterSet instead of individual BB
Currently we add individual BB to BlockFilterSet if its frequency satisfies

  LoopFreq / Freq <= LoopToColdBlockRatio

LoopFreq is edge frequency from outside to loop header.
LoopToColdBlockRatio is a command line parameter.

It doesn't make sense since we always layout whole chain, not individual BBs.

It may also cause a tricky problem. Sometimes it is possible that the LoopFreq
of an inner loop is smaller than LoopFreq of outer loop. So a BB can be in
BlockFilterSet of inner loop, but not in BlockFilterSet of outer loop,
like .cold in the test case. So it is added to the chain of inner loop. When
work on the outer loop, .cold is not added to BlockFilterSet, so the edge to
successor .problem is not counted in UnscheduledPredecessors of .problem chain.
But other blocks in the inner loop are added BlockFilterSet, so the whole inner
loop chain can be layout, and markChainSuccessors is called to decrease
UnscheduledPredecessors of following chains. markChainSuccessors calls
markBlockSuccessors for every BB, even it is not in BlockFilterSet, like .cold,
so .problem chain's UnscheduledPredecessors is decreased, but this edge was not
counted on in fillWorkLists, so .problem chain's UnscheduledPredecessors
becomes 0 when it still has an unscheduled predecessor .pred! And it causes
problems in following various successor BB selection algorithms.

Differential Revision: https://reviews.llvm.org/D89088
2020-12-16 15:45:47 -08:00
Bardia Mahjour 6eff12788e [DDG] Data Dependence Graph - DOT printer - recommit
This is being recommitted to try and address the MSVC complaint.

This patch implements a DDG printer pass that generates a graph in
the DOT description language, providing a more visually appealing
representation of the DDG. Similar to the CFG DOT printer, this
functionality is provided under an option called -dot-ddg and can
be generated in a less verbose mode under -dot-ddg-only option.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D90159
2020-12-16 12:37:36 -05:00
diggerlin a1e1dcabe4 [XCOFF][AIX] Emit EH information in traceback table
SUMMARY:

In order for the runtime on AIX to find the compact unwind section(EHInfo table),
we would need to set the following on the traceback table:

The 6th byte's longtbtable field to true to signal there is an Extended TB Table Flag.
The Extended TB Table Flag to be 0x08 to signal there is an exception handling info presents.
Emit the offset between ehinfo TC entry and TOC base after all other optional portions of traceback table.

The patch is authored by Jason Liu.

Reviewers: David Tenty, Digger Lin
Differential Revision: https://reviews.llvm.org/D92766
2020-12-16 09:34:59 -05:00
Matt Arsenault 60eba8161b RegisterCoalescer: Remove phi-only subranges when erasing identity copies
Undef subranges are not present in the live range values, except when
they cross block boundaries. In this situation, a identity copy is
inside a loop, and one of the lanes is undefined. It only appears
alive inside the loop due to the copy. Once the copy was erased, it
would leave behind a segment inside the loop body with no
corresponding def anywhere in the program.

When RenameIndependentSubregs processed this dummy interval, it would
introduce a "Multiple connected components in live interval" verifier
error when IMPLICIT_DEFs were added to the other two blocks. I believe
there is a missing verifier check for this type of dummy interval.

I have found additional cases from the same fundamental problem in
other areas I haven't managed to fix yet (e.g. the commented out
prune_subrange_phi_value_* cases).
2020-12-15 17:36:32 -05:00
Fangrui Song 8c4e55762d [docs][unittest][Go][StackProtector] Migrate deprecated DebugInfo::get to DILocation::get 2020-12-15 14:17:04 -08:00
Matt Arsenault e7e7d371fd GlobalISel: Fix generic handling of single outgoing call arguments
Simply call the argument handler like is done for the incoming
case. This will allow removal of hacks in the AMDGPU call lowering in
a future change.
2020-12-15 17:00:27 -05:00
Paul Walker 6d35bd1d48 [CodeGenPrepare] Update optimizeGatherScatterInst for scalable vectors.
optimizeGatherScatterInst does nothing specific to fixed length vectors
but uses FixedVectorType to extract the number of elements.  This patch
simply updates the code to use VectorType and getElementCount instead.

For testing I just copied Transforms/CodeGenPrepare/X86/gather-scatter-opt.ll
replacing `<4 x ` with `<vscale x 4`.

Differential Revision: https://reviews.llvm.org/D92572
2020-12-15 10:57:51 +00:00
Amara Emerson a69b76c500 [GlobalISel][IRTranslator] Ensure branch probabilities are added when translating invoke edges.
This uses a straightforward port of findUnwindDestinations() from SelectionDAG.

Differential Revision: https://reviews.llvm.org/D93256
2020-12-14 23:36:54 -08:00
Nico Weber a852ee199c Reland "[MachineDebugify] Insert synthetic DBG_VALUE instructions"
This reverts commit 841f9c937f.
The change landed many months ago; something else broke those tests.
2020-12-14 22:34:23 -05:00
Nico Weber 841f9c937f Revert "[MachineDebugify] Insert synthetic DBG_VALUE instructions"
This reverts commit 2a5675f11d.
The tests it adds fail: https://reviews.llvm.org/D78135#2453736
2020-12-14 22:14:48 -05:00