I suspect this buildbot has slow-incdec set by default, most likely due to
the default CPU having this set. This feature bit can prevent optsize from
having an effect on this IR.
llvm-svn: 303720
This patch adds missing scheds for Neon VLDx/VSTx instructions.
This will help one write schedulers easier/faster in the future for ARM sub-targets.
Existing models will not affected by this patch.
Reviewed by: Renato Golin, Diana Picus
Differential Revision: https://reviews.llvm.org/D33120
llvm-svn: 303717
Otherwise we don't revisit an instruction that could be simplified,
and when we verify, we discover there's something that changed, i.e.
what we had wasn't a maximal fixpoint.
Fixes PR32836.
llvm-svn: 303715
LazyRandomTypeCollection is designed for random access, and in
order to provide this it lazily indexes ranges of types. In the
case of types from an object file, there is no partial index
to build off of, so it has to index the full stream up front.
However, merging types only requires sequential access, and when
that is needed, this extra work is simply wasted. Changing the
algorithm to work on sequential arrays of types rather than
random access type collections eliminates this up front scan.
llvm-svn: 303707
Instead of using the SCCP homegrown one. We should eventually
make the private SCCP version disappear, but that wont' be today.
PR33143 tracks this issue.
Add braces for consistency while here. No functional change intended.
llvm-svn: 303706
Coverage instrumentation has an optimization not to instrument extra
blocks, if the pass is already "accounted for" by a
successor/predecessor basic block.
However (https://github.com/google/sanitizers/issues/783) this
reasoning may become circular, which stops valid paths from having
coverage.
In the worst case this can cause fuzzing to stop working entirely.
This change simplifies logic to something which trivially can not have
such circular reasoning, as losing valid paths does not seem like a
good trade-off for a ~15% decrease in the # of instrumented basic blocks.
llvm-svn: 303698
The error message that git-llvm script prints out when svn is missing
is very cryptic. I spent a fair amount of time to find what was wrong
with my environment. It looks like many newcomers also exprienced a
hard time to submit their first patches due to this error.
This patch adds a more user-friendly error message.
Differential Revision: https://reviews.llvm.org/D33458
llvm-svn: 303696
C++14 added user-defined literal support for complex numbers so that you can
write something like "complex<double> val = 2i". However, there is an existing
GNU extension supporting this syntax and interpreting the result as a _Complex
type.
This changes parsing so that such literals are interpreted in terms of C++14's
operators if an overload is present but otherwise falls back to the original
GNU extension.
llvm-svn: 303694
This fixes 17 of the 41 -verify-machineinstrs test failures identified in PR33045
Differential Revision: https://reviews.llvm.org/D33451
llvm-svn: 303691
Summary:
Promoting Alloca to Vector and Promoting Alloca to LDS are two independent handling of Alloca and should not affect each other.
As a result, we should not give up promoting to vector if there is not enough LDS. This patch factors out the local memory usage
related checking out and replace it after the calling convention checking.
Reviewer:
arsenm
Differential Revision:
http://reviews.llvm.org/D33139
llvm-svn: 303684
Perform DAG combine:
and (srl x, c), mask => shl (bfe x, nb + c, mask >> nb), nb
Where nb is a number of trailing zeroes in mask.
It replaces two instructions with two and BFE is generally a more
expensive one. However this is only done if we are selecting a byte
or word at an aligned boundary which results in a proper SDWA
operand pattern. It is only done if SDWA is supported.
TODO: improve SDWA pass to actually convert this pattern. It is not
done now because we have an immediate in the instruction, which has
be moved into a VGPR.
Differential Revision: https://reviews.llvm.org/D33455
llvm-svn: 303681
Summary:
A temporary workaround for PR32780 - rematerialized instructions accessing the same promoted global through different constant pool entries.
The patch turns off the globals promotion optimization leaving all its code in place, so that it can be easily turned on once PR32780 is fixed.
Since this is a miscompilation issue causing generation of misbehaving code, and the problem is very subtle, the patch might be valuable enough to get into 4.0.1.
Reviewers: efriedma, jmolloy
Reviewed By: efriedma
Subscribers: aemerson, javed.absar, llvm-commits, rengolin, asl, tstellar
Differential Revision: https://reviews.llvm.org/D33446
llvm-svn: 303679
Summary:
It's rare but a small number of patterns use IntInit's at the root of the match.
On X86, one such rule is enabled by the OptForSize predicate and causes the
compiler to use the smaller:
%0 = MOV32r1
instead of the usual:
%0 = MOV32ri 1
This patch adds support for matching IntInit's at the root and uses this as a
test case for the optsize attribute that was implemented in r301750
Reviewers: qcolombet, ab, t.p.northover, rovka, kristof.beyls, aditya_nandakumar
Reviewed By: qcolombet
Subscribers: igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D32791
llvm-svn: 303678
When writing field list records, we would construct a temporary
type serializer that shared a bump ptr allocator with the rest
of the application, so anything allocated from here would live
forever. Furthermore, this temporary serializer had all the
properties of a full blown serializer including record hashing
and de-duplication.
These features are required when you're merging multiple type
streams into each other, because different streams may contain
identical records, but records from the same type stream will
never collide with each other. So all of this hashing was
unnecessary.
To solve this, two fixes are made:
1) The temporary serializer keeps its own bump ptr allocator
instead of sharing a global one. When it's finished, all of
its memory is freed.
2) Instead of using the same temporary serializer for the life
of an entire type stream, we use it only for the life of a single
field list record and delete it when the field list record is
completed. This way the hash table will not grow as other
records from the same type stream get inserted. Further improvements
could eliminate hashing entirely from this codepath.
This reduces the link time by 85% in my test, from 1 minute to 9
seconds.
llvm-svn: 303676
Summary:
This is a first patch for GSoC project, bash-completion for clang.
To use this on bash, please run `source clang/utils/bash-autocomplete.sh`.
bash-autocomplete.sh is code for bash-completion.
Simple flag completion and path completion is available in this patch.
Reviewers: teemperor, v.g.vassilev, ruiu, Bigcheese, efriedma
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33237
llvm-svn: 303670
Summary:
First, StringMap uses llvm::HashString, which is only good for short
identifiers and really bad for large blobs of binary data like type
records. Moving to `DenseMap<StringRef, TypeIndex>` with some tricks for
memory allocation fixes that.
Unfortunately, that didn't buy very much performance. Profiling showed
that we spend a long time during DenseMap growth rehashing existing
entries. Also, in general, DenseMap is faster when the keys are small.
This change takes that to the logical conclusion by introducing a small
wrapper value type around a pointer to key data. The key data contains a
precomputed hash, the original record data (pointer and size), and the
type index, which is the "value" of our original map.
This reduces the time to produce llvm-as.exe and llvm-as.pdb from ~15s
on my machine to 3.5s, which is about a 4x improvement.
Reviewers: zturner, inglorion, ruiu
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33428
llvm-svn: 303665
This is just a cleanup. Also, it adds checking that ByteCount is aligned to 4.
Reviewers: arsenm, nhaehnle, tstellarAMD
Subscribers: kzhuravl, wdng, yaxunl, tony-tye
Differential Revision: https://reviews.llvm.org/D28994
llvm-svn: 303658
Actually, to identify external symbols, we need to check for
*either* non-null Value.SymbolName *or* a SymType of
Symbol::ST_Unknown.
The former may happen for symbols not known to the JIT at all
(e.g. defined in a native library), while the latter happens
for symbols known to the JIT, but defined in a different module.
Fixed several regressions on big-endian ppc64.
llvm-svn: 303655
Summary:
Before this change, AttributeLists stored a pair of index and
AttributeSet. This is memory efficient if most arguments do not have
attributes. However, it requires doing a search over the pairs to test
an argument or function attribute. Profiling shows that this loop was
0.76% of the time in 'opt -O2' of sqlite3.c, because LLVM constantly
tests values for nullability.
This was worth about 2.5% of mid-level optimization cycles on the
sqlite3 amalgamation. Here are the full perf results:
https://reviews.llvm.org/P7995
Here are just the before and after cycle counts:
```
$ perf stat -r 5 ./opt_before -O2 sqlite3.bc -o /dev/null
13,274,181,184 cycles # 3.047 GHz ( +- 0.28% )
$ perf stat -r 5 ./opt_after -O2 sqlite3.bc -o /dev/null
12,906,927,263 cycles # 3.043 GHz ( +- 0.51% )
```
This patch *does not* change the indices used to query attributes, as
requested by reviewers. Tracking whether an index is usable for array
indexing is a huge pain that affects many of the internal APIs, so it
would be good to come back later and do a cleanup to remove this
internal adjustment.
Reviewers: pete, chandlerc
Subscribers: javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D32819
llvm-svn: 303654
Also, rename the tests and the file, add comments, and add more tests
because there are no existing tests for some of these folds.
These patterns are particularly important for crippled vector ISAs that
have limited compare predicates (PR33138).
llvm-svn: 303652
shl (or|add x, c2), c1 => or|add (shl x, c1), (c2 << c1)
This allows to fold a constant into an address in some cases as
well as to eliminate second shift if the expression is used as
an address and second shift is a result of a GEP.
Differential Revision: https://reviews.llvm.org/D33432
llvm-svn: 303641
The PowerPC part of processRelocationRef currently assumes that external
symbols can be identified by checking for SymType == SymbolRef::ST_Unknown.
This is actually incorrect in some cases, causing relocation overflows to
be mis-detected. The correct check is to test whether Value.SymbolName
is null.
Includes test case. Note that it is a bit tricky to replicate the exact
condition that triggers the bug in a test case. The one included here
seems to fail reliably (before the fix) across different operating
system versions on Power, but it still makes a few assumptions (called
out in the test case comments).
Also add ppc64le platform name to the supported list in the lit.local.cfg
files for the MCJIT and OrcMCJIT directories, since those tests were
currently not run at all.
Fixes PR32650.
Reviewer: hfinkel
Differential Revision: https://reviews.llvm.org/D33402
llvm-svn: 303637
This patch builds over https://reviews.llvm.org/rL303349 and replaces
the use of the condition only if it is safe to do so.
We should not blindly RAUW the condition if experimental.guard or assume
is a use of that
condition. This is because LVI may have used the guard/assume to
identify the
value of the condition, and RUAWing will fold the guard/assume and uses
before the guards/assumes.
Reviewers: sanjoy, reames, trentxintong, mkazantsev
Reviewed by: sanjoy, reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33257
llvm-svn: 303633
Code in RuntimeDyldELF currently uses 32-bit temporaries to detect
whether a PPC64 relocation target is out of range. This is incorrect,
and can mis-detect overflow where the distance between relocation site
and target is close to a multiple of 4GB. Fixed by using 64-bit
temporaries.
Noticed while debugging PR32650.
Reviewer: hfinkel
Differential Revision: https://reviews.llvm.org/D33403
llvm-svn: 303632
Summary:
Added separate pseudo and real instruction for GFX9 SDWA instructions.
Currently supports only in assembler.
Depends D32493
Reviewers: vpykhtin, artem.tamazov
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye
Differential Revision: https://reviews.llvm.org/D33132
llvm-svn: 303620
Summary:
This patch makes instruction fusion more aggressive by
* adding artificial edges between the successors of FirstSU and
SecondSU, similar to BaseMemOpClusterMutation::clusterNeighboringMemOps.
* updating PostGenericScheduler::tryCandidate to keep clusters together,
similar to GenericScheduler::tryCandidate.
This change increases the number of AES instruction pairs generated on
Cortex-A57 and Cortex-A72. This doesn't change code at all in
most benchmarks or general code, but we've seen improvement on kernels
using AESE/AESMC and AESD/AESIMC.
Reviewers: evandro, kristof.beyls, t.p.northover, silviu.baranga, atrick, rengolin, MatzeB
Reviewed By: evandro
Subscribers: aemerson, rengolin, MatzeB, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D33230
llvm-svn: 303618
The default behavior of -Rpass-analysis=loop-vectorizer is to report only the
first reason encountered for not vectorizing, if one is found, at which time the
vectorizer aborts its handling of the loop. This patch allows multiple reasons
for not vectorizing to be identified and reported, at the potential expense of
additional compile-time, under allowExtraAnalysis which can currently be turned
on by Clang's -fsave-optimization-record and opt's -pass-remarks-missed.
Removed from LoopVectorizationLegality::canVectorize() the redundant checking
and reporting if we CantComputeNumberOfIterations, as LAI::canAnalyzeLoop() also
does that. This redundancy is caught by a lit test once multiple reasons are
reported.
Patch initially developed by Dror Barak.
Differential Revision: https://reviews.llvm.org/D33396
llvm-svn: 303613
This commit fixes a bug introduced in r301019 where optimizeLogicalImm
would replace a logical node's immediate operand that was CSE'd and
was also an operand of another node.
This commit fixes the bug by replacing the logical node instead of its
immediate operand.
rdar://problem/32295276
llvm-svn: 303607
Summary:
Add Max ModFlagBehavior, which can be used to take the max of two
module flag values when merging modules. Use it for the PIE and PIC
levels.
This avoids an error when we try to import from a module built -fpic
into a module built -fPIC, for example. For both PIE and PIC levels,
this will be legal, since the code generation gets more conservative
as the level is increased. Therefore we can take the max instead of
somehow trying to block importing between modules compiled with
different levels.
Reviewers: tmsriram, pcc
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33418
llvm-svn: 303590
The forward declarations and the SimplifyQuery class at the beginning of the namespace weren't indented. But the closing brace for SimplifyQuery and everything after it were indented.
This commit makes the whole file consistent to no identation per coding standards. The signature of every function in this file changed a few weeks ago so this isn't a big disturbance to the revision history.
llvm-svn: 303588
When presented with an icmp/select pair, we can end up asking what would happen
if we replaced one constant with another in an instruction. This is a mistake,
while non-constant Values could become a constant, constants cannot change and
trying to do so can lead to completely invalid IR (a GEP referencing a
non-existant field in the original case).
llvm-svn: 303580
Previous algotirhm assumed that types and ids are in a single
unified stream. For inputs that come from object files, this
is the case. But if the input is already a PDB, or is the result
of a previous merge, then the types and ids will already have
been split up, in which case we need an algorithm that can
accept operate on independent streams of types and ids that
refer across stream boundaries to each other.
Differential Revision: https://reviews.llvm.org/D33417
llvm-svn: 303577
MachineInstructions that don't generate any code (such as
IMPLICIT_DEFs) should not generate any debug info either.
Fixes PR33107.
https://bugs.llvm.org/show_bug.cgi?id=33107
This reapplies r303566 without any modifications. The stage2 build
failures persisted even after reverting this patch, and looking back
through history, it looks like these tests are flaky.
llvm-svn: 303575
Summary:
With instrumentation profiling, when updating the VP metadata after
an inline, VP metadata on the inlined copy was inadvertantly having
all counts zeroed out. This was causing indirect calls from code inlined
during the call step to be marked as cold in the ThinLTO summaries and
not imported.
The CallerBFI needs to be passed down so that the CallSiteCount can be
computed from the profile summary info. With Sample PGO this was working
since the count is extracted from the branch weight metadata on the
call being inlined (even before we stopped looking at metadata for
non-sample PGO in r302844 this largely wasn't working for instrumentation
PGO since only promoted indirect calls would be getting inlined and have
the metadata).
Added an instrumentation PGO test and renamed the sample PGO test.
Reviewers: danielcdh, eraman
Subscribers: mehdi_amini, llvm-commits
Differential Revision: https://reviews.llvm.org/D33389
llvm-svn: 303574
Turn expensive 64 bit shift into 32 bit if shift does not overflow int:
shl (ext x) => zext (shl x)
Differential Revision: https://reviews.llvm.org/D33367
llvm-svn: 303569
MachineInstructions that don't generate any code (such as
IMPLICIT_DEFs) should not generate any debug info either.
Fixes PR33107.
https://bugs.llvm.org/show_bug.cgi?id=33107
llvm-svn: 303566
This patch adds handling of the `micromips` and `nomicromips` attributes
passed by front-end. The patch depends on D33363.
Differential revision: https://reviews.llvm.org/D33364
llvm-svn: 303545
It's causing some buildbots to timeout whenever tablegen needs re-compilation,
particularly those with -fsanitize=memory but not only them. A compile time
regression was expected since it triples the amount of SelectionDAG rules we
are able to import but it's currently too high.
llvm-svn: 303542
Re-applying now that PR32825 which was raised on the commit this fixed up is now known to have also been fixed by this commit.
Original commit message:
Multiple ldr pseudoinstructions with the same constant value will
reuse the same constant pool entry. However, if the constant pool
is explicitly flushed with a .ltorg directive, we should not try
to reference constants in the previous pool any longer, since they
may be out of range.
This fixes assembling hand-written assembler source which repeatedly
loads the same constant value, across a binary size larger than the
pc-relative fixup range for ldr instructions (4096 bytes). Such
assembler source already uses explicit .ltorg instructions to emit
constant pools with regular intervals. However if we try to reuse
constants emitted in earlier pools, they end up out of range.
This makes the output of the testcase match what binutils gas does
(prior to this patch, it would fail to assemble).
Differential Revision: https://reviews.llvm.org/D32847
llvm-svn: 303540
Re-applying now that the open bug on this commit, PR32825, is known to be fixed.
Original commit message:
Summary: This patch returns the same label if the CP entry with the same value has been created.
Reviewers: eli.friedman, rengolin, jmolloy
Subscribers: majnemer, jmolloy, llvm-commits
Differential Revision: https://reviews.llvm.org/D25804
llvm-svn: 303539
This reverts commit r302416. This was a fixup for r286006, which has now been reverted so this doesn't apply (either in concept or in code).
This commit itself has no problems, but the underlying issue it was fixing has now disappeared from the codebase.
llvm-svn: 303536
llvm-symbolizer would fail to symbolize addresses in unlinked object
files when handling .dwo file data because the addresses would not be
relocated in the same way as the ranges in the skeleton CU in the object
file.
Fix that so object files can be symbolized the same as executables.
llvm-svn: 303532
This is a re-application of a r303497 that was reverted in r303498.
I thought it had broken a bot when it had not (the breakage did not
go away with the revert).
This change makes the split between the "exact" backedge taken count
and the "maximum" backedge taken count a bit more obvious. Both of
these are upper bounds on the number of times the loop header
executes (since SCEV does not account for most kinds of abnormal
control flow), but the latter is guaranteed to be a constant.
There were a few places where the max backedge taken count *was* a
non-constant; I've changed those to compute constants instead.
At this point, I'm not sure if the constant max backedge count can be
computed by calling `getUnsignedRange(Exact).getUnsignedMax()` without
losing precision. If it can, we can simplify even further by making
`getMaxBackedgeTakenCount` a thin wrapper around
`getBackedgeTakenCount` and `getUnsignedRange`.
llvm-svn: 303531
Summary:
Fix naming conventions and const correctness.
This completes the changes made in rL303029.
Patch by Yoav Ben-Shalom.
Reviewers: craig.topper
Reviewed By: craig.topper
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33377
llvm-svn: 303529
Taken from PR32845. Dan removed the most dominating leader check
in r303443, but we check this test anyway to make sure things
don't regress.
llvm-svn: 303515
Otherwise we end up miscompiling, transforming:
define i8 @tinky() {
%sext = sext i1 1 to i16
%hibit = lshr i16 %sext, 15
%tr = trunc i16 %hibit to i8
ret i8 %tr
}
into:
%sext = sext i1 1 to i8
ret i8 %sext
and the first get folded to ret i8 1, while the second gets folded
to ret i8 -1.
Eventually we should get rid of this transform entirely, but for now,
this at least fixes a know correctness bug.
Differential Revision: https://reviews.llvm.org/D33338
llvm-svn: 303513
As discussed in:
https://reviews.llvm.org/D33338
...we may be able to remove a wider pattern match by doing these more
basic canonicalizations.
llvm-svn: 303504
PPC backend eliminates compare instructions by using record-form instructions in PPCInstrInfo::optimizeCompareInstr, which is called from peephole optimization pass.
This patch improves this optimization to eliminate more compare instructions in two types of common case.
- comparison against a constant 1 or -1
The record-form instructions set CR bit based on signed comparison against 0. So, the current implementation does not exploit the record-form instruction for comparison against a non-zero constant.
This patch enables record-form optimization for constant of 1 or -1 if possible; it changes the condition "greater than -1" into "greater than or equal to 0" and "less than 1" into "less than or equal to 0".
With this patch, compare can be eliminated in the following code sequence, as an example.
uint64_t a, b;
if ((a | b) & 0x8000000000000000ull) { ... }
else { ... }
- andi for 32-bit comparison on PPC64
Since record-form instructions execute 64-bit signed comparison and so we have limitation in eliminating 32-bit comparison, i.e. with cmplwi, using the record-form. The original implementation already has such checks but andi. is not recognized as an instruction which executes implicit zero extension and hence safe to convert into record-form if used for equality check.
%1 = and i32 %a, 10
%2 = icmp ne i32 %1, 0
br i1 %2, label %foo, label %bar
In this simple example, LLVM generates andi. + cmplwi + beq on PPC64.
This patch make it possible to eliminate the cmplwi for this case.
I added andi. for optimization targets if it is safe to do so.
Differential Revision: https://reviews.llvm.org/D30081
llvm-svn: 303500
This change makes the split between the "exact" backedge taken count
and the "maximum" backedge taken count a bit more obvious. Both of
these are upper bounds on the number of times the loop header
executes (since SCEV does not account for most kinds of abnormal
control flow), but the latter is guaranteed to be a constant.
There were a few places where the max backedge taken count *was* a
non-constant; I've changed those to compute constants instead.
At this point, I'm not sure if the constant max backedge count can be
computed by calling `getUnsignedRange(Exact).getUnsignedMax()` without
losing precision. If it can, we can simplify even further by making
`getMaxBackedgeTakenCount` a thin wrapper around
`getBackedgeTakenCount` and `getUnsignedRange`.
llvm-svn: 303497
Summary: This allows pthread_self to be pulled out of a loop by LICM.
Reviewers: hfinkel, arsenm, davide
Reviewed By: davide
Subscribers: davide, wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D32782
llvm-svn: 303495
This is split up into two commits.
The will create the DEF parser in LLVM.
Check the following commit to see the removal from LLD
Reviewers: ruiu
Differential Revision: https://reviews.llvm.org/D32689
llvm-svn: 303490
Summary: Added the new modules in the Object/ folder. Updated the
llvm-cvtres interface as well, and added additional tests.
Subscribers: llvm-commits, mgorny
Differential Revision: https://reviews.llvm.org/D33180
llvm-svn: 303480
In the case where we have an operand defined by a lod of the
same memory location. Historically this was a VariableExpression
because we wanted to make sure they ended up in the same class,
but if we create the right expression, they end up in the same
class anyway.
Fixes PR32897. Thanks to Dan for the detailed discussion and the
fix suggestion.
llvm-svn: 303475
This was here because we don't want to switch leaders too much,
in order to avoid fixpoint(ing) issue, but it's not sure if it
matters in practice.
A first step towards fixing PR32897.
llvm-svn: 303473
This reapplies commit r303438 modified to not verify cross-imported
bitcode in FunctionImporter.
rdar://problem/31233625
Differential Revision: https://reviews.llvm.org/D33370
llvm-svn: 303470
Refactor the strlen optimization code to work for both strlen and wcslen.
This especially helps with programs in the wild where people pass
L"string"s to const std::wstring& function parameters and the wstring
constructor gets inlined.
This also fixes a lingerind API problem/bug in getConstantStringInfo()
where zeroinitializers would always give you an empty string (without a
length) back regardless of the actual length of the initializer which
did not work well in the TrimAtNul==false causing the PR mentioned
below.
Note that the fixed getConstantStringInfo() needed fixes to SelectionDAG
memcpy lowering and may lead to some cases for out-of-bounds
zeroinitializer accesses not getting optimized anymore. So some code
with UB may produce out of bound memory reads now instead of just
producing zeros.
The refactoring "accidentally" fixes http://llvm.org/PR32124
Differential Revision: https://reviews.llvm.org/D32839
llvm-svn: 303461
getParamAlignment expects an argument number, not an AttributeList
index.
Johan Englan, who works on LDC, found this bug and told me about it off
list.
llvm-svn: 303458
This is a complicated bug involving two issues:
1. What do we do with phi nodes when we prove all arguments are not
live?
2. When is it safe to use value leaders to determine if we can ignore
an argumnet?
llvm-svn: 303453
This was originally reverted because it was a breaking a bunch
of bots and the breakage was not surfacing on Windows. After much
head-scratching this was ultimately traced back to a bug in the
lit test runner related to its pipe handling. Now that the bug
in lit is fixed, Windows correctly reports these test failures,
and as such I have finally (hopefully) fixed all of them in this
patch.
llvm-svn: 303446
Summary:
NewGVN: Handle equivalence between phi of ops and op of phis.
This makes our GVN mostly-complete. It would be complete, modulo some
deliberate choices we make. This means it detects roughly all herband
equivalences in polynomial time, including cases notoriously hard for
other GVN's to detect. It also detects a very large swath of the
cases we currently rely on instcombine to detect that involve folding
upwards through phis.
Fixes PR 31125, 31463, PR 31868
Reviewers: davide
Subscribers: Prazek, llvm-commits
Differential Revision: https://reviews.llvm.org/D32151
llvm-svn: 303444