This means that we can use a shorter instruction sequence in the case where
the size is a power of two and on the boundary between two representations.
Differential Revision: https://reviews.llvm.org/D28421
llvm-svn: 291706
Refines max backedge-taken count if a loop like
"for (int i = 0; i != n; ++i) { /* body */ }" is rotated.
Differential Revision: https://reviews.llvm.org/D28536
llvm-svn: 291704
This is both easier to understand, and produces a tighter bound in certain
cases.
Differential Revision: https://reviews.llvm.org/D28393
llvm-svn: 291701
classes, and updating checking to allow for equivalence through
reachability.
(Sadly, the checking here is not perfect, and can't be made perfect,
so we'll have to disable it after we are satisfied with correctness.
Right now it is just "very unlikely" to happen.)
llvm-svn: 291698
This reverts commit ada6595a526d71df04988eb0a4b4fe84df398ded.
This needs a simple probability check because there are some cases where it is
not profitable.
llvm-svn: 291695
There are a couple left in bool-like containers (BitVector, etc) where
the implicit conversions seem more suitable - though it might be worth
considering explicitifying those too.
llvm-svn: 291694
The new matchers work after legalization to make them simpler, and to avoid
blocking other optimizations.
Differential Revision: https://reviews.llvm.org/D27779
llvm-svn: 291693
The removed assert seems bogus - it's perfectly legal for the roots of the
vectorized subtrees to be equal even if the original scalar values aren't,
if the original scalars happen to be equivalent.
This fixes PR31599.
Differential Revision: https://reviews.llvm.org/D28539
llvm-svn: 291692
a header of that same module.
This fixes a regression caused by r280409.
rdar://problem/29930553
This is an updated version for r291628 (which was reverted in r291688).
llvm-svn: 291689
filter out the implicilty imported modules at CodeGen instead of removing the
implicit ImportDecl when an implementation TU of a module imports a header of
that same module.
llvm-svn: 291688
Now we only support returning Optional<> values and have changed all clients over to use Optional::getValueOr().
Differential Revision: https://reviews.llvm.org/D28569
llvm-svn: 291686
Summary:
Revert LowerTypeTests: Split the pass in two: a resolution phase and a lowering phase.
This change separates how type identifiers are resolved from how intrinsic
calls are lowered. All information required to lower an intrinsic call
is stored in a new TypeIdLowering data structure. The idea is that this
data structure can either be initialized using the module itself during
regular LTO, or using the module summary in ThinLTO backends.
Original URL: https://reviews.llvm.org/D28341
Reviewers: pcc
Subscribers: mehdi_amini, llvm-commits
Differential Revision: https://reviews.llvm.org/D28532
llvm-svn: 291684
Commit rL290616 (https://reviews.llvm.org/rL290616) changed a checking command
for the triple arm-apple-darwin in LLVM::CodeGen/ARM/fpcmp_ueq.ll. As a result
of the changes the test could fail for the valid generated code.
These changes fixes the test to check only instructions we would expect.
Differential Revision: https://reviews.llvm.org/D28159
llvm-svn: 291678
The `-target` impacts the CC for the builtins. HF targets (with either
floating point ABI) always use AAPCS VFP for the builtins unless they
are AEABI builtins, in which case they use AAPCS. Non-HF targets (with
either floating point ABI) always use AAPCS for the builtins and AAPCS
for the AEABI builtins. This introduces the thunks necessary to switch
CC for the floating point operations. This is not currently enabled,
and should be dependent on the target being used to build compiler-rt.
However, as a stop-gap, a define can be added for ASFLAGS to get the
thunks.
llvm-svn: 291677
Decompressor intention is to reduce duplication of code.
Currently LLD has own implementation of decompressor
for compressed debug sections.
This class helps to avoid it and share the code.
LLD patch for reusing it is D28106
Differential revision: https://reviews.llvm.org/D28105
llvm-svn: 291675
A store of an extracted element or a load which gets inserted into a vector,
will be combined into a vector load/store element instruction.
Therefore, isFoldableMemAccessOffset(), which is called by LSR, should
return false in these cases.
Reviewer: Ulrich Weigand
llvm-svn: 291673
We had an error when met this relocation
after latest changes aboult listing
x86 relocations explicitly.
Since we support R_X86_64_NONE,
and GNU ld supports R_386_NONE,
it seems reasonable to have.
Differential revision: https://reviews.llvm.org/D28552
llvm-svn: 291672
Here's my second try at making @llvm.assume processing more efficient. My
previous attempt, which leveraged operand bundles, r289755, didn't end up
working: it did make assume processing more efficient but eliminating the
assumption cache made ephemeral value computation too expensive. This is a
more-targeted change. We'll keep the assumption cache, but extend it to keep a
map of affected values (i.e. values about which an assumption might provide
some information) to the corresponding assumption intrinsics. This allows
ValueTracking and LVI to find assumptions relevant to the value being queried
without scanning all assumptions in the function. The fact that ValueTracking
started doing O(number of assumptions in the function) work, for every
known-bits query, has become prohibitively expensive in some cases.
As discussed during the review, this is a pragmatic fix that, longer term, will
likely be replaced by a more-principled solution (perhaps based on an extended
SSA form).
Differential Revision: https://reviews.llvm.org/D28459
llvm-svn: 291671
DAG patterns optimization: truncate + unsigned saturation supported by VPMOVUS* instructions in AVX-512.
And VPACKUS* instructions on SEE* targets.
Differential Revision: https://reviews.llvm.org/D28216
llvm-svn: 291670
Summary:
Fix no std::function index.
Previously, we don't index all the template specialization declarations of functions and classes (Because we assume that template functions/classes are indexed by their template declarations), but this is not always true in some cases like `std::function` which only has a partial template specialization declaration.
Reviewers: ioeric, bkramer
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D27920
llvm-svn: 291669
Instead of just using popularity, we also take into account how similar the
path of the current file is to the path of the header.
Our first approach is to get popularity into a reasonably small scale by taking
log2 (which is roughly intuitive to how humans would bucket popularity), and
multiply that with the number of matching prefix path fragments of the included
header with the current file.
Note that currently we do not take special care for unclean paths containing
"../" or "./".
Differential Revision: https://reviews.llvm.org/D28548
llvm-svn: 291664
the latter to the Transforms library.
While the loop PM uses an analysis to form the IR units, the current
plan is to have the PM itself establish and enforce both loop simplified
form and LCSSA. This would be a layering violation in the analysis
library.
Fundamentally, the idea behind the loop PM is to *transform* loops in
addition to running passes over them, so it really seemed like the most
natural place to sink this was into the transforms library.
We can't just move *everything* because we also have loop analyses that
rely on a subset of the invariants. So this patch splits the the loop
infrastructure into the analysis management that has to be part of the
analysis library, and the transform-aware pass manager.
This also required splitting the loop analyses' printer passes out to
the transforms library, which makes sense to me as running these will
transform the code into LCSSA in theory.
I haven't split the unittest though because testing one component
without the other seems nearly intractable.
Differential Revision: https://reviews.llvm.org/D28452
llvm-svn: 291662
The code emiited by Clang's intrinsics for (v)cvtsi2ss, (v)cvtsi2sd,
(v)cvtsd2ss and (v)cvtss2sd is lowered to a code sequence that includes
redundant (v)movss/(v)movsd instructions. This patch adds patterns for
optimizing these sequences.
Differential revision: https://reviews.llvm.org/D28455
llvm-svn: 291660