Summary:
This addresses a regression in common handling from the new LTO
API in r278338. Only create a new common if the size is different.
The type comparison against an array type fails when the size is
different but not an array. GlobalMerge does not handle the
array types as well and we lose some global merging opportunities.
Reviewers: mehdi_amini
Subscribers: junbuml, llvm-commits, mehdi_amini
Differential Revision: https://reviews.llvm.org/D23955
llvm-svn: 279911
There's only one use of this for the convenience
of a pattern. I think v_mov_b64_pseudo should also be
moved, but SIFoldOperands does currently make use of it.
llvm-svn: 279901
When global-isel fails on a MachineFunction MF, MF will be cleaned up
and given to SDISel.
Thanks to this fallback, we can already perform correctness test even if
we support only a small portion of the functions in a test.
llvm-svn: 279891
Adds a baseline test for lowering shuffles where the width of the output
vector is not twice the size of the input vectors. Many of those sequences
are suboptimal, and will hopefully be improved in follow-up patches.
llvm-svn: 279888
Summary:
If the scheduler clusters the loads, then the offsets will be sorted,
but it is possible for the scheduler to scheduler loads together
without out explicitly clustering them, which would give us non-sorted
offsets.
Also, we will want to do this if we move the load/store optimizer before
the scheduler.
Reviewers: arsenm
Subscribers: arsenm, llvm-commits, kzhuravl
Differential Revision: https://reviews.llvm.org/D23776
llvm-svn: 279870
Summary:
This is obviously an interesting case because it may motivate code
restructuring or LTO.
Reporting this requires instantiation of ORE in the loop where the call
sites are first gathered. I've checked compile-time
overhead *with* -Rpass-with-hotness and the worst slow-down was 6% in
mcf and quickly tailing off. As before without -Rpass-with-hotness
there is no overhead.
Because this could be a pretty noisy diagnostics, it is currently
qualified as 'verbose'. As of this patch, 'verbose' diagnostics are
only emitted with -Rpass-with-hotness, i.e. when the output is expected
to be filtered.
Reviewers: eraman, chandlerc, davidxl, hfinkel
Subscribers: tejohnson, Prazek, davide, llvm-commits
Differential Revision: https://reviews.llvm.org/D23415
llvm-svn: 279860
In the code to detect fixed-point conversions and make use of AArch64's special
instructions, we weren't prepared for weird types. The fptosi direction got
fixed recently, but not the similar sitofp code.
llvm-svn: 279852
It's unclear how the old
%res(32) = G_ICMP { s32, s32 } intpred(eq), %0, %1
is actually different from an s1 verison
%res(1) = G_ICMP { s1, s32 } intpred(eq), %0, %1
so we'll remove it for now.
llvm-svn: 279843
Summary:
In fuctions that contained debug info but were empty otherwise,
the ARM load/store optimizer could abort. This was because
function MergeReturnIntoLDM handled the special case where a
Machine Basic BLock is empty by calling MBB.empty(). However, this
returns false in presence of debug info, although the function
should be considered empty in the eyes of the load/store optimizer.
This has been fixed by handling the case where searching through the
block finds only debug instructions.
Reviewers: rengolin, dexonsmith, llvm-commits, jmolloy
Subscribers: t.p.northover, aemerson, rengolin, samparker
Differential Revision: https://reviews.llvm.org/D23847
llvm-svn: 279820
This was for some reason skipping operands that are subregisters
instead of keeping the same subregister index.
v_movreld_b32 expects src0 to be the subregister of the tied
super register use/def.
e.g.
v_movreld_b32 v0, v9, <imp-def, tied3> v[0:3], <imp-use, tied2> v[0:3]
was being replaced with
v[4:7] = copy v[0:3]
v_movreld_b32 v0, v9, <imp-def, tied3> v[4:7], <imp-use, tied2> v[4:7],
which really writes to v[0:3]
llvm-svn: 279804
This reverts most of r274613 (AKA r274626) and its follow-ups (r276347, r277289),
due to miscompiles in the test suite. The FastISel change was left in, because
it apparently fixes an unrelated issue.
(Recommit of r279782 which was broken due to a bad merge.)
This fixes 4 out of the 5 test failures in PR29112.
llvm-svn: 279788
This reverts most of r274613 and its follow-ups (r276347, r277289), due to
miscompiles in the test suite. The FastISel change was left in, because it
apparently fixes an unrelated issue.
This fixes 4 out of the 5 test failures in PR29112.
llvm-svn: 279782
Its existence is largely historical, apparently we tried to make ARM object
files look maybe-almost-possibly runnable by putting our best guess at the
actual value into relocated locations. Of course, the real linker then comes
along and can completely change things.
But it should only be there for word-sized and movw/movt relocations. It can't
be encoded in branch relocations, and I've seen it mess up validity
calculations twice in the last couple of weeks so the default is clearly problematic.
llvm-svn: 279773
Summary:
This fixes pr29105. The reason is that lifetime marks creates new
aliasing pointers the original ones, but before this patch aliases
were not checked in performMemCpyToMemSetOptzn.
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D23846
llvm-svn: 279769
The 32-bit variants of these operations don't depend on the bits not being
operated on, so they also naturally model operations narrower than the actual
register width.
llvm-svn: 279760
Fix VPAVG detection to require AVX512BW, not AVX512F for 512-bit widths,
and change associated asserts to assert in the right direction...
This fixes PR29111.
llvm-svn: 279755
when unroll runtime iteration loop.
In llvm::UnrollRuntimeLoopRemainder, if the loop to be unrolled is the inner
loop inside a loop nest, the scalar evolution needs to be dropped for its
parent loop which is done by ScalarEvolution::forgetLoop. However, we can
postpone forgetLoop to the end of UnrollRuntimeLoopRemainder so TripCountSC
expansion can still reuse existing value.
Differential Revision: https://reviews.llvm.org/D23572
llvm-svn: 279748
It is invalid to hoist stores or loads if they are not executed on all paths
from the hoisting point to the exit of the function. In the testcase, there are
paths in the loop that do not execute the stores or the loads, and so hoisting
them within the loop is unsafe.
The problem is that the current implementation of hoistingFromAllPaths is
incomplete: it walks all blocks dominated by the hoisting point, and does not
return false when the loop contains a path on which the hoisted ld/st is
not executed.
Differential Revision: https://reviews.llvm.org/D23843
llvm-svn: 279732
Rename AllVRegsAllocated to NoVRegs. This avoids the connotation of
running after register and simply describes that no vregs are used in
a machine function. With that we can simply compute the property and do
not need to dump/parse it in .mir files.
Differential Revision: http://reviews.llvm.org/D23850
llvm-svn: 279698