Constant fold both the trapping and saturating versions of the
WebAssembly truncation intrinsics. The tests are adapted from the
WebAssembly spec tests for the corresponding instructions.
Requested in PR46982.
Differential Revision: https://reviews.llvm.org/D85392
If we can't identify alloca used in lifetime marker we
need to assume to worst case scenario.
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D84630
Add cases of fused fmul+fadd/fsub with f16 and f64 operands to cost model.
Also added operations with contract attribute.
Fixed line endings in test.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D84995
As mentioned in
http://lists.llvm.org/pipermail/llvm-dev/2020-July/143395.html,
loop-unswitch has not been ported to the NPM. Instead people are using
simple-loop-unswitch.
Pin all tests in Transforms/LoopUnswitch to legacy PM and replace all
other uses of loop-unswitch with simple-loop-unswitch.
One test that didn't fit into the above was
2014-06-21-congruent-constant.ll which seems to only pass with
loop-unswitch. That is also pinned to legacy PM.
Now all tests containing "-loop-unswitch" anywhere in the test succeed with
NPM turned on by default.
Reviewed By: ychen
Differential Revision: https://reviews.llvm.org/D85360
-print-memoryssa in legacy PM is print<memoryssa> in NPM.
Pin tests with -print-memoryssa to legacy PM.
Add corresponding tests for NPM where missing.
This fixes "unknown pass name 'print-memoryssa'".
Some tests still fail in Analysis/MemorySSA due to other passes that
haven't been ported.
pr43427.ll and pr43438.ll required adding -aa-pipeline=basic-aa,
-loop-simplify (since it doesn't run on legacy PM by default), and
decrementing some of the MemoryPhi numbers.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D85333
As with other targets, set the throughput cost of control-flow
instructions to free so that we don't miss out of vectorization
opportunities.
Differential Revision: https://reviews.llvm.org/D85283
This option was added a while back, to help improve AA around pointer
phi loops. It looks for phi(gep(phi, const), x) loops, checking if x can
then prove more precise aliasing info.
Differential Revision: https://reviews.llvm.org/D82998
Merging alias results from different paths, when a path did phi
translation is not necesarily correct. Conservatively terminate such paths.
Aimed to fix PR46156.
Differential Revision: https://reviews.llvm.org/D84905
In some cases, it seems like we can get rid of unnecessary s/umins by
using information from the loop guards (unless I am missing something).
One place where this seems to be helpful in practice is when computing
loop trip counts. This patch just changes howManyGreaterThans for now.
Note that this requires a loop for which we can check 'is guarded'.
On SPEC2000/SPEC2006/MultiSource, there are some notable changes for
some programs in the number of loops unrolled and trip counts computed.
```
Same hash: 179 (filtered out)
Remaining: 58
Metric: scalar-evolution.NumTripCountsComputed
Program base patch diff
test-suite...langs-C/compiler/compiler.test 25.00 31.00 24.0%
test-suite.../Applications/SPASS/SPASS.test 2020.00 2323.00 15.0%
test-suite...langs-C/allroots/allroots.test 29.00 32.00 10.3%
test-suite.../Prolangs-C/loader/loader.test 17.00 18.00 5.9%
test-suite...fice-ispell/office-ispell.test 253.00 265.00 4.7%
test-suite...006/450.soplex/450.soplex.test 3552.00 3692.00 3.9%
test-suite...chmarks/MallocBench/gs/gs.test 453.00 470.00 3.8%
test-suite...ngs-C/assembler/assembler.test 29.00 30.00 3.4%
test-suite.../Benchmarks/Ptrdist/bc/bc.test 263.00 270.00 2.7%
test-suite...rks/FreeBench/pifft/pifft.test 722.00 741.00 2.6%
test-suite...count/automotive-bitcount.test 41.00 42.00 2.4%
test-suite...0/253.perlbmk/253.perlbmk.test 1417.00 1451.00 2.4%
test-suite...000/197.parser/197.parser.test 387.00 396.00 2.3%
test-suite...lications/sqlite3/sqlite3.test 1168.00 1189.00 1.8%
test-suite...000/255.vortex/255.vortex.test 173.00 176.00 1.7%
Metric: loop-unroll.NumUnrolled
Program base patch diff
test-suite...langs-C/compiler/compiler.test 1.00 3.00 200.0%
test-suite.../Applications/SPASS/SPASS.test 134.00 234.00 74.6%
test-suite...count/automotive-bitcount.test 3.00 4.00 33.3%
test-suite.../Prolangs-C/loader/loader.test 3.00 4.00 33.3%
test-suite...langs-C/allroots/allroots.test 3.00 4.00 33.3%
test-suite...Source/Benchmarks/sim/sim.test 10.00 12.00 20.0%
test-suite...fice-ispell/office-ispell.test 21.00 25.00 19.0%
test-suite.../Benchmarks/Ptrdist/bc/bc.test 32.00 38.00 18.8%
test-suite...006/450.soplex/450.soplex.test 300.00 352.00 17.3%
test-suite...rks/FreeBench/pifft/pifft.test 60.00 69.00 15.0%
test-suite...chmarks/MallocBench/gs/gs.test 57.00 63.00 10.5%
test-suite...ngs-C/assembler/assembler.test 10.00 11.00 10.0%
test-suite...0/253.perlbmk/253.perlbmk.test 145.00 157.00 8.3%
test-suite...000/197.parser/197.parser.test 43.00 46.00 7.0%
test-suite...TimberWolfMC/timberwolfmc.test 205.00 214.00 4.4%
Geomean difference 7.6%
```
Fixes https://bugs.llvm.org/show_bug.cgi?id=46939
Fixes https://bugs.llvm.org/show_bug.cgi?id=46924 on X86.
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D85046
For consistency with legacy pass name.
Helps with 37 instances of "unknown pass name 'tbaa'" in check-llvm under NPM.
Reviewed By: ychen
Differential Revision: https://reviews.llvm.org/D84967
There's a slight difference in functionality with the new CHECK lines:
before, we allowed either -0.0 or 0.0 for maxnum/minnum. That matches
the definition, but we should always get a deterministic result from
constant folding within the compiler, so now we assert that we got
the single expected result in all cases.
This patch uses the feature added in D79162 to fix the cost of a
sext/zext of a masked load, or a trunc for a masked store.
Previously, those were considered cheap or even free, but it's
not the case as we cannot split the load in the same way we would for
normal loads.
This updates the costs to better reflect reality, and adds a test for it
in test/Analysis/CostModel/ARM/cast.ll.
It also adds a vectorizer test that showcases the improvement: in some
cases, the vectorizer will now choose a smaller VF when
tail-predication is enabled, which results in better codegen. (Because
if it were to use a higher VF in those cases, the code we see above
would be generated, and the vmovs would block tail-predication later in
the process, resulting in very poor codegen overall)
Original Patch by Pierre van Houtryve
Differential Revision: https://reviews.llvm.org/D79163
To match NewPM pass name, and also for readability.
Also rename rpo-functionattrs -> rpo-function-attrs while we're here.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D84694
This is a simple patch that adds constant folding for freeze
instruction.
IIUC, it isn't needed to update ConstantFold.cpp because there is no freeze
constexpr.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D84597
Summary: To match NewPM name. Also the new name is clearer and more consistent.
Subscribers: jvesely, nhaehnle, hiraditya, asbirlea, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D84542