Commit Graph

12759 Commits

Author SHA1 Message Date
Matt Arsenault 1e21181aee LoopDistribute/LAA: Add tests to catch regressions
I broke 2 of these with a patch, but were not covered by existing
tests.

https://reviews.llvm.org/D63035

llvm-svn: 363158
2019-06-12 13:15:59 +00:00
Sam Parker 52d7326f32 [NFC] Add HardwareLoops lit.local.cfg file
Set Transforms/HardwareLoops/ARM/ tests as unsupported if there isn't
an arm target.

llvm-svn: 363157
2019-06-12 12:54:19 +00:00
Sam Parker ece316b56a Attempt to fix non-Arm buildbots
Adding REQUIRES: arm to failing tests

llvm-svn: 363156
2019-06-12 12:47:35 +00:00
Sam Parker 757ac02dc8 [ARM] Implement TTI::isHardwareLoopProfitable
Implement the backend target hook to drive the HardwareLoops pass.
The low-overhead branch extension for Arm M-class cores is flexible
enough that we don't have to ensure correctness at this point, except
checking that the loop counter variable can be stored in LR - a
32-bit register. For it to be profitable, we want to avoid loops that
contain function calls, or any other instruction that alters the PC.
    
This implementation uses TargetLoweringInfo, to query type and
operation actions, looks at intrinsic calls and also performs some
manual checks for remainder/division and FP operations.
    
I think this should be a good base to start and extra details can be
filled out later.

Differential Revision: https://reviews.llvm.org/D62907

llvm-svn: 363149
2019-06-12 12:00:42 +00:00
Orlando Cazalet-Hyams a947156396 Revert "[DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step through loop even after completion"
This reverts commit 1a0f7a2077.
See phabricator thread for D60831.

llvm-svn: 363132
2019-06-12 08:34:51 +00:00
Philip Reames 082cd30327 Generalize icmp matching in IndVars' eliminateTrunc
We were only matching RHS being a loop invariant value, not the inverse. Since there's nothing which appears to canonicalize loop invariant values to RHS, this means we missed cases.

Differential Revision: https://reviews.llvm.org/D63112

llvm-svn: 363108
2019-06-11 22:43:25 +00:00
Cameron McInally 08200d6d26 [InstCombine] Handle -(X-Y) --> (Y-X) for unary fneg when NSZ
Differential Revision: https://reviews.llvm.org/D62612

llvm-svn: 363082
2019-06-11 16:21:21 +00:00
Cameron McInally 796de11331 [InstCombine] Update fptrunc (fneg x)) -> (fneg (fptrunc x) for unary FNeg
Differential Revision: https://reviews.llvm.org/D62629

llvm-svn: 363080
2019-06-11 15:45:41 +00:00
Orlando Cazalet-Hyams 1a0f7a2077 [DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step through loop even after completion
Summary:
Bug: https://bugs.llvm.org/show_bug.cgi?id=39024

The bug reports that a vectorized loop is stepped through 4 times and each step through the loop seemed to show a different path. I found two problems here:

A) An incorrect line number on a preheader block (for.body.preheader) instruction causes a step into the loop before it begins.
B) Instructions in the middle block have different line numbers which give the impression of another iteration.

In this patch I give all of the middle block instructions the line number of the scalar loop latch terminator branch. This seems to provide the smoothest debugging experience because the vectorized loops will always end on this line before dropping into the scalar loop. To solve problem A I have altered llvm::SplitBlockPredecessors to accommodate loop header blocks.

I have set up a separate review D61933 for a fix which is required for this patch.

Reviewers: samsonov, vsk, aprantl, probinson, anemet, hfinkel, jmorse

Reviewed By: hfinkel, jmorse

Subscribers: jmorse, javed.absar, eraman, kcc, bjope, jmellorcrummey, hfinkel, gbedwell, hiraditya, zzheng, llvm-commits

Tags: #llvm, #debug-info

Differential Revision: https://reviews.llvm.org/D60831

llvm-svn: 363046
2019-06-11 10:37:20 +00:00
Matt Arsenault c5830f5f05 AtomicExpand: Don't crash on non-0 alloca
This now produces garbage on AMDGPU with a call to an nonexistent,
anonymous libcall but won't assert.

llvm-svn: 363022
2019-06-11 01:35:07 +00:00
Matt Arsenault 383e72fcfe AMDGPU: Expand < 32-bit atomics
Also fix AtomicExpand asserting on atomicrmw fadd/fsub.

llvm-svn: 363021
2019-06-11 01:35:00 +00:00
Philip Reames efb14f9005 [Tests] Adjust LFTR dead-iv tests to bypass undef cases
As pointed out by Nikita in review, undef and poison need to be handled separately.  Since we're no longer expecting any test improvements - just fixes for miscompiles - update the tests to bypass the existing undef check.

llvm-svn: 363002
2019-06-10 23:17:10 +00:00
Rong Xu e44fa83c37 [PGO] Handle cases of non-instrument BBs
As shown in PR41279, some basic blocks (such as catchswitch) cannot be
instrumented. This patch filters out these BBs in PGO instrumentation.
It also sets the profile count to the fail-to-instrument edge, so that we
can propagate the counts in the CFG.

Differential Revision: https://reviews.llvm.org/D62700

llvm-svn: 362995
2019-06-10 22:36:27 +00:00
Philip Reames 1d322ccaac [Tests] Split an LFTR dead-iv case
There are two interesting sub-cases here.  1) Switching IVs is legal, but only in pre-increment form.  and 2) Switching IVs is legal, and so is post-increment form.

llvm-svn: 362993
2019-06-10 22:33:20 +00:00
Philip Reames 78c0d75697 [Tests] Add tests for D62939 (miscompiles around dead pointer IVs)
Flesh out a collection of tests for switching to a dead IV within LFTR, both for the current miscompile, and for some cases which we should be able to handle via simple reasoning.

llvm-svn: 362976
2019-06-10 19:45:59 +00:00
Philip Reames a9633d5f0b [LFTR] Use recomputed BE count
This was discussed as part of D62880.  The basic thought is that computing BE taken count after widening should produce (on average) an equally good backedge taken count as the one before widening.  Since there's only one test in the suite which is impacted by this change, and it's essentially equivelent codegen, that seems to be a reasonable assertion.  This change was separated from r362971 so that if this turns out to be problematic, the triggering piece is obvious and easily revertable.

For the nestedIV example from elim-extend.ll, we end up with the following BE counts:
BEFORE: (-2 + (-1 * %innercount) + %limit)
AFTER: (-1 + (sext i32 (-1 + %limit) to i64) + (-1 * (sext i32 %innercount to i64))<nsw>)

Note that before is an i32 type, and the after is an i64.  Truncating the i64 produces the i32. 

llvm-svn: 362975
2019-06-10 19:18:53 +00:00
Sanjay Patel 9650c95b7e [InstCombine] allow unordered preds when canonicalizing to fabs()
We have a known-never-nan value via 'nnan', so an unordered predicate
is the same as its ordered sibling.

Similar to:
rL362937

llvm-svn: 362954
2019-06-10 15:39:00 +00:00
Sanjay Patel 07bba68889 [InstCombine] add tests for fabs() with unordered preds; NFC
llvm-svn: 362949
2019-06-10 15:08:22 +00:00
Sanjay Patel 85de9634e6 [InstCombine] fix bug in canonicalization to fabs()
Forgot to translate the predicate clauses in rL362943.

llvm-svn: 362945
2019-06-10 14:57:45 +00:00
Sanjay Patel 8b6d9f60ed [InstCombine] change canonicalization to fabs() to use FMF on fsub
Similar to rL362909:
This isn't the ideal fix (use FMF on the select), but it's still an
improvement until we have better FMF propagation to selects and other
FP math operators.

I don't think there's much risk of regression from this change by
not including the FMF on the fcmp any more. The nsz/nnan FMF
should be the same on the fcmp and the fsub because they have the
same operand.

llvm-svn: 362943
2019-06-10 14:46:36 +00:00
Sanjay Patel 8cd8c5784b [InstCombine] allow unordered preds when canonicalizing to fabs()
PR42179:
https://bugs.llvm.org/show_bug.cgi?id=42179

llvm-svn: 362937
2019-06-10 14:14:51 +00:00
Sanjay Patel 4cdd3ceb57 [InstCombine] add tests for fcmp unordered pred -> fabs (PR42179); NFC
llvm-svn: 362936
2019-06-10 14:04:10 +00:00
David Green d847aa573b [ARM] Enable Unroll UpperBound
This option allows loops with small max trip counts to be fully unrolled. This
can help with code like the remainder loops from manually unrolled loops like
those that appear in the cmsis dsp library. We would apparently previously
runtime unroll them with the default unroll count (4).

Differential Revision: https://reviews.llvm.org/D63064

llvm-svn: 362928
2019-06-10 10:22:14 +00:00
Vivek Pandya 11cb15f8ed Do not derive no-recurse attribute if function does not have exact definition.
This is fix for https://bugs.llvm.org/show_bug.cgi?id=41336

Reviewers: jdoerfert
Reviewed by: jdoerfert

Differential Revision: https://reviews.llvm.org/D63045

llvm-svn: 362918
2019-06-10 04:16:04 +00:00
Roman Lebedev d669758d84 [InstCombine] foldICmpWithLowBitMaskedVal(): 'icmp sgt/sle': avoid miscompiles
A precondition 'x != 0' was forgotten by me:
https://rise4fun.com/Alive/JFNP
https://rise4fun.com/Alive/jHvL

These 4 folds with non-constants could be re-enabled,
but for now let's go for the simplest solution.

https://bugs.llvm.org/show_bug.cgi?id=42198

llvm-svn: 362911
2019-06-09 16:30:42 +00:00
Roman Lebedev ff0c99b017 [NFC][InstCombine] Revisit canonicalize-constant-low-bit-mask-and-icmp-s* tests in preparatio for PR42198.
The `icmp sgt`/`icmp sle` variants are, too, miscompiles:
https://rise4fun.com/Alive/JFNP
https://rise4fun.com/Alive/jHvL
A precondition 'x != 0' was forgotten by me.

While ensuring test coverage for `-1`, also add test coverage
for `0` mask. Mask `0` is allowed for all the folds,
mask `-1` is allowed for all the folds with unsigned `icmp` pred.
Constant mask `0` is missed though.

https://bugs.llvm.org/show_bug.cgi?id=42198

llvm-svn: 362910
2019-06-09 16:30:14 +00:00
Sanjay Patel 87cd16a86e [InstCombine] change canonicalization to fabs() to use FMF on fneg
This isn't the ideal fix (use FMF on the select), but it's still an
improvement until we have better FMF propagation to selects and other
FP math operators.

I don't think there's much risk of regression from this change by
not including the FMF on the fcmp any more. The nsz/nnan FMF
should be the same on the fcmp and the fneg (fsub) because they
have the same operand.

This works around the most glaring FMF logical inconsistency cited
in PR38086:
https://bugs.llvm.org/show_bug.cgi?id=38086

llvm-svn: 362909
2019-06-09 16:22:01 +00:00
David Bolvansky 4e95b36b6d [NFC] Added test from PR42084 for D63058
llvm-svn: 362906
2019-06-09 14:56:46 +00:00
Nikita Popov 06beb48229 [InstCombine] Add tests for usub.sat(x,y)+y etc; NFC
For PR42178.

llvm-svn: 362905
2019-06-09 14:39:47 +00:00
Sanjay Patel 73f5a855b3 [InstSimplify] enhance fcmp fold with never-nan operand
This is another step towards correcting our usage of fast-math-flags when applied on an fcmp.
In this case, we are checking for 'nnan' on the fcmp itself rather than the operand of
the fcmp. But I'm leaving that clause in until we're more confident that we can stop
relying on fcmp's FMF.

By using the more general "isKnownNeverNaN()", we gain a simplification shown on the
tests with 'uitofp' regardless of the FMF on the fcmp (uitofp never produces a NaN).
On the tests with 'fabs', we are now relying on the FMF for the call fabs instruction
in addition to the FMF on the fcmp.

This is a continuation of D62979 / rL362879.

llvm-svn: 362903
2019-06-09 13:48:59 +00:00
Sanjay Patel de4d4d5049 [InstSimplify] add tests for fcmp with known-never-nan operands; NFC
Opposite predicate for rL362742 / rL362879 / D62979

llvm-svn: 362902
2019-06-09 13:30:14 +00:00
Ayke van Laethem f18cf230e4 [CaptureTracking] Don't let comparisons against null escape inbounds pointers
Pointers that are in-bounds (either through dereferenceable_or_null or
thorough a getelementptr inbounds) cannot be captured with a comparison
against null. There is no way to construct a pointer that is still in
bounds but also NULL.

This helps safe languages that insert null checks before load/store
instructions. Without this patch, almost all pointers would be
considered captured even for simple loads. With this patch, an icmp with
null will not be seen as escaping as long as certain conditions are met.

There was a lot of discussion about this patch. See the Phabricator
thread for detals.

Differential Revision: https://reviews.llvm.org/D60047

llvm-svn: 362900
2019-06-09 10:20:33 +00:00
Sanjay Patel 4329c15f11 [InstSimplify] enhance fcmp fold with never-nan operand
This is 1 step towards correcting our usage of fast-math-flags when applied on an fcmp.
In this case, we are checking for 'nnan' on the fcmp itself rather than the operand of
the fcmp. But I'm leaving that clause in until we're more confident that we can stop
relying on fcmp's FMF.

By using the more general "isKnownNeverNaN()", we gain a simplification shown on the
tests with 'uitofp' regardless of the FMF on the fcmp (uitofp never produces a NaN).
On the tests with 'fabs', we are now relying on the FMF for the call fabs instruction
in addition to the FMF on the fcmp.

I'll update the 'ult' case below here as a follow-up assuming no problems here.

Differential Revision: https://reviews.llvm.org/D62979

llvm-svn: 362879
2019-06-08 15:12:33 +00:00
David Bolvansky 54b1044983 [NFC] Added tests for D63038
llvm-svn: 362875
2019-06-08 12:07:59 +00:00
Matt Arsenault a59aeb3f29 LoopDistribute: Add testcase where SCEV wants to insert a runtime
check.

Only the memory based checks were being tested. Prepare for fix in
convergent handling.

llvm-svn: 362854
2019-06-07 23:17:38 +00:00
Sam Parker c5ef502ee8 [CodeGen] Generic Hardware Loop Support
Patch which introduces a target-independent framework for generating
hardware loops at the IR level. Most of the code has been taken from
PowerPC CTRLoops and PowerPC has been ported over to use this generic
pass. The target dependent parts have been moved into
TargetTransformInfo, via isHardwareLoopProfitable, with
HardwareLoopInfo introduced to transfer information from the backend.
    
Three generic intrinsics have been introduced:
- void @llvm.set_loop_iterations
  Takes as a single operand, the number of iterations to be executed.
- i1 @llvm.loop_decrement(anyint)
  Takes the maximum number of elements processed in an iteration of
  the loop body and subtracts this from the total count. Returns
  false when the loop should exit.
- anyint @llvm.loop_decrement_reg(anyint, anyint)
  Takes the number of elements remaining to be processed as well as
  the maximum numbe of elements processed in an iteration of the loop
  body. Returns the updated number of elements remaining.

llvm-svn: 362774
2019-06-07 07:35:30 +00:00
Sanjay Patel 38c5ee1802 [InstSimplify] add tests for fcmp with known-never-nan operands; NFC
llvm-svn: 362742
2019-06-06 20:14:06 +00:00
Craig Topper 6cda33ba36 [InlineCost] Add support for unary fneg.
This adds support for unary fneg based on the implementation of BinaryOperator without the soft float FP cost.

Previously we would just delegate to visitUnaryInstruction. I think the only real change is that we will pass the FastMath flags to SimplifyFNeg now.

Differential Revision: https://reviews.llvm.org/D62699

llvm-svn: 362732
2019-06-06 19:02:18 +00:00
Philip Reames 101915cfda [LoopPred] Fix a bug in unconditional latch bailout introduced in r362284
This is a really silly bug that even a simple test w/an unconditional latch would have caught.  I tried to guard against the case, but put it in the wrong if check.  Oops.

llvm-svn: 362727
2019-06-06 18:02:36 +00:00
Sanjay Patel dd2d1a168f [InstCombine] add tests for loads of bitcasted vector pointer; NFC
llvm-svn: 362703
2019-06-06 13:18:20 +00:00
Benjamin Kramer f1249442cf Revert "[SCEV] Use wrap flags in InsertBinop"
This reverts commit r362687. Miscompiles llvm-profdata during selfhost.

llvm-svn: 362699
2019-06-06 12:35:46 +00:00
Sam Parker 7cc580f5e9 [SCEV] Use wrap flags in InsertBinop
If the given SCEVExpr has no (un)signed flags attached to it, transfer
these to the resulting instruction or use them to find an existing
instruction.

Differential Revision: https://reviews.llvm.org/D61934

llvm-svn: 362687
2019-06-06 08:56:26 +00:00
Joseph Tremoulet acb5609063 [EarlyCSE] Add tests for negated min/max/abs [NFC]
Summary:
I'm planning to update the hashing logic to recognize their equivalence
in a subsequent change (D62644).

Reviewers: spatel

Reviewed By: spatel

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62918

llvm-svn: 362657
2019-06-05 21:30:10 +00:00
Matt Arsenault 663d762c9a NewGVN: Handle addrspacecast
The AllConstant check needs to be moved out of the if/else if chain to
avoid a test regression. The "there is no SimplifyZExt" comment
puzzles me, since there is SimplifyCastInst. Additionally, the
Simplify* calls seem to not see the operand as constant, so this needs
to be tried if the simplify failed.

llvm-svn: 362653
2019-06-05 21:15:52 +00:00
Tim Northover 8d7f118ab2 InstCombine: correctly change byval type attribute alongside call args.
When the byval attribute has a type, it must match the pointee type of
any parameter; but InstCombine was not updating the attribute when
folding casts of various kinds away.

llvm-svn: 362643
2019-06-05 20:38:17 +00:00
Cameron McInally 8b83a9c6b1 [NFC][Reassociate] Fix mistake in 468b2ad
Missed 2 'fast fsub(0.0,X) -> fneg(X)' changes.

llvm-svn: 362631
2019-06-05 18:50:07 +00:00
Cameron McInally 5162266515 [NFC][Reassociate] Add unary fneg tests to fast-basictest.ll
llvm-svn: 362630
2019-06-05 18:35:54 +00:00
Philip Reames 13dd125043 [Tests] Add poison inference tests for indvars showing both existing transforms, and some room for improvement
llvm-svn: 362628
2019-06-05 18:00:59 +00:00
Cameron McInally 0a31726d20 [NFC][Reassociate] Regenerate CHECKs for fast-basictest.ll
llvm-svn: 362627
2019-06-05 18:00:27 +00:00
Dinar Temirbulatov 15c657d13d [SLP] Fix regression in broadcasts caused by operand reordering patch D59973.
This patch fixes a regression caused by the operand reordering refactoring patch https://reviews.llvm.org/D59973 .
The fix changes the strategy to Splat instead of Opcode, if broadcast opportunities are found.
Please see the lit test for some examples.

Committed on behalf of @vporpo (Vasileios Porpodas)
    
Differential Revision: https://reviews.llvm.org/D62427

llvm-svn: 362613
2019-06-05 15:26:28 +00:00