Commit Graph

10376 Commits

Author SHA1 Message Date
Craig Topper 301991080e [ValueTracking] Teach cannotBeOrderedLessThanZeroImpl to look through ExtractElement.
This is similar to what's done in computeKnownBits and computeSignBits. Don't do anything fancy just collect information valid for any element.

Differential Revision: https://reviews.llvm.org/D43789

llvm-svn: 326237
2018-02-27 19:53:45 +00:00
Sanjay Patel 8529dd5ee1 [ARM] add loop vectorizer test based on 482.sphinx3 from SPEC2006; NFC
This is a slight reduction of one of the benchmarks
that suffered with D43079. Cost model changes should
not cause this test to remain scalarized.

llvm-svn: 326221
2018-02-27 18:33:24 +00:00
Sanjay Patel 04d1d79ee5 [AArch64] add SLP test based on TSVC; NFC
This is a slight reduction of one of the benchmarks
that suffered with D43079. Cost model changes should
not cause this test to remain scalarized.

llvm-svn: 326217
2018-02-27 18:06:15 +00:00
Florian Hahn 1807c516c7 [NewGVN] Update phi-of-ops def block when updating existing ValuePHI.
In case we update a ValuePHI node created earlier, we could update it
based on a different OpPHI which could be in a different block.
We need to update the TempToBlock mapping reflecting the new block,
otherwise we would end up placing the new phi node in a wrong block.

This problem is exposed by the test case in
https://bugs.llvm.org/show_bug.cgi?id=36504.

This patch fixes a slightly simpler problem than in the bug report. In
the bug's re-producer, the additional problem is that we are re-using a
ValuePHI node with to few incoming values for the new OpPHI. If this
patch makes sense, I will follow it up with a patch that creates a new
PHI node if the existing PHI node has a different number of incoming
values.

Reviewers: davide, dberlin

Reviewed By: dberlin

Differential Revision: https://reviews.llvm.org/D43770

llvm-svn: 326181
2018-02-27 09:34:51 +00:00
Adam Nemet b424cd5d61 Make test agnostic to cost model
This was causing bot failures on greendragon

llvm-svn: 326169
2018-02-27 05:41:16 +00:00
Evgeny Stupachenko f1c058d99b Fix r326154 buildbots test fail
Summary:

Add specific mtriples to tests added in r326154.

From: Evgeny Stupachenko <evstupac@gmail.com>
                         <evgeny.v.stupachenko@intel.com>
llvm-svn: 326158
2018-02-27 01:33:11 +00:00
Evgeny Stupachenko a732611ac8 Fix PR36032, PR35432
Summary:

The change fix an assert fail at ScalarEvolutionExpander.cpp:
  assert(ExitCount != SE.getCouldNotCompute() && "Invalid loop count");

Reviewers: sbaranga

Differential Revision: http://reviews.llvm.org/D42604

From: Evgeny Stupachenko <evstupac@gmail.com>
                         <evgeny.v.stupachenko@intel.com>
llvm-svn: 326154
2018-02-27 00:17:31 +00:00
Sanjay Patel 66911b16e6 [InstCombine, InstSimplify] add tests with undef elements in constant FP vectors; NFC
llvm-svn: 326148
2018-02-26 23:23:02 +00:00
Craig Topper 69c8972fd1 [ValueTracking] Teach cannotBeOrderedLessThanZeroImpl to handle vector constants.
Summary: This allows vector fabs to be removed in more cases.

Reviewers: spatel, arsenm, RKSimon

Reviewed By: spatel

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D43739

llvm-svn: 326138
2018-02-26 22:33:17 +00:00
Simon Pilgrim 9929f90740 [X86][SSE] Reduce FADD/FSUB/FMUL costs on later targets (PR36280)
Agner's tables indicate that for SSE42+ targets (Core2 and later) we can reduce the FADD/FSUB/FMUL costs down to 1, which should fix the Himeno benchmark.

Note: the AVX512 FDIV costs look rather dodgy, but this isn't part of this patch.

Differential Revision: https://reviews.llvm.org/D43733

llvm-svn: 326133
2018-02-26 22:10:17 +00:00
Alexey Bataev b44e2b75e8 [SLP] Added new test + fixed some checks, NFC.
llvm-svn: 326117
2018-02-26 20:01:24 +00:00
Craig Topper 43fb1cdef7 [InstCombine] Add test cases with vector constants to fpextend.ll
llvm-svn: 326115
2018-02-26 19:36:37 +00:00
Craig Topper b284e8b9b4 [InstCombine] Switch to using FileCheck instead of grep. Auto-generate checks. NFC
llvm-svn: 326114
2018-02-26 19:36:36 +00:00
Sanjay Patel 31a90468e1 [InstCombine] allow fdiv folds with less than fully 'fast' ops
Note: gcc appears to allow this fold with -freciprocal-math alone, 
but clang/llvm require more than that with this patch. The wording
in the definitions seems fuzzy enough that it could go either way,
but we'll err on the conservative side of FMF interpretation.

This patch also changes the newly created fmul to have FMF propagated
by the last fdiv rather than intersecting the FMF of the fdivs. This
matches the behavior of other folds near here. The new fmul is only 
used to produce an intermediate op for the final fdiv result, so it
shouldn't be any stricter than that result. The previous behavior
could result in dropping FMF via other folds in instcombine or CSE.

Differential Revision: https://reviews.llvm.org/D43398

llvm-svn: 326098
2018-02-26 16:02:45 +00:00
Renato Golin 9d1b2acaaa [LV] Move isLegalMasked* functions from Legality to CostModel
All SIMD architectures can emulate masked load/store/gather/scatter
through element-wise condition check, scalar load/store, and
insert/extract. Therefore, bailing out of vectorization as legality
failure, when they return false, is incorrect. We should proceed to cost
model and determine profitability.

This patch is to address the vectorizer's architectural limitation
described above. As such, I tried to keep the cost model and
vectorize/don't-vectorize behavior nearly unchanged. Cost model tuning
should be done separately.

Please see
http://lists.llvm.org/pipermail/llvm-dev/2018-January/120164.html for
RFC and the discussions.

Closes D43208.

Patch by: Hideki Saito <hideki.saito@intel.com>

llvm-svn: 326079
2018-02-26 11:06:36 +00:00
Florian Hahn ed45836253 [LoopInterchange] Add test case for D43236.
llvm-svn: 326078
2018-02-26 10:46:25 +00:00
Craig Topper aee341ef28 [InstSimplify] Add test cases for removal of vector fabs on known positive.
llvm-svn: 326050
2018-02-25 06:51:52 +00:00
Craig Topper 2b8f051aaa [InstSimplify] Remove unused parameter from test cases.
llvm-svn: 326049
2018-02-25 06:51:51 +00:00
Adam Nemet e4e1de60aa Revert "StructurizeCFG: Test for branch divergence correctly"
This reverts commit r325881.

Breaks many bots

llvm-svn: 326037
2018-02-24 17:29:09 +00:00
Sanjay Patel db53d1847b [InstSimplify] sqrt(X) * sqrt(X) --> X
This was misplaced in InstCombine. We can loosen the FMF as a follow-up step.

llvm-svn: 325965
2018-02-23 22:20:13 +00:00
Sanjay Patel d32104e1b2 [InstCombine] allow fmul-sqrt folds with less than full -ffast-math
Also, add a Builder method for intrinsics to reduce code duplication for clients.

llvm-svn: 325960
2018-02-23 21:16:12 +00:00
Matt Davis 708271849a [Test] Fix the test to output to /dev/null instead of redirecting.
The redirection was confusing the windows build machine.

llvm-svn: 325937
2018-02-23 19:03:04 +00:00
Matt Davis 523c656e25 [Debug] Add dbg.value intrinsics for PHIs created during LCSSA.
Summary:
This patch is an enhancement to propagate dbg.value information when Phis are created on behalf of LCSSA.
I noticed a case where a value carried across a loop was reported as <optimized out>.

Specifically this case:
```
int bar(int x, int y) {
  return x + y;
}

int foo(int size) {
  int val = 0;
  for (int i = 0; i < size; ++i) {
    val = bar(val, i);  // Both val and i are correct
  }
  return val; // <optimized out>
}
```

In the above case, after all of the interesting computation completes our value
is reported as "optimized out." This change will add a dbg.value to correct this.

This patch also moves the dbg.value insertion routine from LoopRotation.cpp 
into Local.cpp, so that we can share it in both places (LoopRotation and LCSSA).

Reviewers: mzolotukhin, aprantl, vsk, davide

Reviewed By: aprantl, vsk

Subscribers: dberlin, llvm-commits

Differential Revision: https://reviews.llvm.org/D42551

llvm-svn: 325926
2018-02-23 17:38:27 +00:00
Nicolai Haehnle 43c1115cd4 StructurizeCFG: Test for branch divergence correctly
Summary:
This fixes cases like the new test @nonuniform. In that test, %cc itself
is a uniform value; however, when reading it after the end of the loop in
basic block %if, its value is effectively non-uniform.

This problem was encountered in
https://bugs.freedesktop.org/show_bug.cgi?id=103743; however, this change
in itself is not sufficient to fix that bug, as there is another issue
in the AMDGPU backend.

Change-Id: I32bbffece4a32f686fab54964dae1a5dd72949d4

Reviewers: arsenm, rampitec, jlebar

Subscribers: wdng, tpr, llvm-commits

Differential Revision: https://reviews.llvm.org/D40546

llvm-svn: 325881
2018-02-23 10:45:46 +00:00
Bjorn Steinbrink 983d6c3f18 Mark MergedLoadStoreMotion as not preserving MemDep results
Summary:
MemDep caches results that signify that a dependence is non-local, and
there is currently no way to invalidate such cache entries.
Unfortunately, when MLSM sinks a store that can result in a non-local
dependence becoming a local one, and then MemDep gives wrong answers.
The easiest way out here is to just say that MLSM does indeed not
preserve MemDep results.

Reviewers: davide, Gerolf

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D43177

llvm-svn: 325880
2018-02-23 10:41:57 +00:00
Daniel Neilson 20c9207be3 [AlignmentFromAssumptions] Set source and dest alignments of memory intrinsiscs separately
Summary:
This change is part of step five in the series of changes to remove alignment argument from
memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the
AlignmentFromAssumptions pass to cease using the old getAlignment()/setAlignment API of
MemoryIntrinsic in favour of getting/setting source & dest specific alignments through
the new API. This allows us to simplify some of the code in this pass and also be more
aggressive about setting the source and destination alignments separately.

Steps:
Step 1) Remove alignment parameter and create alignment parameter attributes for
memcpy/memmove/memset. ( rL322965, rC322964, rL322963 )
Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing
source and dest alignments. ( rL323597 )
Step 3) Update Clang to use the new IRBuilder API. ( rC323617 )
Step 4) Update Polly to use the new IRBuilder API. ( rL323618 )
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API,
and those that use use MemIntrinsicInst::[get|set]Alignment() to use [get|set]DestAlignment()
and [get|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273, rL324278,
rL324384, rL324395, rL324402, rL324626, rL324642, rL324653, rL324654, rL324773, rL324774,
rL324781, rL324784, rL324955, rL324960 )
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the
MemIntrinsicInst::[get|set]Alignment() methods.

Reference
   http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html
   http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html

Reviewers: hfinkel, bollu, reames

Reviewed By: reames

Subscribers: reames, llvm-commits

Differential Revision: https://reviews.llvm.org/D43081

llvm-svn: 325816
2018-02-22 18:55:59 +00:00
Luke Cheeseman 6c1e6bbe0c [FunctionAttrs][ArgumentPromotion][GlobalOpt] Disable some optimisations passes for naked functions
- Fix for bug 36078.
- Prevent the functionattrs, function-attrs, globalopt and argpromotion passes
  from changing naked functions.
- These passes can perform some alterations to the functions that should not be
  applied. An example is removing parameters that are seemingly not used because
  they are only referenced in the inline assembly. Another example is marking
  the function as fastcc.

llvm-svn: 325788
2018-02-22 14:42:08 +00:00
Sanjay Patel 92b7371113 [InstCombine] add fmul multi-use test; NFC
Also, rename tests to make their intent clearer.

llvm-svn: 325785
2018-02-22 14:27:16 +00:00
Simon Pilgrim 864949d5e9 [SLPVectorizer][X86] Add load extend tests (PR36091)
llvm-svn: 325772
2018-02-22 12:19:34 +00:00
Sanjay Patel 9befaeb582 [InstCombine] add some random FMF to tests so we know it's not dropped; NFC
llvm-svn: 325734
2018-02-21 22:48:28 +00:00
Sanjay Patel d53da082a0 [AArch64] fix IR names to not be 'tmp' because that gives the CHECK script problems
llvm-svn: 325718
2018-02-21 20:48:14 +00:00
Sanjay Patel ffe51e450f [AArch64] add SLP test for matmul (PR36280); NFC
This is a slight reduction of one of the benchmarks
that suffered with D43079. Cost model changes should
not cause this test to remain scalarized.

llvm-svn: 325717
2018-02-21 20:34:16 +00:00
Alexey Bataev 650f639d33 [LV] Fix test checks, NFC
llvm-svn: 325699
2018-02-21 16:48:23 +00:00
Alexey Bataev cdd0675ddc [SLP] Fix test checks, NFC.
llvm-svn: 325689
2018-02-21 15:32:58 +00:00
Silviu Baranga 10ad93c6bf [SCEV] Temporarily disable loop versioning for the purpose
of turning SCEVUnknowns of PHIs into AddRecExprs.

This feature is now hidden behind the -scev-version-unknown flag.

Fixes PR36032 and PR35432.

llvm-svn: 325687
2018-02-21 15:20:32 +00:00
Vedant Kumar 56492f9177 [BDCE] Salvage debug info from dying insts
This results in 15 additional unique source variables in a stage2 build
of FileCheck (at '-Os -g'), with a negligible increase in the size of
the .debug_loc section.

llvm-svn: 325660
2018-02-21 01:55:33 +00:00
Sanjay Patel e6143904b9 revert r325515: [TTI CostModel] change default cost of FP ops to 1 (PR36280)
There are too many perf regressions resulting from this, so we need to 
investigate (and add tests for) targets like ARM and AArch64 before 
trying to reinstate.

llvm-svn: 325658
2018-02-21 01:42:52 +00:00
Sanjay Patel 6f716a7c5e [InstCombine] C / -X --> -C / X
We already do this in DAGCombiner, but it should
also be good to eliminate the fsub use in IR.

This is similar to rL325648.

llvm-svn: 325649
2018-02-21 00:01:45 +00:00
Sanjay Patel d8dd0151fc [InstCombine] -X / C --> X / -C for FP
We already do this in DAGCombiner, but it should 
also be good to eliminate the fsub use in IR.

llvm-svn: 325648
2018-02-20 23:51:16 +00:00
Sanjay Patel 8357371861 [InstCombine] add tests for fdiv with negated op and constant op; NFC
llvm-svn: 325644
2018-02-20 23:34:43 +00:00
Sanjay Patel 3e569ac0cc [PatternMatch] allow vector matches with m_FNeg
llvm-svn: 325642
2018-02-20 23:29:05 +00:00
Sanjoy Das 737fa40ffa [DSE] Don't DSE stores that subsequent memmove calls read from
Summary:
We used to remove the first memmove in cases like this:

  memmove(p, p+2, 8);
  memmove(p, p+2, 8);

which is incorrect.  Fix this by changing isPossibleSelfRead to what was most
likely the intended behavior.

Historical note: the buggy code was added in https://reviews.llvm.org/rL120974
to address PR8728.

Reviewers: rsmith

Subscribers: mcrosier, llvm-commits, jlebar

Differential Revision: https://reviews.llvm.org/D43425

llvm-svn: 325641
2018-02-20 23:19:34 +00:00
Sanjay Patel 4f65e0d008 [InstCombine] auto-generate full checks; NFC
llvm-svn: 325639
2018-02-20 23:08:47 +00:00
Sanjay Patel 088f4690f5 [InstCombine] add test for vector -X/-Y; NFC
m_FNeg doesn't match vector types.

llvm-svn: 325637
2018-02-20 22:46:38 +00:00
Benjamin Kramer 1516dd70bb Fix broken test from r325630.
llvm-svn: 325634
2018-02-20 22:30:16 +00:00
Benjamin Kramer fd0630665b [MemoryBuiltins] Check nobuiltin status when identifying calls to free.
This is usually not a problem because this code's main purpose is
eliminating unused new/delete pairs. We got deletes of nullptr or
nobuiltin deletes of builtin new wrong though.

llvm-svn: 325630
2018-02-20 22:00:33 +00:00
Sanjay Patel e29caaa9c5 [PatternMatch] enhance m_SignMask() to ignore undef elements in vectors
llvm-svn: 325623
2018-02-20 21:02:40 +00:00
Sanjay Patel ff7b777bbe [InstSimplify] add tests for m_SignMask with undef vector elements; NFC
llvm-svn: 325622
2018-02-20 20:53:35 +00:00
Alexey Bataev 42bcec7d38 [LV] Fix test checks, NFC.
llvm-svn: 325617
2018-02-20 19:49:25 +00:00
Alexey Bataev 47dfd249f0 [SLP] Fix tests checks, NFC.
llvm-svn: 325605
2018-02-20 18:11:50 +00:00
Sanjay Patel 90f4c8ec29 [InstCombine] fold fdiv with non-splat divisor to fmul: X/C --> X * (1/C)
llvm-svn: 325590
2018-02-20 16:08:15 +00:00
Sanjay Patel 1d14779aed [InstCombine] allow fdiv with constant dividend folds with less than full -ffast-math
It's possible that we could allow this either 'arcp' or 'reassoc' alone, but this
should be conservatively better than what we have right now. GCC allows this with
only -freciprocal-math.

The last test is changed to show a case that is expected to fold, but we need D43398.

llvm-svn: 325533
2018-02-19 21:46:52 +00:00
Sanjay Patel e82cc6fcc5 [InstCombine] move fdiv tests; NFC
Also, use vector constants just to prove that already works.

llvm-svn: 325530
2018-02-19 21:13:39 +00:00
Sanjay Patel 3e8a76abfd [TTI CostModel] change default cost of FP ops to 1 (PR36280)
This change was mentioned at least as far back as:
https://bugs.llvm.org/show_bug.cgi?id=26837#c26
...and I found a real program that is harmed by this: 
Himeno running on AMD Jaguar gets 6% slower with SLP vectorization:
https://bugs.llvm.org/show_bug.cgi?id=36280
...but the change here appears to solve that bug only accidentally.

The div/rem costs for x86 look very wrong in some cases, but that's already true, 
so we can fix those in follow-up patches. There's also evidence that more cost model
changes are needed to solve SLP problems as shown in D42981, but that's an independent 
problem (though the solution may be adjusted after this change is made).

Differential Revision: https://reviews.llvm.org/D43079

llvm-svn: 325515
2018-02-19 16:11:44 +00:00
Ivan A. Kosarev f03f579d1d [Transforms] Propagate new-format TBAA tags on simplification of memory-transfer intrinsics
With this patch in place, when a new-format TBAA tag is available
for a memory-transfer intrinsic call, we prefer propagating that
new-format tag. Otherwise, we fallback to the old approach where
we try to construct a proper TBAA access tag from 'tbaa.struct'
metadata.

Differential Revision: https://reviews.llvm.org/D41543

llvm-svn: 325488
2018-02-19 12:10:20 +00:00
Sanjay Patel adf6e88c74 [PatternMatch, InstSimplify] enhance m_AllOnes() to ignore undef elements in vectors
Loosening the matcher definition reveals a subtle bug in InstSimplify (we should not
assume that because an operand constant matches that it's safe to return it as a result).

So I'm making that change here too (that diff could be independent, but I'm not sure how 
to reveal it before the matcher change).

This also seems like a good reason to *not* include matchers that capture the value.
We don't want to encourage the potential misstep of propagating undef values when it's
not allowed/intended.

I didn't include the capture variant option here or in the related rL325437 (m_One), 
but it already exists for other constant matchers.

llvm-svn: 325466
2018-02-18 18:05:08 +00:00
Sanjay Patel 7faceaed31 [InstSimplify] add tests with vector undef elts; NFC
llvm-svn: 325465
2018-02-18 17:39:09 +00:00
Sanjay Patel f569578373 [PatternMatch] enhance m_One() to ignore undef elements in vectors
llvm-svn: 325437
2018-02-17 16:00:42 +00:00
Sanjay Patel a6a1426cf1 [InstSimplify, InstCombine] add tests with vector undef elts; NFC
These would fold if the m_One pattern matcher accounted for undef elts.

llvm-svn: 325436
2018-02-17 15:55:40 +00:00
Sanjay Patel 841ca95219 [InstSimplify] add vector select tests with undef elts in condition; NFC
llvm-svn: 325419
2018-02-17 01:18:53 +00:00
Sanjay Patel 870fbda805 [InstCombine] add FMF to better show current fdiv fold behavior; NFC
llvm-svn: 325365
2018-02-16 17:46:50 +00:00
Eugene Leviant 8c83b9b8c5 [ThinLTO] Fix data race in test #2
Switched to the right option (-thinlto-threads)

llvm-svn: 325362
2018-02-16 17:25:03 +00:00
Eugene Leviant c9724d9149 [ThinLTO] Fix data race in test
llvm-svn: 325361
2018-02-16 16:56:33 +00:00
Brian M. Rzycki f1a7df5ef2 [JumpThreading] PR36133 enable/disable DominatorTree for LVI analysis
Summary:
The LazyValueInfo pass caches a copy of the DominatorTree when available.
Whenever there are pending DominatorTree updates within JumpThreading's
DeferredDominance object we cannot use the cached DT for LVI analysis.
This commit adds the new methods enableDT() and disableDT() to LVI.
JumpThreading also sets the appropriate usage model before calling LVI
analysis methods.

Fixes https://bugs.llvm.org/show_bug.cgi?id=36133

Reviewers: sebpop, dberlin, kuhar

Reviewed by: sebpop, kuhar

Subscribers: uabelho, llvm-commits, aprantl, hiraditya, a.elovikov

Differential Revision: https://reviews.llvm.org/D42717

llvm-svn: 325356
2018-02-16 16:35:17 +00:00
Ivan A. Kosarev 53270d0fa6 [Transforms] Propagate TBAA info in SROA
Now that we have the new TBAA metadata format that is capable of
representing accesses to aggregates, we can propagate TBAA access
tags from memory setting and transferring intrinsics to load and
store instructions and vice versa.

Since SROA produces lots of new loads and stores on optimized
builds, this change significantly decreases the share of
undecorated memory accesses on such builds.

Differential Revision: https://reviews.llvm.org/D41563

llvm-svn: 325329
2018-02-16 10:10:29 +00:00
Eugene Leviant 7331a0bf1c [ThinLTO] Import global variables
Differential revision: https://reviews.llvm.org/D43077

llvm-svn: 325320
2018-02-16 08:11:04 +00:00
Vedant Kumar 3dc6de619a Remove brittle check lines from a test, NFC
llvm-svn: 325310
2018-02-16 01:21:01 +00:00
Vedant Kumar 616fdb00df [GVN] Partially revert debug info salvage change (r325063)
In r325063, we salvaged debug values from dying instructions in
GVN::processBlock() and GVN::performScalarPRE().

The change in performScalarPRE(), while correct, is unhelpful. It
introduced a call to salvageDebugInfo() which was immediately followed
by a RAUW, meaning it prevented the RAUW from efficiently updating
dbg.value intrinsics.  This commit reverts the mistake and tightens up
the affected test case.

llvm-svn: 325308
2018-02-16 01:15:20 +00:00
Vedant Kumar 1df820ecd7 [DCE] Salvage debug info from dead insts
This results in small increases in the size of the .debug_loc section
and the number of unique source variables in a stage2 build of opt.

llvm-svn: 325301
2018-02-15 22:26:18 +00:00
Brian Gesiak a5e3675bd3 [Coroutines] Don't move stores for allocator args
Summary:
The behavior described in Coroutines TS `[dcl.fct.def.coroutine]/7`
allows coroutine parameters to be passed into allocator functions.
The instructions to store values into the alloca'd parameters must not
be moved past the frame allocation, otherwise uninitialized values are
passed to the allocator.

Test Plan: `check-llvm`

Reviewers: rsmith, GorNishanov, eric_niebler

Reviewed By: GorNishanov

Subscribers: compnerd, EricWF, llvm-commits

Differential Revision: https://reviews.llvm.org/D43000

llvm-svn: 325285
2018-02-15 19:31:45 +00:00
Vedant Kumar 24953dc876 [SCCP] Test that constant propagation updates debug info, NFC
This extends an existing test to check that SCCP updates the operands of
relevant dbg.value instructions as it does its work.

llvm-svn: 325281
2018-02-15 19:13:04 +00:00
Alexey Bataev 862c476fc2 [SLP] Fix the test for the reversed stores, NFC.
llvm-svn: 325268
2018-02-15 17:11:50 +00:00
Alexey Bataev ac619599d8 [SLP] Added test for reversed stores, NFC.
llvm-svn: 325265
2018-02-15 16:56:49 +00:00
Sanjay Patel 9174416e89 [InstCombine] test fdiv folds better; NFC
We had redundant tests, but no tests for extra uses or vectors.
'fast' is an overly conservative requirement for these folds.

llvm-svn: 325262
2018-02-15 16:28:15 +00:00
Sanjay Patel 339b4d338d [InstCombine] allow sin/cos transforms with 'reassoc'
The variable name 'AllowReassociate' is a lie at this point because
it's set to 'isFast()' which is more than the 'reassoc' FMF after
rL317488.

In D41286, we showed that this transform may be valid even with strict
math by brute force checking every 32-bit float result.

There's a potential problem here because we're replacing with a tan()
libcall rather than a hypothetical LLVM tan intrinsic. So we might
set errno when we should be guaranteed not to do that. But that's
independent of this change.

llvm-svn: 325247
2018-02-15 15:07:12 +00:00
Sanjay Patel 6a0f667077 [InstCombine] allow X / C -> X * (1.0/C) for vector splat FP constants
llvm-svn: 325237
2018-02-15 13:55:52 +00:00
Max Kazantsev 6e4ce23add [NFC] Fix metadata placement in test
llvm-svn: 325215
2018-02-15 07:13:18 +00:00
Max Kazantsev c5941d12f4 [SCEV] Favor isKnownViaSimpleReasoning over constant ranges check
There is a more powerful but still simple function `isKnownViaSimpleReasoning ` that
does constant range check and few more additional checks. We use it some places (e.g.
when proving implications) and in some other places we only check constant ranges.

Currently, indvar simplifier fails to remove the check in following loop:

  int inc = ...;
  for (int i = inc, j = inc - 1; i < 200; ++i, ++j)
    if (i > j) { ... }

This patch replaces all usages of `isKnownPredicateViaConstantRanges` with
`isKnownViaSimpleReasoning` to have smarter proofs. In particular, it fixes the
case above.

Reviewed-By: sanjoy
Differential Revision: https://reviews.llvm.org/D43175

llvm-svn: 325214
2018-02-15 07:09:00 +00:00
Sanjay Patel af5d499cb9 [InstCombine] add tests and comments for fdiv X, C; NFC
llvm-svn: 325161
2018-02-14 19:54:51 +00:00
Craig Topper 1c19cc1745 [InstCombine] Don't fold select(C, Z, binop(select(C, X, Y), W)) -> select(C, Z, binop(Y, W)) if the binop is rem or div.
The select may have been preventing a division by zero or INT_MIN/-1 so removing it might not be safe.

Fixes PR36362.

Differential Revision: https://reviews.llvm.org/D43276

llvm-svn: 325148
2018-02-14 18:08:33 +00:00
Sanjay Patel 48f671bb27 [InstCombine] regenerate checks; NFC
llvm-svn: 325144
2018-02-14 17:37:32 +00:00
Alexey Bataev 7f246e003a [SLP] Allow vectorization of reversed loads.
Summary:
Reversed loads are handled as gathering. But we can just reshuffle
these values. Patch adds support for vectorization of reversed loads.

Reviewers: RKSimon, spatel, mkuper, hfinkel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D43022

llvm-svn: 325134
2018-02-14 15:29:15 +00:00
Florian Hahn b4e3bad89b Recommit r325001: [CallSiteSplitting] Support splitting of blocks with instrs before call.
For basic blocks with instructions between the beginning of the block
and a call we have to duplicate the instructions before the call in all
split blocks and add PHI nodes for uses of the duplicated instructions
after the call.

Currently, the threshold for the number of instructions before a call
is quite low, to keep the impact on binary size low.

Reviewers: junbuml, mcrosier, davidxl, davide

Reviewed By: junbuml

Differential Revision: https://reviews.llvm.org/D41860

llvm-svn: 325126
2018-02-14 13:59:12 +00:00
Florian Hahn c6296fea3f [LoopInterchange] Incrementally update the dominator tree.
We can use incremental dominator tree updates to avoid re-calculating
the dominator tree after interchanging 2 loops.

Reviewers: dmgreen, kuhar

Reviewed By: kuhar

Differential Revision: https://reviews.llvm.org/D43176

llvm-svn: 325122
2018-02-14 13:13:15 +00:00
Petar Jovanovic 1768957c82 [Utils] Salvage the debug info of DCE'ed 'and' instructions
Preserve debug info from a dead 'and' instruction with a constant.

Patch by Djordje Todorovic.

Differential Revision: https://reviews.llvm.org/D43163

llvm-svn: 325119
2018-02-14 13:10:35 +00:00
Elena Demikhovsky 945b7e5aa6 Adding a width of the GEP index to the Data Layout.
Making a width of GEP Index, which is used for address calculation, to be one of the pointer properties in the Data Layout.
p[address space]:size:memory_size:alignment:pref_alignment:index_size_in_bits.
The index size parameter is optional, if not specified, it is equal to the pointer size.

Till now, the InstCombiner normalized GEPs and extended the Index operand to the pointer width.
It works fine if you can convert pointer to integer for address calculation and all registered targets do this.
But some ISAs have very restricted instruction set for the pointer calculation. During discussions were desided to retrieve information for GEP index from the Data Layout.
http://lists.llvm.org/pipermail/llvm-dev/2018-January/120416.html

I added an interface to the Data Layout and I changed the InstCombiner and some other passes to take the Index width into account.
This change does not affect any in-tree target. I added tests to cover data layouts with explicitly specified index size.

Differential Revision: https://reviews.llvm.org/D42123

llvm-svn: 325102
2018-02-14 06:58:08 +00:00
Sanjay Patel 3ce76ad26f [InstCombine] put tests of mul with neg operand(s) together; NFC
llvm-svn: 325066
2018-02-13 23:02:12 +00:00
Vedant Kumar 1d5d31b706 [GVN] Salvage debug info from dead insts
This preserves an additional 581 unique source variables in a stage2
build of clang (according to `llvm-dwarfdump --statistics`). It
increases the size of the .debug_loc section by 0.1% (or 87139 bytes).

Differential Revision: https://reviews.llvm.org/D43255

llvm-svn: 325063
2018-02-13 22:27:17 +00:00
Sanjay Patel 7558d860af [InstCombine] (lshr X, 31) * Y --> (ashr X, 31) & Y
This replaces the bit-tracking based fold that did the same thing,
but it only worked for scalars and not directly. 

There is no evidence in existing regression tests that the greater 
power of bit-tracking was needed here, but we should be aware of 
this potential loss of optimization.

llvm-svn: 325062
2018-02-13 22:24:37 +00:00
Sanjay Patel fdb3b036cc [InstCombine] add vector tests, fix comments; NFC
The scalar folds are done indirectly and use potentially
expensive value tracking calls. That can be improved
along with the enhancement to support vector types.

llvm-svn: 325051
2018-02-13 21:19:42 +00:00
Sanjay Patel cb8ac00f73 [InstCombine] (bool X) * Y --> X ? Y : 0
This is both a functional improvement for vectors and an
efficiency improvement for scalars. The existing code below
the new folds does the same thing for scalars, but in an 
indirect and expensive way.

llvm-svn: 325048
2018-02-13 20:41:22 +00:00
Sanjay Patel 2e2958497f [InstCombine] fix test comment and add vector test; NFC
llvm-svn: 325039
2018-02-13 18:48:27 +00:00
Sanjay Patel b13fcd52ed [InstCombine, InstSimplify] (re)move tests, regenerate checks; NFC
The InstCombine integer mul test file had tests that belong in InstSimplify 
(including fmul tests). Move things to where they belong and auto-generate
complete checks for everything.

llvm-svn: 325037
2018-02-13 18:22:53 +00:00
Vedant Kumar 35fc103e1e [DeadStoreElimination] Salvage debug info from dead insts
According to `llvm-dwarfdump --statistics` this salvages 43 additional
unique source variables in a stage2 build of clang. It increases the
size of the .debug_loc section by 0.002% (or 2864 bytes).

Differential Revision: https://reviews.llvm.org/D43220

llvm-svn: 325035
2018-02-13 18:15:26 +00:00
Yaxun Liu 0124b5484c [AMDGPU] Change constant addr space to 4
Differential Revision: https://reviews.llvm.org/D43170

llvm-svn: 325030
2018-02-13 18:00:25 +00:00
Florian Hahn 35d744d388 Revert r325001: [CallSiteSplitting] Support splitting of blocks with instrs before call.
Due to memsan not being happy with the array of ValueToValue maps.

llvm-svn: 325009
2018-02-13 14:48:39 +00:00
Ivan A. Kosarev 4a381b444e [IR] Fix creating mutable versions of TBAA access tags
Due to a typo in D41565, mutable TBAA tags created with
createMutableTBAAAccessTag() lose their base types. This patch
fixes that typo and updates tests respectively.

Differential Revision: https://reviews.llvm.org/D42364

llvm-svn: 325008
2018-02-13 14:44:25 +00:00
Florian Hahn b0884b6443 [CallSiteSplitting] Support splitting of blocks with instrs before call.
For basic blocks with instructions between the beginning of the block
and a call we have to duplicate the instructions before the call in all
split blocks and add PHI nodes for uses of the duplicated instructions
after the call.

Currently, the threshold for the number of instructions before a call
is quite low, to keep the impact on binary size low.

Reviewers: junbuml, mcrosier, davidxl, davide

Reviewed By: junbuml

Differential Revision: https://reviews.llvm.org/D41860

llvm-svn: 325001
2018-02-13 12:00:48 +00:00
Florian Hahn 1f95ef1815 [LoopInterchange] Check number of latch successors before accessing them.
In cases where the OuterMostLoopLatchBI only has a single successor,
accessing the second successor will fail.

This fixes a failure when building the test-suite with loop-interchange
enabled.

Reviewers: mcrosier, karthikthecool, davide

Reviewed by: karthikthecool

Differential Revision: https://reviews.llvm.org/D42906

llvm-svn: 324994
2018-02-13 10:02:52 +00:00
Vedant Kumar 388fac5de6 [Utils] Salvage debug info from all no-op casts
We already try to salvage debug values from no-op bitcasts and inttoptr
instructions: we should handle ptrtoint instructions as well.

This saves an additional 24,444 debug values in a stage2 build of clang,
and (according to llvm-dwarfdump --statistics) provides an additional
289 unique source variables.

llvm-svn: 324982
2018-02-13 03:34:23 +00:00