Commit Graph

7781 Commits

Author SHA1 Message Date
Sanjay Patel 8e68394376 [InstCombine] update test to use FileCheck; NFC
llvm-svn: 286668
2016-11-11 23:12:46 +00:00
Adam Nemet 9bfbf8bbdf [LV] Stop saying "use -Rpass-analysis=loop-vectorize"
This is PR28376.

Unfortunately given the current structure of optimization diagnostics we
lack the capability to tell whether the user has
passed -Rpass-analysis=loop-vectorize since this is local to the
front-end (BackendConsumer::OptimizationRemarkHandler).

So rather than printing this even if the user has already
passed -Rpass-analysis, this patch just punts and stops recommending
this option.  I don't think that getting this right is worth the
complexity.

Differential Revision: https://reviews.llvm.org/D26563

llvm-svn: 286662
2016-11-11 22:51:46 +00:00
Evgeniy Stepanov 1fe189d795 [cfi] Fix weak functions handling.
When a function pointer is replaced with a jumptable pointer, special
case is needed to preserve the semantics of extern_weak functions.
Since a jumptable entry can not be extern_weak, we emulate that
behaviour by replacing all references to F (the extern_weak function)
with the following expression: F != nullptr ? JumpTablePtr : nullptr.

Extra special care is needed for global initializers, since most (or
probably all) backends can not lower an initializer that includes
this kind of constant expression. Initializers like that are replaced
with a global constructor (i.e. a runtime initializer).

llvm-svn: 286636
2016-11-11 21:39:26 +00:00
Vyacheslav Klochkov f1a12fe0f5 Fixed the lost FastMathFlags for FCmp operations in SLPVectorizer.
Reviewer: Michael Zolotukhin.
Differential Revision: https://reviews.llvm.org/D26543

llvm-svn: 286626
2016-11-11 19:55:29 +00:00
Sanjay Patel da0149dd74 [InstCombine] add tests to show size-increasing select transforms
llvm-svn: 286619
2016-11-11 19:37:54 +00:00
Evgeniy Stepanov f48ffab554 [cfi] Implement cfi-icall using inline assembly.
The current implementation is emitting a global constant that happens
to evaluate to the same bytes + relocation as a jump instruction on
X86. This does not work for PIE executables and shared libraries
though, because we end up with a wrong relocation type. And it has no
chance of working on ARM/AArch64 which use different relocation types
for jump instructions (R_ARM_JUMP24) that is never generated for
data.

This change replaces the constant with module-level inline assembly
followed by a hidden declaration of the jump table. Works fine for
ARM/AArch64, but has some drawbacks.
* Extra symbols are added to the static symbol table, which inflate
the size of the unstripped binary a little. Stripped binaries are not
affected. This happens because jump table declarations must be
external (because their body is in the inline asm).
* Original functions that were anonymous are now named
<original name>.cfi, and it affects symbolization sometimes. This is
necessary because the only user of these functions is the (inline
asm) jump table, so they had to be added to @llvm.used, which does
not allow unnamed functions.

llvm-svn: 286611
2016-11-11 18:49:09 +00:00
Adam Nemet 7da20c39ee [OptDiag] Remove non-printable chars from function name
The r283656 did this in the remark arguments.  We also need to do this
in the main function attribute as that is written to YAML as well.

llvm-svn: 286482
2016-11-10 17:47:03 +00:00
Sanjay Patel 40d33e7554 [InstCombine] auto-generate better checks; NFC
Note that the existing metadata checking was re-added by hand because the 
script doesn't currently know how to generate checks for lines outside of 
functions.

llvm-svn: 286460
2016-11-10 14:58:17 +00:00
Sanjay Patel 4e1b5a53c7 [InstCombine] avoid infinite loop from shuffle-extract-insert sequence (PR30923)
Removing the limitation in visitInsertElementInst() causes several regressions
because we're not prepared to fold sequences of shuffles or inserts and extracts
separated by shuffles. Fixing that appears to be a difficult mission because we
are purposely trying to avoid creating shuffles with arbitrary shuffle masks
because some targets may choke on those.

https://llvm.org/bugs/show_bug.cgi?id=30923

llvm-svn: 286423
2016-11-10 00:15:14 +00:00
Dehao Chen 06e079a530 Update vectorization debug info unittest.
Summary:
The change will test the change in r286159.
The idea behind the change: Make the dbg location different between loop header and preheader/exit. Originally, dbg location 21 exists in 3 BBs: preheader, header, critical edge (exit). Update the debug location of inside the loop header from !21 to !22 so that it will reflect the correct location.

Reviewers: probinson

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26428

llvm-svn: 286403
2016-11-09 22:25:19 +00:00
Sanjay Patel 600631daf3 [InstCombine] regenerate checks; NFC
llvm-svn: 286402
2016-11-09 22:21:58 +00:00
Sanjay Patel 16da6c466f [InstCombine] regenerate checks; NFC
llvm-svn: 286399
2016-11-09 21:41:34 +00:00
Alexandros Lamprineas 0ee3ec2fe4 [ARM] Loop Strength Reduction crashes when targeting ARM or Thumb.
Scalar Evolution asserts when not all the operands of an Add Recurrence
Expression are loop invariants. Loop Strength Reduction should only
create affine Add Recurrences, so that both the start and the step of
the expression are loop invariants.

Differential Revision: https://reviews.llvm.org/D26185

llvm-svn: 286347
2016-11-09 08:53:07 +00:00
Dehao Chen 947dbe1254 Enable Loop Sink pass for functions that has profile.
Summary: For functions with profile data, we are confident that loop sink will be optimal in sinking code.

Reviewers: davidxl, hfinkel

Subscribers: mehdi_amini, mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D26155

llvm-svn: 286325
2016-11-09 00:58:19 +00:00
Sanjay Patel 4e9d6cd354 [InstCombine] fix profitability equation for max-of-nots transform
As the test change shows, we can increase the critical path by adding
a 'not' instruction, so make sure that we're actually removing an
instruction if we do this transform.

This transform could also cause us to miss folds of min/max pairs.

llvm-svn: 286315
2016-11-09 00:13:11 +00:00
Davide Italiano 1e77aaca8a [LibcallsShrinkWrap] This pass doesn't preserve the CFG.
For example, it invalidates the domtree, causing assertions
in later passes which need dominator infos. Make it preserve
GlobalsAA, as suggested by Eli.

Differential Revision:  https://reviews.llvm.org/D26381

llvm-svn: 286271
2016-11-08 19:18:20 +00:00
Sanjay Patel 8625c43662 [InstCombine] move min/max tests to min/max test file; NFC
llvm-svn: 286256
2016-11-08 18:12:19 +00:00
Sanjay Patel 686cf49f7a [InstCombine] update checks; NFC
llvm-svn: 286255
2016-11-08 18:06:14 +00:00
Pablo Barrio 9f45254138 [JumpThreading] Unfold selects that depend on the same condition
Summary:
These are good candidates for jump threading. This enables later opts
(such as InstCombine) to combine instructions from the selects with
instructions out of the selects. SimplifyCFG will fold the select
again if unfolding wasn't worth it.

Patch by James Molloy and Pablo Barrio.

Reviewers: rengolin, haicheng, sebpop

Subscribers: jojo, jmolloy, llvm-commits

Differential Revision: https://reviews.llvm.org/D26391

llvm-svn: 286236
2016-11-08 14:53:30 +00:00
Simon Pilgrim d02c55204b [VectorLegalizer] Expansion of CTLZ using CTPOP when possible
This patch avoids scalarization of CTLZ by instead expanding to use CTPOP (ref: "Hacker's Delight") when the necessary operations are available.

This also adds the necessary cost models for X86 SSE2 targets (the main beneficiary) to ensure vectorization only happens when its useful.

Differential Revision: https://reviews.llvm.org/D25910

llvm-svn: 286233
2016-11-08 14:10:28 +00:00
Adam Nemet b103fc52d3 [OptDiag, opt-viewer] Save callee's location and display as link
With this we get a new field in the YAML record if the value being
streamed out has a debug location.  For examples, please see the changes
to the tests.

This is then used in opt-viewer to display a link for the callee
function in the inlining remarks.

Differential Revision: https://reviews.llvm.org/D26366

llvm-svn: 286169
2016-11-07 22:41:13 +00:00
Sanjoy Das e06ef141fc Avoid tail recursion elimination across calls with operand bundles
Summary:
In some specific scenarios with well understood operand bundle types
(like `"deopt"`) it may be possible to go ahead and convert recursion to
iteration, but TailRecursionElimination does not have that logic today
so avoid doing the right thing for now.

I need some input on whether `"funclet"` operand bundles should also
block tail recursion elimination.  If not, I'll allow TRE across calls
with `"funclet"` operand bundles and add a test case.

Reviewers: rnk, majnemer, nlewycky, ahatanak

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D26270

llvm-svn: 286147
2016-11-07 21:01:49 +00:00
Benjamin Kramer 1697d39eef [MemCpyOpt] Don't emit IR in an unspecified order
Argument evaluation order is one of the edge cases where Clang differs
from GCC, yielding different IR depending on which compiler LLVM was
built with. Make the order deterministic and tune the test to actually
verify the order instead of trying to hide it.

llvm-svn: 286126
2016-11-07 17:47:28 +00:00
Sanjay Patel 86408a8048 [InstCombine] allow splat vector folds in adjustMinMax() (retry r285732)
This was reverted at r285866 because there was a crash handling a scalar
select of vectors. I added a check for that pattern and a test case based
on the example provided in the post-commit thread for r285732.

llvm-svn: 286113
2016-11-07 15:52:45 +00:00
NAKAMURA Takumi b4eef1fa4a llvm/test/Transforms/DCE/calls-errno.ll: Suppress checking @pow(+0,-1).
It depends on host's pow(3), and mingw's pow doesn't raise any errors, just returns +INF.

llvm-svn: 286005
2016-11-04 18:50:45 +00:00
Greg Bedwell 5fc6f94591 Revert "[InstCombine] allow splat vector folds in adjustMinMax()"
This reverts commit r285732.

This change introduced a new assertion failure in the following
testcase at -O2:

typedef short __v8hi __attribute__((__vector_size__(16)));
__v8hi foo(__v8hi &V1, __v8hi &V2, unsigned mask) {
  __v8hi Result = V1;
  if (mask & 0x80)
    Result[0] = V2[0];
  return Result;
}

llvm-svn: 285866
2016-11-02 23:17:05 +00:00
Eli Friedman b6befc3bc4 DCE math library calls with a constant operand.
On platforms which use -fmath-errno, math libcalls without any uses
require some extra checks to figure out if they are actually dead.

Fixes https://llvm.org/bugs/show_bug.cgi?id=30464 .

Differential Revision: https://reviews.llvm.org/D25970

llvm-svn: 285857
2016-11-02 20:48:11 +00:00
Bjorn Pettersson 7424c8ccd1 [Reassociate] Skip analysis of dead code to avoid infinite loop.
Summary:
It was detected that the reassociate pass could enter an inifite
loop when analysing dead code. Simply skipping to analyse basic
blocks that are dead avoids such problems (and as a side effect
we avoid spending time on optimising dead code).

The solution is using the same Reverse Post Order ordering of the
basic blocks when doing the optimisations, as when building the
precalculated rank map. A nice side-effect of this solution is
that we now know that we only try to do optimisations for blocks
with ranked instructions.

Fixes https://llvm.org/bugs/show_bug.cgi?id=30818

Reviewers: llvm-commits, davide, eli.friedman, mehdi_amini

Subscribers: dberlin

Differential Revision: https://reviews.llvm.org/D26154

llvm-svn: 285793
2016-11-02 08:55:19 +00:00
Sanjay Patel c3d89842ad [InstCombine] allow splat vector folds in adjustMinMax()
llvm-svn: 285732
2016-11-01 20:08:02 +00:00
Sanjay Patel c0339c77ef [InstCombine] Fold nuw left-shifts in `ugt`/`ule` comparisons.
This transforms

%a = shl nuw %x, c1
%b = icmp {ugt|ule} %a, c0

into

%b = icmp {ugt|ule} %x, (c0 >> c1)

z3:

(declare-const x (_ BitVec 64))
(declare-const c0 (_ BitVec 64))
(declare-const c1 (_ BitVec 64))

(push)
(assert (= x (bvlshr (bvshl x c1) c1)))  ; nuw
(assert (not (= (bvugt (bvshl x c1) c0)
                (bvugt x
                       (bvlshr c0 c1)))))
(check-sat)
(get-model)
(pop)

(push)
(assert (= x (bvlshr (bvshl x c1) c1)))  ; nuw
(assert (not (= (bvule (bvshl x c1) c0)
                (bvule x
                       (bvlshr c0 c1)))))
(check-sat)
(get-model)
(pop)

Patch by bryant!

Differential Revision: https://reviews.llvm.org/D25913

llvm-svn: 285729
2016-11-01 19:19:29 +00:00
Sanjay Patel 700193ae05 [InstCombine] add vector tests for ext+adjust min/max
llvm-svn: 285713
2016-11-01 17:34:29 +00:00
Sanjay Patel 586aeee341 [InstCombine] move/fix tests for adjusted min/max
I think the former 'test50' had a typo making it functionally equivalent
to the former 'test49'; changed the predicate to provide more coverage.

llvm-svn: 285706
2016-11-01 16:39:30 +00:00
Sanjay Patel 04ef115ff7 [InstCombine] fix tests for adjusted min/max
1. Delete identical tests
2. Rename tests to reflect actual functionality
3. Add comments
4. Add unsigned variants
5. Add vector variants with FIXME comments
6. Rename test file

llvm-svn: 285699
2016-11-01 15:48:30 +00:00
Simon Pilgrim 6dd8fab443 [InstCombine] Folding of shifts by the sum of positive values
This patch introduces the combine:

(C1 shift (A add C2)) -> ((C1 shift C2) shift A)
iff A and C2 are both positive

If both A and C2 are know to be positive then we can safely split into 2 shifts, permitting the folding of the Inner shift.

Fix for the spec benchmark case mentioned by @nadav on PR15141 (assuming we can prove that the inputs as positive).

Differential Revision: https://reviews.llvm.org/D26000

llvm-svn: 285696
2016-11-01 15:40:30 +00:00
Sanjay Patel 69587324e8 [InstCombine] auto-generate better checks
llvm-svn: 285693
2016-11-01 14:38:30 +00:00
Dorit Nuzman bf2c15b5dc Second attempt at r285517.
llvm-svn: 285568
2016-10-31 13:17:31 +00:00
Dorit Nuzman 06903d16af Revert r285517 due to build failures.
llvm-svn: 285518
2016-10-30 14:34:57 +00:00
Dorit Nuzman 3c1c658f24 [LoopVectorize] Make interleaved-accesses analysis less conservative about
possible pointer-wrap-around concerns, in some cases.

Before this patch, collectConstStridedAccesses (part of interleaved-accesses
analysis) called getPtrStride with [Assume=false, ShouldCheckWrap=true] when
examining all candidate pointers. This is too conservative. Instead, this
patch makes collectConstStridedAccesses use an optimistic approach, calling
getPtrStride with [Assume=true, ShouldCheckWrap=false], and then, once the
candidate interleave groups have been formed, revisits the pointer-wrapping
analysis but only where it matters: namely, in groups that have gaps, and where
the gaps are not at the very end of the group (in which case the loop is
peeled). This second time getPtrStride is called with [Assume=false,
ShouldCheckWrap=true], but this could further be improved to using Assume=true,
once we also add the logic to track that we are not going to meet the scev
runtime checks threshold.

Differential Revision: https://reviews.llvm.org/D25276

llvm-svn: 285517
2016-10-30 12:23:26 +00:00
Sanjay Patel 36eeb6d6f6 [ValueTracking] recognize more variants of smin/smax
Try harder to detect obfuscated min/max patterns: the initial pattern was added with D9352 / rL236202. 
There was a bug fix for PR27137 at rL264996, but I think we can do better by folding the corresponding
smax pattern and commuted variants.

The codegen tests demonstrate the effect of ValueTracking on the backend via SelectionDAGBuilder. We
can't expose these differences minimally in IR because we don't have smin/smax intrinsics for IR.

Differential Revision: https://reviews.llvm.org/D26091

llvm-svn: 285499
2016-10-29 16:21:19 +00:00
Sanjay Patel 978f827d12 [InstCombine] re-use bitcasted compare operands in selects (PR28001)
These mixed bitcast patterns show up with SSE/AVX intrinsics because we bitcast function parameters to <2 x i64>.

The bitcasts obfuscate the expected min/max forms as shown in PR28001:
https://llvm.org/bugs/show_bug.cgi?id=28001#c6

Differential Revision: https://reviews.llvm.org/D25943

llvm-svn: 285495
2016-10-29 15:22:04 +00:00
Justin Lebar 1535a5e9df Add missing lit.local.cfg to llvm/test/Transforms/CodeGenPrepare/NVPTX.
llvm-svn: 285464
2016-10-28 21:56:07 +00:00
Justin Lebar 0ede5fb1bb Don't leave unused divs/rems sitting around in BypassSlowDivision.
Summary:
This "pass" eagerly creates div and rem instructions even when only one
is needed -- it relies on a later pass (machine DCE?) to clean them up.

This is problematic not just from a cleanliness perspective (this pass
is running during CodeGenPrepare, so should leave the IR in a better
state), but it also creates a problem for instruction selection.  If we
always have a div+rem, isel will always select a divrem instruction (if
possible), even when a single div or rem would do.

Specifically, in NVPTX, we want to compute rem from the output of div,
if available.  But if a div is not available, we want to leave the rem
alone.  This transformation is overeager if div is always available.

Because this code runs as part of CodeGenPrepare, it's nontrivial to
write a test for this change.  But this will effectively be tested by
a later patch which adds the aforementioned change to NVPTX isel.

Reviewers: tra

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26088

llvm-svn: 285460
2016-10-28 21:43:54 +00:00
Justin Lebar 468bf73209 Don't claim the udiv created in BypassSlowDivision is exact.
Summary:
In BypassSlowDivision's short-dividend path, we would create e.g.

  udiv exact i32 %a, %b

"exact" here means that we are asserting that %a is a multiple of %b.
But we have no reason to believe this must be true -- this is just a
bug, as far as I can tell.

Reviewers: tra

Subscribers: jholewinski, llvm-commits

Differential Revision: https://reviews.llvm.org/D26097

llvm-svn: 285459
2016-10-28 21:43:51 +00:00
Matt Arsenault ef00283425 SpeculativeExecution: Allow speculating more inst types
Partial step towards removing the whitelist and only
using TTI's cost.

llvm-svn: 285438
2016-10-28 20:00:33 +00:00
Sanjay Patel 19ace1d548 [InstCombine] move/add tests for smin/smax folds
llvm-svn: 285414
2016-10-28 16:54:03 +00:00
Matthew Simpson 9b6755362b [LV] Correct misleading comments in test (NFC)
llvm-svn: 285402
2016-10-28 14:27:45 +00:00
Davide Italiano 631cd27f29 [Reassociate] Removing instructions mutates the IR.
Fixes PR 30784. Discussed with Justin, who pointed out that
in the new PassManager infrastructure we can have more fine-grained
control on which analyses we want to preserve, but this is the
best we can do with the current infrastructure.

llvm-svn: 285380
2016-10-28 02:47:09 +00:00
Davide Italiano 30665147f9 [ConstantFold] Get the correct vector type when folding a getelementptr.
Differential Revision:  https://reviews.llvm.org/D26014

llvm-svn: 285371
2016-10-28 00:53:16 +00:00
Davide Italiano e4146714ca Remove accidentally commited test.
llvm-svn: 285366
2016-10-27 23:40:19 +00:00
Davide Italiano e865a525df [IR] Reintroduce getGEPReturnType(), it will be used in a later patch.
llvm-svn: 285365
2016-10-27 23:38:51 +00:00