Commit Graph

16473 Commits

Author SHA1 Message Date
Sanjay Patel 86408a8048 [InstCombine] allow splat vector folds in adjustMinMax() (retry r285732)
This was reverted at r285866 because there was a crash handling a scalar
select of vectors. I added a check for that pattern and a test case based
on the example provided in the post-commit thread for r285732.

llvm-svn: 286113
2016-11-07 15:52:45 +00:00
Justin Lebar 54b0be048e [LoopStrengthReduce] Don't use a DenseSet<int64_t> when we might add any valid int64_t to the set.
Summary:
SmallSetVector uses DenseSet, but that means we need to reserve some
values for the empty and tombstone keys.

It seems to me we should have a general way to let us store full-range
ints inside of DenseSets, and furthermore that we probably shouldn't
silently let you add ints into DenseSets without explicitly promising
that they're in range.  But that's a battle for another day; for now,
just fix this code, since we currently do something Very Bad when
compiling ffmpeg.

Fixes PR30914.

Reviewers: jeremyhu

Subscribers: llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D26323

llvm-svn: 286038
2016-11-05 16:47:25 +00:00
Chandler Carruth d9ef4b6601 Only log the visit of a return instruction if we in fact found a return
instruction.

This avoids dereferencing null in the debug logging if the instruction
was not in fact a return instruction. This potential bug was found by
PVS-Studio.

This actually fixes the last of the "dereferenced a pointer before
checking it for null" reports in the recent PVS-Studio run. However,
there are quite a few reports of this nature that I did not do anything
to fix because they are pretty glaring false positives. They usually
took the form of quite clear correlated checks or a check made in
a separate function. I've even added asserts anywhere this correlation
wasn't pretty obvious and fundamental to the code.

llvm-svn: 285988
2016-11-04 06:59:50 +00:00
Xinliang David Li f450b88191 Fix typo
llvm-svn: 285978
2016-11-04 03:00:52 +00:00
Chandler Carruth fca1ff0da2 Fix a bug found by inspection by PVS-Studio.
This condition is trivially always true prior to the change. The comment
at the call site makes it clear that we expect *all* of these to be '=',
'S', or 'I' so fix the code.

We have a bug I will update to track the fact that Clang doesn't warn on
this: http://llvm.org/PR13101

llvm-svn: 285930
2016-11-03 16:39:25 +00:00
Teresa Johnson 0515fb8d4b [ThinLTO] Handle distributed backend case when doing renaming
Summary:
The recent change I made to consult the summary when deciding whether to
rename (to handle inline asm) in r285513 broke the distributed build
case. In a distributed backend we will only have a portion of the
combined index, specifically for imported modules we only have the
summaries for any imported definitions. When renaming on import we were
asserting because no summary entry was found for a local reference being
linked in (def wasn't imported).

We only need to consult the summary for a renaming decision for the
exporting module. For imports, we would have prevented importing any
references to NoRename values already.

Reviewers: mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26250

llvm-svn: 285871
2016-11-03 01:07:16 +00:00
Greg Bedwell 5fc6f94591 Revert "[InstCombine] allow splat vector folds in adjustMinMax()"
This reverts commit r285732.

This change introduced a new assertion failure in the following
testcase at -O2:

typedef short __v8hi __attribute__((__vector_size__(16)));
__v8hi foo(__v8hi &V1, __v8hi &V2, unsigned mask) {
  __v8hi Result = V1;
  if (mask & 0x80)
    Result[0] = V2[0];
  return Result;
}

llvm-svn: 285866
2016-11-02 23:17:05 +00:00
Eli Friedman b6befc3bc4 DCE math library calls with a constant operand.
On platforms which use -fmath-errno, math libcalls without any uses
require some extra checks to figure out if they are actually dead.

Fixes https://llvm.org/bugs/show_bug.cgi?id=30464 .

Differential Revision: https://reviews.llvm.org/D25970

llvm-svn: 285857
2016-11-02 20:48:11 +00:00
Bjorn Pettersson 7424c8ccd1 [Reassociate] Skip analysis of dead code to avoid infinite loop.
Summary:
It was detected that the reassociate pass could enter an inifite
loop when analysing dead code. Simply skipping to analyse basic
blocks that are dead avoids such problems (and as a side effect
we avoid spending time on optimising dead code).

The solution is using the same Reverse Post Order ordering of the
basic blocks when doing the optimisations, as when building the
precalculated rank map. A nice side-effect of this solution is
that we now know that we only try to do optimisations for blocks
with ranked instructions.

Fixes https://llvm.org/bugs/show_bug.cgi?id=30818

Reviewers: llvm-commits, davide, eli.friedman, mehdi_amini

Subscribers: dberlin

Differential Revision: https://reviews.llvm.org/D26154

llvm-svn: 285793
2016-11-02 08:55:19 +00:00
George Burgess IV 66837aba0a [MemorySSA] Tighten up types to make our API prettier. NFC.
Patch by bryant.

Differential Revision: https://reviews.llvm.org/D26126

llvm-svn: 285750
2016-11-01 21:17:46 +00:00
Sanjay Patel c3d89842ad [InstCombine] allow splat vector folds in adjustMinMax()
llvm-svn: 285732
2016-11-01 20:08:02 +00:00
Sanjay Patel c0339c77ef [InstCombine] Fold nuw left-shifts in `ugt`/`ule` comparisons.
This transforms

%a = shl nuw %x, c1
%b = icmp {ugt|ule} %a, c0

into

%b = icmp {ugt|ule} %x, (c0 >> c1)

z3:

(declare-const x (_ BitVec 64))
(declare-const c0 (_ BitVec 64))
(declare-const c1 (_ BitVec 64))

(push)
(assert (= x (bvlshr (bvshl x c1) c1)))  ; nuw
(assert (not (= (bvugt (bvshl x c1) c0)
                (bvugt x
                       (bvlshr c0 c1)))))
(check-sat)
(get-model)
(pop)

(push)
(assert (= x (bvlshr (bvshl x c1) c1)))  ; nuw
(assert (not (= (bvule (bvshl x c1) c0)
                (bvule x
                       (bvlshr c0 c1)))))
(check-sat)
(get-model)
(pop)

Patch by bryant!

Differential Revision: https://reviews.llvm.org/D25913

llvm-svn: 285729
2016-11-01 19:19:29 +00:00
Sanjay Patel 644d7c3b8a [InstCombine] clean up adjustMinMax(); NFCI
1. Change param names for readability
2. Change pointer param to ref
3. Early exit to reduce indent
4. Change switch to if/else

llvm-svn: 285718
2016-11-01 18:15:03 +00:00
Sanjay Patel 7ce658388b [InstCombine] add helper function for adjustMinMax(); NFCI
This is just a cut and paste; clean-up and enhancements to follow.

llvm-svn: 285715
2016-11-01 17:46:08 +00:00
Simon Pilgrim 6dd8fab443 [InstCombine] Folding of shifts by the sum of positive values
This patch introduces the combine:

(C1 shift (A add C2)) -> ((C1 shift C2) shift A)
iff A and C2 are both positive

If both A and C2 are know to be positive then we can safely split into 2 shifts, permitting the folding of the Inner shift.

Fix for the spec benchmark case mentioned by @nadav on PR15141 (assuming we can prove that the inputs as positive).

Differential Revision: https://reviews.llvm.org/D26000

llvm-svn: 285696
2016-11-01 15:40:30 +00:00
Evgeniy Stepanov 1bd9fc7098 Fix a typo.
Found with PVS-Studio here: http://www.viva64.com/en/b/0446/

llvm-svn: 285652
2016-10-31 22:42:39 +00:00
Kuba Brecka a28c9e8f09 [asan] Move instrumented null-terminated strings to a special section, LLVM part
On Darwin, simple C null-terminated constant strings normally end up in the __TEXT,__cstring section of the resulting Mach-O binary. When instrumented with ASan, these strings are transformed in a way that they cannot be in __cstring (the linker unifies the content of this section and strips extra NUL bytes, which would break instrumentation), and are put into a generic __const section. This breaks some of the tools that we have: Some tools need to scan all C null-terminated strings in Mach-O binaries, and scanning all the contents of __const has a large performance penalty. This patch instead introduces a special section, __asan_cstring which will now hold the instrumented null-terminated strings.

Differential Revision: https://reviews.llvm.org/D25026

llvm-svn: 285619
2016-10-31 18:51:58 +00:00
Dorit Nuzman bf2c15b5dc Second attempt at r285517.
llvm-svn: 285568
2016-10-31 13:17:31 +00:00
Dorit Nuzman 06903d16af Revert r285517 due to build failures.
llvm-svn: 285518
2016-10-30 14:34:57 +00:00
Dorit Nuzman 3c1c658f24 [LoopVectorize] Make interleaved-accesses analysis less conservative about
possible pointer-wrap-around concerns, in some cases.

Before this patch, collectConstStridedAccesses (part of interleaved-accesses
analysis) called getPtrStride with [Assume=false, ShouldCheckWrap=true] when
examining all candidate pointers. This is too conservative. Instead, this
patch makes collectConstStridedAccesses use an optimistic approach, calling
getPtrStride with [Assume=true, ShouldCheckWrap=false], and then, once the
candidate interleave groups have been formed, revisits the pointer-wrapping
analysis but only where it matters: namely, in groups that have gaps, and where
the gaps are not at the very end of the group (in which case the loop is
peeled). This second time getPtrStride is called with [Assume=false,
ShouldCheckWrap=true], but this could further be improved to using Assume=true,
once we also add the logic to track that we are not going to meet the scev
runtime checks threshold.

Differential Revision: https://reviews.llvm.org/D25276

llvm-svn: 285517
2016-10-30 12:23:26 +00:00
Teresa Johnson bf28c8fa45 [ThinLTO] Use per-summary flag to prevent exporting locals used in inline asm
Summary:
Instead of using the workaround of suppressing the entire index for
modules that call inline asm that may reference locals, use the
NoRename flag on the summary for any locals in the llvm.used set, and
add a reference edge from any functions containing inline asm.

This avoids issues from having no summaries despite the module defining
global values, which was preventing more aggressive index-based
optimization. It will be followed by a subsequent patch to make a
similar fix for local references in module level asm (to fix PR30610).

Reviewers: mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26121

llvm-svn: 285513
2016-10-30 05:40:44 +00:00
Teresa Johnson 38d4df714c [ThinLTO] Rename doPromoteLocalToGlobal to shouldPromoteLocalToGlobal (NFC)
Rename as suggested in code review for D26063.

llvm-svn: 285508
2016-10-29 21:52:23 +00:00
Teresa Johnson 1b9c2be8f4 [ThinLTO] Use NoPromote flag in summary during promotion
Summary:
Replace the check of whether a GV has a section with the flag check
in the summary. This is in preparation for using the NoPromote flag
to convey other situations when we can't promote (e.g. locals used in
inline asm).

Reviewers: mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26063

llvm-svn: 285507
2016-10-29 21:31:48 +00:00
Sanjay Patel 978f827d12 [InstCombine] re-use bitcasted compare operands in selects (PR28001)
These mixed bitcast patterns show up with SSE/AVX intrinsics because we bitcast function parameters to <2 x i64>.

The bitcasts obfuscate the expected min/max forms as shown in PR28001:
https://llvm.org/bugs/show_bug.cgi?id=28001#c6

Differential Revision: https://reviews.llvm.org/D25943

llvm-svn: 285495
2016-10-29 15:22:04 +00:00
Justin Lebar 0ede5fb1bb Don't leave unused divs/rems sitting around in BypassSlowDivision.
Summary:
This "pass" eagerly creates div and rem instructions even when only one
is needed -- it relies on a later pass (machine DCE?) to clean them up.

This is problematic not just from a cleanliness perspective (this pass
is running during CodeGenPrepare, so should leave the IR in a better
state), but it also creates a problem for instruction selection.  If we
always have a div+rem, isel will always select a divrem instruction (if
possible), even when a single div or rem would do.

Specifically, in NVPTX, we want to compute rem from the output of div,
if available.  But if a div is not available, we want to leave the rem
alone.  This transformation is overeager if div is always available.

Because this code runs as part of CodeGenPrepare, it's nontrivial to
write a test for this change.  But this will effectively be tested by
a later patch which adds the aforementioned change to NVPTX isel.

Reviewers: tra

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26088

llvm-svn: 285460
2016-10-28 21:43:54 +00:00
Justin Lebar 468bf73209 Don't claim the udiv created in BypassSlowDivision is exact.
Summary:
In BypassSlowDivision's short-dividend path, we would create e.g.

  udiv exact i32 %a, %b

"exact" here means that we are asserting that %a is a multiple of %b.
But we have no reason to believe this must be true -- this is just a
bug, as far as I can tell.

Reviewers: tra

Subscribers: jholewinski, llvm-commits

Differential Revision: https://reviews.llvm.org/D26097

llvm-svn: 285459
2016-10-28 21:43:51 +00:00
Matt Arsenault ef00283425 SpeculativeExecution: Allow speculating more inst types
Partial step towards removing the whitelist and only
using TTI's cost.

llvm-svn: 285438
2016-10-28 20:00:33 +00:00
George Burgess IV 013fd7315f [MemorySSA] Add const to getClobberingMemoryAccess.
Thanks to bryant for the patch!

Differential Revision: https://reviews.llvm.org/D26086

llvm-svn: 285432
2016-10-28 19:22:46 +00:00
Igor Laevsky c3ccf5d77b [LCSSA] Perform LCSSA verification only for the current loop nest.
Now LPPassManager will run LCSSA verification only for the top-level loop
which was processed on the current iteration.

Differential Revision: https://reviews.llvm.org/D25873

llvm-svn: 285394
2016-10-28 12:57:20 +00:00
Davide Italiano 631cd27f29 [Reassociate] Removing instructions mutates the IR.
Fixes PR 30784. Discussed with Justin, who pointed out that
in the new PassManager infrastructure we can have more fine-grained
control on which analyses we want to preserve, but this is the
best we can do with the current infrastructure.

llvm-svn: 285380
2016-10-28 02:47:09 +00:00
Teresa Johnson 58fbc916a0 [ThinLTO] Rename HasSection to NoRename (NFC)
Summary:
This is in preparation for a change to utilize this flag for symbols
referenced/defined in either inline or module level assembly.

Reviewers: mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26048

llvm-svn: 285376
2016-10-28 02:24:59 +00:00
Sanjay Patel c0de9c9e40 [InstCombine] fix foldSPFofSPF() to handle vector splats
llvm-svn: 285345
2016-10-27 21:19:40 +00:00
Haicheng Wu 430b3e4893 [LoopUnroll] Check partial unrolling is enabled before initialization. NFC.
Differential Revision: https://reviews.llvm.org/D23891

llvm-svn: 285330
2016-10-27 18:40:02 +00:00
Sanjay Patel 611f9f92fc [InstCombine] handle simple vector integer constants in IsFreeToInvert
llvm-svn: 285318
2016-10-27 17:30:50 +00:00
Dehao Chen b94c09baa0 Add Loop Sink pass to reverse the LICM based of basic block frequency.
Summary: LICM may hoist instructions to preheader speculatively. Before code generation, we need to sink down the hoisted instructions inside to loop if it's beneficial. This pass is a reverse of LICM: looking at instructions in preheader and sinks the instruction to basic blocks inside the loop body if basic block frequency is smaller than the preheader frequency.

Reviewers: hfinkel, davidxl, chandlerc

Subscribers: anna, modocache, mgorny, beanz, reames, dberlin, chandlerc, mcrosier, junbuml, sanjoy, mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D22778

llvm-svn: 285308
2016-10-27 16:30:08 +00:00
Alexey Bataev 46c0278e7d [SLP] Fix for PR30626: Compiler crash inside SLP Vectorizer.
After successfull horizontal reduction vectorization attempt for PHI node
vectorizer tries to update root binary op by combining vectorized tree
and the ReductionPHI node. But during vectorization this ReductionPHI
can be vectorized itself and replaced by the `undef` value, while the
instruction itself is marked for deletion. This 'marked for deletion'
PHI node then can be used in new binary operation, causing "Use still
stuck around after Def is destroyed" crash upon PHI node deletion.

Also the test is fixed to make it perform actual testing.

Differential Revision: https://reviews.llvm.org/D25671

llvm-svn: 285286
2016-10-27 12:02:28 +00:00
Dehao Chen e713000eb6 Introduce updateDiscriminator interface to DILocation to make it cleaner assigning discriminators.
Summary: This patch introduces updateDiscriminator to DILocation so that it can be directly called by AddDiscriminator. It also makes it easier to update the discriminator later.

Reviewers: dnovillo, dblaikie, aprantl, echristo

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D25959

llvm-svn: 285207
2016-10-26 15:48:45 +00:00
Sanjay Patel 8d7196bfde [InstCombine] clean up commonCastTransforms; NFC
1. Use 'auto' with dyn_cast.
2. Variables start with a capital letter.
3. Use proper punctuation in comments.

llvm-svn: 285200
2016-10-26 14:52:35 +00:00
Andrea Di Biagio 9bcb064f19 [IndVarSimplify][DebugLoc] When widening the exit loop condition, correctly reuse the debug location of the original comparison.
When the loop exit condition is canonicalized as a != compaison, reuse the
debug location of the original (non canonical) comparison.

Before this patch, the debug location of the new icmp was obtained from the
loop latch terminator. This patch fixes the issue by correctly setting the
IRBuilder's "current debug location" to the location of the original compare.

Differential Revision: https://reviews.llvm.org/D25953

llvm-svn: 285185
2016-10-26 10:28:32 +00:00
Peter Collingbourne 7b7bac367c Cloning: Also clone global variable attached metadata.
llvm-svn: 285161
2016-10-26 02:57:33 +00:00
Evgeniy Stepanov ea6d49d3ee Utility functions for appending to llvm.used/llvm.compiler.used.
llvm-svn: 285143
2016-10-25 23:53:31 +00:00
Rong Xu 33308f92eb [PGO] Fix select instruction annotation
Summary:
Select instruction annotation in IR PGO uses the edge count to infer the
branch count. It's currently placed in setInstrumentedCounts() where
no all the BB counts have been computed. This leads to wrong branch weights.
Move the annotation after all BB counts are populated.

Reviewers: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D25961

llvm-svn: 285128
2016-10-25 21:47:24 +00:00
Guozhi Wei ae541f6a71 [InstCombine] Resubmit the combine of A->B->A BitCast and fix for pr27996
The original patch of the A->B->A BitCast optimization was reverted by r274094 because it may cause infinite loop inside compiler https://llvm.org/bugs/show_bug.cgi?id=27996.

The problem is with following code

xB = load (type B); 
xA = load (type A); 
+yA = (A)xB; B -> A
+zAn = PHI[yA, xA]; PHI 
+zBn = (B)zAn; // A -> B
store zAn;
store zBn;

optimizeBitCastFromPhi generates

+zBn = (B)zAn; // A -> B

and expects it will be combined with the following store instruction to another

store zAn 

Unfortunately before combineStoreToValueType is called on the store instruction, optimizeBitCastFromPhi is called on the new BitCast again, and this pattern repeats indefinitely.

optimizeBitCastFromPhi only generates BitCast for load/store instructions, only the BitCast before store can cause the reexecution of optimizeBitCastFromPhi, and BitCast before store can easily be handled by InstCombineLoadStoreAlloca.cpp. So the solution to the problem is if all users of a CI are store instructions, we should not do optimizeBitCastFromPhi on it. Then optimizeBitCastFromPhi will not be called on the new BitCast instructions.

Differential Revision: https://reviews.llvm.org/D23896

llvm-svn: 285116
2016-10-25 20:43:42 +00:00
Sanjay Patel f3dda13bd2 [InstCombine] Ensure that truncated int types are legal.
Fixes the FIXMEs in D25952 and rL285075.

Patch by bryant!

Differential Revision: https://reviews.llvm.org/D25955

llvm-svn: 285108
2016-10-25 20:11:47 +00:00
Matthew Simpson c62266d680 [LV] Sink scalar operands of predicated instructions
When we predicate an instruction (div, rem, store) we place the instruction in
its own basic block within the vectorized loop. If a predicated instruction has
scalar operands, it's possible to recursively sink these scalar expressions
into the predicated block so that they might avoid execution. This patch sinks
as much scalar computation as possible into predicated blocks. We previously
were able to sink such operands only if they were extractelement instructions.

Differential Revision: https://reviews.llvm.org/D25632

llvm-svn: 285097
2016-10-25 18:59:45 +00:00
Michael Ilseman e542804343 Add -strip-nonlinetable-debuginfo capability
This adds a new function to DebugInfo.cpp that takes an llvm::Module
as input and removes all debug info metadata that is not directly
needed for line tables, thus effectively stripping all type and
variable information from the module.

The primary motivation for this feature was the bitcode work flow
(cf. http://lists.llvm.org/pipermail/llvm-dev/2016-June/100643.html
for more background). This is not wired up yet, but will be in
subsequent patches.  For testing, the new functionality is exposed to
opt with a -strip-nonlinetable-debuginfo option.

The secondary use-case (and one that works right now!) is as a
reduction pass in bugpoint. I added two new bugpoint options
(-disable-strip-debuginfo and -disable-strip-debug-types) to control
the new features. By default it will first attempt to remove all debug
information, then only the type info, and then proceed to hack at any
remaining MDNodes.

Thanks to Adrian Prantl for stewarding this patch!

llvm-svn: 285094
2016-10-25 18:44:13 +00:00
Michael Kuperstein cffedc4a94 Fix 80-char violations. NFC.
llvm-svn: 285092
2016-10-25 18:31:23 +00:00
Dehao Chen c1472b5092 Move discriminator assignment to where it is used. (NFC)
llvm-svn: 285084
2016-10-25 16:50:27 +00:00
Andrea Di Biagio 824cabd06d [IndVarSimplify][Dwarf] When widening the IV increment, correctly set the debug loc.
When indvars widened an induction variable, the debug location for the loop
increment computation was incorrectly set equal to the debug loc of the loop
latch terminator.

This patch fixes the issue by propagating the correct location from the
original loop increment instruction to the new widened increment.

Differential Revision: https://reviews.llvm.org/D25872

llvm-svn: 285083
2016-10-25 16:45:17 +00:00
Geoff Berry 91e9a5cc23 [EarlyCSE] Make MemorySSA memory dependency check more aggressive.
Now that MemorySSA keeps track of whether MemoryUses are optimized, use
getClobberingMemoryAccess() to check MemoryUse memory dependencies since
it should no longer be so expensive.

This is a follow-up change to https://reviews.llvm.org/D25881

llvm-svn: 285080
2016-10-25 16:18:47 +00:00