Commit Graph

20214 Commits

Author SHA1 Message Date
Simon Pilgrim 2c9d2adff5 [SLPVectorizer] getSameOpcode - remove useless cast [NFC]
There's no need to cast the base Value to an Instruction

llvm-svn: 334588
2018-06-13 10:49:24 +00:00
Simon Pilgrim 1224260f83 [SLPVectorizer] getSameOpcode - remove unusued alternate code [NFC]
We early-out for the case where we don't use alternate opcodes, so no need to check for it later.

llvm-svn: 334587
2018-06-13 10:14:27 +00:00
Max Kazantsev 0ed79620c6 [SimplifyIndVars] Ignore dead users
IndVarSimplify sometimes makes transforms basing on users that are trivially dead. In particular,
if DCE wasn't run before it, there may be a dead `sext/zext` in loop that will trigger widening
transforms, however it makes no sense to do it.

This patch teaches IndVarsSimplify ignore the mist trivial cases of that.

Differential Revision: https://reviews.llvm.org/D47974
Reviewed By: sanjoy

llvm-svn: 334567
2018-06-13 02:25:32 +00:00
Simon Pilgrim e39fa6cbbb [CostModel] Replace ShuffleKind::SK_Alternate with ShuffleKind::SK_Select (PR33744)
As discussed on PR33744, this patch relaxes ShuffleKind::SK_Alternate which requires shuffle masks to only match an alternating pattern from its 2 sources:

e.g. v4f32: <0,5,2,7> or <4,1,6,3>

This seems far too restrictive as most SIMD hardware which will implement it using a general blend/bit-select instruction, so replaces it with SK_Select, permitting elements from either source as long as they are inline:

e.g. v4f32: <0,5,2,7>, <4,1,6,3>, <0,1,6,7>, <4,1,2,3> etc.

This initial patch just updates the name and cost model shuffle mask analysis, later patch reviews will update SLP to better utilise this - it still limits itself to SK_Alternate style patterns.

Differential Revision: https://reviews.llvm.org/D47985

llvm-svn: 334513
2018-06-12 16:12:29 +00:00
Florian Hahn a1cc848399 Use SmallPtrSet explicitly for SmallSets with pointer types (NFC).
Currently SmallSet<PointerTy> inherits from SmallPtrSet<PointerTy>. This
patch replaces such types with SmallPtrSet, because IMO it is slightly
clearer and allows us to get rid of unnecessarily including SmallSet.h

Reviewers: dblaikie, craig.topper

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D47836

llvm-svn: 334492
2018-06-12 11:16:56 +00:00
Wei Mi a0c0857e7a [SampleFDO] Add a new compact binary format for sample profile.
Name table occupies a big chunk of size in current binary format sample profile.
In order to reduce its size, the patch changes the sample writer/reader to
save/restore MD5Hash of names in the name table. Sample annotation phase will
also use MD5Hash of name to query samples accordingly.

Experiment shows compact binary format can reduce the size of sample profile by
2/3 compared with binary format generally.

Differential Revision: https://reviews.llvm.org/D47955

llvm-svn: 334447
2018-06-11 22:40:43 +00:00
Roman Lebedev ebb3252f00 Revert rL334371 / D47980: "[InstCombine] Fold (x << y) >> y -> x & (-1 >> y)"
test/Transforms/InstCombine/AMDGPU/amdgcn-intrinsics.ll broke,
and i did not notice because i did not build that backend.

llvm-svn: 334373
2018-06-10 20:32:03 +00:00
Roman Lebedev eb795a0661 [InstCombine] Fold (x >> y) << y -> x & (-1 << y)
Summary:
We already do it for matching splat constants, but not just values.

Further improvements for non-matching splat constants, as noted in
https://reviews.llvm.org/D46760#1123713 will be needed,
but i'd prefer to do that as a follow-up.

https://bugs.llvm.org/show_bug.cgi?id=37603
https://rise4fun.com/Alive/cplX
https://rise4fun.com/Alive/0HF

Reviewers: spatel, craig.topper

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D47981

llvm-svn: 334372
2018-06-10 20:10:13 +00:00
Roman Lebedev 4cdc59ecf2 [InstCombine] Fold (x << y) >> y -> x & (-1 >> y)
Summary:
We already do it for splat constants, but not just values.
Also, undef cases are mostly non-functional.

https://bugs.llvm.org/show_bug.cgi?id=37603
https://rise4fun.com/Alive/cplX

Reviewers: spatel, craig.topper

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D47980

llvm-svn: 334371
2018-06-10 20:10:06 +00:00
Craig Topper 98a79934af [X86] Remove masking from the 512-bit masked floating point add/sub/mul/div intrinsics. Use a select in IR instead.
llvm-svn: 334358
2018-06-10 06:01:36 +00:00
Craig Topper 61998289f9 Use SmallPtrSet instead of SmallSet in places where we iterate over the set.
SmallSet forwards to SmallPtrSet for pointer types. SmallPtrSet supports iteration, but a normal SmallSet doesn't. So if it wasn't for the forwarding, this wouldn't work.

These places were found by hiding the begin/end methods in the SmallSet forwarding

llvm-svn: 334343
2018-06-09 05:04:20 +00:00
Davide Italiano 189c2cf114 [InstCombine] Skip dbg.value(s) when looking at stack{save,restore}.
Fixes PR37713.

llvm-svn: 334317
2018-06-08 20:42:36 +00:00
Reid Kleckner 0bab222084 [asan] Instrument comdat globals on COFF targets
Summary:
If we can use comdats, then we can make it so that the global metadata
is thrown away if the prevailing definition of the global was
uninstrumented. I have only tested this on COFF targets, but in theory,
there is no reason that we cannot also do this for ELF.

This will allow us to re-enable string merging with ASan on Windows,
reducing the binary size cost of ASan on Windows.

Reviewers: eugenis, vitalybuka

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D47841

llvm-svn: 334313
2018-06-08 18:33:16 +00:00
Florian Hahn 45e5d5b4be [VPlan] Move recipe construction to VPRecipeBuilder.
This patch moves the recipe-creation functions out of
LoopVectorizationPlanner, which should do the high-level
orchestration of the transformations.

Reviewers: dcaballe, rengolin, hsaito, Ayal

Reviewed By: dcaballe

Differential Revision: https://reviews.llvm.org/D47595

llvm-svn: 334305
2018-06-08 17:30:45 +00:00
Daniil Fukalov 37433dc2e1 reapply r334209 with fixes for harfbuzz in Chromium
r334209 description:
[LSR] Check yet more intrinsic pointer operands

the patch fixes another assertion in isLegalUse()

Differential Revision: https://reviews.llvm.org/D47794

llvm-svn: 334300
2018-06-08 16:22:52 +00:00
Florian Hahn b3c6f07dde [VPlan] Move recipe based VPlan generation to separate function.
This first step separates VPInstruction-based and VPRecipe-based
VPlan creation, which should make it easier to migrate to VPInstruction
based code-gen step by step.

Reviewers: Ayal, rengolin, dcaballe, hsaito, mkuper, mzolotukhin

Reviewed By: dcaballe

Subscribers: bollu, tschuett, rkruppe, llvm-commits

Differential Revision: https://reviews.llvm.org/D47477

llvm-svn: 334284
2018-06-08 12:53:51 +00:00
Roman Shirokiy 9ba0aa2da0 [LV] Fix PR36983. For a given recurrence, fix all phis in exit block
There could be more than one PHIs in exit block using same loop recurrence.
Don't assume there is only one and fix each user.

Differential Revision: https://reviews.llvm.org/D47788

llvm-svn: 334271
2018-06-08 08:21:20 +00:00
Reid Kleckner a3609f75b2 Revert r334209 "[LSR] Check yet more intrinsic pointer operands"
This causes cast failures when compiling harfbuzz in Chromium.
Reproducer on the way.

llvm-svn: 334254
2018-06-08 00:43:27 +00:00
Daniil Fukalov 12c0663a25 [LSR] Check yet more intrinsic pointer operands
the patch fixes another assertion in isLegalUse()

Differential Revision: https://reviews.llvm.org/D47794

llvm-svn: 334209
2018-06-07 17:30:58 +00:00
Florian Hahn 0d6b01761c [Mem2Reg] Avoid replacing load with itself in promoteSingleBlockAlloca.
We do the same thing in rewriteSingleStoreAlloca.

Fixes PR37632.

Reviewers: chandlerc, davide, efriedma

Reviewed By: davide

Differential Revision: https://reviews.llvm.org/D47825

llvm-svn: 334187
2018-06-07 11:09:05 +00:00
Max Kazantsev b4b2ccea6d [NFC] Use variable instead of accessing pair many times
llvm-svn: 334173
2018-06-07 08:47:19 +00:00
Michael Zolotukhin 31800864dc SpeculativeExecution Pass: Set PreserveCFG to avoid unnecessary analyses invalidation.
The pass doesn't touch CFG in any way, only moves instructions between
blocks.

llvm-svn: 334150
2018-06-07 00:19:29 +00:00
Teresa Johnson 4ffc3e7834 [ThinLTO] Rename index IsAnalysis flag to HaveGVs (NFC)
With the upcoming patch to add summary parsing support, IsAnalysis would
be true in contexts where we are not performing module summary analysis.
Rename to the more specific and approprate HaveGVs, which is essentially
what this flag is indicating.

llvm-svn: 334140
2018-06-06 22:22:01 +00:00
Sanjay Patel 3cd1aa88f9 [InstCombine] fold another shifty abs pattern to cmp+sel (PR36036)
The bug report:
https://bugs.llvm.org/show_bug.cgi?id=36036

...requests a DAG change for this, but an IR canonicalization
probably handles most cases. If we still want to match this
pattern in the backend, there's a proposal for that too:
D47831

Alive proofs including nsw/nuw cases that were first noted in:
D46988

https://rise4fun.com/Alive/Kmp

This patch is largely copied from the existing code that was
initially added with:
D40984
...but I didn't see much gain from trying to share code.

llvm-svn: 334137
2018-06-06 21:58:12 +00:00
Roman Lebedev cbf8446359 [InstCombine] PR37603: low bit mask canonicalization
Summary:
This is [[ https://bugs.llvm.org/show_bug.cgi?id=37603 | PR37603 ]].

https://godbolt.org/g/VCMNpS
https://rise4fun.com/Alive/idM

When doing bit manipulations, it is quite common to calculate some bit mask,
and apply it to some value via `and`.

The typical C code looks like:
```
int mask_signed_add(int nbits) {
    return (1 << nbits) - 1;
}
```
which is translated into (with `-O3`)
```
define dso_local i32 @mask_signed_add(int)(i32) local_unnamed_addr #0 {
  %2 = shl i32 1, %0
  %3 = add nsw i32 %2, -1
  ret i32 %3
}
```

But there is a second, less readable variant:
```
int mask_signed_xor(int nbits) {
    return ~(-(1 << nbits));
}
```
which is translated into (with `-O3`)
```
define dso_local i32 @mask_signed_xor(int)(i32) local_unnamed_addr #0 {
  %2 = shl i32 -1, %0
  %3 = xor i32 %2, -1
  ret i32 %3
}
```

Since we created such a mask, it is quite likely that we will use it in `and` next.
And then we may get rid of `not` op by folding into `andn`.

But now that i have actually looked:
https://godbolt.org/g/VTUDmU
_some_ backend changes will be needed too.
We clearly loose `bzhi` recognition.

Reviewers: spatel, craig.topper, RKSimon

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D47428

llvm-svn: 334127
2018-06-06 19:38:27 +00:00
Tim Northover 9b80060d7b InstCombine: ignore debug instructions during fence combine
We should never get different CodeGen based on whether the code is being
compiled in debug mode so we must skip over @llvm.dbg.value (and similar)
calls.

Should fix at least the worst part of PR37690.

llvm-svn: 334090
2018-06-06 12:46:02 +00:00
John Brawn e4ff0bd401 [InstCombine] Correct the cmp operand type used when canonicalizing abs/nabs
When adjusting a cmp in order to canonicalize an abs/nabs select pattern we need
to use the type of the existing operand when creating a new operand not the
type of a select operand, as the two may be different.

This fixes PR37686.

llvm-svn: 334019
2018-06-05 14:10:55 +00:00
Sanjay Patel dcb8d304c3 [InstCombine] refine UB-handling in shuffle-binop transform
As noted in rL333782, we can be both better for optimization and
safer with this transform:
BinOp (shuffle V1, Mask), C --> shuffle (BinOp V1, NewC), Mask

The only potentially unsafe-to-speculate binops are integer div/rem.
All other binops are always safe (although I don't see a way to
assert that in code here).

For opcodes like shifts that can produce poison, it can't matter
here because we know the lanes with undef are dropped by the
subsequent shuffle.

Differential Revision: https://reviews.llvm.org/D47686

llvm-svn: 333962
2018-06-04 22:26:45 +00:00
David Blaikie 31b98d2e99 Move Analysis/Utils/Local.h back to Transforms
Review feedback from r328165. Split out just the one function from the
file that's used by Analysis. (As chandlerc pointed out, the original
change only moved the header and not the implementation anyway - which
was fine for the one function that was used (since it's a
template/inlined in the header) but not in general)

llvm-svn: 333954
2018-06-04 21:23:21 +00:00
Dmitry Mikulin 4539487650 In thin and full LTO + CFI, direct function calls may go through jump table
entries to reach the target. Since these calls don't require type checks,
we can short-circuit them to their real targets, except in cases when they
can be pre-empted.

Differential Revision: https://reviews.llvm.org/D46326

llvm-svn: 333937
2018-06-04 18:18:12 +00:00
Serguei Katkov d894fb4288 [InstCombine] Fix div handling
When we optimize select basing on fact that div by 0 is undef
we should not traverse the instruction which are not guaranteed to
transfer execution to next instruction. Guard intrinsic is an example.

Reviewers: spatel, craig.topper
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D47576

llvm-svn: 333864
2018-06-04 02:52:36 +00:00
Sanjay Patel 3bd957b7ae [InstCombine] improve sub with bool folds
There's a patchwork of existing transforms trying to handle
these cases, but as seen in the changed test, we weren't
catching them all.

llvm-svn: 333845
2018-06-03 16:35:26 +00:00
Sanjay Patel bbc6d60677 [InstCombine] call simplify before trying vector folds
As noted in the review thread for rL333782, we could have
made a bug harder to hit if we were simplifying instructions
before trying other folds. 

The shuffle transform in question isn't ever a simplification;
it's just a canonicalization. So I've renamed that to make that 
clearer.

This is NFCI at this point, but I've regenerated the test file 
to show the cosmetic value naming difference of using 
instcombine's RAUW vs. the builder.

Possible follow-ups:
1. Move reassociation folds after simplifies too.
2. Refactor common code; we shouldn't have so much repetition.

llvm-svn: 333820
2018-06-02 16:27:44 +00:00
Chandler Carruth 9281503e8f [PM/LoopUnswitch] Fix how the cloned loops are handled when updating analyses.
Summary:
I noticed this issue because we didn't put the primary cloned loop into
the `NonChildClonedLoops` vector and so never iterated on it. Once
I fixed that, it made it clear why I had to do a really complicated and
unnecesasry dance when updating the loops to remain in canonical form --
I was unwittingly working around the fact that the primary cloned loop
wasn't in the expected list of cloned loops. Doh!

Now that we include it in this vector, we don't need to return it and we
can consolidate the update logic as we correctly have a single place
where it can be handled.

I've just added a test for the iteration order aspect as every time
I changed the update logic partially or incorrectly here, an existing
test failed and caught it so that seems well covered (which is also
evidenced by the extensive working around of this missing update).

Reviewers: asbirlea, sanjoy

Subscribers: mcrosier, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D47647

llvm-svn: 333811
2018-06-02 01:29:01 +00:00
Sanjay Patel 66f7e19f6a [InstCombine] fix vector shuffle transform to replace undef elements (PR37648)
This bug:
https://bugs.llvm.org/show_bug.cgi?id=37648
...was created with the enhancement to this transform with rL332479.

The urem test shows the disaster potential: any undef divisor lane makes
the whole op undef.

The test diffs show that vector demanded elements turns some of the potential, 
but not all, unused binop operands back into undef already.

llvm-svn: 333782
2018-06-01 19:23:18 +00:00
Vlad Tsyrklevich 6867ab7c90 [ThinLTOBitcodeWriter] Emit summaries for regular LTO modules
Summary:
Emit summaries for bitcode modules that are only destined for the
regular LTO portion of the build so they can participate in
summary-based dead stripping.

This change reduces the size of a nacl_helper build with cfi-icall
enabled by 7%, removing the majority of the overhead due to enabling
cfi-icall. The cfi-icall size increase was caused by compiling in lots
of unused code and cfi-icall generating jumptable references to unused
symbols that could no longer be removed by -Wl,-gc-sections. Increasing
the visibility of summary-based dead stripping prevented jumptable
entries being created for unused symbols from the regular LTO portion
of the build.

Reviewers: pcc

Reviewed By: pcc

Subscribers: dschuff, mehdi_amini, inglorion, eraman, llvm-commits, kcc

Differential Revision: https://reviews.llvm.org/D47594

llvm-svn: 333768
2018-06-01 15:20:47 +00:00
Florian Hahn 8a17f1f43e Revert r333740: IPSCCP] Use PredicateInfo to propagate facts from cmp.
This is breaking the clang-with-thin-lto-ubuntu bot.

llvm-svn: 333745
2018-06-01 12:58:43 +00:00
Florian Hahn f4df554f32 Recommit r333268: [IPSCCP] Use PredicateInfo to propagate facts from cmp instructions.
This patch updates IPSCCP to use PredicateInfo to propagate
facts to true branches predicated by EQ and to false branches
predicated by NE.

As a follow up, we should be able to extend it to also propagate additional
facts about nonnull.

Reviewers: davide, mssimpso, dberlin, efriedma

Reviewed By: davide, dberlin

Differential Revision: https://reviews.llvm.org/D45330

llvm-svn: 333740
2018-06-01 10:48:54 +00:00
Craig Topper 9a6c0bdcbd [LoopIdiomRecognize] Only convert loops to ctlz if we can prove that the input is non-negative.
Summary:
Loop idiom recognize tries to convert loops like

```
int foo(int x) {
  int cnt = 0;
  while (x) {
    x >>= 1;
    ++cnt;
  }
  return cnt;
}
```

into calls to ctlz, but if x is initially negative this loop should be infinite.

It happens that the cases that motivated this change have an absolute value of x before the loop. So this patch restricts the transform to cases where we know x is positive. Note: We are relying on the absolute value of INT_MIN to be undefined so we can assume that the result is always positive.

Fixes PR37479

Reviewers: spatel, hfinkel, efriedma, javed.absar

Reviewed By: efriedma

Subscribers: dmgreen, llvm-commits

Differential Revision: https://reviews.llvm.org/D47348

llvm-svn: 333702
2018-05-31 22:16:55 +00:00
Sanjay Patel 26368cd5d9 [InstCombine] narrow select to match condition operands' size
This is the planned enhancement to D47163 / rL333611.
We want to match cmp/select sizes because that will be recognized
as min/max more easily and lead to better codegen (especially for
vector types).

As mentioned in D47163, this improves some of the tests that would
also be folded by D46380, so we may want to adjust that patch to
match the new patterns where the extend op occurs after the select.

llvm-svn: 333689
2018-05-31 19:55:27 +00:00
Craig Topper c9a4c6208b [JumpThreading] Fix some strange formatting of code inside LLVM_DEBUG. NFC
I don't know if clang-format got confused here or what.

llvm-svn: 333675
2018-05-31 18:08:11 +00:00
David Bolvansky 5430b73755 [SimplifyLibcalls] [NFC] Cleanup, improvements
Summary:
* Use "find('%')" instead of loop to find '%' char (we already uses find('%') in optimizePrintFString..)
* Convert getParent() chains to getModule()/getFunction()

Reviewers: lebedev.ri, spatel

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D47397

llvm-svn: 333668
2018-05-31 16:39:27 +00:00
Benjamin Kramer 0deb9a9a1f Extend the GlobalObject metadata interface
- Make eraseMetadata return whether it changed something
- Wire getMetadata for a single MDNode efficiently into the attachment
map
- Add hasMetadata, which is less weird than checking getMetadata ==
nullptr on a multimap.

Use it to simplify code.

llvm-svn: 333649
2018-05-31 13:29:58 +00:00
Alexandros Lamprineas 61f0ba1fcc [InstCombine, ARM] Convert vld1 to llvm load
Convert a vector load intrinsic into an llvm load instruction.
This is beneficial when the underlying object being addressed
comes from a constant, since we get constant-folding for free.

Differential Revision: https://reviews.llvm.org/D46273

llvm-svn: 333643
2018-05-31 12:19:18 +00:00
Max Kazantsev 0bad5be430 [NFC] Factor out a method for further extension
llvm-svn: 333633
2018-05-31 08:08:34 +00:00
Roman Lebedev c0ecd06428 Revert rL333106 / D46814: [InstCombine] Fold unfolded masked merge pattern with variable mask!
In post-commit review, Eric Christopher notes that many
new MSan warnings are being observed with this patch.

The probable reason is: if 'y' is undef here and we could
evaluate it twice and get different results.
We can't increase the number of uses of a value.

llvm-svn: 333631
2018-05-31 06:00:36 +00:00
Sanjay Patel e5bc441791 [InstCombine] don't change the size of a select if it would mismatch its condition operands' sizes
Don't always:
cast (select (cmp x, y), z, C) --> select (cmp x, y), (cast z), C'

This is something that came up as far back as D26556, and I lost track of it. 
I suspect that this transform is part of the underlying problem that is 
inspiring some of the recent proposals that seek to match larger patterns 
that include a cast op. Even if that's not true, this transform causes
problems for codegen (particularly with vector types).

A transform to actively match the size of cmp and select operand sizes should
follow. This patch just removes the harmful canonicalization in the other
direction.

Differential Revision: https://reviews.llvm.org/D47163

llvm-svn: 333611
2018-05-31 00:16:58 +00:00
Sanjay Patel ceb595b04e [InstCombine] don't negate constant expression with fsub (PR37605)
X + (-C) would be transformed back into X - C, so infinite loop:
https://bugs.llvm.org/show_bug.cgi?id=37605

llvm-svn: 333610
2018-05-30 23:55:12 +00:00
Vlad Tsyrklevich 178fdb1a3b [LowerTypeTests] Discard extern_weak linkage for definitions
Summary:
Fix PR37625. It's possible for an extern_weak declaration to be emitted
to the merged module when a definition exists in the ThinLTO portion of
the build; discard the linkage on the declaration in that case.
(otherwise we copy the linkage to the alias to the jumptable and fail)

Reviewers: pcc

Reviewed By: pcc

Subscribers: mehdi_amini, llvm-commits, kcc

Differential Revision: https://reviews.llvm.org/D47494

llvm-svn: 333604
2018-05-30 22:39:52 +00:00
George Burgess IV 485762ccba [NewGVN] Fix set comparison; reflow comment
Looks like we intended to compare this->Members with Other->Members
here, but ended up comparing this->Members with this->Members. Oops. :)

Since CongruenceClass::Members is a SmallPtrSet anyway, we can probably
skip building std::sets if we're willing to write a bit more code.

This appears to be no functional change (for sufficiently lax values of
"no"): this equality check was only being called inside of an assert.
So, worst case, we'll catch more bugs in the form of assertion failures.

Thanks to d0k for noting this!

llvm-svn: 333601
2018-05-30 22:24:08 +00:00
Benjamin Kramer c8bd5449e0 [CalledValuePropagation] Just use a sorted vector instead of a set.
The set properties are never used, so a vector is enough. No
functionality change intended.

While there add some std::moves to SparseSolver.

llvm-svn: 333582
2018-05-30 19:31:11 +00:00
Alexandros Lamprineas 52457d33b2 [InstCombine, ARM, AArch64] Convert table lookup to shuffle vector
Turning a table lookup intrinsic into a shuffle vector instruction
can be beneficial. If the mask used for the lookup is the constant
vector {7,6,5,4,3,2,1,0}, then the back-end generates byte reverse
instructions instead.

Differential Revision: https://reviews.llvm.org/D46133

llvm-svn: 333550
2018-05-30 14:38:50 +00:00
Chandler Carruth 71fd27043e [PM/LoopUnswitch] When using the new SimpleLoopUnswitch pass, schedule
loop-cleanup passes at the beginning of the loop pass pipeline, and
re-enqueue loops after even trivial unswitching.

This will allow us to much more consistently avoid simplifying code
while doing trivial unswitching. I've also added a test case that
specifically shows effective iteration using this technique.

I've unconditionally updated the new PM as that is always using the
SimpleLoopUnswitch pass, and I've made the pipeline changes for the old
PM conditional on using this new unswitch pass. I added a bunch of
comments to the loop pass pipeline in the old PM to make it more clear
what is going on when reviewing.

Hopefully this will unblock doing *partial* unswitching instead of just
full unswitching.

Differential Revision: https://reviews.llvm.org/D47408

llvm-svn: 333493
2018-05-30 02:46:45 +00:00
Diego Caballero b94b21d441 [VPlan] Replace LLVM_ATTRIBUTE_USED with ifndef NDEBUG
Minor replacement. LLVM_ATTRIBUTE_USED was introduced to silence
a warning but using #ifndef NDEBUG makes more sense in this case.

Reviewers: dblaikie, fhahn, hsaito

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D47498

llvm-svn: 333476
2018-05-29 23:10:44 +00:00
Chandler Carruth 4cbcbb0761 [LoopInstSimplify] Re-implement the core logic of loop-instsimplify to
be both simpler and substantially more efficient.

Rather than use a hand-rolled iteration technique that isn't quite the
same as RPO, use the pre-built RPO loop body traversal utility.

Once visiting the loop body in RPO, we can assert that we visit defs
before uses reliably. When this is the case, the only need to iterate is
when simplifying a def that is used by a PHI node along a back-edge.
With this patch, the first pass over the loop body is just a complete
simplification of every instruction across the loop body. When we
encounter a use of a simplified instruction that stems from a PHI node
in the loop body that has already been visited (due to some cyclic CFG,
potentially the loop itself, or a nested loop, or unstructured control
flow), we recall that specific PHI node for the second iteration.
Nothing else needs to be preserved from iteration to iteration.

On the second and later iterations, only instructions known to have
simplified inputs are considered, each time starting from a set of PHIs
that had simplified inputs along the backedges.

Dead instructions are collected along the way, but deleted in a batch at
the end of each iteration making the iterations themselves substantially
simpler. This uses a new batch API for recursively deleting dead
instructions.

This alsa changes the routine to visit subloops. Because simplification
is fundamentally transitive, we may need to visit the entire loop body,
including subloops, to handle knock-on simplification.

I've added a basic test file that helps demonstrate that all of these
changes work. It includes both straight-forward loops with
simplifications as well as interesting PHI-structures, CFG-structures,
and a nested loop case.

Differential Revision: https://reviews.llvm.org/D47407

llvm-svn: 333461
2018-05-29 20:15:38 +00:00
Fangrui Song afa95ee03d [LLVM-C] [OCaml] Remove LLVMAddBBVectorizePass
Summary: It was fully replaced back in 2014, and the implementation was removed 11 months ago by r306797.

Reviewers: hfinkel, chandlerc, whitequark, deadalnix

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D47436

llvm-svn: 333378
2018-05-28 16:58:10 +00:00
David Green aee7ad0cde Revert 333358 as it's failing on some builders.
I'm guessing the tests reply on the ARM backend being built.

llvm-svn: 333359
2018-05-27 12:54:33 +00:00
David Green 3034281b43 [UnrollAndJam] Add a new Unroll and Jam pass
This is a simple implementation of the unroll-and-jam classical loop
optimisation.

The basic idea is that we take an outer loop of the form:

for i..
  ForeBlocks(i)
  for j..
    SubLoopBlocks(i, j)
  AftBlocks(i)

Instead of doing normal inner or outer unrolling, we unroll as follows:

for i... i+=2
  ForeBlocks(i)
  ForeBlocks(i+1)
  for j..
    SubLoopBlocks(i, j)
    SubLoopBlocks(i+1, j)
  AftBlocks(i)
  AftBlocks(i+1)
Remainder

So we have unrolled the outer loop, then jammed the two inner loops into
one. This can lead to a simpler inner loop if memory accesses can be shared
between the now-jammed loops.

To do this we have to prove that this is all safe, both for the memory
accesses (using dependence analysis) and that ForeBlocks(i+1) can move before
AftBlocks(i) and SubLoopBlocks(i, j).

Differential Revision: https://reviews.llvm.org/D41953

llvm-svn: 333358
2018-05-27 12:11:21 +00:00
Florian Hahn 718af2f817 Revert r333268: [IPSCCP] Use PredicateInfo to propagate facts from...
Reverting this to see if this is causing the failures of the
clang-with-thin-lto-ubuntu bot.

[IPSCCP] Use PredicateInfo to propagate facts from cmp instructions.

This patch updates IPSCCP to use PredicateInfo to propagate
facts to true branches predicated by EQ and to false branches
predicated by NE.

As a follow up, we should be able to extend it to also propagate additional
facts about nonnull.

Reviewers: davide, mssimpso, dberlin, efriedma

Reviewed By: davide, dberlin

Differential Revision: https://reviews.llvm.org/D45330

llvm-svn: 333323
2018-05-25 23:32:02 +00:00
Florian Hahn b4a70b9f47 [IPSCCP] Use PredicateInfo to propagate facts from cmp instructions.
This patch updates IPSCCP to use PredicateInfo to propagate
facts to true branches predicated by EQ and to false branches
predicated by NE.

As a follow up, we should be able to extend it to also propagate additional
facts about nonnull.

Reviewers: davide, mssimpso, dberlin, efriedma

Reviewed By: davide, dberlin

Differential Revision: https://reviews.llvm.org/D45330

llvm-svn: 333268
2018-05-25 11:12:33 +00:00
Chandler Carruth e6c30fdda7 Restore the LoopInstSimplify pass, reverting r327329 that removed it.
The plan had always been to move towards using this rather than so much
in-pass simplification within the loop pipeline, but we never got around
to it.... until only a couple months after it was removed due to disuse.
=/

This commit is just a pure revert of the removal. I will add tests and
do some basic cleanup in follow-up commits. Then I'll wire it into the
loop pass pipeline.

Differential Revision: https://reviews.llvm.org/D47353

llvm-svn: 333250
2018-05-25 01:32:36 +00:00
Jun Bum Lim dfbe6fa832 [LICM] Preserve DT and LoopInfo specifically
Summary:
In LICM, CFG could be changed in splitPredecessorsOfLoopExit(), which update
only DT and LoopInfo. Therefore, we should preserve only DT and LoopInfo specifically,
instead of all analyses that depend on the CFG (setPreservesCFG()).

This change should fix PR37323.

Reviewers: uabelho, davide, dberlin, Ka-Ka

Reviewed By: dberlin

Subscribers: mzolotukhin, bjope, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D46775

llvm-svn: 333198
2018-05-24 15:58:34 +00:00
Chad Rosier 274d72faad [InstCombine] Combine XOR and AES instructions on ARM/ARM64.
The ARM/ARM64 AESE and AESD instructions have a builtin XOR as the first step in
the instruction. Therefore, if the AES key is zero and the AES data was
previously XORed, it can be combined into a single instruction.

Differential Revision: https://reviews.llvm.org/D47239
Patch by Michael Brase!

llvm-svn: 333193
2018-05-24 15:26:42 +00:00
Andrei Elovikov d34b765cb2 [NFC][VPlan] Wrap PlainCFGBuilder with an anonymous namespace.
Summary:
It's internal to the VPlanHCFGBuilder and should not be visible outside of its
translation unit.

Reviewers: dcaballe, fhahn

Reviewed By: fhahn

Subscribers: rengolin, bollu, tschuett, llvm-commits, rkruppe

Differential Revision: https://reviews.llvm.org/D47312

llvm-svn: 333187
2018-05-24 14:31:00 +00:00
Karl-Johan Karlsson 478232d52f [NaryReassociate] Detect deleted instr with WeakVH
Summary:
If NaryReassociate succeed it will, when replacing the old instruction
with the new instruction, also recursively delete trivially
dead instructions from the old instruction. However, if the input to the
NaryReassociate pass contain dead code it is not save to recursively
delete trivially deadinstructions as it might lead to deleting the newly
created instruction.

This patch will fix the problem by using WeakVH to detect this
rare case, when the newly created instruction is dead, and it will then
restart the basic block iteration from the beginning.

This fixes pr37539

Reviewers: tra, meheff, grosser, sanjoy

Reviewed By: sanjoy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D47139

llvm-svn: 333155
2018-05-24 06:09:02 +00:00
Changpeng Fang 5f9154618e StructurizeCFG: Adjust the loop depth for a subregion to order the nodes correctly
Summary:
  StructurizeCFG::orderNodes basically uses a reverse post-order (RPO) traversal of the region list to get the order.
The only problem with it is that sometimes backedges for outer loops will be visited before backedges for inner loops.
To solve this problem, a loop depth based approach has been used to make sure all blocks in this loop has been visited
before moving on to outer loop.

However, we found a problem for a SubRegion which is a loop itself:

--> BB1 --> BB2 --> BB3 -->

In this case, BB2 is a SubRegion (loop), and thus its loopdepth is different than that of BB1 and BB3. This fact will lead
BB2 to be placed in the wrong order.

In this work, we treat the SubRegion as a special case and use its exit block to determine the loop and its depth
to guard the sorting.

Reviewers:
  arsenm, jlebar

Differential Revision:
  https://reviews.llvm.org/D46912

llvm-svn: 333111
2018-05-23 18:34:48 +00:00
Roman Lebedev 6b6c553bb8 [InstCombine] Fold unfolded masked merge pattern with variable mask!
Summary:
Finally fixes [[ https://bugs.llvm.org/show_bug.cgi?id=6773 | PR6773 ]].

Now that the backend is all done, we can finally fold it!

The canonical unfolded masked merge pattern is
```(x &  m) | (y & ~m)```
There is a second, equivalent variant:
```(x | ~m) & (y |  m)```
Only one of them (the or-of-and's i think) is canonical.
And if the mask is not a constant, we should fold it to:
```((x ^ y) & M) ^ y```

https://rise4fun.com/Alive/ndQw

Reviewers: spatel, craig.topper

Reviewed By: spatel

Subscribers: nicholas, RKSimon, llvm-commits

Differential Revision: https://reviews.llvm.org/D46814

llvm-svn: 333106
2018-05-23 17:47:52 +00:00
Jakub Kuderski ef33edd9b5 [Dominators] Add PDT constructor from Function
Summary: This patch adds a PDT constructor from Function and lets codes previously using a local class to do this use PostDominatorTree class directly.

Reviewers: davide, kuhar, grosser, dberlin

Reviewed By: kuhar

Author: NutshellySima

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D46709

llvm-svn: 333102
2018-05-23 17:29:21 +00:00
Craig Topper 3b768e8602 [InstCombine] Negate ABS/NABS patterns by swapping the select operands to remove the negation
Differential Revision: https://reviews.llvm.org/D47236

llvm-svn: 333101
2018-05-23 17:29:03 +00:00
Nicola Zaghen 03d0b91f43 Remove DEBUG macro.
Now that the LLVM_DEBUG() macro landed on the various sub-projects
the DEBUG macro can be removed.
Also change the new uses of DEBUG to LLVM_DEBUG.

Differential Revision: https://reviews.llvm.org/D46952

llvm-svn: 333091
2018-05-23 15:09:29 +00:00
Max Kazantsev d99f3bacb4 [LoopUnswitch] Fix SCEV invalidation in unswitching
Loop unswitching makes substantial changes to a loop that can also affect cached
SCEV info in its outer loops as well, but it only cares to invalidate SCEV cache for the
innermost loop in case of full unswitching and does not invalidate anything at all in
case of trivial unswitching. As result, we may end up with incorrect data in cache.

Differential Revision: https://reviews.llvm.org/D46045
Reviewed By: mzolotukhin

llvm-svn: 333072
2018-05-23 10:09:53 +00:00
Sanjay Patel 4b96935bd7 [InstCombine] use nsw negation for abs libcalls
Also, produce the canonical IR abs (s<0) to be more efficient. 

This is the libcall equivalent of the clang builtin change from:
rL333038

Pasting from that commit message:
The stdlib functions are defined in section 7.20.6.1 of the C standard with:
"If the result cannot be represented, the behavior is undefined."

That lets us mark the negation with 'nsw' because "sub i32 0, INT_MIN" would
be UB/poison.

llvm-svn: 333042
2018-05-22 23:29:40 +00:00
David Bolvansky 1f343fa0e0 [InstCombine] Remove calloc transformations
Summary: Previous patch does not care if a value is changed between calloc and strlen. This needs to be removed from InstCombine and maybe moved to DSE later after some rework.

Reviewers: efriedma

Reviewed By: efriedma

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D47218

llvm-svn: 333022
2018-05-22 20:27:36 +00:00
Florian Hahn a6e63f176c [NewGVN] Fix handling of assumes
This patch fixes two bugs:

* test1: Previously assume(a >= 5) concluded that a == 5. That's only
         valid for assume(a == 5)...
* test2: If operands were swapped, additional users were added to the
         wrong cmp operand. This resulted in an "unsettled iteration"
         assertion failure.

Patch by Nikita Popov

Differential Revision: https://reviews.llvm.org/D46974

llvm-svn: 333007
2018-05-22 17:38:22 +00:00
David Bolvansky 41f4b64ee1 [InstCombine] Calloc-ed strings optimizations
Summary:
Example cases:
strlen(calloc(...)) -> 0

Reviewers: efriedma, bkramer

Reviewed By: bkramer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D47059

llvm-svn: 332990
2018-05-22 15:41:23 +00:00
Karl-Johan Karlsson 11d68a619e [LowerSwitch] Fixed faulty PHI node update
Summary:
When lowerswitch merge several cases into a new default block it's not
updating the PHI nodes accordingly. The code that update the PHI nodes
for the default edge only update the first entry and do not remove the
remaining ones, to make sure the number of entries match the number of
predecessors.

This is easily fixed by replacing the code that update the PHI node with
the already existing utility function for updating PHI nodes.

Reviewers: hans, reames, arsenm

Reviewed By: arsenm

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D47055

llvm-svn: 332960
2018-05-22 08:46:48 +00:00
Bjorn Pettersson fecef6be9e [LoopVersioning] Don't modify the list that we iterate over in addPHINodes
Summary:
In LoopVersioning::addPHINodes we need to iterate over all
users for a value "Inst", and if the user is outside of the
VersionedLoop we should replace the use of "Inst" by using
the value "PN" instead.

Replacing the use of "Inst" for a user of "Inst" also means
that Inst->users() is modified. So it is not safe to do the
replace while iterating over Inst->users() as we used to do.
This patch splits the task into two steps. First we iterate
over Inst->users() to find all users that should be updated.
Those users are saved into a local data structure on the stack.
And then, in the second step, we do the actual updates. This
time iterating over the local data structure.

Reviewers: mzolotukhin, anemet

Reviewed By: mzolotukhin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D47134

llvm-svn: 332958
2018-05-22 08:33:02 +00:00
Stanislav Mekhanoshin 0e132dca53 [AMDGPU] Optimze old value of v_mov_b32_dpp
We can eliminate old value if bound_ctrl = 1 and row_mask = bank_mask = 0xf.
This is alternative implementation working with the intrinsic in InstCombine.
Original review for past-ISel optimization: D46570.

Differential Revision: https://reviews.llvm.org/D46596

llvm-svn: 332956
2018-05-22 08:04:33 +00:00
Diego Caballero 1bd5f2261d Fix warning from r332654 with LLVM_ATTRIBUTE_USED
r332654 tried to fix an unused function warning with
a void cast. This approach worked for clang and gcc 
but not for MSVC. This commit replaces the void cast
with the LLVM_ATTRIBUTE_USED approach.

llvm-svn: 332910
2018-05-21 22:12:38 +00:00
Sanjay Patel b8346e3f07 [InstCombine] remove fptrunc (select) code; NFCI
This pattern is handled within commonCastTransforms(),
so the code here is dead AFAICT.

llvm-svn: 332887
2018-05-21 20:39:35 +00:00
Craig Topper f14e62c9a5 [EarlyCSE] Improve EarlyCSE of some absolute value cases.
Change matchSelectPattern to return X and -X for ABS/NABS in a well defined order. Adjust EarlyCSE to account for this. Ensure the SPF result is some kind of min/max and not abs/nabs in one place in InstCombine that made me nervous.

Prevously we returned the two operands of the compare part of the abs pattern. The RHS is always going to be a 0i, 1 or -1 constant. This isn't a very meaningful thing to return for any one. There's also some freedom in the abs pattern as to what happens when the value is equal to 0. This freedom led to early cse failing to match when different constants were used in otherwise equivalent operations. By returning the input and its negation in a defined order we can ensure an exact match. This also makes sure both patterns use the exact same subtract instruction for the negation. I believe CSE should evebntually make this happen and properly merge the nsw/nuw flags. But I'm not familiar with CSE and what order it does things in so it seemed like it might be good to really enforce that they were the same.

Differential Revision: https://reviews.llvm.org/D47037

llvm-svn: 332865
2018-05-21 18:42:42 +00:00
Diego Caballero 168d04d544 [VPlan] Reland r332654 and silence unused func warning
r332654 was reverted due to an unused function warning in
release build. This commit includes the same code with the
warning silenced.

Differential Revision: https://reviews.llvm.org/D44338

llvm-svn: 332860
2018-05-21 18:14:23 +00:00
Alexey Bataev 7c9ad0db3d [InstCombine] Fix PR37526: MinMax patterns produce an infinite loop.
Summary:
This patch fixes PR37526 by simplifying the newly generated LoadInst
instructions. If the pointer address is a bitcast from the pointer to
the NewType, we can just remove this extra bitcast instead of creating
the new one. This fixes the PR37526 + may speed up the whole compilation
process.

Reviewers: spatel, RKSimon, hfinkel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D47144

llvm-svn: 332855
2018-05-21 17:46:34 +00:00
Nico Weber e4a12cfa2f revert r332610, it breaks cfi, see D46326
llvm-svn: 332838
2018-05-21 11:44:39 +00:00
David Green 8ceab61c75 [CVP] Require DomTree for new Pass Manager
We were previously using a DT in CVP through SimplifyQuery, but not requiring it in
the new pass manager. Hence it would crash if DT was not already available. This now
gets DT directly and plumbs it through to where it is used (instead of using it
through SQ).

llvm-svn: 332836
2018-05-21 11:06:28 +00:00
Eric Christopher 563d0b9cb9 Fix up a few grammar issues.
llvm-svn: 332835
2018-05-21 10:27:36 +00:00
Craig Topper e4c045b7df [X86] Remove mask arguments from permvar builtins/intrinsics. Use a select in IR instead.
Someday maybe we'll use selects for all intrinsics.

llvm-svn: 332824
2018-05-20 23:34:04 +00:00
Sanjay Patel a003c728a5 [InstCombine] choose 1 form of abs and nabs as canonical
We already do this for min/max (see the blob above the diff), 
so we should do the same for abs/nabs.
A sign-bit check (<s 0) is used as a predicate for other IR 
transforms and it's likely the best for codegen.

This might solve the motivating cases for D47037 and D47041, 
but I think those patches still make sense. We can't guarantee 
this canonicalization if the icmp has more than one use.

Differential Revision: https://reviews.llvm.org/D47076

llvm-svn: 332819
2018-05-20 14:23:23 +00:00
Max Kazantsev c0b268f90c [IRCE] Fix miscompile with range checks against negative values
In the patch rL329547, we have lifted the over-restrictive limitation on collected range
checks, allowing to work with range checks with the end of their range not being
provably non-negative. However it appeared that the non-negativity of this value was
assumed in the utility function `ClampedSubtract`. In particular, its reasoning is based
on the fact that `0 <= SINT_MAX - X`, which is not true if `X` is negative.

The function `ClampedSubtract` is only called twice, once with `X = 0` (which is OK)
and the second time with `X = IRC.getEnd()`, where we may now see the problem if
the end is actually a negative value. In this case, we may sometimes miscompile.

This patch is the conservative fix of the miscompile problem. Rather than rejecting
non-provably non-negative `getEnd()` values, we will check it for non-negativity in
runtime. For this, we use function `smax(smin(X, 0), -1) + 1` that is equal to `1` if `X`
is non-negative and is equal to 0 if `X` is negative. If we multiply `Begin, End` of safe
iteration space by this function calculated for `X = IRC.getEnd()`, we will get the original
`[Begin, End)` if `IRC.getEnd()` was non-negative (and, thus, `ClampedSubtract` worked
correctly) and the empty range `[0, 0)` in case if ` IRC.getEnd()` was negative.

So we in fact prohibit execution of the main loop if at least one of range checks was
made against a negative value (and we figured it out in runtime). It is still better than
what we have before (non-negativity had to be proved in compile time) and prevents
us from miscompile, however it is sometiles too restrictive for unsigned range checks
against a negative value (which in fact can be eliminated).

Once we re-implement `ClampedSubtract` in a way that it handles negative `X` correctly,
this limitation can be lifted, too.

Differential Revision: https://reviews.llvm.org/D46860
Reviewed By: samparker

llvm-svn: 332809
2018-05-19 13:06:37 +00:00
Benjamin Kramer a76b64ff80 [MergeICmps] Don't crash when memcmp is not available
Fixes clang crashing with -fno-builtin, PR37527.

llvm-svn: 332808
2018-05-19 12:51:59 +00:00
Yaxun Liu ea988f1fd9 Fix evaluator for non-zero alloca addr space
The evaluator goes through BB and creates global vars as temporary values to evaluate
results of LLVM instructions. It creates undef for alloca, however it assumes alloca
in addr space 0. If the next instruction is addrspace cast to 0, then we get an invalid
cast instruction.

This patch let the temp global var have an address space matching alloca addr space,
so that the valuation can be done.

Differential Revision: https://reviews.llvm.org/D47081

llvm-svn: 332794
2018-05-19 02:58:16 +00:00
Piotr Padlewski a26a08cb52 Constant fold launder of null and undef
Summary:
This might be useful because clang will add
some barriers for pointer comparisons.

Reviewers: majnemer, dberlin, hfinkel, nlewycky, davide, rsmith, amharc,
kuhar

Subscribers: davide, amharc, llvm-commits

Differential Revision: https://reviews.llvm.org/D32423

llvm-svn: 332786
2018-05-18 23:52:57 +00:00
Craig Topper 0198b73769 [InstCombine] Qualify a select pattern based transform to restrct to only min/max and ignore abs/nabs.
llvm-svn: 332770
2018-05-18 21:21:56 +00:00
Evgeniy Stepanov 28f330fd6f [msan] Don't check divisor shadow in fdiv.
Summary:
Floating point division by zero or even undef does not have undefined
behavior and may occur due to optimizations.

Fixes https://bugs.llvm.org/show_bug.cgi?id=37523.

Reviewers: kcc

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D47085

llvm-svn: 332761
2018-05-18 20:19:53 +00:00
Galina Kistanova 083ea389d6 Reverted r332654 as it has broken some buildbots and left unfixed for a long time.
The introduced problem is:
llvm.src/lib/Transforms/Vectorize/VPlanVerifier.cpp:29:13: error: unused function 'hasDuplicates' [-Werror,-Wunused-function]
static bool hasDuplicates(const SmallVectorImpl<VPBlockBase *> &VPBlockVec) {
            ^

llvm-svn: 332747
2018-05-18 18:14:06 +00:00
David Stenberg 0af67e5b65 [SimplifyCFG] Fix a debug invariant bug in FoldBranchToCommonDest()
Summary:
Fix a case where FoldBranchToCommonDest() would bail out from doing CSE
when encountering a debug intrinsic. Handle that by skipping past the
debug intrinsics.

Also, as a minor refactoring, rename checkCSEInPredecessor() to
tryCSEWithPredecessor() to make it a bit more clear that the function
may remove instructions.

Reviewers: fhahn, craig.topper, dblaikie, xbolva00

Reviewed By: fhahn, xbolva00

Subscribers: vsk, davide, llvm-commits

Differential Revision: https://reviews.llvm.org/D46635

llvm-svn: 332698
2018-05-18 08:52:15 +00:00
Walter Lee cdbb207bd1 [asan] Add instrumentation support for Myriad
1. Define Myriad-specific ASan constants.

2. Add code to generate an outer loop that checks that the address is
   in DRAM range, and strip the cache bit from the address.  The
   former is required because Myriad has no memory protection, and it
   is up to the instrumentation to range-check before using it to
   index into the shadow memory.

3. Do not add an unreachable instruction after the error reporting
   function; on Myriad such function may return if the run-time has
   not been initialized.

4. Add a test.

Differential Revision: https://reviews.llvm.org/D46451

llvm-svn: 332692
2018-05-18 04:10:38 +00:00
Heejin Ahn b4be38fcdd [WebAssembly] Add Wasm personality and isScopedEHPersonality()
Summary:
- Add wasm personality function
- Re-categorize the existing `isFuncletEHPersonality()` function into
two different functions: `isFuncletEHPersonality()` and
`isScopedEHPersonality(). This becomes necessary as wasm EH uses scoped
EH instructions (catchswitch, catchpad/ret, and cleanuppad/ret) but not
outlined funclets.
- Changed some callsites of `isFuncletEHPersonality()` to
`isScopedEHPersonality()` if they are related to scoped EH IR-level
stuff.

Reviewers: majnemer, dschuff, rnk

Subscribers: jfb, sbc100, jgravelle-google, eraman, JDevlieghere, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D45559

llvm-svn: 332667
2018-05-17 20:52:03 +00:00
Diego Caballero f58ad3129c [LV][VPlan] Build plain CFG with simple VPInstructions for outer loops.
Patch #3 from VPlan Outer Loop Vectorization Patch Series #1
(RFC: http://lists.llvm.org/pipermail/llvm-dev/2017-December/119523.html).

Expected to be NFC for the current inner loop vectorization path. It
introduces the basic algorithm to build the VPlan plain CFG (single-level
CFG, no hierarchical CFG (H-CFG), yet) in the VPlan-native vectorization
path using VPInstructions. It includes:
  - VPlanHCFGBuilder: Main class to build the VPlan H-CFG (plain CFG without nested regions, for now).
  - VPlanVerifier: Main class with utilities to check the consistency of a H-CFG.
  - VPlanBlockUtils: Main class with utilities to manipulate VPBlockBases in VPlan.

Reviewers: rengolin, fhahn, mkuper, mssimpso, a.elovikov, hfinkel, aprantl.

Differential Revision: https://reviews.llvm.org/D44338

llvm-svn: 332654
2018-05-17 19:24:47 +00:00
Xinliang David Li bc471c39ee Add a limit for phi folding instcombine
Differential Revision: http://reviews.llvm.org/D47023

llvm-svn: 332653
2018-05-17 19:24:03 +00:00
Craig Topper bd332588bd [InstCombine] Propagate the nsw/nuw flags from the add in the 'shifty' abs pattern to the sub in the select version.
According to alive this is valid. I'm hoping to use this to make an assumption that the sign bit is zero after this sequence. The only way it wouldn't be is if the input was INT__MIN, but by preserving the flags we can make doing this to INT_MIN UB.

The nuw flags is weird because it creates such a contradiction that the original number would have to be positive meaning we could remove the select entirely, but we don't get that far.

Differential Revision: https://reviews.llvm.org/D46988

llvm-svn: 332623
2018-05-17 16:29:52 +00:00
Dmitry Mikulin 3c6b4e35bd In thin and full LTO + CFI, direct function calls may go through jump table
entries to reach the target. Since these calls don't require type checks,
we can short-circuit them to their real targets.

Differential Revision: https://reviews.llvm.org/D46326

llvm-svn: 332610
2018-05-17 14:29:07 +00:00
Bjorn Pettersson 81a76a388a [SROA] Handle PHI with multiple duplicate predecessors
Summary:
The verifier accepts PHI nodes with multiple entries for the
same basic block, as long as the value is the same.

As seen in PR37203, SROA did not handle such PHI nodes properly
when speculating loads over the PHI, since it inserted multiple
loads in the predecessor block and changed the PHI into having
multiple entries for the same basic block, but with different
values.

This patch teaches SROA to reuse the same speculated load for
each PHI duplicate entry in such situations.

Resolves: https://bugs.llvm.org/show_bug.cgi?id=37203

Reviewers: uabelho, chandlerc, hfinkel, bkramer, efriedma

Reviewed By: efriedma

Subscribers: dberlin, efriedma, llvm-commits

Differential Revision: https://reviews.llvm.org/D46426

llvm-svn: 332577
2018-05-17 07:21:41 +00:00
Hiroshi Inoue f5c0e6c285 [SROA] pr37267: fix assertion failure in integer widening
The current integer widening does not support rewriting partial split slices in rewriteIntegerStore (and rewriteIntegerLoad).
This patch adds explicit checks for this case in isIntegerWideningViableForSlice.
Before r322533, splitting is allowed only for the whole-alloca slice and hence the above case is implicitly rejected by another check `if (DL.getTypeStoreSize(ValueTy) > Size)` because whole-alloca slice is larger than the partition.

Differential Revision: https://reviews.llvm.org/D46750

llvm-svn: 332575
2018-05-17 06:32:17 +00:00
Vedant Kumar 5a0872c2b7 [STLExtras] Add size() for ranges, and remove distance()
r332057 introduced distance() for ranges. Based on post-commit feedback,
this renames distance() to size(). The new size() is also only enabled
when the operation is O(1).

Differential Revision: https://reviews.llvm.org/D46976

llvm-svn: 332551
2018-05-16 23:20:42 +00:00
Benjamin Kramer 8ac15bf4dc [InstCombine] Fix the signature of fgets_unlocked.
It returns a pointer, not an int. This miscompiles all code that uses
the return value of fgets.

llvm-svn: 332531
2018-05-16 21:45:39 +00:00
Sanjay Patel 2eb3512090 [InstCombine] allow more binop (shuffle X), C transforms
The canonicalization was restricted to shuffle masks with
a 1-to-1 mapping to the constant vector, but that disqualifies
the common splat pattern. This is part of solving PR37463:
https://bugs.llvm.org/show_bug.cgi?id=37463

llvm-svn: 332479
2018-05-16 15:15:22 +00:00
David Bolvansky ca22d427b9 [SimplifyLibcalls] Replace locked IO with unlocked IO
Summary: If file stream arg is not captured and source is fopen, we could replace IO calls by unlocked IO ("_unlocked" function variants) to gain better speed,

Reviewers: efriedma, RKSimon, spatel, sanjoy, hfinkel, majnemer, lebedev.ri, rja

Reviewed By: rja

Subscribers: rja, srhines, efriedma, lebedev.ri, llvm-commits

Differential Revision: https://reviews.llvm.org/D45736

llvm-svn: 332452
2018-05-16 11:39:52 +00:00
David Green cdee1d957e [LoopUnroll] Split out simplify code after Unroll into a new function. NFC
So that it can be shared with other passes that may end up doing the same
thing.

Differential Revision: https://reviews.llvm.org/D45874

llvm-svn: 332450
2018-05-16 10:41:58 +00:00
Shoaib Meenai 074728a2a9 [ObjCARC] Prevent code motion into a catchswitch
A catchswitch must be the only non-phi instruction in its basic block;
attempting to move a retain or release into a catchswitch basic block
will result in invalid IR. Explicitly mark a CFG hazard in this case to
prevent the code motion.

Differential Revision: https://reviews.llvm.org/D46482

llvm-svn: 332430
2018-05-16 04:52:18 +00:00
Evgeny Stupachenko bff9302c3d Fix LSR compile time hang.
Summary:
Limit number of reassociations in GenerateReassociationsImpl.

Reviewers: qcolombet, mkazantsev

Differential Revision: https://reviews.llvm.org/D46039

From: Evgeny Stupachenko <evstupac@gmail.com>
                         <evgeny.v.stupachenko@intel.com>
llvm-svn: 332426
2018-05-16 02:48:50 +00:00
Sanjay Patel 919882638e [InstCombine] fix binop (shuffle X), C --> shuffle (binop X, C') to check uses
llvm-svn: 332407
2018-05-15 22:00:37 +00:00
Marek Olsak 3c5fd145c5 StructurizeCFG: fix inverting conditions
Author: Samuel Pitoiset

Without this patch, it appears to me that we are selecting
the wrong operand when inverting conditions. In the attached
test, it will select %tmp3 instead of %tmp4. To fix it, just
use 'A' as everywhere.

This fixes a regression introduced by
"[PatternMatch] define m_Not using m_Xor and cst_pred_ty"

https://reviews.llvm.org/D46351

llvm-svn: 332403
2018-05-15 21:41:55 +00:00
Evgeniy Stepanov 091fed94ae [msan] Instrument masked.store, masked.load intrinsics.
Summary: Instrument masked store/load intrinsics.

Reviewers: kcc

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D46785

llvm-svn: 332402
2018-05-15 21:28:25 +00:00
Sanjay Patel 3c569f0de0 [InstCombine] clean up code for binop-shuffle transforms; NFCI
llvm-svn: 332399
2018-05-15 21:23:58 +00:00
Sanjay Patel 3c35290c58 [InstCombine] fix binop-of-shuffles to check uses
llvm-svn: 332375
2018-05-15 17:14:23 +00:00
whitequark 8f0ab258bd [MergeFunctions] Fix merging of small weak functions
When two interposable functions are merged, we cannot replace
uses and have to emit calls to a common internal function. However,
writeThunk() will not actually emit a thunk if the function is too
small. This leaves us in a broken state where mergeTwoFunctions
already rewired the functions, but writeThunk doesn't do anything.

This patch changes the implementation so that:

 * writeThunk() does just that.
 * The direct replacement of calls is moved into mergeTwoFunctions()
   into the non-interposable case only.
 * isThunkProfitable() is extracted and will be called for
   the non-iterposable case always, and in the interposable case
   only if uses are still left after replacement.

This issue has been introduced in https://reviews.llvm.org/D34806,
where the code for checking thunk profitability has been moved.

Differential Revision: https://reviews.llvm.org/D46804

Reviewed By: whitequark

llvm-svn: 332342
2018-05-15 11:31:07 +00:00
Max Kazantsev 9b90373c8b [NFC] Add const to method signature
llvm-svn: 332317
2018-05-15 01:21:56 +00:00
Keno Fischer de577af8c0 [InstCombine] fix crash due to ignored addrspacecast
Summary:
Part of the InstCombine code for simplifying GEPs looks through
addrspacecasts. However, this was done by updating a variable
also used by the next transformation, for marking GEPs as
inbounds. This led to replacing a GEP with a similar instruction
in a different addrspace, which caused an assertion failure in RAUW.

This caused julia issue https://github.com/JuliaLang/julia/issues/27055

Patch by Jeff Bezanson <jeff@juliacomputing.com>

Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D46722

llvm-svn: 332302
2018-05-14 22:05:01 +00:00
Sanjay Patel bf55e6dee1 [AggressiveInstCombine] avoid crashing on unsimplified code (PR37446)
This bug:
https://bugs.llvm.org/show_bug.cgi?id=37446
...raises another question: why do we run aggressive-instcombine before 
regular instcombine?

llvm-svn: 332243
2018-05-14 13:43:32 +00:00
Nicola Zaghen d34e60ca85 Rename DEBUG macro to LLVM_DEBUG.
The DEBUG() macro is very generic so it might clash with other projects.
The renaming was done as follows:
- git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g'
- git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM
- Manual change to APInt
- Manually chage DOCS as regex doesn't match it.

In the transition period the DEBUG() macro is still present and aliased
to the LLVM_DEBUG() one.

Differential Revision: https://reviews.llvm.org/D43624

llvm-svn: 332240
2018-05-14 12:53:11 +00:00
Nicola Zaghen 617d4a8199 Test commit access.
Remove trailing whitespace.

llvm-svn: 332220
2018-05-14 08:24:29 +00:00
Craig Topper 0e71c6d5ca [X86] Remove and autoupgrade the cvtusi2sd intrinsic. Use uitofp+insertelement instead.
llvm-svn: 332206
2018-05-14 00:06:49 +00:00
Craig Topper 911025b1cd [X86] Extend instcombine folds for pclmuldq intrinsics to the 256 and 512 bit version.
llvm-svn: 332202
2018-05-13 21:56:32 +00:00
Craig Topper 85906cf041 [X86] Remove and autoupgrade masked vpermd/vpermps intrinsics.
llvm-svn: 332198
2018-05-13 18:03:59 +00:00
Craig Topper df3a9cedff [X86] Remove an autoupgrade legacy cvtss2sd intrinsics.
llvm-svn: 332187
2018-05-13 00:29:40 +00:00
Craig Topper 38ad7ddabc [X86] Remove and autoupgrade cvtsi2ss/cvtsi2sd intrinsics to match what clang has used for a very long time.
llvm-svn: 332186
2018-05-12 23:14:39 +00:00
Michael Zolotukhin a41660df7e Reapply "[PR16756] Use SSAUpdaterBulk in JumpThreading."
Stage3/stage4 bootstrap miscompares should be fixed by a non-determinism
fix in IDF (r332167).

This reverts commit r330446.

llvm-svn: 332168
2018-05-12 01:52:36 +00:00
Sergey Dmitriev 69c9cd277d [CodeExtractor] Allow extracting blocks with exception handling
This is a CodeExtractor improvement which adds support for extracting blocks
which have exception handling constructs if that is legal to do. CodeExtractor
performs validation checks to ensure that extraction is legal when it finds
invoke instructions or EH pads (landingpad, catchswitch, or cleanuppad) in
blocks to be extracted.

I have also added an option to allow extraction of blocks with alloca
instructions, but no validation is done for allocas. CodeExtractor caller has
to validate it himself before allowing alloca instructions to be extracted.
By default allocas are still not allowed in extraction blocks.

Differential Revision: https://reviews.llvm.org/D45904

llvm-svn: 332151
2018-05-11 22:49:49 +00:00
Craig Topper a17d627abb [X86] Remove and autoupgrade a bunch of FMA instrinsics that are no longer used by clang.
llvm-svn: 332146
2018-05-11 21:59:34 +00:00
Artem Belevich c2cd5d5ce0 [Split GEP] handle trunc() in separate-const-offset-from-gep pass.
Let separate-const-offset-from-gep pass handle trunc() when it calculates
constant offset relative to base. The pass itself may insert trunc()
instructions when it canonicalises array indices to pointer-size integers
and needs to handle trunc() in order to evaluate the offset.

Differential Revision: https://reviews.llvm.org/D46732

llvm-svn: 332142
2018-05-11 21:13:19 +00:00
Daniel Neilson f6651d4d94 [InstCombine] Handle atomic memset in the same way as regular memset
Summary:
This change adds handling of the atomic memset intrinsic to the
code path that simplifies the regular memset. In practice this means
that we will now also expand a small constant-length atomic memset
into a single unordered atomic store.

Reviewers: apilipenko, skatkov, mkazantsev, anna, reames

Reviewed By: reames

Subscribers: reames, llvm-commits

Differential Revision: https://reviews.llvm.org/D46660

llvm-svn: 332132
2018-05-11 20:04:50 +00:00
David Bolvansky cd93c4ef1a [InstCombine] snprintf optimizations
Reviewers: spatel, efriedma, majnemer, rja, bkramer

Reviewed By: rja, bkramer

Subscribers: mstorsjo, rja, llvm-commits

Differential Revision: https://reviews.llvm.org/D46285

llvm-svn: 332110
2018-05-11 17:50:49 +00:00
Davide Italiano 6e1f7bf316 [Reassociate] Prevent infinite loops when processing PHIs.
Phi nodes can reside in live blocks but one of their incoming
arguments can come from a dead block. Dead blocks and reassociate
don't play nice together. In fact, reassociate performs an RPO
as a first step to avoid processing dead blocks.

The reason why Reassociate might not fixpoint when examining
dead blocks is that the following:

  %xor0 = xor i16 %xor1, undef
  %xor1 = xor i16 %xor0, undef

is perfectly valid LLVM IR (if it appears in a dead block),
so the worklist algorithm keeps pushing the two instructions for
reexamination. Note that this is not Reassociate fault, at least
not entirely. It's llvm that has a weird definition of dominance.

Fixes PR37390.

llvm-svn: 332100
2018-05-11 15:45:36 +00:00
Daniel Neilson 8f30ec65b0 [InstCombine] Unify handling of atomic memtransfer with non-atomic memtransfer
Summary:
This change reworks the handling of atomic memcpy within the instcombine pass.
Previously, a constant length atomic memcpy would be lowered into loads & stores
as long as no more than 16 load/store pairs are created. This is quite different
from the lowering done for a non-atomic memcpy; which only ever lowers into a single
load/store pair of no more than 8 bytes. Larger constant-sized memcpy calls are
expanded to load/stores in later passes, such as SelectionDAG lowering.

In this change the behaviour for atomic memcpy is unified with non-atomic memcpy;
atomic memcpy is now treated in the same was as non-atomic memcpy has always been.
We leave it to later passes to lower longer-length atomic memcpy calls.

Due to the structure of the pass's handling of memtransfer intrinsics, this change
also gives us handling of atomic memmove that we did not previously have.

Reviewers: apilipenko, skatkov, mkazantsev, anna, reames

Reviewed By: reames

Subscribers: reames, llvm-commits

Differential Revision: https://reviews.llvm.org/D46658

llvm-svn: 332093
2018-05-11 14:30:02 +00:00
Brian Gesiak c651113439 [Coroutines] PR34897: Fix incorrect elisions
Summary:
https://bugs.llvm.org/show_bug.cgi?id=34897 demonstrates an incorrect
coroutine frame allocation elision in the coro-elide pass. The elision
is performed on the basis that the SSA variables from all llvm.coro.begin
are directly referenced in subsequent llvm.coro.destroy instructions.

However, this ignores the fact that the function may exit through paths
that do not run these destroy instructions. In the sample program from
PR34897, for example, the llvm.coro.destroy instruction is only
executed in exception handling code. When the coroutine function exits
normally, llvm.coro.destroy is not called. Eliding the allocation in
this case causes a subsequent reference to the coroutine handle from
outside of the function to access freed memory.

To fix the issue, when finding an llvm.coro.destroy for each llvm.coro.begin,
only consider llvm.coro.destroy that are executed along non-exceptional paths.

Test Plan:
1. Download the sample program from
   https://bugs.llvm.org/show_bug.cgi?id=34897, compile it with
   `clang++ -fcoroutines-ts -stdlib=libc++ -std=c++1z -O2`, and run it.
   It should print `"run1\ncheck1\nrun2\ncheck2"` and then exit
   successfully.
2. Compile https://godbolt.org/g/mCKfnr and confirm it is still
   optimized to a single instruction, 'return 1190'.
3. `check-llvm`

Reviewers: rsmith, GorNishanov, eric_niebler

Reviewed By: GorNishanov

Subscribers: andrewrk, lewissbaker, EricWF, llvm-commits

Differential Revision: https://reviews.llvm.org/D43242

llvm-svn: 332077
2018-05-11 03:12:28 +00:00
Kostya Serebryany a2759327fd [sanitizer-coverage] don't instrument a function if it's entry block ends with 'unreachable'
llvm-svn: 332072
2018-05-11 01:09:39 +00:00
Kamil Rytarowski 02c432a72b Register NetBSD/i386 in AddressSanitizer.cpp
Summary:
Ship kNetBSD_ShadowOffset32 set to 1ULL << 30.

This is prepared for the amd64 kernel runtime.

Sponsored by <The NetBSD Foundation>

Reviewers: vitalybuka, joerg, kcc

Reviewed By: vitalybuka

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D46724

llvm-svn: 332069
2018-05-11 00:58:01 +00:00
Wei Mi 0c2f6be662 [SampleFDO] Don't treat warm callsite with inline instance in the profile as cold
We found current sampleFDO had a performance issue when triaging a regression.
For a callsite with inline instance in the profile, even if hot callsite inliner
cannot inline it, it may still execute enough times and should not be treated as
cold in regular inliner later. However, currently if such callsite is not inlined
by hot callsite inliner, and the BB where the callsite locates doesn't get
samples from other instructions inside of it, the callsite will have no profile
metadata annotated. In regular inliner cost analysis, if the callsite has no
profile annotated and its caller has profile information, it will be treated as
cold.

The fix changes the isCallsiteHot check and chooses to compare
CallsiteTotalSamples with hot cutoff value computed by ProfileSummaryInfo.

Differential Revision: https://reviews.llvm.org/D45377

llvm-svn: 332058
2018-05-10 23:02:27 +00:00
Vedant Kumar e0b5f86b30 [STLExtras] Add distance() for ranges, pred_size(), and succ_size()
This commit adds a wrapper for std::distance() which works with ranges.
As it would be a common case to write `distance(predecessors(BB))`, this
also introduces `pred_size()` and `succ_size()` helpers to make that
easier to write.

Differential Revision: https://reviews.llvm.org/D46668

llvm-svn: 332057
2018-05-10 23:01:54 +00:00
Craig Topper ea78a261de [InstCombine] Replace an 'if' that should always be true with an assert.
The bitwidth of the operation should always be wider than the result width of the truncate since we don't recurse through any width changing operations.

llvm-svn: 332055
2018-05-10 22:45:28 +00:00
Martin Storsjo 86e6742c17 Revert "[InstCombine] snprintf optimizations"
This reverts commit SVN r331889, which could trigger failed
assertions for cases where the snprintf function is declared
with a vaguely differing signature (e.g. being defined as
static inline), see PR37408.

llvm-svn: 332043
2018-05-10 21:23:36 +00:00
Sanjay Patel c7bb14301a [InstCombine] add folds for minnum(-a, -b) --> -maxnum(a, b)
This is similar to what we do for integer min/max with 'not'
ops (rL321882).

This should fix:
https://bugs.llvm.org/show_bug.cgi?id=37404
https://bugs.llvm.org/show_bug.cgi?id=37405

llvm-svn: 332031
2018-05-10 20:03:13 +00:00
Omer Paparo Bivas fbb83deef7 [InstCombine] Moving overflow computation logic from InstCombine to ValueTracking; NFC
Differential Revision: https://reviews.llvm.org/D46704

Change-Id: Ifabcbe431a2169743b3cc310f2a34fd706f13f02
llvm-svn: 332026
2018-05-10 19:46:19 +00:00
Chandler Carruth baf045fb28 [PM/LoopUnswitch] Avoid pointlessly creating an exit block set.
This code can just test whether blocks are *in* the loop, which we
already have a dedicated set tracking in the loop itself.

llvm-svn: 332004
2018-05-10 17:33:20 +00:00
Daniel Neilson 71fa1b904a [DSE] Teach the pass about partial overwrite of atomic memory intrinsics
Summary:
This change teaches DSE that the atomic memory intrinsics can be overwriten
partially in the same way as the non-atomic forms. Specifically, that the
atomic memcpy & memset can be shortened at the end and that the atomic memset
can be shortened at the beginning, if they partially overwritten
by later stores.

Reviewers: mkazantsev, skatkov, apilipenko, efriedma, rsmith, spatel, filcab, sanjoy

Reviewed By: efriedma

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D45584

llvm-svn: 331991
2018-05-10 15:12:49 +00:00
whitequark 68403564df [PR37339] Fix assertion in FunctionComparator::cmpInlineAsm
Fixes bug https://bugs.llvm.org/show_bug.cgi?id=37339.

InlineAsm is only uniqued if the FunctionTypes are exactly the
same, while cmpTypes() for example considers all pointer types
in the default address space to be the same. For this reason
the end of cmpInlineAsm() can be reached.

This patch replaces the unreachable assertion with a check that
the function types are not identical.

Differential Revision: https://reviews.llvm.org/D46495

Reviewers: jfb
llvm-svn: 331990
2018-05-10 15:05:47 +00:00
Benjamin Kramer 456f473ea8 [InstCombine] Only propagate known leading zeros from udiv input to output.
Put in a conservatively correct estimate for now. Avoids miscompiling
clang in FDO mode. This is really tricky to trigger in reality as
basically all interesting cases will be folded away by computeKnownBits
earlier, I was unable to find a reasonably small test case.

llvm-svn: 331975
2018-05-10 11:45:18 +00:00
Craig Topper 553d451e95 [InstCombine] Reorder an if condition to put a cheap check in front of a computeKnownBits call. NFC
llvm-svn: 331948
2018-05-10 00:53:25 +00:00
Craig Topper 333efc951a [InstCombine] Use APInt::getBitsSetFrom to shortern a line and fix an 80 columns violation. NFC
Fix a similar line in the same function.

llvm-svn: 331947
2018-05-10 00:53:22 +00:00
Philip Reames 913a779df2 [Inscombine] fix a signedness warning which broke -Werror builds
llvm-svn: 331944
2018-05-10 00:05:29 +00:00
Sanjay Patel ac3951a735 [AggressiveInstCombine] convert a chain of 'and-shift' bits into masked compare
This is a follow-up to D45986. As suggested there, we should match the "all-bits-set" 
pattern in addition to "any-bits-set".

This was a little more complicated than I thought it would be initially because the 
"and 1" instruction can be anywhere in the chain. Hopefully, the code comments make 
that logic understandable, but if you see a way to simplify or improve that, it's 
most appreciated.

This transforms patterns that emerge from bitfield tests as seen in PR37098:
https://bugs.llvm.org/show_bug.cgi?id=37098

I think it would also help reduce the large test from:
D46336
D46595 
but we need something to reassociate that case to the forms we're expecting here first.

Differential Revision: https://reviews.llvm.org/D46649

llvm-svn: 331937
2018-05-09 23:08:15 +00:00
Philip Reames 79e917d117 [InstCombine] Widen guards with conditions between
The previous handling for guard widening in InstCombine was extremely restrictive. In particular, it didn't handle the common case where we had two guards separated by a single icmp. Handle this by scanning through a small fixed window of instructions to find the next guard if needed.

Differential Revision: https://reviews.llvm.org/D46203

llvm-svn: 331935
2018-05-09 22:56:32 +00:00
Benjamin Kramer 0d2fc1a501 [InstCombine] Teach SimplifyDemandedBits that udiv doesn't demand low dividend bits that are zero in the divisor
This is safe as long as the udiv is not exact. The pattern is not common in
C++ code, but comes up all the time in code generated by XLA's GPU backend.

Differential Revision: https://reviews.llvm.org/D46647

llvm-svn: 331933
2018-05-09 22:27:34 +00:00
David Bolvansky 9b5e6e8288 [InstCombine] snprintf optimizations
Reviewers: spatel, efriedma, majnemer, rja, bkramer

Reviewed By: rja, bkramer

Subscribers: rja, llvm-commits

Differential Revision: https://reviews.llvm.org/D46285

llvm-svn: 331889
2018-05-09 16:09:31 +00:00
Krzysztof Parzyszek ea4c1bb772 [LV] Change MaxVectorSize bound to 256 in assertion, NFC otherwise
It's possible to have a vector of 256 bytes in HVX code on Hexagon
(vector pair in 128-byte mode).

llvm-svn: 331885
2018-05-09 15:18:12 +00:00
Benjamin Kramer ccb0fbe9a0 Revert "[InstCombine] snprintf optimizations"
This reverts commit r331849. It miscompiles
snprintf(buf, sizeof(buf), "%s", "any constant string); into
memcpy(buf, "%s", sizeof("any constant string"));

llvm-svn: 331866
2018-05-09 11:38:57 +00:00
Bjorn Pettersson 9f953cdd7c [MergedLoadStoreMotion] Fix a debug invariant bug in mergeStores
Summary:
MergedLoadStoreMotion::mergeStores is using some heuristics
to limit the amount of stores that it tries to sink (see
MagicCompileTimeControl in MergedLoadStoreMotion.cpp). The
heuristic involves counting the number of instructions in
one of the basic blocks that is part of the transformation.

We now ignore dbg intrinsics when counting instruction for
the MagicCompileTimeControl heuristic. This to make sure that
the amount of stores that are sunk doesn't depend on the amount
of debug information (if -g is used or not).

Reviewers: Gerolf, davide, majnemer

Reviewed By: davide

Subscribers: dberlin, bjope, aprantl, JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D46600

llvm-svn: 331852
2018-05-09 06:52:12 +00:00
David Bolvansky 44a37f04b2 [InstCombine] snprintf optimizations
Reviewers: spatel, efriedma, majnemer, rja

Reviewed By: rja

Subscribers: rja, llvm-commits

Differential Revision: https://reviews.llvm.org/D46285

llvm-svn: 331849
2018-05-09 06:34:20 +00:00
Shiva Chen 2c864551df [DebugInfo] Add DILabel metadata and intrinsic llvm.dbg.label.
In order to set breakpoints on labels and list source code around
labels, we need collect debug information for labels, i.e., label
name, the function label belong, line number in the file, and the
address label located. In order to keep these information in LLVM
IR and to allow backend to generate debug information correctly.
We create a new kind of metadata for labels, DILabel. The format
of DILabel is

!DILabel(scope: !1, name: "foo", file: !2, line: 3)

We hope to keep debug information as much as possible even the
code is optimized. So, we create a new kind of intrinsic for label
metadata to avoid the metadata is eliminated with basic block.
The intrinsic will keep existing if we keep it from optimized out.
The format of the intrinsic is

llvm.dbg.label(metadata !1)

It has only one argument, that is the DILabel metadata. The
intrinsic will follow the label immediately. Backend could get the
label metadata through the intrinsic's parameter.

We also create DIBuilder API for labels to be used by Frontend.
Frontend could use createLabel() to allocate DILabel objects, and use
insertLabel() to insert llvm.dbg.label intrinsic in LLVM IR.

Differential Revision: https://reviews.llvm.org/D45024

Patch by Hsiangkai Wang.

llvm-svn: 331841
2018-05-09 02:40:45 +00:00
Heejin Ahn bf7716952a Support a funclet operand bundle in LowerInvoke
Summary:
The current LowerInvoke pass cannot handle invoke instructions with a
funclet bundle operand. The order of operands for an invoke instruction
is {call arguments, callee, funclet operand (if any), normal dest,
unwind dest}. The current code assumes there is no funclet operand and
incorrectly includes a funclet operand into call arguments.

Reviewers: rnk

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D46242

llvm-svn: 331832
2018-05-09 00:53:50 +00:00
Davide Italiano 48283ba3a1 [SimplifyCFG] Fix a crash when folding PHIs.
We enter MergeBlockIntoPredecessor with a block looking like this:

for.inc.us-lcssa:                                 ; preds = %cond.end
  %k.1.lcssa.ph = phi i32 [ %conv15, %cond.end ]
  %t.3.lcssa.ph = phi i32 [ %k.1.lcssa.ph, %cond.end ]
  br label %for.inc, !dbg !66

[note the first arg of the PHI being a PHI].
FoldSingleEntryPHINodes gets rid of both PHIs (calling, eraseFromParent).
But right before we call the function, we push into IncomingValues the
only argument of the PHIs, and shortly after we try to iterate over
something which has been invalidated before :(

The fix its not trying to remove PHIs which have an incoming value
coming from the same BB we're looking at.

Fixes PR37300 and rdar://problem/39910460

Differential Revision:  https://reviews.llvm.org/D46568

llvm-svn: 331824
2018-05-08 23:28:15 +00:00
Hideki Saito d722d61402 [LV] Fix for PR37248, Broadcast codegen incorrectly assumed vector loop body is single basic block
Summary:
Broadcast code generation emitted instructions in pre-header, while the instruction they are dependent on in the vector loop body.
This resulted in an IL verification error ---- value used before defined.


Reviewers: rengolin, fhahn, hfinkel

Reviewed By: rengolin, fhahn

Subscribers: dcaballe, Ka-Ka, llvm-commits

Differential Revision: https://reviews.llvm.org/D46302

llvm-svn: 331799
2018-05-08 18:57:34 +00:00
Bjorn Pettersson 51cebc98f3 [LCSSA] Do not remove used PHI nodes in formLCSSAForInstructions
Summary:
In formLCSSAForInstructions we speculatively add new PHI
nodes, that sometimes ends up without having any uses. It
has been discovered that sometimes an added PHI node can
appear as being unused in one iteration of the Worklist,
although it can end up being used by a PHI node added in
a later iteration. We now check, a second time, that the
PHI node still is unused before we remove it. This avoids
an assert about "Trying to remove a phi with uses." for the
added test case.

Reviewers: davide, mzolotukhin, mattd, dberlin

Reviewed By: mzolotukhin, dberlin

Subscribers: dberlin, mzolotukhin, davide, bjope, uabelho, llvm-commits

Differential Revision: https://reviews.llvm.org/D46422

llvm-svn: 331741
2018-05-08 06:59:47 +00:00
Teresa Johnson 59da890c96 [NewPM] Emit inliner NoDefinition missed optimization remark
Summary: Makes this consistent with the old PM.

Reviewers: eraman

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D46526

llvm-svn: 331709
2018-05-08 01:45:46 +00:00
Dmitry Mikulin 738bac77c1 Remove explicit setting of the CFI jumptable section name, it does not appear
to be needed: jump table sections are created with .cfi.jumptable suffix. With
this change each jump table is placed in a separate section, which allows the
linker to re-order them.

Differential Revision: https://reviews.llvm.org/D46537

llvm-svn: 331680
2018-05-07 21:30:15 +00:00
Fangrui Song 862eebb6d6 Simplify LLVM_ATTRIBUTE_USED call sites.
llvm-svn: 331599
2018-05-05 20:14:38 +00:00
George Burgess IV f9d26af4ea Range-ify for loop; NFC
llvm-svn: 331582
2018-05-05 04:52:26 +00:00
Craig Topper 781aa181ab Fix a bunch of places where operator-> was used directly on the return from dyn_cast.
Inspired by r331508, I did a grep and found these.

Mostly just change from dyn_cast to cast. Some cases also showed a dyn_cast result being converted to bool, so those I changed to isa.

llvm-svn: 331577
2018-05-05 01:57:00 +00:00
Peter Collingbourne e04ecc88de LowerTypeTests: Fix non-determinism in code that handles icall branch funnels.
This was exposed by enabling expensive checks, which causes llvm::sort
to sort randomly.

Differential Revision: https://reviews.llvm.org/D45901

llvm-svn: 331573
2018-05-05 00:51:55 +00:00
Philip Reames 5b39acd111 [LICM] Compute a must execute property for the prefix of the header as we go
Computing this property within the existing walk ensures that the cost is linear with the size of the block. If we did this from within isGuaranteedToExecute, it would be quadratic without some very fancy caching.

This allows us to reliably catch a hoistable instruction within a header which may throw at some point *after* our hoistable instruction. It doesn't do anything for non-header cases, but given how common single block loops are, this seems very worthwhile.

llvm-svn: 331557
2018-05-04 21:35:00 +00:00
Shoaib Meenai 57fadab1cb [ObjCARC] Account for catchswitch in bitcast insertion
A catchswitch is both a pad and a terminator, meaning it must be the
only non-phi instruction in its basic block. When we're inserting a
bitcast in the incoming basic block for a phi, if that incoming block is
a catchswitch, we should go up the dominator tree to find a valid
insertion point rather than attempting to insert before the catchswitch
(which would result in invalid IR).

Differential Revision: https://reviews.llvm.org/D46412

llvm-svn: 331548
2018-05-04 19:03:11 +00:00
Craig Topper ded8ee07e9 [LoopIdiomRecognize] Don't create an IRBuilder just to call getTrue/getFalse.
We can call the methods in ConstantInt directly. We just need a context.

llvm-svn: 331542
2018-05-04 17:39:08 +00:00
Max Kazantsev 786032c1b7 [IRCE] Fix misuse of dyn_cast which leads to UB
llvm-svn: 331508
2018-05-04 07:34:35 +00:00
Craig Topper 9510f70636 [LoopIdiomRecognize] Replace more unchecked dyn_casts with cast.
Two of these are immediately dereferenced on the next line. The other two are passed immediately to the IRBuilder constructor which can't handle a nullptr.

llvm-svn: 331500
2018-05-04 01:04:28 +00:00
Craig Topper cafae62ec9 [LoopIdiomRecognize] Use a regular array instead of a SmallVector and explicit ArrayRef.
llvm-svn: 331499
2018-05-04 01:04:26 +00:00
Craig Topper 8304231508 [LoopIdiomRecognize] Turn two uncheck dyn_casts into regular casts.
These are casts on users of a PHINode to Instruction. I think since PHINode is an Instruction any users would also be Instructions. At least a cast will give us an assertion if its wrong.

llvm-svn: 331498
2018-05-04 01:04:24 +00:00
Sanjay Patel e7b6654711 [InstCombine] refine select-of-constants to bitwise ops
Add logic for the special case when a cmp+select can clearly be
reduced to just a bitwise logic instruction, and remove an 
over-reaching chunk of general purpose bit magic. The primary goal 
is to remove cases where we are not improving the IR instruction 
count when doing these select transforms, and in all cases here that 
is true.

In the motivating 3-way compare tests, there are further improvements
because we can combine/propagate select values (not sure if that
belongs in instcombine, but it's there for now).

DAGCombiner has folds to turn some of these selects into bit magic,
so there should be no difference in the end result in those cases.
Not all constant combinations are handled there yet, however, so it
is possible that some targets will see more cmov/csel codegen with
this change in IR canonicalization. 

Ideally, we'll go further to *not* turn selects into multiple 
logic/math ops in instcombine, and we'll canonicalize to selects.
But we should make sure that this step does not result in regressions
first (and if it does, we should fix those in the backend).

The general direction for this change was discussed here:
http://lists.llvm.org/pipermail/llvm-dev/2016-September/105373.html
http://lists.llvm.org/pipermail/llvm-dev/2017-July/114885.html

Alive proofs for the new bit magic:
https://rise4fun.com/Alive/XG7

Differential Revision: https://reviews.llvm.org/D46086

llvm-svn: 331486
2018-05-03 21:58:44 +00:00
Piotr Padlewski c77ab8ef2f perform DSE through launder.invariant.group
Summary:
Alias Analysis knows that llvm.launder.invariant.group
returns pointer that mustalias argument, but this information
wasn't used, therefor we didn't DSE through launder.invariant.group

Reviewers: chandlerc, dberlin, bogner, hfinkel, efriedma

Reviewed By: dberlin

Subscribers: amharc, llvm-commits, nlewycky, rsmith

Differential Revision: https://reviews.llvm.org/D31581

llvm-svn: 331449
2018-05-03 11:03:53 +00:00
Craig Topper 856fd68690 [LoopIdiomRecognize] When looking for 'x & (x -1)' for popcnt, make sure the left hand side of the 'and' matches the left hand side of the 'subtract'
llvm-svn: 331437
2018-05-03 05:48:49 +00:00
Craig Topper 8ef2abdbc4 [LoopIdiomRecognize] Remove unnecessary cast from BinaryOperator to Instruction. NFC
BinaryOperator is a sub class of Instruction. We don't need an explicit cast back to Instruction.

llvm-svn: 331432
2018-05-03 05:00:18 +00:00
Shoaib Meenai a07295f977 [ObjCARC] Convert an if to an early continue. NFC
This reduces nesting and makes the logic slightly easier to follow.

Differential Revision: https://reviews.llvm.org/D46371

llvm-svn: 331422
2018-05-03 01:20:36 +00:00
Chandler Carruth e74c354d12 [gcov] Switch to an explicit if clunky array to satisfy some compilers
on various build bots that are unhappy with using makeArrayRef with an
initializer list.

llvm-svn: 331418
2018-05-03 00:11:03 +00:00
Chandler Carruth 71c3a3fac5 [GCOV] Emit the writeout function as nested loops of global data.
Summary:
Prior to this change, LLVM would in some cases emit *massive* writeout
functions with many 10s of 1000s of function calls in straight-line
code. This is a very wasteful way to represent what are fundamentally
loops and creates a number of scalability issues. Among other things,
register allocating these calls is extremely expensive. While D46127 makes this
less severe, we'll still run into scaling issues with this eventually. If not
in the compile time, just from the code size.

Now the pass builds up global data structures modeling the inputs to
these functions, and simply loops over the data structures calling the
relevant functions with those values. This ensures that the code size is
a fixed and only data size grows with larger amounts of coverage data.

A trivial change to IRBuilder is included to make it easier to build
the constants that make up the global data.

Reviewers: wmi, echristo

Subscribers: sanjoy, mcrosier, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D46357

llvm-svn: 331407
2018-05-02 22:24:39 +00:00
Daniel Sanders 8d0d1aa229 [reassociate] Fix excessive revisits when processing long chains of reassociatable instructions.
Summary:
Some of our internal testing detected a major compile time regression which I've
tracked down to:
    r278938 - Revert "Reassociate: Reprocess RedoInsts after each inst".
It appears that processing long chains of reassociatable instructions causes
non-linear (potentially exponential) growth in the number of times an
instruction is revisited. For example, the included test revisits instructions
220 times in a 20-instruction test.

It appears that r278938 reversed the order instructions were visited and that
this is preventing scheduled revisits from being cancelled as a result of
visiting the instructions naturally during normal processing. However, simply
reversing the order also harmed the generated code. Upon closer inspection, it
was discovered that revisits occurred in the opposite order to the first pass
(Thanks to escha for spotting that).

This patch makes the revisit order consistent with the first pass which allows
more revisits to be cancelled. This does appear to have a small impact on the
generated code in few cases but it significantly reduces compile-time.

After this patch, our internal test that was most affected by the regression
dropped from ~2 million revisits to ~4k resulting in Reassociate having 0.46%
of the runtime it had before (99.54% improvement).

Here's the summaries reported by lnt for the LLVM test-suite with --benchmarking-only:
| metric         | geomean before patch | geomean after patch | delta   |
| -----          | -----                | -----               | -----   |
| compile time   | 0.1956               | 0.1261              | -35.54% |
| execution time | 0.3240               | 0.3237              | -       |
| code size      | 7365.4459            | 7365.6079           | -       |

The results have a few wins and losses on compile-time, mostly in the +/- 2.5% range. There was one outlier though:
| Performance Regressions - compile_time | Δ | Previous | Current |
| MultiSource/Benchmarks/ASC_Sequoia/CrystalMk/CrystalMk | 9.82% | 2.0473 | 2.2483 |

Reviewers: javed.absar, dberlin

Reviewed By: dberlin

Subscribers: kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D45734

llvm-svn: 331381
2018-05-02 17:59:16 +00:00
Simon Pilgrim f53ee8e640 Fix '32-bit shift implicitly converted to 64 bits' warning by using APInt::setBit instead.
llvm-svn: 331359
2018-05-02 14:22:30 +00:00
Florian Hahn 5912c667b0 [LoopInterchange] Update some loops to use range base for loops (NFC).
llvm-svn: 331342
2018-05-02 10:53:04 +00:00
Sanjay Patel d2025a2e31 [AggressiveInstCombine] convert a chain of 'or-shift' bits into masked compare
and (or (lshr X, C), ...), 1 --> (X & C') != 0

I initially thought about implementing the minimal pattern in instcombine as mentioned here:
https://bugs.llvm.org/show_bug.cgi?id=37098#c6

...but we need to do better to catch the more general sequence from the motivating test 
(more than 2 bits in the compare). And a test-suite run with statistics showed that this 
pattern only happened 2 times currently. It would potentially happen more often if 
reassociation worked better (D45842), but it's probably still not too frequent?

This is small enough that I didn't see a need to create a whole new class/file within 
AggressiveInstCombine. There are likely other relatively small matchers like what was 
discussed in D44266 that would slide under foldUnusualPatterns() (name suggestions welcome). 
We could potentially also consolidate matchers for ctpop, bswap, etc under here.

Differential Revision: https://reviews.llvm.org/D45986

llvm-svn: 331311
2018-05-01 21:02:09 +00:00
Adrian Prantl 4dfcc4a788 Remove @brief commands from doxygen comments, too.
This is a follow-up to r331272.

We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.

Patch produced by
  for i in $(git grep -l '\@brief'); do perl -pi -e 's/\@brief //g' $i & done

https://reviews.llvm.org/D46290

llvm-svn: 331275
2018-05-01 16:10:38 +00:00
Adrian Prantl 5f8f34e459 Remove \brief commands from doxygen comments.
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.

Patch produced by

  for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done

Differential Revision: https://reviews.llvm.org/D46290

llvm-svn: 331272
2018-05-01 15:54:18 +00:00
Daniel Neilson 9e4bbe801a [LV] Preserve inbounds on created GEPs
Summary:
This is a fix for PR23997.

The loop vectorizer is not preserving the inbounds property of GEPs that it creates.
This is inhibiting some optimizations. This patch preserves the inbounds property in
the case where a load/store is being fed by an inbounds GEP.

Reviewers: mkuper, javed.absar, hsaito

Reviewed By: hsaito

Subscribers: dcaballe, hsaito, llvm-commits

Differential Revision: https://reviews.llvm.org/D46191

llvm-svn: 331269
2018-05-01 15:35:08 +00:00
Wei Mi eec5ba9fae Fix the issue that ComputeValueKnownInPredecessors only handles the case when
phi is on lhs of a comparison op.

For the following testcase,
L1:

  %t0 = add i32 %m, 7
  %t3 = icmp eq i32* %t2, null
  br i1 %t3, label %L3, label %L2

L2:

  %t4 = load i32, i32* %t2, align 4
  br label %L3

L3:

  %t5 = phi i32 [ %t0, %L1 ], [ %t4, %L2 ]
  %t6 = icmp eq i32 %t0, %t5
  br i1 %t6, label %L4, label %L5

We know if we go through the path L1 --> L3, %t6 should always be true. However
currently, if the rhs of the eq comparison is phi, JumpThreading fails to
evaluate %t6 to true. And we know that Instcombine cannot guarantee always
canonicalizing phi to the left hand side of the comparison operation according
to the operand priority comparison mechanism in instcombine. The patch handles
the case when rhs of the comparison op is a phi.

Differential Revision: https://reviews.llvm.org/D46275

llvm-svn: 331266
2018-05-01 14:47:24 +00:00
Omer Paparo Bivas 82ef8e19ef [InstCombine] Adjusting bswap pattern matching to hold for And/Shift mixed case
Differential Revision: https://reviews.llvm.org/D45731

Change-Id: I85d4226504e954933c41598327c91b2d08192a9d
llvm-svn: 331257
2018-05-01 12:25:46 +00:00
Chandler Carruth 2c85a23123 [PM/LoopUnswitch] Remove the last manual domtree update code from loop
unswitch and replace it with the amazingly simple update API code.

This addresses piles of FIXMEs around the update logic here and makes
everything substantially simpler.

llvm-svn: 331247
2018-05-01 09:54:39 +00:00
Chandler Carruth 44aab925fd [PM/LoopUnswitch] Add back a successor set that was removed based on
code review.

It turns out this *is* necessary, and I read the comment on the API
correctly the first time. ;]

The `applyUpdates` routine requires that updates are "balanced". This is
in order to cleanly handle cycles like inserting, removing, nad then
re-inserting the same edge. This precludes inserting the same edge
multiple times in a row as handling that would cause the insertion logic
to become *ordered* instead of *unordered* (which is what the API
provides).

It happens that in this specific case nothing (other than an assert and
contract violation) goes wrong because we're never inserting and
removing the same edge. The implementation *happens* to do the right
thing to eliminate redundant insertions in that case.

But the requirement is there and there is an assert to catch it.
Somehow, after the code review I never did another asserts-clang build
testing loop-unswich for a long time. As a consequence, I didn't notice
this despite a bunch of testing going on, but it shows up immediately
with an asserts build of clang itself.

llvm-svn: 331246
2018-05-01 09:42:09 +00:00
Florian Hahn 3df8844b92 [SimplifyCFG] Use BB::instructionsWithoutDebug to skip DbgInfo (NFC).
This patch updates some code responsible the skip debug info to use
BasicBlock::instructionsWithoutDebug. I think this makes things slightly
simpler and more direct.

Reviewers: aprantl, vsk, hans, danielcdh

Reviewed By: hans

Differential Revision: https://reviews.llvm.org/D46252

llvm-svn: 331221
2018-04-30 20:10:53 +00:00
Florian Hahn 8fe04ad3f7 [LoopSimplify] Use BB::instructionsWithoutDebug to skip DbgInfo (NFC).
This patch updates some code responsible the skip debug info to use
BasicBlock::instructionsWithoutDebug. I think this makes things slightly
simpler and more direct.

Reviewers: aprantl, vsk, chandlerc

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D46253

llvm-svn: 331217
2018-04-30 19:19:36 +00:00
Roman Lebedev aa4faec114 [InstCombine] Unfold masked merge with constant mask
Summary:
As discussed in D45733, we want to do this in InstCombine.

https://rise4fun.com/Alive/LGk

Reviewers: spatel, craig.topper

Reviewed By: spatel

Subscribers: chandlerc, xbolva00, llvm-commits

Differential Revision: https://reviews.llvm.org/D45867

llvm-svn: 331205
2018-04-30 17:59:33 +00:00
Davide Italiano bd3bf1660b [SLPVectorizer] Debug info shouldn't impact spill cost computation.
<rdar://problem/39794738>

(Also, PR32761).

Differential Revision:  https://reviews.llvm.org/D46199

llvm-svn: 331199
2018-04-30 16:57:33 +00:00
Nico Weber 432a38838d IWYU for llvm-config.h in llvm, additions.
See r331124 for how I made a list of files missing the include.
I then ran this Python script:

    for f in open('filelist.txt'):
        f = f.strip()
        fl = open(f).readlines()

        found = False
        for i in xrange(len(fl)):
            p = '#include "llvm/'
            if not fl[i].startswith(p):
                continue
            if fl[i][len(p):] > 'Config':
                fl.insert(i, '#include "llvm/Config/llvm-config.h"\n')
                found = True
                break
        if not found:
            print 'not found', f
        else:
            open(f, 'w').write(''.join(fl))

and then looked through everything with `svn diff | diffstat -l | xargs -n 1000 gvim -p`
and tried to fix include ordering and whatnot.

No intended behavior change.

llvm-svn: 331184
2018-04-30 14:59:11 +00:00