Commit Graph

9827 Commits

Author SHA1 Message Date
Jun Bum Lim 0f90672ae9 [LICM] Fix PR35342
Summary: This change fix PR35342 by replacing only the current use with undef in unreachable blocks.

Reviewers: efriedma, mcrosier, igor-laevsky

Reviewed By: efriedma

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D40184

llvm-svn: 318551
2017-11-17 20:38:25 +00:00
Chandler Carruth 693eedb138 [PM/Unswitch] Teach SimpleLoopUnswitch to do non-trivial unswitching,
making it no longer even remotely simple.

The pass will now be more of a "full loop unswitching" pass rather than
anything substantively simpler than any other approach. I plan to rename
it accordingly once the dust settles.

The key ideas of the new loop unswitcher are carried over for
non-trivial unswitching:
1) Fully unswitch a branch or switch instruction from inside of a loop to
   outside of it.
2) Update the CFG and IR. This avoids needing to "remember" the
   unswitched branches as well as avoiding excessively cloning and
   reliance on complex parts of simplify-cfg to cleanup the cfg.
3) Update the analyses (where we can) rather than just blowing them away
   or relying on something else updating them.

Sadly, #3 is somewhat compromised here as the dominator tree updates
were too complex for me to want to reason about. I will need to make
another attempt to do this now that we have a nice dynamic update API
for dominators. However, we do adhere to #3 w.r.t. LoopInfo.

This approach also adds an important principls specific to non-trivial
unswitching: not *all* of the loop will be duplicated when unswitching.
This fact allows us to compute the cost in terms of how much *duplicate*
code is inserted rather than just on raw size. Unswitching conditions
which essentialy partition loops will work regardless of the total loop
size.

Some remaining issues that I will be addressing in subsequent commits:
- Handling unstructured control flow.
- Unswitching 'switch' cases instead of just branches.
- Moving to the dynamic update API for dominators.

Some high-level, interesting limitationsV that folks might want to push
on as follow-ups but that I don't have any immediate plans around:
- We could be much more clever about not cloning things that will be
  deleted. In fact, we should be able to delete *nothing* and do
  a minimal number of clones.
- There are many more interesting selection criteria for which branch to
  unswitch that we might want to look at. One that I'm interested in
  particularly are a set of conditions which all exit the loop and which
  can be merged into a single unswitched test of them.

Differential revision: https://reviews.llvm.org/D34200

llvm-svn: 318549
2017-11-17 19:58:36 +00:00
Max Kazantsev 1ac6e8ae61 [IRCE] Remove folding of two range checks into RANGE_CHECK_BOTH
The logic of replacing of a couple `RANGE_CHECK_LOWER + RANGE_CHECK_UPPER`
into `RANGE_CHECK_BOTH` in fact duplicates the logic of range intersection which
happens when we calculate safe iteration space. Effectively, the result of intersection of
these ranges doesn't differ from the range of merged range check.

We chose to remove duplicating logic in favor of code simplicity.

Differential Revision: https://reviews.llvm.org/D39589

llvm-svn: 318508
2017-11-17 06:49:26 +00:00
Dmitry Mikulin 2f2ace985d Current implementation of Value::replaceUsesExceptBlockAddr() uses UseList
iterator to walk the list which keeps changing inside the loop. When the
UseList contains several uses with the same user, we end processing the same
user more than once, which leads to an assert.

With this fix, unique users are saved and processed later to avoid
processing duplicates.

Differential Revision: https://reviews.llvm.org/D39864

llvm-svn: 318477
2017-11-17 00:30:24 +00:00
Sanjay Patel cc318be68d [InstCombine] add tests for pow(); NFC
Also, increase test diversity (and show another bug) by varying the types.

llvm-svn: 318430
2017-11-16 17:49:54 +00:00
Sanjay Patel cebbfacc9e [InstCombine] add tests for 'afn' FMF; NFC
llvm-svn: 318423
2017-11-16 17:06:36 +00:00
Sanjay Patel dcb9e1b387 [InstCombine] regenerate test checks; NFC
llvm-svn: 318420
2017-11-16 17:01:09 +00:00
Sanjay Patel 68f7ee2ef9 [InstCombine] regenerate test checks; NFC
Also, remove some unnecessary bits. I don't think we need fcmp in any test here either?

llvm-svn: 318418
2017-11-16 16:59:49 +00:00
Sanjay Patel 6859627c09 [InstCombine] regenerate test checks; NFC
llvm-svn: 318417
2017-11-16 16:38:42 +00:00
Sanjay Patel 8d7d1db294 [InstCombine] regenerate test checks; NFC
llvm-svn: 318416
2017-11-16 16:36:48 +00:00
Yaxun Liu 407ca36b27 Let llvm.invariant.group.barrier accepts pointer to any address space
llvm.invariant.group.barrier may accept pointers to arbitrary address space.

This patch let it accept pointers to i8 in any address space and returns
pointer to i8 in the same address space.

Differential Revision: https://reviews.llvm.org/D39973

llvm-svn: 318413
2017-11-16 16:32:16 +00:00
Sanjay Patel b6f107d759 [InstSimplify] add tests for fcmp ord/uno; NFC
llvm-svn: 318408
2017-11-16 15:25:59 +00:00
John Brawn c3347d4247 Remove stray comma in sink-addrmode test
The extra comma meant it wasn't correctly checking that we weren't getting an
extra getelementptr.

llvm-svn: 318406
2017-11-16 15:15:00 +00:00
Sanjay Patel b3fa94586f [InstCombine] include 'sub' in the list of narrow-able binops
// trunc (binop X, C) --> binop (trunc X, C')
      // trunc (binop (ext X), Y) --> binop X, (trunc Y)

I'm grouping sub with the other binops  because that makes the code simpler
and the transforms are valid:
https://rise4fun.com/Alive/UeF
...so even though we don't expect a sub with constant Op1 or any of the
other opcodes with constant Op0 due to canonicalization rules, we might as
well handle those situations if non-canonical code somehow reaches this
point (it should just make instcombine more efficient in reaching its
end goal).

This should solve the problem that later manifests in the vectorizers in 
PR35295:
https://bugs.llvm.org/show_bug.cgi?id=35295

llvm-svn: 318404
2017-11-16 14:40:51 +00:00
Max Kazantsev b1b8aff2e7 [IRCE] Fix SCEVExpander's usage in IRCE
When expanding exit conditions for pre- and postloops, we may end up expanding a
recurrency from the loop to in its loop's preheader. This produces incorrect IR.

This patch ensures that IRCE uses SCEVExpander correctly and only expands code which
is safe to expand in this particular location.

Differentian Revision: https://reviews.llvm.org/D39234

llvm-svn: 318381
2017-11-16 06:06:27 +00:00
Sanjay Patel 4b65ee64f2 [InstCombine] add sub narrowing tests; NFC
This might be the root cause of PR35295:
https://bugs.llvm.org/show_bug.cgi?id=35295

llvm-svn: 318342
2017-11-15 22:19:55 +00:00
Sanjay Patel 03d0cd6a81 [InstCombine] trunc (binop X, C) --> binop (trunc X, C')
Note that one-use and shouldChangeType() are checked ahead of the switch.

Without the narrowing folds, we can produce inferior vector code as shown in PR35299:
https://bugs.llvm.org/show_bug.cgi?id=35299

llvm-svn: 318323
2017-11-15 19:12:01 +00:00
Reid Kleckner 72b819b8ee [InstCombine] Salvage debug info during initial DCE
InstCombine salvages debug info for every instruction it erases from its
worklist, but it wasn't doing it during its initial DCE when populating
its worklist. This fixes that.

This should help improve availability of 'this' in optimized debug info
when casts are necessary.

llvm-svn: 318320
2017-11-15 18:51:12 +00:00
Sanjay Patel 680c73f049 [InstCombine] add tests for missing trunc folds; NFC
As noted in PR35299:
https://bugs.llvm.org/show_bug.cgi?id=35299
...this is likely the root cause for a mis-vectorization transform.

llvm-svn: 318319
2017-11-15 18:09:43 +00:00
Adam Nemet 572a87c76f [SLP] Added more missed optimization remarks
Summary:
Added more remarks to SLP pass, in particular "missed" optimization remarks.
Also proposed several tests for new functionality.

Patch by Vladimir Miloserdov!

For reference you may look at: https://reviews.llvm.org/rL302811

Reviewers: anemet, fhahn

Reviewed By: anemet

Subscribers: javed.absar, lattner, petecoup, yakush, llvm-commits

Differential Revision: https://reviews.llvm.org/D38367

llvm-svn: 318307
2017-11-15 17:04:53 +00:00
Sanjay Patel 956dec63fb [PassManager, SimplifyCFG] add test for PR34603 / D38566; NFC
This is a recommit of r316908 which was reverted by r317444.

llvm-svn: 318300
2017-11-15 16:37:30 +00:00
Sanjay Patel 3e29890a7f [(new) Pass Manager] instantiate SimplifyCFG with the same options as the old PM
This is a recommit of r316869 which was speculatively reverted with r317444 and 
subsequently shown to not be the cause of PR35210. That crash should be fixed
after r318237.

Original commit message:

The old PM sets the options of what used to be known as "latesimplifycfg" on the
instantiation after the vectorizers have run, so that's what we'redoing here.

FWIW, there's a later SimplifyCFGPass instantiation in both PMs where we do not
set the "late" options. I'm not sure if that's intentional or not.

Differential Revision: https://reviews.llvm.org/D39407

llvm-svn: 318299
2017-11-15 16:33:11 +00:00
NAKAMURA Takumi 5ce714a334 Fix llvm/test/Transforms/LoopRotate/pr35210.ll in rL318237, it uses debug options.
llvm-svn: 318273
2017-11-15 06:46:58 +00:00
Craig Topper f7b86728fa [InstCombine] Simplify binops that are only used by a select and are fed by a select with the same condition.
Summary:
This patch optimizes a binop sandwiched between 2 selects with the same condition. Since we know its only used by the select we can propagate the appropriate input value from the earlier select.

As I'm writing this I realize I may need to avoid doing this for division in case the select was protecting a divide by zero?

Reviewers: spatel, majnemer

Reviewed By: majnemer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39999

llvm-svn: 318267
2017-11-15 05:23:02 +00:00
Hans Wennborg 45cabacd2f Revert r318193 "[SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in integer binary ops."
It crashes building sqlite; see reply on the llvm-commits thread.

> [SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in integer binary ops.
>
>         Patch tries to improve vectorization of the following code:
>
>         void add1(int * __restrict dst, const int * __restrict src) {
>           *dst++ = *src++;
>           *dst++ = *src++ + 1;
>           *dst++ = *src++ + 2;
>           *dst++ = *src++ + 3;
>         }
>         Allows to vectorize even if the very first operation is not a binary add, but just a load.
>
>         Fixed issues related to previous commit.
>
>         Reviewers: spatel, mzolotukhin, mkuper, hfinkel, RKSimon, filcab, ABataev
>
>         Reviewed By: ABataev, RKSimon
>
>         Subscribers: llvm-commits, RKSimon
>
>         Differential Revision: https://reviews.llvm.org/D28907

llvm-svn: 318239
2017-11-15 00:38:13 +00:00
Craig Topper bf6495fbcb [LoopRotate] processLoop should return true even if it just simplified the loop latch without making any other changes
Simplifying a loop latch changes the IR and we need to make sure the pass manager knows to invalidate analysis passes if that happened.

PR35210 discovered a case where we failed to invalidate the post dominator tree after this simplification because we no changes other than simplifying the loop latch.

Fixes PR35210.

Differential Revision: https://reviews.llvm.org/D40035

llvm-svn: 318237
2017-11-15 00:22:42 +00:00
Reid Kleckner 29a5c03cc2 Make salvageDebugInfo of casts work for dbg.declare and dbg.addr
Summary:
Instcombine (and probably other passes) sometimes want to change the
type of an alloca. To do this, they generally create a new alloca with
the desired type, create a bitcast to make the new pointer type match
the old pointer type, replace all uses with the cast, and then simplify
the casts. We already knew how to salvage dbg.value instructions when
removing casts, but we can extend it to cover dbg.addr and dbg.declare.

Fixes a debug info quality issue uncovered in Chromium in
http://crbug.com/784609

Reviewers: aprantl, vsk

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D40042

llvm-svn: 318203
2017-11-14 21:49:06 +00:00
Hans Wennborg e1ecd61b98 Rename CountingFunctionInserter and use for both mcount and cygprofile calls, before and after inlining
Clang implements the -finstrument-functions flag inherited from GCC, which
inserts calls to __cyg_profile_func_{enter,exit} on function entry and exit.

This is useful for getting a trace of how the functions in a program are
executed. Normally, the calls remain even if a function is inlined into another
function, but it is useful to be able to turn this off for users who are
interested in a lower-level trace, i.e. one that reflects what functions are
called post-inlining. (We use this to generate link order files for Chromium.)

LLVM already has a pass for inserting similar instrumentation calls to
mcount(), which it does after inlining. This patch renames and extends that
pass to handle calls both to mcount and the cygprofile functions, before and/or
after inlining as controlled by function attributes.

Differential Revision: https://reviews.llvm.org/D39287

llvm-svn: 318195
2017-11-14 21:09:45 +00:00
Dinar Temirbulatov 2bd1836520 [SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in integer binary ops.
Patch tries to improve vectorization of the following code:
    
        void add1(int * __restrict dst, const int * __restrict src) {
          *dst++ = *src++;
          *dst++ = *src++ + 1;
          *dst++ = *src++ + 2;
          *dst++ = *src++ + 3;
        }
        Allows to vectorize even if the very first operation is not a binary add, but just a load.
    
        Fixed issues related to previous commit.
    
        Reviewers: spatel, mzolotukhin, mkuper, hfinkel, RKSimon, filcab, ABataev
    
        Reviewed By: ABataev, RKSimon
    
        Subscribers: llvm-commits, RKSimon
    
        Differential Revision: https://reviews.llvm.org/D28907

llvm-svn: 318193
2017-11-14 20:55:08 +00:00
Hiroshi Yamauchi 69c233ac6c Simplify irreducible loop metadata test code.
Summary:
Shorten the irreducible loop metadata test code by removing insignificant
instructions.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D40043

llvm-svn: 318182
2017-11-14 19:48:59 +00:00
Gil Rapaport 848581cadb [LV] Introduce VPBlendRecipe, VPWidenMemoryInstructionRecipe
This patch is part of D38676.

The patch introduces two new Recipes to handle instructions whose vectorization
involves masking. These Recipes take VPlan-level masks in D38676, but still rely
on ILV's existing createEdgeMask(), createBlockInMask() in this patch.

VPBlendRecipe handles intra-loop phi nodes, which are vectorized as a sequence
of SELECTs. Its execute() code is refactored out of ILV::widenPHIInstruction(),
which now handles only loop-header phi nodes.

VPWidenMemoryInstructionRecipe handles load/store which are to be widened
(but are not part of an Interleave Group). In this patch it simply calls
ILV::vectorizeMemoryInstruction on execute().

Differential Revision: https://reviews.llvm.org/D39068

llvm-svn: 318149
2017-11-14 12:09:30 +00:00
Sanjay Patel feabdd18d9 [Reassociation] regenerate test checks; NFC
llvm-svn: 318076
2017-11-13 19:46:28 +00:00
Dinar Temirbulatov a9e47fd7d9 NFC, Allow SystemZ SLP tests only when SystemZ is supported.
llvm-svn: 318070
2017-11-13 18:35:43 +00:00
Sanjay Patel 7822fd884b [Reassociate] add tests with 'reassoc' FMF; NFC
llvm-svn: 318053
2017-11-13 17:29:11 +00:00
Simon Dardis 8222160eb3 Revert "[CodeGenPrepare] Check that erased sunken address are not reused"
This reverts commit r318032. The test broke some sanitizer bots.

llvm-svn: 318049
2017-11-13 16:41:17 +00:00
Simon Dardis 8e2a5bd235 [CodeGenPrepare] Check that erased sunken address are not reused
CodeGenPrepare sinks address computations from one basic block to another
and attempts to reuse address computations that have already been sunk. If
the same address computation appears twice with the first instance as an
operand of a load whose result is an operand to a simplifable select,
CodeGenPrepare simplifies the select and recursively erases the now dead
instructions. CodeGenPrepare then attempts to use the erased address
computation for the second load.

Fix this by erasing the cached address value if it has zero uses before
looking for the address value in the sunken address map.

This partially resolves PR35209.

Thanks to Alexander Richardson for reporting the issue!

Reviewers: john.brawn

Differential Revision: https://reviews.llvm.org/D39841

llvm-svn: 318032
2017-11-13 11:47:21 +00:00
Florian Hahn 0e9dec672d [PartialInliner] Inline vararg functions that forward varargs.
Summary:
This patch extends the partial inliner to support inlining parts of
vararg functions, if the vararg handling is done in the outlined part.

It adds a `ForwardVarArgsTo` argument to InlineFunction. If it is
non-null, all varargs passed to the inlined function will be added to
all calls to `ForwardVarArgsTo`.

The partial inliner takes care to only pass `ForwardVarArgsTo` if the
varargs handing is done in the outlined function. It checks that vastart
is not part of the function to be inlined.

`test/Transforms/CodeExtractor/PartialInlineNoInline.ll` (already part
of the repo) checks we do not do partial inlining if vastart is used in
a basic block that will be inlined.

Reviewers: davide, davidxl, grosser

Reviewed By: davide, davidxl, grosser

Subscribers: gyiu, grosser, eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D39607

llvm-svn: 318028
2017-11-13 10:35:52 +00:00
Matt Arsenault 90e4f719e1 Fix some misc. -enable-var-scope violations
llvm-svn: 318006
2017-11-13 01:47:52 +00:00
Craig Topper d3e5781e53 [InstCombine] Teach visitICmpInst to not break integer absolute value idioms
Summary:
This patch adds an early out to visitICmpInst if we are looking at a compare as part of an integer absolute value idiom. Similar is already done for min/max.

In the particular case I observed in a benchmark we had an absolute value of a load from an indexed global. We simplified the compare using foldCmpLoadFromIndexedGlobal into a magic bit vector, a shift, and an and. But the load result was still used for the select and the negate part of the absolute valute idiom. So we overcomplicated the code and lost the ability to recognize it as an absolute value.

I've chosen a simpler case for the test here.

Reviewers: spatel, davide, majnemer

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39766

llvm-svn: 317994
2017-11-12 02:28:21 +00:00
Sanjoy Das 6fabb90765 [CVP] Remove some {s|u}add.with.overflow checks.
Summary:
This adds logic to CVP to remove some overflow checks.  It uses LVI to remove
operations with at least one constant.  Specifically, this can remove many
overflow intrinsics immediately following an overflow check in the source code,
such as:

if (x < INT_MAX)
    ... x + 1 ...

Patch by Joel Galenson!

Reviewers: sanjoy, regehr

Reviewed By: sanjoy

Subscribers: fhahn, pirama, srhines, llvm-commits

Differential Revision: https://reviews.llvm.org/D39483

llvm-svn: 317911
2017-11-10 19:13:35 +00:00
Easwaran Raman e8c4bf54ba [SimplifyCFG] Fix a test case.
This was first committed in r317845, but had the order of branch weights
wrong and didn't properly check the output.

llvm-svn: 317848
2017-11-09 23:17:52 +00:00
Easwaran Raman 0a0913def2 Add a wrapper function to set branch weights metadata.
Summary:
This wrapper checks if there is at least one non-zero weight before
setting the metadata.

Reviewers: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39872

llvm-svn: 317845
2017-11-09 22:52:20 +00:00
Sanjay Patel b5d2e11e4e [Reassociate] regenerate test checks; NFC
llvm-svn: 317841
2017-11-09 22:41:39 +00:00
Paul Robinson b46256b0b4 Fix out-of-order stepping behavior in programs with hoisted constants.
When the Constant Hoisting pass moves expensive constants into a
common block, it would assign a debug location equal to the last use
of that constant. While this is certainly intuitive, it places the
constant in an out-of-order location, according to the debug location
information. This produces out-of-order stepping when debugging
programs affected by this pass.

This patch creates in-order stepping behavior by merging the debug
locations for hoisted constants, and the new insertion point.

Patch by Matthew Voss!

Differential Revision: https://reviews.llvm.org/D38088

llvm-svn: 317827
2017-11-09 20:01:31 +00:00
Alexey Bataev 0bd9004425 [SLP] Fix PR23510: Try to find best possible vectorizable stores.
Summary:
The analysis of the store sequence goes in straight order - from the
first store to the last. Bu the best opportunity for vectorization will
happen if we're going to use reverse order - from last store to the
first. It may be best because usually users have some initialization
part + further processing and this first initialization may confuse
SLP vectorizer.

Reviewers: RKSimon, hfinkel, mkuper, spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39606

llvm-svn: 317821
2017-11-09 19:07:16 +00:00
Sanjay Patel c019c39f4f [Reassociate] auto-generate test checks; NFC
llvm-svn: 317819
2017-11-09 18:26:49 +00:00
Sanjay Patel 0d66010454 [Reassociate] don't name values "tmp"; NFCI
The toxic stew of created values named 'tmp' and tests that already have
values named 'tmp' and CHECK lines looking for values named 'tmp' causes
bad things to happen in our test line auto-generation scripts because it
wants to use 'TMP' as a prefix for unnamed values. Use less 'tmp' to 
avoid that.

llvm-svn: 317818
2017-11-09 18:14:24 +00:00
Sanjay Patel 5ac48bd9c8 revert r317809 - [Reassociate] regenerate test checks; NFC
The reassociate pass generates named values such as "%tmp2" which trips up the script's regex's
because the script uses a 'TMP' prefix for unnamed values (%2).

llvm-svn: 317810
2017-11-09 16:46:04 +00:00
Sanjay Patel e04f032424 [Reassociate] regenerate test checks; NFC
llvm-svn: 317809
2017-11-09 16:35:30 +00:00
Sanjay Patel 2471c16d3e [Reassociate] regenerate test checks; NFC
llvm-svn: 317806
2017-11-09 16:30:19 +00:00