4881 Commits

Author SHA1 Message Date
Erik Eckstein 0d86c7623f reinstate r222872: Peephole optimization in switch table lookup: reuse the guarding table comparison if possible.
Fixed missing dominance check.
Original commit message:

This optimization tries to reuse the generated compare instruction, if there is a comparison against the default value after the switch.
Example:
   if (idx < tablesize)
      r = table[idx]; // table does not contain default_value
   else
      r = default_value;
   if (r != default_value)
      ...
Is optimized to:
   cond = idx < tablesize;
   if (cond)
      r = table[idx];
   else
      r = default_value;
   if (cond)
      ...
Jump threading will then eliminate the second if(cond).

llvm-svn: 222891
2014-11-27 15:13:14 +00:00
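The rewrite described in r222891 can be sketched in plain C. This is only an illustration of why the two forms agree, not the SimplifyCFG code itself. The table contents, TABLE_SIZE, DEFAULT_VALUE and both function names are hypothetical, and the sketch relies on the stated precondition that the table does not contain the default value.

```c
#include <stddef.h>

/* Hypothetical table and default, chosen for illustration only.
   The table must not contain DEFAULT_VALUE for the rewrite to be valid. */
enum { TABLE_SIZE = 4, DEFAULT_VALUE = -1 };
static const int table[TABLE_SIZE] = {10, 20, 30, 40};

/* Before: the second test compares the looked-up value with the default. */
int lookup_before(size_t idx) {
    int r;
    if (idx < TABLE_SIZE)
        r = table[idx];
    else
        r = DEFAULT_VALUE;
    if (r != DEFAULT_VALUE)
        return r + 1;          /* stands in for the "..." body */
    return 0;
}

/* After: the guarding comparison is computed once and reused. */
int lookup_after(size_t idx) {
    int cond = idx < TABLE_SIZE;
    int r;
    if (cond)
        r = table[idx];
    else
        r = DEFAULT_VALUE;
    if (cond)
        return r + 1;
    return 0;
}
```

Both functions return the same result for every index, which is the property the peephole depends on.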
Suyog Sarda f8516e1662 Use FileCheck instead of grep. Change by Ankur Garg.
Differential Revision: http://reviews.llvm.org/D6430

llvm-svn: 222879
2014-11-27 11:22:49 +00:00
Erik Eckstein 2190cd9ffa Revert "Peephole optimization in switch table lookup: reuse the guarding table comparison if possible."
It is breaking the clang bootstrap.

llvm-svn: 222877
2014-11-27 10:59:08 +00:00
Suyog Sarda c3024c75e0 Use FileCheck instead of grep. Change by Sonam.
Differential Revision: http://reviews.llvm.org/D6432

llvm-svn: 222876
2014-11-27 10:57:24 +00:00
Erik Eckstein e73e308ab9 Peephole optimization in switch table lookup: reuse the guarding table comparison if possible.
This optimization tries to reuse the generated compare instruction, if there is a comparison against the default value after the switch.
Example:
    if (idx < tablesize)
       r = table[idx]; // table does not contain default_value
    else
       r = default_value;
    if (r != default_value)
       ...
Is optimized to:
    cond = idx < tablesize;
    if (cond)
       r = table[idx];
    else
       r = default_value;
    if (cond)
       ...
Jump threading will then eliminate the second if(cond).

llvm-svn: 222872
2014-11-27 08:33:51 +00:00
David Majnemer 40157d5c4d InstCombine: Restore optimizations lost in r210006
This restores our ability to optimize:
(X & C) == 0 ? X ^ C : X  into  X | C
(X & C) != 0 ? X ^ C : X  into  X & ~C

llvm-svn: 222871
2014-11-27 07:25:21 +00:00
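The two folds restored by r222871 can be checked with a small C sketch. It assumes that C is a single-bit mask (a power of two), which matches the "single bit tests" wording of the reverted r210006; for a multi-bit C the first identity does not hold. The function names are hypothetical.

```c
#include <stdint.h>

/* (X & C) == 0 ? X ^ C : X   folds to   X | C   (C assumed to be one bit) */
uint32_t set_bit_select(uint32_t x, uint32_t c) {
    return (x & c) == 0 ? (x ^ c) : x;
}
uint32_t set_bit_folded(uint32_t x, uint32_t c) {
    return x | c;
}

/* (X & C) != 0 ? X ^ C : X   folds to   X & ~C  (C assumed to be one bit) */
uint32_t clear_bit_select(uint32_t x, uint32_t c) {
    return (x & c) != 0 ? (x ^ c) : x;
}
uint32_t clear_bit_folded(uint32_t x, uint32_t c) {
    return x & ~c;
}
```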
David Majnemer c6a5e1dd4f InstSimplify: Restore optimizations lost in r210006
This restores our ability to optimize:
(X & C) ? X & ~C : X  into  X & ~C
(X & C) ? X : X & ~C  into  X
(X & C) ? X | C : X  into  X
(X & C) ? X : X | C  into  X | C

llvm-svn: 222868
2014-11-27 06:32:46 +00:00
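A similar sketch covers the four InstSimplify patterns from r222868. It reads "(X & C)" as the test "(X & C) != 0" and again assumes C is a single-bit mask; both are assumptions made for the illustration. The sel1..sel4 names are hypothetical, and each comment gives the simplified form from the commit message.

```c
#include <stdint.h>

/* (X & C) ? X & ~C : X   simplifies to   X & ~C */
uint32_t sel1(uint32_t x, uint32_t c) { return (x & c) ? (x & ~c) : x; }

/* (X & C) ? X : X & ~C   simplifies to   X */
uint32_t sel2(uint32_t x, uint32_t c) { return (x & c) ? x : (x & ~c); }

/* (X & C) ? X | C : X    simplifies to   X */
uint32_t sel3(uint32_t x, uint32_t c) { return (x & c) ? (x | c) : x; }

/* (X & C) ? X : X | C    simplifies to   X | C */
uint32_t sel4(uint32_t x, uint32_t c) { return (x & c) ? x : (x | c); }
```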
David Majnemer 5468e86469 Revert "Added inst combine transforms for single bit tests from Chris's note"
This reverts commit r210006, it miscompiled libapr which is used in who
knows how many projects.

A test has been added to ensure that we don't regress again.

I'll work on a rewrite of what the optimization was trying to do later.

llvm-svn: 222856
2014-11-26 23:00:38 +00:00
Hans Wennborg bda193edff Remove useless rdar:// comment from switch_to_lookup_table.ll test.
llvm-svn: 222772
2014-11-25 18:45:23 +00:00
Hans Wennborg 45172aceb3 LazyValueInfo: Actually re-visit partially solved block-values in solveBlockValue()
If solveBlockValue() needs results from predecessors that are not already
computed, it returns false with the intention of resuming when the dependencies
have been resolved. However, the computation would never be resumed since an
'overdefined' result had been placed in the cache, preventing any further
computation.

The point of placing the 'overdefined' result in the cache seems to have been
to break cycles, but we can check for that when inserting work items in the
BlockValue stack instead. This makes the "stop and resume" mechanism of
solveBlockValue() work as intended, unlocking more analysis.

Using this patch shaves 120 KB off a 64-bit Chromium build on Linux.

I benchmarked compiling bzip2.c at -O2 but couldn't measure any difference in
compile time.

Tests by Jiangning Liu from r215343 / PR21238, Pete Cooper, and me.

Differential Revision: http://reviews.llvm.org/D6397

llvm-svn: 222768
2014-11-25 17:23:05 +00:00
Chandler Carruth 816d26fe5e [InstCombine] Change LLVM to canonicalize toward the value type being
stored rather than the pointer type.

This change is analogous to r220138 which changed the canonicalization
for loads. The rationale is the same: memory does not have a type,
operations (and thus the values they produce) have a type. We should
match that type as closely as possible rather than reading some form of
semantics into the pointer type.

With this change, loads and stores should no longer be made with
nonsensical types for the values that they load and store. This is
particularly important when trying to match specific loaded and stored
types in the process of doing other instcombines, which is what led me
down this twisty maze of miscanonicalization.

I've put quite some effort into looking through IR to find places where
LLVM's optimizer was being unreasonably conservative in the face of
mismatched load and store types, however it is possible (let's say,
likely!) I have missed some. If you see regressions here, or from
r220138, the likely cause is some part of LLVM failing to cope with load
and store types differing. Test cases appreciated, it is important that
we root all of these out of LLVM.

llvm-svn: 222748
2014-11-25 10:09:51 +00:00
Suyog Sarda 99c9c1f2b0 Change the test case file to use FileCheck instead of grep. NFC.
Change by Ankur Garg.

Differential Revision: http://reviews.llvm.org/D6382

llvm-svn: 222740
2014-11-25 08:44:56 +00:00
Chandler Carruth 1a3c2c414c Revert r220349 to re-instate r220277 with a fix for PR21330 -- quite
clearly only exactly equal width ptrtoint and inttoptr casts are no-op
casts, it says so right there in the langref. Make the code agree.

Original log from r220277:
Teach the load analysis to allow finding available values which require
inttoptr or ptrtoint cast provided there is datalayout available.
Eventually, the datalayout can just be required but in practice it will
always be there today.

To go with the ability to expose available values requiring a ptrtoint
or inttoptr cast, helpers are added to perform one of these three casts.

These smarts are necessary to finish canonicalizing loads and stores to
the operational type requirements without regressing fundamental
combines.

I've added some test cases. These should actually improve as the load
combining and store combining improves, but they may fundamentally be
highlighting some missing combines for select in addition to exercising
the specific added logic to load analysis.

llvm-svn: 222739
2014-11-25 08:20:27 +00:00
David Majnemer bd9ce4ea51 InstSimplify: Handle some simple tautological comparisons
This handles cases where we are comparing a masked value against itself.
The analysis could be further improved by making it recursive but such
expense is not currently justified.

llvm-svn: 222716
2014-11-25 02:55:48 +00:00
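As an illustration of a masked value compared against itself, the sketch below uses the unsigned comparisons (x & m) <= x and (x & m) > x. These are assumed examples of the kind of tautology r222716 describes, not a list taken from the patch; the function names are hypothetical.

```c
#include <stdint.h>

/* Masking can only clear bits, so the masked value never exceeds x
   when both are treated as unsigned. */
int masked_ule(uint32_t x, uint32_t m) {
    return (x & m) <= x;   /* always 1 */
}

int masked_ugt(uint32_t x, uint32_t m) {
    return (x & m) > x;    /* always 0 */
}
```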
Matt Arsenault 238ff1ad1e Bug 21610: Canonicalize min/max fcmp selects to use ordered comparisons
llvm-svn: 222705
2014-11-24 23:15:18 +00:00
Matt Arsenault ea515d33c9 Convert test to FileCheck and use CHECK-LABEL
llvm-svn: 222704
2014-11-24 23:03:17 +00:00
David Majnemer 8e6f6a98b5 InstCombine: Don't create an unused instruction
We would create an instruction but not insert it.
Not inserting the unused instruction would lead to a verification
failure.

This fixes PR21653.

llvm-svn: 222659
2014-11-24 16:41:13 +00:00
David Majnemer b2a6e7458d InstCombine: Don't assume DataLayout is always available
We tried to get the result of DataLayout::getLargestLegalIntTypeSize but
we didn't have a DataLayout.  This resulted in opt crashing.

This fixes PR21651.

llvm-svn: 222645
2014-11-24 07:26:20 +00:00
Elena Demikhovsky 9e5089a938 Masked Vector Load and Store Intrinsics.
Introduced new target-independent intrinsics in order to support masked vector loads and stores. The loop vectorizer optimizes loops containing conditional memory accesses by generating these intrinsics for existing targets AVX2 and AVX-512. The vectorizer asks the target about availability of masked vector loads and stores.
Added SDNodes for masked operations and lowering patterns for X86 code generator.
Examples:
declare <16 x i32> @llvm.masked.load.v16i32(i8* %addr, <16 x i32> %passthru, i32 4 /* align */, <16 x i1> %mask)
declare void @llvm.masked.store.v8f64(i8* %addr, <8 x double> %value, i32 4, <8 x i1> %mask)

Scalarizer for other targets (not AVX2/AVX-512) will be done in a separate patch.

http://reviews.llvm.org/D6191

llvm-svn: 222632
2014-11-23 08:07:43 +00:00
David Majnemer fb3805576b InstCombine: Propagate exact for (sdiv X, Pow2) -> (udiv X, Pow2)
llvm-svn: 222625
2014-11-22 20:00:41 +00:00
David Majnemer ec6e481bc5 InstCombine: Propagate exact for (sdiv X, Y) -> (udiv X, Y)
llvm-svn: 222624
2014-11-22 20:00:38 +00:00
David Majnemer fa4699e65f InstCombine: Propagate exact for (sdiv -X, C) -> (sdiv X, -C)
llvm-svn: 222623
2014-11-22 20:00:34 +00:00
David Majnemer a3aeb15613 InstCombine: Propagate exact in (udiv (lshr X,C1),C2) -> (udiv X,C2<<C1)
llvm-svn: 222620
2014-11-22 18:16:54 +00:00
David Majnemer 546f81064c InstCombine: Propagate NSW/NUW for X*(1<<Y) -> X<<Y
llvm-svn: 222613
2014-11-22 08:57:02 +00:00
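The value equivalence behind the X*(1<<Y) -> X<<Y fold can be sketched in C with unsigned 32-bit arithmetic. The NSW/NUW flags themselves have no direct C counterpart, so this only shows that both forms compute the same bits. The function names are hypothetical and Y is assumed to be less than 32.

```c
#include <stdint.h>

/* X * (1 << Y), computed modulo 2^32; y must be < 32. */
uint32_t mul_by_shifted_one(uint32_t x, unsigned y) {
    return x * (UINT32_C(1) << y);
}

/* X << Y, the form InstCombine rewrites to; y must be < 32. */
uint32_t shift_left(uint32_t x, unsigned y) {
    return x << y;
}
```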
David Majnemer 8279a7506d InstCombine: Propagate NSW for -X * -Y -> X * Y
llvm-svn: 222612
2014-11-22 07:25:19 +00:00
David Majnemer 4efa9ff8ca InstSimplify: Simplify (sub 0, X) -> X if it's NUW
This is a generalization of the X - (0 - Y) -> X transform.

llvm-svn: 222611
2014-11-22 07:15:16 +00:00
David Majnemer 80c8f627db InstCombine: Preserve nsw when folding X*(2^C) -> X << C
llvm-svn: 222606
2014-11-22 04:52:55 +00:00
David Majnemer fd4a6d2b7a InstCombine: Preserve nsw/nuw for ((X << C2)*C1) -> (X * (C1 << C2))
llvm-svn: 222605
2014-11-22 04:52:52 +00:00
David Majnemer 027bc80928 InstCombine: Preserve nsw for (mul %V, -1) -> (sub 0, %V)
llvm-svn: 222604
2014-11-22 04:52:38 +00:00
Gerolf Hoflehner ec6217c929 [InstCombine] Re-commit of r218721 (Optimize icmp-select-icmp sequence)
Fixes the self-host fail. Note that this commit activates dominator
analysis in the combiner by default (like the original commit did).

llvm-svn: 222590
2014-11-21 23:36:44 +00:00
David Majnemer c0a313b57c SROA: The alloca type isn't a candidate promotion type for vectors
The alloca's type is irrelevant, only those types which are used in a
load or store of the exact size of the slice should be considered.

This manifested as an assertion failure when we compared the various
types: we had a size mismatch.

This fixes PR21480.

llvm-svn: 222499
2014-11-21 02:34:55 +00:00
Michael Zolotukhin 0dcae71449 Fix a trip-count overflow issue in LoopUnroll.
Currently LoopUnroll generates a prologue loop before the main loop
body to execute the first N%UnrollFactor iterations. This loop is also
used if the trip count can overflow, which is determined by a runtime check.

However, we've been mistakenly optimizing this loop to linear code for
UnrollFactor = 2, not taking into account that it also serves as a safe
version of the loop if its trip count overflows.

llvm-svn: 222451
2014-11-20 20:19:55 +00:00
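The prologue-plus-unrolled-body shape that r222451 refers to looks roughly like the C sketch below. It only models the N % UnrollFactor split, not the runtime trip-count overflow check that the fix is about. UNROLL_FACTOR and the function names are hypothetical.

```c
#include <stddef.h>

enum { UNROLL_FACTOR = 4 };   /* hypothetical unroll factor */

/* Reference loop. */
long sum_simple(const int *a, size_t n) {
    long s = 0;
    for (size_t i = 0; i < n; ++i)
        s += a[i];
    return s;
}

/* Prologue loop runs the first n % UNROLL_FACTOR iterations,
   so the main loop always sees a multiple of UNROLL_FACTOR. */
long sum_unrolled(const int *a, size_t n) {
    long s = 0;
    size_t i = 0;
    size_t extra = n % UNROLL_FACTOR;

    for (; i < extra; ++i)            /* prologue */
        s += a[i];

    for (; i < n; i += UNROLL_FACTOR) /* unrolled main body */
        s += a[i] + a[i + 1] + a[i + 2] + a[i + 3];

    return s;
}
```

With UnrollFactor = 2 the prologue runs at most once, which is presumably why it was tempting to turn it into straight-line code.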
Chad Rosier 90a2f9b110 Revert "[Reassociate] As the expression tree is rewritten make sure the operands are"
This reverts commit r222142.  This is causing/exposing an execution-time regression
in spec2006/gcc and coremark on AArch64/A57/Ofast.

Conflicts:

	test/Transforms/Reassociate/optional-flags.ll

llvm-svn: 222398
2014-11-19 23:21:20 +00:00
Suyog Sarda aba97f4aba Vectorize a reduction chain feeding into a 'return' statement.
e.g.
return (a[0]+b[0]) + (a[1]+b[1])

Differential Revision: http://reviews.llvm.org/D6227

llvm-svn: 222364
2014-11-19 16:07:38 +00:00
Arnaud A. de Grandmaison 7b9dc28060 Fix tail recursion elimination
When the BasicBlock containing the return instruction has a PHI with 2
incoming values, FoldReturnIntoUncondBranch will remove the no longer
used incoming value and remove the no longer needed phi as well. This
leaves us with a BB that no longer has a PHI, but the subsequent call
to FoldReturnIntoUncondBranch from FoldReturnAndProcessPred will not
remove the return instruction (which still uses the result of the call
instruction). This prevents EliminateRecursiveTailCall from removing
the value, as it is still being used in a basicblock which has no
predecessors.

The basicblock can not be erased on the spot, because its iterator is
still being used in runTRE.

This issue was exposed when removing the threshold on size for lifetime
marker insertion for named temporaries in clang. The testcase is a much
reduced version of peelOffOuterExpr(const Expr*, const ExplodedNode *)
from clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp.

llvm-svn: 222354
2014-11-19 13:32:51 +00:00
David Majnemer b7adf34ee0 AliasSetTracker: UnknownInsts should contribute to the refcount
AliasSetTracker::addUnknown may create an AliasSet devoid of pointers
just to contain an instruction if no suitable AliasSet already exists.
It will then call AliasSet::addUnknownInst and we will be done.

However, it's possible for addUnknown to choose an existing AliasSet to
addUnknownInst.
If this were to occur, we are in a bit of a pickle: removing pointers
from the AliasSet can cause the entire AliasSet to become destroyed,
taking our unknown instructions out with them.

Instead, keep track of whether or not our AliasSet has any unknown
instructions.

This fixes PR21582.

llvm-svn: 222338
2014-11-19 09:41:05 +00:00
Manman Ren c67109313c Revert r222039 because of bot failure.
http://lab.llvm.org:8080/green/job/clang-Rlto_master/298/
Hopefully, bot will be green. If not, we will re-submit the commit.

llvm-svn: 222287
2014-11-19 00:13:26 +00:00
David Majnemer c6b8e20a5c InstCombine: Fix another infinite loop caused by visitFPTrunc
We would attempt to replace an frem's operand with the same operand.
This would cause InstCombine to think real work was done, causing
InstCombine to enter an infinite loop.

This fixes the second part of PR21576.

llvm-svn: 222265
2014-11-18 22:06:45 +00:00
David Majnemer b32eaddf11 Revert "Revert r222040 because of bot failure."
This reverts commit r222203, reverting r222040 didn't end up turning the
bot green.

llvm-svn: 222261
2014-11-18 21:30:02 +00:00
Chad Rosier b83c6d9c08 [Reassociate] Use test cases that can actually be optimized to verify optional
flags are cleared.  The reassociation pass was just reordering the leaf nodes
in the previous test cases.

llvm-svn: 222250
2014-11-18 20:34:01 +00:00
Philip Reames 018dbf18c4 Tweak EarlyCSE to recognize series of dead stores
EarlyCSE is giving up on the current instruction immediately when it recognizes that the current instruction makes a previous store trivially dead. There's no reason to do this. Once the previous store has been deleted, it's perfectly legal to remember the value of the current store (for value forwarding) and the fact the store occurred (it could be dead too!).

Reviewed by: Hal
Differential Revision: http://reviews.llvm.org/D6301

llvm-svn: 222241
2014-11-18 17:46:32 +00:00
David Majnemer 6fdb6b8fd4 InstCombine: Fold away tautological masked compares
It is impossible for (x & INT_MAX) == 0 && x == INT_MAX to ever be true.

While this sort of reasoning should normally live in InstSimplify,
the machinery that derives this result is not trivial to split out.

llvm-svn: 222230
2014-11-18 09:31:41 +00:00
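The claim in r222230 is easy to confirm in C: the first test requires all of the low bits to be clear, while the second requires all of them to be set. The function name is hypothetical.

```c
#include <limits.h>

/* (x & INT_MAX) == 0 holds only for 0 and INT_MIN,
   neither of which equals INT_MAX, so this is always 0. */
int never_true(int x) {
    return (x & INT_MAX) == 0 && x == INT_MAX;
}
```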
David Majnemer 9a91e4a18a IndVarSimplify: Allow LFTR to fire more often
I added a pessimization in r217102 to prevent miscompiles when the
incremented induction variable was used in a comparison; it would be
poison.

Try to use the incremented induction variable more often when we can be
sure that the increment won't end in poison.

Differential Revision: http://reviews.llvm.org/D6222

llvm-svn: 222213
2014-11-18 02:20:58 +00:00
Manman Ren a64bd44fd8 Revert r222040 because of bot failure.
http://lab.llvm.org:8080/green/job/clang-Rlto_master/298/
Hopefully, bot will be green.

llvm-svn: 222203
2014-11-18 00:33:22 +00:00
Juergen Ributzka c9591e9bdb [SimplifyCFG] Make the value type of the hole check bitmask a power-of-2.
When converting a switch to a lookup table we might have to generate a bitmask
to encode and check for holes in the original switch statement.

The type of this mask depends on the number of switch statements, which can
result in illegal types for pretty much all architectures.

To avoid unnecessary type legalization and help FastISel this commit increases
the size of the bitmask to next power-of-2 value when necessary.

This fixes rdar://problem/18984639.

llvm-svn: 222168
2014-11-17 19:39:56 +00:00
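A rough C sketch of the idea in r222168: a hole-check bitmask has one bit per table slot, and its width is rounded up to a power of two. The helper names and the example mask are hypothetical, and this is not the SimplifyCFG implementation.

```c
#include <stdint.h>

/* Round a bit width up to the next power of two, e.g. 5 -> 8, 8 -> 8. */
unsigned next_pow2_width(unsigned bits) {
    unsigned w = 1;
    while (w < bits)
        w <<= 1;
    return w;
}

/* Hole check: bit idx of the mask is set when the original switch
   had a case for that table slot. idx must be < 8 here. */
int has_case(uint8_t mask, unsigned idx) {
    return (mask >> idx) & 1u;
}
```

For a switch with cases 0, 1, 3 and 4, the table has five slots with a hole at 2. The natural mask type would be 5 bits wide; rounded up it fits in 8 bits, with mask value 0x1B.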
Chad Rosier bc0b869be9 [Reassociate] As the expression tree is rewritten make sure the operands are
emitted in canonical form.

llvm-svn: 222142
2014-11-17 16:33:50 +00:00
Chad Rosier 9a1ac6e494 [Reassociate] Canonicalize constants to RHS operand.
Fix a thinko where the RHS was already a constant.

llvm-svn: 222139
2014-11-17 15:52:51 +00:00
Erik Eckstein 105374fe5e Optimize switch lookup tables with linear mapping.
This is a simple optimization for switch table lookup:
It computes the output value directly with an (optional) mul and add if there is a linear mapping between index and output.
Example:

int f1(int x) {
  switch (x) {
    case 0: return 10;
    case 1: return 11;
    case 2: return 12;
    case 3: return 13;
  }
  return 0;
}

generates:

define i32 @f1(i32 %x) #0 {
entry:
  %0 = icmp ult i32 %x, 4
  br i1 %0, label %switch.lookup, label %return

switch.lookup:
  %switch.offset = add i32 %x, 10
  ret i32 %switch.offset

return:
  ret i32 0
}

llvm-svn: 222121
2014-11-17 09:13:57 +00:00
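The generated IR for f1 above corresponds to the following C, shown next to the original switch for comparison. The names f1_switch and f1_linear are hypothetical; the unsigned compare mirrors the icmp ult in the IR.

```c
/* Original switch from the commit message. */
int f1_switch(int x) {
    switch (x) {
    case 0: return 10;
    case 1: return 11;
    case 2: return 12;
    case 3: return 13;
    }
    return 0;
}

/* Linear mapping: one unsigned range check, then x + 10. */
int f1_linear(int x) {
    if ((unsigned)x < 4u)
        return x + 10;
    return 0;
}
```

Here the multiplier is 1, so only the add remains; a mapping such as 10, 12, 14, 16 would need the optional mul as well.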
Rafael Espindola a3b5b60753 Add back r222061 with a fix.
This adds back r222061, but now calls initializePAEvalPass from the correct
library to avoid link problems.

Original message:

Don't make assumptions about the name of private global variables.

Private variables can be renamed, so it is not reliable to make
decisions on the name.

The name is also dropped by the assembler before getting to the
linker, so using the name causes a disconnect between how llvm makes a
decision (var name) and how the linker makes a decision (section it is
in).

This patch changes one case where we were looking at the variable name to use
the section instead.

Test tuning by Michael Gottesman.

llvm-svn: 222117
2014-11-17 02:28:27 +00:00
Reid Kleckner 007239863e Revert "Don't make assumptions about the name of private global variables."
This reverts commit r222061.

It's causing linker errors.

llvm-svn: 222077
2014-11-15 02:03:53 +00:00