12651 Commits

Author SHA1 Message Date
Cameron McInally 4ebbc4d73a [NFC][InstCombine] Add unary FNeg tests to fsub.ll known-never-nan.ll
llvm-svn: 361971
2019-05-29 15:21:28 +00:00
Matt Arsenault 36e7254441 SpeculateAroundPHIs: Respect convergent
llvm-svn: 361957
2019-05-29 13:14:39 +00:00
Rong Xu e88173abc0 [PGO] Handle cases of failing to split critical edges
Fix PR41279 where critical edges to EHPad are not split.
The fix is to not instrument those critical edges. We used to be able to know
the size of counters right after MST is computed. With this, we have to
pre-collect the instrument BBs to know the size, and then instrument them.

Differential Revision: https://reviews.llvm.org/D62439

llvm-svn: 361882
2019-05-28 21:45:56 +00:00
Nikita Popov 5b32f60ec3 Revert "[CorrelatedValuePropagation] Fix prof branch_weights metadata handling for SwitchInst"
This reverts commit 53f2f32865.

As reported on D62126, this causes assertion failures if the switch
has incorrect branch_weights metadata, which may happen as a result
of other transforms not handling it correctly yet.

llvm-svn: 361881
2019-05-28 21:28:24 +00:00
Nikita Popov 2941eb6864 [InstCombine] Add tests for signed saturating always overflow; NFC
llvm-svn: 361864
2019-05-28 18:59:28 +00:00
Simon Tatham 760df47b77 [ARM] Replace fp-only-sp and d16 with fp64 and d32.
Those two subtarget features were awkward because their semantics are
reversed: each one indicates the _lack_ of support for something in
the architecture, rather than the presence. As a consequence, you
don't get the behavior you want if you combine two sets of feature
bits.

Each SubtargetFeature for an FP architecture version now comes in four
versions, one for each combination of those options. So you can still
say (for example) '+vfp2' in a feature string and it will mean what
it's always meant, but there's a new string '+vfp2d16sp' meaning the
version without those extra options.

A lot of this change is just mechanically replacing positive checks
for the old features with negative checks for the new ones. But one
more interesting change is that I've rearranged getFPUFeatures() so
that the main FPU feature is appended to the output list *before*
rather than after the features derived from the Restriction field, so
that -fp64 and -d32 can override defaults added by the main feature.

Reviewers: dmgreen, samparker, SjoerdMeijer

Subscribers: srhines, javed.absar, eraman, kristof.beyls, hiraditya, zzheng, Petar.Avramovic, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D60691

llvm-svn: 361845
2019-05-28 16:13:20 +00:00
Hans Wennborg d936e40575 Re-commit r357452 (take 2): "SimplifyCFG SinkCommonCodeFromPredecessors: Also sink function calls without used results (PR41259)"
This was reverted in r360086 as it was suspected of causing mysterious test
failures internally. However, it was never concluded that this patch was the
root cause.

> The code was previously checking that candidates for sinking had exactly
> one use or were a store instruction (which can't have uses). This meant
> we could sink call instructions only if they had a use.
>
> That limitation seemed a bit arbitrary, so this patch changes it to
> "instruction has zero or one use" which seems more natural and removes
> the need to special-case stores.
>
> Differential revision: https://reviews.llvm.org/D59936

llvm-svn: 361811
2019-05-28 12:19:38 +00:00
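
The quoted description in d936e40575 contrasts two sinking predicates. A minimal sketch of just that difference follows, assuming a hypothetical `Inst` record; the real SimplifyCFG code applies further checks that are not modelled here.

```python
from dataclasses import dataclass


@dataclass
class Inst:
    """Hypothetical stand-in for an IR instruction (not an LLVM type)."""
    opcode: str    # e.g. "store", "call", "add"
    num_uses: int  # number of uses of this instruction's result


def old_can_sink(inst):
    # Old rule as described: exactly one use, or a store (which has no uses).
    return inst.num_uses == 1 or inst.opcode == "store"


def new_can_sink(inst):
    # New rule as described: zero or one use, so stores need no special case.
    return inst.num_uses <= 1
```

Under this model, a call whose result is unused fails the old predicate and passes the new one, while stores pass both.
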
Yevgeny Rouban 53f2f32865 [CorrelatedValuePropagation] Fix prof branch_weights metadata handling for SwitchInst
This patch fixes the CorrelatedValuePropagation pass to keep
prof branch_weights metadata of SwitchInst consistent.
It makes use of SwitchInstProfUpdateWrapper.
New tests are added.

Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D62126

llvm-svn: 361808
2019-05-28 11:33:50 +00:00
Simon Pilgrim 48c8bdad2a [SLPVectorizer][X86] Add broadcast test case from D62427
llvm-svn: 361805
2019-05-28 11:10:56 +00:00
Florian Hahn 11b2f4fe50 [LoopInterchange] Fix handling of LCSSA nodes defined in headers and latches.
The code to preserve LCSSA PHIs currently only properly supports
reduction PHIs and PHIs for values defined outside the latches.

This patch improves the LCSSA PHI handling to cover PHIs for values
defined in the latches.

Fixes PR41725.

Reviewers: efriedma, mcrosier, davide, jdoerfert

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D61576

llvm-svn: 361743
2019-05-26 23:38:25 +00:00
Shawn Landden 343578759e [SimplifyCFG] back out all SwitchInst commits
They caused the sanitizer builds to fail.

My suspicion is the change to countLeadingZeros().

llvm-svn: 361736
2019-05-26 18:15:51 +00:00
Shawn Landden 7b883b7ed0 [SimplifyCFG] NFC, one more fixed test from previous push.
The old test was checking for a stupid subtract one that is a transform that
makes the code worse.

The constant-islands-jump-table.ll test wants the code a specific way,
that makes sense, so I will submit code to fix that one.

Sorry that I really didn't know how to run the test suite before this.

llvm-svn: 361733
2019-05-26 15:29:10 +00:00
Shawn Landden 927fe7328d [SimplifyCFG] NFC, fix failing tests from last patches.
No problems with the transforms.

llvm-svn: 361730
2019-05-26 14:44:14 +00:00
Sanjay Patel 9317963920 [InstCombine] prevent crashing with invalid extractelement index
This was found/reduced from a fuzzer report:
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=14956

llvm-svn: 361729
2019-05-26 14:03:50 +00:00
Shawn Landden fa91ab85d9 [SimplifyCFG] ReduceSwitchRange: Improve on the case where the SubThreshold doesn't trigger
llvm-svn: 361728
2019-05-26 13:55:52 +00:00
Shawn Landden 30111c786f [SimplifyCFG] Run ReduceSwitchRange unconditionally, generalize
Rather than gating on "isSwitchDense" (resulting in necessarily
sparse lookup tables even when they were generated), always run
this quite cheap transform.

This transform is useful not just for generating tables.
LowerSwitch also wants this: read LowerSwitch.cpp:257.

Be careful to not generate worse code, by introducing a
SubThreshold heuristic.

Instead of just sorting by signed, generalize the finding of the
best base.

And now that it is run unconditionally, do not replicate its
functionality in SwitchToLookupTable (which could use a Sub
when having a hole is smaller, hence the SubThreshold
heuristic located in a single place).
This simplifies SwitchToLookupTable, and fixes
some ugly corner cases due to the use of signed numbers,
such as a table containing i16 32768 and 32769, of which
32769 would be interpreted as -32768, and now the code thinks
the table is size 65536.

(We still use unconditional subtraction when building a single-register mask,
but I think this whole block should go when the more general sparse
map is added, which doesn't leave empty holes in the table.)

And the reason test4 and test5 did not trigger was documented wrong:
it was because they were not considered sufficiently "dense".

Also, fix generation of invalid LLVM-IR: shl by bit-width.

llvm-svn: 361727
2019-05-26 13:55:14 +00:00
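
The signed corner case mentioned in 30111c786f can be illustrated with plain integer arithmetic. This is only a sketch: the helper names are hypothetical, and the case values 32767 and 32768 are chosen here for illustration rather than taken from the commit.

```python
def to_signed(value, bits):
    """Reinterpret an unsigned bit pattern as a two's-complement signed value."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >= 1 << (bits - 1) else value


def table_size(cases, bits, signed):
    """Size of a lookup table spanning min..max of the case values."""
    mask = (1 << bits) - 1
    keys = [to_signed(c, bits) if signed else c & mask for c in cases]
    return max(keys) - min(keys) + 1
```

With i16 cases 32767 and 32768, the unsigned view gives a table of size 2, while the signed view sees 32767 and -32768 and gives a table of size 65536.
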
Shawn Landden 50c73a044f [SimplifyCFG] NFC, update Switch tests to HEAD so I can see if my changes change anything
Also add baseline tests to show effect of later patches.

llvm-svn: 361725
2019-05-26 13:52:41 +00:00
David Bolvansky 0290a77aa8 [SimplifyCFG] Added condition assumption for unreachable blocks
Summary: PR41688

Reviewers: spatel, efriedma, craig.topper, hfinkel, reames

Reviewed By: hfinkel

Subscribers: javed.absar, dmgreen, fhahn, hfinkel, reames, nikic, lebedev.ri, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61409

llvm-svn: 361707
2019-05-25 22:34:27 +00:00
Nikita Popov 6bb5041e94 [LVI][CVP] Add support for saturating add/sub
Adds support for the uadd.sat family of intrinsics in LVI, based on
ConstantRange methods from D60946.

Differential Revision: https://reviews.llvm.org/D62447

llvm-svn: 361703
2019-05-25 16:44:14 +00:00
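
As a rough illustration of what a range for `uadd.sat` looks like (6bb5041e94), the sketch below works on inclusive, non-wrapping unsigned intervals. It is not the ConstantRange implementation; the function names are hypothetical, and it relies only on saturating addition being monotone in both operands.

```python
def uadd_sat(a, b, bits):
    """Unsigned saturating add: clamp the sum to the largest representable value."""
    return min(a + b, (1 << bits) - 1)


def uadd_sat_range(x, y, bits):
    """Result range for inclusive (lo, hi) operand ranges that do not wrap."""
    # Monotonicity means the extremes of the result come from the extremes
    # of the operands.
    return (uadd_sat(x[0], y[0], bits), uadd_sat(x[1], y[1], bits))
```

For 8-bit operands in [10, 20] and [100, 250], this gives [110, 255].
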
Nikita Popov 3c7edb2de5 [LoopVectorize] Fix test by regenerating checks
llvm-svn: 361699
2019-05-25 14:33:30 +00:00
David Bolvansky 2149811854 [NFC] Make tests more robust for new optimizations
llvm-svn: 361697
2019-05-25 14:10:20 +00:00
David Bolvansky bb76cf0f96 [NFC] Update test checks
llvm-svn: 361695
2019-05-25 13:11:22 +00:00
Nikita Popov 9a33dc9fb8 [CVP] Add tests for saturating add/sub ranges; NFC
llvm-svn: 361694
2019-05-25 09:53:51 +00:00
Nikita Popov 024b18aca7 [LVI][CVP] Calculate with.overflow result range
In LVI, calculate the range of extractvalue(op.with.overflow(%x, %y), 0)
as the range of op(%x, %y). This is mainly useful in conjunction with
D60650: If the result of the operation is extracted in a branch guarded
against overflow, then the value of %x will be appropriately constrained
and the result range of the operation will be calculated taking that
into account.

Differential Revision: https://reviews.llvm.org/D60656

llvm-svn: 361693
2019-05-25 09:53:45 +00:00
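
The idea in 024b18aca7 is that element 0 of `op.with.overflow` is just the (wrapping) result of `op`, so a tighter operand range yields a tighter result range. A conservative sketch for unsigned addition, with hypothetical helper names and a deliberately simple fallback to the full range when the sum may wrap:

```python
def uadd_with_overflow(x, y, bits):
    """Model of uadd.with.overflow: (wrapped result, overflow flag)."""
    total = x + y
    return total & ((1 << bits) - 1), total >> bits != 0


def result_range(x_range, y_range, bits):
    """Range of element 0 for inclusive, non-wrapping operand ranges."""
    lo = x_range[0] + y_range[0]
    hi = x_range[1] + y_range[1]
    if hi < 1 << bits:
        return (lo, hi)          # no operand pair can overflow
    return (0, (1 << bits) - 1)  # may wrap: fall back to the full range
```

If a guard has already constrained `%x` so that the sum cannot overflow, the first branch applies and the result range is exact.
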
Craig Topper 46e5052b8e [X86FixupLEAs] Turn optIncDec into a generic two address LEA optimizer. Support LEA64_32r properly.
INC/DEC is really a special case of a more generic issue. We should also turn leas into add reg/reg or add reg/imm regardless of the slow lea flags.

This also supports LEA64_32 which has 64 bit input registers and 32 bit output registers. So we need to convert the 64 bit inputs to their 32 bit equivalents to check if they are equal to base reg.

One thing to note, the original code preserved the kill flags by adding operands to the new instruction instead of using addReg. But I think tied operands aren't supposed to have the kill flag set. I dropped the kill flags, but I could probably try to preserve it in the add reg/reg case if we think it's important. Not sure which operand it's supposed to go on for the LEA64_32r instruction due to the super reg implicit uses. Though I'm also not sure those are needed since they were probably just created by an INSERT_SUBREG from a 32-bit input.

Differential Revision: https://reviews.llvm.org/D61472

llvm-svn: 361691
2019-05-25 06:17:47 +00:00
Matt Arsenault 0ff901fba0 AMDGPU: Boost inline threshold with addrspacecasted alloca arguments
This was skipping GetUnderlyingObject for nonprivate addresses, but an
alloca could also be found through an addrspacecast if it's flat.

llvm-svn: 361649
2019-05-24 16:52:35 +00:00
Sanjay Patel 6f7734a125 [LoopVectorize] update test to be independent of instcombine; NFC
This is a regression test for vectorization, so remove instcombine
from the RUN line and adjust the comparison predicates to show what
the vectorizer is creating rather than how instcombine cleans it up.

llvm-svn: 361648
2019-05-24 16:46:09 +00:00
Neil Henning 119c31ad93 StructurizeCFG: Relax uniformity checks.
This change relaxes the checks for hasOnlyUniformBranches such that our
region is uniform if:

1. All conditional branches that are direct children are uniform.
2. And either:
  a. All sub-regions are uniform.
  b. There is at most one conditional branch among the direct
     children.

Differential Revision: https://reviews.llvm.org/D62198

llvm-svn: 361610
2019-05-24 08:59:17 +00:00
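
The relaxed rule listed in 119c31ad93 can be written as a small predicate. The sketch below assumes a hypothetical `Region` record and treats "sub-region is uniform" as a recursive application of the same rule, which is an assumption rather than something the commit message states.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    """Hypothetical region model (not the LLVM Region class)."""
    # One flag per conditional branch among the direct children:
    # True if that branch is uniform.
    cond_branches_uniform: list
    sub_regions: list = field(default_factory=list)


def has_only_uniform_branches(region):
    # 1. All conditional branches that are direct children must be uniform.
    if not all(region.cond_branches_uniform):
        return False
    # 2a. All sub-regions are uniform, or
    subs_uniform = all(has_only_uniform_branches(s) for s in region.sub_regions)
    # 2b. there is at most one conditional branch among the direct children.
    return subs_uniform or len(region.cond_branches_uniform) <= 1
```
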
Bjorn Pettersson d63a2bb35f [DSE] Bugfix to avoid PartialStoreMerging involving non byte-sized stores
Summary:
The DeadStoreElimination pass now skips doing
PartialStoreMerging when stores overlap according to
OW_PartialEarlierWithFullLater and at least one of
the stores has a store size that is different
from the size of the type being stored.

This solves problems seen in
  https://bugs.llvm.org/show_bug.cgi?id=41949
for which we in the past could end up with
mis-compiles or assertions.

The content and location of the padding bits is not
formally described (or undefined) in the LangRef
at the moment. So the solution is chosen based on
that we cannot assume anything about the padding bits
when having a store that clobbers more memory than
indicated by the type of the value that is stored
(such as storing an i6 using an 8-bit store instruction).

Fixes: https://bugs.llvm.org/show_bug.cgi?id=41949

Reviewers: spatel, efriedma, fhahn

Reviewed By: efriedma

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62250

llvm-svn: 361605
2019-05-24 08:32:02 +00:00
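
The condition described in d63a2bb35f compares a type's size in bits with the size of the store that writes it. A minimal sketch of that comparison, assuming stores are rounded up to whole 8-bit bytes; the function names are hypothetical and the actual pass checks more than this.

```python
def store_size_in_bits(type_bits):
    """Bits of memory clobbered by storing a value of type_bits bits."""
    return (type_bits + 7) // 8 * 8


def is_byte_sized(type_bits):
    # True when the store writes exactly the bits of the type (no padding).
    return store_size_in_bits(type_bits) == type_bits


def may_merge_partial_stores(earlier_bits, later_bits):
    # Skip merging when either store has padding bits of unknown content.
    return is_byte_sized(earlier_bits) and is_byte_sized(later_bits)
```

An i6 occupies an 8-bit store, so it is not byte-sized and blocks the merge in this model.
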
Eli Friedman 052f87ae36 Revert r361460
It regresses https://bugs.llvm.org/show_bug.cgi?id=38309 (represented
by the testcase test/Transforms/GlobalOpt/globalsra-multigep.ll).

llvm-svn: 361581
2019-05-24 01:03:51 +00:00
Sanjay Patel 8869a98e82 [InstSimplify] fold insertelement-of-extractelement
This was partly handled in InstCombine (only the constant
index case), so delete that and zap it more generally in
InstSimplify.

llvm-svn: 361576
2019-05-24 00:13:58 +00:00
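
One way to read the fold in 8869a98e82 is that inserting an element back into the vector it was extracted from, at the same index, is a no-op. The sketch below models vectors as Python lists and is an assumption about the fold's shape, not a transcription of the InstSimplify code; all names are hypothetical.

```python
def extractelement(vec, idx):
    return vec[idx]


def insertelement(vec, val, idx):
    out = list(vec)  # vectors are values, so build a new one
    out[idx] = val
    return out


def simplify_insert_of_extract(vec, src, src_idx, dst_idx):
    """Return vec if inserting src[src_idx] into vec at dst_idx is a no-op, else None."""
    if src is vec and src_idx == dst_idx:
        return vec
    return None
```
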
Sanjay Patel 3e15f83381 [InstSimplify] add tests for insert-of-extract; NFC
llvm-svn: 361575
2019-05-24 00:11:23 +00:00
Sanjay Patel e60cb7d1be [InstSimplify] insertelement V, undef, ? --> V
This was part of InstCombine, but it's better placed in
InstSimplify. InstCombine also had an unreachable but weaker
fold for insertelement with undef index, so that is deleted.

llvm-svn: 361559
2019-05-23 21:49:47 +00:00
Sanjay Patel 3249be1e03 [InstCombine] be more careful when transforming a shuffle mask
This is reduced from a fuzzer test:
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=14890

Usually, demanded elements should be able to simplify shuffle
mask elements that are pointing to undef elements of its source
operands, but that doesn't happen in the test case.

llvm-svn: 361533
2019-05-23 18:46:03 +00:00
Saleem Abdulrasool 7bbefb13ee Transforms: lower fadd and fsub atomicrmw instructions
`fadd` and `fsub` have recently (r351850) been added as `atomicrmw`
operations. This diff adds lowering cases for them to the LowerAtomic
transform.

Patch by Josh Berdine!

llvm-svn: 361512
2019-05-23 17:03:43 +00:00
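
Lowering an `atomicrmw` to non-atomic code amounts to a load, the arithmetic operation, and a store, with the instruction's result being the value that was loaded. A sketch of that shape for the two operations named in 7bbefb13ee, using a dict as a hypothetical stand-in for memory:

```python
def lower_atomicrmw(memory, addr, op, operand):
    """Non-atomic equivalent of atomicrmw for fadd/fsub; returns the old value."""
    old = memory[addr]       # load
    if op == "fadd":
        new = old + operand  # fadd
    elif op == "fsub":
        new = old - operand  # fsub
    else:
        raise ValueError(f"unsupported operation: {op}")
    memory[addr] = new       # store
    return old
```
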
Cameron McInally 1312225f8c [NFC][InstCombine] Add unary FNeg tests to maximum.ll/minimum.ll
llvm-svn: 361500
2019-05-23 14:53:42 +00:00
Clement Courbet 43882b16a3 [MergeICmps] Make the pass compatible with the new pass manager.
Reviewers: gchatelet, spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62287

llvm-svn: 361490
2019-05-23 12:35:26 +00:00
Christian Bruel 4a7da98bd9 [GlobalOpt] recognize dead struct fields and propagate values
Summary:
Allow SRA of struct fields and cleanup of dead stores to them. This works by treating field accesses from getElementPtr as possible pointer roots that can be cleaned up.
We check that the variable can be SRA'd by recursively checking the sub-expressions with the new isSafeSubSROAGEP function.

Basically, this allows the array in the following C code to be optimized out:

struct Expr {
  int a[2];
  int b;
};

static struct Expr e;

int foo (int i)
{
  e.b = 2;
  e.a[i] = 1;
  return e.b;
}


Reviewers: greened, bkramer, nicholas, jmolloy

Reviewed By: jmolloy

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61911

llvm-svn: 361460
2019-05-23 05:53:10 +00:00
Craig Topper 9816d55776 [X86][InstCombine] Remove InstCombine code that turns X86 round intrinsics into llvm.ceil/floor. Remove some isel patterns that existed because that was happening.
We were turning roundss/sd/ps/pd intrinsics with immediates of 1 or 2 into
llvm.floor/ceil.  The llvm.ceil/floor intrinsics are supposed to correspond
to the libm functions.  For the libm functions we need to disable the
precision exception so the llvm.floor/ceil functions should always map to
encodings 0x9 and 0xA.

We had a mix of isel patterns where some used 0x9 and 0xA and others used
0x1 and 0x2. We need to be consistent and always use 0x9 and 0xA.

Since we have no way in isel of knowing where the llvm.ceil/floor came
from, we can't map X86 specific intrinsics with encodings 1 or 2 to it.
We could map 0x9 and 0xA to llvm.ceil/floor instead, but I'd really like
to see a use case and optimization advantage first.

I've left the backend test cases to show the blend we now emit without
the extra isel patterns. But I've removed the InstCombine tests completely.

llvm-svn: 361425
2019-05-22 20:04:55 +00:00
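
The difference between encodings 0x1/0x2 and 0x9/0xA in 9816d55776 comes down to one bit of the rounding immediate. The decoder below is a sketch based on the usual layout of the ROUND* immediate (bits 1:0 rounding mode, bit 2 use MXCSR rounding, bit 3 suppress the precision exception); the function names are hypothetical.

```python
ROUNDING_MODES = {0: "nearest", 1: "floor", 2: "ceil", 3: "trunc"}


def decode_round_imm(imm):
    """Split a ROUND* immediate into its fields."""
    return {
        "mode": ROUNDING_MODES[imm & 0x3],
        "use_mxcsr_rounding": bool(imm & 0x4),
        "suppress_precision_exception": bool(imm & 0x8),
    }


def matches_libm(imm, name):
    """True if the immediate behaves like libm floor/ceil (no precision exception)."""
    fields = decode_round_imm(imm)
    return (fields["mode"] == name
            and not fields["use_mxcsr_rounding"]
            and fields["suppress_precision_exception"])
```

In this model 0x9 and 0xA match libm `floor` and `ceil`, while 0x1 and 0x2 select the same rounding modes but leave the precision exception enabled.
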
Hiroshi Yamauchi dfeb797455 [PGO][CHR] Speed up following long use-def chains.
Summary: Avoid visiting an instruction more than once by using a map.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62262

llvm-svn: 361416
2019-05-22 18:37:34 +00:00
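
The speed-up in dfeb797455 is the familiar visited-map technique for graph walks. A generic sketch follows, with a hypothetical `operands` dict standing in for use-def edges; it is not the CHR pass's code.

```python
def collect_defs(root, operands):
    """Walk use-def edges from root, processing each instruction once.

    Returns the set of reachable instructions and the number of visits.
    """
    visited = {}  # instruction -> True once processed (the map)
    worklist = [root]
    visits = 0
    while worklist:
        inst = worklist.pop()
        if inst in visited:
            continue  # already handled via another path
        visited[inst] = True
        visits += 1
        worklist.extend(operands.get(inst, []))
    return set(visited), visits
```

On a diamond-shaped chain (`d` uses `b` and `c`, both of which use `a`), `a` is reached twice but processed once.
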
Cameron McInally adea0b6b40 [NFC][InstCombine] Add unary fneg tests to maxnum.ll/minnum.ll
llvm-svn: 361415
2019-05-22 18:27:43 +00:00
Sanjay Patel 5a4f7cf2ff [IR] allow fast-math-flags on select of FP values
This is a minimal start to correcting a problem most directly discussed in PR38086:
https://bugs.llvm.org/show_bug.cgi?id=38086

We have been hacking around a limitation for FP select patterns by using the
fast-math-flags on the condition of the select rather than the select itself.
This patch just allows FMF to appear with the 'select' opcode. No changes are
needed to "FPMathOperator" because it already includes select-of-FP because
that definition is based on the (return) value type.

Once we have this ability, we can start correcting and adding IR transforms
to use the FMF on a 'select' instruction. The instcombine and vectorizer test
diffs only show that the IRBuilder change is behaving as expected by applying
an FMF guard value to 'select'.

For reference:
rL241901 - allowed FMF with fcmp
rL255555 - allowed FMF with FP calls

Differential Revision: https://reviews.llvm.org/D61917

llvm-svn: 361401
2019-05-22 15:50:46 +00:00
Sanjay Patel 6a554188aa [InstCombine] fold shuffles of insert_subvectors
This should be a valid exception to the general rule of not creating new shuffle masks in IR...
because we already do it. :)
Also, DAG combining/legalization will undo this by widening the shuffle back out if needed.

Explanation for how we already do this: SLP or vector source can create chains of insert/extract
as shown in 1 of the examples from PR16739:
https://godbolt.org/z/NlK7rA
https://bugs.llvm.org/show_bug.cgi?id=16739

And we expect instcombine or DAGCombine to clean that up by creating relatively simple shuffles.

Differential Revision: https://reviews.llvm.org/D62024

llvm-svn: 361338
2019-05-22 00:32:25 +00:00
Sanjay Patel 3590bae8d6 [InstCombine] add more tests for shuffle folding; NFC
As discussed in D62024, we want to limit any potential IR
transforms of shuffles to cases where we know the SDAG
conversion would result in equivalent patterns for these
IR variants.

llvm-svn: 361317
2019-05-21 21:45:24 +00:00
Cameron McInally 17fdf1d383 [NFC][InstCombine] Add unary fneg tests to operand-complexity.ll.
llvm-svn: 361311
2019-05-21 21:07:46 +00:00
Cameron McInally 872dc79f20 [NFC][InstCombine] Add unary FNeg tests to X86/x86-avx512.ll
llvm-svn: 361308
2019-05-21 20:31:09 +00:00
Bob Haarman 032f87bbb3 Revert r360902 "Resubmit: [Salvage] Change salvage debug info ..."
This reverts commit r360902. It caused an assertion failure in
lib/IR/DebugInfoMetadata.cpp: Assertion `(OffsetInBits + SizeInBits <=
FragmentSizeInBits) && "new fragment outside of original fragment"'
failed.

PR41931.

llvm-svn: 361246
2019-05-21 11:53:41 +00:00
Clement Courbet a95d95d392 [MergeICmps] Preserve the dominator tree.
Summary: In preparation for D60318.

Reviewers: gchatelet, efriedma

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62068

llvm-svn: 361239
2019-05-21 11:02:23 +00:00
Nikita Popov e1d38ec811 [LFTR] Add additional PR31181 test cases
One case where overflow happens in the first loop iteration, and
two cases where we switch to a dynamically dead IV with post/pre
increment, respectively.

llvm-svn: 361189
2019-05-20 19:13:04 +00:00
Cameron McInally 2557ca296a [InstCombine] Add visitFNeg(...) visitor for unary Fneg
Also, break out a helper function, namely foldFNegIntoConstant(...), which performs transforms common between visitFNeg(...) and visitFSub(...).

Differential Revision: https://reviews.llvm.org/D61693

llvm-svn: 361188
2019-05-20 19:10:30 +00:00