335 Commits

Author SHA1 Message Date
Bjorn Pettersson bf3213e485 [CGP] Avoid segmentation fault when doing PHI node simplifications
Summary:
Made PHI node simplifications more robust in several ways:

- Minor refactoring to let the SimplificationTracker own the
sets with new PHI/Select nodes that are introduced. This is
maybe not mapping to the original intention with the
SimplificationTracker, but IMHO it encapsulates the logic behind
those sets a little bit better.

- MatchPhiNode can sometimes populate the Matched set with
several entries, where it maps one PHI node to different candidates
for replacement. The Matched set is changed into a SmallSetVector
to make sure we get a deterministic iteration when doing
the replacements.

- As described above we may get several different replacements
for a single PHI node. The loop in MatchPhiSet that is doing
the replacements could end up calling eraseFromParent several
times for the same PHI node, resulting in segmentation faults.
This problem was supposed to be fixed in rL327250, but due to
the non-determinism(?) it only appeared to be fixed (I still
got crashes sometimes when turning on/off -print-after-all etc
to get different iteration order in the DenseSets).
With this patch we follow the deterministic ordering in the
Matched set when replacing the PHI nodes. If we find a new
replacement for an already replaced PHI node we replace the
new replacement by the old replacement instead. This is quite
similar to what happened in the rL327250 patch, but here we
also recursively verify that the old replacement hasn't been
replaced already.

- It was really hard to track down the fault described above
(segmentation fault due to doing eraseFromParent multiple
times for the same instruction). The fault was intermittent and
small changes in the code, or simply turning on -print-after-all
etc could make the problem go away. This was basically due to
the iteration over PhiNodesToMatch in MatchPhiSet not being
deterministic. Therefore I've changed the data structure for
the SimplificationTracker::AllPhiNodes into a SmallSetVector.
This gives a deterministic behavior.
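
The replacement rule described in the list above can be sketched in a few lines. This is only an illustration under stated assumptions, not the code from the patch: the names `resolve` and `apply_replacements` are hypothetical, nodes are plain strings, and an insertion-ordered list of pairs stands in for the SmallSetVector.

```python
def resolve(replaced, node):
    """Follow earlier replacements until a node that is still alive is reached."""
    while node in replaced:
        node = replaced[node]
    return node


def apply_replacements(matched):
    """Process (phi, candidate) pairs in insertion order, erasing each node once.

    `matched` is an insertion-ordered list of pairs (hypothetical stand-in for
    the Matched set). Returns the replacement map and the erase log.
    """
    replaced = {}  # erased node -> node that took its place
    erased = []    # nodes in the order they were erased
    for phi, candidate in matched:
        if phi in replaced:
            # phi is already gone: fold the new candidate into the old
            # replacement instead of erasing phi a second time.
            old = resolve(replaced, phi)
            new = resolve(replaced, candidate)
            if new != old:
                replaced[new] = old
                erased.append(new)
        else:
            target = resolve(replaced, candidate)
            if target != phi:
                replaced[phi] = target
                erased.append(phi)
    return replaced, erased
```

With `[("p", "a"), ("p", "b")]` as input, `p` is erased once and `b` is then folded into `a`, so no node is erased twice and the result does not depend on hash ordering.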

Reviewers: skatkov, john.brawn

Reviewed By: skatkov

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D44571

llvm-svn: 327961
2018-03-20 09:06:37 +00:00
Jonas Paulsson 5612bb292c [CodeGenPrepare] Respect endianness in splitMergedValStore.
splitMergedValStore will split a store into two if the target prefers this, or if
-force-split-store is passed.

This patch adds the missing handling for endianness in this function along
with a test case.
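
A minimal sketch of why endianness matters when a wide store is split, assuming a 64-bit value split into two 32-bit halves. The function names are hypothetical and the byte buffer only emulates memory; the actual pass operates on IR.

```python
import struct


def split_store(value, big_endian):
    """Split a 64-bit store into two 32-bit stores as (byte offset, half) pairs."""
    lo = value & 0xFFFFFFFF
    hi = (value >> 32) & 0xFFFFFFFF
    # The low half lives at the lower address only on little-endian targets.
    return [(4, lo), (0, hi)] if big_endian else [(0, lo), (4, hi)]


def write_split(value, big_endian):
    """Emulate the two narrow stores into an 8-byte buffer."""
    buf = bytearray(8)
    fmt = ">I" if big_endian else "<I"
    for offset, half in split_store(value, big_endian):
        struct.pack_into(fmt, buf, offset, half)
    return bytes(buf)
```

For either byte order, the two narrow stores must leave the same bytes in memory as the original wide store, which is what ignoring endianness would break on big-endian targets.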

Review: Eli Friedman
https://reviews.llvm.org/D44396

llvm-svn: 327375
2018-03-13 08:36:20 +00:00
Serguei Katkov a20e05bb94 [CGP] Fix the remove of matched phis in complex addressing mode
When we replace the Phi we created with matched ones, it is possible that
there are two identical phi nodes in the IR. The matcher is smart enough to find
that the newly created phi matches both of them. So we try to replace our phi node
with the matched ones twice and, what is worse, we delete our phi node twice, causing a crash.

As soon as we find that we have two identical Phi nodes, it makes sense to do
a clean-up and replace one phi node with the other.
The patch implements this.

Reviewers: john.brawn, reames
Reviewed By: john.brawn
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D43758

llvm-svn: 327250
2018-03-12 03:50:07 +00:00
Elena Demikhovsky 945b7e5aa6 Adding a width of the GEP index to the Data Layout.
Make the width of the GEP index, which is used for address calculation, one of the pointer properties in the Data Layout.
p[address space]:size:memory_size:alignment:pref_alignment:index_size_in_bits.
The index size parameter is optional; if not specified, it is equal to the pointer size.
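
The defaulting rule for the index size can be sketched with a small parser. This is an assumption-laden illustration: it follows the four-field form `p[n]:<size>:<abi>[:<pref>][:<idx>]` documented in the LLVM Language Reference, and `PointerSpec` and `parse_pointer_spec` are hypothetical names, not LLVM APIs.

```python
from dataclasses import dataclass


@dataclass
class PointerSpec:
    address_space: int
    size: int        # pointer size in bits
    abi_align: int   # ABI alignment in bits
    pref_align: int  # preferred alignment in bits
    index_size: int  # GEP index width in bits


def parse_pointer_spec(spec):
    """Parse one pointer entry of a data layout string (assumed field order)."""
    head, *fields = spec.split(":")
    if not head.startswith("p") or len(fields) < 2:
        raise ValueError(f"not a pointer specification: {spec!r}")
    address_space = int(head[1:]) if len(head) > 1 else 0
    nums = [int(f) for f in fields]
    size, abi = nums[0], nums[1]
    pref = nums[2] if len(nums) > 2 else abi
    # The index size is optional and falls back to the pointer size.
    index = nums[3] if len(nums) > 3 else size
    return PointerSpec(address_space, size, abi, pref, index)
```

For example, `parse_pointer_spec("p:64:64:64:32")` yields a 64-bit pointer with a 32-bit index, while `parse_pointer_spec("p1:32:32")` leaves the index at the pointer width.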

Until now, the InstCombiner normalized GEPs and extended the Index operand to the pointer width.
This works fine if you can convert a pointer to an integer for address calculation, and all registered targets do this.
But some ISAs have a very restricted instruction set for pointer calculation. During the discussions it was decided to retrieve the information for the GEP index from the Data Layout.
http://lists.llvm.org/pipermail/llvm-dev/2018-January/120416.html

I added an interface to the Data Layout and I changed the InstCombiner and some other passes to take the Index width into account.
This change does not affect any in-tree target. I added tests to cover data layouts with explicitly specified index size.

Differential Revision: https://reviews.llvm.org/D42123

llvm-svn: 325102
2018-02-14 06:58:08 +00:00
Daniel Neilson be58a220e9 [CodeGenPrepare] Improve source and dest alignments of memory intrinsics independently
Summary:
  This change is part of step five in the series of changes to remove alignment argument from
memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the
CodeGenPrepare pass to be more aggressive in improving the source and destination alignments
of memcpy/memmove/memset by exploiting our new ability to record independent alignments
for each argument.
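
The benefit of independent alignments can be sketched as follows. The function names are hypothetical, and the single-alignment variant is an assumed model of the older behavior, in which one recorded alignment had to be valid for both pointers.

```python
def improve_alignments(dest_align, src_align, known_dest_align, known_src_align):
    """Raise each recorded alignment to the best known value, independently."""
    return max(dest_align, known_dest_align), max(src_align, known_src_align)


def improve_single_alignment(align, known_dest_align, known_src_align):
    """Single-alignment form: limited by the weaker of the two pointers."""
    return max(align, min(known_dest_align, known_src_align))
```

With a destination known to be 16-byte aligned and a source known to be 4-byte aligned, the independent form records (16, 4), whereas the single-alignment form can only record 4.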

Steps:
Step 1) Remove alignment parameter and create alignment parameter attributes for
memcpy/memmove/memset. ( rL322965, rC322964, rL322963 )
Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing
source and dest alignments. ( rL323597 )
Step 3) Update Clang to use the new IRBuilder API. ( rC323617 )
Step 4) Update Polly to use the new IRBuilder API. ( rL323618 )
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API,
and those that use MemIntrinsicInst::[get|set]Alignment() to use [get|set]DestAlignment()
and [get|set]SourceAlignment() instead. ( rL323886 )
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the
MemIntrinsicInst::[get|set]Alignment() methods.

Reference
   http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html
   http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html

llvm-svn: 323891
2018-01-31 17:24:53 +00:00
Serguei Katkov 9fe0524ee6 [CGP] Re-enable Select in complex addressing mode.
Switch Select handling on after fixing two bugs: rL323192 and rL323497.

llvm-svn: 323498
2018-01-26 06:26:56 +00:00
Serguei Katkov 17e5794f11 [CGP] Fix the GV handling in complex addressing mode
If, in complex addressing mode, the difference is in the GV, then the
base reg should not be installed, because we plan to use the
base reg as a merge point of the different GVs.

This is a fix for PR35980.

Reviewers: reames, john.brawn, santosh
Reviewed By: john.brawn
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D42230

llvm-svn: 323192
2018-01-23 12:07:49 +00:00
Serguei Katkov 22bb1c0e17 Revert [CGP] Re-enable Select in complex addressing mode
One of the buildbots failed. Revert for now until the issue is fixed.

llvm-svn: 322923
2018-01-19 04:52:39 +00:00
Daniel Neilson 2409d24201 [NFC] Change MemIntrinsicInst::setAlignment() to take an unsigned instead of a Constant
Summary:
 In preparation for https://reviews.llvm.org/D41675 this NFC changes the
prototype of MemIntrinsicInst::setAlignment() to accept an unsigned instead
of a Constant.

llvm-svn: 322403
2018-01-12 21:33:37 +00:00
Serguei Katkov 76a1de3cd5 [CGP] Re-enable Select in complex addressing mode
Re-enable Select after a couple of fixes.

Differential Revision: https://reviews.llvm.org/D40634

llvm-svn: 322358
2018-01-12 08:33:34 +00:00
Eric Christopher d72f78e7c8 Tidy some grammar in some comments
llvm-svn: 322133
2018-01-09 23:25:38 +00:00
Serguei Katkov 4d1dd6b53a [CGP] Fix Complex addressing mode for offset
If the offset differs between two addressing modes, we can continue only if
ScaleReg is not set, because we will use it as the merge of the different offsets.

It should fix PR35799 and PR35805.

Reviewers: john.brawn, reames
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D41227

llvm-svn: 322056
2018-01-09 04:37:06 +00:00
Benjamin Kramer c7fc81e659 Use phi ranges to simplify code. No functionality change intended.
llvm-svn: 321585
2017-12-30 15:27:33 +00:00
Teresa Johnson a4ce3bfdda [PGO] Function section hotness prefix should look at all blocks
Summary:
The function section prefix for PGO based layout (e.g. hot/unlikely)
should look at the hotness of all blocks not just the entry BB.
A function with a cold entry but a very hot loop should be placed in the
hot section, for example, so that it is located close to other hot
functions it may call. For SamplePGO it was already looking at the
branch weights on calls, and I made that code conditional on whether
this is SamplePGO since it was essentially a noop for instrumentation
PGO anyway.
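
A minimal sketch of the whole-function view described above. The thresholds, the function name, and the `.hot` / `.unlikely` strings are illustrative assumptions; the real decision is made from the profile summary rather than from fixed counts.

```python
def section_prefix(block_counts, hot_threshold, cold_threshold):
    """Pick a section prefix from the profile counts of all blocks."""
    # One hot block is enough, even if the entry block itself is cold.
    if any(count >= hot_threshold for count in block_counts):
        return ".hot"
    # Only a function whose blocks are all cold is placed with unlikely code.
    if all(count <= cold_threshold for count in block_counts):
        return ".unlikely"
    return ""
```

A function with counts `[1, 100000, 1]` (cold entry, hot loop) is classified as hot here, which an entry-only check would miss.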

Reviewers: davidxl

Subscribers: eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D41395

llvm-svn: 321197
2017-12-20 17:53:10 +00:00
Haicheng Wu 0be8825146 [CGP] Format. NFC
Clang-format.

llvm-svn: 321107
2017-12-19 20:53:32 +00:00
Serguei Katkov b0b67a8d38 [CGP] Fix the handling select inst in complex addressing mode
When we put the value in the select placeholder, we must pass
the value through the simplification tracker, because the value might
already have been simplified and erased.

This is a fix for PR35658.

Reviewers: john.brawn, uabelho
Reviewed By: john.brawn
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D41251

llvm-svn: 320956
2017-12-18 04:25:07 +00:00
Serguei Katkov ac4a8fb1cd Revert "[CGP] Enable select in complex addr mode"
Causes: Assertion `ScaledReg == nullptr' failed.

This is actually a revert of rL320551.

llvm-svn: 320553
2017-12-13 07:39:35 +00:00
Serguei Katkov b8cb5da28d [CGP] Enable select in complex addr mode
Enable select instruction handling in complex addr modes.

Reviewers: john.brawn, reames, aaboud
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D40634

llvm-svn: 320551
2017-12-13 06:57:59 +00:00
Hiroshi Yamauchi 9364fa3434 Move splitIndirectCriticalEdges() to BasicBlockUtils.h.
Summary:
Move splitIndirectCriticalEdges() from CodeGenPrepare to BasicBlockUtils.h so
that it can be called from other places.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D40750

llvm-svn: 319689
2017-12-04 20:36:01 +00:00
Serguei Katkov d4df744434 [CGP] Enable complex addr mode
Enable complex addr modes after two critical fixes: rL319109 and rL319292

llvm-svn: 319302
2017-11-29 09:48:50 +00:00
Serguei Katkov 5036459ae3 [CGP] Fix common type handling in optimizeMemoryInst
If the common type is different, we should bail out because we will not be
able to create a select or Phi of these values.

Basically this is done in ExtAddrMode::compare; however, it does not work
if we handle the null first and then two values of different types,
so add a check in initializeMap as well. The check in ExtAddrMode::compare
is used as an earlier bail-out.

Reviewers: reames, john.brawn
Reviewed By: john.brawn
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D40479

llvm-svn: 319292
2017-11-29 05:51:26 +00:00
John Brawn 4b476488ba [CGP] Fix handling of null pointer values in optimizeMemoryInst
The current way that trivial addressing modes are detected incorrectly thinks
that null pointers are non-trivial, leading to an infinite loop where we keep
duplicating the same select. Fix this by being aware of null when deciding if an
addressing mode is trivial.

Differential Revision: https://reviews.llvm.org/D40447

llvm-svn: 319019
2017-11-27 11:29:15 +00:00
Simon Dardis 230f453574 [CodeGenPrepare] Check that erased sunken address are not reused
CodeGenPrepare sinks address computations from one basic block to another
and attempts to reuse address computations that have already been sunk. If
the same address computation appears twice with the first instance as an
operand of a load whose result is an operand to a simplifiable select,
CodeGenPrepare simplifies the select and recursively erases the now dead
instructions. CodeGenPrepare then attempts to use the erased address
computation for the second load.

Fix this by erasing the cached address value if it has zero uses before
looking for the address value in the sunken address map.
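
The fix can be sketched as a cache lookup that discards dead entries first. `Value` and `lookup_sunk_address` are hypothetical names, and a plain dictionary stands in for the sunken address map.

```python
from dataclasses import dataclass


@dataclass
class Value:
    name: str
    uses: int = 0  # number of remaining users of this value


def lookup_sunk_address(cache, addr):
    """Return a reusable sunk address for `addr`, dropping a dead entry first."""
    sunk = cache.get(addr)
    if sunk is not None and sunk.uses == 0:
        # A value with no users may be erased as dead code, so it must not
        # be handed out again; forget it and let the caller sink a new one.
        del cache[addr]
        return None
    return sunk
```

A cached value that still has users is returned as before; only the zero-use case changes behavior.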

This partially resolves PR35209.

Thanks to Alexander Richardson for reporting the issue!

This fixed version relands r318032 which was reverted in r318049 due to
sanitizer buildbot failures.

Reviewers: john.brawn

Differential Revision: https://reviews.llvm.org/D39841

llvm-svn: 318956
2017-11-24 16:45:28 +00:00
John Brawn 70cdb5b391 [CGP] Make optimizeMemoryInst able to combine more kinds of ExtAddrMode fields
This patch extends the recent work in optimizeMemoryInst to make it able to
combine more ExtAddrMode fields than just the BaseReg.

This fixes some benchmark regressions introduced by r309397, where GVN PRE is
hoisting a getelementptr such that it can no longer be combined into the
addressing mode of the load or store that uses it.

Differential Revision: https://reviews.llvm.org/D38133

llvm-svn: 318949
2017-11-24 14:10:45 +00:00
Serguei Katkov ac17aadf41 Revert "[CGP] Enable complex addr mode (2nd attempt)"
Revert the patch rL318728, which causes buildbot hang-ups.

llvm-svn: 318731
2017-11-21 06:03:43 +00:00
Serguei Katkov fc1ff29966 [CGP] Enable complex addr mode (2nd attempt)
2nd attempt to enable complex addr modes after
fix of the crash by rL318638.

llvm-svn: 318728
2017-11-21 05:31:47 +00:00
Serguei Katkov 505359f705 [CGP] Fix the crash caused by enable of complex addr mode
We must collect all AddrModes even if they are the same.
This is because the Original values are different, and we need all original
values collected as they are used as anchors in common phi finding.

Reviewers: john.brawn, reames
Reviewed By: john.brawn
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D40166

llvm-svn: 318638
2017-11-20 05:42:36 +00:00
David Blaikie b3bde2ea50 Fix a bunch more layering of CodeGen headers that are in Target
All these headers already depend on CodeGen headers so moving them into
CodeGen fixes the layering (since CodeGen depends on Target, not the
other way around).

llvm-svn: 318490
2017-11-17 01:07:10 +00:00
Easwaran Raman 0d55b55bb6 [CodeGenPrepare] Disable div bypass when working set size is huge.
Summary:
Bypass of slow divs based on operand values is currently disabled for
-Os. Do the same when profile summary is available and the working set
size of the application is huge. This is similar to how loop peeling is
guarded by hasHugeWorkingSetSize. In the div bypass case, the generated
extra code (and the extra branch) tends to outweigh the benefits of the
bypass. This results in noticeable performance improvement on an
internal application.
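
A sketch of the guard and of what a bypassed division looks like, assuming unsigned 64-bit operands and a 32-bit fast path. Both function names and the boolean parameters are hypothetical simplifications of the queries the pass makes.

```python
def should_bypass_slow_div(opt_for_size, has_profile_summary, huge_working_set):
    """Decide whether to emit the runtime check and the narrow fast path."""
    if opt_for_size:
        return False  # already disabled for -Os
    if has_profile_summary and huge_working_set:
        return False  # the extra code and branch tend to cost more than they save
    return True


def bypassed_udiv(a, b, narrow_bits=32):
    """Unsigned divide with a runtime check for operands that fit narrow_bits."""
    if (a | b) >> narrow_bits == 0:
        return "narrow", a // b  # both operands fit: use the cheaper divide
    return "wide", a // b
```

The runtime check in `bypassed_udiv` is the extra branch the message refers to; the guard simply avoids emitting it when code size matters more.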

Reviewers: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39992

llvm-svn: 318179
2017-11-14 19:31:51 +00:00
Simon Dardis 8222160eb3 Revert "[CodeGenPrepare] Check that erased sunken address are not reused"
This reverts commit r318032. The test broke some sanitizer bots.

llvm-svn: 318049
2017-11-13 16:41:17 +00:00
Simon Dardis 8e2a5bd235 [CodeGenPrepare] Check that erased sunken address are not reused
CodeGenPrepare sinks address computations from one basic block to another
and attempts to reuse address computations that have already been sunk. If
the same address computation appears twice with the first instance as an
operand of a load whose result is an operand to a simplifiable select,
CodeGenPrepare simplifies the select and recursively erases the now dead
instructions. CodeGenPrepare then attempts to use the erased address
computation for the second load.

Fix this by erasing the cached address value if it has zero uses before
looking for the address value in the sunken address map.

This partially resolves PR35209.

Thanks to Alexander Richardson for reporting the issue!

Reviewers: john.brawn

Differential Revision: https://reviews.llvm.org/D39841

llvm-svn: 318032
2017-11-13 11:47:21 +00:00
Serguei Katkov 3664aa8658 Revert "[CGP] Enable extending scope of optimizeMemoryInst"
Revert the patch r317665 causing buildbot failures.

llvm-svn: 317667
2017-11-08 05:38:54 +00:00
Serguei Katkov ee892325bf [CGP] Enable extending scope of optimizeMemoryInst
This patch enables the folding of address computation in
a memory instruction in case the address is represented by a Phi node.

The inputs of the Phi node might differ in base register.

Differential Revision: https://reviews.llvm.org/D36073

llvm-svn: 317665
2017-11-08 05:02:51 +00:00
Craig Topper 87e715fbac [CodeGenPrepare] Fix typo in comment. NFC
llvm-svn: 317614
2017-11-07 20:56:17 +00:00
Serguei Katkov 365200295a [CGP] Disable Select instruction handling in optimizeMemoryInst. NFC
This patch disables the handling of selects in the optimization
extending the scope of optimizeMemoryInst.

The optimization itself is disabled by default.
The idea here is just to switch on the optimization level step by step.

Specifically, the optimization will first be enabled only for Phi nodes,
then select instructions will be added.

In case someone complains about performance, it will be easier to
detect which part of the optimization is responsible for that.

Differential Revision: https://reviews.llvm.org/D36073

llvm-svn: 317555
2017-11-07 09:43:08 +00:00
Serguei Katkov aee6375b02 [CGP] Fix the bug found by asan.
Try to fix the asan failure introduced by r317429.

llvm-svn: 317431
2017-11-05 07:59:02 +00:00
Serguei Katkov d5d8d54b08 [CGP] Extends the scope of optimizeMemoryInst optimization
This is an implementation of PR26223.

Currently the optimizeMemoryInst optimization tries to fold address computation
if all possible ways to compute the address are of the form

baseGV + base + scale * Index + offset
where scale and offset are constants and baseGV, base and Index are exactly
the same instructions if defined.

The patch extends this optimization to allow different bases. In this case
it tries to find/build a Phi node merging all possible bases and use this Phi node
as a base for sunk address computation. Also it supports Select instruction on
the way.

The main motivation for this scope extension is GCRelocateInst.
If there is a relocation of derived pointer it will be represented as relocation of base + offset.
Also there will be a Phi node merging address computation for relocated derived pointer
and derived pointer itself. If we have a Phi node merging original base and relocated base
and can fold the address computation of derived pointer then we can potentially reduce
the code size and Phi node for derived pointer. The latter can have a positive impact on
the register allocator.
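
The extension can be sketched with a small model of the address form `baseGV + base + scale * Index + offset`. `AddrMode`, `merge_addr_modes`, and the placeholder name `phi.base` are hypothetical; the real pass works on IR values and also searches for an existing Phi before building one.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class AddrMode:
    base_gv: Optional[str]
    base: Optional[str]
    scale: int
    index: Optional[str]
    offset: int


def merge_addr_modes(modes):
    """Merge the address modes seen on each incoming path.

    `modes` maps a predecessor block name to its AddrMode. Returns
    (common mode, phi inputs), or None when folding is not possible.
    """
    mode_list = list(modes.values())
    first = mode_list[0]
    for mode in mode_list[1:]:
        # Everything except the base must be exactly the same.
        if replace(mode, base=first.base) != first:
            return None
    if len({mode.base for mode in mode_list}) == 1:
        return first, None  # identical modes: no Phi needed
    phi_inputs = {block: mode.base for block, mode in modes.items()}
    return replace(first, base="phi.base"), phi_inputs
```

When two paths differ only in `base`, the result uses a single merged base plus the per-block inputs for the Phi; any other difference makes the sketch give up, as the original optimization did.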

Reviewers: efriedma, dberlin, mkazantsev, reames, john.brawn
Reviewed By: john.brawn
Subscribers: javed.absar, john.brawn, dneilson, llvm-commits
Differential Revision: https://reviews.llvm.org/D36073

llvm-svn: 317429
2017-11-05 05:50:33 +00:00
Adrian Prantl 261ac8b23c Invoke salvageDebugInfo from CodeGenPrepare's SinkCast()
This preserves the debug info for the cast operation in the original location.

rdar://problem/33460652

Reapplied r317340 with the test moved into an ARM-specific directory.

llvm-svn: 317375
2017-11-03 21:55:03 +00:00
Adrian Prantl 8fe9fb0ae5 Revert "Invoke salvageDebugInfo from CodeGenPrepare's SinkCast()"
This reverts commit 317342 while investigating bot breakage.

llvm-svn: 317345
2017-11-03 18:26:36 +00:00
Adrian Prantl 58e9a0bb16 Invoke salvageDebugInfo from CodeGenPrepare's SinkCast()
This preserves the debug info for the cast operation in the original location.

rdar://problem/33460652

llvm-svn: 317340
2017-11-03 18:00:02 +00:00
Clement Courbet 063bed9baf re-land "[ExpandMemCmp] Split ExpandMemCmp from CodeGen into its own pass."
Fix undefined references: ExpandMemCmp belongs to CodeGen/, not Scalar/.

llvm-svn: 317318
2017-11-03 12:12:27 +00:00
Clement Courbet 82bade615b Revert "[ExpandMemCmp] Split ExpandMemCmp from CodeGen into its own pass."
undefined reference to `llvm::TargetPassConfig::ID' on
clang-ppc64le-linux-multistage

This reverts commit eea333c33fa73ad225ef28607795984829f65688.

llvm-svn: 317213
2017-11-02 15:53:10 +00:00
Clement Courbet 1dc37b9c3b [ExpandMemCmp] Split ExpandMemCmp from CodeGen into its own pass.
Summary:
This is mostly a noop (most of the test diffs are renamed blocks).
There are a few temporary register renames (eax<->ecx) and a few blocks are
shuffled around.

See the discussion in PR33325 for more details.

Reviewers: spatel

Subscribers: mgorny

Differential Revision: https://reviews.llvm.org/D39456

llvm-svn: 317211
2017-11-02 15:02:51 +00:00
Serguei Katkov f66a59ee88 [CGP] Fix the detection of trivial case for addressing mode
The address can be presented as a bitcast of baseReg.
In this case it is still trivial but OriginalValue != baseReg.

llvm-svn: 316980
2017-10-31 07:01:35 +00:00
Philip Reames 9c3cbeea39 [CGP] Fix crash on i96 bit multiply
Issue found by llvm-isel-fuzzer on OSS fuzz, https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=3725

If anyone actually cares about > 64 bit arithmetic, there's a lot more to do in this area.  There's a bunch of obviously wrong code in the same function.  I don't have the time to fix all of them and am just using this to understand what the workflow for fixing fuzzer cases might look like.

llvm-svn: 316967
2017-10-30 23:59:51 +00:00
Clement Courbet b2c3eb8cf1 [CodeGen][ExpandMemcmp] Allow memcmp to expand to vector loads (2).
- Targets that want to support memcmp expansions now return the list of
  supported load sizes.
- Expansion codegen does not assume that all power-of-two load sizes
  smaller than the max load size are valid. For example, this is not the
  case for x86(32bit)+sse2.
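
The effect of an explicit size list can be sketched with a greedy decomposition. `plan_loads` is a hypothetical helper, and the greedy strategy is an assumption made for illustration, not necessarily how the expansion chooses its loads.

```python
def plan_loads(length, supported_sizes):
    """Greedily cover `length` bytes using only the supported load sizes."""
    plan = []
    remaining = length
    for size in sorted(supported_sizes, reverse=True):
        while remaining >= size:
            plan.append(size)
            remaining -= size
    if remaining:
        raise ValueError("length cannot be covered by the supported load sizes")
    return plan
```

For a 31-byte compare, a target that offers 16-byte vector loads but no 8-byte loads gets `plan_loads(31, [16, 4, 2, 1])`, which uses three 4-byte loads where a power-of-two assumption would have picked one invalid 8-byte load.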

Fixes PR34887.

llvm-svn: 316905
2017-10-30 14:19:33 +00:00
Clement Courbet e1eafe0a54 [CodeGen] Fix -Wunused-private-field warning on lld-x86_64-darwin13.
llvm-svn: 316765
2017-10-27 13:34:41 +00:00
Clement Courbet be684eee82 [CodeGen][ExpandMemCmp][NFC] Simplify load sequence generation.
llvm-svn: 316763
2017-10-27 12:34:18 +00:00
Balaram Makam 32bcb5d7fb Revert "[CGP] Merge empty case blocks if no extra moves are added."
This reverts commit r316711. The domtree isn't getting updated correctly.

llvm-svn: 316721
2017-10-27 00:35:18 +00:00
Balaram Makam cddf3c5e1c [CGP] Merge empty case blocks if no extra moves are added.
Summary:
Currently we skip merging when extra moves may be added in the header of switch instead of the case block, if the case block is used as an incoming
block of a PHI. If all the incoming values of the PHIs are non-constants and the destination block is dominated by the switch block then extra moves are likely not added by ISel, so there is no need to skip merging in this case.

Reviewers: efriedma, junbuml, davidxl, hfinkel, qcolombet

Reviewed By: efriedma

Subscribers: dberlin, kuhar, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D37343

llvm-svn: 316711
2017-10-26 22:34:01 +00:00