Commit Graph

354 Commits

Author SHA1 Message Date
Jinsong Ji bf544fa1c3 Revert "[PowerPC] Remove QPX/A2Q BGQ/BGP CNK support"
This reverts commit adffce7153.

This is breaking test-suite, revert while investigation.
2020-07-27 21:07:00 +00:00
Jinsong Ji adffce7153 [PowerPC] Remove QPX/A2Q BGQ/BGP CNK support
Per RFC http://lists.llvm.org/pipermail/llvm-dev/2020-April/141295.html
no one is making use of QPX/A2Q/BGQ/BGP CNK anymore.

This patch remove the support of QPX/A2Q in llvm, BGQ/BGP in clang,
CNK support in openmp/polly.

Reviewed By: hfinkel

Differential Revision: https://reviews.llvm.org/D83915
2020-07-27 19:24:39 +00:00
John Brawn 20854d85e1 [DSE,MSSA] Recognise init_trampoline in getLocForWriteEx
This fixes an instance where MemorySSA-using Dead Store Elimination is failing
to do a transformation that the non-MemorySSA-using version does.

Differential Revision: https://reviews.llvm.org/D83783
2020-07-15 12:18:58 +01:00
Florian Hahn 80970ac875 [DSE,MSSA] Eliminate stores by terminators (free,lifetime.end).
This patch adds support for eliminating stores by free & lifetime.end
calls. We can remove stores that are not read before calling a memory
terminator and we can eliminate all stores after a memory terminator
until we see a new lifetime.start. The second case seems to not really
trigger much in practice though.

Reviewers: dmgreen, rnk, efriedma, bryant, asbirlea, Tyker

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D72410
2020-07-08 08:59:46 +01:00
Nuno Lopes 7f903873b8 DSE: fix builtin function recognition to take decl into account 2020-07-02 10:28:47 +01:00
Arthur Eubanks 339eed5d0b [NewPM][BasicAA] basicaa -> basic-aa in Transforms/DeadStoreElimination
Summary: Following https://reviews.llvm.org/D82607.

Reviewers: ychen

Subscribers: george.burgess.iv, asbirlea, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82689
2020-06-26 20:13:37 -07:00
Florian Hahn 4837daf883 [DSE,MSSA] Check if Def is removable only wen we try to remove it.
Non-removable MemoryDefs can still eliminate other defs. Update the
isRemovable checks to only candidates for removal.
2020-06-25 14:01:10 +01:00
Florian Hahn ab27603c6d [DSE,MSSA] Add missing -enable-dse-memoryssa flag to test. 2020-06-24 14:07:30 +01:00
Florian Hahn 4e62c6359c [DSE] Eliminate stores at the end of the function.
This patch add support for eliminating MemoryDefs that do not have any
aliasing users, which indicates that there are no reads/writes to the
memory location until the end of the function.

To eliminate such defs, we have to ensure that the underlying object is
not visible in the caller and does not escape via returning. We need a
separate check for that, as InvisibleToCaller does not consider returns.

Reviewers: dmgreen, rnk, efriedma, bryant, asbirlea, Tyker, george.burgess.iv

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D72631
2020-06-24 12:58:20 +01:00
Florian Hahn 7b72cb47e6 [DSE,MSSA] Precommit small test changes for D72631. 2020-06-24 10:17:09 +01:00
Florian Hahn ff4de8683a [DSE,MSSA] Treat `store 0` after calloc as noop stores.
This patch extends storeIsNoop to also detect stores of 0 to an calloced
object. This basically ports the logic from legacy DSE to the MemorySSA
backed version.

It triggers in a few cases on MultiSource, SPEC2000, SPEC2006 with -O3
LTO:

Same hash: 218 (filtered out)
Remaining: 19
Metric: dse.NumNoopStores

Program                                        base   patch2 diff
 test-suite...CFP2000/177.mesa/177.mesa.test     1.00  15.00 1400.0%
 test-suite...6/482.sphinx3/482.sphinx3.test     1.00  14.00 1300.0%
 test-suite...lications/ClamAV/clamscan.test     2.00  28.00 1300.0%
 test-suite...CFP2006/433.milc/433.milc.test     1.00   8.00 700.0%
 test-suite...pplications/oggenc/oggenc.test     2.00   9.00 350.0%
 test-suite.../CINT2000/176.gcc/176.gcc.test     6.00   6.00  0.0%
 test-suite.../CINT2006/403.gcc/403.gcc.test    NaN   137.00  nan%
 test-suite...libquantum/462.libquantum.test    NaN     3.00  nan%
 test-suite...6/464.h264ref/464.h264ref.test    NaN     7.00  nan%
 test-suite...decode/alacconvert-decode.test    NaN     2.00  nan%
 test-suite...encode/alacconvert-encode.test    NaN     2.00  nan%
 test-suite...ications/JM/ldecod/ldecod.test    NaN     9.00  nan%
 test-suite...ications/JM/lencod/lencod.test    NaN    39.00  nan%
 test-suite.../Applications/lemon/lemon.test    NaN     2.00  nan%
 test-suite...pplications/treecc/treecc.test    NaN     4.00  nan%
 test-suite...hmarks/McCat/08-main/main.test    NaN     4.00  nan%
 test-suite...nsumer-lame/consumer-lame.test    NaN     3.00  nan%
 test-suite.../Prolangs-C/bison/mybison.test    NaN     1.00  nan%
 test-suite...arks/mafft/pairlocalalign.test    NaN    30.00  nan%

Reviewers: efriedma, zoecarver, asbirlea

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D82204
2020-06-23 21:01:39 +01:00
Florian Hahn 328c8642e2 [DSE,MSSA] Reorder DSE blocking checks.
Currently we stop exploring candidates too early in some cases.

In particular, we can continue checking the defining accesses of
non-removable MemoryDefs and defs without analyzable write location
(read clobbers are already ruled out using MemorySSA at this point).
2020-06-22 17:16:34 +01:00
Florian Hahn b458d8ce95 [DSE,MSSA] Add additional tests with non-removable stores (NFC).
Add a few additional tests with volatile stores, which cannot be
removedt stas.
2020-06-22 16:22:25 +01:00
Florian Hahn 40569db7b3 [DSE,MSSA] Move reachability check to main loop.
As we traverse the CFG backwards, we could end up reaching unreachable
blocks. For unreachable blocks, we won't have computed post order
numbers and because DomAccess is reachable, unreachable blocks cannot be
on any path from it.

This fixes a crash with unreachable blocks.
2020-06-21 16:38:10 +01:00
Florian Hahn 88f722c269 [DSE,MSSA] Enable XFAIL'd merge-stores.ll test (NFC).
All cases in the test are supported now, it only still failed because an
over-eager regex match not accounting for `, align ` being added to each
load/store now.
2020-06-19 12:10:47 +01:00
Florian Hahn 120c059292 [DSE,MSSA] Port partial store merging.
Port partial constant store merging logic to MemorySSA backed DSE. The
heavy lifting is done by the existing helper function. It is used in
context where we already ensured that the later instruction can
eliminate the earlier one, if it is a complete overwrite.
2020-06-15 18:41:46 +01:00
Florian Hahn 8c61f13a0f [DSE,MSSA] Delete instructions after printing it.
Also enables a now-passing test case, that exposed a crash caused by the
wrong order.
2020-06-15 16:01:36 +01:00
Florian Hahn 979720a9bb [DSE,MSSA] Add additional merging test cases (NFC).
Additional tests added ahead of partial overlapping store merging.
2020-06-15 15:45:07 +01:00
Florian Hahn 97e7147e34 [DSE,MSSA] Fix location order in isOverwrite call.
isOverwrite expects the later location as first argument and the earlier
result later. The adjusted call is intended to check whether CC
overwrites DefLoc.
2020-06-13 20:39:00 +01:00
Florian Hahn 67671024c8 [DSE,MSSA] Relax post-dom restriction for objs visible after return.
This patch relaxes the post-dominance requirement for accesses to
objects visible after the function returns.

Instead of requiring the killing def to post-dominate the access to
eliminate, the set of 'killing blocks' (= blocks that completely
overwrite the original access) is collected.

If all paths from the access to eliminate and an exit block go through a
killing block, the access can be removed.

To check this property, we first get the common post-dominator block for
the killing blocks. If this block does not post-dominate the access
block, there may be a path from DomAccess to an exit block not involving
any killing block.

Otherwise we have to check if there is a path from the DomAccess to the
common post-dominator, that does not contain a killing block. If there
is no such path, we can remove DomAccess. For this check, we start at
the common post-dominator and then traverse the CFG backwards. Paths are
terminated when we hit a killing block or a block that is not executed
between DomAccess and a killing block according to the post-order
numbering (if the post order number of a block is greater than the one
of DomAccess, the block cannot be in in a path starting at DomAccess).

This gives the following improvements on the total number of stores
after DSE for MultiSource, SPEC2K, SPEC2006:

Tests: 237
Same hash: 206 (filtered out)
Remaining: 31
Metric: dse.NumRemainingStores

Program                                        base      new100    diff
 test-suite...CFP2000/188.ammp/188.ammp.test   3624.00   3544.00   -2.2%
 test-suite...ch/g721/g721encode/encode.test   128.00    126.00    -1.6%
 test-suite.../Benchmarks/Olden/mst/mst.test    73.00     72.00    -1.4%
 test-suite...CFP2006/433.milc/433.milc.test   3202.00   3163.00   -1.2%
 test-suite...000/186.crafty/186.crafty.test   5062.00   5010.00   -1.0%
 test-suite...-typeset/consumer-typeset.test   40460.00  40248.00  -0.5%
 test-suite...Source/Benchmarks/sim/sim.test   642.00    639.00    -0.5%
 test-suite...nchmarks/McCat/09-vor/vor.test   642.00    644.00     0.3%
 test-suite...lications/sqlite3/sqlite3.test   35664.00  35563.00  -0.3%
 test-suite...T2000/300.twolf/300.twolf.test   7202.00   7184.00   -0.2%
 test-suite...lications/ClamAV/clamscan.test   19475.00  19444.00  -0.2%
 test-suite...INT2000/164.gzip/164.gzip.test   2199.00   2196.00   -0.1%
 test-suite...peg2/mpeg2dec/mpeg2decode.test   2380.00   2378.00   -0.1%
 test-suite.../Benchmarks/Bullet/bullet.test   39335.00  39309.00  -0.1%
 test-suite...:: External/Povray/povray.test   36951.00  36927.00  -0.1%
 test-suite...marks/7zip/7zip-benchmark.test   67396.00  67356.00  -0.1%
 test-suite...6/464.h264ref/464.h264ref.test   31497.00  31481.00  -0.1%
 test-suite...006/453.povray/453.povray.test   51441.00  51416.00  -0.0%
 test-suite...T2006/401.bzip2/401.bzip2.test   4450.00   4448.00   -0.0%
 test-suite...Applications/kimwitu++/kc.test   23481.00  23471.00  -0.0%
 test-suite...chmarks/MallocBench/gs/gs.test   6286.00   6284.00   -0.0%
 test-suite.../CINT2000/254.gap/254.gap.test   13719.00  13715.00  -0.0%
 test-suite.../Applications/SPASS/SPASS.test   30345.00  30338.00  -0.0%
 test-suite...006/450.soplex/450.soplex.test   15018.00  15016.00  -0.0%
 test-suite...ications/JM/lencod/lencod.test   27780.00  27777.00  -0.0%
 test-suite.../CINT2006/403.gcc/403.gcc.test   105285.00 105276.00 -0.0%

There might be potential to pre-compute some of the information of which
blocks are on the path to an exit for each block, but the overall
benefit might be comparatively small.

On the set of benchmarks, 15738 times out of 20322 we reach the
CFG check, the CFG check is successful. The total number of iterations
in the CFG check is 187810, so on average we need less than 10 steps in
the check loop. Bumping the threshold in the loop from 50 to 150 gives a
few small improvements, but I don't think they warrant such a big bump
at the moment. This is all pending further tuning in the future.

Reviewers: dmgreen, bryant, asbirlea, Tyker, efriedma, george.burgess.iv

Reviewed By: george.burgess.iv

Differential Revision: https://reviews.llvm.org/D78932
2020-06-10 10:39:25 +01:00
zoecarver 065bf124fd [DSE] Remove noop stores in MSSA.
Adds a simple fast-path check for the pattern:
v = load ptr
store v to ptr

I took the tests from the bugzilla post, I can add more if needed (but I think these should be sufficent).

Refs: https://bugs.llvm.org/show_bug.cgi?id=45795

Differential Revision: https://reviews.llvm.org/D79391
2020-05-30 09:57:30 -07:00
Florian Hahn 7a325c14b4 [DSE,MSSA] Add additional multiblock tests. 2020-05-22 18:24:43 +01:00
Arthur Eubanks 8a88755610 Reland [X86] Codegen for preallocated
See https://reviews.llvm.org/D74651 for the preallocated IR constructs
and LangRef changes.

In X86TargetLowering::LowerCall(), if a call is preallocated, record
each argument's offset from the stack pointer and the total stack
adjustment. Associate the call Value with an integer index. Store the
info in X86MachineFunctionInfo with the integer index as the key.

This adds two new target independent ISDOpcodes and two new target
dependent Opcodes corresponding to @llvm.call.preallocated.{setup,arg}.

The setup ISelDAG node takes in a chain and outputs a chain and a
SrcValue of the preallocated call Value. It is lowered to a target
dependent node with the SrcValue replaced with the integer index key by
looking in X86MachineFunctionInfo. In
X86TargetLowering::EmitInstrWithCustomInserter() this is lowered to an
%esp adjustment, the exact amount determined by looking in
X86MachineFunctionInfo with the integer index key.

The arg ISelDAG node takes in a chain, a SrcValue of the preallocated
call Value, and the arg index int constant. It produces a chain and the
pointer fo the arg. It is lowered to a target dependent node with the
SrcValue replaced with the integer index key by looking in
X86MachineFunctionInfo. In
X86TargetLowering::EmitInstrWithCustomInserter() this is lowered to a
lea of the stack pointer plus an offset determined by looking in
X86MachineFunctionInfo with the integer index key.

Force any function containing a preallocated call to use the frame
pointer.

Does not yet handle a setup without a call, or a conditional call.
Does not yet handle musttail. That requires a LangRef change first.

Tried to look at all references to inalloca and see if they apply to
preallocated. I've made preallocated versions of tests testing inalloca
whenever possible and when they make sense (e.g. not alloca related,
inalloca edge cases).

Aside from the tests added here, I checked that this codegen produces
correct code for something like

```
struct A {
        A();
        A(A&&);
        ~A();
};

void bar() {
        foo(foo(foo(foo(foo(A(), 4), 5), 6), 7), 8);
}
```

by replacing the inalloca version of the .ll file with the appropriate
preallocated code. Running the executable produces the same results as
using the current inalloca implementation.

Reverted due to unexpectedly passing tests, added REQUIRES: asserts for reland.

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77689
2020-05-20 11:25:44 -07:00
Arthur Eubanks b8cbff51d3 Revert "[X86] Codegen for preallocated"
This reverts commit 810567dc69.

Some tests are unexpectedly passing
2020-05-20 10:04:55 -07:00
Arthur Eubanks 810567dc69 [X86] Codegen for preallocated
See https://reviews.llvm.org/D74651 for the preallocated IR constructs
and LangRef changes.

In X86TargetLowering::LowerCall(), if a call is preallocated, record
each argument's offset from the stack pointer and the total stack
adjustment. Associate the call Value with an integer index. Store the
info in X86MachineFunctionInfo with the integer index as the key.

This adds two new target independent ISDOpcodes and two new target
dependent Opcodes corresponding to @llvm.call.preallocated.{setup,arg}.

The setup ISelDAG node takes in a chain and outputs a chain and a
SrcValue of the preallocated call Value. It is lowered to a target
dependent node with the SrcValue replaced with the integer index key by
looking in X86MachineFunctionInfo. In
X86TargetLowering::EmitInstrWithCustomInserter() this is lowered to an
%esp adjustment, the exact amount determined by looking in
X86MachineFunctionInfo with the integer index key.

The arg ISelDAG node takes in a chain, a SrcValue of the preallocated
call Value, and the arg index int constant. It produces a chain and the
pointer fo the arg. It is lowered to a target dependent node with the
SrcValue replaced with the integer index key by looking in
X86MachineFunctionInfo. In
X86TargetLowering::EmitInstrWithCustomInserter() this is lowered to a
lea of the stack pointer plus an offset determined by looking in
X86MachineFunctionInfo with the integer index key.

Force any function containing a preallocated call to use the frame
pointer.

Does not yet handle a setup without a call, or a conditional call.
Does not yet handle musttail. That requires a LangRef change first.

Tried to look at all references to inalloca and see if they apply to
preallocated. I've made preallocated versions of tests testing inalloca
whenever possible and when they make sense (e.g. not alloca related,
inalloca edge cases).

Aside from the tests added here, I checked that this codegen produces
correct code for something like

```
struct A {
        A();
        A(A&&);
        ~A();
};

void bar() {
        foo(foo(foo(foo(foo(A(), 4), 5), 6), 7), 8);
}
```

by replacing the inalloca version of the .ll file with the appropriate
preallocated code. Running the executable produces the same results as
using the current inalloca implementation.

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77689
2020-05-20 09:20:38 -07:00
Eli Friedman 11aa3707e3 StoreInst should store Align, not MaybeAlign
This is D77454, except for stores.  All the infrastructure work was done
for loads, so the remaining changes necessary are relatively small.

Differential Revision: https://reviews.llvm.org/D79968
2020-05-15 12:26:58 -07:00
Eli Friedman 4532a50899 Infer alignment of unmarked loads in IR/bitcode parsing.
For IR generated by a compiler, this is really simple: you just take the
datalayout from the beginning of the file, and apply it to all the IR
later in the file. For optimization testcases that don't care about the
datalayout, this is also really simple: we just use the default
datalayout.

The complexity here comes from the fact that some LLVM tools allow
overriding the datalayout: some tools have an explicit flag for this,
some tools will infer a datalayout based on the code generation target.
Supporting this properly required plumbing through a bunch of new
machinery: we want to allow overriding the datalayout after the
datalayout is parsed from the file, but before we use any information
from it. Therefore, IR/bitcode parsing now has a callback to allow tools
to compute the datalayout at the appropriate time.

Not sure if I covered all the LLVM tools that want to use the callback.
(clang? lli? Misc IR manipulation tools like llvm-link?). But this is at
least enough for all the LLVM regression tests, and IR without a
datalayout is not something frontends should generate.

This change had some sort of weird effects for certain CodeGen
regression tests: if the datalayout is overridden with a datalayout with
a different program or stack address space, we now parse IR based on the
overridden datalayout, instead of the one written in the file (or the
default one, if none is specified). This broke a few AVR tests, and one
AMDGPU test.

Outside the CodeGen tests I mentioned, the test changes are all just
fixing CHECK lines and moving around datalayout lines in weird places.

Differential Revision: https://reviews.llvm.org/D78403
2020-05-14 13:03:50 -07:00
zoecarver f65f566aeb Re-commit: Mark values as trivially dead when their only use is a start or end lifetime intrinsic.
Summary:
If the only use of a value is a start or end lifetime intrinsic then mark the intrinsic as trivially dead. This should allow for that value to then be removed as well.

Currently, this only works for allocas, globals, and arguments.

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79355
2020-05-08 12:24:10 -07:00
zoecarver 1998e796e9 Revert "Mark values as trivially dead when their only use is a start or end lifetime intrinsic."
This reverts commit 95aa28cc8f.
2020-05-06 11:07:22 -07:00
zoecarver 95aa28cc8f Mark values as trivially dead when their only use is a start or end lifetime intrinsic.
Summary:
If the only use of a value is a start or end lifetime intrinsic then mark the intrinsic as trivially dead. This should allow for that value to then be removed as well.

Currently, this only works for allocas, globals, and arguments.

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79355
2020-05-06 10:58:08 -07:00
Florian Hahn e018b8bbb0 [DSE,MSSA] Add multi-path tests with readnone throwing calls. 2020-04-29 10:30:05 +01:00
Florian Hahn 46a04940e8 [DSE] Add stat for remaining stores after DSE.
Using the existing NumFastStores statistic can be misleading when
comparing the impact of DSE patches.

For example, consider the case where a store gets removed from a
function before it is inlined into another function. A less
powerful DSE might only remove the store from functions it has
been inlined into, which will result in more stores being removed, but
no difference in the actual number of stores after DSE.

The new stat provides the absolute number of stores surviving after
DSE.

Reviewers: dmgreen, bryant, asbirlea, jfb

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D78830
2020-04-25 16:12:55 +01:00
Florian Hahn b578608256 [DSE,MSSA] Add use of alloca, to guard against removal in the future.
Currently the alloca does not escape and all stores and the memset can
be removed. Adding a use of the alloca ensures not all stores are
eliminated.
2020-04-15 15:23:43 +01:00
Florian Hahn cf9ee49b4d [DSE] Lift post-dominance for objs not accessible in caller.
We can eliminate MemoryDefs of objects not accessible after the function
returns (e.g. alloca), if there are no reads between the MemoryDef and
any function exits. We can stop traversing paths that completely
overwrite the memory location of the MemoryDef.

This patch was split off D73763.

Reviewers: dmgreen, bryant, asbirlea, Tyker, efriedma, george.burgess.iv

Reviewed By: asbirlea, george.burgess.iv

Differential Revision: https://reviews.llvm.org/D77736
2020-04-15 11:37:14 +01:00
Tyker 086de7673e [AssumeBundles] preserve knowledge in DSE
Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77404
2020-04-14 12:48:15 +02:00
Florian Hahn 4184b2e034 [DSE,MSSA] Add additional test cases for multi-path elimination (NFC).
This adds additional test cases for more scenarios and also with objects
that are accessible after the functions return and allocas.
2020-04-08 15:26:26 +01:00
Uday Bondhugula c0955edfd6 Introduce support for lib function aligned_alloc in TLI / memory builtins
Aligned_alloc is a standard lib function and has been in glibc since
2.16 and in the C11 standard. It has semantics similar to malloc/calloc
for several analyses/transforms. This patch introduces aligned_alloc
in target library info and memory builtins. Subsequent ones will
make other passes aware and fix https://bugs.llvm.org/show_bug.cgi?id=44062

This change will also be useful to LLVM generators that need to allocate
buffers of vector elements larger than 16 bytes (for eg. 256-bit ones),
element boundary alignment for which is not typically provided by glibc malloc.

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Differential Revision: https://reviews.llvm.org/D76970
2020-03-29 23:36:24 +05:30
Florian Hahn ece6cf0fa5 [DSE,MSSA] Precommit additional tests for D73763. 2020-03-20 13:39:46 +00:00
Florian Hahn 3a8372ed02 [DSE] Support traversing MemoryPhis.
For MemoryPhis, we have to avoid that the MemoryPhi may be executed
before before the access we are currently looking at.

To do this we do a post-order numbering of the basic blocks in the
function and bail out once we reach a MemoryPhi with a larger (or equal)
post-order block number than the current MemoryAccess.
This changes the order in which we visit stores for elimination.

This patch also adds support for exploring multiple paths. We keep a worklist (ToCheck) of memory accesses that might be eliminated by our starting MemoryDef or MemoryPhis for further exploration.  For MemoryPhis, we add the incoming values to the worklist, for MemoryDefs we add the defining access.

Reviewers: dmgreen, rnk, efriedma, bryant, asbirlea

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D72148
2020-03-20 07:51:42 +00:00
Artur Pilipenko 02e3d5c3a2 Fix DSE miscompile when store is clobbered across loop iterations
DSE would mistakenly remove store (2):

  a = calloc(n+1)
  for (int i = 0; i < n; i++) {
    store 1, a[i+1] // (1)
    store 0, a[i]   // (2)
  }

The fix is to do PHI transaltion while looking for clobbering
instructions between the store and the calloc.

Reviewed By: efriedma, bjope

Differential Revision: https://reviews.llvm.org/D68006
2020-02-27 14:43:01 -08:00
Florian Hahn b8d638d337 [DSE,MSSA] Do not attempt to remove un-removable memdefs.
We have to skip MemoryDefs that cannot be removed. This fixes a crash in
the newly added test case and fixes a wrong case in
memset-and-memcpy.ll.
2020-02-25 13:31:46 +00:00
Florian Hahn af69d5e10e [DSE] Track overlapping stores.
Add a map from BasicBlocks to overlap intervals. For partial writes, we
can keep track of those in IOLs. We only add candidates that are valid
for eliminations.

Reviewers: dmgreen, bryant, asbirlea, Tyker

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D73757
2020-02-23 15:44:40 +00:00
Florian Hahn deb0a8bfc4 [DSE,MSSA] Dbg counters required assertions. Mark test accordingly. 2020-02-21 17:34:34 +00:00
Florian Hahn 134bab7cd5 [DSE,MSSA] Add debug counter.
Can be used like
-debug-counter=dse-memoryssa-skip=10,dse-memoryssa-counter-count=20

Reviewers: dmgreen, rnk, efriedma, bryant, asbirlea

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D72147
2020-02-21 17:04:37 +00:00
Florian Hahn 81dbb6aec6 Recommit "[DSE] Add first version of MemorySSA-backed DSE (Bottom up walk)."
This includes a fix for the santizier failures.

This reverts the revert commit
42f8b915eb.
2020-02-12 14:17:50 +00:00
Kadir Cetinkaya 42f8b915eb
Revert "[DSE] Add first version of MemorySSA-backed DSE (Bottom up walk)."
This reverts commit d0c4d4fe09.

Revert "[DSE,MSSA] Move more passing test cases from todo to simple.ll."

This reverts commit 02266e64bb.

Revert "[DSE,MSSA] Adjust mda-with-dbg-values.ll to MSSA backed DSE."

This reverts commit 74f03e4ff0.
2020-02-11 15:34:48 +01:00
Florian Hahn 74f03e4ff0 [DSE,MSSA] Adjust mda-with-dbg-values.ll to MSSA backed DSE.
-memdep-block-scan-limit is not relevant with MSSA.
2020-02-10 15:24:00 +00:00
Florian Hahn 02266e64bb [DSE,MSSA] Move more passing test cases from todo to simple.ll. 2020-02-10 12:38:17 +00:00
Florian Hahn d0c4d4fe09 [DSE] Add first version of MemorySSA-backed DSE (Bottom up walk).
This patch adds a first version of a MemorySSA based DSE. It is missing
a lot of features, which will get added as follow-ups, to help to keep
the review manageable.

The patch uses the following general approach: given a MemoryDef, walk
upwards to find clobbering MemoryDefs that may be killed by the
starting def. Then check that there are no uses that may read the
location of the original MemoryDef in between both MemoryDefs. A bit
more concretely:

For all MemoryDefs StartDef:
1. Get the next dominating clobbering MemoryDef (DomAccess) by walking upwards.
2. Check that there no reads between DomAccess and the StartDef by checking
   all uses starting at DomAccess and walking until we see StartDef.
3. For each found DomDef, check that:
  1. There are no barrier instructions between DomDef and StartDef (like
     throws or stores with ordering constraints).
  2. StartDef is executed whenever DomDef is executed.
3. StartDef completely overwrites DomDef.
4. Erase DomDef from the function and MemorySSA.

The patch uses a very simple approach to guarantee that no throwing
instructions are between 2 stores: We only allow accesses to stack
objects, access that are in the same basic block if the block does not
contain any throwing instructions or accesses in functions that do
not contain any throwing instructions. This will get lifted later.

Besides adding support for the missing cases, there is plenty of additional
potential for improvements as follow-up work, e.g. the way we visit stores
(could be just a traversal of the MemorySSA, rather than collecting them
up-front), using the alias information discovered during walking to optimize
the MemorySSA.

This is loosely based on D40480 by Dave Green.

Reviewers: dmgreen, rnk, efriedma, bryant, asbirlea, Tyker

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D72700
2020-02-10 11:52:11 +00:00
Florian Hahn da52b9c118 [DSE] Add tests for MemorySSA based DSE.
This copies the DSE tests into a MSSA subdirectory to test the MemorySSA
backed DSE implementation, without disturbing the original tests.

Differential Revision: https://reviews.llvm.org/D72145
2020-02-10 10:28:43 +00:00
Florian Hahn 0b21d55262 [IR] Mark memset.* intrinsics as IntrWriteMem.
llvm.memset intrinsics do only write memory, but are missing
IntrWriteMem, so they doesNotReadMemory() returns false for them.

The test change is due to the test checking the fn attribute ids at the
call sites, which got bumped up due to a new combination with writeonly
appearing in the test file.

Reviewers: jdoerfert, reames, efriedma, nlopes, lebedev.ri

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D72789
2020-01-16 10:35:46 +00:00
Nuno Lopes 87407fc03c DSE: fix bug where we would only check libcalls for name rather than whole decl 2020-01-11 11:57:29 +00:00
Ankit 369a919514 Fix for a dangling point bug in DeadStoreElimination pass
The patch makes sure that the LastThrowing pointer does not point to any instruction deleted by call to DeleteDeadInstruction.

While iterating through the instructions the pass maintains a pointer to the lastThrowing Instruction. A call to deleteDeadInstruction deletes a dead store and other instructions feeding the original dead instruction which also become dead. The instruction pointed by the lastThrowing pointer could also be deleted by the call to DeleteDeadInstruction and thus it becomes a dangling pointer. Because of this, we see an error in the next iteration.

In the patch, we maintain a list of throwing instructions encountered previously and use the last non deleted throwing instruction from the container.

Reviewers: fhahn, bcahoon, efriedma

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D65326
2020-01-03 14:28:44 +00:00
Fangrui Song 502a77f125 Migrate function attribute "no-frame-pointer-elim" to "frame-pointer"="all" as cleanups after D56351 2019-12-24 15:57:33 -08:00
Florian Hahn 19071173fc Revert "[DSE] Fix for a dangling point bug in DeadStoreElimination."
The commit causes a failure:
http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/20911

This reverts commit 1847fd9d85.
2019-12-05 19:29:21 +00:00
Ankit 1847fd9d85 [DSE] Fix for a dangling point bug in DeadStoreElimination.
The patch makes sure that the LastThrowing pointer does not point to any instruction deleted by call to DeleteDeadInstruction.

While iterating through the instructions the pass maintains a pointer to the lastThrowing Instruction. A call to deleteDeadInstruction deletes a dead store and other instructions feeding the original dead instruction which also become dead. The instruction pointed by the lastThrowing pointer could also be deleted by the call to DeleteDeadInstruction and thus it becomes a dangling pointer. Because of this, we see an error in the next iteration.

In the patch, we maintain a list of throwing instructions encountered previously and use the last non deleted throwing instruction from the container.

Patch by Ankit <quic_aankit@quicinc.com>

Reviewers: fhahn, bcahoon, efriedma

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D65326
2019-12-05 17:53:58 +00:00
Hideto Ueno cc0a4cdc89 [FunctionAttrs] Annotate "willreturn" for intrinsics
Summary:
In D62801, new function attribute `willreturn` was introduced. In short, a function with `willreturn` is guaranteed to come back to the call site(more precise definition is in LangRef).

In this patch, willreturn is annotated for LLVM intrinsics.

Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: jvesely, nhaehnle, sstefan1, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64904

llvm-svn: 367184
2019-07-28 06:09:56 +00:00
Bjorn Pettersson d63a2bb35f [DSE] Bugfix to avoid PartialStoreMerging involving non byte-sized stores
Summary:
The DeadStoreElimination pass now skips doing
PartialStoreMerging when stores overlap according to
OW_PartialEarlierWithFullLater and at least one of
the stores is having a store size that is different
from the size of the type being stored.

This solves problems seen in
  https://bugs.llvm.org/show_bug.cgi?id=41949
for which we in the past could end up with
mis-compiles or assertions.

The content and location of the padding bits is not
formally described (or undefined) in the LangRef
at the moment. So the solution is chosen based on
that we cannot assume anything about the padding bits
when having a store that clobbers more memory than
indicated by the type of the value that is stored
(such as storing an i6 using an 8-bit store instruction).

Fixes: https://bugs.llvm.org/show_bug.cgi?id=41949

Reviewers: spatel, efriedma, fhahn

Reviewed By: efriedma

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62250

llvm-svn: 361605
2019-05-24 08:32:02 +00:00
Eric Christopher cee313d288 Revert "Temporarily Revert "Add basic loop fusion pass.""
The reversion apparently deleted the test/Transforms directory.

Will be re-reverting again.

llvm-svn: 358552
2019-04-17 04:52:47 +00:00
Eric Christopher a863435128 Temporarily Revert "Add basic loop fusion pass."
As it's causing some bot failures (and per request from kbarton).

This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda.

llvm-svn: 358546
2019-04-17 02:12:23 +00:00
Jeremy Morse 32afe6a1f8 [DebugInfo] Fix pr41175 Dead Store Elimination missing debug loc
Bug: https://bugs.llvm.org/show_bug.cgi?id=41175

In the bug test case the DSE pass is shortening the range of memory that a
memset is working on. A getelementptr is generated so that the new
starting address can be passed to memset. This instruction was not given
a DebugLoc.

To fix the bug, copy the DebugLoc from the memset instruction.

Patch by Orlando Cazalet-Hyams!

Differential Revision: https://reviews.llvm.org/D60556

llvm-svn: 358270
2019-04-12 09:47:35 +00:00
Craig Topper 35f55d72f6 [X86] Remove IntrArgMemOnly from target specific gather/scatter intrinsics
IntrArgMemOnly implies that only memory pointed to by pointer typed arguments will be accessed. But these intrinsics allow you to pass null to the pointer argument and put the full address into the index argument. Other passes won't be able to understand this.

A colleague found that ISPC was creating gathers like this and then dead store elimination removed some stores because it didn't understand what the gather was doing since the pointer argument was null.

Differential Revision: https://reviews.llvm.org/D58805

llvm-svn: 355228
2019-03-01 21:02:40 +00:00
Craig Topper e6bfb0919c [X86] Add test case for D58805. NFC
This demonstrates dead store elimination removing a store that may alias a gather that uses null as its base.

llvm-svn: 355227
2019-03-01 21:02:34 +00:00
Jeremy Morse b33a5c7347 [DebugInfo] Don't salvage load operations (PR40628).
Salvaging a redundant load instruction into a debug expression hides a
memory read from optimisation passes. Passes that alter memory behaviour
(such as LICM promoting memory to a register) aren't aware of these debug
memory reads and leave them unaltered, making the debug variable location
point somewhere unsafe.

Teaching passes to know about these debug memory reads would be challenging
and probably incomplete. Finding dbg.value instructions that need to be fixed
would likely be computationally expensive too, as more analysis would be
required. It's better to not generate debug-memory-reads instead, alas.

Changed tests:
 * DeadStoreElim: test for salvaging of intermediate operations contributing
   to the dead store, instead of salvaging of the redundant load,
 * GVN: remove debuginfo behaviour checks completely, this behaviour is still
   covered by other tests,
 * InstCombine: don't test for salvaged loads, we're removing that behaviour.

Differential Revision: https://reviews.llvm.org/D57962

llvm-svn: 353824
2019-02-12 10:54:30 +00:00
Reid Kleckner 40e7663b1f [BasicAA] Don't assume tail calls with byval don't alias allocas
Summary:
Calls marked 'tail' cannot read or write allocas from the current frame
because the current frame might be destroyed by the time they run.
However, a tail call may use an alloca with byval. Calling with byval
copies the contents of the alloca into argument registers or stack
slots, so there is no lifetime issue. Tail calls never modify allocas,
so we can return just ModRefInfo::Ref.

Fixes PR38466, a longstanding bug.

Reviewers: hfinkel, nlewycky, gbiv, george.burgess.iv

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D50679

llvm-svn: 339636
2018-08-14 01:24:35 +00:00
Piotr Padlewski 5b3db45e8f Implement strip.invariant.group
Summary:
This patch introduce new intrinsic -
strip.invariant.group that was described in the
RFC: Devirtualization v2

Reviewers: rsmith, hfinkel, nlopes, sanjoy, amharc, kuhar

Subscribers: arsenm, nhaehnle, JDevlieghere, hiraditya, xbolva00, llvm-commits

Differential Revision: https://reviews.llvm.org/D47103

Co-authored-by: Krzysztof Pszeniczny <krzysztof.pszeniczny@gmail.com>
llvm-svn: 336073
2018-07-02 04:49:30 +00:00
Vedant Kumar 6d354ed72e [Debugify] Move debug value intrinsics closer to their operand defs
Before this patch, debugify would insert debug value intrinsics before the
terminating instruction in a block. This had the advantage of being simple,
but was a bit too simple/unrealistic.

This patch teaches debugify to insert debug values immediately after their
operand defs. This enables better testing of the compiler.

For example, with this patch, `opt -debugify-each` is able to identify a
vectorizer DI-invariance bug fixed in llvm.org/PR32761. In this bug, the
vectorizer produced different output with/without debug info present.

Reverting Davide's bugfix locally, I see:

$ ~/scripts/opt-check-dbg-invar.sh ./bin/opt \
  .../SLPVectorizer/AArch64/spillcost-di.ll -slp-vectorizer
Comparing: -slp-vectorizer .../SLPVectorizer/AArch64/spillcost-di.ll
  Baseline: /var/folders/j8/t4w0bp8j6x1g6fpghkcb4sjm0000gp/T/tmp.iYYeL1kf
  With DI : /var/folders/j8/t4w0bp8j6x1g6fpghkcb4sjm0000gp/T/tmp.sQtQSeet
9,11c9,11
<   %5 = getelementptr inbounds %0, %0* %2, i64 %0, i32 1
<   %6 = bitcast i64* %4 to <2 x i64>*
<   %7 = load <2 x i64>, <2 x i64>* %6, align 8, !tbaa !0
---
>   %5 = load i64, i64* %4, align 8, !tbaa !0
>   %6 = getelementptr inbounds %0, %0* %2, i64 %0, i32 1
>   %7 = load i64, i64* %6, align 8, !tbaa !5
12a13
>   store i64 %5, i64* %8, align 8, !tbaa !0
14,15c15
<   %10 = bitcast i64* %8 to <2 x i64>*
<   store <2 x i64> %7, <2 x i64>* %10, align 8, !tbaa !0
---
>   store i64 %7, i64* %9, align 8, !tbaa !5
:: Found a test case ^

Running this over the *.ll files in tree, I found four additional examples
which compile differently with/without DI present. I plan on filing bugs for
these.

llvm-svn: 334118
2018-06-06 19:05:42 +00:00
Vedant Kumar 4872535eb9 [Debugify] Set a DI version module flag for llc compatibility
Setting the "Debug Info Version" module flag makes it possible to pipe
synthetic debug info into llc, which is useful for testing backends.

llvm-svn: 333237
2018-05-24 23:00:23 +00:00
Daniel Neilson 71fa1b904a [DSE] Teach the pass about partial overwrite of atomic memory intrinsics
Summary:
This change teaches DSE that the atomic memory intrinsics can be overwriten
partially in the same way as the non-atomic forms. Specifically, that the
atomic memcpy & memset can be shortened at the end and that the atomic memset
can be shortened at the beginning, if they partially overwritten
by later stores.

Reviewers: mkazantsev, skatkov, apilipenko, efriedma, rsmith, spatel, filcab, sanjoy

Reviewed By: efriedma

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D45584

llvm-svn: 331991
2018-05-10 15:12:49 +00:00
Shiva Chen 2c864551df [DebugInfo] Add DILabel metadata and intrinsic llvm.dbg.label.
In order to set breakpoints on labels and list source code around
labels, we need collect debug information for labels, i.e., label
name, the function label belong, line number in the file, and the
address label located. In order to keep these information in LLVM
IR and to allow backend to generate debug information correctly.
We create a new kind of metadata for labels, DILabel. The format
of DILabel is

!DILabel(scope: !1, name: "foo", file: !2, line: 3)

We hope to keep debug information as much as possible even the
code is optimized. So, we create a new kind of intrinsic for label
metadata to avoid the metadata is eliminated with basic block.
The intrinsic will keep existing if we keep it from optimized out.
The format of the intrinsic is

llvm.dbg.label(metadata !1)

It has only one argument, that is the DILabel metadata. The
intrinsic will follow the label immediately. Backend could get the
label metadata through the intrinsic's parameter.

We also create DIBuilder API for labels to be used by Frontend.
Frontend could use createLabel() to allocate DILabel objects, and use
insertLabel() to insert llvm.dbg.label intrinsic in LLVM IR.

Differential Revision: https://reviews.llvm.org/D45024

Patch by Hsiangkai Wang.

llvm-svn: 331841
2018-05-09 02:40:45 +00:00
Piotr Padlewski c77ab8ef2f perform DSE through launder.invariant.group
Summary:
Alias Analysis knows that llvm.launder.invariant.group
returns pointer that mustalias argument, but this information
wasn't used, therefor we didn't DSE through launder.invariant.group

Reviewers: chandlerc, dberlin, bogner, hfinkel, efriedma

Reviewed By: dberlin

Subscribers: amharc, llvm-commits, nlewycky, rsmith

Differential Revision: https://reviews.llvm.org/D31581

llvm-svn: 331449
2018-05-03 11:03:53 +00:00
Daniel Neilson cc45e923c5 [DSE] Teach the pass that atomic memory intrinsics are stores.
Summary:
This change teaches DSE that the atomic memory intrinsics are stores
that can be eliminated, and can allow other stores to be eliminated.
This change specifically does not teach DSE that these intrinsics
can be partially eliminated (i.e. length reduced, and dest/src changed);
that will be handled in another change.

Reviewers: mkazantsev, skatkov, apilipenko, efriedma, rsmith

Reviewed By: efriedma

Subscribers: dmgreen, llvm-commits

Differential Revision: https://reviews.llvm.org/D45535

llvm-svn: 330629
2018-04-23 19:06:49 +00:00
Daniel Neilson 381cdf3e07 [DSE] Add tests for atomic memory intrinsics (NFC)
Summary:
These tests show that DSE currently does nothing with the atomic memory
intrinsics. Future work will teach DSE how to simplify these.

llvm-svn: 329845
2018-04-11 19:46:02 +00:00
Daniel Neilson 9cfa786faa [DSE] Regenerate tests with update_test_checks.py (NFC)
Summary:
In preparation for a future commit, this regenerates the test checks for
test/Transforms/DeadStoreElimination/OverwriteStoreBegin.ll
test/Transforms/DeadStoreElimination/OverwriteStoreEnd.ll

llvm-svn: 329839
2018-04-11 18:43:10 +00:00
Daniel Neilson 7e2e5c3c58 [DSE] Regenerate tests with update_test_checks.py (NFC)
Summary:
In preparation for a future commit, this regenerates the test checks for
test/Transforms/DeadStoreElimination/simple.ll
test/Transforms/DeadStoreElimination/memintrinsics.ll

llvm-svn: 329824
2018-04-11 16:50:04 +00:00
Sanjoy Das 737fa40ffa [DSE] Don't DSE stores that subsequent memmove calls read from
Summary:
We used to remove the first memmove in cases like this:

  memmove(p, p+2, 8);
  memmove(p, p+2, 8);

which is incorrect.  Fix this by changing isPossibleSelfRead to what was most
likely the intended behavior.

Historical note: the buggy code was added in https://reviews.llvm.org/rL120974
to address PR8728.

Reviewers: rsmith

Subscribers: mcrosier, llvm-commits, jlebar

Differential Revision: https://reviews.llvm.org/D43425

llvm-svn: 325641
2018-02-20 23:19:34 +00:00
Vedant Kumar 35fc103e1e [DeadStoreElimination] Salvage debug info from dead insts
According to `llvm-dwarfdump --statistics` this salvages 43 additional
unique source variables in a stage2 build of clang. It increases the
size of the .debug_loc section by 0.002% (or 2864 bytes).

Differential Revision: https://reviews.llvm.org/D43220

llvm-svn: 325035
2018-02-13 18:15:26 +00:00
Sanjay Patel 1aef27f5cd [DSE] make sure memory is not modified before partial store merging (PR36129)
We missed a critical check in D30703. We must make sure that no intermediate 
store is sitting between the stores that we want to merge.

This should fix:
https://bugs.llvm.org/show_bug.cgi?id=36129

Differential Revision: https://reviews.llvm.org/D42663

llvm-svn: 323759
2018-01-30 13:53:59 +00:00
Sanjay Patel d023a9b777 [DSE] add test for PR36129; NFC
We can miscompile because we're not checking is the memory might
me modified between the seemingly redundant store ops.

llvm-svn: 323704
2018-01-29 22:50:08 +00:00
Daniel Neilson 1e68724d24 Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1)
Summary:
 This is a resurrection of work first proposed and discussed in Aug 2015:
   http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html
and initially landed (but then backed out) in Nov 2015:
   http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html

 The @llvm.memcpy/memmove/memset intrinsics currently have an explicit argument
which is required to be a constant integer. It represents the alignment of the
dest (and source), and so must be the minimum of the actual alignment of the
two.

 This change is the first in a series that allows source and dest to each
have their own alignments by using the alignment attribute on their arguments.

 In this change we:
1) Remove the alignment argument.
2) Add alignment attributes to the source & dest arguments. We, temporarily,
   require that the alignments for source & dest be equal.

 For example, code which used to read:
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 100, i32 4, i1 false)
will now read
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 4 %dest, i8* align 4 %src, i32 100, i1 false)

 Downstream users may have to update their lit tests that check for
@llvm.memcpy/memmove/memset call/declaration patterns. The following extended sed script
may help with updating the majority of your tests, but it does not catch all possible
patterns so some manual checking and updating will be required.

s~declare void @llvm\.mem(set|cpy|move)\.p([^(]*)\((.*), i32, i1\)~declare void @llvm.mem\1.p\2(\3, i1)~g
s~call void @llvm\.memset\.p([^(]*)i8\(i8([^*]*)\* (.*), i8 (.*), i8 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i8(i8\2* \3, i8 \4, i8 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i16\(i8([^*]*)\* (.*), i8 (.*), i16 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i16(i8\2* \3, i8 \4, i16 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i32\(i8([^*]*)\* (.*), i8 (.*), i32 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i32(i8\2* \3, i8 \4, i32 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i64\(i8([^*]*)\* (.*), i8 (.*), i64 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i64(i8\2* \3, i8 \4, i64 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i128\(i8([^*]*)\* (.*), i8 (.*), i128 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i128(i8\2* \3, i8 \4, i128 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i8\(i8([^*]*)\* (.*), i8 (.*), i8 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i8(i8\2* align \6 \3, i8 \4, i8 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i16\(i8([^*]*)\* (.*), i8 (.*), i16 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i16(i8\2* align \6 \3, i8 \4, i16 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i32\(i8([^*]*)\* (.*), i8 (.*), i32 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i32(i8\2* align \6 \3, i8 \4, i32 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i64\(i8([^*]*)\* (.*), i8 (.*), i64 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i64(i8\2* align \6 \3, i8 \4, i64 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i128\(i8([^*]*)\* (.*), i8 (.*), i128 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i128(i8\2* align \6 \3, i8 \4, i128 \5, i1 \7)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i8\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i8 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i8(i8\3* \4, i8\5* \6, i8 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i16\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i16 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i16(i8\3* \4, i8\5* \6, i16 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i32\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i32 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i32(i8\3* \4, i8\5* \6, i32 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i64\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i64 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i64(i8\3* \4, i8\5* \6, i64 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i128\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i128 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i128(i8\3* \4, i8\5* \6, i128 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i8\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i8 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i8(i8\3* align \8 \4, i8\5* align \8 \6, i8 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i16\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i16 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i16(i8\3* align \8 \4, i8\5* align \8 \6, i16 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i32\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i32 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i32(i8\3* align \8 \4, i8\5* align \8 \6, i32 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i64\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i64 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i64(i8\3* align \8 \4, i8\5* align \8 \6, i64 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i128\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i128 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i128(i8\3* align \8 \4, i8\5* align \8 \6, i128 \7, i1 \9)~g

 The remaining changes in the series will:
Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing
   source and dest alignments.
Step 3) Update Clang to use the new IRBuilder API.
Step 4) Update Polly to use the new IRBuilder API.
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API,
        and those that use use MemIntrinsicInst::[get|set]Alignment() to use
        getDestAlignment() and getSourceAlignment() instead.
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the
        MemIntrinsicInst::[get|set]Alignment() methods.

Reviewers: pete, hfinkel, lhames, reames, bollu

Reviewed By: reames

Subscribers: niosHD, reames, jholewinski, qcolombet, jfb, sanjoy, arsenm, dschuff, dylanmckay, mehdi_amini, sdardis, nemanjai, david2050, nhaehnle, javed.absar, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, llvm-commits

Differential Revision: https://reviews.llvm.org/D41675

llvm-svn: 322965
2018-01-19 17:13:12 +00:00
Dan Gohman 2c74fe977d Add an @llvm.sideeffect intrinsic
This patch implements Chandler's idea [0] for supporting languages that
require support for infinite loops with side effects, such as Rust, providing
part of a solution to bug 965 [1].

Specifically, it adds an `llvm.sideeffect()` intrinsic, which has no actual
effect, but which appears to optimization passes to have obscure side effects,
such that they don't optimize away loops containing it. It also teaches
several optimization passes to ignore this intrinsic, so that it doesn't
significantly impact optimization in most cases.

As discussed on llvm-dev [2], this patch is the first of two major parts.
The second part, to change LLVM's semantics to have defined behavior
on infinite loops by default, with a function attribute for opting into
potential-undefined-behavior, will be implemented and posted for review in
a separate patch.

[0] http://lists.llvm.org/pipermail/llvm-dev/2015-July/088103.html
[1] https://bugs.llvm.org/show_bug.cgi?id=965
[2] http://lists.llvm.org/pipermail/llvm-dev/2017-October/118632.html

Differential Revision: https://reviews.llvm.org/D38336

llvm-svn: 317729
2017-11-08 21:59:51 +00:00
Mikael Holmen 279790b674 [MemDep] DBG intrinsics don't impact abort limit for call site dependence analysis
Summary:
Memory dependence analysis no longer counts DbgInfoIntrinsics towards the
limit where to abort the analysis. Before, a bunch of calls to dbg.value
could affect the generated code, meaning that with -g we could generate
different code than without.

Reviewers: chandlerc, Prazek, davide, efriedma

Reviewed By: efriedma

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39181

llvm-svn: 316551
2017-10-25 06:15:32 +00:00
Sanjay Patel 1d04b5bacf [DSE] Merge stores when the later store only writes to memory locations the early store also wrote to (2nd try)
This is a 2nd attempt at:
https://reviews.llvm.org/rL310055
...which was reverted at rL310123 because of PR34074:
https://bugs.llvm.org/show_bug.cgi?id=34074

In this version, we break out of the inner loop after we successfully merge and kill a pair of stores. In the
earlier rev, we were continuing instead, which meant we could process the invalid info from a now dead store.

Original commit message (authored by Filipe Cabecinhas):

This fixes PR31777.

If both stores' values are ConstantInt, we merge the two stores
(shifting the smaller store appropriately) and replace the earlier (and
larger) store with an updated constant.

In the future we should also support vectors of integers. And maybe
float/double if we can.  

Differential Revision: https://reviews.llvm.org/D30703

llvm-svn: 314206
2017-09-26 13:54:28 +00:00
Nico Weber 69ca5322d4 Revert r310055, it caused PR34074.
llvm-svn: 310123
2017-08-04 20:40:38 +00:00
Filipe Cabecinhas fb9d2a8775 [DSE] Merge stores when the later store only writes to memory locations the early store also wrote to.
Summary:
This fixes PR31777.

If both stores' values are ConstantInt, we merge the two stores
(shifting the smaller store appropriately) and replace the earlier (and
larger) store with an updated constant.

In the future we should also support vectors of integers. And maybe
float/double if we can.

Reviewers: hfinkel, junbuml, jfb, RKSimon, bkramer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30703

llvm-svn: 310055
2017-08-04 12:28:36 +00:00
Adrian Prantl abe04759a6 Remove the obsolete offset parameter from @llvm.dbg.value
There is no situation where this rarely-used argument cannot be
substituted with a DIExpression and removing it allows us to simplify
the DWARF backend. Note that this patch does not yet remove any of
the newly dead code.

rdar://problem/33580047
Differential Revision: https://reviews.llvm.org/D35951

llvm-svn: 309426
2017-07-28 20:21:02 +00:00
Matt Arsenault f10061ec70 Add address space mangling to lifetime intrinsics
In preparation for allowing allocas to have non-0 addrspace.

llvm-svn: 299876
2017-04-10 20:18:21 +00:00
Igor Laevsky b40152d5d1 [DeadStoreElimination] Check function modref behavior before considering memory clobbered
Differential Revision: https://reviews.llvm.org/D29996

llvm-svn: 296625
2017-03-01 14:38:29 +00:00
Eli Friedman a6707f56b5 [DSE] Don't remove stores made live by a call which unwinds.
Issue exposed by noalias or more aggressive alias analysis.

Fixes http://llvm.org/PR25422.

Differential revision: https://reviews.llvm.org/D21007

llvm-svn: 278451
2016-08-12 01:09:53 +00:00
Anna Thomas 037e540f08 [AliasAnalysis] Treat invariant.start as read-memory
Summary:
We teach alias analysis that invariant.start is readonly.
This helps with GVN and memcopy optimizations that currently treat.
invariant.start as a clobber.
We need to treat this as readonly, so that DSE does not incorrectly
remove stores prior to the invariant.start

Reviewers: sanjoy, reames, majnemer, dberlin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D23214

llvm-svn: 278138
2016-08-09 17:18:05 +00:00
Jun Bum Lim 6a7dc5c430 Recommit - [DSE]Enhance shorthening MemIntrinsic based on OverlapIntervals
Recommiting r275571 after fixing crash reported in PR28270.
Now we erase elements of IOL in deleteDeadInstruction().

Original Summary:
This change use the overlap interval map built from partial overwrite tracking to perform shortening MemIntrinsics.
Add test cases which was missing opportunities before.

llvm-svn: 276452
2016-07-22 18:27:24 +00:00
Alexander Kornienko 63dd36faa5 Revert "r275571 [DSE]Enhance shorthening MemIntrinsic based on OverlapIntervals"
Causes https://llvm.org/bugs/show_bug.cgi?id=28588

llvm-svn: 275801
2016-07-18 15:51:31 +00:00
Jun Bum Lim a5737d8eac [DSE]Enhance shorthening MemIntrinsic based on OverlapIntervals
Summary:
This change use the overlap interval map built from partial overwrite tracking to perform shortening MemIntrinsics.
Add test cases which was missing opportunities before.

Reviewers: hfinkel, eeckstein, mcrosier

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D21909

llvm-svn: 275571
2016-07-15 16:14:34 +00:00
Anna Thomas 6a78c78a03 [DSE] Remove dead stores in end blocks containing fence
We can remove dead stores in the presence of fence instructions. Fence
does not change an otherwise thread local store to visible.

reviewers: reames, dexonsmith, jfb
Differential Revision: http://reviews.llvm.org/D22001

llvm-svn: 274795
2016-07-07 20:51:42 +00:00
Chad Rosier dcfce2d0ec [DSE] Avoid iterator invalidation bugs.
The dse_with_dbg_value.ll test committed with r273141 is removed because this
we no longer performs any type of back tracking, which is what was causing the
codegen differences with and without debug information.

Differential Revision: http://reviews.llvm.org/D21613

llvm-svn: 274660
2016-07-06 19:48:52 +00:00
Jun Bum Lim 596a3bd9ec [DSE] Fix bug in partial overwrite tracking
Summary:
Found cases where DSE incorrectly add partially-overwritten intervals.
Please see the test case for details.

Reviewers: mcrosier, eeckstein, hfinkel

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D21859

llvm-svn: 274237
2016-06-30 15:32:20 +00:00
Hal Finkel a1271036c5 Allow DeadStoreElimination to track combinations of partial later wrties
DeadStoreElimination can currently remove a small store rendered unnecessary by
a later larger one, but could not remove a larger store rendered unnecessary by
a series of later smaller ones. This adds that capability.

It works by keeping a map, which is used as an effective interval map, for each
store later overwritten only partially, and filling in that interval map as
more such stores are discovered. No additional walking or aliasing queries are
used. In the map forms an interval covering the the entire earlier store, then
it is dead and can be removed. The map is used as an interval map by storing a
mapping between the ending offset and the beginning offset of each interval.

I discovered this problem when investigating a performance issue with code like
this on PowerPC:

  #include <complex>
  using namespace std;

  complex<float> bar(complex<float> C);
  complex<float> foo(complex<float> C) {
    return bar(C)*C;
  }

which produces this:

  define void @_Z4testSt7complexIfE(%"struct.std::complex"* noalias nocapture sret %agg.result, i64 %c.coerce) {
  entry:
    %ref.tmp = alloca i64, align 8
    %tmpcast = bitcast i64* %ref.tmp to %"struct.std::complex"*
    %c.sroa.0.0.extract.shift = lshr i64 %c.coerce, 32
    %c.sroa.0.0.extract.trunc = trunc i64 %c.sroa.0.0.extract.shift to i32
    %0 = bitcast i32 %c.sroa.0.0.extract.trunc to float
    %c.sroa.2.0.extract.trunc = trunc i64 %c.coerce to i32
    %1 = bitcast i32 %c.sroa.2.0.extract.trunc to float
    call void @_Z3barSt7complexIfE(%"struct.std::complex"* nonnull sret %tmpcast, i64 %c.coerce)
    %2 = bitcast %"struct.std::complex"* %agg.result to i64*
    %3 = load i64, i64* %ref.tmp, align 8
    store i64 %3, i64* %2, align 4 ; <--- ***** THIS SHOULD NOT BE HERE ****
    %_M_value.realp.i.i = getelementptr inbounds %"struct.std::complex", %"struct.std::complex"* %agg.result, i64 0, i32 0, i32 0
    %4 = lshr i64 %3, 32
    %5 = trunc i64 %4 to i32
    %6 = bitcast i32 %5 to float
    %_M_value.imagp.i.i = getelementptr inbounds %"struct.std::complex", %"struct.std::complex"* %agg.result, i64 0, i32 0, i32 1
    %7 = trunc i64 %3 to i32
    %8 = bitcast i32 %7 to float
    %mul_ad.i.i = fmul fast float %6, %1
    %mul_bc.i.i = fmul fast float %8, %0
    %mul_i.i.i = fadd fast float %mul_ad.i.i, %mul_bc.i.i
    %mul_ac.i.i = fmul fast float %6, %0
    %mul_bd.i.i = fmul fast float %8, %1
    %mul_r.i.i = fsub fast float %mul_ac.i.i, %mul_bd.i.i
    store float %mul_r.i.i, float* %_M_value.realp.i.i, align 4
    store float %mul_i.i.i, float* %_M_value.imagp.i.i, align 4
    ret void
  }

the problem here is not just that the i64 store is unnecessary, but also that
it blocks further backend optimizations of the other uses of that i64 value in
the backend.

In the future, we might want to add a special case for handling smaller
accesses (e.g. using a bit vector) if the map mechanism turns out to be
noticeably inefficient. A sorted vector is also a possible replacement for the
map for small numbers of tracked intervals.

Differential Revision: http://reviews.llvm.org/D18586

llvm-svn: 273559
2016-06-23 13:46:39 +00:00
Patrik Hagglund 7205215591 Fix for PR27940
After a store has been eliminated, when making sure that the
instruction iterator points to a valid instruction, dbg intrinsics are
now ignored as a new instruction.

Patch by Henric Karlsson.

Reviewed by Daniel Berlin.

Differential Revision: http://reviews.llvm.org/D21076

llvm-svn: 273141
2016-06-20 09:10:10 +00:00
Sanjoy Das 98ac278b86 Move previously added test case to the right location
In rL272580 I accidentally added a test case to test/CodeGen when
test/Transforms/DeadStoreElimination/ is a better place for it.

llvm-svn: 272581
2016-06-13 20:12:07 +00:00
Justin Bogner 594e07bd78 [PM] Port DSE to the new pass manager
Patch by JakeVanAdrighem. Thanks!

llvm-svn: 269847
2016-05-17 21:38:13 +00:00
Jun Bum Lim d29a24e4fd [DeadStoreElimination] Shorten beginning of memset overwritten by later stores
Summary: This change will shorten memset if the beginning of memset is overwritten by later stores.

Reviewers: hfinkel, eeckstein, dberlin, mcrosier

Subscribers: mgrang, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D18906

llvm-svn: 267197
2016-04-22 19:51:29 +00:00
Adrian Prantl 75819aedf6 [PR27284] Reverse the ownership between DICompileUnit and DISubprogram.
Currently each Function points to a DISubprogram and DISubprogram has a
scope field. For member functions the scope is a DICompositeType. DIScopes
point to the DICompileUnit to facilitate type uniquing.

Distinct DISubprograms (with isDefinition: true) are not part of the type
hierarchy and cannot be uniqued. This change removes the subprograms
list from DICompileUnit and instead adds a pointer to the owning compile
unit to distinct DISubprograms. This would make it easy for ThinLTO to
strip unneeded DISubprograms and their transitively referenced debug info.

Motivation
----------

Materializing DISubprograms is currently the most expensive operation when
doing a ThinLTO build of clang.

We want the DISubprogram to be stored in a separate Bitcode block (or the
same block as the function body) so we can avoid having to expensively
deserialize all DISubprograms together with the global metadata. If a
function has been inlined into another subprogram we need to store a
reference the block containing the inlined subprogram.

Attached to https://llvm.org/bugs/show_bug.cgi?id=27284 is a python script
that updates LLVM IR testcases to the new format.

http://reviews.llvm.org/D19034
<rdar://problem/25256815>

llvm-svn: 266446
2016-04-15 15:57:41 +00:00
Adrian Prantl b8089516a5 testcase gardening: update the emissionKind enum to the new syntax. (NFC)
llvm-svn: 265081
2016-04-01 00:16:49 +00:00
Adrian Prantl b939a25707 Move the DebugEmissionKind enum from DIBuilder into DICompileUnit.
This mostly cosmetic patch moves the DebugEmissionKind enum from DIBuilder
into DICompileUnit. DIBuilder is not the right place for this enum to live
in — a metadata consumer should not have to include DIBuilder.h.
I also added a Verifier check that checks that the emission kind of a
DICompileUnit is actually legal.

http://reviews.llvm.org/D18612
<rdar://problem/25427165>

llvm-svn: 265077
2016-03-31 23:56:58 +00:00
Philip Reames b5681138e4 Allow value forwarding past release fences in GVN
A release fence acts as a publication barrier for stores within the current thread to become visible to other threads which might observe the release fence. It does not require the current thread to observe stores performed on other threads. As a result, we can allow store-load and load-load forwarding across a release fence.  

We choose to be much more conservative about stores.  In theory, nothing prevents us from shifting a store from after a release fence to before it, and then eliminating the preceeding (previously fenced) store.  Doing this without actually moving the second store is likely also legal, but we chose to be conservative at this time.

The LangRef indicates only atomic loads and stores are effected by fences. This patch chooses to be far more conservative then that. 

This is the GVN companion to http://reviews.llvm.org/D11434 which applied the same logic in EarlyCSE and has been baking in tree for a while now.

Differential Revision: http://reviews.llvm.org/D11436

llvm-svn: 264472
2016-03-25 22:40:35 +00:00
Igor Laevsky 28eeb3f66c [BasicAliasAnalysis] Take into account operand bundles in the getModRefInfo function
Differential Revision: http://reviews.llvm.org/D16225

llvm-svn: 257991
2016-01-16 12:15:53 +00:00
Keno Fischer 81e2e9ef86 Reapply r257105 "[Verifier] Check that debug values have proper size"
I originally reapplied this in 257550, but had to revert again due to bot
breakage. The only change in this version is to allow either the TypeSize
or the TypeAllocSize of the variable to be the one represented in debug info
(hopefully in the future we can figure out how to encode the difference).
Additionally, several bot failures following r257550, were due to
optimizer bugs now fixed in r257787 and r257795.

r257550 commit message was:

```
The follow extra changes were made to test cases:

Manually making the variable be the actual type instead of a pointer
to avoid pointer-size differences in generic code:

    LLVM :: DebugInfo/Generic/2010-03-24-MemberFn.ll
    LLVM :: DebugInfo/Generic/2010-04-06-NestedFnDbgInfo.ll
    LLVM :: DebugInfo/Generic/2010-05-03-DisableFramePtr.ll
    LLVM :: DebugInfo/Generic/varargs.ll

Delete sizing information from debug info for the same reason
(but the presence of the pointer was important to the test case):

    LLVM :: DebugInfo/Generic/restrict.ll
    LLVM :: DebugInfo/Generic/tu-composite.ll
    LLVM :: Linker/type-unique-type-array-a.ll
    LLVM :: Linker/type-unique-simple2.ll

Fixing an incorrect DW_OP_deref

    LLVM :: DebugInfo/Generic/2010-05-03-OriginDIE.ll

Fixing a missing DW_OP_deref

    LLVM :: DebugInfo/Generic/incorrect-variable-debugloc.ll

Additionally, clang should no longer complain during bootstrap should no
longer happen after r257534.

The original commit message was:
``
Summary:
Teach the Verifier to make sure that the storage size given to llvm.dbg.declare
or the value size given to llvm.dbg.value agree with what is declared in
DebugInfo. This is implicitly assumed in a number of passes (e.g. in SROA).
Additionally this catches a number of common mistakes, such as passing a
pointer when a value was intended or vice versa.

One complication comes from stack coloring which modifies the original IR when
it merges allocas in order to make sure that if AA falls back to the IR it gets
the correct result. However, given this new invariant, indiscriminately
replacing one alloca by a different (differently sized one) is no longer valid.
Fix this by just undefing out any use of the alloca in a dbg.declare in this
case.

Additionally, I had to fix a number of test cases. Of particular note:
- I regenerated dbg-changes-codegen-branch-folding.ll from the given source as
  it was affected by the bug fixed in r256077
- two-cus-from-same-file.ll was changed to avoid having a variable-typed debug
  variable as that would depend on the target, even though this test is
  supposed to be generic
- I had to manually declared size/align for reference type. See also the
  discussion for D14275/r253186.
- fpstack-debuginstr-kill.ll required changing `double` to `long double`
- most others were just a question of adding OP_deref
``

```

llvm-svn: 257850
2016-01-15 00:46:17 +00:00
Keno Fischer 78e5c9e6e2 Re-Revert r257105 (Verifier debug info changes)
While I investigate some new buildbot failures. This was originally reapplied
as r257550 and r257558.

llvm-svn: 257563
2016-01-13 02:31:14 +00:00
Keno Fischer 25916079ff Reapply r257105 "[Verifier] Check that debug values have proper size"
The follow extra changes were made to test cases:

Manually making the variable be the actual type instead of a pointer
to avoid pointer-size differences in generic code:

    LLVM :: DebugInfo/Generic/2010-03-24-MemberFn.ll
    LLVM :: DebugInfo/Generic/2010-04-06-NestedFnDbgInfo.ll
    LLVM :: DebugInfo/Generic/2010-05-03-DisableFramePtr.ll
    LLVM :: DebugInfo/Generic/varargs.ll

Delete sizing information from debug info for the same reason
(but the presence of the pointer was important to the test case):

    LLVM :: DebugInfo/Generic/restrict.ll
    LLVM :: DebugInfo/Generic/tu-composite.ll
    LLVM :: Linker/type-unique-type-array-a.ll
    LLVM :: Linker/type-unique-simple2.ll

Fixing an incorrect DW_OP_deref

    LLVM :: DebugInfo/Generic/2010-05-03-OriginDIE.ll

Fixing a missing DW_OP_deref

    LLVM :: DebugInfo/Generic/incorrect-variable-debugloc.ll

Additionally, clang should no longer complain during bootstrap should no
longer happen after r257534.

The original commit message was:
```
Summary:
Teach the Verifier to make sure that the storage size given to llvm.dbg.declare
or the value size given to llvm.dbg.value agree with what is declared in
DebugInfo. This is implicitly assumed in a number of passes (e.g. in SROA).
Additionally this catches a number of common mistakes, such as passing a
pointer when a value was intended or vice versa.

One complication comes from stack coloring which modifies the original IR when
it merges allocas in order to make sure that if AA falls back to the IR it gets
the correct result. However, given this new invariant, indiscriminately
replacing one alloca by a different (differently sized one) is no longer valid.
Fix this by just undefing out any use of the alloca in a dbg.declare in this
case.

Additionally, I had to fix a number of test cases. Of particular note:
- I regenerated dbg-changes-codegen-branch-folding.ll from the given source as
  it was affected by the bug fixed in r256077
- two-cus-from-same-file.ll was changed to avoid having a variable-typed debug
  variable as that would depend on the target, even though this test is
  supposed to be generic
- I had to manually declared size/align for reference type. See also the
  discussion for D14275/r253186.
- fpstack-debuginstr-kill.ll required changing `double` to `long double`
- most others were just a question of adding OP_deref
```

llvm-svn: 257550
2016-01-13 00:31:44 +00:00
Keno Fischer ea33a25816 Temporarily revert r257105 "[Verifier] Check that debug values have proper size"
Looks like there's a case where clang generates debug info that triggers
the new verifier check. Reverting while investigating.

llvm-svn: 257107
2016-01-07 22:39:11 +00:00
Keno Fischer b3326be6ad [Verifier] Check that debug values have proper size
Summary:
Teach the Verifier to make sure that the storage size given to llvm.dbg.declare
or the value size given to llvm.dbg.value agree with what is declared in
DebugInfo. This is implicitly assumed in a number of passes (e.g. in SROA).
Additionally this catches a number of common mistakes, such as passing a
pointer when a value was intended or vice versa.

One complication comes from stack coloring which modifies the original IR when
it merges allocas in order to make sure that if AA falls back to the IR it gets
the correct result. However, given this new invariant, indiscriminately
replacing one alloca by a different (differently sized one) is no longer valid.
Fix this by just undefing out any use of the alloca in a dbg.declare in this
case.

Additionally, I had to fix a number of test cases. Of particular note:
- I regenerated dbg-changes-codegen-branch-folding.ll from the given source as
  it was affected by the bug fixed in r256077
- two-cus-from-same-file.ll was changed to avoid having a variable-typed debug
  variable as that would depend on the target, even though this test is
  supposed to be generic
- I had to manually declared size/align for reference type. See also the
  discussion for D14275/r253186.
- fpstack-debuginstr-kill.ll required changing `double` to `long double`
- most others were just a question of adding OP_deref

Reviewers: aprantl
Differential Revision: http://reviews.llvm.org/D14276

llvm-svn: 257105
2016-01-07 22:18:37 +00:00
Chad Rosier d7634fc91d Revert r255247, r255265, and r255286 due to serious compile-time regressions.
Revert "[DSE] Disable non-local DSE to see if the bots go green."
Revert "[DeadStoreElimination] Use range-based loops. NFC."
Revert "[DeadStoreElimination] Add support for non-local DSE."

llvm-svn: 255354
2015-12-11 18:39:41 +00:00
Chad Rosier 843c7b4309 [DSE] Disable non-local DSE to see if the bots go green.
I see a few bots timing out, so I'm speculatively disabling r255247.

llvm-svn: 255286
2015-12-10 19:23:02 +00:00
Chad Rosier 533bc3fcac [DeadStoreElimination] Add support for non-local DSE.
We extend the search for redundant stores to predecessor blocks that
unconditionally lead to the block BB with the current store instruction.  That
also includes single-block loops that unconditionally lead to BB, and
if-then-else blocks where then- and else-blocks unconditionally lead to BB.

http://reviews.llvm.org/D13363
Patch by Ivan Baev <ibaev@codeaurora.org>!

llvm-svn: 255247
2015-12-10 13:51:43 +00:00
Pete Cooper 67cf9a723b Revert "Change memcpy/memset/memmove to have dest and source alignments."
This reverts commit r253511.

This likely broke the bots in
http://lab.llvm.org:8011/builders/clang-ppc64-elf-linux2/builds/20202
http://bb.pgr.jp/builders/clang-3stage-i686-linux/builds/3787

llvm-svn: 253543
2015-11-19 05:56:52 +00:00
Pete Cooper 72bc23ef02 Change memcpy/memset/memmove to have dest and source alignments.
Note, this was reviewed (and more details are in) http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html

These intrinsics currently have an explicit alignment argument which is
required to be a constant integer.  It represents the alignment of the
source and dest, and so must be the minimum of those.

This change allows source and dest to each have their own alignments
by using the alignment attribute on their arguments.  The alignment
argument itself is removed.

There are a few places in the code for which the code needs to be
checked by an expert as to whether using only src/dest alignment is
safe.  For those places, they currently take the minimum of src/dest
alignments which matches the current behaviour.

For example, code which used to read:
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 500, i32 8, i1 false)
will now read:
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 8 %dest, i8* align 8 %src, i32 500, i1 false)

For out of tree owners, I was able to strip alignment from calls using sed by replacing:
  (call.*llvm\.memset.*)i32\ [0-9]*\,\ i1 false\)
with:
  $1i1 false)

and similarly for memmove and memcpy.

I then added back in alignment to test cases which needed it.

A similar commit will be made to clang which actually has many differences in alignment as now
IRBuilder can generate different source/dest alignments on calls.

In IRBuilder itself, a new argument was added.  Instead of calling:
  CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, /* isVolatile */ false)
you now call
  CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, SrcAlign, /* isVolatile */ false)

There is a temporary class (IntegerAlignment) which takes the source alignment and rejects
implicit conversion from bool.  This is to prevent isVolatile here from passing its default
parameter to the source alignment.

Note, changes in future can now be made to codegen.  I didn't change anything here, but this
change should enable better memcpy code sequences.

Reviewed by Hal Finkel.

llvm-svn: 253511
2015-11-18 22:17:24 +00:00
Peter Collingbourne d4bff30370 DI: Reverse direction of subprogram -> function edge.
Previously, subprograms contained a metadata reference to the function they
described. Because most clients need to get or set a subprogram for a given
function rather than the other way around, this created unneeded inefficiency.

For example, many passes needed to call the function llvm::makeSubprogramMap()
to build a mapping from functions to subprograms, and the IR linker needed to
fix up function references in a way that caused quadratic complexity in the IR
linking phase of LTO.

This change reverses the direction of the edge by storing the subprogram as
function-level metadata and removing DISubprogram's function field.

Since this is an IR change, a bitcode upgrade has been provided.

Fixes PR23367. An upgrade script for textual IR for out-of-tree clients is
attached to the PR.

Differential Revision: http://reviews.llvm.org/D14265

llvm-svn: 252219
2015-11-05 22:03:56 +00:00
Igor Laevsky 029bd93c5d [DeadStoreElimination] Remove dead zero store to calloc initialized memory
This change allows dead store elimination to remove zero and null stores into memory freshly allocated with calloc-like function.

Differential Revision: http://reviews.llvm.org/D13021

llvm-svn: 248374
2015-09-23 11:38:44 +00:00
Duncan P. N. Exon Smith 814b8e91c7 DI: Require subprogram definitions to be distinct
As a follow-up to r246098, require `DISubprogram` definitions
(`isDefinition: true`) to be 'distinct'.  Specifically, add an assembler
check, a verifier check, and bitcode upgrading logic to combat testcase
bitrot after the `DIBuilder` change.

While working on the testcases, I realized that
test/Linker/subprogram-linkonce-weak-odr.ll isn't relevant anymore.  Its
purpose was to check for a corner case in PR22792 where two subprogram
definitions match exactly and share the same metadata node.  The new
verifier check, requiring that subprogram definitions are 'distinct',
precludes that possibility.

I updated almost all the IR with the following script:

    git grep -l -E -e '= !DISubprogram\(.* isDefinition: true' |
    grep -v test/Bitcode |
    xargs sed -i '' -e 's/= \(!DISubprogram(.*, isDefinition: true\)/= distinct \1/'

Likely some variant of would work for out-of-tree testcases.

llvm-svn: 246327
2015-08-28 20:26:49 +00:00
Bjorn Steinbrink 2e2f66557e Revert "[DSE] Enable removal of lifetime intrinsics in terminating blocks"
llvm-svn: 245543
2015-08-20 08:58:47 +00:00
Bjorn Steinbrink cc7e8a9705 [DSE] Enable removal of lifetime intrinsics in terminating blocks
Usually DSE is not supposed to remove lifetime intrinsics, but it's
actually ok to remove them for dead objects in terminating blocks,
because they convey no extra information there. Until we hit a lifetime
start that cannot be removed, that is. Because from that point on the
lifetime intrinsics become interesting again, e.g. for stack coloring.

Reviewers: reames

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11710

llvm-svn: 245542
2015-08-20 08:25:28 +00:00
Eric Christopher 0efe9f60bb Revert "Fix PR24469 resulting from r245025 and re-enable dead store elimination across basicblocks."
This is causing bootstrap problems, e.g.: http://bb.pgr.jp/builders/clang-3stage-i686-linux/builds/2960

This reverts r245195.

llvm-svn: 245402
2015-08-19 02:15:13 +00:00
Karthik Bhat 3af28945b9 Fix PR24469 resulting from r245025 and re-enable dead store elimination across basicblocks.
PR24469 resulted because DeleteDeadInstruction in handleNonLocalStoreDeletion was
deleting the next basic block iterator. Fixed the same by resetting the basic block iterator
post call to DeleteDeadInstruction.

llvm-svn: 245195
2015-08-17 05:51:39 +00:00
David Majnemer e04443baff Revert "Add support for cross block dse. This patch enables dead stroe elimination across basicblocks."
This reverts commit r245025, it caused PR24469.

llvm-svn: 245172
2015-08-16 07:11:59 +00:00
Karthik Bhat ddc2a86a00 Add support for cross block dse.
This patch enables dead stroe elimination across basicblocks.

Example:
define void @test_02(i32 %N) {
  %1 = alloca i32
  store i32 %N, i32* %1
  store i32 10, i32* @x
  %2 = load i32, i32* %1
  %3 = icmp ne i32 %2, 0
  br i1 %3, label %4, label %5

; <label>:4
  store i32 5, i32* @x
  br label %7

; <label>:5
  %6 = load i32, i32* @x
  store i32 %6, i32* @y
  br label %7

; <label>:7
  store i32 15, i32* @x
  ret void
}
In the above example dead store "store i32 5, i32* @x" is now eliminated.

Differential Revision: http://reviews.llvm.org/D11143

llvm-svn: 245025
2015-08-14 04:17:23 +00:00
Erik Eckstein 11fc8175d9 [DeadStoreElimination] remove a redundant store even if the load is in a different block.
DeadStoreElimination does eliminate a store if it stores a value which was loaded from the same memory location.
So far this worked only if the store is in the same block as the load.
Now we can also handle stores which are in a different block than the load.
Example:

define i32 @test(i1, i32*) {
entry:
  %l2 = load i32, i32* %1, align 4
  br i1 %0, label %bb1, label %bb2
bb1:
  br label %bb3
bb2:
  ; This store is redundant
  store i32 %l2, i32* %1, align 4
  br label %bb3
bb3:
  ret i32 0
}

Differential Revision: http://reviews.llvm.org/D11854

llvm-svn: 244901
2015-08-13 15:36:11 +00:00
Duncan P. N. Exon Smith 55ca964e94 DI: Disallow uniquable DICompileUnits
Since r241097, `DIBuilder` has only created distinct `DICompileUnit`s.
The backend is liable to start relying on that (if it hasn't already),
so make uniquable `DICompileUnit`s illegal and automatically upgrade old
bitcode.  This is a nice cleanup, since we can remove an unnecessary
`DenseSet` (and the associated uniquing info) from `LLVMContextImpl`.

Almost all the testcases were updated with this script:

    git grep -e '= !DICompileUnit' -l -- test |
    grep -v test/Bitcode |
    xargs sed -i '' -e 's,= !DICompileUnit,= distinct !DICompileUnit,'

I imagine something similar should work for out-of-tree testcases.

llvm-svn: 243885
2015-08-03 17:26:41 +00:00
Duncan P. N. Exon Smith ed013cd221 DI: Remove DW_TAG_arg_variable and DW_TAG_auto_variable
Remove the fake `DW_TAG_auto_variable` and `DW_TAG_arg_variable` tags,
using `DW_TAG_variable` in their place Stop exposing the `tag:` field at
all in the assembly format for `DILocalVariable`.

Most of the testcase updates were generated by the following sed script:

    find test/ -name "*.ll" -o -name "*.mir" |
    xargs grep -l 'DILocalVariable' |
    xargs sed -i '' \
      -e 's/tag: DW_TAG_arg_variable, //' \
      -e 's/tag: DW_TAG_auto_variable, //'

There were only a handful of tests in `test/Assembly` that I needed to
update by hand.

(Note: a follow-up could change `DILocalVariable::DILocalVariable()` to
set the tag to `DW_TAG_formal_parameter` instead of `DW_TAG_variable`
(as appropriate), instead of having that logic magically in the backend
in `DbgVariable`.  I've added a FIXME to that effect.)

llvm-svn: 243774
2015-07-31 18:58:39 +00:00
Duncan P. N. Exon Smith a9308c49ef IR: Give 'DI' prefix to debug info metadata
Finish off PR23080 by renaming the debug info IR constructs from `MD*`
to `DI*`.  The last of the `DIDescriptor` classes were deleted in
r235356, and the last of the related typedefs removed in r235413, so
this has all baked for about a week.

Note: If you have out-of-tree code (like a frontend), I recommend that
you get everything compiling and tests passing with the *previous*
commit before updating to this one.  It'll be easier to keep track of
what code is using the `DIDescriptor` hierarchy and what you've already
updated, and I think you're extremely unlikely to insert bugs.  YMMV of
course.

Back to *this* commit: I did this using the rename-md-di-nodes.sh
upgrade script I've attached to PR23080 (both code and testcases) and
filtered through clang-format-diff.py.  I edited the tests for
test/Assembler/invalid-generic-debug-node-*.ll by hand since the columns
were off-by-three.  It should work on your out-of-tree testcases (and
code, if you've followed the advice in the previous paragraph).

Some of the tests are in badly named files now (e.g.,
test/Assembler/invalid-mdcompositetype-missing-tag.ll should be
'dicompositetype'); I'll come back and move the files in a follow-up
commit.

llvm-svn: 236120
2015-04-29 16:38:44 +00:00
Duncan P. N. Exon Smith 48b3503c16 DebugInfo: Add missing !dbg attachments to intrinsics
Add missing `!dbg` attachments to `@llvm.dbg.*` intrinsics.  I updated
these using a script (add-dbg-to-intrinsics.sh) that I'll attach to
PR22778 for posterity.

llvm-svn: 235040
2015-04-15 21:04:10 +00:00
Duncan P. N. Exon Smith 49e6a70fe3 Verifier: Call verifyModule() from llc and opt
Change `llc` and `opt` to run `verifyModule()`.  This ensures that we
check the full module before `FunctionPass::doInitialization()` ever
gets called (I was getting crashes in `DwarfDebug` instead of verifier
failures when testing a WIP patch that checks operands of compile
units).  In `opt`, also move up debug-info-stripping so that it still
runs before verification.

There was a fair bit of broken code that was sitting in tree.
Interestingly, some were cases of a `select` that referred to itself in
`-instcombine` tests (apparently an intermediate result).  I split them
off to `*-noverify.ll` tests with RUN lines like this:

    opt < %s -S -disable-verify -instcombine | opt -S | FileCheck %s

This avoids verifying the input file (so we can get the broken code into
`-instcombine), but still verifies the output with a second call to
`opt` (to verify that `-instcombine` will clean it up like it should).

llvm-svn: 233432
2015-03-27 22:04:28 +00:00
David Majnemer e165502ed7 MemoryDependenceAnalysis: Don't miscompile atomics
r216771 introduced a change to MemoryDependenceAnalysis that allowed it
to reason about acquire/release operations.  However, this change does
not ensure that the acquire/release operations pair.  Unfortunately,
this leads to miscompiles as we won't see an acquire load as properly
memory effecting.  This largely reverts r216771.

This fixes PR22708.

llvm-svn: 232889
2015-03-21 06:19:17 +00:00
Duncan P. N. Exon Smith b786572d7c DebugInfo: Fix testcases that fail -verify-debug-info=true
As part of PR22777, fix testcases that fail the debug info verifier.
The changes fall into the following categories:

  - Empty `filename:` fields in `MDFile`s.  Compile units and some types
    require non-empty filenames.  A number of testcases have empty
    filenames, probably due to hand-reduction of testcases.
  - Not-quite empty arrays: `!{i32 0}`.  This used to be equivalent in
    the debug info schema to `!{}`.  They cause problems for
    `!MDSubroutineType`'s `types:` array, since it requires all operands
    to be valid types.  (Note that `!{null}` is the correct type array
    for functions that take no arguments and return `void`.)
  - Significantly bitrotted testcases.  Nodes got left behind a few
    upgrades ago because of missing or invalid tags.

llvm-svn: 232415
2015-03-16 21:10:12 +00:00
Duncan P. N. Exon Smith 166121ad0b Verifier: Check debug info intrinsic arguments
Verify that debug info intrinsic arguments are valid.  (These checks
will not recurse through the full debug info graph, so they don't need
to be cordoned of in `DebugInfoVerifier`.)

With those checks in place, changing the `DbgIntrinsicInst` accessors to
downcast to `MDLocalVariable` and `MDExpression` is natural (added isa
specializations in `Metadata.h` to support this).

Added tests to `test/Verifier` for the new -verify checks, and fixed the
debug info in all the in-tree tests.

If you have out-of-tree testcases that have started to fail to -verify,
hopefully the verify checks are helpful.  The most likely problem is
that the expression argument is `!{}` (instead of `!MDExpression()`).

llvm-svn: 232296
2015-03-15 01:21:30 +00:00
Duncan P. N. Exon Smith e274180f0e DebugInfo: Move new hierarchy into place
Move the specialized metadata nodes for the new debug info hierarchy
into place, finishing off PR22464.  I've done bootstraps (and all that)
and I'm confident this commit is NFC as far as DWARF output is
concerned.  Let me know if I'm wrong :).

The code changes are fairly mechanical:

  - Bumped the "Debug Info Version".
  - `DIBuilder` now creates the appropriate subclass of `MDNode`.
  - Subclasses of DIDescriptor now expect to hold their "MD"
    counterparts (e.g., `DIBasicType` expects `MDBasicType`).
  - Deleted a ton of dead code in `AsmWriter.cpp` and `DebugInfo.cpp`
    for printing comments.
  - Big update to LangRef to describe the nodes in the new hierarchy.
    Feel free to make it better.

Testcase changes are enormous.  There's an accompanying clang commit on
its way.

If you have out-of-tree debug info testcases, I just broke your build.

  - `upgrade-specialized-nodes.sh` is attached to PR22564.  I used it to
    update all the IR testcases.
  - Unfortunately I failed to find way to script the updates to CHECK
    lines, so I updated all of these by hand.  This was fairly painful,
    since the old CHECKs are difficult to reason about.  That's one of
    the benefits of the new hierarchy.

This work isn't quite finished, BTW.  The `DIDescriptor` subclasses are
almost empty wrappers, but not quite: they still have loose casting
checks (see the `RETURN_FROM_RAW()` macro).  Once they're completely
gutted, I'll rename the "MD" classes to "DI" and kill the wrappers.  I
also expect to make a few schema changes now that it's easier to reason
about everything.

llvm-svn: 231082
2015-03-03 17:24:31 +00:00
David Blaikie a79ac14fa6 [opaque pointer type] Add textual IR support for explicit type parameter to load instruction
Essentially the same as the GEP change in r230786.

A similar migration script can be used to update test cases, though a few more
test case improvements/changes were required this time around: (r229269-r229278)

import fileinput
import sys
import re

pat = re.compile(r"((?:=|:|^)\s*load (?:atomic )?(?:volatile )?(.*?))(| addrspace\(\d+\) *)\*($| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$)")

for line in sys.stdin:
  sys.stdout.write(re.sub(pat, r"\1, \2\3*\4", line))

Reviewers: rafael, dexonsmith, grosser

Differential Revision: http://reviews.llvm.org/D7649

llvm-svn: 230794
2015-02-27 21:17:42 +00:00
David Blaikie 79e6c74981 [opaque pointer type] Add textual IR support for explicit type parameter to getelementptr instruction
One of several parallel first steps to remove the target type of pointers,
replacing them with a single opaque pointer type.

This adds an explicit type parameter to the gep instruction so that when the
first parameter becomes an opaque pointer type, the type to gep through is
still available to the instructions.

* This doesn't modify gep operators, only instructions (operators will be
  handled separately)

* Textual IR changes only. Bitcode (including upgrade) and changing the
  in-memory representation will be in separate changes.

* geps of vectors are transformed as:
    getelementptr <4 x float*> %x, ...
  ->getelementptr float, <4 x float*> %x, ...
  Then, once the opaque pointer type is introduced, this will ultimately look
  like:
    getelementptr float, <4 x ptr> %x
  with the unambiguous interpretation that it is a vector of pointers to float.

* address spaces remain on the pointer, not the type:
    getelementptr float addrspace(1)* %x
  ->getelementptr float, float addrspace(1)* %x
  Then, eventually:
    getelementptr float, ptr addrspace(1) %x

Importantly, the massive amount of test case churn has been automated by
same crappy python code. I had to manually update a few test cases that
wouldn't fit the script's model (r228970,r229196,r229197,r229198). The
python script just massages stdin and writes the result to stdout, I
then wrapped that in a shell script to handle replacing files, then
using the usual find+xargs to migrate all the files.

update.py:
import fileinput
import sys
import re

ibrep = re.compile(r"(^.*?[^%\w]getelementptr inbounds )(((?:<\d* x )?)(.*?)(| addrspace\(\d\)) *\*(|>)(?:$| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$))")
normrep = re.compile(       r"(^.*?[^%\w]getelementptr )(((?:<\d* x )?)(.*?)(| addrspace\(\d\)) *\*(|>)(?:$| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$))")

def conv(match, line):
  if not match:
    return line
  line = match.groups()[0]
  if len(match.groups()[5]) == 0:
    line += match.groups()[2]
  line += match.groups()[3]
  line += ", "
  line += match.groups()[1]
  line += "\n"
  return line

for line in sys.stdin:
  if line.find("getelementptr ") == line.find("getelementptr inbounds"):
    if line.find("getelementptr inbounds") != line.find("getelementptr inbounds ("):
      line = conv(re.match(ibrep, line), line)
  elif line.find("getelementptr ") != line.find("getelementptr ("):
    line = conv(re.match(normrep, line), line)
  sys.stdout.write(line)

apply.sh:
for name in "$@"
do
  python3 `dirname "$0"`/update.py < "$name" > "$name.tmp" && mv "$name.tmp" "$name"
  rm -f "$name.tmp"
done

The actual commands:
From llvm/src:
find test/ -name *.ll | xargs ./apply.sh
From llvm/src/tools/clang:
find test/ -name *.mm -o -name *.m -o -name *.cpp -o -name *.c | xargs -I '{}' ../../apply.sh "{}"
From llvm/src/tools/polly:
find test/ -name *.ll | xargs ./apply.sh

After that, check-all (with llvm, clang, clang-tools-extra, lld,
compiler-rt, and polly all checked out).

The extra 'rm' in the apply.sh script is due to a few files in clang's test
suite using interesting unicode stuff that my python script was throwing
exceptions on. None of those files needed to be migrated, so it seemed
sufficient to ignore those cases.

Reviewers: rafael, dexonsmith, grosser

Differential Revision: http://reviews.llvm.org/D7636

llvm-svn: 230786
2015-02-27 19:29:02 +00:00
Duncan P. N. Exon Smith be7ea19b58 IR: Make metadata typeless in assembly
Now that `Metadata` is typeless, reflect that in the assembly.  These
are the matching assembly changes for the metadata/value split in
r223802.

  - Only use the `metadata` type when referencing metadata from a call
    intrinsic -- i.e., only when it's used as a `Value`.

  - Stop pretending that `ValueAsMetadata` is wrapped in an `MDNode`
    when referencing it from call intrinsics.

So, assembly like this:

    define @foo(i32 %v) {
      call void @llvm.foo(metadata !{i32 %v}, metadata !0)
      call void @llvm.foo(metadata !{i32 7}, metadata !0)
      call void @llvm.foo(metadata !1, metadata !0)
      call void @llvm.foo(metadata !3, metadata !0)
      call void @llvm.foo(metadata !{metadata !3}, metadata !0)
      ret void, !bar !2
    }
    !0 = metadata !{metadata !2}
    !1 = metadata !{i32* @global}
    !2 = metadata !{metadata !3}
    !3 = metadata !{}

turns into this:

    define @foo(i32 %v) {
      call void @llvm.foo(metadata i32 %v, metadata !0)
      call void @llvm.foo(metadata i32 7, metadata !0)
      call void @llvm.foo(metadata i32* @global, metadata !0)
      call void @llvm.foo(metadata !3, metadata !0)
      call void @llvm.foo(metadata !{!3}, metadata !0)
      ret void, !bar !2
    }
    !0 = !{!2}
    !1 = !{i32* @global}
    !2 = !{!3}
    !3 = !{}

I wrote an upgrade script that handled almost all of the tests in llvm
and many of the tests in cfe (even handling many `CHECK` lines).  I've
attached it (or will attach it in a moment if you're speedy) to PR21532
to help everyone update their out-of-tree testcases.

This is part of PR21532.

llvm-svn: 224257
2014-12-15 19:07:53 +00:00
Reid Kleckner 35fc363ce8 Parse 'ghccc' in .ll files as the GHC convention (cc 10)
Previously we just used "cc 10" in the .ll files, but that isn't very
human readable.

llvm-svn: 223076
2014-12-01 21:04:44 +00:00
Hal Finkel dd38c0b876 [DSE] Remove no-data-layout-only type-based overlap checking
DSE's overlap checking contained special logic, used only when no DataLayout
was available, which inferred a complete overwrite when the pointee types were
equal. This logic seems fine for regular loads/stores, but does not work for
memcpy and friends. Instead of fixing this, I'm just removing it.
Philosophically, transformations should not contain enhanced behavior used only
when data layout is lacking (data layout should be strictly additive), and
maintaining these rarely-tested code paths seems not worthwhile at this stage.

Credit to Aliaksei Zasenka for the bug report and the diagnosis. The test case
(slightly reduced from that provided by Aliaksei) replaces the original
contents of test/Transforms/DeadStoreElimination/no-targetdata.ll -- a few
other tests have been updated to have a data layout.

llvm-svn: 220035
2014-10-17 11:56:00 +00:00
Duncan P. N. Exon Smith 176b691d32 Revert "Revert "DI: Fold constant arguments into a single MDString""
This reverts commit r218918, effectively reapplying r218914 after fixing
an Ocaml bindings test and an Asan crash.  The root cause of the latter
was a tightened-up check in `DILexicalBlock::Verify()`, so I'll file a
PR to investigate who requires the loose check (and why).

Original commit message follows.

--

This patch addresses the first stage of PR17891 by folding constant
arguments together into a single MDString.  Integers are stringified and
a `\0` character is used as a separator.

Part of PR17891.

Note: I've attached my testcases upgrade scripts to the PR.  If I've
just broken your out-of-tree testcases, they might help.

llvm-svn: 219010
2014-10-03 20:01:09 +00:00
Duncan P. N. Exon Smith 786cd049fc Revert "DI: Fold constant arguments into a single MDString"
This reverts commit r218914 while I investigate some bots.

llvm-svn: 218918
2014-10-02 22:15:31 +00:00
Duncan P. N. Exon Smith 571f97bd90 DI: Fold constant arguments into a single MDString
This patch addresses the first stage of PR17891 by folding constant
arguments together into a single MDString.  Integers are stringified and
a `\0` character is used as a separator.

Part of PR17891.

Note: I've attached my testcases upgrade scripts to the PR.  If I've
just broken your out-of-tree testcases, they might help.

llvm-svn: 218914
2014-10-02 21:56:57 +00:00
Adrian Prantl 87b7eb9d0f Move the complex address expression out of DIVariable and into an extra
argument of the llvm.dbg.declare/llvm.dbg.value intrinsics.

Previously, DIVariable was a variable-length field that has an optional
reference to a Metadata array consisting of a variable number of
complex address expressions. In the case of OpPiece expressions this is
wasting a lot of storage in IR, because when an aggregate type is, e.g.,
SROA'd into all of its n individual members, the IR will contain n copies
of the DIVariable, all alike, only differing in the complex address
reference at the end.

By making the complex address into an extra argument of the
dbg.value/dbg.declare intrinsics, all of the pieces can reference the
same variable and the complex address expressions can be uniqued across
the CU, too.
Down the road, this will allow us to move other flags, such as
"indirection" out of the DIVariable, too.

The new intrinsics look like this:
declare void @llvm.dbg.declare(metadata %storage, metadata %var, metadata %expr)
declare void @llvm.dbg.value(metadata %storage, i64 %offset, metadata %var, metadata %expr)

This patch adds a new LLVM-local tag to DIExpressions, so we can detect
and pretty-print DIExpression metadata nodes.

What this patch doesn't do:

This patch does not touch the "Indirect" field in DIVariable; but moving
that into the expression would be a natural next step.

http://reviews.llvm.org/D4919
rdar://problem/17994491

Thanks to dblaikie and dexonsmith for reviewing this patch!

Note: I accidentally committed a bogus older version of this patch previously.
llvm-svn: 218787
2014-10-01 18:55:02 +00:00
Adrian Prantl b458dc2eee Revert r218778 while investigating buldbot breakage.
"Move the complex address expression out of DIVariable and into an extra"

llvm-svn: 218782
2014-10-01 18:10:54 +00:00
Adrian Prantl 25a7174e7a Move the complex address expression out of DIVariable and into an extra
argument of the llvm.dbg.declare/llvm.dbg.value intrinsics.

Previously, DIVariable was a variable-length field that has an optional
reference to a Metadata array consisting of a variable number of
complex address expressions. In the case of OpPiece expressions this is
wasting a lot of storage in IR, because when an aggregate type is, e.g.,
SROA'd into all of its n individual members, the IR will contain n copies
of the DIVariable, all alike, only differing in the complex address
reference at the end.

By making the complex address into an extra argument of the
dbg.value/dbg.declare intrinsics, all of the pieces can reference the
same variable and the complex address expressions can be uniqued across
the CU, too.
Down the road, this will allow us to move other flags, such as
"indirection" out of the DIVariable, too.

The new intrinsics look like this:
declare void @llvm.dbg.declare(metadata %storage, metadata %var, metadata %expr)
declare void @llvm.dbg.value(metadata %storage, i64 %offset, metadata %var, metadata %expr)

This patch adds a new LLVM-local tag to DIExpressions, so we can detect
and pretty-print DIExpression metadata nodes.

What this patch doesn't do:

This patch does not touch the "Indirect" field in DIVariable; but moving
that into the expression would be a natural next step.

http://reviews.llvm.org/D4919
rdar://problem/17994491

Thanks to dblaikie and dexonsmith for reviewing this patch!

llvm-svn: 218778
2014-10-01 17:55:39 +00:00
Robin Morisset 163ef0402a Relax the constraint more in MemoryDependencyAnalysis.cpp
Even loads/stores that have a stronger ordering than monotonic can be safe.
The rule is no release-acquire pair on the path from the QueryInst, assuming that
the QueryInst is not atomic itself.

llvm-svn: 216771
2014-08-29 20:32:58 +00:00
Robin Morisset 4ffe8aaa69 Weak relaxing of the constraints on atomics in MemoryDependencyAnalysis
Monotonic accesses do not have to kill the analysis, as long as the QueryInstr is not
itself atomic.

llvm-svn: 215942
2014-08-18 22:18:11 +00:00
Rui Ueyama c487f7728e Revert "r214897 - Remove dead zero store to calloc initialized memory"
It broke msan.

llvm-svn: 214989
2014-08-06 19:30:38 +00:00
Philip Reames 00c9b6461f Remove dead zero store to calloc initialized memory
Optimize the following IR:

%1 = tail call noalias i8* @calloc(i64 1, i64 4)
%2 = bitcast i8* %1 to i32*
; This store is dead and should be removed
store i32 0, i32* %2, align 4

Memory returned by calloc is guaranteed to be zero initialized. If the value being stored is the constant zero (and the store is not otherwise observable across threads), we can delete the store.  If the store is to an out of bounds address, it is undefined and thus also removable.

Reviewed By: nicholas

Differential Revision: http://reviews.llvm.org/D3942

llvm-svn: 214897
2014-08-05 17:48:20 +00:00