Commit Graph

12567 Commits

Author SHA1 Message Date
Adam Nemet 57ac766ee9 [LoopAccesses] Change LAA:getInfo to return a constant reference
As expected, this required a few more const-correctness fixes.

Based on Hal's feedback on D7684.

llvm-svn: 229899
2015-02-19 19:15:21 +00:00
Adam Nemet 2bd6e984ef [LoopAccesses] Split out LoopAccessReport from VectorizerReport
The only difference between these two is that VectorizerReport adds a
vectorizer-specific prefix to its messages.  When LAA is used in the
vectorizer context the prefix is added when we promote the
LoopAccessReport into a VectorizerReport via one of the constructors.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229897
2015-02-19 19:15:15 +00:00
Adam Nemet 3e87634fd8 [LoopAccesses] Add missing const to APIs in VectorizationReport
When I split out LoopAccessReport from this, I need to create some temps
so constness becomes necessary.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229896
2015-02-19 19:15:13 +00:00
Adam Nemet 339f42b396 [LoopAccesses] Change debug messages from LV to LAA
Also add pass name as an argument to VectorizationReport::emitAnalysis.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229894
2015-02-19 19:15:07 +00:00
Adam Nemet 3bfd93d789 [LoopAccesses] Create the analysis pass
This is a function pass that runs the analysis on demand.  The analysis
can be initiated by querying the loop access info via LAA::getInfo.  It
either returns the cached info or runs the analysis.

Symbolic stride information continues to reside outside of this analysis
pass. We may move it inside later but it's not a priority for me right
now.  The idea is that Loop Distribution won't support run-time stride
checking at least initially.

This means that when querying the analysis, symbolic stride information
can be provided optionally.  Whether stride information is used can
invalidate the cache entry and rerun the analysis.  Note that if the
loop does not have any symbolic stride, the entry should be preserved
across Loop Distribution and LV.

Since currently the only user of the pass is LV, I just check that the
symbolic stride information didn't change when using a cached result.

On the LV side, LoopVectorizationLegality requests the info object
corresponding to the loop from the analysis pass.  A large chunk of the
diff is due to LAI becoming a pointer from a reference.

A test will be added as part of the -analyze patch.

Also tested that with AVX, we generate identical assembly output for the
testsuite (including the external testsuite) before and after.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229893
2015-02-19 19:15:04 +00:00
Adam Nemet 436018c3ff [LoopAccesses] Cache the result of canVectorizeMemory
LAA will be an on-demand analysis pass, so we need to cache the result
of the analysis.  canVectorizeMemory is renamed to analyzeLoop which
computes the result.  canVectorizeMemory becomes the query function for
the cached result.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229892
2015-02-19 19:15:00 +00:00
Adam Nemet c922853b93 [LoopAccesses] Stash the report from the analysis rather than emitting it
The transformation passes will query this and then emit them as part of
their own report.  The currently only user LV is modified to do just
that.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229891
2015-02-19 19:14:56 +00:00
Adam Nemet f219c64723 [LoopAccesses] Make VectorizerParams global + fix for cyclic dep
As LAA is becoming a pass, we can no longer pass the params to its
constructor.  This changes the command line flags to have external
storage.  These can now be accessed both from LV and LAA.

VectorizerParams is moved out of LoopAccessInfo in order to shorten the
code to access it.

This commits also has the fix (D7731) to the break dependence cycle
between the analysis and vector libraries.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229890
2015-02-19 19:14:52 +00:00
Adam Nemet 04d4163e95 Revert "Reformat."
This reverts commit r229651.

I'd like to ultimately revert r229650 but this reformat stands in the
way.  I'll reformat the affected files once the the loop-access pass is
fully committed.

llvm-svn: 229889
2015-02-19 19:14:34 +00:00
Benjamin Kramer 1c2beed7fd LSR: Move set instead of copying. NFC.
llvm-svn: 229871
2015-02-19 17:19:43 +00:00
Benjamin Kramer ea68a944a1 Demote vectors to arrays. No functionality change.
llvm-svn: 229861
2015-02-19 15:26:17 +00:00
Michael Gottesman e5ad66f8a9 [objc-arc] Introduce the concept of RCIdentity and rename all relevant functions to use that name. NFC.
The RCIdentity root ("Reference Count Identity Root") of a value V is a
dominating value U for which retaining or releasing U is equivalent to
retaining or releasing V. In other words, ARC operations on V are
equivalent to ARC operations on U.

This is a useful property to ascertain since we can use this in the ARC
optimizer to make it easier to match up ARC operations by always mapping
ARC operations to RCIdentityRoots instead of pointers themselves. Then
we perform pairing of retains, releases which are applied to the same
RCIdentityRoot.

In general, the two ways that we see RCIdentical values in ObjC are via:

  1. PointerCasts
  2. Forwarding Calls that return their argument verbatim.

As such in ObjC, two RCIdentical pointers must always point to the same
memory location.

Previously this concept was implicit in the code and various methods
that dealt with this concept were given functional names that did not
conform to any name in the "ARC" model. This often times resulted in
code that was hard for the non-ARC acquanted to understand resulting in
unhappiness and confusion.

llvm-svn: 229796
2015-02-19 00:42:38 +00:00
Michael Gottesman dfa3e4b08a [objc-arc-contract] Rename contractRelease => tryToContractReleaseIntoStoreStrong.
NFC. Makes it clearer what this method is actually supposed to do.

llvm-svn: 229795
2015-02-19 00:42:34 +00:00
Michael Gottesman 1827973f80 [objc-arc-contract] Refactor out tryToPeepholeInstruction into its own method. NFC.
The main method of ObjCARCContract is really large and busy. By refactoring this
out, it becomes easier to reason about.

llvm-svn: 229794
2015-02-19 00:42:30 +00:00
Michael Gottesman 56bd6a077a [objc-arc-contract] Reorganize the code a bit and make the debug output easier to read.
llvm-svn: 229793
2015-02-19 00:42:27 +00:00
Sanjoy Das 11b279a832 Partial fix for bug 22589
Don't spend the entire iteration space in the scalar loop prologue if
computing the trip count overflows.  This change also gets rid of the
backedge check in the prologue loop and the extra check for
overflowing trip-count.

Differential Revision: http://reviews.llvm.org/D7715

llvm-svn: 229731
2015-02-18 19:32:25 +00:00
Andrew Kaylor 527c5dc68d Adding implementation to outline C++ catch handlers for native Windows 64 exception handling.
Differential Revision: http://reviews.llvm.org/D7363

llvm-svn: 229715
2015-02-18 18:31:51 +00:00
Mohit K. Bhakkad 518946e440 [MSan][MIPS] VarArgHelper for MIPS64
Reviewers: Reviewers: eugenis, kcc, samsonov, petarj

Subscribers: dsanders, sagar, llvm-commits

Differential Revision: http://reviews.llvm.org/D7182

llvm-svn: 229667
2015-02-18 11:41:24 +00:00
NAKAMURA Takumi a250484c4c Reformat.
llvm-svn: 229651
2015-02-18 08:36:14 +00:00
NAKAMURA Takumi fa520c5f49 Revert r229622: "[LoopAccesses] Make VectorizerParams global" and others. r229622 brought cyclic dependencies between Analysis and Vector.
r229622: "[LoopAccesses] Make VectorizerParams global"
  r229623: "[LoopAccesses] Stash the report from the analysis rather than emitting it"
  r229624: "[LoopAccesses] Cache the result of canVectorizeMemory"
  r229626: "[LoopAccesses] Create the analysis pass"
  r229628: "[LoopAccesses] Change debug messages from LV to LAA"
  r229630: "[LoopAccesses] Add canAnalyzeLoop"
  r229631: "[LoopAccesses] Add missing const to APIs in VectorizationReport"
  r229632: "[LoopAccesses] Split out LoopAccessReport from VectorizerReport"
  r229633: "[LoopAccesses] Add -analyze support"
  r229634: "[LoopAccesses] Change LAA:getInfo to return a constant reference"
  r229638: "Analysis: fix buildbots"

llvm-svn: 229650
2015-02-18 08:34:47 +00:00
Craig Topper 1348f17205 [X86] Remove AVX512 pslldq/psrldq shift intrinsics. They aren't implemented yet and when they are they should be done with shuffles like SSE2 and AVX2.
llvm-svn: 229641
2015-02-18 06:24:49 +00:00
Craig Topper b324e43aed [X86] Remove AVX2 and SSE2 pslldq and psrldq intrinsics. We can represent them in IR with vector shuffles now. All their uses have been removed from clang in favor of shuffles.
llvm-svn: 229640
2015-02-18 06:24:44 +00:00
Adam Nemet 85fd9f8d09 [LoopAccesses] Change LAA:getInfo to return a constant reference
As expected, this required a few more const-correctness fixes.

Based on Hal's feedback on D7684.

llvm-svn: 229634
2015-02-18 03:44:33 +00:00
Adam Nemet d7350dbb85 [LoopAccesses] Split out LoopAccessReport from VectorizerReport
The only difference between these two is that VectorizerReport adds a
vectorizer-specific prefix to its messages.  When LAA is used in the
vectorizer context the prefix is added when we promote the
LoopAccessReport into a VectorizerReport via one of the constructors.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229632
2015-02-18 03:44:25 +00:00
Adam Nemet 8b12afbeee [LoopAccesses] Add missing const to APIs in VectorizationReport
When I split out LoopAccessReport from this, I need to create some temps
so constness becomes necessary.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229631
2015-02-18 03:44:20 +00:00
Adam Nemet d0db4c1395 [LoopAccesses] Change debug messages from LV to LAA
Also add pass name as an argument to VectorizationReport::emitAnalysis.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229628
2015-02-18 03:43:37 +00:00
Adam Nemet d6b7e29815 [LoopAccesses] Create the analysis pass
This is a function pass that runs the analysis on demand.  The analysis
can be initiated by querying the loop access info via LAA::getInfo.  It
either returns the cached info or runs the analysis.

Symbolic stride information continues to reside outside of this analysis
pass. We may move it inside later but it's not a priority for me right
now.  The idea is that Loop Distribution won't support run-time stride
checking at least initially.

This means that when querying the analysis, symbolic stride information
can be provided optionally.  Whether stride information is used can
invalidate the cache entry and rerun the analysis.  Note that if the
loop does not have any symbolic stride, the entry should be preserved
across Loop Distribution and LV.

Since currently the only user of the pass is LV, I just check that the
symbolic stride information didn't change when using a cached result.

On the LV side, LoopVectorizationLegality requests the info object
corresponding to the loop from the analysis pass.  A large chunk of the
diff is due to LAI becoming a pointer from a reference.

A test will be added as part of the -analyze patch.

Also tested that with AVX, we generate identical assembly output for the
testsuite (including the external testsuite) before and after.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229626
2015-02-18 03:43:24 +00:00
Adam Nemet 01abb2c355 [LoopAccesses] Make blockNeedsPredication static
blockNeedsPredication is in LoopAccess in order to share it with the
vectorizer.  It's a utility needed by LoopAccess not strictly provided
by it but it's a good place to share it.  This makes the function static
so that it no longer required to create an LoopAccessInfo instance in
order to access it from LV.

This was actually causing problems because it would have required
creating LAI much earlier that LV::canVectorizeMemory().

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229625
2015-02-18 03:43:19 +00:00
Adam Nemet 3cf32ad6db [LoopAccesses] Cache the result of canVectorizeMemory
LAA will be an on-demand analysis pass, so we need to cache the result
of the analysis.  canVectorizeMemory is renamed to analyzeLoop which
computes the result.  canVectorizeMemory becomes the query function for
the cached result.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229624
2015-02-18 03:42:57 +00:00
Adam Nemet 5474be2c80 [LoopAccesses] Stash the report from the analysis rather than emitting it
The transformation passes will query this and then emit them as part of
their own report.  The currently only user LV is modified to do just
that.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229623
2015-02-18 03:42:50 +00:00
Adam Nemet 4f3ede5a01 [LoopAccesses] Make VectorizerParams global
As LAA is becoming a pass, we can no longer pass the params to its
constructor.  This changes the command line flags to have external
storage.  These can now be accessed both from LV and LAA.

VectorizerParams is moved out of LoopAccessInfo in order to shorten the
code to access it.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229622
2015-02-18 03:42:43 +00:00
Adam Nemet 30f16e1696 [LoopAccesses] Rename LoopAccessAnalysis to LoopAccessInfo
LoopAccessAnalysis will be used as the name of the pass.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229621
2015-02-18 03:42:35 +00:00
Akira Hatanaka 1defd5afbd [InstCombine] Do not insert a GEP instruction before a landingpad instruction.
InstCombiner::visitGetElementPtrInst was using getFirstNonPHI to compute the
insertion point, which caused the verifier to complain when a GEP was inserted
before a landingpad instruction. This commit fixes it to use getFirstInsertionPt
instead.

rdar://problem/19394964

llvm-svn: 229619
2015-02-18 03:30:11 +00:00
Hal Finkel 4393559621 [BDCE] Don't forget uses of root instructions seen before the instruction itself
When visiting the initial list of "root" instructions (those which must always
be alive), for those that are integer-valued (such as invokes returning an
integer), we mark their bits as (initially) all dead (we might, obviously, find
uses of those bits later, but all bits are assumed dead until proven
otherwise). Don't do so, however, if we're already seen a use of those bits by
another root instruction (such as a store).

Fixes a miscompile of the sanitizer unit tests on x86_64.

Also, add a debug line for visiting the root instructions, and remove a debug
line which tried to print instructions being removed (printing dead
instructions is dangerous, and can sometimes crash).

llvm-svn: 229618
2015-02-18 03:12:28 +00:00
Benjamin Kramer 6cd780ff21 Prefer SmallVector::append/insert over push_back loops.
Same functionality, but hoists the vector growth out of the loop.

llvm-svn: 229500
2015-02-17 15:29:18 +00:00
Elena Demikhovsky ef035bb974 Fixed a bug in store sinking.
The problem was in store-sink barrier check.

Store sink barrier should be checked for ModRef (read-write) mode.

http://llvm.org/bugs/show_bug.cgi?id=22613

llvm-svn: 229495
2015-02-17 13:10:05 +00:00
Hal Finkel 2bb61ba2fe [BDCE] Add a bit-tracking DCE pass
BDCE is a bit-tracking dead code elimination pass. It is based on ADCE (the
"aggressive DCE" pass), with the added capability to track dead bits of integer
valued instructions and remove those instructions when all of the bits are
dead.

Currently, it does not actually do this all-bits-dead removal, but rather
replaces the instruction's uses with a constant zero, and lets instcombine (and
the later run of ADCE) do the rest. Because we essentially get a run of ADCE
"for free" while tracking the dead bits, we also do what ADCE does and removes
actually-dead instructions as well (this includes instructions newly trivially
dead because all bits were dead, but not all such instructions can be removed).

The motivation for this is a case like:

int __attribute__((const)) foo(int i);
int bar(int x) {
  x |= (4 & foo(5));
  x |= (8 & foo(3));
  x |= (16 & foo(2));
  x |= (32 & foo(1));
  x |= (64 & foo(0));
  x |= (128& foo(4));
  return x >> 4;
}

As it turns out, if you order the bit-field insertions so that all of the dead
ones come last, then instcombine will remove them. However, if you pick some
other order (such as the one above), the fact that some of the calls to foo()
are useless is not locally obvious, and we don't remove them (without this
pass).

I did a quick compile-time overhead check using sqlite from the test suite
(Release+Asserts). BDCE took ~0.4% of the compilation time (making it about
twice as expensive as ADCE).

I've not looked at why yet, but we eliminate instructions due to having
all-dead bits in:
External/SPEC/CFP2006/447.dealII/447.dealII
External/SPEC/CINT2006/400.perlbench/400.perlbench
External/SPEC/CINT2006/403.gcc/403.gcc
MultiSource/Applications/ClamAV/clamscan
MultiSource/Benchmarks/7zip/7zip-benchmark

llvm-svn: 229462
2015-02-17 01:36:59 +00:00
Mehdi Amini b9a0fa4822 InstCombine: fold more cases of (fp_to_u/sint (u/sint_to_fp val))
Fixes radar 15486701.

From: Fiona Glaser <fglaser@apple.com>
llvm-svn: 229437
2015-02-16 21:47:54 +00:00
James Molloy 83570247f1 Run LICM as part of the cleanup phase from the scalar optimizer.
Things like LoopUnrolling can produce loop invariant values - make sure
we pick them up.

llvm-svn: 229419
2015-02-16 18:59:54 +00:00
Hal Finkel c64150b8f3 [ADCE] Don't indent inside an anonymous namespace
To be consistent with what clang-format does, don't add extra indentation
inside an anonymous namespace. NFC.

llvm-svn: 229412
2015-02-16 18:08:00 +00:00
James Molloy e32d806b5f [LoopReroll] Relax some assumptions a little.
We won't find a root with index zero in any loop that we are able to reroll.
However, we may find one in a non-rerollable loop, so bail gracefully instead
of failing hard.

llvm-svn: 229406
2015-02-16 17:02:00 +00:00
James Molloy 4c7deb2259 [LoopReroll] Don't crash on dead code
If a PHI has no users, don't crash; bail gracefully. This shouldn't
happen often, but we can make no guarantees that previous passes didn't leave
dead code around.

llvm-svn: 229405
2015-02-16 17:01:52 +00:00
Evgeniy Stepanov 292acab847 [asan] Reuse a common function.
Do not reimplement RoundUpToAlignment.

llvm-svn: 229397
2015-02-16 14:49:37 +00:00
Aaron Ballman f9a1897c72 Removing LLVM_DELETED_FUNCTION, as MSVC 2012 was the last reason for requiring the macro. NFC; LLVM edition.
llvm-svn: 229340
2015-02-15 22:54:22 +00:00
Hal Finkel 8626ed2eae [ADCE] Convert another loop for a range-based for
We can use a range-based for for the operands loop too; NFC.

llvm-svn: 229319
2015-02-15 15:51:25 +00:00
Hal Finkel 92fb2d3803 [ADCE] Use inst_range and range-based fors
Convert a few loops to range-based fors; NFC.

llvm-svn: 229318
2015-02-15 15:51:23 +00:00
Hal Finkel c6035cff55 [ADCE] Fix formatting of pointer types
We prefer to put the * with the variable, not with the type; NFC.

llvm-svn: 229317
2015-02-15 15:47:52 +00:00
Hal Finkel 234d8fea7b [ADCE] Fix capitalization of another local variable
Bring another local variable in compliance with our naming conventions, NFC.

llvm-svn: 229316
2015-02-15 15:45:30 +00:00
Hal Finkel 75901293a1 [ADCE] Fix capitalization of some local variables
Bring some local variables in compliance with our naming conventions, NFC.

llvm-svn: 229315
2015-02-15 15:45:28 +00:00
Elena Demikhovsky 6f5a859633 Enabled cost calculation for masked memory operations.
We already have implementation for cost calculation for
masked memory operations. I just call it from the loop vectorizer.

llvm-svn: 229290
2015-02-15 08:08:48 +00:00