Commit Graph

11846 Commits

Author SHA1 Message Date
James Molloy 6b95d8ed36 Enable noalias metadata by default and swap the order of the SLP and Loop vectorizers by default.
After some time maturing, hopefully the flags themselves will be removed.

llvm-svn: 217144
2014-09-04 13:23:08 +00:00
Tilmann Scheller faabbb5fb6 [GVN] Format variable name.
Local variables need to start with an upper case letter.

llvm-svn: 217133
2014-09-04 06:38:00 +00:00
David Majnemer 13046deef3 IndVarSimplify: Address review comments for r217102
No functional change intended, just some cleanups and comments added.

llvm-svn: 217115
2014-09-04 00:23:13 +00:00
Kostya Serebryany 3175521844 [asan] fix debug info produced for asan-coverage=2
llvm-svn: 217106
2014-09-03 23:24:18 +00:00
David Majnemer c6ab01ecca IndVarSimplify: Don't let LFTR compare against a poison value
LinearFunctionTestReplace tries to use the *next* indvar to compare
against when possible.  However, it may be the case that the calculation
for the next indvar has NUW/NSW flags and that it may only be safely
used inside the loop.  Using it in a comparison to calculate the exit
condition could result in observing poison.

This fixes PR20680.

Differential Revision: http://reviews.llvm.org/D5174

llvm-svn: 217102
2014-09-03 23:03:18 +00:00
Kostya Serebryany 351b078b6d [asan] add -asan-coverage=3: instrument all blocks and critical edges.
llvm-svn: 217098
2014-09-03 22:37:37 +00:00
Benjamin Kramer 89854ebe8e Make some helpers static or move into the llvm namespace.
llvm-svn: 217077
2014-09-03 21:04:12 +00:00
Sanjay Patel 9433a28845 Preserve IR flags (nsw, nuw, exact, fast-math) in SLP vectorizer (PR20802).
The SLP vectorizer should propagate IR-level optimization hints/flags (nsw, nuw, exact, fast-math)
when converting scalar instructions into vectors. But this isn't a simple copy - we need to take
the intersection (the logical 'and') of the sets of flags on the scalars.

The solution is further complicated because we can have non-uniform (non-SIMD) vector ops after:
http://reviews.llvm.org/D4015
http://llvm.org/viewvc/llvm-project?view=revision&revision=211339

The vast majority of changed files are existing tests that were not propagating IR flags, but I've
also added a new test file for focused testing of IR flag possibilities.

Differential Revision: http://reviews.llvm.org/D5172

llvm-svn: 217051
2014-09-03 17:40:30 +00:00
Sanjay Patel a982d992f0 Change name of copyFlags() to copyIRFlags(). Add convenience method for logical 'and' of all flags. NFC.
Adding 'IR' to the names in an attempt to be less ambiguous about the flags we're dealing with here.

The 'and' method is needed by the SLPVectorizer (PR20802) and possibly other passes.

llvm-svn: 217004
2014-09-03 01:06:50 +00:00
Hal Finkel 445dda5c4a Add pass-manager flags to use CFL AA
Add -use-cfl-aa (and -use-cfl-aa-in-codegen) to add CFL AA in the default pass
managers (for easy testing).

llvm-svn: 216978
2014-09-02 22:12:54 +00:00
Kostya Serebryany ad23852ac3 [asan] Assign a low branch weight to ASan's slow path, patch by Jonas Wagner. This speeds up asan (at least on SPEC) by 1%-5% or more. Also fix lint in dfsan.
llvm-svn: 216972
2014-09-02 21:46:51 +00:00
Yi Jiang 77a609b556 Generate extract for in-tree uses if the use is scalar operand in vectorized instruction. radar://18144665
llvm-svn: 216946
2014-09-02 21:00:39 +00:00
David Blaikie 15913f46b2 unique_ptrify the result of SpecialCaseList::create
llvm-svn: 216925
2014-09-02 18:13:54 +00:00
David Majnemer 49428105aa LICM: Don't crash when an instruction is used by an unreachable BB
Summary:
BBs might contain non-LCSSA'd values after the LCSSA pass is run if they
are unreachable from the entry block.

Normally, the users of the instruction would be PHIs but the unreachable
BBs have normal users; rewrite their uses to be undef values.

An alternative fix could involve fixing this at LCSSA but that would
require this invariant to hold after subsequent transforms.  If a BB
created an unreachable block, they would be in violation of this.

This fixes PR19798.

Differential Revision: http://reviews.llvm.org/D5146

llvm-svn: 216911
2014-09-02 16:22:00 +00:00
David Majnemer d4cffcf073 SROA: Don't insert instructions before a PHI
SROA may decide that it needs to insert a bitcast and would set it's
insertion point before a PHI.  This will create an invalid module
right quick.

Instead, choose the first insertion point in the basic block that holds
our PHI.

This fixes PR20822.

Differential Revision: http://reviews.llvm.org/D5141

llvm-svn: 216891
2014-09-01 21:20:14 +00:00
David Majnemer d2df50196f Revert "Revert two GEP-related InstCombine commits"
This reverts commit r216698 which reverted r216523 and r216598.

We would attempt to perform the transformation even if the match()
failed because, as a side effect, it would set V.  This would trick us
into believing that we correctly found a place to correctly apply the
transform.

An additional test case was added to getelementptr.ll so that we might
not regress in the future.

llvm-svn: 216890
2014-09-01 21:10:02 +00:00
Sanjay Patel 5ad239e15a Add a convenience method to copy wrapping, exact, and fast-math flags (NFC).
The loop vectorizer preserves wrapping, exact, and fast-math properties of scalar instructions.
This patch adds a convenience method to make that operation easier because we need to do this
in the loop vectorizer, SLP vectorizer, and possibly other places.

Although this is a 'no functional change' patch, I've added a testcase to verify that the exact
flag is preserved by the loop vectorizer. The wrapping and fast-math flags are already checked
in existing testcases.

Differential Revision: http://reviews.llvm.org/D5138

llvm-svn: 216886
2014-09-01 18:44:57 +00:00
Chandler Carruth 18cee1defc Fix a really bad miscompile introduced in r216865 - the else-if logic
chain became completely broken here as *all* intrinsic users ended up
being skipped, and the ones that seemed to be singled out were actually
the exact wrong set.

This is a great example of why long else-if chains can be easily
confusing. Switch the entire code to use early exits and early continues
to have simpler (and more importantly, correct) logic here, as well as
fixing the reversed logic for detecting and continuing on lifetime
intrinsics.

I've also significantly cleaned up the test case and added another test
case demonstrating an example where the optimization is not (trivially)
safe to perform.

llvm-svn: 216871
2014-09-01 10:09:18 +00:00
Renato Golin 86a6c3f269 Small refactor on VectorizerHint for deduplication
Previously, the hint mechanism relied on clean up passes to remove redundant
metadata, which still showed up if running opt at low levels of optimization.
That also has shown that multiple nodes of the same type, but with different
values could still coexist, even if temporary, and cause confusion if the
next pass got the wrong value.

This patch makes sure that, if metadata already exists in a loop, the hint
mechanism will never append a new node, but always replace the existing one.
It also enhances the algorithm to cope with more metadata types in the future
by just adding a new type, not a lot of code.

Re-applying again due to MSVC 2013 being minimum requirement, and this patch
having C++11 that MSVC 2012 didn't support.

Fixes PR20655.

llvm-svn: 216870
2014-09-01 10:00:17 +00:00
Hal Finkel 0c083024f0 Feed AA to the inliner and use AA->getModRefBehavior in AddAliasScopeMetadata
This feeds AA through the IFI structure into the inliner so that
AddAliasScopeMetadata can use AA->getModRefBehavior to figure out which
functions only access their arguments (instead of just hard-coding some
knowledge of memory intrinsics). Most of the information is only available from
BasicAA; this is important for preserving alias scoping information for
target-specific intrinsics when doing the noalias parameter attribute to
metadata conversion.

llvm-svn: 216866
2014-09-01 09:01:39 +00:00
Nick Lewycky fc243d54d2 Ignore lifetime intrinsics in use list for MemCpyOptimizer. Patch by Luqman Aden, review by Hal Finkel.
llvm-svn: 216865
2014-09-01 06:03:11 +00:00
Hal Finkel cbb85f249e Fix AddAliasScopeMetadata again - alias.scope must be a complete description
I thought that I had fixed this problem in r216818, but I did not do a very
good job. The underlying issue is that when we add alias.scope metadata we are
asserting that this metadata completely describes the aliasing relationships
within the current aliasing scope domain, and so in the context of translating
noalias argument attributes, the pointers must all be based on noalias
arguments (as underlying objects) and have no other kind of underlying object.
In r216818 excluding appropriate accesses from getting alias.scope metadata is
done by looking for underlying objects that are not identified function-local
objects -- but that's wrong because allocas, etc. are also function-local
objects and we need to explicitly check that all underlying objects are the
noalias arguments for which we're adding metadata aliasing scopes.

This fixes the underlying-object check for adding alias.scope metadata, and
does some refactoring of the related capture-checking eligibility logic (and
adds more comments; hopefully making everything a bit clearer).

Fixes self-hosting on x86_64 with -mllvm -enable-noalias-to-md-conversion (the
feature is still disabled by default).

llvm-svn: 216863
2014-09-01 04:26:40 +00:00
Craig Topper 6dc4a8bc2c Fix some cases where StringRef was being passed by const reference. Remove const from some other StringRefs since its implicitly const already.
llvm-svn: 216820
2014-08-30 16:48:02 +00:00
Hal Finkel a3708df41a Fix AddAliasScopeMetadata to not add scopes when deriving from unknown pointers
The previous implementation of AddAliasScopeMetadata, which adds noalias
metadata to preserve noalias parameter attribute information when inlining had
a flaw: it would add alias.scope metadata to accesses which might have been
derived from pointers other than noalias function parameters. This was
incorrect because even some access known not to alias with all noalias function
parameters could easily alias with an access derived from some other pointer.
Instead, when deriving from some unknown pointer, we cannot add alias.scope
metadata at all. This fixes a miscompile of the test-suite's tramp3d-v4.
Furthermore, we cannot add alias.scope to functions unless we know they
access only argument-derived pointers (currently, we know this only for
memory intrinsics).

Also, we fix a theoretical problem with using the NoCapture attribute to skip
the capture check. This is incorrect (as explained in the comment added), but
would not matter in any code generated by Clang because we get only inferred
nocapture attributes in Clang-generated IR.

This functionality is not yet enabled by default.

llvm-svn: 216818
2014-08-30 12:48:33 +00:00
David Majnemer 492e612e01 InstCombine: Respect recursion depth in visitUDivOperand
llvm-svn: 216817
2014-08-30 09:19:05 +00:00
David Majnemer 5e96f1b4c8 InstCombine: Try harder to combine icmp instructions
consider: (and (icmp X, Y), (and Z, (icmp A, B)))
It may be possible to combine (icmp X, Y) with (icmp A, B).
If we successfully combine, create an 'and' instruction with Z.

This fixes PR20814.

N.B. There is room for improvement after this change but I'm not
convinced it's worth chasing yet.

llvm-svn: 216814
2014-08-30 06:18:20 +00:00
Hal Finkel 2d3d6da44b Fix a typo in AddAliasScopeMetadata
llvm-svn: 216741
2014-08-29 16:33:41 +00:00
David Majnemer 400e725bde Revert two GEP-related InstCombine commits
This reverts commit r216523 and r216598; people have reported
regressions.

llvm-svn: 216698
2014-08-29 00:06:43 +00:00
Reid Kleckner febb279c9c Don't promote byval pointer arguments when padding matters
Don't promote byval pointer arguments when when their size in bits is
not equal to their alloc size in bits. This can happen for x86_fp80,
where the size in bits is 80 but the alloca size in bits in 128.
Promoting these types can break passing unions of x86_fp80s and other
types.

Patch by Thomas Jablin!

Reviewed By: rnk

Differential Revision: http://reviews.llvm.org/D5057

llvm-svn: 216693
2014-08-28 22:42:00 +00:00
David Majnemer 074052b623 InstCombine: Remove redundant combines
InstSimplify already handles icmp (X+Y), X (and things like it)
appropriately.  The first thing that InstCombine does is run
InstSimplify on the instruction.

llvm-svn: 216659
2014-08-28 10:08:37 +00:00
Erik Eckstein 8354cfaf95 Fix: SLPVectorizer tried to move an instruction which was replaced by a vector instruction.
For a detailed description of the problem see the comment in the test file.
The problematic moveBefore() calls are not required anymore because the new
scheduling algorithm ensures a correct ordering anyway.

llvm-svn: 216656
2014-08-28 07:04:02 +00:00
David Majnemer 76d06bc613 InstSimplify: Move a transform from InstCombine to InstSimplify
Several combines involving icmp (shl C2, %X) C1 can be simplified
without introducing any new instructions.  Move them to InstSimplify;
while we are at it, make them more powerful.

llvm-svn: 216642
2014-08-28 03:34:28 +00:00
David Majnemer 22ccfc4484 InstCombine: Combine gep X, (Y-X) to Y
We try to perform this transform in InstSimplify but we aren't always
able to.  Sometimes, we need to insert a bitcast if X and Y don't have
the same time.

llvm-svn: 216598
2014-08-27 20:08:37 +00:00
Michael Zolotukhin 5dc466b863 [SLP] Re-enable vectorization of GEP expressions (re-apply r210342 with a fix).
llvm-svn: 216549
2014-08-27 15:01:18 +00:00
Craig Topper e1d1294853 Simplify creation of a bunch of ArrayRefs by using None, makeArrayRef or just letting them be implicitly created.
llvm-svn: 216525
2014-08-27 05:25:25 +00:00
Craig Topper 3af9722529 Fix some cases were ArrayRefs were being passed by reference. Also remove 'const' from some other ArrayRef uses since its implicitly const already.
llvm-svn: 216524
2014-08-27 05:25:00 +00:00
David Majnemer 54e97d5dc0 InstCombine: Optimize GEP's involving ptrtoint better
We supported transforming:
(gep i8* X, -(ptrtoint Y))

to:
(inttoptr (sub (ptrtoint X), (ptrtoint Y)))

However, this only fired if 'X' had type i8*.  Generalize this to
support various types of different sizes.  This results in much better
CodeGen, especially for pointers to packed structs.

llvm-svn: 216523
2014-08-27 05:16:04 +00:00
Joerg Sonnenberger cb5674b9c2 Revert r210342 and r210343, add test case for the crasher.
PR 20642.

llvm-svn: 216475
2014-08-26 19:06:41 +00:00
Dinesh Dwivedi 4919bbe29d This patch enables SimplifyUsingDistributiveLaws() to handle following pattens.
(X >> Z) & (Y >> Z)  -> (X&Y) >> Z  for all shifts.
(X >> Z) | (Y >> Z)  -> (X|Y) >> Z  for all shifts.
(X >> Z) ^ (Y >> Z)  -> (X^Y) >> Z  for all shifts.

These patterns were previously handled separately in visitAnd()/visitOr()/visitXor().

Differential Revision: http://reviews.llvm.org/D4951

llvm-svn: 216443
2014-08-26 08:53:32 +00:00
Reid Kleckner 3715461b48 musttail: Don't eliminate varargs packs if there is a forwarding call
Also clean up and beef up this grep test for the feature.

llvm-svn: 216425
2014-08-26 00:59:51 +00:00
Sanjay Patel 4e31cdabd1 fix typos in comments
llvm-svn: 216424
2014-08-26 00:59:15 +00:00
Reid Kleckner e6e88f99b3 ArgPromotion: Don't touch variadic functions
Adding, removing, or changing non-pack parameters can change the ABI
classification of pack parameters. Clang and other frontends encode the
classification in the IR of the call site, but the callee side
determines it dynamically based on the number of registers consumed so
far. Changing the prototype affects the number of registers consumed
would break such code.

Dead argument elimination performs a similar task and already has a
similar check to avoid this problem.

Patch by Thomas Jablin!

llvm-svn: 216421
2014-08-25 23:58:48 +00:00
Rafael Espindola 3fd1e9933f Modernize raw_fd_ostream's constructor a bit.
Take a StringRef instead of a "const char *".
Take a "std::error_code &" instead of a "std::string &" for error.

A create static method would be even better, but this patch is already a bit too
big.

llvm-svn: 216393
2014-08-25 18:16:47 +00:00
Bruno Cardoso Lopes e2a1fa35df Remove dangling initializers in GlobalDCE
GlobalDCE deletes global vars and updates their initializers to nullptr
while leaving underlying constants to be cleaned up later by its uses.
The clean up may never happen, fix this by forcing it every time it's
safe to destroy constants.

Final patch by Rafael Espindola
http://reviews.llvm.org/D4931

<rdar://problem/17523868>

llvm-svn: 216390
2014-08-25 17:51:14 +00:00
Stepan Dyatkovskiy c90308bf83 MergeFunctions, tiny refactoring:
cmpAPFloat has been renamed to cmpAPFloats (multiple form).

llvm-svn: 216376
2014-08-25 08:22:46 +00:00
Stepan Dyatkovskiy 7f895c1184 MergeFunctions, tiny refactoring:
cmpAPInt has been renamed to cmpAPInts (multiple form).

llvm-svn: 216375
2014-08-25 08:19:50 +00:00
Stepan Dyatkovskiy 0b765dee6e MergeFunctions, tiny refactoring:
cmpType has been renamed to cmpTypes (multiple form).

llvm-svn: 216374
2014-08-25 08:16:39 +00:00
Stepan Dyatkovskiy 016daddc52 MergeFunctions, tiny refactoring:
cmpGEP has been renamed to cmpGEPs (multiple form).

llvm-svn: 216373
2014-08-25 08:12:45 +00:00
Karthik Bhat 7f33ff7dea Allow vectorization of division by uniform power of 2.
This patch adds support to recognize division by uniform power of 2 and modifies the cost table to vectorize division by uniform power of 2 whenever possible.
Updates Cost model for Loop and SLP Vectorizer.The cost table is currently only updated for X86 backend.
Thanks to Hal, Andrea, Sanjay for the review. (http://reviews.llvm.org/D4971)

llvm-svn: 216371
2014-08-25 04:56:54 +00:00
Craig Topper 4627679cec Use range based for loops to avoid needing to re-mention SmallPtrSet size.
llvm-svn: 216351
2014-08-24 23:23:06 +00:00