llvm-project

Author	SHA1	Message	Date
Easwaran Raman	94edaaaefb	Revert r271728 as it breaks Windows build llvm-svn: 271738	2016-06-03 21:14:26 +00:00
Easwaran Raman	d142050f3a	Analysis pass to access profile summary info Differential Revision: http://reviews.llvm.org/D20648 llvm-svn: 271728	2016-06-03 20:37:19 +00:00
Sanjay Patel	dba8b4c04d	transform obscured FP sign bit ops into a fabs/fneg using TLI hook This is effectively a revert of: http://reviews.llvm.org/rL249702 - [InstCombine] transform masking off of an FP sign bit into a fabs() intrinsic call (PR24886) and: http://reviews.llvm.org/rL249701 - [ValueTracking] teach computeKnownBits that a fabs() clears sign bits and a reimplementation as a DAG combine for targets that have IEEE754-compliant fabs/fneg instructions. This is intended to resolve the objections raised on the dev list: http://lists.llvm.org/pipermail/llvm-dev/2016-April/098154.html and: https://llvm.org/bugs/show_bug.cgi?id=24886#c4 In the interest of patch minimalism, I've only partly enabled AArch64. PowerPC, MIPS, x86 and others can enable later. Differential Revision: http://reviews.llvm.org/D19391 llvm-svn: 271573	2016-06-02 20:01:37 +00:00
Sanjoy Das	48cad71243	Inline isDereferenceableFromAttribute; NFC Now that `Value::getPointerDereferenceableBytes` looks beyond just attributes, the name `isDereferenceableFromAttribute` is misleading. Just inline the function, since it is small and only used once. llvm-svn: 271456	2016-06-02 00:52:53 +00:00
Sanjoy Das	00953cbe1d	Remove Value::isPointerDereferenceable; NFCI ... and merge into `Value::getPointerDereferenceableBytes`. This was suggested by Artur Pilipenko in D20764 -- since we no longer allow loads of unsized types, there is no need anymore to have this special logic. llvm-svn: 271455	2016-06-02 00:52:48 +00:00
Geoff Berry	0c09517867	[SCEV] Keep SCEVExpander insert points consistent. Summary: Make sure that the SCEVExpander Builder insert point and any saved/restored insert points are kept consistent (i.e. their Instruction and BasicBlock match) when moving instructions in SCEVExpander. This fixes an issue triggered by http://reviews.llvm.org/D18001 [LSR] Create fewer redundant instructions. Test case will be added in reapply commit of above change: http://reviews.llvm.org/D18480 Reapply [LSR] Create fewer redundant instructions. Reviewers: sanjoy Subscribers: mzolotukhin, sanjoy, qcolombet, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D20703 llvm-svn: 271424	2016-06-01 20:03:09 +00:00
Daniel Berlin	73694bb92b	Revert "Claim NoAlias if two GEPs index different fields of the same struct" This reverts commit 2d5d6493f43eb68493a3852b8c226ac9fafdc7eb. llvm-svn: 271422	2016-06-01 18:55:32 +00:00
George Burgess IV	18b83fe6cf	[CFLAA] Recognize builtin allocation functions. This patch extends CFLAA to recognize allocation functions such as malloc, free, etc, so we can treat them more aggressively. Patch by Jia Chen. Differential Revision: http://reviews.llvm.org/D20776 llvm-svn: 271421	2016-06-01 18:39:54 +00:00
Daniel Berlin	e846c9dc52	Claim NoAlias if two GEPs index different fields of the same struct Patch by Taewook Oh Summary: Patch for Bug 27478. Make BasicAliasAnalysis claim NoAlias if two GEPs index different fields of the same structure. Reviewers: hfinkel, dberlin Subscribers: dberlin, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D20665 llvm-svn: 271415	2016-06-01 18:12:01 +00:00
Sanjoy Das	10df497a1f	Reduce dependence on pointee types when deducing dereferenceability Summary: Change some of the internal interfaces in Loads.cpp to keep track of the number of bytes we're trying to prove dereferenceable using an explicit `Size` parameter. Before this, the `Size` parameter was implicitly inferred from the pointee type of the pointer whose dereferenceability we were trying to prove, causing us to be conservative around bitcasts. This was unfortunate since bitcast instructions are no-ops and should never break optimizations. With an explicit `Size` parameter, we're more precise (as shown in the test cases), and the code is simpler. We should eventually move towards a `DerefQuery` struct that groups together a base pointer, an offset, a size and an alignment; but this patch is a first step. Reviewers: apilipenko, dblaikie, hfinkel, reames Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D20764 llvm-svn: 271406	2016-06-01 16:47:45 +00:00
George Burgess IV	a880146925	[CFLAA] Don't link GEP pointers to GEP indices. Code like the following is considered broken, and doesn't need to be supported by our AA magicks: void getFoo(int *P) { int *PAlias = (int *)((char *)NULL + (uintptr_t)P); } This patch makes CFLAA drop support for code like this. Patch by Jia Chen. Differential Revision: http://reviews.llvm.org/D20775 llvm-svn: 271322	2016-05-31 19:55:05 +00:00
Saleem Abdulrasool	d2f705ddf9	X86: permit using SjLj EH on x86 targets as an option This adds support to the backend to actually support SjLj EH as an exception model. This is NOT the default model, and requires explicitly opting into it from the frontend. GCC supports this model and for MinGW can still be enabled via the `--using-sjlj-exceptions` option. Addresses PR27749! llvm-svn: 271244	2016-05-31 01:48:07 +00:00
Sanjoy Das	f857081c8c	[SCEV] Consolidate comments; NFC Consolidate documentation by removing comments from the .cpp file where the comments in the .cpp file were copy-pasted from the header. llvm-svn: 271157	2016-05-29 00:38:22 +00:00
Sanjoy Das	108fcf2e2c	[SCEV] Rename functions to LLVM style; NFC llvm-svn: 271156	2016-05-29 00:38:00 +00:00
Sanjoy Das	f49ca52b9d	[SCEV] See through op.with.overflow intrinsics (re-apply) Summary: This change teaches SCEV to reduce `(extractvalue 0 (op.with.overflow X Y))` into `op X Y` (with a no-wrap tag if possible). This was first checked in at r265912 but reverted in r265950 because it exposed some issues around how SCEV handled post-inc add recurrences. Those issues have now been fixed. Reviewers: atrick, regehr Subscribers: mcrosier, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D18684 llvm-svn: 271152	2016-05-29 00:34:42 +00:00
Sanjoy Das	7e4a64167d	[SCEV] Don't always add no-wrap flags to post-inc add recs Fixes PR27315. The post-inc version of an add recurrence needs to "follow the same rules" as a normal add or subtract expression. Otherwise we miscompile programs like ``` int main() { int a = 0; unsigned a_u = 0; volatile long last_value; do { a_u += 3; last_value = (long) ((int) a_u); if (will_add_overflow(a, 3)) { // Leave, and don't actually do the increment, so no UB. printf("last_value = %ld\n", last_value); exit(0); } a += 3; } while (a != 46); return 0; } ``` This patch changes SCEV to put no-wrap flags on post-inc add recurrences only when the poison from a potential overflow will go ahead to cause undefined behavior. To avoid regressing performance too much, I've assumed infinite loops without side effects is undefined behavior to prove poison<->UB equivalence in more cases. This isn't ideal, but is not new to LLVM as a whole, and far better than the situation I'm trying to fix. llvm-svn: 271151	2016-05-29 00:32:17 +00:00
Sanjoy Das	70c2bbd29c	[ValueTracking] ICmp instructions propagate poison This is a stripped down version of D19211, leaving out the questionable "branching in poison is UB" bit. llvm-svn: 271150	2016-05-29 00:31:18 +00:00
Michael Zolotukhin	d69cd1e086	[LoopUnrollAnalyzer] Add a comment to visitCastInst. llvm-svn: 271086	2016-05-28 01:40:14 +00:00
Benjamin Kramer	82de7d323d	Apply clang-tidy's misc-move-constructor-init throughout LLVM. No functionality change intended, maybe a tiny performance improvement. llvm-svn: 270997	2016-05-27 14:27:24 +00:00
Michael Zolotukhin	15e745133e	[LoopUnrollAnalyzer] Bail out instead of dying with assert when facing huge index. This fixes PR27902. llvm-svn: 270946	2016-05-27 00:55:16 +00:00
Michael Kuperstein	ae21491819	[BasicAA] Extend inbound GEP negative offset logic to GlobalVariables r270777 improved the precision of alloca vs. inbounds GEP alias queries: if we have (a) an inbounds GEP and (b) a pointer based on an alloca, and the beginning of the object the GEP points to would have a negative offset with respect to the alloca, then the GEP can not alias pointer (b). This makes the same logic fire when (b) is based on a GlobalVariable instead of an alloca. Differential Revision: http://reviews.llvm.org/D20652 llvm-svn: 270893	2016-05-26 19:30:49 +00:00
David Majnemer	7f32420ed5	[CaptureTracking] Volatile operations capture their memory location The memory location that corresponds to a volatile operation is very special. They are observed by the machine in ways which we cannot reason about. Differential Revision: http://reviews.llvm.org/D20555 llvm-svn: 270879	2016-05-26 17:36:22 +00:00
Peter Collingbourne	b9aa1f4a03	MemorySSA: Revert r269678 and r268068; replace with special casing in MemorySSA. It turns out that too many passes are relying on alias analysis results for control dependencies. Until we fix that by introducing a more accurate modelling of control dependencies, special case assume in MemorySSA instead. Also introduce tests to ensure we don't regress the FunctionAttrs or LICM passes. Differential Revision: http://reviews.llvm.org/D20658 llvm-svn: 270823	2016-05-26 04:58:46 +00:00
Davide Italiano	bd543d0a0b	[LazyValueInfo] Simplify `return after else`. NFCI. llvm-svn: 270779	2016-05-25 22:29:34 +00:00
Michael Kuperstein	82069c44ca	[BasicAA] Improve precision of alloca vs. inbounds GEP alias queries If we have (a) a GEP and (b) a pointer based on an alloca, and the beginning of the object the GEP points to would have a negative offset with respect to the alloca, then the GEP can not alias pointer (b). For example, consider code like: struct { int f0, int f1, ...} foo; ... foo alloca; foo random = bar(alloca); int f0 = &alloca.f0 int f1 = &random->f1; Which is lowered, approximately, to: %alloca = alloca %struct.foo %random = call %struct.foo @random(%struct.foo* %alloca) %f0 = getelementptr inbounds %struct, %struct.foo* %alloca, i32 0, i32 0 %f1 = getelementptr inbounds %struct, %struct.foo* %random, i32 0, i32 1 Assume %f1 and %f0 alias. Then %f1 would point into the object allocated by %alloca. Since the %f1 GEP is inbounds, that means %random must also point into the same object. But since %f0 points to the beginning of %alloca, the highest %f1 can be is (%alloca + 3). This means %random can not be higher than (%alloca - 1), and so is not inbounds, a contradiction. Differential Revision: http://reviews.llvm.org/D20495 llvm-svn: 270777	2016-05-25 22:23:08 +00:00
Hal Finkel	2f6886844e	Look for a loop's starting location in the llvm.loop metadata Getting accurate locations for loops is important, because those locations are used by the frontend to generate optimization remarks. Currently, optimization remarks for loops often appear on the wrong line, often the first line of the loop body instead of the loop itself. This is confusing because that line might itself be another loop, or might be somewhere else completely if the body was an inlined function call. This happens because of the way we find the loop's starting location. First, we look for a preheader, and if we find one, and its terminator has a debug location, then we use that. Otherwise, we look for a location on an instruction in the loop header. The fallback heuristic is not bad, but will almost always find the beginning of the body, and not the loop statement itself. The preheader location search often fails because there's often not a preheader, and even when there is a preheader, depending on how it was formed, it sometimes carries the location of some preceding code. I don't see any good theoretical way to fix this problem. On the other hand, this seems like a straightforward solution: Put the debug location in the loop's llvm.loop metadata. A companion Clang patch will cause Clang to insert llvm.loop metadata with appropriate locations when generating debugging information. With these changes, our loop remarks have much more accurate locations. Differential Revision: http://reviews.llvm.org/D19738 llvm-svn: 270771	2016-05-25 21:42:37 +00:00
Ahmed Bougacha	201b97f550	[TLI] Also cover Linux 64 libfunc (stat64, ...) prototype checking. My script missed those in r270750. llvm-svn: 270763	2016-05-25 21:16:33 +00:00
Ahmed Bougacha	1fe3f1ca50	[TLI] Fix NumParams==0 prototype checking typo. There was a typo in r267758. It caused invalid accesses when given something like "void @free(...)", as NumParams == 0, and we then try to look at the 0th parameter. Turns out, most of these were untested; add both attribute and missing-prototype checks for all libc libfuncs. Differential Revision: http://reviews.llvm.org/D20543 llvm-svn: 270750	2016-05-25 20:22:45 +00:00
Oleg Ranevskyy	eb4eccae5c	[SCEV] No-wrap flags are not propagated when folding "{S,+,X}+T ==> {S+T,+,X}" Summary: Description This makes `WidenIV::widenIVUse` (IndVarSimplify.cpp) fail to widen narrow IV uses in some cases. The latter affects IndVarSimplify which may not eliminate narrow IV's when there actually exists such a possibility, thereby producing ineffective code. When `WidenIV::widenIVUse` gets a NarrowUse such as `{(-2 + %inc.lcssa),+,1}<nsw><%for.body3>`, it first tries to get a wide recurrence for it via the `getWideRecurrence` call. `getWideRecurrence` returns recurrence like this: `{(sext i32 (-2 + %inc.lcssa) to i64),+,1}<nsw><%for.body3>`. Then a wide use operation is generated by `cloneIVUser`. The generated wide use is evaluated to `{(-2 + (sext i32 %inc.lcssa to i64))<nsw>,+,1}<nsw><%for.body3>`, which is different from the `getWideRecurrence` result. `cloneIVUser` sees the difference and returns nullptr. This patch also fixes the broken LLVM tests by adding missing <nsw> entries introduced by the correction. Minimal reproducer: ``` int foo(int a, int b, int c); int baz(); void bar() { int arr[20]; int i = 0; for (i = 0; i < 4; ++i) arr[i] = baz(); for (; i < 20; ++i) arr[i] = foo(arr[i - 4], arr[i - 3], arr[i - 2]); } ``` Clang command line: ``` clang++ -mllvm -debug -S -emit-llvm -O3 --target=aarch64-linux-elf test.cpp -o test.ir ``` Expected result: The ` -mllvm -debug` log shows that all the IV's for the second `for` loop have been eliminated. Reviewers: sanjoy Subscribers: atrick, asl, aemerson, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D20058 llvm-svn: 270695	2016-05-25 13:01:33 +00:00
Michael Zolotukhin	7216dd4668	[LoopUnrollAnalyzer] Fix a crash in UnrolledInstAnalyzer::visitCastInst. This fixes PR27847. Now for real. llvm-svn: 270629	2016-05-24 22:59:58 +00:00
Sanjay Patel	23019d1006	[ValueTracking, InstSimplify] extend isKnownNonZero() to handle vector constants Similar in spirit to D20497 : If all elements of a constant vector are known non-zero, then we can say that the whole vector is known non-zero. It seems like we could extend this to FP scalar/vector too, but isKnownNonZero() says it only works for integers and pointers for now. Differential Revision: http://reviews.llvm.org/D20544 llvm-svn: 270562	2016-05-24 14:18:49 +00:00
Michael Zolotukhin	3898b2b587	[LoopUnrollAnalyzer] Fix a crash in UnrolledInstAnalyzer::visitCastInst. This fixes PR27847. llvm-svn: 270517	2016-05-24 00:51:01 +00:00
Sanjay Patel	e8dc090a2b	fix formatting; NFC llvm-svn: 270465	2016-05-23 17:57:54 +00:00
Sanjay Patel	8ec7e7c216	use 'auto' with 'dyn_cast'; fix formatting; NFC llvm-svn: 270370	2016-05-22 16:07:20 +00:00
Sanjay Patel	e2e89ef936	[ValueTracking, InstCombine] extend isKnownToBeAPowerOfTwo() to handle vector splat constants We could try harder to handle non-splat vector constants too, but that seems much rarer to me. Note that the div test isn't resolved because there's a check for isIntegerTy() guarding that transform. Differential Revision: http://reviews.llvm.org/D20497 llvm-svn: 270369	2016-05-22 15:41:53 +00:00
Michael Kuperstein	c6de57e47a	Revert r270268 due to unused variable warnings. llvm-svn: 270272	2016-05-20 20:55:51 +00:00
Michael Kuperstein	f45e5b58b8	[BasicAA] Turn DecomposeGEPExpression runtime checks into asserts. When it has a DataLayout, DecomposeGEPExpression() should return the same object as GetUnderlyingObject(). Per the FIXME, it currently always has a DL, so the runtime check is redundant and can become an assert. llvm-svn: 270268	2016-05-20 20:26:50 +00:00
Easwaran Raman	bb578ef0dd	Allow -inline-threshold to override default threshold. Before r257832, the threshold used by SimpleInliner was explicitly specified or generated from opt levels and passed to the base class Inliner's constructor. There, it was first overridden by explicitly specified -inline-threshold. The refactoring in r257832 did not preserve this behavior for all opt levels. This change brings back the original behavior. Differential Revision: http://reviews.llvm.org/D20452 llvm-svn: 270153	2016-05-19 23:02:09 +00:00
Matthew Simpson	6feebe9847	[LAA] Check independence of strided accesses before forward case This patch changes the order in which we attempt to prove the independence of strided accesses. We previously did this after we knew the dependence distance was positive. With this change, we check for independence before handling the negative distance case. The patch prevents LAA from reporting forward dependences for independent strided accesses. This change was requested in the review of D19984. llvm-svn: 270072	2016-05-19 15:37:19 +00:00
Sanjoy Das	f5d40d5350	[SCEV] Be more aggressive in proving NUW ... for AddRec's in loops for which SCEV is unable to compute a max tripcount. This is the NUW variant of r269211 and fixes PR27691. (Note: PR27691 is not a correctness or stability bug, it was created to track a pending task). llvm-svn: 269790	2016-05-17 17:51:14 +00:00
Geoff Berry	9b4ff336ce	[BasicAA] Update comments based on feedback from hfinkel. NFCI. Original change Hal's comments were based on: http://reviews.llvm.org/D19730 llvm-svn: 269678	2016-05-16 18:51:54 +00:00
Matthew Simpson	37ec5f914e	[LAA] Rename forwarding conflict detection option (NFC) This patch renames the option enabling the store-to-load forwarding conflict detection optimization. This change was requested in the review of D20241. llvm-svn: 269668	2016-05-16 17:00:56 +00:00
Adam Nemet	884d313b7f	[LAA] Comment couldPreventStoreLoadForward. NFC Also s/Cycles/Iters/ in NumCyclesForStoreLoadThroughMemory to make it clear that this is not about clock cycles but loop cycles/iterations. llvm-svn: 269667	2016-05-16 16:57:47 +00:00
Adam Nemet	9b5852aeb2	[LAA] clang-format the function couldPreventStoreLoadForward. NFC llvm-svn: 269666	2016-05-16 16:57:42 +00:00
Matthew Simpson	a250dc9f11	[LAA] Add option to disable conflict detection (NFC) llvm-svn: 269654	2016-05-16 14:14:49 +00:00
Adam Nemet	c62e554e9a	[LAA] Include MaxSafeDepDistBytes in the analysis print-out llvm-svn: 269508	2016-05-13 22:49:13 +00:00
Adam Nemet	4ad38b63d5	[LAA] Prepare the code to print more things in the summary. NFC llvm-svn: 269507	2016-05-13 22:49:09 +00:00
Michael Zolotukhin	963a6d9c69	Revert "Revert "[Unroll] Implement a conservative and monotonically increasing cost tracking system during the full unroll heuristic analysis that avoids counting any instruction cost until that instruction becomes "live" through a side-effect or use outside the..."" This reverts commit r269395. Try to reapply with a fix from chapuni. llvm-svn: 269486	2016-05-13 21:23:25 +00:00
Silviu Baranga	24dbd2e760	[scan-build] fix warnings emitted on LLVM Analysis code base Fix "Logic error" warnings of the type "Called C++ object pointer is null" reported by Clang Static Analyzer on the following files: lib/Analysis/ScalarEvolution.cpp, lib/Analysis/LoopInfo.cpp. Patch by Apelete Seketeli! llvm-svn: 269424	2016-05-13 14:54:50 +00:00
Michael Zolotukhin	9be3b8b9bb	Revert "[Unroll] Implement a conservative and monotonically increasing cost tracking system during the full unroll heuristic analysis that avoids counting any instruction cost until that instruction becomes "live" through a side-effect or use outside the..." This reverts commit r269388. It caused some bots to fail, I'm reverting it until I investigate the issue. llvm-svn: 269395	2016-05-13 06:32:25 +00:00
Michael Zolotukhin	b7b8052982	[Unroll] Implement a conservative and monotonically increasing cost tracking system during the full unroll heuristic analysis that avoids counting any instruction cost until that instruction becomes "live" through a side-effect or use outside the... Summary: ...loop after the last iteration. This is really hard to do correctly. The core problem is that we need to model liveness through the induction PHIs from iteration to iteration in order to get the correct results, and we need to correctly de-duplicate the common subgraphs of instructions feeding some subset of the induction PHIs. All of this can be driven either from a side effect at some iteration or from the loop values used after the loop finishes. This patch implements this by storing the forward-propagating analysis of each instruction in a cache to recall whether it was free and whether it has become live and thus counted toward the total unroll cost. Then, at each sink for a value in the loop, we recursively walk back through every value that feeds the sink, including looping back through the iterations as needed, until we have marked the entire input graph as live. Because we cache this, we never visit instructions more than twice -- once when we analyze them and put them into the cache, and once when we count their cost towards the unrolled loop. Also, because the cache is only two bits and because we are dealing with relatively small iteration counts, we can store all of this very densely in memory to avoid this from becoming an excessively slow analysis. The code here is still pretty gross. I would appreciate suggestions about better ways to factor or split this up, I've stared too long at the algorithmic side to really have a good sense of what the design should probably look at. Also, it might seem like we should do all of this bottom-up, but I think that is a red herring. Specifically, the simplification power is much greater working top-down. 
We can forward propagate very effectively, even across strange and interesting recurrences around the backedge. Because we use data to propagate, this doesn't cause a state space explosion. Doing this level of constant folding, etc, would be very expensive to do bottom-up because it wouldn't be until the last moment that you could collapse everything. The current solution is essentially a top-down simplification with a bottom-up cost accounting which seems to get the best of both worlds. It makes the simplification incremental and powerful while leaving everything dead until we know it is needed. Finally, a core property of this approach is its monotonicity. At all times, the current UnrolledCost is a conservatively low estimate. This ensures that we will never early-exit from the analysis due to exceeding a threshold when if we had continued, the cost would have gone back below the threshold. These kinds of bugs can cause incredibly hard to track down random changes to behavior. We could use a technique similar (but much simpler) within the inliner as well to avoid considering speculated code in the inline cost. Reviewers: chandlerc Subscribers: sanjoy, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D11758 llvm-svn: 269388	2016-05-13 01:42:39 +00:00
Michael Zolotukhin	a59a308e8d	[LoopUnrollAnalyzer] Don't treat gep-instructions with simplified offset as simplified. Summary: Currently we consider such instructions as simplified, which is incorrect, because if their user isn't simplified, we can't actually simplify them too. This biases our estimates of profitability: for instance the analyzer expects much more gains from unrolling memcpy loops than there actually are. Reviewers: hfinkel, chandlerc Subscribers: mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D17365 llvm-svn: 269387	2016-05-13 01:42:34 +00:00
Chandler Carruth	49c22190d0	[PM] Port of the DependenceAnalysis to the new PM. Ported DA to the new PM by splitting the former DependenceAnalysis Pass into a DependenceInfo result type and DependenceAnalysisWrapperPass type and adding a new PM-style DependenceAnalysis analysis pass returning the DependenceInfo. Patch by Philip Pfaffe, most of the review by Justin. Differential Revision: http://reviews.llvm.org/D18834 llvm-svn: 269370	2016-05-12 22:19:39 +00:00
Adam Nemet	2c34ab51a4	[LAA] Use std::min. NFC llvm-svn: 269356	2016-05-12 21:41:53 +00:00
Sanjoy Das	4e8c80382f	[SCEVExpander] Fix a failed cast<> assertion SCEVExpander::replaceCongruentIVs assumes the backedge value of an SCEV-analysable PHI to always be an instruction, when this is not necessarily true. For now address this by bailing out of the optimization if the backedge value of the PHI is a non-Instruction. llvm-svn: 269213	2016-05-11 17:41:41 +00:00
Sanjoy Das	abb7b93eb9	[SCEVExpander] Don't break SSA in replaceCongruentIVs `SCEVExpander::replaceCongruentIVs` bypasses `hoistIVInc` if both the original and the isomorphic increments are PHI nodes. Doing this can break SSA if the isomorphic increment is not dominated by the original increment. Get rid of the bypass, and let `hoistIVInc` do the right thing. Fixes PR27232 (compile time crash/hang). llvm-svn: 269212	2016-05-11 17:41:34 +00:00
Sanjoy Das	787c2460c2	[SCEV] Be more aggressive around proving no-wrap ... for AddRec's in loops for which SCEV is unable to compute a max tripcount. This is not a problem for "normal" loops[0] that don't have guards or assumes, but helps in cases where we have guards or assumes in the loop that can be used to constrain incoming values over the backedge. This partially fixes PR27691 (we still don't handle the NUW case). [0]: for "normal" loops, in the cases where we'd be able to prove no-wrap via isKnownPredicate, we'd also be able to compute a max tripcount. llvm-svn: 269211	2016-05-11 17:41:26 +00:00
Vedant Kumar	ee20294af5	[BasicAA] Compare GEP indices based on value (Fix PR27418) Equivalent GEP indices with different types are treated as different indices altogether, leading to an incorrect AA result. Fix the issue by comparing indices based on their values. Thanks to Mikael Holmén for reporting the issue! Differential Revision: http://reviews.llvm.org/D19935 llvm-svn: 269197	2016-05-11 15:45:43 +00:00
Artur Pilipenko	7a26326442	NFC. Introduce Value::isPointerDereferenceable Extract a part of isDereferenceableAndAlignedPointer functionality to Value: Reviewed By: hfinkel, sanjoy Differential Revision: http://reviews.llvm.org/D17611 llvm-svn: 269190	2016-05-11 14:43:28 +00:00
Easwaran Raman	9b792923d0	Revert r269131 llvm-svn: 269138	2016-05-10 23:26:04 +00:00
Easwaran Raman	7eccf4ee0e	Reapply r266477 and r266488 llvm-svn: 269131	2016-05-10 22:03:23 +00:00
Sanjay Patel	6786bc5390	[InstSimplify] use computeKnownBits on shift amount operands Do simplifications common to all shift instructions based on the amount shifted: 1. If the shift amount is known larger than the bitwidth, the result is undefined. 2. If the valid bits of the shift amount are all known to be 0, it's a shift by zero, so the shift operand is the result. Note that we could generalize the shift-by-zero transform into a shift-by-constant if all of the valid bits in the shift amount are known, but that would have to be done in InstCombine rather than here because it would mean we need to create a new shift instruction. Differential Revision: http://reviews.llvm.org/D19874 llvm-svn: 269114	2016-05-10 20:46:54 +00:00
Peter Collingbourne	ccdc225c27	Re-apply r269081 and r269082 with a fix for MSVC. llvm-svn: 269094	2016-05-10 18:07:21 +00:00
Peter Collingbourne	4d41cb6cc6	Revert r269081 and r269082 while I try to find the right incantation to fix MSVC build. llvm-svn: 269091	2016-05-10 17:54:43 +00:00
Peter Collingbourne	0df2b085bc	WholeProgramDevirt: Move logic for finding devirtualizable call sites to Analysis. The plan is to eventually make this logic simpler, however I expect it to be a little tricky for the foreseeable future (at least until we're rid of pointee types), so move it here so that it can be reused to build a summary index for devirtualization. Differential Revision: http://reviews.llvm.org/D20005 llvm-svn: 269081	2016-05-10 17:34:21 +00:00
Silviu Baranga	adf4b739ea	[LAA] Use re-written SCEV expressions when computing distances This removes a redundant stride versioning step (we already do it in getPtrStride, so it has no effect) and uses PSE to get the SCEV expressions for the source and destination (this might have changed when getPtrStride was called). I discovered this through code inspection, and couldn't produce a regression test for it. llvm-svn: 269052	2016-05-10 12:28:49 +00:00
James Molloy	aa1d638800	Revert "[VectorUtils] Query number of sign bits to allow more truncations" This was a fairly simple patch but on closer inspection was seriously flawed and caused PR27690. This reverts commit r268921. llvm-svn: 269051	2016-05-10 12:27:23 +00:00
Denis Zobnin	15d1e64b2b	[LAA] Rename "isStridedPtr" with "getPtrStride". NFC. Changing misleading function name was approved in http://reviews.llvm.org/D17268. Patch by Roman Shirokiy. llvm-svn: 269021	2016-05-10 05:55:16 +00:00
Sanjoy Das	12c91dc4c8	[ValueTracking] Use guards to prove non-nullness of a value Reviewers: apilipenko, majnemer, reames Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D20044 llvm-svn: 269008	2016-05-10 02:35:44 +00:00
Sanjoy Das	d47f42435a	[BasicAA] Guard intrinsics don't write to memory Summary: The idea is very close to what we do for assume intrinsics: we mark the guard intrinsics as writing to arbitrary memory to maintain control dependence, but under the covers we teach AA that they do not mod any particular memory location. Reviewers: chandlerc, hfinkel, gbiv, reames Subscribers: george.burgess.iv, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D19575 llvm-svn: 269007	2016-05-10 02:35:41 +00:00
Sanjoy Das	0b6518d24e	[SCEVExpander] Clang format expressions; NFC The boolean expressions are somewhat hard to read otherwise. llvm-svn: 268998	2016-05-10 00:32:31 +00:00
Sanjoy Das	2512d0c837	[SCEV] Use guards to prove predicates We can use calls to @llvm.experimental.guard to prove predicates, relying on the fact that in all locations dominated by a call to @llvm.experimental.guard the predicate it is guarding is known to be true. llvm-svn: 268997	2016-05-10 00:31:49 +00:00
Adam Nemet	0a77dfad95	[LV] Hint at the new loop distribution pragma in optimization remark When we encounter unsafe memory dependencies, loop distribution could help. Even though the diagnostic is in LAA, it's currently only emitted in the vectorizer. llvm-svn: 268987	2016-05-09 23:03:44 +00:00
Sanjay Patel	0f153424a9	[Inliner] don't assume that a Constant alloca size is a ConstantInt (PR27277) Differential Revision: http://reviews.llvm.org/D20077 llvm-svn: 268980	2016-05-09 21:51:53 +00:00
Matt Arsenault	1af53a91c0	DivergenceAnalysis: Fix crash with no return blocks The post dominator tree does not have a root node in this case. llvm-svn: 268933	2016-05-09 16:57:08 +00:00
Sanjay Patel	0fb9880bf5	fix spelling; NFC llvm-svn: 268929	2016-05-09 16:07:45 +00:00
James Molloy	5c20e27b7f	[VectorUtils] Query number of sign bits to allow more truncations When deciding if a vector calculation can be done in a smaller bitwidth, use sign bit information from ValueTracking to add more information and allow more truncations. llvm-svn: 268921	2016-05-09 14:32:30 +00:00
David Majnemer	eac58d8f68	[X86] Promote several single precision FP libcalls on Windows A number of libcalls don't exist in any particular lib but are, instead, defined in math.h as inline functions (even in C mode!). Don't rely on their existence when lowering @llvm.{cos,sin,floor,..}.f32, promote them instead. N.B. We had logic to handle FREM but were missing out on a number of others. This change generalizes the FREM handling. llvm-svn: 268875	2016-05-08 08:15:50 +00:00
Sanjoy Das	987aaa1374	[ValueTracking] Hoist some computation out of a loop; NFC There is no need to match the comparison instruction repeatedly. llvm-svn: 268836	2016-05-07 02:08:24 +00:00
Sanjoy Das	5056e19fce	Clean up comment; NFC llvm-svn: 268835	2016-05-07 02:08:22 +00:00
Sanjoy Das	6082c1a39c	Delete trailing whitespace; NFC llvm-svn: 268834	2016-05-07 02:08:15 +00:00
Mehdi Amini	3b132e34b0	ThinLTO: fix assertion and refactor check for hidden use from inline ASM in a helper function This test was crashing, and currently it breaks bootstrapping clang with debuginfo Differential Revision: http://reviews.llvm.org/D20008 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 268715	2016-05-06 08:25:33 +00:00
Adam Nemet	724ab22378	[LAA] Fix confusing debug message This message used to be correct, when all we cared about was whether the dependence was safe (i.e. NoDep) or unsafe. With the current more precise characterization, this is a forward dep. llvm-svn: 268695	2016-05-05 23:41:28 +00:00
Xinliang David Li	28a932742c	[PM] port Branch Frequency Analysis pass to new PM llvm-svn: 268687	2016-05-05 21:13:27 +00:00
Chad Rosier	226a734f1a	[ValueTracking] Early exit when further analysis won't be fruitful. This should have NFC in the context of codegen, but may have positive implications on compile-time. llvm-svn: 268651	2016-05-05 17:41:19 +00:00
Chad Rosier	25cfb7dbd6	[ValueTracking] Improve isImpliedCondition for matching LHS and Imm RHSs. llvm-svn: 268636	2016-05-05 15:39:18 +00:00
Xinliang David Li	6e5dd41481	[PM] Port Branch Probability Analysis pass to the new pass manager. Differential Revision: http://reviews.llvm.org/D19839 llvm-svn: 268601	2016-05-05 02:59:57 +00:00
David Majnemer	3918cdd2a1	[ConstantFolding, ValueTracking] Fold constants involving bitcasts of ConstantVector We assumed that ConstantVectors would be rather uninteresting from the perspective of analysis. However, this is not the case due to a quirk of how LLVM handles vectors of i1. Vectors of i1 are not ConstantDataVectors like vectors of i8, i16, i32 or i64 because i1's SizeInBits differs from its StoreSizeInBytes. This leads to it being categorized as a ConstantVector instead of a ConstantDataVector. Instead, treat ConstantVector more uniformly. This fixes PR27591. llvm-svn: 268479	2016-05-04 06:13:33 +00:00
Justin Bogner	e839c3e6ab	PM: Check that loop passes preserve a basic set of analyses A loop pass that didn't preserve this entire set of passes wouldn't play well with other loop passes, since these are generally a basic requirement to do any interesting transformations to a loop. Adds a helper to get the set of analyses a loop pass should preserve, and checks that any loop pass we run satisfies the requirement. llvm-svn: 268444	2016-05-03 21:35:08 +00:00
Sanjoy Das	013a4ac4aa	[SCEV] Tweak the output format and content of -analyze In the "LoopDispositions:" section: - Instead of printing out a list, print out a "dictionary" to make it obvious by inspection which disposition is for which loop. This is just a cosmetic change. - Print dispositions for parent _and_ sibling loops. I will use this to write a test case. llvm-svn: 268405	2016-05-03 17:49:57 +00:00
Anna Thomas	43d7e1cbff	Fold compares irrespective of whether allocation can be elided Summary: When a non-escaping pointer is compared to a global value, the comparison can be folded even if the corresponding malloc/allocation call cannot be elided. We need to make sure the global value is not null, since comparisons to null cannot be folded. In future, we should also handle cases when the comparison instruction dominates the pointer escape. Reviewers: sanjoy Subscribers: s.egerton, llvm-commits Differential Revision: http://reviews.llvm.org/D19549 llvm-svn: 268390	2016-05-03 14:58:21 +00:00
David Majnemer	3d90bb79c4	[LoopUnroll] Unroll loops which have exit blocks to EH pads We were overly cautious in our analysis of loops which have invokes which unwind to EH pads. The loop unroll transform is safe because it only clones blocks in the loop body, it does not try to split critical edges involving EH pads. Instead, move the necessary safety check to LoopUnswitch. N.B. The safety check for loop unswitch is covered by an existing test which fails without it. llvm-svn: 268357	2016-05-03 03:57:40 +00:00
John Regehr	e1c481dccf	[LVI] Add an API to LazyValueInfo so that it can export ConstantRanges that it computes. Currently this is used for testing and precision tuning, but it might be used by optimizations later. Differential Revision: http://reviews.llvm.org/D19179 llvm-svn: 268291	2016-05-02 19:58:00 +00:00
George Burgess IV	6edb891c8e	[CFLAA] Fix a use-of-invalid-pointer bug. As shown in the diff, we used to add to CFLAA's cache by doing `Cache[Fn] = buildSetsFrom(Fn)`. `buildSetsFrom(Fn)` may cause `Cache` to reallocate its underlying storage; if this happens and `Cache[Fn]` was evaluated prior to `buildSetsFrom(Fn)`, then we'll store the result to a bad address. Patch by Jia Chen. llvm-svn: 268269	2016-05-02 18:09:19 +00:00
Simon Pilgrim	33ae13d3c3	Fixed MSVC 'not all control paths return a value' warning llvm-svn: 268198	2016-05-01 15:52:31 +00:00
Sanjoy Das	f2f00fb11a	[SCEV] When printing via -analysis, dump loop disposition There are currently some bugs in tree around SCEV caching an incorrect loop disposition. Printing out loop dispositions will let us write whitebox tests as those are fixed. The dispositions are printed as a list in "inside out" order, i.e. innermost loop first. llvm-svn: 268177	2016-05-01 04:51:05 +00:00
David Majnemer	826e9831a7	[ValueTracking] Make the code in lookThroughCast No functionality change is intended. llvm-svn: 268108	2016-04-29 21:22:04 +00:00
Chad Rosier	cd62bf5821	[InstCombine] Determine the result of a select based on a dominating condition. Differential Revision: http://reviews.llvm.org/D19550 llvm-svn: 268104	2016-04-29 21:12:31 +00:00
David Majnemer	d2a074b1f4	[ValueTracking] matchSelectPattern needs to be more careful around FP matchSelectPattern attempts to see through casts which mask min/max patterns from being more obvious. Under certain circumstances, it would misidentify a sequence of instructions as a min/max because it assumed that folding casts would preserve the result. This is not the case for floating point <-> integer casts. This fixes PR27575. llvm-svn: 268086	2016-04-29 18:40:34 +00:00
Geoff Berry	b92cd5293e	[BasicAA] Treat llvm.assume as not accessing memory in getModRefBehavior(Function) Reviewers: dberlin, chandlerc, hfinkel, reames, sanjoy Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D19730 llvm-svn: 268068	2016-04-29 17:18:28 +00:00
Filipe Cabecinhas	0da9937517	Unify XDEBUG and EXPENSIVE_CHECKS (into the latter), and add an option to the cmake build to enable them. Summary: Historically, we had a switch in the Makefiles for turning on "expensive checks". This has never been ported to the cmake build, but the (dead-ish) code is still around. This will also make it easier to turn it on in buildbots. Reviewers: chandlerc Subscribers: jyknight, mzolotukhin, RKSimon, gberry, llvm-commits Differential Revision: http://reviews.llvm.org/D19723 llvm-svn: 268050	2016-04-29 15:22:48 +00:00
Matt Arsenault	790eb1c490	DivergenceAnalysis: Fix crash with unreachable blocks Unreachable blocks may not be in the dominator tree, so don't crash on them. llvm-svn: 268001	2016-04-29 06:17:47 +00:00
Chad Rosier	567556aa9c	[Inliner] Formatting. NFC. Patch by Aditya Kumar! Differential Revision: http://reviews.llvm.org/D19047 llvm-svn: 267888	2016-04-28 14:47:23 +00:00
Ahmed Bougacha	d765a82b54	[TLI] Unify LibFunc signature checking. NFCI. I tried to be as close as possible to the strongest check that existed before; cleaning these up properly is left for future work. Differential Revision: http://reviews.llvm.org/D19469 llvm-svn: 267758	2016-04-27 19:04:35 +00:00
Ahmed Bougacha	220c4010bf	[TLI] Fix indentation. NFC. llvm-svn: 267757	2016-04-27 19:04:29 +00:00
Matthew Simpson	e5dfb08fcb	[TTI] Add hook for vector extract with extension This change adds a new hook for estimating the cost of vector extracts followed by zero- and sign-extensions. The motivating example for this change is the SMOV and UMOV instructions on AArch64. These instructions move data from vector to general purpose registers while performing the corresponding extension (sign-extend for SMOV and zero-extend for UMOV) at the same time. For these operations, TargetTransformInfo can assume the extensions are free and only report the cost of the vector extract. The SLP vectorizer has been updated to make use of the new hook. Differential Revision: http://reviews.llvm.org/D18523 llvm-svn: 267725	2016-04-27 15:20:21 +00:00
Teresa Johnson	df5ef8711f	[ThinLTO] Refine fix to avoid renaming of uses in inline assembly. Summary: Refine the workaround from r266877 that attempts to prevent renaming of locals in inline assembly, so that, in addition to looking for a llvm.used local value, it also checks that there is at least one inline assembly call in the module. Otherwise, debug functions added to the llvm.used can block importing/exporting unnecessarily. Reviewers: joker.eph Subscribers: llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D19573 llvm-svn: 267717	2016-04-27 14:19:38 +00:00
Artur Pilipenko	345f01481b	NFC. Introduce Value::getPointerDereferenceableBytes Extract a part of isDereferenceableAndAlignedPointer functionality to Value::getPointerDereferenceableBytes. Currently it's a NFC, but in future I'm going to accumulate all the logic about value dereferenceability in this function similarly to Value::getPointerAlignment function (D16144). Reviewed By: reames Differential Revision: http://reviews.llvm.org/D17572 llvm-svn: 267708	2016-04-27 12:51:01 +00:00
Artur Pilipenko	9bb6beabf4	isSafeToLoadUnconditionally support queries without a context This is required to use this function from isSafeToSpeculativelyExecute Reviewed By: hfinkel Differential Revision: http://reviews.llvm.org/D16231 llvm-svn: 267692	2016-04-27 11:00:48 +00:00
Philip Reames	c67651dd70	[LVI] Delete stale and misleading comment. llvm-svn: 267661	2016-04-27 03:03:15 +00:00
Philip Reames	2ab964e263	[LVI] Add a comment explaining a subtle piece of code Or at least, I didn't understand the implications the first several times I read it. llvm-svn: 267648	2016-04-27 01:02:25 +00:00
Philip Reames	3f83dbeed9	[LVI] Reduce compile time by lazily scanning blocks if needed When encountering a non-local pointer, LVI would eagerly scan the block for dereferences of the given object to prove the pointer to be non null. That's all well and good, but then we'd go recurse through our input blocks. As a result, we could end up scanning each and every block we traverse, even if the final definition was obviously non null or we found a constant value somewhere up the chain. The previous code papered over this by using the isKnownNonNull routine from value tracking. This made the duplication less painful in the common case. Instead, we now do the block scan only after we've gotten the recursive results back. This lets us stop scanning individual blocks as soon as we've determined it to be non-null in any predecessor block and use our usual merge rules to propagate that information cheaply through successor blocks. For a pointer which can be found non-null, this does strictly less work and sometimes substantially so. Note that the case where we can't prove something non-null is still the really expensive case. We end up scanning each and every block looking for a dereference and never end up finding one. llvm-svn: 267642	2016-04-27 00:30:55 +00:00
Philip Reames	f105db4fc3	[LVI] Cut short search if we know we can't return a useful result Previously we were recursing on our operands for unary and binary operators regardless of whether we knew how to reason about the operator in question. This has the effect of doing a potentially large amount of work, only to throw it away. By checking whether the operation is one LVI can handle, we can cut short the search and return the (overdefined) answer more quickly. The quality of the results produced should not change. llvm-svn: 267626	2016-04-26 23:27:33 +00:00
Philip Reames	053c2a6f25	[LVI] Apply transfer rule for overdefine inputs for binary operators As pointed out by John Regehr over in http://reviews.llvm.org/D19485, LVI was being incredibly stupid about applying its transfer rules. Rather than gathering local facts from the expression itself, it was simply giving up entirely if one of the inputs was overdefined. This greatly impacts the precision of the overall analysis and makes it far more fragile as well. This patch builds on 267609 which did the same thing for unary casts. llvm-svn: 267620	2016-04-26 23:10:35 +00:00
Philip Reames	e5030e85ea	[LVI] A better fix for the assertion error introduced by 267609 Essentially, I was using the wrong size function. For types which were sized, but not primitive, I wasn't getting a useful size for the operand and failed an assert. I fixed this, and also added a guard that the input is a sized type. Test case is for the original mistake. I'm not sure how to actually exercise the sized type check. llvm-svn: 267618	2016-04-26 22:52:30 +00:00
Philip Reames	d5c62a0aad	[LVI] Speculative fix for assertion seen in clang bots I'll clean this up and add a test case shortly. I want to make sure this does actually fix the bots; if not, I'll revert. llvm-svn: 267617	2016-04-26 22:31:53 +00:00
Philip Reames	38c87c2e50	[LVI] Infer local facts from unary expressions As pointed out by John Regehr over in http://reviews.llvm.org/D19485, LVI was being incredibly stupid about applying its transfer rules. Rather than gathering local facts from the expression itself, it was simply giving up entirely if one of the inputs was overdefined. This greatly impacts the precision of the overall analysis and makes it far more fragile as well. This patch implements only the unary operation case. Once this is in, I'll implement the same for the binary operations. Differential Revision: http://reviews.llvm.org/D19492 llvm-svn: 267609	2016-04-26 21:48:16 +00:00
Philip Reames	1918384155	[LVI] Make a precondition explicit rather than handling a case which never happens [NFC] llvm-svn: 267481	2016-04-25 22:21:24 +00:00
Philip Reames	3bb2832900	[LVI] Clarify comments describing the lattice values There has been much recent confusion about the partition in the lattice between constant and non-constant values. Hopefully, documenting this will prevent confusion going forward. llvm-svn: 267440	2016-04-25 18:48:43 +00:00
Philip Reames	6671577eb3	[LVI] Split solveBlockValueConstantRange into two [NFC] This function handled both unary and binary operators. Cloning and specializing leads to much easier to follow code with minimal duplication. llvm-svn: 267438	2016-04-25 18:30:31 +00:00
Chad Rosier	e2cbd13e56	[ValueTracking] Improve isImpliedCondition when the dominating cond is false. llvm-svn: 267430	2016-04-25 17:23:36 +00:00
Silviu Baranga	795c629ec9	[SCEV] Improve the run-time checking of the NoWrap predicate Summary: This implements a new method of run-time checking the NoWrap SCEV predicates, which should be easier to optimize and nicer for targets that don't correctly handle multiplication/addition of large integer types (like i128). If the AddRec is {a,+,b} and the backedge taken count is c, the idea is to check that |b| * c doesn't have unsigned overflow, and depending on the sign of b, that: a + |b| * c >= a (b >= 0) or a - |b| * c <= a (b <= 0) where the comparisons above are signed or unsigned, depending on the flag that we're checking. The advantage of doing this is that we avoid extending to a larger type and we avoid the multiplication of large types (multiplying i128 can be expensive). Reviewers: sanjoy Subscribers: llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D19266 llvm-svn: 267389	2016-04-25 09:27:16 +00:00
Nick Lewycky	af50837a31	Remove emacs mode markers from .cpp files. NFC .cpp files are unambiguously C++, you only need the mode markers on .h files. llvm-svn: 267353	2016-04-24 17:55:41 +00:00
Teresa Johnson	28e457bccd	[ThinLTO] Remove GlobalValueInfo class from index Summary: Remove the GlobalValueInfo and change the ModuleSummaryIndex to directly reference summary objects. The info structure was there to support lazy parsing of the combined index summary objects, which is no longer needed and not supported. Reviewers: joker.eph Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D19462 llvm-svn: 267344	2016-04-24 14:57:11 +00:00
Mehdi Amini	c3ed48c1bd	Reorganize GlobalValueSummary with a "Flags" bitfield. Right now it only contains the LinkageType, but will be extended with "hasSection", "isOptSize", "hasInlineAssembly", etc. Differential Revision: http://reviews.llvm.org/D19404 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 267319	2016-04-24 03:18:18 +00:00
Andrew Kaylor	aa641a5171	Re-commit optimization bisect support (r267022) without new pass manager support. The original commit was reverted because of a buildbot problem with LazyCallGraph::SCC handling (not related to the OptBisect handling). Differential Revision: http://reviews.llvm.org/D19172 llvm-svn: 267231	2016-04-22 22:06:11 +00:00
Peter Collingbourne	7dd8dbf486	Introduce llvm.load.relative intrinsic. This intrinsic takes two arguments, ``%ptr`` and ``%offset``. It loads a 32-bit value from the address ``%ptr + %offset``, adds ``%ptr`` to that value and returns it. The constant folder specifically recognizes the form of this intrinsic and the constant initializers it may load from; if a loaded constant initializer is known to have the form ``i32 trunc(x - %ptr)``, the intrinsic call is folded to ``x``. LLVM provides that the calculation of such a constant initializer will not overflow at link time under the medium code model if ``x`` is an ``unnamed_addr`` function. However, it does not provide this guarantee for a constant initializer folded into a function body. This intrinsic can be used to avoid the possibility of overflows when loading from such a constant. Differential Revision: http://reviews.llvm.org/D18367 llvm-svn: 267223	2016-04-22 21:18:02 +00:00
Peter Collingbourne	265ebd7d70	CodeGen: Use PLT relocations for relative references to unnamed_addr functions. The relative vtable ABI (PR26723) needs PLT relocations to refer to virtual functions defined in other DSOs. The unnamed_addr attribute means that the function's address is not significant, so we're allowed to substitute it with the address of a PLT entry. Also includes a bonus feature: addends for COFF image-relative references. Differential Revision: http://reviews.llvm.org/D17938 llvm-svn: 267211	2016-04-22 20:40:10 +00:00
Sanjoy Das	a6155b659a	Have isKnownNotFullPoison be smarter around control flow Summary: (... while still not using a PostDomTree) The way we use isKnownNotFullPoison from SCEV today, the new CFG walking logic will not trigger for any realistic cases -- it will kick in only for situations where we could have merged the contiguous basic blocks anyway[0], since the poison generating instruction dominates all of its non-PHI uses (which are the only uses we consider right now). However, having this change in place will allow a later bugfix to break fewer llvm-lit tests. [0]: i.e. cases where block A branches to block B and B is A's only successor and A is B's only predecessor. Reviewers: broune, bjarke.roune Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D19212 llvm-svn: 267175	2016-04-22 17:41:06 +00:00
Vedant Kumar	6013f45f92	Revert "Initial implementation of optimization bisect support." This reverts commit r267022, due to an ASan failure: http://lab.llvm.org:8080/green/job/clang-stage2-cmake-RgSan_check/1549 llvm-svn: 267115	2016-04-22 06:51:37 +00:00
Sanjoy Das	efdeb45ffd	[SCEV] Extract out a `isSCEVExprNeverPoison` helper; NFCI Summary: Also adds a small comment blurb on control flow + no-wrap flags, since that question came up a few days back on llvm-dev. Reviewers: bjarke.roune, broune Subscribers: sanjoy, mcrosier, llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D19209 llvm-svn: 267110	2016-04-22 05:38:54 +00:00
Andrew Kaylor	f0f279291c	Initial implementation of optimization bisect support. This patch implements an optimization bisect feature, which will allow optimizations to be selectively disabled at compile time in order to track down test failures that are caused by incorrect optimizations. The bisection is enabled using a new command line option (-opt-bisect-limit). Individual passes that may be skipped call the OptBisect object (via an LLVMContext) to see if they should be skipped based on the bisect limit. A finer level of control (disabling individual transformations) can be managed through an additional OptBisect method, but this is not yet used. The skip checking in this implementation is based on (and replaces) the skipOptnoneFunction check. Where that check was being called, a new call has been inserted in its place which checks the bisect limit and the optnone attribute. A new function call has been added for module and SCC passes that behaves in a similar way. Differential Revision: http://reviews.llvm.org/D19172 llvm-svn: 267022	2016-04-21 17:58:54 +00:00
Philip Reames	92c43699bc	[unordered] Add tests and conservative handling in support of future changes [NFCI] This change adds a couple of test cases to make sure FindAvailableLoadedValue does the right thing. At the moment, the code added is dead, but separating it makes follow on changes far more obvious. llvm-svn: 266999	2016-04-21 16:51:08 +00:00
Chad Rosier	99bc480bc3	Address Philip's post-commit feedback for r266987. NFC. llvm-svn: 266998	2016-04-21 16:18:02 +00:00
Chad Rosier	af83e40dee	Refactor implied condition logic from ValueTracking directly into CmpInst. NFC. Differential Revision: http://reviews.llvm.org/D19330 llvm-svn: 266987	2016-04-21 14:04:54 +00:00
Nick Lewycky	762f8a8549	Add optimization for 'icmp slt (or A, B), A' and some related idioms based on knowledge of the sign bit for A and B. No matter what value you OR in to A, the result of (or A, B) is going to be UGE A. When A and B are positive, it's SGE too. If A is negative, OR'ing a value into it can't make it positive, but can increase its value closer to -1, therefore (or A, B) is SGE A. Working through all possible combinations produces this truth table (columns: A is +, -, +/-; rows: B is +, -, +/-): B is +: F F F; B is -: T F ?; B is +/-: ? F ?. The related optimizations are flipping the 'slt' for 'sge' which always NOTs the result (if the result is known), and swapping the LHS and RHS while swapping the comparison predicate. There are more idioms left to implement (aren't there always!) but I've stopped here because any more would risk becoming unreasonable for reviewers. llvm-svn: 266939	2016-04-21 00:53:14 +00:00
Chad Rosier	41dd31f0b0	[ValueTracking] Make isImpliedCondition return an Optional<bool>. NFC. Phabricator Revision: http://reviews.llvm.org/D19277 llvm-svn: 266904	2016-04-20 19:15:26 +00:00
Teresa Johnson	b35cc691ea	[ThinLTO] Prevent importing of "llvm.used" values Summary: This patch prevents importing from (and therefore exporting from) any module with a "llvm.used" local value. Local values need to be promoted and renamed when importing, and their presence on the llvm.used variable indicates that there are opaque uses that won't see the rename. One such example is a use in inline assembly. See also the discussion at: http://lists.llvm.org/pipermail/llvm-dev/2016-April/098047.html As part of this, move collectUsedGlobalVariables out of Transforms/Utils and into IR/Module so that it can be used more widely. There are several other places in LLVM that used copies of this code that can be cleaned up as a follow on NFC patch. Reviewers: joker.eph Subscribers: pcc, llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D18986 llvm-svn: 266877	2016-04-20 14:39:45 +00:00
David Majnemer	b4b27230bf	[ValueTracking, VectorUtils] Refactor getIntrinsicIDForCall The functionality contained within getIntrinsicIDForCall is two-fold: it checks if a CallInst's callee is a vectorizable intrinsic. If it isn't an intrinsic, it attempts to map the call's target to a suitable intrinsic. Move the mapping functionality into getIntrinsicForCallSite and rename getIntrinsicIDForCall to getVectorIntrinsicIDForCall while reimplementing it in terms of getIntrinsicForCallSite. llvm-svn: 266801	2016-04-19 19:10:21 +00:00
Chad Rosier	b7dfbb40a3	[ValueTracking] Improve isImpliedCondition for conditions with matching operands. This patch improves SimplifyCFG to catch cases like: if (a < b) { if (a > b) <- known to be false unreachable; } Phabricator Revision: http://reviews.llvm.org/D18905 llvm-svn: 266767	2016-04-19 17:19:14 +00:00
Brendon Cahoon	be2da82cd8	[DependenceAnalysis] Refactor uses of getConstantPart. NFC. Rather than checking for the SCEV type prior to calling getConstantPart, perform the checks in the function. This reduces the number of places where the checks are needed. Differential Revision: http://reviews.llvm.org/D19241 llvm-svn: 266759	2016-04-19 16:46:57 +00:00
Daniel Berlin	77fa84eadd	Correct IDF calculator for ReverseIDF Summary: Need to use predecessors for reverse graph, successors for forward graph. succ_iterator/pred_iterator are not compatible, this patch is all the work necessary to work around that (which is what everywhere else does). Not sure if there is a better way, so cc'ing some random folks to take a gander :) Reviewers: dblaikie, qcolombet, echristo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18796 llvm-svn: 266718	2016-04-19 06:13:28 +00:00
Michael Kuperstein	de16b44f74	Port DemandedBits to the new pass manager. Differential Revision: http://reviews.llvm.org/D18679 llvm-svn: 266699	2016-04-18 23:55:01 +00:00
Sanjoy Das	432c1c3fb3	[BPI] Consider deoptimize calls as "unreachable" Summary: Calls to @llvm.experimental.deoptimize are expected to "never execute", so optimize them as such. Reviewers: chandlerc Subscribers: junbuml, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D19095 llvm-svn: 266654	2016-04-18 19:01:28 +00:00
Easwaran Raman	a163d27611	Revert r266488. This goes with r266477 which has been reverted. llvm-svn: 266631	2016-04-18 17:10:17 +00:00
Eric Liu	d09f15ea6f	Revert "Replace the use of MaxFunctionCount module flag" This reverts commit r266477. This commit introduces cyclic dependency. This commit has "Analysis" depend on "ProfileData", while "ProfileData" depends on "Object", which depends on "BitCode", which depends on "Analysis". llvm-svn: 266619	2016-04-18 15:31:11 +00:00
Mehdi Amini	b550cb1750	[NFC] Header cleanup Removed some unused headers, replaced some headers with forward class declarations. Found using simple scripts like this one: clear && ack --cpp -l '#include "llvm/ADT/IndexedMap.h"' | xargs grep -L 'IndexedMap[<]' | xargs grep -n --color=auto 'IndexedMap' Patch by Eugene Kosov <claprix@yandex.ru> Differential Revision: http://reviews.llvm.org/D19219 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266595	2016-04-18 09:17:29 +00:00
Easwaran Raman	6d09f993c2	Add ProfileData to required_libraries This should fix ppc64be build breakage due to r266477 llvm-svn: 266488	2016-04-15 23:08:52 +00:00
Easwaran Raman	f53baca686	Replace the use of MaxFunctionCount module flag Adds an interface to get ProfileSummary for a module and makes InlineCost use ProfileSummary to get max function count. Differential Revision: http://reviews.llvm.org/D18622 llvm-svn: 266477	2016-04-15 21:39:58 +00:00
Justin Lebar	8650a4da93	[TTI] Add getInliningThresholdMultiplier. Summary: InlineCost's threshold is multiplied by this value. This lets us adjust the inlining threshold up or down on a per-target basis. For example, we might want to increase the threshold on targets where calls are unusually expensive. Reviewers: chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18560 llvm-svn: 266405	2016-04-15 01:38:48 +00:00
Michael Kuperstein	16f13e252b	[AliasSetTracker] Correctly handle changing the size of an entry If the size of an AST entry changes, we also need to make sure we perform necessary alias set merges, as the new size may overlap pointers in other sets. We happen to run into this with memset, because memset allows an entry for a i8* pointer to have a decidedly non-i8 size. This fixes PR27262. Differential Revision: http://reviews.llvm.org/D18939 llvm-svn: 266381	2016-04-14 22:00:11 +00:00
Renato Golin	5cb666add7	[ARM] Adding IEEE-754 SIMD detection to loop vectorizer Some SIMD implementations are not IEEE-754 compliant, for example ARM's NEON. This patch teaches the loop vectorizer to only allow transformations of loops that either contain no floating-point operations or have enough allowance flags supporting lack of precision (ex. -ffast-math, Darwin). For that, the target description now has a method which tells us if the vectorizer is allowed to handle FP math without falling into unsafe representations, plus a check on every FP instruction in the candidate loop to check for the safety flags. This commit makes LLVM behave like GCC with respect to ARM NEON support, but it stops short of fixing the underlying problem: sub-normals. Neither GCC nor LLVM have a flag for allowing sub-normal operations. Before this patch, GCC only allows it using unsafe-math flags and LLVM allows it by default with no way to turn it off (short of not using NEON at all). As a first step, we push this change to make it safe and in sync with GCC. The second step is to discuss a new sub-normal's flag on both communities and come up with a common solution. The third step is to improve the FastMath flags in LLVM to encode sub-normals and use those flags to restrict NEON FP. Fixes PR16275. llvm-svn: 266363	2016-04-14 20:42:18 +00:00
Nicolai Haehnle	13d90f324c	[DivergenceAnalysis] Treat PHI with incoming undef as constant Summary: If a PHI has an incoming undef, we can pretend that it is equal to one non-undef, non-self incoming value. This is particularly relevant in combination with the StructurizeCFG pass, which introduces PHI nodes with undefs. Previously, this led to branch conditions that were uniform before StructurizeCFG to become non-uniform afterwards, which confused the SIAnnotateControlFlow pass. This fixes a crash when Mesa radeonsi compiles a shader from dEQP-GLES3.functional.shaders.switch.switch_in_for_loop_dynamic_vertex Reviewers: arsenm, tstellarAMD, jingyue Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19013 llvm-svn: 266347	2016-04-14 17:42:47 +00:00
Silviu Baranga	b77365b595	[SCEV][LAA] Add tests for SCEV expression transformations performed during LAA Summary: Add a print method to Predicated Scalar Evolution which prints all interesting transformations done by PSE. Loop Access Analysis will now print this as part of the analysis output. We now use this to check the exact expression transformations that were done by PSE in LAA. The additional checking also acts as white-box testing for the getAsAddRec method. Reviewers: anemet, sanjoy Subscribers: sanjoy, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D18792 llvm-svn: 266334	2016-04-14 16:08:45 +00:00
David Majnemer	0f26b0aeb4	[CodeGen] Teach LLVM how to lower @llvm.{min,max}num to {MIN,MAX}NAN The behavior of {MIN,MAX}NAN differs from that of {MIN,MAX}NUM when only one of the inputs is NaN: -NUM will return the non-NaN argument while -NAN would return NaN. It is desirable to lower @llvm.{min,max}num to -NAN on targets that don't have a native instruction for -NUM. Notably, ARMv7 NEON's vmin has the -NAN semantics. N.B. Of course, it is only safe to do this if the intrinsic call is marked nnan. llvm-svn: 266279	2016-04-14 07:13:24 +00:00
George Burgess IV	cae581d13f	[CFLAA] Fix up code style a bit. NFC. llvm-svn: 266262	2016-04-13 23:27:37 +00:00
Easwaran Raman	d295b00ae9	Return immediately from analyzeCall if analyzeBlock returns false. This is part of the patch reviewed at http://reviews.llvm.org/D17584 llvm-svn: 266249	2016-04-13 21:20:22 +00:00
David L Kreitzer	752c1448fe	Simplify strlen to a subtraction for certain cases. Patch by Li Huang (li1.huang@intel.com) Differential Revision: http://reviews.llvm.org/D18230 llvm-svn: 266200	2016-04-13 14:31:06 +00:00
Petar Jovanovic	644b8c1a5d	Calculate __builtin_object_size when pointer depends on a condition This patch fixes calculating of builtin_object_size if it depends on a condition. Before this patch compiler did not know how to calculate the object size when it finds a condition that cannot be eliminated. This patch enables calculating of builtin_object_size even in case when condition cannot be eliminated by choosing minimum or maximum value as a result from condition. Choosing minimum or maximum value from condition is based on the second argument of __builtin_object_size function. Patch by Strahinja Petrovic. Differential Revision: http://reviews.llvm.org/D18438 llvm-svn: 266193	2016-04-13 12:25:25 +00:00
David Majnemer	3ee5f34469	[InstCombine] We folded an fcmp to an i1 instead of a vector of i1 Remove an ad-hoc transform in InstCombine and replace it with more general machinery (ValueTracking, InstructionSimplify and VectorUtils). This fixes PR27332. llvm-svn: 266175	2016-04-13 06:55:52 +00:00
Jeroen Ketema	e48e393729	Add space between words in verify-scev-maps option help message llvm-svn: 266149	2016-04-12 23:21:46 +00:00
George Burgess IV	278199f615	Add the allocsize attribute to LLVM. `allocsize` is a function attribute that allows users to request that LLVM treat arbitrary functions as allocation functions. This patch makes LLVM accept the `allocsize` attribute, and makes `@llvm.objectsize` recognize said attribute. The review for this was split into two patches for ease of reviewing: D18974 and D14933. As promised on the revisions, I'm landing both patches as a single commit. Differential Revision: http://reviews.llvm.org/D14933 llvm-svn: 266032	2016-04-12 01:05:35 +00:00
Sanjoy Das	f9d88e650b	This reverts commit r265913 and r265912 See PR27315 r265913: "[IndVars] Eliminate op.with.overflow when possible" r265912: "[SCEV] See through op.with.overflow intrinsics" llvm-svn: 265950	2016-04-11 15:26:18 +00:00
Teresa Johnson	2d5487cf44	[ThinLTO] Move summary computation from BitcodeWriter to new pass Summary: This is the first step in also serializing the index out to LLVM assembly. The per-module summary written to bitcode is moved out of the bitcode writer and to a new analysis pass (ModuleSummaryIndexWrapperPass). The pass itself uses a new builder class to compute index, and the builder class is used directly in places where we don't have a pass manager (e.g. llvm-as). Because we are computing summaries outside of the bitcode writer, we no longer can use value ids created by the bitcode writer's ValueEnumerator. This required changing the reference graph edge type to use a new ValueInfo class holding a union between a GUID (combined index) and Value* (permodule index). The Value* are converted to the appropriate value ID during bitcode writing. Also, this enables removal of the BitWriter library's dependence on the Analysis library that was previously required for the summary computation. Reviewers: joker.eph Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D18763 llvm-svn: 265941	2016-04-11 13:58:45 +00:00
Sanjoy Das	3c529a40ca	[SCEV] See through op.with.overflow intrinsics Summary: This change teaches SCEV to reduce `(extractvalue 0 (op.with.overflow X Y))` into `op X Y` (with a no-wrap tag if possible). Reviewers: atrick, regehr Subscribers: mcrosier, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D18684 llvm-svn: 265912	2016-04-10 22:50:26 +00:00
Easwaran Raman	9a3fc17ad4	Refactor Threshold computation. NFC. This is part of changes reviewed in http://reviews.llvm.org/D17584. llvm-svn: 265852	2016-04-08 21:28:02 +00:00
Sanjoy Das	87b9e1b727	Propagate Undef in llvm.cos Intrinsic Summary: The llvm cos intrinsic currently does not propagate undef's. This change transforms cos(undef) to null value or 0. There are 2 test cases added as well. Patch by Anna Thomas! Reviewers: sanjoy Subscribers: majnemer, llvm-commits Differential Revision: http://reviews.llvm.org/D18863 llvm-svn: 265825	2016-04-08 18:21:11 +00:00
Silviu Baranga	6f444dfd55	Re-commit [SCEV] Introduce a guarded backedge taken count and use it in LAA and LV This re-commits r265535 which was reverted in r265541 because it broke the windows bots. The problem was that we had a PointerIntPair which took a pointer to a struct allocated with new, and new doesn't provide sufficient alignment guarantees. This pattern was already present before r265535 and it just happened to work. To fix this, we now separate the PointerIntPair from the ExitNotTakenInfo struct into a pointer and a bool. Original commit message: Summary: When the backedge taken condition is computed from an icmp, SCEV can deduce the backedge taken count only if one of the sides of the icmp is an AddRecExpr. However, due to sign/zero extensions, we sometimes end up with something that is not an AddRecExpr. However, we can use SCEV predicates to produce a 'guarded' expression. This change adds a method to SCEV to get this expression, and the SCEV predicate associated with it. In HowManyGreaterThans and HowManyLessThans we will now add a SCEV predicate associated with the guarded backedge taken count when the analyzed SCEV expression is not an AddRecExpr. Note that we only do this as an alternative to returning a 'CouldNotCompute'. We use this new feature in Loop Access Analysis and LoopVectorize to analyze and transform more loops. Reviewers: anemet, mzolotukhin, hfinkel, sanjoy Subscribers: flyingforyou, mcrosier, atrick, mssimpso, sanjoy, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D17201 llvm-svn: 265786	2016-04-08 14:29:09 +00:00
Sanjoy Das	5ce3272833	Don't IPO over functions that can be de-refined Summary: Fixes PR26774. If you're aware of the issue, feel free to skip the "Motivation" section and jump directly to "This patch". Motivation: I define "refinement" as discarding behaviors from a program that the optimizer has license to discard. So transforming: ``` void f(unsigned x) { unsigned t = 5 / x; (void)t; } ``` to ``` void f(unsigned x) { } ``` is refinement, since the behavior went from "if x == 0 then undefined else nothing" to "nothing" (the optimizer has license to discard undefined behavior). Refinement is a fundamental aspect of many mid-level optimizations done by LLVM. For instance, transforming `x == (x + 1)` to `false` also involves refinement since the expression's value went from "if x is `undef` then { `true` or `false` } else { `false` }" to "`false`" (by definition, the optimizer has license to fold `undef` to any non-`undef` value). Unfortunately, refinement implies that the optimizer cannot assume that the implementation of a function it can see has all of the behavior an unoptimized or a differently optimized version of the same function can have. This is a problem for functions with comdat linkage, where a function can be replaced by an unoptimized or a differently optimized version of the same source level function. For instance, FunctionAttrs cannot assume a comdat function is actually `readnone` even if it does not have any loads or stores in it; since there may have been loads and stores in the "original function" that were refined out in the currently visible variant, and at the link step the linker may in fact choose an implementation with a load or a store. As an example, consider a function that does two atomic loads from the same memory location, and writes to memory only if the two values are not equal. 
The optimizer is allowed to refine this function by first CSE'ing the two loads, and then folding the comparison to always report that the two values are equal. Such a refined variant will look like it is `readonly`. However, the unoptimized version of the function can still write to memory (since the two loads //can// result in different values), and selecting the unoptimized version at link time will retroactively invalidate transforms we may have done under the assumption that the function does not write to memory. Note: this is not just a problem with atomics or with linking differently optimized object files. See PR26774 for more realistic examples that involved neither. This patch: This change introduces a new set of linkage types, predicated as `GlobalValue::mayBeDerefined` that returns true if the linkage type allows a function to be replaced by a differently optimized variant at link time. It then changes a set of IPO passes to bail out if they see such a function. Reviewers: chandlerc, hfinkel, dexonsmith, joker.eph, rnk Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18634 llvm-svn: 265762	2016-04-08 00:48:30 +00:00
Mehdi Amini	a797877a7e	Const correctness for BranchProbabilityInfo (NFC) From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 265731	2016-04-07 21:59:28 +00:00
JF Bastien	800f87a871	NFC: make AtomicOrdering an enum class Summary: In the context of http://wg21.link/lwg2445 C++ uses the concept of 'stronger' ordering but doesn't define it properly. This should be fixed in C++17 barring a small question that's still open. The code currently plays fast and loose with the AtomicOrdering enum. Using an enum class is one step towards tightening things. I later also want to tighten related enums, such as clang's AtomicOrderingKind (which should be shared with LLVM as a 'C++ ABI' enum). This change touches a few lines of code which can be improved later, I'd like to keep it as NFC for now as it's already quite complex. I have related changes for clang. As a follow-up I'll add: bool operator<(AtomicOrdering, AtomicOrdering) = delete; bool operator>(AtomicOrdering, AtomicOrdering) = delete; bool operator<=(AtomicOrdering, AtomicOrdering) = delete; bool operator>=(AtomicOrdering, AtomicOrdering) = delete; This is separate so that clang and LLVM changes don't need to be in sync. Reviewers: jyknight, reames Subscribers: jyknight, llvm-commits Differential Revision: http://reviews.llvm.org/D18775 llvm-svn: 265602	2016-04-06 21:19:33 +00:00
Silviu Baranga	a393baf1fd	Revert r265535 until we know how we can fix the bots llvm-svn: 265541	2016-04-06 14:06:32 +00:00
Silviu Baranga	72b4a4a330	[SCEV] Introduce a guarded backedge taken count and use it in LAA and LV Summary: When the backedge taken condition is computed from an icmp, SCEV can deduce the backedge taken count only if one of the sides of the icmp is an AddRecExpr. However, due to sign/zero extensions, we sometimes end up with something that is not an AddRecExpr. However, we can use SCEV predicates to produce a 'guarded' expression. This change adds a method to SCEV to get this expression, and the SCEV predicate associated with it. In HowManyGreaterThans and HowManyLessThans we will now add a SCEV predicate associated with the guarded backedge taken count when the analyzed SCEV expression is not an AddRecExpr. Note that we only do this as an alternative to returning a 'CouldNotCompute'. We use this new feature in Loop Access Analysis and LoopVectorize to analyze and transform more loops. Reviewers: anemet, mzolotukhin, hfinkel, sanjoy Subscribers: flyingforyou, mcrosier, atrick, mssimpso, sanjoy, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D17201 llvm-svn: 265535	2016-04-06 13:18:26 +00:00
David Majnemer	12fd50410d	[SLPVectorizer] Vectorizing the libm sqrt to llvm's sqrt intrinsic requires nnan To quote the langref "Unlike sqrt in libm, however, llvm.sqrt has undefined behavior for negative numbers other than -0.0 (which allows for better optimization, because there is no need to worry about errno being set). llvm.sqrt(-0.0) is defined to return -0.0 like IEEE sqrt." This means that it's unsafe to replace sqrt with llvm.sqrt unless the call is annotated with nnan. Thanks to Hal Finkel for pointing this out! llvm-svn: 265521	2016-04-06 07:04:53 +00:00
David Majnemer	25d03dbcde	[SLPVectorizer] Vectorize libcalls of sqrt We didn't realize that we could transform the libcall into a vectorized intrinsic. llvm-svn: 265493	2016-04-06 00:14:59 +00:00
George Burgess IV	7e5404cc20	[CFLAA] Fix PR27213; incorrect tagging of args/globals Prior to this patch, CFLAA wouldn't tag arguments/globals properly if it didn't find any "interesting" edges on them. This means that, if all you do is store constants to a global or argument, we would never actually treat it as a global/argument. Test case: define void @foo(i32* %A, i32* %B) #0 { entry: store i32 0, i32* %A, align 4 store i32 0, i32* %B, align 4 ret void } CFLAA would say that %A can't alias %B, because neither pointer was used in an interesting way. This patch makes us note whether something is an argument, global, ... regardless of how interesting CFLAA thinks its uses are. (For the record, using a value in an interesting way means loading from it, using it in a GEP, ...) llvm-svn: 265474	2016-04-05 21:40:45 +00:00
Junmo Park	53470fc451	Minor code cleanups. NFC. llvm-svn: 265468	2016-04-05 21:14:31 +00:00
Duncan P. N. Exon Smith	1de3c7e790	IR: Introduce ConstantAggregate, NFC Add a common parent class for ConstantArray, ConstantVector, and ConstantStruct called ConstantAggregate. These are the aggregate subclasses of Constant that take operands. This is mainly a cleanup, adding common `isa` target and removing duplicated code. However, it also simplifies caching which constants point transitively at `GlobalValue` (a possible future direction). llvm-svn: 265466	2016-04-05 21:10:45 +00:00
Brendon Cahoon	86f783e315	[DependenceAnalysis] Check if result of getConstantPart is null A seg-fault occurs due to a reference of a null pointer, which is the value returned by getConstantPart. This function returns null if the constant part is not found. The code that calls this function needs to check for the null return value. Differential Revision: http://reviews.llvm.org/D18718 llvm-svn: 265319	2016-04-04 18:13:18 +00:00
Peter Zotov	0218d0f383	Mark some FP intrinsics as safe to speculatively execute Floating point intrinsics in LLVM are generally not speculatively executed, since most of them are defined to behave the same as libm functions, which set errno. However, the only error that can happen when executing ceil, floor, nearbyint, rint and round libm functions per POSIX.1-2001 is -ERANGE, and that requires the maximum value of the exponent to be smaller than the number of mantissa bits, which is not the case with any of the floating point types supported by LLVM. The trunc and copysign functions never set errno per POSIX.1-2001. Differential Revision: http://reviews.llvm.org/D18643 llvm-svn: 265262	2016-04-03 12:30:46 +00:00
Mehdi Amini	89038a1071	Fix "warning: variable 'XX' set but not used" in release build (variable used in assertion, NFC) From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 265220	2016-04-02 05:34:19 +00:00
David Majnemer	ae272d718e	[NVPTX] Infer __nvvm_reflect as nounwind, readnone This patch simply mirrors the attributes we give to @llvm.nvvm.reflect to the __nvvm_reflect libdevice call. This shaves about 30% of the code in libdevice away because of CSE opportunities. It also helps us figure out that libdevice implementations of transcendental functions don't have side-effects. llvm-svn: 265060	2016-03-31 21:29:57 +00:00
Sanjoy Das	56df0ec610	[InstCombine] Fix incorrect rule from rL236202 The rule for SMIN introduced in rL236202 doesn't work as advertised: the check for Pred == ICmpInst::ICMP_SGT was missing. llvm-svn: 264996	2016-03-31 05:14:34 +00:00
Sanjoy Das	c9d6d8b106	Delete trailing whitespace llvm-svn: 264995	2016-03-31 05:14:29 +00:00
Sanjoy Das	e12c0e5159	[SCEV] Track NoWrap properties using MatchBinaryOp, NFC This way once we teach MatchBinaryOp to map more things into arithmetic, the non-wrapping add recurrence construction would understand it too. Right now MatchBinaryOp still only understands arithmetic, so this is solely a code-reorganization change. llvm-svn: 264994	2016-03-31 05:14:26 +00:00
Sanjoy Das	118d919a6a	[SCEV] NFC code motion to simplify later change llvm-svn: 264993	2016-03-31 05:14:22 +00:00
James Molloy	8e46cd05a1	[VectorUtils] Don't try and truncate PHIs to a smaller bitwidth We already try not to truncate PHIs in computeMinimalBitwidths. LoopVectorize can't handle it and we really don't need to, because both induction and reduction PHIs are truncated by other means. However, we weren't bailing out in all the places we should have, and we ended up returning a PHI to be truncated, which has caused PR27018. This fixes PR27018. llvm-svn: 264852	2016-03-30 10:11:43 +00:00
Sanjoy Das	2381fcd557	[SCEV] Extract out a MatchBinaryOp; NFCI MatchBinaryOp abstracts out the IR instructions from the operations they represent. While this change is NFC, we will use this factoring later to map things like `(extractvalue 0 (sadd.with.overflow X Y))` to `(add X Y)`. llvm-svn: 264747	2016-03-29 16:40:44 +00:00
Sanjoy Das	260ad4dd63	[SCEV] Use Operator::getOpcode instead of manual dispatch; NFC llvm-svn: 264746	2016-03-29 16:40:39 +00:00
Eugene Zelenko	35623fb7d5	Fix Clang-tidy modernize-deprecated-headers warnings in some files; other minor fixes. Differential revision: http://reviews.llvm.org/D18469 llvm-svn: 264598	2016-03-28 17:40:08 +00:00
Philip Reames	b5681138e4	Allow value forwarding past release fences in GVN A release fence acts as a publication barrier for stores within the current thread to become visible to other threads which might observe the release fence. It does not require the current thread to observe stores performed on other threads. As a result, we can allow store-load and load-load forwarding across a release fence. We choose to be much more conservative about stores. In theory, nothing prevents us from shifting a store from after a release fence to before it, and then eliminating the preceding (previously fenced) store. Doing this without actually moving the second store is likely also legal, but we chose to be conservative at this time. The LangRef indicates only atomic loads and stores are affected by fences. This patch chooses to be far more conservative than that. This is the GVN companion to http://reviews.llvm.org/D11434 which applied the same logic in EarlyCSE and has been baking in tree for a while now. Differential Revision: http://reviews.llvm.org/D11436 llvm-svn: 264472	2016-03-25 22:40:35 +00:00
Duncan P. N. Exon Smith	1d15a9f0c9	IR: Reserve an MDKind for !llvm.loop; NFC This reserves an MDKind for !llvm.loop, which allows callers to avoid a string-based lookup. I'm not sure why it was missing. There should be no functionality change here, just a small compile-time speedup. llvm-svn: 264371	2016-03-25 00:35:38 +00:00
Adam Nemet	59a6550425	[LAA] Formatting fix in previous change llvm-svn: 264244	2016-03-24 05:15:24 +00:00
Adam Nemet	279784ffc4	[LAA] Support memchecks involving loop-invariant addresses We used to only allow SCEVAddRecExpr for pointer expressions in order to be able to compute the bounds. However this is also trivially possible for loop-invariant addresses (scUnknown) since then the bounds are the address itself. Interestingly, we used to allow this for the special case when the loop-invariant address happens to also be an SCEVAddRecExpr (in an outer loop). There are a couple more loops that are vectorized in SPEC after this. My guess is that the main reason we don't see more is that, for example, a loop-invariant load is vectorized into a splat vector with several vector-inserts. This is likely to make the vectorization unprofitable. I.e. we don't notice that a later LICM will move all of this out of the loop so the cost estimate should really be 0. llvm-svn: 264243	2016-03-24 04:28:47 +00:00
Easwaran Raman	12b79aa0f1	Add getBlockProfileCount method to BlockFrequencyInfo Differential Revision: http://reviews.llvm.org/D18233 llvm-svn: 264179	2016-03-23 18:18:26 +00:00
Silviu Baranga	d68ed85401	[SCEV] Change the SCEV Predicates interfaces for conversion to AddRecExpr to return SCEVAddRecExpr* instead of SCEV* Summary: This changes the conversion functions from SCEV * to SCEVAddRecExpr from ScalarEvolution and PredicatedScalarEvolution to return a SCEVAddRecExpr* instead of a SCEV* (which removes the need of most clients to do a dyn_cast right after calling these functions). We also don't add new predicates if the transformation was not successful. This is not entirely a NFC (as it can theoretically remove some predicates from LAA when we have an unknown dependence), but I couldn't find an obvious regression test for it. Reviewers: sanjoy Subscribers: sanjoy, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D18368 llvm-svn: 264161	2016-03-23 15:29:30 +00:00
Mehdi Amini	c04fc7a60f	Rename DenseMap::resize() into DenseMap::reserve() (NFC) This is more coherent with usual containers. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 264026	2016-03-22 07:20:00 +00:00
Matt Arsenault	155dda9134	Implement constant folding for bitreverse llvm-svn: 263945	2016-03-21 15:00:35 +00:00
Silviu Baranga	f875e4fd92	[IndVars] Fix PR26974: make sure replaceCongruentIVs doesn't break LCSSA Summary: replaceCongruentIVs can break LCSSA when trying to replace IV increments since it tries to replace all uses of a phi node with another phi node while both of the phi nodes are not necessarily in the processed loop. This will cause an assert in IndVars. To fix this, we add a check to make sure that the replacement maintains LCSSA. Reviewers: sanjoy Subscribers: mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D18266 llvm-svn: 263941	2016-03-21 12:44:29 +00:00
Adam Nemet	709e3046ee	[LoopDataPrefetch] Add TTI to limit the number of iterations to prefetch ahead Summary: It can hurt performance to prefetch ahead too much. Be conservative for now and don't prefetch ahead more than 3 iterations on Cyclone. Reviewers: hfinkel Subscribers: llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D17949 llvm-svn: 263772	2016-03-18 00:27:43 +00:00
Adam Nemet	6d8beeca53	[LoopDataPrefetch/Aarch64] Allow selective prefetching of large-strided accesses Summary: And use this TTI for Cyclone. As it was explained in the original RFC (http://thread.gmane.org/gmane.comp.compilers.llvm.devel/92758), the HW prefetcher works up to 2KB strides. I am also adding tests for this and the previous change (D17943): * Cyclone prefetching accesses with a large stride * Cyclone not prefetching accesses with a small stride * Generic Aarch64 subtarget not prefetching either Reviewers: hfinkel Subscribers: aemerson, rengolin, llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D17945 llvm-svn: 263771	2016-03-18 00:27:38 +00:00
Bjorn Steinbrink	59fdec673d	Add Rust's personality function to the list of known personality functions Reviewers: majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18192 llvm-svn: 263581	2016-03-15 20:35:45 +00:00
Manuel Jacob	6be355961e	Re-add ConstantFoldInstOperands form taking opcode and return type. Summary: This form was replaced by a form taking an instruction instead of opcode and return type in r258391. After committing this change (and some depending, follow-up changes) it turned out in the review thread to be controversial. The discussion didn't come to a conclusion yet. I'm re-adding the old form to fix the API regression and to provide a better base for discussion, possibly on llvm-dev. A difference to the original function is that it can't be called with GEPs (similarly to how it was already the case for compares). In order to support opaque pointers in the future, folding GEPs needs to be passed the source element type, which is not possible with the current API. Reviewers: dberlin, reames Subscribers: dblaikie, eddyb Differential Revision: http://reviews.llvm.org/D17901 llvm-svn: 263501	2016-03-14 22:34:17 +00:00
Michael Kuperstein	b7860fedd4	[AliasSetTracker] Do not strip pointer casts when processing MemSetInst This fixes PR26843. llvm-svn: 263462	2016-03-14 18:34:29 +00:00
Fiona Glaser	2e5c0c2858	ConstantFoldInstruction: avoid wasted calls to ConstantFoldConstantExpression Check to see if all operands are constant before calling simplify on them so that we don't perform wasted simplifications. llvm-svn: 263374	2016-03-13 05:36:15 +00:00
Chandler Carruth	5bfbc3f941	[AA] Make BasicAA just require domtree. This doesn't change how many times we construct domtrees in the normal pipeline, and it removes fragility and instability where basic-aa may not be run in time to see domtrees because they happen to be constructed afterward. This isn't quite as clean as the change to memdep because there is a mode where basic-aa specifically runs without domtrees -- in the hacking version used by function-attrs with the legacy pass manager. llvm-svn: 263234	2016-03-11 13:53:18 +00:00
Chandler Carruth	aef32bd319	[memdep] Just require domtree for memdep. This doesn't cause us to construct dominator trees any more often in the normal pipeline, and removes an entire mode of memdep that needed to be reasoned about and maintained. Perhaps more importantly, it removes the ability for the results of memdep to be different because of accidental pass scheduling goofs or the order of evaluation of 'getResult' calls. Essentially, 'getCachedResult', unless across IR-unit boundaries, is extremely dangerous. We need to work much harder to avoid it (or its analog in the old pass manager). llvm-svn: 263232	2016-03-11 13:46:00 +00:00
Chandler Carruth	b47f8010a9	[PM] Make the AnalysisManager parameter to run methods a reference. This was originally a pointer to support pass managers which didn't use AnalysisManagers. However, that doesn't realistically come up much and the complexity of supporting it doesn't really make sense. In fact, many parts of the pass manager were just assuming the pointer was never null already. This at least makes it much more explicit and clear. llvm-svn: 263219	2016-03-11 11:05:24 +00:00
Chandler Carruth	b4faf13c15	[PM] Implement the final conclusion as to how the analysis IDs should work in the face of the limitations of DLLs and templated static variables. This requires passes that use the AnalysisBase mixin provide a static variable themselves. So as to keep their APIs clean, I've made these private and befriended the CRTP base class (which is the common practice). I've added documentation to AnalysisBase for why this is necessary and at what point we can go back to the much simpler system. This is clearly a better pattern than the extern template as it caught numerous places where the template magic hadn't been applied and things were "just working" but would eventually have broken mysteriously. llvm-svn: 263216	2016-03-11 10:22:49 +00:00
Chandler Carruth	45a9c203a0	[PM/AA] Teach the AAManager how to handle module analyses in addition to function analyses, and use it to wire up globals-aa to the new pass manager. llvm-svn: 263211	2016-03-11 09:15:11 +00:00
Chandler Carruth	cf3f4f25ca	[CG] Back out my pointless move ctor and add the explicit template instantiation needed for the mingw dll build bot. llvm-svn: 263114	2016-03-10 14:33:10 +00:00
Chandler Carruth	4c660f7087	[CG] Add a new pass manager printer pass for the old call graph and actually finish wiring up the old call graph. There were bugs in the old call graph that hadn't been caught because it wasn't being tested. It wasn't being tested because it wasn't in the pipeline system and we didn't have a printing pass to run in tests. This fixes all of that. As for why I'm still keeping the old call graph alive, it's so that I can port GlobalsAA to the new pass manager without forking it to work with the lazy call graph. That's clearly the right eventual design, but it seems pragmatic to defer that until it's necessary. The old call graph works just fine for GlobalsAA. llvm-svn: 263104	2016-03-10 11:24:11 +00:00
Chandler Carruth	1ecd740cf0	[CG] Actually hoist up the generic CallGraphPrinter pass from a weird location in the opt tool to live along side the analysis in LLVM's libraries. No functionality changed here, but this will allow me to port the printer to the new pass manager as well. llvm-svn: 263101	2016-03-10 11:08:44 +00:00
Chandler Carruth	5f432292a6	[CG] Rename the DOT printing pass to actually reference "DOT". There is another pass by the generic name 'CallGraphPrinter' which is actually just a call graph printer tucked away inside the opt tool. I'd like to bring it out and make it follow the same patterns as the rest of the CallGraph code, but doing so would end up conflicting with the name of the DOT printing pass. So this makes the DOT printing pass name be more precise. No functionality changed here. llvm-svn: 263100	2016-03-10 11:04:40 +00:00
Chandler Carruth	61440d225b	[PM] Port memdep to the new pass manager. This is a fairly straightforward port to the new pass manager with one exception. It removes a very questionable use of releaseMemory() in the old pass to invalidate its caches between runs on a function. I don't think this is really guaranteed to be safe. I've just used the more direct port to the new PM to address this by nuking the results object each time the pass runs. While this could cause some minor malloc traffic increase, I don't expect the compile time performance hit to be noticeable, and it makes the correctness and other aspects of the pass much easier to reason about. In some cases, it may make things faster by making the sets and maps smaller with better locality. Indeed, the measurements collected by Bruno (thanks!!!) show mostly compile time improvements. There is sadly very limited testing at this point as there are only two tests of memdep, and both rely on GVN. I'll be porting GVN next and that will exercise this heavily though. Differential Revision: http://reviews.llvm.org/D17962 llvm-svn: 263082	2016-03-10 00:55:30 +00:00
Philip Reames	d9f4a3d18c	[BasicAA/MDA] Sink aliasing rules for malloc and calloc into BasicAA MemoryDependenceAnalysis had a hard-coded exception to the general aliasing rules for malloc and calloc. The reasoning that applied there is equally valid in BasicAA and clarifies the remaining logic in MDA. In principle, this can expose slightly more optimization opportunities, but since essentially all of our aliasing aware memory optimization passes go through MDA, this will likely be NFC in practice. Differential Revision: http://reviews.llvm.org/D15912 llvm-svn: 263075	2016-03-09 23:19:56 +00:00
Philip Reames	8f12eba78d	[ValueTracking] Extract isKnownPositive [NFCI] Extract out a generic interface from a recently landed patch and document a TODO in case compile time becomes a problem. llvm-svn: 263062	2016-03-09 21:31:47 +00:00
Sanjoy Das	97d19bd95f	[SCEV] Slightly generalize getRangeViaFactoring Building on the previous change, this generalizes ScalarEvolution::getRangeViaFactoring to work with {Ext(C?A:B)+k0,+,Ext(C?A:B)+k1} where Ext can be a zero extend, sign extend or truncate operation, and k0 and k1 are constants. llvm-svn: 262979	2016-03-09 01:51:02 +00:00
Sanjoy Das	d3488c6060	[SCEV] Slightly generalize getRangeViaFactoring This change generalizes ScalarEvolution::getRangeViaFactoring to work with {Ext(C?A:B),+,Ext(C?A:B)} where Ext can be a zero extend, sign extend or truncate operation. llvm-svn: 262978	2016-03-09 01:50:57 +00:00
Sanjay Patel	b8d071bc8a	use range-based for loop; NFCI llvm-svn: 262956	2016-03-08 20:53:48 +00:00
Easwaran Raman	b1bd398ceb	Revert revisions 262636, 262643, 262679, and 262682. llvm-svn: 262883	2016-03-08 00:36:35 +00:00
Chandler Carruth	af8321ecf7	[memdep] Switch to range based for loops. llvm-svn: 262831	2016-03-07 15:12:57 +00:00
Chandler Carruth	b32febe48e	[memdep] Switch a function to return true on success instead of false. This is much more clear and less surprising IMO. It also makes things more consistent with the increasingly large chunk of LLVM code that assumes true-on-success. llvm-svn: 262826	2016-03-07 12:45:07 +00:00
Chandler Carruth	40e21f2a20	[memdep] Cleanup the implementation doxygen comments and remove duplicated comments. In several cases these had diverged making them especially nice to canonicalize. I checked to make sure we weren't losing important information of course. llvm-svn: 262825	2016-03-07 12:30:06 +00:00
Chandler Carruth	60fb1b4bd2	[memdep] Run clang-format over the header before porting it to the new pass manager. The port will involve substantial edits here, and would likely introduce bad formatting if formatted in isolation, so just get all the formatting up to snuff. I'll also go through and try to freshen the doxygen here as well as modernizing some of the code. llvm-svn: 262821	2016-03-07 10:19:30 +00:00
Philip Reames	a0c9f6e736	[LVI] Fix a bug which prevented use of !range metadata within a query The diff is relatively large since I took a chance to rearrange the code I had to touch in a more obvious way, but the key bit is merely using the !range metadata when we can't analyze the instruction further. The previous !range metadata code was essentially just dead since no binary operator or cast will have !range metadata (per Verifier) and it was otherwise dropped on the floor. llvm-svn: 262751	2016-03-04 22:27:39 +00:00
Easwaran Raman	588c68a87b	Fix a memory leak. llvm-svn: 262682	2016-03-04 01:18:40 +00:00
Philip Reames	b7270446cf	[ValueTracking] "constant fold" an experimental hidden option llvm-svn: 262648	2016-03-03 19:50:32 +00:00
Philip Reames	146307eb52	[ValueTracking] Remove dead code from an old experiment This experiment was originally about trying to use facts implied by dominating conditions to infer more precise known bits. While the compile time was found to be acceptable on several large code bases, we never found sufficiently profitable examples to justify turning on the code by default. Given this, it's time to abandon the experiment. Several folks have commented that they've found this useful for experimentation, but nothing has come of those experiments. Given how easy the patch is to apply, there's no reason to leave the code in tree. For anyone interested in further investigation in this area, I recommend finding the summary email I sent on one of the original review threads. In particular, I now believe the use-list based approach is strictly worse than the dom-tree-walking approach. llvm-svn: 262646	2016-03-03 19:44:06 +00:00
Easwaran Raman	fd6557e368	Fix breakage caused by r262636. Use LLVM_ATTRIBUTE_UNUSED instead of __attribute_((unused)) llvm-svn: 262643	2016-03-03 18:53:20 +00:00
Sanjoy Das	724f5cf278	[SCEV] Prove no-overflow via constant ranges Exploit ScalarEvolution::getRange's newly acquired smartness (since r262438) by using that to infer nsw and nuw when possible. llvm-svn: 262639	2016-03-03 18:31:29 +00:00
Sanjoy Das	11ef606f1d	[SCEV] Be less eager about demoting zexts to sexts After r262438 we can have provably positive NSW SCEV expressions whose zero extensions cannot be simplified (since r262438 makes SCEV better at computing constant ranges). This means demoting sexts of positive add recurrences eagerly can result in an unsimplified zero extension where we could have had a simplified sign extension. This change fixes the issue by teaching SCEV to demote sext of a positive SCEV expression to a zext only if the sext could not be simplified. llvm-svn: 262638	2016-03-03 18:31:23 +00:00
Easwaran Raman	3035719c86	Infrastructure for PGO enhancements in inliner This patch provides the following infrastructure for PGO enhancements in inliner: Enable the use of block level profile information in inliner Incremental update of block frequency information during inlining Update the function entry counts of callees when they get inlined into callers. Differential Revision: http://reviews.llvm.org/D16381 llvm-svn: 262636	2016-03-03 18:26:33 +00:00
Chandler Carruth	12884f7f80	[AA] Hoist the logic to reformulate various AA queries in terms of other parts of the AA interface out of the base class of every single AA result object. Because this logic reformulates the query in terms of some other aspect of the API, it would easily cause O(n^2) query patterns in alias analysis. These could in turn be magnified further based on the number of call arguments, and then further based on the number of AA queries made for a particular call. This ended up causing problems for Rust that were actually noticeable enough to get a bug (PR26564) and probably other places as well. When originally re-working the AA infrastructure, the desire was to regularize the pattern of refinement without losing any generality. While I think it was successful, that is clearly proving to be too costly. And the cost is needless: we gain no actual improvement for this generality of making a direct query to tbaa actually be able to re-use some other alias analysis's refinement logic for one of the other APIs, or some such. In short, this is entirely wasted work. To the extent possible, delegation to other API surfaces should be done at the aggregation layer so that we can avoid re-walking the aggregation. In fact, this significantly simplifies the logic as we no longer need to smuggle the aggregation layer into each alias analysis (or the TargetLibraryInfo into each alias analysis just so we can form argument memory locations!). However, we also have some delegation logic inside of BasicAA and some of it even makes sense. When the delegation logic is baking in specific knowledge of aliasing properties of the LLVM IR, as opposed to simply reformulating the query to utilize a different alias analysis interface entry point, it makes a lot of sense to restrict that logic to a different layer such as BasicAA.
So one aspect of the delegation that was in every AA base class is that when we don't have operand bundles, we re-use function AA results as a fallback for callsite alias results. This relies on the IR properties of calls and functions w.r.t. aliasing, and so seems a better fit to BasicAA. I've lifted the logic up to that point where it seems to be a natural fit. This still does a bit of redundant work (we query function attributes twice, once via the callsite and once via the function AA query) but it is exactly twice here, no more. The end result is that all of the delegation logic is hoisted out of the base class and into either the aggregation layer when it is a pure retargeting to a different API surface, or into BasicAA when it relies on the IR's aliasing properties. This should fix the quadratic query pattern reported in PR26564, although I don't have a stand-alone test case to reproduce it. It also seems general goodness. Now the numerous AAs that don't need target library info don't carry it around and depend on it. I think I can even rip out the general access to the aggregation layer and only expose that in BasicAA as it is the only place where we re-query in that manner. However, this is a non-trivial change to the AA infrastructure so I want to get some additional eyes on this before it lands. Sadly, it can't wait long because we should really cherry pick this into 3.8 if we're going to go this route. Differential Revision: http://reviews.llvm.org/D17329 llvm-svn: 262490	2016-03-02 15:56:53 +00:00
Sanjoy Das	dcd3a88e29	[SCEV] Minor naming, braces cleanup; NFC llvm-svn: 262459	2016-03-02 04:52:22 +00:00
Sanjoy Das	6b017a11ba	Add a comment with a rationale for the unusual code structure llvm-svn: 262454	2016-03-02 02:56:29 +00:00
Sanjoy Das	eca1b53b95	Qualify getRangeForAffineAR with this-> for MSVC llvm-svn: 262453	2016-03-02 02:44:08 +00:00
Sanjoy Das	1168f93c2b	Perturb code in an attempt to appease MSVC For some reason MSVC seems to think I'm calling getConstant() from a static context. Try to avoid this issue by explicitly specifying 'this->' (though I'm not confident that this will actually work). llvm-svn: 262451	2016-03-02 02:34:20 +00:00
Sanjoy Das	62a1c33929	More code permutation to appease MSVC llvm-svn: 262449	2016-03-02 02:15:42 +00:00
Sanjoy Das	9e5ebf145c	Remove "auto" to appease the MSVC bots llvm-svn: 262448	2016-03-02 01:59:37 +00:00
Sanjoy Das	bf73098472	[SCEV] Make getRange smarter around selects Have ScalarEvolution::getRange re-consider cases like "{C?A:B,+,C?P:Q}" by factoring out "C" and computing RangeOf({A,+,P}) union RangeOf({B,+,Q}) instead. The latter can be easier to compute precisely in cases like "{C?0:N,+,C?1:-1}", where N is the backedge taken count of the loop, since in such cases the latter form simplifies to [0,N+1) union [0,N+1). llvm-svn: 262438	2016-03-02 00:57:54 +00:00
Sanjoy Das	b765b633cb	[SCEV] Extract out a getRangeForAffineAR; NFC Pure code-motion change. Will be used later in making getRange more clever. llvm-svn: 262437	2016-03-02 00:57:39 +00:00
Sanjoy Das	f1e9cae00e	[SCEV] Minor cleanup: rename method, C++11'ify; NFC llvm-svn: 262374	2016-03-01 19:28:01 +00:00
Adam Nemet	b8486e5a32	[LAA] Add missing debug output llvm-svn: 262279	2016-03-01 00:50:08 +00:00
Benjamin Kramer	6bb15021b3	[InstSimplify] Restore fsub 0.0, (fsub 0.0, X) ==> X optzn I accidentally removed this in r262212 but there was no test coverage to detect it. llvm-svn: 262215	2016-02-29 12:18:25 +00:00
Benjamin Kramer	f5b2a47ac6	[InstSimplify] fsub 0.0, (fsub -0.0, X) ==> X is only safe if signed zeros are ignored. Only allow fsub -0.0, (fsub -0.0, X) ==> X without nsz. PR26746. llvm-svn: 262212	2016-02-29 11:12:23 +00:00
NAKAMURA Takumi	df0cd72657	[PM] Appease mingw32's auto-import DLL build with minimal tweaks, with fix for clang. char AnalysisBase::ID should be declared as extern and defined in one module. llvm-svn: 262188	2016-02-28 17:17:00 +00:00
NAKAMURA Takumi	ca04a1f720	Revert r262185, "[PM] Appease mingw32's auto-import DLL build with minimal tweaks." I'll rework soon. llvm-svn: 262186	2016-02-28 16:54:06 +00:00
NAKAMURA Takumi	de40e7437e	[PM] Appease mingw32's auto-import DLL build with minimal tweaks. char AnalysisBase::ID should be declared as extern and defined in one module. llvm-svn: 262185	2016-02-28 16:38:46 +00:00
Chandler Carruth	afcec4c55a	[PM] Provide explicit instantiation declarations and definitions for the PassManager and AnalysisManager template specializations as well. llvm-svn: 262128	2016-02-27 10:45:35 +00:00
Chandler Carruth	2a54094d40	[PM] Provide two templates for the two directionalities of analysis manager proxies and use those rather than repeating their definition four times. There are real differences between the two directions: outer AMs are const and don't need to have invalidation tracked. But every proxy in a particular direction is identical except for the analysis manager type and the IR unit they proxy into. This makes them prime candidates for nice templates. I've started introducing explicit template instantiation declarations and definitions as well because we really shouldn't be emitting all this everywhere. I'm going to go back and add the same for the other templates like this in a follow-up patch. I've left the analysis manager as an opaque type rather than using two IR units and requiring it to be an AnalysisManager template specialization. I think it's important that users retain the ability to provide their own custom analysis management layer, and provided it has the appropriate API, everything should Just Work. llvm-svn: 262127	2016-02-27 10:38:10 +00:00
Philip Reames	70b391864d	Suppress an uncovered switch warning [NFC] llvm-svn: 262109	2016-02-27 05:18:30 +00:00
Philip Reames	adf0e35308	[LVI] Extend select handling to catch min/max/clamp idioms Most of this is fairly straightforward. Add handling for min/max via existing matcher utility and ConstantRange routines. Add handling for clamp by exploiting condition constraints on inputs. Note that I'm only handling two constant ranges at this point. It would be reasonable to consider treating overdefined as a full range if the instruction is typed as an integer, but that should be a separate change. Differential Revision: http://reviews.llvm.org/D17184 llvm-svn: 262085	2016-02-26 22:53:59 +00:00
Chandler Carruth	3a63435551	[PM] Introduce CRTP mixin base classes to help define passes and analyses in the new pass manager. These just handle really basic stuff: turning a type name into a string statically that is nice to print in logs, and getting a static unique ID for each analysis. Sadly, the format of passes in anonymous namespaces makes using their names in tests really annoying so I've customized the names of the no-op passes to keep tests sane to read. This is the first of a few simplifying refactorings for the new pass manager that should reduce boilerplate and confusion. llvm-svn: 262004	2016-02-26 11:44:45 +00:00
Michael Zolotukhin	9f520ebc54	[LoopUnrollAnalyzer] Check that we're using SCEV for the same loop we're simulating. Summary: Check that we're using SCEV for the same loop we're simulating. Otherwise, we might try to use the iteration number of the current loop in SCEV expressions for inner/outer loops IVs, which is clearly incorrect. Reviewers: chandlerc, hfinkel Subscribers: sanjoy, llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D17632 llvm-svn: 261958	2016-02-26 02:57:05 +00:00
Hongbin Zheng	bc53977a0d	Introduce RegionInfoAnalysis, which compute Region Tree in the new PassManager. NFC Differential Revision: http://reviews.llvm.org/D17571 llvm-svn: 261904	2016-02-25 17:54:25 +00:00
Hongbin Zheng	751337faa7	Introduce DominanceFrontierAnalysis to the new PassManager to compute DominanceFrontier. NFC Differential Revision: http://reviews.llvm.org/D17570 llvm-svn: 261903	2016-02-25 17:54:15 +00:00
Hongbin Zheng	3f97840721	Introduce analysis pass to compute PostDominators in the new pass manager. NFC Differential Revision: http://reviews.llvm.org/D17537 llvm-svn: 261902	2016-02-25 17:54:07 +00:00
Hongbin Zheng	66b19fbc4e	Revert "Introduce analysis pass to compute PostDominators in the new pass manager. NFC" This reverts commit a3e5cc6a51ab5ad88d1760c63284294a4e34c018. llvm-svn: 261891	2016-02-25 16:45:53 +00:00
Hongbin Zheng	ad782ce3f7	Revert "Introduce DominanceFrontierAnalysis to the new PassManager to compute DominanceFrontier. NFC" This reverts commit 109c38b2226a87b0be73fa7a0a8c1a81df20aeb2. llvm-svn: 261890	2016-02-25 16:45:46 +00:00
Hongbin Zheng	921fabf34b	Revert "Introduce RegionInfoAnalysis, which compute Region Tree in the new PassManager. NFC" This reverts commit 8228b4d374edeb4cc0c5fddf6e1ab876918ee126. llvm-svn: 261889	2016-02-25 16:45:37 +00:00
Hongbin Zheng	2fa386fd6c	Introduce RegionInfoAnalysis, which compute Region Tree in the new PassManager. NFC Differential Revision: http://reviews.llvm.org/D17571 llvm-svn: 261884	2016-02-25 16:33:26 +00:00
Hongbin Zheng	237197ba63	Introduce DominanceFrontierAnalysis to the new PassManager to compute DominanceFrontier. NFC Differential Revision: http://reviews.llvm.org/D17570 llvm-svn: 261883	2016-02-25 16:33:15 +00:00
Hongbin Zheng	a0273a04f5	Introduce analysis pass to compute PostDominators in the new pass manager. NFC Differential Revision: http://reviews.llvm.org/D17537 llvm-svn: 261882	2016-02-25 16:33:06 +00:00
Justin Bogner	eecc3c826a	PM: Implement a basic loop pass manager This creates the new-style LoopPassManager and wires it up with dummy and print passes. This version doesn't support modifying the loop nest at all. It will be far easier to discuss and evaluate the approaches to that with this in place so that the boilerplate is out of the way. llvm-svn: 261831	2016-02-25 07:23:08 +00:00
Artur Pilipenko	31bcca47d3	NFC. Move isDereferenceable to Loads.h/cpp This is a part of the refactoring to unify isSafeToLoadUnconditionally and isDereferenceablePointer functions. In a subsequent change I'm going to eliminate isDereferenceableAndAlignedPointer from the Loads API, leaving isSafeToLoadSpeculatively as the only function to check whether a load instruction can be speculated. Reviewed By: hfinkel Differential Revision: http://reviews.llvm.org/D16180 llvm-svn: 261736	2016-02-24 12:49:04 +00:00
Artur Pilipenko	ae51afc5c7	NFC. Move getAlignment helper function from ValueTracking to Value class. Reviewed By: reames, hfinkel Differential Revision: http://reviews.llvm.org/D16144 llvm-svn: 261735	2016-02-24 12:25:10 +00:00
Chandler Carruth	c5d211ef2c	[PM] Remove an overly aggressive assert now that I can actually test the pattern that triggers it. This essentially requires an immutable function analysis, as that will survive anything we do to invalidate it. When we have such patterns, the function analysis manager will not get cleared between runs of the proxy. If we actually need an assert about how things are queried, we can add more elaborate machinery for computing it, but so far I'm not aware of significant value provided. Thanks to Justin Lebar for noticing this when he made a (seemingly innocuous) change to FunctionAttrs that is enough to trigger it in one test there. Now it is covered by a direct test of the pass manager code. llvm-svn: 261627	2016-02-23 10:47:57 +00:00
Chandler Carruth	77b6e47f74	[PM] Improve the API and comments around the analysis manager proxies. These are really handles that ensure the analyses get cleared at appropriate places, and as such copying doesn't really make sense. Instead, they should look more like unique ownership objects. Make that the case. Relatedly, if you create a temporary of one and move out of it its destructor shouldn't actually clear anything. I don't think there is any code that can trigger this currently, but it seems like a more robust implementation. If folks want, I can add a unittest that forces this to be exercised, but that seems somewhat pointless -- whether a temporary is ever created in the innards of AnalysisManager is not really something we should be adding a reliance on, but I didn't want to leave a timebomb in the code here. If anyone has a cleaner way to represent this, I'm all ears, but I wanted to assure myself that this wasn't in fact responsible for another bug I'm chasing down (it wasn't) and figured I'd commit that. llvm-svn: 261594	2016-02-23 00:05:00 +00:00
Krzysztof Parzyszek	e261e5ac47	More detailed dependence test between volatile and non-volatile accesses Differential Revision: http://reviews.llvm.org/D16857 llvm-svn: 261589	2016-02-22 23:07:43 +00:00
Sanjoy Das	5079f6260f	[ConstantRange] Rename a method and add more doc Rename makeNoWrapRegion to a more obvious makeGuaranteedNoWrapRegion, and add a comment about the counter-intuitive aspects of the function. This is to help prevent cases like PR26628. llvm-svn: 261532	2016-02-22 16:13:02 +00:00
Duncan P. N. Exon Smith	e9bc579c37	ADT: Remove == and != comparisons between ilist iterators and pointers I missed == and != when I removed implicit conversions between iterators and pointers in r252380 since they were defined outside ilist_iterator. Since they depend on getNodePtrUnchecked(), they indirectly rely on UB. This commit removes all uses of these operators. (I'll delete the operators themselves in a separate commit so that it can be easily reverted if necessary.) There should be NFC here. llvm-svn: 261498	2016-02-21 20:39:50 +00:00
Tobias Grosser	934fcf4dc6	ScalarEvolution: Only erase temporary values if they actually have been added This addresses post-review comments from Sanjoy Das for r261485. llvm-svn: 261486	2016-02-21 18:50:09 +00:00
Tobias Grosser	11332e5ec5	ScalarEvolution: Do not keep temporary PHI values in ValueExprMap Before this patch simplified SCEV expressions for PHI nodes were only returned the very first time getSCEV() was called, but later calls to getSCEV always returned the non-simplified value, which had "temporarily" been stored in the ValueExprMap, but was never removed and consequently blocked the caching of the simplified PHI expression. llvm-svn: 261485	2016-02-21 17:42:10 +00:00
Joerg Sonnenberger	36894dcfed	When MemoryDependenceAnalysis hits a CFG with many transparent blocks, the algorithm easily degrades into quadratic memory and time complexity. The easiest example is a long chain of BBs that don't otherwise use a location. The caching will add an entry for every intermediate block and limiting the number of results doesn't help as no results are produced until a definition is found. Introduce a limit similar to the existing instructions-per-block limit. This limit counts the total number of blocks checked. If the limit is reached, entries are considered unknown. The initial value is 1000, which avoids regressions for normal sized functions while still limiting edge cases to reasonable memory consumption and execution time. Differential Revision: http://reviews.llvm.org/D16123 llvm-svn: 261430	2016-02-20 11:24:44 +00:00
Benjamin Kramer	2337c1fe13	[LVI] Move ConstantRanges instead of copying. No functional change intended. Copying small (<= 64 bits) APInts isn't expensive but bloats code by generating the slow path everywhere. Moving doesn't care about the size of the value. llvm-svn: 261426	2016-02-20 10:40:34 +00:00
Chandler Carruth	342c671b66	[PM/AA] Wire up CFLAA to the new pass manager fully, and port one of its tests over to exercise this code. This uncovered a few missing bits here and there in the analysis, but nothing interesting. llvm-svn: 261404	2016-02-20 03:52:02 +00:00
Chandler Carruth	4f846a5f15	[PM/AA] Port alias analysis evaluator to the new pass manager, and use it to actually test the new pass manager AA wiring. This patch was extracted from the (somewhat too large) D12357 and rebased on top of the slightly different design of the new pass manager AA wiring that I just landed. With this we can start testing the AA in a thorough way with the new pass manager. Some minor cleanups to the code in the pass were necessitated here, but otherwise it is a very minimal change. Differential Revision: http://reviews.llvm.org/D17372 llvm-svn: 261403	2016-02-20 03:46:03 +00:00
Sanjoy Das	807d33da96	[SCEV] Don't spell `SCEV ` variables as `Scev`; NFC It reads odd since most other places name a `SCEV ` as `S`. Pure renaming change. llvm-svn: 261393	2016-02-20 01:44:10 +00:00
Sanjoy Das	c42f7cc3f8	[SCEV] Don't use std::make_pair; NFC `{A, B}` reads cleaner than `std::make_pair(A, B)`. llvm-svn: 261392	2016-02-20 01:35:56 +00:00
Richard Trieu	7a08381403	Remove uses of builtin comma operator. Cleanup for upcoming Clang warning -Wcomma. No functionality change intended. llvm-svn: 261270	2016-02-18 22:09:30 +00:00
Philip Reames	bd09e86f82	[CaptureTracking] Support atomicrmw and cmpxchg These atomic operations are conceptually both a load and store from the same location. As such, we can treat them as the most conservative of those two components which, in practice, means we can treat them like stores. A cmpxchg or atomicrmw captures the values, but not the locations accessed. Note: We can probably be more aggressive about the comparison value in a cmpxchg since to have it be in memory, it must already be captured, but I figured it was better to avoid that for the moment. Note 2: It turns out that since we don't actually support cmpxchg of pointer type, writing a negative test is impossible. Differential Revision: http://reviews.llvm.org/D17400 llvm-svn: 261245	2016-02-18 19:23:27 +00:00
Haicheng Wu	5cf99095bb	[AliasSetTracker] Teach AliasSetTracker about MemSetInst This change is to fix the problem discussed in http://lists.llvm.org/pipermail/llvm-dev/2016-February/095446.html. llvm-svn: 261052	2016-02-17 02:01:50 +00:00
Chandler Carruth	e5944d97d8	[LCG] Construct an actual call graph with call-edge SCCs nested inside reference-edge SCCs. This essentially builds a more normal call graph as a subgraph of the "reference graph" that was the old model. This allows both to exist and the different use cases to use the aspect which addresses their needs. Specifically, the pass manager and other ordering constrained logic can use the reference graph to achieve conservative order of visit, while analyses reasoning about attributes and other properties derived from reachability can reason about the direct call graph. Note that this isn't necessarily complete: it doesn't model edges to declarations or indirect calls. Those can be found by scanning the instructions of the function if desirable, and in fact every user currently does this in order to handle things like calls to intrinsics. If useful, we could consider caching this information in the call graph to save the instruction scans, but currently that doesn't seem to be important. An important realization for why the representation chosen here works is that the call graph is a formal subset of the reference graph and thus both can live within the same data structure. All SCCs of the call graph are necessarily contained within an SCC of the reference graph, etc. The design is to build 'RefSCC's to model SCCs of the reference graph, and then within them more literal SCCs for the call graph. The formation of actual call edge SCCs is not done lazily, unlike reference edge 'RefSCC's. Instead, once a reference SCC is formed, it directly builds the call SCCs within it and stores them in a post-order sequence. This is used to provide a consistent platform for mutation and update of the graph. The post-order also allows for very efficient updates in common cases by bounding the number of nodes (and thus edges) considered.
There is considerable common code that I'm still looking for the best way to factor out between the various DFS implementations here. So far, my attempts have made the code harder to read and understand despite reducing the duplication, which seems a poor tradeoff. I've not given up on figuring out the right way to do this, but I wanted to wait until I at least had the system working and tested to continue attempting to factor it differently. This also requires introducing several new algorithms in order to handle all of the incremental update scenarios for the more complex structure involving two edge colorings. I've tried to comment the algorithms sufficiently to make it clear how this is expected to work, but they may still need more extensive documentation. I know that there are some changes which are not strictly necessarily coupled here. The process of developing this started out with a very focused set of changes for the new structure of the graph and algorithms, but subsequent changes to bring the APIs and code into consistent and understandable patterns also ended up touching on other aspects. There was no good way to separate these out without causing massive merge conflicts. Ultimately, to a large degree this is a rewrite of most of the core algorithms in the LCG class and so I don't think it really matters much. Many thanks to the careful review by Sanjoy Das! Differential Revision: http://reviews.llvm.org/D16802 llvm-svn: 261040	2016-02-17 00:18:16 +00:00
Philip Reames	845435c86a	Revert 260705, it appears to be causing pr26628 The root issue appears to be a confusion around what makeNoWrapRegion actually does. It seems likely we need two versions of this function with slightly different semantics. llvm-svn: 260981	2016-02-16 17:14:30 +00:00
Junmo Park	6ebdc14cf1	[SCEVExpander] Make findExistingExpansion smarter Summary: Extend findExistingExpansion so that it can use an existing value in ExprValueMap. This patch gives 0.3~0.5% performance improvements on benchmarks (test-suite, spec2000, spec2006, commercial benchmark) Reviewers: mzolotukhin, sanjoy, zzheng Differential Revision: http://reviews.llvm.org/D15559 llvm-svn: 260938	2016-02-16 06:46:58 +00:00
Chandler Carruth	6f5770b10f	[PM/AA] Actually wire the AAManager I built for the new pass manager into the new pass manager and fix the latent bugs there. This lets everything live together nicely, but it isn't really useful yet. I never finished wiring the AA layer up for the new pass manager, and so subsequent patches will change this to do that wiring and get AA stuff more fully integrated into the new pass manager. Turns out this is necessary even to get functionattrs ported over. =] llvm-svn: 260836	2016-02-13 23:32:00 +00:00
Benjamin Kramer	8f59adb217	[ConstantFolding] Reduce APInt and APFloat copying. llvm-svn: 260826	2016-02-13 16:54:14 +00:00
Justin Lebar	144c5a6c15	Add convergent property to CodeMetrics. Summary: No functional changes. Reviewers: jingyue, arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17126 llvm-svn: 260728	2016-02-12 21:01:31 +00:00
Philip Reames	2b9100dfbd	[LVI] Exploit nsw/nuw when computing constant ranges As the title says. Modelled after similar code in SCEV. This is useful when analysing induction variables in loops which have been canonicalized by other passes. I wrote the tests as non-loops specifically to avoid the generality introduced in http://reviews.llvm.org/D17174. While that can handle many induction variables without needing to exploit nsw, there's no reason not to use it if we've already proven it. Differential Revision: http://reviews.llvm.org/D17177 llvm-svn: 260705	2016-02-12 19:05:16 +00:00
Philip Reames	854a84c0b0	[LVI] Improve select handling to use condition This patch teaches LVI to recognize clamp idioms (e.g. select(a > 5, a, 5) will always produce something greater than or equal to 5). The tests end up being somewhat simplistic because trying to exercise the case I actually care about (a loop with a range check on a clamped secondary induction variable) ends up tripping across a couple of other imprecisions in the analysis. Ah, the joys of LVI... Differential Revision: http://reviews.llvm.org/D16827 llvm-svn: 260627	2016-02-12 00:09:18 +00:00
Artur Pilipenko	66d6d3eb2d	Make context-sensitive isDereferenceable queries in isSafeToLoadUnconditionally This is a part of the refactoring to unify isSafeToLoadUnconditionally and isDereferenceablePointer functions. In the subsequent change isSafeToSpeculativelyExecute will be modified to use isSafeToLoadUnconditionally instead of isDereferenceableAndAlignedPointer. Reviewed By: reames Differential Revision: http://reviews.llvm.org/D16227 llvm-svn: 260520	2016-02-11 13:42:59 +00:00
Philip Reames	bb781b46e2	[LVI] Handle constants defensively There's nothing preventing callers of LVI from asking for lattice values representing a Constant. In fact, given that several callers are walking back through PHI nodes and trying to simplify predicates, such queries are actually quite common. This is mostly harmless today, but we start violating assertions if we add new calls to getBlockValue in otherwise reasonable places. Note that this change is not NFC. Specifically: 1) The result returned through getValueAt will now be more precise. In principle, this could trigger any latent infinite optimization loops in callers, but in practice, we're unlikely to see this. 2) The result returned through getBlockValueAt is potentially weakened for non-constants that were previously queried. With the old code, you had the possibility that a later query might bypass the cache and discover some information the original query did not. I can't find a scenario which actually causes this to happen, but it was in principle possible. On the other hand, this may end up reducing compile time when the same value is queried repeatedly. llvm-svn: 260439	2016-02-10 21:46:32 +00:00
Sanjoy Das	ef8ed0c0db	[MemoryBuiltins] Fix an issue with hasNoAliasAttr Summary: `hasNoAliasAttr` is buggy: it checks to see if the called function has a `noalias` attribute, which is incorrect since functions are not even allowed to have the `noalias` attribute. The comment on its only caller, `llvm::isNoAliasFn`, makes it pretty clear that the intention is to do the `noalias` check on the return value, and not the callee. Unfortunately I couldn't find a way to test this upstream -- fixing this does not change the observable behavior of any of the passes that use this. This is not very surprising, since `noalias` does not tell anything about the contents of the allocated memory (so, e.g., you still cannot fold loads). I'll be happy to be proven wrong though. Reviewers: chandlerc, reames Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D17037 llvm-svn: 260298	2016-02-09 21:54:18 +00:00
Sanjoy Das	ca2edc7ad5	[GMR/OperandBundles] Teach getModRefBehavior about operand bundles In general, memory restrictions on a called function (e.g. readnone) cannot be transferred to a CallSite that has operand bundles. It is possible to make this inference smarter, but let's fix the behavior to be correct first. llvm-svn: 260193	2016-02-09 02:31:47 +00:00
Sanjoy Das	1c481f50d2	Add an "addUsedAAAnalyses" helper function Summary: Passes that call `getAnalysisIfAvailable<T>` also need to call `addUsedIfAvailable<T>` in `getAnalysisUsage` to indicate to the legacy pass manager that it uses `T`. This contract was being violated by passes that used `createLegacyPMAAResults`. This change fixes this by exposing a helper in AliasAnalysis.h, `addUsedAAAnalyses`, that is complementary to createLegacyPMAAResults and does the right thing when called from `getAnalysisUsage`. Reviewers: chandlerc Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D17010 llvm-svn: 260183	2016-02-09 01:21:57 +00:00
Sanjoy Das	55394d929c	Remove SCEVAAWrapperPass from createLegacyPMAAResults; NFC Summary: createLegacyPMAAResults is only called by CGSCC and Module passes, so the call to getAnalysisIfAvailable<SCEVAAWrapperPass>() never succeeds (SCEVAAWrapperPass is a function pass). Reviewers: chandlerc Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D17009 llvm-svn: 260182	2016-02-09 01:21:50 +00:00
Wei Mi	fc1cab305f	This patch is to fix PR26529 caused by r259736. IndVarSimplify assumes scAddRecExpr to be expanded in literal form instead of canonical form by calling disableCanonicalMode after it creates SCEVExpander. When CanonicalMode is disabled, SCEVExpander::expand should always return a PHI node for scAddRecExpr. r259736 broke the assumption. The fix is to let SCEVExpander::expand skip the reuse Value logic if CanonicalMode is false. In addition, besides IndVarSimplify, the LSR pass also calls disableCanonicalMode before doing the rewrite. We can remove the original check of LSRMode in the reuse Value logic and use CanonicalMode instead. llvm-svn: 260174	2016-02-09 00:07:08 +00:00
Michael Zolotukhin	1da4afdfc9	Factor out UnrollAnalyzer to Analysis, and add unit tests for it. Summary: Unrolling Analyzer is already pretty complicated, and it becomes harder and harder to exercise it with usual IR tests, as with them we can only check the final decision: whether the loop is unrolled or not. This change factors this framework out from LoopUnrollPass to analyses, which allows the use of unit tests. The change itself is supposed to be NFC, except for adding a couple of tests. I plan to add more tests as I add new functionality and find/fix bugs. Reviewers: chandlerc, hfinkel, sanjoy Subscribers: zzheng, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D16623 llvm-svn: 260169	2016-02-08 23:03:59 +00:00
Silviu Baranga	ea63a7f512	[SCEV][LAA] Re-commit r260085 and r260086, this time with a fix for the memory sanitizer issue. The PredicatedScalarEvolution's copy constructor wasn't copying the Generation value, and was leaving it un-initialized. Original commit message: [SCEV][LAA] Add no wrap SCEV predicates and use them to improve strided pointer detection Summary: This change adds no wrap SCEV predicates with: - support for runtime checking - support for expression rewriting: sext({x,+,y}) -> {sext(x),+,sext(y)} and zext({x,+,y}) -> {zext(x),+,sext(y)} Note that we are sign extending the increment of the SCEV, even for the zext case. This is needed to cover the fairly common case where y would be a (small) negative integer. In order to do this, this change adds two new flags: nusw and nssw that are applicable to AddRecExprs and permit the transformations above. We also change isStridedPtr in LAA to be able to make use of these predicates. With this feature we should now always be able to work around overflow issues in the dependence analysis. Reviewers: mzolotukhin, sanjoy, anemet Subscribers: mzolotukhin, sanjoy, llvm-commits, rengolin, jmolloy, hfinkel Differential Revision: http://reviews.llvm.org/D15412 llvm-svn: 260112	2016-02-08 17:02:45 +00:00

6578 Commits