llvm-project

Commit Graph

Author	SHA1	Message	Date
Nikita Popov	a0b5496026	[PredicateInfo] Add test for multiple branches on same condition (NFC) This illustrates a case where RenamedOp does not correspond to the value used in the condition, which it ideally should.	2020-07-10 20:59:03 +02:00
Craig Topper	1cf6f210a2	[IR] Disable select ? C : undef -> C fold in ConstantFoldSelectInstruction unless we know C isn't poison. This matches the recent change to InstSimplify from D83440. Differential Revision: https://reviews.llvm.org/D83535	2020-07-10 10:42:25 -07:00
Roman Lebedev	1d542f0ca8	Revert "[OpenMPOpt] ICV Tracking" There appears to be some kind of memory corruption/use-after-free/etc going on here. In particular, in `OpenMPOpt::deleteParallelRegions()`, in `DeleteCallCB()`, `CI` is garbage. Will post reproducer in the original review. This reverts commit `6c4a5e9257`.	2020-07-10 19:00:15 +03:00
Johannes Doerfert	43d8d59d6d	[Attributor][NFC] Update tests after recent changes Attributor tests are mostly updated using the auto upgrade scripts but sometimes we forget. If we do it manually or continue using old check lines that still match we see unrelated changes down the line. This is just a cleanup.	2020-07-10 10:39:32 -05:00
Roman Lebedev	2655a70a04	[InstCombine] After merging store into successor, queue prev. store to be visited (PR46661) We can end up in a situation with many stores eligible for the transform, but due to our visitation order (top to bottom), once we have processed the first eligible instruction, we would not try to reprocess the previous instructions that are now also eligible. So after we've successfully merged a store that was the second-to-last instruction into the successor, if the now-second-to-last instruction is also an eligible store, add it to the worklist to be revisited. Fixes https://bugs.llvm.org/show_bug.cgi?id=46661	2020-07-10 17:49:16 +03:00
Roman Lebedev	ef0ecb7b03	[NFCI][InstCombine] PR46661: multiple stores eligible for merging into successor - worklist issue The testcase should pass with a single instcombine iteration.	2020-07-10 17:49:16 +03:00
Florian Hahn	264ab1e2c8	[LV] Pick vector loop body as insert point for SCEV expansion. Currently the DomTree is not kept up to date for additional blocks generated in the vector loop, for example when vectorizing with predication. SCEVExpander relies on dominance checks when looking for existing instructions to re-use and in some cases that can lead to the expander picking instructions that do not actually dominate their insert point (e.g. as in PR46525). Unfortunately keeping the DT up-to-date is a bit tricky, because the CFG is only patched up after generating code for a block. For now, we can just use the vector loop header, as this ensures the inserted instructions dominate all uses in the vector loop. There should be no noticeable impact on the generated code, as other passes should sink those instructions, if profitable. Fixes PR46525. Reviewers: Ayal, gilr, mkazantsev, dmgreen Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D83288	2020-07-10 10:37:12 +01:00
Diogo Sampaio	7bf168390f	[BDCE] SExt -> ZExt when no sign bits are used and instruction has multiple uses Summary: This allows converting any SExt to a ZExt when we know none of the extended bits are used, especially in cases where there are multiple uses of the value. Reviewers: dmgreen, eli.friedman, spatel, lebedev.ri, nikic Reviewed By: lebedev.ri, nikic Subscribers: hiraditya, dmgreen, craig.topper, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60413	2020-07-10 08:34:53 +01:00
Chen Zheng	f1efb8bb4b	[SCEV][IndVarSimplify] insert point should not be block front. The block front may be a PHI node; inserting cast instructions like BitCast, PtrToInt, IntToPtr among PHIs is not right. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D80975	2020-07-09 21:56:57 -04:00
Nikita Popov	c0308fd154	[PredicateInfo] Print RenamedOp (NFC) Make it easier to debug renaming issues.	2020-07-09 23:14:24 +02:00
Craig Topper	469da663f2	[InstSimplify] Re-enable select ?, undef, X -> X transform when X is provably not poison Follow up from the transform being removed in D83360. If X is provably not poison, then the transform is safe. Still plan to remove or adjust the code from ConstantFolding after this. Differential Revision: https://reviews.llvm.org/D83440	2020-07-09 12:21:03 -07:00
Craig Topper	122b0640fc	[InstSimplify] Don't fold vectors of partial undef in SimplifySelectInst if the non-undef element value might produce poison We can't fold to the non-undef value unless we know it isn't poison. So check each element with isGuaranteedNotToBeUndefOrPoison. This currently rules out all constant expressions. Differential Revision: https://reviews.llvm.org/D83442	2020-07-09 11:01:12 -07:00
Florian Hahn	9477d39e61	[SCCP] Move tests using only ipsccp from IPConstantProp to SCCP (NFC). Some of the tests in the llvm/test/Transforms/IPConstantProp directory actually only use -ipsccp. Those tests belong with the other (IP)SCCP tests in llvm/test/Transforms/SCCP/ and this commit moves them there to avoid confusion with IPConstantProp.	2020-07-09 17:16:15 +01:00
Diogo Sampaio	a0e981c190	[NFC] Add SExt multiuses test	2020-07-09 15:31:16 +01:00
dfukalov	167767a775	SpeculativeExecution: Fix for logic change introduced in D81730. Summary: The test case started to hoist bitcasts to the upper BB after D81730. Reverted the unintentional logic change. Some instructions may have zero cost but will not be hoisted because of a different limitation, so they should be counted toward the threshold. Reviewers: aprantl, arsenm, nhaehnle Reviewed By: aprantl Subscribers: wdng, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82761	2020-07-09 15:45:23 +03:00
Florian Hahn	a86ce06faf	[SCCP] Use conditional info with AND/OR branch conditions. Currently SCCP does not combine the information of conditions joined by AND in the true branch or OR in the false branch. For branches on AND, 2 copies will be inserted for the true branch, with one being the operand of the other as in the code below. We can combine the information using intersection. Note that for the OR case, the copies are inserted in the false branch, where using intersection is safe as well. define void @foo(i32 %a) { entry: %lt = icmp ult i32 %a, 100 %gt = icmp ugt i32 %a, 20 %and = and i1 %lt, %gt ; Has predicate info ; branch predicate info { TrueEdge: 1 Comparison: %lt = icmp ult i32 %a, 100 Edge: [label %entry,label %true] } %a.0 = call i32 @llvm.ssa.copy.140247425954880(i32 %a) ; Has predicate info ; branch predicate info { TrueEdge: 1 Comparison: %gt = icmp ugt i32 %a, 20 Edge: [label %entry,label %false] } %a.1 = call i32 @llvm.ssa.copy.140247425954880(i32 %a.0) br i1 %and, label %true, label %false true: ; preds = %entry call void @use(i32 %a.1) %true.1 = icmp ne i32 %a.1, 20 call void @use.i1(i1 %true.1) ret void false: ; preds = %entry call void @use(i32 %a.1) ret void } Reviewers: efriedma, davide, mssimpso, nikic Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D77808	2020-07-09 12:59:24 +01:00
Paul Walker	6b403319f8	[SVE] Scalarize fixed length masked loads and stores. When adding support for scalable vector masked loads and stores we accidentally opened up likewise for fixed length vectors. This patch restricts support to scalable vectors only, thus ensuring fixed length vectors are treated the same regardless of SVE support. Differential Revision: https://reviews.llvm.org/D83341	2020-07-09 10:47:04 +00:00
Jun Ma	f0bfad2ed9	[Coroutines] Refactor sinkLifetimeStartMarkers Differential Revision: https://reviews.llvm.org/D83379	2020-07-09 18:23:28 +08:00
Dmitry Polukhin	9e7fddbd36	[yaml][clang-tidy] Fix multiline YAML serialization Summary: New line duplication logic introduced in https://reviews.llvm.org/D63482 has two issues: (1) there is no logic that removes duplicate newlines when clang-apply-replacements reads YAML and (2) in general such logic should be applied to all strings and should happen at the string serialization level instead of in the YAML parser. This diff changes multiline string quotation from single quote `'` to double `"`. It solves problems with internal newlines because now they are escaped. Double quotation also solves the problem with leading whitespace after a newline. With single quotation, YAML parsers should remove leading whitespace according to the specification. With double quotation, this leading whitespace is internal space and is preserved. There is no way to instruct YAML parsers to preserve leading whitespace after a newline, so double quotation is the only viable option that solves all problems at once. Test Plan: check-all Reviewers: gribozavr, mgehre, yvvan Subscribers: xazax.hun, hiraditya, cfe-commits, llvm-commits Tags: #clang-tools-extra, #clang, #llvm Differential Revision: https://reviews.llvm.org/D80301	2020-07-09 02:41:58 -07:00
Craig Topper	ac0af12ed2	[InstSimplify] Add test cases for opportunities to fold select ?, X, undef -> X when we can prove X isn't poison Part of addressing post-commit feedback from D83360	2020-07-08 15:24:55 -07:00
Craig Topper	9b1e95329a	[InstSimplify] Remove select ?, undef, X -> X and select ?, X, undef -> X transforms As noted here https://lists.llvm.org/pipermail/llvm-dev/2016-October/106182.html and by alive2, this transform isn't valid. If X is poison this potentially propagates poison when it shouldn't. This same transform still exists in DAGCombiner. Differential Revision: https://reviews.llvm.org/D83360	2020-07-08 12:53:05 -07:00
Nikita Popov	a48cf72238	[InstSimplify] Handle not inserted instruction gracefully (PR46638) When simplifying comparisons using a dominating assume, bail out if the context instruction is not inserted.	2020-07-08 21:43:32 +02:00
Wei Mi	e32469a140	[SampleFDO] Enable sample-profile-top-down-load and sample-profile-merge-inlinee by default. sample-profile-top-down-load is an internal option which can enable top-down order of inlining and profile annotation in the sample profile load pass. It was found to be beneficial for better profile annotation. Recently we found it could also solve some build time issues. Suppose function A has many callsites in function B. In the last release binary where the sample profile was collected, the outline copy of A is large because there are many other functions inlined into A. However, although all the callsites calling A in B are inlined, every inlined body is small (A was inlined into B before other functions were inlined into A), so there was no build time issue in the last release. In an optimized build using the sample profile collected from the last release, without top-down inlining, we saw a case where A got very large because of inlining, and then multiple callsites of A got inlined into B, and that led to a huge B which caused a significant build time issue besides the profile annotation issue. To solve that problem, the patch enables the flag sample-profile-top-down-load by default. sample-profile-top-down-load can have better performance when it is enabled together with sample-profile-merge-inlinee, so in this patch we also enable sample-profile-merge-inlinee by default. Differential Revision: https://reviews.llvm.org/D82919	2020-07-08 09:23:18 -07:00
Arthur Eubanks	481709e831	[NewPM][opt] Share -disable-loop-unrolling between pass managers There's no reason to introduce a new option for the NPM. The various PGO options are shared in this manner. Reviewed By: echristo Differential Revision: https://reviews.llvm.org/D83368	2020-07-08 08:50:56 -07:00
Stanislav Mekhanoshin	64030099c3	SLP: honor requested max vector size merging PHIs At the moment this place does not check the maximum size set by TTI and just creates the largest possible vectors. Differential Revision: https://reviews.llvm.org/D82227	2020-07-08 08:06:15 -07:00
Florian Hahn	80970ac875	[DSE,MSSA] Eliminate stores by terminators (free,lifetime.end). This patch adds support for eliminating stores by free & lifetime.end calls. We can remove stores that are not read before calling a memory terminator and we can eliminate all stores after a memory terminator until we see a new lifetime.start. The second case seems to not really trigger much in practice though. Reviewers: dmgreen, rnk, efriedma, bryant, asbirlea, Tyker Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D72410	2020-07-08 08:59:46 +01:00
Florian Hahn	04b85e2bcb	Revert "[SLP] Make sure instructions are ordered when computing spill cost." This seems to break http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/24371 This reverts commit `eb46137daa`.	2020-07-07 23:15:01 +01:00
Christopher Tetreault	021d56abb9	[SVE] Make Constant::getSplatValue work for scalable vector splats Summary: Make Constant::getSplatValue recognize scalable vector splats of the form created by ConstantVector::getSplat. Add unit test to verify that C == ConstantVector::getSplat(C)->getSplatValue() for fixed width and scalable vector splats Reviewers: efriedma, spatel, fpetrogalli, c-rhodes Reviewed By: efriedma Subscribers: sdesmalen, tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82416	2020-07-07 13:45:51 -07:00
Arthur Eubanks	2279380eab	[Inliner] Don't skip inlining alwaysinline in optnone functions Previously the NPM inliner would skip all potential inlines in an optnone function, but alwaysinline callees should be inlined regardless of optnone. Fixes inline-optnone.ll under NPM. Reviewed By: kazu Differential Revision: https://reviews.llvm.org/D83021	2020-07-07 12:54:55 -07:00
Nikita Popov	8691544a27	[SCCP] Use range metadata for loads and calls When all else fails, use range metadata to constrain the result of loads and calls. It should also be possible to use !nonnull, but that would require some general support for inequalities in SCCP first. Differential Revision: https://reviews.llvm.org/D83179	2020-07-07 21:09:21 +02:00
Nikita Popov	9dfea03517	[SCCP] Handle assume predicates Take assume predicates into account when visiting ssa.copy. The handling is the same as for branch predicates, with the difference that we're always on the true edge. Differential Revision: https://reviews.llvm.org/D83257	2020-07-07 20:22:52 +02:00
Hans Wennborg	7fc279ca3d	[GlobalOpt] Don't remove inalloca from musttail-called functions Otherwise the verifier complains about the mismatching function ABIs. Differential revision: https://reviews.llvm.org/D83300	2020-07-07 19:02:46 +02:00
Roman Lebedev	16266e6396	[Scalarizer] When gathering scattered scalar, don't replace it with itself The (previously-crashing) test-case would cause us to seemingly-harmlessly replace some use with something else, but we can't replace it with itself, so we would crash.	2020-07-07 17:03:53 +03:00
Ayal Zaks	7bf299c8d8	[LV] Vectorize without versioning-for-unit-stride under -Os/-Oz If a loop is in a function marked OptSize, Loop Access Analysis should refrain from generating runtime checks for unit strides that will version the loop. If a loop is in a function marked OptSize and its vectorization is enabled, it should be vectorized w/o any versioning. Fixes PR46228. Differential Revision: https://reviews.llvm.org/D81345	2020-07-07 15:04:21 +03:00
Max Kazantsev	094e99d264	[Test] Add one more missing optimization opportunity test	2020-07-07 13:04:15 +07:00
Jordan Rupprecht	10c82eecbc	Revert "[LV] Enable the LoopVectorizer to create pointer inductions" This reverts commit `a8fe12065e`. It causes a crash when building gzip. Will post the detailed reduced test case to D81267.	2020-07-06 17:50:38 -07:00
Roman Lebedev	db05f2e34a	[Scalarizer] Centralize instruction DCE As reported in https://reviews.llvm.org/D83101#2133062 the new visitInsertElementInst()/visitExtractElementInst() functionality is causing miscompiles (previously-crashing test added). This is due to how the Scalarizer infrastructure deals with DCE: it was not updated for, nor was it ready for, such scalar value forwarding. It always assumed that the moment we "scalarized" something, it can go away, and did so with prejudice. But that is no longer safe/okay to do. Instead, let's prevent it from ever shooting itself in the foot, and let's just accumulate the instructions-to-be-deleted in a vector, and collectively clean up (those that are actually dead) them all at the end. All existing tests are not reporting any new garbage leftovers, but maybe it's a test coverage issue.	2020-07-07 01:12:51 +03:00
David Green	146dad0077	[ARM] MVE FP16 cost adjustments This adjusts the MVE fp16 cost model, similar to how we already do for integer casts. It uses the base cost of 1 per cvt for most fp extend / truncates, but adjusts it for loads and stores where we know that an extending load has been used to get the load into the correct lane, and only an MVE VCVTB is then needed. Differential Revision: https://reviews.llvm.org/D81813	2020-07-06 15:57:51 +01:00
Roman Lebedev	51f9310ff2	[Scalarizer] ExtractElement handling w/ variable insert index (PR46524) Summary: Similar to D82961. Reviewers: bjope, cameron.mcinally, arsenm, jdoerfert Reviewed By: jdoerfert Subscribers: arphaman, wdng, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82970	2020-07-06 13:19:33 +03:00
Roman Lebedev	6e50474581	[Scalarizer] InsertElement handling w/ variable insert index (PR46524) Summary: I'm interested in taking the original C++ input, for which we currently are stuck with an alloca, and producing roughly the lower IR, with neither an alloca nor vector ops: https://godbolt.org/z/cRRWaJ For that, as an intermediate step, I'd like to somehow perform scalarization. As per @arsenmn suggestion, I'm trying to see if scalarizer can help me avoid writing a bicycle. I'm not sure if it's really intentional that variable insert is not handled currently. If it really is, and is supposed to stay that way (?), I guess I could guard it. See PR46524 (https://bugs.llvm.org/show_bug.cgi?id=46524). Reviewers: bjope, cameron.mcinally, arsenm, jdoerfert Reviewed By: jdoerfert Subscribers: arphaman, uabelho, wdng, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82961	2020-07-06 13:19:32 +03:00
Roman Lebedev	28b7816b78	[Scalarizer] ExtractElement handling w/ constant extract index Summary: It appears to be better IR-wise to aggressively scalarize it, rather than relying on gathering it, and leaving it as-is. Reviewers: jdoerfert, bjope, arsenm, cameron.mcinally Reviewed By: jdoerfert Subscribers: arphaman, wdng, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D83101	2020-07-06 13:19:32 +03:00
Roman Lebedev	f62c8dbc99	[Scalarizer] InsertElement handling w/ constant insert index Summary: As it can be clearly seen from the diff, this results in nicer IR. Reviewers: jdoerfert, arsenm, bjope, cameron.mcinally Reviewed By: jdoerfert Subscribers: arphaman, wdng, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D83102	2020-07-06 13:19:32 +03:00
David Green	55227f85d0	[ARM] Use BaseT::getMemoryOpCost for getMemoryOpCost This alters getMemoryOpCost to use the Base TargetTransformInfo version that includes some additional checks for whether extending loads are legal. This will generally have the effect of making <2 x ..> and some <4 x ..> loads/stores more expensive, which in turn should help favour larger vector factors. Notably it alters the cost of a <4 x half>, which with the current codegen will be expensive if it is not extended. Differential Revision: https://reviews.llvm.org/D82456	2020-07-06 10:58:40 +01:00
Nikita Popov	516ff1d4ba	[SCCP] Add test for range metadata (NFC)	2020-07-05 21:41:04 +02:00
sstefan1	6c4a5e9257	[OpenMPOpt] ICV Tracking This is the first and most basic ICV Tracking implementation. For this first version, we only support deduplication within the same BB. Reviewers: jdoerfert, JonChesterfield, hamax97, jhuber6, uenoku, baziotis Differential Revision: https://reviews.llvm.org/D81788	2020-07-04 23:31:50 +02:00
Roman Lebedev	7ea46aee36	Revert "[AssumeBundles] Use operand bundles to encode alignment assumptions" Assume bundle can have more than one entry with the same name, but at least AlignmentFromAssumptionsPass::extractAlignmentInfo() uses getOperandBundle("align"), which internally assumes that it isn't the case, and happily crashes otherwise. Minimal reduced reproducer: run `opt -alignment-from-assumptions` on target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" %0 = type { i64, %1, i8, i64, %2, i32, %3, i8 } %1 = type opaque %2 = type { i8, i8, i16 } %3 = type { i32, i32, i32, i32 } ; Function Attrs: nounwind define i32 @f(%0* noalias nocapture readonly %arg, %0* noalias %arg1) local_unnamed_addr #0 { bb: call void @llvm.assume(i1 true) [ "align"(%0* %arg, i64 8), "align"(%0* %arg1, i64 8) ] ret i32 0 } ; Function Attrs: nounwind willreturn declare void @llvm.assume(i1) #1 attributes #0 = { nounwind "reciprocal-estimates"="none" } attributes #1 = { nounwind willreturn } This is what we'd have with -mllvm -enable-knowledge-retention This reverts commit `c95ffadb24`.	2020-07-04 23:49:23 +03:00
Roman Lebedev	11a3f040c7	[Utils] Make -assume-builder/-assume-simplify actually work on Old-PM clang w/ old-pm currently would simply crash when -mllvm -enable-knowledge-retention=true is specified. Clearly, these two passes had no Old-PM test coverage, which would have shown the problem - not requiring AssumptionCacheTracker, but then trying to always get it. Also, why try to get domtree only if it's cached, but at the same time marking it as required?	2020-07-04 21:06:36 +03:00
Sanjay Patel	3b8ae1001f	[InstCombine] fix miscompile from umul_with_overflow matching As noted in PR46561: https://bugs.llvm.org/show_bug.cgi?id=46561 ...it takes something beyond a minimal IR example to trigger this bug because it relies on matching non-canonical IR. There are no tests that show the need for matching this pattern, so I'm just deleting it to fix the miscompile.	2020-07-04 11:16:23 -04:00
Roman Lebedev	c3b8bd1eea	[InstCombine] Always try to invert non-canonical predicate of an icmp Summary: The actual transform i was going after was: https://rise4fun.com/Alive/Tp9H ``` Name: zz Pre: isPowerOf2(C0) && isPowerOf2(C1) && C1 == C0 %t0 = and i8 %x, C0 %r = icmp eq i8 %t0, C1 => %t = icmp eq i8 %t0, 0 %r = xor i1 %t, -1 Name: zz Pre: isPowerOf2(C0) %t0 = and i8 %x, C0 %r = icmp ne i8 %t0, 0 => %t = icmp eq i8 %t0, 0 %r = xor i1 %t, -1 ``` but as it can be seen from the current tests, we already canonicalize most of it, and we are only missing handling multi-use non-canonical icmp predicates. If we have both `!=0` and `==0`, even though we can CSE them, we end up being stuck with them. We should canonicalize to the `==0`. I believe this is one of the cleanup steps i'll need after `-scalarizer` if i end up proceeding with my WIP alloca promotion helper pass. Reviewers: spatel, jdoerfert, nikic Reviewed By: nikic Subscribers: zzheng, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D83139	2020-07-04 18:12:04 +03:00
Sanjay Patel	ef70cc9d1a	[InstCombine] improve debug value names; NFC The use of 'tmp' can trigger warnings from the update_test_checks.py script. That's evidence of a flaw in the script's logic, but we can always do better than naming variables 'tmp' in LLVM too. The phi test file should be updated with auto-generated regex CHECK lines, so it isn't affected by cosmetic diffs, but I don't have time to do that right now.	2020-07-04 11:06:30 -04:00
Sanjay Patel	14936e01e2	[InstCombine] add test for miscompile (PR46561); NFC	2020-07-04 11:06:30 -04:00
Nikita Popov	3b671022e4	[InstSimplify] Simplify comparison between zext(x) and sext(x) This is picking up a loose thread from D69006: We can simplify (zext x) ule (sext x) and (zext x) sge (sext x) to true, with various permutations. Oddly, SCEV knows about this identity, but nothing on the IR level does. Differential Revision: https://reviews.llvm.org/D83081	2020-07-04 11:03:00 +02:00
Nikita Popov	93ccb8eb52	[InstSimplify] Add additional zext/sext comparison tests (NFC) Add vector variants, and negative tests where the operand does not match.	2020-07-04 11:03:00 +02:00
Francis Visoiu Mistrih	aa5ec34e31	[LoopDeletion] Emit a remark when a dead loop is deleted This emits a remark when LoopDeletion deletes a dead loop, using the source location of the loop's header. There are currently two reasons for removing the loop: invariant loop or loop that never executes. Differential Revision: https://reviews.llvm.org/D83113	2020-07-03 15:20:23 -07:00
Roman Lebedev	17a15c32af	[NFCI][LoopUnroll] s/%tmp/%i/ in one test to silence update script warning	2020-07-04 00:39:36 +03:00
Roman Lebedev	341ab51149	[NFCI][InstCombine] shift.ll: s/%tmp/%i/ to silence update script warning	2020-07-04 00:39:35 +03:00
Sanjay Patel	7fd8af1de0	[InstCombine] fold mul of sext bools to 'and' Alive2: define i32 @src(i1 %x, i1 %y) { %0: %zx = sext i1 %x to i32 %zy = sext i1 %y to i32 %r = mul i32 %zx, %zy ret i32 %r } => define i32 @tgt(i1 %x, i1 %y) { %0: %a = and i1 %x, %y %r = zext i1 %a to i32 ret i32 %r } Transformation seems to be correct! https://alive2.llvm.org/ce/z/gaPQxA	2020-07-03 17:28:40 -04:00
Sanjay Patel	5504d8b04a	[InstCombine] add more tests for mul of bools; NFC	2020-07-03 17:28:22 -04:00
Florian Hahn	31971ca1c6	[InstCombine] Try to narrow expr if trunc cannot be removed. Narrowing an input expression of a truncate to a type larger than the result of the truncate won't allow removing the truncate, but it may enable further optimizations, e.g. allowing for larger vectorization factors. For now this is intentionally limited to integer types only, to avoid producing new vector ops that might not be suitable for the target. If we know that the only user is a trunc, we can also allow more cases, e.g. also shortening expressions with some additional shifts. I would appreciate feedback on the best place to do such a narrowing. This fixes PR43580. Reviewers: spatel, RKSimon, lebedev.ri, xbolva00 Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D82973	2020-07-03 20:22:51 +01:00
Sanjay Patel	40fcc42498	[InstCombine] fold mul of zext bools to 'and' The base case only works because we are relying on a poison-unsafe select transform; if that is fixed, we would regress on patterns like this. The extra use tests show that the select transform can't be applied consistently. So it may be a regression to have an extra instruction on 1 test, but that result was not created safely and does not happen reliably.	2020-07-03 13:14:18 -04:00
Sanjay Patel	5d60377864	[InstCombine] add tests for mul of bools; NFC	2020-07-03 13:14:18 -04:00
Roman Lebedev	4dd784000e	[NFC][InstCombine] Add some more tests for select based on non-canonical bit-test	2020-07-03 20:12:46 +03:00
Nikita Popov	cf1d9f9f49	[InstSimplify] Fold icmp with dominating assume If we assume(x > y), then we should be able to fold the basic implications of that, like x >= y. This already happens if either one of the operands is constant (LVI) or if the conditions are exactly the same (GVN), but not if we have an implication with non-constant operands. Support this by querying AssumptionCache. Fixes https://bugs.llvm.org/show_bug.cgi?id=40149. Differential Revision: https://reviews.llvm.org/D82717	2020-07-03 18:53:58 +02:00
Florian Hahn	eb46137daa	[SLP] Make sure instructions are ordered when computing spill cost. The entries in VectorizableTree are not necessarily ordered by their position in basic blocks. Collect them and order them by dominance so later instructions are guaranteed to be visited first. For instructions in different basic blocks, we only scan to the beginning of the block, so their order does not matter, as long as all instructions in a basic block are grouped together. Using dominance ensures a deterministic order. The modified test case contains an example where we compute a wrong spill cost (2) without this patch, even though there is no call between any instruction in the bundle. This seems to have limited practical impact, e.g. on X86 with a recent Intel Xeon CPU with -O3 -march=native -flto on MultiSource,SPEC2000,SPEC2006 there are no binary changes. Reviewers: craig.topper, RKSimon, xbolva00, ABataev, spatel Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D82444	2020-07-03 17:30:17 +01:00
Florian Hahn	039145c72b	[SLP] Precommit test for which spill cost is computed incorrectly. Test for D82444.	2020-07-03 17:15:52 +01:00
Florian Hahn	7a1161767b	[InstCombine] Precommit tests for PR43580.	2020-07-03 17:14:02 +01:00
Sanjay Patel	63774642af	[InstCombine] add one-use check to cast+select narrowing transform Prevent increasing the instruction count.	2020-07-03 11:54:09 -04:00
Sanjay Patel	0cd0ae1f29	[InstCombine] add tests to show missing one-use checks; NFC	2020-07-03 11:54:09 -04:00
Simon Pilgrim	eb0e7acbd4	[InstCombine] canEvaluateTruncated - use KnownBits to check for inrange shift amounts Currently canEvaluateTruncated can only attempt to truncate shifts if they are scalar/uniform constant amounts that are in range. This patch replaces the constant extraction code with KnownBits handling, using the KnownBits::getMaxValue to check that the amounts are inrange. This enables support for nonuniform constant cases, and also variable shift amounts that have been masked somehow. Annoyingly, this still won't work for vectors with (demanded) undefs as KnownBits returns nothing in those cases, but it's a definite improvement on what we currently have. Differential Revision: https://reviews.llvm.org/D83127	2020-07-03 16:02:10 +01:00
Sam Parker	18850981c8	[NFC][SimplifyCFG] Move X86 tests into subdir	2020-07-03 14:28:27 +01:00
Simon Pilgrim	1ab88de0ed	Add tests for trunc(shl/lshr/ashr(*ext(x),zext(and(y,c)))) patterns with variable shifts with clamped shift amounts	2020-07-03 13:39:16 +01:00
Simon Pilgrim	b18405fbc0	Add vector trunc(or(shl(zext(x),c1),zext(x))) tests	2020-07-03 13:32:00 +01:00
Simon Pilgrim	80d4f33479	Regenerate apint-cast tests and replace %tmp variable names to silence update_test_checks warnings	2020-07-03 11:42:16 +01:00
Simon Pilgrim	b3a2882dbc	Add nonuniform vector trunc(or(shl(zext(x),c1),srl(zext(x),c2))) tests	2020-07-03 11:42:15 +01:00
Simon Pilgrim	029046dc32	Regenerate mul-trunc tests, add vector variants and replace %tmp variable names to silence update_test_checks warnings	2020-07-03 11:42:15 +01:00
Simon Pilgrim	3da42f4810	[InstCombine] Add sext(ashr(shl(trunc(x),c),c)) folding support for vectors Replacing m_ConstantInt with m_Constant permits folding of vectors as well as scalars. Differential Revision: https://reviews.llvm.org/D83058	2020-07-03 10:04:37 +01:00
Simon Pilgrim	76673c65e7	Regenerate PR19420 tests	2020-07-03 10:04:37 +01:00
Sam Parker	0724153bbe	[CostModel] Fix cast crash Don't presume instruction operands while matching reductions. Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=46430 Differential Revision: https://reviews.llvm.org/D82453	2020-07-03 07:53:45 +01:00
Roman Lebedev	e98030a55f	[NFC][Scalarizer] Also scalarize loads in newly-added tests Should help better showcase improvements	2020-07-03 02:37:29 +03:00
Roman Lebedev	739c7a0a04	[NFC][Scalarizer] Add some insertelement/extractelement tests See D82961/D82970/D83101/D83102.	2020-07-03 02:04:47 +03:00
Nikita Popov	359345d609	[InstSimplify] Add test for sext/zext comparisons (NFC)	2020-07-02 22:21:59 +02:00
Arthur Eubanks	0059f6ffe8	[NewPM] Add -basic-aa to pr33196.ll The legacy pass manager implicitly adds BasicAA, but the new PM does not. This causes pr33196.ll to fail under NPM. There are almost certainly lots of other failures like this, wanted to get some input on if adding -basic-aa to tests makes sense at scale. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D82915	2020-07-02 11:27:52 -07:00
Arthur Eubanks	3d12e79094	[NewPM][LSR] Rename strength-reduce -> loop-reduce The legacy pass was called "loop-reduce". This lowers the number of check-llvm failures under NPM by 83. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D82925	2020-07-02 11:15:29 -07:00
Simon Pilgrim	50b25e0679	[InstCombine] Add some sext/trunc tests to show missing support for non-uniform vectors	2020-07-02 17:11:56 +01:00
Simon Pilgrim	769b979930	[InstCombine] Add (vXi1 trunc(lshr(x,c))) -> icmp_eq(and(x,c')) support for non-uniform vectors As noted on PR46531, we were only performing this transform on uniform vectors as we were using the m_APInt pattern matcher to extract the shift amount. Differential Revision: https://reviews.llvm.org/D83035	2020-07-02 16:56:33 +01:00
Simon Pilgrim	103d62e131	[InstCombine] Add some (vXi1 trunc(lshr(x,c))) -> icmp_eq(and(x,c')) tests for vectors with undef elements Suggested on D83035	2020-07-02 16:04:30 +01:00
Simon Pilgrim	23eeae5526	Regenerate sext/trunc tests and replace %tmp variable names to silence update_test_checks warnings	2020-07-02 14:37:21 +01:00
Simon Pilgrim	421c02e5c6	[InstCombine] Add some (vXi1 trunc(lshr(x,c))) -> icmp_eq(and(x,c')) tests for non-uniform vectors As noticed on PR46531	2020-07-02 11:56:51 +01:00
Simon Pilgrim	11c4bb0c7c	Regenerate apint-shift tests and replace %tmp variable names to silence update_test_checks warnings	2020-07-02 11:56:51 +01:00
Anna Welker	a8fe12065e	[LV] Enable the LoopVectorizer to create pointer inductions This patch enables the LoopVectorizer to build a phi of pointer type and provide the vector loads and stores with vector type getelementptrs built from the pointer induction variable, which produces far fewer instructions than the previous approach of creating scalar getelementptrs and gluing them together into a vector. Differential Revision: https://reviews.llvm.org/D81267	2020-07-02 11:39:28 +01:00
Nuno Lopes	7f903873b8	DSE: fix builtin function recognition to take decl into account	2020-07-02 10:28:47 +01:00
Nikita Popov	a59dc55c2a	[InstSimplify] Move assume icmp test (NFC) Move this test from InstCombine into InstSimplify.	2020-07-01 23:35:52 +02:00
Sergey Dmitriev	cb8faaacb5	[CallGraph] Add support for callback call sites Summary: This patch changes call graph analysis to recognize callback call sites and add an artificial 'reference' call record from the broker function caller to the callback function in the call graph. A presence of such reference enforces bottom-up traversal order for callback functions in CG SCC pass manager because callback function logically becomes a callee of the broker function caller. Reviewers: jdoerfert, hfinkel, sstefan1, baziotis Reviewed By: jdoerfert Subscribers: hiraditya, kuter, sstefan1, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82572	2020-07-01 13:44:11 -07:00
Nikita Popov	91836fd7f3	[LVI][CVP] Handle (x | y) < C style conditions InstCombine may convert conditions like (x < C) && (y < C) into (x | y) < C (for some C). This patch teaches LVI to recognize that in this case, it can infer either x < C or y < C along the edge. This fixes the issue reported at https://github.com/rust-lang/rust/issues/73827. Differential Revision: https://reviews.llvm.org/D82715	2020-07-01 20:43:24 +02:00
Nikita Popov	0f6afd946d	[CVP] Use different number in test (NFC) To make it clear that this is not intended to be specific to mask / bit tests.	2020-07-01 18:43:59 +02:00
Hiroshi Yamauchi	6bd1db08e7	[InstCombine] Don't let an alignment assume prevent new/delete removals. Remove allocations with alignment assume. Differential Revision: https://reviews.llvm.org/D81854	2020-07-01 09:22:32 -07:00
Florian Hahn	1ccc49924a	[AArch64] Add getCFInstrCost, treat branches as free for throughput. D79164/2596da31740f changed getCFInstrCost to return 1 per default. AArch64 did not have its own implementation, hence the throughput cost of CFI instructions is overestimated. On most cores, most branches should be predicted and essentially free throughput wise. This fixes a 9% performance regression on a SPEC2006 benchmark on AArch64 with -O3 LTO & PGO. This patch effectively restores pre `2596da3174` behavior for AArch64 and undoes the AArch64 test changes of the patch. Reviewers: samparker, dmgreen, anemet Reviewed By: samparker Differential Revision: https://reviews.llvm.org/D82755	2020-06-30 20:34:04 +01:00
David Green	9e49d1d9b8	[InstCombine] fma x, y, 0 -> fmul x, y If the addend of the fma is zero, common sense would suggest that we can convert fma x, y, 0.0 to fmul x, y. This comes up with some user code that was expecting the first fma in an unrolled loop to simplify to a fmul. Floating point often does not follow naive common sense though. Alive suggests that this should be guarded by nsz (as fadd -0.0, 0.0 = 0.0). fma x, y, -0.0 is always valid. Differential Revision: https://reviews.llvm.org/D82778	2020-06-30 19:56:37 +01:00
Sanjay Patel	09b8dbf70c	[PhaseOrdering][NewPM] update test that silently showed bug with SpeculativeExecutionPass; NFC See D82735 / rG1a6cebb4d12c744699e23624f8afda5cbe216fe6	2020-06-30 14:22:20 -04:00
David Green	787b1a4746	[InstCombine] New FMA tests and regenerate tests. NFC	2020-06-30 18:05:13 +01:00
Max Kazantsev	f01d9e6fc3	[SimplifyCFG] Fix inconsistency in block size assessment for threading Sometimes SimplifyCFG may decide to perform jump threading. In order to do it, it follows the following algorithm: 1. Checks if the block is small enough for threading; 2. If yes, inserts a PR Phi relying that the next iteration will remove it by performing jump threading; 3. The next iteration checks the block again and performs the threading. This logic has a corner case: inserting the PR Phi increases block's size by 1. If the block size at first check was max possible, one more Phi will exceed this size, and we will neither perform threading nor remove the created Phi node. As a result, we will end up with worse IR than before. This patch fixes this situation by excluding Phis from block size computation. Excluding Phis from size computation for threading also makes sense by itself because in case of threading all those Phis will be removed. Differential Revision: https://reviews.llvm.org/D81835 Reviewed By: asbirlea, nikic	2020-06-30 12:40:07 +07:00
Matt Arsenault	7c308dc80a	LowerConstantIntrinsics: Fix missing test for byval behavior	2020-06-29 14:45:31 -04:00
Nikita Popov	c84a952dc7	[IndVars] Regenerate test checks (NFC)	2020-06-29 20:33:50 +02:00
Matt Arsenault	3621a520d3	Inliner: Add missing test for alignment assume with byval No tests were stressing the behavior for hasPassPointeeByValueAttr.	2020-06-29 10:39:58 -04:00
Sanjay Patel	b6315aee5b	[VectorCombine] try to form vector compare and binop to eliminate scalar ops binop i1 (cmp Pred (ext X, Index0), C0), (cmp Pred (ext X, Index1), C1) --> vcmp = cmp Pred X, VecC ext (binop vNi1 vcmp, (shuffle vcmp, Index1)), Index0 This is a larger pattern than the existing extractelement folds because we can't reasonably vectorize the sub-patterns with constants based on cost model calcs (it doesn't usually make sense to replace a single extracted scalar op with constant operand with a vector op). I salvaged as much of the existing logic as I could, but there might be better ways to share and reduce code. The motivating case from PR43745: https://bugs.llvm.org/show_bug.cgi?id=43745 ...is the special case of a 2-way reduction. We tried to get SLP to handle that particular pattern in D59710, but that caused crashing and regressions. This patch is more general, but hopefully safer. The v2f64 test with SSE2 surprised me - the cost model accounting looks like this: OldCost = 0 (free extract of f64 at index 0) + 1 (extract of f64 at index 1) + 2 (scalar fcmps) + 1 (and of bools) = 4 NewCost = 2 (vector fcmp) + 1 (shuffle) + 1 (vector 'and') + 1 (extract of bool) = 5 Differential Revision: https://reviews.llvm.org/D82474	2020-06-29 10:38:52 -04:00
Nikita Popov	a28d38a6bc	[SimplifyCFG] Make test more robust (NFC) Avoid changing this test if blocks get merged.	2020-06-28 20:51:03 +02:00
Nikita Popov	d5a482acf9	[SimplifyCFG] Regenerate test checks (NFC)	2020-06-28 20:51:02 +02:00
Xun Li	c8755b6378	[Coroutines] Optimize the lifespan of temporary co_await object Summary: If we ever assign co_await to a temporary variable, such as foo(co_await expr), we generate AST that looks like this: MaterializedTemporaryExpr(CoawaitExpr(...)). MaterializedTemporaryExpr would emit an intrinsic that marks the lifetime start of the temporary storage. However such temporary storage will not be used until co_await is ready to write the result. Marking the lifetime start way too early causes extra storage to be put in the coroutine frame instead of the stack. As you can see from https://godbolt.org/z/zVx_eB, the frame generated for get_big_object2 is 12K, which contains a big_object object unnecessarily. After this patch, the frame size for get_big_object2 is now only 8K. There is still room for improvement, in particular, GCC has a 4K frame for this function. But that's a separate problem and not addressed in this patch. The basic idea of this patch is during CoroSplit, look for every local variable in the coroutine created through AllocaInst, identify all the lifetime start/end markers and the use of the variables, and sink the lifetime.start marker to the places as close to the first-ever use as possible. Reviewers: lewissbaker, modocache, junparser Reviewed By: junparser Subscribers: hiraditya, llvm-commits, rsmith, ChuanqiXu, cfe-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D82314	2020-06-28 10:18:15 -07:00
Sanjay Patel	931411136a	[VectorCombine] add test for scalable vectors; NFC	2020-06-28 12:44:44 -04:00
Sanjay Patel	2f3549f813	Revert "[VectorCombine] add test for scalable vectors; NFC" This reverts commit `700ec6b848`. An extra test diff snuck here.	2020-06-28 12:43:11 -04:00
Sanjay Patel	700ec6b848	[VectorCombine] add test for scalable vectors; NFC	2020-06-28 12:42:00 -04:00
Nikita Popov	8758e14c6f	[InstCombine] Add tests for assume implication (NFC)	2020-06-28 16:18:44 +02:00
Nikita Popov	70c5d95248	[CVP] Add tests for icmp or and/or edge conds (NFC)	2020-06-28 14:54:55 +02:00
dfukalov	c7bcd431d9	SpeculativeExecution: fix incorrect debug info move Summary: Debug info related instructions got zero cost so hoisted unconditionally Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=46267 Reviewers: arsenm, nhaehnle, chandlerc, aprantl Reviewed By: aprantl Subscribers: ormris, uabelho, wdng, aprantl, hiraditya, llvm-commits Tags: #llvm, #debug-info Differential Revision: https://reviews.llvm.org/D81730	2020-06-28 14:35:00 +03:00
Roman Lebedev	f0634100cd	[Analysis] isDereferenceableAndAlignedPointer(): don't crash on `bitcast <1 x ???> to ???`	2020-06-27 18:30:59 +03:00
Fangrui Song	4cd19a6e15	[BasicAA] Rename -disable-basicaa to -disable-basic-aa to be consistent with the canonical name "basic-aa"	2020-06-26 20:55:44 -07:00
Fangrui Song	f31811f2dc	[BasicAA] Rename deprecated -basicaa to -basic-aa Follow-up to D82607 Revert an accidental change (empty.ll) of D82683	2020-06-26 20:41:37 -07:00
Arthur Eubanks	059994f219	[NewPM][BasicAA] basicaa -> basic-aa in Transforms/{New,}GVN Summary: Following https://reviews.llvm.org/D82607. Reviewers: ychen Subscribers: asbirlea, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82688	2020-06-26 20:28:18 -07:00
Arthur Eubanks	339eed5d0b	[NewPM][BasicAA] basicaa -> basic-aa in Transforms/DeadStoreElimination Summary: Following https://reviews.llvm.org/D82607. Reviewers: ychen Subscribers: george.burgess.iv, asbirlea, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82689	2020-06-26 20:13:37 -07:00
Vedant Kumar	9649c2095f	[InstCombine] Drop debug loc in TryToSinkInstruction (reland) Summary: The advice in HowToUpdateDebugInfo.rst is to "... preserve the debug location of an instruction if the instruction either remains in its basic block, or if its basic block is folded into a predecessor that branches unconditionally". TryToSinkInstruction doesn't seem to satisfy the criteria as it's sinking an instruction to some successor block. Preserving the debug loc can make single-stepping appear to go backwards, or make a breakpoint hit on that location happen "too late" (since single-stepping from that breakpoint can cause the function to return unexpectedly). So, drop the debug location. This was reverted in `ee3620643d` because it removed source locations from inlinable calls, breaking a verifier rule. I've added an exception for calls because the alternative (setting a line 0 location) is not better. I tested the updated patch by completing a stage2 RelWithDebInfo build. Reviewers: aprantl, davide Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82487	2020-06-26 17:18:15 -07:00
Vedant Kumar	ee3620643d	Revert "[InstCombine] Drop debug loc in TryToSinkInstruction" This reverts commit `903cf140d0`. This might be causing verifier failures on the bots, such as: "inlinable function call in a function with debug info must have a !dbg location" -- http://lab.llvm.org:8011/builders/sanitizer-ppc64be-linux/builds/16976/steps/bootstrap%20clang/logs/stdio	2020-06-26 14:59:40 -07:00
Arthur Eubanks	691c086d15	[NewPM][BasicAA] basicaa -> basic-aa in Transforms/SLPVectorizer Following https://reviews.llvm.org/D82607. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D82681	2020-06-26 14:58:41 -07:00
Vedant Kumar	903cf140d0	[InstCombine] Drop debug loc in TryToSinkInstruction Summary: The advice in HowToUpdateDebugInfo.rst is to "... preserve the debug location of an instruction if the instruction either remains in its basic block, or if its basic block is folded into a predecessor that branches unconditionally". TryToSinkInstruction doesn't seem to satisfy the criteria as it's sinking an instruction to some successor block. Preserving the debug loc can make single-stepping appear to go backwards, or make a breakpoint hit on that location happen "too late" (since single-stepping from that breakpoint can cause the function to return unexpectedly). So, drop the debug location. Reviewers: aprantl, davide Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82487	2020-06-26 13:23:24 -07:00
Roman Lebedev	64258773ad	[CostModel] Avoid traditional ConstantExpr crashy pitfalls I'm not sure if this is a regression from D81448 + D81643, which moved at least the code cast from elsewhere, or somehow no one triggered that before. But now we can reach it with a non-instruction. It is not straight-forward to write cost-model tests for constantexprs, `-cost-model -analyze -cost-kind=` does not appear to look at them, or maybe I'm doing it wrong. I've encountered that via a SimplifyCFG crash, so reduced (currently-crashing) test is added. There are likely other instances. For now, simply restore previous status quo of not crashing and returning TTI::TCC_Basic.	2020-06-26 22:48:10 +03:00
Rong Xu	b4bceb94ee	[PGO] Add a functionality to always instrument the func entry BB Add an option to always instrument function entry BB (default off) Add an option to do atomic updates on the first counter in each instrumented function. Differential Revision: https://reviews.llvm.org/D82123	2020-06-26 10:43:23 -07:00
Arthur Eubanks	a95796a380	[NewPM][LoopUnroll] Rename unroll* to loop-unroll* The legacy pass is called "loop-unroll", but in the new PM it's called "unroll". Also applied to unroll-and-jam and unroll-full. Fixes various check-llvm tests when NPM is turned on. Reviewed By: Whitney, dmgreen Differential Revision: https://reviews.llvm.org/D82590	2020-06-26 09:28:32 -07:00
David Sherwood	7a834a0a4e	[SVE] Fix scalable vector bug in DataLayout::getIntPtrType Fixed an issue in DataLayout::getIntPtrType where we were assuming the input type was always a fixed vector type, which isn't true. Added a test that exposed the problem to: Transforms/InstCombine/vector_gep1.ll Differential Revision: https://reviews.llvm.org/D82294	2020-06-26 07:58:45 +01:00
Michael Liao	dccfaacf93	[InferAddressSpaces] Handle the pair of `ptrtoint`/`inttoptr`. Summary: - `ptrtoint` and `inttoptr` are defined as no-op casts if the integer value has the same size as the pointer value. The pair of `ptrtoint`/`inttoptr` is in fact a no-op cast sequence between different address spaces. Teach `infer-address-spaces` to handle them like a `bitcast`. Reviewers: arsenm, chandlerc Subscribers: jvesely, wdng, nhaehnle, hiraditya, kerbowa, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D81938	2020-06-25 20:46:56 -04:00
Hiroshi Yamauchi	9878996c70	Revert "[PGO] Extend the value profile buckets for mem op sizes." This reverts commit `63a89693f0`. Due to a build failure like http://lab.llvm.org:8011/builders/sanitizer-windows/builds/65386/steps/annotate/logs/stdio	2020-06-25 11:13:49 -07:00
Kirill Naumov	d48c7859fb	[InlineCost] GetElementPtr with constant operands If the GEP instruction contains only constants as its arguments, then it should be recognized as a constant. For now, a flag was also added to turn off this simplification if it causes any regressions ("disable-gep-const-evaluation") which is off by default. Once I gather needed data of the effectiveness of this simplification, the flag will be deleted. Reviewers: apilipenko, davidxl, mtrofin Reviewed By: mtrofin Differential Revision: https://reviews.llvm.org/D81026	2020-06-25 18:09:51 +00:00
Hiroshi Yamauchi	63a89693f0	[PGO] Extend the value profile buckets for mem op sizes. Extend the memop value profile buckets to be more flexible (could accommodate a mix of individual values and ranges) and to cover more value ranges (from 11 to 22 buckets). Disabled behind a flag (to be enabled separately) and the existing code to be removed later. Differential Revision: https://reviews.llvm.org/D81682	2020-06-25 10:22:56 -07:00
Sanjay Patel	c9e8c9e3ea	[InstCombine] fold fmul/fdiv with fabs operands fabs(X) * fabs(Y) --> fabs(X * Y) fabs(X) / fabs(Y) --> fabs(X / Y) If both operands of fmul/fdiv are positive, then the result must be positive. There's a NAN corner-case that prevents removing the more specific fold just above this one: fabs(X) * fabs(X) -> X * X That fold works even with NAN because the sign-bit result of the multiply is not specified if X is NAN. We can't remove that and use the more general fold that is proposed here because once we convert to this: fabs (X * X) ...it is not legal to simplify the 'fabs' out of that expression when X is NAN. That's because fabs() guarantees that the sign-bit is always cleared - even for NAN values. So this patch has the potential to lose information, but it seems unlikely if we do the more specific fold ahead of this one. Differential Revision: https://reviews.llvm.org/D82277	2020-06-25 11:35:38 -04:00
Sanjay Patel	c336f21af5	[PhaseOrdering] delete test for vectorization; NFC As requested in D81416, I'm deleting the file that I added with: rGdf79443	2020-06-25 09:34:11 -04:00
Florian Hahn	4837daf883	[DSE,MSSA] Check if Def is removable only when we try to remove it. Non-removable MemoryDefs can still eliminate other defs. Update the isRemovable checks to apply only to candidates for removal.	2020-06-25 14:01:10 +01:00
Tyker	c95ffadb24	[AssumeBundles] Use operand bundles to encode alignment assumptions Summary: NOTE: There is a mailing list discussion on this: http://lists.llvm.org/pipermail/llvm-dev/2019-December/137632.html Complementary to the assumption outliner prototype in D71692, this patch shows how we could simplify the code emitted for an alignment assumption. The generated code is smaller, less fragile, and it makes it easier to recognize the additional use as an "assumption use". As mentioned in D71692 and on the mailing list, we could adopt this scheme, and similar schemes for other patterns, without adopting the assumption outlining. Reviewers: hfinkel, xbolva00, lebedev.ri, nikic, rjmccall, spatel, jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: yamauchi, kuter, fhahn, merge_guards_bot, hiraditya, bollu, rkruppe, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71739	2020-06-25 12:59:44 +02:00
Tyker	8938a6c9ed	[NFC] update test to make diff of the following commit clear	2020-06-25 12:59:44 +02:00
David Sherwood	ee26a31e7b	[SVE] Make ConstantFoldGetElementPtr work for scalable vectors of indices This patch fixes a compiler crash that was hit when trying to simplify the following code: getelementptr [2 x i64], [2 x i64]* null, i64 0, <vscale x 2 x i64> zeroinitializer For the case where we have a null pointer value like above, we just need to ensure we don't assume the indices are always fixed width. Differential Revision: https://reviews.llvm.org/D82183	2020-06-25 07:28:19 +01:00
Max Kazantsev	4c6548222b	[Test] Add more tests for selects & phis	2020-06-25 10:54:07 +07:00
Max Kazantsev	1eeb714787	[InstCombine] Combine select & Phi by same condition This patch transforms ``` p = phi [x, y] s = select cond, z, p ``` with ``` s = phi[x, z] ``` if we can prove that the Phi node takes values basing on select's condition. Differential Revision: https://reviews.llvm.org/D82072 Reviewed By: nikic	2020-06-25 10:44:10 +07:00
Michele Scandale	413a187856	[Inliner] Handle 'no-signed-zeros-fp-math' function attribute. All other floating point math optimization related attributes are merged in a conservative way during function inlining. This commit adds the merge rule for the 'no-signed-zeros-fp-math' attribute. Differential Revision: https://reviews.llvm.org/D81714	2020-06-24 17:53:59 -07:00
Amara Emerson	090c108d04	Don't inline dynamic allocas that simplify to huge static allocas. Some sequences of optimizations can generate call sites which may never be executed during runtime, and through constant propagation result in dynamic allocas being converted to static allocas with very large allocation amounts. The inliner tries to move these to the caller's entry block, resulting in the stack limits being reached/bypassed. Avoid inlining functions if this would result. The threshold of 64k currently doesn't get triggered on the test suite with an -Os LTO build on arm64, care should be taken in changing this in future to avoid needlessly pessimising inlining behaviour. Differential Revision: https://reviews.llvm.org/D81765	2020-06-24 17:39:03 -07:00
Kirill Naumov	7f094f7f9d	[InlineCost] PrinterPass prints constants to which instructions are simplified This patch enables printing of constants to see which instructions were constant-folded. Needed for tests and better visual analysis of inliner's work. Reviewers: apilipenko, mtrofin, davidxl, fedor.sergeev Reviewed By: mtrofin Differential Revision: https://reviews.llvm.org/D81024	2020-06-24 22:52:31 +00:00
Roman Lebedev	0c22147027	[NFCI][InstSimplify] Add CHECK-LABEL to new icmp.ll test	2020-06-25 01:10:35 +03:00
Roman Lebedev	8911a35180	[SROA] convertValue(): we can have <N x iK*> to <M x iQ> cast Provided test case crashes otherwise. Much like the opposite case.	2020-06-25 00:58:54 +03:00
Roman Lebedev	07a23c06dd	[SROA] convertValue(): we can have <N x iK> to <M x iQ*> cast Provided test case crashes otherwise. If NewTy is already DL.getIntPtrType(NewTy), CreateBitCast() won't actually create any bitcast, so we are better off just doing the general thing.	2020-06-25 00:58:53 +03:00
Roman Lebedev	2b8d706b19	[IR] GetUnderlyingObject(), stripPointerCastsAndOffsets(): don't crash on `bitcast <1 x i8> to i8` I'm not sure how to write standalone tests for each of two changes here. If either one of these two fixes is missing, the test will crash.	2020-06-25 00:58:53 +03:00
Roman Lebedev	381054a989	[InstCombine] visitBitCast(): do not crash on weird `bitcast <1 x i8> to i8` Even if we know that RHS of a bitcast is a pointer, we can't assume LHS is, because it might be a single-element vector of pointer.	2020-06-25 00:58:53 +03:00
Yuanfang Chen	ebc88811b5	Remove Passes dependency on CodeGen The dependency was introduced in `5134020ea6`. The only functional change from this removal would be the new PM interface for the two codegen passes. This is not necessary since we don't have codegen pipeline using new PM yet. This removal is to break the potential circular dependency between Passes and CodeGen once the codegen begins to gain new PM support.	2020-06-24 14:52:46 -07:00
Kirill Naumov	6a5d7d498c	[InlineCost] InlineCostAnnotationWriterPass introduced This class allows to see the inliner's decisions for better optimization verifications and tests. To use, use flag "-passes="print<inline-cost>"". This is the second attempt to integrate the patch. The problem from the first try has been discussed and fixed in D82205. Reviewers: apilipenko, mtrofin, davidxl, fedor.sergeev Reviewed By: mtrofin Differential revision: https://reviews.llvm.org/D81743	2020-06-24 21:27:07 +00:00
Florian Hahn	35bb9bfbb0	[SLP] Limit GEP lists based on width of index computation. D68667 introduced a tighter limit to the number of GEPs to simplify together. The limit was based on the vector element size of the pointer, but the pointers themselves are not actually put in vectors. IIUC we try to vectorize the index computations here, so we should base the limit on the vector element size of the computation of the index. This restores the test regression on AArch64 and also restores the vectorization for an important pattern in SPEC2006/464.h264ref on AArch64 (@test_i16_extend). We get a large benefit from doing a single load up front and then processing the index computations in vectors. Note that we could probably even further improve the AArch64 codegen, if we would do zexts to i32 instead of i64 for the sub operands and then do a single vector sext on the result of the subtractions. AArch64 provides dedicated vector instructions to do so. Sketch of proof in Alive: https://alive2.llvm.org/ce/z/A4xYAB Reviewers: craig.topper, RKSimon, xbolva00, ABataev, spatel Reviewed By: ABataev, spatel Differential Revision: https://reviews.llvm.org/D82418	2020-06-24 19:56:53 +01:00

15430 Commits