llvm-project

Commit Graph

Author	SHA1	Message	Date
Max Kazantsev	dc80366d52	[ScalarEvolution] Apply Depth limit to getMulExpr This is a fix for PR33292 that shows a case of extremely long compilation of a single .c file with clang, with most time spent within SCEV. We have a mechanism of limiting recursion depth for getAddExpr to avoid long analysis in SCEV. However, there are calls from getAddExpr to getMulExpr and back that do not propagate the info about depth. As result of this, a chain getAddExpr -> ... .> getAddExpr -> getMulExpr -> getAddExpr -> ... -> getAddExpr can be extremely long, with every segment of getAddExpr's being up to max depth long. This leads either to long compilation or crash by stack overflow. We face this situation while analyzing big SCEVs in the test of PR33292. This patch applies the same limit on max expression depth for getAddExpr and getMulExpr. Differential Revision: https://reviews.llvm.org/D33984 llvm-svn: 305463	2017-06-15 11:48:21 +00:00
Craig Topper	f93b7b1c1f	[ValueTracking] Correct early out in computeKnownBitsFromOperator to work with non power of 2 bit widths There's an early out that's trying to detect when we don't know any bits that make up the legal range of a shift. The code subtracts one from BitWidth which creates a mask in the lower bits for power of 2 bit widths. This is then ANDed with the known bits to see if any of those bits are known. If the bit width isn't a power of 2 this creates a non-sensical mask. This patch corrects this by rounding up to a power of 2 before doing the subtract and mask. Differential Revision: https://reviews.llvm.org/D34165 llvm-svn: 305400	2017-06-14 17:04:59 +00:00
Simon Pilgrim	7ce9926ce4	Strip UTF8 BOM that got added for some reason in rL305163 llvm-svn: 305282	2017-06-13 09:58:27 +00:00
Sanjay Patel	2ad88f81f0	fix typos/formatting; NFC llvm-svn: 305243	2017-06-12 22:34:37 +00:00
Yaron Keren	7d46392124	Address http://bugs.llvm.org/pr32207 by making BannerPrinted local to runOnSCC and skipping banner for function declarations. Reviewed By: Mehdi AMINI Differential Revision: https://reviews.llvm.org/D34086 llvm-svn: 305179	2017-06-12 02:18:50 +00:00
Simon Pilgrim	516938452f	Fix unused variable warning on non-debug EXPENSIVE_CHECKS builds llvm-svn: 305163	2017-06-11 12:49:29 +00:00
Davide Italiano	83122058cf	[MemorySSA] preservesAll() implies preserves<MemorySSA>(). NFCI. llvm-svn: 305160	2017-06-11 01:05:45 +00:00
Andrew Kaylor	647025f9e1	[InstSimplify] Don't constant fold or DCE calls that are marked nobuiltin Differential Revision: https://reviews.llvm.org/D33737 llvm-svn: 305132	2017-06-09 23:18:11 +00:00
Craig Topper	7ad13f259f	[LVI] Fix spelling error in comment. NFC llvm-svn: 305115	2017-06-09 21:21:17 +00:00
Craig Topper	6dd9dcf26e	[LVI] Const correct and rename the LVILatticeVal parameter to getPredicateResult. NFC Previously it was non-const reference named Result which would tend to make someone think that it was an outparam when really its an input. llvm-svn: 305114	2017-06-09 21:18:16 +00:00
Craig Topper	31ce4ec2fd	[LazyValueInfo] Don't run the more complex predicate handling code for EQ and NE in getPredicateResult Summary: Unless I'm mistaken, the special handling for EQ/NE should cover everything and there is no reason to fallthrough to the more complex code. For that matter I'm not sure there's any reason to special case EQ/NE other than avoiding creating temporary ConstantRanges. This patch moves the complex code into an else so we only do it when we are handling a predicate other than EQ/NE. Reviewers: anna, reames, resistor, Farhana Reviewed By: anna Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34000 llvm-svn: 305086	2017-06-09 16:16:20 +00:00
Sanjay Patel	fef83e8fb9	[ValueTracking] fix typo; NFC llvm-svn: 305080	2017-06-09 14:21:18 +00:00
Peter Collingbourne	e357fbd243	Write summaries for merged modules when splitting modules for ThinLTO. This is to prepare to allow for dead stripping of globals in the merged modules. Differential Revision: https://reviews.llvm.org/D33921 llvm-svn: 305027	2017-06-08 23:01:49 +00:00
Craig Topper	db52809e77	[LazyValueInfo] Make LVILatticeVal intersect method take arguments by reference so we don't copy ConstantRanges unless we need to. llvm-svn: 304990	2017-06-08 17:08:58 +00:00
John Brawn	da4a68a1d2	[BPI] Don't assume that strcmp returning >0 is more likely than <0 The zero heuristic assumes that integers are more likely positive than negative, but this also has the effect of assuming that strcmp return values are more likely positive than negative. Given that for nonzero strcmp return values it's the ordering of arguments that determines the sign of the result there's no reason to assume that's true. Fix this by inspecting the LHS of the compare and using TargetLibraryInfo to decide if it's strcmp-like, and if so only assume that nonzero is more likely than zero i.e. strings are more often different than the same. This causes a slight code generation change in the spec2006 benchmark 403.gcc, but with no noticeable performance impact. The intent of this patch is to allow better optimisation of dhrystone on Cortex-M cpus, but currently it won't as there are also some changes that need to be made to if-conversion. Differential Revision: https://reviews.llvm.org/D33934 llvm-svn: 304970	2017-06-08 09:44:40 +00:00
David Blaikie	7a9b788830	GlobalsModRef: Ensure optnone+readonly/readnone attributes are respected llvm-svn: 304945	2017-06-07 21:37:39 +00:00
Alina Sbirlea	33e5872367	[mssa] Fix case when there is no definition in a block prior to an inserted use. Summary: Check that the first access before one being tested is valid. Before this patch, if there was no definition prior to the Use being tested, the first time Iter was deferenced, it hit the sentinel. Reviewers: dberlin, gbiv Subscribers: sanjoy, Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D33950 llvm-svn: 304926	2017-06-07 16:46:53 +00:00
Craig Topper	73ba1c84be	[InstCombine][InstSimplify] Use APInt::isNullValue/isOneValue to reduce compiled code for comparing APInts with 0 and 1. NFC These methods are specifically optimized to only counting leading zeros without an additional uint64_t compare. llvm-svn: 304876	2017-06-07 07:40:37 +00:00
NAKAMURA Takumi	92c99cd6dc	Update libdeps to add BinaryFormat, introduced in r304864. llvm-svn: 304869	2017-06-07 04:48:49 +00:00
NAKAMURA Takumi	ef9d9481b5	Reorder and reformat. llvm-svn: 304868	2017-06-07 04:48:45 +00:00
Craig Topper	7945248267	[LazyValueInfo] Remove redundant calls to ConstantRange::contains. The same exact call was made in the if above and we already know it returned true. NFC llvm-svn: 304857	2017-06-07 00:58:09 +00:00
Davide Italiano	c88f2c712f	[CFLAA] Remove unused include. NFCI. llvm-svn: 304842	2017-06-06 23:16:19 +00:00
David Blaikie	c662b50150	GlobalsModRef+OptNone: Don't prove readnone/other properties from an optnone function Seems like at least one reasonable interpretation of optnone is that the optimizer never "looks inside" a function. This fix is consistent with that interpretation. Specifically this came up in the situation: f3 calls f2 calls f1 f2 is always_inline f1 is optnone The application of readnone to f1 (& thus to f2) caused the inliner to kill the call to f2 as being trivially dead (without even checking the cost function, as it happens - not sure if that's also a bug). llvm-svn: 304833	2017-06-06 20:51:15 +00:00
Anna Thomas	4acfc7e16e	[LVI Printer] Rely on the LVI analysis functions rather than the LVI cache Summary: LVIPrinter pass was previously relying on the LVICache. We now directly call the the LVI functions which solves the value if the LVI information is not already available in the cache. This has 2 benefits over the printing of LVI cache: 1. higher coverage (i.e. catches errors) in LVI code when cache value is invalidated. 2. relies on the core functions, and not dependent on the LVI cache (which may be scrapped at some point). It would still catch any cache invalidation errors, since we first go through the cache. Reviewers: reames, dberlin, sanjoy Reviewed by: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32135 llvm-svn: 304819	2017-06-06 19:25:31 +00:00
Anna Thomas	b2a212c070	[Atomics][LoopIdiom] Recognize unordered atomic memcpy Summary: Expanding the loop idiom test for memcpy to also recognize unordered atomic memcpy. The only difference for recognizing an unordered atomic memcpy and instead of a normal memcpy is that the loads and/or stores involved are unordered atomic operations. Background: http://lists.llvm.org/pipermail/llvm-dev/2017-May/112779.html Patch by Daniel Neilson! Reviewers: reames, anna, skatkov Reviewed By: reames, anna Subscribers: llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D33243 llvm-svn: 304806	2017-06-06 16:45:25 +00:00
Chandler Carruth	6bda14b313	Sort the remaining #include lines in include/... and lib/.... I did this a long time ago with a janky python script, but now clang-format has built-in support for this. I fed clang-format every line with a #include and let it re-sort things according to the precise LLVM rules for include ordering baked into clang-format these days. I've reverted a number of files where the results of sorting includes isn't healthy. Either places where we have legacy code relying on particular include ordering (where possible, I'll fix these separately) or where we have particular formatting around #include lines that I didn't want to disturb in this patch. This patch is entirely mechanical. If you get merge conflicts or anything, just ignore the changes in this patch and run clang-format over your #include lines in the files. Sorry for any noise here, but it is important to keep these things stable. I was seeing an increasing number of patches with irrelevant re-ordering of #include lines because clang-format was used. This patch at least isolates that churn, makes it easy to skip when resolving conflicts, and gets us to a clean baseline (again). llvm-svn: 304787	2017-06-06 11:49:48 +00:00
Joey Gouly	61eaa63b65	[InstSimplify] Constant fold the new GEP in SimplifyGEPInst. llvm-svn: 304784	2017-06-06 10:17:14 +00:00
Craig Topper	aa9a24bd8b	[InstSimplify] Remove some redundant code from InstSimplify now that llvm::isKnownNonEqual handles vectors. isKnownNonEqual is called a little earlier in this function and can handle the case that we were checking here as well as more complex cases. llvm-svn: 304775	2017-06-06 07:13:17 +00:00
Craig Topper	3002d5b0bf	[ValueTracking] Remove scalar only restriction from isKnownNonEqual. The computeKnownBits and isKnownNonZero calls this code relies on should work fine for vectors. This will be used by another commit to remove some code from InstSimplify that is redundant for scalars, but was needed for vectors due to this issue. llvm-svn: 304774	2017-06-06 07:13:15 +00:00
Craig Topper	2dfb4804f2	[InstSimplify] Use the getTrue/getFalse helpers and make sure we use the computed result type instead of hardcoding to i1. NFC Currently, isKnownNonEqual punts on vectors so the hardcoding to i1 doesn't matter. But I plan to fix that in a future patch. llvm-svn: 304773	2017-06-06 07:13:13 +00:00
Craig Topper	8e662f7f81	[ValueTracking] Use the computeKnownBits version that returns a KnownBits object instead of taking one by reference. NFC llvm-svn: 304772	2017-06-06 07:13:11 +00:00
Craig Topper	8365df825e	[ValueTracking] Use APInt::intersects to avoid some temporary APInts. NFC llvm-svn: 304771	2017-06-06 07:13:09 +00:00
Craig Topper	c2790ecda8	[InstSimplify] Use ICmpInst::isEquality predicate method. NFC llvm-svn: 304770	2017-06-06 07:13:04 +00:00
Evgeny Stupachenko	f2b3b467e5	Fix PR23384 (part 2 of 3) NFC Summary: The patch moves LSR cost comparison to target part. Reviewers: qcolombet Differential Revision: http://reviews.llvm.org/D30561 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 304750	2017-06-05 23:37:00 +00:00
Craig Topper	da8037f299	[InstSimplify] Use llvm::all_of instead of a manual loop. NFC llvm-svn: 304692	2017-06-04 22:41:56 +00:00
Craig Topper	d470d73c2d	[ConstantFolding] Combine an if statement into an earlier one that checked the same condition. NFC llvm-svn: 304681	2017-06-04 08:21:53 +00:00
Craig Topper	0dd29e2256	[ConstantFolding][X86] Replace an LLVM_FALLTHROUGH with a break because it really shouldn't fallthrough. This is actually NFC because the next case starts with the same if statement as this case did. So the result will be the same and it will fallthrough to the end of the switch. But there's no reason to rely on that so we should just break. llvm-svn: 304680	2017-06-04 08:21:51 +00:00
Craig Topper	fe9ad82e44	[ConstantFolding] Properly support constant folding of vector powi intrinsic. The second argument is not a vector so needs special treatment. llvm-svn: 304679	2017-06-04 07:30:28 +00:00
Craig Topper	7c553edced	[ConstantFolding] Fix constant folding for vector cttz and ctlz intrinsics to understand that the second argument is still a scalar. llvm-svn: 304668	2017-06-03 18:50:29 +00:00
Craig Topper	a803d5b8b0	[LazyValueInfo] Use Type::getIntegerBitWidth instead of casting to IntegerType to call getBitWidth. NFC llvm-svn: 304656	2017-06-03 07:47:14 +00:00
Craig Topper	0e5f1093ee	[LazyValueInfo] Make solveBlockValueCast take a CastInst* instead of Instruction*. Makes getOpcode return the appropriate enum without a cast. NFC llvm-svn: 304655	2017-06-03 07:47:08 +00:00
Jun Bum Lim	2960d41e68	[InlineCost] Enable the new switch cost heuristic Summary: This is to enable the new switch inline cost heuristic (r301649) by removing the old heuristic as well as the flag itself. In my experiment for LLVM test suite and spec2000/2006, +17.82% performance and 8% code size reduce was observed in spec2000/vertex with O3 LTO in AArch64. No significant code size / performance regression was found in O3/O2/Os. No significant complain was reported from the llvm-dev thread. Reviewers: hans, chandlerc, eraman, haicheng, mcrosier, bmakam, eastig, ddibyend, echristo Reviewed By: echristo Subscribers: javed.absar, kristof.beyls, echristo, aemerson, rengolin, mehdi_amini Differential Revision: https://reviews.llvm.org/D32653 llvm-svn: 304594	2017-06-02 20:42:54 +00:00
Craig Topper	9277a86f03	[LazyValueInfo] Fix formatting NFC. llvm-svn: 304567	2017-06-02 17:28:12 +00:00
Craig Topper	3778c8943b	[LazyValueInfo] Make solveBlockValueBinaryOp take a BinaryOperator* instead of Instruction*. This removes a cast of getOpcode to BinaryOps. llvm-svn: 304563	2017-06-02 16:33:13 +00:00
Craig Topper	84a9f168f1	[LazyValueInfo] Fix typo in comment. NFC llvm-svn: 304560	2017-06-02 16:21:13 +00:00
Craig Topper	b23e7c78a5	[InstSimplify][ConstantFolding] Teach constant folding how to handle icmp null, (inttoptr x) as well as it handles icmp (inttoptr x), null Summary: The constant folding code currently assumes that the constant expression will always be on the left and the simple null will be on the right. But that's not true at least on the path from InstSimplify. This patch adds support to ConstantFolding to detect the reversed case. Reviewers: spatel, dberlin, majnemer, davide, joey Reviewed By: joey Subscribers: joey, llvm-commits Differential Revision: https://reviews.llvm.org/D33801 llvm-svn: 304559	2017-06-02 16:17:32 +00:00
Benjamin Kramer	c1f5ae236c	[OrderedBasicBlock] Return false for comesBefore(A, A) So far it would return true for the first uncached query, then cached queries return false. llvm-svn: 304545	2017-06-02 13:10:31 +00:00
Eli Friedman	0d823d610d	Add opt-bisect support for region passes. This is necessary to get opt-bisect working with polly. Differential Revision: https://reviews.llvm.org/D33751 llvm-svn: 304476	2017-06-01 21:22:26 +00:00
Teresa Johnson	596b2e7ab2	[PGO] Adjust indirect call promotion threshold Summary: Reduce min percent required for indirect call promotion from 33% to 30%, which matches gcc's threshold and catches the same hot opportunities. Reviewers: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33798 llvm-svn: 304469	2017-06-01 21:10:10 +00:00
Evgeniy Stepanov	56584bbf16	(NFC) Track global summary liveness in GVFlags. Replace GVFlags::LiveRoot with GVFlags::Live and use that instead of all the DeadSymbols sets. This is refactoring in order to make liveness information available in the RegularLTO pipeline. llvm-svn: 304466	2017-06-01 20:30:06 +00:00
Reid Kleckner	fc7ba565ed	[EH] Recognize __(gxx\|gcc)_personality_seh0 as the GNU EH personalities These are no-ops when there are no invokes. We don't need to emit LSDAs for them. Fixes PR33220. llvm-svn: 304367	2017-05-31 22:35:52 +00:00
Galina Kistanova	244621faad	Added LLVM_FALLTHROUGH to address warning: this statement may fall through. NFC. llvm-svn: 304361	2017-05-31 22:16:24 +00:00
Galina Kistanova	8514dd540d	Added LLVM_FALLTHROUGH to address warning: this statement may fall through. NFC. llvm-svn: 304358	2017-05-31 22:09:46 +00:00
Galina Kistanova	0b69e363f6	Added LLVM_FALLTHROUGH to address warning: this statement may fall through. NFC. llvm-svn: 304356	2017-05-31 22:02:05 +00:00
Galina Kistanova	c2b642d009	Added missing break; added LLVM_FALLTHROUGH to address warning: this statement may fall through. NFC. llvm-svn: 304340	2017-05-31 20:25:13 +00:00
Zaara Syeda	3a7578c658	[PPC] Inline expansion of memcmp This patch does an inline expansion of memcmp. It changes the memcmp library call into an inline expansion when the size is known at compile time and is under a target specified threshold. This expansion is implemented in CodeGenPrepare and expands into straight line code. The target specifies a maximum load size and the expansion works by using this size to load the two sources, compare, and exit early if a difference is found. It also has a special case when the memcmp result is used in a compare to zero equality. Differential Revision: https://reviews.llvm.org/D28637 llvm-svn: 304313	2017-05-31 17:12:38 +00:00
George Burgess IV	0a7b989036	[CFLAA] Add missing break; note things are broken. Thanks to Galina Kistanova for finding the missing break! When trying to make a test for this, I realized our logic for handling extractvalue/insertvalue/... is somewhat broken. This makes constructing a test-case for this missing break nontrivial. llvm-svn: 304275	2017-05-31 02:35:26 +00:00
Daniel Berlin	71ff663e1b	InstructionSimplify: Remove now-redundant reachability tests, as dominates() already does them llvm-svn: 304270	2017-05-31 01:47:24 +00:00
Max Kazantsev	d8fe3eb9cb	[SCEV][NFC] Remove redundant params from isAvailableAtLoopEntry Params DT and LI are redundant, because these values are contained in fields anyways. Differential Revision: https://reviews.llvm.org/D33668 llvm-svn: 304204	2017-05-30 10:54:58 +00:00
Tobias Grosser	e3684d0b84	[SCEV] Assume parameters coming from function calls contain IVs The optimistic delinearization implemented in LLVM detects array sizes by looking for non-linear products between parameters and induction variables. In OpenCL code, such products often look like: A[get_global_id(0) * N + get_global_id(1)] Hence, the IV is hidden in the get_global_id() call and consequently delinearization would fail as no induction variable is available that helps us to identify N as array size parameter. We now use a very simple heuristic to change this. We assume that each parameter that comes directly from a function call is a hidden induction variable. As a result, we can delinearize the access above to: A[get_global_id(0)][get_global_id(1] llvm-svn: 304073	2017-05-27 15:17:49 +00:00
Keno Fischer	090f1959c1	[SCEVExpander] Try harder to avoid introducing inttoptr Summary: This fixes introduction of an incorrect inttoptr/ptrtoint pair in the included test case which makes use of non-integral pointers. I suspect there are more cases like this left, but this takes care of the one I was seeing at the moment. Reviewers: sanjoy Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D33129 llvm-svn: 304058	2017-05-27 03:22:55 +00:00
Craig Topper	348314dfb8	[InstSimplify] Push commuted op checks for and/or of icmp further down to avoid duplicate work Previously, we called simplifyPossiblyCastedAndOrOfICmps twice with the operands commuted, but the call to simplifyAndOrOfICmpsWithConstants further down already handles commuting and doesn't need to be called both ways. This patch pushes double calls further down to just the individual routines that need to be called twice. Differential Revision: https://reviews.llvm.org/D33603 llvm-svn: 304044	2017-05-26 22:42:34 +00:00
Craig Topper	9bce1ad232	[InstSimplify] Move a variable declaration to make simplifyAndOfICmps look more like simplifyOrOfICmps. NFC llvm-svn: 304023	2017-05-26 19:04:02 +00:00
Craig Topper	c8bebb1e84	[InstSimplify] Use commutable matchers to shorten some code This code was replicated two additional times to handle commuted cases, but I think a commutable matcher can take care of it. Differential Revision: https://reviews.llvm.org/D33585 llvm-svn: 304022	2017-05-26 19:03:59 +00:00
Craig Topper	1da22c3244	[InstSimplify] Use m_APInt instead of m_ConstantInt in ((V + N) & C1) \| (V & C2) handling in order to support splat vectors. The tests here are have operands commuted to provide more coverage. I also commuted one of the instructions in the scalar tests so the 4 tests cover the 4 commuted variations Differential Revision: https://reviews.llvm.org/D33599 llvm-svn: 304021	2017-05-26 19:03:53 +00:00
Max Kazantsev	41450329f7	Re-enable "[SCEV] Do not fold dominated SCEVUnknown into AddRecExpr start" The patch rL303730 was reverted because test lsr-expand-quadratic.ll failed on many non-X86 configs with this patch. The reason of this is that the patch makes a correctless fix that changes optimizer's behavior for this test. Without the change, LSR was making an overconfident simplification basing on a wrong SCEV. Apparently it did not need the IV analysis to do this. With the change, it chose a different way to simplify (that wasn't so confident), and this way required the IV analysis. Now, following the right execution path, LSR tries to make a transformation relying on IV Users analysis. This analysis is target-dependent due to this code: // LSR is not APInt clean, do not touch integers bigger than 64-bits. // Also avoid creating IVs of non-native types. For example, we don't want a // 64-bit IV in 32-bit code just because the loop has one 64-bit cast. uint64_t Width = SE->getTypeSizeInBits(I->getType()); if (Width > 64 \|\| !DL.isLegalInteger(Width)) return false; To make a proper transformation in this test case, the type i32 needs to be legal for the specified data layout. When the test runs on some non-X86 configuration (e.g. pure ARM 64), opt gets confused by the specified target and does not use it, rejecting the specified data layout as well. Instead, it uses some default layout that does not treat i32 as a legal type (currently the layout that is used when it is not specified does not have legal types at all). As result, the transformation we expect to happen does not happen for this test. This re-enabling patch does not have any source code changes compared to the original patch rL303730. The only difference is that the failing test is moved to X86 directory and now has requirement of running on x86 only to comply with the specified target triple and data layout. Differential Revision: https://reviews.llvm.org/D33543 llvm-svn: 303971	2017-05-26 06:47:04 +00:00
Craig Topper	25d9ba9a12	[InstSimplify] Use APInt::isMask isntead of manually implementing it. NFC llvm-svn: 303968	2017-05-26 05:16:22 +00:00
Craig Topper	50500d5054	[InstSimplify] Use m_ConstantInt matchers to short some code. NFC llvm-svn: 303967	2017-05-26 05:16:20 +00:00
Chandler Carruth	29c22d2835	[LegacyPM] Make the 'addLoop' method accept a loop to add rather than having it internally allocate the loop. This is a much more flexible API and necessary in the new loop unswitch to reasonably support both new and old PMs in common code. It also just seems like a cleaner separation of concerns. NFC, this should just be a pure refactoring. Differential Revision: https://reviews.llvm.org/D33528 llvm-svn: 303834	2017-05-25 03:01:31 +00:00
Craig Topper	77e07cc010	[InstSimplify] Simplify uadd/sadd/umul/smul with overflow intrinsics when the Zero or Undef is on the LHS. Summary: This code was migrated from InstCombine a few years ago. InstCombine had nearby code that would move Constants to the RHS for these, but InstSimplify doesn't have such code on this path. Reviewers: spatel, majnemer, davide Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33473 llvm-svn: 303774	2017-05-24 17:05:28 +00:00
Craig Topper	8205a1a9b6	[ValueTracking] Convert most of the calls to computeKnownBits to use the version that returns the KnownBits object. This continues the changes started when computeSignBit was replaced with this new version of computeKnowBits. Differential Revision: https://reviews.llvm.org/D33431 llvm-svn: 303773	2017-05-24 16:53:07 +00:00
Craig Topper	a2025eaaef	[ValueTracking] Add OptimizationRemarkEmitter to the other signature for commuteKnownBits. This is needed for an upcoming patch. llvm-svn: 303772	2017-05-24 16:53:03 +00:00
Diana Picus	183863fc3b	Revert "[SCEV] Do not fold dominated SCEVUnknown into AddRecExpr start" This reverts commit r303730 because it broke all the buildbots. llvm-svn: 303747	2017-05-24 14:16:04 +00:00
Jonas Paulsson	8624b7e1ce	[LoopVectorizer] Let target prefer scalar addressing computations. The loop vectorizer usually vectorizes any instruction it can and then extracts the elements for a scalarized use. On SystemZ, all elements containing addresses must be extracted into address registers (GRs). Since this extraction is not free, it is better to have the address in a suitable register to begin with. By forcing address arithmetic instructions and loads of addresses to be scalar after vectorization, two benefits result: * No need to extract the register * LSR optimizations trigger (LSR isn't handling vector addresses currently) Benchmarking show improvements on SystemZ with this new behaviour. Any other target could try this by returning false in the new hook prefersVectorizedAddressing(). Review: Renato Golin, Elena Demikhovsky, Ulrich Weigand https://reviews.llvm.org/D32422 llvm-svn: 303744	2017-05-24 13:42:56 +00:00
Max Kazantsev	13e016bf48	[SCEV] Do not fold dominated SCEVUnknown into AddRecExpr start When folding arguments of AddExpr or MulExpr with recurrences, we rely on the fact that the loop of our base recurrency is the bottom-lost in terms of domination. This assumption may be broken by an expression which is treated as invariant, and which depends on a complex Phi for which SCEVUnknown was created. If such Phi is a loop Phi, and this loop is lower than the chosen AddRecExpr's loop, it is invalid to fold our expression with the recurrence. Another reason why it might be invalid to fold SCEVUnknown into Phi start value is that unlike other SCEVs, SCEVUnknown are sometimes position-bound. For example, here: for (...) { // loop phi = {A,+,B} } X = load ... Folding phi + X into {A+X,+,B}<loop> actually makes no sense, because X does not exist and cannot exist while we are iterating in loop (this memory can be even not allocated and not filled by this moment). It is only valid to make such folding if X is defined before the loop. In this case the recurrence {A+X,+,B}<loop> may be existant. This patch prohibits folding of SCEVUnknown (and those who use them) into the start value of an AddRecExpr, if this instruction is dominated by the loop. Merging the dominating unknown values is still valid. Some tests that relied on the fact that some SCEVUnknown should be folded into AddRec's are changed so that they no longer expect such behavior. llvm-svn: 303730	2017-05-24 08:52:18 +00:00
Tim Northover	997f5f10c6	InstructionSimplify: don't speculate about Constants changing. When presented with an icmp/select pair, we can end up asking what would happen if we replaced one constant with another in an instruction. This is a mistake, while non-constant Values could become a constant, constants cannot change and trying to do so can lead to completely invalid IR (a GEP referencing a non-existant field in the original case). llvm-svn: 303580	2017-05-22 21:28:08 +00:00
Sanjoy Das	036dda25a5	[SCEV] Clarify behavior around max backedge taken count This is a re-application of a r303497 that was reverted in r303498. I thought it had broken a bot when it had not (the breakage did not go away with the revert). This change makes the split between the "exact" backedge taken count and the "maximum" backedge taken count a bit more obvious. Both of these are upper bounds on the number of times the loop header executes (since SCEV does not account for most kinds of abnormal control flow), but the latter is guaranteed to be a constant. There were a few places where the max backedge taken count was a non-constant; I've changed those to compute constants instead. At this point, I'm not sure if the constant max backedge count can be computed by calling `getUnsignedRange(Exact).getUnsignedMax()` without losing precision. If it can, we can simplify even further by making `getMaxBackedgeTakenCount` a thin wrapper around `getBackedgeTakenCount` and `getUnsignedRange`. llvm-svn: 303531	2017-05-22 06:46:04 +00:00
Sanjoy Das	8963650cfa	Revert "[SCEV] Clarify behavior around max backedge taken count" This reverts commit r303497 since it breaks the msan bootstrap bot: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/1379/ llvm-svn: 303498	2017-05-21 05:02:12 +00:00
Sanjoy Das	5207168383	[SCEV] Clarify behavior around max backedge taken count This change makes the split between the "exact" backedge taken count and the "maximum" backedge taken count a bit more obvious. Both of these are upper bounds on the number of times the loop header executes (since SCEV does not account for most kinds of abnormal control flow), but the latter is guaranteed to be a constant. There were a few places where the max backedge taken count was a non-constant; I've changed those to compute constants instead. At this point, I'm not sure if the constant max backedge count can be computed by calling `getUnsignedRange(Exact).getUnsignedMax()` without losing precision. If it can, we can simplify even further by making `getMaxBackedgeTakenCount` a thin wrapper around `getBackedgeTakenCount` and `getUnsignedRange`. llvm-svn: 303497	2017-05-21 01:47:50 +00:00
Xin Tong	9fbfeefadf	Revert "Add pthread_self function prototype and make it speculatable." This reverts commit 143d7445b5dfa2f6d6c45bdbe0433d9fc531be21. Build breaking llvm-svn: 303496	2017-05-21 00:37:55 +00:00
Xin Tong	75af3af957	Add pthread_self function prototype and make it speculatable. Summary: This allows pthread_self to be pulled out of a loop by LICM. Reviewers: hfinkel, arsenm, davide Reviewed By: davide Subscribers: davide, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D32782 llvm-svn: 303495	2017-05-20 22:40:25 +00:00
Matthias Braun	57fd12db0c	Fix breakage after r303461 - Improve wchar_t size predicitions based on target triple. - Be less strict in wchar_t size verifier. llvm-svn: 303477	2017-05-20 01:28:52 +00:00
Matthias Braun	50ec0b5dce	SimplifyLibCalls: Optimize wcslen Refactor the strlen optimization code to work for both strlen and wcslen. This especially helps with programs in the wild where people pass L"string"s to const std::wstring& function parameters and the wstring constructor gets inlined. This also fixes a lingerind API problem/bug in getConstantStringInfo() where zeroinitializers would always give you an empty string (without a length) back regardless of the actual length of the initializer which did not work well in the TrimAtNul==false causing the PR mentioned below. Note that the fixed getConstantStringInfo() needed fixes to SelectionDAG memcpy lowering and may lead to some cases for out-of-bounds zeroinitializer accesses not getting optimized anymore. So some code with UB may produce out of bound memory reads now instead of just producing zeros. The refactoring "accidentally" fixes http://llvm.org/PR32124 Differential Revision: https://reviews.llvm.org/D32839 llvm-svn: 303461	2017-05-19 22:37:09 +00:00
Daniel Berlin	a5130bbd12	BasicAA: Uninserted instructions have no parent, and notDifferentParent explicitly allows for this case, but getParent crashes when handed one. llvm-svn: 303442	2017-05-19 19:01:21 +00:00
Craig Topper	9c913bfd49	[InstSimplify] Fix 80 column violation. NFC llvm-svn: 303433	2017-05-19 16:56:53 +00:00
Reid Kleckner	96ab8726a3	[IR] De-virtualize ~Value to save a vptr Summary: Implements PR889 Removing the virtual table pointer from Value saves 1% of RSS when doing LTO of llc on Linux. The impact on time was positive, but too noisy to conclusively say that performance improved. Here is a link to the spreadsheet with the original data: https://docs.google.com/spreadsheets/d/1F4FHir0qYnV0MEp2sYYp_BuvnJgWlWPhWOwZ6LbW7W4/edit?usp=sharing This change makes it invalid to directly delete a Value, User, or Instruction pointer. Instead, such code can be rewritten to a null check and a call Value::deleteValue(). Value objects tend to have their lifetimes managed through iplist, so for the most part, this isn't a big deal. However, there are some places where LLVM deletes values, and those places had to be migrated to deleteValue. I have also created llvm::unique_value, which has a custom deleter, so it can be used in place of std::unique_ptr<Value>. I had to add the "DerivedUser" Deleter escape hatch for MemorySSA, which derives from User outside of lib/IR. Code in IR cannot include MemorySSA headers or call the MemoryAccess object destructors without introducing a circular dependency, so we need some level of indirection. Unfortunately, no class derived from User may have any virtual methods, because adding a virtual method would break User::getHungOffOperands(), which assumes that it can find the use list immediately prior to the User object. I've added a static_assert to the appropriate OperandTraits templates to help people avoid this trap. Reviewers: chandlerc, mehdi_amini, pete, dberlin, george.burgess.iv Reviewed By: chandlerc Subscribers: krytarowski, eraman, george.burgess.iv, mzolotukhin, Prazek, nlewycky, hans, inglorion, pcc, tejohnson, dberlin, llvm-commits Differential Revision: https://reviews.llvm.org/D31261 llvm-svn: 303362	2017-05-18 17:24:10 +00:00
Max Kazantsev	627ad0fec3	[SCEV][NFC] Remove duplication of isLoopInvariant code Replace two places that duplicate the code of isLoopInvariant method with the invocation of this method. Differential Revision: https://reviews.llvm.org/D33313 llvm-svn: 303336	2017-05-18 08:26:41 +00:00
Serguei Katkov	ba831f78fd	[BPI] Reduce the probability of unreachable edge to minimal value greater than 0 The probability of edge coming to unreachable block should be as low as possible. The change reduces the probability to minimal value greater than zero. The bug https://bugs.llvm.org/show_bug.cgi?id=32214 show the example when the probability of edge coming to unreachable block is greater than for edge coming to out of the loop and it causes incorrect loop rotation. Please note that with this change the behavior of unreachable heuristic is a bit different than others. Specifically, before this change the sum of probabilities coming to unreachable blocks have the same weight for all branches (it was just split over all edges of this block coming to unreachable blocks). With this change it might be slightly different but not to much due to probability of taken branch to unreachable block is really small. Reviewers: chandlerc, sanjoy, vsk, congh, junbuml, davidxl, dexonsmith Reviewed By: chandlerc, dexonsmith Subscribers: reames, llvm-commits Differential Revision: https://reviews.llvm.org/D30633 llvm-svn: 303327	2017-05-18 06:11:56 +00:00
Craig Topper	8a950275f7	[Statistics] Add a method to atomically update a statistic that contains a maximum Summary: There are several places in the codebase that try to calculate a maximum value in a Statistic object. We currently do this in one of two ways: MaxNumFoo = std::max(MaxNumFoo, NumFoo); or MaxNumFoo = (MaxNumFoo > NumFoo) ? MaxNumFoo : NumFoo; The first version reads from MaxNumFoo one time and uncontionally rwrites to it. The second version possibly reads it twice depending on the result of the first compare. But we have no way of knowing if the value was changed by another thread between the reads and the writes. This patch adds a method to the Statistic object that can ensure that we only store if our value is the max and the previous max didn't change after we read it. If it changed we'll recheck if our value should still be the max or not and try again. This spawned from an audit I'm trying to do of all places we uses the implicit conversion to unsigned on the Statistics objects. See my previous thread on llvm-dev https://groups.google.com/forum/#!topic/llvm-dev/yfvxiorKrDQ Reviewers: dberlin, chandlerc, hfinkel, dblaikie Reviewed By: chandlerc Subscribers: llvm-commits, sanjoy Differential Revision: https://reviews.llvm.org/D33301 llvm-svn: 303318	2017-05-18 00:51:39 +00:00
Sanjay Patel	e2787b9a35	[InstSimplify] handle all icmp i1 X, C in one place; NFCI We already handled all of the new tests identically, but several of those went through a lot of unnecessary processing before getting folded. Another motivation for grouping these cases together is that InstCombine needs a similar fold. Currently, it handles the 'not' cases inefficiently which can lead to bugs as described in the post-commit comments of: https://reviews.llvm.org/D32143 llvm-svn: 303295	2017-05-17 20:27:55 +00:00
Max Kazantsev	4c7f293d24	[SCEV] Always sort AddRecExprs from different loops by dominance Sorting of AddRecExprs by loop nesting does not make sense since we only invoke the CompareSCEVComplexity for AddRecExprs that are used by one SCEV. This guarantees that there is always a dominance relationship between them. This patch removes the sorting by nesting which is a dead code in current usage of this function. Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D33228 llvm-svn: 303235	2017-05-17 04:09:14 +00:00
Max Kazantsev	b67d344850	[SCEV][NFC] Replace redundant dyn_cast with cast in getAddExpr Replace dyn_cast which is ensured by isa just one line above with cast. Differential Revision: https://reviews.llvm.org/D33231 llvm-svn: 303234	2017-05-17 03:58:42 +00:00
Francis Visoiu Mistrih	b52e036600	BitVector: add iterators for set bits Differential revision: https://reviews.llvm.org/D32060 llvm-svn: 303227	2017-05-17 01:07:53 +00:00
Sanjay Patel	877364ff99	[InstSimplify] add folds for constant mask of value shifted by constant We would eventually catch these via demanded bits and computing known bits in InstCombine, but I think it's better to handle the simple cases as soon as possible as a matter of efficiency. This fold allows further simplifications based on distributed ops transforms. eg: %a = lshr i8 %x, 7 %b = or i8 %a, 2 %c = and i8 %b, 1 InstSimplify can directly fold this now: %a = lshr i8 %x, 7 Differential Revision: https://reviews.llvm.org/D33221 llvm-svn: 303213	2017-05-16 21:51:04 +00:00
Easwaran Raman	3cd1479c3f	[Inliner] Do not mix callsite and callee hotness based updates. Update threshold based on callee's hotness only when BFI is not available. Otherwise use only callsite's hotness. This makes it easier to reason about hotness related threshold updates. Differential revision: https://reviews.llvm.org/D33157 llvm-svn: 303210	2017-05-16 21:18:09 +00:00
Easwaran Raman	dadc0f11ad	Add hasProfileSummary and has{Sample\|Instrumentation}Profile methods ProfileSummaryInfo already checks whether the module has sample profile in determining profile counts. This will also be useful in inliner to clean up threshold updates. llvm-svn: 303204	2017-05-16 20:14:39 +00:00
Max Kazantsev	b09b5db793	[SCEV] Fix sorting order for AddRecExprs The existing sorting order in defined CompareSCEVComplexity sorts AddRecExprs by loop depth, but does not pay attention to dominance of loops. This can lead us to the following buggy situation: for (...) { // loop1 op1 = {A,+,B} } for (...) { // loop2 op2 = {A,+,B} S = add op1, op2 } In this case there is no guarantee that in operand list of S the op2 comes before op1 (loop depth is the same, so they will be sorted just lexicographically), so we can incorrectly treat S as a recurrence of loop1, which is wrong. This patch changes the sorting logic so that it places the dominated recs before the dominating recs. This ensures that when we pick the first recurrency in the operands order, it will be the bottom-most in terms of domination tree. The attached test set includes some tests that produce incorrect SCEV estimations and crashes with oldlogic. Reviewers: sanjoy, reames, apilipenko, anna Reviewed By: sanjoy Subscribers: llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D33121 llvm-svn: 303148	2017-05-16 07:27:06 +00:00
Peter Collingbourne	6f0ecca3b5	IR: Give function GlobalValue::getRealLinkageName() a less misleading name: dropLLVMManglingEscape(). This function gives the wrong answer on some non-ELF platforms in some cases. The function that does the right thing lives in Mangler.h. To try to discourage people from using this function, give it a different name. Differential Revision: https://reviews.llvm.org/D33162 llvm-svn: 303134	2017-05-16 00:39:01 +00:00
Adam Nemet	e29686e5c1	[SLP] Enable 64-bit wide vectorization on AArch64 ARM Neon has native support for half-sized vector registers (64 bits). This is beneficial for example for 2D and 3D graphics. This patch adds the option to lower MinVecRegSize from 128 via a TTI in the SLP Vectorizer. * Performance Analysis This change was motivated by some internal benchmarks but it is also beneficial on SPEC and the LLVM testsuite. The results are with -O3 and PGO. A negative percentage is an improvement. The testsuite was run with a sample size of 4. SPEC * CFP2006/482.sphinx3 -3.34% A pretty hot loop is SLP vectorized resulting in nice instruction reduction. This used to be a +22% regression before rL299482. * CFP2000/177.mesa -3.34% * CINT2000/256.bzip2 +6.97% My current plan is to extend the fix in rL299482 to i16 which brings the regression down to +2.5%. There are also other problems with the codegen in this loop so there is further room for improvement. ** LLVM testsuite * SingleSource/Benchmarks/Misc/ReedSolomon -10.75% There are multiple small SLP vectorizations outside the hot code. It's a bit surprising that it adds up to 10%. Some of this may be code-layout noise. * MultiSource/Benchmarks/VersaBench/beamformer/beamformer -8.40% The opt-viewer screenshot can be seen at F3218284. We start at a colder store but the tree leads us into the hottest loop. * MultiSource/Applications/lambda-0.1.3/lambda -2.68% * MultiSource/Benchmarks/Bullet/bullet -2.18% This is using 3D vectors. * SingleSource/Benchmarks/Shootout-C++/Shootout-C++-lists +6.67% Noise, binary is unchanged. * MultiSource/Benchmarks/Ptrdist/anagram/anagram +4.90% There is an additional SLP in the cold code. The test runs for ~1sec and prints out over 2000 lines. This is most likely noise. * MultiSource/Applications/aha/aha +1.63% * MultiSource/Applications/JM/lencod/lencod +1.41% * SingleSource/Benchmarks/Misc/richards_benchmark +1.15% Differential Revision: https://reviews.llvm.org/D31965 llvm-svn: 303116	2017-05-15 21:15:01 +00:00
Sanjay Patel	a23b141cd2	[InstSimplify] restrict icmp fold with 2 sdiv exact operands (PR32949) These folds were introduced with https://reviews.llvm.org/rL127064 as part of solving: https://bugs.llvm.org/show_bug.cgi?id=9343 As shown here: http://rise4fun.com/Alive/C8 ...however, the sdiv exact case needs a stronger predicate. I opted for duplicated code instead of adding another fallthrough because I think that's easier to read (and edit in case we need/want to restrict/loosen the predicates any more). This should fix: https://bugs.llvm.org/show_bug.cgi?id=32949 https://bugs.llvm.org/show_bug.cgi?id=32948 Differential Revision: https://reviews.llvm.org/D32954 llvm-svn: 303104	2017-05-15 19:16:49 +00:00
Craig Topper	716cad8bb7	[SCEV] Use copy initialization of APInts instead of direct initialization. This is based on post commit feed back from r302769. llvm-svn: 303092	2017-05-15 18:14:16 +00:00
Craig Topper	1a36b7d836	[ValueTracking] Replace all uses of ComputeSignBit with computeKnownBits. This patch finishes off the conversion of ComputeSignBit to computeKnownBits. Differential Revision: https://reviews.llvm.org/D33166 llvm-svn: 303035	2017-05-15 06:39:41 +00:00
Sanjoy Das	f6f6fb903e	Move some code into ScalarEvolution.cpp; NFC I need to add some asserts to these constructors that are easier to add once they're in the .cpp file. llvm-svn: 303032	2017-05-15 04:22:09 +00:00
Craig Topper	bb9737247a	[InstCombine] Merge duplicate functionality between InstCombine and ValueTracking Summary: Merge overflow computation for signed add, appearing both in InstCombine and ValueTracking. As part of the merge, cleanup the interface for overflow checks in InstCombine. Patch by Yoav Ben-Shalom. Reviewers: craig.topper, majnemer Reviewed By: craig.topper Subscribers: takuto.ikuta, llvm-commits Differential Revision: https://reviews.llvm.org/D32946 llvm-svn: 303029	2017-05-15 02:44:08 +00:00
Craig Topper	479daaf74c	[InstSimplify] Add patterns for folding (A & B) \| (~A ^ B) -> (~A ^ B) and its commuted variants. We already had (A & ~B) \| (A ^ B), but we missed the cases where the not was part of the xor. llvm-svn: 303004	2017-05-14 07:54:43 +00:00
Craig Topper	dfc8955ee6	[BasicAA] Alphabetize includes. NFC llvm-svn: 303002	2017-05-14 06:18:34 +00:00
Craig Topper	9fe357971c	[ValueTracking] Remove const_casts on several calls to computeKnownBits and ComputeSignBit. NFC llvm-svn: 302991	2017-05-13 17:22:16 +00:00
Andrew Kaylor	b01e94ee8d	[TLI] Add mapping for various '__<func>_finite' forms of the math routines to SVML routines Patch by Chris Chrulski Differential Revision: https://reviews.llvm.org/D31789 llvm-svn: 302957	2017-05-12 22:11:26 +00:00
Andrew Kaylor	f7c864f89c	[ConstantFolding] Add folding for various math '__<func>_finite' routines generated from -ffast-math Patch by Chris Chrulski Differential Revision: https://reviews.llvm.org/D31788 llvm-svn: 302956	2017-05-12 22:11:20 +00:00
Andrew Kaylor	3cd8c16d7f	[TLI] Add declarations for various math header file routines from math-finite.h that create '__<func>_finite as functions Patch by Chris Chrulski Differential Revision: https://reviews.llvm.org/D31787 llvm-svn: 302955	2017-05-12 22:11:12 +00:00
Craig Topper	8df66c602a	[KnownBits] Add bit counting methods to KnownBits struct and use them where possible This patch adds min/max population count, leading/trailing zero/one bit counting methods. The min methods return answers based on bits that are known without considering unknown bits. The max methods give answers taking into account the largest count that unknown bits could give. Differential Revision: https://reviews.llvm.org/D32931 llvm-svn: 302925	2017-05-12 17:20:30 +00:00
Serguei Katkov	63c9c81152	[BPI] Ignore remainder while distributing the remaining probability from unreachanble This is a follow up patch for https://reviews.llvm.org/rL300440 to address a comment. To make implementation to be consistent with other cases we just ignore the remainder after distribution of remaining probability between reachable edges. If we reduced the probability of some edges coming to unreachable blocks we should distribute the remaining part across other edges coming to reachable blocks to satisfy the condition that sum of all probabilities should be equal to one. If this remaining part is not divided by number of "reachable" edges then we get this remainder. This remainder probability should be pretty small. Other cases just ignore if the sum of probabilities is not equal to one so we do the same. Reviewers: chandlerc, sanjoy, vsk, junbuml, reames Reviewed By: reames Subscribers: reames, llvm-commits Differential Revision: https://reviews.llvm.org/D32124 llvm-svn: 302883	2017-05-12 07:50:06 +00:00
Peter Collingbourne	f3e9f12296	CallGraph: Remove almost-unused field 'Root'. llvm-svn: 302852	2017-05-11 23:59:05 +00:00
Teresa Johnson	2a6b7991d4	Restrict call metadata based hotness detection to Sample PGO mode Summary: Don't use the metadata on call instructions for determining hotness unless we are in sample PGO mode, where it is needed because profile counts are not accurate. In instrumentation mode this is not necessary and does more harm than good when calls have VP metadata that hasn't been properly scaled after transformations or dropped after constant prop based devirtualization (both should be fixed, but we don't need to do this in the first place for instrumentation PGO). This required adjusting a number of tests to distinguish between sample and instrumentation PGO handling, and to add in profile summary metadata so that getProfileCount can get the summary. Reviewers: davidxl, danielcdh Subscribers: aemerson, rengolin, mehdi_amini, Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D32877 llvm-svn: 302844	2017-05-11 23:18:05 +00:00
Easwaran Raman	c103ef89ee	Decrease inlinecold-threshold to 45 I ran the test-suite (including SPEC 2006) in PGO mode comparing cold thresholds of 225 and 45. Here are some stats on the text size: Out of 904 tests that ran, 197 see a change in text size. The average text size reduction (of all the 904 binaries) is 1.07%. Of the 197 binaries, 19 see a text size increase, as high as 18%, but most of them are small single source benchmarks. There are 3 multisource benchmarks with a >0.5% size increase (0.7, 1.3 and 2.1 are their % increases). On the other side of the spectrum, 31 benchmarks see >10% size reduction and 6 of them are MultiSource. I haven't run the test-suite with other values of inlinecold-threshold. Since we have a cold callsite threshold of 45, I picked this value. Differential revision: https://reviews.llvm.org/D33106 llvm-svn: 302829	2017-05-11 21:36:28 +00:00
Craig Topper	e3e1a35f68	[SCEV] Reduce possible APInt allocations a bit. llvm-svn: 302769	2017-05-11 06:48:54 +00:00
Craig Topper	6694a4e6d6	[SCEV] Remove unneeded 'using namespace APIntOps'. llvm-svn: 302768	2017-05-11 06:48:51 +00:00
Teresa Johnson	94624aca2a	Ensure non-null ProfileSummaryInfo passed to ModuleSummaryIndex builder This fixes a ubsan bot failure after r302597, which made getProfileCount non-static, but ended up invoking it on a null ProfileSummaryInfo object in some cases from buildModuleSummaryIndex. Most testing passed because the non-static getProfileCount currently doesn't access any member variables, but I found this when testing a follow on patch (D32877) that adds a member variable access. llvm-svn: 302705	2017-05-10 18:52:16 +00:00
Amara Emerson	836b0f48c1	Add a late IR expansion pass for the experimental reduction intrinsics. This pass uses a new target hook to decide whether or not to expand a particular intrinsic to the shuffevector sequence. Differential Revision: https://reviews.llvm.org/D32245 llvm-svn: 302631	2017-05-10 09:42:49 +00:00
Easwaran Raman	f5f9160072	[ProfileSummary] Make getProfileCount a non-static member function. This change is required because the notion of count is different for sample profiling and getProfileCount will need to determine the underlying profile type. Differential revision: https://reviews.llvm.org/D33012 llvm-svn: 302597	2017-05-09 23:21:10 +00:00
Amara Emerson	cf9daa33a7	Introduce experimental generic intrinsics for horizontal vector reductions. - This change allows targets to opt-in to using them instead of the log2 shufflevector algorithm. - The SLP and Loop vectorizers have the common code to do shuffle reductions factored out into LoopUtils, and now have a unified interface for generating reductions regardless of the preference of the target. LoopUtils now uses TTI to determine what kind of reductions the target wants to handle. - For CodeGen, basic legalization support is added. Differential Revision: https://reviews.llvm.org/D30086 llvm-svn: 302514	2017-05-09 10:43:25 +00:00
Craig Topper	ef869ecf0e	[SCEV] Don't use std::move on both inputs to APInt::operator+ or operator-. It might be confusing to the reader. NFC llvm-svn: 302448	2017-05-08 17:39:01 +00:00
Craig Topper	868813ffbb	[ValueTracking] Use KnownOnes to provide a better bound on known zeros for ctlz/cttz intrinics This patch uses KnownOnes of the input of ctlz/cttz to bound the value that can be returned from these intrinsics. This makes these intrinsics more similar to the handling for ctpop which already uses known bits to produce a similar bound. Differential Revision: https://reviews.llvm.org/D32521 llvm-svn: 302444	2017-05-08 17:22:34 +00:00
Sanjay Patel	6745447753	[InstSimplify] fix typo; NFC llvm-svn: 302439	2017-05-08 16:35:02 +00:00
Craig Topper	6e11a05e7e	[ValueTracking] Introduce a version of computeKnownBits that returns a KnownBits struct. Begin using it to replace internal usages of ComputeSignBit This introduces a new interface for computeKnownBits that returns the KnownBits object instead of requiring it to be pre-constructed and passed in by reference. This is a much more convenient interface as it doesn't require the caller to figure out the BitWidth to pre-construct the object. It's so convenient that I believe we can use this interface to remove the special ComputeSignBit flavor of computeKnownBits. As a step towards that idea, this patch replaces all of the internal usages of ComputeSignBit with this new interface. As you can see from the patch there were a couple places where we called ComputeSignBit which really called computeKnownBits, and then called computeKnownBits again directly. I've reduced those places to only making one call to computeKnownBits. I bet there are probably external users that do it too. A future patch will update the external users and remove the ComputeSignBit interface. I'll also working on moving more locations to the KnownBits returning interface for computeKnownBits. Differential Revision: https://reviews.llvm.org/D32848 llvm-svn: 302437	2017-05-08 16:22:48 +00:00
Sanjay Patel	2df38a80f1	[InstCombine/InstSimplify] add comments about code duplication; NFC llvm-svn: 302436	2017-05-08 16:21:55 +00:00
Zvi Rackover	558f86b4bc	InstructionSimplify: Refactor foldIdentityShuffles. NFC. Summary: Minor refactoring of foldIdentityShuffles() which allows the removal of a ConstantDataVector::get() in SimplifyShuffleVectorInstruction. Reviewers: spatel Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32955 Conflicts: lib/Analysis/InstructionSimplify.cpp llvm-svn: 302433	2017-05-08 15:46:58 +00:00
Zvi Rackover	dfbd3d7903	IR: Add a shufflevector mask commutation helper function. NFC. Summary: Following up on Sanjay's suggetion in D32955, move this functionality into ShuffleVectornstruction. Reviewers: spatel, RKSimon Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32956 llvm-svn: 302420	2017-05-08 12:40:18 +00:00
Craig Topper	389d8cebd1	[SCEV] Use APInt::operator*=(uint64_t) to avoid a temporary APInt for a constant. llvm-svn: 302404	2017-05-08 04:55:13 +00:00
Craig Topper	d6f2639fd7	[SCEV] Have getRangeForAffineARHelper take StartRange by const reference to avoid a copy in many of the cases. llvm-svn: 302398	2017-05-08 02:29:15 +00:00
Zvi Rackover	973ff7c74c	InstructionSimplify: Relanding r301766 Summary: Re-applying r301766 with a fix to a typo and a regression test. The log message for r301766 was: ================================================================================== InstructionSimplify: Canonicalize shuffle operands. NFC-ish. Summary: Apply canonicalization rules: 1. Input vectors with no elements selected from can be replaced with undef. 2. If only one input vector is constant it shall be the second one. This allows constant-folding to cover more ad-hoc simplifications that were in place and avoid duplication for RHS and LHS checks. There are more rules we may want to add in the future when we see a justification. e.g. mask elements that select undef elements can be replaced with undef. ================================================================================== Reviewers: spatel, RKSimon Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32863 llvm-svn: 302373	2017-05-07 18:16:37 +00:00
Craig Topper	252682a41b	[SCEV] Use move semantics in ScalarEvolution::setRange Summary: This makes setRange take ConstantRange by rvalue reference since most callers were passing an unnamed temporary ConstantRange. We can then move that ConstantRange into the DenseMap caches. For the callers that weren't passing a temporary, I've added std::move to to the local variable being passed. Reviewers: sanjoy, mzolotukhin, efriedma Reviewed By: sanjoy Subscribers: takuto.ikuta, llvm-commits Differential Revision: https://reviews.llvm.org/D32943 llvm-svn: 302371	2017-05-07 16:28:17 +00:00
Sanjay Patel	599e65b1ff	[InstSimplify] use ConstantRange to simplify or-of-icmps We can simplify (or (icmp X, C1), (icmp X, C2)) to 'true' or one of the icmps in many cases. I had to check some of these with Alive to prove to myself it's right, but everything seems to check out. Eg, the deleted code in instcombine was completely ignoring predicates with mismatched signedness. This is a follow-up to: https://reviews.llvm.org/rL301260 https://reviews.llvm.org/D32143 llvm-svn: 302370	2017-05-07 15:11:40 +00:00
Sanjoy Das	df8c2ebe73	Remove unnecessary const_cast llvm-svn: 302368	2017-05-07 05:29:36 +00:00
Sanjoy Das	40415eeb59	Use array_pod_sort instead of std::sort llvm-svn: 302367	2017-05-07 05:29:34 +00:00
Craig Topper	6c5e22a4b8	[SCEV] Remove extra APInt copies from getRangeForAffineARHelper. This changes one parameter to be a const APInt& since we only read from it. Use std::move on local APInts once they are no longer needed so we can reuse their allocations. Lastly, use operator+=(uint64_t) instead of adding 1 to an APInt twice creating a new APInt each time. llvm-svn: 302335	2017-05-06 06:03:07 +00:00
Craig Topper	69f1af29fb	[SCEV] Use std::move to avoid some APInt copies. llvm-svn: 302334	2017-05-06 05:22:56 +00:00
Craig Topper	c97fdb846e	[SCEV] Use APInt's uint64_t operations instead of creating a temporary APInt to hold 1. llvm-svn: 302333	2017-05-06 05:15:11 +00:00
Craig Topper	8f26b7945e	[SCEV] Avoid a couple APInt copies by capturing by reference since the method returns a reference. llvm-svn: 302332	2017-05-06 05:15:09 +00:00
Craig Topper	2b195fd2c3	[LazyValueInfo] Avoid unnecessary copies of ConstantRanges Summary: ConstantRange contains two APInts which can allocate memory if their width is larger than 64-bits. So we shouldn't copy it when we can avoid it. This changes LVILatticeVal::getConstantRange() to return its internal ConstantRange by reference. This allows many places that just need a ConstantRange reference to avoid making a copy. Several places now capture the return value of getConstantRange() by reference so they can call methods on it that don't need a new object. Lastly it adds std::move in one place to capture to move a local ConstantRange into an LVILatticeVal. Reviewers: reames, dberlin, sanjoy, anna Reviewed By: reames Subscribers: grandinj, llvm-commits Differential Revision: https://reviews.llvm.org/D32884 llvm-svn: 302331	2017-05-06 03:35:15 +00:00
Matthias Braun	60b40b8fec	TargetLibraryInfo: Introduce wcslen wcslen is part of the C99 and C++98 standards. - This introduces the function to TargetLibraryInfo. - Also set attributes for wcslen in llvm::inferLibFuncAttributes(). Differential Revision: https://reviews.llvm.org/D32837 llvm-svn: 302278	2017-05-05 20:25:50 +00:00
Craig Topper	f0aeee01c3	[KnownBits] Add wrapper methods for setting and clear all bits in the underlying APInts in KnownBits. This adds routines for reseting KnownBits to unknown, making the value all zeros or all ones. It also adds methods for querying if the value is zero, all ones or unknown. Differential Revision: https://reviews.llvm.org/D32637 llvm-svn: 302262	2017-05-05 17:36:09 +00:00
Sanjay Patel	e42b4d566e	[InstSimplify] add folds for or-of-casted-icmps The sibling folds for 'and' with casts were added with https://reviews.llvm.org/rL273200. This is a preliminary step for adding the 'or' variants for the folds added with https://reviews.llvm.org/rL301260. The reason for the strange form with constant LHS in the 1st test is because there's another missing fold in that case for the inverted predicate. That should be fixed when we add the ConstantRange functionality for 'or-of-icmps' that already exists for 'and-of-icmps'. I'm hoping to share more code for the and/or cases, so we won't have these differences. This will allow us to remove code from InstCombine. It's also possible that we can remove some code here in InstSimplify. I think we have some duplicated folds because patterns are not matched in a general way. Differential Revision: https://reviews.llvm.org/D32876 llvm-svn: 302189	2017-05-04 19:51:34 +00:00
Sanjay Patel	142cb83768	[InstSimplify] move logic-of-icmps helper functions; NFC Putting these next to each other should make it easier to see what's missing from each side. Patch to plug one of those holes should be posted soon. llvm-svn: 302178	2017-05-04 18:19:17 +00:00
Peter Collingbourne	9667b91b13	Re-apply r302108, "IR: Use pointers instead of GUIDs to represent edges in the module summary. NFCI." with a fix for the clang backend. llvm-svn: 302176	2017-05-04 18:03:25 +00:00
Michael Zolotukhin	3207d30fdd	Fix a typo. llvm-svn: 302175	2017-05-04 17:42:34 +00:00
Eric Liu	f6039f255e	Revert "IR: Use pointers instead of GUIDs to represent edges in the module summary. NFCI." This reverts commit r302108. This causes crash in clang bootstrap with LTO. Contacted the auther in the original commit. llvm-svn: 302140	2017-05-04 11:49:39 +00:00
Peter Collingbourne	5f85a9deda	IR: Use pointers instead of GUIDs to represent edges in the module summary. NFCI. When profiling a no-op incremental link of Chromium I found that the functions computeImportForFunction and computeDeadSymbols were consuming roughly 10% of the profile. The goal of this change is to improve the performance of those functions by changing the map lookups that they were previously doing into pointer dereferences. This is achieved by changing the ValueInfo data structure to be a pointer to an element of the global value map owned by ModuleSummaryIndex, and changing reference lists in the GlobalValueSummary to hold ValueInfos instead of GUIDs. This means that a ValueInfo will take a client directly to the summary list for a given GUID. Differential Revision: https://reviews.llvm.org/D32471 llvm-svn: 302108	2017-05-04 03:36:16 +00:00
Michael Zolotukhin	37162adf3e	[SCEV] createAddRecFromPHI: Optimize for the most common case. Summary: The existing implementation creates a symbolic SCEV expression every time we analyze a phi node and then has to remove it, when the analysis is finished. This is very expensive, and in most of the cases it's also unnecessary. According to the data I collected, ~60-70% of analyzed phi nodes (measured on SPEC) have the following form: PN = phi(Start, OP(Self, Constant)) Handling such cases separately significantly speeds this up. Reviewers: sanjoy, pete Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32663 llvm-svn: 302096	2017-05-03 23:53:38 +00:00
Craig Topper	8189a87a1e	[KnownBits] Add methods for determining if KnownBits is a constant value This patch adds isConstant and getConstant for determining if KnownBits represents a constant value and to retrieve the value. Use them to simplify code. Differential Revision: https://reviews.llvm.org/D32785 llvm-svn: 302091	2017-05-03 23:12:29 +00:00

1 2 3 4 5 ...

7458 Commits