llvm-project

Commit Graph

Author	SHA1	Message	Date
Guozhi Wei	bce228ca42	[TargetTransformInfo] Handle intrinsic call in getInstructionLatency() Usually an intrinsic is a simple target instruction, it should have a small latency. A real function call has much larger latency. So handle the intrinsic call in function getInstructionLatency(). Differential Revision: https://reviews.llvm.org/D38104 llvm-svn: 314003	2017-09-22 18:25:53 +00:00
Guozhi Wei	3d1305f6da	[TargetTransformInfo] Static alloca has 0 cost Static alloca usually doesn't generate any machine instructions, so it has 0 cost. Differential Revision: https://reviews.llvm.org/D37879 llvm-svn: 313410	2017-09-15 22:28:12 +00:00
Guozhi Wei	21f8fad909	[TargetTransformInfo] Detect 0 latency instructions For instructions that unlikely generate machine instructions, they should also have 0 latency. Differential Revision: https://reviews.llvm.org/D37833 llvm-svn: 313288	2017-09-14 19:20:02 +00:00
Silviu Baranga	ac920f7716	[LAA] Allow more run-time alias checks by coercing pointer expressions to AddRecExprs Summary: LAA can only emit run-time alias checks for pointers with affine AddRec SCEV expressions. However, non-AddRecExprs can be now be converted to affine AddRecExprs using SCEV predicates. This change tries to add the minimal set of SCEV predicates in order to enable run-time alias checking. Reviewers: anemet, mzolotukhin, mkuper, sanjoy, hfinkel Reviewed By: hfinkel Subscribers: mssimpso, Ayal, dorit, roman.shirokiy, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D17080 llvm-svn: 313012	2017-09-12 07:48:22 +00:00
Guozhi Wei	62d6414465	[TargetTransformInfo] Add a new public interface getInstructionCost Current TargetTransformInfo can support throughput cost model and code size model, but sometimes we also need instruction latency cost model in different optimizations. Hal suggested we need a single public interface to query the different cost of an instruction. So I proposed following interface: enum TargetCostKind { TCK_RecipThroughput, ///< Reciprocal throughput. TCK_Latency, ///< The latency of instruction. TCK_CodeSize ///< Instruction code size. }; int getInstructionCost(const Instruction *I, enum TargetCostKind kind) const; All clients should mainly use this function to query the cost of an instruction, parameter <kind> specifies the desired cost model. This patch also provides a simple default implementation of getInstructionLatency. The default getInstructionLatency provides latency numbers for only small number of instruction classes, those latency numbers are only reasonable for modern OOO processors. It can be extended in following ways: Add more detail into this function. Add getXXXLatency function and call it from here. Implement target specific getInstructionLatency function. Differential Revision: https://reviews.llvm.org/D37170 llvm-svn: 312832	2017-09-08 22:29:17 +00:00
Zvi Rackover	25799d93f0	X86: Improve AVX512 fptoui lowering Summary: Add patterns for fptoui <16 x float> to <16 x i8> fptoui <16 x float> to <16 x i16> Reviewers: igorb, delena, craig.topper Reviewed By: craig.topper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37505 llvm-svn: 312704	2017-09-07 07:40:34 +00:00
Alexandre Isoard	405728fd47	[SCEV] Add URem support to SCEV In LLVM IR the following code: %r = urem <ty> %t, %b is equivalent to %q = udiv <ty> %t, %b %s = mul <ty> nuw %q, %b %r = sub <ty> nuw %t, %q ; (t / b) * b + (t % b) = t As UDiv, Mul and Sub are already supported by SCEV, URem can be implemented with minimal effort using that relation: %r --> (-%b * (%t /u %b)) + %t We implement two special cases: - if %b is 1, the result is always 0 - if %b is a power-of-two, we produce a zext/trunc based expression instead That is, the following code: %r = urem i32 %t, 65536 Produces: %r --> (zext i16 (trunc i32 %a to i16) to i32) Note that while this helps get a tighter bound on the range analysis and the known-bits analysis, this exposes some normalization shortcoming of SCEVs: %div = udim i32 %a, 65536 %mul = mul i32 %div, 65536 %rem = urem i32 %a, 65536 %add = add i32 %mul, %rem Will usually not be reduced. llvm-svn: 312329	2017-09-01 14:59:59 +00:00
Matt Arsenault	376f1bd73c	AMDGPU: Don't assert in TTI with fp32 denorms enabled Also refine for f16 and rcp cases. llvm-svn: 312213	2017-08-31 05:47:00 +00:00
Simon Pilgrim	c63f93a197	[CostModel][X86][XOP] Improve costs for XOP shuffles VPPERM/VPERMIL2PD/VPERMIL2PS all provide more effective 2-input shuffles than regular AVX instructions llvm-svn: 311005	2017-08-16 13:50:20 +00:00
Jakub Kuderski	638c085d07	[Dominators] Include infinite loops in PostDominatorTree Summary: This patch teaches PostDominatorTree about infinite loops. It is built on top of D29705 by @dberlin which includes a very detailed motivation for this change. What's new is that the patch also teaches the incremental updater how to deal with reverse-unreachable regions and how to properly maintain and verify tree roots. Before that, the incremental algorithm sometimes ended up preserving reverse-unreachable regions after updates that wouldn't appear in the tree if it was constructed from scratch on the same CFG. This patch makes the following assumptions: - A sequence of updates should produce the same tree as a recalculating it. - Any sequence of the same updates should lead to the same tree. - Siblings and roots are unordered. The last two properties are essential to efficiently perform batch updates in the future. When it comes to the first one, we can decide later that the consistency between freshly built tree and an updated one doesn't matter match, as there are many correct ways to pick roots in infinite loops, and to relax this assumption. That should enable us to recalculate postdominators less frequently. This patch is pretty conservative when it comes to incremental updates on reverse-unreachable regions and ends up recalculating the whole tree in many cases. It should be possible to improve the performance in many cases, if we decide that it's important enough. That being said, my experiments showed that reverse-unreachable are very rare in the IR emitted by clang when bootstrapping clang. Here are the statistics I collected by analyzing IR between passes and after each removePredecessor call: ``` # functions: 52283 # samples: 337609 # reverse unreachable BBs: 216022 # BBs: 247840796 Percent reverse-unreachable: 0.08716159869015269 % Max(PercRevUnreachable) in a function: 87.58620689655172 % # > 25 % samples: 471 ( 0.1395104988314885 % samples ) ... in 145 ( 0.27733680163724345 % functions ) ``` Most of the reverse-unreachable regions come from invalid IR where it wouldn't be possible to construct a PostDomTree anyway. I would like to commit this patch in the next week in order to be able to complete the work that depends on it before the end of my internship, so please don't wait long to voice your concerns :). Reviewers: dberlin, sanjoy, grosser, brzycki, davide, chandlerc, hfinkel Reviewed By: dberlin Subscribers: nhaehnle, javed.absar, kparzysz, uabelho, jlebar, hiraditya, llvm-commits, dberlin, david2050 Differential Revision: https://reviews.llvm.org/D35851 llvm-svn: 310940	2017-08-15 18:14:57 +00:00
Hal Finkel	b03dd4be70	[ValueTracking] Don't delete assumes of side-effectful instructions ValueTracking has to strike a balance when attempting to propagate information backwards from assumes, because if the information is trivially propagated backwards, it can appear to LLVM that the assumption is known to be true, and therefore can be removed. This is sound (because an assumption has no semantic effect except for causing UB), but prevents the assume from allowing further optimizations. The isEphemeralValueOf check exists to try and prevent this issue by not removing the source of an assumption. This tries to make it a little bit more general to handle the case of side-effectful instructions, such as in %0 = call i1 @get_val() %1 = xor i1 %0, true call void @llvm.assume(i1 %1) Patch by Ariel Ben-Yehuda, thanks! Differential Revision: https://reviews.llvm.org/D36590 llvm-svn: 310859	2017-08-14 17:11:43 +00:00
Chandler Carruth	37c7b08710	[ValueTracking] Revert r310583 which enabled functionality that still is causing compile time issues. Moreover, the patch deleted the flag in addition to changing the default, and links to a code review that doesn't even discuss the flag and just has an update to a Clang test case. I've followed up on the commit thread to ask for numbers on compile time at this point, leaving the flag in place until things stabilize, and pointing at specific code that seems to exhibit excessive compile time with this patch. Original commit message for r310583: """ [ValueTracking] Enabling ValueTracking patch by default (recommit). Part 2. The original patch was an improvement to IR ValueTracking on non-negative integers. It has been checked in to trunk (D18777, r284022). But was disabled by default due to performance regressions. Perf impact has improved. The patch would be enabled by default. """" llvm-svn: 310816	2017-08-14 07:03:24 +00:00
Simon Pilgrim	b59c2d9d73	[CostModel][X86] Add SSE2 two-src shuffle costs llvm-svn: 310654	2017-08-10 19:32:35 +00:00
Simon Pilgrim	7354531b82	[CostModel][X86] Add avx1 two-src shuffle costs llvm-svn: 310650	2017-08-10 19:02:51 +00:00
Simon Pilgrim	ac2e50a4ca	[CostModel][X86] Add avx2 two-src shuffle costs llvm-svn: 310645	2017-08-10 18:29:34 +00:00
Simon Pilgrim	2f529412e1	[CostModel][X86] Extend two src shuffle cost tests Cover most 128/256/512/1024-bit cases for vXf64/vXi64, vXf32/vXi32, vXi16 + vXi8 llvm-svn: 310641	2017-08-10 18:02:45 +00:00
Simon Pilgrim	fe67612eba	[CostModel][X86] Add avx512vbmi broadcast/reverse/single-src shuffle cost tests llvm-svn: 310633	2017-08-10 17:33:25 +00:00
Simon Pilgrim	702e5fa391	[CostModel][X86] Improve single src shuffle costs Add missing SK_PermuteSingleSrc costs for AVX2 targets and earlier, also added some of the simpler SK_PermuteTwoSrc costs to support splitting of SK_PermuteSingleSrc shuffles llvm-svn: 310632	2017-08-10 17:27:20 +00:00
Simon Pilgrim	419215abb7	[CostModel][X86] Added v2f64/v2i64 single src shuffle model tests Fixed label checks for all prefixes llvm-svn: 310606	2017-08-10 15:25:08 +00:00
Nikolai Bozhenov	d97136c182	[ValueTracking] Enabling ValueTracking patch by default (recommit). Part 2. The original patch was an improvement to IR ValueTracking on non-negative integers. It has been checked in to trunk (D18777, r284022). But was disabled by default due to performance regressions. Perf impact has improved. The patch would be enabled by default. Reviewers: reames, hfinkel Differential Revision: https://reviews.llvm.org/D34101 Patch by: Olga Chupina <olga.chupina@intel.com> llvm-svn: 310583	2017-08-10 11:24:57 +00:00
Amara Emerson	56dca4e3ca	[SCEV] Preserve NSW information for sext(subtract). Pushes the sext onto the operands of a Sub if NSW is present. Also adds support for propagating the nowrap flags of the llvm.ssub.with.overflow intrinsic during analysis. Differential Revision: https://reviews.llvm.org/D35256 llvm-svn: 310117	2017-08-04 20:19:46 +00:00
Max Kazantsev	2cb3653404	[SCEV] Re-enable "Cache results of computeExitLimit" The patch rL309080 was reverted because it did not clean up the cache on "forgetValue" method call. This patch re-enables this change, adds the missing check and introduces two new unit tests that make sure that the cache is cleaned properly. Differential Revision: https://reviews.llvm.org/D36087 llvm-svn: 309925	2017-08-03 08:41:30 +00:00
Tobias Grosser	670a5d88a3	[tests] Do not emity binary bitcode to stdout in RegionInfo tests llvm-svn: 309485	2017-07-29 09:58:43 +00:00
Sanjoy Das	843ab57457	Revert "[SCEV] Cache results of computeExitLimit" This reverts commit r309080. The patch needs to clear out the ScalarEvolution::ExitLimits cache in forgetMemoizedResults. I've replied on the commit thread for the patch with more details. llvm-svn: 309357	2017-07-28 03:25:07 +00:00
Davide Italiano	01cb947abb	[JumpThreading] Add an option to dump LazyValueInfo after the run. Differential Revision: https://reviews.llvm.org/D35973 llvm-svn: 309353	2017-07-28 02:57:43 +00:00
Max Kazantsev	f282aed428	[SCEV] Cache results of computeExitLimit This patch adds a cache for computeExitLimit to save compilation time. A lot of examples of tests that take extensive time to compile are attached to the bug 33494. Differential Revision: https://reviews.llvm.org/D35827 llvm-svn: 309080	2017-07-26 04:55:54 +00:00
Max Kazantsev	0e9e0796f4	[SCEV] Limit max size of AddRecExpr during evolving When SCEV calculates product of two SCEVAddRecs from the same loop, it tries to combine them into one big AddRecExpr. If the sizes of the initial SCEVs were `S1` and `S2`, the size of their product is `S1 + S2 - 1`, and every operand of the resulting SCEV is combined from operands of initial SCEV and has much higher complexity than they have. As result, if we try to calculate something like: %x1 = {a,+,b} %x2 = mul i32 %x1, %x1 %x3 = mul i32 %x2, %x1 %x4 = mul i32 %x3, %x2 ... The size of such SCEVs grows as `2^N`, and the arguments become more and more complex as we go forth. This leads to long compilation and huge memory consumption. This patch sets a limit after which we don't try to combine two `SCEVAddRecExpr`s into one. By default, max allowed size of the resulting AddRecExpr is set to 16. Differential Revision: https://reviews.llvm.org/D35664 llvm-svn: 308847	2017-07-23 15:40:19 +00:00
Ulrich Weigand	33435c4c9c	[SystemZ] Add support for IBM z14 processor (2/3) This adds support for the new 32-bit vector float instructions of z14. This includes: - Enabling the instructions for the assembler/disassembler. - CodeGen for the instructions, including new LLVM intrinsics. - Scheduler description support for the instructions. - Update to the vector cost function calculations. In general, CodeGen support for the new v4f32 instructions closely matches support for the existing v2f64 instructions. llvm-svn: 308195	2017-07-17 17:42:48 +00:00
Kamil Rytarowski	cce21c1dfe	Make shell redirection construct portable Summary: NetBSD shell sh(1) does not support ">& /dev/null" construct. This is bashism. The portable and POSIX solution is to use: "> /dev/null 2>&1". This change fixes 22 Unexpected Failures on NetBSD/amd64 for the "check-llvm" target. Sponsored by <The NetBSD Foundation> Reviewers: joerg, dim, rnk Reviewed By: joerg, rnk Subscribers: rnk, davide, llvm-commits Differential Revision: https://reviews.llvm.org/D35277 llvm-svn: 307789	2017-07-12 13:24:46 +00:00
Max Kazantsev	b9edcbcb1d	Re-enable "[IndVars] Canonicalize comparisons between non-negative values and indvars" The patch was reverted due to a bug. The bug was that if the IV is the 2nd operand of the icmp instruction, then the "Pred" variable gets swapped and differs from the instruction's predicate. In this patch we use the original predicate to do the transformation. Also added a test case that exercises this situation. Differentian Revision: https://reviews.llvm.org/D35107 llvm-svn: 307477	2017-07-08 17:17:30 +00:00
Max Kazantsev	98838527c6	Revert "Revert "Revert "[IndVars] Canonicalize comparisons between non-negative values and indvars""" It appears that the problem is still there. Needs more analysis to understand why SaturatedMultiply test fails. llvm-svn: 307249	2017-07-06 10:47:13 +00:00
Max Kazantsev	c8db20b78c	Revert "Revert "[IndVars] Canonicalize comparisons between non-negative values and indvars"" It seems that the patch was reverted by mistake. Clang testing showed failure of the MathExtras.SaturatingMultiply test, however I was unable to reproduce the issue on the fresh code base and was able to confirm that the transformation introduced by the change does not happen in the said test. This gives a strong confidence that the actual reason of the failure of the initial patch was somewhere else, and that problem now seems to be fixed. Re-submitting the change to confirm that. llvm-svn: 307244	2017-07-06 09:57:41 +00:00
Brendon Cahoon	cb8c7b912d	[DependenceAnalysis] Make sure base objects are the same when comparing GEPs The dependence analysis was returning incorrect information when using the GEPs to compute dependences. The analysis uses the GEP indices under certain conditions, but was doing it incorrectly when the base objects of the GEP are aliases, but pointing to different locations in the same array. This patch adds another check for the base objects. If the base pointer SCEVs are not equal, then the dependence analysis should fall back on the path that uses the whole SCEV for the dependence check. This fixes PR33567. Differential Revision: https://reviews.llvm.org/D34702 llvm-svn: 307203	2017-07-05 21:35:47 +00:00
Max Kazantsev	ebe56283bc	Revert "[IndVars] Canonicalize comparisons between non-negative values and indvars" This patch seems to cause failures of test MathExtras.SaturatingMultiply on multiple buildbots. Reverting until the reason of that is clarified. Differential Revision: https://reviews.llvm.org/rL307126 llvm-svn: 307135	2017-07-05 09:44:41 +00:00
Max Kazantsev	80bc4a5554	[IndVars] Canonicalize comparisons between non-negative values and indvars -If there is a IndVar which is known to be non-negative, and there is a value which is also non-negative, then signed and unsigned comparisons between them produce the same result. Both of those can be seen in the same loop. To allow other optimizations to simplify them, we turn all instructions like %c = icmp slt i32 %iv, %b to %c = icmp ult i32 %iv, %b if both %iv and %b are known to be non-negative. Differential Revision: https://reviews.llvm.org/D34979 llvm-svn: 307126	2017-07-05 06:38:49 +00:00
Mohammed Agabaria	eb09a810e6	[X86][CM] update add\sub costs of vectors of 64 in X86\SLM arch this patch updates the cost of addq\subq (add\subtract of vectors of 64bits) based on the performance numbers of SLM arch. Differential Revision: https://reviews.llvm.org/D33983 llvm-svn: 306974	2017-07-02 12:16:15 +00:00
Max Kazantsev	8d0322e612	[SCEV] Use depth limit instead of local cache for SExt and ZExt In rL300494 there was an attempt to deal with excessive compile time on invocations of getSign/ZeroExtExpr using local caching. This approach only helps if we request the same SCEV multiple times throughout recursion. But in the bug PR33431 we see a case where we request different values all the time, so caching does not help and the size of the cache grows enormously. In this patch we remove the local cache for this methods and add the recursion depth limit instead, as we do for arithmetics. This gives us a guarantee that the invocation sequence is limited and reasonably short. Differential Revision: https://reviews.llvm.org/D34273 llvm-svn: 306785	2017-06-30 05:04:09 +00:00
Jakub Kuderski	837755cf8b	[Dominators] Don't compute DFS InOut numbers eagerly. Summary: DFS InOut numbers currently get eagerly computer upon DomTree construction. They are only needed to answer dome dominance queries and they get invalidated by updates and recalculations. Because of that, it is faster in practice to compute them lazily when they are actually needed. Clang built without this patch takes 6m 45s to boostrap on my machine, and with the patch applied 6m 38s. Reviewers: sanjoy, dberlin, chandlerc Reviewed By: dberlin Subscribers: davide, llvm-commits Differential Revision: https://reviews.llvm.org/D34296 llvm-svn: 306778	2017-06-30 01:28:21 +00:00
Alexandre Isoard	41044876fc	Reverting r306695 while investigating failing test case. Failing test case: Transforms/LoopVectorize.iv_outside_user.ll llvm-svn: 306723	2017-06-29 18:48:56 +00:00
Alexandre Isoard	aa29afc756	ScalarEvolution: Add URem support In LLVM IR the following code: %r = urem <ty> %t, %b is equivalent to: %q = udiv <ty> %t, %b %s = mul <ty> nuw %q, %b %r = sub <ty> nuw %t, %q ; (t / b) * b + (t % b) = t As UDiv, Mul and Sub are already supported by SCEV, URem can be implemented with minimal effort this way. Note: While SRem and SDiv are also related this way, SCEV does not provides SDiv yet. llvm-svn: 306695	2017-06-29 16:29:04 +00:00
Dorit Nuzman	e0e0f1ddb0	[AVX2] [TTI CostModel] Add cost of interleaved loads/stores for AVX2 The cost of an interleaved access was only implemented for AVX512. For other X86 targets an overly conservative Base cost was returned, resulting in avoiding vectorization where it is actually profitable to vectorize. This patch starts to add costs for AVX2 for most prominent cases of interleaved accesses (stride 3,4 chars, for now). Note1: Improvements of up to ~4x were observed in some of EEMBC's rgb workloads; There is also a known issue of 15-30% degradations on some of these workloads, associated with an interleaved access followed by type promotion/widening; the resulting shuffle sequence is currently inefficient and will be improved by a series of patches that extend the X86InterleavedAccess pass (such as D34601 and more to follow). Note 2: The costs in this patch do not reflect port pressure penalties which can be very dominant in the case of interleaved accesses since most of the shuffle operations are restricted to a single port. Further tuning, that may incorporate these considerations, will be done on top of the upcoming improved shuffle sequences (that is, along with the abovementioned work to extend X86InterleavedAccess pass). Differential Revision: https://reviews.llvm.org/D34023 llvm-svn: 306238	2017-06-25 08:26:25 +00:00
Michael Kruse	47f856095a	[BasicAA] Use MayAlias instead of PartialAlias for fallback. Using various methods, BasicAA tries to determine whether two GetElementPtr memory locations alias when its base pointers are known to be equal. When none of its heuristics are applicable, it falls back to PartialAlias to, according to a comment, protect TBAA making a wrong decision in case of unions and malloc. PartialAlias is not correct, because a PartialAlias result implies that some, but not all, bytes overlap which is not necessarily the case here. AAResults returns the first analysis result that is not MayAlias. BasicAA is always the first alias analysis. When it returns PartialAlias, no other analysis is queried to give a more exact result (which was the intention of returning PartialAlias instead of MayAlias). For instance, ScopedAA could return a more accurate result. The PartialAlias hack was introduced in r131781 (and re-applied in r132632 after some reverts) to fix llvm.org/PR9971 where TBAA returns a wrong NoAlias result due to a union. A test case for the malloc case mentioned in the comment was not provided and I don't think it is affected since it returns an omnipotent char anyway. Since r303851 (https://reviews.llvm.org/D33328) clang does emit specific TBAA for unions anymore (but "omnipotent char" instead). Hence, the PartialAlias workaround is not required anymore. This patch passes the test-suite and check-llvm/check-clang of a self-hoisted build on x64. Reviewed By: hfinkel Differential Revision: https://reviews.llvm.org/D34318 llvm-svn: 305938	2017-06-21 18:25:37 +00:00
Simon Pilgrim	68204b83a7	[CostModel][X86] Add scalar arithmetic cost tests llvm-svn: 305810	2017-06-20 17:10:27 +00:00
Simon Pilgrim	36c17935e4	[CostModel][X86] Declare costs variables based on type The alphabetical progression isn't that useful llvm-svn: 305808	2017-06-20 17:04:46 +00:00
Anna Thomas	7949f4529a	[JumpThreading][LVI] Invalidate LVI information after blocks are merged Summary: After a single predecessor is merged into a basic block, we need to invalidate the LVI information for the new merged block, when LVI is not provably true for all of instructions in the new block. The test cases added show the correct LVI information using the LVI printer pass. Reviewers: reames, dberlin, davide, sanjoy Reviewed by: dberlin, davide Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34108 llvm-svn: 305699	2017-06-19 15:23:33 +00:00
Max Kazantsev	dc80366d52	[ScalarEvolution] Apply Depth limit to getMulExpr This is a fix for PR33292 that shows a case of extremely long compilation of a single .c file with clang, with most time spent within SCEV. We have a mechanism of limiting recursion depth for getAddExpr to avoid long analysis in SCEV. However, there are calls from getAddExpr to getMulExpr and back that do not propagate the info about depth. As result of this, a chain getAddExpr -> ... .> getAddExpr -> getMulExpr -> getAddExpr -> ... -> getAddExpr can be extremely long, with every segment of getAddExpr's being up to max depth long. This leads either to long compilation or crash by stack overflow. We face this situation while analyzing big SCEVs in the test of PR33292. This patch applies the same limit on max expression depth for getAddExpr and getMulExpr. Differential Revision: https://reviews.llvm.org/D33984 llvm-svn: 305463	2017-06-15 11:48:21 +00:00
John Brawn	da4a68a1d2	[BPI] Don't assume that strcmp returning >0 is more likely than <0 The zero heuristic assumes that integers are more likely positive than negative, but this also has the effect of assuming that strcmp return values are more likely positive than negative. Given that for nonzero strcmp return values it's the ordering of arguments that determines the sign of the result there's no reason to assume that's true. Fix this by inspecting the LHS of the compare and using TargetLibraryInfo to decide if it's strcmp-like, and if so only assume that nonzero is more likely than zero i.e. strings are more often different than the same. This causes a slight code generation change in the spec2006 benchmark 403.gcc, but with no noticeable performance impact. The intent of this patch is to allow better optimisation of dhrystone on Cortex-M cpus, but currently it won't as there are also some changes that need to be made to if-conversion. Differential Revision: https://reviews.llvm.org/D33934 llvm-svn: 304970	2017-06-08 09:44:40 +00:00
Anna Thomas	4acfc7e16e	[LVI Printer] Rely on the LVI analysis functions rather than the LVI cache Summary: LVIPrinter pass was previously relying on the LVICache. We now directly call the the LVI functions which solves the value if the LVI information is not already available in the cache. This has 2 benefits over the printing of LVI cache: 1. higher coverage (i.e. catches errors) in LVI code when cache value is invalidated. 2. relies on the core functions, and not dependent on the LVI cache (which may be scrapped at some point). It would still catch any cache invalidation errors, since we first go through the cache. Reviewers: reames, dberlin, sanjoy Reviewed by: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32135 llvm-svn: 304819	2017-06-06 19:25:31 +00:00
Joey Gouly	61eaa63b65	[InstSimplify] Constant fold the new GEP in SimplifyGEPInst. llvm-svn: 304784	2017-06-06 10:17:14 +00:00
George Burgess IV	0a7b989036	[CFLAA] Add missing break; note things are broken. Thanks to Galina Kistanova for finding the missing break! When trying to make a test for this, I realized our logic for handling extractvalue/insertvalue/... is somewhat broken. This makes constructing a test-case for this missing break nontrivial. llvm-svn: 304275	2017-05-31 02:35:26 +00:00

1 2 3 4 5 ...

1303 Commits