llvm-project

Commit Graph

Author	SHA1	Message	Date
Matthew Simpson	b6915fbfa2	[InstCombine] Simplify selects that test cmpxchg instructions If a select instruction tests the returned flag of a cmpxchg instruction and selects between the returned value of the cmpxchg instruction and its compare operand, the result of the select will always be equal to its false value. Differential Revision: https://reviews.llvm.org/D39383 llvm-svn: 316994	2017-10-31 12:34:02 +00:00
David Green	64f53b4214	[LoopUnroll] Clean up remarks for unroll remainder The optimisation remarks for loop unrolling with an unrolled remainder looks something like: test.c:7:18: remark: completely unrolled loop with 3 iterations [-Rpass=loop-unroll] C[i] += A[i*N+j]; ^ test.c:6:9: remark: unrolled loop by a factor of 4 with run-time trip count [-Rpass=loop-unroll] for(int j = 0; j < N; j++) ^ This removes the first of the two messages. Differential revision: https://reviews.llvm.org/D38725 llvm-svn: 316986	2017-10-31 10:47:46 +00:00
Max Kazantsev	84286ce5dd	[IRCE][NFC] Rename fields of InductiveRangeCheck Rename `Offset`, `Scale`, `Length` into `Begin`, `Step`, `End` respectively to make naming of similar entities for Ranges and Range Checks more consistent. Differential Revision: https://reviews.llvm.org/D39414 llvm-svn: 316979	2017-10-31 06:19:05 +00:00
Max Kazantsev	21e7b53490	[NFC] Get rid of variables used in assert only llvm-svn: 316977	2017-10-31 05:33:58 +00:00
Philip Reames	59bf1e0548	[IndVarSimplify] Simplify code using preheader assumption As noted in the nice block comment, the previous code didn't actually handle multi-entry loops correctly, it just assumed SCEV didn't analyze such loops. Given SCEV has comments to the contrary, that seems a bit suspect. More importantly, the pass actually requires loopsimplify form which ensures a loop-preheader is available. Remove the excessive generaility and shorten the code greatly. Note that we do successfully analyze many multi-entry loops, but we do so by converting them to single entry loops. See the added test case. llvm-svn: 316976	2017-10-31 05:16:46 +00:00
Max Kazantsev	488ec975bb	Reapply "[GVN] Prevent LoadPRE from hoisting across instructions that don't pass control flow to successors" This patch fixes the miscompile that happens when PRE hoists loads across guards and other instructions that don't always pass control flow to their successors. PRE is now prohibited to hoist across such instructions because there is no guarantee that the load standing after such instruction is still valid before such instruction. For example, a load from under a guard may be invalid before the guard in the following case: int array[LEN]; ... guard(0 <= index && index < LEN); use(array[index]); Differential Revision: https://reviews.llvm.org/D37460 llvm-svn: 316975	2017-10-31 05:07:56 +00:00
Philip Reames	39a8dbff87	[SimplifyIndVar] Extract out invariant expression handling Previously, the code returned early from the function when it couldn't find a free expansion, it should be returning from the transform. I don't have a test case, noticed this via inspection. As a follow up, I'm going to revisit the logic in the extract function. I think that essentially the whole helper routine can be replaced with SCEVExpander, but I wanted to do that in a series of separate commits. llvm-svn: 316974	2017-10-31 04:19:06 +00:00
Philip Reames	5552f503d5	Undo accidental commit These files shouldn't have been submitted in 316967 llvm-svn: 316968	2017-10-31 00:04:09 +00:00
Philip Reames	9c3cbeea39	[CGP] Fix crash on i96 bit multiply Issue found by llvm-isel-fuzzer on OSS fuzz, https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=3725 If anyone actually cares about > 64 bit arithmetic, there's a lot more to do in this area. There's a bunch of obviously wrong code in the same function. I don't have the time to fix all of them and am just using this to understand what the workflow for fixing fuzzer cases might look like. llvm-svn: 316967	2017-10-30 23:59:51 +00:00
Yaxun Liu	d23f23d81c	InferAddressSpaces: Fix bug about replacing addrspacecast InferAddressSpaces assumes the pointee type of addrspacecast is the same as the operand, which is not always true and causes invalid IR. This bug cause build failure in HCC. This patch fixes that. Differential Revision: https://reviews.llvm.org/D39432 llvm-svn: 316957	2017-10-30 21:19:41 +00:00
Davide Italiano	834b45129b	[NewGVN] Stop assuming PHI args ordering when looking at phi-of-ops. It's not guaranteed. There's a bug open to sort them in predecessor order, but it won't happen anytime soon. In the meanwhile, passes will have to do an O(#preds) scan. Such is life. llvm-svn: 316953	2017-10-30 20:20:16 +00:00
Daniel Neilson	f9c7d29c77	Create instruction classes for identifying any atomicity of memory intrinsic. (NFC) Summary: For reference, see: http://lists.llvm.org/pipermail/llvm-dev/2017-August/116589.html This patch fleshes out the instruction class hierarchy with respect to atomic and non-atomic memory intrinsics. With this change, the relevant part of the class hierarchy becomes: IntrinsicInst -> MemIntrinsicBase (methods-only class) -> MemIntrinsic (non-atomic intrinsics) -> MemSetInst -> MemTransferInst -> MemCpyInst -> MemMoveInst -> AtomicMemIntrinsic (atomic intrinsics) -> AtomicMemSetInst -> AtomicMemTransferInst -> AtomicMemCpyInst -> AtomicMemMoveInst -> AnyMemIntrinsic (both atomicities) -> AnyMemSetInst -> AnyMemTransferInst -> AnyMemCpyInst -> AnyMemMoveInst This involves some class renaming: ElementUnorderedAtomicMemCpyInst -> AtomicMemCpyInst ElementUnorderedAtomicMemMoveInst -> AtomicMemMoveInst ElementUnorderedAtomicMemSetInst -> AtomicMemSetInst A script for doing this renaming in downstream trees is included below. An example of where the Any* classes should be used in LLVM is when reasoning about the effects of an instruction (ex: aliasing). --- Script for renaming AtomicMem* classes: PREFIXES="[<,([:space:]]" CLASSES="MemIntrinsic\|MemTransferInst\|MemSetInst\|MemMoveInst\|MemCpyInst" SUFFIXES="[;)>,[:space:]]" REGEX="(${PREFIXES})ElementUnorderedAtomic(${CLASSES})(${SUFFIXES})" REGEX2="visitElementUnorderedAtomic(${CLASSES})" FILES=$( grep -E "(${REGEX}\|${REGEX2})" -r . \| tr ':' ' ' \| awk '{print $1}' \| sort \| uniq ) SED_SCRIPT="s~${REGEX}~\1Atomic\2\3~g" SED_SCRIPT2="s~${REGEX2}~visitAtomic\1~g" for f in $FILES; do echo "Processing: $f" sed -i ".bak" -E "${SED_SCRIPT};${SED_SCRIPT2};${EA_SED_SCRIPT};${EA_SED_SCRIPT2}" $f done Reviewers: sanjoy, deadalnix, apilipenko, anna, skatkov, mkazantsev Reviewed By: sanjoy Subscribers: hfinkel, jholewinski, arsenm, sdardis, nhaehnle, JDevlieghere, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D38419 llvm-svn: 316950	2017-10-30 19:51:48 +00:00
Mandeep Singh Grang	f83268bd9e	[GVNHoist] Fix non-deterministic sort order of PHIs for identical instructions Summary: This fixes failure in Transforms/GVNHoist/hoist.ll uncovered by D39245. Reviewers: hiraditya, spop, dberlin Reviewed By: dberlin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39410 llvm-svn: 316949	2017-10-30 19:42:41 +00:00
Clement Courbet	b2c3eb8cf1	[CodeGen][ExpandMemcmp] Allow memcmp to expand to vector loads (2). - Targets that want to support memcmp expansions now return the list of supported load sizes. - Expansion codegen does not assume that all power-of-two load sizes smaller than the max load size are valid. For examples, this is not the case for x86(32bit)+sse2. Fixes PR34887. llvm-svn: 316905	2017-10-30 14:19:33 +00:00
Florian Hahn	d0208b4b1c	Recommit r315288: [SCCP] Propagate integer range info for parameters in IPSCCP. This version of the patch includes a fix addressing a stage2 LTO buildbot failure and addressed some additional nits. Original commit message: This updates the SCCP solver to use of the ValueElement lattice for parameters, which provides integer range information. The range information is used to remove unneeded icmp instructions. For the following function, f() can be optimized to ret i32 2 with this change source_filename = "sccp.c" target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" ; Function Attrs: norecurse nounwind readnone uwtable define i32 @main() local_unnamed_addr #0 { entry: %call = tail call fastcc i32 @f(i32 1) %call1 = tail call fastcc i32 @f(i32 47) %add3 = add nsw i32 %call, %call1 ret i32 %add3 } ; Function Attrs: noinline norecurse nounwind readnone uwtable define internal fastcc i32 @f(i32 %x) unnamed_addr #1 { entry: %c1 = icmp sle i32 %x, 100 %cmp = icmp sgt i32 %x, 300 %. = select i1 %cmp, i32 1, i32 2 ret i32 %. } attributes #1 = { noinline } Reviewers: davide, sanjoy, efriedma, dberlin Reviewed By: davide, dberlin Subscribers: mcrosier, gberry, mssimpso, dberlin, llvm-commits Differential Revision: https://reviews.llvm.org/D36656 llvm-svn: 316891	2017-10-30 10:07:42 +00:00
Max Kazantsev	390fc57771	[IRCE][NFC] Store Length as SCEV in RangeCheck instead of Value llvm-svn: 316889	2017-10-30 09:35:16 +00:00
Florian Hahn	d18443edad	Revert r316887 to fix buildbot failures. llvm-svn: 316888	2017-10-30 09:21:50 +00:00
Florian Hahn	925d3e4a98	Recommit r315288: [SCCP] Propagate integer range info for parameters in IPSCCP. This version of the patch includes a fix addressing a stage2 LTO buildbot failure and addressed some additional nits. Original commit message: This updates the SCCP solver to use of the ValueElement lattice for parameters, which provides integer range information. The range information is used to remove unneeded icmp instructions. For the following function, f() can be optimized to ret i32 2 with this change source_filename = "sccp.c" target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" ; Function Attrs: norecurse nounwind readnone uwtable define i32 @main() local_unnamed_addr #0 { entry: %call = tail call fastcc i32 @f(i32 1) %call1 = tail call fastcc i32 @f(i32 47) %add3 = add nsw i32 %call, %call1 ret i32 %add3 } ; Function Attrs: noinline norecurse nounwind readnone uwtable define internal fastcc i32 @f(i32 %x) unnamed_addr #1 { entry: %c1 = icmp sle i32 %x, 100 %cmp = icmp sgt i32 %x, 300 %. = select i1 %cmp, i32 1, i32 2 ret i32 %. } attributes #1 = { noinline } Reviewers: davide, sanjoy, efriedma, dberlin Reviewed By: davide, dberlin Subscribers: mcrosier, gberry, mssimpso, dberlin, llvm-commits Differential Revision: https://reviews.llvm.org/D36656 llvm-svn: 316887	2017-10-30 09:04:18 +00:00
Max Kazantsev	1d7c0439b9	[GVN][NFC] Mark instruction for deletion instead of immediate erasing in LoadPRE It is done to uniformly handle instructions removal. Differential Revision: https://reviews.llvm.org/D39369 llvm-svn: 316884	2017-10-30 04:48:34 +00:00
Sanjay Patel	b049173157	[SimplifyCFG] use pass options and remove the latesimplifycfg pass This is no-functional-change-intended. This is repackaging the functionality of D30333 (defer switch-to-lookup-tables) and D35411 (defer folding unconditional branches) with pass parameters rather than a named "latesimplifycfg" pass. Now that we have individual options to control the functionality, we could decouple when these fire (but that's an independent patch if desired). The next planned step would be to add another option bit to disable the sinking transform mentioned in D38566. This should also make it clear that the new pass manager needs to be updated to limit simplifycfg in the same way as the old pass manager. Differential Revision: https://reviews.llvm.org/D38631 llvm-svn: 316835	2017-10-28 18:43:07 +00:00
Craig Topper	49687104d6	[PartialInlineLibCalls] Teach PartialInlineLibCalls to honor nobuiltin, properly check the function signature, and check TLI::has Summary: We shouldn't do this transformation if the function is marked nobuitlin. We were only checking that the return type is floating point, we really should be checking the argument types and argument count as well. This can be accomplished by using the other version of getLibFunc that takes the Function and not just the name. We should also be checking TLI::has since sqrtf is a macro on Windows. Fixes PR32559. Reviewers: hfinkel, spatel, davide, efriedma Reviewed By: davide, efriedma Subscribers: efriedma, llvm-commits, eraman Differential Revision: https://reviews.llvm.org/D39381 llvm-svn: 316819	2017-10-28 00:36:58 +00:00
Artur Pilipenko	8aadc643cf	[LoopPredication] Handle the case when the guard and the latch IV have different offsets This is a follow up change for D37569. Currently the transformation is limited to the case when: * The loop has a single latch with the condition of the form: ++i <pred> latchLimit, where <pred> is u<, u<=, s<, or s<=. * The step of the IV used in the latch condition is 1. * The IV of the latch condition is the same as the post increment IV of the guard condition. * The guard condition is of the form i u< guardLimit. This patch enables the transform in the case when the latch is latchStart + i <pred> latchLimit, where <pred> is u<, u<=, s<, or s<=. And the guard is guardStart + i u< guardLimit Reviewed By: anna Differential Revision: https://reviews.llvm.org/D39097 llvm-svn: 316768	2017-10-27 14:46:17 +00:00
Max Kazantsev	665907c3c2	[GVN][NFC] Refactor loop iteration with foreach llvm-svn: 316748	2017-10-27 08:19:35 +00:00
Eugene Zelenko	57bd5a0274	[Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 316724	2017-10-27 01:09:08 +00:00
Philip Reames	29dd40b38e	[SimplifyIndVars] Shorten code by using SCEV helper [NFC] llvm-svn: 316709	2017-10-26 22:02:16 +00:00
Dehao Chen	ed2d5402cb	Do not add discriminator encoding for debug intrinsics. Summary: There are certain requirements for debug location of debug intrinsics, e.g. the scope of the DILocalVariable should be the same as the scope of its debug location. As a result, we should not add discriminator encoding for debug intrinsics. Reviewers: dblaikie, aprantl Reviewed By: aprantl Subscribers: JDevlieghere, aprantl, bjope, sanjoy, llvm-commits Differential Revision: https://reviews.llvm.org/D39343 llvm-svn: 316703	2017-10-26 21:20:52 +00:00
Philip Reames	21cc2fa3f6	[LICM] Restructure implicit exit handling to be more clear [NFCI] When going to explain this to someone else, I got tripped up by the complicated meaning of IsKnownNonEscapingObject in load-store promotion. Extract a helper routine and clarify naming/scopes to make this a bit more obvious. llvm-svn: 316699	2017-10-26 21:00:15 +00:00
Balaram Makam	9ee942f481	Reapply r316582 [Local] Fix a bug in the domtree update logic for MergeBasicBlockIntoOnlyPred. Summary: This reverts r316612 to reapply r316582. The buildbot failure was unrelated to this commit. Reviewers: Subscribers: llvm-svn: 316669	2017-10-26 15:04:53 +00:00
Bjorn Pettersson	86db068e39	[LSV] Avoid adding vectors of pointers as candidates Summary: We no longer add vectors of pointers as candidates for load/store vectorization. It does not seem to work anyway, but without this patch we can end up in asserts when trying to create casts between an integer type and the pointer of vectors type. The test case I've added used to assert like this when trying to cast between i64 and <2 x i16>: opt: ../lib/IR/Instructions.cpp:2565: Assertion `castIsValid(op, S, Ty) && "Invalid cast!"' failed. #0 PrintStackTraceSignalHandler(void) #1 SignalHandler(int) #2 __restore_rt #3 __GI_raise #4 __GI_abort #5 __GI___assert_fail #6 llvm::CastInst::Create(llvm::Instruction::CastOps, llvm::Value, llvm::Type, llvm::Twine const&, llvm::Instruction) #7 llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>::CreateBitOrPointerCast(llvm::Value, llvm::Type, llvm::Twine const&) #8 Vectorizer::vectorizeStoreChain(llvm::ArrayRef<llvm::Instruction>, llvm::SmallPtrSet<llvm::Instruction, 16u>) Reviewers: arsenm Reviewed By: arsenm Subscribers: nhaehnle, llvm-commits Differential Revision: https://reviews.llvm.org/D39296 llvm-svn: 316665	2017-10-26 13:59:15 +00:00
Bjorn Pettersson	22a2282da1	[LSV] Skip all non-byte sizes, not only less than eight bits Summary: The code comments indicate that no effort has been spent on handling load/stores when the size isn't a multiple of the byte size correctly. However, the code only avoided types smaller than 8 bits. So for example a load of an i28 could still be considered as a candidate for vectorization. This patch adjusts the code to behave according to the code comment. The test case used to hit the following assert when trying to use "cast" an i32 to i28 using CreateBitOrPointerCast: opt: ../lib/IR/Instructions.cpp:2565: Assertion `castIsValid(op, S, Ty) && "Invalid cast!"' failed. #0 PrintStackTraceSignalHandler(void) #1 SignalHandler(int) #2 __restore_rt #3 __GI_raise #4 __GI_abort #5 __GI___assert_fail #6 llvm::CastInst::Create(llvm::Instruction::CastOps, llvm::Value, llvm::Type, llvm::Twine const&, llvm::Instruction) #7 llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>::CreateBitOrPointerCast(llvm::Value, llvm::Type, llvm::Twine const&) #8 (anonymous namespace)::Vectorizer::vectorizeLoadChain(llvm::ArrayRef<llvm::Instruction>, llvm::SmallPtrSet<llvm::Instruction, 16u>*) Reviewers: arsenm Reviewed By: arsenm Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39295 llvm-svn: 316663	2017-10-26 13:42:55 +00:00
Eugene Zelenko	5c2aecef78	[Transforms] Revert r316630 changes in Scalar/MergeICmps.cpp to fix broken build bots (NFC). llvm-svn: 316634	2017-10-26 01:25:14 +00:00
Eugene Zelenko	5adb96cc92	[Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 316630	2017-10-26 00:55:39 +00:00
Matthew Simpson	99f57933ba	Attempt to unbreak the expensive-checks-win bot llvm-svn: 316625	2017-10-25 22:46:34 +00:00
Balaram Makam	52252fe20d	Revert r316582 [Local] Fix a bug in the domtree update logic for MergeBasicBlockIntoOnlyPred. Summary: This reverts commit r316582. It looks like this commit broke tests on one buildbot: http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/5719 . . . Failing Tests (1): LLVM :: Transforms/CalledValuePropagation/simple-arguments.ll Reviewers: Subscribers: llvm-svn: 316612	2017-10-25 21:32:54 +00:00
Balaram Makam	925ddf1a93	[Local] Fix a bug in the domtree update logic for MergeBasicBlockIntoOnlyPred. Summary: For some irreducible CFG the domtree nodes might be dead, do not update domtree for dead nodes. Reviewers: kuhar, dberlin, hfinkel Reviewed By: kuhar Subscribers: llvm-commits, mcrosier Differential Revision: https://reviews.llvm.org/D38960 llvm-svn: 316582	2017-10-25 14:55:48 +00:00
Matthew Simpson	cb58558c2f	Add CalledValuePropagation pass This patch adds a new pass for attaching !callees metadata to indirect call sites. The pass propagates values to call sites by performing an IPSCCP-like analysis using the generic sparse propagation solver. For indirect call sites having a small set of possible callees, the attached metadata indicates what those callees are. The metadata can be used to facilitate optimizations like intersecting the function attributes of the possible callees, refining the call graph, performing indirect call promotion, etc. Differential Revision: https://reviews.llvm.org/D37355 llvm-svn: 316576	2017-10-25 13:40:08 +00:00
Max Kazantsev	9ac7021a25	[IRCE] Fix intersection between signed and unsigned ranges IRCE for unsigned latch conditions was temporarily disabled by rL314881. The motivating example contained an unsigned latch condition and a signed range check. One of the safe iteration ranges was `[1, SINT_MAX + 1]`. Its right border was incorrectly interpreted as a negative value in `IntersectRange` function, this lead to a miscompile under which we deleted a range check without inserting a postloop where it was needed. This patch brings back IRCE for unsigned latch conditions. Now we treat range intersection more carefully. If the latch condition was unsigned, we only try to consider a range check for deletion if: 1. The range check is also unsigned, or 2. Safe iteration range of the range check lies within `[0, SINT_MAX]`. The same is done for signed latch. Values from `[0, SINT_MAX]` are unambiguous, these values are non-negative under any interpretation, and all values of a range intersected with such range are also non-negative. We also use signed/unsigned min/max functions for range intersection depending on type of the latch condition. Differential Revision: https://reviews.llvm.org/D38581 llvm-svn: 316552	2017-10-25 06:47:39 +00:00
Max Kazantsev	4332a943bc	[IRCE] Smarter detection of empty ranges using SCEV For a SCEV range, this patch replaces the naive emptiness check for SCEV ranges which looks like `Begin == End` with a SCEV check. The range is guaranteed to be empty of `Begin >= End`. We should filter such ranges out and do not try to perform IRCE for them. For example, we can get such range when intersecting range `[A, B)` and `[C, D)` where `A < B < C < D`. The resulting range is `[max(A, C), min(B, D)) = [C, B)`. This range is empty, but its `Begin` does not match with `End`. Making IRCE for an empty range is basically safe but unprofitable because we never actually get into the main loop where the range checks are supposed to be eliminated. This patch uses SCEV mechanisms to treat loops with proved `Begin >= End` as empty. Differential Revision: https://reviews.llvm.org/D39082 llvm-svn: 316550	2017-10-25 06:10:02 +00:00
Eugene Zelenko	7f0f9bc5ab	[Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 316503	2017-10-24 21:24:53 +00:00
Artem Belevich	cb8f6328dc	[NVPTX] allow address space inference for volatile loads/stores. If particular target supports volatile memory access operations, we can avoid AS casting to generic AS. Currently it's only enabled in NVPTX for loads and stores that access global & shared AS. Differential Revision: https://reviews.llvm.org/D39026 llvm-svn: 316495	2017-10-24 20:31:44 +00:00
Adrian Prantl	d20442d383	Delete unused instantiations of DIBuilder. NFC llvm-svn: 316494	2017-10-24 20:26:17 +00:00
Marek Olsak	ce76ea0394	AMDGPU: Add new intrinsic llvm.amdgcn.kill(i1) Summary: Kill the thread if operand 0 == false. llvm.amdgcn.wqm.vote can be applied to the operand. Also allow kill in all shader stages. Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D38544 llvm-svn: 316427	2017-10-24 10:27:13 +00:00
Marek Olsak	2114fc3bcb	AMDGPU: Add llvm.amdgcn.wqm.vote intrinsic Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D38543 llvm-svn: 316426	2017-10-24 10:26:59 +00:00
Saleem Abdulrasool	619b3269fd	ObjCARC: do not increment past the end of the BB The `BasicBlock::getFirstInsertionPt` call may return `std::end` for the BB. Dereferencing the end iterator results in an assertion failure "(!NodePtr->isKnownSentinel()), function operator*". Ensure that the returned iterator is valid before dereferencing it. If the end is returned, move one position backward to get a valid insertion point. llvm-svn: 316401	2017-10-24 00:09:10 +00:00
Mandeep Singh Grang	9ed81c66ce	[GVNSink] Fix failing GVNSink tests in the reverse iteration bot Summary: The elts of ActivePreds which is defined as a SmallPtrSet are copied into Blocks using std::copy. This makes the resultant order of Blocks non-deterministic. We cannot simply sort Blocks as they need to match the corresponding Values. So a better approach is to define ActivePreds as SmallSetVector. This fixes the following failures in http://lab.llvm.org:8011/builders/reverse-iteration: LLVM :: Transforms/GVNSink/indirect-call.ll LLVM :: Transforms/GVNSink/sink-common-code.ll LLVM :: Transforms/GVNSink/struct.ll Reviewers: dberlin, jmolloy, bkramer, efriedma Reviewed By: dberlin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39025 llvm-svn: 316369	2017-10-23 19:56:52 +00:00
Sanjay Patel	b80daf0b48	[SimplifyCFG] delay switch condition forwarding to -latesimplifycfg As discussed in D39011: https://reviews.llvm.org/D39011 ...replacing constants with a variable is inverting the transform done by other IR passes, so we definitely don't want to do this early. In fact, it's questionable whether this transform belongs in SimplifyCFG at all. I'll look at moving this to codegen as a follow-up step. llvm-svn: 316298	2017-10-22 19:10:07 +00:00
Sanjay Patel	24226504a7	[SimplifyCFG] try harder to forward switch condition to phi (PR34471) The missed canonicalization/optimization in the motivating test from PR34471 leads to very different codegen: int switcher(int x) { switch(x) { case 17: return 17; case 19: return 19; case 42: return 42; default: break; } return 0; } int comparator(int x) { if (x == 17) return 17; if (x == 19) return 19; if (x == 42) return 42; return 0; } For the first example, we use a bit-test optimization to avoid a series of compare-and-branch: https://godbolt.org/g/BivDsw Differential Revision: https://reviews.llvm.org/D39011 llvm-svn: 316293	2017-10-22 16:51:03 +00:00
David Green	907b60fbba	[LoopInterchange] Fix phi node ordering miscompile. The way that splitInnerLoopHeader splits blocks requires that the induction PHI will be the first PHI in the inner loop header. This makes sure that is actually the case when there are both IV and reduction phis. Differential Revision: https://reviews.llvm.org/D38682 llvm-svn: 316261	2017-10-21 13:58:37 +00:00
Eugene Zelenko	fce435764e	[Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 316253	2017-10-21 00:57:46 +00:00
Eugene Zelenko	99241d75c1	[Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 316241	2017-10-20 21:47:29 +00:00
Eugene Zelenko	bff0ef0324	[Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 316190	2017-10-19 22:07:16 +00:00
Eugene Zelenko	f27d161bf0	[Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 316187	2017-10-19 21:21:30 +00:00
Simon Pilgrim	0444e4fcd4	Fix MSVC signed/unsigned comparison warning llvm-svn: 316161	2017-10-19 15:00:31 +00:00
Max Kazantsev	3612d4b4f9	[NFC][IRCE] Filter out empty ranges early llvm-svn: 316146	2017-10-19 05:33:28 +00:00
whitequark	a99ecf1bbb	[MergeFunctions] Don't blindly RAUW a GlobalValue with a ConstantExpr. MergeFunctions uses (through FunctionComparator) a map of GlobalValues to identifiers because it needs to compare functions and globals do not have an inherent total order. Thus, FunctionComparator (through GlobalNumberState) has a ValueMap<GlobalValue >. r315852 added a RAUW on globals that may have been previously encountered by the FunctionComparator, which would replace a GlobalValue key with a ConstantExpr *, which is illegal. This commit adjusts that code path to remove the function being replaced from the ValueMap as well. llvm-svn: 316145	2017-10-19 04:47:48 +00:00
Chandler Carruth	3f0e056df4	[PM] Refactor the bounds checking pass to remove a method only called in one place. llvm-svn: 316135	2017-10-18 22:42:36 +00:00
Sanjoy Das	2f27456c82	Revert "[ScalarEvolution] Handling for ICmp occuring in the evolution chain." This reverts commit r316054. There was some confusion over the review process: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20171016/495884.html llvm-svn: 316129	2017-10-18 22:00:57 +00:00
Eugene Zelenko	306d29977d	[Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 316128	2017-10-18 21:46:47 +00:00
Jatin Bhateja	1fc49627e4	[ScalarEvolution] Handling for ICmp occuring in the evolution chain. Summary: If a compare instruction is same or inverse of the compare in the branch of the loop latch, then return a constant evolution node. Currently scope of evaluation is limited to SCEV computation for PHI nodes. This shall facilitate computations of loop exit counts in cases where compare appears in the evolution chain of induction variables. Will fix PR 34538 Reviewers: sanjoy, hfinkel, junryoungju Reviewed By: junryoungju Subscribers: javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D38494 llvm-svn: 316054	2017-10-18 01:36:16 +00:00
Michael Zolotukhin	c4fcc189d2	[GlobalDCE] Use DenseMap instead of unordered_multimap for GVDependencies. Summary: std::unordered_multimap happens to be very slow when the number of elements grows large. On one of our internal applications we observed a 17x compile time improvement from changing it to DenseMap. Reviewers: mehdi_amini, serge-sans-paille, davide Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38916 llvm-svn: 316045	2017-10-17 23:47:06 +00:00
Eugene Zelenko	6cadde7f40	[Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 316034	2017-10-17 21:27:42 +00:00
Vitaly Buka	524c0a639d	Fix signed overflow detected by ubsan This overflow does not affect algorithm, so just suppress it. llvm-svn: 316018	2017-10-17 18:33:15 +00:00
Philip Reames	6a7bbfb2e2	Revert 315440 on behalf of mkazantsev This patch reverts rL315440 because of the bug described at https://bugs.llvm.org/show_bug.cgi?id=34937 The fix for the bug is on review as D38944, but not yet ready. Given this is a regression reverting until a fix is ready is called for. Max would have done the revert himself, but is having trouble doing a build of fresh LLVM for some reason. I did the build and test to ensure the revert worked as expected on his behalf. llvm-svn: 315974	2017-10-17 06:21:07 +00:00
Craig Topper	91259e2681	[JumpThreading] Move two PredValueInfoTy vectors to a scope closer to their usage. NFCI llvm-svn: 315941	2017-10-16 21:54:13 +00:00
Eugene Zelenko	dd40f5e7c1	[Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 315940	2017-10-16 21:34:24 +00:00
Akira Hatanaka	e8c1a54c07	[ObjCARC] Do not move a release that has the clang.imprecise_release tag above PHI instructions. ARC optimizer has an optimization that moves a call to an ObjC runtime function above a phi instruction when the phi has a null operand and is an argument passed to the function call. This optimization should not kick in when the runtime function is an objc_release that releases an object with precise lifetime semantics. rdar://problem/34959669 llvm-svn: 315914	2017-10-16 16:46:59 +00:00
Sanjay Patel	42135beac8	[InstCombine] don't unnecessarily generate a constant; NFCI llvm-svn: 315910	2017-10-16 14:47:24 +00:00
NAKAMURA Takumi	414151a47e	Revert rL315894, "SLPVectorizer.cpp: Try to appease stage2-3 difference. (D38586)" llvm-svn: 315896	2017-10-16 09:50:01 +00:00
Nikolai Bozhenov	0e7ebbccc7	Move folding of icmp with zero after checking for min/max idioms. Summary: The following transformation for cmp instruction: icmp smin(x, PositiveValue), 0 -> icmp x, 0 should only be done after checking for min/max to prevent infinite looping caused by a reverse canonicalization. That is why this transformation was moved to place after the mentioned check. Reviewers: spatel, efriedma Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38934 Patch by: Artur Gainullin <artur.gainullin@intel.com> llvm-svn: 315895	2017-10-16 09:19:21 +00:00
NAKAMURA Takumi	4543affa98	SLPVectorizer.cpp: Try to appease stage2-3 difference. (D38586) llvm-svn: 315894	2017-10-16 09:15:23 +00:00
Sanjay Patel	934738a3da	revert r314984: revert r314698 - [InstCombine] remove one-use restriction for icmp (shr exact X, C1), C2 --> icmp X, (C2<<C1) Recommitting r314698. The bug exposed by this change should be fixed with: https://reviews.llvm.org/rL315579 llvm-svn: 315857	2017-10-15 15:39:15 +00:00
Sanjay Patel	30f30d37fb	[SimplifyCFG] use range-for-loops, tidy; NFCI There seems to be something missing here as shown in PR34471: https://bugs.llvm.org/show_bug.cgi?id=34471 llvm-svn: 315855	2017-10-15 14:43:39 +00:00
Aaron Ballman	615eb47035	Reverting r315590; it did not include changes for llvm-tblgen, which is causing link errors for several people. Error LNK2019 unresolved external symbol "public: void __cdecl `anonymous namespace'::MatchableInfo::dump(void)const " (?dump@MatchableInfo@?A0xf4f1c304@@QEBAXXZ) referenced in function "public: void __cdecl `anonymous namespace'::AsmMatcherEmitter::run(class llvm::raw_ostream &)" (?run@AsmMatcherEmitter@?A0xf4f1c304@@QEAAXAEAVraw_ostream@llvm@@@Z) llvm-tblgen D:\llvm\2017\utils\TableGen\AsmMatcherEmitter.obj 1 llvm-svn: 315854	2017-10-15 14:32:27 +00:00
whitequark	ae12efab20	[MergeFunctions] Merge small functions if possible without a thunk. This can result in significant code size savings in some cases, e.g. an interrupt table all filled with the same assembly stub in a certain Cortex-M BSP results in code blowup by a factor of 2.5. Differential Revision: https://reviews.llvm.org/D34806 llvm-svn: 315853	2017-10-15 12:29:09 +00:00
whitequark	b2ce9ffede	[MergeFunctions] Replace all uses of unnamed_addr functions. This reduces code size for constructs like vtables or interrupt tables that refer to functions in global initializers. Differential Revision: https://reviews.llvm.org/D34805 llvm-svn: 315852	2017-10-15 12:29:01 +00:00
Hongbin Zheng	73f650435b	[LoopInfo][Refactor] Make SetLoopAlreadyUnrolled a member function of the Loop Pass, NFC. This avoid code duplication and allow us to add the disable unroll metadata elsewhere. Differential Revision: https://reviews.llvm.org/D38928 llvm-svn: 315850	2017-10-15 07:31:02 +00:00
Sanjay Patel	b869f76d85	[InstCombine] use m_Neg() to reduce code; NFCI llvm-svn: 315762	2017-10-13 21:28:50 +00:00
Eugene Zelenko	3b87939604	[Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 315760	2017-10-13 21:17:07 +00:00
Peter Collingbourne	868783e855	LowerTypeTests: Give imported symbols a type with size 0 so that they are not assumed not to alias. It is possible for both a base and a derived class to be satisfied with a unique vtable. If a program contains casts of the same pointer to both of those types, the CFI checks will be lowered to this (with ThinLTO): if (p != &__typeid_base_global_addr) trap(); if (p != &__typeid_derived_global_addr) trap(); The optimizer may then use the first condition combined with the assumption that __typeid_base_global_addr and __typeid_derived_global_addr may not alias to optimize away the second comparison, resulting in an unconditional trap. This patch fixes the bug by giving imported globals the type [0 x i8]*, which prevents the optimizer from assuming that they do not alias. Differential Revision: https://reviews.llvm.org/D38873 llvm-svn: 315753	2017-10-13 21:02:16 +00:00
Sanjay Patel	f0242de143	[InstCombine] move code to remove repeated constant check; NFCI Also, consolidate tests for this fold in one place. llvm-svn: 315745	2017-10-13 20:29:11 +00:00
Sanjay Patel	28b3aa3663	[InstCombine] recycle adds for better efficiency Also, clean up unnecessary matcher capture variable initializations. llvm-svn: 315743	2017-10-13 20:12:21 +00:00
Sanjay Patel	2118952162	[InstCombine] use local var to reduce code duplication; NFCI llvm-svn: 315728	2017-10-13 18:32:53 +00:00
Matthew Simpson	2284937bbc	[IPSCCP] Move common functions to ValueLatticeUtils (NFC) This patch moves some common utility functions out of IPSCCP and makes them available globally. The functions determine if interprocedural data-flow analyses can propagate information through function returns, arguments, and global variables. Differential Revision: https://reviews.llvm.org/D37638 llvm-svn: 315719	2017-10-13 17:53:44 +00:00
Sanjay Patel	c419c9f640	[InstCombine] add hasOneUse check to add-zext-add fold to prevent increasing instructions llvm-svn: 315718	2017-10-13 17:47:25 +00:00
Sanjay Patel	76ed9eab29	[InstCombine] use AddOne helper to reduce code; NFC llvm-svn: 315709	2017-10-13 17:00:47 +00:00
Sanjay Patel	8d810fee43	[InstCombine] rearrange code to remove repeated constant check; NFCI llvm-svn: 315703	2017-10-13 16:43:58 +00:00
Sanjay Patel	2150651ac3	[InstCombine] allow zext(bool) + C --> select bool, C+1, C for vector types The backend should be prepared for this transform after: https://reviews.llvm.org/rL311731 llvm-svn: 315701	2017-10-13 16:29:38 +00:00
Daniel Neilson	fa14ebd138	[RS4GC] Look through vector bitcasts when looking for base pointer Summary: In RS4GC it is possible that a base pointer is contained in a vector that has undergone a bitcast from one element-pointertype to another. We teach RS4GC how to look through bitcasts of vector types when looking for a base pointer. Reviewers: anna Reviewed By: anna Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38849 llvm-svn: 315694	2017-10-13 15:59:13 +00:00
Daniel Jasper	3344a21236	Revert r314923: "Recommit : Use the basic cost if a GEP is not used as addressing mode" Significantly reduces performancei (~30%) of gipfeli (https://github.com/google/gipfeli) I have not yet managed to reproduce this regression with the open-source version of the benchmark on github, but will work with others to get a reproducer to you later today. llvm-svn: 315680	2017-10-13 14:04:21 +00:00
Marco Castelluccio	0dcf64ad20	Disable gcov instrumentation of functions using funclet-based exception handling Summary: This patch fixes the crash from https://bugs.llvm.org/show_bug.cgi?id=34659 and https://bugs.llvm.org/show_bug.cgi?id=34833. Reviewers: rnk, majnemer Reviewed By: rnk, majnemer Subscribers: majnemer, llvm-commits Differential Revision: https://reviews.llvm.org/D38223 llvm-svn: 315677	2017-10-13 13:49:15 +00:00
Eugene Zelenko	5323550e9a	[Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 315640	2017-10-12 23:30:03 +00:00
Anna Thomas	61aec18d46	[CVP] Process binary operations even when def is local Summary: This patch adds processing of binary operations when the def of operands are in the same block (i.e. local processing). Earlier we bailed out in such cases (the bail out was introduced in rL252032) because LVI at that time was more precise about context at the end of basic blocks, which implied local def and use analysis didn't benefit CVP. Since then we've added support for LVI in presence of assumes and guards. The test cases added show how local def processing in CVP helps adding more information to the ashr, sdiv, srem and add operators. Note: processCmp which suffers from the same problem will be handled in a later patch. Reviewers: philip, apilipenko, SjoerdMeijer, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38766 llvm-svn: 315634	2017-10-12 22:39:52 +00:00
Artur Pilipenko	ead69ee4bd	[LoopPredication] Check whether the loop is already guarded by the first iteration check condition llvm-svn: 315623	2017-10-12 21:21:17 +00:00
Bruno Cardoso Lopes	993d2e67d8	Revert "Reintroduce "[SCCP] Propagate integer range info for parameters in IPSCCP."" This reverts commit r315593: still affect two bots: http://lab.llvm.org:8011/builders/clang-with-lto-ubuntu/builds/5308 http://green.lab.llvm.org/green/job/clang-stage2-configure-Rlto/21751/ llvm-svn: 315618	2017-10-12 20:52:34 +00:00
Artur Pilipenko	b4527e1ce2	[LoopPredication] Support ule, sle latch predicates This is a follow up for the loop predication change 313981 to support ule, sle latch predicates. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D38177 llvm-svn: 315616	2017-10-12 20:40:27 +00:00
Bruno Cardoso Lopes	326fdcbff8	Reintroduce "[SCCP] Propagate integer range info for parameters in IPSCCP." This is r315288 & r315294, which were reverted due to stage2 bot failures. Summary: This updates the SCCP solver to use of the ValueElement lattice for parameters, which provides integer range information. The range information is used to remove unneeded icmp instructions. For the following function, f() can be optimized to `ret i32 2` with this change source_filename = "sccp.c" target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" ; Function Attrs: norecurse nounwind readnone uwtable define i32 @main() local_unnamed_addr #0 { entry: %call = tail call fastcc i32 @f(i32 1) %call1 = tail call fastcc i32 @f(i32 47) %add3 = add nsw i32 %call, %call1 ret i32 %add3 } ; Function Attrs: noinline norecurse nounwind readnone uwtable define internal fastcc i32 @f(i32 %x) unnamed_addr #1 { entry: %c1 = icmp sle i32 %x, 100 %cmp = icmp sgt i32 %x, 300 %. = select i1 %cmp, i32 1, i32 2 ret i32 %. } attributes #1 = { noinline } Reviewers: davide, sanjoy, efriedma, dberlin Reviewed By: davide, dberlin Subscribers: mcrosier, gberry, mssimpso, dberlin, llvm-commits Differential Revision: https://reviews.llvm.org/D36656 llvm-svn: 315593	2017-10-12 16:54:11 +00:00
Don Hinton	3e0199f7eb	[dump] Remove NDEBUG from test to enable dump methods [NFC] Summary: Add LLVM_FORCE_ENABLE_DUMP cmake option, and use it along with LLVM_ENABLE_ASSERTIONS to set LLVM_ENABLE_DUMP. Remove NDEBUG and only use LLVM_ENABLE_DUMP to enable dump methods. Move definition of LLVM_ENABLE_DUMP from config.h to llvm-config.h so it'll be picked up by public headers. Differential Revision: https://reviews.llvm.org/D38406 llvm-svn: 315590	2017-10-12 16:16:06 +00:00
Hongbin Zheng	d36f2030e2	[SimplifyIndVar] Replace IVUsers with loop invariant whenever possible Differential Revision: https://reviews.llvm.org/D38415 llvm-svn: 315551	2017-10-12 02:54:11 +00:00
Zachary Turner	41a9ee98f9	Revert "[ADT] Make Twine's copy constructor private." This reverts commit 4e4ee1c507e2707bb3c208e1e1b6551c3015cbf5. This is failing due to some code that isn't built on MSVC so I didn't catch. Not immediately obvious how to fix this at first glance, so I'm reverting for now. llvm-svn: 315536	2017-10-11 23:54:34 +00:00
Zachary Turner	337462b365	[ADT] Make Twine's copy constructor private. There's a lot of misuse of Twine scattered around LLVM. This ranges in severity from benign (returning a Twine from a function by value that is just a string literal) to pretty sketchy (storing a Twine by value in a class). While there are some uses for copying Twines, most of the very compelling ones are confined to the Twine class implementation itself, and other uses are either dubious or easily worked around. This patch makes Twine's copy constructor private, and fixes up all callsites. Differential Revision: https://reviews.llvm.org/D38767 llvm-svn: 315530	2017-10-11 23:33:06 +00:00
Eugene Zelenko	6f1ae631f7	[Transforms] Revert r315516 changes in PredicateInfo to fix Windows build bots (NFC). llvm-svn: 315519	2017-10-11 21:56:44 +00:00
Eugene Zelenko	286d5897d6	[Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 315516	2017-10-11 21:41:43 +00:00
Vivek Pandya	9590658fb8	[NFC] Convert OptimizationRemarkEmitter old emit() calls to new closure parameterized emit() calls Summary: This is not functional change to adopt new emit() API added in r313691. Reviewed By: anemet Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38285 llvm-svn: 315476	2017-10-11 17:12:59 +00:00
Max Kazantsev	fecaff1bd9	[NFC] Fix variables used only for assert in GVN llvm-svn: 315448	2017-10-11 10:31:49 +00:00
Max Kazantsev	3b81809e06	[GVN] Prevent LoadPRE from hoisting across instructions that don't pass control flow to successors This patch fixes the miscompile that happens when PRE hoists loads across guards and other instructions that don't always pass control flow to their successors. PRE is now prohibited to hoist across such instructions because there is no guarantee that the load standing after such instruction is still valid before such instruction. For example, a load from under a guard may be invalid before the guard in the following case: int array[LEN]; ... guard(0 <= index && index < LEN); use(array[index]); Differential Revision: https://reviews.llvm.org/D37460 llvm-svn: 315440	2017-10-11 08:10:43 +00:00
Max Kazantsev	0c8dd052b8	[LICM] Disallow sinking of unordered atomic loads into loops Sinking of unordered atomic load into loop must be disallowed because it turns a single load into multiple loads. The relevant section of the documentation is: http://llvm.org/docs/Atomics.html#unordered, specifically the Notes for Optimizers section. Here is the full text of this section: > Notes for optimizers > In terms of the optimizer, this prohibits any transformation that > transforms a single load into multiple loads, transforms a store into > multiple stores, narrows a store, or stores a value which would not be > stored otherwise. Some examples of unsafe optimizations are narrowing > an assignment into a bitfield, rematerializing a load, and turning loads > and stores into a memcpy call. Reordering unordered operations is safe, > though, and optimizers should take advantage of that because unordered > operations are common in languages that need them. Patch by Daniil Suchkov! Reviewed By: reames Differential Revision: https://reviews.llvm.org/D38392 llvm-svn: 315438	2017-10-11 07:26:45 +00:00
Max Kazantsev	25d8655dc2	[IRCE] Do not process empty safe ranges IRCE should not apply when the safe iteration range is proved to be empty. In this case we do unneeded job creating pre/post loops and then never go to the main loop. This patch makes IRCE not apply to empty safe ranges, adds test for this situation and also modifies one of existing tests where it used to happen slightly. Reviewed By: anna Differential Revision: https://reviews.llvm.org/D38577 llvm-svn: 315437	2017-10-11 06:53:07 +00:00
Davide Italiano	e2138fe41b	[GVN] Don't replace constants with constants. This fixes PR34908. Patch by Alex Crichton! Differential Revision: https://reviews.llvm.org/D38765 llvm-svn: 315429	2017-10-11 04:21:51 +00:00
Eugene Zelenko	e9ea08a097	[Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 315383	2017-10-10 22:49:55 +00:00
Dehao Chen	3f56a05ae5	Use the first instruction's count to estimate the funciton's entry frequency. Summary: In the current implementation, we only have accurate profile count for standalone symbols. For inlined functions, we do not have entry count data because it's not available in LBR. In this patch, we use the first instruction's frequency to estimiate the function's entry count, especially for inlined functions. This may be inaccurate due to debug info in optimized code. However, this is a better estimate than the static 80/20 estimation we have in the current implementation. Reviewers: tejohnson, davidxl Reviewed By: tejohnson Subscribers: sanjoy, llvm-commits, aprantl Differential Revision: https://reviews.llvm.org/D38478 llvm-svn: 315369	2017-10-10 21:13:50 +00:00
Bruno Cardoso Lopes	57304923ca	Revert "[SCCP] Propagate integer range info for parameters in IPSCCP." This reverts commit r315288. This is part of fixing segfault introduced in: http://green.lab.llvm.org/green/job/clang-stage2-configure-Rlto/21675/ llvm-svn: 315329	2017-10-10 16:37:57 +00:00
Bruno Cardoso Lopes	122c4b3c8c	Revert "[SCCP] Fix mem-sanitizer failure introduced by r315288." This reverts commit r315294. Part of fixing seg fault introduced in: http://green.lab.llvm.org/green/job/clang-stage2-configure-Rlto/21675/ llvm-svn: 315328	2017-10-10 16:37:51 +00:00
Florian Hahn	7d2375df30	[SCCP] Fix mem-sanitizer failure introduced by r315288. llvm-svn: 315294	2017-10-10 10:33:45 +00:00
Florian Hahn	22a44bca40	[SCCP] Propagate integer range info for parameters in IPSCCP. Summary: This updates the SCCP solver to use of the ValueElement lattice for parameters, which provides integer range information. The range information is used to remove unneeded icmp instructions. For the following function, f() can be optimized to `ret i32 2` with this change source_filename = "sccp.c" target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" ; Function Attrs: norecurse nounwind readnone uwtable define i32 @main() local_unnamed_addr #0 { entry: %call = tail call fastcc i32 @f(i32 1) %call1 = tail call fastcc i32 @f(i32 47) %add3 = add nsw i32 %call, %call1 ret i32 %add3 } ; Function Attrs: noinline norecurse nounwind readnone uwtable define internal fastcc i32 @f(i32 %x) unnamed_addr #1 { entry: %c1 = icmp sle i32 %x, 100 %cmp = icmp sgt i32 %x, 300 %. = select i1 %cmp, i32 1, i32 2 ret i32 %. } attributes #1 = { noinline } Reviewers: davide, sanjoy, efriedma, dberlin Reviewed By: davide, dberlin Subscribers: mcrosier, gberry, mssimpso, dberlin, llvm-commits Differential Revision: https://reviews.llvm.org/D36656 llvm-svn: 315288	2017-10-10 09:32:38 +00:00
Clement Courbet	e2e8a5c496	Re-land "[MergeICmps] Disable mergeicmps if the target does not want to handle memcmp expansion." (fixed stability issues) This reverts commit d6492333d3b478a1d88163315002022f8d5e58dc. llvm-svn: 315281	2017-10-10 08:00:45 +00:00
Xinliang David Li	4cdc9dab0a	Renable r314928 Eliminate inttype phi with inttoptr/ptrtoint. This version fixed a bug in finding the matching phi -- the order of the incoming blocks may be different (triggered in self build on Windows). A new test case is added. llvm-svn: 315272	2017-10-10 05:07:54 +00:00
Adam Nemet	0965da2055	Rename OptimizationDiagnosticInfo.* to OptimizationRemarkEmitter.* Sync it up with the name of the class actually defined here. This has been bothering me for a while... llvm-svn: 315249	2017-10-09 23:19:02 +00:00
Sanjay Patel	ce36b03b03	[InstCombine] fix formatting; NFC llvm-svn: 315223	2017-10-09 17:54:46 +00:00
Sanjay Patel	72d339abb7	[InstCombine] use correct type when propagating constant condition in simplifyDivRemOfSelectWithZeroOp (PR34856) llvm-svn: 315130	2017-10-06 23:43:06 +00:00
Sanjay Patel	ae2e3a44d2	[InstCombine] rename SimplifyDivRemOfSelect to be clearer, add comments, simplify code; NFCI There's at least one bug here - this code can fail with vector types (PR34856). It's also being called for FREM; I'm still trying to understand how that is valid. llvm-svn: 315127	2017-10-06 23:20:16 +00:00
Reid Kleckner	b6b210e61f	Revert "Roll forward r314928" This appears to be miscompiling Clang, as shown on two Windows bootstrap bots: http://lab.llvm.org:8011/builders/clang-x86-windows-msvc2015/builds/7611 http://lab.llvm.org:8011/builders/clang-x64-ninja-win7/builds/6870 Nothing else is in the blame list. Both emit errors on this valid code in the Windows ucrt headers: C:\...\ucrt\malloc.h:95:32: error: invalid operands to binary expression ('char ' and 'int') _Ptr = (char)_Ptr + _ALLOCA_S_MARKER_SIZE; ~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~ I am attempting to reproduce this now. This reverts r315044 llvm-svn: 315108	2017-10-06 21:17:51 +00:00
Dehao Chen	9bd60429e2	Directly return promoted direct call instead of rely on stripPointerCast. Summary: stripPointerCast is not reliably returning the value that's being type-casted. Instead it may look further at function attributes to further propagate the value. Instead of relying on stripPOintercast, the more reliable solution is to directly use the pointer to the promoted direct call. Reviewers: tejohnson, davidxl Reviewed By: tejohnson Subscribers: llvm-commits, sanjoy Differential Revision: https://reviews.llvm.org/D38603 llvm-svn: 315077	2017-10-06 17:04:55 +00:00
Clement Courbet	d12c189e2e	Revert "[MergeICmps] Disable mergeicmps if the target does not want to handle memcmp expansion." Still a few stability issues on windows. This reverts commit 67e3db9bc121ba244e20337aabc7cf341a62b545. llvm-svn: 315058	2017-10-06 13:02:24 +00:00
Clement Courbet	4e1bae8136	Re-land "[MergeICmps] Disable mergeicmps if the target does not want to handle memcmp expansion." (fixed unit tests by making comparisons stable) This reverts commit 1b2d359ce256fd6737da4e93833346a0bd6d7583. llvm-svn: 315056	2017-10-06 12:12:35 +00:00
Xinliang David Li	bcd36f7c5a	Roll forward r314928 Fixed ThinLTO bootstrap failure : track new bitcast per incomingVal. Added new tests. llvm-svn: 315044	2017-10-06 05:15:25 +00:00
Davide Italiano	c74ea93b8c	[PM] Retire disable unit-at-a-time switch. This is a vestige from the GCC-3 days, which disables IPO passes when set. I don't think anybody actually uses it as there are several IPO passes which still run with this flag set and nobody complained/noticed. This reduces the delta between current and new pass manager and allows us to easily review the difference when we decide to flip the switch (or audit which passes should run, FWIW). llvm-svn: 315043	2017-10-06 04:39:40 +00:00
Jakub Kuderski	cbe9fae99d	[CodeExtractor] Fix multiple bugs under certain shape of extracted region Summary: If the extracted region has multiple exported data flows toward the same BB which is not included in the region, correct resotre instructions and PHI nodes won't be generated inside the exitStub. The solution is simply put the restore instructions right after the definition of output values instead of putting in exitStub. Unittest for this bug is included. Author: myhsu Reviewers: chandlerc, davide, lattner, silvas, davidxl, wmi, kuhar Subscribers: dberlin, kuhar, mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D37902 llvm-svn: 315041	2017-10-06 03:37:06 +00:00
Daniel Berlin	08dd582ea0	NewGVN: Factor out duplicate parts of OpIsSafeForPHIOfOps llvm-svn: 315040	2017-10-06 01:33:06 +00:00
Peter Collingbourne	715bcfe0c9	ModuleUtils: Stop using comdat members to generate unique module ids. It is possible for two modules to define the same set of external symbols without causing a duplicate symbol error at link time, as long as each of the symbols is a comdat member. So we cannot use them as part of a unique id for the module. Differential Revision: https://reviews.llvm.org/D38602 llvm-svn: 315026	2017-10-05 21:54:53 +00:00
Sanjay Patel	7ac2db6a48	[InstCombine] improve folds for icmp gt/lt (shr X, C1), C2 We can always eliminate the shift in: icmp gt/lt (shr X, C1), C2 --> icmp gt/lt X, C' This patch was supposed to just be an efficiency improvement because we were doing this 3-step process to fold: IC: Visiting: %c = icmp ugt i4 %s, 1 IC: ADD: %s = lshr i4 %x, 1 IC: ADD: %1 = udiv i4 %x, 2 IC: Old = %c = icmp ugt i4 %1, 1 New = <badref> = icmp uge i4 %x, 4 IC: ADD: %c = icmp uge i4 %x, 4 IC: ERASE %2 = icmp ugt i4 %1, 1 IC: Visiting: %c = icmp uge i4 %x, 4 IC: Old = %c = icmp uge i4 %x, 4 New = <badref> = icmp ugt i4 %x, 3 IC: ADD: %c = icmp ugt i4 %x, 3 IC: ERASE %2 = icmp uge i4 %x, 4 IC: Visiting: %c = icmp ugt i4 %x, 3 IC: DCE: %1 = udiv i4 %x, 2 IC: ERASE %1 = udiv i4 %x, 2 IC: DCE: %s = lshr i4 %x, 1 IC: ERASE %s = lshr i4 %x, 1 IC: Visiting: ret i1 %c When we could go directly to canonical icmp form: IC: Visiting: %c = icmp ugt i4 %s, 1 IC: Old = %c = icmp ugt i4 %s, 1 New = <badref> = icmp ugt i4 %x, 3 IC: ADD: %c = icmp ugt i4 %x, 3 IC: ERASE %1 = icmp ugt i4 %s, 1 IC: ADD: %s = lshr i4 %x, 1 IC: DCE: %s = lshr i4 %x, 1 IC: ERASE %s = lshr i4 %x, 1 IC: Visiting: %c = icmp ugt i4 %x, 3 ...but then I noticed that the folds were incomplete too: https://godbolt.org/g/aB2hLE Here are attempts to prove the logic with Alive: https://rise4fun.com/Alive/92o Name: lshr_ult Pre: ((C2 << C1) u>> C1) == C2 %sh = lshr i8 %x, C1 %r = icmp ult i8 %sh, C2 => %r = icmp ult i8 %x, (C2 << C1) Name: ashr_slt Pre: ((C2 << C1) >> C1) == C2 %sh = ashr i8 %x, C1 %r = icmp slt i8 %sh, C2 => %r = icmp slt i8 %x, (C2 << C1) Name: lshr_ugt Pre: (((C2+1) << C1) u>> C1) == (C2+1) %sh = lshr i8 %x, C1 %r = icmp ugt i8 %sh, C2 => %r = icmp ugt i8 %x, ((C2+1) << C1) - 1 Name: ashr_sgt Pre: (C2 != 127) && ((C2+1) << C1 != -128) && (((C2+1) << C1) >> C1) == (C2+1) %sh = ashr i8 %x, C1 %r = icmp sgt i8 %sh, C2 => %r = icmp sgt i8 %x, ((C2+1) << C1) - 1 Name: ashr_exact_sgt Pre: ((C2 << C1) >> C1) == C2 %sh = ashr exact i8 %x, C1 %r = icmp sgt i8 %sh, C2 => %r = icmp sgt i8 %x, (C2 << C1) Name: ashr_exact_slt Pre: ((C2 << C1) >> C1) == C2 %sh = ashr exact i8 %x, C1 %r = icmp slt i8 %sh, C2 => %r = icmp slt i8 %x, (C2 << C1) Name: lshr_exact_ugt Pre: ((C2 << C1) u>> C1) == C2 %sh = lshr exact i8 %x, C1 %r = icmp ugt i8 %sh, C2 => %r = icmp ugt i8 %x, (C2 << C1) Name: lshr_exact_ult Pre: ((C2 << C1) u>> C1) == C2 %sh = lshr exact i8 %x, C1 %r = icmp ult i8 %sh, C2 => %r = icmp ult i8 %x, (C2 << C1) We did something similar for 'shl' in D28406. Differential Revision: https://reviews.llvm.org/D38514 llvm-svn: 315021	2017-10-05 21:11:49 +00:00
Dehao Chen	16f01fb1db	Annotate VP prof on indirect call if it is ICPed in the profiled binary. Summary: In SamplePGO, when an indirect call is promoted in the profiled binary, before profile annotation, it will be promoted and inlined. For the original indirect call, the current implementation will not mark VP profile on it. This is an issue when profile becomes stale. This patch annotates VP prof on indirect calls during annotation. Reviewers: tejohnson Reviewed By: tejohnson Subscribers: sanjoy, llvm-commits Differential Revision: https://reviews.llvm.org/D38477 llvm-svn: 315016	2017-10-05 20:15:29 +00:00
Davide Italiano	c8708e59e8	[PassManager] Improve the interaction between -O2 and ThinLTO. Run GDCE slightly later so that we don't have to repeat it twice when preparing for Thin. Thanks to Mehdi for the suggestion. llvm-svn: 314999	2017-10-05 18:23:25 +00:00
Davide Italiano	ff829cea8b	[PassManager] Run global optimizations after the inliner. The inliner performs some kind of dead code elimination as it goes, but there are cases that are not really caught by it. We might at some point consider teaching the inliner about them, but it is OK for now to run GlobalOpt + GlobalDCE in tandem as their benefits generally outweight the cost, making the whole pipeline faster. This fixes PR34652. Differential Revision: https://reviews.llvm.org/D38154 llvm-svn: 314997	2017-10-05 18:06:37 +00:00
Ayal Zaks	c9e0f886e5	[LV] Fix PR34743 - handle casts that sink after interleaved loads When ignoring a load that participates in an interleaved group, make sure to move a cast that needs to sink after it. Testcase derived from reproducer of PR34743. Differential Revision: https://reviews.llvm.org/D38338 llvm-svn: 314986	2017-10-05 15:45:14 +00:00
Clement Courbet	922e5bc698	Revert "Re-land "[MergeICmps] Disable mergeicmps if the target does not want to handle memcmp expansion.""" broken test on windows This reverts commit c91479518344fd1fc071c5bd5848f6eb83e53dca. llvm-svn: 314985	2017-10-05 14:42:06 +00:00
Sanjay Patel	f11b5b4f87	revert r314698 - [InstCombine] remove one-use restriction for icmp (shr exact X, C1), C2 --> icmp X, (C2<<C1) There is a bot failure that appears to be related to this change: http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15-selfhost-neon/builds/2117 ...so reverting to confirm that and attempting to keep the bot green while investigating. llvm-svn: 314984	2017-10-05 14:26:15 +00:00
Ayal Zaks	fc3f7a4f0c	[LV] Fix PR34711 - widen instruction ranges when sinking casts Instead of trying to keep LastWidenRecipe updated after creating each recipe, have tryToWiden() retrieve the last recipe of the current VPBasicBlock and check if it's a VPWidenRecipe when attempting to extend its range. This ensures that such extensions, optimized to maintain the original instruction order, do so only when the instructions are to maintain their relative order. The latter does not always hold, e.g., when a cast needs to sink to unravel first order recurrence (r306884). Testcase derived from reproducer of PR34711. Differential Revision: https://reviews.llvm.org/D38339 llvm-svn: 314981	2017-10-05 12:41:49 +00:00
Clement Courbet	4cafbb9b5e	Re-land "[MergeICmps] Disable mergeicmps if the target does not want to handle memcmp expansion."" llvm-svn: 314980	2017-10-05 12:39:57 +00:00
Clement Courbet	6603fc0e7b	Revert "[MergeICmps] Disable mergeicmps if the target does not want to handle memcmp expansion." Breaks clang-stage1-cmake-RA-incremental/llvm/test/Transforms/MergeICmps/X86/tuple-four-int8.ll This reverts commit 3038c459d67f8898ffa295d54a013b280690abfa. llvm-svn: 314972	2017-10-05 08:03:39 +00:00
Craig Topper	17b0c78447	[InstCombine] Fix a vector splat handling bug in transformZExtICmp. We were using an i1 type and then zero extending to a vector. Instead just create the 0/1 directly as a ConstantInt with the correct type. No need to ask ConstantExpr to zero extend for us. This bug is a bit tricky to hit because it requires us to visit a zext of an icmp that would normally be simplified to true/false, but that icmp hasnt' been visited yet. In the test case this zext and icmp were created by visiting a udiv and due to worklist ordering we got to the zext first. Fixes PR34841. llvm-svn: 314971	2017-10-05 07:59:11 +00:00
Clement Courbet	902eef32eb	[MergeICmps] Disable mergeicmps if the target does not want to handle memcmp expansion. Summary: This is to avoid e.g. merging two cheap icmps if the target is not going to expand to something nice later. Reviewers: dberlin, spatel Subscribers: davide, nemanjai Differential Revision: https://reviews.llvm.org/D38232 llvm-svn: 314970	2017-10-05 07:49:09 +00:00
Xinliang David Li	04ab11a08a	Revert r314928 to investigate thinLTO bootstrap failure llvm-svn: 314961	2017-10-05 01:40:13 +00:00
Craig Topper	7a93092399	[InstCombine] Improve support for ashr in foldICmpAndShift We can support ashr similar to lshr, if we know that none of the shifted in bits are used. In that case SimplifyDemandedBits would normally convert it to lshr. But that conversion doesn't happen if the shift has additional users. Differential Revision: https://reviews.llvm.org/D38521 llvm-svn: 314945	2017-10-04 23:06:13 +00:00
Hans Wennborg	899809d531	Fix a -Wparentheses warning. NFC. llvm-svn: 314936	2017-10-04 21:14:07 +00:00
Marcello Maggioni	df3e71e037	[LoopDeletion] Move deleteDeadLoop to to LoopUtils. NFC llvm-svn: 314934	2017-10-04 20:42:46 +00:00
Sanjay Patel	4c33d5213b	[SimplifyCFG] put the optional assumption cache pointer in the options struct; NFCI This is a follow-up to https://reviews.llvm.org/D38138. I fixed the capitalization of some functions because we're changing those lines anyway and that helped verify that we weren't accidentally dropping any options by using default param values. llvm-svn: 314930	2017-10-04 20:26:25 +00:00
Xinliang David Li	7a73757358	Recommit r314561 after fixing msan build failure (trial 2) Incoming val defined by terminator instruction which also requires bitcasts can not be handled. llvm-svn: 314928	2017-10-04 20:17:55 +00:00
Jun Bum Lim	d40e03c2d8	Recommit : Use the basic cost if a GEP is not used as addressing mode Recommitting r314517 with the fix for handling ConstantExpr. Original commit message: Currently, getGEPCost() returns TCC_FREE whenever a GEP is a legal addressing mode in the target. However, since it doesn't check its actual users, it will return FREE even in cases where the GEP cannot be folded away as a part of actual addressing mode. For example, if an user of the GEP is a call instruction taking the GEP as a parameter, then the GEP may not be folded in isel. llvm-svn: 314923	2017-10-04 18:33:52 +00:00
Clement Courbet	98eaa88357	[NFC] clang-format lib/Transforms/Scalar/MergeICmps.cpp llvm-svn: 314906	2017-10-04 15:13:52 +00:00
Max Kazantsev	8aacef6cae	[IRCE] Temporarily disable unsigned latch conditions by default We have found some corner cases connected to range intersection where IRCE makes a bad thing when the latch condition is unsigned. The fix for that will go as a follow up. This patch temporarily disables IRCE for unsigned latch conditions until the issue is fixed. The unsigned latch conditions were introduced to IRCE by rL310027. Differential Revision: https://reviews.llvm.org/D38529 llvm-svn: 314881	2017-10-04 06:53:22 +00:00

1 2 3 4 5 ...

19047 Commits