llvm-project


Author	SHA1	Message	Date
Dorit Nuzman	06903d16af	Revert r285517 due to build failures. llvm-svn: 285518	2016-10-30 14:34:57 +00:00
Dorit Nuzman	3c1c658f24	[LoopVectorize] Make interleaved-accesses analysis less conservative about possible pointer-wrap-around concerns, in some cases. Before this patch, collectConstStridedAccesses (part of interleaved-accesses analysis) called getPtrStride with [Assume=false, ShouldCheckWrap=true] when examining all candidate pointers. This is too conservative. Instead, this patch makes collectConstStridedAccesses use an optimistic approach, calling getPtrStride with [Assume=true, ShouldCheckWrap=false], and then, once the candidate interleave groups have been formed, revisits the pointer-wrapping analysis but only where it matters: namely, in groups that have gaps, and where the gaps are not at the very end of the group (in which case the loop is peeled). This second time getPtrStride is called with [Assume=false, ShouldCheckWrap=true], but this could further be improved to using Assume=true, once we also add the logic to track that we are not going to meet the scev runtime checks threshold. Differential Revision: https://reviews.llvm.org/D25276 llvm-svn: 285517	2016-10-30 12:23:26 +00:00
Teresa Johnson	bf28c8fa45	[ThinLTO] Use per-summary flag to prevent exporting locals used in inline asm Summary: Instead of using the workaround of suppressing the entire index for modules that call inline asm that may reference locals, use the NoRename flag on the summary for any locals in the llvm.used set, and add a reference edge from any functions containing inline asm. This avoids issues from having no summaries despite the module defining global values, which was preventing more aggressive index-based optimization. It will be followed by a subsequent patch to make a similar fix for local references in module level asm (to fix PR30610). Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26121 llvm-svn: 285513	2016-10-30 05:40:44 +00:00
Teresa Johnson	38d4df714c	[ThinLTO] Rename doPromoteLocalToGlobal to shouldPromoteLocalToGlobal (NFC) Rename as suggested in code review for D26063. llvm-svn: 285508	2016-10-29 21:52:23 +00:00
Teresa Johnson	1b9c2be8f4	[ThinLTO] Use NoPromote flag in summary during promotion Summary: Replace the check of whether a GV has a section with the flag check in the summary. This is in preparation for using the NoPromote flag to convey other situations when we can't promote (e.g. locals used in inline asm). Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26063 llvm-svn: 285507	2016-10-29 21:31:48 +00:00
Sanjay Patel	978f827d12	[InstCombine] re-use bitcasted compare operands in selects (PR28001) These mixed bitcast patterns show up with SSE/AVX intrinsics because we bitcast function parameters to <2 x i64>. The bitcasts obfuscate the expected min/max forms as shown in PR28001: https://llvm.org/bugs/show_bug.cgi?id=28001#c6 Differential Revision: https://reviews.llvm.org/D25943 llvm-svn: 285495	2016-10-29 15:22:04 +00:00
Justin Lebar	0ede5fb1bb	Don't leave unused divs/rems sitting around in BypassSlowDivision. Summary: This "pass" eagerly creates div and rem instructions even when only one is needed -- it relies on a later pass (machine DCE?) to clean them up. This is problematic not just from a cleanliness perspective (this pass is running during CodeGenPrepare, so should leave the IR in a better state), but it also creates a problem for instruction selection. If we always have a div+rem, isel will always select a divrem instruction (if possible), even when a single div or rem would do. Specifically, in NVPTX, we want to compute rem from the output of div, if available. But if a div is not available, we want to leave the rem alone. This transformation is overeager if div is always available. Because this code runs as part of CodeGenPrepare, it's nontrivial to write a test for this change. But this will effectively be tested by a later patch which adds the aforementioned change to NVPTX isel. Reviewers: tra Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26088 llvm-svn: 285460	2016-10-28 21:43:54 +00:00
Justin Lebar	468bf73209	Don't claim the udiv created in BypassSlowDivision is exact. Summary: In BypassSlowDivision's short-dividend path, we would create e.g. udiv exact i32 %a, %b "exact" here means that we are asserting that %a is a multiple of %b. But we have no reason to believe this must be true -- this is just a bug, as far as I can tell. Reviewers: tra Subscribers: jholewinski, llvm-commits Differential Revision: https://reviews.llvm.org/D26097 llvm-svn: 285459	2016-10-28 21:43:51 +00:00
Matt Arsenault	ef00283425	SpeculativeExecution: Allow speculating more inst types Partial step towards removing the whitelist and only using TTI's cost. llvm-svn: 285438	2016-10-28 20:00:33 +00:00
George Burgess IV	013fd7315f	[MemorySSA] Add const to getClobberingMemoryAccess. Thanks to bryant for the patch! Differential Revision: https://reviews.llvm.org/D26086 llvm-svn: 285432	2016-10-28 19:22:46 +00:00
Igor Laevsky	c3ccf5d77b	[LCSSA] Perform LCSSA verification only for the current loop nest. Now LPPassManager will run LCSSA verification only for the top-level loop which was processed on the current iteration. Differential Revision: https://reviews.llvm.org/D25873 llvm-svn: 285394	2016-10-28 12:57:20 +00:00
Davide Italiano	631cd27f29	[Reassociate] Removing instructions mutates the IR. Fixes PR 30784. Discussed with Justin, who pointed out that in the new PassManager infrastructure we can have more fine-grained control on which analyses we want to preserve, but this is the best we can do with the current infrastructure. llvm-svn: 285380	2016-10-28 02:47:09 +00:00
Teresa Johnson	58fbc916a0	[ThinLTO] Rename HasSection to NoRename (NFC) Summary: This is in preparation for a change to utilize this flag for symbols referenced/defined in either inline or module level assembly. Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26048 llvm-svn: 285376	2016-10-28 02:24:59 +00:00
Sanjay Patel	c0de9c9e40	[InstCombine] fix foldSPFofSPF() to handle vector splats llvm-svn: 285345	2016-10-27 21:19:40 +00:00
Haicheng Wu	430b3e4893	[LoopUnroll] Check partial unrolling is enabled before initialization. NFC. Differential Revision: https://reviews.llvm.org/D23891 llvm-svn: 285330	2016-10-27 18:40:02 +00:00
Sanjay Patel	611f9f92fc	[InstCombine] handle simple vector integer constants in IsFreeToInvert llvm-svn: 285318	2016-10-27 17:30:50 +00:00
Dehao Chen	b94c09baa0	Add Loop Sink pass to reverse the LICM based on basic block frequency. Summary: LICM may hoist instructions to the preheader speculatively. Before code generation, we need to sink the hoisted instructions back into the loop if it's beneficial. This pass is a reverse of LICM: it looks at instructions in the preheader and sinks them to basic blocks inside the loop body if the basic block frequency is smaller than the preheader frequency. Reviewers: hfinkel, davidxl, chandlerc Subscribers: anna, modocache, mgorny, beanz, reames, dberlin, chandlerc, mcrosier, junbuml, sanjoy, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D22778 llvm-svn: 285308	2016-10-27 16:30:08 +00:00
Alexey Bataev	46c0278e7d	[SLP] Fix for PR30626: Compiler crash inside SLP Vectorizer. After a successful horizontal reduction vectorization attempt for a PHI node, the vectorizer tries to update the root binary op by combining the vectorized tree and the ReductionPHI node. But during vectorization this ReductionPHI can be vectorized itself and replaced by the `undef` value, while the instruction itself is marked for deletion. This 'marked for deletion' PHI node then can be used in a new binary operation, causing a "Use still stuck around after Def is destroyed" crash upon PHI node deletion. Also the test is fixed to make it perform actual testing. Differential Revision: https://reviews.llvm.org/D25671 llvm-svn: 285286	2016-10-27 12:02:28 +00:00
Dehao Chen	e713000eb6	Introduce updateDiscriminator interface to DILocation to make it cleaner assigning discriminators. Summary: This patch introduces updateDiscriminator to DILocation so that it can be directly called by AddDiscriminator. It also makes it easier to update the discriminator later. Reviewers: dnovillo, dblaikie, aprantl, echristo Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D25959 llvm-svn: 285207	2016-10-26 15:48:45 +00:00
Sanjay Patel	8d7196bfde	[InstCombine] clean up commonCastTransforms; NFC 1. Use 'auto' with dyn_cast. 2. Variables start with a capital letter. 3. Use proper punctuation in comments. llvm-svn: 285200	2016-10-26 14:52:35 +00:00
Andrea Di Biagio	9bcb064f19	[IndVarSimplify][DebugLoc] When widening the exit loop condition, correctly reuse the debug location of the original comparison. When the loop exit condition is canonicalized as a != comparison, reuse the debug location of the original (non canonical) comparison. Before this patch, the debug location of the new icmp was obtained from the loop latch terminator. This patch fixes the issue by correctly setting the IRBuilder's "current debug location" to the location of the original compare. Differential Revision: https://reviews.llvm.org/D25953 llvm-svn: 285185	2016-10-26 10:28:32 +00:00
Peter Collingbourne	7b7bac367c	Cloning: Also clone global variable attached metadata. llvm-svn: 285161	2016-10-26 02:57:33 +00:00
Evgeniy Stepanov	ea6d49d3ee	Utility functions for appending to llvm.used/llvm.compiler.used. llvm-svn: 285143	2016-10-25 23:53:31 +00:00
Rong Xu	33308f92eb	[PGO] Fix select instruction annotation Summary: Select instruction annotation in IR PGO uses the edge count to infer the branch count. It's currently placed in setInstrumentedCounts() where not all the BB counts have been computed. This leads to wrong branch weights. Move the annotation after all BB counts are populated. Reviewers: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25961 llvm-svn: 285128	2016-10-25 21:47:24 +00:00
Guozhi Wei	ae541f6a71	[InstCombine] Resubmit the combine of A->B->A BitCast and fix for pr27996 The original patch of the A->B->A BitCast optimization was reverted by r274094 because it may cause infinite loop inside compiler https://llvm.org/bugs/show_bug.cgi?id=27996. The problem is with following code xB = load (type B); xA = load (type A); +yA = (A)xB; B -> A +zAn = PHI[yA, xA]; PHI +zBn = (B)zAn; // A -> B store zAn; store zBn; optimizeBitCastFromPhi generates +zBn = (B)zAn; // A -> B and expects it will be combined with the following store instruction to another store zAn Unfortunately before combineStoreToValueType is called on the store instruction, optimizeBitCastFromPhi is called on the new BitCast again, and this pattern repeats indefinitely. optimizeBitCastFromPhi only generates BitCast for load/store instructions, only the BitCast before store can cause the reexecution of optimizeBitCastFromPhi, and BitCast before store can easily be handled by InstCombineLoadStoreAlloca.cpp. So the solution to the problem is if all users of a CI are store instructions, we should not do optimizeBitCastFromPhi on it. Then optimizeBitCastFromPhi will not be called on the new BitCast instructions. Differential Revision: https://reviews.llvm.org/D23896 llvm-svn: 285116	2016-10-25 20:43:42 +00:00
Sanjay Patel	f3dda13bd2	[InstCombine] Ensure that truncated int types are legal. Fixes the FIXMEs in D25952 and rL285075. Patch by bryant! Differential Revision: https://reviews.llvm.org/D25955 llvm-svn: 285108	2016-10-25 20:11:47 +00:00
Matthew Simpson	c62266d680	[LV] Sink scalar operands of predicated instructions When we predicate an instruction (div, rem, store) we place the instruction in its own basic block within the vectorized loop. If a predicated instruction has scalar operands, it's possible to recursively sink these scalar expressions into the predicated block so that they might avoid execution. This patch sinks as much scalar computation as possible into predicated blocks. We previously were able to sink such operands only if they were extractelement instructions. Differential Revision: https://reviews.llvm.org/D25632 llvm-svn: 285097	2016-10-25 18:59:45 +00:00
Michael Ilseman	e542804343	Add -strip-nonlinetable-debuginfo capability This adds a new function to DebugInfo.cpp that takes an llvm::Module as input and removes all debug info metadata that is not directly needed for line tables, thus effectively stripping all type and variable information from the module. The primary motivation for this feature was the bitcode work flow (cf. http://lists.llvm.org/pipermail/llvm-dev/2016-June/100643.html for more background). This is not wired up yet, but will be in subsequent patches. For testing, the new functionality is exposed to opt with a -strip-nonlinetable-debuginfo option. The secondary use-case (and one that works right now!) is as a reduction pass in bugpoint. I added two new bugpoint options (-disable-strip-debuginfo and -disable-strip-debug-types) to control the new features. By default it will first attempt to remove all debug information, then only the type info, and then proceed to hack at any remaining MDNodes. Thanks to Adrian Prantl for stewarding this patch! llvm-svn: 285094	2016-10-25 18:44:13 +00:00
Michael Kuperstein	cffedc4a94	Fix 80-char violations. NFC. llvm-svn: 285092	2016-10-25 18:31:23 +00:00
Dehao Chen	c1472b5092	Move discriminator assignment to where it is used. (NFC) llvm-svn: 285084	2016-10-25 16:50:27 +00:00
Andrea Di Biagio	824cabd06d	[IndVarSimplify][Dwarf] When widening the IV increment, correctly set the debug loc. When indvars widened an induction variable, the debug location for the loop increment computation was incorrectly set equal to the debug loc of the loop latch terminator. This patch fixes the issue by propagating the correct location from the original loop increment instruction to the new widened increment. Differential Revision: https://reviews.llvm.org/D25872 llvm-svn: 285083	2016-10-25 16:45:17 +00:00
Geoff Berry	91e9a5cc23	[EarlyCSE] Make MemorySSA memory dependency check more aggressive. Now that MemorySSA keeps track of whether MemoryUses are optimized, use getClobberingMemoryAccess() to check MemoryUse memory dependencies since it should no longer be so expensive. This is a follow-up change to https://reviews.llvm.org/D25881 llvm-svn: 285080	2016-10-25 16:18:47 +00:00
Sanjay Patel	e3de152530	fix formatting; NFC llvm-svn: 285078	2016-10-25 16:12:31 +00:00
Sanjay Patel	d59f7f9047	[InstCombine] add test and code comment to show potentially misguided icmp trunc transform llvm-svn: 285075	2016-10-25 15:16:39 +00:00
Peter Collingbourne	4f3b2df9bb	GlobalDCE: Restore a statement accidentally removed in r285048. llvm-svn: 285052	2016-10-25 02:57:27 +00:00
Peter Collingbourne	7695cb6da8	GlobalDCE: Deduplicate code. NFCI. llvm-svn: 285048	2016-10-25 01:58:26 +00:00
Davide Italiano	c3e0ce8f85	Merge two if conditions into one. NFCI. llvm-svn: 285008	2016-10-24 19:41:47 +00:00
Adrian Prantl	28d2d281e7	add-discriminators: Fix handling of lexical scopes. This fixes a bug in the handling of lexical scopes, when more than one scope is defined on the same line or functions are inlined into call sites that are on the same line as the function definition. This situation can easily happen in macro expansions. The problem is solved by introducing a SmallDenseMap<DIScope *, DILexicalBlockFile *, 1> that keeps track of all the different lexical scopes that share a line/file location. Fixes PR30681. llvm-svn: 284998	2016-10-24 18:23:51 +00:00
Rong Xu	b05bac940d	Check the number of Args in LibCallsShrinkWrap. Some library functions can have no arguments. llvm-svn: 284989	2016-10-24 16:50:12 +00:00
Geoff Berry	6815468768	[EarlyCSE] Optimize MemoryPhis and reduce memory clobber queries w/ MemorySSA Summary: When using MemorySSA, re-optimize MemoryPhis when removing a store since this may create MemoryPhis with all identical arguments. Also, when using MemorySSA to check if two MemoryUses are reading from the same version of the heap, use the defining access instead of calling getClobberingAccess, since the latter can currently result in many more AA calls. Once the MemorySSA use optimization tracking changes are done, we can remove this limitation, which should result in more loads being CSE'd. Reviewers: dberlin Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D25881 llvm-svn: 284984	2016-10-24 15:54:00 +00:00
Nico Weber	b38d341106	Revert 284971. It seems to break selfhost on some bots, see e.g. http://lab.llvm.org:8011/builders/clang-x86-windows-msvc2015/builds/21 http://lab.llvm.org:8011/builders/clang-ppc64be-linux-multistage/builds/20 http://lab.llvm.org:8011/builders/clang-ppc64be-linux-lnt/builds/22 llvm-svn: 284979	2016-10-24 14:52:04 +00:00
Pablo Barrio	f9e0d0b7d0	[JumpThreading] Unfold selects that depend on the same condition Summary: These are good candidates for jump threading. This enables later opts (such as InstCombine) to combine instructions from the selects with instructions out of the selects. SimplifyCFG will fold the select again if unfolding wasn't worth it. Patch by James Molloy and Pablo Barrio. Reviewers: reames, bkramer, mcrosier, gberry, haicheng, jmolloy, sebpop Subscribers: jojo, rengolin, llvm-commits Differential Revision: https://reviews.llvm.org/D25477 llvm-svn: 284971	2016-10-24 13:04:45 +00:00
Daniel Berlin	f5361139bb	Now that VS2013 is gone, make a memoryssa structure an anonymous union again llvm-svn: 284910	2016-10-22 04:15:41 +00:00
Davide Italiano	738837eed9	[CtorUtils] Modernize. No functional changes intended. llvm-svn: 284904	2016-10-22 01:21:24 +00:00
Peter Collingbourne	ecdd58f1d6	Analysis: Move llvm::getConstantRangeFromMetadata to IR library. We're about to start using it there. Differential Revision: https://reviews.llvm.org/D25877 llvm-svn: 284865	2016-10-21 19:59:26 +00:00
Anna Thomas	0860259434	[StripGCRelocates] New pass to remove gc.relocates added by RS4GC Summary: Utility pass to remove gc.relocates created by rewrite statepoints for GC. With respect to safepoint verification, the IR generated would be incorrect, and cannot run as such. This would be a single transformation on the final optimized IR. The benefit of the pass is for easy analysis when the IRs are 'polluted' by too many gc.relocates. Added tests. test run: All RS4GC tests with -verify option. Local downstream tests on large IR files. This also works when the pointer being gc.relocated is another gc.relocate. Reviewers: sanjoy, reames Subscribers: beanz, mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D25096 llvm-svn: 284855	2016-10-21 18:43:16 +00:00
John Brawn	84b21835f1	[LoopUnroll] Keep the loop test only on the first iteration of max-or-zero loops When we have a loop with a known upper bound on the number of iterations, and furthermore know that the number of iterations will be either exactly that upper bound or zero, then we can fully unroll up to that upper bound keeping only the first loop test to check for the zero iteration case. Most of the work here is in plumbing this 'max-or-zero' information from the part of scalar evolution where it's detected through to loop unrolling. I've also gone for the safe default of 'false' everywhere but howManyLessThans which could probably be improved. Differential Revision: https://reviews.llvm.org/D25682 llvm-svn: 284818	2016-10-21 11:08:48 +00:00
Davide Italiano	d15477b09d	Revert "[GVN/PRE] Hoist global values outside of loops." There's no agreement about this patch. I personally find the PRE machinery of the current GVN hard enough to reason about that I'm not sure I'll try to land this again, instead of working on the rewrite. llvm-svn: 284796	2016-10-21 01:37:02 +00:00
Daniel Berlin	cd2deacac6	[MSSA] Avoid unnecessary use walks when calling getClobberingMemoryAccess Summary: This allows us to mark when uses have been optimized. This lets us avoid rewalking (IE when people call getClobberingAccess on everything), and also enables us to later relax the requirement of use optimization during updates with less cost. Reviewers: george.burgess.iv Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25172 llvm-svn: 284771	2016-10-20 20:13:45 +00:00
Benjamin Kramer	26b2593b24	[GVN] Use defaulted members. No functional change. llvm-svn: 284726	2016-10-20 13:09:12 +00:00
Benjamin Kramer	2a8bef8769	Do a sweep over move ctors and remove those that are identical to the default. All of these existed because MSVC 2013 was unable to synthesize default move ctors. We recently dropped support for it so all that error-prone boilerplate can go. No functionality change intended. llvm-svn: 284721	2016-10-20 12:20:28 +00:00
Artur Pilipenko	5c6ef75485	[IndVarSimplify] Teach calculatePostIncRange to take guards into account Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D25739 llvm-svn: 284632	2016-10-19 19:43:54 +00:00
Matthew Simpson	41fa838f07	[LV] Avoid emitting trivially dead instructions Some instructions from the original loop, when vectorized, can become trivially dead. This happens because of the way we structure the new loop. For example, we create new induction variables and induction variable "steps" in the new loop. Thus, when we go to vectorize the original induction variable update, it may no longer be needed due to the instructions we've already created. This patch prevents us from creating these redundant instructions. This reduces code size before simplification and allows greater flexibility in code generation since we have fewer unnecessary instruction uses. Differential Revision: https://reviews.llvm.org/D25631 llvm-svn: 284631	2016-10-19 19:22:02 +00:00
Artur Pilipenko	f2d5dc5dc6	[IndVarSimplify] Use control-dependent range information to prove non-negativity This change is motivated by the case when IndVarSimplify doesn't widen a comparison of IV increment because it can't prove IV increment being non-negative. We end up with a redundant trunc of the widened increment on this example. for.body: %i = phi i32 [ %start, %for.body.lr.ph ], [ %i.inc, %for.inc ] %within_limits = icmp ult i32 %i, 64 br i1 %within_limits, label %continue, label %for.end continue: %i.i64 = zext i32 %i to i64 %arrayidx = getelementptr inbounds i32, i32* %base, i64 %i.i64 %val = load i32, i32* %arrayidx, align 4 br label %for.inc for.inc: %i.inc = add nsw nuw i32 %i, 1 %cmp = icmp slt i32 %i.inc, %limit br i1 %cmp, label %for.body, label %for.end There is a range check inside of the loop which guarantees the IV to be non-negative. NSW on the increment guarantees that the increment is also non-negative. Teach IndVarSimplify to use the range check to prove non-negativity of loop increments. Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D25738 llvm-svn: 284629	2016-10-19 18:59:03 +00:00
Vitaly Buka	490fda3366	[asan] Replace std::to_string with llvm::to_string llvm-svn: 284557	2016-10-19 00:16:56 +00:00
Vitaly Buka	5910a92560	[asan] Simplify calculation of stack frame layout extraction calculation of stack description into separate function. Reviewers: eugenis Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25754 llvm-svn: 284547	2016-10-18 23:29:52 +00:00
Vitaly Buka	d88e52012b	[asan] Append line number to variable name if line is available and in the same file as the function. PR30498 Reviewers: eugenis Differential Revision: https://reviews.llvm.org/D25715 llvm-svn: 284546	2016-10-18 23:29:41 +00:00
Rong Xu	1c0e9b97d2	Conditionally eliminate library calls where the result value is not used Summary: This pass shrink-wraps a condition to some library calls where the call result is not used. For example: sqrt(val); is transformed to if (val < 0) sqrt(val); Even if the result of the library call is not being used, the compiler cannot safely delete the call because the function can set errno on error conditions. Note that in many functions, the error condition solely depends on the incoming parameter. In this optimization, we can generate the condition that can lead to errno being set and use it to shrink-wrap the call. Since the chance of hitting the error condition is low, the runtime call is effectively eliminated. These partially dead calls are usually results of C++ abstraction penalty exposed by inlining. This optimization hits 108 times in 19 C/C++ programs in SPEC2006. Reviewers: hfinkel, mehdi_amini, davidxl Subscribers: modocache, mgorny, mehdi_amini, xur, llvm-commits, beanz Differential Revision: https://reviews.llvm.org/D24414 llvm-svn: 284542	2016-10-18 21:36:27 +00:00
Davide Italiano	36efa68463	[GVN] Consistently use division instead of shift. NFCI. This is in line with other places of GVN (e.g. load coercion logic). llvm-svn: 284535	2016-10-18 21:02:27 +00:00
Davide Italiano	64cd985e44	[GVN] Remove dead code. NFC. llvm-svn: 284534	2016-10-18 21:00:26 +00:00
Benjamin Kramer	ee042234ae	[esan] Remove global variable. It's not thread safe and completely unnecessary. llvm-svn: 284520	2016-10-18 19:39:23 +00:00
Vitaly Buka	8e1906ea7e	[asan] Make -asan-experimental-poisoning the only behavior Reviewers: eugenis Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25735 llvm-svn: 284505	2016-10-18 18:04:59 +00:00
Dehao Chen	018a3afa99	Ignore debug info when making optimization decisions in SimplifyCFG. Summary: Debug info should not affect code generation. This patch properly handles debug info to make sure the generated code is the same with or without debug info. Reviewers: davidxl, mzolotukhin, jmolloy Subscribers: aprantl, llvm-commits Differential Revision: https://reviews.llvm.org/D25286 llvm-svn: 284415	2016-10-17 19:28:44 +00:00
Oliver Stannard	fe4432b105	[SimplifyCFG] Don't lower complex ConstantExprs to lookup tables Not all ConstantExprs can be represented by a global variable, for example most pointer arithmetic other than addition of a constant, so we can't convert these values from switch statements to lookup tables. Differential Revision: https://reviews.llvm.org/D25550 llvm-svn: 284379	2016-10-17 12:00:24 +00:00
Davide Italiano	590ad7037e	[GVN/PRE] Hoist global values outside of loops. In theory this could be generalized to move anything where we prove the operands are available, but that would require rewriting PRE. As NewGVN will hopefully come soon, and we're trying to rewrite PRE in terms of NewGVN+MemorySSA, it's probably not worth spending too much time on it. Fix provided by Daniel Berlin! llvm-svn: 284311	2016-10-15 21:35:23 +00:00
Benjamin Kramer	d8b079708d	[SimplifyCFG] Use the error checking provided by getPrevNode. BasicBlock::size is O(insts), making this loop O(blocks*insts), which can be really slow on generated code. getPrevNode already checks if we're at the beginning of the block and returns nullptr if so, just use that instead. No functionality change intended. llvm-svn: 284303	2016-10-15 13:15:05 +00:00
Evgeny Astigeevich	48fd87e4aa	[NFC] Loop Versioning for LICM code clean up - Removed unused class members. - Made class internal data private. - Made class scoped data function scoped where it's possible. - Replace naked new/delete with unique_ptr. - Made resources guaranteed to be freed. Differential Revision: https://reviews.llvm.org/D25464 llvm-svn: 284290	2016-10-14 23:00:36 +00:00
Sanjay Patel	6d6eca5cdc	[InstCombine] use m_APInt to allow sub with constant folds for splat vectors llvm-svn: 284247	2016-10-14 16:31:54 +00:00
Sanjay Patel	c6c5965a42	[InstCombine] sub X, sext(bool Y) -> add X, zext(bool Y) Prefer add/zext because they are better supported in terms of value-tracking. Note that the backend should be prepared for this IR canonicalization (including vector types) after: https://reviews.llvm.org/rL284015 Differential Revision: https://reviews.llvm.org/D25135 llvm-svn: 284241	2016-10-14 15:24:31 +00:00
Matthew Simpson	1d4b163fc0	[LV] Account for predicated stores in instruction costs This patch ensures that we scale the estimated cost of predicated stores by block probability. This is a follow-on patch for r284123. llvm-svn: 284126	2016-10-13 14:54:31 +00:00
Matthew Simpson	6cdb5a6f96	[LV] Avoid rounding errors for predicated instruction costs This patch modifies the cost calculation of predicated instructions (div and rem) to avoid the accumulation of rounding errors due to multiple truncating integer divisions. The calculation for predicated stores will be addressed in a follow-on patch since we currently don't scale the cost of predicated stores by block probability. Differential Revision: https://reviews.llvm.org/D25333 llvm-svn: 284123	2016-10-13 14:19:48 +00:00
Sebastian Pop	5068d7a338	Memory-SSA: strengthen defClobbersUseOrDef interface As Danny pointed out, defClobbersUseOrDef should use MemoryLocOrCall to make sure fences are properly handled. llvm-svn: 284099	2016-10-13 03:23:33 +00:00
Sebastian Pop	5ba9f24ed7	commit back "GVN-hoist: fix store past load dependence analysis (PR30216, PR30499)" This is with an extra change to avoid calling MemoryLocation::get() on a call instruction. Differential Revision: https://reviews.llvm.org/D25542 llvm-svn: 284098	2016-10-13 01:39:10 +00:00
Reid Kleckner	8958f6a529	Revert "GVN-hoist: fix store past load dependence analysis (PR30216, PR30499)" This CL didn't actually address the test case in PR30499, and clang still crashes. Also revert dependent change "Memory-SSA cleanup of clobbers interface, NFC" Reverts r283965 and r283967. llvm-svn: 284093	2016-10-13 00:18:26 +00:00
Haicheng Wu	1ef17e90b2	Reapply "[LoopUnroll] Use the upper bound of the loop trip count to fully unroll a loop" Reapply r284044 after revert in r284051. Krzysztof fixed the error in r284049. The original summary: This patch tries to fully unroll loops having break statement like this for (int i = 0; i < 8; i++) { if (a[i] == value) { found = true; break; } } GCC can fully unroll such loops, but currently LLVM cannot because LLVM only supports loops having exact constant trip counts. The upper bound of the trip count can be obtained from calling ScalarEvolution::getMaxBackedgeTakenCount(). Part of the patch is the refactoring work in SCEV to prevent duplicating code. The feature of using the upper bound is enabled under the same circumstance when runtime unrolling is enabled since both are used to unroll loops without knowing the exact constant trip count. llvm-svn: 284053	2016-10-12 21:29:38 +00:00
Haicheng Wu	45e4ef737d	Revert "[LoopUnroll] Use the upper bound of the loop trip count to fully unroll a loop" This reverts commit r284044. llvm-svn: 284051	2016-10-12 21:02:22 +00:00
Haicheng Wu	6cac34fd41	[LoopUnroll] Use the upper bound of the loop trip count to fully unroll a loop This patch tries to fully unroll loops having break statement like this for (int i = 0; i < 8; i++) { if (a[i] == value) { found = true; break; } } GCC can fully unroll such loops, but currently LLVM cannot because LLVM only supports loops having exact constant trip counts. The upper bound of the trip count can be obtained from calling ScalarEvolution::getMaxBackedgeTakenCount(). Part of the patch is the refactoring work in SCEV to prevent duplicating code. The feature of using the upper bound is enabled under the same circumstance when runtime unrolling is enabled since both are used to unroll loops without knowing the exact constant trip count. Differential Revision: https://reviews.llvm.org/D24790 llvm-svn: 284044	2016-10-12 20:24:32 +00:00
Sanjoy Das	bc357e8fa3	[SimplifyCFG] Don't create PHI nodes for constant bundle operands Summary: Constant bundle operands may need to retain their constant-ness for correctness. I'll admit that this is slightly odd, but it looks like SimplifyCFG already does this for things like @llvm.frameaddress and @llvm.stackmap, so I suppose adding one more case is not a big deal. It is possible to add a mechanism to denote bundle operands that need to remain constants, but that's probably too complicated for the time being. Reviewers: jmolloy Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D25502 llvm-svn: 284028	2016-10-12 18:15:33 +00:00
Chad Rosier	c215c3fd14	[CVP] Convert an AShr to a LShr if 1st operand is known to be nonnegative. An arithmetic shift can be safely changed to a logical shift if the first operand is known positive. This allows ComputeKnownBits (and similar analysis) to determine the sign bit of the shifted value in some cases. In turn, this allows InstCombine to canonicalize a signed comparison (a > 0) into an equality check (a != 0). PR30577 Differential Revision: https://reviews.llvm.org/D25119 llvm-svn: 284013	2016-10-12 13:41:38 +00:00
Simon Pilgrim	fd0d7b21e0	[InstCombine] Fix constexpr issue in select combining As discussed by Andrea on PR30486, we have an unsafe cast to an Instruction type in the select combine which doesn't take into account that it could be a ConstantExpr instead. Differential Revision: https://reviews.llvm.org/D25466 llvm-svn: 284000	2016-10-12 10:20:15 +00:00
Sebastian Pop	d57d93c9de	Memory-SSA cleanup of clobbers interface, NFC This implements the cleanup that Danny asked to commit separately from the previous fix to GVN-hoist in https://reviews.llvm.org/D25476#inline-219818 Tested with ninja check on x86_64-linux. llvm-svn: 283967	2016-10-12 03:08:40 +00:00
Sebastian Pop	ab12fb62ee	GVN-hoist: fix store past load dependence analysis (PR30216, PR30499) This is a refreshed version of a patch that was reverted: it fixes the problems reported in both PR30216 and PR30499, and contains all the test-cases from both bugs. To hoist stores past loads, we used to search for potential conflicting loads on the hoisting path by following a MemorySSA def-def link from the store to be hoisted to the previous defining memory access, and from there we followed the def-use chains to all the uses that occur on the hoisting path. The problem is that the def-def link may point to a store that does not alias with the store to be hoisted, and so the loads that are walked may not alias with the store to be hoisted, and even as in the testcase of PR30216, the loads that may alias with the store to be hoisted are not visited. The current patch visits all loads on the path from the store to be hoisted to the hoisting position and uses the alias analysis to ask whether the store may alias the load. I was not able to use the MemorySSA functionality to ask for whether load and store are clobbered: I'm not sure which function to call, so I used a call to AA->isNoAlias(). Store past store is still working as before using a MemorySSA query: I added an extra test to pr30216.ll to make sure store past store does not regress. Tested on x86_64-linux with check and a test-suite run. Differential Revision: https://reviews.llvm.org/D25476 llvm-svn: 283965	2016-10-12 02:23:39 +00:00
Kostya Serebryany	4d25ad93f3	[sanitizer-coverage] use private linkage for coverage guards, delete old commented-out code. llvm-svn: 283924	2016-10-11 19:36:50 +00:00
Igor Laevsky	04423cf785	[LCSSA] Implement linear algorithm for the isRecursivelyLCSSAForm For each block check that it doesn't have any uses outside of its innermost loop. Differential Revision: https://reviews.llvm.org/D25364 llvm-svn: 283877	2016-10-11 13:37:22 +00:00
David Majnemer	80dca0c78f	[InstCombine] Transform !range metadata to !nonnull when combining loads When combining an integer load with !range metadata that does not include 0 to a pointer load, make sure to emit !nonnull metadata on the newly-created pointer load. This prevents the !nonnull metadata from being dropped during a ptrtoint/inttoptr pair. This fixes PR30597. Patch by Ariel Ben-Yehuda! Differential Revision: https://reviews.llvm.org/D25215 llvm-svn: 283836	2016-10-11 01:00:45 +00:00
Mehdi Amini	732afdd09a	Turn cl::values() (for enum) from a vararg function to using C++ variadic template The core of the change is supposed to be NFC, however it also fixes what I believe was an undefined behavior when calling: va_start(ValueArgs, Desc); with Desc being a StringRef. Differential Revision: https://reviews.llvm.org/D25342 llvm-svn: 283671	2016-10-08 19:41:06 +00:00
Gor Nishanov	1b6aec8e25	[coroutines] Store an address of destroy OR cleanup part in the coroutine frame. Summary: If heap allocation of a coroutine is elided, we need to make sure that we will update an address stored in the coroutine frame from f.destroy to f.cleanup. Before this change, CoroSplit synthesized these stores after coro.begin: ``` store void (%f.Frame) @f.resume, void (%f.Frame)* %resume.addr store void (%f.Frame) @f.destroy, void (%f.Frame)* %destroy.addr ``` In those cases where we did heap elision, but were not able to devirtualize all indirect calls, destroy call will attempt to "free" the coroutine frame stored on the stack. Oops. Now we use select to put an appropriate coroutine subfunction in the destroy slot. As below: ``` store void (%f.Frame) @f.resume, void (%f.Frame)* %resume.addr %0 = select i1 %need.alloc, void (%f.Frame) @f.destroy, void (%f.Frame) @f.cleanup store void (%f.Frame) %0, void (%f.Frame)* %destroy.addr ``` Reviewers: majnemer Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D25377 llvm-svn: 283625	2016-10-08 00:22:50 +00:00
Davide Italiano	f6988d2980	[InstCombine] Don't unpack arrays that are too large (part 2). This is similar to r283599, but for store instructions. Thanks to David for pointing out! llvm-svn: 283612	2016-10-07 21:53:09 +00:00
Davide Italiano	da11412243	[InstCombine] Don't unpack arrays that are too large Differential Revision: https://reviews.llvm.org/D25376 llvm-svn: 283599	2016-10-07 20:57:42 +00:00
Davide Italiano	c0169fa94f	[LoopIdiomRecognize] Merge two if conditions into one. NFCI. llvm-svn: 283579	2016-10-07 18:39:43 +00:00
Sanjay Patel	4326c4ac8f	[InstCombine] fold select X, (ext X), C If we're going to canonicalize IR towards select of constants, try harder to create those. Also, don't lose the metadata. This is actually 4 related transforms in one patch: // select X, (sext X), C --> select X, -1, C // select X, (zext X), C --> select X, 1, C // select X, C, (sext X) --> select X, C, 0 // select X, C, (zext X) --> select X, C, 0 Differential Revision: https://reviews.llvm.org/D25126 llvm-svn: 283575	2016-10-07 17:53:07 +00:00
Dehao Chen	6e0c8446db	Invoke add-discriminator at -g0 -fsample-profile Summary: -fsample-profile needs discriminator, which will not be added if built with -g0. This patch makes sure the discriminator is added for sample-profile at -g0. A followup patch will be sent out to update clang tests. Reviewers: davidxl, dblaikie, echristo, dnovillo Subscribers: mehdi_amini, probinson, llvm-commits Differential Revision: https://reviews.llvm.org/D25132 llvm-svn: 283565	2016-10-07 15:21:31 +00:00
Matthew Simpson	a371c14ffe	[LV] Don't mark multi-use branch conditions uniform Previously, we marked the branch conditions of latch blocks uniform after vectorization if they were instructions contained in the loop. However, if a condition instruction has users other than the branch, it may not remain uniform. This patch ensures the conditions we mark uniform are only used by the branch. This should fix PR30627. Reference: https://llvm.org/bugs/show_bug.cgi?id=30627 llvm-svn: 283563	2016-10-07 15:20:13 +00:00
Alexey Bataev	6ad5da7c81	[SLPVectorizer] Fix for PR25748: reduction vectorization after loop unrolling. The next code is not vectorized by the SLPVectorizer: ``` int test(unsigned int *p) { int sum = 0; for (int i = 0; i < 8; i++) sum += p[i]; return sum; } ``` During optimization this loop is fully unrolled and SLPVectorizer is unable to vectorize it. Patch tries to fix this problem. Differential Revision: https://reviews.llvm.org/D24796 llvm-svn: 283535	2016-10-07 09:39:22 +00:00
Oliver Stannard	4df1cc0b00	[ARM] Don't convert switches to lookup tables of pointers with ROPI/RWPI With the ROPI and RWPI relocation models we can't always have pointers to global data or functions in constant data, so don't try to convert switches into lookup tables if any value in the lookup table would require a relocation. We can still safely emit lookup tables of other values, such as simple constants. Differential Revision: https://reviews.llvm.org/D24462 llvm-svn: 283530	2016-10-07 08:48:24 +00:00
David Majnemer	8c03c1bade	[SimplifyCFG] Correctly test for unconditional branches in GetCaseResults GetCaseResults assumed that a terminator with one successor was an unconditional branch. This is not necessarily the case, it could be a cleanupret. Strengthen the check by querying whether or not the terminator is exceptional. llvm-svn: 283517	2016-10-07 01:38:35 +00:00
Rong Xu	0e79f7d11d	[PGO] Create weak alias for the renamed Comdat function Add a weak alias to the renamed Comdat function in IR level instrumentation, using its original name. This ensures the same behavior w/ and w/o IR instrumentation, even for non standard conforming code. Differential Revision: http://reviews.llvm.org/D25339 llvm-svn: 283490	2016-10-06 20:38:13 +00:00
Michael Ilseman	6d6b4d87a3	Revert "Add -strip-nonlinetable-debuginfo capability" This reverts commit r283473. Reverted until review is completed. llvm-svn: 283478	2016-10-06 18:30:26 +00:00
Michael Ilseman	d0a4db7632	Add -strip-nonlinetable-debuginfo capability This adds a new function to DebugInfo.cpp that takes an llvm::Module as input and removes all debug info metadata that is not directly needed for line tables, thus effectively stripping all type and variable information from the module. The primary motivation for this feature was the bitcode work flow (cf. http://lists.llvm.org/pipermail/llvm-dev/2016-June/100643.html for more background). This is not wired up yet, but will be in subsequent patches. For testing, the new functionality is exposed to opt with a -strip-nonlinetable-debuginfo option. The secondary use-case (and one that works right now!) is as a reduction pass in bugpoint. I added two new bugpoint options (-disable-strip-debuginfo and -disable-strip-debug-types) to control the new features. By default it will first attempt to remove all debug information, then only the type info, and then proceed to hack at any remaining MDNodes. llvm-svn: 283473	2016-10-06 17:58:38 +00:00
Anna Thomas	488c05763c	[RS4GC] Fix comment to show TODO. NFC llvm-svn: 283449	2016-10-06 13:24:20 +00:00
Sagar Thakur	f9292220dc	[EfficiencySanitizer] Adds shadow memory parameters for 40-bit virtual memory address. Adding 40-bit shadow memory parameters because MIPS64 uses 40-bit virtual memory addresses. Reviewed by rengolin. Differential: https://reviews.llvm.org/D23801 llvm-svn: 283433	2016-10-06 09:52:06 +00:00
David Callahan	c1051ab26e	Modify df_iterator to support post-order actions Summary: This makes a change to the state used to maintain visited information for depth first iterator. We now assume a method "completed(...)" which is called after all children of a node have been visited. In all existing cases, this method does nothing so this patch has no functional changes. It will however allow a client to distinguish back from cross edges in a DFS tree. Reviewers: nadav, mehdi_amini, dberlin Subscribers: MatzeB, mzolotukhin, twoh, freik, llvm-commits Differential Revision: https://reviews.llvm.org/D25191 llvm-svn: 283391	2016-10-05 21:36:16 +00:00
Anna Zaks	9a6a6eff0e	[asan] Reapply: Switch to using dynamic shadow offset on iOS The VM layout is not stable between iOS version releases, so switch to dynamic shadow offset. This is the LLVM counterpart of https://reviews.llvm.org/D25218 Differential Revision: https://reviews.llvm.org/D25219 llvm-svn: 283376	2016-10-05 20:34:13 +00:00
Matthew Simpson	a58c50dff0	[LV] Pass profitability analysis in vectorizer constructor (NFC) The vectorizer already holds a pointer to one cost model artifact in a member variable (i.e., MinBWs). As we add more, it will be easier to communicate these artifacts to the vectorizer if we simply pass a pointer to the cost model instead. llvm-svn: 283373	2016-10-05 20:23:46 +00:00
Matthew Simpson	386546124f	[LV] Pass legality analysis in vectorizer constructor (NFC) The vectorizer already holds a pointer to the legality analysis in a member variable, so it makes sense that we would pass it in the constructor. llvm-svn: 283368	2016-10-05 19:53:20 +00:00
Matthew Simpson	6a8e0bcf3d	[LV] Remove obsolete comment (NFC) llvm-svn: 283365	2016-10-05 19:19:49 +00:00
Matthew Simpson	ee3fdc7e26	[LV] Use getScalarizationOverhead in memory instruction costs (NFC) This patch refactors the cost estimation of scalarized loads and stores to reuse getScalarizationOverhead for the cost of the extractelement and insertelement instructions we might create. The existing code accounted for this cost, but it was functionally equivalent to the helper function. llvm-svn: 283364	2016-10-05 19:11:54 +00:00
Matthew Simpson	1755d81b29	[LV] Add helper function for predicated block probability (NFC) The cost model has to estimate the probability of executing predicated blocks. However, we currently always assume predicated blocks have a 50% chance of executing (this value is hardcoded in several places throughout the code). Since we always use the same value, this patch adds a helper function for getting this uniform probability. The function simplifies some comments and makes our assumptions more clear. In the future, we may want to extend this with actual block probability information if it's available. llvm-svn: 283354	2016-10-05 18:30:36 +00:00
Matthew Simpson	c631167609	[LV] Add isScalarWithPredication helper function (NFC) This patch adds a single helper function for checking if an instruction will be scalarized with predication. Such instructions include conditional stores and instructions that may divide by zero. Existing checks have been updated to use the new function. llvm-svn: 283350	2016-10-05 17:52:34 +00:00
Anna Zaks	e732ce4dff	Revert "[asan] LLVM: Switch to using dynamic shadow offset on iOS" This reverts commit abe77a118615cd90b0d7f127e4797096afa2b394. Revert as these changes broke a Chromium buildbot. llvm-svn: 283348	2016-10-05 17:42:02 +00:00
Mehdi Amini	a6f81ca8ea	Use StringRef in ARCRuntimeEntryPoints APIs (NFC) llvm-svn: 283288	2016-10-05 01:15:04 +00:00
Michael Zolotukhin	5cda89ad36	[LoopDistribute] Fix a typo in the pass name. llvm-svn: 283282	2016-10-05 00:44:52 +00:00
Anna Zaks	ef97d2c589	[asan] LLVM: Switch to using dynamic shadow offset on iOS The VM layout is not stable between iOS version releases, so switch to dynamic shadow offset. This is the LLVM counterpart of https://reviews.llvm.org/D25218 Differential Revision: https://reviews.llvm.org/D25219 llvm-svn: 283239	2016-10-04 19:02:29 +00:00
Anna Thomas	479cbb9405	[RS4GC] Handle ShuffleVector instruction in findBasePointer Summary: This patch modifies the findBasePointer to handle the shufflevector instruction. Tests run: RS4GC tests, local downstream tests. Reviewers: reames, sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25197 llvm-svn: 283219	2016-10-04 13:48:37 +00:00
Sanjoy Das	0359a193a7	[PruneEH] Be correct in the face of IPO This fixes one spot I had missed in r265762. Credit goes to Philip Reames for spotting this one! llvm-svn: 283137	2016-10-03 19:35:30 +00:00
Dehao Chen	92abc7e9f2	Refactor LICM pass in preparation for LoopSink pass. Summary: LoopSink pass uses some common functions in LICM. This patch refactors the LICM code to make it usable by LoopSink pass (https://reviews.llvm.org/D22778). Reviewers: davidxl, danielcdh, hfinkel, chandlerc Subscribers: hfinkel, llvm-commits Differential Revision: https://reviews.llvm.org/D24168 llvm-svn: 283134	2016-10-03 18:52:08 +00:00
Hans Wennborg	b4d2678c6f	Jump threading: avoid trying to split edge into landingpad block (PR27840) Splitting the edge is nontrivial because of the landing pad, and we would currently assert trying to do it. Differential Revision: https://reviews.llvm.org/D24680 llvm-svn: 283129	2016-10-03 18:18:04 +00:00
Volkan Keles	1c38681ae6	Add new target hooks for LoadStoreVectorizer Summary: Added 6 new target hooks for the vectorizer in order to filter types, handle size constraints and decide how to split chains. Reviewers: tstellarAMD, arsenm Subscribers: arsenm, mzolotukhin, wdng, llvm-commits, nhaehnle Differential Revision: https://reviews.llvm.org/D24727 llvm-svn: 283099	2016-10-03 10:31:34 +00:00
Sanjoy Das	1f7b813e2b	Remove duplicated code; NFC ICmpInst::makeConstantRange does exactly the same thing as ConstantRange::makeExactICmpRegion. llvm-svn: 283059	2016-10-02 00:09:57 +00:00
Mehdi Amini	117296c0a0	Use StringRef in Pass/PassManager APIs (NFC) llvm-svn: 283004	2016-10-01 02:56:57 +00:00
Mehdi Amini	6610b01a27	[ASAN] Add the binder globals on Darwin to llvm.compiler.used to avoid LTO dead-stripping The binder is in a specific section that "reverse" the edges in a regular dead-stripping: the binder is live as long as a global it references is live. This is a big hammer that prevents LLVM from dead-stripping these, while still allowing linker dead-stripping (with special knowledge of the section). Differential Revision: https://reviews.llvm.org/D24673 llvm-svn: 282988	2016-10-01 00:05:34 +00:00
Sanjay Patel	f7b851fe84	[InstCombine] allow non-splat folds of select cond (ext X), C llvm-svn: 282906	2016-09-30 19:49:22 +00:00
Gor Nishanov	a263a60ad5	[Coroutines] Part15c: Fix coro-split to correctly handle definitions between coro.save and coro.suspend Summary: In the case below, %Result.i19 is defined between coro.save and coro.suspend and used after coro.suspend. We need to correctly place such a value into the coroutine frame. ``` %save = call token @llvm.coro.save(i8* null) %Result.i19 = getelementptr inbounds %"struct.lean_future<int>::Awaiter", %"struct.lean_future<int>::Awaiter"* %ref.tmp7, i64 0, i32 0 %suspend = call i8 @llvm.coro.suspend(token %save, i1 false) switch i8 %suspend, label %exit [ i8 0, label %await.ready i8 1, label %exit ] await.ready: %val = load i32, i32* %Result.i19 ``` Reviewers: majnemer Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D24418 llvm-svn: 282902	2016-09-30 19:24:19 +00:00
Gor Nishanov	c16219486a	[Coroutines] Part15b: Fix dbg information handling in coro-split. Summary: Without the fix, if there was a function inlined into the coroutine with debug information, CloneFunctionInto(NewF, &F, VMap, /*ModuleLevelChanges=*/true, Returns); would duplicate all of the debug information including the DICompileUnit. We now use VMap to indicate that debug metadata for a File, Unit and FunctionType should not be duplicated when we are creating clones that will become f.resume, f.destroy and f.cleanup. Reviewers: majnemer Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D24417 llvm-svn: 282899	2016-09-30 19:05:06 +00:00
Gor Nishanov	768de2c604	[Coroutines] Part 15a: Lower coro.subfn.addr in CoroCleanup Summary: Not all coro.subfn.addr intrinsics can be eliminated in CoroElide through devirtualization. Those that remain need to be lowered in CoroCleanup. Reviewers: majnemer Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D24412 llvm-svn: 282897	2016-09-30 18:41:35 +00:00
Dehao Chen	977853b7c5	Update loop unroller cost model to make sure debug info does not affect optimization decisions. Summary: Debug info should not affect optimization decisions. This patch updates loop unroller cost model to make it not affected by debug info. Reviewers: davidxl, mzolotukhin Subscribers: haicheng, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D25098 llvm-svn: 282894	2016-09-30 18:30:04 +00:00
Etienne Bergeron	0ca0568604	[asan] Support dynamic shadow address instrumentation Summary: This patch is adding the support for a shadow memory with dynamically allocated address range. The compiler-rt needs to export a symbol containing the shadow memory range. This is required to support ASAN on windows 64-bits. Reviewers: kcc, rnk, vitalybuka Subscribers: zaks.anna, kubabrecka, dberris, llvm-commits, chrisha Differential Revision: https://reviews.llvm.org/D23354 llvm-svn: 282881	2016-09-30 17:46:32 +00:00
Artur Pilipenko	2af93490fb	CVP. Turn marking adds as no wrap on by default (was turned off by 279082) With 282650 in tree extra no wrap on adds doesn't cause regressions anymore. Reenable the optimization. llvm-svn: 282872	2016-09-30 16:20:08 +00:00
Matthew Simpson	7808833e28	[LV] Build all scalar steps for non-uniform induction variables When building the steps for scalar induction variables, we previously attempted to determine if all the scalar users of the induction variable were uniform. If they were, we would only emit the step corresponding to vector lane zero. This optimization was too aggressive. We generally don't know the entire set of induction variable users that will be scalar. We have isScalarAfterVectorization, but this is only a conservative estimate of the instructions that will be scalarized. Thus, an induction variable may have scalar users that aren't already known to be scalar. To avoid emitting unused steps, we can only check that the induction variable is uniform. This should fix PR30542. Reference: https://llvm.org/bugs/show_bug.cgi?id=30542 llvm-svn: 282863	2016-09-30 15:13:52 +00:00
Adam Nemet	f744ad78e9	[LDist] Port to new streaming API for opt remarks llvm-svn: 282838	2016-09-30 04:56:25 +00:00
Adam Nemet	f57cc62abf	[LoopUnroll] Port to the new streaming interface for opt remarks. llvm-svn: 282834	2016-09-30 03:44:16 +00:00
Piotr Padlewski	d28694739c	[thinlto] Don't decay threshold for hot callsites Summary: We don't want to decay hot callsites to import chains of hot callsites. The same mechanism is used in LIPO. Reviewers: tejohnson, eraman, mehdi_amini Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D24976 llvm-svn: 282833	2016-09-30 03:01:17 +00:00
Adam Nemet	fce0178847	[LoopDataPrefetch] Port to new streaming API for opt remarks llvm-svn: 282826	2016-09-30 00:42:43 +00:00
Adam Nemet	951c6b1955	[LV] Port the remarks in processLoop to the new streaming API This completes LV. llvm-svn: 282821	2016-09-30 00:29:30 +00:00
Adam Nemet	4fd9c42279	[LV] Port the last opt remark in Hints to the new streaming interface llvm-svn: 282820	2016-09-30 00:29:25 +00:00
Adam Nemet	877ccee8cc	[LAA, LV] Port to new streaming interface for opt remarks. Update LV (Recommit after making sure IsVerbose gets properly initialized in DiagnosticInfoOptimizationBase. See previous commit that takes care of this.) OptimizationRemarkAnalysis directly takes the role of the report that is generated by LAA. Then we need the magic to be able to turn an LAA remark into an LV remark. This is done via a new OptimizationRemark ctor. llvm-svn: 282813	2016-09-30 00:01:30 +00:00
Sanjay Patel	453ceff261	[InstCombine] fix function names; NFC Also, make foldSelectExtConst() a member of InstCombiner, remove unnecessary parameters from its interface, and group visitSelectInst helpers together in the header file. llvm-svn: 282796	2016-09-29 22:18:30 +00:00
Adam Nemet	556a06b1ee	Revert "[LAA, LV] Port to new streaming interface for opt remarks. Update LV" This reverts commit r282758. There are some clang failures I haven't seen. llvm-svn: 282759	2016-09-29 20:17:37 +00:00
Adam Nemet	c1d21817d1	[LAA, LV] Port to new streaming interface for opt remarks. Update LV OptimizationRemarkAnalysis directly takes the role of the report that is generated by LAA. Then we need the magic to be able to turn an LAA remark into an LV remark. This is done via a new OptimizationRemark ctor. llvm-svn: 282758	2016-09-29 20:12:18 +00:00
Adam Nemet	3628282a77	[LV] Port OptimizationRemarkAnalysisFPCommute and OptimizationRemarkAnalysisAliasing to new streaming API for opt remarks llvm-svn: 282742	2016-09-29 18:04:47 +00:00
Adam Nemet	6e1edd5d1f	[LV] Convert processLoop to new streaming API for opt remarks llvm-svn: 282740	2016-09-29 17:55:13 +00:00
Sanjay Patel	ccc2927b69	fix formatting; NFC llvm-svn: 282737	2016-09-29 17:48:19 +00:00
Kostya Serebryany	a9b0dd0e51	[sanitizer-coverage/libFuzzer] make the guards for trace-pc 32-bit; create one array of guards per function, instead of one guard per BB. reorganize the code so that trace-pc-guard does not create unneeded globals llvm-svn: 282735	2016-09-29 17:43:24 +00:00
Piotr Padlewski	ba72b95f7b	[thinlto] Add cold-callsite import heuristic Summary: The heuristic is not yet tuned, but even this small heuristic gives about a +0.10% improvement on SPEC 2006 Reviewers: tejohnson, mehdi_amini, eraman Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D24940 llvm-svn: 282733	2016-09-29 17:32:07 +00:00
Adam Nemet	eb0ba8d50f	[LV] Move static createMissedAnalysis from anonymous to global namespace This is an attempt to fix a windows bot. llvm-svn: 282730	2016-09-29 17:25:00 +00:00
Adam Nemet	0bfa441701	[LV] Convert CostModel to use the new streaming opt remark API Here we can already remove the member function emitAnalysis. llvm-svn: 282729	2016-09-29 17:15:48 +00:00
Adam Nemet	70757dd95a	[LV] Split most of createMissedAnalysis into a static function. NFC This will be shared between Legality and CostModel. llvm-svn: 282728	2016-09-29 17:05:35 +00:00
Adam Nemet	9988ca3db3	[LV] Convert all but one opt remark in Legality to new streaming interface The last one remaining after which emitAnalysis can be removed is when we convert the LAA's report to a vectorization report. This requires converting LAA to the new interface first. llvm-svn: 282726	2016-09-29 16:49:42 +00:00
Adam Nemet	9a1a5ef212	[LV] Convert emitRemark to new opt remark streaming interface Also renamed the function to emitRemarkWithHints to better reflect what the function actually does. llvm-svn: 282723	2016-09-29 16:23:12 +00:00
Volkan Keles	6ec2ac0416	Test commit. NFC. llvm-svn: 282717	2016-09-29 13:04:37 +00:00
Evgeny Stupachenko	dc8a254663	Wisely choose sext or zext when widening IV. Summary: The patch fixes regression caused by two earlier patches D18777 and D18867. Reviewers: reames, sanjoy Differential Revision: http://reviews.llvm.org/D24280 From: Li Huang llvm-svn: 282650	2016-09-28 23:39:39 +00:00
Dehao Chen	5461d8bdb5	Refactor the ProfileSummaryInfo to use doInitialization and doFinalization to handle Module update. Summary: This refactors the change in r282616 Reviewers: davidxl, eraman, mehdi_amini Subscribers: mehdi_amini, davide, llvm-commits Differential Revision: https://reviews.llvm.org/D25041 llvm-svn: 282630	2016-09-28 21:00:58 +00:00
Jonas Paulsson	58c5a7f55a	[SystemZ] Implementation of getUnrollingPreferences(). This commit enables more unrolling for SystemZ by implementing the SystemZTargetTransformInfo::getUnrollingPreferences() method. It has been found that it is better to only unroll moderately, so the DefaultUnrollRuntimeCount has been moved into UnrollingPreferences in order to set this to a lower value for SystemZ (4). Reviewers: Evgeny Stupachenko, Ulrich Weigand. https://reviews.llvm.org/D24451 llvm-svn: 282570	2016-09-28 09:41:38 +00:00
Adam Nemet	c507ac96f5	[Inliner] Port all opt remarks to new streaming API llvm-svn: 282559	2016-09-27 23:47:03 +00:00
Adam Nemet	04758ba385	Shorten DiagnosticInfoOptimizationRemark* to OptimizationRemark*. NFC With the new streaming interface, these class names need to be typed a lot and it's way too looong. llvm-svn: 282544	2016-09-27 22:19:23 +00:00
Adam Nemet	1142147e41	[Inliner] Fold the analysis remark into the missed remark There is really no reason for these to be separate. The vectorizer started this pretty bad tradition that the text of the missed remarks is pretty meaningless, i.e. vectorization failed. There, you have to query analysis to get the full picture. I think we should just explain the reason for missing the optimization in the missed remark when possible. Analysis remarks should provide information that the pass gathers regardless whether the optimization is passing or not. llvm-svn: 282542	2016-09-27 21:58:17 +00:00
Michael Zolotukhin	1a554be3b6	[LoopSimplify] When simplifying phis in loop-simplify, do it only if it preserves LCSSA form. llvm-svn: 282541	2016-09-27 21:03:45 +00:00
Adam Nemet	a62b7e1a28	Output optimization remarks in YAML (Re-committed after moving the template specialization under the yaml namespace. GCC was complaining about this.) This allows various presentation of this data using an external tool. This was first recommended here[1]. As an example, consider this module: 1 int foo(); 2 int bar(); 3 4 int baz() { 5 return foo() + bar(); 6 } The inliner generates these missed-optimization remarks today (the hotness information is pulled from PGO): remark: /tmp/s.c:5:10: foo will not be inlined into baz (hotness: 30) remark: /tmp/s.c:5:18: bar will not be inlined into baz (hotness: 30) Now with -pass-remarks-output=<yaml-file>, we generate this YAML file: --- !Missed Pass: inline Name: NotInlined DebugLoc: { File: /tmp/s.c, Line: 5, Column: 10 } Function: baz Hotness: 30 Args: - Callee: foo - String: will not be inlined into - Caller: baz ... --- !Missed Pass: inline Name: NotInlined DebugLoc: { File: /tmp/s.c, Line: 5, Column: 18 } Function: baz Hotness: 30 Args: - Callee: bar - String: will not be inlined into - Caller: baz ... This is a summary of the high-level decisions: * There is a new streaming interface to emit optimization remarks. E.g. for the inliner remark above: ORE.emit(DiagnosticInfoOptimizationRemarkMissed( DEBUG_TYPE, "NotInlined", &I) << NV("Callee", Callee) << " will not be inlined into " << NV("Caller", CS.getCaller()) << setIsVerbose()); NV stands for named value and allows the YAML client to process a remark using its name (NotInlined) and the named arguments (Callee and Caller) without parsing the text of the message. Subsequent patches will update ORE users to use the new streaming API. * I am using YAML I/O for writing the YAML file. YAML I/O requires you to specify reading and writing at once but reading is highly non-trivial for some of the more complex LLVM types. Since it's not clear that we (ever) want to use LLVM to parse this YAML file, the code supports and asserts that we're writing only. 
On the other hand, I did experiment that the class hierarchy starting at DiagnosticInfoOptimizationBase can be mapped back from YAML generated here (see D24479). * The YAML stream is stored in the LLVM context. * In the example, we can probably further specify the IR value used, i.e. print "Function" rather than "Value". * As before hotness is computed in the analysis pass instead of DiagnosticInfo. This avoids the layering problem since BFI is in Analysis while DiagnosticInfo is in IR. [1] https://reviews.llvm.org/D19678#419445 Differential Revision: https://reviews.llvm.org/D24587 llvm-svn: 282539	2016-09-27 20:55:07 +00:00
Reid Kleckner	6481822e28	[DebugInfo] Add comments to phi dbg.value tracking code, NFC LLVM developers might be surprised to learn that there are blocks without valid insertion points (catchswitch), so it seems worth calling that out explicitly. Also add a FIXME about what we should really be doing if we ever need to make optimized Windows EH code debuggable. While I'm here, make auto usage more consistent with LLVM standards and avoid an unnecessary call to insertBefore. llvm-svn: 282521	2016-09-27 18:45:31 +00:00
Adam Nemet	cc2a3fa8e8	Revert "Output optimization remarks in YAML" This reverts commit r282499. The GCC bots are failing llvm-svn: 282503	2016-09-27 16:39:24 +00:00
Adam Nemet	92e928c10a	Output optimization remarks in YAML This allows various presentation of this data using an external tool. This was first recommended here[1]. As an example, consider this module: 1 int foo(); 2 int bar(); 3 4 int baz() { 5 return foo() + bar(); 6 } The inliner generates these missed-optimization remarks today (the hotness information is pulled from PGO): remark: /tmp/s.c:5:10: foo will not be inlined into baz (hotness: 30) remark: /tmp/s.c:5:18: bar will not be inlined into baz (hotness: 30) Now with -pass-remarks-output=<yaml-file>, we generate this YAML file: --- !Missed Pass: inline Name: NotInlined DebugLoc: { File: /tmp/s.c, Line: 5, Column: 10 } Function: baz Hotness: 30 Args: - Callee: foo - String: will not be inlined into - Caller: baz ... --- !Missed Pass: inline Name: NotInlined DebugLoc: { File: /tmp/s.c, Line: 5, Column: 18 } Function: baz Hotness: 30 Args: - Callee: bar - String: will not be inlined into - Caller: baz ... This is a summary of the high-level decisions: * There is a new streaming interface to emit optimization remarks. E.g. for the inliner remark above: ORE.emit(DiagnosticInfoOptimizationRemarkMissed( DEBUG_TYPE, "NotInlined", &I) << NV("Callee", Callee) << " will not be inlined into " << NV("Caller", CS.getCaller()) << setIsVerbose()); NV stands for named value and allows the YAML client to process a remark using its name (NotInlined) and the named arguments (Callee and Caller) without parsing the text of the message. Subsequent patches will update ORE users to use the new streaming API. * I am using YAML I/O for writing the YAML file. YAML I/O requires you to specify reading and writing at once but reading is highly non-trivial for some of the more complex LLVM types. Since it's not clear that we (ever) want to use LLVM to parse this YAML file, the code supports and asserts that we're writing only. 
On the other hand, I did experiment that the class hierarchy starting at DiagnosticInfoOptimizationBase can be mapped back from YAML generated here (see D24479). * The YAML stream is stored in the LLVM context. * In the example, we can probably further specify the IR value used, i.e. print "Function" rather than "Value". * As before hotness is computed in the analysis pass instead of DiagnosticInfo. This avoids the layering problem since BFI is in Analysis while DiagnosticInfo is in IR. [1] https://reviews.llvm.org/D19678#419445 Differential Revision: https://reviews.llvm.org/D24587 llvm-svn: 282499	2016-09-27 16:15:16 +00:00
Kostya Serebryany	45c144754b	[sanitizer-coverage] fix a bug in trace-gep llvm-svn: 282467	2016-09-27 01:55:08 +00:00
Kostya Serebryany	186d61801c	[sanitizer-coverage] don't emit the CTOR function if nothing has been instrumented llvm-svn: 282465	2016-09-27 01:08:33 +00:00
Ivan Krasin	4ff4f21e15	Revert r277556. Add -lowertypetests-bitsets-level to control bitsets generation Summary: We don't currently need this facility for CFI. Disabling individual hot methods proved to be a better strategy in Chrome. Also, the design of the feature is suboptimal, as pointed out by Peter Collingbourne. Reviewers: pcc Subscribers: kcc Differential Revision: https://reviews.llvm.org/D24948 llvm-svn: 282461	2016-09-27 00:29:53 +00:00
Peter Collingbourne	53a852b648	LowerTypeTests: Remove unused variable. llvm-svn: 282456	2016-09-26 23:56:17 +00:00
Peter Collingbourne	6ed92e3f53	LowerTypeTests: Create LowerTypeTestsModule class and move implementation there. Related simplifications. llvm-svn: 282455	2016-09-26 23:54:39 +00:00
Piotr Padlewski	d9830eb79f	[thinlto] Basic thinlto fdo heuristic Summary: This patch improves thinlto importer by importing 3x larger functions that are called from hot block. I compared performance with the trunk on spec, and there were about 2% on povray and 3.33% on milc. These results seem to be consistent and match the results Teresa got with her simple heuristic. Some benchmarks got slower but I think they are just noisy (mcf, xalancbmki, omnetpp)- running the benchmarks again with more iterations to confirm. Geomean of all benchmarks including the noisy ones were about +0.02%. I see much better improvement on google branch with Easwaran patch for pgo callsite inlining (the inliner actually inline those big functions) Overall I see +0.5% improvement, and I get +8.65% on povray. So I guess we will see much bigger change when Easwaran patch will land (it depends on new pass manager), but it is still worth putting this to trunk before it. Implementation details changes: - Removed CallsiteCount. - ProfileCount got replaced by Hotness - hot-import-multiplier is set to 3.0 for now, didn't have time to tune it up, but I see that we get most of the interesting functions with 3, so there is not much performance difference with higher, and binary size doesn't grow as much as with 10.0. Reviewers: eraman, mehdi_amini, tejohnson Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D24638 llvm-svn: 282437	2016-09-26 20:37:32 +00:00
Daniel Berlin	1e98c04226	Remove pruning of phi nodes in MemorySSA - it makes updating harder Reviewers: george.burgess.iv Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24923 llvm-svn: 282419	2016-09-26 17:22:54 +00:00
Matthew Simpson	b764aba2ab	[LV] Scalarize instructions marked scalar after vectorization This patch ensures that we actually scalarize instructions marked scalar after vectorization. Previously, such instructions may have been vectorized instead. Differential Revision: https://reviews.llvm.org/D23889 llvm-svn: 282418	2016-09-26 17:08:37 +00:00
Gor Nishanov	bc0ebb383c	[Coroutines] Part14: Handle coroutines with no suspend points. Summary: If coroutine has no suspend points, remove heap allocation and turn a coroutine into a normal function. Also, if a pattern is detected that coroutine resumes or destroys itself prior to coro.suspend call, turn the suspend point into a simple jump to resume or cleanup label. This pattern occurs when coroutines are used to propagate errors in functions that return expected<T>. Reviewers: majnemer Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D24408 llvm-svn: 282414	2016-09-26 15:49:28 +00:00
Alexey Bataev	793c946ecb	[InstCombine] Fixed bug introduced in r282237 The index of the new insertelement instruction was evaluated in the wrong way, it was considered as the index of the inserted value instead of index of the position, where the value should be inserted. llvm-svn: 282401	2016-09-26 13:18:59 +00:00
Andrea Di Biagio	a82d52d11d	[InstCombine] Teach the udiv folding logic how to handle constant expressions. This patch fixes PR30366. Function foldUDivShl() worked under the assumption that one of the values in input to the function was always an instance of llvm::Instruction. However, function visitUDivOperand() (the only user of foldUDivShl) was clearly violating that precondition; internally, visitUDivOperand() uses pattern matches to check the operands of a udiv. Pattern matchers for binary operators know how to handle both Instruction and ConstantExpr values. This patch fixes the problem in foldUDivShl(). Now we use pattern matchers instead of explicit casts to Instruction. The reduced test case from PR30366 has been added to test file InstCombine/udiv-simplify.ll. Differential Revision: https://reviews.llvm.org/D24565 llvm-svn: 282398	2016-09-26 12:07:23 +00:00
Duncan P. N. Exon Smith	11c06ea55a	ObjCARC: Don't look at users of ConstantData Stop looking at users of UndefValue and ConstantPointerNull in the objective C ARC optimizers. The other users aren't actually interesting, since they're not pointing at a particular object. I imagine these calls could be optimized through -instcombine... maybe they already are? These early returns will be required at some point in the future, with a WIP patch that asserts when someone accesses a use-list on ConstantData. llvm-svn: 282338	2016-09-24 21:01:20 +00:00
Duncan P. N. Exon Smith	4fd9b7e16f	Scalar: Ignore ConstantData in processAssumption Assumptions on UndefValue and ConstantPointerNull aren't relevant to other users. Ignore them entirely to avoid wasting cycles walking through their (possibly extremely extensive (cross-module)) use-lists. It wasn't clear how to add a specific test for this, and it'll be covered anyway by an eventual patch that asserts when trying to access the use-list of an instance of ConstantData. llvm-svn: 282334	2016-09-24 20:00:38 +00:00
Duncan P. N. Exon Smith	c82c11428e	GlobalStatus: Don't walk use-lists of ConstantData Return early from llvm::isSafeToDestroyConstant() whenever the value `isa<ConstantData>()`. These constants are shared across the LLVMContext. We never really want to delete them here, and walking their use-lists can be very expensive. (This is motivated by an eventual goal of removing use-lists entirely from ConstantData.) llvm-svn: 282320	2016-09-24 02:30:11 +00:00
Alexey Bataev	fee9078dcd	[InstCombine] Fix for PR29124: reduce insertelements to shufflevector If inserting more than one constant into a vector: define <4 x float> @foo(<4 x float> %x) { %ins1 = insertelement <4 x float> %x, float 1.0, i32 1 %ins2 = insertelement <4 x float> %ins1, float 2.0, i32 2 ret <4 x float> %ins2 } InstCombine could reduce that to a shufflevector: define <4 x float> @goo(<4 x float> %x) { %shuf = shufflevector <4 x float> %x, <4 x float> <float undef, float 1.0, float 2.0, float undef>, <4 x i32><i32 0, i32 5, i32 6, i32 3> ret <4 x float> %shuf } Also, InstCombine tries to convert shuffle instruction to single insertelement, if one of the vectors is a constant vector and only a single element from this constant should be used in shuffle, i.e. shufflevector <4 x float> %v, <4 x float> <float undef, float 1.0, float undef, float undef>, <4 x i32> <i32 0, i32 5, i32 undef, i32 undef> -> insertelement <4 x float> %v, float 1.0, 1 Differential Revision: https://reviews.llvm.org/D24182 llvm-svn: 282237	2016-09-23 09:14:08 +00:00
Sanjay Patel	30ef70b090	[InstCombine] fold X urem C -> X < C ? X : X - C when C is big (PR28672) We already have the udiv variant of this transform, so I think this is ok for InstCombine too even though there is an increase in IR instructions. As the tests and TODO comments show, the transform can lead to follow-on combines. This should fix: https://llvm.org/bugs/show_bug.cgi?id=28672 Differential Revision: https://reviews.llvm.org/D24527 llvm-svn: 282209	2016-09-22 22:36:26 +00:00
Hans Wennborg	c7957ef86c	Revert r282168 "GVN-hoist: fix store past load dependence analysis (PR30216)" and also the dependent r282175 "GVN-hoist: do not dereference null pointers" It's causing compiler crashes building Harfbuzz (PR30499). llvm-svn: 282199	2016-09-22 21:20:53 +00:00
Sebastian Pop	1531f30ccc	GVN-hoist: do not dereference null pointers there may be basic blocks without memory accesses, in which case the list of accesses is a null pointer. llvm-svn: 282175	2016-09-22 17:22:58 +00:00
Sebastian Pop	8e6e3318c2	GVN-hoist: fix store past load dependence analysis (PR30216) To hoist stores past loads, we used to search for potential conflicting loads on the hoisting path by following a MemorySSA def-def link from the store to be hoisted to the previous defining memory access, and from there we followed the def-use chains to all the uses that occur on the hoisting path. The problem is that the def-def link may point to a store that does not alias with the store to be hoisted, and so the loads that are walked may not alias with the store to be hoisted, and even as in the testcase of PR30216, the loads that may alias with the store to be hoisted are not visited. The current patch visits all loads on the path from the store to be hoisted to the hoisting position and uses the alias analysis to ask whether the store may alias the load. I was not able to use the MemorySSA functionality to ask for whether load and store are clobbered: I'm not sure which function to call, so I used a call to AA->isNoAlias(). Store past store is still working as before using a MemorySSA query: I added an extra test to pr30216.ll to make sure store past store does not regress. Differential Revision: https://reviews.llvm.org/D24517 llvm-svn: 282168	2016-09-22 15:33:51 +00:00
Sebastian Pop	5d68aa7913	GVN-hoist: fix typo llvm-svn: 282165	2016-09-22 15:08:09 +00:00
Etienne Bergeron	7f0e315327	[compiler-rt] fix typo in option description [NFC] llvm-svn: 282163	2016-09-22 14:57:24 +00:00
Sebastian Pop	440f15b7fc	GVN-hoist: only hoist relevant scalar instructions Without this patch, GVN-hoist would think that a branch instruction is a scalar instruction and would try to value number it. The patch filters out all such kind of irrelevant instructions. A bit frustrating is that there is no easy way to discard all those very infrequent instructions, a bit like isa<TerminatorInst> that stands for a large family of instructions. I'm thinking that checking for those very infrequent other instructions would cost us more in compilation time than just letting those instructions getting numbered, so I'm still thinking that a simpler check: if (isa<TerminatorInst>(I)) return false; is better than listing all the other less frequent instructions. Differential Revision: https://reviews.llvm.org/D23929 llvm-svn: 282160	2016-09-22 14:45:40 +00:00
Keith Walker	ba1598975f	Reapplying r281895 (and follow-up r281964) after fixing pr30468. The additional fix is: When adding debug information to a lowered phi node in mem2reg check that we have a valid insertion point after the phi for adding the debug information. This change addresses the issue in pr30468 where a lowered phi was added before a catchswitch and no debug information should be added after the phi in this case. Differential Revision: https://reviews.llvm.org/D24797 llvm-svn: 282155	2016-09-22 14:13:25 +00:00
Anna Thomas	82c3717f54	[RS4GC] Remat in presence of phi and use live value Summary: Reviewers: Subscribers: llvm-svn: 282150	2016-09-22 13:13:06 +00:00
Sagar Thakur	e74eb4e7b9	[EfficiencySanitizer] Using '$' instead of '#' for struct counter name For MIPS '#' is the start of comment line. Therefore we get assembler errors if # is used in the structure names. Differential: D24334 Reviewed by: zhaoqin llvm-svn: 282141	2016-09-22 08:33:06 +00:00
Dorit Nuzman	d1247a684e	Fix revision 281960 llvm-svn: 282139	2016-09-22 07:56:23 +00:00
Chad Rosier	00eb8db3a1	[LoopInterchange] Track all dependencies, not just anti dependencies. Currently, we give up on loop interchange if we encounter a flow dependency anywhere in the loop list. Worse yet, we don't even track output dependencies. This patch updates the dependency matrix computation to track flow and output dependencies in the same way we track anti dependencies. This improves an internal workload by 2.2x. Note the loop interchange pass is off by default and it can be enabled with '-mllvm -enable-loopinterchange' Differential Revision: https://reviews.llvm.org/D24564 llvm-svn: 282101	2016-09-21 19:16:47 +00:00
Nico Weber	a489438849	revert 281908 because 281909 got reverted llvm-svn: 282097	2016-09-21 18:25:43 +00:00
Matthew Simpson	15869f86d8	[LV] Don't emit unused scalars for uniform instructions If we identify an instruction as uniform after vectorization, we know that we should only use the value corresponding to the first vector lane of each unroll iteration. However, when scalarizing such instructions, we still produce values for the other vector lanes. This patch prevents us from generating the unused scalars. Differential Revision: https://reviews.llvm.org/D24275 llvm-svn: 282087	2016-09-21 16:50:24 +00:00
Dehao Chen	160fbc3f95	Change the basic block weight calculation algorithm to use max instead of voting. Summary: Now that we have more precise debug info, we should change back to use maximum to get basic block weight. Reviewers: dnovillo Subscribers: andreadb, llvm-commits Differential Revision: https://reviews.llvm.org/D24788 llvm-svn: 282084	2016-09-21 16:26:51 +00:00
Matthew Simpson	a95e8bb7ed	[LV] Rename "Width" to "Lane" (NFC) llvm-svn: 282083	2016-09-21 16:09:23 +00:00
Hans Wennborg	1049085c78	Revert r281895 "Add @llvm.dbg.value entries for the phi node created by -mem2reg" (And follow-up r281964.) It caused PR30468. llvm-svn: 282077	2016-09-21 15:55:53 +00:00
Arnold Schwaighofer	f62ba1031f	DeadArgElim: Don't mark swifterror arguments as unused Replacing swifterror arguments with undef creates invalid IR. rdar://28300490 llvm-svn: 282075	2016-09-21 15:29:08 +00:00
Chad Rosier	f7c76f91e0	[LoopInterchange] Various cleanup. NFC. llvm-svn: 282071	2016-09-21 13:28:41 +00:00
Xinliang David Li	9780fc1451	code cleanup -- commoning IR traversals llvm-svn: 282034	2016-09-20 22:39:47 +00:00
Anna Thomas	8cd7de1d18	[RS4GC] Refactor code for Rematerializing in presence of phi. NFC Summary: This is an NFC refactoring change as a precursor to the actual fix for rematerializing in presence of phi. https://reviews.llvm.org/D24399 Pasted from review: findRematerializableChainToBasePointer changed to return the root of the chain. instead of true or false. move the PHI matching logic into the caller by inspecting the root return value. This includes an assertion that the alternate root is in the liveset for the call. Tested with current RS4GC tests. Reviewers: reames, sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24780 llvm-svn: 282023	2016-09-20 21:36:02 +00:00
Xinliang David Li	c7368287b7	[Profile] Do not annotate select insts not covered in profile. Fixed PR/30466 llvm-svn: 282009	2016-09-20 20:20:01 +00:00
Xinliang David Li	a754c47ac5	[Profile] code refactoring: make getStep a method in base class llvm-svn: 282002	2016-09-20 19:07:22 +00:00
Adrian Prantl	12fa3b3911	ASAN: Don't drop debug info attachments for global variables. This is a follow-up to r281284. Global Variables now can have !dbg attachments, so ASAN should clone these when generating a sanitized copy of a global variable. <rdar://problem/24899262> llvm-svn: 281994	2016-09-20 18:28:42 +00:00
Keith Walker	22b5dbc8bf	Make llvm::ConvertDebugDeclareToDebugValue() be a void function (NFC) The routines llvm::ConvertDebugDeclareToDebugValue() always returned a true value which was never checked at the call site; change the function return type to void. This NFC cleanup was approved in the review https://reviews.llvm.org/D23715 llvm-svn: 281964	2016-09-20 10:36:17 +00:00
Dorit Nuzman	02efef0525	Reverting revision 281960 due to test failures. llvm-svn: 281961	2016-09-20 08:27:48 +00:00
Dorit Nuzman	d3686e5269	[SROA] Preserve llvm.mem.parallel_loop_access metadata. SROA doesn't preserve the llvm.mem.parallel_loop_access metadata when it transforms loads/stores. This patch fixes a couple occurrences of this issue. (Partially addresses PR28981). Differential Revision: https://reviews.llvm.org/D23549 llvm-svn: 281960	2016-09-20 07:50:49 +00:00
Kostya Serebryany	06694d0a2f	[sanitizer-coverage] add comdat to coverage guards if needed llvm-svn: 281952	2016-09-20 00:16:54 +00:00
Philip Reames	b1472ffed7	[LCSSA] Cache LoopExits to avoid wasted work When looking at the scribus_1.3 example from https://llvm.org/bugs/show_bug.cgi?id=10584, I noticed that we were spending a large amount of time computing loop exits in LCSSA. This code appears to be written with the assumption that LoopExits are stored in the Loop and thus cheap to query. This is not true, so we should cache the result across the potentially long running loop which tends to visit a small handful of Loops. On the particular example from 10584, this change drops the time spent in LCSSA computation by about 80%. Differential Revision: https://reviews.llvm.org/D24509 llvm-svn: 281949	2016-09-19 23:30:23 +00:00
David Callahan	c165a4e215	Merge branch 'ADCE5' llvm-svn: 281947	2016-09-19 23:17:58 +00:00
Dehao Chen	20866ed57e	Handle early inline for hot callsites that reside in the same basic block. Summary: Callsites in the same basic block should share the same hotness. This patch checks for the hottest callsite in the same basic block, and uses the hotness for all callsites in that basic block for early inline decisions. It also fixes the test to add "-S" so that the "CHECK-NOT" is actually checking the content. Reviewers: dnovillo Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24734 llvm-svn: 281927	2016-09-19 18:38:14 +00:00
Dehao Chen	82667d04c5	Only set branch weight during sample pgo annotation when max_weight of the branch is non-zero. Otherwise use default static profile to set branch probability. Summary: It does not make sense to set equal weights for all unknown branches as we have static branch prediction available. Reviewers: dnovillo Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24732 llvm-svn: 281912	2016-09-19 16:33:41 +00:00
Dehao Chen	38e3731c47	Use call target count to derive the call instruction weight Summary: The call target count profile is directly derived from LBR branch->target data. This is more reliable than instruction frequency profiles that could be moved across basic block boundaries. This patch uses call target count profile to annotate call instructions. Reviewers: davidxl, dnovillo Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24410 llvm-svn: 281911	2016-09-19 16:06:37 +00:00
Etienne Bergeron	6ba5176862	[asan] Support dynamic shadow address instrumentation Summary: This patch is adding the support for a shadow memory with dynamically allocated address range. The compiler-rt needs to export a symbol containing the shadow memory range. This is required to support ASAN on windows 64-bits. Reviewers: kcc, rnk, vitalybuka Subscribers: kubabrecka, dberris, llvm-commits, chrisha Differential Revision: https://reviews.llvm.org/D23354 llvm-svn: 281908	2016-09-19 15:58:38 +00:00
Keith Walker	c941252374	Add @llvm.dbg.value entries for the phi node created by -mem2reg When phi nodes are created in the -mem2reg phase, the @llvm.dbg.declare entries are converted to @llvm.dbg.value entries at the place where the store instructions existed. However no entry is created to describe the resulting value of the phi node. The effect of this is especially noticeable in for loops which have a constant for the initial value; the loop control variable's location would be described as the initial constant value in the loop body once the -mem2reg optimization phase was run. This change adds the creation of the @llvm.dbg.value entries to describe variables whose location is the result of a phi node created in -mem2reg. Also when the phi node is finally lowered to a machine instruction it is important that the lowered "load" instruction is placed before the associated DEBUG_VALUE entry describing the value loaded. Differential Revision: https://reviews.llvm.org/D23715 llvm-svn: 281895	2016-09-19 09:49:30 +00:00
James Molloy	0efb96a8ee	[SimplifyCFG] Update (AND) IR flags when CSE'ing instructions We were updating metadata but not IR flags. Because we pick an arbitrary instruction to be the CSE candidate, it comes down to luck (50% or less chance) if this results in broken codegen or not, which is why PR30373 is actually not the fault of the commit it was bisected down to. Fixes PR30373. llvm-svn: 281889	2016-09-19 08:23:08 +00:00
Dehao Chen	41cde0b986	Handle Invoke during sample profiler annotation: make it inlinable. Summary: Previously we relied on inst-combine to remove inlinable invoke instructions. This causes trouble because a few extra optimizations are scheduled early that could introduce too much CFG change (e.g. simplifycfg removes too much control flow). This patch handles invoke instruction in-place during sample profile annotation, so that we do not rely on instcombine to remove those invoke instructions. Reviewers: davidxl, dnovillo Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24409 llvm-svn: 281870	2016-09-18 23:11:37 +00:00
Simon Pilgrim	f33a6b713b	Fix covered-switch-default warning llvm-svn: 281865	2016-09-18 21:08:35 +00:00
Xinliang David Li	f0630d3d96	Fix build bot failure llvm-svn: 281859	2016-09-18 18:52:08 +00:00
Xinliang David Li	4ca1733a06	[Profile] Implement select instruction instrumentation in IR PGO Differential Revision: http://reviews.llvm.org/D23727 llvm-svn: 281858	2016-09-18 18:34:07 +00:00
Elena Demikhovsky	5f8cc0c346	[Loop Vectorizer] Consecutive memory access - fixed and simplified Amended consecutive memory access detection in Loop Vectorizer. Load/Store were not handled properly without preceding GEP instruction. Differential Revision: https://reviews.llvm.org/D20789 llvm-svn: 281853	2016-09-18 13:56:08 +00:00
Elena Demikhovsky	a1a0e7ddbe	[Loop vectorizer] Simplified GEP cloning. NFC. Simplified GEP cloning in vectorizeMemoryInstruction(). Added an assertion that checks consecutive GEP, which should have only one loop-variant operand. Differential Revision: https://reviews.llvm.org/D24557 llvm-svn: 281851	2016-09-18 09:22:54 +00:00
Kostya Serebryany	8e781a888a	[libFuzzer] use 'if guard' instead of 'if guard >= 0' with trace-pc; change the guard type to intptr_t; use separate array for 8-bit counters llvm-svn: 281845	2016-09-18 04:52:23 +00:00
Teresa Johnson	fbb431b292	[ThinLTO] Ensure anonymous globals renamed even at -O0 Summary: This fixes an issue when files are compiled with -flto=thin at default -O0. We need to rename anonymous globals before attempting to write the module summary because all values need names for the summary. This was happening at -O1 and above, but not before the early exit when constructing the pipeline for -O0. Also add an internal -prepare-for-thinlto option to enable this to be tested via opt. Fixes PR30419. Reviewers: mehdi_amini Subscribers: probinson, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D24701 llvm-svn: 281840	2016-09-17 20:40:16 +00:00
Mehdi Amini	a53d49e1b5	Don't create a SymbolTable in Function when the LLVMContext discards value names (NFC) The ValueSymbolTable is used to detect name conflict and rename instructions automatically. This is not needed when the value names are automatically discarded by the LLVMContext. No functional change intended, just saving a little bit of memory. This is a recommit of r281806 after fixing the accessor to return a pointer instead of a reference and updating all the call-sites. llvm-svn: 281813	2016-09-17 06:00:02 +00:00
Kostya Serebryany	8ad4155745	[sanitizer-coverage] change trace-pc to use 8-byte guards llvm-svn: 281809	2016-09-17 05:03:05 +00:00
Sanjay Patel	f26710d97d	[InstCombine] canonicalize vector select with constant vector condition to shuffle As discussed on llvm-dev ( http://lists.llvm.org/pipermail/llvm-dev/2016-August/104210.html ): turn a vector select with constant condition operand into a shuffle as a canonicalization step. Shuffles may be easier to reason about in conjunction with other shuffles and insert/extract. Possible known (minor?) regressions from this change are filed as: https://llvm.org/bugs/show_bug.cgi?id=28530 https://llvm.org/bugs/show_bug.cgi?id=28531 https://llvm.org/bugs/show_bug.cgi?id=30371 If something terrible happens to perf after this commit, feel free to revert until a backend fix is in place. Differential Revision: https://reviews.llvm.org/D24279 llvm-svn: 281787	2016-09-16 22:16:18 +00:00
Sanjay Patel	c96f6db246	[InstCombine] allow vector types for constant folding / computeKnownBits (PR24942) computeKnownBits() already works for integer vectors, so allow vector types when calling that from InstCombine. I don't think the change to use m_APInt in computeKnownBits is strictly necessary because we do check for ConstantVector later, but it's more efficient to handle the splat case without needing to loop on vector elements. This should work with InstSimplify, but doesn't yet, so I made that a FIXME comment on the test for PR24942: https://llvm.org/bugs/show_bug.cgi?id=24942 Differential Revision: https://reviews.llvm.org/D24677 llvm-svn: 281777	2016-09-16 21:20:36 +00:00
Eli Friedman	66fdba8799	LoopDistribute should preserve GlobalsAA. Differential Revision: https://reviews.llvm.org/D24204 llvm-svn: 281758	2016-09-16 18:01:48 +00:00
Eli Friedman	02d48be5c0	LoopLoadElimination should preserve GlobalsAA. Avoids losing GlobalsAA in the standard pass pipeline. Differential Revision: https://reviews.llvm.org/D24094 llvm-svn: 281757	2016-09-16 17:58:07 +00:00
Mehdi Amini	27d2379b4e	Rename NameAnonFunctions to NameAnonGlobals to match what it is doing (NFC) llvm-svn: 281745	2016-09-16 16:56:30 +00:00
Mehdi Amini	2cac787919	Fix NameAnonFunctions pass: for ThinLTO we need to rename global variables as well A follow-up patch will rename this pass and the source file accordingly, but I figured the non-NFC change will be easier to spot in isolation. Differential Revision: https://reviews.llvm.org/D24641 llvm-svn: 281744	2016-09-16 16:56:25 +00:00
Sanjay Patel	10494b2682	[InstCombine] add helper functions for visitICmpInst(); NFCI llvm-svn: 281743	2016-09-16 16:10:22 +00:00
Vitaly Buka	6c7a0bc3d9	Revert "[asan] Avoid lifetime analysis for allocas with can be in ambiguous state" This approach is not good enough. Working on the new solution. This reverts commit r280907. llvm-svn: 281689	2016-09-16 01:38:46 +00:00
Vitaly Buka	4670ae5f61	Revert "[asan] Add flag to allow lifetime analysis of problematic allocas" This approach is not good enough. Working on the new solution. This reverts commit r281126. llvm-svn: 281688	2016-09-16 01:38:43 +00:00
Sanjay Patel	8da42cc5d3	[InstCombine] move folds for icmp (sh C2, Y), C1 in with other icmp+sh folds; NFCI llvm-svn: 281672	2016-09-15 22:26:31 +00:00
Kostya Serebryany	66a9c175bf	[sanitizer-coverage] make trace-pc-guard and indirect-call work together llvm-svn: 281665	2016-09-15 22:11:08 +00:00
Sanjay Patel	af91d1f81e	[InstCombine] allow icmp (shr/shl) folds for vectors These 2 helper functions were already using APInt internally, so just change the API and caller to allow folds for splats. The scalar regression tests look quite thorough, so I just added a couple of tests to prove that vectors are handled too. These folds should be grouped with the other cmp+shift folds though. That can be an NFC follow-up. llvm-svn: 281663	2016-09-15 21:35:30 +00:00
Mehdi Amini	d880309835	[GlobalOpt] Dead Eliminate declarations GlobalOpt is already dead-code-eliminating global definitions. With this change it also takes care of declarations. Hopefully this should make it now a strict superset of GlobalDCE. This is important for LTO/ThinLTO as we don't want the linker to see "undefined reference" when it processes the input files: it could prevent proper internalization (or even load an extra file from a static archive, changing the behavior of the program!). llvm-svn: 281653	2016-09-15 20:26:27 +00:00
David Majnemer	8b16da8744	[InstCombine] Do not RAUW a constant GEP canRewriteGEPAsOffset expects to process instructions, not constants. This fixes PR30342. llvm-svn: 281650	2016-09-15 20:10:09 +00:00
Sanjay Patel	524fcdf041	[InstCombine] simplify code; NFCI llvm-svn: 281644	2016-09-15 19:04:55 +00:00
Sanjay Patel	d93c4c0137	fix function names; NFC llvm-svn: 281637	2016-09-15 18:22:25 +00:00
Sanjay Patel	886a542e23	[InstCombine] allow icmp (sub nsw) folds for vectors Also, clean up the code and comments for the existing folds in foldICmpSubConstant(). llvm-svn: 281631	2016-09-15 18:05:17 +00:00
Sanjay Patel	362ff5c0a5	[InstCombine] remove duplicated fold ; NFCI This pattern is matched in foldICmpBinOpEqualityWithConstant() and already works with vectors too. I changed some comments over there to point out the current location. The tests for this transform are currently in 'sub.ll'. Note that the remaining folds in this block all require a sub too, so they should get grouped with the other icmp(sub) patterns. llvm-svn: 281627	2016-09-15 17:01:17 +00:00
Sanjay Patel	40c53ea933	[InstCombine] allow (icmp sgt smin(PosA, B), 0) fold for vectors llvm-svn: 281624	2016-09-15 16:23:20 +00:00
Etienne Bergeron	78582b2ada	[compiler-rt] Changing function prototype returning unused value Summary: The return value of `maybeInsertAsanInitAtFunctionEntry` is ignored. Reviewers: rnk Subscribers: llvm-commits, chrisha, dberris Differential Revision: https://reviews.llvm.org/D24568 llvm-svn: 281620	2016-09-15 15:45:05 +00:00
Etienne Bergeron	52e4743e24	Fix silly mistake introduced here : https://reviews.llvm.org/D24566 Asan bots are currently broken without this patch. llvm-svn: 281618	2016-09-15 15:35:59 +00:00
Etienne Bergeron	c0669ce984	address comments from: https://reviews.llvm.org/D24566 using startswith instead of find. llvm-svn: 281617	2016-09-15 15:19:19 +00:00
Sanjay Patel	9745983a4d	[InstCombine] clean up foldICmpWithConstant(); NFC 1. Early exit to reduce indent 2. Rename variables 3. Add local 'Pred' variable llvm-svn: 281615	2016-09-15 15:11:12 +00:00
Sanjay Patel	06b127a771	[InstCombine] add helper function for foldICmpWithConstant; NFC This is a big glob of transforms that probably should work for vectors, but currently they are disallowed because of ConstantInt guards. llvm-svn: 281614	2016-09-15 14:37:50 +00:00
Sanjay Patel	7577a3d799	[InstCombine] use m_APInt to allow icmp folds using known bits for splat constant vectors llvm-svn: 281613	2016-09-15 14:15:47 +00:00
Sanjay Patel	9efb1bdcc4	[InstCombine] refactor eq/ne cases in foldICmpUsingKnownBits() ; NFCI The pattern matching and transforms are identical; the cmp predicate just changes. llvm-svn: 281561	2016-09-14 23:38:56 +00:00
Etienne Bergeron	752f8839a4	[compiler-rt] Avoid instrumenting sanitizer functions Summary: Function __asan_default_options is called by __asan_init before the shadow memory got initialized. Instrumenting that function may lead to flaky execution. As the __asan_default_options is provided by users, we cannot expect them to add the appropriate function attributes to avoid instrumentation. Reviewers: kcc, rnk Subscribers: dberris, chrisha, llvm-commits Differential Revision: https://reviews.llvm.org/D24566 llvm-svn: 281503	2016-09-14 17:18:37 +00:00
Chad Rosier	e6b3a63a3d	[LoopInterchange] Typo. NFC. llvm-svn: 281501	2016-09-14 17:12:30 +00:00

16655 Commits