llvm-project

Commit Graph

Author	SHA1	Message	Date
Michael Zolotukhin	4fdf974e3e	Follow-up for r261595: use range loop. llvm-svn: 261597	2016-02-23 00:48:44 +00:00
Michael Zolotukhin	de19ed1eb1	[LoopUnroll] Avoid unnecessary DT recomputation. Summary: When we completely unroll a loop, it's pretty easy to update DT in-place and thus avoid rebuilding it. DT recalculation is one of the most time-consuming tasks in loop-unroll, so avoiding it at least in case of full unroll should be beneficial. On some extreme (but still real-world) tests this patch improves compile time by ~2x. Reviewers: escha, jmolloy, hfinkel, sanjoy, chandlerc Subscribers: joker.eph, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D17473 llvm-svn: 261595	2016-02-23 00:30:50 +00:00
Dehao Chen	6c73b49911	Set function entry count as 0 if sample profile is not found for the function. Summary: This change makes the sample profile's behavior consistent with instr profile. Reviewers: davidxl, eraman, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17522 llvm-svn: 261587	2016-02-22 22:46:21 +00:00
Adam Nemet	fb31d580ea	[LoopDataPrefetch] Make it testable with opt Summary: Since this is an IR pass it's nice to be able to write tests without llc. This is the counterpart of the llc test under CodeGen/PowerPC/loop-data-prefetch.ll. Reviewers: hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17464 llvm-svn: 261578	2016-02-22 21:41:22 +00:00
Michael Zolotukhin	d734bea8ba	[LoopUnrolling] Fix a bug introduced in r259869 (PR26688). The issue was that we only required LCSSA rebuilding if the immediate parent-loop had values used outside of it. The fix is to enaable the same logic for all outer loops, not only immediate parent. llvm-svn: 261575	2016-02-22 21:21:45 +00:00
Philip Reames	ce38c2ddf6	[RS4GC] "Constant fold" the rs4gc-split-vector-values flag This flag was part of a migration to a new means of handling vectors-of-points which was described in the llvm-dev thread "FYI: Relocating vector of pointers". The old code path has been off by default for a while without complaints, so time to cleanup. llvm-svn: 261569	2016-02-22 21:01:28 +00:00
Philip Reames	79fa9b75c0	[RS4GC] Revert optimization attempt due to memory corruption This change reverts "246133 [RewriteStatepointsForGC] Reduce the number of new instructions for base pointers" and a follow on bugfix 12575. As pointed out in pr25846, this code suffers from a memory corruption bug. Since I'm (empirically) not going to get back to this any time soon, simply reverting the problematic change is the right answer. llvm-svn: 261565	2016-02-22 20:45:56 +00:00
Justin Lebar	ccbd8f5a02	Revert "[attrs] Handle convergent CallSites." This reverts r261544, which was causing a test failure in Transforms/FunctionAttrs/readattrs.ll. llvm-svn: 261549	2016-02-22 18:24:43 +00:00
Justin Lebar	7bf9187abb	[attrs] Handle convergent CallSites. Summary: Previously we had a notion of convergent functions but not of convergent calls. This is insufficient to correctly analyze calls where the target is unknown, e.g. indirect calls. Now a call is convergent if it targets a known-convergent function, or if it's explicitly marked as convergent. As usual, we can remove convergent where we can prove that no convergent operations are performed in the call. Reviewers: chandlerc, jingyue Subscribers: hfinkel, jhen, tra, llvm-commits Differential Revision: http://reviews.llvm.org/D17317 llvm-svn: 261544	2016-02-22 17:51:35 +00:00
Benjamin Kramer	451f54cf62	Fix some abuse of auto flagged by clang's -Wrange-loop-analysis. llvm-svn: 261524	2016-02-22 13:11:58 +00:00
Elena Demikhovsky	9914dbd11b	Allow setting MaxRerollIterations above 16 By Ayal Zaks. Differential Revision http://reviews.llvm.org/D17258 llvm-svn: 261517	2016-02-22 09:38:28 +00:00
Duncan P. N. Exon Smith	e9bc579c37	ADT: Remove == and != comparisons between ilist iterators and pointers I missed == and != when I removed implicit conversions between iterators and pointers in r252380 since they were defined outside ilist_iterator. Since they depend on getNodePtrUnchecked(), they indirectly rely on UB. This commit removes all uses of these operators. (I'll delete the operators themselves in a separate commit so that it can be easily reverted if necessary.) There should be NFC here. llvm-svn: 261498	2016-02-21 20:39:50 +00:00
Duncan P. N. Exon Smith	ec6f7fed54	TransformUtils: Avoid getNodePtrUnchecked() in integer division, NFC Stop relying on `getNodePtrUnchecked()` being useful on invalid iterators. This function is documented to be for internal use only, and the pointer type will eventually have to change to remove UB from ilist_iterator. Instead, check the iterator before it has been invalidated. llvm-svn: 261497	2016-02-21 20:14:29 +00:00
Sanjay Patel	2440130437	fix inaccurate comment; NFC llvm-svn: 261484	2016-02-21 17:33:31 +00:00
Sanjay Patel	368ac5dbf7	[InstCombine] add getNegativeIsTrueBoolVec() helper function; NFC Originally part of: http://reviews.llvm.org/D17485 We need this when simplifying masked memory ops too. llvm-svn: 261483	2016-02-21 17:29:33 +00:00
Sanjoy Das	979a11d5b2	[LoopDeletion] Add an assert that verifies LCSSA This is inspired by PR24804 -- had this assert been there before, isolating the root cause for PR24804 would have been far easier. llvm-svn: 261481	2016-02-21 17:11:59 +00:00
Simon Pilgrim	471efd244a	[InstCombine] SSE/SSE2 (u)comiss/(u)comisd comparison intrinsics only use the lowest vector element llvm-svn: 261460	2016-02-20 23:17:35 +00:00
Benjamin Kramer	7d537ae747	[SimplifyCFG] Use pointer identity to simplify predicate. No functional change intended. llvm-svn: 261427	2016-02-20 10:40:42 +00:00
David Majnemer	1efa23ddab	[SimplifyCFG] Merge together cleanuppads Cleanuppads may be merged together if one is the only predecessor of the other in which case a simple transform can be performed: replace the a cleanupret with a branch and remove an unnecessary cleanuppad. Differential Revision: http://reviews.llvm.org/D17459 llvm-svn: 261390	2016-02-20 01:07:45 +00:00
Hans Wennborg	a0f7090563	Revert r255691 "[LoopVectorizer] Refine loop vectorizer's register usage calculator by ignoring specific instructions." It caused PR26509. llvm-svn: 261368	2016-02-19 21:40:12 +00:00
Matthew Simpson	29c997c1a1	[LV] Vectorize first-order recurrences This patch enables the vectorization of first-order recurrences. A first-order recurrence is a non-reduction recurrence relation in which the value of the recurrence in the current loop iteration equals a value defined in the previous iteration. The load PRE of the GVN pass often creates these recurrences by hoisting loads from within loops. In this patch, we add a new recurrence kind for first-order phi nodes and attempt to vectorize them if possible. Vectorization is performed by shuffling the values for the current and previous iterations. The vectorization cost estimate is updated to account for the added shuffle instruction. Contributed-by: Matthew Simpson and Chad Rosier <mcrosier@codeaurora.org> Differential Revision: http://reviews.llvm.org/D16197 llvm-svn: 261346	2016-02-19 17:56:08 +00:00
Silviu Baranga	ad1dafb2c3	[LV] Fix PR26600: avoid out of bounds loads for interleaved access vectorization Summary: If we don't have the first and last access of an interleaved load group, the first and last wide load in the loop can do an out of bounds access. Even though we discard results from speculative loads, this can cause problems, since it can technically generate page faults (or worse). We now discard interleaved load groups that don't have the first and load in the group. Reviewers: hfinkel, rengolin Subscribers: rengolin, llvm-commits, mzolotukhin, anemet Differential Revision: http://reviews.llvm.org/D17332 llvm-svn: 261331	2016-02-19 15:46:10 +00:00
Chandler Carruth	31088a9d58	[LPM] Factor all of the loop analysis usage updates into a common helper routine. We were getting this wrong in small ways and generally being very inconsistent about it across loop passes. Instead, let's have a common place where we do this. One minor downside is that this will require some analyses like SCEV in more places than they are strictly needed. However, this seems benign as these analyses are complete no-ops, and without this consistency we can in many cases end up with the legacy pass manager scheduling deciding to split up a loop pass pipeline in order to run the function analysis half-way through. It is very, very annoying to fix these without just being very pedantic across the board. The only loop passes I've not updated here are ones that use AU.setPreservesAll() such as IVUsers (an analysis) and the pass printer. They seemed less relevant. With this patch, almost all of the problems in PR24804 around loop pass pipelines are fixed. The one remaining issue is that we run simplify-cfg and instcombine in the middle of the loop pass pipeline. We've recently added some loop variants of these passes that would seem substantially cleaner to use, but this at least gets us much closer to the previous state. Notably, the seven loop pass managers is down to three. I've not updated the loop passes using LoopAccessAnalysis because that analysis hasn't been fully wired into LoopSimplify/LCSSA, and it isn't clear that those transforms want to support those forms anyways. They all run late anyways, so this is harmless. Similarly, LSR is left alone because it already carefully manages its forms and doesn't need to get fused into a single loop pass manager with a bunch of other loop passes. LoopReroll didn't use loop simplified form previously, and I've updated the test case to match the trivially different output. Finally, I've also factored all the pass initialization for the passes that use this technique as well, so that should be done regularly and reliably. Thanks to James for the help reviewing and thinking about this stuff, and Ben for help thinking about it as well! Differential Revision: http://reviews.llvm.org/D17435 llvm-svn: 261316	2016-02-19 10:45:18 +00:00
Chandler Carruth	ac07270828	[AA] Preserve the AA results wrapper pass as well as BasicAA in a few more places to prevent gratuitous re-"runs" of these passes. The passes themselves don't do any work when run, but we keep spending time scheduling and running these needlessly when we really don't need to do so. This is the first patch towards fixing the really horrible loop pass pipeline fragmentation pointed out by Sanjoy in PR24804. llvm-svn: 261302	2016-02-19 03:12:14 +00:00
Lawrence Hu	84e6f1dd70	Bug fix: use dyn_cast_or_null instead of dyn_cast Differential Revision: http://reviews.llvm.org/D17154 llvm-svn: 261299	2016-02-19 02:17:07 +00:00
Richard Trieu	7a08381403	Remove uses of builtin comma operator. Cleanup for upcoming Clang warning -Wcomma. No functionality change intended. llvm-svn: 261270	2016-02-18 22:09:30 +00:00
Adam Nemet	9d9cb274ea	[PPCLoopDataPrefetch] Move pass to Transforms/Scalar/LoopDataPrefetch. NFC This patch is part of the work to make PPCLoopDataPrefetch target-independent (http://thread.gmane.org/gmane.comp.compilers.llvm.devel/92758). Obviously the pass still only used from PPC at this point. Subsequent patches will start driving this from ARM64 as well. Due to the previous patch most lines should show up as moved lines. llvm-svn: 261265	2016-02-18 21:38:19 +00:00
Matthew Simpson	92821cb4a8	Reapply commit r259357 with a fix for PR26629 Commit r259357 was reverted because it caused PR26629. We were assuming all roots of a vectorizable tree could be truncated to the same width, which is not the case in general. This commit reapplies the patch along with a fix and a new test case to ensure we don't regress because of this issue again. This should fix PR26629. llvm-svn: 261212	2016-02-18 14:14:40 +00:00
Chandler Carruth	9c4ed175c2	[PM] Port the PostOrderFunctionAttrs pass to the new pass manager and convert one test to use this. This is a particularly significant milestone because it required a working per-function AA framework which can be queried over each function from within a CGSCC transform pass (and additionally a module analysis to be accessible). This is essentially the point of the entire pass manager rewrite. A CGSCC transform is able to query for multiple different function's analysis results. It works. The whole thing appears to actually work and accomplish the original goal. While we were able to hack function attrs and basic-aa to "work" in the old pass manager, this port doesn't use any of that, it directly leverages the new fundamental functionality. For this to work, the CGSCC framework also has to support SCC-based behavior analysis, etc. The only part of the CGSCC pass infrastructure not sorted out at this point are the updates in the face of inlining and running function passes that mutate the call graph. The changes are pretty boring and boiler-plate. Most of the work was factored into more focused preperatory patches. But this is what wires it all together. llvm-svn: 261203	2016-02-18 11:03:11 +00:00
Junmo Park	80440eb804	Minor code cleanup. NFC. llvm-svn: 261200	2016-02-18 10:09:20 +00:00
Kostya Serebryany	d4590c7304	[sanitizer-coverage] implement -fsanitize-coverage=trace-pc. This is similar to trace-bb, but has a different API. We already use the equivalent flag in GCC for Linux kernel fuzzing. We may be able to use this flag with AFL too llvm-svn: 261159	2016-02-17 21:34:43 +00:00
Amaury Sechet	da71cb7b92	NFC: Fix formating llvm-svn: 261156	2016-02-17 21:21:29 +00:00
Haicheng Wu	57e1a3e6ee	[LIR] Avoid turning non-temporal stores into memset This is to fix PR26645. llvm-svn: 261149	2016-02-17 21:00:06 +00:00
Adrian Prantl	a5b2a64980	Debug Info: Teach LdStHasDebugValue() (Local.cpp) about DIExpressions. This function is used to check whether a dbg.value intrinsic has already been inserted, but without comparing the DIExpression, it would erroneously fire on split aggregates and only the first scalar would survive. Found via http://reviews.llvm.org/D16867. <rdar://problem/24456528> llvm-svn: 261145	2016-02-17 20:02:25 +00:00
Elena Demikhovsky	88e76cad16	Create masked gather and scatter intrinsics in Loop Vectorizer. Loop vectorizer now knows to vectorize GEP and create masked gather and scatter intrinsics for random memory access. The feature is enabled on AVX-512 target. Differential Revision: http://reviews.llvm.org/D15690 llvm-svn: 261140	2016-02-17 19:23:04 +00:00
Amaury Sechet	61a7d629ec	Fix load alignement when unpacking aggregates structs Summary: Store and loads unpacked by instcombine do not always have the right alignement. This explicitely compute the alignement and set it. Reviewers: dblaikie, majnemer, reames, hfinkel, joker.eph Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17326 llvm-svn: 261139	2016-02-17 19:21:28 +00:00
David Majnemer	f48bcb2bd9	Revert "Reapply commit r258404 with fix." This reverts commit r259357, it caused PR26629. llvm-svn: 261137	2016-02-17 19:02:36 +00:00
Frederic Riss	009d60650d	[ObjCARC] Handle ARCInstKind::ClaimRV in OptimizeIndividualCalls. When support for objc_unsafeClaimAutoreleasedReturnValue has been added to the ARC optimizer in r258970, one case was missed which would lead the optimizer to execute an llvm_unreachable. In this case, just handle ClaimRV in the same way we handle RetainRV. llvm-svn: 261134	2016-02-17 18:51:27 +00:00
Mehdi Amini	1db10ac6ce	Define the ThinLTO Pipeline (experimental) Summary: On the contrary to Full LTO, ThinLTO can afford to shift compile time from the frontend to the linker: both phases are parallel (even if it is not totally "free": projects like clang are reusing product from the "compile phase" for multiple link, think about libLLVMSupport reused for opt, llc, etc.). This pipeline is based on the proposal in D13443 for full LTO. We didn't move forward on this proposal because the LTO link was far too long after that. We believe that we can afford it with ThinLTO. The ThinLTO pipeline integrates in the regular O2/O3 flow: - The compile phase perform the inliner with a somehow lighter function simplification. (TODO: tune the inliner thresholds here) This is intendend to simplify the IR and get rid of obvious things like linkonce_odr that will be inlined. - The link phase will run the pipeline from the start, extended with some specific passes that leverage the augmented knowledge we have during LTO. Especially after the inliner is done, a sequence of globalDCE/globalOpt is performed, followed by another run of the "function simplification" passes. It is not clear if this part of the pipeline will stay as is, as the split model of ThinLTO does not allow the same benefit as FullLTO without added tricks. The measurements on the public test suite as well as on our internal suite show an overall net improvement. The binary size for the clang executable is reduced by 5%. We're still tuning it with the bringup of ThinLTO and it will evolve, but this should provide a good starting point. Reviewers: tejohnson Differential Revision: http://reviews.llvm.org/D17115 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 261029	2016-02-16 23:02:29 +00:00
Mehdi Amini	ec8bee1879	Refactor the PassManagerBuilder: extract a "addFunctionSimplificationPasses()" (NFC) It is intended to contains the passes run over a function after the inliner is done with a function and before it moves to its callers. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 261028	2016-02-16 22:54:27 +00:00
Junmo Park	6ebdc14cf1	[SCEVExpander] Make findExistingExpansion smarter Summary: Extending findExistingExpansion can use existing value in ExprValueMap. This patch gives 0.3~0.5% performance improvements on benchmarks(test-suite, spec2000, spec2006, commercial benchmark) Reviewers: mzolotukhin, sanjoy, zzheng Differential Revision: http://reviews.llvm.org/D15559 llvm-svn: 260938	2016-02-16 06:46:58 +00:00
Silviu Baranga	ec7063ac77	[LV] Add support for insertelt/extractelt processing during type truncation Summary: While shrinking types according to the required bits, we can encounter insert/extract element instructions. This will cause us to reach an llvm_unreachable statement. This change adds support for truncating insert/extract element operations, and adds a regression test. Reviewers: jmolloy Subscribers: mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D17078 llvm-svn: 260893	2016-02-15 15:38:17 +00:00
Roman Gareev	036c08874a	Tweak the LICM code to reuse the first sub-loop instead of creating a new one LICM starts with an empty AST, and then merges in each sub-loop. While the add code is appropriate for sub-loop 2 and up, it's utterly unnecessary for sub-loop 1. If the AST starts off empty, we can just clone/move the contents of the subloop into the containing AST. Reviewed-by: Philip Reames <listmail@philipreames.com> Differential Revision: http://reviews.llvm.org/D16753 llvm-svn: 260892	2016-02-15 14:48:50 +00:00
Benjamin Kramer	8a752e316d	Use ArrayRef to hide SmallVector details, kill a useless vector copy along the way. llvm-svn: 260824	2016-02-13 16:01:12 +00:00
Chandler Carruth	632d208c78	[attrs] Move the norecurse deduction to operate on the node set rather than the SCC object, and have it scan the instruction stream directly rather than relying on call records. This makes the behavior of this routine consistent between libc routines and LLVM intrinsics for libc routines. We can go and start teaching it about those being norecurse, but we should behave the same for the intrinsic and the libc routine rather than differently. I chatted with James Molloy and the inconsistency doesn't seem intentional and likely is due to intrinsic calls not being modelled in the call graph analyses. This also fixes a bug where we would deduce norecurse on optnone functions, when generally we try to handle optnone functions as-if they were replaceable and thus unanalyzable. llvm-svn: 260813	2016-02-13 08:47:51 +00:00
Keno Fischer	7c7c3e3591	[Cloning] Clone every Function's Debug Info Summary: Export the CloneDebugInfoMetadata utility, which clones all debug info associated with a function into the first module. Also use this function in CloneModule on each function we clone (the CloneFunction entrypoint already does this). Without this, cloning a module will lead to DI quality regressions, especially since r252219 reversed the Function <-> DISubprogram edge (before we could get lucky and have this edge preserved if the DISubprogram itself was, e.g. due to location metadata). This was verified to fix missing debug information in julia and a unittest to verify the new behavior is included. Patch by Yichao Yu! Thanks! Reviewers: loladiro, pcc Differential Revision: http://reviews.llvm.org/D17165 llvm-svn: 260791	2016-02-13 02:04:29 +00:00
Chad Rosier	81362a8599	[LIR] Allow merging of memsets in negatively strided loops. Last part of PR25166. llvm-svn: 260732	2016-02-12 21:03:23 +00:00
Justin Lebar	6086c6a387	Fix typo in comment. llvm-svn: 260731	2016-02-12 21:01:37 +00:00
Justin Lebar	db63949e8d	[SimplifyCFG] Don't fold conditional branches that contain calls to convergent functions. Summary: Performing this optimization duplicates the call to the convergent function and adds new control-flow dependencies, which is a no-no. Reviewers: jingyue Subscribers: broune, hfinkel, tra, resistor, joker.eph, arsenm, llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D17128 llvm-svn: 260730	2016-02-12 21:01:36 +00:00
Justin Lebar	df04d2a1f1	[LoopRotate] Don't perform loop rotation if the loop header calls a convergent function. Summary: Calls to convergent functions can be duplicated, but only if the duplicates are not control-flow dependent on any additional values. Loop rotation doesn't meet the bar. Reviewers: jingyue Subscribers: mzolotukhin, llvm-commits, arsenm, joker.eph, resistor, tra, hfinkel, broune Differential Revision: http://reviews.llvm.org/D17127 llvm-svn: 260729	2016-02-12 21:01:33 +00:00
David Majnemer	01674939b2	Remove unused variable llvm-svn: 260722	2016-02-12 20:33:51 +00:00
Philip Reames	96fccc2d09	[GVN] Common code for local and non-local load availability [NFCI] The attached patch removes all of the block local code for performing X-load forwarding by reusing the code used in the non-local case. The motivation here is to remove duplication and in the process increase our test coverage of some fairly tricky code. I have some upcoming changes I'll be proposing in this area and wanted to have the code cleaned up a bit first. Note: The review for this mostly happened in email which didn't make it to phabricator on the 258882 commit thread. Differential Revision: http://reviews.llvm.org/D16608 llvm-svn: 260711	2016-02-12 19:24:57 +00:00
Chad Rosier	4acff96646	[LIR] Partially revert r252926(NFC), which introduced a very subtle change. In short, before r252926 we were comparing an unsigned (StoreSize) against an a APInt (Stride), which is fine and well. After we were zero extending the Stride and then converting to an unsigned, which is not the same thing. Obviously, Stides can also be negative. This commit just restores the original behavior. AFAICT, it's not possible to write a test case to expose the issue because the code already has checks to make sure the StoreSize can't overflow an unsigned (which prevents the Stride from overflowing an unsigned as well). llvm-svn: 260706	2016-02-12 19:05:27 +00:00
David Majnemer	0f0abc7bc2	[InstCombine] Don't aggressively replace xor with icmp For some cases, InstCombine replaces the sequence of xor/sub instruction followed by cmp instruction into a single cmp instruction. However, this replacement may result suboptimal result especially when the xor/sub has more than one use, as discussed in bug 26465 (https://llvm.org/bugs/show_bug.cgi?id=26465). This patch make the replacement happen only when xor/sub has only one use. Differential Revision: http://reviews.llvm.org/D16915 Patch by Taewook Oh! llvm-svn: 260695	2016-02-12 18:12:38 +00:00
Chandler Carruth	3937bc70f9	[attrs] Simplify the convergent removal to directly use the pre-built node set rather than walking the SCC directly. This directly exposes the functions and has already had null entries filtered out. We also don't need need to handle optnone as it has already been handled in the caller -- we never try to remove convergent when there are optnone functions in the SCC. With this change, the code for removing convergent should work with the new pass manager and a different SCC analysis. llvm-svn: 260668	2016-02-12 09:47:49 +00:00
Chandler Carruth	057df3d423	[attrs] Consolidate the test for a non-SCC, non-convergent function call with the test for a non-convergent intrinsic call. While it is possible to use the call records to search for function calls, we're going to do an instruction scan anyways to find the intrinsics, we can handle both cases while scanning instructions. This will also make the logic more amenable to the new pass manager which doesn't use the same call graph structure. My next patch will remove use of CallGraphNode entirely and allow this code to work with both the old and new pass manager. Fortunately, it should also get strictly simpler without changing functionality. llvm-svn: 260666	2016-02-12 09:23:53 +00:00
Chandler Carruth	bbbbec0b54	[attrs] Run clang-format over a newly added routine in function-attrs before I update it to be friendly with the new pass manager. llvm-svn: 260653	2016-02-12 03:07:50 +00:00
Evgeniy Stepanov	ba6ca87ffb	[msan] Put msan constructor in a comdat. MSan adds a constructor to each translation unit that calls __msan_init, and does nothing else. The idea is to run __msan_init before any instrumented code. This results in multiple constructors and multiple .init_array entries in the final binary, one per translation unit. This is absolutely unnecessary; one would be enough. This change moves the constructors to a comdat group in order to drop the extra ones. llvm-svn: 260632	2016-02-12 00:37:52 +00:00
Matthew Simpson	a4e43c5b51	[SLP] Add debug output for extract cost (NFC) llvm-svn: 260614	2016-02-11 23:06:40 +00:00
Quentin Colombet	490cfbe2a2	Re-apply r238452, the bug was in clang and was fixed in r260567. Original commit message: [InstCombine] Fold IntToPtr and PtrToInt into preceding loads. Currently we only fold a BitCast into a Load when the BitCast is its only user. Do the same for any no-op cast. Patch by Philip Pfaffe! Differential Revision: http://reviews.llvm.org/D9152 llvm-svn: 260612	2016-02-11 22:30:41 +00:00
Mehdi Amini	9c1c3ac627	Revert "Refactor the PassManagerBuilder: extract a "addFunctionSimplificationPasses()"" This reverts commit r260603. I didn't intend to push it :( From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 260607	2016-02-11 22:09:11 +00:00
Mehdi Amini	c5bf5ecc1b	Revert "Define the ThinLTO Pipeline" This reverts commit r260604. I didn't intend to push this now. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 260606	2016-02-11 22:09:07 +00:00
Mehdi Amini	484470d605	Define the ThinLTO Pipeline Summary: On the contrary to Full LTO, ThinLTO can afford to shift compile time from the frontend to the linker: both phases are parallel. This pipeline is based on the proposal in D13443 for full LTO. We ] didn't move forward on this proposal because the link was far too long after that. This patch refactor the "function simplification" passes that are part of the inliner loop in a helper function (this part is NFC and can be commited separately to simplify the diff). The ThinLTO pipeline integrates in the regular O2/O3 flow: - The compile phase perform the inliner with a somehow lighter function simplification. (TODO: tune the inliner thresholds here) This is intendend to simplify the IR and get rid of obvious things like linkonce_odr that will be inlined. - The link phase will run the pipeline from the start, extended with some specific passes that leverage the augmented knowledge we have during LTO. Especially after the inliner is done, a sequence of globalDCE/globalOpt is performed, followed by another run of the "function simplification" passes. The measurements on the public test suite as well as on our internal suite show an overall net improvement. The binary size for the clang executable is reduced by 5%. We're still tuning it with the bringup of ThinLTO but this should provide a good starting point. Reviewers: tejohnson Subscribers: joker.eph, llvm-commits, dexonsmith Differential Revision: http://reviews.llvm.org/D17115 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 260604	2016-02-11 22:00:31 +00:00
Mehdi Amini	f9a3718c5a	Refactor the PassManagerBuilder: extract a "addFunctionSimplificationPasses()" It is intended to contains the passes run over a function after the inliner is done with a function and before it moves to its callers. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 260603	2016-02-11 22:00:25 +00:00
Pete Cooper	5562c333b8	Set load alignment on aggregate loads. When optimizing a extractvalue(load), we generate a load from the aggregate type. This load didn't have alignment set and so would get the alignment of the type. This breaks when the type is packed and so the alignment should be lower. For example, loading { int, int } would give us alignment of 4, but the original load from this type may have an alignment of 1 if packed. Reviewed by David Majnemer Differential revision: http://reviews.llvm.org/D17158 llvm-svn: 260587	2016-02-11 21:10:40 +00:00
Jun Bum Lim	10e58e867d	Fixed typo in r260530 llvm-svn: 260541	2016-02-11 16:46:13 +00:00
Jun Bum Lim	339e9723c1	[InstCombine] Simplify a known nonzero incoming value of PHI Summary: When a PHI is used only to be compared with zero, it is possible to replace an incoming value with any non-zero constant if the incoming value can be proved as a known nonzero value. For example, in below code, we can replace the incoming value %v with any non-zero constant based on the fact that the PHI is only used to be compared with zero and %v is a known non-zero value: %v = select %cond, 1, 2 %p = phi [%v, BB] ... %c = icmp eq, %p, 0 Reviewers: mcrosier, jmolloy, sanjoy Subscribers: hfinkel, mcrosier, majnemer, llvm-commits, haicheng, bmakam, mssimpso, gberry Differential Revision: http://reviews.llvm.org/D16240 llvm-svn: 260530	2016-02-11 15:50:07 +00:00
Tamas Berghammer	d12f31535a	Fix MSVC 2013 build after rL260504 llvm-svn: 260511	2016-02-11 11:27:51 +00:00
Artur Pilipenko	44e7c51b05	Don't propagate dereferenceable attribute through gc.relocate in InstCombine Reviewed By: reames Differential Revision: http://reviews.llvm.org/D16143 llvm-svn: 260509	2016-02-11 11:22:46 +00:00
Ashutosh Nema	2260a3a046	Fixed typo in comment & coding style for LoopVersioningLICM. llvm-svn: 260504	2016-02-11 09:23:53 +00:00
Teresa Johnson	41806854cc	Fix Windows bot failure in Transforms/FunctionImport/funcimport.ll Make sure we split ":" from the end of the global function id (which is <path>:<function> for local functions) instead of the beginning to avoid splitting at the wrong place for Windows file paths that contain a ":". llvm-svn: 260469	2016-02-10 23:47:38 +00:00
Mehdi Amini	4064174892	FunctionImport: add a progressive heuristic to limit importing too deep in the callgraph The current function importer will walk the callgraph, importing transitively any callee that is below the threshold. This can lead to import very deep which is costly in compile time and not necessarily beneficial as most of the inline would happen in imported function and not necessarilly in user code. The actual factor has been carefully chosen by flipping a coin ;) Some tuning need to be done (just at the existing limiting threshold). Reviewers: tejohnson Differential Revision: http://reviews.llvm.org/D17082 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 260466	2016-02-10 23:31:45 +00:00
Mehdi Amini	c87d7d02e1	Use a StringSet in Internalize, and allow to create the pass from an existing one (NFC) There is not reason to pass an array of "char *" to rebuild a set if the client already has one. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 260462	2016-02-10 23:24:31 +00:00
Philip Reames	d59638e14f	Follow up to 260439, Speculative fix to clang builders It looks like clang has a couple of test cases which caught the fact LVI was not slightly more precise after 260439. When looking at the failures, it struck me as wasteful to be querying nullness of a constant via LVI, so instead of tweaking the clang tests, let's just stop querying constants from this source. llvm-svn: 260451	2016-02-10 22:22:41 +00:00
Teresa Johnson	e1164de5d0	Restore "[ThinLTO] Use MD5 hash in function index." with fix This restores commit r260408, along with a fix for a bot failure. The bot failure was caused by dereferencing a unique_ptr in the same call instruction parameter list where it was passed via std::move. Apparently due to luck this was not exposed when I built the compiler with clang, only with gcc. llvm-svn: 260442	2016-02-10 21:55:02 +00:00
Teresa Johnson	89f38fb5cc	Revert "[ThinLTO] Use MD5 hash in function index." due to bot failure This reverts commit r260408. Bot failure that I need to investigate. llvm-svn: 260412	2016-02-10 19:11:15 +00:00
Teresa Johnson	0919a84071	[ThinLTO] Use MD5 hash in function index. Summary: This patch uses the lower 64-bits of the MD5 hash of a function name as a GUID in the function index, instead of storing function names. Any local functions are first given a global name by prepending the original source file name. This is the same naming scheme and GUID used by PGO in the indexed profile format. This change has a couple of benefits. The primary benefit is size reduction in the combined index file, for example 483.xalancbmk's combined index file was reduced by around 70%. It should also result in memory savings for the index file in memory, as the in-memory map is also indexed by the hash instead of the string. Second, this enables integration with indirect call promotion, since the indirect call profile targets are recorded using the same global naming convention and hash. This will enable the function importer to easily locate function summaries for indirect call profile targets to enable their import and subsequent promotion. The original source file name is recorded in the bitcode in a new module-level record for use in the ThinLTO backend pipeline. Reviewers: davidxl, joker.eph Subscribers: llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D17028 llvm-svn: 260408	2016-02-10 18:57:54 +00:00
Rong Xu	13b01dc8d9	[PGO] Indirect-call profile annotation in IR level profiling This patch reads the indirect-call value records in the profile and makes the annotation in the indirect-call instruction. This is for IR level profile instrumentation. Differential Revision: http://reviews.llvm.org/D16935 llvm-svn: 260400	2016-02-10 18:24:45 +00:00
Teresa Johnson	488a800a4c	[ThinLTO] Move global processing from Linker to TransformUtils (NFC) Summary: As discussed on IRC, move the ThinLTOGlobalProcessing code out of the linker, and into TransformUtils. The name of the class is changed to FunctionImportGlobalProcessing. Reviewers: joker.eph, rafael Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D17081 llvm-svn: 260395	2016-02-10 18:11:31 +00:00
Daniel Berlin	f6c9ae9c6d	Rename a member variable to be more accurate with how it is used llvm-svn: 260389	2016-02-10 17:41:25 +00:00
Daniel Berlin	932b4cbf5d	Constify two functions, make them accessible to unit tests llvm-svn: 260387	2016-02-10 17:39:43 +00:00
Rong Xu	33c76c0cc2	[PGO] Differentiate Clang instrumentation and IR level instrumentation profiles This patch uses one bit in profile version to differentiate Clang instrumentation and IR level instrumentation profiles. PGOInstrumenation generates a COMDAT variable __llvm_profile_raw_version so that the compiler runtime can set the right profile kind. For Maco-O platform, we generate the variable as linkonce_odr linkage as COMDAT is not supported. PGOInstrumenation now checks this bit to make sure it's an IR level instrumentation profile. The patch was submitted as r260164 but reverted due to a Darwin test breakage. Original Differential Revision: http://reviews.llvm.org/D15540 Differential Revision: http://reviews.llvm.org/D17020 llvm-svn: 260385	2016-02-10 17:18:30 +00:00
Tom Stellard	6fa4e290d7	StructurizeCFG: Initialize SkipUniformRegions in the default constructor This should fix some random bot failures caused by r260336. llvm-svn: 260342	2016-02-10 01:10:09 +00:00
Tom Stellard	755a4e6b57	StructurizeCFG: Add an option for skipping regions with only uniform branches Summary: Tests for this will be added once the AMDGPU backend enables this option. Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16602 llvm-svn: 260336	2016-02-10 00:39:37 +00:00
Justin Lebar	260854bfaf	Add convergent-removing bits to FunctionAttrs pass. Summary: Remove the convergent attribute on any functions which provably do not contain or invoke any convergent functions. After this change, we'll be able to modify clang to conservatively add 'convergent' to all functions when compiling CUDA. Reviewers: jingyue, joker.eph Subscribers: llvm-commits, tra, jhen, hfinkel, resistor, chandlerc, arsenm Differential Revision: http://reviews.llvm.org/D17013 llvm-svn: 260319	2016-02-09 23:03:22 +00:00
Peter Collingbourne	9b656527f1	Fix GCC build. llvm-svn: 260317	2016-02-09 23:01:38 +00:00
Peter Collingbourne	df49d1bbb2	WholeProgramDevirt: introduce. This pass implements whole program optimization of virtual calls in cases where we know (via bitset information) that the list of callees is fixed. This includes the following: - Single implementation devirtualization: if a virtual call has a single possible callee, replace all calls with a direct call to that callee. - Virtual constant propagation: if the virtual function's return type is an integer <=64 bits and all possible callees are readnone, for each class and each list of constant arguments: evaluate the function, store the return value alongside the virtual table, and rewrite each virtual call as a load from the virtual table. - Uniform return value optimization: if the conditions for virtual constant propagation hold and each function returns the same constant value, replace each virtual call with that constant. - Unique return value optimization for i1 return values: if the conditions for virtual constant propagation hold and a single vtable's function returns 0, or a single vtable's function returns 1, replace each virtual call with a comparison of the vptr against that vtable's address. Differential Revision: http://reviews.llvm.org/D16795 llvm-svn: 260312	2016-02-09 22:50:34 +00:00
Philip Reames	ea4d8e8ce9	[InstCombine][GC] Handle gc.relocations of vector type We introduced gc.relocates of vector-of-pointer types a couple of weeks back. Somehow, I missed updating the InstCombine rule to account for this. If we hit this code path with a vector-of-pointers gc.relocate, we'd crash on a cast<PointerType>. I also took the chance to do a bit of code style cleanup. llvm-svn: 260279	2016-02-09 21:09:22 +00:00
Sanjoy Das	10c8a04b80	[FunctionAttrs] Fix SCC logic around operand bundles FunctionAttrs does an "optimistic" analysis of SCCs as a unit, which means normally it is able to disregard calls from an SCC into itself. However, calls and invokes with operand bundles are allowed to have memory effects not fully described by the memory effects on the call target, so we can't be optimistic around operand-bundled calls from an SCC into itself. llvm-svn: 260244	2016-02-09 18:40:40 +00:00
Sanjoy Das	1c481f50d2	Add an "addUsedAAAnalyses" helper function Summary: Passes that call `getAnalysisIfAvailable<T>` also need to call `addUsedIfAvailable<T>` in `getAnalysisUsage` to indicate to the legacy pass manager that it uses `T`. This contract was being violated by passes that used `createLegacyPMAAResults`. This change fixes this by exposing a helper in AliasAnalysis.h, `addUsedAAAnalyses`, that is complementary to createLegacyPMAAResults and does the right thing when called from `getAnalysisUsage`. Reviewers: chandlerc Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D17010 llvm-svn: 260183	2016-02-09 01:21:57 +00:00
Rong Xu	d0dfb67fe1	[PGO] Revert r260146 as it breaks Darwin platforms. r260146 \| xur \| 2016-02-08 13:07:46 -0800 (Mon, 08 Feb 2016) \| 13 lines [PGO] Differentiate Clang instrumentation and IR level instrumentation profiles llvm-svn: 260170	2016-02-08 23:11:16 +00:00
Michael Zolotukhin	1da4afdfc9	Factor out UnrollAnalyzer to Analysis, and add unit tests for it. Summary: Unrolling Analyzer is already pretty complicated, and it becomes harder and harder to exercise it with usual IR tests, as with them we can only check the final decision: whether the loop is unrolled or not. This change factors this framework out from LoopUnrollPass to analyses, which allows to use unit tests. The change itself is supposed to be NFC, except adding a couple of tests. I plan to add more tests as I add new functionality and find/fix bugs. Reviewers: chandlerc, hfinkel, sanjoy Subscribers: zzheng, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D16623 llvm-svn: 260169	2016-02-08 23:03:59 +00:00
Sanjay Patel	4d36bbaf19	rangify; NFC llvm-svn: 260151	2016-02-08 21:32:43 +00:00
Rong Xu	1288a19421	[PGO] Differentiate Clang instrumentation and IR level instrumentation profiles This patch uses one bit in profile version to differentiate Clang instrumentation and IR level instrumentation profiles. PGOInstrumenation generates a COMDAT variable __llvm_profile_raw_version so that the compiler runtime can set the right profile kind. PGOInstrumenation now checks this bit to make sure it's an IR level instrumentation profile. Differential Revision: http://reviews.llvm.org/D15540 llvm-svn: 260146	2016-02-08 21:07:46 +00:00
Sanjay Patel	e08381a529	fix typos; NFC llvm-svn: 260130	2016-02-08 19:27:33 +00:00
Xinliang David Li	a82d6c0a4b	[PGO] Enable compression in pgo instrumentation This reduces sizes of instrumented object files, final binaries, process images, and raw profile data. The format of the indexed profile data remain the same. Differential Revision: http://reviews.llvm.org/D16388 llvm-svn: 260117	2016-02-08 18:13:49 +00:00
Silviu Baranga	ea63a7f512	[SCEV][LAA] Re-commit r260085 and r260086, this time with a fix for the memory sanitizer issue. The PredicatedScalarEvolution's copy constructor wasn't copying the Generation value, and was leaving it un-initialized. Original commit message: [SCEV][LAA] Add no wrap SCEV predicates and use use them to improve strided pointer detection Summary: This change adds no wrap SCEV predicates with: - support for runtime checking - support for expression rewriting: (sext ({x,+,y}) -> {sext(x),+,sext(y)} (zext ({x,+,y}) -> {zext(x),+,sext(y)} Note that we are sign extending the increment of the SCEV, even for the zext case. This is needed to cover the fairly common case where y would be a (small) negative integer. In order to do this, this change adds two new flags: nusw and nssw that are applicable to AddRecExprs and permit the transformations above. We also change isStridedPtr in LAA to be able to make use of these predicates. With this feature we should now always be able to work around overflow issues in the dependence analysis. Reviewers: mzolotukhin, sanjoy, anemet Subscribers: mzolotukhin, sanjoy, llvm-commits, rengolin, jmolloy, hfinkel Differential Revision: http://reviews.llvm.org/D15412 llvm-svn: 260112	2016-02-08 17:02:45 +00:00
Haicheng Wu	b35f772b90	[JumpThreading] Change a return of ComputeValueKnownInPredecessors() Change a return statement of ComputeValueKnownInPredecessors() to be the same as the rest return statements of the function. Otherwise, it might return true with an empty Result when the current basic block has no predecessors and trigger the first assert of JumpThreading::ProcessThreadableEdges(). llvm-svn: 260110	2016-02-08 17:00:39 +00:00
Igor Breger	1a39a34eae	[SLP] Fix placement of debug statement (NFC) By Ayal Zaks (ayal.zaks@intel.com) Differential Revision: http://reviews.llvm.org/D16976 llvm-svn: 260094	2016-02-08 14:11:39 +00:00
Silviu Baranga	41b4973329	Revert r260086 and r260085. They have broken the memory sanitizer bots. llvm-svn: 260087	2016-02-08 11:56:15 +00:00
Silviu Baranga	70a98bb9e8	[LoopVersioning] Don't assert when there are no memchecks We shouldn't assert when there are no memchecks, since we can have SCEV checks. There is already an assert covering the case where there are no SCEV checks or memchecks. This also changes the LAA pointer wrapping versioning test to use the loop versioning pass (this was how I managed to trigger the assert in the loop versioning pass). llvm-svn: 260086	2016-02-08 11:15:29 +00:00
Silviu Baranga	a35fadc7c4	[SCEV][LAA] Add no wrap SCEV predicates and use use them to improve strided pointer detection Summary: This change adds no wrap SCEV predicates with: - support for runtime checking - support for expression rewriting: (sext ({x,+,y}) -> {sext(x),+,sext(y)} (zext ({x,+,y}) -> {zext(x),+,sext(y)} Note that we are sign extending the increment of the SCEV, even for the zext case. This is needed to cover the fairly common case where y would be a (small) negative integer. In order to do this, this change adds two new flags: nusw and nssw that are applicable to AddRecExprs and permit the transformations above. We also change isStridedPtr in LAA to be able to make use of these predicates. With this feature we should now always be able to work around overflow issues in the dependence analysis. Reviewers: mzolotukhin, sanjoy, anemet Subscribers: mzolotukhin, sanjoy, llvm-commits, rengolin, jmolloy, hfinkel Differential Revision: http://reviews.llvm.org/D15412 llvm-svn: 260085	2016-02-08 10:45:50 +00:00
Maxim Ostapenko	b1e3f60fb9	[asan] Introduce new hidden -asan-use-private-alias option. As discussed in https://github.com/google/sanitizers/issues/398, with current implementation of poisoning globals we can have some CHECK failures or false positives in case of mixing instrumented and non-instrumented code due to ASan poisons innocent globals from non-sanitized binary/library. We can use private aliases to avoid such errors. In addition, to preserve ODR violation detection, we introduce new __odr_asan_gen_XXX symbol for each instrumented global that indicates if this global was already registered. To detect ODR violation in runtime, we should only check the value of indicator and report an error if it isn't equal to zero. Differential Revision: http://reviews.llvm.org/D15642 llvm-svn: 260075	2016-02-08 08:30:57 +00:00
Asaf Badouh	ad5c3fc47d	[X86][AVX512] add intrinsics of Scalar FP to integer conversion with rounding mode Differential Revision: http://reviews.llvm.org/D16629 llvm-svn: 260033	2016-02-07 14:59:13 +00:00
Daniel Berlin	905a646c24	Don't use module context here. It's unnecessary and makes it harder to write unittests llvm-svn: 260015	2016-02-07 02:03:39 +00:00
Daniel Berlin	1b51a2957d	Compute live-in for MemorySSA llvm-svn: 260014	2016-02-07 01:52:19 +00:00
Daniel Berlin	7898ca658e	Only insert into definingblocks once per block llvm-svn: 260013	2016-02-07 01:52:15 +00:00
Ashutosh Nema	df6763abe8	New Loop Versioning LICM Pass Summary: When alias analysis is uncertain about the aliasing between any two accesses, it will return MayAlias. This uncertainty from alias analysis restricts LICM from proceeding further. In cases where alias analysis is uncertain we might use loop versioning as an alternative. Loop Versioning will create a version of the loop with aggressive aliasing assumptions in addition to the original with conservative (default) aliasing assumptions. The version of the loop making aggressive aliasing assumptions will have all the memory accesses marked as no-alias. These two versions of loop will be preceded by a memory runtime check. This runtime check consists of bound checks for all unique memory accessed in loop, and it ensures the lack of memory aliasing. The result of the runtime check determines which of the loop versions is executed: If the runtime check detects any memory aliasing, then the original loop is executed. Otherwise, the version with aggressive aliasing assumptions is used. The pass is off by default and can be enabled with command line option -enable-loop-versioning-licm. Reviewers: hfinkel, anemet, chatur01, reames Subscribers: MatzeB, grosser, joker.eph, sanjoy, javed.absar, sbaranga, llvm-commits Differential Revision: http://reviews.llvm.org/D9151 llvm-svn: 259986	2016-02-06 07:47:48 +00:00
Michael Zolotukhin	73957179d3	[LoopUnrolling] Try harder to avoid rebuilding LCSSA when possible. In r255133 (reapplied r253126) we started to avoid redundant recomputation of LCSSA after loop-unrolling. This patch moves one step further in this direction - now we can avoid it for much wider range of loops, as we start to look at IR and try to figure out if the transformation actually breaks LCSSA phis or makes it necessary to insert new ones. Differential Revision: http://reviews.llvm.org/D16838 llvm-svn: 259869	2016-02-05 02:17:36 +00:00
Joseph Tremoulet	adc2376375	[RS4GC] Pass DenseMap by reference, NFC Summary: Passing the rematerialized values map to insertRematerializationStores by value looks to be a simple oversight; update it to pass by reference. Reviewers: reames, sanjoy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16911 llvm-svn: 259867	2016-02-05 01:42:52 +00:00
Adam Nemet	9455c1d2b1	[LoopLoadElim] Don't allow versioning when optForSize This was requested in the review of D16300. llvm-svn: 259861	2016-02-05 01:14:05 +00:00
Wei Mi	a49559befb	[SCEV] Try to reuse existing value during SCEV expansion Current SCEV expansion will expand SCEV as a sequence of operations and doesn't utilize the value already existed. This will introduce redundent computation which may not be cleaned up throughly by following optimizations. This patch introduces an ExprValueMap which is a map from SCEV to the set of equal values with the same SCEV. When a SCEV is expanded, the set of values is checked and reused whenever possible before generating a sequence of operations. The original commit triggered regressions in Polly tests. The regressions exposed two problems which have been fixed in current version. 1. Polly will generate a new function based on the old one. To generate an instruction for the new function, it builds SCEV for the old instruction, applies some tranformation on the SCEV generated, then expands the transformed SCEV and insert the expanded value into new function. Because SCEV expansion may reuse value cached in ExprValueMap, the value in old function may be inserted into new function, which is wrong. In SCEVExpander::expand, there is a logic to check the cached value to be used should dominate the insertion point. However, for the above case, the check always passes. That is because the insertion point is in a new function, which is unreachable from the old function. However for unreachable node, DominatorTreeBase::dominates thinks it will be dominated by any other node. The fix is to simply add a check that the cached value to be used in expansion should be in the same function as the insertion point instruction. 2. When the SCEV is of scConstant type, expanding it directly is cheaper than reusing a normal value cached. Although in the cached value set in ExprValueMap, there is a Constant type value, but it is not easy to find it out -- the cached Value set is not sorted according to the potential cost. Existing reuse logic in SCEVExpander::expand simply chooses the first legal element from the cached value set. The fix is that when the SCEV is of scConstant type, don't try the reuse logic. simply expand it. Differential Revision: http://reviews.llvm.org/D12090 llvm-svn: 259736	2016-02-04 01:27:38 +00:00
Gerolf Hoflehner	2432bd0ddd	[SimplifyCFG] Fix for "endless" loop after dead code removal (Alternative to D16251) Summary: This is a simpler fix to the problem than the dominator approach in http://reviews.llvm.org/D16251. It adds only values into the gather() while loop that have been seen before. The actual endless loop is in the constant compare gather() routine in Utils/SimplifyCFG.cpp. The same value ret.0.off0.i is pushed back into the queue: %.ret.0.off0.i = or i1 %.ret.0.off0.i, %cmp10.i Here is what happens at the IR level: for.cond.i: ; preds = %if.end6.i, %if.end.i54 %ix.0.i = phi i32 [ 0, %if.end.i54 ], [ %inc.i55, %if.end6.i ] %ret.0.off0.i = phi i1 [false, %if.end.i54], [%.ret.0.off0.i, %if.end6.i] <<< %cmp2.i = icmp ult i32 %ix.0.i, %11 br i1 %cmp2.i, label %for.body.i, label %LBJ_TmpSimpleNeedExt.exit if.end6.i: ; preds = %for.body.i %cmp10.i = icmp ugt i32 %conv.i, %add9.i %.ret.0.off0.i = or i1 %ret.0.off0.i, %cmp10.i <<< When if.end.i54 gets eliminated which removes the definition of ret.0.off0.i. The result is the expression %.ret.0.off0.i = or i1 %.ret.0.off0.i, %cmp10.i (Note the first ‘or’ operand is now %.ret.0.off0.i, and NOT %ret.0.off0.i). And now there is use of .ret.0.off0.i before a definition which triggers the “endless” loop in gather(): while(!DFT.empty()) { V = DFT.pop_back_val(); // V is .ret.0.off0.i if (Instruction *I = dyn_cast<Instruction>(V)) { // If it is a \|\| (or && depending on isEQ), process the operands. if (I->getOpcode() == (isEQ ? Instruction::Or : Instruction::And)) { DFT.push_back(I->getOperand(1)); // This is now .ret.0.off0.i also DFT.push_back(I->getOperand(0)); continue; // “endless loop” for .ret.0.off0.i } Reviewers: reames, ahatanak Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16839 llvm-svn: 259730	2016-02-03 23:54:25 +00:00
Vedant Kumar	2d5b5d3d3a	[InstrProfiling] Fix a comment (NFC) llvm-svn: 259727	2016-02-03 23:22:43 +00:00
Junmo Park	e90057a5f3	Minor code cleanups. NFC. llvm-svn: 259725	2016-02-03 23:16:39 +00:00
David Majnemer	a53b5bbb18	[LoopStrengthReduce] Don't rewrite PHIs with incoming values from CatchSwitches Bail out if we have a PHI on an EHPad that gets a value from a CatchSwitchInst. Because the CatchSwitchInst cannot be split, there is no good place to stick any instructions. This fixes PR26373. llvm-svn: 259702	2016-02-03 21:30:34 +00:00
Wei Mi	97de385868	Revert r259662, which caused regressions on polly tests. llvm-svn: 259675	2016-02-03 18:05:57 +00:00
Quentin Colombet	7ec03dc7f8	[InstCombine] Revert r238452: Fold IntToPtr and PtrToInt into preceding loads. According to git bisect, this is the root cause of a miscompile for Regex in libLLVMSupport. I am still working on reducing a test case. The actual bug may be elsewhere and this commit just exposed it. Anyway, at the moment, to reproduce, follow these steps: 1. Build clang and libLTO in release mode. 2. Create a new build directory <stage2> and cd into it. 3. Use clang and libLTO from #1 to build llvm-extract in Release mode + asserts using -O2 -flto 4. Run llvm-extract -ralias '.bar' -S test/Other/extract-alias.ll Result: program doesn't contain global named '.bar'! Expected result: @a0a0bar = alias void ()* @bar @a0bar = alias void ()* @bar declare void @bar() Note: In step #3, if you don't use lto or asserts, the miscompile disappears. llvm-svn: 259674	2016-02-03 18:04:13 +00:00
Wei Mi	ed133978a0	[SCEV] Try to reuse existing value during SCEV expansion Current SCEV expansion will expand SCEV as a sequence of operations and doesn't utilize the value already existed. This will introduce redundent computation which may not be cleaned up throughly by following optimizations. This patch introduces an ExprValueMap which is a map from SCEV to the set of equal values with the same SCEV. When a SCEV is expanded, the set of values is checked and reused whenever possible before generating a sequence of operations. Differential Revision: http://reviews.llvm.org/D12090 llvm-svn: 259662	2016-02-03 17:05:12 +00:00
Peter Collingbourne	0c0d7e2d0f	LowerBitSets: Don't bother to do any work if the llvm.bitset.test intrinsic is unused. llvm-svn: 259625	2016-02-03 03:48:46 +00:00
Peter Collingbourne	83cc981c49	Add #include "llvm/Support/raw_ostream.h" to fix Windows build. llvm-svn: 259623	2016-02-03 03:16:37 +00:00
Peter Collingbourne	9f7ec14009	Transforms: Move GlobalOpt's Evaluator to Utils where it can be reused. llvm-svn: 259621	2016-02-03 02:51:00 +00:00
Adam Nemet	d52ed84160	[LoopVersioning] Expose loop versioning as a pass too Summary: LoopVersioning is a transform utility that transform passes can use to run-time disambiguate may-aliasing accesses. I'd like to also expose as pass to allow it to be unit-tested. I am planning to add support for non-aliasing annotation in LoopVersioning and I'd like to be able to write tests directly using this pass. (After that feature is done, the pass could also be used to look for optimization opportunities that are hidden behind incomplete alias information at compile time.) The pass drives LoopVersioning in its default way which is to fully disambiguate may-aliasing accesses no matter how many checks are required. Reviewers: hfinkel, ashutosh.nema, sbaranga Subscribers: zzheng, mssimpso, llvm-commits, sanjoy Differential Revision: http://reviews.llvm.org/D16612 llvm-svn: 259610	2016-02-03 00:06:10 +00:00
George Burgess IV	60adac46f2	Attempt #2 to unbreak r259595. llvm-svn: 259602	2016-02-02 23:26:01 +00:00
George Burgess IV	b5a229f779	Attempt to fix builds broken by r259595. llvm-svn: 259599	2016-02-02 23:15:26 +00:00
George Burgess IV	e1100f533f	This patch adds MemorySSA to LLVM. Please see include/llvm/Transforms/Utils/MemorySSA.h for a description of MemorySSA, and what it does. Differential Revision: http://reviews.llvm.org/D7864 llvm-svn: 259595	2016-02-02 22:46:49 +00:00
Anna Zaks	3b50e70bbe	[asan] Add iOS support to AddressSanitzier Differential Revision: http://reviews.llvm.org/D15625 llvm-svn: 259586	2016-02-02 22:05:07 +00:00
Eugene Zelenko	ecefe5a81f	Fix Clang-tidy readability-redundant-control-flow warnings; other minor fixes. Differential revision: http://reviews.llvm.org/D16793 llvm-svn: 259539	2016-02-02 18:20:45 +00:00
Sanjay Patel	4b198802b3	function names start with a lowercase letter; NFC llvm-svn: 259425	2016-02-01 22:23:39 +00:00
Sanjay Patel	103ab7d571	[InstCombine] simplify masked scatter/gather intrinsics with zero masks A masked scatter with a zero mask means there's no store. A masked gather with a zero mask means the passthru arg is returned. This is a continuation of: http://reviews.llvm.org/rL259369 http://reviews.llvm.org/rL259392 llvm-svn: 259421	2016-02-01 22:10:26 +00:00
Sanjay Patel	04f792bdc9	[InstCombine] simplify masked store intrinsics with all ones or zeros masks A masked store with a zero mask means there's no store. A masked store with an allOnes mask means it's a normal vector store. This is a continuation of: http://reviews.llvm.org/rL259369 llvm-svn: 259392	2016-02-01 19:39:52 +00:00
David Majnemer	f8853ae7b3	[InstCombine] Don't transform (X+INT_MAX)>=(Y+INT_MAX) -> (X<=Y) This miscompile came about because we tried to use a transform which was only appropriate for xor operators when addition was present. This fixes PR26407. llvm-svn: 259375	2016-02-01 17:37:56 +00:00
Sanjay Patel	b695c5557c	[InstCombine] simplify masked load intrinsics with all ones or zeros masks A masked load with a zero mask means there's no load. A masked load with an allOnes mask means it's a normal vector load. Differential Revision: http://reviews.llvm.org/D16691 llvm-svn: 259369	2016-02-01 17:00:10 +00:00
Matthew Simpson	73dad62174	[LV] Rename RdxPHIsToFix to PHIsToFix (NFC) In the future, we will vectorize recurrences other than reductions. This patch renames a few variables and updates their associated comments to enable them to be reused for non-reduction PHI nodes. This change was requested in the review for D16197. llvm-svn: 259364	2016-02-01 16:07:01 +00:00
Matthew Simpson	c578d67407	Reapply commit r258404 with fix. The previous patch caused PR26364. The fix is to ensure that we don't enter a cycle when iterating over use-def chains. llvm-svn: 259357	2016-02-01 13:38:29 +00:00
Sanjay Patel	0069f56e33	add helper function for minnum/maxnum ; NFC llvm-svn: 259326	2016-01-31 16:35:23 +00:00
Sanjay Patel	8af7fbc34c	use range-based for loop; NFC llvm-svn: 259325	2016-01-31 16:34:48 +00:00
Sanjay Patel	690955fcbc	fix formatting; NFC llvm-svn: 259324	2016-01-31 16:34:11 +00:00
Sanjay Patel	24b77d11bc	simplify; NFC llvm-svn: 259323	2016-01-31 16:33:33 +00:00
Craig Topper	2ed5369424	Convert int to Twine instead of using utostr since it was already being added to a Twine. NFC llvm-svn: 259308	2016-01-31 00:15:35 +00:00
Matt Arsenault	56c079f393	InstCombine: fabs(x) * fabs(x) -> x * x llvm-svn: 259295	2016-01-30 05:02:00 +00:00
Matthias Braun	b30f2f5141	Avoid overly large SmallPtrSet/SmallSet These sets perform linear searching in small mode so it is never a good idea to use SmallSize/N bigger than 32. llvm-svn: 259283	2016-01-30 01:24:31 +00:00
Sanjay Patel	6038d3e5c6	function names start with a lower case letter ; NFC llvm-svn: 259264	2016-01-29 23:27:03 +00:00
Sanjay Patel	f9f5d3cc45	fix formatting; NFC llvm-svn: 259262	2016-01-29 23:14:58 +00:00
Fiona Glaser	36e8230db0	Fix typo in LoopSimplifyCFG llvm-svn: 259261	2016-01-29 23:12:52 +00:00
Fiona Glaser	b417d464e6	Add LoopSimplifyCFG pass Loop transformations can sometimes fail because the loop, while in valid rotated LCSSA form, is not in a canonical CFG form. This is an extremely simple pass that just merges obviously redundant blocks, which can be used to fix some known failure cases. In the future, it may be enhanced with more cases (and have code shared with SimplifyCFG). This allows us to run LoopSimplifyCFG -> LoopRotate -> LoopUnroll, so that SimplifyCFG cleans up the loop before Rotate tries to run. Not currently used in the pass manager, since this pass doesn't do anything unless you can hook it up in an LPM with other loop passes. It'll be added once Chandler cleans up things to allow this. Tested in a custom pipeline out of tree to confirm it works in practice (in addition to the included trivial test). llvm-svn: 259256	2016-01-29 22:35:36 +00:00
Sanjay Patel	66fff73c76	[InstCombine] avoid an insertelement transformation that induces the opposite extractelement fold (PR26354) We would infinite loop because we created a shufflevector that was wider than needed and then failed to combine that with the insertelement. When subsequently visiting the extractelement from that shuffle, we see that it's unnecessary, delete it, and trigger another visit to the insertelement. llvm-svn: 259236	2016-01-29 20:21:02 +00:00
David Majnemer	75f492e7f1	Fix the build llvm-svn: 259215	2016-01-29 17:46:57 +00:00
Matthew Simpson	53d00ef874	[SLP] Fix printing of debug statement (NFC) llvm-svn: 259212	2016-01-29 17:21:38 +00:00
Sanjoy Das	c816f03b70	[RS4GC] Address post-commit review on r259208 from David NFC llvm-svn: 259211	2016-01-29 17:20:49 +00:00

1 2 3 4 5 ...

14581 Commits