llvm-project

Commit Graph

Author	SHA1	Message	Date
Simon Pilgrim	bd4a3be7d2	[InstCombine][SSE] Add MOVMSK support to SimplifyDemandedUseBits The MOVMSK instructions copies a vector elements' sign bits to the low bits of a scalar register and zeros the high bits. This patch adds MOVMSK support to SimplifyDemandedUseBits so that its aware that the upper bits are known to be zero. It also removes the call to MOVMSK if none of the lower bits are actually required and just returns zero. Differential Revision: http://reviews.llvm.org/D19614 llvm-svn: 267873	2016-04-28 12:22:53 +00:00
Rong Xu	6e34c490ff	[PGO] Promote indirect calls to conditional direct calls with value-profile This patch implements the transformation that promotes indirect calls to conditional direct calls when the indirect-call value profile meta-data is available. Differential Revision: http://reviews.llvm.org/D17864 llvm-svn: 267815	2016-04-27 23:20:27 +00:00
Sanjay Patel	facf45a82f	[SimplifyCFG] propagate branch metadata when creating select There's no existing test for this path, and I don't know how to expose it in a regression test, but I'm assuming there's some reason this path exists. llvm-svn: 267813	2016-04-27 23:14:12 +00:00
Rong Xu	af5aebaa32	[PGO] Prohibit address recording if the function is both internal and COMDAT Differential Revision: http://reviews.llvm.org/D19515 llvm-svn: 267792	2016-04-27 21:17:30 +00:00
Ahmed Bougacha	ace97c1f7d	[LIR] Set attributes on memset_pattern16. "inferattrs" will deduce the attribute, but it will be too late for many optimizations. Set it ourselves when creating the call. Differential Revision: http://reviews.llvm.org/D17598 llvm-svn: 267762	2016-04-27 19:04:50 +00:00
Ahmed Bougacha	7f97193dd7	[LIR] Reuse variable. NFCI. llvm-svn: 267761	2016-04-27 19:04:46 +00:00
Ahmed Bougacha	44c19876c7	[InferAttrs] Mark memset_pattern16 params nocapture. Differential Revision: http://reviews.llvm.org/D19471 llvm-svn: 267760	2016-04-27 19:04:43 +00:00
Ahmed Bougacha	b0624a2cb4	[TLI] Unify LibFunc attribute inference. NFCI. Now the pass is just a tiny wrapper around the util. This lets us reuse the logic elsewhere (done here for BuildLibCalls) instead of duplicating it. The next step is to have something like getOrInsertLibFunc that also sets the attributes. Differential Revision: http://reviews.llvm.org/D19470 llvm-svn: 267759	2016-04-27 19:04:40 +00:00
Ahmed Bougacha	d765a82b54	[TLI] Unify LibFunc signature checking. NFCI. I tried to be as close as possible to the strongest check that existed before; cleaning these up properly is left for future work. Differential Revision: http://reviews.llvm.org/D19469 llvm-svn: 267758	2016-04-27 19:04:35 +00:00
Matthew Simpson	622b95be7b	[LV] Reallow positive-stride interleaved load groups with gaps We previously disallowed interleaved load groups that may cause us to speculatively access memory out-of-bounds (r261331). We did this by ensuring each load group had an access corresponding to the first and last member. Instead of bailing out for these interleaved groups, this patch enables us to peel off the last vector iteration, ensuring that we execute at least one iteration of the scalar remainder loop. This solution was proposed in the review of the previous patch. Differential Revision: http://reviews.llvm.org/D19487 llvm-svn: 267751	2016-04-27 18:21:36 +00:00
Arch D. Robison	aca7c412b4	[SLPVectorizer] Refactor where MinVecRegSize and MaxVecRegSize live. This is the first of two commits for extending SLP Vectorizer to deal with aggregates. This commit merely refactors existing logic. http://reviews.llvm.org/D14185 llvm-svn: 267748	2016-04-27 17:46:25 +00:00
Matthew Simpson	e5dfb08fcb	[TTI] Add hook for vector extract with extension This change adds a new hook for estimating the cost of vector extracts followed by zero- and sign-extensions. The motivating example for this change is the SMOV and UMOV instructions on AArch64. These instructions move data from vector to general purpose registers while performing the corresponding extension (sign-extend for SMOV and zero-extend for UMOV) at the same time. For these operations, TargetTransformInfo can assume the extensions are free and only report the cost of the vector extract. The SLP vectorizer has been updated to make use of the new hook. Differential Revision: http://reviews.llvm.org/D18523 llvm-svn: 267725	2016-04-27 15:20:21 +00:00
Teresa Johnson	df5ef8711f	[ThinLTO] Refine fix to avoid renaming of uses in inline assembly. Summary: Refine the workaround from r266877 that attempts to prevent renaming of locals in inline assembly, so that in addition to looking for a llvm.used local value, that there is at least one inline assembly call in the module. Otherwise, debug functions added to the llvm.used can block importing/exporting unnecessarily. Reviewers: joker.eph Subscribers: llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D19573 llvm-svn: 267717	2016-04-27 14:19:38 +00:00
Artur Pilipenko	9bb6beabf4	isSafeToLoadUnconditionally support queries without a context This is required to use this function from isSafeToSpeculativelyExecute Reviewed By: hfinkel Differential Revision: http://reviews.llvm.org/D16231 llvm-svn: 267692	2016-04-27 11:00:48 +00:00
Adam Nemet	d2fa414718	[LoopDist] Add llvm.loop.distribute.enable loop metadata Summary: D19403 adds a new pragma for loop distribution. This change adds support for the corresponding metadata that the pragma is translated to by the FE. As part of this I had to rethink the flag -enable-loop-distribute. My goal was to be backward compatible with the existing behavior: A1. pass is off by default from the optimization pipeline unless -enable-loop-distribute is specified A2. pass is on when invoked directly from opt (e.g. for unit-testing) The new pragma/metadata overrides these defaults so the new behavior is: B1. A1 + enable distribution for individual loop with the pragma/metadata B2. A2 + disable distribution for individual loop with the pragma/metadata The default value whether the pass is on or off comes from the initiator of the pass. From the PassManagerBuilder the default is off, from opt it's on. I moved -enable-loop-distribute under the pass. If the flag is specified it overrides the default from above. Then the pragma/metadata can further modifies this per loop. As a side-effect, we can now also use -enable-loop-distribute=0 from opt to emulate the default from the optimization pipeline. So to be precise this is the new behavior: C1. pass is off by default from the optimization pipeline unless -enable-loop-distribute or the pragma/metadata enables it C2. pass is on when invoked directly from opt unless -enable-loop-distribute=0 or the pragma/metadata disables it Reviewers: hfinkel Subscribers: joker.eph, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D19431 llvm-svn: 267672	2016-04-27 05:28:18 +00:00
Vaivaswatha Nagaraj	08efb0efcd	[Cloning] cloneLoopWithPreheader(): add assert to ensure no sub-loops Summary: cloneLoopWithPreheader() does not update LoopInfo for sub-loop of the original loop being cloned. Add assert to ensure no sub-loops for loop being cloned. Reviewers: anemet, ashutosh.nema, hfinkel Subscribers: mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D15922 llvm-svn: 267671	2016-04-27 05:25:09 +00:00
Evgeny Stupachenko	23ce61b663	The patch fixes PR27392. Summary: It is incorrect to compare TripCount (which is BECount + 1) with extraiters (or Count) to check if we should enter unrolled loop or not, because TripCount can potentially overflow (when BECount is max unsigned integer). While comparing BECount with (Count - 1) is overflow safe and therefore correct. Reviewer: hfinkel Differential Revision: http://reviews.llvm.org/D19256 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 267662	2016-04-27 03:04:54 +00:00
Sanjoy Das	5253a089ba	Fix typo in comment; NFC llvm-svn: 267653	2016-04-27 01:44:31 +00:00
Mehdi Amini	b4e1e8297b	ThinLTO: do not promote GlobalVariable that have a specific section. Differential Revision: http://reviews.llvm.org/D18298 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 267646	2016-04-27 00:32:13 +00:00
Matt Arsenault	ba437c67d2	SLSR: Use UnknownAddressSpace instead of 0 for pure arithmetic. In the case where isLegalAddressingMode is used for cases not related to addressing modes, such as pure adds and muls, it should not be using address space 0. LSR already passes -1 as the address space in these cases. llvm-svn: 267645	2016-04-27 00:32:09 +00:00
Adam Nemet	61399ac424	[LoopDist] Split main class. NFC This splits out the per-loop functionality from the Pass class. With this the fact whether the loop is forced-distribute with the new metadata/pragma can be cached in the per-loop class rather than passed around. llvm-svn: 267643	2016-04-27 00:31:03 +00:00
Justin Bogner	c2bf63d29d	PM: Port Reassociate to the new pass manager llvm-svn: 267631	2016-04-26 23:39:29 +00:00
Justin Bogner	cb8a21c88e	Reassociate: Convert another functor into a lambda. NFC Also move the explanatory comment with it. llvm-svn: 267628	2016-04-26 23:32:00 +00:00
Sanjay Patel	29dea0d230	[SimplifyCFG] propagate branch metadata when creating select llvm-svn: 267624	2016-04-26 23:15:48 +00:00
Sanjay Patel	d2d2aa52cd	[LowerExpectIntrinsic] make default likely/unlikely ratio bigger We need the default ratio to be sufficiently large that it triggers transforms based on block frequency info (BFI) and plays well with the recently introduced BranchProbability used by CGP. Differential Revision: http://reviews.llvm.org/D19435 llvm-svn: 267615	2016-04-26 22:23:38 +00:00
Justin Bogner	90744d215b	Reassociate: Simplify using lambdas. NFC llvm-svn: 267614	2016-04-26 22:22:18 +00:00
David Majnemer	abb9f55c80	Revert "[SimplifyLibCalls] sprintf doesn't copy null bytes" The destination buffer that sprintf uses is restrict qualified, we do not need to worry about derived pointers referenced via format specifiers. This reverts commit r267580. llvm-svn: 267605	2016-04-26 21:04:47 +00:00
Elena Demikhovsky	308a7eb0d2	Masked Store in Loop Vectorizer - bugfix Fixed a bug in loop vectorization with conditional store. Differential Revision: http://reviews.llvm.org/D19532 llvm-svn: 267597	2016-04-26 20:18:04 +00:00
Justin Bogner	4563a06cee	PM: Port Internalize to the new pass manager llvm-svn: 267596	2016-04-26 20:15:52 +00:00
David Majnemer	8cd77baebc	[SimplifyLibCalls] sprintf doesn't copy null bytes sprintf doesn't read or copy the terminating null byte from it's string operands. sprintf will append it's own after processing all of the format specifiers. This fixes PR27526. llvm-svn: 267580	2016-04-26 18:16:49 +00:00
Dehao Chen	5d6d4841ed	Tune basic block annotation algorithm. Summary: Instead of using maximum IR weight as the basic block weight, this patch uses the voting algorithm to find the most likely weight for the basic block. This can effectively avoid the cases when some IRs are annotated incorrectly due to code motion of the profiled binary. This patch also updates propagate.ll unittest to include discriminator in the input file so that it is testing something meaningful. Reviewers: davidxl, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19301 llvm-svn: 267519	2016-04-26 04:59:11 +00:00
Hal Finkel	e4c0c1679b	[SimplifyCFG] Preserve !llvm.mem.parallel_loop_access when merging When SimplifyCFG merges identical instructions from both sides of a diamond, it can preserve !llvm.mem.parallel_loop_access (as it does with most of the other metadata). There's no real data or control dependency change in this case. llvm-svn: 267515	2016-04-26 02:06:06 +00:00
Hal Finkel	411d31ad72	[LoopVectorize] Don't consider conditional-load dereferenceability for marked parallel loops I really thought we were doing this already, but we were not. Given this input: void Test(int res, int c, int d, int p) { for (int i = 0; i < 16; i++) res[i] = (p[i] == 0) ? res[i] : res[i] + d[i]; } we did not vectorize the loop. Even with "assume_safety" the check that we don't if-convert conditionally-executed loads (to protect against data-dependent deferenceability) was not elided. One subtlety: As implemented, it will still prefer to use a masked-load instrinsic (given target support) over the speculated load. The choice here seems architecture specific; the best option depends on how expensive the masked load is compared to a regular load. Ideally, using the masked load still reduces unnecessary memory traffic, and so should be preferred. If we'd rather do it the other way, flipping the order of the checks is easy. The LangRef is updated to make explicit that llvm.mem.parallel_loop_access also implies that if conversion is okay. Differential Revision: http://reviews.llvm.org/D19512 llvm-svn: 267514	2016-04-26 02:00:36 +00:00
David Majnemer	30ffc4ce45	[SROA] Don't falsely report that changes have occured We would report that the function changed despite creating no new allocas or performing any promotion. This fixes PR27316. llvm-svn: 267507	2016-04-26 01:05:00 +00:00
Justin Bogner	1a07501379	PM: Port GlobalOpt to the new pass manager llvm-svn: 267499	2016-04-26 00:28:01 +00:00
Justin Bogner	d2f3d0a79d	PM: Convert the logic for GlobalOpt into static functions. NFC Pass all of the state we need around as arguments, so that these functions are easier to reuse. There is one part of this that is unusual: we pass around a functor to look up a DomTree for a function. This will be a necessary abstraction when we try to use this code in both the legacy and the new pass manager. llvm-svn: 267498	2016-04-26 00:27:56 +00:00
Arch D. Robison	be0490a6e8	Optimize store of "bitcast" from vector to aggregate. This patch is what was the "instcombine" portion of D14185, with an additional test added (see julia_pseudovec in test/Transforms/InstCombine/insert-val-extract-elem.ll). The patch causes instcombine to replace sequences of extractelement-insertvalue-store that act essentially like a bitcast followed by a store. Differential review: http://reviews.llvm.org/D14260 llvm-svn: 267482	2016-04-25 22:22:39 +00:00
Teresa Johnson	c851d216e2	[ThinLTO] Introduce typedef for commonly-used map type (NFC) Add a typedef for the std::map<GlobalValue::GUID, GlobalValueSummary *> map that is passed around to identify summaries for values defined in a particular module. This shortens up declarations in a variety of places. llvm-svn: 267471	2016-04-25 21:09:51 +00:00
Etienne Bergeron	50f02aa3fa	Cleanup redundant expression in InstCombineAndOrXor. Summary: The expression is redundant on both side of operator \|. detected by : http://reviews.llvm.org/D19451 Reviewers: rnk, majnemer Subscribers: cfe-commits Differential Revision: http://reviews.llvm.org/D19459 llvm-svn: 267458	2016-04-25 20:15:33 +00:00
Chad Rosier	e2cbd13e56	[ValueTracking] Improve isImpliedCondition when the dominating cond is false. llvm-svn: 267430	2016-04-25 17:23:36 +00:00
Anna Thomas	95f68aa7eb	Test commit: modified comment. NFC llvm-svn: 267406	2016-04-25 13:58:05 +00:00
James Molloy	eb040cc55f	[GlobalOpt] Allow constant globals to be SRA'd The current logic assumes that any constant global will never be SRA'd. I presume this is because normally constant globals can be pushed into their uses and deleted. However, that sometimes can't happen (which is where you really want SRA, so the elements that can be eliminated, are!). There seems to be no reason why we can't SRA constants too, so let's do it. llvm-svn: 267393	2016-04-25 10:48:29 +00:00
Mehdi Amini	bf4513b9aa	Run GlobalOpt before emitting the bitcode for ThinLTO This is motivated by reducing the size of the IR and thus reduce compile time. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 267385	2016-04-25 08:47:49 +00:00
Mehdi Amini	f72ca86b71	ThinLTO: Move createNameAnonFunctionPass insertion in PassManagerBuilder (NFC) It is just code motion, but makes more sense this way. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 267384	2016-04-25 08:47:37 +00:00
Simon Pilgrim	4c564ad4dd	Tweak comments to make it clear that these combines are for SSE scalar instructions. llvm-svn: 267360	2016-04-24 19:31:56 +00:00
Simon Pilgrim	4b5462f119	[InstCombine][SSE] Reduce DIVSS/DIVSD to FDIV if only first element is required As discussed on D19318, if we only demand the first element of a DIVSS/DIVSD intrinsic, then reduce to a FDIV call. This matches the existing FADD/FSUB/FMUL patterns. llvm-svn: 267359	2016-04-24 18:35:59 +00:00
Simon Pilgrim	83020942d3	[InstCombine][SSE] Demanded vector elements for scalar intrinsics (Part 2 of 2) Split from D17490. This patch improves support for determining the demanded vector elements through SSE scalar intrinsics: 1 - demanded vector element support for unary and some extra binary scalar intrinsics (RCP/RSQRT/SQRT/FRCZ and ADD/CMP/DIV/ROUND). 2 - addss/addsd get simplified to a fadd call if we aren't interested in the pass through elements 3 - if we don't need the lowest element of a scalar operation then just use the first argument (the pass through elements) directly We can add support for propagating demanded elements through any equivalent packed SSE intrinsics in a future patch (these wouldn't use the pass through patterns). Differential Revision: http://reviews.llvm.org/D19318 llvm-svn: 267357	2016-04-24 18:23:14 +00:00
Simon Pilgrim	424da1637a	[InstCombine][SSE] Demanded vector elements for scalar intrinsics (Part 1 of 2) This patch improves support for determining the demanded vector elements through SSE scalar intrinsics: 1 - recognise that we only need the lowest element of the second input for binary scalar operations (and all the elements of the first input) 2 - recognise that the roundss/roundsd intrinsics use the lowest element of the second input and the remaining elements from the first input Differential Revision: http://reviews.llvm.org/D17490 llvm-svn: 267356	2016-04-24 18:12:42 +00:00
Simon Pilgrim	1c9a9f255c	[InstCombine] Avoid updating argument demanded elements in separate passes. As discussed on D17490, we should attempt to update an intrinsic's arguments demanded elements in one pass if we can. llvm-svn: 267355	2016-04-24 17:57:27 +00:00
Simon Pilgrim	2f6097d113	[X86][InstCombine] Tidyup VPERMILVAR -> shufflevector conversion to helper function. NFCI. llvm-svn: 267352	2016-04-24 17:23:46 +00:00
Simon Pilgrim	c0c56e747a	[X86][InstCombine] Tidyup PSHUFB -> shufflevector conversion to helper function. NFCI. llvm-svn: 267351	2016-04-24 17:00:34 +00:00
Teresa Johnson	28e457bccd	[ThinLTO] Remove GlobalValueInfo class from index Summary: Remove the GlobalValueInfo and change the ModuleSummaryIndex to directly reference summary objects. The info structure was there to support lazy parsing of the combined index summary objects, which is no longer needed and not supported. Reviewers: joker.eph Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D19462 llvm-svn: 267344	2016-04-24 14:57:11 +00:00
Mehdi Amini	cb87494f4c	Always traverse GlobalVariable initializer when computing the export list Summary: We are always importing the initializer for a GlobalVariable. So if a GlobalVariable is in the export-list, we pull in any refs as well. Reviewers: tejohnson Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19102 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 267303	2016-04-23 23:29:24 +00:00
Sanjay Patel	dc88bd6e1f	replace duplicated static functions for profile metadata access with BranchInst member function; NFCI llvm-svn: 267295	2016-04-23 20:01:22 +00:00
Sanjay Patel	85ce0f1f1f	improve documentation comments; NFC llvm-svn: 267292	2016-04-23 16:31:48 +00:00
Nico Weber	0aa9845d15	Revert r267210, it makes clang assert (PR27490). llvm-svn: 267232	2016-04-22 22:08:42 +00:00
Andrew Kaylor	aa641a5171	Re-commit optimization bisect support (r267022) without new pass manager support. The original commit was reverted because of a buildbot problem with LazyCallGraph::SCC handling (not related to the OptBisect handling). Differential Revision: http://reviews.llvm.org/D19172 llvm-svn: 267231	2016-04-22 22:06:11 +00:00
Rong Xu	f8f051cbf5	[PGO] change the interface for createPGOFuncNameMetadata() This patch changes the interface for createPGOFuncNameMetadata() where we add another PGOFuncName argument. Differential Revision: http://reviews.llvm.org/D19433 llvm-svn: 267216	2016-04-22 21:00:17 +00:00
Philip Reames	5f0e36947b	[unordered] sink unordered stores at end of blocks The existing code turned out to be completely correct when auditted. Thus, only minor code changes and adding a couple of tests. llvm-svn: 267215	2016-04-22 20:53:32 +00:00
Sanjoy Das	f97229d6ba	Fold compares for distinct allocations Summary: We can fold compares to false when two distinct allocations within a function are compared for equality. Patch by Anna Thomas! Reviewers: majnemer, reames, sanjoy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19390 llvm-svn: 267214	2016-04-22 20:52:25 +00:00
Philip Reames	eedef73b63	[unordered] Extend load/store type canonicalization to handle unordered operations Extend the type canonicalization logic to work for unordered atomic loads and stores. Note that while this change itself is fairly simple and low risk, there's a reasonable chance this will expose problems in the backends by suddenly generating IR they wouldn't have seen before. Anything of this nature will be an existing bug in the backend (you could write an atomic float load), but this will definitely change the frequency with which such cases are encountered. If you see problems, feel free to revert this change, but please make sure you collect a test case. llvm-svn: 267210	2016-04-22 20:33:48 +00:00
Justin Bogner	b93949089e	PM: Port SinkingPass to the new pass manager llvm-svn: 267199	2016-04-22 19:54:10 +00:00
Justin Bogner	82077c4ab0	PM: Reorder the functions used for SinkingPass. NFC This will make the port to the new PM easier to follow. llvm-svn: 267198	2016-04-22 19:54:04 +00:00
Jun Bum Lim	d29a24e4fd	[DeadStoreElimination] Shorten beginning of memset overwritten by later stores Summary: This change will shorten memset if the beginning of memset is overwritten by later stores. Reviewers: hfinkel, eeckstein, dberlin, mcrosier Subscribers: mgrang, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18906 llvm-svn: 267197	2016-04-22 19:51:29 +00:00
Justin Bogner	395c2127ed	PM: Port DCE to the new pass manager Also add a very basic test, since apparently there aren't any tests for DCE whatsoever to add the new pass version to. llvm-svn: 267196	2016-04-22 19:40:41 +00:00
Adam Nemet	fe3def7c2a	[LoopUtils] Extend findStringMetadataForLoop to return the value for metadata E.g. for: !1 = {"llvm.distribute", i32 1} it now returns the MDOperand for 1. I will use this in LoopDistribution to check the value of the metadata. Note that the change is backward-compatible with its current use in LoopVersioningLICM. An Optional implicitly converts to a bool depending whether it contains a value or not. llvm-svn: 267190	2016-04-22 19:10:05 +00:00
Chad Rosier	1a4bc110f5	[EarlyCSE/CVP] Add stats for CVPs and make sure to account for any Changes. llvm-svn: 267187	2016-04-22 18:47:21 +00:00
Geoff Berry	9fe26e6dc9	[MemorySSA] Fix bug in CachingMemorySSAWalker::invalidateInfo Summary: CachingMemorySSAWalker::invalidateInfo was using IsCall to determine which cache map needed to be cleared of entries referring to the invalidated MemoryAccess, but there could also be entries referring to it in the other cache map (value entries, not key entries). This change just clears both tables to be conservatively correct. Also add a verifyRemoved() function, called when expensive checks (i.e. XDEBUG) are enabled to verify that the invalidated MemoryAccess object is not referenced in any of the caches. Reviewers: dberlin, george.burgess.iv Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D19388 llvm-svn: 267157	2016-04-22 14:44:10 +00:00
David Majnemer	bfd695d591	[EarlyCSE] Don't add the overflow flags to the hash We take the intersection of overflow flags while CSE'ing. This permits us to consider two instructions with different overflow behavior to be replaceable. llvm-svn: 267153	2016-04-22 14:12:50 +00:00
Silviu Baranga	e985c76b90	[InstCombine] Preserve fast math flags when combining PHIs Summary: When optimizing PHIs which have inputs floating point binary operators, we preserve all IR flags except the fast math flags. This change removes the logic which tracked some of the IR flags (no wrap, exact) and replaces it by doing an and on the IR flags of all inputs to the PHI - which will also handle the fast math flags. Reviewers: majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19370 llvm-svn: 267139	2016-04-22 11:21:36 +00:00
Vedant Kumar	6013f45f92	Revert "Initial implementation of optimization bisect support." This reverts commit r267022, due to an ASan failure: http://lab.llvm.org:8080/green/job/clang-stage2-cmake-RgSan_check/1549 llvm-svn: 267115	2016-04-22 06:51:37 +00:00
David Majnemer	d0ce8f1485	[GVN] Respect fast-math-flags on fcmps We assumed that flags were only present on binary operators. This is not true, they may also be present on calls and fcmps. llvm-svn: 267113	2016-04-22 06:37:51 +00:00
David Majnemer	9554c1339c	[EarlyCSE] Take the intersection of flags on instructions EarlyCSE had inconsistent behavior with regards to flag'd instructions: - In some cases, it would pessimize if the available instruction had different flags by not performing CSE. - In other cases, it would miscompile if it replaced an instruction which had no flags with an instruction which has flags. Fix this by being more consistent with our flag handling by utilizing andIRFlags. llvm-svn: 267111	2016-04-22 06:37:45 +00:00
Duncan P. N. Exon Smith	71480bd0c7	ValueMapper/Enumerator: Clean up code in post-order traversals, NFC Re-layer the functions in the new (i.e., newly correct) post-order traversals in ValueEnumerator (r266947) and ValueMapper (r266949). Instead of adding a node to the worklist in a helper function and returning a flag to say what happened, return the node itself. This makes the code way cleaner: the worklist is local to the main function, there is no flag for an early loop exit (since we can cleanly bury the loop), and it's perfectly clear when pointers into the worklist might be invalidated. I'm fixing both algorithms in the same commit to avoid repeating the commit message; if you take the time to understand one the other should be easy. The diff itself isn't entirely obvious since the traversals have some noise (i.e., things to do), but here's the high-level change: auto helper = [&WL](T Op) { auto helper = [](T &I, T E) { => while (I != E) { if (shouldVisit(Op)) { T Op = I++; WL.push(Op, Op->begin()); if (shouldVisit(Op)) { return true; return Op; } } return false; return nullptr; }; }; => WL.push(S, S->begin()); WL.push(S, S->begin()); while (!empty()) { while (!empty()) { auto N = WL.top().N; auto N = WL.top().N; auto &I = WL.top().I; auto &I = WL.top().I; bool DidChange = false; while (I != N->end()) if (helper(I++)) { => if (T *Op = helper(I, N->end()) { DidChange = true; WL.push(Op, Op->begin()); break; continue; } } if (DidChange) continue; POT.push(WL.pop()); => POT.push(WL.pop()); } } Thanks to Mehdi for helping me find a better way to layer this. llvm-svn: 267099	2016-04-22 02:33:06 +00:00
Mike Aizatsky	243b71fd8b	Fixed flag description Summary: asan-use-after-return control feature we call use-after-return or stack-use-after-return. Reviewers: kcc, aizatsky, eugenis Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19284 llvm-svn: 267064	2016-04-21 22:00:13 +00:00
Derek Bruening	d862c178b0	[esan] EfficiencySanitizer instrumentation pass Summary: Adds an instrumentation pass for the new EfficiencySanitizer ("esan") performance tuning family of tools. Multiple tools will be supported within the same framework. Preliminary support for a cache fragmentation tool is included here. The shared instrumentation includes: + Turn mem{set,cpy,move} instrinsics into library calls. + Slowpath instrumentation of loads and stores via callouts to the runtime library. + Fastpath instrumentation will be per-tool. + Which memory accesses to ignore will be per-tool. Reviewers: eugenis, vitalybuka, aizatsky, filcab Subscribers: filcab, vkalintiris, pcc, silvas, llvm-commits, zhaoqin, kcc Differential Revision: http://reviews.llvm.org/D19167 llvm-svn: 267058	2016-04-21 21:30:22 +00:00
JF Bastien	c22d29982b	NFC: fix copy / paste comment llvm-svn: 267039	2016-04-21 19:53:39 +00:00
JF Bastien	3e2e69f607	NFC: fix nonsensical comment llvm-svn: 267036	2016-04-21 19:41:48 +00:00
Sanjoy Das	a085cfc150	Folding compares with unescaped allocations Summary: If we know that the pointer allocated within a function does not escape, we can fold away comparisons that are done with global pointers Patch by Anna Thomas! Reviewers: reames, majnemer, sanjoy Subscribers: mgrang, mcrosier, majnemer, llvm-commits Differential Revision: http://reviews.llvm.org/D19276 llvm-svn: 267035	2016-04-21 19:26:45 +00:00
Philip Reames	a98c7ead30	[instcombine][unordered] Extend load(select) transform to handle unordered loads llvm-svn: 267023	2016-04-21 17:59:40 +00:00
Andrew Kaylor	f0f279291c	Initial implementation of optimization bisect support. This patch implements a optimization bisect feature, which will allow optimizations to be selectively disabled at compile time in order to track down test failures that are caused by incorrect optimizations. The bisection is enabled using a new command line option (-opt-bisect-limit). Individual passes that may be skipped call the OptBisect object (via an LLVMContext) to see if they should be skipped based on the bisect limit. A finer level of control (disabling individual transformations) can be managed through an addition OptBisect method, but this is not yet used. The skip checking in this implementation is based on (and replaces) the skipOptnoneFunction check. Where that check was being called, a new call has been inserted in its place which checks the bisect limit and the optnone attribute. A new function call has been added for module and SCC passes that behaves in a similar way. Differential Revision: http://reviews.llvm.org/D19172 llvm-svn: 267022	2016-04-21 17:58:54 +00:00
Philip Reames	3ac0718423	[unordered] unordered loads from null are still unreachable llvm-svn: 267019	2016-04-21 17:45:05 +00:00
Adam Nemet	6dcf0788fc	[LoopUtils] Fix typo in comment llvm-svn: 267016	2016-04-21 17:33:22 +00:00
Adam Nemet	293be666eb	[LoopUtils] Add asserts to findStringMetadataForLoop. NFC These ensure that operand array has at least one element and it is the self-reference. llvm-svn: 267015	2016-04-21 17:33:20 +00:00
Adam Nemet	963341c872	[LoopUtils] Move def of findStringMetadataForLoop to LoopUtils.cpp. NFC The decl is in LoopUtils.h. I think that this was added to LoopVersioningLICM.cpp by mistake. llvm-svn: 267014	2016-04-21 17:33:17 +00:00
Adam Nemet	f787826b46	[LoopUtils] Rename {check->find}StringMetadata{Into->For}Loop. NFC "Into" was misleading. I am also planning to use this helper to look for loop metadata and return the argument, so find seems like a better name. llvm-svn: 267013	2016-04-21 17:33:12 +00:00
Philip Reames	ac55090e96	[instcombine][unordered] Implement *-load forwarding for unordered atomics This builds on 266999 which made FindAvailableValue do the right thing. Tests included show the newly enabled transforms and those which disabled either due to conservatism or correctness requirements. llvm-svn: 267006	2016-04-21 17:03:33 +00:00
Sanjoy Das	54a3a006ca	[SimplifyCFG] Fold `llvm.guard(false)` to unreachable Summary: `llvm.guard(false)` always bails out of the current compilation unit, so we can prune any control flow following it. Reviewers: hfinkel, pcc, reames Subscribers: majnemer, reames, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D19245 llvm-svn: 266955	2016-04-21 05:09:12 +00:00
Duncan P. N. Exon Smith	0ab44dbf8f	ValueMapper: Map uniqued nodes in post-order The iteratitive algorithm from r265456 claimed but failed to create a post-order traversal. It had the same error that was fixed in the ValueEnumerator in r266947: now, instead of pushing all operands on the worklist at once, we pause whenever an operand gets pushed in order to go depth-first (I know, it sounds obvious). Sadly, I have no idea how to observe this from outside the algorithm and so I haven't written a test. The output should be the same; it should just use fewer temporary nodes now. I've added some comments that I hope make the current logic clear enough it's unlikely to regress. llvm-svn: 266949	2016-04-21 02:34:36 +00:00
Mehdi Amini	bda3c97c16	ThinLTO/ModuleLinker: add a flag to not always pull-in linkonce when performing importing Summary: The function importer already decided what symbols need to be pulled in. Also these magically added ones will not be in the export list for the source module, which can confuse the internalizer for instance. Reviewers: tejohnson, rafael Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D19096 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266948	2016-04-21 01:59:39 +00:00
Dehao Chen	a8bae82373	Refine instruction weight annotation algorithm for sample profiler. Summary: This patch refined the instruction weight anootation algorithm: 1. Do not use dbg_value intrinsics for annotation. 2. Annotate cold calls if the call is inlined in profile, but not inlined before preparation. This indicates that the annotation preparation step found no sample for the inlined callsite, thus the call should be very cold. Reviewers: dnovillo, davidxl Subscribers: mgrang, llvm-commits Differential Revision: http://reviews.llvm.org/D19286 llvm-svn: 266936	2016-04-20 23:36:23 +00:00
Kostya Serebryany	a83bfeac9d	Rename asan-check-lifetime into asan-stack-use-after-scope Summary: This is done for consistency with asan-use-after-return. I see no other users than tests. Reviewers: aizatsky, kcc Differential Revision: http://reviews.llvm.org/D19306 llvm-svn: 266906	2016-04-20 20:02:58 +00:00
Chad Rosier	b346dcbc25	Typo. llvm-svn: 266905	2016-04-20 19:16:23 +00:00
Chad Rosier	41dd31f0b0	[ValueTracking] Make isImpliedCondition return an Optional<bool>. NFC. Phabricator Revision: http://reviews.llvm.org/D19277 llvm-svn: 266904	2016-04-20 19:15:26 +00:00
Teresa Johnson	b35cc691ea	[ThinLTO] Prevent importing of "llvm.used" values Summary: This patch prevents importing from (and therefore exporting from) any module with a "llvm.used" local value. Local values need to be promoted and renamed when importing, and their presense on the llvm.used variable indicates that there are opaque uses that won't see the rename. One such example is a use in inline assembly. See also the discussion at: http://lists.llvm.org/pipermail/llvm-dev/2016-April/098047.html As part of this, move collectUsedGlobalVariables out of Transforms/Utils and into IR/Module so that it can be used more widely. There are several other places in LLVM that used copies of this code that can be cleaned up as a follow on NFC patch. Reviewers: joker.eph Subscribers: pcc, llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D18986 llvm-svn: 266877	2016-04-20 14:39:45 +00:00
Mehdi Amini	bb3a1d92f3	ThinLTO: never promote as external weak This linkage is not intended to express that a declaration refers to a weak symbol, but that the symbol might not be present at link time. I don't believe it was the intent. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266856	2016-04-20 04:18:11 +00:00
Mehdi Amini	2c719cc117	FunctionImport: make sure we always select the right callee in presence of alias From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266854	2016-04-20 04:17:36 +00:00
Mehdi Amini	6968ef773b	ThinLTO: Move alias importing decision on the summary From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266845	2016-04-20 01:04:20 +00:00
Marcin Koscielnicki	ef2e7b4819	[Mips] [MSan] VarArgMIPS64Helper: Use target's endian, not host's. Ugh. Differential Revision: http://reviews.llvm.org/D19292 llvm-svn: 266833	2016-04-19 23:46:59 +00:00
David Majnemer	b4b27230bf	[ValueTracking, VectorUtils] Refactor getIntrinsicIDForCall The functionality contained within getIntrinsicIDForCall is two-fold: it checks if a CallInst's callee is a vectorizable intrinsic. If it isn't an intrinsic, it attempts to map the call's target to a suitable intrinsic. Move the mapping functionality into getIntrinsicForCallSite and rename getIntrinsicIDForCall to getVectorIntrinsicIDForCall while reimplementing it in terms of getIntrinsicForCallSite. llvm-svn: 266801	2016-04-19 19:10:21 +00:00
Chad Rosier	b7dfbb40a3	[ValueTracking] Improve isImpliedCondition for conditions with matching operands. This patch improves SimplifyCFG to catch cases like: if (a < b) { if (a > b) <- known to be false unreachable; } Phabricator Revision: http://reviews.llvm.org/D18905 llvm-svn: 266767	2016-04-19 17:19:14 +00:00
Mehdi Amini	aeb1e59b71	Minor improvement to debug output for Function Importer (NFC) From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266723	2016-04-19 09:21:30 +00:00
Daniel Berlin	77fa84eadd	Correct IDF calculator for ReverseIDF Summary: Need to use predecessors for reverse graph, successors for forward graph. succ_iterator/pred_iterator are not compatible, this patch is all the work necessary to work around that (which is what everywhere else does). Not sure if there is a better way, so cc'ing some random folks to take a gander :) Reviewers: dblaikie, qcolombet, echristo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18796 llvm-svn: 266718	2016-04-19 06:13:28 +00:00
Michael Kuperstein	de16b44f74	Port DemandedBits to the new pass manager. Differential Revision: http://reviews.llvm.org/D18679 llvm-svn: 266699	2016-04-18 23:55:01 +00:00
Xinliang David Li	e6b892940f	Port InstrProfiling pass to the new pass manager Differential Revision: http://reviews.llvm.org/D18126 llvm-svn: 266637	2016-04-18 17:47:38 +00:00
Mehdi Amini	b550cb1750	[NFC] Header cleanup Removed some unused headers, replaced some headers with forward class declarations. Found using simple scripts like this one: clear && ack --cpp -l '#include "llvm/ADT/IndexedMap.h"' \| xargs grep -L 'IndexedMap[<]' \| xargs grep -n --color=auto 'IndexedMap' Patch by Eugene Kosov <claprix@yandex.ru> Differential Revision: http://reviews.llvm.org/D19219 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266595	2016-04-18 09:17:29 +00:00
Duncan P. N. Exon Smith	724c503499	Transforms: Try harder to fix bootstrap after r266565 This catches two nullptr insertions into the ValueMap I missed in r266567. I missed CloneFunction becuase it never calls RemapInstruction directly. Here's one of the still-failing bots: http://lab.llvm.org:8011/builders/clang-x64-ninja-win7/builds/11496 llvm-svn: 266570	2016-04-17 20:11:09 +00:00
Duncan P. N. Exon Smith	0fdaf8c9c2	Linker: Don't double-schedule appending variables Add an assertion to ValueMapper that prevents double-scheduling of GlobalValues to remap, and fix the one place it happened. There are tons of tests that fail with this assertion in place and without the code change, so I'm not adding another. Although it looks related, r266563 was, indeed, removing dead code. AFAICT, this cross-file double-scheduling started in r266510 when the cross-file recursion was removed. llvm-svn: 266569	2016-04-17 19:40:20 +00:00
Duncan P. N. Exon Smith	a71301befa	Transforms: Fix bootstrap after r266565 Apparently there isn't test coverage for all of these. I'd appreciate if someone with could reproduce and send me something to reduce, but for now I've just looked for users of RemapInstruction and MapValue and ensured they don't accidentally insert nullptr. Here is one of the bootstraps that caught: http://lab.llvm.org:8011/builders/clang-x64-ninja-win7/builds/11494 llvm-svn: 266567	2016-04-17 19:26:49 +00:00
Duncan P. N. Exon Smith	3d555ac96d	ValueMapper: Don't allow explicit null mappings of Values, NFC As a follow-up to r123058, assert that there are no null mappings in the ValueMap instead of just ignoring them when they are there. There were a couple of accidental insertions in CloneFunction so I cleaned those up (caught by testcases). llvm-svn: 266565	2016-04-17 18:53:24 +00:00
Sanjoy Das	99042473d0	Fix a typo in rL265762 I accidentally replaced `mayBeOverridden` with `!isInterposable`. Remove the negation and add a test case that would've caught this. Many thanks to Håkan Hjort for spotting this! llvm-svn: 266551	2016-04-17 04:30:43 +00:00
Duncan P. N. Exon Smith	5ab2be094e	IR: Use an explicit map for debug info type uniquing Rather than relying on the structural equivalence of DICompositeType to merge type definitions, use an explicit map on the LLVMContext that LLParser and BitcodeReader consult when constructing new nodes. Each non-forward-declaration DICompositeType with a non-empty 'identifier:' field is stored/loaded from the type map, and the first definiton will "win". This map is opt-in: clients that expect ODR types from different modules to be merged must call LLVMContext::ensureDITypeMap. - Clients that just happen to load more than one Module in the same LLVMContext won't magically merge types. - Clients (like LTO) that want to continue to merge types based on ODR identifiers should opt-in immediately. I have updated LTOCodeGenerator.cpp, the two "linking" spots in gold-plugin.cpp, and llvm-link (unless -disable-debug-info-type-map) to set this. With this in place, it will be straightforward to remove the DITypeRef concept (i.e., referencing types by their 'identifier:' string rather than pointing at them directly). llvm-svn: 266549	2016-04-17 03:58:21 +00:00
Duncan P. N. Exon Smith	694ab4e966	ValueMapper: Separate mapping of distinct and uniqued nodes (again) Since the result of a mapped distinct node is known up front, it's more efficient to map them separately from uniqued nodes. This commit pulls them out of the post-order traversal and stores them in a worklist to be remapped at the top-level. This is essentially reapplying r244181 ("ValueMapper: Rotate distinct node remapping algorithm") to the new iterative algorithm from r265456 ("ValueMapper: Rewrite Mapper::mapMetadata without recursion"). Now that the traversal logic only handles uniqued MDNodes, it's much simpler to inline it all into MDNodeMapper::createPOT (I've killed the MDNodeMapper::push and MDNodeMapper::tryToPop helpers and localized the traversal worklist). The resulting high-level algorithm for MDNodeMapper::map now looks like this: - Distinct nodes are immediately mapped and added to MDNodeMapper::DistinctWorklist using MDNodeMapper::mapDistinctNode. - Uniqued nodes are mapped via MDNodeMapper::mapTopLevelUniquedNode, which traverses the transitive uniqued subgraph of a node to calculate uniqued node mappings in bulk. - This is a simplified version of MDNodeMapper::map from before this commit (originally r265456) that doesn't traverse through any distinct nodes. - Distinct nodes are added to MDNodeMapper::DistinctWorklist via MDNodeMapper::mapDistinctNode. - This uses MDNodeMapper::createPOT to fill a MDNodeMapper::UniquedGraph (a post-order traversal and side table), UniquedGraph::propagateChanges to track which uniqued nodes need to change, and MDNodeMapper::mapNodesInPOT to create the uniqued nodes. - Placeholders for forward references are now only needed when there's a uniquing cycle (a cycle of uniqued nodes unbroken by distinct nodes). This is the key functionality change that we're reintroducing (from r244181). As of r265456, a temporary forward reference might be needed for any cycle that involved uniqued nodes. - After mapping the first node appropriately, MDNodeMapper::map works through MDNodeMapper::DistinctWorklist. For each distinct node, its operands are remapped with MDNodeMapper::mapDistinctNode and MDNodeMapper::mapTopLevelUniquedNode until all nodes have been mapped. Sadly there's nothing observable I can test here; no real functionality change, just a compile-time speedup from reduced malloc traffic. llvm-svn: 266537	2016-04-16 21:44:08 +00:00
Duncan P. N. Exon Smith	0cb5c344b4	ValueMapper: Only put cyclic nodes into CyclicNodes, NFCI As a minor fixup to r266258, only track nodes that needed a placeholder in CyclicNodes in MDNodeMapper::mapUniquedNodes. There should be no observable functionality change, just some local memory savings because CyclicNodes only needs to grow to accommodate nodes that are actually involved in cycles. (This was the original intent of r266258, or else the vector would have been called "ChangedNodes".) llvm-svn: 266536	2016-04-16 21:09:53 +00:00
Simon Atanasyan	e12bef7ea7	ValueMapper: Fix unused var warning. NFC llvm-svn: 266529	2016-04-16 11:49:40 +00:00
Mehdi Amini	1aafabf752	ThinLTO: Move the ODR resolution to be based purely on the summary. This is a requirement for the cache handling in D18494 Differential Revision: http://reviews.llvm.org/D18908 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266519	2016-04-16 07:02:16 +00:00
Mehdi Amini	2d28f7aa07	ThinLTO: Make aliases explicit in the summary To be able to work accurately on the reference graph when taking decision about internalizing, promoting, renaming, etc. We need to have the alias information explicit. Differential Revision: http://reviews.llvm.org/D18836 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266517	2016-04-16 06:56:44 +00:00
Duncan P. N. Exon Smith	a77d073305	ValueMapper: Stop memoizing ConstantAsMetadata Stop memoizing ConstantAsMetadata in ValueMapper::mapMetadata. Now we have to recompute it, but these metadata aren't particularly common, and it restricts the lifetime of the Metadata map unnecessarily. (The motivation is that I have a patch which uses a single Metadata map for the lifetime of IRMover. Mehdi profiled r266446 with the patch applied and we saw a pretty big speedup in lib/Linker.) llvm-svn: 266513	2016-04-16 03:39:44 +00:00
Duncan P. N. Exon Smith	39423b0294	Reapply "ValueMapper: Eliminate cross-file co-recursion, NFC" This reverts commit r266507, reapplying r266503 (and r266505 "ValueMapper: Use API from r266503 in unit tests, NFC") completely unchanged. I reverted because of a bot failure here: http://lab.llvm.org:8011/builders/lld-x86_64-freebsd/builds/16810/ However, looking more closely, the failure was from a host-compiler crash (clang 3.7.1) when building: lib/CodeGen/AsmPrinter/CMakeFiles/LLVMAsmPrinter.dir/DwarfAccelTable.cpp.o I didn't modify that file, or anything it includes, with that commit. The next build (which hadn't picked up my revert) got past it: http://lab.llvm.org:8011/builders/lld-x86_64-freebsd/builds/16811/ I think this was just unfortunate timing. I suppose the bot must be flakey. llvm-svn: 266510	2016-04-16 02:29:55 +00:00
Duncan P. N. Exon Smith	6fe1ff260b	Revert "ValueMapper: Eliminate cross-file co-recursion, NFC" This reverts commit r266503, in case it's the root cause of this bot failure: http://lab.llvm.org:8011/builders/lld-x86_64-freebsd/builds/16810 I'm also reverting r266505 -- "ValueMapper: Use API from r266503 in unit tests, NFC" -- since it's in the way. llvm-svn: 266507	2016-04-16 02:05:33 +00:00
Duncan P. N. Exon Smith	f0d73f95c1	ValueMapper: Eliminate cross-file co-recursion, NFC Eliminate co-recursion of Mapper::mapValue through ValueMaterializer::materializeInitFor, through a major redesign of the ValueMapper.cpp interface. - Expose a ValueMapper class that controls the entry points to the mapping algorithms. - Change IRLinker to use ValueMapper directly, rather than llvm::RemapInstruction, llvm::MapValue, etc. - Use (e.g.) ValueMapper::scheduleMapGlobalInit to add mapping work to a worklist in ValueMapper instead of recursing. There were two fairly major complications. Firstly, IRLinker::linkAppendingVarProto incorporates an on-the-fly IR ugprade that I had to split apart. Long-term, this upgrade should be done in the bitcode reader (and we should only accept the "new" form), but for now I've just made it work and added a FIXME. The hold-op is that we need to deprecate C API that relies on this. Secondly, IRLinker has special logic to correctly implement aliases with comdats, and uses two ValueToValueMapTy instances and two ValueMaterializers. I supported this by allowing clients to register an alternate mapping context, whose MCID can be passed in when scheduling new work. While out of scope for this commit, it should now be straightforward to remove recursion from Mapper::mapValue. llvm-svn: 266503	2016-04-16 01:29:08 +00:00
Duncan P. N. Exon Smith	db6861e7dd	ValueMapper: Hide Mapper::VM behind an accessor, NFC Change Mapper::VM to a pointer and add a `getVM()` accessor for it. While this has no functionality change, it minimizes the diff on an upcoming patch that allows switching between instances of ValueToValueMapTy on a single Mapper instance. llvm-svn: 266490	2016-04-15 23:18:43 +00:00
Evgeniy Stepanov	40cd1514cf	[cfi] Support explicit sections for functions in cfi-icall. Allow explicit section for indirectly called functions in cfi-icall. Jumptables for functions in the same type class must be contiguous, so they always go to the default text section. Fixes PR25079. llvm-svn: 266486	2016-04-15 22:55:38 +00:00
David Majnemer	2e02ba78d5	[InstCombine] Don't transform compares of calls to functions named fabs{f,l,} InstCombine wants to optimize compares of calls to fabs with zero. However, we didn't have the necessary legality checking to verify that the function call had the same behavior as fabs. llvm-svn: 266452	2016-04-15 17:21:03 +00:00
Adrian Prantl	75819aedf6	[PR27284] Reverse the ownership between DICompileUnit and DISubprogram. Currently each Function points to a DISubprogram and DISubprogram has a scope field. For member functions the scope is a DICompositeType. DIScopes point to the DICompileUnit to facilitate type uniquing. Distinct DISubprograms (with isDefinition: true) are not part of the type hierarchy and cannot be uniqued. This change removes the subprograms list from DICompileUnit and instead adds a pointer to the owning compile unit to distinct DISubprograms. This would make it easy for ThinLTO to strip unneeded DISubprograms and their transitively referenced debug info. Motivation ---------- Materializing DISubprograms is currently the most expensive operation when doing a ThinLTO build of clang. We want the DISubprogram to be stored in a separate Bitcode block (or the same block as the function body) so we can avoid having to expensively deserialize all DISubprograms together with the global metadata. If a function has been inlined into another subprogram we need to store a reference the block containing the inlined subprogram. Attached to https://llvm.org/bugs/show_bug.cgi?id=27284 is a python script that updates LLVM IR testcases to the new format. http://reviews.llvm.org/D19034 <rdar://problem/25256815> llvm-svn: 266446	2016-04-15 15:57:41 +00:00
Sanjay Patel	f11ab05bdb	[SimplifyCFG] propagate branch metadata when creating select (PR27344) This is almost identical to: http://reviews.llvm.org/rL264527 This doesn't solve PR27344; it just allows the profile weights to survive. To solve the bug, we need to use the profile weights in the backend. llvm-svn: 266442	2016-04-15 15:32:12 +00:00
Justin Lebar	cf63b64fc6	[PM] Add a SpeculativeExecution pass for targets with divergent branches. Summary: This IR pass is helpful for GPUs, and other targets with divergent branches. It's a nop on targets without divergent branches. Reviewers: chandlerc Subscribers: llvm-commits, jingyue, rnk, joker.eph, tra Differential Revision: http://reviews.llvm.org/D18626 llvm-svn: 266399	2016-04-15 00:32:12 +00:00
Justin Lebar	cad81cf6b3	[Speculation] Add a SpeculativeExecution mode where the pass does nothing unless TTI::hasBranchDivergence() is true. Summary: This lets us add this pass to the IR pass manager unconditionally; it will simply not do anything on targets without branch divergence. Reviewers: tra Subscribers: llvm-commits, jingyue, rnk, chandlerc Differential Revision: http://reviews.llvm.org/D18625 llvm-svn: 266398	2016-04-15 00:32:09 +00:00
Renato Golin	5cb666add7	[ARM] Adding IEEE-754 SIMD detection to loop vectorizer Some SIMD implementations are not IEEE-754 compliant, for example ARM's NEON. This patch teaches the loop vectorizer to only allow transformations of loops that either contain no floating-point operations or have enough allowance flags supporting lack of precision (ex. -ffast-math, Darwin). For that, the target description now has a method which tells us if the vectorizer is allowed to handle FP math without falling into unsafe representations, plus a check on every FP instruction in the candidate loop to check for the safety flags. This commit makes LLVM behave like GCC with respect to ARM NEON support, but it stops short of fixing the underlying problem: sub-normals. Neither GCC nor LLVM have a flag for allowing sub-normal operations. Before this patch, GCC only allows it using unsafe-math flags and LLVM allows it by default with no way to turn it off (short of not using NEON at all). As a first step, we push this change to make it safe and in sync with GCC. The second step is to discuss a new sub-normal's flag on both communitues and come up with a common solution. The third step is to improve the FastMath flags in LLVM to encode sub-normals and use those flags to restrict NEON FP. Fixes PR16275. llvm-svn: 266363	2016-04-14 20:42:18 +00:00
Sanjay Patel	e998b91d86	[InstCombine] remove constant by inverting compare + logic (PR27105) https://llvm.org/bugs/show_bug.cgi?id=27105 We can check if all bits outside of a constant mask are set with a single constant. As noted in the bug report, although this form should be considered the canonical IR, backends may want to transform this into an 'andn' / 'andc' comparison against zero because that could be a single machine instruction. Differential Revision: http://reviews.llvm.org/D18842 llvm-svn: 266362	2016-04-14 20:17:40 +00:00
Dehao Chen	34cc676732	Fix null pointer access for discriminator assignment. Summary: This fixes the buildbot failure. Reviewers: dnovillo, davidxl Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19129 llvm-svn: 266360	2016-04-14 19:46:38 +00:00
Dehao Chen	46f8fbbb1b	Update discriminator assignment algorithm to handle nested call correctly. Summary: Add discriminator for nested call correctly. Reviewers: davidxl, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19127 llvm-svn: 266354	2016-04-14 18:37:18 +00:00
Davide Italiano	96d2a1c603	[ValueMapper] Range-loopify to improve readability. NFC. llvm-svn: 266350	2016-04-14 18:07:32 +00:00
Nicolai Haehnle	05b127da06	[StructurizeCFG] Annotate branches that were treated as uniform Summary: This fully solves the problem where the StructurizeCFG pass does not consider the same branches as uniform as the SIAnnotateControlFlow pass. The patch in D19013 helps with this problem, but is not sufficient (and, interestingly, causes a "regression" with one of the existing test cases). No tests included here, because tests in D19013 already cover this. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19018 llvm-svn: 266346	2016-04-14 17:42:35 +00:00
David Majnemer	0f26b0aeb4	[CodeGen] Teach LLVM how to lower @llvm.{min,max}num to {MIN,MAX}NAN The behavior of {MIN,MAX}NAN differs from that of {MIN,MAX}NUM when only one of the inputs is NaN: -NUM will return the non-NaN argument while -NAN would return NaN. It is desirable to lower to @llvm.{min,max}num to -NAN if they don't have a native instruction for -NUM. Notably, ARMv7 NEON's vmin has the -NAN semantics. N.B. Of course, it is only safe to do this if the intrinsic call is marked nnan. llvm-svn: 266279	2016-04-14 07:13:24 +00:00
Tim Northover	5c02f9ad28	ARM: override cost function to re-enable ConstantHoisting (& fix it). At some point, ARM stopped getting any benefit from ConstantHoisting because the pass called a different variant of getIntImmCost. Reimplementing the correct variant revealed some problems, however: + ConstantHoisting was modifying switch statements. This is simply invalid, the cases must remain integer constants no matter the notional cost. + ConstantHoisting was mangling alloca instructions in the entry block. These should be handled by FrameLowering, so constants actually have a cost of 0. Worse, the resulting bitcasts meant they became dynamic allocas. rdar://25707382 llvm-svn: 266260	2016-04-13 23:08:27 +00:00
Duncan P. N. Exon Smith	11f60fd65a	ValueMapper: Resolve cycles on the new nodes Fix a major bug from r265456. Although it's now much rarer, ValueMapper sometimes has to duplicate cycles. The might-transitively-reference-a-temporary counts don't decrement on their own when there are cycles, and you need to call MDNode::resolveCycles to fix it. r265456 was checking the input nodes to see if they were unresolved. This is useless; they should never be unresolved. Instead we should check the output nodes and resolve cycles on them. llvm-svn: 266258	2016-04-13 22:54:01 +00:00
JF Bastien	8331458deb	NFC mergefunc: const correctness Some of the comparators were const others weren't making it annoying to add new comparators which call existing ones. llvm-svn: 266247	2016-04-13 21:12:21 +00:00
Betul Buyukkurt	bf8554c279	[PGO] Remove redundant VP instrumentation LLVM optimization passes may reduce a profiled target expression to a constant. Removing runtime calls at such instrumentation points would help speedup the runtime of the instrumented program. llvm-svn: 266229	2016-04-13 18:52:19 +00:00
Mehdi Amini	b5b289339b	Revert "Make aliases explicit in the summary" Inadvertently commited... This reverts commit e618ec93786d99df2ddf280ad2d5e02f5516cecf. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266215	2016-04-13 17:20:07 +00:00
Mehdi Amini	ce744a95fd	Make aliases explicit in the summary Summary: To be able to work accurately on the reference graph when taking decision about internalizing, promoting, renaming, etc. We need to have the alias information explicit. Reviewers: tejohnson Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18836 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266214	2016-04-13 17:18:42 +00:00
David L Kreitzer	752c1448fe	Simplify strlen to a subtraction for certain cases. Patch by Li Huang (li1.huang@intel.com) Differential Revision: http://reviews.llvm.org/D18230 llvm-svn: 266200	2016-04-13 14:31:06 +00:00
David Majnemer	3ee5f34469	[InstCombine] We folded an fcmp to an i1 instead of a vector of i1 Remove an ad-hoc transform in InstCombine and replace it with more general machinery (ValueTracking, InstructionSimplify and VectorUtils). This fixes PR27332. llvm-svn: 266175	2016-04-13 06:55:52 +00:00
Mehdi Amini	105938302a	Minor cleanup in Internalize, hide helper class using anonymous namespace (NFC) From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266173	2016-04-13 06:32:29 +00:00
Mehdi Amini	59269a874f	Really return whether Internalize did change the Module or not. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266169	2016-04-13 05:25:16 +00:00
Mehdi Amini	3949b9e6dd	Modernize Internalizer with for-range loop (NFC) From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266168	2016-04-13 05:25:12 +00:00
Mehdi Amini	24d3414f06	Refactor the InternalizePass into a helper class, and expose it through a public free function (NFC) There is really no reason to require to instanciate a pass manager to internalize. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266167	2016-04-13 05:25:08 +00:00
Mehdi Amini	4078709957	Refactor Internalization pass to use as a callback instead of a StringSet (NFC) This will save a bunch of copies / initialization of intermediate datastructure, and (hopefully) simplify the code. This also abstract the symbol preservation mechanism outside of the Internalization pass into the client code, which is not forced to keep a map of strings for instance (ThinLTO will prefere hashes). From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266163	2016-04-13 04:20:32 +00:00
Mehdi Amini	ef7555fbb2	Fix FunctionImport export list computation: need to take a reference to a map entry to actually modify it From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266159	2016-04-13 01:52:32 +00:00
Mehdi Amini	818f67add5	Fix mismatch on returned type between header and implementation for createNameAnonFunctionPass() From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266151	2016-04-12 23:25:11 +00:00

1 2 3 4 5 ...

15017 Commits