llvm-project

Commit Graph

Author	SHA1	Message	Date
Ahmed Bougacha	44c19876c7	[InferAttrs] Mark memset_pattern16 params nocapture. Differential Revision: http://reviews.llvm.org/D19471 llvm-svn: 267760	2016-04-27 19:04:43 +00:00
Ahmed Bougacha	b0624a2cb4	[TLI] Unify LibFunc attribute inference. NFCI. Now the pass is just a tiny wrapper around the util. This lets us reuse the logic elsewhere (done here for BuildLibCalls) instead of duplicating it. The next step is to have something like getOrInsertLibFunc that also sets the attributes. Differential Revision: http://reviews.llvm.org/D19470 llvm-svn: 267759	2016-04-27 19:04:40 +00:00
Ahmed Bougacha	d765a82b54	[TLI] Unify LibFunc signature checking. NFCI. I tried to be as close as possible to the strongest check that existed before; cleaning these up properly is left for future work. Differential Revision: http://reviews.llvm.org/D19469 llvm-svn: 267758	2016-04-27 19:04:35 +00:00
Matthew Simpson	622b95be7b	[LV] Reallow positive-stride interleaved load groups with gaps We previously disallowed interleaved load groups that may cause us to speculatively access memory out-of-bounds (r261331). We did this by ensuring each load group had an access corresponding to the first and last member. Instead of bailing out for these interleaved groups, this patch enables us to peel off the last vector iteration, ensuring that we execute at least one iteration of the scalar remainder loop. This solution was proposed in the review of the previous patch. Differential Revision: http://reviews.llvm.org/D19487 llvm-svn: 267751	2016-04-27 18:21:36 +00:00
Arch D. Robison	aca7c412b4	[SLPVectorizer] Refactor where MinVecRegSize and MaxVecRegSize live. This is the first of two commits for extending SLP Vectorizer to deal with aggregates. This commit merely refactors existing logic. http://reviews.llvm.org/D14185 llvm-svn: 267748	2016-04-27 17:46:25 +00:00
Matthew Simpson	e5dfb08fcb	[TTI] Add hook for vector extract with extension This change adds a new hook for estimating the cost of vector extracts followed by zero- and sign-extensions. The motivating example for this change is the SMOV and UMOV instructions on AArch64. These instructions move data from vector to general purpose registers while performing the corresponding extension (sign-extend for SMOV and zero-extend for UMOV) at the same time. For these operations, TargetTransformInfo can assume the extensions are free and only report the cost of the vector extract. The SLP vectorizer has been updated to make use of the new hook. Differential Revision: http://reviews.llvm.org/D18523 llvm-svn: 267725	2016-04-27 15:20:21 +00:00
Teresa Johnson	df5ef8711f	[ThinLTO] Refine fix to avoid renaming of uses in inline assembly. Summary: Refine the workaround from r266877 that attempts to prevent renaming of locals in inline assembly, so that in addition to looking for a llvm.used local value, that there is at least one inline assembly call in the module. Otherwise, debug functions added to the llvm.used can block importing/exporting unnecessarily. Reviewers: joker.eph Subscribers: llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D19573 llvm-svn: 267717	2016-04-27 14:19:38 +00:00
Artur Pilipenko	9bb6beabf4	isSafeToLoadUnconditionally support queries without a context This is required to use this function from isSafeToSpeculativelyExecute Reviewed By: hfinkel Differential Revision: http://reviews.llvm.org/D16231 llvm-svn: 267692	2016-04-27 11:00:48 +00:00
Adam Nemet	d2fa414718	[LoopDist] Add llvm.loop.distribute.enable loop metadata Summary: D19403 adds a new pragma for loop distribution. This change adds support for the corresponding metadata that the pragma is translated to by the FE. As part of this I had to rethink the flag -enable-loop-distribute. My goal was to be backward compatible with the existing behavior: A1. pass is off by default from the optimization pipeline unless -enable-loop-distribute is specified A2. pass is on when invoked directly from opt (e.g. for unit-testing) The new pragma/metadata overrides these defaults so the new behavior is: B1. A1 + enable distribution for individual loop with the pragma/metadata B2. A2 + disable distribution for individual loop with the pragma/metadata The default value whether the pass is on or off comes from the initiator of the pass. From the PassManagerBuilder the default is off, from opt it's on. I moved -enable-loop-distribute under the pass. If the flag is specified it overrides the default from above. Then the pragma/metadata can further modifies this per loop. As a side-effect, we can now also use -enable-loop-distribute=0 from opt to emulate the default from the optimization pipeline. So to be precise this is the new behavior: C1. pass is off by default from the optimization pipeline unless -enable-loop-distribute or the pragma/metadata enables it C2. pass is on when invoked directly from opt unless -enable-loop-distribute=0 or the pragma/metadata disables it Reviewers: hfinkel Subscribers: joker.eph, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D19431 llvm-svn: 267672	2016-04-27 05:28:18 +00:00
Vaivaswatha Nagaraj	08efb0efcd	[Cloning] cloneLoopWithPreheader(): add assert to ensure no sub-loops Summary: cloneLoopWithPreheader() does not update LoopInfo for sub-loop of the original loop being cloned. Add assert to ensure no sub-loops for loop being cloned. Reviewers: anemet, ashutosh.nema, hfinkel Subscribers: mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D15922 llvm-svn: 267671	2016-04-27 05:25:09 +00:00
Evgeny Stupachenko	23ce61b663	The patch fixes PR27392. Summary: It is incorrect to compare TripCount (which is BECount + 1) with extraiters (or Count) to check if we should enter unrolled loop or not, because TripCount can potentially overflow (when BECount is max unsigned integer). While comparing BECount with (Count - 1) is overflow safe and therefore correct. Reviewer: hfinkel Differential Revision: http://reviews.llvm.org/D19256 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 267662	2016-04-27 03:04:54 +00:00
Sanjoy Das	5253a089ba	Fix typo in comment; NFC llvm-svn: 267653	2016-04-27 01:44:31 +00:00
Mehdi Amini	b4e1e8297b	ThinLTO: do not promote GlobalVariable that have a specific section. Differential Revision: http://reviews.llvm.org/D18298 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 267646	2016-04-27 00:32:13 +00:00
Matt Arsenault	ba437c67d2	SLSR: Use UnknownAddressSpace instead of 0 for pure arithmetic. In the case where isLegalAddressingMode is used for cases not related to addressing modes, such as pure adds and muls, it should not be using address space 0. LSR already passes -1 as the address space in these cases. llvm-svn: 267645	2016-04-27 00:32:09 +00:00
Adam Nemet	61399ac424	[LoopDist] Split main class. NFC This splits out the per-loop functionality from the Pass class. With this the fact whether the loop is forced-distribute with the new metadata/pragma can be cached in the per-loop class rather than passed around. llvm-svn: 267643	2016-04-27 00:31:03 +00:00
Justin Bogner	c2bf63d29d	PM: Port Reassociate to the new pass manager llvm-svn: 267631	2016-04-26 23:39:29 +00:00
Justin Bogner	cb8a21c88e	Reassociate: Convert another functor into a lambda. NFC Also move the explanatory comment with it. llvm-svn: 267628	2016-04-26 23:32:00 +00:00
Sanjay Patel	29dea0d230	[SimplifyCFG] propagate branch metadata when creating select llvm-svn: 267624	2016-04-26 23:15:48 +00:00
Sanjay Patel	d2d2aa52cd	[LowerExpectIntrinsic] make default likely/unlikely ratio bigger We need the default ratio to be sufficiently large that it triggers transforms based on block frequency info (BFI) and plays well with the recently introduced BranchProbability used by CGP. Differential Revision: http://reviews.llvm.org/D19435 llvm-svn: 267615	2016-04-26 22:23:38 +00:00
Justin Bogner	90744d215b	Reassociate: Simplify using lambdas. NFC llvm-svn: 267614	2016-04-26 22:22:18 +00:00
David Majnemer	abb9f55c80	Revert "[SimplifyLibCalls] sprintf doesn't copy null bytes" The destination buffer that sprintf uses is restrict qualified, we do not need to worry about derived pointers referenced via format specifiers. This reverts commit r267580. llvm-svn: 267605	2016-04-26 21:04:47 +00:00
Elena Demikhovsky	308a7eb0d2	Masked Store in Loop Vectorizer - bugfix Fixed a bug in loop vectorization with conditional store. Differential Revision: http://reviews.llvm.org/D19532 llvm-svn: 267597	2016-04-26 20:18:04 +00:00
Justin Bogner	4563a06cee	PM: Port Internalize to the new pass manager llvm-svn: 267596	2016-04-26 20:15:52 +00:00
David Majnemer	8cd77baebc	[SimplifyLibCalls] sprintf doesn't copy null bytes sprintf doesn't read or copy the terminating null byte from it's string operands. sprintf will append it's own after processing all of the format specifiers. This fixes PR27526. llvm-svn: 267580	2016-04-26 18:16:49 +00:00
Dehao Chen	5d6d4841ed	Tune basic block annotation algorithm. Summary: Instead of using maximum IR weight as the basic block weight, this patch uses the voting algorithm to find the most likely weight for the basic block. This can effectively avoid the cases when some IRs are annotated incorrectly due to code motion of the profiled binary. This patch also updates propagate.ll unittest to include discriminator in the input file so that it is testing something meaningful. Reviewers: davidxl, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19301 llvm-svn: 267519	2016-04-26 04:59:11 +00:00
Hal Finkel	e4c0c1679b	[SimplifyCFG] Preserve !llvm.mem.parallel_loop_access when merging When SimplifyCFG merges identical instructions from both sides of a diamond, it can preserve !llvm.mem.parallel_loop_access (as it does with most of the other metadata). There's no real data or control dependency change in this case. llvm-svn: 267515	2016-04-26 02:06:06 +00:00
Hal Finkel	411d31ad72	[LoopVectorize] Don't consider conditional-load dereferenceability for marked parallel loops I really thought we were doing this already, but we were not. Given this input: void Test(int res, int c, int d, int p) { for (int i = 0; i < 16; i++) res[i] = (p[i] == 0) ? res[i] : res[i] + d[i]; } we did not vectorize the loop. Even with "assume_safety" the check that we don't if-convert conditionally-executed loads (to protect against data-dependent deferenceability) was not elided. One subtlety: As implemented, it will still prefer to use a masked-load instrinsic (given target support) over the speculated load. The choice here seems architecture specific; the best option depends on how expensive the masked load is compared to a regular load. Ideally, using the masked load still reduces unnecessary memory traffic, and so should be preferred. If we'd rather do it the other way, flipping the order of the checks is easy. The LangRef is updated to make explicit that llvm.mem.parallel_loop_access also implies that if conversion is okay. Differential Revision: http://reviews.llvm.org/D19512 llvm-svn: 267514	2016-04-26 02:00:36 +00:00
David Majnemer	30ffc4ce45	[SROA] Don't falsely report that changes have occured We would report that the function changed despite creating no new allocas or performing any promotion. This fixes PR27316. llvm-svn: 267507	2016-04-26 01:05:00 +00:00
Justin Bogner	1a07501379	PM: Port GlobalOpt to the new pass manager llvm-svn: 267499	2016-04-26 00:28:01 +00:00
Justin Bogner	d2f3d0a79d	PM: Convert the logic for GlobalOpt into static functions. NFC Pass all of the state we need around as arguments, so that these functions are easier to reuse. There is one part of this that is unusual: we pass around a functor to look up a DomTree for a function. This will be a necessary abstraction when we try to use this code in both the legacy and the new pass manager. llvm-svn: 267498	2016-04-26 00:27:56 +00:00
Arch D. Robison	be0490a6e8	Optimize store of "bitcast" from vector to aggregate. This patch is what was the "instcombine" portion of D14185, with an additional test added (see julia_pseudovec in test/Transforms/InstCombine/insert-val-extract-elem.ll). The patch causes instcombine to replace sequences of extractelement-insertvalue-store that act essentially like a bitcast followed by a store. Differential review: http://reviews.llvm.org/D14260 llvm-svn: 267482	2016-04-25 22:22:39 +00:00
Teresa Johnson	c851d216e2	[ThinLTO] Introduce typedef for commonly-used map type (NFC) Add a typedef for the std::map<GlobalValue::GUID, GlobalValueSummary *> map that is passed around to identify summaries for values defined in a particular module. This shortens up declarations in a variety of places. llvm-svn: 267471	2016-04-25 21:09:51 +00:00
Etienne Bergeron	50f02aa3fa	Cleanup redundant expression in InstCombineAndOrXor. Summary: The expression is redundant on both side of operator \|. detected by : http://reviews.llvm.org/D19451 Reviewers: rnk, majnemer Subscribers: cfe-commits Differential Revision: http://reviews.llvm.org/D19459 llvm-svn: 267458	2016-04-25 20:15:33 +00:00
Chad Rosier	e2cbd13e56	[ValueTracking] Improve isImpliedCondition when the dominating cond is false. llvm-svn: 267430	2016-04-25 17:23:36 +00:00
Anna Thomas	95f68aa7eb	Test commit: modified comment. NFC llvm-svn: 267406	2016-04-25 13:58:05 +00:00
James Molloy	eb040cc55f	[GlobalOpt] Allow constant globals to be SRA'd The current logic assumes that any constant global will never be SRA'd. I presume this is because normally constant globals can be pushed into their uses and deleted. However, that sometimes can't happen (which is where you really want SRA, so the elements that can be eliminated, are!). There seems to be no reason why we can't SRA constants too, so let's do it. llvm-svn: 267393	2016-04-25 10:48:29 +00:00
Mehdi Amini	bf4513b9aa	Run GlobalOpt before emitting the bitcode for ThinLTO This is motivated by reducing the size of the IR and thus reduce compile time. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 267385	2016-04-25 08:47:49 +00:00
Mehdi Amini	f72ca86b71	ThinLTO: Move createNameAnonFunctionPass insertion in PassManagerBuilder (NFC) It is just code motion, but makes more sense this way. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 267384	2016-04-25 08:47:37 +00:00
Simon Pilgrim	4c564ad4dd	Tweak comments to make it clear that these combines are for SSE scalar instructions. llvm-svn: 267360	2016-04-24 19:31:56 +00:00
Simon Pilgrim	4b5462f119	[InstCombine][SSE] Reduce DIVSS/DIVSD to FDIV if only first element is required As discussed on D19318, if we only demand the first element of a DIVSS/DIVSD intrinsic, then reduce to a FDIV call. This matches the existing FADD/FSUB/FMUL patterns. llvm-svn: 267359	2016-04-24 18:35:59 +00:00
Simon Pilgrim	83020942d3	[InstCombine][SSE] Demanded vector elements for scalar intrinsics (Part 2 of 2) Split from D17490. This patch improves support for determining the demanded vector elements through SSE scalar intrinsics: 1 - demanded vector element support for unary and some extra binary scalar intrinsics (RCP/RSQRT/SQRT/FRCZ and ADD/CMP/DIV/ROUND). 2 - addss/addsd get simplified to a fadd call if we aren't interested in the pass through elements 3 - if we don't need the lowest element of a scalar operation then just use the first argument (the pass through elements) directly We can add support for propagating demanded elements through any equivalent packed SSE intrinsics in a future patch (these wouldn't use the pass through patterns). Differential Revision: http://reviews.llvm.org/D19318 llvm-svn: 267357	2016-04-24 18:23:14 +00:00
Simon Pilgrim	424da1637a	[InstCombine][SSE] Demanded vector elements for scalar intrinsics (Part 1 of 2) This patch improves support for determining the demanded vector elements through SSE scalar intrinsics: 1 - recognise that we only need the lowest element of the second input for binary scalar operations (and all the elements of the first input) 2 - recognise that the roundss/roundsd intrinsics use the lowest element of the second input and the remaining elements from the first input Differential Revision: http://reviews.llvm.org/D17490 llvm-svn: 267356	2016-04-24 18:12:42 +00:00
Simon Pilgrim	1c9a9f255c	[InstCombine] Avoid updating argument demanded elements in separate passes. As discussed on D17490, we should attempt to update an intrinsic's arguments demanded elements in one pass if we can. llvm-svn: 267355	2016-04-24 17:57:27 +00:00
Simon Pilgrim	2f6097d113	[X86][InstCombine] Tidyup VPERMILVAR -> shufflevector conversion to helper function. NFCI. llvm-svn: 267352	2016-04-24 17:23:46 +00:00
Simon Pilgrim	c0c56e747a	[X86][InstCombine] Tidyup PSHUFB -> shufflevector conversion to helper function. NFCI. llvm-svn: 267351	2016-04-24 17:00:34 +00:00
Teresa Johnson	28e457bccd	[ThinLTO] Remove GlobalValueInfo class from index Summary: Remove the GlobalValueInfo and change the ModuleSummaryIndex to directly reference summary objects. The info structure was there to support lazy parsing of the combined index summary objects, which is no longer needed and not supported. Reviewers: joker.eph Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D19462 llvm-svn: 267344	2016-04-24 14:57:11 +00:00
Mehdi Amini	cb87494f4c	Always traverse GlobalVariable initializer when computing the export list Summary: We are always importing the initializer for a GlobalVariable. So if a GlobalVariable is in the export-list, we pull in any refs as well. Reviewers: tejohnson Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19102 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 267303	2016-04-23 23:29:24 +00:00
Sanjay Patel	dc88bd6e1f	replace duplicated static functions for profile metadata access with BranchInst member function; NFCI llvm-svn: 267295	2016-04-23 20:01:22 +00:00
Sanjay Patel	85ce0f1f1f	improve documentation comments; NFC llvm-svn: 267292	2016-04-23 16:31:48 +00:00
Nico Weber	0aa9845d15	Revert r267210, it makes clang assert (PR27490). llvm-svn: 267232	2016-04-22 22:08:42 +00:00

1 2 3 4 5 ...

14911 Commits