llvm-project

Commit Graph

Author	SHA1	Message	Date
Simon Pilgrim	ca140b17cb	[InstCombine][SSE] Added support to VPERMD/VPERMPS to shuffle combine to accept UNDEF elements. llvm-svn: 268206	2016-05-01 20:43:02 +00:00
Simon Pilgrim	c590492075	Dropped FIXME comment llvm-svn: 268205	2016-05-01 20:33:25 +00:00
Simon Pilgrim	eeacc40e27	[InstCombine][SSE] Added support to VPERMILVAR to shuffle combine to accept UNDEF elements. llvm-svn: 268204	2016-05-01 20:22:42 +00:00
Simon Pilgrim	cc7f567b6a	[InstCombine][AVX] Fixed PERMILVAR identity tests and added additional decode tests llvm-svn: 268203	2016-05-01 20:06:47 +00:00
Simon Pilgrim	e5e8c2fde0	[InstCombine][SSE] Added support to PSHUFB to shuffle combine to accept UNDEF elements. llvm-svn: 268202	2016-05-01 19:26:21 +00:00
Simon Pilgrim	cae3e70707	[InstCombine][SSE] Regenerate MOVSX/MOVZX tests llvm-svn: 268201	2016-05-01 18:28:45 +00:00
Simon Pilgrim	8cddf8b3c6	[InstCombine][AVX2] Combine VPERMD/VPERMPS intrinsics with constant masks to shufflevector. llvm-svn: 268199	2016-05-01 16:41:22 +00:00
Simon Pilgrim	c179435055	[InstCombine][AVX2] Added VPERMD/VPERMPS shuffle combining placeholder tests. For future support for VPERMD/VPERMPS to generic shuffles combines llvm-svn: 268166	2016-04-30 20:41:52 +00:00
Simon Pilgrim	8e38a5439b	[InstCombine][AVX] Split off VPERMILVAR tests and added additional tests for UNDEF mask elements llvm-svn: 268159	2016-04-30 07:32:19 +00:00
Sanjoy Das	47cf2affbd	[LowerGuardIntrinsics] Keep track of !make.implicit metadata If a guard call being lowered by LowerGuardIntrinsics has the `!make.implicit` metadata attached, then reattach the metadata to the branch in the resulting expanded form of the intrinsic. This allows us to implement null checks as guards and still get the benefit of implicit null checks. llvm-svn: 268148	2016-04-30 00:55:59 +00:00
Lawrence Hu	1befea2bdc	Reroll loops with multiple IV and negative step part 3 support multiple induction variables This patch enables loop reroll for the following case: for(int i=0; i<N; i += 2) { S += a++; S += a++; }; Differential Revision: http://reviews.llvm.org/D16550 llvm-svn: 268147	2016-04-30 00:51:22 +00:00
Sanjoy Das	52c68bb0f5	[LowerGuardIntrinsics] Preserve calling conv when lowering llvm-svn: 268142	2016-04-30 00:17:47 +00:00
Sanjay Patel	bc6fad0bdf	add minimal test to show dropped metadata llvm-svn: 268141	2016-04-30 00:12:54 +00:00
Sanjay Patel	6748ec49e9	remove the metadata added with r267827 We can demonstrate the 'select' bug and fix with a simpler test case. The merged weight values are already tested in another test. llvm-svn: 268139	2016-04-30 00:02:36 +00:00
Sanjoy Das	107aefc2fc	Mark guards on true as "trivially dead" This moves some logic added to EarlyCSE in rL268120 into `llvm::isInstructionTriviallyDead`. Adds a test case for DCE to demonstrate that passes other than EarlyCSE can now pick up on the new information. llvm-svn: 268126	2016-04-29 22:23:16 +00:00
Sanjoy Das	ee81b23fe7	[EarlyCSE] Simplify guard intrinsics Summary: This change teaches EarlyCSE some basic properties of guard intrinsics: - Guard intrinsics read all memory, but don't write to any memory - After a guard has executed, the condition it was guarding on can be assumed to be true - Guard intrinsics on a constant `true` are no-ops Reviewers: reames, hfinkel Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D19578 llvm-svn: 268120	2016-04-29 21:52:58 +00:00
Chad Rosier	cd62bf5821	[InstCombine] Determine the result of a select based on a dominating condition. Differential Revision: http://reviews.llvm.org/D19550 llvm-svn: 268104	2016-04-29 21:12:31 +00:00
David Majnemer	d2a074b1f4	[ValueTracking] matchSelectPattern needs to be more careful around FP matchSelectPattern attempts to see through casts which mask min/max patterns from being more obvious. Under certain circumstances, it would misidentify a sequence of instructions as a min/max because it assumed that folding casts would preserve the result. This is not the case for floating point <-> integer casts. This fixes PR27575. llvm-svn: 268086	2016-04-29 18:40:34 +00:00
Geoff Berry	b92cd5293e	[BasicAA] Treat llvm.assume as not accessing memory in getModRefBehavior(Function) Reviewers: dberlin, chandlerc, hfinkel, reames, sanjoy Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D19730 llvm-svn: 268068	2016-04-29 17:18:28 +00:00
Sanjay Patel	362dcf9615	auto-generate checks llvm-svn: 268061	2016-04-29 16:39:37 +00:00
Simon Pilgrim	07a691c706	[InstCombine][SSE] Added x86 pshufb undef mask tests FIXME: We currently don't support folding constant pshufb shuffle masks containing undef elements. llvm-svn: 268016	2016-04-29 09:13:53 +00:00
Simon Pilgrim	5779fb61b0	[InstCombine][SSE] Regenerated x86 pshufb tests llvm-svn: 268014	2016-04-29 08:53:35 +00:00
David Majnemer	1a5799fe3e	[DeadArgumentElimination] Propagate operand bundles to promoted call sites We neglected to transfer operand bundles when performing argument promotion. llvm-svn: 268008	2016-04-29 07:22:36 +00:00
Adam Nemet	c60d8f8fd0	[LoopDist] Add missing RUN line in test from r268006 llvm-svn: 268007	2016-04-29 07:16:00 +00:00
Adam Nemet	88ec491830	[LoopDist] Also emit optimization remark on success (-Rpass=) The option -Rpass=loop-distribute now reports the loops that were distributed. llvm-svn: 268006	2016-04-29 07:10:46 +00:00
David Majnemer	13d5526392	[SLPVectorizer] Add operand bundles to vectorized functions SLPVectorizing a call site should result in further propagation of its bundles. llvm-svn: 268004	2016-04-29 07:09:51 +00:00
David Majnemer	50ddc0e1b6	[LoopVectorize] Add operand bundles to vectorized functions Also, do not crash when calculating a cost model for loop-invariant token values. llvm-svn: 268003	2016-04-29 07:09:48 +00:00
Matt Arsenault	7d1b6c81af	AMDGPU: Stop reporting an addressing mode for unknown addrspace This was being treated the same as private, which has an immediate offset. For unknown, it probably means it's for a computation not actually being used for accessing memory, so it should not have a nontrivial addressing mode. llvm-svn: 268002	2016-04-29 06:25:10 +00:00
David Majnemer	cd24bb1d3a	[ArgumentPromotion] Propagate operand bundles to promoted call sites We neglected to transfer operand bundles when performing argument promotion. This fixes PR27568. llvm-svn: 267986	2016-04-29 04:56:12 +00:00
Michael Zolotukhin	1816d03b7d	[PR25281] Remove AAResultsWrapper from preserved analyses of loop vectorizer. We don't preserve AAResults, because, for one, we don't preserve SCEV-AA. That fixes PR25281. llvm-svn: 267980	2016-04-29 03:31:25 +00:00
Hal Finkel	1b66f7e3c8	[LoopVectorize] Keep hints from original loop on the vector loop We need to keep loop hints from the original loop on the new vector loop. Failure to do this meant that, for example: void foo(int *b) { #pragma clang loop unroll(disable) for (int i = 0; i < 16; ++i) b[i] = 1; } this loop would be unrolled. Why? Because we'd vectorize it, thus dropping the hints that unrolling should be disabled, and then we'd unroll it. llvm-svn: 267970	2016-04-29 01:27:40 +00:00
Adam Nemet	0ba164bbcb	[LoopDist] Emit optimization remarks (-Rpass) I closely followed the precedents set by the vectorizer: * With -Rpass-missed, the loop is reported with further details pointing to -Rpass-analysis. * -Rpass-analysis reports the details why distribution has failed. * Regardless of -Rpass*, when distribution fails for a loop where distribution was forced with the pragma, a warning is produced according to -Wpass-failed. In this case the analysis info is also printed even without -Rpass-analysis. llvm-svn: 267952	2016-04-28 23:08:32 +00:00
Hal Finkel	50316d95a9	[Inliner] Preserve llvm.mem.parallel_loop_access metadata When inlining a call site with llvm.mem.parallel_loop_access metadata, this metadata needs to be propagated to all cloned memory-accessing instructions. Otherwise, inlining parts of the loop body will invalidate the annotation. With this functionality, we now vectorize the following as expected: void Body(int *res, int *c, int *d, int *p, int i) { res[i] = (p[i] == 0) ? res[i] : res[i] + d[i]; } void Test(int *res, int *c, int *d, int *p, int n) { int i; #pragma clang loop vectorize(assume_safety) for (i = 0; i < 1600; i++) { Body(res, c, d, p, i); } } llvm-svn: 267949	2016-04-28 23:00:04 +00:00
Arch D. Robison	0e61034018	[SLPVectorizer] Extend SLP Vectorizer to deal with aggregates. The refactoring portion was done as r267748. http://reviews.llvm.org/D14185 llvm-svn: 267899	2016-04-28 16:11:45 +00:00
Simon Pilgrim	bd4a3be7d2	[InstCombine][SSE] Add MOVMSK support to SimplifyDemandedUseBits The MOVMSK instructions copy a vector's element sign bits to the low bits of a scalar register and zero the high bits. This patch adds MOVMSK support to SimplifyDemandedUseBits so that it's aware that the upper bits are known to be zero. It also removes the call to MOVMSK if none of the lower bits are actually required and just returns zero. Differential Revision: http://reviews.llvm.org/D19614 llvm-svn: 267873	2016-04-28 12:22:53 +00:00
Sanjay Patel	21bd38a07b	Update test to use FileCheck Also, add some metadata to show what that currently looks like. llvm-svn: 267827	2016-04-28 00:29:27 +00:00
Rong Xu	a4c3f67fe8	more buildbot failure fix to r267792 __llvm_prf_nm length is embedded in llvm_used. Relax llvm_used check. llvm-svn: 267816	2016-04-27 23:23:53 +00:00
Rong Xu	6e34c490ff	[PGO] Promote indirect calls to conditional direct calls with value-profile This patch implements the transformation that promotes indirect calls to conditional direct calls when the indirect-call value profile meta-data is available. Differential Revision: http://reviews.llvm.org/D17864 llvm-svn: 267815	2016-04-27 23:20:27 +00:00
Rong Xu	4b1dc5d60b	Fix buildbot failure due to r267792 Relax the test check as some targets do not have name compression. llvm-svn: 267803	2016-04-27 22:06:35 +00:00
Rong Xu	af5aebaa32	[PGO] Prohibit address recording if the function is both internal and COMDAT Differential Revision: http://reviews.llvm.org/D19515 llvm-svn: 267792	2016-04-27 21:17:30 +00:00
Simon Pilgrim	3f595aabe2	[InstCombine][AVX2] Add AVX2 per-element vector shift tests At the moment we don't simplify PSRAV/PSRLV/PSLLV intrinsics to generic IR for constant shift amounts, but we could. llvm-svn: 267777	2016-04-27 20:25:34 +00:00
David Majnemer	0c80e2eac6	[CodeGenPrepare] Don't sink a cast past its user The sink cast machinery is supposed to sink casts as close to their user as possible. However, an EH pad is the first instruction in its basic block. Don't sink if the user is an EH pad. This fixes PR27536. llvm-svn: 267767	2016-04-27 19:36:38 +00:00
Ahmed Bougacha	ace97c1f7d	[LIR] Set attributes on memset_pattern16. "inferattrs" will deduce the attribute, but it will be too late for many optimizations. Set it ourselves when creating the call. Differential Revision: http://reviews.llvm.org/D17598 llvm-svn: 267762	2016-04-27 19:04:50 +00:00
Ahmed Bougacha	44c19876c7	[InferAttrs] Mark memset_pattern16 params nocapture. Differential Revision: http://reviews.llvm.org/D19471 llvm-svn: 267760	2016-04-27 19:04:43 +00:00
Matthew Simpson	622b95be7b	[LV] Reallow positive-stride interleaved load groups with gaps We previously disallowed interleaved load groups that may cause us to speculatively access memory out-of-bounds (r261331). We did this by ensuring each load group had an access corresponding to the first and last member. Instead of bailing out for these interleaved groups, this patch enables us to peel off the last vector iteration, ensuring that we execute at least one iteration of the scalar remainder loop. This solution was proposed in the review of the previous patch. Differential Revision: http://reviews.llvm.org/D19487 llvm-svn: 267751	2016-04-27 18:21:36 +00:00
Gerolf Hoflehner	88017c08a6	[InstCombine] Sharpened test case in pr21210.ll llvm-svn: 267742	2016-04-27 17:19:54 +00:00
Matthew Simpson	e5dfb08fcb	[TTI] Add hook for vector extract with extension This change adds a new hook for estimating the cost of vector extracts followed by zero- and sign-extensions. The motivating example for this change is the SMOV and UMOV instructions on AArch64. These instructions move data from vector to general purpose registers while performing the corresponding extension (sign-extend for SMOV and zero-extend for UMOV) at the same time. For these operations, TargetTransformInfo can assume the extensions are free and only report the cost of the vector extract. The SLP vectorizer has been updated to make use of the new hook. Differential Revision: http://reviews.llvm.org/D18523 llvm-svn: 267725	2016-04-27 15:20:21 +00:00
Simon Pilgrim	f23aa2a9c9	[InstCombine][SSE] Regenerated vector shift tests llvm-svn: 267699	2016-04-27 12:04:44 +00:00
Simon Pilgrim	d2ea708739	[InstCombine][SSE] Added DemandedBits tests for MOVMSK instructions MOVMSK zeros the upper bits of the gpr - we should be able to use this. llvm-svn: 267686	2016-04-27 09:53:09 +00:00
Adam Nemet	d2fa414718	[LoopDist] Add llvm.loop.distribute.enable loop metadata Summary: D19403 adds a new pragma for loop distribution. This change adds support for the corresponding metadata that the pragma is translated to by the FE. As part of this I had to rethink the flag -enable-loop-distribute. My goal was to be backward compatible with the existing behavior: A1. pass is off by default from the optimization pipeline unless -enable-loop-distribute is specified A2. pass is on when invoked directly from opt (e.g. for unit-testing) The new pragma/metadata overrides these defaults so the new behavior is: B1. A1 + enable distribution for individual loop with the pragma/metadata B2. A2 + disable distribution for individual loop with the pragma/metadata The default value whether the pass is on or off comes from the initiator of the pass. From the PassManagerBuilder the default is off, from opt it's on. I moved -enable-loop-distribute under the pass. If the flag is specified it overrides the default from above. Then the pragma/metadata can further modify this per loop. As a side-effect, we can now also use -enable-loop-distribute=0 from opt to emulate the default from the optimization pipeline. So to be precise this is the new behavior: C1. pass is off by default from the optimization pipeline unless -enable-loop-distribute or the pragma/metadata enables it C2. pass is on when invoked directly from opt unless -enable-loop-distribute=0 or the pragma/metadata disables it Reviewers: hfinkel Subscribers: joker.eph, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D19431 llvm-svn: 267672	2016-04-27 05:28:18 +00:00
Evgeny Stupachenko	23ce61b663	The patch fixes PR27392. Summary: It is incorrect to compare TripCount (which is BECount + 1) with extraiters (or Count) to check if we should enter unrolled loop or not, because TripCount can potentially overflow (when BECount is max unsigned integer). While comparing BECount with (Count - 1) is overflow safe and therefore correct. Reviewer: hfinkel Differential Revision: http://reviews.llvm.org/D19256 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 267662	2016-04-27 03:04:54 +00:00
Philip Reames	3f83dbeed9	[LVI] Reduce compile time by lazily scanning blocks if needed When encountering a non-local pointer, LVI would eagerly scan the block for dereferences of the given object to prove the pointer to be non null. That's all well and good, but then we'd go recurse through our input blocks. As a result, we could end up scanning each and every block we traverse, even if the final definition was obviously non null or we found a constant value somewhere up the chain. The previous code papered over this by using the isKnownNonNull routine from value tracking. This made the duplication less painful in the common case. Instead, we now do the block scan only after we've gotten the recursive results back. This lets us stop scanning individual blocks as soon as we've determined it to be non-null in any predecessor block and use our usual merge rules to propagate that information cheaply through successor blocks. For a pointer which can be found non-null, this does strictly less work and sometimes substantially so. Note that the case where we can't prove something non-null is still the really expensive case. We end up scanning each and every block looking for a dereference and never end up finding one. llvm-svn: 267642	2016-04-27 00:30:55 +00:00
Justin Bogner	c2bf63d29d	PM: Port Reassociate to the new pass manager llvm-svn: 267631	2016-04-26 23:39:29 +00:00
Sanjay Patel	29dea0d230	[SimplifyCFG] propagate branch metadata when creating select llvm-svn: 267624	2016-04-26 23:15:48 +00:00
Philip Reames	053c2a6f25	[LVI] Apply transfer rule for overdefine inputs for binary operators As pointed out by John Regehr over in http://reviews.llvm.org/D19485, LVI was being incredibly stupid about applying its transfer rules. Rather than gathering local facts from the expression itself, it was simply giving up entirely if one of the inputs was overdefined. This greatly impacts the precision of the overall analysis and makes it far more fragile as well. This patch builds on 267609 which did the same thing for unary casts. llvm-svn: 267620	2016-04-26 23:10:35 +00:00
Philip Reames	e5030e85ea	[LVI] A better fix for the assertion error introduced by 267609 Essentially, I was using the wrong size function. For types which were sized, but not primitive, I wasn't getting a useful size for the operand and failed an assert. I fixed this, and also added a guard that the input is a sized type. Test case is for the original mistake. I'm not sure how to actually exercise the sized type check. llvm-svn: 267618	2016-04-26 22:52:30 +00:00
Sanjay Patel	d2d2aa52cd	[LowerExpectIntrinsic] make default likely/unlikely ratio bigger We need the default ratio to be sufficiently large that it triggers transforms based on block frequency info (BFI) and plays well with the recently introduced BranchProbability used by CGP. Differential Revision: http://reviews.llvm.org/D19435 llvm-svn: 267615	2016-04-26 22:23:38 +00:00
Philip Reames	38c87c2e50	[LVI] Infer local facts from unary expressions As pointed out by John Regehr over in http://reviews.llvm.org/D19485, LVI was being incredibly stupid about applying its transfer rules. Rather than gathering local facts from the expression itself, it was simply giving up entirely if one of the inputs was overdefined. This greatly impacts the precision of the overall analysis and makes it far more fragile as well. This patch implements only the unary operation case. Once this is in, I'll implement the same for the binary operations. Differential Revision: http://reviews.llvm.org/D19492 llvm-svn: 267609	2016-04-26 21:48:16 +00:00
David Majnemer	abb9f55c80	Revert "[SimplifyLibCalls] sprintf doesn't copy null bytes" The destination buffer that sprintf uses is restrict qualified, we do not need to worry about derived pointers referenced via format specifiers. This reverts commit r267580. llvm-svn: 267605	2016-04-26 21:04:47 +00:00
Elena Demikhovsky	308a7eb0d2	Masked Store in Loop Vectorizer - bugfix Fixed a bug in loop vectorization with conditional store. Differential Revision: http://reviews.llvm.org/D19532 llvm-svn: 267597	2016-04-26 20:18:04 +00:00
Justin Bogner	4563a06cee	PM: Port Internalize to the new pass manager llvm-svn: 267596	2016-04-26 20:15:52 +00:00
David Majnemer	8cd77baebc	[SimplifyLibCalls] sprintf doesn't copy null bytes sprintf doesn't read or copy the terminating null byte from its string operands. sprintf will append its own after processing all of the format specifiers. This fixes PR27526. llvm-svn: 267580	2016-04-26 18:16:49 +00:00
Dehao Chen	5d6d4841ed	Tune basic block annotation algorithm. Summary: Instead of using maximum IR weight as the basic block weight, this patch uses the voting algorithm to find the most likely weight for the basic block. This can effectively avoid the cases when some IRs are annotated incorrectly due to code motion of the profiled binary. This patch also updates propagate.ll unittest to include discriminator in the input file so that it is testing something meaningful. Reviewers: davidxl, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19301 llvm-svn: 267519	2016-04-26 04:59:11 +00:00
Hal Finkel	e4c0c1679b	[SimplifyCFG] Preserve !llvm.mem.parallel_loop_access when merging When SimplifyCFG merges identical instructions from both sides of a diamond, it can preserve !llvm.mem.parallel_loop_access (as it does with most of the other metadata). There's no real data or control dependency change in this case. llvm-svn: 267515	2016-04-26 02:06:06 +00:00
Hal Finkel	411d31ad72	[LoopVectorize] Don't consider conditional-load dereferenceability for marked parallel loops I really thought we were doing this already, but we were not. Given this input: void Test(int *res, int *c, int *d, int *p) { for (int i = 0; i < 16; i++) res[i] = (p[i] == 0) ? res[i] : res[i] + d[i]; } we did not vectorize the loop. Even with "assume_safety" the check that we don't if-convert conditionally-executed loads (to protect against data-dependent dereferenceability) was not elided. One subtlety: As implemented, it will still prefer to use a masked-load intrinsic (given target support) over the speculated load. The choice here seems architecture specific; the best option depends on how expensive the masked load is compared to a regular load. Ideally, using the masked load still reduces unnecessary memory traffic, and so should be preferred. If we'd rather do it the other way, flipping the order of the checks is easy. The LangRef is updated to make explicit that llvm.mem.parallel_loop_access also implies that if conversion is okay. Differential Revision: http://reviews.llvm.org/D19512 llvm-svn: 267514	2016-04-26 02:00:36 +00:00
Sanjay Patel	a31b0c0ece	[CodeGenPrepare] don't convert an unpredictable select into control flow Suggested in the review of D19488: http://reviews.llvm.org/D19488 llvm-svn: 267504	2016-04-26 00:47:39 +00:00
Justin Bogner	1a07501379	PM: Port GlobalOpt to the new pass manager llvm-svn: 267499	2016-04-26 00:28:01 +00:00
Justin Bogner	6f6c5f2a02	GlobalOpt: Convert a bunch of tests from grep to FileCheck llvm-svn: 267493	2016-04-25 23:36:50 +00:00
Sanjay Patel	82059090d3	Add check for "branch_weights" with prof metadata While we're here, fix the comment and variable names to make it clear that these are raw weights, not percentages. llvm-svn: 267491	2016-04-25 23:15:16 +00:00
Arch D. Robison	be0490a6e8	Optimize store of "bitcast" from vector to aggregate. This patch is what was the "instcombine" portion of D14185, with an additional test added (see julia_pseudovec in test/Transforms/InstCombine/insert-val-extract-elem.ll). The patch causes instcombine to replace sequences of extractelement-insertvalue-store that act essentially like a bitcast followed by a store. Differential review: http://reviews.llvm.org/D14260 llvm-svn: 267482	2016-04-25 22:22:39 +00:00
Chad Rosier	bbabc85031	Fix typo from r267432. llvm-svn: 267436	2016-04-25 18:20:27 +00:00
Chad Rosier	4c4e3336b8	[ValueTracking] Add an additional test case for r266767 where one operand is a const. llvm-svn: 267432	2016-04-25 17:41:48 +00:00
Chad Rosier	e2cbd13e56	[ValueTracking] Improve isImpliedCondition when the dominating cond is false. llvm-svn: 267430	2016-04-25 17:23:36 +00:00
James Molloy	eb040cc55f	[GlobalOpt] Allow constant globals to be SRA'd The current logic assumes that any constant global will never be SRA'd. I presume this is because normally constant globals can be pushed into their uses and deleted. However, that sometimes can't happen (which is where you really want SRA, so the elements that can be eliminated, are!). There seems to be no reason why we can't SRA constants too, so let's do it. llvm-svn: 267393	2016-04-25 10:48:29 +00:00
Simon Pilgrim	4b5462f119	[InstCombine][SSE] Reduce DIVSS/DIVSD to FDIV if only first element is required As discussed on D19318, if we only demand the first element of a DIVSS/DIVSD intrinsic, then reduce to a FDIV call. This matches the existing FADD/FSUB/FMUL patterns. llvm-svn: 267359	2016-04-24 18:35:59 +00:00
Simon Pilgrim	83020942d3	[InstCombine][SSE] Demanded vector elements for scalar intrinsics (Part 2 of 2) Split from D17490. This patch improves support for determining the demanded vector elements through SSE scalar intrinsics: 1 - demanded vector element support for unary and some extra binary scalar intrinsics (RCP/RSQRT/SQRT/FRCZ and ADD/CMP/DIV/ROUND). 2 - addss/addsd get simplified to a fadd call if we aren't interested in the pass through elements 3 - if we don't need the lowest element of a scalar operation then just use the first argument (the pass through elements) directly We can add support for propagating demanded elements through any equivalent packed SSE intrinsics in a future patch (these wouldn't use the pass through patterns). Differential Revision: http://reviews.llvm.org/D19318 llvm-svn: 267357	2016-04-24 18:23:14 +00:00
Simon Pilgrim	424da1637a	[InstCombine][SSE] Demanded vector elements for scalar intrinsics (Part 1 of 2) This patch improves support for determining the demanded vector elements through SSE scalar intrinsics: 1 - recognise that we only need the lowest element of the second input for binary scalar operations (and all the elements of the first input) 2 - recognise that the roundss/roundsd intrinsics use the lowest element of the second input and the remaining elements from the first input Differential Revision: http://reviews.llvm.org/D17490 llvm-svn: 267356	2016-04-24 18:12:42 +00:00
Duncan P. N. Exon Smith	a59d3e5af8	DebugInfo: Remove MDString-based type references Eliminate DITypeIdentifierMap and make DITypeRef a thin wrapper around DIType*. It is no longer legal to refer to a DICompositeType by its 'identifier:', and DIBuilder no longer retains all types with an 'identifier:' automatically. Aside from the bitcode upgrade, this is mainly removing logic to resolve an MDString-based reference to an actual DIType. The commits leading up to this have made the implicit type map in DICompileUnit's 'retainedTypes:' field superfluous. This does not remove DITypeRef, DIScopeRef, DINodeRef, and DITypeRefArray, or stop using them in DI-related metadata. Although as of this commit they aren't serving a useful purpose, there are patches under review to reuse them for CodeView support. The tests in LLVM were updated with deref-typerefs.sh, which is attached to the thread "[RFC] Lazy-loading of debug info metadata": http://lists.llvm.org/pipermail/llvm-dev/2016-April/098318.html llvm-svn: 267296	2016-04-23 21:08:00 +00:00
Nico Weber	0aa9845d15	Revert r267210, it makes clang assert (PR27490). llvm-svn: 267232	2016-04-22 22:08:42 +00:00
Peter Collingbourne	7dd8dbf486	Introduce llvm.load.relative intrinsic. This intrinsic takes two arguments, ``%ptr`` and ``%offset``. It loads a 32-bit value from the address ``%ptr + %offset``, adds ``%ptr`` to that value and returns it. The constant folder specifically recognizes the form of this intrinsic and the constant initializers it may load from; if a loaded constant initializer is known to have the form ``i32 trunc(x - %ptr)``, the intrinsic call is folded to ``x``. LLVM provides that the calculation of such a constant initializer will not overflow at link time under the medium code model if ``x`` is an ``unnamed_addr`` function. However, it does not provide this guarantee for a constant initializer folded into a function body. This intrinsic can be used to avoid the possibility of overflows when loading from such a constant. Differential Revision: http://reviews.llvm.org/D18367 llvm-svn: 267223	2016-04-22 21:18:02 +00:00
Philip Reames	5f0e36947b	[unordered] sink unordered stores at end of blocks The existing code turned out to be completely correct when audited. Thus, only minor code changes and adding a couple of tests. llvm-svn: 267215	2016-04-22 20:53:32 +00:00
Sanjoy Das	f97229d6ba	Fold compares for distinct allocations Summary: We can fold compares to false when two distinct allocations within a function are compared for equality. Patch by Anna Thomas! Reviewers: majnemer, reames, sanjoy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19390 llvm-svn: 267214	2016-04-22 20:52:25 +00:00
Philip Reames	eedef73b63	[unordered] Extend load/store type canonicalization to handle unordered operations Extend the type canonicalization logic to work for unordered atomic loads and stores. Note that while this change itself is fairly simple and low risk, there's a reasonable chance this will expose problems in the backends by suddenly generating IR they wouldn't have seen before. Anything of this nature will be an existing bug in the backend (you could write an atomic float load), but this will definitely change the frequency with which such cases are encountered. If you see problems, feel free to revert this change, but please make sure you collect a test case. llvm-svn: 267210	2016-04-22 20:33:48 +00:00
Justin Bogner	b93949089e	PM: Port SinkingPass to the new pass manager llvm-svn: 267199	2016-04-22 19:54:10 +00:00
Jun Bum Lim	d29a24e4fd	[DeadStoreElimination] Shorten beginning of memset overwritten by later stores Summary: This change will shorten memset if the beginning of memset is overwritten by later stores. Reviewers: hfinkel, eeckstein, dberlin, mcrosier Subscribers: mgrang, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18906 llvm-svn: 267197	2016-04-22 19:51:29 +00:00
Justin Bogner	395c2127ed	PM: Port DCE to the new pass manager Also add a very basic test, since apparently there aren't any tests for DCE whatsoever to add the new pass version to. llvm-svn: 267196	2016-04-22 19:40:41 +00:00
Adam Nemet	54053a518e	[LoopVersioningLICM] Add test coverage for llvm.loop.licm_versioning.disable In the next change, I am generalizing the function findStringMetadataForLoop and I want to make sure I don't break this. Looks like there was no coverage for this so far. llvm-svn: 267182	2016-04-22 18:34:50 +00:00
Chad Rosier	1a60159064	[SimplifyCFG] Add final missing implications to isImpliedTrueByMatchingCmp. Summary: eq imply [u|s]ge and [u|s]le are true. Remove redundant logic by implementing isImpliedFalseByMatchingCmp(Pred1, Pred2) as isImpliedTrueByMatchingCmp(Pred1, getInversePredicate(Pred2)). llvm-svn: 267177	2016-04-22 17:57:34 +00:00
Chad Rosier	3456cb5672	[SimplifyCFG] Add missing implications to isImpliedTrueByMatchingCmp. Summary: [u|s]gt and [u|s]lt imply [u|s]ge and [u|s]le are true, respectively. I've simplified the existing tests and added additional tests to cover the new cases mentioned above. I've also added tests for all the cases where the first compare doesn't imply anything about the second compare. llvm-svn: 267171	2016-04-22 17:14:12 +00:00
Chad Rosier	1960d13e29	[SimplifyCFG] Simplify code review by temporarily removing this test file. A followup commit will replace these tests with simplified and more inclusive tests. The diff is unreadable if this were to be done in a single commit. llvm-svn: 267170	2016-04-22 17:14:08 +00:00
David Majnemer	bfd695d591	[EarlyCSE] Don't add the overflow flags to the hash We take the intersection of overflow flags while CSE'ing. This permits us to consider two instructions with different overflow behavior to be replaceable. llvm-svn: 267153	2016-04-22 14:12:50 +00:00
Silviu Baranga	e985c76b90	[InstCombine] Preserve fast math flags when combining PHIs Summary: When optimizing PHIs which have inputs floating point binary operators, we preserve all IR flags except the fast math flags. This change removes the logic which tracked some of the IR flags (no wrap, exact) and replaces it by doing an and on the IR flags of all inputs to the PHI - which will also handle the fast math flags. Reviewers: majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19370 llvm-svn: 267139	2016-04-22 11:21:36 +00:00
David Majnemer	d0ce8f1485	[GVN] Respect fast-math-flags on fcmps We assumed that flags were only present on binary operators. This is not true, they may also be present on calls and fcmps. llvm-svn: 267113	2016-04-22 06:37:51 +00:00
David Majnemer	9554c1339c	[EarlyCSE] Take the intersection of flags on instructions EarlyCSE had inconsistent behavior with regards to flag'd instructions: - In some cases, it would pessimize if the available instruction had different flags by not performing CSE. - In other cases, it would miscompile if it replaced an instruction which had no flags with an instruction which has flags. Fix this by being more consistent with our flag handling by utilizing andIRFlags. llvm-svn: 267111	2016-04-22 06:37:45 +00:00
Sanjoy Das	a085cfc150	Folding compares with unescaped allocations Summary: If we know that the pointer allocated within a function does not escape, we can fold away comparisons that are done with global pointers Patch by Anna Thomas! Reviewers: reames, majnemer, sanjoy Subscribers: mgrang, mcrosier, majnemer, llvm-commits Differential Revision: http://reviews.llvm.org/D19276 llvm-svn: 267035	2016-04-21 19:26:45 +00:00
Philip Reames	a98c7ead30	[instcombine][unordered] Extend load(select) transform to handle unordered loads llvm-svn: 267023	2016-04-21 17:59:40 +00:00
Philip Reames	3ac0718423	[unordered] unordered loads from null are still unreachable llvm-svn: 267019	2016-04-21 17:45:05 +00:00
Philip Reames	ac55090e96	[instcombine][unordered] Implement *-load forwarding for unordered atomics This builds on 266999 which made FindAvailableValue do the right thing. Tests included show the newly enabled transforms and those which disabled either due to conservatism or correctness requirements. llvm-svn: 267006	2016-04-21 17:03:33 +00:00
Philip Reames	92c43699bc	[unordered] Add tests and conservative handling in support of future changes [NFCI] This change adds a couple of test cases to make sure FindAvailableLoadedValue does the right thing. At the moment, the code added is dead, but separating it makes follow on changes far more obvious. llvm-svn: 266999	2016-04-21 16:51:08 +00:00
Sanjoy Das	54a3a006ca	[SimplifyCFG] Fold `llvm.guard(false)` to unreachable Summary: `llvm.guard(false)` always bails out of the current compilation unit, so we can prune any control flow following it. Reviewers: hfinkel, pcc, reames Subscribers: majnemer, reames, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D19245 llvm-svn: 266955	2016-04-21 05:09:12 +00:00
Mehdi Amini	bda3c97c16	ThinLTO/ModuleLinker: add a flag to not always pull-in linkonce when performing importing Summary: The function importer already decided what symbols need to be pulled in. Also these magically added ones will not be in the export list for the source module, which can confuse the internalizer for instance. Reviewers: tejohnson, rafael Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D19096 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266948	2016-04-21 01:59:39 +00:00
Nick Lewycky	762f8a8549	Add optimization for 'icmp slt (or A, B), A' and some related idioms based on knowledge of the sign bit for A and B. No matter what value you OR in to A, the result of (or A, B) is going to be UGE A. When A and B are positive, it's SGE too. If A is negative, OR'ing a value into it can't make it positive, but can increase its value closer to -1, therefore (or A, B) is SGE A. Working through all possible combinations produces this truth table (columns: A is +, -, +/-; rows: B is +, -, +/-): B is +: F F F; B is -: T F ?; B is +/-: ? F ?. The related optimizations are flipping the 'slt' for 'sge' which always NOTs the result (if the result is known), and swapping the LHS and RHS while swapping the comparison predicate. There are more idioms left to implement (aren't there always!) but I've stopped here because any more would risk becoming unreasonable for reviewers. llvm-svn: 266939	2016-04-21 00:53:14 +00:00
Vedant Kumar	932866bfe7	[test/PGOProfile] Make tests independent of the raw profile version (NFC) Differential Revision: http://reviews.llvm.org/D19290 llvm-svn: 266928	2016-04-20 22:24:01 +00:00
Teresa Johnson	b35cc691ea	[ThinLTO] Prevent importing of "llvm.used" values Summary: This patch prevents importing from (and therefore exporting from) any module with a "llvm.used" local value. Local values need to be promoted and renamed when importing, and their presence on the llvm.used variable indicates that there are opaque uses that won't see the rename. One such example is a use in inline assembly. See also the discussion at: http://lists.llvm.org/pipermail/llvm-dev/2016-April/098047.html As part of this, move collectUsedGlobalVariables out of Transforms/Utils and into IR/Module so that it can be used more widely. There are several other places in LLVM that used copies of this code that can be cleaned up as a follow on NFC patch. Reviewers: joker.eph Subscribers: pcc, llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D18986 llvm-svn: 266877	2016-04-20 14:39:45 +00:00
Mandeep Singh Grang	029a0567fa	[LLVM] Remove unwanted --check-prefix=CHECK from unit tests. NFC. Summary: Removed unwanted --check-prefix=CHECK from numerous unit tests. Reviewers: t.p.northover, dblaikie, uweigand, MatzeB, tstellarAMD, mcrosier Subscribers: mcrosier, dsanders Differential Revision: http://reviews.llvm.org/D19279 llvm-svn: 266834	2016-04-19 23:51:52 +00:00
Marcin Koscielnicki	3fdc257d6a	[AArch64] [ARM] Make a target-independent llvm.thread.pointer intrinsic. Both AArch64 and ARM support llvm.<arch>.thread.pointer intrinsics that just return the thread pointer. I have a pending patch that does the same for SystemZ (D19054), and there are many more targets that could benefit from one. This patch merges the ARM and AArch64 intrinsics into a single target independent one that will also be used by subsequent targets. Differential Revision: http://reviews.llvm.org/D19098 llvm-svn: 266818	2016-04-19 20:51:05 +00:00
Chad Rosier	b7dfbb40a3	[ValueTracking] Improve isImpliedCondition for conditions with matching operands. This patch improves SimplifyCFG to catch cases like: if (a < b) { if (a > b) <- known to be false unreachable; } Phabricator Revision: http://reviews.llvm.org/D18905 llvm-svn: 266767	2016-04-19 17:19:14 +00:00
Simon Pilgrim	998cffa6b9	[InstCombine][X86] Added extra tests introduced for D17490 llvm-svn: 266732	2016-04-19 12:59:52 +00:00
Simon Pilgrim	74b3bfdf71	[InstCombine][X86] Regenerate SSE combine tests as part of setup for D17490 Regenerated with utils/update_test_checks.py llvm-svn: 266731	2016-04-19 12:56:46 +00:00
Tim Northover	b629c77692	ARM: use a pseudo-instruction for cmpxchg at -O0. The fast register-allocator cannot cope with inter-block dependencies without spilling. This is fine for ldrex/strex loops coming from atomicrmw instructions where any value produced within a block is dead by the end, but not for cmpxchg. So we lower a cmpxchg at -O0 via a pseudo-inst that gets expanded after regalloc. Fortunately this is at -O0 so we don't have to care about performance. This simplifies the various axes of expansion considerably: we assume a strong seq_cst operation and ensure ordering via the always-present DMB instructions rather than v8 acquire/release instructions. Should fix the 32-bit part of PR25526. llvm-svn: 266679	2016-04-18 21:48:55 +00:00
Chad Rosier	e30fed70e6	[ValueTracking] Correct lit test comments. NFC. llvm-svn: 266657	2016-04-18 19:11:45 +00:00
Eric Liu	d09f15ea6f	Revert "Replace the use of MaxFunctionCount module flag" This reverts commit r266477. This commit introduces cyclic dependency. This commit has "Analysis" depend on "ProfileData", while "ProfileData" depends on "Object", which depends on "BitCode", which depends on "Analysis". llvm-svn: 266619	2016-04-18 15:31:11 +00:00
Renato Golin	4b18a510a2	[ARM] AArch32 v8 NEON is still not IEEE-754 compliant llvm-svn: 266603	2016-04-18 12:06:47 +00:00
Sanjoy Das	99042473d0	Fix a typo in rL265762 I accidentally replaced `mayBeOverridden` with `!isInterposable`. Remove the negation and add a test case that would've caught this. Many thanks to Håkan Hjort for spotting this! llvm-svn: 266551	2016-04-17 04:30:43 +00:00
Mehdi Amini	2d28f7aa07	ThinLTO: Make aliases explicit in the summary To be able to work accurately on the reference graph when making decisions about internalizing, promoting, renaming, etc., we need to have the alias information explicit. Differential Revision: http://reviews.llvm.org/D18836 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266517	2016-04-16 06:56:44 +00:00
Evgeniy Stepanov	40cd1514cf	[cfi] Support explicit sections for functions in cfi-icall. Allow explicit section for indirectly called functions in cfi-icall. Jumptables for functions in the same type class must be contiguous, so they always go to the default text section. Fixes PR25079. llvm-svn: 266486	2016-04-15 22:55:38 +00:00
Adrian Prantl	dc75a6b517	Convert this sample-based-profiling testcase to use a NoDebug CU. llvm-svn: 266481	2016-04-15 22:05:38 +00:00
Easwaran Raman	f53baca686	Replace the use of MaxFunctionCount module flag Adds an interface to get ProfileSummary for a module and makes InlineCost use ProfileSummary to get max function count. Differential Revision: http://reviews.llvm.org/D18622 llvm-svn: 266477	2016-04-15 21:39:58 +00:00
Tim Northover	903f81ba18	ARM: don't try to hoist constant RHS out of a division. Divisions by a constant can be converted into multiplies which are usually cheaper, but this isn't possible if the constant gets separated (particularly in loops). Fix this by telling ConstantHoisting that the immediate in a DIV is cheap. I considered making the check generic, but neither AArch64 (strangely) nor x86 showed any benefit on the tests I had. llvm-svn: 266464	2016-04-15 18:17:18 +00:00
David Majnemer	2e02ba78d5	[InstCombine] Don't transform compares of calls to functions named fabs{f,l,} InstCombine wants to optimize compares of calls to fabs with zero. However, we didn't have the necessary legality checking to verify that the function call had the same behavior as fabs. llvm-svn: 266452	2016-04-15 17:21:03 +00:00
Adrian Prantl	75819aedf6	[PR27284] Reverse the ownership between DICompileUnit and DISubprogram. Currently each Function points to a DISubprogram and DISubprogram has a scope field. For member functions the scope is a DICompositeType. DIScopes point to the DICompileUnit to facilitate type uniquing. Distinct DISubprograms (with isDefinition: true) are not part of the type hierarchy and cannot be uniqued. This change removes the subprograms list from DICompileUnit and instead adds a pointer to the owning compile unit to distinct DISubprograms. This would make it easy for ThinLTO to strip unneeded DISubprograms and their transitively referenced debug info. Motivation ---------- Materializing DISubprograms is currently the most expensive operation when doing a ThinLTO build of clang. We want the DISubprogram to be stored in a separate Bitcode block (or the same block as the function body) so we can avoid having to expensively deserialize all DISubprograms together with the global metadata. If a function has been inlined into another subprogram we need to store a reference to the block containing the inlined subprogram. Attached to https://llvm.org/bugs/show_bug.cgi?id=27284 is a python script that updates LLVM IR testcases to the new format. http://reviews.llvm.org/D19034 <rdar://problem/25256815> llvm-svn: 266446	2016-04-15 15:57:41 +00:00
Sanjay Patel	f11ab05bdb	[SimplifyCFG] propagate branch metadata when creating select (PR27344) This is almost identical to: http://reviews.llvm.org/rL264527 This doesn't solve PR27344; it just allows the profile weights to survive. To solve the bug, we need to use the profile weights in the backend. llvm-svn: 266442	2016-04-15 15:32:12 +00:00
Sanjay Patel	81433e99b9	[SimplifyCFG] add metadata to show failure to propagate (PR27344) llvm-svn: 266435	2016-04-15 14:53:35 +00:00
Justin Lebar	f04e678e36	Move divergent-target test into CodeGen/NVPTX because it requires an NVPTX target. llvm-svn: 266403	2016-04-15 01:20:52 +00:00
Justin Lebar	cad81cf6b3	[Speculation] Add a SpeculativeExecution mode where the pass does nothing unless TTI::hasBranchDivergence() is true. Summary: This lets us add this pass to the IR pass manager unconditionally; it will simply not do anything on targets without branch divergence. Reviewers: tra Subscribers: llvm-commits, jingyue, rnk, chandlerc Differential Revision: http://reviews.llvm.org/D18625 llvm-svn: 266398	2016-04-15 00:32:09 +00:00
Vedant Kumar	4960fbf391	[test] Require 'asserts' for a test which uses -debug-only Without this line, bots which run check-all on Release compilers will break. llvm-svn: 266386	2016-04-14 23:32:40 +00:00
Michael Kuperstein	16f13e252b	[AliasSetTracker] Correctly handle changing the size of an entry If the size of an AST entry changes, we also need to make sure we perform necessary alias set merges, as the new size may overlap pointers in other sets. We happen to run into this with memset, because memset allows an entry for a i8* pointer to have a decidedly non-i8 size. This fixes PR27262. Differential Revision: http://reviews.llvm.org/D18939 llvm-svn: 266381	2016-04-14 22:00:11 +00:00
Renato Golin	5cb666add7	[ARM] Adding IEEE-754 SIMD detection to loop vectorizer Some SIMD implementations are not IEEE-754 compliant, for example ARM's NEON. This patch teaches the loop vectorizer to only allow transformations of loops that either contain no floating-point operations or have enough allowance flags supporting lack of precision (ex. -ffast-math, Darwin). For that, the target description now has a method which tells us if the vectorizer is allowed to handle FP math without falling into unsafe representations, plus a check on every FP instruction in the candidate loop to check for the safety flags. This commit makes LLVM behave like GCC with respect to ARM NEON support, but it stops short of fixing the underlying problem: sub-normals. Neither GCC nor LLVM have a flag for allowing sub-normal operations. Before this patch, GCC only allows it using unsafe-math flags and LLVM allows it by default with no way to turn it off (short of not using NEON at all). As a first step, we push this change to make it safe and in sync with GCC. The second step is to discuss a new sub-normals flag on both communities and come up with a common solution. The third step is to improve the FastMath flags in LLVM to encode sub-normals and use those flags to restrict NEON FP. Fixes PR16275. llvm-svn: 266363	2016-04-14 20:42:18 +00:00
Sanjay Patel	e998b91d86	[InstCombine] remove constant by inverting compare + logic (PR27105) https://llvm.org/bugs/show_bug.cgi?id=27105 We can check if all bits outside of a constant mask are set with a single constant. As noted in the bug report, although this form should be considered the canonical IR, backends may want to transform this into an 'andn' / 'andc' comparison against zero because that could be a single machine instruction. Differential Revision: http://reviews.llvm.org/D18842 llvm-svn: 266362	2016-04-14 20:17:40 +00:00
Dehao Chen	46f8fbbb1b	Update discriminator assignment algorithm to handle nested call correctly. Summary: Add discriminator for nested call correctly. Reviewers: davidxl, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19127 llvm-svn: 266354	2016-04-14 18:37:18 +00:00
Adam Nemet	7aab648831	Revert "Support arbitrary addrspace pointers in masked load/store intrinsics" This reverts commit r266086. It breaks the LTO build of gcc in SPEC2000. llvm-svn: 266282	2016-04-14 08:47:17 +00:00
Tim Northover	5c02f9ad28	ARM: override cost function to re-enable ConstantHoisting (& fix it). At some point, ARM stopped getting any benefit from ConstantHoisting because the pass called a different variant of getIntImmCost. Reimplementing the correct variant revealed some problems, however: + ConstantHoisting was modifying switch statements. This is simply invalid, the cases must remain integer constants no matter the notional cost. + ConstantHoisting was mangling alloca instructions in the entry block. These should be handled by FrameLowering, so constants actually have a cost of 0. Worse, the resulting bitcasts meant they became dynamic allocas. rdar://25707382 llvm-svn: 266260	2016-04-13 23:08:27 +00:00
Easwaran Raman	cbd3989742	Test case for r265852. llvm-svn: 266237	2016-04-13 19:43:31 +00:00
Betul Buyukkurt	bf8554c279	[PGO] Remove redundant VP instrumentation LLVM optimization passes may reduce a profiled target expression to a constant. Removing runtime calls at such instrumentation points would help speedup the runtime of the instrumented program. llvm-svn: 266229	2016-04-13 18:52:19 +00:00
Mehdi Amini	b5b289339b	Revert "Make aliases explicit in the summary" Inadvertently committed... This reverts commit e618ec93786d99df2ddf280ad2d5e02f5516cecf. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266215	2016-04-13 17:20:07 +00:00
Mehdi Amini	ce744a95fd	Make aliases explicit in the summary Summary: To be able to work accurately on the reference graph when making decisions about internalizing, promoting, renaming, etc., we need to have the alias information explicit. Reviewers: tejohnson Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18836 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266214	2016-04-13 17:18:42 +00:00
David L Kreitzer	752c1448fe	Simplify strlen to a subtraction for certain cases. Patch by Li Huang (li1.huang@intel.com) Differential Revision: http://reviews.llvm.org/D18230 llvm-svn: 266200	2016-04-13 14:31:06 +00:00
Petar Jovanovic	644b8c1a5d	Calculate __builtin_object_size when pointer depends on a condition This patch fixes calculating of builtin_object_size if it depends on a condition. Before this patch compiler did not know how to calculate the object size when it finds a condition that cannot be eliminated. This patch enables calculating of builtin_object_size even in case when condition cannot be eliminated by choosing minimum or maximum value as a result from condition. Choosing minimum or maximum value from condition is based on the second argument of __builtin_object_size function. Patch by Strahinja Petrovic. Differential Revision: http://reviews.llvm.org/D18438 llvm-svn: 266193	2016-04-13 12:25:25 +00:00
David Majnemer	3ee5f34469	[InstCombine] We folded an fcmp to an i1 instead of a vector of i1 Remove an ad-hoc transform in InstCombine and replace it with more general machinery (ValueTracking, InstructionSimplify and VectorUtils). This fixes PR27332. llvm-svn: 266175	2016-04-13 06:55:52 +00:00
Matt Arsenault	b34eea9cb5	AMDGPU: Remove leftover ShaderType attributes in tests llvm-svn: 266155	2016-04-13 00:39:48 +00:00
Sanjay Patel	5e5056d939	[x86, InstCombine] fix masked load pass-through operand to be a zero vector This bug was introduced with: http://reviews.llvm.org/rL262269 AVX masked loads are specified to set vector lanes to zero when the high bit of the mask element for that lane is zero: "If the mask is 0, the corresponding data element is set to zero in the load form of these instructions, and unmodified in the store form." --Intel manual Differential Revision: http://reviews.llvm.org/D19017 llvm-svn: 266148	2016-04-12 23:16:23 +00:00
Mehdi Amini	d5faa267c4	Add a pass to name anonymous/nameless function Summary: For correct handling of alias to nameless function, we need to be able to refer to them through a GUID in the summary. Here we name them using a hash of the non-private global names in the module. Reviewers: tejohnson Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D18883 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266132	2016-04-12 21:35:28 +00:00
Mehdi Amini	68da426eea	Move summary creation out of llvm-as into opt Summary: Let's keep llvm-as "dumb": it converts textual IR to bitcode. This commit removes the dependency from llvm-as to libLLVMAnalysis. We'll add back summary in llvm-as if we get to a textual representation for it at some point. In the meantime, opt seems like a better place for that. Reviewers: tejohnson Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D19032 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266131	2016-04-12 21:35:18 +00:00
James Y Knight	19f6cce4e3	Add __atomic_* lowering to AtomicExpandPass. (Recommit of r266002, with r266011, r266016, and not accidentally including an extra unused/uninitialized element in LibcallRoutineNames) AtomicExpandPass can now lower atomic load, atomic store, atomicrmw, and cmpxchg instructions to __atomic_* library calls, when the target doesn't support atomics of a given size. This is the first step towards moving all atomic lowering from clang into llvm. When all is done, the behavior of __sync_* builtins, __atomic_* builtins, and C11 atomics will be unified. Previously LLVM would pass everything through to the ISelLowering code. There, unsupported atomic instructions would turn into __sync_* library calls. Because of that behavior, Clang currently avoids emitting llvm IR atomic instructions when this would happen, and emits __atomic_* library functions itself, in the frontend. This change makes LLVM able to emit __atomic_* libcalls, and thus will eventually allow clang to depend on LLVM to do the right thing. It is advantageous to do the new lowering to atomic libcalls in AtomicExpandPass, before ISel time, because it's important that all atomic operations for a given size either lower to __atomic_* libcalls (which may use locks), or native instructions which won't. No mixing and matching. At the moment, this code is enabled only for SPARC, as a demonstration. The next commit will expand support to all of the other targets. Differential Revision: http://reviews.llvm.org/D18200 llvm-svn: 266115	2016-04-12 20:18:48 +00:00
Artur Pilipenko	dbe0bc8df4	Support arbitrary addrspace pointers in masked load/store intrinsics This is a resubmission of 263158 change. This patch fixes the problem which occurs when loop-vectorize tries to use @llvm.masked.load/store intrinsic for a non-default addrspace pointer. It fails with "Calling a function with a bad signature!" assertion in CallInst constructor because it tries to pass a non-default addrspace pointer to the pointer argument which has default addrspace. The fix is to add pointer type as another overloaded type to @llvm.masked.load/store intrinsics. Reviewed By: reames Differential Revision: http://reviews.llvm.org/D17270 llvm-svn: 266086	2016-04-12 15:58:04 +00:00
Rafael Espindola	d41b54be11	This reverts commit r266002, r266011 and r266016. They broke the msan bot. Original message: Add __atomic_* lowering to AtomicExpandPass. AtomicExpandPass can now lower atomic load, atomic store, atomicrmw, and cmpxchg instructions to __atomic_* library calls, when the target doesn't support atomics of a given size. This is the first step towards moving all atomic lowering from clang into llvm. When all is done, the behavior of __sync_* builtins, __atomic_* builtins, and C11 atomics will be unified. Previously LLVM would pass everything through to the ISelLowering code. There, unsupported atomic instructions would turn into __sync_* library calls. Because of that behavior, Clang currently avoids emitting llvm IR atomic instructions when this would happen, and emits __atomic_* library functions itself, in the frontend. This change makes LLVM able to emit __atomic_* libcalls, and thus will eventually allow clang to depend on LLVM to do the right thing. It is advantageous to do the new lowering to atomic libcalls in AtomicExpandPass, before ISel time, because it's important that all atomic operations for a given size either lower to __atomic_* libcalls (which may use locks), or native instructions which won't. No mixing and matching. At the moment, this code is enabled only for SPARC, as a demonstration. The next commit will expand support to all of the other targets. Differential Revision: http://reviews.llvm.org/D18200 llvm-svn: 266062	2016-04-12 12:30:25 +00:00
George Burgess IV	278199f615	Add the allocsize attribute to LLVM. `allocsize` is a function attribute that allows users to request that LLVM treat arbitrary functions as allocation functions. This patch makes LLVM accept the `allocsize` attribute, and makes `@llvm.objectsize` recognize said attribute. The review for this was split into two patches for ease of reviewing: D18974 and D14933. As promised on the revisions, I'm landing both patches as a single commit. Differential Revision: http://reviews.llvm.org/D14933 llvm-svn: 266032	2016-04-12 01:05:35 +00:00
JF Bastien	4f43cfd2c2	MergeFunctions: test alloca better r237193 fixed handling of alloca size / align in MergeFunctions, but only tested one and didn't follow FunctionComparator::cmpOperations's usual comparison pattern. It also didn't update Instruction.cpp:haveSameSpecialState which I'll do separately. llvm-svn: 266022	2016-04-12 00:03:26 +00:00
Mehdi Amini	ae280e54a9	ThinLTO renaming: use module hash instead of position in the summary This is more robust to changes in the link ordering. Differential Revision: http://reviews.llvm.org/D18946 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266018	2016-04-11 23:26:46 +00:00
Evgeniy Stepanov	f17120a85f	[safestack] Add canary to unsafe stack frames Add StackProtector to SafeStack. This adds limited protection against data corruption in the caller frame. Current implementation treats all stack protector levels as -fstack-protector-all. llvm-svn: 266004	2016-04-11 22:27:48 +00:00

6688 Commits