llvm-project

Commit Graph

Author	SHA1	Message	Date
Kit Barton	7c80f98b69	[PPC] Remove Darwin support from POWER backend. This patch issues an error message if Darwin ABI is attempted with the PPC backend. It also cleans up existing test cases, either converting the test to use an alternative triple or removing the test if the coverage is no longer needed. Updated Tests ------------- The majority of test cases were updated to use a different triple that does not include the Darwin ABI. Many tests were also updated to use FileCheck, in place of grep. Deleted Tests ------------- llvm/test/tools/dsymutil/PowerPC/sibling.test was originally added to test specific functionality of dsymutil using an object file created with an old version of llvm-gcc for a Powerbook G4. After a discussion with @JDevlieghere he suggested removing the test. llvm/test/CodeGen/PowerPC/combine_loads_from_build_pair.ll was converted from a PPC test to a SystemZ test, as the behavior is also reproducible there. All other tests that were deleted were specific to the darwin/ppc ABI and no longer necessary. Phabricator Review: https://reviews.llvm.org/D50988 llvm-svn: 340795	2018-08-28 01:18:29 +00:00
Craig Topper	a72012c206	[X86] Correct the cost of (v4i32 (fptoui (v4f64))) under AVX512F. Summary: This was inheriting the cost from the AVX table, but should be legal under AVX512. Reviewers: RKSimon Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D51267 llvm-svn: 340708	2018-08-26 18:47:44 +00:00
John Brawn	980da83f84	[PhiValues] Use callback value handles to invalidate deleted values The way that PhiValues is integrated with BasicAA it is possible for a pass which uses BasicAA to pick up an instance of BasicAA that uses PhiValues without intending to, and then delete values from a function in a way that causes PhiValues to return dangling pointers to these deleted values. Fix this by having a set of callback value handles to invalidate values when they're deleted. llvm-svn: 340613	2018-08-24 15:48:30 +00:00
Brian Homerding	3ecabd709f	[FunctionAttrs] Infer WriteOnly Function Attribute These changes expand the FunctionAttr logic in order to mark functions as WriteOnly when appropriate. This is done through an additional bool variable and extended logic. Reviewers: hfinkel, jdoerfert Differential Revision: https://reviews.llvm.org/D48387 llvm-svn: 340537	2018-08-23 15:05:22 +00:00
Philip Reames	6b6d2e0105	[AST] Add a test for attribute intersection Already works, but I initially convinced myself it doesn't, so add a test which shows it does. :) llvm-svn: 340453	2018-08-22 21:10:56 +00:00
Scott Linder	72855e36c5	[AMDGPU] Consider loads from flat addrspace to be potentially divergent In general we can't assume flat loads are uniform, and cases where we can prove they are should be handled through infer-address-spaces. Differential Revision: https://reviews.llvm.org/D50991 llvm-svn: 340343	2018-08-21 21:24:31 +00:00
Philip Reames	c3c23e8cf2	[AST] Remove notion of volatile from alias sets [NFCI] Volatility is not an aliasing property. We used to model volatile as if it had extremely conservative aliasing implications, but that hasn't been true for several years now. So, it doesn't make sense to be in AliasSet. It also turns out the code is entirely a noop. Outside of the AST code to update it, there was only one user: load store promotion in LICM. L/S promotion doesn't need the check since it walks all the users of the address anyway. It already checks each load or store via !isUnordered which causes us to bail for volatile accesses. (Look at the lines immediately following the two removed asserts.) There is the possibility of some small compile time impact here, but the only case which will get noticeably slower is a loop with a large number of loads and stores to the same address where only the last one we inspect is volatile. This is sufficiently rare it's not worth optimizing for. llvm-svn: 340312	2018-08-21 17:59:11 +00:00
Sanjay Patel	3ce999fa41	[ConstantFolding] improve folding of binops with vector undef operand A non-undef operand may still have undef constant elements, so we should always propagate the vector results per-lane. llvm-svn: 340194	2018-08-20 18:19:02 +00:00
Sanjay Patel	7ff7bd9b3c	[ConstantFolding] add tests for binops on vectors with undef elements; NFC llvm-svn: 340190	2018-08-20 17:31:34 +00:00
Philip Reames	96bc076c3a	[AST] Clarify printing of unknown size locations [NFC] Printing "unknown" is much more clear than an arbitrary large integer llvm-svn: 340108	2018-08-17 23:17:31 +00:00
Philip Reames	26f6176f38	[AST][Tests] Clarify what each test is doing llvm-svn: 340100	2018-08-17 21:58:26 +00:00
Philip Reames	9e313167cf	[AST][Tests] Shorten tests using noalias params llvm-svn: 340099	2018-08-17 21:45:57 +00:00
Philip Reames	079c92e201	[AST] Add tests for argmemonly calls [NFC] First step towards building a test set to rebase D50730 on top of. Starting with clone of memtransfer tests, more to come. llvm-svn: 340095	2018-08-17 21:42:18 +00:00
Sanjay Patel	411b86081e	[ConstantFolding] add simplifications for funnel shift intrinsics This is another step towards being able to canonicalize to the funnel shift intrinsics in IR (see D49242 for the initial patch). We should not have any loss of simplification power in IR between these and the equivalent IR constructs. Differential Revision: https://reviews.llvm.org/D50848 llvm-svn: 340022	2018-08-17 13:23:44 +00:00
Max Kazantsev	7b78d3920c	[MustExecute] Fix algorithmic bug in isGuaranteedToExecute. PR38514 The description of `isGuaranteedToExecute` does not correspond to its implementation. According to the description, it should return `true` if an instruction is executed under the assumption that its loop is entered. However, there is a sophisticated algorithm inside that tries to prove that the instruction is executed if the loop is exited, which is not the same thing for infinite loops. There is an attempt to protect from dealing with infinite loops by prohibiting loops without exit blocks, however an infinite loop can have exit blocks. As a result of that, MustExecute can falsely consider some blocks that are never entered as mustexec, and LICM can hoist dangerous instructions out of them based on this fact. This may introduce UB to programs which did not contain it initially. This patch removes the problematic algorithm and replaces it with one which tries to prove what is required in the description. Differential Revision: https://reviews.llvm.org/D50558 Reviewed By: reames llvm-svn: 339984	2018-08-17 06:19:17 +00:00
Sanjay Patel	0ea8d8b951	[ConstantFolding] add tests for funnel shift intrinsics; NFC No functionality for this yet. llvm-svn: 339889	2018-08-16 16:10:42 +00:00
Easwaran Raman	aca738b742	[BFI] Use rounding while computing profile counts. Summary: Profile count of a block is computed by multiplying its block frequency by entry count and dividing the result by entry block frequency. Do rounded division in the last step and update test cases appropriately. Reviewers: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D50822 llvm-svn: 339835	2018-08-16 00:26:59 +00:00
Max Kazantsev	5a10d127b9	[AliasSetTracker] Do not treat experimental_guard intrinsic as memory writing instruction The `experimental_guard` intrinsic has memory write semantics to model the thread-exiting logic, but does not do any actual writes to memory. Currently, `AliasSetTracker` treats it as a normal memory write. As result, a loop-invariant load cannot be hoisted out of loop because the guard may possibly alias with it. This patch makes `AliasSetTracker` so that it doesn't treat guards as memory writes. Differential Revision: https://reviews.llvm.org/D50497 Reviewed By: reames llvm-svn: 339753	2018-08-15 06:21:02 +00:00
Max Kazantsev	837418f3f9	[NFC] Add comprehensive test of AliasSetTracker with guards llvm-svn: 339643	2018-08-14 06:37:39 +00:00
Reid Kleckner	40e7663b1f	[BasicAA] Don't assume tail calls with byval don't alias allocas Summary: Calls marked 'tail' cannot read or write allocas from the current frame because the current frame might be destroyed by the time they run. However, a tail call may use an alloca with byval. Calling with byval copies the contents of the alloca into argument registers or stack slots, so there is no lifetime issue. Tail calls never modify allocas, so we can return just ModRefInfo::Ref. Fixes PR38466, a longstanding bug. Reviewers: hfinkel, nlewycky, gbiv, george.burgess.iv Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D50679 llvm-svn: 339636	2018-08-14 01:24:35 +00:00
Max Kazantsev	4e9def57c7	[NFC] Add tests that demonstrate that MustExecute is fundamentally broken llvm-svn: 339417	2018-08-10 09:20:46 +00:00
Krzysztof Parzyszek	90f3249ce2	[SCEV] Properly solve quadratic equations Differential Revision: https://reviews.llvm.org/D48283 llvm-svn: 338758	2018-08-02 19:13:35 +00:00
John Brawn	cd73fe8989	[BasicAA] Use PhiValuesAnalysis if available when handling phi alias By using PhiValuesAnalysis we can get all the values reachable from a phi, so we can be more precise instead of giving up when a phi has phi operands. We can't make BasicAA directly use PhiValuesAnalysis though, as the user of BasicAA may modify the function in ways that PhiValuesAnalysis can't cope with. For this optional usage to work correctly BasicAAWrapperPass now needs to be not marked as CFG-only (i.e. it is now invalidated even when CFG is preserved) due to how the legacy pass manager handles dependent passes being invalidated, namely the depending pass still has a pointer to the now-dead dependent pass. Differential Revision: https://reviews.llvm.org/D44564 llvm-svn: 338242	2018-07-30 11:52:08 +00:00
Keno Fischer	864fbd8e9a	[SCEV] Don't expand Wrap predicate using inttoptr in ni addrspaces Summary: In non-integral address spaces, we're not allowed to introduce inttoptr/ptrtoint intrinsics. Instead, we need to expand any pointer arithmetic as geps on the base pointer. Luckily this is a common task for SCEV, so all we have to do here is hook up the corresponding helper function and add test case. Fixes PR38290 Reviewers: sanjoy Differential Revision: https://reviews.llvm.org/D49832 llvm-svn: 338073	2018-07-26 21:55:06 +00:00
Stanislav Mekhanoshin	b8269a9589	Fix llvm::ComputeNumSignBits with some operations and llvm.assume Currently ComputeNumSignBits does early exit while processing some of the operations (add, sub, mul, and select). This prevents the function from using AssumptionCacheTracker if passed. Differential Revision: https://reviews.llvm.org/D49759 llvm-svn: 337936	2018-07-25 16:39:24 +00:00
Roman Tereshin	1ba1f9310c	[SCEV] Add zext(C + x + ...) -> D + zext(C-D + x + ...)<nuw><nsw> transform if the top level addition in (D + (C-D + x + ...)) could be proven to not wrap, where the choice of D also maximizes the number of trailing zeroes of (C-D + x + ...), ensuring homogeneous behaviour of the transformation and better canonicalization of such expressions. This enables better canonicalization of expressions like 1 + zext(5 + 20 * %x + 24 * %y) and zext(6 + 20 * %x + 24 * %y) which get both transformed to 2 + zext(4 + 20 * %x + 24 * %y) This pattern is common in address arithmetics and the transformation makes it easier for passes like LoadStoreVectorizer to prove that 2 or more memory accesses are consecutive and optimize (vectorize) them. Reviewed By: mzolotukhin Differential Revision: https://reviews.llvm.org/D48853 llvm-svn: 337859	2018-07-24 21:48:56 +00:00
Max Kazantsev	d41faecc49	[SCEV] Fix buggy behavior in getAddExpr with truncs SCEV tries to constant-fold arguments of trunc operands in SCEVAddExpr, and when it does that, it passes wrong flags into the recursion. It is only valid to pass flags that are proved for narrow type into a computation in wider type if we can prove that trunc instruction doesn't actually change the value. If it did lose some meaningful bits, we may end up proving wrong no-wrap flags for sum of arguments of trunc. In the provided test we end up with `nuw` where it shouldn't be because of this bug. The solution is to conservatively pass `SCEV::FlagAnyWrap` which is always a valid thing to do. Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D49471 llvm-svn: 337435	2018-07-19 01:46:21 +00:00
Max Kazantsev	6b12506200	[NFC] Make a test more neat llvm-svn: 337379	2018-07-18 11:03:40 +00:00
Tim Shen	a064622bd3	Re-apply "[SCEV] Strengthen StrengthenNoWrapFlags (reapply r334428)." llvm-svn: 337075	2018-07-13 23:58:46 +00:00
Tim Renouf	f3d8295105	DivergenceAnalysis: added debug output Summary: This commit does two things: 1. modified the existing DivergenceAnalysis::dump() so it dumps the whole function with added DIVERGENT: annotations; 2. added code to do that dump if the appropriate -debug-only option is on. Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D47700 Change-Id: Id97b605aab1fc6f5a11a20c58a99bbe8c565bf83 llvm-svn: 336998	2018-07-13 13:13:30 +00:00
Piotr Padlewski	c63b492bcd	Simplify recursive launder.invariant.group and strip Summary: This patch is crucial for proving equality of laundered/stripped pointers, e.g.: bool foo(A *a) { return a == std::launder(a); } Clang with -fstrict-vtable-pointers will emit something like: define dso_local zeroext i1 @_Z3fooP1A(%struct.A* %a) { entry: %c = bitcast %struct.A* %a to i8* %call = tail call i8* @llvm.launder.invariant.group.p0i8(i8* %c) %0 = bitcast %struct.A* %a to i8* %1 = tail call i8* @llvm.strip.invariant.group.p0i8(i8* %0) %2 = tail call i8* @llvm.strip.invariant.group.p0i8(i8* %call) %cmp = icmp eq i8* %1, %2 ret i1 %cmp } and because %2 can be replaced with @llvm.strip.invariant.group(%0) and that %2 and %1 will produce the same value (because strip is readnone) we can replace compare with true. Reviewers: rsmith, hfinkel, majnemer, amharc, kuhar Subscribers: llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D47423 llvm-svn: 336963	2018-07-12 23:55:20 +00:00
Simon Pilgrim	667a5b541f	[TargetTransformInfo] Add pow2 analysis for scalar constants Add ConstantInt analysis to getOperandInfo so we get more realistic div/rem expansion costs comparable to the vector costs. llvm-svn: 336827	2018-07-11 17:51:27 +00:00
Manoj Gupta	77eeac3d9e	llvm: Add support for "-fno-delete-null-pointer-checks" Summary: Support for this option is needed for building the Linux kernel. This is a very frequently requested feature by kernel developers. More details : https://lkml.org/lkml/2018/4/4/601 GCC option description for -fdelete-null-pointer-checks: Assume that programs cannot safely dereference null pointers, and that no code or data element resides at address zero. -fno-delete-null-pointer-checks is the inverse of this, implying that null pointer dereferencing is not undefined. This feature is implemented in LLVM IR in this CL as the function attribute "null-pointer-is-valid"="true" in IR (Under review at D47894). The CL updates several passes that assumed null pointer dereferencing is undefined to not optimize when the "null-pointer-is-valid"="true" attribute is present. Reviewers: t.p.northover, efriedma, jyknight, chandlerc, rnk, srhines, void, george.burgess.iv Reviewed By: efriedma, george.burgess.iv Subscribers: eraman, haicheng, george.burgess.iv, drinkcat, theraven, reames, sanjoy, xbolva00, llvm-commits Differential Revision: https://reviews.llvm.org/D47895 llvm-svn: 336613	2018-07-09 22:27:23 +00:00
Simon Pilgrim	dc113dc7ed	[CostModel][X86] Add SREM/UREM general and constant costs (PR38056) We penalize general SDIV/UDIV costs but don't do the same for SREM/UREM. This patch makes general vector SREM/UREM x20 as costly as scalar, the same approach as we do for SDIV/UDIV. The patch also extends the existing SDIV/UDIV constant costs for SREM/UREM - at the moment this means the additional cost of a MUL+SUB (see D48975). Differential Revision: https://reviews.llvm.org/D48980 llvm-svn: 336486	2018-07-07 16:53:30 +00:00
Tim Shen	2ed501d656	Revert "[SCEV] Strengthen StrengthenNoWrapFlags (reapply r334428)." This reverts commit r336140. Our tests show that an LSR assert fails with it. llvm-svn: 336473	2018-07-06 23:20:35 +00:00
Max Kazantsev	20da7e467a	Revert "[InstCombine] Delay foldICmpUsingKnownBits until simple transforms are done" llvm-svn: 336410	2018-07-06 04:04:13 +00:00
Simon Pilgrim	8c3765dc6b	[CostModel][X86] Add UDIV/UREM by pow2 costs Normally InstCombine would have simplified these to SRL/AND instructions but we may still see these during SLP vectorization etc. llvm-svn: 336371	2018-07-05 16:56:28 +00:00
Max Kazantsev	3097b76e8c	[InstCombine] Delay foldICmpUsingKnownBits until simple transforms are done This patch changes order of transform in InstCombineCompares to avoid performing transforms based on ranges which produce complex bit arithmetics before more simple things (like folding with constants) are done. See PR37636 for the motivating example. Differential Revision: https://reviews.llvm.org/D48584 Reviewed By: spatel, lebedev.ri llvm-svn: 336172	2018-07-03 06:23:57 +00:00
Tim Shen	c7cef4bcc4	[SCEV] Strengthen StrengthenNoWrapFlags (reapply r334428). Summary: Comment on Transforms/LoopVersioning/incorrect-phi.ll: With the change SCEV is able to prove that the loop doesn't wrap-self (due to zext i16 to i64), disabling the entire loop versioning pass. Removed the zext and just use i64. Reviewers: sanjoy Subscribers: jlebar, hiraditya, javed.absar, bixia, llvm-commits Differential Revision: https://reviews.llvm.org/D48409 llvm-svn: 336140	2018-07-02 20:01:54 +00:00
Simon Pilgrim	ac193d4b5c	[CostModel][X86] Add cost tests for fp rounding intrinsics Add cost tests for fp ceil, floor, nearbyint, rint and trunc. llvm-svn: 336122	2018-07-02 17:07:01 +00:00
Piotr Padlewski	5b3db45e8f	Implement strip.invariant.group Summary: This patch introduces a new intrinsic, strip.invariant.group, that was described in the RFC: Devirtualization v2 Reviewers: rsmith, hfinkel, nlopes, sanjoy, amharc, kuhar Subscribers: arsenm, nhaehnle, JDevlieghere, hiraditya, xbolva00, llvm-commits Differential Revision: https://reviews.llvm.org/D47103 Co-authored-by: Krzysztof Pszeniczny <krzysztof.pszeniczny@gmail.com> llvm-svn: 336073	2018-07-02 04:49:30 +00:00
Roman Shirokiy	272eac85c7	Fix overconfident assert in ScalarEvolution::isImpliedViaMerge We can have AddRec with loops having many predecessors. This changes an assert to an early return. Differential Revision: https://reviews.llvm.org/D48766 llvm-svn: 335965	2018-06-29 11:46:30 +00:00
John Brawn	bdbbd8381f	Add a PhiValuesAnalysis pass to calculate the underlying values of phis This pass is being added in order to make the information available to BasicAA, which can't do caching of this information itself, but possibly this information may be useful for other passes. Incorporates code based on Daniel Berlin's implementation of Tarjan's algorithm. Differential Revision: https://reviews.llvm.org/D47893 llvm-svn: 335857	2018-06-28 14:13:06 +00:00
Adhemerval Zanella	cadcfed7aa	[AArch64] Add custom lowering for v4i8 trunc store This patch adds a custom trunc store lowering for v4i8 vector types. Since there is no v.4b register, the v4i8 is promoted to v4i16 (v.4h) and the default action for v4i8 is to extract each element and issue 4 byte stores. A better strategy would be to extend the promoted v4i16 to v8i16 (with undef elements) and extract and store the word lane which represents the v4i8 subvectors. The construction: define void @foo(<4 x i16> %x, i8* nocapture %p) { %0 = trunc <4 x i16> %x to <4 x i8> %1 = bitcast i8* %p to <4 x i8>* store <4 x i8> %0, <4 x i8>* %1, align 4, !tbaa !2 ret void } Can be optimized from: umov w8, v0.h[3] umov w9, v0.h[2] umov w10, v0.h[1] umov w11, v0.h[0] strb w8, [x0, #3] strb w9, [x0, #2] strb w10, [x0, #1] strb w11, [x0] ret To: xtn v0.8b, v0.8h str s0, [x0] ret The patch also adjusts the memory cost for autovectorization, so the C code: void foo (const int *src, int width, unsigned char *dst) { for (int i = 0; i < width; i++) *dst++ = *src++; } can be vectorized to: .LBB0_4: // %vector.body // =>This Inner Loop Header: Depth=1 ldr q0, [x0], #16 subs x12, x12, #4 // =4 xtn v0.4h, v0.4s xtn v0.8b, v0.8h st1 { v0.s }[0], [x2], #4 b.ne .LBB0_4 Instead of byte operations. llvm-svn: 335735	2018-06-27 13:58:46 +00:00
David Green	8699492304	[DA] Delinearise AddRecs if we can prove they don't wrap We can prove that some delinearized subscripts do not wrap around to become negative by the fact that they are from inbound geps of load/store locations. This helps improve the delinearisation in cases where we can't prove that they are non-negative from SCEV alone. Differential Revision: https://reviews.llvm.org/D48481 llvm-svn: 335481	2018-06-25 15:13:26 +00:00
Simon Pilgrim	9c8f9374b5	[CostModel][AArch64] Add some initial costs for SK_Select and SK_PermuteSingleSrc AArch64 was only setting costs for SK_Transpose, which meant that many of the simpler shuffles (e.g. SK_Select and SK_PermuteSingleSrc for larger vector elements) were being severely overestimated by the default shuffle expansion. This patch adds costs to help improve SLP performance and avoid a regression in reductions introduced by D48174. I'm not very knowledgeable about AArch64 shuffle lowering so I've kept the extra costs to a minimum - someone who knows this code can add extra costs which should improve vectorization a lot more. Differential Revision: https://reviews.llvm.org/D48172 llvm-svn: 335329	2018-06-22 09:45:31 +00:00
Tim Shen	63f244c4f4	[SCEV] Re-apply r335197 (with Polly fixes). Summary: This initiates a discussion on changing Polly accordingly while re-applying r335197 (D48338). I have never worked on Polly. The proposed change to param_div_div_div_2.ll is not educated, but just patterns that match the output. All LLVM files are already reviewed in D48338. Reviewers: jdoerfert, bollu, efriedma Subscribers: jlebar, sanjoy, hiraditya, llvm-commits, bixia Differential Revision: https://reviews.llvm.org/D48453 llvm-svn: 335292	2018-06-21 21:29:54 +00:00
Nicolai Haehnle	1045928aab	AMDGPU: Convert test cases to the dimension-aware intrinsics Summary: Also explicitly port over some tests in llvm.amdgcn.image.* that were missing. Some tests are removed because they no longer apply (i.e. explicitly testing building an address vector via insertelement). This is in preparation for the eventual removal of the old-style intrinsics. Some additional notes: - constant-address-space-32bit.ll: change some GCN-NEXT to GCN because the instruction schedule was subtly altered - insert_vector_elt.ll: the old test didn't actually test anything, because %tmp1 was not used; remove the load, because it doesn't work (Because of the amdgpu_ps calling convention? In any case, it's orthogonal to what the test claims to be testing.) Change-Id: Idfa99b6512ad139e755e82b8b89548ab08f0afcf Reviewers: arsenm, rampitec Subscribers: MatzeB, qcolombet, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D48018 llvm-svn: 335229	2018-06-21 13:37:19 +00:00
David Green	d143c65de3	[DA] Enable -da-delinearize by default This enables da-delinearize in Dependence Analysis for delinearizing array accesses into multiple dimensions. This can help to increase the power of Dependence analysis on multi-dimensional arrays and prevent having to fall back to the slower and less accurate MIV tests. It adds static checks on the bounds of the arrays to ensure that one dimension doesn't overflow into another, and brings our code in line with our tests. Differential Revision: https://reviews.llvm.org/D45872 llvm-svn: 335217	2018-06-21 11:53:16 +00:00
Simon Pilgrim	2a9cde026c	[X86][AVX] Reduce v4f64/v4i64 shuffle costs (PR37882) These were being over cautious for costs for one/two op general shuffles - VSHUFPD doesn't have to replicate the same shuffle in both lanes like VSHUFPS does. llvm-svn: 335216	2018-06-21 11:37:13 +00:00
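
Note on aca738b742 ([BFI] Use rounding while computing profile counts): the message describes the profile count of a block as its block frequency multiplied by the entry count, divided by the entry block frequency, with the final division rounded. A minimal Python sketch of that arithmetic follows. The function name `profile_count` and its parameters are hypothetical and are not taken from the LLVM sources; the round-half-up rule shown here is an assumption about what "rounded division" means, and the actual implementation may differ in detail.

```python
def profile_count(block_freq: int, entry_count: int, entry_freq: int) -> int:
    """Scale a block frequency to a profile count using rounded division.

    Hypothetical helper for illustration only; all inputs are assumed to be
    non-negative integers, with entry_freq strictly positive.
    """
    if entry_freq <= 0:
        raise ValueError("entry_freq must be positive")
    product = block_freq * entry_count
    # Adding half the divisor before the floor division rounds to the
    # nearest integer (halves round up) instead of truncating.
    return (product + entry_freq // 2) // entry_freq


def profile_count_truncated(block_freq: int, entry_count: int, entry_freq: int) -> int:
    """The same scaling with plain truncating division, for comparison."""
    if entry_freq <= 0:
        raise ValueError("entry_freq must be positive")
    return (block_freq * entry_count) // entry_freq
```

With a block frequency of 3, an entry count of 10 and an entry frequency of 4, the exact quotient is 7.5: the truncating version yields 7 and the rounded version yields 8, which is the kind of off-by-one difference that would require the test updates the commit message mentions.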


1519 Commits