llvm-project

Commit Graph

Author	SHA1	Message	Date
Andrew Litteken	0a8e097e72	Revert "[IRSim] Adding structural comparison to IRSimilarityCandidate." This reverts commit `b27db2bb68`.	2020-09-23 22:40:37 -05:00
Andrew Litteken	b27db2bb68	[IRSim] Adding structural comparison to IRSimilarityCandidate. Just because sequences of instructions are similar to one another, doesn't mean they are doing the same thing. This introduces a structural check for the IRSimilarityCandidate that compares two IRSimilarityCandidates against one another, and in each instruction creates a mapping between the operands and results, or checks that the existing mapping is valid. If this check passes, it means we have structurally similar IRSimilarityCandidates. Tests for whether the candidates are found in unittests/Analysis/IRSimilarityIdentifierTest.cpp.	2020-09-23 22:31:12 -05:00
Sam McCall	fa69b60806	[JSON] Add error reporting to fromJSON and ObjectMapper Translating between JSON objects and C++ strutctures is common. From experience in clangd, fromJSON/ObjectMapper work well and save a lot of code, but aren't adopted elsewhere at least partly due to total lack of error reporting beyond "ok"/"bad". The recently-added error model should be rich enough for most applications. It requires tracking the path within the root object and reporting local errors at appropriate places. To do this, we exploit the fact that the call graph of recursive parse functions mirror the structure of the JSON itself. The current path is represented as a linked list of segments, each of which is on the stack as a parameter. Concretely, fromJSON now looks like: bool fromJSON(const Value&, T&, Path); Beyond the signature change, this is reasonably unobtrusive: building the path segments is mostly handled by ObjectMapper and the vector<T> fromJSON. However the root caller of fromJSON must now create a Root object to store the errors, which is a little clunky. I've added high-level parse<T>(StringRef) -> Expected<T>, but it's not general enough to be the primary interface I think (at least, not usable in clangd). All existing users (mostly just clangd) are updated in this patch, making this change backwards-compatible is a bit hairy. Differential Revision: https://reviews.llvm.org/D88103	2020-09-24 01:20:09 +02:00
Arthur Eubanks	6b1ce83a12	[NewPM][CGSCC] Handle newly added functions in updateCGAndAnalysisManagerForPass This seems to fit the CGSCC updates model better than calling addNewFunctionInto{Ref,}SCC() on newly created/outlined functions. Now addNewFunctionInto{Ref,}SCC() are no longer necessary. However, this doesn't work on newly outlined functions that aren't referenced by the original function. e.g. if a() was outlined into b() and c(), but c() is only referenced by b() and not by a(), this will trigger an assert. This also fixes an issue I was seeing with newly created functions not having passes run on them. Ran check-llvm with expensive checks. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D87798	2020-09-23 15:22:18 -07:00
Andrew Litteken	6ada9e516f	[IRSim] Adding IRSimilarityCandidate that contains a region of IRInstructionData. The IRSimilarityCandidate is a container to hold a region of IRInstructions and offer interfaces for the starting instruction, ending instruction, parent function, length. It also assigns a global value number for each unique instance of a value in the region. It also contains an interface to compare two IRSimilarity as to whether they have the same sequence of similar instructions. Tests for whether the instructions are similar are found in unittests/Analysis/IRSimilarityIdentifierTest.cpp. Recommit of: `4944bb190f` Differential Revision: https://reviews.llvm.org/D86970	2020-09-23 13:43:34 -05:00
Sanjay Patel	6189a8d9f5	[TTI] add wrapper for matching vector reduction to reduce code duplication; NFC I'm not sure what this means, but the order in which we try the matches makes a difference on at least 1 regression test...	2020-09-23 13:48:57 -04:00
Andrew Litteken	88bc59c300	Revert "[IRSim] Adding IRSimilarityCandidate that contains a region of IRInstructionData." This reverts commit `4944bb190f`.	2020-09-22 21:02:34 -05:00
Andrew Litteken	4944bb190f	[IRSim] Adding IRSimilarityCandidate that contains a region of IRInstructionData. The IRSimilarityCandidate is a container to hold a region of IRInstructions and offer interfaces for the starting instruction, ending instruction, parent function, length. It also assigns a global value number for each unique instance of a value in the region. It also contains an interface to compare two IRSimilarity as to whether they have the same sequence of similar instructions. Tests for whether the instructions are similar are found in unittests/Analysis/IRSimilarityIdentifierTest.cpp. Differential Revision: https://reviews.llvm.org/D86970	2020-09-22 18:42:31 -05:00
Stefanos Baziotis	89c1e35f3c	[LoopInfo] empty() -> isInnermost(), add isOutermost() Differential Revision: https://reviews.llvm.org/D82895	2020-09-22 23:28:51 +03:00
Max Kazantsev	e2703c021d	[SCEV] Handle `less` predicates for FoundPred = NE Currently these predicates are ignored, yet their handling is pretty simple. I could not find a single test where it would actually change something, but it's only because isImpliedCondOperands is not smart enough to prove it further on. Yet the situation when we come there with `less` predicate is pretty common. Differential Revision: https://reviews.llvm.org/D87890 Reviewed By: fhahn	2020-09-22 18:56:35 +07:00
Meera Nakrani	a3d0dce260	[ARM][TTI] Prevents constants in a min(max) or max(min) pattern from being hoisted when in a loop Changes TTI function getIntImmCostInst to take an additional Instruction parameter, which enables us to be able to check it is part of a min(max())/max(min()) pattern that will match SSAT. We can then mark the constant used as free to prevent it being hoisted so SSAT can still be generated. Required minor changes in some non-ARM backends to allow for the optional parameter to be included. Differential Revision: https://reviews.llvm.org/D87457	2020-09-22 11:54:10 +00:00
Max Kazantsev	16fde88dbd	[SCEV] Support unsigned predicates in isKnownPredicateViaNoOverflow SCEV should be able to prove facts like `x <u x+1<nuw>`. Differential Revision: https://reviews.llvm.org/D88015 Reviewed By: lebedev.ri	2020-09-22 17:14:05 +07:00
Arthur Eubanks	9db0c572c1	[Delinearization][NewPM] Port delinearization to NPM Also make tests in Analysis/Delinearization work under NPM. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D87741	2020-09-21 17:59:08 -07:00
Fangrui Song	8fdac7cb7a	Revert D71539 "Recommit "[SCEV] Look through single value PHIs."" This reverts commit `11dccf8d3a`. A bootstrapped clang crashes (due to ArrayRef::front called on an empty ArrayRef) when compiling some files. Very strangely, this only reproduces with modules. ``` 13 0x0000564d3349e968 llvm::ArrayRef<llvm::BasicBlock>::front() const /proc/self/cwd/llvm/include/llvm/ADT/ArrayRef.h:160:7 14 0x0000564d3349e896 llvm::LoopBase<llvm::BasicBlock, llvm::Loop>::getHeader() const /proc/self/cwd/llvm/include/llvm/Analysis/LoopInfo.h:104:50 15 0x0000564d3349fd9d llvm::LoopBase<llvm::BasicBlock, llvm::Loop>::getLoopLatch() const /proc/self/cwd/llvm/include/llvm/Analysis/LoopInfoImpl.h:210:11 16 0x0000564d33593c8a llvm::ScalarEvolution::computeBackedgeTakenCount(llvm::Loop const, bool) /proc/self/cwd/llvm/lib/Analysis/ScalarEvolution.cpp:6933:15 17 0x0000564d33592ebc llvm::ScalarEvolution::getBackedgeTakenInfo(llvm::Loop const) /proc/self/cwd/llvm/lib/Analysis/ScalarEvolution.cpp:0:30 18 0x0000564d33593a54 llvm::ScalarEvolution::getBackedgeTakenCount(llvm::Loop const, llvm::ScalarEvolution::ExitCountKind) /proc/self/cwd/llvm/lib/Analysis/ScalarEvolution.cpp:6487:36 19 0x0000564d32be2402 llvm::ScalarEvolution::getConstantMaxBackedgeTakenCount(llvm::Loop const) /proc/self/cwd/llvm/include/llvm/Analysis/ScalarEvolution.h:768:5 20 0x0000564d33590807 llvm::ScalarEvolution::getRangeRef(llvm::SCEV const, llvm::ScalarEvolution::RangeSignHint) /proc/self/cwd/llvm/lib/Analysis/ScalarEvolution.cpp:5495:19 21 0x0000564d320abab7 llvm::ScalarEvolution::getSignedRange(llvm::SCEV const) /proc/self/cwd/llvm/include/llvm/Analysis/ScalarEvolution.h:840:12 22 0x0000564d335a03aa llvm::ScalarEvolution::isKnownPredicateViaConstantRanges(llvm::CmpInst::Predicate, llvm::SCEV const, llvm::SCEV const) /proc/self/cwd/llvm/lib/Analysis/ScalarEvolution.cpp:9239:60 23 0x0000564d33586a80 llvm::ScalarEvolution::isKnownViaNonRecursiveReasoning(llvm::CmpInst::Predicate, llvm::SCEV const, llvm::SCEV const*) /proc/self/cwd/llvm/lib/Analysis/ScalarEvolution.cpp:10284:60 ```	2020-09-21 17:21:43 -07:00
Roman Lebedev	0ab99bb314	[NFC][SCEV] Cleanup lowering of @llvm.uadd.sat, (-1 - V) is just ~V	2020-09-21 22:10:59 +03:00
Roman Lebedev	64e2cb7e96	[SCEV] Recognize @llvm.uadd.sat as `%y + umin(%x, (-1 - %y))` ---------------------------------------- define i32 @src(i32 %x, i32 %y) { %0: %r = uadd_sat i32 %x, %y ret i32 %r } => define i32 @tgt(i32 %x, i32 %y) { %0: %t0 = sub nsw nuw i32 4294967295, %y %t1 = umin i32 %x, %t0 %r = add nuw i32 %t1, %y ret i32 %r } Transformation seems to be correct! The alternative, naive, lowering could be the following, although i don't think it's better, thought it will likely be needed for sadd/ssub/*shl: ---------------------------------------- define i32 @src(i32 %x, i32 %y) { %0: %r = uadd_sat i32 %x, %y ret i32 %r } => define i32 @tgt(i32 %x, i32 %y) { %0: %t0 = zext i32 %x to i33 %t1 = zext i32 %y to i33 %t2 = add nuw i33 %t0, %t1 %t3 = zext i32 4294967295 to i33 %t4 = umin i33 %t2, %t3 %r = trunc i33 %t4 to i32 ret i32 %r } Transformation seems to be correct!	2020-09-21 20:25:54 +03:00
Roman Lebedev	fedc9549d5	[SCEV] Recognize @llvm.usub.sat as `%x - (umin %x, %y)` ---------------------------------------- define i32 @src(i32 %x, i32 %y) { %0: %r = usub_sat i32 %x, %y ret i32 %r } => define i32 @tgt(i32 %x, i32 %y) { %0: %t0 = umin i32 %x, %y %r = sub nuw i32 %x, %t0 ret i32 %r } Transformation seems to be correct!	2020-09-21 20:25:54 +03:00
Roman Lebedev	1bb7ab8c4a	[SCEV] Recognize @llvm.abs as smax(x, -x) As per alive2 (ignoring undef): ---------------------------------------- define i32 @src(i32 %x, i1 %y) { %0: %r = abs i32 %x, 0 ret i32 %r } => define i32 @tgt(i32 %x, i1 %y) { %0: %neg_x = mul i32 %x, 4294967295 %r = smax i32 %x, %neg_x ret i32 %r } Transformation seems to be correct! ---------------------------------------- define i32 @src(i32 %x, i1 %y) { %0: %r = abs i32 %x, 1 ret i32 %r } => define i32 @tgt(i32 %x, i1 %y) { %0: %neg_x = mul nsw i32 %x, 4294967295 %r = smax i32 %x, %neg_x ret i32 %r } Transformation seems to be correct!	2020-09-21 20:25:53 +03:00
Florian Hahn	11dccf8d3a	Recommit "[SCEV] Look through single value PHIs." This commit was originally because it was suspected to cause a crash, but a reproducer did not surface. A crash that was exposed by this change was fixed in `1d8f2e5292`. This reverts the revert commit `0581c0b0ee`.	2020-09-21 11:59:50 +01:00
Nikita Popov	445db89b53	[LVI] Get value range from mask comparison InstCombine likes to canonicalize comparisons of the form X == C \|\| X == C+1 into (X & -2) == C'. Make sure LVI can still recover the value range from this. Can of course also be useful for proper mask comparisons. For the sake of clarity, the implementation goes through KnownBits to compute the range.	2020-09-20 21:13:57 +02:00
Nikita Popov	f94bbe19b6	[LVI] Refactor getValueFromICmpCondition (NFC) Rewrite this in a way where the core logic is in a separate function, that is invoked with swapped operands. This makes it easier to add handling for additional icmp patterns.	2020-09-20 21:13:57 +02:00
Dávid Bolvanský	2990518b03	[MemLoc] Support lllvm.memcpy.inline in MemoryLocation::getForArgument Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D87971	2020-09-20 14:01:48 +02:00
Fangrui Song	6913812abc	Fix some clang-tidy bugprone-argument-comment issues	2020-09-19 20:41:25 -07:00
Dávid Bolvanský	d716f1608c	[MemLoc] Support bcmp in MemoryLocation::getForArgument Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D87964	2020-09-19 17:12:43 +02:00
Sanjay Patel	f74a334fe3	[ConstantFolding] add undef handling for fmin/fmax intrinsics The output here may not be optimal (yet), but it should be consistent for commuted operands (it was not before) and correct. We can do better by checking FMF and NaN if needed. Code in InstSimplify generally assumes that we have already folded code like this, so it was not handling 2 constant inputs by commuting consistently.	2020-09-19 10:31:01 -04:00
Andrew Litteken	132aaec4f2	[IRSim] Adding ilist for IRInstructionData. The IRInstructionData structs are a different representation of the program. This list treats the program as if it was "flattened" and the only parent is this list. This lets us easily create ranges of instructions. Differential Revision: https://reviews.llvm.org/D86969	2020-09-19 00:18:39 -05:00
Vitaly Buka	97bfac076a	[NFC][StackSafety] Replace auto with type Fixes static analyzer is warning.	2020-09-18 17:10:28 -07:00
Sanjay Patel	3f100e64b4	[InstSimplify] fix fmin/fmax miscompile for partial undef vectors (PR47567) It would also be correct to return the variable operand in these cases, but eliminating a variable use is probably better for optimization.	2020-09-18 10:05:44 -04:00
Florian Hahn	4635f6050b	[SCEV] Generalize SCEVParameterRewriter to accept SCEV expression as target. This patch extends SCEVParameterRewriter to support rewriting unknown epxressions to arbitrary SCEV expressions. It will be used by further patches. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D67176	2020-09-18 10:05:02 +01:00
Andrew Litteken	7e4c6fb854	[IRSim] Adding IR Instruction Mapper This introduces the IRInstructionMapper, and the associated wrapper for instructions, IRInstructionData, that maps IR level Instructions to unsigned integers. Mapping is done mainly by using the "isSameOperationAs" comparison between two instructions. If they return true, the opcode, result type, and operand types of the instruction are used to hash the instruction with an unsigned integer. The mapper accepts instruction ranges, and adds each resulting integer to a list, and each wrapped instruction to a separate list. At present, branches, phi nodes are not mapping and exception handling is illegal. Debug instructions are not considered. The different mapping schemes are tested in unittests/Analysis/IRSimilarityIdentifierTest.cpp Recommit of: `b04c1a9d31` Differential Revision: https://reviews.llvm.org/D86968	2020-09-17 14:06:16 -05:00
Mikael Holmen	bb037c2a76	[ConstraintSystem] Remove local variable that is set but not read [NFC] gcc 7.4 warns about it.	2020-09-17 14:26:48 +02:00
Sjoerd Meijer	6637d72ddd	[Lint] Add check for intrinsic get.active.lane.mask As @efriedma pointed out in D86301, this "not equal to 0 check" of get.active.lane.mask's second operand needs to live here in Lint and not the Verifier. Differential Revision: https://reviews.llvm.org/D87228	2020-09-17 09:22:03 +01:00
Stella Stamenova	a895040eb0	Revert "[IRSim] Adding IR Instruction Mapper" This reverts commit `b04c1a9d31`.	2020-09-16 20:00:43 -07:00
Andrew Litteken	b04c1a9d31	[IRSim] Adding IR Instruction Mapper This introduces the IRInstructionMapper, and the associated wrapper for instructions, IRInstructionData, that maps IR level Instructions to unsigned integers. Mapping is done mainly by using the "isSameOperationAs" comparison between two instructions. If they return true, the opcode, result type, and operand types of the instruction are used to hash the instruction with an unsigned integer. The mapper accepts instruction ranges, and adds each resulting integer to a list, and each wrapped instruction to a separate list. At present, branches, phi nodes are not mapping and exception handling is illegal. Debug instructions are not considered. The different mapping schemes are tested in unittests/Analysis/IRSimilarityIdentifierTest.cpp Differential Revision: https://reviews.llvm.org/D86968	2020-09-16 20:49:21 -05:00
Arthur Eubanks	f4ea0f9814	[NewPM] Port -print-alias-sets to NPM Really it should be named print<alias-sets>, but for the sake of changing fewer tests, added a TODO to rename after NPM switch and test cleanup. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D87713	2020-09-16 18:34:56 -07:00
Alina Sbirlea	344a3d0bc0	[MemorySSA] Rename uses in blocks with Phis. Renaming should include blocks with existing Phis. Resolves PR45927. Differential Revision: https://reviews.llvm.org/D87661	2020-09-16 17:24:17 -07:00
Nikita Popov	0bb06f297f	[InstSimplify] Clarify SimplifyWithOpReplaced() return value If SimplifyWithOpReplaced() cannot simplify the value, null should be returned. Make sure this really does happen in all cases, including those where SimplifyBinOp() returns the original value. This does not matter for existing users, but does mattter for D87480, which would go into an infinite loop otherwise.	2020-09-16 20:53:26 +02:00
Arthur Eubanks	91332c4dbb	[CGSCC][NewPM] Fix adding mutually recursive new functions When adding a new function via addNewFunctionIntoRefSCC(), it creates a new node and immediately populates the edges. Since populateSlow() calls G->get() on all referenced functions, it will create a node (but not populate it) for functions that haven't yet been added. If we add two mutually recursive functions, the assert that the node should never have been created will fire when the second function is added. So here we remove that assert since the node may have already been created (but not yet populated). createNode() is only called from addNewFunctionInto{,Ref}SCC(). https://bugs.llvm.org/show_bug.cgi?id=47502 Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D87623	2020-09-15 16:44:08 -07:00
Sanjay Patel	8985755762	[InstSimplify] add limit folds for fmin/fmax If the constant operand is the opposite of the min/max value, then the result must be the other value. This is based on the similar codegen transform proposed in: D87571	2020-09-15 10:58:44 -04:00
Florian Hahn	db22e70d01	[ConstraintSolver] Add isConditionImplied helper. This patch adds a isConditionImplied function that takes a constraint and returns true if the constraint is implied by the current constraints in the system. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D84545	2020-09-15 13:50:11 +01:00
Simon Pilgrim	50d2a5d4c7	LoopCacheAnalysis.h - remove unnecessary includes. NFCI. More remaining dependencies down to LoopCacheAnalysis.cpp	2020-09-15 13:34:35 +01:00
Simon Pilgrim	6d40f35c9f	AliasSetTracker.cpp - remove unnecessary includes. NFCI. These are all directly included in AliasSetTracker.h	2020-09-15 13:34:34 +01:00
Florian Hahn	cd4edf94cd	Recommit "[ConstraintSystem] Add helpers to deal with linear constraints." This patch recommits "[ConstraintSystem] Add helpers to deal with linear constraints." (it reverts the revert commit `8da6ae4ce1`). The reason for the revert was using __builtin_multiply_overflow, which is not available for all compilers. The patch has been updated to use MulOverflow from MathExtras.h	2020-09-15 12:07:26 +01:00
Sanjay Patel	55d371abd7	[InstSimplify] add folds for fmin/fmax with 'nnan' maximum(nnan X, +INF) --> +INF minimum(nnan X, -INF) --> -INF This is based on the similar codegen transform proposed in: D87571	2020-09-14 11:46:11 -04:00
Sanjay Patel	7526376164	[InstSimplify] allow folds for fmin/fmax with 'ninf' maxnum(ninf X, +FLT_MAX) --> +FLT_MAX minnum(ninf X, -FLT_MAX) --> -FLT_MAX This is based on the similar codegen transform proposed in: D87571	2020-09-14 11:18:08 -04:00
Florian Hahn	c4f1b31441	[MemorySSA] Make sure PerformedPhiTrans is updated for each visited def. `1ce82015f6` added a fix to restrict phi optimizations after phi translations. But the current use of performedPhiTranslation only checked whether phi translation happened for the first iterator and missed cases where phi translations happens at subsequent iterators/upwards defs. This patch changes upward_defs_iteartor to take a pointer to a bool, so we can easily ensure the final value includes all visited defs, while still being able to conveniently use it with make_range & co.	2020-09-14 16:11:56 +01:00
Sanjay Patel	22c583c3d0	[InstSimplify] reduce code duplication for fmin/fmax folds; NFC We use the same code structure for folding integer min/max.	2020-09-14 10:32:11 -04:00
Sanjay Patel	7bb9a2f996	[InstSimplify] fix miscompiles with maximum/minimum intrinsics As discussed in the sibling codegen functionality patch D87571, this transform was created with D52766, but it is not correct. The incorrect test diffs were missed during review, but the 'TODO' comment about this functionality was still in the code - we need 'nnan' to enable this fold.	2020-09-14 09:06:41 -04:00
Max Kazantsev	412b417bfa	[NFC] Add missing `const` statements in SCEV	2020-09-14 18:43:24 +07:00
David Green	74760bb00f	[LV][ARM] Add preferInloopReduction target hook. This allows the backend to tell the vectorizer to produce inloop reductions through a TTI hook. For the moment on ARM under MVE this means allowing integer add reductions of the correct size. In the future this can include integer min/max too, under -Os. Differential Revision: https://reviews.llvm.org/D75512	2020-09-12 17:47:04 +01:00
Tyker	78de7297ab	Reland [AssumeBundles] Use operand bundles to encode alignment assumptions NOTE: There is a mailing list discussion on this: http://lists.llvm.org/pipermail/llvm-dev/2019-December/137632.html Complemantary to the assumption outliner prototype in D71692, this patch shows how we could simplify the code emitted for an alignemnt assumption. The generated code is smaller, less fragile, and it makes it easier to recognize the additional use as a "assumption use". As mentioned in D71692 and on the mailing list, we could adopt this scheme, and similar schemes for other patterns, without adopting the assumption outlining.	2020-09-12 15:36:06 +02:00
Nikita Popov	36e2e2e12e	[InstCombine] Fix incorrect SimplifyWithOpReplaced transform (PR47322) This is a followup to D86834, which partially fixed this issue in InstSimplify. However, InstCombine repeats the same transform while dropping poison flags -- which does not cover cases where poison is introduced in some other way. The fix here is a bit more comprehensive, because things are quite entangled, and it's hard to only partially address it without regressing optimization. There are really two changes here: * Export the SimplifyWithOpReplaced API from InstSimplify, with an added AllowRefinement flag. For replacements inside the TrueVal we don't actually care whether refinement occurs or not, the replacement is always legal. This part of the transform is now done in InstSimplify only. (It should be noted that the current AllowRefinement check is not sufficient -- that's an issue we need to address separately.) * Change the InstCombine fold to work by temporarily dropping poison generating flags, running the fold and then restoring the flags if it didn't work out. This will ensure that the InstCombine fold is correct as long as the InstSimplify fold is correct. Differential Revision: https://reviews.llvm.org/D87445	2020-09-12 14:45:06 +02:00
Simon Pilgrim	48b510c4bc	[NFC] Fix compiler warnings due to integer comparison of different signedness Fix by directly using INT_MAX and INT32_MAX. Patch by: @nullptr.cpp (Yang Fan) Differential Revision: https://reviews.llvm.org/D87347	2020-09-11 15:32:03 +01:00
Florian Hahn	8da6ae4ce1	Revert "[ConstraintSystem] Add helpers to deal with linear constraints." This reverts commit `3eb141e507`. This uses __builtin_mul_overflow which is not available everywhere.	2020-09-11 14:49:04 +01:00
Florian Hahn	3eb141e507	[ConstraintSystem] Add helpers to deal with linear constraints. This patch introduces a new ConstraintSystem class, that maintains a set of linear constraints and uses Fourier–Motzkin elimination to eliminate constraints to check if there are solutions for the system. It also adds a convert-constraint-log-to-z3.py script, which can parse the debug output of the constraint system and convert it to a python script that feeds the constraints into Z3 and checks if it produces the same result as the LLVM implementation. This is for verification purposes. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D84544	2020-09-11 14:43:22 +01:00
Mircea Trofin	da92448828	[NFC][MLInliner] Presort instruction successions. Differential Revision: https://reviews.llvm.org/D87489	2020-09-10 21:40:49 -07:00
Nikita Popov	a5168bdb4a	[DemandedBits][BDCE] Add support for min/max intrinsics Add DemandedBits / BDCE support for min/max intrinsics: If the low bits are not demanded in the result, they also aren't demanded in the operands. Differential Revision: https://reviews.llvm.org/D87161	2020-09-10 22:13:31 +02:00
Nikita Popov	99e78cb718	[DemandedBits] Add braces to large if (NFC) While the if only contains a single statement, it happens to be a huge switch. Add braces to make this code easier to read.	2020-09-10 22:13:27 +02:00
Christopher Tetreault	7ddfd9b3eb	[SVE] Bail from VectorUtils heuristics for scalable vectors Bail from maskIsAllZeroOrUndef and maskIsAllOneOrUndef prior to iterating over the number of elements for scalable vectors. Assert that the mask type is not scalable in possiblyDemandedEltsInMask . Assert that the types are correct in all three functions. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D87424	2020-09-10 12:29:37 -07:00
Max Kazantsev	8c0bbbade1	[NFC] Refactoring in SCEV: add missing `const` qualifiers	2020-09-10 19:06:37 +07:00
Max Kazantsev	cde8fc65ae	[NFC] Rename variables to avoid name confusion Name `LI` is used for loop info, loop and load inst at the same function, which causes a lot of confusion.	2020-09-10 13:41:10 +07:00
Juneyoung Lee	a6183d0f02	[ValueTracking] isKnownNonZero, computeKnownBits for freeze This implements support for isKnownNonZero, computeKnownBits when freeze is involved. ``` br (x != 0), BB1, BB2 BB1: y = freeze x ``` In the above program, we can say that y is non-zero. The reason is as follows: (1) If x was poison, `br (x != 0)` raised UB (2) If x was fully undef, the branch again raised UB (3) If x was non-zero partially undef, say `undef \| 1`, `freeze x` will return a nondeterministic value which is also non-zero. (4) If x was just a concrete value, it is trivial Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D75808	2020-09-10 08:07:38 +09:00
Mircea Trofin	4b15fc9ddb	[NFC][MLInliner] Don't initialize in an assert. Since the build bots have assertions enabled, this flew under the radar.	2020-09-09 09:56:07 -07:00
Juneyoung Lee	25ce1e0497	[ValueTracking] Add UndefOrPoison/Poison-only version of relevant functions This patch adds isGuaranteedNotToBePoison and programUndefinedIfUndefOrPoison. isGuaranteedNotToBePoison will be used at D75808. The latter function is used at isGuaranteedNotToBePoison. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D84242	2020-09-09 20:00:26 +09:00
Max Kazantsev	795e4ee9d2	[NFC] Move functon from IndVarSimplify to SCEV This function can be reused in other places. Differential Revision: https://reviews.llvm.org/D87274 Reviewed By: fhahn, lebedev.ri	2020-09-09 11:20:59 +07:00
Krzysztof Parzyszek	055d209589	Handle masked loads and stores in MemoryLocation/Dependence Differential Revision: https://reviews.llvm.org/D87061	2020-09-08 19:08:44 -05:00
Nikita Popov	8453fbf088	[ValueTracking] Compute known bits of min/max intrinsics Implement known bits for the min/max intrinsics based on the recently added KnownBits primitives.	2020-09-08 21:08:17 +02:00
Nikita Popov	e97f3b1b43	[InstCombine] Fold abs of known negative operand If we know that the abs operand is known negative, we can replace it with a neg. To avoid computing known bits twice, I've removed the fold for the non-negative case from InstSimplify. Both the non-negative and the negative case are handled by InstCombine now, with one known bits call. Differential Revision: https://reviews.llvm.org/D87196	2020-09-08 20:14:35 +02:00
Jay Foad	5350e1b509	[KnownBits] Implement accurate unsigned and signed max and min Use the new implementation in ValueTracking, SelectionDAG and GlobalISel. Differential Revision: https://reviews.llvm.org/D87034	2020-09-07 09:09:01 +01:00
dongAxis	1fd7dc4074	When dumping results of StackLifetime, it will print the following log: BB [7, 8): begin {}, end {}, livein {}, liveout {} BB [1, 2): begin {}, end {}, livein {}, liveout {} ... But it is not convenient to know what the basic block is. So I add the basic block name to it. Reviewed By: vitalybuka TestPlan: check-llvm Differential Revision: https://reviews.llvm.org/D87152	2020-09-07 11:43:16 +08:00
Nikita Popov	b536cbaac5	[ValueTracking] Avoid known bits fallback for non-zero get check (NFCI) The known bits fall back will never be able to infer a non-null value here, so don't bother.	2020-09-06 23:16:38 +02:00
Nikita Popov	ff218cbc84	[InstSimplify] Fold degenerate abs of abs form This addresses the remaining issue from D87188. Due to a series of folds, we may end up with abs-of-abs represented as x == 0 ? -abs(x) : abs(x). Rather than recognizing this as a special abs pattern and doing an abs-of-abs fold on it afterwards, I'm directly folding this to one of the select operands in InstSimplify. The general pattern falls into the "select with operand replaced" category, but that fold is not powerful enough to recognize that both hands of the select are the same for value zero. Differential Revision: https://reviews.llvm.org/D87197	2020-09-06 09:43:08 +02:00
Florian Hahn	1ddb3a369f	[LangRef] Adjust guarantee for llvm.memcpy to also allow equal arguments. This adjusts the description of `llvm.memcpy` to also allow operands to be equal. This is in line with what Clang currently expects. This change is intended to be temporary and followed by re-introduce a variant with the non-overlapping guarantee for cases where we can actually ensure that property in the front-end. See the links below for more details: http://lists.llvm.org/pipermail/cfe-dev/2020-August/066614.html and PR11763. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D86815	2020-09-05 19:18:23 +01:00
Nikita Popov	ac87480bd8	[SCEV] Recognize min/max intrinsics Recognize umin/umax/smin/smax intrinsics and convert them to the already existing SCEV nodes of the same name. In the future we'll want SCEVExpander to also produce the intrinsics, but we're not ready for that yet. Differential Revision: https://reviews.llvm.org/D87160	2020-09-05 16:30:11 +02:00
Nikita Popov	73104b0751	[InstSimplify] Fold min/max based on dominating condition If we have a dominating condition that x >= y, then umax(x, y) is x, etc. I'm doing this in InstSimplify as the corresponding transform for the select form is also done there. Differential Revision: https://reviews.llvm.org/D87168	2020-09-05 16:16:40 +02:00
Arthur Eubanks	c9771391ce	[NewPM][Lint] Port -lint to NewPM This also changes -lint from an analysis to a pass. It's similar to -verify, and that is a normal pass, and lives in llvm/IR. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D87057	2020-09-03 13:03:44 -07:00
Arthur Eubanks	e440b4933a	Revert "[NewPM][Lint] Port -lint to NewPM" This reverts commit `883399c840`.	2020-09-02 21:34:29 -07:00
Arthur Eubanks	883399c840	[NewPM][Lint] Port -lint to NewPM This also changes -lint from an analysis to a pass. It's similar to -verify, and that is a normal pass, and lives in llvm/IR. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D87057	2020-09-02 21:13:01 -07:00
Craig Topper	b16e8687ab	[CodeGenPrepare][X86] Teach optimizeGatherScatterInst to turn a splat pointer into GEP with scalar base and 0 index This helps SelectionDAGBuilder recognize the splat can be used as a uniform base. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D86371	2020-09-02 20:44:12 -07:00
Eli Friedman	96ef6998df	[InstCombine] Fix a couple crashes with extractelement on a scalable vector. Differential Revision: https://reviews.llvm.org/D86989	2020-09-02 18:02:07 -07:00
Jordan Rupprecht	c90f15d25a	[NFC] Fix unused var in release build	2020-09-01 13:05:56 -07:00
Florian Hahn	0d966ae4b2	[Loads] Add canReplacePointersIfEqual helper. This patch adds an initial, incomeplete and unsound implementation of canReplacePointersIfEqual to check if a pointer value A can be replaced by another pointer value B, that are deemed to be equivalent through some means (e.g. information from conditions). Note that is in general not sound to blindly replace pointers based on equality, for example if they are based on different underlying objects. LLVM's memory model is not completely settled as of now; see https://bugs.llvm.org/show_bug.cgi?id=34548 for a more detailed discussion. The initial version of canReplacePointersIfEqual only rejects a very specific case: replacing a pointer with a constant expression that is not dereferenceable. Such a replacement is problematic and can be restricted relatively easily without impacting most code. Using it to limit replacements in GVN/SCCP/CVP only results in small differences in 7 programs out of MultiSource/SPEC2000/SPEC2006 on X86 with -O3 -flto. This patch is supposed to be an initial step to improve the current situation and the helper should be made stricter in the future. But this will require careful analysis of the impact on performance. Reviewed By: aqjune Differential Revision: https://reviews.llvm.org/D85524	2020-09-01 20:57:41 +01:00
Alina Sbirlea	c292fba46f	[MemorySSA] Update phi map with replacement value.	2020-09-01 11:56:40 -07:00
Alina Sbirlea	63844c116a	[MemorySSA] Clean up single value phis. MemoryPhis with a single value are correct, but can lead to errors when updating. Clean up single entry Phis newly added when cloning blocks. Resolves PR46574.	2020-08-31 19:26:08 -07:00
Nikita Popov	88b310f64b	[InstSimplify] Reduce code duplication in simplifySelectWithICmpCond (NFC) Canonicalize icmp ne to icmp eq and implement all the folds only once.	2020-08-29 22:38:49 +02:00
Nikita Popov	a5be86fde5	[InstSimplify] Protect against more poison in SimplifyWithOpReplaced (PR47322) Replace the check for poison-producing instructions in SimplifyWithOpReplaced() with the generic helper canCreatePoison() that properly handles poisonous shifts and thus avoids the problem from PR47322. This additionally fixes a bug in IIQ.UseInstrInfo=false mode, which previously could have caused this code to ignore poison flags. Setting UseInstrInfo=false should reduce the possible optimizations, not increase them. This is not a full solution to the problem, as poison could be introduced more indirectly. This is just a minimal, easy to backport fix. Differential Revision: https://reviews.llvm.org/D86834	2020-08-29 21:59:39 +02:00
Nikita Popov	a400a61721	[LVI] Remove unnecessary lambda capture (NFC)	2020-08-29 21:33:19 +02:00
Nikita Popov	6d88f6efd4	Reapply [LVI] Normalize pointer behavior This got reverted because a dependency was reverted. It has since been reapplied, so reapply this as well. ----- Related to D69686. As noted there, LVI currently behaves differently for integer and pointer values: For integers, the block value is always valid inside the basic block, while for pointers it is only valid at the end of the basic block. I believe the integer behavior is the correct one, and CVP relies on it via its getConstantRange() uses. The reason for the special pointer behavior is that LVI checks whether a pointer is dereferenced in a given basic block and marks it as non-null in that case. Of course, this information is valid only after the dereferencing instruction, or in conservative approximation, at the end of the block. This patch changes the treatment of dereferencability: Instead of including it inside the block value, we instead treat it as something similar to an assume (it essentially is a non-nullness assume) and incorporate this information in intersectAssumeOrGuardBlockValueConstantRange() if the context instruction is the terminator of the basic block. This happens either when determining an edge-value internally in LVI, or when a terminator was explicitly passed to getValueAt(). The latter case makes this more powerful than the previous implementation as a side-effect, and this does actually seem benefitial in practice. Of course, we do not want to recompute dereferencability on each intersectAssume call, so we need a new cache for this. The dereferencability analysis requires walking the entire basic block and computing underlying objects of all memory operands. This was previously done separately for each queried pointer value. In the new implementation (both because this makes the caching simpler, and because it is faster), I instead only walk the full BB once and cache all the dereferenced pointers. So the traversal is now performed only once per BB, instead of once per queried pointer value. I think the overall model now makes more sense than before, and there will be no more pitfalls due to differing integer/pointer behavior. Differential Revision: https://reviews.llvm.org/D69914	2020-08-29 21:17:03 +02:00
Roman Lebedev	c1b3e32118	[NFC][InstructionSimplify] Add a warning about not simplifying to not def-reachable See https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20200824/824235.html and https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20200824/824967.html InstSimply is not allowed to perform simplifications to instructions that are not def-reachable from the original instruction.	2020-08-29 09:58:08 +03:00
Owen Anderson	ed90f15efb	Revert "[InstSimplify][EarlyCSE] Try to CSE PHI nodes in the same basic block" This reverts commit `6102310d81`. It appears to cause compilation non-determinism and caused stage3 mismatches.	2020-08-28 23:43:42 +00:00
serge-sans-paille	2296182181	Skip analysis re-computation when no changes are reported This is a follow-up to https://reviews.llvm.org/D80707, generalized to CallGraphSCC, Loop and Region Differential Revision: https://reviews.llvm.org/D86442	2020-08-28 21:41:01 +02:00
David Sherwood	f4257c5832	[SVE] Make ElementCount members private This patch changes ElementCount so that the Min and Scalable members are now private and can only be accessed via the get functions getKnownMinValue() and isScalable(). In addition I've added some other member functions for more commonly used operations. Hopefully this makes the class more useful and will reduce the need for calling getKnownMinValue(). Differential Revision: https://reviews.llvm.org/D86065	2020-08-28 14:43:53 +01:00
Florian Hahn	fd6ebea50d	[MemLoc] Support memcmp in MemoryLocation::getForArgument. This patch adds support for memcmp in MemoryLocation::getForArgument. memcmp reads from the first 2 arguments up to the number of bytes of the third argument. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D86725	2020-08-28 10:19:54 +01:00
Martin Storsjö	db1ec04963	[ValueTracking] Remove a stray semicolon. NFC. This silences warnings when built with GCC at least.	2020-08-28 09:24:10 +03:00
serge-sans-paille	b1f4e5979b	(Expensive) Check for Loop, SCC and Region pass return status This generalizes the logic introduced in https://reviews.llvm.org/D80916 to other passes. It's needed by https://reviews.llvm.org/D86442 to assert passes correctly report their status. Differential Revision: https://reviews.llvm.org/D86589	2020-08-28 07:56:35 +02:00
Alina Sbirlea	d370836c20	[MemorySSA] Assert defining access is not a MemoryUse.	2020-08-27 18:21:10 -07:00
Vitaly Buka	23524fdece	[ValueTracking] Replace recursion with Worklist Now findAllocaForValue can handle nontrivial phi cycles.	2020-08-27 14:44:49 -07:00
Vitaly Buka	a40660551e	[StackSafety] Ignore allocas with partial lifetime markers Reviewed By: eugenis Differential Revision: https://reviews.llvm.org/D86672	2020-08-27 13:54:41 -07:00
Vitaly Buka	a6927c8621	[NFC][ValueTracking] Add OffsetZero into findAllocaForValue For StackLifetime after finding alloca we need to check that values ponting to the begining of alloca. Reviewed By: eugenis Differential Revision: https://reviews.llvm.org/D86692	2020-08-27 13:46:22 -07:00
Roman Lebedev	b85f91fdce	[InstSimplify] SimplifyPHINode(): check that instruction is in basic block first As pointed out in post-commit review, this can legally be called on instructions that are not inserted into basic blocks, so don't blindly assume that there is basic block.	2020-08-27 22:32:03 +03:00
Simon Moll	c48b06c44f	[sda][nfc] clang-formatting	2020-08-27 18:27:44 +02:00
Roman Lebedev	6102310d81	[InstSimplify][EarlyCSE] Try to CSE PHI nodes in the same basic block Apparently, we don't do this, neither in EarlyCSE, nor in InstSimplify, nor in (old) GVN, but do in NewGVN and SimplifyCFG of all places.. While i could teach EarlyCSE how to hash PHI nodes, we can't really do much (anything?) even if we find two identical PHI nodes in different basic blocks, same-BB case is the interesting one, and if we teach InstSimplify about it (which is what i wanted originally, https://reviews.llvm.org/D86530), we get EarlyCSE support for free. So i would think this is pretty uncontroversial. On vanilla llvm test-suite + RawSpeed, this has the following effects: ``` \| statistic name \| baseline \| proposed \| Δ \| % \| \\|%\\| \| \|----------------------------------------------------\|-----------\|-----------\|-------:\|---------:\|---------:\| \| instsimplify.NumPHICSE \| 0 \| 23779 \| 23779 \| 0.00% \| 0.00% \| \| asm-printer.EmittedInsts \| 7942328 \| 7942392 \| 64 \| 0.00% \| 0.00% \| \| assembler.ObjectBytes \| 273069192 \| 273084704 \| 15512 \| 0.01% \| 0.01% \| \| correlated-value-propagation.NumPhis \| 18412 \| 18539 \| 127 \| 0.69% \| 0.69% \| \| early-cse.NumCSE \| 2183283 \| 2183227 \| -56 \| 0.00% \| 0.00% \| \| early-cse.NumSimplify \| 550105 \| 542090 \| -8015 \| -1.46% \| 1.46% \| \| instcombine.NumAggregateReconstructionsSimplified \| 73 \| 4506 \| 4433 \| 6072.60% \| 6072.60% \| \| instcombine.NumCombined \| 3640264 \| 3664769 \| 24505 \| 0.67% \| 0.67% \| \| instcombine.NumDeadInst \| 1778193 \| 1783183 \| 4990 \| 0.28% \| 0.28% \| \| instcount.NumCallInst \| 1758401 \| 1758799 \| 398 \| 0.02% \| 0.02% \| \| instcount.NumInvokeInst \| 59478 \| 59502 \| 24 \| 0.04% \| 0.04% \| \| instcount.NumPHIInst \| 330557 \| 330533 \| -24 \| -0.01% \| 0.01% \| \| instcount.TotalInsts \| 8831952 \| 8832286 \| 334 \| 0.00% \| 0.00% \| \| simplifycfg.NumInvokes \| 4300 \| 4410 \| 110 \| 2.56% \| 2.56% \| \| simplifycfg.NumSimpl \| 1019808 \| 999607 \| -20201 \| -1.98% \| 1.98% \| ``` I.e. it fires ~24k times, causes +110 (+2.56%) more `invoke` -> `call` transforms, and counter-intuitively results in more instructions total. That being said, the PHI count doesn't decrease that much, and looking at some examples, it seems at least some of them were previously getting PHI CSE'd in SimplifyCFG of all places.. I'm adjusting `Instruction::isIdenticalToWhenDefined()` at the same time. As a comment in `InstCombinerImpl::visitPHINode()` already stated, there are no guarantees on the ordering of the operands of a PHI node, so if we just naively compare them, we may false-negatively say that the nodes are not equal when the only difference is operand order, which is especially important since the fold is in InstSimplify, so we can't rely on InstCombine sorting them beforehand. Fixing this for the general case is costly (geomean +0.02%), and does not appear to catch anything in test-suite, but for the same-BB case, it's trivial, so let's fix at least that. As per http://llvm-compile-time-tracker.com/compare.php?from=04879086b44348cad600a0a1ccbe1f7776cc3cf9&to=82bdedb888b945df1e9f130dd3ac4dd3c96e2925&stat=instructions this appears to cause geomean +0.03% compile time increase (regression), but geomean -0.01%..-0.04% code size decrease (improvement).	2020-08-27 18:47:04 +03:00
Vitaly Buka	469debe027	[ValueTracking] Support select in findAllocaForValue	2020-08-27 02:13:52 -07:00
Nikita Popov	d7c119d89c	[InstSimplify] Fold min/max intrinsic based on icmp of operands This is a reboot of D84655, now performing the inner icmp simplification query without undef folds. It should be possible to handle the current foldMinMaxSharedOp() fold based on this, by moving the logic into icmp of min/max instead, making it more general. We can't drop the folds for constant operands, because those also allow undef, which we exclude here. The tests use assumes for exhaustive coverage, and have a few more examples of misc folds we get based on icmp simplification. Differential Revision: https://reviews.llvm.org/D85929	2020-08-26 22:02:57 +02:00
Arthur Eubanks	098d3f9827	[InstSimplify] Simplify to vector constants when possible InstSimplify should do all transformations that ConstProp does, but one thing that ConstProp does that InstSimplify wouldn't is inline vector instructions that are constants, e.g. into a ret. Previously vector instructions wouldn't be inlined in InstSimplify because llvm::Simplify*Instruction() would return nullptr for specific instructions, such as vector instructions that were actually constants, if it couldn't simplify them. This changes SimplifyInsertElementInst, SimplifyExtractElementInst, and SimplifyShuffleVectorInst to return a vector constant when possible. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D85946	2020-08-26 11:40:36 -07:00
Juneyoung Lee	684b43c0cf	[IR] Add NoUndef attribute to Intrinsics.td This patch adds NoUndef to Intrinsics.td. The attribute is attached to llvm.assume's operand, because llvm.assume(undef) is UB. It is attached to pointer operands of several memory accessing intrinsics as well. This change makes ValueTracking::getGuaranteedNonPoisonOps' intrinsic check unnecessary, so it is removed. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D86576	2020-08-27 02:54:48 +09:00
Mircea Trofin	7cfcecece0	[MLInliner] Simplify TFUTILS_SUPPORTED_TYPES We only need the C++ type and the corresponding TF Enum. The other parameter was used for the output spec json file, but we can just standardize on the C++ type name there. Differential Revision: https://reviews.llvm.org/D86549	2020-08-25 14:19:39 -07:00
Juneyoung Lee	f753f5b050	[ValueTracking] Let getGuaranteedNonPoisonOp find multiple non-poison operands This patch helps getGuaranteedNonPoisonOp find multiple non-poison operands. Instead of special-casing llvm.assume, I think it is also a viable option to add noundef to Intrinsics.td. If it makes sense, I'll make a patch for that. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D86477	2020-08-26 04:40:21 +09:00
Nikita Popov	3a54b6a4b7	[MemDep] Use BatchAA when computing pointer dependencies We're not changing IR while running a single MemDep query, so it's safe to cache alias analysis results using BatchAA. This adds BatchAA usage to getSimplePointerDependencyFrom(), which is non-intrusive -- covering larger parts (like a whole processNonLocalLoad query) is also possible, but requires threading BatchAA through a bunch of APIs. For the ThinLTO configuration, this is a 1% geomean improvement on CTMark. Differential Revision: https://reviews.llvm.org/D85583	2020-08-25 21:34:34 +02:00
Ta-Wei Tu	abbd652dd6	[LoopNest] False negative of `arePerfectlyNested` with LCSSA loops Summary: The LCSSA pass (required for all loop passes) sometimes adds additional blocks containing LCSSA variables, and checkLoopsStructure may return false even when the loops are perfectly nested in this case. This is because the successor of the exit block of the inner loop now points to the LCSSA block instead of the latch block of the outer loop. Examples are shown in the test nests-with-lcssa.ll. To fix the issue, the successor of the exit block of the inner loop can now point to a block in which all instructions are LCSSA phi node (except the terminator), and the sole successor of that block should point to the latch block of the outer loop. Reviewed By: Whitney, etiotto Differential Revision: https://reviews.llvm.org/D86133	2020-08-25 16:20:52 +00:00
Mircea Trofin	8c63df2416	[MLInliner] Support training that doesn't require partial rewards If we use training algorithms that don't need partial rewards, we don't need to worry about an ir2native model. In that case, training logs won't contain a 'delta_size' feature either (since that's the partial reward). Differential Revision: https://reviews.llvm.org/D86481	2020-08-24 17:36:29 -07:00
Sam Parker	2e194fe73b	[SCEV] Still trying to fix windows buildbots	2020-08-24 10:26:48 +01:00
Alina Sbirlea	f55ad3973d	[DomTree] Extend update API to allow a post CFG view. Extend the `applyUpdates` in DominatorTree to allow a post CFG view, different from the current CFG. This patch implements the functionality of updating an already up to date DT, to the desired PostCFGView. Combining a set of updates towards an up to date DT and a PostCFGView is not yet supported. Differential Revision: https://reviews.llvm.org/D85472	2020-08-21 17:23:08 -07:00
Arthur Eubanks	b79889c2b1	[opt][NewPM] Add basic-aa in legacy PM compatibility mode The legacy PM alias analysis pipeline by default includes basic-aa. When running `opt -foo-pass` under the NPM and -disable-basic-aa is not specified, use basic-aa. This decreases the number of check-llvm failures under NPM from 913 to 752. Reviewed By: ychen, asbirlea Differential Revision: https://reviews.llvm.org/D86167	2020-08-21 14:05:07 -07:00
Roman Lebedev	5d7c5a5e99	[NFC] Port InstCount pass to new pass manager	2020-08-21 12:39:42 +03:00
Yevgeny Rouban	18bc400f97	[NewPM][PassInstrumentation] Add PreservedAnalyses parameter to AfterPass* callbacks Both AfterPass and AfterPassInvalidated pass instrumentation callbacks get additional parameter of type PreservedAnalyses. This patch was created by @fedor.sergeev. I have just slightly changed it. Reviewers: fedor.sergeev Differential Revision: https://reviews.llvm.org/D81555	2020-08-21 16:10:42 +07:00
David Green	2b69efded0	[ARM][LV] Add a preferPredicatedReductionSelect target hook As part of D84741, this adds a target hook for the preferPredicatedReductionSelect option and makes use of it under MVE, allowing us to tail predicate most reduction loops. Differential Revision: https://reviews.llvm.org/D85980	2020-08-21 08:48:12 +01:00
Sanjay Patel	6f3511a01a	[ValueTracking] define/use max recursion depth in header There's a potential motivating case to increase this limit in PR47191: http://bugs.llvm.org/PR47191 But first we should make it less hacky. The limit in InstCombine is directly tied to this value because an increase there can cause asserts in the underlying value tracking calls if not changed together. The usage in VectorUtils is independent, but the comment suggests that we should use the same value unless there's a known reason to diverge. There are similar limits in codegen analysis, but I think we should leave those independent in case we intentionally want the optimization power/cost to be different there. Differential Revision: https://reviews.llvm.org/D86113	2020-08-19 16:56:59 -04:00
Mehdi Amini	a407ec9b6d	Revert "Revert "[NFC][llvm] Make the contructors of `ElementCount` private."" Was reverted because MLIR/Flang builds were broken, these APIs have been fixed in the meantime.	2020-08-19 17:26:36 +00:00
Mehdi Amini	4fc56d70aa	Revert "[NFC][llvm] Make the contructors of `ElementCount` private." This reverts commit `264afb9e6a`. (and dependent `6b742cc48` and `fc53bd610f`) MLIR/Flang are broken.	2020-08-19 17:21:37 +00:00
Francesco Petrogalli	264afb9e6a	[NFC][llvm] Make the contructors of `ElementCount` private. Differential Revision: https://reviews.llvm.org/D86120	2020-08-19 16:26:44 +00:00
Mircea Trofin	62fc44ca3c	[MLInliner] In development mode, obtain the output specs from a file Different training algorithms may produce models that, besides the main policy output (i.e. inline/don't inline), produce additional outputs that are necessary for the next training stage. To facilitate this, in development mode, we require the training policy infrastructure produce a description of the outputs that are interesting to it, in the form of a JSON file. We special-case the first entry in the JSON file as the inlining decision - we care about its value, so we can guide inlining during training - but treat the rest as opaque data that we just copy over to the training log. Differential Revision: https://reviews.llvm.org/D85674	2020-08-17 16:56:47 -07:00
Tyker	a79e604462	[AssumeBundles] Fix Bug in Assume Queries this bug was causing miscompile. now clang cant properly selfhost with -mllvm --enable-knowledge-retention Reviewed By: jdoerfert, lebedev.ri Differential Revision: https://reviews.llvm.org/D83507	2020-08-17 21:36:53 +02:00
Dávid Bolvanský	0f14b2e6cb	Revert "[BPI] Improve static heuristics for integer comparisons" This reverts commit `50c743fa71`. Patch will be split to smaller ones.	2020-08-17 20:44:33 +02:00
Simon Pilgrim	c1f6ce0c73	[DemandedBits] Improve accuracy of Add propagator The current demand propagator for addition will mark all input bits at and right of the alive output bit as alive. But carry won't propagate beyond a bit for which both operands are zero (or one/zero in the case of subtraction) so a more accurate answer is possible given known bits. I derived a propagator by working through truth tables and using a bit-reversed addition to make demand ripple to the right, but I'm not sure how to make a convincing argument for its correctness in the comments yet. Nevertheless, here's a minimal implementation and test to get feedback. This would help in a situation where, for example, four bytes (<128) packed into an int are added with four others SIMD-style but only one of the four results is actually read. Known A: 0_______0_______0_______0_______ Known B: 0_______0_______0_______0_______ AOut: 00000000001000000000000000000000 AB, current: 00000000001111111111111111111111 AB, patch: 00000000001111111000000000000000 Committed on behalf of: @rrika (Erika) Differential Revision: https://reviews.llvm.org/D72423	2020-08-17 12:54:09 +01:00
Cullen Rhodes	2ccde3c96b	[InlineCost] Fix scalable vectors in visitAlloca Discovered as part of the VLS type work (see D85128). Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D85848	2020-08-17 10:34:27 +00:00
Vitaly Buka	3b348d9102	[NFC][StackSafety] Move out sort from the loop	2020-08-17 03:30:14 -07:00
Vitaly Buka	e10e7829bf	[StackSafety] Skip ambiguous lifetime analysis If we can't identify alloca used in lifetime marker we need to assume to worst case scenario. Reviewed By: eugenis Differential Revision: https://reviews.llvm.org/D84630	2020-08-16 18:05:52 -07:00
Vitaly Buka	47552a614a	[StackSafety] Change how callee searched in index Handle other than local linkage types.	2020-08-16 04:37:19 -07:00
Wenlei He	577e58bcc7	[InlineAdvisor] New inliner advisor to replay inlining from optimization remarks This change added a new inline advisor that takes optimization remarks from previous inlining as input, and provides the decision as advice so current inlining can replay inline decisions of a different compilation. Dwarf inline stack with line and discriminator is used as anchor for call sites including call context. The change can be useful for Inliner tuning as it provides a channel to allow external input for tweaking inline decisions. Existing alternatives like alwaysinline attribute is per-function, not per-callsite. Per-callsite inline intrinsic can be another solution (not yet existing), but it's intrusive to implement and also does not differentiate call context. A switch -sample-profile-inline-replay=<inline_remarks_file> is added to hook up the new inline advisor with SampleProfileLoader's inline decision for replay. Since SampleProfileLoader does top-down inlining, inline decision can be specialized for each call context, hence we should be able to replay inlining accurately. However with a bottom-up inliner like CGSCC inlining, the replay can be limited due to lack of specialization for different call context. Apart from that limitation, the new inline advisor can still be used by regular CGSCC inliner later if needed for tuning purpose. This is a resubmit of https://reviews.llvm.org/D83743	2020-08-15 20:17:21 -07:00
Vitaly Buka	fc4fd89852	[StackSafety] Use ValueInfo in ParamAccess::Call This avoid GUID lookup in Index.findSummaryInModule. Follow up for D81242. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D85269	2020-08-14 12:42:44 -07:00
Matt Morehouse	891b2be85d	Revert "[NFC][StackSafety] Move out sort from the loop" This reverts commit `0426e28419` due to ASan buildbot failure.	2020-08-14 08:17:35 -07:00
Vitaly Buka	4c30d4b4e5	[NFC][StackSafety] Change map key comparison	2020-08-14 04:23:15 -07:00
Vitaly Buka	0426e28419	[NFC][StackSafety] Move out sort from the loop	2020-08-14 04:19:10 -07:00
Vitaly Buka	798eb71c3a	[NFC][StackSafety] Dedup callees	2020-08-14 01:14:52 -07:00
Dávid Bolvanský	50c743fa71	[BPI] Improve static heuristics for integer comparisons Similarly as for pointers, even for integers a == b is usually false. GCC also uses this heuristic. Reviewed By: ebrevnov Differential Revision: https://reviews.llvm.org/D85781	2020-08-13 19:54:27 +02:00
Simon Pilgrim	63863451d1	Fix unused variable warning. NFC. Reduce the dyn_cast<> to a isa<> as that's all non-assert builds require, and move the cast<> inside the assert.	2020-08-13 15:43:20 +01:00
Dávid Bolvanský	f9264995a6	Revert "[BPI] Improve static heuristics for integer comparisons" This reverts commit `44587e2f7e`. Sanitizer tests need to be updated.	2020-08-13 14:37:40 +02:00
Dávid Bolvanský	44587e2f7e	[BPI] Improve static heuristics for integer comparisons Similarly as for pointers, even for integers a == b is usually false. GCC also uses this heuristic. Reviewed By: ebrevnov Differential Revision: https://reviews.llvm.org/D85781	2020-08-13 14:23:58 +02:00
Dávid Bolvanský	a0485421d2	Revert "[BPI] Improve static heuristics for integer comparisons" This reverts commit `385c9d673f`.	2020-08-13 12:59:15 +02:00
Dávid Bolvanský	385c9d673f	[BPI] Improve static heuristics for integer comparisons Similarly as for pointers, even for integers a == b is usually false. GCC also uses this heuristic. Reviewed By: ebrevnov Differential Revision: https://reviews.llvm.org/D85781	2020-08-13 12:45:40 +02:00
Ali Tamur	0581c0b0ee	Revert "[SCEV] Look through single value PHIs." This reverts commit `e441b7a7a0`. This patch causes a compile error in tensorflow opensource project. The stack trace looks like: Point of crash: llvm/include/llvm/Analysis/LoopInfoImpl.h : line 35 (gdb) ptype this type = const class llvm::LoopBase<llvm::BasicBlock, llvm::Loop> [with BlockT = llvm::BasicBlock, LoopT = llvm::Loop] (gdb) p this $1 = {ParentLoop = 0x0, SubLoops = std::vector of length 0, capacity 0, Blocks = std::vector of length 0, capacity 1, DenseBlockSet = {<llvm::SmallPtrSetImpl<llvm::BasicBlock const>> = {<llvm::SmallPtrSetImplBase> = {<llvm::DebugEpochBase> = {Epoch = 3}, SmallArray = 0x1b2bf6c8, CurArray = 0x1b2bf6c8, CurArraySize = 8, NumNonEmpty = 0, NumTombstones = 0}, <No data fields>}, SmallStorage = {0xfffffffffffffffe, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}}, IsInvalid = true} (gdb) p this->DenseBlockSet->CurArray $2 = (const void *) 0xfffffffffffffffe I will try to get a case from tensorflow or use creduce to get a small case.	2020-08-12 23:13:24 -07:00
Nikita Popov	eba5f5f798	[ValueTracking] Add abs intrinsics support to computeConstantRange() Implementation is the same as for SPF_ABS.	2020-08-12 22:28:46 +02:00
Nikita Popov	e2040d38a1	[ValueTracking] Support min/max intrinsics in computeConstantRange() The implementation is the same as for the SPF_* case.	2020-08-12 22:07:29 +02:00
Craig Topper	a7a06ded8b	Recommit "[InstSimplify] Remove select ?, undef, X -> X and select ?, X, undef -> X transforms" and its follow up patches This recommits the following patches now that D85684 has landed `1cf6f210a2` [IR] Disable select ? C : undef -> C fold in ConstantFoldSelectInstruction unless we know C isn't poison. `469da663f2` [InstSimplify] Re-enable select ?, undef, X -> X transform when X is provably not poison `122b0640fc` [InstSimplify] Don't fold vectors of partial undef in SimplifySelectInst if the non-undef element value might produce poison `ac0af12ed2` [InstSimplify] Add test cases for opportunities to fold select ?, X, undef -> X when we can prove X isn't poison `9b1e95329a` [InstSimplify] Remove select ?, undef, X -> X and select ?, X, undef -> X transforms	2020-08-12 10:45:27 -07:00
Florian Hahn	e441b7a7a0	[SCEV] Look through single value PHIs. Now that SCEVExpander can preserve LCSSA form, we do not have to worry about LCSSA form when trying to look through PHIs. SCEVExpander will take care of inserting LCSSA PHI nodes as required. This increases precision of the analysis in some cases. Reviewed By: mkazantsev, bmahjour Differential Revision: https://reviews.llvm.org/D71539	2020-08-12 10:03:42 +01:00
Nikita Popov	06d567059e	[InstSimplify] Respect CanUseUndef in more places Similar to what we do in IIQ, add an isUndefValue() helper that checks for undef values while respective CanUseUndef. This makes it much easier to search for places that don't respect the flag yet.	2020-08-11 21:53:33 +02:00
Dávid Bolvanský	d68a2859ab	[BPI] Teach BPI about bcmp function bcmp is similar to memcmp	2020-08-11 20:44:53 +02:00
Nikita Popov	d110d4aaff	[InstSimplify] Forbid undef folds in expandBinOp This is the replacement for D84250 based on D84792. As we recursively fold with the same value twice, we need to disable undef folds, to prevent an undef from being folded to two different values. Reverting rG00f3579aea6e3d4a4b7464c3db47294f71cef9e4 and using the test case from https://reviews.llvm.org/D83360#2145793, it no longer performs the incorrect fold. Differential Revision: https://reviews.llvm.org/D85684	2020-08-11 18:39:24 +02:00
Sanjay Patel	1470ce4a76	[InstSimplify] fold min/max with matching min/max operands I think this is the last remaining translation of an existing instcombine transform for the corresponding cmp+sel idiom. This interpretation is more general though - we can remove mismatched signed/unsigned combinations in addition to the more obvious cases. min/max(X, Y) must produce X or Y as the result, so this is just another clause in the existing transform that was already matching a min/max of min/max.	2020-08-11 11:23:15 -04:00
Florian Hahn	3483c28c5b	[SCEV] ] If RHS >= Start, simplify (Start smax RHS) to RHS for trip counts. This is the max version of D85046. This change causes binary changes in 44 out of 237 benchmarks (out of MultiSource/SPEC2000/SPEC2006) Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D85189	2020-08-11 13:20:24 +01:00
Juneyoung Lee	63b5b92bc9	[LazyValueInfo] Let getEdgeValueLocal look into freeze instructions This patch makes getEdgeValueLocal more precise when a freeze instruction is given, by adding support for freeze into constantFoldUser Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D84629	2020-08-11 16:39:34 +09:00
Thomas Lively	514445e035	[WebAssembly][ConstantFolding] Fold fp-to-int truncation intrinsics Constant fold both the trapping and saturating versions of the WebAssembly truncation intrinsics. The tests are adapted from the WebAssembly spec tests for the corresponding instructions. Requested in PR46982. Differential Revision: https://reviews.llvm.org/D85392	2020-08-10 12:40:05 -07:00
Mircea Trofin	211117b660	[NFC][MLInliner] remove curly braces for a few sinle-line loops	2020-08-10 09:32:21 -07:00
Mircea Trofin	d5c81be3ca	[NFC][MLInliner] Set up the logger outside the development mode advisor This allows us to subsequently configure the logger for the case when we use a model evaluator and want to log additional outputs. Differential Revision: https://reviews.llvm.org/D85577	2020-08-10 09:22:17 -07:00
Vitaly Buka	1970eefb17	[NFC][StackSafety] Add a couple of early returns	2020-08-09 23:42:09 -07:00
Vitaly Buka	8d91ce8f58	[NFC][StackSafety] Count dataflow inputs	2020-08-09 23:32:41 -07:00
Vitaly Buka	dee812a297	[StackSafety] Fix union which produces wrapped sets	2020-08-09 23:20:17 -07:00
Vitaly Buka	a6feeb1c6b	[NFC][StackSafety] Avoid assert in getBaseObjec	2020-08-09 23:20:17 -07:00
Vitaly Buka	3a34228bff	[StackSafety] Don't keep FullSet in index Optimization. Missing record is enterpreted as FullSet anyway.	2020-08-09 15:01:46 -07:00
Florian Hahn	d236e1c7b6	[InstSimplify/NewGVN] Add option to control the use of undef. Making use of undef is not safe if the simplification result is not used to replace all uses of the result. This leads to problems in NewGVN, which does not replace all uses in the IR directly. See PR33165 for more details. This patch adds an option to SimplifyQuery to disable the use of undef. Note that I've only guarded uses if isa<UndefValue>/m_Undef where SimplifyQuery is currently available. If we agree on the general direction, I'll update the remaining uses. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D84792	2020-08-09 19:16:56 +01:00
Vitaly Buka	648228bcc3	[NFC][StackSafety] Fix statistics	2020-08-07 16:18:52 -07:00
Mircea Trofin	64372d93bc	[NFC][MLInliner] Refactor logging implementation This prepares it for logging externally-specified outputs. Differential Revision: https://reviews.llvm.org/D85451	2020-08-07 14:56:56 -07:00
Vitaly Buka	7547508b7a	Revert "[StackSafety] Skip ambiguous lifetime analysis" This reverts commit `0b2616a804`. Crashes with safe-stack.	2020-08-07 14:02:50 -07:00
Vitaly Buka	7d4996033b	[StackSafety,NFC] Add Stats counters	2020-08-07 14:02:50 -07:00
Vitaly Buka	7fb9de2c6f	[StackSafety,NFC] Fix tests in debug	2020-08-06 20:46:39 -07:00
Vitaly Buka	58b95c9b2b	[StackSafety,NFC] Add debug counters	2020-08-06 19:24:02 -07:00
Vitaly Buka	92dcf12b2f	[StackSafety,NFC] Use CHECK-EMPTY in tests	2020-08-06 19:19:51 -07:00
Vitaly Buka	0b2616a804	[StackSafety] Skip ambiguous lifetime analysis If we can't identify alloca used in lifetime marker we need to assume to worst case scenario. Reviewed By: eugenis Differential Revision: https://reviews.llvm.org/D84630	2020-08-06 19:10:33 -07:00
Vitaly Buka	5c6d9b2bbf	[LTO,NFC] Skip generateParamAccessSummary when empty addGlobalValueSummary can check newly added FunctionSummary and set HasParamAccess to mark that generateParamAccessSummary is needed. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D85182	2020-08-06 19:01:19 -07:00
Jessica Paquette	c8a282bcf7	[GlobalISel] Fix computing known bits for loads with range metadata In GlobalISel, if you have a load into a small type with a range, you'll hit an assert if you try to compute known bits on it starting at a larger type. e.g. ``` %x:_(s8) = G_LOAD %whatever(p0) :: (load 1 ... !range !n) ... %y:_(s32) = G_SOMETHING %x ``` When we walk through G_SOMETHING and hit the load, the width of our known bits is 32. However, the width of the range is going to be 8. This will cause us to hit an assert. To fix this, make computeKnownBitsFromRangeMetadata zero extend or truncate the range type to match the bitwidth of the known bits we're calculating. Add a testcase in CodeGen/GlobalISel/KnownBitsTest.cpp to reflect that this works now. https://reviews.llvm.org/D85375	2020-08-06 16:47:07 -07:00
Sanjay Patel	250a167c41	[InstSimplify] avoid crashing by trying to rem-by-zero Bug was noted in the post-commit comments for: rGe8760bb9a8a3	2020-08-06 16:06:31 -04:00
Mircea Trofin	ca7973cf18	[NFC]{MLInliner] Point out the tests' model dependencies	2020-08-06 09:57:26 -07:00
Mircea Trofin	87fb7aa137	[llvm][MLInliner] Don't log 'mandatory' events We don't want mandatory events in the training log. We do want to handle them, to keep the native size accounting accurate, but that's all. Fixed the code, also expanded the test to capture this. Differential Revision: https://reviews.llvm.org/D85373	2020-08-06 09:04:15 -07:00
David Green	745bf6cf44	[LoopVectorizer] Inloop vector reductions Arm MVE has multiple instructions such as VMLAVA.s8, which (in this case) can take two 128bit vectors, sign extend the inputs to i32, multiplying them together and sum the result into a 32bit general purpose register. So taking 16 i8's as inputs, they can multiply and accumulate the result into a single i32 without any rounding/truncating along the way. There are also reduction instructions for plain integer add and min/max, and operations that sum into a pair of 32bit registers together treated as a 64bit integer (even though MVE does not have a plain 64bit addition instruction). So giving the vectorizer the ability to use these instructions both enables us to vectorize at higher bitwidths, and to vectorize things we previously could not. In order to do that we need a way to represent that the reduction operation, specified with a llvm.experimental.vector.reduce when vectorizing for Arm, occurs inside the loop not after it like most reductions. This patch attempts to do that, teaching the vectorizer about in-loop reductions. It does this through a vplan recipe representing the reductions that the original chain of reduction operations is replaced by. Cost modelling is currently just done through a prefersInloopReduction TTI hook (which follows in a later patch). Differential Revision: https://reviews.llvm.org/D75069	2020-08-06 10:10:50 +01:00
Sanjay Patel	e8760bb9a8	[InstSimplify] fold icmp with mul nsw and constant operands https://rise4fun.com/Alive/slvl Name: mul nsw with icmp eq Pre: (C2 % C1) != 0 %a = mul nsw i8 %x, C1 %r = icmp eq i8 %a, C2 => %r = false Name: mul nsw with icmp ne Pre: (C2 % C1) != 0 %a = mul nsw i8 %x, C1 %r = icmp ne i8 %a, C2 => %r = true Follow-up to the 'nuw' variation added with: rGf879c9b79621	2020-08-05 14:38:39 -04:00
Sanjay Patel	f879c9b796	[InstSimplify] fold icmp with mul nuw and constant operands https://rise4fun.com/Alive/pZEr Name: mul nuw with icmp eq Pre: (C2 %u C1) != 0 %a = mul nuw i8 %x, C1 %r = icmp eq i8 %a, C2 => %r = false Name: mul nuw with icmp ne Pre: (C2 %u C1) != 0 %a = mul nuw i8 %x, C1 %r = icmp ne i8 %a, C2 => %r = true There are potentially several other transforms we need to add based on: D51625 ...but it doesn't look like there was follow-up to that patch.	2020-08-05 14:32:17 -04:00
Jordan Rupprecht	3c39db0c44	Revert "[LoopVectorizer] Inloop vector reductions" This reverts commit `e9761688e4`. It breaks the build: ``` ~/src/llvm-project/llvm/lib/Analysis/IVDescriptors.cpp:868:10: error: no viable conversion from returned value of type 'SmallVector<[...], 8>' to function return type 'SmallVector<[...], 4>' return ReductionOperations; ```	2020-08-05 10:24:15 -07:00
Mircea Trofin	b18c41c66f	[TFUtils] Expose untyped accessor to evaluation result tensors These were implementation detail, but become necessary for generic data copying. Also added const variations to them, and move assignment, since we had a move ctor (and the move assignment helps in a subsequent patch). Differential Revision: https://reviews.llvm.org/D85262	2020-08-05 10:22:45 -07:00
David Green	e9761688e4	[LoopVectorizer] Inloop vector reductions Arm MVE has multiple instructions such as VMLAVA.s8, which (in this case) can take two 128bit vectors, sign extend the inputs to i32, multiplying them together and sum the result into a 32bit general purpose register. So taking 16 i8's as inputs, they can multiply and accumulate the result into a single i32 without any rounding/truncating along the way. There are also reduction instructions for plain integer add and min/max, and operations that sum into a pair of 32bit registers together treated as a 64bit integer (even though MVE does not have a plain 64bit addition instruction). So giving the vectorizer the ability to use these instructions both enables us to vectorize at higher bitwidths, and to vectorize things we previously could not. In order to do that we need a way to represent that the reduction operation, specified with a llvm.experimental.vector.reduce when vectorizing for Arm, occurs inside the loop not after it like most reductions. This patch attempts to do that, teaching the vectorizer about in-loop reductions. It does this through a vplan recipe representing the reductions that the original chain of reduction operations is replaced by. Cost modelling is currently just done through a prefersInloopReduction TTI hook (which follows in a later patch). Differential Revision: https://reviews.llvm.org/D75069	2020-08-05 18:14:05 +01:00
Sanjay Patel	bd2c88b253	[InstSimplify] reduce code duplication in simplifyICmpWithMinMax(); NFC	2020-08-05 11:39:28 -04:00
Evgeniy Brevnov	02a629daad	[BPI][NFC] Unify handling of normal and SCC based loops This is one more NFC part extracted from D79485. Normal and SCC based loops have very different representation and have to be handled separatly each time we deal with loops. D79485 is going to introduce much more extensive use of loops what will be problematic with out this change. Reviewed By: davidxl Differential Revision: https://reviews.llvm.org/D84838	2020-08-05 11:19:24 +07:00
Mircea Trofin	90b9c49ca6	[llvm] Expose type and element count-related APIs on TensorSpec Added a mechanism to check the element type, get the total element count, and the size of an element. Differential Revision: https://reviews.llvm.org/D85250	2020-08-04 17:32:16 -07:00
Mircea Trofin	65b6dbf939	[llvm][NFC] Moved implementation of TrainingLogger outside of its decl Also renamed a method - printTensor - to print; and added comments.	2020-08-04 14:35:35 -07:00
Xavier Denis	29fe3fe615	[InstSimplify] Peephole optimization for icmp (urem X, Y), X This revision adds the following peephole optimization and it's negation: %a = urem i64 %x, %y %b = icmp ule i64 %a, %x ====> %b = true With John Regehr's help this optimization was checked with Alive2 which suggests it should be valid. This pattern occurs in the bound checks of Rust code, the program const N: usize = 3; const T = u8; pub fn split_mutiple(slice: &[T]) -> (&[T], &[T]) { let len = slice.len() / N; slice.split_at(len * N) } the method call slice.split_at will check that len * N is within the bounds of slice, this bounds check is after some transformations turned into the urem seen above and then LLVM fails to optimize it any further. Adding this optimization would cause this bounds check to be fully optimized away. ref: https://github.com/rust-lang/rust/issues/74938 Differential Revision: https://reviews.llvm.org/D85092	2020-08-04 20:48:37 +02:00
Sanjay Patel	a16882047a	[InstSimplify] refactor min/max folds with shared operand; NFC	2020-08-04 12:21:05 -04:00
Sanjay Patel	04e45ae1c6	[InstSimplify] fold nested min/max intrinsics with constant operands This is based on the existing code for the non-intrinsic idioms in InstCombine. The vector constant constraint is non-obvious: undefs should be ok in the outer call, but they can't propagate safely from the inner call in all cases. Example: https://alive2.llvm.org/ce/z/-2bVbM define <2 x i8> @src(<2 x i8> %x) { %0: %m = umin <2 x i8> %x, { 7, undef } %m2 = umin <2 x i8> { 9, 9 }, %m ret <2 x i8> %m2 } => define <2 x i8> @tgt(<2 x i8> %x) { %0: %m = umin <2 x i8> %x, { 7, undef } ret <2 x i8> %m } Transformation doesn't verify! ERROR: Value mismatch Example: <2 x i8> %x = < undef, undef > Source: <2 x i8> %m = < #x00 (0) [based on undef value], #x00 (0) > <2 x i8> %m2 = < #x00 (0), #x00 (0) > Target: <2 x i8> %m = < #x07 (7), #x10 (16) > Source value: < #x00 (0), #x00 (0) > Target value: < #x07 (7), #x10 (16) >	2020-08-04 08:44:48 -04:00
Sanjay Patel	20c71e55aa	[InstSimplify] reduce code for min/max analysis; NFC This should probably be moved up to some common area eventually when there's another user.	2020-08-04 08:02:33 -04:00
David Green	3c7e7d40a9	[BasicAA] Enable -basic-aa-recphi by default This option was added a while back, to help improve AA around pointer phi loops. It looks for phi(gep(phi, const), x) loops, checking if x can then prove more precise aliasing info. Differential Revision: https://reviews.llvm.org/D82998	2020-08-04 10:43:42 +01:00
Alina Sbirlea	1ce82015f6	[MemorySSA] Restrict optimizations after a PhiTranslation. Merging alias results from different paths, when a path did phi translation is not necesarily correct. Conservatively terminate such paths. Aimed to fix PR46156. Differential Revision: https://reviews.llvm.org/D84905	2020-08-03 14:46:41 -07:00
Sanjay Patel	9e5cf6bde5	[InstSimplify] fold variations of max-of-min with common operand https://alive2.llvm.org/ce/z/ZtxpZ3	2020-08-03 15:02:46 -04:00
Mircea Trofin	4b1b109c51	[llvm] Add a parser from JSON to TensorSpec A JSON->TensorSpec utility we will use subsequently to specify additional outputs needed for certain training scenarios. Differential Revision: https://reviews.llvm.org/D84976	2020-08-03 09:49:31 -07:00
Florian Hahn	ee1c12708a	[SCEV] If Start>=RHS, simplify (Start smin RHS) = RHS for trip counts. In some cases, it seems like we can get rid of unnecessary s/umins by using information from the loop guards (unless I am missing something). One place where this seems to be helpful in practice is when computing loop trip counts. This patch just changes howManyGreaterThans for now. Note that this requires a loop for which we can check 'is guarded'. On SPEC2000/SPEC2006/MultiSource, there are some notable changes for some programs in the number of loops unrolled and trip counts computed. ``` Same hash: 179 (filtered out) Remaining: 58 Metric: scalar-evolution.NumTripCountsComputed Program base patch diff test-suite...langs-C/compiler/compiler.test 25.00 31.00 24.0% test-suite.../Applications/SPASS/SPASS.test 2020.00 2323.00 15.0% test-suite...langs-C/allroots/allroots.test 29.00 32.00 10.3% test-suite.../Prolangs-C/loader/loader.test 17.00 18.00 5.9% test-suite...fice-ispell/office-ispell.test 253.00 265.00 4.7% test-suite...006/450.soplex/450.soplex.test 3552.00 3692.00 3.9% test-suite...chmarks/MallocBench/gs/gs.test 453.00 470.00 3.8% test-suite...ngs-C/assembler/assembler.test 29.00 30.00 3.4% test-suite.../Benchmarks/Ptrdist/bc/bc.test 263.00 270.00 2.7% test-suite...rks/FreeBench/pifft/pifft.test 722.00 741.00 2.6% test-suite...count/automotive-bitcount.test 41.00 42.00 2.4% test-suite...0/253.perlbmk/253.perlbmk.test 1417.00 1451.00 2.4% test-suite...000/197.parser/197.parser.test 387.00 396.00 2.3% test-suite...lications/sqlite3/sqlite3.test 1168.00 1189.00 1.8% test-suite...000/255.vortex/255.vortex.test 173.00 176.00 1.7% Metric: loop-unroll.NumUnrolled Program base patch diff test-suite...langs-C/compiler/compiler.test 1.00 3.00 200.0% test-suite.../Applications/SPASS/SPASS.test 134.00 234.00 74.6% test-suite...count/automotive-bitcount.test 3.00 4.00 33.3% test-suite.../Prolangs-C/loader/loader.test 3.00 4.00 33.3% test-suite...langs-C/allroots/allroots.test 3.00 4.00 33.3% test-suite...Source/Benchmarks/sim/sim.test 10.00 12.00 20.0% test-suite...fice-ispell/office-ispell.test 21.00 25.00 19.0% test-suite.../Benchmarks/Ptrdist/bc/bc.test 32.00 38.00 18.8% test-suite...006/450.soplex/450.soplex.test 300.00 352.00 17.3% test-suite...rks/FreeBench/pifft/pifft.test 60.00 69.00 15.0% test-suite...chmarks/MallocBench/gs/gs.test 57.00 63.00 10.5% test-suite...ngs-C/assembler/assembler.test 10.00 11.00 10.0% test-suite...0/253.perlbmk/253.perlbmk.test 145.00 157.00 8.3% test-suite...000/197.parser/197.parser.test 43.00 46.00 7.0% test-suite...TimberWolfMC/timberwolfmc.test 205.00 214.00 4.4% Geomean difference 7.6% ``` Fixes https://bugs.llvm.org/show_bug.cgi?id=46939 Fixes https://bugs.llvm.org/show_bug.cgi?id=46924 on X86. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D85046	2020-08-03 17:22:42 +01:00
Vitaly Buka	08cf49658c	[StackSafety, NFC] Don't insert empty objects into the map Result should be the same but it makes generateParamAccessSummary 5x faster.	2020-08-02 13:58:56 -07:00
Sanjay Patel	4abc69c6f5	[InstSimplify] fold max (max X, Y), X --> max X, Y https://alive2.llvm.org/ce/z/VGgG3M	2020-08-02 11:50:58 -04:00
Nikita Popov	a0addbb4ec	[InstSimplify] Reduce code duplication in icmp of binop folds (NFC) For folds where we check for the binop on both the LHS and RHS, extract a function that expects it on the LHS and call it with swapped order.	2020-08-02 15:47:18 +02:00
Kazu Hirata	60434989e5	Use llvm::is_contained where appropriate (NFC) Use llvm::is_contained where appropriate (NFC) Reviewed By: kazu Differential Revision: https://reviews.llvm.org/D85083	2020-08-01 21:51:06 -07:00
Craig Topper	85b5315dbe	[InstSimplify] Fold abs(abs(x)) -> abs(x) It's always safe to pick the earlier abs regardless of the nsw flag. We'll just lose it if it is on the outer abs but not the inner abs. Differential Revision: https://reviews.llvm.org/D85053	2020-08-01 13:25:00 -07:00
Sanjay Patel	04b99a4d18	[InstSimplify] simplify abs if operand is known non-negative abs() should be rare enough that using value tracking is not going to be a compile-time cost burden, so use it to reduce a variety of potential patterns. We do this in DAGCombiner too. Differential Revision: https://reviews.llvm.org/D85043	2020-08-01 07:47:06 -04:00
Craig Topper	86dea1f39b	[ValueTracking] Improve llvm.abs handling in computeKnownBits. Add the optimizations we have in the SelectionDAG version. Known non-negative copies all known bits. Any known one other than the sign bit makes result non-negative. Differential Revision: https://reviews.llvm.org/D85000	2020-07-31 15:55:03 -07:00
Sanjay Patel	e591713bff	[ConstantFolding] fold abs intrinsic The handling for minimum value is similar to cttz/ctlz with 0 just above this case. Differential Revision: https://reviews.llvm.org/D84942	2020-07-31 14:08:44 -04:00
Craig Topper	0e0aebc527	[ValueTracking] Add ComputeNumSignBits support for llvm.abs intrinsic If absolute value needs turn a negative number into a positive number it reduces the number of sign bits by at most 1. Differential Revision: https://reviews.llvm.org/D84971	2020-07-31 10:59:12 -07:00
Vitaly Buka	b0eb40ca39	[NFC] Remove unused GetUnderlyingObject paramenter Depends on D84617. Differential Revision: https://reviews.llvm.org/D84621	2020-07-31 02:10:03 -07:00
Vitaly Buka	89051ebace	[NFC] GetUnderlyingObject -> getUnderlyingObject I am going to touch them in the next patch anyway	2020-07-30 21:08:24 -07:00
Vitaly Buka	b256cb88a7	[ValueTracking] Remove AllocaForValue parameter findAllocaForValue uses AllocaForValue to cache resolved values. The function is used only to resolve arguments of lifetime intrinsic which usually are not fare for allocas. So result reuse is likely unnoticeable. In followup patches I'd like to replace the function with GetUnderlyingObjects. Depends on D84616. Differential Revision: https://reviews.llvm.org/D84617	2020-07-30 18:48:34 -07:00
Vitaly Buka	61cab352e3	[NFC] Move findAllocaForValue into ValueTracking.h Differential Revision: https://reviews.llvm.org/D84616	2020-07-30 18:22:59 -07:00
Craig Topper	24f5235d93	[ValueTracking] Add basic computeKnownBits support for llvm.abs intrinsic This includes basic support for computeKnownBits on abs. I've left FIXMEs for more complicated things we could do. Differential Revision: https://reviews.llvm.org/D84963	2020-07-30 16:26:54 -07:00
Florian Hahn	2062b3707c	[LAA] Avoid adding pointers to the checks if they are not needed. Currently we skip alias sets with only reads or a single write and no reads, but still add the pointers to the list of pointers in RtCheck. This can lead to cases where we try to access a pointer that does not exist when grouping checks. In most cases, the way we access PositionMap masked that, as the value would default to index 0. But in the example in PR46854 it causes a crash. This patch updates the logic to avoid adding pointers for alias sets that do not need any checks. It makes things slightly more verbose, by first checking the numbers of reads/writes and bailing out early if we don't need checks for the alias set. I think this makes the logic a bit simpler to follow. Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D84608	2020-07-30 19:21:14 +01:00
Yuanfang Chen	555cf42f38	[NewPM][PassInstrument] Add PrintPass callback to StandardInstrumentations Problem: Right now, our "Running pass" is not accurate when passes are wrapped in adaptor because adaptor is never skipped and a pass could be skipped. The other problem is that "Running pass" for a adaptor is before any "Running pass" of passes/analyses it depends on. (for example, FunctionToLoopPassAdaptor). So the order of printing is not the actual order. Solution: Doing things like PassManager::Debuglogging is very intrusive because we need to specify Debuglogging whenever adaptor is created. (Actually, right now we're not specifying Debuglogging for some sub-PassManagers. Check PassBuilder) This patch move debug logging for pass as a PassInstrument callback. We could be sure that all running passes are logged and in the correct order. This could also be used to implement hierarchy pass logging in legacy PM. We could also move logging of pass manager to this if we want. The test fixes looks messy. It includes changes: - Remove PassInstrumentationAnalysis - Remove PassAdaptor - If a PassAdaptor is for a real pass, the pass is added - Pass reorder (to the correct order), related to PassAdaptor - Add missing passes (due to Debuglogging not passed down) Reviewed By: asbirlea, aeubanks Differential Revision: https://reviews.llvm.org/D84774	2020-07-30 10:07:57 -07:00
Mircea Trofin	71059257bd	[llvm][NFC] TensorSpec abstraction for ML evaluator Further abstracting the specification of a tensor, to more easily support different types and shapes of tensor, and also to perform initialization up-front, at TFModelEvaluator construction time. Differential Revision: https://reviews.llvm.org/D84685	2020-07-29 16:29:21 -07:00
Sanjay Patel	fef513f5cc	[InstSimplify] fold min/max intrinsic with undef operand	2020-07-29 17:03:50 -04:00
Sanjay Patel	5cd695dd7f	[InstSimplify] fold min/max with opposite of limit value	2020-07-29 17:03:50 -04:00
Nikita Popov	897bdca4b8	[ConstantRange] Add API for intrinsics (NFC) This adds a common API for compute constant ranges of intrinsics. The intention here is that a) we can reuse the same code across different passes that handle constant ranges, i.e. this can be reused in SCCP b) we only have to add knowledge about supported intrinsics to ConstantRange, not any consumers. Differential Revision: https://reviews.llvm.org/D84587	2020-07-29 22:16:27 +02:00
Craig Topper	3efc978bae	[LV] Add abs/smin/smax/umin/umax intrinsics to isTriviallyVectorizable This patch adds support for vectorizing these intrinsics. Differential Revision: https://reviews.llvm.org/D84796	2020-07-29 10:23:07 -07:00
Sanjay Patel	ee9617e96b	[InstSimplify] try constant folding intrinsics before general simplifications This matches the behavior of simplify calls for regular opcodes - rely on ConstantFolding before spending time on folds with variables. I am not aware of any diffs from this re-ordering currently, but there was potential for unintended behavior from the min/max intrinsics because that code is implicitly assuming that only 1 of the input operands is constant.	2020-07-29 13:18:40 -04:00
Sanjay Patel	3e8534fbc6	[InstSimplify] allow partial undef constants for vector min/max folds	2020-07-29 11:53:41 -04:00
Sanjay Patel	3c20ede18b	[InstSimplify] fold integer min/max intrinsic with same args	2020-07-29 11:53:41 -04:00
Sanjay Patel	9ee7d7122c	[ConstantFolding] fold integer min/max intrinsics If both operands are undef, return undef. If one operand is undef, clamp to limit constant.	2020-07-29 11:01:13 -04:00
David Green	60280e9818	[Analysis] TTI: Add CastContextHint for getCastInstrCost Currently, getCastInstrCost has limited information about the cast it's rating, often just the opcode and types. Sometimes there is a context instruction as well, but it isn't trustworthy: for instance, when the vectorizer is rating a plan, it calls getCastInstrCost with the old instructions when, in fact, it's trying to evaluate the cost of the instruction post-vectorization. Thus, the current system can get the cost of certain casts incorrect as the correct cost can vary greatly based on the context in which it's used. For example, if the vectorizer queries getCastInstrCost to evaluate the cost of a sext(load) with tail predication enabled, getCastInstrCost will think it's free most of the time, but it's not always free. On ARM MVE, a VLD2 group cannot be extended like a normal VLDR can. Similar situations can come up with how masked loads can be extended when being split. To fix that, this path adds a new parameter to getCastInstrCost to give it a hint about the context of the cast. It adds a CastContextHint enum which contains the type of the load/store being created by the vectorizer - one for each of the types it can produce. Original patch by Pierre van Houtryve Differential Revision: https://reviews.llvm.org/D79162	2020-07-29 13:32:53 +01:00
Sanjay Patel	3fb13b8484	[InstSimplify] allow undefs in icmp with vector constant folds This is the main icmp simplification shortcoming seen in D84655. Alive2 agrees that the basic examples are correct at least: define <2 x i1> @src(<2 x i8> %x) { %0: %r = icmp sle <2 x i8> { undef, 128 }, %x ret <2 x i1> %r } => define <2 x i1> @tgt(<2 x i8> %x) { %0: ret <2 x i1> { 1, 1 } } Transformation seems to be correct! define <2 x i1> @src(<2 x i32> %X) { %0: %A = or <2 x i32> %X, { 63, 63 } %B = icmp ult <2 x i32> %A, { undef, 50 } ret <2 x i1> %B } => define <2 x i1> @tgt(<2 x i32> %X) { %0: ret <2 x i1> { 0, 0 } } Transformation seems to be correct! https://alive2.llvm.org/ce/z/omt2ee https://alive2.llvm.org/ce/z/GW4nP_ Differential Revision: https://reviews.llvm.org/D84762	2020-07-28 15:13:53 -04:00
Evgeniy Brevnov	412b3932c6	[BPI] Fix memory leak reported by sanitizer bots There is a silly mistake where release() is used instead of reset() for free resources of unique pointer. Reviewed By: ebrevnov Differential Revision: https://reviews.llvm.org/D84747	2020-07-28 19:53:46 +07:00
Evgeniy Brevnov	3a2b05f9fe	[BPI][NFC] Consolidate code to deal with SCCs under a dedicated data structure. In order to facilitate review of D79485 here is a small NFC change which restructures code around handling of SCCs in BPI. Reviewed By: davidxl Differential Revision: https://reviews.llvm.org/D84514	2020-07-28 17:42:33 +07:00
Alina Sbirlea	f1d4db4f0c	[GraphDiff] Use class method getChildren instead of GraphTraits. Summary: Use getChildren() method in GraphDiff instead of GraphTraits. This simplifies the code and allows for refactorigns inside GraphDiff. All usecase need not have a light-weight/copyable range. Clean GraphTraits implementation. Reviewers: dblaikie Subscribers: hiraditya, llvm-commits, george.burgess.iv Tags: #llvm Differential Revision: https://reviews.llvm.org/D84562	2020-07-27 16:12:34 -07:00
Kazu Hirata	902cbcd59e	Use llvm::is_contained where appropriate (NFC) Summary: This patch replaces std::find with llvm::is_contained where appropriate. Reviewers: efriedma, nhaehnle Reviewed By: nhaehnle Subscribers: arsenm, jvesely, nhaehnle, hiraditya, rogfer01, kerbowa, llvm-commits, vkmr Tags: #llvm Differential Revision: https://reviews.llvm.org/D84489	2020-07-27 10:20:44 -07:00
Sergey Dmitriev	bec77ece14	[CallGraph] Preserve call records vector when replacing call edge Summary: Try not to resize vector of call records in a call graph node when replacing call edge. That would prevent invalidation of iterators stored in the CG SCC pass manager's scc_iterator. Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D84295	2020-07-27 06:02:55 -07:00
Sanjay Patel	0481e1ae3c	[InstSimplify] fold integer min/max intrinsics with limit constant	2020-07-26 09:41:54 -04:00
Sanjay Patel	b89ae102e6	[InstSimplify] fold fcmp using isKnownNeverInfinity + isKnownNeverNaN Follow-up to D84035 / rG7393d7574c09. This sidesteps a question of FMF/poison on fcmp raised in PR46077: http://bugs.llvm.org/PR46077 https://alive2.llvm.org/ce/z/TCsyzD define i1 @src(float %x) { %0: %x42 = fadd nnan ninf float %x, 42.000000 %r = fcmp ueq float %x42, inf ret i1 %r } => define i1 @tgt(float %x) { %0: ret i1 0 } Transformation seems to be correct! https://alive2.llvm.org/ce/z/FQaH7a define i1 @src(i8 %x) { %0: %cast = uitofp i8 %x to float %r = fcmp one float inf, %cast ret i1 %r } => define i1 @tgt(i8 %x) { %0: ret i1 1 } Transformation seems to be correct!	2020-07-26 09:04:37 -04:00
Juneyoung Lee	32088f4f7f	[ConstantFolding] Fold freeze if it is never undef or poison This is a simple patch that adds constant folding for freeze instruction. IIUC, it isn't needed to update ConstantFold.cpp because there is no freeze constexpr. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D84597	2020-07-26 21:54:44 +09:00
Juneyoung Lee	9f074214b7	[ValueTracking] Instruction::isBinaryOp should be used for constexprs This is a simple patch that makes canCreateUndefOrPoison use Instruction::isBinaryOp because BinaryOperator inherits Instruction. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D84596	2020-07-26 21:48:51 +09:00
Nikita Popov	bc79ed7e16	[LVI] Don't require operand number for range (NFC) Pass the Value* instead of the operand number, rename I to CxtI. This makes the function a bit more generally useful.	2020-07-25 16:33:45 +02:00
Johannes Doerfert	ce8928f2e4	[Mem2Reg] Teach promote to register about droppable instructions This is the first of two patches to address PR46753. We basically allow mem2reg to promote allocas that are used in doppable instructions, for now that means `llvm.assume`. The uses of the alloca (or a bitcast or zero offset GEP from there) are replaced by `undef` in the droppable instructions. Reviewed By: Tyker Differential Revision: https://reviews.llvm.org/D83976	2020-07-24 15:15:38 -05:00
Arthur Eubanks	9bb6ce78be	Rename scoped-noalias -> scoped-noalias-aa Summary: To match NewPM name. Also the new name is clearer and more consistent. Subscribers: jvesely, nhaehnle, hiraditya, asbirlea, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D84542	2020-07-24 12:14:27 -07:00
Florian Hahn	1c7c69c795	[ValueTracking] Check for ConstantExpr before using recursive helpers. Make sure we do not call constainsConstantExpression/containsUndefElement on ConstantExpression, which is not supported. In particular, containsUndefElement/constainsConstantExpression are only supported on constants which are supported by getAggregateElement. Unfortunately there's no convenient way to check if a constant supports getAggregateElement, so just check for non-constantexpressions with vector type. Other users of those functions do so too. Reviewers: spatel, nikic, craig.topper, lebedev.ri, jdoerfert, aqjune Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D84512	2020-07-24 17:37:09 +01:00
Simon Pilgrim	0128b9505c	Revert rG5dd566b7c7b78bd- "PassManager.h - remove unnecessary Function.h/Module.h includes. NFCI." This reverts commit `5dd566b7c7`. Causing some buildbot failures that I'm not seeing on MSVC builds.	2020-07-24 13:02:33 +01:00
Simon Pilgrim	5dd566b7c7	PassManager.h - remove unnecessary Function.h/Module.h includes. NFCI. PassManager.h is one of the top headers in the ClangBuildAnalyzer frontend worst offenders list. This exposes a large number of implicit dependencies on various forward declarations/includes in other headers that need addressing.	2020-07-24 12:40:50 +01:00
Eric Christopher	3ac828b8f7	Use llvm::size rather than an empty loop to get the number of top level loops.	2020-07-23 14:55:50 -07:00
Tarindu Jayatilaka	06283661b3	Add new function properties to FunctionPropertiesAnalysis Added LoadInstCount, StoreInstCount, MaxLoopDepth, LoopCount Reviewed By: jdoerfert, mtrofin Differential Revision: https://reviews.llvm.org/D82283	2020-07-23 12:46:47 -07:00
Tarindu Jayatilaka	ee6f0e109c	Add a Printer to the FunctionPropertiesAnalysis A printer pass and a lit test case was added. Reviewed By: mtrofin Differential Revision: https://reviews.llvm.org/D82523	2020-07-23 11:57:11 -07:00
Tarindu Jayatilaka	2f56046d7c	Refactor FunctionPropertiesAnalysis this separates `analyze` logic from `FunctionPropertiesAnalysis` Reviewed By: mtrofin Differential Revision: https://reviews.llvm.org/D82521	2020-07-23 11:49:10 -07:00
Simon Pilgrim	7eb213499e	RegionInfo.cpp - remove duplicate includes that already exist in RegionInfo.h. NFC. Also remove some unnecessary forward declarations in RegionInfo.h.	2020-07-23 17:50:22 +01:00
Sanjay Patel	7485e92412	[InstSimplify] reduce code duplication for binop expansion; NFC D84250 proposes to extend this code, so the duplication for the commuted case would continue to grow.	2020-07-23 08:35:21 -04:00
Christopher Tetreault	23c5e59d9f	[SVE] Remove calls to VectorType::getNumElements from Analysis Reviewers: efriedma, fpetrogalli, c-rhodes, asbirlea, RKSimon Reviewed By: RKSimon Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81504	2020-07-22 15:19:05 -07:00
Tarindu Jayatilaka	418121c30a	Reapply "Rename InlineFeatureAnalysis to FunctionPropertiesAnalysis" (This reverts commit `a5e0194709`, and corrects author). Rename the pass to be able to extend it to function properties other than inliner features. Reviewed By: mtrofin Differential Revision: https://reviews.llvm.org/D82044	2020-07-22 10:07:35 -07:00
Mircea Trofin	a5e0194709	Revert "Rename InlineFeatureAnalysis to FunctionPropertiesAnalysis" This reverts commit `44a6bda19b`. I forgot to correctly attibute it to tarinduj. Fixing and resubmitting.	2020-07-22 09:42:17 -07:00
Mircea Trofin	44a6bda19b	Rename InlineFeatureAnalysis to FunctionPropertiesAnalysis Rename the pass to be able to extend it to function properties other than inliner features. Reviewed By: mtrofin Differential Revision: https://reviews.llvm.org/D82044	2020-07-22 09:24:15 -07:00
Sebastian Neubauer	2a6c871596	[InstCombine] Move target-specific inst combining For a long time, the InstCombine pass handled target specific intrinsics. Having target specific code in general passes was noted as an area for improvement for a long time. D81728 moves most target specific code out of the InstCombine pass. Applying the target specific combinations in an extra pass would probably result in inferior optimizations compared to the current fixed-point iteration, therefore the InstCombine pass resorts to newly introduced functions in the TargetTransformInfo when it encounters unknown intrinsics. The patch should not have any effect on generated code (under the assumption that code never uses intrinsics from a foreign target). This introduces three new functions: TargetTransformInfo::instCombineIntrinsic TargetTransformInfo::simplifyDemandedUseBitsIntrinsic TargetTransformInfo::simplifyDemandedVectorEltsIntrinsic A few target specific parts are left in the InstCombine folder, where it makes sense to share code. The largest left-over part in InstCombineCalls.cpp is the code shared between arm and aarch64. This allows to move about 3000 lines out from InstCombine to the targets. Differential Revision: https://reviews.llvm.org/D81728	2020-07-22 15:59:49 +02:00
Max Kazantsev	b96114c1e1	[SCEV] Remove premature assert. PR46786 This assert was added to verify assumption that GEP's SCEV will be of pointer type, basing on fact that it should be a SCEVAddExpr with (at least) last operand being pointer. Two notes: - GEP's SCEV does not have to be a SCEVAddExpr after all simplifications; - In current state, GEP's SCEV does not have to have at least one pointer operands (all of them can become int during the transforms). However, we might want to be at a point where it is true. We are currently removing this assert and will try to enumerate the cases where "is pointer" notion might be lost during the transforms. When all of them are fixed, we can return it. Differential Revision: https://reviews.llvm.org/D84294 Reviewed By: lebedev.ri	2020-07-22 15:43:16 +07:00
Juneyoung Lee	ace0bf7490	[ValueTracking] Fix incorrect handling of canCreateUndefOrPoison .. in isGuaranteedNotToBeUndefOrPoison. This caused early exit of isGuaranteedNotToBeUndefOrPoison, making it return imprecise result. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D84251	2020-07-22 09:31:16 +09:00
Nico Weber	4fe912f186	Build: Move TF source file inclusion from build system to source files Outside of compiler-rt (where it's arguably an anti-pattern too), LLVM tries to keep its build files as simple as possible. See e.g. llvm/docs/SupportLibrary.rst, "Code Organization". Differential Revision: https://reviews.llvm.org/D84243	2020-07-21 13:02:34 -04:00
David Green	becaa6803a	[ARM] Constant fold VCTP intrinsics We can sometimes get into the situation where the operand to a vctp intrinsic becomes constant, such as after a loop is fully unrolled. This adds the constant folding needed for them, allowing them to simplify away and hopefully simplifying remaining instructions. Differential Revision: https://reviews.llvm.org/D84110	2020-07-21 11:39:31 +01:00
Nico Weber	e37b220442	[gn build] (manually) hack around `70f8d0ac8a`	2020-07-21 06:35:36 -04:00
Mircea Trofin	70f8d0ac8a	[llvm] Development-mode InlineAdvisor Summary: This is the InlineAdvisor used in 'development' mode. It enables two scenarios: - loading models via a command-line parameter, thus allowing for rapid training iteration, where models can be used for the next exploration phase without requiring recompiling the compiler. This trades off some compilation speed for the added flexibility. - collecting training logs, in the form of tensorflow.SequenceExample protobufs. We generate these as textual protobufs, which simplifies generation and testing. The protobufs may then be readily consumed by a tensorflow-based training algorithm. To speed up training, training logs may also be collected from the 'default' training policy. In that case, this InlineAdvisor does not use a model. RFC: http://lists.llvm.org/pipermail/llvm-dev/2020-April/140763.html Reviewers: jdoerfert, davidxl Subscribers: mgorny, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D83733	2020-07-20 11:01:56 -07:00
Matt Arsenault	5e999cbe8d	IR: Define byref parameter attribute This allows tracking the in-memory type of a pointer argument to a function for ABI purposes. This is essentially a stripped down version of byval to remove some of the stack-copy implications in its definition. This includes the base IR changes, and some tests for places where it should be treated similarly to byval. Codegen support will be in a future patch. My original attempt at solving some of these problems was to repurpose byval with a different address space from the stack. However, it is technically permitted for the callee to introduce a write to the argument, although nothing does this in reality. There is also talk of removing and replacing the byval attribute, so a new attribute would need to take its place anyway. This is intended avoid some optimization issues with the current handling of aggregate arguments, as well as fixes inflexibilty in how frontends can specify the kernel ABI. The most honest representation of the amdgpu_kernel convention is to expose all kernel arguments as loads from constant memory. Today, these are raw, SSA Argument values and codegen is responsible for turning these into loads. Background: There currently isn't a satisfactory way to represent how arguments for the amdgpu_kernel calling convention are passed. In reality, arguments are passed in a single, flat, constant memory buffer implicitly passed to the function. It is also illegal to call this function in the IR, and this is only ever invoked by a driver of some kind. It does not make sense to have a stack passed parameter in this context as is implied by byval. It is never valid to write to the kernel arguments, as this would corrupt the inputs seen by other dispatches of the kernel. These argumets are also not in the same address space as the stack, so a copy is needed to an alloca. From a source C-like language, the kernel parameters are invisible. Semantically, a copy is always required from the constant argument memory to a mutable variable. The current clang calling convention lowering emits raw values, including aggregates into the function argument list, since using byval would not make sense. This has some unfortunate consequences for the optimizer. In the aggregate case, we end up with an aggregate store to alloca, which both SROA and instcombine turn into a store of each aggregate field. The optimizer never pieces this back together to see that this is really just a copy from constant memory, so we end up stuck with expensive stack usage. This also means the backend dictates the alignment of arguments, and arbitrarily picks the LLVM IR ABI type alignment. By allowing an explicit alignment, frontends can make better decisions. For example, there's real no advantage to an aligment higher than 4, so a frontend could choose to compact the argument layout. Similarly, there is a high penalty to using an alignment lower than 4, so a frontend could opt into more padding for small arguments. Another design consideration is when it is appropriate to expose the fact that these arguments are all really passed in adjacent memory. Currently we have a late IR optimization pass in codegen to rewrite the kernel argument values into explicit loads to enable vectorization. In most programs, unrelated argument loads can be merged together. However, exposing this property directly from the frontend has some disadvantages. We still need a way to track the original argument sizes and alignments to report to the driver. I find using some side-channel, metadata mechanism to track this unappealing. If the kernel arguments were exposed as a single buffer to begin with, alias analysis would be unaware that the padding bits betewen arguments are meaningless. Another family of problems is there are still some gaps in replacing all of the available parameter attributes with metadata equivalents once lowered to loads. The immediate plan is to start using this new attribute to handle all aggregate argumets for kernels. Long term, it makes sense to migrate all kernel arguments, including scalars, to be passed indirectly in the same manner. Additional context is in D79744.	2020-07-20 10:23:09 -04:00
Juneyoung Lee	30201d3b61	[ValueTracking] Let isGuaranteedNotToBeUndefOrPoison use canCreateUndefOrPoison This patch adds support more operations. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D83926	2020-07-20 09:21:39 +09:00
Jameson Nash	8b354cc8db	[ConstantFolding] check applicability of AllOnes constant creation first The getAllOnesValue can only handle things that are bitcast from a ConstantInt, while here we bitcast through a pointer, so we may see more complex objects (like Array or Struct). Differential Revision: https://reviews.llvm.org/D83870	2020-07-19 13:13:57 -04:00
Juneyoung Lee	0a6aee5160	[ValueTracking] Add canCreateUndefOrPoison & let canCreatePoison use Operator This patch - adds `canCreateUndefOrPoison` - refactors `canCreatePoison` so it can deal with constantexprs `canCreateUndefOrPoison` will be used at D83926. Reviewed By: nikic, jdoerfert Differential Revision: https://reviews.llvm.org/D84007	2020-07-20 01:24:30 +09:00
Wenlei He	d41d952be9	Revert "[InlineAdvisor] New inliner advisor to replay inlining from optimization remarks" This reverts commit `2d6ecfa168`.	2020-07-19 08:49:04 -07:00
Wenlei He	2d6ecfa168	[InlineAdvisor] New inliner advisor to replay inlining from optimization remarks Summary: This change added a new inline advisor that takes optimization remarks from previous inlining as input, and provides the decision as advice so current inlining can replay inline decisions of a different compilation. Dwarf inline stack with line and discriminator is used as anchor for call sites including call context. The change can be useful for Inliner tuning as it provides a channel to allow external input for tweaking inline decisions. Existing alternatives like alwaysinline attribute is per-function, not per-callsite. Per-callsite inline intrinsic can be another solution (not yet existing), but it's intrusive to implement and also does not differentiate call context. A switch -sample-profile-inline-replay=<inline_remarks_file> is added to hook up the new inline advisor with SampleProfileLoader's inline decision for replay. Since SampleProfileLoader does top-down inlining, inline decision can be specialized for each call context, hence we should be able to replay inlining accurately. However with a bottom-up inliner like CGSCC inlining, the replay can be limited due to lack of specialization for different call context. Apart from that limitation, the new inline advisor can still be used by regular CGSCC inliner later if needed for tuning purpose. Subscribers: mgorny, aprantl, hiraditya, llvm-commits Tags: #llvm Resubmit for https://reviews.llvm.org/D84086	2020-07-19 08:21:05 -07:00
Sanjay Patel	7393d7574c	[InstSimplify] fold fcmp with infinity constant using isKnownNeverInfinity This is a step towards trying to remove unnecessary FP compares with infinity when compiling with -ffinite-math-only or similar. I'm intentionally not checking FMF on the fcmp itself because I'm assuming that will go away eventually. The analysis part of this was added with rGcd481136 for use with isKnownNeverNaN. Similarly, that could be an enhancement here to get predicates like 'one' and 'ueq'. Differential Revision: https://reviews.llvm.org/D84035	2020-07-19 09:24:52 -04:00
Gui Andrade	c42509413f	[LLVM] Add libatomic load/store functions to TargetLibraryInfo This allows treating these functions like libcalls. This patch is a prerequisite to instrumenting them in MSAN: https://reviews.llvm.org/D83337 Differential Revision: https://reviews.llvm.org/D83361	2020-07-18 03:18:48 +00:00
Eric Christopher	ae08dbc673	Temporarily Revert "[InlineAdvisor] New inliner advisor to replay inlining from optimization remarks" as it is failing the inline-replay.ll test as well as sanitizers/Werror from returning a stack local variable. This reverts commit `029946b112`.	2020-07-17 14:58:01 -07:00
Wenlei He	029946b112	[InlineAdvisor] New inliner advisor to replay inlining from optimization remarks Summary: This change added a new inline advisor that takes optimization remarks for previous inlining as input, and provide the decision as advice so current inlining can replay inline decision of a different compilation. Dwarf inline stack with line and discriminator is used as anchor for call sites. The change can be useful for Inliner tuning. A switch -sample-profile-inline-replay=<inline_remarks_file> is added to hook up the new inliner advisor with SampleProfileLoader's inline decision for replay. The new inline advisor can also be used by regular CGSCC inliner later if needed. Reviewers: davidxl, mtrofin, wmi, hoy Subscribers: aprantl, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D83743	2020-07-17 13:30:47 -07:00
Benjamin Kramer	9a0689e072	Make helpers static. NFC.	2020-07-17 13:49:11 +02:00
Juneyoung Lee	582901d0b5	[ValueTracking] Let isGuaranteedNotToBeUndefOrPoison consider noundef This patch adds support for noundef arguments. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D83752	2020-07-17 12:53:08 +09:00
Mircea Trofin	9870f77441	[llvm] Moved InlineSizeEstimatorAnalysis test to .ll Summary: Following guidance in https://llvm.org/docs/TestingGuide.html#testing-analysis Reviewers: mehdi_amini Subscribers: mgorny, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D83918	2020-07-16 12:25:16 -07:00
Eric Christopher	7bfaa40086	Temporarily Revert "[AssumeBundles] Use operand bundles to encode alignment assumptions" due to the performance bugs filed in https://bugs.llvm.org/show_bug.cgi?id=46753. An SROA change soon may obviate some of these problems. This reverts commit `8d09f20798`.	2020-07-16 11:54:04 -07:00
Arthur Eubanks	9adbb5cb3a	[SCEV] Fix ScalarEvolution tests under NPM Many tests use opt's -analyze feature, which does not translate well to NPM and has better alternatives. The alternative here is to explicitly add a pass that calls ScalarEvolution::print(). The legacy pass manager RUNs aren't changing, but they are now pinned to the legacy pass manager. For each legacy pass manager RUN, I added a corresponding NPM RUN using the 'print<scalar-evolution>' pass. For compatibility with update_analyze_test_checks.py and existing test CHECKs, 'print<scalar-evolution>' now prints what -analyze prints per function. This was generated by the following Python script and failures were manually fixed up: import sys for i in sys.argv: with open(i, 'r') as f: s = f.read() with open(i, 'w') as f: for l in s.splitlines(): if "RUN:" in l and ' -analyze ' in l and '\\' not in l: f.write(l.replace(' -analyze ', ' -analyze -enable-new-pm=0 ')) f.write('\n') f.write(l.replace(' -analyze ', ' -disable-output ').replace(' -scalar-evolution ', ' "-passes=print<scalar-evolution>" ').replace(" \| ", " 2>&1 \| ")) f.write('\n') else: f.write(l) There are a couple failures still in ScalarEvolution under NPM, but those are due to other unrelated naming conflicts. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D83798	2020-07-16 11:24:07 -07:00
Matt Arsenault	023883a834	IR: Rename Argument::hasPassPointeeByValueAttr to prepare for byref When the byref attribute is added, there will need to be two similar functions for the existing cases which have an associate value copy, and byref which does not. Most, but not all of the existing uses will use the existing version. The associated size function added by D82679 also needs to contextually differ, and will help eliminate a few places still relying on pointee element types.	2020-07-16 13:50:49 -04:00
Matt Arsenault	0347039a6e	ValueTracking: Fix isKnownNonZero for non-0 null pointers for byval The IR doesn't have a proper concept of invalid pointers, and "null" constants are just all zeros (though it really needs one). I think it's not possible to break this for AMDGPU due to the copy semantics of byval. If you have an original stack object at 0, the byval copy will be placed above it so I don't think it's really possible to hit a 0 address.	2020-07-16 13:50:49 -04:00
David Green	311fafd2c9	[BasicAA] Fix -basicaa-recphi for geps with negative offsets As shown in D82998, the basic-aa-recphi option can cause miscompiles for gep's with negative constants. The option checks for recursive phi, that recurse through a contant gep. If it finds one, it performs aliasing calculations using the other phi operands with an unknown size, to specify that an unknown number of elements after the initial value are potentially accessed. This works fine expect where the constant is negative, as the size is still considered to be positive. So this patch expands the check to make sure that the constant is also positive. Differential Revision: https://reviews.llvm.org/D83576	2020-07-16 17:22:40 +01:00
Craig Topper	00f3579aea	Revert "[InstSimplify] Remove select ?, undef, X -> X and select ?, X, undef -> X transforms" and subsequent patches This reverts most of the following patches due to reports of miscompiles. I've left the added test cases with comments updated to be FIXMEs. `1cf6f210a2` [IR] Disable select ? C : undef -> C fold in ConstantFoldSelectInstruction unless we know C isn't poison. `469da663f2` [InstSimplify] Re-enable select ?, undef, X -> X transform when X is provably not poison `122b0640fc` [InstSimplify] Don't fold vectors of partial undef in SimplifySelectInst if the non-undef element value might produce poison `ac0af12ed2` [InstSimplify] Add test cases for opportunities to fold select ?, X, undef -> X when we can prove X isn't poison `9b1e95329a` [InstSimplify] Remove select ?, undef, X -> X and select ?, X, undef -> X transforms	2020-07-15 22:02:33 -07:00
Mircea Trofin	4f763b2172	[llvm][NFC] Hide the tensorflow dependency from headers. Summary: This change avoids exposing tensorflow types when including TFUtils.h. They are just an implementation detail, and don't need to be used directly when implementing an analysis requiring ML model evaluation. The TFUtils APIs, while generically typed, are still not exposed unless the tensorflow C library is present, as they currently have no use otherwise. Reviewers: mehdi_amini, davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D83843	2020-07-14 21:14:11 -07:00
Johannes Doerfert	64d99a1d04	[CallGraph] Update callback call sites in RefreshCallGraph Since D82572, we keep "reference" edges for callback call sites. While not strictly necessary they can improve the traversal order. However, we did not update them properly in case a pass removed the callback call site which caused a verification error (PR46687). With this patch we update these reference edges properly during the invocation of `CallGraphSCCPass::RefreshCallGraph` in non-checking mode. Reviewed By: sdmitriev Differential Revision: https://reviews.llvm.org/D83718	2020-07-14 22:33:57 -05:00
Giorgis Georgakoudis	aef60af34e	[CallGraph] Ignore callback uses Summary: Ignore callback uses when adding a callback function in the CallGraph. Callback functions are typically created when outlining, e.g. for OpenMP, so they have internal scope and linkage. They should not be added to the ExternalCallingNode since they are only callable by the specified caller function at creation time. A CGSCC pass, such as OpenMPOpt, may need to update the CallGraph by adding a new outlined callback function. Without ignoring callback uses, adding breaks CGSCC pass restrictions and results to a broken CallGraph. Reviewers: jdoerfert Subscribers: hiraditya, sstefan1, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D83370	2020-07-14 13:08:49 -07:00
Tyker	16f777f421	[NFC] Add debug and stat counters to assume queries and assume builder Summary: Add debug counter and stats counter to assume queries and assume builder here is the collected stats on a build of check-llvm + check-clang. "assume-builder.NumAssumeBuilt": 2720879, "assume-builder.NumAssumesMerged": 761396, "assume-builder.NumAssumesRemoved": 1576212, "assume-builder.NumBundlesInAssumes": 6518809, "assume-queries.NumAssumeQueries": 85566380, "assume-queries.NumUsefullAssumeQueries": 2727360, the NumUsefullAssumeQueries stat is actually pessimistic because in a few places queries ask to keep providing information to try to get better information. and this isn't counted as a usefull query evem tho it can be usefull Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D83506	2020-07-14 21:49:14 +02:00
Logan Smith	a19461d9e1	[NFC] Add 'override' keyword where missing in include/ and lib/. This fixes warnings raised by Clang's new -Wsuggest-override, in preparation for enabling that warning in the LLVM build. This patch also removes the virtual keyword where redundant, but only in places where doing so improves consistency within a given file. It also removes a couple unnecessary virtual destructor declarations in derived classes where the destructor inherited from the base class is already virtual. Differential Revision: https://reviews.llvm.org/D83709	2020-07-14 09:47:29 -07:00
Sanjay Patel	e6c016420c	[ValueTracking] fix library to intrinsic mapping to respect 'nobuiltin' attribute This is another problem raised in: http://bugs.llvm.org/PR46627	2020-07-14 10:04:24 -04:00
Sanjay Patel	34d35d4a42	[ValueTracking] fix miscompile in maxnum case of cannotBeOrderedLessThanZeroImpl (PR46627) A miscompile with -0.0 is shown in: http://bugs.llvm.org/PR46627 This is because maxnum(-0.0, +0.0) does not specify a fixed result: http://llvm.org/docs/LangRef.html#llvm-maxnum-intrinsic So we need to tighten the constraints for when it is ok to say the result of maxnum is positive (including +0.0). Differential Revision: https://reviews.llvm.org/D83601	2020-07-14 08:08:09 -04:00
Jameson Nash	2c7a07b59d	[GVN] teach ConstantFolding correct handling of non-integral addrspace casts Here we teach the ConstantFolding analysis pass that it is not legal to replace a load of a bitcast constant (having a non-integral addrspace) with a bitcast of the value of that constant (with a different non-integral addrspace). But also teach it that certain bit patterns are always known and convertable (a fact it already uses elsewhere). This required us to also fix a globalopt test, since, after this change, LLVM is able to realize that the test actually is a valid transform (NULL is always a known bit-pattern) and so it doesn't need to emit the failure remarks for it. Also simplify some of the negative tests for transforms by avoiding a type change in their bitcast, and add positive versions of the same tests, to show that they otherwise should work. Differential Revision: https://reviews.llvm.org/D59730	2020-07-13 21:44:17 -04:00
Jameson Nash	19f01a4847	[GVN] add early exit to ConstantFoldLoadThroughBitcast [NFC] And adds some additional test coverage to ensure later commits don't introduce regressions. Differential Revision: https://reviews.llvm.org/D59730	2020-07-13 21:44:17 -04:00
Mircea Trofin	caf395ee8c	Reapply "[llvm] Native size estimator for training -Oz inliner" This reverts commit `9908a3b9f5`. The fix was to exclude the content of TFUtils.h (automatically included in the LLVM_Analysis module, when LLVM_ENABLE_MODULES is enabled). Differential Revision: https://reviews.llvm.org/D82817	2020-07-13 16:26:26 -07:00
Tyker	8d09f20798	[AssumeBundles] Use operand bundles to encode alignment assumptions Summary: NOTE: There is a mailing list discussion on this: http://lists.llvm.org/pipermail/llvm-dev/2019-December/137632.html Complemantary to the assumption outliner prototype in D71692, this patch shows how we could simplify the code emitted for an alignemnt assumption. The generated code is smaller, less fragile, and it makes it easier to recognize the additional use as a "assumption use". As mentioned in D71692 and on the mailing list, we could adopt this scheme, and similar schemes for other patterns, without adopting the assumption outlining. Reviewers: hfinkel, xbolva00, lebedev.ri, nikic, rjmccall, spatel, jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: thopre, yamauchi, kuter, fhahn, merge_guards_bot, hiraditya, bollu, rkruppe, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71739	2020-07-14 01:05:58 +02:00
Davide Italiano	9908a3b9f5	Revert "[llvm] Native size estimator for training -Oz inliner" This reverts commit `83080a294a` as it breaks the macOS modules build.	2020-07-13 13:13:36 -07:00
Mircea Trofin	11046ef69e	[llvm][NFC] Factored the default inlining advice This is in preparation for the 'development' mode advisor. We currently want to track what the default policy's decision would have been, this refactoring makes it easier to do that.	2020-07-13 12:20:35 -07:00
Mircea Trofin	acabaf600b	[llvm][NFC] ML Policies: changed the saved_model protobuf to text Also compacted the checkpoints (variables) to one file (plus the index). This reduces the binary model files to just the variables and their index. The index is very small. The variables are serialized float arrays. When updated through training, the changes are very likely unlocalized, so there's very little value in them being anything else than binary.	2020-07-13 11:07:07 -07:00
Mircea Trofin	83080a294a	[llvm] Native size estimator for training -Oz inliner Summary: This is an experimental ML-based native size estimator, necessary for computing partial rewards during -Oz inliner policy training. Data extraction for model training will be provided in a separate patch. RFC: http://lists.llvm.org/pipermail/llvm-dev/2020-April/140763.html Reviewers: davidxl, jdoerfert Subscribers: mgorny, hiraditya, mgrang, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82817	2020-07-13 10:13:56 -07:00
Teresa Johnson	3e5173dbc3	[BPI] Compile time improvement when erasing blocks (NFC) Summary: eraseBlock is trying to erase all probability info for the given BB. This info is stored in a DenseMap organized like so: using Edge = std::pair<const BasicBlock *, unsigned>; DenseMap<Edge, BranchProbability> Probs; where the unsigned in the Edge key is the successor id. It was walking through every single map entry, checking if the BB in the key's pair matched the given BB. Much more efficient is to do what another method (getEdgeProbability) was already doing, which is to walk the successors of the BB, and simply do a map lookup on the key formed from each <BB, successor id> pair. Doing this dropped the overall compile time for a file containing a very large function by around 32%. Reviewers: davidxl, xur Subscribers: llvm-commits, hiraditya Tags: #llvm Differential Revision: https://reviews.llvm.org/D83596	2020-07-10 16:55:54 -07:00
Sidharth Baveja	e541e1b757	[NFC] Separate Peeling Properties into its own struct (re-land after minor fix) Summary: This patch separates the peeling specific parameters from the UnrollingPreferences, and creates a new struct called PeelingPreferences. Functions which used the UnrollingPreferences struct for peeling have been updated to use the PeelingPreferences struct. Author: sidbav (Sidharth Baveja) Reviewers: Whitney (Whitney Tsang), Meinersbur (Michael Kruse), skatkov (Serguei Katkov), ashlykov (Arkady Shlykov), bogner (Justin Bogner), hfinkel (Hal Finkel), anhtuyen (Anh Tuyen Tran), nikic (Nikita Popov) Reviewed By: Meinersbur (Michael Kruse) Subscribers: fhahn (Florian Hahn), hiraditya (Aditya Kumar), llvm-commits, LLVM Tag: LLVM Differential Revision: https://reviews.llvm.org/D80580	2020-07-10 18:39:30 +00:00
Florian Hahn	ec00aa99dd	[DomTreeUpdater] Use const auto * when iterating over pointers (NFC). This silences the warning below: llvm-project/llvm/lib/Analysis/DomTreeUpdater.cpp:510:20: warning: loop variable 'BB' is always a copy because the range of type 'const SmallPtrSet<llvm::BasicBlock , 8>' does not return a reference [-Wrange-loop-analysis] for (const auto &BB : DeletedBBs) { ^ llvm-project/llvm/lib/Analysis/DomTreeUpdater.cpp:510:8: note: use non-reference type 'llvm::BasicBlock ' for (const auto &BB : DeletedBBs) { ^~~~~~~~~~~~~~~~ 1 warning generated.	2020-07-10 16:39:15 +01:00
David Green	e1135b486a	Revert "[BasicAA] Enable -basic-aa-recphi by default" This reverts commit `af839a9618`. Some issues appear to be being caused by this. Reverting whilst we investigate.	2020-07-10 13:43:54 +01:00
Simon Pilgrim	b69e0f674f	DomTreeUpdater::dump() - use const auto& iterator in for-range-loop. Avoids unnecessary copies and silences clang tidy warning.	2020-07-10 12:47:15 +01:00
Simon Pilgrim	9ce9831289	StackSafetyAnalysis.cpp - pass ConstantRange arg as const reference. Avoids unnecessary copies and silences clang tidy warning - we do this in most places, there are just a few that were missed.	2020-07-10 12:13:34 +01:00
Simon Pilgrim	9a3e8b11a8	extractConstantWithoutWrapping - use const APInt& returned by SCEVConstant::getAPInt() Avoids unnecessary APInt copies and silences clang tidy warning.	2020-07-10 10:24:29 +01:00
SharmaRithik	e71c7b593a	[CodeMoverUtils] Move OrderedInstructions to CodeMoverUtils Summary: This patch moves OrderedInstructions to CodeMoverUtils as It was the only place where OrderedInstructions is required. Authored By: RithikSharma Reviewer: Whitney, bmahjour, etiotto, fhahn, nikic Reviewed By: Whitney, nikic Subscribers: mgorny, hiraditya, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D80643	2020-07-10 11:22:43 +05:30
Wei Mi	e296e9dfd6	[NFC] Change getEntryForPercentile to be a static function in ProfileSummaryBuilder. Change file static function getEntryForPercentile to be a static member function in ProfileSummaryBuilder so it can be used by other files. Differential Revision: https://reviews.llvm.org/D83439	2020-07-09 16:38:19 -07:00
Roman Lebedev	c2a61ef388	Revert "[CallGraph] Ignore callback uses" This likely has broken test/Transforms/Attributor/IPConstantProp/ tests. http://45.33.8.238/linux/22502/step_12.txt This reverts commit `205dc0922d`.	2020-07-10 00:02:07 +03:00
Giorgis Georgakoudis	205dc0922d	[CallGraph] Ignore callback uses Summary: Ignore callback uses when adding a callback function in the CallGraph. Callback functions are typically created when outlining, e.g. for OpenMP, so they have internal scope and linkage. They should not be added to the ExternalCallingNode since they are only callable by the specified caller function at creation time. A CGSCC pass, such as OpenMPOpt, may need to update the CallGraph by adding a new outlined callback function. Without ignoring callback uses, adding breaks CGSCC pass restrictions and results to a broken CallGraph. Reviewers: jdoerfert Subscribers: hiraditya, sstefan1, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D83370	2020-07-09 13:13:46 -07:00
Craig Topper	469da663f2	[InstSimplify] Re-enable select ?, undef, X -> X transform when X is provably not poison Follow up from the transform being removed in D83360. If X is probably not poison, then the transform is safe. Still plan to remove or adjust the code from ConstantFolding after this. Differential Revision: https://reviews.llvm.org/D83440	2020-07-09 12:21:03 -07:00
Craig Topper	122b0640fc	[InstSimplify] Don't fold vectors of partial undef in SimplifySelectInst if the non-undef element value might produce poison We can't fold to the non-undef value unless we know it isn't poison. So check each element with isGuaranteedNotToBeUndefOrPoison. This currently rules out all constant expressions. Differential Revision: https://reviews.llvm.org/D83442	2020-07-09 11:01:12 -07:00
Florian Hahn	0b72b9d07f	[ValueLattice] Simplify canTrackGlobalVariableInterprocedurally (NFC). using all_of and checking for valid users in the lambda seems more straight forward. Also adds a comment explaining what we are checking.	2020-07-09 18:33:09 +01:00

... 4 5 6 7 8 ...

9960 Commits