llvm-project

Commit Graph

Author	SHA1	Message	Date
Arthur Eubanks	a5141b83f1	[LoopInfo][NewPM] Fix tests in Analysis/LoopInfo under NPM	2020-09-22 11:31:00 -07:00
Arthur Eubanks	9db0c572c1	[Delinearization][NewPM] Port delinearization to NPM Also make tests in Analysis/Delinearization work under NPM. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D87741	2020-09-21 17:59:08 -07:00
Arthur Eubanks	84a8ca1e6c	[NewPM] Pin -lazy-branch-prob and -lazy-block-freq tests to legacy PM NPM passes just use the normal versions of these analyses instead. Also pin any tests with -analyze to legacy PM. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D87857	2020-09-21 17:51:46 -07:00
Fangrui Song	8fdac7cb7a	Revert D71539 "Recommit "[SCEV] Look through single value PHIs."" This reverts commit `11dccf8d3a`. A bootstrapped clang crashes (due to ArrayRef::front called on an empty ArrayRef) when compiling some files. Very strangely, this only reproduces with modules. ``` 13 0x0000564d3349e968 llvm::ArrayRef<llvm::BasicBlock>::front() const /proc/self/cwd/llvm/include/llvm/ADT/ArrayRef.h:160:7 14 0x0000564d3349e896 llvm::LoopBase<llvm::BasicBlock, llvm::Loop>::getHeader() const /proc/self/cwd/llvm/include/llvm/Analysis/LoopInfo.h:104:50 15 0x0000564d3349fd9d llvm::LoopBase<llvm::BasicBlock, llvm::Loop>::getLoopLatch() const /proc/self/cwd/llvm/include/llvm/Analysis/LoopInfoImpl.h:210:11 16 0x0000564d33593c8a llvm::ScalarEvolution::computeBackedgeTakenCount(llvm::Loop const, bool) /proc/self/cwd/llvm/lib/Analysis/ScalarEvolution.cpp:6933:15 17 0x0000564d33592ebc llvm::ScalarEvolution::getBackedgeTakenInfo(llvm::Loop const) /proc/self/cwd/llvm/lib/Analysis/ScalarEvolution.cpp:0:30 18 0x0000564d33593a54 llvm::ScalarEvolution::getBackedgeTakenCount(llvm::Loop const, llvm::ScalarEvolution::ExitCountKind) /proc/self/cwd/llvm/lib/Analysis/ScalarEvolution.cpp:6487:36 19 0x0000564d32be2402 llvm::ScalarEvolution::getConstantMaxBackedgeTakenCount(llvm::Loop const) /proc/self/cwd/llvm/include/llvm/Analysis/ScalarEvolution.h:768:5 20 0x0000564d33590807 llvm::ScalarEvolution::getRangeRef(llvm::SCEV const, llvm::ScalarEvolution::RangeSignHint) /proc/self/cwd/llvm/lib/Analysis/ScalarEvolution.cpp:5495:19 21 0x0000564d320abab7 llvm::ScalarEvolution::getSignedRange(llvm::SCEV const) /proc/self/cwd/llvm/include/llvm/Analysis/ScalarEvolution.h:840:12 22 0x0000564d335a03aa llvm::ScalarEvolution::isKnownPredicateViaConstantRanges(llvm::CmpInst::Predicate, llvm::SCEV const, llvm::SCEV const) /proc/self/cwd/llvm/lib/Analysis/ScalarEvolution.cpp:9239:60 23 0x0000564d33586a80 llvm::ScalarEvolution::isKnownViaNonRecursiveReasoning(llvm::CmpInst::Predicate, llvm::SCEV const, 
llvm::SCEV const*) /proc/self/cwd/llvm/lib/Analysis/ScalarEvolution.cpp:10284:60 ```	2020-09-21 17:21:43 -07:00
Kazu Hirata	ca8321574d	Fix comment typos. NFC.	2020-09-21 16:12:56 -07:00
Roman Lebedev	64e2cb7e96	[SCEV] Recognize @llvm.uadd.sat as `%y + umin(%x, (-1 - %y))` ---------------------------------------- define i32 @src(i32 %x, i32 %y) { %0: %r = uadd_sat i32 %x, %y ret i32 %r } => define i32 @tgt(i32 %x, i32 %y) { %0: %t0 = sub nsw nuw i32 4294967295, %y %t1 = umin i32 %x, %t0 %r = add nuw i32 %t1, %y ret i32 %r } Transformation seems to be correct! The alternative, naive, lowering could be the following, although I don't think it's better, though it will likely be needed for sadd/ssub/*shl: ---------------------------------------- define i32 @src(i32 %x, i32 %y) { %0: %r = uadd_sat i32 %x, %y ret i32 %r } => define i32 @tgt(i32 %x, i32 %y) { %0: %t0 = zext i32 %x to i33 %t1 = zext i32 %y to i33 %t2 = add nuw i33 %t0, %t1 %t3 = zext i32 4294967295 to i33 %t4 = umin i33 %t2, %t3 %r = trunc i33 %t4 to i32 ret i32 %r } Transformation seems to be correct!	2020-09-21 20:25:54 +03:00
Roman Lebedev	fedc9549d5	[SCEV] Recognize @llvm.usub.sat as `%x - (umin %x, %y)` ---------------------------------------- define i32 @src(i32 %x, i32 %y) { %0: %r = usub_sat i32 %x, %y ret i32 %r } => define i32 @tgt(i32 %x, i32 %y) { %0: %t0 = umin i32 %x, %y %r = sub nuw i32 %x, %t0 ret i32 %r } Transformation seems to be correct!	2020-09-21 20:25:54 +03:00
Roman Lebedev	0592de550f	[NFC][SCEV] Add tests for @llvm.*.sat intrinsics	2020-09-21 20:25:53 +03:00
Roman Lebedev	1bb7ab8c4a	[SCEV] Recognize @llvm.abs as smax(x, -x) As per alive2 (ignoring undef): ---------------------------------------- define i32 @src(i32 %x, i1 %y) { %0: %r = abs i32 %x, 0 ret i32 %r } => define i32 @tgt(i32 %x, i1 %y) { %0: %neg_x = mul i32 %x, 4294967295 %r = smax i32 %x, %neg_x ret i32 %r } Transformation seems to be correct! ---------------------------------------- define i32 @src(i32 %x, i1 %y) { %0: %r = abs i32 %x, 1 ret i32 %r } => define i32 @tgt(i32 %x, i1 %y) { %0: %neg_x = mul nsw i32 %x, 4294967295 %r = smax i32 %x, %neg_x ret i32 %r } Transformation seems to be correct!	2020-09-21 20:25:53 +03:00
Roman Lebedev	83c2d10d3c	[NFC][SCEV] Add tests for @llvm.abs intrinsic	2020-09-21 20:25:53 +03:00
Florian Hahn	3cbdfe424f	[SCEV] Add additional max BTC tests with loop guards.	2020-09-21 17:41:24 +01:00
Simon Pilgrim	18a3ebcd30	[CostModel][X86] Add some select shuffle costs tests for D87884	2020-09-21 16:09:05 +01:00
Florian Hahn	11dccf8d3a	Recommit "[SCEV] Look through single value PHIs." This commit was originally reverted because it was suspected to cause a crash, but a reproducer did not surface. A crash that was exposed by this change was fixed in `1d8f2e5292`. This reverts the revert commit `0581c0b0ee`.	2020-09-21 11:59:50 +01:00
Florian Hahn	57ae9bb932	[LSR] Preserve MSSA when using SplitCriticalEdge. LSR claims to preserve MemorySSA, but we also have to make sure it is preserved when splitting critical edges. This can be done by passing MSSAU to SplitCriticalEdge. Fixes PR47557.	2020-09-21 09:51:26 +01:00
Dávid Bolvanský	fa33235df5	[BasicAA] Regenerate test checks	2020-09-19 19:36:10 +02:00
Dávid Bolvanský	d716f1608c	[MemLoc] Support bcmp in MemoryLocation::getForArgument Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D87964	2020-09-19 17:12:43 +02:00
Florian Hahn	9d172c8e9c	Recommit "[DSE] Switch to MemorySSA-backed DSE by default." This switches to using DSE + MemorySSA by default again, after fixing the issues reported after the first commit. Notable fixes `fc82006331`, `a0017c2bc2`. This reverts commit `3a59628f3c`.	2020-09-18 11:05:00 +01:00
Florian Hahn	a0017c2bc2	[MemorySSA] Be more conservative when traversing MemoryPhis. I think we need to be even more conservative when traversing memory phis, to make sure we catch any loop carried dependences. This approach updates fillInCurrentPair to use unknown sizes for locations when we walk over a phi, unless the location is guaranteed to be loop-invariant for any possible loop. Using an unknown size for locations should ensure we catch all memory accesses to locations after the given memory location, which includes loop-carried dependences. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D87778	2020-09-17 22:09:53 +01:00
Arthur Eubanks	179a22e807	[NewPM] Fix pr45927.ll under NPM	2020-09-17 13:57:55 -07:00
Florian Hahn	51973a607d	[SCEV] Add test cases for max BTC with loop guard info. This adds test cases for PR40961 and PR47247. They illustrate cases in which the max backedge-taken count can be improved by information from the loop guards.	2020-09-17 20:27:48 +01:00
Florian Hahn	9dc1e53787	[MemorySSA] Add another loop clobber test case.	2020-09-17 14:15:29 +01:00
Sjoerd Meijer	6637d72ddd	[Lint] Add check for intrinsic get.active.lane.mask As @efriedma pointed out in D86301, this "not equal to 0 check" of get.active.lane.mask's second operand needs to live here in Lint and not the Verifier. Differential Revision: https://reviews.llvm.org/D87228	2020-09-17 09:22:03 +01:00
Arthur Eubanks	f4ea0f9814	[NewPM] Port -print-alias-sets to NPM Really it should be named print<alias-sets>, but for the sake of changing fewer tests, added a TODO to rename after NPM switch and test cleanup. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D87713	2020-09-16 18:34:56 -07:00
Alina Sbirlea	344a3d0bc0	[MemorySSA] Rename uses in blocks with Phis. Renaming should include blocks with existing Phis. Resolves PR45927. Differential Revision: https://reviews.llvm.org/D87661	2020-09-16 17:24:17 -07:00
Arthur Eubanks	09c342493d	[NPM] Translate alias analysis into require<> as well 'require<globals-aa>' is needed to make globals-aa work in NPM, since globals-aa is a module analysis but function passes cannot run module analyses on demand. So don't skip translating alias analyses to 'require<>'. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D87743	2020-09-16 08:54:09 -07:00
Alina Sbirlea	d3d7603900	[MemorySSA] Report unoptimized as None, not MayAlias.	2020-09-15 23:58:53 -07:00
Alina Sbirlea	fc82006331	[MemorySSA] Set MustDominate to true for PhiTranslation.	2020-09-15 23:29:57 -07:00
Arthur Eubanks	3b38062d1c	[NewPM] Fix 2003-02-19-LoopInfoNestingBug.ll under NPM Also move it to a more appropriate directory.	2020-09-15 20:21:45 -07:00
Arthur Eubanks	558e5c31b6	[Dominators][NewPM] Pin tests with -analyze to legacy PM -analyze isn't supported in NPM. All affected tests have corresponding NPM RUN line.	2020-09-15 11:59:00 -07:00
Arthur Eubanks	d158e786cc	[DemandedBits][NewPM] Pin some tests to legacy PM All tests have corresponding NPM RUN lines. -analyze doesn't work under NPM.	2020-09-15 11:55:58 -07:00
Arthur Eubanks	9853e84b54	[PostDominators][NewPM] Fix tests to work under NPM Each test has a RUN line pinned to legacy PM and an NPM RUN line. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D87660	2020-09-15 11:19:01 -07:00
Arthur Eubanks	3f69b2140f	[NewPM][opt] Fix -globals-aa not being recognized as alias analysis in NPM Was missing MODULE_ALIAS_ANALYSIS, previously only FUNCTION_ALIAS_ANALYSIS was taken into account. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D87664	2020-09-15 11:18:19 -07:00
Arthur Eubanks	e0c7641de6	[RegionInfo][NewPM] Fix RegionInfo tests to work under NPM Pin RUN lines with -analyze to legacy PM, add corresponding NPM RUN line if missing. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D87658	2020-09-15 11:12:14 -07:00
Arthur Eubanks	6f66ad13c5	[DependenceAnalysis][NewPM] Fix tests to work under NPM All tests had corresponding NPM lines, simply pin non-NPM lines to legacy PM. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D87665	2020-09-15 11:11:23 -07:00
Arthur Eubanks	54e1bf1154	[LoopAccessAnalysis][NewPM] Fix tests to work under NPM Pin RUN lines with -analyze to legacy PM, add corresponding NPM RUN lines. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D87662	2020-09-15 11:06:47 -07:00
Florian Hahn	3a59628f3c	Revert "[DSE] Switch to MemorySSA-backed DSE by default." This reverts commit `fb109c42d9`. Temporarily revert due to a mis-compile pointed out at D87163.	2020-09-15 18:07:56 +01:00
Florian Hahn	c4f1b31441	[MemorySSA] Make sure PerformedPhiTrans is updated for each visited def. `1ce82015f6` added a fix to restrict phi optimizations after phi translations. But the current use of performedPhiTranslation only checked whether phi translation happened for the first iterator and missed cases where phi translation happens at subsequent iterators/upwards defs. This patch changes upward_defs_iterator to take a pointer to a bool, so we can easily ensure the final value includes all visited defs, while still being able to conveniently use it with make_range & co.	2020-09-14 16:11:56 +01:00
Florian Hahn	f07f3c7237	[MemorySSA] Precommit test case for PR47498.	2020-09-14 16:11:56 +01:00
Florian Hahn	fb109c42d9	[DSE] Switch to MemorySSA-backed DSE by default. The tests have been updated and I plan to move them from the MSSA directory up. Some end-to-end tests needed small adjustments. One difference to the legacy DSE is that legacy DSE also deletes trivially dead instructions that are unrelated to memory operations. Because MemorySSA-backed DSE just walks the MemorySSA, we only visit/check memory instructions. But removing unrelated dead instructions is not really DSE's job and other passes will clean up. One noteworthy change is in llvm/test/Transforms/Coroutines/ArgAddr.ll, but I think this comes down to legacy DSE not handling instructions that may throw correctly in that case. To cover this with MemorySSA-backed DSE, we need an update to llvm.coro.begin to treat its return value to belong to the same underlying object as the passed pointer. There are some minor cases MemorySSA-backed DSE currently misses, e.g. related to atomic operations, but I think those can be implemented after the switch. This has been discussed on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2020-August/144417.html For the MultiSource/SPEC2000/SPEC2006 the number of eliminated stores goes from ~17500 (legacy DSE) to ~26300 (MemorySSA-backed). More numbers and details in the thread on llvm-dev. Impact on CTMark: ``` Legacy Pass Manager exec instrs size-text O3 + 0.60% - 0.27% ReleaseThinLTO + 1.00% - 0.42% ReleaseLTO-g. + 0.77% - 0.33% RelThinLTO (link only) + 0.87% - 0.42% RelLO-g (link only) + 0.78% - 0.33% ``` http://llvm-compile-time-tracker.com/compare.php?from=3f22e96d95c71ded906c67067d75278efb0a2525&to=ae8be4642533ff03803967ee9d7017c0d73b0ee0&stat=instructions ``` New Pass Manager exec instrs. size-text O3 + 0.95% - 0.25% ReleaseThinLTO + 1.34% - 0.41% ReleaseLTO-g. 
+ 1.71% - 0.35% RelThinLTO (link only) + 0.96% - 0.41% RelLO-g (link only) + 2.21% - 0.35% ``` http://195.201.131.214:8000/compare.php?from=3f22e96d95c71ded906c67067d75278efb0a2525&to=ae8be4642533ff03803967ee9d7017c0d73b0ee0&stat=instructions Reviewed By: asbirlea, xbolva00, nikic Differential Revision: https://reviews.llvm.org/D87163	2020-09-10 22:24:32 +01:00
Simon Pilgrim	de25ebaac6	[CostModel][X86] Add vXi32 division by uniform constant costs (PR47476) Other types can be handled in future patches but their uniform / non-uniform costs are more similar and don't appear to cause many vectorization issues.	2020-09-10 12:17:54 +01:00
Krzysztof Parzyszek	8b7c8f2c54	Mark masked.{store,scatter,compressstore} intrinsics as write-only	2020-09-09 17:28:21 -05:00
Sam Parker	0af4147804	[ARM][CostModel] CodeSize costs for i1 arith ops When optimising for size, make the cost of i1 logical operations relatively expensive so that optimisations don't try to combine predicates. Differential Revision: https://reviews.llvm.org/D86525	2020-09-07 09:27:18 +01:00
Florian Hahn	1ddb3a369f	[LangRef] Adjust guarantee for llvm.memcpy to also allow equal arguments. This adjusts the description of `llvm.memcpy` to also allow operands to be equal. This is in line with what Clang currently expects. This change is intended to be temporary and followed by re-introducing a variant with the non-overlapping guarantee for cases where we can actually ensure that property in the front-end. See the links below for more details: http://lists.llvm.org/pipermail/cfe-dev/2020-August/066614.html and PR11763. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D86815	2020-09-05 19:18:23 +01:00
Nikita Popov	ac87480bd8	[SCEV] Recognize min/max intrinsics Recognize umin/umax/smin/smax intrinsics and convert them to the already existing SCEV nodes of the same name. In the future we'll want SCEVExpander to also produce the intrinsics, but we're not ready for that yet. Differential Revision: https://reviews.llvm.org/D87160	2020-09-05 16:30:11 +02:00
Nikita Popov	6b50ce3ac9	[SCEV] Add tests for min/max intrinsics (NFC)	2020-09-04 22:08:01 +02:00
Bryan Chan	3404add468	[EarlyCSE] Verify hash code in regression tests As discussed in D86843, -earlycse-debug-hash should be used in more regression tests to catch inconsistency between the hashing and the equivalence check. Differential Revision: https://reviews.llvm.org/D86863	2020-09-04 10:40:35 -04:00
Alina Sbirlea	ce66089ac6	Fix build-bots. BasicAA can be freed (and it is not recomputed).	2020-09-01 20:24:15 -07:00
Alina Sbirlea	1ccfb52a61	[MemCpyOptimizer] Preserve analyses and replace use of lambdas to get them. Summary: Analyses are preserved in MemCpyOptimizer. Get analyses before running the pass and store the pointers, instead of using lambdas and getting them every time on demand. Reviewers: lenary, deadalnix, mehdi_amini, nikic, efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74494	2020-09-01 17:35:40 -07:00
Max Kazantsev	e7f53044e7	[Test] Move IndVars test to a proper place	2020-09-01 12:17:31 +07:00
Alina Sbirlea	63844c116a	[MemorySSA] Clean up single value phis. MemoryPhis with a single value are correct, but can lead to errors when updating. Clean up single entry Phis newly added when cloning blocks. Resolves PR46574.	2020-08-31 19:26:08 -07:00
Anna Welker	064981f0ce	[ARM][MVE] Enable MVE gathers and scatters by default Enable MVE gather/scatters by default, which requires some minor adaptations in some tests. Differential revision: https://reviews.llvm.org/D86776	2020-08-28 19:05:29 +01:00
Florian Hahn	fd6ebea50d	[MemLoc] Support memcmp in MemoryLocation::getForArgument. This patch adds support for memcmp in MemoryLocation::getForArgument. memcmp reads from the first 2 arguments up to the number of bytes of the third argument. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D86725	2020-08-28 10:19:54 +01:00
Florian Hahn	85dacca29f	[BasicAA] Add first libfunc tests with memcmp.	2020-08-28 10:02:41 +01:00
Vitaly Buka	a40660551e	[StackSafety] Ignore allocas with partial lifetime markers Reviewed By: eugenis Differential Revision: https://reviews.llvm.org/D86672	2020-08-27 13:54:41 -07:00
Arthur Eubanks	486ed88533	[ConstProp] Remove ConstantPropagation As discussed in http://lists.llvm.org/pipermail/llvm-dev/2020-July/143801.html. Currently no users outside of unit tests. Replace all instances in tests of -constprop with -instsimplify. Notable changes in tests: * vscale.ll - @llvm.sadd.sat.nxv16i8 is evaluated by instsimplify, use a fake intrinsic instead * InsertElement.ll - insertelement undef is removed by instsimplify in @insertelement_undef llvm/test/Transforms/ConstProp moved to llvm/test/Transforms/InstSimplify/ConstProp Reviewed By: lattner, nikic Differential Revision: https://reviews.llvm.org/D85159	2020-08-26 15:51:30 -07:00
Arthur Eubanks	098d3f9827	[InstSimplify] Simplify to vector constants when possible InstSimplify should do all transformations that ConstProp does, but one thing that ConstProp does that InstSimplify wouldn't is inline vector instructions that are constants, e.g. into a ret. Previously vector instructions wouldn't be inlined in InstSimplify because llvm::Simplify*Instruction() would return nullptr for specific instructions, such as vector instructions that were actually constants, if it couldn't simplify them. This changes SimplifyInsertElementInst, SimplifyExtractElementInst, and SimplifyShuffleVectorInst to return a vector constant when possible. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D85946	2020-08-26 11:40:36 -07:00
David Green	677c1590c0	[ARM] Increase MVE gather/scatter cost by MVECostFactor. MVE gather/scatter code generation is looking a lot better than it used to, but still has some issues. We currently model the instructions as 1 cycle per element, which is a bit low for some cases. Increasing the cost by the MVECostFactor brings them in line with our other instruction costs. This will have the effect of only generating them when the extra benefit is more likely to overcome some of the issues, notably running out of registers and vectorizing loops that could otherwise be SLP vectorized. In the short term, whilst we look at other ways of dealing with those more directly, we can increase the costs of gathers to make them more likely to be beneficial when created. Differential Revision: https://reviews.llvm.org/D86444	2020-08-26 13:03:46 +01:00
Ta-Wei Tu	abbd652dd6	[LoopNest] False negative of `arePerfectlyNested` with LCSSA loops Summary: The LCSSA pass (required for all loop passes) sometimes adds additional blocks containing LCSSA variables, and checkLoopsStructure may return false even when the loops are perfectly nested in this case. This is because the successor of the exit block of the inner loop now points to the LCSSA block instead of the latch block of the outer loop. Examples are shown in the test nests-with-lcssa.ll. To fix the issue, the successor of the exit block of the inner loop can now point to a block in which all instructions are LCSSA phi nodes (except the terminator), and the sole successor of that block should point to the latch block of the outer loop. Reviewed By: Whitney, etiotto Differential Revision: https://reviews.llvm.org/D86133	2020-08-25 16:20:52 +00:00
Sam Parker	da4ada116e	[NFC][ARM] arith code size cost tests Add a run to measure the code size cost of arithmetic instructions and add a function for i1 types.	2020-08-25 11:16:01 +01:00
David Sherwood	7b64765cd1	[SVE] Fix TypeSize related warnings with IR truncates of scalable vectors In getCastInstrCost when the instruction is a truncate we were relying upon the implicit TypeSize -> uint64_t cast when asking if a given type has the same size as a legal integer. I've changed the code to only ask the question if the type is fixed length. I have also changed InstCombinerImpl::SimplifyDemandedUseBits to bail out for now if the type is a scalable vector. I've added the following new tests: Analysis/CostModel/AArch64/sve-trunc.ll Transforms/InstCombine/AArch64/sve-trunc.ll for both of these fixes. Differential revision: https://reviews.llvm.org/D86432	2020-08-25 09:17:56 +01:00
Christopher Tetreault	5eff21c8ff	[NFC][documentation] clarify comment in test test referenced a relative path to a file, but the path was not correct relative to the project the test is in Differential Revision: https://reviews.llvm.org/D86368	2020-08-21 14:30:47 -07:00
Sam Parker	acf0bb41e4	[ARM][CostModel] Select instruction costs. Modify the ARM getCmpSelInstrCost implementation for the code size costs of selects. Now consider the legalization cost and increase the cost of i1 because those values wouldn't live in a general purpose register. We also make selects +1 more expensive to account for the IT instruction. Differential Revision: https://reviews.llvm.org/D82091	2020-08-21 08:49:56 +01:00
Chuanqi Xu	f6de5306ec	[NFC][StackSafety] Test that StackLifetime looks through stripPointerCasts The StackLifetime class collects the lifetime markers of an `alloca` by collecting the users of the `BitCast` that uses the `alloca`. However, either the `alloca` itself could be used with the lifetime marker or the `BitCast` of the `alloca` could be transformed to other instructions. (e.g., it may be transformed to all zero reps in `InstCombine` pass). This patch tries to fix this process in the `collectMarkers` function. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D85399	2020-08-18 16:21:00 -07:00
Dávid Bolvanský	0f14b2e6cb	Revert "[BPI] Improve static heuristics for integer comparisons" This reverts commit `50c743fa71`. Patch will be split to smaller ones.	2020-08-17 20:44:33 +02:00
Simon Pilgrim	c1f6ce0c73	[DemandedBits] Improve accuracy of Add propagator The current demand propagator for addition will mark all input bits at and right of the alive output bit as alive. But carry won't propagate beyond a bit for which both operands are zero (or one/zero in the case of subtraction) so a more accurate answer is possible given known bits. I derived a propagator by working through truth tables and using a bit-reversed addition to make demand ripple to the right, but I'm not sure how to make a convincing argument for its correctness in the comments yet. Nevertheless, here's a minimal implementation and test to get feedback. This would help in a situation where, for example, four bytes (<128) packed into an int are added with four others SIMD-style but only one of the four results is actually read. Known A: 0_______0_______0_______0_______ Known B: 0_______0_______0_______0_______ AOut: 00000000001000000000000000000000 AB, current: 00000000001111111111111111111111 AB, patch: 00000000001111111000000000000000 Committed on behalf of: @rrika (Erika) Differential Revision: https://reviews.llvm.org/D72423	2020-08-17 12:54:09 +01:00
Simon Pilgrim	79d9e2cd93	[DemandedBits] Reorder addition test checks. NFC. As suggested on D72423 we should try to keep the same order as the original IR	2020-08-17 12:54:09 +01:00
Vitaly Buka	e10e7829bf	[StackSafety] Skip ambiguous lifetime analysis If we can't identify the alloca used in a lifetime marker, we need to assume the worst-case scenario. Reviewed By: eugenis Differential Revision: https://reviews.llvm.org/D84630	2020-08-16 18:05:52 -07:00
Vitaly Buka	47552a614a	[StackSafety] Change how callee searched in index Handle linkage types other than local.	2020-08-16 04:37:19 -07:00
Simon Pilgrim	25ce634172	[DemandedBits] Add addition test case from D72423	2020-08-14 15:59:53 +01:00
Vitaly Buka	798eb71c3a	[NFC][StackSafety] Dedup callees	2020-08-14 01:14:52 -07:00
Dávid Bolvanský	50c743fa71	[BPI] Improve static heuristics for integer comparisons Similarly as for pointers, even for integers a == b is usually false. GCC also uses this heuristic. Reviewed By: ebrevnov Differential Revision: https://reviews.llvm.org/D85781	2020-08-13 19:54:27 +02:00
Dávid Bolvanský	f9264995a6	Revert "[BPI] Improve static heuristics for integer comparisons" This reverts commit `44587e2f7e`. Sanitizer tests need to be updated.	2020-08-13 14:37:40 +02:00
Dávid Bolvanský	44587e2f7e	[BPI] Improve static heuristics for integer comparisons Similarly as for pointers, even for integers a == b is usually false. GCC also uses this heuristic. Reviewed By: ebrevnov Differential Revision: https://reviews.llvm.org/D85781	2020-08-13 14:23:58 +02:00
Dávid Bolvanský	a0485421d2	Revert "[BPI] Improve static heuristics for integer comparisons" This reverts commit `385c9d673f`.	2020-08-13 12:59:15 +02:00
Dávid Bolvanský	385c9d673f	[BPI] Improve static heuristics for integer comparisons Similarly as for pointers, even for integers a == b is usually false. GCC also uses this heuristic. Reviewed By: ebrevnov Differential Revision: https://reviews.llvm.org/D85781	2020-08-13 12:45:40 +02:00
Ali Tamur	0581c0b0ee	Revert "[SCEV] Look through single value PHIs." This reverts commit `e441b7a7a0`. This patch causes a compile error in tensorflow opensource project. The stack trace looks like: Point of crash: llvm/include/llvm/Analysis/LoopInfoImpl.h : line 35 (gdb) ptype this type = const class llvm::LoopBase<llvm::BasicBlock, llvm::Loop> [with BlockT = llvm::BasicBlock, LoopT = llvm::Loop] (gdb) p this $1 = {ParentLoop = 0x0, SubLoops = std::vector of length 0, capacity 0, Blocks = std::vector of length 0, capacity 1, DenseBlockSet = {<llvm::SmallPtrSetImpl<llvm::BasicBlock const>> = {<llvm::SmallPtrSetImplBase> = {<llvm::DebugEpochBase> = {Epoch = 3}, SmallArray = 0x1b2bf6c8, CurArray = 0x1b2bf6c8, CurArraySize = 8, NumNonEmpty = 0, NumTombstones = 0}, <No data fields>}, SmallStorage = {0xfffffffffffffffe, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}}, IsInvalid = true} (gdb) p this->DenseBlockSet->CurArray $2 = (const void *) 0xfffffffffffffffe I will try to get a case from tensorflow or use creduce to get a small case.	2020-08-12 23:13:24 -07:00
Florian Hahn	e441b7a7a0	[SCEV] Look through single value PHIs. Now that SCEVExpander can preserve LCSSA form, we do not have to worry about LCSSA form when trying to look through PHIs. SCEVExpander will take care of inserting LCSSA PHI nodes as required. This increases precision of the analysis in some cases. Reviewed By: mkazantsev, bmahjour Differential Revision: https://reviews.llvm.org/D71539	2020-08-12 10:03:42 +01:00
Dávid Bolvanský	d68a2859ab	[BPI] Teach BPI about bcmp function bcmp is similar to memcmp	2020-08-11 20:44:53 +02:00
Florian Hahn	3483c28c5b	[SCEV] If RHS >= Start, simplify (Start smax RHS) to RHS for trip counts. This is the max version of D85046. This change causes binary changes in 44 out of 237 benchmarks (out of MultiSource/SPEC2000/SPEC2006). Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D85189	2020-08-11 13:20:24 +01:00
Thomas Lively	514445e035	[WebAssembly][ConstantFolding] Fold fp-to-int truncation intrinsics Constant fold both the trapping and saturating versions of the WebAssembly truncation intrinsics. The tests are adapted from the WebAssembly spec tests for the corresponding instructions. Requested in PR46982. Differential Revision: https://reviews.llvm.org/D85392	2020-08-10 12:40:05 -07:00
Vitaly Buka	dee812a297	[StackSafety] Fix union which produces wrapped sets	2020-08-09 23:20:17 -07:00
Vitaly Buka	3a34228bff	[StackSafety] Don't keep FullSet in index Optimization. A missing record is interpreted as FullSet anyway.	2020-08-09 15:01:46 -07:00
Vitaly Buka	654266bea9	[StackSafety] Use getSignedMin() to serialize ranges Almost NFC as it's important only for full sets which should not be serialized at all.	2020-08-09 14:53:13 -07:00
Vitaly Buka	eff04f9595	[NFC][StackSafety] Add index test This directly covers generateParamAccessSummary	2020-08-09 14:34:00 -07:00
Vitaly Buka	2fa401fe53	[NFC][StackSafety] Add shell test requirement	2020-08-09 14:31:17 -07:00
Vitaly Buka	2a11d5dcc9	[NFC][StackSafety] Avoid some duplications in tests	2020-08-09 12:38:53 -07:00
Vitaly Buka	6d9b3cb2fb	Revert "[NFC][StackSafety] Add index test" This reverts commit `5fd49911db`. GUIDs don't match.	2020-08-08 21:26:35 -07:00
Vitaly Buka	5fd49911db	[NFC][StackSafety] Add index test This directly covers generateParamAccessSummary	2020-08-08 19:11:02 -07:00
Vitaly Buka	b317321545	[NFC][StackSafety] noinline in alias tests	2020-08-08 18:21:52 -07:00
Vitaly Buka	7547508b7a	Revert "[StackSafety] Skip ambiguous lifetime analysis" This reverts commit `0b2616a804`. Crashes with safe-stack.	2020-08-07 14:02:50 -07:00
Max Kazantsev	da9e7b1ab0	[Test] Added test showing missing range check elimination opportunity in IndVars Seems that SCEV is not powerful enough to handle this.	2020-08-07 16:47:25 +07:00
Vitaly Buka	7fb9de2c6f	[StackSafety,NFC] Fix tests in debug	2020-08-06 20:46:39 -07:00
Vitaly Buka	39cbcbe1b1	[StackSafety,NFC] Add more tests	2020-08-06 19:50:05 -07:00
Vitaly Buka	d97636196a	[StackSafety,NFC] Sort llvm-lto2 resolutions in tests	2020-08-06 19:46:52 -07:00
Vitaly Buka	92dcf12b2f	[StackSafety,NFC] Use CHECK-EMPTY in tests	2020-08-06 19:19:51 -07:00
Vitaly Buka	0b2616a804	[StackSafety] Skip ambiguous lifetime analysis If we can't identify the alloca used in a lifetime marker, we need to assume the worst-case scenario. Reviewed By: eugenis Differential Revision: https://reviews.llvm.org/D84630	2020-08-06 19:10:33 -07:00
dfukalov	4ccc38813e	[AMDGPU][CostModel] Add f16, f64 and contract cases to fused costs estimation. Add cases of fused fmul+fadd/fsub with f16 and f64 operands to cost model. Also added operations with contract attribute. Fixed line endings in test. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D84995	2020-08-06 21:43:27 +03:00
Arthur Eubanks	d0acd97c68	[NewPM][LoopUnswitch] Pin loop-unswitch to legacy PM or use simple-loop-unswitch As mentioned in http://lists.llvm.org/pipermail/llvm-dev/2020-July/143395.html, loop-unswitch has not been ported to the NPM. Instead people are using simple-loop-unswitch. Pin all tests in Transforms/LoopUnswitch to legacy PM and replace all other uses of loop-unswitch with simple-loop-unswitch. One test that didn't fit into the above was 2014-06-21-congruent-constant.ll which seems to only pass with loop-unswitch. That is also pinned to legacy PM. Now all tests containing "-loop-unswitch" anywhere in the test succeed with NPM turned on by default. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D85360	2020-08-06 10:56:00 -07:00
Alina Sbirlea	beb9993d96	[MSSA] Update test with more detailed and resilient checks. [NFC]	2020-08-05 16:46:44 -07:00
Arthur Eubanks	4103f4a936	[MSSA][NewPM] Handle tests with -print-memoryssa -print-memoryssa in legacy PM is print<memoryssa> in NPM. Pin tests with -print-memoryssa to legacy PM. Add corresponding tests for NPM where missing. This fixes "unknown pass name 'print-memoryssa'". Some tests still fail in Analysis/MemorySSA due to other passes that haven't been ported. pr43427.ll and pr43438.ll required adding -aa-pipeline=basic-aa, -loop-simplify (since it doesn't run on legacy PM by default), and decrementing some of the MemoryPhi numbers. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D85333	2020-08-05 15:59:45 -07:00
Sam Parker	f2675ab45f	[ARM][CostModel] Implement getCFInstrCost As with other targets, set the throughput cost of control-flow instructions to free so that we don't miss out on vectorization opportunities. Differential Revision: https://reviews.llvm.org/D85283	2020-08-05 12:44:51 +01:00
David Green	3c7e7d40a9	[BasicAA] Enable -basic-aa-recphi by default This option was added a while back, to help improve AA around pointer phi loops. It looks for phi(gep(phi, const), x) loops, checking if x can then prove more precise aliasing info. Differential Revision: https://reviews.llvm.org/D82998	2020-08-04 10:43:42 +01:00
Florian Hahn	b7856f9d8d	[SCEV] Consolidate some smin/smax folding tests into single test file. This patch moves a few spread out smin/smax tests to smin-smax-folds.ll and adds additional test cases that expose further potential for folds.	2020-08-04 10:24:11 +01:00
Alina Sbirlea	1ce82015f6	[MemorySSA] Restrict optimizations after a PhiTranslation. Merging alias results from different paths, when a path did phi translation, is not necessarily correct. Conservatively terminate such paths. Aimed to fix PR46156. Differential Revision: https://reviews.llvm.org/D84905	2020-08-03 14:46:41 -07:00
Florian Hahn	ee1c12708a	[SCEV] If Start>=RHS, simplify (Start smin RHS) = RHS for trip counts. In some cases, it seems like we can get rid of unnecessary s/umins by using information from the loop guards (unless I am missing something). One place where this seems to be helpful in practice is when computing loop trip counts. This patch just changes howManyGreaterThans for now. Note that this requires a loop for which we can check 'is guarded'. On SPEC2000/SPEC2006/MultiSource, there are some notable changes for some programs in the number of loops unrolled and trip counts computed. ``` Same hash: 179 (filtered out) Remaining: 58 Metric: scalar-evolution.NumTripCountsComputed Program base patch diff test-suite...langs-C/compiler/compiler.test 25.00 31.00 24.0% test-suite.../Applications/SPASS/SPASS.test 2020.00 2323.00 15.0% test-suite...langs-C/allroots/allroots.test 29.00 32.00 10.3% test-suite.../Prolangs-C/loader/loader.test 17.00 18.00 5.9% test-suite...fice-ispell/office-ispell.test 253.00 265.00 4.7% test-suite...006/450.soplex/450.soplex.test 3552.00 3692.00 3.9% test-suite...chmarks/MallocBench/gs/gs.test 453.00 470.00 3.8% test-suite...ngs-C/assembler/assembler.test 29.00 30.00 3.4% test-suite.../Benchmarks/Ptrdist/bc/bc.test 263.00 270.00 2.7% test-suite...rks/FreeBench/pifft/pifft.test 722.00 741.00 2.6% test-suite...count/automotive-bitcount.test 41.00 42.00 2.4% test-suite...0/253.perlbmk/253.perlbmk.test 1417.00 1451.00 2.4% test-suite...000/197.parser/197.parser.test 387.00 396.00 2.3% test-suite...lications/sqlite3/sqlite3.test 1168.00 1189.00 1.8% test-suite...000/255.vortex/255.vortex.test 173.00 176.00 1.7% Metric: loop-unroll.NumUnrolled Program base patch diff test-suite...langs-C/compiler/compiler.test 1.00 3.00 200.0% test-suite.../Applications/SPASS/SPASS.test 134.00 234.00 74.6% test-suite...count/automotive-bitcount.test 3.00 4.00 33.3% test-suite.../Prolangs-C/loader/loader.test 3.00 4.00 33.3% test-suite...langs-C/allroots/allroots.test 3.00 4.00 33.3% test-suite...Source/Benchmarks/sim/sim.test 10.00 12.00 20.0% test-suite...fice-ispell/office-ispell.test 21.00 25.00 19.0% test-suite.../Benchmarks/Ptrdist/bc/bc.test 32.00 38.00 18.8% test-suite...006/450.soplex/450.soplex.test 300.00 352.00 17.3% test-suite...rks/FreeBench/pifft/pifft.test 60.00 69.00 15.0% test-suite...chmarks/MallocBench/gs/gs.test 57.00 63.00 10.5% test-suite...ngs-C/assembler/assembler.test 10.00 11.00 10.0% test-suite...0/253.perlbmk/253.perlbmk.test 145.00 157.00 8.3% test-suite...000/197.parser/197.parser.test 43.00 46.00 7.0% test-suite...TimberWolfMC/timberwolfmc.test 205.00 214.00 4.4% Geomean difference 7.6% ``` Fixes https://bugs.llvm.org/show_bug.cgi?id=46939 Fixes https://bugs.llvm.org/show_bug.cgi?id=46924 on X86. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D85046	2020-08-03 17:22:42 +01:00
Florian Hahn	ffb4735200	[SCEV] Precommit tests with signed counting down loop. From PR46939.	2020-08-02 10:26:26 +01:00
Sanjay Patel	e591713bff	[ConstantFolding] fold abs intrinsic The handling for minimum value is similar to cttz/ctlz with 0 just above this case. Differential Revision: https://reviews.llvm.org/D84942	2020-07-31 14:08:44 -04:00
Vitaly Buka	89051ebace	[NFC] GetUnderlyingObject -> getUnderlyingObject I am going to touch them in the next patch anyway	2020-07-30 21:08:24 -07:00
Arthur Eubanks	47acbcf09a	[tbaa] Rename type-based-aa -> tbaa For consistency with legacy pass name. Helps with 37 instances of "unknown pass name 'tbaa'" in check-llvm under NPM. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D84967	2020-07-30 19:51:35 -07:00
Sanjay Patel	f7237ee74f	[ConstantFolding] add tests for abs intrinsic; NFC	2020-07-30 09:28:30 -04:00
Yuanfang Chen	8224c5047e	For some tests targeting SystemZ, -march=z13 ---> -mcpu=z13 z13 is not a target. It is a CPU.	2020-07-29 19:18:01 -07:00
Sanjay Patel	9ee7d7122c	[ConstantFolding] fold integer min/max intrinsics If both operands are undef, return undef. If one operand is undef, clamp to limit constant.	2020-07-29 11:01:13 -04:00
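The undef rule stated in 9ee7d7122c can be modeled in a few lines. This is a hedged sketch, not LLVM's code: `UNDEF`, `limit_constant`, and `fold_minmax` are hypothetical names, and reading "limit constant" as the extreme value in the direction of the operation (for example the largest signed value for smax) is an assumption of this example.

```python
UNDEF = None  # stand-in for an LLVM undef operand (assumption of this sketch)


def limit_constant(kind, bits):
    """Extreme value that a min/max of the given kind saturates to."""
    if kind == "smax":
        return (1 << (bits - 1)) - 1   # largest signed value
    if kind == "smin":
        return -(1 << (bits - 1))      # smallest signed value
    if kind == "umax":
        return (1 << bits) - 1         # largest unsigned value
    if kind == "umin":
        return 0                       # smallest unsigned value
    raise ValueError("unknown min/max kind: " + kind)


def fold_minmax(kind, a, b, bits=8):
    """Fold an integer min/max whose operands are ints or UNDEF."""
    limit = limit_constant(kind, bits)  # also validates `kind`
    if a is UNDEF and b is UNDEF:
        return UNDEF                    # both undef: result is undef
    if a is UNDEF or b is UNDEF:
        return limit                    # one undef: clamp to the limit
    return max(a, b) if kind in ("smax", "umax") else min(a, b)


if __name__ == "__main__":
    print(fold_minmax("smax", UNDEF, 3))
    print(fold_minmax("umin", 7, UNDEF))
```

Operands are plain Python integers interpreted as signed or unsigned according to the operation, which keeps the sketch short at the cost of not modeling bit patterns.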
Sanjay Patel	9f95895833	[ConstantFolding] add tests for integer min/max intrinsics; NFC	2020-07-29 11:01:13 -04:00
Simon Pilgrim	d1abca187d	[CostModel][X86] Add SSE costs for SMAX/SMIN/UMAX/UMIN intrinsics	2020-07-29 15:55:43 +01:00
Sanjay Patel	8c3262a7b4	[ConstantFolding] update test checks FP min/max intrinsics There's a slight difference in functionality with the new CHECK lines: before, we allowed either -0.0 or 0.0 for maxnum/minnum. That matches the definition, but we should always get a deterministic result from constant folding within the compiler, so now we assert that we got the single expected result in all cases.	2020-07-29 09:43:33 -04:00
Simon Pilgrim	0a0f28254a	[CostModel][X86] Add SSE costs for ABS intrinsics	2020-07-29 14:33:59 +01:00
David Green	9ddb28964c	[ARM] Tune getCastInstrCost for extending masked loads and truncating masked stores This patch uses the feature added in D79162 to fix the cost of a sext/zext of a masked load, or a trunc for a masked store. Previously, those were considered cheap or even free, but it's not the case as we cannot split the load in the same way we would for normal loads. This updates the costs to better reflect reality, and adds a test for it in test/Analysis/CostModel/ARM/cast.ll. It also adds a vectorizer test that showcases the improvement: in some cases, the vectorizer will now choose a smaller VF when tail-predication is enabled, which results in better codegen. (Because if it were to use a higher VF in those cases, the code we see above would be generated, and the vmovs would block tail-predication later in the process, resulting in very poor codegen overall) Original Patch by Pierre van Houtryve Differential Revision: https://reviews.llvm.org/D79163	2020-07-29 13:41:34 +01:00
Simon Pilgrim	c5ef1f1edd	[TTI] Add default cost expansion for abs/smax/smin/umax/umin intrinsics	2020-07-29 12:13:06 +01:00
Simon Pilgrim	3f7249046a	[CostModel][X86] Add smax/smin/umin/umax intrinsics cost model tests Costs currently fall back to scalar generic intrinsic calls	2020-07-28 19:56:11 +01:00
Simon Pilgrim	c6920081a8	[CostModel][X86] Add abs intrinsics cost model tests abs costs currently fall back to scalar generic intrinsic calls	2020-07-28 19:56:10 +01:00
Arthur Eubanks	2ca6c422d2	[FunctionAttrs] Rename functionattrs -> function-attrs To match NewPM pass name, and also for readability. Also rename rpo-functionattrs -> rpo-function-attrs while we're here. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D84694	2020-07-28 09:09:13 -07:00
Florian Hahn	be2ea29ee1	[SCEV] Add additional tests. Increase test coverage for upcoming changes to how SCEV deals with LCSSA phis.	2020-07-28 16:15:57 +01:00
Jinsong Ji	d28f86723f	Re-land "[PowerPC] Remove QPX/A2Q BGQ/BGP CNK support" This reverts commit `bf544fa1c3`. Fixed the typo in PPCInstrInfo.cpp.	2020-07-28 14:00:11 +00:00
Jinsong Ji	bf544fa1c3	Revert "[PowerPC] Remove QPX/A2Q BGQ/BGP CNK support" This reverts commit `adffce7153`. This is breaking test-suite, revert while investigating.	2020-07-27 21:07:00 +00:00
Jinsong Ji	adffce7153	[PowerPC] Remove QPX/A2Q BGQ/BGP CNK support Per RFC http://lists.llvm.org/pipermail/llvm-dev/2020-April/141295.html no one is making use of QPX/A2Q/BGQ/BGP CNK anymore. This patch removes support for QPX/A2Q in llvm, BGQ/BGP in clang, and CNK in openmp/polly. Reviewed By: hfinkel Differential Revision: https://reviews.llvm.org/D83915	2020-07-27 19:24:39 +00:00
Juneyoung Lee	32088f4f7f	[ConstantFolding] Fold freeze if it is never undef or poison This is a simple patch that adds constant folding for freeze instruction. IIUC, it isn't needed to update ConstantFold.cpp because there is no freeze constexpr. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D84597	2020-07-26 21:54:44 +09:00
Juneyoung Lee	1b802fe34d	NFC; add a test for freeze's constprop	2020-07-26 21:03:23 +09:00
Arthur Eubanks	9bb6ce78be	Rename scoped-noalias -> scoped-noalias-aa Summary: To match NewPM name. Also the new name is clearer and more consistent. Subscribers: jvesely, nhaehnle, hiraditya, asbirlea, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D84542	2020-07-24 12:14:27 -07:00
Tarindu Jayatilaka	06283661b3	Add new function properties to FunctionPropertiesAnalysis Added LoadInstCount, StoreInstCount, MaxLoopDepth, LoopCount Reviewed By: jdoerfert, mtrofin Differential Revision: https://reviews.llvm.org/D82283	2020-07-23 12:46:47 -07:00
Tarindu Jayatilaka	ee6f0e109c	Add a Printer to the FunctionPropertiesAnalysis A printer pass and a lit test case was added. Reviewed By: mtrofin Differential Revision: https://reviews.llvm.org/D82523	2020-07-23 11:57:11 -07:00
Max Kazantsev	c1d8e39236	[Test] Add more simple tests for PR46786	2020-07-22 17:11:26 +07:00
Max Kazantsev	b96114c1e1	[SCEV] Remove premature assert. PR46786 This assert was added to verify the assumption that a GEP's SCEV will be of pointer type, based on the fact that it should be a SCEVAddExpr with (at least) the last operand being a pointer. Two notes: - GEP's SCEV does not have to be a SCEVAddExpr after all simplifications; - In the current state, GEP's SCEV does not have to have at least one pointer operand (all of them can become int during the transforms). However, we might want to be at a point where it is true. We are currently removing this assert and will try to enumerate the cases where the "is pointer" notion might be lost during the transforms. When all of them are fixed, we can return it. Differential Revision: https://reviews.llvm.org/D84294 Reviewed By: lebedev.ri	2020-07-22 15:43:16 +07:00
David Green	becaa6803a	[ARM] Constant fold VCTP intrinsics We can sometimes get into the situation where the operand to a vctp intrinsic becomes constant, such as after a loop is fully unrolled. This adds the constant folding needed for them, allowing them to simplify away and hopefully simplifying remaining instructions. Differential Revision: https://reviews.llvm.org/D84110	2020-07-21 11:39:31 +01:00
Matt Arsenault	ad8e900cb3	Verifier: Disallow byval and similar for AMDGPU calling conventions These imply stack-like semantics, which doesn't make any sense for entry points.	2020-07-20 10:58:57 -04:00
Jameson Nash	8b354cc8db	[ConstantFolding] check applicability of AllOnes constant creation first The getAllOnesValue can only handle things that are bitcast from a ConstantInt, while here we bitcast through a pointer, so we may see more complex objects (like Array or Struct). Differential Revision: https://reviews.llvm.org/D83870	2020-07-19 13:13:57 -04:00
Arthur Eubanks	9adbb5cb3a	[SCEV] Fix ScalarEvolution tests under NPM Many tests use opt's -analyze feature, which does not translate well to NPM and has better alternatives. The alternative here is to explicitly add a pass that calls ScalarEvolution::print(). The legacy pass manager RUNs aren't changing, but they are now pinned to the legacy pass manager. For each legacy pass manager RUN, I added a corresponding NPM RUN using the 'print<scalar-evolution>' pass. For compatibility with update_analyze_test_checks.py and existing test CHECKs, 'print<scalar-evolution>' now prints what -analyze prints per function. This was generated by the following Python script and failures were manually fixed up: import sys for i in sys.argv: with open(i, 'r') as f: s = f.read() with open(i, 'w') as f: for l in s.splitlines(): if "RUN:" in l and ' -analyze ' in l and '\\' not in l: f.write(l.replace(' -analyze ', ' -analyze -enable-new-pm=0 ')) f.write('\n') f.write(l.replace(' -analyze ', ' -disable-output ').replace(' -scalar-evolution ', ' "-passes=print<scalar-evolution>" ').replace(" | ", " 2>&1 | ")) f.write('\n') else: f.write(l) There are a couple failures still in ScalarEvolution under NPM, but those are due to other unrelated naming conflicts. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D83798	2020-07-16 11:24:07 -07:00
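The per-line rewrite performed by the script quoted in 9adbb5cb3a can be sketched as a standalone function. This is a reflowed reading of the flattened script rather than the author's exact file: the name `rewrite_run_line` is hypothetical, pipe characters are written as a plain `|`, and the loop that reads and rewrites each file is left out.

```python
from typing import List


def rewrite_run_line(line: str) -> List[str]:
    """Return the line(s) that should replace `line` in a test file."""
    # Only single-line RUN lines that use -analyze are rewritten; lines
    # containing a backslash (continuations) are left untouched.
    if "RUN:" in line and " -analyze " in line and "\\" not in line:
        # Keep the original RUN line, pinned to the legacy pass manager.
        legacy = line.replace(" -analyze ", " -analyze -enable-new-pm=0 ")
        # Add a new-pass-manager RUN line using the printer pass.
        npm = (
            line.replace(" -analyze ", " -disable-output ")
            .replace(" -scalar-evolution ", ' "-passes=print<scalar-evolution>" ')
            .replace(" | ", " 2>&1 | ")
        )
        return [legacy, npm]
    return [line]


if __name__ == "__main__":
    example = "; RUN: opt < %s -analyze -scalar-evolution | FileCheck %s"
    for out in rewrite_run_line(example):
        print(out)
```

Returning a list keeps the function free of file I/O, so the transformation can be checked on individual lines before it is applied to a test directory.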
David Green	311fafd2c9	[BasicAA] Fix -basicaa-recphi for geps with negative offsets As shown in D82998, the basic-aa-recphi option can cause miscompiles for geps with negative constants. The option checks for recursive phis that recurse through a constant gep. If it finds one, it performs aliasing calculations using the other phi operands with an unknown size, to specify that an unknown number of elements after the initial value are potentially accessed. This works fine except where the constant is negative, as the size is still considered to be positive. So this patch expands the check to make sure that the constant is also positive. Differential Revision: https://reviews.llvm.org/D83576	2020-07-16 17:22:40 +01:00
David Green	30fa576627	[BasicAA] Add additional negative phi tests. NFC	2020-07-16 15:32:38 +01:00
dfukalov	76a0c0ee6f	[AMDGPU][CostModel] Improve cost estimation for fused {fadd|fsub}(a,fmul(b,c)) Summary: If result of fmul(b,c) has one use, in almost all cases (except denormals are IEEE) the pair of operations will be fused in one fma/mad/mac/etc. Reviewers: rampitec Reviewed By: rampitec Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits, kerbowa Tags: #llvm Differential Revision: https://reviews.llvm.org/D83919	2020-07-16 03:06:38 +03:00
Arthur Eubanks	f413b53a67	[NPM][IVUsers] Rename ivusers -> iv-users LPM passes were named iv-users, which seems nicer than ivusers. Reviewed By: hans Differential Revision: https://reviews.llvm.org/D83803	2020-07-15 09:38:21 -07:00
Tyker	0257ba581c	Fix tests after `16f777f421`	2020-07-14 22:52:26 +02:00
Giorgis Georgakoudis	aef60af34e	[CallGraph] Ignore callback uses Summary: Ignore callback uses when adding a callback function in the CallGraph. Callback functions are typically created when outlining, e.g. for OpenMP, so they have internal scope and linkage. They should not be added to the ExternalCallingNode since they are only callable by the specified caller function at creation time. A CGSCC pass, such as OpenMPOpt, may need to update the CallGraph by adding a new outlined callback function. Without ignoring callback uses, adding breaks CGSCC pass restrictions and results in a broken CallGraph. Reviewers: jdoerfert Subscribers: hiraditya, sstefan1, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D83370	2020-07-14 13:08:49 -07:00
Tyker	16f777f421	[NFC] Add debug and stat counters to assume queries and assume builder Summary: Add debug counter and stats counter to assume queries and assume builder. Here are the collected stats on a build of check-llvm + check-clang: "assume-builder.NumAssumeBuilt": 2720879, "assume-builder.NumAssumesMerged": 761396, "assume-builder.NumAssumesRemoved": 1576212, "assume-builder.NumBundlesInAssumes": 6518809, "assume-queries.NumAssumeQueries": 85566380, "assume-queries.NumUsefullAssumeQueries": 2727360. The NumUsefullAssumeQueries stat is actually pessimistic because in a few places queries ask to keep providing information to try to get better information, and this isn't counted as a useful query even though it can be useful. Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D83506	2020-07-14 21:49:14 +02:00
David Sherwood	c06b7e2ab5	[SVE] Fix implicit TypeSize->uint64_t conversion getCastInstrCost In getCastInstrCost() when comparing different sizes for src and dst types we should be using the TypeSize comparison operators instead of relying upon TypeSize being converted to a uint64_t. Previously this meant we were dropping the scalable property and treating fixed and scalable vector types the same. Differential Revision: https://reviews.llvm.org/D83461	2020-07-14 08:16:31 +01:00
David Green	e1135b486a	Revert "[BasicAA] Enable -basic-aa-recphi by default" This reverts commit `af839a9618`. Some issues appear to be being caused by this. Reverting whilst we investigate.	2020-07-10 13:43:54 +01:00
Roman Lebedev	c2a61ef388	Revert "[CallGraph] Ignore callback uses" This likely has broken test/Transforms/Attributor/IPConstantProp/ tests. http://45.33.8.238/linux/22502/step_12.txt This reverts commit `205dc0922d`.	2020-07-10 00:02:07 +03:00
Giorgis Georgakoudis	205dc0922d	[CallGraph] Ignore callback uses Summary: Ignore callback uses when adding a callback function in the CallGraph. Callback functions are typically created when outlining, e.g. for OpenMP, so they have internal scope and linkage. They should not be added to the ExternalCallingNode since they are only callable by the specified caller function at creation time. A CGSCC pass, such as OpenMPOpt, may need to update the CallGraph by adding a new outlined callback function. Without ignoring callback uses, adding breaks CGSCC pass restrictions and results in a broken CallGraph. Reviewers: jdoerfert Subscribers: hiraditya, sstefan1, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D83370	2020-07-09 13:13:46 -07:00
David Green	af839a9618	[BasicAA] Enable -basic-aa-recphi by default This option was added a while back, to help improve AA around pointer phi loops. It looks for phi(gep(phi, const), x) loops, checking if x can then prove more precise aliasing info. Differential Revision: https://reviews.llvm.org/D82998	2020-07-09 14:54:53 +01:00
Arthur Eubanks	83158cf95d	[BasicAA] Remove -basicaa alias Follow up of https://reviews.llvm.org/D82607. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D83067	2020-07-07 10:44:23 -07:00
Stanislav Mekhanoshin	f7a7efbf88	[AMDGPU] Tweak getTypeLegalizationCost() Even though wide vectors are legal they still cost more as we will have to eventually split them. Not all operations can be uniformly done on vector types. Conservatively add the cost of splitting at least to 8 dwords, which is our widest possible load. We are more or less lying to the cost model with this change but this can prevent the vectorizer from creating wide vectors, which results in RA problems for us. Differential Revision: https://reviews.llvm.org/D83078	2020-07-06 14:07:48 -07:00
Roman Lebedev	a2619a60e4	Reland "[ScalarEvolution] createSCEV(): recognize `udiv`/`urem` disguised as an `sdiv`/`srem`" This reverts commit `d3e3f36ff1`, which reverted the original commit `2c16100e6f`, but with polly tests now actually passing.	2020-07-06 18:00:22 +03:00
David Green	146dad0077	[ARM] MVE FP16 cost adjustments This adjusts the MVE fp16 cost model, similar to how we already do for integer casts. It uses the base cost of 1 per cvt for most fp extend / truncates, but adjusts it for loads and stores where we know that a extending load has been used to get the load into the correct lane, and only an MVE VCVTB is then needed. Differential Revision: https://reviews.llvm.org/D81813	2020-07-06 15:57:51 +01:00
Mikhail Goncharov	d3e3f36ff1	Revert "[ScalarEvolution] createSCEV(): recognize `udiv`/`urem` disguised as an `sdiv`/`srem`" Summary: This reverts commit `2c16100e6f`. ninja check-polly fails: Polly :: Isl/CodeGen/MemAccess/generate-all.ll Polly :: ScopInfo/multidim_srem.ll Reviewers: kadircet, bollu Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D83230	2020-07-06 16:41:59 +02:00
David Green	afdb2ef2ed	[ARM] Adjust default fp extend and trunc costs This adds some default costs for fp extends and truncates, generally costing them as 1 per lane. If the type is not legal then the cost will include a call to an __aeabi_ function. Some NEON code is also adjusted to make sure it applies to the expected types, now that fp16 is a more common thing. Differential Revision: https://reviews.llvm.org/D82458	2020-07-06 14:23:17 +01:00
David Green	60b8b2beea	[ARM] Add extra extend and trunc costs for cast instructions This expands the existing extend costs with a few extras for larger types than legal, which will usually be split under MVE. It also adds trunc support for the same thing. These should not have a large effect on many things, but makes the costs explicit and keeps a certain balance between the truncs and extends. Differential Revision: https://reviews.llvm.org/D82457	2020-07-06 11:33:05 +01:00
David Green	55227f85d0	[ARM] Use BaseT::getMemoryOpCost for getMemoryOpCost This alters getMemoryOpCost to use the Base TargetTransformInfo version that includes some additional checks for whether extending loads are legal. This will generally have the effect of making <2 x ..> and some <4 x ..> loads/stores more expensive, which in turn should help favour larger vector factors. Notably it alters the cost of a <4 x half>, which with the current codegen will be expensive if it is not extended. Differential Revision: https://reviews.llvm.org/D82456	2020-07-06 10:58:40 +01:00
Arthur Eubanks	3d12e79094	[NewPM][LSR] Rename strength-reduce -> loop-reduce The legacy pass was called "loop-reduce". This lowers the number of check-llvm failures under NPM by 83. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D82925	2020-07-02 11:15:29 -07:00
David Green	30bd66544d	[BasicAA] Fix recursive phi MustAlias calculations With the option -basic-aa-recphi we can detect recursive phis that loop through constant geps, which allows us to detect more no-alias case for pointer IV's. If the other phi operand and the other alias value are MustAlias though, we cannot presume that every element in the loop is also MustAlias. We need to instead be conservative and return MayAlias. Differential Revision: https://reviews.llvm.org/D82987	2020-07-02 14:01:38 +01:00
Roman Lebedev	2c16100e6f	[ScalarEvolution] createSCEV(): recognize `udiv`/`urem` disguised as an `sdiv`/`srem` Summary: While InstCombine trivially converts that `srem` into a `urem`, it might happen later than wanted, in particular i'd like for that to happen on https://godbolt.org/z/bwuEmJ test case early in pipeline, before first instcombine run, just before `-mem2reg`. SCEV should recognize this case natively. Reviewers: mkazantsev, efriedma, nikic, reames Reviewed By: efriedma Subscribers: clementval, hiraditya, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82721	2020-07-02 13:22:12 +03:00
David Green	68498ce8af	[BasicAA] New basic-aa-recphi test. NFC	2020-07-02 10:54:01 +01:00
Roman Lebedev	e7da7d9428	[NFCI] Actually provide correct check lines in sdiv.ll	2020-07-02 02:00:02 +03:00
Roman Lebedev	51ff7642a3	[NFC][ScalarEvolution] Add udiv-disguised-as-sdiv test Much like `25521150d7`, but with division instead of remainder. See https://reviews.llvm.org/D82721	2020-07-02 01:44:19 +03:00
Sergey Dmitriev	cb8faaacb5	[CallGraph] Add support for callback call sites Summary: This patch changes call graph analysis to recognize callback call sites and add an artificial 'reference' call record from the broker function caller to the callback function in the call graph. A presence of such reference enforces bottom-up traversal order for callback functions in CG SCC pass manager because callback function logically becomes a callee of the broker function caller. Reviewers: jdoerfert, hfinkel, sstefan1, baziotis Reviewed By: jdoerfert Subscribers: hiraditya, kuter, sstefan1, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82572	2020-07-01 13:44:11 -07:00
Florian Hahn	1ccc49924a	[AArch64] Add getCFInstrCost, treat branches as free for throughput. D79164/2596da31740f changed getCFInstrCost to return 1 per default. AArch64 did not have its own implementation, hence the throughput cost of CFI instructions is overestimated. On most cores, most branches should be predicted and essentially free throughput wise. This fixes a 9% performance regression on a SPEC2006 benchmark on AArch64 with -O3 LTO & PGO. This patch effectively restores pre `2596da3174` behavior for AArch64 and undoes the AArch64 test changes of the patch. Reviewers: samparker, dmgreen, anemet Reviewed By: samparker Differential Revision: https://reviews.llvm.org/D82755	2020-06-30 20:34:04 +01:00
Roman Lebedev	25521150d7	[NFC][ScalarEvolution] Add a test showing SCEV failure to recognize 'urem' While InstCombine trivially converts that `srem` into a `urem`, it might happen later than wanted. SCEV should recognize this natively.	2020-06-28 20:35:02 +03:00
Roman Lebedev	141e845da5	[SCEV] Make SCEVAddExpr actually always return pointer type if there is pointer operand (PR46457) Summary: The added assertion fails on the added test without the fix. Reduced from test-suite/MultiSource/Benchmarks/MiBench/office-ispell/correct.c In IR, getelementptr, obviously, takes a pointer as its base, and returns a pointer. When creating an SCEV expression, SCEV operands are sorted in hope that it increases folding potential, and at the same time SCEVAddExpr's type is the type of the last(!) operand. Which means, in some exceedingly rare cases, pointer operand may happen to end up not being the last operand, and as a result SCEV for GEP will suddenly have a non-pointer return type. We should ensure that does not happen. In the end, actually storing the `Type *`, at the cost of increasing memory footprint of `SCEVAddExpr`, appears to be the solution. We can't just store a 'is a pointer' bit and create pointer type on the fly since we don't have data layout in getType(). Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=46457 | PR46457 ]] Reviewers: efriedma, mkazantsev, reames, nikic Reviewed By: efriedma Subscribers: hiraditya, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82633	2020-06-27 11:37:17 +03:00
Fangrui Song	4cd19a6e15	[BasicAA] Rename -disable-basicaa to -disable-basic-aa to be consistent with the canonical name "basic-aa"	2020-06-26 20:55:44 -07:00
Fangrui Song	f31811f2dc	[BasicAA] Rename deprecated -basicaa to -basic-aa Follow-up to D82607 Revert an accidental change (empty.ll) of D82683	2020-06-26 20:41:37 -07:00
Arthur Eubanks	0077988a6f	Fix full-store-partial-alias.ll Accidentally renamed -disable-basicaa -> -disable-basic-aa	2020-06-26 15:46:47 -07:00
Arthur Eubanks	feeed16a5f	[NewPM][BasicAA] basicaa -> basic-aa in Analysis/BasicAA Following https://reviews.llvm.org/D82607. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D82683	2020-06-26 14:58:01 -07:00
Arthur Eubanks	0c6bf90b56	[NewPM][BasicAA] Rename basicaa -> basic-aa, add alias Summary: BasicAA under the new pass manager is called "basic-aa", which fits more with the other AA names which almost always contain a dash. Keep an alias from basicaa -> basic-aa. Will change all references of "basicaa" to "basic-aa", then remove the alias. Makes check-llvm failures under NPM go from 2307 to 1867. Reviewers: asbirlea, ychen Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82607	2020-06-25 18:08:34 -07:00
David Green	f14457f5d8	[ARM] Split cast cost tests, and add masked load/store tests. NFC This file has grown quite large and could do with being split up. This splits away the load/store + cast tests into a separate file. Some masked load/store + cast tests have been added too, along with some extra load/store + fpcast tests.	2020-06-25 13:24:17 +01:00
Eli Friedman	90ad786947	[IR] Prefer scalar type for struct indexes in GEP constant expressions. This has two advantages: one, it's simpler, and two, it doesn't require heroic pattern matching with scalable vectors. Also includes a small fix to DataLayout to allow the scalable vector testcase to work correctly. Differential Revision: https://reviews.llvm.org/D82061	2020-06-23 16:14:36 -07:00
Vitaly Buka	5d964e262f	[StackSafety] Check variable lifetime We can't consider variable safe if out-of-lifetime access is possible. So if StackLifetime can't prove that the instruction always uses the variable when it's still alive, we consider it unsafe.	2020-06-22 03:45:29 -07:00
Vitaly Buka	8f592ed333	[StackSafety] Ignore unreachable instructions Usually DominatorTree provides this info, but here we use StackLifetime. The reason is that in the next patch StackLifetime will be used for actual lifetime checks and we can avoid forwarding the DominatorTree into this code.	2020-06-22 03:45:29 -07:00
Florian Hahn	9a7d80a32c	Revert "[BasicAA] Use known lower bounds for index values for size based check." This is potentially related to https://bugs.llvm.org/show_bug.cgi?id=46335 and causes a slight compile-time regression. Revert while investigating. This reverts commit `d99a1848c4`.	2020-06-20 10:06:05 +01:00
dfukalov	129388ddc4	[AMDGPU][CostModel] Add fneg cost estimation Summary: The estimation uses AMDGPUTargetLowering::isFNegFree() Reviewers: rampitec Reviewed By: rampitec Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82065	2020-06-19 17:31:35 +03:00
Vitaly Buka	306c257b00	[SafeStack,NFC] Print liveness for all instructions	2020-06-19 02:32:17 -07:00
Vitaly Buka	7b27c09f63	[StackSafety,NFC] Don't test terminators Code does not track terminators and does not expose them through the interface. State there is just a state of the last instruction or entry. So this information is just redundant and doesn't need to be tested.	2020-06-19 02:32:17 -07:00
Tyker	b7338fb1a6	[AssumeBundles] add canonicalisation to the assume builder Summary: this significantly reduces the number of assumes generated without affecting too much the information that is preserved. this improves the compile-time cost of enable-knowledge-retention significantly. Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: hiraditya, asbirlea, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79650	2020-06-19 10:32:26 +02:00
Vitaly Buka	fcd67665a8	[StackSafety] Add "Must Live" logic Summary: Extend StackLifetime with an option to calculate liveness where an alloca is only considered alive on basic block entry if all non-dead predecessors had it alive at terminators. Depends on D82043. Reviewers: eugenis Reviewed By: eugenis Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82124	2020-06-18 16:53:37 -07:00
Vitaly Buka	f672791e08	[StackSafety] Add pass for StackLifetime testing Summary: lifetime.ll is a copy of SafeStack/X86/coloring2.ll Reviewers: eugenis Reviewed By: eugenis Subscribers: hiraditya, mgrang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82043	2020-06-18 16:34:18 -07:00
Paul Walker	4612f39120	[SVE] Add flag to specify SVE register size, using this to calculate legal vector types. Adds aarch64-sve-vector-bits-{min,max} to allow the size of SVE data registers (in bits) to be specified. This allows the code generator to make assumptions it normally couldn't. As a starting point this information is used to mark fixed length vector types that can fit within the specified size as legal. Reviewers: rengolin, efriedma Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80384	2020-06-18 12:11:16 +00:00
Sameer Sahasrabuddhe	7aad220795	[DA] conservatively mark the join of every divergent branch For a loop, a join block is a block that is reachable along multiple disjoint paths from the exiting block of a loop. If the exit condition of the loop is divergent, then such join blocks must also be marked divergent. This currently fails in some cases because not all join blocks are identified correctly. The workaround is to conservatively mark every join block of any branch (not necessarily the exiting block of a loop) as divergent. https://bugs.llvm.org/show_bug.cgi?id=46372 Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D81806	2020-06-18 17:39:20 +05:30
Christopher Tetreault	8819202dfd	[SVE] Eliminate bad VectorType::getNumElements() calls from ConstantFold Summary: Assume all usages of this function are explicitly fixed-width operations and cast to FixedVectorType Reviewers: efriedma, sdesmalen, c-rhodes, majnemer, dblaikie Reviewed By: sdesmalen Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80262	2020-06-17 14:19:56 -07:00
Sameer Sahasrabuddhe	d3963b3a5f	[DA] propagate loop live-out values that get used in a branch Values that are uniform within a loop but appear divergent to uses outside the loop are "tainted" so that such uses are marked divergent. But if such a use is a branch, then its divergence needs to be propagated. The simplest way to do that is to put the branch back in the main worklist so that it is processed appropriately. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D81822	2020-06-17 09:21:00 +05:30
Tyker	d7deef1206	Revert "[AssumeBundles] add canonicalisation to the assume builder" This reverts commit `90c50cad19`.	2020-06-16 14:34:55 +02:00
Tyker	90c50cad19	[AssumeBundles] add canonicalisation to the assume builder Summary: this reduces significantly the number of assumes generated without affecting too much the information that is preserved. this improves the compile-time cost of enable-knowledge-retention significantly. Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: hiraditya, asbirlea, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79650	2020-06-16 13:12:35 +02:00
Sam Parker	2596da3174	[CostModel] getCFInstrCost in getUserCost. Have BasicTTI call the base implementation so that both agree on the default behaviour, with the default being a cost of '1'. This has required an X86 specific implementation as it seems to be very reliant on those instructions being free. Changes are also made to AMDGPU so that their implementations distinguish between cost kinds, so that the unrolling isn't affected. PowerPC also has its own implementation to prevent changes to the reg-usage vectorizer test. The cost model test changes now reflect that ret instructions are not generally free. Differential Revision: https://reviews.llvm.org/D79164	2020-06-15 09:28:46 +01:00
David Green	7507186b94	[ARM] Additional cast cost tests. This adds additional cast cost tests useful for MVE, notably around half types.	2020-06-14 14:30:07 +01:00
Vitaly Buka	c1e47b47f8	[StackSafety] Run ThinLTO Summary: ThinLTO linking runs dataflow processing on collected function parameters. Then StackSafetyGlobalInfoWrapperPass in ThinLTO backend will run as usual looking up to external symbol in the summary if needed. Depends on D80985. Reviewers: eugenis, pcc Reviewed By: eugenis Subscribers: inglorion, hiraditya, steven_wu, dexonsmith, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D81242	2020-06-12 18:11:29 -07:00
Vitaly Buka	e6ce0dc5de	[StackSafety,NFC] Extract addOverflowNever	2020-06-12 17:42:32 -07:00
Vitaly Buka	999307323a	[StackSafety] Fix byval handling We don't need to process parameters which are marked as byval as we are not going to pass interested allocas without copying. If we pass value into byval argument, we just handle that as Load of corresponding type and stop that branch of analysis.	2020-06-11 20:58:36 -07:00
Alina Sbirlea	519b019a0a	Verify MemorySSA after all updates. Verify after completing all updates. Resolves PR46275.	2020-06-11 18:48:41 -07:00
Simon Pilgrim	28947bc23c	[CostModel][X86] Add broadcast costs for vXi1 bool vectors Doesn't mean much on non-AVX512 targets but better to keep with the other shuffles	2020-06-10 15:27:15 +01:00
Roman Lebedev	c868335e24	[SCEV] ScalarEvolution::createSCEV(): clarify no-wrap flag propagation for shift by bitwidth-1 Summary: There was this comment here previously: ``` - // It is currently not resolved how to interpret NSW for left - // shift by BitWidth - 1, so we avoid applying flags in that - // case. Remove this check (or this comment) once the situation - // is resolved. See - // http://lists.llvm.org/pipermail/llvm-dev/2015-April/084195.html - // and http://reviews.llvm.org/D8890 . ``` But langref was fixed in rL286785, and the behavior is pretty obvious: http://volta.cs.utah.edu:8080/z/MM4WZP ^ nuw can always be propagated. nsw can be propagated if either nuw is specified, or the shift is by less than bitwidth-1. This mimics similar D81189 Reassociate change, alive2 is happy about that one. I'm not sure `NUW` isn't being printed, but that seems unrelated. Reviewers: mkazantsev, reames, sanjoy, nlopes, craig.topper, efriedma Reviewed By: efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81243	2020-06-06 13:02:07 +03:00
Philip Reames	32c09d527c	[Tests] Migrate a number of tests to gc-live bundle representation	2020-06-05 16:44:04 -07:00
Roman Lebedev	39e3683534	[NFC][SCEV] Add test with 'or' with no common bits set	2020-06-05 12:18:15 +03:00
Roman Lebedev	39e3c92410	[NFC][SCEV] Some tests for shifts by bitwidth-2/bitwidth-1 w/ no-wrap flags	2020-06-05 11:45:09 +03:00
Vitaly Buka	6dd738e2f0	[StackSafety,NFC] Switch tests to aarch64	2020-06-05 00:24:02 -07:00
Yevgeny Rouban	dcfa78a4cc	Extend InvokeInst !prof branch_weights metadata to unwind branches Allow InvokeInst to have the second optional prof branch weight for its unwind branch. InvokeInst is a terminator with two successors. It might have its unwind branch taken many times. If so the BranchProbabilityInfo unwind branch heuristic can be inaccurate. This patch allows a higher accuracy calculated with both branch weights set. Changes: - A new section about InvokeInst is added to the BranchWeightMetadata page. It states the old information that was missing from the doc and adds new information about the second branch weight. - Verifier is changed to allow either 1 or 2 branch weights for InvokeInst. - A new test is written for BranchProbabilityInfo to demonstrate the main improvement of the simple fix in calcMetadataWeights(). - Several new testcases are created for Inliner. Those check that both weights are accounted for invoke instruction weight calculation. - PGOUseFunc::setBranchWeights() is fixed to be applicable to InvokeInst. Reviewers: davidxl, reames, xur, yamauchi Tags: #llvm Differential Revision: https://reviews.llvm.org/D80618	2020-06-04 15:37:15 +07:00
Jay Foad	c27214c234	[AMDGPU] Fold llvm.amdgcn.cos and llvm.amdgcn.sin intrinsics (fix tests) Try to fix Windows buildbots.	2020-06-03 11:40:52 +01:00
Jay Foad	c823cfde21	[AMDGPU] Fold llvm.amdgcn.cos and llvm.amdgcn.sin intrinsics Differential Revision: https://reviews.llvm.org/D80702	2020-06-03 09:34:22 +01:00
Vitaly Buka	d3b7f90d00	[StackSafety] Skip non-pointer parameters Summary: Depends on D80908. Reviewers: eugenis, pcc Reviewed By: eugenis Subscribers: hiraditya, steven_wu, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80956	2020-06-03 01:16:39 -07:00
Vitaly Buka	232d348c6e	[MTE] Convert StackSafety into analysis This lets us remove !stack-safe metadata and better control when to perform StackSafety analysis. Reviewers: eugenis Subscribers: hiraditya, steven_wu, dexonsmith, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D80771	2020-06-02 16:08:14 -07:00
Vitaly Buka	fc07c1af69	[StackSafety] Delete useless test	2020-06-02 16:08:14 -07:00
Sam Parker	e70cf280f8	[NFC][ARM][AArch64] Test runs Add code size tests runs for memory ops for both architectures.	2020-06-02 09:05:30 +01:00
Yevgeny Rouban	07239c736a	[BranchProbabilityInfo] Proportional distribution of reachable probabilities When fixing probability of unreachable edges in BranchProbabilityInfo::calcMetadataWeights() proportionally distribute remainder probability over the reachable edges. The old implementation distributes the remainder probability evenly. See examples in the fixed tests. Reviewers: yamauchi, ebrevnov Tags: #llvm Differential Revision: https://reviews.llvm.org/D80611	2020-06-02 12:06:52 +07:00
Florian Hahn	d99a1848c4	[BasicAA] Use known lower bounds for index values for size based check. Currently, BasicAA does not exploit information about value ranges of indexes. For example, consider the 2 pointers %a = %base and %b = %base + %stride below, assuming they are used to access 4 elements. If we know that %stride >= 4, we know the accesses do not alias. If %stride is a constant, BasicAA currently gets that. But if the >= 4 constraint is encoded using an assume, it misses the NoAlias. This patch extends DecomposedGEP to include an additional MinOtherOffset field, which tracks the constant offset similar to the existing OtherOffset, with the difference that it also includes non-negative lower bounds on the range of the index value. When checking if the distance between 2 accesses exceeds the access size, we can use this improved bound. For now this is limited to using non-negative lower bounds for indices, as this conveniently skips cases where we do not have a useful lower bound (because it is not constrained). We potentially miss out in cases where the lower bound is constrained but negative, but that can be exploited in the future. Reviewers: sanjoy, hfinkel, reames, asbirlea Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D76194	2020-05-30 16:20:42 +01:00
David Green	a01c0049b1	[ConstantFolding] Constant folding for integer vector reduce intrinsics This adds constant folding for all the integer vector reduce intrinsics, providing that the argument is a constant vector. zeroinitializer always produces 0 for all intrinsics, and other values can be handled with APInt operators. Differential Revision: https://reviews.llvm.org/D80516	2020-05-29 17:58:42 +01:00
Philip Reames	27304b1737	[Tests] Switch a few statepoint tests to using operand bundles We've started (D80598) the process of migrating away from the inline operand lists in statepoints to using explicit operand bundles. Update a few tests to reflect the new preference. More to come, these were simply the ones outside any obvious grouping.	2020-05-28 14:36:05 -07:00
Vitaly Buka	892c71a5bb	[StackSafety] Don't run dataflow on allocas We need to process only parameters. Allocas access can be calculated afterwards. Also don't create fake function for aliases and just resolve them on initialization.	2020-05-28 13:32:57 -07:00
Vitaly Buka	f6383643d9	[StackSafety] Bailout on some function calls Don't miss values used in calls outside regular argument list.	2020-05-27 02:48:42 -07:00
Vitaly Buka	06a07dd608	[StackSafety] Fix formatting in the test	2020-05-27 02:48:41 -07:00
Vitaly Buka	b101c6251a	[StackSafety] Ignore some use of values We should ignore values used in MemTransferInst as anything other than a src/dst argument.	2020-05-27 02:48:41 -07:00
Vitaly Buka	32a1f60d11	[StackSafety] Use SCEV to find mem operation length	2020-05-26 23:22:37 -07:00
Vitaly Buka	d0f1f5adfa	[StackSafety] Use getSignedRange for offsets	2020-05-26 23:22:36 -07:00
Vitaly Buka	b5ae70046b	[StackSafety] Simplify SCEVRewriteVisitor Probably NFC.	2020-05-26 18:09:43 -07:00
Sam Parker	792575ff32	[NFC][ARM][AArch64] More code size tests Add analysis runs for icmp, fcmp and select instructions.	2020-05-26 14:47:02 +01:00
Sam Parker	c5bbc8dd6d	[NFC][ARM] Fix for previous commit Actually analyse code-size for the size runs...	2020-05-26 10:45:35 +01:00
Sam Parker	48cdbd081c	[NFC][ARM] Add code size analysis tests Add code size runs for the cast costs.	2020-05-26 10:30:43 +01:00
Sam Parker	64cfb8a864	[NFC][ARM] Add intrinsic code size runs Add code size analysis of arithmetic intrinsics.	2020-05-26 09:41:54 +01:00
Sam Parker	1f72d5880e	[CostModel] Check for free intrinsics in BasicTTI Recommitting part of "[CostModel] Unify Intrinsic Costs." `de71def3f5` Now that the 'free' intrinsic information has been sunk to the lowest level, query the base implementation in BasicTTI before doing anything else. I suspect this is the change that was causing the main changes, particularly the large effects on debug builds. Differential Revision: https://reviews.llvm.org/D80012	2020-05-26 08:37:13 +01:00
Denis Antrushin	5451289aba	[SCEV] Constant fold MultExpr before applying depth limit. Summary: Users of SCEV reasonably assume that multiplication of two constant SCEVs will in turn be constant. However, that is not always the case: First, we can get here with reached depth limit, and will create MultExpr SCEV `C1 * C2` and cache it. Then, we can get here with the same operands, but with small depth level. But this time we will find existing MultExpr SCEV and return it, instead of expected constant SCEV. This patch changes getMultExpr to not apply depth limit to all constant operands expression, allowing them to be folded. Reviewers: reames, mkazantsev Subscribers: hiraditya, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79893	2020-05-22 18:34:32 +03:00
Sam Parker	fb3ba38021	[CostModel] Remove getExtCost This has not been implemented by any backends which appear to cover the functionality through getCastInstrCost. Sink what there is in the default implementation into BasicTTI. Differential Revision: https://reviews.llvm.org/D78922	2020-05-21 07:18:06 +01:00
Yevgeny Rouban	8138487468	[BranchProbabilityInfo] Set edge probabilities at once and fix calcMetadataWeights() Hide the method that allows setting probability for particular edge and introduce a public method that sets probabilities for all outgoing edges at once. Setting individual edge probability is error prone. Moreover, it is difficult to check that the total probability is 1.0 because there is no easy way to know when the user finished setting all the probabilities. Related bug is fixed in BranchProbabilityInfo::calcMetadataWeights(). Changing unreachable branch probabilities to raw(1) and distributing the rest (oldProbability - raw(1)) over the reachable branches could introduce total probability inaccuracy bigger than 1/numOfBranches. Reviewers: yamauchi, ebrevnov Tags: #llvm Differential Revision: https://reviews.llvm.org/D79396	2020-05-21 12:52:37 +07:00
Eli Friedman	f26bdb539e	Make Value::getPointerAlignment() return an Align, not a MaybeAlign. If we don't know anything about the alignment of a pointer, Align(1) is still correct: all pointers are at least 1-byte aligned. Included in this patch is a bugfix for an issue discovered during this cleanup: pointers with "dereferenceable" attributes/metadata were assumed to be aligned according to the type of the pointer. This wasn't intentional, as far as I can tell, so Loads.cpp was fixed to stop making this assumption. Frontends may need to be updated. I updated clang's handling of C++ references, and added a release note for this. Differential Revision: https://reviews.llvm.org/D80072	2020-05-20 16:37:20 -07:00
Nikita Popov	5fae613a4f	[LVI] Don't require DominatorTree in LVI (NFC) After D76797 the dominator tree is no longer used in LVI, so we can remove it as a pass dependency, and also get rid of the dominator tree enabling/disabling logic in JumpThreading. Apart from cleaning up the code, this also clarifies LVI cache consistency, in that the LVI cache can no longer depend on whether the DT was or wasn't enabled due to pending DT updates at any given time. Differential Revision: https://reviews.llvm.org/D76985	2020-05-19 20:21:46 +02:00
Eli Friedman	11aa3707e3	StoreInst should store Align, not MaybeAlign This is D77454, except for stores. All the infrastructure work was done for loads, so the remaining changes necessary are relatively small. Differential Revision: https://reviews.llvm.org/D79968	2020-05-15 12:26:58 -07:00
Nikita Popov	f89f7da999	[IR] Convert null-pointer-is-valid into an enum attribute The "null-pointer-is-valid" attribute needs to be checked by many pointer-related combines. To make the check more efficient, convert it from a string into an enum attribute. In the future, this attribute may be replaced with data layout properties. Differential Revision: https://reviews.llvm.org/D78862	2020-05-15 19:41:07 +02:00
Sam Parker	0ef62fc25d	[NFC][ARM] Intrinsic CostModel Tests Add throughput tests for saturating, overflowing and reduction operations.	2020-05-15 13:38:42 +01:00
Stanislav Mekhanoshin	184b383457	Add v16f64 value type We need to use it to handle <16 x double> indirect indexes in the AMDGPU BE. The only visible change from adding it is in ARM cost model. To me it looks reasonable. With doubling a vector size it quadruples the cost up to the size 8 and then it did only double it. Now it also quadruples, which seems a logical progression to me. Actual AMDGPU code is to follow, this is a common part, plus load/store legalization in the AMDGPU BE not to break what works now. Differential Revision: https://reviews.llvm.org/D79952	2020-05-14 14:28:00 -07:00
Eli Friedman	4532a50899	Infer alignment of unmarked loads in IR/bitcode parsing. For IR generated by a compiler, this is really simple: you just take the datalayout from the beginning of the file, and apply it to all the IR later in the file. For optimization testcases that don't care about the datalayout, this is also really simple: we just use the default datalayout. The complexity here comes from the fact that some LLVM tools allow overriding the datalayout: some tools have an explicit flag for this, some tools will infer a datalayout based on the code generation target. Supporting this properly required plumbing through a bunch of new machinery: we want to allow overriding the datalayout after the datalayout is parsed from the file, but before we use any information from it. Therefore, IR/bitcode parsing now has a callback to allow tools to compute the datalayout at the appropriate time. Not sure if I covered all the LLVM tools that want to use the callback. (clang? lli? Misc IR manipulation tools like llvm-link?). But this is at least enough for all the LLVM regression tests, and IR without a datalayout is not something frontends should generate. This change had some sort of weird effects for certain CodeGen regression tests: if the datalayout is overridden with a datalayout with a different program or stack address space, we now parse IR based on the overridden datalayout, instead of the one written in the file (or the default one, if none is specified). This broke a few AVR tests, and one AMDGPU test. Outside the CodeGen tests I mentioned, the test changes are all just fixing CHECK lines and moving around datalayout lines in weird places. Differential Revision: https://reviews.llvm.org/D78403	2020-05-14 13:03:50 -07:00
Sam Parker	6bbad7285c	[CostModel] Modify BasicTTI getCastInstrCost Fix the assumption that all bitcasts of the same type sizes are free. We now only assume that bitcasts between ints and ptrs of the same size are free. This allows TTImpl to just call the concrete implementation of getCastInstrCost. Differential Revision: https://reviews.llvm.org/D78918	2020-05-13 07:26:08 +01:00
Juneyoung Lee	e5f602d82c	[ValueTracking] Let propagatesPoison support binops/unaryops/cast/etc. Summary: This patch makes propagatesPoison be more accurate by returning true on more bin ops/unary ops/casts/etc. The changed test in ScalarEvolution/nsw.ll was introduced by `a19edc4d15` . IIUC, the goal of the tests is to show that iv.inc's SCEV expression still has no-overflow flags even if the loop isn't in the wanted form. It becomes more accurate with this patch, so I think this is okay. Reviewers: spatel, lebedev.ri, jdoerfert, reames, nikic, sanjoy Reviewed By: spatel, nikic Subscribers: regehr, nlopes, efriedma, fhahn, javed.absar, llvm-commits, hiraditya Tags: #llvm Differential Revision: https://reviews.llvm.org/D78615	2020-05-13 02:51:42 +09:00
Sam Parker	f1f8cffce4	[NFC][AArch64] More casts tests... Don't use truncs as users because sometimes they're free too.	2020-05-12 13:06:17 +01:00
Sam Parker	e114bdf072	[NFC][AArch64] More cast cost tests Add truncating stores and casts with users.	2020-05-12 11:32:52 +01:00
Sam Parker	b4a8091a11	[ARM][CostModel] Improve getCastInstrCost - Specifically check for sext/zext users which have 'long' form NEON instructions. - Add more entries to the table for sext/zexts so that we can report more accurately the number of vmovls required for NEON. - Pass the instruction to the pass implementation. Differential Revision: https://reviews.llvm.org/D79561	2020-05-12 10:32:20 +01:00
Sam Parker	1952c86d61	[AArch64][CostModel] getCastInstrCost Pass the instruction to the base implementation. Differential Revision: https://reviews.llvm.org/D79562	2020-05-12 10:02:29 +01:00
Sam Parker	494c7ecef9	[NFC][AArch64] Update tests Add cost model tests for extending loads.	2020-05-12 08:49:05 +01:00
Tyker	821a0f23d8	[AssumeBundles] Prevent generation of some redundant assumes Summary: with this patch the assume salvageKnowledge will not generate assume if all knowledge is already available in an assume with valid context. assume builder can also in some cases update an existing assume with better information. Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78014	2020-05-10 19:23:59 +02:00
zoecarver	f65f566aeb	Re-commit: Mark values as trivially dead when their only use is a start or end lifetime intrinsic. Summary: If the only use of a value is a start or end lifetime intrinsic then mark the intrinsic as trivially dead. This should allow for that value to then be removed as well. Currently, this only works for allocas, globals, and arguments. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79355	2020-05-08 12:24:10 -07:00
Nikita Popov	5a2265647e	Reapply [InstSimplify] Remove known bits constant folding No changes relative to last time, but after a mitigation for an AMDGPU regression landed. --- If SimplifyInstruction() does not succeed in simplifying the instruction, it will compute the known bits of the instruction in the hope that all bits are known and the instruction can be folded to a constant. I have removed a similar optimization from InstCombine in D75801, and would like to drop this one as well. On average, we spend ~1% of total compile-time performing this known bits calculation. However, if we introduce some additional statistics for known bits computations and how many of them succeed in simplifying the instruction we get (on test-suite): instsimplify.NumKnownBits: 216 instsimplify.NumKnownBitsComputed: 13828375 valuetracking.NumKnownBitsComputed: 45860806 Out of ~14M known bits calculations (accounting for approximately one third of all known bits calculations), only 0.0015% succeed in producing a constant. Those cases where we do succeed to compute all known bits will get folded by other passes like InstCombine later. On test-suite, only lencod.test and GCC-C-execute-pr44858.test show a hash difference after this change. On lencod we see an improvement (a loop phi is optimized away), on the GCC torture test a regression (a function return value is determined only after IPSCCP, preventing propagation from a noinline function.) There are various regressions in InstSimplify tests. However, all of these cases are already handled by InstCombine, and corresponding tests have already been added there. Differential Revision: https://reviews.llvm.org/D79294	2020-05-08 10:24:53 +02:00
Sam Parker	751da4d596	[NFC][AArch64] Add test Add cost model test for cast operations.	2020-05-07 13:16:03 +01:00
Alina Sbirlea	8e911545d6	[MemorySSA] Make MemoryLocation unknown when phi translation cannot be performed. Summary: When phi translation cannot be performed, be conservative and make the MemoryLocation unknown. Reviewers: george.burgess.iv Subscribers: Prazek, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79386	2020-05-05 13:32:32 -07:00
Nikita Popov	46ee652c70	Revert "[InstSimplify] Remove known bits constant folding" This reverts commit `08556afc54`. This breaks some AMDGPU tests.	2020-05-03 20:45:10 +02:00
Nikita Popov	08556afc54	[InstSimplify] Remove known bits constant folding If SimplifyInstruction() does not succeed in simplifying the instruction, it will compute the known bits of the instruction in the hope that all bits are known and the instruction can be folded to a constant. I have removed a similar optimization from InstCombine in D75801, and would like to drop this one as well. On average, we spend ~1% of total compile-time performing this known bits calculation. However, if we introduce some additional statistics for known bits computations and how many of them succeed in simplifying the instruction we get (on test-suite): instsimplify.NumKnownBits: 216 instsimplify.NumKnownBitsComputed: 13828375 valuetracking.NumKnownBitsComputed: 45860806 Out of ~14M known bits calculations (accounting for approximately one third of all known bits calculations), only 0.0015% succeed in producing a constant. Those cases where we do succeed to compute all known bits will get folded by other passes like InstCombine later. On test-suite, only lencod.test and GCC-C-execute-pr44858.test show a hash difference after this change. On lencod we see an improvement (a loop phi is optimized away), on the GCC torture test a regression (a function return value is determined only after IPSCCP, preventing propagation from a noinline function.) There are various regressions in InstSimplify tests. However, all of these cases are already handled by InstCombine, and corresponding tests have already been added there. Differential Revision: https://reviews.llvm.org/D79294	2020-05-03 20:26:58 +02:00
Nikita Popov	7cf0f8568c	[ValueTracking] Convert test to unit test (NFC) Test this directly, rather than going through InstSimplify.	2020-05-03 12:23:57 +02:00
Craig Topper	e39c7ab2b9	[CostModel][X86][ARM] Teach default implementation of getCastInstrCost to not add a split/join cost if source type and the destination type both have a SplitVector action If both the source and the destination need to be split then the two halves of the split operation are completely independent and don't need to be split or joined. So we don't need to assess a cost for the split or join. Differential Revision: https://reviews.llvm.org/D79111	2020-05-01 18:55:23 -07:00
Craig Topper	b938168aef	[X86] Lower the cost of v4i64->v4i32 truncate with avx512. We use the vpmovqd instruction which is a single uop. So the cost should be 1.	2020-05-01 11:09:37 -07:00
Craig Topper	6a1ad76dab	[X86] Don't return true from isTruncateFree for vectors Also fix some cost tables for vXi1 types to match the costs entries for the types they will be promoted to. Differential Revision: https://reviews.llvm.org/D79045	2020-04-30 16:43:35 -07:00
Craig Topper	ff66919020	[X86][CostModel] Bump the cost of vpermw/vpermt2b/vperm2w vpermw is 2 uops. vpermt2b/vpermt2w are two shuffle uops and a port 015 uop. Weirdly vpermb is a single uop. This patch bumps the cost to 2 for these operations. Maybe should go to 3 for the vpermt2*, but I've started conservative. I've also removed a few entries that were now the same as earlier subtargets or that I didn't think we really did. Like I don't think we extend v32i8 to v32i16, shuffle, and then truncate. Differential Revision: https://reviews.llvm.org/D79148	2020-04-30 11:32:25 -07:00
Evgeniy Brevnov	bb0842a3f1	[BPI] Incorrect probability reported in case of multiple edges. Summary: By design 'BranchProbabilityInfo::getEdgeProbability(const BasicBlock Src, const BasicBlock Dst) const' should return sum of probabilities over all edges from Src to Dst. Current implementation is buggy and returns 1/num_of_successors if probabilities are not explicitly set. Note current implementation of BPI printing has an issue as well and annotates each edge with sum of probabilities over all edges from one basic block to another. That's why 30% probability reported (instead of 10%) in the lit test. This is not urgent issue since only printing is affected. Note also current implementation assumes that either all or none edges have probabilities set. This is not the only place which uses such assumption. At least we should assert that in verifier. In addition we can think on a more robust API of BPI which would prevent situations. Reviewers: skatkov, yrouban, taewookoh Reviewed By: skatkov Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79071	2020-04-30 11:41:03 +07:00
Alina Sbirlea	161ccfe5ba	[MemorySSA] Pass DT to the upward iterator for proper PhiTranslation. Summary: A valid DominatorTree is needed to do PhiTranslation. Before this patch, a MemoryUse could be optimized to an access outside a loop, while the address it loads from is modified in the loop. This can lead to a miscompile. Reviewers: george.burgess.iv Subscribers: Prazek, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79068	2020-04-29 14:28:31 -07:00
Craig Topper	cff6686532	[X86] Lower the cost of v4i64->v4i32 and v8i64->v8i32 truncate with AVX We generate much better code these days than we used to. And we use the same sequence for AVX1 and AVX2 for these For v4i64->v4i32 we generate: vextractf128 xmm1, ymm0, 1 vshufps xmm0, xmm0, xmm1, 136 # xmm0 = xmm0[0,2],xmm1[0,2] And for v8i64->v8i32 we generate: vperm2f128 ymm2, ymm0, ymm1, 49 # ymm2 = ymm0[2,3],ymm1[2,3] vinsertf128 ymm0, ymm0, xmm1, 1 vshufps ymm0, ymm0, ymm2, 136 # ymm0 = ymm0[0,2],ymm2[0,2],ymm0[4,6],ymm2[4,6] Differential Revision: https://reviews.llvm.org/D79109	2020-04-29 13:21:44 -07:00
Sam Parker	e9d0f1c8ea	[NFC][ARM] Modify cost model test	2020-04-29 12:42:47 +01:00
Sam Parker	850bdefa65	[NFC][ARM] Add two cost model tests	2020-04-29 12:36:05 +01:00
Simon Pilgrim	090cae8491	[TTI] Add DemandedElts to getScalarizationOverhead The improvements to the x86 vector insert/extract element costs in D74976 resulted in the estimated costs for vector initialization and scalarization increasing higher than should be expected. This is particularly noticeable on pre-SSE4 targets where the availability of legal INSERT_VECTOR_ELT ops is more limited. This patch does 2 things: 1 - it implements X86TTIImpl::getScalarizationOverhead to more accurately represent the typical costs of a ISD::BUILD_VECTOR pattern. 2 - it adds a DemandedElts mask to getScalarizationOverhead to permit the SLP's BoUpSLP::getGatherCost to be rewritten to use it directly instead of accumulating raw vector insertion costs. This fixes PR45418 where a v4i8 (zext'd to v4i32) was no longer vectorizing. A future patch should extend X86TTIImpl::getScalarizationOverhead to tweak the EXTRACT_VECTOR_ELT scalarization costs as well. Reviewed By: @craig.topper Differential Revision: https://reviews.llvm.org/D78216	2020-04-29 12:00:38 +01:00
Craig Topper	59b9e6fe76	[X86] Update costs for truncates from less than 128-bit vectors to vXi1 on pre-avx512 targets vXi1 types are legalized by promoting, but the narrow vectors are legalized by widening. This results in some truncates turning into any_extends.	2020-04-28 11:35:41 -07:00
Craig Topper	d42192c50f	[X86][CostModel] Correct the costs for truncate to a mask register with avx512 I've modified isTruncateFree to get an accurate cost for types that need to be split. I'm planning to look into fixing it for all vectors, but need more cost cleanups first. Differential Revision: https://reviews.llvm.org/D78973	2020-04-28 10:39:36 -07:00
Craig Topper	9ea5cc8a25	[X86][CostModel] Add vXiY->vXi1 truncate tests to min-legal-vector-width.ll. NFC	2020-04-27 15:48:11 -07:00
Craig Topper	37ec709233	[X86][CostModel] Update truncate costs for some narrow vector cases to match their wider version. This updates v4i16->v4i8 with sse2 to match v8i16->v8i8. Update v2i16->v2i8 and v4i16->v4i8 with sse 4.1 to match v8i16->v8i8.	2020-04-27 13:47:48 -07:00
Craig Topper	bdbbed115f	[X86][CostModel] Update costs for vector truncate with avx512f/avx512bw. All avx512 truncate instructions except vXi64->vXi32 are 2 uops on port 5. So raise their costs to 2. Except when we have an earlier faster sequence like pshufb for 128 bit input vectors. Add a lower cost of 3 v16i16->v16i8 with avx512f where we can extend to v16i32 then truncate. And a cost of 2 for avx512bw with and without avx512vl. There we can use vpmovwb with either a ymm or zmm input. Both of these beat masking, splitting, and using packuswb which is our avx/avx2 codegen.	2020-04-27 12:00:24 -07:00
Craig Topper	5eff75d86a	[X86][CostModel] Improve costs for fp_to_uint/fp_to_sint for vXi8/vXi16/v2i32 results. Differential Revision: https://reviews.llvm.org/D78893	2020-04-27 10:35:15 -07:00
Craig Topper	8296bcf76f	[X86][CostModel] Fix typos in test. NFC	2020-04-26 21:17:38 -07:00
Craig Topper	5f2ea70980	[X86] Add cost model tests for conversions between <2 x float> and integers. For all but 2 x i32 we were starting from 4 x float.	2020-04-26 19:59:01 -07:00
Craig Topper	b9de62c2b6	[X86] Fix the cost of v16i1->v16i16 sext/zext on avx targets. Previously we were hitting the scalarization case in the default implementation.	2020-04-25 23:16:20 -07:00
Craig Topper	19cb26f517	[X86][CostModel] Improve costs for vXi1 sign_extend/zero_extend with avx512. With avx512 vXi1 is legal and uses k-registers with many custom cases for extending.	2020-04-25 23:16:20 -07:00
Craig Topper	084433702d	[X86][CostModel] Add sext/zext from vXi1 tests to min-legal-vector-width.ll. NFC We aren't properly costing extends from k-registers. I also added command lines without avx512bw to be able to show all the different extending strategies we have.	2020-04-25 23:15:40 -07:00
Craig Topper	061f330d7e	[X86] Add avx512vl to the truncate cost model test. NFC	2020-04-25 12:59:10 -07:00
Craig Topper	999058ba5e	[X86] Add cost model tests for truncating from v2i8/v4i8/v8i8/v16i8 to vXi1. NFC	2020-04-24 23:11:17 -07:00
Craig Topper	7664a0d282	[X86] Improve accuracy of cost for v16i64->v16i8 truncate with avx512. The 2 vpmovqds are only 1 uop each.	2020-04-24 19:13:55 -07:00
Craig Topper	03aa967c0d	[CostModel][X86][ARM] Teach getCastInstrCost to include the splitting factor when handling operations that type legalize to the same number of subvectors or scalar components Previously, we just always returned 1. But that ignores that we have to do the operation for each subvector or scalar component. Differential Revision: https://reviews.llvm.org/D78824	2020-04-24 13:36:26 -07:00
Tyker	42431da895	[AssumeBundles] Use assume bundles in isKnownNonZero Summary: Use nonnull and dereferenceable from an assume bundle in isKnownNonZero Reviewers: jdoerfert, nikic, lebedev.ri, reames, fhahn, sstefan1 Reviewed By: jdoerfert Subscribers: fhahn, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76149	2020-04-24 20:41:51 +02:00
Craig Topper	4cf73a3fc6	[CostModel][X86] Account for splitting cost when vector zext/sext type legalize to the same size vector.	2020-04-24 09:59:23 -07:00
Juneyoung Lee	aca335955c	[ValueTracking] Let analyses assume a value cannot be partially poison Summary: This is RFC for fixes in poison-related functions of ValueTracking. These functions assume that a value can be poison bitwisely, but the semantics of bitwise poison is not clear at the moment. Allowing a value to have bitwise poison adds complexity to reasoning about correctness of optimizations. This patch makes the analysis functions simply assume that a value is either fully poison or not, which has been used to understand the correctness of a few previous optimizations. The bitwise poison semantics seems to be only used by these functions as well. In terms of implementation, using value-wise poison concept makes existing functions do more precise analysis, which is what this patch contains. Reviewers: spatel, lebedev.ri, jdoerfert, reames, nikic, nlopes, regehr Reviewed By: nikic Subscribers: fhahn, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78503	2020-04-23 08:08:53 +09:00
Juneyoung Lee	5ceef26350	Revert "RFC: [ValueTracking] Let analyses assume a value cannot be partially poison" This reverts commit `80faa8c3af`.	2020-04-23 08:07:09 +09:00
Juneyoung Lee	80faa8c3af	RFC: [ValueTracking] Let analyses assume a value cannot be partially poison Summary: This is RFC for fixes in poison-related functions of ValueTracking. These functions assume that a value can be poison bitwisely, but the semantics of bitwise poison is not clear at the moment. Allowing a value to have bitwise poison adds complexity to reasoning about correctness of optimizations. This patch makes the analysis functions simply assume that a value is either fully poison or not, which has been used to understand the correctness of a few previous optimizations. The bitwise poison semantics seems to be only used by these functions as well. In terms of implementation, using value-wise poison concept makes existing functions do more precise analysis, which is what this patch contains. Reviewers: spatel, lebedev.ri, jdoerfert, reames, nikic, nlopes, regehr Reviewed By: nikic Subscribers: fhahn, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78503	2020-04-23 07:57:12 +09:00
Sam Parker	04ef154124	[NFC] Test changes Add some more targets for the ARM cost model tests and add some tests for icmps and bitcasts.	2020-04-22 08:28:52 +01:00
Eli Friedman	9b9454af8a	Require "target datalayout" to be at the beginning of an IR file. This will allow us to use the datalayout to disambiguate other constructs in IR, like load alignment. Split off from D78403. Differential Revision: https://reviews.llvm.org/D78413	2020-04-20 11:55:49 -07:00
Craig Topper	8dfb9627b7	[X86] Make v32i16/v64i8 legal types without avx512bw. Use custom splitting instead. This moves v32i16/v64i8 to a model consistent with how we treat integer types with avx1. This does change the ABI for types vXi16/vXi8 vectors larger than 512 bits to pass in multiple zmms instead of multiple ymms. We'd already hacked some code to make v64i8/v32i16 pass in zmm. Cost model is still a bit of a mess. In some place I tried to match existing behavior. But really we need to account for splitting and concating costs. Cost model for shuffles is especially pessimistic. Differential Revision: https://reviews.llvm.org/D76212	2020-04-15 12:17:18 -07:00
Simon Pilgrim	2f951e99c6	[CostModel][X86] Regenerate load_store.ll costs tests Add SSE + AVX512 targets Add some illegal type store tests	2020-04-15 11:54:39 +01:00
Tyker	1d2b76a8fc	[AssumeBundles] adapt GVN to assume bundles Summary: prevent GVN from removing assume bundles make GVN preserve information from removed instructions Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77405	2020-04-14 12:48:14 +02:00
Craig Topper	2f60fbce6c	[X86] Use a more realistic cost for truncate v16i64->v16i8 with avx512f. Still not great and we could probably codegen this better, but 11 was clearly ridiculous.	2020-04-13 21:09:43 -07:00
Craig Topper	071c64d68d	[X86] Add a more accurate truncate cost for v8i64->v8i8	2020-04-13 21:09:41 -07:00
Craig Topper	b37b1840eb	[X86] Add truncate cost model tests to min-legal-vector-width.ll for when we're avoiding 512 bit vectors.	2020-04-13 21:09:40 -07:00
Simon Pilgrim	353347288b	[CostModel][X86] Remove comments that begin with a filecheck prefix. Stop filecheck from confusing a general comment with a check.	2020-04-13 18:39:24 +01:00
Huihui Zhang	6c989d0248	[BasicAA] Fix aliasGEP/DecomposeGEPExpression for scalable type. Summary: Don't attempt to analyze the decomposed GEP for scalable type. GEP index scale is not compile-time constant for scalable type. Be conservative, return MayAlias. Explicitly call TypeSize::getFixedSize() to assert on places where scalable type doesn't make sense. Add unit tests to check functionality of -basicaa for scalable type. This patch is needed for D76944. Reviewers: sdesmalen, efriedma, spatel, bjope, ctetreau Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77828	2020-04-10 16:58:26 -07:00
Simon Pilgrim	91bc50c0d7	[CostModel][X86] Improve InsertElement costs for sub-128bit vectors If we're inserting into v2i8/v4i8/v8i8/v2i16/v4i16 style sub-128bit vectors ensure we don't use the SK_PermuteTwoSrc cost of the legalized value type - this is a followup to rG12c629ec6c59 which added equivalent sub-128bit shuffle costs	2020-04-10 14:55:46 +01:00
Craig Topper	5625e6ab37	[X86] Improve min/max reduction costs. This is similar to what I recently did for getArithmeticReductionCost. I'm trying to account for the narrowing from 512->256->128 as we go. I've also added a new helper method getMinMaxCost that tries to handle the cases where we have native min/max instructions and fall back to cmp+select when we don't. Differential Revision: https://reviews.llvm.org/D76634	2020-04-09 17:28:50 -07:00
Simon Pilgrim	12c629ec6c	[CostModel][X86] Add shuffle costs for some common sub-128bit vectors v2i8/v4i8/v8i8 + v2i16/v4i16 all show up in vectorizer code and by just using the legalized types (v16i8/v8i16) we're highly exaggerating the actual cost of the shuffle.	2020-04-09 19:57:06 +01:00
Simon Pilgrim	898e22908c	[MemorySSA] invariant-groups.ll - add missing check to fix issue reported on D77354	2020-04-08 15:18:04 +01:00
Sanjay Patel	a2bb19ca42	[x86] add size cost tests for casts and binops; NFC Shows bugs for div/rem/fdiv and possibly others.	2020-04-06 12:38:15 -04:00
Jonathan Roelofs	7c5d2bec76	[llvm] Fix missing FileCheck directive colons https://reviews.llvm.org/D77352	2020-04-06 09:59:08 -06:00
Simon Pilgrim	be84d2b5b7	[CostModel][X86] Add some insert subvector cost tests for vXf32/vXi32/vXi16/vXi8 types	2020-04-04 22:46:57 +01:00
Simon Pilgrim	6a57ba17c0	[CostModel][X86] Add shuffle cost tests for sub-128bit vectors	2020-04-04 13:08:25 +01:00
Simon Pilgrim	87fd686f6f	[CostModel][X86] Add insert/extract cost tests for sub-128bit vXi8/vXi16 vectors	2020-04-04 13:08:25 +01:00
Matt Arsenault	5660bb6bc9	AMDGPU: Remove denormal subtarget features Switch to using the denormal-fp-math/denormal-fp-math-f32 attributes.	2020-04-02 17:17:12 -04:00
Denis Antrushin	06c58f11a9	[SCEV] Use backedge SCEV of PHI only if its input is loop invariant For the PHI node %1 = phi [%A, %entry], [%X, %latch] it is incorrect to use SCEV of backedge val %X as an exit value of PHI unless %X is loop invariant. This is because exit value of %1 is value of %X at one-before-last iteration of the loop. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D73181	2020-03-31 18:39:24 +07:00
Sebastian Neubauer	5d3a69feca	[AMDGPU] New llvm.amdgcn.ballot intrinsic Add a new llvm.amdgcn.ballot intrinsic modeled on the ballot function in GLSL and other shader languages. It returns a bitfield containing the result of its boolean argument in all active lanes, and zero in all inactive lanes. This is intended to replace the existing llvm.amdgcn.icmp and llvm.amdgcn.fcmp intrinsics after a suitable transition period. Use the new intrinsic in the atomic optimizer pass. Differential Revision: https://reviews.llvm.org/D65088	2020-03-31 10:35:39 +02:00

2515 Commits