llvm-project

Author	SHA1	Message	Date
Wei Mi	80a0c97e07	[PM] keeping history when original SCC split and then merge into itself in the same round of SCC update. In https://reviews.llvm.org/rL309784, inline history was added to prevent infinite inlining across multiple runs of the inliner and SCC update, but the history is only kept when a new SCC is actually generated during SCC update. We found a case where an SCC can be split and then merged into itself in the same round of SCC update, so the same SCC is popped from UR.CWorklist and then added back immediately, without any new SCC being generated; that is why the existing patch cannot catch the infinite inline case. With this patch, even if no new SCC is generated, the inline history is kept as long as the current SCC appears in UR.CWorklist again. Differential Revision: https://reviews.llvm.org/D52915 llvm-svn: 345103	2018-10-23 23:29:45 +00:00
Vedant Kumar	503154615d	[HotColdSplitting] Attach MinSize to outlined code Outlined code is cold by assumption, so it makes sense to optimize it for minimal code size rather than performance. After r344869 moved the splitting pass to the end of the IR pipeline, this does not result in much of a code size reduction. This is probably because a comparatively small number of backend transforms make use of the MinSize hint. Running LNT on x86_64, I see that 33/1020 binaries shrink for a total of 919 bytes of TEXT reduction. I didn't measure a significant performance impact. Differential Revision: https://reviews.llvm.org/D53518 llvm-svn: 345072	2018-10-23 19:41:12 +00:00
Sanjay Patel	95790c546f	[InstCombine] use 'match' to simplify code There's probably some vector-with-undef-element pattern that shows an improvement, so this is probably not quite 'NFC'. This is the last step towards removing the fake binop queries for not/neg. Ie, there are no more uses of those functions in trunk. Fneg should follow. llvm-svn: 345050	2018-10-23 16:54:28 +00:00
Jordan Rupprecht	2fed6ac186	[DebugInfo][GlobalOpt] Fix -debugify for globalopt shrinking globals to booleans. Summary: TryToShrinkGlobalToBoolean, when possible, will split store <value> + load <value> into store <bool> + select <bool ? value : 0>. This preserves DebugLoc during that pass. Fixes PR37959. The test case here is the simplified .ll for: ``` static int foo; int bar() { foo = 5; return foo; } ``` Reviewers: dblaikie, gbedwell, aprantl Reviewed By: dblaikie Subscribers: mehdi_amini, JDevlieghere, dexonsmith, llvm-commits Tags: #debug-info Differential Revision: https://reviews.llvm.org/D53531 llvm-svn: 345046	2018-10-23 16:35:51 +00:00
Sanjay Patel	5b6b090cf2	[Reassociate] replace fake binop queries with 'match' API We need to update this code before introducing an 'fneg' instruction in IR, so we might as well kill off the integer neg/not queries too. This is no-functional-change-intended for scalar code and most vector code. For vectors, we can see that the 'match' API allows for undef elements in constants, so we optimize those cases better. Ideally, there would be a test for each code diff, but I don't see evidence of that for the existing code, so I didn't try very hard to come up with new vector tests for each code change. Differential Revision: https://reviews.llvm.org/D53533 llvm-svn: 345042	2018-10-23 15:55:06 +00:00
Simon Pilgrim	532a0f122e	[SLPVectorizer] Add basic support for mul/and/or/xor horizontal reductions Expand arithmetic reduction to include mul/and/or/xor instructions. This patch just fixes the SLPVectorizer - the effective reduction costs for AVX1+ are still poor (see rL344846) and will need to be improved before SLP sees this as a valid transform - but we can already see the effect on SSE2 tests. This partially helps PR37731, but doesn't fix it all as it still falls over on the extraction/reduction order for some reason. Differential Revision: https://reviews.llvm.org/D53473 llvm-svn: 345037	2018-10-23 15:13:09 +00:00
Sanjay Patel	747feb28e4	[InstCombine] use 'match' to handle vectors and simplify code This is another step towards completely removing the fake binop queries for not/neg/fneg. llvm-svn: 345036	2018-10-23 15:05:12 +00:00
Sanjay Patel	ad76c682c7	[InstCombine] swap select profile metadata when swapping select ops llvm-svn: 345034	2018-10-23 14:43:31 +00:00
Sanjay Patel	5141435d23	[SLSR] use 'match' to simplify code; NFC This pass could probably be modified slightly to allow vector splat transforms for practically no cost, but it only works on scalars for now. So the use of the newer 'match' API should make no functional difference. llvm-svn: 345030	2018-10-23 14:07:39 +00:00
Dorit Nuzman	da5dc13355	Leftover bits from https://reviews.llvm.org/D53420 that were accidentally left out of revision 344883 llvm-svn: 345021	2018-10-23 11:51:55 +00:00
Kostya Serebryany	af95597c3c	[hwasan] add stack frame descriptions. Summary: At compile-time, create an array of {PC,HumanReadableStackFrameDescription} for every function that has an instrumented frame, and pass this array to the run-time at the module-init time. Similar to how we handle pc-table in SanitizerCoverage. The run-time is dummy, will add the actual logic in later commits. Reviewers: morehouse, eugenis Reviewed By: eugenis Subscribers: srhines, llvm-commits, kubamracek Differential Revision: https://reviews.llvm.org/D53227 llvm-svn: 344985	2018-10-23 00:50:40 +00:00
Sanjay Patel	dd1c3df72d	[Reassociate] add 'using namespace' to reduce bloat; NFC llvm-svn: 344959	2018-10-22 21:37:02 +00:00
Teresa Johnson	f431a2f261	[hot-cold-split] Add opt remark on success Summary: Emit optimization remark on successful hot cold split. Reviewers: sebpop, hiraditya Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D53512 llvm-svn: 344938	2018-10-22 19:06:42 +00:00
Benjamin Kramer	3e778165d6	[CGProfile] Turn constant-size SmallVector into array No functionality change. llvm-svn: 344893	2018-10-22 10:51:34 +00:00
Dorit Nuzman	3ec99fe21b	[IAI,LV] Avoid creating a scalar epilogue due to gaps in interleave-groups when optimizing for size LV is careful to respect -Os and not to create a scalar epilog in all cases (runtime tests, trip-counts that require a remainder loop) except for peeling due to gaps in interleave-groups. This patch fixes that; -Os will now have us invalidate such interleave-groups and vectorize without an epilog. The patch also removes a related FIXME comment that is now obsolete, and was also inaccurate: "FIXME: return None if loop requiresScalarEpilog(<MaxVF>), or look for a smaller MaxVF that does not require a scalar epilog." (requiresScalarEpilog() has nothing to do with VF). Reviewers: Ayal, hsaito, dcaballe, fhahn Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D53420 llvm-svn: 344883	2018-10-22 06:17:09 +00:00
Aditya Kumar	d9e2e383a9	Schedule Hot Cold Splitting pass after most optimization passes Summary: In the new+old pass manager, hot cold splitting was scheduled too early. Thanks to Vedant for pointing this out. Reviewers: sebpop, vsk Reviewed By: sebpop, vsk Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D53437 llvm-svn: 344869	2018-10-21 18:11:56 +00:00
Sanjay Patel	0522b0da31	[InstCombine] use 'match' to simplify code; NFC llvm-svn: 344855	2018-10-20 17:15:57 +00:00
Sanjay Patel	ec572ade20	[InstCombine] make code more flexible with lambda; NFC I couldn't tell from svn history when these checks were added, but it pre-dates the split of instcombine into its own directory at rL92459. The motivation for changing the check is partly shown by the code in PR34724: https://bugs.llvm.org/show_bug.cgi?id=34724 There are also existing regression tests for SLPVectorizer with sequences of extract+insert that are likely assumed to become shuffles by the vectorizer cost models. llvm-svn: 344854	2018-10-20 16:58:27 +00:00
Sanjay Patel	729c4362cf	[InstCombine] add explanatory comment for strange vector logic; NFC llvm-svn: 344852	2018-10-20 16:25:55 +00:00
Evandro Menezes	164ea101ab	[NFC][InstCombine] Undo stray change Undo stray change introduced by r344725. llvm-svn: 344814	2018-10-19 20:57:45 +00:00
Thomas Lively	c339250e12	[InstCombine] InstCombine and InstSimplify for minimum and maximum Summary: Depends on D52765 Reviewers: aheejin, dschuff Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52766 llvm-svn: 344799	2018-10-19 19:01:26 +00:00
Sanjay Patel	70daf85bc2	[InstCombine] use m_Neg() in dyn_castNegVal() to match vectors with undef elts llvm-svn: 344793	2018-10-19 17:54:53 +00:00
Fangrui Song	2e83b2e9ee	Use llvm::{all,any,none}_of instead std::{all,any,none}_of. NFC llvm-svn: 344774	2018-10-19 06:12:02 +00:00
Ayal Zaks	b0b5312e67	[LV] Fold tail by masking to vectorize loops of arbitrary trip count under opt for size When optimizing for size, a loop is vectorized only if the resulting vector loop completely replaces the original scalar loop. This holds if no runtime guards are needed, if the original trip-count TC does not overflow, and if TC is a known constant that is a multiple of the VF. The last two TC-related conditions can be overcome by 1. rounding the trip-count of the vector loop up from TC to a multiple of VF; 2. masking the vector body under a newly introduced "if (i <= TC-1)" condition. The patch allows loops with arbitrary trip counts to be vectorized under -Os, subject to the existing cost model considerations. It also applies to loops with small trip counts (under -O2) which are currently handled as if under -Os. The patch does not handle loops with reductions, live-outs, or w/o a primary induction variable, and disallows interleave groups. (Third, final and main part of -) Differential Revision: https://reviews.llvm.org/D50480 llvm-svn: 344743	2018-10-18 15:03:15 +00:00
Mikael Holmen	e3605d0f70	Add a emitUnaryFloatFnCall version that fetches the function name from TLI Summary: In several places in the code we use the following pattern: if (hasUnaryFloatFn(&TLI, Ty, LibFunc_tan, LibFunc_tanf, LibFunc_tanl)) { [...] Value Res = emitUnaryFloatFnCall(X, TLI.getName(LibFunc_tan), B, Attrs); [...] } In short, we check if there is a lib-function for a certain type, and then we _always_ fetch the name of the "double" version of the lib function and construct a call to the appropriate function, that we just checked exists, using that "double" name as a basis. This is of course a problem in cases where the target doesn't support the "double" version, but e.g. only the "float" version. In that case TLI.getName(LibFunc_tan) returns "", and emitUnaryFloatFnCall happily appends an "f" to "", and we erroneously end up with a call to a function called "f". To solve this, the above pattern is changed to if (hasUnaryFloatFn(&TLI, Ty, LibFunc_tan, LibFunc_tanf, LibFunc_tanl)) { [...] Value Res = emitUnaryFloatFnCall(X, &TLI, LibFunc_tan, LibFunc_tanf, LibFunc_tanl, B, Attrs); [...] } I.e instead of first fetching the name of the "double" version and then letting emitUnaryFloatFnCall() add the final "f" or "l", we let emitUnaryFloatFnCall() fetch the right name from TLI. Reviewers: eli.friedman, efriedma Reviewed By: efriedma Subscribers: efriedma, bjope, llvm-commits Differential Revision: https://reviews.llvm.org/D53370 llvm-svn: 344725	2018-10-18 06:27:53 +00:00
Chandler Carruth	60b2e054dc	[TI removal] Switch simple loop unswitch to `Instruction`. llvm-svn: 344719	2018-10-18 00:40:26 +00:00
Chandler Carruth	c6cad4251e	[TI removal] Switch NewGVN to directly use `Instruction`. llvm-svn: 344718	2018-10-18 00:39:46 +00:00
Chandler Carruth	c8eaea71c9	[TI removal] Use `Instruction` instead of `TerminatorInst` for a variable's type. llvm-svn: 344717	2018-10-18 00:39:18 +00:00
Chandler Carruth	8b7a8123dd	[TI removal] Update CodeExtractor to use Instruction directly. llvm-svn: 344716	2018-10-18 00:38:54 +00:00
Chandler Carruth	c1e3ee29a4	[TI removal] Switch ObjCARC code to directly use the nice range-based successors API or directly build the iterators out of the terminator instruction and avoid requiring a TerminatorInst variable. llvm-svn: 344715	2018-10-18 00:38:34 +00:00
Chandler Carruth	93cf2ea27a	[TI removal] Switch MergeFunctions to directly use Instruction API. llvm-svn: 344714	2018-10-18 00:37:37 +00:00
Nicolai Haehnle	0823050b9f	StructurizeCFG: Simplify inserted PHI nodes Summary: This improves subsequent divergence analysis in some cases. Change-Id: I5e95e7ec7fd3fa80d414d1a53a02fea23e3d67d3 Reviewers: arsenm, rampitec Subscribers: jvesely, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D53316 llvm-svn: 344697	2018-10-17 15:37:41 +00:00
Fedor Sergeev	c297e84b97	[LoopPredication] add some simple stats Just adding some useful statistics to LoopPredication pass which was lacking any of these. llvm-svn: 344681	2018-10-17 09:02:54 +00:00
Leonard Chan	423957ad3a	[Sanitizer][PassManager] Fix for failing ASan tests on arm-linux-gnueabihf Forgot to initialize the legacy pass in its constructor. Differential Revision: https://reviews.llvm.org/D53350 llvm-svn: 344659	2018-10-17 00:16:07 +00:00
Teresa Johnson	d2c234a4cc	[ThinLTO] Add importing stats to thin link Summary: Previously we could only get the number of imported functions and variables from the backend. This adds stats to the thin link where the importing is decided. Reviewers: wmi Subscribers: inglorion, dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D53337 llvm-svn: 344658	2018-10-16 23:49:50 +00:00
Jonathan Metzman	5eb8cba280	[SanitizerCoverage] Don't duplicate code to get section pointers Summary: Merge code used to get section start and section end pointers for SanitizerCoverage constructors. This includes code that handles getting the start pointers when targeting MSVC. Reviewers: kcc, morehouse Reviewed By: morehouse Subscribers: kcc, hiraditya Differential Revision: https://reviews.llvm.org/D53211 llvm-svn: 344657	2018-10-16 23:43:57 +00:00
David Bolvansky	7c7760da7e	[InstCombine] Cleanup libfunc attribute inferring Reviewers: efriedma Reviewed By: efriedma Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D53338 llvm-svn: 344645	2018-10-16 21:18:31 +00:00
Anna Thomas	6f732bfb79	[LV] Teach vectorizer about variant value store into uniform address Summary: Teach vectorizer about vectorizing variant value stores to uniform address. Similar to rL343028, we do not allow vectorization if we have multiple stores to the same uniform address. Cost model already has the change for considering the extract instruction cost for a variant value store. See added test cases for how vectorization is done. The patch also contains changes to the ORE messages. Reviewers: Ayal, mkuper, anemet, hsaito Subscribers: rkruppe, llvm-commits Differential Revision: https://reviews.llvm.org/D52656 llvm-svn: 344613	2018-10-16 15:46:26 +00:00
Sanjay Patel	bb3dd34e62	revert rL344609: [InstCombine] try harder to form select from logic ops I noticed a missing check and added it at rL344610, but there actually are codegen tests that will fail without that, so I'll edit those and submit a fixed patch with more tests. llvm-svn: 344612	2018-10-16 15:26:08 +00:00
Sanjay Patel	f6a7c8b1fc	[InstCombine] make sure type is integer before calling ComputeNumSignBits llvm-svn: 344610	2018-10-16 14:44:50 +00:00
Sanjay Patel	0c48c977b8	[InstCombine] try harder to form select from logic ops This is part of solving PR37549: https://bugs.llvm.org/show_bug.cgi?id=37549 The patterns shown here are a special case of something that we already convert to select. Using ComputeNumSignBits() catches that case (but not the more complicated motivating patterns yet). The backend has hooks/logic to convert back to logic ops if that's better for the target. llvm-svn: 344609	2018-10-16 14:35:21 +00:00
Max Kazantsev	9c90ec2fae	[NFC] Make LoopSafetyInfo abstract to allow alternative implementations llvm-svn: 344592	2018-10-16 08:31:05 +00:00
Max Kazantsev	8d56be7070	[NFC] Encapsulate work with BlockColors in LoopSafetyInfo llvm-svn: 344590	2018-10-16 08:07:14 +00:00
David Stenberg	c9163855dd	[DebugInfo][LCSSA] Rewrite pre-existing debug values outside loop Summary: Extend LCSSA so that debug values outside loops are rewritten to use the PHI nodes that the pass creates. This fixes PR39019. In that case, we ran LCSSA on a loop that was later on vectorized, which left us with something like this: for.cond.cleanup: %add.lcssa = phi i32 [ %add, %for.body ], [ %34, %middle.block ] call void @llvm.dbg.value(metadata i32 %add, ret i32 %add.lcssa for.body: %add = [...] br i1 %exitcond, label %for.cond.cleanup, label %for.body which later resulted in the debug.value becoming undef when removing the scalar loop (and the location would have probably been wrong for the vectorized case otherwise). As we now may need to query the AvailableVals cache more than once for a basic block, FindAvailableVals() in SSAUpdaterImpl is changed so that it updates the cache for blocks that we do not create a PHI node for, regardless of the block's number of predecessors. The debug value in the attached IR reproducer would not be properly rewritten without this. Debug values residing in blocks where we have not inserted any PHI nodes are currently left as-is by this patch. I'm not sure what should be done with those uses. Reviewers: mattd, aprantl, vsk, probinson Reviewed By: mattd, aprantl Subscribers: jmorse, gbedwell, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D53130 llvm-svn: 344589	2018-10-16 08:06:48 +00:00
Max Kazantsev	c8466f937c	[NFC] Turn isGuaranteedToExecute into a method llvm-svn: 344587	2018-10-16 06:34:53 +00:00
Lang Hames	8f9a2446e0	Change a TerminatorInst* to an Instruction* in HotColdSplitting.cpp. r344558 added an assignment to a TerminatorInst* from BasicBlock::getTerminatorInst(), but BasicBlock::getTerminatorInst() returns an Instruction* rather than a TerminatorInst* since r344504 so this fails to compile. Changing the variable to an Instruction* should get the bots building again. llvm-svn: 344566	2018-10-15 22:27:03 +00:00
Sebastian Pop	542e522b87	[hot-cold-split] fix static analysis of cold regions Make the code of blockEndsInUnreachable match the function blockEndsInUnreachable in CodeGen/BranchFolding.cpp. I have also added a note to make sure the code of this function will not be modified unless the back-end version is also modified. An early return before outlining has been added to avoid outlining the full function body when the first block in the function is marked cold. The static analysis of cold code has been amended to avoid marking the whole function as cold by back-propagation because the back-propagation would mark blocks with return statements as cold. The patch adds debug statements to help discover these problems. Differential Revision: https://reviews.llvm.org/D52904 llvm-svn: 344558	2018-10-15 21:43:11 +00:00
Vedant Kumar	15718a6190	[CodeExtractor] Erase debug intrinsics in outlined thunks (fix PR22900) Variable updates within the outlined function are invisible to debuggers. This could be improved by defining a DISubprogram for the new function. For the moment, simply erase the debug intrinsics instead. This fixes verifier failures about function-local metadata being used in the wrong function, seen while testing the hot/cold splitting pass. rdar://45142482 Differential Revision: https://reviews.llvm.org/D53267 llvm-svn: 344545	2018-10-15 19:22:20 +00:00
Chandler Carruth	e303c87e19	[TI removal] Make `getTerminator()` return a generic `Instruction`. This removes the primary remaining API producing `TerminatorInst` which will reduce the rate at which code is introduced trying to use it and generally make it much easier to remove the remaining APIs across the codebase. Also clean up some of the stragglers that the previous mechanical update of variables missed. Users of LLVM and out-of-tree code generally will need to update any explicit variable types to handle this. Replacing `TerminatorInst` with `Instruction` (or `auto`) almost always works. Most of these edits were made in prior commits using the perl one-liner: ``` perl -i -ple 's/TerminatorInst(\b.* = .*getTerminator)/Instruction\1/g' ``` This may also break some rare use cases where people overload for both `Instruction` and `TerminatorInst`, but these should be easily fixed by removing the `TerminatorInst` overload. llvm-svn: 344504	2018-10-15 10:42:50 +00:00
Chandler Carruth	52eaaf3ff8	[TI removal] Rework `InstVisitor` to support visiting instructions that are terminators without relying on the specific `TerminatorInst` type. This required cleaning up two users of `InstVisitor`s usage of `TerminatorInst` as well. llvm-svn: 344503	2018-10-15 10:10:54 +00:00
Chandler Carruth	edb12a838a	[TI removal] Make variables declared as `TerminatorInst` and initialized by `getTerminator()` calls instead be declared as `Instruction`. This is the biggest remaining chunk of the usage of `getTerminator()` that insists on the narrow type and so is an easy batch of updates. Several files saw more extensive updates where this would cascade to requiring API updates within the file to use `Instruction` instead of `TerminatorInst`. All of these were trivial in nature (pervasively using `Instruction` instead just worked). llvm-svn: 344502	2018-10-15 10:04:59 +00:00
Chandler Carruth	ae98759ec5	[TI removal] Remove `TerminatorInst` from GVN.h and GVN.cpp. This is the last interesting usage in all of LLVM's headers. The remaining usages in headers are the core typesystem bits (Core.h, instruction types, and InstVisitor) and as the return of `BasicBlock::getTerminator`. The latter is the big remaining API point that I'll remove after mass updates to user code. llvm-svn: 344501	2018-10-15 10:00:15 +00:00
Chandler Carruth	4a2d58e16a	[TI removal] Remove `TerminatorInst` from BasicBlockUtils.h This requires updating a number of .cpp files to adapt to the new API. I've just systematically updated all uses of `TerminatorInst` within these files to `Instruction` so that I won't have to touch them again in the future. llvm-svn: 344498	2018-10-15 09:34:05 +00:00
Chandler Carruth	b99a24689b	[TI removal] Remove TerminatorInst as an input parameter from all public LLVM APIs. There weren't very many. We still have the instruction visitor, and APIs with TerminatorInst as a return type or an output parameter. llvm-svn: 344494	2018-10-15 09:17:09 +00:00
Ayal Zaks	e567b5b526	[LV] Fix comments reported when not vectorizing single iteration loops; NFC Landing this as a separate part of https://reviews.llvm.org/D50480, being a seemingly unrelated change ([LV] Vectorizing loops of arbitrary trip count without remainder under opt for size). llvm-svn: 344483	2018-10-14 17:53:02 +00:00
Sanjay Patel	7181146c6c	[InstCombine] combine a shuffle and an extract subvector shuffle This is part of the missing IR-level folding noted in D52912. This should be ok as a canonicalization because the new shuffle mask can't be any more complicated than the existing shuffle mask. If there's some target where the shorter vector shuffle is not legal, it should just end up expanding to something like the pair of shuffles that we're starting with here. Differential Revision: https://reviews.llvm.org/D53037 llvm-svn: 344476	2018-10-14 15:25:06 +00:00
Dorit Nuzman	38bbf81ade	recommit 344472 after fixing build failure on ARM and PPC. llvm-svn: 344475	2018-10-14 08:50:06 +00:00
Dorit Nuzman	5118c68cde	revert 344472 due to failures. llvm-svn: 344473	2018-10-14 07:21:20 +00:00
Dorit Nuzman	8174368955	[IAI,LV] Add support for vectorizing predicated strided accesses using masked interleave-group The vectorizer currently does not attempt to create interleave-groups that contain predicated loads/stores; predicated strided accesses can currently be vectorized only using masked gather/scatter or scalarization. This patch makes predicated loads/stores candidates for forming interleave-groups during the Loop-Vectorizer's analysis, and adds the proper support for masked-interleave-groups to the Loop-Vectorizer's planning and transformation stages. The patch also extends the TTI API to allow querying the cost of masked interleave groups (which each target can control); Targets that support masked vector loads/stores may choose to enable this feature and allow vectorizing predicated strided loads/stores using masked wide loads/stores and shuffles. Reviewers: Ayal, hsaito, dcaballe, fhahn, javed.absar Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D53011 llvm-svn: 344472	2018-10-14 07:06:16 +00:00
Benjamin Kramer	c55e997556	Move some helpers from the global namespace into anonymous ones. llvm-svn: 344468	2018-10-13 22:18:22 +00:00
Sanjay Patel	47579b21e2	[InstCombine] fix complexity canonicalization with fake unary vector ops This is a preliminary step to avoid regressions when we add an actual 'fneg' instruction to IR. See D52934 and D53205. llvm-svn: 344458	2018-10-13 16:15:37 +00:00
David Bolvansky	e8b3bba717	[InstCombine] Fixed crash with aliased functions Summary: Fixes PR39177 Reviewers: spatel, jbuening Reviewed By: jbuening Subscribers: jbuening, llvm-commits Differential Revision: https://reviews.llvm.org/D53129 llvm-svn: 344454	2018-10-13 15:21:55 +00:00
Kostya Serebryany	bc504559ec	move GetOrCreateFunctionComdat to Instrumentation.cpp/Instrumentation.h Summary: GetOrCreateFunctionComdat is currently used in SanitizerCoverage, where it's defined. I'm planning to use it in HWASAN as well, so moving it into a common location. NFC Reviewers: morehouse Reviewed By: morehouse Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D53218 llvm-svn: 344433	2018-10-12 23:21:48 +00:00
Jonathan Metzman	0b94e88007	[SanitizerCoverage] Prevent /OPT:REF from stripping constructors Summary: Linking with the /OPT:REF linker flag when building COFF files causes the linker to strip SanitizerCoverage's constructors. Prevent this by giving the constructors WeakODR linkage and by passing the linker a directive to include sancov.module_ctor. Include a test in compiler-rt to verify libFuzzer can be linked using /OPT:REF Reviewers: morehouse, rnk Reviewed By: morehouse, rnk Subscribers: rnk, morehouse, hiraditya Differential Revision: https://reviews.llvm.org/D52119 llvm-svn: 344391	2018-10-12 18:11:47 +00:00
Max Moroz	4d010ca35b	[SanitizerCoverage] Make Inline8bit and TracePC counters dead stripping resistant. Summary: Otherwise, at least on Mac, the linker eliminates unused symbols which causes libFuzzer to error out due to a mismatch of the sizes of coverage tables. Issue in Chromium: https://bugs.chromium.org/p/chromium/issues/detail?id=892167 Reviewers: morehouse, kcc, george.karpenkov Reviewed By: morehouse Subscribers: kubamracek, llvm-commits Differential Revision: https://reviews.llvm.org/D53113 llvm-svn: 344345	2018-10-12 13:59:31 +00:00
Tim Northover	487780678f	SCCP: avoid caching DenseMap entry that might be invalidated. Later calls to getValueState might insert entries into the ValueState map and cause reallocation, invalidating a reference. llvm-svn: 344327	2018-10-12 09:01:59 +00:00
Eugene Leviant	eddf6b5df5	[ThinLTO] Don't import GV which contains blockaddress Differential revision: https://reviews.llvm.org/D53139 llvm-svn: 344325	2018-10-12 07:24:02 +00:00
Kostya Serebryany	d891ac9794	merge two near-identical functions createPrivateGlobalForString into one Summary: We have two copies of createPrivateGlobalForString (in asan and in esan). This change merges them into one. NFC Reviewers: vitalybuka Reviewed By: vitalybuka Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D53178 llvm-svn: 344314	2018-10-11 23:03:27 +00:00
Leonard Chan	64e21b5cfd	[PassManager/Sanitizer] Port of AddressSanitizer pass from legacy to new PassManager This patch ports the pass from the legacy pass manager to the new one to take advantage of the benefits of the new PM. This involved moving a lot of the declarations for `AddressSanitizer` to a header so that it can be publicly used via PassRegistry.def which I believe contains all the passes managed by the new PM. This patch essentially decouples the instrumentation from the legacy PM such that it can be used by both legacy and new PM infrastructure. Differential Revision: https://reviews.llvm.org/D52739 llvm-svn: 344274	2018-10-11 18:31:51 +00:00
Amara Emerson	54f60255a2	[InstCombine] Fix SimplifyLibCalls erasing an instruction while IC still had references to it. InstCombine keeps a worklist and assumes that optimizations don't eraseFromParent() the instruction, which SimplifyLibCalls violates. This change adds a new callback to SimplifyLibCalls to let clients specify their own handler for erasing actions. Differential Revision: https://reviews.llvm.org/D52729 llvm-svn: 344251	2018-10-11 14:51:11 +00:00
David Green	8066198442	[InstCombine] Demand bits of UMin This is the umin alternative to the umax code from rL344237. We use DeMorgans law on the umax case to bring us to the same thing on umin, but using countLeadingOnes, not countLeadingZeros. Differential Revision: https://reviews.llvm.org/D53036 llvm-svn: 344239	2018-10-11 11:28:27 +00:00
David Green	30c0e98b9c	[InstCombine] Demand bits of UMax Use the demanded bits of umax(A,C) to prove we can just use A so long as the lowest non-zero bit of DemandMask is higher than the highest non-zero bit of C Differential Revision: https://reviews.llvm.org/D53033 llvm-svn: 344237	2018-10-11 11:04:09 +00:00
Florian Hahn	18e07bb822	[LV] Use SmallVector instead of DenseMap in calculateRegisterUsage (NFC). We assign indices sequentially for seen instructions, so we can just use a vector and push back the seen instructions. No need for using a DenseMap. Reviewers: hsaito, rengolin, nadav, dcaballe Reviewed By: rengolin Differential Revision: https://reviews.llvm.org/D53089 llvm-svn: 344233	2018-10-11 09:46:25 +00:00
Florian Hahn	7eb5cb4ebc	[LV] Ignore more debug info. We can avoid doing some unnecessary work by skipping debug instructions in a few loops. It also helps to ensure debug instructions do not prevent vectorization, although I do not have any concrete test cases for that. Reviewers: rengolin, hsaito, dcaballe, aprantl, vsk Reviewed By: rengolin, dcaballe Differential Revision: https://reviews.llvm.org/D53091 llvm-svn: 344232	2018-10-11 09:27:24 +00:00
Calixte Denizet	d2f290b034	[gcov] Display the hit counter for the line of a function definition Summary: Right now there is no hit counter on the line of a function definition. The idea is to add the line of the function to the set of lines covered by the entry block. Tests in compiler-rt/profile will be fixed in another patch: https://reviews.llvm.org/D49854 Reviewers: marco-c, davidxl Reviewed By: marco-c Subscribers: sylvestre.ledru, llvm-commits Differential Revision: https://reviews.llvm.org/D49853 llvm-svn: 344228	2018-10-11 08:53:43 +00:00
Max Kazantsev	b2e51090a4	[IndVars] Drop "exact" flag from lshr and udiv when substituting their args There is a transform that may replace `lshr (x+1), 1` with `lshr x, 1` if it can prove that the result will be the same. However, the initial instruction might have an `exact` flag set, and the flag should now be dropped unless we can prove that it still holds. An incorrectly set `exact` flag may then produce poison. Differential Revision: https://reviews.llvm.org/D53061 Reviewed By: sanjoy llvm-svn: 344223	2018-10-11 07:22:26 +00:00
Richard Smith	6c67662816	Add a flag to remap manglings when reading profile data information. This can be used to preserve profiling information across codebase changes that have widespread impact on mangled names, but across which most profiling data should still be usable. For example, when switching from libstdc++ to libc++, or from the old libstdc++ ABI to the new ABI, or even from a 32-bit to a 64-bit build. The user can provide a remapping file specifying parts of mangled names that should be treated as equivalent (eg, std::__1 should be treated as equivalent to std::__cxx11), and profile data will be treated as applying to a particular function if its name is equivalent to the name of a function in the profile data under the provided equivalences. See the documentation change for a description of how this is configured. Remapping is supported for both sample-based profiling and instruction profiling. We do not support remapping indirect branch target information, but all other profile data should be remapped appropriately. Support is only added for the new pass manager. If someone wants to also add support for this for the old pass manager, doing so should be straightforward. This is the LLVM side of Clang r344199. Reviewers: davidxl, tejohnson, dlj, erik.pilkington Subscribers: mehdi_amini, steven_wu, dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D51249 llvm-svn: 344200	2018-10-10 23:13:47 +00:00
George Burgess IV	6ef8002c2c	Replace most users of UnknownSize with LocationSize::unknown(); NFC Moving away from UnknownSize is part of the effort to migrate us to LocationSizes (e.g. the cleanup promised in D44748). This doesn't entirely remove all of the uses of UnknownSize; some uses require tweaks to assume that UnknownSize isn't just some kind of int. This patch is intended to just be a trivial replacement for all places where LocationSize::unknown() will Just Work. llvm-svn: 344186	2018-10-10 21:28:44 +00:00
Sanjay Patel	05aadf885d	[InstCombine] reverse 'trunc X to <N x i1>' canonicalization; 2nd try Re-trying r344082 because it unintentionally included extra diffs. Original commit message: icmp ne (and X, 1), 0 --> trunc X to N x i1 Ideally, we'd do the same for scalars, but there will likely be regressions unless we add more trunc folds as we're doing here for vectors. The motivating vector case is from PR37549: https://bugs.llvm.org/show_bug.cgi?id=37549 define <4 x float> @bitwise_select(<4 x float> %x, <4 x float> %y, <4 x float> %z, <4 x float> %w) { %c = fcmp ole <4 x float> %x, %y %s = sext <4 x i1> %c to <4 x i32> %s1 = shufflevector <4 x i32> %s, <4 x i32> undef, <4 x i32> <i32 0, i32 0, i32 1, i32 1> %s2 = shufflevector <4 x i32> %s, <4 x i32> undef, <4 x i32> <i32 2, i32 2, i32 3, i32 3> %cond = or <4 x i32> %s1, %s2 %condtr = trunc <4 x i32> %cond to <4 x i1> %r = select <4 x i1> %condtr, <4 x float> %z, <4 x float> %w ret <4 x float> %r } Here's a sampling of the vector codegen for that case using mask+icmp (current behavior) vs. trunc (with this patch): AVX before: vcmpleps %xmm1, %xmm0, %xmm0 vpermilps $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1] vpermilps $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3] vorps %xmm0, %xmm1, %xmm0 vandps LCPI0_0(%rip), %xmm0, %xmm0 vxorps %xmm1, %xmm1, %xmm1 vpcmpeqd %xmm1, %xmm0, %xmm0 vblendvps %xmm0, %xmm3, %xmm2, %xmm0 AVX after: vcmpleps %xmm1, %xmm0, %xmm0 vpermilps $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1] vpermilps $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3] vorps %xmm0, %xmm1, %xmm0 vblendvps %xmm0, %xmm2, %xmm3, %xmm0 AVX512f before: vcmpleps %xmm1, %xmm0, %xmm0 vpermilps $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1] vpermilps $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3] vorps %xmm0, %xmm1, %xmm0 vpbroadcastd LCPI0_0(%rip), %xmm1 ## xmm1 = [1,1,1,1] vptestnmd %zmm1, %zmm0, %k1 vblendmps %zmm3, %zmm2, %zmm0 {%k1} AVX512f after: vcmpleps %xmm1, %xmm0, %xmm0 vpermilps $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1] vpermilps $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3] vorps %xmm0, %xmm1, %xmm0 vpslld $31, %xmm0, %xmm0 vptestmd %zmm0, %zmm0, %k1 vblendmps %zmm2, %zmm3, %zmm0 {%k1} AArch64 before: fcmge v0.4s, v1.4s, v0.4s zip1 v1.4s, v0.4s, v0.4s zip2 v0.4s, v0.4s, v0.4s orr v0.16b, v1.16b, v0.16b movi v1.4s, #1 and v0.16b, v0.16b, v1.16b cmeq v0.4s, v0.4s, #0 bsl v0.16b, v3.16b, v2.16b AArch64 after: fcmge v0.4s, v1.4s, v0.4s zip1 v1.4s, v0.4s, v0.4s zip2 v0.4s, v0.4s, v0.4s orr v0.16b, v1.16b, v0.16b bsl v0.16b, v2.16b, v3.16b PowerPC-le before: xvcmpgesp 34, 35, 34 vspltisw 0, 1 vmrglw 3, 2, 2 vmrghw 2, 2, 2 xxlor 0, 35, 34 xxlxor 35, 35, 35 xxland 34, 0, 32 vcmpequw 2, 2, 3 xxsel 34, 36, 37, 34 PowerPC-le after: xvcmpgesp 34, 35, 34 vmrglw 3, 2, 2 vmrghw 2, 2, 2 xxlor 0, 35, 34 xxsel 34, 37, 36, 0 Differential Revision: https://reviews.llvm.org/D52747 llvm-svn: 344181	2018-10-10 20:47:46 +00:00
Sanjay Patel	58fc00d0bc	revert r344082: [InstCombine] reverse 'trunc X to <N x i1>' canonicalization This commit accidentally included the diffs from D53057. llvm-svn: 344178	2018-10-10 20:39:39 +00:00
Renato Golin	d8e7ca4a32	[VPlan] Fix CondBit quoting in dumpBasicBlock Quotes were being printed for VPInstructions but not the rest. llvm-svn: 344161	2018-10-10 17:55:21 +00:00
Scott Linder	3759efc650	Relax trivial cast requirements in CallPromotionUtils Differential Revision: https://reviews.llvm.org/D52792 llvm-svn: 344153	2018-10-10 16:35:47 +00:00
Carlos Alberto Enciso	c0952c8a08	Revert "[DebugInfo][Dexter] Unreachable line stepped onto after SimplifyCFG." This reverts commit r344120. It was causing buildbot failures. llvm-svn: 344135	2018-10-10 12:09:34 +00:00
Neil Henning	3d4579829e	Fix an ordering bug in the scalarizer. I've added a new test case that causes the scalarizer to try and use dead-and-erased values - caused by the basic blocks not being in domination order within the function. To fix this, instead of iterating through the blocks in function order, I walk them in reverse post order. Differential Revision: https://reviews.llvm.org/D52540 llvm-svn: 344128	2018-10-10 09:27:45 +00:00
Carlos Alberto Enciso	e7a347e5f8	[DebugInfo][Dexter] Unreachable line stepped onto after SimplifyCFG. When SimplifyCFG changes the PHI node into a select instruction, the debug line records become ambiguous. This causes the debugger to display unreachable source lines. Differential Revision: https://reviews.llvm.org/D52887 llvm-svn: 344120	2018-10-10 08:29:55 +00:00
George Burgess IV	40dc63e1f0	[Analysis] Make LocationSizes carry an 'imprecise' bit There are places where we need to merge multiple LocationSizes of different sizes into one, and get a sensible result. There are other places where we want to optimize aggressively based on the value of a LocationSize (e.g. how can a store of four bytes be to an area of storage that's only two bytes large?) This patch makes LocationSize hold an 'imprecise' bit to note whether the LocationSize can be treated as an upper-bound and lower-bound for the size of a location, or just an upper-bound. This concludes the series of patches leading up to this, the most recent of which is r344108. Fixes PR36228. Differential Revision: https://reviews.llvm.org/D44748 llvm-svn: 344114	2018-10-10 06:39:40 +00:00
Max Kazantsev	1d893bfcea	[NFC] Make a variable const llvm-svn: 344113	2018-10-10 04:19:38 +00:00
Sanjay Patel	e9ca7ea3e5	[InstCombine] reverse 'trunc X to <N x i1>' canonicalization icmp ne (and X, 1), 0 --> trunc X to N x i1 Ideally, we'd do the same for scalars, but there will likely be regressions unless we add more trunc folds as we're doing here for vectors. The motivating vector case is from PR37549: https://bugs.llvm.org/show_bug.cgi?id=37549 define <4 x float> @bitwise_select(<4 x float> %x, <4 x float> %y, <4 x float> %z, <4 x float> %w) { %c = fcmp ole <4 x float> %x, %y %s = sext <4 x i1> %c to <4 x i32> %s1 = shufflevector <4 x i32> %s, <4 x i32> undef, <4 x i32> <i32 0, i32 0, i32 1, i32 1> %s2 = shufflevector <4 x i32> %s, <4 x i32> undef, <4 x i32> <i32 2, i32 2, i32 3, i32 3> %cond = or <4 x i32> %s1, %s2 %condtr = trunc <4 x i32> %cond to <4 x i1> %r = select <4 x i1> %condtr, <4 x float> %z, <4 x float> %w ret <4 x float> %r } Here's a sampling of the vector codegen for that case using mask+icmp (current behavior) vs. trunc (with this patch): AVX before: vcmpleps %xmm1, %xmm0, %xmm0 vpermilps $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1] vpermilps $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3] vorps %xmm0, %xmm1, %xmm0 vandps LCPI0_0(%rip), %xmm0, %xmm0 vxorps %xmm1, %xmm1, %xmm1 vpcmpeqd %xmm1, %xmm0, %xmm0 vblendvps %xmm0, %xmm3, %xmm2, %xmm0 AVX after: vcmpleps %xmm1, %xmm0, %xmm0 vpermilps $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1] vpermilps $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3] vorps %xmm0, %xmm1, %xmm0 vblendvps %xmm0, %xmm2, %xmm3, %xmm0 AVX512f before: vcmpleps %xmm1, %xmm0, %xmm0 vpermilps $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1] vpermilps $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3] vorps %xmm0, %xmm1, %xmm0 vpbroadcastd LCPI0_0(%rip), %xmm1 ## xmm1 = [1,1,1,1] vptestnmd %zmm1, %zmm0, %k1 vblendmps %zmm3, %zmm2, %zmm0 {%k1} AVX512f after: vcmpleps %xmm1, %xmm0, %xmm0 vpermilps $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1] vpermilps $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3] vorps %xmm0, %xmm1, %xmm0 vpslld $31, %xmm0, %xmm0 vptestmd %zmm0, %zmm0, %k1 vblendmps %zmm2, %zmm3, %zmm0 {%k1} AArch64 before: fcmge v0.4s, v1.4s, v0.4s zip1 v1.4s, v0.4s, v0.4s zip2 v0.4s, v0.4s, v0.4s orr v0.16b, v1.16b, v0.16b movi v1.4s, #1 and v0.16b, v0.16b, v1.16b cmeq v0.4s, v0.4s, #0 bsl v0.16b, v3.16b, v2.16b AArch64 after: fcmge v0.4s, v1.4s, v0.4s zip1 v1.4s, v0.4s, v0.4s zip2 v0.4s, v0.4s, v0.4s orr v0.16b, v1.16b, v0.16b bsl v0.16b, v2.16b, v3.16b PowerPC-le before: xvcmpgesp 34, 35, 34 vspltisw 0, 1 vmrglw 3, 2, 2 vmrghw 2, 2, 2 xxlor 0, 35, 34 xxlxor 35, 35, 35 xxland 34, 0, 32 vcmpequw 2, 2, 3 xxsel 34, 36, 37, 34 PowerPC-le after: xvcmpgesp 34, 35, 34 vmrglw 3, 2, 2 vmrghw 2, 2, 2 xxlor 0, 35, 34 xxsel 34, 37, 36, 0 Differential Revision: https://reviews.llvm.org/D52747 llvm-svn: 344082	2018-10-09 21:26:01 +00:00
Sanjay Patel	88194dfe1a	[InstCombine] make helper function 'static'; NFC llvm-svn: 344056	2018-10-09 15:29:26 +00:00
George Burgess IV	f96e618017	Make LocationSize a proper Optional type; NFC This is the second in a series of changes intended to make https://reviews.llvm.org/D44748 more easily reviewable. Please see that patch for more context. The first change was r344012. Since I was requested to do all of this with post-commit review, this is about as small as I can make this patch. This patch makes LocationSize into an actual type that wraps a uint64_t; users are required to call getValue() in order to get the size now. If the LocationSize has an Unknown size (e.g. if LocSize == MemoryLocation::UnknownSize), getValue() will assert. This also adds DenseMap specializations for LocationInfo, which required taking two more values from the set of values LocationInfo can represent. Hence, heavy users of multi-exabyte arrays or structs may observe slightly lower-quality code as a result of this change. The intent is for getValue()s to be very close to a corresponding hasValue() (which is often spelled `!= MemoryLocation::UnknownSize`). Sadly, small diff context appears to crop that out sometimes, and the last change in DSE does require a bit of nonlocal reasoning about control-flow. :/ This also removes an assert, since it's now redundant with the assert in getValue(). llvm-svn: 344013	2018-10-09 03:18:56 +00:00
George Burgess IV	fefc42c9bb	Use locals instead of struct fields; NFC This is one of a series of changes intended to make https://reviews.llvm.org/D44748 more easily reviewable. Please see that patch for more context. Since I was requested to do all of this with post-commit review, this is about as small as I can make it (beyond committing changes to these few files separately, but they're incredibly similar in spirit, so...) On its own, this change doesn't make a great deal of sense. I plan on having a follow-up Real Soon Now(TM) to make the bits here make more sense. :) In particular, the next change in this series is meant to make LocationSize an actual type, which you have to call .getValue() on in order to get at the uint64_t inside. Hence, this change refactors code so that: - we only need to call the soon-to-come getValue() once in most cases, and - said call to getValue() happens very closely to a piece of code that checks if the LocationSize has a value (e.g. if it's != UnknownSize). llvm-svn: 344012	2018-10-09 02:14:33 +00:00
Robert Lougher	0c93ea2634	[TailCallElim] Enable marking of calls with byval as tails In r339636 the alias analysis rules were changed with regards to tail calls and byval arguments. Previously, tail calls were assumed not to alias allocas from the current frame. This has been updated, to not assume this for arguments with the byval attribute. This patch aligns TailCallElim with the new rule. Tail marking can now be more aggressive and mark more calls as tails, e.g.: define void @test() { %f = alloca %struct.foo call void @bar(%struct.foo* byval %f) ret void } define void @test2(%struct.foo* byval %f) { call void @bar(%struct.foo* byval %f) ret void } define void @test3(%struct.foo* byval %f) { %agg.tmp = alloca %struct.foo %0 = bitcast %struct.foo* %agg.tmp to i8* %1 = bitcast %struct.foo* %f to i8* call void @llvm.memcpy.p0i8.p0i8.i64(i8* %0, i8* %1, i64 40, i1 false) call void @bar(%struct.foo* byval %agg.tmp) ret void } The problematic case where a byval parameter is captured by a call is still handled correctly, and will not be marked as a tail (see PR7272). llvm-svn: 343986	2018-10-08 18:03:40 +00:00
Xin Tong	bfdad33b82	[ThinLTO] Keep non-prevailing (linkonce\|weak)_odr symbols live Summary: If we have a symbol with (linkonce\|weak)_odr linkage, we do not want to dead strip it even if it is not prevailing. An IR-level (linkonce\|weak)_odr symbol can become non-prevailing when we mix ELF objects and IR objects where the (linkonce\|weak)_odr symbol in the ELF object is prevailing and the ones in the IR objects are not. Stripping them will prevent us from doing optimizations with them. By not dead stripping them, we will convert these symbols to available_externally linkage as a result of being non-prevailing and eventually drop them after inlining. I modified cache-prevailing.ll to use linkonce linkage as it is testing whether the cache prevailing bit is effective or not, not whether we should treat linkonce_odr as alive or not. Reviewers: tejohnson, pcc Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D52893 llvm-svn: 343970	2018-10-08 15:12:48 +00:00
Neil Henning	57f5d0a885	[IRBuilder] Fixup CreateIntrinsic to allow specifying Types to Mangle. The IRBuilder CreateIntrinsic method wouldn't allow you to specify the types that you wanted the intrinsic to be mangled with. To fix this I've: - Added an ArrayRef<Type *> member to both CreateIntrinsic overloads. - Used that array to pass into the Intrinsic::getDeclaration call. - Added a CreateUnaryIntrinsic to replace the most common use of CreateIntrinsic where the type was auto-deduced from operand 0. - Added a bunch more unit tests to test CreateIntrinsic calls that weren't being tested (including the FMF flag that wasn't checked). This was suggested as part of the AMDGPU specific atomic optimizer review (https://reviews.llvm.org/D51969). Differential Revision: https://reviews.llvm.org/D52087 llvm-svn: 343962	2018-10-08 10:32:33 +00:00
Ewan Crawford	fa120cbdbc	[InstCombine] Fix incongruous GEP type addrspace Currently running the @insertelem_after_gep function below through the InstCombine pass with opt produces invalid IR. Input: ``` define void @insertelem_after_gep(<16 x i32>* %t0) { %t1 = bitcast <16 x i32>* %t0 to [16 x i32]* %t2 = addrspacecast [16 x i32]* %t1 to [16 x i32] addrspace(3)* %t3 = getelementptr inbounds [16 x i32], [16 x i32] addrspace(3)* %t2, i64 0, i64 0 %t4 = insertelement <16 x i32 addrspace(3)*> undef, i32 addrspace(3)* %t3, i32 0 call void @extern_vec_pointers_func(<16 x i32 addrspace(3)*> %t4) ret void } ``` Output: ``` define void @insertelem_after_gep(<16 x i32>* %t0) { %t3 = getelementptr inbounds <16 x i32>, <16 x i32>* %t0, i64 0, i64 0 %t4 = insertelement <16 x i32 addrspace(3)*> undef, i32 addrspace(3)* %t3, i32 0 call void @my_extern_func(<16 x i32 addrspace(3)*> %t4) ret void } ``` Although this causes no complaints when produced, it isn't valid IR, as the insertelement use of the %t3 GEP expects an address space. ``` opt: /tmp/bad.ll:52:73: error: '%t3' defined with type 'i32*' but expected 'i32 addrspace(3)*' %t4 = insertelement <16 x i32 addrspace(3)*> undef, i32 addrspace(3)* %t3, i32 0 ``` I've fixed this by adding an addrspacecast after the GEP in the InstCombine pass, and including a check for this type mismatch to the verifier. Reviewers: spatel, lebedev.ri Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52294 llvm-svn: 343956	2018-10-08 08:40:45 +00:00
Max Kazantsev	b07369651e	[LV] Do not create SCEVs on broken IR in emitTransformedIndex. PR39160 At the point when we perform `emitTransformedIndex`, we have a broken IR (in particular, we have Phis for which not every incoming value is properly set). On such IR, it is illegal to create SCEV expressions, because their internal simplification process may try to prove some predicates and break when it stumbles across some broken IR. The only purpose of using SCEV in this particular place is to attempt to simplify the generated code slightly. It seems that the result isn't worth it, because some trivial cases (like addition of zero and multiplication by 1) can be handled separately if needed, but more generally InstCombine is able to achieve the goals we want to achieve by using SCEV. This patch fixes a functional crash described in PR39160, and as a side effect it also generates slightly smarter code in some simple cases. It also may cause some optimality loss (i.e. we will now generate `mul` by power of `2` instead of shift etc), but there is nothing that InstCombine could not handle later. In case of dire need, we can support more trivial cases just in place. Note that this patch only fixes one particular case of the general problem that LV misuses SCEV, attempting to create SCEVs or prove predicates on invalid IR. The general solution, however, seems complex enough. Differential Revision: https://reviews.llvm.org/D52881 Reviewed By: fhahn, hsaito llvm-svn: 343954	2018-10-08 05:46:29 +00:00
Jonas Paulsson	29d80f07ee	[LoopVectorizer] Use TTI.getOperandInfo() Call getOperandInfo() instead of using (near) duplicated code in LoopVectorizationCostModel::getInstructionCost(). This gets the OperandValueKind and OperandValueProperties values for a Value passed as operand to an arithmetic instruction. getOperandInfo() used to be a static method in TargetTransformInfo.cpp, but is now instead a public member. Review: Florian Hahn https://reviews.llvm.org/D52883 llvm-svn: 343852	2018-10-05 14:34:04 +00:00
Neil Henning	d2261f617b	Add missing period to comment to match style of file. This is a test commit to show that my commit access is working. llvm-svn: 343842	2018-10-05 09:39:07 +00:00
Craig Topper	029d1ef6eb	[SimplifyCFG] Pass AggressiveInsts to DominatesMergePoint by reference. Remove null check. Summary: At some point in the past the recursion in DominatesMergePoint used to pass null for AggressiveInsts as part of the recursion. It no longer does this. So there is no way for AggressiveInsts to be null. This passes it by reference and removes the null check to make this explicit. Reviewers: efriedma, reames Reviewed By: efriedma Subscribers: xbolva00, llvm-commits Differential Revision: https://reviews.llvm.org/D52575 llvm-svn: 343828	2018-10-04 23:40:31 +00:00
Sanjay Patel	3436dc2923	[InstCombine] drop poison flags in SimplifyVectorDemandedElts We established the (unfortunately complicated) rules for UB/poison propagation with vector ops in: D48893 D48987 D49047 It's clear from the affected tests that we are potentially creating poison where none existed before the transforms. For add/sub/mul, the answer is simple: just drop the flags because the extra undef vector lanes are generally more valuable for analysis and codegen. llvm-svn: 343819	2018-10-04 21:36:50 +00:00
Craig Topper	1d15f7b02b	[SimplifyCFG] Change recursive calls to llvm::SimplifyCFG to instead use an outer while loop to revisit. Summary: The llvm::SimplifyCFG function creates a SimplifyCFGOpt object and calls run on it. There were numerous places reached from this run function that called back out to llvm::SimplifyCFG, which would create another SimplifyCFGOpt object. This is an inefficient use of stack space at minimum. We are also not passing along the LoopHeaders pointer passed into the outer llvm::SimplifyCFG call. So if it's not null we lose it on the first recursion and get nullptr from there on. This patch adds an outer loop around the main BasicBlock simplifying code and adds a flag to the SimplifyCFGOpt class that can be set to request another iteration. I don't think we can iterate based just on the change flag alone since some of the simplifications delete a basic block entirely, leaving nothing to iterate on. Reviewers: bogner, eli.friedman, reames Reviewed By: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52760 llvm-svn: 343816	2018-10-04 21:11:52 +00:00
Sanjay Patel	9d6688e38d	[InstCombine] reduce code duplication in SimplifyDemandedVectorElts; NFCI llvm-svn: 343806	2018-10-04 19:12:07 +00:00
Sanjay Patel	3746e11abe	[InstCombine] allow bitcast to/from FP for vector insert/extract transform This is a follow-up to rL343482 / D52439. This was a pattern that initially caused the commit to be reverted because the transform requires a bitcast as shown here. llvm-svn: 343794	2018-10-04 16:25:05 +00:00
Sanjay Patel	cafdeb1aa6	[InstCombine] allow SimplifyDemandedVectorElts to work with FP binops We're a long way from D50992 and D51553, but this is where we have to start. We weren't back-propagating undefs into binop constant values for anything but add/sub/mul/and/or/xor. This is likely because we have to be careful about not introducing UB/poison with div/rem/shift. But I suspect we already are getting the poison part wrong for add/sub/mul (although it may not be possible to expose the bug currently because we use SimplifyDemandedVectorElts from a limited set of opcodes). See the discussion/implementation from D48987 and D49047. This patch just enables functionality for FP ops because those do not have UB/poison potential. llvm-svn: 343727	2018-10-03 21:44:59 +00:00
Sanjay Patel	306f14ceb8	[InstCombine] clean up foldVectorBinop(); NFC 1. Fix include ordering. 2. Improve variable name (width is bitwidth not number-of-elements). 3. Add local Opcode variable to reduce code duplication. llvm-svn: 343694	2018-10-03 15:46:03 +00:00
Sanjay Patel	79dceb2903	[InstCombine] name change: foldShuffledBinop -> foldVectorBinop; NFC This function will deal with more than shuffles with D50992, and I have another potential per-element fold that could live here. llvm-svn: 343692	2018-10-03 15:20:58 +00:00
Florian Hahn	11a1423348	[LoopInterchange] Remove unused variable PreserveLCSSA (NFC). llvm-svn: 343676	2018-10-03 11:01:23 +00:00
Aditya Kumar	a27014b851	Improve static analysis of cold basic blocks Differential Revision: https://reviews.llvm.org/D52704 Reviewers: sebpop, tejohnson, brzycki, SirishP Reviewed By: sebpop llvm-svn: 343663	2018-10-03 06:21:05 +00:00
Aditya Kumar	9e20ade72a	Add support for new pass manager Modified the testcases to use both pass managers Use single commandline flag for both pass managers. Differential Revision: https://reviews.llvm.org/D52708 Reviewers: sebpop, tejohnson, brzycki, SirishP Reviewed By: tejohnson, brzycki llvm-svn: 343662	2018-10-03 05:55:20 +00:00
David Green	1e44c3b62c	[InstCombine] Fold ~A - Min/Max(~A, O) -> Max/Min(A, ~O) - A This is an attempt to get out of a local-minimum that instcombine currently gets stuck in. We essentially combine two optimisations at once, ~a - ~b = b-a and min(~a, ~b) = ~max(a, b), only doing the transform if the result is at least neutral. This involves using IsFreeToInvert, which has been expanded a little to include selects that can be easily inverted. This is trying to fix PR35875, using the ideas from Sanjay. It is a large improvement to one of our rgb to cmy kernels. Differential Revision: https://reviews.llvm.org/D52177 llvm-svn: 343569	2018-10-02 09:48:34 +00:00
Craig Topper	d616d33a96	[SimplifyCFG] Use Value::hasNUses instead of 'getNumUses() =='. NFCI getNumUses is linear in the number of uses. Since we're looking for a specific use count, we can use hasNUses which will stop as soon as it determines there are more than N uses instead of walking all of them. llvm-svn: 343550	2018-10-01 23:09:52 +00:00
Craig Topper	90c0a0621c	[SimplifyCFG] Update comments that refer to CondBB to say ThenBB instead. NFC There is no variable in this function named CondBB, but there is one named ThenBB and I believe the comments are all referring to it. llvm-svn: 343548	2018-10-01 22:56:11 +00:00
Eric Christopher	dcf1d97c5c	Temporarily revert "[GVNHoist] Re-enable GVNHoist by default" This reverts commit r342387 as it's showing significant performance regressions in a number of benchmarks. Followed up with the committer and original thread with an example and will get performance numbers before recommitting. llvm-svn: 343522	2018-10-01 18:57:08 +00:00
Jesper Antonsson	c954b86391	[InstCombine] Handle vector compares in foldGEPIcmp(), take 2 Summary: This is a continuation of the fix for PR34627 "InstCombine assertion at vector gep/icmp folding". (I just realized bugpoint had fuzzed the original test for me, so I had fixed another trigger of the same assert in adjacent code in InstCombine.) This patch avoids optimizing an icmp (to look only at the base pointers) when the resulting icmp would have a different type. The patch adds a testcase and also cleans up and shrinks the pre-existing test for the adjacent assert trigger. Reviewers: lebedev.ri, majnemer, spatel Reviewed By: lebedev.ri Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52494 llvm-svn: 343486	2018-10-01 14:59:25 +00:00
Sanjay Patel	31b07198f1	[InstCombine] try to convert vector insert+extract to trunc; 2nd try This was originally committed at rL343407, but reverted at rL343458 because it crashed trying to handle a case where the destination type is FP. This version of the patch adds a check for that possibility. Tests added at rL343480. Original commit message: This transform is requested for the backend in: https://bugs.llvm.org/show_bug.cgi?id=39016 ...but I figured it was worth doing in IR too, and it's probably easier to implement here, so that's this patch. In the simplest case, we are just truncating a scalar value. If the extract index doesn't correspond to the LSBs of the scalar, then we have to shift-right before the truncate. Endian-ness makes this tricky, but hopefully the ASCII-art helps visualize the transform. Differential Revision: https://reviews.llvm.org/D52439 llvm-svn: 343482	2018-10-01 14:40:00 +00:00
Hans Wennborg	a60aa91374	Revert r343407 "[InstCombine] try to convert vector insert+extract to trunc" This caused Chromium builds to fail with "Illegal Trunc" assertion. See https://crbug.com/890723 for repro. > This transform is requested for the backend in: > https://bugs.llvm.org/show_bug.cgi?id=39016 > ...but I figured it was worth doing in IR too, and it's probably > easier to implement here, so that's this patch. > > In the simplest case, we are just truncating a scalar value. If the > extract index doesn't correspond to the LSBs of the scalar, then we > have to shift-right before the truncate. Endian-ness makes this tricky, > but hopefully the ASCII-art helps visualize the transform. > > Differential Revision: https://reviews.llvm.org/D52439 llvm-svn: 343458	2018-10-01 12:07:45 +00:00
Florian Hahn	8600fee52e	Recommit r343308: [LoopInterchange] Turn into a loop pass. llvm-svn: 343450	2018-10-01 09:59:48 +00:00
Fangrui Song	3507c6e884	Use the container form llvm::sort(C, ...) There are a few leftovers in rL343163 which span two lines. This commit changes these llvm::sort(C.begin(), C.end(), ...) to llvm::sort(C, ...) llvm-svn: 343426	2018-09-30 22:31:29 +00:00
Sanjay Patel	1e0f1f645a	[InstCombine] try to convert vector insert+extract to trunc This transform is requested for the backend in: https://bugs.llvm.org/show_bug.cgi?id=39016 ...but I figured it was worth doing in IR too, and it's probably easier to implement here, so that's this patch. In the simplest case, we are just truncating a scalar value. If the extract index doesn't correspond to the LSBs of the scalar, then we have to shift-right before the truncate. Endian-ness makes this tricky, but hopefully the ASCII-art helps visualize the transform. Differential Revision: https://reviews.llvm.org/D52439 llvm-svn: 343407	2018-09-30 14:34:01 +00:00
Sanjay Patel	26c119a9c2	[InstCombine] allow lengthening of insertelement to eliminate shuffles As noted in post-commit comments for D52548, the limitation on increasing vector length can be applied by opcode. As a first step, this patch only allows insertelement to be widened because that has no logical downsides for IR and has little risk of pessimizing codegen. This may cause PR39132 to go into hiding during a full compile, but that bug is not fixed. llvm-svn: 343406	2018-09-30 13:50:42 +00:00
Sanjay Patel	54d31ef87e	[InstCombine] fix formatting in vector evaluators; NFC We need to alter the functionality as shown in D52548. llvm-svn: 343379	2018-09-29 15:05:24 +00:00
Vitaly Buka	0509070811	[cxx2a] Fix warning triggered by r343285 llvm-svn: 343369	2018-09-29 02:17:12 +00:00
whitequark	29b2980159	Revert "[LLVM-C] Add bindings for addCoroutinePassesToExtensionPoints" This reverts commit c4baf7c2f06ff5459c4f5998ce980346e72bff97. Broke the bots, and should really be in Transforms/Coroutines instead. llvm-svn: 343337	2018-09-28 16:45:18 +00:00
whitequark	937afbc365	[LLVM-C] Add bindings for addCoroutinePassesToExtensionPoints Summary: This patch adds bindings to C and Go for addCoroutinePassesToExtensionPoints, which is used to add coroutine passes to the correct locations in PassManagerBuilder. Reviewers: whitequark, deadalnix Reviewed By: whitequark Subscribers: mehdi_amini, modocache, llvm-commits Differential Revision: https://reviews.llvm.org/D51642 llvm-svn: 343336	2018-09-28 16:38:11 +00:00
Sanjay Patel	242f90fe82	[InstCombine] don't propagate wider shufflevector arguments to predecessors InstCombine would propagate shufflevector insts that had wider output vectors onto predecessors, which would sometimes push undef's onto the divisor of a div/rem and result in bad codegen. I've fixed this by just banning propagating shufflevector back if the result of the shufflevector is wider than the input vectors. Patch by: @sheredom (Neil Henning) Differential Revision: https://reviews.llvm.org/D52548 llvm-svn: 343329	2018-09-28 15:24:41 +00:00
Florian Hahn	8d72ecc36f	Revert r343308: [LoopInterchange] Turn into a loop pass. llvm-svn: 343310	2018-09-28 10:20:07 +00:00
Florian Hahn	0694c159f7	[LoopInterchange] Turn into a loop pass. This patch turns LoopInterchange into a loop pass. It now only considers top-level loops and tries to move the innermost loop to the optimal position within the loop nest. By only looking at top-level loops, we might miss a few opportunities the function pass would get (e.g. if we have a loop nest of 3 loops, in the function pass we might process loops at level 1 and 2 and move the innermost loop to level 1, and then we process loops at levels 0, 1, 2 and interchange again, because we now have a different inner loop). But I think it would be better to handle such cases by picking the best inner loop from the start and avoid re-visiting the same loops again. The biggest advantage of it being a loop pass is that it interacts nicely with the other loop passes. Without this patch, there are some performance regressions on AArch64 with loop interchanging enabled, where no loops were interchanged, but we missed out on some other loop optimizations. It also removes the SimplifyCFG run. We are just changing branches, so the CFG should not be more complicated, besides the additional 'unique' preheaders this pass might create. Reviewers: chandlerc, efriedma, mcrosier, javed.absar, xbolva00 Reviewed By: xbolva00 Differential Revision: https://reviews.llvm.org/D51702 llvm-svn: 343308	2018-09-28 09:45:50 +00:00
Sanjay Patel	c3f50ff92e	[InstCombine] Without infinities, fold (C / X) < 0.0 --> (X < 0) When C is not zero and infinities are not allowed, (C / X) > 0 is a sign test. Depending on the sign of C, the predicate must be swapped. E.g.: foo(double X) { if ((-2.0 / X) <= 0) ... } => foo(double X) { if (X >= 0) ... } Patch by: @marels (Martin Elshuber) Differential Revision: https://reviews.llvm.org/D51942 llvm-svn: 343228	2018-09-27 15:59:24 +00:00
Teresa Johnson	f24136f17a	[WPD] Fix incorrect devirtualization after indirect call promotion Summary: Add a dominance check to ensure that the possible devirtualizable call is actually dominated by the type test/checked load intrinsic being analyzed. With PGO, after indirect call promotion is performed during the compile step, followed by inlining, we may have a type test in the promoted and inlined sequence that allows an indirect call in that sequence to be devirtualized. That indirect call (inserted by inlining after promotion) will share the same vtable pointer as the fallback indirect call that cannot be devirtualized. Before this patch the code was incorrectly devirtualizing the fallback indirect call. See the new test and the example described there for more details. Reviewers: pcc, vitalybuka Subscribers: mehdi_amini, Prazek, eraman, steven_wu, dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D52514 llvm-svn: 343226	2018-09-27 14:55:32 +00:00
Fangrui Song	0cac726a00	llvm::sort(C.begin(), C.end(), ...) -> llvm::sort(C, ...) Summary: The convenience wrapper in STLExtras is available since rL342102. Reviewers: dblaikie, javed.absar, JDevlieghere, andreadb Subscribers: MatzeB, sanjoy, arsenm, dschuff, mehdi_amini, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, javed.absar, gbedwell, jrtc27, mgrang, atanasyan, steven_wu, george.burgess.iv, dexonsmith, kristina, jsji, llvm-commits Differential Revision: https://reviews.llvm.org/D52573 llvm-svn: 343163	2018-09-27 02:13:45 +00:00
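A minimal sketch of the kind of range-based convenience wrapper this cleanup relies on. It uses only the standard library; the `sketch::sort` name is a stand-in assumed for illustration and is not the actual STLExtras implementation.

```cpp
#include <algorithm>
#include <functional>
#include <iterator>
#include <vector>

namespace sketch {
// Hypothetical stand-in for a range overload: forwards the whole
// container to the iterator-pair algorithm.
template <typename Container, typename Compare>
void sort(Container &&C, Compare Comp) {
  std::sort(std::begin(C), std::end(C), Comp);
}
} // namespace sketch

// Example caller: sorts a copy of V in descending order.
std::vector<int> sortedDescending(std::vector<int> V) {
  // Before the cleanup this would read sort(V.begin(), V.end(), ...).
  sketch::sort(V, std::greater<int>());
  return V;
}
```

The call site names the container once, which removes the chance of passing mismatched begin/end iterators.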
Florian Hahn	6feb637124	[LoopInterchange] Preserve LCSSA. This patch extends LoopInterchange to move LCSSA PHIs to the right place after interchanging. This is required for LoopInterchange to become a loop pass. As an alternative to manually moving the PHIs, we could also re-form the LCSSA PHIs for a set of interchanged loops, but that's more expensive. Reviewers: efriedma, mcrosier, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D52154 llvm-svn: 343132	2018-09-26 19:34:25 +00:00
Vyacheslav Zakharin	e06831a3b2	Remove LoopID metadata from the branch instruction that follows the peeled iterations. Differential Revision: https://reviews.llvm.org/D52176 llvm-svn: 343054	2018-09-26 01:03:21 +00:00
Zhaoshi Zheng	95710337b4	Revert "Revert "[ConstHoist] Do not rebase single (or few) dependent constant"" This reverts commit bd7b44f35ee9fbe365eb25ce55437ea793b39346. Reland r342994: disabled the optimization and explicitly enable it in test. -mllvm -consthoist-min-num-to-rebase<unsigned>=0 [ConstHoist] Do not rebase single (or few) dependent constant If an instance (InsertionPoint or IP) of Base constant A has only one or few rebased constants depending on it, do NOT rebase. One extra ADD instruction is required to materialize each rebased constant, assuming A and the rebased have the same materialization cost. Differential Revision: https://reviews.llvm.org/D52243 llvm-svn: 343053	2018-09-26 00:59:09 +00:00
Anna Thomas	b1e3d45318	[LV][LAA] Vectorize loop invariant values stored into loop invariant address Summary: We are overly conservative in loop vectorizer with respect to stores to loop invariant addresses. More details in https://bugs.llvm.org/show_bug.cgi?id=38546 This is the first part of the fix where we start with vectorizing loop invariant values to loop invariant addresses. This also includes changes to ORE for stores to invariant address. Reviewers: anemet, Ayal, mkuper, mssimpso Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D50665 llvm-svn: 343028	2018-09-25 20:57:20 +00:00
Jessica Paquette	e02de05b32	Revert "[ConstHoist] Do not rebase single (or few) dependent constant" This caused a couple test failures on a bot: CodeGen/X86/constant-hoisting-bfi.ll Transforms/ConstantHoisting/X86/ehpad.ll Example: http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/53575/ llvm-svn: 343005	2018-09-25 18:41:40 +00:00
Zhaoshi Zheng	2c1a09188f	[ConstHoist] Do not rebase single (or few) dependent constant If an instance (InsertionPoint or IP) of Base constant A has only one or few rebased constants depending on it, do NOT rebase. One extra ADD instruction is required to materialize each rebased constant, assuming A and the rebased have the same materialization cost. Differential Revision: https://reviews.llvm.org/D52243 llvm-svn: 342994	2018-09-25 17:45:37 +00:00
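The heuristic in the row above can be sketched as a simple threshold test. This is an illustration under stated assumptions, not the ConstantHoisting code: the helper names are hypothetical, and the exact comparison (at least the threshold rather than strictly more) is assumed.

```cpp
// Cost model from the commit message: each rebased constant needs one
// extra ADD to be materialized from the base constant.
unsigned extraAddsIfRebased(unsigned NumDependentConstants) {
  return NumDependentConstants;
}

// Hypothetical decision helper: rebase only when enough constants depend
// on this instance of the base. Using >= here is an assumption.
bool shouldRebase(unsigned NumDependentConstants, unsigned MinNumToRebase) {
  return NumDependentConstants >= MinNumToRebase;
}
```

Under this sketch, a threshold of 0 makes every group qualify for rebasing, while a threshold of 2 leaves a base with a single dependent constant alone.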
Sanjay Patel	69ed4710b8	[InstCombine] narrow binops on concatenated vectors (PR33026) The motivating case from: https://bugs.llvm.org/show_bug.cgi?id=33026 ...has no shuffles now. This kind of pattern may occur during vectorization when targets have lumpy ISAs like SSE/AVX. llvm-svn: 342988	2018-09-25 15:57:37 +00:00
David Green	9108c2b921	[LoopUnroll] Add check to Latch's terminator in UnrollRuntimeLoopRemainder In this patch, I'm adding an extra check to the Latch's terminator in llvm::UnrollRuntimeLoopRemainder, similar to how it is already done in the llvm::UnrollLoop. The compiler would crash if this function is called with a malformed loop. Patch by Rodrigo Caetano Rocha! Differential Revision: https://reviews.llvm.org/D51486 llvm-svn: 342958	2018-09-25 10:08:47 +00:00
Evgeniy Stepanov	090f0f9504	[hwasan] Record and display stack history in stack-based reports. Summary: Display a list of recent stack frames (not a stack trace!) when tag-mismatch is detected on a stack address. The implementation uses alignment tricks to get both the address of the history buffer, and the base address of the shadow with a single 8-byte load. See the comment in hwasan_thread_list.h for more details. Developed in collaboration with Kostya Serebryany. Reviewers: kcc Subscribers: srhines, kubamracek, mgorny, hiraditya, jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D52249 llvm-svn: 342923	2018-09-24 23:03:34 +00:00
Evgeniy Stepanov	20c4999e8b	Revert "[hwasan] Record and display stack history in stack-based reports." This reverts commit r342921: test failures on clang-cmake-arm* bots. llvm-svn: 342922	2018-09-24 22:50:32 +00:00
Evgeniy Stepanov	9043e17edd	[hwasan] Record and display stack history in stack-based reports. Summary: Display a list of recent stack frames (not a stack trace!) when tag-mismatch is detected on a stack address. The implementation uses alignment tricks to get both the address of the history buffer, and the base address of the shadow with a single 8-byte load. See the comment in hwasan_thread_list.h for more details. Developed in collaboration with Kostya Serebryany. Reviewers: kcc Subscribers: srhines, kubamracek, mgorny, hiraditya, jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D52249 llvm-svn: 342921	2018-09-24 21:38:42 +00:00
Christy Lee	e94374809e	Re-submitting changes in D51550 because it failed to patch. Reviewers: javed.absar, trentxintong, courbet Reviewed By: trentxintong Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52433 llvm-svn: 342919	2018-09-24 20:47:12 +00:00
Sanjay Patel	4674c7765d	[InstCombine] add bitcast+extelt helper function; NFC We can handle patterns where the elements have different sizes, so refactoring ahead of trying to add another blob within these clauses. llvm-svn: 342918	2018-09-24 20:41:22 +00:00
Sanjay Patel	7a52626a08	[InstCombine] improve variable name and use 'match'; NFC 'width' of a vector usually refers to the bit-width. https://bugs.llvm.org/show_bug.cgi?id=39016 shows a case where we could extend this fold to handle a case where the number of elements in the bitcasted vector is not equal to the resulting value. llvm-svn: 342902	2018-09-24 16:39:03 +00:00
Petar Jovanovic	c451c9ef50	[deadargelim] Update dbg.value of 'unused' parameters The DeadArgElim pass marks unused function arguments as ‘undef’ without updating existing dbg.values referring to them. As a consequence, the debug info metadata in the final executable was wrong. Patch by Djordje Todorovic. Differential Revision: https://reviews.llvm.org/D51968 llvm-svn: 342871	2018-09-24 10:01:24 +00:00
Eugene Leviant	2b70d616f0	[WholeProgramDevirt] Don't process declarations when building type id map Differential revision: https://reviews.llvm.org/D52175 llvm-svn: 342836	2018-09-23 13:27:47 +00:00
Sanjay Patel	09e02fbf51	[InstCombine][x86] try even harder to convert blendv intrinsic to generic IR (PR38814) Follow-up to rL342324 (D52059): Missing optimizations with blendv are shown in: https://bugs.llvm.org/show_bug.cgi?id=38814 This is an easier and more powerful solution than adding pattern matching for a few special cases in the backend. The potential danger with this transform in IR is that the condition value can get separated from the select, and the backend might not be able to make a blendv out of it again. llvm-svn: 342806	2018-09-22 14:43:55 +00:00
Craig Topper	2b3f5df73a	[InstCombine] Fold (min/max ~X, Y) -> ~(max/min X, ~Y) when Y is freely invertible Summary: This restores the combine that was reverted in r341883. The infinite loop from the failing test no longer occurs due to changes from r342163. Reviewers: spatel, dmgreen Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52070 llvm-svn: 342797	2018-09-22 05:53:27 +00:00
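The identity behind this fold can be checked on plain integers. The sketch below covers only the signed min case with hypothetical helper names; it says nothing about when InstCombine considers Y freely invertible.

```cpp
#include <algorithm>
#include <cstdint>
#include <initializer_list>
#include <limits>

// Left-hand side of the fold: min(~X, Y).
std::int32_t minOfNotForm(std::int32_t X, std::int32_t Y) {
  return std::min(~X, Y);
}

// Right-hand side of the fold: ~max(X, ~Y).
// Bitwise NOT reverses signed order, so min and max swap roles.
std::int32_t foldedForm(std::int32_t X, std::int32_t Y) {
  return ~std::max(X, ~Y);
}

// Compares both forms over a small grid of sample values,
// including the extremes of the 32-bit range.
bool identityHoldsOnSamples() {
  const std::int32_t Lo = std::numeric_limits<std::int32_t>::min();
  const std::int32_t Hi = std::numeric_limits<std::int32_t>::max();
  for (std::int32_t X : {Lo, -7, -1, 0, 1, 42, Hi})
    for (std::int32_t Y : {Lo, -7, -1, 0, 1, 42, Hi})
      if (minOfNotForm(X, Y) != foldedForm(X, Y))
        return false;
  return true;
}
```

The folded form needs `~Y` instead of `~X`, which is why the commit restricts the transform to cases where inverting Y costs nothing extra.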
Warren Ristow	4f27730eaf	[Loop Vectorizer] Abandon vectorization when no integer IV found Support for vectorizing loops with secondary floating-point induction variables was added in r276554. A primary integer IV is still required for vectorization to be done. If an FP IV was found, but no integer IV was found at all (primary or secondary), the attempt to vectorize still went forward, causing a compiler-crash. This change abandons that attempt when no integer IV is found. (Vectorizing FP-only cases like this, rather than bailing out, is discussed as possible future work in D52327.) See PR38800 for more information. Differential Revision: https://reviews.llvm.org/D52327 llvm-svn: 342786	2018-09-21 23:03:50 +00:00
Sameer Sahasrabuddhe	0807e94951	revert changes from r342722 "[AMDGPU] lower-switch in preISel as a workaround for legacy DA" This broke regression tests. The first breakage was noticed here: http://lab.llvm.org:8011/builders/lld-x86_64-freebsd/builds/23549 llvm-svn: 342743	2018-09-21 16:31:51 +00:00

20907 Commits