llvm-project

Commit Graph

Author	SHA1	Message	Date
Nikita Popov	14ca9a8355	Revert "[DemandedBits][BDCE] Support vectors of integers" This reverts commit r348549. Causing assertion failures during clang build. llvm-svn: 348558	2018-12-07 00:42:03 +00:00
Nikita Popov	cf65b9207b	[DemandedBits][BDCE] Support vectors of integers DemandedBits and BDCE currently only support scalar integers. This patch extends them to also handle vector integer operations. In this case bits are not tracked for individual vector elements, instead a bit is demanded if it is demanded for any of the elements. This matches the behavior of computeKnownBits in ValueTracking and SimplifyDemandedBits in InstCombine. The getDemandedBits() method can now only be called on instructions that have integer or vector of integer type. Previously it could be called on any sized instruction (even if it was not particularly useful). The size of the return value is now always the scalar size in bits (while previously it was the type size in bits). Differential Revision: https://reviews.llvm.org/D55297 llvm-svn: 348549	2018-12-06 23:50:32 +00:00
Adrian Prantl	fbeeac0e1e	Reapply "Adapt gcov to changes in CFE." This reverts commit r348203 and reapplies D55085 with an additional GCOV bugfix to make the change NFC for relative file paths in .gcno files. Thanks to Ilya Biryukov for additional testing! Original commit message: Update Diagnostic handling for changes in CFE. The clang frontend no longer emits the current working directory for DIFiles containing an absolute path in the filename: and will move the common prefix between current working directory and the file into the directory: component. https://reviews.llvm.org/D55085 llvm-svn: 348512	2018-12-06 18:44:48 +00:00
Alexandros Lamprineas	e4c91f5c4c	[GVN] Don't perform scalar PRE on GEPs Partial Redundancy Elimination of GEPs prevents CodeGenPrepare from sinking the addressing mode computation of memory instructions back to its uses. The problem comes from the insertion of PHIs, which confuse CGP and make it bail. I've autogenerated the check lines of an existing test and added a store instruction to demonstrate the motivation behind this change. The store is now using the gep instead of a phi. Differential Revision: https://reviews.llvm.org/D55009 llvm-svn: 348496	2018-12-06 16:11:58 +00:00
Ilya Biryukov	cb5331eb93	Revert "[LoopSimplifyCFG] Delete dead in-loop blocks" This reverts commit r348457. The original commit causes clang to crash when doing an instrumented build with a new pass manager. Reverting to unbreak our integrate. llvm-svn: 348484	2018-12-06 13:21:01 +00:00
Roman Lebedev	98cb1216a6	[InstCombine] foldICmpWithLowBitMaskedVal(): don't miscompile -1 vector elts I was finally able to quantify what I thought was missing in the fix: it was vector constants. If we have a scalar (and %x, -1), it will be instsimplified before we reach this code, but if it is a vector, we may still have a -1 element. Thus, we want to avoid the fold if at least one element is -1. Or in other words, ignoring the undef elements, no sign bits should be set. Thus, m_NonNegative(). A follow-up for rL348181 https://bugs.llvm.org/show_bug.cgi?id=39861 llvm-svn: 348462	2018-12-06 08:14:24 +00:00
Max Kazantsev	0b1d069d64	[LoopSimplifyCFG] Delete dead in-loop blocks This patch teaches LoopSimplifyCFG to delete loop blocks that have become unreachable after terminator folding has been done. Differential Revision: https://reviews.llvm.org/D54023 Reviewed By: anna llvm-svn: 348457	2018-12-06 05:45:02 +00:00
Sanjay Patel	998ececef0	[InstCombine] remove dead code from visitExtractElement Extracting from a splat constant is always handled by InstSimplify. Move the test for this from InstCombine to InstSimplify to make sure that stays true. llvm-svn: 348423	2018-12-05 23:09:33 +00:00
Sanjay Patel	47b3b4b5aa	[InstCombine] reduce duplication in visitExtractElementInst; NFC llvm-svn: 348418	2018-12-05 21:57:51 +00:00
Vedant Kumar	09415a850e	[CodeExtractor] Do not mark outlined calls which may resume EH as noreturn Treat terminators which resume exception propagation as returning instructions (at least, for the purposes of marking outlined functions `noreturn`). This is to avoid inserting traps after calls to outlined functions which unwind. rdar://46129950 llvm-svn: 348404	2018-12-05 19:35:37 +00:00
Christian Bruel	4ead99b3ac	Allow norecurse attribute on functions that have debug infos. Summary: debug intrinsics might be marked norecurse to enable the caller function to be norecurse and optimized if needed. This avoids code gen optimisation differences when -g is used, as in globalOpt.cpp:processInternalGlobal checks. Reviewers: chandlerc, jmolloy, aprantl Reviewed By: aprantl Subscribers: aprantl, llvm-commits Differential Revision: https://reviews.llvm.org/D55187 llvm-svn: 348381	2018-12-05 16:48:00 +00:00
Sanjay Patel	baffae91b2	[InstCombine] simplify icmps with same operands based on dominating cmp The tests here are based on the motivating cases from D54827. More background: 1. We don't get these cases in general with SimplifyCFG because the root of the pattern match is an icmp, not a branch. I'm not sure how often we encounter this pattern vs. the seemingly more likely case with branches, but I don't see evidence to leave the minimal pattern unoptimized. 2. This has a chance of increasing compile-time because we're using a ValueTracking call to handle the match. The motivating cases could be handled with a simpler pair of calls to isImpliedTrueByMatchingCmp/ isImpliedFalseByMatchingCmp, but I saw that we have a more comprehensive wrapper around those, so we might as well use it here unless there's evidence that it's significantly slower. 3. Ideally, we'd handle the fold to constants in InstSimplify, but as with the existing code here, we could extend this to handle cases where the result is not a constant, but a new combined predicate. That would mean splitting the logic across the 2 passes and possibly duplicating the pattern-matching cost. 4. As mentioned in D54827, this seems like the kind of thing that should be handled in Correlated Value Propagation, but that pass is currently limited to dealing with instructions with constant operands, so extending this bit of InstCombine is the smallest/easiest way to get these patterns optimized. llvm-svn: 348367	2018-12-05 15:04:00 +00:00
Alina Sbirlea	0e216854f9	[LICM] Actually disable ControlFlowHoisting. Summary: The remaining code paths that ControlFlowHoisting introduced that were not disabled, increased compile time by 3x for some benchmarks. The time is spent in DominatorTree updates. Reviewers: john.brawn, mkazantsev Subscribers: sanjoy, jlebar, llvm-commits Differential Revision: https://reviews.llvm.org/D55313 llvm-svn: 348345	2018-12-05 10:16:21 +00:00
Vitaly Buka	8076c57fd2	[asan] Add clang flag -fsanitize-address-use-odr-indicator Reviewers: eugenis, m.ostapenko, ygribov Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D55157 llvm-svn: 348327	2018-12-05 01:44:31 +00:00
Vitaly Buka	d6bab09b4b	[asan] Split -asan-use-private-alias to -asan-use-odr-indicator Reviewers: eugenis, m.ostapenko, ygribov Subscribers: mehdi_amini, kubamracek, hiraditya, steven_wu, dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D55156 llvm-svn: 348316	2018-12-04 23:17:41 +00:00
Sanjay Patel	3d5bb15a1d	[CmpInstAnalysis] fix function signature for ICmp code to predicate; NFC The old function underspecified the return type, took an unused parameter, and had a misleading name. llvm-svn: 348292	2018-12-04 18:53:27 +00:00
Sanjay Patel	d23b5ed857	[InstCombine] rearrange foldICmpWithDominatingICmp; NFC Move it out from under the constant check, reorder predicates, add comments. This makes it easier to extend to handle the non-constant case. llvm-svn: 348284	2018-12-04 17:44:24 +00:00
Ilya Biryukov	449a7f0dbb	Revert "Adapt gcov to changes in CFE." This reverts commit r348203. Reason: this produces absolute paths in .gcno files, breaking us internally as we rely on them being consistent with the filenames passed in the command line. Also reverts r348157 and r348155 to account for revert of r348154 in clang repository. llvm-svn: 348279	2018-12-04 16:30:31 +00:00
Sanjay Patel	a40bf9fff7	[InstCombine] add helper for icmp with dominator; NFC There's a potential small enhancement to this code that could solve the cases currently under proposal in D54827 via SimplifyCFG. Whether instcombine should be doing this kind of semi-non-local analysis in the first place is an open question, but separating the logic out can only help if/when we decide to move it to a different pass. AFAICT, any proposal to do this in SimplifyCFG could also be seen as an overreach + it would be incomplete to start the fold from a branch rather than an icmp. There's another question here about the code for processUGT_ADDCST_ADD(). That part may be completely dead after rL234638 ? llvm-svn: 348273	2018-12-04 15:35:17 +00:00
Alina Sbirlea	797935f4f1	[SimpleLoopUnswitch] Remove debug dump. llvm-svn: 348267	2018-12-04 14:43:24 +00:00
Alina Sbirlea	a2eebb828e	Update MemorySSA in SimpleLoopUnswitch. Summary: Teach SimpleLoopUnswitch to preserve MemorySSA. Subscribers: sanjoy, jlebar, Prazek, george.burgess.iv, llvm-commits Differential Revision: https://reviews.llvm.org/D47022 llvm-svn: 348263	2018-12-04 14:23:37 +00:00
Vitaly Buka	537cfc0352	[asan] Reduce binary size by using unnamed private aliases Summary: --asan-use-private-alias increases binary sizes by 10% or more. Most of this space was long names of aliases and new symbols. These symbols are not needed for the ODR check at all. Reviewers: eugenis Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D55146 llvm-svn: 348221	2018-12-04 00:36:14 +00:00
Vedant Kumar	d129569e34	[CodeExtractor] Split PHI nodes with incoming values from outlined region (PR39433) If a PHI node outside the extracted region has multiple incoming values from it, split this PHI into two parts. The first PHI has incoming values only from the region and is extracted with it (it is placed in a separate basic block that is added to the list of outlined blocks), and the incoming values in the original PHI are replaced by the first PHI. A similar solution is already used in CodeExtractor for PHIs in the entry block (the severSplitPHINodes method). This fixes PR39433. Patch by Sergei Kachkov! Differential Revision: https://reviews.llvm.org/D55018 llvm-svn: 348205	2018-12-03 22:40:21 +00:00
Adrian Prantl	40eb622325	Adapt gcov to changes in CFE. The clang frontend no longer emits the current working directory for DIFiles containing an absolute path in the filename: and will move the common prefix between current working directory and the file into the directory: component. This fixes the GCOV tests in compiler-rt that were broken by the Clang change. llvm-svn: 348203	2018-12-03 22:37:48 +00:00
Sanjay Patel	8c65515082	[InstCombine] fix undef propagation bug with shuffle+binop When we have a shuffle that extends a source vector with undefs and then do some binop on that, we must make sure that the extra elements remain undef with that binop if we reverse the order of the binop and shuffle. 'or' is probably the easiest example to show the bug because 'or C, undef --> -1' (not undef). But there are other opcode/constant combinations where this is true as shown by the 'shl' test. llvm-svn: 348191	2018-12-03 21:15:17 +00:00
Roman Lebedev	7bf2fed167	[InstCombine] foldICmpWithLowBitMaskedVal(): disable 2 faulty folds. These two folds are invalid for this non-constant pattern when the mask ends up being all-ones: https://rise4fun.com/Alive/9au https://rise4fun.com/Alive/UcQM Fixes https://bugs.llvm.org/show_bug.cgi?id=39861 llvm-svn: 348181	2018-12-03 20:07:58 +00:00
Sanjay Patel	f2bda5e43f	[InstCombine] rearrange shuffle+binop fold; NFC This code has a bug dealing with undefs, so we need to add another escape hatch, so doing some cleanup ahead of that. llvm-svn: 348175	2018-12-03 19:53:04 +00:00
Sanjay Patel	472652ef68	[CmpInstAnalysis] fix formatting; NFC There are potential improvements to the structure of this API raised by D54994, but remove some cosmetic blemishes before making any functional changes. llvm-svn: 348149	2018-12-03 15:48:30 +00:00
Alexander Potapenko	7502e5fc56	[KMSAN] Enable -msan-handle-asm-conservative by default This change enables conservative assembly instrumentation in KMSAN builds by default. It's still possible to disable it with -msan-handle-asm-conservative=0 if something breaks. It's now impossible to enable conservative instrumentation for userspace builds, but it's not used anyway. llvm-svn: 348112	2018-12-03 10:15:43 +00:00
Sanjay Patel	7d82d37854	[ValueTracking] add helper function for testing implied condition; NFCI We were duplicating code around the existing isImpliedCondition() that checks for a predecessor block/dominating condition, so make that a wrapper call. llvm-svn: 348088	2018-12-02 13:26:03 +00:00
Nikita Popov	0c5d6ccbfc	[InstCombine] Support ssub.sat canonicalization for non-splats Extend ssub.sat(X, C) -> sadd.sat(X, -C) canonicalization to also support non-splat vector constants. This is done by generalizing the implementation of the isNotMinSignedValue() helper to return true for constants that are non-splat, but don't contain any signed min elements. Differential Revision: https://reviews.llvm.org/D55011 llvm-svn: 348072	2018-12-01 10:58:34 +00:00
Joseph Tremoulet	27b1e3bd4f	[Mem2Reg] Fix nondeterministic corner case Summary: When mem2reg inserts phi nodes in blocks with unreachable predecessors, it adds undef operands for those incoming edges. When there are multiple such predecessors, the order is currently based on the address of the BasicBlocks. This change fixes that by using the BBNumbers in the sort/search predicates, as is done elsewhere in mem2reg to ensure determinism. Also adds a testcase with a bunch of unreachable preds, which (nondeterministically) fails without the fix. Reviewers: majnemer Reviewed By: majnemer Subscribers: mgrang, llvm-commits Differential Revision: https://reviews.llvm.org/D55077 llvm-svn: 348024	2018-11-30 19:20:02 +00:00
Alexey Bataev	3689747619	[SLP]PR39774: Update references of the replaced external instructions. Summary: An additional fix for PR39774. Need to update the references for the ReductionRoot instruction when it is replaced during the vectorization phase to avoid compiler crash on reduction vectorization. Reviewers: RKSimon, spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D55017 llvm-svn: 347997	2018-11-30 15:14:20 +00:00
Max Kazantsev	9cf417db78	[LoopSimplifyCFG] Update MemorySSA in terminator folding. PR39783 Terminator folding transform lacks MemorySSA update for memory Phis, while they exist within MemorySSA analysis. They need exactly the same type of updates as regular Phis. Failing to update them properly ends up with inconsistent MemorySSA and manifests in various assertion failures. This patch adds Memory Phi updates to this transform. Thanks to @jonpa for finding this! Differential Revision: https://reviews.llvm.org/D55050 Reviewed By: asbirlea llvm-svn: 347979	2018-11-30 10:06:23 +00:00
David Stuttard	c6603861d8	Revert r347871 "Fix: Add support for TFE/LWE in image intrinsic" Also revert fix r347876 One of the buildbots was reporting a failure in some relevant tests that I can't repro or explain at present, so reverting until I can isolate. llvm-svn: 347911	2018-11-29 20:14:17 +00:00
Sanjay Patel	d802270808	[InstSimplify] fold select with implied condition This is an almost direct move of the functionality from InstCombine to InstSimplify. There's no reason not to do this in InstSimplify because we never create a new value with this transform. (There's a question of whether any dominance-based transform belongs in either of these passes, but that's a separate issue.) I've changed 1 of the conditions for the fold (1 of the blocks for the branch must be the block we started with) into an assert because I'm not sure how that could ever be false. We need 1 extra check to make sure that the instruction itself is in a basic block because passes other than InstCombine may be using InstSimplify as an analysis on values that are not wired up yet. The 3-way compare changes show that InstCombine has some kind of phase-ordering hole. Otherwise, we would have already gotten the intended final result that we now show here. llvm-svn: 347896	2018-11-29 18:44:39 +00:00
John Brawn	a7eb2c863f	[LICM] Reapply r347776 "Make LICM able to hoist phis" with fix This commit caused a large compile-time slowdown in some cases when NDEBUG is off due to the dominator tree verification it added. Fix this by only doing dominator tree and loop info verification when something has been hoisted. Differential Revision: https://reviews.llvm.org/D52827 llvm-svn: 347889	2018-11-29 17:10:00 +00:00
Teresa Johnson	93f9996278	[ThinLTO] Import local variables from the same module as caller Summary: We can sometimes end up with multiple copies of a local variable that have the same GUID in the index. This happens when there are local variables with the same name that are in different source files having the same name/path at compile time (but compiled into different bitcode objects). In this case make sure we import the copy in the caller's module. This enables importing both of the variables having the same GUID (but which will have different promoted names since the module paths, and therefore the module hashes, will be distinct). Importing the wrong copy is particularly problematic for read only variables, since we must import them as a local copy whenever referenced. Otherwise we get undefs at link time. Note that the llvm-lto.cpp and ThinLTOCodeGenerator changes are needed for testing the distributed index case via clang, which will be sent as a separate clang-side patch shortly. We were previously not doing the dead code/read only computation before computing imports when testing distributed index generation (like it was for testing importing and other ThinLTO mechanisms alone). Reviewers: evgeny777 Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, dang, llvm-commits Differential Revision: https://reviews.llvm.org/D55047 llvm-svn: 347886	2018-11-29 17:02:42 +00:00
Joseph Tremoulet	926ee459c4	[CallSiteSplitting] Report edge deletion to DomTreeUpdater Summary: When splitting musttail calls, the split blocks' original terminators get removed; inform the DTU when this happens. Also add a testcase that fails an assertion in the DTU without this fix. Reviewers: fhahn, junbuml Reviewed By: fhahn Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D55027 llvm-svn: 347872	2018-11-29 15:27:04 +00:00
David Stuttard	de02e4b1cc	Add support for TFE/LWE in image intrinsics TFE and LWE support requires extra result registers that are written in the event of a failure in order to detect that failure case. The specific use-case that initiated these changes is sparse texture support. This means that if image intrinsics are used with either option turned on, the programmer must ensure that the return type can contain all of the expected results. This can result in redundant registers since the vector size must be a power-of-2. This change takes roughly 6 parts: 1. Modify the instruction defs in tablegen to add new instruction variants that can accommodate the extra return values. 2. Updates to lowerImage in SIISelLowering.cpp to accommodate setting TFE or LWE (where the bulk of the work for these instruction types is now done) 3. Extra verification code to catch cases where intrinsics have been used but insufficient return registers are used. 4. Modification to the adjustWritemask optimisation to account for TFE/LWE being enabled (requires extra registers to be maintained for error return value). 5. An extra pass to zero initialize the error value return - this is because if the error does not occur, the register is not written and thus must be zeroed before use. Also added a new (on by default) option to ensure ALL return values are zero-initialized that is required for sparse texture support. 6. Disable the inst_combine optimization in the presence of tfe/lwe (later TODO for this to re-enable and handle correctly). There's an additional fix now to avoid a dmask=0. For an image intrinsic with tfe where all result channels except tfe were unused, I was getting an image instruction with dmask=0 and only a single vgpr result for tfe. That is incorrect because the hardware assumes there is at least one vgpr result, plus the one for tfe. Fixed by forcing dmask to 1, which gives the desired two vgpr result with tfe in the second one. The TFE or LWE result is returned from the intrinsics using an aggregate type. Look in the test code provided to see how this works, but in essence IR code to invoke the intrinsic looks as follows: %v = call {<4 x float>,i32} @llvm.amdgcn.image.load.1d.v4f32i32.i32(i32 15, i32 %s, <8 x i32> %rsrc, i32 1, i32 0) %v.vec = extractvalue {<4 x float>, i32} %v, 0 %v.err = extractvalue {<4 x float>, i32} %v, 1 Differential revision: https://reviews.llvm.org/D48826 Change-Id: If222bc03642e76cf98059a6bef5d5bffeda38dda llvm-svn: 347871	2018-11-29 15:21:13 +00:00
Sanjay Patel	8242c82de4	[CVP] tidy processCmp(); NFC 1. The variables were confusing: 'C' typically refers to a constant, but here it was the Cmp. 2. Formatting violations. 3. Simplify code to return true/false constant. llvm-svn: 347868	2018-11-29 14:41:21 +00:00
Martin Storsjo	bfd1d27585	Revert "[LICM] Enable control flow hoisting by default" and "[LICM] Reapply r347190 "Make LICM able to hoist phis" with fix" This reverts commits r347776 and r347778. The first one, r347776, caused significant compile time regressions for certain input files, see PR39836 for details. llvm-svn: 347867	2018-11-29 14:39:39 +00:00
Max Kazantsev	24c186ff00	Disable TermFolding in LoopSimplifyCFG until PR39783 is fixed llvm-svn: 347844	2018-11-29 09:00:19 +00:00
Sam Parker	d6ebf0108e	[LoopStrengthReduce] ComplexityLimit as an option Convert ComplexityLimit into a command line value. Differential Revision: https://reviews.llvm.org/D54899 llvm-svn: 347843	2018-11-29 08:34:22 +00:00
Jeremy Morse	9b4cfa55b1	[DebugInfo] Give inlinable calls DILocs (PR39807) In PR39807 we incorrectly handle circumstances where calls are common'd from conditional blocks into the parent BB. Calls that can be inlined must always have DebugLocs, however we strip them during commoning, which the IR verifier asserts on. Fix this by using applyMergedLocation: it will perform the same DebugLoc stripping of conditional Locs, but will also generate an unknown location DebugLoc that satisfies the requirement for inlinable calls to always have locations. Some of the prior logic for selecting a DebugLoc is now likely redundant; I'll generate a follow-up to remove it (involves editing more regression tests). Differential Revision: https://reviews.llvm.org/D54997 llvm-svn: 347782	2018-11-28 17:58:45 +00:00
John Brawn	4557ffeb63	[LICM] Enable control flow hoisting by default Differential Revision: https://reviews.llvm.org/D54949 llvm-svn: 347778	2018-11-28 17:23:03 +00:00
John Brawn	31c9769580	[LICM] Reapply r347190 "Make LICM able to hoist phis" with fix This commit caused failures because it failed to correctly handle cases where we hoist a phi, then hoist a use of that phi, then have to rehoist that use. We need to make sure that we rehoist the use to _after_ the hoisted phi, which we do by always rehoisting to the immediate dominator instead of just rehoisting everything to the original preheader. An option is also added to control whether control flow is hoisted, which is off in this commit but will be turned on in a subsequent commit. Differential Revision: https://reviews.llvm.org/D52827 llvm-svn: 347776	2018-11-28 17:21:49 +00:00
Nikita Popov	8d63aed459	[InstCombine] Combine saturating add/sub with constant operands Combine sat(sat(X + C1) + C2) -> sat(X + (C1+C2)) and sat(sat(X - C1) - C2) -> sat(X - (C1+C2)) if the sign of C1 and C2 matches. In the unsigned case we can compute C1+C2 with saturating arithmetic, and InstSimplify will reduce this just to the saturation value. For the signed case, we cannot perform the simplification if the result of the addition overflows. This change is part of https://reviews.llvm.org/D54534. llvm-svn: 347773	2018-11-28 16:37:15 +00:00
Nikita Popov	42f89989a1	[InstCombine] Canonicalize ssub.sat to sadd.sat Canonicalize ssub.sat(X, C) to sadd.sat(X, -C) if C is constant and not signed minimum. This will help further optimizations to apply. This change is part of https://reviews.llvm.org/D54534. llvm-svn: 347772	2018-11-28 16:37:09 +00:00
Nikita Popov	78a9295e15	[InstCombine] Use known overflow information for saturating add/sub If ValueTracking can determine that the add/sub can never overflow, replace it with the corresponding nuw/nsw add/sub. Additionally, for the unsigned case, if ValueTracking determines that the add/sub always overflows, replace the result with the saturation value. This change is part of https://reviews.llvm.org/D54534. llvm-svn: 347770	2018-11-28 16:36:59 +00:00
Nikita Popov	085d24a8b3	[InstCombine] Canonicalize const arg for saturating adds If a saturating add intrinsic has one constant argument, make sure it is on the RHS. This will simplify further transformations. This change is part of https://reviews.llvm.org/D54534. llvm-svn: 347769	2018-11-28 16:36:52 +00:00
Xin Tong	53e52e47e8	[ThinLTO] Correct linkonce_any function import linkage. NFC. Summary: This is a NFC as we do not import non-odr vague linkage when computing for import list for a module. Reviewers: tejohnson, pcc Subscribers: inglorion, dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D54928 llvm-svn: 347763	2018-11-28 15:16:35 +00:00
Alexey Bataev	579c2d9d64	[SLP]Fix PR39774: Set ReductionRoot if the original instruction is vectorized. Summary: If the original reduction root instruction was vectorized, it might be removed from the tree. It means that the insertion point may become invalidated and the whole vectorization of the reduction leads to the incorrect output result. The ReductionRoot instruction must be marked as externally used so it could not be removed. Otherwise it might cause inconsistency with the cost model and we may end up with too optimistic optimization. Reviewers: RKSimon, spatel, hfinkel, mkuper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D54955 llvm-svn: 347759	2018-11-28 14:34:11 +00:00
Florian Hahn	fd6ea134f4	[PartialInliner] Make PHIs free in cost computation. InlineCost also treats them as free and the current implementation can cause assertion failures if PHI nodes are moved outside the region from entry BBs to the region. It also updates the code to use the instructionsWithoutDebug iterator. Reviewers: davidxl, davide, vsk, graham-yiu-huawei Reviewed By: davidxl Differential Revision: https://reviews.llvm.org/D54748 llvm-svn: 347683	2018-11-27 18:17:27 +00:00
Tim Northover	81bff5e6ea	InstCombine: add comment explaining malloc deletion. NFC. I tried to change this, not quite realising the logic behind what we were doing. Hopefully this comment will help the next person to come along. llvm-svn: 347653	2018-11-27 11:08:14 +00:00
Max Kazantsev	70b11c6d31	[LoopSimplifyCFG] Turn on term folding after underlying bug fixed llvm-svn: 347641	2018-11-27 06:19:42 +00:00
Max Kazantsev	c4e4d6449a	[LoopSimplifyCFG] Fix corner case with duplicating successors It fixes a bug that doesn't update Phi inputs of the only live successor that is in the list of block's successors more than once. Thanks @uabelho for finding this. Differential Revision: https://reviews.llvm.org/D54849 Reviewed By: anna llvm-svn: 347640	2018-11-27 06:17:21 +00:00
Xin Tong	04d49779a1	[ICP] Remove incompatible attributes at indirect-call promoted callsites. Summary: Remove incompatible attributes at indirect-call promoted callsites; not removing them results in at least an IR verification error. Reviewers: davidxl, xur, mssimpso Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D54913 llvm-svn: 347605	2018-11-26 22:03:52 +00:00
Sanjay Patel	790af91803	[InstCombine] add helper function to reduce code duplication; NFC llvm-svn: 347604	2018-11-26 22:00:41 +00:00
Florian Hahn	6615a7132a	[IPSCCP] Use input operand instead of OriginalOp for ssa_copy. OriginalOp of a Predicate refers to the original IR value, before renaming. While solving in IPSCCP, we have to use the operand of the ssa_copy instead, to avoid missing updates for nested conditions on the same IR value. Fixes PR39772. llvm-svn: 347524	2018-11-25 16:32:02 +00:00
Nikita Popov	2c779c0e34	[InstCombine] Determine demanded and known bits for funnel shifts Support funnel shifts in InstCombine demanded bits simplification. If the shift amount is constant, we can determine both the demanded bits of the operands, as well as the known bits of the result. If one of the operands has no demanded bits, it will be replaced by undef and the funnel shift will be simplified into a simple shift due to the simplifications added in D54778. Differential Revision: https://reviews.llvm.org/D54869 llvm-svn: 347515	2018-11-24 19:00:45 +00:00
Nikita Popov	6e81d421e1	[InstCombine] Simplify funnel shift with zero/undef operand to shift The following simplifications are implemented: * `fshl(X, 0, C) -> shl X, C%BW` * `fshl(X, undef, C) -> shl X, C%BW` (assuming undef = 0) * `fshl(0, X, C) -> lshr X, BW-C%BW` * `fshl(undef, X, C) -> lshr X, BW-C%BW` (assuming undef = 0) * `fshr(X, 0, C) -> shl X, BW-C%BW` * `fshr(X, undef, C) -> shl X, BW-C%BW` (assuming undef = 0) * `fshr(0, X, C) -> lshr X, C%BW` * `fshr(undef, X, C) -> lshr X, C%BW` (assuming undef = 0) The simplification is only performed if the shift amount C is constant, because we can explicitly compute C%BW and BW-C%BW in this case. Differential Revision: https://reviews.llvm.org/D54778 llvm-svn: 347505	2018-11-23 22:45:08 +00:00
Max Kazantsev	e1c2dc27d3	Disable LoopSimplifyCFG terminator folding by default llvm-svn: 347486	2018-11-23 09:14:53 +00:00
Max Kazantsev	cb8e240334	[LoopSimplifyCFG] Don't delete LCSSA Phis When removing edges, we also update Phi inputs and may end up removing a Phi if it has only one input. We should not do it for edges that leave the current loop because these Phis are LCSSA Phis and need to be preserved. Thanks @dmgreen for finding this! Differential Revision: https://reviews.llvm.org/D54841 llvm-svn: 347484	2018-11-23 07:56:47 +00:00
Max Kazantsev	b565e6093b	[NFC] Assert that all blocks staying in loop are live llvm-svn: 347458	2018-11-22 12:43:27 +00:00
Max Kazantsev	56a2443024	[NFC] Ensure deterministic order of dead exit blocks llvm-svn: 347457	2018-11-22 12:33:41 +00:00
Max Kazantsev	d9f59f8c80	[NFC] Simplify code by using standard exit blocks collection llvm-svn: 347454	2018-11-22 10:48:30 +00:00
Fedor Sergeev	59246b6bfe	[PM] correcting return value for new-pass-manager version of Scalarizer Obvious mistake missed during D54695 review. llvm-svn: 347432	2018-11-21 22:01:19 +00:00
Nikita Popov	6f54fb0052	[MergeFuncs] Generate alias instead of thunk if possible The MergeFunctions pass was originally intended to emit aliases instead of thunks where possible (unnamed_addr). However, for a long time this functionality was behind a flag hardcoded to false, bitrotted and was eventually removed in r309313. Originally the functionality was first disabled in r108417 due to lack of support for aliases in Mach-O. I believe that this is no longer the case nowadays, but not really familiar with this area. In the interest of being conservative, this patch reintroduces the aliasing functionality behind a default disabled -mergefunc-use-aliases flag. Differential Revision: https://reviews.llvm.org/D53285 llvm-svn: 347407	2018-11-21 19:37:19 +00:00
Mikael Holmen	b6f76002d9	[PM] Port Scalarizer to the new pass manager. Patch by: markus (Markus Lavin) Reviewers: chandlerc, fedor.sergeev Reviewed By: fedor.sergeev Subscribers: llvm-commits, Ka-Ka, bjope Differential Revision: https://reviews.llvm.org/D54695 llvm-svn: 347392	2018-11-21 14:00:17 +00:00
Guozhi Wei	c21fba1bab	[LoopSink] Add preheader to alias set This patch fixes PR39695. The original LoopSink only considers memory alias in loop body. But PR39695 shows that instructions following sink candidate in preheader should also be checked. This is a conservative patch, it simply adds whole preheader block to alias set. It may lose some optimization opportunity, but I think that is very rare because: 1 in the most common case st/ld to the same address, the load should already be optimized away. 2 usually preheader is not very large. Differential Revision: https://reviews.llvm.org/D54659 llvm-svn: 347325	2018-11-20 16:49:07 +00:00
Max Kazantsev	c04b5307d1	Recommit "[LoopSimplifyCFG] Teach LoopSimplifyCFG to constant-fold branches and switches" The initial version of patch lacked Phi nodes updates in destinations of removed edges. This version contains this update and tests on this situation. Differential Revision: https://reviews.llvm.org/D54021 llvm-svn: 347289	2018-11-20 05:43:32 +00:00
Reid Kleckner	994a8451ba	[Transforms] Prefer static and avoid namespaces, NFC Put 'static' on three functions in an anonymous namespace as per our coding style. Remove the 'namespace llvm {}' around the .cpp file and explicitly declare the free function 'llvm::optimizeGlobalCtorsList' in 'llvm::'. I prefer this style for free functions because the compiler will error out if the .h and .cpp files don't agree on the function name or prototype. llvm-svn: 347269	2018-11-19 22:19:05 +00:00
Benjamin Kramer	fdd9b4fc8f	Revert "[LoopSimplifyCFG] Teach LoopSimplifyCFG to constant-fold branches and switches" This reverts commits r347183 & r347184. Crashes while building libxml. llvm-svn: 347260	2018-11-19 20:01:20 +00:00
Vedant Kumar	238533ec2e	[InstCombine] Set debug loc on `mergeStoreIntoSuccessor` phi Assigning a merged debug location to the `mergeStoreIntoSuccessor` phi improves backtrace quality. Fixes llvm.org/PR38083. llvm-svn: 347257	2018-11-19 19:55:02 +00:00
Vedant Kumar	4de31bba51	[IR] Add hasNPredecessors, hasNPredecessorsOrMore to BasicBlock Add methods to BasicBlock which make it easier to efficiently check whether a block has N (or more) predecessors. This can be more efficient than using pred_size(), which is a linear time operation. We might consider adding similar methods for successors. I haven't done so in this patch because succ_size() is already O(1). With this patch applied, I measured a 0.065% compile-time reduction in user time for running `opt -O3` on the sqlite3 amalgamation (30 trials). The change in mergeStoreIntoSuccessor alone saves 45 million linked list iterations in a stage2 Release build of llc. See llvm.org/PR39702 for a harder but more general way of achieving similar results. Differential Revision: https://reviews.llvm.org/D54686 llvm-svn: 347256	2018-11-19 19:54:27 +00:00
Benjamin Kramer	2cad359c91	Revert "[LICM] Make LICM able to hoist phis" This reverts commit r347190. llvm-svn: 347225	2018-11-19 16:51:57 +00:00
Anna Thomas	5e9215f02b	[LV] Avoid vectorizing unsafe dependencies in uniform address Summary: Currently, when vectorizing stores to uniform addresses, the only instance we prevent vectorization is if there are multiple stores to the same uniform address causing an unsafe dependency. This patch teaches LAA to avoid vectorizing loops that have an unsafe cross-iteration dependency between a load and a store to the same uniform address. Fixes PR39653. Reviewers: Ayal, efriedma Subscribers: rkruppe, llvm-commits Differential Revision: https://reviews.llvm.org/D54538 llvm-svn: 347220	2018-11-19 15:39:59 +00:00
John Brawn	12c046fba0	[LICM] Make LICM able to hoist phis The general approach taken is to make note of loop invariant branches, then when we see something conditional on that branch, such as a phi, we create a copy of the branch and (empty versions of) its successors and hoist using that. This has no impact by itself that I've been able to see, as LICM typically doesn't see such phis as they will have been converted into selects by the time LICM is run, but once we start doing phi-to-select conversion later it will be important. Differential Revision: https://reviews.llvm.org/D52827 llvm-svn: 347190	2018-11-19 11:31:24 +00:00
Max Kazantsev	8e3e33d138	[LoopSimplifyCFG] Teach LoopSimplifyCFG to constant-fold branches and switches This patch introduces infrastructure and the simplest case for constant-folding of branch and switch instructions within loop into unconditional branches. It is useful as a cleanup for such passes as loop unswitching that sometimes produce such branches. Only the simplest case supported in this patch: after the folding, no block should become dead or stop being part of the loop. Support for more sophisticated cases will go separately in follow-up patches. Differential Revision: https://reviews.llvm.org/D54021 Reviewed By: anna llvm-svn: 347183	2018-11-19 05:54:38 +00:00
Vedant Kumar	e7b789b529	[ProfileSummary] Standardize methods and fix comment Every Analysis pass has a get method that returns a reference to the Result of the Analysis, for example, BlockFrequencyInfo &BlockFrequencyInfoWrapperPass::getBFI(). I believe that ProfileSummaryInfo::getPSI() is the only exception to that, as it was returning a pointer. Another change is renaming isHotBB and isColdBB to isHotBlock and isColdBlock, respectively. Most methods use BB as the argument or variable names while methods usually refer to Basic Blocks as Blocks, instead of BB. For example, Function::getEntryBlock, Loop::getExitBlock, etc. I also fixed one of the comments. Patch by Rodrigo Caetano Rocha! Differential Revision: https://reviews.llvm.org/D54669 llvm-svn: 347182	2018-11-19 05:23:16 +00:00
Vedant Kumar	35f504c113	[CorrelatedValuePropagation] Preserve debug locations (PR38178) Fix all of the missing debug location errors in CVP found by debugify. This includes the missing-location-after-udiv-truncation case described in llvm.org/PR38178. llvm-svn: 347147	2018-11-18 00:29:58 +00:00
Fangrui Song	7570932977	Use llvm::copy. NFC llvm-svn: 347126	2018-11-17 01:44:25 +00:00
Fedor Sergeev	2e3e224e71	[SimpleLoopUnswitch] adding cost multiplier to cap exponential unswitch with We need to control exponential behavior of loop-unswitch so we do not get run-away compilation. Suggested solution is to introduce a multiplier for an unswitch cost that makes cost prohibitive as soon as there are too many candidates and too many sibling loops (meaning we have already started duplicating loops by unswitching). It does solve the currently known problem with compile-time degradation (PR 39544). Tests are built on top of a recently implemented CHECK-COUNT-<num> FileCheck directives. Reviewed By: chandlerc, mkazantsev Differential Revision: https://reviews.llvm.org/D54223 llvm-svn: 347097	2018-11-16 21:16:43 +00:00
Adrian Prantl	83d87520ed	GlobalDCE: Teach isEmptyFunction() to ignore debug intrinsics. This fixes PR39669. https://bugs.llvm.org/show_bug.cgi?id=39669 llvm-svn: 347065	2018-11-16 17:47:21 +00:00
Eugene Leviant	bf46e7410c	[ThinLTO] Internalize readonly globals An attempt to recommit r346584 after failure on OSX build bot. Fixed cache key computation in ThinLTOCodeGenerator and added test case llvm-svn: 347033	2018-11-16 07:08:00 +00:00
Xin Tong	642c8d3575	[LTO] Load sample profile in LTO link step. Summary: Load sample profile in LTO link step. ThinLTO calls populateModulePassManager to load the profile Reviewers: tejohnson, davidxl, danielcdh Subscribers: mehdi_amini, inglorion, steven_wu, dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D54564 llvm-svn: 346971	2018-11-15 18:06:42 +00:00
Sanjay Patel	bc56b2432d	[InstCombine] fix rotate narrowing bug for non-pow-2 types llvm-svn: 346968	2018-11-15 17:19:14 +00:00
Mandeep Singh Grang	0905fc77c1	[InstCombine] Remove a couple of asserts based on incorrect assumptions Summary: These asserts are based on the assumption that the order of true/false operands in a select and those in the compare would always be the same. This fixes PR39595. Reviewers: craig.topper, spatel, dmgreen Reviewed By: craig.topper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D54359 llvm-svn: 346874	2018-11-14 17:55:07 +00:00
Sanjay Patel	6072842770	[InstCombine] fix formatting for matchBSwap(); NFC We should have a similar function for matching rotate and/or funnel shift, so tidy up the related existing call. llvm-svn: 346871	2018-11-14 16:03:36 +00:00
Florian Hahn	6df11868b5	[VPlan, SLP] Use SmallPtrSet for Candidates. This slightly improves the candidate handling in getBest(). llvm-svn: 346870	2018-11-14 15:58:40 +00:00
Florian Hahn	02cb67deb9	[VPlan] Remove LLVM_DEBUG from VPlanSlp::dumpBundle. The caller should take care of only calling it with debug enabled. llvm-svn: 346860	2018-11-14 13:33:44 +00:00
Florian Hahn	2eca3728ee	[VPlan] Update ifdef. llvm-svn: 346858	2018-11-14 13:21:26 +00:00
Florian Hahn	09e516c54b	[VPlan, SLP] Add simple SLP analysis on top of VPlan. This patch adds an initial implementation of the look-ahead SLP tree construction described in 'Look-Ahead SLP: Auto-vectorization in the Presence of Commutative Operations, CGO 2018 by Vasileios Porpodas, Rodrigo C. O. Rocha, Luís F. W. Góes'. It returns an SLP tree represented as VPInstructions, with combined instructions represented as a single, wider VPInstruction. This initial version does not support instructions with multiple different users (either inside or outside the SLP tree) or non-instruction operands; it won't generate any shuffles or insertelement instructions. It also just adds the analysis that builds an SLP tree rooted in a set of stores. It does not include any cost modeling or memory legality checks. The plan is to integrate it with VPlan-based cost modeling, once available, and to only apply it to operations that can be widened. A follow-up patch will add support for replacing instructions in a VPlan with their SLP counterparts. Reviewers: Ayal, mssimpso, rengolin, mkuper, hfinkel, hsaito, dcaballe, vporpo, RKSimon, ABataev Reviewed By: rengolin Differential Revision: https://reviews.llvm.org/D4949 llvm-svn: 346857	2018-11-14 13:11:49 +00:00
Florian Hahn	505091a8f2	Recommit r346483: [CallSiteSplitting] Only record conditions up to the IDom(call site). The underlying problem causing the expensive-check failure was fixed in rL346769. llvm-svn: 346843	2018-11-14 10:04:30 +00:00
Reid Kleckner	41390b47de	Revert r346810 "Preserve loop metadata when splitting exit blocks" It broke the Windows self-host: http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/1457 llvm-svn: 346823	2018-11-14 01:47:32 +00:00
Sanjay Patel	a139564896	[InstCombine] fold funnel shift amount based on demanded bits The shift amount of a funnel shift is modulo the scalar bitwidth: http://llvm.org/docs/LangRef.html#llvm-fshl-intrinsic ...so we can use demanded bits analysis on that operand to simplify it when we have a power-of-2 bitwidth. This is another step towards canonicalizing {shift/shift/or} to the intrinsics in IR. Differential Revision: https://reviews.llvm.org/D54478 llvm-svn: 346814	2018-11-13 23:27:23 +00:00
Craig Topper	3c87c2a3c5	Preserve loop metadata when splitting exit blocks LoopUtils.cpp contains a utility that splits a loop exit block, so that the new block contains only edges coming from the loop. In the case of nested loops, the exit path for the inner loop might also be the back-edge of the outer loop. The new block, which is inserted on this path, is now a latch for the outer loop, and it needs to hold the loop metadata for the outer loop. (The test case gives a more concrete view of the situation.) Patch by Chang Lin (clin1) Differential Revision: https://reviews.llvm.org/D53876 llvm-svn: 346810	2018-11-13 23:06:49 +00:00
Sanjay Patel	f8f12272e8	[InstCombine] canonicalize rotate patterns with cmp/select The cmp+branch variant of this pattern is shown in: https://bugs.llvm.org/show_bug.cgi?id=34924 ...and as discussed there, we probably can't transform that without a rotate intrinsic. We do have that now via funnel shift, but we're not quite ready to canonicalize IR to that form yet. The case with 'select' should already be transformed though, so that's this patch. The sequence with negation followed by masking is what we use in the backend and partly in clang (though that part should be updated). https://rise4fun.com/Alive/TplC %cmp = icmp eq i32 %shamt, 0 %sub = sub i32 32, %shamt %shr = lshr i32 %x, %shamt %shl = shl i32 %x, %sub %or = or i32 %shr, %shl %r = select i1 %cmp, i32 %x, i32 %or => %neg = sub i32 0, %shamt %masked = and i32 %shamt, 31 %maskedneg = and i32 %neg, 31 %shl2 = lshr i32 %x, %masked %shr2 = shl i32 %x, %maskedneg %r = or i32 %shl2, %shr2 llvm-svn: 346807	2018-11-13 22:47:24 +00:00
Florian Hahn	107d0a8756	[CSP, Cloning] Update DuplicateInstructionsInSplitBetween to use DomTreeUpdater. This patch updates DuplicateInstructionsInSplitBetween to update a DTU instead of applying updates to the DT directly. Given that there only are 2 users, also updated them in this patch to avoid churn. I slightly moved the code in CallSiteSplitting around to reduce the places where we have to pass in DTU. If necessary, I could split those changes in a separate patch. This fixes missing DT updates when dealing with musttail calls in CallSiteSplitting, by using DTU->deleteBB. Reviewers: junbuml, kuhar, NutshellySima, indutny, brzycki Reviewed By: NutshellySima llvm-svn: 346769	2018-11-13 17:54:43 +00:00
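
The rotate canonicalization described in commit f8f12272e8 above can be sanity-checked outside LLVM. The following is a minimal Python sketch, not the InstCombine implementation: it models the two 32-bit IR sequences from that commit message with plain integers. The function names are hypothetical, and the comparison is limited to shift amounts 0..31 on the assumption that larger amounts make the original `lshr` undefined.

```python
MASK32 = 0xFFFFFFFF  # model i32 values as Python ints in [0, 2**32)


def rotate_select_form(x, shamt):
    """Original pattern: a select guards the shamt == 0 case."""
    if shamt == 0:
        # In the IR, 'shl i32 %x, 32' would be out of range here,
        # which is why the select returns %x directly.
        return x
    shr = x >> shamt
    shl = (x << (32 - shamt)) & MASK32
    return shr | shl


def rotate_masked_form(x, shamt):
    """Canonical pattern: negate the shift amount, then mask both amounts."""
    neg = (-shamt) & MASK32      # sub i32 0, %shamt (wraps to 32 bits)
    masked = shamt & 31          # and i32 %shamt, 31
    maskedneg = neg & 31         # and i32 %neg, 31
    return (x >> masked) | ((x << maskedneg) & MASK32)


def forms_agree(samples):
    """Compare both forms for every in-range shift amount (0..31)."""
    return all(
        rotate_select_form(x, s) == rotate_masked_form(x, s)
        for x in samples
        for s in range(32)
    )
```

Calling `forms_agree` on a handful of 32-bit values exercises every in-range shift amount, including the zero case that the original `select` exists to protect.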

21060 Commits
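
The funnel-shift simplifications listed in commit 6e81d421e1 can be checked by brute force. This is a minimal Python sketch under stated assumptions, not LLVM code: `fshl` and `fshr` are hand-written models of the intrinsics for an 8-bit width (`BW = 8` is an arbitrary choice), `check_identities` is a hypothetical helper, and amounts with `C % BW == 0` are skipped because `lshr X, BW` would be out of range for the shift.

```python
BW = 8                    # assumed scalar bit width for this sketch
MASK = (1 << BW) - 1


def fshl(a, b, c):
    """Concatenate a:b (a is the high half), shift left by c % BW, keep the high BW bits."""
    c %= BW
    concat = (a << BW) | b
    return ((concat << c) >> BW) & MASK


def fshr(a, b, c):
    """Concatenate a:b (a is the high half), shift right by c % BW, keep the low BW bits."""
    c %= BW
    concat = (a << BW) | b
    return (concat >> c) & MASK


def check_identities():
    """Exhaustively check the four zero-operand rewrites for all 8-bit x."""
    for c in range(1, 3 * BW):
        k = c % BW
        if k == 0:
            continue  # skipped: BW - k would equal BW, an out-of-range shift
        for x in range(1 << BW):
            if fshl(x, 0, c) != (x << k) & MASK:          # shl X, C%BW
                return False
            if fshl(0, x, c) != x >> (BW - k):            # lshr X, BW-C%BW
                return False
            if fshr(x, 0, c) != (x << (BW - k)) & MASK:   # shl X, BW-C%BW
                return False
            if fshr(0, x, c) != x >> k:                   # lshr X, C%BW
                return False
    return True
```

Because both models reduce the amount modulo `BW`, they also reflect the property used in commit a139564896: for a power-of-2 width, only the low bits of the shift amount affect the result.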