llvm-project

Commit Graph

Author	SHA1	Message	Date
Ehud Katz	c6c265527d	Revert "[StructurizeCFG] Fix region nodes ordering" This reverts commit `897d8ee5cd`, due to causing an infinite loop when encountering a loop with a sub-region with an inner loop.	2020-05-14 17:56:39 +03:00
Anna Thomas	f20c62741e	Revert "[RS4GC] Fix algorithm to avoid setting vector BDV for scalar derived pointer" This reverts commit `bb308b0205`. Failing a testcase.	2020-05-14 10:16:25 -04:00
Anna Thomas	bb308b0205	[RS4GC] Fix algorithm to avoid setting vector BDV for scalar derived pointer Summary: This is a more general fix to `59029b9eef` (D75704). This patch does the following: 1. updates isKnownBaseValue to account for base pointer and derived pointer having differing types. 2. This in turn allows us to populate the lattice (States) for such derived pointers. 3. It also updates all states where the base and derived pointers have differing types (vector versus scalar) and conservatively marks these states as conflicts. Note that in `59029b9eef`, we were just fixing existing lattice values and that too, only for uses of extractelement. Reviewers: reames, skatkov, dantrushin Reviewed By: skatkov Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76305	2020-05-14 10:03:30 -04:00
Sanjay Patel	26e742fd84	[x86][CGP] improve sinking of splatted vector shift amount operand Expands on the enablement of the shouldSinkOperands() TLI hook in: D79718 The last codegen/IR test diff shows what I suspected could happen - we were sinking all splat shift operands into a loop. But that's not what we want in general; we only want to sink the shift amount operand if it is a splat. Differential Revision: https://reviews.llvm.org/D79827	2020-05-14 08:36:03 -04:00
Omar Ahmed	425333c23b	[Attributor] Improve the alignment of the loads This patch introduces an improvement in the Alignment of the loads generated in createReplacementValues() by querying AAAlign attribute for the best Alignment for the base. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D76550	2020-05-13 18:24:05 -05:00
Johannes Doerfert	6045a804b9	[Attributor] Check lines accidentally not committed with D76208	2020-05-13 18:24:05 -05:00
Kuter Dinel	e57807769b	[Attributor] Use AAValueConstantRange to infer dereferencability. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D76208	2020-05-13 16:44:15 -05:00
Mircea Trofin	d6695e1876	[llvm] Add interface to drive inlining decision using ML model Summary: This change introduces InliningAdvisor (and related APIs), the interface that abstracts decision making away from the inlining pass. We will use this interface to delegate decision making to a trained ML model, subsequently (see referenced RFC). RFC: http://lists.llvm.org/pipermail/llvm-dev/2020-April/140763.html Reviewers: davidxl, eraman, dblaikie Subscribers: mgorny, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79042	2020-05-13 13:27:29 -07:00
Alina Sbirlea	db04ff4b6b	[SimpleLoopUnswitch] Add non-empty unreachable block check to exit cases removed. Summary: Update check to include the check for unreachable. Basic blocks ending in unreachable are special cased, as these blocks may be already unswitched. Before this patch this check is only done for the default destination. The condition for the exit cases and the default case must be the same, because we should never leave edges from the switch instruction to a basic block that we are unswitching. In PR45355 we still have a remaining edge (that we're attempting to remove from the DT) because it's the default edge to an unreachable-terminated block where we unswitch a case edge to that block. Resolves PR45355. Reviewers: chandlerc Subscribers: hiraditya, uabelho, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78279	2020-05-13 12:38:37 -07:00
Simon Pilgrim	33d96bf7b9	[InstCombine] Add vector tests for the or(shl(zext(x),32)|zext(y)) concat combines	2020-05-13 18:48:02 +01:00
Huber, Joseph	4d4ea9ac59	OpenMPOpt Remarks Support Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D79359	2020-05-13 12:20:40 -05:00
Pierre-vh	2668775f66	[LSR][ARM] Add new TTI hook to mark some LSR chains as profitable This patch adds a new TTI hook to allow targets to tell LSR that a chain including some instruction is already profitable and should not be optimized. This patch also adds an implementation of this TTI hook for ARM so LSR doesn't optimize chains that include the VCTP intrinsic. Differential Revision: https://reviews.llvm.org/D79418	2020-05-13 14:18:28 +01:00
Sjoerd Meijer	9529597cf4	Recommit #2 : "[LV] Induction Variable does not remain scalar under tail-folding." This was reverted because of a miscompilation. At closer inspection, the problem was actually visible in a changed llvm regression test too. This one-line follow up fix/recommit will splat the IV, which is what we are trying to avoid if unnecessary in general, if tail-folding is requested even if all users are scalar instructions after vectorisation. Because with tail-folding, the splat IV will be used by the predicate of the masked loads/stores instructions. The previous version omitted this, which caused the miscompilation. The original commit message was: If tail-folding of the scalar remainder loop is applied, the primary induction variable is splat to a vector and used by the masked load/store vector instructions, thus the IV does not remain scalar. Because we now mark that the IV does not remain scalar for these cases, we don't emit the vector IV if it is not used. Thus, the vectoriser produces less dead code. Thanks to Ayal Zaks for the direction how to fix this.	2020-05-13 13:50:09 +01:00
Ehud Katz	897d8ee5cd	[StructurizeCFG] Fix region nodes ordering This is a reimplementation of the `orderNodes` function, as the old implementation didn't take into account all cases. Fix PR41509 Differential Revision: https://reviews.llvm.org/D79037	2020-05-13 15:33:36 +03:00
Sam Parker	6bbad7285c	[CostModel] Modify BasicTTI getCastInstrCost Fix the assumption that all bitcasts of the same type sizes are free. We now only assume that bitcasts between ints and ptrs of the same size are free. This allows TTImpl to just call the concrete implementation of getCastInstrCost. Differential Revision: https://reviews.llvm.org/D78918	2020-05-13 07:26:08 +01:00
KAWASHIMA Takahiro	272bc25bc1	[LoopReroll] Fix rerolling loop with use outside the loop Fixes PR41696 The loop-reroll pass generates an invalid IR (or its assertion fails in debug build) if values of the base instruction and other root instructions (terms used in the loop-reroll pass) are used outside the loop block. See IRs written in PR41696 as examples. The current implementation of the loop-reroll pass can reroll only loops that don't have values that are used outside the loop, except reduced values (the last values of reduction chains). This is described in the comment of the `LoopReroll::reroll` function. https://github.com/llvm/llvm-project/blob/llvmorg-10.0.0/llvm/lib/Transforms/Scalar/LoopRerollPass.cpp#L1600 This is checked in the `LoopReroll::DAGRootTracker::validate` function. https://github.com/llvm/llvm-project/blob/llvmorg-10.0.0/llvm/lib/Transforms/Scalar/LoopRerollPass.cpp#L1393 However, the base instruction and other root instructions skip this check in the validation loop. https://github.com/llvm/llvm-project/blob/llvmorg-10.0.0/llvm/lib/Transforms/Scalar/LoopRerollPass.cpp#L1229 Moving the check in front of the skip is the logically simplest fix. However, inserting the check in an earlier stage is better in terms of compilation time of unrerollable loops. This fix inserts the check for the base instruction into the function to validate possible base/root instructions. Check for other root instructions is unnecessary because they don't match any base instructions if they have uses outside the loop. Differential Revision: https://reviews.llvm.org/D79549	2020-05-13 13:03:03 +09:00
Johannes Doerfert	af48351cc8	[Attributor][FIX] Stabilize the state of AAReturnedValues each update For AAReturnedValues we treated new and existing information differently in the updateImpl. Only the latter was properly analyzed and categorized. The former was thought to be analyzed in the subsequent update. Since the Attributor does not support "self-updates" we need to make sure the state is "stable" after each updateImpl invocation. That is, if the surrounding information does not change, the state is valid. Now we make sure all return values have been handled and properly categorized each iteration. We might not update again if we have not requested a non-fix attribute so we cannot "wait" for the next update to analyze a new return value. Bug reported by @sdmitriev.	2020-05-12 21:00:30 -05:00
Juneyoung Lee	d3eb51f062	[ValueTracking] Fix crash in isGuaranteedNotToBeUndefOrPoison when V is in an unreachable block Summary: This fixes PR45885 by fixing isGuaranteedNotToBeUndefOrPoison so it does not look into dominating branch conditions of V when V is an instruction in an unreachable block. Reviewers: spatel, nikic, lebedev.ri Reviewed By: nikic Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79790	2020-05-13 10:16:47 +09:00
Zequan Wu	cb22ab7403	Add nomerge function attribute to suppress tail merge optimization in simplifyCFG We want to add a way to avoid merging identical calls so as to keep the separate debug-information for those calls. There is also an asan usecase where having this attribute would be beneficial to avoid alternative work-arounds. Here is the link to the feature request: https://bugs.llvm.org/show_bug.cgi?id=42783. `nomerge` is different from `noinline`. `noinline` prevents a function from being inlined at call sites, but `nomerge` prevents multiple identical calls from being merged into one. This patch adds `nomerge` to disable the optimization at the IR level. A followup patch will be needed to let the backend understand `nomerge` and avoid tail merge in the backend. Reviewed By: asbirlea, rnk Differential Revision: https://reviews.llvm.org/D78659	2020-05-12 16:49:20 -07:00
Sanjay Patel	f490ca76b0	[x86][CGP] enable target hook to sink funnel shift intrinsic's splatted shift amount SDAG suffers when it can't see that a funnel operand is a splat value (due to single-basic-block visibility), so invert the normal loop hoisting rules to move a splat op closer to its use. This would be part 1 of an enhancement similar to D63233. This is needed to re-fix PR37426: https://bugs.llvm.org/show_bug.cgi?id=37426 ...because we got better at canonicalizing IR to funnel shift intrinsics. The existing CGP code for shift opcodes is likely overstepping what it was intended to do, so that will be fixed in a follow-up. Differential Revision: https://reviews.llvm.org/D79718	2020-05-12 18:40:40 -04:00
Fangrui Song	66055230bf	[TargetLoweringObjectFileImpl] Produce .text.hot. instead of .text.hot for -fno-unique-section-names GNU ld's internal linker script uses (https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=add44f8d5c5c05e08b11e033127a744d61c26aee) .text : { *(.text.unlikely .text.*_unlikely .text.unlikely.*) *(.text.exit .text.exit.*) *(.text.startup .text.startup.*) *(.text.hot .text.hot.*) *(SORT(.text.sorted.*)) *(.text .stub .text.* .gnu.linkonce.t.*) /* .gnu.warning sections are handled specially by elf.em. */ *(.gnu.warning) } Because `*(.text.exit .text.exit.*)` is ordered before `*(.text .text.*)`, in a -ffunction-sections build, the C library function `exit` will be placed before other functions. gold's `-z keep-text-section-prefix` has the same problem. In lld, `-z keep-text-section-prefix` recognizes `.text.{exit,hot,startup,unlikely,unknown}.*`, but not `.text.{exit,hot,startup,unlikely,unknown}`, to avoid the strange placement problem. In -fno-function-sections or -fno-unique-section-names mode, a function whose `function_section_prefix` is set to `.exit` will go to the output section `.text` instead of `.text.exit` when linked by lld. To address the problem, append a dot to become `.text.exit.` Reviewed By: grimar Differential Revision: https://reviews.llvm.org/D79600	2020-05-12 14:14:17 -07:00
Sergey Dmitriev	32f5ee830b	[Attributor] Fixup block addresses after rewriting function signature Reviewers: jdoerfert, sstefan1, uenoku Reviewed By: jdoerfert Subscribers: hiraditya, uenoku, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79801	2020-05-12 13:53:04 -07:00
Sanjay Patel	93bd696347	[VectorCombine] add test to check for iterative improvements; NFC	2020-05-12 12:49:25 -04:00
Fangrui Song	b56b1e67e3	[gcov] Default coverage version to '408' and delete CC1 option -coverage-exit-block-before-body gcov 4.8 (r189778) moved the exit block from the last to the second. The .gcda format is compatible with 4.7 but * decoding libgcov 4.7 produced .gcda with gcov [4.7,8) can mistake the exit block, emit bogus `%s:'%s' has arcs from exit block\n` warnings, and print wrong `" returned %s` for branch statistics (-b). * decoding libgcov 4.8 produced .gcda with gcov 4.7 has similar issues. Also, rename "return block" to "exit block" because the latter is the appropriate term.	2020-05-12 09:14:03 -07:00
Sam Parker	b4a8091a11	[ARM][CostModel] Improve getCastInstrCost - Specifically check for sext/zext users which have 'long' form NEON instructions. - Add more entries to the table for sext/zexts so that we can report more accurately the number of vmovls required for NEON. - Pass the instruction to the pass implementation. Differential Revision: https://reviews.llvm.org/D79561	2020-05-12 10:32:20 +01:00
Sam Parker	1952c86d61	[AArch64][CostModel] getCastInstrCost Pass the instruction to the base implementation. Differential Revision: https://reviews.llvm.org/D79562	2020-05-12 10:02:29 +01:00
Johannes Doerfert	8d94d3c3b4	[Attributor][FIX] Disallow function signature rewrite for casted calls We will now ensure the return type of the called function is the type of all call sites we are going to rewrite. This avoids a problem partially fixed by D79680. The part that was not covered is a use of this "weird" casted call site (see `@func3` in `misc_crash.ll`). misc_crash.ll checks are auto-generated now.	2020-05-11 15:32:47 -05:00
Johannes Doerfert	c115a78f0d	[Attributor] Make AAIsDead dependences optional to prevent top state We should never give up on AAIsDead as it guards other AAs from unreachable code (in which SSA properties are meaningless). We did however use required dependences on some queries in AAIsDead which caused us to invalidate AAIsDead if the queried AA got invalidated. We now use optional dependences instead. The bug that exposed this is added to the liveness.ll test and other test changes show the impact. Bug report by @sdmitriev.	2020-05-11 15:32:47 -05:00
Johannes Doerfert	c86fd3333d	[Attributor] Force update of "newly live" abstract attributes During an update of AAIsDead, new instructions become live. If we query information from them, the result is often just the initial state, e.g., for call site `noreturn` and `nounwind`. We will now trigger an update for cached attributes during the AAIsDead update, though other AAs might later use the same API.	2020-05-11 15:32:47 -05:00
Sanjay Patel	5f730b645d	[VectorCombine] account for extra uses in scalarization cost Follow-up to D79452. Mimics the extra use cost formula for the inverse transform with extracts.	2020-05-11 15:20:57 -04:00
Sanjay Patel	7c480c4385	[VectorCombine] add tests for possible scalarization with extra uses; NFC	2020-05-11 15:04:31 -04:00
Sanjay Patel	0cea15cc4a	[CGP][x86] add test for funnel-shift with cross-block splat shift-amount; NFC	2020-05-11 13:22:20 -04:00
Hongtao Yu	47c1f2741f	Properly add out-of-module functions to the import list This patch addresses two issues related to adding inline functions to the import list while recursively going through the profiling data. 1. For callsite samples, only add an inlined function to the import list if it's from outside of the module (i.e. only has a declaration inside the module). 2. For body samples, add each target function to the import list if it's from outside of the module (i.e. only has a declaration inside the module). Previously we were using getSubProgram() to check whether it has dbg info, which is inaccurate. This fix properly adds imports and could improve the quality of the pass. Added a few changes to the test to catch these cases. Differential Revision: https://reviews.llvm.org/D79379	2020-05-11 10:00:14 -07:00
Sergey Dmitriev	3df40007e6	[Attributor] Fix for a crash on RAUW when rewriting function signature Reviewers: jdoerfert, sstefan1, uenoku Reviewed By: uenoku Subscribers: hiraditya, uenoku, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79680	2020-05-11 08:06:19 -07:00
Tyker	78d85c2091	[AssumeBundles] fix crashes Summary: This patch fixes crashes/asserts found in the test-suite. The AssumptionCache cannot be assumed to have all assumes, contrary to what I thought. Prevent generation of information for terminators, because this can create broken IR in transformations where we insert the new terminator before removing the old one. Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79458	2020-05-11 11:52:21 +02:00
OCHyams	da100de0a6	[NFC][DwarfDebug] Add test for variables with a single location which don't span their entire scope. The previous commit (`6d1c40c171`) is an older version of the test. Reviewed By: aprantl, vsk Differential Revision: https://reviews.llvm.org/D79573	2020-05-11 11:49:11 +02:00
Johannes Doerfert	3a8740bdd5	[Attributor] Merge the query set into AbstractAttribute The old QuerriedAAs contained two vectors, one for required one for optional dependences (=queries). We now use a single vector and encode the kind directly in the pointer. This reduces memory consumption and makes the connection between abstract attributes and their dependences clearer. No functional change is intended, changes in the test are due to different order in the query map. Neither the order before nor now is in any way special. --- Single run of the Attributor module and then CGSCC pass (oldPM) for SPASS/clause.c (~10k LLVM-IR loc): Before: ``` calls to allocation functions: 543734 (329735/s) temporary memory allocations: 105895 (64217/s) peak heap memory consumption: 19.19MB peak RSS (including heaptrack overhead): 102.26MB total memory leaked: 269.10KB ``` After: ``` calls to allocation functions: 513292 (341511/s) temporary memory allocations: 106028 (70544/s) peak heap memory consumption: 13.35MB peak RSS (including heaptrack overhead): 95.64MB total memory leaked: 269.10KB ``` Difference: ``` calls to allocation functions: -30442 (208506/s) temporary memory allocations: 133 (-910/s) peak heap memory consumption: -5.84MB peak RSS (including heaptrack overhead): 0B total memory leaked: 0B ``` --- Reviewed By: uenoku Differential Revision: https://reviews.llvm.org/D78729	2020-05-10 22:27:00 -05:00
Johannes Doerfert	5e06b2514a	[Attributor][FIX] Carefully handle/ignore/forget `argmemonly` When we have an existing `argmemonly` or `inaccessiblemem_or_argmemonly` we used to "know" that information. However, interprocedural constant propagation can invalidate these attributes. We now ignore and remove these attributes for internal functions (which may be affected by IP constant propagation), if we are deriving new attributes for the function.	2020-05-10 19:06:11 -05:00
Johannes Doerfert	713ee3aa77	[Attributor] Use "simplify to constant" in genericValueTraversal As we replace values with constants interprocedurally, we also need to do this "look-through" step during the generic value traversal or we would derive properties from replaced values. While this is often not problematic, it is when we use the "kind" of a value for reasoning, e.g., accesses to arguments allow `argmemonly`.	2020-05-10 19:06:11 -05:00
Johannes Doerfert	31c03b9223	[Attributor] Use existing helpers to determine IR facts We now use getPointerDereferenceableBytes to determine `nonnull` and `dereferenceable` facts from the IR. We also use getPointerAlignment in AAAlign for the same reason. The latter can interfere with callbacks so we do restrict it to non-function-pointers for now.	2020-05-10 19:06:10 -05:00
Fangrui Song	25544ce2df	[gcov] Default coverage version to '407' and delete CC1 option -coverage-cfg-checksum Defaulting to -Xclang -coverage-version='407' makes .gcno/.gcda compatible with gcov [4.7,8) In addition, delete clang::CodeGenOptionsBase::CoverageExtraChecksum and GCOVOptions::UseCfgChecksum. We can infer the information from the version. With this change, .gcda files produced by `clang --coverage a.o` linked executable can be read by gcov 4.7~7. We don't need other -Xclang -coverage* options. There may be a mismatching version warning, though. (Note, GCC r173147 "split checksum into cfg checksum and line checksum" made gcov 4.7 incompatible with previous versions.)	2020-05-10 16:14:07 -07:00
Fangrui Song	13a633b438	[gcov] Delete CC1 option -coverage-no-function-names-in-data rL144865 incorrectly wrote function names for GCOV_TAG_FUNCTION (this might be part of the reasons the header says "We emit files in a corrupt version of GCOV's "gcda" file format"). rL176173 and rL177475 realized the problem and introduced -coverage-no-function-names-in-data to work around the issue. (However, the description is wrong. libgcov never writes function names, even before GCC 4.2). In reality, the linker command line has to look like: clang --coverage -Xclang -coverage-version='407*' -Xclang -coverage-cfg-checksum -Xclang -coverage-no-function-names-in-data Failing to pass -coverage-no-function-names-in-data can make gcov 4.7~7 either produce wrong results (for one gcov-4.9 program, I see "No executable lines") or segfault (gcov-7). (gcov-8 uses an incompatible format.) This patch deletes -coverage-no-function-names-in-data and the related function names support from libclang_rt.profile	2020-05-10 12:37:44 -07:00
Tyker	5957e058e4	[AssumeBundles] Remove non-determinism from assume builder Summary: The assume builder was non-deterministic when working on unnamed values. This patch fixes this. Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: hiraditya, mgrang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78616	2020-05-10 21:18:33 +02:00
Tyker	821a0f23d8	[AssumeBundles] Prevent generation of some redundant assumes Summary: With this patch the assume salvageKnowledge will not generate an assume if all knowledge is already available in an assume with valid context. The assume builder can also in some cases update an existing assume with better information. Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78014	2020-05-10 19:23:59 +02:00
Sanjay Patel	856cc60bc1	[InstCombine] canonicalize bitcast after insertelement into undef We have a transform in the opposite direction only for the x86 MMX type. Other types are not handled either way before this patch. The motivating case from PR45748: https://bugs.llvm.org/show_bug.cgi?id=45748 ...is the last test diff. In that example, we are triggering an existing bitcast transform, so we reduce the number of casts, and that should give us the ideal x86 codegen. Differential Revision: https://reviews.llvm.org/D79171	2020-05-10 11:37:47 -04:00
Simon Pilgrim	bab44a698e	[InstCombine] matchOrConcat - match BITREVERSE Fold or(zext(bitreverse(x)),shl(zext(bitreverse(y)),bw/2)) -> bitreverse(or(zext(x),shl(zext(y),bw/2))) Practically this is the same as the BSWAP pattern so we might as well handle it.	2020-05-10 16:00:29 +01:00
Florian Hahn	96c63f544f	Recommit "[LAA] Remove one addRuntimeChecks function (NFC)." The failing assertion has been fixed and the problematic test case has been added. This reverts the revert commit `fc44617f28`.	2020-05-10 15:19:57 +01:00
Sanjay Patel	a62533c29f	[InstCombine] fold fpext into exact integer-to-FP cast We can combine a floating-point extension cast with a conversion from integer if we know the earlier cast is exact. This is an optimization suggested in PR36617: https://bugs.llvm.org/show_bug.cgi?id=36617#c19 However, this patch does not change the example suggested there. This patch only uses the existing analysis to handle cases where the integer source value magnitude is narrower than the intermediate FP mantissa (guarantees that the conversion to FP is exact). Follow-up patches to the analysis function can enable more cases. Differential Revision: https://reviews.llvm.org/D79116	2020-05-10 07:04:54 -04:00
Simon Pilgrim	9237d88001	[X86] isVectorShiftByScalarCheap - don't limit fast XOP vector shifts to 128-bit vectors XOP targets have fast per-element vector shifts and we're better off splitting to 128-bit shifts where necessary (which is what we already do in LowerShift).	2020-05-09 22:24:08 +01:00
Matt Arsenault	16295d521e	InstCombine: Broaden copy-constant-to-alloca optimization Consider any constant memory type, not just global constants. AMDGPU kernel parameters are effectively global constants, but appear as either reads from an intrinsic derived pointer or function argument.	2020-05-09 16:00:27 -04:00
Simon Pilgrim	f8b09f7b52	[CodeGenPrepare][X86] Add v16i16, v32i8 and XOP vector shift by scalar amount tests Helps improve test coverage of the XOP modes in X86TargetLowering::isVectorShiftByScalarCheap (and where we always return false for vXi8 vector shifts).	2020-05-09 20:47:42 +01:00
Sanjay Patel	0d2a0b44c8	[VectorCombine] scalarize binop of inserted elements into vector constants As with the extractelement patterns that are currently in vector-combine, there are going to be several possible variations on this theme. This should be the clearest, simplest example. Scalarization is the right direction for target-independent canonicalization, and InstCombine has some of those folds already, but it doesn't do this. I proposed a similar transform in D50992. Here in vector-combine, we can check the cost model to be sure it's profitable, so there should be less risk. Differential Revision: https://reviews.llvm.org/D79452	2020-05-08 16:31:12 -04:00
zoecarver	f65f566aeb	Re-commit: Mark values as trivially dead when their only use is a start or end lifetime intrinsic. Summary: If the only use of a value is a start or end lifetime intrinsic then mark the intrinsic as trivially dead. This should allow for that value to then be removed as well. Currently, this only works for allocas, globals, and arguments. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79355	2020-05-08 12:24:10 -07:00
Sanjay Patel	1aa8cef97a	[InstCombine] add/adjust tests for fpext of casted value; NFC	2020-05-08 15:22:36 -04:00
Wei Mi	aa2ddfc73d	[SampleFDO] For functions without profiles, provide an option to put them in a special text section. For sampleFDO, because the optimized build uses profile generated from previous release, previously we couldn't tell whether a function without profile was truly cold or just newly created, so we had to treat them conservatively and put them in .text section instead of .text.unlikely. The result was that when we were pursuing the best performance by locking .text.hot and .text in memory, we wasted a lot of memory to keep cold functions inside. In https://reviews.llvm.org/D66374, we introduced profile symbol list to discriminate functions being cold versus functions being newly added. This mechanism works quite well for regular use cases in AutoFDO. However, in some cases, we can only have a partial profile when optimizing a target. The partial profile may be an aggregated profile collected from many targets. The profile symbol list method used for regular sampleFDO profile is not applicable to partial profile use case because it may be too large and introduce many false positives. To solve the problem for partial profile use case, we provide an option called --profile-unknown-in-special-section. For functions without profile, we will still treat them conservatively in compiler optimizations -- for example, treat them as warm instead of cold in inliner. When we use profile info to add section prefix for functions, we will discriminate functions known to be not cold versus functions without profile (being unknown), and we will put functions being unknown in a special text section called .text.unknown. Runtime system will have the flexibility to decide where to put the special section in order to achieve a balance between performance and memory saving. Differential Revision: https://reviews.llvm.org/D62540	2020-05-08 11:18:09 -07:00
Sanjay Patel	df5c9fdaac	[InstCombine] add tests for known bits before FP casts; NFC	2020-05-08 13:44:32 -04:00
Matt Arsenault	78a43f10c7	AMDGPU: Don't assert on unknown address spaces Assume unknown address spaces behave like some flavor of global memory.	2020-05-08 12:57:27 -04:00
Benjamin Kramer	f936457f80	Revert "Recommit "[LV] Induction Variable does not remain scalar under tail-folding."" This reverts commit `ae45b4dbe7`. It causes miscompilations, test case on the mailing list.	2020-05-08 14:49:10 +02:00
Nikita Popov	5a2265647e	Reapply [InstSimplify] Remove known bits constant folding No changes relative to last time, but after a mitigation for an AMDGPU regression landed. --- If SimplifyInstruction() does not succeed in simplifying the instruction, it will compute the known bits of the instruction in the hope that all bits are known and the instruction can be folded to a constant. I have removed a similar optimization from InstCombine in D75801, and would like to drop this one as well. On average, we spend ~1% of total compile-time performing this known bits calculation. However, if we introduce some additional statistics for known bits computations and how many of them succeed in simplifying the instruction we get (on test-suite): instsimplify.NumKnownBits: 216 instsimplify.NumKnownBitsComputed: 13828375 valuetracking.NumKnownBitsComputed: 45860806 Out of ~14M known bits calculations (accounting for approximately one third of all known bits calculations), only 0.0015% succeed in producing a constant. Those cases where we do succeed to compute all known bits will get folded by other passes like InstCombine later. On test-suite, only lencod.test and GCC-C-execute-pr44858.test show a hash difference after this change. On lencod we see an improvement (a loop phi is optimized away), on the GCC torture test a regression (a function return value is determined only after IPSCCP, preventing propagation from a noinline function.) There are various regressions in InstSimplify tests. However, all of these cases are already handled by InstCombine, and corresponding tests have already been added there. Differential Revision: https://reviews.llvm.org/D79294	2020-05-08 10:24:53 +02:00
Diego Caballero	f5224d437e	[LoopFusion] Remove unreachable blocks from DT and LI after fusion This patch removes FC0.ExitBlock and FC1GuardBlock from DT and LI after fusion of guarded loops. They become unreachable and LI verification failed when they happened to be inside another loop. Reviewed By: kbarton Differential Revision: https://reviews.llvm.org/D78679	2020-05-07 16:44:40 -07:00
Johannes Doerfert	edf0391491	[Attributor][FIX] Record dependences for assumed dead abstract attributes In a recent patch we introduced a problem with abstract attributes that were assumed dead at some point. Since `Attributor::updateAA` was introduced in `95e0d28b71`, we did not remember the dependence on the liveness AA when an abstract attribute was assumed dead and therefore not updated. Explicit reproducer added in liveness.ll. --- Single run of the Attributor module and then CGSCC pass (oldPM) for SPASS/clause.c (~10k LLVM-IR loc): Before: ``` calls to allocation functions: 509242 (345483/s) temporary memory allocations: 98666 (66937/s) peak heap memory consumption: 18.60MB peak RSS (including heaptrack overhead): 103.29MB total memory leaked: 269.10KB ``` After: ``` calls to allocation functions: 529332 (355494/s) temporary memory allocations: 102107 (68574/s) peak heap memory consumption: 19.40MB peak RSS (including heaptrack overhead): 102.79MB total memory leaked: 269.10KB ``` Difference: ``` calls to allocation functions: 20090 (1339333/s) temporary memory allocations: 3441 (229400/s) peak heap memory consumption: 801.45KB peak RSS (including heaptrack overhead): 0B total memory leaked: 0B ```	2020-05-07 17:00:50 -05:00
Alina Sbirlea	6227f021ad	[SimpleLoopUnswitch] Update DefaultExit condition to check unreachable is not empty. Summary: Update the check for the default exit block to not only check that the terminator is not unreachable, but also check that unreachable block has only the unreachable instruction. Reviewers: chandlerc Subscribers: hiraditya, uabelho, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78277	2020-05-07 13:48:30 -07:00
Sanjay Patel	5b48f7d2fc	[VectorCombine] adjust test to make intent clearer; NFC Create a non-zero result to show that the other lane is computed correctly.	2020-05-07 16:21:17 -04:00
Huihui Zhang	e8ea1eb4c1	[NFC] Adjust test check lines for D78267. This wasn't identified through buildbot before.	2020-05-07 13:20:15 -07:00
Huihui Zhang	1ec0cc0f02	[InstCombine][SVE] Fix visitExtractElementInst for scalable type. Summary: This patch fixes the following issues with visitExtractElementInst: 1. Restrict VectorUtils::findScalarElement to fixed-length vector. For scalable type, the number of elements in shuffle mask is unknown at compile-time. 2. Fix out-of-range calculation for fixed-length vector. 3. Skip scalable type when analysis relies on fixed number of elements. 4. Add unit tests to check functionality of extractelement for scalable type. Reviewers: sdesmalen, efriedma, spatel, nikic Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78267	2020-05-07 13:03:52 -07:00
Huihui Zhang	08c9c13749	[InstCombine][SVE] Fix visitInsertElementInst for scalable type. Summary: This patch fixes the following issues in visitInsertElementInst: 1. Bail out for scalable type when analysis requires fixed size number of vector elements. 2. Use cast<FixedVectorType> to get vector number of elements. This ensures an assertion on scalable vector type. 3. For scalable type, avoid folding a chain of insertelement into splat: insertelt(insertelt(insertelt(insertelt X, %k, 0), %k, 1), %k, 2) ... -> shufflevector(insertelt(X, %k, 0), undef, zero) The length of scalable vector is unknown at compile-time, therefore we don't know if given insertelement sequence is valid for splat. Reviewers: sdesmalen, efriedma, spatel, nikic Reviewed By: sdesmalen, efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78895	2020-05-07 12:44:52 -07:00
Sanjay Patel	5d0f2fdfa5	[VectorCombine] add tests with undefs; NFC Goes with D79452.	2020-05-07 15:28:26 -04:00
Sanjay Patel	02051c7f3a	[SLP] add another bailout for load-combine patterns (2nd try) The original patch (rG86dfbc676ebe) exposed an existing bug: we could wrongly cast a constant expression to BinaryOperator because the pattern matching allows that. This adds a check for that case, and there's a reduced test case to verify no crashing. Original commit message: This builds on the or-reduction bailout that was added with D67841. We still do not have IR-level load combining, although that could be a target-specific enhancement for -vector-combiner. The heuristic is narrowly defined to catch the motivating case from PR39538: https://bugs.llvm.org/show_bug.cgi?id=39538 ...while preserving existing functionality. That is, there's an unmodified test of pure load/zext/store that is not seen in this patch at llvm/test/Transforms/SLPVectorizer/X86/cast.ll. That's the reason for the logic difference to require the 'or' instructions. The chances that vectorization would actually help a memory-bound sequence like that seem small, but it looks nicer with: vpmovzxwd (%rsi), %xmm0 vmovdqu %xmm0, (%rdi) rather than: movzwl (%rsi), %eax movl %eax, (%rdi) ... 
In the motivating test, we avoid creating a vector mess that is unrecoverable in the backend, and SDAG forms the expected bswap instructions after load combining: movzbl (%rdi), %eax vmovd %eax, %xmm0 movzbl 1(%rdi), %eax vmovd %eax, %xmm1 movzbl 2(%rdi), %eax vpinsrb $4, 4(%rdi), %xmm0, %xmm0 vpinsrb $8, 8(%rdi), %xmm0, %xmm0 vpinsrb $12, 12(%rdi), %xmm0, %xmm0 vmovd %eax, %xmm2 movzbl 3(%rdi), %eax vpinsrb $1, 5(%rdi), %xmm1, %xmm1 vpinsrb $2, 9(%rdi), %xmm1, %xmm1 vpinsrb $3, 13(%rdi), %xmm1, %xmm1 vpslld $24, %xmm0, %xmm0 vpmovzxbd %xmm1, %xmm1 # xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero,xmm1[2],zero,zero,zero,xmm1[3],zero,zero,zero vpslld $16, %xmm1, %xmm1 vpor %xmm0, %xmm1, %xmm0 vpinsrb $1, 6(%rdi), %xmm2, %xmm1 vmovd %eax, %xmm2 vpinsrb $2, 10(%rdi), %xmm1, %xmm1 vpinsrb $3, 14(%rdi), %xmm1, %xmm1 vpinsrb $1, 7(%rdi), %xmm2, %xmm2 vpinsrb $2, 11(%rdi), %xmm2, %xmm2 vpmovzxbd %xmm1, %xmm1 # xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero,xmm1[2],zero,zero,zero,xmm1[3],zero,zero,zero vpinsrb $3, 15(%rdi), %xmm2, %xmm2 vpslld $8, %xmm1, %xmm1 vpmovzxbd %xmm2, %xmm2 # xmm2 = xmm2[0],zero,zero,zero,xmm2[1],zero,zero,zero,xmm2[2],zero,zero,zero,xmm2[3],zero,zero,zero vpor %xmm2, %xmm1, %xmm1 vpor %xmm1, %xmm0, %xmm0 vmovdqu %xmm0, (%rsi) movl (%rdi), %eax movl 4(%rdi), %ecx movl 8(%rdi), %edx movbel %eax, (%rsi) movbel %ecx, 4(%rsi) movl 12(%rdi), %ecx movbel %edx, 8(%rsi) movbel %ecx, 12(%rsi) Differential Revision: https://reviews.llvm.org/D78997	2020-05-07 15:04:37 -04:00
Sanjay Patel	62ea77ec02	[SLP] add test for constant expression fake of load-combine pattern; NFC This is a reduction of the test that caused D78997 to be reverted.	2020-05-07 15:04:37 -04:00
Hans Wennborg	c54c6ee1a7	Revert "[SLP] add another bailout for load-combine patterns" It caused asserts building Chromium, see discussion on https://reviews.llvm.org/D78997 This reverts commit `86dfbc676e`.	2020-05-07 16:31:52 +02:00
Sanjay Patel	666c61db79	[VectorCombine] add tests for insert into arbitrary constant; NFC Goes with D79452.	2020-05-07 10:27:25 -04:00
Sjoerd Meijer	ae45b4dbe7	Recommit "[LV] Induction Variable does not remain scalar under tail-folding." With 3 llvm regr tests fixed/updated that I had missed.	2020-05-07 11:52:20 +01:00
Sjoerd Meijer	20d67ffeae	Revert "[LV] Induction Variable does not remain scalar under tail-folding." This reverts commit `617aa64c84` while I investigate buildbot failures.	2020-05-07 09:29:56 +01:00
Sjoerd Meijer	617aa64c84	[LV] Induction Variable does not remain scalar under tail-folding. If tail-folding of the scalar remainder loop is applied, the primary induction variable is splat to a vector and used by the masked load/store vector instructions, thus the IV does not remain scalar. Because we now mark that the IV does not remain scalar for these cases, we don't emit the vector IV if it is not used. Thus, the vectoriser produces less dead code. Thanks to Ayal Zaks for the direction how to fix this. Differential Revision: https://reviews.llvm.org/D78911	2020-05-07 09:15:23 +01:00
Whitney Tsang	0a52401ad6	[LoopUnrollAndJam] Changed safety checks to consider more than 2-levels loop nest. Summary: As discussed in https://reviews.llvm.org/D73129. Example Before unroll and jam: for A for B for C D E After unroll and jam (currently): for A A' for B for C D B' for C' D' E E' After unroll and jam (Ideal): for A A' for B B' for C C' D D' E E' This is the first patch to change unroll and jam to work in the ideal way. This patch changes the safety checks needed to make sure it is safe to unroll and jam in the ideal way. Reviewer: dmgreen, jdoerfert, Meinersbur, kbarton, bmahjour, etiotto Reviewed By: Meinersbur Subscribers: fhahn, hiraditya, zzheng, llvm-commits, anhtuyen, prithayan Tag: LLVM Differential Revision: https://reviews.llvm.org/D76132	2020-05-06 21:47:44 +00:00
zoecarver	1998e796e9	Revert "Mark values as trivially dead when their only use is a start or end lifetime intrinsic." This reverts commit `95aa28cc8f`.	2020-05-06 11:07:22 -07:00
zoecarver	95aa28cc8f	Mark values as trivially dead when their only use is a start or end lifetime intrinsic. Summary: If the only use of a value is a start or end lifetime intrinsic then mark the intrinsic as trivially dead. This should allow for that value to then be removed as well. Currently, this only works for allocas, globals, and arguments. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79355	2020-05-06 10:58:08 -07:00
Sanjay Patel	2058c98715	[InstCombine] limit bitcast+insertelement transform to x86 MMX type This is unusual for the general case because we are replacing 1 instruction with 2. Splitting from a potential conflicting transform in D79171	2020-05-06 13:12:36 -04:00
Sanjay Patel	e3eb297deb	[VectorCombine] add tests for possible scalarization; NFC	2020-05-06 09:58:27 -04:00
Johannes Doerfert	094137a6c6	[Attributor][NFC] Avoid dependences on known information	2020-05-05 23:14:23 -05:00
Christopher Tetreault	855e02e799	[SVE] Fix invalid usage of getNumElements() in InstCombineMulDivRem Summary: getLogBase2 tries to iterate over the number of vector elements. Since the number of elements of a scalable vector is unknown at compile time, we must return null if the input type is scalable. Identified by test LLVM.Transforms/InstCombine::nsw.ll Reviewers: efriedma, fpetrogalli, kmclaughlin, spatel Reviewed By: efriedma, fpetrogalli Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79197	2020-05-05 15:19:01 -07:00
Sanjay Patel	a954b8a363	[ValueTracking] fix CannotBeNegativeZero() to disregard 'nsz' FMF The 'nsz' flag is different than 'nnan' or 'ninf' in that it does not create poison. Make that explicit in the LangRef and fix ValueTracking analysis that misinterpreted the definition. This manifests as bugs in InstSimplify shown in the test diffs and as discussed in PR45778: https://bugs.llvm.org/show_bug.cgi?id=45778 Differential Revision: https://reviews.llvm.org/D79422	2020-05-05 16:04:59 -04:00
Sanjay Patel	86dfbc676e	[SLP] add another bailout for load-combine patterns This builds on the or-reduction bailout that was added with D67841. We still do not have IR-level load combining, although that could be a target-specific enhancement for -vector-combiner. The heuristic is narrowly defined to catch the motivating case from PR39538: https://bugs.llvm.org/show_bug.cgi?id=39538 ...while preserving existing functionality. That is, there's an unmodified test of pure load/zext/store that is not seen in this patch at llvm/test/Transforms/SLPVectorizer/X86/cast.ll. That's the reason for the logic difference to require the 'or' instructions. The chances that vectorization would actually help a memory-bound sequence like that seem small, but it looks nicer with: vpmovzxwd (%rsi), %xmm0 vmovdqu %xmm0, (%rdi) rather than: movzwl (%rsi), %eax movl %eax, (%rdi) ... In the motivating test, we avoid creating a vector mess that is unrecoverable in the backend, and SDAG forms the expected bswap instructions after load combining: movzbl (%rdi), %eax vmovd %eax, %xmm0 movzbl 1(%rdi), %eax vmovd %eax, %xmm1 movzbl 2(%rdi), %eax vpinsrb $4, 4(%rdi), %xmm0, %xmm0 vpinsrb $8, 8(%rdi), %xmm0, %xmm0 vpinsrb $12, 12(%rdi), %xmm0, %xmm0 vmovd %eax, %xmm2 movzbl 3(%rdi), %eax vpinsrb $1, 5(%rdi), %xmm1, %xmm1 vpinsrb $2, 9(%rdi), %xmm1, %xmm1 vpinsrb $3, 13(%rdi), %xmm1, %xmm1 vpslld $24, %xmm0, %xmm0 vpmovzxbd %xmm1, %xmm1 # xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero,xmm1[2],zero,zero,zero,xmm1[3],zero,zero,zero vpslld $16, %xmm1, %xmm1 vpor %xmm0, %xmm1, %xmm0 vpinsrb $1, 6(%rdi), %xmm2, %xmm1 vmovd %eax, %xmm2 vpinsrb $2, 10(%rdi), %xmm1, %xmm1 vpinsrb $3, 14(%rdi), %xmm1, %xmm1 vpinsrb $1, 7(%rdi), %xmm2, %xmm2 vpinsrb $2, 11(%rdi), %xmm2, %xmm2 vpmovzxbd %xmm1, %xmm1 # xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero,xmm1[2],zero,zero,zero,xmm1[3],zero,zero,zero vpinsrb $3, 15(%rdi), %xmm2, %xmm2 vpslld $8, %xmm1, %xmm1 vpmovzxbd %xmm2, %xmm2 # xmm2 = 
xmm2[0],zero,zero,zero,xmm2[1],zero,zero,zero,xmm2[2],zero,zero,zero,xmm2[3],zero,zero,zero vpor %xmm2, %xmm1, %xmm1 vpor %xmm1, %xmm0, %xmm0 vmovdqu %xmm0, (%rsi) movl (%rdi), %eax movl 4(%rdi), %ecx movl 8(%rdi), %edx movbel %eax, (%rsi) movbel %ecx, 4(%rsi) movl 12(%rdi), %ecx movbel %edx, 8(%rsi) movbel %ecx, 12(%rsi) Differential Revision: https://reviews.llvm.org/D78997	2020-05-05 12:44:38 -04:00
Jay Foad	22829ab5fa	[InstCombine] Allow denormal C in pow(C,y) -> exp2(log2(C)*y) We check that C is finite and strictly positive, but there's no need to check that it's normal too. exp2 should be just as accurate on denormals as pow is. Differential Revision: https://reviews.llvm.org/D79413	2020-05-05 16:25:48 +01:00
Jay Foad	47f5066553	Precommit new test cases for D79413 [InstCombine] Allow denormal C in pow(C,y) -> exp2(log2(C)*y)	2020-05-05 16:08:09 +01:00
Sam Parker	f35ccfa2af	[NFC] Update tests Run the update script on a couple of tests.	2020-05-05 15:28:40 +01:00
Jay Foad	fa2783d79a	[InstCombine] Remove hasOneUse check for pow(C,x) -> exp2(log2(C)*x) I don't think there's any good reason not to do this transformation when the pow has multiple uses. Differential Revision: https://reviews.llvm.org/D79407	2020-05-05 14:46:08 +01:00
Simon Pilgrim	5c91aa6603	[InstCombine] Fold or(zext(bswap(x)),shl(zext(bswap(y)),bw/2)) -> bswap(or(zext(x),shl(zext(y), bw/2))) This adds a general combine that can be used to fold: or(zext(OP(x)), shl(zext(OP(y)),bw/2)) --> OP(or(zext(x), shl(zext(y),bw/2))) Allowing us to widen 'concat-able' style or+zext patterns - I've just set this up for BSWAP but we could use this for other similar ops (BITREVERSE for instance). We already do something similar for bitop(bswap(x),bswap(y)) --> bswap(bitop(x,y)) Fixes PR45715 Reviewed By: @lebedev.ri Differential Revision: https://reviews.llvm.org/D79041	2020-05-05 12:30:10 +01:00
Sergey Dmitriev	f637334df9	[CallGraphUpdater] Removed references to callees when deleting function Summary: Otherwise we can get unaccounted references to call graph nodes. Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79382	2020-05-04 18:59:47 -07:00
Simon Pilgrim	940061438e	[InstCombine] Fold (mul(abs(x),abs(x))) -> (mul(x,x)) (PR39476) This patch adds support for discarding integer absolutes (abs + nabs variants) from self-multiplications. ABS Alive2: http://volta.cs.utah.edu:8080/z/rwcc8W NABS Alive2: http://volta.cs.utah.edu:8080/z/jZXUwQ This is an InstCombine version of D79304 - I'm not sure yet if we'll need that after this. Reviewed By: @lebedev.ri and @xbolva00 Differential Revision: https://reviews.llvm.org/D79319	2020-05-04 15:21:52 +01:00
Jay Foad	e737847b8f	[SLC] Allow llvm.pow(x,2.0) -> x*x etc even if no pow() lib func optimizePow does not create any new calls to pow, so it should work regardless of whether the pow library function is available. This allows it to optimize the llvm.pow intrinsic on targets with no math library. Based on a patch by Tim Renouf. Differential Revision: https://reviews.llvm.org/D68231	2020-05-04 10:54:07 +01:00
Simon Pilgrim	8e9a8dc185	[InstCombine] Add tests showing failure to fold mul(abs(x),abs(x)) -> mul(x,x) (PR39476) Includes abs() and nabs() variants	2020-05-04 10:24:18 +01:00
Jay Foad	6c42814a26	Precommit test updates for D68231.	2020-05-04 09:55:59 +01:00
Johannes Doerfert	95e0d28b71	[Attributor] Remember only necessary dependences Before we eagerly put dependences into the QueryMap as soon as we encountered them (via `Attributor::getAAFor<>` or `Attributor::recordDependence`). Now we will wait to see if the dependence is useful, that is if the target is not already in a fixpoint state at the end of the update. If so, there is no need to record the dependence at all. Due to the abstraction via `Attributor::updateAA` we will now also treat the very first update (during attribute creation) as we do subsequent updates. Finally this resolves the problematic usage of QueriedNonFixAA. --- Single run of the Attributor module and then CGSCC pass (oldPM) for SPASS/clause.c (~10k LLVM-IR loc): Before: ``` calls to allocation functions: 554675 (389245/s) temporary memory allocations: 101574 (71280/s) peak heap memory consumption: 28.46MB peak RSS (including heaptrack overhead): 116.26MB total memory leaked: 269.10KB ``` After: ``` calls to allocation functions: 512465 (345559/s) temporary memory allocations: 98832 (66643/s) peak heap memory consumption: 22.54MB peak RSS (including heaptrack overhead): 106.58MB total memory leaked: 269.10KB ``` Difference: ``` calls to allocation functions: -42210 (-727758/s) temporary memory allocations: -2742 (-47275/s) peak heap memory consumption: -5.92MB peak RSS (including heaptrack overhead): 0B total memory leaked: 0B ```	2020-05-03 22:01:51 -05:00
Johannes Doerfert	231026a508	[Attributor] Initialize "value attributes" w/ must-be-executed-context info Attributes that only depend on the value (=bit pattern) can be initialized from uses in the must-be-executed-context (MBEC). We did use `AAComposeTwoGenericDeduction` and `AAFromMustBeExecutedContext` before to do this for some positions of these attributes but not for all. This was fairly complicated and also problematic as we did run it in every `updateImpl` call even though we only use known information. The new implementation removes `AAComposeTwoGenericDeduction`* and `AAFromMustBeExecutedContext` in favor of a simple interface `AddInformation::fromMBEContext(...)` which we call from the `initialize` methods of the "value attribute" `Impl` classes, e.g. `AANonNullImpl::initialize`. There can be two types of test changes: 1) Artifacts where we miss some information that was known before a global fixpoint was reached and therefore available in an update but not at the beginning. 2) Deduction for values we did not derive via the MBEC before or which were not found as the `AAFromMustBeExecutedContext::updateImpl` was never invoked. * An improved version of AAComposeTwoGenericDeduction can be found in D78718. Once we find a new use case that implementation will be able to handle "generic" AAs better.
--- Single run of the Attributor module and then CGSCC pass (oldPM) for SPASS/clause.c (~10k LLVM-IR loc): Before: ``` calls to allocation functions: 468428 (328952/s) temporary memory allocations: 77480 (54410/s) peak heap memory consumption: 32.71MB peak RSS (including heaptrack overhead): 122.46MB total memory leaked: 269.10KB ``` After: ``` calls to allocation functions: 554720 (351310/s) temporary memory allocations: 101650 (64376/s) peak heap memory consumption: 28.46MB peak RSS (including heaptrack overhead): 116.75MB total memory leaked: 269.10KB ``` Difference: ``` calls to allocation functions: 86292 (556722/s) temporary memory allocations: 24170 (155935/s) peak heap memory consumption: -4.25MB peak RSS (including heaptrack overhead): 0B total memory leaked: 0B ``` Reviewed By: uenoku Differential Revision: https://reviews.llvm.org/D78719	2020-05-03 21:41:22 -05:00
Johannes Doerfert	2f97b8b891	[Attributor][NFC] Proactively ask for `nocapture` on call site arguments This minimizes test noise later on and is in line with other attributes we derive proactively.	2020-05-03 21:38:06 -05:00
Sergey Dmitriev	0f70f73308	[Attributor] Bitcast constant to the returned value type if it has different type Reviewers: jdoerfert, sstefan1, uenoku Reviewed By: jdoerfert Subscribers: hiraditya, uenoku, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79277	2020-05-03 11:46:13 -07:00
Nikita Popov	46ee652c70	Revert "[InstSimplify] Remove known bits constant folding" This reverts commit `08556afc54`. This breaks some AMDGPU tests.	2020-05-03 20:45:10 +02:00
Nikita Popov	08556afc54	[InstSimplify] Remove known bits constant folding If SimplifyInstruction() does not succeed in simplifying the instruction, it will compute the known bits of the instruction in the hope that all bits are known and the instruction can be folded to a constant. I have removed a similar optimization from InstCombine in D75801, and would like to drop this one as well. On average, we spend ~1% of total compile-time performing this known bits calculation. However, if we introduce some additional statistics for known bits computations and how many of them succeed in simplifying the instruction we get (on test-suite): instsimplify.NumKnownBits: 216 instsimplify.NumKnownBitsComputed: 13828375 valuetracking.NumKnownBitsComputed: 45860806 Out of ~14M known bits calculations (accounting for approximately one third of all known bits calculations), only 0.0015% succeed in producing a constant. Those cases where we do succeed to compute all known bits will get folded by other passes like InstCombine later. On test-suite, only lencod.test and GCC-C-execute-pr44858.test show a hash difference after this change. On lencod we see an improvement (a loop phi is optimized away), on the GCC torture test a regression (a function return value is determined only after IPSCCP, preventing propagation from a noinline function.) There are various regressions in InstSimplify tests. However, all of these cases are already handled by InstCombine, and corresponding tests have already been added there. Differential Revision: https://reviews.llvm.org/D79294	2020-05-03 20:26:58 +02:00
Hongtao Yu	911e06f5eb	[ICP] Handling must tail calls in indirect call promotion Per the IR convention, a musttail call must precede a ret with an optional bitcast. This was violated by the indirect call promotion optimization which could result in IR like: ; <label>:2192: br i1 %2198, label %2199, label %2201, !dbg !226012, !prof !229483 ; <label>:2199: ; preds = %2192 musttail call fastcc void @foo(i8* %2195), !dbg !226012 br label %2202, !dbg !226012 ; <label>:2201: ; preds = %2192 musttail call fastcc void %2197(i8* %2195), !dbg !226012 br label %2202, !dbg !226012 ; <label>:2202: ; preds = %605, %2201, %2199 ret void, !dbg !229485 This is being fixed in this change where the return statement goes together with the promoted indirect call. The code generated is like: ; <label>:2192: br i1 %2198, label %2199, label %2201, !dbg !226012, !prof !229483 ; <label>:2199: ; preds = %2192 musttail call fastcc void @foo(i8* %2195), !dbg !226012 ret void, !dbg !229485 ; <label>:2201: ; preds = %2192 musttail call fastcc void %2197(i8* %2195), !dbg !226012 ret void, !dbg !229485 Differential Revision: https://reviews.llvm.org/D79258	2020-05-03 10:42:22 -07:00
Sanjay Patel	682f0b366b	[InstCombine] use select-of-constants with set/clear bit mask patterns Cond ? (X & ~C) : (X \| C) --> (X & ~C) \| (Cond ? 0 : C) Cond ? (X \| C) : (X & ~C) --> (X & ~C) \| (Cond ? C : 0) The select-of-constants form results in better codegen. There's an existing test diff that shows a transform that results in an extra IR instruction, but that's an existing problem. This is motivated by code seen in LLVM itself - see PR37581: https://bugs.llvm.org/show_bug.cgi?id=37581 define i8 @src(i8 %x, i8 %C, i1 %b) { %notC = xor i8 %C, -1 %and = and i8 %x, %notC %or = or i8 %x, %C %cond = select i1 %b, i8 %or, i8 %and ret i8 %cond } define i8 @tgt(i8 %x, i8 %C, i1 %b) { %notC = xor i8 %C, -1 %and = and i8 %x, %notC %mul = select i1 %b, i8 %C, i8 0 %or = or i8 %mul, %and ret i8 %or } http://volta.cs.utah.edu:8080/z/Vt2WVm Differential Revision: https://reviews.llvm.org/D78880	2020-05-03 09:44:43 -04:00
Nikita Popov	7c649b58f0	[InstCombine] Duplicate some InstSimplify tests (NFC) Duplicate some tests in preparation for D79294.	2020-05-03 12:49:36 +02:00
Nikita Popov	60e9ee16b4	[MergeFuncs] Don't merge shufflevectors with different masks When the shufflevector mask operand was converted into special instruction data, the FunctionComparator was not updated to account for this. As such, MergeFuncs will happily merge shufflevectors with different masks. This fixes https://bugs.llvm.org/show_bug.cgi?id=45773. Differential Revision: https://reviews.llvm.org/D79261	2020-05-02 10:21:14 +02:00
Sanjay Patel	7fa150203f	[InstCombine] fix miscompile from multi-use cttz/ctlz transform PR45762: https://bugs.llvm.org/show_bug.cgi?id=45762	2020-05-01 13:52:24 -04:00
Sanjay Patel	43b0e446fb	[InstCombine] add test for faulty cttz fold (PR45762); NFC	2020-05-01 13:52:23 -04:00
Simon Pilgrim	4548e62ca4	[InstCombine] Additional 'concat of ORs' BSWAP/BITREVERSE tests for D79041	2020-05-01 18:05:24 +01:00
Sanjay Patel	57f0eed98d	[InstSimplify] allow insertelement-with-undef fold if poison-safe The more general fold was not poison-safe, so it was removed: rG5486e00 ...but it is ok to have this transform if analysis can determine the vector contains no poison. The test shows a simple example of that: constant integer elements are not poison.	2020-05-01 10:34:29 -04:00
Sanjay Patel	c79a366ec0	[InstSimplify] update test; NFC Missed this test diff when committing: rG5486e00dc3	2020-05-01 10:06:56 -04:00
Sanjay Patel	5486e00dc3	[InstSimplify] remove poison-unsafe insertelement of undef value PR45481: https://bugs.llvm.org/show_bug.cgi?id=45481 SDAG has an identical transform to this, so there's little chance of any real-world impact. OTOH, that means we are effectively sweeping the bug out of sight because poison exists in codegen too.	2020-05-01 09:22:05 -04:00
Sanjay Patel	5013a788f8	[InstCombine] adjust tests for pow(); NFC D68231 would change this, but the existing test doesn't cover what was probably intended (a libcall test).	2020-05-01 08:42:51 -04:00
Nikita Popov	b74c6d2c9d	[InlineFunction] Disable emission of alignment assumptions by default In D74183 clang started emitting alignment for sret parameters unconditionally. This caused a 1.5% compile-time regression on tramp3d-v4. The reason is that we now generate many instances of IR like %ptrint = ptrtoint %class.GuardLayers* %guards_m to i64 %maskedptr = and i64 %ptrint, 3 %maskcond = icmp eq i64 %maskedptr, 0 tail call void @llvm.assume(i1 %maskcond) to preserve the alignment information during inlining. Based on IR analysis, these assumptions also regress optimization. The attached phase ordering test case illustrates two issues: One is instruction count based optimization heuristics, which are affected by the four additional instructions of the assumption. The other is blocking of SROA due to ptrtoint casts (PR45763). We already encountered the same problem in Rust, where we (unlike Clang) generally prefer to emit alignment information absolutely everywhere it is available. We were only able to do this after hardcoding -preserve-alignment-assumptions-during-inlining=false, because we were seeing significant optimization and compile-time regressions otherwise. This patch disables -preserve-alignment-assumptions-during-inlining by default, because we should not be punishing people for adding more alignment annotations. Once the assume bundle work shakes out and we can represent (and use) alignment assumptions using assume bundles, it should be possible to re-enable this with reduced overhead. Differential Revision: https://reviews.llvm.org/D76886	2020-04-30 23:12:54 +02:00
Masoud Ataei	b4934ae44c	[VFDatabase] Testsuite for scalar functions are vector functions with VF =1 Fixing test suite of the committed PR: https://reviews.llvm.org/D78054. I am proposing to remove the PowerPC target triple in the test suite. Reviewed by: @jsji, @fpetrogalli Tags: LLVM Differential Revision: https://reviews.llvm.org/D79124	2020-04-30 15:47:21 -04:00
Kirill Naumov	0383253cdf	[InlineCost] Addressing a very strict assert check in CostAnnotationWriter::emitInstructionAnnot The assert checks that every instruction must be annotated by this point while it is not necessary. If the inlining process was interrupted because the threshold was reached, the rest of the instructions would not be annotated which triggers the assert. The added test shows the situation in which it can happen. This is a recommit as the original commit failed due to the absence of REQUIRES: assert in the test. Reviewed By: mtrofin Differential Revision: https://reviews.llvm.org/D79107	2020-04-30 15:38:36 +00:00
Sanjay Patel	4a065a72ef	[InstCombine] add tests for bitcast+inselt; NFC	2020-04-30 09:11:29 -04:00
Sanjay Patel	35fe2814cf	[InstCombine] update auto-generated test checks; NFC	2020-04-30 08:39:02 -04:00
Sanjay Patel	2cfeaf3b2d	[InstCombine] add tests for FP->int->FP->FP casting; NFC	2020-04-30 07:41:28 -04:00
Evgeniy Brevnov	3e68a66704	[BPI][NFC] Reuse post dominator tree from analysis manager when available Summary: Currently BPI unconditionally creates post dominator tree each time. While this is not incorrect we can save compile time by reusing existing post dominator tree (when it's valid) provided by analysis manager. Reviewers: skatkov, taewookoh, yrouban Reviewed By: skatkov Subscribers: hiraditya, steven_wu, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78987	2020-04-30 11:31:03 +07:00
Kirill Naumov	0fa793e798	Revert "[InlineCost] Addressing a very strict assert check in CostAnnotationWriter::emitInstructionAnnot" This reverts commit `66947d05fd`.	2020-04-29 22:00:51 +00:00
Kirill Naumov	66947d05fd	[InlineCost] Addressing a very strict assert check in CostAnnotationWriter::emitInstructionAnnot The assert checks that every instruction must be annotated by this point while it is not necessary. If the inlining process was interrupted because the threshold was reached, the rest of the instructions would not be annotated which triggers the assert. The added test shows the situation in which it can happen. Reviewed-By: mtrofin Diff: https://reviews.llvm.org/D79107	2020-04-29 20:44:10 +00:00
Anh Tuyen Tran	c7878ad231	[VFDatabase] Scalar functions are vector functions with VF =1 Summary: Return scalar function when VF==1. The new trivial mapping scalar --> scalar when VF==1 to prevent false positive for "isVectorizable" query. Author: masoud.ataei (Masoud Ataei) Reviewers: Whitney (Whitney Tsang), fhahn (Florian Hahn), pjeeva01 (Jeeva P.), fpetrogalli (Francesco Petrogalli), rengolin (Renato Golin) Reviewed By: fpetrogalli (Francesco Petrogalli) Subscribers: hiraditya (Aditya Kumar), llvm-commits, LLVM Tag: LLVM Differential Revision: https://reviews.llvm.org/D78054	2020-04-29 17:20:37 +00:00
Simon Pilgrim	090cae8491	[TTI] Add DemandedElts to getScalarizationOverhead The improvements to the x86 vector insert/extract element costs in D74976 resulted in the estimated costs for vector initialization and scalarization increasing higher than should be expected. This is particularly noticeable on pre-SSE4 targets where the availability of legal INSERT_VECTOR_ELT ops is more limited. This patch does 2 things: 1 - it implements X86TTIImpl::getScalarizationOverhead to more accurately represent the typical costs of a ISD::BUILD_VECTOR pattern. 2 - it adds a DemandedElts mask to getScalarizationOverhead to permit the SLP's BoUpSLP::getGatherCost to be rewritten to use it directly instead of accumulating raw vector insertion costs. This fixes PR45418 where a v4i8 (zext'd to v4i32) was no longer vectorizing. A future patch should extend X86TTIImpl::getScalarizationOverhead to tweak the EXTRACT_VECTOR_ELT scalarization costs as well. Reviewed By: @craig.topper Differential Revision: https://reviews.llvm.org/D78216	2020-04-29 12:00:38 +01:00
Florian Hahn	e89379856a	Recommit "[VPlan] Add & use VPValue operands for VPWidenRecipe (NFC)." The crash that caused the original revert has been fixed in `a3c964a278`. I also added a reduced version of the crash reproducer. This reverts the revert commit `2107af9ccf`.	2020-04-29 11:40:39 +01:00
Florian Hahn	e018b8bbb0	[DSE,MSSA] Add multi-path tests with readnone throwing calls.	2020-04-29 10:30:05 +01:00
Simon Pilgrim	751a554f25	[InstCombine] Add PR45715 test case	2020-04-28 21:53:59 +01:00
Roman Lebedev	a0004358a8	[InstCombine] Negator: 'or' with no common bits set is just 'add' In `InstCombiner::visitAdd()`, we have ``` // A+B --> A\|B iff A and B have no bits set in common. if (haveNoCommonBitsSet(LHS, RHS, DL, &AC, &I, &DT)) return BinaryOperator::CreateOr(LHS, RHS); ``` so we should handle such `or`'s here, too.	2020-04-28 19:16:32 +03:00
Roman Lebedev	a5f22f2b0e	[NFC][InstCombine] Tests for negation of 'or' with no common bits set	2020-04-28 19:16:31 +03:00
Krzysztof Parzyszek	25a4b1904c	Handle part-word LL/SC in atomic expansion pass Differential Revision: https://reviews.llvm.org/D77213	2020-04-28 10:07:39 -05:00
Sanjay Patel	7a8c226ba8	[SLP] add test for partially vectorized bswap (PR39538); NFC	2020-04-27 17:29:27 -04:00
Sanjay Patel	54fe6c9599	[InstCombine] add tests for set/clear masked bits; NFC	2020-04-27 15:55:45 -04:00
Craig Topper	5eff75d86a	[X86][CostModel] Improve costs for fp_to_uint/fp_to_sint for vXi8/vXi16/v2i32 results. Differential Revision: https://reviews.llvm.org/D78893	2020-04-27 10:35:15 -07:00
Wei Mi	10b57ca690	[ProfileSummary] Add partial profile annotation on IR. Profile and profile summary are usually read only once and then annotated on IR. The profile summary metadata on IR should include the value of the newly added partial profile flag, so that compilation phase like thinlto postlink can get the full set of profile information. Differential Revision: https://reviews.llvm.org/D78310	2020-04-27 08:34:15 -07:00
Ayal Zaks	a3c964a278	[LV] Fix recording of BranchTakenCount for FoldTail When folding tail, branch taken count is computed during initial VPlan execution and recorded to be used by the compare computing the loop's mask. This recording should directly set the State, instead of reusing Value2VPValue mapping which serves original Values present prior to vectorization. The branch taken count may be a constant Value, which may be used elsewhere in the loop; trying to employ Value2VPValue for both leads to the issue reported in https://reviews.llvm.org/D76992#inline-721028 Differential Revision: https://reviews.llvm.org/D78847	2020-04-26 20:13:10 +03:00
Sanjay Patel	3f10f1a5c7	[InstCombine] updated test comments; NFC As suggested in review for: rG4abab5c5ca7b	2020-04-26 11:11:00 -04:00
Florian Hahn	7d57d22baa	[SCCP] Support ranges for loads and stores. Integer ranges can be used for loaded/stored values. Note that widening can be disabled for loads/stores, as we only rely on instructions that cause continued increases to ranges to be widened (like binary operators). Reviewers: efriedma, mssimpso, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D78433	2020-04-26 13:16:47 +01:00
Florian Hahn	c1c5c47e64	[SCCP] Add load/store test for integer ranges.	2020-04-26 13:16:47 +01:00
Sergey Dmitriev	67aed1469b	[Attributor] Do not set 'returned' attribute for arguments that cannot be bitcasted to function result Reviewers: jdoerfert, sstefan1, uenoku Reviewed By: jdoerfert Subscribers: hiraditya, uenoku, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78828	2020-04-25 09:49:40 -07:00
Sanjay Patel	4abab5c5ca	[InstCombine] generalize canonicalization of masked equality comparisons (X | MaskC) == C --> (X & ~MaskC) == C ^ MaskC (X | MaskC) != C --> (X & ~MaskC) != C ^ MaskC We have more analysis for 'and' patterns and already lean this way in the existing code, so this should be neutral or better in IR. If this does not do as well in codegen, the problem already exists and we should fix that based on target costs/heuristics. http://volta.cs.utah.edu:8080/z/oP3ecL define void @src(i8 %x, i8 %OrC, i8 %C, i1* %p0, i1* %p1) { %or = or i8 %x, %OrC %eq = icmp eq i8 %or, %C store i1 %eq, i1* %p0 %ne = icmp ne i8 %or, %C store i1 %ne, i1* %p1 ret void } define void @tgt(i8 %x, i8 %OrC, i8 %C, i1* %p0, i1* %p1) { %NotOrC = xor i8 %OrC, -1 %a = and i8 %x, %NotOrC %NewC = xor i8 %C, %OrC %eq = icmp eq i8 %a, %NewC store i1 %eq, i1* %p0 %ne = icmp ne i8 %a, %NewC store i1 %ne, i1* %p1 ret void }	2020-04-25 11:31:57 -04:00
Florian Hahn	46a04940e8	[DSE] Add stat for remaining stores after DSE. Using the existing NumFastStores statistic can be misleading when comparing the impact of DSE patches. For example, consider the case where a store gets removed from a function before it is inlined into another function. A less powerful DSE might only remove the store from functions it has been inlined into, which will result in more stores being removed, but no difference in the actual number of stores after DSE. The new stat provides the absolute number of stores surviving after DSE. Reviewers: dmgreen, bryant, asbirlea, jfb Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D78830	2020-04-25 16:12:55 +01:00
Sanjay Patel	9193644f77	[InstCombine] add tests for icmp with bitmask logic op; NFC	2020-04-25 10:57:29 -04:00
Juneyoung Lee	f5677fe700	[ValueTracking] Let isGuaranteedNotToBeUndefOrPoison look into more constants/instructions Summary: This patch helps isGuaranteedNotToBeUndefOrPoison look into more constants and instructions (bitcast/alloca/gep/fcmp). To deal with bitcast, Depth is added to isGuaranteedNotToBeUndefOrPoison. This patch is split from https://reviews.llvm.org/D75808. Checked with Alive2 Reviewers: reames, jdoerfert Reviewed By: jdoerfert Subscribers: sanwou01, spatel, llvm-commits, hiraditya Tags: #llvm Differential Revision: https://reviews.llvm.org/D76010	2020-04-25 23:29:54 +09:00
Florian Hahn	82ce334727	[ValueLattice] Merging unknown with empty CR is unknown. Currently an unknown/undef value is marked as overdefined when merged with an empty range. An empty range can occur in unreachable/dead code. When merging the new unknown state (= no value known yet) with an empty range, there still isn't any information about the value yet and we can stay in unknown. This gives a few nice improvements on the number of instructions removed by IPSCCP: Same hash: 170 (filtered out) Remaining: 67 Metric: sccp.IPNumInstRemoved Program base patch diff test-suite...rks/FreeBench/mason/mason.test 3.00 6.00 100.0% test-suite...nchmarks/McCat/18-imp/imp.test 3.00 5.00 66.7% test-suite...C/CFP2000/179.art/179.art.test 2.00 3.00 50.0% test-suite...ijndael/security-rijndael.test 2.00 3.00 50.0% test-suite...ks/Prolangs-C/agrep/agrep.test 40.00 58.00 45.0% test-suite...ce/Applications/Burg/burg.test 26.00 37.00 42.3% test-suite...cCat/03-testtrie/testtrie.test 3.00 4.00 33.3% test-suite...Source/Benchmarks/sim/sim.test 29.00 36.00 24.1% test-suite.../Applications/spiff/spiff.test 9.00 11.00 22.2% test-suite...s/FreeBench/neural/neural.test 5.00 6.00 20.0% test-suite...pplications/treecc/treecc.test 66.00 79.00 19.7% test-suite...langs-C/football/football.test 85.00 101.00 18.8% test-suite...ce/Benchmarks/PAQ8p/paq8p.test 90.00 105.00 16.7% test-suite...oxyApps-C++/miniFE/miniFE.test 37.00 43.00 16.2% test-suite...rks/FreeBench/pifft/pifft.test 26.00 30.00 15.4% test-suite...lications/sqlite3/sqlite3.test 481.00 548.00 13.9% test-suite...marks/7zip/7zip-benchmark.test 4875.00 5522.00 13.3% test-suite.../CINT2000/176.gcc/176.gcc.test 1117.00 1197.00 7.2% test-suite...0.perlbench/400.perlbench.test 1618.00 1732.00 7.0% Reviewers: efriedma, nikic, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D78667	2020-04-25 13:43:34 +01:00
Tyker	e5f8a77c19	[AssumeBundles] Refactor assume builder Summary: Refactor the assume builder for the next patch. The assume builder now generates only one assume per attribute kind and per value they are on. To do this it takes the highest value. This is desirable because currently, for all attributes, the highest value is the most valuable. Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78013	2020-04-25 13:43:52 +02:00
Ehud Katz	64249f177e	[CodeExtractor] Fix extraction of a value used only by intrinsics outside of region We should only skip `lifetime` and `dbg` intrinsics when searching for users. Other intrinsics are legit users that can't be ignored. Without this fix, the testcase would result in an invalid IR. `memcpy` will have a reference to the, now, external value (local to the extracted loop function). Fix PR42194 Differential Revision: https://reviews.llvm.org/D78749	2020-04-25 11:44:47 +03:00
Craig Topper	e4a9190ad7	[X86][ArgumentPromotion] Allow Argument Promotion if caller and callee disagree on 512-bit vectors support if the arguments are scalar. If one of caller/callee has disabled ZMM registers due to prefer-vector-width=256, we were previously disabling argument promotion as the ABI might be incompatible since one side will split 512-bit vectors in this case. But if we can see that the types are all scalar this shouldn't be a problem. This patch assumes that pointer element type reflects the type that the argument will be promoted to. Differential Revision: https://reviews.llvm.org/D78770	2020-04-24 15:47:02 -07:00
Tyker	42431da895	[AssumeBundles] Use assume bundles in isKnownNonZero Summary: Use nonnull and dereferenceable from an assume bundle in isKnownNonZero Reviewers: jdoerfert, nikic, lebedev.ri, reames, fhahn, sstefan1 Reviewed By: jdoerfert Subscribers: fhahn, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76149	2020-04-24 20:41:51 +02:00
Sanjay Patel	238f00f6d3	[InstCombine] regenerate test checks; NFC Values named 'tmp' can cause problems for the auto-generated check script.	2020-04-24 13:51:13 -04:00
Mircea Trofin	c3770c5d6d	[llvm][NFC] Factor out inlining pipeline as a module pipeline. Summary: This simplifies testing in scenarios where we want to set up module-wide analyses for inlining. The patch enables treating inlining and its function cleanups, as a module pass. The alternative would be for tests to describe the pipeline, which is tedious and adds maintenance overhead. Reviewers: davidxl, dblaikie, jdoerfert, sstefan1 Subscribers: hiraditya, steven_wu, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78512	2020-04-24 09:24:12 -07:00
Sanjay Patel	e4175ff525	[InstCombine] intersect FMF when reassociating FP min/max intrinsics As discussed in PR45478: https://bugs.llvm.org/show_bug.cgi?id=45478 ...propagating FMF from the outer (second) call is not correct, so intersect them instead. I suspect we could do better (see TODO comment), but mismatched FMF is probably too rare to care about. Differential Revision: https://reviews.llvm.org/D78631	2020-04-24 12:14:03 -04:00
Max Kazantsev	9cd4debd5a	[LoopVectorize] Preserve CFG analyses if CFG wasn't modified One of the transforms the loop vectorizer makes is LCSSA formation. In some cases it is the only transform it makes. We should not drop CFG analyses if only LCSSA was formed and no actual CFG changes were made. We should think of expanding this logic to other passes as well, and maybe make it a part of the PM framework. Reviewed By: Florian Hahn Differential Revision: https://reviews.llvm.org/D78360	2020-04-24 17:22:24 +07:00
Johannes Doerfert	10ff24d853	[Attributor][NFC] Remove and update old check lines	2020-04-24 01:08:23 -05:00
Johannes Doerfert	a6b14bae0f	[Attributor][NFC] Strip check lines not used while 3 tests are partially disabled The three tests modified by this commit have been partially disabled (one run line is commented out). As a consequence, subsequent updates will have weird effects on the check lines. This is a commit to avoid such effects by making the check lines match the three remaining run lines.	2020-04-24 01:08:23 -05:00
Eli Friedman	3291efc2b3	[ValueTracking] Handle shufflevector constants in ComputeNumSignBits Differential Revision: https://reviews.llvm.org/D78688	2020-04-23 17:47:37 -07:00
Roman Lebedev	5a159ed2a8	[InstCombine] Negator: don't negate multi-use `sub` While we can do that without increasing the instruction count, if the old `sub` sticks around then the transform is not only an unlikely win, but a likely regression, since we have likely extended the live range and use count of both `sub` operands, as opposed to just the result of the `sub`. As Kostya Serebryany notes in post-commit review in https://reviews.llvm.org/D68408#1998112, this can indeed degrade final assembly and increase register pressure and spilling. This isn't what we want here, so at least for now let's guard it with a use check.	2020-04-23 23:59:15 +03:00
Matt Arsenault	156afb2253	AMDGPU: Fix inlining logic for denormals This was backwards from intended and missing a test. We perhaps should just ignore the FP mode here, since it shouldn't be legal to mix code with different default modes in the absence of strictfp.	2020-04-23 15:30:48 -04:00
Sanjay Patel	62da6ecea2	[InstCombine] substitute equivalent constant to reduce logic-of-icmps (X == C) && (Y Pred1 X) --> (X == C) && (Y Pred1 C) (X != C) || (Y Pred1 X) --> (X != C) || (Y Pred1 C) This cooperates/overlaps with D78430, but it is a more general transform that gets us most of the expected simplifications and several other improvements. http://volta.cs.utah.edu:8080/z/5gxjjc PR45618: https://bugs.llvm.org/show_bug.cgi?id=45618 Differential Revision: https://reviews.llvm.org/D78582	2020-04-23 10:19:16 -04:00
Sanjay Patel	e86eff0e82	[InstSimplify] fold and/or of compares with equality to min/max constant I found 12 (6 if we compress the DeMorganized forms) patterns for logic-of-compares with a min/max constant while looking at PR45510: https://bugs.llvm.org/show_bug.cgi?id=45510 The variations on those forms multiply the test cases by 8 (unsigned/signed, swapped compare operands, commuted logic operands). We have partial logic to deal with these for the unsigned min (zero) case, but missed everything else. We are deferring the majority of these patterns to InstCombine to allow more general handling (see D78582). We could use ConstantRange instead of predicate+constant matching here. I don't expect there's any noticeable compile-time impact for either form. Here's an abuse of Alive2 to show the 12 basic signed variants of the patterns in one function: http://volta.cs.utah.edu:8080/z/5Vpiyg declare void @use(i1, i1, i1, i1, i1, i1, i1, i1, i1, i1, i1, i1) define void @src(i8 %x, i8 %y) { %m1 = icmp eq i8 %x, 127 %c1 = icmp slt i8 %x, %y %r1 = and i1 %m1, %c1 ; (X == MAX) && (X < Y) --> false %m2 = icmp ne i8 %x, 127 %c2 = icmp sge i8 %x, %y %r2 = or i1 %m2, %c2 ; (X != MAX) || (X >= Y) --> true %m3 = icmp eq i8 %x, -128 %c3 = icmp sgt i8 %x, %y %r3 = and i1 %m3, %c3 ; (X == MIN) && (X > Y) --> false %m4 = icmp ne i8 %x, -128 %c4 = icmp sle i8 %x, %y %r4 = or i1 %m4, %c4 ; (X != MIN) || (X <= Y) --> true %m5 = icmp eq i8 %x, 127 %c5 = icmp sge i8 %x, %y %r5 = and i1 %m5, %c5 ; (X == MAX) && (X >= Y) --> X == MAX %m6 = icmp ne i8 %x, 127 %c6 = icmp slt i8 %x, %y %r6 = or i1 %m6, %c6 ; (X != MAX) || (X < Y) --> X != MAX %m7 = icmp eq i8 %x, -128 %c7 = icmp sle i8 %x, %y %r7 = and i1 %m7, %c7 ; (X == MIN) && (X <= Y) --> X == MIN %m8 = icmp ne i8 %x, -128 %c8 = icmp sgt i8 %x, %y %r8 = or i1 %m8, %c8 ; (X != MIN) || (X > Y) --> X != MIN %m9 = icmp ne i8 %x, 127 %c9 = icmp slt i8 %x, %y %r9 = and i1 %m9, %c9 ; (X != MAX) && (X < Y) --> X < Y %m10 = icmp eq i8 %x, 127 %c10 = icmp sge i8 %x, %y %r10 = or i1 %m10, %c10 ; (X == MAX) || (X >= Y) --> X >= Y %m11 = icmp ne i8 %x, -128 %c11 = icmp sgt i8 %x, %y %r11 = and i1 %m11, %c11 ; (X != MIN) && (X > Y) --> X > Y %m12 = icmp eq i8 %x, -128 %c12 = icmp sle i8 %x, %y %r12 = or i1 %m12, %c12 ; (X == MIN) || (X <= Y) --> X <= Y call void @use(i1 %r1, i1 %r2, i1 %r3, i1 %r4, i1 %r5, i1 %r6, i1 %r7, i1 %r8, i1 %r9, i1 %r10, i1 %r11, i1 %r12) ret void } define void @tgt(i8 %x, i8 %y) { %m5 = icmp eq i8 %x, 127 %m6 = icmp ne i8 %x, 127 %m7 = icmp eq i8 %x, -128 %m8 = icmp ne i8 %x, -128 %c9 = icmp slt i8 %x, %y %c10 = icmp sge i8 %x, %y %c11 = icmp sgt i8 %x, %y %c12 = icmp sle i8 %x, %y call void @use(i1 0, i1 1, i1 0, i1 1, i1 %m5, i1 %m6, i1 %m7, i1 %m8, i1 %c9, i1 %c10, i1 %c11, i1 %c12) ret void } Differential Revision: https://reviews.llvm.org/D78430	2020-04-23 09:16:10 -04:00
Sanjay Patel	6a10560f17	[InstCombine] add test for logic-of-icmps that should simplify (D78582); NFC	2020-04-23 09:16:10 -04:00
Florian Hahn	352b612a71	[SCCP] Drop unnecessary early exit for ExtractValueInst. visitExtractValueInst uses mergeInValue, so it already can handle constant ranges. Initially the early exit was using isOverdefined to keep things as NFC during the initial move to ValueLatticeElement. As the function already supports constant ranges, it can just use ValueState[&I].isOverdefined. Reviewers: efriedma, mssimpso, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D78393	2020-04-22 22:07:59 +01:00
David Green	eecba95067	[ARM] Replace arm vendor with none. NFC	2020-04-22 18:19:35 +01:00
Johannes Doerfert	68a27587c2	[OpenMP][FIX] Do not use InaccessibleMemOrArgMemOnly for barrier and flush This was reported as PR45635, committed first as `72a9e7c926`, reverted by `188f5cde96`, and now recommitted with the test change.	2020-04-22 11:10:54 -05:00
Roman Lebedev	a70d2ab323	[NFC][InstCombine] Tests for negation of sign-/zero- extensions * sext of non-positive can be negated. * zext of non-negative can be negated.	2020-04-22 17:37:42 +03:00
Sanjay Patel	6f19f0fb9a	[InstCombine] add tests for min/max FP intrinsics with FMF (PR45478); NFC https://bugs.llvm.org/show_bug.cgi?id=45478	2020-04-22 08:43:40 -04:00
Roman Lebedev	67266d879c	[InstCombine] Negator: shufflevector is negatible All these folds are correct as per alive-tv	2020-04-22 15:14:23 +03:00
Roman Lebedev	4d44ce7437	[NFC][InstCombine] Add shuffle negation tests	2020-04-22 15:14:23 +03:00
Sameer Sahasrabuddhe	5a7a6382bc	FixIrreducible: don't crash when moving a child loop Summary: When an irreducible SCC is converted into a new natural loop, existing loops included in that SCC now become children of the new loop. The logic that moves these loops from the parent loop to the new loop invoked undefined behaviour when it modified the container that it was iterating over. Fixed this by first extracting all the loops that are to be removed from the parent. Fixes bug 45623. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D78544	2020-04-22 07:47:30 +05:30
Johannes Doerfert	46b7ed0e6f	[Attributor] Remove dependence edges eagerly If we have a dependence between an abstract attribute A and an abstract attribute B such that changes in A should trigger an update of B, we do not need to keep the dependence around once the update was triggered. If the dependence is still required, the update will reinsert it into the dependence map; if it is not, we avoid triggering B in the future. This replaces the "recompute interval" mechanism we used before to prune stale dependences. Number of required iterations is generally down, compile time for the module pass (not really the CGSCC pass) is down quite a bit. There is one test change which looks like an artifact in the undefined behavior AA that needs to be looked at.	2020-04-21 15:22:10 -05:00
Johannes Doerfert	e2b53a4c05	[Attributor][NFC] Remove obsolete option from tests Since D76871 it is sufficient to run `opt -attributor` or `-attributor-cgscc`.	2020-04-21 15:22:10 -05:00
Roman Lebedev	352fef3f11	[InstCombine] Negator - sink sinkable negations Summary: As we have discussed previously (e.g. in D63992 / D64090 / [[ https://bugs.llvm.org/show_bug.cgi?id=42457 | PR42457 ]]), the `sub` instruction can almost be considered non-canonical. While we do convert `sub %x, C` -> `add %x, -C`, we sparsely do that for non-constants. But we should. Here, I propose to interpret `sub %x, %y` as `add (sub 0, %y), %x` IFF the negation can be sunk into the `%y`. This has some potential to cause endless combine loops (either around PHI's, or if there are some opposite transforms). For the former, there's the `-instcombine-negator-max-depth` option to mitigate it, should this expose any such issues. For the latter, if there are still any such opposing folds, we'd need to remove the colliding fold. In any case, reproducers welcome! Reviewers: spatel, nikic, efriedma, xbolva00 Reviewed By: spatel Subscribers: xbolva00, mgorny, hiraditya, reames, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68408	2020-04-21 22:00:23 +03:00
Sanjay Patel	cf30aafa2d	[Analysis] recognize the 'null' pointer constant as not poison Differential Revision: https://reviews.llvm.org/D78575	2020-04-21 14:23:06 -04:00
Sanjay Patel	b349098d22	[InstCombine] add tests for logic-of-icmps; NFC	2020-04-21 14:23:05 -04:00
Roman Lebedev	1f9c169990	[NFC][InstCombine] sub-of-negatible.ll: some more test cases	2020-04-21 20:14:09 +03:00
Sanjay Patel	44a8c5410e	[InstCombine] add tests for logic-of-icmps; NFC These are mostly replicated from D78430 (instsimplify). If we implement more general transforms for instcombine, then we probably don't need to add that complexity to instsimplify.	2020-04-21 12:26:45 -04:00
Johannes Doerfert	dc3b5b00fe	[OpenMPOpt] Make the combination of `ident_t` deterministic Before we kept the first applicable `ident_t` during deduplication of runtime calls. The problem is that "first" is dependent on the iteration order of a DenseMap. Since the proper solution, which is to combine the information from all `ident_t`, should be deterministic on its own, we will not try to make the iteration order deterministic. Instead, we will create a fresh `ident_t` if there is not a unique existing `ident_t*` to pick.	2020-04-20 23:27:08 -05:00
Johannes Doerfert	06a8d1aaa6	[Attributor] Partially disable three tests to unblock the windows bot The windows bot reported a crash [0] which seems to not happen on other platforms. We disable the old pass manager cgscc runs while this is under investigation. [0] http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/23151	2020-04-20 21:11:02 -05:00
Sriraman Tallam	365b60fc93	New pass to make internal linkage symbol names unique. With clang option -funique-internal-linkage-symbols, symbols with internal linkage get names with the module hash appended. Differential Revision: https://reviews.llvm.org/D78243	2020-04-20 15:05:22 -07:00
Eli Friedman	9b9454af8a	Require "target datalayout" to be at the beginning of an IR file. This will allow us to use the datalayout to disambiguate other constructs in IR, like load alignment. Split off from D78403. Differential Revision: https://reviews.llvm.org/D78413	2020-04-20 11:55:49 -07:00
Bjorn Pettersson	a8a31fdd80	[Scalarizer] Fix a non-deterministic scatter order problem Summary: The indexing operator in Scatterer may result in building new instructions. When using multiple such operators in a function argument list, the order in which we build instructions depends on argument evaluation order (which is unspecified in C++). This patch avoids such problems by expanding the components using the [] operator prior to the function call. The problem was seen when comparing output while building LLVM with different compilers (clang vs gcc). Reviewers: foad, cameron.mcinally, uabelho Reviewed By: foad Subscribers: hiraditya, mgrang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78455	2020-04-20 16:05:33 +02:00
Max Kazantsev	204c0bbe7f	[Test] Fix test failure: platform-dependent printout	2020-04-20 10:27:50 +07:00
Max Kazantsev	80cd36ed63	[Test] Add a test showing how CFG analyses are invalidated after LV It demonstrates that, even if LV does no actual vectorization and only forms LCSSA, CFG analyses get dropped.	2020-04-20 09:38:49 +07:00
Sanjay Patel	a2eb55de99	[InstSimplify] add tests for logic+icmp folds for nullptr; NFC See discussion in D78430.	2020-04-19 10:42:08 -04:00
Sanjay Patel	bef6e67e95	[VectorCombine] transform bitcasted shuffle to wider elements bitcast (shuf V, MaskC) --> shuf (bitcast V), MaskC' This is the widen shuffle elements enhancement to D76727. It builds on the analysis and simplifications in D77881 and rG6a7e958a423e. The phase ordering tests show that we can simplify inverse shuffles across a binop in both directions (widen/narrow or narrow/widen) now. There's another potential transform visible in some of the remaining TODOs - move a bitcasted operand of a shuffle after the shuffle. Differential Revision: https://reviews.llvm.org/D78371	2020-04-19 08:24:38 -04:00
Sanjay Patel	02b070ed49	[InstSimplify] add tests for logic-of-icmp with min/max constant; NFC See PR45510: https://bugs.llvm.org/show_bug.cgi?id=45510 We had partial coverage for some of these patterns, so removing duplicate tests with the complete set in the new test file.	2020-04-19 08:24:38 -04:00
Ayal Zaks	8e0c5f7200	[LV] Mark first-order recurrences as allowed exits First-order recurrences require special treatment when they are live-out; such treatment is provided by fixFirstOrderRecurrence(), so they should be included in AllowedExit set. (Should probably have been included originally in D16197.) Fixes PR45526: AllowedExit set is used by prepareToFoldTailByMasking() to check whether the treatment for live-outs also holds when folding the tail, which is not (yet) the case for first-order recurrences. Differential Revision: https://reviews.llvm.org/D78210	2020-04-18 23:54:21 +03:00
Florian Hahn	9cd68bfa0e	[SCCP] Add additional tests for structs, conditional prop and widening. This patch adds a few additional test cases with cases subsequent patches will improve on.	2020-04-18 14:07:56 +01:00
Florian Hahn	4ee45ab60f	[LV] Invalidate cost model decisions along with interleave groups. Cost-modeling decisions are tied to the compute interleave groups (widening decisions, scalar and uniform values). When invalidating the interleave groups, those decisions also need to be invalidated. Otherwise there is a mis-match during VPlan construction. VPWidenMemoryRecipes created initially are left around w/o converting them into VPInterleave recipes. Such a conversion indeed should not take place, and these gather/scatter recipes may in fact be right. The crux is leaving around obsolete CM_Interleave (and dependent) markings of instructions along with their costs, instead of recalculating decisions, costs, and recipes. Alternatively to forcing a complete recompute later on, we could try to selectively invalidate the decisions connected to the interleave groups. But we would likely need to run the uniform/scalar value detection parts again anyways and the extra complexity is probably not worth it. Fixes PR45572. Reviewers: gilr, rengolin, Ayal, hsaito Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D78298	2020-04-18 10:23:49 +01:00
Anna Thomas	ef49b1d97e	Revert "[InlineFunction] Update metadata on loads that are return values" This reverts commit `1d0f757904` because of https://bugs.llvm.org/show_bug.cgi?id=45590. Needs investigation.	2020-04-17 17:23:00 -04:00
Sjoerd Meijer	5be767d489	NFC: remove outdated TODOs from ARM test file.	2020-04-17 17:08:11 +01:00
Sanjay Patel	9d9a088e51	[PhaseOrdering] remove blank lines in tests; NFC	2020-04-17 10:30:38 -04:00
Craig Topper	944cc5e0ab	[SelectionDAGBuilder][CGP][X86] Move some of SDB's gather/scatter uniform base handling to CGP. I've always found the "findValue" a little odd and inconsistent with other things in SDB. This simplifies the code in SDB to just handle a splat constant address or a 2 operand GEP in the same BB. This removes the need for "findValue" since the operands to the GEP are guaranteed to be available. The splat constant handling is new, but was needed to avoid regressions due to constant folding combining GEPs created in CGP. CGP is now responsible for canonicalizing gather/scatters into this form. The pattern I'm using for scalarizing, a scalar GEP followed by a GEP with an all zeroes index, seems to be subject to constant folding that the insertelement+shufflevector was not. Differential Revision: https://reviews.llvm.org/D76947	2020-04-16 17:49:22 -07:00
Bob Haarman	cc5c58889e	[WPD] Avoid noalias assumptions in unique return value optimization Summary: Changes the type of the @__typeid_.*_unique_member imports we generate for unique return value optimization from i8 to [0 x i8]. This prevents assuming that these imports do not alias, such as when two unique return values occur in the same vtable. Fixes PR45393. Reviewers: tejohnson, pcc Reviewed By: pcc Subscribers: aganea, hiraditya, rnk, george.burgess.iv, dblaikie, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77421	2020-04-16 14:49:51 -07:00
Florian Hahn	c2171457e2	[SCCP] Add widening test case.	2020-04-16 22:39:52 +01:00
Johannes Doerfert	0741dec27b	[Attributor][FIX] Handle droppable uses when replacing values Since we use the fact that some uses are droppable in the Attributor we need to handle them explicitly when we replace uses. As an example, an assumed dead value can have live droppable users. In those we cannot replace the value simply by an undef. Instead, we either drop the uses (via `dropDroppableUses`) or keep them as they are. In this patch we do both, depending on the situation. For values that are dead but not necessarily removed we keep droppable uses around because they contain information we might be able to use later. For values that are removed we drop droppable uses explicitly to avoid replacement with undef.	2020-04-16 00:56:08 -05:00
Johannes Doerfert	ea7f17ee38	[InstCombine] Simplify calls with casted `returned` attribute The handling of the `returned` attribute in D75815 did miss the case where the argument is (bit)casted to a different type. This is explicitly allowed by the language reference and exposed by the Attributor. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D77977	2020-04-16 00:56:00 -05:00
Johannes Doerfert	253d6be0f6	[Attributor][FIX] Properly check for accesses to globals The check for whether globals were accessed was not always working because two bits are set for NO_GLOBAL_MEM. The new check also works if only one kind of globals (internal/external) is accessed.	2020-04-16 00:55:34 -05:00
Johannes Doerfert	9ff344ef6b	[Attributor] Remove large and seemingly useless test This was supposed to be part of D76588 already.	2020-04-15 21:26:36 -05:00
Johannes Doerfert	3ca54f4595	[Attributor] Unify testing (=updates,prefixes,run configurations,...) When the Attributor was created the test update scripts were not well suited to deal with the challenges of IR attribute checking. This partially improved. Since then we also added three additional configurations that need testing; in total we now have the following four: { TUNIT, CGSCC } x { old pass manager (OPM), new pass manager (NPM) } Finally, the number of developers and tests grew rapidly (partially due to the addition of ArgumentPromotion and IPConstantProp tests), which resulted in tests only being run in some configurations, different prefixes being used, and different "styles" of checks being used. Due to the above reasons I believed we needed to take another look at the test update scripts. While we started to use them, via UTC_ARGS: --enable/disable, the other problems remained. To improve the testing situation for all configurations, to simplify future updates to the test, and to help identify subtle effects of future changes, we now use the test update scripts for (almost) all Attributor tests. An exhaustive prefix list minimizes the number of check lines and makes it easy to identify and compare configurations. Tests have been adjusted in the process but we tried to keep their intent unchanged. Reviewed By: sstefan1 Differential Revision: https://reviews.llvm.org/D76588	2020-04-15 19:59:51 -05:00
Davide Italiano	5f87415efc	[LICM] Try to merge debug locations when sinking. The current strategy LICM uses when sinking for debuginfo is that of picking the debug location of one of the uses. This causes stepping to be wrong sometimes, see, e.g. PR45523. This patch introduces a generalization of getMergedLocation() that operates on a vector of locations instead of two and tries to merge them all together, and uses the new API in LICM. <rdar://problem/61750950>	2020-04-15 12:29:34 -07:00
Jon Roelofs	6c9d52885d	Add FileCheck colons missed in D76210 https://reviews.llvm.org/D76210#inline-715185	2020-04-15 12:26:53 -06:00
Florian Hahn	b578608256	[DSE,MSSA] Add use of alloca, to guard against removal in the future. Currently the alloca does not escape and all stores and the memset can be removed. Adding a use of the alloca ensures not all stores are eliminated.	2020-04-15 15:23:43 +01:00
Sanjay Patel	01bcc3e937	[InstCombine] prevent infinite loop with sub/abs of constant expression PR45539: https://bugs.llvm.org/show_bug.cgi?id=45539	2020-04-15 09:19:16 -04:00
Florian Hahn	cf9ee49b4d	[DSE] Lift post-dominance for objs not accessible in caller. We can eliminate MemoryDefs of objects not accessible after the function returns (e.g. alloca), if there are no reads between the MemoryDef and any function exits. We can stop traversing paths that completely overwrite the memory location of the MemoryDef. This patch was split off D73763. Reviewers: dmgreen, bryant, asbirlea, Tyker, efriedma, george.burgess.iv Reviewed By: asbirlea, george.burgess.iv Differential Revision: https://reviews.llvm.org/D77736	2020-04-15 11:37:14 +01:00
Sameer Sahasrabuddhe	8c11bc0cd0	Introduce fix-irreducible pass An irreducible SCC is one which has multiple "header" blocks, i.e., blocks with control-flow edges incident from outside the SCC. This pass converts an irreducible SCC into a natural loop by introducing a single new header block and redirecting all the edges on the original headers to this new block. This is a useful workaround for a limitation in the structurizer, which produces incorrect control flow in the presence of irreducible regions. The AMDGPU backend provides an option to enable this pass before the structurizer, which may eventually be enabled by default. Reviewed By: nhaehnle Differential Revision: https://reviews.llvm.org/D77198 This restores commit `2ada8e2525`. Originally reverted with commit `44e09b59b8`.	2020-04-15 15:05:51 +05:30
Gil Rapaport	b747d72c19	[LV] Fix PR45525: Incorrect assert in blend recipe Fix an assert introduced in 41ed5d856c1: a phi with a single predecessor and a mask is a valid case which is already supported by the code. Differential Revision: https://reviews.llvm.org/D78115	2020-04-15 10:39:07 +03:00
Sameer Sahasrabuddhe	44e09b59b8	Revert "Introduce fix-irreducible pass" This reverts commit `2ada8e2525`. Buildbots produced compilation errors which I was not able to quickly reproduce locally. Need more time to investigate.	2020-04-15 12:19:50 +05:30
Sameer Sahasrabuddhe	2ada8e2525	Introduce fix-irreducible pass An irreducible SCC is one which has multiple "header" blocks, i.e., blocks with control-flow edges incident from outside the SCC. This pass converts an irreducible SCC into a natural loop by introducing a single new header block and redirecting all the edges on the original headers to this new block. This is a useful workaround for a limitation in the structurizer, which produces incorrect control flow in the presence of irreducible regions. The AMDGPU backend provides an option to enable this pass before the structurizer, which may eventually be enabled by default. Reviewed By: nhaehnle Differential Revision: https://reviews.llvm.org/D77198	2020-04-15 11:29:19 +05:30
Teresa Johnson	33ffb62e23	Allow disabling of vectorization using internal options Summary: Currently, the internal options -vectorize-loops, -vectorize-slp, and -interleave-loops do not have much practical effect. This is because they are used to initialize the corresponding flags in the pass managers, and those flags are then unconditionally overwritten when compiling via clang or via LTO from the linkers. The only exception was -vectorize-loops via opt because of some special hackery there. While vectorization could still be disabled when compiling via clang, using -fno-[slp-]vectorize, this meant that there was no way to disable it when compiling in LTO mode via the linkers. This only affected ThinLTO, since for regular LTO vectorization is done during the compile step for scalability reasons. For ThinLTO it is invoked in the LTO backends. See also the discussion on PR45434. This patch makes it so the internal options can actually be used to disable these optimizations. Ultimately, the best long term solution is to mark the loops with metadata (similar to the approach used to fix -fno-unroll-loops in D77058), but this enables a shorter term workaround, and actually makes these internal options useful. I constant propagated the initial values of these internal flags into the pass manager flags (for some reasons vectorize-loops and interleave-loops were initialized to true, while vectorize-slp was initialized to false). As mentioned above, they are overwritten unconditionally so this doesn't have any real impact, and these initial values aren't particularly meaningful. I then changed the passes to check the internl values and return without performing the associated optimization when false (I changed the default of -vectorize-slp to true so the options behave similarly). I was able to remove the hackery in opt used to get -vectorize-loops=false to work, as well as a special option there used to disable SLP vectorization. 
Finally, I changed thinlto-slp-vectorize-pm.c to: a) Only test SLP (moved the loop vectorization checking to a new test). b) Use code that is slp vectorized when it is enabled, and check that instead of whether the pass is enabled. c) Test the new behavior of -vectorize-slp. d) Test both pass managers. The loop vectorization (and associated interleaving) testing I moved to a new thinlto-loop-vectorize-pm.c test, with several changes: a) Changed the flags on the interleaving testing so that it will actually interleave, and check that. b) Test the new behavior of -vectorize-loops and -interleave-loops. c) Test both pass managers. Reviewers: fhahn, wmi Subscribers: hiraditya, steven_wu, dexonsmith, cfe-commits, davezarzycki, llvm-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D77989	2020-04-14 18:09:10 -07:00
Sjoerd Meijer	3ef614a007	NFC: update of ARM llvm regr test, follow up of `9633fc14ae`.	2020-04-14 21:30:22 +01:00
Huihui Zhang	5c1d1a62e3	[InstCombine][SVE] Fix visitGetElementPtrInst for scalable type. Summary: This patch fixes the following issues in InstCombiner::visitGetElementPtrInst 1. Skip for scalable type if transformation requires fixed size number of vector element. 2. Skip for scalable type if transformation relies on compile-time known type alloc size. 3. Use VectorType::getElementCount when scalable property is used to construct new VectorType. 4. Use TypeSize::getKnownMinSize when minimal size of a scalable type is valid to determine GEP 'inbounds'. 5. Explicitly call TypeSize::getFixedSize to avoid implicit type conversion to uint64_t. Reviewers: sdesmalen, efriedma, spatel, ctetreau Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78081	2020-04-14 12:38:32 -07:00
Sanjay Patel	6a7e958a42	[InstCombine] try to reduce more shuffles with bitcasted operand This is the widen mask element sibling to D76844. shuf (bitcast X), undef, Mask --> bitcast X' http://volta.cs.utah.edu:8080/z/4dt3V8	2020-04-14 15:03:59 -04:00
Sanjay Patel	3c87fba27f	[InstCombine] add tests for bitcasted shuffle operand; NFC Similar to D76844, but this is casted from wider element type. See D77881.	2020-04-14 13:57:30 -04:00
Sanjay Patel	c72f49cc57	[InstSimplify] add test for select that should not be simplified; NFC See discussion in D77868	2020-04-14 13:57:30 -04:00
Sergey Dmitriev	c1a9dd9aea	[AbstractCallSite] Check that callback callee index is within call arguments Summary: AbstractCallSite::getCallbackUses() does not check that the callback callee index from the callback metadata does not exceed the total number of call arguments. This patch adds such a validation check. Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: hiraditya, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78112	2020-04-14 09:24:00 -07:00
Sjoerd Meijer	9633fc14ae	[LV][ARM] Add tail-folding tests for MVE. NFC. D77635 added support to recognise primary induction variables for counting-down loops. This allows us to fold the scalar tail loop into the main vector body, which we need for MVE tail-predication. This adds some ARM tail-folding test cases that we want to support. This test was extracted from D76838, which implemented a different approach to reverse and thus find a primary induction variable.	2020-04-14 16:03:29 +01:00
Max Kazantsev	f8a42bca28	[ADCE] Fix incorrect reporting of CFG changes This patch fixes 2 related bugs in ADCE: - `performDeadCodeElimination` does not report changes if it did ONLY CFG changes (affects both old and new pass managers); - When control flow removal is enabled, new pass manager does not drop CFG analyses. Both can lead to incorrect loop info after ADCE that does only CFG changes. Differential Revision: https://reviews.llvm.org/D78103 Reviewed By: Denis Antrushin	2020-04-14 20:26:13 +07:00
Max Kazantsev	2c4d914eeb	[Test] Add failing test that demonstrates buggy behavior of ADCE ADCE messes up with loop info (proved for new pass manager only) by making some loop blocks unreachable, without making proper updates to the loop.	2020-04-14 18:47:04 +07:00
Florian Hahn	38609fa9e4	Recommit "[SCCP] Use SimplifyBinOp for non-integer constant/expressions & overdef." This includes a fix reported with simplifications in the presence of NaN. This reverts the revert commit `06408451bf`.	2020-04-14 11:48:52 +01:00
Tyker	3bdfa966ec	[AssumeBundles] preserve knowledge in DCE Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77403	2020-04-14 12:48:15 +02:00
Tyker	086de7673e	[AssumeBundles] preserve knowledge in DSE Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77404	2020-04-14 12:48:15 +02:00
Tyker	de4dc275f5	[AssumeBundles] preserve information in NewGVN Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: Prazek, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77406	2020-04-14 12:48:14 +02:00
Tyker	c35194b800	[AssumeBundles] preserve information in LICM Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: hiraditya, asbirlea, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77407	2020-04-14 12:48:14 +02:00
Eli Friedman	89e0662dee	Make IRBuilder automatically set alignment on load/store/alloca. This is equivalent in terms of LLVM IR semantics, but we want to transition away from using MaybeAlign to represent the alignment of these instructions. Differential Revision: https://reviews.llvm.org/D77984	2020-04-13 13:43:14 -07:00
Vedant Kumar	4831f4b7bd	[InstCombine] Fix debug variance issue in tryToMoveFreeBeforeNullTest Fix an issue where the presence of debug info could disable an optimization in tryToMoveFreeBeforeNullTest.	2020-04-13 10:55:17 -07:00
Jon Roelofs	0b0bb1969f	[llvm] Fix yet more missing FileCheck colons	2020-04-13 10:49:19 -06:00
Benjamin Kramer	06408451bf	Revert "[SCCP] Use SimplifyBinOp for non-integer constant/expressions & overdef." This reverts commit `1a02aaeaa4`. Crashes on the following test case: $ cat crash.ll source_filename = "__compute_module" target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-grtev4-linux-gnu" @0 = private unnamed_addr constant [24 x i8] c"\00\00\C0\7F\00\00\C0\7F\09\85\08?\ED\C94\FE~\EB/\F3\90\CF\BA\C1" @1 = private unnamed_addr constant [24 x i8] c"\00\00\C0\7F\A3\A0\0FA\00\00\C0\7F\00\00\C0\7F\00\00\00\00\02\9AA\00" define void @IgammaSpecialValues.448() { entry: br label %fusion.26.loop_header.dim.0 fusion.26.loop_header.dim.0: ; preds = %fusion.26.loop_header.dim.0, %entry %fusion.26.invar_address.dim.0.0 = phi i64 [ 0, %entry ], [ %invar.inc17, %fusion.26.loop_header.dim.0 ] %0 = getelementptr inbounds [6 x float], [6 x float]* bitcast ([24 x i8]* @0 to [6 x float]), i64 0, i64 %fusion.26.invar_address.dim.0.0 %1 = load float, float %0 %2 = fmul float %1, 0.000000e+00 %3 = getelementptr inbounds [6 x float], [6 x float]* bitcast ([24 x i8]* @1 to [6 x float]), i64 0, i64 %fusion.26.invar_address.dim.0.0 %4 = load float, float %3 %5 = fneg float %4 %6 = fadd float %2, %5 %invar.inc17 = add nuw nsw i64 %fusion.26.invar_address.dim.0.0, 1 br label %fusion.26.loop_header.dim.0 } $ opt -ipsccp -S < crash.ll opt: llvm/include/llvm/Analysis/ValueLattice.h:251: bool llvm::ValueLatticeElement::markConstant(llvm::Constant *, bool): Assertion `getConstant() == V && "Marking constant with different value"' failed.	2020-04-13 11:23:26 +02:00
Eli Friedman	cfb844265a	[GlobalOpt] Explicitly set alignment of bool load/store operations.	2020-04-12 16:03:12 -07:00
Mehdi Amini	ed03d9485e	Revert "[TLI] Per-function fveclib for math library used for vectorization" This reverts commit `60c642e74b`. This patch is making the TLI "closed" for a predefined set of VecLib while at the moment it is extensible for anyone to customize when using LLVM as a library. Reverting while we figure out a way to re-land it without losing the generality of the current API. Differential Revision: https://reviews.llvm.org/D77925	2020-04-11 01:05:01 +00:00
Huihui Zhang	6e7eeb44b3	[GVN] Fix VNCoercion for Scalable Vector. Summary: For VNCoercion, skip scalable vectors when the analysis relies on a fixed size; otherwise call TypeSize::getFixedSize() explicitly. Add unit tests to check functionality of GVN load elimination for scalable type. Reviewers: sdesmalen, efriedma, spatel, fhahn, reames, apazos, ctetreau Reviewed By: efriedma Subscribers: bjope, hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76944	2020-04-10 17:49:07 -07:00
Eric Christopher	45dca04395	Exclude bitcast and ext/trunc signbit optimization on ppc_fp128 Revision `a1c05fe` <https://reviews.llvm.org/rGa1c05fe20f3def1f1be9f50d2adefc6b6f1578ad> removed bitcast from the list of problematic transformations, however: %97 = fptrunc ppc_fp128 %2 to double // we need to check ppc_fp128 here to prevent the transformation %98 = bitcast double %97 to i64 // `a1c05fe` checks ppc_fp128 at here %99 = icmp slt i64 %98, 0 %100 = zext i1 %99 to i8 store i8 %100, i8* %7, align 1 so this patch does that. I'm also disabling it in the presence of extend just in case. I verified separately that the hash of -std::infinity and std::infinity don't match now. Differential Revision: https://reviews.llvm.org/D77911	2020-04-10 17:07:55 -07:00
Sanjay Patel	73bebc9445	[InstSimplify] add tests for folding bool select to logic; NFC	2020-04-10 09:08:00 -04:00
Florian Hahn	1a02aaeaa4	[SCCP] Use SimplifyBinOp for non-integer constant/expressions & overdef. For non-integer constants/expressions and overdefined, I think we can just use SimplifyBinOp to do common folds. By just passing a context with the DL, SimplifyBinOp should not try to get additional information from looking at definitions. For overdefined values, it should be enough to just pass the original operand. Note: The comment before the `if (isconstant(V1State)...` was wrong originally: isConstant() also matches integer ranges with a single element. It is correct now. Reviewers: efriedma, davide, mssimpso, aartbik Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D76459	2020-04-10 11:02:57 +01:00
Max Kazantsev	4e87823026	[LoopLoadElim] Fix crash by always checking simplify form Loop simplify form should always be checked because logic of propagateStoredValueToLoadUsers relies on it (in particular, it requires preheader). Reviewed By: Fedor Sergeev, Florian Hahn Differential Revision: https://reviews.llvm.org/D77775	2020-04-10 09:23:28 +07:00
Wenlei He	60c642e74b	[TLI] Per-function fveclib for math library used for vectorization Summary: Encode the `-fveclib` setting as a per-function attribute so it can be threaded through to LTO backends. Accordingly, per-function TLI now reads the attributes and selects the available vector function list based on that. Now we also populate the function list for all supported vector libraries for the shared per-module `TargetLibraryInfoImpl`, so each function can select its available vector list independently but without duplicating the vector function lists. Inlining between incompatible veclib attributes is also prohibited now. Subscribers: hiraditya, dexonsmith, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D77632	2020-04-09 18:26:38 -07:00
Craig Topper	5625e6ab37	[X86] Improve min/max reduction costs. This is similar to what I recently did for getArithmeticReductionCost. I'm trying to account for the narrowing from 512->256->128 as we go. I've also added a new helper method getMinMaxCost that tries to handle the cases where we have native min/max instructions and fall back to cmp+select when we don't. Differential Revision: https://reviews.llvm.org/D76634	2020-04-09 17:28:50 -07:00
Zequan Wu	eccfa35d53	Fix lifetime call in landingpad blocking Simplifycfg pass Fix lifetime call in landingpad blocks simplifycfg from removing the landingpad. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D77188	2020-04-09 13:07:32 -07:00
Florian Hahn	0f7aedfd48	[SCCP] Add tests with AND/OR branch conditions.	2020-04-09 16:39:13 +01:00
Ayal Zaks	1678489234	[LV] FoldTail w/o Primary Induction Introduce a new VPWidenCanonicalIVRecipe to generate a canonical vector induction for use in fold-tail-with-masking, if a primary induction is absent. The canonical scalar IV having start = 0 and step = VF*UF, created during code-gen to control the vector loop, is widened into a canonical vector IV having start = {<Part*VF, Part*VF+1, ..., Part*VF+VF-1> for 0 <= Part < UF} and step = <VF*UF, VF*UF, ..., VF*UF>. Differential Revision: https://reviews.llvm.org/D77635	2020-04-09 17:45:23 +03:00
Sanjay Patel	5b5a74f7d1	[InstCombine] remove stale FIXME comment; NFC	2020-04-09 10:33:49 -04:00
Florian Hahn	db91a6b800	[SCCP] Add test case for binary ops with constant expressions.	2020-04-09 13:38:43 +01:00
Sanjay Patel	812970edda	[InstCombine] replace undef in vector constant for safe shift transform (PR45447) As noted in PR45447, we have a vector-constant-with-undef-element transform bug: https://bugs.llvm.org/show_bug.cgi?id=45447 We replace undefs with a safe constant (0 or -1) based on the (non-)negative predicate constraint. So this is correct: http://volta.cs.utah.edu:8080/z/WZE36H ...but this is not: http://volta.cs.utah.edu:8080/z/boj8gJ Previously, we were relying on getSafeVectorConstantForBinop() in the related fold (D76800). But that's making an assumption about what qualifies as "safe", and that assumption may not always hold. Differential Revision: https://reviews.llvm.org/D77739	2020-04-09 08:00:46 -04:00
Florian Hahn	0c22cb0fd7	Temporarily revert "[Attributor] Unify testing (=updates,...)" This patch reverts the 2 patches below, as on most systems the disabled tests actually pass and that causes most bots to be red, including http://green.lab.llvm.org/green/job/clang-stage1-RA/8541/ http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-expensive/15646/ http://lab.llvm.org:8011/builders/clang-ppc64le-linux-lnt/builds/23690 http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/16751 * [Attributor] Disable three tests until the SCC update bug was fixed commit `2ae1a76c27`. * [Attributor] Unify testing (=updates,prefixes,run configurations,...) `2bcf5793e1`.	2020-04-09 11:11:50 +01:00
Johannes Doerfert	2ae1a76c27	[Attributor] Disable three tests until the SCC update bug was fixed D76588 exposed an SCC update bug in three tests which manifests sometimes, e.g., on this bot that runs expensive checks: http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/23032/steps/test-check-all/logs/FAIL%3A%20LLVM%3A%3Afp80.ll We disable the tests temporarily to investigate.	2020-04-09 00:31:41 -05:00
Johannes Doerfert	2bcf5793e1	[Attributor] Unify testing (=updates,prefixes,run configurations,...) When the Attributor was created the test update scripts were not well suited to deal with the challenges of IR attribute checking. This partially improved. Since then we also added three additional configurations that need testing; in total we now have the following four: { TUNIT, CGSCC } x { old pass manager (OPM), new pass manager (NPM) } Finally, the number of developers and tests grew rapidly (partially due to the addition of ArgumentPromotion and IPConstantProp tests), which resulted in tests only being run in some configurations, different prefixes being used, and different "styles" of checks being used. Due to the above reasons I believed we needed to take another look at the test update scripts. While we started to use them, via UTC_ARGS: --enable/disable, the other problems remained. To improve the testing situation for all configurations, to simplify future updates to the test, and to help identify subtle effects of future changes, we now use the test update scripts for (almost) all Attributor tests. An exhaustive prefix list minimizes the number of check lines and makes it easy to identify and compare configurations. Tests have been adjusted in the process but we tried to keep their intent unchanged. Reviewed By: sstefan1 Differential Revision: https://reviews.llvm.org/D76588	2020-04-08 22:52:46 -05:00
Craig Topper	ca376782ff	[LoopVectorize] Move testing for SVML vectorization of exp2f_finite/exp2_finite from svml-calls.ll to svml-calls-finite.ll where the finite versions of log, pow, and exp already were.	2020-04-08 18:13:55 -07:00
Simon Pilgrim	94121c60d6	[InstCombine] Regenerate phi-preserve-ir-flags.ll test checks to fix issue reported on D77354	2020-04-08 18:13:24 +01:00
Florian Hahn	4184b2e034	[DSE,MSSA] Add additional test cases for multi-path elimination (NFC). This adds additional test cases for more scenarios and also with objects that are accessible after the functions return and allocas.	2020-04-08 15:26:26 +01:00
Simon Pilgrim	0c2ab63689	[CodeExtractor] Fix typo in check label to fix issue reported on D77354	2020-04-08 14:59:15 +01:00
Sanjay Patel	a1c05fe20f	[InstCombine] exclude bitcast of ppc_fp128 in icmp signbit fold Based on the post-commit comments for rG0f56bbc, there might be a problem with this transform: (bitcast (fpext/fptrunc X)) to iX) < 0 --> (bitcast X to iY) < 0 ...and the ppc_fp128 data type, so conservatively bypass if we are bitcasting a ppc_fp128. We might be able to account for endian or other differences to enable this for PowerPC again if that is useful. Differential Revision: https://reviews.llvm.org/D77642	2020-04-08 08:56:19 -04:00
Max Kazantsev	7adb9e06fd	[LoopLoadElim] Add test showing that LoopLoadElim doesn't work correctly with new PM	2020-04-08 17:32:03 +07:00
Sanjay Patel	e268ec8e0d	[InstCombine] add icmp+cast tests for ppc_fp128; NFC See post-commit comments for rG0f56bbc.	2020-04-07 07:35:01 -04:00
Florian Hahn	6aabb109be	[SCCP] Use ranges for predicate info conditions. This patch updates the code that deals with conditions from predicate info to make use of constant ranges. For ssa_copy instructions inserted by PredicateInfo, we have 2 ranges: 1. The range of the original value. 2. The range imposed by the linked condition. 1. is known, 2. can be determined using makeAllowedICmpRegion. The intersection of those ranges is the range for the copy. With this patch, we get a nice increase in the number of instructions eliminated by both SCCP and IPSCCP for some benchmarks: For MultiSource, SPEC2000 & SPEC2006: Tests: 237 Same hash: 170 (filtered out) Remaining: 67 Metric: sccp.NumInstRemoved Program base patch diff test-suite...Source/Benchmarks/sim/sim.test 10.00 71.00 610.0% test-suite...CFP2000/177.mesa/177.mesa.test 361.00 1626.00 350.4% test-suite...encode/alacconvert-encode.test 141.00 602.00 327.0% test-suite...decode/alacconvert-decode.test 141.00 602.00 327.0% test-suite...CI_Purple/SMG2000/smg2000.test 1639.00 4093.00 149.7% test-suite...peg2/mpeg2dec/mpeg2decode.test 75.00 163.00 117.3% test-suite...T2006/401.bzip2/401.bzip2.test 358.00 513.00 43.3% test-suite...rks/FreeBench/pifft/pifft.test 11.00 15.00 36.4% test-suite...langs-C/unix-tbl/unix-tbl.test 4.00 5.00 25.0% test-suite...lications/sqlite3/sqlite3.test 541.00 667.00 23.3% test-suite.../CINT2000/254.gap/254.gap.test 243.00 299.00 23.0% test-suite...ks/Prolangs-C/agrep/agrep.test 25.00 29.00 16.0% test-suite...marks/7zip/7zip-benchmark.test 1135.00 1304.00 14.9% test-suite...lications/ClamAV/clamscan.test 1105.00 1268.00 14.8% test-suite...urce/Applications/lua/lua.test 398.00 436.00 9.5% Metric: sccp.IPNumInstRemoved Program base patch diff test-suite...C/CFP2000/179.art/179.art.test 1.00 3.00 200.0% test-suite...006/447.dealII/447.dealII.test 429.00 1056.00 146.2% test-suite...nch/fourinarow/fourinarow.test 3.00 7.00 133.3% test-suite...CI_Purple/SMG2000/smg2000.test 818.00 1748.00 113.7% 
test-suite...ks/McCat/04-bisect/bisect.test 3.00 5.00 66.7% test-suite...CFP2000/177.mesa/177.mesa.test 165.00 255.00 54.5% test-suite...ediabench/gsm/toast/toast.test 18.00 27.00 50.0% test-suite...telecomm-gsm/telecomm-gsm.test 18.00 27.00 50.0% test-suite...ks/Prolangs-C/agrep/agrep.test 24.00 35.00 45.8% test-suite...TimberWolfMC/timberwolfmc.test 43.00 62.00 44.2% test-suite...encode/alacconvert-encode.test 46.00 66.00 43.5% test-suite...decode/alacconvert-decode.test 46.00 66.00 43.5% test-suite...langs-C/unix-tbl/unix-tbl.test 12.00 17.00 41.7% test-suite...peg2/mpeg2dec/mpeg2decode.test 31.00 41.00 32.3% test-suite.../CINT2000/254.gap/254.gap.test 117.00 154.00 31.6% Reviewers: efriedma, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D76611	2020-04-07 11:09:18 +01:00

15093 Commits