llvm-project

Commit Graph

Author	SHA1	Message	Date
Florian Hahn	935685f420	[SCCP] Re-use pushToWorkList in pushToWorkListMsg (NFC). There's no need to duplicate the logic to push to the different work-lists.	2020-05-04 10:19:39 +01:00
Simon Moll	1e89f36c98	[VE][NFC] formatting VEISD enum	2020-05-04 09:50:27 +02:00
Craig Topper	243ffc0e65	[X86] Simplify some code in combineTruncatedArithmetic. NFC We haven't promoted AND/OR/XOR to vXi64 types for a while. So there's no reason to use isOperationLegalOrPromote. So we can just use isOperationLegal by merging with ADD handling.	2020-05-03 23:53:10 -07:00
Craig Topper	8b53fdd3b6	[X86] Custom legalize v16i64->v16i8 truncate with avx512. Default legalization will create two v8i64 truncs to v8i32, concat them to v16i32, and then truncate the rest of the way to v16i8. Instead we can truncate directly from v8i64 to v8i8 in the lower half of an xmm. Then concat the two halves to use vpunpcklqdq. This is the same number of uops, but the dependency chain through the uops is better since the halves are merged at the end. I had to add SimplifyDemandedBits support for VTRUNC to prevent a regression on vector-trunc-math.ll. combineTruncatedArithmetic no longer gets a chance to shrink vXi64 mul so we were producing the v8i64 multiply sequence using multiple PMULUDQs. With the demanded bits fix we are able to prune out the extra ops leaving just two PMULUDQs, one for each v8i64 half. This is twice the width of the 2 v8i32 PMULLDs we had before, but PMULUDQ is 1 uop and PMULLD is 2. We also save some truncates. It's probably worth using PMULUDQ even when PMULLQ is available since the latter is 3 uops, but that will require a different change. Differential Revision: https://reviews.llvm.org/D79231	2020-05-03 23:26:04 -07:00
Johannes Doerfert	14cb0bdf2b	[Attributor][NFC] Replace the nested AAMap with a key pair No functional change is intended. --- Single run of the Attributor module and then CGSCC pass (oldPM) for SPASS/clause.c (~10k LLVM-IR loc): Before: ``` calls to allocation functions: 512375 (362871/s) temporary memory allocations: 98746 (69933/s) peak heap memory consumption: 22.54MB peak RSS (including heaptrack overhead): 106.78MB total memory leaked: 269.10KB ``` After: ``` calls to allocation functions: 509833 (338534/s) temporary memory allocations: 98902 (65671/s) peak heap memory consumption: 18.71MB peak RSS (including heaptrack overhead): 103.00MB total memory leaked: 269.10KB ``` Difference: ``` calls to allocation functions: -2542 (-27042/s) temporary memory allocations: 156 (1659/s) peak heap memory consumption: -3.83MB peak RSS (including heaptrack overhead): 0B total memory leaked: 0B ```	2020-05-03 22:10:47 -05:00
Johannes Doerfert	95e0d28b71	[Attributor] Remember only necessary dependences Before we eagerly put dependences into the QueryMap as soon as we encountered them (via `Attributor::getAAFor<>` or `Attributor::recordDependence`). Now we will wait to see if the dependence is useful, that is if the target is not already in a fixpoint state at the end of the update. If so, there is no need to record the dependence at all. Due to the abstraction via `Attributor::updateAA` we will now also treat the very first update (during attribute creation) as we do subsequent updates. Finally this resolves the problematic usage of QueriedNonFixAA. --- Single run of the Attributor module and then CGSCC pass (oldPM) for SPASS/clause.c (~10k LLVM-IR loc): Before: ``` calls to allocation functions: 554675 (389245/s) temporary memory allocations: 101574 (71280/s) peak heap memory consumption: 28.46MB peak RSS (including heaptrack overhead): 116.26MB total memory leaked: 269.10KB ``` After: ``` calls to allocation functions: 512465 (345559/s) temporary memory allocations: 98832 (66643/s) peak heap memory consumption: 22.54MB peak RSS (including heaptrack overhead): 106.58MB total memory leaked: 269.10KB ``` Difference: ``` calls to allocation functions: -42210 (-727758/s) temporary memory allocations: -2742 (-47275/s) peak heap memory consumption: -5.92MB peak RSS (including heaptrack overhead): 0B total memory leaked: 0B ```	2020-05-03 22:01:51 -05:00
Johannes Doerfert	231026a508	[Attributor] Initialize "value attributes" w/ must-be-executed-context info Attributes that only depend on the value (=bit pattern) can be initialized from uses in the must-be-executed-context (MBEC). We did use `AAComposeTwoGenericDeduction` and `AAFromMustBeExecutedContext` before to do this for some positions of these attributes but not for all. This was fairly complicated and also problematic as we did run it in every `updateImpl` call even though we only use known information. The new implementation removes `AAComposeTwoGenericDeduction`* and `AAFromMustBeExecutedContext` in favor of a simple interface `AddInformation::fromMBEContext(...)` which we call from the `initialize` methods of the "value attribute" `Impl` classes, e.g. `AANonNullImpl::initialize`. There can be two types of test changes: 1) Artifacts where we miss some information that was known before a global fixpoint was reached and therefore available in an update but not at the beginning. 2) Deduction for values we did not derive via the MBEC before or which were not found as the `AAFromMustBeExecutedContext::updateImpl` was never invoked. * An improved version of AAComposeTwoGenericDeduction can be found in D78718. Once we find a new use case that implementation will be able to handle "generic" AAs better. --- Single run of the Attributor module and then CGSCC pass (oldPM) for SPASS/clause.c (~10k LLVM-IR loc): Before: ``` calls to allocation functions: 468428 (328952/s) temporary memory allocations: 77480 (54410/s) peak heap memory consumption: 32.71MB peak RSS (including heaptrack overhead): 122.46MB total memory leaked: 269.10KB ``` After: ``` calls to allocation functions: 554720 (351310/s) temporary memory allocations: 101650 (64376/s) peak heap memory consumption: 28.46MB peak RSS (including heaptrack overhead): 116.75MB total memory leaked: 269.10KB ``` Difference: ``` calls to allocation functions: 86292 (556722/s) temporary memory allocations: 24170 (155935/s) peak heap memory consumption: -4.25MB peak RSS (including heaptrack overhead): 0B total memory leaked: 0B ``` Reviewed By: uenoku Differential Revision: https://reviews.llvm.org/D78719	2020-05-03 21:41:22 -05:00
Johannes Doerfert	87f1e93945	[Attributor][NFC] Use reference instead of pointer	2020-05-03 21:38:06 -05:00
Johannes Doerfert	2f97b8b891	[Attributor][NFC] Proactively ask for `nocapture` on call site arguments This minimizes test noise later on and is in line with other attributes we derive proactively.	2020-05-03 21:38:06 -05:00
Sergey Dmitriev	0f70f73308	[Attributor] Bitcast constant to the returned value type if it has different type Reviewers: jdoerfert, sstefan1, uenoku Reviewed By: jdoerfert Subscribers: hiraditya, uenoku, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79277	2020-05-03 11:46:13 -07:00
Nikita Popov	46ee652c70	Revert "[InstSimplify] Remove known bits constant folding" This reverts commit `08556afc54`. This breaks some AMDGPU tests.	2020-05-03 20:45:10 +02:00
Nikita Popov	08556afc54	[InstSimplify] Remove known bits constant folding If SimplifyInstruction() does not succeed in simplifying the instruction, it will compute the known bits of the instruction in the hope that all bits are known and the instruction can be folded to a constant. I have removed a similar optimization from InstCombine in D75801, and would like to drop this one as well. On average, we spend ~1% of total compile-time performing this known bits calculation. However, if we introduce some additional statistics for known bits computations and how many of them succeed in simplifying the instruction we get (on test-suite): instsimplify.NumKnownBits: 216 instsimplify.NumKnownBitsComputed: 13828375 valuetracking.NumKnownBitsComputed: 45860806 Out of ~14M known bits calculations (accounting for approximately one third of all known bits calculations), only 0.0015% succeed in producing a constant. Those cases where we do succeed to compute all known bits will get folded by other passes like InstCombine later. On test-suite, only lencod.test and GCC-C-execute-pr44858.test show a hash difference after this change. On lencod we see an improvement (a loop phi is optimized away), on the GCC torture test a regression (a function return value is determined only after IPSCCP, preventing propagation from a noinline function.) There are various regressions in InstSimplify tests. However, all of these cases are already handled by InstCombine, and corresponding tests have already been added there. Differential Revision: https://reviews.llvm.org/D79294	2020-05-03 20:26:58 +02:00
Hongtao Yu	911e06f5eb	[ICP] Handling must tail calls in indirect call promotion Per the IR convention, a musttail call must precede a ret with an optional bitcast. This was violated by the indirect call promotion optimization, which could result in IR like: ; <label>:2192: br i1 %2198, label %2199, label %2201, !dbg !226012, !prof !229483 ; <label>:2199: ; preds = %2192 musttail call fastcc void @foo(i8* %2195), !dbg !226012 br label %2202, !dbg !226012 ; <label>:2201: ; preds = %2192 musttail call fastcc void %2197(i8* %2195), !dbg !226012 br label %2202, !dbg !226012 ; <label>:2202: ; preds = %605, %2201, %2199 ret void, !dbg !229485 This is being fixed in this change where the return statement goes together with the promoted indirect call. The code generated is like: ; <label>:2192: br i1 %2198, label %2199, label %2201, !dbg !226012, !prof !229483 ; <label>:2199: ; preds = %2192 musttail call fastcc void @foo(i8* %2195), !dbg !226012 ret void, !dbg !229485 ; <label>:2201: ; preds = %2192 musttail call fastcc void %2197(i8* %2195), !dbg !226012 ret void, !dbg !229485 Differential Revision: https://reviews.llvm.org/D79258	2020-05-03 10:42:22 -07:00
Mircea Trofin	bec4ab95a4	[llvm][NFC] Inliner: factor cost and reporting out of inlining process Summary: This factors cost and reporting out of the inlining workflow, thus making it easier to reuse when driving inlining from the upcoming InliningAdvisor. Depends on: D79215 Reviewers: davidxl, echristo Subscribers: eraman, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79275	2020-05-03 10:38:28 -07:00
Florian Hahn	bbdfcf8f69	[VPlan] Remove unused & undefined print method (NFC).	2020-05-03 18:36:20 +01:00
Johannes Doerfert	8228153f87	[Attributor][NFC] Encode IRPositions in the bits of a single pointer This reduces memory consumption for IRPositions by eliminating the vtable pointer and the `KindOrArgNo` integer. Since each abstract attribute has an associated IRPosition, the 12-16 bytes we save add up quickly. No functional change is intended. --- Single run of the Attributor module and then CGSCC pass (oldPM) for SPASS/clause.c (~10k LLVM-IR loc): Before: ``` calls to allocation functions: 469545 (260135/s) temporary memory allocations: 77137 (42735/s) peak heap memory consumption: 30.50MB peak RSS (including heaptrack overhead): 119.50MB total memory leaked: 269.07KB ``` After: ``` calls to allocation functions: 468999 (274108/s) temporary memory allocations: 77002 (45004/s) peak heap memory consumption: 28.83MB peak RSS (including heaptrack overhead): 118.05MB total memory leaked: 269.07KB ``` Difference: ``` calls to allocation functions: -546 (5808/s) temporary memory allocations: -135 (1436/s) peak heap memory consumption: -1.67MB peak RSS (including heaptrack overhead): 0B total memory leaked: 0B ``` --- CTMark 15 runs Metric: compile_time Program lhs rhs diff test-suite...:: CTMark/sqlite3/sqlite3.test 25.07 24.09 -3.9% test-suite...Mark/mafft/pairlocalalign.test 14.58 14.14 -3.0% test-suite...-typeset/consumer-typeset.test 21.78 21.58 -0.9% test-suite :: CTMark/SPASS/SPASS.test 21.95 22.03 0.4% test-suite :: CTMark/lencod/lencod.test 25.43 25.50 0.3% test-suite...ark/tramp3d-v4/tramp3d-v4.test 23.88 23.83 -0.2% test-suite...TMark/7zip/7zip-benchmark.test 60.24 60.11 -0.2% test-suite :: CTMark/kimwitu++/kc.test 15.69 15.69 -0.0% test-suite...:: CTMark/ClamAV/clamscan.test 25.43 25.42 -0.0% test-suite :: CTMark/Bullet/bullet.test 37.63 37.62 -0.0% Geomean difference -0.8% --- Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D78722	2020-05-03 12:15:19 -05:00
Johannes Doerfert	6bf16ee4c5	[Attributor][NFC] Let AbstractAttribute be an IRPosition Since every AbstractAttribute so far, and for the foreseeable future, corresponds to a single IRPosition we can simplify the class structure. We already did this for IRAttribute but there is no reason to stop there.	2020-05-03 12:13:40 -05:00
Nico Weber	fb5fd74685	Revert "Optimize path::remove_dots" This reverts commit `53913a65b4`. Breaks VFSFromYAMLTest.DirectoryIterationSameDirMultipleEntries in SupportTests on non-Windows.	2020-05-03 12:46:46 -04:00
Mircea Trofin	667f558c3f	[llvm][NFC] Inliner.cpp shouldInline post-commit feedback Discussion is in https://reviews.llvm.org/D79215	2020-05-03 09:31:31 -07:00
Reid Kleckner	53913a65b4	Optimize path::remove_dots LLD calls this on every source file string in every object file when writing PDBs, so it is somewhat hot. Avoid rewriting paths that do not contain path traversal components (./..). Use find_first_not_of(separators) directly instead of using the path iterators. The path component iterators appear to be slow, and directly searching for slashes makes it easier to find double separators that need to be canonicalized. I discovered that the VFS relies on remove_dots to not canonicalize early slashes (/foo or C:/foo) on Windows, so I had to leave that behavior behind with unit tests for it. This is undesirable, but I claim that my change is NFC.	2020-05-03 07:58:05 -07:00
Sanjay Patel	682f0b366b	[InstCombine] use select-of-constants with set/clear bit mask patterns Cond ? (X & ~C) : (X | C) --> (X & ~C) | (Cond ? 0 : C) Cond ? (X | C) : (X & ~C) --> (X & ~C) | (Cond ? C : 0) The select-of-constants form results in better codegen. There's an existing test diff that shows a transform that results in an extra IR instruction, but that's an existing problem. This is motivated by code seen in LLVM itself - see PR37581: https://bugs.llvm.org/show_bug.cgi?id=37581 define i8 @src(i8 %x, i8 %C, i1 %b) { %notC = xor i8 %C, -1 %and = and i8 %x, %notC %or = or i8 %x, %C %cond = select i1 %b, i8 %or, i8 %and ret i8 %cond } define i8 @tgt(i8 %x, i8 %C, i1 %b) { %notC = xor i8 %C, -1 %and = and i8 %x, %notC %mul = select i1 %b, i8 %C, i8 0 %or = or i8 %mul, %and ret i8 %or } http://volta.cs.utah.edu:8080/z/Vt2WVm Differential Revision: https://reviews.llvm.org/D78880	2020-05-03 09:44:43 -04:00
Benjamin Kramer	7a529ad2c1	[Support] Don't initialize buffer allocated by zlib::uncompress This is a somewhat annoying API, but not without precedent in this low level API.	2020-05-03 15:01:52 +02:00
Simon Pilgrim	7c203163c7	[X86] Use splitVector helper in truncateVectorWithPACK/splitVectorStore/combineHorizontalMinMaxResult/combineReductionToHorizontal. NFC. All these locations were performing the same type splitting/extractSubVector calls as the splitVector helper.	2020-05-03 13:40:38 +01:00
Simon Pilgrim	e8d9794a23	[X86] Don't limit splitVector helper to simple types. It can handle EVT just as well (and so can the extractSubVector calls).	2020-05-03 12:27:37 +01:00
Alexey Lapshin	4f576ea731	[Debuginfo][NFC] Avoid double calling of DWARFDie::find(DW_AT_name). Summary: Current implementation of DWARFDie::getName(DINameKind Kind) could lead to a double call to DWARFDie::find(DW_AT_name) in the following scenario: getName(LinkageName); getName(ShortName); getName(LinkageName) calls find(DW_AT_name) if linkage name is not found. Then, it is called again in getName(ShortName). This patch allows requesting LinkageName and ShortName separately to avoid the extra call to find(DW_AT_name). It helps D74169 to parse clang debuginfo faster (~1%). Reviewers: clayborg, dblaikie Differential Revision: https://reviews.llvm.org/D79173	2020-05-03 14:00:25 +03:00
Simon Pilgrim	74e9952c8e	[X86][SSE] splitAndLowerShuffle - use splitVector helper. NFC. The splitVector helper uses extractSubVector which splits build vectors like we do here, so avoid reimplementing it. splitVector could easily be extended to peek through bitcasts as well but I'd prefer to keep this commit NFC.	2020-05-03 11:26:51 +01:00
Simon Pilgrim	4d2b0ebd17	[X86] detectAVGPattern - use matchUnaryPredicate helper. NFC. Use the ISD::matchUnaryPredicate helper to check for inrange constants.	2020-05-03 11:26:51 +01:00
Ten Tzen	21c1a0c730	Test Commit: add two head comments in WinEHPrepare.cpp This is a Test commit.	2020-05-03 01:15:59 -07:00
Reid Kleckner	5070cecd72	[PDB] Bypass generic deserialization code for publics sorting The number of public symbols is very large, and each deserialization does a few heap allocations. The public symbols are serialized by the linker, so we can assume they have the expected layout and use it directly. Saves O(#publics) temporary heap allocations and shrinks some data structures.	2020-05-02 18:14:50 -07:00
Craig Topper	7867f4c15f	[PDB] Remove a couple asserts that are no longer valid now that C13Builders does not use unique_ptr. These asserts used to check that unique_ptr was not null. This fixes failures from `7af4bb1641`	2020-05-02 17:31:10 -07:00
Reid Kleckner	7af4bb1641	[PDB] Remove unique_ptr wrapper around C13 line table subsections This accounts for a large portion of the memory allocations in LLD. This DebugSubsectionRecordBuilder object can be stored directly in C13Builders, it mostly wraps other subsections. Remove the container kind field from the object. It is always the same for all elements in the vector, and we can pass it in during writing.	2020-05-02 16:35:07 -07:00
LemonBoy	6d103ca855	[SelectionDAG] Unify scalarizeVectorLoad and VectorLegalizer::ExpandLoad The two code paths have the same goal, legalizing a load of a non-byte-sized vector by loading the "flattened" representation in memory, slicing off each single element and then building a vector out of those pieces. The technique employed by `ExpandLoad` is slightly more convoluted and produces slightly better codegen on ARM, AMDGPU and x86 but suffers from some bugs (D78480) and is wrong for BE machines. Differential Revision: https://reviews.llvm.org/D79096	2020-05-02 15:18:10 -07:00
Simon Pilgrim	a09a3c6d3e	Revert rG8e05ac0a510c - "[DAGCombine] visitTRUNCATE - remove GetDemandedBits call" Causing buildbot failures	2020-05-02 20:08:33 +01:00
Simon Pilgrim	8e05ac0a51	[DAGCombine] visitTRUNCATE - remove GetDemandedBits call rL368553 added SimplifyMultipleUseDemandedBits handling for ISD::TRUNCATE to SimplifyDemandedBits so we don't need to duplicate this (and it gets rid of another GetDemandedBits call which is slowly being replaced with SimplifyMultipleUseDemandedBits anyhow).	2020-05-02 19:52:17 +01:00
Benjamin Kramer	97f92261df	[MBP] tuple->pair. NFC. std::pair has a trivial copy ctor, std::tuple doesn't.	2020-05-02 20:23:34 +02:00
Reid Kleckner	270d3faf6e	[COFF] Add and use a zero-copy tokenizer for .drectve This generalizes the main Windows command line tokenizer to be able to produce StringRef substrings as well as freshly copied C strings. The implementation is still shared with the normal tokenizer, which is important, because we have unit tests for that. .drectve sections can be very long. They can potentially list up to every symbol in the object file by name. It is worth avoiding these string copies. This saves a lot of memory when linking chrome.dll with PGO instrumentation: BEFORE AFTER % IMP peak memory: 6657.76MB 4983.54MB -25% real: 4m30.875s 2m26.250s -46% The time improvement may not be real, my machine was noisy while running this, but the peak memory usage improvement should be real. This change may also help apps that heavily use dllexport annotations, because those also use linker directives in object files. Apps that do not use many directives are unlikely to be affected. Reviewed By: thakis Differential Revision: https://reviews.llvm.org/D79262	2020-05-02 10:47:02 -07:00
Sam Elliott	fe4245a4c1	[RISCV] Implement convertSelectOfConstantsToMath Summary: The current lowering of `select` on RISC-V uses a branch instruction to load a register with one or other value. This is inefficient, especially in the case of small constants that can be computed easily. By implementing the TargetLowering::convertSelectOfConstantsToMath hook, some of the simpler cases are covered that let us avoid introducing a branch in these cases. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D79260	2020-05-02 15:05:57 +01:00
Sam Elliott	a4a9a1f671	[RISCV] Add patterns for checking isnan Summary: This patch addresses some weird assembly sequences we were seeing during comparing floats. In particular, comparing a float to itself tells you whether it is NaN or not, which we were doing correctly, but with an extra unneeded `and` instruction. This patch specialises the existing patterns to remove the `and` instructions when both their operands are the same. Reviewed By: luismarques, asb Differential Revision: https://reviews.llvm.org/D78908	2020-05-02 15:01:04 +01:00
Sam McCall	d10c995b4d	std::isspace -> llvm::isSpace (where locale should be ignored) I've left out some cases where I wasn't totally sure this was right or whether the include was ok (compiler-rt) or idiomatic (flang).	2020-05-02 15:36:04 +02:00
Nikita Popov	8148b11647	[ValueTracking] Short-circuit GEP known bits calculation (NFC) Don't compute known bits of all GEP operands, if we already know that we don't know anything.	2020-05-02 12:29:26 +02:00
Nikita Popov	b7e2358220	Remove getNumUses() comparisons (NFC) getNumUses() scans the full use list. Don't use it if we only want to check if there's zero or one uses.	2020-05-02 11:05:19 +02:00
Nikita Popov	60e9ee16b4	[MergeFuncs] Don't merge shufflevectors with different masks When the shufflevector mask operand was converted into special instruction data, the FunctionComparator was not updated to account for this. As such, MergeFuncs will happily merge shufflevectors with different masks. This fixes https://bugs.llvm.org/show_bug.cgi?id=45773. Differential Revision: https://reviews.llvm.org/D79261	2020-05-02 10:21:14 +02:00
Xing GUO	ff6a0b6a8e	[Object] Change ObjectFile::getSymbolValue() return type to Expected<uint64_t> Summary: In D77860, we have changed `getSymbolFlags()` return type to `Expected<uint32_t>`. This change helps bubble the error further up the stack. Reviewers: jhenderson, grimar, JDevlieghere, MaskRay Reviewed By: jhenderson Subscribers: hiraditya, MaskRay, rupprecht, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79075	2020-05-02 14:04:44 +08:00
Thomas Lively	e0f52842c8	[WebAssembly] Renumber SIMD opcodes Summary: As described in https://github.com/WebAssembly/simd/pull/209. This is the final reorganization of the SIMD opcode space before standardization. It has been landed in concert with corresponding changes in other projects in the WebAssembly SIMD ecosystem. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79224	2020-05-01 17:20:49 -07:00
Nemanja Ivanovic	8ca2fc9993	[PowerPC] Refactor PPCInstrVSX.td Over time, we have made many additions to this file and it has frankly become a bit of a mess. This has led to at least one issue - we have a number of instructions where the side effects flag should be set to false and we neglected to do this. This patch suggests a refactoring that should make the file much more maintainable. The file is split up into major sections and the nesting level is reduced, predicate blocks merged, etc. Sections: - Custom PPCISD node definitions - Predicate definitions - Instruction formats - Instruction definitions - Helper DAG definitions - Anonymous patterns - Instruction aliases Differential revision: https://reviews.llvm.org/D78132	2020-05-01 19:17:39 -05:00
Mircea Trofin	3dbc612cf2	[llvm][NFC] Rename variable as per https://reviews.llvm.org/D79215 Operator error - performed the rename and didn't save.	2020-05-01 16:30:41 -07:00
Mircea Trofin	e1c4a7cb16	[llvm][NFC] Inliner: simplify inlining decision logic Summary: shouldInline makes a decision based on the InlineCost of a call site, as well as an evaluation on whether the site should be deferred. This means it's possible for the decision to be not to inline, even for an InlineCost that would otherwise allow it. Both uses of shouldInline performed the exact same logic after calling it. In addition, the decision on whether to inline or not was communicated through two values of the Option<InlineCost> return value: None, or an InlineCost evaluating to false. Simplified by: - encapsulating the decision in the return object. The bool it evaluates to communicates unambiguously the decision. The InlineCost is also available. - encapsulated the common post-shouldInline code into shouldInline. Reviewers: davidxl, echristo, eraman Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79215	2020-05-01 16:18:59 -07:00
Vedant Kumar	9350792c62	[DebugInfo] Update loop metadata in stripNonLineTableDebugInfo Summary: Have stripNonLineTableDebugInfo() attach updated !llvm.loop metadata to an instruction (instead of updating and then discarding the metadata). This fixes "!dbg attachment points at wrong subprogram for function" errors seen while archiving an iOS app. It would be nice -- as a follow-up -- to catch this issue earlier, perhaps by modifying the verifier to constrain where DILocations are allowed. Any alternative suggestions appreciated. rdar://61982466 Reviewers: aprantl, dsanders Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79200	2020-05-01 11:36:05 -07:00
Craig Topper	b938168aef	[X86] Lower the cost of v4i64->v4i32 truncate with avx512. We use the vpmovqd instruction which is a single uop. So the cost should be 1.	2020-05-01 11:09:37 -07:00
Christopher Tetreault	beeabe382d	[SVE] Fix invalid usage of VectorType::getNumElements() in InstCombine Summary: Make foldVectorBinop return null if the instruction type is a scalable vector. It is unclear what, if any, of this function works with scalable vectors. Identified by test LLVM.Transforms/InstCombine::nsw.ll Reviewers: efriedma, david-arm, fpetrogalli, spatel Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79196	2020-05-01 10:56:29 -07:00

134040 Commits