llvm-project

Commit Graph

Author	SHA1	Message	Date
Bjorn Pettersson	985b9b7e42	[PM] Avoid duplicates in the Used/Preserved/Required sets The pass analysis uses "sets" implemented using a SmallVector type to keep track of Used, Preserved, Required and RequiredTransitive passes. When having nested analyses we could end up with duplicates in those sets, as there was no checks to see if a pass already existed in the "set" before pushing to the vectors. This idea with this patch is to avoid such duplicates by avoiding pushing elements that already is contained when adding elements to those sets. To align with the above PMDataManager::collectRequiredAndUsedAnalyses is changed to skip adding both the Required and RequiredTransitive passes to its result vectors (since RequiredTransitive always is a subset of Required we ended up with duplicates when traversing both sets). Main goal with this is to avoid spending time verifying the same analysis mulitple times in PMDataManager::verifyPreservedAnalysis when iterating over the Preserved "set". It is assumed that removing duplicates from a "set" shouldn't have any other negative impact (I have not seen any problems so far). If this ends up causing problems one could do some uniqueness filtering of the vector being traversed in verifyPreservedAnalysis instead. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D94416	2021-01-20 13:55:18 +01:00
Mark Murray	cab20f6105	[AArch64] Add missing "flagm" feature to the .arch_extension directive. Depends on D94970 Differential Revision: https://reviews.llvm.org/D94971	2021-01-20 11:57:39 +00:00
Mark Murray	f344c028de	[AArch64] Add missing "pauth" feature to the .arch_extension directive. Differential Revision: https://reviews.llvm.org/D94970	2021-01-20 11:57:39 +00:00
Chuanqi Xu	c1bc7981ba	[Coroutine] Remain alignment information when merging frame variables Summary: This is to address bug48712. The solution in this patch is that when we want to merge two variable a into the storage frame of variable b only if the alignment of a is multiple of b. There may be other strategies. But now I think they are hard to handle and benefit little. Or we can implement them in the future. Test-plan: check-llvm Reviewers: jmorse, lxfind, junparser Differential Revision: https://reviews.llvm.org/D94891	2021-01-20 18:59:00 +08:00
Mirko Brkusanin	a6a72dfdf2	[AMDGPU][GlobalISel] Avoid selecting S_PACK with constants If constants are hidden behind G_ANYEXT we can treat them same way as G_SEXT. For that purpose we extend getConstantVRegValWithLookThrough with option to handle G_ANYEXT same way as G_SEXT. Differential Revision: https://reviews.llvm.org/D92219	2021-01-20 11:54:53 +01:00
Petar Avramovic	4ab704d628	[AMDGPU][MC] Add tfe disassembler support MIMG opcodes With tfe on there can be a vgpr write to vdata+1. Add tablegen support for 5 register vdata store. This is required for 4 register vdata store with tfe. Differential Revision: https://reviews.llvm.org/D94960	2021-01-20 10:37:09 +01:00
Gabriel Hjort Åkerlund	2aeaaf841b	[GlobalISel] Add missing operand update when copy is required When constraining an operand register using constrainOperandRegClass(), the function may emit a COPY in case the provided register class does not match the current operand register class. However, the operand itself is not updated to make use of the COPY, thereby resulting in incorrect code. This patch fixes that bug by updating the machine operand accordingly. Reviewed By: dsanders Differential Revision: https://reviews.llvm.org/D91244	2021-01-20 10:32:52 +01:00
David Sherwood	255a507716	[NFC][InstructionCost] Use InstructionCost in lib/Transforms/IPO/IROutliner.cpp In places where we call a TTI.getXXCost() function I have changed the code to use InstructionCost instead of unsigned. This is in preparation for later on when we will change the TTI interfaces to return InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Differential Revision: https://reviews.llvm.org/D94427	2021-01-20 08:33:59 +00:00
Bill Wendling	e22295385c	[X86] Add segment and address-size override prefixes X86 allows for the "addr32" and "addr16" address size override prefixes. Also, these and the segment override prefixes should be recognized as valid prefixes. Differential Revision: https://reviews.llvm.org/D94726	2021-01-19 23:54:31 -08:00
Hsiangkai Wang	8ca4b174d7	[RISCV] Implement vlseg intrinsics. For Zvlsseg, we need continuous vector registers for the values. We need to define new register classes for the different combinations of (number of fields and LMUL). For example, when the number of fields(NF) = 3, LMUL = 2, the values will be assigned to (V0M2, V2M2, V4M2), (V2M2, V4M2, V6M2), (V4M2, V6M2, V8M2), ... We define the vlseg intrinsics with multiple outputs. There is no way to describe the codegen patterns with multiple outputs in the tablegen files. We do the codegen in RISCVISelDAGToDAG and use EXTRACT_SUBREG to extract the values of output. The multiple scalable vector values will be put into a struct. This patch is depended on the support for scalable vector struct. Differential Revision: https://reviews.llvm.org/D94229	2021-01-20 14:26:04 +08:00
Kazu Hirata	b023cdeacc	[llvm] Use llvm::all_of (NFC)	2021-01-19 20:19:17 -08:00
Kazu Hirata	978c754076	[llvm] Use llvm::any_of (NFC)	2021-01-19 20:19:16 -08:00
Kazu Hirata	8857202489	[llvm] Use llvm::find (NFC)	2021-01-19 20:19:14 -08:00
ShihPo Hung	4dae2247fd	[RISCV] refactor VPatBinary (NFC) Make it easier to reuse for intrinsic vrgatherei16 which needs to encode both LMUL & EMUL in the instruction name, like PseudoVRGATHEREI16_VV_M1_M1 and PseudoVRGATHEREI16_VV_M1_M2. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D94951	2021-01-19 19:09:56 -08:00
Juneyoung Lee	4479c0c2c0	Allow nonnull/align attribute to accept poison Currently LLVM is relying on ValueTracking's `isKnownNonZero` to attach `nonnull`, which can return true when the value is poison. To make the semantics of `nonnull` consistent with the behavior of `isKnownNonZero`, this makes the semantics of `nonnull` to accept poison, and return poison if the input pointer isn't null. This makes many transformations like below legal: ``` %p = gep inbounds %x, 1 ; % p is non-null pointer or poison call void @f(%p) ; instcombine converts this to call void @f(nonnull %p) ``` Instead, this semantics makes propagation of `nonnull` to caller illegal. The reason is that, passing poison to `nonnull` does not immediately raise UB anymore, so such program is still well defined, if the callee does not use the argument. Having `noundef` attribute there re-allows this. ``` define void @f(i8* %p) { ; functionattr cannot mark %p nonnull here anymore call void @g(i8* nonnull %p) ; .. because @g never raises UB if it never uses %p. ret void } ``` Another attribute that needs to be updated is `align`. This patch updates the semantics of align to accept poison as well. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D90529	2021-01-20 11:31:23 +09:00
Ian Levesque	68a1f09107	[xray] Honor xray-never function-instrument attribute function-instrument=xray-never wasn't actually honored before. We were getting lucky that it worked because CodeGenFunction would omit the other xray attributes when a function was annotated with xray_never_instrument. This patch adds proper support. Differential Revision: https://reviews.llvm.org/D89441	2021-01-19 18:47:09 -05:00
Wei Mi	21b1ad0340	[SampleFDO] Add the support to split the function profiles with context into separate sections. For ThinLTO, all the function profiles without context has been annotated to outline functions if possible in prelink phase. In postlink phase, profile annotation in postlink phase is only meaningful for function profile with context. If the profile is large, it is better to split the profile into two parts, one with context and one without, so the profile reading in postlink phase only has to read the part with context. To have the profile splitting, we extend the ExtBinary format to support different section arrangement. It will be flexible to add other section layout in the future without the need to create new class inheriting from ExtBinary class. Differential Revision: https://reviews.llvm.org/D94435	2021-01-19 15:16:19 -08:00
Sam Clegg	96ef4f307d	Revert "[WebAssembly] call_indirect issues table number relocs" This reverts commit `418df4a6ab`. This change broke emscripten tests, I believe because it started generating 5-byte a wide table index in the call_indirect instruction. Neither v8 nor wabt seem to be able to handle that. The spec currently says that this is single 0x0 byte and: "In future versions of WebAssembly, the zero byte occurring in the encoding of the call_indirectcall_indirect instruction may be used to index additional tables." So we need to revisit this change. For backwards compat I guess we need to guarantee that __indirect_function_table is always at address zero. We could also consider making this a single-byte relocation with and assert if have more than 127 tables (for now). Differential Revision: https://reviews.llvm.org/D95005	2021-01-19 15:06:07 -08:00
Craig Topper	e75a4b6ea9	[RISCV] Remove NotHasStdExtZbb predicate from zext.h/sext.b/sext.h InstAliases. NFC NotHasStdExtZbb doesn't have an AssemblerPredicate associated with it so it didn't do anything. We don't need it either because the sorting rules in tablegen prioritize by number of predicates. So the dedicated instructions in the B extension that have predicates will be prioritized automatically.	2021-01-19 14:31:48 -08:00
Alexey Bataev	e463bd53c0	Revert "[SLP]Merge reorder and reuse shuffles." This reverts commit `438682de6a` to fix the bug with the reducing size of the resulting vector for the entry node with multiple users.	2021-01-19 11:48:04 -08:00
Mitch Phillips	5b7aef6eb4	Revert "[PDB] Defer relocating .debug$S until commit time and parallelize it" This reverts commit `6529d7c5a4`. Reason: Broke the ASan buildbots. http://lab.llvm.org:8011/#/builders/99/builds/1567	2021-01-19 11:45:48 -08:00
Jonas Devlieghere	a4b42c621b	[llvm] Protect signpost map with a mutex Use a mutex to protect concurrent access to the signpost map. This fixes nondeterministic crashes in LLDB that appeared after using signposts in the timer implementation. Differential revision: https://reviews.llvm.org/D94285	2021-01-19 11:41:54 -08:00
Mariya Podchishchaeva	7113de301a	[ScalarizeMaskedMemIntrin] Add missing dependency The pass has dependency on 'TargetTransformInfoWrapperPass', but the corresponding call to INITIALIZE_PASS_DEPENDENCY was missing. Differential Revision: https://reviews.llvm.org/D94916	2021-01-19 22:33:47 +03:00
Nikita Popov	21443381c0	Reapply [InstCombine] Replace one-use select operand based on condition Relative to the original change, this adds a check that the instruction on which we're replacing operands is safe to speculatively execute, because that's what we're effectively doing. We're executing the instruction with the replaced operand, which is fine if it's pure, but not fine if can cause side-effects or UB (aka is not speculatable). Additionally, we cannot (generally) replace operands in phi nodes, as these may refer to a different loop iteration. This is also covered by the speculation check. ----- InstCombine already performs a fold where X == Y ? f(X) : Z is transformed to X == Y ? f(Y) : Z if f(Y) simplifies. However, if f(X) only has one use, then we can always directly replace the use inside the instruction. To actually be profitable, limit it to the case where Y is a non-expr constant. This could be further extended to replace uses further up a one-use instruction chain, but for now this only looks one level up. Among other things, this also subsumes D94860. Differential Revision: https://reviews.llvm.org/D94862	2021-01-19 20:26:38 +01:00
Craig Topper	ce8b3937dd	[RISCV] Add DAG combine to turn (setcc X, 1, setne) -> (setcc X, 0, seteq) if we can prove X is 0/1. If we are able to compare with 0 instead of 1, we might be able to fold the setcc into a beqz/bnez. Often these setccs start life as an xor that gets converted to a setcc by DAG combiner's rebuildSetcc. I looked into a detecting (xor X, 1) and converting to (seteq X, 0) based on boolean contents being 0/1 in rebuildSetcc instead of using computeKnownBits. It was very perturbing to AMDGPU tests which I didn't look closely at. It had a few changes on a couple other targets, but didn't seem to be much if any improvement. Reviewed By: lenary Differential Revision: https://reviews.llvm.org/D94730	2021-01-19 11:21:48 -08:00
Jeroen Dobbelaere	121cac01e8	[noalias.decl] Look through llvm.experimental.noalias.scope.decl Just like llvm.assume, there are a lot of cases where we can just ignore llvm.experimental.noalias.scope.decl. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D93042	2021-01-19 20:09:42 +01:00
Brendon Cahoon	57443bfb4a	[Hexagon] Fix segment start to adjust for gaps between segments The Hexagon Vector Combine pass genertes stores for a complete aligned vector. The start of each section is a multiple of the vector size, so that value is passed to normalize to compute the offset of the stores in the section. The first store may not occur at offset 0 when there is a gap between sections.	2021-01-19 12:49:39 -06:00
Jay Foad	18cb7441b6	[AMDGPU] Simpler names for arch-specific ttmp registers. NFC. Rename the _gfx9_gfx10 ttmp registers to _gfx9plus for simplicity, and use the corresponding isGFX9Plus predicate to decide when to use them instead of the old *_vi versions. Differential Revision: https://reviews.llvm.org/D94975	2021-01-19 18:47:14 +00:00
Jessica Paquette	cbf5246359	Fix buildbot after `cfc6073017` Windows buildbots were not happy with using find_if + instructionsWithoutDebug. In `cfc6073017`, instructionsWithoutDebug is not technically necessary. So, just iterate over the block directly. http://lab.llvm.org:8011/#/builders/127/builds/4732/steps/7/logs/stdio	2021-01-19 10:38:04 -08:00
Jessica Paquette	cfc6073017	[GlobalISel] Combine (a[0]) \| (a[1] << k1) \| ...\| (a[m] << kn) into a wide load This is a restricted version of the combine in `DAGCombiner::MatchLoadCombine`. (See D27861) This tries to recognize patterns like below (assuming a little-endian target): ``` s8* x = ... s32 val = a[0] \| (a[1] << 8) \| (a[2] << 16) \| (a[3] << 24) -> s32 val = ((i32)a) s8 x = ... s32 val = a[3] \| (a[2] << 8) \| (a[1] << 16) \| (a[0] << 24) -> s32 val = BSWAP(*((s32)a)) ``` (This patch also handles the big-endian target case as well, in which the first example above has a BSWAP, and the second example above does not.) To recognize the pattern, this searches from the last G_OR in the expression tree. E.g. ``` Reg Reg \ / OR_1 Reg \ / OR_2 \ Reg .. / Root ``` Each non-OR register in the tree is put in a list. Each register in the list is then checked to see if it's an appropriate load + shift logic. If every register is a load + potentially a shift, the combine checks if those loads + shifts, when OR'd together, are equivalent to a wide load (possibly with a BSWAP.) To simplify things, this patch (1) Only handles G_ZEXTLOADs (which appear to be the common case) (2) Only works in a single MachineBasicBlock (3) Only handles G_SHL as the bit twiddling to stick the small load into a specific location An IR example of this is here: https://godbolt.org/z/4sP9Pj (lifted from test/CodeGen/AArch64/load-combine.ll) At -Os on AArch64, this is a 0.5% code size improvement for CTMark/sqlite3, and a 0.4% improvement for CTMark/7zip-benchmark. Also fix a bug in `isPredecessor` which caused it to fail whenever `DefMI` was the first instruction in the block. Differential Revision: https://reviews.llvm.org/D94350	2021-01-19 10:24:27 -08:00
Fraser Cormack	9c6a00fe99	[RISCV] Add ISel patterns for scalable mask exts & truncs Original patch by @rogfer01. This patch adds support for sign-, zero-, and any-extension from scalable mask vector types to integer vector types, as well as truncation in the opposite direction. Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com> Co-Authored-by: Fraser Cormack <fraser@codeplay.com> Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D94590	2021-01-19 18:13:15 +00:00
David Green	6a563eef13	[ARM] Expand vXi1 VSELECT's We have no lowering for VSELECT vXi1, vXi1, vXi1, so mark them as expanded to turn them into a series of logical operations. Differential Revision: https://reviews.llvm.org/D94946	2021-01-19 17:56:50 +00:00
Nikita Popov	051ec9f5f4	[ValueTracking] Strengthen impliesPoison reasoning Split impliesPoison into two recursive walks, one over V, the other over ValAssumedPoison. This allows us to reason about poison implications in a number of additional cases that are important in practice. This is a generalized form of D94859, which handles the cmp to cmp implication in particular. Differential Revision: https://reviews.llvm.org/D94866	2021-01-19 18:04:23 +01:00
Fraser Cormack	15fd6bae0e	[RISCV] Extend RVV VType info with the type's AVL (NFC) This patch factors out the "VLMax" operand passed to most scalable-vector ISel patterns into a property of each VType. This is seen as a preparatory change to allow RVV in the future to more easily support fixed-length vector types with constrained vector lengths, with the AVL operand set to the length of the fixed-length vector. It has no effect on the scalable code generation path. Reviewed By: HsiangKai Differential Revision: https://reviews.llvm.org/D94594	2021-01-19 15:46:56 +00:00
David Green	f373b30923	[ARM] Add MVE add.sat costs This adds some basic MVE sadd_sat/ssub_sat/uadd_sat/usub_sat costs, based on when the instruction is legal. With smaller than legal types that are promoted we generate shr(qadd(shl, shl)), so the cost is 4 appropriately. Differential Revision: https://reviews.llvm.org/D94958	2021-01-19 15:38:46 +00:00
Victor Huang	909d6c86ea	[PowerPC] Fix the check for the instruction using FRSP/XSRSP output register When performing peephole optimization to simplify the code, after removing passed FPSP/XSRSP instruction we will set any uses of that FRSP/XSRSP to the source of the FRSP/XSRSP. We are finding the machine instruction using virtual register holding FRSP/XSRSP results by searching all following instructions and encountering an issue that the first use of the virtual register is a debug MI causing: 1. virtual register in the debug MI removed unexpectedly. 2. virtual register used in non-debug MI not replaced with the source of FRSP/XSRSP. which stays in a undef status. This patch fix the issue by only searching non-debug machine instruction using virtual register holding FRSP/XSRSP results when the vr only has one non debug usage. Differential Revisien: https://reviews.llvm.org/D94711 Reviewed by: nemanjai	2021-01-19 09:20:03 -06:00
Florian Hahn	3747b69b53	[LoopRotate] Calls not lowered to calls should not block rotation. `83daa49758` made loop-rotate more conservative in the presence of function calls in the prepare-for-lto stage. The code did not properly account for calls that are no actual function calls, like calls to intrinsics. This patch updates the code to ensure only calls that are lowered to actual calls are considered inline candidates.	2021-01-19 14:37:36 +00:00
Tim Northover	6259fbd8b6	AArch64: add apple-a14 as a CPU This CPU supports all v8.5a features except BTI, and so identifies as v8.5a to Clang. A bit weird, but the best way for things like xnu to detect the new features it cares about.	2021-01-19 14:04:53 +00:00
Hans Wennborg	ec877106a3	[ThinLTO] Also prune Thin-* files from the ThinLTO cache Such files (Thin-%%%%%%.tmp.o) are supposed to be deleted immediately after they're used (either by renaming or deletion). However, we've seen instances on Windows where this doesn't happen, probably due to the filesystem being flaky. This is effectively a resource leak which has prevented us from using the ThinLTO cache on Windows. Since those temporary files are in the thinlto cache directory which we prune periodically anyway, allowing them to be pruned too seems like a tidy way to solve the problem. Differential revision: https://reviews.llvm.org/D94962	2021-01-19 14:43:49 +01:00
Caroline Concatto	172f1f8952	[AArch64][SVE]Add cost model for vector reduce for scalable vector This patch computes the cost for vector.reduce<operand> for scalable vectors. The cost is split into two parts: the legalization cost and the horizontal reduction. Differential Revision: https://reviews.llvm.org/D93639	2021-01-19 11:54:16 +00:00
Simon Pilgrim	5626adcd6b	[X86][SSE] combineVectorSignBitsTruncation - fold trunc(srl(x,c)) -> packss(sra(x,c)) If a srl doesn't introduce any sign bits into the truncated result, then replace with a sra to let us use a PACKSS truncation - fixes a regression noticed in D56387 on pre-SSE41 targets that don't have PACKUSDW.	2021-01-19 11:04:13 +00:00
Hans Wennborg	58bdfcfac0	Revert `5238e7b302` "[InstCombine] Replace one-use select operand based on condition" This caused a miscompile in Chromium, see comments on the codereview for discussion and pointer to a reproducer. > InstCombine already performs a fold where X == Y ? f(X) : Z is > transformed to X == Y ? f(Y) : Z if f(Y) simplifies. However, > if f(X) only has one use, then we can always directly replace the > use inside the instruction. To actually be profitable, limit it to > the case where Y is a non-expr constant. > > This could be further extended to replace uses further up a one-use > instruction chain, but for now this only looks one level up. > > Among other things, this also subsumes D94860. > > Differential Revision: https://reviews.llvm.org/D94862 This also reverts the follow-up a003f26539cf4db744655e76c41f4c4a8913f116: > [llvm] Prevent infinite loop in InstCombine of select statements > > This fixes an issue where the RHS and LHS the comparison operation > creating the predicate were swapped back and forth forever. > > Differential Revision: https://reviews.llvm.org/D94934	2021-01-19 11:50:56 +01:00
Jay Foad	49dce85584	[AMDGPU] Simplify AMDGPUInstPrinter::printExpSrcN. NFC. Change-Id: Idd7f47647bc0faa3ad6f61f44728c0f20540ec00	2021-01-19 10:39:56 +00:00
Florian Hahn	83daa49758	[LoopRotate] Add PrepareForLTO stage, avoid rotating with inline cands. D84108 exposed a bad interaction between inlining and loop-rotation during regular LTO, which is causing notable regressions in at least CINT2006/473.astar. The problem boils down to: we now rotate a loop just before the vectorizer which requires duplicating a function call in the preheader when compiling the individual files ('prepare for LTO'). But this then prevents further inlining of the function during LTO. This patch tries to resolve this issue by making LoopRotate more conservative with respect to rotating loops that have inline-able calls during the 'prepare for LTO' stage. I think this change intuitively improves the current situation in general. Loop-rotate tries hard to avoid creating headers that are 'too big'. At the moment, it assumes all inlining already happened and the cost of duplicating a call is equal to just doing the call. But with LTO, inlining also happens during full LTO and it is possible that a previously duplicated call is actually a huge function which gets inlined during LTO. From the perspective of LV, not much should change overall. Most loops calling user-provided functions won't get vectorized to start with (unless we can infer that the function does not touch memory, has no other side effects). If we do not inline the 'inline-able' call during the LTO stage, we merely delayed loop-rotation & vectorization. If we inline during LTO, chances should be very high that the inlined code is itself vectorizable or the user call was not vectorizable to start with. There could of course be scenarios where we inline a sufficiently large function with code not profitable to vectorize, which would have be vectorized earlier (by scalarzing the call). But even in that case, there probably is no big performance impact, because it should be mostly down to the cost-model to reject vectorization in that case. And then the version with scalarized calls should also not be beneficial. In a way, LV should have strictly more information after inlining and make more accurate decisions (barring cost-model issues). There is of course plenty of room for things to go wrong unexpectedly, so we need to keep a close look at actual performance and address any follow-up issues. I took a look at the impact on statistics for MultiSource/SPEC2000/SPEC2006. There are a few benchmarks with fewer loops rotated, but no change to the number of loops vectorized. Reviewed By: sanwou01 Differential Revision: https://reviews.llvm.org/D94232	2021-01-19 10:15:29 +00:00
Yvan Roux	244ad228f3	[ARM][MachineOutliner] Add stack fixup feature This patch handles cases where we have to save/restore the link register into the stack and and load/store instruction which use the stack are part of the outlined region. It checks that there will be no overflow introduced by the new offset and fixup these instructions accordingly. Differential Revision: https://reviews.llvm.org/D92934	2021-01-19 10:59:09 +01:00
Fraser Cormack	c81ea9429f	[RISCV] Add scalable-vector integer extension patterns Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D94694	2021-01-19 09:30:36 +00:00
Tres Popp	a003f26539	[llvm] Prevent infinite loop in InstCombine of select statements This fixes an issue where the RHS and LHS the comparison operation creating the predicate were swapped back and forth forever. Differential Revision: https://reviews.llvm.org/D94934	2021-01-19 10:31:48 +01:00
Lang Hames	95b63c7b13	[ORC] Move LookupRequest from OrcShared to Orc. It depends on Orc types (SymbolLookupSet), so can't be part of OrcShared.	2021-01-19 20:23:47 +11:00
Tres Popp	170199f562	[llvm][nvptx] add atomicity to counter in ISelLowering Previously uniqueCallSite could have race conditions between different threads. Now it is accessed with an atomic RMW and will be unique between different threads. Differential Revision: https://reviews.llvm.org/D94784	2021-01-19 10:20:20 +01:00
David Sherwood	c3ce262794	[NFC] Make remaining cost functions in LoopVectorize.cpp use InstructionCost A previous patch has already changed getInstructionCost to return an InstructionCost type. This patch changes the other various getXXXCost functions to return an InstructionCost too. This is a non-functional change - I've added a few asserts that the costs are valid in places where we're selecting between vector call and intrinsic costs. However, since we don't yet return invalid costs from any of the TTI implementations these asserts should not fire. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Differential Revision: https://reviews.llvm.org/D94065	2021-01-19 09:08:40 +00:00

1 2 3 4 5 ...

143125 Commits