llvm-project

Commit Graph

Author	SHA1	Message	Date
Arthur Eubanks	be611431cd	[NewPM][opt] Run the "default" AA pipeline by default We tend to assume that the AA pipeline is by default the default AA pipeline and it's confusing when it's empty instead. PR48779 Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D95117	2021-01-21 19:46:38 -08:00
Hsiangkai Wang	5d354220d4	[RISCV] Correct DWARF number for vector registers. The DWARF numbers of vector registers are already defined in riscv-elf-psabi. The DWARF number for vector is start from 96. Correct the DWARF numbers of vector registers. Differential Revision: https://reviews.llvm.org/D94749	2021-01-22 11:33:42 +08:00
Craig Topper	f8f1b20e6b	[RISCV] Don't create LMUL=8 pseudo instructions for ternary widening arithmetic instructions These instructions produce 2*SEW result so the input can't have an LMUL=8 or the result would need a non-existant LMUL=16. So only create pseudos for LMUL up to 4. Differential Revision: https://reviews.llvm.org/D95189	2021-01-21 19:29:02 -08:00
Cassie Jones	3dedad475d	[AArch64][GlobalISel] Make G_USUBO legal and select it. The expansion for wide subtractions includes G_USUBO. Differential Revision: https://reviews.llvm.org/D95032	2021-01-21 18:53:33 -08:00
ShihPo Hung	9667750331	[RISCV] Add intrinsics for RVV1.0 VFRSQRTE7 & VFRECE7 Reviewed By: craig.topper, frasercrmck Differential Revision: https://reviews.llvm.org/D95113	2021-01-21 18:38:49 -08:00
ShihPo Hung	976cf53cc7	[RISCV] Add intrinsics for vector unordered indexed load in RVV 1.0 Add unordered indexed load: vluxei Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D95028	2021-01-21 18:38:49 -08:00
ShihPo Hung	bea661d9a5	[RISCV] Add intrinsics for RVV 1.0 vrgatherei16 Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D95014	2021-01-21 18:38:49 -08:00
Xun Li	bd3ca6666d	[Inlining] Delete redundant optnone/alwaysinline check The same check is done in InlineCost: `8b0bd54d0e/llvm/lib/Analysis/InlineCost.cpp (L2537-L2552)` Also, doing a check on the callee here is confusing, because anything that deals with callee should be done in the inner loop where we proecss all calls from the same caller. Differential Revision: https://reviews.llvm.org/D95186	2021-01-21 18:38:10 -08:00
Qiu Chaofan	449f2f7140	[PowerPC] Duplicate inherited heuristic from base scheduler PowerPC has its custom scheduler heuristic. It calls parent classes' tryCandidate in override version, but the function returns void, so this way doesn't actually help. This patch duplicates code from base scheduler into PPC machine scheduler class, which does what we wanted. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D94464	2021-01-22 10:11:03 +08:00
RamNalamothu	b6c3a59c3f	[AMDGPU] Test case demonstrating issues with generation of .debug_frame This test case demonstrates that the Call Frame Information generation is totally biased towards whether exceptions are enabled or not. Currently LLVM does not generate CFI i.e. a .debug_frame for debug purpose even if --force-dwarf-frame-section is enabled unless exceptions are enabled. Reviewed By: scott.linder Differential Revision: https://reviews.llvm.org/D94801	2021-01-22 07:39:06 +05:30
Akira Hatanaka	3d349ed7e1	[CodeGen][ObjC] Fix broken IR generated when there is a nil receiver check This patch fixes a bug in emitARCOperationAfterCall where it inserts the fall-back call after a bitcast instruction and then replaces the bitcast's operand with the result of the fall-back call. The generated IR without this patch looks like this: msgSend.call: ; preds = %entry %call = call i8* bitcast (i8* (i8, i8, ...)* @objc_msgSend br label %msgSend.cont msgSend.null-receiver: ; preds = %entry call void @llvm.objc.release(i8* %4) br label %msgSend.cont msgSend.cont: %8 = phi i8* [ %call, %msgSend.call ], [ null, %msgSend.null-receiver ] %9 = bitcast i8* %10 to %0* %10 = call i8* @llvm.objc.retain(i8* %8) Notice that `%9 = bitcast i8* %10` to %0* is taking operand %10 which is defined after it. To fix the bug, this patch modifies the insert point to point to the bitcast instruction so that the fall-back call is inserted before the bitcast. In addition, it teaches the function to look at phi instructions that are generated when there is a check for a null receiver and insert the retainRV/claimRV instruction right after the call instead of inserting a fall-back call right after the phi instruction. rdar://73360225 Differential Revision: https://reviews.llvm.org/D95181	2021-01-21 17:38:46 -08:00
mikeurbach	0a7a1ac73d	[mlir] Support FuncOpSignatureConversion for more FunctionLike ops. This extracts the implementation of getType, setType, and getBody from FunctionSupport.h into the mlir::impl namespace and defines them generically in FunctionSupport.cpp. This allows them to be used elsewhere for any FunctionLike ops that use FunctionType for their type signature. Using the new helpers, FuncOpSignatureConversion is generalized to work with all such FunctionLike ops. Convenience helpers are added to configure the pattern for a given concrete FunctionLike op type. Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D95021	2021-01-21 18:35:09 -07:00
Wolfgang Pieb	c6e8f81410	[llvm-mca] Addressing build failures due to missing override specifiers	2021-01-21 17:32:18 -08:00
Craig Topper	3b5430eb0d	[RISCV] Add a VL output to vleff intrinsics. The fault-only-first-load instructions can reduce VL if an element other than element 0 triggers a memory fault. This can be used to vectorize loops with data dependent exit conditions like strcmp or strlen. This patch adds a VL output to these intrinsics so that the new VL value can be captured by software. This will be expanded to 'csrr gpr, vl' after the vleff instruction during SelectionDAG. By doing this with one intrinsic we are able to guarantee that the csrr reads the VL value produced by the vleff instruction. Having it as a separate intrinsic would make it impossible to guarantee ordering without making every other vector intrinsic have side effects. The intrinsics are expanded during lowering into two ISD nodes that are glued together. These ISD nodes will go through isel separately, but should maintain the glue so that they get emitted adjacently by InstrEmitter. I've only ran the chain through the vleff instruction, allowing the READ_VL to be deleted if it is unused. Reviewed By: HsiangKai Differential Revision: https://reviews.llvm.org/D94286	2021-01-21 17:19:58 -08:00
Chen Zheng	8120cfedf5	[NFC] [TargetRegisterInfo] add another API to get srcreg through copy. Reviewed By: nemanjai, jsji Differential Revision: https://reviews.llvm.org/D92069	2021-01-21 20:10:25 -05:00
peter klausler	2de5ea3b3e	[flang] Fix bogus error message with binding ProcedureDesignator::GetInterfaceSymbol() needs to return the procedure bound to a bindings. Differential Revision: https://reviews.llvm.org/D95178	2021-01-21 16:59:51 -08:00
Brad Smith	1be2524b7d	[libcxx] Check return value for asprintf() local __libcpp_asprintf_l() -> libc asprintf() was inspecting the pointer (with indeterminate value) for failure, rather than the return value of -1. Reviewed By: ldionne Differential Revision: https://reviews.llvm.org/D94564	2021-01-21 19:43:11 -05:00
peter klausler	0cfadb37f4	[flang] Allow NULL() actual argument for pointer dummy Fixes a bogus error message about an actual argument not being an object. Differential Revision: https://reviews.llvm.org/D95176	2021-01-21 16:37:35 -08:00
Wolfgang Pieb	020c00b5d3	[llvm-mca] Test case was missing a triple.	2021-01-21 16:19:32 -08:00
peter klausler	3738447c96	[flang] Address name resolution problems Don't emit a bogus error message about a bad forward reference when it's an IMPORT of a USE-associated symbol; don't ignore intrinsic functions when USE-associating the contents of a module when the intrinsic has been explicitly USE'd; allow PUBLIC or PRIVATE accessibility attribute to be specified for an enumerator before the declaration of the enumerator. Differential Revision: https://reviews.llvm.org/D95175	2021-01-21 16:15:38 -08:00
Hsiangkai Wang	6e360460f1	[RISCV] Use v8-v23 as argument registers to conform to the proposal. The maximum LMUL is 8. We need 16 vector registers for two LMUL-8 arguments. The modification follows the proposal of psABI in https://github.com/riscv/riscv-elf-psabi-doc/pull/171 Differential Revision: https://reviews.llvm.org/D95134	2021-01-22 07:55:24 +08:00
Wolfgang Pieb	04af1ca2e9	[llvm-mca] Forgot a couple of override specifiers. Differential Revision: https://reviews.llvm.org/D86644	2021-01-21 15:44:14 -08:00
Hsiangkai Wang	b7ab6726b6	[RISCV] New vector load/store in V extension v1.0 Upgrade RISC-V V extension to v1.0-08a0b46. Indexed load/store have ordered and unordered form. New whole vector load/store. Differential Revision: https://reviews.llvm.org/D93614	2021-01-22 07:30:09 +08:00
Petr Hosek	b014335263	[libc] Distinguish compiler and run failures This is useful for debugging issues, for example when cross-compiling. Differential Revision: https://reviews.llvm.org/D95118	2021-01-21 15:27:34 -08:00
LLVM GN Syncbot	0cd1e47327	[gn build] Port `d38be2ba0e`	2021-01-21 23:19:45 +00:00
Fangrui Song	cfe9ccbddd	[libc++abi] Simplify scan_eh_tab 1. All `_URC_HANDLER_FOUND` return values need to set `landingPad` and its value does not matter for `_URC_CONTINUE_UNWIND`. So we can always set `landingPad` to unify code. 2. For an exception specification (`ttypeIndex < 0`), we can check `_UA_FORCE_UNWIND` first. 3. The so-called type 3 search (`actions & _UA_CLEANUP_PHASE && !(actions & _UA_HANDLER_FRAME)`) is actually conceptually wrong. For a catch handler or an unmatched dynamic exception specification, `_UA_HANDLER_FOUND` should be returned immediately. It still appeared to work because the `ttypeIndex==0` case would return `_UA_HANDLER_FOUND` at a later time. This patch fixes the conceptual error and simplifies the code by handling type 3 the same way as type 2 (which is also what libsupc++ does). The only difference between phase 1 and phase 2 is what to do with a cleanup (`actionEntry==0`, or a `ttypeIndex==0` is found in the action record chain): phase 1 returns `_URC_CONTINUE_UNWIND` while phase 2 returns `_URC_HANDLER_FOUND`. Reviewed By: #libc_abi, compnerd Differential Revision: https://reviews.llvm.org/D93190	2021-01-21 15:19:23 -08:00
Wolfgang Pieb	d38be2ba0e	[llvm-mca] Initial implementation of serialization using JSON. The views implemented at this time are Summary, Timeline, ResourcePressure and InstructionInfo. Use --json on the command line to obtain JSON output.	2021-01-21 15:15:54 -08:00
Mehdi Amini	922b26cde4	Add Python bindings for the builtin dialect This includes some minor customization for FuncOp and ModuleOp. Differential Revision: https://reviews.llvm.org/D95022	2021-01-21 22:44:44 +00:00
Jon Roelofs	1deee5cacb	Fix crash when emitting NullReturn guards for functions returning BOOL CodeGenModule::EmitNullConstant() creates constants with their "in memory" type, not their "in vregs" type. The one place where this difference matters is when the type is _Bool, as that is an i1 when in vregs and an i8 in memory. Fixes: rdar://73361264	2021-01-21 14:29:36 -08:00
Sam Clegg	d75b371982	[WebAssembly] Test that invalid symbol/relocation types generate errors See https://bugs.llvm.org/show_bug.cgi?id=48827 Differential Revision: https://reviews.llvm.org/D95163	2021-01-21 13:58:28 -08:00
Christian Sigg	bd3a387ee7	Revert [mlir] Link mlir_runner_utils statically into cuda/rocm-runtime-wrappers (`cf50f4f764`) There are cmake failures that I do not know how to fix. Differential Revision: https://reviews.llvm.org/D95162	2021-01-21 22:38:59 +01:00
Dan Albert	866d480fe0	[libc++abi] Add an option to avoid demangling in terminate. We've been using this patch in Android so we can avoid including the demangler in libc++.so. It comes with a rather large cost in RSS and isn't commonly needed. Reviewed By: #libc_abi, compnerd Differential Revision: https://reviews.llvm.org/D88189	2021-01-21 13:32:29 -08:00
Walter Erquinigo	39239f9b56	[lldb-vscode] improve modules request lldb-vsdode was communicating the list of modules to the IDE with events, which in practice ended up having some drawbacks - when debugging large targets, the number of these events were easily 10k, which polluted the messages being transmitted, which caused the following: a harder time debugging the messages, a lag after terminated the process because of these messages being processes (this could easily take several seconds). The latter was specially bad, as users were complaining about it even when they didn't check the modules view. - these events were rarely used, as users only check the modules view when something is wrong and they try to debug things. After getting some feedback from users, we realized that it's better to not used events but make this simply a request and is triggered by users whenever they needed. This diff achieves that and does some small clean up in the existing code. Differential Revision: https://reviews.llvm.org/D94033	2021-01-21 13:18:50 -08:00
David Green	39db5753f9	[LV][ARM] Inloop reduction cost modelling This adds cost modelling for the inloop vectorization added in `745bf6cf44`. Up until now they have been modelled as the original underlying instruction, usually an add. This happens to works OK for MVE with instructions that are reducing into the same type as they are working on. But MVE's instructions can perform the equivalent of an extended MLA as a single instruction: %sa = sext <16 x i8> A to <16 x i32> %sb = sext <16 x i8> B to <16 x i32> %m = mul <16 x i32> %sa, %sb %r = vecreduce.add(%m) -> R = VMLADAV A, B There are other instructions for performing add reductions of v4i32/v8i16/v16i8 into i32 (VADDV), for doing the same with v4i32->i64 (VADDLV) and for performing a v4i32/v8i16 MLA into an i64 (VMLALDAV). The i64 are particularly interesting as there are no native i64 add/mul instructions, leading to the i64 add and mul naturally getting very high costs. Also worth mentioning, under NEON there is the concept of a sdot/udot instruction which performs a partial reduction from a v16i8 to a v4i32. They extend and mul/sum the first four elements from the inputs into the first element of the output, repeating for each of the four output lanes. They could possibly be represented in the same way as above in llvm, so long as a vecreduce.add could perform a partial reduction. The vectorizer would then produce a combination of in and outer loop reductions to efficiently use the sdot and udot instructions. Although this patch does not do that yet, it does suggest that separating the input reduction type from the produced result type is a useful concept to model. It also shows that a MLA reduction as a single instruction is fairly common. This patch attempt to improve the costmodelling of in-loop reductions by: - Adding some pattern matching in the loop vectorizer cost model to match extended reduction patterns that are optionally extended and/or MLA patterns. This marks the cost of the reduction instruction correctly and the sext/zext/mul leading up to it as free, which is otherwise difficult to tell and may get a very high cost. (In the long run this can hopefully be replaced by vplan producing a single node and costing it correctly, but that is not yet something that vplan can do). - getExtendedAddReductionCost is added to query the cost of these extended reduction patterns. - Expanded the ARM costs to account for these expanded sizes, which is a fairly simple change in itself. - Some minor alterations to allow inloop reduction larger than the highest vector width and i64 MVE reductions. - An extra InLoopReductionImmediateChains map was added to the vectorizer for it to efficiently detect which instructions are reductions in the cost model. - The tests have some updates to show what I believe is optimal vectorization and where we are now. Put together this can greatly improve performance for reduction loop under MVE. Differential Revision: https://reviews.llvm.org/D93476	2021-01-21 21:03:41 +00:00
Sanjay Patel	2f03528f5e	[SLP] rename reduction variable to avoid shadowing; NFC The code structure can likely be improved now that 'OperationData' is gone.	2021-01-21 16:02:38 -05:00
Anton Rapetov	bfec9148a0	Scalar: Don't visit constants in findInnerReductionPhi in LoopInterchange In LoopInterchange, `findInnerReductionPhi()` looks for reduction variables, which cannot be constants. Update it to return early in that case. This also addresses a blocker for removing use-lists from ConstantData, whose users could be spread across arbitrary modules in the same LLVMContext. Differential Revision: https://reviews.llvm.org/D94712	2021-01-21 12:33:06 -08:00
Christian Sigg	8827e07aaf	Remove deprecated methods from OpState. Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D95123	2021-01-21 21:29:08 +01:00
Duncan P. N. Exon Smith	d7ff003646	ADT: Fix reference invalidation in SmallVector::emplace_back and assign(N,V) This fixes the final (I think?) reference invalidation in `SmallVector` that we need to fix to align with `std::vector`. (There is still some left in the range insert / append / assign, but the standard calls that UB for `std::vector` so I think we don't care?) For POD-like types, reimplement `emplace_back()` in terms of `push_back()`, taking a copy even for large `T` rather than lose the realloc optimization in `grow_pod()`. For other types, split the grow operation in three and construct the new element in the middle. - `mallocForGrow()` calculates the new capacity and returns the result of `safe_malloc()`. We only need a single definition per `SmallVectorBase` so this is defined in SmallVector.cpp to avoid code size bloat. Moving this part of non-POD grow to the source file also allows the logic to be easily shared with `grow_pod`, and `report_size_overflow()` and `report_at_maximum_capacity()` can move there too. - `moveElementsForGrow()` moves elements from the old to the new allocation. - `takeAllocationForGrow()` frees the old allocation and saves the new allocation and capacity . `SmallVector:assign(size_type, const T&)` also uses the split-grow operations for non-POD, but it also has a semantic change when not growing. Previously, assign would start with `clear()`, and so the old elements were destructed and all elements of the new vector were copy-constructed (potentially invalidating references). The new implementation skips destruction and uses copy-assignment for the prefix of the new vector that fits. The new semantics match what libc++ does for `std::vector::assign()`. Note that the following is another possible implementation: ``` void assign(size_type NumElts, ValueParamT Elt) { std::fill_n(this->begin(), std::min(NumElts, this->size()), Elt); this->resize(NumElts, Elt); } ``` The downside of this simpler implementation is that if the vector has to grow there will be `size()` redundant copy operations. (I had planned on splitting this patch up into three for committing (after getting performance numbers / initial review), but I've realized that if this does for some reason need to be reverted we'll probably want to revert the whole package...) Differential Revision: https://reviews.llvm.org/D94739	2021-01-21 12:11:41 -08:00
Michael Munday	4ab0f51a75	Recommit "[RISCV] Legalize select when Zbt extension available" This recommits `71ed4b6ce5` with the polarity of some of the pattern corrected. Original commit message: The custom expansion of select operations in the RISC-V backend interferes with the matching of cmov instructions. Legalizing select when the Zbt extension is available solves that problem. Reviewed By: luismarques, craig.topper Differential Revision: https://reviews.llvm.org/D93767	2021-01-21 12:07:44 -08:00
Sanjay Patel	d77753381f	[SLP] simplify reduction matching This is NFC-intended and removes the "OperationData" class which had become nothing more than a recurrence (reduction) type. I adjusted the matching logic to distinguish instructions from non-instructions - that's all that the "IsLeafValue" member was keeping track of.	2021-01-21 14:58:57 -05:00
Bob Haarman	8e0b179315	[ELF] report section sizes when output file too large Fixes PR48523. When the linker errors with "output file too large", one question that comes to mind is how the section sizes differ from what they were previously. Unfortunately, this information is lost when the linker exits without writing the output file. This change makes it so that the error message includes the sizes of the largest sections. Reviewed By: MaskRay, grimar, jhenderson Differential Revision: https://reviews.llvm.org/D94560	2021-01-21 19:47:03 +00:00
Nikita Popov	65fd034b95	[FunctionAttrs] Infer willreturn for functions without loops If a function doesn't contain loops and does not call non-willreturn functions, then it is willreturn. Loops are detected by checking for backedges in the function. We don't attempt to handle finite loops at this point. Differential Revision: https://reviews.llvm.org/D94633	2021-01-21 20:29:33 +01:00
Duncan P. N. Exon Smith	f2fd41d789	X86: Fix use-after-realloc in X86AsmParser::ParseIntelExpression `X86AsmParser::ParseIntelExpression` has a while loop. In the body, calls to MCAsmLexer::UnLex can force a reallocation in the MCAsmLexer's `CurToken` SmallVector, invalidating saved references to `MCAsmLexer::getTok()`. `const MCAsmToken &Tok` is such a saved reference, and this moves it from outside the while loop to inside the body, fixing a use-after-realloc. `Tok` will still be reused across calls to `Lex()`, each of which effectively destroys and constructs the pointed-to token. I'm a bit skeptical of this usage pattern, but it seems broadly used in the X86AsmParser (and others) so I'm leaving it alone (for now). Somehow this bug was exposed by https://reviews.llvm.org/D94739, resulting in test failures in dot-operator related tests in llvm/test/tools/llvm-ml. I suspect the exposure path is related to optimizer changes from splitting up the grow operation, but I haven't dug all the way in. Regardless, there are already tests in tree that cover this; they might fail consistently if we added ASan instrumentation to SmallVector. Differential Revision: https://reviews.llvm.org/D95112	2021-01-21 11:24:35 -08:00
Joseph Huber	119a9ea13f	[OpenMP] Fix failing test due to change in offloading flags Summary: Prior to D91261 the information checked the OMP_MAP_TARGET_PARAM flag, change this as it has been removed. The INFO macro was changed to accept a flag as input to make conditionally printing information easier. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D95133	2021-01-21 14:09:36 -05:00
Artem Belevich	127091bfd5	[CUDA] Normalize handling of defauled dtor. Defaulted destructor was treated inconsistently, compared to other compiler-generated functions. When Sema::IdentifyCUDATarget() got called on just-created dtor which didn't have implicit __host__ __device__ attributes applied yet, it would treat it as a host function. That happened to (sometimes) hide the error when dtor referred to a host-only functions. Even when we had identified defaulted dtor as a HD function, we still treated it inconsistently during selection of usual deallocators, where we did not allow referring to wrong-side functions, while it is allowed for other HD functions. This change brings handling of defaulted dtors in line with other HD functions. Differential Revision: https://reviews.llvm.org/D94732	2021-01-21 10:48:07 -08:00
peter klausler	a75840a09c	[flang] Better C_LOC and C_ASSOCIATED in flang/module The place-holding implementation of C_LOC just didn't work when used with our more complete semantic checking, specifically in the case of a polymorphic argument; convert it to an external function with an implicit interface. C_ASSOCIATED needs to be a generic interface with specific implementations for C_PTR and C_FUNPTR. Differential Revision: https://reviews.llvm.org/D94714	2021-01-21 09:41:05 -08:00
Ulrich Weigand	b3a5abcb36	[NFC][Doc] Mention SystemZ supports StackMap generation Support available as of commit `5eb64110d2`. Differential Revision: https://reviews.llvm.org/D95040	2021-01-21 18:29:46 +01:00
Hsiangkai Wang	b8921af63b	[RISCV] Update V instructions constraints to conform to v1.0 Upgrade RISC-V V extension to v1.0-08a0b46. Update instruction constraints to conform to v1.0. Differential Revision: https://reviews.llvm.org/D93612	2021-01-22 01:15:55 +08:00
Giorgis Georgakoudis	6b7645dd31	[OpenMP] Add time profiling support in libomp Profiling has been recently implemented in libomptarget (D93055). This patch enables time profiling support for libomptarget in libomp, to support profiling of multi-threaded execution of offloaded regions. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D94855	2021-01-21 09:15:14 -08:00
Sebastian Neubauer	4dbdff66fe	Revert "[AMDGPU] Implement mir parseCustomPseudoSourceValue" This reverts commit `ba7dcd8542`. (caused memory leaks)	2021-01-21 18:11:48 +01:00

... 2 3 4 5 6 ...

377837 Commits All Branches Search

377837 Commits

All Branches