Having a custom inliner doesn't really fit in with the new PM's
pipeline. It's also extra technical debt.
amdgpu-inline only does a couple of custom things compared to the normal
inliner:
1) It disables inlining if the number of BBs in a function would exceed
some limit
2) It increases the threshold if there are pointers to private arrays(?)
These can all be handled as TTI inliner hooks.
There already exists a hook for backends to multiply the inlining
threshold.
This way we can remove the custom amdgpu-inline pass.
This caused inline-hint.ll to fail. After some investigation, it looks
like getInliningThresholdMultiplier() was previously getting applied
twice in amdgpu-inline (https://reviews.llvm.org/D62707 fixed it not
being applied at all, so some later inliner change must have fixed
something), so I had to change the threshold in the test.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D94153
Add factory to create streams for logging the reproducer. Allows for more general logging (beyond file) and logging the configuration/module separately (logged in order, configuration before module).
Also enable querying the filename of a ToolOutputFile.
Differential Revision: https://reviews.llvm.org/D94868
The fault-only-first-load instructions can reduce VL if an element
other than element 0 triggers a memory fault. This can be used to
vectorize loops with data dependent exit conditions like strcmp or
strlen.
This patch adds a VL output to these intrinsics so that the new
VL value can be captured by software. This will be expanded to
'csrr gpr, vl' after the vleff instruction during SelectionDAG.
By doing this with one intrinsic we are able to guarantee that the
csrr reads the VL value produced by the vleff instruction. Having
it as a separate intrinsic would make it impossible to guarantee
ordering without making every other vector intrinsic have side
effects.
The intrinsics are expanded during lowering into two ISD nodes
that are glued together. These ISD nodes will go
through isel separately, but should maintain the glue so that they
get emitted adjacently by InstrEmitter.
I've only run the chain through the vleff instruction, allowing
the READ_VL to be deleted if it is unused.
Reviewed By: HsiangKai
Differential Revision: https://reviews.llvm.org/D94286
Upgrade RISC-V V extension to v1.0-08a0b46.
Indexed loads/stores now have ordered and unordered forms.
New whole-register vector loads/stores are added.
Differential Revision: https://reviews.llvm.org/D93614
This adds cost modelling for the inloop vectorization added in
745bf6cf44. Up until now they have been modelled as the original
underlying instruction, usually an add. This happens to work OK for MVE
with instructions that are reducing into the same type as they are
working on. But MVE's instructions can perform the equivalent of an
extended MLA as a single instruction:
  %sa = sext <16 x i8> A to <16 x i32>
  %sb = sext <16 x i8> B to <16 x i32>
  %m = mul <16 x i32> %sa, %sb
  %r = vecreduce.add(%m)
->
  R = VMLADAV A, B
There are other instructions for performing add reductions of
v4i32/v8i16/v16i8 into i32 (VADDV), for doing the same with v4i32->i64
(VADDLV) and for performing a v4i32/v8i16 MLA into an i64 (VMLALDAV).
The i64 are particularly interesting as there are no native i64 add/mul
instructions, leading to the i64 add and mul naturally getting very
high costs.
Also worth mentioning, under NEON there is the concept of an sdot/udot
instruction which performs a partial reduction from a v16i8 to a v4i32.
They extend and mul/sum the first four elements from the inputs into the
first element of the output, repeating for each of the four output
lanes. They could possibly be represented in the same way as above in
llvm, so long as a vecreduce.add could perform a partial reduction. The
vectorizer would then produce a combination of in-loop and outer-loop
reductions to efficiently use the sdot and udot instructions. Although
this patch does not do that yet, it does suggest that separating the
input reduction type from the produced result type is a useful concept
to model. It also shows that a MLA reduction as a single instruction is
fairly common.
This patch attempts to improve the cost modelling of in-loop reductions
by:
- Adding some pattern matching in the loop vectorizer cost model to
match extended reduction patterns that are optionally extended and/or
MLA patterns. This marks the cost of the reduction instruction correctly
and the sext/zext/mul leading up to it as free, which is otherwise
difficult to tell and may get a very high cost. (In the long run this
can hopefully be replaced by vplan producing a single node and costing
it correctly, but that is not yet something that vplan can do).
- getExtendedAddReductionCost is added to query the cost of these
extended reduction patterns.
- Expanded the ARM costs to account for these expanded sizes, which is a
fairly simple change in itself.
- Some minor alterations to allow in-loop reductions larger than the highest
vector width and i64 MVE reductions.
- An extra InLoopReductionImmediateChains map was added to the vectorizer
for it to efficiently detect which instructions are reductions in the
cost model.
- The tests have some updates to show what I believe is optimal
vectorization and where we are now.
Put together this can greatly improve performance for reduction loops
under MVE.
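As a rough illustration of the pattern matching described above, the
VMLADAV-shaped reduction can be recognized with LLVM's IR PatternMatch
machinery. This is a minimal sketch under an invented helper name (the
actual cost-model code in the patch is more general):
```
#include "llvm/IR/Instruction.h"
#include "llvm/IR/IntrinsicInst.h"
#include "llvm/IR/PatternMatch.h"

using namespace llvm;
using namespace llvm::PatternMatch;

// Sketch: recognize vecreduce.add(mul(sext(A), sext(B))), the shape that
// can become a single VMLADAV. The patch also handles zext and plain adds.
static bool looksLikeExtendedMLAReduction(Instruction *I, Value *&A,
                                          Value *&B) {
  return match(I, m_Intrinsic<Intrinsic::vector_reduce_add>(m_Mul(
                      m_SExt(m_Value(A)), m_SExt(m_Value(B)))));
}
```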
Differential Revision: https://reviews.llvm.org/D93476
This fixes the final (I think?) reference invalidation in `SmallVector`
that we need to fix to align with `std::vector`. (There is still some
left in the range insert / append / assign, but the standard calls that
UB for `std::vector` so I think we don't care?)
For POD-like types, reimplement `emplace_back()` in terms of
`push_back()`, taking a copy even for large `T` rather than lose the
realloc optimization in `grow_pod()`.
For other types, split the grow operation into three and construct the new
element in the middle.
- `mallocForGrow()` calculates the new capacity and returns the result
of `safe_malloc()`. We only need a single definition per
`SmallVectorBase` so this is defined in SmallVector.cpp to avoid code
size bloat. Moving this part of non-POD grow to the source file also
allows the logic to be easily shared with `grow_pod`, and
`report_size_overflow()` and `report_at_maximum_capacity()` can move
there too.
- `moveElementsForGrow()` moves elements from the old to the new
allocation.
- `takeAllocationForGrow()` frees the old allocation and saves the
new allocation and capacity.
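Put together, the three pieces compose roughly like this; a standalone
sketch that mirrors the bullets above with simplified bodies and invented
type names (the real code is split across SmallVector.h and
SmallVector.cpp):
```
#include <algorithm>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <utility>

// Simplified stand-in for the non-POD grow path described above.
template <typename T> struct GrowableBufferSketch {
  T *Begin = nullptr;
  size_t Size = 0, Capacity = 0;

  T *mallocForGrow(size_t MinSize, size_t &NewCapacity) {
    NewCapacity = std::max(MinSize, 2 * Capacity + 1); // growth policy
    return static_cast<T *>(std::malloc(NewCapacity * sizeof(T)));
  }
  void moveElementsForGrow(T *NewElts) {
    for (size_t I = 0; I != Size; ++I) {
      new (NewElts + I) T(std::move(Begin[I]));
      Begin[I].~T();
    }
  }
  void takeAllocationForGrow(T *NewElts, size_t NewCapacity) {
    std::free(Begin);
    Begin = NewElts;
    Capacity = NewCapacity;
  }
  void grow(size_t MinSize) {
    size_t NewCapacity;
    T *NewElts = mallocForGrow(MinSize, NewCapacity);
    moveElementsForGrow(NewElts);
    takeAllocationForGrow(NewElts, NewCapacity);
  }
};
```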
`SmallVector::assign(size_type, const T&)` also uses the split-grow
operations for non-POD, but it also has a semantic change when not
growing. Previously, assign would start with `clear()`, and so the old
elements were destructed and all elements of the new vector were
copy-constructed (potentially invalidating references). The new
implementation skips destruction and uses copy-assignment for the prefix
of the new vector that fits. The new semantics match what libc++ does
for `std::vector::assign()`.
Note that the following is another possible implementation:
```
void assign(size_type NumElts, ValueParamT Elt) {
  std::fill_n(this->begin(), std::min(NumElts, this->size()), Elt);
  this->resize(NumElts, Elt);
}
```
The downside of this simpler implementation is that if the vector has to
grow there will be `size()` redundant copy operations.
(I had planned on splitting this patch up into three for committing
(after getting performance numbers / initial review), but I've realized
that if this does for some reason need to be reverted we'll probably
want to revert the whole package...)
Differential Revision: https://reviews.llvm.org/D94739
Make this look more like the DAG handling and move to common code.
I also noticed AArch64 seems to not be properly adding the
physreg:virtreg mapping to the function live ins.
Summary:
The custom mapper API did not previously support the mapping names that were added earlier. This means they were not present if a user requested debugging information while using the mapper functions. This adds basic support for passing the mapped names to the runtime library.
Reviewers: jdoerfert
Differential Revision: https://reviews.llvm.org/D94806
Previous code built a model in which the tile config register was the user
of each AMX instruction. There is a problem with that model for tile config
register spills: when going across a function call, the ldtilecfg
instruction may be inserted at each AMX instruction which uses the tile
config register. This causes all tile data registers to be clobbered.
To fix this issue, we remove the model of the tile config register. We
analyze the regmask of each call instruction and insert ldtilecfg if there
is any tile data register live across the call. Inserting sttilecfg
before the call is unnecessary, because the tile config doesn't change
and we can just reload the config.
Besides, we also need to check tile config register interference. Since we
don't model the config register, we should check interference from the
ldtilecfg to each tile data register def.
          ldtilecfg
          /       \
        BB1       BB2
        /           \
      call          BB3
      /               \
  %1=tileload    %2=tilezero
We can start from the instruction of each tile def and walk backward to
ldtilecfg. If there is any call instruction, and the tile data register is
not preserved, we should insert ldtilecfg after the call instruction.
Differential Revision: https://reviews.llvm.org/D94155
This makes the following improvements.
For `SHT_GNU_versym`:
* yaml2obj: set `sh_link` to index of `.dynsym` section automatically.
For `SHT_GNU_verdef`:
* yaml2obj: set `sh_link` to index of `.dynstr` section automatically.
* yaml2obj: set `sh_info` field automatically.
* obj2yaml: don't dump the `Info` field when its value matches the number of version definitions.
For `SHT_GNU_verneed`:
* yaml2obj: set `sh_link` to index of `.dynstr` section automatically.
* yaml2obj: set `sh_info` field automatically.
* obj2yaml: don't dump the `Info` field when its value matches the number of version dependencies.
Also, this simplifies a few test cases.
Differential revision: https://reviews.llvm.org/D94956
This reverts commit d97f776be5.
The original problem was due to build failures in shared lib builds. D95079
moved ImportedFunctionsInliningStatistics under Analysis, unblocking
this.
This is related to D94982. We want to call these APIs from the Analysis
component, so we can't leave them under Transforms.
Differential Revision: https://reviews.llvm.org/D95079
This reverts commit 5b7aef6eb4 and relands
6529d7c5a4.
The ASan error was debugged and determined to be the fault of an invalid
object file input in our test suite, which was fixed by my last change.
LLD's project policy is that it assumes input objects are valid, so I
have added a comment about this assumption to the relocation bounds
check.
Run the ObjCARCContractPass during LTO. The legacy LTO backend (under
LTO/ThinLTOCodeGenerator.cpp) already does this; this diff just adds that
behavior to the new LTO backend. Without that pass, the objc.clang.arc.use
intrinsic will get passed to the instruction selector, which doesn't know how to
handle it.
In order to test both the new and old pass managers, I've also added support for
the `--[no-]lto-legacy-pass-manager` flags.
P.S. Not sure if the ordering of the pass within the pipeline matters...
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D94547
When using 2 InlinePass instances in the same CGSCC - one for mandatory
inlinings, the other for the heuristic-driven ones - the order in which
the ImportedFunctionStats would be output depended on the destruction
order of the inline passes, which is not deterministic.
This patch moves the ImportedFunctionStats responsibility to the
InlineAdvisor to address this problem.
Differential Revision: https://reviews.llvm.org/D94982
Add the aarch64[_be]-*-gnu_ilp32 targets to support the GNU ILP32 ABI for AArch64.
The needed codegen changes were mostly already implemented in D61259, which added support for the watchOS ILP32 ABI. The main changes are:
- Wiring up the new target to enable ILP32 codegen and MC.
- ILP32 va_list support.
- ILP32 TLSDESC relocation support.
There was existing MC support for ELF ILP32 relocations from D25159 which could be enabled by passing "-target-abi ilp32" to llvm-mc. This was changed to check for "gnu_ilp32" in the target triple instead. This shouldn't cause any issues since the existing support was slightly broken: it was generating ELF64 objects instead of the ELF32 object files expected by the GNU ILP32 toolchain.
This target has been tested by running the full rustc testsuite on a big-endian ILP32 system based on the GCC ILP32 toolchain.
Reviewed By: kristof.beyls
Differential Revision: https://reviews.llvm.org/D94143
The pass analysis uses "sets" implemented using a SmallVector type
to keep track of Used, Preserved, Required and RequiredTransitive
passes. When having nested analyses we could end up with duplicates
in those sets, as there were no checks to see if a pass already
existed in the "set" before pushing to the vectors. The idea with
this patch is to avoid such duplicates by not pushing elements
that are already contained when adding elements to those sets.
To align with the above PMDataManager::collectRequiredAndUsedAnalyses
is changed to skip adding both the Required and RequiredTransitive
passes to its result vectors (since RequiredTransitive always is
a subset of Required we ended up with duplicates when traversing
both sets).
The main goal with this is to avoid spending time verifying the same
analysis multiple times in PMDataManager::verifyPreservedAnalysis
when iterating over the Preserved "set". It is assumed that removing
duplicates from a "set" shouldn't have any other negative impact
(I have not seen any problems so far). If this ends up causing
problems one could do some uniqueness filtering of the vector being
traversed in verifyPreservedAnalysis instead.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D94416
If constants are hidden behind G_ANYEXT we can treat them the same way as G_SEXT.
For that purpose we extend getConstantVRegValWithLookThrough with an option
to handle G_ANYEXT same way as G_SEXT.
Differential Revision: https://reviews.llvm.org/D92219
When constraining an operand register using constrainOperandRegClass(),
the function may emit a COPY in case the provided register class does
not match the current operand register class. However, the operand
itself is not updated to make use of the COPY, thereby resulting in
incorrect code. This patch fixes that bug by updating the machine
operand accordingly.
Reviewed By: dsanders
Differential Revision: https://reviews.llvm.org/D91244
For Zvlsseg, we need continuous vector registers for the values. We need
to define new register classes for the different combinations of (number
of fields and LMUL). For example,
when the number of fields (NF) = 3 and LMUL = 2, the values will be assigned
to (V0M2, V2M2, V4M2), (V2M2, V4M2, V6M2), (V4M2, V6M2, V8M2), ...
We define the vlseg intrinsics with multiple outputs. There is no way to
describe the codegen patterns with multiple outputs in the tablegen
files. We do the codegen in RISCVISelDAGToDAG and use EXTRACT_SUBREG to
extract the output values.
The multiple scalable vector values will be put into a struct. This
patch depends on the support for scalable vector structs.
Differential Revision: https://reviews.llvm.org/D94229
Currently LLVM is relying on ValueTracking's `isKnownNonZero` to attach `nonnull`, which can return true when the value is poison.
To make the semantics of `nonnull` consistent with the behavior of `isKnownNonZero`, this makes the semantics of `nonnull` accept poison, and return poison if the input pointer is null.
This makes many transformations like below legal:
```
%p = gep inbounds %x, 1 ; %p is a non-null pointer or poison
call void @f(%p) ; instcombine converts this to call void @f(nonnull %p)
```
Instead, this semantics makes propagation of `nonnull` to the caller illegal.
The reason is that passing poison to `nonnull` does not immediately raise UB anymore, so such a program is still well defined, if the callee does not use the argument.
Having `noundef` attribute there re-allows this.
```
define void @f(i8* %p) { ; functionattr cannot mark %p nonnull here anymore
call void @g(i8* nonnull %p) ; .. because @g never raises UB if it never uses %p.
ret void
}
```
Another attribute that needs to be updated is `align`. This patch updates the semantics of align to accept poison as well.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D90529
separate sections.
For ThinLTO, all the function profiles without context have been annotated to
outline functions if possible in the prelink phase. In the postlink phase,
profile annotation is only meaningful for function profiles with
context. If the profile is large, it is better to split the profile into two
parts, one with context and one without, so the profile reading in postlink
phase only has to read the part with context. To have the profile splitting,
we extend the ExtBinary format to support different section arrangement. It
will be flexible to add other section layout in the future without the need
to create new class inheriting from ExtBinary class.
Differential Revision: https://reviews.llvm.org/D94435
If we are able to compare with 0 instead of 1, we might be able
to fold the setcc into a beqz/bnez.
Often these setccs start life as an xor that gets converted to
a setcc by DAG combiner's rebuildSetcc. I looked into a detecting
(xor X, 1) and converting to (seteq X, 0) based on boolean contents
being 0/1 in rebuildSetcc instead of using computeKnownBits. It was
very perturbing to AMDGPU tests which I didn't look closely at.
It had a few changes on a couple other targets, but didn't seem
to be much if any improvement.
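For illustration, such a compare-with-0 rewrite might be shaped roughly
like the sketch below (illustrative only; the helper name is invented and
the actual patch differs in where and how it applies):
```
#include "llvm/ADT/APInt.h"
#include "llvm/CodeGen/SelectionDAG.h"

using namespace llvm;

// If every bit of LHS above bit 0 is known zero, (seteq LHS, 1) is
// (setne LHS, 0) and vice versa; a compare against zero can then be
// selected to beqz/bnez.
static SDValue foldSetCCOfBool(SDValue LHS, SDValue RHS, ISD::CondCode CC,
                               const SDLoc &DL, EVT VT, SelectionDAG &DAG) {
  unsigned BW = LHS.getValueSizeInBits();
  if (isOneConstant(RHS) && (CC == ISD::SETEQ || CC == ISD::SETNE) &&
      DAG.MaskedValueIsZero(LHS, APInt::getBitsSetFrom(BW, 1))) {
    ISD::CondCode InvCC = CC == ISD::SETEQ ? ISD::SETNE : ISD::SETEQ;
    return DAG.getSetCC(DL, VT, LHS,
                        DAG.getConstant(0, DL, LHS.getValueType()), InvCC);
  }
  return SDValue();
}
```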
Reviewed By: lenary
Differential Revision: https://reviews.llvm.org/D94730
Just like llvm.assume, there are a lot of cases where we can just ignore llvm.experimental.noalias.scope.decl.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D93042
This is a restricted version of the combine in `DAGCombiner::MatchLoadCombine`.
(See D27861)
This tries to recognize patterns like below (assuming a little-endian target):
```
s8* a = ...
s32 val = a[0] | (a[1] << 8) | (a[2] << 16) | (a[3] << 24)
->
s32 val = *((s32*)a)

s8* a = ...
s32 val = a[3] | (a[2] << 8) | (a[1] << 16) | (a[0] << 24)
->
s32 val = BSWAP(*((s32*)a))
```
(This patch also handles the big-endian target case as well, in which the first
example above has a BSWAP, and the second example above does not.)
To recognize the pattern, this searches from the last G_OR in the expression
tree.
E.g.
```
  Reg   Reg
   \    /
    OR_1   Reg
      \    /
       OR_2
         \    Reg
          ..  /
          Root
```
Each non-OR register in the tree is put in a list. Each register in the list is
then checked to see if it's an appropriate load + shift logic.
If every register is a load + potentially a shift, the combine checks if those
loads + shifts, when OR'd together, are equivalent to a wide load (possibly with
a BSWAP.)
To simplify things, this patch
(1) Only handles G_ZEXTLOADs (which appear to be the common case)
(2) Only works in a single MachineBasicBlock
(3) Only handles G_SHL as the bit twiddling to stick the small load into a
specific location
An IR example of this is here: https://godbolt.org/z/4sP9Pj (lifted from
test/CodeGen/AArch64/load-combine.ll)
At -Os on AArch64, this is a 0.5% code size improvement for CTMark/sqlite3,
and a 0.4% improvement for CTMark/7zip-benchmark.
Also fix a bug in `isPredecessor` which caused it to fail whenever `DefMI` was
the first instruction in the block.
Differential Revision: https://reviews.llvm.org/D94350
The TableGen emitter for directives has two slots for flangClass information; this was mainly
to keep up with the legacy OpenMP parser at the time. Now that all clauses are encapsulated in
AccClause or OmpClause, these two strings are not necessary anymore and were the source of a couple
of problems while working with the generic structure checker for OpenMP.
This patch removes the flangClassValue string from DirectiveBase.td and uses the string flangClass as the
placeholder for the encapsulated class.
Reviewed By: sameeranjoshi
Differential Revision: https://reviews.llvm.org/D94821
This CPU supports all v8.5a features except BTI, and so identifies as v8.5a to
Clang. A bit weird, but the best way for things like xnu to detect the new
features it cares about.
This patch updates the llvm module map to reflect changes made in
`24672ddea3c97fd1eca3e905b23c0116d7759ab8` and fixes the module builds
(`-DLLVM_ENABLE_MODULES=On`).
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
This patch computes the cost for vector.reduce<operand> for scalable vectors.
The cost is split into two parts: the legalization cost and the horizontal
reduction.
Differential Revision: https://reviews.llvm.org/D93639
D84108 exposed a bad interaction between inlining and loop-rotation
during regular LTO, which is causing notable regressions in at least
CINT2006/473.astar.
The problem boils down to: we now rotate a loop just before the vectorizer
which requires duplicating a function call in the preheader when compiling
the individual files ('prepare for LTO'). But this then prevents further
inlining of the function during LTO.
This patch tries to resolve this issue by making LoopRotate more
conservative with respect to rotating loops that have inline-able calls
during the 'prepare for LTO' stage.
I think this change intuitively improves the current situation in
general. Loop-rotate tries hard to avoid creating headers that are 'too
big'. At the moment, it assumes all inlining already happened and the
cost of duplicating a call is equal to just doing the call. But with LTO,
inlining also happens during full LTO and it is possible that a previously
duplicated call is actually a huge function which gets inlined
during LTO.
From the perspective of LV, not much should change overall. Most loops
calling user-provided functions won't get vectorized to start with
(unless we can infer that the function does not touch memory and has no
other side effects). If we do not inline the 'inline-able' call during
the LTO stage, we merely delayed loop-rotation & vectorization. If we
inline during LTO, chances should be very high that the inlined code is
itself vectorizable or the user call was not vectorizable to start with.
There could of course be scenarios where we inline a sufficiently large
function with code not profitable to vectorize, which would have been
vectorized earlier (by scalarizing the call). But even in that case,
there probably is no big performance impact, because it should be mostly
down to the cost-model to reject vectorization in that case. And then
the version with scalarized calls should also not be beneficial. In a way,
LV should have strictly more information after inlining and make more
accurate decisions (barring cost-model issues).
There is of course plenty of room for things to go wrong unexpectedly,
so we need to keep a close look at actual performance and address any
follow-up issues.
I took a look at the impact on statistics for
MultiSource/SPEC2000/SPEC2006. There are a few benchmarks with fewer
loops rotated, but no change to the number of loops vectorized.
Reviewed By: sanwou01
Differential Revision: https://reviews.llvm.org/D94232
Element sections will also need flags, so we shouldn't squat the
WASM_SEGMENT namespace.
Depends on D90948.
Differential Revision: https://reviews.llvm.org/D92315
The code here is checking to see if two sets are identical.
OtherBlocksSet should point to OtherL->getBlocksSet() instead.
Differential Revision: https://reviews.llvm.org/D94926
This patch adds the default value of 1 to drop_begin.
In the llvm codebase, 70% of calls to drop_begin have 1 as the second
argument. The interface, similar to that of std::next, should improve
readability.
This patch converts a couple of calls to drop_begin as examples.
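For example, the common case now reads (usage sketch):
```
#include "llvm/ADT/STLExtras.h"
#include <vector>

// With the new default, the "skip the first element" loop needs no count.
int sumOfTail(const std::vector<int> &Vals) {
  int Sum = 0;
  for (int V : llvm::drop_begin(Vals)) // same as llvm::drop_begin(Vals, 1)
    Sum += V;
  return Sum;
}
```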
Differential Revision: https://reviews.llvm.org/D94858
DefaultAttrsIntrinsic was introduced to add very common attributes to a
large set of intrinsics.
Currently the added attributes include:
nofree nosync nounwind willreturn
I think those should hold for most AArch64 target intrinsics, but
there are too many to check manually. This patch makes most AArch64 target
intrinsics DefaultAttrsIntrinsics.
Some notable exceptions I think are exclusive loads and stores as well
as the memory barrier intrinsics, for which nosync does not apply I
think.
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D94687
Keys matching the tombstone/empty special values cannot be inserted in a
DenseMap. Under some circumstances, LV tries to add members to an
interleave group that match the special values. Skip adding such
members. This is unlikely to have any impact in practice, because
interleave groups with such indices are very likely to not be
vectorized, due to gaps.
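For context, the reserved keys come from `DenseMapInfo`; a minimal sketch
of what "matching the special values" means for an unsigned key type
(illustrative, not code from the patch):
```
#include "llvm/ADT/DenseMap.h"

// DenseMap reserves two key values per key type. For unsigned keys these
// are ~0U (empty) and ~0U - 1 (tombstone); inserting either is invalid
// and asserts, which is why such interleave-group members must be skipped.
void denseMapReservedKeys() {
  unsigned Empty = llvm::DenseMapInfo<unsigned>::getEmptyKey();
  unsigned Tombstone = llvm::DenseMapInfo<unsigned>::getTombstoneKey();
  llvm::DenseMap<unsigned, int> M;
  M[42] = 1;                  // fine
  (void)Empty;                // M[Empty] or M[Tombstone] would assert
  (void)Tombstone;
}
```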
This issue has been surfaced by fuzzing, see
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=11638
`ELFDumper.cpp` implements the functionality that allows getting symbol versions.
It is used for dumping versioned symbols.
This helps to implement https://bugs.llvm.org/show_bug.cgi?id=48670 ("make llvm-nm -D print version names"):
we can move out and reuse the code from `ELFDumper.cpp`.
This is what this patch does: it moves the related functionality to `ELFFile<ELFT>`.
Differential revision: https://reviews.llvm.org/D94771
RISC-V would like to use a struct of scalable vectors to return multiple
values from intrinsics. This would also be needed for target independent
intrinsics like llvm.sadd.overflow.
This patch removes the existing restriction for this. I've modified
StructType::isSized to consider a struct containing scalable vectors
as unsized so the verifier won't allow loads/stores/allocas of these
structs.
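A rough sketch of the new behavior through the IR API (illustrative; the
function name is local to this example):
```
#include "llvm/IR/DerivedTypes.h"
#include "llvm/IR/LLVMContext.h"

// A struct wrapping scalable vectors now reports !isSized(), so the
// verifier rejects loads/stores/allocas of it while the type itself can
// still be used as an intrinsic return type.
bool structOfScalableVectorsIsUnsized() {
  llvm::LLVMContext Ctx;
  auto *I32 = llvm::Type::getInt32Ty(Ctx);
  auto *VecTy = llvm::ScalableVectorType::get(I32, 4); // <vscale x 4 x i32>
  auto *ST = llvm::StructType::get(Ctx, {VecTy, VecTy});
  return !ST->isSized();
}
```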
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D94142
Reassociating some patterns to generate more fma instructions to
reduce register pressure.
Reviewed By: jsji
Differential Revision: https://reviews.llvm.org/D92071
add one use check to lookThruCopyLike.
The root node is safe to be deleted if we are sure that every
definition in the copy chain only has one use.
Reviewed By: jsji
Differential Revision: https://reviews.llvm.org/D92069
There are no changes relative to the original commit. However, an issue
this exposed in BasicAA assumption tracking has been fixed in the
previous commit.
-----
An alias query currently works out roughly like this:
* Look up location pair in cache.
* Perform BasicAA logic (including cache lookup and insertion...)
* Perform a recursive query using BestAAResults.
* Look up location pair in cache (and thus do not recurse into BasicAA)
* Query all the other AA providers.
* Query all the other AA providers.
This is a lot of unnecessary work, all ultimately caused by the
BestAAResults query at the end of aliasCheck(). The reason we perform
it, is that aliasCheck() is getting called recursively, and we of
course want those recursive queries to also make use of other AA
providers, not just BasicAA. We can solve this by making the recursive
queries directly use BestAAResults (which will check both BasicAA
and other providers), rather than recursing into aliasCheck().
There are some tradeoffs:
* We can no longer pass through the precomputed underlying object
to aliasCheck(). This is not a major concern, because nowadays
getUnderlyingObject() is quite cheap.
* Results from other AA providers are no longer cached inside
BasicAA. The way this worked was already a bit iffy, in that a
result could be cached, but if it was MayAlias, we'd still end
up re-querying other providers anyway. If we want to cache
non-BasicAA results, we should do that in a more principled manner.
In any case, despite those tradeoffs, this works out to be a decent
compile-time improvement. I think it also simplifies the mental model
of how BasicAA works. It took me quite a while to fully understand
how these things interact.
Differential Revision: https://reviews.llvm.org/D90094
D91936 placed the tracking for the assumptions into BasicAA.
However, when recursing over phis, we may use fresh AAQI instances.
In this case AssumptionBasedResults from an inner AAQI can result
in a removal of an element from the outer AAQI.
To avoid this, move the tracking into AAQI. This generally makes
more sense, as the NoAlias assumptions themselves are also stored
in AAQI.
The test case only produces an assertion failure with D90094
reapplied. I think the issue exists independently of that change
as well, but I wasn't able to come up with a reproducer.
This patch removes some ancient options as a clean-up before moving
code-gen to use LTOBackend in D94487.
I think it would be preferable to remove those ancient options, because
1. There are no corresponding options in LTOBackend based tools,
2. There are no unit tests for them,
3. They are not passed through by Clang,
4. At least for GVNLoadPRE, users could just use GVN's `enable-load-pre`.
Alternatively we could add support for those options to lto::Config &
co, but I think it would be better to remove them, unless they are
actually used in practice.
Reviewed By: steven_wu, tejohnson
Differential Revision: https://reviews.llvm.org/D94783
Current code breaks this version of MSVC due to a mismatch between `std::is_trivially_copyable` and `llvm::is_trivially_copyable` for `std::pair` instantiations. Hence I was attempting to use `std::is_trivially_copyable` to set `llvm::is_trivially_copyable<T>::value`.
I spent some time root causing an `llvm::Optional` build error on MSVC 16.8.3 related to the change described above:
```
62>C:\src\ocg_llvm\llvm-project\llvm\include\llvm/ADT/BreadthFirstIterator.h(96,12): error C2280: 'llvm::Optional<std::pair<std::pair<unsigned int,llvm::Graph<4>::NodeSubset> *,llvm::Optional<llvm::Graph<4>::ChildIterator>>> &llvm::Optional<std::pair<std::pair<unsigned int,llvm::Graph<4>::NodeSubset> *,llvm::Optional<llvm::Graph<4>::ChildIterator>>>::operator =(const llvm::Optional<std::pair<std::pair<unsigned int,llvm::Graph<4>::NodeSubset> *,llvm::Optional<llvm::Graph<4>::ChildIterator>>> &)': attempting to reference a deleted function (compiling source file C:\src\ocg_llvm\llvm-project\llvm\unittests\ADT\BreadthFirstIteratorTest.cpp)
...
```
The "trivial" specialization of `optional_detail::OptionalStorage` assumes that the value type is trivially copy constructible and trivially copy assignable. The specialization is invoked based on a check of `is_trivially_copyable` alone, which does not imply both `is_trivially_copy_assignable` and `is_trivially_copy_constructible` are true.
[[ https://en.cppreference.com/w/cpp/named_req/TriviallyCopyable | According to the spec ]], a deleted assignment operator does not make `is_trivially_copyable` false. So I think all these properties need to be checked explicitly in order to specialize `OptionalStorage` to the "trivial" version:
```
/// Storage for any type.
template <typename T, bool = std::is_trivially_copy_constructible<T>::value
                          && std::is_trivially_copy_assignable<T>::value>
class OptionalStorage {
```
Above fixed my build break in MSVC, but I think we need to explicitly check `is_trivially_copy_constructible` too since it might be possible the copy constructor is deleted. Also would be ideal to move over to `std::is_trivially_copyable` instead of the `llvm` namespace version.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D93510
The ``llvm.experimental.noalias.scope.decl`` intrinsic identifies where a noalias
scope is declared. When the intrinsic is duplicated, a decision must
also be made about the scope: depending on the reason of the duplication,
the scope might need to be duplicated as well.
Reviewed By: nikic, jdoerfert
Differential Revision: https://reviews.llvm.org/D93039
Expanding from D94808 - we ensure the same InlineAdvisor is used by both
InlinerPass instances. The notion of mandatory inlining is moved into
the core InlineAdvisor: advisors anyway have to handle that case, so
this change also factors out that a bit better.
Differential Revision: https://reviews.llvm.org/D94825
Unary minus operator applied to unsigned type, result still unsigned.
Use `~0U` instead of `-1U` and `1 + ~VAL` instead of `-VAL`.
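A small worked example of the replacement spellings (both identities hold
for any unsigned value):
```
#include <cassert>

void unsignedNegationIdentities(unsigned Val) {
  unsigned AllOnes = ~0U;      // previously spelled -1U
  unsigned Negated = 1 + ~Val; // previously spelled -Val
  assert(AllOnes + 1 == 0U);   // all-ones wraps to zero
  assert(Negated + Val == 0U); // two's-complement negation
  (void)AllOnes;
  (void)Negated;
}
```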
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D94417
This reverts commit 33be50daa9,
effectively reapplying:
- 260a856c2a
- 3043e5a5c3
- 49142991a6
... with a fix to skip a call to `SmallVector::isReferenceToStorage()`
when we know the parameter had been taken by value for small, POD-like
`T`. See https://reviews.llvm.org/D93779 for the discussion on the
revert.
At a high-level, these commits fix reference invalidation in
SmallVector's push_back, append, insert (one or N), and resize
operations. For more details, please see the original commit messages.
This commit fixes a bug that crept into
`SmallVectorTemplateCommon::reserveForAndGetAddress()` during the review
process after performance analysis was done. That function is now called
`reserveForParamAndGetAddress()`, clarifying that it only works for
parameter values. It uses that knowledge to bypass
`SmallVector::isReferenceToStorage()` when `TakesParamByValue`. This is
`constexpr` and avoids adding overhead for "small enough", trivially
copyable `T`.
Performance could potentially be tuned further by increasing the
threshold for `TakesParamByValue`, which is currently defined as:
```
bool TakesParamByValue = sizeof(T) <= 2 * sizeof(void *);
```
in the POD-like version of SmallVectorTemplateBase (else, `false`).
Differential Revision: https://reviews.llvm.org/D94800
This is not nice, but it's the best transient solution possible,
and is better than just duplicating the whole function.
The problem is, this function is widely used,
and it is not at all obvious that all the users
could be painlessly switched to operate on DomTreeUpdater,
and somehow I don't feel like porting all those users first.
This function is one of the last three that do not operate on DomTreeUpdater.
This is not nice, but it's the best transient solution possible,
and is better than just duplicating the whole function.
The problem is, this function is widely used,
and it is not at all obvious that all the users
could be painlessly switched to operate on DomTreeUpdater,
and somehow I don't feel like porting all those users first.
This function is one of the last three that do not operate on DomTreeUpdater.
This is not nice, but it's the best transient solution possible,
and is better than just duplicating the whole function.
The problem is, this function is widely used,
and it is not at all obvious that all the users
could be painlessly switched to operate on DomTreeUpdater,
and somehow I don't feel like porting all those users first.
This function is one of the last three that do not operate on DomTreeUpdater.
Even though not all its users operate on DomTreeUpdater,
it itself internally operates on DomTreeUpdater,
so it must mean everything is fine with that,
so just do that globally.
This reverts commit a3904cc77f.
It causes the compiler to crash while building Harfbuzz for ARM in
Chromium, reduced reproducer forthcoming:
https://crbug.com/1167305
Add a matcher that checks if the given subpattern has only one non-debug use.
Also improve the existing m_OneUse testcase.
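A usage sketch, assuming the new matcher is GlobalISel's `m_OneNonDBGUse`
(the surrounding helper is invented for illustration):
```
#include "llvm/CodeGen/GlobalISel/MIPatternMatch.h"
#include "llvm/CodeGen/MachineRegisterInfo.h"

using namespace llvm;
using namespace llvm::MIPatternMatch;

// Match add(mul(X, Y), Z) where the mul result has exactly one non-debug
// use, so folding it away cannot be blocked by DBG_VALUE users.
static bool matchMulAdd(Register Dst, MachineRegisterInfo &MRI, Register &X,
                        Register &Y, Register &Z) {
  return mi_match(Dst, MRI,
                  m_GAdd(m_OneNonDBGUse(m_GMul(m_Reg(X), m_Reg(Y))),
                         m_Reg(Z)));
}
```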
Differential Revision: https://reviews.llvm.org/D94705
It can be useful for an ObjectLinkingLayerCreator to allow callee errors to get propagated to the builder. Specifically, this is the case when the ObjectLayer uses the EHFrameRegistrationPlugin, because it requires a TPCEHFrameRegistrar and instantiation for it may fail (e.g. if the required registration symbols are missing in the target process).
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D94690
All other layers in LLJIT are stored as unique_ptr's already. At this point, it is not strictly necessary for ObjTransformLayer, but it makes a follow-up change more straightforward.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D94689
New dwarf operator DW_OP_LLVM_implicit_pointer is introduced (present only in LLVM IR).
This operator is required as it is different from the DWARF operator
DW_OP_implicit_pointer in representation and specification (number
and types of operands), and the latter cannot be used at multiple levels.
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D84113
This reverts commit 260a856c2a.
This reverts commit 3043e5a5c3.
This reverts commit 49142991a6.
This change had a larger than anticipated compile-time impact,
possibly because the small value optimization is not working as
intended. See D93779.
It turns out we need to handle `LangOptions` separately from the rest of the options. `LangOptions` used to be conditionally parsed only when `!(DashX.getFormat() == InputKind::Precompiled || DashX.getLanguage() == Language::LLVM_IR)` and we need to restore this order (for more info, see D94682).
We could do this similarly to how `DiagnosticOptions` are handled: via a counterpart to the `IsDiag` mix-in (e.g. `IsLang`). These mix-ins would prefix the option key path with the appropriate `CompilerInvocation::XxxOpts` member. However, this solution would be problematic, as we'd now have two kinds of options (`Lang` and `Diag`) with seemingly incomplete key paths in the same file. To understand what `CompilerInvocation` member an option affects, one would need to read the whole option definition and notice the `IsDiag` or `IsLang` class.
Instead, this patch introduces a more robust way to handle different kinds of options separately: via the `KeyPathAndMacroPrefix` class. We have one specialization of that class per `CompilerInvocation` member (e.g. `LangOpts`, `DiagnosticOpts`, etc.). Now, instead of specifying a key path with `"LangOpts->UndefPrefixes"`, we use `LangOpts<"UndefPrefixes">`. This keeps the readability intact (you don't have to look for the `IsLang` mix-in, the key path is complete on its own) and allows us to specify a custom macro prefix within `LangOpts`.
Reviewed By: Bigcheese
Differential Revision: https://reviews.llvm.org/D94676
The number of hardware threads available to a ThreadPool can be limited if an affinity mask is set.
For example:
> start /B /AFFINITY 0xF lld-link.exe ...
Would let LLD only use 4 hyper-threads.
Previously, there was an outstanding issue on Windows Server 2019 on dual-CPU machines, which prevented the use of both CPU sockets. In normal conditions, when no affinity mask was set, ProcessorGroup::AllThreads was different from ProcessorGroup::UsableThreads. The previous code in llvm/lib/Support/Windows/Threading.inc L201 was improperly assuming those two values to be equal, and consequently was limiting the execution to only one CPU socket.
Differential Revision: https://reviews.llvm.org/D92419
An alias query currently works out roughly like this:
* Look up location pair in cache.
* Perform BasicAA logic (including cache lookup and insertion...)
* Perform a recursive query using BestAAResults.
* Look up location pair in cache (and thus do not recurse into BasicAA)
* Query all the other AA providers.
* Query all the other AA providers.
This is a lot of unnecessary work, all ultimately caused by the
BestAAResults query at the end of aliasCheck(). The reason we perform
it, is that aliasCheck() is getting called recursively, and we of
course want those recursive queries to also make use of other AA
providers, not just BasicAA. We can solve this by making the recursive
queries directly use BestAAResults (which will check both BasicAA
and other providers), rather than recursing into aliasCheck().
There are some tradeoffs:
* We can no longer pass through the precomputed underlying object
to aliasCheck(). This is not a major concern, because nowadays
getUnderlyingObject() is quite cheap.
* Results from other AA providers are no longer cached inside
BasicAA. The way this worked was already a bit iffy, in that a
result could be cached, but if it was MayAlias, we'd still end
up re-querying other providers anyway. If we want to cache
non-BasicAA results, we should do that in a more principled manner.
In any case, despite those tradeoffs, this works out to be a decent
compile-time improvement. I think it also simplifies the mental model
of how BasicAA works. It took me quite a while to fully understand
how these things interact.
Differential Revision: https://reviews.llvm.org/D90094
This patch renames the tablegen generated file ACC.cpp.inc to ACC.inc in order
to match what was done in D92955. This file is included in header files as well as .cpp
files, so it makes more sense.
Reviewed By: sameeranjoshi
Differential Revision: https://reviews.llvm.org/D93485
This commit adds table symbol support in a partial way, while still
including some special cases for the __indirect_function_table symbol.
No change in tests.
Differential Revision: https://reviews.llvm.org/D94075
This introduces the ARMv8.7-A LS64 extension's intrinsics for 64 bytes
atomic loads and stores: `__arm_ld64b`, `__arm_st64b`, `__arm_st64bv`,
and `__arm_st64bv0`. These are selected into the LS64 instructions
LD64B, ST64B, ST64BV and ST64BV0, respectively.
Based on patches written by Simon Tatham.
Reviewed By: tmatheson
Differential Revision: https://reviews.llvm.org/D93232
For small enough, trivially copyable `T`, take the parameter by-value in
`SmallVector::resize`. Otherwise, when growing, update the argument
appropriately.
Differential Revision: https://reviews.llvm.org/D93781
For small enough, trivially copyable `T`, take the parameter by-value in
`SmallVector::append` and `SmallVector::insert`. Otherwise, when
growing, update the argument appropriately.
Differential Revision: https://reviews.llvm.org/D93780
This reverts commit 56d1ffb927, reapplying
9abac60309, removing insert_one_maybe_copy
and using a helper called forward_value_param instead. This avoids use
of `std::is_same` (or any SFINAE), so I'm hoping it's more portable and
MSVC will be happier.
Original commit message follows:
For small enough, trivially copyable `T`, take the argument by value in
`SmallVector::push_back` and copy it when forwarding to
`SmallVector::insert_one_impl`. Otherwise, when growing, update the
argument appropriately.
Differential Revision: https://reviews.llvm.org/D93779
For small enough, trivially copyable `T`, take the argument by value in
`SmallVector::push_back` and copy it when forwarding to
`SmallVector::insert_one_impl`. Otherwise, when growing, update the
argument appropriately.
Differential Revision: https://reviews.llvm.org/D93779
The number of hardware threads available to a ThreadPool can be limited if an affinity mask is set.
For example:
> start /B /AFFINITY 0xF lld-link.exe ...
Would let LLD only use 4 hyper-threads.
Previously, there was an outstanding issue on Windows Server 2019 on dual-CPU machines, which prevented the use of both CPU sockets. In normal conditions, when no affinity mask was set, ProcessorGroup::AllThreads was different from ProcessorGroup::UsableThreads. The previous code in llvm/lib/Support/Windows/Threading.inc L201 was improperly assuming those two values to be equal, and consequently was limiting the execution to only one CPU socket.
Differential Revision: https://reviews.llvm.org/D92419
to Pass.h.
In some compiler passes like SampleProfileLoaderPass, we want to know which
LTO/ThinLTO phase the pass is in. Currently the phase is represented in enum
class PassBuilder::ThinLTOPhase, so it is only available in PassBuilder and
it also cannot represent phase in full LTO. The patch extends it to include
full LTO phases and move it from PassBuilder.h to Pass.h, then it is much
easier for PassBuilder to communicate with each pass about the current LTO phase.
Differential Revision: https://reviews.llvm.org/D94613
Current code breaks this version of MSVC due to a mismatch between `std::is_trivially_copyable` and `llvm::is_trivially_copyable` for `std::pair` instantiations. Hence I was attempting to use `std::is_trivially_copyable` to set `llvm::is_trivially_copyable<T>::value`.
I spent some time root causing an `llvm::Optional` build error on MSVC 16.8.3 related to the change described above:
```
62>C:\src\ocg_llvm\llvm-project\llvm\include\llvm/ADT/BreadthFirstIterator.h(96,12): error C2280: 'llvm::Optional<std::pair<std::pair<unsigned int,llvm::Graph<4>::NodeSubset> *,llvm::Optional<llvm::Graph<4>::ChildIterator>>> &llvm::Optional<std::pair<std::pair<unsigned int,llvm::Graph<4>::NodeSubset> *,llvm::Optional<llvm::Graph<4>::ChildIterator>>>::operator =(const llvm::Optional<std::pair<std::pair<unsigned int,llvm::Graph<4>::NodeSubset> *,llvm::Optional<llvm::Graph<4>::ChildIterator>>> &)': attempting to reference a deleted function (compiling source file C:\src\ocg_llvm\llvm-project\llvm\unittests\ADT\BreadthFirstIteratorTest.cpp)
...
```
The "trivial" specialization of `optional_detail::OptionalStorage` assumes that the value type is trivially copy constructible and trivially copy assignable. The specialization is invoked based on a check of `is_trivially_copyable` alone, which does not imply both `is_trivially_copy_assignable` and `is_trivially_copy_constructible` are true.
[[ https://en.cppreference.com/w/cpp/named_req/TriviallyCopyable | According to the spec ]], a deleted assignment operator does not make `is_trivially_copyable` false. So I think all these properties need to be checked explicitly in order to specialize `OptionalStorage` to the "trivial" version:
```
/// Storage for any type.
template <typename T, bool = std::is_trivially_copy_constructible<T>::value
                          && std::is_trivially_copy_assignable<T>::value>
class OptionalStorage {
```
Above fixed my build break in MSVC, but I think we need to explicitly check `is_trivially_copy_constructible` too since it might be possible the copy constructor is deleted. Also would be ideal to move over to `std::is_trivially_copyable` instead of the `llvm` namespace version.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D93510
This fixes double printing of insertion debug messages in the
legalizer.
Try to clean up the usage of observers. Currently the use of observers is
pretty hard to follow and it's not clear what is responsible for
them. Observers are referenced in 3 places:
1. In the MachineFunction
2. In the MachineIRBuilder
3. In the LegalizerHelper
The observers in the MachineFunction and MachineIRBuilder are both
called only on insertions, and are redundant with each other. The
source of the double printing was the same observer was added to both
the MachineFunction, and the MachineIRBuilder. One of these references
needs to be removed. Arguably observers in general should be fully
removed from one or the other, but it may be useful to have a local
observer in the MachineIRBuilder that is not added to the function's
observers. Alternatively, the wrapper observer could manage a local
observer in one place.
The LegalizerHelper only ever calls the observer on changing/changed
instructions, and never insertions. Logically these are two different
types of observers, for changes and for insertions.
Additionally, some places used the GISelObserverWrapper when they only
needed a single observer they could use directly.
Setting the observer in the LegalizerHelper constructor is not
flexible enough if the LegalizerHelper is constructed anywhere outside
the one used by the legalizer. AMDGPU calls the LegalizerHelper in
RegBankSelect, and needs to use a local observer to apply the regbank
to newly created instructions. Currently it accomplishes this by
constructing a local MachineIRBuilder. I'm trying to move the
MachineIRBuilder to be owned/maintained by the RegBankSelect pass
itself, but the locally constructed LegalizerHelper would reset the
observer.
Mips also has a special case use of the LegalizationArtifactCombiner
in applyMappingImpl; I think we do need to run the artifact combiner
during RegBankSelect, but in a more consistent way outside of
applyMappingImpl.
For some reason some builds don't like the arrow operator access; using deref-then-access should fix the issue.
/home/buildbots/ppc64le-flang-mlir-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/llvm/include/llvm/ADT/iterator.h:171:34: error: taking the address of a temporary object of type 'llvm::StringRef' [-Waddress-of-temporary]
PointerT operator->() { return &static_cast<DerivedT *>(this)->operator*(); }
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/buildbots/ppc64le-flang-mlir-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/llvm/include/llvm/ADT/StringExtras.h:387:13: note: in instantiation of member function 'llvm::iterator_facade_base<llvm::mapped_iterator<mlir::tblgen::TypeParameter *, (lambda at /home/buildbots/ppc64le-flang-mlir-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/mlir/tools/mlir-tblgen/TypeDefGen.cpp:414:19), llvm::StringRef>, std::random_access_iterator_tag, llvm::StringRef, long, llvm::StringRef *, llvm::StringRef &>::operator->' requested here
Len += I->size();
This reuses the code from yaml2obj (moves it to ELFYAML.h).
With it we can set the `sh_entsize` in a single place in `obj2yaml`.
Note that it also fixes a bug in `yaml2obj`: we did not
set the `sh_entsize` field for the `SHT_ARM_EXIDX` section properly.
Differential revision: https://reviews.llvm.org/D93858
Currently we don't support multiple SHT_SYMTAB_SHNDX sections
or the DT_SYMTAB_SHNDX tag.
This patch implements support for them and fixes
https://bugs.llvm.org/show_bug.cgi?id=43991.
I had to introduce the `struct DataRegion` to ELF.h,
it is used to represent a region that might have no known size.
It is needed, because we don't know the size of the extended
section indices table when it is located via DT_SYMTAB_SHNDX.
In this case we still want to validate that we don't read
past the end of the file.
Differential revision: https://reviews.llvm.org/D92923
Currently dsymutil will silently fail when processing binaries with
Dwarf 5 debug info. This patch adds rudimentary support for Dwarf 5 in
dsymutil.
- Recognize relocations in the debug_addr section.
- Recognize (a subset of) Dwarf 5 form values.
- Emits valid Dwarf 5 compile unit header chains.
To simplify things (and avoid having to emit indexed sections) I decided
to emit the relocated addresses directly in the debug info section.
- DW_FORM_strx gets relocated and rewritten to DW_FORM_strp
- DW_FORM_addrx gets relocated and rewritten to DW_FORM_addr
Obviously there's a lot of work left, but this should be a step in the
right direction.
rdar://62345491
Differential revision: https://reviews.llvm.org/D94323
Also old mir tests are updated to match the latest changes in the STATEPOINT format.
Reviewers: reames, dantrushin
Reviewed By: reames, dantrushin
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D94482
This is a pretty classic optimization. Instead of processing symbol
records and copying them to temporary storage, do a first pass to
measure how large the module symbol stream will be, and then copy the
data into place in the PDB file. This requires deferring relocation until
much later, which accounts for most of the complexity in this patch.
This patch avoids copying the contents of all live .debug$S sections
into heap memory, which is worth about 20% of private memory usage when
making PDBs. However, this is not an unmitigated performance win,
because it can be faster to read dense, temporary, heap data than it is
to iterate symbol records in object file backed memory a second time.
Results on release chrome.dll:
peak mem: 5164.89MB -> 4072.19MB (-1,092.7MB, -21.2%)
wall-j1: 0m30.844s -> 0m32.094s (slightly slower)
wall-j3: 0m20.968s -> 0m20.312s (slightly faster)
wall-j8: 0m19.062s -> 0m17.672s (meaningfully faster)
I gathered similar numbers for a debug, component build of content.dll
in Chrome, and the performance impact of this change was in the noise.
The memory usage reduction was visible and similar.
Because of the new parallelism in the PDB commit phase, more cores makes
the new approach faster. I'm assuming that most C++ developer machines
these days are at least quad core, so I think this is a win.
Differential Revision: https://reviews.llvm.org/D94267
This patch resolves the suboptimal codegen described in http://llvm.org/pr47873.
When CodeGenPrepare lowers select into a conditional branch, a freeze instruction is inserted.
It is then translated to `BRCOND(FREEZE(SETCC))` in SelDag.
The `FREEZE` in the middle of `SETCC` and `BRCOND` was causing a suboptimal code generation however.
This patch adds `BRCOND(FREEZE(cond))` -> `BRCOND(cond)` fold to DAGCombiner to remove the `FREEZE`.
To make this optimization sound, `BRCOND(UNDEF)` should simply nondeterministically jump to the branch target or not, rather than raising UB.
However, it wasn't clear from the comments in ISDOpcodes.h what happens when the condition is undef.
I updated the comments of `BRCOND` to make it explicit (as well as `BR_CC`, which is also a conditional branch instruction).
Note that it diverges from the semantics of `br` instruction in IR, which is explicitly UB.
Since the UB semantics was necessary to explain optimizations that use branching conditions, and SelDag doesn't seem to have such optimization, I think this divergence is okay.
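The fold itself is small; its shape might be sketched as follows
(illustrative, not the verbatim DAGCombiner code):
```
#include "llvm/CodeGen/SelectionDAG.h"

using namespace llvm;

// brcond (freeze Cond), Dest --> brcond Cond, Dest. Sound because BRCOND
// on undef now nondeterministically takes either edge instead of being UB.
static SDValue foldBrcondFreeze(SDNode *N, SelectionDAG &DAG) {
  SDValue Chain = N->getOperand(0);
  SDValue Cond = N->getOperand(1);
  SDValue Dest = N->getOperand(2);
  if (Cond.getOpcode() == ISD::FREEZE && Cond.hasOneUse())
    return DAG.getNode(ISD::BRCOND, SDLoc(N), MVT::Other, Chain,
                       Cond.getOperand(0), Dest);
  return SDValue();
}
```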
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D92015
Most uses of this class just use the default MallocAllocator.
As this contains no fields, we can use the empty base optimisation for BumpPtrAllocatorImpl and save 8 bytes of padding for most use cases.
This prevents using a class that is marked as `final` as the `AllocatorT` template argument.
If one must use an allocator that has been marked as `final`, the simplest way around this is a proxy class.
The class should have all the methods that `AllocatorBase` expects and should forward the calls to your own allocator instance.
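A minimal standalone illustration of the empty-base trick (invented names;
not the upstream code):
```
struct StatelessAllocator {}; // empty, like the default MallocAllocator

template <typename AllocatorT = StatelessAllocator>
class BumpAllocSketch : private AllocatorT { // was: AllocatorT Allocator;
  void *CurPtr = nullptr;
  void *End = nullptr;
};

// The empty base contributes no storage on mainstream ABIs; note that a
// `final` AllocatorT could not be used as a base class, hence the caveat.
static_assert(sizeof(BumpAllocSketch<>) == 2 * sizeof(void *),
              "empty base adds no size");
```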
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D94439
This change modifies the source location formatting from:
LineNumber.Discriminator
to:
LineNumber:ColumnNumber.Discriminator
The motivation here is to enhance location information for inline replay that currently exists for the SampleProfile inliner. This will be leveraged further in inline replay for the CGSCC inliner in the related diff.
The ReplayInlineAdvisor is also modified to read the new format and now takes into account the callee for greater accuracy.
Testing:
ninja check-llvm
Reviewed By: mtrofin
Differential Revision: https://reviews.llvm.org/D94333
Similar to D94125, derive `willreturn` for functions that are `readonly` and
`mustprogress` in FunctionAttrs.
To quote the reasoning from D94125:
Since D86233 we have `mustprogress` which, in combination with
`readonly`, implies `willreturn`. The idea is that every side-effect
has to be modeled as a "write". Consequently, `readonly` means there
is no side-effect, and `mustprogress` guarantees that we cannot "loop"
forever without side-effect.
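The inference itself can be pictured as follows (a sketch only; the actual
change lives in FunctionAttrs and works on SCCs):
```
#include "llvm/IR/Function.h"

using namespace llvm;

// readonly (no side effects) + mustprogress (cannot loop forever without
// side effects) together imply willreturn.
static void inferWillReturn(Function &F) {
  if (F.onlyReadsMemory() && F.hasFnAttribute(Attribute::MustProgress) &&
      !F.hasFnAttribute(Attribute::WillReturn))
    F.addFnAttr(Attribute::WillReturn);
}
```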
Reviewed By: jdoerfert, nikic
Differential Revision: https://reviews.llvm.org/D94502
Use TableGen and information in ACC.td for the Default enum in the OpenACC dialect.
This patch generalizes what was done for OpenMP directives.
Follow up patch after D93576
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D93710
The patch adds the required methods to FixedPointBuilder
for converting between fixed-point and floating point,
and uses them from Clang.
This depends on D54749.
Reviewed By: leonardchan
Differential Revision: https://reviews.llvm.org/D86632
C++14 attributes are superior because they can be applied to functions with inline definition and the syntax is cleaner.
I intend to convert all uses and then remove the macro.
One issue that might hold back switching uses to C++14 attributes is that
clang-format does not put long attributes on separate lines, so formatted code will look like:
```
template <typename T>
[[deprecated("blah blah")]] void
foooooooooooooooooooooooooooo() {
  ...
}
```
Putting long attributes on a separate line would be prettier.
See https://stackoverflow.com/questions/45740466/clang-format-setting-to-control-c-attributes
AttributeMacros probably won't help because it can't match the custom message.
https://clang.llvm.org/docs/ClangFormatStyleOptions.html
Reviewed By: rriddle, MaskRay
Differential Revision: https://reviews.llvm.org/D94219
Remove the InsertionPoint argument from SlotIndexes::insertMBBInMaps
because it was confusing: what does it mean to insert a new block
between two instructions, in the middle of an existing block?
Instead, support the case that MachineBasicBlock::splitAt really needs,
where the new block contains some instructions that are already in the
maps because they have been moved there from the tail of the previous
block.
In all other use cases the new block is empty.
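For reference, the splitAt use case looks roughly like this (signatures approximate):
```
#include "llvm/CodeGen/LiveIntervals.h"
#include "llvm/CodeGen/MachineBasicBlock.h"

using namespace llvm;

// Instructions after SplitInst move into the returned block; they are
// already present in the slot-index maps, which is exactly the case
// insertMBBInMaps now supports.
static MachineBasicBlock *splitAfter(MachineBasicBlock &MBB,
                                     MachineInstr &SplitInst,
                                     LiveIntervals &LIS) {
  return MBB.splitAt(SplitInst, /*UpdateLiveIns=*/true, &LIS);
}
```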
Based on work by Carl Ritson!
Differential Revision: https://reviews.llvm.org/D94311
In ST mode, flat scratch instructions have neither an sgpr nor a vgpr
for the address. This led to an assertion when inserting hard clauses.
Differential Revision: https://reviews.llvm.org/D94406
Added a utility function to the Value class that prints the block name, using
block labels for unnamed blocks.
Changed LICM to call this function in its debug output.
Patch by Xiaoqing Wu <xiaoqing_wu@apple.com>
Differential Revision: https://reviews.llvm.org/D93577
Passes in the new PostAllocationPasses list will run immediately after memory
allocation and address assignment for defined symbols, and before
JITLinkContext::notifyResolved is called. These passes can set up state
associated with the addresses of defined symbols before any query for these
addresses completes.
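A hypothetical sketch of installing such a pass (the logging body is illustrative, not part of the patch):
```
#include "llvm/ExecutionEngine/JITLink/JITLink.h"
#include "llvm/Support/Debug.h"

using namespace llvm;

// Runs after allocation and address assignment, before
// JITLinkContext::notifyResolved fires.
static void addAddressLoggingPass(jitlink::PassConfiguration &Config) {
  Config.PostAllocationPasses.push_back([](jitlink::LinkGraph &G) -> Error {
    for (auto *Sym : G.defined_symbols())
      if (Sym->hasName())
        dbgs() << Sym->getName() << " @ " << Sym->getAddress() << "\n";
    return Error::success();
  });
}
```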
Define the `vfclass` IR intrinsics for the respective V instructions.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Evandro Menezes <evandro.menezes@sifive.com>
Differential Revision: https://reviews.llvm.org/D94356
For pointers to unrelated storage, the standard only guarantees an ordered
comparison through `std::less`. Split out some helpers that do that and update all the
code that was comparing using `<` and friends (mostly assertions).
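A minimal sketch of such a helper (the name is illustrative, not necessarily the one the patch adds):
```
#include <functional>

// std::less induces a total order even for pointers into unrelated
// allocations, where the built-in '<' is not guaranteed to.
template <typename T> bool objectOrderLess(const T *LHS, const T *RHS) {
  return std::less<const T *>()(LHS, RHS);
}
```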
Differential Revision: https://reviews.llvm.org/D93777
Functions that are renamed under -funique-internal-linkage-names have their debug linkage name updated as well.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D93747
Before this patch there was generic mapping from vector_extract
to G_EXTRACT_VECTOR_ELT added in SelectionDAGCompat.td. That
mapping is now replaced by a mapping from extractelt instead.
The reasoning is that vector_extract is marked as deprecated,
so it is assumed that a majority of targets will use extractelt
and not vector_extract (and that the long term solution for all
targets would be to use extractelt).
Targets like AArch64 that still use vector_extract can add an
additional mapping from the deprecated vector_extract as target
specific tablegen definitions. Such a mapping is added for AArch64
in this patch to avoid breaking tests.
When adding the extractelt => G_EXTRACT_VECTOR_ELT mapping we
triggered some new code paths in GlobalISelEmitter, ending up in
an assert when trying to import a pattern containing EXTRACT_SUBREG
for ARM. Therefore this patch also adds a "failedImport" warning
for that situation (instead of hitting the assert).
Differential Revision: https://reviews.llvm.org/D93416
Introduce a new mode of operation for -print-changed that only reports
when a pass changes the IR, with all of the other messages suppressed (i.e.,
no initial IR and no messages about ignored, filtered, or non-modifying
passes).
The option processing for -print-changed is changed to take an optional
string indicating options for print-changed. Initially, the only option
supported is quiet (as described above). This new quiet mode is specified
with -print-changed=quiet while -print-changed will continue to function
in the same way. It is intended that there will be more options in the
future.
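A typical invocation of the quiet mode might look like this (file and pass names are illustrative):
```
opt -S -passes=instcombine -print-changed=quiet test.ll
```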
Author: Jamie Schmeiser <schmeise@ca.ibm.com>
Reviewed By: aeubanks (Arthur Eubanks)
Differential Revision: https://reviews.llvm.org/D92589
Now that we flush the local value map for every instruction, we don't
need any extra flushes for specific cases. Also, LastFlushPoint is
not used for anything. Follow-up to commit c161665 (D91734).
This reapplies commit 3fd39d3.
Differential Revision: https://reviews.llvm.org/D92338
Local values are constants or addresses that can't be folded into
the instruction that uses them. FastISel materializes these in a
"local value" area that always dominates the current insertion
point, to try to avoid materializing these values more than once
(per block).
https://reviews.llvm.org/D43093 added code to sink these local
value instructions to their first use, which has two beneficial
effects. One, it is likely to avoid some unnecessary spills and
reloads; two, it allows us to attach the debug location of the
user to the local value instruction. The latter effect can
improve the debugging experience for debuggers with a "set next
statement" feature, such as the Visual Studio debugger and PS4
debugger, because instructions to set up constants for a given
statement will be associated with the appropriate source line.
There are also some constants (primarily addresses) that could be
produced by no-op casts or GEP instructions; the main difference
from "local value" instructions is that these are values from
separate IR instructions, and therefore could have multiple users
across multiple basic blocks. D43093 avoided sinking these, even
though they were emitted to the same "local value" area as the
other instructions. The patch comment for D43093 states:
Local values may also be used by no-op casts, which adds the
register to the RegFixups table. Without reversing the RegFixups
map direction, we don't have enough information to sink these
instructions.
This patch undoes most of D43093, and instead flushes the local
value map after(*) every IR instruction, using that instruction's
debug location. This avoids the sometimes-incorrect locations used
previously, and emits instructions in a more natural order.
In addition, constants materialized due to PHI instructions are
not assigned a debug location immediately; instead, when the
local value map is flushed, if the first local value instruction
has no debug location, it is given the same location as the
first non-local-value-map instruction. This prevents PHIs
from introducing unattributed instructions, which would either
be implicitly attributed to the location for the preceding IR
instruction, or given line 0 if they are at the beginning of
a machine basic block. Neither of those consequences is good
for debugging.
This does mean materialized values are not re-used across IR
instruction boundaries; however, only about 5% of those values
were reused in an experimental self-build of clang.
(*) Actually, just prior to the next instruction. It seems like
it would be cleaner the other way, but I was having trouble
getting that to work.
This reapplies commits cf1c774d and dc35368c, and adds the
modification to PHI handling, which should avoid problems
with debugging under gdb.
Differential Revision: https://reviews.llvm.org/D91734
The existing implementation of parallel region merging applies only to
consecutive parallel regions that have speculatable sequential
instructions in-between. This patch lifts this limitation to expand
merging with any sequential instructions in-between, except calls to
unmergable OpenMP runtime functions. In-between sequential instructions
in the merged region are sequentialized in a "master" region and any
output values are broadcast to the following parallel regions and the
sequential region continuation of the merged region.
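A source-level picture of the merge, with illustrative function names (the real transformation is performed on IR by OpenMPOpt):
```
#include <omp.h>

extern void work1();
extern int compute();
extern void work2(int);

// Before: two parallel regions separated by arbitrary sequential code.
void before() {
  #pragma omp parallel
  work1();
  int V = compute(); // no longer required to be speculatable
  #pragma omp parallel
  work2(V);
}

// After: one merged region; the in-between code is sequentialized in a
// "master" region and V is broadcast to all threads via a shared variable.
void after() {
  int V;
  #pragma omp parallel shared(V)
  {
    work1();
    #pragma omp barrier
    #pragma omp master
    V = compute();
    #pragma omp barrier // V is now visible to every thread
    work2(V);
  }
}
```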
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D90909
This reverts commit 8e3e148c.
This commit fixes two issues with the original patch:
* The sanitizer build bot reported an uninitialized value. This was caused by normalizeStringIntegral not returning None on failure.
* Some build bots complained about inaccessible keypaths. To mitigate that, "this->" was added back to the keypath to restore the previous behavior.
PreFixupPasses better reflects when these passes will run.
A future patch will (re)introduce a PostAllocationPasses list that will run
after allocation, but before JITLinkContext::notifyResolved is called to notify
the rest of the JIT about the resolved symbol addresses.