llvm-project

Commit Graph

Author	SHA1	Message	Date
Nikita Popov	0d3807b365	[MergeICmps] Separate out BCECmp and use Optional (NFC) Separate out the BCECmp part from BCECmpBlock, which just stores the comparison atoms without the branch instruction. At the same time switch the code to return Optional<> rather than objects in invalid state and partially constructed objects.	2021-07-26 17:06:43 +02:00
Sander de Smalen	981e9dce54	[LV] Don't assume isScalarAfterVectorization if one of the uses needs widening. This fixes an issue that was found in D105199, where a GEP instruction is used both as the address of a store, as well as the value of a store. For the former, the value is scalar after vectorization, but the latter (as value) requires widening. Other code in that function seems to prevent similar cases from happening, but it seems this case was missed. Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D106164	2021-07-26 16:01:55 +01:00
Bradley Smith	81eafb8a37	[AArch64][SVE] Break false dependencies for inactive lanes of unary operations Differential Revision: https://reviews.llvm.org/D105889	2021-07-26 15:01:21 +00:00
Shilei Tian	3274cdc83e	[Clang][OpenMP] Remove the mandatory flush for capture for OpenMP 5.1 In OpenMP 5.1: > If the `write` or `update` clause is specified, the atomic operation is not an atomic conditional update for which the comparison fails, and the effective memory ordering is `release`, `acq_rel`, or `seq_cst`, the strong flush on entry to the atomic operation is also a release flush. If the `read` or `update` clause is specified and the effective memory ordering is `acquire`, `acq_rel`, or `seq_cst` then the strong flush on exit from the atomic operation is also an acquire flush. In OpenMP 5.0: > If the `write`, `update`, or `capture` clause is specified and the `release`, `acq_rel`, or `seq_cst` clause is specified then the strong flush on entry to the atomic operation is also a release flush. If the `read` or `capture` clause is specified and the `acquire`, `acq_rel`, or `seq_cst` clause is specified then the strong flush on exit from the atomic operation is also an acquire flush. From my understanding, in OpenMP 5.1, `capture` is removed from the requirement for flush, therefore we don't have to enforce it. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D100768	2021-07-26 11:00:44 -04:00
Ulrich Weigand	8cd8120a7b	[SystemZ] Add support for new cpu architecture - arch14 This patch adds support for the next-generation arch14 CPU architecture to the SystemZ backend. This includes: - Basic support for the new processor and its features. - Detection of arch14 as host processor. - Assembler/disassembler support for new instructions. - New LLVM intrinsics for certain new instructions. - Support for low-level builtins mapped to new LLVM intrinsics. - New high-level intrinsics in vecintrin.h. - Indicate support by defining __VEC__ == 10304. Note: No currently available Z system supports the arch14 architecture. Once new systems become available, the official system name will be added as supported -march name.	2021-07-26 16:57:28 +02:00
Florian Hahn	7a1e73f0b9	Recommit "[VPlan] Add recipe for first-order rec phis, make splicing explicit." This reverts the revert commit `b1777b04dc`. The patch originally got reverted due to a crash: https://bugs.chromium.org/p/chromium/issues/detail?id=1232798#c2 The underlying issue was that we were not using the stored values from the modified memory recipes, but the out-of-date values directly from the IR (accessed via the VPlan). This should be fixed in `d995d6376`. A reduced version of the reproducer has been added in `93664503be`.	2021-07-26 15:50:30 +01:00
Mark de Wever	1139fd4270	[libc++][ci] Detect not committed generated files. The Generated output CI job only tests for modified files. This job should also fail if the generated output contains new files. It would be possible to test modified and untracked files in one execution of `git ls-files`. However the diff is stored as an artifact so the execution of `git diff` would still be required. Discussion: Would it be better to do `git ls-files -om` and remove the execution of `! grep -q '^--- a' ${BUILD_DIR}/generated_output.patch || false` ? (Obviously then the name `generated_output.untracked` should change to something like `generated_output.status`) Reviewed By: #libc, ldionne Differential Revision: https://reviews.llvm.org/D106534	2021-07-26 16:41:57 +02:00
Nikita Popov	33146857e9	[IR] Consider non-willreturn as side effect (PR50511) This adjusts mayHaveSideEffect() to return true for !willReturn() instructions. Just like other side-effects, non-willreturn calls (aka "divergence") cannot be removed and cannot be reordered relative to other side effects. This fixes a number of bugs where non-willreturn calls are either incorrectly dropped or moved. In particular, it also fixes the last open problem in https://bugs.llvm.org/show_bug.cgi?id=50511. I performed a cursory review of all current mayHaveSideEffect() uses, which convinced me that these are indeed the desired default semantics. Places that do not want to consider non-willreturn as a sideeffect generally do not want mayHaveSideEffect() semantics at all. I identified two such cases, which are addressed by D106591 and D106742. Finally, there is a use in SCEV for which we don't really have an appropriate API right now -- what it wants is basically "would this be considered forward progress". I've just spelled out the previous semantics there. Differential Revision: https://reviews.llvm.org/D106749	2021-07-26 16:35:14 +02:00
Benjamin Kramer	404f0d4f7c	Simplify away some SmallVector copies. NFCI. The lifetime of the initializer list is the full expression, so we can skip storing it in a temporary vector.	2021-07-26 16:33:38 +02:00
Gabor Marton	4761321d49	[Analyzer][solver][NFC] print constraints deterministically (ordered by their string representation) This change is an extension to D103967 where I added dump methods for (dis)equality classes of the State. There, the (dis)equality classes and their contents are dumped in an ordered fashion, they are ordered based on their string representation. This is very useful once we start to use FileCheck to test the State dump in certain tests. Differential Revision: https://reviews.llvm.org/D106642	2021-07-26 16:27:23 +02:00
Jeremy Morse	f86694cb80	[InstrRef][AArch64][1/4] Accept constant physreg variable locations Late in SelectionDAG we join up instruction numbers with their defining instructions, if it couldn't be done during the main part of SelectionDAG. One exception is function arguments, where we have to point a DBG_PHI instruction at the incoming live register, as they don't have a defining instruction. This patch adds another exception, for constant physregs, like aarch64 has. It may seem wasteful to use two instructions where we could use a single DBG_VALUE, however the whole point of instruction referencing is to decouple the identification of values from the specification of where variable location ranges start. (Part of my aarch64 work to ease adoption of instruction referencing, as in the meta comment on D104520) Differential Revision: https://reviews.llvm.org/D104520	2021-07-26 15:26:15 +01:00
Florian Hahn	93664503be	[LV] Add test to store a first-order rec via interleave group. This is a reduced version of the reproducer from https://bugs.chromium.org/p/chromium/issues/detail?id=1232798#c2	2021-07-26 15:20:04 +01:00
Alexey Bataev	6ca48efcf6	[SLP]Fix costs calculations. Need to fix several cost-related problems. The final type may be defined incorrectly because of too early definition (we may end up with the wider type), the CommonCost should not be redefined in ExtractElements cost related calculations and the shuffle of the final insertelements vectors should be calculated as a cost of single vector permutations + costs of two vector permutations for other n-1 incoming vectors. Differential Revision: https://reviews.llvm.org/D106578	2021-07-26 07:14:03 -07:00
Anastasia Stulova	81600160b3	[OpenCL] Change default standard version to CL1.2 Set default version for OpenCL C to 1.2. This means that the absence of any standard flag will be equivalent to passing '-cl-std=CL1.2'. Note that this patch also fixes incorrect version check for the pointer to pointer kernel arguments diagnostic and atomic test. Differential Revision: https://reviews.llvm.org/D106504	2021-07-26 15:04:34 +01:00
gbreynoo	87ed73fe6e	[llvm-readobj] Display multiple function names for stack size entries The current implementation of displaying .stack_size information presumes that each entry represents a single function but this is not always the case. For example with the use of ICF multiple functions can be represented with the same code, meaning that the address found in a .stack_size entry corresponds to multiple function symbols. This change allows multiple function names to be displayed when appropriate. Differential Revision: https://reviews.llvm.org/D105884	2021-07-26 14:49:53 +01:00
Jay Foad	59f6865231	[AMDGPU][GISel] Fix MMO for raw/struct buffer access with non-constant offset Codegen for the raw/struct buffer access intrinsics would update the offset in the MMO to reflect the combined offset, if it was known to be constant. If the combined offset was not known to be constant, or if there was an index, it would set the offset in the MMO to 0. This is unsafe because it makes it look like the access does not alias with another access with a fixed non-zero offset. Fix these cases by setting the pointer in the MMO to null, to reflect the fact that we do not have any known IR value pointer + constant offset for the access. D106284 did this for SelectionDAG. This is the corresponding fix for GlobalISel. Differential Revision: https://reviews.llvm.org/D106451	2021-07-26 14:27:30 +01:00
Jay Foad	683b9ed0d5	[AMDGPU] Pre-commit global-isel test case for D106451 This test case shows the scheduler wrongly reordering two buffer accesses that might alias.	2021-07-26 14:27:30 +01:00
Jay Foad	9ac10658ae	[AMDGPU] Fix MMO for raw/struct buffer access with non-constant offset Codegen for the raw/struct buffer access intrinsics would update the offset in the MMO to reflect the combined offset, if it was known to be constant. If the combined offset was not known to be constant, or if there was an index, it would set the offset in the MMO to 0. This is unsafe because it makes it look like the access does not alias with another access with a fixed non-zero offset. Fix these cases by setting the pointer in the MMO to null, to reflect the fact that we do not have any known IR value pointer + constant offset for the access. Differential Revision: https://reviews.llvm.org/D106284	2021-07-26 14:27:30 +01:00
David Green	010f8e3057	[ARM] Ensure correct regclass in distributing postinc The register class required for some MVE loads/stores is more constrained than the register we use when creating postinc. Make sure we constrain the register class to keep the code correct.	2021-07-26 14:26:38 +01:00
Tim Northover	a487a49acc	AArch64: support i128 (& larger) returns in GlobalISel	2021-07-26 14:16:35 +01:00
Nikita Popov	ffb3277b00	[SimplifyCFG] Improve store speculation check isSafeToSpeculateStore() looks for a preceding store to the same location to make sure that introducing a new store of the same value is safe. It currently bails on intervening mayHaveSideEffect() instructions. However, I believe just checking mayWriteToMemory() is sufficient there -- we just need to make sure that we know which value was stored, we don't care if we can unwind in the meantime. While looking into this, I started having some doubts about the correctness of the transform with regard to thread safety. While we don't try to hoist non-simple stores, I believe we also need to make sure that the preceding store is simple as well. Otherwise we could introduce a spurious non-atomic write after an atomic write -- under our memory model this would result in a subsequent undef atomic read, even if the second write stores the same value as the first. Example: https://alive2.llvm.org/ce/z/q_3YAL Differential Revision: https://reviews.llvm.org/D106742	2021-07-26 15:01:00 +02:00
Kerry McLaughlin	e484e1ae03	[SVE] Fix casts to <FixedVectorType> in truncateToMinimalBitwidths Fixes more casts to `<FixedVectorType>` for the cases where the instruction is a Insert/ExtractElementInst. For fixed-width, this part of truncateToMinimalBitWidths is tested by AArch64/type-shrinkage-insertelt.ll. I attempted to write a test case for this part of truncateToMinimalBitWidths which uses scalable vectors, but was unable to add one. The tests in type-shrinkage-insertelt.ll rely on scalarization to create extract element instructions for instance, which is not possible for scalable vectors. Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D106163	2021-07-26 13:44:51 +01:00
Alexey Bataev	d7cb2a0796	Revert "[SLP]Fix costs calculations." This reverts commit `a053afed49` to fix buildbots.	2021-07-26 05:42:34 -07:00
Caroline Concatto	bf28111ebd	[AArch64][SVE] Remove vector_splice from AddedComplexity pattern The pattern for vector_splice with Index equal or bigger than zero was misplaced in the AddedComplexity = 1 pattern in the AArch64 tablegen file. This patch fixes it by removing vector_splice pattern from inside AddedComplexity = 1.	2021-07-26 13:35:51 +01:00
Tres Popp	539437e288	[mlir] split type conversion to two lines for GCC's sake	2021-07-26 14:15:47 +02:00
Alexey Bataev	a053afed49	[SLP]Fix costs calculations. Need to fix several cost-related problems. The final type may be defined incorrectly because of too early definition (we may end up with the wider type), the CommonCost should not be redefined in ExtractElements cost related calculations and the shuffle of the final insertelements vectors should be calculated as a cost of single vector permutations + costs of two vector permutations for other n-1 incoming vectors. Differential Revision: https://reviews.llvm.org/D106578	2021-07-26 04:37:22 -07:00
Paul Walker	8a8d01d58c	[NFC] Change VFShape so it contains an ElementCount rather than separate VF and IsScalable properties. Differential Revision: https://reviews.llvm.org/D106750	2021-07-26 12:25:46 +01:00
Philipp Krones	46c0366877	[Inliner] Make the CallPenalty configurable Tests with multiple benchmarks, like Embench [1], showed that the CallPenalty magic number has the most influence on inlining decisions when optimizing for size. On the other hand, there was no good default value for this parameter. Some benchmarks profited strongly from a reduced call penalty. One example is the picojpeg benchmark compiled for RISC-V, which got 6% smaller with a CallPenalty of 10 instead of 12. Other benchmarks increased in size, like matmult. This commit makes the compromise of turning the magic number constant of CallPenalty into a configurable value. This introduces the flag `--inline-call-penalty`. With that flag users can fine tune the inliner to their needs. The CallPenalty constant was also used for loops. This commit replaces the CallPenalty constant with a new LoopPenalty constant that is now used instead. This is a slimmed down version of https://reviews.llvm.org/D30899 [1]: https://github.com/embench/embench-iot Differential Revision: https://reviews.llvm.org/D105976	2021-07-26 12:07:49 +01:00
Florian Hahn	d995d63767	[VPlan] Use stored value from recipes for interleave groups. Instead of getting the VPValue for the stored IR values through the current plan, use the stored value of the recipes directly. This way, the correct VPValues are used if the store recipes have been modified in the VPlan and the IR value is not correct any longer. This can happen, e.g. due to D105008.	2021-07-26 12:05:23 +01:00
Dylan Fleming	20b0fa91c9	[SVE] Add support for folding for select + masked loads Add folds to instcombine to support the removal of select instruction when the masked_load is guaranteed to zero the same lanes, i.e. select(mask, mload(,,mask,0), 0) -> mload(,,mask,0). Patch originally authored by @paulwalker-arm Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D106376	2021-07-26 11:58:41 +01:00
Caroline Concatto	0bfc26e3a4	[SVE][AArch64] Improve code generation for vector_splice for Imm > 0 This patch implements vector_splice in tablegen for all cases when the Immediate is positive and lower than the known minimum value of a scalable vector. Vector_splice can be implemented using SVE instruction EXT. For instance : @llvm.experimental.vector.splice(Vector_1, Vector_2, Imm) @llvm.experimental.vector.splice(<A,B,C,D>, <E,F,G,H>, 1) ==> <B, C, D, E> EXT Vector_1, Vector_2, Imm // Vector_1 = B, C, D + Vector_2 = E Depends on D105633 Differential Revision: https://reviews.llvm.org/D106273	2021-07-26 11:45:46 +01:00
David Sherwood	b2a5f0029f	Fix test failures caused by `0aff1798b5`	2021-07-26 11:40:26 +01:00
Caroline Concatto	73e4e9cd00	[AArch64][SVE] Improve code generation for vector_splice for Imm == -1 This patch implements vector_splice in tablegen for: a) when the immediate is equal to -1 (Imm==-1) and uses: INSR + LASTB For instance : @llvm.experimental.vector.splice(Vector_1, Vector_2, -1) @llvm.experimental.vector.splice(<A,B,C,D>, <E,F,G,H>, -1) ==> <D, E, F, G> LASTB RegLast, Vector_1 // RegLast = D INSR Res, (Vector_1 >> 1), RegLast // Res = D + E, F, G Differential Revision: https://reviews.llvm.org/D105633	2021-07-26 11:25:01 +01:00
Simon Pilgrim	c8472db0a8	[X86][AVX] Prefer vinsertf128 to vperm2f128 on AVX1 targets Splatting the lower xmm with vinsertf128 is at least as quick as vperm2f128, and a lot faster on some AMD targets. First step towards PR50053	2021-07-26 11:11:56 +01:00
Simon Pilgrim	f64e251560	[X86][SSE] Don't scrub address math from interleaved shuffle tests	2021-07-26 11:03:31 +01:00
Sam McCall	e9274af718	Revert "[clangd] Avoid range-loop init-list lifetime subtleties." This reverts commit `253b8145de`. This doesn't actually fix anything - I should stop guessing. See https://github.com/clangd/clangd/issues/800 for update	2021-07-26 11:38:47 +02:00
Cullen Rhodes	e6ff9179ce	[AArch64][AsmParser] NFC: Parser.getTok().getLoc() -> getLoc() Reviewed By: tmatheson Differential Revision: https://reviews.llvm.org/D106635	2021-07-26 09:36:34 +00:00
David Sherwood	0aff1798b5	[Analysis] Add simple cost model for strict (in-order) reductions I have added a new FastMathFlags parameter to getArithmeticReductionCost to indicate what type of reduction we are performing: 1. Tree-wise. This is the typical fast-math reduction that involves continually splitting a vector up into halves and adding each half together until we get a scalar result. This is the default behaviour for integers, whereas for floating point we only do this if reassociation is allowed. 2. Ordered. This now allows us to estimate the cost of performing a strict vector reduction by treating it as a series of scalar operations in lane order. This is the case when FP reassociation is not permitted. For scalable vectors this is more difficult because at compile time we do not know how many lanes there are, and so we use the worst case maximum vscale value. I have also fixed getTypeBasedIntrinsicInstrCost to pass in the FastMathFlags, which meant fixing up some X86 tests where we always assumed the vector.reduce.fadd/mul intrinsics were 'fast'. New tests have been added here: Analysis/CostModel/AArch64/reduce-fadd.ll Analysis/CostModel/AArch64/sve-intrinsics.ll Transforms/LoopVectorize/AArch64/strict-fadd-cost.ll Transforms/LoopVectorize/AArch64/sve-strict-fadd-cost.ll Differential Revision: https://reviews.llvm.org/D105432	2021-07-26 10:26:06 +01:00
Fraser Cormack	f924a3d474	[SelectionDAG] Support scalable-vector splats in yet more cases This patch extends support for (scalable-vector) splats in the DAGCombiner via the `ISD::matchBinaryPredicate` function, which enable a variety of simple combines of constants. Users of this function may now have to distinguish between `BUILD_VECTOR` and `SPLAT_VECTOR` vector operands. The way of dealing with this in-tree follows the approach added for `ISD::matchUnaryPredicate` implemented in D94501. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D106575	2021-07-26 10:15:08 +01:00
Kadir Cetinkaya	0a3c7960cb	Revert "Revert D106562 "[clangd] Get rid of arg adjusters in CommandMangler"" This reverts commit `2aa0cf19e7`. Get rid of reference to the temporary.	2021-07-26 11:13:22 +02:00
Jon Chesterfield	2a613a7790	[libomptarget] Build amdgpu plugin without hsa Default to building the amdgpu plugin to use dlopen when hsa is not found instead of disabling it. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D106600	2021-07-26 09:54:51 +01:00
Jon Chesterfield	93fe84d32f	[libomptarget][nfc] Squash unused variable warning Suppress only current warning on openmp-clang-x86_64-linux-debian Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D106777	2021-07-26 09:54:31 +01:00
Vladislav Vinogradov	eb6c63cb0b	[mlir] Fix RankedTensorType::walkImmediateSubElements method Add 'encoding' attribute visitor. Without it ASM printer doesn't use attribute aliases for 'encoding'. Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D105554	2021-07-26 11:49:25 +03:00
Guillaume Chatelet	47afd43eaa	[libc] fix LibcUnitTestMain when building with shared libraries	2021-07-26 08:43:45 +00:00
Lang Hames	cdcc354768	[ORC][ORC-RT] Add initial Objective-C and Swift support to MachOPlatform. This allows ORC to execute code containing Objective-C and Swift classes and methods (provided that the language runtime is loaded into the executor).	2021-07-26 18:02:01 +10:00
Marcel Koester	0425332015	[mlir] Added new RegionBranchTerminatorOpInterface and adapted uses of hasTrait<ReturnLike>. This CL adds a new RegionBranchTerminatorOpInterface to query information about operands that can be passed to successor regions. Similar to the BranchOpInterface, it allows to freely define the involved operands. However, in contrast to the BranchOpInterface, it expects an additional region number to distinguish between various use cases which might require different operands passed to different regions. Moreover, we added new utility functions (namely getMutableRegionBranchSuccessorOperands and getRegionBranchSuccessorOperands) to query (mutable) operand ranges for operations equipped with the ReturnLike trait and/or implementing the newly added interface. This simplifies reasoning about terminators in the scope of the nested regions. We also adjusted the SCF.ConditionOp to benefit from the newly added capabilities. Differential Revision: https://reviews.llvm.org/D105018	2021-07-26 06:39:31 +02:00
Michael Kruse	ae6b400002	[Preprocessor] Implement -fminimize-whitespace. This patch adds the -fminimize-whitespace with the following effects: * If combined with -E, remove as much non-line-breaking whitespace as possible. * If combined with -E -P, removes as much whitespace as possible, including line-breaks. The motivation is to reduce the amount of insignificant changes in the preprocessed output with source files where only whitespace has been changed (add/remove comments, clang-format, etc.) which is in particular useful with ccache. A patch for ccache for using this flag has been proposed to ccache as well: https://github.com/ccache/ccache/pull/815, which will use -fnormalize-whitespace when clang-13 has been detected, and additionally uses -P in "unify_mode". ccache already had a unify_mode in an older version which was removed because of problems that using the preprocessor itself does not have (such that the custom tokenizer did not recognize C++11 raw strings). This patch slightly reorganizes which part is responsible for adding newlines that are required for semantics. It is now either startNewLineIfNeeded() or MoveToLine() but never both; this avoids the ShouldUpdateCurrentLine workaround and avoids redundant lines being inserted in some cases. It also fixes a mandatory newline not inserted after a _Pragma("...") that is expanded into a #pragma. Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D104601	2021-07-25 23:30:57 -05:00
Yuanfang Chen	1558bb80c0	[Object] make SourceMgr available to MCContext during inline asm symbols collection Fixes PR51210.	2021-07-25 21:23:03 -07:00
Esme-Yi	0d3e4d9d4d	[Debug-Info][llvm-dwarfdump] Don't use DW_FORM_data4/8 to encode the constants for DW_AT_data_member_location. Summary: In DWARF v3, DW_FORM_data4/8 in DW_AT_data_member_location are interpreted as location list pointers. Interpreting constants as pointers is not expected, so we use DW_FORM_udata to encode the constants. Reviewed By: probinson Differential Revision: https://reviews.llvm.org/D105687	2021-07-26 03:47:02 +00:00
Mehdi Amini	3211eadfe0	Revert "Build libSupport with -Werror=global-constructors (NFC)" This reverts commit `579cc9ad2e`. This breaks on Windows.	2021-07-26 03:08:26 +00:00


394949 Commits