llvm-project

Commit Graph

Author	SHA1	Message	Date
Simon Pilgrim	86bb7df6e6	[CostModel][X86] getScalarizationOverhead - handle vXi1 extracts with MOVMSK (pre-AVX512) We can quickly extract multiple elements of a bool vector using MOVMSK ops - since we don't know what generated the vXi1, I've been optimistic and assumed we can use PMOVMSKB to extract the maximum number of bools with a single op. The MOVMSK pattern isn't great for extract+insert round trips as vXi1 type legalization can interfere with this a lot - so this relies on us remaining good at using getScalarizationOverhead properly (and tagging both Insert and Extract modes) for those round trip cases. The AVX512 KMOV codegen for bool extraction is a bit of a mess so for now I've not included that - the per-element cost is a lot more accurate for current codegen.	2022-05-02 09:58:39 +01:00
Balazs Benics	fd7efe33f1	[analyzer] Fix cast evaluation on scoped enums in ExprEngine We ignored the cast if the enum was scoped. This is bad since there is no implicit conversion from the scoped enum to the corresponding underlying type. The fix is basically: isIntegralOrEnumerationType() -> isIntegralOrUnscopedEnumerationType() This surfaced as crashes when analyzing LLVM itself using Z3 refutation. Refutation synthesized the given Z3 Binary expression (`BO_And` of `unsigned char` aka. 8 bits and an `int` 32 bits) with the wrong bitwidth in the end, which triggered an assert. Now, we evaluate the cast according to the standard. This bug could have been triggered using the Z3 CM according to https://bugs.llvm.org/show_bug.cgi?id=44030 Fixes #47570 #43375 Reviewed By: martong Differential Revision: https://reviews.llvm.org/D85528	2022-05-02 10:54:26 +02:00
Nikita Popov	aae5f8115a	[Local] Consider atomic loads from constant global as dead Per the guidance in https://llvm.org/docs/Atomics.html#atomics-and-ir-optimization, an atomic load from a constant global can be dropped, as there can be no stores to synchronize with. Any write to the constant global would be UB. IPSCCP will already drop such loads, but the main helper in Local doesn't recognize this currently. This is motivated by D118387. Differential Revision: https://reviews.llvm.org/D124241	2022-05-02 10:52:58 +02:00
Shraiysh Vaishay	a60fda59dc	[mlir][OpenMP] Restrict types for omp.parallel args This patch restricts the value of `if` clause expression to an I1 value. It also restricts the value of `num_threads` clause expression to an I32 value. Reviewed By: kiranchandramohan Differential Revision: https://reviews.llvm.org/D124142	2022-05-02 14:17:34 +05:30
owenca	c8603db071	[clang-format] Fix a bug that misformats Access Specifier after *[] Fixes #55132. Differential Revision: https://reviews.llvm.org/D124589	2022-05-02 01:39:26 -07:00
Balazs Benics	464c9833df	[analyzer][docs] Document alpha.security.cert.pos.34c limitations Reviewed By: martong Differential Revision: https://reviews.llvm.org/D124659	2022-05-02 10:37:23 +02:00
Balazs Benics	5a2e595eb8	[analyzer] Fix Static Analyzer g_memdup false-positive `g_memdup()` allocates and copies memory, thus we should not assume that the returned memory region is uninitialized because it might not be the case. PS: It would be even better to copy the bindings to mimic the actual content of the buffer, but this works too. Fixes #53617 Reviewed By: martong Differential Revision: https://reviews.llvm.org/D124436	2022-05-02 10:35:51 +02:00
Nikita Popov	597946a4dd	[ConstantFold] Don't convert getelementptr to ptrtoint+inttoptr ConstantFolding currently converts "getelementptr i8, Ptr, (sub 0, V)" to "inttoptr (sub (ptrtoint Ptr), V)". This transform is, taken by itself, correct, but does come with two issues: 1. It unnecessarily broadens provenance by introducing an inttoptr. We generally prefer not to introduce inttoptr during optimization. 2. For the case where V == ptrtoint Ptr, this folds to inttoptr 0, which further folds to null. In that case provenance becomes incorrect. This has been observed as a real-world miscompile with rustc. We should probably address that incorrect inttoptr 0 fold at some point, but in either case we should also drop this inttoptr-introducing fold. Instead, replace it with a fold rooted at ptrtoint(getelementptr), which seems to cover the original motivation for this fold (test2 in the changed file). Differential Revision: https://reviews.llvm.org/D124677	2022-05-02 10:24:46 +02:00
David Green	986de8f50b	[AArch64] Add more comprehensive reverse shuffle costmodel tests. NFC	2022-05-02 09:16:57 +01:00
Alex Zinenko	946311b893	[mlir] support isa/cast/dyn_cast<Operation *>(operation) This enables one to write generic code that can be instantiated for both specific operation classes and the common base class without specialization. Examples include functions that take/return ops, such as: ```mlir template <typename FnTy> void applyIf(FnTy &&lambda, ...) { for (Operation *op : ...) { auto specific = dyn_cast<function_traits<FnTy>::template arg_t<0>>(op); if (specific) lambda(specific); } } ``` that would otherwise need to rely on template specialization to support lambdas that take specific operations and those that take `Operation *`. Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D124675	2022-05-02 10:07:09 +02:00
Phoebe Wang	7c04454227	[ArgPromotion][Attributor] Update min-legal-vector-width when do promotion X86 codegen uses the function attribute `min-legal-vector-width` to select the proper ABI. The intention of the attribute is to reflect the user's requirement when passing or returning vector arguments. So the Clang front-end will iterate the vector arguments and set `min-legal-vector-width` to the maximum width for both caller and callee. It is assumed that middle-end optimizations won't care about the attribute except for inlining and argument promotion. - For inlining, we will propagate the attribute of inlined functions because the inlining function becomes the new caller. - For argument promotion, we check the `min-legal-vector-width` of the caller and callee and refuse to promote when they don't match. The problem comes from the combination of these optimizations, as shown by https://godbolt.org/z/zo3hba8xW. The caller `foo` has two callees `bar` and `baz`. When doing argument promotion, both `foo` and `bar` have the same `min-legal-vector-width`, so the argument is promoted to a vector. Then inlining inlines `baz` into `foo` and updates `min-legal-vector-width`, which results in an ABI mismatch between `foo` and `bar`. This patch fixes the problem by expanding the concept of `min-legal-vector-width` to be an indicator of function arguments. That is, any pass that touches function arguments has to set `min-legal-vector-width` to a value that reflects the width of the vector arguments. It makes sense to me because any argument modifications are ABI related and should be responsible for ABI compatibility. Differential Revision: https://reviews.llvm.org/D123284	2022-05-02 14:13:05 +08:00
Shraiysh Vaishay	e6295c645f	[flang] Added tests for taskwait and taskyield translation Reviewed By: kiranchandramohan Differential Revision: https://reviews.llvm.org/D124229 Co-authored-by: Sourabh Singh Tomar <SourabhSingh.Tomar@amd.com>	2022-05-02 10:48:03 +05:30
Congzhe Cao	3d6fe7ace8	[LoopCacheAnalysis] Use stable_sort() to avoid non-deterministic print output The print output of loop cache analysis sometimes has a non-deterministic order and therefore we have been using `CHECK-DAG` in its lit tests. This patch changes the sorting of LoopCosts to llvm::stable_sort() where we compare loop cost numbers and sort the loops. In case of the same loop cost numbers, llvm::stable_sort() now would output a deterministic loop order. Reviewed By: Meinersbur, fhahn, #loopoptwg Differential Revision: https://reviews.llvm.org/D124725	2022-05-02 00:49:45 -04:00
Ben Shi	42fa5bae7a	[clang][preprocessor] Add more macros to target AVR Reviewed By: MaskRay, aykevl Differential Revision: https://reviews.llvm.org/D124157	2022-05-02 04:37:57 +00:00
Brad Smith	a132e527f2	[Driver][Ananas] -r: imply -nostdlib like GCC Similar to D116843 for Gnu.cpp Reviewed By: zhmu, MaskRay Differential Revision: https://reviews.llvm.org/D124729	2022-05-02 00:28:14 -04:00
Fangrui Song	6cfcfbdc95	[Driver][test] Remove unneeded -no-canonical-prefixes and use preferred --target= Similar to D119309	2022-05-01 20:44:13 -07:00
Ben Shi	fb7a435492	[compiler-rt][builtins] Add several helper functions for AVR __mulqi3: int8 multiplication; __mulhi3: int16 multiplication; _exit: global terminator Reviewed By: MaskRay, aykevl Differential Revision: https://reviews.llvm.org/D123200	2022-05-02 01:27:46 +00:00
LLVM GN Syncbot	1790e2976b	[gn build] Port `3939e99aae`	2022-05-01 22:32:29 +00:00
Matt Arsenault	aabea3b2ea	llvm-reduce: Fix not removing first instruction in MachineBasicBlock This had the surprising behavior of using whatever instruction happened to be first in the block as an anchor point to stick random implicit defs on. Use a real implicit_def instead.	2022-05-01 18:26:45 -04:00
Matt Arsenault	35264e7179	llvm-reduce: Introduce new scoring mechanism for MIR reductions Many MIR reductions benefit from or require increasing the instruction count. For example, unlike in the IR, you may need to insert a new instruction to represent an undef. The current instruction reduction pass works around this by sticking implicit defs on whatever instruction happens to be first in the entry block. Other strategies I've applied manually include breaking instructions with multiple defs into separate instructions, or breaking large register defs into multiple subregister defs. Make up a simple scoring system based on what I generally try to get rid of first when manually reducing. Counts implicit defs as free since reduction passes will be introducing them, although they probably should count for something. It also might make more sense to have a comparison of the two functions, rather than having to compute a contextless number. This isn't particularly well tested since overall the MIR support isn't in a place where it is useful on the kinds of testcases I want to throw at it.	2022-05-01 18:24:04 -04:00
Matt Arsenault	0b896b754e	llvm-reduce: Do not try to delete frame instructions The verifier enforces these appearing as balanced pairs, so just deleting one has no real chance of producing something valid.	2022-05-01 18:21:52 -04:00
Matt Arsenault	3939e99aae	llvm-reduce: Add pass to reduce IR references from MIR This is typically the first thing I do when reducing a new testcase until the IR section can be deleted.	2022-05-01 17:40:53 -04:00
Fangrui Song	2019c9b1c8	[RISCV] Lower case the first letter of LowerRISCVMachineOperandToMCOperand. NFC	2022-05-01 14:13:55 -07:00
Sylvestre Ledru	ee4ac3a856	doc: update of the adv build doc now that clang is in tree too And be more consistent in the declarations	2022-05-01 22:59:49 +02:00
River Riddle	3c75228991	[mlir:PDLInterp] Refactor the implementation of result type inferrence The current implementation uses a discrete "pdl_interp.inferred_types" operation, which acts as a "fake" handle to a type range. This op is used as a signal to pdl_interp.create_operation that types should be inferred. This is terribly awkward and clunky though: * This op doesn't have a bytecode representation, and its conversion to bytecode kind of assumes that it is only used in a certain way. The current lowering is also broken and seemingly untested. * Given that this is a different operation, it gives off the assumption that it can be used multiple times, or that after the first use the value contains the inferred types. This isn't the case though, the resultant type range can never actually be used as a type range. This commit refactors the representation by removing the discrete InferredTypesOp, and instead adds a UnitAttr to pdl_interp.CreateOperation that signals when the created operations should infer their types. This leads to a much cleaner abstraction, a more optimal bytecode lowering, and also allows for better error handling and diagnostics when a created operation doesn't actually support type inference. Differential Revision: https://reviews.llvm.org/D124587	2022-05-01 12:25:05 -07:00
Florian Hahn	5387a38c38	[SimpleLoopUnswitch] Freeze individual OR/AND operands. In some cases, it is not enough to freeze the final AND/OR operation when chaining a number of invariant conditions together. After creating a chain of ANDs/ORs, we assume all unswitched operands to be either true or false. But if any of the operands is poison, the rest of the operands could have any value after branching on the frozen condition. To avoid that, freeze individual operands, if needed. In some cases this may lead to unnecessary freezes, but it seems required at least for some cases (see trivial-unswitch-freeze-individual-conditions.ll) Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D124554	2022-05-01 20:11:05 +01:00
Simon Pilgrim	34f97a3709	[VectorCombine] Merge isa<>/cast<> into dyn_cast<>. NFC. We want to handle the assert in VectorCombine, so avoid the repeated isa/cast code.	2022-05-01 20:09:10 +01:00
Michael Kruse	809ca66eac	[Polly] Fix test after D119669.	2022-05-01 13:32:42 -05:00
Simon Pilgrim	ae8b10e543	[DAG] (style) Break apart if-else chain as they all return	2022-05-01 17:56:59 +01:00
Stanislav Gatev	955a05a278	[clang][dataflow] Optimize flow condition representation Enable efficient implementation of context-aware joining of distinct boolean values. It can be used to join distinct boolean values while preserving flow condition information. Flow conditions are represented as Token <=> Clause iff formulas. To perform context-aware joining, one can simply add the tokens of flow conditions to the formula when joining distinct boolean values, e.g: `makeOr(makeAnd(FC1, Val1), makeAnd(FC2, Val2))`. This significantly simplifies the implementation of `Environment::join`. This patch removes the `DataflowAnalysisContext::getSolver` method. The `DataflowAnalysisContext::flowConditionImplies` method should be used instead. Reviewed-by: ymandel, xazax.hun Differential Revision: https://reviews.llvm.org/D124395	2022-05-01 16:25:29 +00:00
Simon Pilgrim	980f41d7c4	[X86] (style) Use auto for dyn_cast<> results	2022-05-01 17:15:18 +01:00
Simon Pilgrim	d4f06ec874	[X86] (style) Don't use auto for non obvious types	2022-05-01 17:10:21 +01:00
Simon Pilgrim	09761ce295	[SLPVectorizer] Remove weird unicode character from comment. NFCI. Whatever it was, Visual Assist really didn't like it....	2022-05-01 16:37:21 +01:00
Simon Pilgrim	bee9aa78db	[InstCombine] Add test coverage from D124503	2022-05-01 16:09:23 +01:00
Simon Pilgrim	e04ca7c4f1	[Coroutines] Regenerate coro-retcon-resume-values.ll	2022-05-01 13:21:55 +01:00
Simon Pilgrim	cff0afc184	[LoopVectorize][X86] Regenerate invariant-store-vectorization.ll	2022-05-01 13:04:24 +01:00
Andrew Ng	57c55165eb	[analyzer] Fix return of llvm::StringRef to destroyed std::string This issue was discovered whilst testing with ASAN. Differential Revision: https://reviews.llvm.org/D124683	2022-05-01 12:24:32 +01:00
Simon Pilgrim	d5198cf92f	[CostModel][X86] Check for 'null op' truncations If the legalized src/dst types are the same, assume the "truncation" is free. This fixes some edge cases such as mul lo/hi ops and bool vectors which will get legalized back to legal vector widths	2022-05-01 12:03:40 +01:00
Nikolas Klauser	639b9618f4	[libc++][NFC] Replace _LIBCPP_INLINE_VISIBILTIY and _VSTD in <string> Replace all the instances of `_LIBCPP_INLINE_VISIBILITY` with `_LIBCPP_HIDE_FROM_ABI` and `_VSTD` with `std`. Reviewed By: Mordante, #libc Spies: libcxx-commits Differential Revision: https://reviews.llvm.org/D124662	2022-05-01 12:59:52 +02:00
PeixinQiao	303ecc42d4	[flang] Add one semantic check for implicit interface Per Fortran 2018 C1533, a nonintrinsic elemental procedure shall not be used as an actual argument. The semantic check for the implicit interface case was missing. Reviewed By: klausler Differential Revision: https://reviews.llvm.org/D124379	2022-05-01 18:40:17 +08:00
sstwcw	43c146c96d	[clang-format] Take out common code for parsing blocks NFC Differential Revision: https://reviews.llvm.org/D121757	2022-05-01 08:58:40 +00:00
Simon Pilgrim	c2964746e3	[CostModel][X86] Reduce cost of vector selects on SSE2/AVX1 targets Based off the script from D103695, we were exaggerating the cost of the OR(AND(X,M),AND(Y,~M)) expansion using instruction count instead of effective throughput	2022-05-01 09:32:14 +01:00
Nathan James	8a9e2dd48d	[clang-tidy][NFC] Re-alphabetize the clang tidy release notes	2022-05-01 07:41:04 +01:00
Jack Andersen	09325d3606	[CAPI] Expose CastInst::getCastOpcode in C API Reviewed By: deadalnix Differential Revision: https://reviews.llvm.org/D91514	2022-04-30 18:40:04 -04:00
Dmitry Vassiliev	2e7e0975c0	[NVPTX] Prefix "$L__" for branch label names A global variable may have the same name as a label, and ptxas does not accept it. Prefix labels with $L__ to fix this. Reviewed By: MaskRay, tra Differential Revision: https://reviews.llvm.org/D119669	2022-04-30 21:55:20 +02:00
Florian Hahn	841fffa745	[LV] Add test for interleaving multiple iterations with call.	2022-04-30 20:43:22 +01:00
Simon Pilgrim	6f80830f06	[PhaseOrdering][X86] Use passes="" instead of passes='' so DOS can evaluate the cmd lines Fix regenerating the tests on windows builds	2022-04-30 19:56:49 +01:00
Florian Hahn	8b022f87b0	[SimpleLoopUnswitch] Freeze trivial conditions if needed. Trivial unswitching can also introduce new branches on undef/poison. Freeze the conditions if needed. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D124549	2022-04-30 19:53:36 +01:00
Simon Pilgrim	c6994ec12e	[PhaseOrdering][X86] Use passes="default<O3>" instead of passes='default<O3>' so DOS can evaluate the cmd lines Fix regenerating the tests on windows builds	2022-04-30 19:53:07 +01:00
Simon Pilgrim	732b57d5f1	[SLP][X86] extractelement tests - use -mattr=avx2 instead of a -march flag	2022-04-30 19:51:24 +01:00
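
The LoopCacheAnalysis entry above (3d6fe7ace8) relies on a stable sort to make the print order deterministic when two loops have the same cost. The following is a minimal standalone sketch of that idea using `std::stable_sort`, not the code from the patch. The `LoopCost` alias, the function names, and the descending comparator are hypothetical and chosen only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the (loop, cost) pairs that get sorted.
using LoopCost = std::pair<std::string, unsigned>;

// Sort by descending cost (an assumed ordering). Because the sort is
// stable, entries with equal cost keep their relative input order.
std::vector<LoopCost> sortByCost(std::vector<LoopCost> costs) {
  std::stable_sort(costs.begin(), costs.end(),
                   [](const LoopCost &a, const LoopCost &b) {
                     return a.second > b.second;
                   });
  return costs;
}

// Helper for inspection: the loop names in sorted order.
std::vector<std::string> sortedNames(const std::vector<LoopCost> &costs) {
  std::vector<std::string> names;
  for (const LoopCost &lc : sortByCost(costs))
    names.push_back(lc.first);
  return names;
}

// Loops "i" and "j" tie on cost, so they must stay in input order.
void checkStableOrder() {
  std::vector<LoopCost> costs = {{"i", 100u}, {"j", 100u}, {"k", 200u}};
  std::vector<std::string> expected = {"k", "i", "j"};
  assert(sortedNames(costs) == expected);
}
```

With a non-stable sort, the relative order of the tied entries `i` and `j` would be unspecified, which is the kind of variation that forces tests to use `CHECK-DAG`.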

422612 Commits

All Branches