llvm-project

Commit Graph

Author	SHA1	Message	Date
Sami Tolvanen	4474958d3a	ThinLTO: Fix inline assembly references to static functions with CFI Create an internal alias with the original name for static functions that are renamed in promoteInternals to avoid breaking inline assembly references to them. Link: https://github.com/ClangBuiltLinux/linux/issues/1354 Reviewed By: pcc Differential Revision: https://reviews.llvm.org/D104058	2021-06-22 10:01:55 -07:00
Joseph Huber	03d7e61c87	[OpenMP] Internalize functions in OpenMPOpt to improve IPO passes Summary: Currently the attributor needs to give up if a function has external linkage. This means that the optimization introduced in D97818 will only apply to static functions. This change uses the Attributor to internalize OpenMP device routines by making a copy of each function with private linkage and replacing the uses in the module with it. This allows for the optimization to be applied to any regular function. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D102824	2021-06-22 12:38:10 -04:00
Joseph Huber	6fc51c9f7d	[OpenMP] Replace GPU globalization calls with shared memory in the middle-end Summary: The changes introduced in D97680 create a simpler interface to code that needs to be globalized. This interface is used to simplify the globalization calls in the middle end. We can check any globalization call that is only called by a single thread in the team and replace it with a static shared memory buffer. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D97818	2021-06-22 11:55:44 -04:00
Nikita Popov	e790d3667e	[OpaquePtr] Handle addrspacecasts in InstCombine This adds support for addrspace casts involving opaque pointers to InstCombine, as well as the isEliminableCastPair() helper (otherwise the assertion failure would just move there). Add PointerType::hasSameElementTypeAs() to hide the element type details. Differential Revision: https://reviews.llvm.org/D104668	2021-06-22 17:45:30 +02:00
Jingu Kang	873ff5a728	[SimpleLoopUnswitch] Fix a bug in ComputeUnswitchedCost with partial unswitch There was a bug in the cost calculation for partially invariant unswitch. The costs of non-duplicated blocks are subtracted from the total LoopCost, so anything that is duplicated should not be counted. Differential Revision: https://reviews.llvm.org/D103816	2021-06-22 16:18:00 +01:00
Joseph Huber	68d133a3e8	[OpenMP] Simplify GPU memory globalization Summary: Memory globalization is required to maintain OpenMP standard semantics for data sharing between worker and master threads. The GPU cannot share data between its threads so must allocate global or shared memory to store the data in. Currently this is implemented fully in the frontend using the `__kmpc_data_sharing_push_stack` and `__kmpc_data_sharing_pop_stack` functions to emulate standard CPU stack sharing. The front-end scans the target region for variables that escape the region and must be shared between the threads. Each variable then has a field created for it in a global record type. This patch replaces this functionality with a single allocation command, effectively mimicking an alloca instruction for the variables that must be shared between the threads. This will be much slower than the current solution, but makes it much easier to optimize as we can analyze each variable independently and determine if it is not captured. In the future, we can replace these calls with an `alloca` and small allocations can be pushed to shared memory. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D97680	2021-06-22 10:52:46 -04:00
Rosie Sumpter	b2f48cc914	[SLP][AArch64] Add SLP vectorizer tests for XOR and AND reductions. NFC These regression tests show missed SLP vectorization opportunities, which will be fixed in a future commit (see: https://reviews.llvm.org/D104538). Differential Revision: https://reviews.llvm.org/D104708	2021-06-22 15:16:02 +01:00
Nikita Popov	e638a290f7	[ConstantFold] Delay fetching pointer element type Don't do this while stripping pointer casts, instead fetch it at the end. This improves compatibility with opaque pointers for the case where the base object is not opaque.	2021-06-22 15:51:00 +02:00
Nikita Popov	87bdde4962	[ConstantFold] Skip bitcast -> GEP transform for opaque pointers Same as with the InstCombine transform, this is not possible for bitcasts involving opaque pointers, as GEP preserves opaqueness.	2021-06-22 15:50:55 +02:00
Florian Hahn	d17798823c	[SCEV] Retain AddExpr flags when subtracting a foldable constant. Currently we drop wrapping flags for expressions like (A + C1)<flags> - C2. But we can retain flags under certain conditions: * Adding a smaller constant is NUW if the original AddExpr was NUW. * Adding a constant with the same sign and small magnitude is NSW, if the original AddExpr was NSW. This can improve results after using `SimplifyICmpOperands`, which may subtract one in order to use stricter predicates, as is the case for `isKnownPredicate`. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D104319	2021-06-22 11:27:51 +01:00
Max Kazantsev	4c4f1ae93e	Re-land "[LoopDeletion] Handle Phis with similar inputs from different blocks" Patch was reverted due to a bug that existed before it and was exposed by it. Returning after the underlying bug has been fixed. Differential Revision: https://reviews.llvm.org/D103959	2021-06-22 12:28:46 +07:00
Max Kazantsev	575253887b	[LoopDeletion] Require loop to have a predecessor when executing 1st iteration symbolically Two predecessors break the further logic, and the loop may come to the opt in non-canonicalized state.	2021-06-22 12:20:55 +07:00
Eli Friedman	8f3d16905d	[ScalarEvolution] Ensure backedge-taken counts are not pointers. A backedge-taken count doesn't refer to memory; returning a pointer type is nonsense. So make sure we always return an integer. The obvious way to do this would be to just convert the operands of the icmp to integers, but that doesn't quite work out at the moment: isLoopEntryGuardedByCond currently gets confused by ptrtoint operations. So we perform the ptrtoint conversion late for lt/gt operations. The test changes are mostly innocuous. The most interesting changes are more complex SCEV expressions of the form "(-1 * (ptrtoint i8* %ptr to i64)) + %ptr)". This is expected: we can't fold this to zero because we need to preserve the pointer base. The call to isLoopEntryGuardedByCond in howFarToZero is less precise because of ptrtoint operations; this shows up in the function pr46786_c26_char in ptrtoint.ll. Fixing it here would require more complex refactoring. It should eventually be fixed by future improvements to isImpliedCond. See https://bugs.llvm.org/show_bug.cgi?id=46786 for context. Differential Revision: https://reviews.llvm.org/D103656	2021-06-21 16:24:16 -07:00
Roman Lebedev	4cf74469a0	[NFC][SimplifyCFG] Add basic test for debuginfo preservation of `ret` tail merging	2021-06-21 23:56:54 +03:00
Roman Lebedev	3e98b88797	[NFC][SimplifyCFG] Fix tests to use FileCheck instead of grep	2021-06-21 23:56:54 +03:00
Alexey Bataev	c5bbc737e8	[SLP][NFC]Rename functions in the tests, NFC.	2021-06-21 13:37:12 -07:00
Nikita Popov	39796e1ad0	Reapply [InstCombine] Don't try converting opaque pointer bitcast to GEP Reapplied without changes -- this was reverted together with an underlying patch. ----- Bitcasts having opaque pointer source or result type cannot be converted into a zero-index GEP, GEP source and result types always have the same opaque-ness.	2021-06-21 22:15:56 +02:00
Nikita Popov	e2c2124a4b	Reapply [InstCombine] Extract bitcast -> gep transform Relative to the original patch, an InstCombine test has been added to show a previously missed pattern, and the Coroutine test that resulted in the revert has been regenerated. ----- Move this into a separate function, to make sure that early returns do not accidentally skip other transforms. This previously happened for the isSized() check, which skipped folds like distributing a bitcast over a select.	2021-06-21 22:03:15 +02:00
Nikita Popov	403792f91e	[InstCombine] Add test for bitcast of unsized pointer (NFC) The bitcast should get folded into the select, but currently isn't due to an incorrect early bailout.	2021-06-21 22:03:15 +02:00
Nikita Popov	6922ab73a5	Revert "[InstCombine] Extract bitcast -> gep transform" This reverts commit `d9f5d7b959`. This reverts commit `5780611d7e`. This causes a failure in Coroutine tests.	2021-06-21 21:34:17 +02:00
Alexey Bataev	908b753661	[SLP]Improve vectorization of PHI instructions. Perform better analysis when trying to vectorize PHIs. 1. Do not try to vectorize vector PHIs. 2. Do deeper analysis for more profitable nodes for the vectorization. Before we just tried to vectorize the PHIs of the same type. Patch improves this and tries to vectorize PHIs with incoming values which come from the same basic block, have the same and/or alternative opcodes. It allows to save the compile time and provides better vectorization results in general. Part of D57059. Differential Revision: https://reviews.llvm.org/D103638	2021-06-21 12:26:24 -07:00
Nikita Popov	5780611d7e	[InstCombine] Don't try converting opaque pointer bitcast to GEP Bitcasts having opaque pointer source or result type cannot be converted into a zero-index GEP, GEP source and result types always have the same opaque-ness.	2021-06-21 21:24:50 +02:00
Philip Reames	0c09e5bd74	Split a test for ease of auto update	2021-06-21 11:02:26 -07:00
Jacob Hegna	f86d1f99b3	Remove ML inlining model artifacts. They are not conducive to being stored in git. Instead, we autogenerate mock model artifacts for use in tests. Production models can be specified with the cmake flag LLVM_INLINER_MODEL_PATH. LLVM_INLINER_MODEL_PATH has two sentinel values: - download, which will download the most recent compatible model. - autogenerate, which will autogenerate a "fake" model for testing the model uptake infrastructure. Differential Revision: https://reviews.llvm.org/D104251	2021-06-21 17:38:09 +00:00
Nathan Chancellor	f52666985d	Revert "[LoopDeletion] Handle Phis with similar inputs from different blocks" This reverts commit `bb1dc876eb`. This patch causes an assertion failure when building an arm64 defconfig Linux kernel. See https://reviews.llvm.org/D103959 for a link to the original bug report and a reduced reproducer.	2021-06-21 10:18:55 -07:00
Rosie Sumpter	2251f33bef	[SLP][AArch64] Add SLP vectorizer regression test. NFC This test is for a missed SLP vectorizer opportunity, reported here https://bugs.llvm.org/show_bug.cgi?id=44593. This is due to a cost modelling issue with vector reduction intrinsics which will be fixed in a future commit (see https://reviews.llvm.org/D104538).	2021-06-21 16:31:00 +01:00
Sanjay Patel	64b2676ca8	[InstCombine] fold ctlz/cttz-of-select with 1 or more constant arms Building on: `4c44b02d87` ...and adding handling for the extra operand in these intrinsics. This pattern is discussed in: https://llvm.org/PR50140	2021-06-21 11:04:12 -04:00
Sjoerd Meijer	071dbaec87	[FuncSpec] Add minsize test. NFC.	2021-06-21 15:21:09 +01:00
Florian Hahn	05bb969014	[LoopIdiom] Add test case that involves adds with flags and zero exts. Test coverage to ensure D104319 does not introduce a regression here.	2021-06-21 12:10:58 +01:00
Nikita Popov	acefe0eaaf	[Mem2Reg] Regenerate test checks (NFC)	2021-06-21 11:06:28 +02:00
Nikita Popov	80e0424b2c	[Mem2Reg] Use poison for unreachable cases Use poison instead of undef for cases dealing with unreachable code. This still leaves the more interesting case of "load from uninitialized memory" as undef.	2021-06-21 10:54:13 +02:00
Nikita Popov	00a88a81d2	[Mem2Reg] Regenerate test checks (NFC)	2021-06-21 10:47:59 +02:00
Juneyoung Lee	c038845f58	[InstCombine] Fold icmp (select c,const,arg), null if icmp arg, null can be simplified This patch folds icmp (select c,const,arg), null if icmp arg, null can be simplified. Resolves llvm.org/pr48975. Reviewed By: nikic, xbolva00 Differential Revision: https://reviews.llvm.org/D96663	2021-06-21 17:39:05 +09:00
Sjoerd Meijer	342bbb7832	[FuncSpec] Don't specialise functions with NoDuplicate instructions. getSpecializationCost was returning INT_MAX for a case when specialisation shouldn't happen, but this wasn't properly checked if specialisation was forced. Differential Revision: https://reviews.llvm.org/D104461	2021-06-21 09:02:11 +01:00
Max Kazantsev	3f2ff7cc8c	[Test] Add some tests showing room for optimization exploiting undef and UB	2021-06-21 13:11:46 +07:00
Max Kazantsev	bb1dc876eb	[LoopDeletion] Handle Phis with similar inputs from different blocks This patch lifts the requirement to have the only incoming live block for Phis. There can be multiple live blocks if the same value comes to phi from all of them. Differential Revision: https://reviews.llvm.org/D103959 Reviewed By: nikic, lebedev.ri	2021-06-21 11:37:06 +07:00
Juneyoung Lee	ce192ced2b	[InstCombine] Use poison constant to represent the result of unreachable instrs This patch updates InstCombine to use poison constant to represent the resulting value of (either semantically or syntactically) unreachable instrs, or a don't-care value of an unreachable store instruction. This allows more aggressive folding of unused results, as shown in llvm/test/Transforms/InstCombine/getelementptr.ll . Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D104602	2021-06-21 09:58:44 +09:00
Dmitri Gribenko	ffa252e8ce	[GCOVProfiling][test] Ensure that 'opt' drops any files in a temp directory	2021-06-20 22:48:35 +02:00
Nikita Popov	1ae266f452	[LoopUnroll] Use smallest exact trip count from any exit This is a more general alternative/extension to D102635. Rather than handling the special case of "header exit with non-exiting latch", this unrolls against the smallest exact trip count from any exit. The latch exit is no longer treated as privileged when it comes to full unrolling. The motivating case is in full-unroll-one-unpredictable-exit.ll. Here the header exit is an IV-based exit, while the latch exit is a data comparison. This kind of loop does not get rotated, because the latch is already exiting, and loop rotation doesn't try to distinguish IV-based/analyzable latches. Differential Revision: https://reviews.llvm.org/D102982	2021-06-20 20:58:26 +02:00
David Green	a24b02193a	[DSE] Remove stores in the same loop iteration DSE will currently only remove stores in the same block unless they can be guaranteed to be loop invariant. This expands that to any stores that are in the same Loop, at the same loop level. This should still account for where AA/MSSA will not handle aliasing between loops, but allow the dead stores to be removed where they overlap in the same loop iteration. It requires adding loop info to DSE, but that looks fairly harmless. The test case this helps is from code like this, which can come up in certain matrix operations: for(i=..) dst[i] = 0; for(j=..) dst[i] += src[in+j]; After LICM, this becomes: for(i=..) dst[i] = 0; sum = 0; for(j=..) sum += src[in+j]; dst[i] = sum; The first store is dead, and with this patch is now removed. Differential Revision: https://reviews.llvm.org/D100464	2021-06-20 17:03:30 +01:00
Sanjay Patel	4c44b02d87	[InstCombine] fold ctpop-of-select with 1 or more constant arms The general pattern is mentioned in: https://llvm.org/PR50140 ...but we need to do a bit more to handle intrinsics with extra operands like ctlz/cttz.	2021-06-20 11:28:45 -04:00
Sanjay Patel	240acb0cff	[InstCombine] avoid infinite loops with select folds of constant expressions This pair of transforms was added recently with: `8591640379` And could lead to conflicting folds: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=35399	2021-06-20 09:46:25 -04:00
Roman Lebedev	c5b7335dc8	[SimplifyCFG] FoldTwoEntryPHINode(): don't fold if either block has its address taken Same as with HoistThenElseCodeToIf() (`ad87761925`).	2021-06-20 12:37:14 +03:00
Roman Lebedev	ad87761925	[SimplifyCFG] HoistThenElseCodeToIf(): don't hoist if either block has its address taken This problem is exposed by D104598, after it tail-merges `ret` in `@test_inline_constraint_S_label`, the verifier would start complaining `invalid operand for inline asm constraint 'S'`. Essentially, taking address of a block is mismodelled in IR. It should probably be an explicit instruction, a first one in block, that isn't identical to any other instruction of the same type, so that it can't be hoisted.	2021-06-20 12:18:15 +03:00
Juneyoung Lee	09e8c0d5aa	[InstSimplify] icmp poison, X -> poison This adds a simple transformation from icmp with poison constant to poison. Comparing poison with something else is poison, so this is okay. https://alive2.llvm.org/ce/z/e8iReb https://alive2.llvm.org/ce/z/q4MurY	2021-06-20 15:39:07 +09:00
Fangrui Song	8ea2a58a2e	[llvm-profdata] Make diagnostics consistent with the (no capitalization, no period) style The format is currently inconsistent. Use the https://llvm.org/docs/CodingStandards.html#error-and-warning-messages style. And add `error:` or `warning:` to CHECK lines wherever appropriate.	2021-06-19 14:54:25 -07:00
Sanjay Patel	328b21a338	[InstCombine][test] add tests for select-of-bit-manip; NFC	2021-06-19 12:34:32 -04:00
Liqiang Tao	671a87104b	[llvm][Inliner] Add an optional PriorityInlineOrder This patch adds an optional PriorityInlineOrder, which uses a heap to order inlining. A call site whose size is smaller has a higher priority. Reviewed By: mtrofin Differential Revision: https://reviews.llvm.org/D104028	2021-06-19 10:17:32 +08:00
Guozhi Wei	575ba6f425	[InstCombine] Don't transform code if DoTransform is false In patch https://reviews.llvm.org/D72396, it doesn't check DoTransform before transforming the code, and generates wrong result for the attached test case. Differential Revision: https://reviews.llvm.org/D104567	2021-06-18 18:01:34 -07:00
Fangrui Song	3307240f05	[InstrProfiling][ELF] Make __profd_ private if the function does not use value profiling On ELF, the D1003372 optimization can apply to more cases. There are two prerequisites for making `__profd_` private: * `__profc_` keeps `__profd_` live under compiler/linker GC * `__profd_` is not referenced by code The first is satisfied because all counters/data are in a section group (either `comdat any` or `comdat noduplicates`). The second requires that the function does not use value profiling. Regarding the second point: `__profd_` may be referenced by other text sections due to inlining. There will be a linker error if a prevailing text section references the non-prevailing local symbol. With this change, a stage 2 (`-DLLVM_TARGETS_TO_BUILD=X86 -DLLVM_BUILD_INSTRUMENTED=IR`) clang is 4.2% smaller (1-169620032/177066968). `stat -c %s */.o | awk '{s+=$1}END{print s}'` is 2.5% smaller. Reviewed By: davidxl, rnk Differential Revision: https://reviews.llvm.org/D103717	2021-06-18 17:01:17 -07:00
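
The loop shape described in the [DSE] entry (`a24b02193a`) can be sketched in C as follows. This is only an illustration of the pattern at source level, not code from the patch: the function names, the `rows`/`cols` parameters, and the row-major index `i * cols + j` are assumptions made for the example, and `dst` and `src` are assumed not to overlap.

```c
#include <stddef.h>

/* Original shape: dst[i] is zeroed, then accumulated into by the inner loop.
 * Hypothetical example; dst and src are assumed not to alias. */
void row_sums_naive(int *dst, const int *src, size_t rows, size_t cols) {
    for (size_t i = 0; i < rows; ++i) {
        dst[i] = 0;
        for (size_t j = 0; j < cols; ++j)
            dst[i] += src[i * cols + j];
    }
}

/* Shape after LICM keeps the accumulator in a local: the zero store is
 * still present but is overwritten later in the same outer iteration. */
void row_sums_after_licm(int *dst, const int *src, size_t rows, size_t cols) {
    for (size_t i = 0; i < rows; ++i) {
        dst[i] = 0; /* dead store: nothing reads dst[i] before the store below */
        int sum = 0;
        for (size_t j = 0; j < cols; ++j)
            sum += src[i * cols + j];
        dst[i] = sum;
    }
}

/* Shape once the dead store has been removed. */
void row_sums_after_dse(int *dst, const int *src, size_t rows, size_t cols) {
    for (size_t i = 0; i < rows; ++i) {
        int sum = 0;
        for (size_t j = 0; j < cols; ++j)
            sum += src[i * cols + j];
        dst[i] = sum;
    }
}
```

All three functions compute the same result. The two stores to `dst[i]` in the second version sit in different basic blocks but in the same iteration of the outer loop, which is the case the commit message says DSE previously left alone.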

18804 Commits