llvm-project

Commit Graph

Author	SHA1	Message	Date
Jessica Paquette	cbf5246359	Fix buildbot after `cfc6073017` Windows buildbots were not happy with using find_if + instructionsWithoutDebug. In `cfc6073017`, instructionsWithoutDebug is not technically necessary. So, just iterate over the block directly. http://lab.llvm.org:8011/#/builders/127/builds/4732/steps/7/logs/stdio	2021-01-19 10:38:04 -08:00
Jessica Paquette	cfc6073017	[GlobalISel] Combine (a[0]) | (a[1] << k1) | ...| (a[m] << kn) into a wide load This is a restricted version of the combine in `DAGCombiner::MatchLoadCombine`. (See D27861) This tries to recognize patterns like below (assuming a little-endian target): ``` s8* x = ... s32 val = a[0] | (a[1] << 8) | (a[2] << 16) | (a[3] << 24) -> s32 val = *((i32)a) s8* x = ... s32 val = a[3] | (a[2] << 8) | (a[1] << 16) | (a[0] << 24) -> s32 val = BSWAP(*((s32)a)) ``` (This patch also handles the big-endian target case, in which the first example above has a BSWAP, and the second example above does not.) To recognize the pattern, this searches from the last G_OR in the expression tree. E.g. ``` Reg Reg \ / OR_1 Reg \ / OR_2 \ Reg .. / Root ``` Each non-OR register in the tree is put in a list. Each register in the list is then checked to see if it's an appropriate load + shift logic. If every register is a load + potentially a shift, the combine checks if those loads + shifts, when OR'd together, are equivalent to a wide load (possibly with a BSWAP.) To simplify things, this patch (1) Only handles G_ZEXTLOADs (which appear to be the common case) (2) Only works in a single MachineBasicBlock (3) Only handles G_SHL as the bit twiddling to stick the small load into a specific location An IR example of this is here: https://godbolt.org/z/4sP9Pj (lifted from test/CodeGen/AArch64/load-combine.ll) At -Os on AArch64, this is a 0.5% code size improvement for CTMark/sqlite3, and a 0.4% improvement for CTMark/7zip-benchmark. Also fix a bug in `isPredecessor` which caused it to fail whenever `DefMI` was the first instruction in the block. Differential Revision: https://reviews.llvm.org/D94350	2021-01-19 10:24:27 -08:00
Fraser Cormack	9c6a00fe99	[RISCV] Add ISel patterns for scalable mask exts & truncs Original patch by @rogfer01. This patch adds support for sign-, zero-, and any-extension from scalable mask vector types to integer vector types, as well as truncation in the opposite direction. Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com> Co-Authored-by: Fraser Cormack <fraser@codeplay.com> Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D94590	2021-01-19 18:13:15 +00:00
Abhina Sreeskantharajan	88e7c3498c	[SystemZ][z/OS] Fix Permission denied pattern matching On z/OS, the error message "EDC5111I Permission denied." is not matched correctly in lit tests. This patch updates the check expression to match successfully. Differential Revision: https://reviews.llvm.org/D94432	2021-01-19 13:05:52 -05:00
Michael Kruse	842314b5f0	[Polly] Update isl to isl-0.23-61-g24e8cd12. This fixes llvm.org/PR48554 Some test cases had to be updated because the hash function for union_maps has been changed, which affects the output order.	2021-01-19 12:01:31 -06:00
Raphael Isemann	2f80995090	[lldb][docs] Update .htaccess to redirect from old SB API documentation to new one This is mostly SEO so that the new API can take over the old API when people search for the different SB* classes. Sadly epydoc decided to throw in a -class prefix behind all the class file names, so we can't just overwrite the old files with the newly generated ones. Reviewed By: JDevlieghere Differential Revision: https://reviews.llvm.org/D94900	2021-01-19 18:58:43 +01:00
David Green	6a563eef13	[ARM] Expand vXi1 VSELECT's We have no lowering for VSELECT vXi1, vXi1, vXi1, so mark them as expanded to turn them into a series of logical operations. Differential Revision: https://reviews.llvm.org/D94946	2021-01-19 17:56:50 +00:00
Raphael Isemann	3cae8b3329	[lldb][docs] Add a doc page for enums and constants Enums and constants are currently missing in the new LLDB Python API docs. In theory we could just let them be autogenerated like the SB API classes, but sadly the generated documentation suffers from a bunch of problems. Most of these problems come from the way SWIG is representing enums, which is done by translating every single enum case into its own constant. This has a bunch of nasty effects: * Because SWIG throws away the enum types, we can't actually reference the enum type itself in the API. Also because automodapi is impossible to script, this can't be fixed in post (at least without running like sed over the output files). * The lack of enum types also means that every enum case has its own full doc page. Having a full doc page that just shows a single enum case is pointless and it really slows down sphinx. * There is no SWIG code for the enums, so there is also no place to write documentation strings for them. Also there is no support for copying the doxygen strings (which would be in the wrong format, but better than nothing) for enums (let alone our defines), so we can't really document all this code. * Because the enum cases are just forwards to the native lldb module (which we mock), automodapi actually takes the `Mock` docstrings and adds them to every single enum case. I don't see any way to solve this via automodapi or SWIG. The most reasonable way to solve this is IMHO to write a simple Clang tool that just parses our enum/constant headers and emits an *.rst file that we check in. This way we can do all the LLDB-specific enum case and constant grouping that we need to make a readable documentation page. As we're without any real documentation until I get around to writing that tool, I wrote a doc page for the enums/constants as a stopgap measure. Most of this is done by just grepping our enum header and then manually cleaning up all the artifacts and copying the few doc strings we have. Reviewed By: JDevlieghere Differential Revision: https://reviews.llvm.org/D94959	2021-01-19 18:54:05 +01:00
Andrzej Warzynski	cea3abc26f	[flang][driver] Move isFixedFormSuffix and isFreeFormSuffix to flangFrontend isFixedFormSuffix and isFreeFormSuffix should be defined in flangFrontend rather than flangFrontendTool library. That's for 2 reasons: * these methods are used in flangFrontend rather than flangFrontendTool * flangFrontendTool depends on flangFrontend As mentioned in the post-commit review for D94228, without this change shared library builds fail. Differential Revision: https://reviews.llvm.org/D94968	2021-01-19 17:47:40 +00:00
Stella Laurenzo	71b6b010e6	[mlir][python] Factor out standalone OpView._ods_build_default class method. * This allows us to hoist trait level information for regions and sized-variadic to class level attributes (_ODS_REGIONS, _ODS_OPERAND_SEGMENTS, _ODS_RESULT_SEGMENTS). * Eliminates some splicey python generated code in favor of a native helper for it. * Makes it possible to implement custom, variadic and region based builders with one line of python, without needing to manually code access to the segment attributes. * Needs follow-on work for region based callbacks and support for SingleBlockImplicitTerminator. * A follow-up will actually add ODS support for generating custom Python builders that delegate to this new method. * Also includes the start of an e2e sample for constructing linalg ops where this limitation was discovered (working progressively through this example and cleaning up as I go). Differential Revision: https://reviews.llvm.org/D94738	2021-01-19 09:29:57 -08:00
Björn Schäpers	cbdde495ba	[clang-format] Apply Allman style to lambdas Differential Revision: https://reviews.llvm.org/D94906	2021-01-19 18:17:01 +01:00
Nikita Popov	051ec9f5f4	[ValueTracking] Strengthen impliesPoison reasoning Split impliesPoison into two recursive walks, one over V, the other over ValAssumedPoison. This allows us to reason about poison implications in a number of additional cases that are important in practice. This is a generalized form of D94859, which handles the cmp to cmp implication in particular. Differential Revision: https://reviews.llvm.org/D94866	2021-01-19 18:04:23 +01:00
Jay Foad	0808c7009a	[AMDGPU] Fix test case for D94010	2021-01-19 16:46:47 +00:00
KareemErgawy-TomTom	27820496a7	[MLIR][SPIRV] Add `SignedOp` trait. This commit adds a new trait that can be attached to ops that have signed semantics. Reviewed By: antiagainst Differential Revision: https://reviews.llvm.org/D94896	2021-01-19 17:40:40 +01:00
Jay Foad	de2f942399	[AMDGPU] Simplify test case for D94010	2021-01-19 16:36:43 +00:00
Hansang Bae	2d911f7c72	[OpenMP] Fix atomic entries for captured logical operation Added missing code for the captured atomic operation. Differential Revision: https://reviews.llvm.org/D94848	2021-01-19 09:59:28 -06:00
Fraser Cormack	15fd6bae0e	[RISCV] Extend RVV VType info with the type's AVL (NFC) This patch factors out the "VLMax" operand passed to most scalable-vector ISel patterns into a property of each VType. This is seen as a preparatory change to allow RVV in the future to more easily support fixed-length vector types with constrained vector lengths, with the AVL operand set to the length of the fixed-length vector. It has no effect on the scalable code generation path. Reviewed By: HsiangKai Differential Revision: https://reviews.llvm.org/D94594	2021-01-19 15:46:56 +00:00
David Green	f373b30923	[ARM] Add MVE add.sat costs This adds some basic MVE sadd_sat/ssub_sat/uadd_sat/usub_sat costs, based on when the instruction is legal. With smaller than legal types that are promoted we generate shr(qadd(shl, shl)), so the cost is 4 appropriately. Differential Revision: https://reviews.llvm.org/D94958	2021-01-19 15:38:46 +00:00
Valentin Clement	6bd0a4451c	[flang][directive] Get rid of flangClassValue in TableGen The TableGen emitter for directives has two slots for flangClass information and this was mainly to be able to keep up with the legacy openmp parser at the time. Now that all clauses are encapsulated in AccClause or OmpClause, these two strings are not necessary anymore and were the source of a couple of problems while working with the generic structure checker for OpenMP. This patch removes the flangClassValue string from DirectiveBase.td and uses the string flangClass as the placeholder for the encapsulated class. Reviewed By: sameeranjoshi Differential Revision: https://reviews.llvm.org/D94821	2021-01-19 10:28:46 -05:00
Victor Huang	909d6c86ea	[PowerPC] Fix the check for the instruction using FRSP/XSRSP output register When performing peephole optimization to simplify the code, after removing the passed FRSP/XSRSP instruction we will set any uses of that FRSP/XSRSP to the source of the FRSP/XSRSP. We are finding the machine instruction using the virtual register holding FRSP/XSRSP results by searching all following instructions and encountering an issue that the first use of the virtual register is a debug MI, causing: 1. the virtual register in the debug MI to be removed unexpectedly. 2. the virtual register used in the non-debug MI not to be replaced with the source of FRSP/XSRSP, which leaves it in an undef state. This patch fixes the issue by only searching for the non-debug machine instruction using the virtual register holding FRSP/XSRSP results when the vr has only one non-debug use. Differential Revision: https://reviews.llvm.org/D94711 Reviewed by: nemanjai	2021-01-19 09:20:03 -06:00
Raul Tambre	480643a95c	[CMake] Remove dead code setting policies to NEW cmake_minimum_required(VERSION) calls cmake_policy(VERSION), which sets all policies up to VERSION to NEW. LLVM started requiring CMake 3.13 last year, so we can remove a bunch of code setting policies prior to 3.13 to NEW as it no longer has any effect. Reviewed By: phosek, #libunwind, #libc, #libc_abi, ldionne Differential Revision: https://reviews.llvm.org/D94374	2021-01-19 17:19:36 +02:00
Utkarsh Saxena	8bf7116d50	[clangd] Index local classes, virtual and overriding methods. Previously we did not record local class declarations. Now with features like findImplementation and typeHierarchy, we have a need to index such local classes to accurately report subclasses and implementations of methods. Performance testing results: - No changes in indexing timing. - No significant change in memory usage. - 1% increase in #relations. - 0.17% increase in #refs. - 0.22% increase in #symbols. New index stats Time to index: 4:13 min memory usage 543MB number of symbols: 521.5K number of refs: 8679K number of relations: 49K Base Index stats Time to index: 4:15 min memory usage 542MB number of symbols: 520K number of refs: 8664K number of relations: 48.5K Fixes: https://github.com/clangd/clangd/issues/644 Differential Revision: https://reviews.llvm.org/D94785	2021-01-19 16:18:48 +01:00
Andy Wingo	1a9b6e4a32	[WebAssembly][lld] Fix call-indirect.s test to validate Add missing address operand, so that we can validate the output files. Depends on D92315. Differential Revision: https://reviews.llvm.org/D92320	2021-01-19 16:12:38 +01:00
David Green	54e38440e7	[ARM] Expand add.sat/sub.sat cost checks. NFC	2021-01-19 15:06:06 +00:00
Alex Richardson	077a84f911	[libc++] Sync TEST_HAS_TIMESPEC_GET and _LIBCPP_HAS_TIMESPEC_GET on FreeBSD Commit `5e416ba943` (D71522) updated the __config header but didn't change test_macros.h. This fixes libcxx/language.support/has_timespec_get.compile.pass.cpp on FreeBSD12/13. Reviewed By: #libc, dim, ldionne Differential Revision: https://reviews.llvm.org/D94292	2021-01-19 15:02:57 +00:00
Florian Hahn	3747b69b53	[LoopRotate] Calls not lowered to calls should not block rotation. `83daa49758` made loop-rotate more conservative in the presence of function calls in the prepare-for-lto stage. The code did not properly account for calls that are not actual function calls, like calls to intrinsics. This patch updates the code to ensure only calls that are lowered to actual calls are considered inline candidates.	2021-01-19 14:37:36 +00:00
Praveen	c42f5ca3d8	[Flang][OpenMP] Add semantic checks for OpenMP Workshare Construct Add Semantic checks for OpenMP 4.5 - 2.7.4 Workshare Construct. - The structured block in a workshare construct may consist of only scalar or array assignments, forall or where statements, forall, where, atomic, critical or parallel constructs. - All array assignments, scalar assignments, and masked array assignments must be intrinsic assignments. - The construct must not contain any user defined function calls unless the function is ELEMENTAL. Test cases : omp-workshare03.f90, omp-workshare04.f90, omp-workshare05.f90 Resolve test cases (omp-workshare01.f90 and omp-workshare02.f90) marked as XFAIL Reviewed By: kiranchandramohan Differential Revision: https://reviews.llvm.org/D93091	2021-01-19 20:00:12 +05:30
Simon Pilgrim	2988f940d8	[X86] Regenerate fmin/fmax reduction tests Add missing check-prefixes + v1f32 tests	2021-01-19 14:28:44 +00:00
Raphael Isemann	626681b09a	[lldb] Fix two documentation typos	2021-01-19 15:25:15 +01:00
Lei Zhang	3a56a96664	[mlir][spirv] Define spv.GLSL.Fma and add lowerings Also changes some rewriter.create + rewriter.replaceOp calls into rewriter.replaceOpWithNewOp calls. Reviewed By: hanchung Differential Revision: https://reviews.llvm.org/D94965	2021-01-19 09:14:21 -05:00
Tim Northover	6259fbd8b6	AArch64: add apple-a14 as a CPU This CPU supports all v8.5a features except BTI, and so identifies as v8.5a to Clang. A bit weird, but the best way for things like xnu to detect the new features it cares about.	2021-01-19 14:04:53 +00:00
Nicolas Vasilache	93a873dfc9	[mlir][Affine] Revisit and simplify composeAffineMapAndOperands. In prehistorical times, AffineApplyOp was allowed to produce multiple values. This allowed the creation of intricate SSA use-def chains. AffineApplyNormalizer was originally introduced as a means of reusing the AffineMap::compose method to write SSA use-def chains. Unfortunately, symbols that were produced by an AffineApplyOp needed to be promoted to dims and reordered for the mathematical composition to be valid. Since then, single result AffineApplyOp became the law of the land but the original assumptions were not revisited. This revision revisits these assumptions and retires AffineApplyNormalizer. Differential Revision: https://reviews.llvm.org/D94920	2021-01-19 13:52:07 +00:00
Hans Wennborg	ec877106a3	[ThinLTO] Also prune Thin-* files from the ThinLTO cache Such files (Thin-%%%%%%.tmp.o) are supposed to be deleted immediately after they're used (either by renaming or deletion). However, we've seen instances on Windows where this doesn't happen, probably due to the filesystem being flaky. This is effectively a resource leak which has prevented us from using the ThinLTO cache on Windows. Since those temporary files are in the thinlto cache directory which we prune periodically anyway, allowing them to be pruned too seems like a tidy way to solve the problem. Differential revision: https://reviews.llvm.org/D94962	2021-01-19 14:43:49 +01:00
Med Ismail Bennani	1d37db6ef5	[llvm/Orc] Fix ExecutionEngine module build breakage This patch updates the llvm module map to reflect changes made in `24672ddea3c97fd1eca3e905b23c0116d7759ab8` and fixes the module builds (`-DLLVM_ENABLE_MODULES=On`). Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>	2021-01-19 14:39:06 +01:00
Faris Rehman	197d9a55f1	[flang][driver] Add standard macro predefinitions for compiler version Add the following standard predefinitions that f18 supports: * `__flang__`, * `__flang_major__`, * `__flang_minor__`, * `__flang_patchlevel__` Summary of changes: - Populate Fortran::parser::Options#predefinitions with the default supported predefinitions Differential Revision: https://reviews.llvm.org/D94516	2021-01-19 13:22:59 +00:00
AndreyChurbanov	a60bc55c69	[OpenMP] libomp: cleanup parsing of OMP_ALLOCATOR env variable. Differential Revision: https://reviews.llvm.org/D94932	2021-01-19 16:21:22 +03:00
OCHyams	d77a572087	[DebugInfo][dexter] Tweak dexter test for merged values Tweak dexter-tests/memvars/inline-escaping-function.c added in D94761 (`b7e516202e`) by adding a 'param' use after the merge point. The test XFAILS with and without this change, but without it the test looks very similar to memvars/unused-merged-value.c. The test now demonstrates the problem more clearly.	2021-01-19 12:45:31 +00:00
Faris Rehman	443d6957ca	[flang][driver] Add support for fixed form detection Currently the new flang driver always runs in free form mode. This patch adds support for fixed form mode detection based on the file extensions. Like `f18`, `flang-new` will treat files ending with ".f", ".F" and ".ff" as fixed form. Additionally, ".for", ".FOR", ".fpp" and ".FPP" file extensions are recognised as fixed form files. This is consistent with gfortran [1]. In summary, files with the following extensions are treated as fixed-form: * ".f", ".F", ".ff", ".for", ".FOR", ".fpp", ".FPP" For consistency with flang/test/lit.cfg.py and f18, this patch also adds support for the following file extensions: * ".ff", ".FOR", ".for", ".ff90", ".fpp", ".FPP" This is added in flang/lib/Frontend/FrontendOptions.cpp. Additionally, the following extensions are included: * ".f03", ".F03", ".f08", ".F08" This is for compatibility with gfortran [1] and other popular Fortran compilers [2]. NOTE: internally Flang will only differentiate between fixed and free form files. Currently Flang does not support switching between language standards, so in this regard file extensions are irrelevant. More specifically, both `file.f03` and `file.f18` are represented with `Language::Fortran` (as opposed to e.g. `Language::Fortran03`). Summary of changes: - Set Fortran::parser::Options::isFixedForm according to the file type - Add isFixedFormSuffix and isFreeFormSuffix helper functions to FrontendTool/Utils.h - Change FrontendOptions::GetInputKindForExtension to support the missing file extensions that f18 supports and some additional ones - FrontendActionTest.cpp is updated to make sure that the test input is treated as free-form [1] https://gcc.gnu.org/onlinedocs/gfortran/GNU-Fortran-and-GCC.html [2] https://github.com/llvm/llvm-project/blob/master/flang/docs/OptionComparison.md#notes Differential Revision: https://reviews.llvm.org/D94228	2021-01-19 12:58:01 +00:00
Adam Czachorowski	a6f9077b16	[clang] Check for nullptr when instantiating late attrs This was already done in SemaTemplateInstantiateDecl.cpp, but not in SemaTemplateInstantiate.cpp. Anecdotally I've seen some clangd crashes where coredumps point to this being a problem, but I cannot reproduce this so far. Differential Revision: https://reviews.llvm.org/D94933	2021-01-19 13:43:15 +01:00
Alex Zinenko	9a60ad216d	[mlir] Clarify docs around LLVM dialect-compatible types Explicitly mention that there is exactly one MLIR type that corresponds to a given LLVM IR type.	2021-01-19 13:42:16 +01:00
Abhina Sreeskantharajan	2c4f6be86c	[SystemZ][z/OS] Fix No such file or directory expression error On z/OS, the following error message is not matched correctly in lit tests. This patch updates the CHECK expression to match the end period successfully. ``` EDC5129I No such file or directory. ``` Differential Revision: https://reviews.llvm.org/D94239	2021-01-19 07:25:24 -05:00
Caroline Concatto	172f1f8952	[AArch64][SVE]Add cost model for vector reduce for scalable vector This patch computes the cost for vector.reduce<operand> for scalable vectors. The cost is split into two parts: the legalization cost and the horizontal reduction. Differential Revision: https://reviews.llvm.org/D93639	2021-01-19 11:54:16 +00:00
OCHyams	b7e516202e	[DebugInfo][dexter] Add dexter tests for merged values These dexter tests illustrate PR48719, the summary of which is: Sometimes we insert dbg.values for merged values (PHIs) when promoting variables, sometimes we don't. Sometimes there is no PHI because the merged value is never used. It doesn't matter because LiveDebugValues understands these merged values (implicit or otherwise) and correctly updates the debug info. Importantly, these merged variable values (which may or may not exist as PHIs, and may or may not be represented with dbg.values) are //always// implicitly defined by the combination of incoming edges and the incoming variable locations along those edges by virtue of LiveDebugValues existing. Unfortunately, it is possible to mess with the CFG and remove / move these edges before LiveDebugValues runs. In this case our debug info model only works when the merged value is tracked by a dbg.value. Currently, this is only done rigorously for variables which are A) promoted in the first round of mem2reg and B) are used after the merge point. As an example, compile the following source with -O3 -g and step through with a debugger. You will see parama=5 throughout the function fun which is incorrect - we expect to see parama=20 after the conditional assignment. __attribute__((optnone)) void esc(int* p) {} __attribute__((optnone)) void fluff() {} __attribute__((noinline)) int fun(int parama, int paramb) { if (parama) parama = paramb; fluff(); // DexLabel('s0') esc(&parama); return 0; } int main() { return fun(5, 20); } 1. parama is escaped by esc(&parama) so it is not promoted by SROA/mem2reg (failing condition "A" above). 2. InstCombine's LowerDbgDeclare converts the dbg.declare to a set of dbg.values (tracking the stored SSA values). 3. InstCombine replaces the two stores to parama's alloca (the initial parameter register store in entry and the assignment in if.then) with a PHI+store in the common successor. 4. SimplifyCFG folds the blocks together and converts the PHI to a select. The debug info is not updated to account for the merged value in the successor prior to SimplifyCFG when it exists as a PHI, or during SimplifyCFG when it becomes a select. As with D89543, which added some dexter tests for escaped locals, the idea is to build a set of source-level tests which highlights existing issues and might be useful in evaluating a new debug info model. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D94761	2021-01-19 11:11:00 +00:00
Faris Rehman	87dfd5e012	[flang][driver] Add support for `-I` in the new driver Add support for option -I in the new Flang driver. This will allow for included headers and module files in other directories, as the default search path is currently the working folder. The behaviour of this is consistent with the current f18 driver, where the current folder (i.e. ".") has the highest priority followed by the order of '-I's taking priority from first to last. Summary of changes: - Add SearchDirectoriesFromDashI to PreprocessorOptions, to be forwarded into the parser's searchDirectories - Add header files and non-functional module files to be used in regression tests. The module files are just text files and are used to demonstrate that paths specified with `-I` are taken into account when searching for .mod files. Differential Revision: https://reviews.llvm.org/D93453	2021-01-19 11:20:56 +00:00
Alexander Belyaev	11f4c58c15	[mlir] Add `complex.abs`, `complex.div` and `complex.mul` to ComplexOps. Differential Revision: https://reviews.llvm.org/D94911	2021-01-19 12:09:59 +01:00
Simon Pilgrim	5626adcd6b	[X86][SSE] combineVectorSignBitsTruncation - fold trunc(srl(x,c)) -> packss(sra(x,c)) If a srl doesn't introduce any sign bits into the truncated result, then replace with a sra to let us use a PACKSS truncation - fixes a regression noticed in D56387 on pre-SSE41 targets that don't have PACKUSDW.	2021-01-19 11:04:13 +00:00
Hans Wennborg	58bdfcfac0	Revert `5238e7b302` "[InstCombine] Replace one-use select operand based on condition" This caused a miscompile in Chromium, see comments on the codereview for discussion and pointer to a reproducer. > InstCombine already performs a fold where X == Y ? f(X) : Z is > transformed to X == Y ? f(Y) : Z if f(Y) simplifies. However, > if f(X) only has one use, then we can always directly replace the > use inside the instruction. To actually be profitable, limit it to > the case where Y is a non-expr constant. > > This could be further extended to replace uses further up a one-use > instruction chain, but for now this only looks one level up. > > Among other things, this also subsumes D94860. > > Differential Revision: https://reviews.llvm.org/D94862 This also reverts the follow-up a003f26539cf4db744655e76c41f4c4a8913f116: > [llvm] Prevent infinite loop in InstCombine of select statements > > This fixes an issue where the RHS and LHS of the comparison operation > creating the predicate were swapped back and forth forever. > > Differential Revision: https://reviews.llvm.org/D94934	2021-01-19 11:50:56 +01:00
Jay Foad	49dce85584	[AMDGPU] Simplify AMDGPUInstPrinter::printExpSrcN. NFC. Change-Id: Idd7f47647bc0faa3ad6f61f44728c0f20540ec00	2021-01-19 10:39:56 +00:00
Florian Hahn	83daa49758	[LoopRotate] Add PrepareForLTO stage, avoid rotating with inline cands. D84108 exposed a bad interaction between inlining and loop-rotation during regular LTO, which is causing notable regressions in at least CINT2006/473.astar. The problem boils down to: we now rotate a loop just before the vectorizer which requires duplicating a function call in the preheader when compiling the individual files ('prepare for LTO'). But this then prevents further inlining of the function during LTO. This patch tries to resolve this issue by making LoopRotate more conservative with respect to rotating loops that have inline-able calls during the 'prepare for LTO' stage. I think this change intuitively improves the current situation in general. Loop-rotate tries hard to avoid creating headers that are 'too big'. At the moment, it assumes all inlining already happened and the cost of duplicating a call is equal to just doing the call. But with LTO, inlining also happens during full LTO and it is possible that a previously duplicated call is actually a huge function which gets inlined during LTO. From the perspective of LV, not much should change overall. Most loops calling user-provided functions won't get vectorized to start with (unless we can infer that the function does not touch memory, has no other side effects). If we do not inline the 'inline-able' call during the LTO stage, we merely delayed loop-rotation & vectorization. If we inline during LTO, chances should be very high that the inlined code is itself vectorizable or the user call was not vectorizable to start with. There could of course be scenarios where we inline a sufficiently large function with code not profitable to vectorize, which would have been vectorized earlier (by scalarizing the call). But even in that case, there probably is no big performance impact, because it should be mostly down to the cost-model to reject vectorization in that case. And then the version with scalarized calls should also not be beneficial. In a way, LV should have strictly more information after inlining and make more accurate decisions (barring cost-model issues). There is of course plenty of room for things to go wrong unexpectedly, so we need to keep a close eye on actual performance and address any follow-up issues. I took a look at the impact on statistics for MultiSource/SPEC2000/SPEC2006. There are a few benchmarks with fewer loops rotated, but no change to the number of loops vectorized. Reviewed By: sanwou01 Differential Revision: https://reviews.llvm.org/D94232	2021-01-19 10:15:29 +00:00
Muhammad Omair Javaid	4d3081331a	[LLDB] Test SVE dynamic resize with multiple threads This patch adds a new test case which depends on AArch64 SVE support and dynamic resize capability enabled. It creates two separate threads which have different values of SVE registers and SVE vector granule at various points during execution. We test that LLDB is doing the size and offset updates properly for all of the threads including the main thread, and that when VG is updated using a prctl call or by the 'register write vg' command the appropriate changes are also updated in the register infos. Reviewed By: labath Differential Revision: https://reviews.llvm.org/D82866	2021-01-19 15:01:32 +05:00

377413 Commits

All Branches