llvm-project

Commit Graph

Author	SHA1	Message	Date
Paul Walker	614fb09645	[SVE] Disable some BUILD_VECTOR related code generator features. Fixed length vector code generation for SVE does not yet custom lower BUILD_VECTOR and instead relies on expansion. At the same time custom lowering for VECTOR_SHUFFLE is also not available so this patch updates isShuffleMaskLegal to reject vector types that require SVE. Related to this it also prevents the merging of stores after legalisation because this only works when BUILD_VECTOR is either legal or can be eliminated. When this is not the case the code generator enters an infinite legalisation loop. Differential Revision: https://reviews.llvm.org/D83408	2020-07-09 10:47:04 +00:00
Simon Pilgrim	58a85717cc	DebugCounterList::printOptionInfo - use const auto& iterator in for-range-loop. Avoids unnecessary copies and silences clang tidy warning.	2020-07-09 11:37:49 +01:00
Jun Ma	f0bfad2ed9	[Coroutines] Refactor sinkLifetimeStartMarkers Differential Revision: https://reviews.llvm.org/D83379	2020-07-09 18:23:28 +08:00
Simon Pilgrim	03fe47a29c	ConstantFoldScalarCall3 - use const APInt& returned by getValue() Avoids unnecessary APInt copies and silences clang tidy warning.	2020-07-09 11:16:47 +01:00
Simon Pilgrim	dbed9d5ce7	VersionPrinter - use const auto& iterator in for-range-loop. Avoids unnecessary copies and silences clang tidy warning.	2020-07-09 10:56:38 +01:00
Dmitry Polukhin	9e7fddbd36	[yaml][clang-tidy] Fix multiline YAML serialization Summary: New line duplication logic introduced in https://reviews.llvm.org/D63482 has two issues: (1) there is no logic that removes duplicate newlines when clang-apply-replacements reads YAML and (2) in general such logic should be applied to all strings and should happen at the string serialization level instead of in the YAML parser. This diff changes multiline string quotation from single quote `'` to double quote `"`. It solves problems with internal newlines because now they are escaped. Double quotation also solves the problem with leading whitespace after a newline. With single quotation, YAML parsers should remove leading whitespace according to the specification. With double quotation, this leading whitespace is internal space and is preserved. There is no way to instruct YAML parsers to preserve leading whitespace after a newline, so double quotation is the only viable option that solves all problems at once. Test Plan: check-all Reviewers: gribozavr, mgehre, yvvan Subscribers: xazax.hun, hiraditya, cfe-commits, llvm-commits Tags: #clang-tools-extra, #clang, #llvm Differential Revision: https://reviews.llvm.org/D80301	2020-07-09 02:41:58 -07:00
serge-sans-paille	e4ec6d0afe	Correctly update return status for MVEGatherScatterLowering `Changed` should reflect all possible changes. Differential Revision: https://reviews.llvm.org/D83459	2020-07-09 11:18:54 +02:00
Oliver Stannard	dc4a6f5db4	[llvm-objdump] Display locations of variables alongside disassembly This adds the --debug-vars option to llvm-objdump, which prints locations (registers/memory) of source-level variables alongside the disassembly based on DWARF info. A vertical line is printed for each live-range, with a label at the top giving the variable name and location, and the position and length of the line indicating the program counter range in which it is valid. Differential revision: https://reviews.llvm.org/D70720	2020-07-09 09:58:00 +01:00
Florian Hahn	b805e94477	[PredicateInfo] Add additional RenamedOp field to PB. OriginalOp of a predicate always refers to the original IR value that was renamed. So for nested predicates of the same value, it will always refer to the original IR value. For the use in SCCP however, we need to find the renamed value that is currently used in the condition associated with the predicate. This patch adds a new RenamedOp field to do exactly that. NewGVN currently relies on the existing behavior to merge instruction metadata. A test case to check for exactly that has been added in `195fa4bfae`. Reviewers: efriedma, davide, nikic Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D78133	2020-07-09 09:51:18 +01:00
Lucas Prates	fc39a9ca0e	[CodeGen] Matching promoted type for 16-bit integer bitcasts from fp16 operand Summary: When legalizing a bitcast operation from an fp16 operand to an i16 on a target that requires both input and output types to be promoted to 32-bits, an assertion can fail when building the new node due to a mismatch between the operation's result size and the type specified to the node. This patch fixes the issue by making sure the bit width of the types match for the FP_TO_FP16 node, covering the difference with an extra ANYEXTEND operation. Reviewers: ostannard, efriedma, pirama, jmolloy, plotfi Reviewed By: efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82552	2020-07-09 09:46:17 +01:00
Shengchen Kan	e59e39b7c4	[MC] Simplify the logic of applying fixup for fragments, NFCI Replace multiple `if else` clauses with a `switch` clause and remove redundant checks. Before this patch, we need to add a statement like `if(!isa<MCxxxFragment>(Frag)) ` here each time we add a new kind of `MCEncodedFragment` even if it has no fixups. After this patch, we don't need to do that. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D83366	2020-07-09 16:39:13 +08:00
serge-sans-paille	a60c31fd62	Fix return status of AtomicExpandPass Correctly reflect change in the return status. Differential Revision: https://reviews.llvm.org/D83457	2020-07-09 10:27:48 +02:00
Kai Luo	e2b93185b8	[PowerPC] Only make copies of registers on stack in variadic function when va_start is called On PPC64, for a variadic function, if va_start is not called, it won't access any variadic argument on stack, thus we can save stores of registers used to pass arguments. Differential Revision: https://reviews.llvm.org/D82361	2020-07-09 07:18:17 +00:00
Vitaly Buka	e38727a0bb	[StackSafety,NFC] Update documentation It's follow up for D80908 Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D82941	2020-07-08 23:57:13 -07:00
Craig Topper	c96877ff62	[X86] Remove unnecessary union from getHostCPUFeatures. NFC This seems to be leftover copied from an older implementation of getHostCPUName where we needed this to check the name of CPU vendor. We don't check the CPU vendor at all in getHostCPUFeatures so this union and the variable are unneeded.	2020-07-08 23:42:05 -07:00
Lang Hames	6709150944	[ORC] Modify LazyCallThroughManager to support asynchronous resolution. Asynchronous resolution is a better fit for handling reentry over IPC/RPC where we want to avoid blocking a communication handler/thread.	2020-07-08 21:13:55 -07:00
Qiu Chaofan	4254ed5c32	[Legalizer] Fix wrong operand in split vector helper This should be a typo introduced in D69275, which may cause an unknown segmentation fault in getNode. Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D83376	2020-07-09 09:57:29 +08:00
Matt Arsenault	18bd821f02	DAG: Remove redundant finalizeLowering call 9cac4e6d1403554b06ec2fc9d834087b1234b695/D32628 intended to eliminate this, and move all isel pseudo expansion to FinalizeISel. This was a bad rebase or something, and failed to actually delete this call. GlobalISel also has a redundant call of finalizeLowering. However, it requires more work to remove it since it currently triggers a lot of verifier errors in tests.	2020-07-08 18:48:20 -04:00
Matt Arsenault	2ec5fc0c61	DAG: Remove redundant handling of reg fixups It looks like `9cac4e6d14` accidentally added a second copy of this from a bad rebase or something. This second copy was added, and the finalizeLowering call was not deleted as intended.	2020-07-08 18:32:43 -04:00
Matt Arsenault	74a148ad39	GlobalISel: Verify G_BITCAST changes the type Updated the AArch64 tests the best I could with my vague, inferred understanding of AArch64 register banks. As far as I can tell, there is only one 32-bit/64-bit type which will use the gpr register bank, so we have to use the fpr bank for the other operand.	2020-07-08 17:16:27 -04:00
Arthur Eubanks	930eaadacf	[opt] Remove obsolete --quiet option git blame shows these were last touched in 2004? Obsoleted in r13844. Reviewed By: hans Differential Revision: https://reviews.llvm.org/D83409	2020-07-08 13:21:20 -07:00
Craig Topper	9b1e95329a	[InstSimplify] Remove select ?, undef, X -> X and select ?, X, undef -> X transforms As noted here https://lists.llvm.org/pipermail/llvm-dev/2016-October/106182.html and by alive2, this transform isn't valid. If X is poison this potentially propagates poison when it shouldn't. This same transform still exists in DAGCombiner. Differential Revision: https://reviews.llvm.org/D83360	2020-07-08 12:53:05 -07:00
Nikita Popov	0b39d2d752	Revert "[NFC] Separate Peeling Properties into its own struct" This reverts commit `0369dc98f9`. Many failing tests.	2020-07-08 21:43:32 +02:00
Nikita Popov	a48cf72238	[InstSimplify] Handle not inserted instruction gracefully (PR46638) When simplifying comparisons using a dominating assume, bail out if the context instruction is not inserted.	2020-07-08 21:43:32 +02:00
Gui Andrade	ff7900d5de	[LLVM] Accept `noundef` attribute in function definitions/calls The `noundef` attribute indicates an argument or return value which may never have an undef value representation. This patch allows LLVM to parse the attribute. Differential Revision: https://reviews.llvm.org/D83412	2020-07-08 19:02:04 +00:00
Sidharth Baveja	0369dc98f9	[NFC] Separate Peeling Properties into its own struct Summary: This patch makes the peeling properties of the loop accessible by other loop transformations. Author: sidbav (Sidharth Baveja) Reviewers: Whitney (Whitney Tsang), Meinersbur (Michael Kruse), skatkov (Serguei Katkov), ashlykov (Arkady Shlykov), bogner (Justin Bogner), hfinkel (Hal Finkel) Reviewed By: Meinersbur (Michael Kruse) Subscribers: fhahn (Florian Hahn), hiraditya (Aditya Kumar), llvm-commits, LLVM Tag: LLVM Differential Revision: https://reviews.llvm.org/D80580	2020-07-08 18:59:59 +00:00
Anh Tuyen Tran	6965af43e6	Revert "[NFC] Separate Peeling Properties into its own struct" This reverts commit `fead250b43`.	2020-07-08 18:58:05 +00:00
Anh Tuyen Tran	fead250b43	[NFC] Separate Peeling Properties into its own struct Summary: This patch makes the peeling properties of the loop accessible by other loop transformations. Author: sidbav (Sidharth Baveja) Reviewers: Whitney (Whitney Tsang), Meinersbur (Michael Kruse), skatkov (Serguei Katkov), ashlykov (Arkady Shlykov), bogner (Justin Bogner), hfinkel (Hal Finkel) Reviewed By: Meinersbur (Michael Kruse) Subscribers: fhahn (Florian Hahn), hiraditya (Aditya Kumar), llvm-commits, LLVM Tag: LLVM Differential Revision: https://reviews.llvm.org/D80580	2020-07-08 18:56:03 +00:00
Sanjay Patel	1265eb2d5f	[DAGCombiner] clean up in mergeConsecutiveStores(); NFC	2020-07-08 14:48:05 -04:00
Sanjay Patel	12c2271e53	[DAGCombiner] fix code comment and improve readability; NFC	2020-07-08 14:48:05 -04:00
Jay Foad	47788b97a9	SILoadStoreOptimizer: add support for GFX10 image instructions GFX10 image instructions use one or more address operands starting at vaddr0, instead of a single vaddr operand, to allow for NSA forms. Differential Revision: https://reviews.llvm.org/D81675	2020-07-08 19:15:46 +01:00
Jay Foad	a8816ebee0	[AMDGPU] Fix and simplify AMDGPULegalizerInfo::legalizeUDIV_UREM32Impl Use the algorithm from AMDGPUCodeGenPrepare::expandDivRem32. Differential Revision: https://reviews.llvm.org/D83383	2020-07-08 19:14:49 +01:00
Jay Foad	ecac951be9	[AMDGPU] Fix and simplify AMDGPUTargetLowering::LowerUDIVREM Use the algorithm from AMDGPUCodeGenPrepare::expandDivRem32. Differential Revision: https://reviews.llvm.org/D83382	2020-07-08 19:14:49 +01:00
Jay Foad	f4bd01c191	[AMDGPU] Fix and simplify AMDGPUCodeGenPrepare::expandDivRem32 Fix the division/remainder algorithm by adding a second quotient refinement step, which is required in some cases like 0xFFFFFFFFu / 0x11111111u (https://bugs.llvm.org/show_bug.cgi?id=46212). Also document, rewrite and simplify it by ensuring that we always have a lower bound on inv(y), which simplifies the UNR step and the quotient refinement steps. Differential Revision: https://reviews.llvm.org/D83381	2020-07-08 19:14:48 +01:00
Christopher Tetreault	c444b1b904	[SVE] Remove calls to VectorType::getNumElements from Scalar Reviewers: efriedma, fhahn, reames, kmclaughlin, sdesmalen Reviewed By: sdesmalen Subscribers: tschuett, hiraditya, rkruppe, psnobl, dantrushin, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82243	2020-07-08 11:08:20 -07:00
Fangrui Song	4137ab62cf	[Support] Define llvm::parallel::strategy for -DLLVM_ENABLE_THREADS=off builds after D76885	2020-07-08 10:51:20 -07:00
Sanjay Patel	683a7f7025	[DAGCombiner] fix function-name formatting; NFC	2020-07-08 12:49:59 -04:00
Sanjay Patel	39329d5724	[DAGCombiner] add enum for store source value; NFC This removes existing code duplication and allows us to assert that we are handling the expected cases. We have a list of outstanding bugs that could benefit by handling truncated source values, so that's a possible addition going forward.	2020-07-08 12:49:59 -04:00
Simon Pilgrim	800fb68420	[X86][SSE] Pull out PACK(SHUFFLE(),SHUFFLE()) folds into its own function. NFC. Future patches will extend this so declutter combineVectorPack before we start.	2020-07-08 17:42:42 +01:00
Simon Pilgrim	08a2c9ce5c	[X86] Fix copy+paste typo in combineVectorPack assert message. NFC.	2020-07-08 17:42:42 +01:00
Arthur Eubanks	0b2536d0bd	[NewPM] Add PredicateInfoPrinterPass to PassRegistry.def Fixes tests under NPM in Transforms/Util/PredicateInfo.	2020-07-08 09:32:46 -07:00
Wei Mi	e32469a140	[SampleFDO] Enable sample-profile-top-down-load and sample-profile-merge-inlinee by default. sample-profile-top-down-load is an internal option which can enable top-down order of inlining and profile annotation in sample profile load pass. It was found to be beneficial for better profile annotation. Recently we found it could also solve some build time issues. Suppose function A has many callsites in function B. In the last release binary where the sample profile was collected, the outline copy of A is large because there are many other functions inlined into A. However, although all the callsites calling A in B are inlined, every inlined body is small (A was inlined into B before other functions were inlined into A), so there is no build time issue in the last release. In an optimized build using the sample profile collected from the last release, without top-down inlining, we saw a case where A got very large because of inlining, and then multiple callsites of A got inlined into B, and that led to a huge B which caused a significant build time issue besides the profile annotation issue. To solve that problem, the patch enables the flag sample-profile-top-down-load by default. sample-profile-top-down-load can have better performance when it is enabled together with sample-profile-merge-inlinee, so in this patch we also enable sample-profile-merge-inlinee by default. Differential Revision: https://reviews.llvm.org/D82919	2020-07-08 09:23:18 -07:00
Ulrich Weigand	cca8578efa	[SystemZ] Allow specifying integer registers as part of the address calculation Revision `e1de2773a5` provided support for accepting integer registers in inline asm i.e. __asm("lhi %r0, 5") -> lhi %r0, 5 __asm("lhi 0, 5") -> lhi 0,5 This patch aims to extend this support to instructions which compute addresses as well. (i.e. instructions of type BDMem and BD[X|R|V|L]Mem) Author: anirudhp Differential Revision: https://reviews.llvm.org/D83251	2020-07-08 18:20:24 +02:00
Nicolai Hähnle	3fa989d4fd	DomTree: remove explicit use of DomTreeNodeBase::iterator Summary: Almost all uses of these iterators, including implicit ones, really only need the const variant (as it should be). The only exception is in NewGVN, which changes the order of dominator tree child nodes. Change-Id: I4b5bd71e32d71b0c67b03d4927d93fe9413726d4 Reviewers: arsenm, RKSimon, mehdi_amini, courbet, rriddle, aartbik Subscribers: wdng, Prazek, hiraditya, kuhar, rogfer01, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, stephenneuendorffer, Joonsoo, grosul1, vkmr, Kayjukh, jurahul, msifontes, cfe-commits, llvm-commits Tags: #clang, #mlir, #llvm Differential Revision: https://reviews.llvm.org/D83087	2020-07-08 18:18:49 +02:00
serge-sans-paille	bf9a940c3f	Revert "Double check that passes correctly set their Modified status" This reverts commit `37afd99c76`.	2020-07-08 18:14:40 +02:00
Evgeny Leviant	a074984250	[MIR] Speedup parsing of function with large number of basic blocks Patch eliminates string length calculation when lexing a token. Speedup can be up to 1000x. Differential revision: https://reviews.llvm.org/D83389	2020-07-08 18:50:00 +03:00
Arthur Eubanks	470bf7b5a2	[Preallocated] Add @llvm.call.preallocated.teardown This cleans up the stack allocated by a @llvm.call.preallocated.setup. Should either call the teardown or the preallocated call to clean up the stack. Calling both is UB. Add LangRef. Add verifier check that the token argument is a @llvm.call.preallocated.setup. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D83354	2020-07-08 08:48:44 -07:00
Paul Walker	bb35f0fd89	[SelectionDAG] Fix incorrect offset when expanding CONCAT_VECTORS. ExpandVectorBuildThroughStack is also used for CONCAT_VECTORS. However, when calculating the offsets for each of the operands we incorrectly use the element size rather than actual size and thus the stores overlap. Differential Revision: https://reviews.llvm.org/D83303	2020-07-08 15:39:25 +00:00
serge-sans-paille	37afd99c76	Double check that passes correctly set their Modified status The approach is simple: if a pass reports that it's not modifying a Function/Module, compute a loose hash of that Function/Module and compare it with the original one. If we report no change but there's a hash change, then we have an error. This approach misses a lot of change but it's not super intrusive and can detect most of the simple mistakes. Differential Revision: https://reviews.llvm.org/D80916	2020-07-08 17:36:13 +02:00
sstefan1	6aab27ba85	[OpenMPIRBuilder][Fix] Move llvm::omp::types to OpenMPIRBuilder. Summary: D82193 exposed a problem with global type definitions in `OMPConstants.h`. This causes a race when running in thinLTO mode. Types now live inside of OpenMPIRBuilder to prevent this from happening. Reviewers: jdoerfert Subscribers: yaxunl, hiraditya, guansong, dexonsmith, aaron.ballman, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D83176	2020-07-08 17:23:55 +02:00
Stanislav Mekhanoshin	64030099c3	SLP: honor requested max vector size merging PHIs At the moment this place does not check maximum size set by TTI and just creates a maximum possible vectors. Differential Revision: https://reviews.llvm.org/D82227	2020-07-08 08:06:15 -07:00
Ties Stuij	26a22478cd	[CodeGen] Don't combine extract + concat vectors with non-legal types Summary: The following combine currently breaks in the DAGCombiner: ``` extract_vector_elt (concat_vectors v4i16:a, v4i16:b), x -> extract_vector_elt a, x ``` This happens because after we have combined these nodes we have inserted nodes that use individual instances of the vector element type. In the above example i16. However this isn't a legal type on all backends, and when the combining pass calls the legalizer it breaks as it expects types to already be legal. The type legalizer has already been run, and running it again would make a mess of the nodes. In the example code at least, the generated code is still efficient after the change. Reviewers: miyuki, arsenm, dmgreen, lebedev.ri Reviewed By: miyuki, lebedev.ri Subscribers: lebedev.ri, wdng, hiraditya, steven.zhang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D83231	2020-07-08 15:29:57 +01:00
Sanjay Patel	9114900287	[x86] improve codegen for non-splat bit-masked vector compare and select (PR46531) vselect ((X & Pow2C) == 0), LHS, RHS --> vselect ((shl X, C') < 0), RHS, LHS Follow-up to D83073 - the non-splat mask cases where we actually see an improvement are quite limited from what I can tell. AVX1 needs multiply and blend capabilities and AVX2 needs vector shift and blend capabilities. The intersection of those 2 constraints is only vectors with 32-bit or 64-bit elements. XOP is/was better. Differential Revision: https://reviews.llvm.org/D83181	2020-07-08 08:20:49 -04:00
Simon Pilgrim	9dc250db9d	[X86][AVX] SimplifyDemandedVectorEltsForTargetShuffle - ensure mask is same size as constant size Fixes test regression reported on D81791	2020-07-08 11:47:59 +01:00
Petar Avramovic	419c92a749	[GlobalISel][InlineAsm] Fix matching input constraints to mem operand Mark matching input constraint to mem operand as not supported. Differential Revision: https://reviews.llvm.org/D83235	2020-07-08 12:32:17 +02:00
Oliver Stannard	a50c7ebfd0	[Support] Fix signed/unsigned comparison warning	2020-07-08 11:26:10 +01:00
Paul Walker	fb75451775	[SVE] Custom ISel for fixed length extract/insert_subvector. We use extract_subvector and insert_subvector to "cast" between fixed length and scalable vectors. This patch adds custom C++ based ISel for the following cases: fixed_vector = ISD::EXTRACT_SUBVECTOR scalable_vector, 0 scalable_vector = ISD::INSERT_SUBVECTOR undef(scalable_vector), fixed_vector, 0 Which result in either EXTRACT_SUBREG/INSERT_SUBREG for NEON sized vectors or COPY_TO_REGCLASS otherwise. Differential Revision: https://reviews.llvm.org/D82871	2020-07-08 09:49:28 +00:00
Jeremy Morse	b9d977b0ca	[DWARF] Add cutoff guarding quadratic validThroughout behaviour Occasionally we see absolutely massive basic blocks, typically in global constructors that are vulnerable to heavy inlining. When these blocks are dense with DBG_VALUE instructions, we can hit near quadratic complexity in DwarfDebug's validThroughout function. The problem is caused by: * validThroughout having to step through all instructions in the block to examine their lexical scope, * and a high proportion of instructions in that block being DBG_VALUEs for a unique variable fragment, leading to us stepping through every instruction in the block, for (nearly) each instruction in the block. By adding this guard, we force variables in large blocks to use a location list rather than a single-location expression, as shown in the added test. This shouldn't change the meaning of the output DWARF at all: instead we use a less efficient DWARF encoding to avoid a poor-performance code path. Differential Revision: https://reviews.llvm.org/D83236	2020-07-08 10:30:09 +01:00
Simon Pilgrim	997a3c29f4	Fix MSVC "not all control paths return a value" warnings. NFC.	2020-07-08 10:18:36 +01:00
Simon Pilgrim	c00a27752e	[X86][AVX] Remove redundant EXTRACT_VECTOR_ELT(VBROADCAST(SCALAR())) fold Noticed while looking for similar cases to rG931ec74f7a29 - SimplifyDemandedVectorElts and shuffle combining both should handle this now.	2020-07-08 10:18:36 +01:00
Georgii Rymar	bee8cdcabd	[DebugInfo/DWARF] - Test invalid CFI opcodes properly and refine related `CFIProgram::parse` code. There are the following issues with the `CFIProgram::parse` code: 1) Invalid CFI opcodes were never tested. And currently a test would fail when `LLVM_ENABLE_ABI_BREAKING_CHECKS` is enabled. It happens because the `DataExtractor::Cursor C` remains unchecked when the "Invalid extended CFI opcode" error is reported: ``` .eh_frame section at offset 0x1128 address 0x0: Program aborted due to an unhandled Error: Error value was Success. (Note: Success values must still be checked prior to being destroyed). ``` 2) It is impossible to reach the "Invalid primary CFI opcode" error with the current code. There are 3 possible primary opcode values and all of them are handled. Hence this error should be replaced with llvm_unreachable. 3) Errors currently reported are upper-case. This patch refines the code in the `CFIProgram::parse` method to fix all issues mentioned and adds unit tests for all possible invalid extended CFI opcodes. Differential revision: https://reviews.llvm.org/D82868	2020-07-08 12:10:23 +03:00
David Sherwood	9e66e9c30a	[CodeGen] Fix wrong use of getVectorNumElements() in DAGTypeLegalizer::SplitVecRes_ExtendOp In DAGTypeLegalizer::SplitVecRes_ExtendOp I have replaced an invalid call to getVectorNumElements() with a call to getVectorMinNumElements(), since the code path works for both fixed and scalable vectors. This fixes up a warning in the following test: sve-sext-zext.ll Differential Revision: https://reviews.llvm.org/D83197	2020-07-08 09:53:20 +01:00
David Sherwood	5b14f5051f	[CodeGen] Fix wrong use of getVectorNumElements in PromoteIntRes_EXTRACT_SUBVECTOR Calling getVectorNumElements() is not safe for scalable vectors and we should normally use getVectorElementCount() instead. However, for the code changed in this patch I decided to simply move the instantiation of the variable 'OutNumElems' lower down to the place where only fixed-width vectors are used, and hence it is safe to call getVectorNumElements(). Fixes up one warning in this test: sve-sext-zext.ll Differential Revision: https://reviews.llvm.org/D83195	2020-07-08 09:36:34 +01:00
David Sherwood	15aeb805dc	[CodeGen] Fix warnings in sve-ld1-addressing-mode-reg-imm.ll For the GetElementPtr case in function AddressingModeMatcher::matchOperationAddr I've changed the code to use the TypeSize class instead of relying upon the implicit conversion to a uint64_t. As part of this we now check for scalable types and if we encounter one just bail out for now as the subsequent optimisations don't currently support them. This change fixes up all warnings in the following tests: llvm/test/CodeGen/AArch64/sve-ld1-addressing-mode-reg-imm.ll llvm/test/CodeGen/AArch64/sve-st1-addressing-mode-reg-imm.ll Differential Revision: https://reviews.llvm.org/D83124	2020-07-08 09:16:00 +01:00
Heejin Ahn	7e6793aa33	[WebAssembly] Generate unreachable after __stack_chk_fail `__stack_chk_fail` does not return, but `unreachable` was not generated following `call __stack_chk_fail`. This had a possibility to generate an invalid binary for functions with a return type, because `__stack_chk_fail`'s return type is void and `call __stack_chk_fail` can be the last instruction in the function whose return type is non-void. Generating `unreachable` after it makes sure CFGStackify's `fixEndsAtEndOfFunction` handles it correctly. Reviewed By: tlively Differential Revision: https://reviews.llvm.org/D83277	2020-07-08 01:02:05 -07:00
Florian Hahn	80970ac875	[DSE,MSSA] Eliminate stores by terminators (free,lifetime.end). This patch adds support for eliminating stores by free & lifetime.end calls. We can remove stores that are not read before calling a memory terminator and we can eliminate all stores after a memory terminator until we see a new lifetime.start. The second case seems to not really trigger much in practice though. Reviewers: dmgreen, rnk, efriedma, bryant, asbirlea, Tyker Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D72410	2020-07-08 08:59:46 +01:00
serge-sans-paille	edc7da2405	Upgrade TypePromotionTransaction to be able to report changes in CodeGenPrepare optimizeMemoryInst was reporting no change while still modifying the IR. Inspect the status of TypePromotionTransaction to get a better status. Related to https://reviews.llvm.org/D80916 Differential Revision: https://reviews.llvm.org/D81256	2020-07-08 08:35:44 +02:00
Nico Weber	e885f336fd	Revert "[X86] Add back the assert in getImpliedFeatures that I removed in ef4cc70f3ed2a91e0a48c6448c517c3ba34c2846" This reverts commit `91f70675cc`. It seems to break most (all?) hwasan tests.	2020-07-07 22:56:08 -04:00
Craig Topper	51b0da731a	Recommit "[X86] Merge the FEATURE_64BIT and FEATURE_EM64T bits in X86TargetParser.def." These represent the same thing but 64BIT only showed up from getHostCPUFeatures providing a list of features to clang. While EM64T showed up from getting the features for a named CPU. EM64T didn't have a string specifically so it would not be passed up to clang when getting features for a named CPU. While 64bit needed a name since that's how it is indexed. Merge them by filtering 64bit out before sending features to clang for named CPUs.	2020-07-07 19:01:58 -07:00
Ben Shi	1e9d0811c9	[RISCV] optimize addition with a pair of (addi imm) For an addition with an immediate in specific ranges, a pair of addi-addi can be generated instead of the ordinary lui-addi-add serial. Reviewed By: MaskRay, luismarques Differential Revision: https://reviews.llvm.org/D82262	2020-07-07 18:57:28 -07:00
Ben Shi	cb82de2960	[RISCV] Optimize multiplication by constant ... to shift/add or shift/sub. Do not enable it on riscv32 with the M extension where decomposeMulByConstant may not be an optimization. Reviewed By: luismarques, MaskRay Differential Revision: https://reviews.llvm.org/D82660	2020-07-07 18:50:24 -07:00
Craig Topper	d92bf71a07	Revert "[X86] Merge the FEATURE_64BIT and FEATURE_EM64T bits in X86TargetParser.def." An accidental change snuck in here This reverts commit `f1d290d812`.	2020-07-07 18:20:07 -07:00
Craig Topper	f1d290d812	[X86] Merge the FEATURE_64BIT and FEATURE_EM64T bits in X86TargetParser.def. These represent the same thing but 64BIT only showed up from getHostCPUFeatures providing a list of features to clang. While EM64T showed up from getting the features for a named CPU. EM64T didn't have a string specifically so it would not be passed up to clang when getting features for a named CPU. While 64bit needed a name since that's how it is indexed. Merge them by filtering 64bit out before sending features to clang for named CPUs.	2020-07-07 17:59:54 -07:00
Wouter van Oortmerssen	fd0964ae83	[WebAssembly] fix gcc 10 warning	2020-07-07 17:55:37 -07:00
Philip Reames	22596e7b2f	[Statepoint] Use early return to reduce nesting and clarify comments [NFC]	2020-07-07 16:19:05 -07:00
Philip Reames	9955876d74	[Statepoint] Reduce indentation and change a variable name [NFC]	2020-07-07 16:19:05 -07:00
Craig Topper	91f70675cc	[X86] Add back the assert in getImpliedFeatures that I removed in `ef4cc70f3e` I've added additional features to the table so I want to see if the bots are happier with this.	2020-07-07 15:20:59 -07:00
Florian Hahn	04b85e2bcb	Revert "[SLP] Make sure instructions are ordered when computing spill cost." This seems to break http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/24371 This reverts commit `eb46137daa`.	2020-07-07 23:15:01 +01:00
Eric Astor	bc8e262afe	[ms] [llvm-ml] Add initial MASM STRUCT/UNION support Summary: Add support for user-defined types to MasmParser, including initialization and field access. Known issues: - Omitted entry initializers (e.g., <,0>) do not work consistently for nested structs/arrays. - Size checking/inference for values with known types is not yet implemented. - Some ml64.exe syntaxes for accessing STRUCT fields are not recognized. - `[<register>.<struct name>].<field>` - `[<register>[<struct name>.<field>]]` - `(<struct name> PTR [<register>]).<field>` - `[<variable>.<struct name>].<field>` - `(<struct name> PTR <variable>).<field>` Reviewed By: thakis Differential Revision: https://reviews.llvm.org/D75306	2020-07-07 17:02:10 -04:00
Christopher Tetreault	021d56abb9	[SVE] Make Constant::getSplatValue work for scalable vector splats Summary: Make Constant::getSplatValue recognize scalable vector splats of the form created by ConstantVector::getSplat. Add unit test to verify that C == ConstantVector::getSplat(C)->getSplatValue() for fixed width and scalable vector splats Reviewers: efriedma, spatel, fpetrogalli, c-rhodes Reviewed By: efriedma Subscribers: sdesmalen, tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82416	2020-07-07 13:45:51 -07:00
Matt Arsenault	23157f3bdb	GlobalISel: Handle EVT argument lowering correctly handleAssignments was assuming every argument type is an MVT, and assignArg would always fail. This fixes one of the hacks in the current AMDGPU calling convention code that pre-processes the arguments.	2020-07-07 16:36:14 -04:00
Matt Arsenault	42bb481442	AMDGPU/GlobalISel: Fix skipping unused kernel arguments The tests in `a5b9ad7e9a` actually failed the verifier, which for some reason is not the default. Also add tests for 0-sized function arguments, which do not add entries to the expected register lists.	2020-07-07 16:36:13 -04:00
Philip Reames	b172cd7812	[Statepoint] Factor out logic for non-stack non-vreg lowering [almost NFC] This is inspired by D81648. The basic idea is to have the set of SDValues which are lowered as either constants or direct frame references explicit in one place, and to separate them clearly from the spilling logic. This is not NFC in that the handling of constants larger than 64 bits has changed. The old lowering would crash on values which could not be encoded as a sign extended 64 bit value. The new lowering just spills all constants > 64 bits. We could be consistent about doing the sext(Con64) optimization, but I happen to know that this code path is utterly unexercised in practice, so simple is better for now.	2020-07-07 13:34:28 -07:00
Zola Bridges	9d9e499840	[x86][seses] Add clang flag; Use lvi-cfi with seses This patch creates a clang flag to enable SESES. This flag also ensures that lvi-cfi is on when using seses via clang. SESES should use lvi-cfi to mitigate returns and indirect branches. The flag to enable the SESES functionality only without lvi-cfi is now -x86-seses-enable-without-lvi-cfi to warn users part of the mitigation is not enabled if they use this flag. This is useful in case folks want to see the cost of SESES separate from the LVI-CFI. Reviewed By: sconstab Differential Revision: https://reviews.llvm.org/D79910	2020-07-07 13:20:13 -07:00
Arthur Eubanks	2279380eab	[Inliner] Don't skip inlining alwaysinline in optnone functions Previously the NPM inliner would skip all potential inlines in an optnone function, but alwaysinline callees should be inlined regardless of optnone. Fixes inline-optnone.ll under NPM. Reviewed By: kazu Differential Revision: https://reviews.llvm.org/D83021	2020-07-07 12:54:55 -07:00
Nikita Popov	8691544a27	[SCCP] Use range metadata for loads and calls When all else fails, use range metadata to constrain the result of loads and calls. It should also be possible to use !nonnull, but that would require some general support for inequalities in SCCP first. Differential Revision: https://reviews.llvm.org/D83179	2020-07-07 21:09:21 +02:00
Stanislav Mekhanoshin	7c03872645	LIS: fix handleMove to properly extend main range handleMoveDown or handleMoveUp cannot properly repair a main range of a LiveInterval since they only get LiveRange. There is a problem if certain use has moved few segments away and there is a hole in the main range in between of these two locations. We may get a SubRange with a very extended Segment spanning several Segments of the main range and also spanning that hole. If that happens then we end up with the main range not covering its SubRange which is an error. It might be possible to attempt fixing the main range in place just between of the old and new index by extending all of its Segments in between, but it is unclear this logic will be faster than just straight constructMainRangeFromSubranges, which itself is pretty cheap since it only contains interval logic. That will also require shrinkToUses() call after which is probably even more expensive. In the test second move is from 64B to 92B for the sub1. Subrange is correctly fixed: L000000000000000C [16r,32B:0)[32B,92r:1) 0@16r 1@32B-phi But the main range has a hole in between 80d and 88r after updateRange(): %1 [16r,32B:0)[32B,80r:4)[80r,80d:3)[88r,96r:1)[96r,160B:2) Since source position is 64B this segment is not even considered by the updateRange(). Differential Revision: https://reviews.llvm.org/D82916	2020-07-07 11:52:32 -07:00
Nikita Popov	9dfea03517	[SCCP] Handle assume predicates Take assume predicates into account when visiting ssa.copy. The handling is the same as for branch predicates, with the difference that we're always on the true edge. Differential Revision: https://reviews.llvm.org/D83257	2020-07-07 20:22:52 +02:00
Simon Pilgrim	931ec74f7a	[X86][AVX] Don't fold PEXTR(VBROADCAST_LOAD(X)) -> LOAD(X). We were checking the VBROADCAST_LOAD element size against the extraction destination size instead of the extracted vector element size - PEXTRW/PEXTB have implicit zext'ing so have i32 destination sizes for v8i16/v16i8 vectors, resulting in us extracting from the wrong part of a load. This patch bails from the fold if the vector element sizes don't match, and we now use the target constant extraction code later on like the pre-AVX2 targets, fixing the test case. Found by internal fuzzing tests.	2020-07-07 19:10:03 +01:00
Zola Bridges	dfabffb195	[x86][lvi][seses] Use SESES at O0 for LVI mitigation Use SESES as the fallback at O0 where the optimized LVI pass isn't desired due to its effect on build times at O0. I updated the LVI tests since this changes the code gen for the tests touched in the parent revision. This is a follow up to the comments I made here: https://reviews.llvm.org/D80964 Hopefully we can continue the discussion here. Also updated SESES to handle LFENCE instructions properly instead of adding redundant LFENCEs. In particular, 1) no longer add LFENCE if the current instruction being processed is an LFENCE and 2) no longer add LFENCE if the instruction right before the instruction being processed is an LFENCE Reviewed By: sconstab Differential Revision: https://reviews.llvm.org/D82037	2020-07-07 11:05:09 -07:00
Thomas Lively	0d7286a652	[WebAssembly] Avoid scalarizing vector shifts in more cases Since WebAssembly's vector shift instructions take a scalar shift amount rather than a vector shift amount, we have to check in ISel that the vector shift amount is a splat. Previously, we were checking explicitly for splat BUILD_VECTOR nodes, but this change uses the standard utilities for detecting splat values that can handle more complex splat patterns. Since the C++ ISel lowering is now more general than the ISel patterns, this change also simplifies shift lowering by using the C++ lowering for all SIMD shifts rather than mixing C++ and normal pattern-based lowering. This change improves ISel for shifts to the point that the simd-shift-unroll.ll regression test no longer tests the code path it was originally meant to test. The bug corresponding to that regression test is no longer reproducible with its original reported reproducer, so rather than try to fix the regression test, this change just removes it. Differential Revision: https://reviews.llvm.org/D83278	2020-07-07 10:45:26 -07:00
Arthur Eubanks	1143f09678	[NewPM][LoopFusion] Rename loop-fuse -> loop-fusion The legacy pass name is "loop-fusion". Fixes most tests under Transforms/LoopFusion under NPM. Reviewed By: Whitney Differential Revision: https://reviews.llvm.org/D83066	2020-07-07 10:43:07 -07:00
Biplob Mishra	62ba48b45f	[PowerPC] Implement Vector Replace Builtins in LLVM Provide the LLVM intrinsics needed to implement vector replace element builtins in altivec.h which will be added in a subsequent patch. Differential Revision: https://reviews.llvm.org/D83308	2020-07-07 12:22:52 -05:00
Hans Wennborg	7fc279ca3d	[GlobalOpt] Don't remove inalloca from musttail-called functions Otherwise the verifier complains about the mismatching function ABIs. Differential revision: https://reviews.llvm.org/D83300	2020-07-07 19:02:46 +02:00
Sanjay Patel	642eed3713	[x86] fix miscompile in buildvector v16i8 lowering In the test based on PR46586: https://bugs.llvm.org/show_bug.cgi?id=46586 ...we are inserting 16-bits into the high element of the vector, shuffling it to element 0, and extracting 32-bits. But xmm1 was never initialized, so the top 16-bits of the extract are undef without this patch. (It seems like we could do better than this by recognizing that we only demand a subsection of the build vector, but I want to make sure we fix the miscompile 1st.) This path is only used for pre-SSE4.1, and simpler patterns get squashed somewhere along the way, so the test still includes a 'urem' as it did in the original test from the bug report. Differential Revision: https://reviews.llvm.org/D83319	2020-07-07 13:02:31 -04:00
Amy Huang	9ee90a4905	[NativeSession] Add column numbers to NativeLineNumber. Summary: This adds column numbers if they are present, and otherwise sets the column number to be zero. Bug: https://bugs.llvm.org/show_bug.cgi?id=41795 Reviewers: amccarth Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81950	2020-07-07 09:59:22 -07:00
SharmaRithik	082e395230	[CodeMoverUtils] Make specific analysis dependent checks optional Summary: This patch makes code motion checks optional which are dependent on specific analysis example, dominator tree, post dominator tree and dependence info. The aim is to make the adoption of CodeMoverUtils easier for clients that don't use analysis which were strictly required by CodeMoverUtils. This will also help in diversifying code motion checks using other analysis example MSSA. Authored By: RithikSharma Reviewer: Whitney, bmahjour, etiotto Reviewed By: Whitney Subscribers: Prazek, hiraditya, george.burgess.iv, asbirlea, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D82566	2020-07-07 20:11:07 +05:30
Roman Lebedev	16266e6396	[Scalarizer] When gathering scattered scalar, don't replace it with itself The (previously-crashing) test-case would cause us to seemingly-harmlessly replace some use with something else, but we can't replace it with itself, so we would crash.	2020-07-07 17:03:53 +03:00
Liu, Chen3	ea85ff82c8	[X86] Fix a bug when lowering byval argument When an argument has the 'byval' attribute and should be passed on the stack according to the calling convention, a stack copy would be emitted twice. This causes the real value to be put on the stack where the pointer should be passed. Differential Revision: https://reviews.llvm.org/D83175	2020-07-07 21:49:31 +08:00
Ayal Zaks	7bf299c8d8	[LV] Vectorize without versioning-for-unit-stride under -Os/-Oz If a loop is in a function marked OptSize, Loop Access Analysis should refrain from generating runtime checks for unit strides that will version the loop. If a loop is in a function marked OptSize and its vectorization is enabled, it should be vectorized w/o any versioning. Fixes PR46228. Differential Revision: https://reviews.llvm.org/D81345	2020-07-07 15:04:21 +03:00
Kerry McLaughlin	cdf2eef613	[SVE][CodeGen] Legalisation of unpredicated store instructions Summary: When splitting a store of a scalable type, the new address is calculated in SplitVecOp_STORE using a vscale and an add instruction. Reviewers: sdesmalen, efriedma, david-arm Reviewed By: david-arm Subscribers: tschuett, hiraditya, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D83041	2020-07-07 11:47:10 +01:00
Kerry McLaughlin	5e8084beba	[SVE][CodeGen] Legalisation of unpredicated load instructions Summary: When splitting a load of a scalable type, the new address is calculated in SplitVecRes_LOAD using a vscale and an add instruction. This patch also adds a DAG combiner fold to visitADD for vscale: - Fold (add (vscale(C0)), (vscale(C1))) to (add (vscale(C0 + C1))) Reviewers: sdesmalen, efriedma, david-arm Reviewed By: david-arm Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82792	2020-07-07 11:05:03 +01:00
Guillaume Chatelet	74c723757e	[NFC] Adding the align attribute on Atomic{CmpXchg|RMW}Inst This is the first step to add support for the align attribute to AtomicRMWInst and AtomicCmpXchgInst. Next step is to add support in IRBuilder and BitcodeReader. Bug: https://bugs.llvm.org/show_bug.cgi?id=27168 Differential Revision: https://reviews.llvm.org/D83136	2020-07-07 09:54:13 +00:00
David Sherwood	79d34a5a1b	[SVE][CodeGen] Fix bug when falling back to DAG ISel In an earlier commit `584d0d5c17` I added functionality to allow AArch64 CodeGen support for falling back to DAG ISel when Global ISel encounters scalable vector types. However, it seems that we were not falling back early enough as llvm::getLLTForType was still being invoked for scalable vector types. I've added a new fallback function to the call lowering class in order to catch this problem early enough, rather than wait for lowerFormalArguments to reject scalable vector types. Differential Revision: https://reviews.llvm.org/D82524	2020-07-07 09:23:04 +01:00
David Sherwood	c061e56e88	[CodeGen] Fix warnings in sve-vector-splat.ll and sve-trunc.ll This patch fixes all remaining warnings in: llvm/test/CodeGen/AArch64/sve-trunc.ll llvm/test/CodeGen/AArch64/sve-vector-splat.ll I hit some warnings related to getCopyPartsToVector. I fixed two issues: 1. In widenVectorToPartType() we assumed that we'd always be using BUILD_VECTOR nodes to expand from one vector type to another, which is incorrect for scalable vector types. I've fixed this for now by simply bailing out immediately for scalable vectors. 2. In getCopyToPartsVector() I've changed the code to compare the element counts of different types. Differential Revision: https://reviews.llvm.org/D83028	2020-07-07 09:21:47 +01:00
Craig Topper	44ea81acb6	[X86] Add 64bit and retpoline-external-thunk to list of features in X86TargetParser.def. '64bit' shows up from -march=native on 64-bit capable CPUs. 'retpoline-external-thunk' isn't a real feature but shows up when -mretpoline-external-thunk is passed to clang.	2020-07-07 00:57:04 -07:00
Craig Topper	ef4cc70f3e	[X86] Remove assert for missing features from X86::getImpliedFeatures This is failing on the bots. Remove while I try to figure out what feature I missed in the table.	2020-07-07 00:18:01 -07:00
Carl Ritson	560292fa99	[AMDGPU] Update isFMAFasterThanFMulAndFAdd assumptions MAD/MAC is no longer always available. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D83207	2020-07-07 15:40:44 +09:00
Craig Topper	16f3d698f2	[X86] Move the feature dependency handling in X86TargetInfo::setFeatureEnabledImpl to a table based lookup in X86TargetParser.cpp Previously we had to specify the forward and backwards feature dependencies separately which was error prone. And as dependencies have gotten more complex it was hard to be sure the transitive dependencies were handled correctly. The way it was written was also not super readable. This patch replaces everything with a table that lists what features a feature is dependent on directly. Then we can recursively walk through the table to find the transitive dependencies. This is largely based on how we handle subtarget features in the MC layer from the tablegen descriptions. Differential Revision: https://reviews.llvm.org/D83273	2020-07-06 23:14:02 -07:00
Craig Topper	7fb3a849c1	[X86] Remove duplicate SSE4A feature bit from X86TargetParser.def. NFC We had both SSE4A and SSE4_A. So remove one of them.	2020-07-06 22:11:51 -07:00
Nemanja Ivanovic	1b1539712e	[PowerPC] Do not RAUW combined nodes in VECTOR_SHUFFLE legalization When legalizing shuffles, we make an attempt to combine it into a PPC specific canonical form that avoids a need for a swap. If the combine is successful, we RAUW the node and the custom legalization replaces the now dead node instead of the one it should replace. Remove that erroneous call to RAUW.	2020-07-06 22:09:28 -05:00
Valentin Clement	65482e8a70	[openmp] Move isAllowedClauseForDirective to tablegen + add clause version to OMP.td Summary: Generate the isAllowedClauseForDirective function from tablegen. This patch introduces the VersionedClause in the tablegen file so that a clause can be encapsulated in this class to specify a range of validity on a directive. VersionedClause has default minVersion, maxVersion so it can be used without them or minVersion. Reviewers: jdoerfert, jdenny Reviewed By: jdenny Subscribers: yaxunl, hiraditya, guansong, jfb, sstefan1, aaron.ballman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82982	2020-07-06 22:20:06 -04:00
Xiang1 Zhang	939d8309db	[X86-64] Support Intel AMX Intrinsic INTEL ADVANCED MATRIX EXTENSIONS (AMX). AMX is a new programming paradigm. It has a set of 2-dimensional registers (TILES) representing sub-arrays from a larger 2-dimensional memory image and operates on TILES. These intrinsics use direct TMM register numbers as their params. Spec can be found in Chapter 3 here https://software.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D83111	2020-07-07 10:13:40 +08:00
Amy Kwan	c13e3e2c2e	[PowerPC][Power10] Exploit the xxsplti32dx instruction when lowering VECTOR_SHUFFLE. This patch aims to exploit the xxsplti32dx XT, IX, IMM32 instruction when lowering VECTOR_SHUFFLEs. We implement lowerToXXSPLTI32DX when lowering vector shuffles to check if: - Element size is 4 bytes - The RHS is a constant vector (and constant splat of 4-bytes) - The shuffle mask is a suitable mask for the XXSPLTI32DX instruction where it is one of the 32 masks: <0, 4-7, 2, 4-7> <4-7, 1, 4-7, 3> Differential Revision: https://reviews.llvm.org/D83245	2020-07-06 20:28:38 -05:00
Jordan Rupprecht	10c82eecbc	Revert "[LV] Enable the LoopVectorizer to create pointer inductions" This reverts commit `a8fe12065e`. It causes a crash when building gzip. Will post the detailed reduced test case to D81267.	2020-07-06 17:50:38 -07:00
Wolfgang Pieb	129387497e	Correct 3 spelling errors in headers and doc strings.	2020-07-06 17:27:51 -07:00
Sanjay Patel	ea71ba11ab	[DAGCombiner] reassociate reciprocal sqrt expression to eliminate FP division X / (fabs(A) * sqrt(Z)) --> X / sqrt(AAZ) --> X * rsqrt(AAZ) In the motivating case from PR46406: https://bugs.llvm.org/show_bug.cgi?id=46406 ...this is restoring the sequence that was originally in the source code. We extracted a term from within the sqrt because we do not know in instcombine whether a target will expand a sqrt call. Note: we could say that the transform in IR should be restricted, but that would not solve the problem if the source was originally in the pattern shown here. This is a gray area for fast-math-flag requirements. I think we should at least check fast-math-flags on the fdiv and fmul because I view this transform as 2 pieces: reassociate the fmul operands and form reciprocal from the fdiv (as with the existing transform). We could argue that the sqrt also needs FMF, but that was not required before, so we should change that in a follow-up patch if that seems better. We don't currently have a way to check that the target will produce a sqrt or recip estimate without actually creating nodes (the APIs are SDValue getSqrtEstimate() and SDValue getRecipEstimate()), so we clean up speculatively created nodes if we are not able to create an estimate. The x86 test with doubles verifies that we are not changing a test with no estimate sequence. Differential Revision: https://reviews.llvm.org/D82716	2020-07-06 19:12:21 -04:00
Yuanfang Chen	1e495e10e6	[NFC] change getLimitedCodeGenPipelineReason to static function	2020-07-06 15:39:27 -07:00
Roman Lebedev	69dca6efc6	[NFCI][IR] Introduce CallBase::Create() wrapper Summary: It is reasonably common to want to clone some call with different bundles. Let's actually provide an interface to do that. Reviewers: chandlerc, jdoerfert, dblaikie, nickdesaulniers Reviewed By: nickdesaulniers Subscribers: llvm-commits, hiraditya Tags: #llvm Differential Revision: https://reviews.llvm.org/D83248	2020-07-07 01:16:36 +03:00
Roman Lebedev	db05f2e34a	[Scalarizer] Centralize instruction DCE As reported in https://reviews.llvm.org/D83101#2133062 the new visitInsertElementInst()/visitExtractElementInst() functionality is causing miscompiles (previously-crashing test added). It is due to how the Scalarizer infra deals with DCE: it was neither updated nor ready for such scalar value forwarding. It always assumed that the moment we "scalarized" something, it can go away, and did so with prejudice. But that is no longer safe/okay to do. Instead, let's prevent it from ever shooting itself in the foot, and let's just accumulate the instructions-to-be-deleted in a vector, and collectively clean up (those that are actually dead) them all at the end. All existing tests are not reporting any new garbage leftovers, but maybe it's a test coverage issue.	2020-07-07 01:12:51 +03:00
Stanislav Mekhanoshin	f7a7efbf88	[AMDGPU] Tweak getTypeLegalizationCost() Even though wide vectors are legal they still cost more as we will have to eventually split them. Not all operations can be uniformly done on vector types. Conservatively add the cost of splitting at least to 8 dwords, which is our widest possible load. We are more or less lying to the cost model with this change but this can prevent the vectorizer from creating wide vectors, which results in RA problems for us. Differential Revision: https://reviews.llvm.org/D83078	2020-07-06 14:07:48 -07:00
Matt Arsenault	f25d020c2e	AMDGPU/GlobalISel: Add types to special inputs When passing special ABI inputs, we have no existing context for the type to use.	2020-07-06 17:00:55 -04:00
Nicolai Hähnle	dfcc68c528	DomTree: Remove getRoots() accessor Summary: Avoid exposing details about how roots are stored. This enables subsequent type-erasure changes. v5: - cleanup a unit test by using EXPECT_EQ instead of EXPECT_TRUE Change-Id: I532b774cc71f2224e543bc7d79131d97f63f093d Reviewers: arsenm, RKSimon, mehdi_amini, courbet Subscribers: jvesely, wdng, hiraditya, kuhar, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D83085	2020-07-06 21:58:11 +02:00
Nicolai Hähnle	76c5cb05a3	DomTree: Remove getChildren() accessor Summary: Avoid exposing details about how children are stored. This will enable subsequent type-erasure changes. New methods are introduced to cover common access patterns. Change-Id: Idb5f4b1b9c84e4cc71ddb39bb52a388682f5674f Reviewers: arsenm, RKSimon, mehdi_amini, courbet Subscribers: qcolombet, sdardis, wdng, hiraditya, jrtc27, zzheng, atanasyan, asbirlea, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D83083	2020-07-06 21:58:11 +02:00
Wouter van Oortmerssen	16d83c395a	[WebAssembly] Added 64-bit memory.grow/size/copy/fill This covers both the existing memory functions as well as the new bulk memory proposal. Added new test files since changes were also required in the inputs. Also removes unused init/drop intrinsics rather than trying to make them work for 64-bit. Differential Revision: https://reviews.llvm.org/D82821	2020-07-06 12:49:50 -07:00
Wouter van Oortmerssen	4d135b0446	[WebAssembly] 64-bit memory limits	2020-07-06 12:40:45 -07:00
Kazushi (Jam) Marukawa	fa1fecc73d	[VE] Support symbol with offset in assembly Summary: Change MCExpr to support Aurora VE's modifiers. Change asmparser to use existing MCExpr parser (parseExpression) to parse an expression containing symbols with modifiers and offsets. Also add several regression tests of MC layer. Reviewers: simoll, k-ishizaka Reviewed By: simoll Subscribers: hiraditya, llvm-commits Tags: #llvm, #ve Differential Revision: https://reviews.llvm.org/D83170	2020-07-07 04:16:51 +09:00
Kazushi (Jam) Marukawa	af8389e131	[VE] Change to use isa Summary: Change to use isa instead of dyn_cast to avoid a warning. Reviewers: simoll, k-ishizaka Reviewed By: simoll Subscribers: hiraditya, llvm-commits Tags: #llvm, #ve Differential Revision: https://reviews.llvm.org/D83200	2020-07-07 03:48:49 +09:00
Matt Arsenault	c19c153e74	AMDGPU: Don't ignore carry out user when expanding add_co_pseudo This was resulting in a missing vreg def in the use select instruction. The output of the pseudo doesn't make sense, since it really shouldn't have the vreg output in the first place, and instead an implicit scc def to match the real scalar behavior. We could have easier to understand tests if we selected scalar versions of the [us]{add|sub}.with.overflow intrinsics. This does still end up producing vector code in the end, since it gets moved later.	2020-07-06 14:28:01 -04:00
Luís Marques	61c2a0bb82	[RISCV] Fold ADDIs into load/stores with nonzero offsets We can often fold an ADDI into the offset of load/store instructions: (load (addi base, off1), off2) -> (load base, off1+off2) (store val, (addi base, off1), off2) -> (store val, base, off1+off2) This is possible when the off1+off2 continues to fit the 12-bit immediate. We remove the previous restriction where we would never fold the ADDIs if the load/stores had nonzero offsets. We now do the fold if the resulting constant still fits a 12-bit immediate, or if off1 is a variable's address and we know based on that variable's alignment that off1+off2 won't overflow. Differential Revision: https://reviews.llvm.org/D79690	2020-07-06 17:32:57 +01:00
jasonliu	6d3ae365bd	[XCOFF][AIX] Give symbol an internal name when desired symbol name contains invalid character(s) Summary: When a desired symbol name contains invalid character that the system assembler could not process, we need to emit .rename directive in assembly path in order for that desired symbol name to appear in the symbol table. Reviewed By: hubert.reinterpretcast, DiggerLin, daltenty, Xiangling_L Differential Revision: https://reviews.llvm.org/D82481	2020-07-06 15:49:15 +00:00
Oliver Stannard	e80b81d1cb	[Support] Fix formatted_raw_ostream for UTF-8 * The getLine and getColumn functions need to update the position, or they will return stale data for buffered streams. This fixes a bug in the clang -analyzer-checker-option-help option, which was not wrapping the help text correctly when stdout is not a TTY. * If the stream contains multi-byte UTF-8 sequences, then the whole sequence needs to be considered to be a single character. This has the edge case that the buffer might fill up and be flushed part way through a character. * If the stream contains East Asian wide characters, these will be rendered twice as wide as other characters, so we need to increase the column count to match. This doesn't attempt to handle everything unicode can do (combining characters, right-to-left markers, ...), but hopefully covers most things likely to be common in messages and source code we might want to print. Differential revision: https://reviews.llvm.org/D76291	2020-07-06 16:18:15 +01:00
Roman Lebedev	a2619a60e4	Reland "[ScalarEvolution] createSCEV(): recognize `udiv`/`urem` disguised as an `sdiv`/`srem`" This reverts commit `d3e3f36ff1`, which reverted the original commit `2c16100e6f`, but with polly tests now actually passing.	2020-07-06 18:00:22 +03:00
David Green	146dad0077	[ARM] MVE FP16 cost adjustments This adjusts the MVE fp16 cost model, similar to how we already do for integer casts. It uses the base cost of 1 per cvt for most fp extend / truncates, but adjusts it for loads and stores where we know that an extending load has been used to get the load into the correct lane, and only an MVE VCVTB is then needed. Differential Revision: https://reviews.llvm.org/D81813	2020-07-06 15:57:51 +01:00
Mikhail Goncharov	d3e3f36ff1	Revert "[ScalarEvolution] createSCEV(): recognize `udiv`/`urem` disguised as an `sdiv`/`srem`" Summary: This reverts commit `2c16100e6f`. ninja check-polly fails: Polly :: Isl/CodeGen/MemAccess/generate-all.ll Polly :: ScopInfo/multidim_srem.ll Reviewers: kadircet, bollu Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D83230	2020-07-06 16:41:59 +02:00
Florian Hahn	cff5739157	[LV] Pass dbgs() to verifyFunction call. This is done in other places of the pass already and improves the output on verification failure.	2020-07-06 15:09:20 +01:00
David Green	afdb2ef2ed	[ARM] Adjust default fp extend and trunc costs This adds some default costs for fp extends and truncates, generally costing them as 1 per lane. If the type is not legal then the cost will include a call to an __aeabi_ function. Some NEON code is also adjusted to make sure it applies to the expected types, now that fp16 is a more common thing. Differential Revision: https://reviews.llvm.org/D82458	2020-07-06 14:23:17 +01:00
Matt Arsenault	521ebc1681	GlobalISel: Move finalizeLowering call later This matches the DAG behavior where this is called after the loop checking for calls. The AMDGPU implementation depends on knowing if there are calls in the function or not, so move this later. Another problem is finalizeLowering is actually called twice; I was seeing weird inconsistencies since the first call would produce unexpected results and the second run would correct them in some contexts. Since this requires disabling the verifier, and it's useful to serialize the MIR immediately after selection, FinalizeISel should probably not be a real pass.	2020-07-06 09:19:40 -04:00
Matt Arsenault	a5b9ad7e9a	AMDGPU/GlobalISel: Don't emit code for unused kernel arguments	2020-07-06 09:04:06 -04:00
Matt Arsenault	7b76a5c8a2	AMDGPU: Fix fixed ABI SGPR arguments The default constructor wasn't setting isSet on the ArgDescriptor, so while these had the value set, they were treated as missing. This only ended up mattering in the indirect call case (and for regular calls in GlobalISel, which currently doesn't have a way to support the variable ABI).	2020-07-06 09:01:18 -04:00
Esme-Yi	0607c8df7f	[PowerPC] Legalize SREM/UREM directly on P9. Summary: As Bugzilla-35090 reported, the rationale for using custom lowering SREM/UREM should no longer be true. At the IR level, the div-rem-pairs pass performs the transformation where the remainder is computed from the result of the division when both are required. We should now be able to lower these directly on P9. And the pass also fixed the problem that divide is in a different block than the remainder. This is a patch to remove redundant code and make SREM/UREM legal directly on P9. Reviewed By: lkail Differential Revision: https://reviews.llvm.org/D82145	2020-07-06 11:47:31 +00:00
Jay Foad	babbeafa00	[TargetLowering] Improve expansion of FSHL/FSHR by non-zero amount Use a simpler code sequence when the shift amount is known not to be zero modulo the bit width. Nothing much uses this until D77152 changes the translation of fshl and fshr intrinsics. Differential Revision: https://reviews.llvm.org/D82540	2020-07-06 12:07:14 +01:00
Jay Foad	e7a4a24dc5	[TargetLowering] Improve expansion of ROTL/ROTR Using a negation instead of a subtraction from a constant can save an instruction on some targets. Nothing much uses this until D77152 changes the translation of fshl and fshr intrinsics. Differential Revision: https://reviews.llvm.org/D82539	2020-07-06 12:07:14 +01:00
Sam McCall	d7ea6ce809	[Support] fix user_cache_directory on mac	2020-07-06 12:54:11 +02:00
Kai Nacke	bfd84b1c03	[SystemZ/ZOS] Implement getMainExecutable() and is_local_impl() Adds implementation of getMainExecutable() and is_local_impl() to Support/Unix/Path.inc. Both are needed to compile LLVM for z/OS. Reviewed By: hubert.reinterpretcast, emaste Differential Revision: https://reviews.llvm.org/D82544	2020-07-06 06:48:16 -04:00
Roman Lebedev	5d7afe2d2e	[Scalarizer] visit{Insert,Extract}ElementInst(): avoid call arg evaluation order deps Compilers may evaluate call arguments in different order, which would result in different order of IR, which would break the tests. Spotted thanks to Dmitri Gribenko!	2020-07-06 13:42:35 +03:00
David Green	60b8b2beea	[ARM] Add extra extend and trunc costs for cast instructions This expands the existing extend costs with a few extras for larger types than legal, which will usually be split under MVE. It also adds trunc support for the same thing. These should not have a large effect on many things, but makes the costs explicit and keeps a certain balance between the truncs and extends. Differential Revision: https://reviews.llvm.org/D82457	2020-07-06 11:33:05 +01:00
Sam McCall	cd209f1a37	[Support] Add path::user_config_directory for $XDG_CONFIG_HOME etc Reviewers: hokein Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D83128	2020-07-06 12:20:55 +02:00
Roman Lebedev	51f9310ff2	[Scalarizer] ExtractElement handling w/ variable insert index (PR46524) Summary: Similar to D82961. Reviewers: bjope, cameron.mcinally, arsenm, jdoerfert Reviewed By: jdoerfert Subscribers: arphaman, wdng, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82970	2020-07-06 13:19:33 +03:00
Roman Lebedev	6e50474581	[Scalarizer] InsertElement handling w/ variable insert index (PR46524) Summary: I'm interested in taking the original C++ input, for which we currently are stuck with an alloca and producing roughly the lower IR, with neither an alloca nor a vector ops: https://godbolt.org/z/cRRWaJ For that, as intermediate step, I'd like to somehow perform scalarization. As per @arsenmn suggestion, I'm trying to see if scalarizer can help me avoid writing a bicycle. I'm not sure if it's really intentional that variable insert is not handled currently. If it really is, and is supposed to stay that way (?), I guess I could guard it.. See [[ https://bugs.llvm.org/show_bug.cgi?id=46524 | PR46524 ]]. Reviewers: bjope, cameron.mcinally, arsenm, jdoerfert Reviewed By: jdoerfert Subscribers: arphaman, uabelho, wdng, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82961	2020-07-06 13:19:32 +03:00

136604 Commits