llvm-project

Commit Graph

Author	SHA1	Message	Date
Fraser Cormack	75017db08c	[RISCV] Add tests for commuted vector/scalar VP patterns This patch adds a variety of tests checking that we can match vector/scalar instructions against masked VP intrinsics when the splat is on the LHS. At this stage, we can't, despite us having ostensibly-commutable ISel patterns for them. The use of V0 as the mask operand interferes with the auto-generated ISel table.	2022-01-20 17:10:09 +00:00
Matt Arsenault	be7e938e27	AMDGPU/GlobalISel: Stop handling llvm.amdgcn.buffer.atomic.fadd This code is not structured to handle the legacy buffer intrinsics and was miscompiling them.	2022-01-20 12:12:06 -05:00
Matt Arsenault	8ff3c9e0be	AMDGPU/GlobalISel: Fix selection of gfx90a FP atomics The struct/raw forms for the buffer atomics now work as expected. However, we're incorrectly handling the legacy form (which we probably shouldn't handle at all). We also are not diagnosing the use of the return value on gfx908. These will be addressed separately.	2022-01-20 12:12:06 -05:00
Matt Arsenault	89c447e4e6	AMDGPU: Stop reserving 36-bytes before kernel arguments for amdpal This was inheriting the mesa behavior, and as far as I know nobody is using opencl kernels with amdpal. The isMesaKernel check was irrelevant because this property needs to be held for all functions.	2022-01-20 12:12:05 -05:00
Random06457	ee198df2e1	[mips] Improve vr4300 mulmul bugfix pass When compiling with dwarf info, the mfix4300 flag introduced in https://reviews.llvm.org/D116238 can miss some occurrences of the vr4300 mulmul bug if a debug instruction happens to be between two `muls` instructions. This change skips debug instructions in order to fix the mulmul bug detection. Fixes https://github.com/llvm/llvm-project/issues/53094 Differential Revision: https://reviews.llvm.org/D117615	2022-01-20 20:10:04 +03:00
Lucas Prates	283f5a198a	[GlobalISel] Fix incorrect sign extension when combining G_INTTOPTR and G_PTR_ADD The GlobalISel combiner currently uses sign extension when manipulating the LHS constant when combining the following sequence of machine instructions into a single constant: ``` %0:_(s32) = G_CONSTANT i32 <CONSTANT> %1:_(p0) = G_INTTOPTR %0:_(s32) %2:_(s64) = G_CONSTANT i64 <CONSTANT> %3:_(p0) = G_PTR_ADD %1:_, %2:_(s64) ``` This causes an issue when the bit width of the first constant and the target pointer size are different, as G_INTTOPTR has no sign extension semantics. This patch fixes this by capturing an arbitrary-precision integer when matching the constant, allowing the matching function to correctly zero extend it. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D116941	2022-01-20 17:02:52 +00:00
Sjoerd Meijer	fabf1de132	[FuncSpec] Add a reference, and some other clarifying comments. NFC.	2022-01-20 17:01:08 +00:00
Philip Reames	c104fca36b	[SLP] Delete dead code in favor of proper assert [NFC]	2022-01-20 08:54:12 -08:00
Philip Reames	c43ebae838	[SLP] Reduce nesting depth in calculateDependencies via for loop and early continue [NFC]	2022-01-20 08:46:44 -08:00
Sander de Smalen	990bab89ff	[ScalableVectors] Warn instead of error for invalid size requests. This was intended to be fixed by D98856, but that only seemed to have the desired behaviour when compiling to assembly using `-S`, not when compiling into an object file or executable. Given that this was not the intention of D98856, this patch fixes the behaviour.	2022-01-20 16:42:08 +00:00
Adrian Prantl	c0957bd617	Add missing include to fix modular build	2022-01-20 08:35:33 -08:00
Adrian Prantl	54ba376d08	Add missing include to fix modular build	2022-01-20 08:33:44 -08:00
Philip Reames	3c422cbe6b	[SLP] Add an assert to make a non-obvious precondition clear [NFC]	2022-01-20 08:24:10 -08:00
Michael Kruse	616f77172f	[OpenMPIRBuilder] Detect and fix ambiguous InsertPoints for createParallel. When a Builder method accepts multiple InsertPoints and both point to the same position, inserting instructions at one position will "move" the other after the inserted position, since the InsertPoint is pegged to the instruction following the intended InsertPoint. For instance, when creating a parallel region at Loc and passing the same position as AllocaIP, creating instructions at Loc will "move" the AllocaIP behind the Loc position. To avoid this ambiguity, add an assertion checking this condition and fix the unittests. In case of AllocaIP, an alternative solution could be to implicitly split BasicBlock at InsertPoint, using the first as AllocaIP, the second for inserting the instructions themselves. However, this solution is specific to AllocaIP since AllocaIP will always have to be first. Hence, this is an argument for generally handling ambiguous InsertPoints as an API usage error. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D117226	2022-01-20 10:13:44 -06:00
Nico Weber	91eca967b9	[gn build] (manually) port `f29256a64a`	2022-01-20 11:02:06 -05:00
Nikita Popov	805bc24868	[InstSimplify] Add test for load of non-integral pointer (NFC)	2022-01-20 16:50:05 +01:00
Mubashar Ahmad	3da69fb5a2	[Clang][AArch64][ARM] Unaligned Access Test Fix Test fixed for the unaligned access warning.	2022-01-20 15:29:53 +00:00
Nikita Popov	0f283de9d1	[InstSimplify] Add test for reinterpret load of pointer type (NFC)	2022-01-20 16:25:54 +01:00
Simon Pilgrim	866311e71c	[X86] lowerToAddSubOrFMAddSub - lower 512-bit ADDSUB patterns to blend(fsub,fadd) AVX512 doesn't provide a ADDSUB instruction, but if we've built this from a build vector of scalar fsub/fadd elements we can still lower to blend(fsub,fadd)	2022-01-20 15:16:05 +00:00
Mircea Trofin	f29256a64a	[MLGO] Improved support for AOT cross-targeting scenarios The tensorflow AOT compiler can cross-target, but it can't run on (for example) arm64. We added earlier support where the AOT-ed header and object would be built on a separate builder and then passed at build time to a build host where the AOT compiler can't run, but clang can be otherwise built. To simplify such scenarios given we now support more than one AOT-able case (regalloc and inliner), we make the AOT scenario centered on whether files are generated, case by case (this includes the "passed from a different builder" scenario). This means we shouldn't need an 'umbrella' LLVM_HAVE_TF_AOT, in favor of case by case control. A builder can opt out of an AOT case by passing that case's model path as `none`. Note that the overrides still take precedence. This patch controls conditional compilation with case-specific flags, which can be enabled locally, for the component where those are available. We still keep an overall flag for some tests. The 'development/training' mode is unchanged, because there the model is passed from the command line and interpreted. Differential Revision: https://reviews.llvm.org/D117752	2022-01-20 07:05:39 -08:00
Nikita Popov	81d35f27dd	[DebugInstrRef] Memoize variable order during sorting (NFC) Instead of constructing DebugVariables and looking up the order in the comparison function, compute the order upfront and then sort a vector of (order, instr). This improves compile-time by -0.4% geomean on CTMark ReleaseLTO-g. Differential Revision: https://reviews.llvm.org/D117575	2022-01-20 16:04:24 +01:00
Simon Pilgrim	4130357f96	[X86] Fix v16f32 ADDSUB test This was supposed to ensure we're not generating 512-bit ADDSUB nodes, but cut+paste typos meant we weren't generating a full 512-bit pattern	2022-01-20 14:58:36 +00:00
eopXD	14c5fd920b	[Clang][RISCV] Change TARGET_BUILTIN to require zve32x for vector instruction According to v-spec v1.0, `zve-32x` is the new minimum extension to include to have vector instructions. Reviewed By: kito-cheng Differential Revision: https://reviews.llvm.org/D112613	2022-01-20 06:53:48 -08:00
Jan Svoboda	9011903e36	[llvm][vfs] Abstract in-memory node creation The creation of in-memory VFS nodes happens in a single function that deduces what kind of node to create from the arguments. This leads to complicated if-then-else logic that's difficult to cleanly extend. This patch abstracts away in-memory node creation via a type-erased factory function that's passed instead. Reviewed By: dexonsmith Differential Revision: https://reviews.llvm.org/D117648	2022-01-20 15:48:02 +01:00
Jan Svoboda	9e24d14ac8	[llvm][vfs] NFC: Virtualize in-memory `getStatus` This patch virtualizes the `getStatus` function on `InMemoryNode` in LLVM VFS. Currently, this is implemented via top-level function `getNodeStatus` that tries to cast `InMemoryNode *` into each subtype. Virtual functions seem to be the simpler solution here. Reviewed By: dexonsmith Differential Revision: https://reviews.llvm.org/D117649	2022-01-20 15:48:02 +01:00
Stephan Herhut	6d45284618	[mlir][memref] Add better support for identity layouts in memref.collapse_shape canonicalizer When computing the new type of a collapse_shape operation, we need to at least take into account whether the type has an identity layout, in which case we can easily support dynamic strides. Otherwise, the canonicalizer creates invalid IR. Longer term, both the verifier and the canonicalizer need to be extended to support the general case. Differential Revision: https://reviews.llvm.org/D117772	2022-01-20 15:31:43 +01:00
Stanislav Gatev	c95cb4de1b	[clang][dataflow] Intersect ExprToLoc when joining environments This is part of the implementation of the dataflow analysis framework. See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev. Reviewed-by: xazax.hun Differential Revision: https://reviews.llvm.org/D117754	2022-01-20 14:30:17 +00:00
Valentin Clement	010a10b738	[flang][NFC] Remove extra braces Noticed during the upstreaming process.	2022-01-20 15:18:59 +01:00
Mubashar Ahmad	35737df4dc	[Clang][AArch64][ARM] Unaligned Access Warning Added Added warning for potential cases of unaligned access when option -mno-unaligned-access has been specified Differential Revision: https://reviews.llvm.org/D116221	2022-01-20 14:12:49 +00:00
Nikita Popov	60147c6034	[EarlyCSE] Regenerate test checks (NFC)	2022-01-20 14:49:26 +01:00
Florian Hahn	67aa314bce	[IRGen] Do not overwrite existing attributes in CGCall. When adding new attributes, existing attributes are dropped. While this appears to be a longstanding issue, this was highlighted by D105169 which dropped a lot of attributes due to adding the new noundef attribute. Ahmed Bougacha (@ab) tracked down the issue and provided the fix in CGCall.cpp. I bundled it up and updated the tests.	2022-01-20 13:45:19 +00:00
Simon Tatham	a4ac40e92f	[AArch64] Remove PRBAR0_ELn and PRLAR0_ELn sysregs. The Armv8-R.64 architecture defines numbered MPU region registers with indices 1-15, not 0-15. So there's no such register as PRBAR0_EL2 or PRLAR0_EL1 (for example). The encodings that they would occupy are used for the unnumbered PRBAR_ELn and PRLAR_ELn registers. Reviewed By: labrinea Differential Revision: https://reviews.llvm.org/D117755	2022-01-20 13:37:58 +00:00
Simon Tatham	19b9cd4eae	[MC] Add a disassembly test for Armv8-R sysregs. This is the counterpart to llvm/test/MC/AArch64/armv8r-sysreg.s, checking all the same encodings when fed to the disassembler.	2022-01-20 13:37:58 +00:00
Abinav Puthan Purayil	d8b690409d	[AMDGPU] Set MemoryVT for truncstores in tblgen. GlobalISelEmitter was skipping these patterns when its predicates were checked. This patch should allow us to select d16_hi stores in GlobalISel. Differential Revision: https://reviews.llvm.org/D117762	2022-01-20 19:05:12 +05:30
Marek Kurdej	69ecd2484f	[clang-format] Indicate source location on test failure. NFC.	2022-01-20 14:10:59 +01:00
Simon Pilgrim	304cfc706a	[X86] combineConcatVectorOps - remove superfluous Subtarget.hasAVX() check This function only ever gets called by AVX targets, and we already assert for this at the top of the function	2022-01-20 12:56:09 +00:00
Simon Pilgrim	c4f5fd76da	[X86] combineConcatVectorOps - add handling for X86ISD::VSHL/VSRL/VSRA These can be handled the same as the vector shift by immediate variants that are already handled.	2022-01-20 12:56:08 +00:00
Jay Foad	847bb26820	[AMDGPU] Regenerate some MIR checks	2022-01-20 12:41:40 +00:00
Valentin Clement	ccaaeca910	[flang][NFC] Move current inliner files in Dialect directory This patch just moves the files from the Transforms directory to the Dialect directory. Reviewed By: jeanPerier Differential Revision: https://reviews.llvm.org/D117661	2022-01-20 13:34:44 +01:00
Valentin Clement	911c137054	[flang][NFC] Cleanup dependent dialects and make def homogenous Remove unnecessary dependent dialect and make the definition of the pass more homogenous with the two others. This patch is part of the upstreaming effort from fir-dev branch. Reviewed By: jeanPerier Differential Revision: https://reviews.llvm.org/D117688	2022-01-20 13:33:56 +01:00
Fraser Cormack	ca36cc56ac	[RISCV] Match RVV VF variants also through masked operations This brings floating-point RVV vector/scalar support more in line with the integer vector patterns, which can already match '.vx' instructions with masked operations. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117697	2022-01-20 12:08:02 +00:00
Peter Waller	d4a6bf4d1a	Revert "[AArch64][SVE][VLS] Move extends into arguments of comparisons" This reverts commit `db04d3e30b`, which causes a buildbot failure.	2022-01-20 12:01:23 +00:00
Fraser Cormack	5a12024b95	[RISCV] Optimize lowering of floating-point -0.0 This idea has come up in several reviews -- D115978 and D105902 -- so I can't take any credit for the idea. Instead of using a constant pool to lower -0.0, we can emit a sequence of two instructions: fmv.[hwd].x freg, zero fsgnjn.[hsd] freg, freg, freg This is only done when the floating-point type is legal. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117687	2022-01-20 11:46:28 +00:00
Prashant Kumar	770353cd94	[MLIR] The return type in the `computeSingleVarRepr` function is modified to include equality expressions. Earlier `computeSingleVarRepr` was returning a pair of upper bound and lower bound indices of the inequality constraints that can be expressed as a floordiv of an affine function. The equality expression can also be expressed as a floordiv but contains only one index and hence the `LocalRepr` class is introduced to facilitate this. Reviewed By: Groverkss Differential Revision: https://reviews.llvm.org/D117430	2022-01-20 16:40:58 +05:30
David Spickett	787f91b0bb	[lldb] Remove non-address bits from addresses given to memory tag commands Although the memory tag commands use a memory tag manager to handle addresses, that only removes the top byte. That top byte is 4 bits of memory tag and 4 free bits, which is more than it should strictly remove but that's how it is for now. There are other non-address bit uses like pointer authentication. To ensure the memory tag manager only has to deal with memory tags, use the ABI plugin to remove the rest. The tag access test has been updated to sign all the relevant pointers and require that we're running on a system with pointer authentication in addition to memory tagging. The pointers will look like: <4 bit user tag><4 bit memory tag><signature><bit virtual address> Note that there is currently no API for reading memory tags. It will also have to consider this when it arrives. Reviewed By: omjavaid Differential Revision: https://reviews.llvm.org/D117672	2022-01-20 10:48:14 +00:00
David Spickett	585abe3ba5	[lldb] Rename MemoryTagManager RemoveNonAddressBits to RemoveTagBits This better describes the intent of the method. Which for AArch64 is removing the top byte which includes the memory tags. It does not include pointer signatures, for those we need to use the ABI plugin. The rename makes this a little more clear. It's a bit awkward that the memory tag manager is removing the whole top byte not just the memory tags but it's an improvement for now. Reviewed By: omjavaid Differential Revision: https://reviews.llvm.org/D117671	2022-01-20 10:47:05 +00:00
Florian Hahn	782c0dd1a1	[IRBuilder] Migrate and-folding to value-based FoldAnd. Similar to the migration of or-folding to FoldOr, there are a few cases where the fold in IRBuilder::CreateAnd triggered directly. Those have been updated. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D117431	2022-01-20 10:22:21 +00:00
Valentin Clement	90efbe697a	[flang][NFC] Fix header guard and comment	2022-01-20 10:56:23 +01:00
Casey Carter	67d483aba2	[libcxx][test] Use TEST_HAS_BUILTIN in test code ... rather than using `__has_builtin` directly. This both (1) allows a compiler that doesn't speak `__has_builtin` to work around it with preprocessor magic, and (2) avoids diagnostics about things that look like function-like macros after `#if` but are not.	2022-01-20 01:47:29 -08:00
eopXD	60b6e73769	[RISCV] Imply extensions in RISCVTargetInfo::initFeatureMap Under ASTContext, clang only copies the features from the options with Target->initFeatureMap, and no implication is done there. This makes clang_cc1 fail to imply `zve32x` for the vector extension, and test cases will have to add ` -target-feature +experimental-zve32x` in order to work. This patch fixes it. Reviewed By: kito-cheng Differential Revision: https://reviews.llvm.org/D113336	2022-01-20 01:47:10 -08:00
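
The G_INTTOPTR / G_PTR_ADD fix in 283f5a198a turns on the difference between sign-extending and zero-extending a narrow constant before adding an offset. The following is only an illustrative sketch in Python, not the combiner's code: the helper names (`zero_extend`, `sign_extend`, `fold_ptr_add`) and the sample constants are hypothetical and chosen to show why the two extensions disagree when a 32-bit constant has its top bit set and pointers are 64 bits wide.

```python
def zero_extend(value: int, from_bits: int) -> int:
    """Treat the low `from_bits` bits of `value` as an unsigned integer."""
    return value & ((1 << from_bits) - 1)


def sign_extend(value: int, from_bits: int) -> int:
    """Treat the low `from_bits` bits of `value` as a two's-complement integer."""
    value &= (1 << from_bits) - 1
    sign_bit = 1 << (from_bits - 1)
    return (value ^ sign_bit) - sign_bit


def fold_ptr_add(base: int, base_bits: int, offset: int,
                 ptr_bits: int, signed_base: bool) -> int:
    """Fold inttoptr(base) + offset into one pointer-sized constant.

    `signed_base` selects how the narrow base constant is widened
    (hypothetical switch, only here to compare the two behaviours).
    """
    extend = sign_extend if signed_base else zero_extend
    widened = extend(base, base_bits)
    # Wrap the sum to the pointer width, as a fixed-width add would.
    return (widened + offset) & ((1 << ptr_bits) - 1)


# 32-bit constant with the top bit set, 64-bit pointers.
with_sign_ext = fold_ptr_add(0x80000000, 32, 0x10, 64, signed_base=True)
with_zero_ext = fold_ptr_add(0x80000000, 32, 0x10, 64, signed_base=False)

print(hex(with_sign_ext))
print(hex(with_zero_ext))
```

With sign extension the upper 32 bits of the folded constant are filled with ones, whereas zero extension leaves them clear, which matches the commit's statement that G_INTTOPTR has no sign extension semantics.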

411803 Commits

All Branches