llvm-project

Commit Graph

Author	SHA1	Message	Date
Stanislav Mekhanoshin	4617cc68f6	[AMDGPU] Fix expansion of 192 bit spills in PEI Differential Revision: https://reviews.llvm.org/D92979	2020-12-09 16:36:29 -08:00
Krzysztof Parzyszek	e3b2828b9d	[Hexagon] Silence warnings about unused objects	2020-12-09 17:54:10 -06:00
Krzysztof Parzyszek	43d1c7a564	[Hexagon] Fix build: move template specialization into namespace scope	2020-12-09 17:40:15 -06:00
Krzysztof Parzyszek	f5d07a05bb	[Hexagon] Realign HVX vectors wherever possible Introduce HexagonVectorCombine as a helper class for vector-related optimizations.	2020-12-09 17:11:25 -06:00
Saleem Abdulrasool	ee74d1b420	X86: use a data driven configuration of Windows x86 libcalls (NFC) Rather than creating a series of associated calls and ensuring that everything is lined up, use a table driven approach that ensures that the two always stay in sync.	2020-12-09 22:49:11 +00:00
Scott Linder	9260a99999	[MC][AMDGPU] Consume EndOfStatement in asm parser Avoids spurious newlines showing up in the output when emitting assembly via MC. Reviewed By: MaskRay, arsenm Differential Revision: https://reviews.llvm.org/D92690	2020-12-09 21:45:55 +00:00
Craig Topper	5ff5cf8e05	[X86] Use APInt::isSignedIntN instead of isIntN for 64-bit ANDs in X86DAGToDAGISel::IsProfitableToFold Pretty sure we meant to be checking signed 32 immediates here rather than unsigned 32 bit. I suspect I messed this up because in MathExtras.h we have isIntN and isUIntN so isIntN differs in signedness depending on whether you're using APInt or plain integers. This fixes a case where we didn't fold a constant created by shrinkAndImmediate. Since shrinkAndImmediate doesn't topologically sort constants it creates, we can fail to convert the Constant to a TargetConstant. This leads to very strange behavior later. Fixes PR48458.	2020-12-09 13:39:07 -08:00
Scott Linder	f5f4b8b60f	[AMDGPU][MC] Restore old error position for "too few operands" Revert part of https://reviews.llvm.org/D92084 to make it simpler to start consuming the EndOfStatement token within AMDGPU's ParseInstruction in a future patch. This also brings us back to what every other target currently does. A future change to move the position back to the end of the statement would likely need to audit all of the AMDGPUOperand SMLoc ranges, and determine the SMLoc for the last character of the last operand. Reviewed By: dp Differential Revision: https://reviews.llvm.org/D92960	2020-12-09 21:09:47 +00:00
Florian Hahn	77fd12a66e	[AArch64] Add aarch64_neon_vcmla{_rot{90,180,270}} intrinsics. Add builtins required to implement vcmla and rotated variants from the ACLE Reviewed By: t.p.northover Differential Revision: https://reviews.llvm.org/D92929	2020-12-09 19:46:49 +00:00
Kazushi (Jam) Marukawa	1a2147fead	[VE] Add vsum and vfsum intrinsic instructions Add vsum and vfsum intrinsic instructions and regression tests. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D92938	2020-12-10 01:11:53 +09:00
Kazushi (Jam) Marukawa	398f29fbb0	[VE] Add vfmk intrinsic instructions Add vfmk intrinsic instructions, a few pseudo instructions to expand vfmk intrinsic using VM512 correctly, and regression tests. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D92758	2020-12-10 00:08:20 +09:00
Simon Pilgrim	24184dbb82	[X86] Fold CONCAT(VPERMV3(X,Y,M0),VPERMV3(Z,W,M1)) -> VPERMV3(CONCAT(X,Z),CONCAT(Y,W),CONCAT(M0,M1)) Further prep work toward supporting different subvector sizes in combineX86ShufflesRecursively	2020-12-09 14:29:32 +00:00
Kerry McLaughlin	05edfc5475	[SVE][CodeGen] Add DAG combines for s/zext_masked_gather This patch adds the following DAGCombines, which apply if isVectorLoadExtDesirable() returns true: - fold (and (masked_gather x)) -> (zext_masked_gather x) - fold (sext_inreg (masked_gather x)) -> (sext_masked_gather x) LowerMGATHER has also been updated to fetch the LoadExtType associated with the gather and also use this value to determine the correct masked gather opcode to use. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D92230	2020-12-09 11:53:19 +00:00
Kerry McLaughlin	4519ff4b6f	[SVE][CodeGen] Add the ExtensionType flag to MGATHER Adds the ExtensionType flag, which reflects the LoadExtType of a MaskedGatherSDNode. Also updated SelectionDAGDumper::print_details so that details of the gather load (is signed, is scaled & extension type) are printed. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D91084	2020-12-09 11:19:08 +00:00
Tim Northover	45de42116e	AArch64: use correct operand for ubsantrap immediate. I accidentally pushed the wrong patch originally.	2020-12-09 10:17:16 +00:00
Fraser Cormack	af5fd65895	[RISCV] Fix missing def operand when creating VSETVLI pseudos The register operand was not being marked as a def when it should be. No tests for this in the main branch as there are not yet any pseudos without a non-negative VLIndex. Also change the type of a virtual register operand from unsigned to Register and adjust formatting. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D92823	2020-12-09 09:35:28 +00:00
Siddhesh Poyarekar	0ef0de65f1	Fix typo in llvm/lib/Target/README.txt Trivial typo, replace __builtin_objectsize with __builtin_object_size. Differential Revision: https://reviews.llvm.org/D92914	2020-12-09 10:12:26 +01:00
David Green	384383e15c	[ARM] Common inverse constant predicates to VPNOT This scans through blocks looking for constants used as predicates in MVE instructions. When two constants are found which are the inverse of one another, the second can be replaced by a VPNOT of the first, potentially allowing that not to be folded away into an else predicate of a vpt block. Differential Revision: https://reviews.llvm.org/D92470	2020-12-09 07:56:44 +00:00
Craig Topper	aaa925795f	[RISCV] Use SDLoc created early in RISCVDAGToDAGISel::Select instead of recreating it in multiple cases in the switch. NFC	2020-12-08 21:13:25 -08:00
Craig Topper	846f576bea	[RISCV] Add a table showing the layout of the fields in VTYPE. Rename MaskedOffAgnostic->MaskAgnostic. NFC	2020-12-08 20:41:57 -08:00
Jinsong Ji	45b08c41bf	[PowerPC] Set SubRegIndex offset for sub_vsx1/sub_pair1 We defined SubRegIndex for 256/512 regs, but we did not set the offset for the higher part, so the offsets of the lower and higher parts are the same. This may cause problems in assessing ranges of SubReg. It is great that this hasn't affected any testcases, but I think we should fix it to avoid hidden bugs in the future. Reviewed By: bsaleil, #powerpc Differential Revision: https://reviews.llvm.org/D92864	2020-12-08 22:56:44 -05:00
Sam Clegg	1b6d879ec1	[WebAssembly] Fix code generated for atomic operations in PIC mode The main thing this change does is add the `IsNotPIC` predicate to all the atomic instruction patterns that directly refer to `tglobaladdr`. This is because in PIC mode we need to generate a separate instruction sequence (either a direct global.get, or __memory_base + offset) for accessing global addresses. As part of this change I noticed that many of the `Requires` attributes added to the instructions in `WebAssemblyInstrAtomics.td` were not being honored. This is because they are wrapped in a `let Predicates = [HasAtomics]` block and it seems that that outer wrapping overrides any `Requires` on defs within it. As a workaround I removed the outer `let` and added `HasAtomics` to all the inner `Requires`. I believe that all the instructions that don't have an explicit `Requires` bottom out in `ATOMIC_I` and `ATOMIC_NRI` which have `HasAtomics` so this should not remove this predicate from any patterns (at least that is the idea). The alternative to this approach looks like implementing something like `PredicateControl` in `Mips.td` where we can split the predicates into groups so they don't clobber each other. Differential Revision: https://reviews.llvm.org/D92744	2020-12-08 18:41:32 -08:00
Chen Zheng	66a03d1022	[PowerPC] prepare more dq form for P10 pair load/store Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D92393	2020-12-08 21:01:40 -05:00
Craig Topper	a64998be99	[RISCV] Share VTYPE encoding code between the assembler and the CustomInserter for adding VSETVLI before vector instructions This merges the SEW and LMUL enums that each used into single enums in RISCVBaseInfo.h. The patch also adds a new encoding helper to take SEW, LMUL, tail agnostic, mask agnostic and turn it into a vtype immediate. I also stopped storing the Encoding in the VTYPE operand in the assembler. It is easy to calculate when adding the operand which should only happen once per instruction. Differential Revision: https://reviews.llvm.org/D92813	2020-12-08 16:04:20 -08:00
Jessica Paquette	40d1fb2229	[AArch64][GlobalISel] Swap select operands when inverting condition code This was not obvious when reading the imported tablegen patterns in AArch64GenDAGISel. Update select-select.mir.	2020-12-08 14:17:26 -08:00
Jessica Paquette	21308c2b4c	[AArch64][GlobalISel] Check if G_SELECT has been optimized when folding binops `TryFoldBinOpIntoSelect` didn't have a check for `Optimized`, meaning you could end up folding twice. (e.g. a select with a G_ADD on the true side, and a G_SUB on the false side) Add in the missing `if` and a test.	2020-12-08 13:47:08 -08:00
Kazushi (Jam) Marukawa	95ea50e4ad	[VE] Correct LVLGen (LVL instruction insert pass) SX Aurora VE uses an intermediate representation similar to VP as its MIR. VE itself uses an individual VL register as its own vector length register at the hardware level. So, LLVM needs to insert a load VL (LVL) instruction just before vector instructions if the value of VL is changed. This LVLGen pass generates LVL instructions for this purpose. Previously, a bug was pointed out in D91416. This patch corrects this bug and adds a regression test. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D92716	2020-12-09 06:33:53 +09:00
Florian Hahn	4c69b1b98a	[AArch64] Fix rottype use in complex instr defs. It seems like the order here is wrong. Types like i32 do not take any arguments. Currently this is not a problem, because the patterns are not actually used with any nodes, but will fail once it is used with real ISD nodes. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D91345	2020-12-08 21:11:33 +00:00
Harald van Dijk	29c8ea6f1a	[X86] Handle localdynamic TLS model in x32 mode D92346 added TLS_(base_)addrX32 to handle TLS in x32 mode, but missed the different TLS models. This diff fixes the logic for the local dynamic model where `RAX` was used when `EAX` should be, and extends the tests to cover all four TLS models. Fixes https://bugs.llvm.org/show_bug.cgi?id=26472. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D92737	2020-12-08 21:06:00 +00:00
Austin Kerbow	4aa842a800	[AMDGPU] Add new pseudos for indirect addressing with VGPR Indexing It is possible for copies or spills to be inserted in the middle of indirect addressing sequences which use VGPR indexing. Spills to accvgprs could be affected by the indexing mode. Add new pseudo instructions that are expanded after register allocation to avoid the problematic spill or copy placement. Differential Revision: https://reviews.llvm.org/D91048	2020-12-08 12:24:12 -08:00
Craig Topper	98bca0a605	[RISCV] Add isel patterns for SBCLRI/SBSETI/SBINVI(W) instruction We can use these instructions for single bit immediates that are too large for ANDI/ORI/CLRI. The _10 test cases are to make sure that we still use ANDI/ORI/CLRI for small immediates. Differential Revision: https://reviews.llvm.org/D92262	2020-12-08 12:22:40 -08:00
Craig Topper	fb5b611af9	[RISCV] Detect more errors when parsing vsetvli in the assembler -Reject an "mf1" lmul -Make sure tail agnostic is exactly "tu" or "ta" not just that it starts with "tu" or "ta" -Make sure mask agnostic is exactly "mu" or "ma" not just that it starts with "mu" or "ma" Differential Revision: https://reviews.llvm.org/D92805	2020-12-08 11:25:39 -08:00
Craig Topper	88e58939dc	[RISCV] When parsing vsetvli in the assembler, use StringRef::getAsInteger instead of APInt's string constructor APInt's string constructor asserts on error. Since this is the parser and we don't yet know if the string is a valid integer we shouldn't use that. Instead use StringRef::getAsInteger which returns a bool to indicate success or failure. Since we no longer need APInt, use 'unsigned' instead. Differential Revision: https://reviews.llvm.org/D92801	2020-12-08 11:25:39 -08:00
Jessica Paquette	5b5d3fa9d9	[AArch64][GlobalISel] Fold G_SELECT cc, %t, (G_ADD %x, 1) -> CSINC %t, %x, cc This implements ``` G_SELECT cc, %true, (G_ADD %x, 1) -> CSINC %true, %x, cc G_SELECT cc, (G_ADD %x, 1), %false -> CSINC %x, %false, inv_cc ``` Godbolt example: https://godbolt.org/z/eoPqKq Differential Revision: https://reviews.llvm.org/D92868	2020-12-08 10:53:37 -08:00
Jessica Paquette	cd9a52b99e	[AArch64][GlobalISel] Fold binops on the true side of G_SELECT This implements the following folds: ``` G_SELECT cc, (G_SUB 0, %x), %false -> CSNEG %x, %false, inv_cc G_SELECT cc, (G_XOR x, -1), %false -> CSINV %x, %false, inv_cc ``` This is similar to the folds introduced in `5bc0bd05e6`. In `5bc0bd05e6` I mentioned that we may prefer to do this in AArch64PostLegalizerLowering. I think that it's probably better to do this in the selector. The way we select G_SELECT depends on what register banks end up being assigned to it. If we did this in AArch64PostLegalizerLowering, then we'd end up checking every G_SELECT to see if it's worth swapping operands. Doing it in the selector allows us to restrict the optimization to only relevant G_SELECTs. Also fix up some comments in `TryFoldBinOpIntoSelect` which are kind of confusing IMO. Example IR: https://godbolt.org/z/3qPGca Differential Revision: https://reviews.llvm.org/D92860	2020-12-08 10:42:59 -08:00
Jessica Paquette	ce199667f6	[AArch64][GlobalISel] Don't explicitly write to the zero register in emitCMN This case was missed in `78ccb0359d`. Differential Revision: https://reviews.llvm.org/D92438	2020-12-08 10:42:05 -08:00
Craig Topper	3e86fbc971	[RISCV] Replace custom isel code for RISCVISD::READ_CYCLE_WIDE with isel pattern This node returns 2 results and uses a chain. As long as we use a DAG as part of the pseudo instruction definition where we can use the "set" operator, it looks like tablegen can handle using a pattern for this without a problem. I believe the original implementation was copied from PowerPC. This also fixes the pseudo instruction so that it is marked as having side effects to match the definition of CSRRS and the RV64 instruction. And we don't need to explicitly clear mayLoad/mayStore since those can be inferred now. Differential Revision: https://reviews.llvm.org/D92786	2020-12-08 10:23:37 -08:00
Huihui Zhang	8e6fc1f97e	[AArch64][SVE] Add lowering for llvm.maxnum|minnum for scalable type. The LLVM intrinsic llvm.maxnum|minnum is an overloaded intrinsic that can be used on any floating-point or vector of floating-point type. This patch extends the current infrastructure to support scalable vector types. This patch also fixes a warning message about incorrect use of EVT::getVectorNumElements() for scalable types, when DAGCombiner is trying to split a scalable vector. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D92607	2020-12-08 09:35:53 -08:00
Jessica Paquette	b15491eb33	[AArch64][GlobalISel] Select G_SADDO and G_SSUBO We didn't have selector support for these. Selection code is similar to `getAArch64XALUOOp` in AArch64ISelLowering. Similar to that code, this returns the AArch64CC and the instruction produced. In SDAG, this is used to optimize select + overflow and condition branch + overflow pairs. (See `AArch64TargetLowering::LowerBR_CC` and `AArch64TargetLowering::LowerSelect`) (G_USUBO should be easy to add here, but it isn't legalized right now.) This also factors out the existing G_UADDO selection code, and removes an unnecessary check for s32/s64. AFAIK, we shouldn't ever get anything other than s32/s64. It makes more sense for this to be handled by the type assertion in `emitAddSub`. Differential Revision: https://reviews.llvm.org/D92610	2020-12-08 09:18:28 -08:00
Stefan Pintilie	2812c15156	[PowerPC] Fix missing nop after call to weak callee. Weak functions can be replaced by other functions at link time. Previously it was assumed that no matter what the weak callee function was replaced with it would still share the same TOC as the caller. This is no longer true as a weak callee with a TOC setup can be replaced by another function that was compiled with PC Relative and does not have a TOC at all. This patch makes sure that all calls to functions defined as weak from a caller that has a valid TOC have a nop after the call to allow a place for the linker to restore the TOC. Reviewed By: NeHuang Differential Revision: https://reviews.llvm.org/D91983	2020-12-08 09:38:44 -06:00
David Green	03e675fd12	[ARM] Turn pred_cast(xor(x, -1)) into xor(pred_cast(x), -1) This folds a not (an xor -1) through a predicate_cast, so that it can be turned into a VPNOT and potentially be folded away as an else predicate inside a VPT block. Differential Revision: https://reviews.llvm.org/D92235	2020-12-08 15:22:46 +00:00
David Green	91fb9eac0b	[ARM] Remove dead instructions before creating VPT block bundles We remove VPNOT instructions in VPT blocks as we create them, turning them into else predicates. We don't remove the dead instructions until after the block has been created though. Because the VPNOT will have killed the vpr register it used, this makes finalizeBundle add internal flags to the vpr uses of any instructions after the VPNOT. These incorrect flags can then confuse what is alive and what is not, leading to machine verifier problems. This patch removes them earlier instead, before the bundle is finalized so that kill flags remain valid. Differential Revision: https://reviews.llvm.org/D92227	2020-12-08 14:05:07 +00:00
David Sherwood	59f17b57d9	[SVE] Fix crashes with inline assembly All the crashes found compiling inline assembly are fixed in this patch by changing AArch64TargetLowering::getRegForInlineAsmConstraint to be more resilient to mismatched value and register types. For example, it makes no sense to request a predicate register for a nxv2i64 type and so on. Tests have been added here: test/CodeGen/AArch64/inline-asm-constraints-bad-sve.ll Differential Revision: https://reviews.llvm.org/D92554	2020-12-08 13:48:43 +00:00
Tim Northover	c5978f42ec	UBSAN: emit distinctive traps Sometimes people get minimal crash reports after a UBSAN incident. This change tags each trap with an integer representing the kind of failure encountered, which can aid in tracking down the root cause of the problem.	2020-12-08 10:28:26 +00:00
Qiu Chaofan	5e85a2ba16	[PowerPC] Implement intrinsic for DARN instruction Instruction darn was introduced in ISA 3.0. It means 'Deliver A Random Number'. The immediate number L means: - L=0, the number is 32-bit (higher 32-bits are all-zero) - L=1, the number is 'conditioned' (processed by hardware to reduce bias) - L=2, the number is not conditioned, directly from noise source GCC implements them in three separate intrinsics: __builtin_darn, __builtin_darn_32 and __builtin_darn_raw. This patch implements the same intrinsics. And this change also addresses Bugzilla PR39800. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D92465	2020-12-08 14:08:52 +08:00
Esme-Yi	49599cb1a2	[PowerPC] Correct the bit-width definition for some imm operand in td. Summary: The imm operands of some instructions are not defined accurately in td. This is a small patch to correct these definitions. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D91603	2020-12-08 03:20:12 +00:00
Jessica Paquette	d49f6491b6	[AArch64][GlobalISel] Refactor G_BRCOND selection `selectCompareBranch` was hard to understand. Also, it was being needlessly pessimistic with the `ProduceNonFlagSettingCondBr` case. It assumed that everything in `selectCompareBranch` would emit a TB(N)Z or C(B)NZ. That's not true; the G_FCMP + G_BRCOND case would never emit those instructions, and the G_ICMP + G_BRCOND case was capable of emitting an integer compare + Bcc. - Refactor `selectCompareBranch` into separate functions based off of what is feeding the G_BRCOND's condition. - Move G_BRCOND selection code from `select` to `selectCompareBranch`. - Remove duplicated constraint code from the code originally in `select`; `emitTestBit` already handles that, so no need to constrain twice. - Factor out the G_FCMP + G_BRCOND case into `selectCompareBranchFedByFCmp`. - Split the G_ICMP + G_BRCOND case into an optimization function, `tryOptCompareBranchFedByICmp` and a general selection function, `selectCompareBranchFedByICmp`. - Reduce the number of things passed to `tryOptAndIntoCompareBranch`. - Improve documentation. - Give some variables more descriptive names. Other than improving the code generation for functions with speculative_load_hardening by getting the logic correct, this is NFC. Differential Revision: https://reviews.llvm.org/D92582	2020-12-07 17:24:23 -08:00
Jessica Paquette	195a7af0ab	[AArch64][GlobalISel] Narrow 128-bit regs to 64-bit regs in emitTestBit When we have a 128-bit register, emitTestBit would incorrectly narrow to 32 bits always. If the bit number was > 32, then we would need a TB(N)ZX. This would cause a crash, as we'd have the wrong register class. (PR48379) This generalizes `narrowExtReg` into `moveScalarRegClass`. This also allows us to remove `widenGPRBankRegIfNeeded` entirely, since `selectCopy` correctly handles SUBREG_TO_REG etc. This does create some codegen changes (since `selectCopy` uses the `all` regclass variants). However, I think that these will likely be optimized away, and we can always improve the `selectCopy` code. It looks like we should revisit `selectCopy` at this point, and possibly refactor it into at least one `emit` function. Differential Revision: https://reviews.llvm.org/D92707	2020-12-07 15:04:33 -08:00
Amara Emerson	2ac4d0f45a	[AArch64] Fix some minor coding style issues in AArch64CompressJumpTables	2020-12-07 12:48:09 -08:00
Stanislav Mekhanoshin	dd89249498	[AMDGPU] Annotate vgpr<->agpr spills in asm Differential Revision: https://reviews.llvm.org/D92125	2020-12-07 11:25:25 -08:00

60394 Commits