llvm-project

Commit Graph

Author	SHA1	Message	Date
Nikita Popov	8e1a464e6a	[CodeGen][X86] Expand UADDSAT to NOT+UMIN+ADD Followup to D56636, this time handling the UADDSAT case by expanding uadd.sat(a, b) to umin(a, ~b) + b. Differential Revision: https://reviews.llvm.org/D56869 llvm-svn: 352409	2019-01-28 19:19:09 +00:00
Jessica Paquette	7db82d7257	[GlobalISel][AArch64] Add instruction selection support for G_FCOS and G_FSIN This contains all of the legalizer changes from D57197 necessary to select G_FCOS and G_FSIN. It also updates several existing IR tests in test/CodeGen/AArch64 that verify that we correctly lower the G_FCOS and G_FSIN instructions. https://reviews.llvm.org/D57197 3/3 llvm-svn: 352402	2019-01-28 18:34:18 +00:00
Simon Pilgrim	2c17512456	[X86][AVX] Remove lowerShuffleByMerging128BitLanes 2-lane restriction First step towards adding support for 64-bit unary "sublane" handling (a bit like lowerShuffleAsRepeatedMaskAndLanePermute). This allows us to add lowerV64I8Shuffle handling. llvm-svn: 352389	2019-01-28 17:02:35 +00:00
Sanjay Patel	94cca60b82	[x86] allow more shuffle splitting to avoid vpermps (PR40434) This is tricky to make optimal: sometimes we're better off using a single wider op, but other times it makes more sense to combine narrow ops to achieve the same result. This solves the case from: https://bugs.llvm.org/show_bug.cgi?id=40434 There's potentially a similar change for vectors with 64-bit elements, but it needs adjustments similar to rL352333 to avoid creating infinite loops. llvm-svn: 352380	2019-01-28 15:51:34 +00:00
Arnaud A. de Grandmaison	51eb87cadd	Remove no longer needed Arm specific LICENSE.TXT file. As the codebase is now under the Apache 2.0 license with LLVM Exceptions, and all Arm's contributions, past or future, are under that new license, this Arm specific LICENSE.TXT is no longer needed, thus removing it. llvm-svn: 352376	2019-01-28 15:38:01 +00:00
Aleksandar Beserminji	6c5dfcb89e	[mips] Support for +abs2008 attribute Instruction abs.[ds] does not generate a correct result when working with NaNs on revisions prior to mips32r6 and mips64r6. To generate a sequence which always produces a correct result, while also giving the user more control over how the code is compiled, the +abs2008 attribute is added so the user can choose legacy or 2008 mode. By default, legacy mode is used on revisions prior to R6. Mips32r6 and mips64r6 use abs2008 mode by default. Differential Revision: https://reviews.llvm.org/D35983 llvm-svn: 352370	2019-01-28 14:59:30 +00:00
Tim Corringham	824ca3f3dd	[AMDGPU] Add intrinsics for 16 bit interpolation Summary: Added the intrinsics llvm.amdgcn.interp.p1.f16() and llvm.amdgcn.interp.p2.f16() and related LIT test. The p1 intrinsic generates code appropriate for both 16 and 32 bank LDS. Reviewers: #amdgpu, dstuttard, arsenm, tpr Reviewed By: #amdgpu, arsenm Subscribers: jvesely, mgorny, arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D46754 llvm-svn: 352357	2019-01-28 13:48:59 +00:00
Petar Avramovic	7cecadb9af	[MIPS GlobalISel] Select sub Lower G_USUBO and G_USUBE. Add narrowScalar for G_SUB. Legalize and select G_SUB for MIPS 32. Differential Revision: https://reviews.llvm.org/D53416 llvm-svn: 352351	2019-01-28 12:10:17 +00:00
Diana Picus	574e0c5e32	[ARM GlobalISel] Support integer division for Thumb2 Support G_SDIV, G_UDIV, G_SREM and G_UREM. The only significant difference between arm and thumb mode is that we need to check a different subtarget feature. llvm-svn: 352346	2019-01-28 10:37:30 +00:00
Craig Topper	453150bc18	[X86] Add new variadic avx512 compress/expand intrinsics that use vXi1 types for the mask argument. Remove and autoupgrade the old intrinsics llvm-svn: 352343	2019-01-28 07:03:03 +00:00
Amara Emerson	fd31bf95c1	[AArch64][GlobalISel] Teach RBS about G_FNEG default mapping. llvm-svn: 352340	2019-01-28 03:21:14 +00:00
Amara Emerson	0bfa2faccc	[AArch64][GlobalISel] Add some missing vector support for FP arithmetic ops. Moved the fneg lowering legalization test from AArch64 to X86, as we want to specify that it's already legal. llvm-svn: 352338	2019-01-28 02:28:22 +00:00
Amara Emerson	92ffb305cc	[AArch64][GlobalISel] Add some vector support for fp <-> int conversions. Some unrelated, but benign, test changes as well due to the test update script. llvm-svn: 352337	2019-01-28 02:27:59 +00:00
Sanjay Patel	ebe6b43aec	[x86] add restriction for lowering to vpermps This transform was added with rL351346, and we had an escape for shufps, but we also want one for unpckps vs. vpermps because vpermps doesn't take an immediate shuffle index operand. llvm-svn: 352333	2019-01-27 21:53:33 +00:00
Simon Pilgrim	670a6971f8	[X86][SSE] Add UNDEF handling to combineSelect ISD::USUBSAT matching (PR40083) llvm-svn: 352330	2019-01-27 21:01:23 +00:00
Simon Pilgrim	f10b6623cc	[X86][SSE] Permit UNDEFs in combineAddToSUBUS matching (PR40083) llvm-svn: 352328	2019-01-27 20:36:37 +00:00
Sanjay Patel	5f1fdaa192	[x86] refactor logic in lowerShuffleWithUndefHalf Although this is longer code, this is no-functional-change-intended. The goal is to untangle the conditions under which we bail out, so that's easier to adjust. llvm-svn: 352320	2019-01-27 18:12:03 +00:00
Gabor Buella	a0f743b77a	[X86] Add some missing blsr patterns The add+and sequence followed by a branch can happen e.g. when looping over the set bits of an integer: ``` while (x != 0) { func(x & ~x); x &= x - 1; } ``` Reviewed By: ctopper Differential Revision: https://reviews.llvm.org/D57296 llvm-svn: 352306	2019-01-27 06:15:39 +00:00
Craig Topper	e65d4c5525	[X86] Add a pattern for (i64 (and (anyext def32:), 0x00000000FFFFFFFF)) to produce SUBREG_TO_REG def32 here means the producing instruction zeroed bits 63:32. We already do this for zext, but it looks like we can get an and+anyext sometimes. Spotted in the diffs from D33587. llvm-svn: 352303	2019-01-27 03:37:05 +00:00
Matt Arsenault	211e89d4dd	GlobalISel: Implement narrowScalar for mul llvm-svn: 352300	2019-01-27 00:52:51 +00:00
Matt Arsenault	2e5f900849	GlobalISel: fewerElementsVector for intrinsic_trunc/intrinsic_round llvm-svn: 352298	2019-01-27 00:12:21 +00:00
Matt Arsenault	ded2f82662	AMDGPU/GlobalISel: Use scalarize instead of clampMaxNumElements llvm-svn: 352297	2019-01-26 23:54:53 +00:00
Matt Arsenault	26a6c74fbe	AMDGPU/GlobalISel: Legalize more bit ops llvm-svn: 352295	2019-01-26 23:47:07 +00:00
Matt Arsenault	4d47594fc5	AMDGPU/GlobalISel: Widen small uaddo/usubo llvm-svn: 352294	2019-01-26 23:44:51 +00:00
Simon Pilgrim	a914fa4dd8	[X86] combineAddOrSubToADCOrSBB/combineCarryThroughADD - use oneuse for entire SDNode Fix issue noted in D57281 that only tested the one use for the SDValue (the result flag), not the entire SUB. I've added the getNode() to make it clearer what is intended than just the -> redirection. llvm-svn: 352291	2019-01-26 21:29:16 +00:00
Simon Pilgrim	37a8e65a60	[X86] combineCarryThroughADD - add support for X86::COND_A commutations (PR24545) As discussed on PR24545, we should try to commute X86::COND_A 'icmp ugt' cases to X86::COND_B 'icmp ult' to more optimally bind the carry flag output to a SBB instruction. Differential Revision: https://reviews.llvm.org/D57281 llvm-svn: 352289	2019-01-26 20:23:04 +00:00
Simon Pilgrim	b7a15acd38	[X86] Fold X86ISD::SBB(ISD::SUB(X,Y),0) -> X86ISD::SBB(X,Y) (PR25858) We often generate X86ISD::SBB(X, 0) for carry flag arithmetic. I had tried to create test cases for the ADC equivalent (which often uses the same pattern) but haven't managed to find anything yet. Differential Revision: https://reviews.llvm.org/D57169 llvm-svn: 352288	2019-01-26 20:13:44 +00:00
Simon Pilgrim	6162fba57c	[X86][SSE] Generalized unsigned compares to support nonsplat constant vectors (PR39859) llvm-svn: 352283	2019-01-26 16:40:03 +00:00
Sanjay Patel	a03c63b77f	[x86] add helper for creating a half-width shuffle; NFC This reduces a bit of duplication between the combining and lowering places that use it, but the primary motivation is to make it easier to rearrange the lowering logic and solve PR40434: https://bugs.llvm.org/show_bug.cgi?id=40434 llvm-svn: 352280	2019-01-26 16:20:22 +00:00
Craig Topper	3b5e01b386	[X86] Remove and autoupgrade vpconflict intrinsics that take a mask and passthru argument. We have unmasked versions as of r352172 llvm-svn: 352270	2019-01-26 06:27:01 +00:00
Craig Topper	58e6b37e62	Revert r352255 "[SelectionDAG][X86] Don't use SEXTLOAD for promoting masked loads in the type legalizer" This might be breaking an lldb windows buildbot. llvm-svn: 352268	2019-01-26 02:44:58 +00:00
Craig Topper	6c9c7d0796	[X86] Remove GCCBuiltins from 512-bit cvt(u)qqtops, cvt(u)qqtopd, and cvt(u)dqtops intrinsics. Add new variadic uitofp/sitofp with rounding mode intrinsics. Summary: See clang patch D56998 for a full description. Reviewers: RKSimon, spatel Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D56999 llvm-svn: 352266	2019-01-26 02:41:54 +00:00
Thomas Lively	2b8b2978e4	[WebAssembly][NFC] Group SIMD-related ISel configuration Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish Differential Revision: https://reviews.llvm.org/D57263 llvm-svn: 352262	2019-01-26 01:25:37 +00:00
Nemanja Ivanovic	7d007ddedf	[PowerPC] Update Vector Costs for P9 For the power9 CPU, vector operations consume a pair of execution units rather than one execution unit like a scalar operation. Update the target transform cost functions to reflect the higher cost of vector operations when targeting Power9. Patch by RolandF. Differential revision: https://reviews.llvm.org/D55461 llvm-svn: 352261	2019-01-26 01:18:48 +00:00
Craig Topper	7a8e74775c	[X86] Add DAG combine to merge vzext_movl with the various fp<->int conversion operations that only write the lower 64-bits of an xmm register and zero the rest. Summary: We have isel patterns for this, but we're missing some load patterns and all broadcast patterns. A DAG combine seems like a better fit for this. Reviewers: RKSimon, spatel Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D56971 llvm-svn: 352260	2019-01-26 01:17:09 +00:00
Craig Topper	b1d3457c03	[SelectionDAG][X86] Don't use SEXTLOAD for promoting masked loads in the type legalizer Summary: I'm not sure why we were using SEXTLOAD. EXTLOAD seems more appropriate since we don't care about the upper bits. This patch changes this and then modifies the X86 post legalization combine to emit an extending shuffle instead of a sign_extend_vector_inreg. Could maybe use an any_extend_vector_inreg, but I just did what we already do in LowerLoad. I think we can actually get rid of this code entirely if we switch to -x86-experimental-vector-widening-legalization. On AVX512 targets I think we might be able to use a masked vpmovzx and not have to expand this at all. Reviewers: RKSimon, spatel Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D57186 llvm-svn: 352255	2019-01-26 00:26:37 +00:00
Alex Bradbury	0092df0669	[RISCV] Add target DAG combine for bitcast fabs/fneg on RV32FD DAGCombiner::visitBITCAST will perform: fold (bitconvert (fneg x)) -> (xor (bitconvert x), signbit) fold (bitconvert (fabs x)) -> (and (bitconvert x), (not signbit)) As shown in double-bitmanip-dagcombines.ll, this can be advantageous. But RV32FD doesn't use bitcast directly (as i64 isn't a legal type), and instead uses RISCVISD::SplitF64. This patch adds an equivalent DAG combine for SplitF64. llvm-svn: 352247	2019-01-25 21:55:48 +00:00
Mircea Trofin	519f42d914	[llvm] Opt-in flag for X86DiscriminateMemOps Summary: Currently, if an instruction with a memory operand has no debug information, X86DiscriminateMemOps will generate one based on the first line of the enclosing function, or the last seen debug info. This may cause confusion in certain debugging scenarios. The long term approach would be to use the line number '0' in such cases, however, that brings in challenges: the base discriminator value range is limited (4096 values). For the short term, adding an opt-in flag for this feature. See bug 40319 (https://bugs.llvm.org/show_bug.cgi?id=40319) Reviewers: dblaikie, jmorse, gbedwell Reviewed By: dblaikie Subscribers: aprantl, eraman, hiraditya Differential Revision: https://reviews.llvm.org/D57257 llvm-svn: 352246	2019-01-25 21:49:54 +00:00
Jessica Paquette	1f9bc2854f	[GlobalISel][AArch64][NFC] Fix incorrect comment in selectUnmergeValues s/scalar/vector/ llvm-svn: 352243	2019-01-25 21:28:27 +00:00
Ana Pazos	05a6064385	Reapply: [RISCV] Set isAsCheapAsAMove for ADDI, ORI, XORI, LUI This reapplies commit r352010 with RISC-V test fixes. llvm-svn: 352237	2019-01-25 20:22:49 +00:00
Craig Topper	4cf28bad5b	[X86] Combine masked store and truncate into masked truncating stores. We also need to combine to masked truncating with saturation stores, but I'm leaving that for a future patch. This does regress some tests that used truncate with saturation followed by a masked store. Those now use a truncating store and use min/max to saturate. Differential Revision: https://reviews.llvm.org/D57218 llvm-svn: 352230	2019-01-25 18:37:36 +00:00
Sanjay Patel	0020f8bb23	[x86] simplify logic in lowerShuffleWithUndefHalf(); NFCI This seems unnecessarily complicated because we gave names to opposite polarity bools and have code comments that don't really line up with the logic. Step 1: remove UndefUpper and assert that it is the opposite of UndefLower after the initial early exit. llvm-svn: 352217	2019-01-25 17:00:41 +00:00
Simon Pilgrim	f56298f4b9	[X86] Simplify X86ISD::ADD/SUB if we don't use the result flag Simplify to the generic ISD::ADD/SUB if we don't make use of the result flag. This mainly helps with ADDCARRY/SUBBORROW intrinsics which get expanded to X86ISD::ADD/SUB but could be simplified further. Noticed in some of the test cases in PR31754 Differential Revision: https://reviews.llvm.org/D57234 llvm-svn: 352210	2019-01-25 15:58:28 +00:00
Sanjay Patel	21aa6ddc14	[x86] narrow a shuffle that doesn't use or set any high elements This isn't the final fix for our reduction/horizontal codegen, but it takes care of a lot of the problems. After we narrow the shuffle, existing combines for insert/extract and binops kick in, and we end up with cheaper 128-bit ops. The avg and mul reduction tests show an existing shuffle lowering hole for AVX2/AVX512. I think in its most minimal form this is: https://bugs.llvm.org/show_bug.cgi?id=40434 ...but we might need multiple fixes to get it right. Differential Revision: https://reviews.llvm.org/D57156 llvm-svn: 352209	2019-01-25 15:37:42 +00:00
Simon Pilgrim	dea6174b0b	Fix gcc -Wparentheses warning. NFCI. llvm-svn: 352193	2019-01-25 11:38:40 +00:00
Diana Picus	8976ad12a9	[ARM GlobalISel] Support shifts for Thumb2 Same as ARM. On this occasion we split some of the instruction select tests for more complicated instructions into their own files, so we can reuse them for ARM and Thumb mode. Likewise for the legalizer tests. llvm-svn: 352188	2019-01-25 10:48:42 +00:00
Diana Picus	23628c7b05	[ARM GlobalISel] Remove rebase artifact from r351882. NFC r351882 introduced some superfluous calls to mark G_INTTOPTR and G_PTRTOINT as legal (looks like a rebase mishap). Remove them. llvm-svn: 352187	2019-01-25 10:48:35 +00:00
Anton Korobeynikov	509d5c4a7d	[MSP430] Fix absolute addressing mode printing in AsmPrinter Align checks for absolute addressing mode with its current implementation (SR is used as a base register). This fixes https://bugs.llvm.org/show_bug.cgi?id=39993 Patch by Kristina Bessonova! Differential Revision: https://reviews.llvm.org/D56785 llvm-svn: 352178	2019-01-25 09:14:05 +00:00
Zi Xuan Wu	308a609c6e	[PowerPC] Enhance the fast selection of cmp instruction and clean up related asserts Fast selection of llvm icmp and fcmp instructions does not handle VSX instruction support well. We'd use a VSX float comparison instruction instead of a non-VSX float comparison instruction if the operand register class is VSSRC or VSFRC, because i32 and i64 are mapped to VSSRC and VSFRC respectively if the VSX feature is enabled. If the target does not have a corresponding VSX comparison instruction for some type, just copy the VSX-related register to the common float register class and use the non-VSX comparison instruction. Differential Revision: https://reviews.llvm.org/D57078 llvm-svn: 352174	2019-01-25 07:24:59 +00:00
Craig Topper	6fd9af587a	[X86] Add non-masked versions of vpconflict intrinsics so we can use a select in the header file in clang. I'll remove and autoupgrade the old intrinsics in a future commit. llvm-svn: 352172	2019-01-25 07:08:07 +00:00
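
The UADDSAT expansion described in 8e1a464e6a rewrites uadd.sat(a, b) as umin(a, ~b) + b. The following is a minimal sketch in Python that checks this identity exhaustively for 8-bit values. The 8-bit width and the helper names (`uadd_sat`, `uadd_sat_expanded`, `check_all`) are assumptions made for illustration and are not taken from the LLVM sources.

```python
BITS = 8                 # illustrative width (assumption); the identity is width-independent
MASK = (1 << BITS) - 1   # largest unsigned value representable in BITS bits


def uadd_sat(a: int, b: int) -> int:
    """Reference semantics: unsigned add that clamps at MASK instead of wrapping."""
    return min(a + b, MASK)


def uadd_sat_expanded(a: int, b: int) -> int:
    """Expansion quoted in the commit message: umin(a, ~b) + b."""
    not_b = ~b & MASK          # bitwise NOT within BITS bits, equal to MASK - b
    return min(a, not_b) + b   # never exceeds MASK, so no wraparound is needed


def check_all() -> bool:
    """Compare both forms over every pair of BITS-wide unsigned inputs."""
    return all(
        uadd_sat(a, b) == uadd_sat_expanded(a, b)
        for a in range(MASK + 1)
        for b in range(MASK + 1)
    )
```

The expansion works because ~b equals MASK - b for unsigned values: clamping a to that bound before adding b guarantees the sum cannot overflow, so a plain ADD suffices after the NOT and UMIN.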

50636 Commits