llvm-project

Commit Graph

Author	SHA1	Message	Date
David Bolvansky	d0080c3a5f	[DAGCombiner] Fold 0 div/rem X to 0 Reviewers: RKSimon, spatel, javed.absar, craig.topper, t.p.northover Reviewed By: RKSimon Subscribers: craig.topper, llvm-commits Differential Revision: https://reviews.llvm.org/D52504 llvm-svn: 345721	2018-10-31 14:18:57 +00:00
Nicolai Haehnle	814abb59df	AMDGPU: Rewrite SILowerI1Copies to always stay on SALU Summary: Instead of writing boolean values temporarily into 32-bit VGPRs if they are involved in PHIs or are observed from outside a loop, we use bitwise masking operations to combine lane masks in a way that is consistent with wave control flow. Move SIFixSGPRCopies to before this pass, since that pass incorrectly attempts to move SGPR phis to VGPRs. This should recover most of the code quality that was lost with the bug fix in "AMDGPU: Remove PHI loop condition optimization". There are still some relevant cases where code quality could be improved, in particular: - We often introduce redundant masks with EXEC. Ideally, we'd have a generic computeKnownBits-like analysis to determine whether masks are already masked by EXEC, so we can avoid this masking both here and when lowering uniform control flow. - The criterion we use to determine whether a def is observed from outside a loop is conservative: it doesn't check whether (loop) branch conditions are uniform. Change-Id: Ibabdb373a7510e426b90deef00f5e16c5d56e64b Reviewers: arsenm, rampitec, tpr Subscribers: kzhuravl, jvesely, wdng, mgorny, yaxunl, dstuttard, t-tye, eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D53496 llvm-svn: 345719	2018-10-31 13:27:08 +00:00
Nicolai Haehnle	28212cc689	AMDGPU: Remove PHI loop condition optimization Summary: The optimization to early break out of loops if all threads are dead was never fully implemented. But the PHI node analyzing is actually causing a number of problems, so remove all the extra code for it. (This does actually regress code quality in a few places because it ends up relying more heavily on phi's of i1, which we don't do a great job with. However, since it fixes real bugs in the wild, we should take this change. I have some prototype changes to improve i1 lowering in general -- not just for control flow -- which should help recover the code quality, I just need to make those changes fit for general consumption. -- Nicolai) Change-Id: I6fc6c6c8961857ac6009fcfb9f7e5e48dc23fbb1 Patch-by: Christian König <christian.koenig@amd.com> Reviewers: arsenm, rampitec, tpr Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D53359 llvm-svn: 345718	2018-10-31 13:26:48 +00:00
Sanjay Patel	2efccd2cf2	[InstSimplify] fold icmp based on range of abs/nabs This is a fix for PR39475: https://bugs.llvm.org/show_bug.cgi?id=39475 We managed to get some of these patterns using computeKnownBits in D47041, but that can't be used for nabs(). Instead, put in some range-based logic, so we can fold both abs/nabs with icmp with a constant value. Alive proofs: https://rise4fun.com/Alive/21r Name: abs_nsw_is_positive %cmp = icmp slt i32 %x, 0 %negx = sub nsw i32 0, %x %abs = select i1 %cmp, i32 %negx, i32 %x %r = icmp sgt i32 %abs, -1 => %r = i1 true Name: abs_nsw_is_not_negative %cmp = icmp slt i32 %x, 0 %negx = sub nsw i32 0, %x %abs = select i1 %cmp, i32 %negx, i32 %x %r = icmp slt i32 %abs, 0 => %r = i1 false Name: nabs_is_negative_or_0 %cmp = icmp slt i32 %x, 0 %negx = sub i32 0, %x %nabs = select i1 %cmp, i32 %x, i32 %negx %r = icmp slt i32 %nabs, 1 => %r = i1 true Name: nabs_is_not_over_0 %cmp = icmp slt i32 %x, 0 %negx = sub i32 0, %x %nabs = select i1 %cmp, i32 %x, i32 %negx %r = icmp sgt i32 %nabs, 0 => %r = i1 false Differential Revision: https://reviews.llvm.org/D53844 llvm-svn: 345717	2018-10-31 13:25:10 +00:00
Andrea Di Biagio	3d2b7176fc	[tblgen][PredicateExpander] Add the ability to describe more complex constraints on instruction operands. Before this patch, class PredicateExpander only knew how to expand simple predicates that performed checks on instruction operands. In particular, the new scheduling predicate syntax was not rich enough to express checks like this one: Foo(MI->getOperand(0).getImm()) == ExpectedVal; Here, the immediate operand value at index zero is passed in input to function Foo, and ExpectedVal is compared against the value returned by function Foo. While this predicate pattern doesn't show up in any X86 model, it shows up in other upstream targets. So, being able to support those predicates is fundamental if we want to be able to modernize all the scheduling models upstream. With this patch, we allow users to specify if a register/immediate operand value needs to be passed in input to a function as part of the predicate check. Now, register/immediate operand checks all derive from base class CheckOperandBase. This patch also changes where TIIPredicate definitions are expanded by the instructon info emitter. Before, definitions were expanded in class XXXGenInstrInfo (where XXX is a target name). With the introduction of this new syntax, we may want to have TIIPredicates expanded directly in XXXInstrInfo. That is because functions used by the new operand predicates may only exist in the derived class (i.e. XXXInstrInfo). This patch is a non functional change for the existing scheduling models. In future, we will be able to use this richer syntax to better describe complex scheduling predicates, and expose them to llvm-mca. Differential Revision: https://reviews.llvm.org/D53880 llvm-svn: 345714	2018-10-31 12:28:05 +00:00
Neil Henning	63718b214a	[AMDGPU] support image load/store a16 Our a16 support was only enabled for sample/gather and buffer load/store, but not for image load/store operations (which take an i16 as the pixel index rather than a half). Fix our isel lowering and add test cases to prove it out. Differential Revision: https://reviews.llvm.org/D53750 llvm-svn: 345710	2018-10-31 10:34:48 +00:00
Max Kazantsev	541f824d32	[IndVars] Strengthen restricton in rewriteLoopExitValues For some unclear reason rewriteLoopExitValues considers recalculation after the loop profitable if it has some "soft uses" outside the loop (i.e. any use other than call and return), even if we have proved that it has a user inside the loop which we think will not be optimized away. There is no existing unit test that would explain this. This patch provides an example when rematerialisation of exit value is not profitable but it passes this check due to presence of a "soft use" outside the loop. It makes no sense to recalculate value on exit if we are going to compute it due to some irremovable within the loop. This patch disallows applying this transform in the described situation. Differential Revision: https://reviews.llvm.org/D51581 Reviewed By: etherzhhb llvm-svn: 345708	2018-10-31 10:30:50 +00:00
Dorit Nuzman	34da6dd696	[LV] Support vectorization of interleave-groups that require an epilog under optsize using masked wide loads Under Opt for Size, the vectorizer does not vectorize interleave-groups that have gaps at the end of the group (such as a loop that reads only the even elements: a[2*i]) because that implies that we'll require a scalar epilogue (which is not allowed under Opt for Size). This patch extends the support for masked-interleave-groups (introduced by D53011 for conditional accesses) to also cover the case of gaps in a group of loads; Targets that enable the masked-interleave-group feature don't have to invalidate interleave-groups of loads with gaps; they could now use masked wide-loads and shuffles (if that's what the cost model selects). Reviewers: Ayal, hsaito, dcaballe, fhahn Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D53668 llvm-svn: 345705	2018-10-31 09:57:56 +00:00
Alexander Potapenko	c1c4c9a494	[MSan] another take at instrumenting inline assembly - now with calls Turns out it's not always possible to figure out whether an asm() statement argument points to a valid memory region. One example would be per-CPU objects in the Linux kernel, for which the addresses are calculated using the FS register and a small offset in the .data..percpu section. To avoid pulling all sorts of checks into the instrumentation, we replace actual checking/unpoisoning code with calls to msan_instrument_asm_load(ptr, size) and msan_instrument_asm_store(ptr, size) functions in the runtime. This patch doesn't implement the runtime hooks in compiler-rt, as there's been no demand in assembly instrumentation for userspace apps so far. llvm-svn: 345702	2018-10-31 09:32:47 +00:00
Sanjin Sijaric	fadebc8aae	[ARM64] [Windows] Exception handling support in frame lowering Emit pseudo instructions indicating unwind codes corresponding to each instruction inside the prologue/epilogue. These are used by the MCLayer to populate the .xdata section. Differential Revision: https://reviews.llvm.org/D50288 llvm-svn: 345701	2018-10-31 09:27:01 +00:00
Martin Storsjo	315357faca	[AArch64] Mark condition flags and x16/x17 as clobbered when calling __chkstk This is similar to SVN r311061 for ARM. Differential Revision: https://reviews.llvm.org/D53878 llvm-svn: 345698	2018-10-31 08:14:09 +00:00
Lang Hames	91449355f5	[ORC] Fix hex printing of uint64_t values. A plain "%x" format string will drop the high 32-bits. Use the PRIx64 macro instead. llvm-svn: 345696	2018-10-31 05:16:14 +00:00
Wolfgang Pieb	f39a9bbe72	[DWARF] Revert r345546: Refactor range list extraction and dumping This patch caused some internal tests to break which are being investigated. llvm-svn: 345687	2018-10-31 01:12:58 +00:00
Fangrui Song	a23f091ba3	Use llvm::any_of instead std::any_of. NFC llvm-svn: 345683	2018-10-31 00:31:06 +00:00
Matthias Braun	9fd397b423	ADT/STLExtras: Introduce llvm::empty; NFC This is modeled after C++17 std::empty(). Differential Revision: https://reviews.llvm.org/D53909 llvm-svn: 345679	2018-10-31 00:23:23 +00:00
Saleem Abdulrasool	91242b788a	DWARFVerifier: make the verifier more comprehensive for objects Make the code do what was mentioned in the comment: only skip the CU types. This enables the lexical blocks to be verified as well. llvm-svn: 345675	2018-10-30 23:45:27 +00:00
Matthias Braun	a83403892a	MachineOperand/MIParser: Do not print debug-use flag, infer it The debug-use flag must be set exactly for uses on DBG_VALUEs. This is so obvious that it can be trivially inferred while parsing. This will reduce noise when printing while omitting an information that has little value to the user. The parser will keep recognizing the flag for compatibility with old `.mir` files. Differential Revision: https://reviews.llvm.org/D53903 llvm-svn: 345671	2018-10-30 23:28:27 +00:00
Konstantin Zhuravlyov	2d22d24ac4	Revert r345542: AMDGPU: Enable code object v3 by default It breaks mesa. llvm-svn: 345662	2018-10-30 22:02:40 +00:00
Cameron McInally	2ad870e785	[FPEnv] [FPEnv] Add constrained intrinsics for MAXNUM and MINNUM Differential Revision: https://reviews.llvm.org/D53216 llvm-svn: 345650	2018-10-30 21:01:29 +00:00
Sanjay Patel	4c39dfc91e	[InstCombine] use 'match' to reduce code; NFC llvm-svn: 345647	2018-10-30 20:52:25 +00:00
Quentin Colombet	900678227c	[InstCombine] Teach the move free before null test opti how to deal with noop casts InstCombine features an optimization that essentially replaces: if (a) free(a) into: free(a) Right now, this optimization is gated by the minsize attribute and therefore we only perform it if we can prove that we are going to be able to eliminate the branch and the destination block. However when casts are involved the optimization would fail to apply, because the optimization was not smart enough to realize that it is possible to also move the casts away from the destination block and that is harmless to the performance since they are just noops. E.g., foo(int a) if (a) free((char)a) Wouldn't be optimized by instcombine, because - We would refuse to hoist the `bitcast i32* %a to i8` in the source block - We would fail to see that `bitcast i32* %a to i8` and %a are the same value. This patch fixes both these problems: - It teaches the pattern matching of the comparison how to look through casts. - It checks that whether the additional instruction in the destination block can be hoisted and are harmless performance-wise. - It hoists all the code of the destination block in the source block. Differential Revision: D53356 llvm-svn: 345644	2018-10-30 20:51:04 +00:00
Mandeep Singh Grang	71e0cc2a0b	[COFF, ARM64] Make sure to forward arguments from vararg to musttail vararg Summary: Thunk functions in Windows are varag functions that call a musttail function to pass the arguments after the fixup is done. We need to make sure that we forward the arguments from the caller vararg to the callee vararg function. This is the same mechanism that is used for Windows on X86. Reviewers: ssijaric, eli.friedman, TomTan, mgrang, mstorsjo, rnk, compnerd, efriedma Reviewed By: efriedma Subscribers: efriedma, kristof.beyls, chrib, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D53843 llvm-svn: 345641	2018-10-30 20:46:10 +00:00
Craig Topper	4104c00658	[ScalarizeMaskedMemIntrin] Limit the scope of some variables that are only used inside loops. llvm-svn: 345638	2018-10-30 20:33:58 +00:00
Bjorn Pettersson	fe09a20f09	[DAGCombiner] Fix for big endian in ForwardStoreValueToDirectLoad Summary: Normalize the offset for endianess before checking if the store cover the load in ForwardStoreValueToDirectLoad. Without this we missed out on some optimizations for big endian targets. If for example having a 4 bytes store followed by a 1 byte load, loading the least significant byte from the store, the STCoversLD check would fail (see @test4 in test/CodeGen/AArch64/load-store-forwarding.ll). This patch also fixes a problem seen in an out-of-tree target. The target has i40 as a legal type, it is big endian, and the StoreSize for i40 is 48 bits. So when normalizing the offset for endianess we need to take the StoreSize into account (assuming that padding added when storing into a larger StoreSize always is added at the most significant end). Reviewers: niravd Reviewed By: niravd Subscribers: javed.absar, kristof.beyls, llvm-commits, uabelho Differential Revision: https://reviews.llvm.org/D53776 llvm-svn: 345636	2018-10-30 20:16:39 +00:00
Eli Friedman	93d0129b78	[AArch64] [Windows] SEH opcodes should be scheduling boundaries. Prevents the post-RA scheduler from modifying the prologue sequences emitting by frame lowering. This is roughly similar to what we do for other targets: TargetInstrInfo::isSchedulingBoundary checks isPosition(), which checks for CFI_INSTRUCTION. isSEHInstruction is taken from D50288; it'll land with whatever patch lands first. Differential Revision: https://reviews.llvm.org/D53851 llvm-svn: 345634	2018-10-30 19:24:51 +00:00
David Greene	3e89fa8e08	[AArch64] Create proper memoperand for multi-vector stores Re-apply r345315 with testcase fixes. Include all of the store's source vector operands when creating the MachineMemOperand. Previously, we were missing the first operand, making the store size seem smaller than it really is. Differential Revision: https://reviews.llvm.org/D52816 llvm-svn: 345631	2018-10-30 19:17:51 +00:00
Craig Topper	6958b5ffa9	[X86] In lowerVectorShuffleAsBroadcast, make peeking through CONCAT_VECTORS work correctly if we already walked through a bitcast that changed the element size. The CONCAT_VECTORS case was using the original mask element count to determine how to adjust the broadcast index. But if we looked through a bitcast the original mask size doesn't tell us anything about the concat_vectors. This patch switchs to using the concat_vectors input element count directly instead. Differential Revision: https://reviews.llvm.org/D53823 llvm-svn: 345626	2018-10-30 18:48:42 +00:00
Calixte Denizet	38d50545fe	[GCOV] Function counters are wrong when on one line Summary: After commit https://reviews.llvm.org/rL344228, the function definitions have a counter but when on one line the counter is wrong (e.g. void foo() { }) I added a test in: https://reviews.llvm.org/D53601 Reviewers: marco-c Reviewed By: marco-c Subscribers: llvm-commits, sylvestre.ledru Differential Revision: https://reviews.llvm.org/D53600 llvm-svn: 345624	2018-10-30 18:41:31 +00:00
Nirav Dave	eedd2ccd02	[DAG] Add const variants for BaseIndexOffset functions. llvm-svn: 345623	2018-10-30 18:26:43 +00:00
Zachary Turner	bfac17f21e	Fix printing bug in pdb2yaml. We were using the wrong enum table when mapping enum values to strings for public symbol flags. llvm-svn: 345622	2018-10-30 18:25:38 +00:00
Ulrich Weigand	c5854b0adb	[SystemZ] Simplify LRV/STRV ISD nodes The LRV and STRV nodes carry an extra operand to indicate the type of the memory access. This is redundant, since the nodes are actually of class MemIntrinsicNode and therefore hold that same information already as MemoryVT. NFC intended. llvm-svn: 345618	2018-10-30 18:20:59 +00:00
Simon Pilgrim	44a9a71d2a	[TTI] Fix uses of SK_ExtractSubvector shuffle costs (PR39368) Correct costings of SK_ExtractSubvector requires the SubTy argument to indicate the type/size of the extracted subvector. Unlike the rest of the shuffle kinds this means that the main Ty argument represents the source vector type not the destination! I've done my best to fix a number of vectorizer uses: SLP - the reduction epilogue costs should be using a SK_PermuteSingleSrc shuffle as these all occur at the hardware vector width - we're not extracting (illegal) subvector types. This is causing the cost model diffs as SK_ExtractSubvector costs are poorly handled and tend to just return 1 at the moment. LV - I'm not clear on what the SK_ExtractSubvector should represents for recurrences - I've used a <1 x ?> subvector extraction as that seems to match the VF delta. Differential Revision: https://reviews.llvm.org/D53573 llvm-svn: 345617	2018-10-30 18:10:02 +00:00
Sanjay Patel	68a61cb07c	[InstCombine] use getFltSemantics() instead of duplicating it; NFC llvm-svn: 345613	2018-10-30 16:21:56 +00:00
Sanjay Patel	b12e410082	[InstCombine] try to turn shuffle into insertelement shuffle (insert ?, Scalar, IndexC), V1, Mask --> insert V1, Scalar, IndexC' The motivating case is at least a couple of steps away: I noticed that SLPVectorizer does not analyze shuffles as well as sequences of insert/extract in PR34724: https://bugs.llvm.org/show_bug.cgi?id=34724 ...so SLP may fail to vectorize when source code has shuffles to start with or instcombine has converted insert/extract to shuffles. Independent of that, an insertelement is always a simpler op for IR analysis vs. a shuffle, so we should transform to insert when possible. I don't think there's any codegen concern here - if a target can't insert a scalar directly to some fixed element in a vector (x86?), then this should get expanded to the insert+shuffle that we started with. Differential Revision: https://reviews.llvm.org/D53507 llvm-svn: 345607	2018-10-30 15:26:39 +00:00
Jonas Paulsson	611b533f1d	[SchedModel] Fix for read advance cycles with implicit pseudo operands. The SchedModel allows the addition of ReadAdvances to express that certain operands of the instructions are needed at a later point than the others. RegAlloc may add pseudo operands that are not part of the instruction descriptor, and therefore cannot have any read advance entries. This meant that in some cases the desired read advance was nullified by such a pseudo operand, which still had the original latency. This patch fixes this by making sure that such pseudo operands get a zero latency during DAG construction. Review: Matthias Braun, Ulrich Weigand. https://reviews.llvm.org/D49671 llvm-svn: 345606	2018-10-30 15:04:40 +00:00
Jonas Paulsson	1f067c94dc	[LoopVectorizer] Fix for cost values of memory accesses. This commit is a combination of two patches: * "Fix in getScalarizationOverhead()" If target returns false in TTI.prefersVectorizedAddressing(), it means the address registers will not need to be extracted. Therefore, there should be no operands scalarization overhead for a load instruction. * "Don't pass the instruction pointer from getMemInstScalarizationCost." Since VF is always > 1, this is a cost query for an instruction in the vectorized loop and it should not be evaluated within the scalar context of the instruction. Review: Ulrich Weigand, Hal Finkel https://reviews.llvm.org/D52351 https://reviews.llvm.org/D52417 llvm-svn: 345603	2018-10-30 14:34:15 +00:00
Sanjay Patel	8b207defea	[DAGCombiner] narrow vector binops when extraction is cheap Narrowing vector binops came up in the demanded bits discussion in D52912. I don't think we're going to be able to do this transform in IR as a canonicalization because of the risk of creating unsupported widths for vector ops, but we already have a DAG TLI hook to allow what I was hoping for: isExtractSubvectorCheap(). This is currently enabled for x86, ARM, and AArch64 (although only x86 has existing regression test diffs). This is artificially limited to not look through bitcasts because there are so many test diffs already, but that's marked with a TODO and is a small follow-up. Differential Revision: https://reviews.llvm.org/D53784 llvm-svn: 345602	2018-10-30 14:14:34 +00:00
Sanjay Patel	680c9227ca	[SelectionDAG] fix build warning for mismatched signs in compare; NFC llvm-svn: 345598	2018-10-30 13:47:19 +00:00
Jonas Paulsson	af8e036c29	[SystemZ] Improve isFoldableLoad() for Sub, SDiv and UDiv. Sub, SDiv and UDiv are not commutative, so only the RHS operand can fold a load. This patch adds a check for this. Review: Ulrich Weigand https://reviews.llvm.org/D53791 llvm-svn: 345596	2018-10-30 13:41:03 +00:00
Francis Visoiu Mistrih	0e237d357e	[X86] Re-enable the machine verifier after fixing more tests Was disabled again in r345528. Hopefully this the bots. llvm-svn: 345593	2018-10-30 12:20:17 +00:00
Francis Visoiu Mistrih	85d3f1ee8f	[llc] Error out when -print-machineinstrs is used with an unknown pass We used to assert instead of reporting an error. PR39494 llvm-svn: 345589	2018-10-30 12:07:18 +00:00
Nicola Zaghen	f96383c99e	[SROA] Use offset sizes from the DataLayout instead of the pointer siezes. This fixes an assertion when constant folding a GEP when the part of the offset was in i32 (IndexSize, as per DataLayout) and part in the i64 (PointerSize) in the newly created test case. Differential Revision: https://reviews.llvm.org/D52609 llvm-svn: 345585	2018-10-30 11:15:04 +00:00
Roman Lebedev	b3a14208ac	[X86][BMI1] X86DAGToDAGISel: select BEXTR from x & (-1 >> (32 - y)) pattern Summary: The final pattern. There is no test changes: * We are looking for the pattern with one-use of it's mask, * If the mask is one-use, D48768 will unfold it into pattern d. * Thus, the tests have extra-use on the mask. * Thus, only the BMI2 BZHI can be tested, and it already worked. * So there is no BMI1 test coverage, we just assume it works since it uses the same codepath. Reviewers: craig.topper, RKSimon Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D53575 llvm-svn: 345584	2018-10-30 11:12:34 +00:00
Diogo N. Sampaio	a3783b2ac2	[AArch64] Add support for UDF instruction Summary: Add support for AArch64 UDF instruction. UDF - Permanently Undefined generates an Undefined Instruction exception (ESR_ELx.EC = 0b000000). Reviewers: DavidSpickett, javed.absar, t.p.northover Reviewed By: javed.absar Subscribers: nhaehnle, kristof.beyls Differential Revision: https://reviews.llvm.org/D53319 llvm-svn: 345581	2018-10-30 11:06:50 +00:00
Simon Pilgrim	858303b827	[SelectionDAG] Add FoldBUILD_VECTOR to simplify new BUILD_VECTOR nodes Similar to FoldCONCAT_VECTORS, this patch adds FoldBUILD_VECTOR to simplify cases that can avoid the creation of the BUILD_VECTOR - if all the operands are UNDEF or if the BUILD_VECTOR simplifies to a copy. This exposed an assumption in some AMDGPU code that getBuildVector was guaranteed to be a BUILD_VECTOR node that I've tried to handle. Differential Revision: https://reviews.llvm.org/D53760 llvm-svn: 345578	2018-10-30 10:32:11 +00:00
David Bolvansky	dfdbb038e8	[DAGCombiner] Improve X div/rem Y fold if single bit element type Summary: Tests by @spatel, thanks Reviewers: spatel, RKSimon Reviewed By: spatel Subscribers: sdardis, atanasyan, llvm-commits, spatel Differential Revision: https://reviews.llvm.org/D52668 llvm-svn: 345575	2018-10-30 09:07:22 +00:00
Craig Topper	b293322cee	[LegalizeTypes] Teach PromoteIntRes_BITCAST to better handle a bitcast with vector output type and a vector input type that needs to be widened Summary: Previously if we had a bitcast vector output type that needs promotion and a vector input type that needs widening we would just do a stack store and load to handle the conversion. We can do a little better if we can widen the bitcast to a legal vector type the same size as the widened input type. Then we can do the bitcast between this widened type and the widened input type. Afterwards we can extract_subvector back to the original output and any_extend that. Type legalization will then circle back and handle promotion of the extract_subvector and the any_extend will just be removed. This will avoid going through the stack and allows us to remove a custom version of this legalization from X86. Reviewers: efriedma, RKSimon Reviewed By: efriedma Subscribers: javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D53229 llvm-svn: 345567	2018-10-30 03:27:15 +00:00
Craig Topper	67c2878501	[X86] Cleanup the code in LowerFABSorFNEG and LowerFCOPYSIGN a little. NFC Use SelectionDAG::EVTToAPFloatSemantics. Make the LogicVT calculation in LowerFABSorFNEG similar to LowerFCOPYSIGN. Use APInt::getSignedMaxValue instead of ~APInt::getSignMask. llvm-svn: 345565	2018-10-30 03:27:12 +00:00
Craig Topper	676d7a7a43	[X86] Stop changing f128 fand/for/fxor to v2i64. The additional patterns don't cost us much and it seems better than changing element widths. llvm-svn: 345564	2018-10-30 03:27:11 +00:00
Matt Arsenault	abc4f29f9c	AMDGPU: Remove custom BUILD_VECTOR combine This was looping in a testcase and removing it now slightly improves a test. llvm-svn: 345560	2018-10-30 01:37:59 +00:00

1 2 3 4 5 ...

117877 Commits