llvm-project

Author	SHA1	Message	Date
Alex Bradbury	748d080e62	[RISCV] Eliminate unnecessary masking of promoted shift amounts SelectionDAGBuilder::visitShift will always zero-extend a shift amount when it is promoted to the ShiftAmountTy. This results in zero-extension (masking) which is unnecessary for RISC-V as the shift operations only read the lower 5 or 6 bits (RV32 or RV64). I initially proposed adding a getExtendForShiftAmount hook so the shift amount can be any-extended (D52975). @efriedma explained this was unsafe, so I have instead eliminated the unnecessary and operations at instruction selection time in a manner similar to X86InstrCompiler.td. Differential Revision: https://reviews.llvm.org/D53224 llvm-svn: 344432	2018-10-12 23:18:52 +00:00
Craig Topper	3e76b2d736	[X86] Improve type legalization of (v2i32/v4i16/v8i16 (bitcast (v2f32))) to avoid a stack temporary. llvm-svn: 344425	2018-10-12 22:00:04 +00:00
Craig Topper	435e38a5df	[LegalizeVectorTypes] When widening the result of a bitcast from a scalar type, use a scalar_to_vector to turn the scalar into a vector instead of a build vector full of mostly undefs. This is more consistent with what we usually do and matches some code X86 custom emits in some cases that I think I can cleanup. The MIPS test change just looks to be an instruction ordering change. llvm-svn: 344422	2018-10-12 21:59:55 +00:00
Craig Topper	1bb0c6041a	[LegalizeVectorTypes] When widening the operands to a concat_vectors, see if we can use the widened operand 0 if the width matches and the other operands are undef. This saves a conversion to extracts and build_vector. We already do this when both the result and the input need to be widened to the same type. This changed the sse-intrinsics-fast-isel test because we don't lower (insert_vector_elt (scalar_to_vector X), Y, 1) well. We turn it into (vector_shuffle (scalar_to_vector X), (scalar_to_vector Y), <0, 4, 2, 3>) losing track of the fact that the upper elts could be undef. We should probably find a way to prevent the scalarization of the <2 x f32> load on these tests. llvm-svn: 344404	2018-10-12 19:37:49 +00:00
Simon Pilgrim	1d6b938132	Regenerate test. NFCI. llvm-svn: 344399	2018-10-12 19:03:54 +00:00
Sanjay Patel	e28c8ecd72	[x86] add and use fast horizontal vector math subtarget feature This is the planned follow-up to D52997. Here we are reducing horizontal vector math codegen by default. AMD Jaguar (btver2) should have no difference with this patch because it has fast-hops. (If we want to set that bit for other CPUs, let me know.) The code changes are small, but there are many test diffs. For files that are specifically testing for hops, I added RUNs to distinguish fast/slow, so we can see the consequences side-by-side. For files that are primarily concerned with codegen other than hops, I just updated the CHECK lines to reflect the new default codegen. To recap the recent horizontal op story: 1. Before rL343727, we were producing hops for all subtargets for a variety of patterns. Hops were likely not optimal for all targets though. 2. The IR improvement in r343727 exposed a hole in the backend hop pattern matching, so we reduced hop codegen for all subtargets. That was bad for Jaguar (PR39195). 3. We restored the hop codegen for all targets with rL344141. Good for Jaguar, but probably bad for other CPUs. 4. This patch allows us to distinguish when we want to produce hops, so everyone can be happy. I'm not sure if we have the best predicate here, but the intent is to undo the extra hop-iness that was enabled by r344141. Differential Revision: https://reviews.llvm.org/D53095 llvm-svn: 344361	2018-10-12 16:41:02 +00:00
Nick Desaulniers	47bab69a2e	[MC][ELF] fix newly added test Summary: Reland of - r344197 "[MC][ELF] compute entity size for explicit sections" - r344206 "[MC][ELF] Fix section_mergeable_size.ll" after being reverted in r344278 due to build breakages from not specifying a target triple. Move test from test/CodeGen/Generic/ to test/MC/ELF/. Add explicit target triple so we don't try to run this test on non ELF targets. Reported: https://reviews.llvm.org/D53056#1261707 Reviewers: fhahn, rnk, espindola, NoQ Reviewed By: fhahn, rnk Subscribers: NoQ, MaskRay, rengolin, emaste, arichardson, llvm-commits, pirama, srhines Differential Revision: https://reviews.llvm.org/D53146 llvm-svn: 344360	2018-10-12 16:35:44 +00:00
Zachary Turner	9f169afab2	Make YAML quote forward slashes. If you have the string /usr/bin, prior to this patch it would not be quoted by our YAML serializer. But a string like C:\src would be, due to the presence of a backslash. This makes the quoting rules of basically every single file path different depending on the path syntax (posix vs. Windows). While technically not required by the YAML specification to quote forward slashes, when the behavior of paths is inconsistent it makes it difficult to portably write FileCheck lines that will work with either kind of path. Differential Revision: https://reviews.llvm.org/D53169 llvm-svn: 344359	2018-10-12 16:31:20 +00:00
Zachary Turner	9c544199cf	Revert "Make YAML quote forward slashes." This reverts commit b86c16ad8c97dadc1f529da72a5bb74e9eaed344. This is being reverted because I forgot to write a useful commit message, so I'm going to resubmit it with an actual commit message. llvm-svn: 344358	2018-10-12 16:31:08 +00:00
Zachary Turner	ec234052a6	Make YAML quote forward slashes. llvm-svn: 344357	2018-10-12 16:24:09 +00:00
Sanjay Patel	f5b1892348	[AArch64][x86] add tests for trunc disguised as vector ops (PR39016); NFC These correspond to the IR transform from: D52439 llvm-svn: 344353	2018-10-12 15:22:14 +00:00
Simon Pilgrim	b8339c0167	[SelectionDAG] Move VectorLegalizer::ExpandCTLZ codegen into SelectionDAGLegalize Generalize SelectionDAGLegalize's CTLZ expansion to handle vectors - lets VectorLegalizer::ExpandCTLZ just pass the expansion on instead of repeating the same codegen. llvm-svn: 344349	2018-10-12 14:45:57 +00:00
Simon Pilgrim	78b5a3c3ef	[X86][SSE] LowerVectorCTPOP - pull out repeated byte sum stage. Pull out repeated byte sum stage for popcount of vector elements > 8bits. This allows us to simplify the LUT/BITMATH popcnt code to always assume vXi8 vectors, and also improves avx512bitalg codegen which only has access to vpopcntb/vpopcntw. llvm-svn: 344348	2018-10-12 14:18:47 +00:00
Hiroshi Inoue	9552dd187a	[PowerPC] avoid masking already-zero bits in BitPermutationSelector The current BitPermutationSelector generates code to build a value by tracking two types of bits: ConstZero and Variable. ConstZero means a bit we need to mask off and Variable is a bit we copy from an input value. This patch adds a third type of bit, VariableKnownToBeZero, caused by an AssertZext node or a zero-extending load node. VariableKnownToBeZero means a bit comes from an input value, but it is known to be already zero, so we do not need to mask it. VariableKnownToBeZero enhances flexibility to group bits, since we can avoid redundant masking for these bits. This patch also renames "HasZero" to "NeedMask" since now we may skip masking even when we have zeros (of type VariableKnownToBeZero). Differential Revision: https://reviews.llvm.org/D48025 llvm-svn: 344347	2018-10-12 14:02:20 +00:00
Simon Pilgrim	bb37e81b65	[X86][AVX] Regenerate tzcnt tests llvm-svn: 344341	2018-10-12 13:24:51 +00:00
Simon Pilgrim	29279f29c8	[X86][SSE] Add extract_subvector(PSHUFB) -> PSHUFB(extract_subvector()) combine Fixes PR32160 by reducing the size of PSHUFB if we only use one of the lanes. This approach can probably be generalized to handle any target shuffle (and any subvector index) but we have no test coverage at the moment. llvm-svn: 344336	2018-10-12 12:10:34 +00:00
Simon Pilgrim	bc760b2dc5	[X86][AVX] Add examples of shuffles that can be reduced to a cross-lane shuffle followed by an in-lane permute Suitable for lowering by D53148 llvm-svn: 344332	2018-10-12 10:26:59 +00:00
Simon Pilgrim	c844bc84dd	[X86] Ignore float/double non-temporal loads (PR39256) Scalar non-temporal loads were asserting instead of just being ignored. Reduced from https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=10895 llvm-svn: 344331	2018-10-12 10:20:16 +00:00
Stefan Maksimovic	285c0f4fdc	[mips] Mark fmaxl as a long double emulation routine Failure was discovered upon running projects/compiler-rt/test/builtins/Unit/divtc3_test.c in a stage2 compiler build. When compiling projects/compiler-rt/lib/builtins/divtc3.c, a call to fmaxl within the divtc3 implementation had its return values read from registers $2 and $3 instead of $f0 and $f2. Include fmaxl in the list of long double emulation routines to have its return value correctly interpreted as f128. Almost exact issue here: https://reviews.llvm.org/D17760 Differential Revision: https://reviews.llvm.org/D52649 llvm-svn: 344326	2018-10-12 08:18:38 +00:00
Sanjay Patel	56b6660d2e	[DAGCombiner] rearrange extract_element+bitcast fold; NFC I want to add another pattern here that includes scalar_to_vector, so this makes that patch smaller. I was hoping to remove the hasOneUse() check because it shouldn't be necessary for common codegen, but an AMDGPU test has a comment suggesting that the extra check makes things better on one of those targets. llvm-svn: 344320	2018-10-11 23:56:56 +00:00
Tom Stellard	a894043910	Revert "AMDGPU/GlobalISel: Implement select for G_INSERT" This reverts commit r344310. The test case was failing on some bots. llvm-svn: 344317	2018-10-11 23:36:46 +00:00
Tom Stellard	4733be6e7b	AMDGPU/GlobalISel: Implement select for G_INSERT Reviewers: arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D53116 llvm-svn: 344310	2018-10-11 22:49:54 +00:00
Sanjay Patel	99c02f6301	[x86] add tests for extract_element; NFC The transform for this pattern has an unnecessary one-use limitation. llvm-svn: 344303	2018-10-11 22:04:36 +00:00
Sanjay Patel	7ad8e32f03	[x86] regenerate CHECKs; NFC llvm-svn: 344301	2018-10-11 21:44:38 +00:00
Craig Topper	35d513c7e4	[X86] Type legalize v2f32 loads by using an f64 load and a scalar_to_vector. On 64-bit targets the generic legalize will use an i64 load and a scalar_to_vector for us. But on 32-bit targets i64 isn't legal and the generic legalizer will end up emitting two 32-bit loads. We have DAG combines that try to put those two loads back together with pretty good success. This patch instead uses f64 to avoid the splitting entirely. I've made it do the same for 64-bit mode for consistency and to keep the load in the fp domain. There are a few things in here that look like regressions in 32-bit mode, but I believe they bring us closer to the 64-bit mode codegen. And that the 64-bit mode code could be better. I think those issues should be looked at separately. Differential Revision: https://reviews.llvm.org/D52528 llvm-svn: 344291	2018-10-11 20:36:06 +00:00
Sumanth Gundapaneni	a4a9155e4f	[Hexagon] Restrict compound instructions with constant value. Having a constant value operand in the compound instruction is not always profitable. This patch improves coremark by ~4% on Hexagon. Differential Revision: https://reviews.llvm.org/D53152 llvm-svn: 344284	2018-10-11 19:48:15 +00:00
Artem Dergachev	2ce1d6faf8	Revert r344197 "[MC][ELF] compute entity size for explicit sections" Revert r344206 "[MC][ELF] Fix section_mergeable_size.ll" They were causing failures on too many important buildbots for too long. Please revert eagerly if your fix takes more than a couple of hours to land! llvm-svn: 344278	2018-10-11 18:43:08 +00:00
Nirav Dave	f1f2a2a31a	[DAG] Fix Big Endian in Load-Store forwarding Summary: Correct offset calculation in load-store forwarding for big-endian targets. Reviewers: rnk, RKSimon, waltl Subscribers: sdardis, nemanjai, hiraditya, jrtc27, atanasyan, jsji, llvm-commits Differential Revision: https://reviews.llvm.org/D53147 llvm-svn: 344272	2018-10-11 18:28:59 +00:00
Craig Topper	fb2ac8969e	[X86] Restore X86ISelDAGToDAG::matchBEXTRFromAnd. Teach address matching to create a BEXTR pattern from a (shl (and X, mask >> C1) if C1 can be folded into addressing mode. This is an alternative to D53080 since I think using a BEXTR for a shifted mask is definitely an improvement when the shl can be absorbed into addressing mode. The other cases I'm less sure about. We already have several tricks for handling an and of a shift in address matching. This adds a new case for BEXTR. I've moved the BEXTR matching code back to X86ISelDAGToDAG to allow it to match. I suppose alternatively we could directly emit a X86ISD::BEXTR node that isel could pattern match. But I'm trying to view BEXTR matching as an isel concern so DAG combine can see 'and' and 'shift' operations that are well understood. We did lose a couple cases from tbm_patterns.ll, but I think there are ways to recover that. I've also put back the manual load folding code in matchBEXTRFromAnd that I removed a few months ago in r324939. This gives us some more freedom to make decisions based on the ability to fold a load. I haven't done anything with that yet. Differential Revision: https://reviews.llvm.org/D53126 llvm-svn: 344270	2018-10-11 18:06:07 +00:00
Alex Bradbury	686ef92141	[RISCV] Re-generate test/CodeGen/RISCV/vararg.ll after r344142 The improved load-store forwarding committed in r344142 broke this test. llvm-svn: 344238	2018-10-11 11:11:58 +00:00
Roman Lebedev	4225f4adff	[X86][BMI1]: X86DAGToDAGISel: select BEXTR from x & ~(-1 << nbits) pattern Summary: As discussed in D48491, we can't really do this in the TableGen, since we need to produce two instructions. This only implements one single pattern. The other 3 patterns will be in follow-ups. I'm not sure yet if we want to also fuse shift into here (i.e `(x >> start) & ...`) Reviewers: RKSimon, craig.topper, spatel Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D52304 llvm-svn: 344224	2018-10-11 07:51:13 +00:00
Fangrui Song	f953ea5fb6	[MC][ELF] Fix section_mergeable_size.ll Some targets use %progbits instead of @progbits. Updating that check with a {{[@%]}}progbits regex to make those bots happy. llvm-svn: 344206	2018-10-11 00:08:59 +00:00
Thomas Lively	2ebacb107b	[WebAssembly] Saturating float to int intrinsics Summary: Although the saturating float to int instructions are already emitted from normal IR, the fpto{s,u}i instructions produce poison values if the argument cannot fit in the result type. These intrinsics are therefore necessary to get guaranteed defined saturating behavior. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D53004 llvm-svn: 344204	2018-10-11 00:01:25 +00:00
Nick Desaulniers	335315697a	[MC][ELF] compute entity size for explicit sections Summary: Global variables might declare themselves to be in explicit sections. Calculate the entity size always to prevent assembler warnings "entity size for SHF_MERGE not specified" when sections are to be marked merge-able. Fixes PR31828. Reviewers: rnk, echristo Reviewed By: rnk Subscribers: llvm-commits, pirama, srhines Differential Revision: https://reviews.llvm.org/D53056 llvm-svn: 344197	2018-10-10 22:52:32 +00:00
Roman Lebedev	62cd430602	[NFC][X86][AArch64] extract-bits.ll: add tests with constants+storing results. As noted in https://reviews.llvm.org/D53080#inline-467678, this may get pessimized by that diff. llvm-svn: 344182	2018-10-10 20:50:52 +00:00
Roman Lebedev	33d84c6dac	[X86] Move X86DAGToDAGISel::matchBEXTRFromAnd() into X86ISelLowering Summary: As discussed in [[ https://bugs.llvm.org/show_bug.cgi?id=38938 \| PR38938 ]], we fail to emit `BEXTR` if the mask is shifted. We can't deal with that in `X86DAGToDAGISel` `before the address mode for the inc is selected`, and we can't really do it in the normal DAGCombine, because we don't have generic `ISD::BitFieldExtract` node, and if we simply turn the shifted mask into a normal mask + shift-left, it will be folded back. So it would seem X86ISelLowering is the place to handle this. This patch only moves the matchBEXTRFromAnd() from X86DAGToDAGISel to X86ISelLowering. It does not add support for the 'shifted mask' pattern. Reviewers: RKSimon, craig.topper, spatel Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52426 llvm-svn: 344179	2018-10-10 20:40:12 +00:00
Volkan Keles	da5578c5d0	[GlobalISel] Fix the artifact combiner to fold G_IMPLICIT_DEF properly Summary: GlobalISel generates incorrect code because the legalizer artifact combiner assumes `G_[SZ]EXT (G_IMPLICIT_DEF)` is equivalent to `G_IMPLICIT_DEF `. Replace `G_[SZ]EXT (G_IMPLICIT_DEF)` with 0 because the top bits will be 0 for G_ZEXT and 0/1 for the G_SEXT. Reviewers: aditya_nandakumar, dsanders, aemerson, javed.absar Reviewed By: aditya_nandakumar Subscribers: rovka, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D52996 llvm-svn: 344163	2018-10-10 18:01:48 +00:00
Nirav Dave	07acc992dc	[DAGCombine] Improve Load-Store Forwarding Summary: Extend analysis forwarding loads from preceding stores to work with extended loads and truncated stores to the same address so long as the load is fully subsumed by the store. Hexagon's swp-epilog-phis.ll and swp-memrefs-epilog1.ll tests are deleted as they no longer seem to be relevant. Reviewers: RKSimon, rnk, kparzysz, javed.absar Subscribers: sdardis, nemanjai, hiraditya, atanasyan, llvm-commits Differential Revision: https://reviews.llvm.org/D49200 llvm-svn: 344142	2018-10-10 14:15:52 +00:00
Sanjay Patel	6cca8af227	[x86] allow single source horizontal op matching (PR39195) This is intended to restore horizontal codegen to what it looked like before IR demanded elements improved in: rL343727 As noted in PR39195: https://bugs.llvm.org/show_bug.cgi?id=39195 ...horizontal ops can be worse for performance than a shuffle+regular binop, so I've added a TODO. Ideally, we'd solve that in a machine instruction pass, but a quicker solution will be adding a 'HasFastHorizontalOp' feature bit to deal with it here in the DAG. Differential Revision: https://reviews.llvm.org/D52997 llvm-svn: 344141	2018-10-10 13:39:59 +00:00
Carlos Alberto Enciso	c0952c8a08	Revert "[DebugInfo][Dexter] Unreachable line stepped onto after SimplifyCFG." This reverts commit r344120. It was causing buildbot failures. llvm-svn: 344135	2018-10-10 12:09:34 +00:00
Carlos Alberto Enciso	e7a347e5f8	[DebugInfo][Dexter] Unreachable line stepped onto after SimplifyCFG. When SimplifyCFG changes the PHI node into a select instruction, the debug line records become ambiguous. It causes the debugger to display unreachable source lines. Differential Revision: https://reviews.llvm.org/D52887 llvm-svn: 344120	2018-10-10 08:29:55 +00:00
Craig Topper	02c62aa58a	[X86] Remove FeatureRTM from Skylake processor list Summary: There are a LOT of Skylakes and later without TSX-NI. Examples: - SKL: https://ark.intel.com/products/136863/Intel-Core-i3-8121U-Processor-4M-Cache-up-to-3-20-GHz- - KBL: https://ark.intel.com/products/97540/Intel-Core-i7-7560U-Processor-4M-Cache-up-to-3-80-GHz- - KBL-R: https://ark.intel.com/products/149091/Intel-Core-i7-8565U-Processor-8M-Cache-up-to-4-60-GHz- - CNL: https://ark.intel.com/products/136863/Intel-Core-i3-8121U-Processor-4M-Cache-up-to-3_20-GHz This feature seems to be present only on high-end desktop and server chips (I can't find any SKX without). This commit leaves it disabled for all processors, but can be re-enabled for specific builds with -mrtm. Patch by Thiago Macieira Reviewers: erichkeane, craig.topper Reviewed By: craig.topper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D53041 llvm-svn: 344116	2018-10-10 07:43:35 +00:00
Nemanja Ivanovic	ec527dacca	[PowerPC][NFC] Add a test case for extract and store patterns An upcoming patch will change the codegen for these patterns. This test case is added now so that the patch can show the differences in codegen. llvm-svn: 344112	2018-10-10 04:18:35 +00:00
Dylan McKay	30ef1d60f9	[AVR] Fix the 'call.ll' CodeGen test Commit r343851 changed the format of the generated instructions. An unnecessary load has been removed. Previously, a value would be moved from r24 into a temporary register just to be copied into r30 before the indirect call. Now, codegen immediately loads r24 into r30, saving a MOVW instruction. llvm-svn: 344111	2018-10-10 03:21:42 +00:00
QingShan Zhang	bc1586352e	[PowerPC] Fix the assert of ISD::SIGN_EXTEND_INREG when type is v2i16 and v2i8 The ISD::SIGN_EXTEND_INREG operation on v2i16 and v2i8 types will cause an assert because they are registered as custom operations. The type legalization phase therefore enters the custom hook, which does not handle the ISD::SIGN_EXTEND_INREG operation and falls through into an unreachable assert. Patch By: wuzish (Zixuan Wu) Differential Revision: https://reviews.llvm.org/D52449 llvm-svn: 344109	2018-10-10 02:33:48 +00:00
Thomas Lively	108e98ec32	[WebAssembly] Fix fneg lowering Summary: Subtraction from zero and floating point negation do not have the same semantics, so fix lowering. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D52948 llvm-svn: 344107	2018-10-10 01:09:09 +00:00
Thomas Lively	409f5840a7	[WebAssembly] Handle V128 register class in explicit locals pass Summary: Also add tests to catch crashes in passes that are not normally run in tests. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D52959 llvm-svn: 344094	2018-10-09 23:33:16 +00:00
Nemanja Ivanovic	72d4866e57	[DAGCombiner] Expand combining of FP logical ops to sign-setting FP ops We already do the following combines: (bitcast int (and (bitcast fp X to int), 0x7fff...) to fp) -> fabs X (bitcast int (xor (bitcast fp X to int), 0x8000...) to fp) -> fneg X When the target has "bit preserving fp logic". This patch just extends it to also combine: (bitcast int (or (bitcast fp X to int), 0x8000...) to fp) -> fneg (fabs X) As some targets have fnabs and even those that don't can efficiently lower both the fabs and the fneg. Differential revision: https://reviews.llvm.org/D44548 llvm-svn: 344093	2018-10-09 23:20:11 +00:00
Nemanja Ivanovic	c62dfe512e	[PowerPC][NFC] Commit nabs test case in preparation for committing D44548 This just adds the test case so that the different code gen is clearly visible when the DAG Combine lands. llvm-svn: 344091	2018-10-09 23:02:53 +00:00
Rong Xu	3d2efdfdea	Recommit r343993: [X86] condition branches folding for three-way conditional codes Fix the memory issue exposed by sanitizer. llvm-svn: 344085	2018-10-09 22:03:40 +00:00

26164 Commits