Commit Graph

48863 Commits

Author SHA1 Message Date
Thomas Lively 995ad61f23 [WebAssembly] v128.not
Implementation and tests.

llvm-svn: 340857
2018-08-28 18:31:15 +00:00
Matt Arsenault 35b1902bce AMDGPU: Move canShrink into TII
llvm-svn: 340855
2018-08-28 18:22:34 +00:00
Heejin Ahn 56e79dd048 [WebAssembly] Use getCalleeOpNo utility function (NFC)
Reviewers: tlively

Subscribers: dschuff, sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D51366

llvm-svn: 340848
2018-08-28 17:49:39 +00:00
Craig Topper c73095e264 [X86] Mark the FUCOMI instructions as requiring CMOV to be enabled. NFCI
These instructions were added on the PentiumPro along with CMOV.

This was already comprehended by the lowering process which should emit an alternate sequence using FCOM and FNSTW. This just makes it an explicit error if that doesn't work for some reason.

llvm-svn: 340844
2018-08-28 17:17:13 +00:00
Ryan Taylor 1f334d0062 [AMDGPU] Add support for a16 modifiear for gfx9
Summary:
Adding support for a16 for gfx9. A16 bit replaces r128 bit for gfx9.

Change-Id: Ie8b881e4e6d2f023fb5e0150420893513e5f4841

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D50575

llvm-svn: 340831
2018-08-28 15:07:30 +00:00
Simon Pilgrim af98587095 [X86][SSE] Improve variable scalar shift of vXi8 vectors (PR34694)
This patch creates the shift mask and actual shift using the vXi16 vector shift ops.

Differential Revision: https://reviews.llvm.org/D51263

llvm-svn: 340813
2018-08-28 10:37:29 +00:00
Simon Pilgrim f119e27d80 [X86][SSE] Avoid vector extraction/insertion for non-constant uniform shifts
As discussed on D51263, we're better off using byte shifts to clear the upper bits on pre-SSE41 hardware.

llvm-svn: 340810
2018-08-28 10:14:09 +00:00
Craig Topper c1436db753 [X86] Fix some comments to refer to KORTEST not KTEST. NFC
KTEST is a different instruction. All of this code uses KORTEST.

llvm-svn: 340799
2018-08-28 06:39:35 +00:00
Kit Barton 7c80f98b69 [PPC] Remove Darwin support from POWER backend.
This patch issues an error message if Darwin ABI is attempted with the PPC
backend. It also cleans up existing test cases, either converting the test to
use an alternative triple or removing the test if the coverage is no longer
needed.

Updated Tests
-------------
The majority of test cases were updated to use a different triple that does not
include the Darwin ABI. Many tests were also updated to use FileCheck, in place
of grep.

Deleted Tests
-------------
llvm/test/tools/dsymutil/PowerPC/sibling.test was originally added to test
specific functionality of dsymutil using an object file created with an old
version of llvm-gcc for a Powerbook G4. After a discussion with @JDevlieghere he
suggested removing the test.

llvm/test/CodeGen/PowerPC/combine_loads_from_build_pair.ll was converted from a
PPC test to a SystemZ test, as the behavior is also reproducible there.

All other tests that were deleted were specific to the darwin/ppc ABI and no
longer necessary.

Phabricator Review: https://reviews.llvm.org/D50988

llvm-svn: 340795
2018-08-28 01:18:29 +00:00
Thomas Lively 211874d2f3 [WebAssembly] TableGen backend for stackifying instructions
Summary:
The new stackification backend generates the giant switch statement
used to translate instructions to their stackified forms. I did this
because it was more interesting than adding all the different vector
versions of the various SIMD instructions to the switch statment
manually.

Reviewers: aardappel, aheejin, dschuff

Subscribers: mgorny, sbc100, jgravelle-google, sunfish, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D51318

llvm-svn: 340781
2018-08-27 22:02:09 +00:00
Sean Fertile a2f095f1a3 [PowerPC][MC] Support expressions in getMemRIX16Encoding.
Loosens an assert in getMemRIX16Encoding that restricts DQ-form instructions to
using an immediate, so that we can assemble instructions like lxv/stxv where the
offset is an expression.

Differential Revision: https://reviews.llvm.org/D51122

llvm-svn: 340761
2018-08-27 17:37:43 +00:00
Benjamin Kramer 759e7d9819 [NVPTX] Implement isLegalToVectorizeLoadChain
This lets LSV nicely split up underaligned chains.

Differential Revision: https://reviews.llvm.org/D51306

llvm-svn: 340760
2018-08-27 17:29:43 +00:00
Craig Topper 4be11c0585 [X86] When lowering v32i8 MULHS/MULHU, shuffle after the PACKUS rather than before.
We're using a 256-bit PACKUS to do the truncation, but that instruction operates on 128-bit lanes. So previously we shuffled first to rearrange the lanes. But that requires 2 shuffles. Instead we can shuffle after the PACKUS using a single VPERMQ. This matches what our normal LowerTRUNCATE code does when it uses PACKUS.

Differential Revision: https://reviews.llvm.org/D51284

llvm-svn: 340757
2018-08-27 17:20:41 +00:00
Craig Topper fff90377fd [X86] Add support for matching paddus patterns where one of the vectors is a constant.
InstCombine mucks these up a bit. So we need to do some additional pattern matching to fix it. There are a still a few special cases not handled, but this covers the general case.

Differential Revision: https://reviews.llvm.org/D50952

llvm-svn: 340756
2018-08-27 17:20:38 +00:00
Wouter van Oortmerssen 8a9cb242fb [WebAssembly] Added default stack-only instruction mode for MC.
Summary:
Made it convert from register to stack based instructions, and removed the registers.
Fixes to related code that was expecting register based instructions.
Added the correct testing flag to all tests, depending on what the
format they were expecting so far.
Translated one test to stack format as example: reg-stackify-stack.ll

tested:
llvm-lit -v `find test -name WebAssembly`
unittests/MC/*

Reviewers: dschuff, sunfish

Subscribers: sbc100, jgravelle-google, eraman, aheejin, llvm-commits, jfb

Differential Revision: https://reviews.llvm.org/D51241

llvm-svn: 340750
2018-08-27 15:45:51 +00:00
Nico Weber e75fd1b184 fix comment typo
llvm-svn: 340744
2018-08-27 14:25:22 +00:00
Nemanja Ivanovic 5d06f17b8a [PowerPC] Revert commit r339779
This commit has caused failures in some internal benchmarks. Temporarily
reverting this patch until the issue can be diagnosed and fixed.

llvm-svn: 340740
2018-08-27 13:20:42 +00:00
Daniel Cederman db474c12e9 [Sparc] Avoid writing outside array in applyFixup
Summary: If an object file ends with a relocation that is smaller
than 4 bytes we will write outside the Data array and trigger an
"Invalid index" assertion.

Reviewers: jyknight, venkatra

Reviewed By: jyknight

Subscribers: fedor.sergeev, jrtc27, llvm-commits

Differential Revision: https://reviews.llvm.org/D50971

llvm-svn: 340736
2018-08-27 11:43:59 +00:00
Nemanja Ivanovic f2588a28a8 [PowerPC] Recommit r340016 after fixing the reported issue
The internal benchmark failure reported by Google was due to a missing
check for the result type for the sign-extend and shift DAG. This commit
adds the check and re-commits the patch.

llvm-svn: 340734
2018-08-27 11:20:27 +00:00
Daniel Cederman 2739596063 [Sparc] Add support for the cycle counter available in GR740
Summary: The GR740 provides an up cycle counter in the registers ASR22
and ASR23. As these registers can not be read together atomically we only
use the value of ASR23 for llvm.readcyclecounter(). The ASR23 register
holds the 32 LSBs of the up-counter.

Reviewers: jyknight, venkatra

Reviewed By: jyknight

Subscribers: jfb, fedor.sergeev, jrtc27, llvm-commits

Differential Revision: https://reviews.llvm.org/D48638

llvm-svn: 340733
2018-08-27 11:11:47 +00:00
Daniel Cederman 92dadc0bca [Sparc] Custom bitcast between f64 and v2i32
Summary:
Currently bitcasting constants from f64 to v2i32 is done by storing the
value to the stack and then loading it again. This is not necessary, but
seems to happen because v2i32 is a valid type for Sparc V8. If it had not
been legal, we would have gotten help from the type legalizer.

This patch tries to do the same work as the legalizer would have done by
bitcasting the floating point constant and splitting the value up into a
vector of two i32 values.

Reviewers: venkatra, jyknight

Reviewed By: jyknight

Subscribers: glaubitz, fedor.sergeev, jrtc27, llvm-commits

Differential Revision: https://reviews.llvm.org/D49219

llvm-svn: 340723
2018-08-27 07:14:53 +00:00
Roger Ferrer Ibanez fe28217048 [RISCV] atomic_store_nn have a different layout to regular store
We cannot directy reuse the patterns of StPat because for some reason the store
DAG node and the atomic_store_nn DAG nodes put the ptr and the value in
different positions. Currently we attempt to store the address to an address
formed by the value.

Differential Revision: https://reviews.llvm.org/D51217

llvm-svn: 340722
2018-08-27 07:08:18 +00:00
Craig Topper 73ed2a2a6b [X86] Cleanup the LowerMULH code by hoisting some commonalities between the vXi32 and vXi8 handling. NFCI
vXi32 support was recently moved from LowerMUL_LOHI to LowerMULH.

This commit shares the getOperand calls, switches both to use common IsSigned flag, and hoists the NumElems/NumElts variable.

llvm-svn: 340720
2018-08-27 06:35:02 +00:00
Craig Topper a72012c206 [X86] Correct the cost of (v4i32 (fptoui (v4f64))) under AVX512F.
Summary: This was inheriting the cost from the AVX table, but should be legal under AVX512.

Reviewers: RKSimon

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D51267

llvm-svn: 340708
2018-08-26 18:47:44 +00:00
Craig Topper 128915f4ae [X86] Add FeatureCMOV explicitly to all CPUs that support it. Remove FeatureCMOV implication from Feature64Bit and FeatureSSE1
Summary:
Previously most CPUs inherited cmov support through Feature64Bit(or FeatureCMPXCHG16HB implying Feature64Bit) or FeatureSSE1.

This has the surprising side effect that -mattr=-cmov causes an assert to fire in 64-bit mode because it clears the Feature64Bit. Or in 32-bit mode, -mattr=-cmov disables any sse/avx features which seems surprising.

This patch removes the implication and instead updates hasCMOV in X86Subtarget to check SSE1 or is64Bit in addition to the regular cmov flag. This should keep most things working the way they did before. I don't believe there is a way to specific "-cmov" directly from clang so this should only effect our lower level tools.

This does stop -mattr=cx16(cmpxchg16b) from implying cmov is enabled via the 64bit flag as you can see from one of the changed tests. But that was a 32-bit test so I don't know why it enabled cx16 anyway.

For the other test I had to add -sse to override the new sse check in hasCMOV.

Reviewers: RKSimon, DavidKreitzer, spatel

Reviewed By: RKSimon

Subscribers: llvm-commits, jfb

Differential Revision: https://reviews.llvm.org/D51228

llvm-svn: 340707
2018-08-26 18:29:33 +00:00
Craig Topper b68a78b9ac [X86] Add FeatureCMOV to athlon and athlon-tbird cpus.
Summary: This matches gcc and one cpuid dump I found online. Given that these are considered 7th generation x86 CPU it seems likely they support cmov since cmov was added by Intel in their 6th generation.

Reviewers: RKSimon, spatel

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D51264

llvm-svn: 340706
2018-08-26 18:29:27 +00:00
Sanjay Patel 113cac3b15 [SelectionDAG][x86] turn insertelement into undef with variable index into splat
I noticed this along with the patterns in D51125, but when the index is variable, 
we don't convert insertelement into a build_vector.

For x86, that means these get expanded at legalization time into the loading/spilling 
code that we see in the tests. I think it's always better to avoid going to memory on 
these, and we get the optimal 'broadcast' if it's available.

I suspect other targets may want to look at enabling the hook. AArch64 and AMDGPU have 
regression tests that would be affected (although I did not check what would happen in 
those cases). In the most basic cases shown here, AArch64 would probably do much 
better with a splat.

Differential Revision: https://reviews.llvm.org/D51186

llvm-svn: 340705
2018-08-26 18:20:41 +00:00
Petar Jovanovic 1fa5051bad [MIPS GlobalISel] Legalize i8 and i16 add
Legalize G_ADD for types smaller than i32.
LegalizationArtifactCombiner replaces extend instructions with appropriate
bitwise instructions.

Patch by Petar Avramovic.

Differential Revision: https://reviews.llvm.org/D51213

llvm-svn: 340697
2018-08-26 07:25:33 +00:00
Craig Topper 4240ecb909 [X86] Fix typo in comment, expect->except. NFC
llvm-svn: 340695
2018-08-26 03:43:23 +00:00
Craig Topper ebec2793d1 [X86] Replace support for vXi32 SMUL_LOHI/UMUL_LOHI with MULHS/MULHU support instead.
Summary:
The only time vector SMUL_LOHI/UMUL_LOHI nodes are created is during division/remainder lowering. If its created before op legalization, generic DAGCombine immediately turns that SMUL_LOHI/UMUL_LOHI into a MULHS/MULHU since only the upper half is used. That node will stick around through vector op legalization and will be turned back into UMUL_LOHI/SMUL_LOHI during op legalization. It will then be custom lowered by the X86 backend. Due to this two step lowering the vector shuffles created by the custom lowering get legalized after their inputs rather than before. This prevents the shuffles from being combined with any build_vector of constants.

This patch uses changes vXi32 to use MULHS/MULHU instead. This is what the later DAG combine did anyway. But by skipping the change back to UMUL_LOHI/SMUL_LOHI we lower it before any constant BUILD_VECTORS. This allows the vector_shuffle creation to constant fold with the build_vectors. This accounts for the test changes here.

Reviewers: RKSimon, spatel

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D51254

llvm-svn: 340690
2018-08-25 18:01:24 +00:00
Craig Topper a11a3b3818 [SelectionDAG][X86] Reorder the operands the MaskedStoreSDNode to put the value first.
Summary:
Previously the value being stored is the last operand in SDNode. This causes the type legalizer to visit the mask operand before the value operand. The type legalizer was more complicated because of this since we want the type of the value to drive the decisions.

This patch moves the value to be the first operand so we visit it first during type legalization. It also simplifies the type legalization code accordingly.

X86 is currently the only in tree target that uses this SDNode. Not sure if there are any users out of tree.

Reviewers: RKSimon, delena, hfinkel, eli.friedman

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50402

llvm-svn: 340689
2018-08-25 17:48:17 +00:00
Craig Topper bce8680605 [X86] Make sure type is a vector before calling VT.getVectorNumElements() in combineLoopMAddPattern
Fixes PR38700.

llvm-svn: 340688
2018-08-25 17:23:43 +00:00
Sanjay Patel 8a84c747d2 [x86] try harder to use broadcast to load a scalar into vector reg
This is a preliminary step for a preliminary step for D50992. 
I noticed that x86 often misses chances to load a scalar directly 
into a vector register.

So this patch is just allowing more of those cases to match a 
broadcast op in lowerBuildVectorAsBroadcast(). The old code comment 
said it doesn't make sense to use a broadcast when we're loading a 
single element and everything else is undef, but I think that's the 
best case in the improved tests in insert-loaded-scalar.ll. We avoid 
scalar-to-vector-register move and/or less efficient shuffling.

Note that there are some existing types that were already producing 
a broadcast, but that happens semi-accidentally. Ie, it's not 
happening as part of lowerBuildVectorAsBroadcast(). The build vector 
gets expanded into load + shuffle, and then shuffle lowering produces 
the broadcast.

Description of the other test diffs:
1. avx-basic.ll - replacing load+shufle is a win.
2. sse3-avx-addsub-2.ll - vmovddup vs. vbroadcastss is neutral
3. sse41.ll - don't care - we convert that intrinsic to generic IR now, so this test is deprecated
4. vector-shuffle-128-v8.ll / vector-shuffle-256-v16.ll - pshufb alternatives with an extra instruction are not obviously bad

Differential Revision: https://reviews.llvm.org/D51125

llvm-svn: 340685
2018-08-25 14:56:05 +00:00
Tim Renouf 904343f879 [AMDGPU] Add support for multi-dword s.buffer.load intrinsic
Summary:
Patch by Marek Olsak and David Stuttard, both of AMD.

This adds a new amdgcn intrinsic supporting s.buffer.load, in particular
multiple dword variants. These are convenient to use from some front-end
implementations.

Also modified the existing llvm.SI.load.const intrinsic to common up the
underlying implementation.

This modification also requires that we can lower to non-uniform loads correctly
by splitting larger dword variants into sizes supported by the non-uniform
versions of the load.

V2: Addressed minor review comments.
V3: i1 glc is now i32 cachepolicy for consistency with buffer and
    tbuffer intrinsics, plus fixed formatting issue.
V4: Added glc test.

Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D51098

Change-Id: I83a6e00681158bb243591a94a51c7baa445f169b
llvm-svn: 340684
2018-08-25 14:53:17 +00:00
Ana Pazos ecc65eddec [RISCV] Fixed Assertion`Kind == Immediate && "Invalid type access!"' failed.
Summary:
Missing check for isImm() in some Immediate classes.

This bug was uncovered by a LLVM MC Assembler Protocol Buffer Fuzzer
for the RISC-V assembly language.

Reviewers: hiraditya, asb

Reviewed By: hiraditya, asb

Subscribers: llvm-commits, hiraditya, kito-cheng, shiva0217, rkruppe, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, zzheng, edward-jones, mgrang, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei

Differential Revision: https://reviews.llvm.org/D50797

llvm-svn: 340674
2018-08-24 23:47:49 +00:00
Ana Pazos 61b28ede75 [RISCV] Fix std::advance slowness
Summary:
It seems std::advance template is treating "-MFI.getCalleeSavedInfo().size()"
as a large unsigned value", causing slowness.

Thanks to Henrik Gustafsson for reporting the issue.

Reviewers: asb

Reviewed By: asb

Subscribers: llvm-commits, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, mgrang, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, asb

Differential Revision: https://reviews.llvm.org/D51148

llvm-svn: 340669
2018-08-24 23:13:59 +00:00
Stefan Pintilie f384606799 [PowerPC] Emit xscpsgndp instead of xxlor when copying floating point scalar registers for P9
This patch will address using the xscpsgndp instruction to copy floating point
scalar registers instead of the xxlor (specifically XXLORf) instruction that is
currently used. Additionally, this patch of utilizing xscpsgndp will apply to
P9, while pre-P9 will still use xxlor.

Patch by amyk

Differential Revision: https://reviews.llvm.org/D50004

llvm-svn: 340643
2018-08-24 20:00:24 +00:00
Eli Friedman 071203bbf2 [AArch64] Reject inline asm with FP registers when FP is disabled.
Otherwise, we would crash trying to deal with an illegal input.

Differential Revision: https://reviews.llvm.org/D51202

llvm-svn: 340637
2018-08-24 19:12:13 +00:00
Craig Topper 4058e29e7d [X86] Teach combineLoopMAddPattern to handle cases where there is no loop and the add has two multiply inputs
Differential Revision: https://reviews.llvm.org/D50868

llvm-svn: 340631
2018-08-24 18:05:04 +00:00
Joel Galenson c6f6c17c9b Add missing override keyword (NFC)
llvm-svn: 340615
2018-08-24 16:15:44 +00:00
Joel Galenson d36fb48a27 Find PLT entries for x86, x86_64, and AArch64.
This adds a new method to ELFObjectFileBase that returns the symbols and addresses of PLT entries.

This design was suggested by pcc and eugenis in https://reviews.llvm.org/D49383.

Differential Revision: https://reviews.llvm.org/D50203

llvm-svn: 340610
2018-08-24 15:21:56 +00:00
Petar Jovanovic 65d463bdd7 [MIPS GlobalISel] Lower i8 and i16 arguments
Lower integer arguments smaller than i32.
Support both register and stack arguments.
Define setLocInfo function for setting LocInfo field in ArgLocs vector.

Patch by Petar Avramovic.

Differential Revision: https://reviews.llvm.org/D51031

llvm-svn: 340572
2018-08-23 20:41:09 +00:00
Thomas Lively da26b84bd0 [WebAssembly] Prioritize splats over v128.consts
Summary:
Splats are fewer bytes than v128.consts, so use them when either could
apply.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D51179

llvm-svn: 340569
2018-08-23 19:23:13 +00:00
Sanjay Patel 40aa86751a [x86] add debug option for and-immediate shrinking
The commit that added this functionality:
rL322957

may be causing/exposing a miscompile in PR38648:
https://bugs.llvm.org/show_bug.cgi?id=38648

so allow enabling/disabling to make debugging easier.

llvm-svn: 340540
2018-08-23 15:58:07 +00:00
Chandler Carruth ae0cafece8 [x86/retpoline] Split the LLVM concept of retpolines into separate
subtarget features for indirect calls and indirect branches.

This is in preparation for enabling *only* the call retpolines when
using speculative load hardening.

I've continued to use subtarget features for now as they continue to
seem the best fit given the lack of other retpoline like constructs so
far.

The LLVM side is pretty simple. I'd like to eventually get rid of the
old feature, but not sure what backwards compatibility issues that will
cause.

This does remove the "implies" from requesting an external thunk. This
always seemed somewhat questionable and is now clearly not desirable --
you specify a thunk the same way no matter which set of things are
getting retpolines.

I really want to keep this nicely isolated from end users and just an
LLVM implementation detail, so I've moved the `-mretpoline` flag in
Clang to no longer rely on a specific subtarget feature by that name and
instead to be directly handled. In some ways this is simpler, but in
order to preserve existing behavior I've had to add some fallback code
so that users who relied on merely passing -mretpoline-external-thunk
continue to get the same behavior. We should eventually remove this
I suspect (we have never tested that it works!) but I've not done that
in this patch.

Differential Revision: https://reviews.llvm.org/D51150

llvm-svn: 340515
2018-08-23 06:06:38 +00:00
Thomas Lively c17425708b [WebAssembly] SIMD Bitwise binary arithmetic
Summary: AND, OR, and XOR. This CL depends on D51113.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D51136

llvm-svn: 340505
2018-08-23 00:48:37 +00:00
Thomas Lively 123c3bb29e [WebAssembly][NFC] Reorganize SIMD instructions
Summary:
Reorganize WebAssemblyInstrSIMD.td to put all of the instruction
definitions together, making it easier to see which instructions have
been implemented already. Depends on D51143.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D51113

llvm-svn: 340504
2018-08-23 00:43:47 +00:00
Thomas Lively 914f0f20a4 [WebAssembly][NFC] Move specific instruction formats to specific files
Summary:
WebAssemblyInstrFormats.td retains only multiclasses that are used in
multiple other tablegen files.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D51143

llvm-svn: 340503
2018-08-23 00:36:43 +00:00
Craig Topper cf9df99d79 [X86] Teach combineLoopSADPattern to handle cases where there is no loop and the add has two absolute difference inputs
Previously we asumed a vector reduction add is part of a loop and one of the input is a phi. But the code in SelectionDAGBuilder that sets vector reduction flag handles more cases than that. It just requires that the use chain ends in a horizontal reduction. And there are no other uses. This means it can handle unrolled reduction loops.

If the initial value of the reduction was 0, an unrolled loop would begin with a vector reduction add that has two sad inputs. Previously we would only transform one side of the add, but for this case we need to transform both sides.

I've created a lambda to reuse some of the code for both sides. And fixed the variables names to remove reference to "phi".

Differential Revision: https://reviews.llvm.org/D50817

llvm-svn: 340478
2018-08-22 23:19:01 +00:00
Thomas Lively 2ee686da27 [WebAssembly] Arbitrary BUILD_VECTOR and remove i64x2.mul
Summary:
This CL adds support for arbitrary BUILD_VECTORS, i.e. not splats and
not consts. This is the last feature needed to properly lower v2i64
multiplies without a i64x2.mul instruction (which is not in the spec),
so i64x2.mul is removed as well.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D51082

Remove unnecessary condition and fix whitespace

llvm-svn: 340472
2018-08-22 23:06:27 +00:00