Most often as not this is what it started out as, the extraction is zero-cost on AVX and the PMOVZX/PMOVSX folding logic is based around 128-bit loads.
llvm-svn: 270858
LegalizeIntegerTypes does not have a way to expand multiplications for large
integer types (i.e. larger than twice the native bit width). There's no
standard runtime call to use in that case, and so we'd just assert.
Unfortunately, as it turns out, it is possible to hit this case from
standard-ish C code in rare cases. A particular case a user ran into yesterday
involved an __int128 induction variable and a loop with a quadratic (not
linear) recurrence which triggered some backend logic using SCEVExpander. In
this case, the BinomialCoefficient code in SCEV generates some i129 variables,
which get widened to i256. At a high level, this is not actually good (i.e. the
underlying optimization, PPCLoopPreIncPrep, should not be transforming the loop
in question for performance reasons), but regardless, the backend shouldn't
crash because of cost-modeling issues in the optimizer.
This is a straightforward implementation of the multiplication expansion, based
on the algorithm in Hacker's Delight. I validated it against the code for the
mul256b function from http://locklessinc.com/articles/256bit_arithmetic/ using
random inputs. There should be no functional change for previously-working code
(the new expansion code only replaces an assert).
Fixes PR19797.
llvm-svn: 270720
As noted in the review, there are still problems, so this doesn't the bug completely.
Differential Revision: http://reviews.llvm.org/D20529
llvm-svn: 270718
Followup to D20528 clang patch, this removes the (V)CVTDQ2PD(Y) and (V)CVTPS2PD(Y) llvm intrinsics and auto-upgrades to sitofp/fpext instead.
Differential Revision: http://reviews.llvm.org/D20568
llvm-svn: 270678
Replace bidirectional flow analysis to compute liveness with forward
analysis pass. Treat lifetimes as starting when there is a first
reference to the stack slot, as opposed to starting at the point of the
lifetime.start intrinsic, so as to increase the number of stack
variables we can overlap.
Reviewers: gbiv, qcolumbet, wmi
Differential Revision: http://reviews.llvm.org/D18827
Bug: 25776
llvm-svn: 270559
This isn't the complete fix, but it handles the trivial examples of duplicate vzero* ops in PR27823:
https://llvm.org/bugs/show_bug.cgi?id=27823
...and amusingly, the bogus cases already exist as regression tests, so let's take this baby step.
We'll need to do more in the general case where there's legitimate AVX usage in the function + there's
already a vzero in the code.
Differential Revision: http://reviews.llvm.org/D20477
llvm-svn: 270378
This patch is a first step towards a more extendible method of matching combined target shuffle masks.
Initially this just pulls out the existing basic mask matches and adds support for some 256/512 bit equivalents. Future patterns will require a number of features to be added but I wanted to keep this patch simple.
I hope we can avoid duplication between shuffle lowering and combining and share more complex pattern match functions in future commits.
Differential Revision: http://reviews.llvm.org/D19198
llvm-svn: 270230