This removes the after the fact FMF handling from D46854 in favor of passing fast math flags to getNode. This should be a superset of D87130.
This required adding a SDNodeFlags to SelectionDAG::getSetCC.
Now we manage to contant fold some stuff undefs during the
initial getNode that we don't do in later DAG combines.
Differential Revision: https://reviews.llvm.org/D87200
A lot of tests under PowerPC are using fast flag, while fast is just
alias of 7 fast-math flags. This change makes test points clearer.
mc-instrlat.ll and sms-iterator.ll keeps unchanged since they are not
testing fast-math behavior. (one for machine combiner crash, one for
machine pipeliner bug)
Reviewed By: steven.zhang, spatel
Differential Revision: https://reviews.llvm.org/D78989
xsnegdp, xsabsdp and xsnabsdp can be used to operate on f32 operand.
This patch adds the missing patterns since we prefer VSX instructions
when available.
Reviewed By: steven.zhang
Differential Revision: https://reviews.llvm.org/D75344
Currently, DAG combiner uses (fmul (rsqrt x) x) to estimate square
root of x. However, this method would return NaN if x is +Inf, which
is incorrect.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D76853
These tests fail when the default is switched to assume IEEE denormal
handling. I'm not sure if PPC really has a way to control the denormal
input handling.
Summary: Honoring no signed zeroes is also available as a user control through clang separately regardless of fastmath or UnsafeFPMath context, DAG guards should reflect this context.
Reviewers: spatel, arsenm, hfinkel, wristow, craig.topper
Reviewed By: spatel
Subscribers: rampitec, foad, nhaehnle, wuzish, nemanjai, jvesely, wdng, javed.absar, MaskRay, jsji
Differential Revision: https://reviews.llvm.org/D65170
llvm-svn: 367486
Summary: This change facilitates propagating fmf which was placed on setcc from fcmp through folds with selects so that back ends can model this path for arithmetic folds on selects in SDAG.
Reviewers: qcolombet, spatel
Reviewed By: qcolombet
Subscribers: nemanjai, jsji
Differential Revision: https://reviews.llvm.org/D62552
llvm-svn: 362439
The single-constant algorithm produces infinities on a lot of denormal values.
The precision of the two-constant algorithm is actually sufficient across the
range of denormals. We will switch to that algorithm for now to avoid the
infinities on denormals. In the future, we will re-evaluate the algorithm to
find the optimal one for PowerPC.
Differential revision: https://reviews.llvm.org/D60037
llvm-svn: 360144
If the arch is P8, we will select XFLOAD to load the floating point, and then, expand it to vsx and non-vsx X-form instruction post RA. This patch is trying to convert the X-form to D-form if it meets the requirement that one operand of the x-form inst is the special Zero register, and another operand fed by add inst. i.e.
y = add imm, reg
LFDX. 0, y
-->
LFD imm(reg)
Reviewers: Nemanjai
Differential Revision: https://reviews.llvm.org/D49007
llvm-svn: 340149
We want to run the Machine Scheduler instead of the List Scheduler after RA.
Checked with a performance run on a Power 9 machine with SPEC 2006 and while
some benchmarks improved and others degraded the geomean was slightly improved
with the Machine Scheduler.
Differential Revision: https://reviews.llvm.org/D45265
llvm-svn: 336295
Summary: This patch originated from D47388 and is a proper subset of the originating changes, containing only the fmf optimization guard extensions.
Reviewers: spatel, hfinkel, wristow, arsenm, javed.absar, rampitec, nhaehnle, nemanjai
Reviewed By: rampitec, nhaehnle
Subscribers: tpr, nemanjai, wdng
Differential Revision: https://reviews.llvm.org/D47918
llvm-svn: 334876
Summary: This change uses fmf subflags to guard fma optimizations as well as unsafe. These changes originated from D46483 and have been simplified via getNode.
Reviewers: spatel, arsenm, hfinkel, javed.absar
Reviewed By: spatel
Subscribers: nemanjai, wdng
Differential Revision: https://reviews.llvm.org/D47388
llvm-svn: 334242
Summary:
This change uses fmf subflags to guard optimizations as well as unsafe. These changes originated from D46483.
It contains only context for fsqrt.
Reviewers: spatel, hfinkel, arsenm
Reviewed By: spatel
Subscribers: hfinkel, wdng, andrew.w.kaylor, wristow, efriedma, nemanjai
Differential Revision: https://reviews.llvm.org/D47749
llvm-svn: 334113
Summary: This change uses fmf subflags to guard optimizations as well as unsafe. These changes originated from D46483.
Reviewers: spatel, hfinkel
Reviewed By: spatel
Subscribers: nemanjai
Differential Revision: https://reviews.llvm.org/D47389
llvm-svn: 334037
This is a simple hack based on what's proposed in D37686, but we can extend it if needed in follow-ups.
It gets us most of the FMF functionality that we want without adding any state bits to the flags. It
also intentionally leaves out non-FMF flags (nsw, etc) to minimize the patch.
It should provide a superset of the functionality from D46563 - the extra tests show propagation and
codegen diffs for fcmp, vecreduce, and FP libcalls.
The PPC log2() test shows the limits of this most basic approach - we only applied 'afn' to the last
node created for the call. AFAIK, there aren't any libcall optimizations based on the flags currently,
so that shouldn't make any difference.
Differential Revision: https://reviews.llvm.org/D46854
llvm-svn: 332358
Summary: Adding support for Fast flags in the SDNode to leverage fast math sub flag usage.
Reviewers: spatel, arsenm, jbhateja, hfinkel, escha, qcolombet, echristo, wristow, javed.absar
Reviewed By: spatel
Subscribers: llvm-commits, rampitec, nhaehnle, tstellar, FarhanaAleen, nemanjai, javed.absar, jbhateja, hfinkel, wdng
Differential Revision: https://reviews.llvm.org/D45710
llvm-svn: 331547
We can't see all of the problems currently unless
we look at debug output when the global 'unsafe' is
on. It's a mess. This is another attempt to make
sure that D45710 is not making changes unintentionally.
llvm-svn: 331476
I'm choosing PPC out of convenience because it does
all of the transforms of interest in these tests by
default. There are multiple FMF problems shown in the
current checks. D45710 is proposing to fix part of
that.
llvm-svn: 331471