x*rsqrt(x) returns NaN for x == 0, whereas 1/rsqrt(x) returns 0, as
desired.
Verified that the particular nvptx approximate instructions here do in
fact return 0 for x = 0.
llvm-svn: 293713
Summary:
This lets us lower to sqrt.approx and rsqrt.approx under more
circumstances.
* Now we emit sqrt.approx and rsqrt.approx for calls to @llvm.sqrt.f32,
when fast-math is enabled. Previously, we only would emit it for
calls to @llvm.nvvm.sqrt.f. (With this patch we no longer emit
sqrt.approx for calls to @llvm.nvvm.sqrt.f; we rely on intcombine to
simplify llvm.nvvm.sqrt.f into llvm.sqrt.f32.)
* Now we emit the ftz version of rsqrt.approx when ftz is enabled.
Previously, we only emitted rsqrt.approx when ftz was disabled.
Reviewers: hfinkel
Subscribers: llvm-commits, tra, jholewinski
Differential Revision: https://reviews.llvm.org/D28508
llvm-svn: 293605