Summary: Rewrite the fsub-0.0 idiom to fneg and always emit fneg for fp
negation. This also extends the scalarization cost in instcombine for unary
operators to result in the same IR rewrites for fneg as for the idiom.
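For readers unfamiliar with the idiom, here is a minimal sketch in plain C++ of the arithmetic behind it. It is not the IR this patch touches, and the function names are hypothetical, chosen only for illustration. Under the default rounding mode, subtracting from -0.0 flips the sign of every input including both zeros, whereas subtracting from +0.0 does not.

```cpp
#include <cmath>

// Hypothetical helper: the "fsub -0.0, x" idiom written as source-level arithmetic.
double negate_via_fsub_idiom(double x) { return -0.0 - x; }

// Hypothetical helper: a direct negation, the operation fneg expresses.
double negate_direct(double x) { return -x; }

// Hypothetical helper: subtracting from +0.0 is NOT a negation, because
// 0.0 - 0.0 yields +0.0 under round-to-nearest.
double sub_from_positive_zero(double x) { return 0.0 - x; }
```

The sketch assumes default floating-point settings (no fast-math style flags); it does not show NaN payload or exception differences between the two forms.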
Reviewed By: cameron.mcinally
Differential Revision: https://reviews.llvm.org/D75467
Previously we emitted an fmadd and an fmadd+fneg and combined them with a shufflevector. But this doesn't follow the correct exception behavior for unselected elements, so the backend can't merge them into the fmaddsub/fmsubadd instructions.
This patch restores the fmaddsub intrinsics so we don't have two arithmetic operations. We lose out on some optimization opportunities in the non-strict FP case, but I don't think this is a big loss. If someone gives us a test case we can look into adding instcombine/dagcombine improvements. I'd rather not have the frontend do completely different things for strict and non-strict.
This still has problems because target specific intrinsics don't support strict semantics yet. We also still have all of the problems with masking. But we at least generate the right instruction in constrained mode now.
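As a scalar reference for what the two instructions compute per lane, here is a minimal sketch. It is an illustrative model, not the code in this patch, and the names fmaddsub_ref and fmsubadd_ref are hypothetical. Each lane performs exactly one fused operation, which is the property the old fmadd + fmadd/fneg + shufflevector sequence lacked: there every lane was evaluated twice, and the discarded result could still raise an exception.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Hypothetical reference model of fmaddsub:
// even lanes compute a*b - c, odd lanes compute a*b + c.
template <std::size_t N>
std::array<float, N> fmaddsub_ref(const std::array<float, N>& a,
                                  const std::array<float, N>& b,
                                  const std::array<float, N>& c) {
  std::array<float, N> r{};
  for (std::size_t i = 0; i < N; ++i) {
    // One fused multiply-add per lane; only the sign of the addend alternates.
    float addend = (i % 2 == 0) ? -c[i] : c[i];
    r[i] = std::fma(a[i], b[i], addend);
  }
  return r;
}

// Hypothetical reference model of fmsubadd:
// even lanes compute a*b + c, odd lanes compute a*b - c.
template <std::size_t N>
std::array<float, N> fmsubadd_ref(const std::array<float, N>& a,
                                  const std::array<float, N>& b,
                                  const std::array<float, N>& c) {
  std::array<float, N> r{};
  for (std::size_t i = 0; i < N; ++i) {
    float addend = (i % 2 == 0) ? c[i] : -c[i];
    r[i] = std::fma(a[i], b[i], addend);
  }
  return r;
}
```

The model ignores masking and rounding-mode control, both of which the message above notes are still open problems.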
Differential Revision: https://reviews.llvm.org/D74268
With REQUIRES: x86-register-target added to the tests.
Also remove some unneeded FIXMEs, but add a FIXME for bad IR generation
for FMADDSUB/FMSUBADD with constrained FP.
Original patch by Kevin P. Neal
This reverts commit 208470dd5d.
Tests fail:
error: unable to create target: 'No available targets are compatible with triple "x86_64-apple-darwin"'
This happens on the clang-hexagon-elf and clang-cmake-armv7-quick bots.
If anyone has any suggestions on why then I'm all ears.
Differential Revision: https://reviews.llvm.org/D73570
Revert "[FPEnv][X86] Speculative fix for failures introduced by eda495426."
This reverts commit 80e17e5fcc.
The speculative fix didn't solve the test failures on Hexagon, ARMv6, and
MSVC AArch64.
When constrained floating point is enabled, the X86-specific builtins don't
use constrained intrinsics in some cases. Fix that.
Differential Revision: https://reviews.llvm.org/D73570