Simon Pilgrim db26b3ddfa [X86][FMA4] Prefer FMA4 to FMA
We currently output FMA instructions on targets which support both FMA4 and FMA (i.e. later Bulldozer CPUs bdver2/bdver3/bdver4).

This patch flips this so FMA4 is preferred; this is for several reasons:

1 - FMA4 is non-destructive, reducing the need for mov instructions.
2 - It's more straightforward to commute and fold inputs (although the recent work on FMA has reduced this difference).
3 - All supported targets have FMA4 performance equal to or better than FMA - Piledriver (bdver2) in particular has half the throughput when executing FMA instructions.
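
To illustrate reason 1, here is a minimal sketch (not part of the patch; the function name fma_keep_inputs is hypothetical) in which all three inputs are still needed after the fused multiply-add. A destructive three-operand FMA has to overwrite one of its sources, so the compiler may need an extra mov to preserve that value, whereas a four-operand FMA4 form can write to a separate destination. Assuming a build along the lines of `clang++ -O2 -march=bdver2 -S`, one would expect either a vfmaddss (FMA4) or a vfmadd213ss-style (FMA) instruction in the output, but the exact instruction selection depends on the compiler version.

```cpp
#include <cmath>

// Hypothetical example, not taken from the LLVM patch or its tests.
// a, b and c all remain live after the fused multiply-add, so a
// destructive FMA encoding must copy whichever input it overwrites.
float fma_keep_inputs(float a, float b, float c) {
    float r = std::fma(a, b, c);  // r = a * b + c with a single rounding
    return r + a + b + c;         // uses all three inputs again
}
```

Calling fma_keep_inputs(2.0f, 3.0f, 4.0f) returns 19, since 2 * 3 + 4 = 10 and 10 + 2 + 3 + 4 = 19.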

It looks like no future AMD processor lines will support FMA4 after the Bulldozer series, so we're not causing problems for later CPUs.

Differential Revision: http://reviews.llvm.org/D14997

llvm-svn: 254339
2015-11-30 22:22:06 +00:00
clang [libFuzzer] clarify the limitation of fsanitize-coverage=trace-cmp 2015-11-30 22:17:19 +00:00
clang-tools-extra [clang-tidy] google-explicit-constructor: improve the warning message 2015-11-28 02:25:02 +00:00
compiler-rt [compiler-rt] Remove SANITIZER_AARCH64_VMA usage 2015-11-30 19:43:03 +00:00
debuginfo-tests New round of fixes for "Always compile debuginfo-tests for the host triple" 2014-10-18 23:47:59 +00:00
libclc integer: remove explicit casts from _MIN definitions 2015-10-06 19:12:12 +00:00
libcxx Last bit of P0006; mark it as complete 2015-11-30 05:39:30 +00:00
libcxxabi c++abi: use __builtin_offsetof instead of offsetof 2015-11-18 05:33:38 +00:00
libunwind Make it possible to use libunwind without heap. 2015-11-09 06:57:29 +00:00
lld ELF: Make comments consistent. 2015-11-30 21:00:53 +00:00
lldb Fix hang in global static initialization 2015-11-30 22:18:43 +00:00
llgo [llgo] Force exporting __morestack from llgoi 2015-11-27 04:46:46 +00:00
llvm [X86][FMA4] Prefer FMA4 to FMA 2015-11-30 22:22:06 +00:00
openmp Fix honoring of OMP_THREAD_LIMIT in the teams construct 2015-11-30 20:14:05 +00:00
polly ScopInfo: Further simplify code 2015-11-30 21:13:43 +00:00