llvm-project

Commit Graph

Author	SHA1	Message	Date
Simon Pilgrim	3fc3454a0c	[X86][FMA] Optimize FNEG(FMUL) Patterns On FMA targets, we can avoid having to load a constant to negate a float/double multiply by instead using a FNMSUB (-(X*Y)-0) Fix for PR24366 Differential Revision: http://reviews.llvm.org/D14909 llvm-svn: 254495	2015-12-02 09:07:55 +00:00
Simon Pilgrim	db26b3ddfa	[X86][FMA4] Prefer FMA4 to FMA We currently output FMA instructions on targets which support both FMA4 + FMA (i.e. later Bulldozer CPUS bdver2/bdver3/bdver4). This patch flips this so FMA4 is preferred; this is for several reasons: 1 - FMA4 is non-destructive reducing the need for mov instructions. 2 - Its more straighforward to commute and fold inputs (although the recent work on FMA has reduced this difference). 3 - All supported targets have FMA4 performance equal or better to FMA - Piledriver (bdver2) in particular has half the throughput when executing FMA instructions. Its looks like no future AMD processor lines will support FMA4 after the Bulldozer series so we're not causing problems for later CPUs. Differential Revision: http://reviews.llvm.org/D14997 llvm-svn: 254339	2015-11-30 22:22:06 +00:00
Simon Pilgrim	d9bb73b236	[X86][FMA] Added 512-bit tests to match 128/256-bit tests coverage As discussed on D14909 llvm-svn: 254233	2015-11-28 16:04:24 +00:00
Simon Pilgrim	82f663d755	[X86][FMA] More thorough FMA tests Added FMADD/FMSUB/FNMADD/FNMSUB tests for all types Added load folding tests for 512-bit vectors NOTE: Many of the AVX512 FMA instructions don't yet commute/fold correctly As discussed on D14909 llvm-svn: 254232	2015-11-28 14:28:44 +00:00
Simon Pilgrim	1d881ae225	[X86][FMA] Begun adding AVX512 FMA tests As discussed on D14909 llvm-svn: 254180	2015-11-26 20:53:28 +00:00
James Y Knight	7c905063c5	Make utils/update_llc_test_checks.py note that the assertions are autogenerated. Also update existing test cases which appear to be generated by it and weren't modified (other than addition of the header) by rerunning it. llvm-svn: 253917	2015-11-23 21:33:58 +00:00
Simon Pilgrim	779bcf3e3d	[X86][FMA] Refreshed fma tests llvm-svn: 247508	2015-09-12 15:33:05 +00:00
Stephen Lin	764d8d3d6f	Start using CHECK-LABEL in some tests. llvm-svn: 186163	2013-07-12 14:54:12 +00:00
Stephen Lin	73de7bf5de	AArch64/PowerPC/SystemZ/X86: This patch fixes the interface, usage, and all in-tree implementations of TargetLoweringBase::isFMAFasterThanMulAndAdd in order to resolve the following issues with fmuladd (i.e. optional FMA) intrinsics: 1. On X86(-64) targets, ISD::FMA nodes are formed when lowering fmuladd intrinsics even if the subtarget does not support FMA instructions, leading to laughably bad code generation in some situations. 2. On AArch64 targets, ISD::FMA nodes are formed for operations on fp128, resulting in a call to a software fp128 FMA implementation. 3. On PowerPC targets, FMAs are not generated from fmuladd intrinsics on types like v2f32, v8f32, v4f64, etc., even though they promote, split, scalarize, etc. to types that support hardware FMAs. The function has also been slightly renamed for consistency and to force a merge/build conflict for any out-of-tree target implementing it. To resolve, see comments and fixed in-tree examples. llvm-svn: 185956	2013-07-09 18:16:56 +00:00

9 Commits