llvm-project/llvm/test/CodeGen/X86/wide-fma-contraction.ll

; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
; RUN: llc -mcpu=bdver2 -mattr=-fma -mtriple=i686-apple-darwin < %s | FileCheck %s
; RUN: llc -mcpu=bdver2 -mattr=-fma,-fma4 -mtriple=i686-apple-darwin < %s | FileCheck %s --check-prefix=CHECK-NOFMA
;
; Verify that @llvm.fmuladd on the illegal type <16 x float> is split into two
; 256-bit operations: FMA4 vfmaddps when FMA4 is still enabled (first RUN line),
; and separate vmulps/vaddps when both FMA and FMA4 are disabled (second RUN line).

; CHECK-LABEL: fmafunc
; CHECK-NOFMA-LABEL: fmafunc
define <16 x float> @fmafunc(<16 x float> %a, <16 x float> %b, <16 x float> %c) {
; CHECK-LABEL: fmafunc:
; CHECK:       ## BB#0:
; CHECK-NEXT:    pushl %ebp
; CHECK-NEXT:    .cfi_def_cfa_offset 8
; CHECK-NEXT:    .cfi_offset %ebp, -8
; CHECK-NEXT:    movl %esp, %ebp
; CHECK-NEXT:    .cfi_def_cfa_register %ebp
; CHECK-NEXT:    andl $-32, %esp
; CHECK-NEXT:    subl $32, %esp
; CHECK-NEXT:    vfmaddps 8(%ebp), %ymm2, %ymm0, %ymm0
; CHECK-NEXT:    vfmaddps 40(%ebp), %ymm3, %ymm1, %ymm1
; CHECK-NEXT:    movl %ebp, %esp
; CHECK-NEXT:    popl %ebp
; CHECK-NEXT:    retl
;
; CHECK-NOFMA-LABEL: fmafunc:
; CHECK-NOFMA:       ## BB#0:
; CHECK-NOFMA-NEXT:    pushl %ebp
; CHECK-NOFMA-NEXT:    .cfi_def_cfa_offset 8
; CHECK-NOFMA-NEXT:    .cfi_offset %ebp, -8
; CHECK-NOFMA-NEXT:    movl %esp, %ebp
; CHECK-NOFMA-NEXT:    .cfi_def_cfa_register %ebp
; CHECK-NOFMA-NEXT:    andl $-32, %esp
; CHECK-NOFMA-NEXT:    subl $32, %esp
; CHECK-NOFMA-NEXT:    vmulps %ymm2, %ymm0, %ymm0
; CHECK-NOFMA-NEXT:    vaddps 8(%ebp), %ymm0, %ymm0
; CHECK-NOFMA-NEXT:    vmulps %ymm3, %ymm1, %ymm1
; CHECK-NOFMA-NEXT:    vaddps 40(%ebp), %ymm1, %ymm1
; CHECK-NOFMA-NEXT:    movl %ebp, %esp
; CHECK-NOFMA-NEXT:    popl %ebp
; CHECK-NOFMA-NEXT:    retl


  %ret = tail call <16 x float> @llvm.fmuladd.v16f32(<16 x float> %a, <16 x float> %b, <16 x float> %c)
  ret <16 x float> %ret
}

declare <16 x float> @llvm.fmuladd.v16f32(<16 x float>, <16 x float>, <16 x float>) nounwind readnone

Blame history, one entry per commit, with the lines each commit last touched:

llvm-svn: 304586 (2017-06-03 03:15:04 +08:00)
Regenerate expectation for wide-fma-contraction.ll. NFC
Lines: the `; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py` header, and the autogenerated assertions from `; CHECK-LABEL: fmafunc:` through `; CHECK-NOFMA-NEXT: retl`.

llvm-svn: 309774 (2017-08-02 08:28:10 +08:00)
X86: Do not use llc -march in tests.
`llc -march` is problematic because it only switches the target architecture, but leaves the operating system unchanged. This occasionally leads to indeterministic tests because the OS from LLVM_DEFAULT_TARGET_TRIPLE is used. However we can simply always use `llc -mtriple` instead.
This changes all the tests to do this to avoid people using -march when they copy and paste parts of tests.
See also the discussion in https://reviews.llvm.org/D35287
Lines: both `; RUN:` lines.

llvm-svn: 177820 (2013-03-23 16:26:53 +08:00)
Remove the type legality check from the SelectionDAGBuilder when it lowers @llvm.fmuladd to ISD::FMA nodes.
Performing this check unilaterally prevented us from generating FMAs when the incoming IR contained illegal vector types which would eventually be legalized to underlying types that did support FMA. For example, an @llvm.fmuladd on an OpenCL float16 should become a sequence of float4 FMAs, not float4 fmul+fadd's.
NOTE: Because we still call the target-specific profitability hook, individual targets can reinstate the old behavior, if desired, by simply performing the legality check inside their callback hook. They can also perform more sophisticated legality checks, if, for example, some illegal vector types can be productively implemented as FMAs, but not others.
Lines: the `define <16 x float> @fmafunc` line, the `%ret = tail call` and `ret <16 x float> %ret` lines, the closing `}`, and the `declare` line (one further entry for this commit shows no code).

llvm-svn: 186163 (2013-07-12 22:54:12 +08:00)
Start using CHECK-LABEL in some tests.
Lines: `; CHECK-LABEL: fmafunc` and `; CHECK-NOFMA-LABEL: fmafunc`.

llvm-svn: 185956 (2013-07-10 02:16:56 +08:00)
AArch64/PowerPC/SystemZ/X86: This patch fixes the interface, usage, and all in-tree implementations of TargetLoweringBase::isFMAFasterThanMulAndAdd in order to resolve the following issues with fmuladd (i.e. optional FMA) intrinsics:
1. On X86(-64) targets, ISD::FMA nodes are formed when lowering fmuladd intrinsics even if the subtarget does not support FMA instructions, leading to laughably bad code generation in some situations.
2. On AArch64 targets, ISD::FMA nodes are formed for operations on fp128, resulting in a call to a software fp128 FMA implementation.
3. On PowerPC targets, FMAs are not generated from fmuladd intrinsics on types like v2f32, v8f32, v4f64, etc., even though they promote, split, scalarize, etc. to types that support hardware FMAs.
The function has also been slightly renamed for consistency and to force a merge/build conflict for any out-of-tree target implementing it. To resolve, see comments and fixed in-tree examples.
Lines: none shown for this entry.
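
The splitting that llvm-svn 177820 describes (a wide fmuladd becoming a sequence of narrower FMAs) can be pictured at the IR level. The sketch below is only an analogy: the real legalizer works on SelectionDAG nodes, not on IR, and the function name `@fmafunc_split` and all value names are hypothetical. It splits the `<16 x float>` operation into two `<8 x float>` halves, which corresponds to the two `%ymm` operations per RUN line in the assertions above. Whether llc emits identical code for this form has not been verified here.

```llvm
; Hypothetical hand-split version of @fmafunc. The name @fmafunc_split and the
; value names are illustrative only; they do not appear in the test.
define <16 x float> @fmafunc_split(<16 x float> %a, <16 x float> %b, <16 x float> %c) {
  ; Extract the low (elements 0-7) and high (elements 8-15) half of each operand.
  %a.lo = shufflevector <16 x float> %a, <16 x float> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
  %a.hi = shufflevector <16 x float> %a, <16 x float> undef, <8 x i32> <i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15>
  %b.lo = shufflevector <16 x float> %b, <16 x float> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
  %b.hi = shufflevector <16 x float> %b, <16 x float> undef, <8 x i32> <i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15>
  %c.lo = shufflevector <16 x float> %c, <16 x float> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
  %c.hi = shufflevector <16 x float> %c, <16 x float> undef, <8 x i32> <i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15>

  ; One fmuladd per 256-bit half; each may be fused or left as mul+add,
  ; depending on what the subtarget supports.
  %lo = call <8 x float> @llvm.fmuladd.v8f32(<8 x float> %a.lo, <8 x float> %b.lo, <8 x float> %c.lo)
  %hi = call <8 x float> @llvm.fmuladd.v8f32(<8 x float> %a.hi, <8 x float> %b.hi, <8 x float> %c.hi)

  ; Concatenate the two halves back into the 16-element result.
  %ret = shufflevector <8 x float> %lo, <8 x float> %hi, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15>
  ret <16 x float> %ret
}

declare <8 x float> @llvm.fmuladd.v8f32(<8 x float>, <8 x float>, <8 x float>)
```

Assuming it is saved to its own `.ll` file, this function can be fed to the same `llc` invocations as the RUN lines to compare its output with the autogenerated assertions for `@fmafunc`.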