llvm-project/llvm/test/CodeGen/X86/fp-une-cmp.ll

71 lines
1.5 KiB
LLVM
Raw Normal View History

; RUN: llc < %s -march=x86 -mattr=sse4.1 | FileCheck %s
; <rdar://problem/7859988>
; Make sure we don't generate more jumps than we need to. We used to generate
; something like this:
;
; jne LBB0_1
; jnp LBB0_2
; LBB0_1:
; jmp LBB0_3
; LBB0_2:
; addsd ...
; LBB0_3:
;
; Now we generate this:
;
; jne LBB0_2
; jp LBB0_2
; addsd ...
; LBB0_2:
Allow X86::COND_NE_OR_P and X86::COND_NP_OR_E to be reversed. Currently, AnalyzeBranch() fails non-equality comparison between floating points on X86 (see https://llvm.org/bugs/show_bug.cgi?id=23875). This is because this function can modify the branch by reversing the conditional jump and removing unconditional jump if there is a proper fall-through. However, in the case of non-equality comparison between floating points, this can turn the branch "unanalyzable". Consider the following case: jne.BB1 jp.BB1 jmp.BB2 .BB1: ... .BB2: ... AnalyzeBranch() will reverse "jp .BB1" to "jnp .BB2" and then "jmp .BB2" will be removed: jne.BB1 jnp.BB2 .BB1: ... .BB2: ... However, AnalyzeBranch() cannot analyze this branch anymore as there are two conditional jumps with different targets. This may disable some optimizations like block-placement: in this case the fall-through behavior is enforced even if the fall-through block is very cold, which is suboptimal. Actually this optimization is also done in block-placement pass, which means we can remove this optimization from AnalyzeBranch(). However, currently X86::COND_NE_OR_P and X86::COND_NP_OR_E are not reversible: there is no defined negation conditions for them. In order to reverse them, this patch defines two new CondCode X86::COND_E_AND_NP and X86::COND_P_AND_NE. It also defines how to synthesize instructions for them. Here only the second conditional jump is reversed. This is valid as we only need them to do this "unconditional jump removal" optimization. Differential Revision: http://reviews.llvm.org/D11393 llvm-svn: 258847
2016-01-27 04:08:01 +08:00
define float @func1(float %x, float %y) nounwind readnone optsize ssp {
; CHECK: func1
; CHECK: jne [[LABEL:.*]]
; CHECK-NEXT: jp [[LABEL]]
; CHECK-NOT: jmp
Allow X86::COND_NE_OR_P and X86::COND_NP_OR_E to be reversed. Currently, AnalyzeBranch() fails non-equality comparison between floating points on X86 (see https://llvm.org/bugs/show_bug.cgi?id=23875). This is because this function can modify the branch by reversing the conditional jump and removing unconditional jump if there is a proper fall-through. However, in the case of non-equality comparison between floating points, this can turn the branch "unanalyzable". Consider the following case: jne.BB1 jp.BB1 jmp.BB2 .BB1: ... .BB2: ... AnalyzeBranch() will reverse "jp .BB1" to "jnp .BB2" and then "jmp .BB2" will be removed: jne.BB1 jnp.BB2 .BB1: ... .BB2: ... However, AnalyzeBranch() cannot analyze this branch anymore as there are two conditional jumps with different targets. This may disable some optimizations like block-placement: in this case the fall-through behavior is enforced even if the fall-through block is very cold, which is suboptimal. Actually this optimization is also done in block-placement pass, which means we can remove this optimization from AnalyzeBranch(). However, currently X86::COND_NE_OR_P and X86::COND_NP_OR_E are not reversible: there is no defined negation conditions for them. In order to reverse them, this patch defines two new CondCode X86::COND_E_AND_NP and X86::COND_P_AND_NE. It also defines how to synthesize instructions for them. Here only the second conditional jump is reversed. This is valid as we only need them to do this "unconditional jump removal" optimization. Differential Revision: http://reviews.llvm.org/D11393 llvm-svn: 258847
2016-01-27 04:08:01 +08:00
;
entry:
%0 = fpext float %x to double
%1 = fpext float %y to double
%2 = fmul double %0, %1
%3 = fcmp une double %2, 0.000000e+00
br i1 %3, label %bb2, label %bb1
bb1:
%4 = fadd double %2, -1.000000e+00
br label %bb2
bb2:
%.0.in = phi double [ %4, %bb1 ], [ %2, %entry ]
%.0 = fptrunc double %.0.in to float
ret float %.0
}
Allow X86::COND_NE_OR_P and X86::COND_NP_OR_E to be reversed. Currently, AnalyzeBranch() fails non-equality comparison between floating points on X86 (see https://llvm.org/bugs/show_bug.cgi?id=23875). This is because this function can modify the branch by reversing the conditional jump and removing unconditional jump if there is a proper fall-through. However, in the case of non-equality comparison between floating points, this can turn the branch "unanalyzable". Consider the following case: jne.BB1 jp.BB1 jmp.BB2 .BB1: ... .BB2: ... AnalyzeBranch() will reverse "jp .BB1" to "jnp .BB2" and then "jmp .BB2" will be removed: jne.BB1 jnp.BB2 .BB1: ... .BB2: ... However, AnalyzeBranch() cannot analyze this branch anymore as there are two conditional jumps with different targets. This may disable some optimizations like block-placement: in this case the fall-through behavior is enforced even if the fall-through block is very cold, which is suboptimal. Actually this optimization is also done in block-placement pass, which means we can remove this optimization from AnalyzeBranch(). However, currently X86::COND_NE_OR_P and X86::COND_NP_OR_E are not reversible: there is no defined negation conditions for them. In order to reverse them, this patch defines two new CondCode X86::COND_E_AND_NP and X86::COND_P_AND_NE. It also defines how to synthesize instructions for them. Here only the second conditional jump is reversed. This is valid as we only need them to do this "unconditional jump removal" optimization. Differential Revision: http://reviews.llvm.org/D11393 llvm-svn: 258847
2016-01-27 04:08:01 +08:00
define float @func2(float %x, float %y) nounwind readnone optsize ssp {
; CHECK: func2
; CHECK: jne [[LABEL:.*]]
; CHECK-NEXT: jp [[LABEL]]
; CHECK: %bb2
; CHECK: %bb1
; CHECK: jmp
;
entry:
%0 = fpext float %x to double
%1 = fpext float %y to double
%2 = fmul double %0, %1
%3 = fcmp une double %2, 0.000000e+00
br i1 %3, label %bb1, label %bb2, !prof !1
bb1:
%4 = fadd double %2, -1.000000e+00
br label %bb2
bb2:
%.0.in = phi double [ %4, %bb1 ], [ %2, %entry ]
%.0 = fptrunc double %.0.in to float
ret float %.0
}
!1 = !{!"branch_weights", i32 1, i32 1000}