Jim Grosbach b6535c32f5 X86: Constant fold converting vector setcc results to float.
Since the result of a SETCC for X86 is 0 or -1 in each lane, we can
move unary operations, in this case [su]int_to_fp, through the mask
operation and constant fold the operation away. Generally speaking:
  UNARYOP(AND(VECTOR_CMP(x,y), constant))
      --> AND(VECTOR_CMP(x,y), constant2)
where constant2 is UNARYOP(constant).

This implements the transform where UNARYOP is [su]int_to_fp.
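
To see why the rewrite is safe, the following is a minimal sketch (not LLVM code) that models a single 32-bit lane in Python. The helper names `float_bits`, `sint_to_fp_bits`, `before`, and `after` are hypothetical and exist only for this illustration. It assumes the compare mask is either all zeros or all ones, as stated above, and covers only the signed conversion.

```python
import struct

ALL_ONES = 0xFFFFFFFF  # lane value produced by a true vector compare (-1)


def float_bits(x: float) -> int:
    """Return the IEEE-754 single-precision bit pattern of x as an unsigned int."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def sint_to_fp_bits(lane: int) -> int:
    """Model sitofp on one lane: treat the 32-bit pattern as signed, convert to float."""
    signed = lane - (1 << 32) if lane & 0x80000000 else lane
    return float_bits(float(signed))


def before(mask: int, constant: int) -> int:
    """UNARYOP(AND(VECTOR_CMP(x,y), constant)) for one lane."""
    return sint_to_fp_bits(mask & constant)


def after(mask: int, constant: int) -> int:
    """AND(VECTOR_CMP(x,y), constant2), where constant2 = UNARYOP(constant)."""
    return mask & sint_to_fp_bits(constant)


# Both forms agree for either mask value, because sitofp(0) is +0.0,
# whose bit pattern is all zeros.
for mask in (0, ALL_ONES):
    for constant in (0, 1, 7, 0x80000000, ALL_ONES):
        assert before(mask, constant) == after(mask, constant)
```

The identity relies on the unary operation mapping integer 0 to a value whose bit pattern is also 0, which holds for the int-to-float conversions handled here.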

For example, consider the simple function:
define <4 x float> @foo(<4 x float> %val, <4 x float> %test) nounwind {
  %cmp = fcmp oeq <4 x float> %val, %test
  %ext = zext <4 x i1> %cmp to <4 x i32>
  %result = sitofp <4 x i32> %ext to <4 x float>
  ret <4 x float> %result
}

Before this change, the SSE code is generated as:
LCPI0_0:
  .long 1                       ## 0x1
  .long 1                       ## 0x1
  .long 1                       ## 0x1
  .long 1                       ## 0x1
  .section  __TEXT,__text,regular,pure_instructions
  .globl  _foo
  .align  4, 0x90
_foo:                                   ## @foo
  cmpeqps %xmm1, %xmm0
  andps LCPI0_0(%rip), %xmm0
  cvtdq2ps  %xmm0, %xmm0
  retq

After, the code is improved to:
LCPI0_0:
  .long 1065353216              ## float 1.000000e+00
  .long 1065353216              ## float 1.000000e+00
  .long 1065353216              ## float 1.000000e+00
  .long 1065353216              ## float 1.000000e+00
  .section  __TEXT,__text,regular,pure_instructions
  .globl  _foo
  .align  4, 0x90
_foo:                                   ## @foo
  cmpeqps %xmm1, %xmm0
  andps LCPI0_0(%rip), %xmm0
  retq

The cvtdq2ps has been constant folded away, and the floating-point 1.0f
vector lanes are read directly from the constant pool by the memory
operand of andps.
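
As a rough check of the improved sequence, the sketch below emulates the two remaining instructions lane by lane in Python. The function `foo_lanes` and the sample inputs are hypothetical; only the constant 1065353216 comes from the listing above. It assumes `cmpeqps` yields an all-ones lane for an ordered-equal compare and zero otherwise.

```python
import struct

ONE_F32_BITS = 1065353216  # the .long value from LCPI0_0, i.e. 0x3F800000


def bits_to_float(bits: int) -> float:
    """Reinterpret a 32-bit pattern as an IEEE-754 single-precision float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def foo_lanes(val, test):
    """Emulate cmpeqps followed by andps with the 1.0f constant, per lane."""
    result = []
    for v, t in zip(val, test):
        # cmpeqps: ordered equal, so a NaN in either operand gives a zero mask.
        mask = 0xFFFFFFFF if v == t else 0
        # andps with the constant-pool vector of 1.0f bit patterns.
        result.append(bits_to_float(mask & ONE_F32_BITS))
    return result


lanes = foo_lanes([1.0, 2.0, 3.0, 4.0], [1.0, 0.0, 3.0, 5.0])
print(lanes)
```

Lanes that compare equal come out as 1.0 and all others as 0.0, which matches what the original zext plus sitofp sequence computes.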

llvm-svn: 213342
2014-07-18 00:40:56 +00:00
clang Fix parsing certain kinds of strings in the MS section pragmas 2014-07-18 00:13:16 +00:00
clang-tools-extra Revert "unique_ptr-ify ownership of ASTConsumers" 2014-07-17 22:33:56 +00:00
compiler-rt Revert Thumb-2 conversion of some ARM builtins. 2014-07-17 20:41:01 +00:00
debuginfo-tests relax testcase for LLDB output format compatibility. 2014-03-19 23:06:18 +00:00
libclc Add several missing double constant definitions 2014-07-17 22:07:35 +00:00
libcxx Fix bug #20335 - memory leak when move-constructing a string with unequal allocator. Thanks to Thomas Koeppe for the report 2014-07-17 15:32:20 +00:00
libcxxabi libcxxabi cmake: Use HandleLLVMOptions.cmake, don't manually add -std=c++11. 2014-07-16 23:53:37 +00:00
lld [mach-o] Add support for x86 CALL instruction that uses a scattered relocation 2014-07-18 00:37:52 +00:00
lldb Fixed the objective C symbol parsing in ObjectFileMachO. 2014-07-17 22:51:31 +00:00
llvm X86: Constant fold converting vector setcc results to float. 2014-07-18 00:40:56 +00:00
openmp CMake: remove duplicated source file from list 2014-06-02 13:09:24 +00:00
polly [Refactor] Move code out of the IslAst header 2014-07-17 16:11:28 +00:00