llvm-project/llvm/test/CodeGen/X86/vec_shuffle-41.ll

; RUN: llc < %s -mtriple=x86_64-apple-darwin -mcpu=corei7-avx | FileCheck %s

; Use buildFromShuffleMostly which allows this to be generated as two 128-bit
; shuffles and an insert.

; This is the (somewhat questionable) LLVM IR that is generated for:
;    x8.s0123456 = x8.s1234567;  // x8 is a <8 x float> type
;    x8.s7 = f;                  // f is float


define <8 x float> @test1(<8 x float> %a, float %b) {
; CHECK-LABEL: test1:
; CHECK: vinsertps
; CHECK-NOT: vinsertps
entry:
  %shift = shufflevector <8 x float> %a, <8 x float> undef, <7 x i32> <i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
  %extend = shufflevector <7 x float> %shift, <7 x float> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 undef>
  %insert = insertelement <8 x float> %extend, float %b, i32 7

  ret <8 x float> %insert
}
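
The "somewhat questionable" part of the IR above is the detour through a `<7 x float>` value. As a hand-written sketch (not part of the original test, and `@test1_single_shuffle` is a hypothetical name), the same lanes can be expressed with a single 8-wide shuffle whose last mask element is undef. No CHECK lines are given because the resulting codegen has not been verified here.

```llvm
; Hypothetical variant of @test1: shift lanes 1..7 down by one with a single
; shufflevector, leave lane 7 undefined, then overwrite lane 7 with %b.
define <8 x float> @test1_single_shuffle(<8 x float> %a, float %b) {
entry:
  %shift = shufflevector <8 x float> %a, <8 x float> undef, <8 x i32> <i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 undef>
  %insert = insertelement <8 x float> %shift, float %b, i32 7
  ret <8 x float> %insert
}
```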
[X86] Improve buildFromShuffleMostly for AVX

For a 256-bit BUILD_VECTOR consisting mostly of shuffles of 256-bit vectors,
both the BUILD_VECTOR and its operands may need to be legalized in multiple
steps. Consider:

    (v8f32 (BUILD_VECTOR (extract_vector_elt (v8f32 %vreg0), Constant<1>),
                         (extract_vector_elt %vreg0, Constant<2>),
                         (extract_vector_elt %vreg0, Constant<3>),
                         (extract_vector_elt %vreg0, Constant<4>),
                         (extract_vector_elt %vreg0, Constant<5>),
                         (extract_vector_elt %vreg0, Constant<6>),
                         (extract_vector_elt %vreg0, Constant<7>),
                         %vreg1))

a. We can't build a 256-bit vector efficiently, so we need to split it into
   two 128-bit vectors and combine them with VINSERTX128.

b. Operands like (extract_vector_elt (v8f32 %vreg0), Constant<7>) need to be
   split into a VEXTRACTX128 and a further extract_vector_elt from the
   resulting 128-bit vector.

c. The extract_vector_elt from b. is lowered into a shuffle to the first
   element and a movss.

Depending on the order in which we legalize the BUILD_VECTOR and its
operands[1], buildFromShuffleMostly may be faced with:

    (v4f32 (BUILD_VECTOR (extract_vector_elt
                           (vector_shuffle<1,u,u,u>
                             (extract_subvector %vreg0, Constant<4>), undef),
                           Constant<0>),
                         (extract_vector_elt
                           (vector_shuffle<2,u,u,u>
                             (extract_subvector %vreg0, Constant<4>), undef),
                           Constant<0>),
                         (extract_vector_elt
                           (vector_shuffle<3,u,u,u>
                             (extract_subvector %vreg0, Constant<4>), undef),
                           Constant<0>),
                         %vreg1))

In order to figure out the underlying vectors and their identity we need to
see through the shuffles.

[1] Note that the order in which operations and their operands are legalized
is only guaranteed in the first iteration of LegalizeDAG.

Fixes <rdar://problem/16296956>

llvm-svn: 206634
2014-04-19 03:44:16 +08:00
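
Step a. of the commit message (split the 256-bit build into two 128-bit halves and recombine them) can be pictured at the IR level. The function below is an illustrative sketch only: `@test1_split` and its value names are hypothetical, it is written by hand rather than taken from the patch, and it models the shape of the split, not the exact DAG nodes or instructions that llc produces.

```llvm
; Hypothetical IR-level picture of the 128-bit split described in step a.
define <8 x float> @test1_split(<8 x float> %a, float %b) {
entry:
  ; Low 128-bit half of the result: a[1], a[2], a[3], a[4].
  %lo = shufflevector <8 x float> %a, <8 x float> undef, <4 x i32> <i32 1, i32 2, i32 3, i32 4>
  ; High 128-bit half of the result: a[5], a[6], a[7], then an undefined lane.
  %hi.shuf = shufflevector <8 x float> %a, <8 x float> undef, <4 x i32> <i32 5, i32 6, i32 7, i32 undef>
  ; Fill the undefined lane with the scalar %b.
  %hi = insertelement <4 x float> %hi.shuf, float %b, i32 3
  ; Concatenate the two halves back into a 256-bit vector.
  %res = shufflevector <4 x float> %lo, <4 x float> %hi, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
  ret <8 x float> %res
}
```

The final concatenating shuffle plays the role that VINSERTX128 plays in the commit message, and the single `insertelement` corresponds to the one `vinsertps` the test expects.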
