llvm-project/llvm/test/CodeGen/ARM/dagcombine-anyexttozeroext.ll

; RUN: llc -mtriple armv7 %s -o - | FileCheck %s

; CHECK-LABEL: f:
define float @f(<4 x i16>* nocapture %in) {
  ; CHECK: vld1
  ; CHECK: vmovl.u16
  ; CHECK-NOT: vand
  %1 = load <4 x i16>, <4 x i16>* %in
  ; CHECK: vcvt.f32.u32
  %2 = uitofp <4 x i16> %1 to <4 x float>
  %3 = extractelement <4 x float> %2, i32 0
  %4 = extractelement <4 x float> %2, i32 1
  %5 = extractelement <4 x float> %2, i32 2

  ; CHECK: vadd.f32
  %6 = fadd float %3, %4
  %7 = fadd float %6, %5

  ret float %7
}

; CHECK-LABEL: g:
define float @g(<4 x i16>* nocapture %in) {
  ; CHECK: vldr
  %1 = load <4 x i16>, <4 x i16>* %in

  ; For now we're generating a vmov.16 and a uxth instruction.
  ; The uxth is redundant, and we should be able to extend without
  ; having to generate cross-domain copies. Once we can do this
  ; we should modify the checks below.

  ; CHECK: uxth
  %2 = extractelement <4 x i16> %1, i32 0
  ; CHECK: vcvt.f32.u32
  %3 = uitofp i16 %2 to float
  ret float %3
}
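
; A minimal sketch, not part of the original test; @g_explicit_zext is a
; hypothetical name. It spells out the zero extension that uitofp on an i16
; implies, which is presumably why the extracted lane has its upper 16 bits
; cleared (the uxth above) before the unsigned conversion. No FileCheck
; directives are attached because the exact instruction sequence is not
; asserted here.
define float @g_explicit_zext(<4 x i16>* nocapture %in) {
  %v = load <4 x i16>, <4 x i16>* %in
  %e = extractelement <4 x i16> %v, i32 0
  ; Widen the 16-bit lane to 32 bits with zeros in the upper half.
  %z = zext i16 %e to i32
  ; Unsigned conversion of the widened value; same result as uitofp on the i16.
  %f = uitofp i32 %z to float
  ret float %f
}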

; Make sure we generate zext from <4 x i8> to <4 x i32>.

; CHECK-LABEL: h:
; CHECK: vld1.32
; CHECK: vmovl.u8 q8, d16
; CHECK: vmovl.u16 q8, d16
; CHECK: vmov r0, r1, d16
; CHECK: vmov r2, r3, d17
define <4 x i32> @h(<4 x i8> *%in) {
  %1 = load <4 x i8>, <4 x i8>* %in, align 4
  %2 = extractelement <4 x i8> %1, i32 0
  %3 = zext i8 %2 to i32
  %4 = insertelement <4 x i32> undef, i32 %3, i32 0
  %5 = extractelement <4 x i8> %1, i32 1
  %6 = zext i8 %5 to i32
  %7 = insertelement <4 x i32> %4, i32 %6, i32 1
  %8 = extractelement <4 x i8> %1, i32 2
  %9 = zext i8 %8 to i32
  %10 = insertelement <4 x i32> %7, i32 %9, i32 2
  %11 = extractelement <4 x i8> %1, i32 3
  %12 = zext i8 %11 to i32
  %13 = insertelement <4 x i32> %10, i32 %12, i32 3
  ret <4 x i32> %13
}
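
; A minimal sketch, not part of the original test; @h_vector_form is a
; hypothetical name. It writes the computation in @h as a single vector zext
; instead of four extract/zext/insert triples. The assumption is that this is
; the form the DAG combine recovers from @h before the two vmovl steps are
; selected. No FileCheck directives are attached because the output is not
; asserted here.
define <4 x i32> @h_vector_form(<4 x i8>* %in) {
  %v = load <4 x i8>, <4 x i8>* %in, align 4
  ; One vector zero extension covers all four lanes.
  %z = zext <4 x i8> %v to <4 x i32>
  ret <4 x i32> %z
}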
Blame history, oldest revision first. Each entry gives the commit message and the lines it is blamed for.

llvm-svn: 154090 (2012-04-05 18:01:12 +08:00)
An oversight when applying the patches for r150956 and r150957 to a vanilla tree meant I forgot to svn add these testcases. Noticed while investigating PR12274!
Lines: `; RUN: llc -mtriple armv7 %s -o - | FileCheck %s`; in `@f`, the `define` line, `; CHECK: vmovl.u16`, `; CHECK-NOT: vand`, and everything from `; CHECK: vcvt.f32.u32` to the closing `}`; in `@g`, the `define` line, `; CHECK: vldr`, and everything from `%2 = extractelement <4 x i16> %1, i32 0` to the closing `}`.

llvm-svn: 186280 (2013-07-14 14:24:09 +08:00)
Mass update to CodeGen tests to use CHECK-LABEL for labels corresponding to function definitions for more informative error messages. No functionality change and all updated tests passed locally. This update was done with the following bash script:

    find test/CodeGen -name "*.ll" | \
    while read NAME; do
      echo "$NAME"
      if ! grep -q "^; *RUN: *llc.*debug" $NAME; then
        TEMP=`mktemp -t temp`
        cp $NAME $TEMP
        sed -n "s/^define [^@]*@\([A-Za-z0-9_]*\)(.*$/\1/p" < $NAME | \
        while read FUNC; do
          sed -i '' "s/;\(.*\)\([A-Za-z0-9_-]*\):\( *\)$FUNC: *\$/;\1\2-LABEL:\3$FUNC:/g" $TEMP
        done
        sed -i '' "s/;\(.*\)-LABEL-LABEL:/;\1-LABEL:/" $TEMP
        sed -i '' "s/;\(.*\)-NEXT-LABEL:/;\1-NEXT:/" $TEMP
        sed -i '' "s/;\(.*\)-NOT-LABEL:/;\1-NOT:/" $TEMP
        sed -i '' "s/;\(.*\)-DAG-LABEL:/;\1-DAG:/" $TEMP
        mv $TEMP $NAME
      fi
    done

Lines: `; CHECK-LABEL: f:`.

llvm-svn: 230794 (2015-02-28 05:17:42 +08:00)
[opaque pointer type] Add textual IR support for explicit type parameter to load instruction. Essentially the same as the GEP change in r230786. A similar migration script can be used to update test cases, though a few more test case improvements/changes were required this time around (r229269-r229278):

    import fileinput
    import sys
    import re

    pat = re.compile(r"((?:=|:|^)\s*load (?:atomic )?(?:volatile )?(.*?))(| addrspace\(\d+\) *)\*($| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$)")

    for line in sys.stdin:
      sys.stdout.write(re.sub(pat, r"\1, \2\3*\4", line))

Reviewers: rafael, dexonsmith, grosser
Differential Revision: http://reviews.llvm.org/D7649
Lines: `%1 = load <4 x i16>, <4 x i16>* %in` in both `@f` and `@g`.

llvm-svn: 231396 (2015-03-06 03:37:53 +08:00)
[ARM] Enable vector extload combine for legal types. This commit enables forming vector extloads for ARM. It only does so for legal types, and when we can't fold the extension in a wide/long form of the user instruction. Enabling it for larger types isn't as good an idea on ARM as it is on X86, because:
- we pretend that extloads are legal, but end up generating vld+vmov
- we have instructions like vld {dN, dM}, which can't be generated when we "manually expand" extloads to vld+vmov.
For legal types, the combine doesn't fire that often: in the integration tests only in a big endian testcase, where it removes a pointless AND.
Related to rdar://19723053
Differential Revision: http://reviews.llvm.org/D7423
Lines: `; CHECK: vld1`.

llvm-svn: 263935 (2016-03-21 19:43:46 +08:00)
[DAGCombine] Catch the case where extract_vector_elt can cause an any_ext while processing AND SDNodes.
Summary: extract_vector_elt can cause an implicit any_ext if the types don't match. When processing the following pattern:

    (and (extract_vector_elt (load ([non_ext|any_ext|zero_ext] V))), c)

DAGCombine was ignoring the possible extend, and sometimes removing the AND even though it was required to maintain some of the bits in the result to 0, resulting in a miscompile. This change fixes the issue by limiting the transformation only to cases where the extract_vector_elt doesn't perform the implicit extend.
Reviewers: t.p.northover, jmolloy
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D18247
Lines: `; CHECK-LABEL: g:`; the comment block in `@g` starting `; For now we're generating a vmov.16 and a uxth instruction.` and the `; CHECK: uxth` that follows it; `; CHECK-LABEL: h:` and `; CHECK: vld1.32`; the whole of `@h` from its `define` line to the closing `}`.

llvm-svn: 329509 (2018-04-08 03:09:50 +08:00)
[DAGCombiner] Add a combine to turn a build vector of zero extends of extract vector elts into a vector zero extend and possibly an extract subvector.
Lines: the `; Make sure we generate zext ...` comment above `@h`, and the four checks from `; CHECK: vmovl.u8 q8, d16` to `; CHECK: vmov r2, r3, d17`.