llvm-project

Sanjay Patel ae945e7927 [InstCombine] transform more extract/insert pairs into shuffles (PR2109)

This is an extension of the shuffle combining from r203229:
http://reviews.llvm.org/rL203229

The idea is to widen a short input vector with undef elements so the existing shuffle transform for extract/insert can kick in.

The motivation is to finally solve PR2109:
https://llvm.org/bugs/show_bug.cgi?id=2109

For that example, the IR becomes:

  %1 = bitcast <2 x i32>* %P to <2 x float>*
  %ld1 = load <2 x float>, <2 x float>* %1, align 8
  %2 = shufflevector <2 x float> %ld1, <2 x float> undef, <4 x i32> <i32 0, i32 1, i32 undef, i32 undef>
  %i2 = shufflevector <4 x float> %A, <4 x float> %2, <4 x i32> <i32 0, i32 1, i32 4, i32 5>
  ret <4 x float> %i2

And x86 SSE output improves from:

  movq (%rdi), %xmm1 ## xmm1 = mem[0],zero
  movdqa %xmm1, %xmm2
  shufps $229, %xmm2, %xmm2 ## xmm2 = xmm2[1,1,2,3]
  shufps $48, %xmm0, %xmm1 ## xmm1 = xmm1[0,0],xmm0[3,0]
  shufps $132, %xmm1, %xmm0 ## xmm0 = xmm0[0,1],xmm1[0,2]
  shufps $32, %xmm0, %xmm2 ## xmm2 = xmm2[0,0],xmm0[2,0]
  shufps $36, %xmm2, %xmm0 ## xmm0 = xmm0[0,1],xmm2[2,0]
  retq

To the almost optimal:

  movhpd (%rdi), %xmm0

Note: There's a tension in the existing transform related to generating arbitrary shufflevector masks. We avoid that in other places in InstCombine because we're scared that codegen can't handle strange masks, but it looks like we're ok with producing those here. I purposely chose weird insert/extract indexes for the regression tests to see the effect in these cases. For PowerPC+Altivec, AArch64, and X86+SSE/AVX, I think the codegen is equal or better for these examples.

Differential Revision: http://reviews.llvm.org/D15096

llvm-svn: 256394
2015-12-24 21:17:56 +00:00
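
The widening idea in the commit message above can be checked with a small model. The sketch below is not LLVM code: it is a plain-Python approximation of the lane semantics of `shufflevector`, `extractelement`, and `insertelement`, and every function name in it is a hypothetical helper made up for illustration. The pre-transform extract/insert sequence is an assumption inferred from the final mask `<0, 1, 4, 5>` in the message (lanes 0 and 1 of the loaded vector going into lanes 2 and 3 of `%A`); the commit message itself does not quote it.

```python
# Illustrative model only; these helpers are hypothetical, not LLVM APIs.
UNDEF = None  # stands in for an undef lane


def shufflevector(a, b, mask):
    """Pick lanes from the concatenation of a and b; an UNDEF mask entry yields UNDEF."""
    assert len(a) == len(b), "both shuffle inputs must have the same width"
    both = list(a) + list(b)
    return [UNDEF if m is UNDEF else both[m] for m in mask]


def extractelement(v, i):
    return v[i]


def insertelement(v, x, i):
    out = list(v)
    out[i] = x
    return out


def insert_extract_pairs(A, ld1):
    # Assumed original pattern: copy each lane of the 2-wide load into the high half of A.
    i1 = insertelement(A, extractelement(ld1, 0), 2)
    return insertelement(i1, extractelement(ld1, 1), 3)


def widened_shuffles(A, ld1):
    # Step 1: widen the 2-wide vector to 4 lanes, padding with undef
    # (mask <0, 1, undef, undef> from the commit message).
    widened = shufflevector(ld1, [UNDEF, UNDEF], [0, 1, UNDEF, UNDEF])
    # Step 2: a single 4-wide shuffle selects A[0], A[1], widened[0], widened[1]
    # (mask <0, 1, 4, 5> from the commit message).
    return shufflevector(A, widened, [0, 1, 4, 5])


if __name__ == "__main__":
    A = [1.0, 2.0, 3.0, 4.0]
    ld1 = [9.0, 8.0]
    print(insert_extract_pairs(A, ld1))
    print(widened_shuffles(A, ld1))
```

In this model both functions produce the same four lanes, and the undef padding lanes are never selected by the second mask, which is why widening is harmless here.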
..
Analysis	Add a missing const qualifier on the context instruction. This somehow	2015-12-24 09:08:08 +00:00
AsmParser	Implemented Support of IA interrupt and exception handlers:	2015-12-21 14:07:14 +00:00
Bitcode	Remove overly strict new assert in BitcodeReader.	2015-12-21 15:38:13 +00:00
CodeGen	Use range-based for loops. NFC	2015-12-24 05:20:40 +00:00
DebugInfo	Remove unused constants from TypeTableBuilder.cpp.	2015-12-24 19:15:56 +00:00
ExecutionEngine	Delete APIs that have been deprecated since 2010.	2015-12-19 21:42:07 +00:00
Fuzzer	[libFuzzer] add AFL-style dictionary for C++, remove the old file with tokens	2015-12-22 01:50:51 +00:00
IR	[Function] Properly remove use when clearing personality	2015-12-23 18:27:23 +00:00
IRReader	[ThinLTO] Metadata linking for imported functions	2015-12-17 17:14:09 +00:00
LTO	Rename variables to reflect linker split (NFC)	2015-12-18 19:28:59 +00:00
LibDriver	[Option] Use an ArrayRef to store the Option Infos in OptTable. NFC	2015-10-21 16:30:42 +00:00
LineEditor	…
Linker	Handle empty Subprogram list when linking metadata.	2015-12-22 01:17:19 +00:00
MC	Form reform for MCDwarf.	2015-12-23 01:57:31 +00:00
Object	Handle archives with paths in the names.	2015-12-18 16:07:17 +00:00
Option	Convert Arg, ArgList, and Option to dump() to dbgs() rather than errs().	2015-12-18 18:55:26 +00:00
Passes	[PM] Port StripDeadPrototypes to the new pass manager	2015-10-30 23:28:12 +00:00
ProfileData	[ProfileData] Make helper function static.	2015-12-24 10:03:37 +00:00
Support	[Support] Allow multiple paired calls to {start,stop}Timer()	2015-12-22 17:36:17 +00:00
TableGen	[TblGen] ArrayRefize TGParser. No functional change intended.	2015-10-24 12:46:45 +00:00
Target	[X86][ms-inline asm] Add support for memory operands that include structs	2015-12-24 12:09:51 +00:00
Transforms	[InstCombine] transform more extract/insert pairs into shuffles (PR2109)	2015-12-24 21:17:56 +00:00
CMakeLists.txt	LibDriver, llvm-lib: introduce.	2015-06-09 21:50:22 +00:00
LLVMBuild.txt	Wrap some long lines in LLVMBuild files. NFC	2015-06-12 18:44:57 +00:00
Makefile	LibDriver, llvm-lib: introduce.	2015-06-09 21:50:22 +00:00