llvm-project

Jim Grosbach 7236678687 Legalize: Improve legalization of long vector extends.

When an extend more than doubles the size of the elements (e.g., a zext from v16i8 to v16i32), the normal legalization method of splitting the vectors will run into problems as by the time the destination vector is legal, the source vector is illegal. The end result is the operation often becoming scalarized, with the typical horrible performance. For example, on x86_64, the simple input of:

    define void @bar(<16 x i8> %a, <16 x i32>* %p) nounwind {
      %tmp = zext <16 x i8> %a to <16 x i32>
      store <16 x i32> %tmp, <16 x i32>*%p
      ret void
    }

Generates:

        .section __TEXT,__text,regular,pure_instructions
        .section __TEXT,__const
        .align 5
    LCPI0_0:
        .long 255 ## 0xff
        .long 255 ## 0xff
        .long 255 ## 0xff
        .long 255 ## 0xff
        .long 255 ## 0xff
        .long 255 ## 0xff
        .long 255 ## 0xff
        .long 255 ## 0xff
        .section __TEXT,__text,regular,pure_instructions
        .globl _bar
        .align 4, 0x90
    _bar:
        vpunpckhbw %xmm0, %xmm0, %xmm1
        vpunpckhwd %xmm0, %xmm1, %xmm2
        vpmovzxwd %xmm1, %xmm1
        vinsertf128 $1, %xmm2, %ymm1, %ymm1
        vmovaps LCPI0_0(%rip), %ymm2
        vandps %ymm2, %ymm1, %ymm1
        vpmovzxbw %xmm0, %xmm3
        vpunpckhwd %xmm0, %xmm3, %xmm3
        vpmovzxbd %xmm0, %xmm0
        vinsertf128 $1, %xmm3, %ymm0, %ymm0
        vandps %ymm2, %ymm0, %ymm0
        vmovaps %ymm0, (%rdi)
        vmovaps %ymm1, 32(%rdi)
        vzeroupper
        ret

So instead we can check if there are legal types that enable us to split more cleverly when the input vector is already legal such that we don't turn it into an illegal type. If the extend is such that it's more than doubling the size of the input we check if
 - the number of vector elements is even,
 - the source type is legal,
 - the type of a split source is illegal,
 - the type of an extended (by doubling element size) source is legal, and
 - the type of that extended source when split is legal.

If the conditions are met, instead of just splitting both the destination and the source types, we create an extend that only goes up one "step" (doubling the element width), and then continue legalizing the rest of the operation normally. The result is that this operates as a new, more efficient, termination condition for the loop of "split the operation until the destination type is legal."

With this change, the above example now compiles to:

    _bar:
        vpxor %xmm1, %xmm1, %xmm1
        vpunpcklbw %xmm1, %xmm0, %xmm2
        vpunpckhwd %xmm1, %xmm2, %xmm3
        vpunpcklwd %xmm1, %xmm2, %xmm2
        vinsertf128 $1, %xmm3, %ymm2, %ymm2
        vpunpckhbw %xmm1, %xmm0, %xmm0
        vpunpckhwd %xmm1, %xmm0, %xmm3
        vpunpcklwd %xmm1, %xmm0, %xmm0
        vinsertf128 $1, %xmm3, %ymm0, %ymm0
        vmovaps %ymm0, 32(%rdi)
        vmovaps %ymm2, (%rdi)
        vzeroupper
        ret

This generalizes a custom lowering that was added a while back to the ARM backend. That lowering is no longer necessary, and is removed. The testcases for it, however, provide excellent ARM tests for this change and so remain.

rdar://14735100
llvm-svn: 193727

2013-10-31 00:20:48 +00:00
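The five conditions listed in the commit message can be sketched as a standalone predicate. This is a toy model and not the code from the commit: `VecTy`, `LegalSet`, `shouldExtendOneStepFirst`, and `toyAvxLegalTypes` are hypothetical names invented for illustration, and the legal-type table is an assumed AVX-like set, not one queried from a real target.

```cpp
#include <cassert>
#include <set>
#include <tuple>

// Toy model of a vector type: NumElts elements, each EltBits wide.
// (Hypothetical; LLVM itself uses EVT/MVT for this.)
struct VecTy {
  unsigned NumElts;
  unsigned EltBits;
  bool operator<(const VecTy &O) const {
    return std::tie(NumElts, EltBits) < std::tie(O.NumElts, O.EltBits);
  }
};

// Hypothetical stand-in for a target's table of legal vector types.
using LegalSet = std::set<VecTy>;

static bool isLegal(const LegalSet &L, VecTy T) { return L.count(T) != 0; }
static VecTy splitInHalf(VecTy T) { return {T.NumElts / 2, T.EltBits}; }
static VecTy doubleEltWidth(VecTy T) { return {T.NumElts, T.EltBits * 2}; }

// True when an extend from Src to Dst should first be extended by a single
// "step" (doubling the element width) instead of splitting source and
// destination together. Mirrors the conditions in the commit message.
bool shouldExtendOneStepFirst(const LegalSet &L, VecTy Src, VecTy Dst) {
  if (Dst.EltBits <= 2 * Src.EltBits) return false; // not more than doubling
  if (Src.NumElts % 2 != 0) return false;           // element count must be even
  if (!isLegal(L, Src)) return false;               // source type is legal
  if (isLegal(L, splitInHalf(Src))) return false;   // split source is illegal
  VecTy Wide = doubleEltWidth(Src);
  if (!isLegal(L, Wide)) return false;              // one-step extend is legal
  return isLegal(L, splitInHalf(Wide));             // and so is its split
}

// Assumed AVX-like set of legal integer vector types (illustrative only).
LegalSet toyAvxLegalTypes() {
  return {{16, 8}, {8, 16}, {4, 32}, {32, 8}, {16, 16}, {8, 32}};
}

// Small self-check against the example from the commit message.
void checkExamples() {
  LegalSet L = toyAvxLegalTypes();
  assert(shouldExtendOneStepFirst(L, {16, 8}, {16, 32}));  // v16i8 -> v16i32
  assert(!shouldExtendOneStepFirst(L, {16, 8}, {16, 16})); // only doubles
}
```

Under the assumed table, the v16i8 to v16i32 case from the commit message qualifies because v8i8 is not legal while v16i16 and v8i16 are, so the extend goes to v16i16 first and ordinary splitting takes over from there.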
Analysis	SCEV: Make the final add of an inbounds GEP nuw if we know that the index is positive.	2013-10-28 07:30:06 +00:00
AsmParser	Revert r193251 : Use address-taken to disambiguate global variable and indirect memops.	2013-10-27 03:08:44 +00:00
Bitcode	Revert r193251 : Use address-taken to disambiguate global variable and indirect memops.	2013-10-27 03:08:44 +00:00
CodeGen	Legalize: Improve legalization of long vector extends.	2013-10-31 00:20:48 +00:00
DebugInfo	DWARF parser: properly handle DW_FORM_ref_sig8 and fix Windows build.	2013-10-29 16:32:19 +00:00
ExecutionEngine	The FIXME was indeed fixed in the linker, comment removed.	2013-10-25 12:01:53 +00:00
IR	Add calls to doInitialization() and doFinalization() in verifyFunction()	2013-10-30 22:37:51 +00:00
IRReader	Add 'const' qualifiers to static const char* variables.	2013-07-16 01:17:10 +00:00
LTO	Move getSymbol to TargetLoweringObjectFile.	2013-10-29 17:28:26 +00:00
Linker	Add a 'deleteModule' method to the Linker class.	2013-10-16 08:59:57 +00:00
MC	Move the STT_FILE symbols out of the normal symbol table processing for	2013-10-29 01:06:17 +00:00
Object	Support for microMIPS jump instructions	2013-10-29 16:38:59 +00:00
Option	Fix another mistake in r190442.	2013-09-10 23:22:56 +00:00
Support	Add {start,end}with_lower methods to StringRef.	2013-10-30 18:32:26 +00:00
TableGen	Add an error check for a typo I accidentally made in a td file that caused an assert to fire.	2013-08-20 04:22:09 +00:00
Target	Legalize: Improve legalization of long vector extends.	2013-10-31 00:20:48 +00:00
Transforms	Teach scalarrepl about address spaces	2013-10-30 22:54:58 +00:00
CMakeLists.txt	Move LTO support library to a component, allowing it to be tested	2013-09-24 23:52:22 +00:00
LLVMBuild.txt	Move LTO support library to a component, allowing it to be tested	2013-09-24 23:52:22 +00:00
Makefile	Reformat Makefile. No other changes.	2013-10-30 04:03:03 +00:00