We should have equally good shuffle options for v8i32 with VLX. This was spotted during my attempts to remove 512-bit vectors from SKX.
We still use 512-bits for v16i1, v32i1, and v64i1. I'm less sure we can handle those well with narrower vectors. i32 and i64 element sizes get the best shuffle support.
llvm-svn: 321291
We may need to widen the vector to make the shifts legal, but if we do that we need to make sure we shift left/right after accounting for the new size. If not we can't guarantee we are shifting in zeros.
The test cases affected actually show cases where we should move the shifts all together, but that's another problem.
llvm-svn: 320248
As part of the unification of the debug format and the MIR format, print
MBB references as '%bb.5'.
The MIR printer prints the IR name of a MBB only for block definitions.
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)->getNumber\(\)/" << printMBBReference(*\1)/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)\.getNumber\(\)/" << printMBBReference(\1)/g'
* find . \( -name "*.txt" -o -name "*.s" -o -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#([0-9]+)/%bb.\1/g'
* grep -nr 'BB#' and fix
Differential Revision: https://reviews.llvm.org/D40422
llvm-svn: 319665
Added TESTM and TESTNM to the list of instructions that already zeroing unused upper bits
and does not need the redundant shift left and shift right instructions afterwards.
Added a pattern for TESTM and TESTNM in iselLowering, so now icmp(neq,and(X,Y), 0) goes folds into TESTM
and icmp(eq,and(X,Y), 0) goes folds into TESTNM
This commit is a preparation for lowering the test and testn X86 intrinsics to IR.
Differential Revision: https://reviews.llvm.org/D38732
llvm-svn: 317465
VZEROUPPER should not be issued on Knights Landing (KNL), but on Skylake-avx512 it should be.
Differential Revision: https://reviews.llvm.org/D29874
llvm-svn: 296859
Previously this test case fired an assertion in getNode because we tried to create an insert_subvector with both input types the same size and the index pointing to half the vector width.
llvm-svn: 293446
1. Use shuffle to insert element i1 into vector. The previous implementation was incorrect ( dest_bit OR src_bit , it doesn't clear the bit if src_bit=0 )
2. Improve shuffle i1 vector, use CVT2MASK if supported instead TRUNCATE.
Differential Revision: http://reviews.llvm.org/D23347
llvm-svn: 278623
Enable truncate 128/256bit packed byte/word with AVX512BW but without AVX512VL, use 512bit instructions.
Differential Revision: http://reviews.llvm.org/D16531
llvm-svn: 259044
Fix TRUNCATE lowering vector to vector i1, use LSB and not MSB.
Implement VPMOVB/W/D/Q2M intrinsic.
Differential Revision: http://reviews.llvm.org/D15675
llvm-svn: 256470
ISERT_SUBVECTOR for i1 vectors may be done with shifts, when we insert into the lower part, or into the upper part, on into all-zero vector.
CONCAT_VECTORS uses ISERT_SUBVECTOR.
Differential Revision: http://reviews.llvm.org/D14815
llvm-svn: 253819