Commit Graph

5 Commits

Author SHA1 Message Date
Simon Pilgrim f5f291d129 [X86][SSE] Add support for lowering shuffles to PACKSS/PACKUS
If the upper bits of a truncation shuffle patterns have at least the minimum number of sign/zero bits on their inputs then we can safely use PACKSS/PACKUS as shuffles.

Partial fix for https://bugs.llvm.org/show_bug.cgi?id=34773

Differential Revision: https://reviews.llvm.org/D38472

llvm-svn: 314788
2017-10-03 12:01:31 +00:00
Amjad Aboud 4f97751798 [X86] Generate VZEROUPPER for Skylake-avx512.
VZEROUPPER should not be issued on Knights Landing (KNL), but on Skylake-avx512 it should be.

Differential Revision: https://reviews.llvm.org/D29874

llvm-svn: 296859
2017-03-03 09:03:24 +00:00
Simon Pilgrim c76ea4b638 [X86] Attempt to pre-truncate arithmetic operations if useful
In some cases its more efficient to combine TRUNC( BINOP( X, Y ) ) --> BINOP( TRUNC( X ), TRUNC( Y ) ) if the binop is legal for the truncated types.

This is true for vector integer multiplication (especially vXi64), as well as ADD/AND/XOR/OR in cases where we only need to truncate one of the inputs at runtime (e.g. a duplicated input or an one use constant we can fold).

Further work could be done here - scalar cases (especially i64) could often benefit (if we avoid partial registers etc.), other opcodes, and better analysis of when truncating the inputs reduces costs.

I have considered implementing this for all targets within the DAGCombiner but wasn't sure we could devise a suitable cost model system that would give us the range we need.

Differential Revision: https://reviews.llvm.org/D28219

llvm-svn: 290947
2017-01-04 08:05:42 +00:00
Simon Pilgrim 712374169d [X86][AVX512] Regenerate test - missing shuffle comments
llvm-svn: 290770
2016-12-30 22:31:33 +00:00
Igor Breger 2ba64ab9ae [AVX512] Implement missing patterns for any_extend load lowering.
Differential Revision: http://reviews.llvm.org/D20513

llvm-svn: 270357
2016-05-22 10:21:04 +00:00