Commit Graph

9 Commits

Author SHA1 Message Date
Andrea Di Biagio 7277afeec1 [ConstantFold] Improve the bitcast folding logic for constant vectors.
The constant folder didn't know how to always fold bitcasts of constant integer
vectors. In particular, it was unable to handle the case where a constant vector
had some undef elements, and the resulting (i.e. bitcasted) vector type had more
elements than the original vector type.

Example:
  %cast = bitcast <2 x i64><i64 undef, i64 2> to <4 x i32>

On a little endian target, %cast could have been folded to:
  <4 x i32><i32 undef, i32 undef, i32 2, i32 0>

This patch improves the folding logic by teaching how to correctly propagate
undef elements in the folded vector.

Differential Revision: https://reviews.llvm.org/D24301

llvm-svn: 281343
2016-09-13 14:50:47 +00:00
Andrea Di Biagio f3fd316223 [InstCombine][SSE4a] Fix assertion failure in the insertq/insertqi combining logic.
This fixes a similar issue to the one already fixed by r280804
(revieved in D24256). Revision 280804 fixed the problem with unsafe dyn_casts
in the extrq/extrqi combining logic. However, it turns out that even the
insertq/insertqi logic was affected by the same problem.

llvm-svn: 280807
2016-09-07 12:47:53 +00:00
Andrea Di Biagio 8df5b9cf48 [InstCombine][SSE4a] Fix assertion failure caused by unsafe dyn_casts on the operands of extrq/extrqi intrinsic calls.
This patch fixes an assertion failure caused by unsafe dynamic casts on the
constant operands of sse4a intrinsic calls to extrq/extrqi

The combine logic that simplifies sse4a extrq/extrqi intrinsic calls currently
checks if the input operands are constants. Internally, that logic relies on
dyn_casts of values returned by calls to method Constant::getAggregateElement.
However, method getAggregateElemet may return nullptr if the constant element
cannot be retrieved. So, all the dyn_casts can potentially fail. This is what
happens for example if a constexpr value is passed in input to an extrq/extrqi
intrinsic call.

This patch fixes the problem by using a dyn_cast_or_null (instead of a simple
dyn_cast) on the result of each call to Constant::getAggregateElement.

Added reproducible test cases to x86-sse4a.ll.

Differential Revision: https://reviews.llvm.org/D24256

llvm-svn: 280804
2016-09-07 12:03:03 +00:00
Simon Pilgrim 74b3bfdf71 [InstCombine][X86] Regenerate SSE combine tests as part of setup for D17490
Regenerated with utils/update_test_checks.py 

llvm-svn: 266731
2016-04-19 12:56:46 +00:00
Simon Pilgrim 216b1bf5ed [InstCombine] SSE4A constant folding and conversion to shuffles.
This patch improves support for combining the SSE4A EXTRQ(I) and INSERTQ(I) intrinsics:

1 - Converts INSERTQ/EXTRQ calls to INSERTQI/EXTRQI if the 'bit index' and 'length' operands are constant
2 - Converts INSERTQI/EXTRQI calls to shufflevector if the bit index/length are both byte aligned (we can already lower shuffles to INSERTQI/EXTRQI if its useful)
3 - Constant folding support
4 - Add zeroinitializer handling

Differential Revision: http://reviews.llvm.org/D13348

llvm-svn: 250609
2015-10-17 11:40:05 +00:00
Simon Pilgrim 3c2b30f8ba [InstCombine][SSE4A] Remove broken INSERTQI range combining optimization
As discussed in D13348 - the INSERTQI range combining code is wrong in that it confuses the insertion bit index with an extraction bit index.

The remaining legal combines are very unlikely (especially once we've converted to shuffles in D13348) so I'm removing the optimization.

llvm-svn: 250160
2015-10-13 14:48:54 +00:00
Simon Pilgrim aa0ec7f45c [InstCombine] Tidied up SSE4A tests.
First stage of bugfix discussed in D13348

llvm-svn: 250121
2015-10-12 23:07:06 +00:00
Simon Pilgrim 61116ddc7b [InstCombine] Added vector demanded bits support for SSE4A EXTRQ/INSERTQ instructions
The SSE4A instructions EXTRQ/INSERTQ only use the lower 64-bits (or less) for many of their input vector operands and all of them have undefined upper 64-bits results.

Differential Revision: http://reviews.llvm.org/D12680

llvm-svn: 247934
2015-09-17 20:32:45 +00:00
Simon Pilgrim 357b85c926 [InstCombine] Split off SSE4a tests.
These aren't vector demanded bits tests. More tests to follow.

llvm-svn: 243223
2015-07-25 17:14:01 +00:00