Sanjay Patel
d5c2d287f9
[X86, AVX] use blends instead of insert128 with index 0
...
Another case of x86-specific shuffle strength reduction:
avoid generating insert*128 instructions with index 0 because
they are slower than their non-lane-changing blend equivalents.
Shuffle lowering already catches most of these cases, but
the zero vector case and some other paths such as in the
modified test in vector-shuffle-256-v32.ll were getting
through.
Differential Revision: http://reviews.llvm.org/D8366
llvm-svn: 232773
2015-03-19 22:29:40 +00:00
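The equivalence behind the commit above can be illustrated with a small model. This is only a sketch of the instruction semantics on 8 x float32 vectors, not LLVM code: the helper names `insert128`, `blend`, and `widen` are hypothetical, and `None` stands in for the undefined upper elements of a widened 128-bit value.

```python
def insert128(dst, src, index):
    """Model vinsertf128: overwrite 128-bit half `index` of an 8 x float32 vector."""
    assert len(dst) == 8 and len(src) == 4 and index in (0, 1)
    out = list(dst)
    out[4 * index:4 * index + 4] = src
    return out


def blend(a, b, mask):
    """Model vblendps: element i comes from b if bit i of mask is set, else from a."""
    assert len(a) == len(b) == 8
    return [b[i] if (mask >> i) & 1 else a[i] for i in range(8)]


def widen(src):
    """Model a 128->256 cast; the upper four elements are undefined (None here)."""
    return list(src) + [None] * 4


dst = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0]
src = [0.0, 1.0, 2.0, 3.0]

# Inserting into the low half (index 0) moves no data across lanes, so a blend
# that takes the low four elements from the widened source gives the same result.
via_insert = insert128(dst, src, 0)
via_blend = blend(dst, widen(src), 0x0F)

# The zero vector case mentioned in the commit message follows the same pattern.
zero = [0.0] * 8
zero_via_insert = insert128(zero, src, 0)
zero_via_blend = blend(zero, widen(src), 0x0F)
```

Because the mask `0x0F` never selects the undefined upper elements, the placeholder values do not leak into the result.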
Sanjay Patel
11ce908e4c
fixed to test feature, not CPU
...
llvm-svn: 232398
2015-03-16 18:24:28 +00:00
Sanjay Patel
a8ec726bb6
add CHECK-LABELs for more reliable testing
...
llvm-svn: 232391
2015-03-16 17:59:07 +00:00
Craig Topper
12b72def4e
Fix VINSERTF128/VEXTRACTF128 to be marked as FP instructions. Allow execution dependency fix pass to convert them to their integer equivalents when AVX2 is enabled.
...
llvm-svn: 145376
2011-11-29 05:37:58 +00:00
Bruno Cardoso Lopes
76bc28bac6
Add patterns to generate copies for extract_subvector instead of
...
using vextractf128. This will reduce the number of issued instructions
for several AVX code sequences.
llvm-svn: 136323
2011-07-28 01:26:50 +00:00
Bruno Cardoso Lopes
eca99c4b5a
Add a few patterns to match allzeros without having to use the fp unit.
...
Take advantage of the fact that the 128-bit vpxor zeros the upper 128 bits, and use it.
This also fixes PR10491.
llvm-svn: 136321
2011-07-28 01:26:43 +00:00
Bruno Cardoso Lopes
14a95bda04
Although we already support this, add testcases for consistency
...
llvm-svn: 135728
2011-07-22 00:15:03 +00:00
Bruno Cardoso Lopes
91eff5140f
Add a DAGCombine for transforming 128->256 casts into a simple
...
vxorps + vinsertf128 pair of instructions
llvm-svn: 135727
2011-07-22 00:15:00 +00:00
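One reading of the vxorps + vinsertf128 pair in the commit above is a widening that leaves the upper half zeroed. That reading is an assumption, since the commit message does not spell out the cast semantics. The sketch below models it on 8 x float32 lists; `zero_extend_128_to_256` is a hypothetical helper, not an LLVM function.

```python
def zero_extend_128_to_256(src):
    """Model the two-instruction sequence on an 8 x float32 register."""
    assert len(src) == 4
    reg = [0.0] * 8   # vxorps reg, reg, reg: clear all 256 bits
    reg[0:4] = src    # vinsertf128 with index 0: write the low 128 bits
    return reg


widened = zero_extend_128_to_256([1.0, 2.0, 3.0, 4.0])
```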