Jun Bum Lim
|
34b9bd0435
|
Improve ISel using across lane min/max reduction
In vectorized integer min/max reduction code, the final "reduce" step
is sub-optimal. In AArch64, this change wll combine :
%svn0 = vector_shuffle %0, undef<2,3,u,u>
%smax0 = smax %0, svn0
%svn3 = vector_shuffle %smax0, undef<1,u,u,u>
%sc = setcc %smax0, %svn3, gt
%n0 = extract_vector_elt %sc, #0
%n1 = extract_vector_elt %smax0, #0
%n2 = extract_vector_elt $smax0, #1
%result = select %n0, %n1, n2
becomes :
%1 = smaxv %0
%result = extract_vector_elt %1, 0
This change extends r246790.
llvm-svn: 247575
|
2015-09-14 16:19:52 +00:00 |
Chad Rosier
|
6c36eff1d6
|
[AArch64] Improve ISel using across lane addition reduction.
In vectorized add reduction code, the final "reduce" step is sub-optimal.
This change wll combine :
ext v1.16b, v0.16b, v0.16b, #8
add v0.4s, v1.4s, v0.4s
dup v1.4s, v0.s[1]
add v0.4s, v1.4s, v0.4s
into
addv s0, v0.4s
PR21371
http://reviews.llvm.org/D12325
Patch by Jun Bum Lim <junbuml@codeaurora.org>!
llvm-svn: 246790
|
2015-09-03 18:13:57 +00:00 |