The backend tries to use block operations like MVC, NC, OC and XC for
simple scalar operations. For correctness reasons, it rejects any case
in which the regions might partially overlap. However, for performance
reasons, it should also reject cases where the regions might be equal,
since the instruction might then not use the fast path.
This fixes a performance regression seen in bzip2. We may want to limit
the optimisation even more in future, or even remove it entirely, but I'll
try with this for now.
llvm-svn: 191525