forked from OSchip/llvm-project
fea731a4aa
As noted in the code comment, transforming this in the other direction might require a separate transform here in CGP given the block-at-a-time DAG constraint.

Besides that theoretical motivation, there are 2 practical motivations for the subtract-of-cmps form:

1. The codegen for both x86 and PPC is better for this IR (though PPC could be better still). There is discussion about canonicalizing IR to the select form ( http://lists.llvm.org/pipermail/llvm-dev/2017-July/114885.html ), so we probably need to add DAG transforms for those patterns anyway, but this improves the memcmp output without waiting for that step.

2. If we allow vector-sized chunks for the load and compare, x86 is better prepared to convert that to optimal code when using subtract-of-cmps, so another prerequisite patch is avoided if we choose to enable that.

Differential Revision: https://reviews.llvm.org/D34904

llvm-svn: 309597
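For readers unfamiliar with the two forms named in the commit message, the following is a minimal C++ sketch of what "subtract-of-cmps" and "select" compute. It is not taken from the LLVM source. The function names are hypothetical, and it assumes the two chunks have already been loaded as unsigned integers in an order where an integer comparison agrees with memcmp's byte order.

```cpp
#include <cstdint>

// Hypothetical helper names, for illustration only.

// Subtract-of-cmps form: (a > b) - (a < b), with each comparison
// widened to an int, yields -1, 0, or 1.
constexpr int cmpSubOfCmps(std::uint32_t a, std::uint32_t b) {
  return static_cast<int>(a > b) - static_cast<int>(a < b);
}

// Select form: the same result expressed as nested selects.
constexpr int cmpSelect(std::uint32_t a, std::uint32_t b) {
  return a < b ? -1 : (a > b ? 1 : 0);
}

// Compile-time checks that both forms agree on the three possible outcomes.
static_assert(cmpSubOfCmps(1, 2) == -1 && cmpSelect(1, 2) == -1, "less");
static_assert(cmpSubOfCmps(2, 1) == 1 && cmpSelect(2, 1) == 1, "greater");
static_assert(cmpSubOfCmps(7, 7) == 0 && cmpSelect(7, 7) == 0, "equal");
```

Both functions return the same value for every input pair; the commit message is about which of the two shapes the backends currently lower to better machine code.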
AArch64
AMDGPU
ARM
NVPTX
X86
2008-11-24-RAUW-Self.ll
basic.ll
bitreverse-hang.ll
builtin-condition.ll
crash-on-large-allocas.ll
dom-tree.ll
invariant.group.ll
nonintegral.ll
overflow-intrinsics.ll
section-samplepgo.ll
section.ll
skip-merging-case-block.ll
split-indirect-loop.ll
statepoint-relocate.ll