forked from OSchip/llvm-project
fea731a4aa
As noted in the code comment, transforming this in the other direction might require a separate transform here in CGP given the block-at-a-time DAG constraint.

Besides that theoretical motivation, there are 2 practical motivations for the subtract-of-cmps form:

1. The codegen for both x86 and PPC is better for this IR (though PPC could be better still). There is discussion about canonicalizing IR to the select form (http://lists.llvm.org/pipermail/llvm-dev/2017-July/114885.html), so we probably need to add DAG transforms for those patterns anyway, but this improves the memcmp output without waiting for that step.

2. If we allow vector-sized chunks for the load and compare, x86 is better prepared to convert that to optimal code when using subtract-of-cmps, so another prerequisite patch is avoided if we choose to enable that.

Differential Revision: https://reviews.llvm.org/D34904

llvm-svn: 309597
||
---|---|---|
catchpad-phi-cast.ll | ||
computedgoto.ll | ||
cttz-ctlz.ll | ||
extend-sink-hoist.ll | ||
fcmp-sinking.ll | ||
lit.local.cfg | ||
memcmp.ll | ||
memset_chk-simplify-nobuiltin.ll | ||
pr27536.ll | ||
select.ll | ||
sink-addrmode.ll | ||
sink-addrspacecast.ll | ||
widen_switch.ll | ||
x86-shuffle-sink.ll |
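
To make the two forms named in the commit message concrete, here is a minimal C sketch of how a three-way comparison result (negative, zero, or positive, as memcmp returns) can be produced either way. It assumes that "subtract-of-cmps" means subtracting two zero-extended unsigned comparison bits and that "select form" means choosing among -1, 0, and 1 with conditionals. The function names are hypothetical and do not come from the LLVM sources; the patch itself operates on LLVM IR, not C.

```c
#include <stdint.h>

/* Select form (illustrative): choose -1, 0, or 1 through nested
   conditionals, which typically map to select instructions in IR. */
static int cmp_select_form(uint32_t a, uint32_t b) {
    return a < b ? -1 : (a > b ? 1 : 0);
}

/* Subtract-of-cmps form (illustrative): each unsigned comparison
   yields 0 or 1, so their difference is -1, 0, or 1. */
static int cmp_sub_of_cmps_form(uint32_t a, uint32_t b) {
    return (int)(a > b) - (int)(a < b);
}
```

Both functions return the same value for every input pair; they differ only in the shape of the computation handed to the backend, which is the distinction the commit message draws.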