forked from OSchip/llvm-project
5a48aef3f0
This is the last step needed to fix PR33325: https://bugs.llvm.org/show_bug.cgi?id=33325 We're trading branch and compares for loads and logic ops. This makes the code smaller and hopefully faster in most cases. The 24-byte test shows an interesting construct: we load the trailing scalar elements into vector registers and generate the same pcmpeq+movmsk code that we expected for a pair of full vector elements (see the 32- and 64-byte tests). Differential Revision: https://reviews.llvm.org/D41714 llvm-svn: 321934 |
||
---|---|---|
.. | ||
X86 |