Simon Pilgrim 483927aefb [x86, CGP] increase memcmp() expansion up to 4 load pairs
It should be a win to expand all small memcmp() calls into scalar ops rather than going out to the system lib. For 32-bit x86, this means almost everything up to 16 bytes. For 64-bit, that doubles because we can do 8-byte loads.
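
To make the size limits concrete: with at most 4 load pairs, a 32-bit target covers 4 x 4 = 16 bytes and a 64-bit target covers 4 x 8 = 32 bytes. The sketch below shows, in plain C, roughly what a scalar expansion of a 16-byte memcmp() into two 8-byte load pairs computes. The names load_be64 and memcmp16_expanded are hypothetical, and this is not the IR that CGP actually emits; it only illustrates the idea under those assumptions.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read 8 bytes as a big-endian integer so that
   integer ordering matches memcmp()'s lexicographic byte ordering.
   On little-endian x86 a compiler would typically use one 8-byte load
   followed by a byte swap instead of this loop. */
static uint64_t load_be64(const unsigned char *p) {
    uint64_t v = 0;
    for (int i = 0; i < 8; ++i)
        v = (v << 8) | p[i];
    return v;
}

/* Sketch (assumption, not LLVM's output): a 16-byte memcmp() expanded
   into two 8-byte load pairs, returning <0, 0, or >0 like memcmp(). */
static int memcmp16_expanded(const void *a, const void *b) {
    const unsigned char *pa = a;
    const unsigned char *pb = b;
    for (int off = 0; off < 16; off += 8) {   /* two load pairs */
        uint64_t x = load_be64(pa + off);
        uint64_t y = load_be64(pb + off);
        if (x != y)
            return x < y ? -1 : 1;            /* first differing chunk decides */
    }
    return 0;
}
```

The same pattern with up to four load pairs gives the 32-byte limit on 64-bit targets; with 4-byte loads it gives the 16-byte limit on 32-bit targets.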

Notes:

    Reduced from 4 to 2 loads for -Os behavior, which might not be optimal in all cases. It's effectively a question of how much we trust the system implementation. Linux and macOS (and Windows I assume, but did not test) have optimized memcmp() code for x86, so it's probably not bad either way? PPC is using 8/4 for defaults on these. We do not expand at all for -Oz.

    There are still potential improvements to make for the CGP expansion IR and/or lowering such as avoiding select-of-constants (D34904) and not doing zexts to the max load type before doing a compare.

    We have special-case SSE/AVX codegen for (memcmp(x, y, 16/32) == 0) that will no longer be produced after this patch. I've shown the experimental justification for that change in PR33329:

https://bugs.llvm.org/show_bug.cgi?id=33329#c12
TLDR: While the vector code is a likely winner, we can't guarantee that it's a winner in all cases on all CPUs, so I'm willing to sacrifice it for the greater good of expanding all small memcmp(). If we want to resurrect that codegen, it can be done by adjusting the CGP params or poking a hole to let those fall through the CGP expansion.
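
For the equality-only case mentioned in the last note, (memcmp(x, y, 16) == 0), a scalar expansion does not need to order the bytes, so it can skip the byte swap. The sketch below is a plain C illustration of that idea using two 8-byte load pairs; the name equal16_scalar is hypothetical, and the code is not the exact output of the CGP expansion.

```c
#include <stdint.h>
#include <string.h>

/* Sketch (assumption, not LLVM's output): scalar form of
   (memcmp(a, b, 16) == 0) using two 8-byte load pairs. */
static int equal16_scalar(const void *a, const void *b) {
    uint64_t a0, a1, b0, b1;
    /* memcpy() with a constant size is the portable way to express an
       unaligned load; compilers typically lower each to one 8-byte load. */
    memcpy(&a0, a, 8);
    memcpy(&a1, (const unsigned char *)a + 8, 8);
    memcpy(&b0, b, 8);
    memcpy(&b1, (const unsigned char *)b + 8, 8);
    /* Only equality matters here, so no byte swap is needed:
       any differing bit survives the XOR and the OR. */
    return ((a0 ^ b0) | (a1 ^ b1)) == 0;
}
```

The vector special case would instead compare all 16 bytes with a single SSE load pair, which is the codegen this patch gives up.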

Committed on behalf of Sanjay Patel

Differential Revision: https://reviews.llvm.org/D35067

llvm-svn: 308322
2017-07-18 15:55:30 +00:00
clang [OPENMP] Generalization of sema analysis of reduction-based clauses, 2017-07-18 15:32:58 +00:00
clang-tools-extra Add autoload cookies for clang-include-fixer lisp functions. 2017-07-18 10:15:07 +00:00
compiler-rt [asan] Remove recent asan tests which expect death in allocator 2017-07-18 01:39:56 +00:00
debuginfo-tests Add a test for PR33166. 2017-05-25 19:33:16 +00:00
libclc generic: add missing get_work_dim include 2017-06-02 15:58:35 +00:00
libcxx Check for _MSC_VER before defining _LIBCPP_MSVCRT 2017-07-17 21:52:31 +00:00
libcxxabi [demangler] Respect try_to_parse_template_args 2017-07-13 19:37:37 +00:00
libunwind [libunwind][CMake] Add install path variable to allow overriding the destination 2017-07-11 01:12:09 +00:00
lld [COFF] Accept discarded relocations in DWARF debug sections 2017-07-18 15:11:05 +00:00
lldb Fix NetBSD/FreeBSD build after r308304 2017-07-18 14:03:47 +00:00
llgo irgen: Create functions instead of global variables for builtin hash and equal algorithms. 2017-06-04 22:11:28 +00:00
llvm [x86, CGP] increase memcmp() expansion up to 4 load pairs 2017-07-18 15:55:30 +00:00
openmp Fix sporadic segfaults in tasking tests. 2017-07-18 11:56:16 +00:00
parallel-libs [Axccel] Remove -Wno-missing-braces in build 2016-12-19 21:34:07 +00:00
polly [ScopInfo] Introduce list of statements in Scop::StmtMap. NFC. 2017-07-18 15:41:49 +00:00