Simon Pilgrim 7e6606f4f1 [X86][SSE] Add general memory folding for (V)INSERTPS instruction
This patch improves the memory folding of the inserted float element for the (V)INSERTPS instruction.

The existing implementation lives in the DAGCombiner and relies on narrowing a whole vector load into a scalar load (which is then converted into a vector) to (hopefully) allow folding to occur later on. Not only has this proven problematic for debug builds, it also prevents other memory folds (notably stack reloads) from happening.

This patch removes the old implementation and moves the folding code to the X86 foldMemoryOperand handler. A new private 'special case' function, foldMemoryOperandCustom, has been added to deal with memory folding of instructions that can't just use the lookup tables; (V)INSERTPS is the first of several that could be handled this way.

It also tweaks the memory operand folding code with an additional pointer offset that allows existing memory addresses to be modified, in this case to convert the vector address to the explicit address of the scalar element that will be inserted.

Unlike the previous implementation, we now set the insertion source index to zero. Although this is ignored for the (V)INSERTPSrm version, anything that relied on shuffle decodes (such as unfolding of insertps loads) was incorrectly calculating the source address. I've added a test for this at insertps-unfold-load-bug.ll.
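
The address adjustment and immediate rewrite described above can be illustrated with a small standalone sketch. It is not the code from the patch: the struct `FoldedInsertPS` and the function `foldInsertPSImm` are hypothetical names used only for illustration. It assumes the documented INSERTPS immediate layout (bits 7:6 select the source element, bits 5:4 the destination element, bits 3:0 the zero mask) and 4-byte float elements.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type for this sketch; not part of LLVM.
struct FoldedInsertPS {
  unsigned PtrOffset; // byte offset to add to the vector's memory address
  uint8_t NewImm;     // immediate to use for the memory-operand form
};

// Splits an INSERTPS immediate into the pieces needed when the source
// vector is replaced by a scalar memory operand.
FoldedInsertPS foldInsertPSImm(unsigned Imm) {
  assert(Imm <= 0xFF && "INSERTPS takes an 8-bit immediate");
  unsigned ZMask = Imm & 0xF;         // bits 3:0, zero mask
  unsigned DstIdx = (Imm >> 4) & 0x3; // bits 5:4, destination element
  unsigned SrcIdx = (Imm >> 6) & 0x3; // bits 7:6, source element

  // The selected float sits SrcIdx * 4 bytes into the vector, so the
  // folded load reads that scalar address directly.
  unsigned PtrOffset = SrcIdx * 4;

  // Clear the source index: the memory form loads a single scalar.
  uint8_t NewImm = static_cast<uint8_t>((DstIdx << 4) | ZMask);
  return {PtrOffset, NewImm};
}
```

For instance, an immediate of 0xB5 (source element 2, destination element 3, zero mask 0x5) would give a pointer offset of 8 bytes and a new immediate of 0x35 under these assumptions.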

Differential Revision: http://reviews.llvm.org/D13988

llvm-svn: 252074
2015-11-04 20:48:09 +00:00
clang Fixed a link. 2015-11-04 19:42:17 +00:00
clang-tools-extra Improve modernize-make-unique matcher. 2015-11-04 10:27:51 +00:00
compiler-rt Asan: utility function to determine first wrongly poisoned byte in 2015-11-04 19:56:03 +00:00
debuginfo-tests New round of fixes for "Always compile debuginfo-tests for the host triple" 2014-10-18 23:47:59 +00:00
libclc integer: remove explicit casts from _MIN definitions 2015-10-06 19:12:12 +00:00
libcxx Make reverse() call iter_swap like the standard says, instead of calling swap directly. No real change. 2015-11-02 21:34:25 +00:00
libcxxabi Fix LIBCXXABI_HAS_NO_THREADS configuration. 2015-10-14 19:21:38 +00:00
libunwind Add FreeBSD _Unwind_Ptr typedef 2015-10-16 19:40:09 +00:00
lld Fix Clang-tidy modernize-use-override warnings, other minor fixes. 2015-11-04 02:11:57 +00:00
lldb Add "zero_memory" option to IRMemoryMap::FindSpace & IRMemoryMap::Malloc. Zero out 2015-11-04 20:32:27 +00:00
llgo [llgo] irgen: always use TargetMachine's data layout 2015-09-25 06:28:14 +00:00
llvm [X86][SSE] Add general memory folding for (V)INSERTPS instruction 2015-11-04 20:48:09 +00:00
openmp [OPENMP] Add dependency to clang/clang-headers etc. for in-tree build of libomp. 2015-11-02 13:43:32 +00:00
polly [FIX] Simplify and correct preloading of base pointer origin 2015-11-03 19:15:33 +00:00