dag-combine optimization to implement the ext-load efficiently (using shuffles).
For example the type <4 x i8> is stored in memory as i32, but it needs to
find its way into a <4 x i32> register. Previously we scalarized the memory
access, now we use shuffles.
llvm-svn: 139995
maxps and maxpd). This broke the sse41-blend.ll testcase by causing
maxpd to be produced rather than a cmp+blend pair, which is the reason
I tweaked it. Gives a small speedup on doduc with dragonegg when the
GCC vectorizer is used.
llvm-svn: 139986
are declared with load patterns. This fix the crash in PR10941. No testcases,
since a fold is triggered and then converted back to the register form
afterwards.
llvm-svn: 139953
This PR basically reports a problem where a crash in generated code
happened due to %rbp being clobbered:
pushq %rbp
movq %rsp, %rbp
....
vmovmskps %ymm12, %ebp
....
movq %rbp, %rsp
popq %rbp
ret
Since Eric's r123367 commit, the default stack alignment for x86 32-bit
has changed to be 16-bytes. Since then, the MaxStackAlignmentHeuristicPass
hasn't been really used, but with AVX it becomes useful again, since per
ABI compliance we don't always align the stack to 256-bit, but only when
there are 256-bit incoming arguments.
ReserveFP was only used by this pass, but there's no RA target hook that
uses getReserveFP() to check for the presence of FP (since nothing was
triggering the pass to run, the uses of getReserveFP() were removed
through time without being noticed). Change this pass to use
setForceFramePointer, which is properly called by MachineFunction
hasFP method.
The testcase is very big and dependent on RA, not sure if it's worth
adding to test/CodeGen/X86.
llvm-svn: 139939
take into consideration the presence of AVX. This change, together with
the SSEDomainFix enabled for AVX, makes AVX codegen to always (hopefully)
emit the same code as SSE for 128-bit vector ops. I don't
have a testcase for this, but AVX now beats SSE in performance for
128-bit ops in the majority of programas in the llvm testsuite
llvm-svn: 139817
alignment check for 256-bit classes more strict. There're no testcases
but we catch more folding cases for AVX while running single and multi
sources in the llvm testsuite.
Since some 128-bit AVX instructions have different number of operands
than their SSE counterparts, they are placed in different tables.
256-bit AVX instructions should also be added in the table soon. And
there a few more 128-bit versions to handled, which should come in
the following commits.
llvm-svn: 139687
more strict about the alignment checking. This was found by inspection
and I don't have any testcases so far, although the llvm testsuite runs
without any problem.
llvm-svn: 139625
However with this fix it does now.
Basically the operand order for the x86 target specific node
is not the same as the instruction, but since the intrinsic need that
specific order at the instruction definition, just change the order
during legalization. Also, there were some wrong invertions of condition
codes, such as GE => LE, GT => LT, fix that too. Fix PR10907.
llvm-svn: 139528
assert("not implemented for target shuffle node");
to:
assert(0 && "not implemented for target shuffle node");
This causes a test failure in CodeGen/X86/palignr.ll which has
been marked as XFAIL for the time being.
Test failure filed at PR10901.
llvm-svn: 139454
single field (Flags), which is a bitwise OR of items from the TB_*
enum. This makes it easier to add new information in the future.
* Gives every static array an equivalent layout: { RegOp, MemOp, Flags }
* Adds a helper function, AddTableEntry, to avoid duplication of the
insertion code.
* Renames TB_NOT_REVERSABLE to TB_NO_REVERSE.
* Adds TB_NO_FORWARD, which is analogous to TB_NO_REVERSE, except that
it prevents addition of the Reg->Mem entry. (This is going to be used
by Native Client, in the next CL).
Patch by David Meyer
llvm-svn: 139311
in Nadav's r139285 and r139287 commits.
1) Rename vsel.ll to a more descriptive name
2) Change the order of BLEND operands to "Op1, Op2, Cond", this is
necessary because PBLENDVB is already used in different places with
this order, and it was being emitted in the wrong way for vselect
3) Add AVX patterns and tests for the same SSE41 instructions
llvm-svn: 139305