proves to be worth 20% on Ptrdist/ks. Might be related to dependency
breaking support.
2. Added FsMOVAPSrr and FsMOVAPDrr as aliases to MOVAPSrr and MOVAPDrr. These
are used for FR32 / FR64 reg-to-reg copies.
3. Tell reg-allocator to generate MOVSSrm / MOVSDrm and MOVSSmr / MOVSDmr to
spill / restore FsMOVAPSrr and FsMOVAPDrr.
llvm-svn: 26241
This fixes a testcase that nate reduced from spass.
Also included are a couple minor code changes that don't affect the generated
code at all.
llvm-svn: 26235
We do not want to emit "Loop: ... brcond Out; br Loop", as it adds an extra
instruction in the loop. Instead, invert the condition and emit
"Loop: ... br!cond Loop; br Out.
Generalize the fix by moving it from PPCDAGToDAGISel to SelectionDAGLowering.
llvm-svn: 26231
want to copy the files when the .cpp file changes, we want to copy them
to the .cvs versions when the .l/.y file change (like the comments even say).
This avoids having bogus changes show up in diffs.
llvm-svn: 26229
transfer.
According to the Intel P4 Optimization Manual:
Moves that write a portion of a register can introduce unwanted
dependences. The movsd reg, reg instruction writes only the bottom
64 bits of a register, not to all 128 bits. This introduces a dependence on
the preceding instruction that produces the upper 64 bits (even if those
bits are not longer wanted). The dependence inhibits register renaming,
and thereby reduces parallelism.
Not to mention movaps is shorter than movss.
llvm-svn: 26226
Turns them into calls to memset / memcpy if 1) buffer(s) are not DWORD aligned,
2) size is not known to be greater or equal to some minimum value (currently 128).
llvm-svn: 26224
unswitch this loop on 2 before sweating to unswitch on 1/3.
void test4(int N, int i, int C, int*P, int*Q) {
int j;
for (j = 0; j < N; ++j) {
switch (C) { // general unswitching.
default: P[i+j] = 0; break;
case 1: Q[i+j] = 0; break;
case 3: P[i+j] = Q[i+j]; break;
case 2: break; // TRIVIAL UNSWITCH on C==2
}
}
}
llvm-svn: 26223
this for example:
for (j = 0; j < N; ++j) { // trivial unswitch
if (C)
P[i+j] = 0;
}
turning it into the obvious code without bothering to duplicate an empty loop.
llvm-svn: 26220
The ABI specifies that there is a register save area at the bottom of the
stack, which means the actual used pointer needs to be an offset from
the subtracted value.
llvm-svn: 26202
go through, but we do want to know if we're using GCC/ICC since they
share certain funky command line options (for dependency generation
stuff)
llvm-svn: 26198