This update was done with the following bash script:
find test/CodeGen -name "*.ll" | \
while read NAME; do
echo "$NAME"
if ! grep -q "^; *RUN: *llc.*debug" $NAME; then
TEMP=`mktemp -t temp`
cp $NAME $TEMP
sed -n "s/^define [^@]*@\([A-Za-z0-9_]*\)(.*$/\1/p" < $NAME | \
while read FUNC; do
sed -i '' "s/;\(.*\)\([A-Za-z0-9_-]*\):\( *\)$FUNC: *\$/;\1\2-LABEL:\3$FUNC:/g" $TEMP
done
sed -i '' "s/;\(.*\)-LABEL-LABEL:/;\1-LABEL:/" $TEMP
sed -i '' "s/;\(.*\)-NEXT-LABEL:/;\1-NEXT:/" $TEMP
sed -i '' "s/;\(.*\)-NOT-LABEL:/;\1-NOT:/" $TEMP
sed -i '' "s/;\(.*\)-DAG-LABEL:/;\1-DAG:/" $TEMP
mv $TEMP $NAME
fi
done
llvm-svn: 186280
this allows for better code generation.
Added a new DAGCombine transformation to convert FMAX and FMIN to FMANC and
FMINC, which are commutative.
For example:
movaps %xmm0, %xmm1
movsd LC(%rip), %xmm0
minsd %xmm1, %xmm0
becomes:
minsd LC(%rip), %xmm0
llvm-svn: 162187
Add more comments and use early returns to reduce nesting in isLoadFoldable.
Also disable folding for V_SET0 to avoid introducing a const pool entry and
a const pool load.
rdar://10554090 and rdar://11873276
llvm-svn: 161207
Machine CSE and other optimizations can remove instructions so folding
is possible at peephole while not possible at ISel.
This patch is a rework of r160919 and was tested on clang self-host on my local
machine.
rdar://10554090 and rdar://11873276
llvm-svn: 161152
Machine CSE and other optimizations can remove instructions so folding
is possible at peephole while not possible at ISel.
rdar://10554090 and rdar://11873276
llvm-svn: 160919
This option has been disabled for a while, and it is going away so I can
clean up the coalescer code.
The tests that required physreg joining to be enabled were almost all of
the form "tiny function with interference between arguments and return
value". Such functions are usually inlined in the real world.
The problem exposed by phys_subreg_coalesce-3.ll is real, but fairly
rare.
llvm-svn: 157027
Like V_SET0, these instructions are expanded by ExpandPostRA to xorps /
vxorps so they can participate in execution domain swizzling.
This also makes the AVX variants redundant.
llvm-svn: 145440
maxps and maxpd). This broke the sse41-blend.ll testcase by causing
maxpd to be produced rather than a cmp+blend pair, which is the reason
I tweaked it. Gives a small speedup on doduc with dragonegg when the
GCC vectorizer is used.
llvm-svn: 139986
-enable-no-nans-fp-math and -enable-no-infs-fp-math. All of the current codegen fp math optimizations only care whether the fp arithmetics arguments and results can never be NaN.
llvm-svn: 108465
necessary to swap the operands to handle NaN and negative zero properly.
Also, reintroduce logic for checking for NaN conditions when forming
SSE min and max instructions, fixed to take into consideration NaNs and
negative zeros. This allows forming min and max instructions in more
cases.
llvm-svn: 97025
SSE min and max instructions. The real thing this code needs to be
concerned about is negative zero.
Update the sse-minmax.ll test accordingly, and add tests for
-enable-unsafe-fp-math mode as well.
llvm-svn: 96775
And fix a bug with the behavior of min/max instructions formed from
fcmp uge comparisons.
Also, use FiniteOnlyFPMath() for this code instead of UnsafeFPMath,
as it is more specific.
llvm-svn: 82466