Move the X86 VSELECT->UADDSAT fold to DAGCombiner - there's nothing target specific about these folds.
The SSE42 test diffs are relatively benign - its avoiding an extra constant load in exchange for an extra xor operation - there are extra register moves, which is annoying as all those operations should commute them away.
Differential Revision: https://reviews.llvm.org/D91876
Summary:
When doing the conversion: MachineInst -> MCInst, we should ignore the
implicit operands, it will expose more opportunity for InstiAlias.
Reviewed By: steven.zhang
Differential Revision: https://reviews.llvm.org/D77118
Summary:
xxspltib/vspltisb are 3 cycle PM instructions,
xxleqv is 2 cycle ALU instruction.
We should use xxleqv to set all one vectors.
Reviewers: hfinkel, nemanjai, steven.zhang
Subscribers: hiraditya, kbarton, MaskRay, shchenz, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65529
llvm-svn: 369006
Use the PPC vector min/max instructions for computing the corresponding
operation as these should be faster than the compare/select sequences
we currently emit.
Differential revision: https://reviews.llvm.org/D47332
llvm-svn: 362759
There are 2 changes visible here:
1. There's no reason to limit this transform based on number
of condition registers. That diff allows PPC to produce
slightly better (dot-instructions should be generally good)
code.
Note: someone that cares about PPC codegen might want to
look closer at that output because it seems like we could
still improve this.
2. We (probably?) should not bother trying to form uaddo (or
other overflow ops) when there's no target support for such
an op. This goes beyond checking whether the op is expanded
because both PPC and AArch64 show better codegen for standard
types regardless of whether the op is legal/custom.
llvm-svn: 353001