[X86] Don't combine (x86cmp (trunc (movmsk (bitcast X))), 0) if the truncate discards unknown bits.

We have a transform that tries to turn a pmovmskb into movmskps/pd, or
a movmskps into movmskpd. This transform isn't valid if the truncate
discarded bits that might be set by the original movmsk.
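
To illustrate the failure, here is a minimal sketch that models the two
mask instructions with plain integer arithmetic. It is not LLVM code: the
names `Lanes`, `modelPMovMskB`, `modelMovMskPS`, `originalIsZero` and
`combinedIsZero` are hypothetical, and the 8-bit truncate is an assumption
taken from the `testb` in the updated test.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Four 32-bit lanes of a vector compare result. Each lane is assumed to be
// all-ones (true) or zero (false), so the sign bit of every byte in a lane
// equals the sign bit of the lane itself.
using Lanes = std::array<std::uint32_t, 4>;

// Models pmovmskb: one result bit per byte, taken from that byte's sign bit.
std::uint32_t modelPMovMskB(const Lanes &V) {
  std::uint32_t Mask = 0;
  for (unsigned Byte = 0; Byte < 16; ++Byte) {
    std::uint32_t Lane = V[Byte / 4];
    assert(Lane == 0u || Lane == 0xFFFFFFFFu);
    std::uint32_t Sign = (Lane >> (8 * (Byte % 4) + 7)) & 1u;
    Mask |= Sign << Byte;
  }
  return Mask;
}

// Models movmskps: one result bit per 32-bit lane.
std::uint32_t modelMovMskPS(const Lanes &V) {
  std::uint32_t Mask = 0;
  for (unsigned I = 0; I < 4; ++I)
    Mask |= (V[I] >> 31) << I;
  return Mask;
}

// Original pattern: truncate the pmovmskb result to 8 bits, compare with 0.
// Only bytes 0..7, i.e. lanes 0 and 1, are examined.
bool originalIsZero(const Lanes &V) {
  return static_cast<std::uint8_t>(modelPMovMskB(V)) == 0;
}

// Pattern after the invalid combine: compare the whole movmskps result
// with 0, which also examines lanes 2 and 3.
bool combinedIsZero(const Lanes &V) {
  return modelMovMskPS(V) == 0;
}
```

With only lane 2 set, the original pattern sees zero while the combined
pattern does not, so the two disagree.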

We could fix this by inserting an AND after the new movmsk to discard
the equivalent of the truncated bits, but I've left that for a later
patch.
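
A rough sketch of what that AND mask could look like for the compare-with-zero
case is below. This is an assumption about a possible follow-up, not the code
of any actual patch: `keptWideBits` and `maskedIsZero` are hypothetical
helpers, and the parameter names only loosely mirror `NumElts`, `BCNumElts`
and `CmpBits` from the diff.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: given a narrow MOVMSK over NumElts elements whose
// result was truncated to CmpBits bits, return the mask of bits to keep
// from a MOVMSK over WideNumElts wider elements of the same vector.
std::uint32_t keptWideBits(unsigned NumElts, unsigned WideNumElts,
                           unsigned CmpBits) {
  assert(WideNumElts != 0 && NumElts % WideNumElts == 0);
  unsigned Scale = NumElts / WideNumElts; // narrow elements per wide element
  unsigned KeptNarrow = CmpBits < NumElts ? CmpBits : NumElts;
  // Round up: every narrow element of a wide lane carries the same sign
  // bit, so a partially kept lane is still fully described by its one bit.
  unsigned KeptWide = (KeptNarrow + Scale - 1) / Scale;
  return KeptWide >= 32 ? ~0u : (1u << KeptWide) - 1u;
}

// Compare-with-zero on the wide mask after discarding the truncated lanes.
bool maskedIsZero(std::uint32_t WideMask, unsigned NumElts,
                  unsigned WideNumElts, unsigned CmpBits) {
  return (WideMask & keptWideBits(NumElts, WideNumElts, CmpBits)) == 0;
}
```

For the test in this commit (pmovmskb over 16 bytes, 8 bits compared,
rewritten to movmskps over 4 lanes), the mask keeps bits 0 and 1 and drops
bits 2 and 3, matching the removed FIXME in the test file.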

Fixes PR52567.

Differential Revision: https://reviews.llvm.org/D114306
Craig Topper 2021-11-19 19:05:10 -08:00
parent 1cb991e754
commit a4373f6753
2 changed files with 8 additions and 5 deletions


@@ -44004,7 +44004,11 @@ static SDValue combineSetCCMOVMSK(SDValue EFLAGS, X86::CondCode &CC,
   // signbits extend down to all the sub-elements as well.
   // Calling MOVMSK with the wider type, avoiding the bitcast, helps expose
   // potential SimplifyDemandedBits/Elts cases.
-  if (Vec.getOpcode() == ISD::BITCAST) {
+  // If we looked through a truncate that discard bits, we can't do this
+  // transform.
+  // FIXME: We could do this transform for truncates that discarded bits by
+  // inserting an AND mask between the new MOVMSK and the CMP.
+  if (Vec.getOpcode() == ISD::BITCAST && NumElts <= CmpBits) {
     SDValue BC = peekThroughBitcasts(Vec);
     MVT BCVT = BC.getSimpleValueType();
     unsigned BCNumElts = BCVT.getVectorNumElements();


@@ -2,17 +2,16 @@
 ; RUN: llc < %s -mtriple=x86_64-unknown-linux-gnu | FileCheck %s
 ; The and in the test below discards half the bits from vector icmp result.
-; FIXME: The generated code is using a movmskps, but fails to discard bits
-; 2 and 3 before the testl.
+; We use a testb after a pmovmskb to examine only 8 bits.
 define i32 @foo(<4 x float> %arg) {
 ; CHECK-LABEL: foo:
 ; CHECK: # %bb.0: # %bb
 ; CHECK-NEXT: movaps {{.*#+}} xmm1 = [1.00000005E-3,1.00000005E-3,1.00000005E-3,1.00000005E-3]
 ; CHECK-NEXT: cmpltps %xmm0, %xmm1
-; CHECK-NEXT: movmskps %xmm1, %ecx
+; CHECK-NEXT: pmovmskb %xmm1, %ecx
 ; CHECK-NEXT: xorl %eax, %eax
-; CHECK-NEXT: testl %ecx, %ecx
+; CHECK-NEXT: testb %cl, %cl
 ; CHECK-NEXT: sete %al
 ; CHECK-NEXT: retq
 bb: