Transform (x&C)>V into (x&C)!=0 where possible

When the lowest set bit of C is greater than V (as unsigned values),
(x&C) must be greater than V whenever it is nonzero, so the comparison
can be simplified.
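
The claim can be checked exhaustively at a small bit width. The following is a minimal sketch and not part of the patch: it uses plain 8-bit unsigned values instead of LLVM's APInt, and the helper names lowestSetBit and ruleHoldsFor8Bit are made up for illustration.

```cpp
#include <cstdint>

// Isolate the lowest set bit of c (yields 0 when c == 0).
inline unsigned lowestSetBit(unsigned c) { return c & (~c + 1u); }

// Returns true if, for every 8-bit x, C and V where the lowest set bit
// of C exceeds V, (x & C) > V agrees with (x & C) != 0.
inline bool ruleHoldsFor8Bit() {
  for (unsigned c = 1; c < 256; ++c)
    for (unsigned v = 0; v < 256; ++v) {
      if (lowestSetBit(c) <= v)
        continue; // precondition not met, the rule makes no claim here
      for (unsigned x = 0; x < 256; ++x)
        if (((x & c) > v) != ((x & c) != 0))
          return false;
    }
  return true;
}
```

The reasoning behind it: any nonzero value of (x & C) has at least one bit of C set, so it is at least as large as the lowest set bit of C, which in turn is larger than V.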

Although this was suggested in Target/X86/README.txt, it benefits any
architecture with a directly testable form of AND.

Patch by Kevin Schoedel

llvm-svn: 170576
Paul Redmond 2012-12-19 19:47:13 +00:00
parent 8013df71c7
commit 5917f4c715
3 changed files with 27 additions and 37 deletions

@@ -1567,43 +1567,6 @@ The first one is done for all AMDs, Core2, and "Generic"
The second one is done for: Atom, Pentium Pro, all AMDs, Pentium 4, Nocona,
Core 2, and "Generic"
//===---------------------------------------------------------------------===//
Testcase:
int a(int x) { return (x & 127) > 31; }
Current output:
movl 4(%esp), %eax
andl $127, %eax
cmpl $31, %eax
seta %al
movzbl %al, %eax
ret
Ideal output:
xorl %eax, %eax
testl $96, 4(%esp)
setne %al
ret
This should definitely be done in instcombine, canonicalizing the range
condition into a != condition. We get this IR:
define i32 @a(i32 %x) nounwind readnone {
entry:
%0 = and i32 %x, 127 ; <i32> [#uses=1]
%1 = icmp ugt i32 %0, 31 ; <i1> [#uses=1]
%2 = zext i1 %1 to i32 ; <i32> [#uses=1]
ret i32 %2
}
Instcombine prefers to strength reduce relational comparisons to equality
comparisons when possible, this should be another case of that. This could
be handled pretty easily in InstCombiner::visitICmpInstWithInstAndIntCst, but it
looks like InstCombiner::visitICmpInstWithInstAndIntCst should really already
be redesigned to use ComputeMaskedBits and friends.
//===---------------------------------------------------------------------===//
Testcase:
int x(int a) { return (a&0xf0)>>4; }

@@ -1226,6 +1226,16 @@ Instruction *InstCombiner::visitICmpInstWithInstAndIntCst(ICmpInst &ICI,
        ICI.setOperand(0, NewAnd);
        return &ICI;
      }

      // Replace ((X & AndCST) > RHSV) with ((X & AndCST) != 0), if any
      // bit set in (X & AndCST) will produce a result greater than RHSV.
      if (ICI.getPredicate() == ICmpInst::ICMP_UGT) {
        unsigned NTZ = AndCST->getValue().countTrailingZeros();
        if ((NTZ < AndCST->getBitWidth()) &&
            APInt::getOneBitSet(AndCST->getBitWidth(), NTZ).ugt(RHSV))
          return new ICmpInst(ICmpInst::ICMP_NE, LHSI,
                              Constant::getNullValue(RHS->getType()));
      }
    }

    // Try to optimize things like "A[i]&42 == 0" to index computations.
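
For readers without an LLVM build at hand, the condition added in this hunk can be mirrored with plain 32-bit integers. This is a hedged sketch only: countTrailingZeros32 and canFoldUgtToNe are hypothetical stand-ins, not LLVM functions, and the fixed width of 32 is an assumption made for the example.

```cpp
#include <cstdint>

// Count trailing zero bits. Returns 32 when c == 0, which is the case
// the NTZ < BitWidth guard in the patch excludes.
inline unsigned countTrailingZeros32(uint32_t c) {
  unsigned n = 0;
  while (n < 32 && ((c >> n) & 1u) == 0)
    ++n;
  return n;
}

// Hypothetical stand-in for the patched condition: true when
// (x & andCst) >u rhs may be rewritten as (x & andCst) != 0.
inline bool canFoldUgtToNe(uint32_t andCst, uint32_t rhs) {
  unsigned ntz = countTrailingZeros32(andCst);
  // A single bit at position ntz is the smallest nonzero value that
  // (x & andCst) can take; it has to exceed rhs.
  return ntz < 32 && (uint32_t(1) << ntz) > rhs;
}
```

With these helpers, canFoldUgtToNe(96, 31) holds because the lowest set bit of 96 is 32, while canFoldUgtToNe(127, 30) does not because the lowest set bit of 127 is 1.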

@@ -677,3 +677,20 @@ define i1 @test66(i64 %A, i64 %B) {
; CHECK-NEXT: ret i1 true
  ret i1 %cmp
}

; CHECK: @test67
; CHECK: %and = and i32 %x, 96
; CHECK: %cmp = icmp ne i32 %and, 0
define i1 @test67(i32 %x) nounwind uwtable {
  %and = and i32 %x, 127
  %cmp = icmp sgt i32 %and, 31
  ret i1 %cmp
}

; CHECK: @test68
; CHECK: %cmp = icmp ugt i32 %and, 30
define i1 @test68(i32 %x) nounwind uwtable {
  %and = and i32 %x, 127
  %cmp = icmp sgt i32 %and, 30
  ret i1 %cmp
}
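
The two tests differ only in the compared constant, which is easy to misread. In both, the result of the and is non-negative (127 has the sign bit clear), so the signed comparison behaves like an unsigned one. In test67 the low five bits of the mask cannot affect a comparison against 31, which is why the expected mask is 96 rather than 127; that narrowing presumably comes from existing instcombine simplifications and is not part of this diff. The sketch below restates both tests as C++ predicates to check the arithmetic. The function names are invented for illustration.

```cpp
#include <cstdint>

// test67: input comparison and the form the CHECK lines expect.
inline bool test67Input(uint32_t x)    { return (x & 127u) > 31u; }
inline bool test67Expected(uint32_t x) { return (x & 96u) != 0u; }

// test68: input comparison and the rewrite that must NOT be applied.
inline bool test68Input(uint32_t x)     { return (x & 127u) > 30u; }
inline bool test68WrongFold(uint32_t x) { return (x & 127u) != 0u; }

// Only the low 7 bits of x matter, so checking 0..127 is exhaustive.
inline bool test67Agrees() {
  for (uint32_t x = 0; x < 128; ++x)
    if (test67Input(x) != test67Expected(x))
      return false;
  return true;
}
```

For test68, x = 1 is a counterexample to the rewrite: (1 & 127) > 30 is false while (1 & 127) != 0 is true, so the comparison has to remain a ugt.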