[AMDGPU] Recognize x & ((1 << y) - 1) pattern.

Summary:
As a followup for D48007.

Since we already handle `x << (bitwidth - y) >> (bitwidth - y)` pattern,
which does not have ub for both the edge cases (`y == 0`, `y == bitwidth`),
i think also handling a pattern that is ub for `y == bitwidth` should be fine.

Reviewers: nhaehnle, bogner, tstellar, arsenm

Reviewed By: arsenm

Subscribers: arsenm, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Tags: #amdgpu

Differential Revision: https://reviews.llvm.org/D48010

llvm-svn: 334816
This commit is contained in:
Roman Lebedev 2018-06-15 09:56:39 +00:00
parent aa8587d1fc
commit 9c17dad8f2
2 changed files with 22 additions and 39 deletions

View File

@ -126,6 +126,7 @@ def or_oneuse : HasOneUseBinOp<or>;
def xor_oneuse : HasOneUseBinOp<xor>;
} // Properties = [SDNPCommutative, SDNPAssociative]
def add_oneuse : HasOneUseBinOp<add>;
def sub_oneuse : HasOneUseBinOp<sub>;
def srl_oneuse : HasOneUseBinOp<srl>;
@ -682,6 +683,12 @@ multiclass BFEPattern <Instruction UBFE, Instruction SBFE, Instruction MOV> {
(UBFE $src, $rshift, (MOV (i32 (IMMPopCount $mask))))
>;
// x & ((1 << y) - 1)
def : AMDGPUPat <
(and i32:$src, (add_oneuse (shl_oneuse 1, i32:$width), -1)),
(UBFE $src, (i32 0), $width)
>;
// x & (-1 >> (bitwidth - y))
def : AMDGPUPat <
(and i32:$src, (srl_oneuse -1, (sub 32, i32:$width))),

View File

@ -17,19 +17,11 @@
; ---------------------------------------------------------------------------- ;
define i32 @bzhi32_a0(i32 %val, i32 %numlowbits) nounwind {
; SI-LABEL: bzhi32_a0:
; SI: ; %bb.0:
; SI-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
; SI-NEXT: v_bfm_b32_e64 v1, v1, 0
; SI-NEXT: v_and_b32_e32 v0, v1, v0
; SI-NEXT: s_setpc_b64 s[30:31]
;
; VI-LABEL: bzhi32_a0:
; VI: ; %bb.0:
; VI-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
; VI-NEXT: v_bfm_b32 v1, v1, 0
; VI-NEXT: v_and_b32_e32 v0, v1, v0
; VI-NEXT: s_setpc_b64 s[30:31]
; GCN-LABEL: bzhi32_a0:
; GCN: ; %bb.0:
; GCN-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
; GCN-NEXT: v_bfe_u32 v0, v0, 0, v1
; GCN-NEXT: s_setpc_b64 s[30:31]
%onebit = shl i32 1, %numlowbits
%mask = add nsw i32 %onebit, -1
%masked = and i32 %mask, %val
@ -37,19 +29,11 @@ define i32 @bzhi32_a0(i32 %val, i32 %numlowbits) nounwind {
}
define i32 @bzhi32_a1_indexzext(i32 %val, i8 zeroext %numlowbits) nounwind {
; SI-LABEL: bzhi32_a1_indexzext:
; SI: ; %bb.0:
; SI-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
; SI-NEXT: v_bfm_b32_e64 v1, v1, 0
; SI-NEXT: v_and_b32_e32 v0, v1, v0
; SI-NEXT: s_setpc_b64 s[30:31]
;
; VI-LABEL: bzhi32_a1_indexzext:
; VI: ; %bb.0:
; VI-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
; VI-NEXT: v_bfm_b32 v1, v1, 0
; VI-NEXT: v_and_b32_e32 v0, v1, v0
; VI-NEXT: s_setpc_b64 s[30:31]
; GCN-LABEL: bzhi32_a1_indexzext:
; GCN: ; %bb.0:
; GCN-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
; GCN-NEXT: v_bfe_u32 v0, v0, 0, v1
; GCN-NEXT: s_setpc_b64 s[30:31]
%conv = zext i8 %numlowbits to i32
%onebit = shl i32 1, %conv
%mask = add nsw i32 %onebit, -1
@ -58,19 +42,11 @@ define i32 @bzhi32_a1_indexzext(i32 %val, i8 zeroext %numlowbits) nounwind {
}
define i32 @bzhi32_a4_commutative(i32 %val, i32 %numlowbits) nounwind {
; SI-LABEL: bzhi32_a4_commutative:
; SI: ; %bb.0:
; SI-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
; SI-NEXT: v_bfm_b32_e64 v1, v1, 0
; SI-NEXT: v_and_b32_e32 v0, v0, v1
; SI-NEXT: s_setpc_b64 s[30:31]
;
; VI-LABEL: bzhi32_a4_commutative:
; VI: ; %bb.0:
; VI-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
; VI-NEXT: v_bfm_b32 v1, v1, 0
; VI-NEXT: v_and_b32_e32 v0, v0, v1
; VI-NEXT: s_setpc_b64 s[30:31]
; GCN-LABEL: bzhi32_a4_commutative:
; GCN: ; %bb.0:
; GCN-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
; GCN-NEXT: v_bfe_u32 v0, v0, 0, v1
; GCN-NEXT: s_setpc_b64 s[30:31]
%onebit = shl i32 1, %numlowbits
%mask = add nsw i32 %onebit, -1
%masked = and i32 %val, %mask ; swapped order