[PowerPC] Do not use vectors to codegen bswap with Altivec turned off

On Power9 we have efficient codegen for lowering bswap: move the value into a
vector register, byte-reverse it there, and move it back. However, the check
under which we custom-lowered it did not reflect the actual requirements: it
only required that the subtarget implement ISA 3.0, since every compliant
implementation has to provide the vector instructions. Yet kernel builds have
a valid use case for -mno-altivec -mcpu=pwr9 (i.e. do not emit vector code, so
vector registers need not be saved on a context switch). The lowering should
therefore require the vector features it actually uses.
Fixes https://bugs.llvm.org/show_bug.cgi?id=39334

llvm-svn: 347376
Author: Nemanja Ivanovic
Date:   2018-11-21 02:53:50 +00:00
Commit: 5cf902ccd4 (parent 27a5896fe8)

2 changed files with 32 additions and 6 deletions
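For context, here is a minimal reproducer sketch. It is an illustration, not part
of the commit: the file name and exact clang invocation are assumptions, while
the -mcpu=pwr9/-mno-altivec combination is the one named in the commit message.

// bswap_repro.cpp -- hypothetical reproducer (name and build line are assumptions).
// Build sketch:
//   clang++ --target=powerpc64le-linux-gnu -mcpu=pwr9 -mno-altivec -O2 -S bswap_repro.cpp
// Before this fix, the i64 byte swap below was custom-lowered through the
// vector unit (mtvsrdd/xxbrd/mfvsrd) even though -mno-altivec asks for no
// vector code at all.
#include <cstdint>

uint64_t bswap64(uint64_t x) {
  return __builtin_bswap64(x); // selected as ISD::BSWAP on MVT::i64
}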


@@ -323,12 +323,14 @@ PPCTargetLowering::PPCTargetLowering(const PPCTargetMachine &TM,
   // to speed up scalar BSWAP64.
   // CTPOP or CTTZ were introduced in P8/P9 respectively
   setOperationAction(ISD::BSWAP, MVT::i32 , Expand);
-  if (Subtarget.isISA3_0()) {
+  if (Subtarget.hasP9Vector())
     setOperationAction(ISD::BSWAP, MVT::i64 , Custom);
+  else
+    setOperationAction(ISD::BSWAP, MVT::i64 , Expand);
+  if (Subtarget.isISA3_0()) {
     setOperationAction(ISD::CTTZ , MVT::i32 , Legal);
     setOperationAction(ISD::CTTZ , MVT::i64 , Legal);
   } else {
-    setOperationAction(ISD::BSWAP, MVT::i64 , Expand);
     setOperationAction(ISD::CTTZ , MVT::i32 , Expand);
     setOperationAction(ISD::CTTZ , MVT::i64 , Expand);
   }
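With hasP9Vector() false, i64 BSWAP now falls back to the generic expansion,
which uses only general-purpose registers. As a rough illustration (a portable
C++ sketch of what the expansion computes, not the SelectionDAG code itself):
the rotldi/rldimi sequence checked under NO-ALTIVEC in the test below is the
PowerPC rotate-and-insert form of this same byte shuffle.

#include <cstdint>

// Scalar-only byte reversal: what expanding ISD::BSWAP on i64 amounts to,
// written as portable C++ rather than DAG nodes. No vector registers involved.
uint64_t expandBswap64(uint64_t x) {
  uint64_t r = 0;
  for (int i = 0; i < 8; ++i)                      // byte i -> byte 7 - i
    r |= ((x >> (8 * i)) & 0xFF) << (8 * (7 - i));
  return r;
}

Note that CTTZ stays keyed on isISA3_0(): cnttzw/cnttzd are scalar fixed-point
instructions, so they remain available under -mno-altivec.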


@@ -1,11 +1,35 @@
-; RUN: llc -verify-machineinstrs < %s -mtriple=ppc64le-- -mcpu=pwr9 | FileCheck %s
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -verify-machineinstrs < %s -mtriple=powerpc64le-unknown-unknown \
+; RUN:   -mcpu=pwr9 | FileCheck %s
+; RUN: llc -verify-machineinstrs < %s -mtriple=powerpc64le-unknown-unknown \
+; RUN:   -mcpu=pwr9 -mattr=-altivec | FileCheck %s --check-prefix=NO-ALTIVEC
 declare i64 @llvm.bswap.i64(i64)
-; CHECK: mtvsrdd
-; CHECK: xxbrd
-; CHECK: mfvsrd
 
 define i64 @bswap64(i64 %x) {
+; CHECK-LABEL: bswap64:
+; CHECK:       # %bb.0: # %entry
+; CHECK-NEXT:    mtvsrdd 34, 3, 3
+; CHECK-NEXT:    xxbrd 0, 34
+; CHECK-NEXT:    mfvsrd 3, 0
+; CHECK-NEXT:    blr
+;
+; NO-ALTIVEC-LABEL: bswap64:
+; NO-ALTIVEC:       # %bb.0: # %entry
+; NO-ALTIVEC-NEXT:    rotldi 5, 3, 16
+; NO-ALTIVEC-NEXT:    rotldi 4, 3, 8
+; NO-ALTIVEC-NEXT:    rldimi 4, 5, 8, 48
+; NO-ALTIVEC-NEXT:    rotldi 5, 3, 24
+; NO-ALTIVEC-NEXT:    rldimi 4, 5, 16, 40
+; NO-ALTIVEC-NEXT:    rotldi 5, 3, 32
+; NO-ALTIVEC-NEXT:    rldimi 4, 5, 24, 32
+; NO-ALTIVEC-NEXT:    rotldi 5, 3, 48
+; NO-ALTIVEC-NEXT:    rldimi 4, 5, 40, 16
+; NO-ALTIVEC-NEXT:    rotldi 5, 3, 56
+; NO-ALTIVEC-NEXT:    rldimi 4, 5, 48, 8
+; NO-ALTIVEC-NEXT:    rldimi 4, 3, 56, 0
+; NO-ALTIVEC-NEXT:    mr 3, 4
+; NO-ALTIVEC-NEXT:    blr
 entry:
   %0 = call i64 @llvm.bswap.i64(i64 %x)
   ret i64 %0
 }