[openmp] Fix 51647, corrupt bitcode on amdgpu

Patch by @dpalermo

The corrupt bitcode reported in https://bugs.llvm.org/show_bug.cgi?id=51647 seems to be a result of a later pass changing the workfn variable to addrspace(5) (thread private, on the stack). That seems reasonable for an alloca without an address space so it's an open question why that can crash the bitcode reader.

This change puts it in the thread private address space to begin with which means whatever misfired further down the pipeline does not break it. That matches the codegen from clang where stack variables are always annotated (5) and then addrspace cast prior to following use.

This therefore patches around whatever unsuccessfully moved the alloca variable to addrspace(5). That solves the problem of openmp opt producing code that crashes the bitcode reader. It should be possible to create a minimal repro for the underlying bug based on some handwritten IR that uses an alloca in a generic address space.

Reviewed By: ronlieb, jdoerfert, dpalermo-phab

Differential Revision: https://reviews.llvm.org/D109500
This commit is contained in:
dpalermo 2021-09-13 15:21:21 +01:00 committed by Jon Chesterfield
parent 80b60580df
commit d5c049a3f6
3 changed files with 5229 additions and 2566 deletions

View File

@ -2041,7 +2041,8 @@ bool OpenMPOpt::rewriteDeviceCodeStateMachine() {
UndefValue::get(Int8Ty), F->getName() + ".ID");
for (Use *U : ToBeReplacedStateMachineUses)
U->set(ConstantExpr::getBitCast(ID, U->get()->getType()));
U->set(ConstantExpr::getPointerBitCastOrAddrSpaceCast(
ID, U->get()->getType()));
++NumOpenMPParallelRegionsReplacedInGPUStateMachine;
@ -3455,10 +3456,14 @@ struct AAKernelInfoFunction : AAKernelInfo {
IsWorker->setDebugLoc(DLoc);
BranchInst::Create(StateMachineBeginBB, UserCodeEntryBB, IsWorker, InitBB);
Module &M = *Kernel->getParent();
// Create local storage for the work function pointer.
const DataLayout &DL = M.getDataLayout();
Type *VoidPtrTy = Type::getInt8PtrTy(Ctx);
AllocaInst *WorkFnAI = new AllocaInst(VoidPtrTy, 0, "worker.work_fn.addr",
&Kernel->getEntryBlock().front());
Instruction *WorkFnAI =
new AllocaInst(VoidPtrTy, DL.getAllocaAddrSpace(), nullptr,
"worker.work_fn.addr", &Kernel->getEntryBlock().front());
WorkFnAI->setDebugLoc(DLoc);
auto &OMPInfoCache = static_cast<OMPInformationCache &>(A.getInfoCache());
@ -3471,13 +3476,23 @@ struct AAKernelInfoFunction : AAKernelInfo {
Value *Ident = KernelInitCB->getArgOperand(0);
Value *GTid = KernelInitCB;
Module &M = *Kernel->getParent();
FunctionCallee BarrierFn =
OMPInfoCache.OMPBuilder.getOrCreateRuntimeFunction(
M, OMPRTL___kmpc_barrier_simple_spmd);
CallInst::Create(BarrierFn, {Ident, GTid}, "", StateMachineBeginBB)
->setDebugLoc(DLoc);
if (WorkFnAI->getType()->getPointerAddressSpace() !=
(unsigned int)AddressSpace::Generic) {
WorkFnAI = new AddrSpaceCastInst(
WorkFnAI,
PointerType::getWithSamePointeeType(
cast<PointerType>(WorkFnAI->getType()),
(unsigned int)AddressSpace::Generic),
WorkFnAI->getName() + ".generic", StateMachineBeginBB);
WorkFnAI->setDebugLoc(DLoc);
}
FunctionCallee KernelParallelFn =
OMPInfoCache.OMPBuilder.getOrCreateRuntimeFunction(
M, OMPRTL___kmpc_kernel_parallel);

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff