[llvm-mca] Correctly handle zero-latency stores that consume pipeline resources.

This fixes PR37293.

Some scheduling classes have no write latency entries but still consume
processor resources. We don't want to treat those instructions as zero-latency
instructions: they still have to be issued to the underlying pipelines, so they
still consume resource cycles.

This is likely a regression that I accidentally introduced in revision 330807.
Now, if an instruction has a non-empty set of write processor resources, we
conservatively treat it as a normal (i.e. non-zero-latency) instruction.
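
The rule described above can be sketched in isolation as follows. This is only
an illustration, not the llvm-mca code: SketchInstrDesc, isZeroLatencyOld,
isZeroLatency and checkExamples are hypothetical names, and the resource list
is reduced to a plain vector of resource indices.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified stand-in for an instruction descriptor. Only the
// two properties relevant to this patch are modeled.
struct SketchInstrDesc {
  unsigned MaxLatency = 0;         // 0 when there are no write latency entries
  std::vector<unsigned> Resources; // indices of consumed processor resources
};

// Old rule (assumed shape): latency alone decides.
inline bool isZeroLatencyOld(const SketchInstrDesc &D) {
  return D.MaxLatency == 0;
}

// New rule: an instruction is zero-latency only if it has no latency AND
// consumes no processor resources.
inline bool isZeroLatency(const SketchInstrDesc &D) {
  return D.MaxLatency == 0 && D.Resources.empty();
}

// Small self-check contrasting the two rules.
inline void checkExamples() {
  // A store with no write latency that still occupies two pipelines.
  SketchInstrDesc Store;
  Store.Resources = {4, 5};
  assert(isZeroLatencyOld(Store)); // old rule: wrongly skipped issue
  assert(!isZeroLatency(Store));   // new rule: issued like a normal instruction

  // An instruction with neither latency nor resources stays zero-latency.
  SketchInstrDesc Eliminated;
  assert(isZeroLatency(Eliminated));
}
```

Under the new rule the store in the sketch is issued to the pipelines and its
resource cycles are accounted for, which is the behavior the added test
expects for stp on Falkor.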

llvm-svn: 331193
Andrea Di Biagio 2018-04-30 15:55:04 +00:00
parent 79e5cd2fc5
commit e047d3529b
3 changed files with 48 additions and 2 deletions

@@ -0,0 +1,44 @@
# NOTE: Assertions have been autogenerated by utils/update_mca_test_checks.py
# RUN: llvm-mca -march=aarch64 -mcpu=falkor -iterations=2 < %s | FileCheck %s
stp d0, d1, [x0]
# CHECK: Iterations: 2
# CHECK-NEXT: Instructions: 2
# CHECK-NEXT: Total Cycles: 4
# CHECK-NEXT: Dispatch Width: 8
# CHECK-NEXT: IPC: 0.50
# CHECK: Instruction Info:
# CHECK-NEXT: [1]: #uOps
# CHECK-NEXT: [2]: Latency
# CHECK-NEXT: [3]: RThroughput
# CHECK-NEXT: [4]: MayLoad
# CHECK-NEXT: [5]: MayStore
# CHECK-NEXT: [6]: HasSideEffects
# CHECK: [1] [2] [3] [4] [5] [6] Instructions:
# CHECK-NEXT: 2 0 1.00 * stp d0, d1, [x0]
# CHECK: Resources:
# CHECK-NEXT: [0] - FalkorUnitB
# CHECK-NEXT: [1] - FalkorUnitGTOV
# CHECK-NEXT: [2] - FalkorUnitLD
# CHECK-NEXT: [3] - FalkorUnitSD
# CHECK-NEXT: [4] - FalkorUnitST
# CHECK-NEXT: [5] - FalkorUnitVSD
# CHECK-NEXT: [6] - FalkorUnitVTOG
# CHECK-NEXT: [7] - FalkorUnitVX
# CHECK-NEXT: [8] - FalkorUnitVY
# CHECK-NEXT: [9] - FalkorUnitX
# CHECK-NEXT: [10] - FalkorUnitY
# CHECK-NEXT: [11] - FalkorUnitZ
# CHECK: Resource pressure per iteration:
# CHECK-NEXT: [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11]
# CHECK-NEXT: - - - - 1.00 1.00 - - - - - -
# CHECK: Resource pressure by instruction:
# CHECK-NEXT: [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] Instructions:
# CHECK-NEXT: - - - - 1.00 1.00 - - - - - - stp d0, d1, [x0]

@@ -411,7 +411,8 @@ void DispatchUnit::dispatch(unsigned IID, Instruction *NewInst,
// instruction. The assumption is that a zero-latency instruction doesn't
// need to be issued to the scheduler for execution. More importantly, it
// doesn't have to wait on the register input operands.
- if (NewInst->getDesc().MaxLatency)
+ const InstrDesc &Desc = NewInst->getDesc();
+ if (Desc.MaxLatency || !Desc.Resources.empty())
for (std::unique_ptr<ReadState> &RS : NewInst->getUses())
updateRAWDependencies(*RS, STI);

@@ -258,12 +258,13 @@ void Scheduler::scheduleInstruction(unsigned Idx, Instruction &MCIS) {
// targets, zero-idiom instructions (for example: a xor that clears the value
// of a register) are treated specially, and are often eliminated at register
// renaming stage.
+ bool IsZeroLatency = !Desc.MaxLatency && Desc.Resources.empty();
// Instructions that use an in-order dispatch/issue processor resource must be
// issued immediately to the pipeline(s). Any other in-order buffered
// resources (i.e. BufferSize=1) are consumed.
- if (Desc.MaxLatency && !Resources->mustIssueImmediately(Desc)) {
+ if (!IsZeroLatency && !Resources->mustIssueImmediately(Desc)) {
DEBUG(dbgs() << "[SCHEDULER] Adding " << Idx << " to the Ready Queue\n");
ReadyQueue[Idx] = &MCIS;
return;