drm/amd/amdkfd: Fix kernel panic when reset failed and been triggered again

In SRIOV configuration, the reset may failed to bring asic back to normal but stop cpsch
already been called, the start_cpsch will not be called since there is no resume in this
case.  When reset been triggered again, driver should avoid to do uninitialization again.

Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This commit is contained in:
shaoyunl 2021-11-14 12:38:18 -05:00 committed by Alex Deucher
parent 33155ce6e1
commit c96cb65989
1 changed files with 5 additions and 0 deletions

View File

@ -1225,6 +1225,11 @@ static int stop_cpsch(struct device_queue_manager *dqm)
bool hanging; bool hanging;
dqm_lock(dqm); dqm_lock(dqm);
if (!dqm->sched_running) {
dqm_unlock(dqm);
return 0;
}
if (!dqm->is_hws_hang) if (!dqm->is_hws_hang)
unmap_queues_cpsch(dqm, KFD_UNMAP_QUEUES_FILTER_ALL_QUEUES, 0); unmap_queues_cpsch(dqm, KFD_UNMAP_QUEUES_FILTER_ALL_QUEUES, 0);
hanging = dqm->is_hws_hang || dqm->is_resetting; hanging = dqm->is_hws_hang || dqm->is_resetting;