drm/amdgpu/kfd: remove is_hws_hang and is_resetting

is_hws_hang and is_resetting serve pretty much the same purpose, and
both duplicate the work of the reset_domain lock, so just check that
lock directly instead. This also eliminates a few bugs listed below and
gets rid of dqm->ops.pre_reset.

kfd_hws_hang did not need to avoid scheduling another reset. If the
ongoing reset decided to skip the GPU reset, we have a bad time;
otherwise the extra reset gets cancelled anyway.

remove_queue_mes forgot to check the is_resetting flag, unlike the
pre-MES path unmap_queue_cpsch, so it did not correctly block hw access
during reset.

Signed-off-by: Yunxiang Li <Yunxiang.Li@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Yunxiang Li
2024-05-24 13:46:50 -04:00
committed by Alex Deucher
parent 5c0a1cdd17
commit 1802b042a3
7 changed files with 45 additions and 59 deletions

@@ -936,7 +936,6 @@ int kgd2kfd_pre_reset(struct kfd_dev *kfd,
 	for (i = 0; i < kfd->num_nodes; i++) {
 		node = kfd->nodes[i];
 		kfd_smi_event_update_gpu_reset(node, false, reset_context);
-		node->dqm->ops.pre_reset(node->dqm);
 	}
 	kgd2kfd_suspend(kfd, false);