Commit dbf670f1 authored by Matthew Brost

drm/xe: Wire devcoredump to LR TDR



Long-running (LR) queues can hang, cause an engine reset, or cause IOMMU
CAT errors. Collect an error capture (devcoredump) when this occurs.

v2:
 - s/queue's/queues (Jonathan)
v4:
 - Fix build (CI)

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241114022522.1951351-8-matthew.brost@intel.com
parent a54b0de7
+6 −1
@@ -896,13 +896,18 @@ static void xe_guc_exec_queue_lr_cleanup(struct work_struct *w)
 					 !exec_queue_pending_disable(q) ||
 					 xe_guc_read_stopped(guc), HZ * 5);
 		if (!ret) {
-			xe_gt_warn(q->gt, "Schedule disable failed to respond\n");
+			xe_gt_warn(q->gt, "Schedule disable failed to respond, guc_id=%d\n",
+				   q->guc->id);
+			xe_devcoredump(q, NULL);
 			xe_sched_submission_start(sched);
 			xe_gt_reset_async(q->gt);
 			return;
 		}
 	}
 
+	if (!exec_queue_killed(q) && !xe_lrc_ring_is_idle(q->lrc[0]))
+		xe_devcoredump(q, NULL);
+
 	xe_sched_submission_start(sched);
 }
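
The two capture points added by this hunk can be summarized as a small decision function. The following is only an illustrative sketch, not driver code: `struct lr_queue_state`, `enum lr_cleanup_action`, and `lr_cleanup_decide()` are hypothetical names invented for this example, and the three booleans stand in for the results of the wait on the schedule-disable response, `exec_queue_killed()`, and `xe_lrc_ring_is_idle()`. It assumes `disable_acked` is true whenever no schedule disable was needed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the state the LR cleanup worker inspects. */
struct lr_queue_state {
	bool disable_acked; /* schedule disable answered before the timeout (or not needed) */
	bool killed;        /* queue was killed on purpose, cf. exec_queue_killed() */
	bool ring_idle;     /* no work left on the ring, cf. xe_lrc_ring_is_idle() */
};

/* Hypothetical summary of what the cleanup path does next. */
enum lr_cleanup_action {
	LR_RESTART_ONLY,         /* restart submission, no capture */
	LR_CAPTURE_AND_RESTART,  /* capture state, then restart submission */
	LR_CAPTURE_AND_GT_RESET, /* capture state, restart submission, reset the GT */
};

static enum lr_cleanup_action lr_cleanup_decide(const struct lr_queue_state *s)
{
	assert(s);

	/* No response to schedule disable: capture before resetting the GT. */
	if (!s->disable_acked)
		return LR_CAPTURE_AND_GT_RESET;

	/* A live queue with work still on its ring ended up here: capture it. */
	if (!s->killed && !s->ring_idle)
		return LR_CAPTURE_AND_RESTART;

	/* Killed queue or idle ring: nothing worth capturing. */
	return LR_RESTART_ONLY;
}
```

Read this way, a capture is skipped only when the queue was killed deliberately or its ring had already drained, which appears to be the intent of the second added condition.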