Commit 5459e16b authored by Matthew Brost, committed by Lucas De Marchi

drm/xe: Do not wedge device on killed exec queues



When a user closes an exec queue or interrupts an app with Ctrl-C,
this does not warrant wedging the device in mode 2.

Avoid this by skipping the wedge check for killed exec queues in
the TDR and LR exec queue cleanup worker.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/20250624174103.2707941-1-matthew.brost@intel.com


(cherry picked from commit 5a2f117a)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
parent d008fc65
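
The guard described in the commit message can be modelled in isolation. The following is a minimal sketch, not driver code: `struct model_queue`, `model_hint_wedged()`, `model_cleanup_wedged()` and the `wedge_checks` counter are hypothetical stand-ins for `struct xe_exec_queue`, `guc_submit_hint_wedged()` and the cleanup paths, and the stub is assumed to always report a wedge so that the effect of the guard is visible.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct xe_exec_queue; only the killed state is modelled. */
struct model_queue {
	bool killed;
};

/* Counts how often the modelled wedge check runs. */
static int wedge_checks;

/*
 * Hypothetical stand-in for guc_submit_hint_wedged(). Assumption: it
 * always reports a wedge here, purely for illustration.
 */
static bool model_hint_wedged(void)
{
	wedge_checks++;
	return true;
}

/*
 * Mirrors the guard added by the patch: default to false and consult
 * the wedge hint only when the queue was not killed.
 */
static bool model_cleanup_wedged(const struct model_queue *q)
{
	bool wedged = false;

	if (!q->killed)
		wedged = model_hint_wedged();

	return wedged;
}
```

In this model a queue with `killed` set never reaches the wedge check, so `wedged` keeps its `false` initializer; that is also why the patch has to initialize the variable at its declaration.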
+6 −4
@@ -891,11 +891,12 @@ static void xe_guc_exec_queue_lr_cleanup(struct work_struct *w)
	struct xe_exec_queue *q = ge->q;
	struct xe_guc *guc = exec_queue_to_guc(q);
	struct xe_gpu_scheduler *sched = &ge->sched;
-	bool wedged;
+	bool wedged = false;

	xe_gt_assert(guc_to_gt(guc), xe_exec_queue_is_lr(q));
	trace_xe_exec_queue_lr_cleanup(q);

+	if (!exec_queue_killed(q))
+		wedged = guc_submit_hint_wedged(exec_queue_to_guc(q));

	/* Kill the run_job / process_msg entry points */
@@ -1070,7 +1071,7 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
	int err = -ETIME;
	pid_t pid = -1;
	int i = 0;
-	bool wedged, skip_timeout_check;
+	bool wedged = false, skip_timeout_check;

	/*
	 * TDR has fired before free job worker. Common if exec queue
@@ -1116,6 +1117,7 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
	 * doesn't work for SRIOV. For now assuming timeouts in wedged mode are
	 * genuine timeouts.
	 */
+	if (!exec_queue_killed(q))
+		wedged = guc_submit_hint_wedged(exec_queue_to_guc(q));

	/* Engine state now stable, disable scheduling to check timestamp */