Commit baf4afc5 authored by Philipp Stanner's avatar Philipp Stanner
Browse files

drm/sched: Improve teardown documentation



If jobs are still enqueued in struct drm_gpu_scheduler.pending_list
when drm_sched_fini() gets called, those jobs will be leaked since that
function stops both job-submission and (automatic) job-cleanup. It is,
thus, up to the driver to take care of preventing leaks.

The related function drm_sched_wqueue_stop() also prevents automatic job
cleanup.

Those pitfals are not reflected in the documentation, currently.

Explicitly inform about the leak problem in the docstring of
drm_sched_fini().

Additionally, detail the purpose of drm_sched_wqueue_{start,stop} and
hint at the consequences for automatic cleanup.

Signed-off-by: default avatarPhilipp Stanner <pstanner@redhat.com>
Reviewed-by: default avatarChristian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241105143137.71893-2-pstanner@redhat.com
parent 21c23e4b
Loading
Loading
Loading
Loading
+21 −2
Original line number Diff line number Diff line
@@ -1350,6 +1350,19 @@ EXPORT_SYMBOL(drm_sched_init);
 * @sched: scheduler instance
 *
 * Tears down and cleans up the scheduler.
 *
 * This stops submission of new jobs to the hardware through
 * drm_sched_backend_ops.run_job(). Consequently, drm_sched_backend_ops.free_job()
 * will not be called for all jobs still in drm_gpu_scheduler.pending_list.
 * There is no solution for this currently. Thus, it is up to the driver to make
 * sure that
 *  a) drm_sched_fini() is only called after for all submitted jobs
 *     drm_sched_backend_ops.free_job() has been called or that
 *  b) the jobs for which drm_sched_backend_ops.free_job() has not been called
 *     after drm_sched_fini() ran are freed manually.
 *
 * FIXME: Take care of the above problem and prevent this function from leaking
 * the jobs in drm_gpu_scheduler.pending_list under any circumstances.
 */
void drm_sched_fini(struct drm_gpu_scheduler *sched)
{
@@ -1445,8 +1458,10 @@ EXPORT_SYMBOL(drm_sched_wqueue_ready);

/**
 * drm_sched_wqueue_stop - stop scheduler submission
 *
 * @sched: scheduler instance
 *
 * Stops the scheduler from pulling new jobs from entities. It also stops
 * freeing jobs automatically through drm_sched_backend_ops.free_job().
 */
void drm_sched_wqueue_stop(struct drm_gpu_scheduler *sched)
{
@@ -1458,8 +1473,12 @@ EXPORT_SYMBOL(drm_sched_wqueue_stop);

/**
 * drm_sched_wqueue_start - start scheduler submission
 *
 * @sched: scheduler instance
 *
 * Restarts the scheduler after drm_sched_wqueue_stop() has stopped it.
 *
 * This function is not necessary for 'conventional' startup. The scheduler is
 * fully operational after drm_sched_init() succeeded.
 */
void drm_sched_wqueue_start(struct drm_gpu_scheduler *sched)
{