drm/amdgpu: Fix fence signaling race condition in userqueue

This commit fixes a potential race condition in the userqueue fence
signaling mechanism by replacing dma_fence_is_signaled_locked() with
dma_fence_is_signaled().

The issue occurred because:
1. dma_fence_is_signaled_locked() should only be used when holding
   the fence's individual lock, not just the fence list lock
2. Using the locked variant without the proper fence lock could lead
   to double-signaling scenarios:
   - Hardware completion signals the fence
   - Software path also tries to signal the same fence

By using dma_fence_is_signaled() instead, we properly handle the
locking hierarchy and avoid the race condition while still maintaining
the necessary synchronization through the fence_list_lock.

v2: drop the comment (Christian)

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This commit is contained in:
Jesse.Zhang
2025-09-24 16:00:06 +08:00
committed by Alex Deucher
parent 45da20e00d
commit b8ae2640f9

View File

@@ -284,7 +284,7 @@ static int amdgpu_userq_fence_create(struct amdgpu_usermode_queue *userq,
/* Check if hardware has already processed the job */
spin_lock_irqsave(&fence_drv->fence_list_lock, flags);
if (!dma_fence_is_signaled_locked(fence))
if (!dma_fence_is_signaled(fence))
list_add_tail(&userq_fence->link, &fence_drv->fences);
else
dma_fence_put(fence);