Commit 6a23e7b4 authored by Mario Limonciello (AMD), committed by Alex Deucher

drm/amd: Clean up kfd node on surprise disconnect

When an eGPU is unplugged the KFD topology should also be destroyed
for that GPU. This never happens because the fini_sw callbacks never
get to run. Run them manually before calling amdgpu_device_ip_fini_early()
when a device has already been disconnected.

This location is intentionally chosen to make sure that the kfd locking
refcount doesn't get incremented unintentionally.

Cc: kent.russell@amd.com
Closes: https://community.frame.work/t/amd-egpu-on-linux/8691/33


Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
Reviewed-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
parent 17de4726
+8 −0
@@ -4920,6 +4920,14 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev)

	amdgpu_ttm_set_buffer_funcs_status(adev, false);

	/*
	 * device went through surprise hotplug; we need to destroy topology
	 * before ip_fini_early to prevent kfd locking refcount issues by calling
	 * amdgpu_amdkfd_suspend()
	 */
	if (drm_dev_is_unplugged(adev_to_drm(adev)))
		amdgpu_amdkfd_device_fini_sw(adev);

	amdgpu_device_ip_fini_early(adev);

	amdgpu_irq_fini_hw(adev);
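
The ordering argument in the commit message can be illustrated with a small standalone model. This is only a sketch, not the driver code: every `toy_*` name is hypothetical, and it assumes that the suspend path reached from the early IP teardown bumps the KFD lock count only while a KFD node still exists. Under that assumption, destroying the node first on an unplugged device leaves the count untouched.

```c
#include <stddef.h>
#include <stdbool.h>

/* Toy stand-ins; none of these are the real amdgpu/KFD structures. */
struct toy_kfd {
	int id;
};

struct toy_adev {
	struct toy_kfd *kfd;	/* NULL once the KFD node has been destroyed */
	bool unplugged;		/* models drm_dev_is_unplugged() */
};

/* Models the "kfd locking refcount" mentioned in the commit message. */
static int toy_kfd_locked;

/* Models amdgpu_amdkfd_suspend(): assumed to be a no-op without a node. */
static void toy_kfd_suspend(struct toy_adev *adev)
{
	if (adev->kfd)
		toy_kfd_locked++;
}

/* Models amdgpu_amdkfd_device_fini_sw(): drops the node and its topology. */
static void toy_kfd_fini_sw(struct toy_adev *adev)
{
	adev->kfd = NULL;
}

/* Models amdgpu_device_ip_fini_early(), assumed to reach the suspend path. */
static void toy_ip_fini_early(struct toy_adev *adev)
{
	toy_kfd_suspend(adev);
}

/* Models the patched ordering in amdgpu_device_fini_hw(). */
static void toy_fini_hw(struct toy_adev *adev)
{
	if (adev->unplugged)
		toy_kfd_fini_sw(adev);

	toy_ip_fini_early(adev);
}
```

In this model, calling `toy_fini_hw()` on a device with `unplugged` set clears `kfd` and leaves `toy_kfd_locked` at 0, while a device that is still attached keeps its node and takes the count once.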