drm/amdgpu: move the call of ras recovery_init and bad page reserve to proper place

ras recovery_init should be called after ttm init,
bad page reserve should be put in front of gpu reset since i2c
may be unstable during gpu reset.
add cleanup for recovery_init and recovery_fini

v2: add more comment and print.
    remove cancel_work_sync in recovery_init.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This commit is contained in:
Tao Zhou
2019-08-30 19:50:39 +08:00
committed by Alex Deucher
parent 87d2b92f1e
commit 1a6fc071e1
4 changed files with 43 additions and 18 deletions

View File

@@ -54,6 +54,7 @@
#include "amdgpu_trace.h"
#include "amdgpu_amdkfd.h"
#include "amdgpu_sdma.h"
#include "amdgpu_ras.h"
#include "bif/bif_4_1_d.h"
static int amdgpu_map_buffer(struct ttm_buffer_object *bo,
@@ -1777,6 +1778,17 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
adev->gmc.visible_vram_size);
#endif
/*
* retired pages will be loaded from eeprom and reserved here,
* it should be called after ttm init since new bo may be created,
* recovery_init may fail, but it can free all resources allocated by
* itself and its failure should not stop amdgpu init process.
*
* Note: theoretically, this should be called before all vram allocations
* to protect retired page from abusing
*/
amdgpu_ras_recovery_init(adev);
/*
*The reserved vram for firmware must be pinned to the specified
*place on the VRAM, so reserve it early.