When MEC executes unmap_queue for mid command buffer preemption, it will
kick the write pointer of the gfx ring, set CP_VMID_PREEMPT to trigger the
preemption and wait for CP_VMID_PREEMPT becomes zero after the preemption
done. There is a race condition that PFP may excute the resetting command
before MEC set CP_VMID_PREEMPT. As a result, hang happens as
CP_VMID_PREEMPT is always 0xffff.
To avoid this, we send resetting CP_VMID_PREEMPT command after the trailing
fence is siganled and update gfx write pointer explicitly.
Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.3.x
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2535
It looks better to place this field in ring
structure. Also drop the repeated ring funcs definitions
if there's no difference except for vmhub field.
v2: rename the field to vm_hub like others (Le)
v3: apply the changes to new ip blocks (Hawking)
v4: fix vcn sw ring (Alex)
Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Due to raven/raven2 maybe enable sclk slow down,
they cannot get clock count by the RLC at the auto level of dpm performance.
So switch to golden tsc register.
Suggested-by: shanshengwang <shansheng.wang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use gfx ras common initialization interface to
initialize gfx ras block.
V2:
Update function call due to amdgpu_gfx_ras_sw_init
interface changes.
Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Some dead code was introduced as part of utilizing the `amdgpu_ucode_*`
helpers. Adjust the control flow to make sure that firmware is released
in the appropriate error flows.
Reported-by: coverity-bot <keescook+coverity-bot@chromium.org>
Addresses-Coverity-ID: 1530548 ("Control flow issues")
Fixes: ec787deb2d ("drm/amd: Use `amdgpu_ucode_*` helpers for GFX9")
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The `amdgpu_ucode_request` helper will ensure that the return code for
missing firmware is -ENODEV so that early_init can fail.
The `amdgpu_ucode_release` helper will provide symmetry on unload.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Trigger Mid-Command Buffer Preemption according to the priority of the software
rings and the hw fence signalling condition.
The muxer saves the locations of the indirect buffer frames from the software
ring together with the fence sequence number in its fifo queue, and pops out
those records when the fences are signalled. The locations are used to resubmit
packages in preemption scenarios by coping the chunks from the software ring.
v2: Update comment style.
v3: Fix conflict caused by previous modifications.
v4: Remove unnecessary prints.
v5: Fix corner cases for resubmission cases.
v6: Refactor functions for resubmission, calling fence_process in irq handler.
v7: Solve conflict for removing amdgpu_sw_ring.c.
v8: Add time threshold to judge if preemption request is needed.
v9: Correct comment spelling. Set fence emit timestamp before rsu assignment.
Cc: Christian Koenig <Christian.Koenig@amd.com>
Cc: Luben Tuikov <Luben.Tuikov@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Michel Dänzer <michel@daenzer.net>
Signed-off-by: Jiadong.Zhu <Jiadong.Zhu@amd.com>
Acked-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
1. Modify the unmap_queue package on gfx9. Add trailing fence to track the
preemption done.
2. Modify emit_ce_meta emit_de_meta functions for the resumed ibs.
v2: Restyle code not to use ternary operator.
v3: Modify code format.
v4: Enable Mid-Command Buffer Preemption for gfx9 by default.
v5: Optimize the flag bit set for emit_fence.
v6: Modify log message for preemption timeout.
Cc: Christian Koenig <Christian.Koenig@amd.com>
Cc: Michel Dänzer <michel@daenzer.net>
Cc: Luben Tuikov <Luben.Tuikov@amd.com>
Signed-off-by: Jiadong.Zhu <Jiadong.Zhu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Set ring functions with software ring callbacks on gfx9.
The software ring could be tested by debugfs_test_ib case.
v2: Set sw_ring 2 to enable software ring by default.
v3: Remove the parameter for software ring enablement.
v4: Use amdgpu_ring_init/fini for software rings.
v5: Update for code format. Fix conflict.
v6: Remove unnecessary checks and enable software ring on gfx9 by default.
v7: Use static array for software ring names and priorities.
v8: Stop creating software rings if no gfx ring existed.
Cc: Christian Koenig <Christian.Koenig@amd.com>
Cc: Luben Tuikov <Luben.Tuikov@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Michel Dänzer <michel@daenzer.net>
Cc: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Jiadong.Zhu <Jiadong.Zhu@amd.com>
Acked-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
- clear kiq ring after suspend/resume under sriov to aviod kiq ring
test failure
- update irq after resume to fix kiq interrput loss
Signed-off-by: Victor Zhao <Victor.Zhao@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The current position calulated in gfx_v9_0_ring_emit_patch_cond_exec
underflows when the wptr is divisible by ring->buf_mask + 1.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Jiadong.Zhu <Jiadong.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The second jump table is required on live migration or mulitple VF
configuration on Aldebaran. With this implemented, the first level
jump table(hw used) will be same, mec fw internal will use the
second level jump table jump to the real functionality implementation.
so the different VF can load different version of MEC as long as
they support sjt
Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The scratch register should be accessed through MMIO instead of RLCG
in SRIOV, since it being used in RLCG register access function.
Fixes: d54762cc3e ("drm/amdgpu: nuke dynamic gfx scratch reg allocation")
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Gavin Wan <Gavin.Wan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Spelling mistakes (triple letters) in comments.
Detected with the help of Coccinelle.
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Remove the accidental shifts on the values of RPTR_BLOCK_SIZE
in gfx_v8-v11. The bug essentially always programs the
corresponding fields to zero instead of the correct value.
The hardware clamps the min value to 5 so this resulted in a
value of 5 being programmed.
Signed-off-by: Haohui Mai <ricetons@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Clean up redundant, copy-paste code blocks during the initialization of
the doorbells in mqd_init().
Signed-off-by: Haohui Mai <ricetons@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
It's over a decade ago that this was actually used for more than ring and
IB tests. Just use the static register directly where needed and nuke the
now useless infrastructure.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Lang Yu <Lang.Yu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Instead of the 'amdgpu_ring_priority_level' type,
the 'amdgpu_gfx_pipe_priority' type was used,
which is an error when setting ring priority.
This is a minor error, but may cause problems in the future.
Instead of AMDGPU_RING_PRIO_2 = 2, we can use AMDGPU_RING_PRIO_MAX = 3,
but AMDGPU_RING_PRIO_2 = 2 is used for compatibility with
AMDGPU_GFX_PIPE_PRIO_HIGH = 2, and not change the behavior of the
code.
Signed-off-by: Grigory Vasilyev <h0tc0d3@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Enabling gfxoff quirk results in perfectly usable graphical user
interface on MacBook Pro (15-inch, 2019) with Radeon Pro Vega 20 4 GB.
Without the quirk, X server is completely unusable as every few seconds
there is gpu reset due to ring gfx timeout.
Signed-off-by: Tomasz Moń <desowin@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
1. Define amdgpu_ras_block_late_fini_default in amdgpu_ras.c as
.ras_fini common function, which is called when
.ras_fini of ras block isn't initialized.
2. Remove the code of using amdgpu_ras_block_late_fini to
initialize .ras_fini in ras blocks.
Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
1. Move the variables of ras block instance members from
specific xxx_ras_fini to general ras_fini call.
2. Function calls inside the modules only use parameters
passed from xxx_ras_fini instead of ras block instance
members.
Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Modify .ras_fini function pointer parameter so that
we can remove redundant intermediate calls in some
ras blocks.
Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
1. Move calling ras block instance members from module internal
function to the top calling xxx_ras_late_init.
2. Module internal function calls can only use parameter variables
of xxx_ras_late_init instead of ras block instance members.
Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
1. Define amdgpu_ras_block_late_init to create sysfs nodes
and interrupt handles.
2. Define amdgpu_ras_block_late_fini to remove sysfs nodes
and interrupt handles.
3. Replace ras block variable members in struct
amdgpu_ras_block_object with struct ras_common_if, which
can make it easy to associate each ras block instance
with each ras block functional interface.
4. Add .ras_cb to struct amdgpu_ras_block_object.
5. Change each ras block to fit for the changement of struct
amdgpu_ras_block_object.
Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
"adev->gfx.rlc.rlcg_reg_access_supported = true;"
the above varible were set too late during driver initialization.
it will cause the driver to fail to write/read register during GMC hw init
in sriov mode.
move gfx_xxx_init_rlcg_reg_access_ctrl() function to gfx early init stage
to avoid this issue.
Fixes: 5d447e2967 ("drm/amdgpu: add helper for rlcg indirect reg access")
Signed-off-by: Yang Wang <KevinYang.Wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>