Merge tag 'drm-misc-next-2025-12-12' of... (6c8e4048) · Commits · git / linux-net

Documentation/gpu/drm-mm.rst

+23 −6

Original line number	Diff line number	Diff line
		@@ -155,7 +155,12 @@ drm_gem_object_init() will create an shmfs file of the
		requested size and store it into the struct :c:type:`struct
		drm_gem_object <drm_gem_object>` filp field. The memory is
		used as either main storage for the object when the graphics hardware
		uses system memory directly or as a backing store otherwise.
		uses system memory directly or as a backing store otherwise. Drivers
		can call drm_gem_huge_mnt_create() to create, mount and use a huge
		shmem mountpoint instead of the default one ('shm_mnt'). For builds
		with CONFIG_TRANSPARENT_HUGEPAGE enabled, further calls to
		drm_gem_object_init() will let shmem allocate huge pages when
		possible.

		Drivers are responsible for the actual physical pages allocation by
		calling shmem_read_mapping_page_gfp() for each page.
		@@ -290,15 +295,27 @@ The open and close operations must update the GEM object reference
		count. Drivers can use the drm_gem_vm_open() and drm_gem_vm_close() helper
		functions directly as open and close handlers.

		The fault operation handler is responsible for mapping individual pages
		to userspace when a page fault occurs. Depending on the memory
		allocation scheme, drivers can allocate pages at fault time, or can
		decide to allocate memory for the GEM object at the time the object is
		created.
		The fault operation handler is responsible for mapping pages to
		userspace when a page fault occurs. Depending on the memory allocation
		scheme, drivers can allocate pages at fault time, or can decide to
		allocate memory for the GEM object at the time the object is created.

		Drivers that want to map the GEM object upfront instead of handling page
		faults can implement their own mmap file operation handler.

		In order to reduce page table overhead, if the internal shmem mountpoint
		"shm_mnt" is configured to use transparent huge pages (for builds with
		CONFIG_TRANSPARENT_HUGEPAGE enabled) and if the shmem backing store
		managed to allocate a huge page for a faulty address, the fault handler
		will first attempt to insert that huge page into the VMA before falling
		back to individual page insertion. mmap() user address alignment for GEM
		objects is handled by providing a custom get_unmapped_area file
		operation which forwards to the shmem backing store. For most drivers,
		which don't create a huge mountpoint by default or through a module
		parameter, transparent huge pages can be enabled by either setting the
		"transparent_hugepage_shmem" kernel parameter or the
		"/sys/kernel/mm/transparent_hugepage/shmem_enabled" sysfs knob.

		For platforms without MMU the GEM core provides a helper method
		drm_gem_dma_get_unmapped_area(). The mmap() routines will call this to get a
		proposed address for the mapping.
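The new drm-mm.rst paragraph points at the "/sys/kernel/mm/transparent_hugepage/shmem_enabled" knob. As a minimal userspace sketch (not part of this commit), the helpers below report which mode is currently selected. They assume the knob lists all options on one line with the active one in square brackets, for example "always within_size advise [never] deny force". The function names are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Copy the bracketed token of a sysfs-style option line into out.
 * Assumed input format: "always within_size advise [never] deny force".
 * Returns 0 on success, -1 if there is no bracketed token or it does
 * not fit into out (including the terminating NUL).
 */
int thp_selected_mode(const char *line, char *out, size_t out_len)
{
	const char *open = strchr(line, '[');
	const char *close = open ? strchr(open, ']') : NULL;
	size_t len;

	if (!open || !close)
		return -1;

	len = (size_t)(close - open - 1);
	if (len + 1 > out_len)
		return -1;

	memcpy(out, open + 1, len);
	out[len] = '\0';
	return 0;
}

/*
 * Read the first line of the knob at path and extract the selected mode.
 * Returns 0 on success, -1 if the file cannot be read or parsed.
 */
int thp_read_selected_mode(const char *path, char *out, size_t out_len)
{
	char line[128];
	FILE *f = fopen(path, "r");
	int ret = -1;

	if (!f)
		return -1;
	if (fgets(line, sizeof(line), f))
		ret = thp_selected_mode(line, out, out_len);
	fclose(f);
	return ret;
}
```

Calling thp_read_selected_mode() with the sysfs path quoted above and a small buffer would yield a mode string such as "never". Going by the documentation text in this hunk, a mode other than the disabled ones is what allows the GEM fault handler to try huge-page insertion for drivers that do not create their own huge mountpoint.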

Documentation/gpu/todo.rst

+45 −0

		@@ -878,6 +878,51 @@ Contact: Christian König

		Level: Starter

		DRM GPU Scheduler
		=================

		Provide a universal successor for drm_sched_resubmit_jobs()
		-----------------------------------------------------------

		drm_sched_resubmit_jobs() is deprecated. Main reason being that it leads to
		reinitializing dma_fences. See that function's docu for details. The better
		approach for valid resubmissions by amdgpu and Xe is (apparently) to figure out
		which job (and, through association: which entity) caused the hang. Then, the
		job's buffer data, together with all other jobs' buffer data currently in the
		same hardware ring, must be invalidated. This can for example be done by
		overwriting it. amdgpu currently determines which jobs are in the ring and need
		to be overwritten by keeping copies of the job. Xe obtains that information by
		directly accessing drm_sched's pending_list.

		Tasks:

		1. implement scheduler functionality through which the driver can obtain the
		information which broken jobs are currently in the hardware ring.
		2. Such infrastructure would then typically be used in
		drm_sched_backend_ops.timedout_job(). Document that.
		3. Port a driver as first user.
		4. Document the new alternative in the docu of deprecated
		drm_sched_resubmit_jobs().

		Contact: Christian König <christian.koenig@amd.com>
		Philipp Stanner <phasta@kernel.org>

		Level: Advanced

		Add locking for runqueues
		-------------------------

		There is an old FIXME by Sima in include/drm/gpu_scheduler.h. It details that
		struct drm_sched_rq is read at many places without any locks, not even with a
		READ_ONCE. At XDC 2025 no one could really tell why that is the case, whether
		locks are needed and whether they could be added. (But for real, that should
		probably be locked!). Check whether it's possible to add locks everywhere, and
		do so if yes.

		Contact: Philipp Stanner <phasta@kernel.org>

		Level: Intermediate

		Outside DRM
		===========

Documentation/process/debugging/kgdb.rst

+0 −28

		@@ -889,34 +889,6 @@ in the virtual console layer. On resuming kernel execution, the kernel
		debugger calls kgdboc_post_exp_handler() which in turn calls
		con_debug_leave().

		Any video driver that wants to be compatible with the kernel debugger
		and the atomic kms callbacks must implement the ``mode_set_base_atomic``,
		``fb_debug_enter`` and ``fb_debug_leave operations``. For the
		``fb_debug_enter`` and ``fb_debug_leave`` the option exists to use the
		generic drm fb helper functions or implement something custom for the
		hardware. The following example shows the initialization of the
		.mode_set_base_atomic operation in
		drivers/gpu/drm/i915/intel_display.c::


		static const struct drm_crtc_helper_funcs intel_helper_funcs = {
		[...]
		.mode_set_base_atomic = intel_pipe_set_base_atomic,
		[...]
		};


		Here is an example of how the i915 driver initializes the
		fb_debug_enter and fb_debug_leave functions to use the generic drm
		helpers in ``drivers/gpu/drm/i915/intel_fb.c``::


		static struct fb_ops intelfb_ops = {
		[...]
		.fb_debug_enter = drm_fb_helper_debug_enter,
		.fb_debug_leave = drm_fb_helper_debug_leave,
		[...]
		};


		Credits

drivers/accel/amdxdna/aie2_message.c

+12 −6

		@@ -39,7 +39,6 @@ static int aie2_send_mgmt_msg_wait(struct amdxdna_dev_hdl *ndev,
		if (!ndev->mgmt_chann)
		return -ENODEV;

		drm_WARN_ON(&xdna->ddev, xdna->rpm_on && !mutex_is_locked(&xdna->dev_lock));
		ret = xdna_send_msg_wait(xdna, ndev->mgmt_chann, msg);
		if (ret == -ETIME) {
		xdna_mailbox_stop_channel(ndev->mgmt_chann);
		@@ -59,8 +58,15 @@ static int aie2_send_mgmt_msg_wait(struct amdxdna_dev_hdl *ndev,
		int aie2_suspend_fw(struct amdxdna_dev_hdl *ndev)
		{
		DECLARE_AIE2_MSG(suspend, MSG_OP_SUSPEND);
		int ret;

		return aie2_send_mgmt_msg_wait(ndev, &msg);
		ret = aie2_send_mgmt_msg_wait(ndev, &msg);
		if (ret) {
		XDNA_ERR(ndev->xdna, "Failed to suspend fw, ret %d", ret);
		return ret;
		}

		return aie2_psp_waitmode_poll(ndev->psp_hdl);
		}

		int aie2_resume_fw(struct amdxdna_dev_hdl *ndev)
		@@ -646,6 +652,7 @@ aie2_cmdlist_fill_npu_cf(struct amdxdna_gem_obj *cmd_bo, void *slot, size_t *siz
		u32 cmd_len;
		void *cmd;

		memset(npu_slot, 0, sizeof(*npu_slot));
		cmd = amdxdna_cmd_get_payload(cmd_bo, &cmd_len);
		if (*size < sizeof(*npu_slot) + cmd_len)
		return -EINVAL;
		@@ -654,7 +661,6 @@ aie2_cmdlist_fill_npu_cf(struct amdxdna_gem_obj *cmd_bo, void *slot, size_t *siz
		if (npu_slot->cu_idx == INVALID_CU_IDX)
		return -EINVAL;

		memset(npu_slot, 0, sizeof(*npu_slot));
		npu_slot->type = EXEC_NPU_TYPE_NON_ELF;
		npu_slot->arg_cnt = cmd_len / sizeof(u32);
		memcpy(npu_slot->args, cmd, cmd_len);
		@@ -671,6 +677,7 @@ aie2_cmdlist_fill_npu_dpu(struct amdxdna_gem_obj *cmd_bo, void *slot, size_t *si
		u32 cmd_len;
		u32 arg_sz;

		memset(npu_slot, 0, sizeof(*npu_slot));
		sn = amdxdna_cmd_get_payload(cmd_bo, &cmd_len);
		arg_sz = cmd_len - sizeof(*sn);
		if (cmd_len < sizeof(*sn) || arg_sz > MAX_NPU_ARGS_SIZE)
		@@ -683,7 +690,6 @@ aie2_cmdlist_fill_npu_dpu(struct amdxdna_gem_obj *cmd_bo, void *slot, size_t *si
		if (npu_slot->cu_idx == INVALID_CU_IDX)
		return -EINVAL;

		memset(npu_slot, 0, sizeof(*npu_slot));
		npu_slot->type = EXEC_NPU_TYPE_PARTIAL_ELF;
		npu_slot->inst_buf_addr = sn->buffer;
		npu_slot->inst_size = sn->buffer_size;
		@@ -703,6 +709,7 @@ aie2_cmdlist_fill_npu_preempt(struct amdxdna_gem_obj *cmd_bo, void *slot, size_t
		u32 cmd_len;
		u32 arg_sz;

		memset(npu_slot, 0, sizeof(*npu_slot));
		pd = amdxdna_cmd_get_payload(cmd_bo, &cmd_len);
		arg_sz = cmd_len - sizeof(*pd);
		if (cmd_len < sizeof(*pd) || arg_sz > MAX_NPU_ARGS_SIZE)
		@@ -715,7 +722,6 @@ aie2_cmdlist_fill_npu_preempt(struct amdxdna_gem_obj *cmd_bo, void *slot, size_t
		if (npu_slot->cu_idx == INVALID_CU_IDX)
		return -EINVAL;

		memset(npu_slot, 0, sizeof(*npu_slot));
		npu_slot->type = EXEC_NPU_TYPE_PREEMPT;
		npu_slot->inst_buf_addr = pd->inst_buf;
		npu_slot->save_buf_addr = pd->save_buf;
		@@ -739,6 +745,7 @@ aie2_cmdlist_fill_npu_elf(struct amdxdna_gem_obj *cmd_bo, void *slot, size_t *si
		u32 cmd_len;
		u32 arg_sz;

		memset(npu_slot, 0, sizeof(*npu_slot));
		pd = amdxdna_cmd_get_payload(cmd_bo, &cmd_len);
		arg_sz = cmd_len - sizeof(*pd);
		if (cmd_len < sizeof(*pd) || arg_sz > MAX_NPU_ARGS_SIZE)
		@@ -747,7 +754,6 @@ aie2_cmdlist_fill_npu_elf(struct amdxdna_gem_obj *cmd_bo, void *slot, size_t *si
		if (*size < sizeof(*npu_slot) + arg_sz)
		return -EINVAL;

		memset(npu_slot, 0, sizeof(*npu_slot));
		npu_slot->type = EXEC_NPU_TYPE_ELF;
		npu_slot->inst_buf_addr = pd->inst_buf;
		npu_slot->save_buf_addr = pd->save_buf;
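The aie2_message.c hunks above all move memset(npu_slot, 0, sizeof(*npu_slot)) from after the cu_idx check to the top of each fill function. The diff context does not show where cu_idx is assigned, but presumably it is written shortly before the check, so the old ordering would zero it again. The self-contained sketch below illustrates that effect with a stand-in struct. Every name prefixed with demo_ or DEMO_ is hypothetical and not taken from the driver.

```c
#include <stdint.h>
#include <string.h>

#define DEMO_INVALID_CU_IDX 0xffffffffu

/* Stand-in for the driver's slot layout; fields chosen for illustration. */
struct demo_slot {
	uint32_t type;
	uint32_t cu_idx;
	uint32_t arg_cnt;
};

/* Old ordering: cu_idx is validated, then wiped by the late memset. */
int demo_fill_late_memset(struct demo_slot *slot, uint32_t cu_idx)
{
	slot->cu_idx = cu_idx;
	if (slot->cu_idx == DEMO_INVALID_CU_IDX)
		return -1;

	memset(slot, 0, sizeof(*slot));	/* clobbers cu_idx set above */
	slot->type = 1;
	return 0;
}

/* New ordering: zero the slot first, then fill and validate fields. */
int demo_fill_early_memset(struct demo_slot *slot, uint32_t cu_idx)
{
	memset(slot, 0, sizeof(*slot));

	slot->cu_idx = cu_idx;
	if (slot->cu_idx == DEMO_INVALID_CU_IDX)
		return -1;

	slot->type = 1;
	return 0;
}
```

With a cu_idx of 3, the late-memset variant returns success but leaves cu_idx at 0, while the early-memset variant keeps 3. Zeroing first also means an early error return leaves the slot in a defined, cleared state.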

drivers/accel/amdxdna/aie2_pci.c

+1 −1

		@@ -322,7 +322,7 @@ static int aie2_xrs_set_dft_dpm_level(struct drm_device *ddev, u32 dpm_level)
		if (ndev->pw_mode != POWER_MODE_DEFAULT || ndev->dpm_level == dpm_level)
		return 0;

		return ndev->priv->hw_ops.set_dpm(ndev, dpm_level);
		return aie2_pm_set_dpm(ndev, dpm_level);
		}

		static struct xrs_action_ops aie2_xrs_actions = {