Commit 5da39dce authored by Dave Airlie

Merge tag 'drm-xe-next-fixes-2025-03-12' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-next

Core Changes:
 - Fix kernel-doc for gpusvm (Lucas)

Driver Changes:
 - Drop duplicated pc_start call (Rodrigo)
 - Drop sentinels from rtp (Lucas)
 - Fix MOCS debugfs missing forcewake (Tvrtko)
 - Ring flush invalidation (Tvrtko)
 - Fix type for width alignment (Tvrtko)

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/fsztfqcddrarwjlxjwm2k4wvc6u5vntceh6b7nsnxjmwzgtunj@sbkshjow65rf
parents 64fc5dc8 7b7b07c2
+10 −5
@@ -67,14 +67,19 @@ Agreed upon design principles
Overview of baseline design
===========================

Baseline design is simple as possible to get a working basline in which can be
built upon.

.. kernel-doc:: drivers/gpu/drm/xe/drm_gpusvm.c
.. kernel-doc:: drivers/gpu/drm/drm_gpusvm.c
   :doc: Overview

.. kernel-doc:: drivers/gpu/drm/drm_gpusvm.c
   :doc: Locking
   :doc: Migrataion

.. kernel-doc:: drivers/gpu/drm/drm_gpusvm.c
   :doc: Migration

.. kernel-doc:: drivers/gpu/drm/drm_gpusvm.c
   :doc: Partial Unmapping of Ranges

.. kernel-doc:: drivers/gpu/drm/drm_gpusvm.c
   :doc: Examples

Possible future design features
+69 −55
@@ -23,37 +23,42 @@
 * DOC: Overview
 *
 * GPU Shared Virtual Memory (GPU SVM) layer for the Direct Rendering Manager (DRM)
 *
 * The GPU SVM layer is a component of the DRM framework designed to manage shared
 * virtual memory between the CPU and GPU. It enables efficient data exchange and
 * processing for GPU-accelerated applications by allowing memory sharing and
 * is a component of the DRM framework designed to manage shared virtual memory
 * between the CPU and GPU. It enables efficient data exchange and processing
 * for GPU-accelerated applications by allowing memory sharing and
 * synchronization between the CPU's and GPU's virtual address spaces.
 *
 * Key GPU SVM Components:
 * - Notifiers: Notifiers: Used for tracking memory intervals and notifying the
 *		GPU of changes, notifiers are sized based on a GPU SVM
 *		initialization parameter, with a recommendation of 512M or
 *		larger. They maintain a Red-BlacK tree and a list of ranges that
 *		fall within the notifier interval. Notifiers are tracked within
 *		a GPU SVM Red-BlacK tree and list and are dynamically inserted
 *		or removed as ranges within the interval are created or
 *
 * - Notifiers:
 *	Used for tracking memory intervals and notifying the GPU of changes,
 *	notifiers are sized based on a GPU SVM initialization parameter, with a
 *	recommendation of 512M or larger. They maintain a Red-BlacK tree and a
 *	list of ranges that fall within the notifier interval.  Notifiers are
 *	tracked within a GPU SVM Red-BlacK tree and list and are dynamically
 *	inserted or removed as ranges within the interval are created or
 *	destroyed.
 * - Ranges: Represent memory ranges mapped in a DRM device and managed
 *	     by GPU SVM. They are sized based on an array of chunk sizes, which
 *	     is a GPU SVM initialization parameter, and the CPU address space.
 *	     Upon GPU fault, the largest aligned chunk that fits within the
 *	     faulting CPU address space is chosen for the range size. Ranges are
 *	     expected to be dynamically allocated on GPU fault and removed on an
 *	     MMU notifier UNMAP event. As mentioned above, ranges are tracked in
 *	     a notifier's Red-Black tree.
 * - Operations: Define the interface for driver-specific GPU SVM operations
 *               such as range allocation, notifier allocation, and
 *               invalidations.
 * - Device Memory Allocations: Embedded structure containing enough information
 *                              for GPU SVM to migrate to / from device memory.
 * - Device Memory Operations: Define the interface for driver-specific device
 *                             memory operations release memory, populate pfns,
 *                             and copy to / from device memory.
 * - Ranges:
 *	Represent memory ranges mapped in a DRM device and managed by GPU SVM.
 *	They are sized based on an array of chunk sizes, which is a GPU SVM
 *	initialization parameter, and the CPU address space.  Upon GPU fault,
 *	the largest aligned chunk that fits within the faulting CPU address
 *	space is chosen for the range size. Ranges are expected to be
 *	dynamically allocated on GPU fault and removed on an MMU notifier UNMAP
 *	event. As mentioned above, ranges are tracked in a notifier's Red-Black
 *	tree.
 *
 * - Operations:
 *	Define the interface for driver-specific GPU SVM operations such as
 *	range allocation, notifier allocation, and invalidations.
 *
 * - Device Memory Allocations:
 *	Embedded structure containing enough information for GPU SVM to migrate
 *	to / from device memory.
 *
 * - Device Memory Operations:
 *	Define the interface for driver-specific device memory operations
 *	release memory, populate pfns, and copy to / from device memory.
 *
 * This layer provides interfaces for allocating, mapping, migrating, and
 * releasing memory ranges between the CPU and GPU. It handles all core memory
@@ -63,14 +68,18 @@
 * below.
 *
 * Expected Driver Components:
 * - GPU page fault handler: Used to create ranges and notifiers based on the
 *			     fault address, optionally migrate the range to
 *			     device memory, and create GPU bindings.
 * - Garbage collector: Used to unmap and destroy GPU bindings for ranges.
 *			Ranges are expected to be added to the garbage collector
 *			upon a MMU_NOTIFY_UNMAP event in notifier callback.
 * - Notifier callback: Used to invalidate and DMA unmap GPU bindings for
 *			ranges.
 *
 * - GPU page fault handler:
 *	Used to create ranges and notifiers based on the fault address,
 *	optionally migrate the range to device memory, and create GPU bindings.
 *
 * - Garbage collector:
 *	Used to unmap and destroy GPU bindings for ranges.  Ranges are expected
 *	to be added to the garbage collector upon a MMU_NOTIFY_UNMAP event in
 *	notifier callback.
 *
 * - Notifier callback:
 *	Used to invalidate and DMA unmap GPU bindings for ranges.
 */

/**
@@ -83,9 +92,9 @@
 * range RB tree and list, as well as the range's DMA mappings and sequence
 * number. GPU SVM manages all necessary locking and unlocking operations,
 * except for the recheck range's pages being valid
 * (drm_gpusvm_range_pages_valid) when the driver is committing GPU bindings. This
 * lock corresponds to the 'driver->update' lock mentioned in the HMM
 * documentation (TODO: Link). Future revisions may transition from a GPU SVM
 * (drm_gpusvm_range_pages_valid) when the driver is committing GPU bindings.
 * This lock corresponds to the ``driver->update`` lock mentioned in
 * Documentation/mm/hmm.rst. Future revisions may transition from a GPU SVM
 * global lock to a per-notifier lock if finer-grained locking is deemed
 * necessary.
 *
@@ -102,11 +111,11 @@
 * DOC: Migration
 *
 * The migration support is quite simple, allowing migration between RAM and
 * device memory at the range granularity. For example, GPU SVM currently does not
 * support mixing RAM and device memory pages within a range. This means that upon GPU
 * fault, the entire range can be migrated to device memory, and upon CPU fault, the
 * entire range is migrated to RAM. Mixed RAM and device memory storage within a range
 * could be added in the future if required.
 * device memory at the range granularity. For example, GPU SVM currently does
 * not support mixing RAM and device memory pages within a range. This means
 * that upon GPU fault, the entire range can be migrated to device memory, and
 * upon CPU fault, the entire range is migrated to RAM. Mixed RAM and device
 * memory storage within a range could be added in the future if required.
 *
 * The reasoning for only supporting range granularity is as follows: it
 * simplifies the implementation, and range sizes are driver-defined and should
@@ -119,11 +128,11 @@
 * Partial unmapping of ranges (e.g., 1M out of 2M is unmapped by CPU resulting
 * in MMU_NOTIFY_UNMAP event) presents several challenges, with the main one
 * being that a subset of the range still has CPU and GPU mappings. If the
 * backing store for the range is in device memory, a subset of the backing store has
 * references. One option would be to split the range and device memory backing store,
 * but the implementation for this would be quite complicated. Given that
 * partial unmappings are rare and driver-defined range sizes are relatively
 * small, GPU SVM does not support splitting of ranges.
 * backing store for the range is in device memory, a subset of the backing
 * store has references. One option would be to split the range and device
 * memory backing store, but the implementation for this would be quite
 * complicated. Given that partial unmappings are rare and driver-defined range
 * sizes are relatively small, GPU SVM does not support splitting of ranges.
 *
 * With no support for range splitting, upon partial unmapping of a range, the
 * driver is expected to invalidate and destroy the entire range. If the range
@@ -144,6 +153,8 @@
 *
 * 1) GPU page fault handler
 *
 * .. code-block:: c
 *
 *	int driver_bind_range(struct drm_gpusvm *gpusvm, struct drm_gpusvm_range *range)
 *	{
 *		int err = 0;
@@ -208,7 +219,9 @@
 *		return err;
 *	}
 *
 * 2) Garbage Collector.
 * 2) Garbage Collector
 *
 * .. code-block:: c
 *
 *	void __driver_garbage_collector(struct drm_gpusvm *gpusvm,
 *					struct drm_gpusvm_range *range)
@@ -231,7 +244,9 @@
 *			__driver_garbage_collector(gpusvm, range);
 *	}
 *
 * 3) Notifier callback.
 * 3) Notifier callback
 *
 * .. code-block:: c
 *
 *	void driver_invalidation(struct drm_gpusvm *gpusvm,
 *				 struct drm_gpusvm_notifier *notifier,
@@ -499,7 +514,7 @@ drm_gpusvm_notifier_invalidate(struct mmu_interval_notifier *mni,
	return true;
}

/**
/*
 * drm_gpusvm_notifier_ops - MMU interval notifier operations for GPU SVM
 */
static const struct mmu_interval_notifier_ops drm_gpusvm_notifier_ops = {
@@ -2055,7 +2070,6 @@ static int __drm_gpusvm_migrate_to_ram(struct vm_area_struct *vas,

/**
 * drm_gpusvm_range_evict - Evict GPU SVM range
 * @pagemap: Pointer to the GPU SVM structure
 * @range: Pointer to the GPU SVM range to be removed
 *
 * This function evicts the specified GPU SVM range. This function will not
@@ -2146,8 +2160,8 @@ static vm_fault_t drm_gpusvm_migrate_to_ram(struct vm_fault *vmf)
	return err ? VM_FAULT_SIGBUS : 0;
}

/**
 * drm_gpusvm_pagemap_ops() - Device page map operations for GPU SVM
/*
 * drm_gpusvm_pagemap_ops - Device page map operations for GPU SVM
 */
static const struct dev_pagemap_ops drm_gpusvm_pagemap_ops = {
	.page_free = drm_gpusvm_page_free,
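
For illustration only: the Overview text in the hunks above says that, upon GPU fault, the largest aligned chunk that fits within the faulting CPU address space is chosen for the range size. Below is a minimal userspace sketch of that selection rule, assuming power-of-two chunk sizes sorted largest first. `pick_chunk_size` and its parameters are hypothetical names chosen for this sketch; it is not the kernel's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a drm_gpusvm function): return the largest
 * chunk size whose naturally aligned window around fault_addr lies
 * entirely inside [vma_start, vma_end), or 0 if no chunk fits.
 *
 * Assumptions: every chunk size is a power of two and the array is
 * sorted from largest to smallest.
 */
static uint64_t pick_chunk_size(uint64_t fault_addr,
				uint64_t vma_start, uint64_t vma_end,
				const uint64_t *chunk_sizes, size_t n_chunks)
{
	assert(chunk_sizes != NULL);

	for (size_t i = 0; i < n_chunks; i++) {
		uint64_t size = chunk_sizes[i];
		/* Align the fault address down to a chunk boundary. */
		uint64_t start = fault_addr & ~(size - 1);
		uint64_t end = start + size;

		/* Accept the first (largest) chunk fully inside the VMA. */
		if (start >= vma_start && end <= vma_end)
			return size;
	}

	return 0;
}
```

With chunk sizes of 2M, 64K and 4K, a fault inside a 2M-aligned, 2M-sized mapping would select 2M under this rule, while a fault inside a smaller mapping falls back to 64K or 4K.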
+10 −10
@@ -82,7 +82,7 @@ write_dpt_remapped(struct xe_bo *bo, struct iosys_map *map, u32 *dpt_ofs,
static int __xe_pin_fb_vma_dpt(const struct intel_framebuffer *fb,
			       const struct i915_gtt_view *view,
			       struct i915_vma *vma,
			       u64 physical_alignment)
			       unsigned int alignment)
{
	struct xe_device *xe = to_xe_device(fb->base.dev);
	struct xe_tile *tile0 = xe_device_get_root_tile(xe);
@@ -108,7 +108,7 @@ static int __xe_pin_fb_vma_dpt(const struct intel_framebuffer *fb,
						      XE_BO_FLAG_VRAM0 |
						      XE_BO_FLAG_GGTT |
						      XE_BO_FLAG_PAGETABLE,
						      physical_alignment);
						      alignment);
	else
		dpt = xe_bo_create_pin_map_at_aligned(xe, tile0, NULL,
						      dpt_size,  ~0ull,
@@ -116,7 +116,7 @@ static int __xe_pin_fb_vma_dpt(const struct intel_framebuffer *fb,
						      XE_BO_FLAG_STOLEN |
						      XE_BO_FLAG_GGTT |
						      XE_BO_FLAG_PAGETABLE,
						      physical_alignment);
						      alignment);
	if (IS_ERR(dpt))
		dpt = xe_bo_create_pin_map_at_aligned(xe, tile0, NULL,
						      dpt_size,  ~0ull,
@@ -124,7 +124,7 @@ static int __xe_pin_fb_vma_dpt(const struct intel_framebuffer *fb,
						      XE_BO_FLAG_SYSTEM |
						      XE_BO_FLAG_GGTT |
						      XE_BO_FLAG_PAGETABLE,
						      physical_alignment);
						      alignment);
	if (IS_ERR(dpt))
		return PTR_ERR(dpt);

@@ -194,7 +194,7 @@ write_ggtt_rotated(struct xe_bo *bo, struct xe_ggtt *ggtt, u32 *ggtt_ofs, u32 bo
static int __xe_pin_fb_vma_ggtt(const struct intel_framebuffer *fb,
				const struct i915_gtt_view *view,
				struct i915_vma *vma,
				u64 physical_alignment)
				unsigned int alignment)
{
	struct drm_gem_object *obj = intel_fb_bo(&fb->base);
	struct xe_bo *bo = gem_to_xe_bo(obj);
@@ -277,7 +277,7 @@ static int __xe_pin_fb_vma_ggtt(const struct intel_framebuffer *fb,

static struct i915_vma *__xe_pin_fb_vma(const struct intel_framebuffer *fb,
					const struct i915_gtt_view *view,
					u64 physical_alignment)
					unsigned int alignment)
{
	struct drm_device *dev = fb->base.dev;
	struct xe_device *xe = to_xe_device(dev);
@@ -327,9 +327,9 @@ static struct i915_vma *__xe_pin_fb_vma(const struct intel_framebuffer *fb,

	vma->bo = bo;
	if (intel_fb_uses_dpt(&fb->base))
		ret = __xe_pin_fb_vma_dpt(fb, view, vma, physical_alignment);
		ret = __xe_pin_fb_vma_dpt(fb, view, vma, alignment);
	else
		ret = __xe_pin_fb_vma_ggtt(fb, view, vma,  physical_alignment);
		ret = __xe_pin_fb_vma_ggtt(fb, view, vma,  alignment);
	if (ret)
		goto err_unpin;

@@ -422,7 +422,7 @@ int intel_plane_pin_fb(struct intel_plane_state *new_plane_state,
	struct i915_vma *vma;
	struct intel_framebuffer *intel_fb = to_intel_framebuffer(fb);
	struct intel_plane *plane = to_intel_plane(new_plane_state->uapi.plane);
	u64 phys_alignment = plane->min_alignment(plane, fb, 0);
	unsigned int alignment = plane->min_alignment(plane, fb, 0);

	if (reuse_vma(new_plane_state, old_plane_state))
		return 0;
@@ -430,7 +430,7 @@ int intel_plane_pin_fb(struct intel_plane_state *new_plane_state,
	/* We reject creating !SCANOUT fb's, so this is weird.. */
	drm_WARN_ON(bo->ttm.base.dev, !(bo->flags & XE_BO_FLAG_SCANOUT));

	vma = __xe_pin_fb_vma(intel_fb, &new_plane_state->view.gtt, phys_alignment);
	vma = __xe_pin_fb_vma(intel_fb, &new_plane_state->view.gtt, alignment);

	if (IS_ERR(vma))
		return PTR_ERR(vma);
+1 −1
@@ -320,7 +320,7 @@ static void xe_rtp_process_to_sr_tests(struct kunit *test)
		count_rtp_entries++;

	xe_rtp_process_ctx_enable_active_tracking(&ctx, &active, count_rtp_entries);
	xe_rtp_process_to_sr(&ctx, param->entries, reg_sr);
	xe_rtp_process_to_sr(&ctx, param->entries, count_rtp_entries, reg_sr);

	xa_for_each(&reg_sr->xa, idx, sre) {
		if (idx == param->expected_reg.addr)
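
For illustration only: the call above now passes an explicit entry count, in line with the "Drop sentinels from rtp" item in the merge message. Below is a minimal sketch of the general pattern, a table walked by count rather than until an empty terminating entry. `struct rule`, `demo_rules`, `N_ELEMS` and `count_enabled` are hypothetical names for this sketch and are not part of the xe driver.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table entry; stands in for a driver rule entry. */
struct rule {
	const char *name;
	int enabled;
};

/* Number of elements in a true array (not a pointer). */
#define N_ELEMS(a) (sizeof(a) / sizeof((a)[0]))

/* No trailing empty entry is needed: the length travels with the call. */
static const struct rule demo_rules[] = {
	{ "first", 1 },
	{ "second", 0 },
	{ "third", 1 },
};

/* Walk exactly n_rules entries and count the enabled ones. */
static size_t count_enabled(const struct rule *rules, size_t n_rules)
{
	size_t n = 0;

	assert(rules != NULL || n_rules == 0);

	for (size_t i = 0; i < n_rules; i++)
		if (rules[i].enabled)
			n++;

	return n;
}
```

A caller would pass `demo_rules` together with `N_ELEMS(demo_rules)`, so a missing or misplaced terminator can no longer cause the loop to run past the end of the table.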
+0 −8
@@ -1496,14 +1496,6 @@ void xe_guc_stop(struct xe_guc *guc)

int xe_guc_start(struct xe_guc *guc)
{
	if (!IS_SRIOV_VF(guc_to_xe(guc))) {
		int err;

		err = xe_guc_pc_start(&guc->pc);
		xe_gt_WARN(guc_to_gt(guc), err, "Failed to start GuC PC: %pe\n",
			   ERR_PTR(err));
	}

	return xe_guc_submit_start(guc);
}
