Merge tag 'drm-xe-next-fixes-2025-03-12' of... (5da39dce) · Commits · git / linux-net

Documentation/gpu/rfc/gpusvm.rst

+10 −5

Original line number	Diff line number	Diff line
		@@ -67,14 +67,19 @@ Agreed upon design principles
		Overview of baseline design
		===========================

		Baseline design is simple as possible to get a working basline in which can be
		built upon.

		.. kernel-doc:: drivers/gpu/drm/xe/drm_gpusvm.c
		.. kernel-doc:: drivers/gpu/drm/drm_gpusvm.c
		:doc: Overview

		.. kernel-doc:: drivers/gpu/drm/drm_gpusvm.c
		:doc: Locking
		:doc: Migrataion

		.. kernel-doc:: drivers/gpu/drm/drm_gpusvm.c
		:doc: Migration

		.. kernel-doc:: drivers/gpu/drm/drm_gpusvm.c
		:doc: Partial Unmapping of Ranges

		.. kernel-doc:: drivers/gpu/drm/drm_gpusvm.c
		:doc: Examples

		Possible future design features

drivers/gpu/drm/drm_gpusvm.c

+69 −55

		@@ -23,37 +23,42 @@
		* DOC: Overview
		*
		* GPU Shared Virtual Memory (GPU SVM) layer for the Direct Rendering Manager (DRM)
		*
		* The GPU SVM layer is a component of the DRM framework designed to manage shared
		* virtual memory between the CPU and GPU. It enables efficient data exchange and
		* processing for GPU-accelerated applications by allowing memory sharing and
		* is a component of the DRM framework designed to manage shared virtual memory
		* between the CPU and GPU. It enables efficient data exchange and processing
		* for GPU-accelerated applications by allowing memory sharing and
		* synchronization between the CPU's and GPU's virtual address spaces.
		*
		* Key GPU SVM Components:
		* - Notifiers: Notifiers: Used for tracking memory intervals and notifying the
		* GPU of changes, notifiers are sized based on a GPU SVM
		* initialization parameter, with a recommendation of 512M or
		* larger. They maintain a Red-BlacK tree and a list of ranges that
		* fall within the notifier interval. Notifiers are tracked within
		* a GPU SVM Red-BlacK tree and list and are dynamically inserted
		* or removed as ranges within the interval are created or
		*
		* - Notifiers:
		* Used for tracking memory intervals and notifying the GPU of changes,
		* notifiers are sized based on a GPU SVM initialization parameter, with a
		* recommendation of 512M or larger. They maintain a Red-BlacK tree and a
		* list of ranges that fall within the notifier interval. Notifiers are
		* tracked within a GPU SVM Red-BlacK tree and list and are dynamically
		* inserted or removed as ranges within the interval are created or
		* destroyed.
		* - Ranges: Represent memory ranges mapped in a DRM device and managed
		* by GPU SVM. They are sized based on an array of chunk sizes, which
		* is a GPU SVM initialization parameter, and the CPU address space.
		* Upon GPU fault, the largest aligned chunk that fits within the
		* faulting CPU address space is chosen for the range size. Ranges are
		* expected to be dynamically allocated on GPU fault and removed on an
		* MMU notifier UNMAP event. As mentioned above, ranges are tracked in
		* a notifier's Red-Black tree.
		* - Operations: Define the interface for driver-specific GPU SVM operations
		* such as range allocation, notifier allocation, and
		* invalidations.
		* - Device Memory Allocations: Embedded structure containing enough information
		* for GPU SVM to migrate to / from device memory.
		* - Device Memory Operations: Define the interface for driver-specific device
		* memory operations release memory, populate pfns,
		* and copy to / from device memory.
		* - Ranges:
		* Represent memory ranges mapped in a DRM device and managed by GPU SVM.
		* They are sized based on an array of chunk sizes, which is a GPU SVM
		* initialization parameter, and the CPU address space. Upon GPU fault,
		* the largest aligned chunk that fits within the faulting CPU address
		* space is chosen for the range size. Ranges are expected to be
		* dynamically allocated on GPU fault and removed on an MMU notifier UNMAP
		* event. As mentioned above, ranges are tracked in a notifier's Red-Black
		* tree.
		*
		* - Operations:
		* Define the interface for driver-specific GPU SVM operations such as
		* range allocation, notifier allocation, and invalidations.
		*
		* - Device Memory Allocations:
		* Embedded structure containing enough information for GPU SVM to migrate
		* to / from device memory.
		*
		* - Device Memory Operations:
		* Define the interface for driver-specific device memory operations
		* release memory, populate pfns, and copy to / from device memory.
		*
		* This layer provides interfaces for allocating, mapping, migrating, and
		* releasing memory ranges between the CPU and GPU. It handles all core memory
		@@ -63,14 +68,18 @@
		* below.
		*
		* Expected Driver Components:
		* - GPU page fault handler: Used to create ranges and notifiers based on the
		* fault address, optionally migrate the range to
		* device memory, and create GPU bindings.
		* - Garbage collector: Used to unmap and destroy GPU bindings for ranges.
		* Ranges are expected to be added to the garbage collector
		* upon a MMU_NOTIFY_UNMAP event in notifier callback.
		* - Notifier callback: Used to invalidate and DMA unmap GPU bindings for
		* ranges.
		*
		* - GPU page fault handler:
		* Used to create ranges and notifiers based on the fault address,
		* optionally migrate the range to device memory, and create GPU bindings.
		*
		* - Garbage collector:
		* Used to unmap and destroy GPU bindings for ranges. Ranges are expected
		* to be added to the garbage collector upon a MMU_NOTIFY_UNMAP event in
		* notifier callback.
		*
		* - Notifier callback:
		* Used to invalidate and DMA unmap GPU bindings for ranges.
		*/

		/**
		@@ -83,9 +92,9 @@
		* range RB tree and list, as well as the range's DMA mappings and sequence
		* number. GPU SVM manages all necessary locking and unlocking operations,
		* except for the recheck range's pages being valid
		* (drm_gpusvm_range_pages_valid) when the driver is committing GPU bindings. This
		* lock corresponds to the 'driver->update' lock mentioned in the HMM
		* documentation (TODO: Link). Future revisions may transition from a GPU SVM
		* (drm_gpusvm_range_pages_valid) when the driver is committing GPU bindings.
		* This lock corresponds to the ``driver->update`` lock mentioned in
		* Documentation/mm/hmm.rst. Future revisions may transition from a GPU SVM
		* global lock to a per-notifier lock if finer-grained locking is deemed
		* necessary.
		*
		@@ -102,11 +111,11 @@
		* DOC: Migration
		*
		* The migration support is quite simple, allowing migration between RAM and
		* device memory at the range granularity. For example, GPU SVM currently does not
		* support mixing RAM and device memory pages within a range. This means that upon GPU
		* fault, the entire range can be migrated to device memory, and upon CPU fault, the
		* entire range is migrated to RAM. Mixed RAM and device memory storage within a range
		* could be added in the future if required.
		* device memory at the range granularity. For example, GPU SVM currently does
		* not support mixing RAM and device memory pages within a range. This means
		* that upon GPU fault, the entire range can be migrated to device memory, and
		* upon CPU fault, the entire range is migrated to RAM. Mixed RAM and device
		* memory storage within a range could be added in the future if required.
		*
		* The reasoning for only supporting range granularity is as follows: it
		* simplifies the implementation, and range sizes are driver-defined and should
		@@ -119,11 +128,11 @@
		* Partial unmapping of ranges (e.g., 1M out of 2M is unmapped by CPU resulting
		* in MMU_NOTIFY_UNMAP event) presents several challenges, with the main one
		* being that a subset of the range still has CPU and GPU mappings. If the
		* backing store for the range is in device memory, a subset of the backing store has
		* references. One option would be to split the range and device memory backing store,
		* but the implementation for this would be quite complicated. Given that
		* partial unmappings are rare and driver-defined range sizes are relatively
		* small, GPU SVM does not support splitting of ranges.
		* backing store for the range is in device memory, a subset of the backing
		* store has references. One option would be to split the range and device
		* memory backing store, but the implementation for this would be quite
		* complicated. Given that partial unmappings are rare and driver-defined range
		* sizes are relatively small, GPU SVM does not support splitting of ranges.
		*
		* With no support for range splitting, upon partial unmapping of a range, the
		* driver is expected to invalidate and destroy the entire range. If the range
		@@ -144,6 +153,8 @@
		*
		* 1) GPU page fault handler
		*
		* .. code-block:: c
		*
		* int driver_bind_range(struct drm_gpusvm *gpusvm, struct drm_gpusvm_range *range)
		* {
		* int err = 0;
		@@ -208,7 +219,9 @@
		* return err;
		* }
		*
		* 2) Garbage Collector.
		* 2) Garbage Collector
		*
		* .. code-block:: c
		*
		* void __driver_garbage_collector(struct drm_gpusvm *gpusvm,
		* struct drm_gpusvm_range *range)
		@@ -231,7 +244,9 @@
		* __driver_garbage_collector(gpusvm, range);
		* }
		*
		* 3) Notifier callback.
		* 3) Notifier callback
		*
		* .. code-block:: c
		*
		* void driver_invalidation(struct drm_gpusvm *gpusvm,
		* struct drm_gpusvm_notifier *notifier,
		@@ -499,7 +514,7 @@ drm_gpusvm_notifier_invalidate(struct mmu_interval_notifier *mni,
		return true;
		}

		/**
		/*
		* drm_gpusvm_notifier_ops - MMU interval notifier operations for GPU SVM
		*/
		static const struct mmu_interval_notifier_ops drm_gpusvm_notifier_ops = {
		@@ -2055,7 +2070,6 @@ static int __drm_gpusvm_migrate_to_ram(struct vm_area_struct *vas,

		/**
		* drm_gpusvm_range_evict - Evict GPU SVM range
		* @pagemap: Pointer to the GPU SVM structure
		* @range: Pointer to the GPU SVM range to be removed
		*
		* This function evicts the specified GPU SVM range. This function will not
		@@ -2146,8 +2160,8 @@ static vm_fault_t drm_gpusvm_migrate_to_ram(struct vm_fault *vmf)
		return err ? VM_FAULT_SIGBUS : 0;
		}

		/**
		* drm_gpusvm_pagemap_ops() - Device page map operations for GPU SVM
		/*
		* drm_gpusvm_pagemap_ops - Device page map operations for GPU SVM
		*/
		static const struct dev_pagemap_ops drm_gpusvm_pagemap_ops = {
		.page_free = drm_gpusvm_page_free,
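
Note on range sizing: the Overview comment in this file says that, upon GPU fault, the largest aligned chunk that fits within the faulting CPU address space is chosen for the range size. The sketch below illustrates that rule only. It is not the drm_gpusvm.c implementation: the names sketch_align_down and sketch_pick_chunk are hypothetical, the chunk array is assumed to be sorted largest-first with power-of-two sizes, and the real code may apply further constraints (for example the notifier interval) that are not modelled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: round addr down to a power-of-two boundary. */
static uint64_t sketch_align_down(uint64_t addr, uint64_t size)
{
	/* Assumption: chunk sizes are non-zero powers of two. */
	assert(size != 0 && (size & (size - 1)) == 0);
	return addr & ~(size - 1);
}

/*
 * Hypothetical helper: return the first (largest) chunk size whose aligned
 * window around fault_addr lies entirely inside [vma_start, vma_end), or 0
 * if no chunk fits. chunk_sizes is assumed to be in descending order.
 */
static uint64_t sketch_pick_chunk(const uint64_t *chunk_sizes, size_t n,
				  uint64_t vma_start, uint64_t vma_end,
				  uint64_t fault_addr)
{
	for (size_t i = 0; i < n; i++) {
		uint64_t start = sketch_align_down(fault_addr, chunk_sizes[i]);
		uint64_t end = start + chunk_sizes[i];

		if (start >= vma_start && end <= vma_end)
			return chunk_sizes[i];
	}

	return 0;
}
```

With chunk sizes of 2M, 64K and 4K, a fault inside a 2M-aligned, 2M-long mapping would select the 2M chunk, while the same fault in a mapping that starts one page later falls back to 64K because the 2M window would begin before the mapping.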

drivers/gpu/drm/xe/display/xe_fb_pin.c

+10 −10

		@@ -82,7 +82,7 @@ write_dpt_remapped(struct xe_bo *bo, struct iosys_map *map, u32 *dpt_ofs,
		static int __xe_pin_fb_vma_dpt(const struct intel_framebuffer *fb,
		const struct i915_gtt_view *view,
		struct i915_vma *vma,
		u64 physical_alignment)
		unsigned int alignment)
		{
		struct xe_device *xe = to_xe_device(fb->base.dev);
		struct xe_tile *tile0 = xe_device_get_root_tile(xe);
		@@ -108,7 +108,7 @@ static int __xe_pin_fb_vma_dpt(const struct intel_framebuffer *fb,
		XE_BO_FLAG_VRAM0 |
		XE_BO_FLAG_GGTT |
		XE_BO_FLAG_PAGETABLE,
		physical_alignment);
		alignment);
		else
		dpt = xe_bo_create_pin_map_at_aligned(xe, tile0, NULL,
		dpt_size, ~0ull,
		@@ -116,7 +116,7 @@ static int __xe_pin_fb_vma_dpt(const struct intel_framebuffer *fb,
		XE_BO_FLAG_STOLEN |
		XE_BO_FLAG_GGTT |
		XE_BO_FLAG_PAGETABLE,
		physical_alignment);
		alignment);
		if (IS_ERR(dpt))
		dpt = xe_bo_create_pin_map_at_aligned(xe, tile0, NULL,
		dpt_size, ~0ull,
		@@ -124,7 +124,7 @@ static int __xe_pin_fb_vma_dpt(const struct intel_framebuffer *fb,
		XE_BO_FLAG_SYSTEM |
		XE_BO_FLAG_GGTT |
		XE_BO_FLAG_PAGETABLE,
		physical_alignment);
		alignment);
		if (IS_ERR(dpt))
		return PTR_ERR(dpt);

		@@ -194,7 +194,7 @@ write_ggtt_rotated(struct xe_bo *bo, struct xe_ggtt *ggtt, u32 *ggtt_ofs, u32 bo
		static int __xe_pin_fb_vma_ggtt(const struct intel_framebuffer *fb,
		const struct i915_gtt_view *view,
		struct i915_vma *vma,
		u64 physical_alignment)
		unsigned int alignment)
		{
		struct drm_gem_object *obj = intel_fb_bo(&fb->base);
		struct xe_bo *bo = gem_to_xe_bo(obj);
		@@ -277,7 +277,7 @@ static int __xe_pin_fb_vma_ggtt(const struct intel_framebuffer *fb,

		static struct i915_vma *__xe_pin_fb_vma(const struct intel_framebuffer *fb,
		const struct i915_gtt_view *view,
		u64 physical_alignment)
		unsigned int alignment)
		{
		struct drm_device *dev = fb->base.dev;
		struct xe_device *xe = to_xe_device(dev);
		@@ -327,9 +327,9 @@ static struct i915_vma *__xe_pin_fb_vma(const struct intel_framebuffer *fb,

		vma->bo = bo;
		if (intel_fb_uses_dpt(&fb->base))
		ret = __xe_pin_fb_vma_dpt(fb, view, vma, physical_alignment);
		ret = __xe_pin_fb_vma_dpt(fb, view, vma, alignment);
		else
		ret = __xe_pin_fb_vma_ggtt(fb, view, vma, physical_alignment);
		ret = __xe_pin_fb_vma_ggtt(fb, view, vma, alignment);
		if (ret)
		goto err_unpin;

		@@ -422,7 +422,7 @@ int intel_plane_pin_fb(struct intel_plane_state *new_plane_state,
		struct i915_vma *vma;
		struct intel_framebuffer *intel_fb = to_intel_framebuffer(fb);
		struct intel_plane *plane = to_intel_plane(new_plane_state->uapi.plane);
		u64 phys_alignment = plane->min_alignment(plane, fb, 0);
		unsigned int alignment = plane->min_alignment(plane, fb, 0);

		if (reuse_vma(new_plane_state, old_plane_state))
		return 0;
		@@ -430,7 +430,7 @@ int intel_plane_pin_fb(struct intel_plane_state *new_plane_state,
		/* We reject creating !SCANOUT fb's, so this is weird.. */
		drm_WARN_ON(bo->ttm.base.dev, !(bo->flags & XE_BO_FLAG_SCANOUT));

		vma = __xe_pin_fb_vma(intel_fb, &new_plane_state->view.gtt, phys_alignment);
		vma = __xe_pin_fb_vma(intel_fb, &new_plane_state->view.gtt, alignment);

		if (IS_ERR(vma))
		return PTR_ERR(vma);

drivers/gpu/drm/xe/tests/xe_rtp_test.c

+1 −1

		@@ -320,7 +320,7 @@ static void xe_rtp_process_to_sr_tests(struct kunit *test)
		count_rtp_entries++;

		xe_rtp_process_ctx_enable_active_tracking(&ctx, &active, count_rtp_entries);
		xe_rtp_process_to_sr(&ctx, param->entries, reg_sr);
		xe_rtp_process_to_sr(&ctx, param->entries, count_rtp_entries, reg_sr);

		xa_for_each(&reg_sr->xa, idx, sre) {
		if (idx == param->expected_reg.addr)

drivers/gpu/drm/xe/xe_guc.c

+0 −8

		@@ -1496,14 +1496,6 @@ void xe_guc_stop(struct xe_guc *guc)

		int xe_guc_start(struct xe_guc *guc)
		{
		if (!IS_SRIOV_VF(guc_to_xe(guc))) {
		int err;

		err = xe_guc_pc_start(&guc->pc);
		xe_gt_WARN(guc_to_gt(guc), err, "Failed to start GuC PC: %pe\n",
		ERR_PTR(err));
		}

		return xe_guc_submit_start(guc);
		}