Commit 10dd1eaa authored by Matthew Brost

drm/pagemap: Disable device-to-device migration



Device-to-device migration is causing
'xe_exec_system_allocator --r *race*no*' to intermittently fail with
engine resets and a kernel hang on a page lock. This should work but is
clearly buggy somewhere. Disable device-to-device migration in the
interim until the issue can be root-caused.

The only downside of disabling device-to-device migration is that memory
will bounce through system memory during migration. However, this path
should be rare, as it only occurs when madvise attributes are changed or
atomics are used.
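
The trade-off described above can be illustrated with a small userspace
model. This is only a sketch, not kernel code: every TOY_* name, the flag
values, and the copy-counting rule are hypothetical stand-ins chosen for
illustration. It assumes that a source page whose kind is not selected by the
migration mask must first be moved to system memory and then migrated from
there, which costs one extra copy.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the selection flags; values are illustrative. */
enum {
	TOY_SELECT_SYSTEM          = 1 << 0,
	TOY_SELECT_DEVICE_PRIVATE  = 1 << 1,
	TOY_SELECT_DEVICE_COHERENT = 1 << 2,
};

/* Where a source page currently lives in this toy model. */
enum toy_page_kind {
	TOY_PAGE_SYSTEM,
	TOY_PAGE_DEVICE_PRIVATE,
	TOY_PAGE_DEVICE_COHERENT,
};

/* Selection mask before the patch: device-private sources are included. */
#define TOY_MASK_BEFORE (TOY_SELECT_SYSTEM | TOY_SELECT_DEVICE_COHERENT | \
			 TOY_SELECT_DEVICE_PRIVATE)
/* Selection mask after the patch: device-private sources are left out. */
#define TOY_MASK_AFTER  (TOY_SELECT_SYSTEM | TOY_SELECT_DEVICE_COHERENT)

/* Return true if a source page of this kind is picked up by the mask. */
static bool toy_page_selected(unsigned int flags, enum toy_page_kind kind)
{
	switch (kind) {
	case TOY_PAGE_SYSTEM:
		return (flags & TOY_SELECT_SYSTEM) != 0;
	case TOY_PAGE_DEVICE_PRIVATE:
		return (flags & TOY_SELECT_DEVICE_PRIVATE) != 0;
	case TOY_PAGE_DEVICE_COHERENT:
		return (flags & TOY_SELECT_DEVICE_COHERENT) != 0;
	}
	return false;
}

/*
 * Copies needed to land a page in the target device memory:
 * 1 if the page can be migrated directly, 2 if it has to bounce
 * through system memory first (assumption of this model).
 */
static int toy_copies_to_devmem(unsigned int flags, enum toy_page_kind kind)
{
	return toy_page_selected(flags, kind) ? 1 : 2;
}
```

In this model, toy_copies_to_devmem(TOY_MASK_BEFORE, TOY_PAGE_DEVICE_PRIVATE)
is 1, while the same call with TOY_MASK_AFTER is 2. System and
device-coherent sources need a single copy under both masks, which matches
the claim that only the device-to-device path pays the extra cost.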

Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Fixes: ec265e1f ("drm/pagemap: Support source migration over interconnect")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Francois Dugast <francois.dugast@intel.com>
Link: https://patch.msgid.link/20260107182716.2236607-3-matthew.brost@intel.com
parent 3902846a
+12 −2
@@ -480,8 +480,18 @@ int drm_pagemap_migrate_to_devmem(struct drm_pagemap_devmem *devmem_allocation,
 		.start		= start,
 		.end		= end,
 		.pgmap_owner	= pagemap->owner,
-		.flags		= MIGRATE_VMA_SELECT_SYSTEM | MIGRATE_VMA_SELECT_DEVICE_COHERENT |
-		MIGRATE_VMA_SELECT_DEVICE_PRIVATE,
+		/*
+		 * FIXME: MIGRATE_VMA_SELECT_DEVICE_PRIVATE intermittently
+		 * causes 'xe_exec_system_allocator --r *race*no*' to trigger an
+		 * engine reset and a hard hang due to getting stuck on a folio
+		 * lock. This should work and needs to be root-caused. The only
+		 * downside of not selecting MIGRATE_VMA_SELECT_DEVICE_PRIVATE
+		 * is that device-to-device migrations won't work; instead,
+		 * memory will bounce through system memory. This path should be
+		 * rare and only occur when the madvise attributes of memory are
+		 * changed or atomics are being used.
+		 */
+		.flags		= MIGRATE_VMA_SELECT_SYSTEM | MIGRATE_VMA_SELECT_DEVICE_COHERENT,
 	};
 	unsigned long i, npages = npages_in_range(start, end);
 	unsigned long own_pages = 0, migrated_pages = 0;