Commit Graph

27 Commits

Author SHA1 Message Date
Rodrigo Vivi
89aa02edaa Merge drm/drm-next into drm-xe-next
Needed to get tracing cleanup and add mmio tracing series.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-06-12 11:31:42 -04:00
Jani Nikula
78247e48a1 drm/xe: do not select ACPI_BUTTON
The xe driver has never needed ACPI button. Selecting the kconfig is
just copy-paste from i915, which no longer needs it either. Stop
selecting ACPI_BUTTON.

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Closes: https://lore.kernel.org/r/ZmGsJsXhHcPV48XJ@intel.com
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1872adc6b20ce4c5ef55ba60a7233b31ace776fb.1717747542.git.jani.nikula@intel.com
2024-06-07 14:33:17 +03:00
Geert Uytterhoeven
05b8b6dd22 Revert "drm: Switch DRM_DISPLAY_HELPER to depends on"
This reverts commit e075e496f5, as helper
code should always be selected by the driver that needs it, for the
convenience of the final user configuring a kernel.

The user who configures a kernel should not need to know which helpers
are needed for the driver he is interested in.  Making a driver depend
on helper code means that the user needs to know which helpers to enable
first, which is very user-unfriendly.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://patchwork.freedesktop.org/patch/msgid/1ba76cc4d96a8afefff5d1bc42fb1e1329c5da68.1713780345.git.geert+renesas@glider.be
Signed-off-by: Maxime Ripard <mripard@kernel.org>
2024-05-02 17:58:23 +02:00
Geert Uytterhoeven
7fe302ae19 Revert "drm: Switch DRM_DISPLAY_DP_HELPER to depends on"
This reverts commit 0323287de8, as helper
code should always be selected by the driver that needs it, for the
convenience of the final user configuring a kernel.

The user who configures a kernel should not need to know which helpers
are needed for the driver he is interested in.  Making a driver depend
on helper code means that the user needs to know which helpers to enable
first, which is very user-unfriendly.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://patchwork.freedesktop.org/patch/msgid/89ac456805746b6d0c888f10c5120b11aacd3319.1713780345.git.geert+renesas@glider.be
Signed-off-by: Maxime Ripard <mripard@kernel.org>
2024-05-02 17:58:21 +02:00
Geert Uytterhoeven
9573446953 Revert "drm: Switch DRM_DISPLAY_HDCP_HELPER to depends on"
This reverts commit 3166e7e6d9, as helper
code should always be selected by the driver that needs it, for the
convenience of the final user configuring a kernel.

The user who configures a kernel should not need to know which helpers
are needed for the driver he is interested in.  Making a driver depend
on helper code means that the user needs to know which helpers to enable
first, which is very user-unfriendly.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://patchwork.freedesktop.org/patch/msgid/a40e70a0abd3d841c23c107d452a43fdd70ef37a.1713780345.git.geert+renesas@glider.be
Signed-off-by: Maxime Ripard <mripard@kernel.org>
2024-05-02 17:58:21 +02:00
Geert Uytterhoeven
d7c128cb77 Revert "drm: Switch DRM_DISPLAY_HDMI_HELPER to depends on"
This reverts commit f6d2dc03fa, as helper
code should always be selected by the driver that needs it, for the
convenience of the final user configuring a kernel.

The user who configures a kernel should not need to know which helpers
are needed for the driver he is interested in.  Making a driver depend
on helper code means that the user needs to know which helpers to enable
first, which is very user-unfriendly.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://patchwork.freedesktop.org/patch/msgid/bd288a5943dab8609f2d1f2bf413595a61df727a.1713780345.git.geert+renesas@glider.be
Signed-off-by: Maxime Ripard <mripard@kernel.org>
2024-05-02 17:58:20 +02:00
Lu Yao
67a9e86dc1 drm/xe: select X86_PLATFORM_DEVICES when ACPI_WMI is selected
ACPI_WMI is a subitem of X86_PLATFORM_DEVICES. And X86_PLATFORM_DEVICES
is not selected in the current Kconfig, and may cause Kconfig warnings:

WARNING: unmet direct dependencies detected for ACPI_WMI
  Depends on [n]: X86_PLATFORM_DEVICES [=n] && ACPI [=y]
  Selected by [m]:
  - DRM_XE [=m] && HAS_IOMEM [=y] && DRM [=m] && PCI [=y] && MMU [=y] &&
    (m && MODULES [=y] || y && KUNIT [=y]=y) && X86 [=y] && ACPI [=y]

Signed-off-by: Lu Yao <yaolu@kylinos.cn>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240415025215.15811-1-yaolu@kylinos.cn
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-04-16 11:52:12 -07:00
Thomas Hellström
79790b6818 Merge drm/drm-next into drm-xe-next
Backmerging drm-next in order to get up-to-date and in particular
to access commit 9ca5facd04.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2024-04-12 15:14:25 +02:00
Oak Zeng
81e058a3e7 drm/xe: Introduce helper to populate userptr
Introduce a helper function xe_userptr_populate_range to populate
a userptr range. This functions calls hmm_range_fault to read
CPU page tables and populate all pfns/pages of this virtual address
range. For system memory page, dma-mapping is performed
to get a dma-address which can be used later for GPU to access pages.

v1: Address review comments:
    separate a npage_in_range function (Matt)
    reparameterize function xe_userptr_populate_range function (Matt)
    move mmu_interval_read_begin() call into while loop (Thomas)
    s/mark_range_accessed/xe_mark_range_accessed (Thomas)
    use set_page_dirty_lock (vs set_page_dirty) (Thomas)
    move a few checking in xe_vma_userptr_pin_pages to hmm.c (Matt)
v2: Remove device private page support. Only support system
    pages for now. use dma-map-sg rather than dma-map-page (Matt/Thomas)
v3: Address review comments:
    Squash patch "drm/xe: Introduce a helper to free sg table" to current
    patch (Matt)
    start and end addresses are already page aligned (Matt)
    Do mmap_read_lock and mmap_read_unlock for hmm_range_fault incase of
    non system allocator call. (Matt)
    Drop kthread_use_mm and kthread_unuse_mm. (Matt)
    No need of kernel-doc for static functions.(Matt)
    Modify function names. (Matt)
    Free sgtable incase of dma_map_sgtable failure.(Matt)
    Modify loop for hmm_range_fault.(Matt)
v4: Remove the dummy function for xe_hmm_userptr_populate_range
    since CONFIG_HMM_MIRROR is needed. (Matt)
    Change variable names start/end to userptr_start/userptr_end.(Matt)
v5: Remove device private page support info from commit message. Since
    the patch doesn't support device page handling. (Thomas)

Signed-off-by: Oak Zeng <oak.zeng@intel.com>
Co-developed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@intel.com>
Cc: Brian Welty <brian.welty@intel.com>
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240412095237.1048599-2-himal.prasad.ghimiray@intel.com
2024-04-12 14:49:03 +02:00
Maxime Ripard
f6d2dc03fa drm: Switch DRM_DISPLAY_HDMI_HELPER to depends on
Most of our helpers have relied on being selected so far through
Kconfig, but that creates issues when we have multiple layers of helpers
with some depending on others.

Indeed, select doesn't select a dependency's dependencies, and thus
isn't super intuitive. Depends on however doesn't have that limitation,
so we can just switch all the drivers that were selecting
DRM_DISPLAY_HDMI_HELPER to depend on it.

Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://lore.kernel.org/r/20240327-kms-kconfig-helpers-v3-12-eafee11b84b3@kernel.org
Signed-off-by: Maxime Ripard <mripard@kernel.org>
2024-03-28 11:26:52 +01:00
Maxime Ripard
3166e7e6d9 drm: Switch DRM_DISPLAY_HDCP_HELPER to depends on
Most of our helpers have relied on being selected so far through
Kconfig, but that creates issues when we have multiple layers of helpers
with some depending on others.

Indeed, select doesn't select a dependency's dependencies, and thus
isn't super intuitive. Depends on however doesn't have that limitation,
so we can just switch all the drivers that were selecting
DRM_DISPLAY_HDCP_HELPER to depend on it.

Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://lore.kernel.org/r/20240327-kms-kconfig-helpers-v3-11-eafee11b84b3@kernel.org
Signed-off-by: Maxime Ripard <mripard@kernel.org>
2024-03-28 11:26:51 +01:00
Maxime Ripard
0323287de8 drm: Switch DRM_DISPLAY_DP_HELPER to depends on
Most of our helpers have relied on being selected so far through
Kconfig, but that creates issues when we have multiple layers of helpers
with some depending on others.

Indeed, select doesn't select a dependency's dependencies, and thus
isn't super intuitive. Depends on however doesn't have that limitation,
so we can just switch all the drivers that were selecting
DRM_DISPLAY_DP_HELPER to depend on it.

Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://lore.kernel.org/r/20240327-kms-kconfig-helpers-v3-10-eafee11b84b3@kernel.org
Signed-off-by: Maxime Ripard <mripard@kernel.org>
2024-03-28 11:26:51 +01:00
Maxime Ripard
e075e496f5 drm: Switch DRM_DISPLAY_HELPER to depends on
Most of our helpers have relied on being selected so far through
Kconfig, but that creates issues when we have multiple layers of helpers
with some depending on others.

Indeed, select doesn't select a dependency's dependencies, and thus
isn't super intuitive. Depends on however doesn't have that limitation,
so we can just switch all the drivers that were selecting
DRM_DISPLAY_HELPER to depend on it.

Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://lore.kernel.org/r/20240327-kms-kconfig-helpers-v3-8-eafee11b84b3@kernel.org
Signed-off-by: Maxime Ripard <mripard@kernel.org>
2024-03-28 11:26:49 +01:00
Arnd Bergmann
d1d95985ab drm/xe/kunit: fix link failure with built-in xe
When the driver is built-in but the tests are in loadable modules,
the helpers don't actually get put into the driver:

ERROR: modpost: "xe_kunit_helper_alloc_xe_device" [drivers/gpu/drm/xe/tests/xe_test.ko] undefined!

Change the Makefile to ensure they are always part of the driver
even when the rest of the kunit tests are in loadable modules.

Fixes: 5095d13d75 ("drm/xe/kunit: Define helper functions to allocate fake xe device")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240226124736.1272949-1-arnd@kernel.org
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 0e6fec6da2)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-03-04 08:41:11 -06:00
Arnd Bergmann
0e6fec6da2 drm/xe/kunit: fix link failure with built-in xe
When the driver is built-in but the tests are in loadable modules,
the helpers don't actually get put into the driver:

ERROR: modpost: "xe_kunit_helper_alloc_xe_device" [drivers/gpu/drm/xe/tests/xe_test.ko] undefined!

Change the Makefile to ensure they are always part of the driver
even when the rest of the kunit tests are in loadable modules.

Fixes: 5095d13d75 ("drm/xe/kunit: Define helper functions to allocate fake xe device")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240226124736.1272949-1-arnd@kernel.org
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-02-28 11:38:12 -08:00
Lucas De Marchi
237412e453 drm/xe: Enable 32bits build
Now that all the issues with 32bits are fixed, enable it again.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240119001612.2991381-6-lucas.demarchi@intel.com
2024-02-19 23:19:15 -08:00
Jani Nikula
bf3ff145df drm/xe: display support should not depend on EXPERT
Remove the DRM_XE_DISPLAY config dependency on EXPERT. I can only
presume the idea was only experts should be able to disable it, but the
effect is the opposite.

Reported-by: Eero Tamminen <eero.t.tamminen@intel.com>
Reviewed-by: Francois Dugast <francois.dugast@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240111104716.3548744-1-jani.nikula@intel.com
(cherry picked from commit 1c7531f50e)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2024-01-15 21:40:32 +01:00
Lucas De Marchi
ea97a66a22 drm/xe: Disable 32bits build
Add a dependency on CONFIG_64BIT since currently the xe driver doesn't
build on 32bits. It may be enabled again after all the issues are fixed.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231221222809.4123220-2-lucas.demarchi@intel.com
2023-12-22 11:17:15 +10:00
Dave Airlie
d219702902 Merge tag 'drm-xe-next-2023-12-21-pr1-1' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-next
Introduce a new DRM driver for Intel GPUs

Xe, is a new driver for Intel GPUs that supports both integrated and
discrete platforms. The experimental support starts with Tiger Lake.
i915 will continue be the main production driver for the platforms
up to Meteor Lake and Alchemist. Then the goal is to make this Intel
Xe driver the primary driver for Lunar Lake and newer platforms.

It uses most, if not all, of the key drm concepts, in special: TTM,
drm-scheduler, drm-exec, drm-gpuvm/gpuva and others.

Signed-off-by: Dave Airlie <airlied@redhat.com>

[airlied: add an extra X86 check, fix a typo, fix drm_exec_init interface
change].

From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZYSwLgXZUZ57qGPQ@intel.com
2023-12-22 10:36:21 +10:00
Maarten Lankhorst
44e694958b drm/xe/display: Implement display support
As for display, the intent is to share the display code with the i915
driver so that there is maximum reuse there.

We do this by recompiling i915/display code twice.
Now that i915 has been adapted to support the Xe build, we can add
the xe/display support.

This initial work is a collaboration of many people and unfortunately
this squashed patch won't fully honor the proper credits.
But let's try to add a few from the squashed patches:

Co-developed-by: Matthew Brost <matthew.brost@intel.com>
Co-developed-by: Jani Nikula <jani.nikula@intel.com>
Co-developed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Co-developed-by: Matt Roper <matthew.d.roper@intel.com>
Co-developed-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Co-developed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Co-developed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2023-12-21 11:43:39 -05:00
Lucas De Marchi
08987a8b68 drm/xe: Fix build with KUNIT=m
Due to the current integration between "live" xe kunit tests and kunit,
it's not possible to have a build with the following combination:

	CONFIG_DRM_XE=y
	CONFIG_KUNIT=m

... even if kconfig doesn't block it. The reason for the failure is that
some compilation units are pulled in xe.ko:

	drivers/gpu/drm/xe/xe_bo.c:#include "tests/xe_bo.c"
	drivers/gpu/drm/xe/xe_dma_buf.c:#include "tests/xe_dma_buf.c"
	drivers/gpu/drm/xe/xe_migrate.c:#include "tests/xe_migrate.c"
	drivers/gpu/drm/xe/xe_pci.c:#include "tests/xe_pci.c"

Those files shouldn't use symbols from kunit, which should be reserved
to the tests/*_test.c files. Detangling this dependency doesn't seem
very straightforward, so fix the immediate issue instructing kconfig to
block the problematic configuration.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/20231109175132.3084142-3-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:43:38 -05:00
Vitaly Lubart
87a4c85d3a drm/xe/gsc: add gsc device support
Create mei-gscfi auxiliary device and configure interrupts
to be consumed by mei-gsc device driver.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Vitaly Lubart <vitaly.lubart@intel.com>
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:43:00 -05:00
Thomas Hellström
d490ecf577 drm/xe: Rework xe_exec and the VM rebind worker to use the drm_exec helper
Replace the calls to ttm_eu_reserve_buffers() by using the drm_exec
helper instead. Also make sure the locking loop covers any calls to
xe_bo_validate() / ttm_bo_validate() so that these function calls may
easily benefit from being called from within an unsealed locking
transaction and may thus perform blocking dma_resv locks in the future.

For the unlock we remove an assert that the vm->rebind_list is empty
when locks are released. Since if the error path is hit with a partly
locked list, that assert may no longer hold true we chose to remove it.

v3:
- Don't accept duplicate bo locks in the rebind worker.
v5:
- Loop over drm_exec objects in reverse when unlocking.
v6:
- We can't keep the WW ticket when retrying validation on OOM. Fix.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230908091716.36984-5-thomas.hellstrom@linux.intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:41:07 -05:00
Tejas Upadhyay
d277656472 drm/xe: Add min/max cap for engine scheduler properties
Add sysfs entries for the min, max, and defaults for each of
engine scheduler controls for every hardware engine class.

Non-elevated user IOCTLs to set these controls must be within
the min-max ranges of the sysfs entries, elevated user can set
these controls to any value. However, introduced compile time
CONFIG min-max values which restricts elevated user to be in
compile time min-max range if at all sysfs min/max are violated.

Sysfs entries examples are,
DUT# cat /sys/class/drm/cardX/device/tileN/gtN/engines/ccs/.defaults/
job_timeout_max         job_timeout_ms          preempt_timeout_min     timeslice_duration_max  timeslice_duration_us
job_timeout_min         preempt_timeout_max     preempt_timeout_us      timeslice_duration_min

DUT# cat /sys/class/drm/card1/device/tileN/gtN/engines/ccs/
.defaults/              job_timeout_min         preempt_timeout_max     preempt_timeout_us      timeslice_duration_min
job_timeout_max         job_timeout_ms          preempt_timeout_min     timeslice_duration_max  timeslice_duration_us

V12:
   - Rebase
V11:
   - Make engine_get_prop_minmax and enforce_sched_limit static - Matt
   - use enum in place of string in engine_get_prop_minmax - Matt
   - no need to use enforce_sched_limit or no need to filter min/max
     per user type in sysfs - Matt
V10:
   - Add kernel doc for non-static func
   - Make helper to get min/max for range validation - Matt
   - Filter min/max per user type
V9 :
   - Rebase to use s/xe_engine/xe_hw_engine/ - Matt
V8 :
   - fix enforce_sched_limit and avoid code duplication - Niranjana
   - Make sure min < max - Niranjana
V7 :
   - Rebase to replace hw engine with eclass interface
   - return EINVAL in place of EPERM
   - Use some APIs to avoid code duplication
V6 :
   - Rebase changes to reflect per engine class props interface - MattB
   - Use #if ENABLED - MattB
   - Remove MAX_SCHED_TIMEOUT check as range validation is enough
V5 :
   - Rebase to resolve conflicts - CI
V4 :
   - Rebase
   - Update commit to reflect tile addition
   - Use XE_HW macro directly as they are already filtered
     for CONFIG checks - Niranjana
   - Add CONFIG for enable/disable min/max limitation
     on elevated user. Default is enable - Matt/Joonas
V3 :
   - Resolve CI hooks warning for kernel-doc
V2 :
   - Restric min/max setting to #define default min/max for
     elevated user - Himal
   - Remove unrelated changes from patch - Niranjana

Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:39:28 -05:00
Matthew Brost
b06d47be7c drm/xe: Port Xe to GPUVA
Rather than open coding VM binds and VMA tracking, use the GPUVA
library. GPUVA provides a common infrastructure for VM binds to use mmap
/ munmap semantics and support for VK sparse bindings.

The concepts are:

1) xe_vm inherits from drm_gpuva_manager
2) xe_vma inherits from drm_gpuva
3) xe_vma_op inherits from drm_gpuva_op
4) VM bind operations (MAP, UNMAP, PREFETCH, UNMAP_ALL) call into the
GPUVA code to generate an VMA operations list which is parsed, committed,
and executed.

v2 (CI): Add break after default in case statement.
v3: Rebase
v4: Fix some error handling
v5: Use unlocked version VMA in error paths
v6: Rebase, address some review feedback mainly Thomas H
v7: Fix compile error in xe_vma_op_unwind, address checkpatch

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:35:18 -05:00
Rodrigo Vivi
e799485044 drm/xe: Introduce the dev_coredump infrastructure.
The goal is to use devcoredump infrastructure to report error states
captured at the crash time.

The error state will contain useful information for GPU hang debug, such
as INSTDONE registers and the current buffers getting executed, as well
as any other information that helps user space and allow later replays of
the error.

The proposal here is to avoid a Xe only error_state like i915 and use
a standard dev_coredump infrastructure to expose the error state.

For our own case, the data is only useful if it is a snapshot of the
time when the GPU crash has happened, since we reset the GPU immediately
after and the registers might have changed. So the proposal here is to
have an internal snapshot to be printed out later.

Also, usually a subsequent GPU hang can be only a cause of the initial
one. So we only save the 'first' hang. The dev_coredump has a delayed
work queue where it remove the coredump and free all the data within a
few moments of the error. When that happens we also reset our capture
state and allow further snapshots.

Right now this infra only print out the time of the hang. More information
will be migrated here on subsequent work. Also, in order to organize the
dump better, the goal is to propose dev_coredump changes itself to allow
multiple files and different controls. But for now we start Xe usage of
it without any dependency on dev_coredump core changes.

v2: Add dma_fence annotation for capture that might happen during long
    running. (Thomas and Matt)
    Use xe->drm.primary->index on drm_info msg. (Jani)
v3: checkpatch fixes
v4: Fix building and locking issues found by Francois.
    Actually let's kill all of the locking in here. gt_reset serialization
    already guarantee that there will be only one capture at the same time.
    Also, the devcoredump has its own locking to protect the free and reads
    and drivers don't need to duplicate it.
    Besides this, the dma_fence locking was pushed to a following patch
    since it is not needed in this one.
    Fix a use after free identified by KASAN: Do not stash the faulty_engine
    since that will be freed somewhere else.
v5: Fix Uptime - ktime_get_boottime actually returns the Uptime. (Francois)

Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Francois Dugast <francois.dugast@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
2023-12-19 18:33:51 -05:00
Matthew Brost
dd08ebf6c3 drm/xe: Introduce a new DRM driver for Intel GPUs
Xe, is a new driver for Intel GPUs that supports both integrated and
discrete platforms starting with Tiger Lake (first Intel Xe Architecture).

The code is at a stage where it is already functional and has experimental
support for multiple platforms starting from Tiger Lake, with initial
support implemented in Mesa (for Iris and Anv, our OpenGL and Vulkan
drivers), as well as in NEO (for OpenCL and Level0).

The new Xe driver leverages a lot from i915.

As for display, the intent is to share the display code with the i915
driver so that there is maximum reuse there. But it is not added
in this patch.

This initial work is a collaboration of many people and unfortunately
the big squashed patch won't fully honor the proper credits. But let's
get some git quick stats so we can at least try to preserve some of the
credits:

Co-developed-by: Matthew Brost <matthew.brost@intel.com>
Co-developed-by: Matthew Auld <matthew.auld@intel.com>
Co-developed-by: Matt Roper <matthew.d.roper@intel.com>
Co-developed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Co-developed-by: Francois Dugast <francois.dugast@intel.com>
Co-developed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Co-developed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Co-developed-by: Philippe Lecluse <philippe.lecluse@intel.com>
Co-developed-by: Nirmoy Das <nirmoy.das@intel.com>
Co-developed-by: Jani Nikula <jani.nikula@intel.com>
Co-developed-by: José Roberto de Souza <jose.souza@intel.com>
Co-developed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Co-developed-by: Dave Airlie <airlied@redhat.com>
Co-developed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Co-developed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Co-developed-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
2023-12-12 14:05:48 -05:00