Commit 77052ab2 authored by Riana Tauro, committed by Lucas De Marchi

drm/xe: Add documentation for survivability mode



Add the survivability mode documentation to the pcode document, since
survivability mode is enabled when pcode detects a failure.

v2: fix kernel-doc (Lucas)

Signed-off-by: Riana Tauro <riana.tauro@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20250407051414.1651616-3-riana.tauro@intel.com


Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
parent 16280ded
+7 −0
@@ -12,3 +12,10 @@ Internal API

 .. kernel-doc:: drivers/gpu/drm/xe/xe_pcode.c
    :internal:
+
+==================
+Boot Survivability
+==================
+
+.. kernel-doc:: drivers/gpu/drm/xe/xe_survivability_mode.c
+   :doc: Xe Boot Survivability
+23 −11
@@ -28,20 +28,32 @@
  * This is implemented by loading the driver with bare minimum (no drm card) to allow the firmware
  * to be flashed through mei and collect telemetry. The driver's probe flow is modified
  * such that it enters survivability mode when pcode initialization is incomplete and boot status
- * denotes a failure. The driver then  populates the survivability_mode PCI sysfs indicating
- * survivability mode and provides additional information required for debug
+ * denotes a failure.
  *
- * KMD exposes below admin-only readable sysfs in survivability mode
+ * Survivability mode can also be entered manually using the survivability mode attribute available
+ * through configfs which is beneficial in several use cases. It can be used to address scenarios
+ * where pcode does not detect failure or for validation purposes. It can also be used in
+ * In-Field-Repair (IFR) to repair a single card without impacting the other cards in a node.
  *
- * device/survivability_mode: The presence of this file indicates that the card is in survivability
- *			      mode. Also, provides additional information on why the driver entered
- *			      survivability mode.
+ * Use the below command to enable survivability mode manually::
  *
- *			      Capability Information - Provides boot status
- *			      Postcode Information   - Provides information about the failure
- *			      Overflow Information   - Provides history of previous failures
- *			      Auxiliary Information  - Certain failures may have information in
- *						       addition to postcode information
+ *	# echo 1 > /sys/kernel/config/xe/0000:03:00.0/survivability_mode
+ *
+ * Refer to :ref:`xe_configfs` for more details on how to use configfs
+ *
+ * Survivability mode is indicated by the below admin-only readable sysfs which provides additional
+ * debug information::
+ *
+ *	/sys/bus/pci/devices/<device>/survivability_mode
+ *
+ * Capability Information:
+ *	Provides boot status
+ * Postcode Information:
+ *	Provides information about the failure
+ * Overflow Information:
+ *	Provides history of previous failures
+ * Auxiliary Information:
+ *	Certain failures may have information in addition to postcode information
  */

 static u32 aux_history_offset(u32 reg_value)
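
The documentation in the hunk above says that survivability mode is indicated by a PCI sysfs attribute. As a user-space illustration only, the following minimal C sketch checks whether that attribute exists for a given PCI address. It assumes the attribute is named `survivability_mode` under `/sys/bus/pci/devices/<device>/`; the helper names `survivability_path()` and `in_survivability_mode()` are hypothetical and are not part of the xe driver.

```c
#include <stdio.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Build the assumed sysfs path of the survivability_mode attribute for a
 * PCI device given as domain:bus:device.function (for example 0000:03:00.0).
 * Returns 0 on success, -1 if the path does not fit in buf.
 */
static int survivability_path(const char *bdf, char *buf, size_t len)
{
	int n = snprintf(buf, len,
			 "/sys/bus/pci/devices/%s/survivability_mode", bdf);

	return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/*
 * Returns 1 if the attribute file exists, 0 if it does not,
 * and -1 if the path could not be built.
 */
static int in_survivability_mode(const char *bdf)
{
	char path[256];

	if (survivability_path(bdf, path, sizeof(path)))
		return -1;

	/* F_OK only tests existence, so read permission is not required. */
	return access(path, F_OK) == 0;
}
```

Calling `in_survivability_mode("0000:03:00.0")` would report whether the file is present for that device. Reading the file's contents (capability, postcode, overflow and auxiliary information) needs admin privileges, which is why this sketch only tests for existence.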