Commit 88a2b4d3 authored by Timur Tabi, committed by Danilo Krummrich

nouveau/gsp: document some aspects of GSP-RM



Document a few aspects of communication with GSP-RM. These comments are
derived from notes made during early development of GSP-RM support in
Nouveau, but were not included in the initial patch set.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Timur Tabi <ttabi@nvidia.com>
Reviewed-by: Danilo Krummrich <dakr@redhat.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231122202840.2565153-1-ttabi@nvidia.com
parent fb18fe0f
+51 −0
@@ -26,6 +26,49 @@
 * DEALINGS IN THE SOFTWARE.
 */

/**
 * msgqTxHeader -- TX queue data structure
 * @version: the version of this structure, must be 0
 * @size: the size of the entire queue, including this header
 * @msgSize: the padded size of a queue element, 16 is minimum
 * @msgCount: the number of elements in this queue
 * @writePtr: head index of this queue
 * @flags: 1 = swap the RX pointers
 * @rxHdrOff: offset of readPtr in this structure
 * @entryOff: offset of beginning of queue (msgqRxHeader), relative to
 *          beginning of this structure
 *
 * The command queue is a queue of RPCs that are sent from the driver to the
 * GSP.  The status queue is a queue of messages/responses from GSP-RM to the
 * driver.  Although the driver allocates memory for both queues, the command
 * queue is owned by the driver and the status queue is owned by GSP-RM.  In
 * addition, the headers of the two queues must not share the same 4K page.
 *
 * Each queue is prefixed with this data structure.  The idea is that a queue
 * and its header are written to only by their owner.  That is, only the
 * driver writes to the command queue and command queue header, and only the
 * GSP writes to the status (receive) queue and its header.
 *
 * This is enforced by the concept of "swapping" the RX pointers.  This is
 * why the 'flags' field must be set to 1.  'rxHdrOff' is how the GSP knows
 * where to find the tail pointer of its status queue.
 *
 * When the driver writes a new RPC to the command queue, it updates writePtr.
 * When it reads a new message from the status queue, it updates readPtr.  In
 * this way, the GSP knows when a new command is in the queue (it polls
 * writePtr) and it knows how much free space is in the status queue (it
 * checks readPtr).  The driver never cares about how much free space is in
 * the status queue.
 *
 * As usual, producers write to the head pointer, and consumers read from the
 * tail pointer.  When head == tail, the queue is empty.
 *
 * So to summarize:
 * command.writePtr = head of command queue
 * command.readPtr = tail of status queue
 * status.writePtr = head of status queue
 * status.readPtr = tail of command queue
 */
typedef struct
{
    NvU32 version;   // queue version
@@ -38,6 +81,14 @@ typedef struct
    NvU32 entryOff;  // Offset of entries from start of backing store.
} msgqTxHeader;

/**
 * msgqRxHeader - RX queue data structure
 * @readPtr: tail index of the other queue
 *
 * Although this is a separate struct, it could easily be merged into
 * msgqTxHeader.  msgqTxHeader.rxHdrOff is simply the offset of readPtr
 * from the beginning of msgqTxHeader.
 */
typedef struct
{
    NvU32 readPtr; // message id of last message read
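
The head/tail rules described in the msgqTxHeader comment can be sketched as plain index arithmetic. This is only an illustration, not code from the patch: the helper names are hypothetical, and it assumes the common ring-buffer convention of leaving one element unused so that head == tail always means "empty".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; none of these names are in the patch. */

/* Number of elements the consumer has not read yet (0 when head == tail). */
static uint32_t sketch_ring_used(uint32_t head, uint32_t tail, uint32_t count)
{
	assert(count > 0 && head < count && tail < count);
	return (uint32_t)(((uint64_t)head + count - tail) % count);
}

/*
 * Number of elements the producer may still write.  Assumption: one slot
 * is kept unused so that a full queue is never mistaken for an empty one.
 */
static uint32_t sketch_ring_free(uint32_t head, uint32_t tail, uint32_t count)
{
	return count - 1 - sketch_ring_used(head, tail, count);
}

/* Advance an index by n elements, wrapping at the end of the queue. */
static uint32_t sketch_ring_advance(uint32_t index, uint32_t n, uint32_t count)
{
	assert(count > 0 && index < count);
	return (uint32_t)(((uint64_t)index + n) % count);
}
```

Under these assumptions, the driver would compute free space in the command queue from command.writePtr (head) and status.readPtr (tail), and would advance command.writePtr after copying an RPC into the queue.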
+82 −0
@@ -1377,6 +1377,13 @@ r535_gsp_msg_post_event(void *priv, u32 fn, void *repv, u32 repc)
	return 0;
}

/**
 * r535_gsp_msg_run_cpu_sequencer() -- process I/O commands from the GSP
 *
 * The GSP sequencer is a list of I/O commands that the GSP can send to
 * the driver to perform for various purposes.  The most common usage is to
 * perform a special mid-initialization reset.
 */
static int
r535_gsp_msg_run_cpu_sequencer(void *priv, u32 fn, void *repv, u32 repc)
{
@@ -1716,6 +1723,23 @@ r535_gsp_libos_id8(const char *name)
	return id;
}

/**
 * create_pte_array() - creates a PTE array of a physically contiguous buffer
 * @ptes: pointer to the array
 * @addr: base address of physically contiguous buffer (GSP_PAGE_SIZE aligned)
 * @size: size of the buffer
 *
 * GSP-RM sometimes expects physically-contiguous buffers to have an array of
 * "PTEs" for each page in that buffer.  Although in theory that allows for
 * the buffer to be physically discontiguous, GSP-RM does not currently
 * support that.
 *
 * In this case, the PTEs are DMA addresses of each page of the buffer.  Since
 * the buffer is physically contiguous, calculating all the PTEs is simple
 * math.
 *
 * See memdescGetPhysAddrsForGpu()
 */
static void create_pte_array(u64 *ptes, dma_addr_t addr, size_t size)
{
	unsigned int num_pages = DIV_ROUND_UP_ULL(size, GSP_PAGE_SIZE);
@@ -1725,6 +1749,35 @@ static void create_pte_array(u64 *ptes, dma_addr_t addr, size_t size)
		ptes[i] = (u64)addr + (i << GSP_PAGE_SHIFT);
}

/**
 * r535_gsp_libos_init() -- create the libos arguments structure
 *
 * The logging buffers are byte queues that contain encoded printf-like
 * messages from GSP-RM.  They need to be decoded by a special application
 * that can parse the buffers.
 *
 * The 'loginit' buffer contains logs from early GSP-RM init and
 * exception dumps.  The 'logrm' buffer contains the subsequent logs. Both are
 * written to directly by GSP-RM and can be any multiple of GSP_PAGE_SIZE.
 *
 * The physical address map for the log buffer is stored in the buffer
 * itself, starting with offset 1. Offset 0 contains the "put" pointer.
 *
 * The GSP only understands 4K pages (GSP_PAGE_SIZE), so even if the kernel is
 * configured for a larger page size (e.g. 64K pages), we need to give
 * the GSP an array of 4K pages. Fortunately, since the buffer is
 * physically contiguous, it's simple math to calculate the addresses.
 *
 * The size of each buffer must be a multiple of GSP_PAGE_SIZE.  GSP-RM also
 * currently ignores the @kind field for LOGINIT, LOGINTR, and LOGRM, but
 * expects the buffers to be physically contiguous anyway.
 *
 * The memory allocated for the arguments must remain allocated until the GSP
 * sends the init_done RPC.
 *
 * See _kgspInitLibosLoggingStructures (allocates memory for buffers)
 * See kgspSetupLibosInitArgs_IMPL (creates pLibosInitArgs[] array)
 */
static int
r535_gsp_libos_init(struct nvkm_gsp *gsp)
{
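
The "simple math" mentioned in the create_pte_array() and r535_gsp_libos_init() comments can be illustrated with a standalone userspace sketch. It is not the driver code: the SKETCH_ names and both functions are hypothetical, and the log-buffer layout (put pointer in slot 0, address map from slot 1) is taken only from the comment above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to mirror GSP_PAGE_SHIFT / GSP_PAGE_SIZE (4K pages). */
#define SKETCH_GSP_PAGE_SHIFT 12
#define SKETCH_GSP_PAGE_SIZE  (1ULL << SKETCH_GSP_PAGE_SHIFT)

/*
 * Fill ptes[] with the address of every 4K page of a physically contiguous
 * buffer and return the number of entries written.
 */
static size_t sketch_create_pte_array(uint64_t *ptes, uint64_t addr,
				      uint64_t size)
{
	size_t num_pages = (size_t)((size + SKETCH_GSP_PAGE_SIZE - 1) >>
				    SKETCH_GSP_PAGE_SHIFT);
	size_t i;

	/* The base address is expected to be 4K aligned. */
	assert((addr & (SKETCH_GSP_PAGE_SIZE - 1)) == 0);

	for (i = 0; i < num_pages; i++)
		ptes[i] = addr + ((uint64_t)i << SKETCH_GSP_PAGE_SHIFT);

	return num_pages;
}

/*
 * Store the address map inside the log buffer itself.  Slot 0 is left
 * alone because it holds the "put" pointer; the map starts at slot 1.
 */
static size_t sketch_fill_log_ptes(uint64_t *buf, uint64_t addr, uint64_t size)
{
	return sketch_create_pte_array(&buf[1], addr, size);
}
```

For example, a single 64K kernel page at bus address 0x100000 yields 16 entries, 0x100000 through 0x10f000, one per 4K GSP page.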
@@ -1835,6 +1888,35 @@ nvkm_gsp_radix3_dtor(struct nvkm_gsp *gsp, struct nvkm_gsp_radix3 *rx3)
		nvkm_gsp_mem_dtor(gsp, &rx3->mem[i]);
}

/**
 * nvkm_gsp_radix3_sg - build a radix3 table from an S/G list
 *
 * The GSP uses a three-level page table, called radix3, to map the firmware.
 * Each 64-bit "pointer" in the table is either the bus address of an entry in
 * the next table (for levels 0 and 1) or the bus address of the next page in
 * the GSP firmware image itself.
 *
 * Level 0 contains a single entry in one page that points to the first page
 * of level 1.
 *
 * Level 1, since it's also only one page in size, contains up to 512 entries,
 * one for each page in Level 2.
 *
 * Level 2 can be up to 512 pages in size, and each of those entries points to
 * the next page of the firmware image.  Since there can be up to 512*512
 * pages, that limits the size of the firmware to 512*512*GSP_PAGE_SIZE = 1GB.
 *
 * Internally, the GSP has its window into system memory, but the base
 * physical address of the aperture is not 0.  In fact, it varies depending on
 * the GPU architecture.  Since the GPU is a PCI device, this window is
 * accessed via DMA and is therefore bound by IOMMU translation.  The end
 * result is that GSP-RM must translate the bus addresses in the table to GSP
 * physical addresses.  All this should happen transparently.
 *
 * Returns 0 on success, or negative error code
 *
 * See kgspCreateRadix3_IMPL
 */
static int
nvkm_gsp_radix3_sg(struct nvkm_device *device, struct sg_table *sgt, u64 size,
		   struct nvkm_gsp_radix3 *rx3)
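
The sizing rules in the radix3 comment can be checked with a short calculation. The sketch below is an illustration under stated assumptions rather than the driver's implementation: all names are hypothetical, pages are assumed to be 4K, and each table entry is assumed to be one 64-bit address, which gives 512 entries per page.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed 4K GSP page holding 64-bit entries: 4096 / 8 = 512 per page. */
#define SKETCH_PAGE_SIZE        4096ULL
#define SKETCH_ENTRIES_PER_PAGE (SKETCH_PAGE_SIZE / sizeof(uint64_t))

/* Hypothetical summary of how large each level must be. */
struct sketch_radix3_layout {
	uint64_t fw_pages;     /* pages in the firmware image */
	uint64_t lvl2_pages;   /* level 2 pages needed to list them */
	uint64_t lvl1_entries; /* level 1 entries, one per level 2 page */
};

static struct sketch_radix3_layout sketch_radix3_compute(uint64_t fw_size)
{
	struct sketch_radix3_layout l;

	l.fw_pages = (fw_size + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
	l.lvl2_pages = (l.fw_pages + SKETCH_ENTRIES_PER_PAGE - 1) /
		       SKETCH_ENTRIES_PER_PAGE;
	l.lvl1_entries = l.lvl2_pages;

	/* Level 1 is a single page, so level 2 cannot exceed 512 pages. */
	assert(l.lvl2_pages <= SKETCH_ENTRIES_PER_PAGE);

	return l;
}
```

With these assumptions, a 64 MiB image needs 16384 level 2 entries spread over 32 level 2 pages, and the largest image that fits is 512 * 512 * 4096 bytes, the 1GB limit stated in the comment.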