Commit 99e7f8e8 authored by Alexandre Courbot's avatar Alexandre Courbot Committed by Danilo Krummrich
Browse files

gpu: nova-core: falcon: align DMA transfers to 256 bytes



Falcon DMA transfers are done in 256 bytes increments, and the method
responsible for initiating the transfer checked that the required length
was indeed a multiple of 256. While correct, this also requires callers
to specifically account for this limitation of DMA transfers, and we had
for instance the fwsec code performing a seemingly arbitrary (and
potentially overflowing) upwards alignment of the DMEM load size to
match this requirement.

Let's move that alignment into the loading code itself instead: since it
is working in terms of number of transfers, we can turn this upwards
alignment into a non-overflowing operation, and check that the requested
transfer remains into the limits of the DMA object. This also allows us
to remove a DMA-specific constant in the fwsec code.

Signed-off-by: default avatarAlexandre Courbot <acourbot@nvidia.com>
Link: https://lore.kernel.org/r/20250821-falcondma_256b-v2-1-83e8647a24b5@nvidia.com


Signed-off-by: default avatarDanilo Krummrich <dakr@kernel.org>
parent d234f3ae
Loading
Loading
Loading
Loading
+22 −9
Original line number Diff line number Diff line
@@ -463,14 +463,27 @@ fn dma_wr<F: FalconFirmware<Target = E>>(
            );
            return Err(EINVAL);
        }
        if load_offsets.len % DMA_LEN > 0 {
            dev_err!(
                self.dev,
                "DMA transfer length must be a multiple of {}",
                DMA_LEN
            );

        // DMA transfers can only be done in units of 256 bytes. Compute how many such transfers we
        // need to perform.
        let num_transfers = load_offsets.len.div_ceil(DMA_LEN);

        // Check that the area we are about to transfer is within the bounds of the DMA object.
        // Upper limit of transfer is `(num_transfers * DMA_LEN) + load_offsets.src_start`.
        match num_transfers
            .checked_mul(DMA_LEN)
            .and_then(|size| size.checked_add(load_offsets.src_start))
        {
            None => {
                dev_err!(self.dev, "DMA transfer length overflow");
                return Err(EOVERFLOW);
            }
            Some(upper_bound) if upper_bound as usize > fw.size() => {
                dev_err!(self.dev, "DMA transfer goes beyond range of DMA object");
                return Err(EINVAL);
            }
            Some(_) => (),
        };

        // Set up the base source DMA address.

@@ -486,7 +499,7 @@ fn dma_wr<F: FalconFirmware<Target = E>>(
            .set_imem(target_mem == FalconMem::Imem)
            .set_sec(if sec { 1 } else { 0 });

        for pos in (0..load_offsets.len).step_by(DMA_LEN as usize) {
        for pos in (0..num_transfers).map(|i| i * DMA_LEN) {
            // Perform a transfer of size `DMA_LEN`.
            regs::NV_PFALCON_FALCON_DMATRFMOFFS::default()
                .set_offs(load_offsets.dst_start + pos)
+1 −8
Original line number Diff line number Diff line
@@ -202,9 +202,6 @@ pub(crate) struct FwsecFirmware {
    ucode: FirmwareDmaObject<Self, Signed>,
}

// We need to load full DMEM pages.
const DMEM_LOAD_SIZE_ALIGN: u32 = 256;

impl FalconLoadParams for FwsecFirmware {
    fn imem_load_params(&self) -> FalconLoadTarget {
        FalconLoadTarget {
@@ -218,11 +215,7 @@ fn dmem_load_params(&self) -> FalconLoadTarget {
        FalconLoadTarget {
            src_start: self.desc.imem_load_size,
            dst_start: self.desc.dmem_phys_base,
            // TODO[NUMM]: replace with `align_up` once it lands.
            len: self
                .desc
                .dmem_load_size
                .next_multiple_of(DMEM_LOAD_SIZE_ALIGN),
            len: self.desc.dmem_load_size,
        }
    }