Commit 04d1ff1d authored by Jakub Kicinski

Merge branch 'devlink-mlx5-add-new-parameters-for-link-management-and-sriov-eswitch-configurations'

Saeed Mahameed says:

====================
devlink, mlx5: Add new parameters for link management and SRIOV/eSwitch configurations [part]

This patch series introduces several devlink parameters that improve
device configuration, link management, and SRIOV/eSwitch control by adding
NV config boot-time parameters.

Implement the following parameters:

   a) total_vfs Parameter:
   -----------------------

Adds support for managing the number of VFs (total_vfs) and enabling
SR-IOV (enable_sriov for mlx5) through devlink. These additions give
users control over virtualization features directly from standard kernel
interfaces, without relying on additional external tools. total_vfs
functionality is critical for environments that need to configure the
number of VFs flexibly.

   b) CQE Compression Type:
   ------------------------

Introduces a new devlink parameter, cqe_compress_type, to configure the
rate of CQE compression based on PCIe bus conditions. This setting
provides a balance between compression efficiency and overall NIC
performance under different traffic loads.
====================

Link: https://patch.msgid.link/20250907012953.301746-1-saeed@kernel.org


Signed-off-by: Jakub Kicinski <kuba@kernel.org>
parents b90c7ca4 a4c49611
+5 −0
@@ -143,3 +143,8 @@ own name.
   * - ``clock_id``
     - u64
     - Clock ID used by the device for registering DPLL devices and pins.
   * - ``total_vfs``
     - u32
     - The max number of Virtual Functions (VFs) exposed by the PF.
       After a reboot or PCI reset, the 'sriov_totalvfs' entry under the
       device's sysfs directory will report this value.
+43 −3
@@ -15,23 +15,53 @@ Parameters
   * - Name
     - Mode
     - Validation
     - Notes
   * - ``enable_roce``
     - driverinit
     - Type: Boolean

       If the device supports RoCE disablement, RoCE enablement state controls
     - Boolean
     - If the device supports RoCE disablement, RoCE enablement state controls
       device support for RoCE capability. Otherwise, the control occurs in the
       driver stack. When RoCE is disabled at the driver level, only raw
       ethernet QPs are supported.
   * - ``io_eq_size``
     - driverinit
     - The range is between 64 and 4096.
     -
   * - ``event_eq_size``
     - driverinit
     - The range is between 64 and 4096.
     -
   * - ``max_macs``
     - driverinit
     - The range is between 1 and 2^31. Only power of 2 values are supported.
     -
   * - ``enable_sriov``
     - permanent
     - Boolean
     - Applies to each physical function (PF) independently, if the device
       supports it. Otherwise, it applies symmetrically to all PFs.
   * - ``total_vfs``
     - permanent
     - The range is between 1 and a device-specific max.
     - Applies to each physical function (PF) independently, if the device
       supports it. Otherwise, it applies symmetrically to all PFs.

Note: permanent parameters such as ``enable_sriov`` and ``total_vfs`` require a FW reset to take effect.

.. code-block:: bash

   # setup parameters
   devlink dev param set pci/0000:01:00.0 name enable_sriov value true cmode permanent
   devlink dev param set pci/0000:01:00.0 name total_vfs value 8 cmode permanent

   # FW reset
   devlink dev reload pci/0000:01:00.0 action fw_activate

   # For PCI-related config such as SR-IOV, a PCI reset/rescan is required:
   echo 1 >/sys/bus/pci/devices/0000:01:00.0/remove
   echo 1 >/sys/bus/pci/rescan
   grep ^ /sys/bus/pci/devices/0000:01:00.0/sriov_*
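
The sysfs check that ends the example above could be wrapped in a small helper, sketched here under stated assumptions: the function name ``check_total_vfs`` and its exit codes are hypothetical and not part of this patch. Only the ``sriov_totalvfs`` attribute is taken from the documentation.

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): compare the sriov_totalvfs
# value that sysfs reports for a PCI device against the value that was
# written to the total_vfs devlink parameter.
#
# $1: sysfs directory of the device, e.g. /sys/bus/pci/devices/0000:01:00.0
# $2: expected number of VFs
check_total_vfs() {
    dev_dir=$1
    expected=$2

    # The attribute is absent if the device has no SR-IOV capability
    # or has not been rescanned yet.
    if [ ! -r "$dev_dir/sriov_totalvfs" ]; then
        echo "sriov_totalvfs not found under $dev_dir" >&2
        return 2
    fi

    actual=$(cat "$dev_dir/sriov_totalvfs")
    if [ "$actual" = "$expected" ]; then
        echo "total_vfs applied: $actual"
        return 0
    fi

    echo "total_vfs mismatch: expected $expected, sysfs reports $actual" >&2
    return 1
}
```

For example, ``check_total_vfs /sys/bus/pci/devices/0000:01:00.0 8`` would be expected to succeed once the FW reset and PCI rescan have completed.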


The ``mlx5`` driver also implements the following driver-specific
parameters.
@@ -117,6 +147,16 @@ parameters.
     - driverinit
     - Control the size (in packets) of the hairpin queues.

   * - ``cqe_compress_type``
     - string
     - permanent
     - Configure which mechanism/algorithm should be used by the NIC that will
       affect the rate (aggressiveness) of compressed CQEs depending on PCIe bus
       conditions and other internal NIC factors. This mode affects all queues
       that enable compression.
       * ``balanced`` : Merges fewer CQEs, resulting in a moderate compression ratio but maintaining a balance between bandwidth savings and performance
       * ``aggressive`` : Merges more CQEs into a single entry, achieving a higher compression rate and maximizing performance, particularly under high traffic loads
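
Selecting a ``cqe_compress_type`` value could look like the sketch below. It assumes the parameter is set with the same ``devlink dev param set`` syntax used for the other permanent parameters. The helper name ``cqe_compress_cmd`` is hypothetical, and it only prints the command so it can be reviewed before being run.

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): validate the requested
# cqe_compress_type and print the devlink command that would set it.
#
# $1: devlink device handle, e.g. pci/0000:01:00.0
# $2: compression type, either "balanced" or "aggressive"
cqe_compress_cmd() {
    dev=$1
    ctype=$2

    # Only the two values listed in the documentation are accepted here.
    case "$ctype" in
        balanced|aggressive) ;;
        *)
            echo "unsupported cqe_compress_type: $ctype" >&2
            return 1
            ;;
    esac

    echo "devlink dev param set $dev name cqe_compress_type value $ctype cmode permanent"
}
```

Running ``cqe_compress_cmd pci/0000:01:00.0 balanced`` prints the command for review. Because the parameter is permanent, the new value would presumably take effect only after a FW reset, as with the other permanent parameters.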

The ``mlx5`` driver supports reloading via ``DEVLINK_CMD_RELOAD``

Info versions
+1 −1
@@ -17,7 +17,7 @@ mlx5_core-y := main.o cmd.o debugfs.o fw.o eq.o uar.o pagealloc.o \
		fs_counters.o fs_ft_pool.o rl.o lag/debugfs.o lag/lag.o dev.o events.o wq.o lib/gid.o \
		lib/devcom.o lib/pci_vsc.o lib/dm.o lib/fs_ttc.o diag/fs_tracepoint.o \
		diag/fw_tracer.o diag/crdump.o devlink.o diag/rsc_dump.o diag/reporter_vnic.o \
		fw_reset.o qos.o lib/tout.o lib/aso.o wc.o fs_pool.o
		fw_reset.o qos.o lib/tout.o lib/aso.o wc.o fs_pool.o lib/nv_param.o

#
# Netdev basic
+8 −0
@@ -10,6 +10,7 @@
#include "esw/qos.h"
#include "sf/dev/dev.h"
#include "sf/sf.h"
#include "lib/nv_param.h"

static int mlx5_devlink_flash_update(struct devlink *devlink,
				     struct devlink_flash_update_params *params,
@@ -895,8 +896,14 @@ int mlx5_devlink_params_register(struct devlink *devlink)
	if (err)
		goto max_uc_list_err;

	err = mlx5_nv_param_register_dl_params(devlink);
	if (err)
		goto nv_param_err;

	return 0;

nv_param_err:
	mlx5_devlink_max_uc_list_params_unregister(devlink);
max_uc_list_err:
	mlx5_devlink_auxdev_params_unregister(devlink);
auxdev_reg_err:
@@ -907,6 +914,7 @@ int mlx5_devlink_params_register(struct devlink *devlink)

void mlx5_devlink_params_unregister(struct devlink *devlink)
{
	mlx5_nv_param_unregister_dl_params(devlink);
	mlx5_devlink_max_uc_list_params_unregister(devlink);
	mlx5_devlink_auxdev_params_unregister(devlink);
	devl_params_unregister(devlink, mlx5_devlink_params,
+1 −0
@@ -22,6 +22,7 @@ enum mlx5_devlink_param_id {
	MLX5_DEVLINK_PARAM_ID_ESW_MULTIPORT,
	MLX5_DEVLINK_PARAM_ID_HAIRPIN_NUM_QUEUES,
	MLX5_DEVLINK_PARAM_ID_HAIRPIN_QUEUE_SIZE,
	MLX5_DEVLINK_PARAM_ID_CQE_COMPRESSION_TYPE
};

struct mlx5_trap_ctx {