Commit 474bb1aa authored by Jakub Kicinski

Merge tag 'mlx5-updates-2024-09-02' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2024-08-29

HW-Managed Flow Steering in mlx5 driver

Yevgeny Kliteynik says:
=======================

1. Overview
-----------

ConnectX devices support packet matching, modification, and redirection.
This functionality is referred to as Flow Steering.
To configure a steering rule, the rule is written to the device-owned
memory. This memory is accessed and cached by the device when processing
a packet.

The first implementation of Flow Steering was done in FW, and it is
referred to in the mlx5 driver as Device-Managed Flow Steering (DMFS).
Later we introduced SW-managed Flow Steering (SWS or SMFS), where the
driver writes directly to the device's configuration memory (ICM)
through RC QP using RDMA operations (RDMA-read and RDMA-write), thus
achieving higher rates of rule insertion/deletion.

Now we introduce a new flow steering implementation: HW-Managed Flow
Steering (HWS or HMFS).

In this new approach, the driver configures steering rules directly
in the HW using WQs with a special new type of WQE. This way we can
reach a higher rule insertion/deletion rate with much lower CPU
utilization compared to SWS.

The key benefits of HWS as opposed to SWS:
+ HW manages the steering decision tree
   - HW calculates CRC for each entry
   - HW handles tree hash collisions
   - HW & FW manage objects refcount
+ HW keeps cache coherency:
   - HW provides tree access locking and synchronization
   - HW provides notification on completion
+ Insertion rate isn’t affected by background traffic
   - Dedicated HW components that handle insertion

2. Performance
--------------

Measuring Connection Tracking with simple IPv4 flows w/o NAT, we
are able to get ~5 times more flows offloaded per second using HWS.

3. Configuration
----------------

The enablement of HWS mode in eswitch manager is done using the same
devlink param that is already used for switching between FW-managed
steering and SW-managed steering modes:

  # devlink dev param set pci/<PCI_ID> name flow_steering_mode cmod runtime value hmfs

4. Upstream Submission
----------------------

HWS support consists of 3 main components:
+ Steering:
   - The lower layer that exposes HWS API to upper layers and implements
     all the management of flow steering building blocks
+ FS-Core
   - Implementation of fs_hws layer to enable fs_core to use HWS instead
     of FW or SW steering
   - Create HW steering action pools to utilize the ability of HWS to
     share steering actions among different rules
   - Add support for configuring HWS mode through devlink command,
     similar to configuring SWS mode
+ Connection Tracking
   - Implementation of CT support for HW steering
   - Hooks up the CT ops for the new steering mode and uses the HWS API
     to implement connection tracking.

Because of the large number of patches, we need to perform the submission
in several separate patch series. This series is the first submission and
lays the groundwork for the next submissions, where an actual user of HWS
will be added.

5. Patches in this series
-------------------------

This patch series contains the implementation of the first component
listed above (Steering).

=======================

* tag 'mlx5-updates-2024-09-02' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5: HWS, added API and enabled HWS support
  net/mlx5: HWS, added send engine and context handling
  net/mlx5: HWS, added debug dump and internal headers
  net/mlx5: HWS, added backward-compatible API handling
  net/mlx5: HWS, added memory management handling
  net/mlx5: HWS, added vport handling
  net/mlx5: HWS, added modify header pattern and args handling
  net/mlx5: HWS, added FW commands handling
  net/mlx5: HWS, added matchers functionality
  net/mlx5: HWS, added definers handling
  net/mlx5: HWS, added rules handling
  net/mlx5: HWS, added tables handling
  net/mlx5: HWS, added actions handling
  net/mlx5: Added missing definitions in preparation for HW Steering
  net/mlx5: Added missing mlx5_ifc definition for HW Steering
====================

Link: https://patch.msgid.link/20240909181250.41596-1-saeed@kernel.org


Signed-off-by: Jakub Kicinski <kuba@kernel.org>
parents ea403549 510f9f61
+3 −0
@@ -130,6 +130,9 @@ Enabling the driver and kconfig options

|    Build support for software-managed steering in the NIC.

**CONFIG_MLX5_HW_STEERING=(y/n)**

|    Build support for hardware-managed steering in the NIC.

**CONFIG_MLX5_TC_CT=(y/n)**

+10 −0
@@ -172,6 +172,16 @@ config MLX5_SW_STEERING
	help
	Build support for software-managed steering in the NIC.

config MLX5_HW_STEERING
	bool "Mellanox Technologies hardware-managed steering"
	depends on MLX5_CORE_EN && MLX5_ESWITCH
	default y
	help
	Build support for Hardware-Managed Flow Steering (HMFS) in the NIC.
	HMFS is a new approach to managing steering rules where STEs are
	written to ICM by HW (as opposed to SW in software-managed steering),
	which allows higher rate of rule insertion.

config MLX5_SF
	bool "Mellanox Technologies subfunction device support using auxiliary device"
	depends on MLX5_CORE && MLX5_CORE_EN
+21 −0
@@ -119,6 +119,27 @@ mlx5_core-$(CONFIG_MLX5_SW_STEERING) += steering/dr_domain.o steering/dr_table.o
					steering/dr_action.o steering/fs_dr.o \
					steering/dr_definer.o steering/dr_ptrn.o \
					steering/dr_arg.o steering/dr_dbg.o lib/smfs.o

#
# HW Steering
#
mlx5_core-$(CONFIG_MLX5_HW_STEERING) += steering/hws/mlx5hws_cmd.o \
					steering/hws/mlx5hws_context.o \
					steering/hws/mlx5hws_pat_arg.o \
					steering/hws/mlx5hws_buddy.o \
					steering/hws/mlx5hws_pool.o \
					steering/hws/mlx5hws_table.o \
					steering/hws/mlx5hws_action.o \
					steering/hws/mlx5hws_rule.o \
					steering/hws/mlx5hws_matcher.o \
					steering/hws/mlx5hws_send.o \
					steering/hws/mlx5hws_definer.o \
					steering/hws/mlx5hws_bwc.o \
					steering/hws/mlx5hws_debug.o \
					steering/hws/mlx5hws_vport.o \
					steering/hws/mlx5hws_bwc_complex.o


#
# SF device
#
+6 −2
@@ -110,7 +110,9 @@ enum fs_flow_table_type {
	FS_FT_RDMA_RX		= 0X7,
	FS_FT_RDMA_TX		= 0X8,
	FS_FT_PORT_SEL		= 0X9,
-	FS_FT_MAX_TYPE = FS_FT_PORT_SEL,
+	FS_FT_FDB_RX		= 0xa,
+	FS_FT_FDB_TX		= 0xb,
+	FS_FT_MAX_TYPE = FS_FT_FDB_TX,
};

enum fs_flow_table_op_mod {
@@ -368,7 +370,9 @@ struct mlx5_flow_root_namespace *find_root(struct fs_node *node);
	(type == FS_FT_RDMA_RX) ? MLX5_CAP_FLOWTABLE_RDMA_RX(mdev, cap) :		\
	(type == FS_FT_RDMA_TX) ? MLX5_CAP_FLOWTABLE_RDMA_TX(mdev, cap) :      \
	(type == FS_FT_PORT_SEL) ? MLX5_CAP_FLOWTABLE_PORT_SELECTION(mdev, cap) :      \
-	(BUILD_BUG_ON_ZERO(FS_FT_PORT_SEL != FS_FT_MAX_TYPE))\
+	(type == FS_FT_FDB_RX) ? MLX5_CAP_ESW_FLOWTABLE_FDB(mdev, cap) :      \
+	(type == FS_FT_FDB_TX) ? MLX5_CAP_ESW_FLOWTABLE_FDB(mdev, cap) :      \
+	(BUILD_BUG_ON_ZERO(FS_FT_FDB_TX != FS_FT_MAX_TYPE))\
	)

#endif
+6 −6
@@ -251,9 +251,9 @@ int mlx5dr_cmd_query_flow_table(struct mlx5_core_dev *dev,
	output->level = MLX5_GET(query_flow_table_out, out, flow_table_context.level);

	output->sw_owner_icm_root_1 = MLX5_GET64(query_flow_table_out, out,
-						 flow_table_context.sw_owner_icm_root_1);
+						 flow_table_context.sws.sw_owner_icm_root_1);
	output->sw_owner_icm_root_0 = MLX5_GET64(query_flow_table_out, out,
-						 flow_table_context.sw_owner_icm_root_0);
+						 flow_table_context.sws.sw_owner_icm_root_0);

	return 0;
}
@@ -480,15 +480,15 @@ int mlx5dr_cmd_create_flow_table(struct mlx5_core_dev *mdev,
		 */
		if (attr->table_type == MLX5_FLOW_TABLE_TYPE_NIC_RX) {
			MLX5_SET64(flow_table_context, ft_mdev,
-				   sw_owner_icm_root_0, attr->icm_addr_rx);
+				   sws.sw_owner_icm_root_0, attr->icm_addr_rx);
		} else if (attr->table_type == MLX5_FLOW_TABLE_TYPE_NIC_TX) {
			MLX5_SET64(flow_table_context, ft_mdev,
-				   sw_owner_icm_root_0, attr->icm_addr_tx);
+				   sws.sw_owner_icm_root_0, attr->icm_addr_tx);
		} else if (attr->table_type == MLX5_FLOW_TABLE_TYPE_FDB) {
			MLX5_SET64(flow_table_context, ft_mdev,
-				   sw_owner_icm_root_0, attr->icm_addr_rx);
+				   sws.sw_owner_icm_root_0, attr->icm_addr_rx);
			MLX5_SET64(flow_table_context, ft_mdev,
-				   sw_owner_icm_root_1, attr->icm_addr_tx);
+				   sws.sw_owner_icm_root_1, attr->icm_addr_tx);
		}
	}
