Commit 531d0d32 authored by Christoph Paasch's avatar Christoph Paasch Committed by Jakub Kicinski
Browse files

net/mlx5: Correctly set gso_size when LRO is used



gso_size is expected by the networking stack to be the size of the
payload (thus, not including ethernet/IP/TCP-headers). However, cqe_bcnt
is the full sized frame (including the headers). Dividing cqe_bcnt by
lro_num_seg will then give incorrect results.

For example, running a bpftrace higher up in the TCP-stack
(tcp_event_data_recv), we commonly have gso_size set to 1450 or 1451 even
though in reality the payload was only 1448 bytes.

This can have unintended consequences:
- In tcp_measure_rcv_mss() len will be for example 1450, but. rcv_mss
will be 1448 (because tp->advmss is 1448). Thus, we will always
recompute scaling_ratio each time an LRO-packet is received.
- In tcp_gro_receive(), it will interfere with the decision whether or
not to flush and thus potentially result in less gro'ed packets.

So, we need to discount the protocol headers from cqe_bcnt so we can
actually divide the payload by lro_num_seg to get the real gso_size.

v2:
 - Use "(unsigned char *)tcp + tcp->doff * 4 - skb->data)" to compute header-len
   (Tariq Toukan <tariqt@nvidia.com>)
 - Improve commit-message (Gal Pressman <gal@nvidia.com>)

Fixes: e586b3b0 ("net/mlx5: Ethernet Datapath files")
Signed-off-by: default avatarChristoph Paasch <cpaasch@openai.com>
Reviewed-by: default avatarTariq Toukan <tariqt@nvidia.com>
Reviewed-by: default avatarGal Pressman <gal@nvidia.com>
Link: https://patch.msgid.link/20250715-cpaasch-pf-925-investigate-incorrect-gso_size-on-cx-7-nic-v2-1-e06c3475f3ac@openai.com


Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
parent dae7f9cb
Loading
Loading
Loading
Loading
+8 −4
Original line number Diff line number Diff line
@@ -1154,7 +1154,8 @@ static void mlx5e_lro_update_tcp_hdr(struct mlx5_cqe64 *cqe, struct tcphdr *tcp)
	}
}

static void mlx5e_lro_update_hdr(struct sk_buff *skb, struct mlx5_cqe64 *cqe,
static unsigned int mlx5e_lro_update_hdr(struct sk_buff *skb,
					 struct mlx5_cqe64 *cqe,
					 u32 cqe_bcnt)
{
	struct ethhdr	*eth = (struct ethhdr *)(skb->data);
@@ -1205,6 +1206,8 @@ static void mlx5e_lro_update_hdr(struct sk_buff *skb, struct mlx5_cqe64 *cqe,
		tcp->check = tcp_v6_check(payload_len, &ipv6->saddr,
					  &ipv6->daddr, check);
	}

	return (unsigned int)((unsigned char *)tcp + tcp->doff * 4 - skb->data);
}

static void *mlx5e_shampo_get_packet_hd(struct mlx5e_rq *rq, u16 header_index)
@@ -1561,8 +1564,9 @@ static inline void mlx5e_build_rx_skb(struct mlx5_cqe64 *cqe,
		mlx5e_macsec_offload_handle_rx_skb(netdev, skb, cqe);

	if (lro_num_seg > 1) {
		mlx5e_lro_update_hdr(skb, cqe, cqe_bcnt);
		skb_shinfo(skb)->gso_size = DIV_ROUND_UP(cqe_bcnt, lro_num_seg);
		unsigned int hdrlen = mlx5e_lro_update_hdr(skb, cqe, cqe_bcnt);

		skb_shinfo(skb)->gso_size = DIV_ROUND_UP(cqe_bcnt - hdrlen, lro_num_seg);
		/* Subtract one since we already counted this as one
		 * "regular" packet in mlx5e_complete_rx_cqe()
		 */