Commit 846c76ec authored by KaFai Wan, committed by Martin KaFai Lau

bpf: Reject TCP_NODELAY in TCP header option callbacks

A BPF_SOCK_OPS program can enable
BPF_SOCK_OPS_WRITE_HDR_OPT_CB_FLAG and then call
bpf_setsockopt(TCP_NODELAY) from BPF_SOCK_OPS_HDR_OPT_LEN_CB or
BPF_SOCK_OPS_WRITE_HDR_OPT_CB.

In these callbacks, bpf_setsockopt(TCP_NODELAY) can reach
__tcp_sock_set_nodelay(), which can call tcp_push_pending_frames().

From BPF_SOCK_OPS_HDR_OPT_LEN_CB, tcp_push_pending_frames() can call
tcp_current_mss(), which calls tcp_established_options() and re-enters
bpf_skops_hdr_opt_len().

BPF_SOCK_OPS_HDR_OPT_LEN_CB
  -> bpf_setsockopt(TCP_NODELAY)
    -> tcp_push_pending_frames()
      -> tcp_current_mss()
        -> tcp_established_options()
          -> bpf_skops_hdr_opt_len()
            -> BPF_SOCK_OPS_HDR_OPT_LEN_CB

From BPF_SOCK_OPS_WRITE_HDR_OPT_CB, tcp_push_pending_frames() can call
tcp_write_xmit(), which calls tcp_transmit_skb().  That path recomputes
header option length through tcp_established_options() and
bpf_skops_hdr_opt_len() before re-entering bpf_skops_write_hdr_opt().

BPF_SOCK_OPS_WRITE_HDR_OPT_CB
  -> bpf_setsockopt(TCP_NODELAY)
    -> tcp_push_pending_frames()
      -> tcp_write_xmit()
        -> tcp_transmit_skb()
          -> tcp_established_options()
            -> bpf_skops_hdr_opt_len()
          -> bpf_skops_write_hdr_opt()
            -> BPF_SOCK_OPS_WRITE_HDR_OPT_CB

This leads to unbounded recursion and can overflow the kernel stack.
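
The re-entry in the first call chain can be illustrated with a toy user-space model. This is only a sketch, not kernel code: every model_* name and MODEL_DEPTH_CAP are hypothetical, and the depth cap exists solely so the model terminates. The kernel path described above has no such cap.

```c
#include <stdbool.h>

/* Toy user-space model of the re-entry; all names here are hypothetical. */
#define MODEL_DEPTH_CAP 64	/* stand-in for a finite stack, so the model stops */

struct model_state {
	int depth;		/* current nesting of the modelled callback */
	int max_depth;		/* deepest nesting observed */
	bool prog_sets_nodelay;	/* modelled program calls setsockopt(TCP_NODELAY)? */
};

static void model_hdr_opt_len_cb(struct model_state *st);

/* Collapses tcp_push_pending_frames() -> tcp_current_mss() ->
 * tcp_established_options() -> bpf_skops_hdr_opt_len() into one step.
 */
static void model_push_pending_frames(struct model_state *st)
{
	model_hdr_opt_len_cb(st);
}

/* Models BPF_SOCK_OPS_HDR_OPT_LEN_CB running the attached program. */
static void model_hdr_opt_len_cb(struct model_state *st)
{
	st->depth++;
	if (st->depth > st->max_depth)
		st->max_depth = st->depth;

	/* The cap is an assumption of this model only. */
	if (st->prog_sets_nodelay && st->depth < MODEL_DEPTH_CAP)
		model_push_pending_frames(st);

	st->depth--;
}

/* Returns the deepest callback nesting reached for one top-level call. */
static int model_max_depth(bool prog_sets_nodelay)
{
	struct model_state st = { 0, 0, prog_sets_nodelay };

	model_hdr_opt_len_cb(&st);
	return st.max_depth;
}
```

In this model a program that never sets TCP_NODELAY nests once, while one that does nests until the artificial cap is hit, which mirrors why the real path can exhaust the stack.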

Reject TCP_NODELAY with -EOPNOTSUPP in bpf_sock_ops_setsockopt()
when bpf_setsockopt() is called from
BPF_SOCK_OPS_HDR_OPT_LEN_CB or BPF_SOCK_OPS_WRITE_HDR_OPT_CB.
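
The decision made by the new check can be sketched as a standalone predicate. This is a user-space illustration under stated assumptions: the model_* and MODEL_* names and their numeric values are placeholders, not the UAPI constants, and returning 0 merely stands in for forwarding the call to _bpf_setsockopt().

```c
#include <errno.h>

/* Placeholder identifiers; values are arbitrary and not taken from kernel headers. */
enum model_op {
	MODEL_OP_OTHER,
	MODEL_OP_HDR_OPT_LEN_CB,
	MODEL_OP_WRITE_HDR_OPT_CB,
};

enum { MODEL_SOL_TCP = 1, MODEL_SOL_OTHER = 2 };
enum { MODEL_TCP_NODELAY = 1, MODEL_TCP_OTHER = 2 };

/* Mirrors the shape of the check: reject only TCP_NODELAY at the TCP level,
 * and only when invoked from one of the two header option callbacks.
 */
static int model_sock_ops_setsockopt(enum model_op op, int level, int optname)
{
	if ((op == MODEL_OP_HDR_OPT_LEN_CB ||
	     op == MODEL_OP_WRITE_HDR_OPT_CB) &&
	    level == MODEL_SOL_TCP && optname == MODEL_TCP_NODELAY)
		return -EOPNOTSUPP;

	return 0; /* stands in for the call to _bpf_setsockopt() */
}
```

The check is deliberately narrow: other options in the same callbacks, and TCP_NODELAY from any other callback, are unaffected.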

Fixes: 7e41df5d ("bpf: Add a few optnames to bpf_setsockopt")
Closes: https://lore.kernel.org/bpf/d1d523c9-6901-4454-a183-94462b8f3e4e@std.uestc.edu.cn/


Reported-by: Quan Sun <2022090917019@std.uestc.edu.cn>
Reported-by: Yinhao Hu <dddddd@hust.edu.cn>
Reported-by: Kaiyan Mei <M202472210@hust.edu.cn>
Signed-off-by: KaFai Wan <kafai.wan@linux.dev>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Link: https://patch.msgid.link/20260421155804.135786-2-kafai.wan@linux.dev
parent eb5249b1
+6 −0
@@ -5833,6 +5833,12 @@ BPF_CALL_5(bpf_sock_ops_setsockopt, struct bpf_sock_ops_kern *, bpf_sock,
	if (!is_locked_tcp_sock_ops(bpf_sock))
		return -EOPNOTSUPP;

	/* TCP_NODELAY triggers tcp_push_pending_frames() and re-enters these callbacks. */
	if ((bpf_sock->op == BPF_SOCK_OPS_HDR_OPT_LEN_CB ||
	     bpf_sock->op == BPF_SOCK_OPS_WRITE_HDR_OPT_CB) &&
	    level == SOL_TCP && optname == TCP_NODELAY)
		return -EOPNOTSUPP;

	return _bpf_setsockopt(bpf_sock->sk, level, optname, optval, optlen);
}