Commit 983512f3 authored by Sebastian Andrzej Siewior, committed by Paolo Abeni

net: Drop the lock in skb_may_tx_timestamp()

skb_may_tx_timestamp() may acquire sock::sk_callback_lock. The lock must
not be taken in IRQ context; only softirq is okay. A few drivers receive
the timestamp via a dedicated interrupt and complete the TX timestamp
from that handler. This will lead to a deadlock if the lock is already
write-locked on the same CPU.

Taking the lock can be avoided. The socket (pointed to by the skb) will
remain valid until the skb is released. The ->sk_socket and ->file
members will be set to NULL once the user closes the socket, which may
happen before the timestamp arrives.
If we happen to observe a pointer while the socket is closing but
before the pointer is set to NULL, then we may use it because both
pointers (and the file's cred member) are RCU freed.

Drop the lock. Use READ_ONCE() to obtain the individual pointers. Add a
matching WRITE_ONCE() where the pointers are cleared.
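
The access pattern described above can be sketched in userspace C. This is a
minimal illustration only, not the kernel code: the toy_* types and functions
are hypothetical stand-ins, the READ_ONCE()/WRITE_ONCE() macros are simplified
volatile-cast versions (relying on the GCC/Clang __typeof__ extension), and the
RCU read section that protects the real pointers is only noted in a comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified userspace stand-ins for the kernel macros (assumption). */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical, heavily reduced models of file, socket and sock. */
struct toy_file   { bool cap_net_raw; };
struct toy_socket { struct toy_file *file; };
struct toy_sock   { struct toy_socket *sk_socket; };

/* Load each pointer exactly once into a local, then test the local.
 * Re-reading sk->sk_socket after the NULL check could observe NULL if
 * a concurrent close cleared it in between.
 */
static bool toy_may_tx_timestamp(struct toy_sock *sk)
{
	struct toy_socket *sock;
	struct toy_file *file;

	/* The kernel version does this inside rcu_read_lock()/rcu_read_unlock(). */
	sock = READ_ONCE(sk->sk_socket);
	if (!sock)
		return false;
	file = READ_ONCE(sock->file);
	if (!file)
		return false;
	return file->cap_net_raw;
}

/* Walk through a close sequence: the writers clear the pointers with
 * WRITE_ONCE(), and the reader copes with every intermediate state.
 */
void toy_selfcheck(void)
{
	struct toy_file file = { .cap_net_raw = true };
	struct toy_socket sock = { .file = &file };
	struct toy_sock sk = { .sk_socket = &sock };

	assert(toy_may_tx_timestamp(&sk));	/* socket still open */
	WRITE_ONCE(sock.file, NULL);
	assert(!toy_may_tx_timestamp(&sk));	/* file already cleared */
	WRITE_ONCE(sk.sk_socket, NULL);
	assert(!toy_may_tx_timestamp(&sk));	/* socket pointer cleared */
}
```

The sketch is single-threaded, so it only shows the shape of the reader and the
writers. It does not model the deferred freeing that makes a stale pointer safe
to dereference in the kernel.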

Link: https://lore.kernel.org/all/20260205145104.iWinkXHv@linutronix.de


Fixes: b245be1f ("net-timestamp: no-payload only sysctl")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260220183858.N4ERjFW6@linutronix.de


Signed-off-by: Paolo Abeni <pabeni@redhat.com>
parent 82aec772
+1 −1
@@ -2098,7 +2098,7 @@ static inline int sk_rx_queue_get(const struct sock *sk)

static inline void sk_set_socket(struct sock *sk, struct socket *sock)
{
-	sk->sk_socket = sock;
+	WRITE_ONCE(sk->sk_socket, sock);
	if (sock) {
		WRITE_ONCE(sk->sk_uid, SOCK_INODE(sock)->i_uid);
		WRITE_ONCE(sk->sk_ino, SOCK_INODE(sock)->i_ino);
+18 −5
@@ -5590,15 +5590,28 @@ static void __skb_complete_tx_timestamp(struct sk_buff *skb,

static bool skb_may_tx_timestamp(struct sock *sk, bool tsonly)
{
-	bool ret;
+	struct socket *sock;
+	struct file *file;
+	bool ret = false;

	if (likely(tsonly || READ_ONCE(sock_net(sk)->core.sysctl_tstamp_allow_data)))
		return true;

-	read_lock_bh(&sk->sk_callback_lock);
-	ret = sk->sk_socket && sk->sk_socket->file &&
-	      file_ns_capable(sk->sk_socket->file, &init_user_ns, CAP_NET_RAW);
-	read_unlock_bh(&sk->sk_callback_lock);
+	/* The sk pointer remains valid as long as the skb is. The sk_socket and
+	 * file pointer may become NULL if the socket is closed. Both structures
+	 * (including file->cred) are RCU freed which means they can be accessed
+	 * within a RCU read section.
+	 */
+	rcu_read_lock();
+	sock = READ_ONCE(sk->sk_socket);
+	if (!sock)
+		goto out;
+	file = READ_ONCE(sock->file);
+	if (!file)
+		goto out;
+	ret = file_ns_capable(file, &init_user_ns, CAP_NET_RAW);
+out:
+	rcu_read_unlock();
	return ret;
}

+1 −1
@@ -674,7 +674,7 @@ static void __sock_release(struct socket *sock, struct inode *inode)
		iput(SOCK_INODE(sock));
		return;
	}
-	sock->file = NULL;
+	WRITE_ONCE(sock->file, NULL);
}

/**