Commit 4b714fd1 authored by David S. Miller's avatar David S. Miller
Browse files

Merge branch 'vsock-virtio-vhost-zerocopy'

Arseniy Krasnov says:

====================
vsock/virtio/vhost: MSG_ZEROCOPY preparations

this patchset is first of three parts of another big patchset for
MSG_ZEROCOPY flag support:
https://lore.kernel.org/netdev/20230701063947.3422088-1-AVKrasnov@sberdevices.ru/

During review of this series, Stefano Garzarella <sgarzare@redhat.com>
suggested to split it for three parts to simplify review and merging:

1) virtio and vhost updates (for fragged skbs) <--- this patchset
2) AF_VSOCK updates (allows to enable MSG_ZEROCOPY mode and read
   tx completions) and update for Documentation/.
3) Updates for tests and utils.

This series enables handling of fragged skbs in virtio and vhost parts.
Newly logic won't be triggered, because SO_ZEROCOPY options is still
impossible to enable at this moment (next bunch of patches from big
set above will enable it).

I've included changelog to some patches anyway, because there were some
comments during review of last big patchset from the link above.

Head for this patchset is:
https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=f2fa1c812c91e99d0317d1fc7d845e1e05f39716

Link to v1:
https://lore.kernel.org/netdev/20230717210051.856388-1-AVKrasnov@sberdevices.ru/
Link to v2:
https://lore.kernel.org/netdev/20230718180237.3248179-1-AVKrasnov@sberdevices.ru/
Link to v3:
https://lore.kernel.org/netdev/20230720214245.457298-1-AVKrasnov@sberdevices.ru/
Link to v4:
https://lore.kernel.org/netdev/20230727222627.1895355-1-AVKrasnov@sberdevices.ru/
Link to v5:
https://lore.kernel.org/netdev/20230730085905.3420811-1-AVKrasnov@sberdevices.ru/
Link to v6:
https://lore.kernel.org/netdev/20230814212720.3679058-1-AVKrasnov@sberdevices.ru/
Link to v7:
https://lore.kernel.org/netdev/20230827085436.941183-1-avkrasnov@salutedevices.com/
Link to v8:
https://lore.kernel.org/netdev/20230911202234.1932024-1-avkrasnov@salutedevices.com/



Changelog:
 v3 -> v4:
 * Patchset rebased and tested on new HEAD of net-next (see hash above).
 v4 -> v5:
 * See per-patch changelog after ---.
 v5 -> v6:
 * Patchset rebased and tested on new HEAD of net-next (see hash above).
 * See per-patch changelog after ---.
 v6 -> v7:
 * Patchset rebased and tested on new HEAD of net-next (see hash above).
 * See per-patch changelog after ---.
 v7 -> v8:
 * Patchset rebased and tested on new HEAD of net-next (see hash above).
 * See per-patch changelog after ---.
 v8 -> v9:
 * Patchset rebased and tested on new HEAD of net-next (see hash above).
 * See per-patch changelog after ---.
====================

Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
parents 85605fb6 8d211285
Loading
Loading
Loading
Loading
+11 −2
Original line number Diff line number Diff line
@@ -7,7 +7,8 @@ Intro
=====

The MSG_ZEROCOPY flag enables copy avoidance for socket send calls.
The feature is currently implemented for TCP and UDP sockets.
The feature is currently implemented for TCP, UDP and VSOCK (with
virtio transport) sockets.


Opportunity and Caveats
@@ -174,7 +175,9 @@ read_notification() call in the previous snippet. A notification
is encoded in the standard error format, sock_extended_err.

The level and type fields in the control data are protocol family
specific, IP_RECVERR or IPV6_RECVERR.
specific, IP_RECVERR or IPV6_RECVERR (for TCP or UDP socket).
For VSOCK socket, cmsg_level will be SOL_VSOCK and cmsg_type will be
VSOCK_RECVERR.

Error origin is the new type SO_EE_ORIGIN_ZEROCOPY. ee_errno is zero,
as explained before, to avoid blocking read and write system calls on
@@ -235,12 +238,15 @@ Implementation
Loopback
--------

For TCP and UDP:
Data sent to local sockets can be queued indefinitely if the receive
process does not read its socket. Unbound notification latency is not
acceptable. For this reason all packets generated with MSG_ZEROCOPY
that are looped to a local socket will incur a deferred copy. This
includes looping onto packet sockets (e.g., tcpdump) and tun devices.

For VSOCK:
Data path sent to local sockets is the same as for non-local sockets.

Testing
=======
@@ -254,3 +260,6 @@ instance when run with msg_zerocopy.sh between a veth pair across
namespaces, the test will not show any improvement. For testing, the
loopback restriction can be temporarily relaxed by making
skb_orphan_frags_rx identical to skb_orphan_frags.

For VSOCK type of socket example can be found in
tools/testing/vsock/vsock_test_zerocopy.c.
+7 −0
Original line number Diff line number Diff line
@@ -398,6 +398,11 @@ static bool vhost_vsock_more_replies(struct vhost_vsock *vsock)
	return val < vq->num;
}

static bool vhost_transport_msgzerocopy_allow(void)
{
	return true;
}

static bool vhost_transport_seqpacket_allow(u32 remote_cid);

static struct virtio_transport vhost_transport = {
@@ -431,6 +436,8 @@ static struct virtio_transport vhost_transport = {
		.seqpacket_allow          = vhost_transport_seqpacket_allow,
		.seqpacket_has_data       = virtio_transport_seqpacket_has_data,

		.msgzerocopy_allow        = vhost_transport_msgzerocopy_allow,

		.notify_poll_in           = virtio_transport_notify_poll_in,
		.notify_poll_out          = virtio_transport_notify_poll_out,
		.notify_recv_init         = virtio_transport_notify_recv_init,
+1 −0
Original line number Diff line number Diff line
@@ -383,6 +383,7 @@ struct ucred {
#define SOL_MPTCP	284
#define SOL_MCTP	285
#define SOL_SMC		286
#define SOL_VSOCK	287

/* IPX options */
#define IPX_TYPE	1
+7 −0
Original line number Diff line number Diff line
@@ -177,6 +177,9 @@ struct vsock_transport {

	/* Read a single skb */
	int (*read_skb)(struct vsock_sock *, skb_read_actor_t);

	/* Zero-copy. */
	bool (*msgzerocopy_allow)(void);
};

/**** CORE ****/
@@ -241,4 +244,8 @@ static inline void __init vsock_bpf_build_proto(void)
{}
#endif

static inline bool vsock_msgzerocopy_allow(const struct vsock_transport *t)
{
	return t->msgzerocopy_allow && t->msgzerocopy_allow();
}
#endif /* __AF_VSOCK_H__ */
+17 −0
Original line number Diff line number Diff line
@@ -191,4 +191,21 @@ struct sockaddr_vm {

#define IOCTL_VM_SOCKETS_GET_LOCAL_CID		_IO(7, 0xb9)

/* MSG_ZEROCOPY notifications are encoded in the standard error format,
 * sock_extended_err. See Documentation/networking/msg_zerocopy.rst in
 * kernel source tree for more details.
 */

/* 'cmsg_level' field value of 'struct cmsghdr' for notification parsing
 * when MSG_ZEROCOPY flag is used on transmissions.
 */

#define SOL_VSOCK	287

/* 'cmsg_type' field value of 'struct cmsghdr' for notification parsing
 * when MSG_ZEROCOPY flag is used on transmissions.
 */

#define VSOCK_RECVERR	1

#endif /* _UAPI_VM_SOCKETS_H */
Loading