Commit 7deba791 authored by Martin Michaelis, committed by Jens Axboe

io_uring/kbuf: support min length left for incremental buffers

Incrementally consumed buffer rings are generally fully consumed, but
it's quite possible that the application has a minimum size it needs to
meet to avoid truncation. Currently that minimum limit is 1 byte, but
this should be a setting that is in the hands of the application. For
recvmsg multishot, a prime use case for incrementally consumed buffers,
the application may get spurious -EFAULT returned at the end of an
incrementally consumed buffer, as less space is available than the
headers need.

Grab a u32 field in struct io_uring_buf_reg, which the application can
use to inform the kernel of the minimum size that should be available
in an incrementally consumed buffer. If less than that is available,
the current buffer is fully processed and the next one will be picked.
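
The rule described above can be modelled in userspace. The following is a minimal sketch, not the kernel code: `struct model_buf`, `model_min_left_sub_one()` and `model_inc_consume()` are hypothetical names introduced for illustration, and only the comparison itself is taken from the `io_kbuf_inc_commit()` hunk in this commit. It shows why the kernel stores `min_left - 1`: with the default of 0, the check `left > 0` reproduces the old 1-byte minimum.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one provided-buffer ring entry. */
struct model_buf {
	uint64_t addr;
	uint32_t len;
};

/*
 * Mirrors the registration step: min_left == 0 means "not set", which
 * maps to 0 and keeps the previous behaviour (reuse while >= 1 byte left).
 */
static uint32_t model_min_left_sub_one(uint32_t min_left)
{
	return min_left ? min_left - 1 : 0;
}

/*
 * Model of the per-buffer decision. Returns true if the buffer should be
 * retired (the ring head moves to the next buffer), false if the buffer
 * is kept and advanced in place. What happens to a retired entry is
 * outside the hunk shown, so the model leaves it untouched.
 */
static bool model_inc_consume(struct model_buf *buf, uint32_t consumed,
			      uint32_t min_left_sub_one)
{
	uint32_t this_len = consumed < buf->len ? consumed : buf->len;
	uint32_t left = buf->len - this_len;

	/* Keep the buffer if enough is left, or if nothing was consumed. */
	if (left > min_left_sub_one || !this_len) {
		buf->addr += this_len;
		buf->len = left;
		return false;
	}
	return true;
}
```

For example, consuming 4000 bytes of a 4096-byte buffer leaves 96 bytes: with `min_left` unset the buffer is kept, while with `min_left` set to 128 it is retired and the next buffer is used.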

Cc: stable@vger.kernel.org
Fixes: ae98dbf4 ("io_uring/kbuf: add support for incremental buffer consumption")
Link: https://github.com/axboe/liburing/issues/1433


Signed-off-by: Martin Michaelis <code@mgjm.de>
[axboe: write commit message, change io_buffer_list member name]
Reviewed-by: Gabriel Krisman Bertazi <krisman@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
parent 55ea9683
+2 −1
@@ -905,7 +905,8 @@ struct io_uring_buf_reg {
	__u32	ring_entries;
	__u16	bgid;
	__u16	flags;
-	__u64	resv[3];
+	__u32	min_left;
+	__u32	resv[5];
};

/* argument for IORING_REGISTER_PBUF_STATUS */
+7 −1
@@ -47,7 +47,7 @@ static bool io_kbuf_inc_commit(struct io_buffer_list *bl, int len)
		this_len = min_t(u32, len, buf_len);
		buf_len -= this_len;
		/* Stop looping for invalid buffer length of 0 */
-		if (buf_len || !this_len) {
+		if (buf_len > bl->min_left_sub_one || !this_len) {
			WRITE_ONCE(buf->addr, READ_ONCE(buf->addr) + this_len);
			WRITE_ONCE(buf->len, buf_len);
			return false;
@@ -637,6 +637,10 @@ int io_register_pbuf_ring(struct io_ring_ctx *ctx, void __user *arg)
	if (reg.ring_entries >= 65536)
		return -EINVAL;

+	/* minimum left byte count is a property of incremental buffers */
+	if (!(reg.flags & IOU_PBUF_RING_INC) && reg.min_left)
+		return -EINVAL;
+
	bl = io_buffer_get_list(ctx, reg.bgid);
	if (bl) {
		/* if mapped buffer ring OR classic exists, don't allow */
@@ -683,6 +687,8 @@ int io_register_pbuf_ring(struct io_ring_ctx *ctx, void __user *arg)
	bl->mask = reg.ring_entries - 1;
	bl->flags |= IOBL_BUF_RING;
	bl->buf_ring = br;
+	if (reg.min_left)
+		bl->min_left_sub_one = reg.min_left - 1;
	if (reg.flags & IOU_PBUF_RING_INC)
		bl->flags |= IOBL_INC;
	ret = io_buffer_add_list(ctx, bl, reg.bgid);
+7 −0
@@ -32,6 +32,13 @@ struct io_buffer_list {

	__u16 flags;

+	/*
+	 * minimum required amount to be left to reuse an incrementally
+	 * consumed buffer. If less than this is left at consumption time,
+	 * buffer is done and head is incremented to the next buffer.
+	 */
+	__u32 min_left_sub_one;
+
	struct io_mapped_region region;
};
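
The struct io_uring_buf_reg change replaces three reserved `__u64` words with one `__u32` field plus five reserved `__u32` words, so the size of the structure should stay the same. The sketch below checks that at compile time using local mirror structs. `buf_reg_old` and `buf_reg_new` are hypothetical names, and the leading `ring_addr` member is an assumption, since the hunk only shows the struct from `ring_entries` onward.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirror of the layout before this commit (ring_addr is assumed). */
struct buf_reg_old {
	uint64_t ring_addr;
	uint32_t ring_entries;
	uint16_t bgid;
	uint16_t flags;
	uint64_t resv[3];
};

/* Mirror of the layout after this commit (ring_addr is assumed). */
struct buf_reg_new {
	uint64_t ring_addr;
	uint32_t ring_entries;
	uint16_t bgid;
	uint16_t flags;
	uint32_t min_left;
	uint32_t resv[5];
};

/* 3 * 8 reserved bytes become 4 + 5 * 4 bytes: the total is unchanged. */
static_assert(sizeof(struct buf_reg_old) == sizeof(struct buf_reg_new),
	      "struct size must not change");

/* min_left occupies the start of what used to be reserved space. */
static_assert(offsetof(struct buf_reg_new, min_left) ==
	      offsetof(struct buf_reg_old, resv),
	      "min_left must overlay the old reserved area");
```

Under these assumptions, an application that zeroes the structure before registering would pass `min_left == 0`, which the registration hunk above treats as unset.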