Commit Graph

248 Commits

Author SHA1 Message Date
Jason Wang
f2ad9d6b4e virtio_ring: switch to use vring_virtqueue for detach_unused_buf variants
Those variants are used internally so let's switch to use
vring_virtqueue as parameter to be consistent with other internal
virtqueue helpers.

Acked-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20251230064649.55597-12-jasowang@redhat.com>
2025-12-31 05:39:18 -05:00
Jason Wang
7e81017673 virtio_ring: switch to use vring_virtqueue for disable_cb variants
Those variants are used internally so let's switch to use
vring_virtqueue as parameter to be consistent with other internal
virtqueue helpers.

Acked-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20251230064649.55597-11-jasowang@redhat.com>
2025-12-31 05:39:18 -05:00
Jason Wang
62fa22cdab virtio_ring: use vring_virtqueue for enable_cb_delayed variants
Those variants are used internally so let's switch to use
vring_virtqueue as parameter to be consistent with other internal
virtqueue helpers.

Acked-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20251230064649.55597-10-jasowang@redhat.com>
2025-12-31 05:39:18 -05:00
Jason Wang
74847cb573 virtio_ring: switch to use vring_virtqueue for enable_cb_prepare variants
Those variants are used internally so let's switch to use
vring_virtqueue as parameter to be consistent with other internal
virtqueue helpers.

Acked-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20251230064649.55597-9-jasowang@redhat.com>
2025-12-31 05:39:17 -05:00
Jason Wang
ceea1cd0ae virtio: switch to use vring_virtqueue for virtqueue_get variants
Those variants are used internally so let's switch to use
vring_virtqueue as parameter to be consistent with other internal
virtqueue helpers.

Acked-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20251230064649.55597-8-jasowang@redhat.com>
2025-12-31 05:39:17 -05:00
Jason Wang
4a0fa90b10 virtio_ring: switch to use vring_virtqueue for virtqueue_add variants
Those variants are used internally so let's switch to use
vring_virtqueue as parameter to be consistent with other internal
virtqueue helpers.

Acked-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20251230064649.55597-7-jasowang@redhat.com>
2025-12-31 05:39:17 -05:00
Jason Wang
8b8590b708 virtio_ring: switch to use vring_virtqueue for virtqueue_kick_prepare variants
Those variants are used internally so let's switch to use
vring_virtqueue as parameter to be consistent with other internal
virtqueue helpers.

Acked-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20251230064649.55597-6-jasowang@redhat.com>
2025-12-31 05:39:17 -05:00
Jason Wang
9552bc0581 virtio_ring: switch to use vring_virtqueue for virtqueue resize variants
Those variants are used internally so let's switch to use
vring_virtqueue as parameter to be consistent with other internal
virtqueue helpers.

Acked-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20251230064649.55597-5-jasowang@redhat.com>
2025-12-31 05:39:17 -05:00
Jason Wang
40da006f13 virtio_ring: unify logic of virtqueue_poll() and more_used()
This patch unifies the logic of virtqueue_poll() and more_used() for
better code reusing and ease the future in order implementation.

Acked-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20251230064649.55597-4-jasowang@redhat.com>
2025-12-31 05:39:17 -05:00
Jason Wang
79f6d68293 virtio_ring: switch to use vring_virtqueue in virtqueue_poll variants
Those variants are used internally so let's switch to use
vring_virtqueue as parameter to be consistent with other internal
virtqueue helpers.

Acked-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20251230064649.55597-3-jasowang@redhat.com>
2025-12-31 05:39:17 -05:00
Jason Wang
8ce8e3e558 virtio_ring: rename virtqueue_reinit_xxx to virtqueue_reset_xxx()
To be consistent with virtqueue_reset().

Acked-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20251230064649.55597-2-jasowang@redhat.com>
2025-12-31 05:39:17 -05:00
zhangdongchuan@eswincomputing.com
4b7bf8d550 virtio_ring: code cleanup in detach_buf_split
Since the return value of vring_unmap_one_split() is exactly
vq->split.desc_extra[i].next, 'i = vq->split.desc_extra[i].next' is
redundant. Assign vring_unmap_one_split() to i instead.

Since vq->split.desc_extra is assigned to extra, use extra[i].next
instead of vq->split.desc_extra[i].next to improve readability.

No change in functionality.

Signed-off-by: zhangdongchuan <zhangdongchuan@eswincomputing.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <202511261140162936986@eswincomputing.com>
2025-12-26 15:00:00 -05:00
Kriish Sharma
f811300085 virtio: fix kernel-doc for mapping/free_coherent functions
Documentation build reported:

  WARNING: ./drivers/virtio/virtio_ring.c:3174 function parameter 'vaddr' not described in 'virtqueue_map_free_coherent'
  WARNING: ./drivers/virtio/virtio_ring.c:3308 expecting prototype for virtqueue_mapping_error(). Prototype was for virtqueue_map_mapping_error() instead

The kernel-doc block for virtqueue_map_free_coherent() omitted the @vaddr parameter, and
the kernel-doc header for virtqueue_map_mapping_error() used the wrong function name
(virtqueue_mapping_error) instead of the actual function name.

This change updates:

  - the function name in the comment to virtqueue_map_mapping_error()
  - adds the missing @vaddr description in the comment for virtqueue_map_free_coherent()

Fixes: b41cb3bcf6 ("virtio: rename dma helpers")
Signed-off-by: Kriish Sharma <kriish.sharma2006@gmail.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20251110202920.2250244-1-kriish.sharma2006@gmail.com>
2025-11-27 02:03:05 -05:00
Linus Torvalds
bf897d2626 Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio updates from Michael Tsirkin:
 "Just fixes and cleanups this time around. The mapping cleanups are
  preparing the ground for new features, though"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  virtio-vdpa: Drop redundant conversion to bool
  vduse: Use fixed 4KB bounce pages for non-4KB page size
  vduse: switch to use virtio map API instead of DMA API
  vdpa: introduce map ops
  vdpa: support virtio_map
  virtio: introduce map ops in virtio core
  virtio_ring: rename dma_handle to map_handle
  virtio: introduce virtio_map container union
  virtio: rename dma helpers
  virtio_ring: switch to use dma_{map|unmap}_page()
  virtio_ring: constify virtqueue pointer for DMA helpers
  virtio_balloon: Remove redundant __GFP_NOWARN
  vhost: vringh: Fix copy_to_iter return value check
  vhost: vringh: Modify the return value check
2025-10-04 08:48:16 -07:00
Jason Wang
bee8c7c24b virtio: introduce map ops in virtio core
This patch introduces map operations for virtio device. Virtio used to
use DMA API which is not necessarily the case since some devices
doesn't do DMA. Instead of using tricks and abusing DMA API, let's
simply abstract the current mapping logic into a virtio specific
mapping operations. For the device or transport that doesn't do DMA,
they can implement their own mapping logic without the need to trick
DMA core. In this case the mapping metadata is opaque to the virtio
core that will be passed back to the transport or device specific map
operations. For other devices, DMA API will still be used, so map
token will still be the dma device to minimize the changeset and
performance impact.

The mapping operations are abstracted as a independent structure
instead of reusing virtio_config_ops. This allows the transport can
simply reuse the structure for lower layers like vDPA.

A set of new mapping helpers were introduced for the device that want
to do mapping by themselves.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20250821064641.5025-7-jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
2025-10-01 07:24:43 -04:00
Jason Wang
201e52ffe3 virtio_ring: rename dma_handle to map_handle
Following patch will introduce virtio map operations which means the
address is not necessarily used for DMA. Let's rename the dma_handle
to map_handle first.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20250821064641.5025-6-jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
2025-10-01 07:24:43 -04:00
Jason Wang
b16060c5c7 virtio: introduce virtio_map container union
Following patch will introduce the mapping operations for virtio
device. In order to achieve this, besides the dma device, virtio core
needs to support a transport or device specific mapping metadata as well.
So this patch introduces a union container of a dma device. The idea
is the allow the transport layer to pass device specific mapping
metadata which will be used as a parameter for the virtio mapping
operations. For the transport or device that is using DMA, dma device
is still being used.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20250821064641.5025-5-jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
2025-10-01 07:24:43 -04:00
Jason Wang
b41cb3bcf6 virtio: rename dma helpers
Following patch will introduce virtio mapping function to avoid
abusing DMA API for device that doesn't do DMA. To ease the
introduction, this patch rename "dma" to "map" for the current dma
mapping helpers.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20250821064641.5025-4-jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
2025-10-01 07:24:43 -04:00
Jason Wang
447beec806 virtio_ring: switch to use dma_{map|unmap}_page()
This patch switches to use dma_{map|unmap}_page() to reduce the
coverage of DMA operations. This would help for the following rework
on the virtio map operations.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20250821064641.5025-3-jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
2025-10-01 07:24:43 -04:00
Jason Wang
7d096cb3e1 virtio_ring: constify virtqueue pointer for DMA helpers
This patch constifies the virtqueue pointer for DMA helpers.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20250821064641.5025-2-jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
2025-10-01 07:24:43 -04:00
Leon Romanovsky
6eb1e769b2 kmsan: convert kmsan_handle_dma to use physical addresses
Convert the KMSAN DMA handling function from page-based to physical
address-based interface.

The refactoring renames kmsan_handle_dma() parameters from accepting
(struct page *page, size_t offset, size_t size) to (phys_addr_t phys,
size_t size). The existing semantics where callers are expected to
provide only kmap memory is continued here.

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://lore.kernel.org/r/3557cbaf66e935bc794f37d2b891ef75cbf2c80c.1757423202.git.leonro@nvidia.com
2025-09-12 00:18:20 +02:00
Michael S. Tsirkin
482bd84f1f virtio: document ENOSPC
drivers handle ENOSPC specially since it's an error one can
get from a working VQ. Document the semantics.

Message-Id: <2e6ec46b8d5e6755be291cec8e2ec57ef286e97b.1748356035.git.mst@redhat.com>
Reported-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
2025-08-01 09:11:07 -04:00
Laurent Vivier
45ebc7e6c1 virtio_ring: Fix error reporting in virtqueue_resize
The virtqueue_resize() function was not correctly propagating error codes
from its internal resize helper functions, specifically
virtqueue_resize_packet() and virtqueue_resize_split(). If these helpers
returned an error, but the subsequent call to virtqueue_enable_after_reset()
succeeded, the original error from the resize operation would be masked.
Consequently, virtqueue_resize() could incorrectly report success to its
caller despite an underlying resize failure.

This change restores the original code behavior:

       if (vdev->config->enable_vq_after_reset(_vq))
               return -EBUSY;

       return err;

Fix: commit ad48d53b5b ("virtio_ring: separate the logic of reset/enable from virtqueue_resize")
Cc: xuanzhuo@linux.alibaba.com
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://patch.msgid.link/20250521092236.661410-2-lvivier@redhat.com
Tested-by: Lei Yang <leiyang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-03 11:40:02 +02:00
Zhongqiu Han
2e2f925fe7 virtio_ring: Fix data race by tagging event_triggered as racy for KCSAN
syzbot reports a data-race when accessing the event_triggered, here is the
simplified stack when the issue occurred:

==================================================================
BUG: KCSAN: data-race in virtqueue_disable_cb / virtqueue_enable_cb_delayed

write to 0xffff8881025bc452 of 1 bytes by task 3288 on cpu 0:
 virtqueue_enable_cb_delayed+0x42/0x3c0 drivers/virtio/virtio_ring.c:2653
 start_xmit+0x230/0x1310 drivers/net/virtio_net.c:3264
 __netdev_start_xmit include/linux/netdevice.h:5151 [inline]
 netdev_start_xmit include/linux/netdevice.h:5160 [inline]
 xmit_one net/core/dev.c:3800 [inline]

read to 0xffff8881025bc452 of 1 bytes by interrupt on cpu 1:
 virtqueue_disable_cb_split drivers/virtio/virtio_ring.c:880 [inline]
 virtqueue_disable_cb+0x92/0x180 drivers/virtio/virtio_ring.c:2566
 skb_xmit_done+0x5f/0x140 drivers/net/virtio_net.c:777
 vring_interrupt+0x161/0x190 drivers/virtio/virtio_ring.c:2715
 __handle_irq_event_percpu+0x95/0x490 kernel/irq/handle.c:158
 handle_irq_event_percpu kernel/irq/handle.c:193 [inline]

value changed: 0x01 -> 0x00
==================================================================

When the data race occurs, the function virtqueue_enable_cb_delayed() sets
event_triggered to false, and virtqueue_disable_cb_split/packed() reads it
as false due to the race condition. Since event_triggered is an unreliable
hint used for optimization, this should only cause the driver temporarily
suggest that the device not send an interrupt notification when the event
index is used.

Fix this KCSAN reported data-race issue by explicitly tagging the access as
data_racy.

Reported-by: syzbot+efe683d57990864b8c8e@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/67c7761a.050a0220.15b4b9.0018.GAE@google.com/
Signed-off-by: Zhongqiu Han <quic_zhonhan@quicinc.com>
Message-Id: <20250312130412.3516307-1-quic_zhonhan@quicinc.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2025-04-14 09:51:31 -04:00
Koichiro Den
8d2da07c81 virtio_ring: add a func argument 'recycle_done' to virtqueue_reset()
When virtqueue_reset() has actually recycled all unused buffers,
additional work may be required in some cases. Relying solely on its
return status is fragile, so introduce a new function argument
'recycle_done', which is invoked when it really occurs.

Signed-off-by: Koichiro Den <koichiro.den@canonical.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-12-10 11:22:21 +01:00
Koichiro Den
8d6712c892 virtio_ring: add a func argument 'recycle_done' to virtqueue_resize()
When virtqueue_resize() has actually recycled all unused buffers,
additional work may be required in some cases. Relying solely on its
return status is fragile, so introduce a new function argument
'recycle_done', which is invoked when the recycle really occurs.

Cc: <stable@vger.kernel.org> # v6.11+
Signed-off-by: Koichiro Den <koichiro.den@canonical.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-12-10 11:22:21 +01:00
Linus Torvalds
2a50b1e766 Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio updates from Michael Tsirkin:
 "A small number of improvements all over the place"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  virtio_vdpa: remove redundant check on desc
  virtio_fs: store actual queue index in mq_map
  virtio_fs: add informative log for new tag discovery
  virtio: Make vring_new_virtqueue support packed vring
  virtio_pmem: Add freeze/restore callbacks
  vdpa/mlx5: Fix suboptimal range on iotlb iteration
2024-11-27 13:11:58 -08:00
Xuan Zhuo
880ebcbe06 virtio_ring: remove API virtqueue_set_dma_premapped
Now, this API is useless. remove it.

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://patch.msgid.link/20241112012928.102478-8-xuanzhuo@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15 18:46:55 -08:00
Xuan Zhuo
3ef66af31f virtio_ring: introduce add api for premapped
Two APIs are introduced to submit premapped per-buffers.

int virtqueue_add_inbuf_premapped(struct virtqueue *vq,
                                 struct scatterlist *sg, unsigned int num,
                                 void *data,
                                 void *ctx,
                                 gfp_t gfp);

int virtqueue_add_outbuf_premapped(struct virtqueue *vq,
                                  struct scatterlist *sg, unsigned int num,
                                  void *data,
                                  gfp_t gfp);

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://patch.msgid.link/20241112012928.102478-6-xuanzhuo@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15 18:46:55 -08:00
Xuan Zhuo
c7e1b422af virtio_ring: perform premapped operations based on per-buffer
The current configuration sets the virtqueue (vq) to premapped mode,
implying that all buffers submitted to this queue must be mapped ahead
of time. This presents a challenge for the virtnet send queue (sq): the
virtnet driver would be required to keep track of dma information for vq
size * 17, which can be substantial. However, if the premapped mode were
applied on a per-buffer basis, the complexity would be greatly reduced.
With AF_XDP enabled, AF_XDP buffers would become premapped, while kernel
skb buffers could remain unmapped.

And consider that some sgs are not generated by the virtio driver,
that may be passed from the block stack. So we can not change the
sgs, new APIs are the better way.

So we pass the new argument 'premapped' to indicate the buffers
submitted to virtio are premapped in advance. Additionally,
DMA unmap operations for these buffers will be bypassed.

Suggested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://patch.msgid.link/20241112012928.102478-5-xuanzhuo@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15 18:46:55 -08:00
Xuan Zhuo
aaa789843a virtio_ring: packed: record extras for indirect buffers
The subsequent commit needs to know whether every indirect buffer is
premapped or not. So we need to introduce an extra struct for every
indirect buffer to record this info.

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://patch.msgid.link/20241112012928.102478-4-xuanzhuo@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15 18:46:54 -08:00
Xuan Zhuo
bc2b4c3401 virtio_ring: split: record extras for indirect buffers
The subsequent commit needs to know whether every indirect buffer is
premapped or not. So we need to introduce an extra struct for every
indirect buffer to record this info.

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://patch.msgid.link/20241112012928.102478-3-xuanzhuo@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15 18:46:54 -08:00
Xuan Zhuo
9f19c08405 virtio_ring: introduce vring_need_unmap_buffer
To make the code readable, introduce vring_need_unmap_buffer() to
replace do_unmap.

   use_dma_api premapped -> vring_need_unmap_buffer()
1. false       false        false
2. true        false        true
3. true        true         false

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://patch.msgid.link/20241112012928.102478-2-xuanzhuo@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15 18:46:54 -08:00
Wenyu Huang
a49c26f761 virtio: Make vring_new_virtqueue support packed vring
It is used for testing in tools/virtio/vringh_test.c.
If vring_new_virtqueue supports packed vring, we can add support for
packed vring to vringh and test it.

Signed-off-by: Wenyu Huang <huangwenyu1998@gmail.com>
Message-Id: <20241013033233.65026-1-huangwenyu1998@gmail.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2024-11-12 18:07:24 -05:00
Michael S. Tsirkin
83c334ed52 virtio_ring: tag event_triggered as racy for KCSAN
Setting event_triggered from the interrupt handler
is fundamentally racy. There are races of 2 types:
1. vq processing can read false value while interrupt
   triggered and set it to true.
   result will be a bit of extra work when disabling cbs, no big deal.

1. vq processing can set false value then interrupt
   immediately sets true value
   since interrupt then triggers a callback which will
   process buffers, this is also not an issue.

However, looks like KCSAN can not figure all this out, and warns about
the race between the write and the read.  Tag the access data_racy for
now.  We should probably look at ways to make this more
straight-forwardly correct.

Message-ID: <6bdd771a4fb7625a9227971b3cf4745c34c31a32.1726153334.git.mst@redhat.com>
Reported-by: syzbot+8a02104389c2e0ef5049@syzkaller.appspotmail.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Marco Elver <elver@google.com>
2024-09-25 07:07:44 -04:00
Jeff Johnson
ab0727f3dd virtio: add missing MODULE_DESCRIPTION() macros
With ARCH=sh, make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/virtio/virtio.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/virtio/virtio_ring.o

Add the missing invocations of the MODULE_DESCRIPTION() macro.

Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Message-Id: <20240702-md-sh-drivers-virtio-v1-1-cf7325ab6ccc@quicinc.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-09 08:42:41 -04:00
Xuan Zhuo
840b2d39a2 virtio_ring: fix KMSAN error for premapped mode
Add kmsan for virtqueue_dma_map_single_attrs to fix:

BUG: KMSAN: uninit-value in receive_buf+0x45ca/0x6990
 receive_buf+0x45ca/0x6990
 virtnet_poll+0x17e0/0x3130
 net_rx_action+0x832/0x26e0
 handle_softirqs+0x330/0x10f0
 [...]

Uninit was created at:
 __alloc_pages_noprof+0x62a/0xe60
 alloc_pages_noprof+0x392/0x830
 skb_page_frag_refill+0x21a/0x5c0
 virtnet_rq_alloc+0x50/0x1500
 try_fill_recv+0x372/0x54c0
 virtnet_open+0x210/0xbe0
 __dev_open+0x56e/0x920
 __dev_change_flags+0x39c/0x2000
 dev_change_flags+0xaa/0x200
 do_setlink+0x197a/0x7420
 rtnl_setlink+0x77c/0x860
 [...]

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Tested-by: Alexander Potapenko <glider@google.com>
Message-Id: <20240606111345.93600-1-xuanzhuo@linux.alibaba.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Ilya Leoshkevich <iii@linux.ibm.com>  # s390x
Acked-by: Jason Wang <jasowang@redhat.com>
2024-07-04 11:00:31 -04:00
Xuan Zhuo
f9dac92ba9 virtio_ring: enable premapped mode whatever use_dma_api
Now, we have virtio DMA APIs, the driver can be the premapped
mode whatever the virtio core uses dma api or not.

So remove the limit of checking use_dma_api from
virtqueue_set_dma_premapped().

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20240511031404.30903-2-xuanzhuo@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-13 17:07:41 -07:00
Xuan Zhuo
d5c0ed17fe virtio: packed: fix unmap leak for indirect desc table
When use_dma_api and premapped are true, then the do_unmap is false.

Because the do_unmap is false, vring_unmap_extra_packed is not called by
detach_buf_packed.

  if (unlikely(vq->do_unmap)) {
                curr = id;
                for (i = 0; i < state->num; i++) {
                        vring_unmap_extra_packed(vq,
                                                 &vq->packed.desc_extra[curr]);
                        curr = vq->packed.desc_extra[curr].next;
                }
  }

So the indirect desc table is not unmapped. This causes the unmap leak.

So here, we check vq->use_dma_api instead. Synchronously, dma info is
updated based on use_dma_api judgment

This bug does not occur, because no driver use the premapped with
indirect.

Fixes: b319940f83 ("virtio_ring: skip unmap for premapped")
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Message-Id: <20240223071833.26095-1-xuanzhuo@linux.alibaba.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19 03:19:22 -04:00
Xuan Zhuo
1f475cd572 virtio_ring: fix syncs DMA memory with different direction
Now the APIs virtqueue_dma_sync_single_range_for_{cpu,device} ignore
the parameter 'dir', that is a mistake.

[    6.101666] ------------[ cut here ]------------
[    6.102079] DMA-API: virtio-pci 0000:00:04.0: device driver syncs DMA memory with different direction [device address=0x00000000ae010000] [size=32752 bytes] [mapped with DMA_FROM_DEVICE] [synced with DMA_BIDIRECTIONAL]
[    6.103630] WARNING: CPU: 6 PID: 0 at kernel/dma/debug.c:1125 check_sync+0x53e/0x6c0
[    6.107420] CPU: 6 PID: 0 Comm: swapper/6 Tainted: G            E      6.6.0+ #290
[    6.108030] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[    6.108936] RIP: 0010:check_sync+0x53e/0x6c0
[    6.109289] Code: 24 10 e8 f5 d9 74 00 4c 8b 4c 24 10 4c 8b 44 24 18 48 8b 4c 24 20 48 89 c6 41 56 4c 89 ea 48 c7 c7 b0 f1 50 82 e8 32 fc f3 ff <0f> 0b 48 c7 c7 48 4b 4a 82 e8 74 d9 fc ff 8b 73 4c 48 8d 7b 50 31
[    6.110750] RSP: 0018:ffffc90000180cd8 EFLAGS: 00010092
[    6.111178] RAX: 00000000000000ce RBX: ffff888100aa5900 RCX: 0000000000000000
[    6.111744] RDX: 0000000000000104 RSI: ffffffff824c3208 RDI: 00000000ffffffff
[    6.112316] RBP: ffffc90000180d40 R08: 0000000000000000 R09: 00000000fffeffff
[    6.112893] R10: ffffc90000180b98 R11: ffffffff82f63308 R12: ffffffff83d5af00
[    6.113460] R13: ffff888100998200 R14: ffffffff824a4b5f R15: 0000000000000286
[    6.114027] FS:  0000000000000000(0000) GS:ffff88842fd80000(0000) knlGS:0000000000000000
[    6.114665] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    6.115128] CR2: 00007f10f1e03030 CR3: 0000000108272004 CR4: 0000000000770ee0
[    6.115701] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    6.116272] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[    6.116842] PKRU: 55555554
[    6.117069] Call Trace:
[    6.117275]  <IRQ>
[    6.117452]  ? __warn+0x84/0x140
[    6.117727]  ? check_sync+0x53e/0x6c0
[    6.118034]  ? __report_bug+0xea/0x100
[    6.118353]  ? check_sync+0x53e/0x6c0
[    6.118653]  ? report_bug+0x41/0xc0
[    6.118944]  ? handle_bug+0x3c/0x70
[    6.119237]  ? exc_invalid_op+0x18/0x70
[    6.119551]  ? asm_exc_invalid_op+0x1a/0x20
[    6.119900]  ? check_sync+0x53e/0x6c0
[    6.120199]  ? check_sync+0x53e/0x6c0
[    6.120499]  debug_dma_sync_single_for_cpu+0x5c/0x70
[    6.120906]  ? dma_sync_single_for_cpu+0xb7/0x100
[    6.121291]  virtnet_rq_unmap+0x158/0x170 [virtio_net]
[    6.121716]  virtnet_receive+0x196/0x220 [virtio_net]
[    6.122135]  virtnet_poll+0x48/0x1b0 [virtio_net]
[    6.122524]  __napi_poll+0x29/0x1b0
[    6.123083]  net_rx_action+0x282/0x360
[    6.123612]  __do_softirq+0xf3/0x2fb
[    6.124138]  __irq_exit_rcu+0x8e/0xf0
[    6.124663]  common_interrupt+0xbc/0xe0
[    6.125202]  </IRQ>

We need to enable CONFIG_DMA_API_DEBUG and work with need sync mode(such
as swiotlb) to reproduce this warn.

Fixes: 8bd2f71054 ("virtio_ring: introduce dma sync api for virtqueue")
Reported-by: "Ning, Hongyu" <hongyu.ning@linux.intel.com>
Closes: https://lore.kernel.org/all/f37cb55a-6fc8-4e21-8789-46d468325eea@linux.intel.com/
Suggested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Message-Id: <20231201033303.25141-1-xuanzhuo@linux.alibaba.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Tested-by: Hongyu Ning <hongyu.ning@linux.intel.com>
2023-12-04 08:36:25 -05:00
Yuan Yao
1acfe2c122 virtio_ring: fix avail_wrap_counter in virtqueue_add_packed
In current packed virtqueue implementation, the avail_wrap_counter won't
flip, in the case when the driver supplies a descriptor chain with a
length equals to the queue size; total_sg == vq->packed.vring.num.

Let’s assume the following situation:
vq->packed.vring.num=4
vq->packed.next_avail_idx: 1
vq->packed.avail_wrap_counter: 0

Then the driver adds a descriptor chain containing 4 descriptors.

We expect the following result with avail_wrap_counter flipped:
vq->packed.next_avail_idx: 1
vq->packed.avail_wrap_counter: 1

But, the current implementation gives the following result:
vq->packed.next_avail_idx: 1
vq->packed.avail_wrap_counter: 0

To reproduce the bug, you can set a packed queue size as small as
possible, so that the driver is more likely to provide a descriptor
chain with a length equal to the packed queue size. For example, in
qemu run following commands:
sudo qemu-system-x86_64 \
-enable-kvm \
-nographic \
-kernel "path/to/kernel_image" \
-m 1G \
-drive file="path/to/rootfs",if=none,id=disk \
-device virtio-blk,drive=disk \
-drive file="path/to/disk_image",if=none,id=rwdisk \
-device virtio-blk,drive=rwdisk,packed=on,queue-size=4,\
indirect_desc=off \
-append "console=ttyS0 root=/dev/vda rw init=/bin/bash"

Inside the VM, create a directory and mount the rwdisk device on it. The
rwdisk will hang and mount operation will not complete.

This commit fixes the wrap counter error by flipping the
packed.avail_wrap_counter, when start of descriptor chain equals to the
end of descriptor chain (head == i).

Fixes: 1ce9e6055f ("virtio_ring: introduce packed ring support")
Signed-off-by: Yuan Yao <yuanyaogoog@chromium.org>
Message-Id: <20230808051110.3492693-1-yuanyaogoog@chromium.org>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-09-03 18:10:24 -04:00
Xuan Zhuo
8bd2f71054 virtio_ring: introduce dma sync api for virtqueue
These API has been introduced:

* virtqueue_dma_need_sync
* virtqueue_dma_sync_single_range_for_cpu
* virtqueue_dma_sync_single_range_for_device

These APIs can be used together with the premapped mechanism to sync the
DMA address.

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Message-Id: <20230810123057.43407-12-xuanzhuo@linux.alibaba.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-09-03 18:10:23 -04:00
Xuan Zhuo
b6253b4e21 virtio_ring: introduce dma map api for virtqueue
Added virtqueue_dma_map_api* to map DMA addresses for virtual memory in
advance. The purpose is to keep memory mapped across multiple add/get
buf operations.

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Message-Id: <20230810123057.43407-11-xuanzhuo@linux.alibaba.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-09-03 18:10:23 -04:00
Xuan Zhuo
ba3e0c47c0 virtio_ring: introduce virtqueue_reset()
Introduce virtqueue_reset() to release all buffer inside vq.

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20230810123057.43407-10-xuanzhuo@linux.alibaba.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-09-03 18:10:23 -04:00
Xuan Zhuo
ad48d53b5b virtio_ring: separate the logic of reset/enable from virtqueue_resize
The subsequent reset function will reuse these logic.

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20230810123057.43407-9-xuanzhuo@linux.alibaba.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-09-03 18:10:23 -04:00
Xuan Zhuo
4d09f24080 virtio_ring: correct the expression of the description of virtqueue_resize()
Modify the "useless" to a more accurate "unused".

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20230810123057.43407-8-xuanzhuo@linux.alibaba.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-09-03 18:10:23 -04:00
Xuan Zhuo
b319940f83 virtio_ring: skip unmap for premapped
Now we add a case where we skip dma unmap, the vq->premapped is true.

We can't just rely on use_dma_api to determine whether to skip the dma
operation. For convenience, I introduced the "do_unmap". By default, it
is the same as use_dma_api. If the driver is configured with premapped,
then do_unmap is false.

So as long as do_unmap is false, for addr of desc, we should skip dma
unmap operation.

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Message-Id: <20230810123057.43407-7-xuanzhuo@linux.alibaba.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-09-03 18:10:23 -04:00
Xuan Zhuo
2df6475907 virtio_ring: introduce virtqueue_dma_dev()
Added virtqueue_dma_dev() to get DMA device for virtio. Then the
caller can do dma operation in advance. The purpose is to keep memory
mapped across multiple add/get buf operations.

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20230810123057.43407-6-xuanzhuo@linux.alibaba.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-09-03 18:10:23 -04:00
Xuan Zhuo
d7344a2f71 virtio_ring: support add premapped buf
If the vq is the premapped mode, use the sg_dma_address() directly.

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Message-Id: <20230810123057.43407-5-xuanzhuo@linux.alibaba.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-09-03 18:10:23 -04:00
Xuan Zhuo
8daafe9ebb virtio_ring: introduce virtqueue_set_dma_premapped()
This helper allows the driver change the dma mode to premapped mode.
Under the premapped mode, the virtio core do not do dma mapping
internally.

This just work when the use_dma_api is true. If the use_dma_api is false,
the dma options is not through the DMA APIs, that is not the standard
way of the linux kernel.

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Message-Id: <20230810123057.43407-4-xuanzhuo@linux.alibaba.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-09-03 18:10:22 -04:00