Commit 735a309b authored by Jakub Kicinski

net: add net_iov_init() and use it to initialize ->page_type



Commit db359fcc ("mm: introduce a new page type for page pool in
page type") added a page_type field to struct net_iov at the same
offset as struct page::page_type, so that page_pool_set_pp_info() can
call __SetPageNetpp() uniformly on both pages and net_iovs.
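
The shared-offset arrangement can be illustrated with a small userspace sketch. The struct names, field sets, and the helper below are hypothetical, heavily reduced stand-ins rather than the kernel definitions; the sketch only shows why one setter can serve both object kinds once the field sits at the same offset in each.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; the real struct has many more fields. */
struct mock_page {
	unsigned long flags;
	unsigned int page_type;
};

/* Hypothetical stand-in for struct net_iov, laid out so that page_type
 * lands at the same offset as in mock_page.
 */
struct mock_net_iov {
	unsigned long _flags;		/* placeholder mirroring mock_page.flags */
	unsigned int page_type;
	void *owner;
};

/* Compile-time check that the two page_type fields overlay each other. */
static_assert(offsetof(struct mock_page, page_type) ==
	      offsetof(struct mock_net_iov, page_type),
	      "page_type offsets must match");

/* Store a type value at the shared offset without knowing which of the
 * two mock structs obj points to.
 */
static void mock_set_type_at_shared_offset(void *obj, unsigned int type)
{
	unsigned int *slot = (unsigned int *)
		((char *)obj + offsetof(struct mock_page, page_type));

	*slot = type;
}
```

Calling mock_set_type_at_shared_offset() on either a mock_page or a mock_net_iov updates that object's page_type field; the value passed in is arbitrary here and does not correspond to any real page type encoding.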

The page-type API requires the field to hold the UINT_MAX "no type"
sentinel before a type can be set; for real struct page that invariant
is established by the page allocator on free. struct net_iov is not
allocated through the page allocator, so the field is left as zero
(io_uring zcrx, which uses __GFP_ZERO) or as slab garbage (devmem,
which uses kvmalloc_objs() without zeroing). When the page pool then
calls page_pool_set_pp_info() on a freshly-bound niov,
__SetPageNetpp()'s VM_BUG_ON_PAGE(page->page_type != UINT_MAX) fires
and the kernel BUGs. Triggered in selftests by io_uring zcrx setup
through the fbnic queue restart path:

 kernel BUG at ./include/linux/page-flags.h:1062!
 RIP: 0010:page_pool_set_pp_info (./include/linux/page-flags.h:1062
                                  net/core/page_pool.c:716)
 Call Trace:
  <TASK>
  net_mp_niov_set_page_pool (net/core/page_pool.c:1360)
  io_pp_zc_alloc_netmems (io_uring/zcrx.c:1089 io_uring/zcrx.c:1110)
  fbnic_fill_bdq (./include/net/page_pool/helpers.h:160
                  drivers/net/ethernet/meta/fbnic/fbnic_txrx.c:906)
  __fbnic_nv_restart (drivers/net/ethernet/meta/fbnic/fbnic_txrx.c:2470
                      drivers/net/ethernet/meta/fbnic/fbnic_txrx.c:2874)
  fbnic_queue_start (drivers/net/ethernet/meta/fbnic/fbnic_txrx.c:2903)
  netdev_rx_queue_reconfig (net/core/netdev_rx_queue.c:137)
  __netif_mp_open_rxq (net/core/netdev_rx_queue.c:234)
  io_register_zcrx (io_uring/zcrx.c:818 io_uring/zcrx.c:903)
  __io_uring_register (io_uring/register.c:931)
  __do_sys_io_uring_register (io_uring/register.c:1029)
  do_syscall_64 (arch/x86/entry/syscall_64.c:63
                 arch/x86/entry/syscall_64.c:94)
  </TASK>

The same path is reachable through devmem dmabuf binding via
netdev_nl_bind_rx_doit() -> net_devmem_bind_dmabuf_to_queue().

Add a net_iov_init() helper that stamps ->owner, ->type and the
->page_type sentinel, and use it from both the devmem and io_uring
zcrx niov init loops.
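
The invariant and the fix can be modeled in userspace roughly as follows. This is a sketch under stated assumptions: mock_niov, MOCK_NO_TYPE, mock_niov_init() and mock_try_set_type() are hypothetical names, and the setter returns false where the kernel's VM_BUG_ON_PAGE() check would fire instead of crashing.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical name for the UINT_MAX "no type" sentinel. */
#define MOCK_NO_TYPE UINT_MAX

/* Hypothetical, reduced stand-in for struct net_iov. */
struct mock_niov {
	void *owner;
	int type;
	unsigned int page_type;
};

/* Modeled on the helper added by this patch: stamp the owner, the
 * provider type, and the "no type" sentinel.
 */
static void mock_niov_init(struct mock_niov *niov, void *owner, int type)
{
	niov->owner = owner;
	niov->type = type;
	niov->page_type = MOCK_NO_TYPE;
}

/* Set a type only if the field still holds the sentinel. Returns false
 * in the situation where the kernel check would trigger a BUG.
 */
static bool mock_try_set_type(struct mock_niov *niov, unsigned int new_type)
{
	if (niov->page_type != MOCK_NO_TYPE)
		return false;
	niov->page_type = new_type;
	return true;
}

/* Walk through the failing and the fixed sequence. */
static int mock_sentinel_demo(void)
{
	struct mock_niov niov = {0};	/* zeroed, like a __GFP_ZERO allocation */

	assert(!mock_try_set_type(&niov, 7u));	/* zero is not the sentinel */
	mock_niov_init(&niov, NULL, 1);
	assert(mock_try_set_type(&niov, 7u));	/* succeeds after init */
	return 0;
}
```

A zero-filled object is rejected by the setter, while the same object passes once the init helper has run, which mirrors the ordering the patch establishes in both niov init loops.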

Fixes: db359fcc ("mm: introduce a new page type for page pool in page type")
Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Acked-by: Byungchul Park <byungchul@sk.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Acked-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://patch.msgid.link/20260428025320.853452-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
parent 0c7a5ba0
+15 −0
@@ -127,6 +127,21 @@ static inline unsigned int net_iov_idx(const struct net_iov *niov)
	return niov - net_iov_owner(niov)->niovs;
}

+/* Initialize a niov: stamp the owning area, the memory provider type,
+ * and the page_type "no type" sentinel expected by the page-type API
+ * (see PAGE_TYPE_OPS in <linux/page-flags.h>) so that
+ * page_pool_set_pp_info() can later call __SetPageNetpp() on a niov
+ * cast to struct page.
+ */
+static inline void net_iov_init(struct net_iov *niov,
+				struct net_iov_area *owner,
+				enum net_iov_type type)
+{
+	niov->owner = owner;
+	niov->type = type;
+	niov->page_type = UINT_MAX;
+}
+
/* netmem */

/**
+1 −2
@@ -495,10 +495,9 @@ static int io_zcrx_create_area(struct io_zcrx_ifq *ifq,
	for (i = 0; i < nr_iovs; i++) {
		struct net_iov *niov = &area->nia.niovs[i];

-		niov->owner = &area->nia;
+		net_iov_init(niov, &area->nia, NET_IOV_IOURING);
		area->freelist[i] = i;
		atomic_set(&area->user_refs[i], 0);
-		niov->type = NET_IOV_IOURING;
	}

	if (ifq->dev) {
+1 −2
@@ -297,8 +297,7 @@ net_devmem_bind_dmabuf(struct net_device *dev,

		for (i = 0; i < owner->area.num_niovs; i++) {
			niov = &owner->area.niovs[i];
-			niov->type = NET_IOV_DMABUF;
-			niov->owner = &owner->area;
+			net_iov_init(niov, &owner->area, NET_IOV_DMABUF);
			page_pool_set_dma_addr_netmem(net_iov_to_netmem(niov),
						      net_devmem_get_dma_addr(niov));
			if (direction == DMA_TO_DEVICE)