Commit ec14325c authored by Alexei Starovoitov's avatar Alexei Starovoitov
Browse files

Merge branch 'xdp-metadata-via-kfuncs-for-ice-vlan-hint'

Larysa Zaremba says:

====================
XDP metadata via kfuncs for ice + VLAN hint

This series introduces XDP hints via kfuncs [0] to the ice driver.

Series brings the following existing hints to the ice driver:
 - HW timestamp
 - RX hash with type

Series also introduces VLAN tag with protocol XDP hint, it now be accessed by
XDP and userspace (AF_XDP) programs. They can also be checked with xdp_metadata
test and xdp_hw_metadata program.

Impact of these patches on ice performance:
ZC:
* Full hints implementation decreases pps in ZC mode by less than 3%
  (64B, rxdrop)

skb (packets with invalid IP, dropped by stack):
* Overall, patchset improves peak performance in skb mode by about 0.5%

[0] https://patchwork.kernel.org/project/netdevbpf/cover/20230119221536.3349901-1-sdf@google.com/

v7:
https://lore.kernel.org/bpf/20231115175301.534113-1-larysa.zaremba@intel.com/
v6:
https://lore.kernel.org/bpf/20231012170524.21085-1-larysa.zaremba@intel.com/
Intermediate RFC v2:
https://lore.kernel.org/bpf/20230927075124.23941-1-larysa.zaremba@intel.com/
Intermediate RFC v1:
https://lore.kernel.org/bpf/20230824192703.712881-1-larysa.zaremba@intel.com/
v5:
https://lore.kernel.org/bpf/20230811161509.19722-1-larysa.zaremba@intel.com/
v4:
https://lore.kernel.org/bpf/20230728173923.1318596-1-larysa.zaremba@intel.com/
v3:
https://lore.kernel.org/bpf/20230719183734.21681-1-larysa.zaremba@intel.com/
v2:
https://lore.kernel.org/bpf/20230703181226.19380-1-larysa.zaremba@intel.com/
v1:
https://lore.kernel.org/all/20230512152607.992209-1-larysa.zaremba@intel.com/

Changes since v7:
* shorten timestamp assignment in ice
* change first argument of ice_fill_rx_descs back to xsk_buff_pool
* fix kernel-doc for ice_run_xdp_zc
* add missing XSK_CHECK_PRIV_TYPE() in ice
* resolved selftests merge conflicts with TX hints
* AF_INET patch adds new packet generation, not replaces AF_XDP one
* fix destination port in xdp_metadata

Changes since v6:
* add ability to fill cb of all xdp_buffs in xsk_buff_pool
* place just pointer to packet context in ice_xdp_buff
* add const qualifiers in veth implementation
* generate uapi for VLAN hint

Changes since v5:
* drop checksum hint from the patchset entirely
* Alex's patch that lifts the data_meta size limitation is no longer
  required in this patchset, so will be sent separately
* new patch: hide some ice hints code behind a static key
* fix several bugs in ZC mode (ice)
* change argument order in VLAN hint kfunc (tci, proto -> proto, tci)
* cosmetic changes
* analyze performance impact

Changes since v4:
* Drop the concept of partial checksum from the hint design
* Drop the concept of checksum level from the hint design

Changes since v3:
* use XDP_CHECKSUM_VALID_LVL0 + csum_level instead of csum_level + 1
* fix spelling mistakes
* read XDP timestamp unconditionally
* add TO_STR() macro

Changes since v2:
* redesign checksum hint, so now it gives full status
* rename vlan_tag -> vlan_tci, where applicable
* use open_netns() and close_netns() in xdp_metadata
* improve VLAN hint documentation
* replace CFI with DEI
* use VLAN_VID_MASK in xdp_metadata
* make vlan_get_tag() return -ENODATA
* remove unused rx_ptype in ice_xsk.c
* fix ice timestamp code division between patches

Changes since v1:
* directly return RX hash, RX timestamp and RX checksum status
  in skb-common functions
* use intermediate enum value for checksum status in ice
* get rid of ring structure dependency in ice kfunc implementation
* make variables const, when possible, in ice implementation
* use -ENODATA instead of -EOPNOTSUPP for driver implementation
* instead of having 2 separate functions for c-tag and s-tag,
  use 1 function that outputs both VLAN tag and protocol ID
* improve documentation for introduced hints
* update xdp_metadata selftest to test new hints
* implement new hints in veth, so they can be tested in xdp_metadata
* parse VLAN tag in xdp_hw_metadata
====================

Link: https://lore.kernel.org/r/20231205210847.28460-1-larysa.zaremba@intel.com


Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
parents 73376328 4c6612f6
Loading
Loading
Loading
Loading
+4 −0
Original line number Diff line number Diff line
@@ -54,6 +54,10 @@ definitions:
        name: hash
        doc:
          Device is capable of exposing receive packet hash via bpf_xdp_metadata_rx_hash().
      -
        name: vlan-tag
        doc:
          Device is capable of exposing receive packet VLAN tag via bpf_xdp_metadata_rx_vlan_tag().
  -
    type: flags
    name: xsk-flags
+7 −1
Original line number Diff line number Diff line
@@ -20,7 +20,13 @@ Currently, the following kfuncs are supported. In the future, as more
metadata is supported, this set will grow:

.. kernel-doc:: net/core/xdp.c
   :identifiers: bpf_xdp_metadata_rx_timestamp bpf_xdp_metadata_rx_hash
   :identifiers: bpf_xdp_metadata_rx_timestamp

.. kernel-doc:: net/core/xdp.c
   :identifiers: bpf_xdp_metadata_rx_hash

.. kernel-doc:: net/core/xdp.c
   :identifiers: bpf_xdp_metadata_rx_vlan_tag

An XDP program can use these kfuncs to read the metadata into stack
variables for its own consumption. Or, to pass the metadata on to other
+2 −0
Original line number Diff line number Diff line
@@ -996,4 +996,6 @@ static inline void ice_clear_rdma_cap(struct ice_pf *pf)
	set_bit(ICE_FLAG_UNPLUG_AUX_DEV, pf->flags);
	clear_bit(ICE_FLAG_RDMA_ENA, pf->flags);
}

extern const struct xdp_metadata_ops ice_xdp_md_ops;
#endif /* _ICE_H_ */
+15 −0
Original line number Diff line number Diff line
@@ -519,6 +519,19 @@ static int ice_setup_rx_ctx(struct ice_rx_ring *ring)
	return 0;
}

static void ice_xsk_pool_fill_cb(struct ice_rx_ring *ring)
{
	void *ctx_ptr = &ring->pkt_ctx;
	struct xsk_cb_desc desc = {};

	XSK_CHECK_PRIV_TYPE(struct ice_xdp_buff);
	desc.src = &ctx_ptr;
	desc.off = offsetof(struct ice_xdp_buff, pkt_ctx) -
		   sizeof(struct xdp_buff);
	desc.bytes = sizeof(ctx_ptr);
	xsk_pool_fill_cb(ring->xsk_pool, &desc);
}

/**
 * ice_vsi_cfg_rxq - Configure an Rx queue
 * @ring: the ring being configured
@@ -553,6 +566,7 @@ int ice_vsi_cfg_rxq(struct ice_rx_ring *ring)
			if (err)
				return err;
			xsk_pool_set_rxq_info(ring->xsk_pool, &ring->xdp_rxq);
			ice_xsk_pool_fill_cb(ring);

			dev_info(dev, "Registered XDP mem model MEM_TYPE_XSK_BUFF_POOL on Rx ring %d\n",
				 ring->q_index);
@@ -575,6 +589,7 @@ int ice_vsi_cfg_rxq(struct ice_rx_ring *ring)

	xdp_init_buff(&ring->xdp, ice_rx_pg_size(ring) / 2, &ring->xdp_rxq);
	ring->xdp.data = NULL;
	ring->xdp_ext.pkt_ctx = &ring->pkt_ctx;
	err = ice_setup_rx_ctx(ring);
	if (err) {
		dev_err(dev, "ice_setup_rx_ctx failed for RxQ %d, err %d\n",
+208 −204

File changed.

Preview size limit exceeded, changes collapsed.

Loading