Commit c5aaf022 authored by Jakub Kicinski

Merge branch 'net-replace-wq-users-and-add-wq_percpu-to-alloc_workqueue-users'

Marco Crivellari says:

====================
net: replace wq users and add WQ_PERCPU to alloc_workqueue() users

Below is a summary of a discussion about the Workqueue API and cpu isolation
considerations. Details and more information are available here:

  "workqueue: Always use wq_select_unbound_cpu() for WORK_CPU_UNBOUND."

Link: https://lore.kernel.org/20250221112003.1dSuoGyc@linutronix.de

=== Current situation: problems ===

Let's consider a nohz_full system with isolated CPUs: wq_unbound_cpumask is
set to the housekeeping CPUs, while for !WQ_UNBOUND the local CPU is selected.

This leads to different behavior when a work item is scheduled on an isolated
CPU, depending on whether the "delay" value is 0 or greater than 0:

        schedule_delayed_work(, 0);

This is handled by __queue_work(), which queues the work item on the
current local (isolated) CPU, while:

        schedule_delayed_work(, 1);

will move the timer to a housekeeping CPU and schedule the work there.
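
The two cases above can be sketched as a toy userspace model. This is not
kernel code: struct toy_system and toy_target_cpu() are hypothetical names
used only for illustration, and the behavior for a non-isolated CPU with a
non-zero delay is an assumption, since the text above only covers the
isolated case.

```c
#include <stdbool.h>

/*
 * Toy userspace model, NOT kernel code. It only encodes the two cases
 * described above for a per-CPU work item scheduled from an isolated CPU.
 * All names are hypothetical.
 */
struct toy_system {
	int local_cpu;        /* CPU that calls schedule_delayed_work() */
	bool local_isolated;  /* true if local_cpu is an isolated CPU */
	int housekeeping_cpu; /* some CPU in the housekeeping set */
};

/* Return the CPU on which the work item ends up running in this model. */
static int toy_target_cpu(const struct toy_system *sys, unsigned long delay)
{
	/* delay == 0: queued directly, so it stays on the local CPU. */
	if (delay == 0)
		return sys->local_cpu;

	/* delay > 0 on an isolated CPU: the timer moves to housekeeping. */
	if (sys->local_isolated)
		return sys->housekeeping_cpu;

	/* Assumption: a non-isolated CPU keeps the work local. */
	return sys->local_cpu;
}
```

With local_cpu = 3 marked as isolated and housekeeping_cpu = 0, the model
returns CPU 3 for a delay of 0 and CPU 0 for a delay of 1, which is the
inconsistency the series wants to remove.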

Currently, if a user enqueues a work item using schedule_delayed_work(), the
wq used is "system_wq" (a per-cpu wq), while queue_delayed_work() uses
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work(), which uses system_wq, and queue_work(), which again makes
use of WORK_CPU_UNBOUND.

This lack of consistency cannot be addressed without refactoring the API.

=== Plan and future plans ===

This patchset is the first step in a refactoring needed to address the
points above; in the long term it will also have a positive impact on cpu
isolation, moving away from per-cpu workqueues in favor of an unbound
model.

These are the main steps:
1)  API refactoring (introduced by this patchset)
    -   Make the system wq names clearer and more uniform, both per-cpu and
        unbound, to avoid any possible confusion about which one should be
        used.

    -   Introduction of WQ_PERCPU: this flag is the complement of WQ_UNBOUND,
        introduced in this patchset and used on all the callers that are not
        currently using WQ_UNBOUND.

        WQ_UNBOUND will be removed in a future release cycle.

        Most users don't need to be per-cpu because they don't have
        locality requirements; because of that, a future step will be to
        make "unbound" the default behavior.

2)  Check who really needs to be per-cpu
    -   Remove the WQ_PERCPU flag when it is not strictly required.

3)  Add a new API (prefer local cpu)
    -   There are users that don't require local execution, as mentioned
        above; despite that, local execution yields a performance gain.

        This new API will prefer local execution, without requiring it.

=== Changes introduced by this series ===

1) [P 1-2] Replace use of system_wq and system_unbound_wq

        system_wq is a per-CPU workqueue, but its name is not clear.
        system_unbound_wq is to be used when locality is not required.

        Because of that, system_wq has been renamed to system_percpu_wq, and
        system_unbound_wq has been renamed to system_dfl_wq.

2) [P 3] Add WQ_PERCPU to remaining alloc_workqueue() users

        Every alloc_workqueue() caller should use either WQ_PERCPU or
        WQ_UNBOUND.

        WQ_UNBOUND will be removed in a future release cycle.
====================
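
The rule in item 2 above can be sketched as a small userspace check. This
is an illustration only: the TOY_WQ_* constants use made-up bit values
rather than the kernel's real definitions, toy_wq_flags_valid() is a
hypothetical helper, and reading "either WQ_PERCPU or WQ_UNBOUND" as
"exactly one of the two" is an assumption.

```c
#include <stdbool.h>

/*
 * Illustrative flag bits only. The real WQ_* values are defined by the
 * kernel in <linux/workqueue.h> and are not reproduced here.
 */
enum toy_wq_flags {
	TOY_WQ_UNBOUND     = 1u << 0,
	TOY_WQ_FREEZABLE   = 1u << 1,
	TOY_WQ_MEM_RECLAIM = 1u << 2,
	TOY_WQ_PERCPU      = 1u << 3,
};

/*
 * Return true when the caller picked exactly one of the per-CPU and
 * unbound flags (assumed reading of the rule stated above).
 */
static bool toy_wq_flags_valid(unsigned int flags)
{
	bool percpu = (flags & TOY_WQ_PERCPU) != 0;
	bool unbound = (flags & TOY_WQ_UNBOUND) != 0;

	return percpu != unbound;
}
```

In this model, a flag set like the one the CAN drivers used before the
change (freezable plus memory reclaim, with neither per-CPU nor unbound)
fails the check, and adding the per-CPU flag makes it pass.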

Link: https://patch.msgid.link/20250918142427.309519-1-marco.crivellari@suse.com


Signed-off-by: Jakub Kicinski <kuba@kernel.org>
parents dfff1808 27ce71e1
+2 −1
@@ -770,7 +770,8 @@ static int hi3110_open(struct net_device *net)
 		goto out_close;
 	}

-	priv->wq = alloc_workqueue("hi3110_wq", WQ_FREEZABLE | WQ_MEM_RECLAIM,
+	priv->wq = alloc_workqueue("hi3110_wq",
+				   WQ_FREEZABLE | WQ_MEM_RECLAIM | WQ_PERCPU,
 				   0);
 	if (!priv->wq) {
 		ret = -ENOMEM;
+2 −1
@@ -1378,7 +1378,8 @@ static int mcp251x_can_probe(struct spi_device *spi)
 	if (ret)
 		goto out_clk;

-	priv->wq = alloc_workqueue("mcp251x_wq", WQ_FREEZABLE | WQ_MEM_RECLAIM,
+	priv->wq = alloc_workqueue("mcp251x_wq",
+				   WQ_FREEZABLE | WQ_MEM_RECLAIM | WQ_PERCPU,
 				   0);
 	if (!priv->wq) {
 		ret = -ENOMEM;
+1 −1
@@ -472,7 +472,7 @@ int setup_rx_oom_poll_fn(struct net_device *netdev)
 		q_no = lio->linfo.rxpciq[q].s.q_no;
 		wq = &lio->rxq_status_wq[q_no];
 		wq->wq = alloc_workqueue("rxq-oom-status",
-					 WQ_MEM_RECLAIM, 0);
+					 WQ_MEM_RECLAIM | WQ_PERCPU, 0);
 		if (!wq->wq) {
 			dev_err(&oct->pci_dev->dev, "unable to create cavium rxq oom status wq\n");
 			return -ENOMEM;
+5 −3
@@ -526,7 +526,8 @@ static inline int setup_link_status_change_wq(struct net_device *netdev)
 	struct octeon_device *oct = lio->oct_dev;

 	lio->link_status_wq.wq = alloc_workqueue("link-status",
-						 WQ_MEM_RECLAIM, 0);
+						 WQ_MEM_RECLAIM | WQ_PERCPU,
+						 0);
 	if (!lio->link_status_wq.wq) {
 		dev_err(&oct->pci_dev->dev, "unable to create cavium link status wq\n");
 		return -1;
@@ -659,7 +660,8 @@ static inline int setup_sync_octeon_time_wq(struct net_device *netdev)
 	struct octeon_device *oct = lio->oct_dev;

 	lio->sync_octeon_time_wq.wq =
-		alloc_workqueue("update-octeon-time", WQ_MEM_RECLAIM, 0);
+		alloc_workqueue("update-octeon-time",
+				WQ_MEM_RECLAIM | WQ_PERCPU, 0);
 	if (!lio->sync_octeon_time_wq.wq) {
 		dev_err(&oct->pci_dev->dev, "Unable to create wq to update octeon time\n");
 		return -1;
@@ -1734,7 +1736,7 @@ static inline int setup_tx_poll_fn(struct net_device *netdev)
 	struct octeon_device *oct = lio->oct_dev;

 	lio->txq_status_wq.wq = alloc_workqueue("txq-status",
-						WQ_MEM_RECLAIM, 0);
+						WQ_MEM_RECLAIM | WQ_PERCPU, 0);
 	if (!lio->txq_status_wq.wq) {
 		dev_err(&oct->pci_dev->dev, "unable to create cavium txq status wq\n");
 		return -1;
+2 −1
@@ -304,7 +304,8 @@ static int setup_link_status_change_wq(struct net_device *netdev)
 	struct octeon_device *oct = lio->oct_dev;

 	lio->link_status_wq.wq = alloc_workqueue("link-status",
-						 WQ_MEM_RECLAIM, 0);
+						 WQ_MEM_RECLAIM | WQ_PERCPU,
+						 0);
 	if (!lio->link_status_wq.wq) {
 		dev_err(&oct->pci_dev->dev, "unable to create cavium link status wq\n");
 		return -1;