Commit d206fbad authored by Peter Zijlstra

sched/fair: Revert max_newidle_lb_cost bump



Many people reported regressions on their database workloads due to:

  155213a2 ("sched/fair: Bump sd->max_newidle_lb_cost when newidle balance fails")

For instance Adam Li reported a 6% regression on SpecJBB.

Conversely, this will regress schbench again; on my machine it drops from
2.22 Mrps to 2.04 Mrps.

Reported-by: Joseph Salisbury <joseph.salisbury@oracle.com>
Reported-by: Adam Li <adamli@os.amperecomputing.com>
Reported-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Reported-by: Hazem Mohamed Abuelfotoh <abuehaze@amazon.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Tested-by: Chris Mason <clm@meta.com>
Link: https://lkml.kernel.org/r/20250626144017.1510594-2-clm@fb.com
Link: https://lkml.kernel.org/r/006c9df2-b691-47f1-82e6-e233c3f91faf@oracle.com
Link: https://patch.msgid.link/20251107161739.406147760@infradead.org
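
For readers who want to see why the reverted bump made newidle balancing back off so quickly, here is a minimal user-space sketch. It is not kernel code: struct model_sd, MODEL_MIGRATION_COST_NS, and the model_* functions are hypothetical names, the 500000 ns migration cost is an assumed value, and the "cost > max" guard is inferred from the else branch visible in the diff. Only the 3/2 multiplier and the "+ 200" clamp are taken from the removed lines.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value in ns; the real sysctl_sched_migration_cost is tunable. */
#define MODEL_MIGRATION_COST_NS 500000ULL

/* Hypothetical stand-in for the one sched_domain field modelled here. */
struct model_sd {
	uint64_t max_newidle_lb_cost;
};

/* Pre-revert update: raise the tracked maximum, clamped to the limit. */
static void model_update_clamped(struct model_sd *sd, uint64_t cost)
{
	uint64_t limit = MODEL_MIGRATION_COST_NS + 200;

	if (cost > sd->max_newidle_lb_cost)
		sd->max_newidle_lb_cost = cost < limit ? cost : limit;
}

/* Post-revert update: raise the tracked maximum to the measured cost. */
static void model_update_plain(struct model_sd *sd, uint64_t cost)
{
	if (cost > sd->max_newidle_lb_cost)
		sd->max_newidle_lb_cost = cost;
}

/*
 * Run `passes` failed newidle balance passes, each measuring `measured_ns`,
 * starting from a tracked maximum of `start_ns`, and return the result.
 * `bump` selects the pre-revert (nonzero) or post-revert (zero) behaviour.
 */
static uint64_t model_failed_passes(uint64_t start_ns, uint64_t measured_ns,
				    int passes, int bump)
{
	struct model_sd sd = { .max_newidle_lb_cost = start_ns };

	assert(passes >= 0);
	for (int i = 0; i < passes; i++) {
		if (bump) {
			/* Failure replaces the measured cost with 3/2 of the max. */
			model_update_clamped(&sd, (3 * sd.max_newidle_lb_cost) / 2);
		} else {
			model_update_plain(&sd, measured_ns);
		}
	}
	return sd.max_newidle_lb_cost;
}
```

Under these assumed numbers, calling model_failed_passes(10000, 10000, 10, 1) takes the tracked maximum from 10000 ns to the clamp in ten failed passes, while the same call with bump set to 0 leaves it at the measured 10000 ns. This is only meant to illustrate the geometric growth that the removed comment describes as needing an upper limit.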
parent e837456f
+3 −16
@@ -12230,14 +12230,8 @@ static inline bool update_newidle_cost(struct sched_domain *sd, u64 cost)
 		/*
 		 * Track max cost of a domain to make sure to not delay the
 		 * next wakeup on the CPU.
-		 *
-		 * sched_balance_newidle() bumps the cost whenever newidle
-		 * balance fails, and we don't want things to grow out of
-		 * control.  Use the sysctl_sched_migration_cost as the upper
-		 * limit, plus a litle extra to avoid off by ones.
 		 */
-		sd->max_newidle_lb_cost =
-			min(cost, sysctl_sched_migration_cost + 200);
+		sd->max_newidle_lb_cost = cost;
 		sd->last_decay_max_lb_cost = jiffies;
 	} else if (time_after(jiffies, sd->last_decay_max_lb_cost + HZ)) {
 		/*
@@ -12920,17 +12914,10 @@ static int sched_balance_newidle(struct rq *this_rq, struct rq_flags *rf)

 			t1 = sched_clock_cpu(this_cpu);
 			domain_cost = t1 - t0;
+			update_newidle_cost(sd, domain_cost);
+
 			curr_cost += domain_cost;
 			t0 = t1;
-
-			/*
-			 * Failing newidle means it is not effective;
-			 * bump the cost so we end up doing less of it.
-			 */
-			if (!pulled_task)
-				domain_cost = (3 * sd->max_newidle_lb_cost) / 2;
-
-			update_newidle_cost(sd, domain_cost);
 		}

 		/*