+124
−24
Loading
sched/fair: Revert 6d71a9c6 ("sched/fair: Fix EEVDF entity placement bug causing scheduling lag") Zicheng Qu reported that, because avg_vruntime() always includes cfs_rq->curr, when ->on_rq, place_entity() doesn't work right. Specifically, the lag scaling in place_entity() relies on avg_vruntime() being the state *before* placement of the new entity. However in this case avg_vruntime() will actually already include the entity, which breaks things. Also, Zicheng Qu argues that avg_vruntime should be invariant under reweight. IOW commit 6d71a9c6 ("sched/fair: Fix EEVDF entity placement bug causing scheduling lag") was wrong! The issue reported in 6d71a9c6 could possibly be explained by rounding artifacts -- notably the extreme weight '2' is outside of the range of avg_vruntime/sum_w_vruntime, since that uses scale_load_down(). By scaling vruntime by the real weight, but accounting it in vruntime with a factor 1024 more, the average moves significantly. However, that is now cured. Tested by reverting 66951e48 ("sched/fair: Fix update_cfs_group() vs DELAY_DEQUEUE") and tracing vruntime and vlag figures again. Reported-by:Zicheng Qu <quzicheng@huawei.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by:
Vincent Guittot <vincent.guittot@linaro.org> Tested-by:
K Prateek Nayak <kprateek.nayak@amd.com> Tested-by:
Shubhang Kaushik <shubhang@os.amperecomputing.com> Link: https://patch.msgid.link/20260219080625.066102672%40infradead.org