Commit 260fbcb9 authored by Tejun Heo

cgroup: Move dying_tasks cleanup from cgroup_task_release() to cgroup_task_free()



Currently, cgroup_task_exit() adds thread group leaders with live member
threads to their css_set's dying_tasks list (so cgroup.procs iteration can
still see the leader), and cgroup_task_release() later removes them with
list_del_init(&task->cg_list).
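
The guarded removal relies on list_del_init() leaving the node pointing at
itself, so a later list_empty() check on that node reports "not linked". A
minimal userspace sketch of that property follows. It assumes a simplified
stand-in for struct list_head; all helper names here (init_list_head,
list_is_empty, list_add_node, list_del_init_node, remove_if_linked) are
hypothetical, it is not the kernel's list.h, and locking is omitted.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* An initialized node (or empty list head) points at itself. */
static inline void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static inline bool list_is_empty(const struct list_head *h)
{
	return h->next == h;
}

/* Insert node right after head. */
static inline void list_add_node(struct list_head *node, struct list_head *head)
{
	node->next = head->next;
	node->prev = head;
	head->next->prev = node;
	head->next = node;
}

/* Unlink node, then re-initialize it so it is self-linked again. */
static inline void list_del_init_node(struct list_head *node)
{
	node->prev->next = node->next;
	node->next->prev = node->prev;
	init_list_head(node);
}

/*
 * Hypothetical helper mirroring the "if (!list_empty(...)) list_del_init(...)"
 * pattern from the diff. Returns true if the node was linked and got removed.
 */
static inline bool remove_if_linked(struct list_head *node)
{
	if (list_is_empty(node))
		return false;
	list_del_init_node(node);
	return true;
}

/* Walks through never-added, added, and already-removed states. */
void demo_guarded_removal(void)
{
	struct list_head dying_tasks, cg_list;

	init_list_head(&dying_tasks);
	init_list_head(&cg_list);

	assert(!remove_if_linked(&cg_list));	/* never added: no-op */
	list_add_node(&cg_list, &dying_tasks);
	assert(remove_if_linked(&cg_list));	/* linked: removed */
	assert(list_is_empty(&dying_tasks));
	assert(!remove_if_linked(&cg_list));	/* second call: no-op */
}
```

Calling demo_guarded_removal() from a test main() exercises all three states.
In this sketch the guard makes the removal a no-op for a task that was never
put on the list, which is the case for tasks other than the group leaders
described above.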

An upcoming patch will defer the dying_tasks list addition, moving it from
cgroup_task_exit() (called from do_exit()) to a new function called from
finish_task_switch(). However, release_task() (which calls
cgroup_task_release()) can run either before or after finish_task_switch(),
creating a race where cgroup_task_release() might try to remove the task from
dying_tasks before or while it's being added.

Move the list_del_init() from cgroup_task_release() to cgroup_task_free() to
avoid this race. cgroup_task_free() runs from __put_task_struct(), which
always runs after both paths have completed, so the cleanup is safe there.
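
The ordering argument can be illustrated with a small single-threaded toy
model. This is only a sketch: every name in it (toy_task, toy_add_dying,
toy_release, toy_free, toy_leaks_entry) is hypothetical, a boolean stands in
for "task->cg_list is linked on dying_tasks", and it covers only the two
sequential orderings, not the concurrent "while it's being added" case.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: the flag stands in for "cg_list is linked on dying_tasks". */
struct toy_task {
	bool on_dying_list;
};

enum cleanup_site { CLEANUP_IN_RELEASE, CLEANUP_IN_FREE };

/* Stand-in for the deferred addition done from finish_task_switch(). */
static inline void toy_add_dying(struct toy_task *t)
{
	t->on_dying_list = true;
}

/* Stand-in for cgroup_task_release(), reached from release_task(). */
static inline void toy_release(struct toy_task *t, enum cleanup_site site)
{
	if (site == CLEANUP_IN_RELEASE)
		t->on_dying_list = false;
}

/* Stand-in for cgroup_task_free(), reached from __put_task_struct(). */
static inline void toy_free(struct toy_task *t, enum cleanup_site site)
{
	if (site == CLEANUP_IN_FREE)
		t->on_dying_list = false;
}

/* Returns true if the task is still linked after the final free step. */
bool toy_leaks_entry(bool release_first, enum cleanup_site site)
{
	struct toy_task t = { .on_dying_list = false };

	if (release_first) {
		toy_release(&t, site);
		toy_add_dying(&t);
	} else {
		toy_add_dying(&t);
		toy_release(&t, site);
	}
	toy_free(&t, site);	/* always the last step in this model */

	return t.on_dying_list;
}
```

In this model, toy_leaks_entry(true, CLEANUP_IN_RELEASE) is the only
combination that returns true: the removal runs before the addition and the
entry is left behind. With CLEANUP_IN_FREE, both orderings end with the task
unlinked.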

Cc: Dan Schatzberg <dschatzberg@meta.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
parent 16dad780
+5 −4
@@ -7019,6 +7019,11 @@ void cgroup_task_release(struct task_struct *task)
 	do_each_subsys_mask(ss, ssid, have_release_callback) {
 		ss->release(task);
 	} while_each_subsys_mask();
+}
+
+void cgroup_task_free(struct task_struct *task)
+{
+	struct css_set *cset = task_css_set(task);
 
 	if (!list_empty(&task->cg_list)) {
 		spin_lock_irq(&css_set_lock);
@@ -7026,11 +7031,7 @@ void cgroup_task_release(struct task_struct *task)
 		list_del_init(&task->cg_list);
 		spin_unlock_irq(&css_set_lock);
 	}
-}
 
-void cgroup_task_free(struct task_struct *task)
-{
-	struct css_set *cset = task_css_set(task);
 	put_css_set(cset);
 }