Commit a72f73c4 authored by Sebastian Andrzej Siewior, committed by Tejun Heo

cgroup: Don't expose dead tasks in cgroup



Once a task exits, its state is set to TASK_DEAD and it is then
removed from the cgroup it belonged to. The last step happens once the task
gets out of its last schedule() invocation and is delayed on PREEMPT_RT
due to locking constraints.

As a result, waitpid() can return the pid of a task which is still
listed in cgroup.procs of the cgroup it belonged to. systemd does not
expect this and, as a result, waits for the task's exit until a timeout
occurs.
This can also be reproduced on a !PREEMPT_RT kernel with a significant
delay in do_exit() after exit_notify().

Hide tasks which have PF_EXITING set from the output. The flag is set
before the parent is notified. Keeping zombies with live threads visible
shouldn't break anything (suggested by Tejun).
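
The filter condition can be modelled outside the kernel as follows. This is an illustrative sketch only: struct mock_task and task_is_hidden() are hypothetical stand-ins, and signal_live models the value of atomic_read(&task->signal->live). The PF_EXITING value is the one defined in include/linux/sched.h.

```c
#include <stdbool.h>

/* Value of PF_EXITING as defined in include/linux/sched.h. */
#define PF_EXITING 0x00000004u

/* Hypothetical stand-in for the task_struct fields the check reads. */
struct mock_task {
	unsigned int flags;	/* models task->flags */
	int signal_live;	/* models atomic_read(&task->signal->live) */
};

/*
 * True when the iterator would skip the task: it is exiting and no
 * thread in its thread group is live any more.
 */
static bool task_is_hidden(const struct mock_task *t)
{
	return (t->flags & PF_EXITING) && t->signal_live == 0;
}
```

In this model a running task and a zombie leader whose threads are still live both stay visible. Only an exiting task with no live threads left is skipped.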

Reported-by: Bert Karwatzki <spasswolf@web.de>
Closes: https://lore.kernel.org/all/20260219164648.3014-1-spasswolf@web.de/
Tested-by: Bert Karwatzki <spasswolf@web.de>
Fixes: 9311e6c2 ("cgroup: Fix sleeping from invalid context warning on PREEMPT_RT")
Cc: stable@vger.kernel.org # v6.19+
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
parent ca174c70
+6 −0
@@ -5109,6 +5109,12 @@ static void css_task_iter_advance(struct css_task_iter *it)
 		return;

 	task = list_entry(it->task_pos, struct task_struct, cg_list);
+	/*
+	 * Hide tasks that are exiting but not yet removed. Keep zombie
+	 * leaders with live threads visible.
+	 */
+	if ((task->flags & PF_EXITING) && !atomic_read(&task->signal->live))
+		goto repeat;
+
 	if (it->flags & CSS_TASK_ITER_PROCS) {
 		/* if PROCS, skip over tasks which aren't group leaders */