Commit d9b05321 authored Aug 22, 2025 by Sebastian Andrzej Siewior Committed by Borislav Petkov (AMD) Aug 31, 2025

futex: Move futex_hash_free() back to __mmput()

To avoid a memory leak via mm_alloc() + mmdrop(), the futex cleanup code
was moved to __mmdrop(). This resulted in warnings if the futex hash
table had been allocated via vmalloc() and mmdrop() was invoked from
atomic context.
The free path must stay in __mmput() to ensure it is invoked from
preemptible context.

In order to avoid the memory leak, delay the allocation of
mm_struct::futex_ref to futex_hash_allocate(). This works because
neither the per-CPU counter nor the private hash has been allocated and
therefore:
- futex_private_hash() callers (such as exit_pi_state_list()) don't
  acquire a reference if there is no private hash yet. There is also no
  reference put.

- Regular callers (futex_hash()) fall back to the global hash. No
  reference counting here.
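
The two caller cases above can be sketched as a small userspace model. This is
only an illustration under stated assumptions: struct mm_model, struct
hash_table and the model_*() helpers are hypothetical stand-ins, not the
kernel's types or functions, and a plain integer replaces the per-CPU counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real definitions. */
struct hash_table {
	int is_private;
};

struct mm_model {
	struct hash_table *private_hash;	/* NULL until a private hash exists */
	unsigned int *ref;			/* models mm->futex_ref, NULL until allocated */
	unsigned int refs_taken;		/* bookkeeping for this model only */
};

static struct hash_table global_hash = { .is_private = 0 };

/* Models futex_private_hash(): without a private hash no reference is taken. */
static struct hash_table *model_private_hash_get(struct mm_model *mm)
{
	if (!mm->private_hash)
		return NULL;

	/* A private hash implies that the counter has been allocated. */
	assert(mm->ref != NULL);
	(*mm->ref)++;
	mm->refs_taken++;
	return mm->private_hash;
}

/* Models the matching put: nothing was taken, so nothing is dropped. */
static void model_private_hash_put(struct mm_model *mm, struct hash_table *h)
{
	if (!h || !h->is_private)
		return;
	(*mm->ref)--;
}

/* Models futex_hash(): fall back to the global hash, which is not refcounted. */
static struct hash_table *model_hash_lookup(struct mm_model *mm)
{
	struct hash_table *h = model_private_hash_get(mm);

	return h ? h : &global_hash;
}
```

In this model an mm without a private hash never dereferences the counter
pointer, which is why leaving it NULL until the hash is allocated is safe.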

The futex_ref member can be allocated in futex_hash_allocate() before
the private hash itself is allocated. This happens either while the
first thread is created or on request. In both cases the process has
just a single thread, so there can be either a futex operation in
progress or a request to create a private hash, but not both at once.
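
The allocation order described above can be sketched in userspace as follows.
This is a rough model, not the kernel code: struct mm_model and the model_*()
functions are hypothetical names, calloc()/free() stand in for alloc_percpu()
and kvzalloc(), and -1 stands in for -ENOMEM.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical model of the two mm_struct members involved. */
struct mm_model {
	unsigned int *futex_ref;	/* stands in for the per-CPU counter */
	void *futex_phash;		/* stands in for the private hash */
};

/* Models futex_mm_init(): nothing is allocated at mm creation time. */
static void model_mm_init(struct mm_model *mm)
{
	mm->futex_ref = NULL;
	mm->futex_phash = NULL;
}

/* Models futex_hash_allocate(): the counter is created on first use. */
static int model_hash_allocate(struct mm_model *mm, size_t slots)
{
	void *fph;

	assert(slots > 0);

	if (!mm->futex_ref) {
		/* Only a single thread exists here, so no locking is modelled. */
		mm->futex_ref = calloc(1, sizeof(*mm->futex_ref));
		if (!mm->futex_ref)
			return -1;
		*mm->futex_ref = 1;	/* 0 -> 1 */
	}

	fph = calloc(slots, sizeof(long));
	if (!fph)
		return -1;

	free(mm->futex_phash);
	mm->futex_phash = fph;
	return 0;
}

/* Models futex_hash_free() as called from the __mmput() side. */
static void model_hash_free(struct mm_model *mm)
{
	free(mm->futex_phash);
	free(mm->futex_ref);
	mm->futex_phash = NULL;
	mm->futex_ref = NULL;
}
```

An mm that is initialised and then released without ever reaching
model_hash_allocate() owns no futex memory in this model, which mirrors how
the mm_alloc() + mmdrop() path no longer has anything to leak.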

Move futex_hash_free() back to __mmput().
Move the allocation of mm_struct::futex_ref to futex_hash_allocate().

  [ bp: Fold a follow-up fix to prevent a use-after-free:
    https://lore.kernel.org/r/20250830213806.sEKuuGSm@linutronix.de ]

Fixes: e703b7e2 ("futex: Move futex cleanup to __mmdrop()")
Closes: https://lore.kernel.org/all/20250821102721.6deae493@kernel.org/
Reported-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lkml.kernel.org/r/20250822141238.PfnkTjFb@linutronix.de

parent 1b237f19

kernel/fork.c

+1 −1

@@ -689,7 +689,6 @@ void __mmdrop(struct mm_struct *mm)
 	mm_pasid_drop(mm);
 	mm_destroy_cid(mm);
 	percpu_counter_destroy_many(mm->rss_stat, NR_MM_COUNTERS);
-	futex_hash_free(mm);

 	free_mm(mm);
 }
@@ -1138,6 +1137,7 @@ static inline void __mmput(struct mm_struct *mm)
 	if (mm->binfmt)
 		module_put(mm->binfmt->module);
 	lru_gen_del_mm(mm);
+	futex_hash_free(mm);
 	mmdrop(mm);
 }

kernel/futex/core.c

+12 −4

@@ -1722,12 +1722,9 @@ int futex_mm_init(struct mm_struct *mm)
 	RCU_INIT_POINTER(mm->futex_phash, NULL);
 	mm->futex_phash_new = NULL;
 	/* futex-ref */
+	mm->futex_ref = NULL;
 	atomic_long_set(&mm->futex_atomic, 0);
 	mm->futex_batches = get_state_synchronize_rcu();
-	mm->futex_ref = alloc_percpu(unsigned int);
-	if (!mm->futex_ref)
-		return -ENOMEM;
-	this_cpu_inc(*mm->futex_ref); /* 0 -> 1 */
 	return 0;
 }

@@ -1801,6 +1798,17 @@ static int futex_hash_allocate(unsigned int hash_slots, unsigned int flags)
 		}
 	}

+	if (!mm->futex_ref) {
+		/*
+		 * This will always be allocated by the first thread and
+		 * therefore requires no locking.
+		 */
+		mm->futex_ref = alloc_percpu(unsigned int);
+		if (!mm->futex_ref)
+			return -ENOMEM;
+		this_cpu_inc(*mm->futex_ref); /* 0 -> 1 */
+	}
+
 	fph = kvzalloc(struct_size(fph, queues, hash_slots),
 		       GFP_KERNEL_ACCOUNT | __GFP_NOWARN);
 	if (!fph)