Commit 7f995823 authored by Peter Zijlstra's avatar Peter Zijlstra Committed by Ingo Molnar
Browse files

x86/mm: Fix false positive warning in switch_mm_irqs_off()

Multiple testers reported the following new warning:

	WARNING: CPU: 0 PID: 0 at arch/x86/mm/tlb.c:795

Which corresponds to:

	if (IS_ENABLED(CONFIG_DEBUG_VM) && WARN_ON_ONCE(prev != &init_mm &&
	    !cpumask_test_cpu(cpu, mm_cpumask(next))))
		cpumask_set_cpu(cpu, mm_cpumask(next));

So the problem is that unuse_temporary_mm() explicitly clears
that bit; and it has to, because otherwise the flush_tlb_mm_range() in
__text_poke() will try sending IPIs, which are not at all needed.

See also:

   https://lore.kernel.org/all/20241113095550.GBZzR3pg-RhJKPDazS@fat_crate.local/



Notably, the whole {,un}use_temporary_mm() thing requires preemption to
be disabled across it with the express purpose of keeping all TLB
nonsense CPU local, such that invalidations can also stay local etc.

However, as a side-effect, we violate this above WARN(), which sorta
makes sense for the normal case, but very much doesn't make sense here.

Change unuse_temporary_mm() to mark the mm_struct such that a further
exception (beyond init_mm) can be grafted, to keep the warning for all
the other cases.

Reported-by: default avatarChaitanya Kumar Borah <chaitanya.kumar.borah@intel.com>
Reported-by: default avatarJani Nikula <jani.nikula@linux.intel.com>
Signed-off-by: default avatarPeter Zijlstra <peterz@infradead.org>
Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Rik van Riel <riel@surriel.com>
Link: https://lore.kernel.org/r/20250430081154.GH4439@noisy.programming.kicks-ass.net
parent 43c2df7e
Loading
Loading
Loading
Loading
+2 −2
Original line number Diff line number Diff line
@@ -16,6 +16,8 @@
#define MM_CONTEXT_LOCK_LAM		2
/* Allow LAM and SVA coexisting */
#define MM_CONTEXT_FORCE_TAGGED_SVA	3
/* Tracks mm_cpumask */
#define MM_CONTEXT_NOTRACK		4

/*
 * x86 has arch-specific MMU state beyond what lives in mm_struct.
@@ -44,9 +46,7 @@ typedef struct {
	struct ldt_struct	*ldt;
#endif

#ifdef CONFIG_X86_64
	unsigned long flags;
#endif

#ifdef CONFIG_ADDRESS_MASKING
	/* Active LAM mode:  X86_CR3_LAM_U48 or X86_CR3_LAM_U57 or 0 (disabled) */
+10 −0
Original line number Diff line number Diff line
@@ -247,6 +247,16 @@ static inline bool is_64bit_mm(struct mm_struct *mm)
}
#endif

static inline bool is_notrack_mm(struct mm_struct *mm)
{
	return test_bit(MM_CONTEXT_NOTRACK, &mm->context.flags);
}

static inline void set_notrack_mm(struct mm_struct *mm)
{
	set_bit(MM_CONTEXT_NOTRACK, &mm->context.flags);
}

/*
 * We only want to enforce protection keys on the current process
 * because we effectively have no access to PKRU for other
+3 −0
Original line number Diff line number Diff line
@@ -28,6 +28,7 @@
#include <asm/text-patching.h>
#include <asm/memtype.h>
#include <asm/paravirt.h>
#include <asm/mmu_context.h>

/*
 * We need to define the tracepoints somewhere, and tlb.c
@@ -830,6 +831,8 @@ void __init poking_init(void)
	/* Xen PV guests need the PGD to be pinned. */
	paravirt_enter_mmap(text_poke_mm);

	set_notrack_mm(text_poke_mm);

	/*
	 * Randomize the poking address, but make sure that the following page
	 * will be mapped at the same PMD. We need 2 pages, so find space for 3,
+2 −1
Original line number Diff line number Diff line
@@ -847,7 +847,8 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next,
		 * mm_cpumask. The TLB shootdown code can figure out from
		 * cpu_tlbstate_shared.is_lazy whether or not to send an IPI.
		 */
		if (IS_ENABLED(CONFIG_DEBUG_VM) && WARN_ON_ONCE(prev != &init_mm &&
		if (IS_ENABLED(CONFIG_DEBUG_VM) &&
		    WARN_ON_ONCE(prev != &init_mm && !is_notrack_mm(prev) &&
				 !cpumask_test_cpu(cpu, mm_cpumask(next))))
			cpumask_set_cpu(cpu, mm_cpumask(next));

+1 −0
Original line number Diff line number Diff line
@@ -89,6 +89,7 @@ int __init efi_alloc_page_tables(void)
	efi_mm.pgd = efi_pgd;
	mm_init_cpumask(&efi_mm);
	init_new_context(NULL, &efi_mm);
	set_notrack_mm(&efi_mm);

	return 0;