Commit 83b0177a authored by Ingo Molnar, committed by Dave Hansen

x86/mm: Fix SMP ordering in switch_mm_irqs_off()



Stephen noted that it is possible to not have an smp_mb() between
the loaded_mm store and the tlb_gen load in switch_mm(), meaning the
ordering against flush_tlb_mm_range() goes out the window, and it
becomes possible for switch_mm() to not observe a recent tlb_gen
update and fail to flush the TLBs.

[ dhansen: merge conflict fixed by Ingo ]

Fixes: 209954cb ("x86/mm/tlb: Update mm_cpumask lazily")
Reported-by: Stephen Dolan <sdolan@janestreet.com>
Closes: https://lore.kernel.org/all/CAHDw0oGd0B4=uuv8NGqbUQ_ZVmSheU2bN70e4QhFXWvuAZdt2w@mail.gmail.com/


Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
parent f25785f9
+22 −2
@@ -911,11 +911,31 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next,
 		 * CR3 and cpu_tlbstate.loaded_mm are not all in sync.
 		 */
 		this_cpu_write(cpu_tlbstate.loaded_mm, LOADED_MM_SWITCHING);
-		barrier();

-		/* Start receiving IPIs and then read tlb_gen (and LAM below) */
+		/*
+		 * Make sure this CPU is set in mm_cpumask() such that we'll
+		 * receive invalidation IPIs.
+		 *
+		 * Rely on the smp_mb() implied by cpumask_set_cpu()'s atomic
+		 * operation, or explicitly provide one. Such that:
+		 *
+		 * switch_mm_irqs_off()				flush_tlb_mm_range()
+		 *   smp_store_release(loaded_mm, SWITCHING);     atomic64_inc_return(tlb_gen)
+		 *   smp_mb(); // here                            // smp_mb() implied
+		 *   atomic64_read(tlb_gen);                      this_cpu_read(loaded_mm);
+		 *
+		 * we properly order against flush_tlb_mm_range(), where the
+		 * loaded_mm load can happen in mative_flush_tlb_multi() ->
+		 * should_flush_tlb().
+		 *
+		 * This way switch_mm() must see the new tlb_gen or
+		 * flush_tlb_mm_range() must see the new loaded_mm, or both.
+		 */
 		if (next != &init_mm && !cpumask_test_cpu(cpu, mm_cpumask(next)))
 			cpumask_set_cpu(cpu, mm_cpumask(next));
+		else
+			smp_mb();
+
 		next_tlb_gen = atomic64_read(&next->context.tlb_gen);

 		ns = choose_new_asid(next, next_tlb_gen);
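
The ordering argument in the new comment is the classic store-buffering pattern: each side writes one variable and then reads the other, and a full barrier between the write and the read on both sides rules out the outcome where both reads return stale values. The following is only a user-space sketch of that pattern with C11 atomics and pthreads, not kernel code. The names loaded_mm, tlb_gen, switch_side(), flush_side() and count_missed_flushes() are hypothetical stand-ins, and the explicit seq_cst fences are an assumption used to model smp_mb() and the barrier implied by atomic64_inc_return().

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel state; not the kernel's types. */
#define MM_OLD       0
#define MM_SWITCHING 1

static atomic_int loaded_mm;           /* models cpu_tlbstate.loaded_mm */
static atomic_uint_fast64_t tlb_gen;   /* models mm->context.tlb_gen */
static uint_fast64_t seen_gen;         /* value read by the switching side */
static int seen_mm;                    /* value read by the flushing side */

/* Models switch_mm_irqs_off(): store loaded_mm, full barrier, load tlb_gen. */
static void *switch_side(void *arg)
{
	(void)arg;
	atomic_store_explicit(&loaded_mm, MM_SWITCHING, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* models smp_mb() */
	seen_gen = atomic_load_explicit(&tlb_gen, memory_order_relaxed);
	return NULL;
}

/* Models flush_tlb_mm_range(): bump tlb_gen, full barrier, load loaded_mm. */
static void *flush_side(void *arg)
{
	(void)arg;
	atomic_fetch_add_explicit(&tlb_gen, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* models the implied smp_mb() */
	seen_mm = atomic_load_explicit(&loaded_mm, memory_order_relaxed);
	return NULL;
}

/*
 * Run both sides concurrently 'iterations' times and count how often both
 * reads were stale, i.e. the switch missed the new tlb_gen AND the flush
 * missed the new loaded_mm. With both fences in place this must be 0.
 */
int count_missed_flushes(int iterations)
{
	int missed = 0;

	for (int i = 0; i < iterations; i++) {
		pthread_t a, b;
		int rc;

		atomic_store(&loaded_mm, MM_OLD);
		atomic_store(&tlb_gen, 0);

		rc = pthread_create(&a, NULL, switch_side, NULL);
		assert(rc == 0);
		rc = pthread_create(&b, NULL, flush_side, NULL);
		assert(rc == 0);
		pthread_join(a, NULL);
		pthread_join(b, NULL);

		if (seen_gen == 0 && seen_mm == MM_OLD)
			missed++;
	}
	return missed;
}
```

Calling count_missed_flushes() from a small main() and building with -pthread should always return 0. Removing the fence in switch_side() permits the stale/stale outcome, which corresponds to the missed TLB flush this commit fixes, although a short test run is not guaranteed to reproduce it.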