Commit 40cd0e8d authored by Ran Xiaokai, committed by Andrew Morton

KHO: fix boot failure due to kmemleak access to non-PRESENT pages

When booting with debug_pagealloc=on and the following configuration:
CONFIG_KEXEC_HANDOVER_ENABLE_DEFAULT=y
CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF=n
the system fails to boot due to page faults during kmemleak scanning.

This occurs because, with debug_pagealloc enabled, __free_pages() invokes
debug_pagealloc_unmap_pages(), which clears the _PAGE_PRESENT bit for freed
pages in the kernel page table.  KHO scratch areas are allocated from
memblock and tracked by kmemleak.  However, these areas do not remain
reserved; they are later released to the page allocator using
init_cma_reserved_pageblock().  Subsequent kmemleak scans then access
non-PRESENT pages, leading to fatal page faults.

Fix this by marking scratch areas with kmemleak_ignore_phys() after they
are allocated from memblock, which excludes them from kmemleak scanning
before they are released to the buddy allocator.
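
The failure and the fix described above can be illustrated with a small
userspace model.  This is only a sketch under stated assumptions, not kernel
code: page_present[], struct tracked_area, free_pages_model(), ignore_area()
and scan_faults() are hypothetical names invented for the illustration, and
the model counts would-be faults where a real kernel would oops on the first
one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_PAGES 16

/* Toy stand-in for the kernel page table: one PRESENT flag per page. */
static bool page_present[NR_PAGES];

/* Toy stand-in for a kmemleak object tracking one memblock allocation. */
struct tracked_area {
	size_t first_page;
	size_t nr_pages;
	bool ignored;	/* set by the hypothetical ignore_area() below */
};

/* Start with every page mapped, as right after the memblock allocation. */
static void map_all_pages(void)
{
	for (size_t i = 0; i < NR_PAGES; i++)
		page_present[i] = true;
}

/* Models __free_pages() with debug_pagealloc=on: freed pages lose PRESENT. */
static void free_pages_model(size_t first, size_t nr)
{
	assert(first + nr <= NR_PAGES);
	for (size_t i = 0; i < nr; i++)
		page_present[first + i] = false;
}

/* Models the effect of kmemleak_ignore_phys(): the area is no longer scanned. */
static void ignore_area(struct tracked_area *area)
{
	area->ignored = true;
}

/* Models one scan pass and returns how many non-PRESENT pages it would touch. */
static size_t scan_faults(const struct tracked_area *area)
{
	size_t faults = 0;

	if (area->ignored)
		return 0;

	for (size_t i = 0; i < area->nr_pages; i++)
		if (!page_present[area->first_page + i])
			faults++;

	return faults;
}
```

In this model, a tracked area that is freed without being ignored makes
scan_faults() return a non-zero count, while calling ignore_area() first
makes the scan skip the area entirely, which mirrors the ordering the patch
relies on: ignore first, then release to the page allocator.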

[ran.xiaokai@zte.com.cn: add comment]
  Link: https://lkml.kernel.org/r/20251127122700.103927-1-ranxiaokai627@163.com
Link: https://lkml.kernel.org/r/20251122182929.92634-1-ranxiaokai627@163.com

Signed-off-by: Ran Xiaokai <ran.xiaokai@zte.com.cn>
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Pratyush Yadav <pratyush@kernel.org>
Cc: Alexander Graf <graf@amazon.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Changyuan Lyu <changyuanl@google.com>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
parent fb5c3644
+10 −0
@@ -11,6 +11,7 @@

#include <linux/cleanup.h>
#include <linux/cma.h>
#include <linux/kmemleak.h>
#include <linux/count_zeros.h>
#include <linux/kexec.h>
#include <linux/kexec_handover.h>
@@ -1369,6 +1370,15 @@ static __init int kho_init(void)
		unsigned long count = kho_scratch[i].size >> PAGE_SHIFT;
		unsigned long pfn;

		/*
		 * When debug_pagealloc is enabled, __free_pages() clears the
		 * corresponding PRESENT bit in the kernel page table.
		 * Subsequent kmemleak scans of these pages cause the
		 * non-PRESENT page faults.
		 * Mark scratch areas with kmemleak_ignore_phys() to exclude
		 * them from kmemleak scanning.
		 */
		kmemleak_ignore_phys(kho_scratch[i].addr);
		for (pfn = base_pfn; pfn < base_pfn + count;
		     pfn += pageblock_nr_pages)
			init_cma_reserved_pageblock(pfn_to_page(pfn));