Commit 689b8986 authored by Zi Yan, committed by Andrew Morton

mm/memory-failure: improve large block size folio handling

Large block size (LBS) folios cannot be split to order-0 folios, only down
to min_order_for_split().  The current split attempt fails outright, which
is not optimal.  Split the folio to min_order_for_split() instead, so that
after the split only the folio containing the poisoned page becomes
unusable.
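
The saving can be illustrated with a small userspace sketch. This is a simplified model and not kernel code: pages_in_order() and unusable_pages() are hypothetical helpers, and the model assumes that a folio of order n spans 2^n base pages and that a successful split leaves the poisoned page in exactly one folio of the target order.

```c
#include <assert.h>

/* Hypothetical userspace model, not kernel code. */

/* A folio of order n spans 2^n base pages. */
static unsigned long pages_in_order(unsigned int order)
{
	return 1UL << order;
}

/*
 * Base pages left unusable after one page of a folio is poisoned.
 * split_ok != 0: the folio was split to min_order, so only the resulting
 *                folio that holds the poisoned page is lost.
 * split_ok == 0: no split happened, so the whole original folio is lost.
 */
static unsigned long unusable_pages(unsigned int folio_order,
				    unsigned int min_order, int split_ok)
{
	assert(min_order <= folio_order);
	return pages_in_order(split_ok ? min_order : folio_order);
}
```

Under these assumptions, an order-4 folio (16 pages) with a minimum order of 2 loses 4 pages after a successful split, compared with all 16 when the split is not attempted.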

For soft offline, do not split the large folio if its
min_order_for_split() is not 0, since the folio is still accessible from
userspace and a premature split might cause performance loss.
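
The soft offline policy described above can be sketched as a pure decision function. This is a userspace model under stated assumptions, not the kernel implementation: the enum and soft_offline_policy() are hypothetical names introduced only for illustration, and min_order stands in for the value the kernel obtains from min_order_for_split().

```c
/* Hypothetical outcome codes for this userspace model, not kernel values. */
enum soft_offline_action {
	SOFT_OFFLINE_KEEP_FOLIO,	/* leave the large folio intact, report busy */
	SOFT_OFFLINE_SPLIT_TO_0,	/* attempt a split down to order-0 pages */
};

/*
 * Decide what soft offline does with a large folio in this model.
 * A non-zero minimum order means order-0 is unreachable, so the folio is
 * left alone instead of being split to an intermediate order.
 */
static enum soft_offline_action soft_offline_policy(unsigned int min_order)
{
	return min_order ? SOFT_OFFLINE_KEEP_FOLIO : SOFT_OFFLINE_SPLIT_TO_0;
}
```

In this model a folio whose minimum order is 0 proceeds to the order-0 split attempt, while any LBS folio is kept whole and the request is reported as busy.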

Link: https://lkml.kernel.org/r/20251031162001.670503-3-ziy@nvidia.com


Signed-off-by: Zi Yan <ziy@nvidia.com>
Suggested-by: Jane Chu <jane.chu@oracle.com>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Barry Song <baohua@kernel.org>
Reviewed-by: Lance Yang <lance.yang@linux.dev>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Pankaj Raghav <kernel@pankajraghav.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
parent a7ef12c6
+27 −4
@@ -1659,12 +1659,13 @@ static int identify_page_state(unsigned long pfn, struct page *p,
  * there is still more to do, hence the page refcount we took earlier
  * is still needed.
  */
-static int try_to_split_thp_page(struct page *page, bool release)
+static int try_to_split_thp_page(struct page *page, unsigned int new_order,
+		bool release)
 {
 	int ret;

 	lock_page(page);
-	ret = split_huge_page(page);
+	ret = split_huge_page_to_order(page, new_order);
 	unlock_page(page);

 	if (ret && release)
@@ -2420,6 +2421,9 @@ int memory_failure(unsigned long pfn, int flags)
 	folio_unlock(folio);

 	if (folio_test_large(folio)) {
+		const int new_order = min_order_for_split(folio);
+		int err;
+
 		/*
 		 * The flag must be set after the refcount is bumped
 		 * otherwise it may race with THP split.
@@ -2434,7 +2438,16 @@ int memory_failure(unsigned long pfn, int flags)
 		 * page is a valid handlable page.
 		 */
 		folio_set_has_hwpoisoned(folio);
-		if (try_to_split_thp_page(p, false) < 0) {
+		err = try_to_split_thp_page(p, new_order, /* release= */ false);
+		/*
+		 * If splitting a folio to order-0 fails, kill the process.
+		 * Split the folio regardless to minimize unusable pages.
+		 * Because the memory failure code cannot handle large
+		 * folios, this split is always treated as if it failed.
+		 */
+		if (err || new_order) {
+			/* get folio again in case the original one is split */
+			folio = page_folio(p);
 			res = -EHWPOISON;
 			kill_procs_now(p, pfn, flags, folio);
 			put_page(p);
@@ -2761,7 +2774,17 @@ static int soft_offline_in_use_page(struct page *page)
 	};

 	if (!huge && folio_test_large(folio)) {
-		if (try_to_split_thp_page(page, true)) {
+		const int new_order = min_order_for_split(folio);
+
+		/*
+		 * If new_order (target split order) is not 0, do not split the
+		 * folio at all to retain the still accessible large folio.
+		 * NOTE: if minimizing the number of soft offline pages is
+		 * preferred, split it to non-zero new_order like it is done in
+		 * memory_failure().
+		 */
+		if (new_order || try_to_split_thp_page(page, /* new_order= */ 0,
+						       /* release= */ true)) {
 			pr_info("%#lx: thp split failed\n", pfn);
 			return -EBUSY;
 		}