Commit 6dd55dd1 authored by David Hildenbrand, committed by Andrew Morton

fs/proc/task_mmu: remove per-page mapcount dependency for smaps/smaps_rollup (CONFIG_NO_PAGE_MAPCOUNT)

Let's implement an alternative when per-page mapcounts in large folios are
no longer maintained -- soon with CONFIG_NO_PAGE_MAPCOUNT.

When computing the output for smaps / smaps_rollup, in particular when
calculating the USS (Unique Set Size) and the PSS (Proportional Set Size),
we still rely on per-page mapcounts.

To determine private vs.  shared, we'll use folio_maybe_mapped_shared(),
similar to how we handle PM_MMAP_EXCLUSIVE.  As in that case, we might now
under-estimate the USS and count pages towards "shared" that are actually
"private" ("exclusively mapped").

When calculating the PSS, we'll now also use the average per-page mapcount
for large folios: this can result in both an over-estimation and an
under-estimation of the PSS.  The difference is not expected to matter
much in practice, but we'll have to learn as we go.

We can now provide folio_precise_page_mapcount() only with
CONFIG_PAGE_MAPCOUNT, and remove one of the last users of per-page
mapcounts when CONFIG_NO_PAGE_MAPCOUNT is enabled.

Document the new behavior.

Link: https://lkml.kernel.org/r/20250303163014.1128035-20-david@redhat.com


Signed-off-by: David Hildenbrand <david@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Lance Yang <ioworker0@gmail.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Koutný <mkoutny@suse.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zefan Li <lizefan.x@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
parent 7a34ae14
+19 −3
@@ -502,9 +502,25 @@ process, its PSS will be 1500. "Pss_Dirty" is the portion of PSS which
 consists of dirty pages.  ("Pss_Clean" is not included, but it can be
 calculated by subtracting "Pss_Dirty" from "Pss".)

-Note that even a page which is part of a MAP_SHARED mapping, but has only
-a single pte mapped, i.e.  is currently used by only one process, is accounted
-as private and not as shared.
+Traditionally, a page is accounted as "private" if it is mapped exactly once,
+and a page is accounted as "shared" when mapped multiple times, even when
+mapped in the same process multiple times. Note that this accounting is
+independent of MAP_SHARED.
+
+In some kernel configurations, the semantics of pages part of a larger
+allocation (e.g., THP) can differ: a page is accounted as "private" if all
+pages part of the corresponding large allocation are *certainly* mapped in the
+same process, even if the page is mapped multiple times in that process. A
+page is accounted as "shared" if any page of the larger allocation
+is *maybe* mapped in a different process. In some cases, a large allocation
+might be treated as "maybe mapped by multiple processes" even though this
+is no longer the case.
+
+Some kernel configurations do not track the precise number of times a page part
+of a larger allocation is mapped. In this case, when calculating the PSS, the
+average number of mappings per page in this larger allocation might be used
+as an approximation for the number of mappings of a page. The PSS calculation
+will be imprecise in this case.

 "Referenced" indicates the amount of memory currently marked as referenced or
 accessed.
+8 −0
@@ -157,6 +157,7 @@ unsigned name_to_int(const struct qstr *qstr);
 /* Worst case buffer size needed for holding an integer. */
 #define PROC_NUMBUF 13

+#ifdef CONFIG_PAGE_MAPCOUNT
 /**
  * folio_precise_page_mapcount() - Number of mappings of this folio page.
  * @folio: The folio.
@@ -187,6 +188,13 @@ static inline int folio_precise_page_mapcount(struct folio *folio,

 	return mapcount;
 }
+#else /* !CONFIG_PAGE_MAPCOUNT */
+static inline int folio_precise_page_mapcount(struct folio *folio,
+		struct page *page)
+{
+	BUILD_BUG();
+}
+#endif /* CONFIG_PAGE_MAPCOUNT */

 /**
  * folio_average_page_mapcount() - Average number of mappings per page in this
+15 −2
@@ -707,6 +707,8 @@ static void smaps_account(struct mem_size_stats *mss, struct page *page,
 	struct folio *folio = page_folio(page);
 	int i, nr = compound ? compound_nr(page) : 1;
 	unsigned long size = nr * PAGE_SIZE;
+	bool exclusive;
+	int mapcount;

 	/*
 	 * First accumulate quantities that depend only on |size| and the type
@@ -747,18 +749,29 @@ static void smaps_account(struct mem_size_stats *mss, struct page *page,
 				      dirty, locked, present);
 		return;
 	}
+
+	if (IS_ENABLED(CONFIG_NO_PAGE_MAPCOUNT)) {
+		mapcount = folio_average_page_mapcount(folio);
+		exclusive = !folio_maybe_mapped_shared(folio);
+	}
+
 	/*
 	 * We obtain a snapshot of the mapcount. Without holding the folio lock
 	 * this snapshot can be slightly wrong as we cannot always read the
 	 * mapcount atomically.
 	 */
 	for (i = 0; i < nr; i++, page++) {
-		int mapcount = folio_precise_page_mapcount(folio, page);
 		unsigned long pss = PAGE_SIZE << PSS_SHIFT;
+
+		if (IS_ENABLED(CONFIG_PAGE_MAPCOUNT)) {
+			mapcount = folio_precise_page_mapcount(folio, page);
+			exclusive = mapcount < 2;
+		}
+
 		if (mapcount >= 2)
 			pss /= mapcount;
 		smaps_page_accumulate(mss, folio, PAGE_SIZE, pss,
-				dirty, locked, mapcount < 2);
+				dirty, locked, exclusive);
 	}
 }