Commit e89d5bf3 authored by Baolin Wang's avatar Baolin Wang Committed by Andrew Morton
Browse files

mm: fault in complete folios instead of individual pages for tmpfs

After commit acd7ccb2 ("mm: shmem: add large folio support for
tmpfs"), tmpfs can also support large folio allocation (not just PMD-sized
large folios).

However, when accessing tmpfs via mmap(), although tmpfs supports large
folios, we still establish mappings at the base page granularity, which is
unreasonable.

We can map multiple consecutive pages of tmpfs folios at once according to
the size of the large folio.  On one hand, this can reduce the overhead of
page faults; on the other hand, it can leverage hardware architecture
optimizations to reduce TLB misses, such as contiguous PTEs on the ARM
architecture.

Moreover, tmpfs mount will use the 'huge=' option to control large folio
allocation explicitly.  So it can be understood that the process's RSS
statistics might increase, and I think this will not cause any obvious
effects for users.

Performance test:
I created a 1G tmpfs file, populated with 64K large folios, and write-accessed it
sequentially via mmap(). I observed a significant performance improvement:

Before the patch:
real	0m0.158s
user	0m0.008s
sys	0m0.150s

After the patch:
real	0m0.021s
user	0m0.004s
sys	0m0.017s

Link: https://lkml.kernel.org/r/440940e78aeb7430c5cc8b6d2088ae98265b9809.1751599072.git.baolin.wang@linux.alibaba.com


Fixes: acd7ccb2 ("mm: shmem: add large folio support for tmpfs")
Signed-off-by: default avatarBaolin Wang <baolin.wang@linux.alibaba.com>
Acked-by: default avatarDavid Hildenbrand <david@redhat.com>
Acked-by: default avatarZi Yan <ziy@nvidia.com>
Reviewed-by: default avatarVishal Moola (Oracle) <vishal.moola@gmail.com>
Reviewed-by: default avatarBarry Song <baohua@kernel.org>
Reviewed-by: default avatarLorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Mariano Pache <npache@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
parent fed48693
Loading
Loading
Loading
Loading
+2 −2
Original line number Diff line number Diff line
@@ -5383,10 +5383,10 @@ vm_fault_t finish_fault(struct vm_fault *vmf)

	/*
	 * Using per-page fault to maintain the uffd semantics, and same
	 * approach also applies to non-anonymous-shmem faults to avoid
	 * approach also applies to non shmem/tmpfs faults to avoid
	 * inflating the RSS of the process.
	 */
	if (!vma_is_anon_shmem(vma) || unlikely(userfaultfd_armed(vma)) ||
	if (!vma_is_shmem(vma) || unlikely(userfaultfd_armed(vma)) ||
	    unlikely(needs_fallback)) {
		nr_pages = 1;
	} else if (nr_pages > 1) {