Commit c009da42, authored by Frank van der Linden, committed by Andrew Morton

mm, cma: support multiple contiguous ranges, if requested

Currently, CMA manages one range of physically contiguous memory.
Creation of larger CMA areas with hugetlb_cma may run into gaps in
physical memory, so that the requested contiguous physical range
cannot be allocated from memblock when creating the CMA area.

This can happen, for example, on an AMD system with more than 1TB of
memory, where there will be a gap just below the 1TB (40-bit DMA) line.
If you have set aside most of the memory for potential hugetlb CMA
allocation, cma_declare_contiguous_nid will fail.

hugetlb_cma doesn't need the entire area to be one physically contiguous
range.  It just cares about being able to get physically contiguous chunks
of a certain size (e.g.  1G), and it is fine to have the CMA area backed
by multiple physical ranges, as long as it gets 1G contiguous allocations.

Multi-range support is implemented by introducing an array of ranges,
instead of just one big one.  Each range has its own bitmap.  Effectively,
the allocate and release operations work as before, just per-range.  So,
instead of going through one large bitmap, they now go through a number of
smaller ones.

The maximum number of supported ranges is 8, as defined in CMA_MAX_RANGES.
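
The per-range bookkeeping described above can be modeled in user space. The following is a minimal sketch, not the kernel implementation: `model_range`, `model_area`, `area_alloc` and `area_release` are hypothetical names, a plain `bool` array stands in for the bitmap, and `order_per_bit` is assumed to be 0 so that one flag covers one page.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_MAX_RANGES 8   /* mirrors CMA_MAX_RANGES from the commit */
#define MODEL_MAX_BITS   64  /* hypothetical cap so the model needs no malloc */

/* Hypothetical stand-in for one physically contiguous range. */
struct model_range {
	unsigned long base_pfn;     /* first page frame of the range */
	unsigned long nbits;        /* allocation units in this range */
	bool used[MODEL_MAX_BITS];  /* one flag per unit, in place of a bitmap */
};

/* Hypothetical stand-in for a CMA area backed by several ranges. */
struct model_area {
	int nranges;
	struct model_range ranges[MODEL_MAX_RANGES];
};

/* Find `want` consecutive free units in one range; return index or -1. */
static long range_find_free(const struct model_range *r, unsigned long want)
{
	unsigned long run = 0;

	for (unsigned long i = 0; i < r->nbits; i++) {
		run = r->used[i] ? 0 : run + 1;
		if (run == want)
			return (long)(i + 1 - want);
	}
	return -1;
}

/*
 * Walk the ranges in order and allocate from the first one that has room.
 * An allocation never spans two ranges, so the result stays contiguous.
 */
static bool area_alloc(struct model_area *a, unsigned long want,
		       unsigned long *pfn_out)
{
	assert(want > 0 && a->nranges <= MODEL_MAX_RANGES);

	for (int n = 0; n < a->nranges; n++) {
		struct model_range *r = &a->ranges[n];
		long start = range_find_free(r, want);

		if (start < 0)
			continue;
		for (unsigned long i = 0; i < want; i++)
			r->used[(unsigned long)start + i] = true;
		*pfn_out = r->base_pfn + (unsigned long)start;
		return true;
	}
	return false;
}

/* Release by locating the single range that fully contains the span. */
static bool area_release(struct model_area *a, unsigned long pfn,
			 unsigned long count)
{
	for (int n = 0; n < a->nranges; n++) {
		struct model_range *r = &a->ranges[n];

		if (pfn < r->base_pfn || pfn + count > r->base_pfn + r->nbits)
			continue;
		for (unsigned long i = 0; i < count; i++)
			r->used[pfn - r->base_pfn + i] = false;
		return true;
	}
	return false;
}
```

In this model a request that does not fit in range 0 simply falls through to range 1, which is the "same operations, just per-range" behavior the message describes.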

Since some current users of CMA expect a CMA area to just use one
physically contiguous range, only allow for multiple ranges if a new
interface, cma_declare_contiguous_multi, is used.  The other
interfaces will work like before, creating only CMA areas with 1 range.

cma_declare_contiguous_multi works as follows, mimicking the
default "bottom-up, above 4G" reservation approach:

0) Try cma_declare_contiguous_nid, which will use only one
   region. If this succeeds, return. This makes sure that for
   all the cases that currently work, the behavior remains
   unchanged even if the caller switches from
   cma_declare_contiguous_nid to cma_declare_contiguous_multi.
1) Select the largest free memblock ranges above 4G, with
   a maximum number of CMA_MAX_RANGES.
2) If no set of at most CMA_MAX_RANGES ranges adds up to
   the total size requested, return -ENOMEM.
3) Sort the selected ranges by base address.
4) Reserve them bottom-up until we get what we wanted.
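
Steps 1 to 4 above can be sketched in user-space C. This is an illustration under stated assumptions, not the code in mm/cma.c: `free_range` and `select_ranges` are hypothetical names, the free list is a plain array instead of memblock, ranges that straddle 4G are assumed to be clipped to their part above 4G, alignment handling is omitted, step 0 (the single-range attempt) is not modeled, and -1 stands in for -ENOMEM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MODEL_MAX_RANGES 8           /* mirrors CMA_MAX_RANGES */
#define MODEL_4G (1ULL << 32)

/* Hypothetical description of one free physical range. */
struct free_range {
	uint64_t base;
	uint64_t size;
};

static int cmp_size_desc(const void *a, const void *b)
{
	const struct free_range *x = a, *y = b;

	return (x->size < y->size) - (x->size > y->size);
}

static int cmp_base_asc(const void *a, const void *b)
{
	const struct free_range *x = a, *y = b;

	return (x->base > y->base) - (x->base < y->base);
}

/*
 * Choose ranges covering total_size bytes. free_list is filtered and
 * reordered in place; out must hold MODEL_MAX_RANGES entries.
 * Returns the number of ranges written to out, or -1 on failure.
 */
static int select_ranges(struct free_range *free_list, size_t nfree,
			 uint64_t total_size, struct free_range *out)
{
	size_t kept = 0;
	uint64_t sum = 0, left = total_size;
	int nout = 0;

	assert(free_list != NULL && out != NULL);

	/* Step 1: keep the part of each range above 4G, largest first. */
	for (size_t i = 0; i < nfree; i++) {
		uint64_t base = free_list[i].base;
		uint64_t end = base + free_list[i].size;

		if (end <= MODEL_4G)
			continue;
		if (base < MODEL_4G)
			base = MODEL_4G;
		free_list[kept].base = base;
		free_list[kept].size = end - base;
		kept++;
	}
	qsort(free_list, kept, sizeof(*free_list), cmp_size_desc);
	if (kept > MODEL_MAX_RANGES)
		kept = MODEL_MAX_RANGES;

	/* Step 2: fail if the selected ranges cannot cover the request. */
	for (size_t i = 0; i < kept; i++)
		sum += free_list[i].size;
	if (sum < total_size)
		return -1;

	/* Step 3: sort the selected ranges by base address. */
	qsort(free_list, kept, sizeof(*free_list), cmp_base_asc);

	/* Step 4: reserve bottom-up until the request is satisfied. */
	for (size_t i = 0; i < kept && left > 0; i++) {
		uint64_t take = free_list[i].size < left ? free_list[i].size : left;

		out[nout].base = free_list[i].base;
		out[nout].size = take;
		nout++;
		left -= take;
	}
	return nout;
}
```

Sorting by size first and by base second means the largest candidates are chosen, yet the reservation itself still proceeds from the lowest address upward, matching the "bottom-up, above 4G" description.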

Link: https://lkml.kernel.org/r/20250228182928.2645936-3-fvdl@google.com


Signed-off-by: Frank van der Linden <fvdl@google.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dan Carpenter <dan.carpenter@linaro.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Joao Martins <joao.m.martins@oracle.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Roman Gushchin (Cruise) <roman.gushchin@linux.dev>
Cc: Usama Arif <usamaarif642@gmail.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Yu Zhao <yuzhao@google.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
parent 7365ff2c
+8 −2
@@ -12,10 +12,16 @@ its CMA name like below:

The structure of the files created under that directory is as follows:

- - [RO] base_pfn: The base PFN (Page Frame Number) of the zone.
+ - [RO] base_pfn: The base PFN (Page Frame Number) of the CMA area.
+        This is the same as ranges/0/base_pfn.
 - [RO] count: Amount of memory in the CMA area.
 - [RO] order_per_bit: Order of pages represented by one bit.
- - [RO] bitmap: The bitmap of page states in the zone.
+ - [RO] bitmap: The bitmap of allocated pages in the area.
+        This is the same as ranges/0/bitmap.
+ - [RO] ranges/N/base_pfn: The base PFN of contiguous range N
+        in the CMA area.
+ - [RO] ranges/N/bitmap: The bitmap of allocated pages in
+        range N in the CMA area.
 - [WO] alloc: Allocate N pages from that CMA area. For example::

	echo 5 > <debugfs>/cma/<cma_name>/alloc
+3 −0
@@ -40,6 +40,9 @@ static inline int __init cma_declare_contiguous(phys_addr_t base,
	return cma_declare_contiguous_nid(base, size, limit, alignment,
			order_per_bit, fixed, name, res_cma, NUMA_NO_NODE);
}
+extern int __init cma_declare_contiguous_multi(phys_addr_t size,
+			phys_addr_t align, unsigned int order_per_bit,
+			const char *name, struct cma **res_cma, int nid);
extern int cma_init_reserved_mem(phys_addr_t base, phys_addr_t size,
					unsigned int order_per_bit,
					const char *name,
+476 −118

File changed.

Preview size limit exceeded, changes collapsed.

+22 −5
@@ -10,19 +10,35 @@ struct cma_kobject {
	struct cma *cma;
};

-struct cma {
+/*
+ * Multi-range support. This can be useful if the size of the allocation
+ * is not expected to be larger than the alignment (like with hugetlb_cma),
+ * and the total amount of memory requested, while smaller than the total
+ * amount of memory available, is large enough that it doesn't fit in a
+ * single physical memory range because of memory holes.
+ */
+struct cma_memrange {
	unsigned long base_pfn;
	unsigned long count;
-	unsigned long	available_count;
	unsigned long *bitmap;
+#ifdef CONFIG_CMA_DEBUGFS
+	struct debugfs_u32_array dfs_bitmap;
+#endif
+};
+#define CMA_MAX_RANGES 8
+
+struct cma {
+	unsigned long   count;
+	unsigned long	available_count;
	unsigned int order_per_bit; /* Order of pages represented by one bit */
	spinlock_t	lock;
#ifdef CONFIG_CMA_DEBUGFS
	struct hlist_head mem_head;
	spinlock_t mem_head_lock;
-	struct debugfs_u32_array dfs_bitmap;
#endif
	char name[CMA_MAX_NAME];
+	int nranges;
+	struct cma_memrange ranges[CMA_MAX_RANGES];
#ifdef CONFIG_CMA_SYSFS
	/* the number of CMA page successful allocations */
	atomic64_t nr_pages_succeeded;
@@ -39,9 +55,10 @@ struct cma {
extern struct cma cma_areas[MAX_CMA_AREAS];
extern unsigned int cma_area_count;

-static inline unsigned long cma_bitmap_maxno(struct cma *cma)
+static inline unsigned long cma_bitmap_maxno(struct cma *cma,
+		struct cma_memrange *cmr)
{
-	return cma->count >> cma->order_per_bit;
+	return cmr->count >> cma->order_per_bit;
}

#ifdef CONFIG_CMA_SYSFS
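
With the page count stored per range, the bitmap length is also computed per range while `order_per_bit` stays a property of the whole area. The arithmetic can be checked with a small user-space model. The struct and function names below are hypothetical, and the values in the usage note (4 KiB pages, `order_per_bit` of 9) are purely illustrative and not taken from the commit.

```c
#include <assert.h>

#define MODEL_MAX_RANGES 8  /* mirrors CMA_MAX_RANGES */

/* Hypothetical, reduced versions of the two structures in the diff. */
struct model_memrange {
	unsigned long base_pfn;
	unsigned long count;        /* pages in this range */
};

struct model_cma {
	unsigned int order_per_bit; /* shared by all ranges of the area */
	int nranges;
	struct model_memrange ranges[MODEL_MAX_RANGES];
};

/* Same shift as the new cma_bitmap_maxno(): bits needed for one range. */
static unsigned long model_bitmap_maxno(const struct model_cma *cma,
					const struct model_memrange *cmr)
{
	return cmr->count >> cma->order_per_bit;
}

/* Sum of the per-range bitmap lengths across the whole area. */
static unsigned long model_total_bits(const struct model_cma *cma)
{
	unsigned long total = 0;

	assert(cma->nranges >= 0 && cma->nranges <= MODEL_MAX_RANGES);
	for (int i = 0; i < cma->nranges; i++)
		total += model_bitmap_maxno(cma, &cma->ranges[i]);
	return total;
}
```

For example, with an `order_per_bit` of 9, a range of 786432 pages (3 GiB at 4 KiB per page) needs 786432 >> 9 = 1536 bits, and a second range of 1310720 pages (5 GiB) needs 2560 bits, for 4096 bits across the two bitmaps.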
+41 −15

File changed.

Preview size limit exceeded, changes collapsed.