Commit 148f95f7 authored by Linus Torvalds
Pull slab updates from Vlastimil Babka:

 - The percpu sheaves caching layer was introduced as opt-in in 6.18 and
   now we enable it for all caches and remove the previous cpu (partial)
   slab caching mechanism.

   Besides the lower locking overhead and much more likely fastpath when
   freeing, this removes the rather complicated code related to the cpu
   slab lockless fastpaths (using this_cpu_try_cmpxchg128/64) and all
   its complications for PREEMPT_RT or kmalloc_nolock().

   The lockless slab freelist+counters update operation using
   try_cmpxchg128/64 remains and is crucial for freeing remote NUMA
   objects, and to allow flushing objects from sheaves to slabs mostly
   without the node list_lock (Vlastimil Babka)

 - Eliminate slabobj_ext metadata overhead when possible. Instead of
   using kmalloc() to allocate the array for memcg and/or allocation
   profiling tag pointers, use leftover space in a slab or per-object
   padding due to alignment (Harry Yoo)

 - Various followup improvements to the above (Hao Li)

* tag 'slab-for-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab: (39 commits)
  slub: let need_slab_obj_exts() return false if SLAB_NO_OBJ_EXT is set
  mm/slab: only allow SLAB_OBJ_EXT_IN_OBJ for unmergeable caches
  mm/slab: place slabobj_ext metadata in unused space within s->size
  mm/slab: move [__]ksize and slab_ksize() to mm/slub.c
  mm/slab: save memory by allocating slabobj_ext array from leftover
  mm/memcontrol,alloc_tag: handle slabobj_ext access under KASAN poison
  mm/slab: use stride to access slabobj_ext
  mm/slab: abstract slabobj_ext access via new slab_obj_ext() helper
  ext4: specify the free pointer offset for ext4_inode_cache
  mm/slab: allow specifying free pointer offset when using constructor
  mm/slab: use unsigned long for orig_size to ensure proper metadata align
  slub: clarify object field layout comments
  mm/slab: avoid allocating slabobj_ext array from its own slab
  slub: avoid list_lock contention from __refill_objects_any()
  mm/slub: cleanup and repurpose some stat items
  mm/slub: remove DEACTIVATE_TO_* stat items
  slab: remove frozen slab checks from __slab_free()
  slab: update overview comments
  slab: refill sheaves from all nodes
  slab: remove unused PREEMPT_RT specific macros
  ...
parents 41f1a086 815c8e35
+13 −6
@@ -1496,12 +1496,19 @@ static void init_once(void *foo)

 static int __init init_inodecache(void)
 {
-	ext4_inode_cachep = kmem_cache_create_usercopy("ext4_inode_cache",
-				sizeof(struct ext4_inode_info), 0,
-				SLAB_RECLAIM_ACCOUNT | SLAB_ACCOUNT,
-				offsetof(struct ext4_inode_info, i_data),
-				sizeof_field(struct ext4_inode_info, i_data),
-				init_once);
+	struct kmem_cache_args args = {
+		.useroffset = offsetof(struct ext4_inode_info, i_data),
+		.usersize = sizeof_field(struct ext4_inode_info, i_data),
+		.use_freeptr_offset = true,
+		.freeptr_offset = offsetof(struct ext4_inode_info, i_flags),
+		.ctor = init_once,
+	};
+
+	ext4_inode_cachep = kmem_cache_create("ext4_inode_cache",
+				sizeof(struct ext4_inode_info),
+				&args,
+				SLAB_RECLAIM_ACCOUNT | SLAB_ACCOUNT);
+
 	if (ext4_inode_cachep == NULL)
 		return -ENOMEM;
 	return 0;
+22 −18
@@ -58,8 +58,9 @@ enum _slab_flag_bits {
 #endif
 	_SLAB_OBJECT_POISON,
 	_SLAB_CMPXCHG_DOUBLE,
-#ifdef CONFIG_SLAB_OBJ_EXT
 	_SLAB_NO_OBJ_EXT,
+#if defined(CONFIG_SLAB_OBJ_EXT) && defined(CONFIG_64BIT)
+	_SLAB_OBJ_EXT_IN_OBJ,
 #endif
 	_SLAB_FLAGS_LAST_BIT
 };
@@ -239,10 +240,12 @@ enum _slab_flag_bits {
 #define SLAB_TEMPORARY		SLAB_RECLAIM_ACCOUNT	/* Objects are short-lived */

 /* Slab created using create_boot_cache */
-#ifdef CONFIG_SLAB_OBJ_EXT
 #define SLAB_NO_OBJ_EXT		__SLAB_FLAG_BIT(_SLAB_NO_OBJ_EXT)
+
+#if defined(CONFIG_SLAB_OBJ_EXT) && defined(CONFIG_64BIT)
+#define SLAB_OBJ_EXT_IN_OBJ	__SLAB_FLAG_BIT(_SLAB_OBJ_EXT_IN_OBJ)
 #else
-#define SLAB_NO_OBJ_EXT		__SLAB_FLAG_UNUSED
+#define SLAB_OBJ_EXT_IN_OBJ	__SLAB_FLAG_UNUSED
 #endif

 /*
@@ -300,24 +303,26 @@ struct kmem_cache_args {
 	unsigned int usersize;
 	/**
 	 * @freeptr_offset: Custom offset for the free pointer
-	 * in &SLAB_TYPESAFE_BY_RCU caches
+	 * in caches with &SLAB_TYPESAFE_BY_RCU or @ctor
 	 *
-	 * By default &SLAB_TYPESAFE_BY_RCU caches place the free pointer
-	 * outside of the object. This might cause the object to grow in size.
-	 * Cache creators that have a reason to avoid this can specify a custom
-	 * free pointer offset in their struct where the free pointer will be
-	 * placed.
+	 * By default, &SLAB_TYPESAFE_BY_RCU and @ctor caches place the free
+	 * pointer outside of the object. This might cause the object to grow
+	 * in size. Cache creators that have a reason to avoid this can specify
+	 * a custom free pointer offset in their data structure where the free
+	 * pointer will be placed.
 	 *
-	 * Note that placing the free pointer inside the object requires the
-	 * caller to ensure that no fields are invalidated that are required to
-	 * guard against object recycling (See &SLAB_TYPESAFE_BY_RCU for
-	 * details).
+	 * For caches with &SLAB_TYPESAFE_BY_RCU, the caller must ensure that
+	 * the free pointer does not overlay fields required to guard against
+	 * object recycling (See &SLAB_TYPESAFE_BY_RCU for details).
 	 *
-	 * Using %0 as a value for @freeptr_offset is valid. If @freeptr_offset
-	 * is specified, %use_freeptr_offset must be set %true.
+	 * For caches with @ctor, the caller must ensure that the free pointer
+	 * does not overlay fields initialized by the constructor.
 	 *
-	 * Note that @ctor currently isn't supported with custom free pointers
-	 * as a @ctor requires an external free pointer.
+	 * Currently, only caches with &SLAB_TYPESAFE_BY_RCU or @ctor
+	 * may specify @freeptr_offset.
+	 *
+	 * Using %0 as a value for @freeptr_offset is valid. If @freeptr_offset
+	 * is specified, @use_freeptr_offset must be set %true.
 	 */
 	unsigned int freeptr_offset;
 	/**
@@ -508,7 +513,6 @@ void * __must_check krealloc_node_align_noprof(const void *objp, size_t new_size
 void kfree(const void *objp);
 void kfree_nolock(const void *objp);
 void kfree_sensitive(const void *objp);
-size_t __ksize(const void *objp);

 DEFINE_FREE(kfree, void *, if (!IS_ERR_OR_NULL(_T)) kfree(_T))
 DEFINE_FREE(kfree_sensitive, void *, if (_T) kfree_sensitive(_T))
+0 −11
@@ -247,17 +247,6 @@ config SLUB_STATS
 	  out which slabs are relevant to a particular load.
 	  Try running: slabinfo -DA

-config SLUB_CPU_PARTIAL
-	default y
-	depends on SMP && !SLUB_TINY
-	bool "Enable per cpu partial caches"
-	help
-	  Per cpu partial caches accelerate objects allocation and freeing
-	  that is local to a processor at the price of more indeterminism
-	  in the latency of the free. On overflow these caches will be cleared
-	  which requires the taking of locks that may cause latency spikes.
-	  Typically one would choose no for a realtime system.
-
 config RANDOM_KMALLOC_CACHES
 	default n
 	depends on !SLUB_TINY
+1 −0
@@ -838,6 +838,7 @@ static inline struct page *alloc_frozen_pages_noprof(gfp_t gfp, unsigned int ord
 struct page *alloc_frozen_pages_nolock_noprof(gfp_t gfp_flags, int nid, unsigned int order);
 #define alloc_frozen_pages_nolock(...) \
 	alloc_hooks(alloc_frozen_pages_nolock_noprof(__VA_ARGS__))
+void free_frozen_pages_nolock(struct page *page, unsigned int order);

 extern void zone_pcp_reset(struct zone *zone);
 extern void zone_pcp_disable(struct zone *zone);
+24 −7
@@ -2627,16 +2627,24 @@ struct mem_cgroup *mem_cgroup_from_obj_slab(struct slab *slab, void *p)
 	 * Memcg membership data for each individual object is saved in
 	 * slab->obj_exts.
 	 */
-	struct slabobj_ext *obj_exts;
+	unsigned long obj_exts;
+	struct slabobj_ext *obj_ext;
 	unsigned int off;

 	obj_exts = slab_obj_exts(slab);
 	if (!obj_exts)
 		return NULL;

+	get_slab_obj_exts(obj_exts);
 	off = obj_to_index(slab->slab_cache, slab, p);
-	if (obj_exts[off].objcg)
-		return obj_cgroup_memcg(obj_exts[off].objcg);
+	obj_ext = slab_obj_ext(slab, obj_exts, off);
+	if (obj_ext->objcg) {
+		struct obj_cgroup *objcg = obj_ext->objcg;
+
+		put_slab_obj_exts(obj_exts);
+		return obj_cgroup_memcg(objcg);
+	}
+	put_slab_obj_exts(obj_exts);

 	return NULL;
 }
@@ -3222,6 +3230,9 @@ bool __memcg_slab_post_alloc_hook(struct kmem_cache *s, struct list_lru *lru,
 	}

 	for (i = 0; i < size; i++) {
+		unsigned long obj_exts;
+		struct slabobj_ext *obj_ext;
+
 		slab = virt_to_slab(p[i]);

 		if (!slab_obj_exts(slab) &&
@@ -3244,29 +3255,35 @@ bool __memcg_slab_post_alloc_hook(struct kmem_cache *s, struct list_lru *lru,
 					slab_pgdat(slab), cache_vmstat_idx(s)))
 			return false;

+		obj_exts = slab_obj_exts(slab);
+		get_slab_obj_exts(obj_exts);
 		off = obj_to_index(s, slab, p[i]);
+		obj_ext = slab_obj_ext(slab, obj_exts, off);
 		obj_cgroup_get(objcg);
-		slab_obj_exts(slab)[off].objcg = objcg;
+		obj_ext->objcg = objcg;
+		put_slab_obj_exts(obj_exts);
 	}

 	return true;
 }

 void __memcg_slab_free_hook(struct kmem_cache *s, struct slab *slab,
-			    void **p, int objects, struct slabobj_ext *obj_exts)
+			    void **p, int objects, unsigned long obj_exts)
 {
 	size_t obj_size = obj_full_size(s);

 	for (int i = 0; i < objects; i++) {
 		struct obj_cgroup *objcg;
+		struct slabobj_ext *obj_ext;
 		unsigned int off;

 		off = obj_to_index(s, slab, p[i]);
-		objcg = obj_exts[off].objcg;
+		obj_ext = slab_obj_ext(slab, obj_exts, off);
+		objcg = obj_ext->objcg;
 		if (!objcg)
 			continue;

-		obj_exts[off].objcg = NULL;
+		obj_ext->objcg = NULL;
 		refill_obj_stock(objcg, obj_size, true, -obj_size,
 				 slab_pgdat(slab), cache_vmstat_idx(s));
 		obj_cgroup_put(objcg);