Commit ff661eee authored by Linus Torvalds
Pull cgroup updates from Tejun Heo:

 - cpuset changes:

    - Continue separating v1 and v2 implementations by moving more
      v1-specific logic into cpuset-v1.c

    - Improve partition handling. Sibling partitions are no longer
      invalidated on cpuset.cpus conflict, cpuset.cpus changes no longer
      fail in v2, and effective_xcpus computation is made consistent

    - Fix partition effective CPUs overlap that caused a warning on
      cpuset removal when sibling partitions shared CPUs

 - Increase the maximum cgroup subsystem count from 16 to 32 to
   accommodate future subsystem additions

 - Misc cleanups and selftest improvements including switching to
   css_is_online() helper, removing dead code and stale documentation
   references, using lockdep_assert_cpuset_lock_held() consistently, and
   adding polling helpers for asynchronously updated cgroup statistics

* tag 'cgroup-for-6.20' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (21 commits)
  cpuset: fix overlap of partition effective CPUs
  cgroup: increase maximum subsystem count from 16 to 32
  cgroup: Remove stale cpu.rt.max reference from documentation
  cpuset: replace direct lockdep_assert_held() with lockdep_assert_cpuset_lock_held()
  cgroup/cpuset: Move the v1 empty cpus/mems check to cpuset1_validate_change()
  cgroup/cpuset: Don't invalidate sibling partitions on cpuset.cpus conflict
  cgroup/cpuset: Don't fail cpuset.cpus change in v2
  cgroup/cpuset: Consistently compute effective_xcpus in update_cpumasks_hier()
  cgroup/cpuset: Streamline rm_siblings_excl_cpus()
  cpuset: remove dead code in cpuset-v1.c
  cpuset: remove v1-specific code from generate_sched_domains
  cpuset: separate generate_sched_domains for v1 and v2
  cpuset: move update_domain_attr_tree to cpuset_v1.c
  cpuset: add cpuset1_init helper for v1 initialization
  cpuset: add cpuset1_online_css helper for v1-specific operations
  cpuset: add lockdep_assert_cpuset_lock_held helper
  cpuset: Remove unnecessary checks in rebuild_sched_domains_locked
  cgroup: switch to css_is_online() helper
  selftests: cgroup: Replace sleep with cg_read_key_long_poll() for waiting on nr_dying_descendants
  selftests: cgroup: make test_memcg_sock robust against delayed sock stats
  ...
parents 9bdc6489 8b1f3c54
+27 −17
@@ -737,9 +737,6 @@ combinations are invalid and should be rejected. Also, if the
resource is mandatory for execution of processes, process migrations
may be rejected.

-"cpu.rt.max" hard-allocates realtime slices and is an example of this
-type.
-

Interface Files
===============
@@ -2561,10 +2558,10 @@ Cpuset Interface Files
	Users can manually set it to a value that is different from
	"cpuset.cpus".	One constraint in setting it is that the list of
	CPUs must be exclusive with respect to "cpuset.cpus.exclusive"
-	of its sibling.  If "cpuset.cpus.exclusive" of a sibling cgroup
-	isn't set, its "cpuset.cpus" value, if set, cannot be a subset
-	of it to leave at least one CPU available when the exclusive
-	CPUs are taken away.
+	and "cpuset.cpus.exclusive.effective" of its siblings.	Another
+	constraint is that it cannot be a superset of "cpuset.cpus"
+	of its sibling in order to leave at least one CPU available to
+	that sibling when the exclusive CPUs are taken away.

	For a parent cgroup, any one of its exclusive CPUs can only
	be distributed to at most one of its child cgroups.  Having an
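
The two constraints on writing "cpuset.cpus.exclusive" described in the updated text of this hunk can be modelled with plain bitmasks. The following is a minimal user-space sketch, not kernel code: `struct model_sibling`, `model_exclusive_write_ok()` and the one-bit-per-CPU `uint64_t` encoding are all hypothetical, and treating an unset sibling "cpuset.cpus" (mask 0) as imposing no superset constraint is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: one bit per CPU, at most 64 CPUs. */
struct model_sibling {
	uint64_t cpus;                /* "cpuset.cpus" (0 = not set) */
	uint64_t exclusive;           /* "cpuset.cpus.exclusive" */
	uint64_t exclusive_effective; /* "cpuset.cpus.exclusive.effective" */
};

/* Returns true if new_excl would satisfy both documented constraints. */
static bool model_exclusive_write_ok(uint64_t new_excl,
				     const struct model_sibling *sibs,
				     size_t n)
{
	assert(sibs != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		/* Constraint 1: no overlap with a sibling's exclusive CPUs. */
		if (new_excl & (sibs[i].exclusive | sibs[i].exclusive_effective))
			return false;

		/*
		 * Constraint 2: must not be a superset of a sibling's
		 * "cpuset.cpus", so that sibling keeps at least one CPU.
		 * Assumption: an unset (empty) mask imposes no constraint.
		 */
		if (sibs[i].cpus && (sibs[i].cpus & ~new_excl) == 0)
			return false;
	}
	return true;
}
```

With a single sibling whose "cpuset.cpus" is CPUs 2-3 (mask 0xC), this model accepts 0x3 and 0x4 but rejects 0xC, because taking both CPUs would leave that sibling with none.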
@@ -2584,9 +2581,9 @@ Cpuset Interface Files
	of this file will always be a subset of its parent's
	"cpuset.cpus.exclusive.effective" if its parent is not the root
	cgroup.  It will also be a subset of "cpuset.cpus.exclusive"
-	if it is set.  If "cpuset.cpus.exclusive" is not set, it is
-	treated to have an implicit value of "cpuset.cpus" in the
-	formation of local partition.
+	if it is set.  This file should only be non-empty if either
+	"cpuset.cpus.exclusive" is set or when the current cpuset is
+	a valid partition root.

  cpuset.cpus.isolated
	A read-only and root cgroup only multiple values file.
@@ -2618,13 +2615,22 @@ Cpuset Interface Files
	There are two types of partitions - local and remote.  A local
	partition is one whose parent cgroup is also a valid partition
	root.  A remote partition is one whose parent cgroup is not a
-	valid partition root itself.  Writing to "cpuset.cpus.exclusive"
-	is optional for the creation of a local partition as its
-	"cpuset.cpus.exclusive" file will assume an implicit value that
-	is the same as "cpuset.cpus" if it is not set.	Writing the
-	proper "cpuset.cpus.exclusive" values down the cgroup hierarchy
-	before the target partition root is mandatory for the creation
-	of a remote partition.
+	valid partition root itself.
+
+	Writing to "cpuset.cpus.exclusive" is optional for the creation
+	of a local partition as its "cpuset.cpus.exclusive" file will
+	assume an implicit value that is the same as "cpuset.cpus" if it
+	is not set.  Writing the proper "cpuset.cpus.exclusive" values
+	down the cgroup hierarchy before the target partition root is
+	mandatory for the creation of a remote partition.
+
+	Not all the CPUs requested in "cpuset.cpus.exclusive" can be
+	used to form a new partition.  Only those that were present
+	in its parent's "cpuset.cpus.exclusive.effective" control
+	file can be used.  For partitions created without setting
+	"cpuset.cpus.exclusive", exclusive CPUs specified in sibling's
+	"cpuset.cpus.exclusive" or "cpuset.cpus.exclusive.effective"
+	also cannot be used.

	Currently, a remote partition cannot be created under a local
	partition.  All the ancestors of a remote partition root except
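
The paragraph added in this hunk about which requested CPUs can actually form a partition amounts to a few mask intersections. The sketch below is a user-space illustration under stated assumptions: the function names and the `uint64_t` one-bit-per-CPU encoding are hypothetical, and the kernel's real computation handles more cases than these two lines.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Explicit case: "cpuset.cpus.exclusive" was written. Only CPUs that are
 * also in the parent's "cpuset.cpus.exclusive.effective" are usable.
 */
static uint64_t model_partition_cpus_explicit(uint64_t requested_excl,
					      uint64_t parent_excl_eff)
{
	uint64_t usable = requested_excl & parent_excl_eff;

	/* The result can never contain a CPU the parent does not offer. */
	assert((usable & ~parent_excl_eff) == 0);
	return usable;
}

/*
 * Implicit case: "cpuset.cpus.exclusive" was not written, so "cpuset.cpus"
 * stands in for the request. CPUs listed in any sibling's exclusive masks
 * (sibling_excl is their union) are removed as well.
 */
static uint64_t model_partition_cpus_implicit(uint64_t cpus,
					      uint64_t parent_excl_eff,
					      uint64_t sibling_excl)
{
	return cpus & parent_excl_eff & ~sibling_excl;
}
```

For example, requesting CPUs 4-7 (0xF0) under a parent offering CPUs 2-5 (0x3C) leaves only CPUs 4-5 (0x30) usable in this model.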
@@ -2632,6 +2638,10 @@ Cpuset Interface Files

	The root cgroup is always a partition root and its state cannot
	be changed.  All other non-root cgroups start out as "member".
+	Even though the "cpuset.cpus.exclusive*" and "cpuset.cpus"
+	control files are not present in the root cgroup, they are
+	implicitly the same as the "/sys/devices/system/cpu/possible"
+	sysfs file.

	When set to "root", the current cgroup is the root of a new
	partition or scheduling domain.  The set of exclusive CPUs is
+1 −1
@@ -981,7 +981,7 @@ void wbc_account_cgroup_owner(struct writeback_control *wbc, struct folio *folio

	css = mem_cgroup_css_from_folio(folio);
	/* dead cgroups shouldn't contribute to inode ownership arbitration */
-	if (!(css->flags & CSS_ONLINE))
+	if (!css_is_online(css))
		return;

	id = css->id;
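
The change in this hunk swaps an open-coded flag test for a named predicate. A helper of this kind is typically a one-line inline wrapper; the sketch below models that idea in user space. Everything prefixed `model_` or `MODEL_` is hypothetical, including the bit position chosen for the online flag, and the kernel's actual definition of `css_is_online()` is not shown in this diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_CSS_ONLINE (1u << 1)	/* illustrative bit position */

/* Simplified stand-in for struct cgroup_subsys_state. */
struct model_css {
	unsigned int flags;
	int id;
};

/* Named predicate replacing the open-coded "flags & ONLINE" test. */
static inline bool model_css_is_online(const struct model_css *css)
{
	return css->flags & MODEL_CSS_ONLINE;
}

/*
 * Mirrors the shape of the caller in the hunk: offline objects are
 * skipped, online ones contribute their id. Returns -1 when skipped.
 */
static int model_account_owner(const struct model_css *css)
{
	assert(css != NULL);

	if (!model_css_is_online(css))
		return -1;
	return css->id;
}
```

Routing every caller through one predicate keeps the flag encoding in a single place, which is the usual motivation for a cleanup like this.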
+4 −4
@@ -535,10 +535,10 @@ struct cgroup {
	 * one which may have more subsystems enabled.  Controller knobs
	 * are made available iff it's enabled in ->subtree_control.
	 */
-	u16 subtree_control;
-	u16 subtree_ss_mask;
-	u16 old_subtree_control;
-	u16 old_subtree_ss_mask;
+	u32 subtree_control;
+	u32 subtree_ss_mask;
+	u32 old_subtree_control;
+	u32 old_subtree_ss_mask;

	/* Private pointers for each registered subsystem */
	struct cgroup_subsys_state __rcu *subsys[CGROUP_SUBSYS_COUNT];
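
These fields are bitmasks with one bit per subsystem, which is why their width caps the subsystem count. The user-space sketch below shows what goes wrong when a subsystem id of 16 or above is stored in a 16-bit mask, and that a 32-bit mask holds ids up to 31. The helper names are hypothetical and `uint16_t`/`uint32_t` stand in for the kernel's `u16`/`u32`.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Enable subsystem `ssid` in a 16-bit mask. For ssid >= 16 the bit falls
 * outside the type and is silently dropped by the narrowing conversion.
 */
static uint16_t model_mask16_enable(uint16_t mask, unsigned int ssid)
{
	assert(ssid < 32);	/* keep the 32-bit shift well defined */
	return (uint16_t)(mask | (UINT32_C(1) << ssid));
}

/* Enable subsystem `ssid` in a 32-bit mask; ids 0..31 are representable. */
static uint32_t model_mask32_enable(uint32_t mask, unsigned int ssid)
{
	assert(ssid < 32);
	return mask | (UINT32_C(1) << ssid);
}
```

In this model, enabling id 17 leaves the 16-bit mask at 0 while the 32-bit mask becomes 0x20000, which illustrates why the mask fields are widened together with the limit.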
+2 −0
@@ -76,6 +76,7 @@ extern void inc_dl_tasks_cs(struct task_struct *task);
extern void dec_dl_tasks_cs(struct task_struct *task);
extern void cpuset_lock(void);
extern void cpuset_unlock(void);
+extern void lockdep_assert_cpuset_lock_held(void);
extern void cpuset_cpus_allowed_locked(struct task_struct *p, struct cpumask *mask);
extern void cpuset_cpus_allowed(struct task_struct *p, struct cpumask *mask);
extern bool cpuset_cpus_allowed_fallback(struct task_struct *p);
@@ -196,6 +197,7 @@ static inline void inc_dl_tasks_cs(struct task_struct *task) { }
static inline void dec_dl_tasks_cs(struct task_struct *task) { }
static inline void cpuset_lock(void) { }
static inline void cpuset_unlock(void) { }
+static inline void lockdep_assert_cpuset_lock_held(void) { }

static inline void cpuset_cpus_allowed_locked(struct task_struct *p,
					struct cpumask *mask)
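
The two hunks above follow a common header pattern: a real declaration when the feature is built in, and an empty inline stub otherwise, so callers need no conditional compilation. The sketch below imitates that pattern in user space. `MODEL_CONFIG_CPUSETS`, the boolean "lock" and all `model_` functions are hypothetical; the kernel version relies on lockdep rather than a flag.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_CONFIG_CPUSETS 1	/* set to 0 to select the empty stubs */

#if MODEL_CONFIG_CPUSETS
static bool model_cpuset_locked;	/* stands in for the private lock */

static void model_cpuset_lock(void)   { model_cpuset_locked = true; }
static void model_cpuset_unlock(void) { model_cpuset_locked = false; }

/* Lets code that cannot see the lock itself assert that it is held. */
static void model_assert_cpuset_lock_held(void)
{
	assert(model_cpuset_locked);
}
#else
static inline void model_cpuset_lock(void) { }
static inline void model_cpuset_unlock(void) { }
static inline void model_assert_cpuset_lock_held(void) { }
#endif

/* A function documented as "caller must hold the cpuset lock". */
static int model_update_state(int *state, int value)
{
	model_assert_cpuset_lock_held();
	*state = value;
	return 0;
}
```

A caller would bracket `model_update_state()` with `model_cpuset_lock()` and `model_cpuset_unlock()`; calling it without the lock trips the assertion in this model.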
+1 −1
@@ -893,7 +893,7 @@ static inline bool mem_cgroup_online(struct mem_cgroup *memcg)
{
	if (mem_cgroup_disabled())
		return true;
-	return !!(memcg->css.flags & CSS_ONLINE);
+	return css_is_online(&memcg->css);
}

void mem_cgroup_update_lru_size(struct lruvec *lruvec, enum lru_list lru,