Commit 895b9b12 authored by Linus Torvalds
Pull cgroup updates from Tejun Heo:

 - Added Michal Koutný as a maintainer

 - Counters in pids.events were behaving inconsistently. pids.events
   made properly hierarchical and pids.events.local added

 - misc.peak and misc.events.local added

 - cpuset remote partition creation and cpuset.cpus.exclusive handling
   improved

 - Code cleanups, non-critical fixes, doc updates

 - for-6.10-fixes is merged in to receive two non-critical fixes that
   didn't trigger pull

* tag 'cgroup-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (23 commits)
  cgroup: Add Michal Koutný as a maintainer
  cgroup/misc: Introduce misc.events.local
  cgroup/rstat: add force idle show helper
  cgroup: Protect css->cgroup write under css_set_lock
  cgroup/misc: Introduce misc.peak
  cgroup_misc: add kernel-doc comments for enum misc_res_type
  cgroup/cpuset: Prevent UAF in proc_cpuset_show()
  selftest/cgroup: Update test_cpuset_prs.sh to match changes
  cgroup/cpuset: Make cpuset.cpus.exclusive independent of cpuset.cpus
  cgroup/cpuset: Delay setting of CS_CPU_EXCLUSIVE until valid partition
  selftest/cgroup: Fix test_cpuset_prs.sh problems reported by test robot
  cgroup/cpuset: Fix remote root partition creation problem
  cgroup: avoid the unnecessary list_add(dying_tasks) in cgroup_exit()
  cgroup/cpuset: Optimize isolated partition only generate_sched_domains() calls
  cgroup/cpuset: Reduce the lock protecting CS_SCHED_LOAD_BALANCE
  kernel/cgroup: cleanup cgroup_base_files when fail to add cgroup_psi_files
  selftests: cgroup: Add basic tests for pids controller
  selftests: cgroup: Lexicographic order in Makefile
  cgroup/pids: Add pids.events.local
  cgroup/pids: Make event counters hierarchical
  ...
parents f97b956b 9283ff5b
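
The difference between the hierarchical pids.events counters and the new pids.events.local counters can be illustrated with a small userspace model. This is only a sketch of the semantics described in the documentation changes below, not the kernel implementation; `struct cg_model` and `cg_model_limit_hit()` are hypothetical names introduced for illustration. For simplicity the model also counts in the root node, although the real files exist only on non-root cgroups.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model of one cgroup; not a kernel structure. */
struct cg_model {
	struct cg_model *parent;   /* NULL for the root */
	uint64_t events_max;       /* hierarchical: self plus descendants */
	uint64_t events_local_max; /* local: this cgroup's own limit only */
};

/*
 * Record that a fork failed because the limit of @over_limit was hit.
 * The local counter changes only in @over_limit; the hierarchical
 * counter changes in @over_limit and in every ancestor.
 */
static void cg_model_limit_hit(struct cg_model *over_limit)
{
	assert(over_limit != NULL);

	over_limit->events_local_max++;
	for (struct cg_model *p = over_limit; p != NULL; p = p->parent)
		p->events_max++;
}
```

With a root -> parent -> child chain, one hit on the child's limit followed by one hit on the parent's limit leaves the child with max 1 and local max 1, and the parent with max 2 and local max 1.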
+2 −1
@@ -36,7 +36,8 @@ superset of parent/child/pids.current.

The pids.events file contains event counters:

  - max: Number of times fork failed because limit was hit.
  - max: Number of times fork failed in the cgroup because limit was hit in
    self or ancestors.

Example
-------
+39 −8
@@ -239,6 +239,13 @@ cgroup v2 currently supports the following mount options.
          will not be tracked by the memory controller (even if cgroup
          v2 is remounted later on).

  pids_localevents
        The option restores v1-like behavior of pids.events:max, that is, only
        local (inside cgroup proper) fork failures are counted. Without this
        option, pids.events:max represents any pids.max enforcement across the
        cgroup's subtree.



Organizing Processes and Threads
--------------------------------
@@ -2205,12 +2212,18 @@ PID Interface Files
	descendants has ever reached.

  pids.events
	A read-only flat-keyed file which exists on non-root cgroups. The
	following entries are defined. Unless specified otherwise, a value
	change in this file generates a file modified event.
	A read-only flat-keyed file which exists on non-root cgroups. Unless
	specified otherwise, a value change in this file generates a file
	modified event. The following entries are defined.

	  max
		Number of times fork failed because limit was hit.
		The number of times the cgroup's total number of processes hit the pids.max
		limit (see also pids_localevents).

  pids.events.local
	Similar to pids.events but the fields in the file are local
	to the cgroup i.e. not hierarchical. The file modified event
	generated on this file reflects only the local events.

Organisational operations are not blocked by cgroup policies, so it is
possible to have pids.current > pids.max.  This can be done by either
@@ -2346,8 +2359,12 @@ Cpuset Interface Files
	is always a subset of it.

	Users can manually set it to a value that is different from
	"cpuset.cpus".	The only constraint in setting it is that the
	list of CPUs must be exclusive with respect to its sibling.
	"cpuset.cpus".	One constraint in setting it is that the list of
	CPUs must be exclusive with respect to "cpuset.cpus.exclusive"
	of its sibling.  If "cpuset.cpus.exclusive" of a sibling cgroup
	isn't set, its "cpuset.cpus" value, if set, cannot be a subset
	of it to leave at least one CPU available when the exclusive
	CPUs are taken away.

	For a parent cgroup, any one of its exclusive CPUs can only
	be distributed to at most one of its child cgroups.  Having an
@@ -2363,8 +2380,8 @@ Cpuset Interface Files
	cpuset-enabled cgroups.

	This file shows the effective set of exclusive CPUs that
	can be used to create a partition root.  The content of this
	file will always be a subset of "cpuset.cpus" and its parent's
	can be used to create a partition root.  The content
	of this file will always be a subset of its parent's
	"cpuset.cpus.exclusive.effective" if its parent is not the root
	cgroup.  It will also be a subset of "cpuset.cpus.exclusive"
	if it is set.  If "cpuset.cpus.exclusive" is not set, it is
@@ -2625,6 +2642,15 @@ Miscellaneous controller provides 3 interface files. If two misc resources (res_
	  res_a 3
	  res_b 0

  misc.peak
        A read-only flat-keyed file shown in all cgroups.  It shows the
        historical maximum usage of the resources in the cgroup and its
        children.::

	  $ cat misc.peak
	  res_a 10
	  res_b 8

  misc.max
        A read-write flat-keyed file shown in the non root cgroups. Allowed
        maximum usage of the resources in the cgroup and its children.::
@@ -2654,6 +2680,11 @@ Miscellaneous controller provides 3 interface files. If two misc resources (res_
		The number of times the cgroup's resource usage was
		about to go over the max boundary.

  misc.events.local
        Similar to misc.events but the fields in the file are local to the
        cgroup i.e. not hierarchical. The file modified event generated on
        this file reflects only the local events.

Migration and Ownership
~~~~~~~~~~~~~~~~~~~~~~~

+1 −0
@@ -5528,6 +5528,7 @@ CONTROL GROUP (CGROUP)
M:	Tejun Heo <tj@kernel.org>
M:	Zefan Li <lizefan.x@bytedance.com>
M:	Johannes Weiner <hannes@cmpxchg.org>
M:	Michal Koutný <mkoutny@suse.com>
L:	cgroups@vger.kernel.org
S:	Maintained
T:	git git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git
+6 −1
@@ -120,6 +120,11 @@ enum {
	 * Enable hugetlb accounting for the memory controller.
	 */
	CGRP_ROOT_MEMORY_HUGETLB_ACCOUNTING = (1 << 19),

	/*
	 * Enable legacy local pids.events.
	 */
	CGRP_ROOT_PIDS_LOCAL_EVENTS = (1 << 20),
};

/* cftype->flags */
+9 −3
@@ -9,15 +9,16 @@
#define _MISC_CGROUP_H_

/**
 * Types of misc cgroup entries supported by the host.
 * enum misc_res_type - Types of misc cgroup entries supported by the host.
 */
enum misc_res_type {
#ifdef CONFIG_KVM_AMD_SEV
	/* AMD SEV ASIDs resource */
	/** @MISC_CG_RES_SEV: AMD SEV ASIDs resource */
	MISC_CG_RES_SEV,
	/* AMD SEV-ES ASIDs resource */
	/** @MISC_CG_RES_SEV_ES: AMD SEV-ES ASIDs resource */
	MISC_CG_RES_SEV_ES,
#endif
	/** @MISC_CG_RES_TYPES: count of enum misc_res_type constants */
	MISC_CG_RES_TYPES
};

@@ -30,13 +31,16 @@ struct misc_cg;
/**
 * struct misc_res: Per cgroup per misc type resource
 * @max: Maximum limit on the resource.
 * @watermark: Historical maximum usage of the resource.
 * @usage: Current usage of the resource.
 * @events: Number of times, the resource limit exceeded.
 */
struct misc_res {
	u64 max;
	atomic64_t watermark;
	atomic64_t usage;
	atomic64_t events;
	atomic64_t events_local;
};

/**
@@ -50,6 +54,8 @@ struct misc_cg {

	/* misc.events */
	struct cgroup_file events_file;
	/* misc.events.local */
	struct cgroup_file events_local_file;

	struct misc_res res[MISC_CG_RES_TYPES];
};
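
The `watermark` field added to struct misc_res backs misc.peak, which reports the historical maximum usage. One common way to maintain such a peak without a lock is a compare-and-swap loop that only ever raises the stored value. The sketch below shows that technique in userspace C11; it is an assumption about how such a counter could be kept, not the code from this merge, and `struct res_model` and the `res_model_*()` helpers are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a per-resource record; not the kernel struct. */
struct res_model {
	_Atomic uint64_t usage;     /* current usage */
	_Atomic uint64_t watermark; /* highest usage observed so far */
};

/* Raise the watermark to new_usage unless it is already at least that high. */
static void res_model_update_watermark(struct res_model *res, uint64_t new_usage)
{
	assert(res != NULL);

	uint64_t old = atomic_load(&res->watermark);

	/* A failed compare-exchange stores the current value back into 'old'. */
	while (new_usage > old &&
	       !atomic_compare_exchange_weak(&res->watermark, &old, new_usage))
		;
}

/* Charge 'amount' units, record the new peak, and return the new usage. */
static uint64_t res_model_charge(struct res_model *res, uint64_t amount)
{
	uint64_t new_usage = atomic_fetch_add(&res->usage, amount) + amount;

	res_model_update_watermark(res, new_usage);
	return new_usage;
}

/* Release 'amount' units; the watermark is deliberately left untouched. */
static void res_model_uncharge(struct res_model *res, uint64_t amount)
{
	atomic_fetch_sub(&res->usage, amount);
}
```

Because uncharging never lowers the watermark, charging 10 units, releasing 7, and charging 2 more leaves the usage at 5 while the watermark stays at 10.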