Merge tag 'mm-stable-2024-11-18-19-27' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm (5c00ff74)

Documentation/admin-guide/blockdev/zram.rst

+2 −0

Original line number	Diff line number	Diff line
		@@ -47,6 +47,8 @@ The list of possible return codes:
		-ENOMEM zram was not able to allocate enough memory to fulfil your
		needs.
		-EINVAL invalid input has been provided.
		-EAGAIN re-try operation later (e.g. when attempting to run recompress
		and writeback simultaneously).
		======== =============================================================
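The ``-EAGAIN`` entry implies that a caller should back off and try again rather than treat the failure as fatal. A minimal sketch of that pattern follows, assuming the write is done directly by the program so the error code is visible to it. ``write_with_retry`` and its ``write_fn`` argument are hypothetical names introduced for illustration, not part of zram.

```python
import errno
import time


def write_with_retry(write_fn, attempts=5, delay=0.1):
    """Call write_fn(), retrying only when it fails with EAGAIN.

    write_fn is a hypothetical callable that performs the sysfs write and
    raises OSError on failure. Returns the number of calls that were made.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            write_fn()
            return attempt
        except OSError as exc:
            # Only EAGAIN means "re-try later". Any other error is re-raised,
            # and so is EAGAIN once the attempt budget is spent.
            if exc.errno != errno.EAGAIN or attempt == attempts:
                raise
            time.sleep(delay)
```

In practice ``write_fn`` would open the relevant zram attribute and write to it; other codes such as ``-EINVAL`` or ``-ENOMEM`` propagate to the caller unchanged.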

		If you use 'echo', the returned value is set by the 'echo' utility,

Documentation/admin-guide/cgroup-v1/memory.rst

+3 −79

		@@ -90,9 +90,7 @@ Brief summary of control files.
		used.
		memory.swappiness set/show swappiness parameter of vmscan
		(See sysctl's vm.swappiness)
		memory.move_charge_at_immigrate set/show controls of moving charges
		This knob is deprecated and shouldn't be
		used.
		memory.move_charge_at_immigrate This knob is deprecated.
		memory.oom_control set/show oom controls.
		This knob is deprecated and shouldn't be
		used.
		@@ -243,10 +241,6 @@ behind this approach is that a cgroup that aggressively uses a shared
		page will eventually get charged for it (once it is uncharged from
		the cgroup that brought it in -- this will happen on memory pressure).

		But see :ref:`section 8.2 <cgroup-v1-memory-movable-charges>` when moving a
		task to another cgroup, its pages may be recharged to the new cgroup, if
		move_charge_at_immigrate has been chosen.

		2.4 Swap Extension
		--------------------------------------

		@@ -756,78 +750,8 @@ If we want to change this to 1G, we can at any time use::

		THIS IS DEPRECATED!

		It's expensive and unreliable! It's better practice to launch workload
		tasks directly from inside their target cgroup. Use dedicated workload
		cgroups to allow fine-grained policy adjustments without having to
		move physical pages between control domains.

		Users can move charges associated with a task along with task migration, that
		is, uncharge task's pages from the old cgroup and charge them to the new cgroup.
		This feature is not supported in !CONFIG_MMU environments because of lack of
		page tables.

		8.1 Interface
		-------------

		This feature is disabled by default. It can be enabled (and disabled again) by
		writing to memory.move_charge_at_immigrate of the destination cgroup.

		If you want to enable it::

		# echo (some positive value) > memory.move_charge_at_immigrate

		.. note::
		Each bit of move_charge_at_immigrate has its own meaning about what type
		of charges should be moved. See :ref:`section 8.2
		<cgroup-v1-memory-movable-charges>` for details.

		.. note::
		Charges are moved only when you move mm->owner, in other words,
		a leader of a thread group.

		.. note::
		If we cannot find enough space for the task in the destination cgroup, we
		try to make space by reclaiming memory. Task migration may fail if we
		cannot make enough space.

		.. note::
		Moving a large number of charges can take several seconds.

		And if you want to disable it again::

		# echo 0 > memory.move_charge_at_immigrate

		.. _cgroup-v1-memory-movable-charges:

		8.2 Type of charges which can be moved
		--------------------------------------

		Each bit in move_charge_at_immigrate has its own meaning about what type of
		charges should be moved. But in any case, it must be noted that an account of
		a page or a swap can be moved only when it is charged to the task's current
		(old) memory cgroup.

		+---+--------------------------------------------------------------------------+
		|bit| what type of charges would be moved ?                                    |
		+===+==========================================================================+
		| 0 | A charge of an anonymous page (or swap of it) used by the target task.   |
		|   | You must enable Swap Extension (see 2.4) to enable move of swap charges. |
		+---+--------------------------------------------------------------------------+
		| 1 | A charge of file pages (normal file, tmpfs file (e.g. ipc shared memory) |
		|   | and swaps of tmpfs file) mmapped by the target task. Unlike the case of  |
		|   | anonymous pages, file pages (and swaps) in the range mmapped by the task |
		|   | will be moved even if the task hasn't done page fault, i.e. they might   |
		|   | not be the task's "RSS", but other task's "RSS" that maps the same file. |
		|   | The mapcount of the page is ignored (the page can be moved independent   |
		|   | of the mapcount). You must enable Swap Extension (see 2.4) to            |
		|   | enable move of swap charges.                                             |
		+---+--------------------------------------------------------------------------+

		8.3 TODO
		--------

		- All of moving charge operations are done under cgroup_mutex. It's not good
		behavior to hold the mutex too long, so we may need some trick.
		Reading memory.move_charge_at_immigrate will always return 0 and writing
		to it will always return -EINVAL.

		9. Memory thresholds
		====================

Documentation/admin-guide/cgroup-v2.rst

+5 −0

		@@ -1655,6 +1655,11 @@ The following nested keys are defined.
		pgdemote_khugepaged
		Number of pages demoted by khugepaged.

		hugetlb
		Amount of memory used by hugetlb pages. This metric only shows
		up if hugetlb usage is accounted for in memory.current (i.e.
		cgroup is mounted with the memory_hugetlb_accounting option).
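Because the ``hugetlb`` entry only appears under certain mount options, a reader of the stat file should treat the key as optional. The sketch below assumes the entry is reported in the cgroup's ``memory.stat`` file as a ``key value`` line; the function names and the sample text used with them are illustrative only.

```python
def parse_keyed_stats(text):
    """Parse 'key value' lines into a dict of ints, skipping other lines."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def hugetlb_usage(stat_text):
    """Return the hugetlb value, or None when the key is not reported."""
    return parse_keyed_stats(stat_text).get("hugetlb")
```

Returning ``None`` rather than ``0`` keeps "not accounted" distinguishable from "accounted, but currently zero".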

		memory.numa_stat
		A read-only nested-keyed file which exists on non-root cgroups.

Documentation/admin-guide/kernel-parameters.txt

+17 −0

		@@ -6711,6 +6711,16 @@
		Force threading of all interrupt handlers except those
		marked explicitly IRQF_NO_THREAD.

		thp_shmem= [KNL]
		Format: <size>[KMG],<size>[KMG]:<policy>;<size>[KMG]-<size>[KMG]:<policy>
		Control the default policy of each hugepage size for the
		internal shmem mount. <policy> is one of policies available
		for the shmem mount ("always", "inherit", "never", "within_size",
		and "advise").
		It can be used multiple times for multiple shmem THP sizes.
		See Documentation/admin-guide/mm/transhuge.rst for more
		details.
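The ``thp_shmem=`` format string packs several size lists and ranges into one value. As a rough illustration of how such a value breaks down, the sketch below splits it into ``(start, end, policy)`` entries. It is not the kernel's parser: ``parse_thp_shmem`` and ``parse_size`` are hypothetical helpers, only the uppercase ``K``, ``M`` and ``G`` suffixes from the format line are accepted, and ranges are kept as pairs instead of being expanded into individual sizes.

```python
import re

SUFFIX = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}
POLICIES = {"always", "inherit", "never", "within_size", "advise"}


def parse_size(text):
    """Convert a string such as '64K' or '2M' to a byte count."""
    match = re.fullmatch(r"(\d+)([KMG]?)", text.strip())
    if not match:
        raise ValueError(f"bad size: {text!r}")
    return int(match.group(1)) * SUFFIX[match.group(2)]


def parse_thp_shmem(value):
    """Split a thp_shmem= value into (start, end, policy) tuples.

    A single size yields start == end; '<a>-<b>' yields the range bounds.
    """
    entries = []
    for group in value.split(";"):
        sizes, sep, policy = group.rpartition(":")
        if not sep or policy not in POLICIES:
            raise ValueError(f"bad group: {group!r}")
        for item in sizes.split(","):
            low, dash, high = item.partition("-")
            start = parse_size(low)
            end = parse_size(high) if dash else start
            if end < start:
                raise ValueError(f"bad range: {item!r}")
            entries.append((start, end, policy))
    return entries
```

For instance, ``parse_thp_shmem("16K,32K:always;64K-2M:within_size")`` produces two single-size entries with the policy ``always`` and one range entry with the policy ``within_size``.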

		topology= [S390,EARLY]
		Format: {off | on}
		Specify if the kernel should make use of the cpu
		@@ -6952,6 +6962,13 @@
		See Documentation/admin-guide/mm/transhuge.rst
		for more details.

		transparent_hugepage_shmem= [KNL]
		Format: [always|within_size|advise|never|deny|force]
		Can be used to control the hugepage allocation policy for
		the internal shmem mount.
		See Documentation/admin-guide/mm/transhuge.rst
		for more details.

		trusted.source= [KEYS]
		Format: <string>
		This parameter identifies the trust source as a backend

Documentation/admin-guide/mm/transhuge.rst

+33 −2

		@@ -326,6 +326,29 @@ PMD_ORDER THP policy will be overridden. If the policy for PMD_ORDER
		is not defined within a valid ``thp_anon``, its policy will default to
		``never``.

		Similarly to ``transparent_hugepage``, you can control the hugepage
		allocation policy for the internal shmem mount by using the kernel parameter
		``transparent_hugepage_shmem=<policy>``, where ``<policy>`` is one of the
		six valid policies for shmem (``always``, ``within_size``, ``advise``,
		``never``, ``deny``, and ``force``).

		In the same manner as ``thp_anon`` controls each supported anonymous THP
		size, ``thp_shmem`` controls each supported shmem THP size. ``thp_shmem``
		has the same format as ``thp_anon``, but also supports the policy
		``within_size``.

		``thp_shmem=`` may be specified multiple times to configure all THP sizes
		as required. If ``thp_shmem=`` is specified at least once, any shmem THP
		sizes not explicitly configured on the command line are implicitly set to
		``never``.

		The ``transparent_hugepage_shmem`` setting only affects the global toggle. If
		``thp_shmem`` is not specified, PMD_ORDER hugepage will default to
		``inherit``. However, if a valid ``thp_shmem`` setting is provided by the
		user, the PMD_ORDER hugepage policy will be overridden. If the policy for
		PMD_ORDER is not defined within a valid ``thp_shmem``, its policy will
		default to ``never``.
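The defaulting rules in the paragraphs above can be summarised in a small function. This is only a sketch of the stated rules, not kernel code: ``resolve_shmem_policies`` and its arguments are hypothetical, sizes are plain byte counts, and ``configured`` stands for whatever valid ``thp_shmem=`` settings were collected from the command line. The text does not state a default for non-PMD sizes when ``thp_shmem=`` is absent, so the sketch leaves those as ``None``.

```python
def resolve_shmem_policies(supported_sizes, pmd_size, configured):
    """Return a {size: policy} mapping following the rules described above.

    supported_sizes: iterable of shmem THP sizes in bytes (assumed input)
    pmd_size:        the PMD_ORDER hugepage size in bytes (assumed input)
    configured:      {size: policy} taken from valid thp_shmem= settings
    """
    if configured:
        # thp_shmem= was given at least once: every size that was not
        # explicitly configured, PMD_ORDER included, becomes "never".
        return {size: configured.get(size, "never") for size in supported_sizes}
    # thp_shmem= was not given: PMD_ORDER defaults to "inherit". Other
    # sizes are left as None because the text gives no default for them.
    return {
        size: ("inherit" if size == pmd_size else None)
        for size in supported_sizes
    }
```

With sizes of 64K and 2M, where 2M is the PMD size, configuring only 64K as ``always`` leaves 2M at ``never``, while configuring nothing leaves 2M at ``inherit``.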

		Hugepages in tmpfs/shmem
		========================

		@@ -530,10 +553,18 @@ anon_fault_fallback_charge
		instead falls back to using huge pages with lower orders or
		small pages even though the allocation was successful.

		swpout
		is incremented every time a huge page is swapped out in one
		zswpout
		is incremented every time a huge page is swapped out to zswap in one
		piece without splitting.

		swpin
		is incremented every time a huge page is swapped in from a non-zswap
		swap device in one piece.

		swpout
		is incremented every time a huge page is swapped out to a non-zswap
		swap device in one piece without splitting.

		swpout_fallback
		is incremented if a huge page has to be split before swapout.
		Usually because failed to allocate some continuous swap space