Commit e991acf1 authored by Linus Torvalds

Merge tag 'mm-nonmm-stable-2025-08-03-12-47' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull non-MM updates from Andrew Morton:
 "Significant patch series in this pull request:

   - "squashfs: Remove page->mapping references" (Matthew Wilcox) gets
     us closer to being able to remove page->mapping

   - "relayfs: misc changes" (Jason Xing) does some maintenance and
     minor feature addition work in relayfs

   - "kdump: crashkernel reservation from CMA" (Jiri Bohac) switches
     us from static preallocation of the kdump crashkernel's working
     memory over to dynamic allocation. So the difficulty of a-priori
     estimation of the second kernel's needs is removed and the first
     kernel obtains extra memory

   - "generalize panic_print's dump function to be used by other
     kernel parts" (Feng Tang) implements some consolidation and
     rationalization of the various ways in which a failing kernel
     splats information at the operator

* tag 'mm-nonmm-stable-2025-08-03-12-47' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (80 commits)
  tools/getdelays: add backward compatibility for taskstats version
  kho: add test for kexec handover
  delaytop: enhance error logging and add PSI feature description
  samples: Kconfig: fix spelling mistake "instancess" -> "instances"
  fat: fix too many log in fat_chain_add()
  scripts/spelling.txt: add notifer||notifier to spelling.txt
  xen/xenbus: fix typo "notifer"
  net: mvneta: fix typo "notifer"
  drm/xe: fix typo "notifer"
  cxl: mce: fix typo "notifer"
  KVM: x86: fix typo "notifer"
  MAINTAINERS: add maintainers for delaytop
  ucount: use atomic_long_try_cmpxchg() in atomic_long_inc_below()
  ucount: fix atomic_long_inc_below() argument type
  kexec: enable CMA based contiguous allocation
  stackdepot: make max number of pools boot-time configurable
  lib/xxhash: remove unused functions
  init/Kconfig: restore CONFIG_BROKEN help text
  lib/raid6: update recov_rvv.c zero page usage
  docs: update docs after introducing delaytop
  ...
parents 3c4a063b 085dece6
+1 −0
@@ -673,6 +673,7 @@ Muchun Song <muchun.song@linux.dev> <smuchun@gmail.com>
Ross Zwisler <zwisler@kernel.org> <ross.zwisler@linux.intel.com>
Rudolf Marek <R.Marek@sh.cvut.cz>
Rui Saraiva <rmps@joel.ist.utl.pt>
Sachin Mokashi <sachin.mokashi@intel.com> <sachinx.mokashi@intel.com>
Sachin P Sant <ssant@in.ibm.com>
Sai Prakash Ranjan <quic_saipraka@quicinc.com> <saiprakash.ranjan@codeaurora.org>
Sakari Ailus <sakari.ailus@linux.intel.com> <sakari.ailus@iki.fi>
+56 −0
@@ -131,3 +131,59 @@ Get IO accounting for pid 1, it works only with -p::
	linuxrc: read=65536, write=0, cancelled_write=0

The above command can be used with -v to get more debug information.

After the system starts, use `delaytop` to get the system-wide delay information,
which includes system-wide PSI information and Top-N high-latency tasks.

`delaytop` supports sorting by CPU latency in descending order by default,
displays the top 20 high-latency tasks by default, and refreshes the latency
data every 2 seconds by default.

Get PSI information and Top-N tasks delay, since system boot::

	bash# ./delaytop
	System Pressure Information: (avg10/avg60/avg300/total)
	CPU some:       0.0%/   0.0%/   0.0%/     345(ms)
	CPU full:       0.0%/   0.0%/   0.0%/       0(ms)
	Memory full:    0.0%/   0.0%/   0.0%/       0(ms)
	Memory some:    0.0%/   0.0%/   0.0%/       0(ms)
	IO full:        0.0%/   0.0%/   0.0%/      65(ms)
	IO some:        0.0%/   0.0%/   0.0%/      79(ms)
	IRQ full:       0.0%/   0.0%/   0.0%/       0(ms)
	Top 20 processes (sorted by CPU delay):
	  PID   TGID  COMMAND          CPU(ms)  IO(ms) SWAP(ms) RCL(ms) THR(ms) CMP(ms)  WP(ms) IRQ(ms)
	----------------------------------------------------------------------------------------------
	  161    161  zombie_memcg_re   1.40    0.00    0.00    0.00    0.00    0.00    0.00    0.00
	  130    130  blkcg_punt_bio    1.37    0.00    0.00    0.00    0.00    0.00    0.00    0.00
	  444    444  scsi_tmf_0        0.73    0.00    0.00    0.00    0.00    0.00    0.00    0.00
	 1280   1280  rsyslogd          0.53    0.04    0.00    0.00    0.00    0.00    0.00    0.00
	   12     12  ksoftirqd/0       0.47    0.00    0.00    0.00    0.00    0.00    0.00    0.00
	 1277   1277  nbd-server        0.44    0.00    0.00    0.00    0.00    0.00    0.00    0.00
	  308    308  kworker/2:2-sys   0.41    0.00    0.00    0.00    0.00    0.00    0.00    0.00
	   55     55  netns             0.36    0.00    0.00    0.00    0.00    0.00    0.00    0.00
	 1187   1187  acpid             0.31    0.03    0.00    0.00    0.00    0.00    0.00    0.00
	 6184   6184  kworker/1:2-sys   0.24    0.00    0.00    0.00    0.00    0.00    0.00    0.00
	  186    186  kaluad            0.24    0.00    0.00    0.00    0.00    0.00    0.00    0.00
	   18     18  ksoftirqd/1       0.24    0.00    0.00    0.00    0.00    0.00    0.00    0.00
	  185    185  kmpath_rdacd      0.23    0.00    0.00    0.00    0.00    0.00    0.00    0.00
	  190    190  kstrp             0.23    0.00    0.00    0.00    0.00    0.00    0.00    0.00
	 2759   2759  agetty            0.20    0.03    0.00    0.00    0.00    0.00    0.00    0.00
	 1190   1190  kworker/0:3-sys   0.19    0.00    0.00    0.00    0.00    0.00    0.00    0.00
	 1272   1272  sshd              0.15    0.04    0.00    0.00    0.00    0.00    0.00    0.00
	 1156   1156  license           0.15    0.11    0.00    0.00    0.00    0.00    0.00    0.00
	  134    134  md                0.13    0.00    0.00    0.00    0.00    0.00    0.00    0.00
	 6142   6142  kworker/3:2-xfs   0.13    0.00    0.00    0.00    0.00    0.00    0.00    0.00

Dynamic interactive interface of delaytop::

	# ./delaytop -p pid
	Print delayacct stats

	# ./delaytop -P num
	Display the top N tasks

	# ./delaytop -n num
	Set delaytop refresh frequency (num times)

	# ./delaytop -d secs
	Specify refresh interval as secs
+21 −0
@@ -311,6 +311,27 @@ crashkernel syntax

            crashkernel=0,low

4) crashkernel=size,cma

	Reserve additional crash kernel memory from CMA. This reservation is
	usable by the first system's userspace memory and kernel movable
	allocations (memory balloon, zswap). Pages allocated from this memory
	range will not be included in the vmcore so this should not be used if
	dumping of userspace memory is intended and it has to be expected that
	some movable kernel pages may be missing from the dump.

	A standard crashkernel reservation, as described above, is still needed
	to hold the crash kernel and initrd.

	This option increases the risk of a kdump failure: DMA transfers
	configured by the first kernel may end up corrupting the second
	kernel's memory.

	This reservation method is intended for systems that can't afford to
	sacrifice enough memory for standard crashkernel reservation and where
	less reliable and possibly incomplete kdump is preferable to no kdump at
	all.

Boot into System Kernel
-----------------------
1) Update the boot loader (such as grub, yaboot, or lilo) configuration
+47 −1
@@ -994,6 +994,28 @@
			0: to disable low allocation.
			It will be ignored when crashkernel=X,high is not used
			or memory reserved is below 4G.
	crashkernel=size[KMG],cma
			[KNL, X86] Reserve additional crash kernel memory from
			CMA. This reservation is usable by the first system's
			userspace memory and kernel movable allocations (memory
			balloon, zswap). Pages allocated from this memory range
			will not be included in the vmcore so this should not
			be used if dumping of userspace memory is intended and
			it has to be expected that some movable kernel pages
			may be missing from the dump.

			A standard crashkernel reservation, as described above,
			is still needed to hold the crash kernel and initrd.

			This option increases the risk of a kdump failure: DMA
			transfers configured by the first kernel may end up
			corrupting the second kernel's memory.

			This reservation method is intended for systems that
			can't afford to sacrifice enough memory for standard
			crashkernel reservation and where less reliable and
			possibly incomplete kdump is preferable to no kdump at
			all.

	cryptomgr.notests
			[KNL] Disable crypto self-tests
@@ -4557,7 +4579,7 @@
			bit 2: print timer info
			bit 3: print locks info if CONFIG_LOCKDEP is on
			bit 4: print ftrace buffer
			bit 5: print all printk messages in buffer
			bit 5: replay all messages on consoles at the end of panic
			bit 6: print all CPUs backtrace (if available in the arch)
			bit 7: print only tasks in uninterruptible (blocked) state
			*Be aware* that this option may print a _lot_ of lines,
@@ -4565,6 +4587,25 @@
			Use this option carefully, maybe worth to setup a
			bigger log buffer with "log_buf_len" along with this.

	panic_sys_info= A comma separated list of extra information to be dumped
                        on panic.
                        Format: val[,val...]
                        Where @val can be any of the following:

                        tasks:          print all tasks info
                        mem:            print system memory info
			timers:         print timers info
                        locks:          print locks info if CONFIG_LOCKDEP is on
                        ftrace:         print ftrace buffer
                        all_bt:         print all CPUs backtrace (if available in the arch)
                        blocked_tasks:  print only tasks in uninterruptible (blocked) state

                        This is a human readable alternative to the 'panic_print' option.

	panic_console_replay
			When panic happens, replay all kernel messages on
			consoles at the end of panic.

	parkbd.port=	[HW] Parallel port number the keyboard adapter is
			connected to, default is 0.
			Format: <parport#>
@@ -7032,6 +7073,11 @@
			consumed by the stack hash table. By default this is set
			to false.

	stack_depot_max_pools= [KNL,EARLY]
			Specify the maximum number of pools to use for storing
			stack traces. Pools are allocated on-demand up to this
			limit. Default value is 8191 pools.

	stacktrace	[FTRACE]
			Enabled the stack tracer on boot up.

+19 −1
@@ -890,7 +890,7 @@ bit 1 print system memory info
bit 2  print timer info
bit 3  print locks info if ``CONFIG_LOCKDEP`` is on
bit 4  print ftrace buffer
bit 5  print all printk messages in buffer
bit 5  replay all messages on consoles at the end of panic
bit 6  print all CPUs backtrace (if available in the arch)
bit 7  print only tasks in uninterruptible (blocked) state
=====  ============================================
@@ -900,6 +900,24 @@ So for example to print tasks and memory info on panic, user can::
  echo 3 > /proc/sys/kernel/panic_print


panic_sys_info
==============

A comma separated list of extra information to be dumped on panic,
for example, "tasks,mem,timers,...".  It is a human readable alternative
to 'panic_print'. Possible values are:

=============   ===================================================
tasks           print all tasks info
mem             print system memory info
timers          print timers info
locks           print locks info if CONFIG_LOCKDEP is on
ftrace          print ftrace buffer
all_bt          print all CPUs backtrace (if available in the arch)
blocked_tasks   print only tasks in uninterruptible (blocked) state
=============   ===================================================
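
The relationship between a ``panic_print`` bitmask and the equivalent
``panic_sys_info`` list can be sketched as follows. This is an illustrative
helper, not kernel code: the function name ``panic_print_to_sys_info`` is
hypothetical, the value names follow the ``panic_sys_info=`` boot parameter
list (``tasks``, ``mem``, ``timers``, ``locks``, ...), and treating bit 5 as
the counterpart of ``panic_console_replay`` is an assumption based on the
matching descriptions.

```python
# Hypothetical helper: translate a panic_print bitmask into the
# comma separated form accepted by panic_sys_info.
# Bit 0 (tasks) and bit 1 (mem) follow the "echo 3" example for panic_print.
PANIC_PRINT_BITS = {
    0: "tasks",
    1: "mem",
    2: "timers",
    3: "locks",
    4: "ftrace",
    6: "all_bt",
    7: "blocked_tasks",
}

# Assumption: bit 5 (console replay) has no panic_sys_info name and is
# covered by the separate panic_console_replay option instead.
CONSOLE_REPLAY_BIT = 5


def panic_print_to_sys_info(mask):
    """Return (sys_info_string, wants_console_replay) for a bitmask."""
    names = [
        name
        for bit, name in sorted(PANIC_PRINT_BITS.items())
        if mask & (1 << bit)
    ]
    replay = bool(mask & (1 << CONSOLE_REPLAY_BIT))
    return ",".join(names), replay


if __name__ == "__main__":
    # panic_print=3 selects tasks and memory info.
    print(panic_print_to_sys_info(3))
```

Under these assumptions, a ``panic_print`` value of 3 corresponds to the list
``tasks,mem``, which is the same selection as the ``echo 3`` example given for
``panic_print``.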


panic_on_rcu_stall
==================
