Commit a8848c4b authored by Tony Luck, committed by Borislav Petkov (AMD)

x86,fs/resctrl: Update documentation for telemetry events



Update resctrl filesystem documentation with the details about the resctrl
files that support telemetry events.

  [ bp: Drop the debugfs hunk of the documentation until a better debugging
    solution is found. ]

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/20251217172121.12030-1-tony.luck@intel.com
parent 4bbfc901
+54 −12
@@ -252,13 +252,12 @@ with respect to allocation:
			bandwidth percentages are directly applied to
			the threads running on the core

If RDT monitoring is available there will be an "L3_MON" directory
If L3 monitoring is available there will be an "L3_MON" directory
with the following files:

"num_rmids":
		The number of RMIDs available. This is the
		upper bound for how many "CTRL_MON" + "MON"
		groups can be created.
		The number of RMIDs supported by hardware for
		L3 monitoring events.

"mon_features":
		Lists the monitoring events if
@@ -484,6 +483,24 @@ with the following files:
		bytes) at which a previously used LLC_occupancy
		counter can be considered for re-use.

If telemetry monitoring is available there will be a "PERF_PKG_MON" directory
with the following files:

"num_rmids":
		The number of RMIDs for telemetry monitoring events.

		On Intel, resctrl will not enable telemetry events if the number of
		RMIDs that can be tracked concurrently is lower than the total number
		of RMIDs supported. Telemetry events can be force-enabled with the
		"rdt=" kernel parameter, but this may reduce the number of
		monitoring groups that can be created.

"mon_features":
		Lists the telemetry monitoring events that are enabled on this system.

The upper bound for how many "CTRL_MON" + "MON" groups can be created
is the smaller of the L3_MON and PERF_PKG_MON "num_rmids" values.
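
As a rough illustration of the rule above, the following Python sketch reads
both "num_rmids" files and takes the smaller value. It assumes resctrl is
mounted at /sys/fs/resctrl; the helper names read_num_rmids() and
max_monitor_groups() are hypothetical and not part of any kernel interface.
If only one of the two directories exists, the sketch simply uses that value.

```python
from pathlib import Path


def read_num_rmids(info_dir, resource):
    """Return num_rmids for one monitoring resource, or None if it is absent."""
    path = Path(info_dir) / resource / "num_rmids"
    try:
        return int(path.read_text().strip())
    except FileNotFoundError:
        return None


def max_monitor_groups(info_dir="/sys/fs/resctrl/info"):
    """Smaller of the L3_MON and PERF_PKG_MON num_rmids values.

    The default path is an assumption (the conventional mount point).
    Returns None when neither monitoring directory is present.
    """
    values = [read_num_rmids(info_dir, r) for r in ("L3_MON", "PERF_PKG_MON")]
    present = [v for v in values if v is not None]
    return min(present) if present else None
```

Calling max_monitor_groups() with no arguments would inspect the live
filesystem; passing a different directory makes the helper easy to exercise
against a copy of the "info" tree.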

Finally, in the top level of the "info" directory there is a file
named "last_cmd_status". This is reset with every "command" issued
via the file system (making new directories or writing to any of the
@@ -589,15 +606,40 @@ When control is enabled all CTRL_MON groups will also contain:
When monitoring is enabled all MON groups will also contain:

"mon_data":
	This contains a set of files organized by L3 domain and by
	RDT event. E.g. on a system with two L3 domains there will
	be subdirectories "mon_L3_00" and "mon_L3_01".	Each of these
	directories have one file per event (e.g. "llc_occupancy",
	"mbm_total_bytes", and "mbm_local_bytes"). In a MON group these
	files provide a read out of the current value of the event for
	all tasks in the group. In CTRL_MON groups these files provide
	the sum for all tasks in the CTRL_MON group and all tasks in
	This contains directories for each monitor domain.

	If L3 monitoring is enabled, there will be a "mon_L3_XX" directory for
	each instance of an L3 cache. Each directory contains files for the enabled
	L3 events (e.g. "llc_occupancy", "mbm_total_bytes", and "mbm_local_bytes").

	If telemetry monitoring is enabled, there will be a "mon_PERF_PKG_YY"
	directory for each physical processor package. Each directory contains
	files for the enabled telemetry events (e.g. "core_energy", "activity",
	"uops_retired", etc.).

	The info/`*`/mon_features files provide the full list of enabled
	event/file names.

	"core_energy" reports a floating point number for the energy (in Joules)
	consumed by cores (registers, arithmetic units, TLB and L1/L2 caches)
	during execution of instructions summed across all logical CPUs on a
	package for the current monitoring group.

	"activity" also reports a floating point value (in Farads).  This provides
	an estimate of work done independent of the frequency that the CPUs used
	for execution.

	Note that "core_energy" and "activity" only measure energy/activity in the
	"core" of the CPU (arithmetic units, TLB, L1 and L2 caches, etc.). They
	do not include L3 cache, memory, I/O devices etc.

	All other events report decimal integer values.

	In a MON group these files provide a read out of the current value of
	the event for all tasks in the group. In CTRL_MON groups these files
	provide the sum for all tasks in the CTRL_MON group and all tasks in
	MON groups. Please see example section for more details on usage.
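
	A minimal Python sketch of reading these files, under stated assumptions:
	the group directory is passed in by the caller, "core_energy" and
	"activity" are parsed as floating point, and every other event is parsed
	as a decimal integer. The function name read_mon_data() is hypothetical.
	Text that does not parse as a number is kept as a string rather than
	guessed at, and nested directories inside a domain are skipped.

```python
from pathlib import Path

# Events documented above as reporting floating point values.
FLOAT_EVENTS = {"core_energy", "activity"}


def read_mon_data(group_dir):
    """Return {domain_name: {event_name: value}} for one resctrl group.

    Only regular files directly inside each domain directory are read.
    """
    results = {}
    mon_data = Path(group_dir) / "mon_data"
    for domain in sorted(mon_data.iterdir()):
        if not domain.is_dir():
            continue
        events = {}
        for event in sorted(domain.iterdir()):
            if not event.is_file():
                continue  # skip subdirectories nested in the domain
            text = event.read_text().strip()
            try:
                if event.name in FLOAT_EVENTS:
                    events[event.name] = float(text)
                else:
                    events[event.name] = int(text)
            except ValueError:
                events[event.name] = text  # keep non-numeric text as-is
        results[domain.name] = events
    return results
```

	For example, read_mon_data("/sys/fs/resctrl") would read the default
	group, assuming the conventional mount point.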

	On systems with Sub-NUMA Cluster (SNC) enabled there are extra
	directories for each node (located within the "mon_L3_XX" directory
	for the L3 cache they occupy). These are named "mon_sub_L3_YY"