Commit 87916225, authored by Ian Rogers, committed by Namhyung Kim

perf vendor events: Update jaketown metrics add event counter information

Add counter information necessary for optimizing event grouping in the
perf tool.

The most recent RFC patch set using this information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

The information was added in:
https://github.com/intel/perfmon/commit/475892a9690cb048949e593fe39cee65cd4765e1
and later patches.

The TMA 4.8 information was updated in:
https://github.com/intel/perfmon/commit/59194d4d90ca50a3fcb2de0d82b9f6fc0c9a5736



Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-21-irogers@google.com
parent 3235704c
+123 −0
File changed. Diff collapsed because the preview size limit was exceeded.

+52 −0
[
    {
        "Unit": "core",
        "CountersNumFixed": "3",
        "CountersNumGeneric": "4"
    },
    {
        "Unit": "CBOX",
        "CountersNumFixed": "0",
        "CountersNumGeneric": "4"
    },
    {
        "Unit": "PCU",
        "CountersNumFixed": "0",
        "CountersNumGeneric": "4"
    },
    {
        "Unit": "UBOX",
        "CountersNumFixed": "0",
        "CountersNumGeneric": "2"
    },
    {
        "Unit": "QPI",
        "CountersNumFixed": "0",
        "CountersNumGeneric": "4"
    },
    {
        "Unit": "R3QPI",
        "CountersNumFixed": "0",
        "CountersNumGeneric": "3"
    },
    {
        "Unit": "R2PCIe",
        "CountersNumFixed": "0",
        "CountersNumGeneric": "4"
    },
    {
        "Unit": "HA",
        "CountersNumFixed": "0",
        "CountersNumGeneric": "4"
    },
    {
        "Unit": "iMC",
        "CountersNumFixed": "0",
        "CountersNumGeneric": "4"
    },
    {
        "Unit": "IRP",
        "CountersNumFixed": "0",
        "CountersNumGeneric": "2"
    }
]
 No newline at end of file
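
The per-unit counts in the file above are the kind of input a grouping check could use. The following is a minimal sketch, not the perf tool's actual algorithm: the helper names `load_generic_counters` and `fits_in_group` are hypothetical, the table is a three-entry subset copied from the diff, fixed counters are ignored, and each event's `Counter` field is assumed to be a comma-separated list of generic counter indices such as "0,1,2,3".

```python
import json

# Subset of the counter table from the diff above (values copied verbatim).
COUNTER_JSON = """
[
    {"Unit": "core", "CountersNumFixed": "3", "CountersNumGeneric": "4"},
    {"Unit": "UBOX", "CountersNumFixed": "0", "CountersNumGeneric": "2"},
    {"Unit": "R3QPI", "CountersNumFixed": "0", "CountersNumGeneric": "3"}
]
"""


def load_generic_counters(text):
    """Map each unit name to its number of generic counters."""
    return {e["Unit"]: int(e["CountersNumGeneric"]) for e in json.loads(text)}


def fits_in_group(event_counters, num_generic):
    """Return True if every event can be given its own generic counter.

    event_counters: list of "Counter" strings, e.g. ["0,1,2,3", "0,1"].
    num_generic: number of generic counters the unit provides.
    Hypothetical helper: a small backtracking assignment, for illustration.
    """
    # Keep only counter indices that exist on this unit.
    allowed = [
        sorted({int(c) for c in s.split(",") if int(c) < num_generic})
        for s in event_counters
    ]
    # Place the most constrained events first to prune the search early.
    allowed.sort(key=len)

    def assign(i, used):
        if i == len(allowed):
            return True
        return any(
            assign(i + 1, used | {c}) for c in allowed[i] if c not in used
        )

    return assign(0, frozenset())


if __name__ == "__main__":
    generic = load_generic_counters(COUNTER_JSON)
    # Four core events that may use any of counters 0-3.
    print(fits_in_group(["0,1,2,3"] * 4, generic["core"]))
    # A fifth event cannot be placed on four generic counters.
    print(fits_in_group(["0,1,2,3"] * 5, generic["core"]))
```

Under this simplified model, a group that fails the check would have to be split or multiplexed; how perf actually handles that case is outside what this diff shows.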
+15 −0
[
    {
        "BriefDescription": "Cycles with any input/output SSE or FP assist.",
        "Counter": "0,1,2,3",
        "CounterMask": "1",
        "EventCode": "0xCA",
        "EventName": "FP_ASSIST.ANY",
@@ -9,6 +10,7 @@
    },
    {
        "BriefDescription": "Number of SIMD FP assists due to input values.",
        "Counter": "0,1,2,3",
        "EventCode": "0xCA",
        "EventName": "FP_ASSIST.SIMD_INPUT",
        "SampleAfterValue": "100003",
@@ -16,6 +18,7 @@
    },
    {
        "BriefDescription": "Number of SIMD FP assists due to Output values.",
        "Counter": "0,1,2,3",
        "EventCode": "0xCA",
        "EventName": "FP_ASSIST.SIMD_OUTPUT",
        "SampleAfterValue": "100003",
@@ -23,6 +26,7 @@
    },
    {
        "BriefDescription": "Number of X87 assists due to input value.",
        "Counter": "0,1,2,3",
        "EventCode": "0xCA",
        "EventName": "FP_ASSIST.X87_INPUT",
        "SampleAfterValue": "100003",
@@ -30,6 +34,7 @@
    },
    {
        "BriefDescription": "Number of X87 assists due to output value.",
        "Counter": "0,1,2,3",
        "EventCode": "0xCA",
        "EventName": "FP_ASSIST.X87_OUTPUT",
        "SampleAfterValue": "100003",
@@ -37,6 +42,7 @@
    },
    {
        "BriefDescription": "Number of SSE* or AVX-128 FP Computational packed double-precision uops issued this cycle.",
        "Counter": "0,1,2,3",
        "EventCode": "0x10",
        "EventName": "FP_COMP_OPS_EXE.SSE_PACKED_DOUBLE",
        "SampleAfterValue": "2000003",
@@ -44,6 +50,7 @@
    },
    {
        "BriefDescription": "Number of SSE* or AVX-128 FP Computational packed single-precision uops issued this cycle.",
        "Counter": "0,1,2,3",
        "EventCode": "0x10",
        "EventName": "FP_COMP_OPS_EXE.SSE_PACKED_SINGLE",
        "SampleAfterValue": "2000003",
@@ -51,6 +58,7 @@
    },
    {
        "BriefDescription": "Number of SSE* or AVX-128 FP Computational scalar double-precision uops issued this cycle.",
        "Counter": "0,1,2,3",
        "EventCode": "0x10",
        "EventName": "FP_COMP_OPS_EXE.SSE_SCALAR_DOUBLE",
        "SampleAfterValue": "2000003",
@@ -58,6 +66,7 @@
    },
    {
        "BriefDescription": "Number of SSE* or AVX-128 FP Computational scalar single-precision uops issued this cycle.",
        "Counter": "0,1,2,3",
        "EventCode": "0x10",
        "EventName": "FP_COMP_OPS_EXE.SSE_SCALAR_SINGLE",
        "SampleAfterValue": "2000003",
@@ -65,6 +74,7 @@
    },
    {
        "BriefDescription": "Number of FP Computational Uops Executed this cycle. The number of FADD, FSUB, FCOM, FMULs, integer MULs and IMULs, FDIVs, FPREMs, FSQRTS, integer DIVs, and IDIVs. This event does not distinguish an FADD used in the middle of a transcendental flow from a s.",
        "Counter": "0,1,2,3",
        "EventCode": "0x10",
        "EventName": "FP_COMP_OPS_EXE.X87",
        "SampleAfterValue": "2000003",
@@ -72,6 +82,7 @@
    },
    {
        "BriefDescription": "Number of GSSE memory assist for stores. GSSE microcode assist is being invoked whenever the hardware is unable to properly handle GSSE-256b operations.",
        "Counter": "0,1,2,3",
        "EventCode": "0xC1",
        "EventName": "OTHER_ASSISTS.AVX_STORE",
        "SampleAfterValue": "100003",
@@ -79,6 +90,7 @@
    },
    {
        "BriefDescription": "Number of transitions from AVX-256 to legacy SSE when penalty applicable.",
        "Counter": "0,1,2,3",
        "EventCode": "0xC1",
        "EventName": "OTHER_ASSISTS.AVX_TO_SSE",
        "SampleAfterValue": "100003",
@@ -86,6 +98,7 @@
    },
    {
        "BriefDescription": "Number of transitions from SSE to AVX-256 when penalty applicable.",
        "Counter": "0,1,2,3",
        "EventCode": "0xC1",
        "EventName": "OTHER_ASSISTS.SSE_TO_AVX",
        "SampleAfterValue": "100003",
@@ -93,6 +106,7 @@
    },
    {
        "BriefDescription": "Number of AVX-256 Computational FP double precision uops issued this cycle.",
        "Counter": "0,1,2,3",
        "EventCode": "0x11",
        "EventName": "SIMD_FP_256.PACKED_DOUBLE",
        "SampleAfterValue": "2000003",
@@ -100,6 +114,7 @@
    },
    {
        "BriefDescription": "Number of GSSE-256 Computational FP single precision uops issued this cycle.",
        "Counter": "0,1,2,3",
        "EventCode": "0x11",
        "EventName": "SIMD_FP_256.PACKED_SINGLE",
        "SampleAfterValue": "2000003",
+32 −0
[
    {
        "BriefDescription": "Counts the total number when the front end is resteered, mainly when the BPU cannot provide a correct prediction and this is corrected by other branch handling mechanisms at the front end.",
        "Counter": "0,1,2,3",
        "EventCode": "0xE6",
        "EventName": "BACLEARS.ANY",
        "SampleAfterValue": "100003",
@@ -8,6 +9,7 @@
    },
    {
        "BriefDescription": "Decode Stream Buffer (DSB)-to-MITE switches.",
        "Counter": "0,1,2,3",
        "EventCode": "0xAB",
        "EventName": "DSB2MITE_SWITCHES.COUNT",
        "SampleAfterValue": "2000003",
@@ -15,6 +17,7 @@
    },
    {
        "BriefDescription": "Decode Stream Buffer (DSB)-to-MITE switch true penalty cycles.",
        "Counter": "0,1,2,3",
        "EventCode": "0xAB",
        "EventName": "DSB2MITE_SWITCHES.PENALTY_CYCLES",
        "PublicDescription": "This event counts the cycles attributed to a switch from the Decoded Stream Buffer (DSB), which holds decoded instructions, to the legacy decode pipeline.  It excludes cycles when the back-end cannot  accept new micro-ops.  The penalty for these switches is potentially several cycles of instruction starvation, where no micro-ops are delivered to the back-end.",
@@ -23,6 +26,7 @@
    },
    {
        "BriefDescription": "Cases of cancelling valid Decode Stream Buffer (DSB) fill not because of exceeding way limit.",
        "Counter": "0,1,2,3",
        "EventCode": "0xAC",
        "EventName": "DSB_FILL.ALL_CANCEL",
        "SampleAfterValue": "2000003",
@@ -30,6 +34,7 @@
    },
    {
        "BriefDescription": "Cycles when Decode Stream Buffer (DSB) fill encounter more than 3 Decode Stream Buffer (DSB) lines.",
        "Counter": "0,1,2,3",
        "EventCode": "0xAC",
        "EventName": "DSB_FILL.EXCEED_DSB_LINES",
        "SampleAfterValue": "2000003",
@@ -37,6 +42,7 @@
    },
    {
        "BriefDescription": "Cases of cancelling valid DSB fill not because of exceeding way limit.",
        "Counter": "0,1,2,3",
        "EventCode": "0xAC",
        "EventName": "DSB_FILL.OTHER_CANCEL",
        "SampleAfterValue": "2000003",
@@ -44,6 +50,7 @@
    },
    {
        "BriefDescription": "Number of Instruction Cache, Streaming Buffer and Victim Cache Reads. both cacheable and noncacheable, including UC fetches.",
        "Counter": "0,1,2,3",
        "EventCode": "0x80",
        "EventName": "ICACHE.HIT",
        "SampleAfterValue": "2000003",
@@ -51,6 +58,7 @@
    },
    {
        "BriefDescription": "Instruction cache, streaming buffer and victim cache misses.",
        "Counter": "0,1,2,3",
        "EventCode": "0x80",
        "EventName": "ICACHE.MISSES",
        "PublicDescription": "This event counts the number of instruction cache, streaming buffer and victim cache misses. Counting includes unchacheable accesses.",
@@ -59,6 +67,7 @@
    },
    {
        "BriefDescription": "Cycles Decode Stream Buffer (DSB) is delivering 4 Uops.",
        "Counter": "0,1,2,3",
        "CounterMask": "4",
        "EventCode": "0x79",
        "EventName": "IDQ.ALL_DSB_CYCLES_4_UOPS",
@@ -67,6 +76,7 @@
    },
    {
        "BriefDescription": "Cycles Decode Stream Buffer (DSB) is delivering any Uop.",
        "Counter": "0,1,2,3",
        "CounterMask": "1",
        "EventCode": "0x79",
        "EventName": "IDQ.ALL_DSB_CYCLES_ANY_UOPS",
@@ -75,6 +85,7 @@
    },
    {
        "BriefDescription": "Cycles MITE is delivering 4 Uops.",
        "Counter": "0,1,2,3",
        "CounterMask": "4",
        "EventCode": "0x79",
        "EventName": "IDQ.ALL_MITE_CYCLES_4_UOPS",
@@ -83,6 +94,7 @@
    },
    {
        "BriefDescription": "Cycles MITE is delivering any Uop.",
        "Counter": "0,1,2,3",
        "CounterMask": "1",
        "EventCode": "0x79",
        "EventName": "IDQ.ALL_MITE_CYCLES_ANY_UOPS",
@@ -91,6 +103,7 @@
    },
    {
        "BriefDescription": "Cycles when uops are being delivered to Instruction Decode Queue (IDQ) from Decode Stream Buffer (DSB) path.",
        "Counter": "0,1,2,3",
        "CounterMask": "1",
        "EventCode": "0x79",
        "EventName": "IDQ.DSB_CYCLES",
@@ -99,6 +112,7 @@
    },
    {
        "BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path.",
        "Counter": "0,1,2,3",
        "EventCode": "0x79",
        "EventName": "IDQ.DSB_UOPS",
        "SampleAfterValue": "2000003",
@@ -106,6 +120,7 @@
    },
    {
        "BriefDescription": "Instruction Decode Queue (IDQ) empty cycles.",
        "Counter": "0,1,2,3",
        "EventCode": "0x79",
        "EventName": "IDQ.EMPTY",
        "SampleAfterValue": "2000003",
@@ -113,6 +128,7 @@
    },
    {
        "BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) from MITE path.",
        "Counter": "0,1,2,3",
        "EventCode": "0x79",
        "EventName": "IDQ.MITE_ALL_UOPS",
        "SampleAfterValue": "2000003",
@@ -120,6 +136,7 @@
    },
    {
        "BriefDescription": "Cycles when uops are being delivered to Instruction Decode Queue (IDQ) from MITE path.",
        "Counter": "0,1,2,3",
        "CounterMask": "1",
        "EventCode": "0x79",
        "EventName": "IDQ.MITE_CYCLES",
@@ -128,6 +145,7 @@
    },
    {
        "BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) from MITE path.",
        "Counter": "0,1,2,3",
        "EventCode": "0x79",
        "EventName": "IDQ.MITE_UOPS",
        "SampleAfterValue": "2000003",
@@ -135,6 +153,7 @@
    },
    {
        "BriefDescription": "Cycles when uops are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequencer (MS) is busy.",
        "Counter": "0,1,2,3",
        "CounterMask": "1",
        "EventCode": "0x79",
        "EventName": "IDQ.MS_CYCLES",
@@ -144,6 +163,7 @@
    },
    {
        "BriefDescription": "Cycles when uops initiated by Decode Stream Buffer (DSB) are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequencer (MS) is busy.",
        "Counter": "0,1,2,3",
        "CounterMask": "1",
        "EventCode": "0x79",
        "EventName": "IDQ.MS_DSB_CYCLES",
@@ -152,6 +172,7 @@
    },
    {
        "BriefDescription": "Deliveries to Instruction Decode Queue (IDQ) initiated by Decode Stream Buffer (DSB) while Microcode Sequencer (MS) is busy.",
        "Counter": "0,1,2,3",
        "CounterMask": "1",
        "EdgeDetect": "1",
        "EventCode": "0x79",
@@ -161,6 +182,7 @@
    },
    {
        "BriefDescription": "Uops initiated by Decode Stream Buffer (DSB) that are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequencer (MS) is busy.",
        "Counter": "0,1,2,3",
        "EventCode": "0x79",
        "EventName": "IDQ.MS_DSB_UOPS",
        "SampleAfterValue": "2000003",
@@ -168,6 +190,7 @@
    },
    {
        "BriefDescription": "Uops initiated by MITE and delivered to Instruction Decode Queue (IDQ) while Microcode Sequencer (MS) is busy.",
        "Counter": "0,1,2,3",
        "EventCode": "0x79",
        "EventName": "IDQ.MS_MITE_UOPS",
        "SampleAfterValue": "2000003",
@@ -175,6 +198,7 @@
    },
    {
        "BriefDescription": "Number of switches from DSB (Decode Stream Buffer) or MITE (legacy decode pipeline) to the Microcode Sequencer.",
        "Counter": "0,1,2,3",
        "CounterMask": "1",
        "EdgeDetect": "1",
        "EventCode": "0x79",
@@ -184,6 +208,7 @@
    },
    {
        "BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) while Microcode Sequencer (MS) is busy.",
        "Counter": "0,1,2,3",
        "EventCode": "0x79",
        "EventName": "IDQ.MS_UOPS",
        "SampleAfterValue": "2000003",
@@ -191,6 +216,7 @@
    },
    {
        "BriefDescription": "Uops not delivered to Resource Allocation Table (RAT) per thread when backend of the machine is not stalled .",
        "Counter": "0,1,2,3",
        "EventCode": "0x9C",
        "EventName": "IDQ_UOPS_NOT_DELIVERED.CORE",
        "PublicDescription": "This event counts the number of uops not delivered to the back-end per cycle, per thread, when the back-end was not stalled.  In the ideal case 4 uops can be delivered each cycle.  The event counts the undelivered uops - so if 3 were delivered in one cycle, the counter would be incremented by 1 for that cycle (4 - 3). If the back-end is stalled, the count for this event is not incremented even when uops were not delivered, because the back-end would not have been able to accept them.  This event is used in determining the front-end bound category of the top-down pipeline slots characterization.",
@@ -199,6 +225,7 @@
    },
    {
        "BriefDescription": "Cycles per thread when 4 or more uops are not delivered to Resource Allocation Table (RAT) when backend of the machine is not stalled.",
        "Counter": "0,1,2,3",
        "CounterMask": "4",
        "EventCode": "0x9C",
        "EventName": "IDQ_UOPS_NOT_DELIVERED.CYCLES_0_UOPS_DELIV.CORE",
@@ -207,6 +234,7 @@
    },
    {
        "BriefDescription": "Counts cycles FE delivered 4 uops or Resource Allocation Table (RAT) was stalling FE.",
        "Counter": "0,1,2,3",
        "CounterMask": "1",
        "EventCode": "0x9C",
        "EventName": "IDQ_UOPS_NOT_DELIVERED.CYCLES_FE_WAS_OK",
@@ -216,6 +244,7 @@
    },
    {
        "BriefDescription": "Cycles when 1 or more uops were delivered to the by the front end.",
        "Counter": "0,1,2,3",
        "CounterMask": "4",
        "EventCode": "0x9C",
        "EventName": "IDQ_UOPS_NOT_DELIVERED.CYCLES_GE_1_UOP_DELIV.CORE",
@@ -225,6 +254,7 @@
    },
    {
        "BriefDescription": "Cycles per thread when 3 or more uops are not delivered to Resource Allocation Table (RAT) when backend of the machine is not stalled.",
        "Counter": "0,1,2,3",
        "CounterMask": "3",
        "EventCode": "0x9C",
        "EventName": "IDQ_UOPS_NOT_DELIVERED.CYCLES_LE_1_UOP_DELIV.CORE",
@@ -233,6 +263,7 @@
    },
    {
        "BriefDescription": "Cycles with less than 2 uops delivered by the front end.",
        "Counter": "0,1,2,3",
        "CounterMask": "2",
        "EventCode": "0x9C",
        "EventName": "IDQ_UOPS_NOT_DELIVERED.CYCLES_LE_2_UOP_DELIV.CORE",
@@ -241,6 +272,7 @@
    },
    {
        "BriefDescription": "Cycles with less than 3 uops delivered by the front end.",
        "Counter": "0,1,2,3",
        "CounterMask": "1",
        "EventCode": "0x9C",
        "EventName": "IDQ_UOPS_NOT_DELIVERED.CYCLES_LE_3_UOP_DELIV.CORE",
+12 −12
@@ -73,7 +73,7 @@
        "BriefDescription": "This category represents fraction of slots where no uops are being delivered due to a lack of required resources for accepting new uops in the Backend",
        "MetricConstraint": "NO_GROUP_EVENTS_NMI",
        "MetricExpr": "1 - (tma_frontend_bound + tma_bad_speculation + tma_retiring)",
        "MetricGroup": "TmaL1;TopdownL1;tma_L1_group",
        "MetricGroup": "BvOB;TmaL1;TopdownL1;tma_L1_group",
        "MetricName": "tma_backend_bound",
        "MetricThreshold": "tma_backend_bound > 0.2",
        "MetricgroupNoGroup": "TopdownL1",
@@ -94,7 +94,7 @@
        "BriefDescription": "This metric represents fraction of slots the CPU has wasted due to Branch Misprediction",
        "MetricConstraint": "NO_GROUP_EVENTS",
        "MetricExpr": "BR_MISP_RETIRED.ALL_BRANCHES / (BR_MISP_RETIRED.ALL_BRANCHES + MACHINE_CLEARS.COUNT) * tma_bad_speculation",
        "MetricGroup": "BadSpec;BrMispredicts;TmaL2;TopdownL2;tma_L2_group;tma_bad_speculation_group;tma_issueBM",
        "MetricGroup": "BadSpec;BrMispredicts;BvMP;TmaL2;TopdownL2;tma_L2_group;tma_bad_speculation_group;tma_issueBM",
        "MetricName": "tma_branch_mispredicts",
        "MetricThreshold": "tma_branch_mispredicts > 0.1 & tma_bad_speculation > 0.15",
        "MetricgroupNoGroup": "TopdownL2",
@@ -124,7 +124,7 @@
    {
        "BriefDescription": "This metric represents fraction of cycles where the Divider unit was active",
        "MetricExpr": "ARITH.FPU_DIV_ACTIVE / tma_info_core_core_clks",
        "MetricGroup": "TopdownL3;tma_L3_group;tma_core_bound_group",
        "MetricGroup": "BvCB;TopdownL3;tma_L3_group;tma_core_bound_group",
        "MetricName": "tma_divider",
        "MetricThreshold": "tma_divider > 0.2 & (tma_core_bound > 0.1 & tma_backend_bound > 0.2)",
        "PublicDescription": "This metric represents fraction of cycles where the Divider unit was active. Divide and square root instructions are performed by the Divider unit and can take considerably longer latency than integer or Floating Point addition; subtraction; or multiplication. Sample with: ARITH.DIVIDER_UOPS",
@@ -152,7 +152,7 @@
    {
        "BriefDescription": "This metric roughly estimates the fraction of cycles where the Data TLB (DTLB) was missed by load accesses",
        "MetricExpr": "(7 * DTLB_LOAD_MISSES.STLB_HIT + DTLB_LOAD_MISSES.WALK_DURATION) / tma_info_thread_clks",
        "MetricGroup": "MemoryTLB;TopdownL4;tma_L4_group;tma_issueTLB;tma_l1_bound_group",
        "MetricGroup": "BvMT;MemoryTLB;TopdownL4;tma_L4_group;tma_issueTLB;tma_l1_bound_group",
        "MetricName": "tma_dtlb_load",
        "MetricThreshold": "tma_dtlb_load > 0.1",
        "PublicDescription": "This metric roughly estimates the fraction of cycles where the Data TLB (DTLB) was missed by load accesses. TLBs (Translation Look-aside Buffers) are processor caches for recently used entries out of the Page Tables that are used to map virtual- to physical-addresses by the operating system. This metric approximates the potential delay of demand loads missing the first-level data TLB (assuming worst case scenario with back to back misses to different pages). This includes hitting in the second-level TLB (STLB) as well as performing a hardware page walk on an STLB miss. Sample with: MEM_UOPS_RETIRED.STLB_MISS_LOADS_PS. Related metrics: tma_dtlb_store",
@@ -226,7 +226,7 @@
    {
        "BriefDescription": "This category represents fraction of slots where the processor's Frontend undersupplies its Backend",
        "MetricExpr": "IDQ_UOPS_NOT_DELIVERED.CORE / tma_info_thread_slots",
        "MetricGroup": "PGO;TmaL1;TopdownL1;tma_L1_group",
        "MetricGroup": "BvFB;BvIO;PGO;TmaL1;TopdownL1;tma_L1_group",
        "MetricName": "tma_frontend_bound",
        "MetricThreshold": "tma_frontend_bound > 0.15",
        "MetricgroupNoGroup": "TopdownL1",
@@ -296,13 +296,13 @@
    },
    {
        "BriefDescription": "Average CPU Utilization (percentage)",
        "MetricExpr": "CPU_CLK_UNHALTED.REF_TSC / TSC",
        "MetricExpr": "tma_info_system_cpus_utilized / #num_cpus_online",
        "MetricGroup": "HPC;Summary",
        "MetricName": "tma_info_system_cpu_utilization"
    },
    {
        "BriefDescription": "Average number of utilized CPUs",
        "MetricExpr": "#num_cpus_online * tma_info_system_cpu_utilization",
        "MetricExpr": "CPU_CLK_UNHALTED.REF_TSC / TSC",
        "MetricGroup": "Summary",
        "MetricName": "tma_info_system_cpus_utilized"
    },
@@ -419,7 +419,7 @@
    {
        "BriefDescription": "This metric represents fraction of cycles the CPU was stalled due to Instruction TLB (ITLB) misses",
        "MetricExpr": "(12 * ITLB_MISSES.STLB_HIT + ITLB_MISSES.WALK_DURATION) / tma_info_thread_clks",
        "MetricGroup": "BigFootprint;FetchLat;MemoryTLB;TopdownL3;tma_L3_group;tma_fetch_latency_group",
        "MetricGroup": "BigFootprint;BvBC;FetchLat;MemoryTLB;TopdownL3;tma_L3_group;tma_fetch_latency_group",
        "MetricName": "tma_itlb_misses",
        "MetricThreshold": "tma_itlb_misses > 0.05 & (tma_fetch_latency > 0.1 & tma_frontend_bound > 0.15)",
        "PublicDescription": "This metric represents fraction of cycles the CPU was stalled due to Instruction TLB (ITLB) misses. Sample with: ITLB_MISSES.WALK_COMPLETED",
@@ -458,7 +458,7 @@
        "BriefDescription": "This metric represents fraction of slots the CPU has wasted due to Machine Clears",
        "MetricConstraint": "NO_GROUP_EVENTS",
        "MetricExpr": "tma_bad_speculation - tma_branch_mispredicts",
        "MetricGroup": "BadSpec;MachineClears;TmaL2;TopdownL2;tma_L2_group;tma_bad_speculation_group;tma_issueMC;tma_issueSyncxn",
        "MetricGroup": "BadSpec;BvMS;MachineClears;TmaL2;TopdownL2;tma_L2_group;tma_bad_speculation_group;tma_issueMC;tma_issueSyncxn",
        "MetricName": "tma_machine_clears",
        "MetricThreshold": "tma_machine_clears > 0.1 & tma_bad_speculation > 0.15",
        "MetricgroupNoGroup": "TopdownL2",
@@ -468,7 +468,7 @@
    {
        "BriefDescription": "This metric estimates fraction of cycles where the core's performance was likely hurt due to approaching bandwidth limits of external memory - DRAM ([SPR-HBM] and/or HBM)",
        "MetricExpr": "min(CPU_CLK_UNHALTED.THREAD, cpu@OFFCORE_REQUESTS_OUTSTANDING.ALL_DATA_RD\\,cmask\\=6@) / tma_info_thread_clks",
        "MetricGroup": "MemoryBW;Offcore;TopdownL4;tma_L4_group;tma_dram_bound_group;tma_issueBW",
        "MetricGroup": "BvMS;MemoryBW;Offcore;TopdownL4;tma_L4_group;tma_dram_bound_group;tma_issueBW",
        "MetricName": "tma_mem_bandwidth",
        "MetricThreshold": "tma_mem_bandwidth > 0.2 & (tma_dram_bound > 0.1 & (tma_memory_bound > 0.2 & tma_backend_bound > 0.2))",
        "PublicDescription": "This metric estimates fraction of cycles where the core's performance was likely hurt due to approaching bandwidth limits of external memory - DRAM ([SPR-HBM] and/or HBM).  The underlying heuristic assumes that a similar off-core traffic is generated by all IA cores. This metric does not aggregate non-data-read requests by this logical processor; requests from other IA Logical Processors/Physical Cores/sockets; or other non-IA devices like GPU; hence the maximum external memory bandwidth limits may or may not be approached when this metric is flagged (see Uncore counters for that). Related metrics: tma_info_system_dram_bw_use",
@@ -477,7 +477,7 @@
    {
        "BriefDescription": "This metric estimates fraction of cycles where the performance was likely hurt due to latency from external memory - DRAM ([SPR-HBM] and/or HBM)",
        "MetricExpr": "min(CPU_CLK_UNHALTED.THREAD, OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DATA_RD) / tma_info_thread_clks - tma_mem_bandwidth",
        "MetricGroup": "MemoryLat;Offcore;TopdownL4;tma_L4_group;tma_dram_bound_group;tma_issueLat",
        "MetricGroup": "BvML;MemoryLat;Offcore;TopdownL4;tma_L4_group;tma_dram_bound_group;tma_issueLat",
        "MetricName": "tma_mem_latency",
        "MetricThreshold": "tma_mem_latency > 0.1 & (tma_dram_bound > 0.1 & (tma_memory_bound > 0.2 & tma_backend_bound > 0.2))",
        "PublicDescription": "This metric estimates fraction of cycles where the performance was likely hurt due to latency from external memory - DRAM ([SPR-HBM] and/or HBM).  This metric does not aggregate requests from other Logical Processors/Physical Cores/sockets (see Uncore counters for that). Related metrics: ",
@@ -525,7 +525,7 @@
    {
        "BriefDescription": "This category represents fraction of slots utilized by useful work i.e. issued uops that eventually get retired",
        "MetricExpr": "UOPS_RETIRED.RETIRE_SLOTS / tma_info_thread_slots",
        "MetricGroup": "TmaL1;TopdownL1;tma_L1_group",
        "MetricGroup": "BvUW;TmaL1;TopdownL1;tma_L1_group",
        "MetricName": "tma_retiring",
        "MetricThreshold": "tma_retiring > 0.7 | tma_heavy_operations > 0.1",
        "MetricgroupNoGroup": "TopdownL1",
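
In the `-296,13 +296,13` hunk above, the two `MetricExpr` definitions trade places: after the change, `tma_info_system_cpus_utilized` is computed directly from `CPU_CLK_UNHALTED.REF_TSC / TSC`, and `tma_info_system_cpu_utilization` is derived from it by dividing by `#num_cpus_online`. A minimal sketch of the new expressions as plain arithmetic follows; the function names and input numbers are made up for illustration, and the inputs are assumed to be whatever counts perf supplies for the measurement.

```python
def cpus_utilized(ref_tsc_cycles, tsc_cycles):
    """New tma_info_system_cpus_utilized: CPU_CLK_UNHALTED.REF_TSC / TSC."""
    return ref_tsc_cycles / tsc_cycles


def cpu_utilization(ref_tsc_cycles, tsc_cycles, num_cpus_online):
    """New tma_info_system_cpu_utilization: cpus_utilized / #num_cpus_online."""
    return cpus_utilized(ref_tsc_cycles, tsc_cycles) / num_cpus_online


if __name__ == "__main__":
    # Illustrative inputs only, not measured values.
    ref_tsc, tsc, online = 6_000_000_000, 2_000_000_000, 8
    print(cpus_utilized(ref_tsc, tsc))            # 3.0
    print(cpu_utilization(ref_tsc, tsc, online))  # 0.375
```

The two metrics keep the same algebraic relationship before and after the change (one is the other scaled by `#num_cpus_online`); only which one is defined from raw events differs.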