Commit 429b7638 authored by Besar Wicaksono, committed by Will Deacon

perf: add NVIDIA Tegra410 CPU Memory Latency PMU



Add CPU Memory (CMEM) Latency PMU support for the Tegra410 SoC.
The PMU measures the latency of memory read requests from the edge of
the Unified Coherence Fabric to the local system DRAM.

Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Signed-off-by: Besar Wicaksono <bwicaksono@nvidia.com>
Signed-off-by: Will Deacon <will@kernel.org>
parent 3dd73022
+25 −0
@@ -8,6 +8,7 @@ metrics like memory bandwidth, latency, and utilization:
* Unified Coherence Fabric (UCF)
* PCIE
* PCIE-TGT
* CPU Memory (CMEM) Latency

PMU Driver
----------
@@ -344,3 +345,27 @@ Example usage:
  0x10000 to 0x100FF on socket 0's PCIE RC-1::

    perf stat -a -e nvidia_pcie_tgt_pmu_0_rc_1/event=0x1,dst_addr_base=0x10000,dst_addr_mask=0xFFF00,dst_addr_en=0x1/

CPU Memory (CMEM) Latency PMU
-----------------------------

This PMU monitors latency events of memory read requests from the edge of the
Unified Coherence Fabric (UCF) to local CPU DRAM:

  * RD_REQ counters: count read requests (32B per request).
  * RD_CUM_OUTS counters: accumulated outstanding request counters, which track
    how many cycles the read requests are in flight.
  * CYCLES counter: counts the number of elapsed cycles.

The average latency is calculated as::

   FREQ_IN_GHZ = CYCLES / ELAPSED_TIME_IN_NS
   AVG_LATENCY_IN_CYCLES = RD_CUM_OUTS / RD_REQ
   AVERAGE_LATENCY_IN_NS = AVG_LATENCY_IN_CYCLES / FREQ_IN_GHZ

The events and configuration options of this PMU device are described in sysfs;
see /sys/bus/event_source/devices/nvidia_cmem_latency_pmu_<socket-id>.

Example usage::

  perf stat -a -e '{nvidia_cmem_latency_pmu_0/rd_req/,nvidia_cmem_latency_pmu_0/rd_cum_outs/,nvidia_cmem_latency_pmu_0/cycles/}'
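
Reviewer note: the average-latency formula in the documentation hunk above can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the patch. The function name `average_read_latency_ns` and the sample counter values are assumptions made up for this example; real values would come from the `rd_req`, `rd_cum_outs`, and `cycles` counts reported by `perf stat`, together with the elapsed wall-clock time of the measurement.

```python
def average_read_latency_ns(rd_req, rd_cum_outs, cycles, elapsed_ns):
    """Apply the documented formula to raw counter values.

    FREQ_IN_GHZ           = CYCLES / ELAPSED_TIME_IN_NS
    AVG_LATENCY_IN_CYCLES = RD_CUM_OUTS / RD_REQ
    AVERAGE_LATENCY_IN_NS = AVG_LATENCY_IN_CYCLES / FREQ_IN_GHZ
    """
    # Guard against division by zero when nothing was counted.
    if rd_req <= 0 or cycles <= 0 or elapsed_ns <= 0:
        raise ValueError("rd_req, cycles and elapsed_ns must all be positive")

    freq_in_ghz = cycles / elapsed_ns            # cycles per nanosecond
    avg_latency_in_cycles = rd_cum_outs / rd_req
    return avg_latency_in_cycles / freq_in_ghz


# Hypothetical counter values, chosen only to make the arithmetic easy to follow:
# 2e9 cycles over 1 s gives 2 GHz, and 2e8 / 1e6 gives 200 cycles per request.
latency = average_read_latency_ns(
    rd_req=1_000_000,
    rd_cum_outs=200_000_000,
    cycles=2_000_000_000,
    elapsed_ns=1_000_000_000,
)
print(latency)  # prints 100.0
```

Deriving the frequency from the CYCLES counter and the elapsed time, as the documentation does, avoids hard-coding a clock rate for the PMU.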
+7 −0
@@ -311,4 +311,11 @@ config MARVELL_PEM_PMU
	  Enable support for PCIe Interface performance monitoring
	  on Marvell platform.

config NVIDIA_TEGRA410_CMEM_LATENCY_PMU
	tristate "NVIDIA Tegra410 CPU Memory Latency PMU"
	depends on ARM64 && ACPI
	help
	  Enable perf support for monitoring CPU memory latency counters on
	  the NVIDIA Tegra410 SoC.

endmenu
+1 −0
@@ -35,3 +35,4 @@ obj-$(CONFIG_DWC_PCIE_PMU) += dwc_pcie_pmu.o
obj-$(CONFIG_ARM_CORESIGHT_PMU_ARCH_SYSTEM_PMU) += arm_cspmu/
obj-$(CONFIG_MESON_DDR_PMU) += amlogic/
obj-$(CONFIG_CXL_PMU) += cxl_pmu.o
obj-$(CONFIG_NVIDIA_TEGRA410_CMEM_LATENCY_PMU) += nvidia_t410_cmem_latency_pmu.o
+736 −0

File added.
