Commit 119e3eef authored by Will Deacon

Merge branch 'for-next/perf' into for-next/core

* for-next/perf: (33 commits)
  perf: arm-ni: Fix an NULL vs IS_ERR() bug
  perf: arm_pmuv3: Use BR_RETIRED for HW branch event if enabled
  MAINTAINERS: List Arm interconnect PMUs as supported
  perf: Add driver for Arm NI-700 interconnect PMU
  dt-bindings/perf: Add Arm NI-700 PMU
  perf/arm-cmn: Improve format attr printing
  perf/arm-cmn: Clean up unnecessary NUMA_NO_NODE check
  perf/arm-cmn: Support CMN S3
  dt-bindings: perf: arm-cmn: Add CMN S3
  perf/arm-cmn: Refactor DTC PMU register access
  perf/arm-cmn: Make cycle counts less surprising
  perf/arm-cmn: Improve build-time assertion
  perf/arm-cmn: Ensure dtm_idx is big enough
  perf/arm-cmn: Fix CCLA register offset
  perf/arm-cmn: Refactor node ID handling. Again.
  drivers/perf: hisi_pcie: Export supported Root Ports [bdf_min, bdf_max]
  drivers/perf: hisi_pcie: Fix TLP headers bandwidth counting
  drivers/perf: hisi_pcie: Record hardware counts correctly
  drivers/perf: arm_spe: Use perf_allow_kernel() for permissions
  perf/dwc_pcie: Add support for QCOM vendor devices
  ...
parents c2c94023 2e091a80
+17 −0
====================================
Arm Network-on Chip Interconnect PMU
====================================

NI-700 and friends implement a distinct PMU for each clock domain within the
interconnect. Correspondingly, the driver exposes multiple PMU devices named
arm_ni_<x>_cd_<y>, where <x> is an (arbitrary) instance identifier and <y> is
the clock domain ID within that particular instance. If multiple NI instances
exist within a system, the PMU devices can be correlated with the underlying
hardware instance via sysfs parentage.

Each PMU exposes base event aliases for the interface types present in its clock
domain. These require qualifying with the "eventid" and "nodeid" parameters
to specify the event code to count and the interface at which to count it
(per the configured hardware ID as reflected in the xxNI_NODE_INFO register).
The exception is the "cycles" alias for the PMU cycle counter, which is encoded
with the PMU node type and needs no further qualification.
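As an illustration of the naming and qualification rules described above, the following is a minimal Python sketch that assembles a perf event specifier for one of these PMUs. The helper name ``ni_event`` and the alias ``asni`` used in the usage note are hypothetical placeholders, not taken from the documentation; only the ``arm_ni_<x>_cd_<y>`` naming, the ``eventid`` and ``nodeid`` parameters, and the ``cycles`` alias come from the text. The sketch also assumes that <x> and <y> are written as plain decimal integers.

```python
def ni_event(instance, clock_domain, alias, eventid=None, nodeid=None):
    """Build a perf event specifier for an Arm NI PMU (illustrative helper).

    instance / clock_domain fill in <x> and <y> of arm_ni_<x>_cd_<y>.
    alias is a base event alias exposed by that PMU; real alias names
    should be read from the PMU's "events" directory in sysfs.
    """
    pmu = f"arm_ni_{instance}_cd_{clock_domain}"

    # The cycle counter needs no further qualification.
    if alias == "cycles":
        return f"{pmu}/cycles/"

    # Every other base alias must be qualified with both parameters.
    if eventid is None or nodeid is None:
        raise ValueError("eventid and nodeid are required for interface events")

    return f"{pmu}/{alias},eventid={eventid:#x},nodeid={nodeid}/"
```

The resulting string could then be passed to ``perf stat -a -e``, for example ``ni_event(0, 1, "cycles")`` for the cycle counter of clock domain 1 in instance 0, or ``ni_event(0, 0, "asni", eventid=0x1, nodeid=2)`` where ``asni`` stands in for whichever alias the PMU actually lists.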
+8 −8
@@ -46,16 +46,16 @@ Some of the events only exist for specific configurations.
DesignWare Cores (DWC) PCIe PMU Driver
=======================================

This driver adds PMU devices for each PCIe Root Port named based on the BDF of
This driver adds PMU devices for each PCIe Root Port named based on the SBDF of
the Root Port. For example,

    30:03.0 PCI bridge: Device 1ded:8000 (rev 01)
    0001:30:03.0 PCI bridge: Device 1ded:8000 (rev 01)

the PMU device name for this Root Port is dwc_rootport_3018.
the PMU device name for this Root Port is dwc_rootport_13018.
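The relationship between the address in the lspci line and the PMU device name can be sketched in Python as below. This is an assumption inferred from the two examples in this hunk (30:03.0 giving 3018 and 0001:30:03.0 giving 13018), not a copy of the driver code: the segment (domain), bus, device and function numbers are packed using the standard PCI layout and printed in hexadecimal without leading zeros. The function name ``dwc_rootport_name`` is hypothetical.

```python
def dwc_rootport_name(sbdf_text):
    """Derive the PMU device name from an lspci-style address.

    Accepts "SSSS:BB:DD.F" or "BB:DD.F" (segment defaults to 0).
    All fields are hexadecimal, as printed by lspci.
    """
    parts = sbdf_text.split(":")
    if len(parts) == 3:
        segment, bus, devfn_text = parts
    elif len(parts) == 2:
        segment, (bus, devfn_text) = "0", parts
    else:
        raise ValueError(f"unrecognised PCI address: {sbdf_text!r}")

    device, function = devfn_text.split(".")

    # Standard PCI packing: devfn = device[7:3] | function[2:0],
    # BDF = bus[15:8] | devfn[7:0], with the segment placed above bit 16.
    devfn = (int(device, 16) << 3) | int(function, 16)
    sbdf = (int(segment, 16) << 16) | (int(bus, 16) << 8) | devfn

    return f"dwc_rootport_{sbdf:x}"


print(dwc_rootport_name("0001:30:03.0"))  # dwc_rootport_13018
print(dwc_rootport_name("30:03.0"))       # dwc_rootport_3018
```

With segment 0 the name is unchanged from the old BDF-based scheme, which is consistent with the examples before and after this change.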

The DWC PCIe PMU driver registers a perf PMU driver, which provides
description of available events and configuration options in sysfs, see
/sys/bus/event_source/devices/dwc_rootport_{bdf}.
/sys/bus/event_source/devices/dwc_rootport_{sbdf}.

The "format" directory describes format of the config fields of the
perf_event_attr structure. The "events" directory provides configuration
@@ -66,16 +66,16 @@ The "perf list" command shall list the available events from sysfs, e.g.::

    $# perf list | grep dwc_rootport
    <...>
    dwc_rootport_3018/Rx_PCIe_TLP_Data_Payload/        [Kernel PMU event]
    dwc_rootport_13018/Rx_PCIe_TLP_Data_Payload/        [Kernel PMU event]
    <...>
    dwc_rootport_3018/rx_memory_read,lane=?/               [Kernel PMU event]
    dwc_rootport_13018/rx_memory_read,lane=?/               [Kernel PMU event]

Time Based Analysis Event Usage
-------------------------------

Example usage of counting PCIe RX TLP data payload (Units of bytes)::

    $# perf stat -a -e dwc_rootport_3018/Rx_PCIe_TLP_Data_Payload/
    $# perf stat -a -e dwc_rootport_13018/Rx_PCIe_TLP_Data_Payload/

The average RX/TX bandwidth can be calculated using the following formula:

@@ -88,7 +88,7 @@ Lane Event Usage
Each lane has the same event set and to avoid generating a list of hundreds
of events, the user need to specify the lane ID explicitly, e.g.::

    $# perf stat -a -e dwc_rootport_3018/rx_memory_read,lane=4/
    $# perf stat -a -e dwc_rootport_13018/rx_memory_read,lane=4/

The driver does not support sampling, therefore "perf record" will not
work. Per-task (without "-a") perf sessions are not supported.
+3 −1
@@ -28,7 +28,9 @@ The "identifier" sysfs file allows users to identify the version of the
PMU hardware device.

The "bus" sysfs file allows users to get the bus number of Root Ports
monitored by PMU.
monitored by PMU. Furthermore users can get the Root Ports range in
[bdf_min, bdf_max] from "bdf_min" and "bdf_max" sysfs attributes
respectively.
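One way the new attributes might be used is to check whether a given Root Port falls inside the range a PMU monitors. The Python sketch below assumes that "bdf_min" and "bdf_max" each contain a single hexadecimal number (with or without a ``0x`` prefix) encoding bus, device and function in the usual 16-bit PCI layout; that format is an assumption and should be confirmed against the real sysfs files. The function name ``bdf_in_range`` is hypothetical.

```python
def bdf_in_range(bus, device, function, bdf_min_text, bdf_max_text):
    """Return True if bus:device.function lies within [bdf_min, bdf_max].

    bdf_min_text / bdf_max_text are the raw strings read from the
    "bdf_min" and "bdf_max" sysfs attributes (assumed hexadecimal).
    """
    # Standard 16-bit PCI BDF: bus[15:8] | device[7:3] | function[2:0].
    bdf = (bus << 8) | (device << 3) | function

    # int(..., 16) accepts both "0x8000" and "8000"; strip the newline
    # that sysfs attributes normally end with.
    bdf_min = int(bdf_min_text.strip(), 16)
    bdf_max = int(bdf_max_text.strip(), 16)

    return bdf_min <= bdf <= bdf_max
```

In practice the two strings would be read from the "bdf_min" and "bdf_max" files in the PMU's sysfs directory and passed in unchanged.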

Example usage of perf::

+1 −0
@@ -16,6 +16,7 @@ Performance monitor support
   starfive_starlink_pmu
   arm-ccn
   arm-cmn
   arm-ni
   xgene-pmu
   arm_dsu_pmu
   thunderx2-pmu
+1 −0
@@ -16,6 +16,7 @@ properties:
      - arm,cmn-600
      - arm,cmn-650
      - arm,cmn-700
      - arm,cmn-s3
      - arm,ci-700

  reg: