Commit dd9168ab authored by Will Deacon

Merge branch 'for-next/perf' into for-next/core

* for-next/perf: (30 commits)
  arm: perf: Fix ARCH=arm build with GCC
  MAINTAINERS: add maintainers for DesignWare PCIe PMU driver
  drivers/perf: add DesignWare PCIe PMU driver
  PCI: Move pci_clear_and_set_dword() helper to PCI header
  PCI: Add Alibaba Vendor ID to linux/pci_ids.h
  docs: perf: Add description for Synopsys DesignWare PCIe PMU driver
  Revert "perf/arm_dmc620: Remove duplicate format attribute #defines"
  Documentation: arm64: Document the PMU event counting threshold feature
  arm64: perf: Add support for event counting threshold
  arm: pmu: Move error message and -EOPNOTSUPP to individual PMUs
  KVM: selftests: aarch64: Update tools copy of arm_pmuv3.h
  perf/arm_dmc620: Remove duplicate format attribute #defines
  arm: pmu: Share user ABI format mechanism with SPE
  arm64: perf: Include threshold control fields in PMEVTYPER mask
  arm: perf: Convert remaining fields to use GENMASK
  arm: perf: Use GENMASK for PMMIR fields
  arm: perf/kvm: Use GENMASK for ARMV8_PMU_PMCR_N
  arm: perf: Remove inlines from arm_pmuv3.c
  drivers/perf: arm_dsu_pmu: Remove kerneldoc-style comment syntax
  drivers/perf: Remove usage of the deprecated ida_simple_xx() API
  ...
parents 3b47bd8f bb339db4
+94 −0
======================================================================
Synopsys DesignWare Cores (DWC) PCIe Performance Monitoring Unit (PMU)
======================================================================

DesignWare Cores (DWC) PCIe PMU
===============================

The PMU is a PCIe configuration space register block provided by each PCIe Root
Port in a Vendor-Specific Extended Capability named RAS D.E.S (Debug, Error
injection, and Statistics).

As the name indicates, the RAS DES capability supports system level
debugging, AER error injection, and collection of statistics. To facilitate
collection of statistics, Synopsys DesignWare Cores PCIe controller
provides the following two features:

- one 64-bit counter for Time Based Analysis (RX/TX data throughput and
  time spent in each low-power LTSSM state) and
- one 32-bit counter for Event Counting (error and non-error events for
  a specified lane)

Note: There is no interrupt for counter overflow.

Time Based Analysis
-------------------

Using this feature you can obtain information regarding RX/TX data
throughput and time spent in each low-power LTSSM state by the controller.
The PMU measures data in two categories:

- Group#0: Percentage of time the controller stays in LTSSM states.
- Group#1: Amount of data processed (Units of 16 bytes).

Lane Event counters
-------------------

Using this feature you can obtain error and non-error information for a
specific lane of the controller. The PMU event is selected by all of:

- Group i
- Event j within the Group i
- Lane k

Some of the events only exist for specific configurations.

DesignWare Cores (DWC) PCIe PMU Driver
=======================================

This driver adds a PMU device for each PCIe Root Port, named after the BDF of
the Root Port. For example, for the Root Port

    30:03.0 PCI bridge: Device 1ded:8000 (rev 01)

the PMU device name is dwc_rootport_3018.
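
The name suffix above is consistent with the conventional 16-bit PCI BDF
encoding (bus in bits 15:8, device in bits 7:3, function in bits 2:0) printed
in hexadecimal. A minimal sketch under that assumption; the helper name
rootport_pmu_name is hypothetical and not part of the driver, and PCI domains
are ignored:

.. code-block:: python

    def rootport_pmu_name(bdf: str) -> str:
        """Map an lspci-style "bus:dev.fn" string to the expected PMU name."""
        bus, rest = bdf.split(":")
        dev, fn = rest.split(".")
        # Assumed encoding: bus[15:8] | device[7:3] | function[2:0].
        encoded = (int(bus, 16) << 8) | (int(dev, 16) << 3) | int(fn, 16)
        return f"dwc_rootport_{encoded:x}"

    print(rootport_pmu_name("30:03.0"))  # dwc_rootport_3018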

The DWC PCIe PMU driver registers a perf PMU driver, which provides a
description of the available events and configuration options in sysfs, see
/sys/bus/event_source/devices/dwc_rootport_{bdf}.

The "format" directory describes the format of the config fields of the
perf_event_attr structure. The "events" directory provides configuration
templates for all documented events. For example,
"Rx_PCIe_TLP_Data_Payload" is equivalent to "eventid=0x22,type=0x1".

The "perf list" command lists the available events from sysfs, e.g.::

    $# perf list | grep dwc_rootport
    <...>
    dwc_rootport_3018/Rx_PCIe_TLP_Data_Payload/        [Kernel PMU event]
    <...>
    dwc_rootport_3018/rx_memory_read,lane=?/               [Kernel PMU event]
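
The event templates listed above come from the "events" directory of the PMU
in sysfs. A minimal sketch that reads them directly; the function name and the
choice to skip auxiliary files such as "<event>.unit" are assumptions of this
example, not something the driver documentation specifies:

.. code-block:: python

    from pathlib import Path

    def read_event_templates(pmu_dir):
        """Return {event name: config template} read from <pmu_dir>/events."""
        templates = {}
        for entry in sorted(Path(pmu_dir, "events").iterdir()):
            # Skip auxiliary attributes whose names contain a dot.
            if entry.is_file() and "." not in entry.name:
                templates[entry.name] = entry.read_text().strip()
        return templates

On a system with this PMU, calling it with
"/sys/bus/event_source/devices/dwc_rootport_3018" would be expected to return
entries such as "Rx_PCIe_TLP_Data_Payload" mapped to their template strings.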

Time Based Analysis Event Usage
-------------------------------

Example usage of counting PCIe RX TLP data payload (Units of bytes)::

    $# perf stat -a -e dwc_rootport_3018/Rx_PCIe_TLP_Data_Payload/

The average RX/TX bandwidth can be calculated using the following formula::

    PCIe RX Bandwidth = Rx_PCIe_TLP_Data_Payload / Measure_Time_Window
    PCIe TX Bandwidth = Tx_PCIe_TLP_Data_Payload / Measure_Time_Window
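
The formula above can be worked through as follows. A minimal sketch, assuming
the counter value is already expressed in bytes (as in the "perf stat" example)
and the measurement window is given in seconds; the function name is
hypothetical and the numbers are illustrative only:

.. code-block:: python

    def pcie_bandwidth(payload_bytes, window_seconds):
        """Average bandwidth in bytes per second over the measurement window."""
        if window_seconds <= 0:
            raise ValueError("measurement window must be positive")
        return payload_bytes / window_seconds

    # Illustrative numbers: 3,000,000,000 bytes counted over 2 seconds.
    rx = pcie_bandwidth(3_000_000_000, 2.0)
    print(f"{rx / 1e9:.2f} GB/s")  # 1.50 GB/s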

Lane Event Usage
-------------------------------

Each lane has the same event set. To avoid generating a list of hundreds
of events, the user needs to specify the lane ID explicitly, e.g.::

    $# perf stat -a -e dwc_rootport_3018/rx_memory_read,lane=4/

The driver does not support sampling, therefore "perf record" will not
work. Per-task (without "-a") perf sessions are not supported.
+37 −8
@@ -13,8 +13,8 @@ is one register for each counter. Counter 0 is special in that it always counts
interrupt is raised. If any other counter overflows, it continues counting, and
no interrupt is raised.

The "format" directory describes format of the config (event ID) and config1
(AXI filtering) fields of the perf_event_attr structure, see /sys/bus/event_source/
The "format" directory describes format of the config (event ID) and config1/2
(AXI filter setting) fields of the perf_event_attr structure, see /sys/bus/event_source/
devices/imx8_ddr0/format/. The "events" directory describes the event types
supported by the hardware that can be used with the perf tool, see /sys/bus/event_source/
devices/imx8_ddr0/events/. The "caps" directory describes filter features implemented
@@ -28,12 +28,11 @@ in DDR PMU, see /sys/bus/events_source/devices/imx8_ddr0/caps/.
AXI filtering is only used by CSV modes 0x41 (axid-read) and 0x42 (axid-write)
to count reads or writes that match the filter setting. The filter setting varies
between DRAM controller implementations, which are distinguished by quirks
in the driver. You also can dump info from userspace, filter in "caps" directory
indicates whether PMU supports AXI ID filter or not; enhanced_filter indicates
whether PMU supports enhanced AXI ID filter or not. Value 0 for un-supported, and
value 1 for supported.
in the driver. You can also dump this information from userspace: the "caps" directory
shows the type of AXI filter (filter, enhanced_filter and super_filter). A value of 0
means unsupported, and a value of 1 means supported.

* With DDR_CAP_AXI_ID_FILTER quirk(filter: 1, enhanced_filter: 0).
* With DDR_CAP_AXI_ID_FILTER quirk(filter: 1, enhanced_filter: 0, super_filter: 0).
  Filter is defined with two configuration parts:
  --AXI_ID defines AxID matching value.
  --AXI_MASKING defines which bits of AxID are meaningful for the matching.
@@ -65,7 +64,37 @@ value 1 for supported.

        perf stat -a -e imx8_ddr0/axid-read,axi_id=0x12/ cmd, which will monitor ARID=0x12

* With DDR_CAP_AXI_ID_FILTER_ENHANCED quirk(filter: 1, enhanced_filter: 1).
* With DDR_CAP_AXI_ID_FILTER_ENHANCED quirk(filter: 1, enhanced_filter: 1, super_filter: 0).
  This is an extension to the DDR_CAP_AXI_ID_FILTER quirk which permits
  counting the number of bytes (as opposed to the number of bursts) from DDR
  read and write transactions concurrently with another set of data counters.

* With DDR_CAP_AXI_ID_PORT_CHANNEL_FILTER quirk(filter: 0, enhanced_filter: 0, super_filter: 1).
  The previous AXI filter has a limitation: it cannot filter different IDs at
  the same time, because the filter is shared between counters. This quirk
  extends the AXI ID filter. One improvement is that counters 1-3 each have
  their own filter, so different IDs can be filtered concurrently. Another
  improvement is that counters 1-3 support AXI PORT and CHANNEL selection,
  that is, selecting either the address channel or the data channel.

  Filter is defined with 2 configuration registers per counter 1-3.
  --Counter N MASK COMP register - including AXI_ID and AXI_MASKING.
  --Counter N MUX CNTL register - including AXI CHANNEL and AXI PORT.

      - 0: address channel
      - 1: data channel

  The PMU in the DDR subsystem has only a single port (port0), so axi_port is
  reserved and should be 0.

  .. code-block:: bash

      perf stat -a -e imx8_ddr0/axid-read,axi_mask=0xMMMM,axi_id=0xDDDD,axi_channel=0xH/ cmd
      perf stat -a -e imx8_ddr0/axid-write,axi_mask=0xMMMM,axi_id=0xDDDD,axi_channel=0xH/ cmd

  .. note::

      axi_channel is inverted in userspace, and the driver reverts it
      automatically. Users therefore do not need to specify axi_channel to
      monitor the data channel of DDR transactions, which is the more
      meaningful of the two.
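
The relationship between the userspace axi_channel value and the register
encoding (0: address channel, 1: data channel) described in the note above can
be sketched as follows. This assumes the inversion is a plain flip of a
single bit; both helper names are hypothetical and only illustrate how an event
string for the command template could be assembled:

.. code-block:: python

    def axi_channel_register_value(user_value=0):
        """Register encoding assumed to result from a userspace axi_channel."""
        if user_value not in (0, 1):
            raise ValueError("axi_channel must be 0 or 1")
        # Assumed single-bit inversion: userspace 0 selects the data channel (1).
        return user_value ^ 1

    def axid_event(pmu, event, axi_id, axi_mask, axi_channel=0):
        """Build a perf event string matching the command template above."""
        return (f"{pmu}/{event},axi_mask={axi_mask:#x},"
                f"axi_id={axi_id:#x},axi_channel={axi_channel:#x}/")

    print(axid_event("imx8_ddr0", "axid-read", axi_id=0x12, axi_mask=0x0))
    # imx8_ddr0/axid-read,axi_mask=0x0,axi_id=0x12,axi_channel=0x0/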
+1 −0
@@ -19,6 +19,7 @@ Performance monitor support
   arm_dsu_pmu
   thunderx2-pmu
   alibaba_pmu
   dwc_pcie_pmu
   nvidia-pmu
   meson-ddr-pmu
   cxl
+72 −0
@@ -164,3 +164,75 @@ and should be used to mask the upper bits as needed.
   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/arch/arm64/tests/user-events.c
.. _tools/lib/perf/tests/test-evsel.c:
   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/lib/perf/tests/test-evsel.c

Event Counting Threshold
==========================================

Overview
--------

FEAT_PMUv3_TH (Armv8.8) permits a PMU counter to increment only on
events whose count meets a specified threshold condition. For example, if
threshold_compare is set to 2 ('Greater than or equal'), and the
threshold is set to 2, then the PMU counter will only increment when an
event would have previously incremented the PMU counter by 2 or more on
a single processor cycle.

To increment by 1 after passing the threshold condition instead of the
number of events on that cycle, add the 'threshold_count' option to the
command line.

How-to
------

These are the parameters for controlling the feature:

.. list-table::
   :header-rows: 1

   * - Parameter
     - Description
   * - threshold
     - Value to threshold the event by. A value of 0 means that
       thresholding is disabled and the other parameters have no effect.
   * - threshold_compare
     - | Comparison function to use, with the following values supported:
       |
       | 0: Not-equal
       | 1: Equals
       | 2: Greater-than-or-equal
       | 3: Less-than
   * - threshold_count
     - If this is set, count by 1 after passing the threshold condition
       instead of the value of the event on this cycle.

The threshold, threshold_compare and threshold_count values can be
provided per event, for example:

.. code-block:: sh

  perf stat -e stall_slot/threshold=2,threshold_compare=2/ \
            -e dtlb_walk/threshold=10,threshold_compare=3,threshold_count/

In this example the stall_slot event will count by 2 or more on every
cycle where 2 or more stalls happen, and dtlb_walk will count by 1 on
every cycle where the number of dtlb walks was less than 10.
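
The counting rules from the parameter table can be modelled in software to
check one's understanding. The following is a minimal sketch of those rules,
not a description of the hardware implementation; the function name and the
per-cycle event lists are made up for illustration:

.. code-block:: python

    import operator

    # threshold_compare encodings from the parameter table.
    COMPARE = {0: operator.ne, 1: operator.eq, 2: operator.ge, 3: operator.lt}

    def counter_increment(events, threshold=0, threshold_compare=0,
                          threshold_count=False):
        """Amount a counter advances on one cycle that saw `events` events."""
        if threshold == 0:
            return events  # thresholding disabled, other parameters ignored
        if not COMPARE[threshold_compare](events, threshold):
            return 0
        return 1 if threshold_count else events

    stalls = [0, 1, 2, 3]  # hypothetical stall_slot events per cycle
    print(sum(counter_increment(n, 2, 2) for n in stalls))  # 5

    walks = [0, 5, 12]  # hypothetical dtlb_walk events per cycle
    print(sum(counter_increment(n, 10, 3, True) for n in walks))  # 2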

The maximum supported threshold value can be read from the caps of each
PMU, for example:

.. code-block:: sh

  cat /sys/bus/event_source/devices/armv8_pmuv3/caps/threshold_max

  0x000000ff

If a value higher than this is given, then opening the event will result
in an error. The highest possible maximum is 4095, as the config field
for threshold is limited to 12 bits, and the Perf tool will refuse to
parse higher values.

If the PMU doesn't support FEAT_PMUv3_TH, then threshold_max will read
0, and attempting to set a threshold value will also result in an error.
threshold_max will also read as 0 on aarch32 guests, even if the host
is running on hardware with the feature.
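
The limits described above can be checked before opening an event. A minimal
sketch, assuming the text read from the threshold_max caps file is passed in as
a string; the function name is hypothetical and the checks only mirror the
rules stated in this section:

.. code-block:: python

    THRESHOLD_FIELD_MAX = 4095  # the threshold config field is 12 bits wide

    def validate_threshold(threshold, threshold_max_text):
        """Return `threshold` if it is usable, otherwise raise ValueError."""
        threshold_max = int(threshold_max_text.strip(), 16)
        if not 0 <= threshold <= THRESHOLD_FIELD_MAX:
            raise ValueError("threshold does not fit in the 12-bit field")
        if threshold > threshold_max:
            # Also covers threshold_max == 0, i.e. no FEAT_PMUv3_TH support.
            raise ValueError("threshold exceeds threshold_max of this PMU")
        return threshold

    print(validate_threshold(2, "0x000000ff"))  # 2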
+3 −0
@@ -27,6 +27,9 @@ properties:
              - fsl,imx8mq-ddr-pmu
              - fsl,imx8mp-ddr-pmu
          - const: fsl,imx8m-ddr-pmu
      - items:
          - const: fsl,imx8dxl-ddr-pmu
          - const: fsl,imx8-ddr-pmu

  reg:
    maxItems: 1