Commit 9ad09c4f authored by Linus Torvalds's avatar Linus Torvalds
Browse files
Pull arm64 updates from Will Deacon:
 "We've got a little less than normal thanks to the holidays in
  December, but there's the usual summary below. The highlight is
  probably the 52-bit physical addressing (LPA2) clean-up from Ard.

  Confidential Computing:

   - Register a platform device when running in CCA realm mode to enable
     automatic loading of dependent modules

  CPU Features:

   - Update a bunch of system register definitions to pick up new field
     encodings from the architectural documentation

   - Add hwcaps and selftests for the new (2024) dpISA extensions

  Documentation:

   - Update EL3 (firmware) requirements for booting Linux on modern
     arm64 designs

   - Remove stale information about the kernel virtual memory map

  Miscellaneous:

   - Minor cleanups and typo fixes

  Memory management:

   - Fix vmemmap_check_pmd() to look at the PMD type bits

   - LPA2 (52-bit physical addressing) cleanups and minor fixes

   - Adjust physical address space depending upon whether or not LPA2 is
     enabled

  Perf and PMUs:

   - Add port filtering support for NVIDIA's NVLINK-C2C Coresight PMU

   - Extend AXI filtering support for the DDR PMU on NXP IMX SoCs

   - Fix Designware PCIe PMU event numbering

   - Add generic branch events for the Apple M1 CPU PMU

   - Add support for Marvell Odyssey DDR and LLC-TAD PMUs

   - Cleanups to the Hisilicon DDRC and Uncore PMU code

   - Advertise discard mode for the SPE PMU

   - Add the perf users mailing list to our MAINTAINERS entry"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (64 commits)
  Documentation: arm64: Remove stale and redundant virtual memory diagrams
  perf docs: arm_spe: Document new discard mode
  perf: arm_spe: Add format option for discard mode
  MAINTAINERS: Add perf list for drivers/perf/
  arm64: Remove duplicate included header
  drivers/perf: apple_m1: Map generic branch events
  arm64: rsi: Add automatic arm-cca-guest module loading
  kselftest/arm64: Add 2024 dpISA extensions to hwcap test
  KVM: arm64: Allow control of dpISA extensions in ID_AA64ISAR3_EL1
  arm64/hwcap: Describe 2024 dpISA extensions to userspace
  arm64/sysreg: Update ID_AA64SMFR0_EL1 to DDI0601 2024-12
  arm64: Filter out SVE hwcaps when FEAT_SVE isn't implemented
  drivers/perf: hisi: Set correct IRQ affinity for PMUs with no association
  arm64/sme: Move storage of reg_smidr to __cpuinfo_store_cpu()
  arm64: mm: Test for pmd_sect() in vmemmap_check_pmd()
  arm64/mm: Replace open encodings with PXD_TABLE_BIT
  arm64/mm: Rename pte_mkpresent() as pte_mkvalid()
  arm64/sysreg: Update ID_AA64ISAR2_EL1 to DDI0601 2024-09
  arm64/sysreg: Update ID_AA64ZFR0_EL1 to DDI0601 2024-09
  arm64/sysreg: Update ID_AA64FPFR0_EL1 to DDI0601 2024-09
  ...
parents e7244cc3 1dd33936
Loading
Loading
Loading
Loading
+3 −3
Original line number Diff line number Diff line
@@ -60,7 +60,7 @@ description of available events and configuration options in sysfs, see
The "format" directory describes format of the config fields of the
perf_event_attr structure. The "events" directory provides configuration
templates for all documented events.  For example,
"Rx_PCIe_TLP_Data_Payload" is an equivalent of "eventid=0x22,type=0x1".
"rx_pcie_tlp_data_payload" is an equivalent of "eventid=0x21,type=0x0".

The "perf list" command shall list the available events from sysfs, e.g.::

@@ -79,8 +79,8 @@ Example usage of counting PCIe RX TLP data payload (Units of bytes)::

The average RX/TX bandwidth can be calculated using the following formula:

    PCIe RX Bandwidth = Rx_PCIe_TLP_Data_Payload / Measure_Time_Window
    PCIe TX Bandwidth = Tx_PCIe_TLP_Data_Payload / Measure_Time_Window
    PCIe RX Bandwidth = rx_pcie_tlp_data_payload / Measure_Time_Window
    PCIe TX Bandwidth = tx_pcie_tlp_data_payload / Measure_Time_Window

Lane Event Usage
-------------------------------
+4 −1
Original line number Diff line number Diff line
@@ -35,7 +35,10 @@ e.g. hisi_sccl1_hha0/rx_operations is RX_OPERATIONS event of HHA index #0 in
SCCL ID #1.

The driver also provides a "cpumask" sysfs attribute, which shows the CPU core
ID used to count the uncore PMU event.
ID used to count the uncore PMU event. An "associated_cpus" sysfs attribute is
also provided to show the CPUs associated with this PMU. The "cpumask" indicates
the CPUs to open the events, usually as a hint for userspaces tools like perf.
It only contains one associated CPU from the "associated_cpus".

Example usage of perf::

+2 −0
Original line number Diff line number Diff line
@@ -14,6 +14,8 @@ Performance monitor support
   qcom_l2_pmu
   qcom_l3_pmu
   starfive_starlink_pmu
   mrvl-odyssey-ddr-pmu
   mrvl-odyssey-tad-pmu
   arm-ccn
   arm-cmn
   arm-ni
+80 −0
Original line number Diff line number Diff line
===================================================================
Marvell Odyssey DDR PMU Performance Monitoring Unit (PMU UNCORE)
===================================================================

Odyssey DRAM Subsystem supports eight counters for monitoring performance
and software can program those counters to monitor any of the defined
performance events. Supported performance events include those counted
at the interface between the DDR controller and the PHY, interface between
the DDR Controller and the CHI interconnect, or within the DDR Controller.

Additionally DSS also supports two fixed performance event counters, one
for ddr reads and the other for ddr writes.

The counter will be operating in either manual or auto mode.

The PMU driver exposes the available events and format options under sysfs::

        /sys/bus/event_source/devices/mrvl_ddr_pmu_<>/events/
        /sys/bus/event_source/devices/mrvl_ddr_pmu_<>/format/

Examples::

        $ perf list | grep ddr
        mrvl_ddr_pmu_<>/ddr_act_bypass_access/   [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_bsm_alloc/           [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_bsm_starvation/      [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_cam_active_access/   [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_cam_mwr/             [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_cam_rd_active_access/ [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_cam_rd_or_wr_access/ [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_cam_read/            [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_cam_wr_access/       [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_cam_write/           [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_capar_error/         [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_crit_ref/            [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_ddr_reads/           [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_ddr_writes/          [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_dfi_cmd_is_retry/    [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_dfi_cycles/          [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_dfi_parity_poison/   [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_dfi_rd_data_access/  [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_dfi_wr_data_access/  [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_dqsosc_mpc/          [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_dqsosc_mrr/          [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_enter_mpsm/          [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_enter_powerdown/     [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_enter_selfref/       [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_hif_pri_rdaccess/    [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_hif_rd_access/       [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_hif_rd_or_wr_access/ [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_hif_rmw_access/      [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_hif_wr_access/       [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_hpri_sched_rd_crit_access/ [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_load_mode/           [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_lpri_sched_rd_crit_access/ [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_precharge/           [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_precharge_for_other/ [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_precharge_for_rdwr/  [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_raw_hazard/          [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_rd_bypass_access/    [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_rd_crc_error/        [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_rd_uc_ecc_error/     [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_rdwr_transitions/    [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_refresh/             [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_retry_fifo_full/     [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_spec_ref/            [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_tcr_mrr/             [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_war_hazard/          [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_waw_hazard/          [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_win_limit_reached_rd/ [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_win_limit_reached_wr/ [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_wr_crc_error/        [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_wr_trxn_crit_access/ [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_write_combine/       [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_zqcl/                [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_zqlatch/             [Kernel PMU event]
        mrvl_ddr_pmu_<>/ddr_zqstart/             [Kernel PMU event]

        $ perf stat -e ddr_cam_read,ddr_cam_write,ddr_cam_active_access,ddr_cam
          rd_or_wr_access,ddr_cam_rd_active_access,ddr_cam_mwr <workload>
+37 −0
Original line number Diff line number Diff line
====================================================================
Marvell Odyssey LLC-TAD Performance Monitoring Unit (PMU UNCORE)
====================================================================

Each TAD provides eight 64-bit counters for monitoring
cache behavior.The driver always configures the same counter for
all the TADs. The user would end up effectively reserving one of
eight counters in every TAD to look across all TADs.
The occurrences of events are aggregated and presented to the user
at the end of running the workload. The driver does not provide a
way for the user to partition TADs so that different TADs are used for
different applications.

The performance events reflect various internal or interface activities.
By combining the values from multiple performance counters, cache
performance can be measured in terms such as: cache miss rate, cache
allocations, interface retry rate, internal resource occupancy, etc.

The PMU driver exposes the available events and format options under sysfs::

        /sys/bus/event_source/devices/tad/events/
        /sys/bus/event_source/devices/tad/format/

Examples::

   $ perf list | grep tad
        tad/tad_alloc_any/                                 [Kernel PMU event]
        tad/tad_alloc_dtg/                                 [Kernel PMU event]
        tad/tad_alloc_ltg/                                 [Kernel PMU event]
        tad/tad_hit_any/                                   [Kernel PMU event]
        tad/tad_hit_dtg/                                   [Kernel PMU event]
        tad/tad_hit_ltg/                                   [Kernel PMU event]
        tad/tad_req_msh_in_exlmn/                          [Kernel PMU event]
        tad/tad_tag_rd/                                    [Kernel PMU event]
        tad/tad_tot_cycle/                                 [Kernel PMU event]

   $ perf stat -e tad_alloc_dtg,tad_alloc_ltg,tad_alloc_any,tad_hit_dtg,tad_hit_ltg,tad_hit_any,tad_tag_rd <workload>
Loading