Commit 3349ada3 authored by Linus Torvalds
Pull powerpc updates from Madhavan Srinivasan:

 - Support for dynamic preemption

 - Migrate powerpc boards GPIO driver to new setter API

 - Added new PMU for KVM host-wide measurement

 - Enhancement to htmdump driver to support more functions

 - Added character device for couple RTAS supported APIs

 - Minor fixes and cleanup

Thanks to Amit Machhiwal, Athira Rajeev, Bagas Sanjaya, Bartosz
Golaszewski, Christophe Leroy, Eddie James, Gaurav Batra, Gautam
Menghani, Geert Uytterhoeven, Haren Myneni, Hari Bathini, Jiri Slaby
(SUSE), Linus Walleij, Michal Suchanek, Naveen N Rao (AMD), Nilay
Shroff, Ricardo B. Marlière, Ritesh Harjani (IBM), Sathvika Vasireddy,
Shrikanth Hegde, Stephen Rothwell, Sourabh Jain, Thorsten Blum, Vaibhav
Jain, Venkat Rao Bagalkote, and Viktor Malik.

* tag 'powerpc-6.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (52 commits)
  MAINTAINERS: powerpc: Remove myself as a reviewer
  powerpc/iommu: Use str_disabled_enabled() helper
  powerpc/powermac: Use str_enabled_disabled() and str_on_off() helpers
  powerpc/mm/fault: Use str_write_read() helper function
  powerpc: Replace strcpy() with strscpy() in proc_ppc64_init()
  powerpc/pseries/iommu: Fix kmemleak in TCE table userspace view
  powerpc/kernel: Fix ppc_save_regs inclusion in build
  powerpc: Transliterate author name and remove FIXME
  powerpc/pseries/htmdump: Include header file to get is_kvm_guest() definition
  KVM: PPC: Book3S HV: Fix IRQ map warnings with XICS on pSeries KVM Guest
  powerpc/8xx: Reduce alignment constraint for kernel memory
  powerpc/boot: Fix build with gcc 15
  powerpc/pseries/htmdump: Add documentation for H_HTM debugfs interface
  powerpc/pseries/htmdump: Add htm capabilities support to htmdump module
  powerpc/pseries/htmdump: Add htm flags support to htmdump module
  powerpc/pseries/htmdump: Add htm setup support to htmdump module
  powerpc/pseries/htmdump: Add htm info support to htmdump module
  powerpc/pseries/htmdump: Add htm status support to htmdump module
  powerpc/pseries/htmdump: Add htm start support to htmdump module
  powerpc/pseries/htmdump: Add htm configure support to htmdump module
  ...
parents d8cb0683 8682a574
+104 −0
.. SPDX-License-Identifier: GPL-2.0
.. _htm:

===================================
HTM (Hardware Trace Macro)
===================================

Athira Rajeev, 2 Mar 2025

.. contents::
    :depth: 3


Basic overview
==============

H_HTM is the hcall interface for executing Hardware Trace Macro (HTM)
functions, including setup, configuration, control and dumping of the HTM data.
Using HTM requires setting up HTM buffers first; HTM operations are then
controlled using the H_HTM hcall. The hcall can be invoked for any core/chip
of the system from within a partition. This feature is exposed through a
debugfs folder called "htmdump" under /sys/kernel/debug/powerpc.


HTM debugfs example usage
=========================

.. code-block:: sh

  #  ls /sys/kernel/debug/powerpc/htmdump/
  coreindexonchip  htmcaps  htmconfigure  htmflags  htminfo  htmsetup
  htmstart  htmstatus  htmtype  nodalchipindex  nodeindex  trace

Details on each file:

* nodeindex, nodalchipindex, coreindexonchip: specify which partition to configure the HTM for.
* htmtype: specifies the type of HTM. The supported target is hardwareTarget.
* trace: used to read the HTM data.
* htmconfigure: configures/deconfigures the HTM. Writing 1 to the file configures the trace; writing 0 deconfigures it.
* htmstart: starts/stops the HTM. Writing 1 to the file starts the tracing; writing 0 stops it.
* htmstatus: reports the status of the HTM. This is needed to understand the HTM state after each operation.
* htmsetup: sets the HTM buffer size. The value written is the size as a power of 2.
* htminfo: provides the system processor configuration details. This is needed to determine the appropriate values for nodeindex, nodalchipindex and coreindexonchip.
* htmcaps: provides the HTM capabilities, such as the minimum/maximum buffer size and the kinds of tracing the HTM supports.
* htmflags: allows flags to be passed to the hcall. Currently supports controlling the wrapping of the HTM buffer.
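
Because htmsetup takes the buffer size as a power of 2, it can help to
check how many bytes a given exponent selects before writing it. The
following is only a minimal sketch: the helper name htm_buf_bytes is
hypothetical and is not part of the htmdump interface.

.. code-block:: sh

  # Hypothetical helper: print the buffer size in bytes that is
  # selected by writing exponent $1 to htmsetup (size = 2^exponent).
  htm_buf_bytes() {
      echo $((1 << $1))
  }

  htm_buf_bytes 33    # prints 8589934592, i.e. 8GB

The value can then be compared against the minimum/maximum buffer size
reported by htmcaps.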

To see the system processor configuration details:

.. code-block:: sh

  # cat /sys/kernel/debug/powerpc/htmdump/htminfo > htminfo_file

The result can be interpreted using hexdump.

To collect HTM traces for a partition represented by nodeindex 0,
nodalchipindex 1 and coreindexonchip 12, first set the HTM type and
the buffer size:

.. code-block:: sh

  # cd /sys/kernel/debug/powerpc/htmdump/
  # echo 2 > htmtype
  # echo 33 > htmsetup    # Set 8GB of memory for the HTM buffer; the number is the size as a power of 2

This requires a CEC reboot to get the HTM buffers allocated.

.. code-block:: sh

  # cd /sys/kernel/debug/powerpc/htmdump/
  # echo 2 > htmtype
  # echo 0 > nodeindex
  # echo 1 > nodalchipindex
  # echo 12 > coreindexonchip
  # echo 1 > htmflags     # to set noWrap for HTM buffers
  # echo 1 > htmconfigure # Configure the HTM
  # echo 1 > htmstart     # Start the HTM
  # echo 0 > htmstart     # Stop the HTM
  # echo 0 > htmconfigure # Deconfigure the HTM
  # cat htmstatus         # Dump the status of HTM entries as data

The commands above set the htmtype and core details, and then execute the respective HTM operations.
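
The same sequence can be wrapped in a small script so that the trace is
always stopped once the workload finishes. This is a sketch under stated
assumptions, not part of the kernel interface: the function names
htm_write and htm_trace and the HTM_DIR variable are hypothetical, and
htmflags is left unchanged.

.. code-block:: sh

  # Directory holding the htmdump control files; can be overridden,
  # for example to do a dry run against a scratch directory.
  HTM_DIR="${HTM_DIR:-/sys/kernel/debug/powerpc/htmdump}"

  # htm_write FILE VALUE: write VALUE to one control file.
  htm_write() {
      echo "$2" > "$HTM_DIR/$1" || {
          echo "htm: writing $2 to $1 failed" >&2
          return 1
      }
  }

  # htm_trace NODE CHIP CORE [COMMAND...]: select the target, configure
  # and start the HTM, run the workload, then stop the HTM.
  htm_trace() {
      htm_write htmtype 2 &&
      htm_write nodeindex "$1" &&
      htm_write nodalchipindex "$2" &&
      htm_write coreindexonchip "$3" || return 1
      shift 3
      htm_write htmconfigure 1 || return 1
      htm_write htmstart 1 || return 1
      "$@"                          # workload of interest
      rc=$?
      htm_write htmstart 0 || return 1
      return "$rc"
  }

With these definitions, running "htm_trace 0 1 12 sleep 10" as root would
trace the target used in the example above while "sleep 10" runs. Checking
htmstatus, reading the trace file and deconfiguring the HTM are left to
the caller.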

Read the HTM trace data
========================

After starting the trace collection, run the workload
of interest. Stop the trace collection after the required period
of time, and then read the trace file.

.. code-block:: sh

  # cat /sys/kernel/debug/powerpc/htmdump/trace > trace_file

This trace file will contain the relevant instruction traces
collected during the workload execution. It can be used as an
input file for trace decoders to interpret the data.
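
When the capture is scripted, it can be useful to notice an empty read
before handing the file to a decoder. The helper below is a minimal
sketch: htm_save_trace and HTM_DIR are hypothetical names, not part of
the htmdump interface.

.. code-block:: sh

  # Directory holding the htmdump files; can be overridden for testing.
  HTM_DIR="${HTM_DIR:-/sys/kernel/debug/powerpc/htmdump}"

  # htm_save_trace OUTFILE: copy the trace data to OUTFILE and fail
  # if nothing was read.
  htm_save_trace() {
      cat "$HTM_DIR/trace" > "$1" || return 1
      if [ -s "$1" ]; then
          echo "saved $(wc -c < "$1") bytes to $1"
      else
          echo "htm: no trace data read" >&2
          return 1
      fi
  }

For example, "htm_save_trace trace_file" produces the same trace_file as
the cat command above, but returns a non-zero status if the file is empty.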

Benefits of using HTM debugfs interface
=======================================

It is now possible to collect traces for a particular core/chip
from within any partition of the system and decode them. With
this support, a small partition can be dedicated to collecting the
trace data and analyzing it to provide important information for
performance analysis, software tuning, or hardware debug.
+30 −10
@@ -208,13 +208,9 @@ associated values for each ID in the GSB::
      flags:
         Bit 0: getGuestWideState: Request state of the Guest instead
           of an individual VCPU.
-         Bit 1: takeOwnershipOfVcpuState Indicate the L1 is taking
-           over ownership of the VCPU state and that the L0 can free
-           the storage holding the state. The VCPU state will need to
-           be returned to the Hypervisor via H_GUEST_SET_STATE prior
-           to H_GUEST_RUN_VCPU being called for this VCPU. The data
-           returned in the dataBuffer is in a Hypervisor internal
-           format.
+         Bit 1: getHostWideState: Request stats of the Host. This causes
+           the guestId and vcpuId parameters to be ignored and attempting
+           to get the VCPU/Guest state will cause an error.
         Bits 2-63: Reserved
      guestId: ID obtained from H_GUEST_CREATE
      vcpuId: ID of the vCPU pass to H_GUEST_CREATE_VCPU
@@ -406,8 +402,9 @@ the partition like the timebase offset and partition scoped page
table information.

+--------+-------+----+--------+----------------------------------+
-|   ID   | Size  | RW | Thread | Details                          |
-|        | Bytes |    | Guest  |                                  |
+|   ID   | Size  | RW |(H)ost  | Details                          |
+|        | Bytes |    |(G)uest |                                  |
+|        |       |    |(T)hread|                                  |
+|        |       |    |Scope   |                                  |
+========+=======+====+========+==================================+
| 0x0000 |       | RW |   TG   | NOP element                      |
@@ -434,6 +431,29 @@ table information.
|        |       |    |        |- 0x8 Table size.                 |
+--------+-------+----+--------+----------------------------------+
| 0x0007-|       |    |        | Reserved                         |
+| 0x07FF |       |    |        |                                  |
++--------+-------+----+--------+----------------------------------+
+| 0x0800 | 0x08  | R  |   H    | Current usage in bytes of the    |
+|        |       |    |        | L0's Guest Management Space      |
+|        |       |    |        | for an L1-Lpar.                  |
++--------+-------+----+--------+----------------------------------+
+| 0x0801 | 0x08  | R  |   H    | Max bytes available in the       |
+|        |       |    |        | L0's Guest Management Space for  |
+|        |       |    |        | an L1-Lpar                       |
++--------+-------+----+--------+----------------------------------+
+| 0x0802 | 0x08  | R  |   H    | Current usage in bytes of the    |
+|        |       |    |        | L0's Guest Page Table Management |
+|        |       |    |        | Space for an L1-Lpar             |
++--------+-------+----+--------+----------------------------------+
+| 0x0803 | 0x08  | R  |   H    | Max bytes available in the L0's  |
+|        |       |    |        | Guest Page Table Management      |
+|        |       |    |        | Space for an L1-Lpar             |
++--------+-------+----+--------+----------------------------------+
+| 0x0804 | 0x08  | R  |   H    | Cumulative Reclaimed bytes from  |
+|        |       |    |        | L0 Guest's Page Table Management |
+|        |       |    |        | Space due to overcommit          |
++--------+-------+----+--------+----------------------------------+
+| 0x0805-|       |    |        | Reserved                         |
| 0x0BFF |       |    |        |                                  |
+--------+-------+----+--------+----------------------------------+
| 0x0C00 | 0x10  | RW |   T    |Run vCPU Input Buffer:            |
+6 −0
@@ -366,6 +366,12 @@ Code Seq# Include File Comments
                                                                     <mailto:linuxppc-dev>
0xB2  01-02  arch/powerpc/include/uapi/asm/papr-sysparm.h            powerpc/pseries system parameter API
                                                                     <mailto:linuxppc-dev>
+0xB2  03-05  arch/powerpc/include/uapi/asm/papr-indices.h            powerpc/pseries indices API
+                                                                     <mailto:linuxppc-dev>
+0xB2  06-07  arch/powerpc/include/uapi/asm/papr-platform-dump.h      powerpc/pseries Platform Dump API
+                                                                     <mailto:linuxppc-dev>
+0xB2  08     powerpc/include/uapi/asm/papr-physical-attestation.h    powerpc/pseries Physical Attestation API
+                                                                     <mailto:linuxppc-dev>
0xB3  00     linux/mmc/ioctl.h
0xB4  00-0F  linux/gpio.h                                            <mailto:linux-gpio@vger.kernel.org>
0xB5  00-0F  uapi/linux/rpmsg.h                                      <mailto:linux-remoteproc@vger.kernel.org>
+0 −1
@@ -13665,7 +13665,6 @@ M: Madhavan Srinivasan <maddy@linux.ibm.com>
M:	Michael Ellerman <mpe@ellerman.id.au>
R:	Nicholas Piggin <npiggin@gmail.com>
R:	Christophe Leroy <christophe.leroy@csgroup.eu>
-R:	Naveen N Rao <naveen@kernel.org>
L:	linuxppc-dev@lists.ozlabs.org
S:	Supported
W:	https://github.com/linuxppc/wiki/wiki
+6 −5
@@ -277,6 +277,7 @@ config PPC
	select HAVE_PERF_EVENTS_NMI		if PPC64
	select HAVE_PERF_REGS
	select HAVE_PERF_USER_STACK_DUMP
+	select HAVE_PREEMPT_DYNAMIC_KEY
	select HAVE_RETHOOK			if KPROBES
	select HAVE_REGS_AND_STACK_ACCESS_API
	select HAVE_RELIABLE_STACKTRACE
@@ -894,7 +895,7 @@ config DATA_SHIFT
	int "Data shift" if DATA_SHIFT_BOOL
	default 24 if STRICT_KERNEL_RWX && PPC64
	range 17 28 if (STRICT_KERNEL_RWX || DEBUG_PAGEALLOC || KFENCE) && PPC_BOOK3S_32
-	range 19 23 if (STRICT_KERNEL_RWX || DEBUG_PAGEALLOC || KFENCE) && PPC_8xx
+	range 14 23 if (STRICT_KERNEL_RWX || DEBUG_PAGEALLOC || KFENCE) && PPC_8xx
	range 20 24 if (STRICT_KERNEL_RWX || DEBUG_PAGEALLOC || KFENCE) && PPC_85xx
	default 22 if STRICT_KERNEL_RWX && PPC_BOOK3S_32
	default 18 if (DEBUG_PAGEALLOC || KFENCE) && PPC_BOOK3S_32
@@ -907,10 +908,10 @@ config DATA_SHIFT
	  On Book3S 32 (603+), DBATs are used to map kernel text and rodata RO.
	  Smaller is the alignment, greater is the number of necessary DBATs.

-	  On 8xx, large pages (512kb or 8M) are used to map kernel linear
-	  memory. Aligning to 8M reduces TLB misses as only 8M pages are used
-	  in that case. If PIN_TLB is selected, it must be aligned to 8M as
-	  8M pages will be pinned.
+	  On 8xx, large pages (16kb or 512kb or 8M) are used to map kernel
+	  linear memory. Aligning to 8M reduces TLB misses as only 8M pages
+	  are used in that case. If PIN_TLB is selected, it must be aligned
+	  to 8M as 8M pages will be pinned.

config ARCH_FORCE_MAX_ORDER
	int "Order of maximal physically contiguous allocations"