Commit 0939bd2f authored by Linus Torvalds's avatar Linus Torvalds
Browse files

Merge tag 'perf-tools-for-v6.16-1-2025-06-03' of...

Merge tag 'perf-tools-for-v6.16-1-2025-06-03' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools

Pull perf tools updates from Arnaldo Carvalho de Melo:
 "perf report/top/annotate TUI:

   - Accept the left arrow key as a Zoom out if done on the first column

   - Show if source code toggle status in title, to help spotting bugs
     with the various disassemblers (capstone, llvm, objdump)

   - Provide feedback on unhandled hotkeys

  Build:

   - Better inform when certain features are not available with warnings
     in the build process and in 'perf version --build-options' or 'perf -vv'

  perf record:

   - Improve the --off-cpu code by synthesizing events for switch-out ->
     switch-in intervals using a BPF program. This can be fine tuned
     using a --off-cpu-thresh knob

  perf report:

   - Add 'tgid' sort key

  perf mem/c2c:

   - Add 'op', 'cache', 'snoop', 'dtlb' output fields

   - Add support for 'ldlat' on AMD IBS (Instruction Based Sampling)

  perf ftrace:

   - Use process/session specific trace settings instead of messing with
     the global ftrace knobs

  perf trace:

   - Implement syscall summary in BPF

   - Support --summary-mode=cgroup

   - Always print return value for syscalls returning a pid

   - The rseq and set_robust_list don't return a pid, just -errno

  perf lock contention:

   - Symbolize zone->lock using BTF

   - Add -J/--inject-delay option to estimate impact on application
     performance by optimization of kernel locking behavior

  perf stat:

   - Improve hybrid support for the NMI watchdog warning

  Symbol resolution:

   - Handle 'u' and 'l' symbols in /proc/kallsyms, resolving some Rust
     symbols

   - Improve Rust demangler

  Hardware tracing:

  Intel PT:

   - Fix PEBS-via-PT data_src

   - Do not default to recording all switch events

   - Fix pattern matching with python3 on the SQL viewer script

  arm64:

   - Fixups for the hip08 hha PMU

  Vendor events:

   - Update Intel events/metrics files for alderlake, alderlaken,
     arrowlake, bonnell, broadwell, broadwellde, broadwellx,
     cascadelakex, clearwaterforest, elkhartlake, emeraldrapids,
     grandridge, graniterapids, haswell, haswellx, icelake, icelakex,
     ivybridge, ivytown, jaketown, lunarlake, meteorlake, nehalemep,
     nehalemex, rocketlake, sandybridge, sapphirerapids, sierraforest,
     skylake, skylakex, snowridgex, tigerlake, westmereep-dp,
     westmereep-sp, westmereep-sx

  python support:

   - Add support for event counts in the python binding, add a
     counting.py example

  perf list:

   - Display the PMU name associated with a perf metric in JSON

  perf test:

   - Hybrid improvements for metric value validation test

   - Fix LBR test by ignoring idle task

   - Add AMD IBS sw filter ana d'ldlat' tests

   - Add 'perf trace --summary-mode=cgroup' test

   - Add tests for the various language symbol demanglers

  Miscellaneous:

   - Allow specifying the cpu an event will be tied using '-e
     event/cpu=N/'

   - Sync various headers with the kernel sources

   - Add annotations to use clang's -Wthread-safety and fix some
     problems it detected

   - Make dump_stack() use perf's symbol resolution to provide better
     backtraces

   - Intel TPEBS support cleanups and fixes. TPEBS stands for Timed PEBS
     (Precision Event-Based Sampling), that adds timing info, the
     retirement latency of instructions

   - Various memory allocation (some detected by ASAN) and reference
     counting fixes

   - Add a 8-byte aligned PERF_RECORD_COMPRESSED2 to replace
     PERF_RECORD_COMPRESSED

   - Skip unsupported event types in perf.data files, don't stop when
     finding one

   - Improve lookups using hashmaps and binary searches"

* tag 'perf-tools-for-v6.16-1-2025-06-03' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (206 commits)
  perf callchain: Always populate the addr_location map when adding IP
  perf lock contention: Reject more than 10ms delays for safety
  perf trace: Set errpid to false for rseq and set_robust_list
  perf symbol: Move demangling code out of symbol-elf.c
  perf trace: Always print return value for syscalls returning a pid
  perf script: Print PERF_AUX_FLAG_COLLISION flag
  perf mem: Show absolute percent in mem_stat output
  perf mem: Display sort order only if it's available
  perf mem: Describe overhead calculation in brief
  perf record: Fix incorrect --user-regs comments
  Revert "perf thread: Ensure comm_lock held for comm_list"
  perf test trace_summary: Skip --bpf-summary tests if no libbpf
  perf test intel-pt: Skip jitdump test if no libelf
  perf intel-tpebs: Avoid race when evlist is being deleted
  perf test demangle-java: Don't segv if demangling fails
  perf symbol: Fix use-after-free in filename__read_build_id
  perf pmu: Avoid segv for missing name/alias_name in wildcarding
  perf machine: Factor creating a "live" machine out of dwarf-unwind
  perf test: Add AMD IBS sw filter test
  perf mem: Count L2 HITM for c2c statistic
  ...
parents 70087d22 a913ef6f
Loading
Loading
Loading
Loading
+1 −0
Original line number Diff line number Diff line
@@ -10860,6 +10860,7 @@ W: http://www.hisilicon.com
F:	Documentation/admin-guide/perf/hisi-pcie-pmu.rst
F:	Documentation/admin-guide/perf/hisi-pmu.rst
F:	drivers/perf/hisilicon
F:	tools/perf/pmu-events/arch/arm64/hisilicon/
HISILICON PTT DRIVER
M:	Yicong Yang <yangyicong@hisilicon.com>
+2 −0
Original line number Diff line number Diff line
@@ -129,6 +129,7 @@
#define FUJITSU_CPU_PART_A64FX		0x001

#define HISI_CPU_PART_TSV110		0xD01
#define HISI_CPU_PART_HIP12		0xD06

#define APPLE_CPU_PART_M1_ICESTORM	0x022
#define APPLE_CPU_PART_M1_FIRESTORM	0x023
@@ -202,6 +203,7 @@
#define MIDR_NVIDIA_CARMEL MIDR_CPU_MODEL(ARM_CPU_IMP_NVIDIA, NVIDIA_CPU_PART_CARMEL)
#define MIDR_FUJITSU_A64FX MIDR_CPU_MODEL(ARM_CPU_IMP_FUJITSU, FUJITSU_CPU_PART_A64FX)
#define MIDR_HISI_TSV110 MIDR_CPU_MODEL(ARM_CPU_IMP_HISI, HISI_CPU_PART_TSV110)
#define MIDR_HISI_HIP12 MIDR_CPU_MODEL(ARM_CPU_IMP_HISI, HISI_CPU_PART_HIP12)
#define MIDR_APPLE_M1_ICESTORM MIDR_CPU_MODEL(ARM_CPU_IMP_APPLE, APPLE_CPU_PART_M1_ICESTORM)
#define MIDR_APPLE_M1_FIRESTORM MIDR_CPU_MODEL(ARM_CPU_IMP_APPLE, APPLE_CPU_PART_M1_FIRESTORM)
#define MIDR_APPLE_M1_ICESTORM_PRO MIDR_CPU_MODEL(ARM_CPU_IMP_APPLE, APPLE_CPU_PART_M1_ICESTORM_PRO)
+4 −1
Original line number Diff line number Diff line
@@ -75,7 +75,7 @@
#define X86_FEATURE_CENTAUR_MCR		( 3*32+ 3) /* "centaur_mcr" Centaur MCRs (= MTRRs) */
#define X86_FEATURE_K8			( 3*32+ 4) /* Opteron, Athlon64 */
#define X86_FEATURE_ZEN5		( 3*32+ 5) /* CPU based on Zen5 microarchitecture */
/* Free                                 ( 3*32+ 6) */
#define X86_FEATURE_ZEN6		( 3*32+ 6) /* CPU based on Zen6 microarchitecture */
/* Free                                 ( 3*32+ 7) */
#define X86_FEATURE_CONSTANT_TSC	( 3*32+ 8) /* "constant_tsc" TSC ticks at a constant rate */
#define X86_FEATURE_UP			( 3*32+ 9) /* "up" SMP kernel running on UP */
@@ -482,6 +482,7 @@
#define X86_FEATURE_AMD_HTR_CORES	(21*32+ 6) /* Heterogeneous Core Topology */
#define X86_FEATURE_AMD_WORKLOAD_CLASS	(21*32+ 7) /* Workload Classification */
#define X86_FEATURE_PREFER_YMM		(21*32+ 8) /* Avoid ZMM registers due to downclocking */
#define X86_FEATURE_INDIRECT_THUNK_ITS	(21*32+ 9) /* Use thunk for indirect branches in lower half of cacheline */

/*
 * BUG word(s)
@@ -534,4 +535,6 @@
#define X86_BUG_BHI			X86_BUG( 1*32+ 3) /* "bhi" CPU is affected by Branch History Injection */
#define X86_BUG_IBPB_NO_RET		X86_BUG( 1*32+ 4) /* "ibpb_no_ret" IBPB omits return target predictions */
#define X86_BUG_SPECTRE_V2_USER		X86_BUG( 1*32+ 5) /* "spectre_v2_user" CPU is affected by Spectre variant 2 attack between user processes */
#define X86_BUG_ITS			X86_BUG( 1*32+ 6) /* "its" CPU is affected by Indirect Target Selection */
#define X86_BUG_ITS_NATIVE_ONLY		X86_BUG( 1*32+ 7) /* "its_native_only" CPU is affected by ITS, VMX is not affected */
#endif /* _ASM_X86_CPUFEATURES_H */
+8 −0
Original line number Diff line number Diff line
@@ -211,6 +211,14 @@
						 * VERW clears CPU Register
						 * File.
						 */
#define ARCH_CAP_ITS_NO			BIT_ULL(62) /*
						     * Not susceptible to
						     * Indirect Target Selection.
						     * This bit is not set by
						     * HW, but is synthesized by
						     * VMMs for guests to know
						     * their affected status.
						     */

#define MSR_IA32_FLUSH_CMD		0x0000010b
#define L1D_FLUSH			BIT(0)	/*
+0 −4
Original line number Diff line number Diff line
@@ -87,7 +87,6 @@ FEATURE_TESTS_BASIC := \
        libtracefs                      \
        libcpupower                     \
        libcrypto                       \
        libunwind                       \
        pthread-attr-setaffinity-np     \
        pthread-barrier     		\
        reallocarray                    \
@@ -148,15 +147,12 @@ endif
FEATURE_DISPLAY ?=              \
         libdw                  \
         glibc                  \
         libbfd                 \
         libbfd-buildid		\
         libelf                 \
         libnuma                \
         numa_num_possible_cpus \
         libperl                \
         libpython              \
         libcrypto              \
         libunwind              \
         libcapstone            \
         llvm-perf              \
         zlib                   \
Loading