Commit d7c8087a authored by Linus Torvalds
Pull power management updates from Rafael Wysocki:
 "Once again, cpufreq is the most active development area, mostly
  because of the new feature additions and documentation updates in the
  amd-pstate driver, but there are also changes in the cpufreq core
  related to boost support and other assorted updates elsewhere.

  Next up are power capping changes due to the major cleanup of the
  Intel RAPL driver.

  On the cpuidle front, a new C-states table for Intel Panther Lake is
  added to the intel_idle driver, the stopped tick handling in the menu
  and teo governors is updated, and there are a couple of cleanups.

  Apart from the above, support for Tegra114 is added to devfreq and
  there are assorted cleanups of that code, there are also two updates
  of the operating performance points (OPP) library, two minor updates
  related to hibernation, and cpupower utility man pages updates and
  cleanups.

  Specifics:

   - Update qcom-hw DT bindings to include Eliza hardware (Abel Vesa)

   - Update cpufreq-dt-platdev blocklist (Faruque Ansari)

   - Minor updates to driver and dt-bindings for Tegra (Thierry Reding,
     Rosen Penev)

   - Add MAINTAINERS entry for CPPC driver (Viresh Kumar)

   - Add support for new features: CPPC performance priority, Dynamic
     EPP, Raw EPP, and new unit tests for them to amd-pstate (Gautham
     Shenoy, Mario Limonciello)

   - Fix sysfs files being present when HW missing and broken/outdated
     documentation in the amd-pstate driver (Ninad Naik, Gautham Shenoy)

   - Pass the policy to cpufreq_driver->adjust_perf() to avoid using
     cpufreq_cpu_get() in the .adjust_perf() callback in amd-pstate
     which leads to a scheduling-while-atomic bug (K Prateek Nayak)

   - Clean up dead code in Kconfig for cpufreq (Julian Braha)

   - Remove max_freq_req update for pre-existing cpufreq policy and add
     a boost_freq_req QoS request to save the boost constraint instead
     of overwriting the last scaling_max_freq constraint (Pierre
     Gondois)

   - Embed cpufreq QoS freq_req objects in cpufreq policy so they all
     are allocated in one go along with the policy to simplify lifetime
     rules and avoid error handling issues (Viresh Kumar)

   - Use DMI max speed when CPPC is unavailable in the acpi-cpufreq
     scaling driver (Henry Tseng)

   - Switch policy_is_shared() in cpufreq to using cpumask_nth() instead
     of cpumask_weight() because the former is more efficient (Yury
     Norov)

   - Use sysfs_emit() in sysfs show functions for cpufreq governor
     attributes (Thorsten Blum)

   - Update intel_pstate to stop returning an error when "off" is
     written to its status sysfs attribute while the driver is already
     off (Fabio De Francesco)

   - Include current frequency in the debug message printed by
     __cpufreq_driver_target() (Pengjie Zhang)

   - Refine stopped tick handling in the menu cpuidle governor and
     rearrange stopped tick handling in the teo cpuidle governor (Rafael
     Wysocki)

   - Add Panther Lake C-states table to the intel_idle driver (Artem
     Bityutskiy)

   - Clean up dead dependencies on CPU_IDLE in Kconfig (Julian Braha)

   - Simplify cpuidle_register_device() with guard() (Huisong Li)

   - Use performance level if available to distinguish between rates in
     OPP debugfs (Manivannan Sadhasivam)

   - Fix scoped_guard in dev_pm_opp_xlate_required_opp() (Viresh Kumar)

   - Return -ENODATA if the snapshot image is not loaded (Alberto
     Garcia)

   - Remove inclusion of crypto/hash.h from hibernate_64.c on x86 (Eric
     Biggers)

   - Clean up and rearrange the intel_rapl power capping driver to make
     the respective interface drivers (TPMI, MSR, and MMIO) hold their
     own settings and primitives and consolidate PL4 and PMU support
     flags into rapl_defaults (Kuppuswamy Sathyanarayanan)

   - Correct kernel-doc function parameter names in the power capping
     core code (Randy Dunlap)

   - Remove unneeded casting for HZ_PER_KHZ in devfreq (Andy Shevchenko)

   - Use _visible attribute to replace create/remove_sysfs_files() in
     devfreq (Pengjie Zhang)

   - Add Tegra114 support to activity monitor device in tegra30-devfreq
     as a preparation to upcoming EMC controller support (Svyatoslav
     Ryhel)

   - Fix mistakes in cpupower man pages, add the boost and epp options
     to the cpupower-frequency-info man page, and add the perf-bias
     option to the cpupower-info man page (Roberto Ricci)

   - Remove unnecessary extern declarations from getopt.h in arguments
     parsing functions in cpufreq-set, cpuidle-info, cpuidle-set,
     cpupower-info, and cpupower-set utilities (Kaushlendra Kumar)"

* tag 'pm-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (74 commits)
  cpufreq/amd-pstate: Add POWER_SUPPLY select for dynamic EPP
  cpupower: remove extern declarations in cmd functions
  cpuidle: Simplify cpuidle_register_device() with guard()
  PM / devfreq: tegra30-devfreq: add support for Tegra114
  PM / devfreq: use _visible attribute to replace create/remove_sysfs_files()
  PM / devfreq: Remove unneeded casting for HZ_PER_KHZ
  MAINTAINERS: amd-pstate: Step down as maintainer, add Prateek as reviewer
  cpufreq: Pass the policy to cpufreq_driver->adjust_perf()
  cpufreq/amd-pstate: Pass the policy to amd_pstate_update()
  cpufreq/amd-pstate-ut: Add a unit test for raw EPP
  cpufreq/amd-pstate: Add support for raw EPP writes
  cpufreq/amd-pstate: Add support for platform profile class
  cpufreq/amd-pstate: add kernel command line to override dynamic epp
  cpufreq/amd-pstate: Add dynamic energy performance preference
  Documentation: amd-pstate: fix dead links in the reference section
  cpufreq/amd-pstate: Cache the max frequency in cpudata
  Documentation/amd-pstate: Add documentation for amd_pstate_floor_{freq,count}
  Documentation/amd-pstate: List amd_pstate_prefcore_ranking sysfs file
  Documentation/amd-pstate: List amd_pstate_hw_prefcore sysfs file
  amd-pstate-ut: Add a testcase to validate the visibility of driver attributes
  ...
parents 2e31b161 d923f70e
+7 −0
@@ -501,6 +501,13 @@ Kernel parameters
			disable
			  Disable amd-pstate preferred core.

	amd_dynamic_epp=
			[X86]
			disable
			  Disable amd-pstate dynamic EPP.
			enable
			  Enable amd-pstate dynamic EPP.

	amijoy.map=	[HW,JOY] Amiga joystick support
			Map of devices attached to JOY0DAT and JOY1DAT
			Format: <a>,<b>
+77 −10
@@ -239,8 +239,12 @@ control its functionality at the system level. They are located in the

 root@hr-test1:/home/ray# ls /sys/devices/system/cpu/cpufreq/policy0/*amd*
 /sys/devices/system/cpu/cpufreq/policy0/amd_pstate_highest_perf
 /sys/devices/system/cpu/cpufreq/policy0/amd_pstate_hw_prefcore
 /sys/devices/system/cpu/cpufreq/policy0/amd_pstate_lowest_nonlinear_freq
 /sys/devices/system/cpu/cpufreq/policy0/amd_pstate_max_freq
 /sys/devices/system/cpu/cpufreq/policy0/amd_pstate_floor_freq
 /sys/devices/system/cpu/cpufreq/policy0/amd_pstate_floor_count
 /sys/devices/system/cpu/cpufreq/policy0/amd_pstate_prefcore_ranking


``amd_pstate_highest_perf / amd_pstate_max_freq``
@@ -264,14 +268,46 @@ This attribute is read-only.

``amd_pstate_hw_prefcore``

Whether the platform supports the preferred core feature and it has been
enabled. This attribute is read-only.
Whether the platform supports the preferred core feature and it has
been enabled. This attribute is read-only. This file is only visible
on platforms which support the preferred core feature.

``amd_pstate_prefcore_ranking``

The performance ranking of the core. This number doesn't have any unit, but
larger numbers are preferred at the time of reading. This can change at
runtime based on platform conditions. This attribute is read-only.
runtime based on platform conditions. This attribute is read-only. This file
is only visible on platforms which support the preferred core feature.

``amd_pstate_floor_freq``

The floor frequency associated with each CPU. Userspace can write any
value between ``cpuinfo_min_freq`` and ``scaling_max_freq`` into this
file. When the system is under power or thermal constraints, the
platform firmware will attempt to throttle the CPU frequency to the
value specified in ``amd_pstate_floor_freq`` before throttling it
further. This allows userspace to specify different floor frequencies
to different CPUs. For optimal results, threads of the same core
should have the same floor frequency value. This file is only visible
on platforms that support the CPPC Performance Priority feature.


``amd_pstate_floor_count``

The number of distinct Floor Performance levels supported by the
platform. For example, if this value is 2, then the number of unique
values obtained from the command ``cat
/sys/devices/system/cpu/cpufreq/policy*/amd_pstate_floor_freq |
sort -n | uniq`` should be at most this number for the behavior
described in ``amd_pstate_floor_freq`` to take effect. A zero value
implies that the platform supports unlimited floor performance levels.
This file is only visible on platforms that support the CPPC
Performance Priority feature.

**Note**: When ``amd_pstate_floor_count`` is non-zero, the frequency to
which the CPU is throttled under power or thermal constraints is
undefined when the number of unique values of ``amd_pstate_floor_freq``
across all CPUs in the system exceeds ``amd_pstate_floor_count``.
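
As an illustration of the constraint in the note above, the following Python
sketch counts the distinct ``amd_pstate_floor_freq`` values across all
policies and compares that number against ``amd_pstate_floor_count``. The
helper names and the ``cpufreq_root`` parameter are hypothetical and not part
of the driver or its documentation. The sketch assumes that each file holds a
single integer and that ``amd_pstate_floor_count`` reports the same value on
every policy.

```python
from pathlib import Path

# Default sysfs location, taken from the paths shown in the text above.
CPUFREQ_ROOT = Path("/sys/devices/system/cpu/cpufreq")


def read_int(path):
    """Return the integer stored in a sysfs-style file (assumed: one number)."""
    return int(Path(path).read_text().strip())


def floor_levels_in_use(cpufreq_root=CPUFREQ_ROOT):
    """Collect the distinct amd_pstate_floor_freq values across all policies."""
    root = Path(cpufreq_root)
    return {read_int(p) for p in root.glob("policy*/amd_pstate_floor_freq")}


def floor_config_is_defined(cpufreq_root=CPUFREQ_ROOT):
    """Check whether the distinct floor values fit within the floor count.

    Returns None when the files are absent (feature not supported), True when
    the count is zero (unlimited levels) or not exceeded, False otherwise.
    """
    root = Path(cpufreq_root)
    counts = sorted(root.glob("policy*/amd_pstate_floor_count"))
    if not counts:
        return None
    # Assumption: the count is identical on every policy, so read the first.
    limit = read_int(counts[0])
    return limit == 0 or len(floor_levels_in_use(root)) <= limit
```

Passing a different ``cpufreq_root`` makes it possible to try the check
against a directory of sample files instead of the live sysfs tree.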

``energy_performance_available_preferences``

@@ -280,16 +316,22 @@ A list of all the supported EPP preferences that could be used for
These profiles represent different hints that are provided
to the low-level firmware about the user's desired energy vs efficiency
tradeoff.  ``default`` represents the epp value is set by platform
firmware. This attribute is read-only.
firmware. ``custom`` designates that integer values 0-255 may be written
as well.  This attribute is read-only.

``energy_performance_preference``

The current energy performance preference can be read from this attribute.
and user can change current preference according to energy or performance needs
Please get all support profiles list from
``energy_performance_available_preferences`` attribute, all the profiles are
integer values defined between 0 to 255 when EPP feature is enabled by platform
firmware, if EPP feature is disabled, driver will ignore the written value
Coarse named profiles are available in the attribute
``energy_performance_available_preferences``.
Users can also write individual integer values between 0 to 255.
When dynamic EPP is enabled, writes to energy_performance_preference are blocked
even when EPP feature is enabled by platform firmware. Lower epp values shift the bias
towards improved performance while a higher epp value shifts the bias towards
power-savings. The exact impact can change from one platform to the other.
If a valid integer was last written, then a number will be returned on future reads.
If a valid string was last written then a string will be returned on future reads.
This attribute is read-write.
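
A userspace tool might validate a value before writing it. The sketch below
is a hypothetical helper, not a kernel or library API. It accepts a named
profile from the list read out of
``energy_performance_available_preferences``, or a raw integer between 0 and
255 when ``custom`` appears in that list. Whether the literal string
``custom`` can itself be written is not stated above, so the sketch
conservatively rejects it.

```python
def classify_epp(value, available):
    """Classify an EPP value as a named "profile" or a "raw" integer.

    `available` is the list of names read from
    energy_performance_available_preferences. Raises ValueError for
    anything that does not look writable.
    """
    text = str(value).strip()
    # Assumption: "custom" only signals raw-integer support and is not
    # itself a writable profile name.
    if text in available and text != "custom":
        return "profile"
    is_number = text.isascii() and text.isdigit()
    if "custom" in available and is_number and int(text) <= 255:
        return "raw"
    raise ValueError(f"unsupported EPP value: {text!r}")
```

Lower raw values bias towards performance and higher ones towards power
savings, as described above, so a caller could use the ``"raw"`` result to
decide how to present the number to the user.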

``boost``
@@ -311,6 +353,24 @@ boost or `1` to enable it, for the respective CPU using the sysfs path
Other performance and frequency values can be read back from
``/sys/devices/system/cpu/cpuX/acpi_cppc/``, see :ref:`cppc_sysfs`.

Dynamic energy performance profile
==================================
The amd-pstate driver supports dynamically selecting the energy performance
profile based on whether the machine is running on AC or DC power.

Whether this behavior is enabled by default depends on the kernel
config option `CONFIG_X86_AMD_PSTATE_DYNAMIC_EPP`. This behavior can also be overridden
at runtime by the sysfs file ``/sys/devices/system/cpu/cpufreq/policyX/dynamic_epp``.

When set to enabled, the driver will select a different energy performance
profile when the machine is running on battery or AC power. The driver will
also register with the platform profile handler to receive notifications of
user desired power state and react to those.
When set to disabled, the driver will not change the energy performance profile
based on the power source and will not react to user desired power state.

Attempting to manually write to the ``energy_performance_preference`` sysfs
file will fail when ``dynamic_epp`` is enabled.
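
Because such a write can fail, a script that sets the preference may want to
handle the error instead of aborting. The following is a minimal sketch under
stated assumptions: ``try_set_epp`` and its ``policy_dir`` parameter are
hypothetical names, and the specific error code returned by the driver is not
given above, so any ``OSError`` is treated as a rejected write.

```python
from pathlib import Path


def try_set_epp(policy_dir, value):
    """Try to write `value` to energy_performance_preference.

    Returns True if the write succeeded and False if it failed with an
    OSError, for example because dynamic_epp is enabled for the policy.
    """
    target = Path(policy_dir) / "energy_performance_preference"
    try:
        target.write_text(str(value))
    except OSError:
        # The exact errno is an unknown here; report failure to the caller.
        return False
    return True
```

A caller would pass a policy directory such as
``/sys/devices/system/cpu/cpufreq/policyX`` and fall back to leaving the
current preference in place when the function returns ``False``.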

``amd-pstate`` vs ``acpi-cpufreq``
======================================
@@ -422,6 +482,13 @@ For systems that support ``amd-pstate`` preferred core, the core rankings will
always be advertised by the platform. But OS can choose to ignore that via the
kernel parameter ``amd_prefcore=disable``.

``amd_dynamic_epp``

When AMD pstate is in auto mode, dynamic EPP will control whether the kernel
autonomously changes the EPP mode. The default is configured by
``CONFIG_X86_AMD_PSTATE_DYNAMIC_EPP`` but can be explicitly enabled with
``amd_dynamic_epp=enable`` or disabled with ``amd_dynamic_epp=disable``.

User Space Interface in ``sysfs`` - General
===========================================

@@ -790,13 +857,13 @@ Reference
===========

.. [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming,
       https://www.amd.com/system/files/TechDocs/24593.pdf
       https://docs.amd.com/v/u/en-US/24593_3.44_APM_Vol2

.. [2] Advanced Configuration and Power Interface Specification,
       https://uefi.org/sites/default/files/resources/ACPI_Spec_6_4_Jan22.pdf

.. [3] Processor Programming Reference (PPR) for AMD Family 19h Model 51h, Revision A1 Processors
       https://www.amd.com/system/files/TechDocs/56569-A1-PUB.zip
       https://docs.amd.com/v/u/en-US/56569-A1-PUB_3.03

.. [4] Linux Kernel Selftests,
       https://www.kernel.org/doc/html/latest/dev-tools/kselftest.html
+1 −0
@@ -24,6 +24,7 @@ properties:
    enum:
      - nvidia,tegra186-ccplex-cluster
      - nvidia,tegra234-ccplex-cluster
      - nvidia,tegra238-ccplex-cluster

  reg:
    maxItems: 1
+1 −0
@@ -35,6 +35,7 @@ properties:
      - description: v2 of CPUFREQ HW (EPSS)
        items:
          - enum:
              - qcom,eliza-cpufreq-epss
              - qcom,milos-cpufreq-epss
              - qcom,qcs8300-cpufreq-epss
              - qcom,qdu1000-cpufreq-epss
+18 −7
@@ -1234,9 +1234,9 @@ F: drivers/gpu/drm/amd/pm/
AMD PSTATE DRIVER
M:	Huang Rui <ray.huang@amd.com>
M:	Gautham R. Shenoy <gautham.shenoy@amd.com>
M:	Mario Limonciello <mario.limonciello@amd.com>
R:	Perry Yuan <perry.yuan@amd.com>
R:	K Prateek Nayak <kprateek.nayak@amd.com>
L:	linux-pm@vger.kernel.org
S:	Supported
F:	Documentation/admin-guide/pm/amd-pstate.rst
@@ -6618,6 +6618,17 @@ M: Bence Csókás <bence98@sch.bme.hu>
S:	Maintained
F:	drivers/i2c/busses/i2c-cp2615.c
CPU FREQUENCY DRIVERS - CPPC CPUFREQ
M:	"Rafael J. Wysocki" <rafael@kernel.org>
M:	Viresh Kumar <viresh.kumar@linaro.org>
R:	Jie Zhan <zhanjie9@hisilicon.com>
R:	Lifeng Zheng <zhenglifeng1@huawei.com>
R:	Pierre Gondois <pierre.gondois@arm.com>
R:	Sumit Gupta <sumitg@nvidia.com>
L:	linux-pm@vger.kernel.org
S:	Maintained
F:	drivers/cpufreq/cppc_cpufreq.c
CPU FREQUENCY DRIVERS - VEXPRESS SPC ARM BIG LITTLE
M:	Viresh Kumar <viresh.kumar@linaro.org>
M:	Sudeep Holla <sudeep.holla@kernel.org>
@@ -6626,6 +6637,12 @@ S: Maintained
W:	http://www.arm.com/products/processors/technologies/biglittleprocessing.php
F:	drivers/cpufreq/vexpress-spc-cpufreq.c
CPU FREQUENCY DRIVERS - VIRTUAL MACHINE CPUFREQ
M:	Saravana Kannan <saravanak@kernel.org>
L:	linux-pm@vger.kernel.org
S:	Maintained
F:	drivers/cpufreq/virtual-cpufreq.c
CPU FREQUENCY SCALING FRAMEWORK
M:	"Rafael J. Wysocki" <rafael@kernel.org>
M:	Viresh Kumar <viresh.kumar@linaro.org>
@@ -6645,12 +6662,6 @@ F: kernel/sched/cpufreq*.c
F:	rust/kernel/cpufreq.rs
F:	tools/testing/selftests/cpufreq/
CPU FREQUENCY DRIVERS - VIRTUAL MACHINE CPUFREQ
M:	Saravana Kannan <saravanak@kernel.org>
L:	linux-pm@vger.kernel.org
S:	Maintained
F:	drivers/cpufreq/virtual-cpufreq.c
CPU HOTPLUG
M:	Thomas Gleixner <tglx@kernel.org>
M:	Peter Zijlstra <peterz@infradead.org>