Commit 8aa49284 authored by Rafael J. Wysocki

Merge branch 'thermal-intel'

Merge changes in Intel thermal control drivers for 6.7-rc1:

 - Add power floor notifications support to the int340x thermal control
   driver (Srinivas Pandruvada).

 - Rework updating trip points in the int340x thermal driver so that it
   does not access thermal zone internals directly (Rafael Wysocki).

 - Use param_get_byte() instead of param_get_int() as the max_idle module
   parameter .get() callback in the Intel powerclamp thermal driver to
   avoid possible out-of-bounds access (David Arcari).

 - Add workload hints support to the int340x thermal driver (Srinivas
   Pandruvada).

* thermal-intel:
  selftests/thermel/intel: Add test to read power floor status
  thermal: int340x: processor_thermal: Enable power floor support
  thermal: int340x: processor_thermal: Handle power floor interrupts
  thermal: int340x: processor_thermal: Support power floor notifications
  thermal: int340x: processor_thermal: Set feature mask before proc_thermal_add
  thermal: int340x: processor_thermal: Common function to clear SOC interrupt
  thermal: int340x: processor_thermal: Move interrupt status MMIO offset to common header
  thermal: intel: powerclamp: fix mismatch in get function for max_idle
  thermal: int340x: Use thermal_zone_for_each_trip()
  thermal: int340x: processor_thermal: Ack all PCI interrupts
  thermal: int340x: Add ArrowLake-S PCI ID
  selftests/thermel/intel: Add test to read workload hint
  thermal: int340x: Handle workload hint interrupts
  thermal: int340x: processor_thermal: Add workload type hint interface
  thermal: int340x: Remove PROC_THERMAL_FEATURE_WLT_REQ for Meteor Lake
  thermal: int340x: processor_thermal: Use non MSI interrupts by default
  thermal: int340x: processor_thermal: Add interrupt configuration function
  thermal: int340x: processor_thermal: Move mailbox code to common module
parents 598c20f9 d4d27e5a
+64 −0
@@ -164,6 +164,16 @@ ABI.
``power_limit_1_tmax_us`` (RO)
	Maximum powercap sysfs constraint_1_time_window_us for Intel RAPL

``power_floor_status`` (RO)
	When set to 1, the power floor of the system in the current
	configuration has been reached. The system needs to be reconfigured to
	allow power to be reduced any further.

``power_floor_enable`` (RW)
	When set to 1, enable reading and notification of the power floor
	status. Notifications are triggered when the value of the
	power_floor_status attribute changes.
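
The two attributes above could be used from user space roughly as follows. This is a minimal sketch, not part of the commit: the helper names are hypothetical, and the directory is an assumption based on the "power_limits" attribute group name and the processor thermal device path used elsewhere in this document. Writing the enable attribute normally requires root privileges.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed location of the attributes (not confirmed by this hunk). */
#define POWER_FLOOR_DIR "/sys/bus/pci/devices/0000:00:04.0/power_limits"

/* Read one decimal integer from a sysfs-style file; 0 on success, -1 on failure. */
static int read_attr_int(const char *path, int *value)
{
	FILE *f;
	int ok;

	assert(path && value);
	f = fopen(path, "r");
	if (!f)
		return -1;
	ok = (fscanf(f, "%d", value) == 1);
	fclose(f);
	return ok ? 0 : -1;
}

/* Write one decimal integer to a sysfs-style file; 0 on success, -1 on failure. */
static int write_attr_int(const char *path, int value)
{
	FILE *f;
	int ok;

	assert(path);
	f = fopen(path, "w");
	if (!f)
		return -1;
	ok = (fprintf(f, "%d\n", value) > 0);
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}

/*
 * Enable power floor reporting, then read the status.
 * Returns 1 if the floor has been reached, 0 if not, -1 on any error.
 */
static int power_floor_reached(const char *dir)
{
	char path[256];
	int status;

	snprintf(path, sizeof(path), "%s/power_floor_enable", dir);
	if (write_attr_int(path, 1))
		return -1;

	snprintf(path, sizeof(path), "%s/power_floor_status", dir);
	if (read_attr_int(path, &status))
		return -1;

	return status == 1;
}
```

A caller would pass POWER_FLOOR_DIR as the directory, for example power_floor_reached(POWER_FLOOR_DIR), and treat -1 as "attribute not present or not accessible", which is expected on platforms where the power floor feature bit is not set.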

:file:`/sys/bus/pci/devices/0000\:00\:04.0/`

``tcc_offset_degree_celsius`` (RW)
@@ -315,3 +325,57 @@ DPTF Fan Control
----------------------------------------

Refer to Documentation/admin-guide/acpi/fan_performance_states.rst

Workload Type Hints
----------------------------------------

The firmware in the Meteor Lake processor generation is capable of identifying
workload type and passing hints regarding it to the OS. A special sysfs
interface is provided to allow user space to obtain workload type hints from
the firmware and control the rate at which they are provided.

User space can poll attribute "workload_type_index" for the current hint or
can receive a notification whenever the value of this attribute is updated.

:file:`/sys/bus/pci/devices/0000:00:04.0/workload_hint/`
Segment 0, bus 0, device 4, function 0 is reserved for the processor thermal
device on all Intel client processors. So, the above path doesn't change
based on the processor generation.

``workload_hint_enable`` (RW)
	Enable firmware to send workload type hints to user space.

``notification_delay_ms`` (RW)
	Minimum delay, in milliseconds, between a change of the workload type
	prediction in the firmware and the notification of the OS about that
	change. It is used for rate control of notifications. The default
	delay is 1024 ms. A delay of 0 is invalid. The delay is rounded up to
	the nearest power of 2 to simplify firmware programming of the delay
	value. Reading the notification_delay_ms attribute shows the effective
	value in use.

``workload_type_index`` (RO)
	Predicted workload type index. User space can be notified of changes
	via the existing sysfs attribute change notification mechanism.

	The supported index values and their meaning for the Meteor Lake
	processor generation are as follows:

	0 – Idle: System performs no tasks, power and idle residency are
		consistently low for long periods of time.

	1 – Battery Life: Power is relatively low, but the processor may
		still be actively performing a task, such as video playback for
		a long period of time.

	2 – Sustained: Power level that is relatively high for a long period
		of time, with very few to no periods of idleness, which will
		eventually exhaust RAPL Power Limit 1 and 2.

	3 – Bursty: Consumes a relatively constant average amount of power, but
		periods of relative idleness are interrupted by bursts of
		activity. The bursts are relatively short and the periods of
		relative idleness between them typically prevent RAPL Power
		Limit 1 from being exhausted.

	4 – Unknown: Can't classify.
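
The interface described above could be consumed from user space along these lines. This is a hedged sketch rather than the selftest added by this merge: all function names are hypothetical, the index names are taken from the list above, the power-of-2 helper only illustrates the rounding rule stated for notification_delay_ms (the real effective value should be read back from the attribute), and the wait helper relies on the usual sysfs convention of polling for POLLPRI after an initial read.

```c
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Map a workload_type_index value to the Meteor Lake names listed above. */
static const char *workload_type_name(int index)
{
	static const char *const names[] = {
		"Idle", "Battery Life", "Sustained", "Bursty", "Unknown",
	};

	if (index < 0 || index >= (int)(sizeof(names) / sizeof(names[0])))
		return "Invalid";
	return names[index];
}

/*
 * Illustration of "rounded up to the nearest power of 2".
 * Returns 0 for the invalid delay 0; saturates at 2^31 for larger inputs.
 */
static uint32_t effective_delay_ms(uint32_t ms)
{
	uint32_t p = 1;

	if (ms == 0)
		return 0;
	while (p < ms && p < 0x80000000u)
		p <<= 1;
	return p;
}

/*
 * Wait until the attribute at 'path' signals a change, then return the
 * new index. Returns -1 on error or when timeout_ms elapses first.
 */
static int wait_workload_type(const char *path, int timeout_ms)
{
	struct pollfd pfd;
	char buf[16];
	ssize_t n;
	int ret = -1;

	assert(path);
	pfd.fd = open(path, O_RDONLY);
	if (pfd.fd < 0)
		return -1;
	pfd.events = POLLPRI | POLLERR;
	pfd.revents = 0;

	/* The initial read arms the sysfs change notification. */
	if (read(pfd.fd, buf, sizeof(buf) - 1) < 0)
		goto out;
	if (poll(&pfd, 1, timeout_ms) <= 0)
		goto out;
	/* Rewind and re-read to fetch the updated value. */
	if (lseek(pfd.fd, 0, SEEK_SET) < 0)
		goto out;
	n = read(pfd.fd, buf, sizeof(buf) - 1);
	if (n <= 0)
		goto out;
	buf[n] = '\0';
	ret = atoi(buf);
out:
	close(pfd.fd);
	return ret;
}
```

Before waiting, user space would write 1 to workload_hint_enable and optionally set notification_delay_ms. Passing the workload_type_index path under the workload_hint directory to wait_workload_type() and feeding the result to workload_type_name() yields a printable hint.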
+3 −0
@@ -10,5 +10,8 @@ obj-$(CONFIG_INT340X_THERMAL) += processor_thermal_device_pci.o
obj-$(CONFIG_PROC_THERMAL_MMIO_RAPL) += processor_thermal_rapl.o
obj-$(CONFIG_INT340X_THERMAL)	+= processor_thermal_rfim.o
obj-$(CONFIG_INT340X_THERMAL)	+= processor_thermal_mbox.o
obj-$(CONFIG_INT340X_THERMAL)	+= processor_thermal_wt_req.o
obj-$(CONFIG_INT340X_THERMAL)	+= processor_thermal_wt_hint.o
obj-$(CONFIG_INT340X_THERMAL)	+= processor_thermal_power_floor.o
obj-$(CONFIG_INT3406_THERMAL)	+= int3406_thermal.o
obj-$(CONFIG_ACPI_THERMAL_REL)	+= acpi_thermal_rel.o
+42 −36
@@ -67,6 +67,16 @@ static struct thermal_zone_device_ops int340x_thermal_zone_ops = {
	.critical	= int340x_thermal_critical,
};

static inline void *int_to_trip_priv(int i)
{
	return (void *)(long)i;
}

static inline int trip_priv_to_int(const struct thermal_trip *trip)
{
	return (long)trip->priv;
}

static int int340x_thermal_read_trips(struct acpi_device *zone_adev,
				      struct thermal_trip *zone_trips,
				      int trip_cnt)
@@ -101,6 +111,7 @@ static int int340x_thermal_read_trips(struct acpi_device *zone_adev,
			break;

		zone_trips[trip_cnt].type = THERMAL_TRIP_ACTIVE;
		zone_trips[trip_cnt].priv = int_to_trip_priv(i);
		trip_cnt++;
	}

@@ -212,20 +223,12 @@ void int340x_thermal_zone_remove(struct int34x_thermal_zone *int34x_zone)
}
EXPORT_SYMBOL_GPL(int340x_thermal_zone_remove);

void int340x_thermal_update_trips(struct int34x_thermal_zone *int34x_zone)
static int int340x_update_one_trip(struct thermal_trip *trip, void *arg)
{
	struct acpi_device *zone_adev = int34x_zone->adev;
	struct thermal_trip *zone_trips = int34x_zone->trips;
	int trip_cnt = int34x_zone->zone->num_trips;
	int act_trip_nr = 0;
	int i;

	mutex_lock(&int34x_zone->zone->lock);

	for (i = int34x_zone->aux_trip_nr; i < trip_cnt; i++) {
	struct acpi_device *zone_adev = arg;
	int temp, err;

		switch (zone_trips[i].type) {
	switch (trip->type) {
	case THERMAL_TRIP_CRITICAL:
		err = thermal_acpi_critical_trip_temp(zone_adev, &temp);
		break;
@@ -236,21 +239,24 @@ void int340x_thermal_update_trips(struct int34x_thermal_zone *int34x_zone)
		err = thermal_acpi_passive_trip_temp(zone_adev, &temp);
		break;
	case THERMAL_TRIP_ACTIVE:
			err = thermal_acpi_active_trip_temp(zone_adev, act_trip_nr++,
		err = thermal_acpi_active_trip_temp(zone_adev,
						    trip_priv_to_int(trip),
						    &temp);
		break;
	default:
		err = -ENODEV;
	}
		if (err) {
			zone_trips[i].temperature = THERMAL_TEMP_INVALID;
			continue;
		}
	if (err)
		temp = THERMAL_TEMP_INVALID;

		zone_trips[i].temperature = temp;
	trip->temperature = temp;
	return 0;
}

	mutex_unlock(&int34x_zone->zone->lock);
void int340x_thermal_update_trips(struct int34x_thermal_zone *int34x_zone)
{
	thermal_zone_for_each_trip(int34x_zone->zone, int340x_update_one_trip,
				   int34x_zone->adev);
}
EXPORT_SYMBOL_GPL(int340x_thermal_update_trips);
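
The reworked update path stores the active trip index in the trip's priv pointer and lets an iterator call a per-trip callback, instead of walking the zone's trip array under its lock. The pattern can be sketched outside the kernel as below. The struct, the iterator, and the temperature table are hypothetical stand-ins; only the two priv conversion helpers mirror the code in this hunk, and the stop-on-nonzero behavior of the iterator is an assumption about how such a callback walker typically works.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced, hypothetical stand-in for the kernel's struct thermal_trip. */
struct sketch_trip {
	int temperature;
	void *priv;
};

/* Encode a small integer index in the opaque priv pointer. */
static inline void *int_to_trip_priv(int i)
{
	return (void *)(long)i;
}

/* Decode the index stored by int_to_trip_priv(). */
static inline int trip_priv_to_int(const struct sketch_trip *trip)
{
	return (long)trip->priv;
}

/*
 * Hypothetical iterator with the same callback shape as
 * thermal_zone_for_each_trip(): call cb for each trip and stop at the
 * first nonzero return value.
 */
static int sketch_for_each_trip(struct sketch_trip *trips, size_t n,
				int (*cb)(struct sketch_trip *, void *),
				void *arg)
{
	size_t i;
	int ret;

	assert(trips && cb);
	for (i = 0; i < n; i++) {
		ret = cb(&trips[i], arg);
		if (ret)
			return ret;
	}
	return 0;
}

/* Callback: look up the new temperature by the index kept in priv. */
static int sketch_update_one_trip(struct sketch_trip *trip, void *arg)
{
	const int *active_temps = arg;

	trip->temperature = active_temps[trip_priv_to_int(trip)];
	return 0;
}
```

Because each trip carries its own index, the callback no longer depends on the order in which trips are visited, which is what allows the running act_trip_nr counter of the old loop to go away.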

+80 −5
@@ -26,6 +26,48 @@ static ssize_t power_limit_##index##_##suffix##_show(struct device *dev, \
	(unsigned long)proc_dev->power_limits[index].suffix * 1000); \
}

static ssize_t power_floor_status_show(struct device *dev,
				       struct device_attribute *attr,
				       char *buf)
{
	struct proc_thermal_device *proc_dev = dev_get_drvdata(dev);
	int ret;

	ret = proc_thermal_read_power_floor_status(proc_dev);

	return sysfs_emit(buf, "%d\n", ret);
}

static ssize_t power_floor_enable_show(struct device *dev,
				       struct device_attribute *attr,
				       char *buf)
{
	struct proc_thermal_device *proc_dev = dev_get_drvdata(dev);
	bool ret;

	ret = proc_thermal_power_floor_get_state(proc_dev);

	return sysfs_emit(buf, "%d\n", ret);
}

static ssize_t power_floor_enable_store(struct device *dev,
					struct device_attribute *attr,
					const char *buf, size_t count)
{
	struct proc_thermal_device *proc_dev = dev_get_drvdata(dev);
	u8 state;
	int ret;

	if (kstrtou8(buf, 0, &state))
		return -EINVAL;

	ret = proc_thermal_power_floor_set_state(proc_dev, !!state);
	if (ret)
		return ret;

	return count;
}

POWER_LIMIT_SHOW(0, min_uw)
POWER_LIMIT_SHOW(0, max_uw)
POWER_LIMIT_SHOW(0, step_uw)
@@ -50,6 +92,9 @@ static DEVICE_ATTR_RO(power_limit_1_step_uw);
static DEVICE_ATTR_RO(power_limit_1_tmin_us);
static DEVICE_ATTR_RO(power_limit_1_tmax_us);

static DEVICE_ATTR_RO(power_floor_status);
static DEVICE_ATTR_RW(power_floor_enable);

static struct attribute *power_limit_attrs[] = {
	&dev_attr_power_limit_0_min_uw.attr,
	&dev_attr_power_limit_1_min_uw.attr,
@@ -61,12 +106,30 @@ static struct attribute *power_limit_attrs[] = {
	&dev_attr_power_limit_1_tmin_us.attr,
	&dev_attr_power_limit_0_tmax_us.attr,
	&dev_attr_power_limit_1_tmax_us.attr,
	&dev_attr_power_floor_status.attr,
	&dev_attr_power_floor_enable.attr,
	NULL
};

static umode_t power_limit_attr_visible(struct kobject *kobj, struct attribute *attr, int unused)
{
	struct device *dev = kobj_to_dev(kobj);
	struct proc_thermal_device *proc_dev;

	if (attr != &dev_attr_power_floor_status.attr && attr != &dev_attr_power_floor_enable.attr)
		return attr->mode;

	proc_dev = dev_get_drvdata(dev);
	if (!proc_dev || !(proc_dev->mmio_feature_mask & PROC_THERMAL_FEATURE_POWER_FLOOR))
		return 0;

	return attr->mode;
}

static const struct attribute_group power_limit_attribute_group = {
	.attrs = power_limit_attrs,
	.name = "power_limits"
	.name = "power_limits",
	.is_visible = power_limit_attr_visible,
};

static ssize_t tcc_offset_degree_celsius_show(struct device *dev,
@@ -346,12 +409,18 @@ int proc_thermal_mmio_add(struct pci_dev *pdev,
		}
	}

	if (feature_mask & PROC_THERMAL_FEATURE_MBOX) {
		ret = proc_thermal_mbox_add(pdev, proc_priv);
	if (feature_mask & PROC_THERMAL_FEATURE_WT_REQ) {
		ret = proc_thermal_wt_req_add(pdev, proc_priv);
		if (ret) {
			dev_err(&pdev->dev, "failed to add MBOX interface\n");
			goto err_rem_rfim;
		}
	} else if (feature_mask & PROC_THERMAL_FEATURE_WT_HINT) {
		ret = proc_thermal_wt_hint_add(pdev, proc_priv);
		if (ret) {
			dev_err(&pdev->dev, "failed to add WT Hint\n");
			goto err_rem_rfim;
		}
	}

	return 0;
@@ -374,12 +443,18 @@ void proc_thermal_mmio_remove(struct pci_dev *pdev, struct proc_thermal_device *
	    proc_priv->mmio_feature_mask & PROC_THERMAL_FEATURE_DVFS)
		proc_thermal_rfim_remove(pdev);

	if (proc_priv->mmio_feature_mask & PROC_THERMAL_FEATURE_MBOX)
		proc_thermal_mbox_remove(pdev);
	if (proc_priv->mmio_feature_mask & PROC_THERMAL_FEATURE_POWER_FLOOR)
		proc_thermal_power_floor_set_state(proc_priv, false);

	if (proc_priv->mmio_feature_mask & PROC_THERMAL_FEATURE_WT_REQ)
		proc_thermal_wt_req_remove(pdev);
	else if (proc_priv->mmio_feature_mask & PROC_THERMAL_FEATURE_WT_HINT)
		proc_thermal_wt_hint_remove(pdev);
}
EXPORT_SYMBOL_GPL(proc_thermal_mmio_remove);

MODULE_IMPORT_NS(INTEL_TCC);
MODULE_IMPORT_NS(INT340X_THERMAL);
MODULE_AUTHOR("Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>");
MODULE_DESCRIPTION("Processor Thermal Reporting Device Driver");
MODULE_LICENSE("GPL v2");
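
The is_visible callback added in this file hides the two power floor attributes unless the device's feature mask has the power floor bit set, while leaving every other attribute in the group untouched. A user-space sketch of that decision is shown below. The struct and function are hypothetical; the bit values 0x01 and 0x40 match the PROC_THERMAL_FEATURE_RAPL and PROC_THERMAL_FEATURE_POWER_FLOOR definitions in the header, and the mode constants are ordinary octal file modes chosen for illustration.

```c
#include <assert.h>

#define SKETCH_FEATURE_RAPL		0x01
#define SKETCH_FEATURE_POWER_FLOOR	0x40

#define SKETCH_MODE_RO	0444
#define SKETCH_MODE_RW	0644

/* Hypothetical attribute descriptor used only in this sketch. */
struct sketch_attr {
	const char *name;
	unsigned int mode;
	int needs_power_floor;	/* nonzero for power_floor_* attributes */
};

/*
 * Return the mode to expose for 'attr', or 0 to hide it, following the
 * same order of checks as power_limit_attr_visible() in this hunk.
 */
static unsigned int sketch_attr_visible(const struct sketch_attr *attr,
					unsigned int feature_mask)
{
	assert(attr);

	/* Attributes unrelated to the power floor are always visible. */
	if (!attr->needs_power_floor)
		return attr->mode;

	/* Power floor attributes need the feature bit. */
	if (!(feature_mask & SKETCH_FEATURE_POWER_FLOOR))
		return 0;

	return attr->mode;
}
```

Returning 0 from the callback means the attribute file is not created at all, so user space can detect support simply by checking whether power_floor_status exists.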
+30 −3
@@ -10,6 +10,7 @@
#include <linux/intel_rapl.h>

#define PCI_DEVICE_ID_INTEL_ADL_THERMAL	0x461d
#define PCI_DEVICE_ID_INTEL_ARL_S_THERMAL 0xAD03
#define PCI_DEVICE_ID_INTEL_BDW_THERMAL	0x1603
#define PCI_DEVICE_ID_INTEL_BSW_THERMAL	0x22DC

@@ -59,8 +60,10 @@ struct rapl_mmio_regs {
#define PROC_THERMAL_FEATURE_RAPL	0x01
#define PROC_THERMAL_FEATURE_FIVR	0x02
#define PROC_THERMAL_FEATURE_DVFS	0x04
#define PROC_THERMAL_FEATURE_MBOX	0x08
#define PROC_THERMAL_FEATURE_WT_REQ	0x08
#define PROC_THERMAL_FEATURE_DLVR	0x10
#define PROC_THERMAL_FEATURE_WT_HINT	0x20
#define PROC_THERMAL_FEATURE_POWER_FLOOR	0x40

#if IS_ENABLED(CONFIG_PROC_THERMAL_MMIO_RAPL)
int proc_thermal_rapl_add(struct pci_dev *pdev, struct proc_thermal_device *proc_priv);
@@ -80,13 +83,37 @@ static void __maybe_unused proc_thermal_rapl_remove(void)
int proc_thermal_rfim_add(struct pci_dev *pdev, struct proc_thermal_device *proc_priv);
void proc_thermal_rfim_remove(struct pci_dev *pdev);

int proc_thermal_mbox_add(struct pci_dev *pdev, struct proc_thermal_device *proc_priv);
void proc_thermal_mbox_remove(struct pci_dev *pdev);
int proc_thermal_wt_req_add(struct pci_dev *pdev, struct proc_thermal_device *proc_priv);
void proc_thermal_wt_req_remove(struct pci_dev *pdev);

#define MBOX_CMD_WORKLOAD_TYPE_READ	0x0E
#define MBOX_CMD_WORKLOAD_TYPE_WRITE	0x0F

#define MBOX_DATA_BIT_AC_DC		30
#define MBOX_DATA_BIT_VALID		31

#define SOC_WT_RES_INT_STATUS_OFFSET	0x5B18
#define SOC_WT_RES_INT_STATUS_MASK	GENMASK_ULL(3, 2)

int proc_thermal_read_power_floor_status(struct proc_thermal_device *proc_priv);
int proc_thermal_power_floor_set_state(struct proc_thermal_device *proc_priv, bool enable);
bool proc_thermal_power_floor_get_state(struct proc_thermal_device *proc_priv);
void proc_thermal_power_floor_intr_callback(struct pci_dev *pdev,
					    struct proc_thermal_device *proc_priv);
bool proc_thermal_check_power_floor_intr(struct proc_thermal_device *proc_priv);

int processor_thermal_send_mbox_read_cmd(struct pci_dev *pdev, u16 id, u64 *resp);
int processor_thermal_send_mbox_write_cmd(struct pci_dev *pdev, u16 id, u32 data);
int processor_thermal_mbox_interrupt_config(struct pci_dev *pdev, bool enable, int enable_bit,
					    int time_window);
int proc_thermal_add(struct device *dev, struct proc_thermal_device *priv);
void proc_thermal_remove(struct proc_thermal_device *proc_priv);

int proc_thermal_wt_hint_add(struct pci_dev *pdev, struct proc_thermal_device *proc_priv);
void proc_thermal_wt_hint_remove(struct pci_dev *pdev);
void proc_thermal_wt_intr_callback(struct pci_dev *pdev, struct proc_thermal_device *proc_priv);
bool proc_thermal_check_wt_intr(struct proc_thermal_device *proc_priv);

int proc_thermal_suspend(struct device *dev);
int proc_thermal_resume(struct device *dev);
int proc_thermal_mmio_add(struct pci_dev *pdev,