Commit 548183ea authored by Sean Christopherson, committed by Joerg Roedel

iommu/vt-d: Wire up irq_ack() to irq_move_irq() for posted MSIs



Set the posted MSI irq_chip's irq_ack() hook to irq_move_irq() instead of
a dummy/empty callback so that posted MSIs process pending changes to the
IRQ's SMP affinity.  Failure to honor a pending set-affinity results in
userspace being unable to change the effective affinity of the IRQ, as
IRQD_SETAFFINITY_PENDING is never cleared and so irq_set_affinity_locked()
always defers moving the IRQ.
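
The failure mode described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: struct toy_irq, toy_set_affinity(), toy_ack_dummy() and toy_ack_move() are hypothetical names, the `pending` flag merely stands in for IRQD_SETAFFINITY_PENDING, and whether a request can be applied immediately is passed in as a parameter instead of being derived from real IRQ state.

```c
#include <stdbool.h>

/* Toy model only: names here are hypothetical, not kernel APIs. */
struct toy_irq {
	int effective_cpu;	/* CPU the IRQ is currently delivered to */
	int pending_cpu;	/* target saved by a deferred request */
	bool pending;		/* stands in for IRQD_SETAFFINITY_PENDING */
};

/*
 * Model of a set-affinity request. Returns true if the request was applied
 * immediately, false if it was deferred. Once 'pending' is set, every later
 * request is deferred as well, until an ack hook clears the flag.
 */
static bool toy_set_affinity(struct toy_irq *irq, int cpu, bool can_apply_now)
{
	if (irq->pending || !can_apply_now) {
		irq->pending_cpu = cpu;
		irq->pending = true;
		return false;
	}
	irq->effective_cpu = cpu;
	return true;
}

/* Old behavior: the ack hook is empty, so 'pending' is never cleared. */
static void toy_ack_dummy(struct toy_irq *irq)
{
	(void)irq;
}

/* New behavior: the ack hook applies and clears a pending request. */
static void toy_ack_move(struct toy_irq *irq)
{
	if (irq->pending) {
		irq->effective_cpu = irq->pending_cpu;
		irq->pending = false;
	}
}
```

In this model, one deferred request followed by any number of toy_ack_dummy() calls leaves effective_cpu stuck at its old value, whereas a single toy_ack_move() call applies the saved target and lets later requests be applied immediately again.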

The issue is most easily reproducible by setting /proc/irq/xx/smp_affinity
multiple times in quick succession, as only the first update is likely to
be handled in process context.
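
A reproducer along those lines might look like the sketch below. It is an illustration, not part of the patch: write_affinity() and hammer_affinity() are hypothetical helpers, and the path, masks and round count are placeholders chosen by the caller.

```c
#include <stdio.h>

/*
 * Write one hex CPU mask to an smp_affinity-style file.
 * Returns 0 on success, -1 on failure.
 */
static int write_affinity(const char *path, const char *hex_mask)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;

	ok = fputs(hex_mask, f) >= 0;
	/* Buffered data is flushed on close, so a rejected write can surface here. */
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}

/*
 * Alternate between two masks back to back, with no delay between writes.
 * Returns the number of writes that failed.
 */
static int hammer_affinity(const char *path, const char *mask_a,
			   const char *mask_b, int rounds)
{
	int failures = 0;

	for (int i = 0; i < rounds; i++) {
		if (write_affinity(path, (i & 1) ? mask_b : mask_a) != 0)
			failures++;
	}
	return failures;
}
```

Run as root against the smp_affinity file of an IRQ that uses posted MSIs, then compare the last mask written with the IRQ's effective affinity. Going by the description above, on an affected kernel the effective affinity would be expected to stop following the requested mask after the first update or so.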

Fixes: ed1e48ea ("iommu/vt-d: Enable posted mode for device MSIs")
Cc: Robert Lippert <rlippert@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Reported-by: Wentao Yang <wentaoyang@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20250321194249.1217961-1-seanjc@google.com


Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
parent df4bf3fa
+15 −14
@@ -1287,43 +1287,44 @@ static struct irq_chip intel_ir_chip = {
 };
 
 /*
- * With posted MSIs, all vectors are multiplexed into a single notification
- * vector. Devices MSIs are then dispatched in a demux loop where
- * EOIs can be coalesced as well.
+ * With posted MSIs, the MSI vectors are multiplexed into a single notification
+ * vector, and only the notification vector is sent to the APIC IRR.  Device
+ * MSIs are then dispatched in a demux loop that harvests the MSIs from the
+ * CPU's Posted Interrupt Request bitmap.  I.e. Posted MSIs never get sent to
+ * the APIC IRR, and thus do not need an EOI.  The notification handler instead
+ * performs a single EOI after processing the PIR.
  *
- * "INTEL-IR-POST" IRQ chip does not do EOI on ACK, thus the dummy irq_ack()
- * function. Instead EOI is performed by the posted interrupt notification
- * handler.
+ * Note!  Pending SMP/CPU affinity changes, which are per MSI, must still be
+ * honored, only the APIC EOI is omitted.
  *
  * For the example below, 3 MSIs are coalesced into one CPU notification. Only
- * one apic_eoi() is needed.
+ * one apic_eoi() is needed, but each MSI needs to process pending changes to
+ * its CPU affinity.
  *
  * __sysvec_posted_msi_notification()
  *	irq_enter();
  *		handle_edge_irq()
  *			irq_chip_ack_parent()
- *				dummy(); // No EOI
+ *				irq_move_irq(); // No EOI
  *			handle_irq_event()
  *				driver_handler()
  *		handle_edge_irq()
  *			irq_chip_ack_parent()
- *				dummy(); // No EOI
+ *				irq_move_irq(); // No EOI
  *			handle_irq_event()
  *				driver_handler()
  *		handle_edge_irq()
  *			irq_chip_ack_parent()
- *				dummy(); // No EOI
+ *				irq_move_irq(); // No EOI
  *			handle_irq_event()
  *				driver_handler()
  *	apic_eoi()
  *	irq_exit()
  *
  */
-
-static void dummy_ack(struct irq_data *d) { }
+
 static struct irq_chip intel_ir_chip_post_msi = {
 	.name			= "INTEL-IR-POST",
-	.irq_ack		= dummy_ack,
+	.irq_ack		= irq_move_irq,
 	.irq_set_affinity	= intel_ir_set_affinity,
 	.irq_compose_msi_msg	= intel_ir_compose_msi_msg,
 	.irq_set_vcpu_affinity	= intel_ir_set_vcpu_affinity,
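
The call flow in the comment above (per-MSI ack and handler, one EOI per notification) can be sketched as a toy userspace loop. This is not the kernel's demux code: struct toy_counts and toy_posted_msi_notification() are hypothetical, and a single 64-bit word stands in for the Posted Interrupt Request bitmap purely for brevity.

```c
#include <stdint.h>

/* Counters for what the toy notification handler did. */
struct toy_counts {
	int acks;	/* per-MSI ack calls (where irq_move_irq() now runs) */
	int handlers;	/* per-MSI driver handler calls */
	int eois;	/* EOIs issued */
};

/*
 * Toy demux loop: every set bit in 'pir' is one posted MSI. Each MSI is acked
 * and handled individually, but only one EOI is counted for the whole batch.
 */
static void toy_posted_msi_notification(uint64_t pir, struct toy_counts *c)
{
	for (int vec = 0; vec < 64; vec++) {
		if (!(pir & (UINT64_C(1) << vec)))
			continue;

		c->acks++;	/* no EOI here, only per-MSI bookkeeping */
		c->handlers++;
	}

	c->eois++;	/* single EOI after the bitmap has been processed */
}
```

With three bits set in `pir`, the counters end up at three acks, three handler calls and one EOI, matching the three handle_edge_irq() invocations and the single apic_eoi() in the comment.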