Commit 87e4951e authored by weizijie, committed by Sean Christopherson

KVM: x86: Rescan I/O APIC routes after EOI interception for old routing



Rescan I/O APIC routes for a vCPU after handling an intercepted I/O APIC
EOI for an IRQ that is not targeting said vCPU, i.e. after handling what's
effectively a stale EOI VM-Exit.  If a level-triggered IRQ is in-flight
when IRQ routing changes, e.g. because the guest changes routing from its
IRQ handler, then KVM intercepts EOIs on both the new and old target vCPUs,
so that the in-flight IRQ can be de-asserted when it's EOI'd.

However, only the EOI for the in-flight IRQ needs to be intercepted, as
IRQs on the same vector with the new routing are coincidental, i.e. occur
only if the guest is reusing the vector for multiple interrupt sources.
If the I/O APIC routes aren't rescanned, KVM will unnecessarily intercept
EOIs for the vector and negatively impact the vCPU's interrupt performance.

Note, both commit db2bdcbb ("KVM: x86: fix edge EOI and IOAPIC reconfig
race") and commit 0fc5a36d ("KVM: x86: ioapic: Fix level-triggered EOI
and IOAPIC reconfigure race") mentioned this issue, but it was considered
a "rare" occurrence and thus was not addressed.  However, in real
environments, this issue can happen even in a well-behaved guest.

Cc: Kai Huang <kai.huang@intel.com>
Co-developed-by: xuyun <xuyun_xy.xy@linux.alibaba.com>
Signed-off-by: xuyun <xuyun_xy.xy@linux.alibaba.com>
Signed-off-by: weizijie <zijie.wei@linux.alibaba.com>
[sean: massage changelog and comments, use int/-1, reset at scan]
Reviewed-by: Kai Huang <kai.huang@intel.com>
Link: https://lore.kernel.org/r/20250304013335.4155703-4-seanjc@google.com


Signed-off-by: Sean Christopherson <seanjc@google.com>
parent c2207bbc
+1 −0
@@ -1034,6 +1034,7 @@ struct kvm_vcpu_arch {

	int pending_ioapic_eoi;
	int pending_external_vector;
+	int highest_stale_pending_ioapic_eoi;

	/* be preempted when it's in kernel-mode(cpl=0) */
	bool preempted_in_kernel;
+14 −2
@@ -412,9 +412,21 @@ void kvm_scan_ioapic_irq(struct kvm_vcpu *vcpu, u32 dest_id, u16 dest_mode,
	 * level-triggered IRQ.  The EOI needs to be intercepted and forwarded
	 * to I/O APIC emulation so that the IRQ can be de-asserted.
	 */
-	if (kvm_apic_match_dest(vcpu, NULL, APIC_DEST_NOSHORT, dest_id, dest_mode) ||
-	    kvm_apic_pending_eoi(vcpu, vector))
+	if (kvm_apic_match_dest(vcpu, NULL, APIC_DEST_NOSHORT, dest_id, dest_mode)) {
		__set_bit(vector, ioapic_handled_vectors);
+	} else if (kvm_apic_pending_eoi(vcpu, vector)) {
+		__set_bit(vector, ioapic_handled_vectors);
+
+		/*
+		 * Track the highest pending EOI for which the vCPU is NOT the
+		 * target in the new routing.  Only the EOI for the IRQ that is
+		 * in-flight (for the old routing) needs to be intercepted, any
+		 * future IRQs that arrive on this vCPU will be coincidental to
+		 * the level-triggered routing and don't need to be intercepted.
+		 */
+		if ((int)vector > vcpu->arch.highest_stale_pending_ioapic_eoi)
+			vcpu->arch.highest_stale_pending_ioapic_eoi = vector;
+	}
}

void kvm_scan_ioapic_routes(struct kvm_vcpu *vcpu,
+8 −0
@@ -1459,6 +1459,14 @@ static void kvm_ioapic_send_eoi(struct kvm_lapic *apic, int vector)
	if (!kvm_ioapic_handles_vector(apic, vector))
		return;

+	/*
+	 * If the intercepted EOI is for an IRQ that was pending from previous
+	 * routing, then re-scan the I/O APIC routes as EOIs for the IRQ likely
+	 * no longer need to be intercepted.
+	 */
+	if (apic->vcpu->arch.highest_stale_pending_ioapic_eoi == vector)
+		kvm_make_request(KVM_REQ_SCAN_IOAPIC, apic->vcpu);
+
	/* Request a KVM exit to inform the userspace IOAPIC. */
	if (irqchip_split(apic->vcpu->kvm)) {
		apic->vcpu->arch.pending_ioapic_eoi = vector;
+1 −0
@@ -10692,6 +10692,7 @@ static void vcpu_scan_ioapic(struct kvm_vcpu *vcpu)
		return;

	bitmap_zero(vcpu->arch.ioapic_handled_vectors, 256);
+	vcpu->arch.highest_stale_pending_ioapic_eoi = -1;

	kvm_x86_call(sync_pir_to_irr)(vcpu);
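
Taken together, the hunks above implement a small lifecycle: reset the tracker at scan time, record the highest stale vector while scanning, and request a rescan when that vector is EOI'd. The following is a minimal user-space sketch of that lifecycle, not kernel code. The `toy_vcpu` and `toy_route` structs and the `toy_scan()` and `toy_eoi()` helpers are hypothetical names invented for illustration; destination matching is reduced to an ID comparison, and the scan request is modeled as a plain flag that the scan itself clears.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NR_VECTORS 256

/* Hypothetical, simplified stand-in for the per-vCPU state the patch touches. */
struct toy_vcpu {
	int id;                               /* toy destination ID */
	bool handled[NR_VECTORS];             /* models ioapic_handled_vectors */
	bool pending_eoi[NR_VECTORS];         /* vector is awaiting EOI on this vCPU */
	int highest_stale_pending_ioapic_eoi; /* -1 when nothing stale is tracked */
	bool scan_requested;                  /* models KVM_REQ_SCAN_IOAPIC */
};

/* Hypothetical level-triggered route: the vCPU a vector currently targets. */
struct toy_route {
	int dest_id;
	int vector;
};

/* Models the scan: reset state, then decide per route whether to intercept. */
static void toy_scan(struct toy_vcpu *vcpu, const struct toy_route *routes, int nr)
{
	memset(vcpu->handled, 0, sizeof(vcpu->handled));
	vcpu->highest_stale_pending_ioapic_eoi = -1;
	vcpu->scan_requested = false; /* simplification: the scan consumes the request */

	for (int i = 0; i < nr; i++) {
		int vector = routes[i].vector;

		assert(vector >= 0 && vector < NR_VECTORS);

		if (routes[i].dest_id == vcpu->id) {
			/* vCPU is a target in the current routing. */
			vcpu->handled[vector] = true;
		} else if (vcpu->pending_eoi[vector]) {
			/* In-flight IRQ from the old routing: intercept its EOI once. */
			vcpu->handled[vector] = true;
			if (vector > vcpu->highest_stale_pending_ioapic_eoi)
				vcpu->highest_stale_pending_ioapic_eoi = vector;
		}
	}
}

/* Models the EOI path; returns true if the EOI was intercepted. */
static bool toy_eoi(struct toy_vcpu *vcpu, int vector)
{
	assert(vector >= 0 && vector < NR_VECTORS);

	vcpu->pending_eoi[vector] = false;
	if (!vcpu->handled[vector])
		return false;

	/* Stale EOI handled: ask for a rescan so the intercept can be dropped. */
	if (vcpu->highest_stale_pending_ioapic_eoi == vector)
		vcpu->scan_requested = true;
	return true;
}
```

In this model, a vCPU that is no longer the target but still has the vector pending intercepts exactly one EOI. That EOI sets `scan_requested`, and the subsequent `toy_scan()` clears the vector from `handled`, so later IRQs on the same vector are not intercepted.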