Commit b3a37bff authored by Sagi Shahar, committed by Sean Christopherson

KVM: TDX: Reject fully in-kernel irqchip if EOIs are protected, i.e. for TDX VMs

Reject KVM_CREATE_IRQCHIP if the VM type has protected EOIs, i.e. if KVM
can't intercept EOI and thus can't faithfully emulate level-triggered
interrupts that are routed through the I/O APIC.  For TDX VMs, the
TDX-Module owns the VMX EOI-bitmap and configures all IRQ vectors to have
the CPU accelerate EOIs, i.e. doesn't allow KVM to intercept any EOIs.

KVM already requires a split irqchip[1], but does so during vCPU creation,
which is both too late to allow userspace to fall back to a split irqchip
and a less-than-stellar experience for userspace since an -EINVAL on
KVM_CREATE_VCPU is far harder to debug/triage than failure exactly on
KVM_CREATE_IRQCHIP.  And of course, allowing an action that ultimately
fails is arguably a bug regardless of the impact on userspace.
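
A minimal userspace sketch of the fallback that failing early makes possible. It assumes a VMM that first tries a full in-kernel irqchip and switches to a split irqchip when KVM_CREATE_IRQCHIP fails with ENOTTY, the error this change returns. The helper name setup_irqchip(), its return convention, and the ioapic_pins parameter are hypothetical; only the ioctls and the capability come from the KVM UAPI.

```c
#include <errno.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Hypothetical helper, not taken from any real VMM.
 * Returns 0 if a full in-kernel irqchip was created, 1 if a split
 * irqchip was enabled instead, or a negative errno on failure.
 */
static int setup_irqchip(int vm_fd, unsigned int ioapic_pins)
{
	struct kvm_enable_cap cap;

	/* First choice: PIC, I/O APIC and local APIC all in the kernel. */
	if (ioctl(vm_fd, KVM_CREATE_IRQCHIP, 0) == 0)
		return 0;

	/* Only ENOTTY is treated as "not supported for this VM type". */
	if (errno != ENOTTY)
		return -errno;

	/* Fallback: local APIC in the kernel, I/O APIC in userspace. */
	memset(&cap, 0, sizeof(cap));
	cap.cap = KVM_CAP_SPLIT_IRQCHIP;
	cap.args[0] = ioapic_pins;	/* routes reserved for the userspace I/O APIC */
	if (ioctl(vm_fd, KVM_ENABLE_CAP, &cap) < 0)
		return -errno;

	return 1;
}
```

Either path has to run before the first KVM_CREATE_VCPU, which is why a rejection at vCPU creation leaves userspace no chance to recover.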

Link: https://lore.kernel.org/lkml/20250222014757.897978-11-binbin.wu@linux.intel.com [1]
Link: https://lore.kernel.org/lkml/aK3vZ5HuKKeFuuM4@google.com


Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sagi Shahar <sagis@google.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>
Acked-by: Kai Huang <kai.huang@intel.com>
Link: https://lore.kernel.org/r/20250827011726.2451115-1-sagis@google.com


[sean: massage shortlog+changelog, relocate setting has_protected_eoi]
Signed-off-by: Sean Christopherson <seanjc@google.com>
parent aac057dd
+1 −0
@@ -1362,6 +1362,7 @@ struct kvm_arch {
	u8 vm_type;
	bool has_private_mem;
	bool has_protected_state;
	bool has_protected_eoi;
	bool pre_fault_allowed;
	struct hlist_head *mmu_page_hash;
	struct list_head active_mmu_pages;
+5 −0
@@ -629,6 +629,11 @@ int tdx_vm_init(struct kvm *kvm)
	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);

	kvm->arch.has_protected_state = true;
	/*
	 * TDX Module doesn't allow the hypervisor to modify the EOI-bitmap,
	 * i.e. all EOIs are accelerated and never trigger exits.
	 */
	kvm->arch.has_protected_eoi = true;
	kvm->arch.has_private_mem = true;
	kvm->arch.disabled_quirks |= KVM_X86_QUIRK_IGNORE_GUEST_PAT;

+9 −0
@@ -6989,6 +6989,15 @@ int kvm_arch_vm_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg)
		if (irqchip_in_kernel(kvm))
			goto create_irqchip_unlock;

		/*
		 * Disallow an in-kernel I/O APIC if the VM has protected EOIs,
		 * i.e. if KVM can't intercept EOIs and thus can't properly
		 * emulate level-triggered interrupts.
		 */
		r = -ENOTTY;
		if (kvm->arch.has_protected_eoi)
			goto create_irqchip_unlock;

		r = -EINVAL;
		if (kvm->created_vcpus)
			goto create_irqchip_unlock;
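
The order of the checks in the hunk above determines which error userspace sees. The following is a simplified standalone model of that ordering, not kernel code: struct vm_model and model_create_irqchip() are hypothetical stand-ins, and the -EEXIST value for an already-created irqchip is an assumption, since that assignment sits outside the lines shown.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the few pieces of VM state the checks read. */
struct vm_model {
	bool irqchip_in_kernel;
	bool has_protected_eoi;
	unsigned int created_vcpus;
};

/* Mirrors the check order of the hunk: exists, protected EOI, vCPUs. */
static int model_create_irqchip(struct vm_model *vm)
{
	/* Assumed error code; the assignment is outside the shown hunk. */
	if (vm->irqchip_in_kernel)
		return -EEXIST;

	/* New check: KVM can't intercept EOIs, so no in-kernel I/O APIC. */
	if (vm->has_protected_eoi)
		return -ENOTTY;

	/* Existing check: the irqchip must be created before any vCPU. */
	if (vm->created_vcpus)
		return -EINVAL;

	vm->irqchip_in_kernel = true;
	return 0;
}
```

In this model a VM with protected EOIs gets -ENOTTY whether or not vCPUs exist, because the new check runs before the created_vcpus check.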