Commit c9c1e20b authored by Yan Zhao, committed by Paolo Bonzini

KVM: x86: Introduce Intel specific quirk KVM_X86_QUIRK_IGNORE_GUEST_PAT



Introduce an Intel specific quirk KVM_X86_QUIRK_IGNORE_GUEST_PAT to have
KVM ignore guest PAT when this quirk is enabled.

On AMD platforms, KVM always honors guest PAT.  On Intel, however, there are
two issues.  First, KVM *cannot* honor guest PAT if the CPU self-snoop
feature is not supported.  Second, UC accesses on certain Intel platforms can
be very slow [1], and honoring guest PAT on those platforms may break some
old guests that accidentally map video RAM as UC.  Those guests never
experienced the slowness because KVM previously always forced WB.  See [2].

So, introduce a quirk that KVM can enable by default on all Intel platforms
to avoid breaking old unmodifiable guests. Newer userspace can disable this
quirk if it wishes KVM to honor guest PAT; disabling the quirk will fail
if self-snoop is not supported, i.e. if KVM cannot obey the wish.

The quirk is a no-op on AMD and also if any assigned devices have
non-coherent DMA.  This is not an issue, as KVM_X86_QUIRK_CD_NW_CLEARED is
another example of a quirk that is sometimes automatically disabled.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Suggested-by: Sean Christopherson <seanjc@google.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
Link: https://lore.kernel.org/all/Ztl9NWCOupNfVaCA@yzhao56-desk.sh.intel.com # [1]
Link: https://lore.kernel.org/all/87jzfutmfc.fsf@redhat.com # [2]
Message-ID: <20250224070946.31482-1-yan.y.zhao@intel.com>
[Use supported_quirks/inapplicable_quirks to support both AMD and
 no-self-snoop cases, as well as to remove the shadow_memtype_mask check
 from kvm_mmu_may_ignore_guest_pat(). - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
parent bd7d5362
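
The commit message says that newer userspace can disable the quirk, and that the attempt fails when KVM cannot honor guest PAT. A minimal userspace sketch of that flow follows, assuming the existing KVM_CAP_DISABLE_QUIRKS2 interface is used on a VM file descriptor. The helper names quirk_can_be_disabled() and honor_guest_pat() are hypothetical, and the #ifndef fallbacks exist only for installed headers that predate the definitions.

```c
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Fallbacks for older installed uapi headers (assumption: your headers may
 * predate these definitions). The quirk bit matches the value added by
 * this commit. */
#ifndef KVM_X86_QUIRK_IGNORE_GUEST_PAT
#define KVM_X86_QUIRK_IGNORE_GUEST_PAT (1 << 9)
#endif
#ifndef KVM_CAP_DISABLE_QUIRKS2
#define KVM_CAP_DISABLE_QUIRKS2 213
#endif

/* Hypothetical helper: nonzero if the quirk bit is present in the mask
 * reported by KVM_CHECK_EXTENSION(KVM_CAP_DISABLE_QUIRKS2). */
int quirk_can_be_disabled(uint64_t supported_quirks)
{
	return (supported_quirks & KVM_X86_QUIRK_IGNORE_GUEST_PAT) != 0;
}

/* Hypothetical helper: ask KVM to honor guest PAT for the VM behind vm_fd.
 * Returns 0 on success, 1 if the quirk is not reported as disableable,
 * and -1 if an ioctl fails. */
int honor_guest_pat(int vm_fd)
{
	struct kvm_enable_cap cap;
	int supported;

	/* The return value is the bitmask of quirks that can be disabled. */
	supported = ioctl(vm_fd, KVM_CHECK_EXTENSION, KVM_CAP_DISABLE_QUIRKS2);
	if (supported < 0)
		return -1;
	if (!quirk_can_be_disabled((uint64_t)supported))
		return 1;

	memset(&cap, 0, sizeof(cap));
	cap.cap = KVM_CAP_DISABLE_QUIRKS2;
	cap.args[0] = KVM_X86_QUIRK_IGNORE_GUEST_PAT;	/* quirks to disable */

	return ioctl(vm_fd, KVM_ENABLE_CAP, &cap) < 0 ? -1 : 0;
}
```

Checking the supported mask first lets a VMM distinguish "this host cannot honor guest PAT" from an unexpected ioctl failure, instead of relying only on the error from KVM_ENABLE_CAP.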
+22 −0
@@ -8160,6 +8160,28 @@ KVM_X86_QUIRK_STUFF_FEATURE_MSRS By default, at vCPU creation, KVM sets the
                                     and 0x489), as KVM does now allow them to
                                     be set by userspace (KVM sets them based on
                                     guest CPUID, for safety purposes).
+
+KVM_X86_QUIRK_IGNORE_GUEST_PAT      By default, on Intel platforms, KVM ignores
+                                    guest PAT and forces the effective memory
+                                    type to WB in EPT.  The quirk is not available
+                                    on Intel platforms which are incapable of
+                                    safely honoring guest PAT (i.e., without CPU
+                                    self-snoop, KVM always ignores guest PAT and
+                                    forces effective memory type to WB).  It is
+                                    also ignored on AMD platforms or, on Intel,
+                                    when a VM has non-coherent DMA devices
+                                    assigned; KVM always honors guest PAT in
+                                    such case. The quirk is needed to avoid
+                                    slowdowns on certain Intel Xeon platforms
+                                    (e.g. ICX, SPR) where self-snoop feature is
+                                    supported but UC is slow enough to cause
+                                    issues with some older guests that use
+                                    UC instead of WC to map the video RAM.
+                                    Userspace can disable the quirk to honor
+                                    guest PAT if it knows that there is no such
+                                    guest software, for example if it does not
+                                    expose a bochs graphics device (which is
+                                    known to have had a buggy driver).
 =================================== ============================================
 
 7.32 KVM_CAP_MAX_VCPU_ID
+4 −2
@@ -2420,10 +2420,12 @@ int memslot_rmap_alloc(struct kvm_memory_slot *slot, unsigned long npages);
 	 KVM_X86_QUIRK_FIX_HYPERCALL_INSN |	\
 	 KVM_X86_QUIRK_MWAIT_NEVER_UD_FAULTS |	\
 	 KVM_X86_QUIRK_SLOT_ZAP_ALL |		\
-	 KVM_X86_QUIRK_STUFF_FEATURE_MSRS)
+	 KVM_X86_QUIRK_STUFF_FEATURE_MSRS |	\
+	 KVM_X86_QUIRK_IGNORE_GUEST_PAT)
 
 #define KVM_X86_CONDITIONAL_QUIRKS		\
-	 KVM_X86_QUIRK_CD_NW_CLEARED
+	(KVM_X86_QUIRK_CD_NW_CLEARED |		\
+	 KVM_X86_QUIRK_IGNORE_GUEST_PAT)
 
 /*
  * KVM previously used a u32 field in kvm_run to indicate the hypercall was
+1 −0
@@ -441,6 +441,7 @@ struct kvm_sync_regs {
 #define KVM_X86_QUIRK_MWAIT_NEVER_UD_FAULTS	(1 << 6)
 #define KVM_X86_QUIRK_SLOT_ZAP_ALL		(1 << 7)
 #define KVM_X86_QUIRK_STUFF_FEATURE_MSRS	(1 << 8)
+#define KVM_X86_QUIRK_IGNORE_GUEST_PAT		(1 << 9)
 
 #define KVM_STATE_NESTED_FORMAT_VMX	0
 #define KVM_STATE_NESTED_FORMAT_SVM	1
+1 −1
@@ -232,7 +232,7 @@ static inline u8 permission_fault(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
 	return -(u32)fault & errcode;
 }
 
-bool kvm_mmu_may_ignore_guest_pat(void);
+bool kvm_mmu_may_ignore_guest_pat(struct kvm *kvm);
 
 int kvm_mmu_post_init_vm(struct kvm *kvm);
 void kvm_mmu_pre_destroy_vm(struct kvm *kvm);
+6 −4
@@ -4663,17 +4663,19 @@ static int kvm_tdp_mmu_page_fault(struct kvm_vcpu *vcpu,
 }
 #endif
 
-bool kvm_mmu_may_ignore_guest_pat(void)
+bool kvm_mmu_may_ignore_guest_pat(struct kvm *kvm)
 {
 	/*
 	 * When EPT is enabled (shadow_memtype_mask is non-zero), and the VM
 	 * has non-coherent DMA (DMA doesn't snoop CPU caches), KVM's ABI is to
 	 * honor the memtype from the guest's PAT so that guest accesses to
 	 * memory that is DMA'd aren't cached against the guest's wishes.  As a
-	 * result, KVM _may_ ignore guest PAT, whereas without non-coherent DMA,
-	 * KVM _always_ ignores guest PAT (when EPT is enabled).
+	 * result, KVM _may_ ignore guest PAT, whereas without non-coherent DMA.
+	 * KVM _always_ ignores guest PAT, when EPT is enabled and when quirk
+	 * KVM_X86_QUIRK_IGNORE_GUEST_PAT is enabled or the CPU lacks the
+	 * ability to safely honor guest PAT.
 	 */
-	return shadow_memtype_mask;
+	return kvm_check_has_quirk(kvm, KVM_X86_QUIRK_IGNORE_GUEST_PAT);
 }
 
 int kvm_tdp_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
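
The documentation text added by this commit describes several cases that decide whether guest PAT is ignored. The standalone model below restates those cases as a small decision function. It is a reading of the documentation wording, not KVM's actual code path: the enum, the function name guest_pat_ignored() and its parameters are all hypothetical, and the precedence given to non-coherent DMA over missing self-snoop is an assumption based on the phrase "KVM always honors guest PAT in such case".

```c
#include <stdbool.h>

/* Hypothetical model types; none of these names exist in KVM. */
enum cpu_vendor { CPU_INTEL, CPU_AMD };

/* Returns true if, per the documentation text, KVM would ignore guest PAT
 * and force the effective memory type to WB. */
bool guest_pat_ignored(enum cpu_vendor vendor, bool has_self_snoop,
		       bool has_noncoherent_dma, bool quirk_disabled)
{
	/* AMD: the quirk is a no-op, guest PAT is always honored. */
	if (vendor == CPU_AMD)
		return false;

	/* Intel with non-coherent DMA assigned: guest PAT is honored
	 * (assumed to take precedence over the self-snoop check). */
	if (has_noncoherent_dma)
		return false;

	/* Intel without self-snoop: the quirk cannot be disabled, so
	 * guest PAT is always ignored. */
	if (!has_self_snoop)
		return true;

	/* Otherwise the quirk decides: enabled by default, so guest PAT
	 * is ignored unless userspace disabled the quirk. */
	return !quirk_disabled;
}
```

For example, an Intel host with self-snoop and no non-coherent DMA ignores guest PAT by default and honors it only once userspace disables the quirk.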