Commit fadaf574 authored by Paolo Bonzini

Merge tag 'kvm-x86-docs-6.7' of https://github.com/kvm-x86/linux into HEAD

KVM x86 Documentation updates for 6.7:

 - Fix various typos, notably a confusing reference to the non-existent
   "struct kvm_vcpu_event" (the actual structure is kvm_vcpu_events, plural).

 - Update x86's kvm_mmu_page documentation to bring it closer to the code
   (this raced with the removal of async zapping and so the documentation is
   already stale; my bad).

 - Document the behavior of x86 PMU filters on fixed counters.
parents f2336467 b35babd3
+27 −9
@@ -547,7 +547,7 @@ ioctl is useful if the in-kernel PIC is not used.
PPC:
^^^^

-Queues an external interrupt to be injected. This ioctl is overleaded
+Queues an external interrupt to be injected. This ioctl is overloaded
with 3 different irq values:

a) KVM_INTERRUPT_SET
@@ -998,7 +998,7 @@ be set in the flags field of this ioctl:
The KVM_XEN_HVM_CONFIG_INTERCEPT_HCALL flag requests KVM to generate
the contents of the hypercall page automatically; hypercalls will be
intercepted and passed to userspace through KVM_EXIT_XEN.  In this
-ase, all of the blob size and address fields must be zero.
+case, all of the blob size and address fields must be zero.

The KVM_XEN_HVM_CONFIG_EVTCHN_SEND flag indicates to KVM that userspace
will always use the KVM_XEN_HVM_EVTCHN_SEND ioctl to deliver event
@@ -1103,7 +1103,7 @@ Other flags returned by ``KVM_GET_CLOCK`` are accepted but ignored.
:Extended by: KVM_CAP_INTR_SHADOW
:Architectures: x86, arm64
:Type: vcpu ioctl
-:Parameters: struct kvm_vcpu_event (out)
+:Parameters: struct kvm_vcpu_events (out)
:Returns: 0 on success, -1 on error

X86:
@@ -1226,7 +1226,7 @@ directly to the virtual CPU).
:Extended by: KVM_CAP_INTR_SHADOW
:Architectures: x86, arm64
:Type: vcpu ioctl
-:Parameters: struct kvm_vcpu_event (in)
+:Parameters: struct kvm_vcpu_events (in)
:Returns: 0 on success, -1 on error

X86:
@@ -3115,7 +3115,7 @@ as follow::
   };

An entry with a "page_shift" of 0 is unused. Because the array is
-organized in increasing order, a lookup can stop when encoutering
+organized in increasing order, a lookup can stop when encountering
such an entry.

The "slb_enc" field provides the encoding to use in the SLB for the
@@ -3507,7 +3507,7 @@ Possible features:
	      - KVM_RUN and KVM_GET_REG_LIST are not available;

	      - KVM_GET_ONE_REG and KVM_SET_ONE_REG cannot be used to access
-	        the scalable archietctural SVE registers
+	        the scalable architectural SVE registers
	        KVM_REG_ARM64_SVE_ZREG(), KVM_REG_ARM64_SVE_PREG() or
	        KVM_REG_ARM64_SVE_FFR;

@@ -4453,7 +4453,7 @@ This will have undefined effects on the guest if it has not already
placed itself in a quiescent state where no vcpu will make MMU enabled
memory accesses.

-On succsful completion, the pending HPT will become the guest's active
+On successful completion, the pending HPT will become the guest's active
HPT and the previous HPT will be discarded.

On failure, the guest will still be operating on its previous HPT.
@@ -5068,7 +5068,7 @@ before the vcpu is fully usable.

Between KVM_ARM_VCPU_INIT and KVM_ARM_VCPU_FINALIZE, the feature may be
configured by use of ioctls such as KVM_SET_ONE_REG.  The exact configuration
-that should be performaned and how to do it are feature-dependent.
+that should be performed and how to do it are feature-dependent.

Other calls that depend on a particular feature being finalized, such as
KVM_RUN, KVM_GET_REG_LIST, KVM_GET_ONE_REG and KVM_SET_ONE_REG, will fail with
@@ -5176,6 +5176,24 @@ Valid values for 'action'::
  #define KVM_PMU_EVENT_ALLOW 0
  #define KVM_PMU_EVENT_DENY 1

+Via this API, KVM userspace can also control the behavior of the VM's fixed
+counters (if any) by configuring the "action" and "fixed_counter_bitmap" fields.
+
+Specifically, KVM follows the following pseudo-code when determining whether to
+allow the guest FixCtr[i] to count its pre-defined fixed event::
+
+  FixCtr[i]_is_allowed = (action == ALLOW) && (bitmap & BIT(i)) ||
+    (action == DENY) && !(bitmap & BIT(i));
+  FixCtr[i]_is_denied = !FixCtr[i]_is_allowed;
+
+KVM always consumes fixed_counter_bitmap, it's userspace's responsibility to
+ensure fixed_counter_bitmap is set correctly, e.g. if userspace wants to define
+a filter that only affects general purpose counters.
+
+Note, the "events" field also applies to fixed counters' hardcoded event_select
+and unit_mask values.  "fixed_counter_bitmap" has higher priority than "events"
+if there is a contradiction between the two.
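
Reviewer note: the fixed-counter pseudo-code added in this hunk can be restated as a small C helper. This is a minimal sketch, not KVM code: ``fixed_counter_is_allowed`` is a hypothetical name, and the two macros simply repeat the 'action' values listed in the documentation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the documented 'action' definitions. */
#define KVM_PMU_EVENT_ALLOW 0
#define KVM_PMU_EVENT_DENY  1

/*
 * Hypothetical helper (not a KVM function) mirroring the pseudo-code:
 * ALLOW permits only counters whose bit is set, DENY permits only
 * counters whose bit is clear.
 */
static bool fixed_counter_is_allowed(uint32_t action, uint32_t bitmap,
				     unsigned int i)
{
	bool bit_set;

	assert(i < 32);	/* keep the shift within the 32-bit bitmap */
	bit_set = (bitmap >> i) & 1u;

	if (action == KVM_PMU_EVENT_ALLOW)
		return bit_set;
	if (action == KVM_PMU_EVENT_DENY)
		return !bit_set;
	return false;	/* any other action value allows nothing in this sketch */
}
```

One consequence follows directly from the pseudo-code: with ``action == ALLOW`` and an all-zero bitmap, every fixed counter is denied. This is presumably why the text asks userspace to set ``fixed_counter_bitmap`` deliberately even when the filter is meant to affect only general purpose counters.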

4.121 KVM_PPC_SVM_OFF
---------------------

@@ -5527,7 +5545,7 @@ KVM_XEN_ATTR_TYPE_EVTCHN
  from the guest. A given sending port number may be directed back to
  a specified vCPU (by APIC ID) / port / priority on the guest, or to
  trigger events on an eventfd. The vCPU and priority can be changed
-  by setting KVM_XEN_EVTCHN_UPDATE in a subsequent call, but but other
+  by setting KVM_XEN_EVTCHN_UPDATE in a subsequent call, but other
  fields cannot change for a given sending port. A port mapping is
  removed by using KVM_XEN_EVTCHN_DEASSIGN in the flags field. Passing
  KVM_XEN_EVTCHN_RESET in the flags field removes all interception of
+34 −9
@@ -202,10 +202,22 @@ Shadow pages contain the following information:
    Is 1 if the MMU instance cannot use A/D bits.  EPT did not have A/D
    bits before Haswell; shadow EPT page tables also cannot use A/D bits
    if the L1 hypervisor does not enable them.
+  role.guest_mode:
+    Indicates the shadow page is created for a nested guest.
  role.passthrough:
    The page is not backed by a guest page table, but its first entry
    points to one.  This is set if NPT uses 5-level page tables (host
    CR4.LA57=1) and is shadowing L1's 4-level NPT (L1 CR4.LA57=0).
+  mmu_valid_gen:
+    The MMU generation of this page, used to fast zap of all MMU pages within a
+    VM without blocking vCPUs too long. Specifically, KVM updates the per-VM
+    valid MMU generation which causes the mismatch of mmu_valid_gen for each mmu
+    page. This makes all existing MMU pages obsolete. Obsolete pages can't be
+    used. Therefore, vCPUs must load a new, valid root before re-entering the
+    guest. The MMU generation is only ever '0' or '1'. Note, the TDP MMU doesn't
+    use this field as non-root TDP MMU pages are reachable only from their
+    owning root. Thus it suffices for TDP MMU to use role.invalid in root pages
+    to invalidate all MMU pages.
  gfn:
    Either the guest page table containing the translations shadowed by this
    page, or the base page frame for linear translations.  See role.direct.
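
Reviewer note: the generation scheme described for ``mmu_valid_gen`` in this hunk can be illustrated with a toy model. This is a sketch under stated assumptions, not the kernel implementation: the struct and function names are hypothetical, and it assumes obsolete pages are torn down before the generation is flipped again, since with only two generation values a second flip would otherwise make stale pages look valid.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the per-VM and per-page state. */
struct vm_sketch {
	unsigned int mmu_valid_gen;	/* current valid generation, 0 or 1 */
};

struct shadow_page_sketch {
	unsigned int mmu_valid_gen;	/* generation the page was created in */
};

/* A new page is stamped with the VM's current generation. */
static struct shadow_page_sketch page_create(const struct vm_sketch *vm)
{
	struct shadow_page_sketch sp = { .mmu_valid_gen = vm->mmu_valid_gen };

	return sp;
}

/* "Fast zap": flip the generation so every existing page mismatches. */
static void vm_invalidate_all_pages(struct vm_sketch *vm)
{
	assert(vm->mmu_valid_gen <= 1);
	vm->mmu_valid_gen = vm->mmu_valid_gen ? 0 : 1;
}

/* A page is obsolete when its generation differs from the VM's. */
static bool page_is_obsolete(const struct vm_sketch *vm,
			     const struct shadow_page_sketch *sp)
{
	return sp->mmu_valid_gen != vm->mmu_valid_gen;
}
```

The point of the design is that invalidation is a single store to the per-VM generation. No page has to be visited at zap time, and staleness is detected lazily by comparison.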
@@ -219,21 +231,30 @@ Shadow pages contain the following information:
    at __pa(sp2->spt).  sp2 will point back at sp1 through parent_pte.
    The spt array forms a DAG structure with the shadow page as a node, and
    guest pages as leaves.
-  gfns:
-    An array of 512 guest frame numbers, one for each present pte.  Used to
-    perform a reverse map from a pte to a gfn. When role.direct is set, any
-    element of this array can be calculated from the gfn field when used, in
-    this case, the array of gfns is not allocated. See role.direct and gfn.
-  root_count:
-    A counter keeping track of how many hardware registers (guest cr3 or
-    pdptrs) are now pointing at the page.  While this counter is nonzero, the
-    page cannot be destroyed.  See role.invalid.
+  shadowed_translation:
+    An array of 512 shadow translation entries, one for each present pte. Used
+    to perform a reverse map from a pte to a gfn as well as its access
+    permission. When role.direct is set, the shadow_translation array is not
+    allocated. This is because the gfn contained in any element of this array
+    can be calculated from the gfn field when used.  In addition, when
+    role.direct is set, KVM does not track access permission for each of the
+    gfn. See role.direct and gfn.
+  root_count / tdp_mmu_root_count:
+     root_count is a reference counter for root shadow pages in Shadow MMU.
+     vCPUs elevate the refcount when getting a shadow page that will be used as
+     a root page, i.e. page that will be loaded into hardware directly (CR3,
+     PDPTRs, nCR3 EPTP). Root pages cannot be destroyed while their refcount is
+     non-zero. See role.invalid. tdp_mmu_root_count is similar but exclusively
+     used in TDP MMU as an atomic refcount.
  parent_ptes:
    The reverse mapping for the pte/ptes pointing at this page's spt. If
    parent_ptes bit 0 is zero, only one spte points at this page and
    parent_ptes points at this single spte, otherwise, there exists multiple
    sptes pointing at this page and (parent_ptes & ~0x1) points at a data
    structure with a list of parent sptes.
+  ptep:
+    The kernel virtual address of the SPTE that points at this shadow page.
+    Used exclusively by the TDP MMU, this field is a union with parent_ptes.
  unsync:
    If true, then the translations in this page may not match the guest's
    translation.  This is equivalent to the state of the tlb when a pte is
@@ -261,6 +282,10 @@ Shadow pages contain the following information:
    since the last time the page table was actually used; if emulation
    is triggered too frequently on this page, KVM will unmap the page
    to avoid emulation in the future.
+  tdp_mmu_page:
+    Is 1 if the shadow page is a TDP MMU page. This variable is used to
+    bifurcate the control flows for KVM when walking any data structure that
+    may contain pages from both TDP MMU and shadow MMU.
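
Reviewer note: the ``parent_ptes`` entry in the context above describes a tagged-pointer encoding, where bit 0 selects between "pointer to the single parent spte" and "pointer to a list of parent sptes". A minimal sketch of that decoding follows. The type names, the list layout, and the helper functions are all hypothetical, since the text does not describe the list structure. The sketch also assumes both pointee types are at least 2-byte aligned so that bit 0 is free to use as a tag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t spte_sketch_t;		/* hypothetical stand-in for an spte */

/* Hypothetical list layout, used only to make the sketch concrete. */
struct parent_list_sketch {
	spte_sketch_t *sptes[4];
	unsigned int count;
};

#define PARENT_MANY ((uintptr_t)0x1)	/* bit 0 set: value points at a list */

static uintptr_t encode_single(spte_sketch_t *spte)
{
	return (uintptr_t)spte;
}

static uintptr_t encode_list(struct parent_list_sketch *list)
{
	return (uintptr_t)list | PARENT_MANY;
}

/* Bit 0 clear means exactly one spte points at the page. */
static bool parent_is_single(uintptr_t parent_ptes)
{
	return !(parent_ptes & PARENT_MANY);
}

static spte_sketch_t *parent_single(uintptr_t parent_ptes)
{
	assert(parent_is_single(parent_ptes));
	return (spte_sketch_t *)parent_ptes;
}

/* Mask off the tag bit to recover the list pointer. */
static struct parent_list_sketch *parent_list(uintptr_t parent_ptes)
{
	assert(!parent_is_single(parent_ptes));
	return (struct parent_list_sketch *)(parent_ptes & ~PARENT_MANY);
}
```

The encoding keeps the common single-parent case free of any extra allocation. The list structure is only needed once a second spte points at the page.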

Reverse map
===========