Commit e5f62e27 authored by Paolo Bonzini

Merge tag 'kvmarm-6.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 updates for Linux 6.10

- Move a lot of state that was previously stored on a per vcpu
  basis into a per-CPU area, because it is only pertinent to the
  host while the vcpu is loaded. This results in better state
  tracking, and a smaller vcpu structure.

- Add full handling of the ERET/ERETAA/ERETAB instructions in
  nested virtualisation. The last two instructions also require
  emulating part of the pointer authentication extension.
  As a result, the trap handling of pointer authentication has
  been greatly simplified.

- Turn the global (and not very scalable) LPI translation cache
  into a per-ITS, scalable cache, making non directly injected
  LPIs much cheaper to make visible to the vcpu.

- A batch of pKVM patches, mostly fixes and cleanups, as the
  upstreaming process seems to be resuming. Fingers crossed!

- Allocate PPIs and SGIs outside of the vcpu structure, allowing
  for smaller EL2 mapping and some flexibility in implementing
  more or less than 32 private IRQs.

- Purge stale mpidr_data if a vcpu is created after the MPIDR
  map has been created.

- Preserve vcpu-specific ID registers across a vcpu reset.

- Various minor cleanups and improvements.
parents 4232da23 eaa46a28
+7 −0
@@ -6894,6 +6894,13 @@ Note that KVM does not skip the faulting instruction as it does for
KVM_EXIT_MMIO, but userspace has to emulate any change to the processing state
if it decides to decode and emulate the instruction.

This feature isn't available to protected VMs, as userspace does not
have access to the state that is required to perform the emulation.
Instead, a data abort exception is directly injected in the guest.
Note that although KVM_CAP_ARM_NISV_TO_USER will be reported if
queried outside of a protected VM context, the feature will not be
exposed if queried on a protected VM file descriptor.

::

		/* KVM_EXIT_X86_RDMSR / KVM_EXIT_X86_WRMSR */
+138 −0
.. SPDX-License-Identifier: GPL-2.0

=======================================
ARM firmware pseudo-registers interface
=======================================

KVM handles the hypercall services as requested by the guests. New hypercall
services are regularly made available by the ARM specification or by KVM (as
vendor services) if they make sense from a virtualization point of view.

This means that a guest booted on two different versions of KVM can observe
two different "firmware" revisions. This could cause issues if a given guest
is tied to a particular version of a hypercall service, or if a migration
causes a different version to be exposed out of the blue to an unsuspecting
guest.

In order to remedy this situation, KVM exposes a set of "firmware
pseudo-registers" that can be manipulated using the GET/SET_ONE_REG
interface. These registers can be saved/restored by userspace, and set
to a convenient value as required.

The following registers are defined:

* KVM_REG_ARM_PSCI_VERSION:

  KVM implements the PSCI (Power State Coordination Interface)
  specification in order to provide services such as CPU on/off, reset
  and power-off to the guest.

  - Only valid if the vcpu has the KVM_ARM_VCPU_PSCI_0_2 feature set
    (and thus has already been initialized)
  - Returns the current PSCI version on GET_ONE_REG (defaulting to the
    highest PSCI version implemented by KVM and compatible with v0.2)
  - Allows any PSCI version implemented by KVM and compatible with
    v0.2 to be set with SET_ONE_REG
  - Affects the whole VM (even if the register view is per-vcpu)

* KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1:
    Holds the state of the firmware support to mitigate CVE-2017-5715, as
    offered by KVM to the guest via a HVC call. The workaround is described
    under SMCCC_ARCH_WORKAROUND_1 in [1]_.

  Accepted values are:

    KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL:
      KVM does not offer
      firmware support for the workaround. The mitigation status for the
      guest is unknown.
    KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL:
      The workaround HVC call is
      available to the guest and required for the mitigation.
    KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_REQUIRED:
      The workaround HVC call
      is available to the guest, but it is not needed on this VCPU.

* KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2:
    Holds the state of the firmware support to mitigate CVE-2018-3639, as
    offered by KVM to the guest via a HVC call. The workaround is described
    under SMCCC_ARCH_WORKAROUND_2 in [1]_.

  Accepted values are:

    KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL:
      A workaround is not
      available. KVM does not offer firmware support for the workaround.
    KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNKNOWN:
      The workaround state is
      unknown. KVM does not offer firmware support for the workaround.
    KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL:
      The workaround is available,
      and can be disabled by a vCPU. If
      KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED is set, it is active for
      this vCPU.
    KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_REQUIRED:
      The workaround is always active on this vCPU or it is not needed.
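
As a small illustration of the value exchanged through
KVM_REG_ARM_PSCI_VERSION, the sketch below splits a PSCI version into its
major and minor parts. It assumes the encoding defined by the PSCI
specification (major revision in bits [31:16], minor revision in bits
[15:0]). The helper names are hypothetical and not part of the KVM API,
and the GET/SET_ONE_REG ioctl calls themselves are not shown.

```c
#include <stdint.h>

/* Hypothetical helpers; layout assumed from the PSCI_VERSION return format. */
static inline uint32_t psci_version_major(uint64_t reg_val)
{
	/* Major revision lives in bits [31:16]. */
	return (uint32_t)((reg_val >> 16) & 0xffff);
}

static inline uint32_t psci_version_minor(uint64_t reg_val)
{
	/* Minor revision lives in bits [15:0]. */
	return (uint32_t)(reg_val & 0xffff);
}

static inline uint64_t psci_version_encode(uint32_t major, uint32_t minor)
{
	/* Build a value suitable for writing back with SET_ONE_REG. */
	return ((uint64_t)(major & 0xffff) << 16) | (minor & 0xffff);
}
```

With these helpers, a VMM that wants to pin a migrated guest to PSCI v0.2
would write ``psci_version_encode(0, 2)`` to the register before the vcpu
runs.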


Bitmap Feature Firmware Registers
---------------------------------

Contrary to the above registers, the following registers expose the
hypercall services to userspace in the form of a feature bitmap. This
bitmap is translated to the services that are available to the guest.
One register is defined per service call owner, and each can be accessed
via the GET/SET_ONE_REG interface.

By default, these registers are set with the upper limit of the features
that are supported. This way userspace can discover all the usable
hypercall services via GET_ONE_REG. Userspace can then write the desired
bitmap back via SET_ONE_REG. The features for the registers that are left
untouched, probably because userspace isn't aware of them, will be
exposed as is to the guest.

Note that KVM will not allow userspace to configure these registers
once any of the vCPUs has run at least once. Instead, it will return
-EBUSY.

The pseudo-firmware bitmap registers are as follows:

* KVM_REG_ARM_STD_BMAP:
    Controls the bitmap of the ARM Standard Secure Service Calls.

  The following bits are accepted:

    Bit-0: KVM_REG_ARM_STD_BIT_TRNG_V1_0:
      The bit represents the services offered under v1.0 of ARM True Random
      Number Generator (TRNG) specification, ARM DEN0098.

* KVM_REG_ARM_STD_HYP_BMAP:
    Controls the bitmap of the ARM Standard Hypervisor Service Calls.

  The following bits are accepted:

    Bit-0: KVM_REG_ARM_STD_HYP_BIT_PV_TIME:
      The bit represents the Paravirtualized Time service as represented by
      ARM DEN0057A.

* KVM_REG_ARM_VENDOR_HYP_BMAP:
    Controls the bitmap of the Vendor specific Hypervisor Service Calls.

  The following bits are accepted:

    Bit-0: KVM_REG_ARM_VENDOR_HYP_BIT_FUNC_FEAT:
      The bit represents the ARM_SMCCC_VENDOR_HYP_KVM_FEATURES_FUNC_ID
      and ARM_SMCCC_VENDOR_HYP_CALL_UID_FUNC_ID function-ids.

    Bit-1: KVM_REG_ARM_VENDOR_HYP_BIT_PTP:
      The bit represents the Precision Time Protocol KVM service.

Errors:

    =======  =============================================================
    -ENOENT   Unknown register accessed.
    -EBUSY    Attempt to write to the register after the VM has started.
    -EINVAL   Invalid bitmap written to the register.
    =======  =============================================================
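
The bitmap workflow described in this section (read the default bitmap,
drop the services the guest should not see, write the result back) can be
sketched in plain C. The bit positions are the ones documented for
KVM_REG_ARM_VENDOR_HYP_BMAP. The macro and helper names are hypothetical,
the subset check is only a local approximation of the documented -EINVAL
case, and the actual register access through GET/SET_ONE_REG is left out.

```c
#include <errno.h>
#include <stdint.h>

/* Bit positions as documented for KVM_REG_ARM_VENDOR_HYP_BMAP. */
#define VENDOR_HYP_BIT_FUNC_FEAT 0
#define VENDOR_HYP_BIT_PTP       1

/* Hypothetical helper: clear one service bit from a bitmap. */
static inline uint64_t hide_service(uint64_t bitmap, unsigned int bit)
{
	return bitmap & ~(1ULL << bit);
}

/*
 * Hypothetical helper: compute the value to write back via SET_ONE_REG.
 * 'supported' is the value read with GET_ONE_REG (the upper limit) and
 * 'wanted' is what userspace would like the guest to see. Returns 0 and
 * stores the result, or -EINVAL if 'wanted' requests a bit that is not
 * in the supported set.
 */
static inline int select_services(uint64_t supported, uint64_t wanted,
				  uint64_t *out)
{
	if (wanted & ~supported)
		return -EINVAL;
	*out = wanted;
	return 0;
}
```

Checking the requested bitmap locally before issuing SET_ONE_REG is a
design choice of this sketch; KVM performs its own validation in any case.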

.. [1] https://developer.arm.com/-/media/developer/pdf/ARM_DEN_0070A_Firmware_interfaces_for_mitigating_CVE-2017-5715.pdf
+44 −136
.. SPDX-License-Identifier: GPL-2.0

=======================
ARM Hypercall Interface
=======================

KVM handles the hypercall services as requested by the guests. New hypercall
services are regularly made available by the ARM specification or by KVM (as
vendor services) if they make sense from a virtualization point of view.

This means that a guest booted on two different versions of KVM can observe
two different "firmware" revisions. This could cause issues if a given guest
is tied to a particular version of a hypercall service, or if a migration
causes a different version to be exposed out of the blue to an unsuspecting
guest.

In order to remedy this situation, KVM exposes a set of "firmware
pseudo-registers" that can be manipulated using the GET/SET_ONE_REG
interface. These registers can be saved/restored by userspace, and set
to a convenient value as required.

The following registers are defined:

* KVM_REG_ARM_PSCI_VERSION:

  KVM implements the PSCI (Power State Coordination Interface)
  specification in order to provide services such as CPU on/off, reset
  and power-off to the guest.

  - Only valid if the vcpu has the KVM_ARM_VCPU_PSCI_0_2 feature set
    (and thus has already been initialized)
  - Returns the current PSCI version on GET_ONE_REG (defaulting to the
    highest PSCI version implemented by KVM and compatible with v0.2)
  - Allows any PSCI version implemented by KVM and compatible with
    v0.2 to be set with SET_ONE_REG
  - Affects the whole VM (even if the register view is per-vcpu)

* KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1:
    Holds the state of the firmware support to mitigate CVE-2017-5715, as
    offered by KVM to the guest via a HVC call. The workaround is described
    under SMCCC_ARCH_WORKAROUND_1 in [1]_.

  Accepted values are:

    KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL:
      KVM does not offer
      firmware support for the workaround. The mitigation status for the
      guest is unknown.
    KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL:
      The workaround HVC call is
      available to the guest and required for the mitigation.
    KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_REQUIRED:
      The workaround HVC call
      is available to the guest, but it is not needed on this VCPU.

* KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2:
    Holds the state of the firmware support to mitigate CVE-2018-3639, as
    offered by KVM to the guest via a HVC call. The workaround is described
    under SMCCC_ARCH_WORKAROUND_2 in [1]_.

  Accepted values are:

    KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL:
      A workaround is not
      available. KVM does not offer firmware support for the workaround.
    KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNKNOWN:
      The workaround state is
      unknown. KVM does not offer firmware support for the workaround.
    KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL:
      The workaround is available,
      and can be disabled by a vCPU. If
      KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED is set, it is active for
      this vCPU.
    KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_REQUIRED:
      The workaround is always active on this vCPU or it is not needed.


Bitmap Feature Firmware Registers
---------------------------------

Contrary to the above registers, the following registers expose the
hypercall services to userspace in the form of a feature bitmap. This
bitmap is translated to the services that are available to the guest.
One register is defined per service call owner, and each can be accessed
via the GET/SET_ONE_REG interface.

By default, these registers are set with the upper limit of the features
that are supported. This way userspace can discover all the usable
hypercall services via GET_ONE_REG. Userspace can then write the desired
bitmap back via SET_ONE_REG. The features for the registers that are left
untouched, probably because userspace isn't aware of them, will be
exposed as is to the guest.

Note that KVM will not allow userspace to configure these registers
once any of the vCPUs has run at least once. Instead, it will return
-EBUSY.

The pseudo-firmware bitmap registers are as follows:

* KVM_REG_ARM_STD_BMAP:
    Controls the bitmap of the ARM Standard Secure Service Calls.

  The following bits are accepted:

    Bit-0: KVM_REG_ARM_STD_BIT_TRNG_V1_0:
      The bit represents the services offered under v1.0 of ARM True Random
      Number Generator (TRNG) specification, ARM DEN0098.

* KVM_REG_ARM_STD_HYP_BMAP:
    Controls the bitmap of the ARM Standard Hypervisor Service Calls.

  The following bits are accepted:

    Bit-0: KVM_REG_ARM_STD_HYP_BIT_PV_TIME:
      The bit represents the Paravirtualized Time service as represented by
      ARM DEN0057A.

* KVM_REG_ARM_VENDOR_HYP_BMAP:
    Controls the bitmap of the Vendor specific Hypervisor Service Calls.

  The following bits are accepted:

    Bit-0: KVM_REG_ARM_VENDOR_HYP_BIT_FUNC_FEAT:
      The bit represents the ARM_SMCCC_VENDOR_HYP_KVM_FEATURES_FUNC_ID
      and ARM_SMCCC_VENDOR_HYP_CALL_UID_FUNC_ID function-ids.

    Bit-1: KVM_REG_ARM_VENDOR_HYP_BIT_PTP:
      The bit represents the Precision Time Protocol KVM service.

Errors:

    =======  =============================================================
    -ENOENT   Unknown register accessed.
    -EBUSY    Attempt to write to the register after the VM has started.
    -EINVAL   Invalid bitmap written to the register.
    =======  =============================================================

.. [1] https://developer.arm.com/-/media/developer/pdf/ARM_DEN_0070A_Firmware_interfaces_for_mitigating_CVE-2017-5715.pdf
===============================================
KVM/arm64-specific hypercalls exposed to guests
===============================================

This file documents the KVM/arm64-specific hypercalls which may be
exposed by KVM/arm64 to guest operating systems. These hypercalls are
issued using the HVC instruction according to version 1.1 of the Arm SMC
Calling Convention (DEN0028/C):

https://developer.arm.com/docs/den0028/c

All KVM/arm64-specific hypercalls are allocated within the "Vendor
Specific Hypervisor Service Call" range with a UID of
``28b46fb6-2ec5-11e9-a9ca-4b564d003a74``. This UID should be queried by the
guest using the standard "Call UID" function for the service range in
order to determine that the KVM/arm64-specific hypercalls are available.

``ARM_SMCCC_VENDOR_HYP_KVM_FEATURES_FUNC_ID``
---------------------------------------------

Provides a discovery mechanism for other KVM/arm64 hypercalls.

+---------------------+-------------------------------------------------------------+
| Presence:           | Mandatory for the KVM/arm64 UID                             |
+---------------------+-------------------------------------------------------------+
| Calling convention: | HVC32                                                       |
+---------------------+----------+--------------------------------------------------+
| Function ID:        | (uint32) | 0x86000000                                       |
+---------------------+----------+--------------------------------------------------+
| Arguments:          | None                                                        |
+---------------------+----------+----+---------------------------------------------+
| Return Values:      | (uint32) | R0 | Bitmap of available function numbers 0-31   |
|                     +----------+----+---------------------------------------------+
|                     | (uint32) | R1 | Bitmap of available function numbers 32-63  |
|                     +----------+----+---------------------------------------------+
|                     | (uint32) | R2 | Bitmap of available function numbers 64-95  |
|                     +----------+----+---------------------------------------------+
|                     | (uint32) | R3 | Bitmap of available function numbers 96-127 |
+---------------------+----------+----+---------------------------------------------+
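
A guest could test the returned bitmaps along the following lines. This is
a minimal sketch that assumes bit ``n`` of register ``Ri`` stands for
function number ``32 * i + n``, which is how the table above reads. The
helper name is hypothetical and the HVC call that fills the array is not
shown.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: 'r' holds the four return registers R0-R3 of
 * ARM_SMCCC_VENDOR_HYP_KVM_FEATURES_FUNC_ID, each a 32-bit bitmap.
 * Returns true if function number 'func_nr' (0-127) is advertised.
 */
static inline bool kvm_hyp_func_available(const uint32_t r[4],
					  unsigned int func_nr)
{
	if (func_nr >= 128)
		return false;	/* outside the range covered by R0-R3 */
	return (r[func_nr / 32] >> (func_nr % 32)) & 1u;
}
```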

``ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID``
----------------------------------------

See ptp_kvm.rst
+1 −0
@@ -7,6 +7,7 @@ ARM
.. toctree::
   :maxdepth: 2

   fw-pseudo-registers
   hyp-abi
   hypercalls
   pvtime
+24 −14
@@ -7,19 +7,29 @@ PTP_KVM is used for high precision time sync between host and guests.
It relies on transferring the wall clock and counter value from the
host to the guest using a KVM-specific hypercall.

* ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID: 0x86000001
``ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID``
----------------------------------------

This hypercall uses the SMC32/HVC32 calling convention:
Retrieve current time information for the specific counter. There are no
endianness restrictions.

ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID
    ==============    ========    =====================================
    Function ID:      (uint32)    0x86000001
    Arguments:        (uint32)    KVM_PTP_VIRT_COUNTER(0)
                                  KVM_PTP_PHYS_COUNTER(1)
    Return Values:    (int32)     NOT_SUPPORTED(-1) on error, or
                      (uint32)    Upper 32 bits of wall clock time (r0)
                      (uint32)    Lower 32 bits of wall clock time (r1)
                      (uint32)    Upper 32 bits of counter (r2)
                      (uint32)    Lower 32 bits of counter (r3)
    Endianness:                   No Restrictions.
    ==============    ========    =====================================
+---------------------+-------------------------------------------------------+
| Presence:           | Optional                                              |
+---------------------+-------------------------------------------------------+
| Calling convention: | HVC32                                                 |
+---------------------+----------+--------------------------------------------+
| Function ID:        | (uint32) | 0x86000001                                 |
+---------------------+----------+----+---------------------------------------+
| Arguments:          | (uint32) | R1 | ``KVM_PTP_VIRT_COUNTER (0)``          |
|                     |          |    +---------------------------------------+
|                     |          |    | ``KVM_PTP_PHYS_COUNTER (1)``          |
+---------------------+----------+----+---------------------------------------+
| Return Values:      | (int32)  | R0 | ``NOT_SUPPORTED (-1)`` on error, else |
|                     |          |    | upper 32 bits of wall clock time      |
|                     +----------+----+---------------------------------------+
|                     | (uint32) | R1 | Lower 32 bits of wall clock time      |
|                     +----------+----+---------------------------------------+
|                     | (uint32) | R2 | Upper 32 bits of counter              |
|                     +----------+----+---------------------------------------+
|                     | (uint32) | R3 | Lower 32 bits of counter              |
+---------------------+----------+----+---------------------------------------+
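
The return layout in the table can be turned back into two 64-bit values
as sketched below. The structure and helper names are hypothetical, the
HVC call that fills the register array is not shown, and the error check
simply follows the table by treating R0 equal to ``NOT_SUPPORTED (-1)`` as
a failure.

```c
#include <stdint.h>

/* Hypothetical container for the decoded result. */
struct kvm_ptp_sample {
	uint64_t wall_clock;	/* R0 (upper) and R1 (lower) */
	uint64_t counter;	/* R2 (upper) and R3 (lower) */
};

/*
 * Hypothetical helper: 'r' holds R0-R3 as returned by
 * ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID. Returns 0 on success, or -1 if
 * R0 carries NOT_SUPPORTED (-1, i.e. all bits set as a uint32).
 */
static inline int kvm_ptp_decode(const uint32_t r[4],
				 struct kvm_ptp_sample *out)
{
	if (r[0] == UINT32_MAX)
		return -1;
	out->wall_clock = ((uint64_t)r[0] << 32) | r[1];
	out->counter = ((uint64_t)r[2] << 32) | r[3];
	return 0;
}
```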