mirror of
https://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git
synced 2026-04-18 03:23:53 -04:00
Merge branch 'kvm-tdx-initial' into HEAD

This large commit contains the initial support for TDX in KVM. All x86
parts enable the host-side hypercalls that KVM uses to talk to the TDX
module, a software component that runs in a special CPU mode called SEAM
(Secure Arbitration Mode).

The series is in turn split into multiple sub-series, each with a separate
merge commit:

- Initialization: basic setup for using the TDX module from KVM, plus
  ioctls to create TDX VMs and vCPUs.

- MMU: in TDX, private and shared halves of the address space are mapped by
  different EPT roots, and the private half is managed by the TDX module.
  Using the support that was added to the generic MMU code in 6.14,
  add support for TDX's secure page tables to the Intel side of KVM.
  Generic KVM code takes care of maintaining a mirror of the secure page
  tables so that they can be queried efficiently, and ensuring that changes
  are applied to both the mirror and the secure EPT.

- vCPU enter/exit: implement the callbacks that handle the entry of a TDX
  vCPU (via the SEAMCALL TDH.VP.ENTER) and the corresponding save/restore
  of host state.

- Userspace exits: introduce support for guest TDVMCALLs that KVM forwards to
  userspace. These correspond to the usual KVM_EXIT_* "heavyweight vmexits"
  but are triggered through a different mechanism, similar to VMGEXIT for
  SEV-ES and SEV-SNP.

- Interrupt handling: support for virtual interrupt injection as well as
  handling VM-Exits that are caused by vectored events. Exclusive to
  TDX are machine-check SMIs, which the kernel already knows how to
  handle through the kernel machine check handler (commit 7911f145de,
  "x86/mce: Implement recovery for errors in TDX/SEAM non-root mode")

- Loose ends: handling of the remaining exits from the TDX module, including
  EPT violation/misconfig and several TDVMCALL leaves that are handled in
  the kernel (CPUID, HLT, RDMSR/WRMSR, GetTdVmCallInfo); plus returning
  an error or ignoring operations that are not supported by TDX guests

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
@@ -1411,6 +1411,9 @@ the memory region are automatically reflected into the guest. For example, an
 mmap() that affects the region will be made visible immediately. Another
 example is madvise(MADV_DROP).
 
+For TDX guest, deleting/moving memory region loses guest memory contents.
+Read only region isn't supported. Only as-id 0 is supported.
+
 Note: On arm64, a write generated by the page-table walker (to update
 the Access and Dirty flags, for example) never results in a
 KVM_EXIT_MMIO exit when the slot has the KVM_MEM_READONLY flag. This
@@ -4768,7 +4771,7 @@ H_GET_CPU_CHARACTERISTICS hypercall.
 
 :Capability: basic
 :Architectures: x86
-:Type: vm
+:Type: vm ioctl, vcpu ioctl
 :Parameters: an opaque platform specific structure (in/out)
 :Returns: 0 on success; -1 on error
 
@@ -4776,9 +4779,11 @@ If the platform supports creating encrypted VMs then this ioctl can be used
 for issuing platform-specific memory encryption commands to manage those
 encrypted VMs.
 
-Currently, this ioctl is used for issuing Secure Encrypted Virtualization
-(SEV) commands on AMD Processors. The SEV commands are defined in
-Documentation/virt/kvm/x86/amd-memory-encryption.rst.
+Currently, this ioctl is used for issuing both Secure Encrypted Virtualization
+(SEV) commands on AMD Processors and Trusted Domain Extensions (TDX) commands
+on Intel Processors. The detailed commands are defined in
+Documentation/virt/kvm/x86/amd-memory-encryption.rst and
+Documentation/virt/kvm/x86/intel-tdx.rst.
 
 4.111 KVM_MEMORY_ENCRYPT_REG_REGION
 -----------------------------------
@@ -6827,6 +6832,7 @@ should put the acknowledged interrupt vector into the 'epr' field.
 #define KVM_SYSTEM_EVENT_WAKEUP 4
 #define KVM_SYSTEM_EVENT_SUSPEND 5
 #define KVM_SYSTEM_EVENT_SEV_TERM 6
+#define KVM_SYSTEM_EVENT_TDX_FATAL 7
 __u32 type;
 __u32 ndata;
 __u64 data[16];
@@ -6853,6 +6859,11 @@ Valid values for 'type' are:
 reset/shutdown of the VM.
 - KVM_SYSTEM_EVENT_SEV_TERM -- an AMD SEV guest requested termination.
 The guest physical address of the guest's GHCB is stored in `data[0]`.
+- KVM_SYSTEM_EVENT_TDX_FATAL -- a TDX guest reported a fatal error state.
+KVM doesn't do any parsing or conversion, it just dumps 16 general-purpose
+registers to userspace, in ascending order of the 4-bit indices for x86-64
+general-purpose registers in instruction encoding, as defined in the Intel
+SDM.
 - KVM_SYSTEM_EVENT_WAKEUP -- the exiting vCPU is in a suspended state and
 KVM has recognized a wakeup event. Userspace may honor this event by
 marking the exiting vCPU as runnable, or deny it and call KVM_RUN again.
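The register ordering described for KVM_SYSTEM_EVENT_TDX_FATAL can be illustrated with a small userspace sketch. It is not part of the patch: the helper names `tdx_fatal_reg_name` and `tdx_fatal_dump` are hypothetical, and the code assumes that `data[i]` holds the register whose 4-bit encoding index is `i` (0 = RAX, 1 = RCX, 2 = RDX, 3 = RBX, 4 = RSP, 5 = RBP, 6 = RSI, 7 = RDI, then R8 through R15), which is how the text above reads.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * x86-64 general-purpose registers in ascending order of their 4-bit
 * instruction-encoding index (0 = RAX ... 15 = R15).
 */
static const char *const tdx_gpr_names[16] = {
	"rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
	"r8",  "r9",  "r10", "r11", "r12", "r13", "r14", "r15",
};

/* Hypothetical helper: register held in data[idx], or NULL if out of range. */
static const char *tdx_fatal_reg_name(size_t idx)
{
	return idx < 16 ? tdx_gpr_names[idx] : NULL;
}

/*
 * Hypothetical helper: print the register dump. 'data' and 'ndata' are
 * assumed to be copied from the system event fields of the same name.
 * Returns the number of registers printed (at most 16).
 */
static size_t tdx_fatal_dump(FILE *out, const uint64_t *data, uint32_t ndata)
{
	size_t n = ndata < 16 ? ndata : 16;

	for (size_t i = 0; i < n; i++)
		fprintf(out, "%-3s = 0x%016llx\n", tdx_gpr_names[i],
			(unsigned long long)data[i]);
	return n;
}
```

A VMM would typically call something like `tdx_fatal_dump(stderr, data, ndata)` from its KVM_EXIT_SYSTEM_EVENT handler when `type` is KVM_SYSTEM_EVENT_TDX_FATAL. What the individual register values mean is defined by the guest-side interface, not by KVM.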
@@ -8194,6 +8205,28 @@ KVM_X86_QUIRK_STUFF_FEATURE_MSRS By default, at vCPU creation, KVM sets the
                                     and 0x489), as KVM does now allow them to
                                     be set by userspace (KVM sets them based on
                                     guest CPUID, for safety purposes).
+
+KVM_X86_QUIRK_IGNORE_GUEST_PAT      By default, on Intel platforms, KVM ignores
+                                    guest PAT and forces the effective memory
+                                    type to WB in EPT. The quirk is not available
+                                    on Intel platforms which are incapable of
+                                    safely honoring guest PAT (i.e., without CPU
+                                    self-snoop, KVM always ignores guest PAT and
+                                    forces effective memory type to WB). It is
+                                    also ignored on AMD platforms or, on Intel,
+                                    when a VM has non-coherent DMA devices
+                                    assigned; KVM always honors guest PAT in
+                                    such case. The quirk is needed to avoid
+                                    slowdowns on certain Intel Xeon platforms
+                                    (e.g. ICX, SPR) where self-snoop feature is
+                                    supported but UC is slow enough to cause
+                                    issues with some older guests that use
+                                    UC instead of WC to map the video RAM.
+                                    Userspace can disable the quirk to honor
+                                    guest PAT if it knows that there is no such
+                                    guest software, for example if it does not
+                                    expose a bochs graphics device (which is
+                                    known to have had a buggy driver).
 =================================== ============================================
 
 7.32 KVM_CAP_MAX_VCPU_ID
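The conditions in the KVM_X86_QUIRK_IGNORE_GUEST_PAT entry can be summarized as a small decision function. This is only a reading aid, not KVM code: the struct, its field names, and `kvm_honors_guest_pat` are hypothetical. The order of the checks is also an assumption, in particular that an assigned non-coherent DMA device takes precedence over a missing self-snoop capability.

```c
#include <stdbool.h>

/* Hypothetical summary of the inputs named in the quirk description. */
struct pat_inputs {
	bool is_intel;         /* false models an AMD platform */
	bool has_self_snoop;   /* CPU self-snoop capability (Intel only) */
	bool noncoherent_dma;  /* VM has non-coherent DMA devices assigned */
	bool quirk_disabled;   /* userspace disabled the quirk */
};

/* Returns true if guest PAT is honored, false if WB is forced in EPT. */
static bool kvm_honors_guest_pat(struct pat_inputs in)
{
	/* The quirk is ignored on AMD; guest PAT is always honored. */
	if (!in.is_intel)
		return true;

	/* Assumed precedence: non-coherent DMA always honors guest PAT. */
	if (in.noncoherent_dma)
		return true;

	/* Without self-snoop the quirk cannot be disabled; WB is forced. */
	if (!in.has_self_snoop)
		return false;

	/* Otherwise guest PAT is honored only if the quirk was disabled. */
	return in.quirk_disabled;
}
```

Read this way, the default on a self-snoop-capable Intel host without non-coherent DMA is to force WB, and disabling the quirk is the only way for userspace to change that.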