Commit 0f099dc9 authored by Linus Torvalds
Pull KVM fixes from Paolo Bonzini:
 "ARM:

   - Ensure perf events programmed to count during guest execution are
     actually enabled before entering the guest in the nVHE
     configuration

   - Restore out-of-range handler for stage-2 translation faults

   - Several fixes to stage-2 TLB invalidations to avoid stale
     translations, possibly including partial walk caches

   - Fix early handling of architectural VHE-only systems to ensure E2H
     is appropriately set

   - Correct a format specifier warning in the arch_timer selftest

   - Make the KVM banner message correctly handle all of the possible
     configurations

  RISC-V:

   - Remove redundant semicolon in num_isa_ext_regs()

   - Fix APLIC setipnum_le/be write emulation

   - Fix APLIC in_clrip[x] read emulation

  x86:

   - Fix a bug in KVM_SET_CPUID{2,} where KVM looks at the wrong CPUID
     entries (old vs. new) and ultimately neglects to clear PV_UNHALT
     from vCPUs with HLT-exiting disabled

   - Documentation fixes for SEV

   - Fix compat ABI for KVM_MEMORY_ENCRYPT_OP

   - Fix a 14-year-old goof in a declaration shared by host and guest;
     the enabled field used by Linux when running as a guest pushes the
     size of "struct kvm_vcpu_pv_apf_data" from 64 to 68 bytes. This is
     really inconsequential because KVM never consumes anything beyond
     the first 64 bytes, but the resulting struct does not match the
     documentation

  Selftests:

   - Fix spelling mistake in arch_timer selftest"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (25 commits)
  KVM: arm64: Rationalise KVM banner output
  arm64: Fix early handling of FEAT_E2H0 not being implemented
  KVM: arm64: Ensure target address is granule-aligned for range TLBI
  KVM: arm64: Use TLBI_TTL_UNKNOWN in __kvm_tlb_flush_vmid_range()
  KVM: arm64: Don't pass a TLBI level hint when zapping table entries
  KVM: arm64: Don't defer TLB invalidation when zapping table entries
  KVM: selftests: Fix __GUEST_ASSERT() format warnings in ARM's arch timer test
  KVM: arm64: Fix out-of-IPA space translation fault handling
  KVM: arm64: Fix host-programmed guest events in nVHE
  RISC-V: KVM: Fix APLIC in_clrip[x] read emulation
  RISC-V: KVM: Fix APLIC setipnum_le/be write emulation
  RISC-V: KVM: Remove second semicolon
  KVM: selftests: Fix spelling mistake "trigged" -> "triggered"
  Documentation: kvm/sev: clarify usage of KVM_MEMORY_ENCRYPT_OP
  Documentation: kvm/sev: separate description of firmware
  KVM: SEV: fix compat ABI for KVM_MEMORY_ENCRYPT_OP
  KVM: selftests: Check that PV_UNHALT is cleared when HLT exiting is disabled
  KVM: x86: Use actual kvm_cpuid.base for clearing KVM_FEATURE_PV_UNHALT
  KVM: x86: Introduce __kvm_get_hypervisor_cpuid() helper
  KVM: SVM: Return -EINVAL instead of -EBUSY on attempt to re-init SEV/SEV-ES
  ...
parents 701b3899 9bc60f73
+24 −18
@@ -46,21 +46,16 @@ SEV hardware uses ASIDs to associate a memory encryption key with a VM.
 Hence, the ASID for the SEV-enabled guests must be from 1 to a maximum value
 defined in the CPUID 0x8000001f[ecx] field.

-SEV Key Management
-==================
+The KVM_MEMORY_ENCRYPT_OP ioctl
+===============================

-The SEV guest key management is handled by a separate processor called the AMD
-Secure Processor (AMD-SP). Firmware running inside the AMD-SP provides a secure
-key management interface to perform common hypervisor activities such as
-encrypting bootstrap code, snapshot, migrating and debugging the guest. For more
-information, see the SEV Key Management spec [api-spec]_
-
-The main ioctl to access SEV is KVM_MEMORY_ENCRYPT_OP.  If the argument
-to KVM_MEMORY_ENCRYPT_OP is NULL, the ioctl returns 0 if SEV is enabled
-and ``ENOTTY`` if it is disabled (on some older versions of Linux,
-the ioctl runs normally even with a NULL argument, and therefore will
-likely return ``EFAULT``).  If non-NULL, the argument to KVM_MEMORY_ENCRYPT_OP
-must be a struct kvm_sev_cmd::
+The main ioctl to access SEV is KVM_MEMORY_ENCRYPT_OP, which operates on
+the VM file descriptor.  If the argument to KVM_MEMORY_ENCRYPT_OP is NULL,
+the ioctl returns 0 if SEV is enabled and ``ENOTTY`` if it is disabled
+(on some older versions of Linux, the ioctl tries to run normally even
+with a NULL argument, and therefore will likely return ``EFAULT`` instead
+of zero if SEV is enabled).  If non-NULL, the argument to
+KVM_MEMORY_ENCRYPT_OP must be a struct kvm_sev_cmd::

        struct kvm_sev_cmd {
                __u32 id;
@@ -87,10 +82,6 @@ guests, such as launching, running, snapshotting, migrating and decommissioning.
 The KVM_SEV_INIT command is used by the hypervisor to initialize the SEV platform
 context. In a typical workflow, this command should be the first command issued.

-The firmware can be initialized either by using its own non-volatile storage or
-the OS can manage the NV storage for the firmware using the module parameter
-``init_ex_path``. If the file specified by ``init_ex_path`` does not exist or
-is invalid, the OS will create or override the file with output from PSP.

 Returns: 0 on success, -negative on error

@@ -434,6 +425,21 @@ issued by the hypervisor to make the guest ready for execution.

 Returns: 0 on success, -negative on error

+Firmware Management
+===================
+
+The SEV guest key management is handled by a separate processor called the AMD
+Secure Processor (AMD-SP). Firmware running inside the AMD-SP provides a secure
+key management interface to perform common hypervisor activities such as
+encrypting bootstrap code, snapshot, migrating and debugging the guest. For more
+information, see the SEV Key Management spec [api-spec]_
+
+The AMD-SP firmware can be initialized either by using its own non-volatile
+storage or the OS can manage the NV storage for the firmware using
+parameter ``init_ex_path`` of the ``ccp`` module. If the file specified
+by ``init_ex_path`` does not exist or is invalid, the OS will create or
+override the file with PSP non-volatile storage.
+
 References
 ==========

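The NULL-argument probe described in the reworded documentation above can be sketched in C roughly as follows. This is only an illustration, not code from the series: the enum and the two function names are hypothetical, and it assumes that `vm_fd` is a VM file descriptor obtained through `KVM_CREATE_VM` and that the `<linux/kvm.h>` UAPI header is installed.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Hypothetical result type for this sketch; not part of the KVM UAPI. */
enum sev_probe {
	SEV_PROBE_ENABLED,	/* ioctl returned 0 */
	SEV_PROBE_DISABLED,	/* ioctl failed with ENOTTY */
	SEV_PROBE_LEGACY,	/* EFAULT: older kernel tried to read the NULL argument */
	SEV_PROBE_ERROR,	/* anything else, e.g. a bad file descriptor */
};

/* Map the ioctl return value and errno to the cases the documentation lists. */
enum sev_probe sev_classify_probe(int ret, int err)
{
	if (ret == 0)
		return SEV_PROBE_ENABLED;
	if (err == ENOTTY)
		return SEV_PROBE_DISABLED;
	if (err == EFAULT)
		return SEV_PROBE_LEGACY;
	return SEV_PROBE_ERROR;
}

/* Issue the probe on a VM file descriptor (assumed to come from KVM_CREATE_VM). */
enum sev_probe sev_probe(int vm_fd)
{
	int ret = ioctl(vm_fd, KVM_MEMORY_ENCRYPT_OP, NULL);

	return sev_classify_probe(ret, ret < 0 ? errno : 0);
}
```

A caller would likely treat `SEV_PROBE_LEGACY` as "SEV probably enabled, kernel predates the NULL-argument convention", which matches the parenthetical in the documentation text.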
+9 −10
@@ -193,8 +193,8 @@ data:
 	Asynchronous page fault (APF) control MSR.

 	Bits 63-6 hold 64-byte aligned physical address of a 64 byte memory area
-	which must be in guest RAM and must be zeroed. This memory is expected
-	to hold a copy of the following structure::
+	which must be in guest RAM. This memory is expected to hold the
+	following structure::

 	  struct kvm_vcpu_pv_apf_data {
 		/* Used for 'page not present' events delivered via #PF */
@@ -204,7 +204,6 @@ data:
 		__u32 token;

 		__u8 pad[56];
-		__u32 enabled;
 	  };

 	Bits 5-4 of the MSR are reserved and should be zero. Bit 0 is set to 1
@@ -232,14 +231,14 @@ data:
 	as regular page fault, guest must reset 'flags' to '0' before it does
 	something that can generate normal page fault.

-	Bytes 5-7 of 64 byte memory location ('token') will be written to by the
+	Bytes 4-7 of 64 byte memory location ('token') will be written to by the
 	hypervisor at the time of APF 'page ready' event injection. The content
-	of these bytes is a token which was previously delivered as 'page not
-	present' event. The event indicates the page in now available. Guest is
-	supposed to write '0' to 'token' when it is done handling 'page ready'
-	event and to write 1' to MSR_KVM_ASYNC_PF_ACK after clearing the location;
-	writing to the MSR forces KVM to re-scan its queue and deliver the next
-	pending notification.
+	of these bytes is a token which was previously delivered in CR2 as
+	'page not present' event. The event indicates the page is now available.
+	Guest is supposed to write '0' to 'token' when it is done handling
+	'page ready' event and to write '1' to MSR_KVM_ASYNC_PF_ACK after
+	clearing the location; writing to the MSR forces KVM to re-scan its
+	queue and deliver the next pending notification.

 	Note, MSR_KVM_ASYNC_PF_INT MSR specifying the interrupt vector for 'page
 	ready' APF delivery needs to be written to before enabling APF mechanism
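The 64 versus 68 byte discrepancy mentioned in the pull message, and the corrected "Bytes 4-7" offset of 'token', can be checked with a small layout sketch. The two struct names below are hypothetical stand-ins for `struct kvm_vcpu_pv_apf_data`. The first member is assumed to be a 32-bit `flags` field, since the hunk elides it and the text only implies it by placing `token` at bytes 4-7.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout matching the documentation after the fix: exactly 64 bytes. */
struct apf_data_documented {
	uint32_t flags;		/* assumed type; occupies bytes 0-3 */
	uint32_t token;		/* bytes 4-7 */
	uint8_t pad[56];	/* bytes 8-63 */
};

/* Layout with the extra guest-only field that the fix removes from the docs. */
struct apf_data_with_enabled {
	uint32_t flags;
	uint32_t token;
	uint8_t pad[56];
	uint32_t enabled;	/* lands at byte 64, outside the 64-byte area */
};

static_assert(sizeof(struct apf_data_documented) == 64,
	      "documented layout fills the 64-byte area exactly");
static_assert(offsetof(struct apf_data_documented, token) == 4,
	      "token occupies bytes 4-7");
static_assert(sizeof(struct apf_data_with_enabled) == 68,
	      "trailing field pushes the struct to 68 bytes");
static_assert(offsetof(struct apf_data_with_enabled, enabled) == 64,
	      "enabled sits past the area described to the hypervisor");
```

Because `enabled` starts at offset 64, a hypervisor that only reads or writes the first 64 bytes never touches it, which is consistent with the pull message calling the mismatch harmless.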
+16 −13
@@ -291,6 +291,21 @@ SYM_INNER_LABEL(init_el2, SYM_L_LOCAL)
 	blr	x2
 0:
 	mov_q	x0, HCR_HOST_NVHE_FLAGS
+
+	/*
+	 * Compliant CPUs advertise their VHE-onlyness with
+	 * ID_AA64MMFR4_EL1.E2H0 < 0. HCR_EL2.E2H can be
+	 * RES1 in that case. Publish the E2H bit early so that
+	 * it can be picked up by the init_el2_state macro.
+	 *
+	 * Fruity CPUs seem to have HCR_EL2.E2H set to RAO/WI, but
+	 * don't advertise it (they predate this relaxation).
+	 */
+	mrs_s	x1, SYS_ID_AA64MMFR4_EL1
+	tbz	x1, #(ID_AA64MMFR4_EL1_E2H0_SHIFT + ID_AA64MMFR4_EL1_E2H0_WIDTH - 1), 1f
+
+	orr	x0, x0, #HCR_E2H
+1:
 	msr	hcr_el2, x0
 	isb

@@ -303,22 +318,10 @@ SYM_INNER_LABEL(init_el2, SYM_L_LOCAL)

 	mov_q	x1, INIT_SCTLR_EL1_MMU_OFF

-	/*
-	 * Compliant CPUs advertise their VHE-onlyness with
-	 * ID_AA64MMFR4_EL1.E2H0 < 0. HCR_EL2.E2H can be
-	 * RES1 in that case.
-	 *
-	 * Fruity CPUs seem to have HCR_EL2.E2H set to RES1, but
-	 * don't advertise it (they predate this relaxation).
-	 */
-	mrs_s	x0, SYS_ID_AA64MMFR4_EL1
-	ubfx	x0, x0, #ID_AA64MMFR4_EL1_E2H0_SHIFT, #ID_AA64MMFR4_EL1_E2H0_WIDTH
-	tbnz	x0, #(ID_AA64MMFR4_EL1_E2H0_SHIFT + ID_AA64MMFR4_EL1_E2H0_WIDTH - 1), 1f
-
 	mrs	x0, hcr_el2
 	and	x0, x0, #HCR_E2H
 	cbz	x0, 2f
-1:
+
 	/* Set a sane SCTLR_EL1, the VHE way */
 	pre_disable_mmu_workaround
 	msr_s	SYS_SCTLR_EL12, x1
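The "E2H0 < 0" test in the assembly above amounts to checking the top bit of a signed ID register field. The following C model is a sketch of that idea, not kernel code. The shift of 24 and width of 4 are assumptions about the `ID_AA64MMFR4_EL1.E2H0` field position, and all function names are made up for illustration. It also models the removed `ubfx` + `tbnz` sequence, which, as far as can be read from the hunk, tested the original bit index on an already-extracted field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed field position of ID_AA64MMFR4_EL1.E2H0 (bits [27:24]). */
#define E2H0_SHIFT	24
#define E2H0_WIDTH	4

static_assert(E2H0_SHIFT + E2H0_WIDTH <= 64, "field must fit in 64 bits");

/* Model of tbz/tbnz: report whether a single bit of a register is set. */
static bool bit_is_set(uint64_t value, unsigned int bit)
{
	return (value >> bit) & 1;
}

/* New sequence: test the field's sign bit directly on the raw register. */
bool e2h0_negative_raw(uint64_t mmfr4)
{
	return bit_is_set(mmfr4, E2H0_SHIFT + E2H0_WIDTH - 1);
}

/* Removed sequence: extract the field first, then test the same bit index. */
bool e2h0_negative_after_ubfx(uint64_t mmfr4)
{
	uint64_t field = (mmfr4 >> E2H0_SHIFT) & ((1ULL << E2H0_WIDTH) - 1);

	return bit_is_set(field, E2H0_SHIFT + E2H0_WIDTH - 1);
}

/* Reference: sign-extend the field and return it as a signed value. */
int e2h0_signed(uint64_t mmfr4)
{
	int field = (int)((mmfr4 >> E2H0_SHIFT) & ((1ULL << E2H0_WIDTH) - 1));

	return field >= (1 << (E2H0_WIDTH - 1)) ? field - (1 << E2H0_WIDTH) : field;
}
```

Under these assumptions, a register value with E2H0 = 0b1110 (that is, -2) makes `e2h0_negative_raw()` return true, while `e2h0_negative_after_ubfx()` returns false for every input because the extracted field only occupies bits [3:0].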
+5 −8
@@ -2597,14 +2597,11 @@ static __init int kvm_arm_init(void)
 	if (err)
 		goto out_hyp;

-	if (is_protected_kvm_enabled()) {
-		kvm_info("Protected nVHE mode initialized successfully\n");
-	} else if (in_hyp_mode) {
-		kvm_info("VHE mode initialized successfully\n");
-	} else {
-		char mode = cpus_have_final_cap(ARM64_KVM_HVHE) ? 'h' : 'n';
-		kvm_info("Hyp mode (%cVHE) initialized successfully\n", mode);
-	}
+	kvm_info("%s%sVHE mode initialized successfully\n",
+		 in_hyp_mode ? "" : (is_protected_kvm_enabled() ?
+				     "Protected " : "Hyp "),
+		 in_hyp_mode ? "" : (cpus_have_final_cap(ARM64_KVM_HVHE) ?
+				     "h" : "n"));

 	/*
 	 * FIXME: Do something reasonable if kvm_init() fails after pKVM
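To see which strings the rationalised banner can produce, the format expression from the hunk above can be lifted into a standalone userspace function. This is a sketch only: `kvm_banner()` and its boolean parameters are hypothetical stand-ins for `in_hyp_mode`, `is_protected_kvm_enabled()` and `cpus_have_final_cap(ARM64_KVM_HVHE)`, and `snprintf()` replaces `kvm_info()`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Build the banner using the same two conditional string arguments as the
 * new kvm_info() call. Returns the length snprintf() would have written.
 */
int kvm_banner(char *buf, size_t len, bool in_hyp_mode,
	       bool protected_kvm, bool hvhe)
{
	return snprintf(buf, len, "%s%sVHE mode initialized successfully",
			in_hyp_mode ? "" : (protected_kvm ? "Protected " : "Hyp "),
			in_hyp_mode ? "" : (hvhe ? "h" : "n"));
}
```

With this model, the protected case distinguishes "Protected nVHE" from "Protected hVHE", whereas the removed code printed "Protected nVHE" whenever protected KVM was enabled.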
+2 −1
@@ -154,7 +154,8 @@ void __kvm_tlb_flush_vmid_range(struct kvm_s2_mmu *mmu,
 	/* Switch to requested VMID */
 	__tlb_switch_to_guest(mmu, &cxt, false);

-	__flush_s2_tlb_range_op(ipas2e1is, start, pages, stride, 0);
+	__flush_s2_tlb_range_op(ipas2e1is, start, pages, stride,
+				TLBI_TTL_UNKNOWN);

 	dsb(ish);
 	__tlbi(vmalle1is);