Commit 54900632 authored by Paolo Bonzini
KVM/arm64 updates for 7.0

- Add support for FEAT_IDST, allowing ID registers that are not
  implemented to be reported as a normal trap rather than as an UNDEF
  exception.

- Add sanitisation of the VTCR_EL2 register, fixing a number of
  UXN/PXN/XN bugs in the process.

- Full handling of RESx bits, instead of only RES0, resulting in
  SCTLR_EL2 being added to the list of sanitised registers.

- More pKVM fixes for features that are not supposed to be exposed to
  guests.

- Make sure that MTE being disabled on the pKVM host doesn't give it
  the ability to attack the hypervisor.

- Allow pKVM's host stage-2 mappings to use the Force Write Back
  version of the memory attributes by using the "pass-through"
  encoding.

- Fix trapping of ICC_DIR_EL1 on GICv5 hosts emulating GICv3 for the
  guest.

- Preliminary work for guest GICv5 support.

- A bunch of debugfs fixes, removing pointless custom iterators stored
  in guest data structures.

- A small set of FPSIMD cleanups.

- Selftest fixes addressing the incorrect alignment of page
  allocation.

- Other assorted low-impact fixes and spelling fixes.
parents c14f6466 63163661
+12 −0
@@ -556,6 +556,18 @@ Before jumping into the kernel, the following conditions must be met:

   - MDCR_EL3.TPM (bit 6) must be initialized to 0b0

  For CPUs with support for 64-byte loads and stores without status (FEAT_LS64):

  - If the kernel is entered at EL1 and EL2 is present:

    - HCRX_EL2.EnALS (bit 1) must be initialised to 0b1.

  For CPUs with support for 64-byte stores with status (FEAT_LS64_V):

  - If the kernel is entered at EL1 and EL2 is present:

    - HCRX_EL2.EnASR (bit 2) must be initialised to 0b1.
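
The two HCRX_EL2 requirements above can be sketched as a small check. This is
only an illustration: the bit positions come from the text, while the helper
name and its boolean parameters are hypothetical and do not exist in the
kernel or in any firmware.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as stated in the requirements above. */
#define HCRX_EL2_EnALS	(UINT64_C(1) << 1)	/* needed for FEAT_LS64 */
#define HCRX_EL2_EnASR	(UINT64_C(1) << 2)	/* needed for FEAT_LS64_V */

/*
 * Hypothetical helper: returns true when an HCRX_EL2 value satisfies the
 * rules above. The rules only apply when the kernel is entered at EL1 and
 * EL2 is present, so the caller passes that condition in.
 */
static bool hcrx_ok_for_ls64(uint64_t hcrx, bool el1_entry_with_el2,
			     bool has_ls64, bool has_ls64_v)
{
	if (!el1_entry_with_el2)
		return true;
	if (has_ls64 && !(hcrx & HCRX_EL2_EnALS))
		return false;
	if (has_ls64_v && !(hcrx & HCRX_EL2_EnASR))
		return false;
	return true;
}
```

A boot loader self-check could call such a helper with the value it is about
to program into HCRX_EL2 before dropping to EL1.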

The requirements described above for CPU mode, caches, MMUs, architected
timers, coherency and system registers apply to all CPUs.  All CPUs must
enter the kernel in the same exception level.  Where the values documented
+7 −0
@@ -444,6 +444,13 @@ HWCAP3_MTE_STORE_ONLY
HWCAP3_LSFE
    Functionality implied by ID_AA64ISAR3_EL1.LSFE == 0b0001

HWCAP3_LS64
    Functionality implied by ID_AA64ISAR1_EL1.LS64 == 0b0001. Note that
    the ld64b/st64b instructions require support from the CPU, the system
    and the target (device) memory location; HWCAP3_LS64 only implies CPU
    support. Use ld64b/st64b only on a supported target (device) memory
    location, and otherwise fall back to the non-atomic alternatives.
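
As a rough userspace illustration of the note above, the hwcap only answers
the CPU half of the question; whether the target location supports the access
has to come from somewhere else. In this sketch the fallback value of
HWCAP3_LS64 is a placeholder assumption (the real value should come from the
kernel's asm/hwcap.h), and both helper names are hypothetical.

```c
#include <stdbool.h>
#include <sys/auxv.h>

#ifndef AT_HWCAP3
#define AT_HWCAP3	29	/* assumption: value from the uapi auxvec.h */
#endif
#ifndef HWCAP3_LS64
#define HWCAP3_LS64	(1UL << 3)	/* ASSUMPTION: placeholder bit only */
#endif

/* Pure test of an AT_HWCAP3 word, kept separate so it can be unit tested. */
static bool hwcap3_has_ls64(unsigned long hwcap3)
{
	return (hwcap3 & HWCAP3_LS64) != 0;
}

/*
 * Hypothetical policy helper: ld64b/st64b may be used only if the CPU
 * advertises the feature AND the caller knows, by some other means (for
 * example from the device driver), that the target location supports it.
 */
static bool may_use_ld64b(bool target_supports_ls64)
{
	return target_supports_ls64 &&
	       hwcap3_has_ls64(getauxval(AT_HWCAP3));
}
```

When may_use_ld64b() returns false, the caller would take the non-atomic
path, for instance eight ordinary 64-bit accesses.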


4. Unused AT_HWCAP bits
-----------------------
+36 −7
Original line number Diff line number Diff line
@@ -1303,12 +1303,13 @@ userspace, for example because of missing instruction syndrome decode
information or because there is no device mapped at the accessed IPA, then
userspace can ask the kernel to inject an external abort using the address
from the exiting fault on the VCPU. It is a programming error to set
ext_dabt_pending after an exit which was not either KVM_EXIT_MMIO or
KVM_EXIT_ARM_NISV. This feature is only available if the system supports
KVM_CAP_ARM_INJECT_EXT_DABT. This is a helper which provides commonality in
how userspace reports accesses for the above cases to guests, across different
userspace implementations. Nevertheless, userspace can still emulate all Arm
exceptions by manipulating individual registers using the KVM_SET_ONE_REG API.
ext_dabt_pending after an exit which was not either KVM_EXIT_MMIO,
KVM_EXIT_ARM_NISV, or KVM_EXIT_ARM_LDST64B. This feature is only available if
the system supports KVM_CAP_ARM_INJECT_EXT_DABT. This is a helper which
provides commonality in how userspace reports accesses for the above cases to
guests, across different userspace implementations. Nevertheless, userspace
can still emulate all Arm exceptions by manipulating individual registers
using the KVM_SET_ONE_REG API.

See KVM_GET_VCPU_EVENTS for the data structure.

@@ -7050,12 +7051,14 @@ in send_page or recv a buffer to recv_page).

::

		/* KVM_EXIT_ARM_NISV */
		/* KVM_EXIT_ARM_NISV / KVM_EXIT_ARM_LDST64B */
		struct {
			__u64 esr_iss;
			__u64 fault_ipa;
		} arm_nisv;

- KVM_EXIT_ARM_NISV:

Used on arm64 systems. If a guest accesses memory not in a memslot,
KVM will typically return to userspace and ask it to do MMIO emulation on its
behalf. However, for certain classes of instructions, no instruction decode
@@ -7089,6 +7092,32 @@ Note that although KVM_CAP_ARM_NISV_TO_USER will be reported if
queried outside of a protected VM context, the feature will not be
exposed if queried on a protected VM file descriptor.

- KVM_EXIT_ARM_LDST64B:

Used on arm64 systems. When a guest uses a LD64B, ST64B, ST64BV or ST64BV0
instruction outside of a memslot, KVM will return to userspace with
KVM_EXIT_ARM_LDST64B, exposing the relevant ESR_EL2 information and faulting
IPA, similarly to KVM_EXIT_ARM_NISV.

Userspace is supposed to fully emulate the instructions, which includes:

	- fetch of the operands for a store, including ACCDATA_EL1 in the case
	  of a ST64BV0 instruction
	- deal with the endianness if the guest is big-endian
	- emulate the access, including the delivery of an exception if the
	  access didn't succeed
	- provide a return value in the case of ST64BV/ST64BV0
	- return the data in the case of a load
	- increment PC if the instruction was successfully executed

Note that there is no expectation of performance for this emulation, as it
involves a large number of interactions with the guest state. It is, however,
expected that the instruction's semantics are preserved, especially the
single-copy atomicity property of the 64 byte access.

This exit reason must be handled if userspace sets ID_AA64ISAR1_EL1.LS64 to a
non-zero value, indicating that FEAT_LS64* is enabled.
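
The emulation steps listed above could be ordered roughly as in the following
sketch. Everything in it is hypothetical: the types, the device hook and the
demo device model a VMM's internal state rather than any KVM structure, and
the register numbers are assumed to have been decoded already (rt valid and
at most 22). The per-doubleword byte order handling and the ST64BV0 rule
(bottom 32 bits taken from ACCDATA_EL1) reflect my reading of the
architecture and should be checked against the Arm ARM.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

enum ls64_op { LS64_LD64B, LS64_ST64B, LS64_ST64BV, LS64_ST64BV0 };

struct ls64_vcpu {			/* hypothetical VMM-side vCPU model */
	uint64_t x[31], pc, accdata_el1;
	bool big_endian, ext_dabt_pending;
};

/* Hypothetical device hook: one 64-byte access, false on failure. */
typedef bool (*ls64_access_fn)(uint64_t ipa, uint8_t buf[64], bool is_store,
			       uint64_t *status);

static void put_dword(uint8_t *p, uint64_t v, bool be)
{
	for (int i = 0; i < 8; i++)
		p[i] = (uint8_t)(v >> (be ? 56 - 8 * i : 8 * i));
}

static uint64_t get_dword(const uint8_t *p, bool be)
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v |= (uint64_t)p[i] << (be ? 56 - 8 * i : 8 * i);
	return v;
}

static bool emulate_ls64(struct ls64_vcpu *cpu, enum ls64_op op, unsigned rt,
			 unsigned rs, uint64_t ipa, ls64_access_fn access)
{
	bool is_store = op != LS64_LD64B;
	uint8_t buf[64] = { 0 };
	uint64_t status = 0;

	/* Fetch the store operands, folding in ACCDATA_EL1 for ST64BV0. */
	for (unsigned i = 0; is_store && i < 8; i++) {
		uint64_t v = cpu->x[rt + i];

		if (i == 0 && op == LS64_ST64BV0)
			v = (v & ~UINT64_C(0xffffffff)) |
			    (cpu->accdata_el1 & UINT64_C(0xffffffff));
		put_dword(buf + 8 * i, v, cpu->big_endian);
	}

	/* Emulate the access; on failure deliver an abort, PC unchanged. */
	if (!access(ipa, buf, is_store, &status)) {
		cpu->ext_dabt_pending = true;
		return false;
	}

	if (!is_store)			/* return the data for a load */
		for (unsigned i = 0; i < 8; i++)
			cpu->x[rt + i] = get_dword(buf + 8 * i, cpu->big_endian);
	else if (op != LS64_ST64B)	/* status for ST64BV/ST64BV0 */
		cpu->x[rs] = status;

	cpu->pc += 4;			/* A64 instructions are 4 bytes */
	return true;
}

/* Demo device for illustration: 64 bytes of storage, rejects unaligned IPAs. */
static uint8_t ls64_demo_mem[64];

static bool ls64_demo_device(uint64_t ipa, uint8_t buf[64], bool is_store,
			     uint64_t *status)
{
	if (ipa & 63)
		return false;
	if (is_store)
		memcpy(ls64_demo_mem, buf, 64);
	else
		memcpy(buf, ls64_demo_mem, 64);
	*status = 0;
	return true;
}
```

The device hook receives the whole 64-byte buffer in a single call so that a
real backend has the chance to preserve single-copy atomicity; splitting the
access into smaller pieces inside the hook would defeat that.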

::

		/* KVM_EXIT_X86_RDMSR / KVM_EXIT_X86_WRMSR */
+0 −33
@@ -1680,7 +1680,6 @@ config MITIGATE_SPECTRE_BRANCH_HISTORY
config ARM64_SW_TTBR0_PAN
	bool "Emulate Privileged Access Never using TTBR0_EL1 switching"
	depends on !KCSAN
	select ARM64_PAN
	help
	  Enabling this option prevents the kernel from accessing
	  user-space memory directly by pointing TTBR0_EL1 to a reserved
@@ -1859,36 +1858,6 @@ config ARM64_HW_AFDBM
	  to work on pre-ARMv8.1 hardware and the performance impact is
	  minimal. If unsure, say Y.

config ARM64_PAN
	bool "Enable support for Privileged Access Never (PAN)"
	default y
	help
	  Privileged Access Never (PAN; part of the ARMv8.1 Extensions)
	  prevents the kernel or hypervisor from accessing user-space (EL0)
	  memory directly.

	  Choosing this option will cause any unprotected (not using
	  copy_to_user et al) memory access to fail with a permission fault.

	  The feature is detected at runtime, and will remain as a 'nop'
	  instruction if the cpu does not implement the feature.

config ARM64_LSE_ATOMICS
	bool
	default ARM64_USE_LSE_ATOMICS

config ARM64_USE_LSE_ATOMICS
	bool "Atomic instructions"
	default y
	help
	  As part of the Large System Extensions, ARMv8.1 introduces new
	  atomic instructions that are designed specifically to scale in
	  very large systems.

	  Say Y here to make use of these instructions for the in-kernel
	  atomic routines. This incurs a small overhead on CPUs that do
	  not support these instructions.

endmenu # "ARMv8.1 architectural features"

menu "ARMv8.2 architectural features"
@@ -2125,7 +2094,6 @@ config ARM64_MTE
	depends on ARM64_AS_HAS_MTE && ARM64_TAGGED_ADDR_ABI
	depends on AS_HAS_ARMV8_5
	# Required for tag checking in the uaccess routines
	select ARM64_PAN
	select ARCH_HAS_SUBPAGE_FAULTS
	select ARCH_USES_HIGH_VMA_FLAGS
	select ARCH_USES_PG_ARCH_2
@@ -2157,7 +2125,6 @@ menu "ARMv8.7 architectural features"
config ARM64_EPAN
	bool "Enable support for Enhanced Privileged Access Never (EPAN)"
	default y
	depends on ARM64_PAN
	help
	  Enhanced Privileged Access Never (EPAN) allows Privileged
	  Access Never to be used with Execute-only mappings.
+0 −2
@@ -19,8 +19,6 @@ cpucap_is_possible(const unsigned int cap)
			   "cap must be < ARM64_NCAPS");

	switch (cap) {
	case ARM64_HAS_PAN:
		return IS_ENABLED(CONFIG_ARM64_PAN);
	case ARM64_HAS_EPAN:
		return IS_ENABLED(CONFIG_ARM64_EPAN);
	case ARM64_SVE: