Commit Graph

579 Commits

Author SHA1 Message Date
Paolo Bonzini
42a0305ab1 Merge tag 'kvmarm-fixes-6.17-1' of https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 changes for 6.17, take #2

 - Correctly handle 'invariant' system registers for protected VMs

 - Improved handling of VNCR data aborts, including external aborts

 - Fixes for handling of FEAT_RAS for NV guests, providing a sane
   fault context during SEA injection and preventing the use of
   RASv1p1 fault injection hardware

 - Ensure that page table destruction when a VM is destroyed gives an
   opportunity to reschedule

 - Large fix to KVM's infrastructure for managing guest context loaded
   on the CPU, addressing issues where the output of AT emulation
   doesn't get reflected to the guest

 - Fix AT S12 emulation to actually perform stage-2 translation when
   necessary

 - Avoid attempting vLPI irqbypass when GICv4 has been explicitly
   disabled for a VM

 - Minor KVM + selftest fixes
2025-08-29 12:57:31 -04:00
Fuad Tabba
798eb59787 KVM: arm64: Sync protected guest VBAR_EL1 on injecting an undef exception
In pKVM, a race condition can occur if a guest updates its VBAR_EL1
register and, before a vCPU exit synchronizes this change, the
hypervisor needs to inject an undefined exception into a protected
guest.

In this scenario, the vCPU still holds the stale VBAR_EL1 value from
before the guest's update. When pKVM injects the exception, it ends up
using the stale value.

Explicitly read the live value of VBAR_EL1 from the guest and update the
vCPU value immediately before pending the exception. This ensures the
vCPU's value is the same as the guest's and that the exception will be
handled at the correct address upon resuming the guest.

Reported-by: Keir Fraser <keirf@google.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://lore.kernel.org/r/20250807120133.871892-3-tabba@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-08-08 01:29:32 -07:00
Fuad Tabba
eaa43934b4 KVM: arm64: Handle AIDR_EL1 and REVIDR_EL1 in host for protected VMs
Since commit 17efc1acee ("arm64: Expose AIDR_EL1 via sysfs"), AIDR_EL1
is read early during boot. Therefore, a guest running as a protected VM
will fail to boot because when it attempts to access AIDR_EL1, access to
that register is restricted in pKVM for protected guests.

Similar to how MIDR_EL1 is handled by the host for protected VMs, let
the host handle accesses to AIDR_EL1 as well as REVIDR_EL1. However note
that, unlike MIDR_EL1, AIDR_EL1 and REVIDR_EL1 are trapped by
HCR_EL2.TID1. Therefore, explicitly mark them as handled by the host for
protected VMs. TID1 is always set in pKVM, because it needs to restrict
access to SMIDR_EL1, which is also trapped by that bit.

Reported-by: Will Deacon <will@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://lore.kernel.org/r/20250807120133.871892-2-tabba@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-08-08 01:29:31 -07:00
Arnd Bergmann
700d6868fe kvm: arm64: use BUG() instead of BUG_ON(1)
The BUG_ON() macro adds a little bit of complexity over BUG(), and in
some cases this ends up confusing the compiler's control flow analysis
in a way that results in a warning. This one now shows up with clang-21:

arch/arm64/kvm/vgic/vgic-mmio.c:1094:3: error: variable 'len' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized]
 1094 |                 BUG_ON(1);

Change both instances of BUG_ON(1) to a plain BUG() in the arm64 kvm
code, to avoid the false-positive warning.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20250807072132.4170088-1-arnd@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-08-08 01:28:57 -07:00
Linus Torvalds
6fb44438a5 Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas:
 "A quick summary: perf support for Branch Record Buffer Extensions
  (BRBE), typical PMU hardware updates, small additions to MTE for
  store-only tag checking and exposing non-address bits to signal
  handlers, HAVE_LIVEPATCH enabled on arm64, VMAP_STACK forced on.

  There is also a TLBI optimisation on hardware that does not require
  break-before-make when changing the user PTEs between contiguous and
  non-contiguous.

  More details:

  Perf and PMU updates:

   - Add support for new (v3) Hisilicon SLLC and DDRC PMUs

   - Add support for Arm-NI PMU integrations that share interrupts
     between clock domains within a given instance

   - Allow SPE to be configured with a lower sample period than the
     minimum recommendation advertised by PMSIDR_EL1.Interval

   - Add suppport for Arm's "Branch Record Buffer Extension" (BRBE)

   - Adjust the perf watchdog period according to cpu frequency changes

   - Minor driver fixes and cleanups

  Hardware features:

   - Support for MTE store-only checking (FEAT_MTE_STORE_ONLY)

   - Support for reporting the non-address bits during a synchronous MTE
     tag check fault (FEAT_MTE_TAGGED_FAR)

   - Optimise the TLBI when folding/unfolding contiguous PTEs on
     hardware with FEAT_BBM (break-before-make) level 2 and no TLB
     conflict aborts

  Software features:

   - Enable HAVE_LIVEPATCH after implementing arch_stack_walk_reliable()
     and using the text-poke API for late module relocations

   - Force VMAP_STACK always on and change arm64_efi_rt_init() to use
     arch_alloc_vmap_stack() in order to avoid KASAN false positives

  ACPI:

   - Improve SPCR handling and messaging on systems lacking an SPCR
     table

  Debug:

   - Simplify the debug exception entry path

   - Drop redundant DBG_MDSCR_* macros

  Kselftests:

   - Cleanups and improvements for SME, SVE and FPSIMD tests

  Miscellaneous:

   - Optimise loop to reduce redundant operations in contpte_ptep_get()

   - Remove ISB when resetting POR_EL0 during signal handling

   - Mark the kernel as tainted on SEA and SError panic

   - Remove redundant gcs_free() call"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (93 commits)
  arm64/gcs: task_gcs_el0_enable() should use passed task
  arm64: Kconfig: Keep selects somewhat alphabetically ordered
  arm64: signal: Remove ISB when resetting POR_EL0
  kselftest/arm64: Handle attempts to disable SM on SME only systems
  kselftest/arm64: Fix SVE write data generation for SME only systems
  kselftest/arm64: Test SME on SME only systems in fp-ptrace
  kselftest/arm64: Test FPSIMD format data writes via NT_ARM_SVE in fp-ptrace
  kselftest/arm64: Allow sve-ptrace to run on SME only systems
  arm64/mm: Drop redundant addr increment in set_huge_pte_at()
  kselftest/arm4: Provide local defines for AT_HWCAP3
  arm64: Mark kernel as tainted on SAE and SError panic
  arm64/gcs: Don't call gcs_free() when releasing task_struct
  drivers/perf: hisi: Support PMUs with no interrupt
  drivers/perf: hisi: Relax the event number check of v2 PMUs
  drivers/perf: hisi: Add support for HiSilicon SLLC v3 PMU driver
  drivers/perf: hisi: Use ACPI driver_data to retrieve SLLC PMU information
  drivers/perf: hisi: Add support for HiSilicon DDRC v3 PMU driver
  drivers/perf: hisi: Simplify the probe process for each DDRC version
  perf/arm-ni: Support sharing IRQs within an NI instance
  perf/arm-ni: Consolidate CPU affinity handling
  ...
2025-07-29 20:21:54 -07:00
Linus Torvalds
8e736a2eea Merge tag 'hardening-v6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull hardening updates from Kees Cook:

 - Introduce and start using TRAILING_OVERLAP() helper for fixing
   embedded flex array instances (Gustavo A. R. Silva)

 - mux: Convert mux_control_ops to a flex array member in mux_chip
   (Thorsten Blum)

 - string: Group str_has_prefix() and strstarts() (Andy Shevchenko)

 - Remove KCOV instrumentation from __init and __head (Ritesh Harjani,
   Kees Cook)

 - Refactor and rename stackleak feature to support Clang

 - Add KUnit test for seq_buf API

 - Fix KUnit fortify test under LTO

* tag 'hardening-v6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (22 commits)
  sched/task_stack: Add missing const qualifier to end_of_stack()
  kstack_erase: Support Clang stack depth tracking
  kstack_erase: Add -mgeneral-regs-only to silence Clang warnings
  init.h: Disable sanitizer coverage for __init and __head
  kstack_erase: Disable kstack_erase for all of arm compressed boot code
  x86: Handle KCOV __init vs inline mismatches
  arm64: Handle KCOV __init vs inline mismatches
  s390: Handle KCOV __init vs inline mismatches
  arm: Handle KCOV __init vs inline mismatches
  mips: Handle KCOV __init vs inline mismatch
  powerpc/mm/book3s64: Move kfence and debug_pagealloc related calls to __init section
  configs/hardening: Enable CONFIG_INIT_ON_FREE_DEFAULT_ON
  configs/hardening: Enable CONFIG_KSTACK_ERASE
  stackleak: Split KSTACK_ERASE_CFLAGS from GCC_PLUGINS_CFLAGS
  stackleak: Rename stackleak_track_stack to __sanitizer_cov_stack_depth
  stackleak: Rename STACKLEAK to KSTACK_ERASE
  seq_buf: Introduce KUnit tests
  string: Group str_has_prefix() and strstarts()
  kunit/fortify: Add back "volatile" for sizeof() constants
  acpi: nfit: intel: avoid multiple -Wflex-array-member-not-at-end warnings
  ...
2025-07-28 17:16:12 -07:00
Kees Cook
57fbad15c2 stackleak: Rename STACKLEAK to KSTACK_ERASE
In preparation for adding Clang sanitizer coverage stack depth tracking
that can support stack depth callbacks:

- Add the new top-level CONFIG_KSTACK_ERASE option which will be
  implemented either with the stackleak GCC plugin, or with the Clang
  stack depth callback support.
- Rename CONFIG_GCC_PLUGIN_STACKLEAK as needed to CONFIG_KSTACK_ERASE,
  but keep it for anything specific to the GCC plugin itself.
- Rename all exposed "STACKLEAK" names and files to "KSTACK_ERASE" (named
  for what it does rather than what it protects against), but leave as
  many of the internals alone as possible to avoid even more churn.

While here, also split "prev_lowest_stack" into CONFIG_KSTACK_ERASE_METRICS,
since that's the only place it is referenced from.

Suggested-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250717232519.2984886-1-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-21 21:35:01 -07:00
Anshuman Khandual
d7567e9b9b KVM: arm64: nvhe: Disable branch generation in nVHE guests
While BRBE can record branches within guests, the host recording
branches in guests is not supported by perf (though events are).
Support for BRBE in guests will supported by providing direct access
to BRBE within the guests. That is how x86 LBR works for guests.
Therefore, BRBE needs to be disabled on guest entry and restored on
exit.

For nVHE, this requires explicit handling for guests. Before
entering a guest, save the BRBE state and disable the it. When
returning to the host, restore the state.

For VHE, it is not necessary. We initialize
BRBCR_EL1.{E1BRE,E0BRE}=={0,0} at boot time, and HCR_EL2.TGE==1 while
running in the host. We configure BRBCR_EL2.{E2BRE,E0HBRE} to enable
branch recording in the host. When entering the guest, we set
HCR_EL2.TGE==0 which means BRBCR_EL1 is used instead of BRBCR_EL2.
Consequently for VHE, BRBE recording is disabled at EL1 and EL0 when
running a guest.

Should recording in guests (by the host) ever be desired, the perf ABI
will need to be extended to distinguish guest addresses (struct
perf_branch_entry.priv) for starters. BRBE records would also need to be
invalidated on guest entry/exit as guest/host EL1 and EL0 records can't
be distinguished.

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Co-developed-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Tested-by: James Clark <james.clark@linaro.org>
Reviewed-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20250611-arm-brbe-v19-v23-3-e7775563036e@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2025-07-08 16:11:27 +01:00
Quentin Perret
e728e70580 KVM: arm64: Adjust range correctly during host stage-2 faults
host_stage2_adjust_range() tries to find the largest block mapping that
fits within a memory or mmio region (represented by a kvm_mem_range in
this function) during host stage-2 faults under pKVM. To do so, it walks
the host stage-2 page-table, finds the faulting PTE and its level, and
then progressively increments the level until it finds a granule of the
appropriate size. However, the condition in the loop implementing the
above is broken as it checks kvm_level_supports_block_mapping() for the
next level instead of the current, so pKVM may attempt to map a region
larger than can be covered with a single block.

This is not a security problem and is quite rare in practice (the
kvm_mem_range check usually forces host_stage2_adjust_range() to choose a
smaller granule), but this is clearly not the expected behaviour.

Refactor the loop to fix the bug and improve readability.

Fixes: c4f0935e4d ("KVM: arm64: Optimize host memory aborts")
Signed-off-by: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20250625105548.984572-1-qperret@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-06-26 08:04:43 +01:00
Mark Rutland
59e6e101a6 KVM: arm64: Remove ad-hoc CPTR manipulation from fpsimd_sve_sync()
There's no need for fpsimd_sve_sync() to write to CPTR/CPACR. All
relevant traps are always disabled earlier within __kvm_vcpu_run(), when
__deactivate_cptr_traps() configures CPTR/CPACR.

With irrelevant details elided, the flow is:

handle___kvm_vcpu_run(...)
{
	flush_hyp_vcpu(...) {
		fpsimd_sve_flush(...);
	}

	__kvm_vcpu_run(...) {
		__activate_traps(...) {
			__activate_cptr_traps(...);
		}

		do {
			__guest_enter(...);
		} while (...);

		__deactivate_traps(....) {
			__deactivate_cptr_traps(...);
		}
	}

	sync_hyp_vcpu(...) {
		fpsimd_sve_sync(...);
	}
}

Remove the unnecessary write to CPTR/CPACR. An ISB is still necessary,
so a comment is added to describe this requirement.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250617133718.4014181-5-mark.rutland@arm.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-06-19 13:06:20 +01:00
Mark Rutland
e62dd50784 KVM: arm64: Reorganise CPTR trap manipulation
The NVHE/HVHE and VHE modes have separate implementations of
__activate_cptr_traps() and __deactivate_cptr_traps() in their
respective switch.c files. There's some duplication of logic, and it's
not currently possible to reuse this logic elsewhere.

Move the logic into the common switch.h header so that it can be reused,
and de-duplicate the common logic.

This rework changes the way SVE traps are deactivated in VHE mode,
aligning it with NVHE/HVHE modes:

* Before this patch, VHE's __deactivate_cptr_traps() would
  unconditionally enable SVE for host EL2 (but not EL0), regardless of
  whether the ARM64_SVE cpucap was set.

* After this patch, VHE's __deactivate_cptr_traps() will take the
  ARM64_SVE cpucap into account. When ARM64_SVE is not set, SVE will be
  trapped from EL2 and below.

The old and new behaviour are both benign:

* When ARM64_SVE is not set, the host will not touch SVE state, and will
  not reconfigure SVE traps. Host EL0 access to SVE will be trapped as
  expected.

* When ARM64_SVE is set, the host will configure EL0 SVE traps before
  returning to EL0 as part of reloading the EL0 FPSIMD/SVE/SME state.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250617133718.4014181-4-mark.rutland@arm.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-06-19 13:06:19 +01:00
Paolo Bonzini
ce360c2bfd Merge tag 'kvmarm-fixes-6.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 fixes for 6.16, take #2

- Rework of system register accessors for system registers that are
  directly writen to memory, so that sanitisation of the in-memory
  value happens at the correct time (after the read, or before the
  write). For convenience, RMW-style accessors are also provided.

- Multiple fixes for the so-called "arch-timer-edge-cases' selftest,
  which was always broken.
2025-06-11 14:25:22 -04:00
Marc Zyngier
6678791ee3 KVM: arm64: Add assignment-specific sysreg accessor
Assigning a value to a system register doesn't do what it is
supposed to be doing if that register is one that has RESx bits.

The main problem is that we use __vcpu_sys_reg(), which can be used
both as a lvalue and rvalue. When used as a lvalue, the bit masking
occurs *before* the new value is assigned, meaning that we (1) do
pointless work on the old cvalue, and (2) potentially assign an
invalid value as we fail to apply the masks to it.

Fix this by providing a new __vcpu_assign_sys_reg() that does
what it says on the tin, and sanitises the *new* value instead of
the old one. This comes with a significant amount of churn.

Reviewed-by: Miguel Luis <miguel.luis@oracle.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20250603070824.1192795-2-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-06-05 14:17:32 +01:00
Paolo Bonzini
4d526b02df Merge tag 'kvmarm-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 updates for 6.16

* New features:

  - Add large stage-2 mapping support for non-protected pKVM guests,
    clawing back some performance.

  - Add UBSAN support to the standalone EL2 object used in nVHE/hVHE and
    protected modes.

  - Enable nested virtualisation support on systems that support it
    (yes, it has been a long time coming), though it is disabled by
    default.

* Improvements, fixes and cleanups:

  - Large rework of the way KVM tracks architecture features and links
    them with the effects of control bits. This ensures correctness of
    emulation (the data is automatically extracted from the published
    JSON files), and helps dealing with the evolution of the
    architecture.

  - Significant changes to the way pKVM tracks ownership of pages,
    avoiding page table walks by storing the state in the hypervisor's
    vmemmap. This in turn enables the THP support described above.

  - New selftest checking the pKVM ownership transition rules

  - Fixes for FEAT_MTE_ASYNC being accidentally advertised to guests
    even if the host didn't have it.

  - Fixes for the address translation emulation, which happened to be
    rather buggy in some specific contexts.

  - Fixes for the PMU emulation in NV contexts, decoupling PMCR_EL0.N
    from the number of counters exposed to a guest and addressing a
    number of issues in the process.

  - Add a new selftest for the SVE host state being corrupted by a
    guest.

  - Keep HCR_EL2.xMO set at all times for systems running with the
    kernel at EL2, ensuring that the window for interrupts is slightly
    bigger, and avoiding a pretty bad erratum on the AmpereOne HW.

  - Add workaround for AmpereOne's erratum AC04_CPU_23, which suffers
    from a pretty bad case of TLB corruption unless accesses to HCR_EL2
    are heavily synchronised.

  - Add a per-VM, per-ITS debugfs entry to dump the state of the ITS
    tables in a human-friendly fashion.

  - and the usual random cleanups.
2025-05-26 16:19:46 -04:00
Marc Zyngier
1b85d923ba Merge branch kvm-arm64/misc-6.16 into kvmarm-master/next
* kvm-arm64/misc-6.16:
  : .
  : Misc changes and improvements for 6.16:
  :
  : - Add a new selftest for the SVE host state being corrupted by a guest
  :
  : - Keep HCR_EL2.xMO set at all times for systems running with the kernel at EL2,
  :   ensuring that the window for interrupts is slightly bigger, and avoiding
  :   a pretty bad erratum on the AmpereOne HW
  :
  : - Replace a couple of open-coded on/off strings with str_on_off()
  :
  : - Get rid of the pKVM memblock sorting, which now appears to be superflous
  :
  : - Drop superflous clearing of ICH_LR_EOI in the LR when nesting
  :
  : - Add workaround for AmpereOne's erratum AC04_CPU_23, which suffers from
  :   a pretty bad case of TLB corruption unless accesses to HCR_EL2 are
  :   heavily synchronised
  :
  : - Add a per-VM, per-ITS debugfs entry to dump the state of the ITS tables
  :   in a human-friendly fashion
  : .
  KVM: arm64: Fix documentation for vgic_its_iter_next()
  KVM: arm64: vgic-its: Add debugfs interface to expose ITS tables
  arm64: errata: Work around AmpereOne's erratum AC04_CPU_23
  KVM: arm64: nv: Remove clearing of ICH_LR<n>.EOI if ICH_LR<n>.HW == 1
  KVM: arm64: Drop sort_memblock_regions()
  KVM: arm64: selftests: Add test for SVE host corruption
  KVM: arm64: Force HCR_EL2.xMO to 1 at all times in VHE mode
  KVM: arm64: Replace ternary flags with str_on_off() helper

Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-05-23 10:59:43 +01:00
Marc Zyngier
fef3acf5ae Merge branch kvm-arm64/fgt-masks into kvmarm-master/next
* kvm-arm64/fgt-masks: (43 commits)
  : .
  : Large rework of the way KVM deals with trap bits in conjunction with
  : the CPU feature registers. It now draws a direct link between which
  : the feature set, the system registers that need to UNDEF to match
  : the configuration and bits that need to behave as RES0 or RES1 in
  : the trap registers that are visible to the guest.
  :
  : Best of all, these definitions are mostly automatically generated
  : from the JSON description published by ARM under a permissive
  : license.
  : .
  KVM: arm64: Handle TSB CSYNC traps
  KVM: arm64: Add FGT descriptors for FEAT_FGT2
  KVM: arm64: Allow sysreg ranges for FGT descriptors
  KVM: arm64: Add context-switch for FEAT_FGT2 registers
  KVM: arm64: Add trap routing for FEAT_FGT2 registers
  KVM: arm64: Add sanitisation for FEAT_FGT2 registers
  KVM: arm64: Add FEAT_FGT2 registers to the VNCR page
  KVM: arm64: Use HCR_EL2 feature map to drive fixed-value bits
  KVM: arm64: Use HCRX_EL2 feature map to drive fixed-value bits
  KVM: arm64: Allow kvm_has_feat() to take variable arguments
  KVM: arm64: Use FGT feature maps to drive RES0 bits
  KVM: arm64: Validate FGT register descriptions against RES0 masks
  KVM: arm64: Switch to table-driven FGU configuration
  KVM: arm64: Handle PSB CSYNC traps
  KVM: arm64: Use KVM-specific HCRX_EL2 RES0 mask
  KVM: arm64: Remove hand-crafted masks for FGT registers
  KVM: arm64: Use computed FGT masks to setup FGT registers
  KVM: arm64: Propagate FGT masks to the nVHE hypervisor
  KVM: arm64: Unconditionally configure fine-grain traps
  KVM: arm64: Use computed masks as sanitisers for FGT registers
  ...

Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-05-23 10:58:15 +01:00
Marc Zyngier
cb86616c39 Merge branch kvm-arm64/ubsan-el2 into kvmarm-master/next
* kvm-arm64/ubsan-el2:
  : .
  : Add UBSAN support to the EL2 portion of KVM, reusing most of the
  : existing logic provided by CONFIG_IBSAN_TRAP.
  :
  : Patches courtesy of Mostafa Saleh.
  : .
  KVM: arm64: Handle UBSAN faults
  KVM: arm64: Introduce CONFIG_UBSAN_KVM_EL2
  ubsan: Remove regs from report_ubsan_failure()
  arm64: Introduce esr_is_ubsan_brk()

Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-05-23 10:57:32 +01:00
Vincent Donnefort
c353fde17d KVM: arm64: np-guest CMOs with PMD_SIZE fixmap
With the introduction of stage-2 huge mappings in the pKVM hypervisor,
guest pages CMO is needed for PMD_SIZE size. Fixmap only supports
PAGE_SIZE and iterating over the huge-page is time consuming (mostly due
to TLBI on hyp_fixmap_unmap) which is a problem for EL2 latency.

Introduce a shared PMD_SIZE fixmap (hyp_fixblock_map/hyp_fixblock_unmap)
to improve guest page CMOs when stage-2 huge mappings are installed.

On a Pixel6, the iterative solution resulted in a latency of ~700us,
while the PMD_SIZE fixmap reduces it to ~100us.

Because of the horrendous private range allocation that would be
necessary, this is disabled for 64KiB pages systems.

Suggested-by: Quentin Perret <qperret@google.com>
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20250521124834.1070650-11-vdonnefort@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-05-21 14:33:51 +01:00
Vincent Donnefort
db14091d8f KVM: arm64: Stage-2 huge mappings for np-guests
Now np-guests hypercalls with range are supported, we can let the
hypervisor to install block mappings whenever the Stage-1 allows it,
that is when backed by either Hugetlbfs or THPs. The size of those block
mappings is limited to PMD_SIZE.

Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Link: https://lore.kernel.org/r/20250521124834.1070650-10-vdonnefort@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-05-21 14:33:51 +01:00
Vincent Donnefort
c4d99a833d KVM: arm64: Add a range to __pkvm_host_test_clear_young_guest()
In preparation for supporting stage-2 huge mappings for np-guest. Add a
nr_pages argument to the __pkvm_host_test_clear_young_guest hypercall.
This range supports only two values: 1 or PMD_SIZE / PAGE_SIZE (that is
512 on a 4K-pages system).

Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Link: https://lore.kernel.org/r/20250521124834.1070650-7-vdonnefort@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-05-21 14:33:51 +01:00
Vincent Donnefort
0eb802b3b4 KVM: arm64: Add a range to __pkvm_host_wrprotect_guest()
In preparation for supporting stage-2 huge mappings for np-guest. Add a
nr_pages argument to the __pkvm_host_wrprotect_guest hypercall. This
range supports only two values: 1 or PMD_SIZE / PAGE_SIZE (that is 512
on a 4K-pages system).

Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Link: https://lore.kernel.org/r/20250521124834.1070650-6-vdonnefort@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-05-21 14:33:51 +01:00
Vincent Donnefort
f28f1d02f4 KVM: arm64: Add a range to __pkvm_host_unshare_guest()
In preparation for supporting stage-2 huge mappings for np-guest. Add a
nr_pages argument to the __pkvm_host_unshare_guest hypercall. This range
supports only two values: 1 or PMD_SIZE / PAGE_SIZE (that is 512 on a
4K-pages system).

Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Link: https://lore.kernel.org/r/20250521124834.1070650-5-vdonnefort@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-05-21 14:33:51 +01:00
Vincent Donnefort
4274385ebf KVM: arm64: Add a range to __pkvm_host_share_guest()
In preparation for supporting stage-2 huge mappings for np-guest. Add a
nr_pages argument to the __pkvm_host_share_guest hypercall. This range
supports only two values: 1 or PMD_SIZE / PAGE_SIZE (that is 512 on a
4K-pages system).

Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Link: https://lore.kernel.org/r/20250521124834.1070650-4-vdonnefort@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-05-21 14:33:51 +01:00
Vincent Donnefort
3db771fa23 KVM: arm64: Introduce for_each_hyp_page
Add a helper to iterate over the hypervisor vmemmap. This will be
particularly handy with the introduction of huge mapping support
for the np-guest stage-2.

Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Link: https://lore.kernel.org/r/20250521124834.1070650-3-vdonnefort@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-05-21 14:33:51 +01:00
Vincent Donnefort
944a1ed8cc KVM: arm64: Handle huge mappings for np-guest CMOs
clean_dcache_guest_page() and invalidate_icache_guest_page() accept a
size as an argument. But they also rely on fixmap, which can only map a
single PAGE_SIZE page.

With the upcoming stage-2 huge mappings for pKVM np-guests, those
callbacks will get size > PAGE_SIZE. Loop the CMOs on a PAGE_SIZE basis
until the whole range is done.

Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Link: https://lore.kernel.org/r/20250521124834.1070650-2-vdonnefort@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-05-21 14:33:51 +01:00
Marc Zyngier
d5702dd224 Merge branch kvm-arm64/pkvm-selftest-6.16 into kvm-arm64/pkvm-np-thp-6.16
* kvm-arm64/pkvm-selftest-6.16:
  : .
  : pKVM selftests covering the memory ownership transitions by
  : Quentin Perret. From the initial cover letter:
  :
  : "We have recently found a bug [1] in the pKVM memory ownership
  : transitions by code inspection, but it could have been caught with a
  : test.
  :
  : Introduce a boot-time selftest exercising all the known pKVM memory
  : transitions and importantly checks the rejection of illegal transitions.
  :
  : The new test is hidden behind a new Kconfig option separate from
  : CONFIG_EL2_NVHE_DEBUG on purpose as that has side effects on the
  : transition checks ([1] doesn't reproduce with EL2 debug enabled).
  :
  : [1] https://lore.kernel.org/kvmarm/20241128154406.602875-1-qperret@google.com/"
  : .
  KVM: arm64: Extend pKVM selftest for np-guests
  KVM: arm64: Selftest for pKVM transitions
  KVM: arm64: Don't WARN from __pkvm_host_share_guest()
  KVM: arm64: Add .hyp.data section

Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-05-21 14:33:43 +01:00
D Scott Phillips
fed55f49fa arm64: errata: Work around AmpereOne's erratum AC04_CPU_23
On AmpereOne AC04, updates to HCR_EL2 can rarely corrupt simultaneous
translations for data addresses initiated by load/store instructions.
Only instruction initiated translations are vulnerable, not translations
from prefetches for example. A DSB before the store to HCR_EL2 is
sufficient to prevent older instructions from hitting the window for
corruption, and an ISB after is sufficient to prevent younger
instructions from hitting the window for corruption.

Signed-off-by: D Scott Phillips <scott@os.amperecomputing.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20250513184514.2678288-1-scott@os.amperecomputing.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-05-19 12:46:26 +01:00
Marc Zyngier
4bc0fe0898 KVM: arm64: Add sanitisation for FEAT_FGT2 registers
Just like the FEAT_FGT registers, treat the FGT2 variant the same
way. THis is a large  update, but a fairly mechanical one.

The config dependencies are extracted from the 2025-03 JSON drop.

Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-05-19 11:36:10 +01:00
Mostafa Saleh
61b38f7591 KVM: arm64: Introduce CONFIG_UBSAN_KVM_EL2
Add a new Kconfig CONFIG_UBSAN_KVM_EL2 for KVM which enables
UBSAN for EL2 code (in protected/nvhe/hvhe) modes.
This will re-use the same checks enabled for the kernel for
the hypervisor. The only difference is that for EL2 it always
emits a "brk" instead of implementing hooks as the hypervisor
can't print reports.

The KVM code will re-use the same code for the kernel
"report_ubsan_failure()" so #ifdefs are changed to also have this
code for CONFIG_UBSAN_KVM_EL2

Signed-off-by: Mostafa Saleh <smostafa@google.com>
Reviewed-by: Kees Cook <kees@kernel.org>
Link: https://lore.kernel.org/r/20250430162713.1997569-4-smostafa@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-05-07 11:21:35 +01:00
Mostafa Saleh
3949e28786 KVM: arm64: Fix memory check in host_stage2_set_owner_locked()
I found this simple bug while preparing some patches for pKVM.
AFAICT, it should be harmless (besides crashing the kernel if it
was misbehaving)

Fixes: e94a7dea29 ("KVM: arm64: Move host page ownership tracking to the hyp vmemmap")
Signed-off-by: Mostafa Saleh <smostafa@google.com>
Link: https://lore.kernel.org/r/20250501162450.2784043-1-smostafa@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-05-07 00:17:05 -07:00
Marc Zyngier
311ba55a5f KVM: arm64: Propagate FGT masks to the nVHE hypervisor
The nVHE hypervisor needs to have access to its own view of the FGT
masks, which unfortunately results in a bit of data duplication.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-05-06 17:35:25 +01:00
Quentin Perret
48d5645072 KVM: arm64: Extend pKVM selftest for np-guests
The pKVM selftest intends to test as many memory 'transitions' as
possible, so extend it to cover sharing pages with non-protected guests,
including in the case of multi-sharing.

Signed-off-by: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20250416160900.3078417-5-qperret@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-05-06 09:56:18 +01:00
Quentin Perret
6c2d4c319c KVM: arm64: Selftest for pKVM transitions
We have recently found a bug [1] in the pKVM memory ownership
transitions by code inspection, but it could have been caught with a
test.

Introduce a boot-time selftest exercising all the known pKVM memory
transitions and importantly checks the rejection of illegal transitions.

The new test is hidden behind a new Kconfig option separate from
CONFIG_EL2_NVHE_DEBUG on purpose as that has side effects on the
transition checks ([1] doesn't reproduce with EL2 debug enabled).

[1] https://lore.kernel.org/kvmarm/20241128154406.602875-1-qperret@google.com/

Suggested-by: Will Deacon <will@kernel.org>
Signed-off-by: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20250416160900.3078417-4-qperret@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-05-06 09:56:18 +01:00
Quentin Perret
845f126732 KVM: arm64: Don't WARN from __pkvm_host_share_guest()
We currently WARN() if the host attempts to share a page that is not in
an acceptable state with a guest. This isn't strictly necessary and
makes testing much harder, so drop the WARN and make sure to propage the
error code instead.

Signed-off-by: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20250416160900.3078417-3-qperret@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-05-06 09:56:18 +01:00
David Brazdil
74b13d5816 KVM: arm64: Add .hyp.data section
The hypervisor has not needed its own .data section because all globals
were either .rodata or .bss. To avoid having to initialize future
data-structures at run-time, let's introduce add a .data section to the
hypervisor.

Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20250416160900.3078417-2-qperret@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-05-06 09:56:18 +01:00
Quentin Perret
43c475504a KVM: arm64: Unconditionally cross check hyp state
Now that the hypervisor's state is stored in the hyp_vmemmap, we no
longer need an expensive page-table walk to read it. This means we can
now afford to cross check the hyp-state during all memory ownership
transitions where the hyp is involved unconditionally, hence avoiding
problems such as [1].

[1] https://lore.kernel.org/kvmarm/20241128154406.602875-1-qperret@google.com/

Reviewed-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20250416152648.2982950-8-qperret@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-04-28 09:23:46 +01:00
Quentin Perret
48d8488823 KVM: arm64: Defer EL2 stage-1 mapping on share
We currently blindly map into EL2 stage-1 *any* page passed to the
__pkvm_host_share_hyp() HVC. This is less than ideal from a security
perspective as it makes exploitation of potential hypervisor gadgets
easier than it should be. But interestingly, pKVM should never need to
access SHARED_BORROWED pages that it hasn't previously pinned, so there
is no need to map the page before that.

Reviewed-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20250416152648.2982950-7-qperret@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-04-28 09:23:46 +01:00
Quentin Perret
3390b3cbb6 KVM: arm64: Move hyp state to hyp_vmemmap
Tracking the hypervisor's ownership state into struct hyp_page has
several benefits, including allowing far more efficient lookups (no
page-table walk needed) and de-corelating the state from the presence
of a mapping. This will later allow to map pages into EL2 stage-1 less
proactively which is generally a good thing for security. And in the
future this will help with tracking the state of pages mapped into the
hypervisor's private range without requiring an alias into the 'linear
map' range.

Reviewed-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20250416152648.2982950-6-qperret@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-04-28 09:23:46 +01:00
Quentin Perret
ba5b2e5b9d KVM: arm64: Introduce {get,set}_host_state() helpers
Instead of directly accessing the host_state member in struct hyp_page,
introduce static inline accessors to do it. The future hyp_state member
will follow the same pattern as it will need some logic in the accessors.

Reviewed-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20250416152648.2982950-5-qperret@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-04-28 09:23:46 +01:00
Fuad Tabba
5db1bef933 KVM: arm64: Track SVE state in the hypervisor vcpu structure
When dealing with a guest with SVE enabled, make sure the host SVE
state is pinned at EL2 S1, and that the hypervisor vCPU state is
correctly initialised (and then unpinned on teardown).

Co-authored-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20250416152648.2982950-2-qperret@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-04-28 09:23:46 +01:00
Will Deacon
a344e258ac KVM: arm64: Use acquire/release to communicate FF-A version negotiation
The pKVM FF-A proxy rejects FF-A requests other than FFA_VERSION until
version negotiation is complete, which is signalled by setting the
global 'has_version_negotiated' variable.

To avoid excessive locking, this variable is checked directly from
kvm_host_ffa_handler() in response to an FF-A call, but this can race
against another CPU performing the negotiation and potentially lead to
reading a torn value (incredibly unlikely for a 'bool') or problematic
re-ordering of the accesses to 'has_version_negotiated' and
'hyp_ffa_version' whereby a stale version number could be read by
__do_ffa_mem_xfer().

Use acquire/release primitives when writing 'has_version_negotiated'
with the version lock held and when reading without the lock held.

Cc: Sebastian Ene <sebastianene@google.com>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Quentin Perret <qperret@google.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Marc Zyngier <maz@kernel.org>
Fixes: c9c012625e ("KVM: arm64: Trap FFA_VERSION host call in pKVM")
Signed-off-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250407152755.1041-1-will@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-04-07 15:03:34 -07:00
Oliver Upton
26fbdf3692 KVM: arm64: Don't translate FAR if invalid/unsafe
Don't re-walk the page tables if an SEA occurred during the faulting
page table walk to avoid taking a fatal exception in the hyp.
Additionally, check that FAR_EL2 is valid for SEAs not taken on PTW
as the architecture doesn't guarantee it contains the fault VA.

Finally, fix up the rest of the abort path by checking for SEAs early
and bugging the VM if we get further along with an UNKNOWN fault IPA.

Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250402201725.2963645-4-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-04-03 00:28:51 -07:00
Oliver Upton
1cf3e126f1 arm64: Convert HPFAR_EL2 to sysreg table
Switch over to the typical sysreg table for HPFAR_EL2 as we're about to
start using more fields in the register.

Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250402201725.2963645-3-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-04-03 00:28:51 -07:00
Oliver Upton
ca19dd4323 Merge branch 'kvm-arm64/pkvm-6.15' into kvmarm/next
* kvm-arm64/pkvm-6.15:
  : pKVM updates for 6.15
  :
  :  - SecPageTable stats for stage-2 table pages allocated by the protected
  :    hypervisor (Vincent Donnefort)
  :
  :  - HCRX_EL2 trap + vCPU initialization fixes for pKVM (Fuad Tabba)
  KVM: arm64: Create each pKVM hyp vcpu after its corresponding host vcpu
  KVM: arm64: Factor out pKVM hyp vcpu creation to separate function
  KVM: arm64: Initialize HCRX_EL2 traps in pKVM
  KVM: arm64: Factor out setting HCRX_EL2 traps into separate function
  KVM: arm64: Count pKVM stage-2 usage in secondary pagetable stats
  KVM: arm64: Distinct pKVM teardown memcache for stage-2
  KVM: arm64: Add flags to kvm_hyp_memcache

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-03-19 14:54:40 -07:00
Oliver Upton
4f2774c57a Merge branch 'kvm-arm64/writable-midr' into kvmarm/next
* kvm-arm64/writable-midr:
  : Writable implementation ID registers, courtesy of Sebastian Ott
  :
  : Introduce a new capability that allows userspace to set the
  : ID registers that identify a CPU implementation: MIDR_EL1, REVIDR_EL1,
  : and AIDR_EL1. Also plug a hole in KVM's trap configuration where
  : SMIDR_EL1 was readable at EL1, despite the fact that KVM does not
  : support SME.
  KVM: arm64: Fix documentation for KVM_CAP_ARM_WRITABLE_IMP_ID_REGS
  KVM: arm64: Copy MIDR_EL1 into hyp VM when it is writable
  KVM: arm64: Copy guest CTR_EL0 into hyp VM
  KVM: selftests: arm64: Test writes to MIDR,REVIDR,AIDR
  KVM: arm64: Allow userspace to change the implementation ID registers
  KVM: arm64: Load VPIDR_EL2 with the VM's MIDR_EL1 value
  KVM: arm64: Maintain per-VM copy of implementation ID regs
  KVM: arm64: Set HCR_EL2.TID1 unconditionally

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-03-19 14:54:32 -07:00
Fuad Tabba
1eab115486 KVM: arm64: Create each pKVM hyp vcpu after its corresponding host vcpu
Instead of creating and initializing _all_ hyp vcpus in pKVM when
the first host vcpu runs for the first time, initialize _each_
hyp vcpu in conjunction with its corresponding host vcpu.

Some of the host vcpu state (e.g., system registers and traps
values) is not initialized until the first time the host vcpu is
run. Therefore, initializing a hyp vcpu before its corresponding
host vcpu has run for the first time might not view the complete
host state of these vcpus.

Additionally, this behavior is inline with non-protected modes.

Acked-by: Will Deacon <will@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://lore.kernel.org/r/20250314111832.4137161-5-tabba@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-03-14 16:06:03 -07:00
Fuad Tabba
066daa8d3b KVM: arm64: Initialize HCRX_EL2 traps in pKVM
Initialize and set the traps controlled by the HCRX_EL2 in pKVM.

Reviewed-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://lore.kernel.org/r/20250314111832.4137161-3-tabba@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-03-14 16:00:49 -07:00
Vincent Donnefort
8c0d7d14c5 KVM: arm64: Distinct pKVM teardown memcache for stage-2
In order to account for memory dedicated to the stage-2 page-tables, use
a separated memcache when tearing down the VM. Meanwhile rename
reclaim_guest_pages to reflect the fact it only reclaim page-table
pages.

Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250313114038.1502357-3-vdonnefort@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-03-14 00:56:29 -07:00
Oliver Upton
03e1b89d05 KVM: arm64: Copy MIDR_EL1 into hyp VM when it is writable
KVM recently added a capability that allows userspace to override the
'implementation ID' registers presented to the VM. MIDR_EL1 is a special
example, where the hypervisor can directly set the value when read from
EL1 using VPIDR_EL2.

Copy the VM-wide value for MIDR_EL1 into the hyp VM for non-protected
guests when the capability is enabled so VPIDR_EL2 gets set up
correctly.

Reported-by: Mark Brown <broonie@kernel.org>
Closes: https://lore.kernel.org/kvmarm/ac594b9c-4bbb-46c8-9391-e7a68ce4de5b@sirena.org.uk/
Fixes: 3adaee7830 ("KVM: arm64: Allow userspace to change the implementation ID registers")
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250305230825.484091-3-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-03-05 16:56:57 -08:00
Oliver Upton
9d91227364 KVM: arm64: Copy guest CTR_EL0 into hyp VM
Since commit 2843cae266 ("KVM: arm64: Treat CTR_EL0 as a VM feature
ID register") KVM has allowed userspace to configure the VM-wide view of
CTR_EL0, falling back to trap-n-emulate if the value doesn't match
hardware. It appears that this has worked by chance in protected-mode
for some time, and on systems with FEAT_EVT protected-mode
unconditionally sets TID4 (i.e. TID2 traps sans CTR_EL0).

Forward the guest CTR_EL0 value through to the hyp VM and align the
TID2/TID4 configuration with the non-protected setup.

Fixes: 2843cae266 ("KVM: arm64: Treat CTR_EL0 as a VM feature ID register")
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250305230825.484091-2-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-03-05 16:55:41 -08:00