Commit Graph

111 Commits

Author SHA1 Message Date
Linus Torvalds
249872f53d Merge tag 'tsm-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/devsec/tsm
Pull PCIe Link Encryption and Device Authentication from Dan Williams:
 "New PCI infrastructure and one architecture implementation for PCIe
  link encryption establishment via platform firmware services.

  This work is the result of multiple vendors coming to consensus on
  some core infrastructure (thanks Alexey, Yilun, and Aneesh!), and
  three vendor implementations, although only one is included in this
  pull. The PCI core changes have an ack from Bjorn, the crypto/ccp/
  changes have an ack from Tom, and the iommu/amd/ changes have an ack
  from Joerg.

  PCIe link encryption is made possible by the soup of acronyms
  mentioned in the shortlog below. Link Integrity and Data Encryption
  (IDE) is a protocol for installing keys in the transmitter and
  receiver at each end of a link. That protocol is transported over Data
  Object Exchange (DOE) mailboxes using PCI configuration requests.

  The aspect that makes this a "platform firmware service" is that the
  key provisioning and protocol is coordinated through a Trusted
  Execution Envrionment (TEE) Security Manager (TSM). That is either
  firmware running in a coprocessor (AMD SEV-TIO), or quasi-hypervisor
  software (Intel TDX Connect / ARM CCA) running in a protected CPU
  mode.

  Now, the only reason to ask a TSM to run this protocol and install the
  keys rather than have a Linux driver do the same is so that later, a
  confidential VM can ask the TSM directly "can you certify this
  device?".

  That precludes host Linux from provisioning its own keys, because host
  Linux is outside the trust domain for the VM. It also turns out that
  all architectures, save for one, do not publish a mechanism for an OS
  to establish keys in the root port. So "TSM-established link
  encryption" is the only cross-architecture path for this capability
  for the foreseeable future.

  This unblocks the other arch implementations to follow in v6.20/v7.0,
  once they clear some other dependencies, and it unblocks the next
  phase of work to implement the end-to-end flow of confidential device
  assignment. The PCIe specification calls this end-to-end flow Trusted
  Execution Environment (TEE) Device Interface Security Protocol
  (TDISP).

  In the meantime, Linux gets a link encryption facility which has
  practical benefits along the same lines as memory encryption. It
  authenticates devices via certificates and may protect against
  interposer attacks trying to capture clear-text PCIe traffic.

  Summary:

   - Introduce the PCI/TSM core for the coordination of device
     authentication, link encryption and establishment (IDE), and later
     management of the device security operational states (TDISP).
     Notify the new TSM core layer of PCI device arrival and departure

   - Add a low level TSM driver for the link encryption establishment
     capabilities of the AMD SEV-TIO architecture

   - Add a library of helpers TSM drivers to use for IDE establishment
     and the DOE transport

   - Add skeleton support for 'bind' and 'guest_request' operations in
     support of TDISP"

* tag 'tsm-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/devsec/tsm: (23 commits)
  crypto/ccp: Fix CONFIG_PCI=n build
  virt: Fix Kconfig warning when selecting TSM without VIRT_DRIVERS
  crypto/ccp: Implement SEV-TIO PCIe IDE (phase1)
  iommu/amd: Report SEV-TIO support
  psp-sev: Assign numbers to all status codes and add new
  ccp: Make snp_reclaim_pages and __sev_do_cmd_locked public
  PCI/TSM: Add 'dsm' and 'bound' attributes for dependent functions
  PCI/TSM: Add pci_tsm_guest_req() for managing TDIs
  PCI/TSM: Add pci_tsm_bind() helper for instantiating TDIs
  PCI/IDE: Initialize an ID for all IDE streams
  PCI/IDE: Add Address Association Register setup for downstream MMIO
  resource: Introduce resource_assigned() for discerning active resources
  PCI/TSM: Drop stub for pci_tsm_doe_transfer()
  drivers/virt: Drop VIRT_DRIVERS build dependency
  PCI/TSM: Report active IDE streams
  PCI/IDE: Report available IDE streams
  PCI/IDE: Add IDE establishment helpers
  PCI: Establish document for PCI host bridge sysfs attributes
  PCI: Add PCIe Device 3 Extended Capability enumeration
  PCI/TSM: Establish Secure Sessions and Link Encryption
  ...
2025-12-06 10:15:41 -08:00
Linus Torvalds
51d90a15fe Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Paolo Bonzini:
 "ARM:

   - Support for userspace handling of synchronous external aborts
     (SEAs), allowing the VMM to potentially handle the abort in a
     non-fatal manner

   - Large rework of the VGIC's list register handling with the goal of
     supporting more active/pending IRQs than available list registers
     in hardware. In addition, the VGIC now supports EOImode==1 style
     deactivations for IRQs which may occur on a separate vCPU than the
     one that acked the IRQ

   - Support for FEAT_XNX (user / privileged execute permissions) and
     FEAT_HAF (hardware update to the Access Flag) in the software page
     table walkers and shadow MMU

   - Allow page table destruction to reschedule, fixing long
     need_resched latencies observed when destroying a large VM

   - Minor fixes to KVM and selftests

  Loongarch:

   - Get VM PMU capability from HW GCFG register

   - Add AVEC basic support

   - Use 64-bit register definition for EIOINTC

   - Add KVM timer test cases for tools/selftests

  RISC/V:

   - SBI message passing (MPXY) support for KVM guest

   - Give a new, more specific error subcode for the case when in-kernel
     AIA virtualization fails to allocate IMSIC VS-file

   - Support KVM_DIRTY_LOG_INITIALLY_SET, enabling dirty log gradually
     in small chunks

   - Fix guest page fault within HLV* instructions

   - Flush VS-stage TLB after VCPU migration for Andes cores

  s390:

   - Always allocate ESCA (Extended System Control Area), instead of
     starting with the basic SCA and converting to ESCA with the
     addition of the 65th vCPU. The price is increased number of exits
     (and worse performance) on z10 and earlier processor; ESCA was
     introduced by z114/z196 in 2010

   - VIRT_XFER_TO_GUEST_WORK support

   - Operation exception forwarding support

   - Cleanups

  x86:

   - Skip the costly "zap all SPTEs" on an MMIO generation wrap if MMIO
     SPTE caching is disabled, as there can't be any relevant SPTEs to
     zap

   - Relocate a misplaced export

   - Fix an async #PF bug where KVM would clear the completion queue
     when the guest transitioned in and out of paging mode, e.g. when
     handling an SMI and then returning to paged mode via RSM

   - Leave KVM's user-return notifier registered even when disabling
     virtualization, as long as kvm.ko is loaded. On reboot/shutdown,
     keeping the notifier registered is ok; the kernel does not use the
     MSRs and the callback will run cleanly and restore host MSRs if the
     CPU manages to return to userspace before the system goes down

   - Use the checked version of {get,put}_user()

   - Fix a long-lurking bug where KVM's lack of catch-up logic for
     periodic APIC timers can result in a hard lockup in the host

   - Revert the periodic kvmclock sync logic now that KVM doesn't use a
     clocksource that's subject to NTP corrections

   - Clean up KVM's handling of MMIO Stale Data and L1TF, and bury the
     latter behind CONFIG_CPU_MITIGATIONS

   - Context switch XCR0, XSS, and PKRU outside of the entry/exit fast
     path; the only reason they were handled in the fast path was to
     paper of a bug in the core #MC code, and that has long since been
     fixed

   - Add emulator support for AVX MOV instructions, to play nice with
     emulated devices whose guest drivers like to access PCI BARs with
     large multi-byte instructions

  x86 (AMD):

   - Fix a few missing "VMCB dirty" bugs

   - Fix the worst of KVM's lack of EFER.LMSLE emulation

   - Add AVIC support for addressing 4k vCPUs in x2AVIC mode

   - Fix incorrect handling of selective CR0 writes when checking
     intercepts during emulation of L2 instructions

   - Fix a currently-benign bug where KVM would clobber SPEC_CTRL[63:32]
     on VMRUN and #VMEXIT

   - Fix a bug where KVM corrupt the guest code stream when re-injecting
     a soft interrupt if the guest patched the underlying code after the
     VM-Exit, e.g. when Linux patches code with a temporary INT3

   - Add KVM_X86_SNP_POLICY_BITS to advertise supported SNP policy bits
     to userspace, and extend KVM "support" to all policy bits that
     don't require any actual support from KVM

  x86 (Intel):

   - Use the root role from kvm_mmu_page to construct EPTPs instead of
     the current vCPU state, partly as worthwhile cleanup, but mostly to
     pave the way for tracking per-root TLB flushes, and elide EPT
     flushes on pCPU migration if the root is clean from a previous
     flush

   - Add a few missing nested consistency checks

   - Rip out support for doing "early" consistency checks via hardware
     as the functionality hasn't been used in years and is no longer
     useful in general; replace it with an off-by-default module param
     to WARN if hardware fails a check that KVM does not perform

   - Fix a currently-benign bug where KVM would drop the guest's
     SPEC_CTRL[63:32] on VM-Enter

   - Misc cleanups

   - Overhaul the TDX code to address systemic races where KVM (acting
     on behalf of userspace) could inadvertantly trigger lock contention
     in the TDX-Module; KVM was either working around these in weird,
     ugly ways, or was simply oblivious to them (though even Yan's
     devilish selftests could only break individual VMs, not the host
     kernel)

   - Fix a bug where KVM could corrupt a vCPU's cpu_list when freeing a
     TDX vCPU, if creating said vCPU failed partway through

   - Fix a few sparse warnings (bad annotation, 0 != NULL)

   - Use struct_size() to simplify copying TDX capabilities to userspace

   - Fix a bug where TDX would effectively corrupt user-return MSR
     values if the TDX Module rejects VP.ENTER and thus doesn't clobber
     host MSRs as expected

  Selftests:

   - Fix a math goof in mmu_stress_test when running on a single-CPU
     system/VM

   - Forcefully override ARCH from x86_64 to x86 to play nice with
     specifying ARCH=x86_64 on the command line

   - Extend a bunch of nested VMX to validate nested SVM as well

   - Add support for LA57 in the core VM_MODE_xxx macro, and add a test
     to verify KVM can save/restore nested VMX state when L1 is using
     5-level paging, but L2 is not

   - Clean up the guest paging code in anticipation of sharing the core
     logic for nested EPT and nested NPT

  guest_memfd:

   - Add NUMA mempolicy support for guest_memfd, and clean up a variety
     of rough edges in guest_memfd along the way

   - Define a CLASS to automatically handle get+put when grabbing a
     guest_memfd from a memslot to make it harder to leak references

   - Enhance KVM selftests to make it easer to develop and debug
     selftests like those added for guest_memfd NUMA support, e.g. where
     test and/or KVM bugs often result in hard-to-debug SIGBUS errors

   - Misc cleanups

  Generic:

   - Use the recently-added WQ_PERCPU when creating the per-CPU
     workqueue for irqfd cleanup

   - Fix a goof in the dirty ring documentation

   - Fix choice of target for directed yield across different calls to
     kvm_vcpu_on_spin(); the function was always starting from the first
     vCPU instead of continuing the round-robin search"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (260 commits)
  KVM: arm64: at: Update AF on software walk only if VM has FEAT_HAFDBS
  KVM: arm64: at: Use correct HA bit in TCR_EL2 when regime is EL2
  KVM: arm64: Document KVM_PGTABLE_PROT_{UX,PX}
  KVM: arm64: Fix spelling mistake "Unexpeced" -> "Unexpected"
  KVM: arm64: Add break to default case in kvm_pgtable_stage2_pte_prot()
  KVM: arm64: Add endian casting to kvm_swap_s[12]_desc()
  KVM: arm64: Fix compilation when CONFIG_ARM64_USE_LSE_ATOMICS=n
  KVM: arm64: selftests: Add test for AT emulation
  KVM: arm64: nv: Expose hardware access flag management to NV guests
  KVM: arm64: nv: Implement HW access flag management in stage-2 SW PTW
  KVM: arm64: Implement HW access flag management in stage-1 SW PTW
  KVM: arm64: Propagate PTW errors up to AT emulation
  KVM: arm64: Add helper for swapping guest descriptor
  KVM: arm64: nv: Use pgtable definitions in stage-2 walk
  KVM: arm64: Handle endianness in read helper for emulated PTW
  KVM: arm64: nv: Stop passing vCPU through void ptr in S2 PTW
  KVM: arm64: Call helper for reading descriptors directly
  KVM: arm64: nv: Advertise support for FEAT_XNX
  KVM: arm64: Teach ptdump about FEAT_XNX permissions
  KVM: s390: Use generic VIRT_XFER_TO_GUEST_WORK functions
  ...
2025-12-05 17:01:20 -08:00
Alexey Kardashevskiy
4be423572d crypto/ccp: Implement SEV-TIO PCIe IDE (phase1)
Implement the SEV-TIO (Trusted I/O) firmware interface for PCIe TDISP
(Trust Domain In-Socket Protocol). This enables secure communication
between trusted domains and PCIe devices through the PSP (Platform
Security Processor).

The implementation includes:
- Device Security Manager (DSM) operations for establishing secure links
- SPDM (Security Protocol and Data Model) over DOE (Data Object Exchange)
- IDE (Integrity Data Encryption) stream management for secure PCIe

This module bridges the SEV firmware stack with the generic PCIe TSM
framework.

This is phase1 as described in Documentation/driver-api/pci/tsm.rst.

On AMD SEV, the AMD PSP firmware acts as TSM (manages the security/trust).
The CCP driver provides the interface to it and registers in the TSM
subsystem.

Detect the PSP support (reported via FEATURE_INFO + SNP_PLATFORM_STATUS)
and enable SEV-TIO in the SNP_INIT_EX call if the hardware supports TIO.

Implement SEV TIO PSP command wrappers in sev-dev-tio.c and store
the data in the SEV-TIO-specific structs.

Implement TSM hooks and IDE setup in sev-dev-tsm.c.

Signed-off-by: Alexey Kardashevskiy <aik@amd.com>
Link: https://patch.msgid.link/692f506bb80c9_261c11004@dwillia2-mobl4.notmuch
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2025-12-02 12:50:33 -08:00
Alexey Kardashevskiy
8a5dd102e4 ccp: Make snp_reclaim_pages and __sev_do_cmd_locked public
The snp_reclaim_pages() helper reclaims pages in the FW state. SEV-TIO
and the TMPM driver (a hardware engine which smashes IOMMU PDEs among
other things) will use to reclaim memory when cleaning up.

Share and export snp_reclaim_pages().

Most of the SEV-TIO code uses sev_do_cmd() which locks the sev_cmd_mutex
and already exported. But the SNP init code (which also sets up SEV-TIO)
executes under the sev_cmd_mutex lock so the SEV-TIO code has to use
the __sev_do_cmd_locked() helper. This one though does not need to be
exported/shared globally as SEV-TIO is a part of the CCP driver still.

Share __sev_do_cmd_locked() via the CCP internal header.

Signed-off-by: Alexey Kardashevskiy <aik@amd.com>
Link: https://patch.msgid.link/20251202024449.542361-2-aik@amd.com
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2025-12-02 12:05:51 -08:00
Tom Lendacky
c9434e64e8 crypto: ccp - Add an API to return the supported SEV-SNP policy bits
Supported policy bits are dependent on the level of SEV firmware that is
currently running. Create an API to return the supported policy bits for
the current level of firmware.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Link: https://patch.msgid.link/e3f711366ddc22e3dd215c987fd2e28dc1c07f54.1761593632.git.thomas.lendacky@amd.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-11-14 10:30:10 -08:00
Christian Brauner
b7b4f7554b sev-dev: use override credential guards
Use override credential guards for scoped credential override with
automatic restoration on scope exit.

Link: https://patch.msgid.link/20251103-work-creds-guards-prepare_creds-v1-4-b447b82f2c9b@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-05 23:11:42 +01:00
Christian Brauner
89c545e29e sev-dev: use prepare credential guard
Use the prepare credential guard for allocating a new set of
credentials.

Link: https://patch.msgid.link/20251103-work-creds-guards-prepare_creds-v1-3-b447b82f2c9b@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-04 12:37:01 +01:00
Christian Brauner
4c5941ca11 sev-dev: use guard for path
Just use a guard and also move the path_put() out of the credential
change's scope. There's no need to do this with the overridden
credentials.

Link: https://patch.msgid.link/20251103-work-creds-guards-prepare_creds-v1-2-b447b82f2c9b@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-04 12:37:00 +01:00
Linus Torvalds
908057d185 Merge tag 'v6.18-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
 "Drivers:
   - Add ciphertext hiding support to ccp
   - Add hashjoin, gather and UDMA data move features to hisilicon
   - Add lz4 and lz77_only to hisilicon
   - Add xilinx hwrng driver
   - Add ti driver with ecb/cbc aes support
   - Add ring buffer idle and command queue telemetry for GEN6 in qat

  Others:
   - Use rcu_dereference_all to stop false alarms in rhashtable
   - Fix CPU number wraparound in padata"

* tag 'v6.18-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (78 commits)
  dt-bindings: rng: hisi-rng: convert to DT schema
  crypto: doc - Add explicit title heading to API docs
  hwrng: ks-sa - fix division by zero in ks_sa_rng_init
  KEYS: X.509: Fix Basic Constraints CA flag parsing
  crypto: anubis - simplify return statement in anubis_mod_init
  crypto: hisilicon/qm - set NULL to qm->debug.qm_diff_regs
  crypto: hisilicon/qm - clear all VF configurations in the hardware
  crypto: hisilicon - enable error reporting again
  crypto: hisilicon/qm - mask axi error before memory init
  crypto: hisilicon/qm - invalidate queues in use
  crypto: qat - Return pointer directly in adf_ctl_alloc_resources
  crypto: aspeed - Fix dma_unmap_sg() direction
  rhashtable: Use rcu_dereference_all and rcu_dereference_all_check
  crypto: comp - Use same definition of context alloc and free ops
  crypto: omap - convert from tasklet to BH workqueue
  crypto: qat - Replace kzalloc() + copy_from_user() with memdup_user()
  crypto: caam - double the entropy delay interval for retry
  padata: WQ_PERCPU added to alloc_workqueue users
  padata: replace use of system_unbound_wq with system_dfl_wq
  crypto: cryptd - WQ_PERCPU added to alloc_workqueue users
  ...
2025-10-04 14:59:29 -07:00
Linus Torvalds
bed0653fe2 Merge tag 'iommu-updates-v6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux
Pull iommu updates from Joerg Roedel:

 - Inte VT-d:
    - IOMMU driver updated to the latest VT-d specification
    - Don't enable PRS if PDS isn't supported
    - Replace snprintf with scnprintf
    - Fix legacy mode page table dump through debugfs
    - Miscellaneous cleanups

 - AMD-Vi:
     - Support kdump boot when SNP is enabled

 - Apple-DART:
     - 4-level page-table support

 - RISC-V IOMMU:
     - ACPI support

 - Small number of miscellaneous cleanups and fixes

* tag 'iommu-updates-v6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux: (22 commits)
  iommu/vt-d: Disallow dirty tracking if incoherent page walk
  iommu/vt-d: debugfs: Avoid dumping context command register
  iommu/vt-d: Removal of Advanced Fault Logging
  iommu/vt-d: PRS isn't usable if PDS isn't supported
  iommu/vt-d: Remove LPIG from page group response descriptor
  iommu/vt-d: Drop unused cap_super_offset()
  iommu/vt-d: debugfs: Fix legacy mode page table dump logic
  iommu/vt-d: Replace snprintf with scnprintf in dmar_latency_snapshot()
  iommu/io-pgtable-dart: Fix off by one error in table index check
  iommu/riscv: Add ACPI support
  ACPI: scan: Add support for RISC-V in acpi_iommu_configure_id()
  ACPI: RISC-V: Add support for RIMT
  iommu/omap: Use int type to store negative error codes
  iommu/apple-dart: Clear stream error indicator bits for T8110 DARTs
  iommu/amd: Skip enabling command/event buffers for kdump
  crypto: ccp: Skip SEV and SNP INIT for kdump boot
  iommu/amd: Reuse device table for kdump
  iommu/amd: Add support to remap/unmap IOMMU buffers for kdump
  iommu/apple-dart: Add 4-level page table support
  iommu/io-pgtable-dart: Add 4-level page table support
  ...
2025-10-03 18:00:11 -07:00
Linus Torvalds
22bdd6e68b Merge tag 'x86_apic_for_v6.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 SEV and apic updates from Borislav Petkov:

 - Add functionality to provide runtime firmware updates for the non-x86
   parts of an AMD platform like the security processor (ASP) firmware,
   modules etc, for example. The intent being that these updates are
   interim, live fixups before a proper BIOS update can be attempted

 - Add guest support for AMD's Secure AVIC feature which gives encrypted
   guests the needed protection against a malicious hypervisor
   generating unexpected interrupts and injecting them into such guest,
   thus interfering with its operation in an unexpected and negative
   manner.

   The advantage of this scheme is that the guest determines which
   interrupts and when to accept them vs leaving that to the benevolence
   (or not) of the hypervisor

 - Strictly separate the startup code from the rest of the kernel where
   former is executed from the initial 1:1 mapping of memory.

   The problem was that the toolchain-generated version of the code was
   being executed from a different mapping of memory than what was
   "assumed" during code generation, needing an ever-growing pile of
   fixups for absolute memory references which are invalid in the early,
   1:1 memory mapping during boot.

   The major advantage of this is that there's no need to check the 1:1
   mapping portion of the code for absolute relocations anymore and get
   rid of the RIP_REL_REF() macro sprinkling all over the place.

   For more info, see Ard's very detailed writeup on this [1]

 - The usual cleanups and fixes

Link: https://lore.kernel.org/r/CAMj1kXEzKEuePEiHB%2BHxvfQbFz0sTiHdn4B%2B%2BzVBJ2mhkPkQ4Q@mail.gmail.com [1]

* tag 'x86_apic_for_v6.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (49 commits)
  x86/boot: Drop erroneous __init annotation from early_set_pages_state()
  crypto: ccp - Add AMD Seamless Firmware Servicing (SFS) driver
  crypto: ccp - Add new HV-Fixed page allocation/free API
  x86/sev: Add new dump_rmp parameter to snp_leak_pages() API
  x86/startup/sev: Document the CPUID flow in the boot #VC handler
  objtool: Ignore __pi___cfi_ prefixed symbols
  x86/sev: Zap snp_abort()
  x86/apic/savic: Do not use snp_abort()
  x86/boot: Get rid of the .head.text section
  x86/boot: Move startup code out of __head section
  efistub/x86: Remap inittext read-execute when needed
  x86/boot: Create a confined code area for startup code
  x86/kbuild: Incorporate boot/startup/ via Kbuild makefile
  x86/boot: Revert "Reject absolute references in .head.text"
  x86/boot: Check startup code for absence of absolute relocations
  objtool: Add action to check for absence of absolute relocations
  x86/sev: Export startup routines for later use
  x86/sev: Move __sev_[get|put]_ghcb() into separate noinstr object
  x86/sev: Provide PIC aliases for SEV related data objects
  x86/boot: Provide PIC aliases for 5-level paging related constants
  ...
2025-09-30 13:40:35 -07:00
Joerg Roedel
5f4b8c03f4 Merge branches 'apple/dart', 'ti/omap', 'riscv', 'intel/vt-d' and 'amd/amd-vi' into next 2025-09-26 10:03:33 +02:00
Ashish Kalra
e09701dcdd crypto: ccp - Add new HV-Fixed page allocation/free API
When SEV-SNP is active, the TEE extended command header page and all output
buffers for TEE extended commands (such as used by Seamless Firmware servicing
support) must be in hypervisor-fixed state, assigned to the hypervisor and
marked immutable in the RMP entrie(s).

Add a new generic SEV API interface to allocate/free hypervisor fixed pages
which abstracts hypervisor fixed page allocation/free for PSP sub devices. The
API internally uses SNP_INIT_EX to transition pages to HV-Fixed page state.

If SNP is not enabled then the allocator is simply a wrapper over
alloc_pages() and __free_pages().

When the sub device free the pages, they are put on a free list and future
allocation requests will try to re-use the freed pages from this list. But
this list is not preserved across PSP driver load/unload hence this free/reuse
support is only supported while PSP driver is loaded. As HV_FIXED page state
is only changed at reboot, these pages are leaked as they cannot be returned
back to the page allocator and then potentially allocated to guests, which
will cause SEV-SNP guests to fail to start or terminate when accessing the
HV_FIXED page.

Suggested-by: Thomas Lendacky <Thomas.Lendacky@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Link: https://lore.kernel.org/cover.1758057691.git.ashish.kalra@amd.com
2025-09-17 12:11:39 +02:00
Qianfeng Rong
e002780c14 crypto: ccp - Use int type to store negative error codes
Change the 'ret' variable in __sev_do_cmd_locked() from unsigned int to
int, as it needs to store negative error codes.

No effect on runtime.

Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-09-13 12:11:05 +08:00
Borislav Petkov (AMD)
46834d90a9 crypto: ccp - Always pass in an error pointer to __sev_platform_shutdown_locked()
When

  9770b428b1 ("crypto: ccp - Move dev_info/err messages for SEV/SNP init and shutdown")

moved the error messages dumping so that they don't need to be issued by
the callers, it missed the case where __sev_firmware_shutdown() calls
__sev_platform_shutdown_locked() with a NULL argument which leads to
a NULL ptr deref on the shutdown path, during suspend to disk:

  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 0 P4D 0
  Oops: Oops: 0000 [#1] SMP NOPTI
  CPU: 0 UID: 0 PID: 983 Comm: hib.sh Not tainted 6.17.0-rc4+ #1 PREEMPT(voluntary)
  Hardware name: Supermicro Super Server/H12SSL-i, BIOS 2.5 09/08/2022
  RIP: 0010:__sev_platform_shutdown_locked.cold+0x0/0x21 [ccp]

That rIP is:

  00000000000006fd <__sev_platform_shutdown_locked.cold>:
   6fd:   8b 13                   mov    (%rbx),%edx
   6ff:   48 8b 7d 00             mov    0x0(%rbp),%rdi
   703:   89 c1                   mov    %eax,%ecx

  Code: 74 05 31 ff 41 89 3f 49 8b 3e 89 ea 48 c7 c6 a0 8e 54 a0 41 bf 92 ff ff ff e8 e5 2e 09 e1 c6 05 2a d4 38 00 01 e9 26 af ff ff <8b> 13 48 8b 7d 00 89 c1 48 c7 c6 18 90 54 a0 89 44 24 04 e8 c1 2e
  RSP: 0018:ffffc90005467d00 EFLAGS: 00010282
  RAX: 00000000ffffff92 RBX: 0000000000000000 RCX: 0000000000000000
  			     ^^^^^^^^^^^^^^^^
and %rbx is nice and clean.

  Call Trace:
   <TASK>
   __sev_firmware_shutdown.isra.0
   sev_dev_destroy
   psp_dev_destroy
   sp_destroy
   pci_device_shutdown
   device_shutdown
   kernel_power_off
   hibernate.cold
   state_store
   kernfs_fop_write_iter
   vfs_write
   ksys_write
   do_syscall_64
   entry_SYSCALL_64_after_hwframe

Pass in a pointer to the function-local error var in the caller.

With that addressed, suspending the ccp shows the error properly at
least:

  ccp 0000:47:00.1: sev command 0x2 timed out, disabling PSP
  ccp 0000:47:00.1: SEV: failed to SHUTDOWN error 0x0, rc -110
  SEV-SNP: Leaking PFN range 0x146800-0x146a00
  SEV-SNP: PFN 0x146800 unassigned, dumping non-zero entries in 2M PFN region: [0x146800 - 0x146a00]
  ...
  ccp 0000:47:00.1: SEV-SNP firmware shutdown failed, rc -16, error 0x0
  ACPI: PM: Preparing to enter system sleep state S5
  kvm: exiting hardware virtualization
  reboot: Power down

Btw, this driver is crying to be cleaned up to pass in a proper I/O
struct which can be used to store information between the different
functions, otherwise stuff like that will happen in the future again.

Fixes: 9770b428b1 ("crypto: ccp - Move dev_info/err messages for SEV/SNP init and shutdown")
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: <stable@kernel.org>
Reviewed-by: Ashish Kalra <ashish.kalra@amd.com>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-09-13 12:07:44 +08:00
Ashish Kalra
8c571019d8 crypto: ccp: Skip SEV and SNP INIT for kdump boot
Since SEV or SNP may already be initialized in the previous kernel,
attempting to initialize them again in the kdump kernel can result
in SNP initialization failures, which in turn lead to IOMMU
initialization failures. Moreover, SNP/SEV guests are not run under a
kdump kernel, so there is no need to initialize SEV or SNP during
kdump boot.

Skip SNP and SEV INIT if doing kdump boot.

Tested-by: Sairaj Kodilkar <sarunkod@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Link: https://lore.kernel.org/r/d884eff5f6180d8b8c6698a6168988118cf9cba1.1756157913.git.ashish.kalra@amd.com
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2025-09-05 14:45:16 +02:00
Michael Roth
ed53a5050f crypto: ccp - Fix checks for SNP_VLEK_LOAD input buffer length
The SNP_VLEK_LOAD IOCTL currently fails due to sev_cmd_buffer_len()
returning the default expected buffer length of 0 instead of the correct
value, which would be sizeof(struct sev_user_data_snp_vlek_load). Add
specific handling for SNP_VLEK_LOAD so the correct expected size is
returned.

Reported-by: Diego GonzalezVillalobos <Diego.GonzalezVillalobos@amd.com>
Cc: Diego GonzalezVillalobos <Diego.GonzalezVillalobos@amd.com>
Fixes: 332d2c1d71 ("crypto: ccp: Add the SNP_VLEK_LOAD command")
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-08-16 17:24:31 +08:00
Ashish Kalra
c9760b0fca crypto: ccp - Add support to enable CipherTextHiding on SNP_INIT_EX
To enable ciphertext hiding, it must be specified in the SNP_INIT_EX
command as part of SNP initialization.

Modify the sev_platform_init_args structure, which is used as input to
sev_platform_init(), to include a field that, when non-zero,
indicates that ciphertext hiding should be enabled and specifies the
maximum ASID that can be used for an SEV-SNP guest.

Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Reviewed-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-08-16 17:20:23 +08:00
Ashish Kalra
45d59bd4a3 crypto: ccp - Introduce new API interface to indicate SEV-SNP Ciphertext hiding feature
Implement an API that checks the overall feature support for SEV-SNP
ciphertext hiding.

This API verifies both the support of the SEV firmware for the feature
and its enablement in the platform's BIOS.

Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Reviewed-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-08-16 17:20:23 +08:00
Ashish Kalra
33cfb80d19 crypto: ccp - Add support for SNP_FEATURE_INFO command
The FEATURE_INFO command provides hypervisors with a programmatic means
to learn about the supported features of the currently loaded firmware.
This command mimics the CPUID instruction relative to sub-leaf input and
the four unsigned integer output values. To obtain information
regarding the features present in the currently loaded SEV firmware,
use the SNP_FEATURE_INFO command.

Cache the SNP platform status and feature information from CPUID
0x8000_0024 in the sev_device structure. If SNP is enabled, utilize
this cached SNP platform status for the API major, minor and build
version.

Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Reviewed-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-08-16 17:20:23 +08:00
Ashish Kalra
459daec42e crypto: ccp - Cache SEV platform status and platform state
Cache the SEV platform status into sev_device structure and use this
cached SEV platform status for api_major/minor/build.

The platform state is unique between SEV and SNP and hence needs to be
tracked independently.

Remove the state field from sev_device structure and instead track SEV
state from the cached SEV platform status.

Suggested-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Reviewed-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-08-16 17:20:23 +08:00
Alexey Kardashevskiy
b4abeccb8d crypto: ccp - Fix locking on alloc failure handling
The __snp_alloc_firmware_pages() helper allocates pages in the firmware
state (alloc + rmpupdate). In case of failed rmpupdate, it tries
reclaiming pages with already changed state. This requires calling
the PSP firmware and since there is sev_cmd_mutex to guard such calls,
the helper takes a "locked" parameter so specify if the lock needs to
be held.

Most calls happen from snp_alloc_firmware_page() which executes without
the lock. However

commit 24512afa43 ("crypto: ccp: Handle the legacy TMR allocation when SNP is enabled")

switched sev_fw_alloc() from alloc_pages() (which does not call the PSP) to
__snp_alloc_firmware_pages() (which does) but did not account for the fact
that sev_fw_alloc() is called from __sev_platform_init_locked()
(via __sev_platform_init_handle_tmr()) and executes with the lock held.

Add a "locked" parameter to __snp_alloc_firmware_pages().
Make sev_fw_alloc() use the new parameter to prevent potential deadlock in
rmp_mark_pages_firmware() if rmpupdate() failed.

Fixes: 24512afa43 ("crypto: ccp: Handle the legacy TMR allocation when SNP is enabled")
Signed-off-by: Alexey Kardashevskiy <aik@amd.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Pratik R. Sampat <prsampat@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-07-07 15:27:04 +12:00
Ashish Kalra
ab8b9fd39c crypto: ccp - Fix SNP panic notifier unregistration
Panic notifiers are invoked with RCU read lock held and when the
SNP panic notifier tries to unregister itself from the panic
notifier callback itself it causes a deadlock as notifier
unregistration does RCU synchronization.

Code flow for SNP panic notifier:
snp_shutdown_on_panic() ->
__sev_firmware_shutdown() ->
__sev_snp_shutdown_locked() ->
atomic_notifier_chain_unregister(.., &snp_panic_notifier)

Fix SNP panic notifier to unregister itself during SNP shutdown
only if panic is not in progress.

Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Cc: stable@vger.kernel.org
Fixes: 19860c3274 ("crypto: ccp - Register SNP panic notifier only if SNP is enabled")
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-06-23 17:00:27 +08:00
Ashish Kalra
0fa766726c crypto: ccp - Fix dereferencing uninitialized error pointer
Fix below smatch warnings:
drivers/crypto/ccp/sev-dev.c:1312 __sev_platform_init_locked()
error: we previously assumed 'error' could be null

Fixes: 9770b428b1 ("crypto: ccp - Move dev_info/err messages for SEV/SNP init and shutdown")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/r/202505071746.eWOx5QgC-lkp@intel.com/
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-06-13 17:26:17 +08:00
Linus Torvalds
785cdec46e Merge tag 'x86-core-2025-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core x86 updates from Ingo Molnar:
 "Boot code changes:

   - A large series of changes to reorganize the x86 boot code into a
     better isolated and easier to maintain base of PIC early startup
     code in arch/x86/boot/startup/, by Ard Biesheuvel.

     Motivation & background:

  	| Since commit
  	|
  	|    c88d71508e ("x86/boot/64: Rewrite startup_64() in C")
  	|
  	| dated Jun 6 2017, we have been using C code on the boot path in a way
  	| that is not supported by the toolchain, i.e., to execute non-PIC C
  	| code from a mapping of memory that is different from the one provided
  	| to the linker. It should have been obvious at the time that this was a
  	| bad idea, given the need to sprinkle fixup_pointer() calls left and
  	| right to manipulate global variables (including non-pointer variables)
  	| without crashing.
  	|
  	| This C startup code has been expanding, and in particular, the SEV-SNP
  	| startup code has been expanding over the past couple of years, and
  	| grown many of these warts, where the C code needs to use special
  	| annotations or helpers to access global objects.

     This tree includes the first phase of this work-in-progress x86
     boot code reorganization.

  Scalability enhancements and micro-optimizations:

   - Improve code-patching scalability (Eric Dumazet)

   - Remove MFENCEs for X86_BUG_CLFLUSH_MONITOR (Andrew Cooper)

  CPU features enumeration updates:

   - Thorough reorganization and cleanup of CPUID parsing APIs (Ahmed S.
     Darwish)

   - Fix, refactor and clean up the cacheinfo code (Ahmed S. Darwish,
     Thomas Gleixner)

   - Update CPUID bitfields to x86-cpuid-db v2.3 (Ahmed S. Darwish)

  Memory management changes:

   - Allow temporary MMs when IRQs are on (Andy Lutomirski)

   - Opt-in to IRQs-off activate_mm() (Andy Lutomirski)

   - Simplify choose_new_asid() and generate better code (Borislav
     Petkov)

   - Simplify 32-bit PAE page table handling (Dave Hansen)

   - Always use dynamic memory layout (Kirill A. Shutemov)

   - Make SPARSEMEM_VMEMMAP the only memory model (Kirill A. Shutemov)

   - Make 5-level paging support unconditional (Kirill A. Shutemov)

   - Stop prefetching current->mm->mmap_lock on page faults (Mateusz
     Guzik)

   - Predict valid_user_address() returning true (Mateusz Guzik)

   - Consolidate initmem_init() (Mike Rapoport)

  FPU support and vector computing:

   - Enable Intel APX support (Chang S. Bae)

   - Reorgnize and clean up the xstate code (Chang S. Bae)

   - Make task_struct::thread constant size (Ingo Molnar)

   - Restore fpu_thread_struct_whitelist() to fix
     CONFIG_HARDENED_USERCOPY=y (Kees Cook)

   - Simplify the switch_fpu_prepare() + switch_fpu_finish() logic (Oleg
     Nesterov)

   - Always preserve non-user xfeatures/flags in __state_perm (Sean
     Christopherson)

  Microcode loader changes:

   - Help users notice when running old Intel microcode (Dave Hansen)

   - AMD: Do not return error when microcode update is not necessary
     (Annie Li)

   - AMD: Clean the cache if update did not load microcode (Boris
     Ostrovsky)

  Code patching (alternatives) changes:

   - Simplify, reorganize and clean up the x86 text-patching code (Ingo
     Molnar)

   - Make smp_text_poke_batch_process() subsume
     smp_text_poke_batch_finish() (Nikolay Borisov)

   - Refactor the {,un}use_temporary_mm() code (Peter Zijlstra)

  Debugging support:

   - Add early IDT and GDT loading to debug relocate_kernel() bugs
     (David Woodhouse)

   - Print the reason for the last reset on modern AMD CPUs (Yazen
     Ghannam)

   - Add AMD Zen debugging document (Mario Limonciello)

   - Fix opcode map (!REX2) superscript tags (Masami Hiramatsu)

   - Stop decoding i64 instructions in x86-64 mode at opcode (Masami
     Hiramatsu)

  CPU bugs and bug mitigations:

   - Remove X86_BUG_MMIO_UNKNOWN (Borislav Petkov)

   - Fix SRSO reporting on Zen1/2 with SMT disabled (Borislav Petkov)

   - Restructure and harmonize the various CPU bug mitigation methods
     (David Kaplan)

   - Fix spectre_v2 mitigation default on Intel (Pawan Gupta)

  MSR API:

   - Large MSR code and API cleanup (Xin Li)

   - In-kernel MSR API type cleanups and renames (Ingo Molnar)

  PKEYS:

   - Simplify PKRU update in signal frame (Chang S. Bae)

  NMI handling code:

   - Clean up, refactor and simplify the NMI handling code (Sohil Mehta)

   - Improve NMI duration console printouts (Sohil Mehta)

  Paravirt guests interface:

   - Restrict PARAVIRT_XXL to 64-bit only (Kirill A. Shutemov)

  SEV support:

   - Share the sev_secrets_pa value again (Tom Lendacky)

  x86 platform changes:

   - Introduce the <asm/amd/> header namespace (Ingo Molnar)

   - i2c: piix4, x86/platform: Move the SB800 PIIX4 FCH definitions to
     <asm/amd/fch.h> (Mario Limonciello)

  Fixes and cleanups:

   - x86 assembly code cleanups and fixes (Uros Bizjak)

   - Misc fixes and cleanups (Andi Kleen, Andy Lutomirski, Andy
     Shevchenko, Ard Biesheuvel, Bagas Sanjaya, Baoquan He, Borislav
     Petkov, Chang S. Bae, Chao Gao, Dan Williams, Dave Hansen, David
     Kaplan, David Woodhouse, Eric Biggers, Ingo Molnar, Josh Poimboeuf,
     Juergen Gross, Malaya Kumar Rout, Mario Limonciello, Nathan
     Chancellor, Oleg Nesterov, Pawan Gupta, Peter Zijlstra, Shivank
     Garg, Sohil Mehta, Thomas Gleixner, Uros Bizjak, Xin Li)"

* tag 'x86-core-2025-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (331 commits)
  x86/bugs: Fix spectre_v2 mitigation default on Intel
  x86/bugs: Restructure ITS mitigation
  x86/xen/msr: Fix uninitialized variable 'err'
  x86/msr: Remove a superfluous inclusion of <asm/asm.h>
  x86/paravirt: Restrict PARAVIRT_XXL to 64-bit only
  x86/mm/64: Make 5-level paging support unconditional
  x86/mm/64: Make SPARSEMEM_VMEMMAP the only memory model
  x86/mm/64: Always use dynamic memory layout
  x86/bugs: Fix indentation due to ITS merge
  x86/cpuid: Rename hypervisor_cpuid_base()/for_each_possible_hypervisor_cpuid_base() to cpuid_base_hypervisor()/for_each_possible_cpuid_base_hypervisor()
  x86/cpu/intel: Rename CPUID(0x2) descriptors iterator parameter
  x86/cacheinfo: Rename CPUID(0x2) descriptors iterator parameter
  x86/cpuid: Rename cpuid_get_leaf_0x2_regs() to cpuid_leaf_0x2()
  x86/cpuid: Rename have_cpuid_p() to cpuid_feature()
  x86/cpuid: Set <asm/cpuid/api.h> as the main CPUID header
  x86/cpuid: Move CPUID(0x2) APIs into <cpuid/api.h>
  x86/msr: Add rdmsrl_on_cpu() compatibility wrapper
  x86/mm: Fix kernel-doc descriptions of various pgtable methods
  x86/asm-offsets: Export certain 'struct cpuinfo_x86' fields for 64-bit asm use too
  x86/boot: Defer initialization of VM space related global variables
  ...
2025-05-26 16:04:17 -07:00
Xin Li (Intel)
efef7f184f x86/msr: Add explicit includes of <asm/msr.h>
For historic reasons there are some TSC-related functions in the
<asm/msr.h> header, even though there's an <asm/tsc.h> header.

To facilitate the relocation of rdtsc{,_ordered}() from <asm/msr.h>
to <asm/tsc.h> and to eventually eliminate the inclusion of
<asm/msr.h> in <asm/tsc.h>, add an explicit <asm/msr.h> dependency
to the source files that reference definitions from <asm/msr.h>.

[ mingo: Clarified the changelog. ]

Signed-off-by: Xin Li (Intel) <xin@zytor.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Uros Bizjak <ubizjak@gmail.com>
Link: https://lore.kernel.org/r/20250501054241.1245648-1-xin@zytor.com
2025-05-02 10:23:47 +02:00
Ashish Kalra
9af6339a65 crypto: ccp - Fix __sev_snp_shutdown_locked
Fix smatch warning:
	drivers/crypto/ccp/sev-dev.c:1755 __sev_snp_shutdown_locked()
	error: uninitialized symbol 'dfflush_error'.

Fixes: 9770b428b1 ("crypto: ccp - Move dev_info/err messages for SEV/SNP init and shutdown")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/linux-crypto/d9c2e79c-e26e-47b7-8243-ff6e7b101ec3@stanley.mountain/
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-16 15:16:22 +08:00
Ingo Molnar
78255eb239 x86/msr: Rename 'wrmsrl()' to 'wrmsrq()'
Suggested-by: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Xin Li <xin@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
2025-04-10 11:58:33 +02:00
Herbert Xu
af7e23c616 crypto: ccp - Silence may-be-uninitialized warning in sev_ioctl_do_pdh_export
The recent reordering of code in sev_ioctl_do_pdh_export triggered
a false-positive may-be-uninitialized warning from gcc:

In file included from ../include/linux/sched/task.h:13,
                 from ../include/linux/sched/signal.h:9,
                 from ../include/linux/rcuwait.h:6,
                 from ../include/linux/percpu-rwsem.h:7,
                 from ../include/linux/fs.h:34,
                 from ../include/linux/compat.h:17,
                 from ../arch/x86/include/asm/ia32.h:7,
                 from ../arch/x86/include/asm/elf.h:10,
                 from ../include/linux/elf.h:6,
                 from ../include/linux/module.h:19,
                 from ../drivers/crypto/ccp/sev-dev.c:11:
In function ‘copy_to_user’,
    inlined from ‘sev_ioctl_do_pdh_export’ at ../drivers/crypto/ccp/sev-dev.c:2036:7,
    inlined from ‘sev_ioctl’ at ../drivers/crypto/ccp/sev-dev.c:2249:9:
../include/linux/uaccess.h:225:16: warning: ‘input_cert_chain_address’ may be used uninitialized [-Wmaybe-uninitialized]
  225 |         return _copy_to_user(to, from, n);
      |                ^~~~~~~~~~~~~~~~~~~~~~~~~~
../drivers/crypto/ccp/sev-dev.c: In function ‘sev_ioctl’:
../drivers/crypto/ccp/sev-dev.c:1961:22: note: ‘input_cert_chain_address’ was declared here
 1961 |         void __user *input_cert_chain_address;
      |                      ^~~~~~~~~~~~~~~~~~~~~~~~

Silence it by moving the initialisation of the variables in question
prior to the NULL check.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-08 15:54:38 +08:00
Ashish Kalra
3f8f0133a5 crypto: ccp - Move SEV/SNP Platform initialization to KVM
SNP initialization is forced during PSP driver probe purely because SNP
can't be initialized if VMs are running.  But the only in-tree user of
SEV/SNP functionality is KVM, and KVM depends on PSP driver for the same.
Forcing SEV/SNP initialization because a hypervisor could be running
legacy non-confidential VMs make no sense.

This patch removes SEV/SNP initialization from the PSP driver probe
time and moves the requirement to initialize SEV/SNP functionality
to KVM if it wants to use SEV/SNP.

Suggested-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Alexey Kardashevskiy <aik@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-08 15:54:38 +08:00
Ashish Kalra
f7b86e0e75 crypto: ccp - Add new SEV/SNP platform shutdown API
Add new API interface to do SEV/SNP platform shutdown when KVM module
is unloaded.

Reviewed-by: Dionna Glaze <dionnaglaze@google.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-07 13:22:26 +08:00
Ashish Kalra
19860c3274 crypto: ccp - Register SNP panic notifier only if SNP is enabled
Currently, the SNP panic notifier is registered on module initialization
regardless of whether SNP is being enabled or initialized.

Instead, register the SNP panic notifier only when SNP is actually
initialized and unregister the notifier when SNP is shutdown.

Reviewed-by: Dionna Glaze <dionnaglaze@google.com>
Reviewed-by: Alexey Kardashevskiy <aik@amd.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-07 13:22:26 +08:00
Ashish Kalra
65a895a44e crypto: ccp - Reset TMR size at SNP Shutdown
Implicit SNP initialization as part of some SNP ioctls modify TMR size
to be SNP compliant which followed by SNP shutdown will leave the
TMR size modified and then subsequently cause SEV only initialization
to fail, hence, reset TMR size to default at SNP Shutdown.

Acked-by: Dionna Glaze <dionnaglaze@google.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-07 13:22:26 +08:00
Ashish Kalra
ceac7fb89e crypto: ccp - Ensure implicit SEV/SNP init and shutdown in ioctls
Modify the behavior of implicit SEV initialization in some of the
SEV ioctls to do both SEV initialization and shutdown and add
implicit SNP initialization and shutdown to some of the SNP ioctls
so that the change of SEV/SNP platform initialization not being
done during PSP driver probe time does not break userspace tools
such as sevtool, etc.

Prior to this patch, SEV has always been initialized before these
ioctls as SEV initialization is done as part of PSP module probe,
but now with SEV initialization being moved to KVM module load instead
of PSP driver probe, the implied SEV INIT actually makes sense and gets
used and additionally to maintain SEV platform state consistency
before and after the ioctl SEV shutdown needs to be done after the
firmware call.

It is important to do SEV Shutdown here with the SEV/SNP initialization
moving to KVM, an implicit SEV INIT here as part of the SEV ioctls not
followed with SEV Shutdown will cause SEV to remain in INIT state and
then a future SNP INIT in KVM module load will fail.

Also ensure that for these SEV ioctls both implicit SNP and SEV INIT is
done followed by both SEV and SNP shutdown as RMP table must be
initialized before calling SEV INIT if SNP host support is enabled.

Similarly, prior to this patch, SNP has always been initialized before
these ioctls as SNP initialization is done as part of PSP module probe,
therefore, to keep a consistent behavior, SNP init needs to be done
here implicitly as part of these ioctls followed with SNP shutdown
before returning from the ioctl to maintain the consistent platform
state before and after the ioctl.

Suggested-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-07 13:22:26 +08:00
Ashish Kalra
9770b428b1 crypto: ccp - Move dev_info/err messages for SEV/SNP init and shutdown
Move dev_info and dev_err messages related to SEV/SNP initialization
and shutdown into __sev_platform_init_locked(), __sev_snp_init_locked()
and __sev_platform_shutdown_locked(), __sev_snp_shutdown_locked() so
that they don't need to be issued from callers.

This allows both _sev_platform_init_locked() and various SEV/SNP ioctls
to call __sev_platform_init_locked(), __sev_snp_init_locked() and
__sev_platform_shutdown_locked(), __sev_snp_shutdown_locked() for
implicit SEV/SNP initialization and shutdown without additionally
printing any errors/success messages.

Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-07 13:22:25 +08:00
Ashish Kalra
6131e119f5 crypto: ccp - Abort doing SEV INIT if SNP INIT fails
If SNP host support (SYSCFG.SNPEn) is set, then the RMP table must
be initialized before calling SEV INIT.

In other words, if SNP_INIT(_EX) is not issued or fails then
SEV INIT will fail if SNP host support (SYSCFG.SNPEn) is enabled.

Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-07 13:22:25 +08:00
Christian Brauner
25fe3d58e4 sev-dev: avoid pointless cred reference count bump
and fix a memory leak while at it. The new creds are created via
prepare_creds() and then reverted via put_cred(revert_creds()). The
additional reference count bump from override_creds() wasn't even taken
into account before.

Link: https://lore.kernel.org/r/20241125-work-cred-v2-8-68b9d38bb5b2@kernel.org
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-12-02 11:25:10 +01:00
Christian Brauner
51c0bcf097 tree-wide: s/revert_creds_light()/revert_creds()/g
Rename all calls to revert_creds_light() back to revert_creds().

Link: https://lore.kernel.org/r/20241125-work-cred-v2-6-68b9d38bb5b2@kernel.org
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-12-02 11:25:09 +01:00
Christian Brauner
6771e004b4 tree-wide: s/override_creds_light()/override_creds()/g
Rename all calls to override_creds_light() back to overrid_creds().

Link: https://lore.kernel.org/r/20241125-work-cred-v2-5-68b9d38bb5b2@kernel.org
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-12-02 11:25:09 +01:00
Christian Brauner
f905e00904 tree-wide: s/revert_creds()/put_cred(revert_creds_light())/g
Convert all calls to revert_creds() over to explicitly dropping
reference counts in preparation for converting revert_creds() to
revert_creds_light() semantics.

Link: https://lore.kernel.org/r/20241125-work-cred-v2-3-68b9d38bb5b2@kernel.org
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-12-02 11:25:09 +01:00
Christian Brauner
0a670e151a tree-wide: s/override_creds()/override_creds_light(get_new_cred())/g
Convert all callers from override_creds() to
override_creds_light(get_new_cred()) in preparation of making
override_creds() not take a separate reference at all.

Link: https://lore.kernel.org/r/20241125-work-cred-v2-1-68b9d38bb5b2@kernel.org
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-12-02 11:25:08 +01:00
Amit Shah
3401f63e72 crypto: ccp - do not request interrupt on cmd completion when irqs disabled
While sending a command to the PSP, we always requested an interrupt
from the PSP after command completion.  This worked for most cases.  For
the special case of irqs being disabled -- e.g. when running within
crashdump or kexec contexts, we should not set the SEV_CMDRESP_IOC flag,
so the PSP knows to not attempt interrupt delivery.

Fixes: 8ef979584e ("crypto: ccp: Add panic notifier for SEV/SNP firmware shutdown on kdump")

Based-on-patch-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Amit Shah <amit.shah@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-09-06 14:50:46 +08:00
Pavan Kumar Paluri
ce3d2d6b15 crypto: ccp - Properly unregister /dev/sev on sev PLATFORM_STATUS failure
In case of sev PLATFORM_STATUS failure, sev_get_api_version() fails
resulting in sev_data field of psp_master nulled out. This later becomes
a problem when unloading the ccp module because the device has not been
unregistered (via misc_deregister()) before clearing the sev_data field
of psp_master. As a result, on reloading the ccp module, a duplicate
device issue is encountered as can be seen from the dmesg log below.

on reloading ccp module via modprobe ccp

Call Trace:
  <TASK>
  dump_stack_lvl+0xd7/0xf0
  dump_stack+0x10/0x20
  sysfs_warn_dup+0x5c/0x70
  sysfs_create_dir_ns+0xbc/0xd
  kobject_add_internal+0xb1/0x2f0
  kobject_add+0x7a/0xe0
  ? srso_alias_return_thunk+0x5/0xfbef5
  ? get_device_parent+0xd4/0x1e0
  ? __pfx_klist_children_get+0x10/0x10
  device_add+0x121/0x870
  ? srso_alias_return_thunk+0x5/0xfbef5
  device_create_groups_vargs+0xdc/0x100
  device_create_with_groups+0x3f/0x60
  misc_register+0x13b/0x1c0
  sev_dev_init+0x1d4/0x290 [ccp]
  psp_dev_init+0x136/0x300 [ccp]
  sp_init+0x6f/0x80 [ccp]
  sp_pci_probe+0x2a6/0x310 [ccp]
  ? srso_alias_return_thunk+0x5/0xfbef5
  local_pci_probe+0x4b/0xb0
  work_for_cpu_fn+0x1a/0x30
  process_one_work+0x203/0x600
  worker_thread+0x19e/0x350
  ? __pfx_worker_thread+0x10/0x10
  kthread+0xeb/0x120
  ? __pfx_kthread+0x10/0x10
  ret_from_fork+0x3c/0x60
  ? __pfx_kthread+0x10/0x10
  ret_from_fork_asm+0x1a/0x30
  </TASK>
  kobject: kobject_add_internal failed for sev with -EEXIST, don't try to register things with the same name in the same directory.
  ccp 0000:22:00.1: sev initialization failed
  ccp 0000:22:00.1: psp initialization failed
  ccp 0000:a2:00.1: no command queues available
  ccp 0000:a2:00.1: psp enabled

Address this issue by unregistering the /dev/sev before clearing out
sev_data in case of PLATFORM_STATUS failure.

Fixes: 200664d523 ("crypto: ccp: Add Secure Encrypted Virtualization (SEV) command support")
Cc: stable@vger.kernel.org
Signed-off-by: Pavan Kumar Paluri <papaluri@amd.com>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-08-30 18:22:30 +08:00
Tom Lendacky
142a794bcf crypto: ccp - Add additional information about an SEV firmware upgrade
Print additional information, in the form of the old and new versions of
the SEV firmware, so that it can be seen what the base firmware was before
the upgrade.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-08-24 21:36:07 +08:00
Linus Torvalds
2c9b351240 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Paolo Bonzini:
 "ARM:

   - Initial infrastructure for shadow stage-2 MMUs, as part of nested
     virtualization enablement

   - Support for userspace changes to the guest CTR_EL0 value, enabling
     (in part) migration of VMs between heterogenous hardware

   - Fixes + improvements to pKVM's FF-A proxy, adding support for v1.1
     of the protocol

   - FPSIMD/SVE support for nested, including merged trap configuration
     and exception routing

   - New command-line parameter to control the WFx trap behavior under
     KVM

   - Introduce kCFI hardening in the EL2 hypervisor

   - Fixes + cleanups for handling presence/absence of FEAT_TCRX

   - Miscellaneous fixes + documentation updates

  LoongArch:

   - Add paravirt steal time support

   - Add support for KVM_DIRTY_LOG_INITIALLY_SET

   - Add perf kvm-stat support for loongarch

  RISC-V:

   - Redirect AMO load/store access fault traps to guest

   - perf kvm stat support

   - Use guest files for IMSIC virtualization, when available

  s390:

   - Assortment of tiny fixes which are not time critical

  x86:

   - Fixes for Xen emulation

   - Add a global struct to consolidate tracking of host values, e.g.
     EFER

   - Add KVM_CAP_X86_APIC_BUS_CYCLES_NS to allow configuring the
     effective APIC bus frequency, because TDX

   - Print the name of the APICv/AVIC inhibits in the relevant
     tracepoint

   - Clean up KVM's handling of vendor specific emulation to
     consistently act on "compatible with Intel/AMD", versus checking
     for a specific vendor

   - Drop MTRR virtualization, and instead always honor guest PAT on
     CPUs that support self-snoop

   - Update to the newfangled Intel CPU FMS infrastructure

   - Don't advertise IA32_PERF_GLOBAL_OVF_CTRL as an MSR-to-be-saved, as
     it reads '0' and writes from userspace are ignored

   - Misc cleanups

  x86 - MMU:

   - Small cleanups, renames and refactoring extracted from the upcoming
     Intel TDX support

   - Don't allocate kvm_mmu_page.shadowed_translation for shadow pages
     that can't hold leafs SPTEs

   - Unconditionally drop mmu_lock when allocating TDP MMU page tables
     for eager page splitting, to avoid stalling vCPUs when splitting
     huge pages

   - Bug the VM instead of simply warning if KVM tries to split a SPTE
     that is non-present or not-huge. KVM is guaranteed to end up in a
     broken state because the callers fully expect a valid SPTE, it's
     all but dangerous to let more MMU changes happen afterwards

  x86 - AMD:

   - Make per-CPU save_area allocations NUMA-aware

   - Force sev_es_host_save_area() to be inlined to avoid calling into
     an instrumentable function from noinstr code

   - Base support for running SEV-SNP guests. API-wise, this includes a
     new KVM_X86_SNP_VM type, encrypting/measure the initial image into
     guest memory, and finalizing it before launching it. Internally,
     there are some gmem/mmu hooks needed to prepare gmem-allocated
     pages before mapping them into guest private memory ranges

     This includes basic support for attestation guest requests, enough
     to say that KVM supports the GHCB 2.0 specification

     There is no support yet for loading into the firmware those signing
     keys to be used for attestation requests, and therefore no need yet
     for the host to provide certificate data for those keys.

     To support fetching certificate data from userspace, a new KVM exit
     type will be needed to handle fetching the certificate from
     userspace.

     An attempt to define a new KVM_EXIT_COCO / KVM_EXIT_COCO_REQ_CERTS
     exit type to handle this was introduced in v1 of this patchset, but
     is still being discussed by community, so for now this patchset
     only implements a stub version of SNP Extended Guest Requests that
     does not provide certificate data

  x86 - Intel:

   - Remove an unnecessary EPT TLB flush when enabling hardware

   - Fix a series of bugs that cause KVM to fail to detect nested
     pending posted interrupts as valid wake eents for a vCPU executing
     HLT in L2 (with HLT-exiting disable by L1)

   - KVM: x86: Suppress MMIO that is triggered during task switch
     emulation

     Explicitly suppress userspace emulated MMIO exits that are
     triggered when emulating a task switch as KVM doesn't support
     userspace MMIO during complex (multi-step) emulation

     Silently ignoring the exit request can result in the
     WARN_ON_ONCE(vcpu->mmio_needed) firing if KVM exits to userspace
     for some other reason prior to purging mmio_needed

     See commit 0dc902267c ("KVM: x86: Suppress pending MMIO write
     exits if emulator detects exception") for more details on KVM's
     limitations with respect to emulated MMIO during complex emulator
     flows

  Generic:

   - Rename the AS_UNMOVABLE flag that was introduced for KVM to
     AS_INACCESSIBLE, because the special casing needed by these pages
     is not due to just unmovability (and in fact they are only
     unmovable because the CPU cannot access them)

   - New ioctl to populate the KVM page tables in advance, which is
     useful to mitigate KVM page faults during guest boot or after live
     migration. The code will also be used by TDX, but (probably) not
     through the ioctl

   - Enable halt poll shrinking by default, as Intel found it to be a
     clear win

   - Setup empty IRQ routing when creating a VM to avoid having to
     synchronize SRCU when creating a split IRQCHIP on x86

   - Rework the sched_in/out() paths to replace kvm_arch_sched_in() with
     a flag that arch code can use for hooking both sched_in() and
     sched_out()

   - Take the vCPU @id as an "unsigned long" instead of "u32" to avoid
     truncating a bogus value from userspace, e.g. to help userspace
     detect bugs

   - Mark a vCPU as preempted if and only if it's scheduled out while in
     the KVM_RUN loop, e.g. to avoid marking it preempted and thus
     writing guest memory when retrieving guest state during live
     migration blackout

  Selftests:

   - Remove dead code in the memslot modification stress test

   - Treat "branch instructions retired" as supported on all AMD Family
     17h+ CPUs

   - Print the guest pseudo-RNG seed only when it changes, to avoid
     spamming the log for tests that create lots of VMs

   - Make the PMU counters test less flaky when counting LLC cache
     misses by doing CLFLUSH{OPT} in every loop iteration"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (227 commits)
  crypto: ccp: Add the SNP_VLEK_LOAD command
  KVM: x86/pmu: Add kvm_pmu_call() to simplify static calls of kvm_pmu_ops
  KVM: x86: Introduce kvm_x86_call() to simplify static calls of kvm_x86_ops
  KVM: x86: Replace static_call_cond() with static_call()
  KVM: SEV: Provide support for SNP_EXTENDED_GUEST_REQUEST NAE event
  x86/sev: Move sev_guest.h into common SEV header
  KVM: SEV: Provide support for SNP_GUEST_REQUEST NAE event
  KVM: x86: Suppress MMIO that is triggered during task switch emulation
  KVM: x86/mmu: Clean up make_huge_page_split_spte() definition and intro
  KVM: x86/mmu: Bug the VM if KVM tries to split a !hugepage SPTE
  KVM: selftests: x86: Add test for KVM_PRE_FAULT_MEMORY
  KVM: x86: Implement kvm_arch_vcpu_pre_fault_memory()
  KVM: x86/mmu: Make kvm_mmu_do_page_fault() return mapped level
  KVM: x86/mmu: Account pf_{fixed,emulate,spurious} in callers of "do page fault"
  KVM: x86/mmu: Bump pf_taken stat only in the "real" page fault handler
  KVM: Add KVM_PRE_FAULT_MEMORY vcpu ioctl to pre-populate guest memory
  KVM: Document KVM_PRE_FAULT_MEMORY ioctl
  mm, virt: merge AS_UNMOVABLE and AS_INACCESSIBLE
  perf kvm: Add kvm-stat for loongarch64
  LoongArch: KVM: Add PV steal time support in guest side
  ...
2024-07-20 12:41:03 -07:00
Michael Roth
332d2c1d71 crypto: ccp: Add the SNP_VLEK_LOAD command
When requesting an attestation report a guest is able to specify whether
it wants SNP firmware to sign the report using either a Versioned Chip
Endorsement Key (VCEK), which is derived from chip-unique secrets, or a
Versioned Loaded Endorsement Key (VLEK) which is obtained from an AMD
Key Derivation Service (KDS) and derived from seeds allocated to
enrolled cloud service providers (CSPs).

For VLEK keys, an SNP_VLEK_LOAD SNP firmware command is used to load
them into the system after obtaining them from the KDS. Add a
corresponding userspace interface so to allow the loading of VLEK keys
into the system.

See SEV-SNP Firmware ABI 1.54, SNP_VLEK_LOAD for more details.

Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Message-ID: <20240501085210.2213060-21-michael.roth@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-07-17 12:46:26 -04:00
Kim Phillips
468e329577 crypto: ccp - Fix null pointer dereference in __sev_snp_shutdown_locked
Fix a null pointer dereference induced by DEBUG_TEST_DRIVER_REMOVE.
Return from __sev_snp_shutdown_locked() if the psp_device or the
sev_device structs are not initialized. Without the fix, the driver will
produce the following splat:

   ccp 0000:55:00.5: enabling device (0000 -> 0002)
   ccp 0000:55:00.5: sev enabled
   ccp 0000:55:00.5: psp enabled
   BUG: kernel NULL pointer dereference, address: 00000000000000f0
   #PF: supervisor read access in kernel mode
   #PF: error_code(0x0000) - not-present page
   PGD 0 P4D 0
   Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC NOPTI
   CPU: 262 PID: 1 Comm: swapper/0 Not tainted 6.9.0-rc1+ #29
   RIP: 0010:__sev_snp_shutdown_locked+0x2e/0x150
   Code: 00 55 48 89 e5 41 57 41 56 41 54 53 48 83 ec 10 41 89 f7 49 89 fe 65 48 8b 04 25 28 00 00 00 48 89 45 d8 48 8b 05 6a 5a 7f 06 <4c> 8b a0 f0 00 00 00 41 0f b6 9c 24 a2 00 00 00 48 83 fb 02 0f 83
   RSP: 0018:ffffb2ea4014b7b8 EFLAGS: 00010286
   RAX: 0000000000000000 RBX: ffff9e4acd2e0a28 RCX: 0000000000000000
   RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffb2ea4014b808
   RBP: ffffb2ea4014b7e8 R08: 0000000000000106 R09: 000000000003d9c0
   R10: 0000000000000001 R11: ffffffffa39ff070 R12: ffff9e49d40590c8
   R13: 0000000000000000 R14: ffffb2ea4014b808 R15: 0000000000000000
   FS:  0000000000000000(0000) GS:ffff9e58b1e00000(0000) knlGS:0000000000000000
   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
   CR2: 00000000000000f0 CR3: 0000000418a3e001 CR4: 0000000000770ef0
   PKRU: 55555554
   Call Trace:
    <TASK>
    ? __die_body+0x6f/0xb0
    ? __die+0xcc/0xf0
    ? page_fault_oops+0x330/0x3a0
    ? save_trace+0x2a5/0x360
    ? do_user_addr_fault+0x583/0x630
    ? exc_page_fault+0x81/0x120
    ? asm_exc_page_fault+0x2b/0x30
    ? __sev_snp_shutdown_locked+0x2e/0x150
    __sev_firmware_shutdown+0x349/0x5b0
    ? pm_runtime_barrier+0x66/0xe0
    sev_dev_destroy+0x34/0xb0
    psp_dev_destroy+0x27/0x60
    sp_destroy+0x39/0x90
    sp_pci_remove+0x22/0x60
    pci_device_remove+0x4e/0x110
    really_probe+0x271/0x4e0
    __driver_probe_device+0x8f/0x160
    driver_probe_device+0x24/0x120
    __driver_attach+0xc7/0x280
    ? driver_attach+0x30/0x30
    bus_for_each_dev+0x10d/0x130
    driver_attach+0x22/0x30
    bus_add_driver+0x171/0x2b0
    ? unaccepted_memory_init_kdump+0x20/0x20
    driver_register+0x67/0x100
    __pci_register_driver+0x83/0x90
    sp_pci_init+0x22/0x30
    sp_mod_init+0x13/0x30
    do_one_initcall+0xb8/0x290
    ? sched_clock_noinstr+0xd/0x10
    ? local_clock_noinstr+0x3e/0x100
    ? stack_depot_save_flags+0x21e/0x6a0
    ? local_clock+0x1c/0x60
    ? stack_depot_save_flags+0x21e/0x6a0
    ? sched_clock_noinstr+0xd/0x10
    ? local_clock_noinstr+0x3e/0x100
    ? __lock_acquire+0xd90/0xe30
    ? sched_clock_noinstr+0xd/0x10
    ? local_clock_noinstr+0x3e/0x100
    ? __create_object+0x66/0x100
    ? local_clock+0x1c/0x60
    ? __create_object+0x66/0x100
    ? parameq+0x1b/0x90
    ? parse_one+0x6d/0x1d0
    ? parse_args+0xd7/0x1f0
    ? do_initcall_level+0x180/0x180
    do_initcall_level+0xb0/0x180
    do_initcalls+0x60/0xa0
    ? kernel_init+0x1f/0x1d0
    do_basic_setup+0x41/0x50
    kernel_init_freeable+0x1ac/0x230
    ? rest_init+0x1f0/0x1f0
    kernel_init+0x1f/0x1d0
    ? rest_init+0x1f0/0x1f0
    ret_from_fork+0x3d/0x50
    ? rest_init+0x1f0/0x1f0
    ret_from_fork_asm+0x11/0x20
    </TASK>
   Modules linked in:
   CR2: 00000000000000f0
   ---[ end trace 0000000000000000 ]---
   RIP: 0010:__sev_snp_shutdown_locked+0x2e/0x150
   Code: 00 55 48 89 e5 41 57 41 56 41 54 53 48 83 ec 10 41 89 f7 49 89 fe 65 48 8b 04 25 28 00 00 00 48 89 45 d8 48 8b 05 6a 5a 7f 06 <4c> 8b a0 f0 00 00 00 41 0f b6 9c 24 a2 00 00 00 48 83 fb 02 0f 83
   RSP: 0018:ffffb2ea4014b7b8 EFLAGS: 00010286
   RAX: 0000000000000000 RBX: ffff9e4acd2e0a28 RCX: 0000000000000000
   RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffb2ea4014b808
   RBP: ffffb2ea4014b7e8 R08: 0000000000000106 R09: 000000000003d9c0
   R10: 0000000000000001 R11: ffffffffa39ff070 R12: ffff9e49d40590c8
   R13: 0000000000000000 R14: ffffb2ea4014b808 R15: 0000000000000000
   FS:  0000000000000000(0000) GS:ffff9e58b1e00000(0000) knlGS:0000000000000000
   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
   CR2: 00000000000000f0 CR3: 0000000418a3e001 CR4: 0000000000770ef0
   PKRU: 55555554
   Kernel panic - not syncing: Fatal exception
   Kernel Offset: 0x1fc00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)

Fixes: 1ca5614b84 ("crypto: ccp: Add support to initialize the AMD-SP for SEV-SNP")
Cc: stable@vger.kernel.org
Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: John Allen <john.allen@amd.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-06-16 13:41:53 +08:00
Borislav Petkov (AMD)
0ecaefb303 x86/CPU/AMD: Track SNP host status with cc_platform_*()
The host SNP worthiness can determined later, after alternatives have
been patched, in snp_rmptable_init() depending on cmdline options like
iommu=pt which is incompatible with SNP, for example.

Which means that one cannot use X86_FEATURE_SEV_SNP and will need to
have a special flag for that control.

Use that newly added CC_ATTR_HOST_SEV_SNP in the appropriate places.

Move kdump_sev_callback() to its rightful place, while at it.

Fixes: 216d106c7f ("x86/sev: Add SEV-SNP host initialization support")
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Tested-by: Srikanth Aithal <sraithal@amd.com>
Link: https://lore.kernel.org/r/20240327154317.29909-6-bp@alien8.de
2024-04-04 10:40:30 +02:00
Linus Torvalds
38b334fc76 Merge tag 'x86_sev_for_v6.9_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 SEV updates from Borislav Petkov:

 - Add the x86 part of the SEV-SNP host support.

   This will allow the kernel to be used as a KVM hypervisor capable of
   running SNP (Secure Nested Paging) guests. Roughly speaking, SEV-SNP
   is the ultimate goal of the AMD confidential computing side,
   providing the most comprehensive confidential computing environment
   up to date.

   This is the x86 part and there is a KVM part which did not get ready
   in time for the merge window so latter will be forthcoming in the
   next cycle.

 - Rework the early code's position-dependent SEV variable references in
   order to allow building the kernel with clang and -fPIE/-fPIC and
   -mcmodel=kernel

 - The usual set of fixes, cleanups and improvements all over the place

* tag 'x86_sev_for_v6.9_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
  x86/sev: Disable KMSAN for memory encryption TUs
  x86/sev: Dump SEV_STATUS
  crypto: ccp - Have it depend on AMD_IOMMU
  iommu/amd: Fix failure return from snp_lookup_rmpentry()
  x86/sev: Fix position dependent variable references in startup code
  crypto: ccp: Make snp_range_list static
  x86/Kconfig: Remove CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT
  Documentation: virt: Fix up pre-formatted text block for SEV ioctls
  crypto: ccp: Add the SNP_SET_CONFIG command
  crypto: ccp: Add the SNP_COMMIT command
  crypto: ccp: Add the SNP_PLATFORM_STATUS command
  x86/cpufeatures: Enable/unmask SEV-SNP CPU feature
  KVM: SEV: Make AVIC backing, VMSA and VMCB memory allocation SNP safe
  crypto: ccp: Add panic notifier for SEV/SNP firmware shutdown on kdump
  iommu/amd: Clean up RMP entries for IOMMU pages during SNP shutdown
  crypto: ccp: Handle legacy SEV commands when SNP is enabled
  crypto: ccp: Handle non-volatile INIT_EX data when SNP is enabled
  crypto: ccp: Handle the legacy TMR allocation when SNP is enabled
  x86/sev: Introduce an SNP leaked pages list
  crypto: ccp: Provide an API to issue SEV and SNP commands
  ...
2024-03-11 17:44:11 -07:00
Borislav Petkov (AMD)
f9e6f00d93 crypto: ccp: Make snp_range_list static
Fix:

  drivers/crypto/ccp/sev-dev.c:93:28: sparse: sparse: symbol 'snp_range_list' was not declared. Should it be static?

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202402031410.GTE3PJ1Y-lkp@intel.com
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/202402031410.GTE3PJ1Y-lkp@intel.com
2024-02-03 11:41:41 +01:00