Commit d65e1a0f authored by Linus Torvalds
Pull s390 updates from Alexander Gordeev:

 - Store AP Query Configuration Information in a static buffer

 - Rework the AP initialization and add missing cleanups to the error
   path

 - Swap IRQ and AP bus/device registration to avoid race conditions

 - Export prot_virt_guest symbol

 - Introduce AP configuration changes notifier interface to facilitate
   modularization of the AP bus

 - Add CONFIG_AP kernel configuration option to allow modularization of
   the AP bus

 - Rework CONFIG_ZCRYPT_DEBUG kernel configuration option description
   and dependency and rename it to CONFIG_AP_DEBUG

 - Convert sprintf() and snprintf() to sysfs_emit() in CIO code

 - Adjust indentation of RELOCS command build step

 - Make crypto performance counters upward compatible

 - Convert make_page_secure() and gmap_make_secure() to use folio

 - Rework channel-utilization-block (CUB) handling in preparation for
   introducing additional CUBs

 - Use attribute groups to simplify registration, removal and extension
   of measurement-related channel-path sysfs attributes

 - Add a per-channel-path binary "ext_measurement" sysfs attribute that
   provides access to extended channel-path measurement data

 - Export measurement data for all channel-measurement-groups (CMG), not
   only for specific ones. This enables support of new CMG data
   formats in userspace without the need for kernel changes

 - Add a per-channel-path sysfs attribute "speed_bps" that provides the
   operating speed in bits per second or 0 if the operating speed is not
   available

 - The CIO tracepoint subchannel-type field "st" is incorrectly set to
   the value of subchannel-enabled SCHIB "ena" field. Fix that

 - Do not forcefully limit vmemmap starting address to MAX_PHYSMEM_BITS

 - Consider the maximum physical address available to a DCSS segment
   (512GB) when memory layout is set up

 - Simplify the virtual memory layout setup by reducing the size of
   identity mapping vs vmemmap overlap

 - Swap vmalloc and Lowcore/Real Memory Copy areas in virtual memory.
   This allows the kernel image to be placed next to kernel modules

 - Move everything KASLR related from <asm/setup.h> to <asm/page.h>

 - Put virtual memory layout information into a structure to improve
   code generation

 - Currently __kaslr_offset is the kernel offset in both physical and
   virtual memory spaces. Uncouple these offsets to allow uncoupling of
   the address spaces

 - Currently the identity mapping base address is implicit and is always
   set to zero. Make it explicit by putting it into the __identity_base
   persistent boot variable and use it in the proper context

 - Introduce .amode31 section start and end macros AMODE31_START and
   AMODE31_END

 - Introduce OS_INFO entries that do not reference any data in memory,
   but rather provide only values

 - Store virtual memory layout in OS_INFO. It is read out by
   makedumpfile, crash and other tools

 - Store virtual memory layout in VMCORE_INFO. It is read out by crash
   and other tools when /proc/kcore device is used

 - Create additional PT_LOAD ELF program header that covers kernel image
   only, so that vmcore tools could locate kernel text and data when
   virtual and physical memory spaces are uncoupled

 - Uncouple physical and virtual address spaces

 - Map kernel at fixed location when KASLR mode is disabled. The
   location is defined by CONFIG_KERNEL_IMAGE_BASE kernel configuration
   value.

 - Rework deployment of kernel image for both compressed and
   uncompressed variants as defined by CONFIG_KERNEL_UNCOMPRESSED kernel
   configuration value

 - Move .vmlinux.relocs section in front of the compressed kernel. The
   interim section rescue step is avoided as a result

 - Correct modules thunk offset calculation when branch target is more
   than 2GB away

 - Kernel modules contain their own set of expoline thunks. Now that the
   kernel modules area is less than 4GB away from kernel expoline
   thunks, make modules use kernel expolines. Also make EXPOLINE_EXTERN
   the default if the compiler supports it

 - userfaultfd can insert shared zeropages into processes running VMs,
   but that is not allowed for s390. Fall back to allocating a fresh
   zeroed anonymous folio and inserting that instead

 - Re-enable shared zeropages for non-PV and non-skeys KVM guests

 - Rename hex2bitmap() to ap_hex2bitmap() and export it for external use

 - Add ap_config sysfs attribute to provide the means for setting or
   displaying adapters, domains and control domains assigned to a
   vfio-ap mediated device in a single operation

 - Make vfio_ap_mdev_link_queue() ignore duplicate link requests

 - Add write support to ap_config sysfs attribute to allow atomic update
   of a vfio-ap mediated device state

 - Document ap_config sysfs attribute

 - Function os_info_old_init() is expected to be called only from a
   regular kdump kernel. Enable it to be called from a stand-alone dump
   kernel

 - Address gcc -Warray-bounds warning and fix array size in struct
   os_info

 - s390 does not support SMBIOS, so drop unneeded CONFIG_DMI checks

 - Use unwinder instead of __builtin_return_address() with ftrace to
   prevent returning of undefined values

 - Sections .hash and .gnu.hash are only created when the CONFIG_PIE_BUILD
   kernel option is enabled. Drop these for the case CONFIG_PIE_BUILD is
   disabled

 - Compile kernel with -fPIC and link with -no-pie to allow the kpatch
   feature to always succeed and drop the whole CONFIG_PIE_BUILD
   option-enabled code

 - Add missing virt_to_phys() converter for VSIE facility and crypto
   control blocks

* tag 's390-6.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (54 commits)
  Revert "s390: Relocate vmlinux ELF data to virtual address space"
  KVM: s390: vsie: Use virt_to_phys for crypto control block
  s390: Relocate vmlinux ELF data to virtual address space
  s390: Compile kernel with -fPIC and link with -no-pie
  s390: vmlinux.lds.S: Drop .hash and .gnu.hash for !CONFIG_PIE_BUILD
  s390/ftrace: Use unwinder instead of __builtin_return_address()
  s390/pci: Drop unneeded reference to CONFIG_DMI
  s390/os_info: Fix array size in struct os_info
  s390/os_info: Initialize old os_info in standalone dump kernel
  docs: Update s390 vfio-ap doc for ap_config sysfs attribute
  s390/vfio-ap: Add write support to sysfs attr ap_config
  s390/vfio-ap: Ignore duplicate link requests in vfio_ap_mdev_link_queue
  s390/vfio-ap: Add sysfs attr, ap_config, to export mdev state
  s390/ap: Externalize AP bus specific bitmap reading function
  s390/mm: Re-enable the shared zeropage for !PV and !skeys KVM guests
  mm/userfaultfd: Do not place zeropages when zeropages are disallowed
  s390/expoline: Make modules use kernel expolines
  s390/nospec: Correct modules thunk offset calculation
  s390/boot: Do not rescue .vmlinux.relocs section
  s390/boot: Rework deployment of the kernel image
  ...
parents a38297e3 1812dc9c
+3 −1
@@ -4785,7 +4785,9 @@

	prot_virt=	[S390] enable hosting protected virtual machines
			isolated from the hypervisor (if hardware supports
			that).
			that). If enabled, the default kernel base address
			might be overridden even when Kernel Address Space
			Layout Randomization is disabled.
			Format: <bool>

	psi=		[KNL] Enable or disable pressure stall information
+1 −0
@@ -8,6 +8,7 @@ s390 Architecture
    cds
    3270
    driver-model
    mm
    monreader
    qeth
    s390dbf
+111 −0
.. SPDX-License-Identifier: GPL-2.0

=================
Memory Management
=================

Virtual memory layout
=====================

.. note::

 - Some aspects of the virtual memory layout setup are not
   clarified (number of page levels, alignment, DMA memory).

 - Unused gaps in the virtual memory layout could be present
   or not - depending on how a particular system is configured.
   No page tables are created for the unused gaps.

 - The virtual memory regions are tracked or untracked by KASAN
   instrumentation, and the KASAN shadow memory itself is created,
   only when the CONFIG_KASAN configuration option is enabled.

::

  =============================================================================
  |    Physical      |	  Virtual	| VM area description
  =============================================================================
  +- 0 --------------+- 0 --------------+
  |		     | S390_lowcore	| Low-address memory
  |		     +- 8 KB -----------+
  |		     |			|
  |		     |			|
  |		     | ... unused gap	| KASAN untracked
  |		     |			|
  +- AMODE31_START --+- AMODE31_START --+ .amode31 rand. phys/virt start
  |.amode31 text/data|.amode31 text/data| KASAN untracked
  +- AMODE31_END ----+- AMODE31_END ----+ .amode31 rand. phys/virt end (<2GB)
  |		     |			|
  |		     |			|
  +- __kaslr_offset_phys		| kernel rand. phys start
  |		     |			|
  | kernel text/data |			|
  |		     |			|
  +------------------+			| kernel phys end
  |		     |			|
  |		     |			|
  |		     |			|
  |		     |			|
  +- ident_map_size -+			|
		     |			|
		     |	... unused gap	| KASAN untracked
		     |			|
		     +- __identity_base + identity mapping start (>= 2GB)
		     |			|
		     | identity		| phys == virt - __identity_base
		     | mapping		| virt == phys + __identity_base
		     |			|
		     |			| KASAN tracked
		     |			|
		     |			|
		     |			|
		     |			|
		     |			|
		     |			|
		     |			|
		     |			|
		     |			|
		     |			|
		     |			|
		     |			|
		     |			|
		     |			|
		     |			|
		     +---- vmemmap -----+ 'struct page' array start
		     |			|
		     | virtually mapped |
		     | memory map	| KASAN untracked
		     |			|
		     +- __abs_lowcore --+
		     |			|
		     | Absolute Lowcore | KASAN untracked
		     |			|
		     +- __memcpy_real_area
		     |			|
		     |	Real Memory Copy| KASAN untracked
		     |			|
		     +- VMALLOC_START --+ vmalloc area start
		     |			| KASAN untracked or
		     |	vmalloc area	| KASAN shallowly populated in case
		     |			|	CONFIG_KASAN_VMALLOC=y
		     +- MODULES_VADDR --+ modules area start
		     |			| KASAN allocated per module or
		     |	modules area	| KASAN shallowly populated in case
		     |			|	CONFIG_KASAN_VMALLOC=y
		     +- __kaslr_offset -+ kernel rand. virt start
		     |			| KASAN tracked
		     | kernel text/data | phys == (kvirt - __kaslr_offset) +
		     |			|	  __kaslr_offset_phys
		     +- kernel .bss end + kernel rand. virt end
		     |			|
		     |	... unused gap	| KASAN untracked
		     |			|
		     +------------------+ UltraVisor Secure Storage limit
		     |			|
		     |	... unused gap	| KASAN untracked
		     |			|
		     +KASAN_SHADOW_START+ KASAN shadow memory start
		     |			|
		     |	 KASAN shadow	| KASAN untracked
		     |			|
		     +------------------+ ASCE limit
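The two translation rules annotated in the diagram above can be restated as plain arithmetic. The following is only an illustrative sketch, not kernel code: the function names are hypothetical, and the base and offset values used in the example are assumptions chosen to satisfy the constraints in the diagram (an identity base at or above 2GB, and a kernel virtual base equal to the non-KASAN CONFIG_KERNEL_IMAGE_BASE default).

```python
# Illustrative arithmetic only; names and example values are hypothetical.

def ident_phys_to_virt(phys: int, identity_base: int) -> int:
    """Identity mapping: virt == phys + __identity_base."""
    return phys + identity_base


def ident_virt_to_phys(virt: int, identity_base: int) -> int:
    """Identity mapping: phys == virt - __identity_base."""
    return virt - identity_base


def kernel_virt_to_phys(kvirt: int, kaslr_offset: int,
                        kaslr_offset_phys: int) -> int:
    """Kernel image: phys == (kvirt - __kaslr_offset) + __kaslr_offset_phys."""
    return (kvirt - kaslr_offset) + kaslr_offset_phys


if __name__ == "__main__":
    identity_base = 0x80000000        # assumed: 2GB, the lowest value allowed
    kaslr_offset = 0x3FFE0000000      # assumed: non-KASAN KERNEL_IMAGE_BASE default
    kaslr_offset_phys = 0x1000000     # assumed: arbitrary physical load address

    print(hex(ident_phys_to_virt(0x1000, identity_base)))
    print(hex(kernel_virt_to_phys(kaslr_offset + 0x1234,
                                  kaslr_offset, kaslr_offset_phys)))
```

Because the kernel image has its own pair of offsets, an address inside the image is translated with the third function rather than with the identity-mapping rule; this is the practical consequence of uncoupling the physical and virtual address spaces.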
+31 −1
@@ -380,6 +380,36 @@ matrix device.
    control_domains:
      A read-only file for displaying the control domain numbers assigned to the
      vfio_ap mediated device.
    ap_config:
      A read/write file that, when written to, allows all three of the
      vfio_ap mediated device's ap matrix masks to be replaced in one shot.
      Three masks are given, one for adapters, one for domains, and one for
      control domains. If the given state cannot be set then no changes are
      made to the vfio-ap mediated device.

      The format of the data written to ap_config is as follows:
      {amask},{dmask},{cmask}\n

      \n is a newline character.

      amask, dmask, and cmask are masks identifying which adapters, domains,
      and control domains should be assigned to the mediated device.

      The format of a mask is as follows:
      0xNN..NN

      Where NN..NN is 64 hexadecimal characters representing a 256-bit value.
      The leftmost (highest order) bit represents adapter/domain 0.

      For an example set of masks that represent your mdev's current
      configuration, simply cat ap_config.

      Setting an adapter or domain number greater than the maximum allowed for
      the system will result in an error.

      This attribute is intended to be used by automation. End users would be
      better served using the respective assign/unassign attributes for
      adapters, domains, and control domains.
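Since the attribute is aimed at automation, the mask format described above can be produced programmatically. The following is a minimal sketch under the stated format only (three 256-bit masks, 64 hexadecimal characters each, leftmost bit representing number 0, comma separated, newline terminated). The helper names are hypothetical, and the location of the mediated device's ap_config file is deliberately left out because it depends on the system.

```python
# Minimal sketch; helper names are hypothetical and not part of any kernel API.

MASK_BITS = 256  # each mask is a 256-bit value


def ap_mask(numbers):
    """Return a mask string of the form 0xNN..NN (64 hex characters).

    The leftmost (highest order) bit represents adapter/domain 0, so
    number n sets bit (MASK_BITS - 1 - n) of the 256-bit value.
    """
    value = 0
    for n in numbers:
        if not 0 <= n < MASK_BITS:
            raise ValueError(f"number {n} does not fit in a {MASK_BITS}-bit mask")
        value |= 1 << (MASK_BITS - 1 - n)
    return "0x" + format(value, "064x")


def ap_config_line(adapters, domains, control_domains):
    """Build the '{amask},{dmask},{cmask}\\n' string to write to ap_config."""
    masks = (ap_mask(adapters), ap_mask(domains), ap_mask(control_domains))
    return ",".join(masks) + "\n"


if __name__ == "__main__":
    # Example: adapters 0 and 4, domain 255, no control domains.
    print(ap_config_line([0, 4], [255], []), end="")
```

A script would write the resulting line to the mediated device's ap_config file in a single write. The range check here only guards the 256-bit width; numbers above the maximum allowed for the system are still rejected by the kernel, as noted above.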

* functions:

@@ -550,7 +580,7 @@ These are the steps:
   following Kconfig elements selected:
   * IOMMU_SUPPORT
   * S390
   * ZCRYPT
   * AP
   * VFIO
   * KVM

+51 −14
@@ -17,6 +17,9 @@ config ARCH_HAS_ILOG2_U32
config ARCH_HAS_ILOG2_U64
	def_bool n

config ARCH_PROC_KCORE_TEXT
	def_bool y

config GENERIC_HWEIGHT
	def_bool y

@@ -552,7 +555,7 @@ config EXPOLINE
	  If unsure, say N.

config EXPOLINE_EXTERN
	def_bool n
	def_bool y if EXPOLINE
	depends on EXPOLINE
	depends on CC_IS_GCC && GCC_VERSION >= 110200
	depends on $(success,$(srctree)/arch/s390/tools/gcc-thunk-extern.sh $(CC))
@@ -590,18 +593,6 @@ config RELOCATABLE
	  Note: this option exists only for documentation purposes, please do
	  not remove it.

config PIE_BUILD
	def_bool CC_IS_CLANG && !$(cc-option,-munaligned-symbols)
	help
	  If the compiler is unable to generate code that can manage unaligned
	  symbols, the kernel is linked as a position-independent executable
	  (PIE) and includes dynamic relocations that are processed early
	  during bootup.

	  For kpatch functionality, it is recommended to build the kernel
	  without the PIE_BUILD option. PIE_BUILD is only enabled when the
	  compiler lacks proper support for handling unaligned symbols.

config RANDOMIZE_BASE
	bool "Randomize the address of the kernel image (KASLR)"
	default y
@@ -611,6 +602,25 @@ config RANDOMIZE_BASE
	  as a security feature that deters exploit attempts relying on
	  knowledge of the location of kernel internals.

config KERNEL_IMAGE_BASE
	hex "Kernel image base address"
	range 0x100000 0x1FFFFFE0000000 if !KASAN
	range 0x100000 0x1BFFFFE0000000 if KASAN
	default 0x3FFE0000000 if !KASAN
	default 0x7FFFE0000000 if KASAN
	help
	  This is the address at which the kernel image is loaded in case
	  Kernel Address Space Layout Randomization (KASLR) is disabled.

	  In case the Protected virtualization guest support is enabled the
	  Ultravisor imposes a virtual address limit. If the value of this
	  option leads to the kernel image exceeding the Ultravisor limit,
	  this option is ignored and the image is loaded below the limit.

	  If the value of this option leads to the kernel image overlapping
	  the virtual memory where other data structures are located, this
	  option is ignored and the image is loaded above the structures.

endmenu

menu "Memory setup"
@@ -724,6 +734,33 @@ config EADM_SCH
	  To compile this driver as a module, choose M here: the
	  module will be called eadm_sch.

config AP
	def_tristate y
	prompt "Support for Adjunct Processors (ap)"
	help
	  This driver allows access to Adjunct Processor (AP) devices via
	  the ap bus, cards and queues. Supported Adjunct Processors are
	  the CryptoExpress Cards (CEX).

	  To compile this driver as a module, choose M here: the
	  module will be called ap.

	  If unsure, say Y (default).

config AP_DEBUG
	def_bool n
	prompt "Enable debug features for Adjunct Processor (ap) devices"
	depends on AP
	help
	  Say 'Y' here to enable some additional debug features for Adjunct
	  Processor (ap) devices.

	  There will be some more sysfs attributes displayed for ap queues.

	  Do not enable on production level kernel build.

	  If unsure, say N.

config VFIO_CCW
	def_tristate n
	prompt "Support for VFIO-CCW subchannels"
@@ -740,7 +777,7 @@ config VFIO_AP
	prompt "VFIO support for AP devices"
	depends on KVM
	depends on VFIO
	depends on ZCRYPT
	depends on AP
	select VFIO_MDEV
	help
	  This driver grants access to Adjunct Processor (AP) devices