Documentation/driver-api/cxl/allocation/dax.rst 0 → 100644 +60 −0

.. SPDX-License-Identifier: GPL-2.0

===========
DAX Devices
===========
CXL capacity exposed as a DAX device can be accessed directly via mmap.

Users may wish to use this interface mechanism to write their own userland
CXL allocator, or to manage shared or persistent memory regions across multiple
hosts.

If the capacity is shared across hosts or persistent, appropriate flushing
mechanisms must be employed unless the region supports Snoop Back-Invalidate.

Note that mappings must be aligned (size and base) to the dax device's base
alignment, which is typically 2MB - but may be configured larger.

::

  #include <stdio.h>
  #include <stdlib.h>
  #include <stdint.h>
  #include <inttypes.h>
  #include <sys/mman.h>
  #include <fcntl.h>
  #include <unistd.h>

  #define DEVICE_PATH "/dev/dax0.0" /* Replace with your DAX device path */
  #define DEVICE_SIZE (4ULL * 1024 * 1024 * 1024) /* 4GB, a multiple of 2MB */

  int main(void)
  {
      int fd;
      void *mapped_addr;

      /* Open the DAX device */
      fd = open(DEVICE_PATH, O_RDWR);
      if (fd < 0) {
          perror("open");
          return EXIT_FAILURE;
      }

      /* Map the device into memory */
      mapped_addr = mmap(NULL, DEVICE_SIZE, PROT_READ | PROT_WRITE,
                         MAP_SHARED, fd, 0);
      if (mapped_addr == MAP_FAILED) {
          perror("mmap");
          close(fd);
          return EXIT_FAILURE;
      }

      printf("Mapped address: %p\n", mapped_addr);

      /* You can now access the device through the mapped address */
      uint64_t *ptr = (uint64_t *)mapped_addr;
      *ptr = 0x1234567890abcdefULL; /* Write a value to the device */
      printf("Value at address %p: 0x%016" PRIx64 "\n", (void *)ptr, *ptr);

      /* Clean up */
      munmap(mapped_addr, DEVICE_SIZE);
      close(fd);
      return EXIT_SUCCESS;
  }
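The alignment note above can be turned into a few helpers that a userland allocator might call before issuing ``mmap``. This is a minimal sketch: the helper names and the ``DAX_ALIGN_2M`` constant are hypothetical, and the 2MB value is only the typical default mentioned in the text, so a real program should use the alignment actually configured for its device.

```c
#include <stdint.h>

/* Hypothetical default; the real alignment is a per-device property. */
#define DAX_ALIGN_2M (2ULL * 1024 * 1024)

/* Round len up to the next multiple of align (align must be a power of two). */
static inline uint64_t dax_align_up(uint64_t len, uint64_t align)
{
    return (len + align - 1) & ~(align - 1);
}

/* Round off down to the previous multiple of align (power of two). */
static inline uint64_t dax_align_down(uint64_t off, uint64_t align)
{
    return off & ~(align - 1);
}

/* Return 1 if both the offset and the length are multiples of align. */
static inline int dax_range_is_aligned(uint64_t off, uint64_t len,
                                       uint64_t align)
{
    return ((off | len) & (align - 1)) == 0;
}
```

A caller would typically round the requested length up with ``dax_align_up`` and check the final offset and length with ``dax_range_is_aligned`` before mapping.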
Documentation/driver-api/cxl/allocation/hugepages.rst 0 → 100644 +32 −0

.. SPDX-License-Identifier: GPL-2.0

==========
Huge Pages
==========

Contiguous Memory Allocator
===========================
CXL Memory onlined as SystemRAM during early boot is eligible for use by CMA,
as the NUMA node hosting that capacity will be `Online` at the time CMA
carves out contiguous capacity.

CXL Memory deferred to the CXL Driver for configuration cannot have its
capacity allocated by CMA - as the NUMA node hosting the capacity is `Offline`
at :code:`__init` time - when CMA carves out contiguous capacity.

HugeTLB
=======
Different huge page sizes allow different memory configurations.

2MB Huge Pages
--------------
All CXL capacity regardless of configuration time or memory zone is eligible
for use as 2MB huge pages.

1GB Huge Pages
--------------
CXL capacity onlined in :code:`ZONE_NORMAL` is eligible for 1GB Gigantic Page
allocation.

CXL capacity onlined in :code:`ZONE_MOVABLE` is not eligible for 1GB Gigantic
Page allocation.
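To check how many 2MB or 1GB huge pages are actually available on the NUMA node hosting CXL capacity, one option is to read the per-node hugetlb counters in sysfs. The sketch below assumes the usual layout under ``/sys/devices/system/node``; the function names are hypothetical, and the node number of the CXL capacity has to be supplied by the caller.

```c
#include <stdio.h>

/*
 * Build the per-node hugetlb sysfs path for a given page size (in kB) and
 * attribute, e.g. "nr_hugepages" or "free_hugepages".
 * Returns 0 on success, -1 if the buffer is too small.
 */
int hugepage_sysfs_path(char *buf, size_t len, int node,
                        unsigned long size_kb, const char *attr)
{
    int n = snprintf(buf, len,
                     "/sys/devices/system/node/node%d/hugepages/"
                     "hugepages-%lukB/%s", node, size_kb, attr);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Read one counter for the given node and page size.
 * Returns the value, or -1 if the file is missing or unreadable
 * (for example when that page size is not supported on the system).
 */
long read_hugepage_count(int node, unsigned long size_kb, const char *attr)
{
    char path[128];
    long val = -1;
    FILE *f;

    if (hugepage_sysfs_path(path, sizeof(path), node, size_kb, attr))
        return -1;

    f = fopen(path, "r");
    if (!f)
        return -1;
    if (fscanf(f, "%ld", &val) != 1)
        val = -1;
    fclose(f);
    return val;
}
```

For instance, ``read_hugepage_count(1, 1048576, "free_hugepages")`` would report the free 1GB pages on node 1, if node 1 is where the CXL capacity was onlined.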
Documentation/driver-api/cxl/allocation/page-allocator.rst 0 → 100644 +85 −0

.. SPDX-License-Identifier: GPL-2.0

==================
The Page Allocator
==================

The kernel page allocator services all general page allocation requests, such
as :code:`kmalloc`. CXL configuration steps affect the behavior of the page
allocator based on the selected `Memory Zone` and `NUMA node` the capacity is
placed in.

This section mostly focuses on how these configurations affect the page
allocator (as of Linux v6.15) rather than the overall page allocator behavior.

NUMA nodes and mempolicy
========================
Unless a task explicitly registers a mempolicy, the default memory policy
of the Linux kernel is to allocate memory from the `local NUMA node` first,
and fall back to other nodes only if the local node is pressured.

Generally, we expect to see local DRAM and CXL memory on separate NUMA nodes,
with the CXL memory being non-local. Technically, however, it is possible
for a compute node to have no local DRAM, and for CXL memory to be the
`local` capacity for that compute node.

Memory Zones
============
CXL capacity may be onlined in :code:`ZONE_NORMAL` or :code:`ZONE_MOVABLE`.

As of v6.15, the page allocator attempts to allocate from the highest
available and compatible ZONE for an allocation from the local node first.

An example of a `zone incompatibility` is attempting to service an allocation
marked :code:`GFP_KERNEL` from :code:`ZONE_MOVABLE`. Kernel allocations are
typically not migratable, and as a result can only be serviced from
:code:`ZONE_NORMAL` or lower.

Put simply, for allocations compatible with both zones, the page allocator
will prefer :code:`ZONE_MOVABLE` over :code:`ZONE_NORMAL` by default, but if
:code:`ZONE_MOVABLE` is depleted, it will fall back to allocating from
:code:`ZONE_NORMAL`.

Zone and Node Quirks
====================
Let's consider a configuration where the local DRAM capacity is largely onlined
into :code:`ZONE_NORMAL`, with no :code:`ZONE_MOVABLE` capacity present. The
CXL capacity has the opposite configuration - all onlined in
:code:`ZONE_MOVABLE`.

Under the default allocation policy, the page allocator will completely skip
:code:`ZONE_MOVABLE` as a valid allocation target. This is because, as of
Linux v6.15, the page allocator does (approximately) the following: ::

  for (each zone in local_node):

    for (each node in fallback_order):

      attempt_allocation(gfp_flags);

Because the local node does not have :code:`ZONE_MOVABLE`, the CXL node is
functionally unreachable for direct allocation. As a result, the only way
for CXL capacity to be used is via `demotion` in the reclaim path.

This iteration order also means that if the DRAM node has :code:`ZONE_MOVABLE`
capacity, then when that capacity is depleted the page allocator will actually
prefer CXL :code:`ZONE_MOVABLE` pages over DRAM :code:`ZONE_NORMAL` pages.

We may wish to invert this priority in future Linux versions.

If `demotion` and `swap` are disabled, Linux will begin to hit out-of-memory
(OOM) conditions when the DRAM nodes are depleted. See the reclaim section for
more details.

CGroups and CPUSets
===================
Finally, assuming CXL memory is reachable via the page allocator (i.e. onlined
in :code:`ZONE_NORMAL`), the :code:`cpusets.mems_allowed` may be used by
containers to limit the accessibility of certain NUMA nodes for tasks in that
container. Users may wish to utilize this in multi-tenant systems where some
tasks prefer not to use slower memory.

In the reclaim section we'll discuss some limitations of this interface to
prevent demotions of shared data to CXL memory (if demotions are enabled).
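The zone and node quirk described above can be reproduced with a small userspace model of the pseudocode loop. This is a toy sketch, not the kernel's implementation: every name (``toy_node``, ``toy_alloc_page``, the ``TOY_ZONE_*`` values) is hypothetical, only two zones are modeled, and watermarks, reclaim and mempolicy are ignored.

```c
#include <stddef.h>

/* Only the two zones discussed in the text, ordered lowest to highest. */
enum toy_zone { TOY_ZONE_NORMAL, TOY_ZONE_MOVABLE, TOY_NR_ZONES };

struct toy_node {
    int present[TOY_NR_ZONES];     /* non-zero if the node has this zone */
    long free_pages[TOY_NR_ZONES]; /* free pages left in each zone */
};

/*
 * Model of the loop in the text: walk the zones that exist on the local
 * node, from the highest zone the request may use down to the lowest, and
 * for each zone try every node in fallback order.
 *
 * Returns the node that satisfied the allocation, or -1 if none could.
 * On success *zone_out (if non-NULL) receives the zone that was used.
 */
int toy_alloc_page(struct toy_node *nodes, const int *fallback,
                   size_t nr_fallback, int local,
                   enum toy_zone highest_allowed, enum toy_zone *zone_out)
{
    for (int z = highest_allowed; z >= TOY_ZONE_NORMAL; z--) {
        /* A zone missing from the local node is never considered. */
        if (!nodes[local].present[z])
            continue;

        for (size_t i = 0; i < nr_fallback; i++) {
            struct toy_node *n = &nodes[fallback[i]];

            if (n->present[z] && n->free_pages[z] > 0) {
                n->free_pages[z]--;
                if (zone_out)
                    *zone_out = (enum toy_zone)z;
                return fallback[i];
            }
        }
    }
    return -1;
}
```

With node 0 modeled as DRAM holding only ``TOY_ZONE_NORMAL`` and node 1 as CXL holding only ``TOY_ZONE_MOVABLE``, the function never returns node 1, which mirrors the "functionally unreachable" case. Giving node 0 a small ``TOY_ZONE_MOVABLE`` makes node 1 reachable, and it is then chosen ahead of node 0's ``TOY_ZONE_NORMAL`` once the local movable pages run out.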
Documentation/driver-api/cxl/allocation/reclaim.rst 0 → 100644 +51 −0

.. SPDX-License-Identifier: GPL-2.0

=======
Reclaim
=======
Another way CXL memory can be utilized *indirectly* is via the reclaim system
in :code:`mm/vmscan.c`. Reclaim is engaged when memory capacity on the system
becomes pressured based on global and cgroup-local `watermark` settings.

In this section we won't discuss the `watermark` configurations, just how CXL
memory can be consumed by various pieces of the reclaim system.

Demotion
========
By default, the reclaim system will prefer swap (or zswap) when reclaiming
memory. Enabling :code:`kernel/mm/numa/demotion_enabled` will cause vmscan
to opportunistically prefer distant NUMA nodes to swap or zswap, if capacity
is available.

Demotion engages the :code:`mm/memory_tier.c` component to determine the
next demotion node. The next demotion node is based on the :code:`HMAT`
or :code:`CDAT` performance data.

cpusets.mems_allowed quirk
--------------------------
In Linux v6.15 and below, demotion does not respect :code:`cpusets.mems_allowed`
when migrating pages. As a result, if demotion is enabled, vmscan cannot
guarantee isolation of a container's memory from nodes not set in mems_allowed.

In Linux v6.XX and up, demotion does attempt to respect
:code:`cpusets.mems_allowed`; however, certain classes of shared memory
originally instantiated by another cgroup (such as common libraries - e.g.
libc) may still be demoted. As a result, the mems_allowed interface still
cannot provide perfect isolation from the remote nodes.

ZSwap and Node Preference
=========================
In Linux v6.15 and below, ZSwap allocates memory from the local node of the
processor for the new pages being compressed. Since pages being compressed
are typically cold, the result is that a cold page is effectively promoted -
only to be later demoted as it ages off the LRU.

In Linux v6.XX, ZSwap tries to prefer the node of the page being compressed
as the allocation target for the compression page. This helps prevent
thrashing.

Demotion with ZSwap
===================
When enabling both Demotion and ZSwap, you create a situation where ZSwap
will prefer the slowest form of CXL memory by default until that tier of
memory is exhausted.
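Since most of the behavior in this section depends on whether demotion is enabled, a program may want to check the knob before reasoning about where reclaimed pages go. The sketch below is written under two assumptions: the knob named in the Demotion section lives at ``/sys/kernel/mm/numa/demotion_enabled``, and it reads back as ``true`` or ``false`` (``1`` and ``0`` are accepted defensively). The function names are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Assumed full sysfs location of the demotion knob named in the text. */
#define DEMOTION_KNOB "/sys/kernel/mm/numa/demotion_enabled"

/*
 * Interpret the knob's text content.
 * Returns 1 (enabled), 0 (disabled) or -1 (unrecognized content).
 */
int parse_demotion_value(const char *s)
{
    if (!strncmp(s, "true", 4) || s[0] == '1')
        return 1;
    if (!strncmp(s, "false", 5) || s[0] == '0')
        return 0;
    return -1;
}

/*
 * Read and interpret the knob at the given path.
 * Returns 1, 0, or -1 if the file is missing, unreadable or unrecognized.
 */
int demotion_enabled(const char *path)
{
    char buf[16] = "";
    FILE *f = fopen(path, "r");

    if (!f)
        return -1;
    if (!fgets(buf, sizeof(buf), f))
        buf[0] = '\0';
    fclose(f);
    return parse_demotion_value(buf);
}
```

Calling ``demotion_enabled(DEMOTION_KNOB)`` returns -1 on kernels that do not expose the file, which a caller should treat as "unknown" rather than "disabled".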
Documentation/driver-api/cxl/devices/device-types.rst 0 → 100644 +165 −0

.. SPDX-License-Identifier: GPL-2.0

=====================
Devices and Protocols
=====================

The type of CXL device (Memory, Accelerator, etc.) dictates many configuration
steps. This section covers some basic background on device types and on-device
resources used by the platform and OS which impact configuration.

Protocols
=========

There are three core protocols to CXL. For the purpose of this documentation,
we will only discuss very high level definitions as the specific hardware
details are largely abstracted away from Linux. See the CXL specification
for more details.

CXL.io
------
The basic interaction protocol, similar to PCIe configuration mechanisms.
Typically used for initialization, configuration, and I/O access for anything
other than memory (CXL.mem) or cache (CXL.cache) operations.

The Linux CXL driver exposes access to .io functionality via the various sysfs
interfaces and /dev/cxl/ devices (which expose direct access to device
mailboxes).

CXL.cache
---------
The mechanism by which a device may coherently access and cache host memory.

Largely transparent to Linux once configured.

CXL.mem
-------
The mechanism by which the CPU may coherently access and cache device memory.

Largely transparent to Linux once configured.

Device Types
============

Type-1
------

A Type-1 CXL device:

* Supports cxl.io and cxl.cache protocols
* Implements a fully coherent cache
* Allows Device-to-Host coherence and Host-to-Device snoops.
* Does NOT have host-managed device memory (HDM)

A typical example of a type-1 device is a Smart NIC - which may want to
directly operate on host-memory (DMA) to store incoming packets. These
devices largely rely on CPU-attached memory.

Type-2
------

A Type-2 CXL Device:

* Supports cxl.io, cxl.cache, and cxl.mem protocols
* Optionally implements coherent cache and Host-Managed Device Memory
* Is typically an accelerator device w/ high bandwidth memory.

The primary difference between a type-1 and type-2 device is the presence
of host-managed device memory, which allows the device to operate on a
local memory bank - while the CPU still has coherent DMA to the same memory.

This allows things like GPUs to expose their memory via DAX devices or file
descriptors, allowing drivers and programs direct access to device memory
rather than using block-transfer semantics.

Type-3
------

A Type-3 CXL Device:

* Supports cxl.io and cxl.mem
* Implements Host-Managed Device Memory
* May provide either Volatile or Persistent memory capacity (or both).

A basic example of a type-3 device is a simple memory expander, whose
local memory capacity is exposed to the CPU for access directly via
basic coherent DMA.

Switch
------

A CXL switch is a device capable of routing any CXL (and by extension, PCIe)
protocol between upstream, downstream, or peer devices. Many devices, such
as Multi-Logical Devices, imply the presence of switching in some manner.

Logical Devices and Heads
-------------------------

A CXL device may present one or more "Logical Devices" to one or more hosts
(via physical "Heads").

A Single-Logical Device (SLD) is a device which presents a single device to
one or more heads.

A Multi-Logical Device (MLD) is a device which may present multiple devices
to one or more heads.

A Single-Headed Device exposes only a single physical connection.

A Multi-Headed Device exposes multiple physical connections.

MHSLD
~~~~~
A Multi-Headed Single-Logical Device (MHSLD) exposes a single logical
device to multiple heads which may be connected to one or more discrete
hosts. An example of this would be a simple memory-pool which may be
statically configured (prior to boot) to expose portions of its memory
to Linux via :doc:`CEDT <../platform/acpi/cedt>`.

MHMLD
~~~~~
A Multi-Headed Multi-Logical Device (MHMLD) exposes multiple logical
devices to multiple heads which may be connected to one or more discrete
hosts. An example of this would be a Dynamic Capacity Device which
may be configured at runtime to expose portions of its memory to Linux.

Example Devices
===============

Memory Expander
---------------
The simplest form of Type-3 device is a memory expander. A memory expander
exposes Host-Managed Device Memory (HDM) to Linux. This memory may be
Volatile or Non-Volatile (Persistent).

Memory Expanders will typically be considered a form of Single-Headed,
Single-Logical Device - as their form factor will typically be an add-in-card
(AIC) or some other similar form-factor.

The Linux CXL driver provides support for static or dynamic configuration of
basic memory expanders. The platform may program decoders prior to OS init
(e.g. auto-decoders), or the user may program the fabric if the platform
defers these operations to the OS.

Multiple Memory Expanders may be added to an external chassis and exposed to
a host via a head attached to a CXL switch. This is a "memory pool", and
would be considered an MHSLD or MHMLD depending on the management capabilities
provided by the switch platform.

As of v6.14, Linux does not provide a formalized interface to manage non-DCD
MHSLD or MHMLD devices.

Dynamic Capacity Device (DCD)
-----------------------------

A Dynamic Capacity Device is a Type-3 device which provides dynamic management
of memory capacity. The basic premise of a DCD is to provide an allocator-like
interface for physical memory capacity to a "Fabric Manager" (an external,
privileged host with privileges to change configurations for other hosts).

A DCD manages "Memory Extents", which may be volatile or persistent. Extents
may also be exclusive to a single host or shared across multiple hosts.

As of v6.14, Linux does not provide a formalized interface to manage DCD
devices; however, there is active work on LKML targeting a future release.
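The device type and head/logical-device definitions above reduce to two small lookups, sketched here as a summary aid. All identifiers are hypothetical and are not the names used by the Linux CXL driver. The mapping only encodes the protocol lists given in the Device Types section, so the optional features of Type-2 devices are not modeled.

```c
#include <stdbool.h>

/* Bit flags for the three protocols described in the text. */
enum cxl_proto {
    PROTO_CXL_IO    = 1 << 0,
    PROTO_CXL_CACHE = 1 << 1,
    PROTO_CXL_MEM   = 1 << 2,
};

/*
 * Map a set of supported protocols to the device type from the text.
 * Returns 1, 2 or 3, or 0 if the set matches none of the listed types.
 */
int cxl_type_from_protocols(unsigned int protos)
{
    if (!(protos & PROTO_CXL_IO))
        return 0; /* every listed type supports cxl.io */

    switch (protos & (PROTO_CXL_CACHE | PROTO_CXL_MEM)) {
    case PROTO_CXL_CACHE:
        return 1; /* cxl.io + cxl.cache */
    case PROTO_CXL_CACHE | PROTO_CXL_MEM:
        return 2; /* cxl.io + cxl.cache + cxl.mem */
    case PROTO_CXL_MEM:
        return 3; /* cxl.io + cxl.mem */
    default:
        return 0;
    }
}

/*
 * Name a device by its number of physical heads and logical devices,
 * following the SLD / MLD / MHSLD / MHMLD terminology above.
 */
const char *cxl_topology_name(unsigned int heads, unsigned int logical_devices)
{
    bool multi_head = heads > 1;
    bool multi_logical = logical_devices > 1;

    if (heads == 0 || logical_devices == 0)
        return "invalid";
    if (multi_head)
        return multi_logical ? "MHMLD" : "MHSLD";
    return multi_logical ? "MLD" : "SLD";
}
```

Under this sketch, a basic memory expander would be ``cxl_type_from_protocols(PROTO_CXL_IO | PROTO_CXL_MEM)`` (Type-3) with ``cxl_topology_name(1, 1)`` (SLD).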