Commit e62f81bb authored by Linus Torvalds
Pull CXL updates from Dave Jiang:
 "Core:

   - A CXL maturity map has been added to the documentation to detail
     the current state of CXL enabling.

     It records the current state of various CXL features to inform
     current and future contributors of where things stand and which
     areas need contribution.

   - A notifier handler has been added in order for a newly created CXL
     memory region to trigger the abstract distance metrics calculation.

     This should bring CXL memory to parity with hotplugged DRAM for
     NUMA abstract distance calculation. The abstract distance
     reflects the relative performance used for memory tiering.

   - An addition for XOR math has been added to address the CXL DPA to
     SPA translation.

     CXL address translation did not support address interleave math
     with XOR prior to this change.

  Fixes:

   - Fix to address race condition in the CXL memory hotplug notifier

   - Add missing MODULE_DESCRIPTION() for CXL modules

   - Fix incorrect vendor debug UUID define

  Misc:

   - A warning has been added to inform users of an unsupported
     configuration when mixing CXL VH and RCH/RCD hierarchies

   - The ENXIO error code has been replaced with EBUSY for inject poison
     limit reached via debugfs and cxl-test support

   - Moving the PCI config read in cxl_dvsec_rr_decode() to avoid
     unnecessary PCI config reads

   - A refactor to a common struct for DRAM and general media CXL
     events"

* tag 'cxl-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
  cxl/core/pci: Move reading of control register to immediately before usage
  cxl: Remove defunct code calculating host bridge target positions
  cxl/region: Verify target positions using the ordered target list
  cxl: Restore XOR'd position bits during address translation
  cxl/core: Fold cxl_trace_hpa() into cxl_dpa_to_hpa()
  cxl/test: Replace ENXIO with EBUSY for inject poison limit reached
  cxl/memdev: Replace ENXIO with EBUSY for inject poison limit reached
  cxl/acpi: Warn on mixed CXL VH and RCH/RCD Hierarchy
  cxl/core: Fix incorrect vendor debug UUID define
  Documentation: CXL Maturity Map
  cxl/region: Simplify cxl_region_nid()
  cxl/region: Support to calculate memory tier abstract distance
  cxl/region: Fix a race condition in memory hotplug notifier
  cxl: add missing MODULE_DESCRIPTION() macros
  cxl/events: Use a common struct for DRAM and General Media events
parents 7b5d4818 a0328b39
+4 −3
@@ -14,9 +14,10 @@ Description:
 		event to its internal Informational Event log, updates the
 		Event Status register, and if configured, interrupts the host.
 		It is not an error to inject poison into an address that
-		already has poison present and no error is returned. The
-		inject_poison attribute is only visible for devices supporting
-		the capability.
+		already has poison present and no error is returned. If the
+		device returns 'Inject Poison Limit Reached' an -EBUSY error
+		is returned to the user. The inject_poison attribute is only
+		visible for devices supporting the capability.


 What:		/sys/kernel/debug/memX/clear_poison
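
The documentation change above means a user-space tool may now see EBUSY (previously ENXIO) when the device reports 'Inject Poison Limit Reached'. A minimal sketch of handling that case follows. It is an illustration only: `write_poison_dpa()` and `describe_inject_result()` are hypothetical helper names, the attribute path is supplied by the caller, and writing the DPA as a hexadecimal string is an assumption about the attribute's input format.

```c
#include <errno.h>
#include <fcntl.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Write a DPA to an inject_poison style attribute (hypothetical helper).
 * attr_path is supplied by the caller; nothing here assumes where
 * debugfs is mounted. Returns 0 on success or a negative errno value.
 */
static int write_poison_dpa(const char *attr_path, uint64_t dpa)
{
	char buf[32];
	int len = snprintf(buf, sizeof(buf), "0x%" PRIx64 "\n", dpa);
	int fd = open(attr_path, O_WRONLY);
	int err = 0;

	if (fd < 0)
		return -errno;
	if (write(fd, buf, (size_t)len) < 0)
		err = -errno;	/* capture before close() can change errno */
	close(fd);
	return err;
}

/* Map the result to a message; -EBUSY is the case documented above. */
static const char *describe_inject_result(int rc)
{
	switch (rc) {
	case 0:
		return "poison injected";
	case -EBUSY:
		return "device reports Inject Poison Limit Reached";
	case -ENOENT:
		return "attribute not present";
	default:
		return strerror(-rc);
	}
}
```

A caller would pass the memdev's inject_poison attribute path and a DPA, then treat `-EBUSY` as "clear some poison and retry" rather than as a missing device.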
+2 −0
@@ -9,4 +9,6 @@ Compute Express Link

    memory-devices

+   maturity-map
+
 .. only::  subproject and html
+202 −0
.. SPDX-License-Identifier: GPL-2.0
.. include:: <isonum.txt>

===========================================
Compute Express Link Subsystem Maturity Map
===========================================

The Linux CXL subsystem tracks the dynamic `CXL specification
<https://computeexpresslink.org/cxl-specification-landing-page>`_ that
continues to respond to new use cases with new features, capability
updates and fixes. At any given point some aspects of the subsystem are
more mature than others. While the periodic pull requests summarize the
`work being incorporated each merge window
<https://lore.kernel.org/linux-cxl/?q=s%3APULL+s%3ACXL+tc%3Atorvalds+NOT+s%3ARe>`_,
those do not always convey progress relative to a starting point and a
future end goal.

What follows is a coarse breakdown of the subsystem's major
responsibilities along with a maturity score. The expectation is that
the change-history of this document provides an overview summary of the
subsystem maturation over time.

The maturity scores are:

- [3] Mature: Work in this area is complete and no changes are on the horizon.
  Note that this score can regress from one kernel release to the next
  based on new test results or end user reports.

- [2] Stabilizing: Major functionality operational, common cases are
  mature, but known corner cases are still a work in progress.

- [1] Initial: Capability that has exited the Proof of Concept phase, but
  may still have significant gaps to close and fixes to apply as real
  world testing occurs.

- [0] Known gap: Feature is on a medium to long term horizon to
  implement.  If the specification has a feature that does not even have
  a '0' score in this document, there is a good chance that no one in
  the linux-cxl@vger.kernel.org community has started to look at it.

- X: Out of scope for kernel enabling, or kernel enabling not required

Features and Capabilities
=========================

Enumeration / Provisioning
--------------------------
All of the fundamental enumeration and object model of the subsystem is
in place, but there are several corner cases that are pending closure.


* [2] CXL Window Enumeration

  * [0] :ref:`Extended-linear memory-side cache <extended-linear>`
  * [0] Low Memory-hole
  * [0] Hetero-interleave

* [2] Switch Enumeration

  * [0] CXL register enumeration link-up dependency

* [2] HDM Decoder Configuration

  * [0] Decoder target and granularity constraints

* [2] Performance enumeration

  * [3] Endpoint CDAT
  * [3] Switch CDAT
  * [1] CDAT to Core-mm integration

    * [1] x86
    * [0] Arm64
    * [0] All other arch.

  * [0] Shared link

* [2] Hotplug
  (see CXL Window Enumeration)

  * [0] Handle Soft Reserved conflicts

* [0] :ref:`RCH link status <rch-link-status>`
* [0] Fabrics / G-FAM (chapter 7)
* [0] Global Access Endpoint


RAS
---
In many ways CXL can be seen as a standardization of what would normally
be handled by custom EDAC drivers. The open development here is
mainly caused by the enumeration corner cases above.

* [3] Component events (OS)
* [2] Component events (FFM)
* [1] Endpoint protocol errors (OS)
* [1] Endpoint protocol errors (FFM)
* [0] Switch protocol errors (OS)
* [1] Switch protocol errors (FFM)
* [2] DPA->HPA Address translation

    * [1] XOR Interleave translation
      (see CXL Window Enumeration)

* [1] Memory Failure coordination
* [0] Scrub control
* [2] ACPI error injection EINJ

  * [0] EINJ v2
  * [X] Compliance DOE

* [2] Native error injection
* [3] RCH error handling
* [1] VH error handling
* [0] PPR
* [0] Sparing
* [0] Device built in test


Mailbox commands
----------------

* [3] Firmware update
* [3] Health / Alerts
* [1] :ref:`Background commands <background-commands>`
* [3] Sanitization
* [3] Security commands
* [3] RAW Command Debug Passthrough
* [0] CEL-only-validation Passthrough
* [0] Switch CCI
* [3] Timestamp
* [1] PMEM labels
* [0] PMEM GPF / Dirty Shutdown
* [0] Scan Media

PMU
---
* [1] Type 3 PMU
* [0] Switch USP/DSP, Root Port

Security
--------

* [X] CXL Trusted Execution Environment Security Protocol (TSP)
* [X] CXL IDE (subsumed by TSP)

Memory-pooling
--------------

* [1] Hotplug of LDs (via PCI hotplug)
* [0] Dynamic Capacity Device (DCD) Support

Multi-host sharing
------------------

* [0] Hardware coherent shared memory
* [0] Software managed coherency shared memory

Multi-host memory
-----------------

* [0] Dynamic Capacity Device Support
* [0] Sharing

Accelerator
-----------

* [0] Accelerator memory enumeration HDM-D (CXL 1.1/2.0 Type-2)
* [0] Accelerator memory enumeration HDM-DB (CXL 3.0 Type-2)
* [0] CXL.cache 68b (CXL 2.0)
* [0] CXL.cache 256b Cache IDs (CXL 3.0)

User Flow Support
-----------------

* [0] HPA->DPA Address translation (need xormaps export solution)

Details
=======

.. _extended-linear:

* **Extended-linear memory-side cache**: An HMAT proposal to enumerate
  the presence of a memory-side cache where the cache capacity extends
  the SRAT address range capacity. `See the ECN
  <https://lore.kernel.org/linux-cxl/6650e4f835a0e_195e294a8@dwillia2-mobl3.amr.corp.intel.com.notmuch/>`_
  for more details.

.. _rch-link-status:

* **RCH Link Status**: RCH (Restricted CXL Host) topologies end up
  hiding some standard registers like PCIe Link Status / Capabilities in
  the CXL RCRB (Root Complex Register Block).

.. _background-commands:

* **Background commands**: The CXL background command mechanism is
  awkward as the single slot is monopolized potentially indefinitely by
  various commands. A `cancel on conflict
  <http://lore.kernel.org/r/66035c2e8ba17_770232948b@dwillia2-xfh.jf.intel.com.notmuch>`_
  facility is needed to make sure the kernel can ensure forward progress
  of priority commands.
+1 −0
@@ -5613,6 +5613,7 @@ M: Ira Weiny <ira.weiny@intel.com>
 M:	Dan Williams <dan.j.williams@intel.com>
 L:	linux-cxl@vger.kernel.org
 S:	Maintained
+F:	Documentation/driver-api/cxl
 F:	drivers/cxl/
 F:	include/linux/einj-cxl.h
 F:	include/linux/cxl-event.h
+62 −57
@@ -22,56 +22,42 @@ static const guid_t acpi_cxl_qtg_id_guid =
 	GUID_INIT(0xF365F9A6, 0xA7DE, 0x4071,
 		  0xA6, 0x6A, 0xB4, 0x0C, 0x0B, 0x4F, 0x8E, 0x52);

-/*
- * Find a targets entry (n) in the host bridge interleave list.
- * CXL Specification 3.0 Table 9-22
- */
-static int cxl_xor_calc_n(u64 hpa, struct cxl_cxims_data *cximsd, int iw,
-			  int ig)
-{
-	int i = 0, n = 0;
-	u8 eiw;
-
-	/* IW: 2,4,6,8,12,16 begin building 'n' using xormaps */
-	if (iw != 3) {
-		for (i = 0; i < cximsd->nr_maps; i++)
-			n |= (hweight64(hpa & cximsd->xormaps[i]) & 1) << i;
-	}
-	/* IW: 3,6,12 add a modulo calculation to 'n' */
-	if (!is_power_of_2(iw)) {
-		if (ways_to_eiw(iw, &eiw))
-			return -1;
-		hpa &= GENMASK_ULL(51, eiw + ig);
-		n |= do_div(hpa, 3) << i;
-	}
-	return n;
-}
-
-static struct cxl_dport *cxl_hb_xor(struct cxl_root_decoder *cxlrd, int pos)
+static u64 cxl_xor_hpa_to_spa(struct cxl_root_decoder *cxlrd, u64 hpa)
 {
 	struct cxl_cxims_data *cximsd = cxlrd->platform_data;
-	struct cxl_switch_decoder *cxlsd = &cxlrd->cxlsd;
-	struct cxl_decoder *cxld = &cxlsd->cxld;
-	int ig = cxld->interleave_granularity;
-	int iw = cxld->interleave_ways;
-	int n = 0;
-	u64 hpa;
-
-	if (dev_WARN_ONCE(&cxld->dev,
-			  cxld->interleave_ways != cxlsd->nr_targets,
-			  "misconfigured root decoder\n"))
-		return NULL;
+	int hbiw = cxlrd->cxlsd.nr_targets;
+	u64 val;
+	int pos;

-	hpa = cxlrd->res->start + pos * ig;
+	/* No xormaps for host bridge interleave ways of 1 or 3 */
+	if (hbiw == 1 || hbiw == 3)
+		return hpa;

-	/* Entry (n) is 0 for no interleave (iw == 1) */
-	if (iw != 1)
-		n = cxl_xor_calc_n(hpa, cximsd, iw, ig);
+	/*
+	 * For root decoders using xormaps (hbiw: 2,4,6,8,12,16) restore
+	 * the position bit to its value before the xormap was applied at
+	 * HPA->DPA translation.
+	 *
+	 * pos is the lowest set bit in an XORMAP
+	 * val is the XORALLBITS(HPA & XORMAP)
+	 *
+	 * XORALLBITS: The CXL spec (3.1 Table 9-22) defines XORALLBITS
+	 * as an operation that outputs a single bit by XORing all the
+	 * bits in the input (hpa & xormap). Implement XORALLBITS using
+	 * hweight64(). If the hamming weight is even the XOR of those
+	 * bits results in val==0, if odd the XOR result is val==1.
+	 */

-	if (n < 0)
-		return NULL;
+	for (int i = 0; i < cximsd->nr_maps; i++) {
+		if (!cximsd->xormaps[i])
+			continue;
+		pos = __ffs(cximsd->xormaps[i]);
+		val = (hweight64(hpa & cximsd->xormaps[i]) & 1);
+		hpa = (hpa & ~(1ULL << pos)) | (val << pos);
+	}

-	return cxlrd->cxlsd.target[n];
+	return hpa;
 }

 struct cxl_cxims_context {
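
The XORALLBITS step described in the comment in cxl_xor_hpa_to_spa() can be tried outside the kernel. What follows is a minimal user-space sketch, not the kernel code: `xor_all_bits()` and `restore_xor_bits()` are hypothetical names, the xormap value mentioned afterwards is made up for illustration, and the GCC/Clang builtins `__builtin_popcountll()` and `__builtin_ctzll()` are assumed as stand-ins for `hweight64()` and `__ffs()`.

```c
#include <stdint.h>

/* XORALLBITS: 1 if an odd number of the bits selected by map are set. */
static unsigned int xor_all_bits(uint64_t hpa, uint64_t map)
{
	return (unsigned int)__builtin_popcountll(hpa & map) & 1u;
}

/*
 * For each non-zero xormap, overwrite the bit at the map's lowest set
 * position with the parity of (hpa & map). Zero maps are skipped,
 * mirroring the loop in the hunk above.
 */
static uint64_t restore_xor_bits(uint64_t hpa, const uint64_t *xormaps,
				 int nr_maps)
{
	for (int i = 0; i < nr_maps; i++) {
		uint64_t map = xormaps[i];
		uint64_t val;
		int pos;

		if (!map)
			continue;
		pos = __builtin_ctzll(map);	/* lowest set bit of the map */
		val = xor_all_bits(hpa, map);
		hpa = (hpa & ~(1ULL << pos)) | (val << pos);
	}
	return hpa;
}
```

With a single made-up map covering bits 8 and 14 (0x4100), an input of 0x4000 has odd parity under the map, so bit 8 is set and the result is 0x4100. Feeding 0x4100 back in yields even parity, so bit 8 is cleared and the result is 0x4000 again.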
@@ -361,7 +347,6 @@ static int __cxl_parse_cfmws(struct acpi_cedt_cfmws *cfmws,
 	struct cxl_port *root_port = ctx->root_port;
 	struct cxl_cxims_context cxims_ctx;
 	struct device *dev = ctx->dev;
-	cxl_calc_hb_fn cxl_calc_hb;
 	struct cxl_decoder *cxld;
 	unsigned int ways, i, ig;
 	int rc;
@@ -389,13 +374,9 @@ static int __cxl_parse_cfmws(struct acpi_cedt_cfmws *cfmws,
 	if (rc)
 		return rc;

-	if (cfmws->interleave_arithmetic == ACPI_CEDT_CFMWS_ARITHMETIC_MODULO)
-		cxl_calc_hb = cxl_hb_modulo;
-	else
-		cxl_calc_hb = cxl_hb_xor;
-
 	struct cxl_root_decoder *cxlrd __free(put_cxlrd) =
-		cxl_root_decoder_alloc(root_port, ways, cxl_calc_hb);
+		cxl_root_decoder_alloc(root_port, ways);
+
 	if (IS_ERR(cxlrd))
 		return PTR_ERR(cxlrd);

@@ -434,6 +415,9 @@ static int __cxl_parse_cfmws(struct acpi_cedt_cfmws *cfmws,

 	cxlrd->qos_class = cfmws->qtg_id;

+	if (cfmws->interleave_arithmetic == ACPI_CEDT_CFMWS_ARITHMETIC_XOR)
+		cxlrd->hpa_to_spa = cxl_xor_hpa_to_spa;
+
 	rc = cxl_decoder_add(cxld, target_map);
 	if (rc)
 		return rc;
@@ -482,6 +466,8 @@ struct cxl_chbs_context {
 	unsigned long long uid;
 	resource_size_t base;
 	u32 cxl_version;
+	int nr_versions;
+	u32 saved_version;
 };

 static int cxl_get_chbs_iter(union acpi_subtable_headers *header, void *arg,
@@ -490,22 +476,31 @@ static int cxl_get_chbs_iter(union acpi_subtable_headers *header, void *arg,
 	struct cxl_chbs_context *ctx = arg;
 	struct acpi_cedt_chbs *chbs;

-	if (ctx->base != CXL_RESOURCE_NONE)
-		return 0;
-
 	chbs = (struct acpi_cedt_chbs *) header;

-	if (ctx->uid != chbs->uid)
+	if (chbs->cxl_version == ACPI_CEDT_CHBS_VERSION_CXL11 &&
+	    chbs->length != CXL_RCRB_SIZE)
 		return 0;

-	ctx->cxl_version = chbs->cxl_version;
 	if (!chbs->base)
 		return 0;

-	if (chbs->cxl_version == ACPI_CEDT_CHBS_VERSION_CXL11 &&
-	    chbs->length != CXL_RCRB_SIZE)
+	if (ctx->saved_version != chbs->cxl_version) {
+		/*
+		 * cxl_version cannot be overwritten before the next two
+		 * checks, then use saved_version
+		 */
+		ctx->saved_version = chbs->cxl_version;
+		ctx->nr_versions++;
+	}
+
+	if (ctx->base != CXL_RESOURCE_NONE)
 		return 0;

+	if (ctx->uid != chbs->uid)
+		return 0;
+
+	ctx->cxl_version = chbs->cxl_version;
 	ctx->base = chbs->base;

 	return 0;
@@ -529,10 +524,19 @@ static int cxl_get_chbs(struct device *dev, struct acpi_device *hb,
 		.uid = uid,
 		.base = CXL_RESOURCE_NONE,
 		.cxl_version = UINT_MAX,
+		.saved_version = UINT_MAX,
 	};

 	acpi_table_parse_cedt(ACPI_CEDT_TYPE_CHBS, cxl_get_chbs_iter, ctx);

+	if (ctx->nr_versions > 1) {
+		/*
+		 * Disclaim eRCD support given some component register may
+		 * only be found via CHBCR
+		 */
+		dev_info(dev, "Unsupported platform config, mixed Virtual Host and Restricted CXL Host hierarchy.");
+	}
+
 	return 0;
 }
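
The mixed-hierarchy check in the two hunks above works by counting how often the CHBS version changes while the table is walked. A minimal user-space sketch of that counting follows. It is an illustration under stated assumptions: `struct chbs_entry` and `count_version_changes()` are hypothetical names, the entries are a plain array rather than ACPI subtables, and the RCRB length check from the iterator is left out for brevity.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a CHBS subtable entry. */
struct chbs_entry {
	uint32_t cxl_version;
	uint64_t base;
};

/*
 * Count version transitions across entries that have a non-zero base,
 * starting from a sentinel that matches no real version. A result
 * greater than 1 means more than one version was seen.
 */
static int count_version_changes(const struct chbs_entry *entries, size_t n)
{
	uint32_t saved_version = UINT32_MAX;
	int nr_versions = 0;

	for (size_t i = 0; i < n; i++) {
		if (!entries[i].base)
			continue;	/* entries without a base are ignored */
		if (saved_version != entries[i].cxl_version) {
			saved_version = entries[i].cxl_version;
			nr_versions++;
		}
	}
	return nr_versions;
}
```

Because only transitions are counted, a sequence such as version 0, 1, 0 yields 3 rather than 2. That does not matter for a "greater than 1" test, but the counter should not be read as the number of distinct versions.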

@@ -921,6 +925,7 @@ static void __exit cxl_acpi_exit(void)
 /* load before dax_hmem sees 'Soft Reserved' CXL ranges */
 subsys_initcall(cxl_acpi_init);
 module_exit(cxl_acpi_exit);
+MODULE_DESCRIPTION("CXL ACPI: Platform Support");
 MODULE_LICENSE("GPL v2");
 MODULE_IMPORT_NS(CXL);
 MODULE_IMPORT_NS(ACPI);