Commit af65545a authored by Borislav Petkov (AMD)'s avatar Borislav Petkov (AMD)
Browse files

Merge remote-tracking branches 'ras/edac-drivers', 'ras/edac-misc' and...


Merge remote-tracking branches 'ras/edac-drivers', 'ras/edac-misc' and 'ras/edac-amd-atl' into edac-updates-for-v6.9

* ras/edac-drivers:
  EDAC/i10nm: Add Intel Grand Ridge micro-server support
  EDAC/igen6: Add one more Intel Alder Lake-N SoC support

* ras/edac-misc:
  EDAC/versal: Convert to platform remove callback returning void
  EDAC/versal: Make the bit position of injected errors configurable
  EDAC/synopsys: Convert to devm_platform_ioremap_resource()

* ras/edac-amd-atl:
  RAS/AMD/FMPM: Fix off by one when unwinding on error
  RAS/AMD/FMPM: Add debugfs interface to print record entries
  RAS/AMD/FMPM: Save SPA values
  RAS: Export helper to get ras_debugfs_dir
  RAS/AMD/ATL: Fix bit overflow in denorm_addr_df4_np2()
  RAS: Introduce a FRU memory poison manager
  RAS/AMD/ATL: Add MI300 row retirement support
  Documentation: Move RAS section to admin-guide
  RAS/AMD/ATL: Add MI300 DRAM to normalized address translation support
  RAS/AMD/ATL: Fix array overflow in get_logical_coh_st_fabric_id_mi300()
  RAS/AMD/ATL: Add MI300 support
  Documentation: RAS: Add index and address translation section
  EDAC/amd64: Use new AMD Address Translation Library
  RAS: Introduce AMD Address Translation Library

Signed-off-by: default avatarBorislav Petkov (AMD) <bp@alien8.de>
Loading
Loading
Loading
Loading
+24 −0
Original line number Diff line number Diff line
.. SPDX-License-Identifier: GPL-2.0

Address translation
===================

x86 AMD
-------

Zen-based AMD systems include a Data Fabric that manages the layout of
physical memory. Devices attached to the Fabric, like memory controllers,
I/O, etc., may not have a complete view of the system physical memory map.
These devices may provide a "normalized", i.e. device physical, address
when reporting memory errors. Normalized addresses must be translated to
a system physical address for the kernel to action on the memory.

AMD Address Translation Library (CONFIG_AMD_ATL) provides translation for
this case.

Glossary of acronyms used in address translation for Zen-based systems

* CCM               = Cache Coherent Moderator
* COD               = Cluster-on-Die
* COH_ST            = Coherent Station
* DF                = Data Fabric
+3 −8
Original line number Diff line number Diff line
.. SPDX-License-Identifier: GPL-2.0

Reliability, Availability and Serviceability features
=====================================================

This documents different aspects of the RAS functionality present in the
kernel.

Error decoding
---------------
==============

* x86
x86
---

Error decoding on AMD systems should be done using the rasdaemon tool:
https://github.com/mchehab/rasdaemon/
+7 −0
Original line number Diff line number Diff line
.. SPDX-License-Identifier: GPL-2.0
.. toctree::
   :maxdepth: 2

   main
   error-decoding
   address-translation
+7 −3
Original line number Diff line number Diff line
.. SPDX-License-Identifier: GPL-2.0
.. include:: <isonum.txt>

============================================
Reliability, Availability and Serviceability
============================================
==================================================
Reliability, Availability and Serviceability (RAS)
==================================================

This documents different aspects of the RAS functionality present in the
kernel.

RAS concepts
************
+1 −1
Original line number Diff line number Diff line
@@ -122,7 +122,7 @@ configure specific aspects of kernel behavior to your liking.
   pmf
   pnp
   rapidio
   ras
   RAS/index
   rtc
   serial-console
   svga
Loading