Commit Graph

1351241 Commits

Author SHA1 Message Date
Bjorn Helgaas
d96c67a57c Merge branch 'pci/controller/rcar-gen4'
- Describe endpoint BAR 4 as being fixed size (Jerome Brunet)

- Document how to obtain R-Car V4H (r8a779g0) controller firmware
  (Yoshihiro Shimoda)

* pci/controller/rcar-gen4:
  PCI: rcar-gen4: Document how to obtain platform firmware
  PCI: rcar-gen4: set ep BAR4 fixed size
2025-06-04 10:50:42 -05:00
Bjorn Helgaas
05cf00aa05 Merge branch 'pci/controller/qcom'
- Add OF support for parsing DT 'eq-presets-<N>gts' property for lane
  equalization presets (Krishna Chaitanya Chundru)

- Read Maximum Link Width from the Link Capabilities register if DT lacks
  'num-lanes' property (Krishna Chaitanya Chundru)

- Add Physical Layer 64 GT/s Capability ID and register offsets for 8, 32,
  and 64 GT/s lane equalization registers (Krishna Chaitanya Chundru)

- Add generic dwc support for configuring lane equalization presets
  (Krishna Chaitanya Chundru)

- Add DT and driver support for PCIe on IPQ5018 SoC (Nitheesh Sekar)

* pci/controller/qcom:
  PCI: qcom: Add support for IPQ5018
  dt-bindings: PCI: qcom: Add IPQ5018 SoC
  PCI: dwc: Add support for configuring lane equalization presets
  PCI: Add lane equalization register offsets
  PCI: dwc: Update pci->num_lanes to maximum supported link width
  PCI: of: Add of_pci_get_equalization_presets() API
2025-06-04 10:50:42 -05:00
Bjorn Helgaas
c7b9c59124 Merge branch 'pci/controller/mvebu'
- Use for_each_of_range() iterator for parsing 'ranges' (Rob Herring)

* pci/controller/mvebu:
  PCI: mvebu: Use for_each_of_range() iterator for parsing "ranges"
2025-06-04 10:50:41 -05:00
Bjorn Helgaas
dee6ce5c6f Merge branch 'pci/controller/mobiveil'
- Use to_delayed_work() instead of open-coding it (Chen Ni)

* pci/controller/mobiveil:
  PCI: ls-gen4: Use to_delayed_work()
2025-06-04 10:50:41 -05:00
Bjorn Helgaas
f4ff0b0ed2 Merge branch 'pci/controller/imx6'
- Apply link training workaround only on IMX6Q, IMX6SX, IMX6SP (Richard
  Zhu)

- Remove redundant dw_pcie_wait_for_link() from imx_pcie_start_link();
  since the DWC core does this, imx6 only needs it when retraining for a
  faster link speed (Richard Zhu)

- Toggle i.MX95 core reset to align with PHY powerup (Richard Zhu)

- Set SYS_AUX_PWR_DET to work around i.MX95 ERR051624 erratum: in some
  cases, the controller can't exit 'L23 Ready' through Beacon or PERST#
  deassertion (Richard Zhu)

- Clear GEN3_ZRXDC_NONCOMPL to work around i.MX95 ERR051586 erratum:
  controller can't meet 2.5 GT/s ZRX-DC timing when operating at 8 GT/s,
  causing timeouts in L1 (Richard Zhu)

- Wait for i.MX95 PLL lock before enabling controller (Richard Zhu)

- Save/restore i.MX95 LUT for suspend/resume (Richard Zhu)

* pci/controller/imx6:
  PCI: imx6: Save and restore the LUT setting during suspend/resume for i.MX95 SoC
  PCI: imx6: Add PLL lock check for i.MX95 SoC
  PCI: imx6: Add workaround for errata ERR051586
  PCI: imx6: Add workaround for errata ERR051624
  PCI: imx6: Toggle the core reset for i.MX95 PCIe
  PCI: imx6: Call dw_pcie_wait_for_link() from start_link() callback only when required
  PCI: imx6: Skip link up workaround for newer platforms
2025-06-04 10:50:40 -05:00
Bjorn Helgaas
20611193be Merge branch 'pci/controller/dwc'
- Set PORT_LOGIC_LINK_WIDTH to one lane to make initial link training more
  robust; this will not affect the intended link width if all lanes are
  functional (Wenbin Yao)

* pci/controller/dwc:
  PCI: dwc: Make link training more robust by setting PORT_LOGIC_LINK_WIDTH to one lane
2025-06-04 10:50:39 -05:00
Bjorn Helgaas
00c78a3c3f Merge branch 'pci/controller/dwc-ep'
- Use FIELD_GET() to simplify extracting register values (Hans Zhang)

* pci/controller/dwc-ep:
  PCI: dwc: ep: Fix errno typo
  PCI: dwc: ep: Use FIELD_GET() where applicable
2025-06-04 10:50:39 -05:00
Bjorn Helgaas
20279628bb Merge branch 'pci/controller/dw-rockchip'
- Check only PCIE_LINKUP, not LTSSM status, to determine whether the link
  is up (Shawn Lin)

- Increase N_FTS (used in L0s->L0 transitions) and enable ASPM L0s for Root
  Complex and Endpoint modes (Shawn Lin)

- Hide the broken ATS Capability in rockchip_pcie_ep_init() instead of
  rockchip_pcie_ep_pre_init() so it stays hidden after PERST# resets
  non-sticky registers (Shawn Lin)

- Remove unused PCIE_CLIENT_GENERAL_DEBUG definition (Hans Zhang)

- Organize register and bitfield definitions logically (Hans Zhang)

- Use rockchip_pcie_link_up() to check link up instead of open coding, and
  use GENMASK() and FIELD_GET() when possible (Hans Zhang)

- Call phy_power_off() before phy_exit() in rockchip_pcie_phy_deinit()
  (Diederik de Haas)

- Return bool (not int) for link-up check in dw_pcie_ops.link_up() and
  armada8k, dra7xx, dw-rockchip, exynos, histb, keembay, keystone, kirin,
  meson, qcom, qcom-ep, rcar_gen4, spear13xx, tegra194, uniphier, visconti
  (Hans Zhang)

- Return bool (not int) for link-up check in mobiveil_pab_ops.link_up() and
  layerscape-gen4, mobiveil (Hans Zhang)

- Simplify j721e link-up check (Hans Zhang)

- Convert pci-host-common to a library so platforms that don't need native
  host controller drivers don't need to include these helper functions
  (Manivannan Sadhasivam)

* pci/controller/dw-rockchip:
  PCI: qcom: Replace PERST# sleep time with proper macro
  PCI: dw-rockchip: Replace PERST# sleep time with proper macro
  PCI: host-common: Convert to library for host controller drivers
  PCI: cadence: Simplify J721e link status check
  PCI: mobiveil: Return bool from link up check
  PCI: dwc: Return bool from link up check
  PCI: dw-rockchip: Fix PHY function call sequence in rockchip_pcie_phy_deinit()
  PCI: dw-rockchip: Use rockchip_pcie_link_up() to check link up instead of open coding
  PCI: dw-rockchip: Reorganize register and bitfield definitions
  PCI: dw-rockchip: Remove unused PCIE_CLIENT_GENERAL_DEBUG definition
  PCI: dw-rockchip: Move rockchip_pcie_ep_hide_broken_ats_cap_rk3588() to dw_pcie_ep_ops::init()
  PCI: dw-rockchip: Enable ASPM L0s capability for both RC and EP modes
  PCI: dw-rockchip: Remove PCIE_L0S_ENTRY check from rockchip_pcie_link_up()

# Conflicts:
#	drivers/pci/controller/pcie-apple.c
#	include/linux/pci-ecam.h
2025-06-04 10:50:38 -05:00
Bjorn Helgaas
3f0b36295e Merge branch 'pci/controller/cadence'
- Drop a runtime PM 'put' to resolve a runtime atomic count underflow (Hans
  Zhang)

- Use shared PCIE_MSG_CODE_* definitions and remove duplicate
  cdns_pcie_msg_code definitions (Hans Zhang)

- Make the cadence core buildable as a module (Kishon Vijay Abraham I)

- Add cdns_pcie_host_disable() and cdns_pcie_ep_disable() for use by
  loadable drivers when they are removed (Siddharth Vadapalli)

- Make j721e buildable as a loadable and removable module (Siddharth
  Vadapalli)

- Fix j721e host/endpoint dependencies that result in link failures in
  some configs (Arnd Bergmann)

* pci/controller/cadence:
  PCI: j721e: Fix host/endpoint dependencies
  PCI: j721e: Add support to build as a loadable module
  PCI: cadence-ep: Introduce cdns_pcie_ep_disable() helper for cleanup
  PCI: cadence-host: Introduce cdns_pcie_host_disable() helper for cleanup
  PCI: cadence: Add support to build pcie-cadence library as a kernel module
  PCI: cadence: Remove duplicate message code definitions
  PCI: cadence: Fix runtime atomic count underflow
2025-06-04 10:50:04 -05:00
Bjorn Helgaas
c3b2f9dccb Merge branch 'pci/controller/apple'
- Skip ports disabled in DT when setting up ports (Janne Grunau)

- Add t6020 compatible string (Alyssa Rosenzweig)

- Extract ECAM bridge creation helper from pci_host_common_probe() to
  separate driver-specific things like MSI from PCI things (Marc Zyngier)

- Dynamically allocate RID-to_SID bitmap to prepare for SoCs with varying
  capabilities (Marc Zyngier)

- Directly set/clear INTx mask bits because T602x dropped the accessors
  that could do this without locking (Marc Zyngier)

- Move port PHY registers to their own reg items to accommodate T602x,
  which moves them around; retain default offsets for existing DTs that
  lack phy%d entries with the reg offsets (Hector Martin)

- Stop polling for core refclk, which doesn't work on T602x and the
  bootloader has already done anyway (Hector Martin)

- Use gpiod_set_value_cansleep() when asserting PERST# in probe because
  we're allowed to sleep there (Hector Martin)

- Move register offsets into SoC-specific structure (Hector Martin)

- Add T602x PCIe support (Hector Martin)

* pci/controller/apple:
  PCI: apple: Add T602x PCIe support
  PCI: apple: Abstract register offsets via a SoC-specific structure
  PCI: apple: Use gpiod_set_value_cansleep in probe flow
  PCI: apple: Drop poll for CORE_RC_PHYIF_STAT_REFCLK
  PCI: apple: Move port PHY registers to their own reg items
  PCI: apple: Fix missing OF node reference in apple_pcie_setup_port
  PCI: apple: Move away from INTMSK{SET,CLR} for INTx and private interrupts
  PCI: apple: Dynamically allocate RID-to_SID bitmap
  PCI: apple: Move over to standalone probing
  PCI: ecam: Allow cfg->priv to be pre-populated from the root port device
  PCI: host-generic: Extract an ECAM bridge creation helper from pci_host_common_probe()
  dt-bindings: pci: apple,pcie: Add t6020 compatible string
  PCI: apple: Set only available ports up
2025-06-04 10:50:04 -05:00
Bjorn Helgaas
2ce738726a Merge branch 'pci/endpoint'
- For fixed-size BARs, retain both the actual size and the possibly larger
  size allocated to accommodate iATU alignment requirements (Jerome Brunet)

- Simplify ctrl/SPAD space allocation and avoid allocating more space than
  needed (Jerome Brunet)

- Correct MSI-X PBA offset calculations for DesignWare and Cadence endpoint
  controllers (Niklas Cassel)

- Align the return value (number of interrupts) encoding for
  pci_epc_get_msi()/pci_epc_ops::get_msi() and
  pci_epc_get_msix()/pci_epc_ops::get_msix() (Niklas Cassel)

- Align the nr_irqs parameter encoding for
  pci_epc_set_msi()/pci_epc_ops::set_msi() and
  pci_epc_set_msix()/pci_epc_ops::set_msix() (Niklas Cassel)

* pci/endpoint:
  PCI: endpoint: Align pci_epc_set_msix(), pci_epc_ops::set_msix() nr_irqs encoding
  PCI: endpoint: Align pci_epc_set_msi(), pci_epc_ops::set_msi() nr_irqs encoding
  PCI: endpoint: Align pci_epc_get_msix(), pci_epc_ops::get_msix() return value encoding
  PCI: endpoint: Align pci_epc_get_msi(), pci_epc_ops::get_msi() return value encoding
  PCI: cadence-ep: Correct PBA offset in .set_msix() callback
  PCI: dwc: ep: Correct PBA offset in .set_msix() callback
  PCI: endpoint: pci-epf-vntb: Simplify ctrl/SPAD space allocation
  PCI: endpoint: Retain fixed-size BAR size as well as aligned size
2025-06-04 10:50:03 -05:00
Bjorn Helgaas
014dbfe0e4 Merge branch 'pci/virtualization'
- Add an ACS quirk for Loongson Root Ports that don't advertise ACS but
  don't allow peer-to-peer transactions between Root Ports; the quirk
  allows each Root Port to be in a separate IOMMU group (Huacai Chen)

* pci/virtualization:
  PCI: Add ACS quirk for Loongson PCIe
2025-06-04 10:50:03 -05:00
Bjorn Helgaas
df11586119 Merge branch 'pci/reset'
- Fix locking issue in the slot reset path (Ilpo Järvinen)

* pci/reset:
  PCI: Fix lock symmetry in pci_slot_unlock()
2025-06-04 10:50:02 -05:00
Bjorn Helgaas
4dac48e8a7 Merge branch 'pci/pwrctrl'
- Rename pwrctrl Kconfig symbols from 'PWRCTL' to 'PWRCTRL' to match the
  filename paths.  Retain old deprecated symbols for compatibility, except
  for the pwrctrl slot driver (PCI_PWRCTRL_SLOT) (Johan Hovold)

- When unregistering pwrctrl, cancel outstanding rescan work before
  cleaning up data structures to avoid use-after-free issues (Brian Norris)

* pci/pwrctrl:
  arm64: Kconfig: switch to HAVE_PWRCTRL
  wifi: ath12k: switch to PCI_PWRCTRL_PWRSEQ
  wifi: ath11k: switch to PCI_PWRCTRL_PWRSEQ
  PCI/pwrctrl: Rename pwrctrl Kconfig symbols and slot module
  PCI/pwrctrl: Cancel outstanding rescan work when unregistering
2025-06-04 10:50:02 -05:00
Bjorn Helgaas
f377d9cb25 Merge branch 'pci/pm'
- Add pm_runtime_put() cleanup helper for use with __free() to
  automatically drop the device usage count when a pointer goes out of
  scope (Alex Williamson)

- Increment PM usage counter when probing reset methods so we don't try to
  read config space of a powered-off device (Alex Williamson)

- Set all devices to D0 during enumeration to ensure ACPI opregion is
  connected via _REG (Mario Limonciello)

* pci/pm:
  PCI: Explicitly put devices into D0 when initializing
  PCI: Increment PM usage counter when probing reset methods
  PM: runtime: Define pm_runtime_put cleanup helper
2025-06-04 10:50:01 -05:00
Bjorn Helgaas
80fe18d1de Merge branch 'pci/pci-acpi'
- Fix pci_acpi_scan_root() memory leak when we fail to create a PCI bus
  (Zhe Qiao)

* pci/pci-acpi:
  PCI/ACPI: Fix allocated memory release on error in pci_acpi_scan_root()
2025-06-04 10:50:00 -05:00
Bjorn Helgaas
3ebd1305c1 Merge branch 'pci/irq'
- Use of_fwnode_handle() so of_node_to_fwnode() can be removed (Jiri Slaby)

* pci/irq:
  irqdomain: pci: Switch to of_fwnode_handle()
2025-06-04 10:50:00 -05:00
Bjorn Helgaas
bb5c909e6a Merge branch 'pci/hotplug'
- Ignore Presence Detect Changed caused by DPC.  pciehp already ignores
  Link Down/Up events caused by DPC, but on slots using in-band presence
  detect, DPC causes a spurious Presence Detect Changed event (Lukas
  Wunner)

- Ignore Link Down/Up caused by Secondary Bus Reset.  On hotplug ports
  using in-band presence detect, the reset causes a Presence Detect Changed
  event, which mistakenly caused teardown and re-enumeration of the device.
  Drivers may need to annotate code that resets their device (Lukas Wunner)

* pci/hotplug:
  PCI: hotplug: Drop superfluous #include directives
  PCI: pciehp: Ignore Link Down/Up caused by Secondary Bus Reset
  PCI: pciehp: Ignore Presence Detect Changed caused by DPC

# Conflicts:
#	drivers/pci/pci.h
2025-06-04 10:49:59 -05:00
Bjorn Helgaas
68d0370e4e Merge branch 'pci/enumeration'
- Remove pci_fixup_cardbus(), which has no users left (Heiner Kallweit)

- Print the actual delay time in pci_bridge_wait_for_secondary_bus()
  instead of assuming it was 1000ms (Wilfred Mallawa)

- Revert 'iommu/amd: Prevent binding other PCI drivers to IOMMU PCI
  devices', which broke resume from system sleep on AMD platforms and has
  been fixed by other commits (Lukas Wunner)

- Restrict visibility of pci_dev.match_driver since it's no longer used
  outside the PCI core (Lukas Wunner)

* pci/enumeration:
  PCI: Limit visibility of match_driver flag to PCI core
  Revert "iommu/amd: Prevent binding other PCI drivers to IOMMU PCI devices"
  PCI: Print the actual delay time in pci_bridge_wait_for_secondary_bus()
  PCI: Use PCI_STD_NUM_BARS instead of 6
  PCI: Remove pci_fixup_cardbus()

# Conflicts:
#	drivers/pci/pci.h
2025-06-04 10:49:56 -05:00
Bjorn Helgaas
f56278a46d Merge branch 'pci/devres'
- Remove mtip32xx use of pcim_iounmap_regions(), which is deprecated and
  unnecessary (Philipp Stanner)

- Remove pcim_iounmap_regions() and pcim_request_region_exclusive() and
  related flags since all uses have been removed (Philipp Stanner)

- Rework devres 'request' functions so they are no longer 'hybrid', i.e.,
  their behavior no longer depends on whether pcim_enable_device or
  pci_enable_device() was used, and remove related code (Philipp Stanner)

* pci/devres:
  PCI: Remove function pcim_intx() prototype from pci.h
  PCI: Remove hybrid-devres usage warnings from kernel-doc
  PCI: Remove redundant set of request functions
  PCI: Remove exclusive requests flags from _pcim_request_region()
  PCI: Remove pcim_request_region_exclusive()
  Documentation/driver-api: Update pcim_enable_device()
  PCI: Remove hybrid devres nature from request functions
  PCI: Remove pcim_iounmap_regions()
  mtip32xx: Remove unnecessary pcim_iounmap_regions() calls
2025-06-04 10:49:50 -05:00
Bjorn Helgaas
1acf6a5e79 Merge branch 'pci/bwctrl'
- Simplify link bandwidth controller by replacing the count of Link
  Bandwidth Management Status (LBMS) events with a PCI_LINK_LBMS_SEEN flag
  (Ilpo Järvinen)

- Update the Link Speed after retraining, since the Link Speed may have
  changed (Ilpo Järvinen)

* pci/bwctrl:
  PCI: Update Link Speed after retraining
  PCI/bwctrl: Replace lbms_count with PCI_LINK_LBMS_SEEN flag
2025-06-04 10:49:49 -05:00
Bjorn Helgaas
f5b6c76e55 Merge branch 'pci/aer'
- Initialize struct aer_err_info before using it to avoid depending on
  stack garbage (Bjorn Helgaas)

- Log the DPC Error Source ID only when it's actually valid (when ERR_FATAL
  or ERR_NONFATAL was received from a downstream device) and decode into
  bus/device/function (Bjorn Helgaas)

- Consolidate AER Error Source ID in one place for message consistency
  (Bjorn Helgaas)

- Update statistics and emit trace events early in AER logging paths,
  before any potential ratelimiting (Bjorn Helgaas)

- Determine AER log level once and save it so all related messages use the
  same level (Karolina Stolarek)

- Use KERN_WARNING, not KERN_ERR, when logging PCIe Correctable Errors.

- Ratelimit PCIe Correctable and Non-Fatal error logging, with sysfs
  controls on interval and burst count, to avoid flooding logs and RCU
  stall warnings (Jon Pan-Doh)

* pci/aer:
  PCI/ERR: Remove misleading TODO regarding kernel panic
  PCI/AER: Add sysfs attributes for log ratelimits
  PCI/AER: Add ratelimits to PCI AER Documentation
  PCI/AER: Ratelimit correctable and non-fatal error logging
  PCI/AER: Simplify add_error_device()
  PCI/AER: Convert aer_get_device_error_info(), aer_print_error() to index
  PCI/AER: Rename struct aer_stats to aer_info
  PCI/AER: Reduce pci_print_aer() correctable error level to KERN_WARNING
  PCI/ERR: Add printk level to pcie_print_tlp_log()
  PCI/AER: Check log level once and remember it
  PCI/AER: Trace error event before ratelimiting
  PCI/AER: Update statistics before ratelimiting
  PCI/AER: Simplify pci_print_aer()
  PCI/AER: Initialize aer_err_info before using it
  PCI/AER: Move aer_print_source() earlier in file
  PCI/AER: Rename aer_print_port_info() to aer_print_source()
  PCI/AER: Extract bus/dev/fn in aer_print_port_info() with PCI_BUS_NUM(), etc
  PCI/AER: Consolidate Error Source ID logging in aer_isr_one_error_type()
  PCI/AER: Factor COR/UNCOR error handling out from aer_isr_one_error()
  PCI/DPC: Log Error Source ID only when valid
  PCI/DPC: Initialize aer_err_info before using it
2025-06-04 10:49:49 -05:00
Arnd Bergmann
3c05e88413 PCI: j721e: Fix host/endpoint dependencies
The j721e driver has a single platform driver that can be built-in or a
loadable module, but it calls two separate backend drivers depending on
whether it is a host or endpoint.

If the two modes are not the same, we can end up with a situation where the
built-in pci-j721e driver tries to call the modular host or endpoint
driver, which causes a link failure:

  ld.lld-21: error: undefined symbol: cdns_pcie_ep_setup
  >>> referenced by pci-j721e.c
  >>>               drivers/pci/controller/cadence/pci-j721e.o:(j721e_pcie_probe) in archive vmlinux.a

  ld.lld-21: error: undefined symbol: cdns_pcie_host_setup
  >>> referenced by pci-j721e.c
  >>>               drivers/pci/controller/cadence/pci-j721e.o:(j721e_pcie_probe) in archive vmlinux.a

Rework the dependencies so that the 'select' is done by the common Kconfig
symbol, based on which of the two are enabled. Effectively this means that
having one built-in makes the other either built-in or disabled, but all
configurations will now build.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Link: https://patch.msgid.link/20250423162523.2060405-1-arnd@kernel.org
2025-06-02 16:02:37 -05:00
Siddharth Vadapalli
a2790bf81f PCI: j721e: Add support to build as a loadable module
The 'pci-j721e.c' driver is the application/glue/wrapper driver for the
Cadence PCIe Controllers on TI SoCs. Implement support for building it as a
loadable module.

Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://patch.msgid.link/20250417124408.2752248-5-s-vadapalli@ti.com
2025-06-02 16:02:37 -05:00
Siddharth Vadapalli
3a4b05c9ba PCI: cadence-ep: Introduce cdns_pcie_ep_disable() helper for cleanup
Introduce the helper function cdns_pcie_ep_disable() which will undo the
configuration performed by cdns_pcie_ep_setup(). Also, export it for use
by the existing callers of cdns_pcie_ep_setup(), thereby allowing them
to cleanup on their exit path.

Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://patch.msgid.link/20250417124408.2752248-4-s-vadapalli@ti.com
2025-06-02 16:02:37 -05:00
Siddharth Vadapalli
47f25da6c5 PCI: cadence-host: Introduce cdns_pcie_host_disable() helper for cleanup
Introduce the helper function cdns_pcie_host_disable() which will undo
the configuration performed by cdns_pcie_host_setup(). Also, export it
for use by existing callers of cdns_pcie_host_setup(), thereby allowing
them to cleanup on their exit path.

Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://patch.msgid.link/20250417124408.2752248-3-s-vadapalli@ti.com
2025-06-02 16:02:37 -05:00
Kishon Vijay Abraham I
f876904e44 PCI: cadence: Add support to build pcie-cadence library as a kernel module
Currently, the Cadence PCIe controller driver can be built as a built-in
module only. Since PCIe functionality is not a necessity for booting, add
support to build the Cadence PCIe driver as a loadable module as well.

Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://patch.msgid.link/20250417124408.2752248-2-s-vadapalli@ti.com
2025-06-02 16:02:33 -05:00
Niklas Cassel
ec49e25332 PCI: qcom: Replace PERST# sleep time with proper macro
Replace the PERST# sleep time with the proper macro (PCIE_T_PVPERL_MS).
No functional change.

Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
Reviewed-by: Hans Zhang <18255117159@163.com>
Reviewed-by: Krishna Chaitanya Chundru <krishna.chundru@oss.qualcomm.com>
Link: https://patch.msgid.link/20250506073934.433176-10-cassel@kernel.org
2025-05-30 16:56:56 -05:00
Niklas Cassel
d34719d0e8 PCI: dw-rockchip: Replace PERST# sleep time with proper macro
Replace the PERST# sleep time with the proper macro (PCIE_T_PVPERL_MS).
No functional change.

Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Laszlo Fiat <laszlo.fiat@proton.me>
Reviewed-by: Hans Zhang <18255117159@163.com>
Reviewed-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
Link: https://patch.msgid.link/20250506073934.433176-9-cassel@kernel.org
2025-05-30 16:56:51 -05:00
Manivannan Sadhasivam
d1c696dba1 PCI: host-common: Convert to library for host controller drivers
This common library will be used as a placeholder for helper functions
shared by the host controller drivers. This avoids placing the host
controller drivers specific helpers in drivers/pci/*.c, to avoid enlarging
the kernel image on platforms that do not use host controller drivers at
all (like x86/ACPI platforms).

Suggested-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250508-pcie-reset-slot-v4-3-7050093e2b50@linaro.org
2025-05-30 12:21:57 -05:00
Manivannan Sadhasivam
b06d125e62 PCI/ERR: Remove misleading TODO regarding kernel panic
A PCI device is just another peripheral in a system. So failure to
recover it, must not result in a kernel panic. So remove the TODO which
is quite misleading.

Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
Link: https://patch.msgid.link/20250508-pcie-reset-slot-v4-1-7050093e2b50@linaro.org
2025-05-30 12:21:19 -05:00
Hans Zhang
16b2da850f PCI: cadence: Remove duplicate message code definitions
The Cadence PCIe controller driver defines message codes in enum
cdns_pcie_msg_code duplicating the existing PCIE_MSG_CODE_* definitions in
drivers/pci/pci.h. The driver only uses ASSERT_INTA and DEASSERT_INTA codes
from this enum.

Remove the redundant Cadence-specific enum definitions and use the ones
available in drivers/pci/pci.h. This helps in avoiding code duplication,
maintaining consistency with the spec, and simplifying the code
maintenance.

Signed-off-by: Hans Zhang <18255117159@163.com>
[mani: commit message rewording]
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250401145023.22948-1-18255117159@163.com
2025-05-28 16:55:27 -05:00
Niklas Cassel
de0321bcc5 PCI: endpoint: Align pci_epc_set_msix(), pci_epc_ops::set_msix() nr_irqs encoding
The kdoc for pci_epc_set_msix() says:
"Invoke to set the required number of MSI-X interrupts."

The kdoc for the callback pci_epc_ops->set_msix() says:
"ops to set the requested number of MSI-X interrupts in the MSI-X
capability register"

pci_epc_ops::set_msix() does however expect the parameter 'interrupts' to
be in the encoding as defined by the Table Size field. Nowhere in the
kdoc does it say that the number of interrupts should be in Table Size
encoding.

It is very confusing that the API pci_epc_set_msix() and the callback
function pci_epc_ops::set_msix() both take a parameter named 'interrupts',
but they expect completely different encodings.

Clean up the API and the callback function to have the same semantics,
i.e. the parameter represents the number of interrupts, regardless of the
internal encoding of that value.

Also rename the parameter 'interrupts' to 'nr_irqs', in both the wrapper
function and the callback function, such that the name is unambiguous.

[bhelgaas: more specific subject]

Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable+noautosel@kernel.org # this is simply a cleanup
Link: https://patch.msgid.link/20250514074313.283156-14-cassel@kernel.org
2025-05-28 16:47:56 -05:00
Niklas Cassel
f62da6e727 PCI: endpoint: Align pci_epc_set_msi(), pci_epc_ops::set_msi() nr_irqs encoding
The kdoc for pci_epc_set_msi() says:
"Invoke to set the required number of MSI interrupts."

The kdoc for the callback pci_epc_ops::set_msi() says:
"ops to set the requested number of MSI interrupts in the MSI capability
register"

pci_epc_ops::set_msi() does however expect the parameter 'interrupts' to be
in the encoding as defined by the Multiple Message Capable (MMC) field of
the MSI capability structure. Nowhere in the kdoc does it say that the
number of interrupts should be in MMC encoding.

It is very confusing that the API pci_epc_set_msi() and the callback
function pci_epc_ops::set_msi() both take a parameter named 'interrupts',
but they expect completely different encodings.

Clean up the API and the callback function to have the same semantics,
i.e. the parameter represents the number of interrupts, regardless of the
internal encoding of that value.

Also rename the parameter 'interrupts' to 'nr_irqs', in both the wrapper
function and the callback function, such that the name is unambiguous.

[bhelgaas: more specific subject]

Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable+noautosel@kernel.org # this is simply a cleanup
Link: https://patch.msgid.link/20250514074313.283156-13-cassel@kernel.org
2025-05-28 16:47:56 -05:00
Niklas Cassel
0917ed8f16 PCI: endpoint: Align pci_epc_get_msix(), pci_epc_ops::get_msix() return value encoding
The kdoc for pci_epc_get_msix() says:
"Invoke to get the number of MSI-X interrupts allocated by the RC"

The kdoc for the callback pci_epc_ops->get_msix() says:
"ops to get the number of MSI-X interrupts allocated by the RC from the
MSI-X capability register"

pci_epc_ops::get_msix() does however return the number of interrupts in the
encoding as defined by the Table Size field. Nowhere in the kdoc does it
say that the returned number of interrupts is in Table Size encoding.

It is very confusing that the API pci_epc_get_msix() and the callback
function pci_epc_ops::get_msix() don't return the same value.

Clean up the API and the callback function to have the same semantics,
i.e. return the number of interrupts, regardless of the internal encoding
of that value.

[bhelgaas: more specific subject]

Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Cc: stable+noautosel@kernel.org # this is simply a cleanup
Link: https://patch.msgid.link/20250514074313.283156-12-cassel@kernel.org
2025-05-28 16:47:56 -05:00
Niklas Cassel
f7f15fc532 PCI: endpoint: Align pci_epc_get_msi(), pci_epc_ops::get_msi() return value encoding
The kdoc for API pci_epc_get_msi() says:
"Invoke to get the number of MSI interrupts allocated by the RC"

The kdoc for the callback pci_epc_ops::get_msi() says:
"ops to get the number of MSI interrupts allocated by the RC from
the MSI capability register"

pci_epc_ops::get_msi() does however return the number of interrupts in the
encoding as defined by the Multiple Message Enable (MME) field of the MSI
Capability structure.

Nowhere in the kdoc does it say that the returned number of interrupts is
in MME encoding. It is very confusing that the API pci_epc_get_msi() and
the callback function pci_epc_ops::get_msi() don't return the same value.

Clean up the API and the callback function to have the same semantics,
i.e. return the number of interrupts, regardless of the internal encoding
of that value.

[bhelgaas: more specific subject]

Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Cc: stable+noautosel@kernel.org # this is simply a cleanup
Link: https://patch.msgid.link/20250514074313.283156-11-cassel@kernel.org
2025-05-28 16:47:56 -05:00
Niklas Cassel
c8bcb01352 PCI: cadence-ep: Correct PBA offset in .set_msix() callback
While cdns_pcie_ep_set_msix() writes the Table Size field correctly (N-1),
the calculation of the PBA offset is wrong because it calculates space for
(N-1) entries instead of N.

This results in the following QEMU error when using PCI passthrough on a
device which relies on the PCI endpoint subsystem:

  failed to add PCI capability 0x11[0x50]@0xb0: table & pba overlap, or they don't fit in BARs, or don't align

Fix the calculation of PBA offset in the MSI-X capability.

[bhelgaas: more specific subject and commit log]

Fixes: 3ef5d16f50 ("PCI: cadence: Add MSI-X support to Endpoint driver")
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20250514074313.283156-10-cassel@kernel.org
2025-05-28 16:47:56 -05:00
Niklas Cassel
810276362b PCI: dwc: ep: Correct PBA offset in .set_msix() callback
While dw_pcie_ep_set_msix() writes the Table Size field correctly (N-1),
the calculation of the PBA offset is wrong because it calculates space for
(N-1) entries instead of N.

This results in the following QEMU error when using PCI passthrough on a
device which relies on the PCI endpoint subsystem:

  failed to add PCI capability 0x11[0x50]@0xb0: table & pba overlap, or they don't fit in BARs, or don't align

Fix the calculation of PBA offset in the MSI-X capability.

[bhelgaas: more specific subject and commit log]

Fixes: 83153d9f36 ("PCI: endpoint: Fix ->set_msix() to take BIR and offset as arguments")
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20250514074313.283156-9-cassel@kernel.org
2025-05-28 16:47:56 -05:00
Jerome Brunet
e5327a6556 PCI: endpoint: pci-epf-vntb: Simplify ctrl/SPAD space allocation
When allocating the shared ctrl/SPAD space, epf_ntb_config_spad_bar_alloc()
should not try to handle the size quirks for underlying BAR, whether it is
fixed size or alignment. This is already handled by pci_epf_alloc_space().

Also, when handling the alignment, this allocates more space than
necessary. For example, with a SPAD size of 1024B and a ctrl size of 308B,
the space necessary is 1332B. If the alignment is 1MB,
epf_ntb_config_spad_bar_alloc() tries to allocate 2MB where 1MB would have
been more than enough.

Drop the handling of the BAR size quirks and let pci_epf_alloc_space()
handle that. Just make sure the 32bits SPAD register are aligned on 32bits.

Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250424-pci-ep-size-alignment-v5-2-2d4ec2af23f5@baylibre.com
2025-05-28 16:47:37 -05:00
Jerome Brunet
793908d60b PCI: endpoint: Retain fixed-size BAR size as well as aligned size
When allocating space for an endpoint function on a BAR with a fixed size,
the size saved in 'struct pci_epf_bar.size' should be the fixed size as
expected by pci_epc_set_bar().

However, if pci_epf_alloc_space() increased the allocation size to
accommodate iATU alignment requirements, it previously saved the larger
aligned size in .size, which broke pci_epc_set_bar().

To solve this, keep the fixed BAR size in .size and save the aligned size
in a new .aligned_size for use when deallocating it.

Fixes: 2a9a801620 ("PCI: endpoint: Add support to specify alignment for buffers allocated to BARs")
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
[mani: commit message fixup]
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
[bhelgaas: more specific subject, commit log, wrap comment to match file]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
Link: https://patch.msgid.link/20250424-pci-ep-size-alignment-v5-1-2d4ec2af23f5@baylibre.com
2025-05-28 16:15:40 -05:00
Johan Hovold
46bc169f6f arm64: Kconfig: switch to HAVE_PWRCTRL
The HAVE_PWRCTRL symbol has been renamed to reflect the pwrctrl framework
name. Switch to the non-deprecated symbol.

Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Link: https://patch.msgid.link/20250402132634.18065-5-johan+linaro@kernel.org
2025-05-23 15:23:18 -05:00
Johan Hovold
d5fc190934 wifi: ath12k: switch to PCI_PWRCTRL_PWRSEQ
The PCI_PWRCTRL_PWRSEQ and HAVE_PWRCTRL symbols have been renamed to
reflect the pwrctrl framework name. Switch to the non-deprecated symbols.

Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Jeff Johnson <jjohnson@kernel.org> # drivers/net/wireless/ath/...
Acked-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Acked-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Link: https://patch.msgid.link/20250402132634.18065-4-johan+linaro@kernel.org
2025-05-23 15:23:12 -05:00
Johan Hovold
52ddd0265b wifi: ath11k: switch to PCI_PWRCTRL_PWRSEQ
The PCI_PWRCTRL_PWRSEQ and HAVE_PWRCTRL symbols have been renamed to
reflect the pwrctrl framework name. Switch to the non-deprecated symbols.

Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Jeff Johnson <jjohnson@kernel.org> # drivers/net/wireless/ath/...
Acked-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Acked-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Link: https://patch.msgid.link/20250402132634.18065-3-johan+linaro@kernel.org
2025-05-23 15:23:07 -05:00
Johan Hovold
13bbf6a5f0 PCI/pwrctrl: Rename pwrctrl Kconfig symbols and slot module
Commits b88cbaaa6f ("PCI/pwrctrl: Rename pwrctl files to pwrctrl") and
3f925cd628 ("PCI/pwrctrl: Rename pwrctrl functions and structures")
renamed the "pwrctl" framework to "pwrctrl" for consistency reasons.

Rename also the Kconfig symbols so that they reflect the new name while
adding entries for the deprecated ones. The old symbols can be removed once
everything that depends on them has been updated.

Note that no deprecated symbol is added for the new slot driver to avoid
having to add a user visible option.

Rename the new slot module to reflect the framework name and match the
other pwrctrl modules.

Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Acked-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Link: https://patch.msgid.link/20250402132634.18065-2-johan+linaro@kernel.org
2025-05-23 15:22:42 -05:00
Brian Norris
8b926f2377 PCI/pwrctrl: Cancel outstanding rescan work when unregistering
It's possible to trigger use-after-free here by:

  (a) forcing rescan_work_func() to take a long time and
  (b) utilizing a pwrctrl driver that may be unloaded for some reason

Cancel outstanding work to ensure it is finished before we allow our data
structures to be cleaned up.

[bhelgaas: tidy commit log]
Fixes: 8f62819aaa ("PCI/pwrctl: Rescan bus on a separate thread")
Signed-off-by: Brian Norris <briannorris@google.com>
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Acked-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Cc: Konrad Dybcio <konradybcio@kernel.org>
Link: https://patch.msgid.link/20250409115313.1.Ia319526ed4ef06bec3180378c9a008340cec9658@changeid
2025-05-23 15:22:25 -05:00
Jon Pan-Doh
b4fe7398de PCI/AER: Add sysfs attributes for log ratelimits
Allow userspace to read/write log ratelimits per device (including
enable/disable). Create aer/ sysfs directory to store them and any
future AER configs.

The new sysfs files are:

  /sys/bus/pci/devices/*/aer/correctable_ratelimit_burst
  /sys/bus/pci/devices/*/aer/correctable_ratelimit_interval_ms
  /sys/bus/pci/devices/*/aer/nonfatal_ratelimit_burst
  /sys/bus/pci/devices/*/aer/nonfatal_ratelimit_interval_ms

The default values are ratelimit_burst=10, ratelimit_interval_ms=5000, so
if we try to emit more than 10 messages in a 5 second period, some are
suppressed.

Update AER sysfs ABI filename to reflect the broader scope of AER sysfs
attributes (e.g. stats and ratelimits).

  Documentation/ABI/testing/sysfs-bus-pci-devices-aer_stats ->
    sysfs-bus-pci-devices-aer

Tested using aer-inject[1]. Configured correctable log ratelimit to 5.
Sent 6 AER errors. Observed 5 errors logged while AER stats
(cat /sys/bus/pci/devices/<dev>/aer_dev_correctable) shows 6.

Disabled ratelimiting and sent 6 more AER errors. Observed all 6 errors
logged and accounted in AER stats (12 total errors).

[1] https://git.kernel.org/pub/scm/linux/kernel/git/gong.chen/aer-inject.git

[bhelgaas: note fatal errors are not ratelimited, "aer_report" ->
"aer_info", replace ratelimit_log_enable toggle with *_ratelimit_interval_ms]

Signed-off-by: Karolina Stolarek <karolina.stolarek@oracle.com>
Signed-off-by: Jon Pan-Doh <pandoh@google.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: https://patch.msgid.link/20250522232339.1525671-21-helgaas@kernel.org
2025-05-23 11:11:45 -05:00
Jon Pan-Doh
24816cc298 PCI/AER: Add ratelimits to PCI AER Documentation
Add ratelimits section for rationale and defaults.

[bhelgaas: note fatal errors are not ratelimited]

Signed-off-by: Karolina Stolarek <karolina.stolarek@oracle.com>
Signed-off-by: Jon Pan-Doh <pandoh@google.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Acked-by: Paul E. McKenney <paulmck@kernel.org>
Link: https://patch.msgid.link/20250522232339.1525671-20-helgaas@kernel.org
2025-05-23 11:11:45 -05:00
Jon Pan-Doh
a57f2bfb4a PCI/AER: Ratelimit correctable and non-fatal error logging
Spammy devices can flood kernel logs with AER errors and slow/stall
execution. Add per-device ratelimits for AER correctable and non-fatal
uncorrectable errors that use the kernel defaults (10 per 5s).  Logging of
fatal errors is not ratelimited.

There are two AER logging entry points:

  - aer_print_error() is used by DPC and native AER

  - pci_print_aer() is used by GHES and CXL

The native AER aer_print_error() case includes a loop that may log details
from multiple devices, which are ratelimited individually.  If we log
details for any device, we also log the Error Source ID from the Root Port
or RCEC.

If no such device details are found, we still log the Error Source from the
ERR_* Message, ratelimited by the Root Port or RCEC that received it.

The DPC aer_print_error() case is not ratelimited, since this only happens
for fatal errors.

The CXL pci_print_aer() case is ratelimited by the Error Source device.

The GHES pci_print_aer() case is via aer_recover_work_func(), which
searches for the Error Source device.  If the device is not found, there's
no per-device ratelimit, so we use a system-wide ratelimit that covers all
error types (correctable, non-fatal, and fatal).

Sargun at Meta reported internally that a flood of AER errors causes RCU
CPU stall warnings and CSD-lock warnings.

Tested using aer-inject[1]. Sent 11 AER errors. Observed 10 errors logged
while AER stats (cat /sys/bus/pci/devices/<dev>/aer_dev_correctable) show
true count of 11.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/gong.chen/aer-inject.git

[bhelgaas: commit log, factor out trace_aer_event() and aer_print_rp_info()
changes to previous patches, enable Error Source logging if any downstream
detail will be printed, don't ratelimit fatal errors, "aer_report" ->
"aer_info", "cor_log_ratelimit" -> "correctable_ratelimit",
"uncor_log_ratelimit" -> "nonfatal_ratelimit"]

Reported-by: Sargun Dhillon <sargun@meta.com>
Signed-off-by: Jon Pan-Doh <pandoh@google.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: https://patch.msgid.link/20250522232339.1525671-19-helgaas@kernel.org
2025-05-23 11:11:45 -05:00
Bjorn Helgaas
d72bae4230 PCI/AER: Simplify add_error_device()
Return -ENOSPC error early so the usual path through add_error_device() is
the straightline code.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: https://patch.msgid.link/20250522232339.1525671-18-helgaas@kernel.org
2025-05-23 11:11:45 -05:00
Bjorn Helgaas
94bc15c348 PCI/AER: Convert aer_get_device_error_info(), aer_print_error() to index
Previously aer_get_device_error_info() and aer_print_error() took a pointer
to struct aer_err_info and a pointer to a pci_dev.  Typically the pci_dev
was one of the elements of the aer_err_info.dev[] array (DPC was an
exception, where the dev[] array was unused).

Convert aer_get_device_error_info() and aer_print_error() to take an index
into the aer_err_info.dev[] array instead.  A future patch will add
per-device ratelimit information, so the index makes it convenient to find
the ratelimit associated with the device.

To accommodate DPC, set info->dev[0] to the DPC port before using these
interfaces.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: https://patch.msgid.link/20250522232339.1525671-17-helgaas@kernel.org
2025-05-23 11:11:36 -05:00