Documentation/ABI/testing/sysfs-devices-system-cpu +7 −6

@@ -513,17 +513,18 @@ Description:    information about CPUs heterogeneity.
                 cpu_capacity: capacity of cpuX.

 What:           /sys/devices/system/cpu/vulnerabilities
+                /sys/devices/system/cpu/vulnerabilities/gather_data_sampling
+                /sys/devices/system/cpu/vulnerabilities/itlb_multihit
+                /sys/devices/system/cpu/vulnerabilities/l1tf
+                /sys/devices/system/cpu/vulnerabilities/mds
                 /sys/devices/system/cpu/vulnerabilities/meltdown
+                /sys/devices/system/cpu/vulnerabilities/mmio_stale_data
+                /sys/devices/system/cpu/vulnerabilities/retbleed
+                /sys/devices/system/cpu/vulnerabilities/spec_store_bypass
                 /sys/devices/system/cpu/vulnerabilities/spectre_v1
                 /sys/devices/system/cpu/vulnerabilities/spectre_v2
-                /sys/devices/system/cpu/vulnerabilities/spec_store_bypass
-                /sys/devices/system/cpu/vulnerabilities/l1tf
-                /sys/devices/system/cpu/vulnerabilities/mds
                 /sys/devices/system/cpu/vulnerabilities/srbds
                 /sys/devices/system/cpu/vulnerabilities/tsx_async_abort
-                /sys/devices/system/cpu/vulnerabilities/itlb_multihit
-                /sys/devices/system/cpu/vulnerabilities/mmio_stale_data
-                /sys/devices/system/cpu/vulnerabilities/retbleed
 Date:           January 2018
 Contact:        Linux kernel mailing list <linux-kernel@vger.kernel.org>
 Description:    Information about CPU vulnerabilities
Documentation/admin-guide/hw-vuln/gather_data_sampling.rst 0 → 100644 +109 −0

.. SPDX-License-Identifier: GPL-2.0

GDS - Gather Data Sampling
==========================

Gather Data Sampling is a hardware vulnerability which allows unprivileged
speculative access to data which was previously stored in vector registers.

Problem
-------
When a gather instruction performs loads from memory, different data elements
are merged into the destination vector register. However, when a gather
instruction that is transiently executed encounters a fault, stale data from
architectural or internal vector registers may get transiently forwarded to the
destination vector register instead. This allows a malicious attacker to infer
stale data using typical side channel techniques like cache timing attacks.
GDS is a purely sampling-based attack.

The attacker uses gather instructions to infer the stale vector register data.
The victim does not need to do anything special other than use the vector
registers. The victim does not need to use gather instructions to be
vulnerable.

Because the buffers are shared between Hyper-Threads, cross Hyper-Thread attacks
are possible.

Attack scenarios
----------------
Without mitigation, GDS can infer stale data across virtually all
permission boundaries:

- Non-enclaves can infer SGX enclave data
- Userspace can infer kernel data
- Guests can infer data from hosts
- Guests can infer data from other guests
- Users can infer data from other users

Because of this, it is important to ensure that the mitigation stays enabled in
lower-privilege contexts like guests and when running outside SGX enclaves.

The hardware enforces the mitigation for SGX. Likewise, VMMs should ensure
that guests are not allowed to disable the GDS mitigation. If a host erred and
allowed this, a guest could theoretically disable GDS mitigation, mount an
attack, and re-enable it.

Mitigation mechanism
--------------------
This issue is mitigated in microcode. The microcode defines the following new
bits:

  ================================   ===   ============================
  IA32_ARCH_CAPABILITIES[GDS_CTRL]   R/O   Enumerates GDS vulnerability
                                           and mitigation support.
  IA32_ARCH_CAPABILITIES[GDS_NO]     R/O   Processor is not vulnerable.
  IA32_MCU_OPT_CTRL[GDS_MITG_DIS]    R/W   Disables the mitigation.
                                           0 by default.
  IA32_MCU_OPT_CTRL[GDS_MITG_LOCK]   R/W   Locks GDS_MITG_DIS=0. Writes
                                           to GDS_MITG_DIS are ignored.
                                           Can't be cleared once set.
  ================================   ===   ============================

GDS can also be mitigated on systems that don't have updated microcode by
disabling AVX. This can be done by setting "gather_data_sampling=force" or
"clearcpuid=avx" on the kernel command line.

If used, these options will disable AVX use by turning off XSAVE YMM support.
However, the processor will still enumerate AVX support. Userspace that
does not follow proper AVX enumeration to check both AVX *and* XSAVE YMM
support will break.

Mitigation control on the kernel command line
---------------------------------------------
The mitigation can be disabled by setting "gather_data_sampling=off" or
"mitigations=off" on the kernel command line. Not specifying either will default
to the mitigation being enabled. Specifying "gather_data_sampling=force" will
use the microcode mitigation when available or disable AVX on affected systems
where the microcode hasn't been updated to include the mitigation.

GDS System Information
----------------------
The kernel provides vulnerability status information through sysfs. For
GDS this can be accessed by the following sysfs file:

/sys/devices/system/cpu/vulnerabilities/gather_data_sampling

The possible values contained in this file are:

  ======================================= =============================================
  Not affected                            Processor not vulnerable.
  Vulnerable                              Processor vulnerable and mitigation disabled.
  Vulnerable: No microcode                Processor vulnerable and microcode is missing
                                          mitigation.
  Mitigation: AVX disabled, no microcode  Processor is vulnerable and microcode is
                                          missing mitigation. AVX disabled as
                                          mitigation.
  Mitigation: Microcode                   Processor is vulnerable and mitigation is in
                                          effect.
  Mitigation: Microcode (locked)          Processor is vulnerable and mitigation is in
                                          effect and cannot be disabled.
  Unknown: Dependent on hypervisor status Running on a virtual guest processor that is
                                          affected but with no way to know if host
                                          processor is mitigated or vulnerable.
  ======================================= =============================================

GDS Default mitigation
----------------------
The updated microcode will enable the mitigation by default. The kernel's
default action is to leave the mitigation enabled.
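
The status strings listed above lend themselves to a simple prefix check. The
following is a minimal sketch, not part of the patch, of how a monitoring
script might read and bucket the GDS status. The function names and the
bucket labels ("mitigated", "vulnerable" and so on) are made up for
illustration; only the sysfs path and the status strings come from the text
above.

```python
from pathlib import Path
from typing import Optional

# sysfs file named in the documentation above.
GDS_SYSFS = Path("/sys/devices/system/cpu/vulnerabilities/gather_data_sampling")


def classify_gds_status(text: str) -> str:
    """Bucket a GDS status string; the bucket labels are hypothetical."""
    status = text.strip()
    if status == "Not affected":
        return "not-affected"
    if status.startswith("Mitigation:"):
        return "mitigated"
    if status.startswith("Vulnerable"):
        return "vulnerable"
    if status.startswith("Unknown"):
        return "unknown"
    # Anything not listed in the table above.
    return "unrecognized"


def read_gds_status(path: Path = GDS_SYSFS) -> Optional[str]:
    """Return the raw file contents, or None if the file cannot be read
    (for example on a kernel that predates this sysfs entry)."""
    try:
        return path.read_text()
    except OSError:
        return None


if __name__ == "__main__":
    raw = read_gds_status()
    if raw is None:
        print("gather_data_sampling status file not available")
    else:
        print(raw.strip(), "->", classify_gds_status(raw))
```

Note that "Mitigation: AVX disabled, no microcode" lands in the "mitigated"
bucket here; a stricter policy might want to treat it separately, since the
microcode is still missing in that state.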
Documentation/admin-guide/hw-vuln/index.rst +2 −0

@@ -19,3 +19,5 @@ are configurable at compile, boot or run time.
    l1d_flush.rst
    processor_mmio_stale_data.rst
    cross-thread-rsb.rst
+   srso
+   gather_data_sampling.rst
Documentation/admin-guide/hw-vuln/srso.rst 0 → 100644 +133 −0

.. SPDX-License-Identifier: GPL-2.0

Speculative Return Stack Overflow (SRSO)
========================================

This is a mitigation for the speculative return stack overflow (SRSO)
vulnerability found on AMD processors. The mechanism is by now the well-known
scenario of poisoning CPU functional units - the Branch Target
Buffer (BTB) and Return Address Predictor (RAP) in this case - and then
tricking the elevated privilege domain (the kernel) into leaking
sensitive data.

AMD CPUs predict RET instructions using a Return Address Predictor (aka
Return Address Stack/Return Stack Buffer). In some cases, a non-architectural
CALL instruction (i.e., an instruction predicted to be a CALL but is
not actually a CALL) can create an entry in the RAP which may be used
to predict the target of a subsequent RET instruction.

The specific circumstances that lead to this vary by microarchitecture,
but the concern is that an attacker can mis-train the CPU BTB to predict
non-architectural CALL instructions in kernel space and use this to
control the speculative target of a subsequent kernel RET, potentially
leading to information disclosure via a speculative side-channel.

The issue is tracked under CVE-2023-20569.

Affected processors
-------------------

AMD Zen, generations 1-4. That is, all families 0x17 and 0x19. Older
processors have not been investigated.

System information and options
------------------------------

First of all, it is required that the latest microcode be loaded for
mitigations to be effective.

The sysfs file showing SRSO mitigation status is:

  /sys/devices/system/cpu/vulnerabilities/spec_rstack_overflow

The possible values in this file are:

 - 'Not affected'               The processor is not vulnerable

 - 'Vulnerable: no microcode'   The processor is vulnerable, no
                                microcode extending IBPB functionality
                                to address the vulnerability has been
                                applied.

 - 'Mitigation: microcode'      Extended IBPB functionality microcode
                                patch has been applied. It does not
                                address User->Kernel and Guest->Host
                                transitions protection but it does
                                address User->User and VM->VM attack
                                vectors.

                                (spec_rstack_overflow=microcode)

 - 'Mitigation: safe RET'       Software-only mitigation. It complements
                                the extended IBPB microcode patch
                                functionality by addressing User->Kernel
                                and Guest->Host transitions protection.

                                Selected by default or by
                                spec_rstack_overflow=safe-ret

 - 'Mitigation: IBPB'           Similar protection as "safe RET" above
                                but employs an IBPB barrier on privilege
                                domain crossings (User->Kernel,
                                Guest->Host).

                                (spec_rstack_overflow=ibpb)

 - 'Mitigation: IBPB on VMEXIT' Mitigation addressing the cloud provider
                                scenario - the Guest->Host transitions
                                only.

                                (spec_rstack_overflow=ibpb-vmexit)

In order to exploit the vulnerability, an attacker needs to:

 - gain local access on the machine

 - break kASLR

 - find gadgets in the running kernel in order to use them in the exploit

 - potentially create and pin an additional workload on the sibling
   thread, depending on the microarchitecture (not necessary on fam 0x19)

 - run the exploit

Considering the performance implications of each mitigation type, the
default one is 'Mitigation: safe RET' which should take care of most
attack vectors, including the local User->Kernel one.

As always, users are advised to keep their systems up-to-date by
applying software updates regularly.

The default setting will be reevaluated when needed and especially when
new attack vectors appear.

As one can surmise, 'Mitigation: safe RET' does come at the cost of some
performance depending on the workload. If one trusts one's userspace
and does not want to suffer the performance impact, one can always
disable the mitigation with spec_rstack_overflow=off.

Similarly, 'Mitigation: IBPB' is another full mitigation type employing
an indirect branch prediction barrier after having applied the required
microcode patch for one's system. This mitigation also comes at
a performance cost.

Mitigation: safe RET
--------------------

The mitigation works by ensuring all RET instructions speculate to
a controlled location, similar to how speculation is controlled in the
retpoline sequence. To accomplish this, the __x86_return_thunk forces
the CPU to mispredict every function return using a 'safe return'
sequence.

To ensure the safety of this mitigation, the kernel must ensure that the
safe return sequence is itself free from attacker interference. In Zen3
and Zen4, this is accomplished by creating a BTB alias between the
untraining function srso_untrain_ret_alias() and the safe return
function srso_safe_ret_alias() which results in evicting a potentially
poisoned BTB entry and using that safe one for all function returns.

In older Zen1 and Zen2, this is accomplished using a reinterpretation
technique similar to the Retbleed one: srso_untrain_ret() and
srso_safe_ret().
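
The relationship between the spec_rstack_overflow= options and the sysfs
status strings described above can be summarized in a small lookup. This is a
minimal sketch, not part of the patch: the function names are hypothetical,
the "last occurrence wins" handling of repeated options is an assumption, and
the real status also depends on the CPU and the loaded microcode (for
instance, a system without the microcode patch may report
'Vulnerable: no microcode' regardless of the option).

```python
from typing import Optional

# Option -> status string, taken from the list of possible values above.
SRSO_OPTION_TO_STATUS = {
    "microcode": "Mitigation: microcode",
    "safe-ret": "Mitigation: safe RET",
    "ibpb": "Mitigation: IBPB",
    "ibpb-vmexit": "Mitigation: IBPB on VMEXIT",
}

# The text above states that safe RET is selected by default.
DEFAULT_OPTION = "safe-ret"


def srso_option(cmdline: str) -> str:
    """Extract the spec_rstack_overflow= value from a kernel command line.

    Assumption: if the option is repeated, the last occurrence wins.
    """
    option = DEFAULT_OPTION
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == "spec_rstack_overflow" and sep:
            option = value
    return option


def expected_srso_status(cmdline: str) -> Optional[str]:
    """Status string the documentation associates with the selected option.

    Returns None for 'off' or for any value not listed in the text above.
    """
    return SRSO_OPTION_TO_STATUS.get(srso_option(cmdline))


if __name__ == "__main__":
    for line in ("root=/dev/sda1 quiet",
                 "quiet spec_rstack_overflow=ibpb",
                 "spec_rstack_overflow=off"):
        print(repr(line), "->", expected_srso_status(line))
```

Comparing the returned string with the contents of
/sys/devices/system/cpu/vulnerabilities/spec_rstack_overflow is one way to
notice that a requested mitigation did not take effect.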
Documentation/admin-guide/kdump/vmcoreinfo.rst +6 −0

@@ -624,3 +624,9 @@ Used to get the correct ranges:
 * VMALLOC_START ~ VMALLOC_END : vmalloc() / ioremap() space.
 * VMEMMAP_START ~ VMEMMAP_END : vmemmap space, used for struct page array.
 * KERNEL_LINK_ADDR : start address of Kernel link and BPF
+
+va_kernel_pa_offset
+-------------------
+
+Indicates the offset between the kernel virtual and physical mappings.
+Used to translate virtual to physical addresses.
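
As an illustration of how a dump analysis tool might use va_kernel_pa_offset,
the sketch below translates a kernel virtual address to a physical one. It
assumes the offset is defined as "virtual minus physical" for the kernel
mapping, which the text above does not spell out; the function name and the
example addresses are hypothetical.

```python
MASK64 = (1 << 64) - 1  # keep results within a 64-bit address space


def kernel_virt_to_phys(vaddr: int, va_kernel_pa_offset: int) -> int:
    """Translate a kernel-mapping virtual address to a physical address.

    Assumption: va_kernel_pa_offset == virtual - physical, so the physical
    address is recovered by subtracting the offset (modulo 2**64).
    """
    return (vaddr - va_kernel_pa_offset) & MASK64


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    offset = 0xFFFFFFFF00000000
    vaddr = 0xFFFFFFFF80200000
    print(hex(kernel_virt_to_phys(vaddr, offset)))
```

The mask only matters when the subtraction wraps around, which can happen if
the offset is stored as an unsigned 64-bit value.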