Merge remote-tracking branch 'torvalds/master' into perf-tools-next (55b29050) · Commits · git / linux-net

Documentation/ABI/testing/sysfs-devices-system-cpu

+7 −6

Original line number	Diff line number	Diff line
		@@ -513,17 +513,18 @@ Description: information about CPUs heterogeneity.
		cpu_capacity: capacity of cpuX.

		What: /sys/devices/system/cpu/vulnerabilities
		+/sys/devices/system/cpu/vulnerabilities/gather_data_sampling
		+/sys/devices/system/cpu/vulnerabilities/itlb_multihit
		+/sys/devices/system/cpu/vulnerabilities/l1tf
		+/sys/devices/system/cpu/vulnerabilities/mds
		/sys/devices/system/cpu/vulnerabilities/meltdown
		+/sys/devices/system/cpu/vulnerabilities/mmio_stale_data
		+/sys/devices/system/cpu/vulnerabilities/retbleed
		+/sys/devices/system/cpu/vulnerabilities/spec_store_bypass
		/sys/devices/system/cpu/vulnerabilities/spectre_v1
		/sys/devices/system/cpu/vulnerabilities/spectre_v2
		-/sys/devices/system/cpu/vulnerabilities/spec_store_bypass
		-/sys/devices/system/cpu/vulnerabilities/l1tf
		-/sys/devices/system/cpu/vulnerabilities/mds
		/sys/devices/system/cpu/vulnerabilities/srbds
		/sys/devices/system/cpu/vulnerabilities/tsx_async_abort
		-/sys/devices/system/cpu/vulnerabilities/itlb_multihit
		-/sys/devices/system/cpu/vulnerabilities/mmio_stale_data
		-/sys/devices/system/cpu/vulnerabilities/retbleed
		Date: January 2018
		Contact: Linux kernel mailing list <linux-kernel@vger.kernel.org>
		Description: Information about CPU vulnerabilities
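
The files listed in this ABI entry can be read like any other sysfs attribute. The following is a minimal sketch, not part of the commit, that collects the status line of each file; the helper name ``read_vulnerabilities`` is hypothetical, and the function returns an empty mapping where the directory does not exist.

```python
from pathlib import Path

# Directory named in the ABI entry above.
VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def read_vulnerabilities(base=VULN_DIR):
    """Return {file name: status line} for every file under `base`.

    Hypothetical helper: files that cannot be read map to None, and a
    missing directory (non-Linux host, very old kernel) yields {}.
    """
    base = Path(base)
    report = {}
    if not base.is_dir():
        return report
    for entry in sorted(base.iterdir()):
        try:
            report[entry.name] = entry.read_text().strip()
        except OSError:
            report[entry.name] = None
    return report


if __name__ == "__main__":
    for name, status in read_vulnerabilities().items():
        print(f"{name}: {status}")
```

Which files appear depends on the running kernel; for example, ``gather_data_sampling`` is only present on kernels that include this change.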

Documentation/admin-guide/hw-vuln/gather_data_sampling.rst

0 → 100644

+109 −0

		.. SPDX-License-Identifier: GPL-2.0

		GDS - Gather Data Sampling
		==========================

		Gather Data Sampling is a hardware vulnerability which allows unprivileged
		speculative access to data which was previously stored in vector registers.

		Problem
		-------
		When a gather instruction performs loads from memory, different data elements
		are merged into the destination vector register. However, when a gather
		instruction that is transiently executed encounters a fault, stale data from
		architectural or internal vector registers may get transiently forwarded to the
		destination vector register instead. This allows an attacker to infer stale
		data using typical side-channel techniques such as cache timing
		attacks. GDS is a purely sampling-based attack.

		The attacker uses gather instructions to infer the stale vector register data.
		The victim does not need to do anything special other than use the vector
		registers. The victim does not need to use gather instructions to be
		vulnerable.

		Because the buffers are shared between Hyper-Threads, cross-Hyper-Thread
		attacks are possible.

		Attack scenarios
		----------------
		Without mitigation, GDS can infer stale data across virtually all
		permission boundaries:

		Non-enclaves can infer SGX enclave data
		Userspace can infer kernel data
		Guests can infer data from hosts
		Guests can infer data from other guests
		Users can infer data from other users

		Because of this, it is important to ensure that the mitigation stays enabled in
		lower-privilege contexts like guests and when running outside SGX enclaves.

		The hardware enforces the mitigation for SGX. Likewise, VMMs should ensure
		that guests are not allowed to disable the GDS mitigation. If a host erred and
		allowed this, a guest could theoretically disable GDS mitigation, mount an
		attack, and re-enable it.

		Mitigation mechanism
		--------------------
		This issue is mitigated in microcode. The microcode defines the following new
		bits:

		================================ === ============================
		IA32_ARCH_CAPABILITIES[GDS_CTRL] R/O Enumerates GDS vulnerability
		                                     and mitigation support.
		IA32_ARCH_CAPABILITIES[GDS_NO]   R/O Processor is not vulnerable.
		IA32_MCU_OPT_CTRL[GDS_MITG_DIS]  R/W Disables the mitigation.
		                                     0 by default.
		IA32_MCU_OPT_CTRL[GDS_MITG_LOCK] R/W Locks GDS_MITG_DIS=0. Writes
		                                     to GDS_MITG_DIS are ignored.
		                                     Can't be cleared once set.
		================================ === ============================
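
How the four bits combine can be sketched as a small decoder. This is an illustration only: the bit positions (25 and 26 in IA32_ARCH_CAPABILITIES, 4 and 5 in IA32_MCU_OPT_CTRL) are not stated in this document and are assumed here from the kernel's MSR definitions, and ``describe_gds`` is a hypothetical helper that takes already-read MSR values rather than reading MSRs itself.

```python
# ASSUMPTION: bit positions are not given in this document; they are
# taken to match the kernel's MSR definitions and should be verified.
ARCH_CAP_GDS_CTRL = 1 << 25   # IA32_ARCH_CAPABILITIES[GDS_CTRL]
ARCH_CAP_GDS_NO = 1 << 26     # IA32_ARCH_CAPABILITIES[GDS_NO]
GDS_MITG_DIS = 1 << 4         # IA32_MCU_OPT_CTRL[GDS_MITG_DIS]
GDS_MITG_LOCK = 1 << 5        # IA32_MCU_OPT_CTRL[GDS_MITG_LOCK]


def describe_gds(arch_cap, mcu_opt_ctrl):
    """Summarize GDS state from two raw MSR values (hypothetical helper)."""
    if arch_cap & ARCH_CAP_GDS_NO:
        return "not vulnerable"
    if not arch_cap & ARCH_CAP_GDS_CTRL:
        # Without GDS_CTRL the control bits are not enumerated, so the
        # MSR contents alone cannot tell us more.
        return "mitigation control not enumerated"
    if mcu_opt_ctrl & GDS_MITG_LOCK:
        # The lock forces GDS_MITG_DIS=0 and ignores later writes.
        return "mitigated (locked)"
    if mcu_opt_ctrl & GDS_MITG_DIS:
        return "mitigation disabled"
    return "mitigated"


if __name__ == "__main__":
    print(describe_gds(ARCH_CAP_GDS_CTRL, 0))
```

The lock check comes before the disable check because, per the table, a set lock bit pins GDS_MITG_DIS to 0.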

		GDS can also be mitigated on systems that don't have updated microcode by
		disabling AVX. This can be done by setting "gather_data_sampling=force" or
		"clearcpuid=avx" on the kernel command line.

		If used, these options will disable AVX use by turning off XSAVE YMM support.
		However, the processor will still enumerate AVX support. Userspace that
		does not follow proper AVX enumeration to check both AVX and XSAVE YMM
		support will break.
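
The enumeration sequence that userspace is expected to follow can be sketched as a pure function over values that the CPUID and XGETBV instructions return. Python cannot execute those instructions directly, so ``avx_usable`` is a hypothetical helper that takes CPUID leaf 1 ECX and XCR0 as integers; the bit positions are the architectural ones (ECX bit 27 OSXSAVE, ECX bit 28 AVX, XCR0 bit 1 SSE state, XCR0 bit 2 YMM state).

```python
CPUID_1_ECX_OSXSAVE = 1 << 27  # OS has enabled XSAVE/XGETBV
CPUID_1_ECX_AVX = 1 << 28      # processor enumerates AVX
XCR0_SSE = 1 << 1              # OS saves/restores XMM state
XCR0_YMM = 1 << 2              # OS saves/restores YMM state


def avx_usable(cpuid_1_ecx, xcr0):
    """Return True only if AVX is enumerated AND the OS enabled YMM state.

    Hypothetical helper: the caller supplies the raw register values.
    """
    if not cpuid_1_ecx & CPUID_1_ECX_AVX:
        return False
    if not cpuid_1_ecx & CPUID_1_ECX_OSXSAVE:
        return False  # XCR0 cannot be trusted without OSXSAVE
    needed = XCR0_SSE | XCR0_YMM
    return (xcr0 & needed) == needed


if __name__ == "__main__":
    ecx = CPUID_1_ECX_AVX | CPUID_1_ECX_OSXSAVE
    print(avx_usable(ecx, 0x7))  # YMM state enabled
    print(avx_usable(ecx, 0x3))  # YMM state turned off by the kernel
```

The second call models the situation described above: the processor still enumerates AVX, but XSAVE YMM support is off, so code that checks only the AVX bit would wrongly conclude that AVX is usable.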

		Mitigation control on the kernel command line
		---------------------------------------------
		The mitigation can be disabled by setting "gather_data_sampling=off" or
		"mitigations=off" on the kernel command line. Not specifying either will default
		to the mitigation being enabled. Specifying "gather_data_sampling=force" will
		use the microcode mitigation when available or disable AVX on affected systems
		where the microcode hasn't been updated to include the mitigation.
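
The option handling described above can be sketched as a small classifier over a command-line string such as the contents of ``/proc/cmdline``. This is a simplification, not the kernel's parser: ``gds_mode`` is a hypothetical name, and letting "mitigations=off" override "gather_data_sampling=force" when both are present is an assumption that this document does not spell out.

```python
def gds_mode(cmdline):
    """Classify the requested GDS mitigation mode (hypothetical helper).

    Returns "off", "force", or "default". ASSUMPTION: an "off" request
    wins over "force" if both appear on the same command line.
    """
    tokens = cmdline.split()
    if "mitigations=off" in tokens or "gather_data_sampling=off" in tokens:
        return "off"
    if "gather_data_sampling=force" in tokens:
        return "force"
    return "default"  # mitigation stays enabled


if __name__ == "__main__":
    print(gds_mode("root=/dev/sda1 ro quiet"))
    print(gds_mode("root=/dev/sda1 gather_data_sampling=force"))
```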

		GDS System Information
		------------------------
		The kernel provides vulnerability status information through sysfs. For
		GDS this can be accessed by the following sysfs file:

		/sys/devices/system/cpu/vulnerabilities/gather_data_sampling

		The possible values contained in this file are:

		======================================= =============================================
		Not affected                            Processor not vulnerable.
		Vulnerable                              Processor vulnerable and mitigation disabled.
		Vulnerable: No microcode                Processor vulnerable and microcode is missing
		                                        mitigation.
		Mitigation: AVX disabled, no microcode  Processor is vulnerable and microcode is
		                                        missing mitigation. AVX disabled as
		                                        mitigation.
		Mitigation: Microcode                   Processor is vulnerable and mitigation is in
		                                        effect.
		Mitigation: Microcode (locked)          Processor is vulnerable and mitigation is in
		                                        effect and cannot be disabled.
		Unknown: Dependent on hypervisor status Running on a virtual guest processor that is
		                                        affected but with no way to know if host
		                                        processor is mitigated or vulnerable.
		======================================= =============================================
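
A monitoring script might reduce these strings to a coarse category. The sketch below relies only on the prefixes of the values in the table above; ``classify_gds_status`` and its category names are hypothetical.

```python
def classify_gds_status(text):
    """Map a gather_data_sampling sysfs value to a coarse category.

    Hypothetical helper; categories are "safe", "mitigated",
    "vulnerable", "unknown", or "unrecognized".
    """
    status = text.strip()
    if status == "Not affected":
        return "safe"
    if status.startswith("Mitigation:"):
        return "mitigated"
    if status.startswith("Vulnerable"):
        return "vulnerable"
    if status.startswith("Unknown:"):
        return "unknown"
    return "unrecognized"


if __name__ == "__main__":
    path = "/sys/devices/system/cpu/vulnerabilities/gather_data_sampling"
    try:
        with open(path) as f:
            print(classify_gds_status(f.read()))
    except OSError:
        print("status file not available on this system")
```

Matching on prefixes rather than full strings keeps the sketch tolerant of values not listed here, which fall through to "unrecognized".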

		GDS Default mitigation
		----------------------
		The updated microcode will enable the mitigation by default. The kernel's
		default action is to leave the mitigation enabled.

Documentation/admin-guide/hw-vuln/index.rst

+2 −0

		@@ -19,3 +19,5 @@ are configurable at compile, boot or run time.
		l1d_flush.rst
		processor_mmio_stale_data.rst
		cross-thread-rsb.rst
		srso
		gather_data_sampling.rst

Documentation/admin-guide/hw-vuln/srso.rst

0 → 100644

+133 −0

		.. SPDX-License-Identifier: GPL-2.0

		Speculative Return Stack Overflow (SRSO)
		========================================

		This is a mitigation for the speculative return stack overflow (SRSO)
		vulnerability found on AMD processors. The mechanism is the by-now
		well-known scenario of poisoning CPU functional units - the Branch Target
		Buffer (BTB) and Return Address Predictor (RAP) in this case - and then
		tricking the elevated privilege domain (the kernel) into leaking
		sensitive data.

		AMD CPUs predict RET instructions using a Return Address Predictor (aka
		Return Address Stack/Return Stack Buffer). In some cases, a non-architectural
		CALL instruction (i.e., an instruction that is predicted to be a CALL but
		is not actually a CALL) can create an entry in the RAP which may be used
		to predict the target of a subsequent RET instruction.

		The specific circumstances that lead to this vary by microarchitecture
		but the concern is that an attacker can mis-train the CPU BTB to predict
		non-architectural CALL instructions in kernel space and use this to
		control the speculative target of a subsequent kernel RET, potentially
		leading to information disclosure via a speculative side-channel.

		The issue is tracked under CVE-2023-20569.

		Affected processors
		-------------------

		AMD Zen, generations 1-4. That is, all families 0x17 and 0x19. Older
		processors have not been investigated.
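
The family criterion above can be checked against ``/proc/cpuinfo``-style text. This is a rough sketch with hypothetical helper names; it assumes the x86 ``vendor_id`` and ``cpu family`` fields (the latter printed in decimal) and is no substitute for the sysfs status file, which reflects the kernel's own determination.

```python
def parse_cpuinfo(text):
    """Return (vendor_id, cpu_family) from /proc/cpuinfo-style text.

    Hypothetical helper: only the first occurrence of each field is
    used; a missing family is reported as 0.
    """
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key not in fields:
            fields[key] = value.strip()
    return fields.get("vendor_id"), int(fields.get("cpu family", "0"))


def srso_family_listed(vendor_id, cpu_family):
    """True for AMD families 0x17 and 0x19, the ones listed above."""
    return vendor_id == "AuthenticAMD" and cpu_family in (0x17, 0x19)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print(srso_family_listed(*parse_cpuinfo(f.read())))
    except OSError:
        print("/proc/cpuinfo not available")
```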

		System information and options
		------------------------------

		First of all, it is required that the latest microcode be loaded for
		mitigations to be effective.

		The sysfs file showing SRSO mitigation status is:

		/sys/devices/system/cpu/vulnerabilities/spec_rstack_overflow

		The possible values in this file are:

		 - 'Not affected'               The processor is not vulnerable

		 - 'Vulnerable: no microcode'   The processor is vulnerable, no
		                                microcode extending IBPB functionality
		                                to address the vulnerability has been
		                                applied.

		 - 'Mitigation: microcode'      Extended IBPB functionality microcode
		                                patch has been applied. It does not
		                                address User->Kernel and Guest->Host
		                                transitions protection but it does
		                                address User->User and VM->VM attack
		                                vectors.

		                                (spec_rstack_overflow=microcode)

		 - 'Mitigation: safe RET'       Software-only mitigation. It complements
		                                the extended IBPB microcode patch
		                                functionality by addressing User->Kernel
		                                and Guest->Host transitions protection.

		                                Selected by default or by
		                                spec_rstack_overflow=safe-ret

		 - 'Mitigation: IBPB'           Similar protection as "safe RET" above
		                                but employs an IBPB barrier on privilege
		                                domain crossings (User->Kernel,
		                                Guest->Host).

		                                (spec_rstack_overflow=ibpb)

		 - 'Mitigation: IBPB on VMEXIT' Mitigation addressing the cloud provider
		                                scenario - the Guest->Host transitions
		                                only.

		                                (spec_rstack_overflow=ibpb-vmexit)
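
The pairing between command-line options and status strings listed above can be written down as a lookup table. This sketch only restates the list; ``REQUESTED_STATUS`` and ``requested_status`` are hypothetical names, and the value actually reported by sysfs may differ from the requested one, for example when the required microcode is not loaded.

```python
# Option value -> status string it selects, as listed above.
REQUESTED_STATUS = {
    "microcode": "Mitigation: microcode",
    "safe-ret": "Mitigation: safe RET",
    "ibpb": "Mitigation: IBPB",
    "ibpb-vmexit": "Mitigation: IBPB on VMEXIT",
}

DEFAULT_OPTION = "safe-ret"  # "Selected by default" per the list above


def requested_status(cmdline):
    """Return the status string requested by spec_rstack_overflow=.

    Hypothetical helper: returns None for values not in the table, and
    uses the last occurrence if the option is repeated (an assumption).
    """
    option = DEFAULT_OPTION
    for token in cmdline.split():
        if token.startswith("spec_rstack_overflow="):
            option = token.split("=", 1)[1]
    return REQUESTED_STATUS.get(option)


if __name__ == "__main__":
    print(requested_status("ro quiet spec_rstack_overflow=ibpb"))
```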

		In order to exploit the vulnerability, an attacker needs to:

		- gain local access on the machine

		- break kASLR

		- find gadgets in the running kernel in order to use them in the exploit

		- potentially create and pin an additional workload on the sibling
		thread, depending on the microarchitecture (not necessary on fam 0x19)

		- run the exploit

		Considering the performance implications of each mitigation type, the
		default one is 'Mitigation: safe RET' which should take care of most
		attack vectors, including the local User->Kernel one.

		As always, the user is advised to keep their system up to date by
		applying software updates regularly.

		The default setting will be reevaluated when needed and especially when
		new attack vectors appear.

		As one can surmise, 'Mitigation: safe RET' does come at the cost of some
		performance depending on the workload. If one trusts their userspace
		and does not want to suffer the performance impact, one can always
		disable the mitigation with spec_rstack_overflow=off.

		Similarly, 'Mitigation: IBPB' is another full mitigation type employing
		an indirect branch prediction barrier after having applied the required
		microcode patch for one's system. This mitigation also comes at
		a performance cost.

		Mitigation: safe RET
		--------------------

		The mitigation works by ensuring all RET instructions speculate to
		a controlled location, similar to how speculation is controlled in the
		retpoline sequence. To accomplish this, the __x86_return_thunk forces
		the CPU to mispredict every function return using a 'safe return'
		sequence.

		To ensure the safety of this mitigation, the kernel must ensure that the
		safe return sequence is itself free from attacker interference. In Zen3
		and Zen4, this is accomplished by creating a BTB alias between the
		untraining function srso_untrain_ret_alias() and the safe return
		function srso_safe_ret_alias() which results in evicting a potentially
		poisoned BTB entry and using that safe one for all function returns.

		In the older Zen1 and Zen2, this is accomplished using a reinterpretation
		technique similar to the Retbleed one: srso_untrain_ret() and
		srso_safe_ret().

Documentation/admin-guide/kdump/vmcoreinfo.rst

+6 −0

		@@ -624,3 +624,9 @@ Used to get the correct ranges:
		* VMALLOC_START ~ VMALLOC_END : vmalloc() / ioremap() space.
		* VMEMMAP_START ~ VMEMMAP_END : vmemmap space, used for struct page array.
		* KERNEL_LINK_ADDR : start address of Kernel link and BPF

		va_kernel_pa_offset
		-------------------

		Indicates the offset between the kernel virtual and physical mappings.
		Used to translate virtual to physical addresses.
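
As a sketch of how a dump tool might use this value, the translation below subtracts the offset from a kernel virtual address. The direction of the subtraction, the 64-bit wraparound, and the sample numbers are assumptions for illustration; ``kernel_va_to_pa`` is a hypothetical name, and the formula is only meant for addresses inside the kernel mapping.

```python
MASK64 = (1 << 64) - 1  # ASSUMPTION: 64-bit addresses


def kernel_va_to_pa(va, va_kernel_pa_offset):
    """Translate a kernel-mapping virtual address to a physical address.

    ASSUMPTION: physical = virtual - va_kernel_pa_offset, truncated to
    64 bits. Hypothetical helper, not taken from any dump tool.
    """
    return (va - va_kernel_pa_offset) & MASK64


if __name__ == "__main__":
    # Made-up example values, not taken from a real vmcore.
    offset = 0xFFFFFFFF00000000
    print(hex(kernel_va_to_pa(0xFFFFFFFF80200000, offset)))
```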