Commit ab0e7f20 authored by Borislav Petkov (AMD)'s avatar Borislav Petkov (AMD)
Browse files

Documentation: Merge x86-specific boot options doc into kernel-parameters.txt



Documentation/arch/x86/x86_64/boot-options.rst is causing unnecessary
confusion by being a second place where one can put x86 boot options.
Move them into the main one.

Drop removed ones like "acpi=ht", while at it.

Signed-off-by: default avatarBorislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: default avatarSohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20241202190011.11979-1-bp@kernel.org
parent 525077ae
Loading
Loading
Loading
Loading
+0 −3
Original line number Diff line number Diff line
@@ -194,8 +194,6 @@ is applicable::
	WDT	Watchdog support is enabled.
	X86-32	X86-32, aka i386 architecture is enabled.
	X86-64	X86-64 architecture is enabled.
			More X86-64 boot options can be found in
			Documentation/arch/x86/x86_64/boot-options.rst.
	X86	Either 32-bit or 64-bit x86 (same as X86-32+X86-64)
	X86_UV	SGI UV support is enabled.
	XEN	Xen support is enabled
@@ -213,7 +211,6 @@ Do not modify the syntax of boot loader parameters without extreme
need or coordination with <Documentation/arch/x86/boot.rst>.

There are also arch-specific kernel-parameters not documented here.
See for example <Documentation/arch/x86/x86_64/boot-options.rst>.

Note that ALL kernel parameters listed below are CASE SENSITIVE, and that
a trailing = on the name of any parameter states that that parameter will
+225 −12
Original line number Diff line number Diff line
@@ -21,6 +21,10 @@
				strictly ACPI specification compliant.
			rsdt -- prefer RSDT over (default) XSDT
			copy_dsdt -- copy DSDT to memory
			nocmcff -- Disable firmware first mode for corrected
			errors. This disables parsing the HEST CMC error
			source to check if firmware has set the FF flag. This
			may result in duplicate corrected error reports.
			nospcr -- disable console in ACPI SPCR table as
				default _serial_ console on ARM64
			For ARM64, ONLY "acpi=off", "acpi=on", "acpi=force" or
@@ -405,6 +409,8 @@
			not play well with APC CPU idle - disable it if you have
			APC and your system crashes randomly.

	apic		[APIC,X86-64] Use IO-APIC. Default.

	apic=		[APIC,X86,EARLY] Advanced Programmable Interrupt Controller
			Change the output verbosity while booting
			Format: { quiet (default) | verbose | debug }
@@ -424,6 +430,10 @@
			      useful so that a dump capture kernel won't be
			      shot down by NMI

	apicpmtimer	Do APIC timer calibration using the pmtimer. Implies
			apicmaintimer. Useful when your PIT timer is totally
			broken.

	autoconf=	[IPV6]
			See Documentation/networking/ipv6.rst.

@@ -1726,6 +1736,8 @@

			off:	Disable GDS mitigation.

	gbpages		[X86] Use GB pages for kernel direct mappings.

	gcov_persist=	[GCOV] When non-zero (default), profiling data for
			kernel modules is saved and remains accessible via
			debugfs, even when the module is unloaded/reloaded.
@@ -2008,12 +2020,21 @@

	idle=		[X86,EARLY]
			Format: idle=poll, idle=halt, idle=nomwait
			Poll forces a polling idle loop that can slightly
			improve the performance of waking up a idle CPU, but
			will use a lot of power and make the system run hot.
			Not recommended.

			idle=poll:  Don't do power saving in the idle loop
			using HLT, but poll for rescheduling event. This will
			make the CPUs eat a lot more power, but may be useful
			to get slightly better performance in multiprocessor
			benchmarks. It also makes some profiling using
			performance counters more accurate.  Please note that
			on systems with MONITOR/MWAIT support (like Intel
			EM64T CPUs) this option has no performance advantage
			over the normal idle loop.  It may also interact badly
			with hyperthreading.

			idle=halt: Halt is forced to be used for CPU idle.
			In such case C2/C3 won't be used again.

			idle=nomwait: Disable mwait for CPU C-states

	idxd.sva=	[HW]
@@ -2311,20 +2332,73 @@
		relaxed

	iommu=		[X86,EARLY]

		off
			Don't initialize and use any kind of IOMMU.

		force
			Force the use of the hardware IOMMU even when
			it is not actually needed (e.g. because < 3 GB
			memory).

		noforce
			Don't force hardware IOMMU usage when it is not
			needed. (default).

		biomerge
		panic
		nopanic
		merge
		nomerge

		soft
		pt		[X86]
		nopt		[X86]
		nobypass	[PPC/POWERNV]
			Use software bounce buffering (SWIOTLB) (default for
			Intel machines). This can be used to prevent the usage
			of an available hardware IOMMU.

			[X86]
		pt
			[X86]
		nopt
			[PPC/POWERNV]
		nobypass
			Disable IOMMU bypass, using IOMMU for PCI devices.

		[X86]
		AMD Gart HW IOMMU-specific options:

		<size>
			Set the size of the remapping area in bytes.

		allowed
			Overwrite iommu off workarounds for specific chipsets

		fullflush
			Flush IOMMU on each allocation (default).

		nofullflush
			Don't use IOMMU fullflush.

		memaper[=<order>]
			Allocate an own aperture over RAM with size
			32MB<<order.  (default: order=1, i.e. 64MB)

		merge
			Do scatter-gather (SG) merging. Implies "force"
			(experimental).

		nomerge
			Don't do scatter-gather (SG) merging.

		noaperture
			Ask the IOMMU not to touch the aperture for AGP.

		noagp
			Don't initialize the AGP driver and use full aperture.

		panic
			Always panic when IOMMU overflows.

	iommu.forcedac=	[ARM64,X86,EARLY] Control IOVA allocation for PCI devices.
			Format: { "0" | "1" }
			0 - Try to allocate a 32-bit DMA address first, before
@@ -3259,9 +3333,77 @@
			devices can be requested on-demand with the
			/dev/loop-control interface.

	mce		[X86-32] Machine Check Exception
	mce=		[X86-{32,64}]

			Please see Documentation/arch/x86/x86_64/machinecheck.rst for sysfs runtime tunables.

		off
			disable machine check

		no_cmci
			disable CMCI(Corrected Machine Check Interrupt) that
			Intel processor supports.  Usually this disablement is
			not recommended, but it might be handy if your
			hardware is misbehaving.

			Note that you'll get more problems without CMCI than
			with due to the shared banks, i.e. you might get
			duplicated error logs.

		dont_log_ce
			don't make logs for corrected errors.  All events
			reported as corrected are silently cleared by OS. This
			option will be useful if you have no interest in any
			of corrected errors.

		ignore_ce
			disable features for corrected errors, e.g.
			polling timer and CMCI.  All events reported as
			corrected are not cleared by OS and remained in its
			error banks.

			Usually this disablement is not recommended, however
			if there is an agent checking/clearing corrected
			errors (e.g. BIOS or hardware monitoring
			applications), conflicting with OS's error handling,
			and you cannot deactivate the agent, then this option
			will be a help.

		no_lmce
			do not opt-in to Local MCE delivery. Use legacy method
			to broadcast MCEs.

		bootlog
			enable logging of machine checks left over from
			booting. Disabled by default on AMD Fam10h and older
			because some BIOS leave bogus ones.

			If your BIOS doesn't do that it's a good idea to
			enable though to make sure you log even machine check
			events that result in a reboot. On Intel systems it is
			enabled by default.

		nobootlog
			disable boot machine check logging.

		monarchtimeout (number)
			sets the time in us to wait for other CPUs on machine
			checks. 0 to disable.

		bios_cmci_threshold
			don't overwrite the bios-set CMCI threshold. This boot
			option prevents Linux from overwriting the CMCI
			threshold set by the bios.  Without this option, Linux
			always sets the CMCI threshold to 1. Enabling this may
			make memory predictive failure analysis less effective
			if the bios sets thresholds for memory errors since we
			will not see details for all errors.

		recovery
			force-enable recoverable machine check code paths

			Everything else is in sysfs now.

	mce=option	[X86-64] See Documentation/arch/x86/x86_64/boot-options.rst

	md=		[HW] RAID subsystems devices and level
			See Documentation/admin-guide/md.rst.
@@ -3887,6 +4029,8 @@
	noapic		[SMP,APIC,EARLY] Tells the kernel to not make use of any
			IOAPICs that may be present in the system.

	noapictimer	[APIC,X86] Don't set up the APIC timer

	noautogroup	Disable scheduler automatic task group creation.

	nocache		[ARM,EARLY]
@@ -3934,6 +4078,8 @@
			register save and restore. The kernel will only save
			legacy floating-point registers on task switch.

	nogbpages	[X86] Do not use GB pages for kernel direct mappings.

	no_hash_pointers
			[KNL,EARLY]
			Force pointers printed to the console or buffers to be
@@ -3960,6 +4106,8 @@
			the impact of the sleep instructions. This is also
			useful when using JTAG debugger.

	nohpet		[X86] Don't use the HPET timer.

	nohugeiomap	[KNL,X86,PPC,ARM64,EARLY] Disable kernel huge I/O mappings.

	nohugevmalloc	[KNL,X86,PPC,ARM64,EARLY] Disable kernel huge vmalloc mappings.
@@ -4111,8 +4259,10 @@

	nosync		[HW,M68K] Disables sync negotiation for all devices.

	no_timer_check	[X86,APIC] Disables the code which tests for
			broken timer IRQ sources.
	no_timer_check	[X86,APIC] Disables the code which tests for broken
			timer IRQ sources, i.e., the IO-APIC timer. This can
			work around problems with incorrect timer
			initialization on some boards.

	no_uaccess_flush
	                [PPC,EARLY] Don't flush the L1-D cache after accessing user data.
@@ -4192,6 +4342,11 @@
			If given as an integer followed by 'U', it will
			divide each physical node into N emulated nodes.

	numa=noacpi	[X86] Don't parse the SRAT table for NUMA setup

	numa=nohmat	[X86] Don't parse the HMAT table for NUMA setup, or
			soft-reserved memory partitioning.

	numa_balancing=	[KNL,ARM64,PPC,RISCV,S390,X86] Enable or disable automatic
			NUMA balancing.
			Allowed values are enable and disable
@@ -5710,6 +5865,55 @@
			      reboot_cpu is s[mp]#### with #### being the processor
					to be used for rebooting.

		acpi
			Use the ACPI RESET_REG in the FADT. If ACPI is not
			configured or the ACPI reset does not work, the reboot
			path attempts the reset using the keyboard controller.

		bios
			Use the CPU reboot vector for warm reset

		cold
			Set the cold reboot flag

		default
			There are some built-in platform specific "quirks"
			- you may see: "reboot: <name> series board detected.
			Selecting <type> for reboots." In the case where you
			think the quirk is in error (e.g. you have newer BIOS,
			or newer board) using this option will ignore the
			built-in quirk table, and use the generic default
			reboot actions.

		efi
			Use efi reset_system runtime service. If EFI is not
			configured or the EFI reset does not work, the reboot
			path attempts the reset using the keyboard controller.

		force
			Don't stop other CPUs on reboot. This can make reboot
			more reliable in some cases.

		kbd
			Use the keyboard controller. cold reset (default)

		pci
			Use a write to the PCI config space register 0xcf9 to
			trigger reboot.

		triple
			Force a triple fault (init)

		warm
			Don't set the cold reboot flag

			Using warm reset will be much faster especially on big
			memory systems because the BIOS will not go through
			the memory check.  Disadvantage is that not all
			hardware will be completely reinitialized on reboot so
			there may be boot problems on some systems.


	refscale.holdoff= [KNL]
			Set test-start holdoff period.  The purpose of
			this parameter is to delay the start of the
@@ -6101,7 +6305,16 @@

	serialnumber	[BUGS=X86-32]

	sev=option[,option...] [X86-64] See Documentation/arch/x86/x86_64/boot-options.rst
	sev=option[,option...] [X86-64]

		debug
			Enable debug messages.

		nosnp
			Do not enable SEV-SNP (applies to host/hypervisor
			only). Setting 'nosnp' avoids the RMP check overhead
			in memory accesses when users do not want to run
			SEV-SNP guests.

	shapers=	[NET]
			Maximal number of shapers.
+0 −312
Original line number Diff line number Diff line
.. SPDX-License-Identifier: GPL-2.0

===========================
AMD64 Specific Boot Options
===========================

There are many others (usually documented in driver documentation), but
only the AMD64 specific ones are listed here.

Machine check
=============
Please see Documentation/arch/x86/x86_64/machinecheck.rst for sysfs runtime tunables.

   mce=off
		Disable machine check
   mce=no_cmci
		Disable CMCI(Corrected Machine Check Interrupt) that
		Intel processor supports.  Usually this disablement is
		not recommended, but it might be handy if your hardware
		is misbehaving.
		Note that you'll get more problems without CMCI than with
		due to the shared banks, i.e. you might get duplicated
		error logs.
   mce=dont_log_ce
		Don't make logs for corrected errors.  All events reported
		as corrected are silently cleared by OS.
		This option will be useful if you have no interest in any
		of corrected errors.
   mce=ignore_ce
		Disable features for corrected errors, e.g. polling timer
		and CMCI.  All events reported as corrected are not cleared
		by OS and remained in its error banks.
		Usually this disablement is not recommended, however if
		there is an agent checking/clearing corrected errors
		(e.g. BIOS or hardware monitoring applications), conflicting
		with OS's error handling, and you cannot deactivate the agent,
		then this option will be a help.
   mce=no_lmce
		Do not opt-in to Local MCE delivery. Use legacy method
		to broadcast MCEs.
   mce=bootlog
		Enable logging of machine checks left over from booting.
		Disabled by default on AMD Fam10h and older because some BIOS
		leave bogus ones.
		If your BIOS doesn't do that it's a good idea to enable though
		to make sure you log even machine check events that result
		in a reboot. On Intel systems it is enabled by default.
   mce=nobootlog
		Disable boot machine check logging.
   mce=monarchtimeout (number)
		monarchtimeout:
		Sets the time in us to wait for other CPUs on machine checks. 0
		to disable.
   mce=bios_cmci_threshold
		Don't overwrite the bios-set CMCI threshold. This boot option
		prevents Linux from overwriting the CMCI threshold set by the
		bios. Without this option, Linux always sets the CMCI
		threshold to 1. Enabling this may make memory predictive failure
		analysis less effective if the bios sets thresholds for memory
		errors since we will not see details for all errors.
   mce=recovery
		Force-enable recoverable machine check code paths

   nomce (for compatibility with i386)
		same as mce=off

   Everything else is in sysfs now.

APICs
=====

   apic
	Use IO-APIC. Default

   noapic
	Don't use the IO-APIC.

   disableapic
	Don't use the local APIC

   nolapic
     Don't use the local APIC (alias for i386 compatibility)

   pirq=...
	See Documentation/arch/x86/i386/IO-APIC.rst

   noapictimer
	Don't set up the APIC timer

   no_timer_check
	Don't check the IO-APIC timer. This can work around
	problems with incorrect timer initialization on some boards.

   apicpmtimer
	Do APIC timer calibration using the pmtimer. Implies
	apicmaintimer. Useful when your PIT timer is totally broken.

Timing
======

  notsc
    Deprecated, use tsc=unstable instead.

  nohpet
    Don't use the HPET timer.

Idle loop
=========

  idle=poll
    Don't do power saving in the idle loop using HLT, but poll for rescheduling
    event. This will make the CPUs eat a lot more power, but may be useful
    to get slightly better performance in multiprocessor benchmarks. It also
    makes some profiling using performance counters more accurate.
    Please note that on systems with MONITOR/MWAIT support (like Intel EM64T
    CPUs) this option has no performance advantage over the normal idle loop.
    It may also interact badly with hyperthreading.

Rebooting
=========

   reboot=b[ios] | t[riple] | k[bd] | a[cpi] | e[fi] | p[ci] [, [w]arm | [c]old]
      bios
        Use the CPU reboot vector for warm reset
      warm
        Don't set the cold reboot flag
      cold
        Set the cold reboot flag
      triple
        Force a triple fault (init)
      kbd
        Use the keyboard controller. cold reset (default)
      acpi
        Use the ACPI RESET_REG in the FADT. If ACPI is not configured or
        the ACPI reset does not work, the reboot path attempts the reset
        using the keyboard controller.
      efi
        Use efi reset_system runtime service. If EFI is not configured or
        the EFI reset does not work, the reboot path attempts the reset using
        the keyboard controller.
      pci
        Use a write to the PCI config space register 0xcf9 to trigger reboot.

   Using warm reset will be much faster especially on big memory
   systems because the BIOS will not go through the memory check.
   Disadvantage is that not all hardware will be completely reinitialized
   on reboot so there may be boot problems on some systems.

   reboot=force
     Don't stop other CPUs on reboot. This can make reboot more reliable
     in some cases.

   reboot=default
     There are some built-in platform specific "quirks" - you may see:
     "reboot: <name> series board detected. Selecting <type> for reboots."
     In the case where you think the quirk is in error (e.g. you have
     newer BIOS, or newer board) using this option will ignore the built-in
     quirk table, and use the generic default reboot actions.

NUMA
====

  numa=off
    Only set up a single NUMA node spanning all memory.

  numa=noacpi
    Don't parse the SRAT table for NUMA setup

  numa=nohmat
    Don't parse the HMAT table for NUMA setup, or soft-reserved memory
    partitioning.

ACPI
====

  acpi=off
    Don't enable ACPI
  acpi=ht
    Use ACPI boot table parsing, but don't enable ACPI interpreter
  acpi=force
    Force ACPI on (currently not needed)
  acpi=strict
    Disable out of spec ACPI workarounds.
  acpi_sci={edge,level,high,low}
    Set up ACPI SCI interrupt.
  acpi=noirq
    Don't route interrupts
  acpi=nocmcff
    Disable firmware first mode for corrected errors. This
    disables parsing the HEST CMC error source to check if
    firmware has set the FF flag. This may result in
    duplicate corrected error reports.

PCI
===

  pci=off
    Don't use PCI
  pci=conf1
    Use conf1 access.
  pci=conf2
    Use conf2 access.
  pci=rom
    Assign ROMs.
  pci=assign-busses
    Assign busses
  pci=irqmask=MASK
    Set PCI interrupt mask to MASK
  pci=lastbus=NUMBER
    Scan up to NUMBER busses, no matter what the mptable says.
  pci=noacpi
    Don't use ACPI to set up PCI interrupt routing.

IOMMU (input/output memory management unit)
===========================================
Multiple x86-64 PCI-DMA mapping implementations exist, for example:

   1. <kernel/dma/direct.c>: use no hardware/software IOMMU at all
      (e.g. because you have < 3 GB memory).
      Kernel boot message: "PCI-DMA: Disabling IOMMU"

   2. <arch/x86/kernel/amd_gart_64.c>: AMD GART based hardware IOMMU.
      Kernel boot message: "PCI-DMA: using GART IOMMU"

   3. <arch/x86_64/kernel/pci-swiotlb.c> : Software IOMMU implementation. Used
      e.g. if there is no hardware IOMMU in the system and it is need because
      you have >3GB memory or told the kernel to us it (iommu=soft))
      Kernel boot message: "PCI-DMA: Using software bounce buffering
      for IO (SWIOTLB)"

::

  iommu=[<size>][,noagp][,off][,force][,noforce]
  [,memaper[=<order>]][,merge][,fullflush][,nomerge]
  [,noaperture]

General iommu options:

    off
      Don't initialize and use any kind of IOMMU.
    noforce
      Don't force hardware IOMMU usage when it is not needed. (default).
    force
      Force the use of the hardware IOMMU even when it is
      not actually needed (e.g. because < 3 GB memory).
    soft
      Use software bounce buffering (SWIOTLB) (default for
      Intel machines). This can be used to prevent the usage
      of an available hardware IOMMU.

iommu options only relevant to the AMD GART hardware IOMMU:

    <size>
      Set the size of the remapping area in bytes.
    allowed
      Overwrite iommu off workarounds for specific chipsets.
    fullflush
      Flush IOMMU on each allocation (default).
    nofullflush
      Don't use IOMMU fullflush.
    memaper[=<order>]
      Allocate an own aperture over RAM with size 32MB<<order.
      (default: order=1, i.e. 64MB)
    merge
      Do scatter-gather (SG) merging. Implies "force" (experimental).
    nomerge
      Don't do scatter-gather (SG) merging.
    noaperture
      Ask the IOMMU not to touch the aperture for AGP.
    noagp
      Don't initialize the AGP driver and use full aperture.
    panic
      Always panic when IOMMU overflows.

iommu options only relevant to the software bounce buffering (SWIOTLB) IOMMU
implementation:

    swiotlb=<slots>[,force,noforce]
      <slots>
        Prereserve that many 2K slots for the software IO bounce buffering.
      force
        Force all IO through the software TLB.
      noforce
        Do not initialize the software TLB.


Miscellaneous
=============

  nogbpages
    Do not use GB pages for kernel direct mappings.
  gbpages
    Use GB pages for kernel direct mappings.


AMD SEV (Secure Encrypted Virtualization)
=========================================
Options relating to AMD SEV, specified via the following format:

::

   sev=option1[,option2]

The available options are:

   debug
     Enable debug messages.

   nosnp
     Do not enable SEV-SNP (applies to host/hypervisor only). Setting
     'nosnp' avoids the RMP check overhead in memory accesses when
     users do not want to run SEV-SNP guests.
+1 −1
Original line number Diff line number Diff line
@@ -18,7 +18,7 @@ For more information on the features of cpusets, see
Documentation/admin-guide/cgroup-v1/cpusets.rst.
There are a number of different configurations you can use for your needs.  For
more information on the numa=fake command line option and its various ways of
configuring fake nodes, see Documentation/arch/x86/x86_64/boot-options.rst.
configuring fake nodes, see Documentation/admin-guide/kernel-parameters.txt

For the purposes of this introduction, we'll assume a very primitive NUMA
emulation setup of "numa=fake=4*512,".  This will split our system memory into
+0 −1
Original line number Diff line number Diff line
@@ -7,7 +7,6 @@ x86_64 Support
.. toctree::
   :maxdepth: 2

   boot-options
   uefi
   mm
   5level-paging
Loading