Diffstat (limited to 'Documentation/admin-guide/kernel-parameters.txt')
| -rw-r--r-- | Documentation/admin-guide/kernel-parameters.txt | 2611 |
1 files changed, 1994 insertions, 617 deletions
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index a1457995fd41..a8d0afde7f85 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1,7 +1,116 @@
- acpi= [HW,ACPI,X86,ARM64,RISCV64]
+ ACPI ACPI support is enabled.
+ AGP AGP (Accelerated Graphics Port) is enabled.
+ ALSA ALSA sound support is enabled.
+ APIC APIC support is enabled.
+ APM Advanced Power Management support is enabled.
+ APPARMOR AppArmor support is enabled.
+ ARM ARM architecture is enabled.
+ ARM64 ARM64 architecture is enabled.
+ AX25 Appropriate AX.25 support is enabled.
+ CLK Common clock infrastructure is enabled.
+ CMA Contiguous Memory Area support is enabled.
+ DRM Direct Rendering Management support is enabled.
+ DYNAMIC_DEBUG Build in debug messages and enable them at runtime
+ EARLY Parameter processed too early to be embedded in initrd.
+ EDD BIOS Enhanced Disk Drive Services (EDD) is enabled
+ EFI EFI Partitioning (GPT) is enabled
+ EVM Extended Verification Module
+ FB The frame buffer device is enabled.
+ FTRACE Function tracing enabled.
+ GCOV GCOV profiling is enabled.
+ HIBERNATION HIBERNATION is enabled.
+ HW Appropriate hardware is enabled.
+ HYPER_V HYPERV support is enabled.
+ IMA Integrity measurement architecture is enabled.
+ IP_PNP IP DHCP, BOOTP, or RARP is enabled.
+ IPV6 IPv6 support is enabled.
+ ISAPNP ISA PnP code is enabled.
+ ISDN Appropriate ISDN support is enabled.
+ ISOL CPU Isolation is enabled.
+ JOY Appropriate joystick support is enabled.
+ KGDB Kernel debugger support is enabled.
+ KVM Kernel Virtual Machine support is enabled.
+ LIBATA Libata driver is enabled
+ LOONGARCH LoongArch architecture is enabled.
+ LOOP Loopback device support is enabled.
+ LP Printer support is enabled.
+ M68k M68k architecture is enabled.
+ These options have more detailed description inside of
+ Documentation/arch/m68k/kernel-options.rst.
+ MDA MDA console support is enabled.
+ MIPS MIPS architecture is enabled.
+ MOUSE Appropriate mouse support is enabled.
+ MSI Message Signaled Interrupts (PCI).
+ MTD MTD (Memory Technology Device) support is enabled.
+ NET Appropriate network support is enabled.
+ NFS Appropriate NFS support is enabled.
+ NUMA NUMA support is enabled.
+ OF Devicetree is enabled.
+ PARISC The PA-RISC architecture is enabled.
+ PCI PCI bus support is enabled.
+ PCIE PCI Express support is enabled.
+ PCMCIA The PCMCIA subsystem is enabled.
+ PNP Plug & Play support is enabled.
+ PPC PowerPC architecture is enabled.
+ PPT Parallel port support is enabled.
+ PS2 Appropriate PS/2 support is enabled.
+ PV_OPS A paravirtualized kernel is enabled.
+ RAM RAM disk support is enabled.
+ RDT Intel Resource Director Technology.
+ RISCV RISCV architecture is enabled.
+ S390 S390 architecture is enabled.
+ SCSI Appropriate SCSI support is enabled.
+ A lot of drivers have their options described inside
+ the Documentation/scsi/ sub-directory.
+ SDW SoundWire support is enabled.
+ SECURITY Different security models are enabled.
+ SELINUX SELinux support is enabled.
+ SERIAL Serial support is enabled.
+ SH SuperH architecture is enabled.
+ SMP The kernel is an SMP kernel.
+ SPARC Sparc architecture is enabled.
+ SUSPEND System suspend states are enabled.
+ SWSUSP Software suspend (hibernation) is enabled.
+ TPM TPM drivers are enabled.
+ UMS USB Mass Storage support is enabled.
+ USB USB support is enabled.
+ USBHID USB Human Interface Device support is enabled.
+ V4L Video For Linux support is enabled.
+ VGA The VGA console has been enabled.
+ VMMIO Driver for memory mapped virtio devices is enabled.
+ VT Virtual terminal support is enabled.
+ WDT Watchdog support is enabled.
+ X86-32 X86-32, aka i386 architecture is enabled.
+ X86-64 X86-64 architecture is enabled.
+ X86 Either 32-bit or 64-bit x86 (same as X86-32+X86-64)
+ X86_UV SGI UV support is enabled.
+ XEN Xen support is enabled
+ XTENSA xtensa architecture is enabled.
+
+In addition, the following text indicates that the option
+
+ BOOT Is a boot loader parameter.
+ BUGS= Relates to possible processor bugs on the said processor.
+ KNL Is a kernel start-up parameter.
+
+
+Kernel parameters
+
+ accept_memory= [MM]
+ Format: { eager | lazy }
+ default: lazy
+ By default, unaccepted memory is accepted lazily to
+ avoid prolonged boot times. The lazy option will add
+ some runtime overhead until all memory is eventually
+ accepted. In most cases the overhead is negligible.
+ For some workloads or for debugging purposes
+ accept_memory=eager can be used to accept all memory
+ at once during boot.
+
+ acpi= [HW,ACPI,X86,ARM64,RISCV64,EARLY]
  Advanced Configuration and Power Interface
  Format: { force | on | off | strict | noirq | rsdt |
- copy_dsdt }
+ copy_dsdt | nospcr }
  force -- enable ACPI if default was off
  on -- enable ACPI but allow fallback to DT [arm64,riscv64]
  off -- disable ACPI if default was on
@@ -10,12 +119,20 @@
  strictly ACPI specification compliant.
  rsdt -- prefer RSDT over (default) XSDT
  copy_dsdt -- copy DSDT to memory
- For ARM64 and RISCV64, ONLY "acpi=off", "acpi=on" or
- "acpi=force" are available
+ nocmcff -- Disable firmware first mode for corrected
+ errors. This disables parsing the HEST CMC error
+ source to check if firmware has set the FF flag. This
+ may result in duplicate corrected error reports.
+ nospcr -- disable console in ACPI SPCR table as
+ default _serial_ console on ARM64
+ For ARM64, ONLY "acpi=off", "acpi=on", "acpi=force" or
+ "acpi=nospcr" are available
+ For RISCV64, ONLY "acpi=off", "acpi=on" or "acpi=force"
+ are available
  See also Documentation/power/runtime_pm.rst, pci=noacpi
- acpi_apic_instance= [ACPI, IOAPIC]
+ acpi_apic_instance= [ACPI,IOAPIC,EARLY]
  Format: <int>
  2: use 2nd APIC table, if available
  1,0: use 1st APIC table
@@ -30,7 +147,7 @@
  If set to native, use the device's native backlight mode.
  If set to none, disable the ACPI backlight interface.
- acpi_force_32bit_fadt_addr
+ acpi_force_32bit_fadt_addr [ACPI,EARLY]
  force FADT to use 32 bit addresses rather than the
  64 bit X_* addresses. Some firmware have broken 64
  bit addresses for force ACPI ignore these and use
@@ -86,7 +203,7 @@
  no: ACPI OperationRegions are not marked as reserved,
  no further checks are performed.
- acpi_force_table_verification [HW,ACPI]
+ acpi_force_table_verification [HW,ACPI,EARLY]
  Enable table checksum verification during early stage.
  By default, this is disabled due to x86 early mapping
  size limitation.
@@ -126,7 +243,7 @@
  acpi_no_memhotplug [ACPI] Disable memory hotplug. Useful for kdump
  kernels.
- acpi_no_static_ssdt [HW,ACPI]
+ acpi_no_static_ssdt [HW,ACPI,EARLY]
  Disable installation of static SSDTs at early boot time
  By default, SSDTs contained in the RSDT/XSDT will be
  installed automatically and they will appear under
@@ -140,7 +257,7 @@
  Ignore the ACPI-based watchdog interface (WDAT) and let
  a native driver control the watchdog device instead.
- acpi_rsdp= [ACPI,EFI,KEXEC]
+ acpi_rsdp= [ACPI,EFI,KEXEC,EARLY]
  Pass the RSDP address to the kernel, mostly used
  on machines running EFI runtime service to boot the
  second kernel for kdump.
@@ -217,10 +334,10 @@
  to assume that this machine's pmtimer latches its value
  and always returns good values.
- acpi_sci= [HW,ACPI] ACPI System Control Interrupt trigger mode
+ acpi_sci= [HW,ACPI,EARLY] ACPI System Control Interrupt trigger mode
  Format: { level | edge | high | low }
- acpi_skip_timer_override [HW,ACPI]
+ acpi_skip_timer_override [HW,ACPI,EARLY]
  Recognize and ignore IRQ0/pin2 Interrupt Override.
  For broken nForce2 BIOS resulting in XT-PIC timer.
@@ -255,11 +372,11 @@
  behave incorrectly in some ways with respect to system
  suspend and resume to be ignored (use wisely).
- acpi_use_timer_override [HW,ACPI]
+ acpi_use_timer_override [HW,ACPI,EARLY]
  Use timer override. For some broken Nvidia NF5 boards
  that require a timer override, but don't have HPET
- add_efi_memmap [EFI; X86] Include EFI memory map in
+ add_efi_memmap [EFI,X86,EARLY] Include EFI memory map in
  kernel's map of available physical RAM.
  agp= [AGP]
@@ -296,7 +413,7 @@
  do not want to use tracing_snapshot_alloc() as it needs
  to be done where GFP_KERNEL allocations are allowed.
- allow_mismatched_32bit_el0 [ARM64]
+ allow_mismatched_32bit_el0 [ARM64,EARLY]
  Allow execve() of 32-bit applications and setting of the
  PER_LINUX32 personality on systems where only a strict
  subset of the CPUs support 32-bit EL0. When this
@@ -318,12 +435,17 @@
  allowed anymore to lift isolation
  requirements as needed. This option
  does not override iommu=pt
- force_enable - Force enable the IOMMU on platforms known
- to be buggy with IOMMU enabled. Use this
- option with care.
- pgtbl_v1 - Use v1 page table for DMA-API (Default).
- pgtbl_v2 - Use v2 page table for DMA-API.
- irtcachedis - Disable Interrupt Remapping Table (IRT) caching.
+ force_enable - Force enable the IOMMU on platforms known
+ to be buggy with IOMMU enabled. Use this
+ option with care.
+ pgtbl_v1 - Use v1 page table for DMA-API (Default).
+ pgtbl_v2 - Use v2 page table for DMA-API.
+ irtcachedis - Disable Interrupt Remapping Table (IRT) caching.
+ nohugepages - Limit page-sizes used for v1 page-tables
+ to 4 KiB.
+ v2_pgsizes_only - Limit page-sizes used for v1 page-tables
+ to 4KiB/2Mib/1GiB.
+
  amd_iommu_dump= [HW,X86-64]
  Enable AMD IOMMU driver option to dump the ACPI table
@@ -340,7 +462,7 @@
  This mode requires kvm-amd.avic=1.
  (Default when IOMMU HW support is present.)
- amd_pstate= [X86]
+ amd_pstate= [X86,EARLY]
  disable
  Do not enable amd_pstate as the default
  scaling driver for the supported processors
@@ -363,6 +485,11 @@
  selects a performance level in this range and appropriate
  to the current workload.
+ amd_prefcore=
+ [X86]
+ disable
+ Disable amd-pstate preferred core.
+
  amijoy.map= [HW,JOY] Amiga joystick support
  Map of devices attached to JOY0DAT and JOY1DAT
  Format: <a>,<b>
@@ -380,17 +507,15 @@
  not play well with APC CPU idle - disable it if you have
  APC and your system crashes randomly.
- apic= [APIC,X86] Advanced Programmable Interrupt Controller
+ apic [APIC,X86-64] Use IO-APIC. Default.
+
+ apic= [APIC,X86,EARLY] Advanced Programmable Interrupt Controller
  Change the output verbosity while booting
  Format: { quiet (default) | verbose | debug }
  Change the amount of debugging information output
  when initialising the APIC and IO-APIC components.
- For X86-32, this can also be used to specify an APIC
- driver name.
- Format: apic=driver_name
- Examples: apic=bigsmp
- apic_extnmi= [APIC,X86] External NMI delivery setting
+ apic_extnmi= [APIC,X86,EARLY] External NMI delivery setting
  Format: { bsp (default) | all | none }
  bsp: External NMI is delivered only to CPU 0
  all: External NMIs are broadcast to all CPUs as a
@@ -399,6 +524,10 @@
  useful so that a dump capture kernel won't be
  shot down by NMI
+ apicpmtimer Do APIC timer calibration using the pmtimer. Implies
+ apicmaintimer. Useful when your PIT timer is totally
+ broken.
+
  autoconf= [IPV6]
  See Documentation/networking/ipv6.rst.
@@ -415,23 +544,32 @@
  arcrimi= [HW,NET] ARCnet - "RIM I" (entirely mem-mapped) cards
  Format: <io>,<irq>,<nodeID>
+ arm64.no32bit_el0 [ARM64] Unconditionally disable the execution of
+ 32 bit applications.
+
  arm64.nobti [ARM64] Unconditionally disable Branch Target
  Identification support
- arm64.nopauth [ARM64] Unconditionally disable Pointer Authentication
+ arm64.nogcs [ARM64] Unconditionally disable Guarded Control Stack
  support
+ arm64.nomops [ARM64] Unconditionally disable Memory Copy and Memory
+ Set instructions support
+
+ arm64.nompam [ARM64] Unconditionally disable Memory Partitioning And
+ Monitoring support
+
  arm64.nomte [ARM64] Unconditionally disable Memory Tagging Extension
  support
- arm64.nosve [ARM64] Unconditionally disable Scalable Vector
- Extension support
+ arm64.nopauth [ARM64] Unconditionally disable Pointer Authentication
+ support
  arm64.nosme [ARM64] Unconditionally disable Scalable Matrix
  Extension support
- arm64.nomops [ARM64] Unconditionally disable Memory Copy and Memory
- Set instructions support
+ arm64.nosve [ARM64] Unconditionally disable Scalable Vector
+ Extension support
  ataflop= [HW,M68k]
@@ -494,24 +632,37 @@
  Format: <io>,<irq>,<mode>
  See header of drivers/net/hamradio/baycom_ser_hdx.c.
+ bdev_allow_write_mounted=
+ Format: <bool>
+ Control the ability to open a mounted block device
+ for writing, i.e., allow / disallow writes that bypass
+ the FS. This was implemented as a means to prevent
+ fuzzers from crashing the kernel by overwriting the
+ metadata underneath a mounted FS without its awareness.
+ This also prevents destructive formatting of mounted
+ filesystems by naive storage tooling that don't use
+ O_EXCL. Default is Y and can be changed through the
+ Kconfig option CONFIG_BLK_DEV_WRITE_MOUNTED.
+
  bert_disable [ACPI]
  Disable BERT OS support on buggy BIOSes.
- bgrt_disable [ACPI][X86]
+ bgrt_disable [ACPI,X86,EARLY]
  Disable BGRT to avoid flickering OEM logo.
  blkdevparts= Manual partition parsing of block device(s) for
  embedded devices based on command line input.
  See Documentation/block/cmdline-partition.rst
- boot_delay= Milliseconds to delay each printk during boot.
+ boot_delay= [KNL,EARLY]
+ Milliseconds to delay each printk during boot.
  Only works if CONFIG_BOOT_PRINTK_DELAY is enabled,
  and you may also have to specify "lpj=". Boot_delay
  values larger than 10 seconds (10000) are assumed
  erroneous and ignored.
  Format: integer
- bootconfig [KNL]
+ bootconfig [KNL,EARLY]
  Extended command line options can be added to an initrd
  and this will cause the kernel to look for it.
@@ -546,14 +697,32 @@
  trust validation.
  format: { id:<keyid> | builtin }
- cca= [MIPS] Override the kernel pages' cache coherency
+ cca= [MIPS,EARLY] Override the kernel pages' cache coherency
  algorithm. Accepted values range from 0 to 7
  inclusive. See arch/mips/include/asm/pgtable-bits.h
  for platform specific values (SB1, Loongson3 and
  others).
  ccw_timeout_log [S390]
- See Documentation/s390/common_io.rst for details.
+ See Documentation/arch/s390/common_io.rst for details.
+
+ cfi= [X86-64] Set Control Flow Integrity checking features
+ when CONFIG_FINEIBT is enabled.
+ Format: feature[,feature...]
+ Default: auto
+
+ auto: Use FineIBT if IBT available, otherwise kCFI.
+ Under FineIBT, enable "paranoid" mode when
+ FRED is not available.
+ off: Turn off CFI checking.
+ kcfi: Use kCFI (disable FineIBT).
+ fineibt: Use FineIBT (even if IBT not available).
+ norand: Do not re-randomize CFI hashes.
+ paranoid: Add caller hash checking under FineIBT.
+ bhi: Enable register poisoning to stop speculation
+ across FineIBT. (Disabled by default.)
+ warn: Do not enforce CFI checking: warn only.
+ debug: Report CFI initialization details.
  cgroup_disable= [KNL] Disable a particular controller or optional feature
  Format: {name of the controller(s) or feature(s) to disable}
@@ -580,12 +749,32 @@
  named mounts. Specifying both "all" and "named" disables
  all v1 hierarchies.
+ cgroup_v1_proc= [KNL] Show also missing controllers in /proc/cgroups
+ Format: { "true" | "false" }
+ /proc/cgroups lists only v1 controllers by default.
+ This compatibility option enables listing also v2
+ controllers (whose v1 code is not compiled!), so that
+ semi-legacy software can check this file to decide
+ about usage of v2 (sic) controllers.
+
+ cgroup_favordynmods= [KNL] Enable or Disable favordynmods.
+ Format: { "true" | "false" }
+ Defaults to the value of CONFIG_CGROUP_FAVOR_DYNMODS.
+
  cgroup.memory= [KNL] Pass options to the cgroup memory controller.
  Format: <string>
  nosocket -- Disable socket memory accounting.
  nokmem -- Disable kernel memory accounting.
  nobpf -- Disable BPF memory accounting.
+ check_pages= [MM,EARLY] Enable sanity checking of pages after
+ allocations / before freeing. This adds checks to catch
+ double-frees, use-after-frees, and other sources of
+ page corruption by inspecting page internals (flags,
+ mapcount/refcount, memcg_data, etc.).
+ Format: { "0" | "1" }
+ Default: 0 (1 if CONFIG_DEBUG_VM is set)
+
  checkreqprot= [SELINUX] Set initial checkreqprot flag value.
  Format: { "0" | "1" }
  See security/selinux/Kconfig help text.
@@ -598,7 +787,7 @@
  Setting checkreqprot to 1 is deprecated.
  cio_ignore= [S390]
- See Documentation/s390/common_io.rst for details.
+ See Documentation/arch/s390/common_io.rst for details.
  clearcpuid=X[,X...] [X86]
  Disable CPUID feature X for the kernel. See
@@ -657,19 +846,13 @@
  [X86-64] hpet,tsc
  clocksource.arm_arch_timer.evtstrm=
- [ARM,ARM64]
+ [ARM,ARM64,EARLY]
  Format: <bool>
  Enable/disable the eventstream feature of the ARM
  architected timer so that code using WFE-based polling
  loops can be debugged more effectively on production
  systems.
- clocksource.max_cswd_read_retries= [KNL]
- Number of clocksource_watchdog() retries due to
- external delays before the clock will be marked
- unstable. Defaults to two retries, that is,
- three attempts to read the clock under test.
-
  clocksource.verify_n_cpus= [KNL]
  Limit the number of CPUs checked for clocksources
  marked with CLOCK_SOURCE_VERIFY_PERCPU that
@@ -687,7 +870,7 @@
  10 seconds when built into the kernel.
  cma=nn[MG]@[start[MG][-end[MG]]]
- [KNL,CMA]
+ [KNL,CMA,EARLY]
  Sets the size of kernel global memory area for
  contiguous memory allocations and optionally the
  placement constraint by the physical address range of
@@ -696,7 +879,7 @@
  kernel/dma/contiguous.c
  cma_pernuma=nn[MG]
- [ARM64,KNL,CMA]
+ [KNL,CMA,EARLY]
  Sets the size of kernel per-numa memory area for
  contiguous memory allocations. A value of 0 disables
  per-numa CMA altogether. And If this option is not
@@ -706,6 +889,17 @@
  which is located in node nid, if the allocation fails,
  they will fallback to the global default memory area.
+ numa_cma=<node>:nn[MG][,<node>:nn[MG]]
+ [KNL,CMA,EARLY]
+ Sets the size of kernel numa memory area for
+ contiguous memory allocations. It will reserve CMA
+ area for the specified node.
+
+ With numa CMA enabled, DMA users on node nid will
+ first try to allocate buffer from the numa area
+ which is located in node nid, if the allocation fails,
+ they will fallback to the global default memory area.
+
  cmo_free_hint= [PPC] Format: { yes | no }
  Specify whether pages are marked as being inactive
  when they are freed. This is used in CMO environments
@@ -713,7 +907,7 @@
  a hypervisor.
  Default: yes
- coherent_pool=nn[KMG] [ARM,KNL]
+ coherent_pool=nn[KMG] [ARM,KNL,EARLY]
  Sets the size of memory pool for coherent, atomic dma
  allocations, by default set to 256K.
@@ -731,7 +925,7 @@
  condev= [HW,S390] console device
  conmode=
- con3215_drop= [S390] 3215 console drop mode.
+ con3215_drop= [S390,EARLY] 3215 console drop mode.
  Format: y|n|Y|N|1|0
  When set to true, drop data on the 3215 console when
  the console buffer is full. In this case the
@@ -759,6 +953,25 @@
  Documentation/networking/netconsole.rst for an
  alternative.
+ <DEVNAME>:<n>.<n>[,options]
+ Use the specified serial port on the serial core bus.
+ The addressing uses DEVNAME of the physical serial port
+ device, followed by the serial core controller instance,
+ and the serial port instance. The options are the same
+ as documented for the ttyS addressing above.
+
+ The mapping of the serial ports to the tty instances
+ can be viewed with:
+
+ $ ls -d /sys/bus/serial-base/devices/*:*.*/tty/*
+ /sys/bus/serial-base/devices/00:04:0.0/tty/ttyS0
+
+ In the above example, the console can be addressed with
+ console=00:04:0.0. Note that a console addressed this
+ way will only get added when the related device driver
+ is ready. The use of an earlycon parameter in addition to
+ the console may be desired for console output early on.
+
  uart[8250],io,<addr>[,options]
  uart[8250],mmio,<addr>[,options]
  uart[8250],mmio16,<addr>[,options]
@@ -837,7 +1050,7 @@
  kernel before the cpufreq driver probes.
  cpu_init_udelay=N
- [X86] Delay for N microsec between assert and de-assert
+ [X86,EARLY] Delay for N microsec between assert and de-assert
  of APIC INIT to start processors. This delay occurs
  on every CPU online, such as boot, and resume from suspend.
  Default: 10000
@@ -849,22 +1062,26 @@
  the parameter has no effect.
  crash_kexec_post_notifiers
- Run kdump after running panic-notifiers and dumping
- kmsg. This only for the users who doubt kdump always
- succeeds in any situation.
- Note that this also increases risks of kdump failure,
- because some panic notifiers can make the crashed
- kernel more unstable.
+ Only jump to kdump kernel after running the panic
+ notifiers and dumping kmsg. This option increases
+ the risks of a kdump failure, since some panic
+ notifiers can make the crashed kernel more unstable.
+ In configurations where kdump may not be reliable,
+ running the panic notifiers could allow collecting
+ more data on dmesg, like stack traces from other CPUS
+ or extra data dumped by panic_print. Note that some
+ configurations enable this option unconditionally,
+ like Hyper-V, PowerPC (fadump) and AMD SEV-SNP.
  crashkernel=size[KMG][@offset[KMG]]
- [KNL] Using kexec, Linux can switch to a 'crash kernel'
+ [KNL,EARLY] Using kexec, Linux can switch to a 'crash kernel'
  upon panic. This parameter reserves the physical
  memory region [offset, offset + size] for that kernel
  image. If '@offset' is omitted, then a suitable offset
  is selected automatically.
- [KNL, X86-64, ARM64] Select a region under 4G first, and
- fall back to reserve region above 4G when '@offset'
- hasn't been specified.
+ [KNL, X86-64, ARM64, RISCV, LoongArch] Select a region
+ under 4G first, and fall back to reserve region above
+ 4G when '@offset' hasn't been specified.
  See Documentation/admin-guide/kdump/kdump.rst for further details.
  crashkernel=range1:size1[,range2:size2,...][@offset]
@@ -875,29 +1092,54 @@
  Documentation/admin-guide/kdump/kdump.rst for an example.
  crashkernel=size[KMG],high
- [KNL, X86-64, ARM64] range could be above 4G. Allow kernel
- to allocate physical memory region from top, so could
- be above 4G if system have more than 4G ram installed.
- Otherwise memory region will be allocated below 4G, if
- available.
+ [KNL, X86-64, ARM64, RISCV, LoongArch] range could be
+ above 4G.
+ Allow kernel to allocate physical memory region from top,
+ so could be above 4G if system have more than 4G ram
+ installed. Otherwise memory region will be allocated
+ below 4G, if available.
  It will be ignored if crashkernel=X is specified.
  crashkernel=size[KMG],low
- [KNL, X86-64, ARM64] range under 4G. When crashkernel=X,high
- is passed, kernel could allocate physical memory region
- above 4G, that cause second kernel crash on system
- that require some amount of low memory, e.g. swiotlb
- requires at least 64M+32K low memory, also enough extra
- low memory is needed to make sure DMA buffers for 32-bit
- devices won't run out. Kernel would try to allocate
+ [KNL, X86-64, ARM64, RISCV, LoongArch] range under 4G.
+ When crashkernel=X,high is passed, kernel could allocate
+ physical memory region above 4G, that cause second kernel
+ crash on system that require some amount of low memory,
+ e.g. swiotlb requires at least 64M+32K low memory, also
+ enough extra low memory is needed to make sure DMA buffers
+ for 32-bit devices won't run out. Kernel would try to allocate
  default size of memory below 4G automatically. The default
  size is platform dependent.
  --> x86: max(swiotlb_size_or_default() + 8MiB, 256MiB)
  --> arm64: 128MiB
+ --> riscv: 128MiB
+ --> loongarch: 128MiB
  This one lets the user specify own low range under 4G
  for second kernel instead.
  0: to disable low allocation.
  It will be ignored when crashkernel=X,high is not used
  or memory reserved is below 4G.
+ crashkernel=size[KMG],cma
+ [KNL, X86, ppc] Reserve additional crash kernel memory from
+ CMA. This reservation is usable by the first system's
+ userspace memory and kernel movable allocations (memory
+ balloon, zswap). Pages allocated from this memory range
+ will not be included in the vmcore so this should not
+ be used if dumping of userspace memory is intended and
+ it has to be expected that some movable kernel pages
+ may be missing from the dump.
+
+ A standard crashkernel reservation, as described above,
+ is still needed to hold the crash kernel and initrd.
+
+ This option increases the risk of a kdump failure: DMA
+ transfers configured by the first kernel may end up
+ corrupting the second kernel's memory.
+
+ This reservation method is intended for systems that
+ can't afford to sacrifice enough memory for standard
+ crashkernel reservation and where less reliable and
+ possibly incomplete kdump is preferable to no kdump at
+ all.
  cryptomgr.notests [KNL] Disable crypto self-tests
@@ -925,10 +1167,10 @@
  Format: <port#>,<type>
  See also Documentation/input/devices/joystick-parport.rst
- debug [KNL] Enable kernel debugging (events log level).
+ debug [KNL,EARLY] Enable kernel debugging (events log level).
  debug_boot_weak_hash
- [KNL] Enable printing [hashed] pointers early in the
+ [KNL,EARLY] Enable printing [hashed] pointers early in the
  boot sequence. If enabled, we use a weak hash instead
  of siphash to hash pointers. Use this option if you are
  seeing instances of '(___ptrval___)') and need to see a
@@ -945,29 +1187,29 @@
  will print _a_lot_ more information - normally
  only useful to lockdep developers.
- debug_objects [KNL] Enable object debugging
+ debug_objects [KNL,EARLY] Enable object debugging
  debug_guardpage_minorder=
- [KNL] When CONFIG_DEBUG_PAGEALLOC is set, this
+ [KNL,EARLY] When CONFIG_DEBUG_PAGEALLOC is set, this
  parameter allows control of the order of pages that will
  be intentionally kept free (and hence protected) by the
  buddy allocator. Bigger value increase the probability
  of catching random memory corruption, but reduce the
  amount of memory for normal system use. The maximum
- possible value is MAX_ORDER/2. Setting this parameter
- to 1 or 2 should be enough to identify most random
- memory corruption problems caused by bugs in kernel or
- driver code when a CPU writes to (or reads from) a
- random memory location. Note that there exists a class
- of memory corruptions problems caused by buggy H/W or
- F/W or by drivers badly programming DMA (basically when
- memory is written at bus level and the CPU MMU is
- bypassed) which are not detectable by
- CONFIG_DEBUG_PAGEALLOC, hence this option will not help
- tracking down these problems.
+ possible value is MAX_PAGE_ORDER/2. Setting this
+ parameter to 1 or 2 should be enough to identify most
+ random memory corruption problems caused by bugs in
+ kernel or driver code when a CPU writes to (or reads
+ from) a random memory location. Note that there exists
+ a class of memory corruptions problems caused by buggy
+ H/W or F/W or by drivers badly programming DMA
+ (basically when memory is written at bus level and the
+ CPU MMU is bypassed) which are not detectable by
+ CONFIG_DEBUG_PAGEALLOC, hence this option will not
+ help tracking down these problems.
  debug_pagealloc=
- [KNL] When CONFIG_DEBUG_PAGEALLOC is set, this parameter
+ [KNL,EARLY] When CONFIG_DEBUG_PAGEALLOC is set, this parameter
  enables the feature at boot time. By default, it is
  disabled and the system will work mostly the same as a
  kernel built without CONFIG_DEBUG_PAGEALLOC.
@@ -975,14 +1217,10 @@
  useful to also enable the page_owner functionality.
  on: enable the feature
- debugfs= [KNL] This parameter enables what is exposed to userspace
- and debugfs internal clients.
- Format: { on, no-mount, off }
+ debugfs= [KNL,EARLY] This parameter enables what is exposed to
+ userspace and debugfs internal clients.
+ Format: { on, off }
  on: All functions are enabled.
- no-mount:
- Filesystem is not registered but kernel clients can
- access APIs and a crashkernel can be used to read
- its content. There is nothing to mount.
  off: Filesystem is not registered and clients
  get a -EPERM as result when trying to register files
  or directories within debugfs.
@@ -1055,7 +1293,7 @@
  dhash_entries= [KNL]
  Set number of hash buckets for dentry cache.
- disable_1tb_segments [PPC]
+ disable_1tb_segments [PPC,EARLY]
  Disables the use of 1TB hash page table segments. This
  causes the kernel to fall back to 256MB segments which
  can be useful when debugging issues that require an SLB
@@ -1064,41 +1302,32 @@
  disable= [IPV6]
  See Documentation/networking/ipv6.rst.
- disable_radix [PPC]
+ disable_radix [PPC,EARLY]
  Disable RADIX MMU mode on POWER9
  disable_tlbie [PPC]
  Disable TLBIE instruction. Currently does not work
  with KVM, with HASH MMU, or with coherent accelerators.
- disable_cpu_apicid= [X86,APIC,SMP]
- Format: <int>
- The number of initial APIC ID for the
- corresponding CPU to be disabled at boot,
- mostly used for the kdump 2nd kernel to
- disable BSP to wake up multiple CPUs without
- causing system reset or hang due to sending
- INIT from AP to BSP.
-
- disable_ddw [PPC/PSERIES]
+ disable_ddw [PPC/PSERIES,EARLY]
  Disable Dynamic DMA Window support. Use this
  to workaround buggy firmware.
  disable_ipv6= [IPV6]
  See Documentation/networking/ipv6.rst.
- disable_mtrr_cleanup [X86]
+ disable_mtrr_cleanup [X86,EARLY]
  The kernel tries to adjust MTRR layout from continuous
  to discrete, to make X server driver able to add WB
  entry later. This parameter disables that.
- disable_mtrr_trim [X86, Intel and AMD only]
+ disable_mtrr_trim [X86, Intel and AMD only,EARLY]
  By default the kernel will trim any uncacheable
  memory out of your available memory pool based on
  MTRR settings. This parameter disables that behavior,
  possibly causing your machine to run very slowly.
- disable_timer_pin_1 [X86]
+ disable_timer_pin_1 [X86,EARLY]
  Disable PIN 1 of APIC timer
  Can be useful to work around chipset bugs.
@@ -1121,6 +1350,26 @@
  The filter can be disabled or changed to another
  driver later using sysfs.
+ reg_file_data_sampling=
+ [X86] Controls mitigation for Register File Data
+ Sampling (RFDS) vulnerability. RFDS is a CPU
+ vulnerability which may allow userspace to infer
+ kernel data values previously stored in floating point
+ registers, vector registers, or integer registers.
+ RFDS only affects Intel Atom processors.
+
+ on: Turns ON the mitigation.
+ off: Turns OFF the mitigation.
+
+ This parameter overrides the compile time default set
+ by CONFIG_MITIGATION_RFDS. Mitigation cannot be
+ disabled when other VERW based mitigations (like MDS)
+ are enabled. In order to disable RFDS mitigation all
+ VERW based mitigations need to be disabled.
+
+ For details see:
+ Documentation/admin-guide/hw-vuln/reg-file-data-sampling.rst
+
  driver_async_probe= [KNL]
  List of driver names to be probed asynchronously. *
  matches with all driver names. If * is specified, the
@@ -1133,22 +1382,16 @@
  panels may send no or incorrect EDID data sets.
  This parameter allows to specify an EDID data sets
  in the /lib/firmware directory that are used instead.
- Generic built-in EDID data sets are used, if one of
- edid/1024x768.bin, edid/1280x1024.bin,
- edid/1680x1050.bin, or edid/1920x1080.bin is given
- and no file with the same name exists. Details and
- instructions how to build your own EDID data are
- available in Documentation/admin-guide/edid.rst. An EDID
- data set will only be used for a particular connector,
- if its name and a colon are prepended to the EDID
- name. Each connector may use a unique EDID data
- set by separating the files with a comma. An EDID
+ An EDID data set will only be used for a particular
+ connector, if its name and a colon are prepended to
+ the EDID name. Each connector may use a unique EDID
+ data set by separating the files with a comma. An EDID
  data set with no connector name will be used for
  any connectors not explicitly specified.
  dscc4.setup= [NET]
- dt_cpu_ftrs= [PPC]
+ dt_cpu_ftrs= [PPC,EARLY]
  Format: {"off" | "known"}
  Control how the dt_cpu_ftrs device-tree binding is
  used for CPU feature discovery and setup (if it
@@ -1168,12 +1411,12 @@
  Documentation/admin-guide/dynamic-debug-howto.rst
  for details.
- early_ioremap_debug [KNL]
+ early_ioremap_debug [KNL,EARLY]
  Enable debug messages in early_ioremap support. This
  is useful for tracking down temporary early mappings
  which are not unmapped.
- earlycon= [KNL] Output early console device and options.
+ earlycon= [KNL,EARLY] Output early console device and options.
When used with no options, the early console is determined by stdout-path property in device tree's @@ -1309,7 +1552,7 @@ address must be provided, and the serial port must already be setup and configured. - earlyprintk= [X86,SH,ARM,M68k,S390] + earlyprintk= [X86,SH,ARM,M68k,S390,UM,EARLY] earlyprintk=vga earlyprintk=sclp earlyprintk=xen @@ -1317,13 +1560,18 @@ earlyprintk=serial[,0x...[,baudrate]] earlyprintk=ttySn[,baudrate] earlyprintk=dbgp[debugController#] - earlyprintk=pciserial[,force],bus:device.function[,baudrate] + earlyprintk=mmio32,membase[,{nocfg|baudrate}] + earlyprintk=pciserial[,force],bus:device.function[,{nocfg|baudrate}] earlyprintk=xdbc[xhciController#] + earlyprintk=bios earlyprintk is useful when the kernel crashes before the normal console is initialized. It is not enabled by default because it has some cosmetic problems. + Use "nocfg" to skip UART configuration, assume + BIOS/firmware has configured UART correctly. + Append ",keep" to not disable it when the real console takes over. @@ -1349,6 +1597,8 @@ The sclp output can only be used on s390. + The bios output can only be used on SuperH. + The optional "force" to "pciserial" enables use of a PCI device even when its classcode is not of the UART class. @@ -1364,7 +1614,7 @@ edd= [EDD] Format: {"off" | "on" | "skip[mbr]"} - efi= [EFI] + efi= [EFI,EARLY] Format: { "debug", "disable_early_pci_dma", "nochunk", "noruntime", "nosoftreserve", "novamap", "no_disable_early_pci_dma" } @@ -1385,33 +1635,12 @@ no_disable_early_pci_dma: Leave the busmaster bit set on all PCI bridges while in the EFI boot stub - efi_no_storage_paranoia [EFI; X86] + efi_no_storage_paranoia [EFI,X86,EARLY] Using this parameter you can use more than 50% of your efi variable storage. Use this parameter only if you are really sure that your UEFI does sane gc and fulfills the spec otherwise your board may brick. - efi_fake_mem= nn[KMG]@ss[KMG]:aa[,nn[KMG]@ss[KMG]:aa,..] 
[EFI; X86] - Add arbitrary attribute to specific memory range by - updating original EFI memory map. - Region of memory which aa attribute is added to is - from ss to ss+nn. - - If efi_fake_mem=2G@4G:0x10000,2G@0x10a0000000:0x10000 - is specified, EFI_MEMORY_MORE_RELIABLE(0x10000) - attribute is added to range 0x100000000-0x180000000 and - 0x10a0000000-0x1120000000. - - If efi_fake_mem=8G@9G:0x40000 is specified, the - EFI_MEMORY_SP(0x40000) attribute is added to - range 0x240000000-0x43fffffff. - - Using this parameter you can do debugging of EFI memmap - related features. For example, you can do debugging of - Address Range Mirroring feature even if your box - doesn't support it, or mark specific memory as - "soft reserved". - efivar_ssdt= [EFI; X86] Name of an EFI variable that contains an SSDT that is to be dynamically loaded by Linux. If there are multiple variables with the same name but with different @@ -1422,7 +1651,7 @@ eisa_irq_edge= [PARISC,HW] See header of drivers/parisc/eisa.c. - ekgdboc= [X86,KGDB] Allow early kernel console debugging + ekgdboc= [X86,KGDB,EARLY] Allow early kernel console debugging Format: ekgdboc=kbd This is designed to be used in conjunction with @@ -1437,13 +1666,13 @@ See comment before function elanfreq_setup() in arch/x86/kernel/cpu/cpufreq/elanfreq.c. - elfcorehdr=[size[KMG]@]offset[KMG] [IA64,PPC,SH,X86,S390] + elfcorehdr=[size[KMG]@]offset[KMG] [PPC,SH,X86,S390,EARLY] Specifies physical address of start of kernel core image elf header and optionally the size. Generally kexec loader will pass this option to capture kernel. See Documentation/admin-guide/kdump/kdump.rst for details. - enable_mtrr_cleanup [X86] + enable_mtrr_cleanup [X86,EARLY] The kernel tries to adjust MTRR layout from continuous to discrete, to make X server driver able to add WB entry later. This parameter enables that. @@ -1476,7 +1705,7 @@ Permit 'security.evm' to be updated regardless of current integrity status. 
- early_page_ext [KNL] Enforces page_ext initialization to earlier + early_page_ext [KNL,EARLY] Enforces page_ext initialization to earlier stages so cover more early boot allocations. Please note that as side effect some optimizations might be disabled to achieve that (e.g. parallelized @@ -1487,6 +1716,7 @@ failslab= fail_usercopy= fail_page_alloc= + fail_skb_realloc= fail_make_request=[KNL] General fault injection mechanism. Format: <interval>,<probability>,<space>,<times> @@ -1500,12 +1730,6 @@ floppy= [HW] See Documentation/admin-guide/blockdev/floppy.rst. - force_pal_cache_flush - [IA-64] Avoid check_sal_cache_flush which may hang on - buggy SAL_CACHE_FLUSH implementations. Using this - parameter will force ia64_sal_cache_flush to call - ia64_pal_cache_flush instead of SAL_CACHE_FLUSH. - forcepae [X86-32] Forcefully enable Physical Address Extension (PAE). Many Pentium M systems disable PAE but may have a @@ -1513,6 +1737,12 @@ Warning: use of this parameter will taint the kernel and may cause unknown problems. + fred= [X86-64] + Enable/disable Flexible Return and Event Delivery. + Format: { on | off } + on: enable FRED when it's present. + off: disable FRED, the default setting. + ftrace=[tracer] [FTRACE] will set and start the specified tracer as early as possible in order to facilitate early @@ -1535,12 +1765,28 @@ The above will cause the "foo" tracing instance to trigger a snapshot at the end of boot up. - ftrace_dump_on_oops[=orig_cpu] + ftrace_dump_on_oops[=2(orig_cpu) | =<instance>][,<instance> | + ,<instance>=2(orig_cpu)] [FTRACE] will dump the trace buffers on oops. - If no parameter is passed, ftrace will dump - buffers of all CPUs, but if you pass orig_cpu, it will - dump only the buffer of the CPU that triggered the - oops. 
+ If no parameter is passed, ftrace will dump global + buffers of all CPUs, if you pass 2 or orig_cpu, it + will dump only the buffer of the CPU that triggered + the oops, or the specific instance will be dumped if + its name is passed. Multiple instance dump is also + supported, and instances are separated by commas. Each + instance supports only dump on CPU that triggered the + oops by passing 2 or orig_cpu to it. + + ftrace_dump_on_oops=foo=orig_cpu + + The above will dump only the buffer of "foo" instance + on CPU that triggered the oops. + + ftrace_dump_on_oops,foo,bar=orig_cpu + + The above will dump global buffer on all CPUs, the + buffer of "foo" instance on all CPUs and the buffer + of "bar" instance on CPU that triggered the oops. ftrace_filter=[function-list] [FTRACE] Limit the functions traced by the function @@ -1574,7 +1820,7 @@ can be changed at run time by the max_graph_depth file in the tracefs tracing directory. default: 0 (no limit) - fw_devlink= [KNL] Create device links between consumer and supplier + fw_devlink= [KNL,EARLY] Create device links between consumer and supplier devices by scanning the firmware to infer the consumer/supplier relationships. This feature is especially useful when drivers are loaded as modules as @@ -1593,12 +1839,12 @@ rpm -- Like "on", but also use to order runtime PM. fw_devlink.strict=<bool> - [KNL] Treat all inferred dependencies as mandatory + [KNL,EARLY] Treat all inferred dependencies as mandatory dependencies. This only applies for fw_devlink=on|rpm. Format: <bool> fw_devlink.sync_state = - [KNL] When all devices that could probe have finished + [KNL,EARLY] When all devices that could probe have finished probing, this parameter controls what to do with devices that haven't yet received their sync_state() calls. 
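Note on the ftrace_dump_on_oops hunk above: the new syntax can be read as "an optional scope for the global buffer, followed by a comma-separated list of instances, each optionally limited to the CPU that triggered the oops". The following is a minimal sketch of that reading, not the kernel's parser. The function name, the "global" key and the "all_cpus"/"orig_cpu" labels are hypothetical and chosen only for illustration. Numeric values other than 2 are not described in the text above, so the sketch rejects them rather than guess.

```python
PREFIX = "ftrace_dump_on_oops"
ORIG_CPU = {"2", "orig_cpu"}  # both spellings select the CPU that triggered the oops


def parse_ftrace_dump_on_oops(token: str) -> dict:
    """Map each buffer name to 'all_cpus' or 'orig_cpu' (illustrative labels)."""
    if not token.startswith(PREFIX):
        raise ValueError("not an ftrace_dump_on_oops parameter")
    rest = token[len(PREFIX):]
    if rest == "":
        # No parameter: global buffers of all CPUs are dumped.
        return {"global": "all_cpus"}

    plan = {}
    if rest[0] == ",":
        # Leading comma: the global buffer is still dumped on all CPUs.
        plan["global"] = "all_cpus"
    elif rest[0] != "=":
        raise ValueError("expected '=' or ',' after the parameter name")

    for index, item in enumerate(rest[1:].split(",")):
        if index == 0 and rest[0] == "=" and item in ORIG_CPU:
            # "=2" or "=orig_cpu": global buffer, oops CPU only.
            plan["global"] = "orig_cpu"
            continue
        name, sep, scope = item.partition("=")
        if not name or name.isdigit():
            raise ValueError(f"unsupported item in this sketch: {item!r}")
        if sep and scope not in ORIG_CPU:
            raise ValueError(f"unknown scope for instance {name!r}: {scope!r}")
        plan[name] = "orig_cpu" if sep else "all_cpus"
    return plan
```

Applied to the two examples from the hunk, "ftrace_dump_on_oops=foo=orig_cpu" yields only the "foo" instance limited to the oops CPU, while "ftrace_dump_on_oops,foo,bar=orig_cpu" yields the global buffer and "foo" on all CPUs plus "bar" on the oops CPU.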
@@ -1619,10 +1865,32 @@ gamma= [HW,DRM] - gart_fix_e820= [X86-64] disable the fix e820 for K8 GART + gart_fix_e820= [X86-64,EARLY] disable the fix e820 for K8 GART Format: off | on default: on + gather_data_sampling= + [X86,INTEL,EARLY] Control the Gather Data Sampling (GDS) + mitigation. + + Gather Data Sampling is a hardware vulnerability which + allows unprivileged speculative access to data which was + previously stored in vector registers. + + This issue is mitigated by default in updated microcode. + The mitigation may have a performance impact but can be + disabled. On systems without the microcode mitigation + disabling AVX serves as a mitigation. + + force: Disable AVX to mitigate systems without + microcode mitigation. No effect if the microcode + mitigation is present. Known to cause crashes in + userspace with buggy AVX enumeration. + + off: Disable GDS mitigation. + + gbpages [X86] Use GB pages for kernel direct mappings. + gcov_persist= [GCOV] When non-zero (default), profiling data for kernel modules is saved and remains accessible via debugfs, even when the module is unloaded/reloaded. @@ -1670,7 +1938,9 @@ allocation boundaries as a proactive defense against bounds-checking flaws in the kernel's copy_to_user()/copy_from_user() interface. - on Perform hardened usercopy checks (default). + The default is determined by + CONFIG_HARDENED_USERCOPY_DEFAULT_ON. + on Perform hardened usercopy checks. off Disable hardened usercopy checks. hardlockup_all_cpu_backtrace= @@ -1678,13 +1948,32 @@ backtraces on all cpus. Format: 0 | 1 + hash_pointers= + [KNL,EARLY] + By default, when pointers are printed to the console + or buffers via the %p format string, that pointer is + "hashed", i.e. obscured by hashing the pointer value. + This is a security feature that hides actual kernel + addresses from unprivileged users, but it also makes + debugging the kernel more difficult since unequal + pointers can no longer be compared. 
The choices are: + Format: { auto | always | never } + Default: auto + + auto - Hash pointers unless slab_debug is enabled. + always - Always hash pointers (even if slab_debug is + enabled). + never - Never hash pointers. This option should only + be specified when debugging the kernel. Do + not use on production kernels. The boot + param "no_hash_pointers" is an alias for + this mode. + hashdist= [KNL,NUMA] Large hashes allocated during boot are distributed across NUMA nodes. Defaults on for 64-bit NUMA, off otherwise. Format: 0 | 1 (for off | on) - hcl= [IA-64] SGI's Hardware Graph compatibility layer - hd= [EIDE] (E)IDE hard drive subsystem geometry Format: <cyl>,<head>,<sect> @@ -1702,7 +1991,35 @@ (that will set all pages holding image data during restoration read-only). - highmem=nn[KMG] [KNL,BOOT] forces the highmem zone to have an exact + hibernate.compressor= [HIBERNATION] Compression algorithm to be + used with hibernation. + Format: { lzo | lz4 } + Default: lzo + + lzo: Select LZO compression algorithm to + compress/decompress hibernation image. + + lz4: Select LZ4 compression algorithm to + compress/decompress hibernation image. + + hibernate.pm_test_delay= + [HIBERNATION] + Sets the number of seconds to remain in a hibernation test + mode before resuming the system (see + /sys/power/pm_test). Only available when CONFIG_PM_DEBUG + is set. Default value is 5. + + hibernate_compression_threads= + [HIBERNATION] + Set the number of threads used for compressing or decompressing + hibernation images. + + Format: <integer> + Default: 3 + Minimum: 1 + Example: hibernate_compression_threads=4 + + highmem=nn[KMG] [KNL,BOOT,EARLY] forces the highmem zone to have an exact size of <nn>. This works even on boxes that have no highmem otherwise. This also works to reduce highmem size on bigger boxes. @@ -1713,7 +2030,7 @@ hlt [BUGS=ARM,SH] - hostname= [KNL] Set the hostname (aka UTS nodename). + hostname= [KNL,EARLY] Set the hostname (aka UTS nodename). 
Format: <string> This allows setting the system's hostname during early startup. This sets the name returned by gethostname. @@ -1737,7 +2054,7 @@ hpet_mmap= [X86, HPET_MMAP] Allow userspace to mmap HPET registers. Default set by CONFIG_HPET_MMAP_DEFAULT. - hugepages= [HW] Number of HugeTLB pages to allocate at boot. + hugepages= [HW,EARLY] Number of HugeTLB pages to allocate at boot. If this follows hugepagesz (below), it specifies the number of pages of hugepagesz to be allocated. If this is the first HugeTLB parameter on the command @@ -1749,16 +2066,25 @@ <node>:<integer>[,<node>:<integer>] hugepagesz= - [HW] The size of the HugeTLB pages. This is used in - conjunction with hugepages (above) to allocate huge - pages of a specific size at boot. The pair - hugepagesz=X hugepages=Y can be specified once for - each supported huge page size. Huge page sizes are - architecture dependent. See also + [HW,EARLY] The size of the HugeTLB pages. This is + used in conjunction with hugepages (above) to + allocate huge pages of a specific size at boot. The + pair hugepagesz=X hugepages=Y can be specified once + for each supported huge page size. Huge page sizes + are architecture dependent. See also Documentation/admin-guide/mm/hugetlbpage.rst. Format: size[KMG] - hugetlb_cma= [HW,CMA] The size of a CMA area used for allocation + hugepage_alloc_threads= + [HW] The number of threads that should be used to + allocate hugepages during boot. This option can be + used to improve system bootup time when allocating + a large amount of huge pages. + The default value is 25% of the available hardware threads. + + Note that this parameter only applies to non-gigantic huge pages. + + hugetlb_cma= [HW,CMA,EARLY] The size of a CMA area used for allocation of gigantic hugepages. Or using node format, the size of a CMA area per node can be specified. Format: nn[KMGTPE] or (node format) @@ -1768,6 +2094,13 @@ hugepages using the CMA allocator. 
If enabled, the boot-time allocation of gigantic hugepages is skipped. + hugetlb_cma_only= + [HW,CMA,EARLY] When allocating new HugeTLB pages, only + try to allocate from the CMA areas. + + This option does nothing if hugetlb_cma= is not also + specified. + hugetlb_free_vmemmap= [KNL] Requires CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP enabled. @@ -1789,14 +2122,20 @@ the added memory block itself do not be affected. hung_task_panic= - [KNL] Should the hung task detector generate panics. - Format: 0 | 1 + [KNL] Number of hung tasks to trigger kernel panic. + Format: <int> + + When set to a non-zero value, a kernel panic will be triggered if + the number of detected hung tasks reaches this value. - A value of 1 instructs the kernel to panic when a - hung task is detected. The default value is controlled - by the CONFIG_BOOTPARAM_HUNG_TASK_PANIC build-time - option. The value selected by this boot parameter can - be changed later by the kernel.hung_task_panic sysctl. + 0: don't panic + 1: panic immediately on first hung task + N: panic after N hung tasks are detected in a single scan + + The default value is controlled by the + CONFIG_BOOTPARAM_HUNG_TASK_PANIC build-time option. The value + selected by this boot parameter can be changed later by the + kernel.hung_task_panic sysctl. hvc_iucv= [S390] Number of z/VM IUCV hypervisor console (HVC) terminal devices. Valid values: 0..8 @@ -1804,9 +2143,16 @@ If specified, z/VM IUCV HVC accepts connections from listed z/VM user IDs only. - hv_nopvspin [X86,HYPER_V] Disables the paravirt spinlock optimizations - which allow the hypervisor to 'idle' the - guest on lock contention. + hv_nopvspin [X86,HYPER_V,EARLY] + Disables the paravirt spinlock optimizations + which allow the hypervisor to 'idle' the guest + on lock contention. + + hw_protection= [HW] + Format: reboot | shutdown + + Hardware protection action taken on critical events like + overtemperature or imminent voltage loss. 
i2c_bus= [HW] Override the default board specific I2C bus speed or register an additional I2C bus that is not @@ -1814,6 +2160,28 @@ Format: <bus_id>,<clkrate> + i2c_touchscreen_props= [HW,ACPI,X86] + Set device-properties for ACPI-enumerated I2C-attached + touchscreen, to e.g. fix coordinates of upside-down + mounted touchscreens. If you need this option please + submit a drivers/platform/x86/touchscreen_dmi.c patch + adding a DMI quirk for this. + + Format: + <ACPI_HW_ID>:<prop_name>=<val>[:prop_name=val][:...] + Where <val> is one of: + Omit "=<val>" entirely Set a boolean device-property + Unsigned number Set a u32 device-property + Anything else Set a string device-property + + Examples (split over multiple lines): + i2c_touchscreen_props=GDIX1001:touchscreen-inverted-x: + touchscreen-inverted-y + + i2c_touchscreen_props=MSSL1680:touchscreen-size-x=1920: + touchscreen-size-y=1080:touchscreen-inverted-y: + firmware-name=gsl1680-vendor-model.fw:silead,home-button + i8042.debug [HW] Toggle i8042 debug mode i8042.unmask_kbd_data [HW] Enable printing of interrupt data from the KBD port @@ -1861,18 +2229,33 @@ 0 -- machine default 1 -- force brightness inversion + ia32_emulation= [X86-64] + Format: <bool> + When true, allows loading 32-bit programs and executing 32-bit + syscalls, essentially overriding IA32_EMULATION_DEFAULT_DISABLED at + boot time. When false, unconditionally disables IA32 emulation. + icn= [HW,ISDN] Format: <io>[,<membase>[,<icn_id>[,<icn_id2>]]] - idle= [X86] + idle= [X86,EARLY] Format: idle=poll, idle=halt, idle=nomwait - Poll forces a polling idle loop that can slightly - improve the performance of waking up a idle CPU, but - will use a lot of power and make the system run hot. - Not recommended. + + idle=poll: Don't do power saving in the idle loop + using HLT, but poll for rescheduling event. This will + make the CPUs eat a lot more power, but may be useful + to get slightly better performance in multiprocessor + benchmarks. 
It also makes some profiling using + performance counters more accurate. Please note that + on systems with MONITOR/MWAIT support (like Intel + EM64T CPUs) this option has no performance advantage + over the normal idle loop. It may also interact badly + with hyperthreading. + idle=halt: Halt is forced to be used for CPU idle. In such case C2/C3 won't be used again. + idle=nomwait: Disable mwait for CPU C-states idxd.sva= [HW] @@ -1887,7 +2270,7 @@ for the device. By default it is set to false (0). ieee754= [MIPS] Select IEEE Std 754 conformance mode - Format: { strict | legacy | 2008 | relaxed } + Format: { strict | legacy | 2008 | relaxed | emulated } Default: strict Choose which programs will be accepted for execution @@ -1907,6 +2290,8 @@ by the FPU relaxed accept any binaries regardless of whether supported by the FPU + emulated accept any binaries but enable FPU emulator + if binary mode is unsupported by the FPU. The FPU emulator is always able to support both NaN encodings, so if no FPU hardware is present or it has @@ -1921,7 +2306,7 @@ mode generally follows that for the NaN encoding, except where unsupported by hardware. - ignore_loglevel [KNL] + ignore_loglevel [KNL,EARLY] Ignore loglevel setting - this will print /all/ kernel messages to the console. Useful for debugging. We also add it as printk module parameter, so users @@ -2014,6 +2399,28 @@ different crypto accelerators. This option can be used to achieve best performance for particular HW. + ima= [IMA] Enable or disable IMA + Format: { "off" | "on" } + Default: "on" + Note that disabling IMA is limited to kdump kernel. + + indirect_target_selection= [X86,Intel] Mitigation control for Indirect + Target Selection(ITS) bug in Intel CPUs. Updated + microcode is also required for a fix in IBPB. + + on: Enable mitigation (default). + off: Disable mitigation. + force: Force the ITS bug and deploy default + mitigation. 
+ vmexit: Only deploy mitigation if CPU is affected by + guest/host isolation part of ITS. + stuff: Deploy RSB-fill mitigation when retpoline is + also deployed. Otherwise, deploy the default + mitigation. + + For details see: + Documentation/admin-guide/hw-vuln/indirect-target-selection.rst + init= [KNL] Format: <full_path> Run specified binary instead of /sbin/init as init @@ -2039,21 +2446,21 @@ unpacking being completed before device_ and late_ initcalls. - initrd= [BOOT] Specify the location of the initial ramdisk + initrd= [BOOT,EARLY] Specify the location of the initial ramdisk - initrdmem= [KNL] Specify a physical address and size from which to + initrdmem= [KNL,EARLY] Specify a physical address and size from which to load the initrd. If an initrd is compiled in or specified in the bootparams, it takes priority over this setting. Format: ss[KMG],nn[KMG] Default is 0, 0 - init_on_alloc= [MM] Fill newly allocated pages and heap objects with + init_on_alloc= [MM,EARLY] Fill newly allocated pages and heap objects with zeroes. Format: 0 | 1 Default set by CONFIG_INIT_ON_ALLOC_DEFAULT_ON. - init_on_free= [MM] Fill freed pages and heap objects with zeroes. + init_on_free= [MM,EARLY] Fill freed pages and heap objects with zeroes. Format: 0 | 1 Default set by CONFIG_INIT_ON_FREE_DEFAULT_ON. @@ -2109,7 +2516,7 @@ 0 disables intel_idle and fall back on acpi_idle. 1 to 9 specify maximum depth of C-state. 
- intel_pstate= [X86] + intel_pstate= [X86,EARLY] disable Do not enable intel_pstate as the default scaling driver for the supported processors @@ -2152,35 +2559,93 @@ per_cpu_perf_limits Allow per-logical-CPU P-State performance control limits using cpufreq sysfs interface + no_cas + Do not enable capacity-aware scheduling (CAS) on + hybrid systems - intremap= [X86-64, Intel-IOMMU] + intremap= [X86-64,Intel-IOMMU,EARLY] on enable Interrupt Remapping (default) off disable Interrupt Remapping nosid disable Source ID checking no_x2apic_optout BIOS x2APIC opt-out request will be ignored nopost disable Interrupt Posting + posted_msi + enable MSIs delivered as posted interrupts iomem= Disable strict checking of access to MMIO memory strict regions from userspace. relaxed - iommu= [X86] + iommu= [X86,EARLY] + off + Don't initialize and use any kind of IOMMU. + force + Force the use of the hardware IOMMU even when + it is not actually needed (e.g. because < 3 GB + memory). + noforce + Don't force hardware IOMMU usage when it is not + needed. (default). + biomerge panic nopanic merge nomerge + soft - pt [X86] - nopt [X86] - nobypass [PPC/POWERNV] + Use software bounce buffering (SWIOTLB) (default for + Intel machines). This can be used to prevent the usage + of an available hardware IOMMU. + + [X86] + pt + [X86] + nopt + [PPC/POWERNV] + nobypass Disable IOMMU bypass, using IOMMU for PCI devices. - iommu.forcedac= [ARM64, X86] Control IOVA allocation for PCI devices. + [X86] + AMD Gart HW IOMMU-specific options: + + <size> + Set the size of the remapping area in bytes. + + allowed + Overwrite iommu off workarounds for specific chipsets + + fullflush + Flush IOMMU on each allocation (default). + + nofullflush + Don't use IOMMU fullflush. + + memaper[=<order>] + Allocate an own aperture over RAM with size + 32MB<<order. (default: order=1, i.e. 64MB) + + merge + Do scatter-gather (SG) merging. Implies "force" + (experimental). 
+ + nomerge + Don't do scatter-gather (SG) merging. + + noaperture + Ask the IOMMU not to touch the aperture for AGP. + + noagp + Don't initialize the AGP driver and use full aperture. + + panic + Always panic when IOMMU overflows. + + iommu.forcedac= [ARM64,X86,EARLY] Control IOVA allocation for PCI devices. Format: { "0" | "1" } 0 - Try to allocate a 32-bit DMA address first, before falling back to the full range if needed. @@ -2188,7 +2653,7 @@ forcing Dual Address Cycle for PCI cards supporting greater than 32-bit addressing. - iommu.strict= [ARM64, X86] Configure TLB invalidation behaviour + iommu.strict= [ARM64,X86,S390,EARLY] Configure TLB invalidation behaviour Format: { "0" | "1" } 0 - Lazy mode. Request that DMA unmap operations use deferred @@ -2204,7 +2669,7 @@ legacy driver-specific options takes precedence. iommu.passthrough= - [ARM64, X86] Configure DMA to bypass the IOMMU by default. + [ARM64,X86,EARLY] Configure DMA to bypass the IOMMU by default. Format: { "0" | "1" } 0 - Use IOMMU translation for DMA. 1 - Bypass the IOMMU for DMA. @@ -2214,7 +2679,7 @@ See comment before marvel_specify_io7 in arch/alpha/kernel/core_marvel.c. - io_delay= [X86] I/O delay method + io_delay= [X86,EARLY] I/O delay method 0x80 Standard port 0x80 based delay 0xed @@ -2227,37 +2692,61 @@ ip= [IP_PNP] See Documentation/admin-guide/nfs/nfsroot.rst. - ipcmni_extend [KNL] Extend the maximum number of unique System V + ipcmni_extend [KNL,EARLY] Extend the maximum number of unique System V IPC identifiers from 32,768 to 16,777,216. + ipe.enforce= [IPE] + Format: <bool> + Determine whether IPE starts in permissive (0) or + enforce (1) mode. The default is enforce. + + ipe.success_audit= + [IPE] + Format: <bool> + Start IPE with success auditing enabled, emitting + an audit event when a binary is allowed. The default + is 0. + irqaffinity= [SMP] Set the default irq affinity mask The argument is a cpu list, as described above. 
irqchip.gicv2_force_probe= - [ARM, ARM64] + [ARM,ARM64,EARLY] Format: <bool> Force the kernel to look for the second 4kB page of a GICv2 controller even if the memory range exposed by the device tree is too small. irqchip.gicv3_nolpi= - [ARM, ARM64] + [ARM,ARM64,EARLY] Force the kernel to ignore the availability of LPIs (and by consequence ITSs). Intended for system that use the kernel as a bootloader, and thus want to let secondary kernels in charge of setting up LPIs. - irqchip.gicv3_pseudo_nmi= [ARM64] + irqchip.gicv3_pseudo_nmi= [ARM64,EARLY] Enables support for pseudo-NMIs in the kernel. This requires the kernel to be built with CONFIG_ARM64_PSEUDO_NMI. + irqchip.riscv_imsic_noipi + [RISC-V,EARLY] + Force the kernel to not use IMSIC software injected MSIs + as IPIs. Intended for system where IMSIC is trap-n-emulated, + and thus want to reduce MMIO traps when triggering IPIs + to multiple harts. + irqfixup [HW] When an interrupt is not handled search all handlers for it. Intended to get systems with badly broken firmware running. + irqhandler.duration_warn_us= [KNL] + Warn if an IRQ handler exceeds the specified duration + threshold in microseconds. Useful for identifying + long-running IRQs in the system. + irqpoll [HW] When an interrupt is not handled search all handlers for it. Also check all handlers each timer @@ -2275,7 +2764,9 @@ specified in the flag list (default: domain): nohz - Disable the tick when a single task runs. + Disable the tick when a single task runs as well as + disabling other kernel noises like having RCU callbacks + offloaded. This is equivalent to the nohz_full parameter. A residual 1Hz tick is offloaded to workqueues, which you need to affine to housekeeping through the global @@ -2393,15 +2884,15 @@ parameter KASAN will print report only for the first invalid access. - keep_bootcon [KNL] + keep_bootcon [KNL,EARLY] Do not unregister boot console at start. 
This is only useful for debugging when something happens in the window between unregistering the boot console and initializing the real console. - keepinitrd [HW,ARM] + keepinitrd [HW,ARM] See retain_initrd. - kernelcore= [KNL,X86,IA-64,PPC] + kernelcore= [KNL,X86,PPC,EARLY] Format: nn[KMGTPE] | nn% | "mirror" This parameter specifies the amount of memory usable by the kernel for non-movable allocations. The requested @@ -2426,7 +2917,7 @@ for Movable pages. "nn[KMGTPE]", "nn%", and "mirror" are exclusive, so you cannot specify multiple forms. - kgdbdbgp= [KGDB,HW] kgdb over EHCI usb debug port. + kgdbdbgp= [KGDB,HW,EARLY] kgdb over EHCI usb debug port. Format: <Controller#>[,poll interval] The controller # is the number of the ehci usb debug port as it is probed via PCI. The poll interval is @@ -2447,7 +2938,7 @@ kms, kbd format: kms,kbd kms, kbd and serial format: kms,kbd,<ser_dev>[,baud] - kgdboc_earlycon= [KGDB,HW] + kgdboc_earlycon= [KGDB,HW,EARLY] If the boot console provides the ability to read characters and can work in polling mode, you can use this parameter to tell kgdb to use it as a backend @@ -2462,14 +2953,39 @@ blank and the first boot console that implements read() will be picked. - kgdbwait [KGDB] Stop kernel execution and enter the + kgdbwait [KGDB,EARLY] Stop kernel execution and enter the kernel debugger at the earliest opportunity. + kho= [KEXEC,EARLY] + Format: { "0" | "1" | "off" | "on" | "y" | "n" } + Enables or disables Kexec HandOver. + "0" | "off" | "n" - kexec handover is disabled + "1" | "on" | "y" - kexec handover is enabled + + kho_scratch= [KEXEC,EARLY] + Format: ll[KMG],mm[KMG],nn[KMG] | nn% + Defines the size of the KHO scratch region. The KHO + scratch regions are physically contiguous memory + ranges that can only be used for non-kernel + allocations. That way, even when memory is heavily + fragmented with handed over memory, the kexeced + kernel will always have enough contiguous ranges to + bootstrap itself. 
+ + It is possible to specify the exact amount of + memory in the form of "ll[KMG],mm[KMG],nn[KMG]" + where the first parameter defines the size of a low + memory scratch area, the second parameter defines + the size of a global scratch area and the third + parameter defines the size of additional per-node + scratch areas. The form "nn%" defines scale factor + (in percents) of memory that was used during boot. + kmac= [MIPS] Korina ethernet MAC address. Configure the RouterBoard 532 series on-chip Ethernet adapter MAC address. - kmemleak= [KNL] Boot-time kmemleak enable/disable + kmemleak= [KNL,EARLY] Boot-time kmemleak enable/disable Valid arguments: on, off Default: on Built with CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF=y, @@ -2488,8 +3004,8 @@ See also Documentation/trace/kprobetrace.rst "Kernel Boot Parameter" section. - kpti= [ARM64] Control page table isolation of user - and kernel address spaces. + kpti= [ARM64,EARLY] Control page table isolation of + user and kernel address spaces. Default: enabled on cores which need mitigation. 0: force disabled 1: force enabled @@ -2528,6 +3044,23 @@ Default is Y (on). + kvm.enable_virt_at_load=[KVM,ARM64,LOONGARCH,MIPS,RISCV,X86] + If enabled, KVM will enable virtualization in hardware + when KVM is loaded, and disable virtualization when KVM + is unloaded (if KVM is built as a module). + + If disabled, KVM will dynamically enable and disable + virtualization on-demand when creating and destroying + VMs, i.e. on the 0=>1 and 1=>0 transitions of the + number of VMs. + + Enabling virtualization at module load avoids potential + latency for creation of the 0=>1 VM, as KVM serializes + virtualization enabling across all online CPUs. The + "cost" of enabling virtualization when KVM is loaded, + is that doing so may interfere with using out-of-tree + hypervisors that want to "own" virtualization hardware. + kvm.enable_vmware_backdoor=[KVM] Support VMware backdoor PV interface. Default is false (don't support). 
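Note on the kho_scratch= hunk above: the parameter takes either three sizes ("ll[KMG],mm[KMG],nn[KMG]" for the low-memory, global and per-node scratch areas) or a single "nn%" scale factor. The sketch below shows one way to split such a value. It is an illustration under stated assumptions, not the kernel's implementation: the class and function names are hypothetical, and the K/M/G suffixes are assumed to be binary multiples (1024-based), as is usual for size arguments on the kernel command line.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumption: K, M and G are powers of 1024.
_UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_size(text: str) -> int:
    """Convert 'nn[KMG]' into a byte count."""
    match = re.fullmatch(r"(\d+)([KMG]?)", text.strip())
    if match is None:
        raise ValueError(f"not a size of the form nn[KMG]: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


@dataclass
class KhoScratch:
    lowmem: Optional[int] = None         # size of the low memory scratch area
    global_size: Optional[int] = None    # size of the global scratch area
    pernode: Optional[int] = None        # size of each additional per-node area
    scale_percent: Optional[int] = None  # "nn%" form: scale factor in percent


def parse_kho_scratch(value: str) -> KhoScratch:
    """Split a kho_scratch= value into its documented components."""
    value = value.strip()
    if value.endswith("%"):
        return KhoScratch(scale_percent=int(value[:-1]))
    parts = value.split(",")
    if len(parts) != 3:
        raise ValueError("expected ll[KMG],mm[KMG],nn[KMG] or nn%")
    lowmem, global_size, pernode = (parse_size(p) for p in parts)
    return KhoScratch(lowmem=lowmem, global_size=global_size, pernode=pernode)
```

For example, parse_kho_scratch("64M,512M,256M") fills the three size fields and leaves scale_percent unset, while parse_kho_scratch("200%") sets only scale_percent. The two forms are kept mutually exclusive, matching the "|" in the documented format.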
@@ -2565,43 +3098,87 @@ (enabled). Disable by KVM if hardware lacks support for NPT. + kvm-amd.ciphertext_hiding_asids= + [KVM,AMD] Ciphertext hiding prevents disallowed accesses + to SNP private memory from reading ciphertext. Instead, + reads will see constant default values (0xff). + + If ciphertext hiding is enabled, the joint SEV-ES and + SEV-SNP ASID space is partitioned into separate SEV-ES + and SEV-SNP ASID ranges, with the SEV-SNP range being + [1..max_snp_asid] and the SEV-ES range being + (max_snp_asid..min_sev_asid), where min_sev_asid is + enumerated by CPUID.0x.8000_001F[EDX]. + + A non-zero value enables SEV-SNP ciphertext hiding and + adjusts the ASID ranges for SEV-ES and SEV-SNP guests. + KVM caps the number of SEV-SNP ASIDs at the maximum + possible value, e.g. specifying -1u will assign all + joint SEV-ES and SEV-SNP ASIDs to SEV-SNP. Note, + assigning all joint ASIDs to SEV-SNP, i.e. configuring + max_snp_asid == min_sev_asid-1, will effectively make + SEV-ES unusable. + kvm-arm.mode= - [KVM,ARM] Select one of KVM/arm64's modes of operation. + [KVM,ARM,EARLY] Select one of KVM/arm64's modes of + operation. none: Forcefully disable KVM. nvhe: Standard nVHE-based mode, without support for protected guests. - protected: nVHE-based mode with support for guests whose - state is kept private from the host. + protected: Mode with support for guests whose state is + kept private from the host, using VHE or + nVHE depending on HW support. nested: VHE-based mode with support for nested - virtualization. Requires at least ARMv8.3 - hardware. + virtualization. Requires at least ARMv8.4 + hardware (with FEAT_NV2). Defaults to VHE/nVHE based on hardware support. Setting mode to "protected" will disable kexec and hibernation - for the host. "nested" is experimental and should be - used with extreme caution. + for the host. To force nVHE on VHE hardware, add + "arm64_sw.hvhe=0 id_aa64mmfr1.vh=0" to the + command-line. 
+ "nested" is experimental and should be used with + extreme caution. kvm-arm.vgic_v3_group0_trap= - [KVM,ARM] Trap guest accesses to GICv3 group-0 + [KVM,ARM,EARLY] Trap guest accesses to GICv3 group-0 system registers kvm-arm.vgic_v3_group1_trap= - [KVM,ARM] Trap guest accesses to GICv3 group-1 + [KVM,ARM,EARLY] Trap guest accesses to GICv3 group-1 system registers kvm-arm.vgic_v3_common_trap= - [KVM,ARM] Trap guest accesses to GICv3 common + [KVM,ARM,EARLY] Trap guest accesses to GICv3 common system registers kvm-arm.vgic_v4_enable= - [KVM,ARM] Allow use of GICv4 for direct injection of - LPIs. + [KVM,ARM,EARLY] Allow use of GICv4 for direct + injection of LPIs. + + kvm-arm.wfe_trap_policy= + [KVM,ARM] Control when to set WFE instruction trap for + KVM VMs. Traps are allowed but not guaranteed by the + CPU architecture. + + trap: set WFE instruction trap + + notrap: clear WFE instruction trap + + kvm-arm.wfi_trap_policy= + [KVM,ARM] Control when to set WFI instruction trap for + KVM VMs. Traps are allowed but not guaranteed by the + CPU architecture. + + trap: set WFI instruction trap - kvm_cma_resv_ratio=n [PPC] + notrap: clear WFI instruction trap + + kvm_cma_resv_ratio=n [PPC,EARLY] Reserves given percentage from system memory area for contiguous memory allocation for KVM hash pagetable allocation. @@ -2624,7 +3201,7 @@ kvm-intel.flexpriority= [KVM,Intel] Control KVM's use of FlexPriority feature - (TPR shadow). Default is 1 (enabled). Disalbe by KVM if + (TPR shadow). Default is 1 (enabled). Disable by KVM if hardware lacks support for it. kvm-intel.nested= @@ -2654,7 +3231,7 @@ (enabled). Disable by KVM if hardware lacks support for it. - l1d_flush= [X86,INTEL] + l1d_flush= [X86,INTEL,EARLY] Control mitigation for L1D based snooping vulnerability. 
Certain CPUs are vulnerable to an exploit against CPU @@ -2671,7 +3248,7 @@ on - enable the interface for the mitigation - l1tf= [X86] Control mitigation of the L1TF vulnerability on + l1tf= [X86,EARLY] Control mitigation of the L1TF vulnerability on affected CPUs The kernel PTE inversion protection is unconditionally @@ -2740,7 +3317,7 @@ l3cr= [PPC] - lapic [X86-32,APIC] Enable the local APIC even if BIOS + lapic [X86-32,APIC,EARLY] Enable the local APIC even if BIOS disabled it. lapic= [X86,APIC] Do not use TSC deadline @@ -2748,7 +3325,7 @@ back to the programmable timer unit in the LAPIC. Format: notscdeadline - lapic_timer_c2_ok [X86,APIC] trust the local apic timer + lapic_timer_c2_ok [X86,APIC,EARLY] trust the local apic timer in C2 power state. libata.dma= [LIBATA] DMA control @@ -2843,6 +3420,8 @@ * max_sec_lba48: Set or clear transfer size limit to 65535 sectors. + * external: Mark port as external (hotplug-capable). + * [no]lpm: Enable or disable link power management. * [no]setxfer: Indicate if transfer speed mode setting @@ -2872,7 +3451,7 @@ lockd.nlm_udpport=M [NFS] Assign UDP port. Format: <integer> - lockdown= [SECURITY] + lockdown= [SECURITY,EARLY] { integrity | confidentiality } Enable the kernel lockdown feature. If set to integrity, kernel features that allow userland to @@ -2881,6 +3460,38 @@ to extract confidential information from the kernel are also disabled. + locktorture.acq_writer_lim= [KNL] + Set the time limit in jiffies for a lock + acquisition. Acquisitions exceeding this limit + will result in a splat once they do complete. + + locktorture.bind_readers= [KNL] + Specify the list of CPUs to which the readers are + to be bound. + + locktorture.bind_writers= [KNL] + Specify the list of CPUs to which the writers are + to be bound. + + locktorture.call_rcu_chains= [KNL] + Specify the number of self-propagating call_rcu() + chains to set up. 
These are used to ensure that + there is a high probability of an RCU grace period + in progress at any given time. Defaults to 0, + which disables these call_rcu() chains. + + locktorture.long_hold= [KNL] + Specify the duration in milliseconds for the + occasional long-duration lock hold time. Defaults + to 100 milliseconds. Select 0 to disable. + + locktorture.nested_locks= [KNL] + Specify the maximum lock nesting depth that + locktorture is to exercise, up to a limit of 8 + (MAX_NESTED_LOCKS). Specify zero to disable. + Note that this parameter is ineffective on types + of locks that do not support nested acquisition. + locktorture.nreaders_stress= [KNL] Set the number of locking read-acquisition kthreads. Defaults to being automatically set based on the @@ -2896,6 +3507,25 @@ Set time (s) between CPU-hotplug operations, or zero to disable CPU-hotplug testing. + locktorture.rt_boost= [KNL] + Do periodic testing of real-time lock priority + boosting. Select 0 to disable, 1 to boost + only rt_mutex, and 2 to boost unconditionally. + Defaults to 2, which might seem to be an + odd choice, but which should be harmless for + non-real-time spinlocks, due to their disabling + of preemption. Note that non-realtime mutexes + disable boosting. + + locktorture.rt_boost_factor= [KNL] + Number that determines how often and for how + long priority boosting is exercised. This is + scaled down by the number of writers, so that the + number of boosts per unit time remains roughly + constant as the number of writers increases. + On the other hand, the duration of each boost + increases with the number of writers. + locktorture.shuffle_interval= [KNL] Set task-shuffle interval (jiffies). Shuffling tasks allows some CPUs to go into dyntick-idle @@ -2921,10 +3551,15 @@ locktorture.verbose= [KNL] Enable additional printk() statements. + locktorture.writer_fifo= [KNL] + Run the write-side locktorture kthreads at + sched_set_fifo() real-time priority. 
+ logibm.irq= [HW,MOUSE] Logitech Bus Mouse Driver Format: <irq> - loglevel= All Kernel Messages with a loglevel smaller than the + loglevel= [KNL,EARLY] + All Kernel Messages with a loglevel smaller than the console loglevel will be printed to the console. It can also be changed with klogd or other programs. The loglevels are defined as follows: @@ -2938,13 +3573,15 @@ 6 (KERN_INFO) informational 7 (KERN_DEBUG) debug-level messages - log_buf_len=n[KMG] Sets the size of the printk ring buffer, - in bytes. n must be a power of two and greater - than the minimal size. The minimal size is defined - by LOG_BUF_SHIFT kernel config parameter. There is - also CONFIG_LOG_CPU_MAX_BUF_SHIFT config parameter - that allows to increase the default size depending on - the number of CPUs. See init/Kconfig for more details. + log_buf_len=n[KMG] [KNL,EARLY] + Sets the size of the printk ring buffer, in bytes. + n must be a power of two and greater than the + minimal size. The minimal size is defined by + LOG_BUF_SHIFT kernel config parameter. There + is also CONFIG_LOG_CPU_MAX_BUF_SHIFT config + parameter that allows to increase the default size + depending on the number of CPUs. See init/Kconfig + for more details. logo.nologo [FB] Disables display of the built-in Linux logo. This may be used to provide more screen space for @@ -2982,27 +3619,17 @@ unlikely, in the extreme case this might damage your hardware. - ltpc= [NET] - Format: <io>,<irq>,<dma> - lsm.debug [SECURITY] Enable LSM initialization debugging output. lsm=lsm1,...,lsmN [SECURITY] Choose order of LSM initialization. This overrides CONFIG_LSM, and the "security=" parameter. - machvec= [IA-64] Force the use of a particular machine-vector - (machvec) in a generic kernel. - Example: machvec=hpzx1 - machtype= [Loongson] Share the same kernel image file between different yeeloong laptops. 
Example: machtype=lemote-yeeloong-2f-7inch - max_addr=nn[KMG] [KNL,BOOT,IA-64] All physical memory greater - than or equal to this physical address is ignored. - - maxcpus= [SMP] Maximum number of processors that an SMP kernel + maxcpus= [SMP,EARLY] Maximum number of processors that an SMP kernel will bring up during bootup. maxcpus=n : n >= 0 limits the kernel to bring up 'n' processors. Surely after bootup you can bring up the other plugged cpu by executing @@ -3018,9 +3645,77 @@ devices can be requested on-demand with the /dev/loop-control interface. - mce [X86-32] Machine Check Exception + mce= [X86-{32,64}] + + Please see Documentation/arch/x86/x86_64/machinecheck.rst for sysfs runtime tunables. + + off + disable machine check + + no_cmci + disable CMCI(Corrected Machine Check Interrupt) that + Intel processor supports. Usually this disablement is + not recommended, but it might be handy if your + hardware is misbehaving. + + Note that you'll get more problems without CMCI than + with due to the shared banks, i.e. you might get + duplicated error logs. + + dont_log_ce + don't make logs for corrected errors. All events + reported as corrected are silently cleared by OS. This + option will be useful if you have no interest in any + of corrected errors. + + ignore_ce + disable features for corrected errors, e.g. + polling timer and CMCI. All events reported as + corrected are not cleared by OS and remained in its + error banks. + + Usually this disablement is not recommended, however + if there is an agent checking/clearing corrected + errors (e.g. BIOS or hardware monitoring + applications), conflicting with OS's error handling, + and you cannot deactivate the agent, then this option + will be a help. + + no_lmce + do not opt-in to Local MCE delivery. Use legacy method + to broadcast MCEs. + + bootlog + enable logging of machine checks left over from + booting. Disabled by default on AMD Fam10h and older + because some BIOS leave bogus ones. 
+ + If your BIOS doesn't do that it's a good idea to + enable though to make sure you log even machine check + events that result in a reboot. On Intel systems it is + enabled by default. + + nobootlog + disable boot machine check logging. + + monarchtimeout (number) + sets the time in us to wait for other CPUs on machine + checks. 0 to disable. + + bios_cmci_threshold + don't overwrite the bios-set CMCI threshold. This boot + option prevents Linux from overwriting the CMCI + threshold set by the bios. Without this option, Linux + always sets the CMCI threshold to 1. Enabling this may + make memory predictive failure analysis less effective + if the bios sets thresholds for memory errors since we + will not see details for all errors. + + recovery + force-enable recoverable machine check code paths + + Everything else is in sysfs now. - mce=option [X86-64] See Documentation/arch/x86/x86_64/boot-options.rst md= [HW] RAID subsystems devices and level See Documentation/admin-guide/md.rst. @@ -3029,7 +3724,7 @@ Format: <first>,<last> Specifies range of consoles to be captured by the MDA. - mds= [X86,INTEL] + mds= [X86,INTEL,EARLY] Control mitigation for the Micro-architectural Data Sampling (MDS) vulnerability. @@ -3061,11 +3756,12 @@ For details see: Documentation/admin-guide/hw-vuln/mds.rst - mem=nn[KMG] [HEXAGON] Set the memory size. + mem=nn[KMG] [HEXAGON,EARLY] Set the memory size. Must be specified, otherwise memory size will be 0. - mem=nn[KMG] [KNL,BOOT] Force usage of a specific amount of memory - Amount of memory to be used in cases as follows: + mem=nn[KMG] [KNL,BOOT,EARLY] Force usage of a specific amount + of memory Amount of memory to be used in cases + as follows: 1 for test; 2 when the kernel is not able to see the whole system memory; @@ -3089,8 +3785,8 @@ if system memory of hypervisor is not sufficient. mem=nn[KMG]@ss[KMG] - [ARM,MIPS] - override the memory layout reported by - firmware. 
+ [ARM,MIPS,EARLY] - override the memory layout + reported by firmware. Define a memory region of size nn[KMG] starting at ss[KMG]. Multiple different regions can be specified with @@ -3099,28 +3795,28 @@ mem=nopentium [BUGS=X86-32] Disable usage of 4MB pages for kernel memory. - memblock=debug [KNL] Enable memblock debug messages. + memblock=debug [KNL,EARLY] Enable memblock debug messages. memchunk=nn[KMG] [KNL,SH] Allow user to override the default size for per-device physically contiguous DMA buffers. - memhp_default_state=online/offline + memhp_default_state=online/offline/online_kernel/online_movable [KNL] Set the initial state for the memory hotplug onlining policy. If not specified, the default value is set according to the - CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE kernel config - option. + CONFIG_MHP_DEFAULT_ONLINE_TYPE kernel config + options. See Documentation/admin-guide/mm/memory-hotplug.rst. - memmap=exactmap [KNL,X86] Enable setting of an exact + memmap=exactmap [KNL,X86,EARLY] Enable setting of an exact E820 memory map, as specified by the user. Such memmap=exactmap lines can be constructed based on BIOS output or other requirements. See the memmap=nn@ss option description. memmap=nn[KMG]@ss[KMG] - [KNL, X86, MIPS, XTENSA] Force usage of a specific region of memory. + [KNL, X86,MIPS,XTENSA,EARLY] Force usage of a specific region of memory. Region of memory to be used is from ss to ss+nn. If @ss[KMG] is omitted, it is equivalent to mem=nn[KMG], which limits max address to nn[KMG]. @@ -3130,11 +3826,11 @@ memmap=100M@2G,100M#3G,1G!1024G memmap=nn[KMG]#ss[KMG] - [KNL,ACPI] Mark specific memory as ACPI data. + [KNL,ACPI,EARLY] Mark specific memory as ACPI data. Region of memory to be marked is from ss to ss+nn. memmap=nn[KMG]$ss[KMG] - [KNL,ACPI] Mark specific memory as reserved. + [KNL,ACPI,EARLY] Mark specific memory as reserved. Region of memory to be reserved is from ss to ss+nn. 
Example: Exclude memory from 0x18690000-0x1869ffff memmap=64K$0x18690000 @@ -3144,14 +3840,14 @@ like Grub2, otherwise '$' and the following number will be eaten. - memmap=nn[KMG]!ss[KMG] + memmap=nn[KMG]!ss[KMG,EARLY] [KNL,X86] Mark specific memory as protected. Region of memory to be used, from ss to ss+nn. The memory region may be marked as e820 type 12 (0xc) and is NVDIMM or ADR memory. memmap=<size>%<offset>-<oldtype>+<newtype> - [KNL,ACPI] Convert memory within the specified region + [KNL,ACPI,EARLY] Convert memory within the specified region from <oldtype> to <newtype>. If "-<oldtype>" is left out, the whole region will be marked as <newtype>, even if previously unavailable. If "+<newtype>" is left @@ -3159,25 +3855,25 @@ specified as e820 types, e.g., 1 = RAM, 2 = reserved, 3 = ACPI, 12 = PRAM. - memory_corruption_check=0/1 [X86] + memory_corruption_check=0/1 [X86,EARLY] Some BIOSes seem to corrupt the first 64k of memory when doing things like suspend/resume. Setting this option will scan the memory looking for corruption. Enabling this will both detect corruption and prevent the kernel from using the memory being corrupted. - However, its intended as a diagnostic tool; if + However, it's intended as a diagnostic tool; if repeatable BIOS-originated corruption always affects the same memory, you can use memmap= to prevent the kernel from using that memory. - memory_corruption_check_size=size [X86] + memory_corruption_check_size=size [X86,EARLY] By default it checks for corruption in the low 64k, making this memory unavailable for normal use. Use this parameter to scan for corruption in more or less memory. - memory_corruption_check_period=seconds [X86] + memory_corruption_check_period=seconds [X86,EARLY] By default it checks for corruption every 60 seconds. Use this parameter to check at some other rate. 0 disables periodic checking. @@ -3201,7 +3897,7 @@ Note that even when enabled, there are a few cases where the feature is not effective. 
- memtest= [KNL,X86,ARM,M68K,PPC,RISCV] Enable memtest + memtest= [KNL,X86,ARM,M68K,PPC,RISCV,EARLY] Enable memtest Format: <integer> default : 0 <disable> Specifies the number of memtest passes to be @@ -3213,9 +3909,7 @@ mem_encrypt= [X86-64] AMD Secure Memory Encryption (SME) control Valid arguments: on, off - Default (depends on kernel configuration option): - on (CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT=y) - off (CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT=n) + Default: off mem_encrypt=on: Activate SME mem_encrypt=off: Do not activate SME @@ -3228,10 +3922,6 @@ deep - Suspend-To-RAM or equivalent (if supported) See Documentation/admin-guide/pm/sleep-states.rst. - mfgpt_irq= [IA-32] Specify the IRQ to use for the - Multi-Function General Purpose Timers on AMD Geode - platforms. - mfgptfix [X86-32] Fix MFGPT timers on AMD Geode platforms when the BIOS has incorrectly applied a workaround. TinyBIOS version 0.98 is known to be affected, 0.99 fixes the @@ -3239,8 +3929,18 @@ mga= [HW,DRM] - min_addr=nn[KMG] [KNL,BOOT,IA-64] All physical memory below this - physical address is ignored. + microcode= [X86] Control the behavior of the microcode loader. + Available options, comma separated: + + base_rev=X - with <X> with format: <u32> + Set the base microcode revision of each thread when in + debug mode. + + dis_ucode_ldr: disable the microcode loader + + force_minrev: + Enable or disable the microcode minimal revision + enforcement for the runtime microcode loader. mini2440= [ARM,HW,KNL] Format:[0..2][b][c][t] @@ -3264,33 +3964,42 @@ https://repo.or.cz/w/linux-2.6/mini2440.git mitigations= - [X86,PPC,S390,ARM64] Control optional mitigations for + [X86,PPC,S390,ARM64,EARLY] Control optional mitigations for CPU vulnerabilities. This is a set of curated, arch-independent options, each of which is an aggregation of existing arch-specific options. + Note, "mitigations" is supported if and only if the + kernel was built with CPU_MITIGATIONS=y. 
+ off Disable all optional CPU mitigations. This improves system performance, but it may also expose users to several CPU vulnerabilities. - Equivalent to: nopti [X86,PPC] - if nokaslr then kpti=0 [ARM64] - nospectre_v1 [X86,PPC] - nobp=0 [S390] - nospectre_v2 [X86,PPC,S390,ARM64] - spectre_v2_user=off [X86] - spec_store_bypass_disable=off [X86,PPC] - ssbd=force-off [ARM64] - nospectre_bhb [ARM64] + Equivalent to: if nokaslr then kpti=0 [ARM64] + gather_data_sampling=off [X86] + indirect_target_selection=off [X86] + kvm.nx_huge_pages=off [X86] l1tf=off [X86] mds=off [X86] - tsx_async_abort=off [X86] - kvm.nx_huge_pages=off [X86] - srbds=off [X86,INTEL] + mmio_stale_data=off [X86] no_entry_flush [PPC] no_uaccess_flush [PPC] - mmio_stale_data=off [X86] + nobp=0 [S390] + nopti [X86,PPC] + nospectre_bhb [ARM64] + nospectre_v1 [X86,PPC] + nospectre_v2 [X86,PPC,S390,ARM64] + reg_file_data_sampling=off [X86] retbleed=off [X86] + spec_rstack_overflow=off [X86] + spec_store_bypass_disable=off [X86,PPC] + spectre_bhi=off [X86] + spectre_v2_user=off [X86] + srbds=off [X86,INTEL] + ssbd=force-off [ARM64] + tsx_async_abort=off [X86] + vmscape=off [X86] Exceptions: This does not have any effect on @@ -3315,8 +4024,12 @@ mmio_stale_data=full,nosmt [X86] retbleed=auto,nosmt [X86] + [X86] After one of the above options, additionally + supports attack-vector based controls as documented in + Documentation/admin-guide/hw-vuln/attack_vector_controls.rst + mminit_loglevel= - [KNL] When CONFIG_DEBUG_MEMORY_INIT is set, this + [KNL,EARLY] When CONFIG_DEBUG_MEMORY_INIT is set, this parameter allows control of the logging verbosity for the additional memory initialisation checks. A value of 0 disables mminit logging and a level of 4 will @@ -3324,7 +4037,7 @@ so loglevel=8 may also need to be specified. mmio_stale_data= - [X86,INTEL] Control mitigation for the Processor + [X86,INTEL,EARLY] Control mitigation for the Processor MMIO Stale Data vulnerabilities. 
Processor MMIO Stale Data is a class of @@ -3399,7 +4112,7 @@ mousedev.yres= [MOUSE] Vertical screen resolution, used for devices reporting absolute coordinates, such as tablets - movablecore= [KNL,X86,IA-64,PPC] + movablecore= [KNL,X86,PPC,EARLY] Format: nn[KMGTPE] | nn% This parameter is the complement to kernelcore=, it specifies the amount of memory used for migratable @@ -3410,7 +4123,7 @@ that the amount of memory usable for all allocations is not too small. - movable_node [KNL] Boot-time switch to make hotplugable memory + movable_node [KNL,EARLY] Boot-time switch to make hotplugable memory NUMA nodes to be movable. This means that the memory of such nodes will be usable only for movable allocations which rules out almost all kernel @@ -3425,30 +4138,25 @@ mtdparts= [MTD] See drivers/mtd/parsers/cmdlinepart.c - mtdset= [ARM] - ARM/S3C2412 JIVE boot control - - See arch/arm/mach-s3c/mach-jive.c - mtouchusb.raw_coordinates= [HW] Make the MicroTouch USB driver use raw coordinates ('y', default) or cooked coordinates ('n') - mtrr=debug [X86] + mtrr=debug [X86,EARLY] Enable printing debug information related to MTRR registers at boot time. - mtrr_chunk_size=nn[KMG] [X86] + mtrr_chunk_size=nn[KMG,X86,EARLY] used for mtrr cleanup. It is largest continuous chunk that could hold holes aka. UC entries. - mtrr_gran_size=nn[KMG] [X86] + mtrr_gran_size=nn[KMG,X86,EARLY] Used for mtrr cleanup. It is granularity of mtrr block. Default is 1. Large value could prevent small alignment from using up MTRRs. - mtrr_spare_reg_nr=n [X86] + mtrr_spare_reg_nr=n [X86,EARLY] Format: <integer> Range: 0,7 : spare reg number Default : 1 @@ -3496,6 +4204,13 @@ [NFS] set the TCP port on which the NFSv4 callback channel should listen. + nfs.delay_retrans= + [NFS] specifies the number of times the NFSv4 client + retries the request before returning an EAGAIN error, + after a reply of NFS4ERR_DELAY from the server. 
+ Only applies if the softerr mount option is enabled, + and the specified value is >= 0. + nfs.enable_ino64= [NFS] enable 64-bit inode numbers. If zero, the NFS client will fake up a 32-bit inode @@ -3608,10 +4323,12 @@ Format: [state][,regs][,debounce][,die] nmi_watchdog= [KNL,BUGS=X86] Debugging features for SMP kernels - Format: [panic,][nopanic,][num] + Format: [panic,][nopanic,][rNNN,][num] Valid num: 0 or 1 0 - turn hardlockup detector in nmi_watchdog off 1 - turn hardlockup detector in nmi_watchdog on + rNNN - configure the watchdog with raw perf event 0xNNN + When panic is specified, panic when an NMI watchdog timeout occurs (or 'nopanic' to not panic on an NMI watchdog, if CONFIG_BOOTPARAM_HARDLOCKUP_PANIC is set) @@ -3627,27 +4344,22 @@ emulation library even if a 387 maths coprocessor is present. - no4lvl [RISCV] Disable 4-level and 5-level paging modes. Forces - kernel to use 3-level paging instead. + no4lvl [RISCV,EARLY] Disable 4-level and 5-level paging modes. + Forces kernel to use 3-level paging instead. - no5lvl [X86-64,RISCV] Disable 5-level paging mode. Forces + no5lvl [X86-64,RISCV,EARLY] Disable 5-level paging mode. Forces kernel to use 4-level paging instead. - noaliencache [MM, NUMA, SLAB] Disables the allocation of alien - caches in the slab allocator. Saves per-node memory, - but will impact performance. - noalign [KNL,ARM] - noaltinstr [S390] Disables alternative instructions patching - (CPU alternatives feature). - - noapic [SMP,APIC] Tells the kernel to not make use of any + noapic [SMP,APIC,EARLY] Tells the kernel to not make use of any IOAPICs that may be present in the system. + noapictimer [APIC,X86] Don't set up the APIC timer + noautogroup Disable scheduler automatic task group creation. - nocache [ARM] + nocache [ARM,EARLY] no_console_suspend [HW] Never suspend the console @@ -3665,15 +4377,13 @@ turn on/off it dynamically. 
no_debug_objects - [KNL] Disable object debugging + [KNL,EARLY] Disable object debugging nodsp [SH] Disable hardware DSP at boot time. - noefi Disable EFI runtime services support. - - no_entry_flush [PPC] Don't flush the L1-D cache when entering the kernel. + noefi [EFI,EARLY] Disable EFI runtime services support. - noexec [IA-64] + no_entry_flush [PPC,EARLY] Don't flush the L1-D cache when entering the kernel. noexec32 [X86-64] This affects only 32-bit executables. @@ -3694,30 +4404,15 @@ register save and restore. The kernel will only save legacy floating-point registers on task switch. - nohalt [IA-64] Tells the kernel not to use the power saving - function PAL_HALT_LIGHT when idle. This increases - power-consumption. On the positive side, it reduces - interrupt wake-up latency, which may improve performance - in certain environments such as networked servers or - real-time systems. + nogbpages [X86] Do not use GB pages for kernel direct mappings. no_hash_pointers - Force pointers printed to the console or buffers to be - unhashed. By default, when a pointer is printed via %p - format string, that pointer is "hashed", i.e. obscured - by hashing the pointer value. This is a security feature - that hides actual kernel addresses from unprivileged - users, but it also makes debugging the kernel more - difficult since unequal pointers can no longer be - compared. However, if this command-line option is - specified, then all normal pointers will have their true - value printed. This option should only be specified when - debugging the kernel. Please do not use on production - kernels. + [KNL,EARLY] + Alias for "hash_pointers=never". nohibernate [HIBERNATION] Disable hibernation and resume. - nohlt [ARM,ARM64,MICROBLAZE,MIPS,SH] Forces the kernel to + nohlt [ARM,ARM64,MICROBLAZE,MIPS,PPC,RISCV,SH] Forces the kernel to busy wait in do_idle() and not use the arch_cpu_idle() implementation; requires CONFIG_GENERIC_IDLE_POLL_SETUP to be effective. 
This is useful on platforms where the @@ -3726,9 +4421,11 @@ the impact of the sleep instructions. This is also useful when using JTAG debugger. - nohugeiomap [KNL,X86,PPC,ARM64] Disable kernel huge I/O mappings. + nohpet [X86] Don't use the HPET timer. + + nohugeiomap [KNL,X86,PPC,ARM64,EARLY] Disable kernel huge I/O mappings. - nohugevmalloc [KNL,X86,PPC,ARM64] Disable kernel huge vmalloc mappings. + nohugevmalloc [KNL,X86,PPC,ARM64,EARLY] Disable kernel huge vmalloc mappings. nohz= [KNL] Boottime enable/disable dynamic ticks Valid arguments: on, off @@ -3750,13 +4447,11 @@ noinitrd [RAM] Tells the kernel not to load any configured initial RAM disk. - nointremap [X86-64, Intel-IOMMU] Do not enable interrupt + nointremap [X86-64,Intel-IOMMU,EARLY] Do not enable interrupt remapping. [Deprecated - use intremap=off] - nointroute [IA-64] - - noinvpcid [X86] Disable the INVPCID cpu feature. + noinvpcid [X86,EARLY] Disable the INVPCID cpu feature. noiotrap [SH] Disables trapped I/O port accesses. @@ -3765,23 +4460,19 @@ noisapnp [ISAPNP] Disables ISA PnP code. - nojitter [IA-64] Disables jitter checking for ITC timers. - - nokaslr [KNL] + nokaslr [KNL,EARLY] When CONFIG_RANDOMIZE_BASE is set, this disables kernel and module base offset ASLR (Address Space Layout Randomization). - no-kvmapf [X86,KVM] Disable paravirtualized asynchronous page + no-kvmapf [X86,KVM,EARLY] Disable paravirtualized asynchronous page fault handling. - no-kvmclock [X86,KVM] Disable paravirtualized KVM clock driver - - nolapic [X86-32,APIC] Do not enable or use the local APIC. + no-kvmclock [X86,KVM,EARLY] Disable paravirtualized KVM clock driver - nolapic_timer [X86-32,APIC] Do not use the local APIC timer. + nolapic [X86-32,APIC,EARLY] Do not enable or use the local APIC. - nomca [IA-64] Disable machine check abort handling + nolapic_timer [X86-32,APIC,EARLY] Do not use the local APIC timer. nomce [X86-32] Disable Machine Check Exception @@ -3804,23 +4495,23 @@ shutdown the other cpus. 
Instead use the REBOOT_VECTOR irq. - nopat [X86] Disable PAT (page attribute table extension of + nopat [X86,EARLY] Disable PAT (page attribute table extension of pagetables) support. - nopcid [X86-64] Disable the PCID cpu feature. + nopcid [X86-64,EARLY] Disable the PCID cpu feature. nopku [X86] Disable Memory Protection Keys CPU feature found in some Intel CPUs. - nopti [X86-64] + nopti [X86-64,EARLY] Equivalent to pti=off - nopv= [X86,XEN,KVM,HYPER_V,VMWARE] + nopv= [X86,XEN,KVM,HYPER_V,VMWARE,EARLY] Disables the PV optimizations forcing the guest to run as generic guest with no PV drivers. Currently support XEN HVM, KVM, HYPER_V and VMWARE guest. - nopvspin [X86,XEN,KVM] + nopvspin [X86,XEN,KVM,EARLY] Disables the qspinlock slow path using PV optimizations which allow the hypervisor to 'idle' the guest on lock contention. @@ -3834,61 +4525,62 @@ noresume [SWSUSP] Disables resume and restores original swap space. - nosbagart [IA-64] - no-scroll [VGA] Disables scrollback. This is required for the Braillex ib80-piezo Braille reader made by F.H. Papenmeier (Germany). - nosgx [X86-64,SGX] Disables Intel SGX kernel support. + nosgx [X86-64,SGX,EARLY] Disables Intel SGX kernel support. - nosmap [PPC] + nosmap [PPC,EARLY] Disable SMAP (Supervisor Mode Access Prevention) even if it is supported by processor. - nosmep [PPC64s] + nosmep [PPC64s,EARLY] Disable SMEP (Supervisor Mode Execution Prevention) even if it is supported by processor. - nosmp [SMP] Tells an SMP kernel to act as a UP kernel, + nosmp [SMP,EARLY] Tells an SMP kernel to act as a UP kernel, and disable the IO APIC. legacy for "maxcpus=0". - nosmt [KNL,MIPS,S390] Disable symmetric multithreading (SMT). + nosmt [KNL,MIPS,PPC,EARLY] Disable symmetric multithreading (SMT). Equivalent to smt=1. - [KNL,X86] Disable symmetric multithreading (SMT). + [KNL,X86,PPC,S390] Disable symmetric multithreading (SMT). nosmt=force: Force disable SMT, cannot be undone via the sysfs control file. 
nosoftlockup [KNL] Disable the soft-lockup detector. nospec_store_bypass_disable - [HW] Disable all mitigations for the Speculative Store Bypass vulnerability + [HW,EARLY] Disable all mitigations for the Speculative + Store Bypass vulnerability - nospectre_bhb [ARM64] Disable all mitigations for Spectre-BHB (branch + nospectre_bhb [ARM64,EARLY] Disable all mitigations for Spectre-BHB (branch history injection) vulnerability. System may allow data leaks with this option. - nospectre_v1 [X86,PPC] Disable mitigations for Spectre Variant 1 + nospectre_v1 [X86,PPC,EARLY] Disable mitigations for Spectre Variant 1 (bounds check bypass). With this option data leaks are possible in the system. - nospectre_v2 [X86,PPC_E500,ARM64] Disable all mitigations for - the Spectre variant 2 (indirect branch prediction) - vulnerability. System may allow data leaks with this - option. + nospectre_v2 [X86,PPC_E500,ARM64,EARLY] Disable all mitigations + for the Spectre variant 2 (indirect branch + prediction) vulnerability. System may allow data + leaks with this option. - no-steal-acc [X86,PV_OPS,ARM64,PPC/PSERIES] Disable paravirtualized - steal time accounting. steal time is computed, but - won't influence scheduler behaviour + no-steal-acc [X86,PV_OPS,ARM64,PPC/PSERIES,RISCV,LOONGARCH,EARLY] + Disable paravirtualized steal time accounting. steal time + is computed, but won't influence scheduler behaviour nosync [HW,M68K] Disables sync negotiation for all devices. - no_timer_check [X86,APIC] Disables the code which tests for - broken timer IRQ sources. + no_timer_check [X86,APIC] Disables the code which tests for broken + timer IRQ sources, i.e., the IO-APIC timer. This can + work around problems with incorrect timer + initialization on some boards. no_uaccess_flush - [PPC] Don't flush the L1-D cache after accessing user data. + [PPC,EARLY] Don't flush the L1-D cache after accessing user data. novmcoredd [KNL,KDUMP] Disable device dump. 
Device dump allows drivers to @@ -3902,15 +4594,15 @@ is set. no-vmw-sched-clock - [X86,PV_OPS] Disable paravirtualized VMware scheduler - clock and use the default one. + [X86,PV_OPS,EARLY] Disable paravirtualized VMware + scheduler clock and use the default one. nowatchdog [KNL] Disable both lockup detectors, i.e. soft-lockup and NMI watchdog (hard-lockup). - nowb [ARM] + nowb [ARM,EARLY] - nox2apic [X86-64,APIC] Do not enable x2APIC mode. + nox2apic [X86-64,APIC,EARLY] Do not enable x2APIC mode. NOTE: this parameter will be ignored on systems with the LEGACY_XAPIC_DISABLED bit set in the @@ -3935,20 +4627,7 @@ parameter, xsave area per process might occupy more memory on xsaves enabled systems. - nps_mtm_hs_ctr= [KNL,ARC] - This parameter sets the maximum duration, in - cycles, each HW thread of the CTOP can run - without interruptions, before HW switches it. - The actual maximum duration is 16 times this - parameter's value. - Format: integer between 1 and 255 - Default: 255 - - nptcg= [IA-64] Override max number of concurrent global TLB - purges which is reported from either PAL_VM_SUMMARY or - SAL PALO. - - nr_cpus= [SMP] Maximum number of processors that an SMP kernel + nr_cpus= [SMP,EARLY] Maximum number of processors that an SMP kernel could support. nr_cpus=n : n >= 1 limits the kernel to support 'n' processors. It could be larger than the number of already plugged CPU during bootup, later in @@ -3959,8 +4638,29 @@ nr_uarts= [SERIAL] maximum number of UARTs to be registered. - numa=off [KNL, ARM64, PPC, RISCV, SPARC, X86] Disable NUMA, Only - set up a single NUMA node spanning all memory. + numa=off [KNL, ARM64, PPC, RISCV, SPARC, X86, EARLY] + Disable NUMA, Only set up a single NUMA node + spanning all memory. + + numa=fake=<size>[MG] + [KNL, ARM64, RISCV, X86, EARLY] + If given as a memory unit, fills all system RAM with + nodes of size interleaved over physical nodes. 
+ + numa=fake=<N> + [KNL, ARM64, RISCV, X86, EARLY] + If given as an integer, fills all system RAM with N + fake nodes interleaved over physical nodes. + + numa=fake=<N>U + [KNL, ARM64, RISCV, X86, EARLY] + If given as an integer followed by 'U', it will + divide each physical node into N emulated nodes. + + numa=noacpi [X86] Don't parse the SRAT table for NUMA setup + + numa=nohmat [X86] Don't parse the HMAT table for NUMA setup, or + soft-reserved memory partitioning. numa_balancing= [KNL,ARM64,PPC,RISCV,S390,X86] Enable or disable automatic NUMA balancing. @@ -3971,7 +4671,7 @@ This can be set from sysctl after boot. See Documentation/admin-guide/sysctl/vm.rst for details. - ohci1394_dma=early [HW] enable debugging via the ohci1394 driver. + ohci1394_dma=early [HW,EARLY] enable debugging via the ohci1394 driver. See Documentation/core-api/debugging-via-ohci1394.rst for more info. @@ -3997,7 +4697,8 @@ Once locked, the boundary cannot be changed. 1 indicates lock status, 0 indicates unlock status. - oops=panic Always panic on oopses. Default is to just kill the + oops=panic [KNL,EARLY] + Always panic on oopses. Default is to just kill the process, but there is a small probability of deadlocking the machine. This will also cause panics on machine check exceptions. @@ -4005,21 +4706,19 @@ page_alloc.shuffle= [KNL] Boolean flag to control whether the page allocator - should randomize its free lists. The randomization may - be automatically enabled if the kernel detects it is - running on a platform with a direct-mapped memory-side - cache, and this parameter can be used to - override/disable that behavior. The state of the flag - can be read from sysfs at: + should randomize its free lists. This parameter can be + used to enable/disable page randomization. The state of + the flag can be read from sysfs at: /sys/module/page_alloc/parameters/shuffle. + This parameter is only available if CONFIG_SHUFFLE_PAGE_ALLOCATOR=y. 
- page_owner= [KNL] Boot-time page_owner enabling option. + page_owner= [KNL,EARLY] Boot-time page_owner enabling option. Storage of the information about who allocated each page is disabled in default. With this switch, we can turn it on. on: enable the feature - page_poison= [KNL] Boot-time parameter changing the state of + page_poison= [KNL,EARLY] Boot-time parameter changing the state of poisoning on the buddy allocator, available with CONFIG_PAGE_POISONING=y. off: turn off poisoning (default) @@ -4029,7 +4728,7 @@ [KNL] Minimal page reporting order Format: <integer> Adjust the minimal page reporting order. The page - reporting is disabled when it exceeds MAX_ORDER. + reporting is disabled when it exceeds MAX_PAGE_ORDER. panic= [KNL] Kernel behaviour on panic: delay <timeout> timeout > 0: seconds before rebooting @@ -4037,21 +4736,8 @@ timeout < 0: reboot immediately Format: <timeout> - panic_print= Bitmask for printing system info when panic happens. - User can chose combination of the following bits: - bit 0: print all tasks info - bit 1: print system memory info - bit 2: print timer info - bit 3: print locks info if CONFIG_LOCKDEP is on - bit 4: print ftrace buffer - bit 5: print all printk messages in buffer - bit 6: print all CPUs backtrace (if available in the arch) - *Be aware* that this option may print a _lot_ of lines, - so there are risks of losing older messages in the log. - Use this option carefully, maybe worth to setup a - bigger log buffer with "log_buf_len" along with this. - - panic_on_taint= Bitmask for conditionally calling panic() in add_taint() + panic_on_taint= [KNL,EARLY] + Bitmask for conditionally calling panic() in add_taint() Format: <hex>[,nousertaint] Hexadecimal bitmask representing the set of TAINT flags that will cause the kernel to panic when add_taint() is @@ -4067,6 +4753,40 @@ panic_on_warn=1 panic() instead of WARN(). Useful to cause kdump on a WARN(). + panic_print= Bitmask for printing system info when panic happens. 
+ User can chose combination of the following bits: + bit 0: print all tasks info + bit 1: print system memory info + bit 2: print timer info + bit 3: print locks info if CONFIG_LOCKDEP is on + bit 4: print ftrace buffer + bit 5: replay all kernel messages on consoles at the end of panic + bit 6: print all CPUs backtrace (if available in the arch) + bit 7: print only tasks in uninterruptible (blocked) state + *Be aware* that this option may print a _lot_ of lines, + so there are risks of losing older messages in the log. + Use this option carefully, maybe worth to setup a + bigger log buffer with "log_buf_len" along with this. + + panic_sys_info= A comma separated list of extra information to be dumped + on panic. + Format: val[,val...] + Where @val can be any of the following: + + tasks: print all tasks info + mem: print system memory info + timers: print timers info + locks: print locks info if CONFIG_LOCKDEP is on + ftrace: print ftrace buffer + all_bt: print all CPUs backtrace (if available in the arch) + blocked_tasks: print only tasks in uninterruptible (blocked) state + + This is a human readable alternative to the 'panic_print' option. + + panic_console_replay + When panic happens, replay all kernel messages on + consoles at the end of panic. + parkbd.port= [HW] Parallel port number the keyboard adapter is connected to, default is 0. Format: <parport#> @@ -4186,14 +4906,14 @@ mode 0, bit 1 is for mode 1, and so on. Mode 0 only allowed by default. - pause_on_oops= + pause_on_oops=<int> Halt all CPUs after the first oops has been printed for the specified number of seconds. This is to be used if your oopses keep scrolling off the screen. pcbit= [HW,ISDN] - pci=option[,option...] [PCI] various PCI subsystem options. + pci=option[,option...] [PCI,EARLY] various PCI subsystem options. Some options herein operate on a specific device or a set of devices (<pci_dev>). These are @@ -4419,14 +5139,51 @@ bridges without forcing it upstream. 
Note: this removes isolation between devices and may put more devices in an IOMMU group. + config_acs= + Format: + <ACS flags>@<pci_dev>[; ...] + Specify one or more PCI devices (in the format + specified above) optionally prepended with flags + and separated by semicolons. The respective + capabilities will be enabled, disabled or + unchanged based on what is specified in + flags. + + ACS Flags is defined as follows: + bit-0 : ACS Source Validation + bit-1 : ACS Translation Blocking + bit-2 : ACS P2P Request Redirect + bit-3 : ACS P2P Completion Redirect + bit-4 : ACS Upstream Forwarding + bit-5 : ACS P2P Egress Control + bit-6 : ACS Direct Translated P2P + Each bit can be marked as: + '0' – force disabled + '1' – force enabled + 'x' – unchanged + For example, + pci=config_acs=10x@pci:0:0 + would configure all devices that support + ACS to enable P2P Request Redirect, disable + Translation Blocking, and leave Source + Validation unchanged from whatever power-up + or firmware set it to. + + Note: this may remove isolation between devices + and may put more devices in an IOMMU group. force_floating [S390] Force usage of floating interrupts. nomio [S390] Do not use MIO instructions. norid [S390] ignore the RID field and force use of one PCI domain per PCI function + notph [PCIE] If the PCIE_TPH kernel config parameter + is enabled, this kernel boot option can be used + to disable PCIe TLP Processing Hints support + system-wide. - pcie_aspm= [PCIE] Forcibly enable or disable PCIe Active State Power + pcie_aspm= [PCIE] Forcibly enable or ignore PCIe Active State Power Management. - off Disable ASPM. + off Don't touch ASPM configuration at all. Leave any + configuration done by firmware unchanged. force Enable ASPM even on devices that claim not to support it. WARNING: Forcing ASPM on may cause system lockups. @@ -4462,7 +5219,8 @@ Format: { 0 | 1 } See arch/parisc/kernel/pdc_chassis.c - percpu_alloc= Select which percpu first chunk allocator to use. 
+ percpu_alloc= [MM,EARLY] + Select which percpu first chunk allocator to use. Currently supported values are "embed" and "page". Archs may support subset or none of the selections. See comments in mm/percpu.c for details on each @@ -4488,6 +5246,18 @@ that number, otherwise (e.g., 'pmu_override=on'), MMCR1 remains 0. + pm_async= [PM] + Format: off + This parameter sets the initial value of the + /sys/power/pm_async sysfs knob at boot time. + If set to "off", disables asynchronous suspend and + resume of devices during system-wide power transitions. + This can be useful on platforms where device + dependencies are not well-defined, or for debugging + power management issues. Asynchronous operations are + enabled by default. + + pm_debug_messages [SUSPEND,KNL] Enable suspend/resume debug messages during boot up. @@ -4524,6 +5294,11 @@ may be specified. Format: <port>,<port>.... + possible_cpus= [SMP,S390,X86] + Format: <unsigned int> + Set the number of possible CPUs, overriding the + regular discovery mechanisms (such as ACPI/FW, etc). + powersave=off [PPC] This option disables power saving features. It specifically disables cpuidle and sets the platform machine description specific power_save @@ -4531,12 +5306,12 @@ execution priority. ppc_strict_facility_enable - [PPC] This option catches any kernel floating point, + [PPC,ENABLE] This option catches any kernel floating point, Altivec, VSX and SPE outside of regions specifically allowed (eg kernel_enable_fpu()/kernel_disable_fpu()). There is some performance impact when enabling this. - ppc_tm= [PPC] + ppc_tm= [PPC,EARLY] Format: {"off"} Disable Hardware Transactional Memory @@ -4545,7 +5320,14 @@ none - Limited to cond_resched() calls voluntary - Limited to cond_resched() and might_sleep() calls full - Any section that isn't explicitly preempt disabled - can be preempted anytime. + can be preempted anytime. 
Tasks will also yield + contended spinlocks (if the critical section isn't + explicitly preempt disabled beyond the lock itself). + lazy - Scheduler controlled. Similar to full but instead + of preempting the task immediately, the task gets + one HZ tick time to yield itself before the + preemption will be forced. One preemption is when the + task returns to user space. print-fatal-signals= [KNL] debug: print fatal signals @@ -4575,6 +5357,14 @@ Format: <bool> default: 0 (auto_verbose is enabled) + printk.debug_non_panic_cpus= + Allows storing messages from non-panic CPUs into + the printk log buffer during panic(). They are + flushed to consoles by the panic-CPU on + a best-effort basis. + Format: <bool> (1/Y/y=enable, 0/N/n=disable) + Default: disabled + printk.devkmsg={on,off,ratelimit} Control writing to /dev/kmsg. on - unlimited logging to /dev/kmsg from userspace @@ -4585,6 +5375,16 @@ printk.time= Show timing data prefixed to each printk message line Format: <bool> (1/Y/y=enable, 0/N/n=disable) + proc_mem.force_override= [KNL] + Format: {always | ptrace | never} + Traditionally /proc/pid/mem allows memory permissions to be + overridden without restrictions. This option may be set to + restrict that. Can be one of: + - 'always': traditional behavior always allows mem overrides. + - 'ptrace': only allow mem overrides for active ptracers. + - 'never': never allow mem overrides. + If not specified, default is the CONFIG_PROC_MEM_* choice. + processor.max_cstate= [HW,ACPI] Limit processor to maximum C-state max_cstate=9 overrides any DMI blacklist limit. @@ -4595,11 +5395,9 @@ profile= [KNL] Enable kernel profiling via /proc/profile Format: [<profiletype>,]<number> - Param: <profiletype>: "schedule", "sleep", or "kvm" + Param: <profiletype>: "schedule" or "kvm" [defaults to kernel profiling] Param: "schedule" - profile schedule points. - Param: "sleep" - profile D-state sleeping (millisecs). - Requires CONFIG_SCHEDSTATS Param: "kvm" - profile VM exits. 
Param: <number> - step/bucket size as a power of 2 for statistical time based profiling. @@ -4608,7 +5406,9 @@ prot_virt= [S390] enable hosting protected virtual machines isolated from the hypervisor (if hardware supports - that). + that). If enabled, the default kernel base address + might be overridden even when Kernel Address Space + Layout Randomization is disabled. Format: <bool> psi= [KNL] Enable or disable pressure stall information @@ -4646,7 +5446,7 @@ [KNL] Number of legacy pty's. Overwrites compiled-in default number. - quiet [KNL] Disable most log messages + quiet [KNL,EARLY] Disable most log messages r128= [HW,DRM] @@ -4663,17 +5463,17 @@ ramdisk_start= [RAM] RAM disk image start address random.trust_cpu=off - [KNL] Disable trusting the use of the CPU's + [KNL,EARLY] Disable trusting the use of the CPU's random number generator (if available) to initialize the kernel's RNG. random.trust_bootloader=off - [KNL] Disable trusting the use of the a seed + [KNL,EARLY] Disable trusting the use of the a seed passed by the bootloader (if available) to initialize the kernel's RNG. randomize_kstack_offset= - [KNL] Enable or disable kernel stack offset + [KNL,EARLY] Enable or disable kernel stack offset randomization, which provides roughly 5 bits of entropy, frustrating memory corruption attacks that depend on stack address determinism or @@ -4732,6 +5532,17 @@ Set maximum number of finished RCU callbacks to process in one batch. + rcutree.csd_lock_suppress_rcu_stall= [KNL] + Do only a one-line RCU CPU stall warning when + there is an ongoing too-long CSD-lock wait. + + rcutree.do_rcu_barrier= [KNL] + Request a call to rcu_barrier(). This is + throttled so that userspace tests can safely + hammer on the sysfs variable if they so choose. + If triggered before the RCU grace-period machinery + is fully active, this will error out with EAGAIN. + rcutree.dump_tree= [KNL] Dump the structure of the rcu_node combining tree out at early boot. 
This is used for diagnostic @@ -4802,6 +5613,14 @@ the ->nocb_bypass queue. The definition of "too many" is supplied by this kernel boot parameter. + rcutree.nohz_full_patience_delay= [KNL] + On callback-offloaded (rcu_nocbs) CPUs, avoid + disturbing RCU unless the grace period has + reached the specified age in milliseconds. + Defaults to zero. Large values will be capped + at five seconds. All values will be rounded down + to the nearest value representable by jiffies. + rcutree.qhimark= [KNL] Set threshold of queued RCU callbacks beyond which batch limiting is disabled. @@ -4907,6 +5726,26 @@ this kernel boot parameter, forcibly setting it to zero. + rcutree.enable_rcu_lazy= [KNL] + To save power, batch RCU callbacks and flush after + delay, memory pressure or callback list growing too + big. + + rcutree.rcu_normal_wake_from_gp= [KNL] + Reduces a latency of synchronize_rcu() call. This approach + maintains its own track of synchronize_rcu() callers, so it + does not interact with regular callbacks because it does not + use a call_rcu[_hurry]() path. Please note, this is for a + normal grace period. + + How to enable it: + + echo 1 > /sys/module/rcutree/parameters/rcu_normal_wake_from_gp + or pass a boot parameter "rcutree.rcu_normal_wake_from_gp=1" + + Default is 1 if num_possible_cpus() <= 16 and it is not explicitly + disabled by the boot parameter passing 0. + rcuscale.gp_async= [KNL] Measure performance of asynchronous grace-period primitives such as call_rcu(). @@ -4928,6 +5767,15 @@ test until boot completes in order to avoid interference. + rcuscale.kfree_by_call_rcu= [KNL] + In kernels built with CONFIG_RCU_LAZY=y, test + call_rcu() instead of kfree_rcu(). + + rcuscale.kfree_mult= [KNL] + Instead of allocating an object of size kfree_obj, + allocate one of kfree_mult * sizeof(kfree_obj). + Defaults to 1. + rcuscale.kfree_rcu_test= [KNL] Set to measure performance of kfree_rcu() flooding. 
@@ -4953,6 +5801,12 @@ Number of loops doing rcuscale.kfree_alloc_num number of allocations and frees. + rcuscale.minruntime= [KNL] + Set the minimum test run time in seconds. This + does not affect the data-collection interval, + but instead allows better measurement of things + like CPU consumption. + rcuscale.nreaders= [KNL] Set number of RCU readers. The value -1 selects N, where N is the number of CPUs. A value @@ -4967,7 +5821,7 @@ the same as for rcuscale.nreaders. N, where N is the number of CPUs - rcuscale.perf_type= [KNL] + rcuscale.scale_type= [KNL] Specify the RCU implementation to test. rcuscale.shutdown= [KNL] @@ -4983,6 +5837,11 @@ in microseconds. The default of zero says no holdoff. + rcuscale.writer_holdoff_jiffies= [KNL] + Additional write-side holdoff between grace + periods, but in jiffies. The default of zero + says no holdoff. + rcutorture.fqs_duration= [KNL] Set duration of force_quiescent_state bursts in microseconds. @@ -5018,7 +5877,42 @@ rcutorture.gp_cond= [KNL] Use conditional/asynchronous update-side - primitives, if available. + normal-grace-period primitives, if available. + + rcutorture.gp_cond_exp= [KNL] + Use conditional/asynchronous update-side + expedited-grace-period primitives, if available. + + rcutorture.gp_cond_full= [KNL] + Use conditional/asynchronous update-side + normal-grace-period primitives that also take + concurrent expedited grace periods into account, + if available. + + rcutorture.gp_cond_exp_full= [KNL] + Use conditional/asynchronous update-side + expedited-grace-period primitives that also take + concurrent normal grace periods into account, + if available. + + rcutorture.gp_cond_wi= [KNL] + Nominal wait interval for normal conditional + grace periods (specified by rcutorture's + gp_cond and gp_cond_full module parameters), + in microseconds. The actual wait interval will + be randomly selected to nanosecond granularity up + to this wait interval. 
Defaults to 16 jiffies, + for example, 16,000 microseconds on a system + with HZ=1000. + + rcutorture.gp_cond_wi_exp= [KNL] + Nominal wait interval for expedited conditional + grace periods (specified by rcutorture's + gp_cond_exp and gp_cond_exp_full module + parameters), in microseconds. The actual wait + interval will be randomly selected to nanosecond + granularity up to this wait interval. Defaults to + 128 microseconds. rcutorture.gp_exp= [KNL] Use expedited update-side primitives, if available. @@ -5027,6 +5921,43 @@ Use normal (non-expedited) asynchronous update-side primitives, if available. + rcutorture.gp_poll= [KNL] + Use polled update-side normal-grace-period + primitives, if available. + + rcutorture.gp_poll_exp= [KNL] + Use polled update-side expedited-grace-period + primitives, if available. + + rcutorture.gp_poll_full= [KNL] + Use polled update-side normal-grace-period + primitives that also take concurrent expedited + grace periods into account, if available. + + rcutorture.gp_poll_exp_full= [KNL] + Use polled update-side expedited-grace-period + primitives that also take concurrent normal + grace periods into account, if available. + + rcutorture.gp_poll_wi= [KNL] + Nominal wait interval for normal conditional + grace periods (specified by rcutorture's + gp_poll and gp_poll_full module parameters), + in microseconds. The actual wait interval will + be randomly selected to nanosecond granularity up + to this wait interval. Defaults to 16 jiffies, + for example, 16,000 microseconds on a system + with HZ=1000. + + rcutorture.gp_poll_wi_exp= [KNL] + Nominal wait interval for expedited conditional + grace periods (specified by rcutorture's + gp_poll_exp and gp_poll_exp_full module + parameters), in microseconds. The actual wait + interval will be randomly selected to nanosecond + granularity up to this wait interval. Defaults to + 128 microseconds. + rcutorture.gp_sync= [KNL] Use normal (non-expedited) synchronous update-side primitives, if available. 
If all @@ -5035,6 +5966,31 @@ are zero, rcutorture acts as if is interpreted they are all non-zero. + rcutorture.gpwrap_lag= [KNL] + Enable grace-period wrap lag testing. Setting + to false prevents the gpwrap lag test from + running. Default is true. + + rcutorture.gpwrap_lag_gps= [KNL] + Set the value for grace-period wrap lag during + active lag testing periods. This controls how many + grace periods differences we tolerate between + rdp and rnp's gp_seq before setting overflow flag. + The default is always set to 8. + + rcutorture.gpwrap_lag_cycle_mins= [KNL] + Set the total cycle duration for gpwrap lag + testing in minutes. This is the total time for + one complete cycle of active and inactive + testing periods. Default is 30 minutes. + + rcutorture.gpwrap_lag_active_mins= [KNL] + Set the duration for which gpwrap lag is active + within each cycle, in minutes. During this time, + the grace-period wrap lag will be set to the + value specified by gpwrap_lag_gps. Default is + 5 minutes. + rcutorture.irqreader= [KNL] Run RCU readers from irq handlers, or, more accurately, from a timer handler. Not all RCU @@ -5080,10 +6036,21 @@ Set time (jiffies) between CPU-hotplug operations, or zero to disable CPU-hotplug testing. - rcutorture.read_exit= [KNL] - Set the number of read-then-exit kthreads used - to test the interaction of RCU updaters and - task-exit processing. + rcutorture.preempt_duration= [KNL] + Set duration (in milliseconds) of preemptions + by a high-priority FIFO real-time task. Set to + zero (the default) to disable. The CPUs to + preempt are selected randomly from the set that + are online at a given point in time. Races with + CPUs going offline are ignored, with that attempt + at preemption skipped. + + rcutorture.preempt_interval= [KNL] + Set interval (in milliseconds, defaulting to one + second) between preemptions by a high-priority + FIFO real-time task. 
This delay is mediated + by an hrtimer and is further fuzzed to avoid + inadvertent synchronizations. rcutorture.read_exit_burst= [KNL] The number of times in a given read-then-exit @@ -5094,6 +6061,14 @@ The delay, in seconds, between successive read-then-exit testing episodes. + rcutorture.reader_flavor= [KNL] + A bit mask indicating which readers to use. + If there is more than one bit set, the readers + are entered from low-order bit up, and are + exited in the opposite order. For SRCU, the + 0x1 bit is normal readers, 0x2 NMI-safe readers, + and 0x4 light-weight readers. + rcutorture.shuffle_interval= [KNL] Set task-shuffle interval (s). Shuffling tasks allows some CPUs to go into dyntick-idle mode @@ -5125,7 +6100,13 @@ Time to wait (s) after boot before inducing stall. rcutorture.stall_cpu_irqsoff= [KNL] - Disable interrupts while stalling if set. + Disable interrupts while stalling if set, but only + on the first stall in the set. + + rcutorture.stall_cpu_repeat= [KNL] + Number of times to repeat the stall sequence, + so that rcutorture.stall_cpu_repeat=3 will result + in four stall sequences. rcutorture.stall_gp_kthread= [KNL] Duration (s) of forced sleep within RCU @@ -5151,6 +6132,11 @@ rcutorture.test_boost_duration= [KNL] Duration (s) of each individual boost test. + rcutorture.test_boost_holdoff= [KNL] + Holdoff time (s) from start of test to the start + of RCU priority-boost testing. Defaults to zero, + that is, no holdoff. + rcutorture.test_boost_interval= [KNL] Interval (s) between each boost test. @@ -5168,6 +6154,12 @@ Dump ftrace buffer after reporting RCU CPU stall warning. + rcupdate.rcu_cpu_stall_notifiers= [KNL] + Provide RCU CPU stall notifiers, but see the + warnings in the RCU_CPU_STALL_NOTIFIER Kconfig + option's help text. TL;DR: You almost certainly + do not want rcupdate.rcu_cpu_stall_notifiers. + rcupdate.rcu_cpu_stall_suppress= [KNL] Suppress RCU CPU stall warning messages. 
@@ -5264,6 +6256,13 @@ number avoids disturbing real-time workloads, but lengthens grace periods. + rcupdate.rcu_task_lazy_lim= [KNL] + Number of callbacks on a given CPU that will + cancel laziness on that CPU. Use -1 to disable + cancellation of laziness, but be advised that + doing so increases the danger of OOM due to + callback flooding. + rcupdate.rcu_task_stall_info= [KNL] Set initial timeout in jiffies for RCU task stall informational messages, which give some indication @@ -5293,6 +6292,21 @@ A change in value does not take effect until the beginning of the next grace period. + rcupdate.rcu_tasks_lazy_ms= [KNL] + Set timeout in milliseconds RCU Tasks asynchronous + callback batching for call_rcu_tasks(). + A negative value will take the default. A value + of zero will disable batching. Batching is + always disabled for synchronize_rcu_tasks(). + + rcupdate.rcu_tasks_trace_lazy_ms= [KNL] + Set timeout in milliseconds RCU Tasks + Trace asynchronous callback batching for + call_rcu_tasks_trace(). A negative value + will take the default. A value of zero will + disable batching. Batching is always disabled + for synchronize_rcu_tasks_trace(). + rcupdate.rcu_self_test= [KNL] Run the RCU early boot self tests @@ -5301,7 +6315,7 @@ Run specified binary instead of /init from the ramdisk, used for early userspace startup. See initrd. - rdrand= [X86] + rdrand= [X86,EARLY] force - Override the decision by the kernel to hide the advertisement of RDRAND support (this affects certain AMD processors because of buggy BIOS @@ -5311,7 +6325,7 @@ rdt= [HW,X86,RDT] Turn on/off individual RDT features. List is: cmt, mbmtotal, mbmlocal, l3cat, l3cdp, l2cat, l2cdp, - mba, smba, bmec. + mba, smba, bmec, abmc, sdciae. E.g. to turn on cmt and turn off mba use: rdt=cmt,!mba @@ -5329,12 +6343,67 @@ reboot_cpu is s[mp]#### with #### being the processor to be used for rebooting. + acpi + Use the ACPI RESET_REG in the FADT. 
If ACPI is not + configured or the ACPI reset does not work, the reboot + path attempts the reset using the keyboard controller. + + bios + Use the CPU reboot vector for warm reset + + cold + Set the cold reboot flag + + default + There are some built-in platform specific "quirks" + - you may see: "reboot: <name> series board detected. + Selecting <type> for reboots." In the case where you + think the quirk is in error (e.g. you have newer BIOS, + or newer board) using this option will ignore the + built-in quirk table, and use the generic default + reboot actions. + + efi + Use efi reset_system runtime service. If EFI is not + configured or the EFI reset does not work, the reboot + path attempts the reset using the keyboard controller. + + force + Don't stop other CPUs on reboot. This can make reboot + more reliable in some cases. + + kbd + Use the keyboard controller. cold reset (default) + + pci + Use a write to the PCI config space register 0xcf9 to + trigger reboot. + + triple + Force a triple fault (init) + + warm + Don't set the cold reboot flag + + Using warm reset will be much faster especially on big + memory systems because the BIOS will not go through + the memory check. Disadvantage is that not all + hardware will be completely reinitialized on reboot so + there may be boot problems on some systems. + + refscale.holdoff= [KNL] Set test-start holdoff period. The purpose of this parameter is to delay the start of the test until boot completes in order to avoid interference. + refscale.lookup_instances= [KNL] + Number of data elements to use for the forms of + SLAB_TYPESAFE_BY_RCU testing. A negative number + is negated and multiplied by nr_cpu_ids, while + zero specifies nr_cpu_ids. + refscale.loops= [KNL] Set the number of loops over the synchronization primitive under test. Increasing this number @@ -5374,6 +6443,13 @@ print every Nth verbose statement, where N is the value specified. 
+ regulator_ignore_unused + [REGULATOR] + Prevents regulator framework from disabling regulators + that are unused, due no driver claiming them. This may + be useful for debug and development, but should not be + needed on a platform with proper driver support. + relax_domain_level= [KNL, SMP] Set scheduler's default relax_domain_level. See Documentation/admin-guide/cgroup-v1/cpusets.rst. @@ -5384,7 +6460,29 @@ them. If <base> is less than 0x10000, the region is assumed to be I/O ports; otherwise it is memory. - reservetop= [X86-32] + reserve_mem= [RAM] + Format: nn[KMG]:<align>:<label> + Reserve physical memory and label it with a name that + other subsystems can use to access it. This is typically + used for systems that do not wipe the RAM, and this command + line will try to reserve the same physical memory on + soft reboots. Note, it is not guaranteed to be the same + location. For example, if anything about the system changes + or if booting a different kernel. It can also fail if KASLR + places the kernel at the location of where the RAM reservation + was from a previous boot, the new reservation will be at a + different location. + Any subsystem using this feature must add a way to verify + that the contents of the physical memory is from a previous + boot, as there may be cases where the memory will not be + located at the same location. + + The format is size:align:label for example, to request + 12 megabytes of 4096 alignment for ramoops: + + reserve_mem=12M:4096:oops ramoops.mem_name=oops + + reservetop= [X86-32,EARLY] Format: nn[KMG] Reserves a hole at the top of the kernel virtual address space. @@ -5410,7 +6508,8 @@ Useful for devices that are detected asynchronously (e.g. USB and MMC devices). - retain_initrd [RAM] Keep initrd memory after extraction + retain_initrd [RAM] Keep initrd memory after extraction. After boot, it will + be accessible via /sys/firmware/initrd. 
retbleed= [X86] Control mitigation of RETBleed (Arbitrary Speculative Code Execution with Return Instructions) @@ -5423,7 +6522,7 @@ that don't. off - no mitigation - auto - automatically select a migitation + auto - automatically select a mitigation auto,nosmt - automatically select a mitigation, disabling SMT if necessary for the full mitigation (only on Zen1 @@ -5461,29 +6560,35 @@ 2 The "airplane mode" button toggles between everything blocked and everything unblocked. - rhash_entries= [KNL,NET] - Set number of hash buckets for route cache - ring3mwait=disable [KNL] Disable ring 3 MONITOR/MWAIT feature on supported CPUs. + riscv_isa_fallback [RISCV,EARLY] + When CONFIG_RISCV_ISA_FALLBACK is not enabled, permit + falling back to detecting extension support by parsing + "riscv,isa" property on devicetree systems when the + replacement properties are not found. See the Kconfig + entry for RISCV_ISA_FALLBACK. + ro [KNL] Mount root device read-only on boot - rodata= [KNL] + rodata= [KNL,EARLY] on Mark read-only kernel memory as read-only (default). off Leave read-only kernel memory writable for debugging. - full Mark read-only kernel memory and aliases as read-only - [arm64] + noalias Mark read-only kernel memory as read-only but retain + writable aliases in the direct map for regions outside + of the kernel image. [arm64] rockchip.usb_uart + [EARLY] Enable the uart passthrough on the designated usb port on Rockchip SoCs. When active, the signals of the debug-uart get routed to the D+ and D- pins of the usb port and the regular usb controller gets disabled. root= [KNL] Root filesystem - Usually this a a block device specifier of some kind, + Usually this is a block device specifier of some kind, see the early_lookup_bdev comment in block/early-lookup.c for details. 
Alternatively this can be "ram" for the legacy initial @@ -5495,17 +6600,33 @@ rootflags= [KNL] Set root filesystem mount option string + initramfs_options= [KNL] + Specify mount options for for the initramfs mount. + rootfstype= [KNL] Set root filesystem type rootwait [KNL] Wait (indefinitely) for root device to show up. Useful for devices that are detected asynchronously (e.g. USB and MMC devices). + rootwait= [KNL] Maximum time (in seconds) to wait for root device + to show up before attempting to mount the root + filesystem. + rproc_mem=nn[KMG][@address] [KNL,ARM,CMA] Remoteproc physical memory block. Memory area to be used by remote processor image, managed by CMA. + rseq_debug= [KNL] Enable or disable restartable sequence + debug mode. Defaults to CONFIG_RSEQ_DEBUG_DEFAULT_ENABLE. + Format: <bool> + + rt_group_sched= [KNL] Enable or disable SCHED_RR/FIFO group scheduling + when CONFIG_RT_GROUP_SCHED=y. Defaults to + !CONFIG_RT_GROUP_SCHED_DEFAULT_DISABLED. + Format: <bool> + rw [KNL] Mount root device read-write on boot S [KNL] Run init in single mode @@ -5513,9 +6634,10 @@ s390_iommu= [HW,S390] Set s390 IOTLB flushing mode strict - With strict flushing every unmap operation will result in - an IOTLB flush. Default is lazy flushing before reuse, - which is faster. + With strict flushing every unmap operation will result + in an IOTLB flush. Default is lazy flushing before + reuse, which is faster. Deprecated, equivalent to + iommu.strict=1. s390_iommu_aperture= [KNL,S390] Specifies the size of the per device DMA address space @@ -5532,7 +6654,12 @@ sa1100ir [NET] See drivers/net/irda/sa1100_ir.c. - sched_verbose [KNL] Enables verbose scheduler debug messages. + sched_proxy_exec= [KNL] + Enables or disables "proxy execution" style + solution to mutex-based priority inversion. + Format: <bool> + + sched_verbose [KNL,EARLY] Enables verbose scheduler debug messages. schedstats= [KNL,X86] Enable or disable scheduled statistics. 
Allowed values are enable and disable. This feature @@ -5540,6 +6667,7 @@ but is useful for debugging and performance tuning. sched_thermal_decay_shift= + [Deprecated] [KNL, SMP] Set a decay shift for scheduler thermal pressure signal. Thermal pressure signal follows the default decay period of other scheduler pelt @@ -5647,7 +6775,11 @@ non-zero "wait" parameter. See weight_single and weight_many. - skew_tick= [KNL] Offset the periodic timer tick per cpu to mitigate + sdw_mclk_divider=[SDW] + Specify the MCLK divider for Intel SoundWire buses in + case the BIOS does not provide the clock rate properly. + + skew_tick= [KNL,EARLY] Offset the periodic timer tick per cpu to mitigate xtime_lock contention on larger systems, and/or RCU lock contention on all systems with CONFIG_MAXSMP set. Format: { "0" | "1" } @@ -5669,7 +6801,16 @@ serialnumber [BUGS=X86-32] - sev=option[,option...] [X86-64] See Documentation/arch/x86/x86_64/boot-options.rst + sev=option[,option...] [X86-64] + + debug + Enable debug messages. + + nosnp + Do not enable SEV-SNP (applies to host/hypervisor + only). Setting 'nosnp' avoids the RMP check overhead + in memory accesses when users do not want to run + SEV-SNP guests. shapers= [NET] Maximal number of shapers. @@ -5683,14 +6824,47 @@ apic=verbose is specified. Example: apic=debug show_lapic=all - simeth= [IA-64] - simscsi= + slab_debug[=options[,slabs][;[options[,slabs]]...] [MM] + Enabling slab_debug allows one to determine the + culprit if slab objects become corrupted. Enabling + slab_debug can create guard zones around objects and + may poison objects when not in use. Also tracks the + last alloc / free. For more information see + Documentation/admin-guide/mm/slab.rst. + (slub_debug legacy name also accepted for now) - slram= [HW,MTD] + Using this option implies the "no_hash_pointers" + option which can be undone by adding the + "hash_pointers=always" option. + + slab_max_order= [MM] + Determines the maximum allowed order for slabs. 
+ A high setting may cause OOMs due to memory + fragmentation. For more information see + Documentation/admin-guide/mm/slab.rst. + (slub_max_order legacy name also accepted for now) slab_merge [MM] Enable merging of slabs with similar size when the kernel is built without CONFIG_SLAB_MERGE_DEFAULT. + (slub_merge legacy name also accepted for now) + + slab_min_objects= [MM] + The minimum number of objects per slab. SLUB will + increase the slab order up to slab_max_order to + generate a sufficiently large slab able to contain + the number of objects indicated. The higher the number + of objects the smaller the overhead of tracking slabs + and the less frequently locks need to be acquired. + For more information see + Documentation/admin-guide/mm/slab.rst. + (slub_min_objects legacy name also accepted for now) + + slab_min_order= [MM] + Determines the minimum page order for slabs. Must be + lower or equal to slab_max_order. For more information see + Documentation/admin-guide/mm/slab.rst. + (slub_min_order legacy name also accepted for now) slab_nomerge [MM] Disable merging of slabs with similar size. May be @@ -5703,48 +6877,21 @@ cache (risks via metadata attacks are mostly unchanged). Debug options disable merging on their own. - For more information see Documentation/mm/slub.rst. - - slab_max_order= [MM, SLAB] - Determines the maximum allowed order for slabs. - A high setting may cause OOMs due to memory - fragmentation. Defaults to 1 for systems with - more than 32MB of RAM, 0 otherwise. - - slub_debug[=options[,slabs][;[options[,slabs]]...] [MM, SLUB] - Enabling slub_debug allows one to determine the - culprit if slab objects become corrupted. Enabling - slub_debug can create guard zones around objects and - may poison objects when not in use. Also tracks the - last alloc / free. For more information see - Documentation/mm/slub.rst. + For more information see + Documentation/admin-guide/mm/slab.rst. 
+ (slub_nomerge legacy name also accepted for now) + + slab_strict_numa [MM] + Support memory policies on a per object level + in the slab allocator. The default is for memory + policies to be applied at the folio level when + a new folio is needed or a partial folio is + retrieved from the lists. Increases overhead + in the slab fastpaths but gains more accurate + NUMA kernel object placement which helps with slow + interconnects in NUMA systems. - slub_max_order= [MM, SLUB] - Determines the maximum allowed order for slabs. - A high setting may cause OOMs due to memory - fragmentation. For more information see - Documentation/mm/slub.rst. - - slub_min_objects= [MM, SLUB] - The minimum number of objects per slab. SLUB will - increase the slab order up to slub_max_order to - generate a sufficiently large slab able to contain - the number of objects indicated. The higher the number - of objects the smaller the overhead of tracking slabs - and the less frequently locks need to be acquired. - For more information see Documentation/mm/slub.rst. - - slub_min_order= [MM, SLUB] - Determines the minimum page order for slabs. Must be - lower than slub_max_order. - For more information see Documentation/mm/slub.rst. - - slub_merge [MM, SLUB] - Same with slab_merge. - - slub_nomerge [MM, SLUB] - Same with slab_nomerge. This is supported for legacy. - See slab_nomerge for more information. + slram= [HW,MTD] smart2= [HW] Format: <io1>[,<io2>[,...,<io8>]] @@ -5760,6 +6907,13 @@ This feature may be more efficiently disabled using the csdlock_debug- kernel parameter. + smp.panic_on_ipistall= [KNL] + If a csd_lock_timeout extends for more than + the specified number of milliseconds, panic the + system. By default, let CSD-lock acquisition + take as long as they take. Specifying 300,000 + for this value provides a 5-minute timeout. 
+ smsc-ircc2.nopnp [HW] Don't use PNP to discover SMC devices smsc-ircc2.ircc_cfg= [HW] Device configuration I/O port smsc-ircc2.ircc_sir= [HW] SIR base I/O port @@ -5771,10 +6925,10 @@ 1: Fast pin select (default) 2: ATC IRMode - smt= [KNL,MIPS,S390] Set the maximum number of threads (logical - CPUs) to use per physical CPU on systems capable of - symmetric multithreading (SMT). Will be capped to the - actual hardware limit. + smt= [KNL,MIPS,S390,EARLY] Set the maximum number of threads + (logical CPUs) to use per physical CPU on systems + capable of symmetric multithreading (SMT). Will + be capped to the actual hardware limit. Format: <integer> Default: -1 (no limit) @@ -5796,7 +6950,22 @@ sonypi.*= [HW] Sony Programmable I/O Control Device driver See Documentation/admin-guide/laptops/sonypi.rst - spectre_v2= [X86] Control mitigation of Spectre variant 2 + spectre_bhi= [X86] Control mitigation of Branch History Injection + (BHI) vulnerability. This setting affects the + deployment of the HW BHI control and the SW BHB + clearing sequence. + + on - (default) Enable the HW or SW mitigation as + needed. This protects the kernel from + both syscalls and VMs. + vmexit - On systems which don't have the HW mitigation + available, enable the SW mitigation on vmexit + ONLY. On such systems, the host kernel is + protected from VM-originated BHI attacks, but + may still be vulnerable to syscall attacks. + off - Disable the mitigation. + + spectre_v2= [X86,EARLY] Control mitigation of Spectre variant 2 (indirect branch speculation) vulnerability. The default operation protects the kernel from user space attacks. @@ -5811,11 +6980,13 @@ Selecting 'on' will, and 'auto' may, choose a mitigation method at run time according to the CPU, the available microcode, the setting of the - CONFIG_RETPOLINE configuration option, and the - compiler with which the kernel was built. + CONFIG_MITIGATION_RETPOLINE configuration option, + and the compiler with which the kernel was built. 
Selecting 'on' will also enable the mitigation against user space to user space task attacks. + Selecting specific mitigation does not force enable + user mitigations. Selecting 'off' will disable both the kernel and the user space protections. @@ -5875,8 +7046,19 @@ Not specifying this option is equivalent to spectre_v2_user=auto. + spec_rstack_overflow= + [X86,EARLY] Control RAS overflow mitigation on AMD Zen CPUs + + off - Disable mitigation + microcode - Enable microcode mitigation only + safe-ret - Enable sw-only safe RET mitigation (default) + ibpb - Enable mitigation by issuing IBPB on + kernel entry + ibpb-vmexit - Issue IBPB only on VMEXIT + (cloud-specific mitigation) + spec_store_bypass_disable= - [HW] Control Speculative Store Bypass (SSB) Disable mitigation + [HW,EARLY] Control Speculative Store Bypass (SSB) Disable mitigation (Speculative Store Bypass vulnerability) Certain CPUs are vulnerable to an exploit against a @@ -5927,11 +7109,6 @@ Not specifying this option is equivalent to spec_store_bypass_disable=auto. - spia_io_base= [HW,MTD] - spia_fio_base= - spia_pedr= - spia_peddr= - split_lock_detect= [X86] Enable split lock detection or bus lock detection @@ -5972,7 +7149,7 @@ #DB exception for bus lock is triggered only when CPL > 0. - srbds= [X86,INTEL] + srbds= [X86,INTEL,EARLY] Control the Special Register Buffer Data Sampling (SRBDS) mitigation. @@ -6059,7 +7236,7 @@ srcutree.convert_to_big must have the 0x10 bit set for contention-based conversions to occur. - ssbd= [ARM64,HW] + ssbd= [ARM64,HW,EARLY] Speculative Store Bypass Disable control On CPUs that are vulnerable to the Speculative @@ -6083,14 +7260,19 @@ growing up) the main stack are reserved for no other mapping. Default value is 256 pages. - stack_depot_disable= [KNL] + stack_depot_disable= [KNL,EARLY] Setting this to true through kernel command line will disable the stack depot thereby saving the static memory consumed by the stack hash table. By default this is set to false. 
+ stack_depot_max_pools= [KNL,EARLY] + Specify the maximum number of pools to use for storing + stack traces. Pools are allocated on-demand up to this + limit. Default value is 8191 pools. + stacktrace [FTRACE] - Enabled the stack tracer on boot up. + Enable the stack tracer on boot up. stacktrace_filter=[function-list] [FTRACE] Limit the functions that the stack tracer @@ -6122,16 +7304,19 @@ be used to filter out binaries which have not yet been made aware of AT_MINSIGSTKSZ. - stress_hpt [PPC] + stress_hpt [PPC,EARLY] Limits the number of kernel HPT entries in the hash page table to increase the rate of hash page table faults on kernel addresses. - stress_slb [PPC] + stress_slb [PPC,EARLY] Limits the number of kernel SLB entries, and flushes them frequently to increase the rate of SLB faults on kernel addresses. + no_slb_preload [PPC,EARLY] + Disables slb preloading for userspace. + sunrpc.min_resvport= sunrpc.max_resvport= [NFS,SUNRPC] @@ -6187,7 +7372,7 @@ This parameter controls use of the Protected Execution Facility on pSeries. - swiotlb= [ARM,IA-64,PPC,MIPS,X86] + swiotlb= [ARM,PPC,MIPS,X86,S390,EARLY] Format: { <int> [,<int>] | force | noforce } <int> -- Number of I/O TLB slabs <int> -- Second integer after comma. Number of swiotlb @@ -6197,7 +7382,7 @@ wouldn't be automatically used by the kernel noforce -- Never use bounce buffers (for debugging) - switches= [HW,M68k] + switches= [HW,M68k,EARLY] sysctl.*= [KNL] Set a sysctl parameter, right before loading the init @@ -6243,10 +7428,6 @@ -1: disable all critical trip points in all thermal zones <degrees C>: override all critical trip points - thermal.nocrt= [HW,ACPI] - Set to disable actions on ACPI thermal zone - critical and hot trip points. 
- thermal.off= [HW,ACPI] 1: disable ACPI thermal control @@ -6260,11 +7441,30 @@ <deci-seconds>: poll all this frequency 0: no polling (default) - threadirqs [KNL] + thp_anon= [KNL] + Format: <size>[KMG],<size>[KMG]:<state>;<size>[KMG]-<size>[KMG]:<state> + state is one of "always", "madvise", "never" or "inherit". + Control the default behavior of the system with respect + to anonymous transparent hugepages. + Can be used multiple times for multiple anon THP sizes. + See Documentation/admin-guide/mm/transhuge.rst for more + details. + + threadirqs [KNL,EARLY] Force threading of all interrupt handlers except those marked explicitly IRQF_NO_THREAD. - topology= [S390] + thp_shmem= [KNL] + Format: <size>[KMG],<size>[KMG]:<policy>;<size>[KMG]-<size>[KMG]:<policy> + Control the default policy of each hugepage size for the + internal shmem mount. <policy> is one of policies available + for the shmem mount ("always", "inherit", "never", "within_size", + and "advise"). + It can be used multiple times for multiple shmem THP sizes. + See Documentation/admin-guide/mm/transhuge.rst for more + details. + + topology= [S390,EARLY] Format: {off | on} Specify if the kernel should make use of the cpu topology information if the hardware supports this. @@ -6272,12 +7472,6 @@ e.g. base its process migration decisions on it. Default is on. - topology_updates= [KNL, PPC, NUMA] - Format: {off} - Specify if the kernel should ignore (off) - topology updates sent by the hypervisor to this - LPAR. - torture.disable_onoff_at_boot= [KNL] Prevent the CPU-hotplug component of torturing until after init has spawned. @@ -6297,7 +7491,22 @@ torture.verbose_sleep_duration= [KNL] Duration of each verbose-printk() sleep in jiffies. - tp720= [HW,PS2] + tpm.disable_pcr_integrity= [HW,TPM] + Do not protect PCR registers from unintended physical + access, or interposers in the bus by the means of + having an integrity protected session wrapped around + TPM2_PCR_Extend command. 
Consider this in a situation + where TPM is heavily utilized by IMA, thus protection + causing a major performance hit, and the space where + machines are deployed is by other means guarded. + + tpm_crb_ffa.busy_timeout_ms= [ARM64,TPM] + Maximum time in milliseconds to retry sending a message + to the TPM service before giving up. This parameter controls + how long the system will continue retrying when the TPM + service is busy. + Format: <unsigned int> + Default: 2000 (2 seconds) tpm_suspend_pcr=[HW,TPM] Format: integer pcr id @@ -6308,6 +7517,13 @@ This will guarantee that all the other pcrs are saved. + tpm_tis.interrupts= [HW,TPM] + Enable interrupts for the MMIO based physical layer + for the FIFO interface. By default it is set to false + (0). For more information about TPM hardware interfaces + defined by Trusted Computing Group (TCG) see + https://trustedcomputinggroup.org/resource/pc-client-platform-tpm-profile-ptp-specification/ + tp_printk [FTRACE] Have the tracepoints sent to printk as well as the tracing ring buffer. This is useful for early boot up @@ -6318,7 +7534,7 @@ To turn off having tracepoints sent to printk, echo 0 > /proc/sys/kernel/tracepoint_printk Note, echoing 1 into this file without the - tracepoint_printk kernel cmdline option has no effect. + tp_printk kernel cmdline option has no effect. The tp_printk_stop_on_boot (see below) can also be used to stop the printing of events to console at @@ -6348,7 +7564,7 @@ (converted into nanoseconds). Fast, but depending on the architecture, may not be in sync between CPUs. - global - Event time stamps are synchronize across + global - Event time stamps are synchronized across CPUs. May be slower than the local clock, but better for some race conditions. counter - Simple counting of events (1, 2, ..) @@ -6370,6 +7586,14 @@ comma-separated list of trace events to enable. 
See also Documentation/trace/events.rst + To enable modules, use :mod: keyword: + + trace_event=:mod:<module> + + The value before :mod: will only enable specific events + that are part of the module. See the above mentioned + document for more information. + trace_instance=[instance-info] [FTRACE] Create a ring buffer instance early in boot up. This will be listed in: @@ -6390,6 +7614,59 @@ the same thing would happen if it was left off). The irq_handler_entry event, and all events under the "initcall" system. + Flags can be added to the instance to modify its behavior when it is + created. The flags are separated by '^'. + + The available flags are: + + traceoff - Have the tracing instance tracing disabled after it is created. + traceprintk - Have trace_printk() write into this trace instance + (note, "printk" and "trace_printk" can also be used) + + trace_instance=foo^traceoff^traceprintk,sched,irq + + The flags must come before the defined events. + + If memory has been reserved (see memmap for x86), the instance + can use that memory: + + memmap=12M$0x284500000 trace_instance=boot_map@0x284500000:12M + + The above will create a "boot_map" instance that uses the physical + memory at 0x284500000 that is 12Megs. The per CPU buffers of that + instance will be split up accordingly. + + Alternatively, the memory can be reserved by the reserve_mem option: + + reserve_mem=12M:4096:trace trace_instance=boot_map@trace + + This will reserve 12 megabytes at boot up with a 4096 byte alignment + and place the ring buffer in this memory. Note that due to KASLR, the + memory may not be the same location each time, which will not preserve + the buffer content. + + Also note that the layout of the ring buffer data may change between + kernel versions where the validator will fail and reset the ring buffer + if the layout is not the same as the previous kernel. 
+ + If the ring buffer is used for persistent bootups and has events enabled, + it is recommended to disable tracing so that events from a previous boot do not + mix with events of the current boot (unless you are debugging a random crash + at boot up). + + reserve_mem=12M:4096:trace trace_instance=boot_map^traceoff^traceprintk@trace,sched,irq + + Note, saving the trace buffer across reboots does require that the system + is set up to not wipe memory. For instance, CONFIG_RESET_ATTACK_MITIGATION + can force a memory reset on boot which will clear any trace that was stored. + This is just one of many ways that can clear memory. Make sure your system + keeps the content of memory across reboots before relying on this option. + + NB: Both the mapped address and size must be page aligned for the architecture. + + See also Documentation/trace/debugging.rst + + trace_options=[option-list] [FTRACE] Enable or disable tracer options at boot. The option-list is a comma delimited list of options @@ -6407,12 +7684,12 @@ section. trace_trigger=[trigger-list] - [FTRACE] Add a event trigger on specific events. + [FTRACE] Add an event trigger on specific events. Set a trigger on top of a specific event, with an optional filter. - The format is is "trace_trigger=<event>.<trigger>[ if <filter>],..." - Where more than one trigger may be specified that are comma deliminated. + The format is "trace_trigger=<event>.<trigger>[ if <filter>],..." + Where more than one trigger may be specified that are comma delimited. For example: @@ -6420,11 +7697,20 @@ The above will enable the "stacktrace" trigger on the "sched_switch" event but only trigger it if the "prev_state" of the "sched_switch" - event is "2" (TASK_UNINTERUPTIBLE). + event is "2" (TASK_UNINTERRUPTIBLE). See also "Event triggers" in Documentation/trace/events.rst + traceoff_after_boot + [FTRACE] Sometimes tracing is used to debug issues + during the boot process. 
Since the trace buffer has a + limited amount of storage, it may be prudent to + disable tracing after the boot is finished, otherwise + the critical information may be overwritten. With this + option, the main tracing buffer will be turned off at + the end of the boot process. + traceoff_on_warning [FTRACE] enable this option to disable tracing when a warning is hit. This turns off "tracing_on". Tracing can @@ -6446,6 +7732,20 @@ See Documentation/admin-guide/mm/transhuge.rst for more details. + transparent_hugepage_shmem= [KNL] + Format: [always|within_size|advise|never|deny|force] + Can be used to control the hugepage allocation policy for + the internal shmem mount. + See Documentation/admin-guide/mm/transhuge.rst + for more details. + + transparent_hugepage_tmpfs= [KNL] + Format: [always|within_size|advise|never] + Can be used to control the default hugepage allocation policy + for the tmpfs mount. + See Documentation/admin-guide/mm/transhuge.rst + for more details. + trusted.source= [KEYS] Format: <string> This parameter identifies the trust source as a backend @@ -6454,6 +7754,7 @@ - "tpm" - "tee" - "caam" + - "dcp" If not specified then it defaults to iterating through the trust source list starting with TPM and assigns the first trust source as a backend which is initialized @@ -6469,6 +7770,31 @@ If not specified, "default" is used. In this case, the RNG's choice is left to each individual trust source. + trusted.dcp_use_otp_key + This is intended to be used in combination with + trusted.source=dcp and will select the DCP OTP key + instead of the DCP UNIQUE key blob encryption. + + trusted.dcp_skip_zk_test + This is intended to be used in combination with + trusted.source=dcp and will disable the check if the + blob key is all zeros. This is helpful for situations where + having this key zero'ed is acceptable. E.g. in testing + scenarios. + + tsa= [X86] Control mitigation for Transient Scheduler + Attacks on AMD CPUs. 
Search the following in your + favourite search engine for more details: + + "Technical guidance for mitigating transient scheduler + attacks". + + off - disable the mitigation + on - enable the mitigation (default) + user - mitigate only user/kernel transitions + vm - mitigate only guest/host transitions + + tsc= Disable clocksource stability checks for TSC. Format: <string> [x86] reliable: mark tsc clocksource as reliable, this @@ -6498,7 +7824,7 @@ can be overridden by a later tsc=nowatchdog. A console message will flag any such suppression or overriding. - tsc_early_khz= [X86] Skip early TSC calibration and use the given + tsc_early_khz= [X86,EARLY] Skip early TSC calibration and use the given value instead. Useful when the early TSC frequency discovery procedure is not reliable, such as on overclocked systems with CPUID.16h support and partial CPUID.15h support. @@ -6533,7 +7859,7 @@ See Documentation/admin-guide/hw-vuln/tsx_async_abort.rst for more details. - tsx_async_abort= [X86,INTEL] Control mitigation for the TSX Async + tsx_async_abort= [X86,INTEL,EARLY] Control mitigation for the TSX Async Abort (TAA) vulnerability. Similar to Micro-architectural Data Sampling (MDS) @@ -6596,10 +7922,26 @@ Note that genuine overcurrent events won't be reported either. + unaligned_scalar_speed= + [RISCV] + Format: {slow | fast | unsupported} + Allow skipping scalar unaligned access speed tests. This + is useful for testing alternative code paths and to skip + the tests in environments where they run too slowly. All + CPUs must have the same scalar unaligned access speed. + + unaligned_vector_speed= + [RISCV] + Format: {slow | fast | unsupported} + Allow skipping vector unaligned access speed tests. This + is useful for testing alternative code paths and to skip + the tests in environments where they run too slowly. All + CPUs must have the same vector unaligned access speed. + unknown_nmi_panic [X86] Cause panic on unknown NMI. 
- unwind_debug [X86-64] + unwind_debug [X86-64,EARLY] Enable unwinder debug output. This can be useful for debugging certain unwinder error conditions, including corrupt stacks and @@ -6607,7 +7949,7 @@ usbcore.authorized_default= [USB] Default USB device authorization: - (default -1 = authorized except for wireless USB, + (default -1 = authorized (same as 1), 0 = not authorized, 1 = authorized, 2 = authorized if device connected to internal port) @@ -6705,6 +8047,9 @@ pause after every control message); o = USB_QUIRK_HUB_SLOW_RESET (Hub needs extra delay after resetting its port); + p = USB_QUIRK_SHORT_SET_ADDRESS_REQ_TIMEOUT + (Reduce timeout of the SET_ADDRESS + request from 5000 ms to 500 ms); Example: quirks=0781:5580:bk,0a5c:5834:gij usbhid.mousepoll= @@ -6719,6 +8064,9 @@ usb-storage.delay_use= [UMS] The delay in seconds before a new device is scanned for Logical Units (default 1). + Optionally the delay in milliseconds if the value has + suffix with "ms". + Example: delay_use=2567ms usb-storage.quirks= [UMS] A list of quirks entries to supplement or @@ -6785,13 +8133,6 @@ 16 - SIGBUS faults Example: user_debug=31 - userpte= - [X86] Flags controlling user PTE allocations. - - nohigh = do not allocate PTE pages in - HIGHMEM regardless of setting - of CONFIG_HIGHPTE. - vdso= [X86,SH,SPARC] On X86_32, this is an alias for vdso32=. Otherwise: @@ -6812,10 +8153,7 @@ Try vdso32=0 if you encounter an error that says: dl_main: Assertion `(void *) ph->p_vaddr == _rtld_local._dl_sysinfo_dso' failed! - vector= [IA-64,SMP] - vector=percpu: enable percpu vector domain - - video= [FB] Frame buffer configuration + video= [FB,EARLY] Frame buffer configuration See Documentation/fb/modedb.rst. video.brightness_switch_enabled= [ACPI] @@ -6863,13 +8201,16 @@ P Enable page structure init time poisoning - Disable all of the above options - vmalloc=nn[KMG] [KNL,BOOT] Forces the vmalloc area to have an exact - size of <nn>. 
This can be used to increase the - minimum size (128MB on x86). It can also be used to - decrease the size and leave more room for directly - mapped kernel RAM. + vmalloc=nn[KMG] [KNL,BOOT,EARLY] Forces the vmalloc area to have an + exact size of <nn>. This can be used to increase + the minimum size (128MB on x86, arm32 platforms). + It can also be used to decrease the size and leave more room + for directly mapped kernel RAM. Note that this parameter does + not exist on many other platforms (including arm64, alpha, + loongarch, arc, csky, hexagon, microblaze, mips, nios2, openrisc, + parisc, m68k, powerpc, riscv, sh, um, xtensa, s390, sparc). - vmcp_cma=nn[MG] [KNL,S390] + vmcp_cma=nn[MG] [KNL,S390,EARLY] Sets the memory size reserved for contiguous memory allocations for the vmcp device driver. @@ -6882,7 +8223,17 @@ vmpoff= [KNL,S390] Perform z/VM CP command after power off. Format: <command> - vsyscall= [X86-64] + vmscape= [X86] Controls mitigation for VMscape attacks. + VMscape attacks can leak information from a userspace + hypervisor to a guest via speculative side-channels. + + off - disable the mitigation + ibpb - use Indirect Branch Prediction Barrier + (IBPB) mitigation (default) + force - force vulnerability detection even on + unaffected processors + + vsyscall= [X86-64,EARLY] Controls the behavior of vsyscalls (i.e. calls to fixed addresses of 0xffffffffff600x00 from legacy code). Most statically-linked binaries and older @@ -6909,7 +8260,7 @@ vt.cur_default= [VT] Default cursor shape. Format: 0xCCBBAA, where AA, BB, and CC are the same as the parameters of the <Esc>[?A;B;Cc escape sequence; - see VGA-softcursor.txt. Default: 2 = underline. + see vga-softcursor.rst. Default: 2 = underline. vt.default_blu= [VT] Format: <blue0>,<blue1>,<blue2>,...,<blue15> @@ -6964,6 +8315,13 @@ disables both lockup detectors. Default is 10 seconds. + workqueue.unbound_cpus= + [KNL,SMP] Specify to constrain one or some CPUs + to use in unbound workqueues. 
+ Format: <cpu-list> + By default, all online CPUs are available for + unbound workqueues. + workqueue.watchdog_thresh= If CONFIG_WQ_WATCHDOG is configured, workqueue can warn stall conditions and dump internal state to @@ -6973,6 +8331,13 @@ it can be updated at runtime by writing to the corresponding sysfs file. + workqueue.panic_on_stall=<uint> + Panic when workqueue stall is detected by + CONFIG_WQ_WATCHDOG. It sets the number times of the + stall to trigger panic. + + The default is 0, which disables the panic on stall. + workqueue.cpu_intensive_thresh_us= Per-cpu work items which run for longer than this threshold are automatically considered CPU intensive @@ -6985,14 +8350,14 @@ threshold repeatedly. They are likely good candidates for using WQ_UNBOUND workqueues instead. - workqueue.disable_numa - By default, all work items queued to unbound - workqueues are affine to the NUMA nodes they're - issued on, which results in better behavior in - general. If NUMA affinity needs to be disabled for - whatever reason, this option can be used. Note - that this also can be controlled per-workqueue for - workqueues visible under /sys/bus/workqueue/. + workqueue.cpu_intensive_warning_thresh=<uint> + If CONFIG_WQ_CPU_INTENSIVE_REPORT is set, the kernel + will report the work functions which violate the + intensive_threshold_us repeatedly. In order to prevent + spurious warnings, start printing only after a work + function has violated this threshold number of times. + + The default is 4 times. 0 disables the warning. workqueue.power_efficient Per-cpu workqueues are generally preferred because @@ -7009,6 +8374,18 @@ The default value of this parameter is determined by the config option CONFIG_WQ_POWER_EFFICIENT_DEFAULT. + workqueue.default_affinity_scope= + Select the default affinity scope to use for unbound + workqueues. Can be one of "cpu", "smt", "cache", + "numa" and "system". Default is "cache". 
For more + information, see the Affinity Scopes section in + Documentation/core-api/workqueue.rst. + + This can be changed after boot by writing to the + matching /sys/module/workqueue/parameters file. All + workqueues with the "default" affinity scope will be + updated accordingly. + workqueue.debug_force_rr_cpu Workqueue used to implicitly guarantee that work items queued without explicit CPU specified are put @@ -7020,13 +8397,13 @@ When enabled, memory and cache locality will be impacted. - writecombine= [LOONGARCH] Control the MAT (Memory Access Type) of - ioremap_wc(). + writecombine= [LOONGARCH,EARLY] Control the MAT (Memory Access + Type) of ioremap_wc(). on - Enable writecombine, use WUC for ioremap_wc() off - Disable writecombine, use SUC for ioremap_wc() - x2apic_phys [X86-64,APIC] Use x2apic physical mode instead of + x2apic_phys [X86-64,APIC,EARLY] Use x2apic physical mode instead of default x2apic cluster mode on platforms supporting x2apic. @@ -7037,7 +8414,7 @@ save/restore/migration must be enabled to handle larger domains. - xen_emul_unplug= [HW,X86,XEN] + xen_emul_unplug= [HW,X86,XEN,EARLY] Unplug Xen emulated devices Format: [unplug0,][unplug1] ide-disks -- unplug primary master IDE devices @@ -7049,21 +8426,22 @@ the unplug protocol never -- do not unplug even if version check succeeds - xen_legacy_crash [X86,XEN] + xen_legacy_crash [X86,XEN,EARLY] Crash from Xen panic notifier, without executing late panic() code such as dumping handler. - xen_msr_safe= [X86,XEN] + xen_mc_debug [X86,XEN,EARLY] + Enable multicall debugging when running as a Xen PV guest. + Enabling this feature will reduce performance a little + bit, so it should only be enabled for obtaining extended + debug data in case of multicall errors. + + xen_msr_safe= [X86,XEN,EARLY] Format: <bool> Select whether to always use non-faulting (safe) MSR access functions when running as Xen PV guest. The default value is controlled by CONFIG_XEN_PV_MSR_SAFE. 
- xen_nopvspin [X86,XEN] - Disables the qspinlock slowpath using Xen PV optimizations. - This parameter is obsoleted by "nopvspin" parameter, which - has equivalent effect for XEN platform. - xen_nopv [X86] Disables the PV optimizations forcing the HVM guest to run as generic HVM guest with no PV drivers. @@ -7071,7 +8449,7 @@ has equivalent effect for XEN platform. xen_no_vector_callback - [KNL,X86,XEN] Disable the vector callback for Xen + [KNL,X86,XEN,EARLY] Disable the vector callback for Xen event channel interrupts. xen_scrub_pages= [XEN] @@ -7080,7 +8458,7 @@ with /sys/devices/system/xen_memory/xen_memory0/scrub_pages. Default value controlled with CONFIG_XEN_SCRUB_PAGES_DEFAULT. - xen_timer_slop= [X86-64,XEN] + xen_timer_slop= [X86-64,XEN,EARLY] Set the timer slop (in nanoseconds) for the virtual Xen timers (default is 100000). This adjusts the minimum delta of virtualized Xen timers, where lower values @@ -7133,7 +8511,7 @@ host controller quirks. Meaning of each bit can be consulted in header drivers/usb/host/xhci.h. - xmon [PPC] + xmon [PPC,EARLY] Format: { early | on | rw | ro | off } Controls if xmon debugger is enabled. Default is off. Passing only "xmon" is equivalent to "xmon=early". @@ -7151,4 +8529,3 @@ memory, and other data can't be written using xmon commands. off xmon is disabled. - |
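The trace_instance= text added by this diff describes a value made of an instance name, optional '^'-separated flags, an optional '@' memory reference, and a comma-separated event list, with the flags placed before the events. The sketch below only illustrates that layout as it can be read from the examples in the diff. It is not the kernel's parser, and the function name parse_trace_instance and the returned dictionary keys are hypothetical, chosen for this example.

```python
def parse_trace_instance(value):
    """Split a trace_instance= value into its documented parts.

    Illustrative only: the layout name[^flag...][@memory][,event...] is
    inferred from the examples in the documentation text, not taken from
    the kernel sources.
    """
    # Everything after the first comma is the event list.
    head, *events = value.split(",")
    # An optional "@..." names reserved memory: either "<addr>:<size>"
    # (memmap style) or a reserve_mem label.
    head, _, memory = head.partition("@")
    # Flags follow the instance name, separated by '^'.
    name, *flags = head.split("^")
    return {
        "name": name,
        "flags": flags,
        "memory": memory or None,
        "events": events,
    }


if __name__ == "__main__":
    print(parse_trace_instance("boot_map^traceoff^traceprintk@trace,sched,irq"))
```

Splitting on the first comma before looking for '@' and '^' mirrors the stated rule that flags must come before the defined events. Event names such as sched:sched_switch contain a colon but no comma, so they pass through unchanged.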
