summaryrefslogtreecommitdiff
path: root/Documentation/admin-guide
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/admin-guide')
-rw-r--r--Documentation/admin-guide/LSM/SELinux.rst11
-rw-r--r--Documentation/admin-guide/gpio/gpio-sim.rst7
-rw-r--r--Documentation/admin-guide/hw-vuln/attack_vector_controls.rst238
-rw-r--r--Documentation/admin-guide/hw-vuln/index.rst1
-rw-r--r--Documentation/admin-guide/kernel-parameters.txt36
-rw-r--r--Documentation/admin-guide/pm/amd-pstate.rst2
-rw-r--r--Documentation/admin-guide/pm/cpufreq.rst4
-rw-r--r--Documentation/admin-guide/syscall-user-dispatch.rst23
-rw-r--r--Documentation/admin-guide/sysctl/kernel.rst36
-rw-r--r--Documentation/admin-guide/sysctl/vm.rst8
-rw-r--r--Documentation/admin-guide/thunderbolt.rst9
11 files changed, 331 insertions, 44 deletions
diff --git a/Documentation/admin-guide/LSM/SELinux.rst b/Documentation/admin-guide/LSM/SELinux.rst
index 520a1c2c6fd2..cdd65164ca96 100644
--- a/Documentation/admin-guide/LSM/SELinux.rst
+++ b/Documentation/admin-guide/LSM/SELinux.rst
@@ -2,6 +2,17 @@
SELinux
=======
+Information about the SELinux kernel subsystem can be found at the
+following links:
+
+ https://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux.git/tree/README.md
+
+ https://github.com/selinuxproject/selinux-kernel/wiki
+
+Information about the SELinux userspace can be found at:
+
+ https://github.com/SELinuxProject/selinux/wiki
+
If you want to use SELinux, chances are you will want
to use the distro-provided policies, or install the
latest reference policy release from
diff --git a/Documentation/admin-guide/gpio/gpio-sim.rst b/Documentation/admin-guide/gpio/gpio-sim.rst
index 35d49ccd49e0..f5135a14ef2e 100644
--- a/Documentation/admin-guide/gpio/gpio-sim.rst
+++ b/Documentation/admin-guide/gpio/gpio-sim.rst
@@ -50,8 +50,11 @@ the number of lines exposed by this bank.
**Attribute:** ``/config/gpio-sim/gpio-device/gpio-bankX/lineY/name``
-This group represents a single line at the offset Y. The 'name' attribute
-allows to set the line name as represented by the 'gpio-line-names' property.
+**Attribute:** ``/config/gpio-sim/gpio-device/gpio-bankX/lineY/valid``
+
+This group represents a single line at the offset Y. The ``valid`` attribute
+indicates whether the line can be used as GPIO. The ``name`` attribute allows
+to set the line name as represented by the 'gpio-line-names' property.
**Item:** ``/config/gpio-sim/gpio-device/gpio-bankX/lineY/hog``
diff --git a/Documentation/admin-guide/hw-vuln/attack_vector_controls.rst b/Documentation/admin-guide/hw-vuln/attack_vector_controls.rst
new file mode 100644
index 000000000000..b4de16f5ec44
--- /dev/null
+++ b/Documentation/admin-guide/hw-vuln/attack_vector_controls.rst
@@ -0,0 +1,238 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Attack Vector Controls
+======================
+
+Attack vector controls provide a simple method to configure only the mitigations
+for CPU vulnerabilities which are relevant given the intended use of a system.
+Administrators are encouraged to consider which attack vectors are relevant and
+disable all others in order to recoup system performance.
+
+When new relevant CPU vulnerabilities are found, they will be added to these
+attack vector controls so administrators will likely not need to reconfigure
+their command line parameters as mitigations will continue to be correctly
+applied based on the chosen attack vector controls.
+
+Attack Vectors
+--------------
+
+There are 5 sets of attack-vector mitigations currently supported by the kernel:
+
+#. :ref:`user_kernel`
+#. :ref:`user_user`
+#. :ref:`guest_host`
+#. :ref:`guest_guest`
+#. :ref:`smt`
+
+To control the enabled attack vectors, see :ref:`cmdline`.
+
+.. _user_kernel:
+
+User-to-Kernel
+^^^^^^^^^^^^^^
+
+The user-to-kernel attack vector involves a malicious userspace program
+attempting to leak kernel data into userspace by exploiting a CPU vulnerability.
+The kernel data involved might be limited to certain kernel memory, or include
+all memory in the system, depending on the vulnerability exploited.
+
+If no untrusted userspace applications are being run, such as with single-user
+systems, consider disabling user-to-kernel mitigations.
+
+Note that the CPU vulnerabilities mitigated by Linux have generally not been
+shown to be exploitable from browser-based sandboxes. User-to-kernel
+mitigations are therefore mostly relevant if unknown userspace applications may
+be run by untrusted users.
+
+*user-to-kernel mitigations are enabled by default*
+
+.. _user_user:
+
+User-to-User
+^^^^^^^^^^^^
+
+The user-to-user attack vector involves a malicious userspace program attempting
+to influence the behavior of another unsuspecting userspace program in order to
+exfiltrate data. The vulnerability of a userspace program is based on the
+program itself and the interfaces it provides.
+
+If no untrusted userspace applications are being run, consider disabling
+user-to-user mitigations.
+
+Note that because the Linux kernel contains a mapping of all physical memory,
+preventing a malicious userspace program from leaking data from another
+userspace program requires mitigating user-to-kernel attacks as well for
+complete protection.
+
+*user-to-user mitigations are enabled by default*
+
+.. _guest_host:
+
+Guest-to-Host
+^^^^^^^^^^^^^
+
+The guest-to-host attack vector involves a malicious VM attempting to leak
+hypervisor data into the VM. The data involved may be limited, or may
+potentially include all memory in the system, depending on the vulnerability
+exploited.
+
+If no untrusted VMs are being run, consider disabling guest-to-host mitigations.
+
+*guest-to-host mitigations are enabled by default if KVM support is present*
+
+.. _guest_guest:
+
+Guest-to-Guest
+^^^^^^^^^^^^^^
+
+The guest-to-guest attack vector involves a malicious VM attempting to influence
+the behavior of another unsuspecting VM in order to exfiltrate data. The
+vulnerability of a VM is based on the code inside the VM itself and the
+interfaces it provides.
+
+If no untrusted VMs, or only a single VM is being run, consider disabling
+guest-to-guest mitigations.
+
+Similar to the user-to-user attack vector, preventing a malicious VM from
+leaking data from another VM requires mitigating guest-to-host attacks as well
+due to the Linux kernel phys map.
+
+*guest-to-guest mitigations are enabled by default if KVM support is present*
+
+.. _smt:
+
+Cross-Thread
+^^^^^^^^^^^^
+
+The cross-thread attack vector involves a malicious userspace program or
+malicious VM either observing or attempting to influence the behavior of code
+running on the SMT sibling thread in order to exfiltrate data.
+
+Many cross-thread attacks can only be mitigated if SMT is disabled, which will
+result in reduced CPU core count and reduced performance.
+
+If cross-thread mitigations are fully enabled ('auto,nosmt'), all mitigations
+for cross-thread attacks will be enabled. SMT may be disabled depending on
+which vulnerabilities are present in the CPU.
+
+If cross-thread mitigations are partially enabled ('auto'), mitigations for
+cross-thread attacks will be enabled but SMT will not be disabled.
+
+If cross-thread mitigations are disabled, no mitigations for cross-thread
+attacks will be enabled.
+
+Cross-thread mitigation may not be required if core-scheduling or similar
+techniques are used to prevent untrusted workloads from running on SMT siblings.
+
+*cross-thread mitigations default to partially enabled*
+
+.. _cmdline:
+
+Command Line Controls
+---------------------
+
+Attack vectors are controlled through the mitigations= command line option. The
+value provided begins with a global option and then may optionally include one
+or more options to disable various attack vectors.
+
+Format:
+ | ``mitigations=[global]``
+ | ``mitigations=[global],[attack vectors]``
+
+Global options:
+
+============ =============================================================
+Option Description
+============ =============================================================
+'off' All attack vectors disabled.
+'auto' All attack vectors enabled, partial cross-thread mitigations.
+'auto,nosmt' All attack vectors enabled, full cross-thread mitigations.
+============ =============================================================
+
+Attack vector options:
+
+================= =======================================
+Option Description
+================= =======================================
+'no_user_kernel' Disables user-to-kernel mitigations.
+'no_user_user' Disables user-to-user mitigations.
+'no_guest_host' Disables guest-to-host mitigations.
+'no_guest_guest' Disables guest-to-guest mitigations
+'no_cross_thread' Disables all cross-thread mitigations.
+================= =======================================
+
+Multiple attack vector options may be specified in a comma-separated list. If
+the global option is not specified, it defaults to 'auto'. The global option
+'off' is equivalent to disabling all attack vectors.
+
+Examples:
+ | ``mitigations=auto,no_user_kernel``
+
+ Enable all attack vectors except user-to-kernel. Partial cross-thread
+ mitigations.
+
+ | ``mitigations=auto,nosmt,no_guest_host,no_guest_guest``
+
+ Enable all attack vectors and cross-thread mitigations except for
+ guest-to-host and guest-to-guest mitigations.
+
+ | ``mitigations=,no_cross_thread``
+
+ Enable all attack vectors but not cross-thread mitigations.
+
+Interactions with command-line options
+--------------------------------------
+
+Vulnerability-specific controls (e.g. "retbleed=off") take precedence over all
+attack vector controls. Mitigations for individual vulnerabilities may be
+turned on or off via their command-line options regardless of the attack vector
+controls.
+
+Summary of attack-vector mitigations
+------------------------------------
+
+When a vulnerability is mitigated due to an attack-vector control, the default
+mitigation option for that particular vulnerability is used. To use a different
+mitigation, please use the vulnerability-specific command line option.
+
+The table below summarizes which vulnerabilities are mitigated when different
+attack vectors are enabled and assuming the CPU is vulnerable.
+
+=============== ============== ============ ============= ============== ============ ========
+Vulnerability User-to-Kernel User-to-User Guest-to-Host Guest-to-Guest Cross-Thread Notes
+=============== ============== ============ ============= ============== ============ ========
+BHI X X
+ITS X X
+GDS X X X X * (Note 1)
+L1TF X X * (Note 2)
+MDS X X X X * (Note 2)
+MMIO X X X X * (Note 2)
+Meltdown X
+Retbleed X X * (Note 3)
+RFDS X X X X
+Spectre_v1 X
+Spectre_v2 X X
+Spectre_v2_user X X * (Note 1)
+SRBDS X X X X
+SRSO X X
+SSB (Note 4)
+TAA X X X X * (Note 2)
+TSA X X X X
+=============== ============== ============ ============= ============== ============ ========
+
+Notes:
+ 1 -- Can be mitigated without disabling SMT.
+
+ 2 -- Disables SMT if cross-thread mitigations are fully enabled and the CPU
+ is vulnerable
+
+ 3 -- Disables SMT if cross-thread mitigations are fully enabled, the CPU is
+ vulnerable, and STIBP is not supported
+
+ 4 -- Speculative store bypass is always enabled by default (no kernel
+ mitigation applied) unless overridden with spec_store_bypass_disable option
+
+When an attack-vector is disabled, all mitigations for the vulnerabilities
+listed in the above table are disabled, unless mitigation is required for a
+different enabled attack-vector or a mitigation is explicitly selected via a
+vulnerability-specific command line option.
diff --git a/Documentation/admin-guide/hw-vuln/index.rst b/Documentation/admin-guide/hw-vuln/index.rst
index 09890a8f3ee9..89ca636081b7 100644
--- a/Documentation/admin-guide/hw-vuln/index.rst
+++ b/Documentation/admin-guide/hw-vuln/index.rst
@@ -9,6 +9,7 @@ are configurable at compile, boot or run time.
.. toctree::
:maxdepth: 1
+ attack_vector_controls
spectre
l1tf
mds
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 07e22ba5bfe3..0d2ea9a60145 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2538,6 +2538,13 @@
requires the kernel to be built with
CONFIG_ARM64_PSEUDO_NMI.
+ irqchip.riscv_imsic_noipi
+ [RISC-V,EARLY]
+ Force the kernel to not use IMSIC software injected MSIs
+ as IPIs. Intended for system where IMSIC is trap-n-emulated,
+ and thus want to reduce MMIO traps when triggering IPIs
+ to multiple harts.
+
irqfixup [HW]
When an interrupt is not handled search all handlers
for it. Intended to get systems with badly broken
@@ -3790,6 +3797,10 @@
mmio_stale_data=full,nosmt [X86]
retbleed=auto,nosmt [X86]
+ [X86] After one of the above options, additionally
+ supports attack-vector based controls as documented in
+ Documentation/admin-guide/hw-vuln/attack_vector_controls.rst
+
mminit_loglevel=
[KNL,EARLY] When CONFIG_DEBUG_MEMORY_INIT is set, this
parameter allows control of the logging verbosity for
@@ -5000,6 +5011,18 @@
that number, otherwise (e.g., 'pmu_override=on'), MMCR1
remains 0.
+ pm_async= [PM]
+ Format: off
+ This parameter sets the initial value of the
+ /sys/power/pm_async sysfs knob at boot time.
+ If set to "off", disables asynchronous suspend and
+ resume of devices during system-wide power transitions.
+ This can be useful on platforms where device
+ dependencies are not well-defined, or for debugging
+ power management issues. Asynchronous operations are
+ enabled by default.
+
+
pm_debug_messages [SUSPEND,KNL]
Enable suspend/resume debug messages during boot up.
@@ -6387,6 +6410,11 @@
sa1100ir [NET]
See drivers/net/irda/sa1100_ir.c.
+ sched_proxy_exec= [KNL]
+ Enables or disables "proxy execution" style
+ solution to mutex-based priority inversion.
+ Format: <bool>
+
sched_verbose [KNL,EARLY] Enables verbose scheduler debug messages.
schedstats= [KNL,X86] Enable or disable scheduled statistics.
@@ -7214,6 +7242,14 @@
causing a major performance hit, and the space where
machines are deployed is by other means guarded.
+ tpm_crb_ffa.busy_timeout_ms= [ARM64,TPM]
+ Maximum time in milliseconds to retry sending a message
+ to the TPM service before giving up. This parameter controls
+ how long the system will continue retrying when the TPM
+ service is busy.
+ Format: <unsigned int>
+ Default: 2000 (2 seconds)
+
tpm_suspend_pcr=[HW,TPM]
Format: integer pcr id
Specify that at suspend time, the tpm driver
diff --git a/Documentation/admin-guide/pm/amd-pstate.rst b/Documentation/admin-guide/pm/amd-pstate.rst
index 412423c54f25..e1771f2225d5 100644
--- a/Documentation/admin-guide/pm/amd-pstate.rst
+++ b/Documentation/admin-guide/pm/amd-pstate.rst
@@ -72,7 +72,7 @@ to manage each performance update behavior. ::
Lowest non- | | | |
linear perf ------>+-----------------------+ +-----------------------+
| | | |
- | | Lowest perf ---->| |
+ | | Min perf ---->| |
| | | |
Lowest perf ------>+-----------------------+ +-----------------------+
| | | |
diff --git a/Documentation/admin-guide/pm/cpufreq.rst b/Documentation/admin-guide/pm/cpufreq.rst
index 2d74af7f0efe..cacb9f0307dd 100644
--- a/Documentation/admin-guide/pm/cpufreq.rst
+++ b/Documentation/admin-guide/pm/cpufreq.rst
@@ -398,7 +398,9 @@ policy limits change after that.
This governor does not do anything by itself. Instead, it allows user space
to set the CPU frequency for the policy it is attached to by writing to the
-``scaling_setspeed`` attribute of that policy.
+``scaling_setspeed`` attribute of that policy. Though the intention may be to
+set an exact frequency for the policy, the actual frequency may vary depending
+on hardware coordination, thermal and power limits, and other factors.
``schedutil``
-------------
diff --git a/Documentation/admin-guide/syscall-user-dispatch.rst b/Documentation/admin-guide/syscall-user-dispatch.rst
index e3cfffef5a63..c1768d9e80fa 100644
--- a/Documentation/admin-guide/syscall-user-dispatch.rst
+++ b/Documentation/admin-guide/syscall-user-dispatch.rst
@@ -53,20 +53,25 @@ following prctl:
prctl(PR_SET_SYSCALL_USER_DISPATCH, <op>, <offset>, <length>, [selector])
-<op> is either PR_SYS_DISPATCH_ON or PR_SYS_DISPATCH_OFF, to enable and
-disable the mechanism globally for that thread. When
-PR_SYS_DISPATCH_OFF is used, the other fields must be zero.
-
-[<offset>, <offset>+<length>) delimit a memory region interval
-from which syscalls are always executed directly, regardless of the
-userspace selector. This provides a fast path for the C library, which
-includes the most common syscall dispatchers in the native code
-applications, and also provides a way for the signal handler to return
+<op> is either PR_SYS_DISPATCH_EXCLUSIVE_ON/PR_SYS_DISPATCH_INCLUSIVE_ON
+or PR_SYS_DISPATCH_OFF, to enable and disable the mechanism globally for
+that thread. When PR_SYS_DISPATCH_OFF is used, the other fields must be zero.
+
+For PR_SYS_DISPATCH_EXCLUSIVE_ON [<offset>, <offset>+<length>) delimit
+a memory region interval from which syscalls are always executed directly,
+regardless of the userspace selector. This provides a fast path for the
+C library, which includes the most common syscall dispatchers in the native
+code applications, and also provides a way for the signal handler to return
without triggering a nested SIGSYS on (rt\_)sigreturn. Users of this
interface should make sure that at least the signal trampoline code is
included in this region. In addition, for syscalls that implement the
trampoline code on the vDSO, that trampoline is never intercepted.
+For PR_SYS_DISPATCH_INCLUSIVE_ON [<offset>, <offset>+<length>) delimit
+a memory region interval from which syscalls are dispatched based on
+the userspace selector. Syscalls from outside of the range are always
+executed directly.
+
[selector] is a pointer to a char-sized region in the process memory
region, that provides a quick way to enable disable syscall redirection
thread-wide, without the need to invoke the kernel directly. selector
diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
index dd49a89a62d3..c04e6b8eb2b1 100644
--- a/Documentation/admin-guide/sysctl/kernel.rst
+++ b/Documentation/admin-guide/sysctl/kernel.rst
@@ -1014,30 +1014,26 @@ perf_user_access (arm64 and riscv only)
Controls user space access for reading perf event counters.
-arm64
-=====
-
-The default value is 0 (access disabled).
+* for arm64
+ The default value is 0 (access disabled).
-When set to 1, user space can read performance monitor counter registers
-directly.
+ When set to 1, user space can read performance monitor counter registers
+ directly.
-See Documentation/arch/arm64/perf.rst for more information.
-
-riscv
-=====
+ See Documentation/arch/arm64/perf.rst for more information.
-When set to 0, user space access is disabled.
+* for riscv
+ When set to 0, user space access is disabled.
-The default value is 1, user space can read performance monitor counter
-registers through perf, any direct access without perf intervention will trigger
-an illegal instruction.
+ The default value is 1, user space can read performance monitor counter
+ registers through perf, any direct access without perf intervention will trigger
+ an illegal instruction.
-When set to 2, which enables legacy mode (user space has direct access to cycle
-and insret CSRs only). Note that this legacy value is deprecated and will be
-removed once all user space applications are fixed.
+ When set to 2, which enables legacy mode (user space has direct access to cycle
+ and insret CSRs only). Note that this legacy value is deprecated and will be
+ removed once all user space applications are fixed.
-Note that the time CSR is always directly accessible to all modes.
+ Note that the time CSR is always directly accessible to all modes.
pid_max
=======
@@ -1465,7 +1461,7 @@ stack_erasing
=============
This parameter can be used to control kernel stack erasing at the end
-of syscalls for kernels built with ``CONFIG_GCC_PLUGIN_STACKLEAK``.
+of syscalls for kernels built with ``CONFIG_KSTACK_ERASE``.
That erasing reduces the information which kernel stack leak bugs
can reveal and blocks some uninitialized stack variable attacks.
@@ -1473,7 +1469,7 @@ The tradeoff is the performance impact: on a single CPU system kernel
compilation sees a 1% slowdown, other systems and workloads may vary.
= ====================================================================
-0 Kernel stack erasing is disabled, STACKLEAK_METRICS are not updated.
+0 Kernel stack erasing is disabled, KSTACK_ERASE_METRICS are not updated.
1 Kernel stack erasing is enabled (default), it is performed before
returning to the userspace at the end of syscalls.
= ====================================================================
diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst
index 9bef46151d53..4d71211fdad8 100644
--- a/Documentation/admin-guide/sysctl/vm.rst
+++ b/Documentation/admin-guide/sysctl/vm.rst
@@ -465,8 +465,8 @@ The minimum value is 1 (1/1 -> 100%). The value less than 1 completely
disables protection of the pages.
-max_map_count:
-==============
+max_map_count
+=============
This file contains the maximum number of memory map areas a process
may have. Memory map areas are used as a side-effect of calling
@@ -495,8 +495,8 @@ memory allocations.
The default value depends on CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT.
-memory_failure_early_kill:
-==========================
+memory_failure_early_kill
+=========================
Control how to kill processes when uncorrected memory error (typically
a 2bit error in a memory module) is detected in the background by hardware
diff --git a/Documentation/admin-guide/thunderbolt.rst b/Documentation/admin-guide/thunderbolt.rst
index 240fee618e06..102c693c8f81 100644
--- a/Documentation/admin-guide/thunderbolt.rst
+++ b/Documentation/admin-guide/thunderbolt.rst
@@ -358,12 +358,7 @@ Forcing power
Many OEMs include a method that can be used to force the power of a
Thunderbolt controller to an "On" state even if nothing is connected.
If supported by your machine this will be exposed by the WMI bus with
-a sysfs attribute called "force_power".
-
-For example the intel-wmi-thunderbolt driver exposes this attribute in:
- /sys/bus/wmi/devices/86CCFD48-205E-4A77-9C48-2021CBEDE341/force_power
-
- To force the power to on, write 1 to this attribute file.
- To disable force power, write 0 to this attribute file.
+a sysfs attribute called "force_power", see
+Documentation/ABI/testing/sysfs-platform-intel-wmi-thunderbolt for details.
Note: it's currently not possible to query the force power state of a platform.