Diffstat (limited to 'Documentation/admin-guide/xfs.rst')
-rw-r--r-- Documentation/admin-guide/xfs.rst | 178
1 files changed, 131 insertions, 47 deletions
diff --git a/Documentation/admin-guide/xfs.rst b/Documentation/admin-guide/xfs.rst
index ad911be5b5e9..c85cd327af28 100644
--- a/Documentation/admin-guide/xfs.rst
+++ b/Documentation/admin-guide/xfs.rst
@@ -34,22 +34,6 @@ When mounting an XFS filesystem, the following options are accepted.
to the file. Specifying a fixed ``allocsize`` value turns off
the dynamic behaviour.
- attr2 or noattr2
- The options enable/disable an "opportunistic" improvement to
- be made in the way inline extended attributes are stored
- on-disk. When the new form is used for the first time when
- ``attr2`` is selected (either when setting or removing extended
- attributes) the on-disk superblock feature bit field will be
- updated to reflect this format being in use.
-
- The default behaviour is determined by the on-disk feature
- bit indicating that ``attr2`` behaviour is active. If either
- mount option is set, then that becomes the new default used
- by the filesystem.
-
- CRC enabled filesystems always use the ``attr2`` format, and so
- will reject the ``noattr2`` mount option if it is set.
-
discard or nodiscard (default)
Enable/disable the issuing of commands to let the block
device reclaim space freed by the filesystem. This is
@@ -75,12 +59,6 @@ When mounting an XFS filesystem, the following options are accepted.
across the entire filesystem rather than just on directories
configured to use it.
- ikeep or noikeep (default)
- When ``ikeep`` is specified, XFS does not delete empty inode
- clusters and keeps them around on disk. When ``noikeep`` is
- specified, empty inode clusters are returned to the free
- space pool.
-
inode32 or inode64 (default)
When ``inode32`` is specified, it indicates that XFS limits
inode creation to locations which will not result in inode
@@ -124,6 +102,14 @@ When mounting an XFS filesystem, the following options are accepted.
controls the size of each buffer and so is also relevant to
this case.
+ lifetime (default) or nolifetime
+ Enable data placement based on write life time hints provided
+ by the user. This turns on co-allocation of data of similar
+ life times when statistically favorable to reduce garbage
+ collection cost.
+
+ These options are only available for zoned rt file systems.
+
logbsize=value
Set the size of each in-memory log buffer. The size may be
specified in bytes, or in kilobytes with a "k" suffix.
@@ -133,7 +119,7 @@ When mounting an XFS filesystem, the following options are accepted.
logbsize must be an integer multiple of the log
stripe unit configured at **mkfs(8)** time.
- The default value for for version 1 logs is 32768, while the
+ The default value for version 1 logs is 32768, while the
default value for version 2 logs is MAX(32768, log_sunit).
logdev=device and rtdev=device
@@ -143,6 +129,25 @@ When mounting an XFS filesystem, the following options are accepted.
optional, and the log section can be separate from the data
section or contained within it.
+ max_atomic_write=value
+ Set the maximum size of an atomic write. The size may be
+ specified in bytes, in kilobytes with a "k" suffix, in megabytes
+ with a "m" suffix, or in gigabytes with a "g" suffix. The size
+ cannot be larger than the maximum write size, larger than the
+ size of any allocation group, or larger than the size of a
+ remapping operation that the log can complete atomically.
+
+ The default value is to set the maximum I/O completion size
+ to allow each CPU to handle one at a time.
+
+ max_open_zones=value
+ Specify the max number of zones to keep open for writing on a
+ zoned rt device. Many open zones aid file data separation
+ but may impact performance on HDDs.
+
+ If ``max_open_zones`` is not specified, the value is determined
+ by the capabilities and the size of the zoned rt device.
+
noalign
Data allocations will not be aligned at stripe unit
boundaries. This is only relevant to filesystems created
@@ -192,7 +197,7 @@ When mounting an XFS filesystem, the following options are accepted.
are any integer multiple of a valid ``sunit`` value.
Typically the only time these mount options are necessary if
- after an underlying RAID device has had it's geometry
+ after an underlying RAID device has had its geometry
modified, such as adding a new disk to a RAID5 lun and
reshaping it.
@@ -210,14 +215,37 @@ When mounting an XFS filesystem, the following options are accepted.
inconsistent namespace presentation during or after a
failover event.
+Deprecation of V4 Format
+========================
+
+The V4 filesystem format lacks certain features that are supported by
+the V5 format, such as metadata checksumming, strengthened metadata
+verification, and the ability to store timestamps past the year 2038.
+Because of this, the V4 format is deprecated. All users should upgrade
+by backing up their files, reformatting, and restoring from the backup.
+
+Administrators and users can detect a V4 filesystem by running xfs_info
+against a filesystem mountpoint and checking for a string containing
+"crc=". If no such string is found, please upgrade xfsprogs to the
+latest version and try again.
+
+The deprecation will take place in two parts. Support for mounting V4
+filesystems can now be disabled at kernel build time via a Kconfig option.
+These options were changed to default to no in September 2025. In
+September 2030, support will be removed from the codebase entirely.
+
+Note: Distributors may choose to withdraw V4 format support earlier than
+the dates listed above.
Deprecated Mount Options
========================
-=========================== ================
+============================ ================
Name Removal Schedule
-=========================== ================
-=========================== ================
+============================ ================
+Mounting with V4 filesystem September 2030
+Mounting ascii-ci filesystem September 2030
+============================ ================
Removed Mount Options
@@ -232,6 +260,8 @@ Removed Mount Options
osyncisdsync/osyncisosync v4.0
barrier v4.19
nobarrier v4.19
+ ikeep/noikeep v6.18
+ attr2/noattr2 v6.18
=========================== =======
sysctls
@@ -268,7 +298,7 @@ The following sysctls are available for the XFS filesystem:
XFS_ERRLEVEL_LOW: 1
XFS_ERRLEVEL_HIGH: 5
- fs.xfs.panic_mask (Min: 0 Default: 0 Max: 256)
+ fs.xfs.panic_mask (Min: 0 Default: 0 Max: 511)
Causes certain error conditions to call BUG(). Value is a bitmask;
OR together the tags which represent errors which should cause panics:
@@ -285,17 +315,6 @@ The following sysctls are available for the XFS filesystem:
This option is intended for debugging only.
- fs.xfs.irix_symlink_mode (Min: 0 Default: 0 Max: 1)
- Controls whether symlinks are created with mode 0777 (default)
- or whether their mode is affected by the umask (irix mode).
-
- fs.xfs.irix_sgid_inherit (Min: 0 Default: 0 Max: 1)
- Controls files created in SGID directories.
- If the group ID of the new file does not match the effective group
- ID or one of the supplementary group IDs of the parent dir, the
- ISGID bit is cleared if the irix_sgid_inherit compatibility sysctl
- is set.
-
fs.xfs.inherit_sync (Min: 0 Default: 1 Max: 1)
Setting this to "1" will cause the "sync" flag set
by the **xfs_io(8)** chattr command on a directory to be
@@ -331,18 +350,20 @@ The following sysctls are available for the XFS filesystem:
Deprecated Sysctls
==================
-None at present.
-
+None currently.
Removed Sysctls
===============
-============================= =======
- Name Removed
-============================= =======
- fs.xfs.xfsbufd_centisec v4.0
- fs.xfs.age_buffer_centisecs v4.0
-============================= =======
+========================================== =======
+ Name Removed
+========================================== =======
+ fs.xfs.xfsbufd_centisec v4.0
+ fs.xfs.age_buffer_centisecs v4.0
+ fs.xfs.irix_symlink_mode v6.18
+ fs.xfs.irix_sgid_inherit v6.18
+ fs.xfs.speculative_cow_prealloc_lifetime v6.18
+========================================== =======
Error handling
==============
@@ -465,3 +486,66 @@ the class and error context. For example, the default values for
"metadata/ENODEV" are "0" rather than "-1" so that this error handler defaults
to "fail immediately" behaviour. This is done because ENODEV is a fatal,
unrecoverable error no matter how many times the metadata IO is retried.
+
+Workqueue Concurrency
+=====================
+
+XFS uses kernel workqueues to parallelize metadata update processes. This
+enables it to take advantage of storage hardware that can service many IO
+operations simultaneously. This interface exposes internal implementation
+details of XFS, and as such is explicitly not part of any userspace API/ABI
+guarantee the kernel may give userspace. These are undocumented features of
+the generic workqueue implementation XFS uses for concurrency, and they are
+provided here purely for diagnostic and tuning purposes and may change at any
+time in the future.
+
+The control knobs for a filesystem's workqueues are organized by task at hand
+and the short name of the data device. They all can be found in:
+
+ /sys/bus/workqueue/devices/${task}!${device}
+
+================ ===========
+ Task Description
+================ ===========
+ xfs_iwalk-$pid Inode scans of the entire filesystem. Currently limited to
+ mount time quotacheck.
+ xfs-gc Background garbage collection of disk space that has been
+ speculatively allocated beyond EOF or for staging copy on
+ write operations.
+================ ===========
+
+For example, the knobs for the quotacheck workqueue for /dev/nvme0n1 would be
+found in /sys/bus/workqueue/devices/xfs_iwalk-1111!nvme0n1/.
+
+The interesting knobs for XFS workqueues are as follows:
+
+============ ===========
+ Knob Description
+============ ===========
+ max_active Maximum number of background threads that can be started to
+ run the work.
+ cpumask CPUs upon which the threads are allowed to run.
+ nice Relative priority of scheduling the threads. These are the
+ same nice levels that can be applied to userspace processes.
+============ ===========
+
+Zoned Filesystems
+=================
+
+For zoned file systems, the following attributes are exposed in:
+
+ /sys/fs/xfs/<dev>/zoned/
+
+ max_open_zones (Min: 1 Default: Varies Max: UINT_MAX)
+ This read-only attribute exposes the maximum number of open zones
+ available for data placement. The value is determined at mount time and
+ is limited by the capabilities of the backing zoned device, file system
+ size and the max_open_zones mount option.
+
+ zonegc_low_space (Min: 0 Default: 0 Max: 100)
+ Define the percentage of unused space that GC should keep
+ available for writing. A high value will reclaim more of the space
+ occupied by unused blocks, creating a larger buffer against write
+ bursts at the cost of increased write amplification. Regardless
+ of this value, garbage collection will always aim to free a minimum
+ number of blocks to keep max_open_zones open for data placement purposes.
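
The "Workqueue Concurrency" section added by this patch describes a per-task, per-device sysfs directory of the form ``/sys/bus/workqueue/devices/${task}!${device}`` holding the ``max_active``, ``cpumask`` and ``nice`` knobs. The following is a minimal Python sketch, not part of the patch, of how a diagnostic script might build that path and read the knobs. The helper names ``workqueue_dir`` and ``read_knobs`` are hypothetical, and the sketch assumes each knob is a plain text file. Since the documentation states that this interface is not a stable ABI, the sketch treats every knob as optional.

```python
from pathlib import Path, PurePosixPath

# Location given in the documentation text above.
WORKQUEUE_SYSFS_ROOT = PurePosixPath("/sys/bus/workqueue/devices")

# Knob names listed in the documentation's table.
KNOBS = ("max_active", "cpumask", "nice")


def workqueue_dir(task: str, device: str) -> PurePosixPath:
    """Build the sysfs directory for one XFS workqueue.

    `task` is a task name such as "xfs_iwalk-1111" or "xfs-gc";
    `device` is the short name of the data device, e.g. "nvme0n1".
    (Hypothetical helper, for illustration only.)
    """
    return WORKQUEUE_SYSFS_ROOT / f"{task}!{device}"


def read_knobs(directory) -> dict:
    """Return the current value of each known knob, or None if absent.

    Assumes each knob is a small text file; a missing file is reported
    as None because the interface may change between kernel versions.
    """
    values = {}
    for name in KNOBS:
        knob = Path(directory) / name
        values[name] = knob.read_text().strip() if knob.is_file() else None
    return values
```

For the quotacheck example in the text, ``workqueue_dir("xfs_iwalk-1111", "nvme0n1")`` yields the directory named there, and passing that result to ``read_knobs`` on a running system would report whichever knobs the kernel currently exposes.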