diff options
Diffstat (limited to 'Documentation/filesystems/proc.rst')
| -rw-r--r-- | Documentation/filesystems/proc.rst | 660 |
1 files changed, 426 insertions, 234 deletions
diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst index 042c418f4090..8256e857e2d7 100644 --- a/Documentation/filesystems/proc.rst +++ b/Documentation/filesystems/proc.rst @@ -47,6 +47,8 @@ fixes/update part 1.1 Stefani Seibold <stefani@seibold.net> June 9 2009 3.10 /proc/<pid>/timerslack_ns - Task timerslack value 3.11 /proc/<pid>/patch_state - Livepatch patch operation state 3.12 /proc/<pid>/arch_status - Task architecture specific information + 3.13 /proc/<pid>/fd - List of symlinks to open files + 3.14 /proc/<pid/ksm_stat - Information about the process's ksm status. 4 Configuring procfs 4.1 Mount options @@ -59,19 +61,6 @@ Preface 0.1 Introduction/Credits ------------------------ -This documentation is part of a soon (or so we hope) to be released book on -the SuSE Linux distribution. As there is no complete documentation for the -/proc file system and we've used many freely available sources to write these -chapters, it seems only fair to give the work back to the Linux community. -This work is based on the 2.2.* kernel version and the upcoming 2.4.*. I'm -afraid it's still far from complete, but we hope it will be useful. As far as -we know, it is the first 'all-in-one' document about the /proc file system. It -is focused on the Intel x86 hardware, so if you are looking for PPC, ARM, -SPARC, AXP, etc., features, you probably won't find what you are looking for. -It also only covers IPv4 networking, not IPv6 nor other protocols - sorry. But -additions and patches are welcome and will be added to this document if you -mail them to Bodo. - We'd like to thank Alan Cox, Rik van Riel, and Alexey Kuznetsov and a lot of other people for help compiling this documentation. We'd also like to extend a special thank you to Andi Kleen for documentation, which we relied on heavily @@ -79,16 +68,8 @@ to create this document, as well as the additional information he provided. Thanks to everybody else who contributed source or docs to the Linux kernel and helped create a great piece of software... :) -If you have any comments, corrections or additions, please don't hesitate to -contact Bodo Bauer at bb@ricochet.net. We'll be happy to add them to this -document. - The latest version of this document is available online at -http://tldp.org/LDP/Linux-Filesystem-Hierarchy/html/proc.html - -If the above direction does not works for you, you could try the kernel -mailing list at linux-kernel@vger.kernel.org and/or try to reach me at -comandante@zaralinux.com. +https://www.kernel.org/doc/html/latest/filesystems/proc.html 0.2 Legal Stuff --------------- @@ -126,6 +107,16 @@ process running on the system, which is named after the process ID (PID). The link 'self' points to the process reading the file system. Each process subdirectory has the entries listed in Table 1-1. +A process can read its own information from /proc/PID/* with no extra +permissions. When reading /proc/PID/* information for other processes, reading +process is required to have either CAP_SYS_PTRACE capability with +PTRACE_MODE_READ access permissions, or, alternatively, CAP_PERFMON +capability. This applies to all read-only information like `maps`, `environ`, +`pagemap`, etc. The only exception is `mem` file due to its read-write nature, +which requires CAP_SYS_PTRACE capabilities with more elevated +PTRACE_MODE_ATTACH permissions; CAP_PERFMON capability does not grant access +to /proc/PID/mem for other processes. + Note that an open file descriptor to /proc/<pid> or to any of its contained files or subdirectories does not prevent <pid> being reused for some other process in the event that <pid> exits. Operations on @@ -178,6 +169,7 @@ read the file /proc/PID/status:: Gid: 100 100 100 100 FDSize: 256 Groups: 100 14 16 + Kthread: 0 VmPeak: 5004 kB VmSize: 5004 kB VmLck: 0 kB @@ -231,7 +223,7 @@ asynchronous manner and the value may not be very precise. To see a precise snapshot of a moment, you can see /proc/<pid>/smaps file and scan page table. It's slow but very precise. -.. table:: Table 1-2: Contents of the status files (as of 4.19) +.. table:: Table 1-2: Contents of the status fields (as of 4.19) ========================== =================================================== Field Content @@ -245,7 +237,8 @@ It's slow but very precise. Ngid NUMA group ID (0 if none) Pid process id PPid process id of the parent process - TracerPid PID of process tracing this process (0 if not) + TracerPid PID of process tracing this process (0 if not, or + the tracer is outside of the current pid namespace) Uid Real, effective, saved set, and file system UIDs Gid Real, effective, saved set, and file system GIDs FDSize number of file descriptor slots currently allocated @@ -254,6 +247,7 @@ It's slow but very precise. NSpid descendant namespace process ID hierarchy NSpgid descendant namespace process group ID hierarchy NSsid descendant namespace session ID hierarchy + Kthread kernel thread flag, 1 is yes, 0 is no VmPeak peak virtual memory size VmSize total program size VmLck locked memory size @@ -276,8 +270,9 @@ It's slow but very precise. HugetlbPages size of hugetlb memory portions CoreDumping process's memory is currently being dumped (killing the process may lead to a corrupted core) - THP_enabled process is allowed to use THP (returns 0 when - PR_SET_THP_DISABLE is set on the process + THP_enabled process is allowed to use THP (returns 0 when + PR_SET_THP_DISABLE is set on the process to disable + THP completely, not just partially) Threads number of threads SigQ number of signals queued/max. number for queue SigPnd bitmap of pending signals for the thread @@ -303,7 +298,7 @@ It's slow but very precise. ========================== =================================================== -.. table:: Table 1-3: Contents of the statm files (as of 2.6.8-rc3) +.. table:: Table 1-3: Contents of the statm fields (as of 2.6.8-rc3) ======== =============================== ============================== Field Content @@ -321,7 +316,7 @@ It's slow but very precise. ======== =============================== ============================== -.. table:: Table 1-4: Contents of the stat files (as of 2.6.30-rc7) +.. table:: Table 1-4: Contents of the stat fields (as of 2.6.30-rc7) ============= =============================================================== Field Content @@ -426,15 +421,28 @@ with the memory region, as the case would be with BSS (uninitialized data). The "pathname" shows the name associated file for this mapping. If the mapping is not associated with a file: - ======= ==================================== + =================== =========================================== [heap] the heap of the program [stack] the stack of the main process [vdso] the "virtual dynamic shared object", the kernel system call handler - ======= ==================================== + [anon:<name>] a private anonymous mapping that has been + named by userspace + [anon_shmem:<name>] an anonymous shared memory mapping that has + been named by userspace + =================== =========================================== or if empty, the mapping is anonymous. +Starting with 6.11 kernel, /proc/PID/maps provides an alternative +ioctl()-based API that gives ability to flexibly and efficiently query and +filter individual VMAs. This interface is binary and is meant for more +efficient and easy programmatic use. `struct procmap_query`, defined in +linux/fs.h UAPI header, serves as an input/output argument to the +`PROCMAP_QUERY` ioctl() command. See comments in linus/fs.h UAPI header for +details on query semantics, supported flags, data returned, and general API +usage information. + The /proc/PID/smaps is an extension based on maps, showing the memory consumption for each of the process's mappings. For each mapping (aka Virtual Memory Area, or VMA) there is a series of lines such as the following:: @@ -446,12 +454,14 @@ Memory Area, or VMA) there is a series of lines such as the following:: MMUPageSize: 4 kB Rss: 892 kB Pss: 374 kB + Pss_Dirty: 0 kB Shared_Clean: 892 kB Shared_Dirty: 0 kB Private_Clean: 0 kB Private_Dirty: 0 kB Referenced: 892 kB Anonymous: 0 kB + KSM: 0 kB LazyFree: 0 kB AnonHugePages: 0 kB ShmemPmdMapped: 0 kB @@ -465,23 +475,42 @@ Memory Area, or VMA) there is a series of lines such as the following:: THPeligible: 0 VmFlags: rd ex mr mw me dw -The first of these lines shows the same information as is displayed for the -mapping in /proc/PID/maps. Following lines show the size of the mapping -(size); the size of each page allocated when backing a VMA (KernelPageSize), -which is usually the same as the size in the page table entries; the page size -used by the MMU when backing a VMA (in most cases, the same as KernelPageSize); -the amount of the mapping that is currently resident in RAM (RSS); the -process' proportional share of this mapping (PSS); and the number of clean and -dirty shared and private pages in the mapping. +The first of these lines shows the same information as is displayed for +the mapping in /proc/PID/maps. Following lines show the size of the +mapping (size); the size of each page allocated when backing a VMA +(KernelPageSize), which is usually the same as the size in the page table +entries; the page size used by the MMU when backing a VMA (in most cases, +the same as KernelPageSize); the amount of the mapping that is currently +resident in RAM (RSS); the process's proportional share of this mapping +(PSS); and the number of clean and dirty shared and private pages in the +mapping. The "proportional set size" (PSS) of a process is the count of pages it has in memory, where each page is divided by the number of processes sharing it. So if a process has 1000 pages all to itself, and 1000 shared with one other -process, its PSS will be 1500. - -Note that even a page which is part of a MAP_SHARED mapping, but has only -a single pte mapped, i.e. is currently used by only one process, is accounted -as private and not as shared. +process, its PSS will be 1500. "Pss_Dirty" is the portion of PSS which +consists of dirty pages. ("Pss_Clean" is not included, but it can be +calculated by subtracting "Pss_Dirty" from "Pss".) + +Traditionally, a page is accounted as "private" if it is mapped exactly once, +and a page is accounted as "shared" when mapped multiple times, even when +mapped in the same process multiple times. Note that this accounting is +independent of MAP_SHARED. + +In some kernel configurations, the semantics of pages part of a larger +allocation (e.g., THP) can differ: a page is accounted as "private" if all +pages part of the corresponding large allocation are *certainly* mapped in the +same process, even if the page is mapped multiple times in that process. A +page is accounted as "shared" if any page page of the larger allocation +is *maybe* mapped in a different process. In some cases, a large allocation +might be treated as "maybe mapped by multiple processes" even though this +is no longer the case. + +Some kernel configurations do not track the precise number of times a page part +of a larger allocation is mapped. In this case, when calculating the PSS, the +average number of mappings per page in this larger allocation might be used +as an approximation for the number of mappings of a page. The PSS calculation +will be imprecise in this case. "Referenced" indicates the amount of memory currently marked as referenced or accessed. @@ -490,18 +519,21 @@ accessed. a mapping associated with a file may contain anonymous pages: when MAP_PRIVATE and a page is modified, the file page is replaced by a private anonymous copy. +"KSM" reports how many of the pages are KSM pages. Note that KSM-placed zeropages +are not included, only actual KSM pages. + "LazyFree" shows the amount of memory which is marked by madvise(MADV_FREE). The memory isn't freed immediately with madvise(). It's freed in memory pressure if the memory is clean. Please note that the printed value might be lower than the real value due to optimizations used in the current implementation. If this is not desirable please file a bug report. -"AnonHugePages" shows the ammount of memory backed by transparent hugepage. +"AnonHugePages" shows the amount of memory backed by transparent hugepage. -"ShmemPmdMapped" shows the ammount of shared (shmem/tmpfs) memory backed by +"ShmemPmdMapped" shows the amount of shared (shmem/tmpfs) memory backed by huge pages. -"Shared_Hugetlb" and "Private_Hugetlb" show the ammounts of memory backed by +"Shared_Hugetlb" and "Private_Hugetlb" show the amounts of memory backed by hugetlbfs page which is *not* counted in "RSS" or "PSS" field for historical reasons. And these are not included in {Shared,Private}_{Clean,Dirty} field. @@ -512,14 +544,16 @@ replaced by copy-on-write) part of the underlying shmem object out on swap. "SwapPss" shows proportional swap share of this mapping. Unlike "Swap", this does not take into account swapped out page of underlying shmem objects. "Locked" indicates whether the mapping is locked in memory or not. -"THPeligible" indicates whether the mapping is eligible for allocating THP -pages - 1 if true, 0 otherwise. It just shows the current status. + +"THPeligible" indicates whether the mapping is eligible for allocating +naturally aligned THP pages of any currently enabled size. 1 if true, 0 +otherwise. "VmFlags" field deserves a separate description. This member represents the kernel flags associated with the particular virtual memory area in two letter encoded manner. The codes are the following: - == ======================================= + == ============================================================= rd readable wr writeable ex executable @@ -530,7 +564,6 @@ encoded manner. The codes are the following: ms may share gd stack segment growns down pf pure PFN range - dw disabled write to the mapped file lo pages are locked in memory io memory mapped I/O area sr sequential read advise provided @@ -548,12 +581,18 @@ encoded manner. The codes are the following: mm mixed map area hg huge page advise flag nh no huge page advise flag - mg mergable advise flag + mg mergeable advise flag bt arm64 BTI guarded page mt arm64 MTE allocation tags are enabled um userfaultfd missing tracking uw userfaultfd wr-protect tracking - == ======================================= + ui userfaultfd minor fault + ss shadow/guarded control stack page + sl sealed + lf lock on fault pages + dp always lazily freeable mapping + gu maybe contains guard regions (if not set, definitely doesn't) + == ============================================================= Note that there is no guarantee that every flag and associated mnemonic will be present in all further kernel releases. Things get changed, the flags may @@ -656,6 +695,11 @@ Where: node locality page counters (N0 == node0, N1 == node1, ...) and the kernel page size, in KB, that is backing the mapping up. +Note that some kernel configurations do not track the precise number of times +a page part of a larger allocation (e.g., THP) is mapped. In these +configurations, "mapmax" might corresponds to the average number of mappings +per page in such a larger allocation instead. + 1.2 Kernel data --------------- @@ -670,10 +714,17 @@ files are there, and which are missing. ============ =============================================================== File Content ============ =============================================================== + allocinfo Memory allocations profiling information apm Advanced power management info + bootconfig Kernel command line obtained from boot config, + and, if there were kernel parameters from the + boot loader, a "# Parameters from bootloader:" + line followed by a line containing those + parameters prefixed by "# ". (5.5) buddyinfo Kernel memory allocator information (see text) (2.5) bus Directory containing bus specific information - cmdline Kernel command line + cmdline Kernel command line, both from bootloader and embedded + in the kernel image cpuinfo Info about the CPU devices Available devices (block and character) dma Used DMS channels @@ -929,6 +980,48 @@ also be allocatable although a lot of filesystem metadata may have to be reclaimed to achieve this. +allocinfo +~~~~~~~~~ + +Provides information about memory allocations at all locations in the code +base. Each allocation in the code is identified by its source file, line +number, module (if originates from a loadable module) and the function calling +the allocation. The number of bytes allocated and number of calls at each +location are reported. The first line indicates the version of the file, the +second line is the header listing fields in the file. +If file version is 2.0 or higher then each line may contain additional +<key>:<value> pairs representing extra information about the call site. +For example if the counters are not accurate, the line will be appended with +"accurate:no" pair. + +Supported markers in v2: +accurate:no + + Absolute values of the counters in this line are not accurate + because of the failure to allocate memory to track some of the + allocations made at this location. Deltas in these counters are + accurate, therefore counters can be used to track allocation size + and count changes. + +Example output. + +:: + + > tail -n +3 /proc/allocinfo | sort -rn + 127664128 31168 mm/page_ext.c:270 func:alloc_page_ext + 56373248 4737 mm/slub.c:2259 func:alloc_slab_page + 14880768 3633 mm/readahead.c:247 func:page_cache_ra_unbounded + 14417920 3520 mm/mm_init.c:2530 func:alloc_large_system_hash + 13377536 234 block/blk-mq.c:3421 func:blk_mq_alloc_rqs + 11718656 2861 mm/filemap.c:1919 func:__filemap_get_folio + 9192960 2800 kernel/fork.c:307 func:alloc_thread_stack_node + 4206592 4 net/netfilter/nf_conntrack_core.c:2567 func:nf_ct_alloc_hashtable + 4136960 1010 drivers/staging/ctagmod/ctagmod.c:20 [ctagmod] func:ctagmod_start + 3940352 962 mm/memory.c:4214 func:alloc_anon_folio + 2894464 22613 fs/kernfs/dir.c:615 func:__kernfs_new_node + ... + + meminfo ~~~~~~~ @@ -940,56 +1033,77 @@ can be substantial. In many cases there are other means to find out additional memory using subsystem specific interfaces, for instance /proc/net/sockstat for TCP memory allocations. -The following is from a 16GB PIII, which has highmem enabled. -You may not have all of these fields. +Example output. You may not have all of these fields. :: > cat /proc/meminfo - MemTotal: 16344972 kB - MemFree: 13634064 kB - MemAvailable: 14836172 kB - Buffers: 3656 kB - Cached: 1195708 kB - SwapCached: 0 kB - Active: 891636 kB - Inactive: 1077224 kB - HighTotal: 15597528 kB - HighFree: 13629632 kB - LowTotal: 747444 kB - LowFree: 4432 kB - SwapTotal: 0 kB - SwapFree: 0 kB - Dirty: 968 kB - Writeback: 0 kB - AnonPages: 861800 kB - Mapped: 280372 kB - Shmem: 644 kB - KReclaimable: 168048 kB - Slab: 284364 kB - SReclaimable: 159856 kB - SUnreclaim: 124508 kB - PageTables: 24448 kB - NFS_Unstable: 0 kB - Bounce: 0 kB - WritebackTmp: 0 kB - CommitLimit: 7669796 kB - Committed_AS: 100056 kB - VmallocTotal: 112216 kB - VmallocUsed: 428 kB - VmallocChunk: 111088 kB - Percpu: 62080 kB - HardwareCorrupted: 0 kB - AnonHugePages: 49152 kB - ShmemHugePages: 0 kB - ShmemPmdMapped: 0 kB + MemTotal: 32858820 kB + MemFree: 21001236 kB + MemAvailable: 27214312 kB + Buffers: 581092 kB + Cached: 5587612 kB + SwapCached: 0 kB + Active: 3237152 kB + Inactive: 7586256 kB + Active(anon): 94064 kB + Inactive(anon): 4570616 kB + Active(file): 3143088 kB + Inactive(file): 3015640 kB + Unevictable: 0 kB + Mlocked: 0 kB + SwapTotal: 0 kB + SwapFree: 0 kB + Zswap: 1904 kB + Zswapped: 7792 kB + Dirty: 12 kB + Writeback: 0 kB + AnonPages: 4654780 kB + Mapped: 266244 kB + Shmem: 9976 kB + KReclaimable: 517708 kB + Slab: 660044 kB + SReclaimable: 517708 kB + SUnreclaim: 142336 kB + KernelStack: 11168 kB + PageTables: 20540 kB + SecPageTables: 0 kB + NFS_Unstable: 0 kB + Bounce: 0 kB + WritebackTmp: 0 kB + CommitLimit: 16429408 kB + Committed_AS: 7715148 kB + VmallocTotal: 34359738367 kB + VmallocUsed: 40444 kB + VmallocChunk: 0 kB + Percpu: 29312 kB + EarlyMemtestBad: 0 kB + HardwareCorrupted: 0 kB + AnonHugePages: 4149248 kB + ShmemHugePages: 0 kB + ShmemPmdMapped: 0 kB + FileHugePages: 0 kB + FilePmdMapped: 0 kB + CmaTotal: 0 kB + CmaFree: 0 kB + Unaccepted: 0 kB + Balloon: 0 kB + HugePages_Total: 0 + HugePages_Free: 0 + HugePages_Rsvd: 0 + HugePages_Surp: 0 + Hugepagesize: 2048 kB + Hugetlb: 0 kB + DirectMap4k: 401152 kB + DirectMap2M: 10008576 kB + DirectMap1G: 24117248 kB MemTotal Total usable RAM (i.e. physical RAM minus a few reserved bits and the kernel binary code) MemFree - The sum of LowFree+HighFree + Total free RAM. On highmem systems, the sum of LowFree+HighFree MemAvailable An estimate of how much memory is available for starting new applications, without swapping. Calculated from MemFree, @@ -1003,8 +1117,9 @@ Buffers Relatively temporary storage for raw disk blocks shouldn't get tremendously large (20MB or so) Cached - in-memory cache for files read from the disk (the - pagecache). Doesn't include SwapCached + In-memory cache for files read from the disk (the + pagecache) as well as tmpfs & shmem. + Doesn't include SwapCached. SwapCached Memory that once was swapped out, is swapped back in but still also is in the swapfile (if memory is needed it @@ -1016,6 +1131,11 @@ Active Inactive Memory which has been less recently used. It is more eligible to be reclaimed for other purposes +Unevictable + Memory allocated for userspace which cannot be reclaimed, such + as mlocked pages, ramfs backing pages, secret memfd pages etc. +Mlocked + Memory locked with mlock(). HighTotal, HighFree Highmem is all memory above ~860MB of physical memory. Highmem areas are for use by userspace programs, or @@ -1032,26 +1152,26 @@ SwapTotal SwapFree Memory which has been evicted from RAM, and is temporarily on the disk +Zswap + Memory consumed by the zswap backend (compressed size) +Zswapped + Amount of anonymous memory stored in zswap (original size) Dirty Memory which is waiting to get written back to the disk Writeback Memory which is actively being written back to the disk AnonPages - Non-file backed pages mapped into userspace page tables -HardwareCorrupted - The amount of RAM/memory in KB, the kernel identifies as - corrupted. -AnonHugePages - Non-file backed huge pages mapped into userspace page tables + Non-file backed pages mapped into userspace page tables. Note that + some kernel configurations might consider all pages part of a + larger allocation (e.g., THP) as "mapped", as soon as a single + page is mapped. Mapped - files which have been mmaped, such as libraries + files which have been mmapped, such as libraries. Note that some + kernel configurations might consider all pages part of a larger + allocation (e.g., THP) as "mapped", as soon as a single page is + mapped. Shmem Total memory used by shared memory (shmem) and tmpfs -ShmemHugePages - Memory used by shared memory (shmem) and tmpfs allocated - with huge pages -ShmemPmdMapped - Shared memory mapped into userspace with huge pages KReclaimable Kernel allocations that the kernel will attempt to reclaim under memory pressure. Includes SReclaimable (below), and other @@ -1062,16 +1182,22 @@ SReclaimable Part of Slab, that might be reclaimed, such as caches SUnreclaim Part of Slab, that cannot be reclaimed on memory pressure +KernelStack + Memory consumed by the kernel stacks of all tasks PageTables - amount of memory dedicated to the lowest level of page - tables. + Memory consumed by userspace page tables +SecPageTables + Memory consumed by secondary page tables, this currently includes + KVM mmu and IOMMU allocations on x86 and arm64. NFS_Unstable - Always zero. Previous counted pages which had been written to + Always zero. Previously counted pages which had been written to the server, but has not been committed to stable storage. Bounce - Memory used for block device "bounce buffers" + Always zero. Previously memory used for block device + "bounce buffers". WritebackTmp - Memory used by FUSE for temporary writeback buffers + Always zero. Previously memory used by FUSE for temporary + writeback buffers. CommitLimit Based on the overcommit ratio ('vm.overcommit_ratio'), this is the total amount of memory currently available to @@ -1089,14 +1215,14 @@ CommitLimit yield a CommitLimit of 7.3G. For more details, see the memory overcommit documentation - in vm/overcommit-accounting. + in mm/overcommit-accounting. Committed_AS The amount of memory presently allocated on the system. The committed memory is a sum of all of the memory which has been allocated by processes, even if it has not been "used" by them as of yet. A process which malloc()'s 1G of memory, but only touches 300M of it will show up as - using 1G. This 1G is memory which has been "committed" to + using 1G. This 1G is memory which has been "committed" to by the VM and can be used at any time by the allocating application. With strict overcommit enabled on the system (mode 2 in 'vm.overcommit_memory'), allocations which would @@ -1105,7 +1231,7 @@ Committed_AS not fail due to lack of memory once that memory has been successfully allocated. VmallocTotal - total size of vmalloc memory area + total size of vmalloc virtual address space VmallocUsed amount of vmalloc area which is used VmallocChunk @@ -1113,6 +1239,41 @@ VmallocChunk Percpu Memory allocated to the percpu allocator used to back percpu allocations. This stat excludes the cost of metadata. +EarlyMemtestBad + The amount of RAM/memory in kB, that was identified as corrupted + by early memtest. If memtest was not run, this field will not + be displayed at all. Size is never rounded down to 0 kB. + That means if 0 kB is reported, you can safely assume + there was at least one pass of memtest and none of the passes + found a single faulty byte of RAM. +HardwareCorrupted + The amount of RAM/memory in KB, the kernel identifies as + corrupted. +AnonHugePages + Non-file backed huge pages mapped into userspace page tables +ShmemHugePages + Memory used by shared memory (shmem) and tmpfs allocated + with huge pages +ShmemPmdMapped + Shared memory mapped into userspace with huge pages +FileHugePages + Memory used for filesystem data (page cache) allocated + with huge pages +FilePmdMapped + Page cache mapped into userspace with huge pages +CmaTotal + Memory reserved for the Contiguous Memory Allocator (CMA) +CmaFree + Free remaining memory in the CMA reserves +Unaccepted + Memory that has not been accepted by the guest +Balloon + Memory returned to Host by VM Balloon Drivers +HugePages_Total, HugePages_Free, HugePages_Rsvd, HugePages_Surp, Hugepagesize, Hugetlb + See Documentation/admin-guide/mm/hugetlbpage.rst. +DirectMap4k, DirectMap2M, DirectMap1G + Breakdown of page table sizes used in the kernel's + identity mapping of RAM vmallocinfo ~~~~~~~~~~~ @@ -1181,85 +1342,7 @@ Provides counts of softirq handlers serviced since boot time, for each CPU. HRTIMER: 0 0 0 0 RCU: 1678 1769 2178 2250 - -1.3 IDE devices in /proc/ide ----------------------------- - -The subdirectory /proc/ide contains information about all IDE devices of which -the kernel is aware. There is one subdirectory for each IDE controller, the -file drivers and a link for each IDE device, pointing to the device directory -in the controller specific subtree. - -The file 'drivers' contains general information about the drivers used for the -IDE devices:: - - > cat /proc/ide/drivers - ide-cdrom version 4.53 - ide-disk version 1.08 - -More detailed information can be found in the controller specific -subdirectories. These are named ide0, ide1 and so on. Each of these -directories contains the files shown in table 1-6. - - -.. table:: Table 1-6: IDE controller info in /proc/ide/ide? - - ======= ======================================= - File Content - ======= ======================================= - channel IDE channel (0 or 1) - config Configuration (only for PCI/IDE bridge) - mate Mate name - model Type/Chipset of IDE controller - ======= ======================================= - -Each device connected to a controller has a separate subdirectory in the -controllers directory. The files listed in table 1-7 are contained in these -directories. - - -.. table:: Table 1-7: IDE device information - - ================ ========================================== - File Content - ================ ========================================== - cache The cache - capacity Capacity of the medium (in 512Byte blocks) - driver driver and version - geometry physical and logical geometry - identify device identify block - media media type - model device identifier - settings device setup - smart_thresholds IDE disk management thresholds - smart_values IDE disk management values - ================ ========================================== - -The most interesting file is ``settings``. This file contains a nice -overview of the drive parameters:: - - # cat /proc/ide/ide0/hda/settings - name value min max mode - ---- ----- --- --- ---- - bios_cyl 526 0 65535 rw - bios_head 255 0 255 rw - bios_sect 63 0 63 rw - breada_readahead 4 0 127 rw - bswap 0 0 1 r - file_readahead 72 0 2097151 rw - io_32bit 0 0 3 rw - keepsettings 0 0 1 rw - max_kb_per_request 122 1 127 rw - multcount 0 0 8 rw - nice1 1 0 1 rw - nowerr 0 0 1 rw - pio_mode write-only 0 255 w - slow 0 0 1 rw - unmaskirq 0 0 1 rw - using_dma 0 0 1 rw - - -1.4 Networking info in /proc/net +1.3 Networking info in /proc/net -------------------------------- The subdirectory /proc/net follows the usual pattern. Table 1-8 shows the @@ -1305,6 +1388,7 @@ support this. Table 1-9 lists the files and their meaning. rt_cache Routing cache snmp SNMP data sockstat Socket statistics + softnet_stat Per-CPU incoming packets queues statistics of online CPUs tcp TCP sockets udp UDP sockets unix UNIX domain sockets @@ -1338,12 +1422,12 @@ It will contain information that is specific to that bond, such as the current slaves of the bond, the link status of the slaves, and how many times the slaves link has failed. -1.5 SCSI info +1.4 SCSI info ------------- -If you have a SCSI host adapter in your system, you'll find a subdirectory -named after the driver for this adapter in /proc/scsi. You'll also see a list -of all recognized SCSI devices in /proc/scsi:: +If you have a SCSI or ATA host adapter in your system, you'll find a +subdirectory named after the driver for this adapter in /proc/scsi. +You'll also see a list of all recognized SCSI devices in /proc/scsi:: >cat /proc/scsi/scsi Attached devices: @@ -1401,7 +1485,7 @@ AHA-2940 SCSI adapter:: Total transfers 0 (0 reads and 0 writes) -1.6 Parallel port info in /proc/parport +1.5 Parallel port info in /proc/parport --------------------------------------- The directory /proc/parport contains information about the parallel ports of @@ -1426,7 +1510,7 @@ These directories contain the four files shown in Table 1-10. number or none). ========= ==================================================================== -1.7 TTY info in /proc/tty +1.6 TTY info in /proc/tty ------------------------- Information about the available and actually used tty's can be found in the @@ -1461,7 +1545,7 @@ To see which tty's are currently in use, you can simply look into the file unknown /dev/tty 4 1-63 console -1.8 Miscellaneous kernel statistics in /proc/stat +1.7 Miscellaneous kernel statistics in /proc/stat ------------------------------------------------- Various pieces of information about kernel activity are available in the @@ -1469,16 +1553,18 @@ Various pieces of information about kernel activity are available in the since the system first booted. For a quick look, simply cat the file:: > cat /proc/stat - cpu 2255 34 2290 22625563 6290 127 456 0 0 0 - cpu0 1132 34 1441 11311718 3675 127 438 0 0 0 - cpu1 1123 0 849 11313845 2614 0 18 0 0 0 - intr 114930548 113199788 3 0 5 263 0 4 [... lots more numbers ...] - ctxt 1990473 - btime 1062191376 - processes 2915 - procs_running 1 + cpu 237902850 368826709 106375398 1873517540 1135548 0 14507935 0 0 0 + cpu0 60045249 91891769 26331539 468411416 495718 0 5739640 0 0 0 + cpu1 59746288 91759249 26609887 468860630 312281 0 4384817 0 0 0 + cpu2 59489247 92985423 26904446 467808813 171668 0 2268998 0 0 0 + cpu3 58622065 92190267 26529524 468436680 155879 0 2114478 0 0 0 + intr 8688370575 8 3373 0 0 0 0 0 0 1 40791 0 0 353317 0 0 0 0 224789828 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 190974333 41958554 123983334 43 0 224593 0 0 0 <more 0's deleted> + ctxt 22848221062 + btime 1605316999 + processes 746787147 + procs_running 2 procs_blocked 0 - softirq 183433 0 21755 12 39 1137 231 21459 2263 + softirq 12121874454 100099120 3938138295 127375644 2795979 187870761 0 173808342 3072582055 52608 224184354 The very first "cpu" line aggregates the numbers in all of the other "cpuN" lines. These numbers identify the amount of time the CPU has spent performing @@ -1534,14 +1620,14 @@ softirqs serviced; each subsequent column is the total for that particular softirq. -1.9 Ext4 file system parameters +1.8 Ext4 file system parameters ------------------------------- Information about mounted ext4 file systems can be found in /proc/fs/ext4. Each mounted filesystem will have a directory in /proc/fs/ext4 based on its device name (i.e., /proc/fs/ext4/hdc or -/proc/fs/ext4/dm-0). The files in each per-device directory are shown -in Table 1-12, below. +/proc/fs/ext4/sda9 or /proc/fs/ext4/dm-0). The files in each per-device +directory are shown in Table 1-12, below. .. table:: Table 1-12: Files in /proc/fs/ext4/<devname> @@ -1550,7 +1636,7 @@ in Table 1-12, below. mb_groups details of multiblock allocator buddy cache of free blocks ============== ========================================================== -1.10 /proc/consoles +1.9 /proc/consoles ------------------- Shows registered system console lines. @@ -1621,12 +1707,12 @@ can inadvertently disrupt your system, it is advisable to read both documentation and source before actually making adjustments. In any case, be very careful when writing to any of these files. The entries in /proc may change slightly between the 2.1.* and the 2.2 kernel, so if there is any doubt -review the kernel documentation in the directory /usr/src/linux/Documentation. +review the kernel documentation in the directory linux/Documentation. This chapter is heavily based on the documentation included in the pre 2.2 kernels, and became part of it in version 2.2.1 of the Linux kernel. -Please see: Documentation/admin-guide/sysctl/ directory for descriptions of these -entries. +Please see: Documentation/admin-guide/sysctl/ directory for descriptions of +these entries. Summary ------- @@ -1857,19 +1943,19 @@ For example:: This file contains lines of the form:: 36 35 98:0 /mnt1 /mnt2 rw,noatime master:1 - ext3 /dev/root rw,errors=continue - (1)(2)(3) (4) (5) (6) (7) (8) (9) (10) (11) - - (1) mount ID: unique identifier of the mount (may be reused after umount) - (2) parent ID: ID of parent (or of self for the top of the mount tree) - (3) major:minor: value of st_dev for files on filesystem - (4) root: root of the mount within the filesystem - (5) mount point: mount point relative to the process's root - (6) mount options: per mount options - (7) optional fields: zero or more fields of the form "tag[:value]" - (8) separator: marks the end of the optional fields - (9) filesystem type: name of filesystem of the form "type[.subtype]" - (10) mount source: filesystem specific information or "none" - (11) super options: per super block options + (1)(2)(3) (4) (5) (6) (n…m) (m+1)(m+2) (m+3) (m+4) + + (1) mount ID: unique identifier of the mount (may be reused after umount) + (2) parent ID: ID of parent (or of self for the top of the mount tree) + (3) major:minor: value of st_dev for files on filesystem + (4) root: root of the mount within the filesystem + (5) mount point: mount point relative to the process's root + (6) mount options: per mount options + (n…m) optional fields: zero or more fields of the form "tag[:value]" + (m+1) separator: marks the end of the optional fields + (m+2) filesystem type: name of filesystem of the form "type[.subtype]" + (m+3) mount source: filesystem specific information or "none" + (m+4) super options: per super block options Parsers should ignore all unrecognised optional fields. Currently the possible optional fields are: @@ -1896,8 +1982,8 @@ For more information on mount propagation see: These files provide a method to access a task's comm value. It also allows for a task to set its own or one of its thread siblings comm value. The comm value is limited in size compared to the cmdline value, so writing anything longer -then the kernel's TASK_COMM_LEN (currently 16 chars) will result in a truncated -comm value. +then the kernel's TASK_COMM_LEN (currently 16 chars, including the NUL +terminator) will result in a truncated comm value. 3.7 /proc/<pid>/task/<tid>/children - Information about task children @@ -2074,6 +2160,20 @@ DMA Buffer files where 'size' is the size of the DMA buffer in bytes. 'count' is the file count of the DMA buffer file. 'exp_name' is the name of the DMA buffer exporter. +VFIO Device files +~~~~~~~~~~~~~~~~~ + +:: + + pos: 0 + flags: 02000002 + mnt_id: 17 + ino: 5122 + vfio-device-syspath: /sys/devices/pci0000:e0/0000:e0:01.1/0000:e1:00.0/0000:e2:05.0/0000:e8:00.0 + +where 'vfio-device-syspath' is the sysfs path corresponding to the VFIO device +file. + 3.9 /proc/<pid>/map_files - Information about memory mapped files --------------------------------------------------------------------- This directory contains symbolic links which represent memory mapped files @@ -2174,6 +2274,90 @@ AVX512_elapsed_ms the task is unlikely an AVX512 user, but depends on the workload and the scheduling scenario, it also could be a false negative mentioned above. +3.13 /proc/<pid>/fd - List of symlinks to open files +------------------------------------------------------- +This directory contains symbolic links which represent open files +the process is maintaining. Example output:: + + lr-x------ 1 root root 64 Sep 20 17:53 0 -> /dev/null + l-wx------ 1 root root 64 Sep 20 17:53 1 -> /dev/null + lrwx------ 1 root root 64 Sep 20 17:53 10 -> 'socket:[12539]' + lrwx------ 1 root root 64 Sep 20 17:53 11 -> 'socket:[12540]' + lrwx------ 1 root root 64 Sep 20 17:53 12 -> 'socket:[12542]' + +The number of open files for the process is stored in 'size' member +of stat() output for /proc/<pid>/fd for fast access. +------------------------------------------------------- + +3.14 /proc/<pid/ksm_stat - Information about the process's ksm status +--------------------------------------------------------------------- +When CONFIG_KSM is enabled, each process has this file which displays +the information of ksm merging status. + +Example +~~~~~~~ + +:: + + / # cat /proc/self/ksm_stat + ksm_rmap_items 0 + ksm_zero_pages 0 + ksm_merging_pages 0 + ksm_process_profit 0 + ksm_merge_any: no + ksm_mergeable: no + +Description +~~~~~~~~~~~ + +ksm_rmap_items +^^^^^^^^^^^^^^ + +The number of ksm_rmap_item structures in use. The structure +ksm_rmap_item stores the reverse mapping information for virtual +addresses. KSM will generate a ksm_rmap_item for each ksm-scanned page of +the process. + +ksm_zero_pages +^^^^^^^^^^^^^^ + +When /sys/kernel/mm/ksm/use_zero_pages is enabled, it represent how many +empty pages are merged with kernel zero pages by KSM. + +ksm_merging_pages +^^^^^^^^^^^^^^^^^ + +It represents how many pages of this process are involved in KSM merging +(not including ksm_zero_pages). It is the same with what +/proc/<pid>/ksm_merging_pages shows. + +ksm_process_profit +^^^^^^^^^^^^^^^^^^ + +The profit that KSM brings (Saved bytes). KSM can save memory by merging +identical pages, but also can consume additional memory, because it needs +to generate a number of rmap_items to save each scanned page's brief rmap +information. Some of these pages may be merged, but some may not be abled +to be merged after being checked several times, which are unprofitable +memory consumed. + +ksm_merge_any +^^^^^^^^^^^^^ + +It specifies whether the process's 'mm is added by prctl() into the +candidate list of KSM or not, and if KSM scanning is fully enabled at +process level. + +ksm_mergeable +^^^^^^^^^^^^^ + +It specifies whether any VMAs of the process''s mms are currently +applicable to KSM. + +More information about KSM can be found in +Documentation/admin-guide/mm/ksm.rst. + + Chapter 4: Configuring procfs ============================= @@ -2186,6 +2370,7 @@ The following mount options are supported: hidepid= Set /proc/<pid>/ access mode. gid= Set the group authorized to learn processes information. subset= Show only the specified subset of procfs. + pidns= Specify a the namespace used by this procfs. ========= ======================================================== hidepid=off or hidepid=0 means classic mode - everybody may access all @@ -2202,7 +2387,7 @@ arguments are now protected against local eavesdroppers. hidepid=invisible or hidepid=2 means hidepid=1 plus all /proc/<pid>/ will be fully invisible to other users. It doesn't mean that it hides a fact whether a process with a specific pid value exists (it can be learned by other means, e.g. -by "kill -0 $PID"), but it hides process' uid and gid, which may be learned by +by "kill -0 $PID"), but it hides process's uid and gid, which may be learned by stat()'ing /proc/<pid>/ otherwise. It greatly complicates an intruder's task of gathering information about running processes, whether some daemon runs with elevated privileges, whether other user runs some sensitive program, whether @@ -2218,10 +2403,17 @@ information about processes information, just add identd to this group. subset=pid hides all top level files and directories in the procfs that are not related to tasks. +pidns= specifies a pid namespace (either as a string path to something like +`/proc/$pid/ns/pid`, or a file descriptor when using `FSCONFIG_SET_FD`) that +will be used by the procfs instance when translating pids. By default, procfs +will use the calling process's active pid namespace. Note that the pid +namespace of an existing procfs instance cannot be modified (attempting to do +so will give an `-EBUSY` error). + Chapter 5: Filesystem behavior ============================== -Originally, before the advent of pid namepsace, procfs was a global file +Originally, before the advent of pid namespace, procfs was a global file system. It means that there was only one procfs instance in the system. When pid namespace was added, a separate procfs instance was mounted in |
