path: root/fs/proc/proc_sysctl.c
Age | Commit message | Author
2025-01-27Pass parent directory inode and expected name to ->d_revalidate()Al Viro
->d_revalidate() often needs to access dentry parent and name; that has to be done carefully, since the locking environment varies from caller to caller. We are not guaranteed that dentry in question will not be moved right under us - not unless the filesystem is such that nothing on it ever gets renamed. It can be dealt with, but that results in boilerplate code that isn't even needed - the callers normally have just found the dentry via dcache lookup and want to verify that it's in the right place; they already have the values of ->d_parent and ->d_name stable. There are a couple of exceptions (overlayfs and, to a lesser extent, ecryptfs), but for the majority of calls that song and dance is not needed at all. It's easier to make ecryptfs and overlayfs find and pass those values if there's a ->d_revalidate() instance to be called, rather than doing that in the instances. This commit only changes the calling conventions; making use of supplied values is left to followups. NOTE: some instances need more than just the parent - things like CIFS may need to build an entire path from filesystem root, so they need more precautions than the usual boilerplate. This series doesn't do anything to that need - these filesystems have to keep their locking mechanisms (rename_lock loops, use of dentry_path_raw(), private rwsem a-la v9fs). One thing to keep in mind when using name is that name->name will normally point into the pathname being resolved; the filename in question occupies name->len bytes starting at name->name, and there is a NUL somewhere after it, but the next byte might very well be '/' rather than '\0'. Do not ignore name->len. Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Gabriel Krisman Bertazi <gabriel@krisman.be> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
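The warning about name->len can be illustrated with a small userspace sketch. It is not kernel code: `struct name_ref` and `name_matches()` are hypothetical stand-ins that assume only the two fields the message mentions (a `name` pointer and a `len`).

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for the name handed to ->d_revalidate():
 * `name` may point into the middle of a longer pathname.
 */
struct name_ref {
	const char *name;
	size_t len;
};

/* Compare exactly `len` bytes; never rely on a '\0' right after the name. */
bool name_matches(const struct name_ref *n, const char *expected)
{
	size_t elen = strlen(expected);

	return n->len == elen && memcmp(n->name, expected, elen) == 0;
}
```

With a name that points at the start of "net/ipv4" and has len 3, `name_matches()` accepts "net", whereas a plain strcmp() against "net" would see the trailing "/ipv4" and fail.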
2024-10-31sysctl: Reduce dput(child) calls in proc_sys_fill_cache()Markus Elfring
Replace two dput(child) calls with one that occurs immediately before the IS_ERR evaluation. This transformation can be performed because dput() gets called regardless of the value returned by IS_ERR(res). This issue was transformed by using a script for the semantic patch language like the following. <SmPL> @extended_adjustment@ expression e, f != { mutex_unlock }, x, y; @@ +f(e); if (...) { <+... when != \( e = x \| y(..., &e, ...) \) - f(e); ...+> } -f(e); </SmPL> Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Reviewed-by: Joel Granados <joel.granados@kernel.org> Signed-off-by: Joel Granados <joel.granados@kernel.org>
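The shape of this transformation can be sketched in userspace C. The names below (`struct ref`, `put_ref()`, `fill_before()`, `fill_after()`) are hypothetical; `put_ref()` merely stands in for dput(), and the `failed` flag stands in for the IS_ERR(res) test.

```c
#include <stdbool.h>

/* Hypothetical reference-counted object standing in for a dentry. */
struct ref {
	int count;
};

/* Stand-in for dput(): drop one reference. */
void put_ref(struct ref *r)
{
	r->count--;
}

/* Before: the release call is duplicated on both paths. */
bool fill_before(struct ref *child, bool failed)
{
	if (failed) {
		put_ref(child);
		return false;
	}
	put_ref(child);
	return true;
}

/* After: a single release call, placed immediately before the check. */
bool fill_after(struct ref *child, bool failed)
{
	put_ref(child);
	if (failed)
		return false;
	return true;
}
```

Both variants drop exactly one reference on either path, which is why the two calls can be merged without changing behavior.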
2024-10-10sysctl: Convert locking comments to lockdep assertionsThomas Weißschuh
The assertions work as well as the comment to inform developers about locking expectations. Additionally they are validated by lockdep at runtime, making sure the expectations are met. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Signed-off-by: Joel Granados <joel.granados@kernel.org>
2024-10-09sysctl: make internal ctl_tables constThomas Weißschuh
Now that the sysctl core can handle registration of "const struct ctl_table" constify the sysctl internal tables. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Signed-off-by: Joel Granados <joel.granados@kernel.org>
2024-10-09sysctl: allow registration of const struct ctl_tableThomas Weißschuh
Putting structures, especially those containing function pointers, into read-only memory makes them safer and easier to reason about. Change the sysctl registration APIs to allow registration of "const struct ctl_table". Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Acked-by: Kees Cook <keescook@chromium.org> Reviewed-by: Kees Cook <keescook@chromium.org> # security/* Signed-off-by: Joel Granados <joel.granados@kernel.org>
2024-10-09sysctl: move internal interfaces to const struct ctl_tableThomas Weißschuh
As a preparation to make all the core sysctl code work with const struct ctl_table switch over the internal function to use the const variant. Some pointers to "struct ctl_table" need to stay non-const as they are newly allocated and modified before registration. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Signed-off-by: Joel Granados <joel.granados@kernel.org>
2024-09-02sysctl: avoid spurious permanent empty tablesThomas Weißschuh
The test for whether a table is permanently empty inspects the address of the registered ctl_table argument. However, as sysctl_mount_point is an empty array and does not occupy any space, it can end up sharing an address with another object in memory. If that other object is itself a "struct ctl_table", then registering that table will fail as it is incorrectly recognized as permanently empty. Avoid this issue by adding a dummy element to the array so that it is not empty anymore. Explicitly register the table with zero elements, as otherwise the dummy element would be recognized as a sentinel element, which would lead to a runtime warning from the sysctl core. While the issue does not seem to be encountered at this time, this seems mostly to be due to luck. Also, a future change, constifying sysctl_mount_point and root_table, can reliably trigger this issue on clang 18. Given that empty arrays are non-standard in the first place, it seems prudent to avoid them if possible. Fixes: 4a7b29f65094 ("sysctl: move sysctl type to ctl_table_header") Fixes: a35dd3a786f5 ("sysctl: drop now unnecessary out-of-bounds check") Cc: stable@vger.kernel.org Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Closes: https://lore.kernel.org/oe-lkp/202408051453.f638857e-lkp@intel.com Signed-off-by: Joel Granados <j.granados@samsung.com>
2024-06-13sysctl: Warn on an empty procname elementJoel Granados
Add a pr_err warning in case a ctl_table is registered with a sentinel element containing a NULL procname. Signed-off-by: Joel Granados <j.granados@samsung.com>
2024-06-13sysctl: Remove ctl_table sentinel code commentsJoel Granados
Remove the mention of a "zero terminated entry" from the __register_sysctl_table function doc. Signed-off-by: Joel Granados <j.granados@samsung.com>
2024-06-13sysctl: Remove "child" sysctl code commentsJoel Granados
Erase the code comments mentioning "child" that were forgotten when the child element was removed in commit 2f2665c13af48 ("sysctl: replace child with an enumeration"). Signed-off-by: Joel Granados <j.granados@samsung.com>
2024-06-13sysctl: Remove superfluous empty allocations from sysctl internalsJoel Granados
Now that the sentinels have been removed from ctl_table arrays, there is no need to artificially append empty ctl_table elements at ctl_table registration. Remove superfluous empty allocation from new_dir and new_links. Signed-off-by: Joel Granados <j.granados@samsung.com>
2024-06-13sysctl: Replace nr_entries with ctl_table_size in new_linksJoel Granados
The number of ctl_table entries (nr_entries) calculation was previously based on the ctl_table_size and the sentinel element. Since the sentinels have been removed, we remove the calculation and just use the ctl_table_size from the ctl_table_header. Signed-off-by: Joel Granados <j.granados@samsung.com>
2024-06-13sysctl: Remove check for sentinel element in ctl_table arraysJoel Granados
Use ARRAY_SIZE exclusively by removing the check to ->procname in the stopping criteria of the loops traversing ctl_table arrays. This commit finalizes the removal of the sentinel elements at the end of ctl_table arrays which reduces the build time size and run time memory bloat by ~64 bytes per sentinel (further information Link : https://lore.kernel.org/all/ZO5Yx5JFogGi%2FcBo@bombadil.infradead.org/) Remove the entry->procname evaluation from the for loop stopping criteria in sysctl and sysctl_net. Signed-off-by: Joel Granados <j.granados@samsung.com>
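A minimal userspace sketch of size-bounded traversal follows. `struct entry`, `demo_table` and `count_writable()` are hypothetical, and the `ARRAY_SIZE` macro here is a simplified version written for this example rather than the kernel's definition.

```c
#include <stddef.h>

/* Simplified for this sketch; it has no array-type checking. */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Reduced, hypothetical stand-in for struct ctl_table. */
struct entry {
	const char *procname;
	int mode;
};

/* No trailing empty element: the array length is the only bound. */
const struct entry demo_table[] = {
	{ .procname = "foo", .mode = 0644 },
	{ .procname = "bar", .mode = 0444 },
};

/* Walk exactly `size` entries; ->procname is never used as a terminator. */
size_t count_writable(const struct entry *table, size_t size)
{
	size_t i, n = 0;

	for (i = 0; i < size; i++)
		if (table[i].mode & 0222)
			n++;
	return n;
}
```

A caller would pass `ARRAY_SIZE(demo_table)` as the size, so no empty terminating element has to be stored or tested.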
2024-06-03sysctl: move the extra1/2 boundary check of u8 to sysctl_check_table_arrayWen Yang
Move boundary checking for proc_dou8vec_minmax into module loading, thereby reporting errors in advance. And add a kunit test case ensuring the boundary check is done correctly. The boundary check in proc_dou8vec_minmax done to the extra elements in the ctl_table struct is currently performed at runtime. This allows buggy kernel modules to be loaded normally without any errors only to fail when used. This is a buggy example module:

    #include <linux/kernel.h>
    #include <linux/module.h>
    #include <linux/sysctl.h>

    static struct ctl_table_header *_table_header = NULL;
    static unsigned char _data = 0;

    struct ctl_table table[] = {
        {
            .procname     = "foo",
            .data         = &_data,
            .maxlen       = sizeof(u8),
            .mode         = 0644,
            .proc_handler = proc_dou8vec_minmax,
            .extra1       = SYSCTL_ZERO,
            .extra2       = SYSCTL_ONE_THOUSAND, /* the bug: 1000 does not fit in a u8 */
        },
    };

    static int init_demo(void)
    {
        _table_header = register_sysctl("kernel", table);
        if (!_table_header)
            return -ENOMEM;
        return 0;
    }

    module_init(init_demo);
    MODULE_LICENSE("GPL");

And this is the result:

    # insmod test.ko
    # cat /proc/sys/kernel/foo
    cat: /proc/sys/kernel/foo: Invalid argument

Suggested-by: Joel Granados <j.granados@samsung.com> Signed-off-by: Wen Yang <wen.yang@linux.dev> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Joel Granados <j.granados@samsung.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Christian Brauner <brauner@kernel.org> Cc: linux-kernel@vger.kernel.org Reviewed-by: Joel Granados <j.granados@samsung.com> Signed-off-by: Joel Granados <j.granados@samsung.com>
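The idea of validating the limits once, when the table is registered, can be sketched in userspace as follows. This is an illustration under assumptions, not the kernel's check: `struct u8_entry` and `check_u8_bounds()` are hypothetical, and the limits are modeled as optional pointers to unsigned int.

```c
#include <errno.h>
#include <stddef.h>

/* Reduced, hypothetical stand-in for the limits of a u8 sysctl entry. */
struct u8_entry {
	const char *procname;
	const unsigned int *extra1; /* optional minimum, may be NULL */
	const unsigned int *extra2; /* optional maximum, may be NULL */
};

/* Reject limits that cannot fit in a u8 before the entry is ever used. */
int check_u8_bounds(const struct u8_entry *e)
{
	if (e->extra1 && *e->extra1 > 255U)
		return -EINVAL;
	if (e->extra2 && *e->extra2 > 255U)
		return -EINVAL;
	return 0;
}
```

With a check of this kind run at registration, an entry whose maximum is 1000 is rejected at load time instead of failing later when the file is read.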
2024-06-03sysctl: always initialize i_uid/i_gidThomas Weißschuh
Always initialize i_uid/i_gid inside the sysctl core so set_ownership() can safely skip setting them. Commit 5ec27ec735ba ("fs/proc/proc_sysctl.c: fix the default values of i_uid/i_gid on /proc/sys inodes.") added defaults for i_uid/i_gid when set_ownership() was not implemented. It also missed adjusting net_ctl_set_ownership() to use the same default values in case the computation of a better value failed. Fixes: 5ec27ec735ba ("fs/proc/proc_sysctl.c: fix the default values of i_uid/i_gid on /proc/sys inodes.") Cc: stable@vger.kernel.org Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Signed-off-by: Joel Granados <j.granados@samsung.com>
2024-04-24sysctl: drop now unnecessary out-of-bounds checkThomas Weißschuh
Remove the now unneeded check for ctl_table_size; it is safe to do so as sysctl_set_perm_empty_ctl_header() does not access the ctl_table member anymore. This also makes the element of sysctl_mount_point unnecessary, so drop it at the same time. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Signed-off-by: Joel Granados <j.granados@samsung.com>
2024-04-24sysctl: move sysctl type to ctl_table_headerThomas Weißschuh
Move the SYSCTL_TABLE_TYPE_{DEFAULT,PERMANENTLY_EMPTY} enums from ctl_table to ctl_table_header. Removing the mutable member is necessary to constify static instances of struct ctl_table. Move the initialization of the sysctl_mount_point type into init_header() where all the other header fields are also initialized. As a side-effect the memory usage of the sysctl core is reduced. Each ctl_table_header instance can manage multiple ctl_table instances and is only allocated when the table is actually registered. This saves 8 bytes of memory per ctl_table on 64bit, 4 due to the enum field itself and 4 due to padding. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Signed-off-by: Joel Granados <j.granados@samsung.com>
2024-04-24sysctl: drop sysctl_is_perm_empty_ctl_tableThomas Weißschuh
It is used only twice and those callers are simpler with sysctl_is_perm_empty_ctl_header(). So use this sibling function. This is part of an effort to constify definition of struct ctl_table. For this effort the mutable member 'type' is moved from struct ctl_table to struct ctl_table_header. Unifying the macros sysctl_is_perm_empty_ctl_* makes this easier. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Signed-off-by: Joel Granados <j.granados@samsung.com>
2024-04-24sysctl: treewide: drop unused argument ctl_table_root::set_ownership(table)Thomas Weißschuh
Remove the 'table' argument from set_ownership as it is never used. This change is a step towards putting "struct ctl_table" into .rodata and eventually having sysctl core only use "const struct ctl_table". The patch was created with the following coccinelle script: @@ identifier func, head, table, uid, gid; @@ void func( struct ctl_table_header *head, - struct ctl_table *table, kuid_t *uid, kgid_t *gid) { ... } No additional occurrences of 'set_ownership' were found after doing a tree-wide search. Reviewed-by: Joel Granados <j.granados@samsung.com> Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Signed-off-by: Joel Granados <j.granados@samsung.com>
2024-01-11Merge tag 'pull-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds
Pull misc filesystem updates from Al Viro: "Misc cleanups (the part that hadn't been picked by individual fs trees)" * tag 'pull-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: apparmorfs: don't duplicate kfree_link() orangefs: saner arguments passing in readdir guts ocfs2_find_match(): there's no such thing as NULL or negative ->d_parent reiserfs_add_entry(): get rid of pointless namelen checks __ocfs2_add_entry(), ocfs2_prepare_dir_for_insert(): namelen checks ext4_add_entry(): ->d_name.len is never 0 befs: d_obtain_alias(ERR_PTR(...)) will do the right thing affs: d_obtain_alias(ERR_PTR(...)) will do the right thing /proc/sys: use d_splice_alias() calling conventions to simplify failure exits hostfs: use d_splice_alias() calling conventions to simplify failure exits udf_fiiter_add_entry(): check for zero ->d_name.len is bogus... udf: d_obtain_alias(ERR_PTR(...)) will do the right thing... udf: d_splice_alias() will do the right thing on ERR_PTR() inode nfsd: kill stale comment about simple_fill_super() requirements bfs_add_entry(): get rid of pointless ->d_name.len checks nilfs2: d_obtain_alias(ERR_PTR(...)) will do the right thing... zonefs: d_splice_alias() will do the right thing on ERR_PTR() inode
2023-12-28fs: Remove the now superfluous sentinel elements from ctl_table arrayJoel Granados
This commit comes at the tail end of a greater effort to remove the empty elements at the end of the ctl_table arrays (sentinels) which will reduce the overall build time size of the kernel and run time memory bloat by ~64 bytes per sentinel (further information Link : https://lore.kernel.org/all/ZO5Yx5JFogGi%2FcBo@bombadil.infradead.org/) Remove sentinel elements ctl_table struct. Special attention was placed in making sure that an empty directory for fs/verity was created when CONFIG_FS_VERITY_BUILTIN_SIGNATURES is not defined. In this case we use the register sysctl call that expects a size. Signed-off-by: Joel Granados <j.granados@samsung.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Acked-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-12-28sysctl: Fix out of bounds access for empty sysctl registersJoel Granados
When registering tables to the sysctl subsystem there is a check to see if header is a permanently empty directory (used for mounts). This check evaluates the first element of the ctl_table. This results in an out of bounds evaluation when registering empty directories. The function register_sysctl_mount_point now passes a ctl_table of size 1 instead of size 0. It now relies solely on the type to identify a permanently empty register. Make sure that the ctl_table has at least one element before testing for permanent emptiness. Signed-off-by: Joel Granados <j.granados@samsung.com> Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202311201431.57aae8f3-oliver.sang@intel.com Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-12-21/proc/sys: use d_splice_alias() calling conventions to simplify failure exitsAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2023-11-01Merge tag 'sysctl-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linuxLinus Torvalds
Pull sysctl updates from Luis Chamberlain: "To help make the move of sysctls out of kernel/sysctl.c not incur a size penalty sysctl has been changed to allow us to not require the sentinel, the final empty element on the sysctl array. Joel Granados has been doing all this work. On the v6.6 kernel we got the major infrastructure changes required to support this. For v6.7-rc1 we have all arch/ and drivers/ modified to remove the sentinel. Both arch and driver changes have been on linux-next for a bit less than a month. It is worth re-iterating the value: - this helps reduce the overall build time size of the kernel and run time memory consumed by the kernel by about 64 bytes per array - the extra 64-byte penalty is no longer incurred now when we move sysctls out from kernel/sysctl.c to their own files For v6.8-rc1 expect removal of all the sentinels and also then the unneeded check for procname == NULL. The last two patches are fixes recently merged by Krister Johansen which allow us again to use softlockup_panic early on boot. This used to work but the alias work broke it. This is useful for folks who want to detect softlockups super early rather than wait and spend money on cloud solutions with nothing but an eventual hung kernel. 
Although this hadn't gone through linux-next it's also a stable fix, so we might as well roll through the fixes now" * tag 'sysctl-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux: (23 commits) watchdog: move softlockup_panic back to early_param proc: sysctl: prevent aliased sysctls from getting passed to init intel drm: Remove now superfluous sentinel element from ctl_table array Drivers: hv: Remove now superfluous sentinel element from ctl_table array raid: Remove now superfluous sentinel element from ctl_table array fw loader: Remove the now superfluous sentinel element from ctl_table array sgi-xp: Remove the now superfluous sentinel element from ctl_table array vrf: Remove the now superfluous sentinel element from ctl_table array char-misc: Remove the now superfluous sentinel element from ctl_table array infiniband: Remove the now superfluous sentinel element from ctl_table array macintosh: Remove the now superfluous sentinel element from ctl_table array parport: Remove the now superfluous sentinel element from ctl_table array scsi: Remove now superfluous sentinel element from ctl_table array tty: Remove now superfluous sentinel element from ctl_table array xen: Remove now superfluous sentinel element from ctl_table array hpet: Remove now superfluous sentinel element from ctl_table array c-sky: Remove now superfluous sentinel element from ctl_talbe array powerpc: Remove now superfluous sentinel element from ctl_table arrays riscv: Remove now superfluous sentinel element from ctl_table array x86/vdso: Remove now superfluous sentinel element from ctl_table array ...
2023-11-01watchdog: move softlockup_panic back to early_paramKrister Johansen
Setting softlockup_panic from do_sysctl_args() causes it to take effect later in boot. The lockup detector is enabled before SMP is brought online, but do_sysctl_args runs afterwards. If a user wants to set softlockup_panic on boot and have it trigger should a softlockup occur during onlining of the non-boot processors, they could do this prior to commit f117955a2255 ("kernel/watchdog.c: convert {soft/hard}lockup boot parameters to sysctl aliases"). However, after this commit the value of softlockup_panic is set too late to be of help for this type of problem. Restore the prior behavior. Signed-off-by: Krister Johansen <kjlx@templeofstupid.com> Cc: stable@vger.kernel.org Fixes: f117955a2255 ("kernel/watchdog.c: convert {soft/hard}lockup boot parameters to sysctl aliases") Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-11-01proc: sysctl: prevent aliased sysctls from getting passed to initKrister Johansen
The code that checks for unknown boot options is unaware of the sysctl alias facility, which maps bootparams to sysctl values. If a user sets an old value that has a valid alias, a message about an invalid parameter will be printed during boot, and the parameter will get passed to init. Fix by checking for the existence of aliased parameters in the unknown boot parameter code. If an alias exists, don't return an error or pass the value to init. Signed-off-by: Krister Johansen <kjlx@templeofstupid.com> Cc: stable@vger.kernel.org Fixes: 0a477e1ae21b ("kernel/sysctl: support handling command line aliases") Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
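The fix described above amounts to consulting the alias table before treating a boot parameter as unknown. A rough userspace sketch, with an invented alias table (the entries in `demo_aliases` and the helper names are assumptions made for illustration, not the kernel's actual list or functions):

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Illustrative alias table; the entries are invented for this sketch. */
struct alias {
	const char *boot_param;
	const char *sysctl_name;
};

const struct alias demo_aliases[] = {
	{ "old_knob", "kernel.new_knob" },
	{ "legacy_flag", "vm.legacy_flag" },
};

/* Return the sysctl a boot parameter maps to, or NULL if it has no alias. */
const char *find_alias(const char *param)
{
	size_t i;

	for (i = 0; i < sizeof(demo_aliases) / sizeof(demo_aliases[0]); i++)
		if (strcmp(param, demo_aliases[i].boot_param) == 0)
			return demo_aliases[i].sysctl_name;
	return NULL;
}

/* An unknown-parameter handler should skip anything that has an alias. */
bool should_pass_to_init(const char *param)
{
	return find_alias(param) == NULL;
}
```

In this sketch a parameter with an alias is neither reported as invalid nor forwarded to init, while a genuinely unknown parameter still is.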
2023-10-18proc: convert to new timestamp accessorsJeff Layton
Convert to using the new inode timestamp accessor functions. Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/20231004185347.80880-59-jlayton@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-29Merge tag 'sysctl-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linuxLinus Torvalds
Pull sysctl updates from Luis Chamberlain: "Long ago we set out to remove the kitchen sink on kernel/sysctl.c arrays and placing sysctls in their own subsystem or file to help avoid merge conflicts. Matthew Wilcox pointed out though that if we're going to do that we might as well also *save* space while at it and try to remove the extra last sysctl entry added at the end of each array, a sentinel, instead of bloating the kernel by adding a new sentinel with each array moved. Doing that was not so trivial, and has required slowing down the moves of kernel/sysctl.c arrays and measuring the impact on size by each new move. The complex part of the effort to help reduce the size of each sysctl is being done by the patient work of el señor Don Joel Granados. A lot of this is truly painful code refactoring and testing and then trying to measure the savings of each move and removing the sentinels. Although Joel already has code which does most of this work, experience with sysctl moves in the past shows that we need to be careful due to the slew of odd build failures that are possible due to the amount of random Kconfig options sysctls use. To that end Joel's work is split by first addressing the major housekeeping needed to remove the sentinels, which is part of this merge request. The rest of the work to actually remove the sentinels will be done later in future kernel releases. The preliminary math is showing this will all help reduce the overall build time size of the kernel and run time memory consumed by the kernel by about 64 bytes per array where we are able to remove each sentinel in the future. 
That also means there is no more bloating the kernel with the extra ~64 bytes per array moved as no new sentinels are created" * tag 'sysctl-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux: sysctl: Use ctl_table_size as stopping criteria for list macro sysctl: SIZE_MAX->ARRAY_SIZE in register_net_sysctl vrf: Update to register_net_sysctl_sz networking: Update to register_net_sysctl_sz netfilter: Update to register_net_sysctl_sz ax.25: Update to register_net_sysctl_sz sysctl: Add size to register_net_sysctl function sysctl: Add size arg to __register_sysctl_init sysctl: Add size to register_sysctl sysctl: Add a size arg to __register_sysctl_table sysctl: Add size argument to init_header sysctl: Add ctl_table_size to ctl_table_header sysctl: Use ctl_table_header in list_for_each_table_entry sysctl: Prefer ctl_table_header in proc_sysctl
2023-08-15sysctl: Use ctl_table_size as stopping criteria for list macroJoel Granados
This is a preparation commit to make it easy to remove the sentinel elements (empty end markers) from the ctl_table arrays. It both allows the systematic removal of the sentinels and adds the ctl_table_size variable to the stopping criteria of the list_for_each_table_entry macro that traverses all ctl_table arrays. Once all the sentinels are removed by subsequent commits, ctl_table_size will become the only stopping criterion in the macro. We don't actually remove any elements in this commit, but it sets things up for the removal process to take place. By adding header->ctl_table_size as an additional stopping criterion for the list_for_each_table_entry macro, it will execute until it finds an "empty" ->procname or until the size runs out. Therefore, if a ctl_table array with a sentinel is passed, its size will be too big (by one element) but it will stop on the sentinel. On the other hand, if a ctl_table array without a sentinel is passed, its size will be just right and there will be no need for a sentinel. Signed-off-by: Joel Granados <j.granados@samsung.com> Suggested-by: Jani Nikula <jani.nikula@linux.intel.com> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
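The two stopping conditions described above can be sketched as a plain loop. This is a userspace illustration, not the macro itself: `struct tbl_entry` and `count_entries()` are hypothetical, and the sketch checks the size before dereferencing the entry.

```c
#include <stddef.h>

/* Reduced, hypothetical stand-in for struct ctl_table. */
struct tbl_entry {
	const char *procname;
};

/*
 * Count entries, stopping at whichever comes first: an empty
 * ->procname (a sentinel) or the end of the declared size.
 */
size_t count_entries(const struct tbl_entry *table, size_t size)
{
	size_t n = 0;

	while (n < size && table[n].procname)
		n++;
	return n;
}
```

A three-element array whose last element is a sentinel and a two-element array without one both yield two entries, which is the transitional behavior the message describes.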
2023-08-15sysctl: Add size arg to __register_sysctl_initJoel Granados
This commit adds table_size to __register_sysctl_init in preparation for the removal of the sentinel elements in the ctl_table arrays (last empty markers). And though we do *not* remove any sentinels in this commit, we set things up by calculating the ctl_table array size with ARRAY_SIZE. We add a table_size argument to __register_sysctl_init and modify the register_sysctl_init macro to calculate the array size with ARRAY_SIZE. The original callers do not need to be updated as they will go through the new macro. Signed-off-by: Joel Granados <j.granados@samsung.com> Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-08-15sysctl: Add size to register_sysctlJoel Granados
This commit adds table_size to register_sysctl in preparation for the removal of the sentinel elements in the ctl_table arrays (last empty markers). And though we do *not* remove any sentinels in this commit, we set things up by either passing the table_size explicitly or using ARRAY_SIZE on the ctl_table arrays. We replace the register_sysctl function with a macro that will add the ARRAY_SIZE to the new register_sysctl_sz function. In this way the callers that are already using an array of ctl_table structs do not change. For the callers that pass a ctl_table array pointer, we pass the table_size to register_sysctl_sz instead of the macro. Signed-off-by: Joel Granados <j.granados@samsung.com> Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
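The macro-over-sized-function pattern can be sketched in userspace as below. All names (`demo_register`, `demo_register_sz`, `struct demo_table`) are hypothetical, and the sized function only echoes the size it receives so that the effect of the macro is visible.

```c
#include <stddef.h>

/* Simplified for this sketch; it has no array-type checking. */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Reduced, hypothetical stand-in for struct ctl_table. */
struct demo_table {
	const char *procname;
};

/* Hypothetical sized registration; returns the size it was given. */
size_t demo_register_sz(const char *path, const struct demo_table *table,
			size_t table_size)
{
	(void)path;
	(void)table;
	return table_size;
}

/* Array callers keep the two-argument form; the macro supplies the size. */
#define demo_register(path, table) \
	demo_register_sz(path, table, ARRAY_SIZE(table))
```

The macro is only correct when `table` is a true array. A caller that holds a pointer has to call the `_sz` variant with an explicit size, because ARRAY_SIZE applied to a pointer does not yield the element count.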
2023-08-15sysctl: Add a size arg to __register_sysctl_tableJoel Granados
We make these changes in order to prepare __register_sysctl_table and its callers for when we remove the sentinel element (empty element at the end of ctl_table arrays). We don't actually remove any sentinels in this commit, but we *do* make sure to use ARRAY_SIZE so the table_size is available when the removal occurs. We add a table_size argument to __register_sysctl_table and adjust callers, all of which pass ctl_table pointers and need an explicit call to ARRAY_SIZE. We implement a size calculation in register_net_sysctl in order to forward the size of the array pointer received from the network register calls. The new table_size argument does not yet have any effect in the init_header call which is still dependent on the sentinel's presence. table_size *does* however drive the `kzalloc` allocation in __register_sysctl_table with no adverse effects as the allocated memory is either one element greater than the calculated ctl_table array (for the calls in ipc_sysctl.c, mq_sysctl.c and ucount.c) or the exact size of the calculated ctl_table array (for the call from sysctl_net.c and register_sysctl). This approach will allow us to "just" remove the sentinel without further changes to __register_sysctl_table as table_size will represent the exact size for all the callers at that point. Signed-off-by: Joel Granados <j.granados@samsung.com> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-08-15sysctl: Add size argument to init_headerJoel Granados
In this commit, we add a table_size argument to the init_header function in order to initialize the ctl_table_size variable in ctl_table_header. Even though the size is not yet used, it is now initialized within the sysctl subsys. We need this commit for when we start adding the table_size arguments to the sysctl functions (e.g. register_sysctl, __register_sysctl_table and __register_sysctl_init). Note that in __register_sysctl_table we temporarily use a calculated size until we add the size argument to that function in subsequent commits. Signed-off-by: Joel Granados <j.granados@samsung.com> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-08-15sysctl: Use ctl_table_header in list_for_each_table_entryJoel Granados
We replace the ctl_table with the ctl_table_header pointer in list_for_each_table_entry which is the macro responsible for traversing the ctl_table arrays. This is a preparation commit that will make it easier to add the ctl_table array size (that will be added to ctl_table_header in subsequent commits) to the already existing loop logic based on empty ctl_table elements (so called sentinels). Signed-off-by: Joel Granados <j.granados@samsung.com> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-08-15sysctl: Prefer ctl_table_header in proc_sysctlJoel Granados
This is a preparation commit that replaces ctl_table with ctl_table_header as the pointer that is passed around in proc_sysctl.c. This will become necessary in subsequent commits when the size of the ctl_table array can no longer be calculated by searching for an empty sentinel (last empty ctl_table element) but will be carried along inside the ctl_table_header struct. Signed-off-by: Joel Granados <j.granados@samsung.com> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-08-09fs: pass the request_mask to generic_fillattrJeff Layton
generic_fillattr just fills in the entire stat struct indiscriminately today, copying data from the inode. There is at least one attribute (STATX_CHANGE_COOKIE) that can have side effects when it is reported, and we're looking at adding more with the addition of multigrain timestamps. Add a request_mask argument to generic_fillattr and have most callers just pass in the value that is passed to getattr. Have other callers (e.g. ksmbd) just pass in STATX_BASIC_STATS. Also move the setting of STATX_CHANGE_COOKIE into generic_fillattr. Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reviewed-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: "Paulo Alcantara (SUSE)" <pc@manguebit.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Jeff Layton <jlayton@kernel.org> Message-Id: <20230807-mgctime-v7-2-d1dec143a704@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-07-24procfs: convert to ctime accessor functionsJeff Layton
In later patches, we're going to change how the inode's ctime field is used. Switch to using accessor functions instead of raw accesses of inode->i_ctime. Acked-by: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Message-Id: <20230705190309.579783-65-jlayton@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-06-30sysctl: set variable sysctl_mount_point storage-class-specifier to staticTom Rix
smatch reports fs/proc/proc_sysctl.c:32:18: warning: symbol 'sysctl_mount_point' was not declared. Should it be static? This variable is only used in its defining file, so it should be static. Signed-off-by: Tom Rix <trix@redhat.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-06-28Merge tag 'v6.5-rc1-sysctl-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linuxLinus Torvalds
Pull sysctl updates from Luis Chamberlain: "The changes for sysctl are in line with prior efforts to stop usage of deprecated routines which incur recursion and also make it hard to remove the empty array element in each sysctl array declaration. The most difficult user to modify was parport which required a bit of re-thinking of how to declare shared sysctls there. Joel Granados has stepped up to the plate to do most of this work and eventual removal of register_sysctl_table(). That work ended up saving us about 1465 bytes according to bloat-o-meter. Since we gained a few bloat-o-meter karma points I moved two rather small sysctl arrays from kernel/sysctl.c leaving us only two more sysctl arrays to move left. Most changes have been tested on linux-next for about a month. The last straggler patches are a minor parport fix, changes to the sysctl kernel selftest to verify correctness and prevent regressions for the future change he made to provide an alternative solution for the special sysctl mount point target which was using the now deprecated sysctl child element. This is all prep work to now finally be able to remove the empty array element in all sysctl declarations / registrations which is expected to save us a few bytes all over the kernel. 
That work will be tested early after v6.5-rc1 is out"
* tag 'v6.5-rc1-sysctl-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux:
  sysctl: replace child with an enumeration
  sysctl: Remove debugging dump_stack
  test_sysclt: Test for registering a mount point
  test_sysctl: Add an option to prevent test skip
  test_sysctl: Add an unregister sysctl test
  test_sysctl: Group node sysctl test under one func
  test_sysctl: Fix test metadata getters
  parport: plug a sysctl register leak
  sysctl: move security keys sysctl registration to its own file
  sysctl: move umh sysctl registration to its own file
  signal: move show_unhandled_signals sysctl to its own file
  sysctl: remove empty dev table
  sysctl: Remove register_sysctl_table
  sysctl: Refactor base paths registrations
  sysctl: stop exporting register_sysctl_table
  parport: Removed sysctl related defines
  parport: Remove register_sysctl_table from parport_default_proc_register
  parport: Remove register_sysctl_table from parport_device_proc_register
  parport: Remove register_sysctl_table from parport_proc_register
  parport: Move magic number "15" to a define
2023-06-18sysctl: replace child with an enumerationJoel Granados
This is part of the effort to remove the empty element at the end of ctl_table structs. "child" was a deprecated element in this struct and was being used to differentiate between two types of ctl_tables: "normal" and "permanently empty".
What changed?:
* Replace "child" with an enumeration that will have two values: the default (0) and the permanently empty (1). The former is left at zero so when struct ctl_table is created with kzalloc or in a local context, it will have the zero value by default. We document the new enum with kdoc.
* Remove the "empty child" check from sysctl_check_table
* Remove count_subheaders function as there is no longer a need to calculate how many headers there are for every child
* Remove the recursive call to unregister_sysctl_table as there is no need to traverse down the child tree any longer
* Add a new SYSCTL_PERM_EMPTY_DIR binary flag
* Remove the last remnants of child from parport/procfs.c
Signed-off-by: Joel Granados <j.granados@samsung.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
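A minimal userspace sketch of the idea, assuming hypothetical names (`demo_table_type`, `demo_ctl_table`, and the helpers are illustrative and not copied from the kernel): because the default enumerator is 0, a zero-initialised entry is automatically a "normal" one, and only permanently empty directories need to set the field explicitly.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical names modelled on the commit text, not the kernel's own. */
enum demo_table_type {
	DEMO_TABLE_DEFAULT = 0,		/* what zeroed memory yields */
	DEMO_TABLE_PERMANENTLY_EMPTY,	/* directory that must stay empty */
};

struct demo_ctl_table {
	const char *procname;
	enum demo_table_type type;	/* takes over the role of "child" */
};

/* True when the entry marks a permanently empty directory. */
static inline int demo_is_permanently_empty(const struct demo_ctl_table *t)
{
	assert(t != NULL);
	return t->type == DEMO_TABLE_PERMANENTLY_EMPTY;
}

/* Allocate one zeroed entry; calloc() stands in for kzalloc() here. */
static inline struct demo_ctl_table *demo_alloc_entry(void)
{
	return calloc(1, sizeof(struct demo_ctl_table));
}
```

An entry obtained from `demo_alloc_entry()` reports the default type until a caller sets `type` to `DEMO_TABLE_PERMANENTLY_EMPTY`.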
2023-06-18sysctl: Remove debugging dump_stackJoel Granados
Remove unneeded dump_stack in __register_sysctl_table Signed-off-by: Joel Granados <j.granados@samsung.com> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-05-24tty, proc, kernfs, random: Use copy_splice_read()David Howells
Use copy_splice_read() for tty, procfs, kernfs and random files rather than going through generic_file_splice_read() as they just copy the file into the output buffer and don't splice pages. This avoids the need for them to have a ->read_folio() to satisfy filemap_splice_read(). Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> cc: Christoph Hellwig <hch@lst.de> cc: Jens Axboe <axboe@kernel.dk> cc: Al Viro <viro@zeniv.linux.org.uk> cc: John Hubbard <jhubbard@nvidia.com> cc: David Hildenbrand <david@redhat.com> cc: Matthew Wilcox <willy@infradead.org> cc: Miklos Szeredi <miklos@szeredi.hu> cc: Arnd Bergmann <arnd@arndb.de> cc: linux-block@vger.kernel.org cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org Link: https://lore.kernel.org/r/20230522135018.2742245-13-dhowells@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-05-23sysctl: Remove register_sysctl_tableJoel Granados
This is part of the general push to deprecate register_sysctl_paths and register_sysctl_table. After removing all the calling functions, we remove both the register_sysctl_table function and the documentation check that appeared in check-sysctl-docs awk script. We save 595 bytes with this change:

./scripts/bloat-o-meter vmlinux.1.refactor-base-paths vmlinux.2.remove-sysctl-table
add/remove: 2/8 grow/shrink: 1/0 up/down: 1154/-1749 (-595)
Function                                        old     new   delta
count_subheaders                                  -     983    +983
unregister_sysctl_table                          29     184    +155
__pfx_count_subheaders                            -      16     +16
__pfx_unregister_sysctl_table.part               16       -     -16
__pfx_register_leaf_sysctl_tables.constprop      16       -     -16
__pfx_count_subheaders.part                      16       -     -16
__pfx___register_sysctl_base                     16       -     -16
unregister_sysctl_table.part                    136       -    -136
__register_sysctl_base                          478       -    -478
register_leaf_sysctl_tables.constprop           524       -    -524
count_subheaders.part                           547       -    -547
Total: Before=21257652, After=21257057, chg -0.00%

[mcgrof: remove register_leaf_sysctl_tables and append_path too and add bloat-o-meter stats]
Signed-off-by: Joel Granados <j.granados@samsung.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Acked-by: Christian Brauner <brauner@kernel.org>
2023-05-23sysctl: stop exporting register_sysctl_tableJoel Granados
We make register_sysctl_table static because the only function calling it is in fs/proc/proc_sysctl.c (__register_sysctl_base). We remove it from the sysctl.h header and modify the documentation in both the header and proc_sysctl.c files to mention "register_sysctl" instead of "register_sysctl_table". This plus the commits that remove register_sysctl_table from parport save 217 bytes:

./scripts/bloat-o-meter .bsysctl/vmlinux.old .bsysctl/vmlinux.new
add/remove: 0/1 grow/shrink: 5/1 up/down: 458/-675 (-217)
Function                           old     new   delta
__register_sysctl_base               8     286    +278
parport_proc_register              268     379    +111
parport_device_proc_register       195     247     +52
kzalloc.constprop                  598     608     +10
parport_default_proc_register       62      69      +7
register_sysctl_table              291       -    -291
parport_sysctl_template           1288     904    -384
Total: Before=8603076, After=8602859, chg -0.00%

Signed-off-by: Joel Granados <j.granados@samsung.com>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-05-02sysctl: remove register_sysctl_paths()Luis Chamberlain
The deprecation for register_sysctl_paths() is over. We can rejoice as we nuke register_sysctl_paths(). The routine register_sysctl_table() was the only user left of register_sysctl_paths(), so we can now just open-code it there, moving over the implementation of what used to be __register_sysctl_paths(). The old dynamic struct ctl_table_set *set now simply points to sysctl_table_root.default_set. The old dynamic const struct ctl_path *path was being used in the routine register_sysctl_paths() with a static:

static const struct ctl_path null_path[] = { {} };

Since this is a null path we can now just simplify the old routine and remove its use, as it's always empty. This saves us a total of 230 bytes.

$ ./scripts/bloat-o-meter vmlinux.old vmlinux
add/remove: 2/7 grow/shrink: 1/1 up/down: 1015/-1245 (-230)
Function                                        old     new   delta
register_leaf_sysctl_tables.constprop             -     524    +524
register_sysctl_table                            22     497    +475
__pfx_register_leaf_sysctl_tables.constprop       -      16     +16
null_path                                         8       -      -8
__pfx_register_sysctl_paths                      16       -     -16
__pfx_register_leaf_sysctl_tables                16       -     -16
__pfx___register_sysctl_paths                    16       -     -16
__register_sysctl_base                           29      12     -17
register_sysctl_paths                            18       -     -18
register_leaf_sysctl_tables                     534       -    -534
__register_sysctl_paths                         620       -    -620
Total: Before=21259666, After=21259436, chg -0.00%

Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-04-27Merge tag 'mm-nonmm-stable-2023-04-27-16-01' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton: "Mainly singleton patches all over the place. Series of note are:
 - updates to scripts/gdb from Glenn Washburn
 - kexec cleanups from Bjorn Helgaas"
* tag 'mm-nonmm-stable-2023-04-27-16-01' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (50 commits)
  mailmap: add entries for Paul Mackerras
  libgcc: add forward declarations for generic library routines
  mailmap: add entry for Oleksandr
  ocfs2: reduce ioctl stack usage
  fs/proc: add Kthread flag to /proc/$pid/status
  ia64: fix an addr to taddr in huge_pte_offset()
  checkpatch: introduce proper bindings license check
  epoll: rename global epmutex
  scripts/gdb: add GDB convenience functions $lx_dentry_name() and $lx_i_dentry()
  scripts/gdb: create linux/vfs.py for VFS related GDB helpers
  uapi/linux/const.h: prefer ISO-friendly __typeof__
  delayacct: track delays from IRQ/SOFTIRQ
  scripts/gdb: timerlist: convert int chunks to str
  scripts/gdb: print interrupts
  scripts/gdb: raise error with reduced debugging information
  scripts/gdb: add a Radix Tree Parser
  lib/rbtree: use '+' instead of '|' for setting color.
  proc/stat: remove arch_idle_time()
  checkpatch: check for misuse of the link tags
  checkpatch: allow Closes tags with links
  ...
2023-04-13proc_sysctl: enhance documentationLuis Chamberlain
Expand documentation to clarify:
 o that paths don't need to exist for the new API callers
 o clarify that we *require* callers to keep the memory of the table around during the lifetime of the sysctls
 o annotate routines we are trying to deprecate and later remove
Cc: stable@vger.kernel.org # v5.17
Cc: Christian Brauner <brauner@kernel.org>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
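The lifetime requirement can be illustrated with a small userspace sketch. Everything here is a hypothetical stand-in (`demo_entry`, `demo_header`, `demo_register`), chosen only to show why it matters: the registration keeps the caller's pointer rather than copying the table, so the table must live at least as long as the registration. Static storage is the simplest way to satisfy that.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table entry: a name plus a pointer to the backing value. */
struct demo_entry {
	const char *procname;
	int *data;
};

/* Hypothetical registration handle: keeps the pointer, makes no copy. */
struct demo_header {
	const struct demo_entry *table;
	size_t count;
};

static int demo_value = 1;

/* Static storage: the table outlives any function that registers it. */
static const struct demo_entry demo_table[] = {
	{ .procname = "demo_value", .data = &demo_value },
};

/* Record the caller's table; the caller must keep it alive afterwards. */
static inline struct demo_header demo_register(const struct demo_entry *table,
					       size_t count)
{
	assert(table != NULL);
	return (struct demo_header){ .table = table, .count = count };
}

/* Register the static table above. */
static inline struct demo_header demo_register_static(void)
{
	return demo_register(demo_table,
			     sizeof(demo_table) / sizeof(demo_table[0]));
}
```

Had `demo_table` been a local array in a function that returns after registering, the handle would be left pointing at dead stack memory, which is the class of bug the documentation warns about.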
2023-04-13sysctl: clarify register_sysctl_init() base directory orderLuis Chamberlain
The relatively new docs I added, which hinted that the base directories needed to be created beforehand, are wrong; remove that incorrect comment. Eric has hinted at this twice already [0] [1]; I had just not verified it until now. Now that I have verified it, update the docs to relax the context described. [0] https://lkml.kernel.org/r/875ys0azt8.fsf@email.froward.int.ebiederm.org [1] https://lkml.kernel.org/r/87ftbiud6s.fsf@x220.int.ebiederm.org Cc: stable@vger.kernel.org # v5.17 Cc: Christian Brauner <brauner@kernel.org> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Suggested-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-04-13proc_sysctl: move helper which creates required subdirectoriesLuis Chamberlain
Move the code which creates the subdirectories for a ctl table into a helper routine to make it easier to review. Document the goal. This creates no functional changes. Reviewed-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
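As a rough illustration of what such a helper does, here is a userspace sketch that walks a slash-separated path and finds or creates each directory along the way. It is not the kernel routine: `demo_dir`, `demo_get_subdir`, and `demo_mkdir_p` are hypothetical names, the fixed-size child array is a simplification, and nothing is ever freed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_MAX_CHILDREN 8

/* Hypothetical in-memory directory node. */
struct demo_dir {
	char name[32];
	struct demo_dir *children[DEMO_MAX_CHILDREN];
	int nr_children;
};

/* Find the child named by the first len bytes of name, or create it. */
static struct demo_dir *demo_get_subdir(struct demo_dir *dir,
					const char *name, size_t len)
{
	for (int i = 0; i < dir->nr_children; i++) {
		struct demo_dir *c = dir->children[i];

		if (strlen(c->name) == len && memcmp(c->name, name, len) == 0)
			return c;
	}
	if (dir->nr_children == DEMO_MAX_CHILDREN || len >= sizeof(dir->name))
		return NULL;

	struct demo_dir *c = calloc(1, sizeof(*c));
	if (!c)
		return NULL;
	memcpy(c->name, name, len);	/* calloc() already zeroed the NUL */
	dir->children[dir->nr_children++] = c;
	return c;
}

/* Walk "a/b/c" one component at a time, creating missing directories. */
static struct demo_dir *demo_mkdir_p(struct demo_dir *root, const char *path)
{
	struct demo_dir *dir = root;
	const char *name = path;

	assert(root != NULL && path != NULL);
	while (dir && *name) {
		const char *end = strchr(name, '/');
		size_t len = end ? (size_t)(end - name) : strlen(name);

		if (len)	/* skip empty components such as "//" */
			dir = demo_get_subdir(dir, name, len);
		if (!end)
			break;
		name = end + 1;
	}
	return dir;	/* deepest directory, or NULL on failure */
}
```

Calling `demo_mkdir_p(&root, "net/ipv4/conf")` on an empty root creates three nested nodes; a later call with `"net/ipv4"` reuses the existing ones instead of creating duplicates.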
2023-04-13proc_sysctl: update docs for __register_sysctl_table()Luis Chamberlain
Update the docs for __register_sysctl_table() to make it clear no child entries can be passed. When child is set, these are non-leaf entries on the ctl table and sysctl treats them as directories. The point of __register_sysctl_table() is to deal only with directories, not part of the ctl table where they may reside, to be simple and avoid recursion. While at it, hint towards using long on extra1 and extra2 later. Cc: stable@vger.kernel.org # v5.17 Cc: Christian Brauner <brauner@kernel.org> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>