summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2025-02-09spinlock: extend guard with spinlock_bh variantsChristian Marangi
Extend guard APIs with missing raw/spinlock_bh variants. Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-02-08crypto: crct10dif - remove from crypto APIEric Biggers
Remove the "crct10dif" shash algorithm from the crypto API. It has no known user now that the lib is no longer built on top of it. It has no remaining references in kernel code. The only other potential users would be the usual components that allow specifying arbitrary hash algorithms by name, namely AF_ALG and dm-integrity. However there are no indications that "crct10dif" is being used with these components. Debian Code Search and web searches don't find anything relevant, and explicitly grepping the source code of the usual suspects (cryptsetup, libell, iwd) finds no matches either. "crc32" and "crc32c" are used in a few more places, but that doesn't seem to be the case for "crct10dif". crc_t10dif_update() is also tested by crc_kunit now, so the test coverage provided via the crypto self-tests is no longer needed. Also note that the "crct10dif" shash algorithm was inconsistent with the rest of the shash API in that it wrote the digest in CPU endianness, making the resulting byte array differ on little endian vs. big endian platforms. This means it was effectively just built for use by the lib functions, and it was not actually correct to treat it as "just another hash function" that could be dropped in via the shash API. Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20250206173857.39794-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2025-02-08lib/crc32: remove "_le" from crc32c base and arch functionsEric Biggers
Following the standardization on crc32c() as the lib entry point for the Castagnoli CRC32 instead of the previous mix of crc32c(), crc32c_le(), and __crc32c_le(), make the same change to the underlying base and arch functions that implement it. Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250208024911.14936-7-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2025-02-08lib/crc32: rename __crc32c_le_combine() to crc32c_combine()Eric Biggers
Since the Castagnoli CRC32 is now always just crc32c(), rename __crc32c_le_combine() and __crc32c_le_shift() accordingly. Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250208024911.14936-6-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2025-02-08lib/crc32: standardize on crc32c() name for Castagnoli CRC32Eric Biggers
For historical reasons, the Castagnoli CRC32 is available under 3 names: crc32c(), crc32c_le(), and __crc32c_le(). Most callers use crc32c(). The more verbose versions are not really warranted; there is no "_be" version that the "_le" version needs to be differentiated from, and the leading underscores are pointless. Therefore, let's standardize on just crc32c(). Remove the other two names, and update callers accordingly. Specifically, the new crc32c() comes from what was previously __crc32c_le(), so compared to the old crc32c() it now takes a size_t length rather than unsigned int, and it's now in linux/crc32.h instead of just linux/crc32c.h (which includes linux/crc32.h). Later patches will also rename __crc32c_le_combine(), crc32c_le_base(), and crc32c_le_arch(). Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250208024911.14936-5-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2025-02-08lib/crc32: don't bother with pure and const function attributesEric Biggers
Drop the use of __pure and __attribute_const__ from the CRC32 library functions that had them. Both of these are unusual optimizations that don't help properly written code. They seem more likely to cause problems than have any real benefit. Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250208024911.14936-4-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2025-02-08lib/crc32: use void pointer for dataEric Biggers
Update crc32_le(), crc32_be(), and __crc32c_le() to take the data as a 'const void *' instead of 'const u8 *'. This makes them slightly easier to use, as it can eliminate the need for casts in the calling code. It's the only pointer argument, so there is no possibility for confusion with another pointer argument. Also, some of the CRC library functions, for example crc32c() and crc64_be(), already used 'const void *'. Let's standardize on that, as it seems like a better choice. The underlying base and arch functions continue to use 'const u8 *', as that is often more convenient for the implementation. Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250208024911.14936-3-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2025-02-08lib/crc64: add support for arch-optimized implementationsEric Biggers
Add support for architecture-optimized implementations of the CRC64 library functions, following the approach taken for the CRC32 and CRC-T10DIF library functions. Also take the opportunity to tweak the function prototypes: - Use 'const void *' for the lib entry points (since this is easier for users) but 'const u8 *' for the underlying arch and generic functions (since this is easier for the implementations of these functions). - Don't bother with __pure. It's an unusual optimization that doesn't help properly written code. It's a weird quirk we can do without. Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com> Acked-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20250130035130.180676-6-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2025-02-08lib/crc64: rename CRC64-Rocksoft to CRC64-NVMEEric Biggers
This CRC64 variant comes from the NVME NVM Command Set Specification (https://nvmexpress.org/wp-content/uploads/NVM-Express-NVM-Command-Set-Specification-1.0e-2024.07.29-Ratified.pdf). The "Rocksoft Model CRC Algorithm", published in 1993 and available at https://www.zlib.net/crc_v3.txt, is a generalized CRC algorithm that can calculate any variant of CRC, given a list of parameters such as polynomial, bit order, etc. It is not a CRC variant. The NVME NVM Command Set Specification has a table that gives the "Rocksoft Model Parameters" for the CRC variant it uses. When support for this CRC variant was added to Linux, this table seems to have been misinterpreted as naming the CRC variant the "Rocksoft" CRC. In fact, the table names the CRC variant as the "NVM Express 64b CRC". Most implementations of this CRC variant outside Linux have been calling it CRC64-NVME. Therefore, update Linux to match. While at it, remove the superfluous "update" from the function name, so crc64_rocksoft_update() is now just crc64_nvme(), matching most of the other CRC library functions. Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com> Acked-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20250130035130.180676-4-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2025-02-08crypto: crc64-rocksoft - remove from crypto APIEric Biggers
Remove crc64-rocksoft from the crypto API. It has no known user now that the lib is no longer built on top of it. It was also added much more recently than the longstanding crc32 and crc32c. Unlike crc32 and crc32c, crc64-rocksoft is also not mentioned in the dm-integrity documentation and there are no references to it in anywhere in the cryptsetup git repo, so it is unlikely to have any user there either. Also, this CRC variant is named incorrectly; it has nothing to do with Rocksoft and should be called crc64-nvme. That is yet another reason to remove it from the crypto API; we would not want anyone to start depending on the current incorrect algorithm name of crc64-rocksoft. Note that this change temporarily makes this CRC variant not be covered by any tests, as previously it was relying on the crypto self-tests. This will be fixed by adding this CRC variant to crc_kunit. Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com> Acked-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20250130035130.180676-3-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2025-02-08lib/crc64-rocksoft: stop wrapping the crypto APIEric Biggers
Following what was done for the CRC32 and CRC-T10DIF library functions, get rid of the pointless use of the crypto API and make crc64_rocksoft_update() call into the library directly. This is faster and simpler. Remove crc64_rocksoft() (the version of the function that did not take a 'crc' argument) since it is unused. Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com> Acked-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20250130035130.180676-2-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2025-02-08Merge tag 'hardening-v6.14-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardening fixes from Kees Cook: "Address a KUnit stack initialization regression that got tickled on m68k, and solve a Clang(v14 and earlier) bug found by 0day: - Fix stackinit KUnit regression on m68k - Use ARRAY_SIZE() for memtostr*()/strtomem*()" * tag 'hardening-v6.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: string.h: Use ARRAY_SIZE() for memtostr*()/strtomem*() compiler.h: Introduce __must_be_byte_array() compiler.h: Move C string helpers into C-only kernel section stackinit: Fix comment for test_small_end stackinit: Keep selftest union size small on m68k
2025-02-08Merge tag 'i2c-for-6.14-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c reverts from Wolfram Sang: "It turned out the new mechanism for handling created devices does not handle all muxing cases. Revert the changes to give a proper solution more time" * tag 'i2c-for-6.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: Revert "i2c: Replace list-based mechanism for handling auto-detected clients" Revert "i2c: Replace list-based mechanism for handling userspace-created clients"
2025-02-08iio: gts-helper: export iio_gts_get_total_gain()Javier Carrasco
Export this function in preparation for the fix in veml6030.c, where the total gain can be used to ease the calculation of the processed value of the IIO_LIGHT channel compared to acquiring the scale in NANO. Suggested-by: Matti Vaittinen <mazziesaccount@gmail.com> Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Reviewed-by: Matti Vaittinen <mazziesaccount@gmail.com> Link: https://patch.msgid.link/20250127-veml6030-scale-v3-1-4f32ba03df94@gmail.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-02-07bpf: define KF_ARENA_* flags for bpf_arena kfuncsIhor Solodrai
bpf_arena_alloc_pages() and bpf_arena_free_pages() work with the bpf_arena pointers [1], which is indicated by the __arena macro in the kernel source code: #define __arena __attribute__((address_space(1))) However currently this information is absent from the debug data in the vmlinux binary. As a consequence, bpf_arena_* kfuncs declarations in vmlinux.h (produced by bpftool) do not match prototypes expected by the BPF programs attempting to use these functions. Introduce a set of kfunc flags to mark relevant types as bpf_arena pointers. The flags then can be detected by pahole when generating BTF from vmlinux's DWARF, allowing it to emit corresponding BTF type tags for the marked kfuncs. With recently proposed BTF extension [2], these type tags will be processed by bpftool when dumping vmlinux.h, and corresponding compiler attributes will be added to the declarations. [1] https://lwn.net/Articles/961594/ [2] https://lore.kernel.org/bpf/20250130201239.1429648-1-ihor.solodrai@linux.dev/ Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Link: https://lore.kernel.org/r/20250206003148.2308659-1-ihor.solodrai@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-02-07Revert "net: skb: introduce and use a single page frag cache"Paolo Abeni
This reverts commit dbae2b062824 ("net: skb: introduce and use a single page frag cache"). The intended goal of such change was to counter a performance regression introduced by commit 3226b158e67c ("net: avoid 32 x truesize under-estimation for tiny skbs"). Unfortunately, the blamed commit introduces another regression for the virtio_net driver. Such a driver calls napi_alloc_skb() with a tiny size, so that the whole head frag could fit a 512-byte block. The single page frag cache uses a 1K fragment for such allocation, and the additional overhead, under small UDP packets flood, makes the page allocator a bottleneck. Thanks to commit bf9f1baa279f ("net: add dedicated kmem_cache for typical/small skb->head"), this revert does not re-introduce the original regression. Actually, in the relevant test on top of this revert, I measure a small but noticeable positive delta, just above noise level. The revert itself required some additional mangling due to the introduction of the SKB_HEAD_ALIGN() helper and local lock infra in the affected code. Suggested-by: Eric Dumazet <edumazet@google.com> Fixes: dbae2b062824 ("net: skb: introduce and use a single page frag cache") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Link: https://patch.msgid.link/e649212fde9f0fdee23909ca0d14158d32bb7425.1738877290.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-07io_uring,lsm,selinux: add LSM hooks for io_uring_setup()Hamza Mahfooz
It is desirable to allow LSM to configure accessibility to io_uring because it is a coarse yet very simple way to restrict access to it. So, add an LSM for io_uring_allowed() to guard access to io_uring. Cc: Paul Moore <paul@paul-moore.com> Signed-off-by: Hamza Mahfooz <hamzamahfooz@linux.microsoft.com> Acked-by: Jens Axboe <axboe@kernel.dk> [PM: merge fuzz due to changes in preceding patches, subj tweak] Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-02-07spi: add offload TX/RX streaming APIsDavid Lechner
Most configuration of SPI offloads is handled opaquely using the offload pointer that is passed to the various offload functions. However, there are some offload features that need to be controlled on a per transfer basis. This patch adds a flag field to struct spi_transfer to allow specifying such features. The first feature to be added is the ability to stream data to/from a hardware sink/source rather than using a tx or rx buffer. Additional flags can be added in the future as needed. A flags field is also added to the offload struct for providers to indicate which flags are supported. This allows for generic checking of offload capabilities during __spi_validate() so that each offload provider doesn't have to implement their own validation. As a first users of this streaming capability, getter functions are added to get a DMA channel that is directly connected to the offload. Peripheral drivers will use this to get a DMA channel and configure it to suit their needs. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Nuno Sa <nuno.sa@analog.com> Signed-off-by: David Lechner <dlechner@baylibre.com> Link: https://patch.msgid.link/20250207-dlech-mainline-spi-engine-offload-2-v8-5-e48a489be48c@baylibre.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-02-07spi: offload: add support for hardware triggersDavid Lechner
Extend SPI offloading to support hardware triggers. This allows an arbitrary hardware trigger to be used to start a SPI transfer that was previously set up with spi_optimize_message(). A new struct spi_offload_trigger is introduced that can be used to configure any type of trigger. It has a type discriminator and a union to allow it to be extended in the future. Two trigger types are defined to start with. One is a trigger that indicates that the SPI peripheral is ready to read or write data. The other is a periodic trigger to repeat a SPI message at a fixed rate. There is also a spi_offload_hw_trigger_validate() function that works similar to clk_round_rate(). It basically asks the question of if we enabled the hardware trigger what would the actual parameters be. This can be used to test if the requested trigger type is actually supported by the hardware and for periodic triggers, it can be used to find the actual rate that the hardware is capable of. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Nuno Sa <nuno.sa@analog.com> Signed-off-by: David Lechner <dlechner@baylibre.com> Link: https://patch.msgid.link/20250207-dlech-mainline-spi-engine-offload-2-v8-2-e48a489be48c@baylibre.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-02-07spi: add basic support for SPI offloadingDavid Lechner
Add the basic infrastructure to support SPI offload providers and consumers. SPI offloading is a feature that allows the SPI controller to perform transfers without any CPU intervention. This is useful, e.g. for high-speed data acquisition. SPI controllers with offload support need to implement the get_offload and put_offload callbacks and can use the devm_spi_offload_alloc() to allocate offload instances. SPI peripheral drivers will call devm_spi_offload_get() to get a reference to the matching offload instance. This offload instance can then be attached to a SPI message to request offloading that message. It is expected that SPI controllers with offload support will check for the offload instance in the SPI message in the ctlr->optimize_message() callback and handle it accordingly. CONFIG_SPI_OFFLOAD is intended to be a select-only option. Both consumer and provider drivers should `select SPI_OFFLOAD` in their Kconfig to ensure that the SPI core is built with offload support. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Nuno Sa <nuno.sa@analog.com> Signed-off-by: David Lechner <dlechner@baylibre.com> Link: https://patch.msgid.link/20250207-dlech-mainline-spi-engine-offload-2-v8-1-e48a489be48c@baylibre.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-02-07bus: mhi: host: Remove unused functionsDr. David Alan Gilbert
mhi_device_get() and mhi_queue_dma() haven't been used since 2020's commit 189ff97cca53 ("bus: mhi: core: Add support for data transfer") added them. Remove them. Note that mhi_queue_dma_sync() is used and has been left. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20250127215859.261105-1-linux@treblig.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2025-02-07Merge tag 'vfs-6.14-rc2.fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: - Fix fsnotify FMODE_NONOTIFY* handling. This also disables fsnotify on all pseudo files by default apart from very select exceptions. This carries a regression risk so we need to watch out and adapt accordingly. However, it is overall a significant improvement over the current status quo where every rando file can get fsnotify enabled. - Cleanup and simplify lockref_init() after recent lockref changes. - Fix vboxfs build with gcc-15. - Add an assert into inode_set_cached_link() to catch corrupt links. - Allow users to also use an empty string check to detect whether a given mount option string was empty or not. - Fix how security options were appended to statmount()'s ->mnt_opt field. - Fix statmount() selftests to always check the returned mask. - Fix uninitialized value in vfs_statx_path(). - Fix pidfs_ioctl() sanity checks to guard against ioctl() overloading and preserve extensibility. * tag 'vfs-6.14-rc2.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: vfs: sanity check the length passed to inode_set_cached_link() pidfs: improve ioctl handling fsnotify: disable pre-content and permission events by default selftests: always check mask returned by statmount(2) fsnotify: disable notification by default for all pseudo files fs: fix adding security options to statmount.mnt_opt fsnotify: use accessor to set FMODE_NONOTIFY_* lockref: remove count argument of lockref_init gfs2: switch to lockref_init(..., 1) gfs2: use lockref_init for gl_lockref statmount: let unset strings be empty vboxsf: fix building with GCC 15 fs/stat.c: avoid harmless garbage value problem in vfs_statx_path()
2025-02-07fs: dcache: move the sysctl to fs/dcache.cKaixiong Yu
The sysctl_vfs_cache_pressure belongs to fs/dcache.c, move it to fs/dcache.c from kernel/sysctl.c. As a part of fs/dcache.c cleaning, sysctl_vfs_cache_pressure is changed to a static variable, and change the inline-type function vfs_pressure_ratio() to out-of-inline type, export vfs_pressure_ratio() with EXPORT_SYMBOL_GPL to be used by other files. Move the unneeded include(linux/dcache.h). Signed-off-by: Kaixiong Yu <yukaixiong@huawei.com> Reviewed-by: Kees Cook <kees@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christian Brauner <brauner@kernel.org> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-02-07fs: drop_caches: move sysctl to fs/drop_caches.cKaixiong Yu
The sysctl_drop_caches to fs/drop_caches.c, move it to fs/drop_caches.c from /kernel/sysctl.c. And remove the useless extern variable declaration from include/linux/mm.h Signed-off-by: Kaixiong Yu <yukaixiong@huawei.com> Reviewed-by: Kees Cook <kees@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christian Brauner <brauner@kernel.org> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-02-07fs: fs-writeback: move sysctl to fs/fs-writeback.cKaixiong Yu
The dirtytime_expire_interval belongs to fs/fs-writeback.c, move it to fs/fs-writeback.c from /kernel/sysctl.c. And remove the useless extern variable declaration and the function declaration from include/linux/writeback.h Signed-off-by: Kaixiong Yu <yukaixiong@huawei.com> Reviewed-by: Kees Cook <kees@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-02-07mm: nommu: move sysctl to mm/nommu.cKaixiong Yu
The sysctl_nr_trim_pages belongs to nommu.c, move it to mm/nommu.c from /kernel/sysctl.c. And remove the useless extern variable declaration from include/linux/mm.h Signed-off-by: Kaixiong Yu <yukaixiong@huawei.com> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-02-07mm: util: move sysctls to mm/util.cKaixiong Yu
This moves all util related sysctls to mm/util.c, as part of the kernel/sysctl.c cleaning, also removes redundant external variable declarations and function declarations. Signed-off-by: Kaixiong Yu <yukaixiong@huawei.com> Reviewed-by: Kees Cook <kees@kernel.org> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-02-07mm: vmscan: move vmscan sysctls to mm/vmscan.cKaixiong Yu
This moves vm_swappiness and zone_reclaim_mode to mm/vmscan.c, as part of the kernel/sysctl.c cleaning, also moves some external variable declarations and function declarations from include/linux/swap.h into mm/internal.h. Signed-off-by: Kaixiong Yu <yukaixiong@huawei.com> Reviewed-by: Kees Cook <kees@kernel.org> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-02-07mm: swap: move sysctl to mm/swap.cKaixiong Yu
The page-cluster belongs to mm/swap.c, move it to mm/swap.c . Removes the redundant external variable declaration and unneeded include(linux/swap.h). Signed-off-by: Kaixiong Yu <yukaixiong@huawei.com> Reviewed-by: Kees Cook <kees@kernel.org> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-02-07mm: filemap: move sysctl to mm/filemap.cKaixiong Yu
This moves the filemap related sysctl to mm/filemap.c, and removes the redundant external variable declaration. Signed-off-by: Kaixiong Yu <yukaixiong@huawei.com> Reviewed-by: Kees Cook <kees@kernel.org> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-02-07mm: vmstat: move sysctls to mm/vmstat.cKaixiong Yu
This moves all vmstat related sysctls to its own file, removes useless extern variable declarations, and do some related clean-ups. To avoid compiler warnings when CONFIG_PROC_FS is not defined, add the macro definition CONFIG_PROC_FS ahead CONFIG_NUMA in vmstat.c. Signed-off-by: Kaixiong Yu <yukaixiong@huawei.com> Reviewed-by: Kees Cook <kees@kernel.org> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-02-07HID: core: Add reserved item tag for main itemsTatsuya S
For main items, separate warning of reserved item tag from warning of unknown item tag. This comes from 6.2.2.4 Main Items of Device Class Definition for HID 1.11 specification. Signed-off-by: Tatsuya S <tatsuya.s2862@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.com>
2025-02-07of: base: Add of_get_available_child_by_name()Biju Das
There are lot of drivers using of_get_child_by_name() followed by of_device_is_available() to find the available child node by name for a given parent. Provide a helper for these users to simplify the code. Suggested-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-02-07pid: perform free_pid() calls outside of tasklist_lockMateusz Guzik
As the clone side already executes pid allocation with only pidmap_lock held, issuing free_pid() while still holding tasklist_lock exacerbates total hold time of the latter. More things may show up later which require initial clean up with the lock held and allow finishing without it. For that reason a struct to collect such work is added instead of merely passing the pid array. Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Link: https://lore.kernel.org/r/20250206164415.450051-5-mjguzik@gmail.com Acked-by: "Liam R. Howlett" <Liam.Howlett@Oracle.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-02-07vfs: sanity check the length passed to inode_set_cached_link()Mateusz Guzik
This costs a strlen() call when instatianating a symlink. Preferably it would be hidden behind VFS_WARN_ON (or compatible), but there is no such facility at the moment. With the facility in place the call can be patched out in production kernels. In the meantime, since the cost is being paid unconditionally, use the result to a fixup the bad caller. This is not expected to persist in the long run (tm). Sample splat: bad length passed for symlink [/tmp/syz-imagegen43743633/file0/file0] (got 131109, expected 37) [rest of WARN blurp goes here] Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Link: https://lore.kernel.org/r/20250204213207.337980-1-mjguzik@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-02-07fsnotify: use accessor to set FMODE_NONOTIFY_*Amir Goldstein
The FMODE_NONOTIFY_* bits are a 2-bits mode. Open coding manipulation of those bits is risky. Use an accessor file_set_fsnotify_mode() to set the mode. Rename file_set_fsnotify_mode() => file_set_fsnotify_mode_from_watchers() to make way for the simple accessor name. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Link: https://lore.kernel.org/r/20250203223205.861346-2-amir73il@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-02-07lockref: remove count argument of lockref_initAndreas Gruenbacher
All users of lockref_init() now initialize the count to 1, so hardcode that and remove the count argument. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Link: https://lore.kernel.org/r/20250130135624.1899988-4-agruenba@redhat.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-02-07cpufreq: Remove cpufreq_enable_boost_support()Viresh Kumar
Remove the now unused helper, cpufreq_enable_boost_support(). Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-02-07cpufreq: staticize policy_has_boost_freq()Viresh Kumar
policy_has_boost_freq() isn't used outside of freq_table.c now, mark it static. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-02-07cpufreq: Introduce policy->boost_supported flagViresh Kumar
It is possible to have a scenario where not all cpufreq policies support boost frequencies. And letting sysfs (or other parts of the kernel) enable boost feature for that policy isn't correct. Add a new flag, boost_supported, which will be set to true by the cpufreq core only if the freq table contains valid boost frequencies. Some cpufreq drivers though don't have boost frequencies in the freq-table, they can set this flag from their ->init() callbacks. Once all the drivers are updated to set the flag correctly, we can check it before enabling boost feature for a policy. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-02-07cpufreq: Export cpufreq_boost_set_sw()Viresh Kumar
This will be used directly by cpufreq driver going forward, export it. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-02-07cpufreq: staticize cpufreq_boost_trigger_state()Viresh Kumar
cpufreq_boost_trigger_state() is only used by cpufreq core, mark it static. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-02-07cpufreq: Remove cpufreq_generic_attrsViresh Kumar
All users of cpufreq_generic_attr are migrated now, remove it. While at it, also stop exporting attributes for available and boost frequencies as they are only used by cpufreq core now. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Rafael J. Wysocki <rafael@kernel.org>
2025-02-06string.h: Use ARRAY_SIZE() for memtostr*()/strtomem*()Kees Cook
The destination argument of memtostr*() and strtomem*() must be a fixed-size char array at compile time, so there is no need to use __builtin_object_size() (which is useful for when an argument is either a pointer or unknown). Instead use ARRAY_SIZE(), which has the benefit of working around a bug in Clang (fixed[1] in 15+) that got __builtin_object_size() wrong sometimes. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202501310832.kiAeOt2z-lkp@intel.com/ Suggested-by: Kent Overstreet <kent.overstreet@linux.dev> Link: https://github.com/llvm/llvm-project/commit/d8e0a6d5e9dd2311641f9a8a5d2bf90829951ddc [1] Tested-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Kees Cook <kees@kernel.org>
2025-02-06compiler.h: Introduce __must_be_byte_array()Kees Cook
In preparation for adding stricter type checking to the str/mem*() helpers, provide a way to check that a variable is a byte array via __must_be_byte_array(). Suggested-by: Kent Overstreet <kent.overstreet@linux.dev> Signed-off-by: Kees Cook <kees@kernel.org>
2025-02-06compiler.h: Move C string helpers into C-only kernel sectionKees Cook
The C kernel helpers for evaluating C Strings were positioned where they were visible to assembly inclusion, which was not intended. Move them into the kernel and C-only area of the header so future changes won't confuse the assembler. Fixes: d7a516c6eeae ("compiler.h: Fix undefined BUILD_BUG_ON_ZERO()") Fixes: 559048d156ff ("string: Check for "nonstring" attribute on strscpy() arguments") Reviewed-by: Miguel Ojeda <ojeda@kernel.org> Signed-off-by: Kees Cook <kees@kernel.org>
2025-02-06Merge branch 'for-next' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== ice: managing MSI-X in driver Michal Swiatkowski says: It is another try to allow user to manage amount of MSI-X used for each feature in ice. First was via devlink resources API, it wasn't accepted in upstream. Also static MSI-X allocation using devlink resources isn't really user friendly. This try is using more dynamic way. "Dynamic" across whole kernel when platform supports it and "dynamic" across the driver when not. To achieve that reuse global devlink parameter pf_msix_max and pf_msix_min. It fits how ice hardware counts MSI-X. In case of ice amount of MSI-X reported on PCI is a whole MSI-X for the card (with MSI-X for VFs also). Having pf_msix_max allow user to statically set how many MSI-X he wants on PF and how many should be reserved for VFs. pf_msix_min is used to set minimum number of MSI-X with which ice driver should probe correctly. Meaning of this field in case of dynamic vs static allocation: - on system with dynamic MSI-X allocation support * alloc pf_msix_min as static, rest will be allocated dynamically - on system without dynamic MSI-X allocation support * try alloc pf_msix_max as static, minimum acceptable result is pf_msix_min As Jesse and Piotr suggested pf_msix_max and pf_msix_min can (an probably should) be stored in NVM. This patchset isn't implementing that. Dynamic (kernel or driver) way means that splitting MSI-X across the RDMA and eth in case there is a MSI-X shortage isn't correct. Can work when dynamic is only on driver site, but can't when dynamic is on kernel site. Let's remove this code and move to MSI-X allocation feature by feature. If there is no more MSI-X for a feature, a feature is working with less MSI-X or it is turned off. There is a regression here. With MSI-X splitting user can run RDMA and eth even on system with not enough MSI-X. Now only eth will work. RDMA can be turned on by changing number of PF queues (lowering) and reprobe RDMA driver. Example: 72 CPU number, eth, RDMA and flow director (1 MSI-X), 1 MSI-X for OICR on PF, and 1 more for RDMA. Card is using 1 + 72 + 1 + 72 + 1 = 147. We set pf_msix_min = 2, pf_msix_max = 128 OICR: 1 eth: 72 flow director: 1 RDMA: 128 - 74 = 54 We can change number of queues on pf to 36 and do devlink reinit OICR: 1 eth: 36 RDMA: 73 flow director: 1 We can also (implemented in "ice: enable_rdma devlink param") turned RDMA off. OICR: 1 eth: 72 RDMA: 0 (turned off) flow director: 1 After this changes we have a static base vector for SRIOV (SIOV probably in the feature). Last patch from this series is simplifying managing VF MSI-X code based on static vector. Now changing queues using ethtool is also changing MSI-X. If there is enough MSI-X it is always one to one. When there is not enough there will be more queues than MSI-X. * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: ice: init flow director before RDMA ice: simplify VF MSI-X managing ice: enable_rdma devlink param ice: treat dyn_allowed only as suggestion ice, irdma: move interrupts code to irdma ice: get rid of num_lan_msix field ice: remove splitting MSI-X between features ice: devlink PF MSI-X max and min parameter ice: count combined queues using Rx/Tx count ==================== Link: https://patch.msgid.link/20250205185512.895887-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-06net: add dev_net_rcu() helperEric Dumazet
dev->nd_net can change, readers should either use rcu_read_lock() or RTNL. We currently use a generic helper, dev_net() with no debugging support. We probably have many hidden bugs. Add dev_net_rcu() helper for callers using rcu_read_lock() protection. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250205155120.1676781-2-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-06Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR (net-6.14-rc2). No conflicts or adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-06efi/cper, cxl: Remove cper_cxl.hSmita Koralahalli
Move the declaration of cxl_cper_print_prot_err() to include/linux/cper.h to avoid maintaining a separate header file just for this function declaration. Remove drivers/firmware/efi/cper_cxl.h as its contents have been reorganized. No functional changes. Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Fan Ni <fan.ni@samsung.com> Reviewed-by: Gregory Price <gourry@gourry.net> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/20250123084421.127697-4-Smita.KoralahalliChannabasappa@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>