summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-03-04btrfs: handle invalid root reference found in btrfs_init_root_free_objectid()David Sterba
The btrfs_init_root_free_objectid() looks up a root by a key, allowing to do an inexact search when key->offset is -1. It's never expected to find such item, as it would break the allowed range of a root id. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: handle invalid root reference found in btrfs_find_root()David Sterba
The btrfs_find_root() looks up a root by a key, allowing to do an inexact search when key->offset is -1. It's never expected to find such item, as it would break allowed the range of a root id. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: handle root deletion lookup error in btrfs_del_root()David Sterba
We're deleting a root and looking it up by key does not succeed, this is an inconsistent state and we can't do anything. All callers handle errors and abort a transaction. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: handle block group lookup error when it's being removedDavid Sterba
The unlikely case of lookup error in btrfs_remove_block_group() can be handled properly, in its caller this would lead to a transaction abort. We can't do anything else, a block group must have been loaded first. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: handle invalid range and start in merge_extent_mapping()David Sterba
Turn a BUG_ON to a properly handled error and update the error message in the caller. It is expected that @em_in and @start passed to btrfs_add_extent_mapping() overlap. Besides tests, the only caller btrfs_get_extent() makes sure this is true. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: handle directory and dentry mismatch in btrfs_may_delete()David Sterba
The helper btrfs_may_delete() is a copy of generic fs/namei.c:may_delete() to verify various conditions before deletion. There's a BUG_ON added before linux.git started, we can turn it to a proper error handling at least in our local helper. A mistmatch between directory and the deleted dentry is clearly invalid. This won't be probably ever hit due to the way how the parameters are set from the caller btrfs_ioctl_snap_destroy(), using a VFS helper lookup_one(). Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: use READ/WRITE_ONCE for fs_devices->read_policyNaohiro Aota
Since we can read/modify the value from the sysfs interface concurrently, it would be better to protect it from compiler optimizations. Currently, there is only one read policy BTRFS_READ_POLICY_PID available, so no actual problem can happen now. This is a preparation for the future expansion. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: preallocate temporary extent buffer for inode logging when neededFilipe Manana
When logging an inode and we require to copy items from subvolume leaves to the log tree, we clone each subvolume leaf and than use that clone to copy items to the log tree. This is required to avoid possible deadlocks as stated in commit 796787c978ef ("btrfs: do not modify log tree while holding a leaf from fs tree locked"). The cloning requires allocating an extent buffer (struct extent_buffer) and then allocating pages (folios) to attach to the extent buffer. This may be slow in case we are under memory pressure, and since we are doing the cloning while holding a read lock on a subvolume leaf, it means we can be blocking other operations on that leaf for significant periods of time, which can increase latency on operations like creating other files, renaming files, etc. Similarly because we're under a log transaction, we may also cause extra delay on other tasks doing an fsync, because syncing the log requires waiting for tasks that joined a log transaction to exit the transaction. So to improve this, for any inode logging operation that needs to copy items from a subvolume leaf ("full sync" or "copy everything" bit set in the inode), preallocate a dummy extent buffer before locking any extent buffer from the subvolume tree, and even before joining a log transaction, add it to the log context and then use it when we need to copy items from a subvolume leaf to the log tree. This avoids making other operations get extra latency when waiting to lock a subvolume leaf that is used during inode logging and we are under heavy memory pressure. The following test script with bonnie++ was used to test this: $ cat test.sh #!/bin/bash DEV=/dev/sdh MNT=/mnt/sdh MOUNT_OPTIONS="-o ssd" MEMTOTAL_BYTES=`free -b | grep Mem: | awk '{ print $2 }'` NR_DIRECTORIES=20 NR_FILES=20480 DATASET_SIZE=$((MEMTOTAL_BYTES * 2 / 1048576)) DIRECTORY_SIZE=$((MEMTOTAL_BYTES * 2 / NR_FILES)) NR_FILES=$((NR_FILES / 1024)) echo "performance" | \ tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor umount $DEV &> /dev/null mkfs.btrfs -f $MKFS_OPTIONS $DEV mount $MOUNT_OPTIONS $DEV $MNT bonnie++ -u root -d $MNT \ -n $NR_FILES:$DIRECTORY_SIZE:$DIRECTORY_SIZE:$NR_DIRECTORIES \ -r 0 -s $DATASET_SIZE -b umount $MNT The results of this test on a 8G VM running a non-debug kernel (Debian's default kernel config), were the following. Before this change: Version 2.00a ------Sequential Output------ --Sequential Input- --Random- -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks-- Name:Size etc /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP debian0 7501M 376k 99 1.4g 96 117m 14 1510k 99 2.5g 95 +++++ +++ Latency 35068us 24976us 2944ms 30725us 71770us 26152us Version 2.00a ------Sequential Create------ --------Random Create-------- debian0 -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete-- files:max:min /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP 20:384100:384100/20 20480 32 20480 58 20480 48 20480 39 20480 56 20480 61 Latency 411ms 11914us 119ms 617ms 10296us 110ms After this change: Version 2.00a ------Sequential Output------ --Sequential Input- --Random- -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks-- Name:Size etc /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP debian0 7501M 375k 99 1.4g 97 117m 14 1546k 99 2.3g 98 +++++ +++ Latency 35975us 20945us 2144ms 10297us 2217us 6004us Version 2.00a ------Sequential Create------ --------Random Create-------- debian0 -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete-- files:max:min /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP 20:384100:384100/20 20480 35 20480 58 20480 48 20480 40 20480 57 20480 59 Latency 320ms 11237us 77779us 518ms 6470us 86389us Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: add comment about list_is_singular() use at btrfs_delete_unused_bgs()Filipe Manana
At btrfs_delete_unused_bgs(), the use of the list_is_singular() check on a block group may not be immediately obvious. It is there to prevent losing raid profile information for a block group type (data, metadata or system), as that information is removed from fs_info->avail_[data|metadata|system]_alloc_bits when the last block group of a given type is deleted. So deleting the block group would later result in creating block groups of that type with a single profile (because fs_info->avail_*_alloc_bits would have a value of 0). This check was added in commit aefbe9a633b5 ("btrfs: Fix lost-data-profile caused by auto removing bg"). So add a comment mentioning the need for the check. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: document what the spinlock unused_bgs_lock protectsFilipe Manana
Add some comments to struct btrfs_fs_info to explicitly document which members are protected by the spinlock unused_bgs_lock. It is currently used to protect two linked lists, the reclaim_bgs and unused_bgs lists. So add an explicit comment on top of each list to mention its protected by unused_bgs_lock, as well as comment on top of unused_bgs_lock to mention the lists it protects. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: make btrfs_error_unpin_extent_range() return voidDavid Sterba
This helper is used in transaction abort or cleanup context and the callers cannot handle all errors, only do best effort. btrfs_cleanup_one_transaction btrfs_destroy_delayed_refs btrfs_error_unpin_extent_range btrfs_destroy_pinned_extent btrfs_error_unpin_extent_range Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: return errors from unpin_extent_range()David Sterba
Handle the lookup failure of the block group to unpin, this is a logic error as the block group must exist at this point. If not, something else must have freed it, like clean_pinned_extents() would do without locking the unused_bg_unpin_mutex. Push the errors to the callers, proper handling will be done in followup patches. Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: handle errors returned from unpin_extent_cache()David Sterba
We've had numerous attempts to let function unpin_extent_cache() return void as it only returns 0. There are still error cases to handle so do that, in addition to the verbose messages. The only caller btrfs_finish_one_ordered() will now abort the transaction, previously it let it continue which could lead to further problems. Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: zlib: Fix spelling mistake "infalte" -> "inflate"Colin Ian King
There is a spelling mistake in a warning message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: zstd: fix and simplify the inline extent decompression (v2)Qu Wenruo
Note: this is a fixed version that was previously reverted as e01a83e12604 ("Revert "btrfs: zstd: fix and simplify the inline extent decompression""), with fixed parameters to memzero_page(). [BUG] If we have a filesystem with 4k sectorsize, and an inlined compressed extent created like this: item 4 key (257 INODE_ITEM 0) itemoff 15863 itemsize 160 generation 8 transid 8 size 4096 nbytes 4096 block group 0 mode 100600 links 1 uid 0 gid 0 rdev 0 sequence 1 flags 0x0(none) item 5 key (257 INODE_REF 256) itemoff 15839 itemsize 24 index 2 namelen 14 name: source_inlined item 6 key (257 EXTENT_DATA 0) itemoff 15770 itemsize 69 generation 8 type 0 (inline) inline extent data size 48 ram_bytes 4096 compression 3 (zstd) Then trying to reflink that extent in an aarch64 system with 64K page size, the reflink would just fail: # xfs_io -f -c "reflink $mnt/source_inlined 0 60k 4k" $mnt/dest XFS_IOC_CLONE_RANGE: Input/output error [CAUSE] In zstd_decompress(), we didn't treat @start_byte as just a page offset, but also use it as an indicator on whether we should error out, without any proper explanation (this is copied from other decompression code). In reality, for subpage cases, although @start_byte can be non-zero, we should never switch input/output buffer nor error out, since the whole input/output buffer should never exceed one sector, thus we should not need to do any buffer switch. Thus the current code using @start_byte as a condition to switch input/output buffer or finish the decompression is completely incorrect. [FIX] The fix involves several modification: - Rename @start_byte to @dest_pgoff to properly express its meaning - Use @sectorsize other than PAGE_SIZE to properly initialize the output buffer size - Use correct destination offset inside the destination page - Simplify the main loop Since the input/output buffer should never switch, we only need one zstd_decompress_stream() call. - Consider early end as an error After the fix, even on 64K page sized aarch64, above reflink now works as expected: # xfs_io -f -c "reflink $mnt/source_inlined 0 60k 4k" $mnt/dest linked 4096/4096 bytes at offset 61440 And results the correct file layout: item 9 key (258 INODE_ITEM 0) itemoff 15542 itemsize 160 generation 10 transid 10 size 65536 nbytes 4096 block group 0 mode 100600 links 1 uid 0 gid 0 rdev 0 sequence 1 flags 0x0(none) item 10 key (258 INODE_REF 256) itemoff 15528 itemsize 14 index 3 namelen 4 name: dest item 11 key (258 XATTR_ITEM 3817753667) itemoff 15445 itemsize 83 location key (0 UNKNOWN.0 0) type XATTR transid 10 data_len 37 name_len 16 name: security.selinux data unconfined_u:object_r:unlabeled_t:s0 item 12 key (258 EXTENT_DATA 61440) itemoff 15392 itemsize 53 generation 10 type 1 (regular) extent data disk byte 13631488 nr 4096 extent data offset 0 nr 4096 ram 4096 extent compression 0 (none) Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: remove unused included headersDavid Sterba
With help of neovim, LSP and clangd we can identify header files that are not actually needed to be included in the .c files. This is focused only on removal (with minor fixups), further cleanups are possible but will require doing the header files properly with forward declarations, minimized includes and include-what-you-use care. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: replace i_blocksize by fs_info::sectorsizeDavid Sterba
The block size calculated by i_blocksize from inode is the same as what we have in fs_info, initalized in inode_init_always(). Unify that to use the fs_info value everywhere. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: replace sb::s_blocksize by fs_info::sectorsizeDavid Sterba
The block size stored in the super block is used by subsystems outside of btrfs and it's a copy of fs_info::sectorsize. Unify that to always use our sectorsize, with the exception of mount where we first need to use fixed values (4K) until we read the super block and can set the sectorsize. Replace all uses, in most cases it's fewer pointer indirections. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: remove duplicate recording of physical addressJohannes Thumshirn
Remove the duplicate physical recording of the original write physical address in case of a single device write. This duplicated code is most likely present due to a rebase error. Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: page to folio conversion in btrfs_truncate_block()Goldwyn Rodrigues
Convert use of struct page to struct folio inside btrfs_truncate_block(). The only page based function is set_page_extent_mapped(). All other functions have folio equivalents. Had to use __filemap_get_folio() because filemap_grab_folio() does not allow passing allocation mask as a parameter. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: use a folio array throughout the defrag processMatthew Wilcox (Oracle)
Remove more hidden calls to compound_head() by using an array of folios instead of pages. Also neaten the error path in defrag_one_range() by adjusting the length of the array instead of checking for NULL. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: convert defrag_prepare_one_page() to use a folioMatthew Wilcox (Oracle)
Use a folio throughout defrag_prepare_one_page() to remove dozens of hidden calls to compound_head(). There is no support here for large folios; indeed, turn the existing check for PageCompound into a check for large folios. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: add set_folio_extent_mapped() helperMatthew Wilcox (Oracle)
Turn set_page_extent_mapped() into a wrapper around this version. Saves a call to compound_head() for callers who already have a folio and removes a couple of users of page->mapping. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: WARN_ON_ONCE() in our leak detection codeJosef Bacik
fstests looks for WARN_ON's in dmesg. Add WARN_ON_ONCE() to our leak detection code (enabled only in debug builds) so that fstests will fail if these things trip at all. This will allow us to easily catch problems with our reference counting that may otherwise go unnoticed. Reviewed-by: Neal Gompa <neal@gompa.dev> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: remove extent_map_tree forward declaration at extent_io.hFilipe Manana
There's no need to do a forward declaration of struct extent_map_tree at extent_io.h, as there are no function prototypes, inline functions or data structures that refer to struct extent_map_tree. So remove that forward declaration, which is not needed since commit 477a30ba5f8d ("btrfs: Sink extent_tree arguments in try_release_extent_mapping"). Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: cache folio size and shift in extent_bufferQu Wenruo
After the conversion to folio interfaces (but without the patch to enable larger folio allocation), there is an LTP report about observable performance drop on metadata heavy operations. https://lore.kernel.org/linux-btrfs/202312221750.571925bd-oliver.sang@intel.com/ This drop is caused by the extra code of calculating the folio_size()/folio_shift(), instead of the old hard coded PAGE_SIZE/PAGE_SHIFT. To slightly reduce the overhead, just cache both folio_size and folio_shift in extent_buffer. The two new members (u32 folio_size and u8 folio_shift) are stored inside the holes of extent_buffer. folio_size is shared with len, which is reduced to u32. The size of eb does not change. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: remove unused variable bio_offset from end_bbio_data_read()Qu Wenruo
The variable @bio_offset was introduced in commit 7ffd27e378d2 ("btrfs: pass bio_offset to check_data_csum() directly"), when we are still using the same endio function for both data and metadata. Later we had several changes to data and metadata endio functions: - Data verification is handled by btrfs bio layer - Split data and metadata endio paths Now for data path we no longer do any verification in end_bbio_data_read(), as the verification is handled by btrfs bio layer already. Thus there is no need for such bio_offset variable. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: remove the pg_offset parameter from btrfs_get_extent()Qu Wenruo
The parameter @pg_offset of btrfs_get_extent() is only utilized for inlined extent, and we already have an ASSERT() and tree-checker, to make sure we can only get inline extent at file offset 0. Any invalid inline extent with non-zero file offset would be rejected by tree-checker in the first place. Thus the @pg_offset parameter is not really necessary, just remove it. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04regulator: rk808: fix LDO range on RK806Quentin Schulz
The linear ranges aren't really matching what they should be. Indeed, the range is inclusive of the min value, so it makes sense the previous range does NOT include the max step value representing the min value of the range in question. Since 3.4V is represented by the decimal value 232, the previous range max step value should be 231 and not 232. No expected change in behavior since 3.4V was mapped with step 232 from the first range but is now mapped with step 232 from the second range. While at it, remove the incorrect comment from the second range. Fixes: f991a220a447 ("regulator: rk808: add rk806 support") Cc: Quentin Schulz <foss+kernel@0leil.net> Signed-off-by: Quentin Schulz <quentin.schulz@theobroma-systems.com> Link: https://msgid.link/r/20240223-rk806-regulator-ranges-v1-2-3904ab70d250@theobroma-systems.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-03-04regulator: rk808: fix buck range on RK806Quentin Schulz
The linear ranges aren't really matching what they should be. Indeed, the range is inclusive of the min value, so it makes sense the previous range does NOT include the max step value representing the min value of the range in question. Since 1.5V is represented by the decimal value 160, the previous range max step value should be 159 and not 160. Similarly, 3.4V is represented by the decimal value 236, so the previous range max value should be 235 and not 237. The only change in behavior this makes is that this actually modeled the ranges to map step with decimal value 237 with 3.65V instead of 3.4V (the max supported by the HW). Fixes: f991a220a447 ("regulator: rk808: add rk806 support") Cc: Quentin Schulz <foss+kernel@0leil.net> Signed-off-by: Quentin Schulz <quentin.schulz@theobroma-systems.com> Link: https://msgid.link/r/20240223-rk806-regulator-ranges-v1-1-3904ab70d250@theobroma-systems.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-03-04Merge tag 'scmi-updates-6.9' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into soc/drivers Arm SCMI updates for v6.9 Quite a few changes to extend support to SCMI v3.2 specification, to enhance notification handling and other miscellaneous updates. 1. Enhancements to notification handling Until now, trying to register a notifier for an unsuppported notification returned an error genrating unneeded message exchanges with the SCMI platform. This can be avoided by looking up in advance the specific protocol and resources available. With these changes SCMI driver user will fail to register a notifier if the related command or resource is not supported (like before) without the need of exchanging any message. Perf notifications are also extended to provide the pre-calculated frequencies corresponding to the level or index carried by the 2. More SCMI v3.2 related updates One of the main addition includes a centralized support to the SCMI core to handle v3.2 optional protocol version negotiation, so that at protocol initialization time, if the platform advertised version is newer than supported by the kernel and protocol version negotiation is supported, the SCMI core will attempt to negotiate an older protocol version. It also includes the clock get permissions which indicates if any of the clock operations are forbidden by the platform for the OSPM agent. It can be used in the clock driver to avoid unnecessary message exchanges between the kernel and the platform which will always end up with the failure. It also includes other missing bits of clock v3.2 protocol so that the supported protocol version can be bumped to 0x30000 (v3.2). 3. Miscellaneous updates This includes addition of warning if the domain frequency multiplier is 0 or rounded off to indicate the actual frequencies are either wrong ot rounded off, hardening of clock domain info lookups, addition of multiple protocols registration support within a SCMI driver, update to SCMI entry in MAINTAINERS to include HWMON driver and constifying the scmi_bus_type structure. This also includes couple for fixes to minor issues: double free in SMC transport cleanup path and struct kernel-doc warnings in optee transport. * tag 'scmi-updates-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux: (29 commits) MAINTAINERS: Update SCMI entry with HWMON driver firmware: arm_scmi: Update the supported clock protocol version firmware: arm_scmi: Add standard clock OEM definitions firmware: arm_scmi: Add clock check for extended config support firmware: arm_scmi: Add support for v3.2 NEGOTIATE_PROTOCOL_VERSION firmware: arm_scmi: Fix struct kernel-doc warnings in optee transport firmware: arm_scmi: Report frequencies in the perf notifications firmware: arm_scmi: Use opps_by_lvl to store opps firmware: arm_scmi: Implement is_notify_supported callback in powercap protocol firmware: arm_scmi: Implement is_notify_supported callback in reset protocol firmware: arm_scmi: Implement is_notify_supported callback in sensor protocol firmware: arm_scmi: Implement is_notify_supported callback in clock protocol firmware: arm_scmi: Implement is_notify_supported callback in system power protocol firmware: arm_scmi: Implement is_notify_supported callback in power protocol firmware: arm_scmi: Implement is_notify_supported callback in perf protocol firmware: arm_scmi: Add a common helper to check if a message is supported firmware: arm_scmi: Check for notification support firmware: arm_scmi: Make scmi_bus_type const firmware: arm_scmi: Fix double free in SMC transport cleanup path firmware: arm_scmi: Implement clock get permissions ... Link: https://lore.kernel.org/r/20240223033435.118028-1-sudeep.holla@arm.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-03-04Merge tag 'ffa-update-6.9' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into soc/drivers Arm FF-A update for v6.9 Another single and simple update to just constify the ffa_bus_type structure similar to other changes done treewide following the driver core changes to accomodate the same. * tag 'ffa-update-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux: firmware: arm_ffa: Make ffa_bus_type const Link: https://lore.kernel.org/r/20240223033250.117878-1-sudeep.holla@arm.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-03-04Merge tag 'optee-fix-for-v6.8' of ↵Arnd Bergmann
https://git.linaro.org/people/jens.wiklander/linux-tee into arm/fixes Fix kernel panic in OP-TEE driver * tag 'optee-fix-for-v6.8' of https://git.linaro.org/people/jens.wiklander/linux-tee: tee: optee: Fix kernel panic caused by incorrect error handling Link: https://lore.kernel.org/r/20240304132727.GA3501807@rayden Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-03-04Merge tag 'tegra-for-6.8-arm64-dt' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux into arm/fixes arm64: tegra: Device tree fixes for v6.8 This contains two fixes to make the MGBE Ethernet devices found on Tegra234 work properly. * tag 'tegra-for-6.8-arm64-dt' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux: arm64: tegra: Fix Tegra234 MGBE power-domains arm64: tegra: Set the correct PHY mode for MGBE Link: https://lore.kernel.org/r/20240226144536.1525704-1-thierry.reding@gmail.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-03-04Merge tag 'imx-fixes-6.8-2' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into arm/fixes i.MX fixes for 6.8, round 2: - Update MAINTAINERS to use a public mailing list for NXP i.MX development. - Re-enable CONFIG_BACKLIGHT_CLASS_DEVICE in imx_v6_v7_defconfig to fix a backlight regression. - Remove DSI port endpoints from i.MX7 SoC DTSI to fix a display regression. - Fix LDB clocks property for i.MX8MP device tree. - Fix TC9595 reset GPIO on DH i.MX8M Plus DHCOM SoM. * tag 'imx-fixes-6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux: arm64: dts: imx8mp: Fix LDB clocks property arm64: dts: imx8mp: Fix TC9595 reset GPIO on DH i.MX8M Plus DHCOM SoM MAINTAINERS: Use a proper mailinglist for NXP i.MX development ARM: dts: imx7: remove DSI port endpoints ARM: imx_v6_v7_defconfig: Restore CONFIG_BACKLIGHT_CLASS_DEVICE Link: https://lore.kernel.org/r/ZdtPJzdenRybI+Bq@dragon Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-03-04Merge tag 'qcom-arm64-fixes-for-6.8' of ↵Arnd Bergmann
https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into arm/fixes Qualcomm ARM64 DeviceTree fixes for 6.8 This marks an additional GPIO as protected on SM8650 devices, to avoid a system reset caused by a security violation with some firmware versions. It also adds the missing interconnect-names, which resolves a regression where one of the I2C busses on SM6115 devices would no longer probe in Linux. * tag 'qcom-arm64-fixes-for-6.8' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux: arm64: dts: qcom: sm6115: Fix missing interconnect-names arm64: dts: qcom: sm8650-mtp: add gpio74 as reserved gpio arm64: dts: qcom: sm8650-qrd: add gpio74 as reserved gpio Link: https://lore.kernel.org/r/20240225025205.479589-1-andersson@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-03-04Merge tag 'sunxi-fixes-for-6.8-1' of ↵Arnd Bergmann
https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into arm/fixes - include Orange Pi Zero 2W DT in Makefile * tag 'sunxi-fixes-for-6.8-1' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux: arm64: dts: allwinner: h616: Add Orange Pi Zero 2W to Makefile Link: https://lore.kernel.org/r/20240223205450.GA8881@jernej-laptop Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-03-04Merge tag 'omap-for-v6.9/soc-part2-signed' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into soc/arm Few more SoC changes for omaps Two changes to implement REBOOT_COLD for am335x. * tag 'omap-for-v6.9/soc-part2-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: ARM: AM33xx: PRM: Implement REBOOT_COLD ARM: AM33xx: PRM: Remove redundand defines Link: https://lore.kernel.org/r/pull-1709194504-639032@atomide.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-03-04Merge tag 'omap-for-v6.9/omap1-signed' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into soc/arm Kconfig change for omap1 Just one change to drop a duplicate select ARCH_OMAP. * tag 'omap-for-v6.9/omap1-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: ARM: omap1: remove duplicated 'select ARCH_OMAP' Link: https://lore.kernel.org/r/pull-1709194435-723888@atomide.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-03-04Merge tag 'zynqmp-soc-for-6.9' of https://github.com/Xilinx/linux-xlnx into ↵Arnd Bergmann
soc/arm arm64: ZynqMP SoC changes for 6.9 - Update maintainer for event manager * tag 'zynqmp-soc-for-6.9' of https://github.com/Xilinx/linux-xlnx: soc: xilinx: update maintainer of event manager driver Link: https://lore.kernel.org/r/CAHTX3d+NGUJhKXxGwskSOf6U9P0Nd9rnroFczD8X2mLhFgcm0Q@mail.gmail.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-03-04Merge tag 'zynq-soc-for-6.9' of https://github.com/Xilinx/linux-xlnx into ↵Arnd Bergmann
soc/arm ARM: Zynq SoC changes for 6.9 - Fix kernel-doc - Removed one useless header * tag 'zynq-soc-for-6.9' of https://github.com/Xilinx/linux-xlnx: ARM: zynq: Remove clk/zynq.h header ARM: zynq: slcr: fix function prototype kernel-doc warnings Link: https://lore.kernel.org/r/CAHTX3dJpawPhgVxSsXuSvG20JY9r4EkJu2NrrH1OKRjKxTpNAA@mail.gmail.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-03-04Merge tag 'omap-for-v6.9/soc-signed' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into soc/arm Kerneldoc warning fixes for omaps for v6.9 A series of kerneldoc warning fixes for omaps from Randy. * tag 'omap-for-v6.9/soc-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: ARM: OMAP2+: fix kernel-doc warnings ARM: OMAP2+: fix kernel-doc warnings ARM: OMAP2+: fix a kernel-doc warning ARM: OMAP2+: PRM: fix kernel-doc warnings ARM: OMAP2+: prm44xx: fix a kernel-doc warning ARM: OMAP2+: pmic-cpcap: fix kernel-doc warnings ARM: OMAP2+: hwmod: fix kernel-doc warnings ARM: OMAP2+: hwmod: remove misuse of kernel-doc ARM: OMAP2+: CMINST: use matching function name in kernel-doc ARM: OMAP2+: cm33xx: use matching function name in kernel-doc ARM: OMAP2+: clock: fix a function name in kernel-doc ARM: OMAP2+: clockdomain: fix kernel-doc warnings ARM: OMAP2+: am33xx-restart: fix function name in kernel-doc Link: https://lore.kernel.org/r/pull-1708944095-4485@atomide.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-03-04Merge tag 'imx-soc-6.9' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into soc/arm i.MX SoC changes for 6.9: - Remove usage of the deprecated ida_simple_xx() API from MMDC code. * tag 'imx-soc-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux: ARM: imx: Remove usage of the deprecated ida_simple_xx() API Link: https://lore.kernel.org/r/20240226034147.233993-1-shawnguo2@yeah.net Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-03-04Merge tag 'samsung-soc-6.9' of ↵Arnd Bergmann
https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux into soc/arm Samsung mach/soc changes for v6.9 Just cleanups: kerneldoc warning and add missing const to bus_type. * tag 'samsung-soc-6.9' of https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux: ARM: s3c64xx: make bus_type const ARM: s5pv210: fix pm.c kernel-doc warning Link: https://lore.kernel.org/r/20240218182141.31213-4-krzysztof.kozlowski@linaro.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-03-04Merge branch 'mptcp-test-fixes'David S. Miller
Matthieu Baerts says: ==================== selftests: mptcp: fixes for diag.sh Here are two patches fixing issues in MPTCP diag.sh kselftest: - Patch 1 makes sure the exit code is '1' in case of error, and not the test ID, not to return an exit code that would be wrongly interpreted by the ksefltests framework, e.g. '4' means 'skip'. - Patch 2 avoids waiting for unnecessary conditions, which can cause timeouts in some very slow environments. ==================== Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
2024-03-04selftests: mptcp: diag: avoid extra waitingMatthieu Baerts (NGI0)
When creating a lot of listener sockets, it is enough to wait only for the last one, like we are doing before in diag.sh for other subtests. If we do a check for each listener sockets, each time listing all available sockets, it can take a very long time in very slow environments, at the point we can reach some timeout. When using the debug kconfig, the waiting time switches from more than 8 sec to 0.1 sec on my side. In slow/busy environments, and with a poll timeout set to 30 ms, the waiting time could go up to ~100 sec because the listener socket would timeout and stop, while the script would still be checking one by one if all sockets are ready. The result is that after having waited for everything to be ready, all sockets have been stopped due to a timeout, and it is too late for the script to check how many there were. While at it, also removed ss options we don't need: we only need the filtering options, to count how many listener sockets have been created. We don't need to ask ss to display internal TCP information, and the memory if the output is dropped by the 'wc -l' command anyway. Fixes: b4b51d36bbaa ("selftests: mptcp: explicitly trigger the listener diag code-path") Reported-by: Jakub Kicinski <kuba@kernel.org> Closes: https://lore.kernel.org/r/20240301063754.2ecefecf@kernel.org Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-04selftests: mptcp: diag: return KSFT_FAIL not test_cntGeliang Tang
The test counter 'test_cnt' should not be returned in diag.sh, e.g. what if only the 4th test fail? Will do 'exit 4' which is 'exit ${KSFT_SKIP}', the whole test will be marked as skipped instead of 'failed'! So we should do ret=${KSFT_FAIL} instead. Fixes: df62f2ec3df6 ("selftests/mptcp: add diag interface tests") Cc: stable@vger.kernel.org Fixes: 42fb6cddec3b ("selftests: mptcp: more stable diag tests") Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-04arm64: prohibit probing on arch_kunwind_consume_entry()Puranjay Mohan
Make arch_kunwind_consume_entry() as __always_inline otherwise the compiler might not inline it and allow attaching probes to it. Without this, just probing arch_kunwind_consume_entry() via <tracefs>/kprobe_events will crash the kernel on arm64. The crash can be reproduced using the following compiler and kernel combination: clang version 19.0.0git (https://github.com/llvm/llvm-project.git d68d29516102252f6bf6dc23fb22cef144ca1cb3) commit 87adedeba51a ("Merge tag 'net-6.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net") [root@localhost ~]# echo 'p arch_kunwind_consume_entry' > /sys/kernel/debug/tracing/kprobe_events [root@localhost ~]# echo 1 > /sys/kernel/debug/tracing/events/kprobes/enable Modules linked in: aes_ce_blk aes_ce_cipher ghash_ce sha2_ce virtio_net sha256_arm64 sha1_ce arm_smccc_trng net_failover failover virtio_mmio uio_pdrv_genirq uio sch_fq_codel dm_mod dax configfs CPU: 3 PID: 1405 Comm: bash Not tainted 6.8.0-rc6+ #14 Hardware name: linux,dummy-virt (DT) pstate: 604003c5 (nZCv DAIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : kprobe_breakpoint_handler+0x17c/0x258 lr : kprobe_breakpoint_handler+0x17c/0x258 sp : ffff800085d6ab60 x29: ffff800085d6ab60 x28: ffff0000066f0040 x27: ffff0000066f0b20 x26: ffff800081fa7b0c x25: 0000000000000002 x24: ffff00000b29bd18 x23: ffff00007904c590 x22: ffff800081fa6590 x21: ffff800081fa6588 x20: ffff00000b29bd18 x19: ffff800085d6ac40 x18: 0000000000000079 x17: 0000000000000001 x16: ffffffffffffffff x15: 0000000000000004 x14: ffff80008277a940 x13: 0000000000000003 x12: 0000000000000003 x11: 00000000fffeffff x10: c0000000fffeffff x9 : aa95616fdf80cc00 x8 : aa95616fdf80cc00 x7 : 205d343137373231 x6 : ffff800080fb48ec x5 : 0000000000000000 x4 : 0000000000000001 x3 : 0000000000000000 x2 : 0000000000000000 x1 : ffff800085d6a910 x0 : 0000000000000079 Call trace: kprobes: Failed to recover from reentered kprobes. kprobes: Dump kprobe: .symbol_name = arch_kunwind_consume_entry, .offset = 0, .addr = arch_kunwind_consume_entry+0x0/0x40 ------------[ cut here ]------------ kernel BUG at arch/arm64/kernel/probes/kprobes.c:241! kprobes: Failed to recover from reentered kprobes. kprobes: Dump kprobe: .symbol_name = arch_kunwind_consume_entry, .offset = 0, .addr = arch_kunwind_consume_entry+0x0/0x40 Fixes: 1aba06e7b2b4 ("arm64: stacktrace: factor out kunwind_stack_walk()") Signed-off-by: Puranjay Mohan <puranjay12@gmail.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20240229231620.24846-1-puranjay12@gmail.com Signed-off-by: Will Deacon <will@kernel.org>
2024-03-04hugetlbfs: support idmapped mountsGiuseppe Scrivano
pass down the idmapped mount information to the different helper functions. Differently, hugetlb_file_setup() will continue to not have any mapping since it is only used from contexts where idmapped mounts are not used. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com> Link: https://lore.kernel.org/r/20240229152405.105031-1-gscrivan@redhat.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-03-04drm: Fix output poll work for drm_kms_helper_poll=nImre Deak
If drm_kms_helper_poll=n the output poll work will only get scheduled from drm_helper_probe_single_connector_modes() to handle a delayed hotplug event. Since polling is disabled the work in this case should just call drm_kms_helper_hotplug_event() w/o detecting the state of connectors and rescheduling the work. After commit d33a54e3991d after a delayed hotplug event above the connectors did get re-detected in the poll work and the work got re-scheduled periodically (since poll_running is also false if drm_kms_helper_poll=n), in effect ignoring the drm_kms_helper_poll=n kernel param. Fix the above by calling only drm_kms_helper_hotplug_event() for a delayed hotplug event if drm_kms_helper_hotplug_event=n, as was done before d33a54e3991d. Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Fixes: d33a54e3991d ("drm/probe_helper: sort out poll_running vs poll_enabled") Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240301152243.1670573-1-imre.deak@intel.com