summaryrefslogtreecommitdiff
path: root/include/trace
AgeCommit message (Collapse)Author
2024-11-19Merge tag 'ras_core_for_v6.13' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RAS updates from Borislav Petkov: - Log and handle twp new AMD-specific MCA registers: SYND1 and SYND2 and report the Field Replaceable Unit text info reported through them - Add support for handling variable-sized SMCA BERT records - Add the capability for reporting vendor-specific RAS error info without adding vendor-specific fields to struct mce - Cleanups * tag 'ras_core_for_v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: EDAC/mce_amd: Add support for FRU text in MCA x86/mce/apei: Handle variable SMCA BERT record size x86/MCE/AMD: Add support for new MCA_SYND{1,2} registers tracing: Add __print_dynamic_array() helper x86/mce: Add wrapper for struct mce to export vendor specific info x86/mce/intel: Use MCG_BANKCNT_MASK instead of 0xff x86/mce/mcelog: Use xchg() to get and clear the flags
2024-11-18Merge tag 'for-6.13/io_uring-20241118' of git://git.kernel.dk/linuxLinus Torvalds
Pull io_uring updates from Jens Axboe: - Cleanups of the eventfd handling code, making it fully private. - Support for sending a sync message to another ring, without having a ring available to send a normal async message. - Get rid of the separate unlocked hash table, unify everything around the single locked one. - Add support for ring resizing. It can be hard to appropriately size the CQ ring upfront, if the application doesn't know how busy it will be. This results in applications sizing rings for the most busy case, which can be wasteful. With ring resizing, they can start small and grow the ring, if needed. - Add support for fixed wait regions, rather than needing to copy the same wait data tons of times for each wait operation. - Rewrite the resource node handling, which before was serialized per ring. This caused issues with particularly fixed files, where one file waiting on IO could hold up putting and freeing of other unrelated files. Now each node is handled separately. New code is much simpler too, and was a net 250 line reduction in code. - Add support for just doing partial buffer clones, rather than always cloning the entire buffer table. - Series adding static NAPI support, where a specific NAPI instance is used rather than having a list of them available that need lookup. - Add support for mapped regions, and also convert the fixed wait support mentioned above to that concept. This avoids doing special mappings for various planned features, and folds the existing registered wait into that too. - Add support for hybrid IO polling, which is a variant of strict IOPOLL but with an initial sleep delay to avoid spinning too early and wasting resources on devices that aren't necessarily in the < 5 usec category wrt latencies. - Various cleanups and little fixes. * tag 'for-6.13/io_uring-20241118' of git://git.kernel.dk/linux: (79 commits) io_uring/region: fix error codes after failed vmap io_uring: restore back registered wait arguments io_uring: add memory region registration io_uring: introduce concept of memory regions io_uring: temporarily disable registered waits io_uring: disable ENTER_EXT_ARG_REG for IOPOLL io_uring: fortify io_pin_pages with a warning switch io_msg_ring() to CLASS(fd) io_uring: fix invalid hybrid polling ctx leaks io_uring/uring_cmd: fix buffer index retrieval io_uring/rsrc: add & apply io_req_assign_buf_node() io_uring/rsrc: remove '->ctx_ptr' of 'struct io_rsrc_node' io_uring/rsrc: pass 'struct io_ring_ctx' reference to rsrc helpers io_uring: avoid normal tw intermediate fallback io_uring/napi: add static napi tracking strategy io_uring/napi: clean up __io_napi_do_busy_loop io_uring/napi: Use lock guards io_uring/napi: improve __io_napi_add io_uring/napi: fix io_napi_entry RCU accesses io_uring/napi: protect concurrent io_napi_entry timeout accesses ...
2024-11-18Merge tag 'for-6.13/block-20241118' of git://git.kernel.dk/linuxLinus Torvalds
Pull block updates from Jens Axboe: - NVMe updates via Keith: - Use uring_cmd helper (Pavel) - Host Memory Buffer allocation enhancements (Christoph) - Target persistent reservation support (Guixin) - Persistent reservation tracing (Guixen) - NVMe 2.1 specification support (Keith) - Rotational Meta Support (Matias, Wang, Keith) - Volatile cache detection enhancment (Guixen) - MD updates via Song: - Maintainers update - raid5 sync IO fix - Enhance handling of faulty and blocked devices - raid5-ppl atomic improvement - md-bitmap fix - Support for manually defining embedded partition tables - Zone append fixes and cleanups - Stop sending the queued requests in the plug list to the driver ->queue_rqs() handle in reverse order. - Zoned write plug cleanups - Cleanups disk stats tracking and add support for disk stats for passthrough IO - Add preparatory support for file system atomic writes - Add lockdep support for queue freezing. Already found a bunch of issues, and some fixes for that are in here. More will be coming. - Fix race between queue stopping/quiescing and IO queueing - ublk recovery improvements - Fix ublk mmap for 64k pages - Various fixes and cleanups * tag 'for-6.13/block-20241118' of git://git.kernel.dk/linux: (118 commits) MAINTAINERS: Update git tree for mdraid subsystem block: make struct rq_list available for !CONFIG_BLOCK block/genhd: use seq_put_decimal_ull for diskstats decimal values block: don't reorder requests in blk_mq_add_to_batch block: don't reorder requests in blk_add_rq_to_plug block: add a rq_list type block: remove rq_list_move virtio_blk: reverse request order in virtio_queue_rqs nvme-pci: reverse request order in nvme_queue_rqs btrfs: validate queue limits block: export blk_validate_limits nvmet: add tracing of reservation commands nvme: parse reservation commands's action and rtype to string nvmet: report ns's vwc not present md/raid5: Increase r5conf.cache_name size block: remove the ioprio field from struct request block: remove the write_hint field from struct request nvme: check ns's volatile write cache not present nvme: add rotational support nvme: use command set independent id ns if available ...
2024-11-18Merge tag 'for-6.13-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs updates from David Sterba: "Changes outside of btrfs: add io_uring command flag to track a dying task (the rest will go via the block git tree). User visible changes: - wire encoded read (ioctl) to io_uring commands, this can be used on itself, in the future this will allow 'send' to be asynchronous. As a consequence, the encoded read ioctl can also work in non-blocking mode - new ioctl to wait for cleaned subvolumes, no need to use the generic and root-only SEARCH_TREE ioctl, will be used by "btrfs subvol sync" - recognize different paths/symlinks for the same devices and don't report them during rescanning, this can be observed with LVM or DM - seeding device use case change, the sprout device (the one capturing new writes) will not clear the read-only status of the super block; this prevents accumulating space from deleted snapshots Performance improvements: - reduce lock contention when traversing extent buffers - reduce extent tree lock contention when searching for inline backref - switch from rb-trees to xarray for delayed ref tracking, improvements due to better cache locality, branching factors and more compact data structures - enable extent map shrinker again (prevent memory exhaustion under some types of IO load), reworked to run in a single worker thread (there used to be problems causing long stalls under memory pressure) Core changes: - raid-stripe-tree feature updates: - make device replace and scrub work - implement partial deletion of stripe extents - new selftests - split the config option BTRFS_DEBUG and add EXPERIMENTAL for features that are experimental or with known problems so we don't misuse debugging config for that - subpage mode updates (sector < page): - update compression implementations - update writepage, writeback - continued folio API conversions: - buffered writes - make buffered write copy one page at a time, preparatory work for future integration with large folios, may cause performance drop - proper locking of root item regarding starting send - error handling improvements - code cleanups and refactoring: - dead code removal - unused parameter reduction - lockdep assertions" * tag 'for-6.13-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (119 commits) btrfs: send: check for read-only send root under critical section btrfs: send: check for dead send root under critical section btrfs: remove check for NULL fs_info at btrfs_folio_end_lock_bitmap() btrfs: fix warning on PTR_ERR() against NULL device at btrfs_control_ioctl() btrfs: fix a typo in btrfs_use_zone_append btrfs: avoid superfluous calls to free_extent_map() in btrfs_encoded_read() btrfs: simplify logic to decrement snapshot counter at btrfs_mksnapshot() btrfs: remove hole from struct btrfs_delayed_node btrfs: update stale comment for struct btrfs_delayed_ref_node::add_list btrfs: add new ioctl to wait for cleaned subvolumes btrfs: simplify range tracking in cow_file_range() btrfs: remove conditional path allocation in btrfs_read_locked_inode() btrfs: push cleanup into btrfs_read_locked_inode() io_uring/cmd: let cmds to know about dying task btrfs: add struct io_btrfs_cmd as type for io_uring_cmd_to_pdu() btrfs: add io_uring command for encoded reads (ENCODED_READ ioctl) btrfs: move priv off stack in btrfs_encoded_read_regular_fill_pages() btrfs: don't sleep in btrfs_encoded_read() if IOCB_NOWAIT is set btrfs: change btrfs_encoded_read() so that reading of extent is done by caller btrfs: remove pointless iocb::ki_pos addition in btrfs_encoded_read() ...
2024-11-18Merge tag 'vfs-6.13.netfs' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull netfs updates from Christian Brauner: "Various fixes for the netfs library and related infrastructure: cachefiles: - Fix a dentry leak in cachefiles_open_file() - Fix incorrect length return value in cachefiles_ondemand_fd_write_iter() - Fix missing pos updates in cachefiles_ondemand_fd_write_iter() - Clean up in cachefiles_commit_tmpfile() - Fix NULL pointer dereference in object->file - Add a memory barrier for FSCACHE_VOLUME_CREATING netfs: - Remove call to folio_index() - Fix a few minor bugs in netfs_page_mkwrite() - Remove unnecessary references to pages" * tag 'vfs-6.13.netfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: netfs/fscache: Add a memory barrier for FSCACHE_VOLUME_CREATING cachefiles: Fix NULL pointer dereference in object->file cachefiles: Clean up in cachefiles_commit_tmpfile() cachefiles: Fix missing pos updates in cachefiles_ondemand_fd_write_iter() cachefiles: Fix incorrect length return value in cachefiles_ondemand_fd_write_iter() netfs: Remove unnecessary references to pages netfs: Fix a few minor bugs in netfs_page_mkwrite() netfs: Remove call to folio_index()
2024-11-18Merge tag 'vfs-6.13.misc' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull misc vfs updates from Christian Brauner: "Features: - Fixup and improve NLM and kNFSD file lock callbacks Last year both GFS2 and OCFS2 had some work done to make their locking more robust when exported over NFS. Unfortunately, part of that work caused both NLM (for NFS v3 exports) and kNFSD (for NFSv4.1+ exports) to no longer send lock notifications to clients This in itself is not a huge problem because most NFS clients will still poll the server in order to acquire a conflicted lock It's important for NLM and kNFSD that they do not block their kernel threads inside filesystem's file_lock implementations because that can produce deadlocks. We used to make sure of this by only trusting that posix_lock_file() can correctly handle blocking lock calls asynchronously, so the lock managers would only setup their file_lock requests for async callbacks if the filesystem did not define its own lock() file operation However, when GFS2 and OCFS2 grew the capability to correctly handle blocking lock requests asynchronously, they started signalling this behavior with EXPORT_OP_ASYNC_LOCK, and the check for also trusting posix_lock_file() was inadvertently dropped, so now most filesystems no longer produce lock notifications when exported over NFS Fix this by using an fop_flag which greatly simplifies the problem and grooms the way for future uses by both filesystems and lock managers alike - Add a sysctl to delete the dentry when a file is removed instead of making it a negative dentry Commit 681ce8623567 ("vfs: Delete the associated dentry when deleting a file") introduced an unconditional deletion of the associated dentry when a file is removed. However, this led to performance regressions in specific benchmarks, such as ilebench.sum_operations/s, prompting a revert in commit 4a4be1ad3a6e ("Revert "vfs: Delete the associated dentry when deleting a file""). This reintroduces the concept conditionally through a sysctl - Expand the statmount() system call: * Report the filesystem subtype in a new fs_subtype field to e.g., report fuse filesystem subtypes * Report the superblock source in a new sb_source field * Add a new way to return filesystem specific mount options in an option array that returns filesystem specific mount options separated by zero bytes and unescaped. This allows caller's to retrieve filesystem specific mount options and immediately pass them to e.g., fsconfig() without having to unescape or split them * Report security (LSM) specific mount options in a separate security option array. We don't lump them together with filesystem specific mount options as security mount options are generic and most users aren't interested in them The format is the same as for the filesystem specific mount option array - Support relative paths in fsconfig()'s FSCONFIG_SET_STRING command - Optimize acl_permission_check() to avoid costly {g,u}id ownership checks if possible - Use smp_mb__after_spinlock() to avoid full smp_mb() in evict() - Add synchronous wakeup support for ep_poll_callback. Currently, epoll only uses wake_up() to wake up task. But sometimes there are epoll users which want to use the synchronous wakeup flag to give a hint to the scheduler, e.g., the Android binder driver. So add a wake_up_sync() define, and use wake_up_sync() when sync is true in ep_poll_callback() Fixes: - Fix kernel documentation for inode_insert5() and iget5_locked() - Annotate racy epoll check on file->f_ep - Make F_DUPFD_QUERY associative - Avoid filename buffer overrun in initramfs - Don't let statmount() return empty strings - Add a cond_resched() to dump_user_range() to avoid hogging the CPU - Don't query the device logical blocksize multiple times for hfsplus - Make filemap_read() check that the offset is positive or zero Cleanups: - Various typo fixes - Cleanup wbc_attach_fdatawrite_inode() - Add __releases annotation to wbc_attach_and_unlock_inode() - Add hugetlbfs tracepoints - Fix various vfs kernel doc parameters - Remove obsolete TODO comment from io_cancel() - Convert wbc_account_cgroup_owner() to take a folio - Fix comments for BANDWITH_INTERVAL and wb_domain_writeout_add() - Reorder struct posix_acl to save 8 bytes - Annotate struct posix_acl with __counted_by() - Replace one-element array with flexible array member in freevxfs - Use idiomatic atomic64_inc_return() in alloc_mnt_ns()" * tag 'vfs-6.13.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (35 commits) statmount: retrieve security mount options vfs: make evict() use smp_mb__after_spinlock instead of smp_mb statmount: add flag to retrieve unescaped options fs: add the ability for statmount() to report the sb_source writeback: wbc_attach_fdatawrite_inode out of line writeback: add a __releases annoation to wbc_attach_and_unlock_inode fs: add the ability for statmount() to report the fs_subtype fs: don't let statmount return empty strings fs:aio: Remove TODO comment suggesting hash or array usage in io_cancel() hfsplus: don't query the device logical block size multiple times freevxfs: Replace one-element array with flexible array member fs: optimize acl_permission_check() initramfs: avoid filename buffer overrun fs/writeback: convert wbc_account_cgroup_owner to take a folio acl: Annotate struct posix_acl with __counted_by() acl: Realign struct posix_acl to save 8 bytes epoll: Add synchronous wakeup support for ep_poll_callback coredump: add cond_resched() to dump_user_range mm/page-writeback.c: Fix comment of wb_domain_writeout_add() mm/page-writeback.c: Update comment for BANDWIDTH_INTERVAL ...
2024-11-18Merge tag 'vfs-6.13.mgtime' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs multigrain timestamps from Christian Brauner: "This is another try at implementing multigrain timestamps. This time with significant help from the timekeeping maintainers to reduce the performance impact. Thomas provided a base branch that contains the required timekeeping interfaces for the VFS. It serves as the base for the multi-grain timestamp work: - Multigrain timestamps allow the kernel to use fine-grained timestamps when an inode's attributes is being actively observed via ->getattr(). With this support, it's possible for a file to get a fine-grained timestamp, and another modified after it to get a coarse-grained stamp that is earlier than the fine-grained time. If this happens then the files can appear to have been modified in reverse order, which breaks VFS ordering guarantees. To prevent this, a floor value is maintained for multigrain timestamps. Whenever a fine-grained timestamp is handed out, record it, and when later coarse-grained stamps are handed out, ensure they are not earlier than that value. If the coarse-grained timestamp is earlier than the fine-grained floor, return the floor value instead. The timekeeper changes add a static singleton atomic64_t into timekeeper.c that is used to keep track of the latest fine-grained time ever handed out. This is tracked as a monotonic ktime_t value to ensure that it isn't affected by clock jumps. Because it is updated at different times than the rest of the timekeeper object, the floor value is managed independently of the timekeeper via a cmpxchg() operation, and sits on its own cacheline. Two new public timekeeper interfaces are added: (1) ktime_get_coarse_real_ts64_mg() fills a timespec64 with the later of the coarse-grained clock and the floor time (2) ktime_get_real_ts64_mg() gets the fine-grained clock value, and tries to swap it into the floor. A timespec64 is filled with the result. - The VFS has always used coarse-grained timestamps when updating the ctime and mtime after a change. This has the benefit of allowing filesystems to optimize away a lot metadata updates, down to around 1 per jiffy, even when a file is under heavy writes. Unfortunately, this has always been an issue when we're exporting via NFSv3, which relies on timestamps to validate caches. A lot of changes can happen in a jiffy, so timestamps aren't sufficient to help the client decide when to invalidate the cache. Even with NFSv4, a lot of exported filesystems don't properly support a change attribute and are subject to the same problems with timestamp granularity. Other applications have similar issues with timestamps (e.g backup applications). If we were to always use fine-grained timestamps, that would improve the situation, but that becomes rather expensive, as the underlying filesystem would have to log a lot more metadata updates. This adds a way to only use fine-grained timestamps when they are being actively queried. Use the (unused) top bit in inode->i_ctime_nsec as a flag that indicates whether the current timestamps have been queried via stat() or the like. When it's set, we allow the kernel to use a fine-grained timestamp iff it's necessary to make the ctime show a different value. This solves the problem of being able to distinguish the timestamp between updates, but introduces a new problem: it's now possible for a file being changed to get a fine-grained timestamp. A file that is altered just a bit later can then get a coarse-grained one that appears older than the earlier fine-grained time. This violates timestamp ordering guarantees. This is where the earlier mentioned timkeeping interfaces help. A global monotonic atomic64_t value is kept that acts as a timestamp floor. When we go to stamp a file, we first get the latter of the current floor value and the current coarse-grained time. If the inode ctime hasn't been queried then we just attempt to stamp it with that value. If it has been queried, then first see whether the current coarse time is later than the existing ctime. If it is, then we accept that value. If it isn't, then we get a fine-grained time and try to swap that into the global floor. Whether that succeeds or fails, we take the resulting floor time, convert it to realtime and try to swap that into the ctime. We take the result of the ctime swap whether it succeeds or fails, since either is just as valid. Filesystems can opt into this by setting the FS_MGTIME fstype flag. Others should be unaffected (other than being subject to the same floor value as multigrain filesystems)" * tag 'vfs-6.13.mgtime' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: fs: reduce pointer chasing in is_mgtime() test tmpfs: add support for multigrain timestamps btrfs: convert to multigrain timestamps ext4: switch to multigrain timestamps xfs: switch to multigrain timestamps Documentation: add a new file documenting multigrain timestamps fs: add percpu counters for significant multigrain timestamp events fs: tracepoints around multigrain timestamp events fs: handle delegated timestamps in setattr_copy_mgtime timekeeping: Add percpu counter for tracking floor swap events timekeeping: Add interfaces for handling timestamps with a floor value fs: have setattr_copy handle multigrain timestamps appropriately fs: add infrastructure for multigrain timestamps
2024-11-12block: remove the ioprio field from struct requestChristoph Hellwig
The request ioprio is only initialized from the first attached bio, so requests without a bio already never set it. Directly use the bio field instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20241112170050.1612998-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-11-11btrfs: rename extent map shrinker members from struct btrfs_fs_infoFilipe Manana
The names for the members of struct btrfs_fs_info related to the extent map shrinker are a bit too long, so rename them to be shorter by replacing the "extent_map_" prefix with the "em_" prefix. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-11-11btrfs: simplify tracking progress for the extent map shrinkerFilipe Manana
Now that the extent map shrinker can only be run by a single task (as a work queue item) there is no need to keep the progress of the shrinker protected by a spinlock and passing the progress to trace events as parameters. So remove the lock and simplify the arguments for the trace events. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-11-11btrfs: remove unused btrfs_try_tree_write_lock()Dr. David Alan Gilbert
btrfs_try_tree_write_lock() has been unused since commit 50b21d7a066f ("btrfs: submit a writeback bio per extent_buffer"). Remove it as we don't need it anymore. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-11-11btrfs: qgroups: remove bytenr field from struct btrfs_qgroup_extent_recordFilipe Manana
Now that we track qgroup extent records in a xarray we don't need to have a "bytenr" field in struct btrfs_qgroup_extent_record, since we can get it from the index of the record in the xarray. So remove the field and grab the bytenr from either the index key or any other place where it's available (delayed refs). This reduces the size of struct btrfs_qgroup_extent_record from 40 bytes down to 32 bytes, meaning that we now can store 128 instances of this structure instead of 102 per 4K page. Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-11-07Merge tag 'net-6.12-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from can and netfilter. Things are slowing down quite a bit, mostly driver fixes here. No known ongoing investigations. Current release - new code bugs: - eth: ti: am65-cpsw: - fix multi queue Rx on J7 - fix warning in am65_cpsw_nuss_remove_rx_chns() Previous releases - regressions: - mptcp: do not require admin perm to list endpoints, got missed in a refactoring - mptcp: use sock_kfree_s instead of kfree Previous releases - always broken: - sctp: properly validate chunk size in sctp_sf_ootb() fix OOB access - virtio_net: make RSS interact properly with queue number - can: mcp251xfd: mcp251xfd_get_tef_len(): fix length calculation - can: mcp251xfd: mcp251xfd_ring_alloc(): fix coalescing configuration when switching CAN modes Misc: - revert earlier hns3 fixes, they were ignoring IOMMU abstractions and need to be reworked - can: {cc770,sja1000}_isa: allow building on x86_64" * tag 'net-6.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (42 commits) drivers: net: ionic: add missed debugfs cleanup to ionic_probe() error path net/smc: do not leave a dangling sk pointer in __smc_create() rxrpc: Fix missing locking causing hanging calls net/smc: Fix lookup of netdev by using ib_device_get_netdev() net: arc: rockchip: fix emac mdio node support net: arc: fix the device for dma_map_single/dma_unmap_single virtio_net: Update rss when set queue virtio_net: Sync rss config to device when virtnet_probe virtio_net: Add hash_key_length check virtio_net: Support dynamic rss indirection table size netfilter: nf_tables: wait for rcu grace period on net_device removal net: stmmac: Fix unbalanced IRQ wake disable warning on single irq case net: vertexcom: mse102x: Fix possible double free of TX skb mptcp: use sock_kfree_s instead of kfree mptcp: no admin perm to list endpoints net: phy: ti: add PHY_RST_AFTER_CLK_EN flag net: ethernet: ti: am65-cpsw: fix warning in am65_cpsw_nuss_remove_rx_chns() net: ethernet: ti: am65-cpsw: Fix multi queue Rx on J7 net: hns3: fix kernel crash when uninstalling driver Revert "Merge branch 'there-are-some-bugfix-for-the-hns3-ethernet-driver'" ...
2024-11-07rxrpc: Fix missing locking causing hanging callsDavid Howells
If a call gets aborted (e.g. because kafs saw a signal) between it being queued for connection and the I/O thread picking up the call, the abort will be prioritised over the connection and it will be removed from local->new_client_calls by rxrpc_disconnect_client_call() without a lock being held. This may cause other calls on the list to disappear if a race occurs. Fix this by taking the client_call_lock when removing a call from whatever list its ->wait_link happens to be on. Signed-off-by: David Howells <dhowells@redhat.com> cc: linux-afs@lists.infradead.org Reported-by: Marc Dionne <marc.dionne@auristor.com> Fixes: 9d35d880e0e4 ("rxrpc: Move client call connection to the I/O thread") Link: https://patch.msgid.link/726660.1730898202@warthog.procyon.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-01Merge tag 'vfs-6.12-rc6.fixes' of ↵Linus Torvalds
gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs Pull filesystem fixes from Christian Brauner: "VFS: - Fix copy_page_from_iter_atomic() if KMAP_LOCAL_FORCE_MAP=y is set - Add a get_tree_bdev_flags() helper that allows to modify e.g., whether errors are logged into the filesystem context during superblock creation. This is used by erofs to fix a userspace regression where an error is currently logged when its used on a regular file which is an new allowed mode in erofs. netfs: - Fix the sysfs debug path in the documentation. - Fix iov_iter_get_pages*() for folio queues by skipping the page extracation if we're at the end of a folio. afs: - Fix moving subdirectories to different parent directory. autofs: - Fix handling of AUTOFS_DEV_IOCTL_TIMEOUT_CMD ioctl in validate_dev_ioctl(). The actual ioctl number, not the ioctl command needs to be checked for autofs" * tag 'vfs-6.12-rc6.fixes' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs: iov_iter: fix copy_page_from_iter_atomic() if KMAP_LOCAL_FORCE_MAP autofs: fix thinko in validate_dev_ioctl() iov_iter: Fix iov_iter_get_pages*() for folio_queue afs: Fix missing subdir edit when renamed between parent dirs doc: correcting the debug path for cachefiles erofs: use get_tree_bdev_flags() to avoid misleading messages fs/super.c: introduce get_tree_bdev_flags()
2024-10-31x86/MCE/AMD: Add support for new MCA_SYND{1,2} registersAvadhut Naik
Starting with Zen4, AMD's Scalable MCA systems incorporate two new registers: MCA_SYND1 and MCA_SYND2. These registers will include supplemental error information in addition to the existing MCA_SYND register. The data within these registers is considered valid if MCA_STATUS[SyndV] is set. Userspace error decoding tools like rasdaemon gather related hardware error information through the tracepoints. Therefore, export these two registers through the mce_record tracepoint so that tools like rasdaemon can parse them and output the supplemental error information like FRU text contained in them. [ bp: Massage. ] Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Avadhut Naik <avadhut.naik@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Link: https://lore.kernel.org/r/20241022194158.110073-4-avadhut.naik@amd.com
2024-10-30tracing: Add __print_dynamic_array() helperSteven Rostedt
When printing a dynamic array in a trace event, the method is rather ugly. It has the format of: __print_array(__get_dynamic_array(array), __get_dynmaic_array_len(array) / el_size, el_size) Since dynamic arrays are known to the tracing infrastructure, create a helper macro that does the above for you. __print_dynamic_array(array, el_size) Which would expand to the same output. Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Avadhut Naik <avadhut.naik@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Link: https://lore.kernel.org/r/20241022194158.110073-3-avadhut.naik@amd.com
2024-10-30x86/mce: Add wrapper for struct mce to export vendor specific infoAvadhut Naik
Currently, exporting new additional machine check error information involves adding new fields for the same at the end of the struct mce. This additional information can then be consumed through mcelog or tracepoint. However, as new MSRs are being added (and will be added in the future) by CPU vendors on their newer CPUs with additional machine check error information to be exported, the size of struct mce will balloon on some CPUs, unnecessarily, since those fields are vendor-specific. Moreover, different CPU vendors may export the additional information in varying sizes. The problem particularly intensifies since struct mce is exposed to userspace as part of UAPI. It's bloating through vendor-specific data should be avoided to limit the information being sent out to userspace. Add a new structure mce_hw_err to wrap the existing struct mce. The same will prevent its ballooning since vendor-specifc data, if any, can now be exported through a union within the wrapper structure and through __dynamic_array in mce_record tracepoint. Furthermore, new internal kernel fields can be added to the wrapper struct without impacting the user space API. [ bp: Restore reverse x-mas tree order of function vars declarations. ] Suggested-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Avadhut Naik <avadhut.naik@amd.com> Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Link: https://lore.kernel.org/r/20241022194158.110073-2-avadhut.naik@amd.com
2024-10-29io_uring: clean up cqe trace pointsPavel Begunkov
We have too many helpers posting CQEs, instead of tracing completion events before filling in a CQE and thus having to pass all the data, set the CQE first, pass it to the tracing helper and let it extract everything it needs. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/b83c1ca9ee5aed2df0f3bb743bf5ed699cce4c86.1729267437.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-10-24afs: Fix missing subdir edit when renamed between parent dirsDavid Howells
When rename moves an AFS subdirectory between parent directories, the subdir also needs a bit of editing: the ".." entry needs updating to point to the new parent (though I don't make use of the info) and the DV needs incrementing by 1 to reflect the change of content. The server also sends a callback break notification on the subdirectory if we have one, but we can take care of recovering the promise next time we access the subdir. This can be triggered by something like: mount -t afs %example.com:xfstest.test20 /xfstest.test/ mkdir /xfstest.test/{aaa,bbb,aaa/ccc} touch /xfstest.test/bbb/ccc/d mv /xfstest.test/{aaa/ccc,bbb/ccc} touch /xfstest.test/bbb/ccc/e When the pathwalk for the second touch hits "ccc", kafs spots that the DV is incorrect and downloads it again (so the fix is not critical). Fix this, if the rename target is a directory and the old and new parents are different, by: (1) Incrementing the DV number of the target locally. (2) Editing the ".." entry in the target to refer to its new parent's vnode ID and uniquifier. Link: https://lore.kernel.org/r/3340431.1729680010@warthog.procyon.org.uk Fixes: 63a4681ff39c ("afs: Locally edit directory data for mkdir/create/unlink/...") cc: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-10-21Merge tag 'vfs-6.12-rc5.fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: "afs: - Fix a lock recursion in afs_wake_up_async_call() on ->notify_lock netfs: - Drop the references to a folio immediately after the folio has been extracted to prevent races with future I/O collection - Fix a documenation build error - Downgrade the i_rwsem for buffered writes to fix a cifs reported performance regression when switching to netfslib vfs: - Explicitly return -E2BIG from openat2() if the specified size is unexpectedly large. This aligns openat2() with other extensible struct based system calls - When copying a mount namespace ensure that we only try to remove the new copy from the mount namespace rbtree if it has already been added to it nilfs: - Clear the buffer delay flag when clearing the buffer state clags when a buffer head is discarded to prevent a kernel OOPs ocfs2: - Fix an unitialized value warning in ocfs2_setattr() proc: - Fix a kernel doc warning" * tag 'vfs-6.12-rc5.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: proc: Fix W=1 build kernel-doc warning afs: Fix lock recursion fs: Fix uninitialized value issue in from_kuid and from_kgid fs: don't try and remove empty rbtree node netfs: Downgrade i_rwsem for a buffered write nilfs2: fix kernel bug due to missing clearing of buffer delay flag openat2: explicitly return -E2BIG for (usize > PAGE_SIZE) netfs: fix documentation build error netfs: In readahead, put the folio refs as soon extracted
2024-10-20Merge tag 'dma-mapping-6.12-2024-10-20' of ↵Linus Torvalds
git://git.infradead.org/users/hch/dma-mapping Pull dma-mapping fix from Christoph Hellwig: "Just another small tracing fix from Sean" * tag 'dma-mapping-6.12-2024-10-20' of git://git.infradead.org/users/hch/dma-mapping: dma-mapping: fix tracing dma_alloc/free with vmalloc'd memory
2024-10-17dma-mapping: fix tracing dma_alloc/free with vmalloc'd memorySean Anderson
Not all virtual addresses have physical addresses, such as if they were vmalloc'd. Just trace the virtual address instead of trying to trace a physical address. This aligns with the API, and is good enough to associate dma_alloc with dma_free. Fixes: 038eb433dc14 ("dma-mapping: add tracing for dma-mapping API calls") Reported-by: syzbot+b4bfacdec173efaa8567@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/670ebde5.050a0220.d9b66.0154.GAE@google.com/ Signed-off-by: Sean Anderson <sean.anderson@linux.dev> Signed-off-by: Christoph Hellwig <hch@lst.de>
2024-10-17mm: khugepaged: fix the arguments order in khugepaged_collapse_file trace pointYang Shi
The "addr" and "is_shmem" arguments have different order in TP_PROTO and TP_ARGS. This resulted in the incorrect trace result: text-hugepage-644429 [276] 392092.878683: mm_khugepaged_collapse_file: mm=0xffff20025d52c440, hpage_pfn=0x200678c00, index=512, addr=1, is_shmem=0, filename=text-hugepage, nr=512, result=failed The value of "addr" is wrong because it was treated as bool value, the type of is_shmem. Fix the order in TP_PROTO to keep "addr" is before "is_shmem" since the original patch review suggested this order to achieve best packing. And use "lx" for "addr" instead of "ld" in TP_printk because address is typically shown in hex. After the fix, the trace result looks correct: text-hugepage-7291 [004] 128.627251: mm_khugepaged_collapse_file: mm=0xffff0001328f9500, hpage_pfn=0x20016ea00, index=512, addr=0x400000, is_shmem=0, filename=text-hugepage, nr=512, result=failed Link: https://lkml.kernel.org/r/20241012011702.1084846-1-yang@os.amperecomputing.com Fixes: 4c9473e87e75 ("mm/khugepaged: add tracepoint to collapse_file()") Signed-off-by: Yang Shi <yang@os.amperecomputing.com> Cc: Gautam Menghani <gautammenghani201@gmail.com> Cc: Steven Rostedt (Google) <rostedt@goodmis.org> Cc: <stable@vger.kernel.org> [6.2+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-10-10Merge patch series "timekeeping/fs: multigrain timestamp redux"Christian Brauner
Jeff Layton <jlayton@kernel.org> says: The VFS has always used coarse-grained timestamps when updating the ctime and mtime after a change. This has the benefit of allowing filesystems to optimize away a lot metadata updates, down to around 1 per jiffy, even when a file is under heavy writes. Unfortunately, this has always been an issue when we're exporting via NFSv3, which relies on timestamps to validate caches. A lot of changes can happen in a jiffy, so timestamps aren't sufficient to help the client decide when to invalidate the cache. Even with NFSv4, a lot of exported filesystems don't properly support a change attribute and are subject to the same problems with timestamp granularity. Other applications have similar issues with timestamps (e.g backup applications). If we were to always use fine-grained timestamps, that would improve the situation, but that becomes rather expensive, as the underlying filesystem would have to log a lot more metadata updates. What we need is a way to only use fine-grained timestamps when they are being actively queried. Use the (unused) top bit in inode->i_ctime_nsec as a flag that indicates whether the current timestamps have been queried via stat() or the like. When it's set, we allow the kernel to use a fine-grained timestamp iff it's necessary to make the ctime show a different value. This solves the problem of being able to distinguish the timestamp between updates, but introduces a new problem: it's now possible for a file being changed to get a fine-grained timestamp. A file that is altered just a bit later can then get a coarse-grained one that appears older than the earlier fine-grained time. This violates timestamp ordering guarantees. To remedy this, keep a global monotonic atomic64_t value that acts as a timestamp floor. When we go to stamp a file, we first get the latter of the current floor value and the current coarse-grained time. If the inode ctime hasn't been queried then we just attempt to stamp it with that value. If it has been queried, then first see whether the current coarse time is later than the existing ctime. If it is, then we accept that value. If it isn't, then we get a fine-grained time and try to swap that into the global floor. Whether that succeeds or fails, we take the resulting floor time, convert it to realtime and try to swap that into the ctime. We take the result of the ctime swap whether it succeeds or fails, since either is just as valid. Filesystems can opt into this by setting the FS_MGTIME fstype flag. Others should be unaffected (other than being subject to the same floor value as multigrain filesystems). * patches from https://lore.kernel.org/r/20241002-mgtime-v10-0-d1c4717f5284@kernel.org: tmpfs: add support for multigrain timestamps btrfs: convert to multigrain timestamps ext4: switch to multigrain timestamps xfs: switch to multigrain timestamps Documentation: add a new file documenting multigrain timestamps fs: add percpu counters for significant multigrain timestamp events fs: tracepoints around multigrain timestamp events fs: handle delegated timestamps in setattr_copy_mgtime fs: have setattr_copy handle multigrain timestamps appropriately fs: add infrastructure for multigrain timestamps Link: https://lore.kernel.org/r/20241002-mgtime-v10-0-d1c4717f5284@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-10-10fs: tracepoints around multigrain timestamp eventsJeff Layton
Add some tracepoints around various multigrain timestamp events. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> # documentation bits Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/20241002-mgtime-v10-6-d1c4717f5284@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-10-07netfs: In readahead, put the folio refs as soon extractedDavid Howells
netfslib currently defers dropping the ref on the folios it obtains during readahead to after it has started I/O on the basis that we can do it whilst we wait for the I/O to complete, but this runs the risk of the I/O collection racing with this in future. Furthermore, Matthew Wilcox strongly suggests that the refs should be dropped immediately, as readahead_folio() does (netfslib is using __readahead_batch() which doesn't drop the refs). Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading") Suggested-by: Matthew Wilcox <willy@infradead.org> Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/3771538.1728052438@warthog.procyon.org.uk cc: Jeff Layton <jlayton@kernel.org> cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-10-07Merge patch series "Random netfs folio fixes"Christian Brauner
Matthew Wilcox (Oracle) <willy@infradead.org> says: A few minor fixes; nothing earth-shattering. Matthew Wilcox (Oracle) (3): netfs: Remove call to folio_index() netfs: Fix a few minor bugs in netfs_page_mkwrite() netfs: Remove unnecessary references to pages Link: https://lore.kernel.org/r/20241005182307.3190401-1-willy@infradead.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-10-07netfs: Remove call to folio_index()Matthew Wilcox (Oracle)
Calling folio_index() is pointless overhead; directly dereferencing folio->index is fine. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Link: https://lore.kernel.org/r/20241005182307.3190401-2-willy@infradead.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-10-04Merge tag 'for-6.12-rc1-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - in incremental send, fix invalid clone operation for file that got its size decreased - fix __counted_by() annotation of send path cache entries, we do not store the terminating NUL - fix a longstanding bug in relocation (and quite hard to hit by chance), drop back reference cache that can get out of sync after transaction commit - wait for fixup worker kthread before finishing umount - add missing raid-stripe-tree extent for NOCOW files, zoned mode cannot have NOCOW files but RST is meant to be a standalone feature - handle transaction start error during relocation, avoid potential NULL pointer dereference of relocation control structure (reported by syzbot) - disable module-wide rate limiting of debug level messages - minor fix to tracepoint definition (reported by checkpatch.pl) * tag 'for-6.12-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: disable rate limiting when debug enabled btrfs: wait for fixup workers before stopping cleaner kthread during umount btrfs: fix a NULL pointer dereference when failed to start a new trasacntion btrfs: send: fix invalid clone operation for file that got its size decreased btrfs: tracepoints: end assignment with semicolon at btrfs_qgroup_extent event class btrfs: drop the backref cache during relocation if we commit btrfs: also add stripe entries for NOCOW writes btrfs: send: fix buffer overflow detection when copying path to cache entry
2024-10-02Merge patch series "Introduce tracepoint for hugetlbfs"Christian Brauner
Hongbo Li <lihongbo22@huawei.com> says: Add some basic tracepoints for debugging hugetlbfs: {alloc, free, evict}_inode, setattr and fallocate. * patches from https://lore.kernel.org/r/20240829064110.67884-1-lihongbo22@huawei.com: hugetlbfs: use tracepoints in hugetlbfs functions. hugetlbfs: support tracepoint Link: https://lore.kernel.org/r/20240829064110.67884-1-lihongbo22@huawei.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-10-01btrfs: tracepoints: end assignment with semicolon at btrfs_qgroup_extent ↵Filipe Manana
event class While running checkpatch.pl against a patch that modifies the btrfs_qgroup_extent event class, it complained about using a comma instead of a semicolon: $ ./scripts/checkpatch.pl qgroups/0003-btrfs-qgroups-remove-bytenr-field-from-struct-btrfs_.patch WARNING: Possible comma where semicolon could be used #215: FILE: include/trace/events/btrfs.h:1720: + __entry->bytenr = bytenr, __entry->num_bytes = rec->num_bytes; total: 0 errors, 1 warnings, 184 lines checked So replace the comma with a semicolon to silence checkpatch and possibly other tools. It also makes the code consistent with the rest. Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-09-30Merge tag 'vfs-6.12-rc2.fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: "afs: - Fix setting of the server responding flag - Remove unused struct afs_address_list and afs_put_address_list() function - Fix infinite loop because of unresponsive servers - Ensure that afs_retry_request() function is correctly added to the afs_req_ops netfs operations table netfs: - Fix netfs_folio tracepoint handling to handle NULL mappings - Add a missing folio_queue API documentation - Ensure that netfs_write_folio() correctly advances the iterator via iov_iter_advance() - Fix a dentry leak during concurrent cull and cookie lookup operations in cachefiles pidfs: - Correctly handle accessing another task's pid namespace" * tag 'vfs-6.12-rc2.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: netfs: Fix the netfs_folio tracepoint to handle NULL mapping netfs: Add folio_queue API documentation netfs: Advance iterator correctly rather than jumping it afs: Fix the setting of the server responding flag afs: Remove unused struct and function prototype afs: Fix possible infinite loop with unresponsive servers pidfs: check for valid pid namespace afs: Fix missing wire-up of afs_retry_request() cachefiles: fix dentry leak in cachefiles_open_file()
2024-09-30netfs: Fix the netfs_folio tracepoint to handle NULL mappingDavid Howells
Fix the netfs_folio tracepoint to handle folios that have a NULL mapping pointer. In such a case, just substitute a zero inode number. Fixes: c38f4e96e605 ("netfs: Provide func to copy data to pagecache for buffered write") Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/2917423.1727697556@warthog.procyon.org.uk cc: Jeff Layton <jlayton@kernel.org> cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-09-29Merge tag 'dma-mapping-6.12-2024-09-29' of ↵Linus Torvalds
git://git.infradead.org/users/hch/dma-mapping Pull dma-mapping fix from Christoph Hellwig: - handle chained SGLs in the new tracing code (Christoph Hellwig) * tag 'dma-mapping-6.12-2024-09-29' of git://git.infradead.org/users/hch/dma-mapping: dma-mapping: fix DMA API tracing for chained scatterlists
2024-09-26dma-mapping: fix DMA API tracing for chained scatterlistsChristoph Hellwig
scatterlist allocations can be chained, and thus all iterations need to use the chain-aware iterators. Switch the newly added tracing to use the proper iterators so that they work with chained scatterlists. Fixes: 038eb433dc14 ("dma-mapping: add tracing for dma-mapping API calls") Reported-by: syzbot+95e4ef83a3024384ec7a@syzkaller.appspotmail.com Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sean Anderson <sean.anderson@linux.dev> Tested-by: syzbot+95e4ef83a3024384ec7a@syzkaller.appspotmail.com
2024-09-24Merge tag 'f2fs-for-6.12-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "The main changes include converting major IO paths to use folio, and adding various knobs to control GC more flexibly for Zoned devices. In addition, there are several patches to address corner cases of atomic file operations and better support for file pinning on zoned device. Enhancement: - add knobs to tune foreground/background GCs for Zoned devices - convert IO paths to use folio - reduce expensive checkpoint trigger frequency - allow F2FS_IPU_NOCACHE for pinned file - forcibly migrate to secure space for zoned device file pinning - get rid of buffer_head use - add write priority option based on zone UFS - get rid of online repair on corrupted directory Bug fixes: - fix to don't panic system for no free segment fault injection - fix to don't set SB_RDONLY in f2fs_handle_critical_error() - avoid unused block when dio write in LFS mode - compress: don't redirty sparse cluster during {,de}compress - check discard support for conventional zones - atomic: prevent atomic file from being dirtied before commit - atomic: fix to check atomic_file in f2fs ioctl interfaces - atomic: fix to forbid dio in atomic_file - atomic: fix to truncate pagecache before on-disk metadata truncation - atomic: create COW inode from parent dentry - atomic: fix to avoid racing w/ GC - atomic: require FMODE_WRITE for atomic write ioctls - fix to wait page writeback before setting gcing flag - fix to avoid racing in between read and OPU dio write, dio completion - fix several potential integer overflows in file offsets and dir_block_index - fix to avoid use-after-free in f2fs_stop_gc_thread() As usual, there are several code clean-ups and refactorings" * tag 'f2fs-for-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (60 commits) f2fs: allow F2FS_IPU_NOCACHE for pinned file f2fs: forcibly migrate to secure space for zoned device file pinning f2fs: remove unused parameters f2fs: fix to don't panic system for no free segment fault injection f2fs: fix to don't set SB_RDONLY in f2fs_handle_critical_error() f2fs: add valid block ratio not to do excessive GC for one time GC f2fs: create gc_no_zoned_gc_percent and gc_boost_zoned_gc_percent f2fs: do FG_GC when GC boosting is required for zoned devices f2fs: increase BG GC migration window granularity when boosted for zoned devices f2fs: add reserved_segments sysfs node f2fs: introduce migration_window_granularity f2fs: make BG GC more aggressive for zoned devices f2fs: avoid unused block when dio write in LFS mode f2fs: fix to check atomic_file in f2fs ioctl interfaces f2fs: get rid of online repaire on corrupted directory f2fs: prevent atomic file from being dirtied before commit f2fs: get rid of page->index f2fs: convert read_node_page() to use folio f2fs: convert __write_node_page() to use folio f2fs: convert f2fs_write_data_page() to use folio ...
2024-09-23Merge tag 'firewire-updates-6.12' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394 Pull firewire updates from Takashi Sakamoto: "In the FireWire subsystem, tasklets have been used as the bottom half of 1394 OHCi hardIRQ. In recent kernel updates, BH workqueues have become available, and some developers have proposed replacing the tasklet with a BH workqueue. As a first step towards dropping tasklet use, the 1394 OHCI isochronous context can use regular workqueues. In this context, the batch of packets is processed in the specific queue, thus the timing jitter caused by task scheduling is not so critical. Additionally, DMA transmission can be scheduled per-packet basis, therefore the context can be sleep between the operation of transmissions. Furthermore, in-kernel protocol implementation involves some CPU-bound tasks, which can sometimes consumes CPU time so long. These characteristics suggest that normal workqueues are suitable, through BH workqueues are not. The replacement with a workqueue allows unit drivers to process the content of packets in non-atomic context. It brings some reliefs to some drivers in sound subsystem that spin-lock is not mandatory anymore during isochronous packet processing. Summary: - Replace tasklet with workqueue for isochronous context - Replace IDR with XArray - Utilize guard macro where possible - Print deprecation warning when enabling debug parameter of firewire-ohci module - Switch to nonatomic PCM operation" * tag 'firewire-updates-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394: (55 commits) firewire: core: rename cause flag of tracepoints event firewire: core: update documentation of kernel APIs for flushing completions firewire: core: add helper function to retire descriptors Revert "firewire: core: move workqueue handler from 1394 OHCI driver to core function" Revert "firewire: core: use mutex to coordinate concurrent calls to flush completions" firewire: core: use mutex to coordinate concurrent calls to flush completions firewire: core: move workqueue handler from 1394 OHCI driver to core function firewire: core: fulfill documentation of fw_iso_context_flush_completions() firewire: core: expose kernel API to schedule work item to process isochronous context firewire: core: use WARN_ON_ONCE() to avoid superfluous dumps ALSA: firewire: use nonatomic PCM operation firewire: core: non-atomic memory allocation for isochronous event to user client firewire: ohci: operate IT/IR events in sleepable work process instead of tasklet softIRQ firewire: core: add local API to queue work item to workqueue specific to isochronous contexts firewire: core: allocate workqueue to handle isochronous contexts in card firewire: ohci: obsolete direct usage of printk_ratelimit() firewire: ohci: deprecate debug parameter firewire: core: update fw_device outside of device_find_child() firewire: ohci: fix error path to detect initiated reset in TI TSB41BA3D phy firewire: core/ohci: minor refactoring for computation of configuration ROM size ...
2024-09-23Merge tag 'nfsd-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linuxLinus Torvalds
Pull nfsd updates from Chuck Lever: "Notable features of this release include: - Pre-requisites for automatically determining the RPC server thread count - Clean-up and preparation for supporting LOCALIO, which will be merged via the NFS client tree - Enhancements and fixes to NFSv4.2 COPY offload - A new Python-based tool for generating kernel SunRPC XDR encoding and decoding functions, added as an aid for prototyping features in protocols based on the Linux kernel's SunRPC implementation As always I am grateful to the NFSD contributors, reviewers, testers, and bug reporters who participated during this cycle" * tag 'nfsd-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (57 commits) xdrgen: Prevent reordering of encoder and decoder functions xdrgen: typedefs should use the built-in string and opaque functions xdrgen: Fix return code checking in built-in XDR decoders tools: Add xdrgen nfsd: fix delegation_blocked() to block correctly for at least 30 seconds nfsd: fix initial getattr on write delegation nfsd: untangle code in nfsd4_deleg_getattr_conflict() nfsd: enforce upper limit for namelen in __cld_pipe_inprogress_downcall() nfsd: return -EINVAL when namelen is 0 NFSD: Wrap async copy operations with trace points NFSD: Clean up extra whitespace in trace_nfsd_copy_done NFSD: Record the callback stateid in copy tracepoints NFSD: Display copy stateids with conventional print formatting NFSD: Limit the number of concurrent async COPY operations NFSD: Async COPY result needs to return a write verifier nfsd: avoid races with wake_up_var() nfsd: use clear_and_wake_up_bit() sunrpc: xprtrdma: Use ERR_CAST() to return NFSD: Annotate struct pnfs_block_deviceaddr with __counted_by() nfsd: call cache_put if xdr_reserve_space returns NULL ...
2024-09-21Merge tag 'sched_ext-for-6.12' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext Pull sched_ext support from Tejun Heo: "This implements a new scheduler class called ‘ext_sched_class’, or sched_ext, which allows scheduling policies to be implemented as BPF programs. The goals of this are: - Ease of experimentation and exploration: Enabling rapid iteration of new scheduling policies. - Customization: Building application-specific schedulers which implement policies that are not applicable to general-purpose schedulers. - Rapid scheduler deployments: Non-disruptive swap outs of scheduling policies in production environments" See individual commits for more documentation, but also the cover letter for the latest series: Link: https://lore.kernel.org/all/20240618212056.2833381-1-tj@kernel.org/ * tag 'sched_ext-for-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext: (110 commits) sched: Move update_other_load_avgs() to kernel/sched/pelt.c sched_ext: Don't trigger ops.quiescent/runnable() on migrations sched_ext: Synchronize bypass state changes with rq lock scx_qmap: Implement highpri boosting sched_ext: Implement scx_bpf_dispatch[_vtime]_from_dsq() sched_ext: Compact struct bpf_iter_scx_dsq_kern sched_ext: Replace consume_local_task() with move_local_task_to_local_dsq() sched_ext: Move consume_local_task() upward sched_ext: Move sanity check and dsq_mod_nr() into task_unlink_from_dsq() sched_ext: Reorder args for consume_local/remote_task() sched_ext: Restructure dispatch_to_local_dsq() sched_ext: Fix processs_ddsp_deferred_locals() by unifying DTL_INVALID handling sched_ext: Make find_dsq_for_dispatch() handle SCX_DSQ_LOCAL_ON sched_ext: Refactor consume_remote_task() sched_ext: Rename scx_kfunc_set_sleepable to unlocked and relocate sched_ext: Add missing static to scx_dump_data sched_ext: Add missing static to scx_has_op[] sched_ext: Temporarily work around pick_task_scx() being called without balance_scx() sched_ext: Add a cgroup scheduler which uses flattened hierarchy sched_ext: Add cgroup support ...
2024-09-21Merge tag 'mm-stable-2024-09-20-02-31' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: "Along with the usual shower of singleton patches, notable patch series in this pull request are: - "Align kvrealloc() with krealloc()" from Danilo Krummrich. Adds consistency to the APIs and behaviour of these two core allocation functions. This also simplifies/enables Rustification. - "Some cleanups for shmem" from Baolin Wang. No functional changes - mode code reuse, better function naming, logic simplifications. - "mm: some small page fault cleanups" from Josef Bacik. No functional changes - code cleanups only. - "Various memory tiering fixes" from Zi Yan. A small fix and a little cleanup. - "mm/swap: remove boilerplate" from Yu Zhao. Code cleanups and simplifications and .text shrinkage. - "Kernel stack usage histogram" from Pasha Tatashin and Shakeel Butt. This is a feature, it adds new feilds to /proc/vmstat such as $ grep kstack /proc/vmstat kstack_1k 3 kstack_2k 188 kstack_4k 11391 kstack_8k 243 kstack_16k 0 which tells us that 11391 processes used 4k of stack while none at all used 16k. Useful for some system tuning things, but partivularly useful for "the dynamic kernel stack project". - "kmemleak: support for percpu memory leak detect" from Pavel Tikhomirov. Teaches kmemleak to detect leaksage of percpu memory. - "mm: memcg: page counters optimizations" from Roman Gushchin. "3 independent small optimizations of page counters". - "mm: split PTE/PMD PT table Kconfig cleanups+clarifications" from David Hildenbrand. Improves PTE/PMD splitlock detection, makes powerpc/8xx work correctly by design rather than by accident. - "mm: remove arch_make_page_accessible()" from David Hildenbrand. Some folio conversions which make arch_make_page_accessible() unneeded. - "mm, memcg: cg2 memory{.swap,}.peak write handlers" fro David Finkel. Cleans up and fixes our handling of the resetting of the cgroup/process peak-memory-use detector. - "Make core VMA operations internal and testable" from Lorenzo Stoakes. Rationalizaion and encapsulation of the VMA manipulation APIs. With a view to better enable testing of the VMA functions, even from a userspace-only harness. - "mm: zswap: fixes for global shrinker" from Takero Funaki. Fix issues in the zswap global shrinker, resulting in improved performance. - "mm: print the promo watermark in zoneinfo" from Kaiyang Zhao. Fill in some missing info in /proc/zoneinfo. - "mm: replace follow_page() by folio_walk" from David Hildenbrand. Code cleanups and rationalizations (conversion to folio_walk()) resulting in the removal of follow_page(). - "improving dynamic zswap shrinker protection scheme" from Nhat Pham. Some tuning to improve zswap's dynamic shrinker. Significant reductions in swapin and improvements in performance are shown. - "mm: Fix several issues with unaccepted memory" from Kirill Shutemov. Improvements to the new unaccepted memory feature, - "mm/mprotect: Fix dax puds" from Peter Xu. Implements mprotect on DAX PUDs. This was missing, although nobody seems to have notied yet. - "Introduce a store type enum for the Maple tree" from Sidhartha Kumar. Cleanups and modest performance improvements for the maple tree library code. - "memcg: further decouple v1 code from v2" from Shakeel Butt. Move more cgroup v1 remnants away from the v2 memcg code. - "memcg: initiate deprecation of v1 features" from Shakeel Butt. Adds various warnings telling users that memcg v1 features are deprecated. - "mm: swap: mTHP swap allocator base on swap cluster order" from Chris Li. Greatly improves the success rate of the mTHP swap allocation. - "mm: introduce numa_memblks" from Mike Rapoport. Moves various disparate per-arch implementations of numa_memblk code into generic code. - "mm: batch free swaps for zap_pte_range()" from Barry Song. Greatly improves the performance of munmap() of swap-filled ptes. - "support large folio swap-out and swap-in for shmem" from Baolin Wang. With this series we no longer split shmem large folios into simgle-page folios when swapping out shmem. - "mm/hugetlb: alloc/free gigantic folios" from Yu Zhao. Nice performance improvements and code reductions for gigantic folios. - "support shmem mTHP collapse" from Baolin Wang. Adds support for khugepaged's collapsing of shmem mTHP folios. - "mm: Optimize mseal checks" from Pedro Falcato. Fixes an mprotect() performance regression due to the addition of mseal(). - "Increase the number of bits available in page_type" from Matthew Wilcox. Increases the number of bits available in page_type! - "Simplify the page flags a little" from Matthew Wilcox. Many legacy page flags are now folio flags, so the page-based flags and their accessors/mutators can be removed. - "mm: store zero pages to be swapped out in a bitmap" from Usama Arif. An optimization which permits us to avoid writing/reading zero-filled zswap pages to backing store. - "Avoid MAP_FIXED gap exposure" from Liam Howlett. Fixes a race window which occurs when a MAP_FIXED operqtion is occurring during an unrelated vma tree walk. - "mm: remove vma_merge()" from Lorenzo Stoakes. Major rotorooting of the vma_merge() functionality, making ot cleaner, more testable and better tested. - "misc fixups for DAMON {self,kunit} tests" from SeongJae Park. Minor fixups of DAMON selftests and kunit tests. - "mm: memory_hotplug: improve do_migrate_range()" from Kefeng Wang. Code cleanups and folio conversions. - "Shmem mTHP controls and stats improvements" from Ryan Roberts. Cleanups for shmem controls and stats. - "mm: count the number of anonymous THPs per size" from Barry Song. Expose additional anon THP stats to userspace for improved tuning. - "mm: finish isolate/putback_lru_page()" from Kefeng Wang: more folio conversions and removal of now-unused page-based APIs. - "replace per-quota region priorities histogram buffer with per-context one" from SeongJae Park. DAMON histogram rationalization. - "Docs/damon: update GitHub repo URLs and maintainer-profile" from SeongJae Park. DAMON documentation updates. - "mm/vdpa: correct misuse of non-direct-reclaim __GFP_NOFAIL and improve related doc and warn" from Jason Wang: fixes usage of page allocator __GFP_NOFAIL and GFP_ATOMIC flags. - "mm: split underused THPs" from Yu Zhao. Improve THP=always policy. This was overprovisioning THPs in sparsely accessed memory areas. - "zram: introduce custom comp backends API" frm Sergey Senozhatsky. Add support for zram run-time compression algorithm tuning. - "mm: Care about shadow stack guard gap when getting an unmapped area" from Mark Brown. Fix up the various arch_get_unmapped_area() implementations to better respect guard areas. - "Improve mem_cgroup_iter()" from Kinsey Ho. Improve the reliability of mem_cgroup_iter() and various code cleanups. - "mm: Support huge pfnmaps" from Peter Xu. Extends the usage of huge pfnmap support. - "resource: Fix region_intersects() vs add_memory_driver_managed()" from Huang Ying. Fix a bug in region_intersects() for systems with CXL memory. - "mm: hwpoison: two more poison recovery" from Kefeng Wang. Teaches a couple more code paths to correctly recover from the encountering of poisoned memry. - "mm: enable large folios swap-in support" from Barry Song. Support the swapin of mTHP memory into appropriately-sized folios, rather than into single-page folios" * tag 'mm-stable-2024-09-20-02-31' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (416 commits) zram: free secondary algorithms names uprobes: turn xol_area->pages[2] into xol_area->page uprobes: introduce the global struct vm_special_mapping xol_mapping Revert "uprobes: use vm_special_mapping close() functionality" mm: support large folios swap-in for sync io devices mm: add nr argument in mem_cgroup_swapin_uncharge_swap() helper to support large folios mm: fix swap_read_folio_zeromap() for large folios with partial zeromap mm/debug_vm_pgtable: Use pxdp_get() for accessing page table entries set_memory: add __must_check to generic stubs mm/vma: return the exact errno in vms_gather_munmap_vmas() memcg: cleanup with !CONFIG_MEMCG_V1 mm/show_mem.c: report alloc tags in human readable units mm: support poison recovery from copy_present_page() mm: support poison recovery from do_cow_fault() resource, kunit: add test case for region_intersects() resource: make alloc_free_mem_region() works for iomem_resource mm: z3fold: deprecate CONFIG_Z3FOLD vfio/pci: implement huge_fault support mm/arm64: support large pfn mappings mm/x86: support large pfn mappings ...
2024-09-20svcrdma: Handle device removal outside of the CM event handlerChuck Lever
Synchronously wait for all disconnects to complete to ensure the transports have divested all hardware resources before the underlying RDMA device can safely be removed. Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-09-19Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds
Pull SCSI updates from James Bottomley: "Updates to the usual drivers (ufs, smartpqi, NCR5380, mac_scsi, lpfc, mpi3mr). There are no user visible core changes and a whole series of minor updates and fixes. The largest core change is probably the simplification of the workqueue allocation path" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (86 commits) scsi: smartpqi: update driver version to 2.1.30-031 scsi: smartpqi: fix volume size updates scsi: smartpqi: fix rare system hang during LUN reset scsi: smartpqi: add new controller PCI IDs scsi: smartpqi: add counter for parity write stream requests scsi: smartpqi: correct stream detection scsi: smartpqi: Add fw log to kdump scsi: bnx2fc: Remove some unused fields in struct bnx2fc_rport scsi: qla2xxx: Remove the unused 'del_list_entry' field in struct fc_port scsi: ufs: core: Remove ufshcd_urgent_bkops() scsi: core: Remove obsoleted declaration for scsi_driverbyte_string() scsi: bnx2i: Remove unused declarations scsi: core: Simplify an alloc_workqueue() invocation scsi: ufs: Simplify alloc*_workqueue() invocation scsi: stex: Simplify an alloc_ordered_workqueue() invocation scsi: scsi_transport_fc: Simplify alloc_workqueue() invocations scsi: snic: Simplify alloc_workqueue() invocations scsi: qedi: Simplify an alloc_workqueue() invocation scsi: qedf: Simplify alloc_workqueue() invocations scsi: myrs: Simplify an alloc_ordered_workqueue() invocation ...
2024-09-19Merge tag 'dma-mapping-6.12-2024-09-19' of ↵Linus Torvalds
git://git.infradead.org/users/hch/dma-mapping Pull dma-mapping updates from Christoph Hellwig: - support DMA zones for arm64 systems where memory starts at > 4GB (Baruch Siach, Catalin Marinas) - support direct calls into dma-iommu and thus obsolete dma_map_ops for many common configurations (Leon Romanovsky) - add DMA-API tracing (Sean Anderson) - remove the not very useful return value from various dma_set_* APIs (Christoph Hellwig) - misc cleanups and minor optimizations (Chen Y, Yosry Ahmed, Christoph Hellwig) * tag 'dma-mapping-6.12-2024-09-19' of git://git.infradead.org/users/hch/dma-mapping: dma-mapping: reflow dma_supported dma-mapping: reliably inform about DMA support for IOMMU dma-mapping: add tracing for dma-mapping API calls dma-mapping: use IOMMU DMA calls for common alloc/free page calls dma-direct: optimize page freeing when it is not addressable dma-mapping: clearly mark DMA ops as an architecture feature vdpa_sim: don't select DMA_OPS arm64: mm: keep low RAM dma zone dma-mapping: don't return errors from dma_set_max_seg_size dma-mapping: don't return errors from dma_set_seg_boundary dma-mapping: don't return errors from dma_set_min_align_mask scsi: check that busses support the DMA API before setting dma parameters arm64: mm: fix DMA zone when dma-ranges is missing dma-mapping: direct calls for dma-iommu dma-mapping: call ->unmap_page and ->unmap_sg unconditionally arm64: support DMA zone above 4GB dma-mapping: replace zone_dma_bits by zone_dma_limit dma-mapping: use bit masking to check VM_DMA_COHERENT
2024-09-19Merge tag 'platform-drivers-x86-v6.12-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform drivers updates from Hans de Goede: - asus-wmi: Add support for vivobook fan profiles - dell-laptop: Add knobs to change battery charge settings - lg-laptop: Add operation region support - intel-uncore-freq: Add support for efficiency latency control - intel/ifs: Add SBAF test support - intel/pmc: Ignore all LTRs during suspend - platform/surface: Support for arm64 based Surface devices - wmi: Pass event data directly to legacy notify handlers - x86/platform/geode: switch GPIO buttons and LEDs to software properties - bunch of small cleanups, fixes, hw-id additions, etc. * tag 'platform-drivers-x86-v6.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: (65 commits) MAINTAINERS: adjust file entry in INTEL MID PLATFORM platform/x86: x86-android-tablets: Adjust Xiaomi Pad 2 bottom bezel touch buttons LED platform/mellanox: mlxbf-pmc: fix lockdep warning platform/x86/amd: pmf: Add quirk for TUF Gaming A14 platform/x86: touchscreen_dmi: add nanote-next quirk platform/x86: asus-wmi: don't fail if platform_profile already registered platform/x86: asus-wmi: add debug print in more key places platform/x86: intel_scu_wdt: Move intel_scu_wdt.h to x86 subfolder platform/x86: intel_scu_ipc: Move intel_scu_ipc.h out of arch/x86/include/asm MAINTAINERS: Add Intel MID section platform/x86: panasonic-laptop: Add support for programmable buttons platform/olpc: Remove redundant null pointer checks in olpc_ec_setup_debugfs() platform/x86: intel/pmc: Ignore all LTRs during suspend platform/x86: wmi: Call both legacy and WMI driver notify handlers platform/x86: wmi: Merge get_event_data() with wmi_get_notify_data() platform/x86: wmi: Remove wmi_get_event_data() platform/x86: wmi: Pass event data directly to legacy notify handlers platform/x86: thinkpad_acpi: Fix uninitialized symbol 's' warning platform/x86: x86-android-tablets: Fix spelling in the comments platform/x86: ideapad-laptop: Make the scope_guard() clear of its scope ...
2024-09-18Merge tag 'random-6.12-rc1-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/crng/random Pull random number generator updates from Jason Donenfeld: "Originally I'd planned on sending each of the vDSO getrandom() architecture ports to their respective arch trees. But as we started to work on this, we found lots of interesting issues in the shared code and infrastructure, the fixes for which the various archs needed to base their work. So in the end, this turned into a nice collaborative effort fixing up issues and porting to 5 new architectures -- arm64, powerpc64, powerpc32, s390x, and loongarch64 -- with everybody pitching in and commenting on each other's code. It was a fun development cycle. This contains: - Numerous fixups to the vDSO selftest infrastructure, getting it running successfully on more platforms, and fixing bugs in it. - Additions to the vDSO getrandom & chacha selftests. Basically every time manual review unearthed a bug in a revision of an arch patch, or an ambiguity, the tests were augmented. By the time the last arch was submitted for review, s390x, v1 of the series was essentially fine right out of the gate. - Fixes to the the generic C implementation of vDSO getrandom, to build and run successfully on all archs, decoupling it from assumptions we had (unintentionally) made on x86_64 that didn't carry through to the other architectures. - Port of vDSO getrandom to LoongArch64, from Xi Ruoyao and acked by Huacai Chen. - Port of vDSO getrandom to ARM64, from Adhemerval Zanella and acked by Will Deacon. - Port of vDSO getrandom to PowerPC, in both 32-bit and 64-bit varieties, from Christophe Leroy and acked by Michael Ellerman. - Port of vDSO getrandom to S390X from Heiko Carstens, the arch maintainer. While it'd be natural for there to be things to fix up over the course of the development cycle, these patches got a decent amount of review from a fairly diverse crew of folks on the mailing lists, and, for the most part, they've been cooking in linux-next, which has been helpful for ironing out build issues. In terms of architectures, I think that mostly takes care of the important 64-bit archs with hardware still being produced and running production loads in settings where vDSO getrandom is likely to help. Arguably there's still RISC-V left, and we'll see for 6.13 whether they find it useful and submit a port" * tag 'random-6.12-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: (47 commits) selftests: vDSO: check cpu caps before running chacha test s390/vdso: Wire up getrandom() vdso implementation s390/vdso: Move vdso symbol handling to separate header file s390/vdso: Allow alternatives in vdso code s390/module: Provide find_section() helper s390/facility: Let test_facility() generate static branch if possible s390/alternatives: Remove ALT_FACILITY_EARLY s390/facility: Disable compile time optimization for decompressor code selftests: vDSO: fix vdso_config for s390 selftests: vDSO: fix ELF hash table entry size for s390x powerpc/vdso: Wire up getrandom() vDSO implementation on VDSO64 powerpc/vdso: Wire up getrandom() vDSO implementation on VDSO32 powerpc/vdso: Refactor CFLAGS for CVDSO build powerpc/vdso32: Add crtsavres mm: Define VM_DROPPABLE for powerpc/32 powerpc/vdso: Fix VDSO data access when running in a non-root time namespace selftests: vDSO: don't include generated headers for chacha test arm64: vDSO: Wire up getrandom() vDSO implementation arm64: alternative: make alternative_has_cap_likely() VDSO compatible selftests: vDSO: also test counter in vdso_test_chacha ...
2024-09-18Merge tag 'pwm/for-6.12-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux Pull pwm updates from Uwe Kleine-König: "This contains some cleanups to the core and some mostly minor updates to a bunch of drivers and device tree bindings. One thing worth pointing out is that it contains an immutable branch containing support for a new mfd chip (Analog Devices ADP5585) with several sub drivers. Thanks go to Andrew Kreimer, Clark Wang, Conor Dooley, David Lechner, Dmitry Rokosov, Frank Li, Geert Uytterhoeven, George Stark, Jiapeng Chong, Krzysztof Kozlowski, Laurent Pinchart, Liao Chen, Liu Ying, Rob Herring and Wolfram Sang for code contributions and reviews and to Lee Jones for preparing the above mentioned immutable branch" * tag 'pwm/for-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux: (21 commits) pwm: stm32: Fix a typo dt-bindings: pwm: amlogic: Add new bindings for meson A1 PWM dt-bindings: pwm: amlogic: Add optional power-domains pwm: Switch back to struct platform_driver::remove() dt-bindings: pwm: allwinner,sun4i-a10-pwm: add top-level constraints pwm: axi-pwmgen: use shared macro for version reg pwm: atmel-hlcdc: Drop trailing comma pwm: atmel-hlcdc: Enable module autoloading pwm: omap-dmtimer: Use of_property_read_bool() pwm: adp5585: Set OSC_EN bit to 1 when PWM state is enabled pwm: lp3943: Fix an incorrect type in lp3943_pwm_parse_dt() pwm: Simplify pwm_capture() pwm: lp3943: Use of_property_count_u32_elems() to get property length pwm: Don't export pwm_capture() pwm: Make info in traces about affected pwm more useful dt-bindings: pwm: renesas,tpu: Add r8a779h0 support dt-bindings: pwm: renesas,pwm-rcar: Add r8a779h0 support pwm: adp5585: Add Analog Devices ADP5585 support gpio: adp5585: Add Analog Devices ADP5585 support mfd: adp5585: Add Analog Devices ADP5585 core support ...
2024-09-18Merge tag 'rcu.release.v6.12' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux Pull RCU updates from Neeraj Upadhyay: "Context tracking: - rename context tracking state related symbols and remove references to "dynticks" in various context tracking state variables and related helpers - force context_tracking_enabled_this_cpu() to be inlined to avoid leaving a noinstr section CSD lock: - enhance CSD-lock diagnostic reports - add an API to provide an indication of ongoing CSD-lock stall nocb: - update and simplify RCU nocb code to handle (de-)offloading of callbacks only for offline CPUs - fix RT throttling hrtimer being armed from offline CPU rcutorture: - remove redundant rcu_torture_ops get_gp_completed fields - add SRCU ->same_gp_state and ->get_comp_state functions - add generic test for NUM_ACTIVE_*RCU_POLL* for testing RCU and SRCU polled grace periods - add CFcommon.arch for arch-specific Kconfig options - print number of update types in rcu_torture_write_types() - add rcutree.nohz_full_patience_delay testing to the TREE07 scenario - add a stall_cpu_repeat module parameter to test repeated CPU stalls - add argument to limit number of CPUs a guest OS can use in torture.sh rcustall: - abbreviate RCU CPU stall warnings during CSD-lock stalls - Allow dump_cpu_task() to be called without disabling preemption - defer printing stall-warning backtrace when holding rcu_node lock srcu: - make SRCU gp seq wrap-around faster - add KCSAN checks for concurrent updates to ->srcu_n_exp_nodelay and ->reschedule_count which are used in heuristics governing auto-expediting of normal SRCU grace periods and grace-period-state-machine delays - mark idle SRCU-barrier callbacks to help identify stuck SRCU-barrier callback rcu tasks: - remove RCU Tasks Rude asynchronous APIs as they are no longer used - stop testing RCU Tasks Rude asynchronous APIs - fix access to non-existent percpu regions - check processor-ID assumptions during chosen CPU calculation for callback enqueuing - update description of rtp->tasks_gp_seq grace-period sequence number - add rcu_barrier_cb_is_done() to identify whether a given rcu_barrier callback is stuck - mark idle Tasks-RCU-barrier callbacks - add *torture_stats_print() functions to print detailed diagnostics for Tasks-RCU variants - capture start time of rcu_barrier_tasks*() operation to help distinguish a hung barrier operation from a long series of barrier operations refscale: - add a TINY scenario to support tests of Tiny RCU and Tiny SRCU - optimize process_durations() operation rcuscale: - dump stacks of stalled rcu_scale_writer() instances and grace-period statistics when rcu_scale_writer() stalls - mark idle RCU-barrier callbacks to identify stuck RCU-barrier callbacks - print detailed grace-period and barrier diagnostics on rcu_scale_writer() hangs for Tasks-RCU variants - warn if async module parameter is specified for RCU implementations that do not have async primitives such as RCU Tasks Rude - make all writer tasks report upon hang - tolerate repeated GFP_KERNEL failure in rcu_scale_writer() - use special allocator for rcu_scale_writer() - NULL out top-level pointers to heap memory to avoid double-free bugs on modprobe failures - maintain per-task instead of per-CPU callbacks count to avoid any issues with migration of either tasks or callbacks - constify struct ref_scale_ops Fixes: - use system_unbound_wq for kfree_rcu work to avoid disturbing isolated CPUs Misc: - warn on unexpected rcu_state.srs_done_tail state - better define "atomic" for list_replace_rcu() and hlist_replace_rcu() routines - annotate struct kvfree_rcu_bulk_data with __counted_by()" * tag 'rcu.release.v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux: (90 commits) rcu: Defer printing stall-warning backtrace when holding rcu_node lock rcu/nocb: Remove superfluous memory barrier after bypass enqueue rcu/nocb: Conditionally wake up rcuo if not already waiting on GP rcu/nocb: Fix RT throttling hrtimer armed from offline CPU rcu/nocb: Simplify (de-)offloading state machine context_tracking: Tag context_tracking_enabled_this_cpu() __always_inline context_tracking, rcu: Rename rcu_dyntick trace event into rcu_watching rcu: Update stray documentation references to rcu_dynticks_eqs_{enter, exit}() rcu: Rename rcu_momentary_dyntick_idle() into rcu_momentary_eqs() rcu: Rename rcu_implicit_dynticks_qs() into rcu_watching_snap_recheck() rcu: Rename dyntick_save_progress_counter() into rcu_watching_snap_save() rcu: Rename struct rcu_data .exp_dynticks_snap into .exp_watching_snap rcu: Rename struct rcu_data .dynticks_snap into .watching_snap rcu: Rename rcu_dynticks_zero_in_eqs() into rcu_watching_zero_in_eqs() rcu: Rename rcu_dynticks_in_eqs_since() into rcu_watching_snap_stopped_since() rcu: Rename rcu_dynticks_in_eqs() into rcu_watching_snap_in_eqs() rcu: Rename rcu_dynticks_eqs_online() into rcu_watching_online() context_tracking, rcu: Rename rcu_dynticks_curr_cpu_in_eqs() into rcu_is_watching_curr_cpu() context_tracking, rcu: Rename rcu_dynticks_task*() into rcu_task*() refscale: Constify struct ref_scale_ops ...
2024-09-17Merge tag 'sound-6.12-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound updates from Takashi Iwai: "A fairly big update at this time, both in core and driver sides. The core received rewrites in PCM buffer allocation handling and locking optimizations, PCM rate updates followed by lots of cleanups. In ASoC side, the legacy Intel drivers have been deprecated by AVS drivers which leaded to the significant amount of code reduction. SoundWire driver updates and other cleanups contributed more code reduction, too. USB-audio driver received a large cleanup of its big quirk table, and the old snd_print*() API usages in many legacy drivers are replaced with the standard print API. Here are some highlights: Core: - More optimized locking in ALSA control code - Rewrites of memalloc helpers for better DMA API usage - Drop of obsoleted vmalloc PCM buffer helper API - Continued MIDI2 UMP updates - Support of a new user-space driven timer instance - Update for more PCM support rates and cleanups - Xrun counter report in the proc files ASoC: - Continued simplification and cleanup works for ASoC - Extensive cleanups and refactoring of the Soundwire drivers - Removal of Intel machine support obsoleted by the AVS driver - Lots of DT schema conversions - Machine support for many AMD and Intel x86 platforms - Support for AMD ACP 7.1, Mediatek MT6367 and MT8365, Realtek RTL1320 SoundWire and rev C, and Texas Instruments TAS2563 USB-audio: - Add support of multiple control interfaces - A large rewrite of quirk table with macros - Support for RME Digiface USB HD-audio: - Cleanup of quirk code for Samsung Galaxy laptops - Clean up of detection of Cirrus codecs - C-Media CM9825 HD-audio codec support Others: - Rewrites to standard print API in a lot of legacy drivers" * tag 'sound-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (410 commits) ASoC: topology: Fix redundant logical jump ASoC: tas2781: Add Calibration Kcontrols for Chromebook ASoC: amd: acp: refactor SoundWire machine driver code ASoC: sdw_utils/intel: move soundwire endpoint parsing helper functions ASoC: sdw_util/intel: move soundwire endpoint and dai link structures ASoC: intel: sof_sdw: rename soundwire parsing helper functions ASoC: intel: sof_sdw: rename soundwire endpoint and dailink structures ASoC: atmel: mchp-pdmc: Retain Non-Runtime Controls ALSA: hda/realtek: Add support for Galaxy Book2 Pro (NP950XEE) ASoC: mediatek: mt7986-afe-pcm: Remove redundant error message ALSA: memalloc: Use proper DMA mapping API for x86 S/G buffer allocations ALSA: memalloc: Use proper DMA mapping API for x86 WC buffer allocations ALSA: usb-audio: Add logitech Audio profile quirk ASoc: mediatek: mt8365: Remove unneeded assignment ASoC: Intel: ARL: Add entry for HDMI-In capture support to non-I2S codec boards. ASoC: Intel: sof_rt5682: Add HDMI-In capture with rt5682 support for ARL. ASoC: SOF: Intel: hda: remove common_hdmi_codec_drv ASoC: Intel: sof_pcm512x: do not check common_hdmi_codec_drv ASoC: Intel: ehl_rt5660: do not check common_hdmi_codec_drv ASoC: Intel: skl_hda_dsp_generic: use common module for DAI links ...
2024-09-17hugetlbfs: support tracepointHongbo Li
Add basic tracepoints for {alloc, evict, free}_inode, setattr and fallocate. These can help users to debug hugetlbfs more conveniently. Signed-off-by: Hongbo Li <lihongbo22@huawei.com> Link: https://lore.kernel.org/r/20240829064110.67884-2-lihongbo22@huawei.com Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Christian Brauner <brauner@kernel.org>