summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2024-08-14rcu-tasks: Remove RCU Tasks Rude asynchronous APIsPaul E. McKenney
The call_rcu_tasks_rude() and rcu_barrier_tasks_rude() APIs are currently unused. This commit therefore removes their definitions and boot-time self-tests. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
2024-08-13bpf: switch maps to CLASS(fd, ...)Al Viro
Calling conventions for __bpf_map_get() would be more convenient if it left fpdut() on failure to callers. Makes for simpler logics in the callers. Among other things, the proof of memory safety no longer has to rely upon file->private_data never being ERR_PTR(...) for bpffs files. Original calling conventions made it impossible for the caller to tell whether __bpf_map_get() has returned ERR_PTR(-EINVAL) because it has found the file not be a bpf map one (in which case it would've done fdput()) or because it found that ERR_PTR(-EINVAL) in file->private_data of a bpf map file (in which case fdput() would _not_ have been done). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-08-13iavf: add support for offloading tc U32 cls filtersAhmed Zaki
Add support for offloading cls U32 filters. Only "skbedit queue_mapping" and "drop" actions are supported. Also, only "ip" and "802_3" tc protocols are allowed. The PF must advertise the VIRTCHNL_VF_OFFLOAD_TC_U32 capability flag. Since the filters will be enabled via the FD stage at the PF, a new type of FDIR filters is added and the existing list and state machine are used. The new filters can be used to configure flow directors based on raw (binary) pattern in the rx packet. Examples: 0. # tc qdisc add dev enp175s0v0 ingress 1. Redirect UDP from src IP 192.168.2.1 to queue 12: # tc filter add dev <dev> protocol ip ingress u32 \ match u32 0x45000000 0xff000000 at 0 \ match u32 0x00110000 0x00ff0000 at 8 \ match u32 0xC0A80201 0xffffffff at 12 \ match u32 0x00000000 0x00000000 at 24 \ action skbedit queue_mapping 12 skip_sw 2. Drop all ICMP: # tc filter add dev <dev> protocol ip ingress u32 \ match u32 0x45000000 0xff000000 at 0 \ match u32 0x00010000 0x00ff0000 at 8 \ match u32 0x00000000 0x00000000 at 24 \ action drop skip_sw 3. Redirect ICMP traffic from MAC 3c:fd:fe:a5:47:e0 to queue 7 (note proto: 802_3): # tc filter add dev <dev> protocol 802_3 ingress u32 \ match u32 0x00003CFD 0x0000ffff at 4 \ match u32 0xFEA547E0 0xffffffff at 8 \ match u32 0x08004500 0xffffff00 at 12 \ match u32 0x00000001 0x000000ff at 20 \ match u32 0x0000 0x0000 at 40 \ action skbedit queue_mapping 7 skip_sw Notes on matches: 1 - All intermediate fields that are needed to parse the correct PTYPE must be provided (in e.g. 3: Ethernet Type 0x0800 in MAC, IP version and IP length: 0x45 and protocol: 0x01 (ICMP)). 2 - The last match must provide an offset that guarantees all required headers are accounted for, even if the last header is not matched. For example, in #2, the last match is 4 bytes at offset 24 starting from IP header, so the total is 14 (MAC) + 24 + 4 = 42, which is the sum of MAC+IP+ICMP headers. Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Reviewed-by: Marcin Szycik <marcin.szycik@linux.intel.com> Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-08-13virtchnl: support raw packet in protocol headerJunfeng Guo
The patch extends existing virtchnl_proto_hdrs structure to allow VF to pass a pair of buffers as packet data and mask that describe a match pattern of a filter rule. Then the kernel PF driver is requested to parse the pair of buffer and figure out low level hardware metadata (ptype, profile, field vector.. ) to program the expected FDIR or RSS rules. Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Marcin Szycik <marcin.szycik@linux.intel.com> Signed-off-by: Qi Zhang <qi.z.zhang@intel.com> Signed-off-by: Junfeng Guo <junfeng.guo@intel.com> Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-08-13Merge remote-tracking branch 'vfs/stable-struct_fd'Andrii Nakryiko
Merge Al Viro's struct fd refactorings. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-08-13workqueue: Add interface for user-defined workqueue lockdep mapMatthew Brost
Add an interface for a user-defined workqueue lockdep map, which is helpful when multiple workqueues are created for the same purpose. This also helps avoid leaking lockdep maps on each workqueue creation. v2: - Add alloc_workqueue_lockdep_map (Tejun) v3: - Drop __WQ_USER_OWNED_LOCKDEP (Tejun) - static inline alloc_ordered_workqueue_lockdep_map (Tejun) Cc: Tejun Heo <tj@kernel.org> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2024-08-13regmap IRQ support for devices with multiple IRQsMark Brown
Merge series from Matti Vaittinen <mazziesaccount@gmail.com>: Devices can provide multiple interrupt lines. One reason for this is that a device has multiple subfunctions, each providing its own interrupt line. Another reason is that a device can be designed to be used (also) on a system where some of the interrupts can be routed to another processor. A line often further acts as a demultiplex for specific interrupts and has it's respective set of interrupt (status, mask, ack, ...) registers. Regmap supports the handling of these registers and demultiplexing interrupts, but interrupt domain code ends up assigning the same name for the per interrupt line domains This series adds possibility for giving a name suffix for an interrupt Previous discussion can be found from: https://lore.kernel.org/all/87plst28yk.ffs@tglx/ https://lore.kernel.org/all/15685ef6-92a5-41df-9148-1a67ceaec47b@gmail.com/ The domain suffix support added in this series will be used by the ROHM BD96801 ERRB IRQ support code. The BD96801 ERRB support will need the initial BD96801 driver code, which is not yet in irq/core or regmap trees. Thus the user for this new support is not included in the series, but will be sent once the name suffix support gets merged.
2024-08-13printk/panic: Allow cpu backtraces to be written into ringbuffer during panicRyo Takakura
commit 779dbc2e78d7 ("printk: Avoid non-panic CPUs writing to ringbuffer") disabled non-panic CPUs to further write messages to ringbuffer after panicked. Since the commit, non-panicked CPU's are not allowed to write to ring buffer after panicked and CPU backtrace which is triggered after panicked to sample non-panicked CPUs' backtrace no longer serves its function as it has nothing to print. Fix the issue by allowing non-panicked CPUs to write into ringbuffer while CPU backtrace is in flight. Fixes: 779dbc2e78d7 ("printk: Avoid non-panic CPUs writing to ringbuffer") Signed-off-by: Ryo Takakura <takakura@valinux.co.jp> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240812072703.339690-1-takakura@valinux.co.jp Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-08-13vfs: Don't evict inode under the inode lru traversing contextZhihao Cheng
The inode reclaiming process(See function prune_icache_sb) collects all reclaimable inodes and mark them with I_FREEING flag at first, at that time, other processes will be stuck if they try getting these inodes (See function find_inode_fast), then the reclaiming process destroy the inodes by function dispose_list(). Some filesystems(eg. ext4 with ea_inode feature, ubifs with xattr) may do inode lookup in the inode evicting callback function, if the inode lookup is operated under the inode lru traversing context, deadlock problems may happen. Case 1: In function ext4_evict_inode(), the ea inode lookup could happen if ea_inode feature is enabled, the lookup process will be stuck under the evicting context like this: 1. File A has inode i_reg and an ea inode i_ea 2. getfattr(A, xattr_buf) // i_ea is added into lru // lru->i_ea 3. Then, following three processes running like this: PA PB echo 2 > /proc/sys/vm/drop_caches shrink_slab prune_dcache_sb // i_reg is added into lru, lru->i_ea->i_reg prune_icache_sb list_lru_walk_one inode_lru_isolate i_ea->i_state |= I_FREEING // set inode state inode_lru_isolate __iget(i_reg) spin_unlock(&i_reg->i_lock) spin_unlock(lru_lock) rm file A i_reg->nlink = 0 iput(i_reg) // i_reg->nlink is 0, do evict ext4_evict_inode ext4_xattr_delete_inode ext4_xattr_inode_dec_ref_all ext4_xattr_inode_iget ext4_iget(i_ea->i_ino) iget_locked find_inode_fast __wait_on_freeing_inode(i_ea) ----→ AA deadlock dispose_list // cannot be executed by prune_icache_sb wake_up_bit(&i_ea->i_state) Case 2: In deleted inode writing function ubifs_jnl_write_inode(), file deleting process holds BASEHD's wbuf->io_mutex while getting the xattr inode, which could race with inode reclaiming process(The reclaiming process could try locking BASEHD's wbuf->io_mutex in inode evicting function), then an ABBA deadlock problem would happen as following: 1. File A has inode ia and a xattr(with inode ixa), regular file B has inode ib and a xattr. 2. getfattr(A, xattr_buf) // ixa is added into lru // lru->ixa 3. Then, following three processes running like this: PA PB PC echo 2 > /proc/sys/vm/drop_caches shrink_slab prune_dcache_sb // ib and ia are added into lru, lru->ixa->ib->ia prune_icache_sb list_lru_walk_one inode_lru_isolate ixa->i_state |= I_FREEING // set inode state inode_lru_isolate __iget(ib) spin_unlock(&ib->i_lock) spin_unlock(lru_lock) rm file B ib->nlink = 0 rm file A iput(ia) ubifs_evict_inode(ia) ubifs_jnl_delete_inode(ia) ubifs_jnl_write_inode(ia) make_reservation(BASEHD) // Lock wbuf->io_mutex ubifs_iget(ixa->i_ino) iget_locked find_inode_fast __wait_on_freeing_inode(ixa) | iput(ib) // ib->nlink is 0, do evict | ubifs_evict_inode | ubifs_jnl_delete_inode(ib) ↓ ubifs_jnl_write_inode ABBA deadlock ←-----make_reservation(BASEHD) dispose_list // cannot be executed by prune_icache_sb wake_up_bit(&ixa->i_state) Fix the possible deadlock by using new inode state flag I_LRU_ISOLATING to pin the inode in memory while inode_lru_isolate() reclaims its pages instead of using ordinary inode reference. This way inode deletion cannot be triggered from inode_lru_isolate() thus avoiding the deadlock. evict() is made to wait for I_LRU_ISOLATING to be cleared before proceeding with inode cleanup. Link: https://lore.kernel.org/all/37c29c42-7685-d1f0-067d-63582ffac405@huaweicloud.com/ Link: https://bugzilla.kernel.org/show_bug.cgi?id=219022 Fixes: e50e5129f384 ("ext4: xattr-in-inode support") Fixes: 7959cf3a7506 ("ubifs: journal: Handle xattrs like files") Cc: stable@vger.kernel.org Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Link: https://lore.kernel.org/r/20240809031628.1069873-1-chengzhihao@huaweicloud.com Reviewed-by: Jan Kara <jack@suse.cz> Suggested-by: Jan Kara <jack@suse.cz> Suggested-by: Mateusz Guzik <mjguzik@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-08-13regmap: Allow setting IRQ domain name suffixMatti Vaittinen
When multiple IRQ domains are created from the same device-tree node they will get the same name based on the device-tree path. This will cause a naming collision in debugFS when IRQ domain specific entries are created. The regmap-IRQ creates per instance IRQ domains. This will lead to a domain name conflict when a device which provides more than one interrupt line uses the regmap-IRQ. Add support for specifying an IRQ domain name suffix when creating a regmap-IRQ controller. Signed-off-by: Matti Vaittinen <mazziesaccount@gmail.com> Link: https://patch.msgid.link/776bc4996969e5081bcf61b9bdb5517e537147a3.1723120028.git.mazziesaccount@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-08-13net: netpoll: extract core of netpoll_cleanupBreno Leitao
Extract the core part of netpoll_cleanup(), so, it could be called from a caller that has the rtnl lock already. Netconsole uses this in a weird way right now: __netpoll_cleanup(&nt->np); spin_lock_irqsave(&target_list_lock, flags); netdev_put(nt->np.dev, &nt->np.dev_tracker); nt->np.dev = NULL; nt->enabled = false; This will be replaced by do_netpoll_cleanup() as the locking situation is overhauled. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Rik van Riel <riel@surriel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-13iommu: Remove unused declaration iommu_sva_unbind_gpasid()Yue Haibing
Commit 0c9f17877891 ("iommu: Remove guest pasid related interfaces and definitions") removed the implementation but leave declaration. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20240808140619.2498535-1-yuehaibing@huawei.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-08-13usb: gadget: configfs: Constify struct config_item_typeChristophe JAILLET
'struct config_item_type' is not modified in this file. Apparently, these structures are only used with config_group_init_type_name() which takes a const struct config_item_type* as a 3rd argument. Constifying this structure moves some data to a read-only section, so increase overall security, especially when the structure holds some function pointers. On a x86_64, with allmodconfig: Before: ====== text data bss dec hex filename 40834 5112 64 46010 b3ba drivers/usb/gadget/configfs.o After: ===== text data bss dec hex filename 41218 4728 64 46010 b3ba drivers/usb/gadget/configfs.o Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/513223e97082e1bb758e36d55c175ec9ea34a71c.1723323896.git.christophe.jaillet@wanadoo.fr Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-13usb: gadget: configfs: Make check_user_usb_string() staticChristophe JAILLET
"linux/usb/gadget_configfs.h" is only included in "drivers/usb/gadget/configfs.c", so there is no need to declare a function in the header file. it is only used in this .c file. It's better to have it static. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/958cb49dca1bff4254a3492c018efbf3b01918b4.1723323107.git.christophe.jaillet@wanadoo.fr Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-13net: stmmac: Move the atds flag to the stmmac_dma_cfg structureYanteng Si
ATDS (Alternate Descriptor Size) is a part of the DMA Bus Mode configs (together with PBL, ALL, EME, etc) of the DW GMAC controllers. Seeing it's not changed at runtime but is activated as long as the IP-core has it supported (at least due to the Type 2 Full Checksum Offload Engine feature), move the respective parameter from the stmmac_dma_ops::init() callback argument to the stmmac_dma_cfg structure, which already have the rest of the DMA-related configs defined. Besides the being added in the next commit DW GMAC multi-channels support will require to add the stmmac_dma_ops::init_chan() callback and have the ATDS flag set/cleared for each channel in there. Having the atds-flag in the stmmac_dma_cfg structure will make the parameter accessible from stmmac_dma_ops::init_chan() callback too. Signed-off-by: Feiyang Chen <chenfeiyang@loongson.cn> Signed-off-by: Yinggang Gu <guyinggang@loongson.cn> Reviewed-by: Serge Semin <fancer.lancer@gmail.com> Acked-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Yanteng Si <siyanteng@loongson.cn> Tested-by: Serge Semin <fancer.lancer@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-12add struct fd constructors, get rid of __to_fd()Al Viro
Make __fdget() et.al. return struct fd directly. New helpers: BORROWED_FD(file) and CLONED_FD(file), for borrowed and cloned file references resp. NOTE: this might need tuning; in particular, inline on __fget_light() is there to keep the code generation same as before - we probably want to keep it inlined in fdget() et.al. (especially so in fdget_pos()), but that needs profiling. Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2024-08-12struct fd: representation changeAl Viro
We want the compiler to see that fdput() on empty instance is a no-op. The emptiness check is that file reference is NULL, while fdput() is "fput() if FDPUT_FPUT is present in flags". The reason why fdput() on empty instance is a no-op is something compiler can't see - it's that we never generate instances with NULL file reference combined with non-zero flags. It's not that hard to deal with - the real primitives behind fdget() et.al. are returning an unsigned long value, unpacked by (inlined) __to_fd() into the current struct file * + int. The lower bits are used to store flags, while the rest encodes the pointer. Linus suggested that keeping this unsigned long around with the extractions done by inlined accessors should generate a sane code and that turns out to be the case. Namely, turning struct fd into a struct-wrapped unsinged long, with fd_empty(f) => unlikely(f.word == 0) fd_file(f) => (struct file *)(f.word & ~3) fdput(f) => if (f.word & 1) fput(fd_file(f)) ends up with compiler doing the right thing. The cost is the patch footprint, of course - we need to switch f.file to fd_file(f) all over the tree, and it's not doable with simple search and replace; there are false positives, etc. Note that the sole member of that structure is an opaque unsigned long - all accesses should be done via wrappers and I don't want to use a name that would invite manual casts to file pointers, etc. The value of that member is equal either to (unsigned long)p | flags, p being an address of some struct file instance, or to 0 for an empty fd. For now the new predicate (fd_empty(f)) has no users; all the existing checks have form (!fd_file(f)). We will convert to fd_empty() use later; here we only define it (and tell the compiler that it's unlikely to return true). This commit only deals with representation change; there will be followups. Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2024-08-12introduce fd_file(), convert all accessors to it.Al Viro
For any changes of struct fd representation we need to turn existing accesses to fields into calls of wrappers. Accesses to struct fd::flags are very few (3 in linux/file.h, 1 in net/socket.c, 3 in fs/overlayfs/file.c and 3 more in explicit initializers). Those can be dealt with in the commit converting to new layout; accesses to struct fd::file are too many for that. This commit converts (almost) all of f.file to fd_file(f). It's not entirely mechanical ('file' is used as a member name more than just in struct fd) and it does not even attempt to distinguish the uses in pointer context from those in boolean context; the latter will be eventually turned into a separate helper (fd_empty()). NOTE: mass conversion to fd_empty(), tempting as it might be, is a bad idea; better do that piecewise in commit that convert from fdget...() to CLASS(...). [conflicts in fs/fhandle.c, kernel/bpf/syscall.c, mm/memcontrol.c caught by git; fs/stat.c one got caught by git grep] [fs/xattr.c conflict] Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2024-08-12bpf: Fix updating attached freplace prog in prog_array mapLeon Hwang
The commit f7866c358733 ("bpf: Fix null pointer dereference in resolve_prog_type() for BPF_PROG_TYPE_EXT") fixed a NULL pointer dereference panic, but didn't fix the issue that fails to update attached freplace prog to prog_array map. Since commit 1c123c567fb1 ("bpf: Resolve fext program type when checking map compatibility"), freplace prog and its target prog are able to tail call each other. And the commit 3aac1ead5eb6 ("bpf: Move prog->aux->linked_prog and trampoline into bpf_link on attach") sets prog->aux->dst_prog as NULL after attaching freplace prog to its target prog. After loading freplace the prog_array's owner type is BPF_PROG_TYPE_SCHED_CLS. Then, after attaching freplace its prog->aux->dst_prog is NULL. Then, while updating freplace in prog_array the bpf_prog_map_compatible() incorrectly returns false because resolve_prog_type() returns BPF_PROG_TYPE_EXT instead of BPF_PROG_TYPE_SCHED_CLS. After this patch the resolve_prog_type() returns BPF_PROG_TYPE_SCHED_CLS and update to prog_array can succeed. Fixes: f7866c358733 ("bpf: Fix null pointer dereference in resolve_prog_type() for BPF_PROG_TYPE_EXT") Cc: Toke Høiland-Jørgensen <toke@redhat.com> Cc: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Leon Hwang <leon.hwang@linux.dev> Link: https://lore.kernel.org/r/20240728114612.48486-2-leon.hwang@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-08-12netfs: Fix handling of USE_PGPRIV2 and WRITE_TO_CACHE flagsDavid Howells
The NETFS_RREQ_USE_PGPRIV2 and NETFS_RREQ_WRITE_TO_CACHE flags aren't used correctly. The problem is that we try to set them up in the request initialisation, but we the cache may be in the process of setting up still, and so the state may not be correct. Further, we secondarily sample the cache state and make contradictory decisions later. The issue arises because we set up the cache resources, which allows the cache's ->prepare_read() to switch on NETFS_SREQ_COPY_TO_CACHE - which triggers cache writing even if we didn't set the flags when allocating. Fix this in the following way: (1) Drop NETFS_ICTX_USE_PGPRIV2 and instead set NETFS_RREQ_USE_PGPRIV2 in ->init_request() rather than trying to juggle that in netfs_alloc_request(). (2) Repurpose NETFS_RREQ_USE_PGPRIV2 to merely indicate that if caching is to be done, then PG_private_2 is to be used rather than only setting it if we decide to cache and then having netfs_rreq_unlock_folios() set the non-PG_private_2 writeback-to-cache if it wasn't set. (3) Split netfs_rreq_unlock_folios() into two functions, one of which contains the deprecated code for using PG_private_2 to avoid accidentally doing the writeback path - and always use it if USE_PGPRIV2 is set. (4) As NETFS_ICTX_USE_PGPRIV2 is removed, make netfs_write_begin() always wait for PG_private_2. This function is deprecated and only used by ceph anyway, and so label it so. (5) Drop the NETFS_RREQ_WRITE_TO_CACHE flag and use fscache_operation_valid() on the cache_resources instead. This has the advantage of picking up the result of netfs_begin_cache_read() and fscache_begin_write_operation() - which are called after the object is initialised and will wait for the cache to come to a usable state. Just reverting ae678317b95e[1] isn't a sufficient fix, so this need to be applied on top of that. Without this as well, things like: rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { and: WARNING: CPU: 13 PID: 3621 at fs/ceph/caps.c:3386 may happen, along with some UAFs due to PG_private_2 not getting used to wait on writeback completion. Fixes: 2ff1e97587f4 ("netfs: Replace PG_fscache by setting folio->private and marking dirty") Reported-by: Max Kellermann <max.kellermann@ionos.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: Ilya Dryomov <idryomov@gmail.com> cc: Xiubo Li <xiubli@redhat.com> cc: Hristo Venev <hristo@venev.name> cc: Jeff Layton <jlayton@kernel.org> cc: Matthew Wilcox <willy@infradead.org> cc: ceph-devel@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org Link: https://lore.kernel.org/r/3575457.1722355300@warthog.procyon.org.uk/ [1] Link: https://lore.kernel.org/r/1173209.1723152682@warthog.procyon.org.uk Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-08-12file: fix typo in take_fd() commentMathias Krause
The explanatory comment above take_fd() contains a typo, fix that to not confuse readers. Signed-off-by: Mathias Krause <minipli@grsecurity.net> Link: https://lore.kernel.org/r/20240809135035.748109-1-minipli@grsecurity.net Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-08-12lsm: add the inode_free_security_rcu() LSM implementation hookPaul Moore
The LSM framework has an existing inode_free_security() hook which is used by LSMs that manage state associated with an inode, but due to the use of RCU to protect the inode, special care must be taken to ensure that the LSMs do not fully release the inode state until it is safe from a RCU perspective. This patch implements a new inode_free_security_rcu() implementation hook which is called when it is safe to free the LSM's internal inode state. Unfortunately, this new hook does not have access to the inode itself as it may already be released, so the existing inode_free_security() hook is retained for those LSMs which require access to the inode. Cc: stable@vger.kernel.org Reported-by: syzbot+5446fbf332b0602ede0b@syzkaller.appspotmail.com Closes: https://lore.kernel.org/r/00000000000076ba3b0617f65cc8@google.com Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-12lsm: cleanup lsm_hooks.hPaul Moore
Some cleanup and style corrections for lsm_hooks.h. * Drop the lsm_inode_alloc() extern declaration, it is not needed. * Relocate lsm_get_xattr_slot() and extern variables in the file to improve grouping of related objects. * Don't use tabs to needlessly align structure fields. Reviewed-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-12srcu: faster gp seq wrap-aroundJP Kobryn
Using a higher value for the initial gp sequence counters allows for wrapping to occur faster. It can help with surfacing any issues that may be happening as a result of the wrap around. Signed-off-by: JP Kobryn <inwardvessel@gmail.com> Tested-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
2024-08-12Merge 6.11-rc3 into usb-nextGreg Kroah-Hartman
We need the usb fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-12Merge 6.11-rc3 into tty-nextGreg Kroah-Hartman
We need the tty/serial fixes in here to build on top of. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-12Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds
Pull fd bitmap fix from Al Viro: "Fix bitmap corruption on close_range() by cleaning up copy_fd_bitmaps()" * tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fix bitmap corruption on close_range() with CLOSE_RANGE_UNSHARE
2024-08-12platform/x86/intel/vsec: Add PMT read callbacksDavid E. Box
Some PMT providers require device specific actions before their telemetry can be read. Provide assignable PMT read callbacks to allow providers to perform those actions. Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: David E. Box <david.e.box@linux.intel.com> Link: https://lore.kernel.org/r/20240725122346.4063913-3-michael.j.ruhl@intel.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2024-08-12platform/x86/intel/vsec.h: Move to include/linuxDavid E. Box
Some drivers outside of PDX86 need access to the vsec header. Move it to include/linux to make it easier to include. Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: David E. Box <david.e.box@linux.intel.com> Link: https://lore.kernel.org/r/20240725122346.4063913-2-michael.j.ruhl@intel.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2024-08-12ethtool: rss: don't report key if device doesn't support itJakub Kicinski
marvell/otx2 and mvpp2 do not support setting different keys for different RSS contexts. Contexts have separate indirection tables but key is shared with all other contexts. This is likely fine, indirection table is the most important piece. Don't report the key-related parameters from such drivers. This prevents driver-errors, e.g. otx2 always writes the main key, even when user asks to change per-context key. The second reason is that without this change tracking the keys by the core gets complicated. Even if the driver correctly reject setting key with rss_context != 0, change of the main key would have to be reflected in the XArray for all additional contexts. Since the additional contexts don't have their own keys not including the attributes (in Netlink speak) seems intuitive. ethtool CLI seems to deal with it just fine. Having to set the flag in majority of the drivers is a bit tedious but not reporting the key is a safer default. Reviewed-by: Edward Cree <ecree.xilinx@gmail.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-08-12ethtool: make ethtool_ops::cap_rss_ctx_supported optionalJakub Kicinski
cap_rss_ctx_supported was created because the API for creating and configuring additional contexts is mux'ed with the normal RSS API. Presence of ops does not imply driver can actually support rss_context != 0 (in fact drivers mostly ignore that field). cap_rss_ctx_supported lets core check that the driver is context-aware before calling it. Now that we have .create_rxfh_context, there is no such ambiguity. We can depend on presence of the op. Make setting the bit optional. Reviewed-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Edward Cree <ecree.xilinx@gmail.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-08-12memory: ti-aemif: remove platform data supportBartosz Golaszewski
There are no longer any users of the ti-aemif driver that set up platform data from board files. We can shrink the driver by removing support for it. Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Link: https://lore.kernel.org/r/20240809-ti-aemif-v1-1-27b1e5001390@linaro.org Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
2024-08-12Merge tag 'spi-acpi-lookup-dummy' of ↵Takashi Iwai
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi into topic/cirrus-hp-g12 spi: Add empty versions of ACPI lookup functions A patch from Richard Fitzgerald adding dummy versions of the ACPI lookup functions for SPI: Provide empty versions of acpi_spi_count_resources(), acpi_spi_device_alloc() and acpi_spi_find_controller_by_adev() if the real functions are not being built. This commit fixes two problems with the original definitions: 1) There wasn't an empty version of these functions 2) The #if only depended on CONFIG_ACPI. But the functions are implemented in the core spi.c so CONFIG_SPI_MASTER must also be enabled for the real functions to exist.
2024-08-11mm/memblock: introduce a new helper memblock_estimated_nr_free_pages()Wei Yang
During bootup, system may need the number of free pages in the whole system to do some calculation before all pages are freed to buddy system. Usually this number is get from totalram_pages(). Since we plan to move the free pages accounting in __free_pages_core(), this value may not represent total free pages at the early stage, especially when CONFIG_DEFERRED_STRUCT_PAGE_INIT is enabled. Instead of using raw memblock api, let's introduce a new helper for user to get the estimated number of free pages from memblock point of view. Signed-off-by: Wei Yang <richard.weiyang@gmail.com> CC: David Hildenbrand <david@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20240808001415.6298-1-richard.weiyang@gmail.com Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
2024-08-11net: mii: constify advertising maskRussell King (Oracle)
Constify the advertising mask to linkmode functions that only read from the advertising mask. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-08-11context_tracking, rcu: Rename DYNTICK_IRQ_NONIDLE into CT_NESTING_IRQ_NONIDLEValentin Schneider
The context_tracking.state RCU_DYNTICKS subvariable has been renamed to RCU_WATCHING, and the 'dynticks' prefix can be dropped without losing any meaning. Signed-off-by: Valentin Schneider <vschneid@redhat.com> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
2024-08-11context_tracking, rcu: Rename ct_dynticks_nmi_nesting_cpu() into ↵Valentin Schneider
ct_nmi_nesting_cpu() The context_tracking.state RCU_DYNTICKS subvariable has been renamed to RCU_WATCHING, and the 'dynticks' prefix can be dropped without losing any meaning. Suggested-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Valentin Schneider <vschneid@redhat.com> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
2024-08-11context_tracking, rcu: Rename ct_dynticks_nmi_nesting() into ct_nmi_nesting()Valentin Schneider
The context_tracking.state RCU_DYNTICKS subvariable has been renamed to RCU_WATCHING, and the 'dynticks' prefix can be dropped without losing any meaning. Suggested-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Valentin Schneider <vschneid@redhat.com> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
2024-08-11context_tracking, rcu: Rename struct context_tracking .dynticks_nmi_nesting ↵Valentin Schneider
into .nmi_nesting The context_tracking.state RCU_DYNTICKS subvariable has been renamed to RCU_WATCHING, and the 'dynticks' prefix can be dropped without losing any meaning. [ neeraj.upadhyay: Fix htmldocs build error reported by Stephen Rothwell ] Suggested-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Valentin Schneider <vschneid@redhat.com> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
2024-08-11context_tracking, rcu: Rename ct_dynticks_nesting_cpu() into ct_nesting_cpu()Valentin Schneider
The context_tracking.state RCU_DYNTICKS subvariable has been renamed to RCU_WATCHING, and the 'dynticks' prefix can be dropped without losing any meaning. Suggested-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Valentin Schneider <vschneid@redhat.com> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
2024-08-11context_tracking, rcu: Rename ct_dynticks_nesting() into ct_nesting()Valentin Schneider
The context_tracking.state RCU_DYNTICKS subvariable has been renamed to RCU_WATCHING, reflect that change in the related helpers. Suggested-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Valentin Schneider <vschneid@redhat.com> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
2024-08-11context_tracking, rcu: Rename struct context_tracking .dynticks_nesting into ↵Valentin Schneider
.nesting The context_tracking.state RCU_DYNTICKS subvariable has been renamed to RCU_WATCHING, reflect that change in the related helpers. [ neeraj.upadhyay: Fix htmldocs build error reported by Stephen Rothwell ] Suggested-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Valentin Schneider <vschneid@redhat.com> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
2024-08-10Merge tag 'i2c-for-6.11-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: - Two fixes for SMBusAlert handling in the I2C core: one to avoid an endless loop when scanning for handlers and one to make sure handlers are always called even if HW has broken behaviour - I2C header build fix for when ACPI is enabled but I2C isn't - The testunit gets a rename in the code to match the documentation - Two fixes for the Qualcomm GENI I2C controller are cleaning up the error exit patch in the runtime_resume() function. The first is disabling the clock, the second disables the icc on the way out * tag 'i2c-for-6.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: testunit: match HostNotify test name with docs i2c: qcom-geni: Add missing geni_icc_disable in geni_i2c_runtime_resume i2c: qcom-geni: Add missing clk_disable_unprepare in geni_i2c_runtime_resume i2c: Fix conditional for substituting empty ACPI functions i2c: smbus: Send alert notifications to all devices if source not found i2c: smbus: Improve handling of stuck alerts
2024-08-10iio: trigger: allow devices to suspend/resume theirs associated triggerDenis Benato
When a machine enters a sleep state while a trigger is associated to an iio device that trigger is not resumed after exiting the sleep state: provide iio device drivers a way to suspend and resume the associated trigger to solve the aforementioned bug. Each iio driver supporting external triggers is expected to call iio_device_suspend_triggering before suspending, and iio_device_resume_triggering upon resuming. Signed-off-by: Denis Benato <benato.denis96@gmail.com> Link: https://patch.msgid.link/20240807185619.7261-2-benato.denis96@gmail.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-08-10iio: add child nodes support in iio backend frameworkOlivier Moysan
Add an API to support IIO generic channels binding: http://devicetree.org/schemas/iio/adc/adc.yaml# This new API is needed, as generic channel DT node isn't populated as a device. Add devm_iio_backend_fwnode_get() to allow an IIO device backend consumer to reference backend phandles in its child nodes. Signed-off-by: Olivier Moysan <olivier.moysan@foss.st.com> Reviewed-by: Nuno Sa <nuno.sa@analog.com> Link: https://patch.msgid.link/20240730084640.1307938-4-olivier.moysan@foss.st.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-08-10iio: add enable and disable services to iio backend frameworkOlivier Moysan
Add iio_backend_disable() and iio_backend_enable() APIs to allow IIO backend consumer to request backend disabling and enabling. Signed-off-by: Olivier Moysan <olivier.moysan@foss.st.com> Reviewed-by: Nuno Sa <nuno.sa@analog.com> Link: https://patch.msgid.link/20240730084640.1307938-3-olivier.moysan@foss.st.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-08-10iio: add read scale and offset services to iio backend frameworkOlivier Moysan
Add iio_backend_read_scale() and iio_backend_read_offset() services to read channel scale and offset from an IIO backbend device. Also add a read_raw callback which replicates the read_raw callback of the IIO framework, and is intended to request miscellaneous channel attributes from the backend device. Both scale and offset helpers use this callback. Signed-off-by: Olivier Moysan <olivier.moysan@foss.st.com> Reviewed-by: Nuno Sa <nuno.sa@analog.com> Link: https://patch.msgid.link/20240730084640.1307938-2-olivier.moysan@foss.st.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-08-10Revert "lib/mpi: Introduce ec implementation to MPI library"Herbert Xu
This reverts commit d58bb7e55a8a65894cc02f27c3e2bf9403e7c40f. It's no longer needed since sm2 has been removed. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-08-09Merge tag 'irq-domain-24-08-09' into irq/coreThomas Gleixner
Merge the irqdomain changes which are required for regmap to apply depending patches and therefore tagged.
2024-08-09irqdomain: Allow giving name suffix for domainMatti Vaittinen
Devices can provide multiple interrupt lines. One reason for this is that a device has multiple subfunctions, each providing its own interrupt line. Another reason is that a device can be designed to be used (also) on a system where some of the interrupts can be routed to another processor. A line often further acts as a demultiplex for specific interrupts and has it's respective set of interrupt (status, mask, ack, ...) registers. Regmap supports the handling of these registers and demultiplexing interrupts, but the interrupt domain code ends up assigning the same name for the per interrupt line domains. This causes a naming collision in the debugFS code and leads to confusion, as /proc/interrupts shows two separate interrupts with the same domain name and hardware interrupt number. Instead of adding a workaround in regmap or driver code, allow giving a name suffix for the domain name when the domain is created. Add a name_suffix field in the irq_domain_info structure and make irq_domain_instantiate() use this suffix if it is given when a domain is created. [ tglx: Adopt it to the cleanup patch and fixup the invalid NULL return ] Signed-off-by: Matti Vaittinen <mazziesaccount@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/871q2yvk5x.ffs@tglx