summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-03-20selftests/pidfd: third test for multi-threaded exec pollingChristian Brauner
Ensure that during a multi-threaded exec and premature thread-group leader exit no exit notification is generated. Link: https://lore.kernel.org/r/20250320-work-pidfs-thread_group-v4-4-da678ce805bf@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-20selftests/pidfd: second test for multi-threaded exec pollingChristian Brauner
Ensure that during a multi-threaded exec and premature thread-group leader exit no exit notification is generated. Link: https://lore.kernel.org/r/20250320-work-pidfs-thread_group-v4-3-da678ce805bf@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-20selftests/pidfd: first test for multi-threaded exec pollingChristian Brauner
Add first test for premature thread-group leader exit. Link: https://lore.kernel.org/r/20250320-work-pidfs-thread_group-v4-2-da678ce805bf@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-20pidfs: improve multi-threaded exec and premature thread-group leader exit ↵Christian Brauner
polling This is another attempt trying to make pidfd polling for multi-threaded exec and premature thread-group leader exit consistent. A quick recap of these two cases: (1) During a multi-threaded exec by a subthread, i.e., non-thread-group leader thread, all other threads in the thread-group including the thread-group leader are killed and the struct pid of the thread-group leader will be taken over by the subthread that called exec. IOW, two tasks change their TIDs. (2) A premature thread-group leader exit means that the thread-group leader exited before all of the other subthreads in the thread-group have exited. Both cases lead to inconsistencies for pidfd polling with PIDFD_THREAD. Any caller that holds a PIDFD_THREAD pidfd to the current thread-group leader may or may not see an exit notification on the file descriptor depending on when poll is performed. If the poll is performed before the exec of the subthread has concluded an exit notification is generated for the old thread-group leader. If the poll is performed after the exec of the subthread has concluded no exit notification is generated for the old thread-group leader. The correct behavior would be to simply not generate an exit notification on the struct pid of a subhthread exec because the struct pid is taken over by the subthread and thus remains alive. But this is difficult to handle because a thread-group may exit prematurely as mentioned in (2). In that case an exit notification is reliably generated but the subthreads may continue to run for an indeterminate amount of time and thus also may exec at some point. So far there was no way to distinguish between (1) and (2) internally. This tiny series tries to address this problem by discarding PIDFD_THREAD notification on premature thread-group leader exit. If that works correctly then no exit notifications are generated for a PIDFD_THREAD pidfd for a thread-group leader until all subthreads have been reaped. If a subthread should exec aftewards no exit notification will be generated until that task exits or it creates subthreads and repeates the cycle. Co-Developed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Link: https://lore.kernel.org/r/20250320-work-pidfs-thread_group-v4-1-da678ce805bf@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-20Merge tag 'batadv-net-pullrequest-20250318' of ↵Paolo Abeni
git://git.open-mesh.org/linux-merge Simon Wunderlich says: ==================== Here is batman-adv bugfix: - Ignore own maximum aggregation size during RX, Sven Eckelmann * tag 'batadv-net-pullrequest-20250318' of git://git.open-mesh.org/linux-merge: batman-adv: Ignore own maximum aggregation size during RX ==================== Link: https://patch.msgid.link/20250318150035.35356-1-sw@simonwunderlich.de Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-20net/neighbor: add missing policy for NDTPA_QUEUE_LENBYTESLin Ma
Previous commit 8b5c171bb3dc ("neigh: new unresolved queue limits") introduces new netlink attribute NDTPA_QUEUE_LENBYTES to represent approximative value for deprecated QUEUE_LEN. However, it forgot to add the associated nla_policy in nl_ntbl_parm_policy array. Fix it with one simple NLA_U32 type policy. Fixes: 8b5c171bb3dc ("neigh: new unresolved queue limits") Signed-off-by: Lin Ma <linma@zju.edu.cn> Link: https://patch.msgid.link/20250315165113.37600-1-linma@zju.edu.cn Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-20fs: sort out fd allocation vs dup2 race commentary, take 2Mateusz Guzik
fd_install() has a questionable comment above it. While it correctly points out a possible race against dup2(), it states: > We need to detect this and fput() the struct file we are about to > overwrite in this case. > > It should never happen - if we allow dup2() do it, _really_ bad things > will follow. I have difficulty parsing the above. The first sentence would suggest fd_install() tries to detect and recover from the race (it does not), the next one claims the race needs to be dealt with (it is, by dup2()). Given that fd_install() does not suffer the burden, this patch removes the above and instead expands on the race in dup2() commentary. While here tidy up the docs around fd_install(). Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Link: https://lore.kernel.org/r/20250320102637.1924183-1-mjguzik@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-20Merge patch series "further iomap large atomic writes changes"Christian Brauner
John Garry <john.g.garry@oracle.com> says: These iomap changes are spun-off the XFS large atomic writes series at https://lore.kernel.org/linux-xfs/86a64256-497a-453b-bbba-a5ac6b4cb056@oracle.com/T/#ma99c763221de9d49ea2ccfca9ff9b8d71c8b2677 The XFS parts there are not ready yet, but it is worth having the iomap changes queued in advance. Some much earlier changes from that same series were already queued in the vfs tree, and these patches rework those changes - specifically the first patch in this series does. The most other significant change is the patch to rework how the bio flags are set in the DIO patch. * patches from https://lore.kernel.org/r/20250320120250.4087011-1-john.g.garry@oracle.com: iomap: rework IOMAP atomic flags iomap: comment on atomic write checks in iomap_dio_bio_iter() iomap: inline iomap_dio_bio_opflags() Link: https://lore.kernel.org/r/20250320120250.4087011-1-john.g.garry@oracle.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-20iomap: rework IOMAP atomic flagsJohn Garry
Flag IOMAP_ATOMIC_SW is not really required. The idea of having this flag is that the FS ->iomap_begin callback could check if this flag is set to decide whether to do a SW (FS-based) atomic write. But the FS can set which ->iomap_begin callback it wants when deciding to do a FS-based atomic write. Furthermore, it was thought that IOMAP_ATOMIC_HW is not a proper name, as the block driver can use SW-methods to emulate an atomic write. So change back to IOMAP_ATOMIC. The ->iomap_begin callback needs though to indicate to iomap core that REQ_ATOMIC needs to be set, so add IOMAP_F_ATOMIC_BIO for that. These changes were suggested by Christoph Hellwig and Dave Chinner. Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20250320120250.4087011-4-john.g.garry@oracle.com Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-20iomap: comment on atomic write checks in iomap_dio_bio_iter()John Garry
Help explain the code. Also clarify the comment for bio size check. Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20250320120250.4087011-3-john.g.garry@oracle.com Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-20iomap: inline iomap_dio_bio_opflags()John Garry
It is neater to build blk_opf_t fully in one place, so inline iomap_dio_bio_opflags() in iomap_dio_bio_iter(). Also tidy up the logic in dealing with IOMAP_DIO_CALLER_COMP, in generally separate the logic in dealing with flags associated with reads and writes. Originally-from: Christoph Hellwig <hch@lst.de> Reviewed-by: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com> Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20250320120250.4087011-2-john.g.garry@oracle.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-20tools headers: Sync uapi/asm-generic/socket.h with the kernel sourcesAlexander Mikhalitsyn
This also fixes a wrong definitions for SCM_TS_OPT_ID & SO_RCVPRIORITY. Accidentally found while working on another patchset. Cc: linux-kernel@vger.kernel.org Cc: netdev@vger.kernel.org Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Vadim Fedorenko <vadim.fedorenko@linux.dev> Cc: Willem de Bruijn <willemb@google.com> Cc: Jason Xing <kerneljasonxing@gmail.com> Cc: Anna Emese Nyiri <annaemesenyiri@gmail.com> Cc: Kuniyuki Iwashima <kuniyu@amazon.com> Cc: Paolo Abeni <pabeni@redhat.com> Fixes: a89568e9be75 ("selftests: txtimestamp: add SCM_TS_OPT_ID test") Fixes: e45469e594b2 ("sock: Introduce SO_RCVPRIORITY socket option") Link: https://lore.kernel.org/netdev/20250314195257.34854-1-kuniyu@amazon.com/ Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Link: https://patch.msgid.link/20250314214155.16046-1-aleksandr.mikhalitsyn@canonical.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-20mptcp: Fix data stream corruption in the address announcementArthur Mongodin
Because of the size restriction in the TCP options space, the MPTCP ADD_ADDR option is exclusive and cannot be sent with other MPTCP ones. For this reason, in the linked mptcp_out_options structure, group of fields linked to different options are part of the same union. There is a case where the mptcp_pm_add_addr_signal() function can modify opts->addr, but not ended up sending an ADD_ADDR. Later on, back in mptcp_established_options, other options will be sent, but with unexpected data written in other fields due to the union, e.g. in opts->ext_copy. This could lead to a data stream corruption in the next packet. Using an intermediate variable, prevents from corrupting previously established DSS option. The assignment of the ADD_ADDR option parameters is now done once we are sure this ADD_ADDR option can be set in the packet, e.g. after having dropped other suboptions. Fixes: 1bff1e43a30e ("mptcp: optimize out option generation") Cc: stable@vger.kernel.org Suggested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Arthur Mongodin <amongodin@randorisec.fr> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> [ Matt: the commit message has been updated: long lines splits and some clarifications. ] Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250314-net-mptcp-fix-data-stream-corr-sockopt-v1-1-122dbb249db3@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-20mmc: host: Wait for Vdd to settle on card power offErick Shepherd
The SD spec version 6.0 section 6.4.1.5 requires that Vdd must be lowered to less than 0.5V for a minimum of 1 ms when powering off a card. Increase wait to 15 ms so that voltage has time to drain down to 0.5V and cards can power off correctly. Issues with voltage drain time were only observed on Apollo Lake and Bay Trail host controllers so this fix is limited to those devices. Signed-off-by: Erick Shepherd <erick.shepherd@ni.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Link: https://lore.kernel.org/r/20250314195021.1588090-1-erick.shepherd@ni.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2025-03-20i2c: amd-mp2: drop free_irq() of devm_request_irq() allocated irqYang Yingliang
irq allocated with devm_request_irq() will be freed in devm_irq_release(), using free_irq() in ->remove() will causes a dangling pointer, and a subsequent double free. So remove the free_irq() in the error path and remove path. Fixes: 969864efae78 ("i2c: amd-mp2: use msix/msi if the hardware supports") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Acked-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Link: https://lore.kernel.org/r/20221103121146.99836-1-yangyingliang@huawei.com Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2025-03-20libfs: Fix duplicate directory entry in offset_dir_lookupYongjian Sun
There is an issue in the kernel: In tmpfs, when using the "ls" command to list the contents of a directory with a large number of files, glibc performs the getdents call in multiple rounds. If a concurrent unlink occurs between these getdents calls, it may lead to duplicate directory entries in the ls output. One possible reproduction scenario is as follows: Create 1026 files and execute ls and rm concurrently: for i in {1..1026}; do echo "This is file $i" > /tmp/dir/file$i done ls /tmp/dir rm /tmp/dir/file4 ->getdents(file1026-file5) ->unlink(file4) ->getdents(file5,file3,file2,file1) It is expected that the second getdents call to return file3 through file1, but instead it returns an extra file5. The root cause of this problem is in the offset_dir_lookup function. It uses mas_find to determine the starting position for the current getdents call. Since mas_find locates the first position that is greater than or equal to mas->index, when file4 is deleted, it ends up returning file5. It can be fixed by replacing mas_find with mas_find_rev, which finds the first position that is less than or equal to mas->index. Fixes: b9b588f22a0c ("libfs: Use d_children list to iterate simple_offset directories") Signed-off-by: Yongjian Sun <sunyongjian1@huawei.com> Link: https://lore.kernel.org/r/20250320034417.555810-1-sunyongjian@huaweicloud.com Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-20ASoC: wm8904: add DMIC supportErnest Van Hoecke
The WM8904 codec supports both ADC and DMIC inputs. Get input pin functionality from the platform data and add the necessary controls depending on the possible additional routing. The ADC and DMIC share the IN1L/DMICDAT1 and IN1R/DMICDAT2 pins. This leads to a few scenarios requiring different DAPM routing: - When both are connected to an analog input, only the ADC is used. - When one line is a DMIC and the other an analog input, the DMIC source is set from the platform data and a mux is added to select whether to use the ADC or DMIC. - When both are connected to a DMIC, another mux is added to this to select the DMIC source. Note that we still need to be able to select the ADC system for use with the IN2L, IN2R, IN3L and IN3R pins. Signed-off-by: Ernest Van Hoecke <ernest.vanhoecke@toradex.com> Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com> Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://patch.msgid.link/20250319142059.46692-6-francesco@dolcini.it Signed-off-by: Mark Brown <broonie@kernel.org>
2025-03-20ASoC: wm8904: get platform data from DTErnest Van Hoecke
Read in optional codec-specific properties from the device tree. The platform_data structure is not populated when using device trees. This change parses optional dts properties to populate it. - wlf,in1l-as-dmicdat1 - wlf,in1r-as-dmicdat2 - wlf,gpio-cfg - wlf,micbias-cfg - wlf,drc-cfg-regs - wlf,drc-cfg-names - wlf,retune-mobile-cfg-regs - wlf,retune-mobile-cfg-names - wlf,retune-mobile-cfg-hz Datasheet: https://statics.cirrus.com/pubs/proDatasheet/WM8904_Rev4.1.pdf Signed-off-by: Ernest Van Hoecke <ernest.vanhoecke@toradex.com> Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com> Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://patch.msgid.link/20250319142059.46692-5-francesco@dolcini.it Signed-off-by: Mark Brown <broonie@kernel.org>
2025-03-20ASoC: dt-bindings: wm8904: Add DMIC, GPIO, MIC and EQ supportErnest Van Hoecke
Add two properties to select the IN1L/DMICDAT1 and IN2R/DMICDAT2 functionality: - wlf,in1l-as-dmicdat1 - wlf,in1r-as-dmicdat2 Add a property to describe the GPIO configuration registers, that can be used to set the four multifunction pins: - wlf,gpio-cfg Add a property to describe the mic bias control registers: - wlf,micbias-cfg Add two properties to describe the Dynamic Range Controller (DRC), allowing multiple named configurations where each config sets the 4 DRC registers (R40-R43): - wlf,drc-cfg-regs - wlf,drc-cfg-names Add three properties to describe the equalizer (ReTune Mobile), allowing multiple named configurations (associated with a samplerate) that set the 24 (R134-R157) EQ registers: - wlf,retune-mobile-cfg-regs - wlf,retune-mobile-cfg-hz - wlf,retune-mobile-cfg-rates The set of names and configurations for DRC and ReTune Mobile are specified by system integrators. The names are exposed directly to userspace as options that can be selected at runtime. Adding the DRC and ReTune Mobile data to the DT eases the transition from pdata, which has handled them this way for over a decade. The parameters filled in here are almost certainly specific tuning for the hardware so it makes sense to ship them with the hardware description. Datasheet: https://statics.cirrus.com/pubs/proDatasheet/WM8904_Rev4.1.pdf Signed-off-by: Ernest Van Hoecke <ernest.vanhoecke@toradex.com> Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com> Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://patch.msgid.link/20250319142059.46692-4-francesco@dolcini.it Signed-off-by: Mark Brown <broonie@kernel.org>
2025-03-20ASoC: wm8904: Don't touch GPIO configs set to 0xFFFFErnest Van Hoecke
When updating the GPIO registers, do nothing for all fields of gpio_cfg that are "0xFFFF". This "do nothing" flag used to be 0 to easily check whether the gpio_cfg field was actually set inside pdata or left empty (default). However, 0 is a valid configuration for these registers, while 0xFFFF is not. With this change, users can explicitly set them to 0. Not setting gpio_cfg in the platform data will now lead to setting all GPIO registers to 0 instead of leaving them unset. No one is using this platform data with this codec. The change gets the driver ready to properly set gpio_cfg from the DT. Datasheet: https://statics.cirrus.com/pubs/proDatasheet/WM8904_Rev4.1.pdf Signed-off-by: Ernest Van Hoecke <ernest.vanhoecke@toradex.com> Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com> Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://patch.msgid.link/20250319142059.46692-3-francesco@dolcini.it Signed-off-by: Mark Brown <broonie@kernel.org>
2025-03-20of: Add of_property_read_u16_indexErnest Van Hoecke
There is an of_property_read_u32_index and of_property_read_u64_index. This patch adds a similar helper for u16. Signed-off-by: Ernest Van Hoecke <ernest.vanhoecke@toradex.com> Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com> Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://patch.msgid.link/20250319142059.46692-2-francesco@dolcini.it Signed-off-by: Mark Brown <broonie@kernel.org>
2025-03-20spi: spi-mem: Introduce a default ->exec_op() debug logMiquel Raynal
Many spi-mem controller drivers have a very similar debug log at the beginning of their ->exec_op() callback implementation. This debug log is effectively useful, so let's create one that is complete and concise enough, so developers no longer need to write their own. The verbosity being high, VERBOSE_DEBUG will be required in this case. Remove the debug log from individual drivers and propose a common one. Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Tudor Ambarus <tudor.ambarus@linaro.org> Link: https://patch.msgid.link/20250320115644.2231240-1-miquel.raynal@bootlin.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-03-20spi: dt-bindings: cdns,qspi-nor: Require some peripheral propertiesMiquel Raynal
There are 5 mandatory peripheral properties. They are described in a separate binding but not explicitly required. Make sure they are correctly marked required and update the example to reflect this. Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Link: https://patch.msgid.link/20250319094651.1290509-4-miquel.raynal@bootlin.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-03-20spi: dt-bindings: cdns,qspi-nor: Deprecate the Cadence compatible aloneMiquel Raynal
The initial SPI controller IP from Cadence has always been implemented into controllers from various hardware manufacturers and because of that, it has always been (rightfully) doubled with a more specific compatible. There are likely no reasons to keep this compatible legitimate, alone. Make sure people do not get mislead by officially deprecating this compatible. While at deprecating, let's update the examples to avoid documenting deprecated properties. Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Link: https://patch.msgid.link/20250319094651.1290509-3-miquel.raynal@bootlin.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-03-20spi: dt-bindings: cdns,qspi-nor: Be more descriptive regarding what this ↵Miquel Raynal
controller is Despite being very common in commit logs, SPI NOR controllers simply do not exist. At least, they are not as specific as the name implies. There are SPI memory controllers which are indeed "specialized" and optimized for handling "memories", but most of them are just generic and accept almost any kind of opcode, address, dummy and data cycles, making them as suitable for NANDs than NORs. Furthermore, this controller supports any kind of bus, from single to octal NAND, so make it clear. Also add a comment to mention that the initial compatible naming is too specific (but obviously kept for backward compatibility reasons). Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://patch.msgid.link/20250319094651.1290509-2-miquel.raynal@bootlin.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-03-20fs: call inode_sb_list_add() outside of inode hash lockMateusz Guzik
As both locks are highly contended during significant inode churn, holding the inode hash lock while waiting for the sb list lock exacerbates the problem. Why moving it out is safe: the inode at hand still has I_NEW set and anyone who finds it through legitimate means waits for the bit to clear, by which time inode_sb_list_add() is guaranteed to have finished. This significantly drops hash lock contention for me when stating 20 separate trees in parallel, each with 1000 directories * 1000 files. However, no speed up was observed as contention increased on the other locks, notably dentry LRU. Even so, removal of the lock ordering will help making this faster later. Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Link: https://lore.kernel.org/r/20250320004643.1903287-1-mjguzik@gmail.com Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-20ALSA: oxygen: Fix dependency on CONFIG_PM_SLEEPTakashi Iwai
The conversion to EXPORT_SIMPLE_DEV_PM_OPS() also replaced the pm ops assignment with pm_ptr() macro, but this made difference from the original code; it had conditional with ifdef CONFIG_PM_SLEEP, while we have now with CONFIG_PM, instead. This seems causing build errors with randomconfig. For fixing the inconsistency, replace pm_ptr() with pm_sleep_ptr(). Fixes: 5ea0a2206b58 ("ALSA: oxygen: Convert to EXPORT_SIMPLE_DEV_PM_OPS()") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202503201853.7kB0BPRw-lkp@intel.com/ Link: https://patch.msgid.link/20250320105721.10789-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2025-03-20Merge branch 'net-fix-lwtunnel-reentry-loops'Paolo Abeni
Justin Iurman says: ==================== net: fix lwtunnel reentry loops When the destination is the same after the transformation, we enter a lwtunnel loop. This is true for most of lwt users: ioam6, rpl, seg6, seg6_local, ila_lwt, and lwt_bpf. It can happen in their input() and output() handlers respectively, where either dst_input() or dst_output() is called at the end. It can also happen in xmit() handlers. Here is an example for rpl_input(): dump_stack_lvl+0x60/0x80 rpl_input+0x9d/0x320 lwtunnel_input+0x64/0xa0 lwtunnel_input+0x64/0xa0 lwtunnel_input+0x64/0xa0 lwtunnel_input+0x64/0xa0 lwtunnel_input+0x64/0xa0 [...] lwtunnel_input+0x64/0xa0 lwtunnel_input+0x64/0xa0 lwtunnel_input+0x64/0xa0 lwtunnel_input+0x64/0xa0 lwtunnel_input+0x64/0xa0 ip6_sublist_rcv_finish+0x85/0x90 ip6_sublist_rcv+0x236/0x2f0 ... until rpl_do_srh() fails, which means skb_cow_head() failed. This series provides a fix at the core level of lwtunnel to catch such loops when they're not caught by the respective lwtunnel users, and handle the loop case in ioam6 which is one of the users. This series also comes with a new selftest to detect some dst cache reference loops in lwtunnel users. ==================== Link: https://patch.msgid.link/20250314120048.12569-1-justin.iurman@uliege.be Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-20selftests: net: test for lwtunnel dst ref loopsJustin Iurman
As recently specified by commit 0ea09cbf8350 ("docs: netdev: add a note on selftest posting") in net-next, the selftest is therefore shipped in this series. However, this selftest does not really test this series. It needs this series to avoid crashing the kernel. What it really tests, thanks to kmemleak, is what was fixed by the following commits: - commit c71a192976de ("net: ipv6: fix dst refleaks in rpl, seg6 and ioam6 lwtunnels") - commit 92191dd10730 ("net: ipv6: fix dst ref loops in rpl, seg6 and ioam6 lwtunnels") - commit c64a0727f9b1 ("net: ipv6: fix dst ref loop on input in seg6 lwt") - commit 13e55fbaec17 ("net: ipv6: fix dst ref loop on input in rpl lwt") - commit 0e7633d7b95b ("net: ipv6: fix dst ref loop in ila lwtunnel") - commit 5da15a9c11c1 ("net: ipv6: fix missing dst ref drop in ila lwtunnel") Signed-off-by: Justin Iurman <justin.iurman@uliege.be> Link: https://patch.msgid.link/20250314120048.12569-4-justin.iurman@uliege.be Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-20net: ipv6: ioam6: fix lwtunnel_output() loopJustin Iurman
Fix the lwtunnel_output() reentry loop in ioam6_iptunnel when the destination is the same after transformation. Note that a check on the destination address was already performed, but it was not enough. This is the example of a lwtunnel user taking care of loops without relying only on the last resort detection offered by lwtunnel. Fixes: 8cb3bf8bff3c ("ipv6: ioam: Add support for the ip6ip6 encapsulation") Signed-off-by: Justin Iurman <justin.iurman@uliege.be> Link: https://patch.msgid.link/20250314120048.12569-3-justin.iurman@uliege.be Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-20net: lwtunnel: fix recursion loopsJustin Iurman
This patch acts as a parachute, catch all solution, by detecting recursion loops in lwtunnel users and taking care of them (e.g., a loop between routes, a loop within the same route, etc). In general, such loops are the consequence of pathological configurations. Each lwtunnel user is still free to catch such loops early and do whatever they want with them. It will be the case in a separate patch for, e.g., seg6 and seg6_local, in order to provide drop reasons and update statistics. Another example of a lwtunnel user taking care of loops is ioam6, which has valid use cases that include loops (e.g., inline mode), and which is addressed by the next patch in this series. Overall, this patch acts as a last resort to catch loops and drop packets, since we don't want to leak something unintentionally because of a pathological configuration in lwtunnels. The solution in this patch reuses dev_xmit_recursion(), dev_xmit_recursion_inc(), and dev_xmit_recursion_dec(), which seems fine considering the context. Closes: https://lore.kernel.org/netdev/2bc9e2079e864a9290561894d2a602d6@akamai.com/ Closes: https://lore.kernel.org/netdev/Z7NKYMY7fJT5cYWu@shredder/ Fixes: ffce41962ef6 ("lwtunnel: support dst output redirect function") Fixes: 2536862311d2 ("lwt: Add support to redirect dst.input") Fixes: 14972cbd34ff ("net: lwtunnel: Handle fragmentation") Signed-off-by: Justin Iurman <justin.iurman@uliege.be> Link: https://patch.msgid.link/20250314120048.12569-2-justin.iurman@uliege.be Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-20net: ti: icssg-prueth: Add lock to statsMD Danish Anwar
Currently the API emac_update_hardware_stats() reads different ICSSG stats without any lock protection. This API gets called by .ndo_get_stats64() which is only under RCU protection and nothing else. Add lock to this API so that the reading of statistics happens during lock. Fixes: c1e10d5dc7a1 ("net: ti: icssg-prueth: Add ICSSG Stats") Signed-off-by: MD Danish Anwar <danishanwar@ti.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250314102721.1394366-1-danishanwar@ti.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-20net: atm: fix use after free in lec_send()Dan Carpenter
The ->send() operation frees skb so save the length before calling ->send() to avoid a use after free. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/c751531d-4af4-42fe-affe-6104b34b791d@stanley.mountain Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-20fs: tidy up do_sys_openat2() with likely/unlikelyMateusz Guzik
Otherwise gcc 13 generates conditional forward jumps (aka branch mispredict by default) for build_open_flags() being succesfull. Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Link: https://lore.kernel.org/r/20250320092331.1921700-1-mjguzik@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-20Merge branch 'slab/for-6.15/kfree_rcu_tiny' into slab/for-nextVlastimil Babka
Merge the slab feature branch kfree_rcu_tiny for 6.15: - Move the TINY_RCU kvfree_rcu() implementation from RCU to SLAB subsystem and cleanup its integration.
2025-03-20cpuidle, sched: Use smp_mb__after_atomic() in current_clr_polling()Yujun Dong
In architectures that use the polling bit, current_clr_polling() employs smp_mb() to ensure that the clearing of the polling bit is visible to other cores before checking TIF_NEED_RESCHED. However, smp_mb() can be costly. Given that clear_bit() is an atomic operation, replacing smp_mb() with smp_mb__after_atomic() is appropriate. Many architectures implement smp_mb__after_atomic() as a lighter-weight barrier compared to smp_mb(), leading to performance improvements. For instance, on x86, smp_mb__after_atomic() is a no-op. This change eliminates a smp_mb() instruction in the cpuidle wake-up path, saving several CPU cycles and thereby reducing wake-up latency. Architectures that do not use the polling bit will retain the original smp_mb() behavior to ensure that existing dependencies remain unaffected. Signed-off-by: Yujun Dong <yujundong@pascal-lab.net> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20241230141624.155356-1-yujundong@pascal-lab.net
2025-03-20fs: reduce work in fdget_pos()Mateusz Guzik
1. predict the file was found 2. explicitly compare the ref to "one", ignoring the dead zone The latter arguably improves the behavior to begin with. Suppose the count turned bad -- the previously used ref routine is going to check for it and return 0, indicating the count does not necessitate taking ->f_pos_lock. But there very well may be several users. i.e. not paying for special-casing the dead zone improves semantics. While here spell out each condition in a dedicated if statement. This has no effect on generated code. Sizes are as follows (in bytes; gcc 13, x86-64): stock: 321 likely(): 298 likely()+ref: 280 Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Link: https://lore.kernel.org/r/20250319215801.1870660-1-mjguzik@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-20RISC-V: KVM: Optimize comments in kvm_riscv_vcpu_isa_disable_allowedChao Du
The comments for EXT_SVADE are a bit confusing. Clarify it. Signed-off-by: Chao Du <duchao@eswincomputing.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20250221025929.31678-1-duchao@eswincomputing.com Signed-off-by: Anup Patel <anup@brainfault.org>
2025-03-19xsk: fix an integer overflow in xp_create_and_assign_umem()Gavrilov Ilia
Since the i and pool->chunk_size variables are of type 'u32', their product can wrap around and then be cast to 'u64'. This can lead to two different XDP buffers pointing to the same memory area. Found by InfoTeCS on behalf of Linux Verification Center (linuxtesting.org) with SVACE. Fixes: 94033cd8e73b ("xsk: Optimize for aligned case") Cc: stable@vger.kernel.org Signed-off-by: Ilia Gavrilov <Ilia.Gavrilov@infotecs.ru> Link: https://patch.msgid.link/20250313085007.3116044-1-Ilia.Gavrilov@infotecs.ru Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-19Merge branch 'kvm-arm64/pmu-fixes' into kvmarm/nextOliver Upton
* kvm-arm64/pmu-fixes: : vPMU fixes for 6.15 courtesy of Akihiko Odaki : : Various fixes to KVM's vPMU implementation, notably ensuring : userspace-directed changes to the PMCs are reflected in the backing perf : events. KVM: arm64: PMU: Reload when resetting KVM: arm64: PMU: Reload when user modifies registers KVM: arm64: PMU: Fix SET_ONE_REG for vPMC regs KVM: arm64: PMU: Assume PMU presence in pmu-emul.c KVM: arm64: PMU: Set raw values from user to PM{C,I}NTEN{SET,CLR}, PMOVS{SET,CLR} Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-03-19Merge branch 'kvm-arm64/pkvm-6.15' into kvmarm/nextOliver Upton
* kvm-arm64/pkvm-6.15: : pKVM updates for 6.15 : : - SecPageTable stats for stage-2 table pages allocated by the protected : hypervisor (Vincent Donnefort) : : - HCRX_EL2 trap + vCPU initialization fixes for pKVM (Fuad Tabba) KVM: arm64: Create each pKVM hyp vcpu after its corresponding host vcpu KVM: arm64: Factor out pKVM hyp vcpu creation to separate function KVM: arm64: Initialize HCRX_EL2 traps in pKVM KVM: arm64: Factor out setting HCRX_EL2 traps into separate function KVM: arm64: Count pKVM stage-2 usage in secondary pagetable stats KVM: arm64: Distinct pKVM teardown memcache for stage-2 KVM: arm64: Add flags to kvm_hyp_memcache Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-03-19Merge branch 'kvm-arm64/writable-midr' into kvmarm/nextOliver Upton
* kvm-arm64/writable-midr: : Writable implementation ID registers, courtesy of Sebastian Ott : : Introduce a new capability that allows userspace to set the : ID registers that identify a CPU implementation: MIDR_EL1, REVIDR_EL1, : and AIDR_EL1. Also plug a hole in KVM's trap configuration where : SMIDR_EL1 was readable at EL1, despite the fact that KVM does not : support SME. KVM: arm64: Fix documentation for KVM_CAP_ARM_WRITABLE_IMP_ID_REGS KVM: arm64: Copy MIDR_EL1 into hyp VM when it is writable KVM: arm64: Copy guest CTR_EL0 into hyp VM KVM: selftests: arm64: Test writes to MIDR,REVIDR,AIDR KVM: arm64: Allow userspace to change the implementation ID registers KVM: arm64: Load VPIDR_EL2 with the VM's MIDR_EL1 value KVM: arm64: Maintain per-VM copy of implementation ID regs KVM: arm64: Set HCR_EL2.TID1 unconditionally Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-03-19Merge branch 'kvm-arm64/pmuv3-asahi' into kvmarm/nextOliver Upton
* kvm-arm64/pmuv3-asahi: : Support PMUv3 for KVM guests on Apple silicon : : Take advantage of some IMPLEMENTATION DEFINED traps available on Apple : parts to trap-and-emulate the PMUv3 registers on behalf of a KVM guest. : Constrain the vPMU to a cycle counter and single event counter, as the : Apple PMU has events that cannot be counted on every counter. : : There is a small new interface between the ARM PMU driver and KVM, where : the PMU driver owns the PMUv3 -> hardware event mappings. arm64: Enable IMP DEF PMUv3 traps on Apple M* KVM: arm64: Provide 1 event counter on IMPDEF hardware drivers/perf: apple_m1: Provide helper for mapping PMUv3 events KVM: arm64: Remap PMUv3 events onto hardware KVM: arm64: Advertise PMUv3 if IMPDEF traps are present KVM: arm64: Compute synthetic sysreg ESR for Apple PMUv3 traps KVM: arm64: Move PMUVer filtering into KVM code KVM: arm64: Use guard() to cleanup usage of arm_pmus_lock KVM: arm64: Drop kvm_arm_pmu_available static key KVM: arm64: Use a cpucap to determine if system supports FEAT_PMUv3 KVM: arm64: Always support SW_INCR PMU event KVM: arm64: Compute PMCEID from arm_pmu's event bitmaps drivers/perf: apple_m1: Support host/guest event filtering drivers/perf: apple_m1: Refactor event select/filter configuration Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-03-19Merge branch 'kvm-arm64/pv-cpuid' into kvmarm/nextOliver Upton
* kvm-arm64/pv-cpuid: : Paravirtualized implementation ID, courtesy of Shameer Kolothum : : Big-little has historically been a pain in the ass to virtualize. The : implementation ID (MIDR, REVIDR, AIDR) of a vCPU can change at the whim : of vCPU scheduling. This can be particularly annoying when the guest : needs to know the underlying implementation to mitigate errata. : : "Hyperscalers" face a similar scheduling problem, where VMs may freely : migrate between hosts in a pool of heterogenous hardware. And yes, our : server-class friends are equally riddled with errata too. : : In absence of an architected solution to this wart on the ecosystem, : introduce support for paravirtualizing the implementation exposed : to a VM, allowing the VMM to describe the pool of implementations that a : VM may be exposed to due to scheduling/migration. : : Userspace is expected to intercept and handle these hypercalls using the : SMCCC filter UAPI, should it choose to do so. smccc: kvm_guest: Fix kernel builds for 32 bit arm KVM: selftests: Add test for KVM_REG_ARM_VENDOR_HYP_BMAP_2 smccc/kvm_guest: Enable errata based on implementation CPUs arm64: Make  _midr_in_range_list() an exported function KVM: arm64: Introduce KVM_REG_ARM_VENDOR_HYP_BMAP_2 KVM: arm64: Specify hypercall ABI for retrieving target implementations arm64: Modify _midr_range() functions to read MIDR/REVIDR internally Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-03-19Merge branch 'kvm-arm64/nv-idregs' into kvmarm/nextOliver Upton
* kvm-arm64/nv-idregs: : Changes to exposure of NV features, courtesy of Marc Zyngier : : Apply NV-specific feature restrictions at reset rather than at the point : of KVM_RUN. This makes the true feature set visible to userspace, a : necessary step towards save/restore support or NV VMs. : : Add an additional vCPU feature flag for selecting the E2H0 flavor of NV, : such that the VHE-ness of the VM can be applied to the feature set. KVM: arm64: selftests: Test that TGRAN*_2 fields are writable KVM: arm64: Allow userspace to write ID_AA64MMFR0_EL1.TGRAN*_2 KVM: arm64: Advertise FEAT_ECV when possible KVM: arm64: Make ID_AA64MMFR4_EL1.NV_frac writable KVM: arm64: Allow userspace to limit NV support to nVHE KVM: arm64: Move NV-specific capping to idreg sanitisation KVM: arm64: Enforce NV limits on a per-idregs basis KVM: arm64: Make ID_REG_LIMIT_FIELD_ENUM() more widely available KVM: arm64: Consolidate idreg callbacks KVM: arm64: Advertise NV2 in the boot messages KVM: arm64: Mark HCR.EL2.{NV*,AT} RES0 when ID_AA64MMFR4_EL1.NV_frac is 0 KVM: arm64: Mark HCR.EL2.E2H RES0 when ID_AA64MMFR1_EL1.VH is zero KVM: arm64: Hide ID_AA64MMFR2_EL1.NV from guest and userspace arm64: cpufeature: Handle NV_frac as a synonym of NV2 Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-03-19Merge branch 'kvm-arm64/nv-vgic' into kvmarm/nextOliver Upton
* kvm-arm64/nv-vgic: : NV VGICv3 support, courtesy of Marc Zyngier : : Support for emulating the GIC hypervisor controls and managing shadow : VGICv3 state for the L1 hypervisor. As part of it, bring in support for : taking IRQs to the L1 and UAPI to manage the VGIC maintenance interrupt. KVM: arm64: nv: Fail KVM init if asking for NV without GICv3 KVM: arm64: nv: Allow userland to set VGIC maintenance IRQ KVM: arm64: nv: Fold GICv3 host trapping requirements into guest setup KVM: arm64: nv: Propagate used_lrs between L1 and L0 contexts KVM: arm64: nv: Request vPE doorbell upon nested ERET to L2 KVM: arm64: nv: Respect virtual HCR_EL2.TWx setting KVM: arm64: nv: Add Maintenance Interrupt emulation KVM: arm64: nv: Handle L2->L1 transition on interrupt injection KVM: arm64: nv: Nested GICv3 emulation KVM: arm64: nv: Sanitise ICH_HCR_EL2 accesses KVM: arm64: nv: Plumb handling of GICv3 EL2 accesses KVM: arm64: nv: Add ICH_*_EL2 registers to vpcu_sysreg KVM: arm64: nv: Load timer before the GIC arm64: sysreg: Add layout for ICH_MISR_EL2 arm64: sysreg: Add layout for ICH_VTR_EL2 arm64: sysreg: Add layout for ICH_HCR_EL2 Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-03-19Merge branch 'kvm-arm64/misc' into kvmarm/nextOliver Upton
* kvm-arm64/misc: : Miscellaneous fixes/cleanups for KVM/arm64 : : - Avoid GICv4 vLPI configuration when confronted with user error : : - Only attempt vLPI configuration when the target routing is an MSI : : - Document ordering requirements to avoid aforementioned user error KVM: arm64: Tear down vGIC on failed vCPU creation KVM: arm64: Document ordering requirements for irqbypass KVM: arm64: vgic-v4: Fall back to software irqbypass if LPI not found KVM: arm64: vgic-v4: Only WARN for HW IRQ mismatch when unmapping vLPI KVM: arm64: vgic-v4: Only attempt vLPI mapping for actual MSIs Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-03-19sched/debug: Remove CONFIG_SCHED_DEBUGIngo Molnar
For more than a decade, CONFIG_SCHED_DEBUG=y has been enabled in all the major Linux distributions: /boot/config-6.11.0-19-generic:CONFIG_SCHED_DEBUG=y The reason is that while originally CONFIG_SCHED_DEBUG started out as a debugging feature, over the years (decades ...) it has grown various bits of statistics, instrumentation and control knobs that are useful for sysadmin and general software development purposes as well. But within the kernel we still pretend that there's a choice, and sometimes code that is seemingly 'debug only' creates overhead that should be optimized in reality. So make it all official and make CONFIG_SCHED_DEBUG unconditional. Now that all uses of CONFIG_SCHED_DEBUG are removed from the code by previous patches, remove the Kconfig option as well. Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Shrikanth Hegde <sshegde@linux.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ben Segall <bsegall@google.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Valentin Schneider <vschneid@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250317104257.3496611-6-mingo@kernel.org
2025-03-19sched/debug: Remove CONFIG_SCHED_DEBUG from self-test config filesIngo Molnar
We leave most of the defconfigs alone (there's over 70 of them), but let's remove CONFIG_SCHED_DEBUG from the scheduler self-test Kconfig files. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ben Segall <bsegall@google.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Valentin Schneider <vschneid@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/Z9szt3MpQmQ56TRd@gmail.com
2025-03-19sched/debug, Documentation: Remove (most) CONFIG_SCHED_DEBUG references from ↵Ingo Molnar
documentation Since it's enabled unconditionally now, remove all references to it. (Left out languages I cannot read.) Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Shrikanth Hegde <sshegde@linux.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ben Segall <bsegall@google.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Valentin Schneider <vschneid@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250317104257.3496611-5-mingo@kernel.org