2025-03-20mmc: host: Wait for Vdd to settle on card power offErick Shepherd
The SD spec version 6.0 section 6.4.1.5 requires that Vdd must be lowered to less than 0.5V for a minimum of 1 ms when powering off a card. Increase wait to 15 ms so that voltage has time to drain down to 0.5V and cards can power off correctly. Issues with voltage drain time were only observed on Apollo Lake and Bay Trail host controllers so this fix is limited to those devices. Signed-off-by: Erick Shepherd <erick.shepherd@ni.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Link: https://lore.kernel.org/r/20250314195021.1588090-1-erick.shepherd@ni.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2025-03-20i2c: amd-mp2: drop free_irq() of devm_request_irq() allocated irqYang Yingliang
An irq allocated with devm_request_irq() is freed in devm_irq_release(), so also calling free_irq() in ->remove() causes a dangling pointer and a subsequent double free. Remove the free_irq() calls in the error path and the remove path. Fixes: 969864efae78 ("i2c: amd-mp2: use msix/msi if the hardware supports") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Acked-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Link: https://lore.kernel.org/r/20221103121146.99836-1-yangyingliang@huawei.com Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2025-03-20libfs: Fix duplicate directory entry in offset_dir_lookupYongjian Sun
There is an issue in the kernel: in tmpfs, when using the "ls" command to list the contents of a directory with a large number of files, glibc performs the getdents call in multiple rounds. If a concurrent unlink occurs between these getdents calls, it may lead to duplicate directory entries in the ls output. One possible reproduction scenario is as follows.

Create 1026 files and execute ls and rm concurrently:

    for i in {1..1026}; do
        echo "This is file $i" > /tmp/dir/file$i
    done

    ls /tmp/dir                              rm /tmp/dir/file4
      ->getdents(file1026-file5)
                                               ->unlink(file4)
      ->getdents(file5,file3,file2,file1)

The second getdents call is expected to return file3 through file1, but instead it returns an extra file5.

The root cause of this problem is in the offset_dir_lookup function. It uses mas_find to determine the starting position for the current getdents call. Since mas_find locates the first position that is greater than or equal to mas->index, when file4 is deleted, it ends up returning file5.

It can be fixed by replacing mas_find with mas_find_rev, which finds the first position that is less than or equal to mas->index.

Fixes: b9b588f22a0c ("libfs: Use d_children list to iterate simple_offset directories") Signed-off-by: Yongjian Sun <sunyongjian1@huawei.com> Link: https://lore.kernel.org/r/20250320034417.555810-1-sunyongjian@huaweicloud.com Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
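The difference between the two lookups can be sketched in plain userspace C. This is only an illustration and not the maple tree API: `find_ge()` and `find_le()` are hypothetical stand-ins for the "first position >= index" and "first position <= index" behaviour that the message attributes to mas_find and mas_find_rev, and the sorted array of offsets is made up.

```c
#include <stddef.h>

/* Hypothetical stand-in for a ">= index" lookup: return the smallest
 * offset that is >= index, or -1 if there is none.
 * offs[] must be sorted in ascending order. */
long find_ge(const long *offs, size_t n, long index)
{
	for (size_t i = 0; i < n; i++)
		if (offs[i] >= index)
			return offs[i];
	return -1;
}

/* Hypothetical stand-in for a "<= index" lookup: return the largest
 * offset that is <= index, or -1 if there is none. */
long find_le(const long *offs, size_t n, long index)
{
	for (size_t i = n; i-- > 0; )
		if (offs[i] <= index)
			return offs[i];
	return -1;
}
```

With the offsets {1, 2, 3, 5, 6} (entry 4 has been unlinked) and a resume position of 4, the ">=" lookup lands on 5, which was already returned by the previous round, while the "<=" lookup lands on 3.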
2025-03-20ASoC: wm8904: add DMIC supportErnest Van Hoecke
The WM8904 codec supports both ADC and DMIC inputs. Get input pin functionality from the platform data and add the necessary controls depending on the possible additional routing. The ADC and DMIC share the IN1L/DMICDAT1 and IN1R/DMICDAT2 pins. This leads to a few scenarios requiring different DAPM routing: - When both are connected to an analog input, only the ADC is used. - When one line is a DMIC and the other an analog input, the DMIC source is set from the platform data and a mux is added to select whether to use the ADC or DMIC. - When both are connected to a DMIC, another mux is added to this to select the DMIC source. Note that we still need to be able to select the ADC system for use with the IN2L, IN2R, IN3L and IN3R pins. Signed-off-by: Ernest Van Hoecke <ernest.vanhoecke@toradex.com> Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com> Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://patch.msgid.link/20250319142059.46692-6-francesco@dolcini.it Signed-off-by: Mark Brown <broonie@kernel.org>
2025-03-20ASoC: wm8904: get platform data from DTErnest Van Hoecke
Read in optional codec-specific properties from the device tree. The platform_data structure is not populated when using device trees. This change parses optional dts properties to populate it. - wlf,in1l-as-dmicdat1 - wlf,in1r-as-dmicdat2 - wlf,gpio-cfg - wlf,micbias-cfg - wlf,drc-cfg-regs - wlf,drc-cfg-names - wlf,retune-mobile-cfg-regs - wlf,retune-mobile-cfg-names - wlf,retune-mobile-cfg-hz Datasheet: https://statics.cirrus.com/pubs/proDatasheet/WM8904_Rev4.1.pdf Signed-off-by: Ernest Van Hoecke <ernest.vanhoecke@toradex.com> Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com> Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://patch.msgid.link/20250319142059.46692-5-francesco@dolcini.it Signed-off-by: Mark Brown <broonie@kernel.org>
2025-03-20ASoC: dt-bindings: wm8904: Add DMIC, GPIO, MIC and EQ supportErnest Van Hoecke
Add two properties to select the IN1L/DMICDAT1 and IN1R/DMICDAT2 functionality:
- wlf,in1l-as-dmicdat1
- wlf,in1r-as-dmicdat2

Add a property to describe the GPIO configuration registers, which can be used to set the four multifunction pins:
- wlf,gpio-cfg

Add a property to describe the mic bias control registers:
- wlf,micbias-cfg

Add two properties to describe the Dynamic Range Controller (DRC), allowing multiple named configurations where each config sets the 4 DRC registers (R40-R43):
- wlf,drc-cfg-regs
- wlf,drc-cfg-names

Add three properties to describe the equalizer (ReTune Mobile), allowing multiple named configurations (associated with a sample rate) that set the 24 EQ registers (R134-R157):
- wlf,retune-mobile-cfg-regs
- wlf,retune-mobile-cfg-hz
- wlf,retune-mobile-cfg-names

The set of names and configurations for DRC and ReTune Mobile are specified by system integrators. The names are exposed directly to userspace as options that can be selected at runtime.

Adding the DRC and ReTune Mobile data to the DT eases the transition from pdata, which has handled them this way for over a decade. The parameters filled in here are almost certainly specific tuning for the hardware, so it makes sense to ship them with the hardware description.

Datasheet: https://statics.cirrus.com/pubs/proDatasheet/WM8904_Rev4.1.pdf Signed-off-by: Ernest Van Hoecke <ernest.vanhoecke@toradex.com> Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com> Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://patch.msgid.link/20250319142059.46692-4-francesco@dolcini.it Signed-off-by: Mark Brown <broonie@kernel.org>
2025-03-20ASoC: wm8904: Don't touch GPIO configs set to 0xFFFFErnest Van Hoecke
When updating the GPIO registers, do nothing for all fields of gpio_cfg that are "0xFFFF". This "do nothing" flag used to be 0 to easily check whether the gpio_cfg field was actually set inside pdata or left empty (default). However, 0 is a valid configuration for these registers, while 0xFFFF is not. With this change, users can explicitly set them to 0. Not setting gpio_cfg in the platform data will now lead to setting all GPIO registers to 0 instead of leaving them unset. No one is using this platform data with this codec. The change gets the driver ready to properly set gpio_cfg from the DT. Datasheet: https://statics.cirrus.com/pubs/proDatasheet/WM8904_Rev4.1.pdf Signed-off-by: Ernest Van Hoecke <ernest.vanhoecke@toradex.com> Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com> Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://patch.msgid.link/20250319142059.46692-3-francesco@dolcini.it Signed-off-by: Mark Brown <broonie@kernel.org>
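The sentinel convention described above can be sketched as follows. This is a minimal userspace illustration, not the driver code: `apply_gpio_cfg()`, `GPIO_CFG_SKIP` and the plain `regs[]` array are hypothetical names standing in for the register writes.

```c
#include <stddef.h>
#include <stdint.h>

#define GPIO_CFG_SKIP 0xFFFF	/* hypothetical name for the "do nothing" marker */

/* Copy each configured value into regs[], leaving a register untouched
 * when its config entry equals the sentinel. A value of 0 is written
 * like any other valid configuration.
 * Returns the number of registers actually written. */
size_t apply_gpio_cfg(uint16_t *regs, const uint16_t *cfg, size_t n)
{
	size_t written = 0;

	for (size_t i = 0; i < n; i++) {
		if (cfg[i] == GPIO_CFG_SKIP)
			continue;
		regs[i] = cfg[i];
		written++;
	}
	return written;
}
```

Using 0xFFFF rather than 0 as the marker is what lets an explicit 0 reach the register.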
2025-03-20of: Add of_property_read_u16_indexErnest Van Hoecke
There is an of_property_read_u32_index and of_property_read_u64_index. This patch adds a similar helper for u16. Signed-off-by: Ernest Van Hoecke <ernest.vanhoecke@toradex.com> Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com> Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://patch.msgid.link/20250319142059.46692-2-francesco@dolcini.it Signed-off-by: Mark Brown <broonie@kernel.org>
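As a rough idea of what an indexed u16 read involves, here is a userspace sketch. It is not the kernel helper: `read_u16_index()` is a hypothetical function working on a raw byte buffer, it assumes the values are stored as consecutive big-endian 16-bit cells, and the error codes are chosen for illustration only.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Read the index-th 16-bit value from a property buffer of len bytes.
 * Assumes big-endian storage. Returns 0 on success, -EINVAL for bad
 * arguments, -EOVERFLOW when the index is past the end of the data. */
int read_u16_index(const uint8_t *prop, size_t len, size_t index,
		   uint16_t *out)
{
	if (!prop || !out)
		return -EINVAL;
	if (index >= len / 2)
		return -EOVERFLOW;

	*out = (uint16_t)((prop[2 * index] << 8) | prop[2 * index + 1]);
	return 0;
}
```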
2025-03-20spi: spi-mem: Introduce a default ->exec_op() debug logMiquel Raynal
Many spi-mem controller drivers have a very similar debug log at the beginning of their ->exec_op() callback implementation. This debug log is effectively useful, so let's create one that is complete and concise enough, so developers no longer need to write their own. The verbosity being high, VERBOSE_DEBUG will be required in this case. Remove the debug log from individual drivers and propose a common one. Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Tudor Ambarus <tudor.ambarus@linaro.org> Link: https://patch.msgid.link/20250320115644.2231240-1-miquel.raynal@bootlin.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-03-20spi: dt-bindings: cdns,qspi-nor: Require some peripheral propertiesMiquel Raynal
There are 5 mandatory peripheral properties. They are described in a separate binding but not explicitly required. Make sure they are correctly marked required and update the example to reflect this. Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Link: https://patch.msgid.link/20250319094651.1290509-4-miquel.raynal@bootlin.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-03-20spi: dt-bindings: cdns,qspi-nor: Deprecate the Cadence compatible aloneMiquel Raynal
The initial SPI controller IP from Cadence has always been implemented into controllers from various hardware manufacturers and because of that, it has always been (rightfully) doubled with a more specific compatible. There is likely no reason to keep this compatible valid on its own. Make sure people do not get misled by officially deprecating this compatible. While deprecating it, let's update the examples to avoid documenting deprecated properties. Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Link: https://patch.msgid.link/20250319094651.1290509-3-miquel.raynal@bootlin.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-03-20spi: dt-bindings: cdns,qspi-nor: Be more descriptive regarding what this controller isMiquel Raynal
Despite being very common in commit logs, SPI NOR controllers simply do not exist. At least, they are not as specific as the name implies. There are SPI memory controllers which are indeed "specialized" and optimized for handling "memories", but most of them are just generic and accept almost any kind of opcode, address, dummy and data cycles, making them as suitable for NANDs as for NORs. Furthermore, this controller supports any kind of bus, from single to octal NAND, so make it clear. Also add a comment to mention that the initial compatible naming is too specific (but obviously kept for backward compatibility reasons). Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://patch.msgid.link/20250319094651.1290509-2-miquel.raynal@bootlin.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-03-20fs: call inode_sb_list_add() outside of inode hash lockMateusz Guzik
As both locks are highly contended during significant inode churn, holding the inode hash lock while waiting for the sb list lock exacerbates the problem. Why moving it out is safe: the inode at hand still has I_NEW set and anyone who finds it through legitimate means waits for the bit to clear, by which time inode_sb_list_add() is guaranteed to have finished. This significantly drops hash lock contention for me when stating 20 separate trees in parallel, each with 1000 directories * 1000 files. However, no speed up was observed as contention increased on the other locks, notably dentry LRU. Even so, removal of the lock ordering will help making this faster later. Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Link: https://lore.kernel.org/r/20250320004643.1903287-1-mjguzik@gmail.com Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-20docs: sysfs-block: Clarify integrity sysfs attributesMilan Broz
The /sys/block/<disk>/integrity fields are historically set if T10 Protection Information is enabled. They are not set if some upper layer uses integrity metadata. Document this. Signed-off-by: Milan Broz <gmazyland@gmail.com> Co-developed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20250318154447.370786-1-gmazyland@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-20tracing: Constify struct event_trigger_opsChristophe JAILLET
'struct event_trigger_ops' are not modified in this code. Constifying these structures moves some data to a read-only section, which increases overall security, especially when the structure holds some function pointers.

On x86_64, with allmodconfig, as an example:

Before:
======
   text    data     bss     dec     hex filename
  31368    9024    6200   46592    b600 kernel/trace/trace_events_trigger.o

After:
=====
   text    data     bss     dec     hex filename
  31752    8608    6200   46560    b5e0 kernel/trace/trace_events_trigger.o

Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/66e8f990e649678e4be37d4d1a19158ca0dea2f4.1741521295.git.christophe.jaillet@wanadoo.fr Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-03-20scripts/tracing: Remove scripts/tracing/draw_functrace.pySteven Rostedt
The draw_functrace.py hasn't worked in years. There's better ways to accomplish the same thing (via libtracefs). Remove it. Link: https://lore.kernel.org/linux-trace-kernel/20250210-debuginfo-v1-1-368feb58292a@purestorage.com/ Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Uday Shankar <ushankar@purestorage.com> Cc: Masahiro Yamada <masahiroy@kernel.org> Link: https://lore.kernel.org/20250307103941.070654e7@gandalf.local.home Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-03-20ALSA: oxygen: Fix dependency on CONFIG_PM_SLEEPTakashi Iwai
The conversion to EXPORT_SIMPLE_DEV_PM_OPS() also replaced the pm ops assignment with the pm_ptr() macro, but this differs from the original code: that was conditional on ifdef CONFIG_PM_SLEEP, while the new code is conditional on CONFIG_PM instead. This seems to cause build errors with randconfig. To fix the inconsistency, replace pm_ptr() with pm_sleep_ptr(). Fixes: 5ea0a2206b58 ("ALSA: oxygen: Convert to EXPORT_SIMPLE_DEV_PM_OPS()") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202503201853.7kB0BPRw-lkp@intel.com/ Link: https://patch.msgid.link/20250320105721.10789-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2025-03-20Merge branch 'net-fix-lwtunnel-reentry-loops'Paolo Abeni
Justin Iurman says:

====================
net: fix lwtunnel reentry loops

When the destination is the same after the transformation, we enter a lwtunnel loop. This is true for most of lwt users: ioam6, rpl, seg6, seg6_local, ila_lwt, and lwt_bpf. It can happen in their input() and output() handlers respectively, where either dst_input() or dst_output() is called at the end. It can also happen in xmit() handlers.

Here is an example for rpl_input():

    dump_stack_lvl+0x60/0x80
    rpl_input+0x9d/0x320
    lwtunnel_input+0x64/0xa0
    lwtunnel_input+0x64/0xa0
    lwtunnel_input+0x64/0xa0
    lwtunnel_input+0x64/0xa0
    lwtunnel_input+0x64/0xa0
    [...]
    lwtunnel_input+0x64/0xa0
    lwtunnel_input+0x64/0xa0
    lwtunnel_input+0x64/0xa0
    lwtunnel_input+0x64/0xa0
    lwtunnel_input+0x64/0xa0
    ip6_sublist_rcv_finish+0x85/0x90
    ip6_sublist_rcv+0x236/0x2f0

... until rpl_do_srh() fails, which means skb_cow_head() failed.

This series provides a fix at the core level of lwtunnel to catch such loops when they're not caught by the respective lwtunnel users, and handle the loop case in ioam6 which is one of the users. This series also comes with a new selftest to detect some dst cache reference loops in lwtunnel users.
====================

Link: https://patch.msgid.link/20250314120048.12569-1-justin.iurman@uliege.be Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-20selftests: net: test for lwtunnel dst ref loopsJustin Iurman
As recently specified by commit 0ea09cbf8350 ("docs: netdev: add a note on selftest posting") in net-next, the selftest is therefore shipped in this series. However, this selftest does not really test this series. It needs this series to avoid crashing the kernel. What it really tests, thanks to kmemleak, is what was fixed by the following commits: - commit c71a192976de ("net: ipv6: fix dst refleaks in rpl, seg6 and ioam6 lwtunnels") - commit 92191dd10730 ("net: ipv6: fix dst ref loops in rpl, seg6 and ioam6 lwtunnels") - commit c64a0727f9b1 ("net: ipv6: fix dst ref loop on input in seg6 lwt") - commit 13e55fbaec17 ("net: ipv6: fix dst ref loop on input in rpl lwt") - commit 0e7633d7b95b ("net: ipv6: fix dst ref loop in ila lwtunnel") - commit 5da15a9c11c1 ("net: ipv6: fix missing dst ref drop in ila lwtunnel") Signed-off-by: Justin Iurman <justin.iurman@uliege.be> Link: https://patch.msgid.link/20250314120048.12569-4-justin.iurman@uliege.be Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-20net: ipv6: ioam6: fix lwtunnel_output() loopJustin Iurman
Fix the lwtunnel_output() reentry loop in ioam6_iptunnel when the destination is the same after transformation. Note that a check on the destination address was already performed, but it was not enough. This is the example of a lwtunnel user taking care of loops without relying only on the last resort detection offered by lwtunnel. Fixes: 8cb3bf8bff3c ("ipv6: ioam: Add support for the ip6ip6 encapsulation") Signed-off-by: Justin Iurman <justin.iurman@uliege.be> Link: https://patch.msgid.link/20250314120048.12569-3-justin.iurman@uliege.be Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-20net: lwtunnel: fix recursion loopsJustin Iurman
This patch acts as a parachute, catch all solution, by detecting recursion loops in lwtunnel users and taking care of them (e.g., a loop between routes, a loop within the same route, etc). In general, such loops are the consequence of pathological configurations. Each lwtunnel user is still free to catch such loops early and do whatever they want with them. It will be the case in a separate patch for, e.g., seg6 and seg6_local, in order to provide drop reasons and update statistics. Another example of a lwtunnel user taking care of loops is ioam6, which has valid use cases that include loops (e.g., inline mode), and which is addressed by the next patch in this series. Overall, this patch acts as a last resort to catch loops and drop packets, since we don't want to leak something unintentionally because of a pathological configuration in lwtunnels. The solution in this patch reuses dev_xmit_recursion(), dev_xmit_recursion_inc(), and dev_xmit_recursion_dec(), which seems fine considering the context. Closes: https://lore.kernel.org/netdev/2bc9e2079e864a9290561894d2a602d6@akamai.com/ Closes: https://lore.kernel.org/netdev/Z7NKYMY7fJT5cYWu@shredder/ Fixes: ffce41962ef6 ("lwtunnel: support dst output redirect function") Fixes: 2536862311d2 ("lwt: Add support to redirect dst.input") Fixes: 14972cbd34ff ("net: lwtunnel: Handle fragmentation") Signed-off-by: Justin Iurman <justin.iurman@uliege.be> Link: https://patch.msgid.link/20250314120048.12569-2-justin.iurman@uliege.be Signed-off-by: Paolo Abeni <pabeni@redhat.com>
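The last-resort idea can be sketched as a recursion-depth guard. This is a minimal userspace illustration, not the kernel code: `tunnel_output()`, `RECURSION_LIMIT` and the single global counter are hypothetical, whereas the patch reuses the existing dev_xmit_recursion() helpers, which keep their own state.

```c
#include <stdbool.h>

#define RECURSION_LIMIT 8	/* illustrative value, not the kernel's */

int depth;	/* one global counter here, for simplicity */

/* Hypothetical handler that re-enters itself `hops` more times, the way
 * a looping route would. Returns true if the packet would be delivered,
 * false if it is dropped because the nesting is too deep. */
bool tunnel_output(int hops)
{
	bool ok;

	if (depth >= RECURSION_LIMIT)
		return false;	/* loop suspected: drop */
	if (hops == 0)
		return true;	/* no further transformation */

	depth++;
	ok = tunnel_output(hops - 1);
	depth--;
	return ok;
}
```

A short chain of re-entries passes through, while a pathological configuration is cut off once the limit is reached, and the counter is back at zero after unwinding.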
2025-03-20net: ti: icssg-prueth: Add lock to statsMD Danish Anwar
Currently the API emac_update_hardware_stats() reads different ICSSG stats without any lock protection. This API gets called by .ndo_get_stats64(), which is only under RCU protection and nothing else. Add a lock to this API so that the statistics are read while the lock is held. Fixes: c1e10d5dc7a1 ("net: ti: icssg-prueth: Add ICSSG Stats") Signed-off-by: MD Danish Anwar <danishanwar@ti.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250314102721.1394366-1-danishanwar@ti.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-20net: atm: fix use after free in lec_send()Dan Carpenter
The ->send() operation frees skb so save the length before calling ->send() to avoid a use after free. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/c751531d-4af4-42fe-affe-6104b34b791d@stanley.mountain Signed-off-by: Paolo Abeni <pabeni@redhat.com>
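The pattern of the fix can be sketched in userspace C. This is only an illustration under stated assumptions: `struct buf`, `send_and_free()` and `send_counted()` are hypothetical names, with `send_and_free()` standing in for an operation that takes ownership of the buffer and frees it.

```c
#include <stddef.h>
#include <stdlib.h>

struct buf {
	size_t len;
	unsigned char *data;
};

/* Hypothetical send routine that consumes (frees) the buffer. */
void send_and_free(struct buf *b)
{
	free(b->data);
	free(b);
}

/* Send the buffer and add its length to *tx_bytes.
 * The length is copied out first, because b is gone after the send. */
size_t send_counted(struct buf *b, size_t *tx_bytes)
{
	size_t len = b->len;	/* save before ownership passes on */

	send_and_free(b);
	*tx_bytes += len;	/* b must not be dereferenced here */
	return len;
}
```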
2025-03-20fs: tidy up do_sys_openat2() with likely/unlikelyMateusz Guzik
Otherwise gcc 13 generates conditional forward jumps (aka branch mispredict by default) for build_open_flags() being successful. Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Link: https://lore.kernel.org/r/20250320092331.1921700-1-mjguzik@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-20Merge branch 'slab/for-6.15/kfree_rcu_tiny' into slab/for-nextVlastimil Babka
Merge the slab feature branch kfree_rcu_tiny for 6.15: - Move the TINY_RCU kvfree_rcu() implementation from RCU to SLAB subsystem and cleanup its integration.
2025-03-20Merge branch 'mptcp-pm-prep-work-for-new-ops-and-sysctl-knobs'Paolo Abeni
Matthieu Baerts says:

====================
mptcp: pm: prep work for new ops and sysctl knobs

Here are a few cleanups, preparation work for the new PM ops, and sysctl knobs.

- Patch 1: reorg: move generic NL code used by all PMs to pm_netlink.c.
- Patch 2: use kmemdup() instead of kmalloc + copy.
- Patch 3: small cleanup to use pm var instead of msk->pm.
- Patch 4: reorg: id_avail_bitmap is only used by the in-kernel PM.
- Patch 5: use struct_group to easily reset a subset of PM data vars.
- Patch 6: introduce the minimal skeleton for the new PM ops.
- Patch 7: register in-kernel and userspace PM ops.
- Patch 8: new net.mptcp.path_manager sysctl knob, deprecating pm_type.
- Patch 9: map the new path_manager sysctl knob with pm_type.
- Patch 10: map the old pm_type sysctl knob with path_manager.
- Patch 11: new net.mptcp.available_path_managers sysctl knob.
- Patch 12: new test to validate path_manager and pm_type mapping.

Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
====================

Link: https://patch.msgid.link/20250313-net-next-mptcp-pm-ops-intro-v1-0-f4e4a88efc50@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-20selftests: mptcp: add pm sysctl mapping testsGeliang Tang
This patch checks if the newly added net.mptcp.path_manager is mapped successfully from or to the old net.mptcp.pm_type in userspace_pm.sh. Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250313-net-next-mptcp-pm-ops-intro-v1-12-f4e4a88efc50@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-20mptcp: sysctl: add available_path_managersGeliang Tang
Similarly to net.mptcp.available_schedulers, this patch adds a new one net.mptcp.available_path_managers to list the available path managers. Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250313-net-next-mptcp-pm-ops-intro-v1-11-f4e4a88efc50@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-20mptcp: sysctl: map pm_type to path_managerGeliang Tang
This patch adds a new proc_handler "proc_pm_type" for "pm_type" to map the old path manager sysctl "pm_type" to the newly added "path_manager".

    pm_type                     path_manager
    MPTCP_PM_TYPE_KERNEL     -> "kernel"
    MPTCP_PM_TYPE_USERSPACE  -> "userspace"

It is important to add this to keep compatibility with the now deprecated pm_type sysctl knob. Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250313-net-next-mptcp-pm-ops-intro-v1-10-f4e4a88efc50@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-20mptcp: sysctl: map path_manager to pm_typeGeliang Tang
This patch maps the newly added path manager sysctl "path_manager" to the old one "pm_type".

    path_manager    pm_type
    "kernel"     -> MPTCP_PM_TYPE_KERNEL
    "userspace"  -> MPTCP_PM_TYPE_USERSPACE
    others       -> __MPTCP_PM_TYPE_NR

It is important to add this to keep compatibility with the now deprecated pm_type sysctl knob. Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250313-net-next-mptcp-pm-ops-intro-v1-9-f4e4a88efc50@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
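The two-way mapping between the name-based and the numeric knob can be sketched as follows. This is a minimal illustration, not the sysctl handlers from the patches: the enum and both function names are hypothetical and only mirror the mapping listed in the messages.

```c
#include <string.h>

/* Hypothetical mirror of the numeric path manager types. */
enum pm_type {
	PM_TYPE_KERNEL,
	PM_TYPE_USERSPACE,
	PM_TYPE_NR	/* used for names with no numeric equivalent */
};

/* Map a path manager name to its numeric type. */
enum pm_type pm_type_from_name(const char *name)
{
	if (strcmp(name, "kernel") == 0)
		return PM_TYPE_KERNEL;
	if (strcmp(name, "userspace") == 0)
		return PM_TYPE_USERSPACE;
	return PM_TYPE_NR;
}

/* Map a numeric type back to its name, or NULL if there is none. */
const char *pm_name_from_type(enum pm_type type)
{
	switch (type) {
	case PM_TYPE_KERNEL:
		return "kernel";
	case PM_TYPE_USERSPACE:
		return "userspace";
	default:
		return NULL;
	}
}
```

Keeping both directions in sync is what allows a write to either knob to be reflected in the other.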
2025-03-20mptcp: sysctl: set path manager by nameGeliang Tang
Similar to net.mptcp.scheduler, a new net.mptcp.path_manager sysctl knob is added to determine which path manager will be used by each newly created MPTCP socket by setting the name of it. Dealing with an explicit name is easier than with a number, especially when more PMs will be introduced. This sysctl knob makes the old one "pm_type" deprecated. Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250313-net-next-mptcp-pm-ops-intro-v1-8-f4e4a88efc50@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-20mptcp: pm: register in-kernel and userspace PMGeliang Tang
This patch defines the original in-kernel netlink path manager as a new struct mptcp_pm_ops named "mptcp_pm_kernel", and register it in mptcp_pm_kernel_register(). And define the userspace path manager as a new struct mptcp_pm_ops named "mptcp_pm_userspace", and register it in mptcp_pm_init(). To ensure that there's always a valid path manager available, the default path manager "mptcp_pm_kernel" will be skipped in mptcp_pm_unregister(). Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250313-net-next-mptcp-pm-ops-intro-v1-7-f4e4a88efc50@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-20mptcp: pm: define struct mptcp_pm_opsGeliang Tang
In order to allow users to develop their own BPF-based path manager, this patch defines a struct ops "mptcp_pm_ops" for an MPTCP path manager, which contains a set of interfaces. Currently only init() and release() interfaces are included, subsequent patches will add others step by step. Add a set of functions to register, unregister, find and validate a given path manager struct ops. "list" is used to add this path manager to mptcp_pm_list list when it is registered. "name" is used to identify this path manager. mptcp_pm_find() uses "name" to find a path manager on the list. mptcp_pm_unregister is not used in this set, but will be invoked in .unreg of struct bpf_struct_ops. mptcp_pm_validate() will be invoked in .validate of struct bpf_struct_ops. That's why they are exported. Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250313-net-next-mptcp-pm-ops-intro-v1-6-f4e4a88efc50@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-20mptcp: pm: add struct_group in mptcp_pm_dataGeliang Tang
This patch adds a "struct_group(reset, ...)" in struct mptcp_pm_data to simplify the reset, and make sure we don't miss any. Suggested-by: Matthieu Baerts <matttbe@kernel.org> Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250313-net-next-mptcp-pm-ops-intro-v1-5-f4e4a88efc50@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
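The effect of grouping fields for a reset can be sketched without the kernel macro. This is an illustration only: it spells out by hand the kind of anonymous union that struct_group() generates, and `struct pm_data`, its fields and `pm_data_reset()` are hypothetical (C11 anonymous members are assumed).

```c
#include <string.h>

struct pm_data {
	int id;			/* not part of the group, survives a reset */
	union {
		struct {	/* fields stay directly accessible ... */
			int subflows;
			int addrs;
		};
		struct {	/* ... and are also addressable as one unit */
			int subflows;
			int addrs;
		} reset;
	};
};

/* Clear every field in the group with a single memset, so a field
 * added to the group later cannot be forgotten. */
void pm_data_reset(struct pm_data *pm)
{
	memset(&pm->reset, 0, sizeof(pm->reset));
}
```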
2025-03-20mptcp: pm: only fill id_avail_bitmap for in-kernel pmGeliang Tang
id_avail_bitmap of struct mptcp_pm_data is currently only used by the in-kernel PM, so this patch moves its initialization operation under the "if (pm_type == MPTCP_PM_TYPE_KERNEL)" condition. Suggested-by: Matthieu Baerts <matttbe@kernel.org> Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250313-net-next-mptcp-pm-ops-intro-v1-4-f4e4a88efc50@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-20mptcp: pm: use pm variable instead of msk->pmGeliang Tang
The variable "pm" has been defined in mptcp_pm_fully_established() and mptcp_pm_data_reset() as "msk->pm", so use "pm" directly instead of using "msk->pm". Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250313-net-next-mptcp-pm-ops-intro-v1-3-f4e4a88efc50@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-20mptcp: pm: in-kernel: use kmemdup helperGeliang Tang
Instead of using kmalloc() or kzalloc() to allocate an entry and then immediately duplicate another entry to the newly allocated one, kmemdup() helper can be used to simplify the code. Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250313-net-next-mptcp-pm-ops-intro-v1-2-f4e4a88efc50@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
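The simplification can be pictured with a userspace analogue. This is a sketch, not kmemdup() itself: `dup_mem()` is a hypothetical helper built on malloc() and memcpy() that combines the allocate and copy steps into one call.

```c
#include <stdlib.h>
#include <string.h>

/* Allocate len bytes and copy src into them.
 * Returns the new copy, or NULL if the allocation fails.
 * The caller owns the result and releases it with free(). */
void *dup_mem(const void *src, size_t len)
{
	void *p = malloc(len);

	if (p)
		memcpy(p, src, len);
	return p;
}
```

A caller that previously allocated an entry and then assigned another entry into it can use the single call instead and only has one failure case to check.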
2025-03-20mptcp: pm: split netlink and in-kernel initMatthieu Baerts (NGI0)
The registration of mptcp_genl_family is useful for both the in-kernel and the userspace PM. It should then be done in pm_netlink.c. On the other hand, the registration of the in-kernel pernet subsystem is specific to the in-kernel PM, and should stay there in pm_kernel.c. Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250313-net-next-mptcp-pm-ops-intro-v1-1-f4e4a88efc50@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-20cpuidle, sched: Use smp_mb__after_atomic() in current_clr_polling()Yujun Dong
In architectures that use the polling bit, current_clr_polling() employs smp_mb() to ensure that the clearing of the polling bit is visible to other cores before checking TIF_NEED_RESCHED. However, smp_mb() can be costly. Given that clear_bit() is an atomic operation, replacing smp_mb() with smp_mb__after_atomic() is appropriate. Many architectures implement smp_mb__after_atomic() as a lighter-weight barrier compared to smp_mb(), leading to performance improvements. For instance, on x86, smp_mb__after_atomic() is a no-op. This change eliminates a smp_mb() instruction in the cpuidle wake-up path, saving several CPU cycles and thereby reducing wake-up latency. Architectures that do not use the polling bit will retain the original smp_mb() behavior to ensure that existing dependencies remain unaffected. Signed-off-by: Yujun Dong <yujundong@pascal-lab.net> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20241230141624.155356-1-yujundong@pascal-lab.net
2025-03-20  net: vlan: don't propagate flags on open  Stanislav Fomichev
With the device instance lock, there is now a possibility of a deadlock:

[ 1.211455] ============================================
[ 1.211571] WARNING: possible recursive locking detected
[ 1.211687] 6.14.0-rc5-01215-g032756b4ca7a-dirty #5 Not tainted
[ 1.211823] --------------------------------------------
[ 1.211936] ip/184 is trying to acquire lock:
[ 1.212032] ffff8881024a4c30 (&dev->lock){+.+.}-{4:4}, at: dev_set_allmulti+0x4e/0xb0
[ 1.212207]
[ 1.212207] but task is already holding lock:
[ 1.212332] ffff8881024a4c30 (&dev->lock){+.+.}-{4:4}, at: dev_open+0x50/0xb0
[ 1.212487]
[ 1.212487] other info that might help us debug this:
[ 1.212626]  Possible unsafe locking scenario:
[ 1.212626]
[ 1.212751]        CPU0
[ 1.212815]        ----
[ 1.212871]   lock(&dev->lock);
[ 1.212944]   lock(&dev->lock);
[ 1.213016]
[ 1.213016]  *** DEADLOCK ***
[ 1.213016]
[ 1.213143]  May be due to missing lock nesting notation
[ 1.213143]
[ 1.213294] 3 locks held by ip/184:
[ 1.213371]  #0: ffffffff838b53e0 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_nets_lock+0x1b/0xa0
[ 1.213543]  #1: ffffffff84e5fc70 (&net->rtnl_mutex){+.+.}-{4:4}, at: rtnl_nets_lock+0x37/0xa0
[ 1.213727]  #2: ffff8881024a4c30 (&dev->lock){+.+.}-{4:4}, at: dev_open+0x50/0xb0
[ 1.213895]
[ 1.213895] stack backtrace:
[ 1.213991] CPU: 0 UID: 0 PID: 184 Comm: ip Not tainted 6.14.0-rc5-01215-g032756b4ca7a-dirty #5
[ 1.213993] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
[ 1.213994] Call Trace:
[ 1.213995]  <TASK>
[ 1.213996]  dump_stack_lvl+0x8e/0xd0
[ 1.214000]  print_deadlock_bug+0x28b/0x2a0
[ 1.214020]  lock_acquire+0xea/0x2a0
[ 1.214027]  __mutex_lock+0xbf/0xd40
[ 1.214038]  dev_set_allmulti+0x4e/0xb0 # real_dev->flags & IFF_ALLMULTI
[ 1.214040]  vlan_dev_open+0xa5/0x170 # ndo_open on vlandev
[ 1.214042]  __dev_open+0x145/0x270
[ 1.214046]  __dev_change_flags+0xb0/0x1e0
[ 1.214051]  netif_change_flags+0x22/0x60 # IFF_UP vlandev
[ 1.214053]  dev_change_flags+0x61/0xb0 # for each device in group from dev->vlan_info
[ 1.214055]  vlan_device_event+0x766/0x7c0 # on netdevsim0
[ 1.214058]  notifier_call_chain+0x78/0x120
[ 1.214062]  netif_open+0x6d/0x90
[ 1.214064]  dev_open+0x5b/0xb0 # locks netdevsim0
[ 1.214066]  bond_enslave+0x64c/0x1230
[ 1.214075]  do_set_master+0x175/0x1e0 # on netdevsim0
[ 1.214077]  do_setlink+0x516/0x13b0
[ 1.214094]  rtnl_newlink+0xaba/0xb80
[ 1.214132]  rtnetlink_rcv_msg+0x440/0x490
[ 1.214144]  netlink_rcv_skb+0xeb/0x120
[ 1.214150]  netlink_unicast+0x1f9/0x320
[ 1.214153]  netlink_sendmsg+0x346/0x3f0
[ 1.214157]  __sock_sendmsg+0x86/0xb0
[ 1.214160]  ____sys_sendmsg+0x1c8/0x220
[ 1.214164]  ___sys_sendmsg+0x28f/0x2d0
[ 1.214179]  __x64_sys_sendmsg+0xef/0x140
[ 1.214184]  do_syscall_64+0xec/0x1d0
[ 1.214190]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
[ 1.214191] RIP: 0033:0x7f2d1b4a7e56

Device setup:

     netdevsim0 (down)
     ^        ^
  bond        netdevsim1.100@netdevsim1 allmulticast=on (down)

When we enslave the lower device (netdevsim0) which has a vlan, we propagate the vlan's allmulti/promisc flags during ndo_open. This causes (re)locking of the real_dev.

Propagate allmulti/promisc on flags change, not on the open. There is a slight semantics change in that vlans that are down now propagate the flags, but this seems unlikely to result in real issues.

Reproducer:

  echo 0 1 > /sys/bus/netdevsim/new_device

  dev_path=$(ls -d /sys/bus/netdevsim/devices/netdevsim0/net/*)
  dev=$(echo $dev_path | rev | cut -d/ -f1 | rev)

  ip link set dev $dev name netdevsim0
  ip link set dev netdevsim0 up

  ip link add link netdevsim0 name netdevsim0.100 type vlan id 100
  ip link set dev netdevsim0.100 allmulticast on down
  ip link add name bond1 type bond mode 802.3ad
  ip link set dev netdevsim0 down
  ip link set dev netdevsim0 master bond1
  ip link set dev bond1 up
  ip link show

Reported-by: syzbot+b0c03d76056ef6cd12a6@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/Z9CfXjLMKn6VLG5d@mini-arch/T/#m15ba130f53227c883e79fb969687d69d670337a0
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250313100657.2287455-1-sdf@fomichev.me
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-20  fs: reduce work in fdget_pos()  Mateusz Guzik
1. predict the file was found
2. explicitly compare the ref to "one", ignoring the dead zone

The latter arguably improves the behavior to begin with. Suppose the count turned bad -- the previously used ref routine is going to check for it and return 0, indicating the count does not necessitate taking ->f_pos_lock. But there very well may be several users.

That is, not paying for special-casing the dead zone improves semantics.

While here, spell out each condition in a dedicated if statement. This has no effect on generated code.

Sizes are as follows (in bytes; gcc 13, x86-64):
stock:         321
likely():      298
likely()+ref:  280

Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Link: https://lore.kernel.org/r/20250319215801.1870660-1-mjguzik@gmail.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
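The difference between the two reference checks described in this commit can be sketched as follows. The encoding used here is an assumption made for illustration only (a biased counter in which a raw value of 0 means one reference, plus a high "dead zone" range); the constants and function names are hypothetical and are not taken from the kernel source.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical raw encodings; the kernel's real values may differ. */
#define REF_ONEREF    UINT64_C(0)
#define REF_DEAD_ZONE UINT64_C(0xC000000000000000)

/* Checked read: reports 0 users once the raw count is in the dead zone. */
static uint64_t ref_read_checked(uint64_t raw)
{
	return raw >= REF_DEAD_ZONE ? 0 : raw + 1;
}

/* Old-style test: a bad count reads as 0 users, so no lock is requested. */
static bool needs_pos_lock_checked(uint64_t raw)
{
	return ref_read_checked(raw) > 1;
}

/* New-style test: anything other than exactly one reference takes the lock. */
static bool needs_pos_lock_raw(uint64_t raw)
{
	return raw != REF_ONEREF;
}
```

Under these assumptions the two tests agree for healthy counts and differ only for a count that has gone bad, where the raw comparison errs on the side of taking the lock.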
2025-03-20  arm64: dts: Add gpio_intc node for Amlogic A5 SoCs  Xianwei Zhao
Add GPIO interrupt controller device.

Signed-off-by: Xianwei Zhao <xianwei.zhao@amlogic.com>
Link: https://lore.kernel.org/r/20250311-irqchip-gpio-a4-a5-v5-4-ca4cc276c18c@amlogic.com
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
2025-03-20  arm64: dts: Add gpio_intc node for Amlogic A4 SoCs  Xianwei Zhao
Add GPIO interrupt controller device.

Signed-off-by: Xianwei Zhao <xianwei.zhao@amlogic.com>
Link: https://lore.kernel.org/r/20250311-irqchip-gpio-a4-a5-v5-3-ca4cc276c18c@amlogic.com
[narmstrong: fix commit to apply without pinctrl node]
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
2025-03-20  Merge branches 'apple/dart', 'arm/smmu/updates', 'arm/smmu/bindings', 'rockchip', 's390', 'core', 'intel/vt-d' and 'amd/amd-vi' into next  Joerg Roedel
2025-03-20  iommu/vt-d: Fix possible circular locking dependency  Lu Baolu
We have recently seen reports of lockdep circular lock dependency warnings on platforms like Skylake and Kabylake:

 ======================================================
 WARNING: possible circular locking dependency detected
 6.14.0-rc6-CI_DRM_16276-gca2c04fe76e8+ #1 Not tainted
 ------------------------------------------------------
 swapper/0/1 is trying to acquire lock:
 ffffffff8360ee48 (iommu_probe_device_lock){+.+.}-{3:3},
   at: iommu_probe_device+0x1d/0x70

 but task is already holding lock:
 ffff888102c7efa8 (&device->physical_node_lock){+.+.}-{3:3},
   at: intel_iommu_init+0xe75/0x11f0

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> #6 (&device->physical_node_lock){+.+.}-{3:3}:
        __mutex_lock+0xb4/0xe40
        mutex_lock_nested+0x1b/0x30
        intel_iommu_init+0xe75/0x11f0
        pci_iommu_init+0x13/0x70
        do_one_initcall+0x62/0x3f0
        kernel_init_freeable+0x3da/0x6a0
        kernel_init+0x1b/0x200
        ret_from_fork+0x44/0x70
        ret_from_fork_asm+0x1a/0x30

 -> #5 (dmar_global_lock){++++}-{3:3}:
        down_read+0x43/0x1d0
        enable_drhd_fault_handling+0x21/0x110
        cpuhp_invoke_callback+0x4c6/0x870
        cpuhp_issue_call+0xbf/0x1f0
        __cpuhp_setup_state_cpuslocked+0x111/0x320
        __cpuhp_setup_state+0xb0/0x220
        irq_remap_enable_fault_handling+0x3f/0xa0
        apic_intr_mode_init+0x5c/0x110
        x86_late_time_init+0x24/0x40
        start_kernel+0x895/0xbd0
        x86_64_start_reservations+0x18/0x30
        x86_64_start_kernel+0xbf/0x110
        common_startup_64+0x13e/0x141

 -> #4 (cpuhp_state_mutex){+.+.}-{3:3}:
        __mutex_lock+0xb4/0xe40
        mutex_lock_nested+0x1b/0x30
        __cpuhp_setup_state_cpuslocked+0x67/0x320
        __cpuhp_setup_state+0xb0/0x220
        page_alloc_init_cpuhp+0x2d/0x60
        mm_core_init+0x18/0x2c0
        start_kernel+0x576/0xbd0
        x86_64_start_reservations+0x18/0x30
        x86_64_start_kernel+0xbf/0x110
        common_startup_64+0x13e/0x141

 -> #3 (cpu_hotplug_lock){++++}-{0:0}:
        __cpuhp_state_add_instance+0x4f/0x220
        iova_domain_init_rcaches+0x214/0x280
        iommu_setup_dma_ops+0x1a4/0x710
        iommu_device_register+0x17d/0x260
        intel_iommu_init+0xda4/0x11f0
        pci_iommu_init+0x13/0x70
        do_one_initcall+0x62/0x3f0
        kernel_init_freeable+0x3da/0x6a0
        kernel_init+0x1b/0x200
        ret_from_fork+0x44/0x70
        ret_from_fork_asm+0x1a/0x30

 -> #2 (&domain->iova_cookie->mutex){+.+.}-{3:3}:
        __mutex_lock+0xb4/0xe40
        mutex_lock_nested+0x1b/0x30
        iommu_setup_dma_ops+0x16b/0x710
        iommu_device_register+0x17d/0x260
        intel_iommu_init+0xda4/0x11f0
        pci_iommu_init+0x13/0x70
        do_one_initcall+0x62/0x3f0
        kernel_init_freeable+0x3da/0x6a0
        kernel_init+0x1b/0x200
        ret_from_fork+0x44/0x70
        ret_from_fork_asm+0x1a/0x30

 -> #1 (&group->mutex){+.+.}-{3:3}:
        __mutex_lock+0xb4/0xe40
        mutex_lock_nested+0x1b/0x30
        __iommu_probe_device+0x24c/0x4e0
        probe_iommu_group+0x2b/0x50
        bus_for_each_dev+0x7d/0xe0
        iommu_device_register+0xe1/0x260
        intel_iommu_init+0xda4/0x11f0
        pci_iommu_init+0x13/0x70
        do_one_initcall+0x62/0x3f0
        kernel_init_freeable+0x3da/0x6a0
        kernel_init+0x1b/0x200
        ret_from_fork+0x44/0x70
        ret_from_fork_asm+0x1a/0x30

 -> #0 (iommu_probe_device_lock){+.+.}-{3:3}:
        __lock_acquire+0x1637/0x2810
        lock_acquire+0xc9/0x300
        __mutex_lock+0xb4/0xe40
        mutex_lock_nested+0x1b/0x30
        iommu_probe_device+0x1d/0x70
        intel_iommu_init+0xe90/0x11f0
        pci_iommu_init+0x13/0x70
        do_one_initcall+0x62/0x3f0
        kernel_init_freeable+0x3da/0x6a0
        kernel_init+0x1b/0x200
        ret_from_fork+0x44/0x70
        ret_from_fork_asm+0x1a/0x30

 other info that might help us debug this:

 Chain exists of:
   iommu_probe_device_lock --> dmar_global_lock -->
     &device->physical_node_lock

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&device->physical_node_lock);
                                lock(dmar_global_lock);
                                lock(&device->physical_node_lock);
   lock(iommu_probe_device_lock);

  *** DEADLOCK ***

This driver uses a global lock to protect the list of enumerated DMA remapping units. It is necessary due to the driver's support for dynamic addition and removal of remapping units at runtime.

Two distinct code paths require iteration over this remapping unit list:

- Device registration and probing: the driver iterates the list to register each remapping unit with the upper layer IOMMU framework and subsequently probe the devices managed by that unit.
- Global configuration: Upper layer components may also iterate the list to apply configuration changes.

The lock acquisition order between these two code paths was reversed. This caused lockdep warnings, indicating a risk of deadlock. Fix this warning by releasing the global lock before invoking upper layer interfaces for device registration.

Fixes: b150654f74bf ("iommu/vt-d: Fix suspicious RCU usage")
Closes: https://lore.kernel.org/linux-iommu/SJ1PR11MB612953431F94F18C954C4A9CB9D32@SJ1PR11MB6129.namprd11.prod.outlook.com/
Tested-by: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20250317035714.1041549-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
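The general idea of the fix, not holding a driver-global lock while calling into an upper layer that takes its own locks, can be sketched in user space with POSIX mutexes. Everything here is hypothetical: the lock names, the unit list, and the snapshot step are invented for illustration, and the actual patch may achieve the same ordering differently.

```c
#include <pthread.h>
#include <stddef.h>

#define NR_UNITS 3

/* Stand-in for the driver's global lock protecting the unit list. */
static pthread_mutex_t global_lock = PTHREAD_MUTEX_INITIALIZER;
/* Stand-in for a lock owned by the upper-layer framework. */
static pthread_mutex_t framework_lock = PTHREAD_MUTEX_INITIALIZER;

static int unit_list[NR_UNITS] = { 1, 2, 3 };
static int registered_sum;

/* Upper-layer registration: takes the framework lock internally. */
static void framework_register(int unit)
{
	pthread_mutex_lock(&framework_lock);
	registered_sum += unit;
	pthread_mutex_unlock(&framework_lock);
}

/* Registration path: the global lock is dropped before calling upward. */
static void register_all_units(void)
{
	int snapshot[NR_UNITS];
	size_t i;

	pthread_mutex_lock(&global_lock);
	for (i = 0; i < NR_UNITS; i++)
		snapshot[i] = unit_list[i];
	pthread_mutex_unlock(&global_lock);

	for (i = 0; i < NR_UNITS; i++)
		framework_register(snapshot[i]);
}

/* Configuration path: framework lock first, then the global lock. */
static int framework_count_units(void)
{
	int n = 0;
	size_t i;

	pthread_mutex_lock(&framework_lock);
	pthread_mutex_lock(&global_lock);
	for (i = 0; i < NR_UNITS; i++)
		if (unit_list[i])
			n++;
	pthread_mutex_unlock(&global_lock);
	pthread_mutex_unlock(&framework_lock);
	return n;
}
```

Because the registration path never holds `global_lock` while `framework_lock` is taken, the only nesting that remains is framework-then-global, so the two paths can no longer acquire the locks in opposite orders.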
2025-03-20  iommu/vt-d: Don't clobber posted vCPU IRTE when host IRQ affinity changes  Sean Christopherson
Don't overwrite an IRTE that is posting IRQs to a vCPU with a posted MSI entry if the host IRQ affinity happens to change. If/when the IRTE is reverted back to "host mode", it will be reconfigured as a posted MSI or remapped entry as appropriate.

Drop the "mode" field, which doesn't differentiate between posted MSIs and posted vCPUs, in favor of a dedicated posted_vcpu flag. Note! The two posted_{msi,vcpu} flags are intentionally not mutually exclusive; an IRTE can transition between posted MSI and posted vCPU.

Fixes: ed1e48ea4370 ("iommu/vt-d: Enable posted mode for device MSIs")
Cc: stable@vger.kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20250315025135.2365846-3-seanjc@google.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-03-20  iommu/vt-d: Put IRTE back into posted MSI mode if vCPU posting is disabled  Sean Christopherson
Add a helper to take care of reconfiguring an IRTE to deliver IRQs to the host, i.e. not to a vCPU, and use the helper when an IRTE's vCPU affinity is nullified, i.e. when KVM puts an IRTE back into "host" mode. Because posted MSIs use an ephemeral IRTE, using modify_irte() puts the IRTE into full remapped mode, i.e. unintentionally disables posted MSIs on the IRQ.

Fixes: ed1e48ea4370 ("iommu/vt-d: Enable posted mode for device MSIs")
Cc: stable@vger.kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20250315025135.2365846-2-seanjc@google.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-03-20  iommu: apple-dart: fix potential null pointer deref  Qasim Ijaz
If kzalloc() fails, accessing cfg->supports_bypass causes a null pointer dereference.

Fix by checking for NULL immediately after allocation and returning -ENOMEM.

Fixes: 3bc0102835f6 ("iommu: apple-dart: Allow mismatched bypass support")
Signed-off-by: Qasim Ijaz <qasdev00@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Link: https://lore.kernel.org/r/20250314230102.11008-1-qasdev00@gmail.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
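The check-before-use pattern this fix applies can be sketched in user space as follows. The struct, the function names, and the injectable allocator are all hypothetical, introduced only so that the failure path can be exercised; calloc() stands in for kzalloc().

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical configuration object, invented for this sketch. */
struct master_cfg {
	bool supports_bypass;
};

/* Allocator signature matching calloc(), so a failing one can be injected. */
typedef void *(*alloc_fn)(size_t nmemb, size_t size);

static int cfg_create(struct master_cfg **out, alloc_fn alloc)
{
	struct master_cfg *cfg = alloc(1, sizeof(*cfg));

	/* Check immediately after allocation, before the first dereference. */
	if (!cfg)
		return -ENOMEM;

	cfg->supports_bypass = true;
	*out = cfg;
	return 0;
}

/* Allocator that always fails, used to simulate memory exhaustion. */
static void *failing_alloc(size_t nmemb, size_t size)
{
	(void)nmemb;
	(void)size;
	return NULL;
}
```

Calling `cfg_create(&cfg, calloc)` follows the normal path, while `cfg_create(&cfg, failing_alloc)` returns `-ENOMEM` without ever touching `cfg->supports_bypass`.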
2025-03-20  iommu/rockchip: Retire global dma_dev workaround  Robin Murphy
The global dma_dev trick was mostly because the old domain_alloc op provided no context, so no way to know which IOMMU was to own the pagetable, or if a suitable one even existed at all. In the new multi-instance world with domain_alloc_paging this is no longer a concern - now we know that the given device must be associated with a valid IOMMU instance which provided the op to call in the first place, and therefore that instance can and should be the pagetable owner. To avoid worrying about the lifetime and stability of the rk_domain->iommus list, and keep the lookups simple and efficient, we'll still stash a dma_dev pointer, but now it's accurately per-domain.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Quentin Schulz <quentin.schulz@cherry.de>
Tested-by: Dang Huynh <danct12@riseup.net>
Reviewed-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
Tested-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
Link: https://lore.kernel.org/r/25dc948a7d35c8142c5719ac22bc523f8524d006.1741886382.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-03-20  iommu/rockchip: Register in a sensible order  Robin Murphy
Currently Rockchip calls iommu_device_register() before it's finished setting up the hardware and driver state, and as such it now gets unhappy in various ways when registration starts working the way it was always intended to, and probing client devices straight away. Reorder the operations to ensure that what we're registering is a prepared and functional IOMMU instance.

Fixes: bcb81ac6ae3c ("iommu: Get DT/ACPI parsing into the proper probe path")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Quentin Schulz <quentin.schulz@cherry.de>
Tested-by: Dang Huynh <danct12@riseup.net>
Reviewed-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
Tested-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
Link: https://lore.kernel.org/r/e69532f00bf49d98322b96788edb7e2e305e4006.1741886382.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
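Why the ordering matters once registration probes clients immediately can be shown with a small user-space model. All names below are hypothetical and the model is deliberately simplified; it is not the Rockchip driver's code.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical IOMMU instance state, invented for this sketch. */
struct iommu_instance {
	bool hw_ready;
	int clients_probed;
};

/* A client probe only succeeds against a fully prepared instance. */
static int probe_client(struct iommu_instance *iommu)
{
	if (!iommu->hw_ready)
		return -ENODEV;
	iommu->clients_probed++;
	return 0;
}

/* Model of registration that probes a client device straight away. */
static int instance_register(struct iommu_instance *iommu)
{
	return probe_client(iommu);
}

/* Problematic order: register first, finish setup afterwards. */
static int probe_register_first(struct iommu_instance *iommu)
{
	int ret = instance_register(iommu);

	iommu->hw_ready = true;
	return ret;
}

/* Sensible order: finish setup, then register a functional instance. */
static int probe_setup_first(struct iommu_instance *iommu)
{
	iommu->hw_ready = true;
	return instance_register(iommu);
}
```

In this model `probe_register_first()` fails the client probe because the instance is not yet ready when registration runs, whereas `probe_setup_first()` succeeds.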