summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-12-11MAINTAINERS: Add ethtool.h to NETWORKING [GENERAL]Simon Horman
This is part of an effort to assign a section in MAINTAINERS to header files related to Networking. In this case the files named ethool.h. Signed-off-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241210-mnt-ethtool-h-v1-1-2a40b567939d@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-12wifi: rtw89: phy: add dummy C2H event handler for report of TAS powerPing-Ke Shih
The newer firmware, lik RTL8852C version 0.27.111.0, will notify driver report of TAS (Time Averaged SAR) power by new C2H events. This is to assist in higher accurate calculation of TAS. For now, driver doesn't use the report yet, so add a dummy handler to avoid it throws info like: rtw89_8852ce 0000:03:00.0: c2h class 9 func 6 not support Also add "MAC" and "PHY" to the message to disambiguate the source of C2H event. Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://patch.msgid.link/20241209042127.21424-1-pkshih@realtek.com
2024-12-12wifi: rtw89: 8851b: rfk: remove unnecessary assignment of return value of ↵Ping-Ke Shih
_dpk_dgain_read() The return value of _dpk_dgain_read() is not used afterward, so remove it safely. Addresses-Coverity-ID: 1504753 ("Unused value") Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://patch.msgid.link/20241209042020.21290-2-pkshih@realtek.com
2024-12-12wifi: rtw89: 8852c: rfk: refine target channel calculation in ↵Ping-Ke Shih
_rx_dck_channel_calc() The channel is not possibly 0, so original code is fine. Still want to avoid Coverity warning, so ensure -32 offset for the channel number which is larger than 125 only. Actually, don't change logic at all. Addresses-Coverity-ID: 1628150 ("Overflowed constant") Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://patch.msgid.link/20241209042020.21290-1-pkshih@realtek.com
2024-12-12wifi: rtlwifi: pci: wait for firmware loading before releasing memoryThadeu Lima de Souza Cascardo
At probe error path, the firmware loading work may have already been queued. In such a case, it will try to access memory allocated by the probe function, which is about to be released. In such paths, wait for the firmware worker to finish before releasing memory. Fixes: 3d86b93064c7 ("rtlwifi: Fix PCI probe error path orphaned memory") Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://patch.msgid.link/20241206173713.3222187-5-cascardo@igalia.com
2024-12-12wifi: rtlwifi: fix memory leaks and invalid access at probe error pathThadeu Lima de Souza Cascardo
Deinitialize at reverse order when probe fails. When init_sw_vars fails, rtl_deinit_core should not be called, specially now that it destroys the rtl_wq workqueue. And call rtl_pci_deinit and deinit_sw_vars, otherwise, memory will be leaked. Remove pci_set_drvdata call as it will already be cleaned up by the core driver code and could lead to memory leaks too. cf. commit 8d450935ae7f ("wireless: rtlwifi: remove unnecessary pci_set_drvdata()") and commit 3d86b93064c7 ("rtlwifi: Fix PCI probe error path orphaned memory"). Fixes: 0c8173385e54 ("rtl8192ce: Add new driver") Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://patch.msgid.link/20241206173713.3222187-4-cascardo@igalia.com
2024-12-12wifi: rtlwifi: destroy workqueue at rtl_deinit_coreThadeu Lima de Souza Cascardo
rtl_wq is allocated at rtl_init_core, so it makes more sense to destroy it at rtl_deinit_core. In the case of USB, where _rtl_usb_init does not require anything to be undone, that is fine. But for PCI, rtl_pci_init, which is called after rtl_init_core, needs to deallocate data, but only if it has been called. That means that destroying the workqueue needs to be done whether rtl_pci_init has been called or not. And since rtl_pci_deinit was doing it, it has to be moved out of there. It makes more sense to move it to rtl_deinit_core and have it done in both cases, USB and PCI. Since this is a requirement for a followup memory leak fix, mark this as fixing such memory leak. Fixes: 0c8173385e54 ("rtl8192ce: Add new driver") Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://patch.msgid.link/20241206173713.3222187-3-cascardo@igalia.com
2024-12-12wifi: rtlwifi: remove unused check_buddy_privThadeu Lima de Souza Cascardo
Commit 2461c7d60f9f ("rtlwifi: Update header file") introduced a global list of private data structures. Later on, commit 26634c4b1868 ("rtlwifi Modify existing bits to match vendor version 2013.02.07") started adding the private data to that list at probe time and added a hook, check_buddy_priv to find the private data from a similar device. However, that function was never used. Besides, though there is a lock for that list, it is never used. And when the probe fails, the private data is never removed from the list. This would cause a second probe to access freed memory. Remove the unused hook, structures and members, which will prevent the potential race condition on the list and its corruption during a second probe when probe fails. Fixes: 26634c4b1868 ("rtlwifi Modify existing bits to match vendor version 2013.02.07") Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://patch.msgid.link/20241206173713.3222187-2-cascardo@igalia.com
2024-12-12wifi: rtw89: 8922a: update format of RFK pre-notify H2C command v2Chih-Kang Chang
The RFK pre-notify H2C command is to tell firmware the channels driver is using. Since the format is changed after 0.35.49.0, update it accordingly. Signed-off-by: Chih-Kang Chang <gary.chang@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://patch.msgid.link/20241206055716.18598-8-pkshih@realtek.com
2024-12-12wifi: rtw89: regd: update regulatory map to R68-R51Zong-Zhe Yang
Sync Realtek Channel Plan R68 and Realtek Regulatory R51. Configure 6 GHz field of Realtek regd for the following countries. BO, DO, EG, LS, MZ, NG, OM, ZW, PK, PH, TH, KM, CG, CD, GE, GI, GU, LR, MH, FM, MP, PW, MF, SX, SZ, TZ, VI Besides, add entries for the following countries. CU, SY, SD Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://patch.msgid.link/20241206055716.18598-7-pkshih@realtek.com
2024-12-12wifi: rtw89: 8852c: disable ER SU when 4x HE-LTF and 0.8 GI capability differKuan-Chung Chen
Since hardware only has single one register for HE-LTF setting, to prevent interoperability issues, 8852CE disables ER SU when the AP can handle SU/MU with 4x HE-LTF and 0.8 GI, but does not support ER SU with the same settings. Signed-off-by: Kuan-Chung Chen <damon.chen@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://patch.msgid.link/20241206055716.18598-6-pkshih@realtek.com
2024-12-12wifi: rtw89: disable firmware training HE GI and LTFKuan-Chung Chen
Given the performance trade-off associated with firmware training HE GI/LTF, especially in high attenuation environments, we have decided to utilize a constant value instead. Signed-off-by: Kuan-Chung Chen <damon.chen@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://patch.msgid.link/20241206055716.18598-5-pkshih@realtek.com
2024-12-12wifi: rtw89: ps: update data for firmware and settings for hardware ↵Eric Huang
before/after PS For MLO supported IC, send H2C command to firmware before PS with link information for each PHY for MLO to work properly. And re-init hardware settings regarding to RX descriptor information after PS. Signed-off-by: Eric Huang <echuang@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://patch.msgid.link/20241206055716.18598-4-pkshih@realtek.com
2024-12-12wifi: rtw89: ps: refactor channel info to firmware before entering PSPing-Ke Shih
In PS mode, firmware needs hardware parameters related to channel info to configure hardware itself. Before entering PS, driver prepares these info to firmware via firmware H2C command. Since firmware only consider PS for single one vif, change the argument of entry function to rtwvif, and only consider first link for this old H2C command that only support legacy. Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://patch.msgid.link/20241206055716.18598-3-pkshih@realtek.com
2024-12-12wifi: rtw89: ps: refactor PS flow to support MLOPing-Ke Shih
Firmware can only support PS on single one VIF operating in station mode, so argument of PS entry rtw89_enter_lps() should be rtwvif insetad of rtwvif_link. To enter PS under MLO, for each rtwvif, driver sends H2C command to tell firmware which mac_id will enter PS one by one, and afterward asks firmware to enter deep PS. Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://patch.msgid.link/20241206055716.18598-2-pkshih@realtek.com
2024-12-11udmabuf: fix memory leak on last export_udmabuf() error pathJann Horn
In export_udmabuf(), if dma_buf_fd() fails because the FD table is full, a dma_buf owning the udmabuf has already been created; but the error handling in udmabuf_create() will tear down the udmabuf without doing anything about the containing dma_buf. This leaves a dma_buf in memory that contains a dangling pointer; though that doesn't seem to lead to anything bad except a memory leak. Fix it by moving the dma_buf_fd() call out of export_udmabuf() so that we can give it different error handling. Note that the shape of this code changed a lot in commit 5e72b2b41a21 ("udmabuf: convert udmabuf driver to use folios"); but the memory leak seems to have existed since the introduction of udmabuf. Fixes: fbb0de795078 ("Add udmabuf misc device") Acked-by: Vivek Kasireddy <vivek.kasireddy@intel.com> Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241204-udmabuf-fixes-v2-3-23887289de1c@google.com
2024-12-11udmabuf: also check for F_SEAL_FUTURE_WRITEJann Horn
When F_SEAL_FUTURE_WRITE was introduced, it was overlooked that udmabuf must reject memfds with this flag, just like ones with F_SEAL_WRITE. Fix it by adding F_SEAL_FUTURE_WRITE to SEALS_DENIED. Fixes: ab3948f58ff8 ("mm/memfd: add an F_SEAL_FUTURE_WRITE seal to memfd") Cc: stable@vger.kernel.org Acked-by: Vivek Kasireddy <vivek.kasireddy@intel.com> Signed-off-by: Jann Horn <jannh@google.com> Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241204-udmabuf-fixes-v2-2-23887289de1c@google.com
2024-12-11udmabuf: fix racy memfd sealing checkJann Horn
The current check_memfd_seals() is racy: Since we first do check_memfd_seals() and then udmabuf_pin_folios() without holding any relevant lock across both, F_SEAL_WRITE can be set in between. This is problematic because we can end up holding pins to pages in a write-sealed memfd. Fix it using the inode lock, that's probably the easiest way. In the future, we might want to consider moving this logic into memfd, especially if anyone else wants to use memfd_pin_folios(). Reported-by: Julian Orth <ju.orth@gmail.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219106 Closes: https://lore.kernel.org/r/CAG48ez0w8HrFEZtJkfmkVKFDhE5aP7nz=obrimeTgpD+StkV9w@mail.gmail.com Fixes: fbb0de795078 ("Add udmabuf misc device") Cc: stable@vger.kernel.org Signed-off-by: Jann Horn <jannh@google.com> Acked-by: Joel Fernandes (Google) <joel@joelfernandes.org> Acked-by: Vivek Kasireddy <vivek.kasireddy@intel.com> Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241204-udmabuf-fixes-v2-1-23887289de1c@google.com
2024-12-11nvme-pci: 512 byte aligned dma pool segment quirkRobert Beckett
We initially introduced a quick fix limiting the queue depth to 1 as experimentation showed that it fixed data corruption on 64GB steamdecks. Further experimentation revealed corruption only happens when the last PRP data element aligns to the end of the page boundary. The device appears to treat this as a PRP chain to a new list instead of the data element that it actually is. This implementation is in violation of the spec. Encountering this errata with the Linux driver requires the host request a 128k transfer and coincidently be handed the last small pool dma buffer within a page. The QD1 quirk effectly works around this because the last data PRP always was at a 248 byte offset from the page start, so it never appeared at the end of the page, but comes at the expense of throttling IO and wasting the remainder of the PRP page beyond 256 bytes. Also to note, the MDTS on these devices is small enough that the "large" prp pool can hold enough PRP elements to never reach the end, so that pool is not a problem either. Introduce a new quirk to ensure the small pool is always aligned such that the last PRP element can't appear a the end of the page. This comes at the expense of wasting 256 bytes per small pool page allocated. Link: https://lore.kernel.org/linux-nvme/20241113043151.GA20077@lst.de/T/#u Fixes: 83bdfcbdbe5d ("nvme-pci: qdepth 1 quirk") Cc: Paweł Anikiel <panikiel@google.com> Signed-off-by: Robert Beckett <bob.beckett@collabora.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-12-11netfilter: nf_tables: do not defer rule destruction via call_rcuFlorian Westphal
nf_tables_chain_destroy can sleep, it can't be used from call_rcu callbacks. Moreover, nf_tables_rule_release() is only safe for error unwinding, while transaction mutex is held and the to-be-desroyed rule was not exposed to either dataplane or dumps, as it deactives+frees without the required synchronize_rcu() in-between. nft_rule_expr_deactivate() callbacks will change ->use counters of other chains/sets, see e.g. nft_lookup .deactivate callback, these must be serialized via transaction mutex. Also add a few lockdep asserts to make this more explicit. Calling synchronize_rcu() isn't ideal, but fixing this without is hard and way more intrusive. As-is, we can get: WARNING: .. net/netfilter/nf_tables_api.c:5515 nft_set_destroy+0x.. Workqueue: events nf_tables_trans_destroy_work RIP: 0010:nft_set_destroy+0x3fe/0x5c0 Call Trace: <TASK> nf_tables_trans_destroy_work+0x6b7/0xad0 process_one_work+0x64a/0xce0 worker_thread+0x613/0x10d0 In case the synchronize_rcu becomes an issue, we can explore alternatives. One way would be to allocate nft_trans_rule objects + one nft_trans_chain object, deactivate the rules + the chain and then defer the freeing to the nft destroy workqueue. We'd still need to keep the synchronize_rcu path as a fallback to handle -ENOMEM corner cases though. Reported-by: syzbot+b26935466701e56cfdc2@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/67478d92.050a0220.253251.0062.GAE@google.com/T/ Fixes: c03d278fdf35 ("netfilter: nf_tables: wait for rcu grace period on net_device removal") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-12-11netfilter: IDLETIMER: Fix for possible ABBA deadlockPhil Sutter
Deletion of the last rule referencing a given idletimer may happen at the same time as a read of its file in sysfs: | ====================================================== | WARNING: possible circular locking dependency detected | 6.12.0-rc7-01692-g5e9a28f41134-dirty #594 Not tainted | ------------------------------------------------------ | iptables/3303 is trying to acquire lock: | ffff8881057e04b8 (kn->active#48){++++}-{0:0}, at: __kernfs_remove+0x20 | | but task is already holding lock: | ffffffffa0249068 (list_mutex){+.+.}-{3:3}, at: idletimer_tg_destroy_v] | | which lock already depends on the new lock. A simple reproducer is: | #!/bin/bash | | while true; do | iptables -A INPUT -i foo -j IDLETIMER --timeout 10 --label "testme" | iptables -D INPUT -i foo -j IDLETIMER --timeout 10 --label "testme" | done & | while true; do | cat /sys/class/xt_idletimer/timers/testme >/dev/null | done Avoid this by freeing list_mutex right after deleting the element from the list, then continuing with the teardown. Fixes: 0902b469bd25 ("netfilter: xtables: idletimer target implementation") Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-12-11selftests: netfilter: Stabilize rpath.shPhil Sutter
On some systems, neighbor discoveries from ns1 for fec0:42::1 (i.e., the martian trap address) would happen at the wrong time and cause false-negative test result. Problem analysis also discovered that IPv6 martian ping test was broken in that sent neighbor discoveries, not echo requests were inadvertently trapped Avoid the race condition by introducing the neighbors to each other upfront. Also pin down the firewall rules to matching on echo requests only. Fixes: efb056e5f1f0 ("netfilter: ip6t_rpfilter: Fix regression with VRF interfaces") Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-12-11Revert "unicode: Don't special case ignorable code points"Linus Torvalds
This reverts commit 5c26d2f1d3f5e4be3e196526bead29ecb139cf91. It turns out that we can't do this, because while the old behavior of ignoring ignorable code points was most definitely wrong, we have case-folding filesystems with on-disk hash values with that wrong behavior. So now you can't look up those names, because they hash to something different. Of course, it's also entirely possible that in the meantime people have created *new* files with the new ("more correct") case folding logic, and reverting will just make other things break. The correct solution is to not do case folding in filesystems, but sadly, people seem to never really understand that. People still see it as a feature, not a bug. Reported-by: Qi Han <hanqi@vivo.com> Link: https://bugzilla.kernel.org/show_bug.cgi?id=219586 Cc: Gabriel Krisman Bertazi <krisman@suse.de> Requested-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-12-11Merge tag 'vfio-v6.13-rc3' of https://github.com/awilliam/linux-vfioLinus Torvalds
Pull vfio fix from Alex Williamson: - Fix migration dirty page tracking support in the mlx5-vfio-pci variant driver in configurations where the system page size exceeds the device maximum message size, and anticipate device updates where the opposite may also be required (Yishai Hadas) * tag 'vfio-v6.13-rc3' of https://github.com/awilliam/linux-vfio: vfio/mlx5: Align the page tracking max message size with the device capability
2024-12-11Merge tag 'linux_kselftest-fixes-6.13-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull Kselftest fix from Shuah Khan: - fix the offset for kprobe syntax error test case when checking the BTF arguments on 64-bit powerpc * tag 'linux_kselftest-fixes-6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: selftests/ftrace: adjust offset for kprobe syntax error test
2024-12-11sched_ext: Fix invalid irq restore in scx_ops_bypass()Tejun Heo
While adding outer irqsave/restore locking, 0e7ffff1b811 ("scx: Fix raciness in scx_ops_bypass()") forgot to convert an inner rq_unlock_irqrestore() to rq_unlock() which could re-enable IRQ prematurely leading to the following warning: raw_local_irq_restore() called with IRQs enabled WARNING: CPU: 1 PID: 96 at kernel/locking/irqflag-debug.c:10 warn_bogus_irq_restore+0x30/0x40 ... Sched_ext: create_dsq (enabling) pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : warn_bogus_irq_restore+0x30/0x40 lr : warn_bogus_irq_restore+0x30/0x40 ... Call trace: warn_bogus_irq_restore+0x30/0x40 (P) warn_bogus_irq_restore+0x30/0x40 (L) scx_ops_bypass+0x224/0x3b8 scx_ops_enable.isra.0+0x2c8/0xaa8 bpf_scx_reg+0x18/0x30 ... irq event stamp: 33739 hardirqs last enabled at (33739): [<ffff8000800b699c>] scx_ops_bypass+0x174/0x3b8 hardirqs last disabled at (33738): [<ffff800080d48ad4>] _raw_spin_lock_irqsave+0xb4/0xd8 Drop the stray _irqrestore(). Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Ihor Solodrai <ihor.solodrai@pm.me> Link: http://lkml.kernel.org/r/qC39k3UsonrBYD_SmuxHnZIQLsuuccoCrkiqb_BT7DvH945A1_LZwE4g-5Pu9FcCtqZt4lY1HhIPi0homRuNWxkgo1rgP3bkxa0donw8kV4=@pm.me Fixes: 0e7ffff1b811 ("scx: Fix raciness in scx_ops_bypass()") Cc: stable@vger.kernel.org # v6.12
2024-12-11EDAC/amd64: Simplify ECC check on unified memory controllersBorislav Petkov (AMD)
The intent of the check is to see whether at least one UMC has ECC enabled. So do that instead of tracking which ones are enabled in masks which are too small in size anyway and lead to not loading the driver on Zen4 machines with UMCs enabled over UMC8. Fixes: e2be5955a886 ("EDAC/amd64: Add support for AMD Family 19h Models 10h-1Fh and A0h-AFh") Reported-by: Avadhut Naik <avadhut.naik@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Tested-by: Avadhut Naik <avadhut.naik@amd.com> Reviewed-by: Avadhut Naik <avadhut.naik@amd.com> Cc: <stable@kernel.org> Link: https://lore.kernel.org/r/20241210212054.3895697-1-avadhut.naik@amd.com
2024-12-11riscv: mm: Do not call pmd dtor on vmemmap page table teardownBjörn Töpel
The vmemmap's, which is used for RV64 with SPARSEMEM_VMEMMAP, page tables are populated using pmd (page middle directory) hugetables. However, the pmd allocation is not using the generic mechanism used by the VMA code (e.g. pmd_alloc()), or the RISC-V specific create_pgd_mapping()/alloc_pmd_late(). Instead, the vmemmap page table code allocates a page, and calls vmemmap_set_pmd(). This results in that the pmd ctor is *not* called, nor would it make sense to do so. Now, when tearing down a vmemmap page table pmd, the cleanup code would unconditionally, and incorrectly call the pmd dtor, which results in a crash (best case). This issue was found when running the HMM selftests: | tools/testing/selftests/mm# ./test_hmm.sh smoke | ... # when unloading the test_hmm.ko module | page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x10915b | flags: 0x1000000000000000(node=0|zone=1) | raw: 1000000000000000 0000000000000000 dead000000000122 0000000000000000 | raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000 | page dumped because: VM_BUG_ON_PAGE(ptdesc->pmd_huge_pte) | ------------[ cut here ]------------ | kernel BUG at include/linux/mm.h:3080! | Kernel BUG [#1] | Modules linked in: test_hmm(-) sch_fq_codel fuse drm drm_panel_orientation_quirks backlight dm_mod | CPU: 1 UID: 0 PID: 514 Comm: modprobe Tainted: G W 6.12.0-00982-gf2a4f1682d07 #2 | Tainted: [W]=WARN | Hardware name: riscv-virtio qemu/qemu, BIOS 2024.10 10/01/2024 | epc : remove_pgd_mapping+0xbec/0x1070 | ra : remove_pgd_mapping+0xbec/0x1070 | epc : ffffffff80010a68 ra : ffffffff80010a68 sp : ff20000000a73940 | gp : ffffffff827b2d88 tp : ff6000008785da40 t0 : ffffffff80fbce04 | t1 : 0720072007200720 t2 : 706d756420656761 s0 : ff20000000a73a50 | s1 : ff6000008915cff8 a0 : 0000000000000039 a1 : 0000000000000008 | a2 : ff600003fff0de20 a3 : 0000000000000000 a4 : 0000000000000000 | a5 : 0000000000000000 a6 : c0000000ffffefff a7 : ffffffff824469b8 | s2 : ff1c0000022456c0 s3 : ff1ffffffdbfffff s4 : ff6000008915c000 | s5 : ff6000008915c000 s6 : ff6000008915c000 s7 : ff1ffffffdc00000 | s8 : 0000000000000001 s9 : ff1ffffffdc00000 s10: ffffffff819a31f0 | s11: ffffffffffffffff t3 : ffffffff8000c950 t4 : ff60000080244f00 | t5 : ff60000080244000 t6 : ff20000000a73708 | status: 0000000200000120 badaddr: ffffffff80010a68 cause: 0000000000000003 | [<ffffffff80010a68>] remove_pgd_mapping+0xbec/0x1070 | [<ffffffff80fd238e>] vmemmap_free+0x14/0x1e | [<ffffffff8032e698>] section_deactivate+0x220/0x452 | [<ffffffff8032ef7e>] sparse_remove_section+0x4a/0x58 | [<ffffffff802f8700>] __remove_pages+0x7e/0xba | [<ffffffff803760d8>] memunmap_pages+0x2bc/0x3fe | [<ffffffff02a3ca28>] dmirror_device_remove_chunks+0x2ea/0x518 [test_hmm] | [<ffffffff02a3e026>] hmm_dmirror_exit+0x3e/0x1018 [test_hmm] | [<ffffffff80102c14>] __riscv_sys_delete_module+0x15a/0x2a6 | [<ffffffff80fd020c>] do_trap_ecall_u+0x1f2/0x266 | [<ffffffff80fde0a2>] _new_vmalloc_restore_context_a0+0xc6/0xd2 | Code: bf51 7597 0184 8593 76a5 854a 4097 0029 80e7 2c00 (9002) 7597 | ---[ end trace 0000000000000000 ]--- | Kernel panic - not syncing: Fatal exception in interrupt Add a check to avoid calling the pmd dtor, if the calling context is vmemmap_free(). Fixes: c75a74f4ba19 ("riscv: mm: Add memory hotplugging support") Signed-off-by: Björn Töpel <bjorn@rivosinc.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20241120131203.1859787-1-bjorn@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-12-11riscv: Fix IPIs usage in kfence_protect_page()Alexandre Ghiti
flush_tlb_kernel_range() may use IPIs to flush the TLBs of all the cores, which triggers the following warning when the irqs are disabled: [ 3.455330] WARNING: CPU: 1 PID: 0 at kernel/smp.c:815 smp_call_function_many_cond+0x452/0x520 [ 3.456647] Modules linked in: [ 3.457218] CPU: 1 UID: 0 PID: 0 Comm: swapper/1 Not tainted 6.12.0-rc7-00010-g91d3de7240b8 #1 [ 3.457416] Hardware name: QEMU QEMU Virtual Machine, BIOS [ 3.457633] epc : smp_call_function_many_cond+0x452/0x520 [ 3.457736] ra : on_each_cpu_cond_mask+0x1e/0x30 [ 3.457786] epc : ffffffff800b669a ra : ffffffff800b67c2 sp : ff2000000000bb50 [ 3.457824] gp : ffffffff815212b8 tp : ff6000008014f080 t0 : 000000000000003f [ 3.457859] t1 : ffffffff815221e0 t2 : 000000000000000f s0 : ff2000000000bc10 [ 3.457920] s1 : 0000000000000040 a0 : ffffffff815221e0 a1 : 0000000000000001 [ 3.457953] a2 : 0000000000010000 a3 : 0000000000000003 a4 : 0000000000000000 [ 3.458006] a5 : 0000000000000000 a6 : ffffffffffffffff a7 : 0000000000000000 [ 3.458042] s2 : ffffffff815223be s3 : 00fffffffffff000 s4 : ff600001ffe38fc0 [ 3.458076] s5 : ff600001ff950d00 s6 : 0000000200000120 s7 : 0000000000000001 [ 3.458109] s8 : 0000000000000001 s9 : ff60000080841ef0 s10: 0000000000000001 [ 3.458141] s11: ffffffff81524812 t3 : 0000000000000001 t4 : ff60000080092bc0 [ 3.458172] t5 : 0000000000000000 t6 : ff200000000236d0 [ 3.458203] status: 0000000200000100 badaddr: ffffffff800b669a cause: 0000000000000003 [ 3.458373] [<ffffffff800b669a>] smp_call_function_many_cond+0x452/0x520 [ 3.458593] [<ffffffff800b67c2>] on_each_cpu_cond_mask+0x1e/0x30 [ 3.458625] [<ffffffff8000e4ca>] __flush_tlb_range+0x118/0x1ca [ 3.458656] [<ffffffff8000e6b2>] flush_tlb_kernel_range+0x1e/0x26 [ 3.458683] [<ffffffff801ea56a>] kfence_protect+0xc0/0xce [ 3.458717] [<ffffffff801e9456>] kfence_guarded_free+0xc6/0x1c0 [ 3.458742] [<ffffffff801e9d6c>] __kfence_free+0x62/0xc6 [ 3.458764] [<ffffffff801c57d8>] kfree+0x106/0x32c [ 3.458786] [<ffffffff80588cf2>] detach_buf_split+0x188/0x1a8 [ 3.458816] [<ffffffff8058708c>] virtqueue_get_buf_ctx+0xb6/0x1f6 [ 3.458839] [<ffffffff805871da>] virtqueue_get_buf+0xe/0x16 [ 3.458880] [<ffffffff80613d6a>] virtblk_done+0x5c/0xe2 [ 3.458908] [<ffffffff8058766e>] vring_interrupt+0x6a/0x74 [ 3.458930] [<ffffffff800747d8>] __handle_irq_event_percpu+0x7c/0xe2 [ 3.458956] [<ffffffff800748f0>] handle_irq_event+0x3c/0x86 [ 3.458978] [<ffffffff800786cc>] handle_simple_irq+0x9e/0xbe [ 3.459004] [<ffffffff80073934>] generic_handle_domain_irq+0x1c/0x2a [ 3.459027] [<ffffffff804bf87c>] imsic_handle_irq+0xba/0x120 [ 3.459056] [<ffffffff80073934>] generic_handle_domain_irq+0x1c/0x2a [ 3.459080] [<ffffffff804bdb76>] riscv_intc_aia_irq+0x24/0x34 [ 3.459103] [<ffffffff809d0452>] handle_riscv_irq+0x2e/0x4c [ 3.459133] [<ffffffff809d923e>] call_on_irq_stack+0x32/0x40 So only flush the local TLB and let the lazy kfence page fault handling deal with the faults which could happen when a core has an old protected pte version cached in its TLB. That leads to potential inaccuracies which can be tolerated when using kfence. Fixes: 47513f243b45 ("riscv: Enable KFENCE for riscv64") Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20241209074125.52322-1-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-12-11riscv: Fix wrong usage of __pa() on a fixmap addressAlexandre Ghiti
riscv uses fixmap addresses to map the dtb so we can't use __pa() which is reserved for linear mapping addresses. Fixes: b2473a359763 ("of/fdt: add dt_phys arg to early_init_dt_scan and early_init_dt_verify") Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20241209074508.53037-1-alexghiti@rivosinc.com Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-12-11riscv: Fixup boot failure when CONFIG_DEBUG_RT_MUTEXES=yGuo Ren
When CONFIG_DEBUG_RT_MUTEXES=y, mutex_lock->rt_mutex_try_acquire would change from rt_mutex_cmpxchg_acquire to rt_mutex_slowtrylock(): raw_spin_lock_irqsave(&lock->wait_lock, flags); ret = __rt_mutex_slowtrylock(lock); raw_spin_unlock_irqrestore(&lock->wait_lock, flags); Because queued_spin_#ops to ticket_#ops is changed one by one by jump_label, raw_spin_lock/unlock would cause a deadlock during the changing. That means in arch/riscv/kernel/jump_label.c: 1. arch_jump_label_transform_queue() -> mutex_lock(&text_mutex); +-> raw_spin_lock -> queued_spin_lock |-> raw_spin_unlock -> queued_spin_unlock patch_insn_write -> change the raw_spin_lock to ticket_lock mutex_unlock(&text_mutex); ... 2. /* Dirty the lock value */ arch_jump_label_transform_queue() -> mutex_lock(&text_mutex); +-> raw_spin_lock -> *ticket_lock* |-> raw_spin_unlock -> *queued_spin_unlock* /* BUG: ticket_lock with queued_spin_unlock */ patch_insn_write -> change the raw_spin_unlock to ticket_unlock mutex_unlock(&text_mutex); ... 3. /* Dead lock */ arch_jump_label_transform_queue() -> mutex_lock(&text_mutex); +-> raw_spin_lock -> ticket_lock /* deadlock! */ |-> raw_spin_unlock -> ticket_unlock patch_insn_write -> change other raw_spin_#op -> ticket_#op mutex_unlock(&text_mutex); So, the solution is to disable mutex usage of arch_jump_label_transform_queue() during early_boot_irqs_disabled, just like we have done for stop_machine. Reported-by: Conor Dooley <conor@kernel.org> Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Fixes: ab83647fadae ("riscv: Add qspinlock support") Link: https://lore.kernel.org/linux-riscv/CAJF2gTQwYTGinBmCSgVUoPv0_q4EPt_+WiyfUA1HViAKgUzxAg@mail.gmail.com/T/#mf488e6347817fca03bb93a7d34df33d8615b3775 Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Tested-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com> Tested-by: Nam Cao <namcao@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20241130153310.3349484-1-guoren@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-12-11ASoC: fsl_spdif: change IFACE_PCM to IFACE_MIXERShengjiu Wang
As the snd_soc_card_get_kcontrol() is updated to use snd_ctl_find_id_mixer() in commit 897cc72b0837 ("ASoC: soc-card: Use snd_ctl_find_id_mixer() instead of open-coding") which make the iface fix to be IFACE_MIXER. Fixes: 897cc72b0837 ("ASoC: soc-card: Use snd_ctl_find_id_mixer() instead of open-coding") Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com> Link: https://patch.msgid.link/20241126053254.3657344-3-shengjiu.wang@nxp.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-12-11ASoC: fsl_xcvr: change IFACE_PCM to IFACE_MIXERShengjiu Wang
As the snd_soc_card_get_kcontrol() is updated to use snd_ctl_find_id_mixer() in commit 897cc72b0837 ("ASoC: soc-card: Use snd_ctl_find_id_mixer() instead of open-coding") which make the iface fix to be IFACE_MIXER. Fixes: 897cc72b0837 ("ASoC: soc-card: Use snd_ctl_find_id_mixer() instead of open-coding") Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com> Link: https://patch.msgid.link/20241126053254.3657344-2-shengjiu.wang@nxp.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-12-11libperf: evlist: Fix --cpu argument on hybrid platformJames Clark
Since the linked fixes: commit, specifying a CPU on hybrid platforms results in an error because Perf tries to open an extended type event on "any" CPU which isn't valid. Extended type events can only be opened on CPUs that match the type. Before (working): $ perf record --cpu 1 -- true [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 2.385 MB perf.data (7 samples) ] After (not working): $ perf record -C 1 -- true WARNING: A requested CPU in '1' is not supported by PMU 'cpu_atom' (CPUs 16-27) for event 'cycles:P' Error: The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cpu_atom/cycles:P/). /bin/dmesg | grep -i perf may provide additional information. (Ignore the warning message, that's expected and not particularly relevant to this issue). This is because perf_cpu_map__intersect() of the user specified CPU (1) and one of the PMU's CPUs (16-27) correctly results in an empty (NULL) CPU map. However for the purposes of opening an event, libperf converts empty CPU maps into an any CPU (-1) which the kernel rejects. Fix it by deleting evsels with empty CPU maps in the specific case where user requested CPU maps are evaluated. Fixes: 251aa040244a ("perf parse-events: Wildcard most "numeric" events") Reviewed-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Signed-off-by: James Clark <james.clark@linaro.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: https://lore.kernel.org/r/20241114160450.295844-2-james.clark@linaro.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-12-11perf test expr: Fix system_tsc_freq for only x86Ian Rogers
The refactoring of tool PMU events to have a PMU then adding the expr literals to the tool PMU made it so that the literal system_tsc_freq was only supported on x86. Update the test expectations to match - namely the parsing is x86 specific and only yields a non-zero value on Intel. Fixes: 609aa2667f67 ("perf tool_pmu: Switch to standard pmu functions and json descriptions") Reported-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Closes: https://lore.kernel.org/linux-perf-users/20241022140156.98854-1-atrajeev@linux.vnet.ibm.com/ Co-developed-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@linaro.org> Cc: akanksha@linux.ibm.com Cc: hbathini@linux.ibm.com Cc: kjain@linux.ibm.com Cc: maddy@linux.ibm.com Cc: disgoel@linux.vnet.ibm.com Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20241205022305.158202-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-12-11selftests/ftrace: adjust offset for kprobe syntax error testHari Bathini
In 'NOFENTRY_ARGS' test case for syntax check, any offset X of `vfs_read+X` except function entry offset (0) fits the criterion, even if that offset is not at instruction boundary, as the parser comes before probing. But with "ENDBR64" instruction on x86, offset 4 is treated as function entry. So, X can't be 4 as well. Thus, 8 was used as offset for the test case. On 64-bit powerpc though, any offset <= 16 can be considered function entry depending on build configuration (see arch_kprobe_on_func_entry() for implementation details). So, use `vfs_read+20` to accommodate that scenario too. Link: https://lore.kernel.org/r/20241129202621.721159-1-hbathini@linux.ibm.com Fixes: 4231f30fcc34a ("selftests/ftrace: Add BTF arguments test cases") Suggested-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-12-11s390/ipl: Fix never less than zero warningAlexander Gordeev
DEFINE_IPL_ATTR_STR_RW() macro produces "unsigned 'len' is never less than zero." warning when sys_vmcmd_on_*_store() callbacks are defined. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202412081614.5uel8F6W-lkp@intel.com/ Fixes: 247576bf624a ("s390/ipl: Do not accept z/VM CP diag X'008' cmds longer than max length") Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-12-11Bluetooth: Improve setsockopt() handling of malformed user inputMichal Luczaj
The bt_copy_from_sockptr() return value is being misinterpreted by most users: a non-zero result is mistakenly assumed to represent an error code, but actually indicates the number of bytes that could not be copied. Remove bt_copy_from_sockptr() and adapt callers to use copy_safe_from_sockptr(). For sco_sock_setsockopt() (case BT_CODEC) use copy_struct_from_sockptr() to scrub parts of uninitialized buffer. Opportunistically, rename `len` to `optlen` in hci_sock_setsockopt_old() and hci_sock_setsockopt(). Fixes: 51eda36d33e4 ("Bluetooth: SCO: Fix not validating setsockopt user input") Fixes: a97de7bff13b ("Bluetooth: RFCOMM: Fix not validating setsockopt user input") Fixes: 4f3951242ace ("Bluetooth: L2CAP: Fix not validating setsockopt user input") Fixes: 9e8742cdfc4b ("Bluetooth: ISO: Fix not validating setsockopt user input") Fixes: b2186061d604 ("Bluetooth: hci_sock: Fix not validating setsockopt user input") Reviewed-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Reviewed-by: David Wei <dw@davidwei.uk> Signed-off-by: Michal Luczaj <mhal@rbox.co> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-12-11media: dvb-frontends: dib3000mb: fix uninit-value in dib3000_write_regNikita Zhandarovich
Syzbot reports [1] an uninitialized value issue found by KMSAN in dib3000_read_reg(). Local u8 rb[2] is used in i2c_transfer() as a read buffer; in case that call fails, the buffer may end up with some undefined values. Since no elaborate error handling is expected in dib3000_write_reg(), simply zero out rb buffer to mitigate the problem. [1] Syzkaller report dvb-usb: bulk message failed: -22 (6/0) ===================================================== BUG: KMSAN: uninit-value in dib3000mb_attach+0x2d8/0x3c0 drivers/media/dvb-frontends/dib3000mb.c:758 dib3000mb_attach+0x2d8/0x3c0 drivers/media/dvb-frontends/dib3000mb.c:758 dibusb_dib3000mb_frontend_attach+0x155/0x2f0 drivers/media/usb/dvb-usb/dibusb-mb.c:31 dvb_usb_adapter_frontend_init+0xed/0x9a0 drivers/media/usb/dvb-usb/dvb-usb-dvb.c:290 dvb_usb_adapter_init drivers/media/usb/dvb-usb/dvb-usb-init.c:90 [inline] dvb_usb_init drivers/media/usb/dvb-usb/dvb-usb-init.c:186 [inline] dvb_usb_device_init+0x25a8/0x3760 drivers/media/usb/dvb-usb/dvb-usb-init.c:310 dibusb_probe+0x46/0x250 drivers/media/usb/dvb-usb/dibusb-mb.c:110 ... Local variable rb created at: dib3000_read_reg+0x86/0x4e0 drivers/media/dvb-frontends/dib3000mb.c:54 dib3000mb_attach+0x123/0x3c0 drivers/media/dvb-frontends/dib3000mb.c:758 ... Fixes: 74340b0a8bc6 ("V4L/DVB (4457): Remove dib3000-common-module") Reported-by: syzbot+c88fc0ebe0d5935c70da@syzkaller.appspotmail.com Signed-off-by: Nikita Zhandarovich <n.zhandarovich@fintech.ru> Link: https://lore.kernel.org/r/20240517155800.9881-1-n.zhandarovich@fintech.ru Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
2024-12-11scripts/kernel-doc: Get -export option working againAkira Yokosawa
Since commit cdd30ebb1b9f ("module: Convert symbol namespace to string literal"), exported symbols marked by EXPORT_SYMBOL_NS(_GPL) are ignored by "kernel-doc -export" in fresh build of "make htmldocs". This is because regex in the perl script for those markers fails to match the new signatures: - EXPORT_SYMBOL_NS(symbol, "ns"); - EXPORT_SYMBOL_NS_GPL(symbol, "ns"); Update the regex so that it matches quoted string. Note: Escape sequence of \w is good for C identifiers, but can be too strict for quoted strings. Instead, use \S, which matches any non-whitespace character, for compatibility with possible extension of namespace convention in the future [1]. Fixes: cdd30ebb1b9f ("module: Convert symbol namespace to string literal") Link: https://lore.kernel.org/CAK7LNATMufXP0EA6QUE9hBkZMa6vJO6ZiaYuak2AhOrd2nSVKQ@mail.gmail.com/ [1] Signed-off-by: Akira Yokosawa <akiyks@gmail.com> Cc: Masahiro Yamada <masahiroy@kernel.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Link: https://lore.kernel.org/r/e5c43f36-45cd-49f4-b7b8-ff342df3c7a4@gmail.com
2024-12-11MAINTAINERS: add me as reviewer for sched_extChangwoo Min
Add me as a reviewer for sched_ext. I have been actively working on the project and would like to help review patches and address related kernel issues/features. Signed-off-by: Changwoo Min <changwoo@igalia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2024-12-11platform/x86/intel/vsec: Add support for Panther LakeXi Pardee
Add Panther Lake PMT telemetry support. Signed-off-by: Xi Pardee <xi.pardee@linux.intel.com> Link: https://lore.kernel.org/r/20241210212646.239211-1-xi.pardee@linux.intel.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2024-12-11platform/x86/intel/ifs: Add Clearwater Forest to CPU support listJithu Joseph
Add Clearwater Forest (INTEL_ATOM_DARKMONT_X) to the x86 match table of Intel In Field Scan (IFS) driver, enabling IFS functionality on this processor. Signed-off-by: Jithu Joseph <jithu.joseph@intel.com> Link: https://lore.kernel.org/r/20241210203152.1136463-1-jithu.joseph@intel.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2024-12-11platform/x86: touchscreen_dmi: Add info for SARY Tab 3 tabletHuy Minh
There's no info about the OEM behind the tablet, only online stores listing. This tablet uses an Intel Atom x5-Z8300, 4GB of RAM & 64GB of storage. Signed-off-by: Huy Minh <buingoc67@gmail.com> Link: https://lore.kernel.org/r/20241210154500.32124-1-buingoc67@gmail.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2024-12-11l2tp: Handle eth stats using NETDEV_PCPU_STAT_DSTATS.James Chapman
l2tp_eth uses the TSTATS infrastructure (dev_sw_netstats_*()) for RX and TX packet counters and DEV_STATS_INC for dropped counters. Consolidate that using the DSTATS infrastructure, which can handle both packet counters and packet drops. Statistics that don't fit DSTATS are still updated atomically with DEV_STATS_INC(). This change is inspired by the introduction of DSTATS helpers and their use in other udp tunnel drivers: Link: https://lore.kernel.org/all/cover.1733313925.git.gnault@redhat.com/ Signed-off-by: James Chapman <jchapman@katalix.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-12-11ASoC: tas2781: Fix calibration issue in stress testShenghao Ding
One specific test condition: the default registers of p[j].reg ~ p[j+3].reg are 0, TASDEVICE_REG(0x00, 0x14, 0x38)(PLT_FLAG_REG), TASDEVICE_REG(0x00, 0x14, 0x40)(SINEGAIN_REG), and TASDEVICE_REG(0x00, 0x14, 0x44)(SINEGAIN2_REG). After first calibration, they are freshed to TASDEVICE_REG(0x00, 0x1a, 0x20), TASDEVICE_REG(0x00, 0x16, 0x58)(PLT_FLAG_REG), TASDEVICE_REG(0x00, 0x14, 0x44)(SINEGAIN_REG), and TASDEVICE_REG(0x00, 0x16, 0x64)(SINEGAIN2_REG) via "Calibration Start" kcontrol. In second calibration, the p[j].reg ~ p[j+3].reg have already become tas2781_cali_start_reg. However, p[j+2].reg, TASDEVICE_REG(0x00, 0x14, 0x44)(SINEGAIN_REG), will be freshed to TASDEVICE_REG(0x00, 0x16, 0x64), which is the third register in the input params of the kcontrol. This is why only first calibration can work, the second-time, third-time or more-time calibration always failed without reboot. Of course, if no p[j].reg is in the list of tas2781_cali_start_reg, this stress test can work well. Fixes: 49e2e353fb0d ("ASoC: tas2781: Add Calibration Kcontrols for Chromebook") Signed-off-by: Shenghao Ding <shenghao-ding@ti.com> Link: https://patch.msgid.link/20241211043859.1328-1-shenghao-ding@ti.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-12-11net: renesas: rswitch: enable only used MFWD featuresNikita Yushchenko
Currently, rswitch driver does not utilize most of MFWD forwarding and processing features. It only uses port-based forwarding for ETHA ports, and direct descriptor forwarding for GWCA port. Update rswitch_fwd_init() to enable exactly that, and keep everything else disabled. Signed-off-by: Nikita Yushchenko <nikita.yoush@cogentembedded.com> Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Tested-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-12-11Merge patch series "iomap: fix zero padding data issue in concurrent append ↵Christian Brauner
writes" Long Li <leo.lilong@huawei.com> says: This patch series fixes zero padding data issues in concurrent append write scenarios. A detailed problem description and solution can be found in patch 2. Patch 1 is introduced as preparation for the fix in patch 2, eliminating the need to resample inode size for io_size trimming and avoiding issues caused by inode size changes during concurrent writeback and truncate operations. * patches from https://lore.kernel.org/r/20241209114241.3725722-1-leo.lilong@huawei.com: iomap: fix zero padding data issue in concurrent append writes iomap: pass byte granular end position to iomap_add_to_ioend Link: https://lore.kernel.org/r/20241209114241.3725722-1-leo.lilong@huawei.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-12-11iomap: fix zero padding data issue in concurrent append writesLong Li
During concurrent append writes to XFS filesystem, zero padding data may appear in the file after power failure. This happens due to imprecise disk size updates when handling write completion. Consider this scenario with concurrent append writes same file: Thread 1: Thread 2: ------------ ----------- write [A, A+B] update inode size to A+B submit I/O [A, A+BS] write [A+B, A+B+C] update inode size to A+B+C <I/O completes, updates disk size to min(A+B+C, A+BS)> <power failure> After reboot: 1) with A+B+C < A+BS, the file has zero padding in range [A+B, A+B+C] |< Block Size (BS) >| |DDDDDDDDDDDDDDDD0000000000000000| ^ ^ ^ A A+B A+B+C (EOF) 2) with A+B+C > A+BS, the file has zero padding in range [A+B, A+BS] |< Block Size (BS) >|< Block Size (BS) >| |DDDDDDDDDDDDDDDD0000000000000000|00000000000000000000000000000000| ^ ^ ^ ^ A A+B A+BS A+B+C (EOF) D = Valid Data 0 = Zero Padding The issue stems from disk size being set to min(io_offset + io_size, inode->i_size) at I/O completion. Since io_offset+io_size is block size granularity, it may exceed the actual valid file data size. In the case of concurrent append writes, inode->i_size may be larger than the actual range of valid file data written to disk, leading to inaccurate disk size updates. This patch modifies the meaning of io_size to represent the size of valid data within EOF in an ioend. If the ioend spans beyond i_size, io_size will be trimmed to provide the file with more accurate size information. This is particularly useful for on-disk size updates at completion time. After this change, ioends that span i_size will not grow or merge with other ioends in concurrent scenarios. However, these cases that need growth/merging rarely occur and it seems no noticeable performance impact. Although rounding up io_size could enable ioend growth/merging in these scenarios, we decided to keep the code simple after discussion [1]. Another benefit is that it makes the xfs_ioend_is_append() check more accurate, which can reduce unnecessary end bio callbacks of xfs_end_bio() in certain scenarios, such as repeated writes at the file tail without extending the file size. Link [1]: https://patchwork.kernel.org/project/xfs/patch/20241113091907.56937-1-leo.lilong@huawei.com Fixes: ae259a9c8593 ("fs: introduce iomap infrastructure") # goes further back than this Signed-off-by: Long Li <leo.lilong@huawei.com> Link: https://lore.kernel.org/r/20241209114241.3725722-3-leo.lilong@huawei.com Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-12-11iomap: pass byte granular end position to iomap_add_to_ioendLong Li
This is a preparatory patch for fixing zero padding issues in concurrent append write scenarios. In the following patches, we need to obtain byte-granular writeback end position for io_size trimming after EOF handling. Due to concurrent writeback and truncate operations, inode size may shrink. Resampling inode size would force writeback code to handle the newly appeared post-EOF blocks, which is undesirable. As Dave explained in [1]: "Really, the issue is that writeback mappings have to be able to handle the range being mapped suddenly appear to be beyond EOF. This behaviour is a longstanding writeback constraint, and is what iomap_writepage_handle_eof() is attempting to handle. We handle this by only sampling i_size_read() whilst we have the folio locked and can determine the action we should take with that folio (i.e. nothing, partial zeroing, or skip altogether). Once we've made the decision that the folio is within EOF and taken action on it (i.e. moved the folio to writeback state), we cannot then resample the inode size because a truncate may have started and changed the inode size." To avoid resampling inode size after EOF handling, we convert end_pos to byte-granular writeback position and return it from EOF handling function. Since iomap_set_range_dirty() can handle unaligned lengths, this conversion has no impact on it. However, iomap_find_dirty_range() requires aligned start and end range to find dirty blocks within the given range, so the end position needs to be rounded up when passed to it. LINK [1]: https://lore.kernel.org/linux-xfs/Z1Gg0pAa54MoeYME@localhost.localdomain/ Signed-off-by: Long Li <leo.lilong@huawei.com> Link: https://lore.kernel.org/r/20241209114241.3725722-2-leo.lilong@huawei.com Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Christian Brauner <brauner@kernel.org>