summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-12-04spi: intel: Add Panther Lake SPI controller supportAapo Vienamo
The Panther Lake SPI controllers are compatible with the Cannon Lake controllers. Add support for following SPI controller device IDs: - H-series: 0xe323 - P-series: 0xe423 - U-series: 0xe423 Signed-off-by: Aapo Vienamo <aapo.vienamo@iki.fi> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Link: https://patch.msgid.link/20241204080208.1036537-1-mika.westerberg@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-12-04net: sched: fix ordering of qlen adjustmentLion Ackermann
Changes to sch->q.qlen around qdisc_tree_reduce_backlog() need to happen _before_ a call to said function because otherwise it may fail to notify parent qdiscs when the child is about to become empty. Signed-off-by: Lion Ackermann <nnamrec@gmail.com> Acked-by: Toke Høiland-Jørgensen <toke@toke.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-12-04net: sched: fix erspan_opt settings in cls_flowerXin Long
When matching erspan_opt in cls_flower, only the (version, dir, hwid) fields are relevant. However, in fl_set_erspan_opt() it initializes all bits of erspan_opt and its mask to 1. This inadvertently requires packets to match not only the (version, dir, hwid) fields but also the other fields that are unexpectedly set to 1. This patch resolves the issue by ensuring that only the (version, dir, hwid) fields are configured in fl_set_erspan_opt(), leaving the other fields to 0 in erspan_opt. Fixes: 79b1011cb33d ("net: sched: allow flower to match erspan options") Reported-by: Shuang Li <shuali@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Reviewed-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-12-03ethtool: Fix access to uninitialized fields in set RXNFC commandGal Pressman
The check for non-zero ring with RSS is only relevant for ETHTOOL_SRXCLSRLINS command, in other cases the check tries to access memory which was not initialized by the userspace tool. Only perform the check in case of ETHTOOL_SRXCLSRLINS. Without this patch, filter deletion (for example) could statistically result in a false error: # ethtool --config-ntuple eth3 delete 484 rmgr: Cannot delete RX class rule: Invalid argument Cannot delete classification rule Fixes: 9e43ad7a1ede ("net: ethtool: only allow set_rxnfc with rss + ring_cookie if driver opts in") Link: https://lore.kernel.org/netdev/871a9ecf-1e14-40dd-bbd7-e90c92f89d47@nvidia.com/ Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Edward Cree <ecree.xilinx@gmail.com> Link: https://patch.msgid.link/20241202164805.1637093-1-gal@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-03Revert "udp: avoid calling sock_def_readable() if possible"Fernando Fernandez Mancera
This reverts commit 612b1c0dec5bc7367f90fc508448b8d0d7c05414. On a scenario with multiple threads blocking on a recvfrom(), we need to call sock_def_readable() on every __udp_enqueue_schedule_skb() otherwise the threads won't be woken up as __skb_wait_for_more_packets() is using prepare_to_wait_exclusive(). Link: https://bugzilla.redhat.com/2308477 Fixes: 612b1c0dec5b ("udp: avoid calling sock_def_readable() if possible") Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20241202155620.1719-1-ffmancera@riseup.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-03net: Make napi_hash_lock irq safeJoe Damato
Make napi_hash_lock IRQ safe. It is used during the control path, and is taken and released in napi_hash_add and napi_hash_del, which will typically be called by calls to napi_enable and napi_disable. This change avoids a deadlock in pcnet32 (and other any other drivers which follow the same pattern): CPU 0: pcnet32_open spin_lock_irqsave(&lp->lock, ...) napi_enable napi_hash_add <- before this executes, CPU 1 proceeds spin_lock(napi_hash_lock) [...] spin_unlock_irqrestore(&lp->lock, flags); CPU 1: pcnet32_close napi_disable napi_hash_del spin_lock(napi_hash_lock) < INTERRUPT > pcnet32_interrupt spin_lock(lp->lock) <- DEADLOCK Changing the napi_hash_lock to be IRQ safe prevents the IRQ from firing on CPU 1 until napi_hash_lock is released, preventing the deadlock. Cc: stable@vger.kernel.org Fixes: 86e25f40aa1e ("net: napi: Add napi_config") Reported-by: Guenter Roeck <linux@roeck-us.net> Closes: https://lore.kernel.org/netdev/85dd4590-ea6b-427d-876a-1d8559c7ad82@roeck-us.net/ Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Joe Damato <jdamato@fastly.com> Tested-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20241202182103.363038-1-jdamato@fastly.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-03drm/amdgpu: rework resume handling for display (v2)Alex Deucher
Split resume into a 3rd step to handle displays when DCC is enabled on DCN 4.0.1. Move display after the buffer funcs have been re-enabled so that the GPU will do the move and properly set the DCC metadata for DCN. v2: fix fence irq resume ordering Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.11.x
2024-12-04Merge tag 'drm-misc-fixes-2024-11-28' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes Short summary of fixes pull: dma-buf: - Fix dma_fence_array_signaled() to ensure forward progress dp_mst: - Fix MST sideband message body length check sti: - Add __iomem for mixer_dbg_mxn()'s parameter Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20241128135958.GA244627@linux.fritz.box
2024-12-04Merge tag 'drm-misc-fixes-2024-11-21' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes Short summary of fixes pull: dma-fence: - Fix reference leak on fence-merge failure path - Simplify fence merging with kernel's sort() Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20241121131810.GA54208@linux.fritz.box
2024-12-03clk: en7523: Initialize num before accessing hws in en7523_register_clocks()Haoyu Li
With the new __counted_by annotation in clk_hw_onecell_data, the "num" struct member must be set before accessing the "hws" array. Failing to do so will trigger a runtime warning when enabling CONFIG_UBSAN_BOUNDS and CONFIG_FORTIFY_SOURCE. Fixes: f316cdff8d67 ("clk: Annotate struct clk_hw_onecell_data with __counted_by") Signed-off-by: Haoyu Li <lihaoyu499@gmail.com> Link: https://lore.kernel.org/r/20241203142915.345523-1-lihaoyu499@gmail.com Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2024-12-03clk: en7523: Fix wrong BUS clock for EN7581Christian Marangi
The Documentation for EN7581 had a typo and still referenced the EN7523 BUS base source frequency. This was in conflict with a different page in the Documentration that state that the BUS runs at 300MHz (600MHz source with divisor set to 2) and the actual watchdog that tick at half the BUS clock (150MHz). This was verified with the watchdog by timing the seconds that the system takes to reboot (due too watchdog) and by operating on different values of the BUS divisor. The correct values for source of BUS clock are 600MHz and 540MHz. This was also confirmed by Airoha. Cc: stable@vger.kernel.org Fixes: 66bc47326ce2 ("clk: en7523: Add EN7581 support") Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Link: https://lore.kernel.org/r/20241116105710.19748-1-ansuelsmth@gmail.com Acked-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2024-12-03bcache: revert replacing IS_ERR_OR_NULL with IS_ERR againLiequan Che
Commit 028ddcac477b ("bcache: Remove unnecessary NULL point check in node allocations") leads a NULL pointer deference in cache_set_flush(). 1721 if (!IS_ERR_OR_NULL(c->root)) 1722 list_add(&c->root->list, &c->btree_cache); >From the above code in cache_set_flush(), if previous registration code fails before allocating c->root, it is possible c->root is NULL as what it is initialized. __bch_btree_node_alloc() never returns NULL but c->root is possible to be NULL at above line 1721. This patch replaces IS_ERR() by IS_ERR_OR_NULL() to fix this. Fixes: 028ddcac477b ("bcache: Remove unnecessary NULL point check in node allocations") Signed-off-by: Liequan Che <cheliequan@inspur.com> Cc: stable@vger.kernel.org Cc: Zheng Wang <zyytlz.wz@163.com> Reviewed-by: Mingzhe Zou <mingzhe.zou@easystack.cn> Signed-off-by: Coly Li <colyli@suse.de> Link: https://lore.kernel.org/r/20241202115638.28957-1-colyli@suse.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-12-03samples/bpf: Remove unnecessary -I flags from libbpf EXTRA_CFLAGSEduard Zingerman
Commit [0] breaks samples/bpf build: $ make M=samples/bpf ... make -C /path/to/kernel/samples/bpf/../../tools/lib/bpf \ ... EXTRA_CFLAGS=" \ ... -fsanitize=bounds \ -I/path/to/kernel/usr/include \ ... /path/to/kernel/samples/bpf/libbpf/libbpf.a install_headers CC /path/to/kernel/samples/bpf/libbpf/staticobjs/libbpf.o In file included from libbpf.c:29: /path/to/kernel/tools/include/linux/err.h:35:8: error: 'inline' can only appear on functions 35 | static inline void * __must_check ERR_PTR(long error_) | ^ The error is caused by `objtree` variable changing definition from `.` (dot) to an absolute path: - The variable TPROGS_CFLAGS is constructed as follows: ... TPROGS_CFLAGS += -I$(objtree)/usr/include - It is passed as EXTRA_CFLAGS for libbpf compilation: $(LIBBPF): ... ... $(MAKE) -C $(LIBBPF_SRC) RM='rm -rf' EXTRA_CFLAGS="$(TPROGS_CFLAGS)" - Before commit [0], the line passed to libbpf makefile was '-I./usr/include', where '.' referred to LIBBPF_SRC due to -C flag. The directory $(LIBBPF_SRC)/usr/include does not exist and thus was never resolved by C compiler. - After commit [0], the line passed to libbpf makefile became: '<output-dir>/usr/include', this directory exists and is resolved by C compiler. - Both 'tools/include' and 'usr/include' define files err.h and types.h. - libbpf expects headers like 'linux/err.h' and 'linux/types.h' defined in 'tools/include', not 'usr/include', hence the compilation error. This commit removes unnecessary -I flags from libbpf compilation. (libbpf sets up the necessary includes at lib/bpf/Makefile:63). Changes v1 [1] -> v2: - dropped unnecessary replacement of KBUILD_OUTPUT with $(objtree) (Andrii) Changes v2 [2] -> v3: - make sure --sysroot option is set for libbpf's EXTRA_CFLAGS, if $(SYSROOT) is set (Stanislav) [0] commit 13b25489b6f8 ("kbuild: change working directory to external module directory with M=") [1] https://lore.kernel.org/bpf/20241202212154.3174402-1-eddyz87@gmail.com/ [2] https://lore.kernel.org/bpf/20241202234741.3492084-1-eddyz87@gmail.com/ Fixes: 13b25489b6f8 ("kbuild: change working directory to external module directory with M=") Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://lore.kernel.org/bpf/20241203182222.3915763-1-eddyz87@gmail.com
2024-12-03netfilter: nft_inner: incorrect percpu area handling under softirqPablo Neira Ayuso
Softirq can interrupt ongoing packet from process context that is walking over the percpu area that contains inner header offsets. Disable bh and perform three checks before restoring the percpu inner header offsets to validate that the percpu area is valid for this skbuff: 1) If the NFT_PKTINFO_INNER_FULL flag is set on, then this skbuff has already been parsed before for inner header fetching to register. 2) Validate that the percpu area refers to this skbuff using the skbuff pointer as a cookie. If there is a cookie mismatch, then this skbuff needs to be parsed again. 3) Finally, validate if the percpu area refers to this tunnel type. Only after these three checks the percpu area is restored to a on-stack copy and bh is enabled again. After inner header fetching, the on-stack copy is stored back to the percpu area. Fixes: 3a07327d10a0 ("netfilter: nft_inner: support for inner tunnel header matching") Reported-by: syzbot+84d0441b9860f0d63285@syzkaller.appspotmail.com Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-12-03btrfs: fix missing snapshot drew unlock when root is dead during swap activationFilipe Manana
When activating a swap file we acquire the root's snapshot drew lock and then check if the root is dead, failing and returning with -EPERM if it's dead but without unlocking the root's snapshot lock. Fix this by adding the missing unlock. Fixes: 60021bd754c6 ("btrfs: prevent subvol with swapfile from being deleted") Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: David Sterba <dsterba@suse.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-12-03btrfs: fix mount failure due to remount racesQu Wenruo
[BUG] The following reproducer can cause btrfs mount to fail: dev="/dev/test/scratch1" mnt1="/mnt/test" mnt2="/mnt/scratch" mkfs.btrfs -f $dev mount $dev $mnt1 btrfs subvolume create $mnt1/subvol1 btrfs subvolume create $mnt1/subvol2 umount $mnt1 mount $dev $mnt1 -o subvol=subvol1 while mount -o remount,ro $mnt1; do mount -o remount,rw $mnt1; done & bg=$! while mount $dev $mnt2 -o subvol=subvol2; do umount $mnt2; done kill $bg wait umount -R $mnt1 umount -R $mnt2 The script will fail with the following error: mount: /mnt/scratch: /dev/mapper/test-scratch1 already mounted on /mnt/test. dmesg(1) may have more information after failed mount system call. umount: /mnt/test: target is busy. umount: /mnt/scratch/: not mounted And there is no kernel error message. [CAUSE] During the btrfs mount, to support mounting different subvolumes with different RO/RW flags, we need to detect that and retry if needed: Retry with matching RO flags if the initial mount fail with -EBUSY. The problem is, during that retry we do not hold any super block lock (s_umount), this means there can be a remount process changing the RO flags of the original fs super block. If so, we can have an EBUSY error during retry. And this time we treat any failure as an error, without any retry and cause the above EBUSY mount failure. [FIX] The current retry behavior is racy because we do not have a super block thus no way to hold s_umount to prevent the race with remount. Solve the root problem by allowing fc->sb_flags to mismatch from the sb->s_flags at btrfs_get_tree_super(). Then at the re-entry point btrfs_get_tree_subvol(), manually check the fc->s_flags against sb->s_flags, if it's a RO->RW mismatch, then reconfigure with s_umount lock hold. Reported-by: Enno Gotthold <egotthold@suse.com> Reported-by: Fabian Vogt <fvogt@suse.com> [ Special thanks for the reproducer and early analysis pointing to btrfs. ] Fixes: f044b318675f ("btrfs: handle the ro->rw transition for mounting different subvolumes") Link: https://bugzilla.suse.com/show_bug.cgi?id=1231836 Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-12-03Merge tag 'for-6.13-rc1-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - add lockdep annotations for io_uring/encoded read integration, inode lock is held when returning to userspace - properly reflect experimental config option to sysfs - handle NULL root in case the rescue mode accepts invalid/damaged tree roots (rescue=ibadroot) - regression fix of a deadlock between transaction and extent locks - fix pending bio accounting bug in encoded read ioctl - fix NOWAIT mode when checking references for NOCOW files - fix use-after-free in a rb-tree cleanup in ref-verify debugging tool * tag 'for-6.13-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: fix lockdep warnings on io_uring encoded reads btrfs: ref-verify: fix use-after-free after invalid ref action btrfs: add a sanity check for btrfs root in btrfs_search_slot() btrfs: don't loop for nowait writes when checking for cross references btrfs: sysfs: advertise experimental features only if CONFIG_BTRFS_EXPERIMENTAL=y btrfs: fix deadlock between transaction commits and extent locks btrfs: fix use-after-free in btrfs_encoded_read_endio()
2024-12-03nvme-pci: remove two deallocate zeroes quirksKeith Busch
The quirk was initially used as a signal to set the discard_zeroes_data queue limit because there were some use cases that relied on that behavior. The queue limit no longer exists as every user of it has been converted to use the write zeroes operation instead. The quirk now means to use a discard command as an alias to a write zeroes request. Two of the devices previously using the quirk support the write zeroes command directly, so these don't need or want to use discard when the desired operation is to write zeroes. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-12-03Merge tag 'fs_for_v6.13-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull quota and udf fixes from Jan Kara: "Two small UDF fixes for better handling of corrupted filesystem and a quota fix to fix handling of filesystem freezing" * tag 'fs_for_v6.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: udf: Verify inode link counts before performing rename udf: Skip parent dir link count update if corrupted quota: flush quota_release_work upon quota writeback
2024-12-03Merge tag 'xfs-fixes-6.13-rc2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds
Pull xfs fixes from Carlos Maiolino: - Use xchg() in xlog_cil_insert_pcp_aggregate() - Fix ABBA deadlock on a race between mount and log shutdown - Fix quota softlimit incoherency on delalloc - Fix sparse inode limits on runt AG - remove unknown compat feature checks in SB write valdation - Eliminate a lockdep false positive * tag 'xfs-fixes-6.13-rc2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: don't call xfs_bmap_same_rtgroup in xfs_bmap_add_extent_hole_delay xfs: Use xchg() in xlog_cil_insert_pcp_aggregate() xfs: prevent mount and log shutdown race xfs: delalloc and quota softlimit timers are incoherent xfs: fix sparse inode limits on runt AG xfs: remove unknown compat feature check in superblock write validation xfs: eliminate lockdep false positives in xfs_attr_shortform_list
2024-12-03igb: Fix potential invalid memory access in igb_init_module()Yuan Can
The pci_register_driver() can fail and when this happened, the dca_notifier needs to be unregistered, otherwise the dca_notifier can be called when igb fails to install, resulting to invalid memory access. Fixes: bbd98fe48a43 ("igb: Fix DCA errors and do not use context index for 82576") Signed-off-by: Yuan Can <yuancan@huawei.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-12-03ixgbe: Correct BASE-BX10 compliance codeTore Amundsen
SFF-8472 (section 5.4 Transceiver Compliance Codes) defines bit 6 as BASE-BX10. Bit 6 means a value of 0x40 (decimal 64). The current value in the source code is 0x64, which appears to be a mix-up of hex and decimal values. A value of 0x64 (binary 01100100) incorrectly sets bit 2 (1000BASE-CX) and bit 5 (100BASE-FX) as well. Fixes: 1b43e0d20f2d ("ixgbe: Add 1000BASE-BX support") Signed-off-by: Tore Amundsen <tore@amundsen.org> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Acked-by: Ernesto Castellotti <ernesto@castellotti.net> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-12-03ixgbe: downgrade logging of unsupported VF API version to debugJacob Keller
The ixgbe PF driver logs an info message when a VF attempts to negotiate an API version which it does not support: VF 0 requested invalid api version 6 The ixgbevf driver attempts to load with mailbox API v1.5, which is required for best compatibility with other hosts such as the ESX VMWare PF. The Linux PF only supports API v1.4, and does not currently have support for the v1.5 API. The logged message can confuse users, as the v1.5 API is valid, but just happens to not currently be supported by the Linux PF. Downgrade the info message to a debug message, and fix the language to use 'unsupported' instead of 'invalid' to improve message clarity. Long term, we should investigate whether the improvements in the v1.5 API make sense for the Linux PF, and if so implement them properly. This may require yet another API version to resolve issues with negotiating IPSEC offload support. Fixes: 339f28964147 ("ixgbevf: Add support for new mailbox communication between PF and VF") Reported-by: Yifei Liu <yifei.l.liu@oracle.com> Link: https://lore.kernel.org/intel-wired-lan/20240301235837.3741422-1-yifei.l.liu@oracle.com/ Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-12-03ixgbevf: stop attempting IPSEC offload on Mailbox API 1.5Jacob Keller
Commit 339f28964147 ("ixgbevf: Add support for new mailbox communication between PF and VF") added support for v1.5 of the PF to VF mailbox communication API. This commit mistakenly enabled IPSEC offload for API v1.5. No implementation of the v1.5 API has support for IPSEC offload. This offload is only supported by the Linux PF as mailbox API v1.4. In fact, the v1.5 API is not implemented in any Linux PF. Attempting to enable IPSEC offload on a PF which supports v1.5 API will not work. Only the Linux upstream ixgbe and ixgbevf support IPSEC offload, and only as part of the v1.4 API. Fix the ixgbevf Linux driver to stop attempting IPSEC offload when the mailbox API does not support it. The existing API design choice makes it difficult to support future API versions, as other non-Linux hosts do not implement IPSEC offload. If we add support for v1.5 to the Linux PF, then we lose support for IPSEC offload. A full solution likely requires a new mailbox API with a proper negotiation to check that IPSEC is actually supported by the host. Fixes: 339f28964147 ("ixgbevf: Add support for new mailbox communication between PF and VF") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-12-03idpf: set completion tag for "empty" bufs associated with a packetJoshua Hay
Commit d9028db618a6 ("idpf: convert to libeth Tx buffer completion") inadvertently removed code that was necessary for the tx buffer cleaning routine to iterate over all buffers associated with a packet. When a frag is too large for a single data descriptor, it will be split across multiple data descriptors. This means the frag will span multiple buffers in the buffer ring in order to keep the descriptor and buffer ring indexes aligned. The buffer entries in the ring are technically empty and no cleaning actions need to be performed. These empty buffers can precede other frags associated with the same packet. I.e. a single packet on the buffer ring can look like: buf[0]=skb0.frag0 buf[1]=skb0.frag1 buf[2]=empty buf[3]=skb0.frag2 The cleaning routine iterates through these buffers based on a matching completion tag. If the completion tag is not set for buf2, the loop will end prematurely. Frag2 will be left uncleaned and next_to_clean will be left pointing to the end of packet, which will break the cleaning logic for subsequent cleans. This consequently leads to tx timeouts. Assign the empty bufs the same completion tag for the packet to ensure the cleaning routine iterates over all of the buffers associated with the packet. Fixes: d9028db618a6 ("idpf: convert to libeth Tx buffer completion") Signed-off-by: Joshua Hay <joshua.a.hay@intel.com> Acked-by: Alexander Lobakin <aleksander.lobakin@intel.com> Reviewed-by: Madhu chittim <madhu.chittim@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-12-03ice: Fix VLAN pruning in switchdev modeMarcin Szycik
In switchdev mode the uplink VSI should receive all unmatched packets, including VLANs. Therefore, VLAN pruning should be disabled if uplink is in switchdev mode. It is already being done in ice_eswitch_setup_env(), however the addition of ice_up() in commit 44ba608db509 ("ice: do switchdev slow-path Rx using PF VSI") caused VLAN pruning to be re-enabled after disabling it. Add a check to ice_set_vlan_filtering_features() to ensure VLAN filtering will not be enabled if uplink is in switchdev mode. Note that ice_is_eswitch_mode_switchdev() is being used instead of ice_is_switchdev_running(), as the latter would only return true after the whole switchdev setup completes. Fixes: 44ba608db509 ("ice: do switchdev slow-path Rx using PF VSI") Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Tested-by: Priya Singh <priyax.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-12-03ice: Fix NULL pointer dereference in switchdevWojciech Drewek
Commit 608a5c05c39b ("virtchnl: support queue rate limit and quanta size configuration") introduced new virtchnl ops: - get_qos_caps - cfg_q_bw - cfg_q_quanta New ops were added to ice_virtchnl_dflt_ops, in commit 015307754a19 ("ice: Support VF queue rate limit and quanta size configuration"), but not to the ice_virtchnl_repr_ops. Because of that, if we get one of those messages in switchdev mode we end up with NULL pointer dereference: [ 1199.794701] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 1199.794804] Workqueue: ice ice_service_task [ice] [ 1199.794878] RIP: 0010:0x0 [ 1199.795027] Call Trace: [ 1199.795033] <TASK> [ 1199.795039] ? __die+0x20/0x70 [ 1199.795051] ? page_fault_oops+0x140/0x520 [ 1199.795064] ? exc_page_fault+0x7e/0x270 [ 1199.795074] ? asm_exc_page_fault+0x22/0x30 [ 1199.795086] ice_vc_process_vf_msg+0x6e5/0xd30 [ice] [ 1199.795165] __ice_clean_ctrlq+0x734/0x9d0 [ice] [ 1199.795207] ice_service_task+0xccf/0x12b0 [ice] [ 1199.795248] process_one_work+0x21a/0x620 [ 1199.795260] worker_thread+0x18d/0x330 [ 1199.795269] ? __pfx_worker_thread+0x10/0x10 [ 1199.795279] kthread+0xec/0x120 [ 1199.795288] ? __pfx_kthread+0x10/0x10 [ 1199.795296] ret_from_fork+0x2d/0x50 [ 1199.795305] ? __pfx_kthread+0x10/0x10 [ 1199.795312] ret_from_fork_asm+0x1a/0x30 [ 1199.795323] </TASK> Fixes: 015307754a19 ("ice: Support VF queue rate limit and quanta size configuration") Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-12-03ice: fix PHY timestamp extraction for ETH56GPrzemyslaw Korba
Fix incorrect PHY timestamp extraction for ETH56G. It's better to use FIELD_PREP() than manual shift. Fixes: 7cab44f1c35f ("ice: Introduce ETH56G PHY model for E825C products") Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Przemyslaw Korba <przemyslaw.korba@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-12-03ice: fix PHY Clock Recovery availability checkArkadiusz Kubalewski
To check if PHY Clock Recovery mechanic is available for a device, there is a need to verify if given PHY is available within the netlist, but the netlist node type used for the search is wrong, also the search context shall be specified. Modify the search function to allow specifying the context in the search. Use the PHY node type instead of CLOCK CONTROLLER type, also use proper search context which for PHY search is PORT, as defined in E810 Datasheet [1] ('3.3.8.2.4 Node Part Number and Node Options (0x0003)' and 'Table 3-105. Program Topology Device NVM Admin Command'). [1] https://cdrdv2.intel.com/v1/dl/getContent/613875?explicitVersion=true Fixes: 91e43ca0090b ("ice: fix linking when CONFIG_PTP_1588_CLOCK=n") Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-12-03MAINTAINERS: Add CCA and pKVM CoCO guest support to the ARM64 entryWill Deacon
Commits 7999edc484ca ("virt: arm-cca-guest: TSM_REPORT support for realm") and a06c3fad49a5 ("drivers/virt: pkvm: Add initial support for running as a protected guest") added arm64 guest-side support for running in CCA and pKVM confidential computing environments respectively. Unfortunately, these changes were not accompanied by a MAINTAINERS entry and so aren't automatically picked up by the get_maintainer.pl script. Since the initial support was merged via the arm64 tree, extend the ARM64 entry to cover the two new directories. Cc: Marc Zyngier <maz@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Will Deacon <will@kernel.org> Acked-by: Suzuki K Poulose <suzuki.poulose@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20241202145731.6422-3-will@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-12-03drivers/virt: pkvm: Don't fail ioremap() call if MMIO_GUARD failsWill Deacon
Calling the MMIO_GUARD hypercall from guests which have not been enrolled (e.g. because they are running without pvmfw) results in -EINVAL being returned. In this case, MMIO_GUARD is not active and so we can simply proceed with the normal ioremap() routine. Don't fail ioremap() if MMIO_GUARD fails; instead WARN_ON_ONCE() to highlight that the pvm environment is slightly wonky. Fixes: 0f1269495800 ("drivers/virt: pkvm: Intercept ioremap using pKVM MMIO_GUARD hypercall") Signed-off-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20241202145731.6422-2-will@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-12-03arm64: patching: avoid early page_to_phys()Mark Rutland
When arm64 is configured with CONFIG_DEBUG_VIRTUAL=y, a warning is printed from the patching code because patch_map(), e.g. | ------------[ cut here ]------------ | WARNING: CPU: 0 PID: 0 at arch/arm64/kernel/patching.c:45 patch_map.constprop.0+0x120/0xd00 | CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.13.0-rc1-00002-ge1a5d6c6be55 #1 | Hardware name: linux,dummy-virt (DT) | pstate: 800003c5 (Nzcv DAIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--) | pc : patch_map.constprop.0+0x120/0xd00 | lr : patch_map.constprop.0+0x120/0xd00 | sp : ffffa9bb312a79a0 | x29: ffffa9bb312a79a0 x28: 0000000000000001 x27: 0000000000000001 | x26: 0000000000000000 x25: 0000000000000000 x24: 00000000000402e8 | x23: ffffa9bb2c94c1c8 x22: ffffa9bb2c94c000 x21: ffffa9bb222e883c | x20: 0000000000000002 x19: ffffc1ffc100ba40 x18: ffffa9bb2cf0f21c | x17: 0000000000000006 x16: 0000000000000000 x15: 0000000000000004 | x14: 1ffff5376625b4ac x13: ffff753766a67fb8 x12: ffff753766919cd1 | x11: 0000000000000003 x10: 1ffff5376625b4c3 x9 : 1ffff5376625b4af | x8 : ffff753766254f0a x7 : 0000000041b58ab3 x6 : ffff753766254f18 | x5 : ffffa9bb312d9bc0 x4 : 0000000000000000 x3 : ffffa9bb29bd90e4 | x2 : 0000000000000002 x1 : ffffa9bb312d9bc0 x0 : 0000000000000000 | Call trace: | patch_map.constprop.0+0x120/0xd00 (P) | patch_map.constprop.0+0x120/0xd00 (L) | __aarch64_insn_write+0xa8/0x120 | aarch64_insn_patch_text_nosync+0x4c/0xb8 | arch_jump_label_transform_queue+0x7c/0x100 | jump_label_update+0x154/0x460 | static_key_enable_cpuslocked+0x1d8/0x280 | static_key_enable+0x2c/0x48 | early_randomize_kstack_offset+0x104/0x168 | do_early_param+0xe4/0x148 | parse_args+0x3a4/0x838 | parse_early_options+0x50/0x68 | parse_early_param+0x58/0xe0 | setup_arch+0x78/0x1f0 | start_kernel+0xa0/0x530 | __primary_switched+0x8c/0xa0 | irq event stamp: 0 | hardirqs last enabled at (0): [<0000000000000000>] 0x0 | hardirqs last disabled at (0): [<0000000000000000>] 0x0 | softirqs last enabled at (0): [<0000000000000000>] 0x0 | softirqs last disabled at (0): [<0000000000000000>] 0x0 | ---[ end trace 0000000000000000 ]--- The warning has been produced since commit: 3e25d5a49f99b75b ("asm-generic: add an optional pfn_valid check to page_to_phys") ... which added a pfn_valid() check into page_to_phys(), and at this point in boot pfn_valid() will always return false because the vmemmap has not yet been initialized and there are no valid mem_sections yet. Before that commit, the arithmetic performed by page_to_phys() would give the expected physical address, though it is somewhat dubious to use vmemmap addresses before the vmemmap has been initialized. Aside from kernel image addresses, all executable code should be allocated from execmem (where all allocations will fall within the vmalloc area), and so there's no need for the fallback case when CONFIG_EXECMEM=n. Simplify patch_map() accordingly, directly converting kernel image addresses and removing the redundant fallback case. Fixes: 3e25d5a49f99 ("asm-generic: add an optional pfn_valid check to page_to_phys") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Mike Rapoport <rppt@kernel.org> Cc: Thomas Huth <thuth@redhat.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20241202170359.1475019-1-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-12-03iommu/arm-smmu-v3: Improve uAPI comment for IOMMU_HW_INFO_TYPE_ARM_SMMUV3Jason Gunthorpe
Be specific about what fields should be accessed in the idr result and give other guidance to the VMM on how it should generate the vIDR. Discussion on the list, and review of the qemu implementation understood this needs to be clearer and more detailed. Link: https://patch.msgid.link/r/0-v1-191e5e24cec3+3b0-iommufd_smmuv3_hwinf_jgg@nvidia.com Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-12-03arm64: mm: Fix zone_dma_limit calculationYang Shi
Commit ba0fb44aed47 ("dma-mapping: replace zone_dma_bits by zone_dma_limit") and subsequent patches changed how zone_dma_limit is calculated to allow a reduced ZONE_DMA even when RAM starts above 4GB. Commit 122c234ef4e1 ("arm64: mm: keep low RAM dma zone") further fixed this to ensure ZONE_DMA remains below U32_MAX if RAM starts below 4GB, especially on platforms that do not have IORT or DT description of the device DMA ranges. While zone boundaries calculation was fixed by the latter commit, zone_dma_limit, used to determine the GFP_DMA flag in the core code, was not updated. This results in excessive use of GFP_DMA and unnecessary ZONE_DMA allocations on some platforms. Update zone_dma_limit to match the actual upper bound of ZONE_DMA. Fixes: ba0fb44aed47 ("dma-mapping: replace zone_dma_bits by zone_dma_limit") Cc: <stable@vger.kernel.org> # 6.12.x Reported-by: Yutang Jiang <jiangyutang@os.amperecomputing.com> Tested-by: Yutang Jiang <jiangyutang@os.amperecomputing.com> Signed-off-by: Yang Shi <yang@os.amperecomputing.com> Link: https://lore.kernel.org/r/20241125171650.77424-1-yang@os.amperecomputing.com [catalin.marinas@arm.com: some tweaking of the commit log] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-12-03irqchip/stm32mp-exti: CONFIG_STM32MP_EXTI should not default to y when ↵Geert Uytterhoeven
compile-testing Merely enabling compile-testing should not enable additional functionality. Fixes: 0be58e0553812fcb ("irqchip/stm32mp-exti: Allow building as module") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/ef5ec063b23522058f92087e072419ea233acfe9.1733243115.git.geert+renesas@glider.be
2024-12-03module: Convert default symbol namespace to string literalMasahiro Yamada
Commit cdd30ebb1b9f ("module: Convert symbol namespace to string literal") only converted MODULE_IMPORT_NS() and EXPORT_SYMBOL_NS(), leaving DEFAULT_SYMBOL_NAMESPACE as a macro expansion. This commit converts DEFAULT_SYMBOL_NAMESPACE in the same way to avoid annoyance for the default namespace as well. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-12-03doc: module: revert misconversions for MODULE_IMPORT_NS()Masahiro Yamada
This reverts the misconversions introduced by commit cdd30ebb1b9f ("module: Convert symbol namespace to string literal"). The affected descriptions refer to MODULE_IMPORT_NS() tags in general, rather than suggesting the use of the empty string ("") as the namespace. Fixes: cdd30ebb1b9f ("module: Convert symbol namespace to string literal") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-12-03scripts/nsdeps: get 'make nsdeps' working againMasahiro Yamada
Since commit cdd30ebb1b9f ("module: Convert symbol namespace to string literal"), when MODULE_IMPORT_NS() is missing, 'make nsdeps' inserts pointless code: MODULE_IMPORT_NS("ns"); Here, "ns" is not a namespace, but the variable in the semantic patch. It must not be quoted. Instead, a string literal must be passed to Coccinelle. Fixes: cdd30ebb1b9f ("module: Convert symbol namespace to string literal") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-12-03iommufd/selftest: Cover IOMMU_FAULT_QUEUE_ALLOC in iommufd_fail_nthNicolin Chen
This was missing in the series introducing the fault object. Thus, add it. Link: https://patch.msgid.link/r/d61b9b7f73276cc8f1aef9602bd35c486917506e.1733212723.git.nicolinc@nvidia.com Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-12-03iommufd: Fix out_fput in iommufd_fault_alloc()Nicolin Chen
As fput() calls the file->f_op->release op, where fault obj and ictx are getting released, there is no need to release these two after fput() one more time, which would result in imbalanced refcounts: refcount_t: decrement hit 0; leaking memory. WARNING: CPU: 48 PID: 2369 at lib/refcount.c:31 refcount_warn_saturate+0x60/0x230 Call trace: refcount_warn_saturate+0x60/0x230 (P) refcount_warn_saturate+0x60/0x230 (L) iommufd_fault_fops_release+0x9c/0xe0 [iommufd] ... VFS: Close: file count is 0 (f_op=iommufd_fops [iommufd]) WARNING: CPU: 48 PID: 2369 at fs/open.c:1507 filp_flush+0x3c/0xf0 Call trace: filp_flush+0x3c/0xf0 (P) filp_flush+0x3c/0xf0 (L) __arm64_sys_close+0x34/0x98 ... imbalanced put on file reference count WARNING: CPU: 48 PID: 2369 at fs/file.c:74 __file_ref_put+0x100/0x138 Call trace: __file_ref_put+0x100/0x138 (P) __file_ref_put+0x100/0x138 (L) __fput_sync+0x4c/0xd0 Drop those two lines to fix the warnings above. Cc: stable@vger.kernel.org Fixes: 07838f7fd529 ("iommufd: Add iommufd fault object") Link: https://patch.msgid.link/r/b5651beb3a6b1adeef26fffac24607353bf67ba1.1733212723.git.nicolinc@nvidia.com Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Reviewed-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-12-03iommufd: Fix typos in kernel-doc commentsRandy Dunlap
Fix typos/spellos in kernel-doc comments for readability. Fixes: aad37e71d5c4 ("iommufd: IOCTLs for the io_pagetable") Fixes: b7a0855eb95f ("iommu: Add new flag to explictly request PASID capable domain") Fixes: d68beb276ba2 ("iommu/arm-smmu-v3: Support IOMMU_HWPT_INVALIDATE using a VIOMMU object") Link: https://patch.msgid.link/r/20241128035159.374624-1-rdunlap@infradead.org Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-12-03genirq/proc: Add missing space separator backThomas Gleixner
The recent conversion of show_interrupts() to seq_put_decimal_ull_width() caused a formatting regression as it drops a previosuly existing space separator. Add it back by unconditionally inserting a space after the interrupt counts and removing the extra leading space from the chip name prints. Fixes: f9ed1f7c2e26 ("genirq/proc: Use seq_put_decimal_ull_width() for decimal values") Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: David Wang <00107082@163.com> Link: https://lore.kernel.org/all/87zfldt5g4.ffs@tglx Closes: https://lore.kernel.org/all/4ce18851-6e9f-bbe-8319-cc5e69fb45c@linux-m68k.org
2024-12-03block: rnull: add missing MODULE_DESCRIPTIONFUJITA Tomonori
Add the missing description to fix the following warning: WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/block/rnull_mod.o Signed-off-by: FUJITA Tomonori <fujita.tomonori@gmail.com> Acked-by: Andreas Hindborg <a.hindborg@kernel.org> Link: https://lore.kernel.org/r/20241130094521.193924-1-fujita.tomonori@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-12-03io_uring: Change res2 parameter type in io_uring_cmd_doneBernd Schubert
Change the type of the res2 parameter in io_uring_cmd_done from ssize_t to u64. This aligns the parameter type with io_req_set_cqe32_extra, which expects u64 arguments. The change eliminates potential issues on 32-bit architectures where ssize_t might be 32-bit. Only user of passing res2 is drivers/nvme/host/ioctl.c and it actually passes u64. Fixes: ee692a21e9bf ("fs,io_uring: add infrastructure for uring-cmd") Cc: stable@vger.kernel.org Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Tested-by: Li Zetao <lizetao1@huawei.com> Reviewed-by: Li Zetao <lizetao1@huawei.com> Signed-off-by: Bernd Schubert <bschubert@ddn.com> Link: https://lore.kernel.org/r/20241203-io_uring_cmd_done-res2-as-u64-v2-1-5e59ae617151@ddn.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-12-03LoongArch: KVM: Protect kvm_io_bus_{read,write}() with SRCUHuacai Chen
When we enable lockdep we get such a warning: ============================= WARNING: suspicious RCU usage 6.12.0-rc7+ #1891 Tainted: G W ----------------------------- arch/loongarch/kvm/../../../virt/kvm/kvm_main.c:5945 suspicious rcu_dereference_check() usage! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 1 lock held by qemu-system-loo/948: #0: 90000001184a00a8 (&vcpu->mutex){+.+.}-{4:4}, at: kvm_vcpu_ioctl+0xf4/0xe20 [kvm] stack backtrace: CPU: 2 UID: 0 PID: 948 Comm: qemu-system-loo Tainted: G W 6.12.0-rc7+ #1891 Tainted: [W]=WARN Hardware name: Loongson Loongson-3A5000-7A1000-1w-CRB/Loongson-LS3A5000-7A1000-1w-CRB, BIOS vUDK2018-LoongArch-V2.0.0-prebeta9 10/21/2022 Stack : 0000000000000089 9000000005a0db9c 90000000071519c8 900000012c578000 900000012c57b940 0000000000000000 900000012c57b948 9000000007e53788 900000000815bcc8 900000000815bcc0 900000012c57b7b0 0000000000000001 0000000000000001 4b031894b9d6b725 0000000005dec000 9000000100427b00 00000000000003d2 0000000000000001 000000000000002d 0000000000000003 0000000000000030 00000000000003b4 0000000005dec000 0000000000000000 900000000806d000 9000000007e53788 00000000000000b4 0000000000000004 0000000000000004 0000000000000000 0000000000000000 9000000107baf600 9000000008916000 9000000007e53788 9000000005924778 000000001fe001e5 00000000000000b0 0000000000000007 0000000000000000 0000000000071c1d ... Call Trace: [<9000000005924778>] show_stack+0x38/0x180 [<90000000071519c4>] dump_stack_lvl+0x94/0xe4 [<90000000059eb754>] lockdep_rcu_suspicious+0x194/0x240 [<ffff80000221f47c>] kvm_io_bus_read+0x19c/0x1e0 [kvm] [<ffff800002225118>] kvm_emu_mmio_read+0xd8/0x440 [kvm] [<ffff8000022254bc>] kvm_handle_read_fault+0x3c/0xe0 [kvm] [<ffff80000222b3c8>] kvm_handle_exit+0x228/0x480 [kvm] Fix it by protecting kvm_io_bus_{read,write}() with SRCU. Cc: stable@vger.kernel.org Reviewed-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-12-03irqchip/bcm2836: Enable SKIP_SET_WAKE and MASK_ON_SUSPENDStefan Wahren
The BCM2836 interrupt controller doesn't provide any facility to configure the wakeup sources. That's the reason why the driver lacks the irq_set_wake() callback for the interrupt chip. Enable the flags IRQCHIP_SKIP_SET_WAKE and IRQCHIP_MASK_ON_SUSPEND so the interrupt suspend logic can handle the chip correctly equivalently to the corresponding bcm2835 change (9a58480e5e53 ("irqchip/bcm2835: Enable SKIP_SET_WAKE and MASK_ON_SUSPEND"). Signed-off-by: Stefan Wahren <wahrenst@gmx.net> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://lore.kernel.org/all/20241202115437.33552-1-wahrenst@gmx.net
2024-12-03irqchip/gic-v3: Fix irq_complete_ack() commentLorenzo Pieralisi
When the GIC is in EOImode == 1 in irq_complete_ack() it executes a priority drop not a deactivation. Fix the function comment to clarify the behaviour. Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20241202112518.51178-1-lpieralisi@kernel.org
2024-12-03net: hsr: must allocate more bytes for RedBox supportEric Dumazet
Blamed commit forgot to change hsr_init_skb() to allocate larger skb for RedBox case. Indeed, send_hsr_supervision_frame() will add two additional components (struct hsr_sup_tlv and struct hsr_sup_payload) syzbot reported the following crash: skbuff: skb_over_panic: text:ffffffff8afd4b0a len:34 put:6 head:ffff88802ad29e00 data:ffff88802ad29f22 tail:0x144 end:0x140 dev:gretap0 ------------[ cut here ]------------ kernel BUG at net/core/skbuff.c:206 ! Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN NOPTI CPU: 2 UID: 0 PID: 7611 Comm: syz-executor Not tainted 6.12.0-syzkaller #0 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014 RIP: 0010:skb_panic+0x157/0x1d0 net/core/skbuff.c:206 Code: b6 04 01 84 c0 74 04 3c 03 7e 21 8b 4b 70 41 56 45 89 e8 48 c7 c7 a0 7d 9b 8c 41 57 56 48 89 ee 52 4c 89 e2 e8 9a 76 79 f8 90 <0f> 0b 4c 89 4c 24 10 48 89 54 24 08 48 89 34 24 e8 94 76 fb f8 4c RSP: 0018:ffffc90000858ab8 EFLAGS: 00010282 RAX: 0000000000000087 RBX: ffff8880598c08c0 RCX: ffffffff816d3e69 RDX: 0000000000000000 RSI: ffffffff816de786 RDI: 0000000000000005 RBP: ffffffff8c9b91c0 R08: 0000000000000005 R09: 0000000000000000 R10: 0000000000000302 R11: ffffffff961cc1d0 R12: ffffffff8afd4b0a R13: 0000000000000006 R14: ffff88804b938130 R15: 0000000000000140 FS: 000055558a3d6500(0000) GS:ffff88806a800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1295974ff8 CR3: 000000002ab6e000 CR4: 0000000000352ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <IRQ> skb_over_panic net/core/skbuff.c:211 [inline] skb_put+0x174/0x1b0 net/core/skbuff.c:2617 send_hsr_supervision_frame+0x6fa/0x9e0 net/hsr/hsr_device.c:342 hsr_proxy_announce+0x1a3/0x4a0 net/hsr/hsr_device.c:436 call_timer_fn+0x1a0/0x610 kernel/time/timer.c:1794 expire_timers kernel/time/timer.c:1845 [inline] __run_timers+0x6e8/0x930 kernel/time/timer.c:2419 __run_timer_base kernel/time/timer.c:2430 [inline] __run_timer_base kernel/time/timer.c:2423 [inline] run_timer_base+0x111/0x190 kernel/time/timer.c:2439 run_timer_softirq+0x1a/0x40 kernel/time/timer.c:2449 handle_softirqs+0x213/0x8f0 kernel/softirq.c:554 __do_softirq kernel/softirq.c:588 [inline] invoke_softirq kernel/softirq.c:428 [inline] __irq_exit_rcu kernel/softirq.c:637 [inline] irq_exit_rcu+0xbb/0x120 kernel/softirq.c:649 instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1049 [inline] sysvec_apic_timer_interrupt+0xa4/0xc0 arch/x86/kernel/apic/apic.c:1049 </IRQ> Fixes: 5055cccfc2d1 ("net: hsr: Provide RedBox support (HSR-SAN)") Reported-by: syzbot+7f4643b267cc680bfa1c@syzkaller.appspotmail.com Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Lukasz Majewski <lukma@denx.de> Link: https://patch.msgid.link/20241202100558.507765-1-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-12-03rtnetlink: fix double call of rtnl_link_get_net_ifla()Cong Wang
Currently rtnl_link_get_net_ifla() gets called twice when we create peer devices, once in rtnl_add_peer_net() and once in each ->newlink() implementation. This looks safer, however, it leads to a classic Time-of-Check to Time-of-Use (TOCTOU) bug since IFLA_NET_NS_PID is very dynamic. And because of the lack of checking error pointer of the second call, it also leads to a kernel crash as reported by syzbot. Fix this by getting rid of the second call, which already becomes redudant after Kuniyuki's work. We have to propagate the result of the first rtnl_link_get_net_ifla() down to each ->newlink(). Reported-by: syzbot+21ba4d5adff0b6a7cfc6@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=21ba4d5adff0b6a7cfc6 Fixes: 0eb87b02a705 ("veth: Set VETH_INFO_PEER to veth_link_ops.peer_type.") Fixes: 6b84e558e95d ("vxcan: Set VXCAN_INFO_PEER to vxcan_link_ops.peer_type.") Fixes: fefd5d082172 ("netkit: Set IFLA_NETKIT_PEER_INFO to netkit_link_ops.peer_type.") Cc: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Cong Wang <cong.wang@bytedance.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20241129212519.825567-1-xiyou.wangcong@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-12-03wifi: mac80211: fix station NSS capability initialization orderBenjamin Lin
Station's spatial streaming capability should be initialized before handling VHT OMN, because the handling requires the capability information. Fixes: a8bca3e9371d ("wifi: mac80211: track capability/opmode NSS separately") Signed-off-by: Benjamin Lin <benjamin-jw.lin@mediatek.com> Link: https://patch.msgid.link/20241118080722.9603-1-benjamin-jw.lin@mediatek.com [rewrite subject] Signed-off-by: Johannes Berg <johannes.berg@intel.com>